eLTC, Markus Zerlauth, 7th March 2008. Post Mortem System (during HWC and Beam operation). Markus Zerlauth, AB-CO-MI. Acknowledgments: Jorg, Verena, Vito, Adriaan, Nikolai and many others.

Post Mortem System (during HWC and Beam operation)

Page 1: Post Mortem System (during HWC and Beam operation)

Post Mortem System (during HWC and Beam operation)

Markus Zerlauth, AB-CO-MI

Acknowledgments: Jorg, Verena, Vito, Adriaan, Nikolai and many others

Page 2: Post Mortem System (during HWC and Beam operation)

Outline

PM System during Hardware Commissioning

– Automated testing

– Application software

PM System towards parallel commissioning and beam operation

– Infrastructure

– Application software for beam PMA

Conclusions and outlook

Page 3: Post Mortem System (during HWC and Beam operation)

PM System during HWC

Page 4: Post Mortem System (during HWC and Beam operation)

Post Mortem Analysis during HWC

During HWC, each circuit follows a pre-defined set of current cycles to validate the functionality of powering equipment and protection systems

LSA test plan per circuit type (example: individually powered quads in sector 45)

HWC sequencer executes the tests and collects the results of PMA before sending data to MTF

Hand-shake mechanism between sequencer and PMA for increased automation

Main requirement during HWC: efficient analysis and validation of specific test cycles (prior knowledge about what to expect!)

Page 5: Post Mortem System (during HWC and Beam operation)

3. PNO.2 auto analysis

PM system and PM Analysis for HWC

Courtesy of PM and sequencer teams

Executed test creates analysis request + data completeness check (collection of related PM data)

Test will be analysed and approved in PM Event Handler (dedicated tools for equipment experts and general PM data viewer provided by CO-MA)

– Pass / Fail decision to continue / repeat the test step
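The completeness check and the pass / fail step above can be sketched as follows. This is a minimal illustration in Python, not the actual PM Event Handler: `AnalysisRequest`, `missing_sources` and `next_step` are hypothetical names, and the expert approval is reduced to a boolean.

```python
from dataclasses import dataclass, field


@dataclass
class AnalysisRequest:
    """Hypothetical analysis request created once a test step has run."""
    circuit: str
    test_name: str
    expected_sources: set                      # PM data expected for this test
    received_sources: set = field(default_factory=set)


def missing_sources(request: AnalysisRequest) -> set:
    """Data completeness check: which expected PM data sets are absent?"""
    return request.expected_sources - request.received_sources


def next_step(request: AnalysisRequest, expert_approved: bool) -> str:
    """Pass / fail decision: continue only if data are complete and approved."""
    if missing_sources(request):
        return "repeat"        # incomplete data: the test cannot be validated
    return "continue" if expert_approved else "repeat"
```

In this sketch an incomplete data set is treated like a failed test, so the sequencer would repeat the step rather than approve it on partial evidence. That policy is an assumption, not something stated on the slide.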

Page 6: Post Mortem System (during HWC and Beam operation)

3. PNO.2 auto analysis

PM system and PM Analysis for HWC

Courtesy of PM and sequencer teams

Page 7: Post Mortem System (during HWC and Beam operation)

PM System towards Beam (and parallel HWC)

Infrastructure

Page 8: Post Mortem System (during HWC and Beam operation)

PM infrastructure upgrade – Preparing for the full LHC load

[Diagram labels: PO/QPS gateway; n FE servers and BE server (shown twice)]

Redundant and scalable services in CCR and TCR, connected to independent network resources

Redundancy for client connections

Multiple Front-end (FE) servers for primary data storage of clients

Back-end (BE) servers with large disks for complete data image

Until Mar 08: single server (single Proliant), single process on cs-ccr-pm1

Single points of failure due to dependencies in CCR (network, upgrades, etc.)

Small data volume (250 GB in 3 years)

Page 9: Post Mortem System (during HWC and Beam operation)

PM infrastructure - miscellaneous

Today

Switch to new platform was performed successfully this week; it can now be transparently scaled to the coming needs

HWC clients have started to re-compile operational GW to use new client lib

Upgrade is a major improvement in terms of reliability, availability and performance

Further infrastructure work during 2008:

Scalability / Load tests (some 10 GB / dump – BLM data concentrator)

DB catalogue of PM dumps (including PM data before Mar 2008)

Dependency studies of major clients (machine protection systems) on network, mains supply, etc... (watch out for correlated failures!)

In collaboration with CO-FE, feasibility study launched for the use of local memory in FE

Further extension of PM client lib towards use of multiple FE servers / client (>2) and better automated load sharing

Set of monitoring and data consistency tools

Decision on Data lifetime on various system levels (e.g. CASTOR as long term archive)
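The client-side redundancy and load sharing mentioned above (several FE servers per client) could look roughly like the sketch below. It is an assumption-laden illustration, not the PM client lib: `ordered_servers` and `send_dump` are hypothetical, the transport is passed in as a plain function, and the server names used with it would be placeholders.

```python
import zlib
from typing import Callable, List, Sequence


def ordered_servers(client_id: str, servers: Sequence[str]) -> List[str]:
    """Rotate the server list by a stable hash so clients spread over the FEs."""
    start = zlib.crc32(client_id.encode()) % len(servers)
    return list(servers[start:]) + list(servers[:start])


def send_dump(payload: bytes,
              servers: Sequence[str],
              send: Callable[[str, bytes], None]) -> str:
    """Try each FE server in turn; return the name of the one that accepted."""
    errors = []
    for server in servers:
        try:
            send(server, payload)
            return server
        except OSError as exc:      # network-level failure: try the next FE
            errors.append(f"{server}: {exc}")
    raise ConnectionError("all FE servers failed: " + "; ".join(errors))
```

A client would call `send_dump(data, ordered_servers(client_id, fe_list), transport)`. Each client then has a preferred FE but falls back to the others if it is unreachable.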

Page 10: Post Mortem System (during HWC and Beam operation)

PM System towards Beam

Application Software

Page 11: Post Mortem System (during HWC and Beam operation)

Main Requirements for beam PMA:

– needs to reveal cause of emergency beam abort / possible equipment damage to improve operational procedures and protection systems

• Initiating event

• Event sequence leading to dump/incident

– Validate correct functioning of protection systems (redundancy within system, etc.)

– Automated analysis modules in view of systems and data volume

Main difficulty & difference to HWC: No prior knowledge about what will happen, many more systems involved, no ‘good’ use-cases yet (-> experience)

Operations crew needs clear indication after an emergency beam dump, whether the machine is ready to proceed or not

Application SW for LHC Beam PMA


Page 12: Post Mortem System (during HWC and Beam operation)

Likely use-case

Initiating event: Power converter failure (e.g. due to water fault) in recombination dipole RD1.LR5

Possible event sequences derived from the BIS:

– PC (interlocked circuit) > WIC/PIC > BIC > dump triggered

– Current decay > FMCM > BIC > dump triggered

– PC > orbit change:

• Beam loss @ collimators > BLMs > BIC > dump triggered

• Fast orbit change > BPM interlock > BIC > dump triggered

Multiple triggers arriving at BIS

– Even for ‘simple’ use case the event sequence might not be conclusive

– Which one was first? Depends on reaction and transmission times, thresholds, accuracy of internal time-stamps, …
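Why the event sequence may be inconclusive can be illustrated with a small sketch. It assumes each trigger carries a time-stamp and a symmetric accuracy, which is a simplification. `Trigger` and `first_trigger_candidates` are hypothetical names, and any numbers used with them are purely illustrative.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Trigger:
    source: str           # e.g. "PIC", "FMCM", "BLM"
    timestamp_us: float   # time-stamp reported by the system, in microseconds
    accuracy_us: float    # assumed uncertainty of that time-stamp


def first_trigger_candidates(triggers: List[Trigger]) -> List[Trigger]:
    """Return every trigger that could have been first, given the accuracies."""
    if not triggers:
        return []
    # Smallest "latest possible" time among all triggers.
    earliest_upper = min(t.timestamp_us + t.accuracy_us for t in triggers)
    # A trigger may have been first if its earliest possible time
    # does not lie after that bound.
    candidates = [t for t in triggers
                  if t.timestamp_us - t.accuracy_us <= earliest_upper]
    return sorted(candidates, key=lambda t: t.timestamp_us)
```

If more than one candidate is returned, the time-stamps alone cannot identify the initiating system. Further information (thresholds, known reaction and transmission times) would be needed.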

Page 13: Post Mortem System (during HWC and Beam operation)

Data flow and analysis

Data sources: BLM, BPM, FGC, QPS, PIC/WIC, BIS, …, XPOC

Upon beam dump / self triggering, systems start pushing data to PM system, Logging, Alarms, etc.

Data completeness and consistency check at system and global level (minimum data, configurable)

Individual System Analysis/Checks: validation of machine protection features, pre-analysis of PM buffers into result files, flagging of interesting systems / data reduction, database catalogue

Individual analysis blocks (diagram labels): I/XPOC, IPOC-BIS, event sequence, circuit events, BLM/BPM > threshold, FMCM

Global PM Analysis: global event sequence, summaries, advised actions, event DB, …

Global analysis outputs (diagram labels): global event sequence, advised actions, machine protection OK

Page 14: Post Mortem System (during HWC and Beam operation)

Open Post Mortem framework

"Open" to cope with diversity

– Large variety of systems and data sources which provide PM data

– External systems with PM-like functionality (e.g. I/XPOC-LBDS, IPOC-BIS, HWC tools, etc..) that need to be integrated

– Multiple analysis modules, contributed by different parties and written in different languages (C/C++/Java/LabVIEW, ...), that work on the same data and need to be executed in the right order

– Different users that want to use the system for different purposes (operators/PM team/experts)

How: coherent overall architecture with:

– Support to plug different analysis modules into the analysis data flow (order of execution and the data they should process)

– Standardized data structures (result files, XML or DB) for data exchange between the different analysis modules
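The plug-in idea above (modules declare what they depend on, the framework decides the order) can be sketched in a few lines. This is a toy illustration only. The registry, the `register` decorator and the two example modules are hypothetical, and a plain dictionary stands in for the standardized result files, XML or DB records.

```python
from graphlib import TopologicalSorter   # standard library, Python 3.9+

_registry = {}   # module name -> (function, names of modules it must follow)


def register(name, *, after=()):
    """Decorator that plugs a module into the flow, after the modules it needs."""
    def wrap(func):
        _registry[name] = (func, tuple(after))
        return func
    return wrap


def run_all():
    """Execute all registered modules in dependency order."""
    graph = {name: set(after) for name, (_, after) in _registry.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        if name not in _registry:
            raise KeyError(f"module {name!r} is required but not registered")
        func, _ = _registry[name]
        results[name] = func(results)     # each module sees earlier results
    return results


@register("bis_sequence")
def bis_sequence(results):
    # Placeholder individual-system analysis; the value is illustrative.
    return {"first_input": "PIC"}


@register("global_summary", after=("bis_sequence",))
def global_summary(results):
    # Placeholder global analysis that consumes the result above.
    return {"initiating_system": results["bis_sequence"]["first_input"]}
```

Modules written in other languages would need a thin wrapper that launches them and reads back their result file. That part is not shown here.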

Page 15: Post Mortem System (during HWC and Beam operation)

Data flow and analysis – First vertical slice

Data sources: BLM, BPM, FGC, QPS, PIC/WIC, BIS, …, XPOC

Upon beam dump / self triggering, systems start pushing data to PM system, Logging, Alarms, etc.

Data completeness and consistency check (configurable)

Individual System Analysis/Checks: validation of machine protection features, pre-analysis of PM buffers into result files, flagging of interesting systems / data reduction, database catalogue

Individual analysis blocks (diagram labels): X/IPOC, IPOC-BIS, event sequence, circuit events, BLM/BPM > threshold, FMCM

Global PM Analysis: global event sequence, summaries, advised actions, event DB, …

Global analysis outputs (diagram labels): global event sequence, advised actions, machine protection OK

Page 16: Post Mortem System (during HWC and Beam operation)

Individual System Analysis (event building) for HWC

First prototyping has been done based on HWC experience

In addition to the existing classification / system, a fully automated process will create an SCEvent classification (Single Circuit Event):

– Relating all equipment data belonging to this event (Layout DB)

– Identifying the event sequence by retrieving history buffer of interlock systems (Logging DB)

– (Pre-)analysis of event based on event sequence and PM data

– Event summary for DB upload and further (global) analysis
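The event-building steps listed for the SCEvent can be pictured with the following sketch. Everything in it is an assumption made for illustration: the `SCEvent` fields, the `build_event` helper, and the in-memory dictionary and list that stand in for the Layout DB and Logging DB queries.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class SCEvent:
    """Hypothetical Single Circuit Event record."""
    circuit: str
    pm_files: List[str] = field(default_factory=list)   # related equipment data
    sequence: List[Tuple[float, str]] = field(default_factory=list)  # (time, change)

    def summary(self) -> dict:
        """Condensed record for DB upload and later global analysis."""
        ordered = sorted(self.sequence)
        return {
            "circuit": self.circuit,
            "n_pm_files": len(self.pm_files),
            "first_change": ordered[0][1] if ordered else None,
        }


def build_event(circuit, pm_index, interlock_history):
    """Collect the PM data and interlock history belonging to one circuit.

    pm_index: dict mapping circuit name -> list of PM file names
              (stand-in for a Layout DB lookup).
    interlock_history: list of (time, circuit, change) tuples
              (stand-in for a Logging DB history buffer).
    """
    return SCEvent(
        circuit=circuit,
        pm_files=list(pm_index.get(circuit, [])),
        sequence=[(t, what) for (t, c, what) in interlock_history if c == circuit],
    )
```

The summary deliberately keeps only a few fields. Which fields a real event summary needs is exactly what the result-file specification would have to define.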

Page 17: Post Mortem System (during HWC and Beam operation)

Milestones

First Infrastructure upgrade completed (1st priority)

March 08

– Start testing BI data transfer to PM system, scalability tests with BLM (Global PM trigger in timing)

April 08

– With help of OP and equipment experts, work out a series of use cases for first months/years of operation with beam

– Specification of data completeness and result file structure for minimum Individual System Analysis (tbd with LBDS, BIS, HWC, BI, ...)

July 08

– Basic PM framework including individual system analysis for HWC, BIS, XPOC, BLM and BPM

– GUI for first event sequence and results, standardised data viewers

2nd half 08

– First global PMA modules

Page 18: Post Mortem System (during HWC and Beam operation)

Conclusions and Outlook

HWC is useful to gain experience for use of PM tools during operation

Variety of systems and data sources requires an open framework to accept user-provided code

Focus on standardisation and first vital individual system / POC checks for first months of operation

With experience, build up more advanced modules for global PMA

As recent focus has been HWC -> lots of work still to be done, but we have a plan and a motivated team across CO and OP (Verena, Jorg, Vito, Roman, Adriaan, Hubert, Dmitriy, Nikolai, Markus, ...)

Thanks a lot for your attention - Questions?