Piotr Golonka, Manuel Gonzalez-Berges, Josef Hofer, Axel

Preview:

Citation preview

Piotr Golonka, Manuel Gonzalez-Berges, Josef Hofer, Axel VoitierPiotr Golonka, Manuel Gonzalez-Berges, Josef Hofer, Axel VoitierCERN European Organization for Nuclear Research, Geneva, SwitzerlandCERN European Organization for Nuclear Research, Geneva, Switzerland

Archiving architecture after re-designArchiving architecture after re-design

MigrationMigration

Poster: MOPGF021Track: “Control System Upgrades”

Logging DB

long-termdata store

Archiving architecture prior to re-designArchiving architecture prior to re-design

Logging DB

long-termdata storeX

ML

Da

ta L

oa

der

● Single high-performance RDB Archive Manager writes data to central Oracle DB● Data requiring long-term storage is transferred over a DBLink to the LoggingDB● Standard database transfer job mechanism, used by other LoggingDB clients,

serves all applications and provides diagnostic and statistical information ● Data stored in the LoggingDB can be made available in WinCC OA transparently

● Several WinCC OA Value Archive managers write historical data to local disk files● Custom WinCCOA LoggingManager programs query data, transform to XML

and send to a dedicated data loader application of the LoggingDB service● The architecture would not scale to meet the new requirements for LHC Run II

Index organized tablesIndex organized tables Row size reductionRow size reductionColumn Type

ELEMENT_ID NUMBER (25)

TS TIMESTAMP(9)

VALUE_NUMBER BINARY_DOUBE

STATUS NUMBER (20)

MANAGER NUMBER (20)

USER_ NUMBER (5)

SYS_ID NUMBER (20)

BASE NUMBER (1)

TEXT VARCHAR2 (4000)

VALUE_STRING VARCHAR2 (4000)

VALUE_TIMESTAMP TIMESTAMP(9)

CORRVALUE_NUMBER BINARY_DOUBLE

OLVALUE_NUMBER BINARY_DOUBLE

CORRVALUE_STRING VARCHAR2 (4000)

OLVALUE_STRING VARCHAR2 (4000)

CORRVALUE_TIMESTAMP TIMESTAMP(9)

OLVALUE_TIMESTAMP TIMESTAMP(9)

Column Type

ELEMENT_ID NUMBER (25)

TS TIMESTAMP(9)

VALUE_NUMBER BINARY_DOUBE

BASE NUMBER (1)

Typically NULL values

110 B (90B of data + 20 B indexes) • Standard used schema: e.g. ATLAS

60 B • Data and indexes merged together due to

usage of IOT

30 B • New schema with reduced column set, IOT

and index compression • Used in QPS Highly redundant

Time based partitioningTime based partitioning

● I/O bottleneck in the database due to Oracle's REDU log size (data-recovery stream)

● Option to eliminate redundant/unnecessary data

Data recovery performance testData recovery performance test

MotivationMotivation

ConclusionConclusion

EVEN

THIS

TORY

_XEV

ENTH

ISTO

RY_0

EH_X <Tn;T

n-1>

EH_X <Tn-1

;Tn-2

>

EH_X <Tn-2

;Tn-3

>

...

...

EH_1 <T1;T

2>

EH_1 <T2;T

3>

EH_1 <T0;T

1>

EVENTHISTORY_0

EVENTHISTORY_1

...

EVENTHISTORY_X

EVENTHISTORY EVENTHISTORY

● Key ingredient for optimization of readout path● Access by index identified as the primary

data-access pattern (transfers, trend plots)● Query performance increased by a large factor● Storage-space consumption reduced up to 50%

● Growing IOT leads to degraded query performance● Time-based partitioning (partition pruning)● Queries on views with pushed-predicate hints

● Data retention policy (3 months in daily partitions)● Automated partition-managent job for 200 schemas

WinCC OA SystemWinCC OA SystemWinCC OA System

archive

trend

WinCC OA SystemWinCC OA SystemWinCC OA System

Value ArchiveValue ArchiveValue ArchiveLoggingManagerLoggingManager

DB Link

archive

trend

A challenging task, not only from a technical standpoint, but also from a coordination point of view.

EVENTHISTORY_1 QPSHISTORY_1

EVENTHISTORY_2 QPSHISTORY_2

EVENTHISTORY_X QPSHISTORY_X

INDE

X IN

DEX

HEAP ORGANIZED INDEX ORGANIZED

Standard SchemaIOT Schema

Compressed Index

0

10

20

30

40

50

60

Disk Space Utilisation

Dis

k S

pa

ce [G

B]

Simulate a database-outage scenario in a set-up mimicking the LHC magnet protection system (QPS)

WinCC OA Projects 50

Archived Signals 150 000

Data Input Rate [rows/s] 200 000

Outage duration [h] 8

Recovery Time [h] 2

Peak Recovery Rate [rows/s] 1 000 000

● Migration completed in time for Run II of the LHC● Single consistent archiving technology CERN-wide● Satisfactory performance and reliability● Optimisations available to all WinCC OA users

JDBCHTTP

Nominal ConditionsNominal Conditions

Database FailureDatabase Failure

RecoveryRecovery

Nominal ConditionsNominal Conditions

● Data buffered by RDB Archivers during DB outage

● Buffers inserted at maximal rate once DB becomes available

● Even in degraded mode (1 DB node) complete recovery in acceptable time.

LoggingManager

RDB Archive Manager

Standard SchemaIOT Schema

Compressed Index

01020304050607080

Query Time

Qu

ery

pro

ces

sin

g ti

me

[s]

WinCC OA Projects Registered Signals [K] Transferred Rows [M/h]

Databases in production

SCADAR QPSR

140

48

2000

135 20

400

Migration

200 Systemsand

DB schemas

Minimum downtime

Human supervision

Debugging and tuning

RDB Archive

Transfer Logic

EventhistoryEventhistory

DB Schemaswith

data tables

WinCC OA File Archiver:

* Used by 200 WinCC OA systems for LHC and technical infrastructure + Stable, proven technology - Performance scalability issues - Complicated maintenance on many SCADA nodes

WinCC OA Oracle RDB Archiver:

+ Already used in a few hundred controls systems at CERN+ Proven scalability and performance for large systems+ Centralized architecture eases management and handling of the SCADA nodes, provides better tools, and enables data analytics

New requirement for the 13 TeV run of the LHC: the magnet protection system (QPS) needs data archived at a rate of 200.000 values/s