Approaches, Challenges, and Solutions for PTC Systems Management

Kevin Nichter

Lilee Systems

[email protected]

502-438-6950

Number of Words: 4354

Abstract

The freight and commuter railroads continue to focus on the development, testing, and deployment of

Positive Train Control. These systems will introduce hundreds of thousands of new assets onto the

railways, each of which includes specific configurations, software versions, alarming criteria, and other

variables which need to be monitored and managed accordingly. Furthermore, requirements for

interoperability necessitated standardization of the protocols which would enable systems

management to become interoperable across host and tenant railroads.

The concept of systems management as it related to PTC assets evolved with the standards of PTC itself.

Within the rail industry there is some precedent for PTC systems management, using benchmarks such

as existing network management systems already in place to monitor systems such as the status of

wayside microprocessor-based controllers or the locomotive fuel usage and sensor information. It should

be noted that the scale of PTC systems management is expected to be much larger than any other

management systems for wayside and locomotive assets currently in place.

The approaches to systems management vary across the railroads, and are highly dependent on the

types of assets deployed, the communications transports that are available, the availability of internal

resources to support implementation, and the operational processes the railroad desires to implement.

Furthermore, the railroads have recognized the need for systems management to encompass not only

traditional network management functions, but also to serve as a platform for comprehensive analytics

and root cause analysis of the PTC system itself.

© AREMA 2016® 1183

This paper will detail some of the approaches the railroads have taken to implement PTC systems

management, the challenges that they encountered, and how they envision the systems will be used to

support operations once PTC is fully deployed. Topics such as security, interoperability, change

management, project timelines, and other considerations will also be addressed.

Technology Context

In support of the development of interoperable standards for the deployment of interoperable PTC in the

United States, four Class I railroads formed the Interoperable Train Control committee, or the ITC. These

interoperable standards extended to cover the area of systems management among other areas of need

for an interoperable train control system. Meteorcomm, LLC (“MCC”) was subsequently hired by the four

Class I railroads to implement the standards as defined by the ITC. The manifestation of this work is the

architecture known as Interoperable Train Control Communications, or simply as “ITCC.”

There are three major components of an ITC communications network for PTC:

• A data radio network (provided by ITCR, or Interoperable Train Control Radio)

• The messaging system (provided by ITCM, or Interoperable Train Control Messaging)

• A systems management system.

The data radio network is comprised of both narrow band and broadband networks. In the case of the

narrow band network, an interoperable radio link is provided through a radio operating in the 220 MHz

spectrum based on an MCC reference design. Broadband networks (such as 802.11 Wi-Fi, satellite, fiber,

copper, and cellular / LTE) are also utilized within PTC for functions such as distributing large data

downloads needed for train initialization and providing image and firmware updates to components on the

network. Other wireless communications mediums may be utilized as alternate paths for redundant

communications, to augment coverage issues, or optimize large file transfers. In addition to the 220 MHz

spectrum owned by the railroads, individual railroads also own spectrum in the 160 MHz, 450 MHz, and

900 MHz bands for use with both voice and data. As a result, railroads have used radios operating on

some of these licensed bands to provide additional coverage for the PTC system and to utilize the

existing infrastructure.

The ITCM is a suite of software that is deployed in both the back office and remote environments

(typically wayside and locomotive areas). ITCM is a store-and-forward system which utilizes no

guaranteed method of delivery. This allows the system to operate more efficiently, but places an onus on

the sender and receiver to detect lost messages and take appropriate actions to retransmit automatically.

ITCM is designed to run on SBCs (single board computers) currently based on x86 architectures. These

SBCs are coupled with other devices (such as Ethernet switches) within a ruggedized enclosure to

operate as a WMS (Wayside Messaging Server) operating at PTC-equipped wayside locations. In the

locomotive, an LMS (Locomotive Messaging Server) performs a similar function as the WMS – both the

WMS and LMS are responsible for running the ITCM at their respective locations.
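Because ITCM does not guarantee delivery, loss detection falls to the endpoints. The following is a minimal sketch of how a sender might track unacknowledged messages and retransmit them, assuming sequence-numbered messages and explicit acknowledgements. The class, its fields, and the retry policy are hypothetical illustrations and are not taken from the ITC specifications.

```python
import time
from dataclasses import dataclass, field


@dataclass
class PendingMessage:
    payload: bytes
    sent_at: float
    attempts: int = 1


@dataclass
class ReliableSender:
    """Tracks unacknowledged messages sent over a best-effort transport.

    Hypothetical helper: sequence numbers and acknowledgements are
    assumptions made for illustration, not fields defined by ITCM.
    """
    retry_after: float = 5.0   # seconds to wait before retransmitting
    max_attempts: int = 3      # total transmissions before giving up
    next_seq: int = 0
    pending: dict = field(default_factory=dict)

    def send(self, payload, transmit, now=None):
        """Transmit a payload and remember it until it is acknowledged."""
        now = time.monotonic() if now is None else now
        seq = self.next_seq
        self.next_seq += 1
        self.pending[seq] = PendingMessage(payload, now)
        transmit(seq, payload)
        return seq

    def acknowledge(self, seq):
        """Forget a message once the receiver confirms it."""
        self.pending.pop(seq, None)

    def retransmit_due(self, transmit, now=None):
        """Resend overdue messages; return sequence numbers that were abandoned."""
        now = time.monotonic() if now is None else now
        given_up = []
        for seq, msg in list(self.pending.items()):
            if now - msg.sent_at < self.retry_after:
                continue
            if msg.attempts >= self.max_attempts:
                given_up.append(seq)
                del self.pending[seq]
                continue
            msg.attempts += 1
            msg.sent_at = now
            transmit(seq, msg.payload)
        return given_up
```

The `transmit` callable stands in for whatever transport is available; abandoned sequence numbers are returned so that the application can raise an alarm rather than fail silently.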

The locomotive and wayside installations of ITCM, also known as Remote Areas, connect to the back

office using ITCM through any available transport (such as the 220 MHz radio, satellite, fiber, copper,

cellular, or Wi-Fi paths). These Remote Areas connect to the back office through an ITCM Application

Gateway.

The third component of an ITC communications network is ITCSM, which is designed to pass information

concerning asset health, configuration, alarms, and other data over the network. ITCSM also enables

remote log retrieval and over-the-air software update capability. ITCSM provides the necessary

framework for over-the-air software downloads, remote command execution, and kits.

ITCSM can be broken down into two additional sub-components. The first is the ITC SMG (Systems

Management Gateway), which provides services such as orchestration, authorization, and security for the

railroad back office applications as well as ITC SMGs located on other railways. ITCSM uses the ITCM for

transport across the network and utilizes a standard protocol known as ISMP, or Interoperable Systems

Management Protocol. ISMP is embedded within an Edge Message Protocol (EMP) message, enabling it

to be transported across the ITCM in the same manner as other PTC messages.

The second sub-component is the SMA, or Systems Management Agents, which are software modules

installed in back office locations or directly on assets. SMAs enable wayside and locomotive devices to

utilize ITCSM infrastructure to report health status and other information about the asset to interfaces and

users in a centralized location such as a back office. An asset may be any device the railroad desires to

manage – a WIU, TMC, or WMS for example. Each asset to be managed via ITCSM has a respective

SMA. These agents take the form of either an embedded or remote agent and are utilized depending in

part on the architecture deployed for PTC as well as the hardware that is utilized in the field. In the case of

an embedded agent, the asset will provide support for communicating directly with ITCSM via API calls

embedded in the asset’s software without the need for external software. Where an asset lacks such

support, external software in the form of a remote agent is necessary to provide protocol conversion to the

ITCSM-required format of ISMP. Remote agents provide translation services to remote assets which

support a non-ISMP form of management such as SNMP.
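The translation role of a remote agent can be sketched as follows. This is an illustrative outline only: the record layout and attribute names are hypothetical, the ISMP encoding itself is not shown because its format is not described here, and the `snmp_get` callable stands in for a real SNMP GET performed by whatever library the railroad uses. The two object identifiers are the standard MIB-2 `sysDescr.0` and `sysUpTime.0`.

```python
from typing import Callable, Dict

# Standard MIB-2 object identifiers mapped to hypothetical attribute names.
OID_TO_ATTRIBUTE: Dict[str, str] = {
    "1.3.6.1.2.1.1.1.0": "description",   # sysDescr.0
    "1.3.6.1.2.1.1.3.0": "uptime_ticks",  # sysUpTime.0
}


class RemoteAgent:
    """Polls one SNMP-managed asset and produces a protocol-neutral record.

    Hypothetical sketch: a real remote agent would go on to encode this
    record in ISMP before handing it to the messaging system.
    """

    def __init__(self, asset_id: str, snmp_get: Callable[[str], object]):
        self.asset_id = asset_id
        self._snmp_get = snmp_get  # injected so no SNMP library is assumed

    def collect(self) -> dict:
        """Read each mapped OID and return the values under friendly names."""
        attributes = {}
        for oid, name in OID_TO_ATTRIBUTE.items():
            attributes[name] = self._snmp_get(oid)
        return {"asset_id": self.asset_id, "attributes": attributes}
```

Injecting the getter keeps the mapping logic testable without a live device, which matters when the same agent is expected to front several COTS devices.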

Evaluations of Current Technologies and Interoperability

For decades, commercial off-the-shelf NMS (Network Management Systems) have been in use across

the industry for general purpose management of devices in the railroad back office. For example, routers

and servers which conform to SNMP version 2 can be supported using any NMS that conforms to the

same specification which is publicly available. Prior to PTC, connectivity to wayside locations (i.e.

intermediate signal locations and control points) was constrained by the limited bandwidth available

through the ATCS 900 MHz radio unless the railroad had IP connectivity or some alternate wide band

connection to the location. This 900 MHz radio supported the non-vital train code line traffic, which offered

limited capacity for functionality such as over-the-air management. In the event that a broadband

connection was available to the wayside location, it could be utilized as an alternate path for train control

traffic or for providing adequate bandwidth to access wayside assets for tasks such as log retrieval and

event notification from the local devices. Various proprietary solutions were also implemented in the event

that protocols for accessing individual vendors’ equipment were not made publicly available. This further

created difficulties in approaching systems management in a comprehensive manner.

In the context of a truly interoperable PTC system, the requirement that systems management also be

interoperable was one catalyst for the ITC to evaluate the various network management protocols in

use today. As one of the standard NMS protocols available and one which is well defined within an IETF

RFC and has proven elements of interoperability, SNMP was evaluated as a potential solution for

systems management. However, SNMP offers essentially only three functions: retrieve a value (“GET”),

set a value (“SET”), and generate an unsolicited event or alarm message (known as a “TRAP”). In addition,

one of the constraints that diminished SNMP’s attractiveness was the usability of the protocol when

coupled with the relatively limited bandwidth offered by the 220 MHz ITC radios. Moreover, there was a

need for enhanced encryption and authentication and for delivering data or software to many remote areas

with a single command. Therefore, ISMP has become a standard for railroads and equipment

manufacturers to develop an interoperable method of systems management.

As defined, an ITCSM Gateway would need to be present in every railroad engaging in interoperable PTC

operations to provide the secure interface between railroads for the following purposes:

1.) Obtaining OPK (HMAC) cryptographic keys from other railroads for their trains or locomotives

operating on their own track.

2.) Providing the OPK (HMAC) cryptographic keys for trains and locomotives operating on a foreign

railroad’s track

3.) Allowing access for monitoring purposes to other roads that share the use of an asset (such as a

radio base station) that operates on a host railroad

4.) Allowing access to other roads’ assets that are shared for monitoring purposes

Security

ITCSM has accommodations for security through public and private keys in conjunction with X.509

security certificates. Interoperable PTC uses what are referred to as Operational Private Keys (OPK) or

just “keys” for authentication of messages sent from one place in the network to another, most importantly

from the wayside to the locomotive onboard computer. These keys are 20-byte binary keys, and are used

with the Hashed Message Authentication Code (HMAC) algorithm to authenticate each message. The

sender of a message uses the OPK and the HMAC algorithm to calculate a hash value over the content

of a message. This calculated value is then placed at the end of the message. The receiver must know

the same OPK, and uses it to recompute the hash value and confirm that the content of the message has not been altered. OPKs are used with all of

the basic operational messages in PTC: from the wayside to the locomotive, locomotive to wayside, and

back office server to locomotive.
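The append-and-verify pattern described above can be illustrated with Python's standard `hmac` module. This is a sketch under stated assumptions: the text does not name the underlying hash function or any truncation of the result, so SHA-256 with a full-length tag is used purely as a placeholder, and the function names are hypothetical.

```python
import hashlib
import hmac

# Assumption: the hash function is not named in the text; SHA-256 is a placeholder.
DIGEST = hashlib.sha256
TAG_LENGTH = DIGEST().digest_size


def protect(key: bytes, content: bytes) -> bytes:
    """Return the content with its HMAC value placed at the end."""
    tag = hmac.new(key, content, DIGEST).digest()
    return content + tag


def verify(key: bytes, message: bytes) -> bytes:
    """Recompute the HMAC over the content and compare it with the received value."""
    if len(message) < TAG_LENGTH:
        raise ValueError("message too short to carry an authentication value")
    content, tag = message[:-TAG_LENGTH], message[-TAG_LENGTH:]
    expected = hmac.new(key, content, DIGEST).digest()
    # compare_digest avoids leaking how many leading bytes matched.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("authentication failed")
    return content
```

Both ends must hold the same key, which is why distributing and refreshing OPKs becomes a systems management task in its own right.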

In PTC, “certificates” are conventional X.509 certificates similar to those used in the datacenter and on

desktop and laptop computers. An X.509 certificate contains identification information as well as

key material. X.509 certificates are used in PTC to implement public/private key encryption in order to

protect sensitive systems management messages used for downloading software or configuration,

uploading logs or current parameters, and even to replace existing OPKs with new ones at remote sites.

Certificates are used primarily for communication between the systems management server in the back

office and remote devices on the locomotive and at the wayside, or for communication between the SMG

of one railroad with the SMG of another railroad for interoperability.

For example, in order to send a “reboot” command requested by the back office, the command is

encrypted with the “public” key of the remote device to be rebooted, and signed for authentication using

the private key of the SMG. The receiving device uses its private key to decrypt the message, and the

public key of the SMG to check the signature to make sure the received message is authentic.
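The roles of the four keys in this exchange can be traced with a deliberately tiny, textbook RSA sketch. It is for illustration of key usage only: the primes are toy-sized, there is no padding or hashing, and nothing here is secure or representative of how ITCSM or X.509 tooling implements the operations. The function names and the numeric command code are hypothetical.

```python
from math import gcd


def make_keypair(p: int, q: int, e: int):
    """Build a textbook RSA key pair from two small primes (toy sizes only)."""
    n, phi = p * q, (p - 1) * (q - 1)
    if gcd(e, phi) != 1:
        raise ValueError("e must be coprime with (p - 1) * (q - 1)")
    d = pow(e, -1, phi)  # modular inverse, available in Python 3.8+
    return (n, e), (n, d)  # (public key, private key)


def send_command(command: int, device_public, smg_private):
    """SMG side: encrypt for the device, sign as the SMG."""
    n_dev, e_dev = device_public
    n_smg, d_smg = smg_private
    if not 0 <= command < min(n_dev, n_smg):
        raise ValueError("command code too large for these toy keys")
    ciphertext = pow(command, e_dev, n_dev)  # only the device can decrypt this
    signature = pow(command, d_smg, n_smg)   # only the SMG can produce this
    return ciphertext, signature


def receive_command(ciphertext: int, signature: int, device_private, smg_public):
    """Device side: decrypt with its own private key, check the SMG signature."""
    n_dev, d_dev = device_private
    n_smg, e_smg = smg_public
    command = pow(ciphertext, d_dev, n_dev)
    if pow(signature, e_smg, n_smg) != command:
        raise ValueError("signature check failed")
    return command


# Hypothetical usage: 42 stands in for a "reboot" command code.
device_public, device_private = make_keypair(61, 53, 17)
smg_public, smg_private = make_keypair(89, 97, 5)
ciphertext, signature = send_command(42, device_public, smg_private)
command = receive_command(ciphertext, signature, device_private, smg_public)
```

The point of the sketch is the pairing: confidentiality comes from the device's key pair, authenticity from the SMG's key pair, and neither substitutes for the other.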

The ITC Systems Management Specification directs that the private keys of each remote device be

maintained in an “asset repository” in the back office, and defines messages to be used by the SMG to

communicate with that repository. A key server implements this asset repository in order to maintain the

security of these private keys, although the repository can be implemented in other ways depending on the

railroad.

PTC is governed by 49 CFR Part 236, Subpart I – Positive Train Control Systems. Security requirements

for PTC are described in section 236.1033 in that regulation entitled “Communications and Security

requirements”. This regulation prescribes requirements for communication security using keys and

requirements for generating, distributing, and storing these keys. Each railroad must describe in its PTC

Safety Plan the methods it will use to meet these requirements. The ITC has created specifications

designed to meet or exceed these requirements.

Key Management (which is inclusive of both OPKs and X.509 certificates used for communication with

remote PTC devices) is important because keys and certificates are used extensively throughout the PTC

system – from the back office through the base radios and other communication paths to the remote

assets. They are used to authenticate messages and encrypt and decrypt messages used for both the

basic functions of the system and management of the operation of the system. It is for this reason that

key management must be considered in every phase of PTC including:

• Interoperability

• Design of wayside signaling

• Wayside construction

• Track database/subdivision file preparation and maintenance

• Commissioning of locomotives and wayside

• Operational and FRA required testing of locomotives and waysides

• Operational monitoring of locomotives and waysides

• Maintenance of locomotives and waysides

• Execution of security related procedures such as key installation and periodic key refresh (or

“rolling”) of keys and certificates.

Challenges

While a great deal of time and effort has been expended to develop the standards for interoperable PTC,

several challenges remain to fully deliver an interoperable systems management solution.

First and foremost, interoperable PTC has no precedent. While the industry as a whole has worked to

leverage commercially available solutions and standards wherever possible to achieve their respective

goals, a significant number of specifications and requirements not only needed to be defined by the

industry committees and other stakeholders, but conforming solutions must also be built, tested, delivered,

and evaluated. The testing and the evaluation of the success metrics may be more complex because

requirements may be implemented differently, and discrepancies must be worked out and potentially

refined with the various industry bodies involved with the implementation of PTC. Moreover, it is possible

that the specifications will evolve over time to accommodate “lessons learned” and other

advancements in the underlying technology in PTC as it is deployed. This implies that standards bodies

will have a continuing role to play in the evolution of PTC as it becomes part of the railroads’ daily

operations.

Another issue that potentially impacts the incorporation of systems management in the broader

discussion of delivery of a PTC system that is compliant with ITC specifications – and in particular, for

those commuter railways who rely on systems integrators – is the method for managing the potentially

evolutionary aspects of PTC. In other words, when a third party commits to a specific scope of work

to deliver an interoperable PTC solution, it must be understood that many aspects of systems management

(and PTC in general) may evolve over time, and the potential impact to schedules and budgets

must therefore be accounted for as well.

Processing capabilities of the software and technologies applied for systems management also may

present constraints. For example, a railroad may wish to access and report signal aspect information from

their wayside locations for trending or other diagnostic purposes. Wayside status messages (WSM) may

be transmitted from each wayside location at a rate of approximately once every three seconds. For a

railroad with 1000 wayside locations, and WSMs being delivered asynchronously, the back office would be expected to

receive 500 or more messages from the field at any given second. However, a single wayside may be

heard by more than one base radio, and base radios may hear other base radios as well. This generates

duplicate messages which can be filtered in the back office before processing and forwarding, but this still

diverts processing capability and takes time. Assuming a multiple of five due to this duplication, such an

application may need to process up to 2500 messages per second. While modern computing is capable

of meeting this level of processing capability, this assessment is also based on estimated loads which

should be reevaluated as more railroads extend their PTC deployments.
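The load estimate above can be checked with a short calculation, and the duplicate filtering it mentions can be sketched alongside it. At exactly one WSM every three seconds, 1000 locations average about 333 messages per second, so the figure of 500 reads as a conservative planning number (it corresponds to a two-second interval); the rate is therefore left as a parameter. The message tuple layout and the idea of a per-wayside sequence number are assumptions made for illustration.

```python
def messages_per_second(locations: int, interval_s: float, duplication: float = 1.0) -> float:
    """Average arrival rate for evenly spread, asynchronous senders."""
    return locations / interval_s * duplication


def drop_duplicates(messages):
    """Yield each (wayside_id, sequence) pair once, discarding repeats.

    Assumes every message carries a wayside identifier and a sequence
    number (hypothetical fields). The seen-set grows without bound here;
    a production filter would expire old entries.
    """
    seen = set()
    for wayside_id, sequence, payload in messages:
        key = (wayside_id, sequence)
        if key in seen:
            continue
        seen.add(key)
        yield wayside_id, sequence, payload


average_rate = messages_per_second(1000, 3)       # about 333 per second
planning_rate = messages_per_second(1000, 2)      # 500 per second
worst_case = messages_per_second(1000, 2, 5)      # 2500 per second with 5x duplication
```

Even though duplicates are discarded, each one still has to be received and looked up, which is why the duplication factor belongs in the capacity estimate rather than being netted out.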

There are also technological limitations which may present unique challenges to deployment of systems

management. The wired and wireless mediums used to access locomotive and wayside areas may

present constraints as to the amount of information that the railroads may be able to access without

overloading the network. In the event that a narrow band 220 MHz radio is the sole connection to a

remote asset, the amount of systems management traffic that can be supported is partly a function of the

amount of train control traffic which takes priority over all other traffic. Retransmissions of missed data,

duplicated data received from other base stations, and other events further limit the amount of bandwidth

available for other purposes.

A final challenge involves the decision to implement remote or embedded agents into the hardware

necessary for PTC. As stated above, ITCM is designed to run on SBCs based on x86 architectures. In the

ITC architecture, the WMS and LMS are designated to run the ITCM in the wayside and locomotive

environments, respectively. For a variety of technology reasons, other devices on the PTC system may

not offer a readily available ITCSM agent (although, in principle, ITCSM can be implemented on any processor).

For a railroad looking to leverage the information available from devices that do not support an ITCSM

embedded agent, a remote agent running on an x86 device is utilized to function as a protocol converter

between ISMP and the other chosen protocol (e.g. SNMP). In an IP connected wayside environment, the

remote agent could be implemented on a server located in the back office or elsewhere on the network.

For wayside installations which include a WMS, the remote agent can be implemented directly on the

WMS itself. This offers the additional benefit of hosting remote agents for other devices – especially

COTS devices – that may have information useful for management purposes.

Approaches and Recommendations

As interoperable PTC has not yet been fully achieved across all railroads in the United States, it is

important to characterize systems management as a process rather than simply a product or solution.

Regardless of whether standards continue to evolve, the railroads will look to harvest the immense

amount of information made available to them. Railroads will continue their work to optimize their

operations once the PTC overlay is installed. With embedded and remote agents in place, new data

variables will be made accessible to provide information on PTC-enforced braking events, initialization

failures, and other events that railroads desire to track, diagnose, and report. Remote agents in particular

offer the railway a method to incorporate non-ISMP enabled devices into ITCSM infrastructure, which in

turn allows the railroad to leverage a great deal of their existing infrastructure in the broader scope of

systems management.

While ITCSM has been primarily discussed in the context of PTC systems management, the industry may

look to consolidate the data provided through the ITCC infrastructure with other existing management

systems and databases in use today. The eventual analysis of this “big data” will potentially lead to new

ways to evaluate and streamline railway operations using some of the techniques employed by other

users of ITCSM:

1.) Identify and refine the amount of information that can be useful. In light of the limited bandwidth

that may be available, work to reduce the amount of information reported from agents to reduce

complexity and “data overload.” For example, pre-processing and optimization performed by

software agents deployed in the field will reduce the amount of traffic distributed over-the-air, as

compared to processing all data in the back office.

2.) Consolidate the data provided by ITCSM with other network management systems and

databases which can provide supporting information. For example, data provided by crash

hardened memory modules (CHMM), logs from the CTC system, and WIU logging information

may also provide useful clues when analyzing the root cause of a PTC braking event due to a lost

WSM.

3.) Identify the proper technologies and tools that may be useful for mining large sets of data

composed of information compiled from numerous and disparate sources. This is one of the

characteristics of the “big data.”

4.) Recognize the trends and potential contributors that may be associated with a potential failure

scenario in the field that the railroad wishes to resolve. Even with a large set of data available, the proper

algorithms and logic must be concise so as to provide the railroad with the capability to perform

root cause analysis. Complicating this approach, however, is that PTC is still a relatively new

technology and it is difficult to envision all of the potential ways in which problems can occur. For

example, the identification of the cause of a braking event may involve analysis of the entire

message flow between the locomotive and one or more wayside locations. This could potentially

involve data acquisition from dozens of devices. Yet there may not be enough “tribal knowledge”

regarding all of the ways in which messages could unintentionally flow. In other words, the user

may not yet fully understand all the ways in which the system can fail which complicates the

approach to creating a method to identify the cause of the problem in the first place.

5.) Categorize how consumers will use the data provided by systems management. The “Help Desk”

or a general support team for PTC is understood to be a likely consumer, but others may benefit

from the information provided as well. For example, equipment vendors could use the information

to assist the railroads with troubleshooting issues in the field, while railroads could also have

useful metrics for objectively assessing the quality of the equipment they have installed in the

field. This is especially important for the railroads to be able to calculate and demonstrate they

can meet reliability benchmarks for the overall PTC system.

6.) Understand how the information that can be provided by PTC assets can be used in new ways to

reduce the costs of maintenance (specifically, dispatching of personnel for on-site diagnostic

tasks and local software upgrades).

7.) Consider the investment in out-of-band solutions to augment the bandwidth available solely

through the 220 MHz radio. This will be especially important in areas where there will be a high

level of expected RF activity such as major metropolitan areas. The large file sizes associated

with image and firmware updates further increase this need.
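Item 1 in the list above suggests pre-processing in field agents to reduce over-the-air traffic. One common way to do this is report-by-exception, where an agent forwards a reading only when it differs from the last reported value by at least a chosen threshold. The sketch below assumes that approach; the class name, the threshold, and the example readings are hypothetical.

```python
class DeadbandReporter:
    """Suppresses readings that have not moved far from the last reported value."""

    def __init__(self, threshold: float):
        self.threshold = threshold
        self._last_reported = None

    def should_report(self, value: float) -> bool:
        """Return True for the first reading and for any sufficiently large change."""
        if self._last_reported is None or abs(value - self._last_reported) >= self.threshold:
            self._last_reported = value
            return True
        return False


# Hypothetical usage: battery voltage sampled at a wayside location.
reporter = DeadbandReporter(threshold=1.0)
readings = [12.0, 12.1, 12.2, 13.1, 13.0, 11.9]
reported = [v for v in readings if reporter.should_report(v)]
```

The threshold trades bandwidth against resolution, so it is best chosen per variable with the consumers of the data rather than set once for every asset.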

While the scope of systems management should now be understood in the context of delivering an

interoperable and compliant PTC system, it should also be noted that there is a potential role for it to play

in configuration management as well. With PTC as the catalyst, equipment manufacturers have continued

to increase the amount of information that can be retrieved from their hardware using agents.

Manufacturers have included the ability to retrieve hardware and software versions from their equipment

using standard software calls. Having the ability to retrieve this information automatically – without the

need for a site visit – enables the railroad to be notified of changes to the configuration of assets

instantaneously. For example, a railroad could receive an automatic event notification that a software

CRC has changed on a WIU in the field. This event could then trigger a comparison exercise with a

software configuration management database, and if a discrepancy is detected the proper personnel can

be notified to take action. This information could potentially be of significant use to assist the railroads in

their efforts to comply with configuration management requirements in their Railroad Safety Program Plan

(RSPP).
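The comparison step in that example could look like the following sketch. It assumes the event carries an asset identifier and a software CRC as a hexadecimal string, and that the configuration management database can be read as a simple mapping; the names and the three outcome labels are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfigEvent:
    asset_id: str
    software_crc: str  # hexadecimal string as reported by the asset (assumed format)


def check_against_baseline(event: ConfigEvent, baseline: dict) -> str:
    """Compare a reported CRC with the approved value for that asset.

    `baseline` stands in for the software configuration management
    database: a mapping of asset identifier to approved CRC.
    """
    expected = baseline.get(event.asset_id)
    if expected is None:
        return "unknown-asset"
    if expected.lower() != event.software_crc.lower():
        return "discrepancy"
    return "match"
```

A result of "discrepancy" or "unknown-asset" would be the trigger for notifying the responsible personnel; how that notification is delivered is left to the railroad's own processes.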

Recommendations for Further Review

The following recommendations have been made in consideration of the input received when discussing

this paper and related topics in industry forums:

1.) Several standards and specifications related to PTC have not yet been published by the AAR.

Once these standards have been formally published, it could be useful to revisit the subject

matter discussed in this paper to assess the impact of the challenges noted herein.

2.) It may be of interest to explore the use cases associated with the use of ITCSM in an operational

PTC environment, especially one which integrates PTC systems management into a broader

network management paradigm.

3.) A final recommendation would be to review an actual scenario in which data provided by ITCSM

infrastructure was utilized to identify a root-cause failure associated with a train initialization

failure, braking event, or other occurrence that may be associated with the operation of PTC.

List of Acronyms

AAR Association of American Railroads

ATCS Advanced Train Control System

CFR Code of Federal Regulations

CHMM Crash Hardened Memory Module

COTS Commercial Off the Shelf

CRC Cyclic Redundancy Check

CTC Centralized Traffic Control

EMP Edge Message Protocol

HMAC Hashed Message Authentication Code

IETF Internet Engineering Task Force

ISMP Interoperable Systems Management Protocol

ITC Interoperable Train Control

ITCC Interoperable Train Control Communications

ITCM Interoperable Train Control Messaging

ITCR Interoperable Train Control Radio

LMS Locomotive Messaging Server

MCC Meteorcomm Communications

NMS Network Management System

OPK Operational Private Keys

RFC Request for Comments

RSPP Railroad Safety Program Plan

SBC Single Board Computer

SMG Systems Management Gateway

SNMP Simple Network Management Protocol

TMC Train Management Computer

WIU Wayside Interface Unit

WMS Wayside Messaging Server

WSM Wayside Status Messages

Acknowledgements

The author wishes to thank members of AREMA for their feedback as well as input on the next steps.

References

1.) "ITCC Overview." ITCC Overview. Federal Railroad Administration, 6 July 2012. Web. 27 May

2016. <https://www.fra.dot.gov/Elib/Document/2218>.

2.) “Communications and Security Requirements” 49 CFR 236.1033.2012

3.) “Railroad Safety Program Plan (RSPP)” 49 CFR 236.905.2012

4.) Hartong, Mark, Rajni Goel, and Duminda Wijesekera. "Positive Train Control (PTC) Failure

Modes." Positive Train Control (PTC) Failure Modes. N.p., 25 Dec. 2010. Web. 27 May 2016.

<http://www.sciencedirect.com/science/article/pii/S1018364710001588>.

5.) Mackenzie, Kevin. Lilee Systems. Key and Certificate Management and Key Server. San Jose,

CA: Lilee Systems, 2016. Print.

AREMA 2016 Annual Conference & Exposition

Agenda

• Technology context and components

• ITC Systems Management

• ITC SMG and SMA

• Evaluation of current technology

• Security

• Challenges

• Approaches and recommendations

• Further consideration

Technology Context and Ecosystem

• Interoperable Train Control Committees: initially four Class I railroads; development of interoperable standards for PTC

• Meteorcomm, LLC: owned by four Class I railroads; hired to implement the standards from the ITC

• Railroads and Vendors: consumers of specification; implement into hardware and software

• ITC Communications Network: Data Radio Network, Messaging System, Systems Management System

[Diagram: Data Radio Network media by type. Narrowband: 220 MHz, non-interoperable radios. Broadband: 802.11 Wi-Fi, 4G LTE, satellite.]

Data Radio Network (Component #1)

• Narrow band radio based on Meteorcomm reference design

• Broadband utilized for large data downloads (train initialization and software updates to assets)

• Alternate paths for redundant communications and to augment coverage issues


ITCM (Component #2)

• Suite of software that is deployed in both the back office and remote environments

• Store-and-forward system that does not guarantee delivery

[Figure: Messaging System diagram. Labels: Back Office Systems, Onboard Systems, Wayside Systems, Systems Management, Messaging System, EMP (four links), IP (Cell, Wi-Fi, Satellite), ITC Radio (220 MHz)]


ITCM (Component #2)

• Runs on SBCs currently based on x86 architectures

• WMS and LMS perform a similar function.

• Remote areas connect to the back office using ITCM through any available transport

• Remote areas connect to the back office through an ITCM Application Gateway

[Figure: Wayside Messaging Server and Locomotive Messaging Server shown as extensions of the messaging system, alongside the Back Office Deployment]


Locomotive and Wayside Messaging Server (ITCM, Component #2)

Hardware
• x86 application engine
• Ethernet switch
• IP router

Software
• Red Hat 6.x
• ITC Messaging
• Systems Management Agents (SMA): ITC Radio SMA, WIU SMA, WMS SMA, SNMP SMA

Implementations of software differ across railroads.


ITCSM (Component #3)

• Passes information about asset health, configuration, alarms, and other data

• Provides framework for over-the-air software downloads, remote command execution, and kits.

• Subcomponents

– ITC SMG (Systems Management Gateway)

– SMA (Systems Management Agents)



Systems Management Gateway

ITC SMG
• Orchestration
• Security Services

[Figure: ITC SMG diagram. Labels: Transport, Back Office Systems / Applications, Transport, ITC Assets (Messaging System, WIUs, TMCs, ITCR, etc.), ITC Messaging, Other Railroads]


Systems Management Agents

[Figure: Systems Management Agents diagram. Labels: ITC SMG, Systems Management Operational Console / GUI, ITCM, Remote Agent, Embedded Agent (Asset API), ITC Asset (two instances), IP]


ISMP

• Interoperable Systems Management Protocol

• Uses a format similar to PTC status messages so that it can use the same transport

• Efficient for transport over narrowband links (220 MHz)

• Proxy via SNMP and EMP (remote agent)


Evaluations of Current Technology

• Commercial NMS systems using standard protocols are plentiful!
– Already utilized by railroad IT departments
– Standardized IETF protocols
– Asset agnostic

• Vendor-specific management systems already in use by railroads
– Crossing monitoring and management
– Defect detectors and HBDs
– Radio code management


Evaluations of Current Technology

• Narrowband radio code networks had limited capacity for network management functions

• Evaluation of SNMP
– Right-sized for narrowband
– Limited protocol (GET / SET / TRAP)
– Desire for enhanced encryption and authentication
– Optimize for functions such as delivering software to many remote areas via a single command


Security - OPKs

• ITCSM provides security through public and private keys in conjunction with X.509 security certificates

• Operational Private Keys (OPKs) are used to authenticate messages sent across the network

• Each OPK is a 20-byte binary key used with the Hash-based Message Authentication Code (HMAC) algorithm to authenticate each message

• The sender of a message uses the OPK and the HMAC algorithm to calculate a hash value over the content of the message

• The receiver must know the same OPK and uses it to verify that the content of the message has not been altered
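
The sender and receiver steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides give the key length (20 bytes) and name HMAC, but not the underlying hash function, so SHA-1 is assumed here, and the function names `sign` and `verify` are hypothetical rather than part of any ITC specification.

```python
import hashlib
import hmac

OPK_LENGTH = 20  # key size in bytes, as stated on the slide


def sign(opk: bytes, content: bytes) -> bytes:
    """Sender side: compute an HMAC tag over the message content."""
    if len(opk) != OPK_LENGTH:
        raise ValueError("OPK must be exactly 20 bytes")
    # ASSUMPTION: SHA-1 as the hash; the source does not name the hash function.
    return hmac.new(opk, content, hashlib.sha1).digest()


def verify(opk: bytes, content: bytes, tag: bytes) -> bool:
    """Receiver side: recompute the tag with the shared OPK and compare."""
    expected = sign(opk, content)
    # compare_digest runs in constant time to avoid leaking timing information.
    return hmac.compare_digest(expected, tag)
```

A receiver holding the same OPK accepts an unmodified message and rejects one whose content changed in transit, because the recomputed tag no longer matches.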


X.509 Certificates

• In PTC, X.509 certificates are similar to those used in datacenters and computers

• Contains identification information and key material

• Implements public/private key encryption in order to protect sensitive systems management messages

• Used for communication between the systems management server and remote devices, or for communication between the different railroads’ SMGs


Security

[Figure: Railroad A and Railroad B, linked by X.509 certificates. Labels: Private Key, Public Key]


Key Server and Management

• Key Management is inclusive of both OPKs and X.509 certificates

• Keys are used to authenticate, encrypt, and decrypt messages, both for the basic functions of the system and for management of its operation

• The private keys for each remote device can be kept secure in an asset repository (key server) located in the back office

• Key server generates OPKs for remote assets
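
As a rough illustration of the key server role described above, the sketch below generates a random 20-byte OPK per remote asset and keeps it in a back-office repository. The class name `KeyServer`, its methods, and the asset identifiers are hypothetical, and an in-memory dictionary stands in for whatever secured storage a railroad would actually use.

```python
import secrets

OPK_LENGTH = 20  # bytes, matching the OPK size given earlier in the slides


class KeyServer:
    """Hypothetical back-office repository that issues and stores OPKs."""

    def __init__(self):
        # ASSUMPTION: a plain dict replaces real secured, persistent storage.
        self._keys = {}

    def issue_opk(self, asset_id: str) -> bytes:
        """Generate a fresh random OPK for a remote asset and record it."""
        key = secrets.token_bytes(OPK_LENGTH)
        self._keys[asset_id] = key
        return key

    def lookup(self, asset_id: str) -> bytes:
        """Return the OPK on file for an asset (KeyError if unknown)."""
        return self._keys[asset_id]
```

Because sender and receiver must share the same OPK, the value returned by `issue_opk` would still have to reach the remote asset over a protected channel; that distribution step is outside this sketch.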


Challenges

• No Precedent
– Specifications had to be created
– Solutions have to be developed, tested, and implemented
– Specifications may evolve (“lessons learned”)
– Scale of deployment: issues may not be prevalent on smaller deployments

• Prioritization
– ISMP optimization
– Bandwidth limitations of 220 MHz
– Relative priority of systems management traffic
– Effect of retransmissions and duplicates

• Handling
– Organizational structures
– Identifying stakeholders of systems management

• “Scope creep”
– Particularly on large systems integrator projects
– Scope of network management often incomplete in specification


Approaches and Recommendations

• Understand that systems management is a PROCESS

• Further the evolution of agents to automate root cause analysis

• Leverage the investment in the PTC systems management infrastructure to integrate non-standard edge devices

• Optimize the railroad operation once PTC is installed


Analytics and Correlation

• Identify and refine data

• Consolidate the data with existing NMS

• Recognize the trends and potential contributors

• Categorize the consumers

• Understand how the information can be valued

• Consider the investment in out-of-band solutions to augment the bandwidth available solely through a narrowband 220 MHz radio

[Figure labels: Positive Train Control; Current Operating Environment]


Configuration Management

• Manufacturers have continued to increase the amount of information that can be retrieved from their hardware agents

• Hardware and software versions retrieved using standard software calls

• This ability enables the railroad to be notified of changes to the configuration of assets instantaneously

• Potentially of significant use to assist the railroads in their efforts to comply with configuration management requirements
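
One way to picture the change notification described above is a comparison between the configuration on record for an asset and the configuration its agent reports. The sketch below is an assumption-laden illustration: the function `diff_config` and the field names are hypothetical, and a real agent would report whatever attributes the manufacturer exposes.

```python
def diff_config(baseline: dict, reported: dict) -> dict:
    """Return {field: (recorded_value, reported_value)} for every mismatch."""
    changes = {}
    # Check every field seen on either side so additions and removals show up.
    for field in sorted(set(baseline) | set(reported)):
        old = baseline.get(field)
        new = reported.get(field)
        if old != new:
            changes[field] = (old, new)
    return changes


# Hypothetical record for one asset versus what its agent reports.
recorded = {"hardware_version": "B", "software_version": "1.2.0"}
reported = {"hardware_version": "B", "software_version": "1.3.0"}
print(diff_config(recorded, reported))
# prints {'software_version': ('1.2.0', '1.3.0')}
```

A non-empty result could raise an alert or be logged as evidence for configuration management compliance; an empty dictionary means the asset matches its record.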


For Further Consideration

• Once the AAR standards have been formally published, revisit the subject matter to assess the impact of the challenges

• Explore the use cases associated with ITCSM in an operational PTC environment, especially one which integrates PTC systems management into a broader network management paradigm

• Review an actual scenario in which data provided by ITCSM infrastructure was utilized to identify a root-cause failure associated with a train initialization failure, braking event, or other occurrence that may be associated with the operation of PTC


Thank You!

Kevin Nichter [email protected]
