
Verification of Risk Algorithm Implementations in a Clearing System Using a Random Testing Framework

Mikael Vahlberg

May 24, 2017

Master Thesis, 2017, 30 hp
MSc Industrial engineering and management – Risk management, 300 hp

Spring 2017


Abstract

Clearing is the process of keeping track of transactions until they are settled. Standardized derivatives such as options and futures can be cleared through a clearinghouse by its clearing members. The clearinghouse steps in as an intermediary between the trading parties and manages all resulting counterparty risk. To keep track of all transactions and to monitor its members' risk exposure, a clearinghouse uses advanced clearing software. Counterparty risk is mainly handled by collecting collateral from each clearing member. The initial collateral that a clearinghouse requires from a member trading in derivatives is called initial margin. Initial margin is calculated by a risk algorithm incorporated in the clearing software. Cinnober Financial Technology delivers clearing solutions to clearinghouses worldwide, and software providers to the financial industry face high demands on software quality. High software quality can be ensured by performing various types of software testing.

The goal of this thesis is to implement an extendable random testing framework that can test risk algorithm implementations that are part of a clearing system under development by Cinnober. Using the implemented framework, we aim to verify whether the risk algorithm SPAN calculates a fair initial margin amount. We also intend to increase the quality assurance of the risk domain, which is responsible for all risk calculations.

In this thesis we implement a random testing framework suitable for testing risk algorithms. Furthermore, we implement a framework extension for SPAN that is used to test the SPAN algorithm's initial margin calculations. The implementation consists of two main parts. The first is a random generation entity that feeds the clearing system with randomized input data. The second is a verification entity called a test oracle, which is responsible for verifying the SPAN algorithm's calculation results.

The random testing framework for risk algorithms was successfully implemented. By running the SPAN extension of the framework, we found four issues related to the accuracy of the SPAN algorithm. This discovery led to the conclusion that the current SPAN algorithm implementation does not calculate a fair initial margin. It also led to an immediate increase in quality assurance because the issues will be corrected. As a result of the framework's extensible design, long-term quality also increases.

Sammanfattning (Summary)

Clearing keeps track of transactions until they are settled. Standardized derivatives such as options and futures can be cleared through a clearinghouse if you are a clearing member. The clearinghouse steps in as an intermediary between members and handles all counterparty risk. To keep track of all transactions and also monitor the members' risk exposure, the clearinghouse uses advanced clearing software. Counterparty risk is mainly handled by collecting collateral from each clearing member. The initial collateral that a clearinghouse requires from a member trading in derivatives is called initial margin. Initial margin is calculated by a risk algorithm that is part of the clearing software. Cinnober Financial Technology delivers clearing solutions to clearinghouses all over the world. Software providers to the financial industry face high demands on software quality. High software quality can be ensured by performing various types of software testing.

The goal of this thesis is to implement an extendable test framework that uses random data and can test risk algorithm implementations that are part of a clearing system developed by Cinnober. By using the implemented framework, we aim to verify whether the risk algorithm SPAN calculates a fair margin amount. We also intend to increase the quality assurance of the risk domain, which is responsible for all risk calculations.

In this thesis we implement a test framework suitable for testing risk algorithms. In addition, we implement an extension of the test framework for SPAN, which is used to test the SPAN algorithm's initial margin calculations. The implementation consists of two main parts. The first is a random generation unit that feeds the clearing system with random data. The second is a verification unit called a test oracle, which is responsible for verifying the SPAN algorithm's calculation results.

The framework that uses random data for risk algorithms was implemented successfully. By running the framework's SPAN extension, we found four problems related to calculation details in the SPAN algorithm. This discovery led to the conclusion that the current SPAN algorithm implementation does not calculate a fair initial margin. It also led to an immediate increase in quality assurance, since the problems will be fixed. As a result of the framework's extensibility, long-term quality also increases.


Acknowledgements

I would like to express my gratitude to my supervisor Andre Massing, who has guided me through the writing process of this thesis. I also want to thank Noah Hojeberg for opening my eyes to the next generation of software testing. And a special thanks to my partner Maja Andersson for her support throughout my education. Thank you!


Contents

1 Introduction
    1.1 Background
    1.2 Clearing
    1.3 Software Testing
    1.4 Purpose

2 Approach and Limitations
    2.1 Implement Random Testing Framework
    2.2 Implement Test Oracles
    2.3 Limitations

3 Project Organization
    3.1 About Cinnober
    3.2 Previous work at Cinnober

4 Theoretical Background
    4.1 Preliminaries
    4.2 Standard Portfolio Analysis of Risk
        4.2.1 Combined Commodities
        4.2.2 Scan Risk
        4.2.3 Intra-Commodity Spread Charge
        4.2.4 Inter-Commodity Spread Credit
        4.2.5 Net Option Value
        4.2.6 Short Option Minimum
    4.3 Test Oracle
        4.3.1 Heuristic Oracle
        4.3.2 True Oracle
        4.3.3 Sampling Approach
    4.4 Random Testing

5 Random Testing Framework Implementation
    5.1 Testing structure
        5.1.1 Single-Threaded Structure
        5.1.2 Multi-Threaded Structure
    5.2 Finding Heuristics
        5.2.1 System Breakdown
        5.2.2 Simple Patterns
    5.3 Data generation
    5.4 Single-Threaded Model
        5.4.1 Implementation
    5.5 Multi-Threaded Model
        5.5.1 Implementation

6 Random Testing Results
    6.1 Single-Threaded Model
    6.2 Multi-Threaded Model

7 Implementation Results
        7.0.1 Extensibility

8 Discussion

9 Conclusion and Outlook


Glossary

Clearing: The procedure of keeping track of transactions until they are settled.

Risk: The possibility of financial losses from investments.

Central Counterparty: An intermediary for financial transactions.

Clearinghouse: An organization clearing transactions by acting as a central counterparty.

Clearing Member: A member of the clearinghouse. Members are allowed to trade standardized derivatives with other members using the clearinghouse as an intermediary.

Instrument: Financial contract tracking one or many financial assets.

Collateral: A pledged asset held as security while awaiting a future payment.

Derivative: A tradable financial contract that derives its value from one or more underlying assets.

Future: Financial derivative where the buyer and seller have agreed upon a price to exchange an underlying asset at a specific point in time in the future.

Option: Financial derivative where the buyer has the right to buy the underlying asset at a predetermined price. The seller has the obligation to sell the asset to the buyer.

Contract Size: The quantity to be delivered per contract held in a derivative. A stock option may have a contract size of 100 lots, meaning 100 lots of the underlying stock will be delivered per contract held.

Long/Short: A long position generally means that the speculator has purchased an asset believing that it will increase in value. A short position means that the speculator has sold the asset and expects a decrease in its value.

Initial Margin: An initial sum of money that a clearinghouse requires from its clearing members when dealing with derivatives. Works as a margin of safety for the clearinghouse.

Test Oracle: An instance determining whether a software's response to a certain test is correct or not.

Spot Price: The current price of an asset with immediate payment and delivery.

Option Exercise: An option exercise means that the buyer of the option exercises the right that was specified in the option contract.

Over-The-Counter: Trading assets directly between participants outside an exchange.

Interest Rate Curve: A curve built from interest rates at different points in time. Also known as a yield curve.

Implied Volatility Surface: A surface extracted from option market data. The surface displays volatilities for options with different times to maturity and moneyness.


Abbreviations

SPAN Standard Portfolio Analysis of Risk

CME Chicago Mercantile Exchange

CC Combined Commodity

STM Single-Threaded Model

MTM Multi-Threaded Model

OTC Over-The-Counter

PSR Price Scan Range

VSR Volatility Scan Range


1 Introduction

1.1 Background

Counterparty credit risk is one of the key risks of the financial system today. The financial crisis in 2008 proved the magnitude of this particular risk when two investment banks were hit: Bear Stearns failed, and Lehman Brothers collapsed and declared bankruptcy. How much these events stressed the global economy and the financial system is difficult to quantify, but it is certain that they did. LCH Clearnet, clearinghouse to Lehman Brothers during 2008, declared the bank in default and thereby assumed all of Lehman's assets and risks. LCH Clearnet inherited, among many other assets, a swap portfolio worth $9 trillion that had to be managed. The clean-up process was successful, meaning that collateral collected from LCH Clearnet's members covered all losses. As a result, Lehman's counterparties did not suffer any major losses due to this enormous bankruptcy. de Teran [2008]

Reducing counterparty credit risk is one of a clearinghouse's main objectives. Doing this requires software solutions that can monitor trades, collect collateral and calculate margin requirements for its clearing members, often big financial institutions such as Lehman Brothers. Collateral is collected to cover losses arising from payment defaults. A payment default means that a counterparty is unable to fulfill its obligations, so the payments owed to you are never made. The clearinghouse wants to offer a fair margin requirement. A margin requirement that is too low means that all parties, including the clearinghouse, might be taking on a lot of unhandled risk. A margin requirement that is too high could make trading too expensive and discourage traders from trading. How much collateral each clearing member should pay is up to the clearinghouse to decide. Clearinghouse-specific software is responsible for making sure each clearing member gets the correct margin requirement according to the positions currently held in its portfolio. This is why highly reliable software with proven accuracy and quality is important.

1.2 Clearing

Clearing is keeping track of trades and evaluating risk exposure until the trades are settled. Risk here typically means the possibility of financial losses from investments. A trade, or a deal, is an agreement between two parties to exchange tradable goods such as commodities, currencies, derivatives or stocks. The trade is kept in the clearing organization's records, much like bookkeeping and accounting, but keeping track of trades and payment obligations instead. Clearing derivative instruments is special in the sense that most derivatives depend upon an underlying asset's price at a certain point of time in the future. Deals of this nature make payment ability more unpredictable and increase the risk of payment defaults. Hovmoller [2015]

To make derivative clearing possible, a clearinghouse steps in as an intermediary between the parties to a transaction and becomes a central counterparty (CCP). Clearing trades through a clearinghouse requires dealers to be members of the clearinghouse. Clearing members are often big financial institutions and are the only entities connected directly to the clearinghouse. Brokers, trading firms and other non-member clients are connected to clearing members of the clearinghouse. The overall structure is shown in Figure 1.

[Diagram: CH at the top; CM1 and CM2 below it; TP1 to TP3 below the CMs; clients C1 to C6 at the bottom.]

Figure 1: Simple clearing topology showing all participating parties and how they are connected to each other. A clearinghouse (CH) only has clearing members (CM) connected to it. CMs are usually bigger financial institutions and may have other trading participants (TP) clearing trades through them. Clients (C) such as traders and retail investors are connected to the TPs.

Without a CCP, deals would be executed directly between traders A and B. Introducing a CCP means that all deals conducted by clearing members flow through the CCP and one deal becomes two, as illustrated in Figure 2. Assume members A and B are interested in trading 10 barrels of oil with each other next year: A wants to buy 10 barrels and B wishes to sell the same amount. Two trades are required: A and B each form an individual agreement with the CCP, A to buy and B to sell 10 barrels of oil next year. This kind of agreement is called a futures contract and is commonly traded on exchanges.

[Diagram: without a CCP, A trades directly with B; with a CCP, A and B each trade with the CCP.]

Figure 2: A trade with and without a CCP as intermediary.

Derivatives come in a wide range. They derive their value from one or more underlying assets, which can be anything from a commodity to a stock index. Not all derivatives are cleared, because some exotic derivatives can be difficult to value mathematically; these are mainly traded OTC. A clearinghouse deals with standardized contracts, most commonly futures and various types of options. By concentrating on standardized contracts, and by being able to monitor all deals, a clearinghouse can make a quantitative risk assessment and calculate appropriate margin requirements. A margin requirement is a sum of money that the clearinghouse requires from its members. Each member has to pay the clearinghouse collateral (money or other accepted assets) to cover its individual margin requirement.

A key mission for a clearinghouse is to make risk assessments and collect collateral from its members in order to reduce counterparty risk for traders dealing with derivatives. With counterparty credit risk being one of the key risks in financial markets today, risk evaluation for a CCP is fundamental and in the interest of all involved parties. The risk assessment done by the clearinghouse is usually made with software using one or several risk algorithms, for example Standard Portfolio Analysis of Risk (SPAN), Value at Risk (VaR) or Expected Shortfall (ES). These algorithms calculate a margin requirement, or initial margin, which is the amount of initial collateral a member is required to pay to the CCP. [Rehlon, 2013, p. 2] What type of collateral is considered valid is up to the clearing organization to decide. Some commonly accepted assets are domestic currency, government bonds and foreign currency.

Using a central counterparty provides several benefits to its members, such as risk and exposure reduction. In a CCP network, participants do not have to consider their actual counterparty on the other side of the trade, as long as they trust the CCP, which instead has to consider the counterparty risk. Another benefit is the multilateral netting of transactions. In a bilateral network each individual transaction has to be settled, while multilateral netting allows contracts to be summed, which reduces gross exposure, see Figure 3. Consider participant A using bilateral transactions. A is exposed to D and risks losing $80 in the event of D defaulting. Participant A is still obliged to pay $50 to B, so A has a gross exposure of $130. By instead using a CCP and multilateral netting, A's only exposure is towards the CCP, and as netting is possible the total gross exposure is reduced to $30, which is $100 less than with bilateral transactions. [Rehlon, 2013, p. 2-4]
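The netting arithmetic can be illustrated with a short Java sketch. It is an illustration only: the class and method names are hypothetical, and only two obligations (D owes A $80, A owes B $50) are stated in the text. The other two obligations and their directions (B owes C $100, C owes D $30) are assumptions chosen so that the totals agree with the amounts shown in Figure 3.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Sketch of bilateral versus multilateral netting for the Figure 3 example. */
final class NettingSketch {

    /** One payment obligation: payer owes amount (in dollars) to receiver. */
    static final class Obligation {
        final String payer;
        final String receiver;
        final long amount;

        Obligation(String payer, String receiver, long amount) {
            this.payer = payer;
            this.receiver = receiver;
            this.amount = amount;
        }
    }

    /** Example obligations; only the A->B and D->A entries are stated in the text. */
    static Obligation[] figure3() {
        return new Obligation[] {
            new Obligation("A", "B", 50),   // stated: A is obliged to pay $50 to B
            new Obligation("B", "C", 100),  // assumed payer and receiver
            new Obligation("C", "D", 30),   // assumed payer and receiver
            new Obligation("D", "A", 80),   // stated: A loses $80 if D defaults
        };
    }

    /** Bilateral gross exposure: every obligation is settled individually. */
    static long bilateralGross(Obligation[] obligations) {
        long total = 0;
        for (Obligation o : obligations) {
            total += o.amount;
        }
        return total;
    }

    /** Net position per participant towards the CCP (positive means "receives"). */
    static Map<String, Long> netPositions(Obligation[] obligations) {
        Map<String, Long> net = new LinkedHashMap<>();
        for (Obligation o : obligations) {
            net.merge(o.payer, -o.amount, Long::sum);
            net.merge(o.receiver, o.amount, Long::sum);
        }
        return net;
    }

    /** Multilateral gross exposure: the sum of the absolute net positions. */
    static long multilateralGross(Map<String, Long> net) {
        long total = 0;
        for (long position : net.values()) {
            total += Math.abs(position);
        }
        return total;
    }

    public static void main(String[] args) {
        Obligation[] obligations = figure3();
        Map<String, Long> net = netPositions(obligations);
        System.out.println("Bilateral gross exposure:    " + bilateralGross(obligations));
        System.out.println("Net positions via the CCP:   " + net);
        System.out.println("Multilateral gross exposure: " + multilateralGross(net));
    }
}
```

Under these assumptions A's net position is +$30, matching the reduction from $130 to $30 described above, and the network totals are $260 bilaterally and $200 with the CCP.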

The key benefit of using a CCP follows from a clearinghouse's key mission: the process of handling defaults. A payment default by one trader may cause other traders to default on their payments. As this is a fragile network with possible chain reactions, the CCP tries to eliminate counterparty risk by collecting collateral to cover potential losses from payment defaults, thus guaranteeing that obligations are met. [Rehlon, 2013, p. 1-2] Ensuring correct margin requirements by gathering enough collateral is important for handling default situations.

How can risk algorithm implementations be validated to reduce potential errors? Various software testing methods allow us to explore and verify both simple and complex algorithms.

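[Diagram: left, a bilateral network in which A, B, C and D hold direct obligations of $50, $100, $30 and $80; right, the same participants connected only to a CCP, with obligations of $30, $50, $70 and $50.]

Figure 3: The bilateral network to the left has a gross exposure of $260, while the multilateral network to the right has a gross exposure of $200.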

1.3 Software Testing

Software testing is performed to ensure and measure the quality of software. Well-designed testing and test execution can increase a project's level of confidence in the overall condition of the software, which is important when delivering solutions to customers. A project aims to fulfill the user needs and the technical and functional requirements of a customer. Evaluating whether or not the user requirements and user needs are fulfilled is called validation, while verification is the process of evaluating whether a software product meets the requirements set by the customer. Testing continuously alongside the software development is preferred because the cost of fixing defects increases over time. [Graham et al., 2008, p. 6]

Software testing is a broad concept and is often divided into different types of testing. The common types are component testing (or unit testing), integration testing, system testing and acceptance testing.

Component Testing Component testing, usually referred to as unit testing, is conducted at a low system level and focuses on testing specific functions, modules, classes etc. By targeting components of the system, unit testing is independent of the system and its state. [Graham et al., 2008, p. 41]

Integration Testing Integration testing is performed to test the integration between different modules, functions or areas of a system. This is on a higher level than component testing, as the global system design is tested rather than the detailed design. Integration between modules is key to a fully functional system, which makes integration testing important. [Graham et al., 2008, p. 42]

System Testing Testing the system as a whole and evaluating its overall behavior is called system testing. This type of testing works on a higher level than both previously mentioned test types, meaning testing is executed from a realistic perspective, testing actual system flows from a user's viewpoint. System testing is commonly based on use cases and requirements, but general behavior and business logic also fit into this level of testing. System testing should consist of both functional and performance testing, as both are important in a live system. [Graham et al., 2008, p. 43-44]

Acceptance Testing Acceptance testing is conducted by the actual customer or user to evaluate whether the system is ready to be released. All types of testing are ways of increasing quality assurance, and acceptance testing is the last layer to increase confidence before releasing the system. [Graham et al., 2008, p. 44-45]

Test Automation Automated tests are tests executed automatically by a computer. These tests typically run a specific test exercise against the system repeatedly and can be categorized as regression testing. Regression testing verifies that the software still behaves as intended after changes in the software code. Automated tests are great at regression testing but have limited possibilities of finding software defects that could easily have been found with manual testing. The reason is that a manual tester varies the input to the system, which the automated test does not. [Hoffman, 2000, p. 1-2]

Random Testing Douglas Hoffman describes random testing as the second generation of test automation. The general idea is to randomize input data instead of feeding the system with the same data over and over again, which is the behavior of regular automated tests. Randomizing test input increases the variety and combinations of test data. It also allows an increase in testing intensity.

Test Oracles One of the biggest challenges of software testing is to figure out the expected results from the system under test. A test oracle is an entity that should know the expected results of the system. For example, the person performing manual testing or building an automated test is considered the oracle, as he or she has to know the expected results from the system.

1.4 Purpose

The purpose of this thesis is to create a sophisticated and flexible random testing suite focusing on testing risk algorithms. With the implemented testing suite we aim to test a risk algorithm implementation that is part of a clearing system under development. More specifically, we want to answer whether the risk algorithm SPAN calculates a fair initial margin for the clearing members. As all clearing members have to provide collateral to cover the margin requirements, it is important to have an accurate and reliable algorithm. A system under development may contain undiscovered bugs. By testing the system, any issues found can be fixed before the system is released. Finding bugs is not the main purpose of this thesis, however. With this study we aim to increase the overall quality assurance of the risk algorithm implementation used in the clearing system. To summarize, the thesis aims to:

• Create a flexible and extendable random testing framework well suited for risk algorithms.

• Uncover potential issues in the clearing system related to the SPAN algorithm.

• Increase quality assurance, specifically in the risk domain.


2 Approach and Limitations

2.1 Implement Random Testing Framework

The risk algorithm implementations will be tested using a random testing approach. The idea of random testing is to continuously feed the system under test (SUT) with pseudo-random data such as random trades and asset prices, and then analyze the output with a test oracle. For this thesis the output we are interested in is the initial margin amount. Java is the programming language used to implement the test oracles and the random testing framework. An in-house test framework as well as the TestNG framework are available tools for the random testing. The implementation design of the random testing framework shall be flexible enough to allow for seamless integration of additional testing components.
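The feed-and-check loop described above can be sketched in plain Java. This is a minimal illustration and not the framework built in the thesis: the `Trade` type, the `Oracle` interface, the value ranges and the toy margin rule in `main` are all hypothetical, and the clearing system is replaced by a simple function from a trade to a margin amount.

```java
import java.util.Random;
import java.util.function.ToDoubleFunction;

/** Sketch of a random testing loop: random input, system under test, oracle. */
final class RandomTestingSketch {

    /** Hypothetical randomized input: a position in one instrument at a price. */
    static final class Trade {
        final int quantity;   // positive = long, negative = short
        final double price;

        Trade(int quantity, double price) {
            this.quantity = quantity;
            this.price = price;
        }
    }

    /** Hypothetical oracle: decides whether an output is acceptable for an input. */
    interface Oracle {
        boolean accepts(Trade input, double initialMargin);
    }

    /** Draws one pseudo-random trade; the ranges are arbitrary assumptions. */
    static Trade randomTrade(Random rng) {
        int quantity = rng.nextInt(2001) - 1000;        // uniform in [-1000, 1000]
        double price = 1.0 + 99.0 * rng.nextDouble();   // uniform in [1, 100)
        return new Trade(quantity, price);
    }

    /** Feeds random trades to the SUT and counts outputs rejected by the oracle. */
    static int run(long seed, int iterations, ToDoubleFunction<Trade> sut, Oracle oracle) {
        Random rng = new Random(seed);   // fixed seed makes a failing run reproducible
        int failures = 0;
        for (int i = 0; i < iterations; i++) {
            Trade trade = randomTrade(rng);
            double margin = sut.applyAsDouble(trade);
            if (!oracle.accepts(trade, margin)) {
                failures++;
            }
        }
        return failures;
    }

    public static void main(String[] args) {
        // Toy SUT (assumption): margin is 10% of the absolute position value.
        ToDoubleFunction<Trade> sut = t -> Math.abs(t.quantity) * t.price * 0.10;
        // Heuristic oracle (assumption): an initial margin is never negative.
        Oracle nonNegative = (t, margin) -> margin >= 0.0;
        System.out.println("Rejected outputs: " + run(42L, 10_000, sut, nonNegative));
    }
}
```

Passing the seed explicitly is a deliberate choice in this sketch: when the oracle rejects an output, the same seed regenerates the same input sequence, so the failure can be replayed.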

2.2 Implement Test Oracles

The SPAN algorithm's initial margin amount is verified by implementing various types of oracles. As random data is continuously fed into the system, the oracle listens to updates in the system and validates the results for each update. Parts of the SPAN algorithm are verified exactly using what is called a true oracle, and other parts are verified using an oracle based on heuristics. Additionally, an error tracking script is implemented to find exceptions related to risk calculations in the log files of the different servers. Two typical exceptions of interest to us are ArithmeticException and NullPointerException. The first can, for example, be thrown if a division by zero occurs in a server. The latter is typically thrown when a value is missing (null) where the application expects an object. How the different oracles work is explained in the theoretical background section.
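A log scan of this kind could look roughly like the sketch below. It is not the script used in the thesis: the class name, the method names and the idea of matching exception names by substring are assumptions made for illustration, and real server logs may need a more careful parser.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Sketch of an error tracking scan over server log lines. */
final class LogScanSketch {

    /** Exception names of interest, as named in the text. */
    static final String[] WATCHED = {"ArithmeticException", "NullPointerException"};

    /** Returns every log line that mentions one of the watched exceptions. */
    static List<String> findExceptions(List<String> logLines) {
        List<String> hits = new ArrayList<>();
        for (String line : logLines) {
            for (String name : WATCHED) {
                if (line.contains(name)) {
                    hits.add(line);
                    break;   // report each line once, even if it matches twice
                }
            }
        }
        return hits;
    }

    /** Convenience wrapper: reads a log file (UTF-8) and scans all of its lines. */
    static List<String> scanFile(Path logFile) throws IOException {
        return findExceptions(Files.readAllLines(logFile));
    }
}
```

Keeping the matching logic separate from the file access makes it possible to check the scan against in-memory lines without touching real server logs.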

2.3 Limitations

This thesis work has a restricted time frame that creates certain limitations.

There exist many risk algorithms, and clearinghouses might use one or several. This work will only cover one of them, Standard Portfolio Analysis of Risk (SPAN), which is the risk algorithm that is part of the particular clearing system considered in this thesis.

General material and previous studies of SPAN are few, which may limit the depth and complexity of the SPAN oracle to be implemented. No material regarding characteristics of SPAN such as continuity, complexity, distribution or dynamics has yet been found. Determining those characteristics would amount to a thesis of its own.

A risk algorithm can never be tested completely: as there is an infinite number of input data combinations, there is an infinite number of system output results. For this reason, distribution assumptions about the input data will be made, and data randomized according to them will be fed into the system to create a usable output set to analyze. As most real trading data is confidential or very expensive, creating a realistic and sophisticated input data model is outside the scope of this thesis.


3 Project Organization

This master thesis is written on behalf of Cinnober Financial Technology with Noah Hojeberg as supervisor at Cinnober. The supervisor at Umeå University is Andre Massing.

3.1 About Cinnober

Cinnober Financial Technology delivers financial software to clearinghouses, exchanges, banks and other financial institutions. Clearing and trading solutions are the main products Cinnober provides to its customers. The financial industry and markets as a whole place high demands on speed and reliability when it comes to software systems. A challenge for Cinnober is to meet these high quality demands when delivering complex solutions.

3.2 Previous work at Cinnober

Cinnober has been part of several master theses regarding software testing using a random testing approach. Most of the previously written theses focus on testing Cinnober's trading system TRADExpress, while this thesis tests their Real-time Clearing System.

A study by the Cinnober supervisor of this thesis, Noah Hojeberg, is about random testing in a trading system Hojeberg [2008]. That thesis provides great general knowledge about software testing, random testing and test oracles. It also shows indications that random testing is a powerful testing method that has a higher probability of finding defects than regular automated testing.

Because random testing is well known at Cinnober, some random testing suites are already implemented. I have been involved in building one of them together with Jonas Nylen, and it is used to test a real-time clearing system in a more general sense than this thesis aims to do.


4 Theoretical Background

This section covers the theory required to implement a SPAN-specific oracle and other related financial theory.

To validate the SPAN algorithm implementation, a study of the algorithm is essential. How does the algorithm calculate initial margin? What values can be adjusted to manipulate the calculation result? These two questions have to be answered to test whether the system calculates a fair initial margin amount.

Software testing is a very broad concept where many different strategies can be applied. This section also provides theory about software testing and the test methods used in this thesis, that is, random testing utilized together with different types of test oracles.

4.1 Preliminaries

Future A future is a tradable instrument or financial contract commonly traded on many exchanges. The contract is an agreement where a buyer and seller have agreed upon a predetermined price to exchange an underlying asset at a certain time in the future. Suppose that traders A and B enter an agreement to exchange 1000 bushels of corn for $3.5 per bushel in January next year. In January next year, when the agreement expires, both traders have an obligation to fulfill their agreement, regardless of how the market price has evolved between the start and the expiration of the agreement. Many traders use futures to speculate on price changes of an asset rather than trading them for the purpose of having the underlying asset delivered when the contract expires. Hull [2012]

For example, imagine that trader A agreed to buy 1000 corn bushels for $3.5 per bushel from trader B in January next year. A couple of days after the agreement took place, the market price of January corn futures increased to $3.7, giving trader A a chance to realise a profit. To do that, trader A could enter another agreement with trader C to sell 1000 bushels of corn for $3.7 per bushel in January next year. Trader A will make a profit of ($3.7 − $3.5) · 1000 = $200 and be left with a net position of 0 January corn futures.

Option Options come in a variety of types with different exercise styles. As a clearing organization only deals with standardized contracts, the focus will be on the common options.

A call option is the first option type. It gives the buyer the right to buy an underlying asset at a certain price at a certain date in the future. The seller of the call option has the obligation to sell the underlying asset to the buyer if the buyer decides to exercise the option. A put option gives the buyer the right to sell the underlying asset at a predetermined price and date, which means the seller of the put option has the obligation to buy the underlying asset if the buyer exercises the option. Exercising the option means that the buyer has decided to buy or sell the underlying asset at the previously determined price at the expiration date. There are also options that can be exercised at any time; these are called American options. By performing an option exercise, the buyer exercises the right specified in the option contract.

The two most common exercise styles are European and American. European optionscan only be exercised at the exact date of expiration while american options can be exercisedat any time between the purchase date and expiration date.


A standardized option always has a strike price, which is the price the buyer of a call option pays for the underlying asset when the option is exercised. For a put option, it is the price at which the option buyer can sell the underlying asset. An option also has an expiration date, which is the date on which the option expires.

It only makes sense to exercise an option if it is in the money. A call option is in the money when the underlying asset’s price exceeds the strike price. The opposite holds for a put option, which is in the money when the underlying asset’s price is below the strike price. The buyer of a call option is only interested in exercising it when it is in the money, because the buyer then has the right to purchase the asset at a price below the market price.

Black Scholes and Black76 are methods for pricing European options theoretically. Only European options are used in this thesis.

Black Scholes Black Scholes is an option pricing method for European call and put options introduced by Fischer Black, Myron Scholes and Robert Merton. The theoretical prices of call and put options are calculated with the equations below. Let c and p denote the call and put option price respectively. Let S0 be the underlying asset price, K the strike price, r the interest rate, σ the implied volatility, T the time to expiration and N the standard normal cumulative distribution function. Then the call option price c and put option price p are computed by [Hull, 2012, p. 351-353]

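\[ c = S_0 N(d_1) - K e^{-rT} N(d_2) \tag{4.1} \]

and

\[ p = K e^{-rT} N(-d_2) - S_0 N(-d_1) \tag{4.2} \]

where

\[ d_1 = \frac{\ln(S_0/K) + (r + \sigma^2/2)T}{\sigma\sqrt{T}} \tag{4.3} \]

\[ d_2 = \frac{\ln(S_0/K) + (r - \sigma^2/2)T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T} \tag{4.4} \]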
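Equations (4.1) to (4.4) translate directly into code. The following is a minimal sketch, not the implementation used in the thesis; the function names `norm_cdf` and `black_scholes` are illustrative, and the day count of 90/365 years in the sample call is an assumption.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black_scholes(s0: float, k: float, r: float, sigma: float, t: float):
    """Return (call, put) for a European option, following (4.1)-(4.4)."""
    d1 = (log(s0 / k) + (r + sigma ** 2 / 2.0) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    discounted_strike = k * exp(-r * t)
    call = s0 * norm_cdf(d1) - discounted_strike * norm_cdf(d2)
    put = discounted_strike * norm_cdf(-d2) - s0 * norm_cdf(-d1)
    return call, put


# Sample inputs (assumed day count: 90 days = 90/365 years).
call, put = black_scholes(s0=2000.0, k=2200.0, r=0.05, sigma=0.15, t=90 / 365)

# Put-call parity, c - p = S0 - K * exp(-rT), serves as a cheap sanity check.
parity_gap = (call - put) - (2000.0 - 2200.0 * exp(-0.05 * 90 / 365))
```

The parity gap should be zero up to floating-point rounding, which is a simple way to catch sign errors in either formula.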

Black76 Black76 is a variant of the Black-Scholes model that is commonly used for pricing options on futures. Let F be the price of the underlying future; the other symbols are as defined for Black Scholes.

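\[ c = e^{-rT} \left( F N(d_1) - K N(d_2) \right) \tag{4.5} \]

and

\[ p = e^{-rT} \left( K N(-d_2) - F N(-d_1) \right) \tag{4.6} \]

where

\[ d_1 = \frac{\ln(F/K) + (\sigma^2/2)T}{\sigma\sqrt{T}} \tag{4.7} \]

\[ d_2 = \frac{\ln(F/K) - (\sigma^2/2)T}{\sigma\sqrt{T}} = d_1 - \sigma\sqrt{T} \tag{4.8} \]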
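The Black76 formulas can be sketched in the same way, with N applied to both d1 and d2 as in Black Scholes. The function names are illustrative and the inputs in the sample call are assumed values, not data from the thesis.

```python
from math import erf, exp, log, sqrt


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def black76(f: float, k: float, r: float, sigma: float, t: float):
    """Return (call, put) for a European option on a future, per (4.5)-(4.8)."""
    d1 = (log(f / k) + (sigma ** 2 / 2.0) * t) / (sigma * sqrt(t))
    d2 = d1 - sigma * sqrt(t)
    discount = exp(-r * t)
    call = discount * (f * norm_cdf(d1) - k * norm_cdf(d2))
    put = discount * (k * norm_cdf(-d2) - f * norm_cdf(-d1))
    return call, put


# Assumed sample inputs for illustration only.
call76, put76 = black76(f=2000.0, k=2200.0, r=0.05, sigma=0.15, t=0.25)

# Parity for options on futures: c - p = exp(-rT) * (F - K).
parity_gap76 = (call76 - put76) - exp(-0.05 * 0.25) * (2000.0 - 2200.0)
```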


4.2 Standard Portfolio Analysis of Risk

Standard Portfolio Analysis of Risk (SPAN) was introduced in 1988 by the Chicago Mercantile Exchange (CME) and has since become an industry standard risk assessment method. SPAN is a scenario-based risk algorithm that calculates the so-called performance bond requirement. A performance bond requirement, also called margin, is the deposit (part of the collateral) from clearing members that the clearing house holds as a guarantee against default of payments. SPAN is calculated per portfolio and can handle many different financial instruments. Given a set of scenarios, SPAN calculates the worst possible loss of the portfolio that could occur over a specified time period. SPAN also requires a set of input parameters, called SPAN parameters. CME does not provide standardized SPAN parameters; exchanges and clearing organizations set these parameters themselves. [CME, 2010, p. 2-3]

Initial margin (IM), or performance bond requirement, is calculated with equation (4.9). All parts of the equation are reviewed later in this section. [CME, 2010, p. 23]

\[ \text{Initial Margin} = \max(\text{Scan Risk} + \text{Intra Commodity Charge} - \text{Inter Commodity Credit},\ \text{Short Option Minimum}) - \text{Net Option Value} \tag{4.9} \]

4.2.1 Combined Commodities

In SPAN, all tradable instruments are grouped into so-called combined commodities (CC) to simplify risk calculations between instruments within a group and between groups. For a portfolio, a portfolio risk is calculated for each CC, and the individual CC risks are then aggregated into an overall portfolio risk. The risks of different CC’s are not simply added together; the aggregation follows rules set in the SPAN configuration. How risk from different CC’s is aggregated is covered in later sections. A simple example explaining combined commodities follows below. [CME, 2010, p. 4]

Example Given three tradable instruments, an S&P500 future, an S&P500 call option and a NASDAQ future, the instruments are grouped into two combined commodities, one called S&P500 and the other called NASDAQ, as seen in Table 1. Table 2 shows that risk is calculated per CC and then aggregated to obtain the overall portfolio risk. For a net portfolio, the aggregation is probably the most complex part of SPAN and is discussed in the Inter-Commodity Spread Credit section 4.2.4.

| Instrument | Combined Commodity |
|---|---|
| S&P500 Future | S&P500 |
| S&P500 Call | S&P500 |
| NASDAQ Future | NASDAQ |

Table 1: Tradable instruments grouped into combined commodities.


| Combined Commodity | Portfolio risk |
|---|---|
| S&P500 | $X |
| NASDAQ | $Y |
| Overall Portfolio risk | aggregated($X, $Y) |

Table 2: The overall portfolio risk is an aggregation of the portfolio risk from each CC, in this case S&P500 and NASDAQ.

4.2.2 Scan Risk

When all combined commodities are specified, the SPAN algorithm calculates the scan risk. Using a number of market scenarios set by the clearing organization, profit and loss are simulated for each tradable instrument. Scenarios in SPAN are defined as changes in the price of the underlying asset and in volatility. In this step, futures are assumed to move exactly as the underlying asset. This simplification is not realistic due to factors such as interest rate expectations and contango, so SPAN adds an additional charge, the intra-commodity spread charge, to cover for it. Exchanges commonly use the 16 scenarios presented in Table 3, where price scan range (PSR) and volatility scan range (VSR) are parameters set in the SPAN configuration. Price moves are calculated by multiplying PSR by the factors defined in the scenarios and represent potential market moves. Volatility moves are calculated by adding VSR to or subtracting it from a base volatility (BV), which is retrieved from an implied volatility surface. The simulated profit and loss scenarios are called risk arrays, and the largest loss in the portfolio risk array represents the scan risk of the portfolio. [CME, 2010, p. 6-9]

Example Consider a simple portfolio containing one long S&P500 future and one long S&P500 call option expiring in 90 days with strike price $2,200. Assume an S&P500 spot price of $2,000 and a base volatility of 15%, with PSR and VSR set to $200 and 10% respectively. The interest rate used in the example is 5%. The option price is calculated using the Black Scholes or Black76 formulas, (4.1), (4.2), (4.5) or (4.6). Risk arrays for the S&P500 combined commodity are shown in Table 4, and the scan risk, $173, is the largest potential loss of the example portfolio under the given scenarios.


| Scenario | Price Change | Volatility Change |
|---|---|---|
| 1 | 0 | BV + VSR |
| 2 | 0 | BV - VSR |
| 3 | 0.33 · PSR | BV + VSR |
| 4 | 0.33 · PSR | BV - VSR |
| 5 | −0.33 · PSR | BV + VSR |
| 6 | −0.33 · PSR | BV - VSR |
| 7 | 0.67 · PSR | BV + VSR |
| 8 | 0.67 · PSR | BV - VSR |
| 9 | −0.67 · PSR | BV + VSR |
| 10 | −0.67 · PSR | BV - VSR |
| 11 | 1.0 · PSR | BV + VSR |
| 12 | 1.0 · PSR | BV - VSR |
| 13 | −1.0 · PSR | BV + VSR |
| 14 | −1.0 · PSR | BV - VSR |
| 15 | 3.0 · PSR · 0.33 | 0 |
| 16 | −3.0 · PSR · 0.33 | 0 |

Table 3: 16 market scenarios commonly used by exchanges.
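The scenario grid in Table 3 can be generated programmatically. The sketch below is illustrative only: the function name and the tuple layout are assumptions, and the factor 0.33 in scenarios 15 and 16 is interpreted as the fraction of the loss from a 3 · PSR move that is counted, which matches the price change of 600 and future loss of 198 shown for those scenarios in Table 4.

```python
def span_scenarios(psr: float, vsr: float):
    """Return 16 tuples (price_change, volatility_shift, loss_fraction).

    volatility_shift is added to the base volatility (BV); a shift of 0.0
    means the volatility is left unchanged.
    """
    scenarios = []
    # Scenarios 1-14: seven price moves, each paired with BV + VSR and BV - VSR.
    for factor in (0.0, 0.33, -0.33, 0.67, -0.67, 1.0, -1.0):
        for sign in (+1.0, -1.0):
            scenarios.append((factor * psr, sign * vsr, 1.0))
    # Scenarios 15-16: extreme moves of 3 * PSR, with 33% of the loss counted.
    for factor in (3.0, -3.0):
        scenarios.append((factor * psr, 0.0, 0.33))
    return scenarios


scenarios = span_scenarios(psr=200.0, vsr=0.10)
```

Keeping the loss fraction as a separate field makes it explicit that the extreme scenarios scale the resulting loss rather than the price move itself.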

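| Scenario | Price Change | Volatility Change | S&P500 Future Loss | S&P500 Option Loss | Portfolio Loss |
|---|---|---|---|---|---|
| 1 | 0 | 25% | 0 | -29 | -29 |
| 2 | 0 | 5% | 0 | 10 | 10 |
| 3 | 66 | 25% | 66 | -50 | 16 |
| 4 | 66 | 5% | 66 | 10 | 76 |
| 5 | -66 | 25% | -66 | -15 | -81 |
| 6 | -66 | 5% | -66 | 10 | -56 |
| 7 | 134 | 25% | 134 | -77 | 57 |
| 8 | 134 | 5% | 134 | 4 | 138 |
| 9 | -134 | 25% | -134 | -3 | -137 |
| 10 | -134 | 5% | -134 | 10 | -124 |
| 11 | 200 | 25% | 200 | -112 | 88 |
| 12 | 200 | 5% | 200 | -27 | 173 |
| 13 | -200 | 25% | -200 | 3 | -197 |
| 14 | -200 | 5% | -200 | 10 | -190 |
| 15 | 600 | 0 | 198 | -131 | 67 |
| 16 | -600 | 0 | -198 | 10 | -188 |

Table 4: Risk arrays for an example portfolio containing 1 S&P500 future and 1 S&P500 call option expiring in 90 days with strike price $2,200.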
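The step from risk arrays to scan risk is a sum per scenario followed by a maximum. The short sketch below uses the values of Table 4 as given; the variable names are illustrative.

```python
# Risk arrays copied from Table 4, one entry per scenario 1-16.
future_loss = [0, 0, 66, 66, -66, -66, 134, 134,
               -134, -134, 200, 200, -200, -200, 198, -198]
option_loss = [-29, 10, -50, 10, -15, 10, -77, 4,
               -3, 10, -112, -27, 3, 10, -131, 10]

# Portfolio loss per scenario is the sum of the instrument losses.
portfolio_loss = [f + o for f, o in zip(future_loss, option_loss)]

# Scan risk is the largest portfolio loss over all scenarios.
scan_risk = max(portfolio_loss)                          # 173
worst_scenario = portfolio_loss.index(scan_risk) + 1     # scenario 12
```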


Composite Delta Composite delta is a weighted average of the option delta ∆ over the 7 scenarios specified in Table 5. [CME, 2010, p. 11] More specifically, the option delta is the derivative of the option price V with respect to the underlying asset price S,

\[ \Delta = \frac{\partial V}{\partial S} \]

Composite delta allows netting between futures and options across combined commodities in the inter-commodity spread credit calculation, which is discussed later. For a European option on a non-dividend-paying asset, the option delta for call and put options can be calculated using d1 from (4.3) as

\[ \Delta(\text{call}) = N(d_1) \]

and

\[ \Delta(\text{put}) = N(d_1) - 1 \]

respectively, where N is the standard normal cumulative distribution function. [Hull, 2012, p. 382]

| Scenario | Underlying Price Change | Probability Weight |
|---|---|---|
| 1 | 0 | 27 % |
| 2 | 0.33 · PSR | 21.7 % |
| 3 | −0.33 · PSR | 21.7 % |
| 4 | 0.67 · PSR | 11 % |
| 5 | −0.67 · PSR | 11 % |
| 6 | 1.0 · PSR | 3.7 % |
| 7 | −1.0 · PSR | 3.7 % |

Table 5: Composite delta scenarios.
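A composite delta for a call option can be sketched as the weighted average of the Black Scholes delta N(d1) over the scenarios in Table 5. This is an illustration under stated assumptions rather than the SPAN implementation: volatility is held at its base value in every scenario, and because the listed weights sum to 99.8% the result is divided by the total weight. All names are hypothetical.

```python
from math import erf, log, sqrt

# (price factor applied to PSR, probability weight) from Table 5.
DELTA_SCENARIOS = [(0.0, 0.27), (0.33, 0.217), (-0.33, 0.217),
                   (0.67, 0.11), (-0.67, 0.11), (1.0, 0.037), (-1.0, 0.037)]


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function N(x)."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))


def call_delta(s: float, k: float, r: float, sigma: float, t: float) -> float:
    """Delta of a European call, N(d1), with d1 as in (4.3)."""
    d1 = (log(s / k) + (r + sigma ** 2 / 2.0) * t) / (sigma * sqrt(t))
    return norm_cdf(d1)


def composite_delta(s0, k, r, sigma, t, psr) -> float:
    """Weighted average call delta over the shifted underlying prices."""
    total_weight = sum(w for _, w in DELTA_SCENARIOS)
    weighted = sum(w * call_delta(s0 + f * psr, k, r, sigma, t)
                   for f, w in DELTA_SCENARIOS)
    return weighted / total_weight  # normalise, since the weights sum to 0.998


cd = composite_delta(s0=2000.0, k=2200.0, r=0.05, sigma=0.15, t=90 / 365, psr=200.0)
```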

4.2.3 Intra-Commodity Spread Charge

As mentioned in a previous section, a future and its underlying asset do not have a one-to-one relationship. The same applies to futures with different times to maturity. Let r and q denote the interest rate and the dividend yield respectively, let T be the time to maturity and S0 the spot price. From [Hull, 2012, p. 112], the future price F0 is given by

\[ F_0 = S_0 e^{(r-q)T} \]

The equation above illustrates that the future price is affected by time to maturity, rate expectations, dividend expectations and price movements. In the extreme case with T = 0, the future price depends only on the underlying asset price S0, while a future with T > 0 depends on interest rate expectations, dividend expectations and the underlying asset price. This discrepancy has to be considered when netting futures with different times to maturity, and that is the purpose of the intra-commodity charge.
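A small numerical illustration shows why two futures on the same underlying do not offset each other perfectly. The function name and the input values are assumptions chosen for illustration.

```python
from math import exp


def futures_price(s0: float, r: float, q: float, t: float) -> float:
    """Future price F0 = S0 * exp((r - q) * T)."""
    return s0 * exp((r - q) * t)


spot = 2000.0
near = futures_price(spot, r=0.05, q=0.02, t=30 / 365)   # expires in 30 days
far = futures_price(spot, r=0.05, q=0.02, t=60 / 365)    # expires in 60 days

# The difference between the two maturities is the calendar spread that a
# long/short pair remains exposed to when r or q changes.
calendar_spread = far - near
```

With r > q the later maturity is priced higher, and the spread moves whenever rate or dividend expectations move, even if the spot price is unchanged.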


The intra-commodity charge is managed by grouping the instruments within a combined commodity into tiers. A tier is defined by a time interval, and instruments expiring within that interval belong to the tier. The spread charge is often defined between tiers, but a charge can also be set within a tier. Tier periods and spread charges are set by the clearinghouse in the SPAN configuration. [CME, 2010, p. 14-15]

Example Tier 1 and tier 2 include instruments expiring within 0-30 days and 30-60 days respectively. The clearinghouse has defined a spread charge of $200 between tier 1 and tier 2. The example net portfolio contains one long and one short future position. Both futures have the same underlying asset; one expires in tier 1 and the other in tier 2. The scan risk of the futures nets out to 0. However, the intra-commodity charge amounts to $200 because there is one pair of offsetting positions between tier 1 and tier 2.
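The tier example can be expressed as a short calculation. This is a simplified sketch for a single tier pair; the names are hypothetical, and the tier intervals are assumed to be half-open (an instrument expiring in exactly 30 days falls in tier 2), since the text does not say which tier owns the boundary.

```python
TIERS = {"tier1": (0, 30), "tier2": (30, 60)}  # days to expiry, [low, high)


def tier_of(days_to_expiry: int) -> str:
    """Return the name of the tier whose interval contains the expiry."""
    for name, (low, high) in TIERS.items():
        if low <= days_to_expiry < high:
            return name
    raise ValueError(f"no tier covers {days_to_expiry} days")


def intra_commodity_charge(positions, charge_per_spread: float) -> float:
    """positions: list of (days_to_expiry, signed_quantity) for one CC."""
    net = {name: 0 for name in TIERS}
    for days, quantity in positions:
        net[tier_of(days)] += quantity
    a, b = net["tier1"], net["tier2"]
    if a * b >= 0:
        return 0.0  # no opposing positions between the tiers
    # Each long/short pair across the two tiers forms one spread.
    return min(abs(a), abs(b)) * charge_per_spread


# One long future expiring in 20 days, one short future expiring in 45 days.
charge = intra_commodity_charge([(20, 1), (45, -1)], charge_per_spread=200.0)
```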

4.2.4 Inter-Commodity Spread Credit

This spread is used for risk offsetting, or netting, between positions grouped in different combined commodities. For example, the clearinghouse might configure SPAN to offset inversely correlated assets such as gold and the dollar. The amount of offsetting allowed between combined commodities is also defined in the SPAN configuration by the clearing organization. [CME, 2010, p. 16-19]

Example A portfolio contains the following positions: 1 long S&P future and 1 short NASDAQ future. Assume the S&P future has an individual scan risk of $20,000 and the NASDAQ future a scan risk of $10,000, so the aggregated scan risk for both positions is $30,000. If the spread credit is configured to 90%, the credit amounts to 0.90 · 30,000 = $27,000. The total SPAN risk is reduced from $30,000 to $3,000 once the spread credit is taken into consideration.

Spread Credit with Option Positions This calculation becomes more complex when option positions are added. When an option position is introduced to the portfolio, the simple netting procedure from the previous example with two future positions can no longer be applied. To offset an option position against another option or a future, composite delta is used together with the scan risk of the option to obtain the weighted future price risk (WFPR). WFPR is used to calculate the spread credit amount for portfolios containing options that are eligible for inter-commodity spread credit. [CME, 2003, p. 77] WFPR is calculated with the equation below; the SPAN algorithm tested in this project uses scan risk instead of price risk. [CME, 2003, p. 71]

\[ \text{WFPR} = \frac{\text{Price Risk}}{|\text{Net composite delta for a combined commodity}|} \]

Multiple spreads Portfolios can hold assets from many different combined commodities. Suppose there are three CC’s A, B and C with the following spread pairs available: {A,B}, {A,C}, {B,C}. As inter-commodity spread credit is calculated to allow netting between different CC’s, the order in which position netting is performed has to be prioritized. This prioritization is defined by the clearinghouse in the SPAN configuration.

Advanced Example Assume a portfolio contains the following positions

• Long 1 S&P500 call option

• Short 2 NASDAQ futures

• Long 2 GOLD futures

Offsetting positions (netting) are allowed between all CC’s, with prioritization defined as

1. {S&P500,NASDAQ} with netting rate 1.0.

2. {S&P500,GOLD} with netting rate 0.5.

3. {NASDAQ,GOLD} with netting rate 0.5.

Let us also assume the three instruments have the scan risks and composite deltas defined in Table 6. Note that composite delta is always 1.0 for futures.

| Instrument | Scan Risk | Composite Delta |
|---|---|---|
| S&P500 Call | 5,000 | 0.5 |
| NASDAQ Future | 7,000 | 1.0 |
| GOLD Future | 4,000 | 1.0 |

Table 6: Scan risk and composite deltas for the three instruments used in the current example.

Total Scan Risk The total scan risk is 5,000 · 1 + 7,000 · 2 + 4,000 · 2 = 27,000.

Spread Priority 1 The first prioritized spread is between the CC’s S&P500 and NASDAQ. As the netting rate is 1.0, full netting is allowed per opposing position. The example portfolio includes 1 S&P500 call option and −2 NASDAQ futures, so the number of spreads available for this spread pair is 1 (|1 − 2|). The credit for each leg is its scan risk divided by its composite delta, multiplied by the netting rate and the number of spreads.

S&P500 Credit = 5,000/0.5 · 1 = 10,000
NASDAQ Credit = 7,000 · 1 = 7,000

Spread Priority 2 The second spread pair is between the CC’s S&P500 and GOLD. Because the S&P500 CC is already consumed (netted with NASDAQ), there is no credit available for this spread pair.


Spread Priority 3 The last spread pair is between NASDAQ and GOLD. One NASDAQ future has already been consumed when calculating the credit for the first spread pair, leaving 1 available spread.

GOLD Credit = 4,000 · 0.5 · 1 = 2,000
NASDAQ Credit = 7,000 · 0.5 · 1 = 3,500

Initial Margin The initial margin is the total scan risk minus the aggregated credits: 27,000 − (10,000 + 7,000 + 2,000 + 3,500) = 4,500.
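The advanced example can be reproduced with a few lines of code. The sketch below mirrors the steps of the worked example only and is not the SPAN implementation under test: it counts spreads from the remaining number of contracts on each side, ignores position direction, and omits charges, short option minimum and net option value. All names are hypothetical.

```python
# Net position (absolute number of contracts), per-contract scan risk and
# composite delta for each combined commodity, from the example and Table 6.
portfolio = {
    "S&P500": {"contracts": 1, "scan_risk": 5000.0, "delta": 0.5},
    "NASDAQ": {"contracts": 2, "scan_risk": 7000.0, "delta": 1.0},
    "GOLD":   {"contracts": 2, "scan_risk": 4000.0, "delta": 1.0},
}

# Spread pairs in priority order with their netting rates.
spread_priorities = [("S&P500", "NASDAQ", 1.0),
                     ("S&P500", "GOLD", 0.5),
                     ("NASDAQ", "GOLD", 0.5)]


def total_spread_credit(portfolio, spread_priorities) -> float:
    remaining = {cc: p["contracts"] for cc, p in portfolio.items()}
    credit = 0.0
    for cc_a, cc_b, rate in spread_priorities:
        spreads = min(remaining[cc_a], remaining[cc_b])
        if spreads == 0:
            continue  # one side is already consumed by an earlier spread
        for cc in (cc_a, cc_b):
            # Scan risk divided by composite delta, as in the example.
            weighted_risk = portfolio[cc]["scan_risk"] / portfolio[cc]["delta"]
            credit += weighted_risk * rate * spreads
            remaining[cc] -= spreads
    return credit


total_scan_risk = sum(p["scan_risk"] * p["contracts"] for p in portfolio.values())
credit = total_spread_credit(portfolio, spread_priorities)
initial_margin = total_scan_risk - credit  # 27,000 - 22,500 = 4,500
```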

4.2.5 Net Option Value

The net option value is calculated to adjust for option exercise risk. When an option is exercised, the seller of the option has to buy the underlying asset from, or sell it to, the buyer at the predetermined strike price. Net option value is the possible payment (or loss) that occurs when all options in a portfolio are exercised. There are two ways of calculating the net option value: market premium or theoretical premium. We cover the theoretical premium, as it is used in the SPAN algorithm tested. The theoretical premium is the theoretical option price calculated with (4.1), (4.2), (4.5) or (4.6), scaled by position size and contract size. The formula for calculating net option value is as follows.

\[ \text{Net Option Value} = \text{Net Option Positions} \cdot \text{Contract Size} \cdot \text{Underlying Contract Size} \cdot \text{Theoretical Option Price} \tag{4.10} \]

4.2.6 Short Option Minimum

Shorting options can carry substantial risk because it can theoretically cause unlimited financial losses. Substantial price movements in the underlying asset may cause extreme volatility in deep out-of-the-money options. Because of this, a short option minimum charge (SOMC) is set to cover the risk from this particular situation. As seen in (4.9), the short option minimum is not added as an additional charge. Instead, it is compared against the SPAN risk and works as a floor.

Short option minimum is calculated with equation (4.11).

\[ \text{SOMC} = \frac{\text{Short Option Positions} \cdot \text{Short Option Minimum Charge}}{\text{Base Contract Size}} \tag{4.11} \]

4.3 Test Oracle

The idea of software testing is to evaluate the output of a system under test (SUT) after feeding it with input data. The evaluation is done by various types of test oracles, e.g. a human oracle (manual verification), a true oracle (a separate implementation of the software) or a heuristic oracle (checks based on heuristics). A test oracle finds an expected result and compares it to the result from the SUT. [Hoffman, 1999, p. 29-30]


4.3.1 Heuristic Oracle

A heuristic oracle does not verify all values. Instead, it conducts consistency checks based on heuristics. Douglas Hoffman gives an example of a test oracle that makes consistency checks on a sine wave. The sine function should be increasing between 0 and 90 degrees and between 270 and 360 degrees, and it is expected to decrease between 90 and 270 degrees. Instead of verifying e.g. sin(33°) exactly, the oracle can make a consistency check that sin(34°) > sin(33°). Exact comparison can be used as a complement to the consistency checks if some particular results are known or can easily be obtained. For the sine wave there are at least five values that are easy to verify exactly: sin(0°) = 0, sin(90°) = 1, sin(180°) = 0, sin(270°) = −1 and sin(360°) = 0. If all expected values were available, we would be using a true oracle. Heuristic oracles are often cheaper and faster to implement than a true oracle, but a heuristic oracle can and will miss some errors. [Hoffman, 1999, p. 29-30]
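The sine wave example can be written as a small heuristic oracle. This is an illustrative sketch; the function names and the one-degree step are assumptions, and `sut_sin` merely stands in for a system under test.

```python
import math


def sut_sin(degrees: float) -> float:
    """Stand-in for the system under test: sine of an angle in degrees."""
    return math.sin(math.radians(degrees))


# Values that are known exactly and can be verified directly.
EXACT_VALUES = {0: 0.0, 90: 1.0, 180: 0.0, 270: -1.0, 360: 0.0}


def heuristic_sine_oracle(f, tolerance: float = 1e-9):
    """Return a list of failed checks; an empty list means no error found."""
    failures = []
    # Consistency checks: rising on [0, 90] and [270, 360], falling on [90, 270].
    for deg in range(0, 360):
        current, following = f(deg), f(deg + 1)
        should_increase = deg < 90 or deg >= 270
        if should_increase and not following > current:
            failures.append(f"expected increase at {deg} degrees")
        if not should_increase and not following < current:
            failures.append(f"expected decrease at {deg} degrees")
    # Exact checks for the few known values.
    for deg, expected in EXACT_VALUES.items():
        if abs(f(deg) - expected) > tolerance:
            failures.append(f"wrong value at {deg} degrees")
    return failures


failures = heuristic_sine_oracle(sut_sin)
```

An implementation that returned, say, the absolute value of the sine would be caught by the monotonicity checks between 180 and 360 degrees, even though no exact value for those angles is stored.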

Usually, the hard part of implementing a heuristic oracle is to come up with heuristics for the algorithms under test. Breaking up the SUT, visualizing the results and searching for simple relationships are three useful tricks for finding potential heuristics.

Heuristic oracles are not always ideal. If the SUT’s patterns are too complex and include many inconsistencies, a true oracle might be a better choice. Building a very advanced heuristic oracle that handles the inconsistent SUT may be as expensive as building a true oracle. [Hoffman, 1999, p. 31-32]

The SPAN algorithm as a whole is complex. To avoid implementing a very complex heuristic oracle, the oracle to be implemented will focus on simple heuristic patterns together with exact verification whenever possible.

4.3.2 True Oracle

A true oracle is a separate, independent implementation of the SUT. It can also be an independent implementation of a subpart of the SUT, if the purpose is to test a specific submodule. By feeding the oracle with the same data that is provided to the SUT, a strict comparison between the results from the SUT and the oracle implementation is possible.

Complex algorithms are time-consuming and expensive to implement, which means that a true oracle testing a complex algorithm is also expensive to implement. For the same reason, true oracles may very well contain bugs of their own. [Hoffman, 1998, p. 5]

For the purpose of this thesis, testing a risk algorithm, implementing a true oracle for the whole SPAN algorithm would be too time-consuming. However, parts of the oracle to be implemented will use exact verification, making subparts of the heuristic oracle act as a true oracle.

4.3.3 Sampling Approach

The sampling test approach is one of the more common test automation methods. Input data is not chosen randomly. Instead, significant input values are chosen by the tester, for example boundary values such as maximum and minimum values. When the data is sampled, an oracle knowing the expected values can be implemented. [Hoffman, 1998, p. 6] For example, consider a simple calculator application where the division operator is to be tested. An obvious boundary value is 0, so {X, 0}, X ∈ R is one testable input data set. If 0 is in the numerator, the tester knows that the SUT should always return 0, and if 0 is in the denominator, the tester knows the result should be undefined. The oracle verifies that the output is either 0 or undefined.
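The calculator example can be sketched as follows. The names are hypothetical, `sut_divide` stands in for the calculator under test, and the pair (0, 0) is treated as undefined, which is an assumption the text does not settle.

```python
def sut_divide(numerator: float, denominator: float):
    """Stand-in for the calculator's division; None represents 'undefined'."""
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return None


def sampling_oracle(numerator: float, denominator: float) -> bool:
    """Verify sampled pairs in which at least one operand is 0."""
    result = sut_divide(numerator, denominator)
    if denominator == 0:
        return result is None       # division by zero must be undefined
    return result == 0              # zero numerator must give zero


# Boundary samples chosen by the tester rather than drawn at random.
samples = [(0, 5), (0, -3.5), (7, 0), (-2.5, 0), (0, 0)]
results = [sampling_oracle(n, d) for n, d in samples]
```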

4.4 Random Testing

Random testing is a software testing concept in which randomized data is fed into the system under test. The main difference from traditional test automation is that the set of test data differs for each test iteration, which increases the probability of finding bugs. Traditional software testing is very good for regression testing but not as good for exploratory testing.

One or more test oracles are responsible for validating the system output. Because the set of input data is larger than in traditional software testing, this allows for a broader verification. The randomized data should consist of both realistic and unrealistic data, which can be achieved by using different distributions when randomizing the input. [Hojeberg, 2008, p. 18] Random testing can thus find boundary values that testers and developers miss, as well as defects occurring in the exploratory range of the data set. Traditional automated tests will not find these types of issues unless the first run reveals a glitch or the system is changing. [Hojeberg, 2008, p. 17]

The data generation entity can be viewed as an actor or user of the actual system. As users can be humans or computers, their input is unpredictable and contains a certain randomness. With random testing we try to simulate realistic (and unrealistic) usage of the system. One approach is to send the simulated data to both the SUT and the oracle; the results from both entities can then be compared and validated, as illustrated in Figure 4. For some validations the oracle might not require any knowledge of the input data, in which case the data is only randomized and sent to the SUT.

Figure 4: Data is distributed from the actor to the system and the oracle. Results from the system are verified against the expected results from the oracle. (Diagram nodes: Actor, Test Oracle, System Under Test, Verification. Edge labels: input data, system results, expected results.)
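The flow in Figure 4 can be sketched as a loop in which an actor draws random input, sends it to both the system and the oracle, and stops at the first disagreement. Everything in this sketch is hypothetical: `sut_margin` and `oracle_margin` are toy stand-ins, not the clearing system or the oracle of this thesis.

```python
import random


def sut_margin(price: float, quantity: int) -> float:
    """Toy stand-in for the system under test."""
    return abs(quantity) * price * 0.1


def oracle_margin(price: float, quantity: int) -> float:
    """Toy stand-in for an independent expected-value calculation."""
    return 0.1 * price * abs(quantity)


def run_random_test(iterations: int, seed: int = 0, rel_tol: float = 1e-9):
    """Return None if all iterations pass, else (iteration, failing input)."""
    rng = random.Random(seed)  # seeded so that a failure can be reproduced
    for i in range(iterations):
        # Actor: generate random input data.
        price = max(0.01, rng.gauss(100.0, 50.0))
        quantity = rng.randint(-1000, 1000)
        # Same input goes to both the system and the oracle.
        actual = sut_margin(price, quantity)
        expected = oracle_margin(price, quantity)
        # Verification: compare system result with expected result.
        if abs(actual - expected) > rel_tol * max(1.0, abs(expected)):
            return i, (price, quantity)
    return None


outcome = run_random_test(iterations=1000)
```

Seeding the generator is a design choice worth noting: the test data still differs from iteration to iteration, but a failing run can be replayed exactly.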


5 Random Testing Framework Implementation

To meet the purpose of this thesis, that is, implementing a random testing framework and verifying whether the clearing system calculates fair initial margin, a random testing framework including two test oracles is implemented. As the clearing system is going to be put into production, the risk algorithm is tested from a system testing perspective. Random data is generated as input to the system, and test oracles are responsible for investigating whether margin requirements are fair and whether risk calculations are processed successfully. As mentioned earlier, a heuristic oracle will not find all errors because it uses characteristics rather than exact computations. However, perfection is not necessary for an oracle to be useful. [Hoffman, 1999, p. 30] The oracles are used together in order to verify as large a part of the software as possible.

5.1 Testing structure

Two different testing structures are applied: a multi-threaded model that simulates a realistic environment and a single-threaded model that focuses on verifying the actual risk calculations. Multi-threading means that parts of a program can run concurrently on different CPUs, so that multiple activities can run at the same time. Because these two structures have different dynamics, the same test oracle cannot be used for both of them. They do, however, try to answer the same question about fair margin requirements, approaching the problem from different angles.

5.1.1 Single-Threaded Structure

The single-threaded structure is implemented to create a controlled environment with one actor interacting with the system. Data sent to the SUT is also received and processed by the oracle, as shown in Figure 4. In the single-threaded environment, the oracle receives data from the actor and verifies the risk calculations for each data update. After a verification is complete, new data is randomized into the system again. This loop of randomizing, processing and verifying data ends if the oracle finds that the initial margin is incorrect.

5.1.2 Multi-Threaded Structure

In a realistic production environment, many actors, for example clearing members and the clearing house, interact with the clearing system simultaneously. Multi-threading is used to simulate a more realistic environment: sending data from different threads mimics the situation where different actors interact with the system.

The actor in each thread produces the input data that is sent to the system and the oracle. The SUT receives input data from multiple actors, processes all messages and calculates risk before responding with a result, as illustrated in Figure 5. Multi-threading may introduce synchronization problems between the oracle and the SUT, which makes the verification process more difficult. Allowing multi-threading would require some oracle operations to be made thread-safe, but this is left out because of time limitations. This particular setup therefore limits the ability to verify initial margin calculations. Instead, this structure focuses on verifying the system state, stability and potential corner case issues.

A bad system state can occur, for example, if data sent to the SUT goes missing. Server stability is also vital in order to provide initial margins to the system users. Corner case errors can occur when specific data is provided to the system and it tries to perform an impossible task, for example dividing a number by 0. To analyze this, a script is implemented that reads the server logs and looks for errors and exceptions regarding risk calculations. Java uses exceptions to handle errors, and whenever an exception is thrown it is printed and logged in the system’s server logs. If an error occurred and a risk calculation did not proceed, the calculated initial margin is certainly not fair.
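A log-scanning script of the kind described above could look like the following sketch. The log format, the keyword `risk` and the function name are assumptions made for illustration; the actual server log layout is not given in the text.

```python
import re

# Matches a log level of ERROR or any Java-style exception class name.
ERROR_PATTERN = re.compile(r"ERROR|\w*Exception")


def find_risk_errors(lines, keyword: str = "risk"):
    """Return (line_number, line) for error lines that mention the keyword."""
    hits = []
    for number, line in enumerate(lines, start=1):
        if ERROR_PATTERN.search(line) and keyword.lower() in line.lower():
            hits.append((number, line.rstrip()))
    return hits


# Hypothetical log excerpt used only to demonstrate the scan.
sample_log = [
    "INFO  deal accepted for account 17\n",
    "ERROR RiskCalculation failed: java.lang.ArithmeticException: / by zero\n",
    "WARN  slow response from price feed\n",
    "ERROR session timeout for user 4\n",
]
hits = find_risk_errors(sample_log)
```

In practice the lines would come from iterating over an open log file instead of an in-memory list.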

Figure 5: General testing structure. Actors generate data that are sent to both the SUT and the oracles. Oracles may also query the system for specific data. (Diagram nodes: Actor 1, Actor 2, Actor 3, Oracle, System Under Test, Verification.)

5.2 Finding Heuristics

To find heuristics for the SPAN algorithm, the following techniques were used: system breakdown (into smaller modules) and simple pattern recognition. The system breakdown divides the algorithm into parts to enable verification per part or to find characteristics of a part. No direct verification per part was implemented; the breakdown rather gave an understanding of which parts had to be verified heuristically. With simple pattern recognition, any simple heuristic is used for verification, for example that initial margin should be positive but less than infinity.

5.2.1 System Breakdown

In the theoretical background section 4.2, a breakdown of the SPAN algorithm was presented. Breaking down the algorithm into these parts simplifies the testing procedure. The calculation of scan risk and risk arrays is relatively trivial and is thus verified exactly with a true oracle. Net option value and short option minimum are also verified with a true oracle. The remaining parts, inter-commodity spread credit and intra-commodity spread charge, are considerably harder to calculate and are tested by the heuristic oracle. To heuristically verify inter-commodity spread credits and intra-commodity spread charges, an interval is created. The initial margin produced by the SUT is expected to lie inside the interval.


Upper Bound As the scan risk for each CC is available from the exact calculation, it is used together with the spread charges to form the upper bound of the interval. The upper bound is defined as

\[ \text{Initial Margin} \le \sum_{cc \in CC} \left( \text{ScanRisk}_{cc} + \max(C_{cc} \cdot X_{cc}) \right) \]

where CC is a vector of all combined commodities, C holds all spread charges defined for a combined commodity and X is the number of positions in the combined commodity.

We add the maximum spread charge for all positions in each combined commodity to the scan risk, and then aggregate this over all combined commodities to create the upper bound of the interval. This heuristic assumes that no netting is made between CC’s (inter-commodity spread credit = 0).

Lower Bound Let NC denote all CC’s that are not eligible for any inter-commodity spread credit. The lower bound of the interval is defined as

\[ \sum_{cc \in NC} \text{ScanRisk}_{cc} \le \text{Initial Margin} \]

This heuristic holds because the netting effect between combined commodities should never exceed 100%. We assume that all CC’s that are eligible for spread credit are fully netted and that the only remaining risk lies in the CC’s that are not eligible for any spread credit. As the scan risk is already calculated for all CC’s, we can aggregate all CC’s in NC to obtain the lower bound of the interval. Note that this heuristic also assumes no intra-commodity spread charge for any position.
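The two bounds can be combined into a single interval check. The sketch below is a simplified reading of the heuristics, with hypothetical names and made-up input values; positions are taken as absolute counts, and a CC without configured spread charges contributes only its scan risk to the upper bound.

```python
def upper_bound(scan_risk, spread_charges, positions) -> float:
    """Sum over all CCs of scan risk plus the largest charge times positions."""
    total = 0.0
    for cc, risk in scan_risk.items():
        charges = spread_charges.get(cc, [])
        max_charge = max((c * positions[cc] for c in charges), default=0.0)
        total += risk + max_charge
    return total


def lower_bound(scan_risk, not_eligible) -> float:
    """Sum of scan risk over CCs that cannot receive any spread credit."""
    return sum(scan_risk[cc] for cc in not_eligible)


def within_interval(initial_margin, scan_risk, spread_charges, positions,
                    not_eligible) -> bool:
    low = lower_bound(scan_risk, not_eligible)
    high = upper_bound(scan_risk, spread_charges, positions)
    return low <= initial_margin <= high


# Illustrative inputs (not taken from the clearing system).
scan_risk = {"S&P500": 5000.0, "NASDAQ": 14000.0, "GOLD": 8000.0}
spread_charges = {"S&P500": [200.0], "NASDAQ": [100.0, 150.0]}
positions = {"S&P500": 1, "NASDAQ": 2, "GOLD": 2}
not_eligible = {"GOLD"}

high = upper_bound(scan_risk, spread_charges, positions)
low = lower_bound(scan_risk, not_eligible)
ok = within_interval(10000.0, scan_risk, spread_charges, positions, not_eligible)
```

The interval is deliberately wide. It cannot confirm that a margin is correct, only flag values that no combination of credits and charges could explain.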

5.2.2 Simple Patterns

The following heuristics are simple but still useful for the purpose of investigating initial margin behavior:

• Initial margin should never be negative, to prevent unrealistic scenarios where you get paid to trade. It should also be less than infinity.

• The initial margin amount should always be available, meaning that the SUT should always respond with a valid response when queried for the initial margin amount. Null (having no value) and 0 are two values that mean the response is not valid. The probability of having 0 initial margin when simulating a realistic situation is extremely low.

• Suppose each iteration of randomized data produces an initial margin (IM). This gives a set of IM’s {IM1, IM2, . . . , IMN}, where IM ∈ N. For each data update that changes previous data, the initial margin should change, which gives the heuristic relationship IM1 ≠ IM2, IM2 ≠ IM3, . . . , IMN−1 ≠ IMN. This relationship does not hold for all actions. For example, if the price action is sent twice in a row with the same price, the initial margin does not change. For this reason the oracle checks this heuristic only when position updates occur.
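The simple patterns above map onto two small checks. The function names are hypothetical, and `None` is used to represent a null response from the SUT.

```python
import math


def is_valid_margin(initial_margin) -> bool:
    """Valid means: present, finite and strictly positive."""
    if initial_margin is None:
        return False
    return math.isfinite(initial_margin) and initial_margin > 0


def unchanged_after_position_update(margins):
    """Return indices i where margins[i] equals margins[i - 1].

    The list is assumed to contain the initial margin observed after each
    position update, so every repeated value is a suspected error.
    """
    return [i for i in range(1, len(margins)) if margins[i] == margins[i - 1]]


checks = [is_valid_margin(v) for v in (None, 0, -5.0, float("inf"), 120.0)]
suspects = unchanged_after_position_update([100.0, 120.0, 120.0, 90.0])
```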


5.3 Data generation

All data generated into the system is normally distributed to simplify the implementation. However, widely varying standard deviations and means are used to expose the system to both small and large values. There are several input factors that can be adjusted to vary the result of a risk calculation, and all of these are randomized. The factors randomized into the system are the following:

• Asset price

• Underlying asset price

• Deal price

• Deal quantity

• Volatility surface

• Interest rate curve

For the single-threaded structure, there are some exceptions to which factors are randomized. Because the verification concerns the initial margin amount, the implementation is considerably easier if some factors are held constant. The volatility surface is a flat surface and the interest rate curve is a flat curve, while the other factors are randomized normally. The multi-threaded structure has all factors randomized to increase the number of input data combinations and to explore many untested data sets.
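A generator for the single-threaded structure could be sketched as follows. The specific means, standard deviations, flat volatility and flat rate are assumed values, and folding negative prices to positive with `abs` is a simplification of this sketch rather than something stated in the text.

```python
import random


def generate_update(rng: random.Random, flat_volatility: float = 0.15,
                    flat_rate: float = 0.05) -> dict:
    """Draw one set of normally distributed inputs with a varying scale."""
    # Vary mean and standard deviation widely to get small and large values.
    mean = rng.choice([1.0, 100.0, 10_000.0])
    std = mean * rng.choice([0.01, 0.5, 5.0])
    return {
        "asset_price": abs(rng.gauss(mean, std)),
        "underlying_price": abs(rng.gauss(mean, std)),
        "deal_price": abs(rng.gauss(mean, std)),
        "deal_quantity": int(round(rng.gauss(0.0, 100.0))),  # sign = long/short
        # Held constant in the single-threaded structure.
        "volatility": flat_volatility,
        "interest_rate": flat_rate,
    }


update = generate_update(random.Random(42))
```

For the multi-threaded structure, the two constants would be replaced by randomized volatility surfaces and interest rate curves.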

5.4 Single-Threaded Model

This model was implemented to be able to verify initial margin calculations during a market simulation. Each data update in the system could potentially cause an error. To detect as many errors as possible, we want to listen to initial margin updates and verify them for all updates.

5.4.1 Implementation

There are two main parts of the implementation: simulating market events and verifying results with an oracle. Market events are simulated inside a class instance called SPANActor, which is mainly responsible for feeding the SUT with random input data. Once an actor sends data to the SUT, it also dispatches an object holding the data to the test oracle class instance called SPANOracle. The SPANOracle receives the object with new market data and then makes computations according to the heuristics explained previously. The computations result in an interval with a lower and an upper bound, within which the SUT’s initial margin is expected to lie. The oracle then queries the SUT for the initial margin and asserts that the returned initial margin is within the expected interval. Figure 6 shows a map of how the software implementation is organized, the information flow and how the classes collaborate.


[Figure 6 diagram. Components: Risk-Random-Testing, SPANActor, RandomSPANAction (EnterDeal(), SetPrice(), SetVolatility(), SetInterestRate()), DispatchOracle, SUT, SPANOracle, SPANOracleUtil, SimpleBlackScholes. Arrow labels: initiating actor, request, action data, request IM, IM response, request calculation, calculation result.]

Figure 6: Structure of the software used in the Single-Threaded Model.

The following paragraphs explain each implemented class, what it does and how it does it.

Class RiskRandomTesting This is the main class of the whole suite. It initiates threads, actors and the oracle. The number of actors can be modified in this class: selecting 1 runs the Single-Threaded Model currently being explained, while selecting 2 or more actors runs the Multi-Threaded Model, because each actor runs on its own thread. Running multiple actors makes the SPANOracle inactive. When all initiations are complete, all threads are started and the flow moves on to the SPANActor.

Class SPANActor The actor should resemble a market actor, e.g. a trader at a trading firm. Technically this means the actor has to hold all random data as well as the information needed to send a trade to the system.

There are three important features an actor needs in order to interact with the clearing system.

1. We want an actor to run on its own thread.

2. To communicate with the clearing system, user sessions are needed. With connected sessions an actor can send requests and communicate with the system.

3. An actor needs a method that interacts with the system. This method calls all actions an actor can perform.


As the actor actions are called only from the interaction method, it is easy to add and delete an action. The following actions are currently implemented for the SPANActor used in the Single-Threaded Model:

• Enter deal, which enters a deal in a random tradable instrument between two accounts. For every deal a new account is created, and this account will only hold one position for this particular deal. The other account can be viewed as the actor’s account. It will hold many positions, as it is part of all transactions.

• Set price, which randomly selects a tradable instrument and sets a random price on it.

• Set volatility to a random number.

• Set interest rate to a random number.

The random information is generated using a random seed, making all random operations pseudo-random. The seed is logged to make all actor actions reproducible. All randomized input data uses this seed together with a set of random utility functions already available at Cinnober. The functions used in the utility class are basic statistical tools, such as generating standard normal random variables. Session connection and utility methods to perform each action were also available in Cinnober’s code base.

Class RandomSpanAction This is a utility class holding all actions that can be dispatched from a SPANActor. It is built with a switch statement (case structure), where cases 0-3 represent the actions that an actor can perform. A random integer in the interval [0, 3] is generated to pick which case, and thereby which action, the actor should perform. For example, if the random number is 0, the actor will enter a deal. When the deal request is sent, all relevant information is collected inside a DispatchOracle object that is dispatched to the SPANOracle instance.
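The action selection can be sketched as below. Only the mapping of 0 to "enter deal" is stated in the text; the order of the remaining cases, the function names and the dictionary-based dispatch (standing in for the switch statement) are assumptions for illustration.

```python
import random

# Case numbers 0-3. Only 0 -> enter deal is given in the text;
# the ordering of cases 1-3 is assumed here.
ACTIONS = {
    0: "enter_deal",
    1: "set_price",
    2: "set_volatility",
    3: "set_interest_rate",
}


def pick_action(rng):
    """Draw an integer in [0, 3] and return the matching action name."""
    return ACTIONS[rng.randint(0, 3)]


def run_actor(seed, steps):
    """Return the action sequence of one actor; the logged seed reproduces it."""
    rng = random.Random(seed)
    return [pick_action(rng) for _ in range(steps)]
```

Because the generator is seeded, `run_actor(7, 20)` returns the same 20 actions on every run, which is what makes a failing simulation replayable from the logged seed.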

Class DispatchOracle An instance of the DispatchOracle holds all information that has to be provided from the actor to the oracle. For example, whenever a deal is entered, the DispatchOracle stores all information about the position that the SPANOracle later needs for verification. It is a simple object built to ship action information to the oracle.

Class SPANOracle The SPANOracle instance is created in the RiskRandomTesting class and runs on its own thread. This class is responsible for validating the system output, which is the initial margin amount for a given account. The SPANOracle retrieves a DispatchOracle object, and once it has the object containing the action data it starts to calculate the heuristic interval for the initial margin. The heuristic interval is used for the account holding multiple positions, while an exact verification is made on the account holding a single position. Most of the calculation logic is held in a utility class called SPANOracleUtil.

Single-position initial margin is verified exactly because the computations are relatively easy. As no inter-commodity spread credit or intra-commodity spread charge applies to a single position, the oracle calculates the exact initial margin and verifies it against the initial margin calculated by the SUT. For a single position the initial margin is max(Scan Risk, Short Option Minimum) − Net Option Value, which is obtained from Equation (4.9). Scan risk is calculated by multiplying the number of contracts with the maximum or minimum value of the risk array (depending on whether the account is long or short the position). Net option value, calculated with Equation (4.10), is then subtracted from the larger of scan risk and short option minimum to obtain the final initial margin.
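A minimal sketch of this single-position calculation follows. It assumes a signed quantity (positive for long, negative for short) and risk array values expressed as the loss per long contract in each scenario; these conventions and the function names are assumptions, and short option minimum and net option value are taken as given inputs rather than derived from Equations (4.9) and (4.10).

```python
def scan_risk(quantity, risk_array):
    """Worst scenario loss for the position.

    With a signed quantity this picks quantity * max(risk_array) for a long
    position and quantity * min(risk_array) for a short position.
    """
    return max(quantity * value for value in risk_array)


def single_position_initial_margin(quantity, risk_array,
                                   short_option_minimum, net_option_value):
    """max(Scan Risk, Short Option Minimum) - Net Option Value."""
    return max(scan_risk(quantity, risk_array), short_option_minimum) - net_option_value
```

For example, a long position of 10 contracts with the hypothetical risk array `[-50, 0, 80]` has a scan risk of 800, while the corresponding short position of 10 contracts has a scan risk of 500.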

Pseudo Code for the SPANOracle

1. Retrieve a data update from the DispatchOracle.

2. Identify what type of update it was (price, position, rate or volatility update) and save the data.

3. Calculate risk arrays with the new data.

4. Calculate the scanning risk for each instrument.

5. Calculate composite delta and net option value for each instrument.

6. Calculate short option minimum for each option position.

7. Query the SUT for the initial margin amount.

8. Verify initial margin on the single-position account exactly using (4.9).

9. Calculate the upper and lower bound using the method explained in section 5.2.1 for the multi-position account.

10. Verify that SUT’s calculated initial margin is within the interval.

11. Verify all simple pattern heuristics from section 5.2.2.

12. Go to step 1 when new data arrives to the oracle.
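Steps 7 to 10 above can be sketched as two checks, one exact and one interval based. The function names, the dictionary keys and the absolute tolerance used for the exact comparison are assumptions for illustration; the text does not state how rounding is handled.

```python
import math


def verify_exact(expected, actual, abs_tol=1e-6):
    """Exact check for the single-position account, up to an assumed tolerance."""
    return math.isclose(expected, actual, rel_tol=0.0, abs_tol=abs_tol)


def verify_interval(lower, upper, actual):
    """Heuristic check for the multi-position account."""
    return lower <= actual <= upper


def verify_update(sut_margins, exact_expected, interval):
    """Return a list of failure messages; an empty list means the update passed."""
    failures = []
    if not verify_exact(exact_expected, sut_margins["single"]):
        failures.append("single-position margin differs from exact value")
    lower, upper = interval
    if not verify_interval(lower, upper, sut_margins["multi"]):
        failures.append("multi-position margin outside heuristic interval")
    return failures
```

Returning a list of failures rather than raising on the first one is a design choice of this sketch; the framework described here stops the simulation as soon as a check fails.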

Class SPANOracleUtil This class holds all calculation logic needed for verifying the initial margin. All methods and variables are static, meaning we do not need to construct an object to call a method in this class. Other classes and tests can thus use this class in the future if needed. Keeping the calculation logic separate is beneficial because it makes it easier to add further calculations and verifications in the future.

Class SimpleBlackScholes A class holding static methods to calculate (4.1), (4.2), (4.5) and (4.6).

Class OptionData Calculating the theoretical price of an option requires the following values: volatility, interest rate, time to maturity, strike price, underlying asset price and option type (call/put). They are collected and stored in this object, which is accessible inside the SPANOracle instance.


5.5 Multi-Threaded Model

This model has many similarities with the Single-Threaded Model (STM), and much of the code is shared between the two models. The main differences are the following:

• The Multi-Threaded Model (MTM) has many actors performing actions simultaneously, which is made possible through multi-threading.

• The class SPANOracle is inactive when running the MTM, which implies that there are no risk validations during the market simulation.

• The MTM does not validate the actual initial margin amount, but rather checks whether the SPAN algorithm is robust enough to handle multiple actions with both unrealistic and realistic data.

• In the STM the test oracle listens to the system and verifies the initial margin amount for every transaction. The MTM’s evaluation process instead runs after all actors have performed their actions, and consists of checking log files for potential issues. A few servers are responsible for all risk calculations, and all processed messages are logged and saved, which enables error tracking. Searching for exceptions can help identify unwanted errors such as NullPointerException and detect system defects.

5.5.1 Implementation

Considering that most code used in the STM can be reused for the MTM, no further explanations are provided except for the error tracking script that was built. The structure of the MTM is visualized in Figure 7. As noted earlier, the overall structure is very similar to the STM’s structure shown in Figure 6.

[Figure 7 diagram. Components: Risk-Random-Testing, several SPANActor instances, SUT, log files, Error Tracking Script, results. Arrow label: logging.]

Figure 7: Structure of the software used in the Multi-Threaded Model. Note that the number of SPANActors can be configured to any positive number.


Error Tracking Script The error tracking script is implemented using three bash scripts. The first script simply looks for any exception occurring in the server log files and saves them to a new log file. The second script goes through the exceptions previously found, sorts them per server and counts the number of unique exceptions. This makes it easy to see which server causes the most exceptions and which types of exceptions are most frequent. A third script works as the runner by wrapping the two other scripts.
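The original tool is written in bash; the following Python sketch only illustrates the same idea of collecting exceptions and counting them per server and type. The server names, the log line format and the regular expression are assumptions, since the actual log layout is not described.

```python
import re
from collections import Counter

# Matches Java-style exception names such as java.lang.NullPointerException.
EXCEPTION_PATTERN = re.compile(r"\b([A-Za-z_][\w.]*(?:Exception|Error))\b")


def count_exceptions(logs):
    """Count exception types per server.

    `logs` maps a server name to an iterable of its log lines.
    Returns a dict mapping server name to a Counter of exception types.
    """
    counts = {}
    for server, lines in logs.items():
        counter = Counter()
        for line in lines:
            match = EXCEPTION_PATTERN.search(line)
            if match:
                counter[match.group(1)] += 1
        counts[server] = counter
    return counts


def noisiest_server(counts):
    """Return the server with the most exceptions, or None if there are none."""
    totals = {server: sum(c.values()) for server, c in counts.items()}
    if not totals or max(totals.values()) == 0:
        return None
    return max(totals, key=totals.get)
```

Given hypothetical input such as `{"risk-1": ["WARN java.lang.NullPointerException: x"], "risk-2": ["INFO ok"]}`, `count_exceptions` reports one `java.lang.NullPointerException` on `risk-1` and nothing on `risk-2`.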


6 Random Testing Results

This section presents the results from running the random testing framework that has been implemented. Two different models can be configured with the framework: the single-threaded model and the multi-threaded model. Both models simulate actors using the clearing system, where all actions performed by an actor should affect the initial margin amount. Oracles evaluate whether the initial margin is fair and calculated without any errors, using the different methodologies explained in section 5.

6.1 Single-Threaded Model

Many simulations using one actor have been performed. The actor enters deals and changes asset prices, yields and volatilities. Currently the random testing simulation stops as soon as an error occurs, that is, when the initial margin is outside the heuristic interval used for the multi-position account or when the exact verification fails for the single-position account. Because of this design, if a bug is present, an actor will only perform one or a few actions before the oracle finds the issue and triggers the random testing to stop. It is built this way because no logic has been implemented yet to automatically identify what the potential error is; this is currently a manual procedure. As a result of the interruption, the expected number of issues found decreases. Major errors that happen repeatedly and greatly affect the initial margin will not go unnoticed by the oracle, whereas minor errors that happen more irregularly will be overshadowed by any major errors. A total of four issues were found when running the Single-Threaded Model.

Net Option Value Issue The first bug found concerned net option value, which is an adjustment to cover option exercise risk. Net option value is calculated using Equation (4.10), which takes net option positions, contract size and theoretical price into consideration. All tradable instruments used by the actors have a contract size between 1,000 and 100,000, and most of the underlying assets have contract sizes between 1 and 1,000. The issue detected by the SPANOracle was that the initial margin is incorrect for most options because the contract size is not taken into account. Taking the contract size into account implies a multiplication by at least 1,000, meaning the bug is quite substantial and largely affects the initial margin. If an option with a contract size larger than 1 is part of a deal, the oracle will detect the error and stop the simulation. The multi-position account, which is verified with heuristics, will not always detect this. For example, if the account holds many future positions and a single, small option position, the net option value might not breach the heuristic interval. When the initial margin is verified for the single-position account, the issue is always detected.

Composite Delta Another bug found concerned the composite delta. Composite delta is a forward-looking option delta that is calculated using probability weights together with seven market scenarios. The composite delta bug was found during an attempt to build a true oracle for the multi-position account in the single-threaded model. This attempt was aborted due to time constraints. Calculating the exact initial margin requires the inter-commodity spread credit, which in turn requires the composite delta. The composite delta calculated by the SPAN algorithm was slightly different from the composite delta calculated by the oracle. The actual root cause of the issue is still undiscovered.

Extreme Risk Array Scenario The third issue found concerned the extreme scenarios in the risk array calculation. CME SPAN is the specific SPAN algorithm implemented and used in the clearing system under test. There are several other versions of SPAN, for example the London SPAN algorithm. In CME SPAN the last two risk array scenarios, also called extreme scenarios, are defined as a price move equal to 300% of the price scan range. The two extreme scenarios are then reduced by multiplying with a probability factor of 33% to lower the impact of these scenarios. Instead of 33%, as CME uses, the clearing system’s SPAN algorithm was configured to use 30%. The Single-Threaded Model found this bug during a verification on the single-position account, where the initial margin is verified exactly.
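The size of this deviation can be made explicit with a short worked calculation. Here L denotes the unweighted loss of a position in an extreme scenario; the symbol is introduced only for this illustration.

```latex
\[
\text{expected (CME): } 0.33\,L, \qquad
\text{configured (SUT): } 0.30\,L, \qquad
\frac{0.30\,L}{0.33\,L} = \frac{10}{11} \approx 0.909 .
\]
```

Under this reading, whenever an extreme scenario determines the scan risk, the scan risk reported by the SUT is roughly 9.1% lower than the value the oracle expects, which is a difference an exact verification detects.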

Future Price Affects Risk Array Calculation Calculating risk arrays for futures is a very simple calculation in which the price scan range is multiplied by 16 different values found in the SPAN scenario definition. The current implementation of the SPAN algorithm takes the current market price into consideration, which means that if the price of the future is $1, a buyer only risks losing $1 per contract held. One could argue that this is more realistic, but it is not how the algorithm is intended to work according to CME. CME does not take the market price into consideration, meaning the buyer of the $1 future still risks losing the dollar amount of the scan risk (max(Risk Arrays)) per contract held. If the scan risk was $1,000 and the buyer had 10 contracts, then according to CME this buyer still risks losing $1,000 · 10 = $10,000, even though the trader bought the future for $1 per contract. This bug was also found with the single-position account, which performs an exact verification. The random testing framework randomized a very low price on a future, which exposed the issue to the oracle, which expected a far greater initial margin.

6.2 Multi-Threaded Model

A three-hour market simulation with the MTM using four actors was conducted to explore the robustness, error handling and potential corner case errors of the servers calculating risk. During this time approximately 864,000 actions were performed by the different actors, and the servers did not have any downtime. For each action performed, the actors randomized either volatility surfaces, positions, instrument prices or interest rate curves into the system, and an action was fired every 50 milliseconds. Figure 8 illustrates an unrealistic interest rate curve that was randomized into the system. A realistic interest rate curve is shown in Figure 9. A sample of an unrealistic volatility surface from the simulation can be seen in Figure 10, which also shows a smoother and more realistic surface for comparison.
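The reported action count is consistent with each of the four actors firing one action every 50 milliseconds, which is the interpretation assumed in this check:

```latex
\[
\frac{1000\ \text{ms/s}}{50\ \text{ms/action}} = 20\ \text{actions/s per actor}, \qquad
4 \times 20\ \text{actions/s} \times 3 \times 3600\ \text{s} = 864\,000\ \text{actions}.
\]
```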

The error tracking script, which reads server log files for exceptions, did not find any exceptions related to risk calculations. Some other exceptions occurred, but they are left out of the results because they were all unrelated to risk calculations and the SPAN algorithm. Hence, this model did not find any issues related to the SPAN algorithm in the clearing system. Because no relevant issues were revealed during the simulation, the results from this model should be interpreted as positive from a quality perspective. The clearing system can handle multiple requests per second from different actors without any major issues while running on a development machine. As it is designed to handle several thousand trades per second this should not be a problem, but the result still increases the confidence that the risk domain can handle large amounts of data, including unrealistic data.

Figure 8: Unrealistic interest rate curve that was randomized into the clearing system by the testing framework.

Figure 9: Realistic interest rate curve with rates from US treasury yield curves. Data was downloaded on the 23rd of May 2017.

(a) Unrealistic volatility surface that was randomized into the clearing system by the testing framework.

(b) A smooth volatility surface, constructed from arbitrary data.

Figure 10: An illustration of the difference between an unrealistic volatility surface and a smoother, more realistic surface.


7 Implementation Results

The framework for testing risk algorithms with randomized input data was successfully implemented. Considering that part of this project’s purpose was to implement a random testing framework suitable for risk calculations, a major part of the results lies in the actual implementation. The current implementation allows great flexibility in terms of adding new verifications, actors, and even risk algorithms. Suppose we want to add a new action that requires additional verification and calculations for the SPAN algorithm. Any new action would be added to the RandomSpanAction class, which gives the actor another action it can perform. The verification logic, which includes collecting data from the SUT and some sort of assertion, belongs in the oracle class called SPANOracle. Any additional SPAN-related calculation logic belongs inside the SPANOracleUtil class. Figure 11 shows the accessible design of the classes and where method additions should be located. A scalable framework is important for a new tester or developer who wants to do further testing of the SPAN algorithm or test another risk algorithm. Below are some resulting benefits of this random testing framework implementation.

[Figure 11 diagram. RandomSpanAction: SetVolatility(), SetInterestRate(), DispatchOracle, ..., AdditionalActions(). SPANOracle: VerifyRiskArrays(), VerifyInitialMargin(), VerifyNetOptionValue(), ..., AdditionalVerifications(). SPANOracleUtil: CalculateRiskArrays(), CalculateOptionPrice(), CalculateIMInterval(), ..., AdditionalCalculations().]

Figure 11: The implemented framework has an accessible design, which for example allows for additions of actions and new verifications.

7.0.1 Extensibility

Adding Actions If a tester or developer decides to add actions, they should be declared inside the actor class. The actor holds the list of actions it is allowed to perform, while the actual action methods are defined in the RandomSpanAction class. For example, suppose a tester wants to add an action that transfers positions between accounts. Inside the actor class, declare that position transfers are allowed for the actor, then implement a method inside the RandomSpanAction class that performs the actual position transfer. All action methods are kept in the RandomSpanAction class to reduce potential code duplication and to give all actors access to them.

Adding Actors To make the simulation even more realistic it would be of interest to create more actors. Actors are supposed to resemble real market participants and users of the clearing system. For example, one actor could mirror Trader A, who is only allowed to trade (use the method EnterDeal()) from bank A with access to account A. Another actor could mirror a market data supervisor, with access to setting prices, volatilities and rates. A third actor could resemble a risk operator, who constantly monitors risk calculations by querying for them every 10 seconds. These are just examples of how an even more realistic simulation could be implemented with this framework. To extend the framework in this way, add a new actor that uses the same actor interface and hence implements all required actor methods. Inside the new actor class, define the action restrictions, then initiate the actor on a thread in the main class. This example extension of the Multi-Threaded Model, previously shown in Figure 7, would have the structure shown in Figure 12.
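The idea of actors sharing one interface while differing in their permitted actions can be sketched as below. The class names follow the examples in the text, but the base class, the action names and the permission check are assumptions for illustration and do not reproduce the framework's actual interface.

```python
class Actor:
    """Common actor interface: subclasses only declare their allowed actions."""

    name = "actor"
    allowed_actions = frozenset()

    def perform(self, action):
        """Perform an action if this actor type is allowed to, else refuse."""
        if action not in self.allowed_actions:
            raise PermissionError(f"{self.name} may not perform {action}")
        # A real actor would send a request to the SUT here.
        return (self.name, action)


class TraderA(Actor):
    name = "trader_a"
    allowed_actions = frozenset({"enter_deal"})


class MarketDataSupervisor(Actor):
    name = "market_data_supervisor"
    allowed_actions = frozenset({"set_price", "set_volatility", "set_interest_rate"})


class RiskOperator(Actor):
    name = "risk_operator"
    allowed_actions = frozenset({"query_risk"})
```

With this layout a new actor type is a few lines long: `TraderA().perform("enter_deal")` succeeds, while `TraderA().perform("set_price")` raises `PermissionError`.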

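[Figure 12 diagram. Span-Random-Testing starts three actors: Trader A (EnterDeal()), Market data supervisor (SetPrice(), SetVolatility(), SetInterestRate()) and Risk Operator (QueryRisk()). The actors act on the SUT, whose logging feeds the error tracking script and its results.]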

Figure 12: Visualization of what an extended version of the Multi-Threaded Model would look like.

Adding Risk Algorithms Adding a new risk algorithm requires more work than adding a new action or actor, but is still relatively straightforward. Assume a tester wants to add testing of the risk algorithm value at risk (VaR), which also calculates initial margins. Instead of adding complicated if-else logic inside the SPANOracle, a new VaRActor class and a new VaROracle class should be created. The VaRActor should implement the same actor methods as previously created actors, which provides the actor with the required actor methods. Then, inside the VaRActor, define the relevant actions it should be able to perform. To verify the new algorithm, an oracle suited to the value at risk algorithm has to be implemented. Configure the main class to initiate the new actor and oracle instead of the previously configured actor and oracle. The time-consuming part of adding a new risk algorithm will be the implementation of a good oracle. Figure 13 visualizes what the structure of the random testing framework could look like with a risk algorithm extension.


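[Figure 13 diagram. Risk-Random-Testing can be configured for SPAN or for VaR. SPAN-Random-Testing contains the SPANActor (EnterDeal(), QueryRisk(), SetPrice(), SetVolatility(), SetInterestRate()), the SUT, and the error tracking script with its results. VaR-Random-Testing contains the VaRActor and the VaROracle. Arrow labels: Configure SPAN, Configure VaR, logging.]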

Figure 13: Visualization of what the random testing framework would look like with two risk algorithms.

8 Discussion

In this section, we discuss the advantages and disadvantages of the methodologies used in section 5, as well as the results from sections 6 and 7.

Complex software requires a sophisticated testing strategy. The main strategy may be a combination of many strategies, so that the testing covers as much of the software’s code or modules as possible. With the implemented random testing framework, we combined several different strategies to test whether the SPAN algorithm calculated a fair initial margin.

A challenge in software testing is to find a balance between no verification and exhaustive verification. No verification implies no testing, and exhaustive verification means that a lot of time and resources are spent on testing the software. For example, implementing a true oracle could lead to test exhaustion because an advanced algorithm has to be implemented a second time. We approached this compromise between ease of implementation and test coverage by using random testing together with two types of oracles and an error tracking script.

Heuristic Oracle Heuristics are used to reduce the level of complexity when implementing an oracle and to prevent the test implementation from becoming as advanced as the algorithm itself. The main challenge of implementing a heuristic oracle is to find useful and simple heuristics, which was also challenging for the SPAN algorithm. To test the SPAN algorithm without spending all effort and time on implementing it again (a true oracle), the main focus was directed at implementing a heuristic oracle.

In section 5.2.2 we defined three simple heuristics that were part of the heuristic oracle. These simple heuristics were easy to implement, and all three test important parts of the clearing system. A big advantage of simple heuristics is that they are extremely reliable and also cheap to implement. The disadvantage of the simple heuristics that are part of our oracle is that they do not tell us anything about the accuracy of the algorithm.

In section 5.2.1, we used system breakdown to identify heuristics that could function as a surrounding interval for the expected initial margin amount of multi-position accounts. The idea behind creating an interval was to improve the accuracy testing of the algorithm. Obtaining the heuristic interval required more calculations than initially expected, which made this part of the oracle less reliable. When an oracle gets too complex, it may contain bugs of its own, and that is a major disadvantage of this particular heuristic check.

A great benefit of the extendable testing framework implemented in this thesis is that adding further heuristic checks is cheap and easy. Whenever a tester or developer finds a new heuristic, it can be added without difficulty.

True Oracle The other part of the oracle used in the single-threaded model verifies the initial margin for single-position accounts exactly, and can thus be viewed as a true oracle. An exact verification should be used whenever it can be done without adding too much complexity. That was also the reason for implementing the exact verification part of the oracle.

The exact verification was easier to implement than the heuristic interval that verifies the initial margin for multi-position accounts. Because the implementation was less complicated it is also more reliable. In this thesis, three of the four issues were found with the true oracle, which indicates that exact verification is very useful, especially when testing for accuracy.


Implementing a true oracle for the multi-position account would be far more complex than for single-position accounts, meaning it would also be a lot more expensive and less reliable. These are two major disadvantages of true oracles, and they are part of the reason for leaving a complete true oracle out of this thesis. Another reason was to find a suitable balance of verification load: the heuristic oracle functions as a compromise between ease of implementation and test coverage.

Error Tracking Script The error tracking script in the multi-threaded model is reliable because of its non-complex nature, which is one of the big advantages of this script. The script looks for exceptions in the system’s server logs to find unwanted corner case errors. The procedure is straightforward and does not require a complex algorithmic implementation. It works on a high level where it observes the clearing system’s state rather than evaluating it. With help from the script’s output, the evaluation process is easy and can be done manually by any tester or developer.

Bugs such as an unhandled division by zero would not go undetected by the error tracking script. Building testing tools that depend on other system functionality, such as logging, has an obvious disadvantage. Suppose the servers’ logging process is poor, or that it suffers from some other critical issue; then this script would be less useful. If, however, the logging functionality can be trusted, this is an excellent tool when used together with other testing, such as random testing. The error tracking script provided a lot of test coverage at a reasonable implementation cost (time).

Results As mentioned earlier, all issues were found either by using or by building the oracles used in the single-threaded model. All issues were related to calculation details of the SPAN algorithm, and no major error such as a server crash occurred. Prior to this thesis, traditional system testing had been performed on the SPAN algorithm. Traditional system testing is often performed using the sampling approach, where the input is a sample chosen by the tester and the same input is used over and over again (also known as regression testing). Regression testing leaves room for issues because only a very small set of input data is tested. The implemented random testing framework is a valuable complement to the previously conducted regression testing because it explores a much larger set of inputs and found defects that the earlier testing had not revealed.

No issues were found with the multi-threaded model. The purpose of the multi-threaded model is not to verify the initial margin amount. Rather, it tries to find errors related to risk calculations by using parallel execution and large amounts of input data. As the implemented random testing framework can be configured to be either single- or multi-threaded, it would be possible to implement a thread-safe oracle to test the accuracy of the initial margin amount. Considering that no issues were found using the multi-threaded model together with the implemented error tracking script, the clearing system’s risk domain shows great stability.


9 Conclusion and Outlook

This thesis had the three following main objectives:

1. Implement a random testing framework for risk algorithms

2. Investigate whether the SPAN algorithm calculates fair initial margin

3. Increase quality assurance in the risk domain

The first objective is the most measurable of the three objectives. Random testing is a powerful testing tool that allows testing from both realistic and unrealistic perspectives, and it has a higher probability of finding bugs than regular automatic regression testing. This framework will be useful for further testing of the SPAN algorithm, and it allows testing of new risk algorithms that may be incorporated in the clearing system. With the framework in place, test teams at Cinnober can extend the SPAN oracle by adding more detailed verifications and eventually extend the oracle implementation to a complete true SPAN oracle.

From the random testing results in section 6 we can draw the conclusion that the current implementation does not calculate a fair initial margin, because of the four issues found. These issues are all directly related to the initial margin amount.

As the four issues found will be reported and corrected by a developer, there will be an immediate quality improvement in the risk domain, and the quality of the SPAN algorithm itself will also increase. Because SPAN is a complex algorithm, there might exist other undiscovered bugs. Testing an algorithm with heuristics always leaves room for errors that exist on a detail level. However, heuristic testing is very powerful and far better than no testing at all. The implemented random testing framework will help to increase long-term quality assurance because of its current application and also because of the extensibility it provides. The multi-threaded model tells us that the system is stable with few obvious corner case issues, and that it can handle parallel execution with both realistic and unrealistic data. This increases the confidence in the quality of the risk domain. The quality assurance also increases because specific random testing focusing on the SPAN algorithm, with exact and heuristic verification, had not previously been conducted.


References

CME. SPAN 4 technical specifications, 2003.

CME. CME SPAN methodology. https://www.cmegroup.com/clearing/files/span-methodology.pdf, 2010.

Natasha de Teran. How the world’s largest default was unravelled. Financial News, October 2008.

Dorothy Graham, Erik van Veenendaal, Isabel Evans, and Rex Black. Foundations of Software Testing: ISTQB Certification. Cengage Learning, revised edition, 2008.

Douglas Hoffman. A taxonomy for test oracles. Quality Week, pages 1–7, March 1998.

Douglas Hoffman. Heuristic test oracles - the balance between exhaustive comparison and no comparison at all. Software Testing & Quality Engineering Magazine, pages 29–32, March/April 1999.

Douglas Hoffman. Mutating automated tests. Technical report, Software Quality Methods, LLC., 2000.

Noah Hojeberg. Random tests in a trading system. Master thesis, KTH, 2008.

Mikael Hovmoller. What is clearing and why is it important? Cinnober internal material, November 2015.

John C. Hull. Options, Futures, and other Derivatives. Pearson, 8th edition, 2012.

Amandeep Rehlon. Central counterparties: what are they, why do they matter and how does the bank supervise them? Technical report, Bank of England’s Market Infrastructure Division and Dan Nixon of the Bank’s Media and Publications Division, 2013.
