University of Rome “La Sapienza”
Engineering Faculty
Master Thesis in Computer Engineering
Autumn Session - October 2010
A contract-based event-driven model for cooperative environments:
the case of collaborative security
Leonardo Aniello
Supervisor: Prof. Roberto Baldoni
Examiner: Prof. Luca Becchetti
Assistant Supervisor: Dott.ssa Giorgia Lodi
to my family and my long-time and more recent friends
Acknowledgements
I believe that one’s results heavily depend on the context one lives in daily. Looking back at my life
and at the goals I’ve accomplished, I consider myself very lucky to have been surrounded by people
who have always given me the required serenity and the right boost to go ahead with my aspirations,
despite the usual big and small troubles that one often has to face.
In this regard, I’d like to thank with all my heart my mother and my father for having supported
me all these years, in spite of the several changes I’ve made to my future directions. And
I can’t forget to thank all my friends in Terni and its surroundings, from my high school friends to the
others I’ve met over time, for their loud company and strong understanding.
A special thanks goes to Roberto Baldoni and Giorgia Lodi for having introduced and guided me
through the nice experience of getting involved in a European project, and for the help they’ve given
me in the writing of my thesis. I’d also like to thank Stefano, Luca and Silvia for their contribution
to my work.
Contents

Introduction
1 Scenario and Requirements
  1.1 Reference Scenarios
    1.1.1 Distributed Denial of Service
    1.1.2 Man in the Middle
    1.1.3 Identity Theft
  1.2 Requirements
    1.2.1 General CoMiFin Requirements
    1.2.2 SR Management Requirements
    1.2.3 Privacy and Security Requirements
    1.2.4 Event Processing Requirements
    1.2.5 Performance Monitoring Requirements
2 CoMiFin Architecture
  2.1 The CoMiFin Service Model
    2.1.1 CoMiFin Principals
    2.1.2 Semantic Rooms
  2.2 CoMiFin Architecture
    2.2.1 SR Management
    2.2.2 Commodity Services
    2.2.3 SR Complex Event Processing and Applications
3 SR Life Cycle Management
  3.1 Requirements
    3.1.1 SR Model
    3.1.2 Segregation of Duties
  3.2 High-Level Design
    3.2.1 Partner and Account Management
    3.2.2 Software Management
    3.2.3 SR Schema Management
    3.2.4 SR Life Cycle Management
    3.2.5 SLA Violation Management
    3.2.6 Component Diagram
  3.3 Low-Level Design
    3.3.1 Architecture
    3.3.2 Data Model
  3.4 Implementation
    3.4.1 Technologies
    3.4.2 HowTo
4 An SR for Intrusion Detection
  4.1 Collaborative Intrusion Detection
  4.2 The Agilis System
    4.2.1 WebSphere eXtreme Scale
    4.2.2 Hadoop and MapReduce
    4.2.3 Processing Framework
    4.2.4 Long-Term Data Storage
    4.2.5 The Jaql Language
  4.3 The ID-SR Processing Steps
  4.4 Main Innovations
  4.5 Performance Monitoring
5 Conclusions
  5.1 Summary
  5.2 Future Directions
    5.2.1 Other Scenarios
    5.2.2 Botnet Detection
    5.2.3 Resource Allocation and Scheduling Optimization
    5.2.4 Privacy and Trustworthiness
Bibliography
List of Figures
Introduction
The rising availability of Internet connectivity and the increase in network bandwidth are boosting
the use of the World Wide Web for providing more and more services. Supported services range from
e-commerce to social networks, from telephone calls to business-to-business transactions. As a
consequence, a huge number of businesses rely on the Internet. Moreover, access to an enormous
quantity of high-value and sensitive information is enabled and secured by employing the Internet as
a Critical Infrastructure. What will the next generation of services provided by the Internet be?
And is the Internet reliable enough to support all these services?
This work aims at addressing these two questions together, investigating the paradigm of coopera-
tive environments as a possible means of letting diverse organizations collaborate in the context of
the security of Financial Institutions. A brief overview of cooperative environments is provided, in
order to introduce the main topics of the whole work. All these aspects are supported by CoMiFin,
a European project focused precisely on the security of Financial Institutions. A short description
of CoMiFin is also given, together with more detailed background about the Financial Institutions
scenario.
Cooperative Environments
For today’s software systems, the need to react timely and adaptively to unpredictable changes
in the environment, so as to identify and notify possible opportunities and threats to interested
actors, is becoming more and more crucial. Along these lines, a very interesting class of systems is
that of Sense and Respond Systems (SRSs). They detect and correlate external events (the sense
phase) and then produce useful outputs in time (the respond phase).
A primary property of SRSs is the ability to produce timely responses. Part of the complexity of
present environments is due to the high speed at which changes occur, and it is often crucial to react
with a limited delay. For example, systems that monitor Critical Infrastructures have to notify
anomalies as soon as possible, in order to prevent damage to people or property.
An important aspect of SRSs is that the quality of the generated responses depends on how much input
is gathered. Consider a system in charge of computing the fastest route to reach a certain
destination. If this system were based on road topology only, the calculation would not take into
account any problem due to traffic conditions or road accidents. The employment of some automatic
means of detecting the current situation of the roads would surely improve the quality of the computed
route. We can say “the more I can sense, the better I can respond”. So, another relevant property of
SRSs is the ability to gather input data from several sources, possibly heterogeneous and widely
distributed.
This latter property can be seen as a way of bringing together many different actors with the aim of
obtaining some added-value benefit. Returning to the route calculation example, road topology, traffic
conditions and real-time information about road accidents are correlated to get the fastest way.
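To make the example concrete, the route computation can be sketched as a shortest-path search whose edge costs blend static topology with live traffic measurements. The following Python sketch is purely illustrative and not part of CoMiFin: the graph encoding, the multiplicative slowdown factors and the function name are all assumptions.

```python
import heapq

def fastest_route(topology, traffic, src, dst):
    """Dijkstra over a road graph whose edge costs blend static travel
    time with a live traffic slowdown factor (1.0 means free flow)."""
    dist = {src: 0.0}
    prev = {}
    heap = [(0.0, src)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == dst:
            break
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for neigh, base_time in topology.get(node, []):
            # traffic[(node, neigh)] scales the static travel time
            cost = base_time * traffic.get((node, neigh), 1.0)
            nd = d + cost
            if nd < dist.get(neigh, float("inf")):
                dist[neigh] = nd
                prev[neigh] = node
                heapq.heappush(heap, (nd, neigh))
    # reconstruct the path by walking predecessors back to the source
    path, node = [dst], dst
    while node != src:
        node = prev[node]
        path.append(node)
    return list(reversed(path)), dist[dst]
```

With an empty `traffic` table the function falls back to topology-only routing; feeding it a slowdown factor for a congested road makes it switch to an alternative route, which is exactly the quality improvement the sensing step buys.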
Another interesting example concerns failure detection. In distributed systems, load peaks and
hardware/software failures can heavily affect the communication between nodes. The capacity to
dynamically detect failed nodes is fundamental for carrying out the main distributed communica-
tion protocols, but the delays introduced by the network and the topology of the network itself can
make it hard for each node to have the same knowledge about the health of the other nodes. The
idea is to let all the nodes collaborate by sharing their own knowledge in order to achieve a common
perception of which nodes are alive and which are not.
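A minimal sketch of this idea, assuming a gossip-style exchange of heartbeat counters; the class, its methods and the timeout value are illustrative choices, not a specific published protocol:

```python
import time

class GossipView:
    """Each node keeps a table of heartbeat counters for every peer and
    periodically exchanges it with others; merging keeps the freshest entry,
    so all nodes converge towards a common perception of who is alive."""

    def __init__(self, node_id, timeout=5.0):
        self.node_id = node_id
        self.timeout = timeout
        # node -> (heartbeat counter, local time we last saw it increase)
        self.table = {node_id: (0, time.time())}

    def tick(self):
        """Advance our own heartbeat; called periodically by the node itself."""
        hb, _ = self.table[self.node_id]
        self.table[self.node_id] = (hb + 1, time.time())

    def merge(self, remote_table):
        """Incorporate a peer's table, keeping whichever counter is fresher."""
        now = time.time()
        for node, (hb, _) in remote_table.items():
            local = self.table.get(node)
            if local is None or hb > local[0]:
                self.table[node] = (hb, now)

    def alive(self):
        """Nodes whose heartbeat was refreshed within the timeout window."""
        now = time.time()
        return {n for n, (_, seen) in self.table.items()
                if now - seen <= self.timeout}
```

A node that crashes stops incrementing its counter; after `timeout` seconds without fresh gossip about it, every peer independently drops it from its `alive()` set, yielding the shared view the paragraph above describes.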
In this regard, a very interesting area of research concerns the usage of SRSs to let a set of in-
dependent systems collaborate on an established common goal, thereby creating a cooperative
environment. A brief analysis of the properties satisfied by this class of systems is quite helpful to
identify what is needed to obtain collaboration in practice. First of all, due to the large volume
of input data and the demand to process it in a timely fashion, we need to use some
parallel computing paradigm, so that computation can be sped up as required to meet time
constraints. This in turn implies the employment of a huge number of computational, storage and
network resources, so that parallel processing can be carried out on a large input data set.
An emerging technology suitable for supporting cooperative environments is cloud computing [26].
Besides its innovative business aspects, it also concerns the provision, over the Internet, of
dynamically scalable and often virtualized resources. Its scalability is useful for facing peaks in
load and hardware/software failures. Virtualization is important for maximizing resource
utilization. So, cloud computing can be the right means for having at one’s disposal the required
amount of resources, on which the desired parallel computing framework can then be deployed.
CoMiFin
CoMiFin (Communication Middleware for Monitoring Financial Critical Infrastructure) [3] is an
EU project funded by the Seventh Framework Programme (FP7), started in September 2008 and
running for 30 months. The research area is Critical Infrastructure Protection (CIP), focusing
on the Critical Financial Infrastructure (CFI).
An increasing amount of sensitive traffic is being carried over open communication media, such as
the Internet. This trend exposes services and the supporting infrastructure to massive, coordinated
attacks and frauds that are not being effectively countered by any single organization. In order to
identify threats against Critical Infrastructures and business continuity, CoMiFin aims to facilitate
information exchange and distributed event processing among a subset of participants grouped in
federations. Federations are regulated by contracts and are enabled through the Semantic
Room abstraction: this abstraction facilitates the secure sharing and processing of information by
providing a trusted environment in which the participants contribute and analyze data. Input data
can be real-time security events, historical attack data, logs, and other sources of information that
concern other Semantic Room participants. Semantic Rooms can be deployed on top of an IP net-
work, allowing adaptable configurations ranging from peer-to-peer to cloud-centric, according
to the needs and requirements of the Semantic Room participants.
A key objective of CoMiFin is to prove the advantages of a cooperative approach to the
rapid detection of threats. Specifically, CoMiFin demonstrates the effectiveness of its approach
by addressing the problem of protecting the financial Critical Infrastructure. This allows groups of
financial actors to take advantage of the Semantic Room abstraction for exchanging and processing
information, thereby allowing them to take proactive steps in protecting their business continuity,
for example through generating fast and accurate intruder blacklists.
Scenario: Financial Institutions
Nowadays, the financial industry is witnessing technological and usage changes due to globalization,
new trends towards the “webification” of financial services (such as home banking, online trading,
and remote payments), and increased competition among financial stakeholders.
In such a context, it emerges that financial institutions’ infrastructures are no longer confined
within single organizational boundaries; they are becoming part of a global unmanaged financial
ecosystem that consists of interconnected financial domains and other critical infrastructures, such
as telecommunications and electricity supply, in which cross-domain interactions spanning different
administrative borders are in place.
As of today, the overall number of transactions being conducted over the above-mentioned financial
ecosystem is increasing. Specifically, the portion of traffic carried over public networks such as
the Internet is increasing, thus exposing the overall ecosystem to massive and coordinated
attacks and frauds.
Protecting the ecosystem from faults and malicious attacks is then essential in order to ensure
stability, availability, and continuity of key financial markets and individual businesses, since those
attacks might also significantly compromise the results of financial transactions.
Currently, some threats or attacks targeted at different financial entities are difficult or even im-
possible to detect using local, single-domain monitoring approaches. However, a novel
approach based on inter-organization cooperation and information sharing can improve security
threat detection capabilities.
The main objective of CoMiFin is to design and develop a distributed system that can enhance the
situation awareness of financial organizations, so as to allow them to better address security threats
and promptly trigger local protection mechanisms, thus preventing or mitigating dangerous effects.
In order to build the system, a possible scenario has been considered within which the effectiveness
of the CoMiFin solution can be proven.
My Contribution
Within the context of the CoMiFin project introduced so far, my contribution has been twofold:
• I have been actively involved in the analysis, design, development and installation of the SR
Manager, a component of the CoMiFin system architecture that will be described in depth later;
• I have developed a simple monitoring software aimed at collecting statistics about the
performance of Agilis, a distributed parallel computing framework that will be detailed afterwards.
Organization
The thesis is organized as follows:
- in Chapter 1, the scenario of Financial Institutions is investigated in more detail in order to
capture the requirements the system is expected to meet;
- in Chapter 2, the architecture of the CoMiFin system is illustrated with a top-down approach;
- in Chapter 3, the requirements for the SR Management components are identified and refined from
the high-level requirements, and the architecture of these components is described in more detail;
- in Chapter 4, an insight into an SR for Intrusion Detection is provided, together with a possible
architecture supporting it;
- finally, Chapter 5 outlines the conclusions drawn from this work and proposes some future
directions for this area.
Chapter 1
Scenario and Requirements
As already stated in the introduction, the collaborative approach can offer benefits in several
scenarios:
• in the financial context, organizations can cooperate by sharing their traffic data to improve
their security defenses;
• the calculation of shortest routes can be enhanced by combining information coming from sources
of different natures, i.e., by making them collaborate;
• reliable failure detection services can be built on top of collaborative environments that make
the involved nodes exchange their knowledge about the health of other nodes.
This chapter, as well as the rest of the thesis, is aimed at exploring in more detail the current
situation of financial environments. The most important threats are described and analyzed in order
to lay the basis for identifying the requirements that have been relevant to my contribution
and that the CoMiFin project is expected to address.
1.1 Reference Scenarios
The key elements of the overall CoMiFin financial scenario have been investigated starting from
the structure of a single organization (or business entity) that may be willing to use the CoMiFin
system and participate in the so-called CoMiFin cloud. Figure 1.1 depicts this structure. An orga-
nization can be thought of as logically divided into two principal parts: an internal part (the left-
hand side of the shaded rectangle in Figure 1.1) and an external part (the right-hand side of the
rectangle in the same figure). The internal part consists of the set of hardware and/or software
components that communicate with each other using intra-domain communications; intra-domain
communications are usually carried over possibly proprietary and highly secure networks, using
proprietary protocols. The internal part can in turn be constructed out of a set of internal networks
(the dotted circles in Figure 1.1) that interact with one another inside a major organization. These
networks might define the boundaries of the internal organizations that compose the major one.
Figure 1.1: Structure of business entity
Hardware and software resources deployed within the internal part are not accessible from the
outside, since they are not directly connected to the Internet and are properly protected by hard-
ware/software components in order to guarantee a high level of isolation. For instance, Figure
1.1 illustrates a bank corporation (e.g., Lloyds TSB) that may consist of a set of banks, each of
which can be present in different geographical areas by means of bank agencies. Each bank in
the major corporation, and recursively each bank agency, can use its own proprietary network in
order to carry out financial activities. The external part consists of the set of hardware and/or
software components connected to the outside world (i.e., the Internet). In the bank corporation
example of Figure 1.1, dedicated resources can be used, for instance, in order to provide end users
with e-banking services that communicate over the Internet. Communication and interaction
between the external and internal parts are regulated by specific security policies and several levels
of firewalls (which may include the definition of a DMZ for hosting external resources) in order to
protect the internal part from attacks and unauthorized accesses.
Based on this organization structure, Figure 1.2 shows how a single organization can contribute to
the construction of a global financial ecosystem, and the position of the CoMiFin cloud in that
scenario.

Figure 1.2: CoMiFin in a Financial Scenario

In the CoMiFin scenario, there exist actors strictly related to the financial context, as well as
critical service providers. Figure 1.2 depicts some financial actors such as banks (e.g., Unicredit,
Lloyds TSB) and clearing houses (e.g., SWIFT), although any other organization related to the
finance environment (e.g., insurance companies, regulatory agencies, government agencies), and
critical service providers such as telecommunication companies (e.g., AT&T) and power grid
companies, can also be involved.
Each actor that is willing to exploit the functionalities offered by the CoMiFin system can partici-
pate in the CoMiFin cloud with a number of so-called end-points (the pattern-filled circles in Figure
1.2). These end-points include dedicated software components that, along with the resources pro-
vided by each actor, are connected to the Internet in order to exploit the Internet’s robustness.
Events will flow from the various end-points into the CoMiFin cloud: they will be processed by the
CoMiFin monitoring system in order to obtain semantically enriched data (or processed data) for
threat detection. The processed data will then be disseminated among the interested actors
participating in the CoMiFin cloud.
In the considered scenario, the regular transaction activities that Financial Institutions (FIs) per-
form are isolated from the activities carried out by the CoMiFin system. In particular, financial
transactions are usually handled through dedicated networks and protocols (e.g., SWIFT) for inter-
FI activities, or through the proprietary secure internal or external networks that
FIs maintain (e.g., between bank branches).
In contrast, in order to exploit the functionalities offered by the CoMiFin system, FIs or service
providers sign a basic contract which regulates their participation in the CoMiFin cloud and their
usage of the CoMiFin system.
An actor participating in the CoMiFin cloud will provide data produced by its own internal moni-
toring software and, in return, it will get processed data from the CoMiFin system, allowing it to
detect possible threats in advance and take suitable countermeasures.
In this context, three case studies have been investigated with the aim of providing a clear picture of
the monitoring and communication capabilities offered by the CoMiFin system. The first two cases
are related to threats and failures that may occur in the global financial ecosystem:
- DDoS attack [28]: a malicious action aimed at making unavailable the services provided by
financial actors such as banks, telecommunication suppliers, insurance companies, and security
agencies.
- MitM attack: a particular threat carried out when an active attacker inserts himself
between two communicating parties with malicious purposes.
In addition to these, one more case strictly related to the financial context has been discussed, as
it introduces such typical financial crimes as Identity Theft and related fraud. This case considers
the problems related to the misappropriation of identity information for criminal activities such as
obtaining goods or services by fraud.
This section continues with a detailed description of the aforementioned case studies and of the
ways CoMiFin can address them.
1.1.1 Distributed Denial of Service
Distributed Denial of Service (DDoS) attacks are carried out through geographically distributed at-
tack infrastructures, commonly referred to as botnets. These can consist of huge numbers of zombie
machines (some recent cases report more than one million involved hosts) that are connected
to the Internet and controlled by the same attacker. In order to create the attacker’s distributed
infrastructure, a complex preparation phase is required. The CoMiFin monitoring system provides
benefits in two different DDoS scenarios:
1. Early detection of a DDoS attack
Assumption: in this case the CoMiFin monitoring system can provide benefits only if it is
possible to receive alerts about traffic spikes from the FIs and/or from Internet Service Providers
(ISPs).
2. Post-processing analysis
Assumption: the involved critical infrastructures (e.g., ISPs, FIs) are willing to share traffic and
system log data. In such a case there is no technical limit.
The preparation of this kind of attack can take a very long time. Usually, a specific malware propagates
through a huge number of machines, replicating itself several times and exploiting known
vulnerabilities. Once the malware is installed on a machine, the machine becomes a so-called zombie.
Additional required code is then downloaded from some designated repository, so that the compromised
machine is ready to accept instructions from an entity called the herder, which is in charge of
controlling all the zombie hosts and coordinating the whole attack. To overcome the limitation of
having a centralized coordinator, which would make the whole mechanism easy for security systems
to block, the communication between the zombies and the herder is carried out using P2P
schemes or by taking advantage of dynamic DNS.
The real attack begins when the herder orders it to the compromised hosts. The attack generally
lasts a short time and involves a very large number of malicious machines. The goal of a DDoS attack
is to saturate the network connections of the target host, or otherwise exhaust some of its resources,
in order to prevent it from correctly delivering its services. This can be achieved simply by generating
licit requests, if the number of zombies is very high. If there are not so many malicious hosts, the
requests to submit can be chosen so that they heavily stress the attacked system, for example
by issuing queries for data that is hard to generate and that cannot be cached.
1.1. REFERENCE SCENARIOS 9
After it has been launched, a DDoS attack can be detected by noticing a sudden peak in resource
consumption that makes the target system unable to fulfill incoming requests. The only viable
reaction is network traffic analysis aimed at understanding the characteristics of the attack and
identifying its sources, so that effective countermeasures (e.g., proper filtering rules) can be derived
and possibly disseminated to other interested actors (e.g., other FIs or ISPs).
The added value of CoMiFin results from its ability to gather traffic data from several sources (i.e.,
FIs and ISPs) and correlate them in order to recognize suspicious patterns. In this way, possible
attacks against multiple systems of a particular FI, or against several services of the same
infrastructure, can be detected earlier, because they are more likely to stand out in the aggregated
logs than in the logs of single FIs. Moreover, deeper processing of network traffic can allow the
identification of common properties of the attack sources. These properties can be employed to
define simpler and more effective filtering rules to disseminate to interested actors, and to use for
populating a common database of past attacks.
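The aggregation argument can be illustrated with a short sketch. The log record format, the threshold and the function name are hypothetical; the point is only that a source can cross a detection threshold in the aggregated view while staying below it in every single FI’s log:

```python
from collections import Counter

def suspicious_sources(fi_logs, threshold):
    """Correlate connection logs contributed by several FIs and flag
    source IPs whose aggregate request count crosses a threshold that
    no single FI's log reaches on its own.

    fi_logs: {fi_name: [ {"src_ip": ...}, ... ]}
    """
    aggregate = Counter()   # total hits per source IP across all FIs
    per_fi_max = Counter()  # highest per-FI hit count per source IP
    for fi, entries in fi_logs.items():
        local = Counter(e["src_ip"] for e in entries)
        aggregate.update(local)
        for ip, n in local.items():
            per_fi_max[ip] = max(per_fi_max[ip], n)
    # report only the IPs that stand out in the aggregated view but
    # would have gone unnoticed by every FI individually
    return {ip for ip, n in aggregate.items()
            if n >= threshold and per_fi_max[ip] < threshold}
```

A source hitting three FIs a few times each is invisible to each of them in isolation, yet crosses the aggregate threshold; a source hammering a single FI is excluded here because that FI can already detect it locally.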
1.1.2 Man in the Middle
Man in the Middle (MitM) attacks are carried out by making a legitimate user start a connection
with a rogue server that mimics the legitimate server’s behavior. The rogue server is then able to
initiate a fraudulent transaction with the real server by pretending to be the legitimate client.
Unlike the DDoS scenario, where the target is the FI, MitM attacks concern customer-based e-
services, such as e-banking, e-payment, and e-trading.
The CoMiFin monitoring system provides benefits in two different MitM scenarios:
• Early detection of a MitM attack
Assumption: in this case the CoMiFin monitoring system can provide benefits only if it is
possible to receive alerts about anomalous traffic patterns from the FIs.
• Post-processing analysis
Assumption: the involved critical infrastructures (e.g., ISPs, FIs) are willing to share traffic and
system log data. In such a case there is no technical limit.
This class of attacks requires that the end user believes he is communicating with a licit server,
while in reality he is sending his requests to a malicious one. There are several ways to achieve
this situation:
• a DNS server can be poisoned [25] in order to resolve specific hostnames to the IP address of
a rogue server;
• a router can be compromised [21] so that packets are routed to a malicious server;
• one of the emerging methods is phishing [29], which consists in making the end user believe he
is interacting with the legitimate web interface, while what he is actually viewing is an indistin-
guishable copy of the original one;
• authentication mechanisms may present vulnerabilities that can be exploited to carry out
MitM attacks.
Once a customer has been lured to a rogue web server used as MitM, the attack is extremely simple.
The user authenticates against the rogue server by sending his credentials. The rogue server stores
the user credentials, relays them to the licit server, and forwards the response to the user on behalf
of the licit server. This mechanism has a twofold purpose: it makes the attack much more realistic
than a traditional phishing strategy, because the user receives the expected replies from the server;
furthermore, the MitM node is able to eavesdrop on all data flowing between the client and the server,
thus accessing a great amount of critical information. It is also worth highlighting that the
MitM attacker is able to modify the content of the transaction on the fly, for example by altering
sensitive and critical information.
These attacks can be difficult to detect, because they appear as a normal transaction between a
server and a licit (already known and registered) client. Current detection strategies are based
on the identification of anomalous traffic patterns. In the case of a simple MitM attack, in
which only a very limited number of compromised servers is used to mediate the sessions of several
users of the same service, it is possible for a single FI to detect a possible MitM attack through
anomaly-detection algorithms. The underlying hypothesis is that customers usually access their
financial services from a limited number of devices, and that the same device is not used by a large
number of customers to access the same financial services in a short period of time. Hence, a
licit server receiving a high number of connections on behalf of different customers from the same
device can trigger an alarm about suspicious activities.
The added value of CoMiFin results from the possibility of detecting MitM attacks against different
organizations early, thanks to the correlation of traffic logs coming from the organizations themselves.
Once identified, suspicious IP addresses can be promptly and automatically disseminated to FIs
and ISPs, in order to let them properly update their blacklists and check existing logs to locate any
past attack.
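The heuristic just described can be sketched as a sliding-window count of distinct customers per device. The record layout, the window size and the customer threshold are assumptions made for illustration only:

```python
from collections import defaultdict

def mitm_alerts(logins, window, max_customers):
    """Flag devices from which an unusually high number of distinct
    customers log in within a short time window, the signature of a
    relay (MitM) node mediating many users' sessions.

    logins: time-ordered list of (timestamp, device_id, customer_id).
    """
    alerts = set()
    by_device = defaultdict(list)  # device -> [(ts, customer), ...]
    for ts, device, customer in logins:
        events = by_device[device]
        events.append((ts, customer))
        # drop events that have fallen out of the sliding window
        while events and ts - events[0][0] > window:
            events.pop(0)
        # a normal device serves few customers; a relay serves many
        if len({c for _, c in events}) > max_customers:
            alerts.add(device)
    return alerts
```

Run over logs correlated from several FIs rather than a single one, the same loop catches a relay that spreads its sessions thinly across organizations, which is precisely the added value of the cooperative approach.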
1.1.3 Identity Theft
A person’s identity, and the ability to prove it, are central to any commercial and financial activity.
Financial organizations need to verify an identity before issuing goods or services. They need
identity evidence to open bank accounts, obtain credit cards, finance, loans and mortgages, to
obtain goods or services, or to claim benefits. To this purpose, they need to ensure that:
• the person paying or applying for credit is who he/she says he/she is and lives where he/she
claims to live;
• the person’s name and address, and other information related to identity (e.g., SSN, fiscal
code, driver license number), are correct.
This section provides an overview of the motivations for considering identity theft and related
fraud crimes in CoMiFin, describes the preparation and execution of these kinds of attacks,
and presents some prevention and mitigation activities that can be managed by the CoMiFin
communication system. Here, identity theft is considered in its broadest meaning, whereas related
frauds refer to financial crimes.
Fraudsters can impersonate a person and take out various forms of credit using the victim's good name. This phenomenon is commonly known as identity theft and identity fraud. There is a temporal and functional difference between these two types of crimes:
• Identity theft is the misappropriation of the identity (such as the name, date of birth, fiscal code, SSN, current address or previous addresses, credit card number) of another person, without their knowledge or consent.
• Identity fraud is the use of a misappropriated identity in a criminal activity, to obtain goods or services by deception. An identity fraud usually involves the use of stolen or forged identity documents (including credit card numbers), hence it comes after an identity theft.
The economic cost to society, to enterprises and more directly to FIs provides significant motivation
for the design of a comprehensive framework to prevent and arrest the growth of identity fraud and
related crimes from their current levels. Statistics are impressive, as outlined below.
• According to the Federal Trade Commission (FTC), "identity theft cost American consumers US $5bn and businesses US $48bn last year" [24]. US lenders report losses of $1 billion a year due to identity fraud, but US authorities cannot accurately work out the total cost of identity fraud because of its complexity. Credit card fraud (25%) was the most common form of reported identity theft in 2006.
• In February 2006, the UK Home Office "reported that the annual cost of identity fraud had reached £1.72 billion" [30], up from "£1.3 billion in 2002".
• In Canada, the Better Business Bureau estimated that the identity theft and identity fraud "annual cost was $2.5 billion to Canadian consumers and businesses, and the total annual cost to the Canadian economy was estimated at $5 billion".
• The Australian Institute of Criminology (AIC) estimates the cost of fraud to Australia is in excess of AU $5 billion a year, which represents almost a third of the total cost of crime in the nation (AU $19 billion) [23]. The cost of identity fraud in Australia was estimated to be "AU $1.1 billion a year with an estimation error of AU $130 million in 2001-02" [27].
Generally, there are two types of attack methods: attacks on data repositories and attacks on persons.
The attack on a data repository, for example a database containing all the identities of a certain service, can be carried out while sensitive data is in transit or by exploiting temporary and unintentional exposures. Moreover, a malicious person working inside an organization can steal this kind of sensitive data.
The attacks directed at persons can employ several channels. The most important, and the fastest growing at the moment, is e-mail, which is the means used to carry out phishing attacks. Traditional mail and Internet websites are also exploited to deceive people.
This category of frauds is of interest for CoMiFin. In fact, these threats are growing quickly with the spread of e-commerce and e-banking services provided by FIs and other related organizations. People committing identity theft and fraud typically create new false identities or steal someone else's credentials to perpetrate financial crimes, which are obviously crucial for FIs. Moreover, effectively facing this issue is a very valuable service for FIs, because identity-related frauds are dangerously undermining customers' confidence in FIs, potentially causing huge future losses. Finally, there are practical difficulties in investigating these crimes, and the percentage of resolved cases is very low. As a result, FIs either write off the appropriated amounts as losses or resort to alternative modes of recovery via mercantile agents as a cost-effective response.
The CoMiFin project focuses on credit card frauds. The possibility of sharing information about identity thefts and credit card transactions between FIs and credit card companies can be usefully exploited to maintain and distribute effective lists of suspicious websites.
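One simple way such shared lists could be built is to cross-check reports from several participants, so that a site enters the common blacklist only when independently reported more than once. This is a hedged sketch; the function, its parameters, and the "minimum reporters" policy are illustrative assumptions, not part of the CoMiFin specification.

```python
from collections import Counter

def merge_blacklists(reports, min_reporters=2):
    """Combine per-participant lists of suspicious sites.
    reports: dict mapping participant name -> list of reported sites.
    A site makes the shared blacklist only if reported by at least
    `min_reporters` distinct participants, damping single noisy sources."""
    counts = Counter()
    for participant, sites in reports.items():
        for site in set(sites):  # each participant counts once per site
            counts[site] += 1
    return {site for site, n in counts.items() if n >= min_reporters}
```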
1.2 Requirements
The scope of CoMiFin is very wide. It addresses a complex topic that is becoming more and more critical nowadays and that involves many different aspects, ranging from the practical ways of making FIs cooperate, to the devising of algorithms for detecting/predicting threats, to the addressing of security and performance issues. Due to the extent of this project, my work and contribution have concerned some specific questions only. In this section, such questions are described in order to assess the actual requirements that have driven me during my master thesis. To this purpose, I've identified five main areas of interest:
1. general CoMiFin requirements
2. SR management requirements
3. privacy and security requirements
4. event processing requirements
5. performance monitoring requirements
1.2.1 General CoMiFin Requirements
This section is about the general and highest level characteristics that CoMiFin system is expected
to exhibit. These requirements introduce also some key concepts and definitions that hereinafter
will be used often, anticipating some design choices that will be exhaustively explained in the next
chapter.
One of the most crucial goals to fulfill is giving real evidence that the system is able to improve the security of FIs. The advantages of employing CoMiFin instead of enforcing security in the usual ways should be clear. More threats should be detected or predicted, early enough to let the FIs effectively react on their own or on the basis of suggestions provided by the CoMiFin system. Moreover, the proposed solution should be cost-effective for participating organizations.
One of the main key concepts that has heavily influenced the development of the system is that of collaboration, already described in the introduction. The ability of CoMiFin to prove its strengths is principally based on the assumption that the involved actors have the strong will to cooperate with each other for the common goal of raising their security level. Obviously, the quality of such cooperation also depends on which organizations are actually participating. The services provided by FIs (and targeted by the threats CoMiFin is willing to face) increasingly rely on complex chains of low-level services supplied by other kinds of providers. Power providers, telco providers and ISPs are as necessary as FIs for the correct delivery of those services. This implies that the participation of all the organizations involved in this chain is another decisive prerequisite for the success of the whole project.
Another primary concept that has been devised in the context of the project is that of Semantic Room (SR). The concept of SR is an abstraction aimed at creating a means for collaboration. By joining an SR, an organization has at its disposal a contract-regulated channel for sharing raw data (e.g., network traffic logs) and obtaining useful information for coping with security attacks. The aim of an SR is to bring different actors together and give them back the advantages derived from their cooperation. From a technical point of view, an interesting challenge arising from this interaction is the capacity to integrate several different domains that could be considerably heterogeneous and widely distributed across the world.
Thanks to the concepts of collaboration and SR, we can get into the details of the way CoMiFin is meant to work. The actors that have joined an SR pursue a common goal, for example the detection and prediction of DDoS attacks, or the blacklisting of sites suspected of being the source of MitM attacks. They share log files and other resources of interest, by means of the ability of the SR to integrate different domains and to gather a huge amount of input data. To this aim, another relevant technical requirement concerns the capability to interface with the existing monitoring systems that are already installed on the resources of participants. These input data are properly processed depending on the objective of the SR, and the resulting outputs (being either lists of suspected addresses or advice about the suitable countermeasures to take) are then disseminated to all the interested organizations.
1.2.2 SR Management Requirements
The SR concept introduced so far is central to all the activities carried out by the CoMiFin system. The management of the different SRs needs to be fully supported by the functionalities provided by the system. In order to define a distributed monitoring system and allow data, events and information to be exchanged among different domains, a global private interconnection infrastructure is required. The CoMiFin communication system shall support the creation of an overlay monitoring network on top of the Internet and shall provide basic communication functionalities.
The organizations interested in taking part in CoMiFin should be supported by the system in the creation of new SRs as well as in joining existing SRs. Moreover, all the operations related to the SR lifecycle management should be available. The possibility to leave a previously joined SR or the option to disband an existing SR should be supplied.
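The lifecycle operations just listed (create, join, leave, disband) can be summarized in a minimal registry sketch. All names below are illustrative; contract signing, authorization checks and distribution are deliberately omitted.

```python
class SemanticRoomRegistry:
    """Minimal sketch of SR lifecycle management (illustrative only)."""

    def __init__(self):
        self.rooms = {}  # sr_name -> set of member names

    def create(self, sr_name, creator):
        if sr_name in self.rooms:
            raise ValueError(f"SR {sr_name!r} already exists")
        self.rooms[sr_name] = {creator}   # the creator becomes SR Administrator

    def join(self, sr_name, partner):
        # In the real system, joining requires signing the SR contract.
        self.rooms[sr_name].add(partner)

    def leave(self, sr_name, partner):
        self.rooms[sr_name].discard(partner)

    def disband(self, sr_name):
        del self.rooms[sr_name]
```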
As mentioned before, joining an SR is regulated by a contract that should define the services offered and the rules to obey, together with some specification of the quality of the declared facilities and other possible technical and organizational aspects that characterize the SR itself.
1.2.3 Privacy and Security Requirements
Being committed to improving the security of FIs and ISPs on a large scale, CoMiFin must prove itself attack-proof against security threats. This is consistent with the philosophy of CoMiFin, based on the protection of all the critical infrastructures taking part in the chain that supports the delivery of financial services. In fact, from this point of view, the CoMiFin system itself is a critical infrastructure.
Another important class of security requirements concerns the communication of data. The main properties to guarantee are the following four classic ones:
1. confidentiality
Message confidentiality means that the content of a message can be accessed only by its legit-
imate recipient(s). Possible recipients are a single CoMiFin participant, a subset of CoMiFin
participants or all CoMiFin participants. The best solution to handle secure group commu-
nications depends on the purpose that induces CoMiFin participants in sharing information
with others. These processes shall be regulated by well defined policies defining data formats,
the minimum level of security to be guaranteed by the participants during the communica-
tions, information security rules, processing rules, QoS levels and other aspects.
2. integrity
Integrity means that the content of the transmitted message has not been changed during the
transmission operation. That is, the recipient of the message is sure that data is identically
maintained, so that the received message is the same originally sent.
3. authentication of the sender(s)
Message authentication allows all the recipients of a message to determine which of the CoMiFin participants is the sender. If authentication information is generated according to appropriate cryptographic techniques, such as digital signatures, it is also possible to provide non-repudiation properties.
4. non-repudiation of the message
Non-repudiation means that the recipient of a message is able to demonstrate to a third party that a specific CoMiFin participant sent the message itself. This property represents the basis on which cryptographic proofs of misbehavior can be built.
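As a minimal sketch of the integrity and authentication properties above, a message can carry a keyed MAC tag. Note the explicit limitation: a shared-key MAC cannot give non-repudiation (any key holder could have produced the tag); that property requires asymmetric digital signatures, which are outside this stdlib-only sketch. Function names and the key-distribution assumption are illustrative.

```python
import hashlib
import hmac

def tag_message(shared_key: bytes, message: bytes) -> str:
    """Attach an HMAC-SHA256 tag providing integrity and sender
    authentication among holders of the shared group key.
    Does NOT provide non-repudiation: any key holder can forge tags."""
    return hmac.new(shared_key, message, hashlib.sha256).hexdigest()

def verify_message(shared_key: bytes, message: bytes, tag: str) -> bool:
    """Constant-time comparison against a freshly recomputed tag."""
    return hmac.compare_digest(tag_message(shared_key, message), tag)
```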
A critical point for secure communication is the interface between the CoMiFin system and the internal network of participants. The CoMiFin system requires access to Internet resources and receives inputs from appliances and devices deployed in the internal network of the CoMiFin participant. To
guarantee higher levels of security, CoMiFin resources should be deployed only in isolated network
compartments. Multiple layers of firewalls (packet filtering and application gateways) should be
used to regulate the traffic flow between CoMiFin cloud and participant’s internal network, as well
as the traffic flow between CoMiFin cloud and Internet resources.
Moreover, involved organizations should have the possibility to anonymize the data they provide. In fact, the CoMiFin system shall prevent its participants from gaining knowledge about the identity and operations of other CoMiFin participants that are not willing to share such information. At the same time, it shall collect and maintain sufficient information to perform meaningful data analysis.
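One standard way to reconcile these two goals is keyed pseudonymization: identifiers are replaced with opaque tokens that other participants cannot reverse, yet equal inputs still map to equal tokens, so cross-log correlation remains possible. This is a sketch of the general technique, not CoMiFin's actual anonymization scheme; names and key handling are assumptions.

```python
import hashlib
import hmac

def pseudonymize_ip(secret: bytes, ip: str) -> str:
    """Replace an IP address with a keyed pseudonym. Without `secret`,
    other participants cannot recover the original address, but the
    mapping is deterministic, preserving joins across shared logs."""
    digest = hmac.new(secret, ip.encode(), hashlib.sha256).hexdigest()
    return "anon-" + digest[:12]
```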
The interaction with end users should also be properly secured. To this aim, besides the classic employment of username-password authentication to access any CoMiFin service accessible by browser, a further step should be added to guarantee an acceptable level of security. It's worth noticing that all the SR Management functionalities described before will be available through a web interface. Since the operations that can be executed through these functionalities have a strong impact both on the contract and on the membership of SRs, they must be considered highly critical. This means that a single malicious user could break the vital services provided by CoMiFin, becoming a real single point of failure for the whole system. A viable solution for addressing this issue is the application of the principle of Segregation of Duties (SoD). The basic idea behind SoD is that, to complete an operation that is considered critical, the intervention of at least two different users is required. In this way, the single point of failure can be avoided.
1.2.4 Event Processing Requirements
This section describes the requirements about the processing activities carried out by the system
on input data provided by participants, and the dissemination of related outputs. The CoMiFin system
shall be able to collect, analyze and process large (and possibly disjoint) event streams in order to
produce derived events, extract complex threats detection patterns and detect mounting attacks,
imminent failures and other potentially dangerous activities. Event processing capabilities shall
include:
• event filtering and classification;
• event pre-processing and formatting;
• event aggregation and correlation.
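The three capabilities above can be sketched as one small pipeline: drop low-severity events (filtering), normalize field formats (pre-processing), and count occurrences per source and type (aggregation) to surface patterns such as repeated probes from one address. The event schema and threshold are illustrative assumptions.

```python
def process_events(raw_events, min_severity=3):
    """Toy event-processing pipeline over dict-shaped events of the form
    {"source": ..., "type": ..., "severity": int}."""
    aggregated = {}
    for ev in raw_events:
        if ev.get("severity", 0) < min_severity:          # filtering
            continue
        key = (ev["source"].lower(), ev["type"].upper())  # formatting
        aggregated[key] = aggregated.get(key, 0) + 1      # aggregation
    return aggregated
```

A real complex event processing engine would of course add windowing, cross-stream correlation rules and derived-event emission on top of this skeleton.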
Moreover, the system shall be able to identify and suggest countermeasures and security policies to
be applied for preventing or reacting to malicious activities and for mitigating the effects of attacks,
failures and threats both at a local domain level and at the overall interconnected infrastructure
level.
1.2.5 Performance Monitoring Requirements
The contract that regulates an SR includes a section for QoS specification, where performance
constraints are likely to be set. In order to check the compliance to these constraints, monitoring
activities should be carried out to assess the actual performances offered by the system. For
example, interesting metrics for the processing performance could be the response time and the
throughput.
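The two example metrics can be computed directly from timestamped request samples. The sample format and function name below are assumptions for illustration.

```python
def performance_metrics(samples):
    """Compute average response time and throughput from a list of
    (arrival_time, completion_time) pairs, in seconds."""
    if not samples:
        return {"avg_response_time": 0.0, "throughput": 0.0}
    response_times = [done - arr for arr, done in samples]
    # Observation span: first arrival to last completion.
    span = max(done for _, done in samples) - min(arr for arr, _ in samples)
    return {
        "avg_response_time": sum(response_times) / len(samples),
        "throughput": len(samples) / span if span > 0 else float(len(samples)),
    }
```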
Chapter 2
CoMiFin Architecture
This chapter describes the architecture of the CoMiFin system. The aim is to design a flexible architecture able to address all the requirements identified in the previous chapter.
Firstly, the CoMiFin Service model is introduced. It allows its clients to form business relationships,
called Semantic Rooms (SR), which can be leveraged for the information sharing and processing
purposes. Each SR is associated with a contract determining the set of services provided by that
SR along with the data protection, isolation, trust, security, availability, fault-tolerance, and performance requirements. The CoMiFin middleware incorporates the necessary mechanisms to share the underlying computational, communication and storage resources both across and within each of the hosted SRs so as to satisfy the requirements prescribed by their contracts.
The processing within an SR is accomplished through a collection of software modules, called
CoMiFin Applications, hosted within the SR. The architecture for the application hosting support
is derived based on the types of the hosted applications, and their resource management require-
ments.
The design of the CoMiFin middleware faces many challenges stemming from the need to support
on-line processing of massive amounts of live data in a timely fashion, wide distribution, resource
heterogeneity, and diverse SLA constraints imposed by the SR contracts. In the architecture pro-
posal, these challenges are addressed by following a top-down approach wherein the system is first
broken down into a collection of high-level components whose functionality and interaction with
each other are clearly specified. The state-of-the-art is then further elaborated for each of the
proposed components and several design alternatives are presented to serve as a starting point for
the ensuing design and development effort.
2.1 The CoMiFin Service Model
The primary functionality supported by the CoMiFin middleware is to facilitate information ex-
change and processing among participating principals for the sake of identifying threats against
their IT infrastructure and business. The information sharing is facilitated through the SR ab-
straction, which provides a trusted environment for the participants to contribute and analyze
the input data. The input data can be real time security events, historical attack data, logs, etc.
This data can be generated by local monitoring subsystems (such as system management products, Intrusion Detection Systems (IDSs), firewalls, etc.) installed within the IT infrastructure of the participating principals, but it can also come from external sources.
The processing within the SR is accomplished through various types of applications, which can
support the following functionality:
1. data pre-processing, which may include filtering and anonymization;
2. on-line data analysis for real-time anomaly detection (such as complex event processing);
3. off-line data analysis;
4. long-term data storage for the future off-line analysis and/or ad-hoc user queries.
The results of the processing can be either disseminated within the SR, or exported to other SRs
and/or external clients. The output might include descriptions of the suspicious transactions,
users, network addresses, attack signatures, etc. The rules for information sharing and resource
provisioning within the SR are governed by the SR contracts.
2.1.1 CoMiFin Principals
CoMiFin principals are those organizations that can exploit the functionalities made available by the system. Specifically, in the current design and implementation, the following types of entities can be involved:
• FI stakeholders are the Financial Institutions primarily targeted by CoMiFin. Provided services are aligned with the needs of these organizations: banks, insurance companies, investment management organizations;
• FI Utilities forming the operational environment of financial institutions are the organizations that can influence the IT of the FI stakeholders (ISP, Telco, Power Grid, SW services, HW providers, external service providers);
• Additional FI players are organizations that play an important role in the interaction of FI stakeholders (government agencies, regulatory agencies, FI associations, brokers, stock exchanges, rating agencies);
• CoMiFin Providers are organizations that offer various types of IT resources for the operation of the system (ISP, ASP, HW/SW provider, hosting businesses, cloud computing platform provider);
• CoMiFin Authority is an organization that is trusted by the principals. The CoMiFin Authority (CA) is responsible for the organization and operation of the whole CoMiFin system. It can host some centralized parts of the system. Typically, a CA is a trusted third party, such as a central bank, a regulatory body or an association of banks. Such an authority does not necessarily provide resources for CoMiFin, but is needed for the management of contractual processes;
• CoMiFin Partner is a business entity or organization interested in the CoMiFin services. Unless otherwise stated, the term Partner will be used instead. A CoMiFin Partner subscribes to the system by signing a basic contract and can belong to the FI stakeholders, FI utilities, additional FI players, or CoMiFin Providers.
2.1.2 Semantic Rooms
An SR is a federation formed by a subset of partners for the sake of information sharing and processing. Partners can participate in a specific SR by performing a join operation.
Each SR is associated with a contract that defines the set of processing and data sharing services
provided by that SR along with the data protection, isolation, trust, security, dependability, and
performance requirements. Since SRs will typically encapsulate various types of data processing
services (such as on-line event processing, or long-running analytics and intelligence extraction),
the data processing scenarios will be used to illustrate the discussion of the SR functionality below.
Figure 2.1 depicts the SR data handling. As shown in this figure, SR participants provide raw data
that are then processed in order to produce processed data. Raw data may include real-time data,
inputs from human beings, stored data (e.g., historical data), queries, and other types of dynamic
and/or static content.
Processed data can be used for internal consumption within the SR (the lined arrow in Figure 2.1):
in this case, derived events, models, profiles, blacklists, alerts and query results can be fed back into
the SR so that the participants can take advantage of the intelligence provided by the processing.
In addition, a (possibly post-processed) subset of data can be offered for external consumption (the
diamond arrow in Figure 2.1). The processed data can be rendered by means of Graphical User
Interfaces (GUIs) or Dashboard applications.
Figure 2.1: SR data handling
Semantic Rooms Members and Clients
The organizations that participate in an SR are called SR Members. They have full access both to all the raw data that the members agreed by contract to contribute, and to the data processed and thus output by the SR. The Members can also be in charge of performing processing and dissemination.
In addition to the SR Members, there might exist SR Clients. They cannot contribute raw data directly to the SR but can be consumers of the processed data that the SR is willing to make available for external consumption. Figure 2.2 illustrates the SR Client and Member roles.
The CoMiFin Partner that instantiates a new SR becomes the SR Administrator of such an SR. It is in charge of:
• defining the details of the SR contract;
• managing/supervising the ongoing SR operations;
• possibly revoking the membership of Members;
• finally disbanding the SR.
Figure 2.2: SR Members and Clients
SRs can communicate with each other. In particular, processed data produced by an SR can be injected into another SR (e.g., through the SR client role above). The ability for SRs to communicate with one another may enable the composition of multiple services, provided by each individual SR, into higher-level functionalities. For instance, processed data in the SR related to "DDoS attacks in banks" can be used by a more specialized SR such as "DDoS attacks in banks of a country", whose data can in turn be used by the SR related to "DDoS attacks of a bank in a specific country", in order to provide CoMiFin partners with richer services.
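The composition idea can be sketched by modeling each SR as a processing function and feeding one SR's output into the next. This toy uses example IP prefixes and lambda-based processing purely for illustration; real SRs would exchange data through the client role and contract-regulated channels.

```python
class SR:
    """Toy Semantic Room whose processing is just a function over inputs."""

    def __init__(self, name, process):
        self.name = name
        self.process = process

    def run(self, inputs):
        return self.process(inputs)

# Broad SR: merge attacker IPs reported by all member banks.
broad = SR("DDoS attacks in banks",
           lambda reports: set().union(*reports))

# Specialized SR: keep only attackers seen against banks of one country
# (here crudely approximated by an example address prefix).
country = SR("DDoS attacks in banks of a country",
             lambda ips: {ip for ip in ips if ip.startswith("198.51.")})
```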
Semantic Room Contract
An SR Contract is used to define the set of rules for accessing and using data processed within
the SR. There is a contract for each SR. Partners willing to participate in the SR must sign it.
According to the state of the art on contract definition, an SR contract includes four main parts:
1. details of the involved parties
2. contractual statements
3. Service Level Specifications (SLSs)
4. the signatures of the involved parties
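The four parts above map naturally onto a small data structure. This is only a sketch; the field types and the SLS example keys are assumptions, not the contract schema actually used in CoMiFin.

```python
from dataclasses import dataclass, field

@dataclass
class SRContract:
    """Minimal representation of the four contract parts listed above."""
    parties: list                 # 1. details of the involved parties
    statements: list              # 2. contractual statements (rules, services)
    sls: dict                     # 3. Service Level Specifications
    signatures: dict = field(default_factory=dict)  # 4. party -> signature

    def is_signed(self):
        """A contract is in force once every party has signed it."""
        return all(p in self.signatures for p in self.parties)
```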
2.2 CoMiFin Architecture
The framework that supports the Semantic Room abstraction over a pool of (locally and geograph-
ically) distributed computational, storage, and network resources is shown in Figure 2.3. In the
framework we can clearly identify two principal layers: the SR management layer which is respon-
sible for the management of the SR, and the Complex Event Processing and Applications layer
which realizes the SR processing and sharing logic. In addition, all the architectural components
of both layers above can utilize various commodity services for:
• exchanging control and monitoring information among them (such as load and availability);
• managing resource allocation to the complex event processing and applications, both off-line and at run time, through the use of controllers and schedulers;
• storing processing state and data.
In the following subsections we describe the layers of our framework in more detail.
Figure 2.3: The framework
2.2.1 SR Management
This layer is responsible for supporting the SR abstraction on top of the individual processing and data sharing applications provided by the Complex Event Processing and Applications layer. It embodies a number of components fulfilling various SR management functions. Such functions include the management of the entire SR lifecycle (i.e., creation of an SR, instantiation of an SR, disbanding of an SR, management of the SR membership), the registration and discovery of SRs and SR contracts, the configuration and planning of SRs, the management of the communications among different SRs, and the management of trust and reputation within SRs. In addition, each SR member interfaces with an SR through the use of a component of this layer termed SR gateway. This component transforms raw data into events in the format specified by the SR contract. In general, this transformation is necessary as it depends on the specific SR objective to be met, and it comprises three distinct pre-processing steps, namely filtering, aggregation, and anonymization. The latter consists of applying different anonymization techniques in case privacy and confidentiality requirements are prescribed by the SR contract.
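The gateway's three pre-processing steps can be illustrated end to end on a toy log format. This is a hedged sketch: the record schema, the "failed login" filter and the keyed-hash anonymization are example choices, not the gateway's actual implementation.

```python
import hashlib
import hmac

def sr_gateway(raw_records, secret=b"sr-key"):
    """Illustrative SR gateway: filter, aggregate, then anonymize
    raw log records before they leave the member's domain."""
    counts = {}
    for rec in raw_records:
        if rec.get("status") != "failed_login":                  # 1. filtering
            continue
        counts[rec["src_ip"]] = counts.get(rec["src_ip"], 0) + 1  # 2. aggregation
    return [  # 3. anonymization: source IPs replaced with keyed pseudonyms
        {"src": hmac.new(secret, ip.encode(), hashlib.sha256).hexdigest()[:12],
         "failed_logins": n}
        for ip, n in counts.items()
    ]
```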
2.2.2 Commodity Services
A number of services can be used by both layers previously mentioned in order to provide functionalities such as communication, storage, resource and contract management, and monitoring. These services are transversal to the other layers and are described in isolation in the following, highlighting several state-of-the-art design alternatives that can be used for their implementation.
Storage Services
The Storage Service layer of the architecture consists of a collection of components providing various
kinds of storage services to the other layers. This layer can embody services for long term storage of
large data sets, such as monitoring logs and historical data, and for low latency storage of limited amounts of real-time data.
Communication
The Communication layer consists of a collection of components providing various kinds of com-
munication services to the other layers. This layer can include large-scale group communication
services, reliable low-latency, high throughput message streaming services that are useful for sup-
porting real time event streaming from within the external components into an SR, and pub-
lish/subscribe services.
Resource and Contract Management
The Resource and Contract Management layer is responsible for allocating physical resources (such as computational, storage, and communication resources) to the SRs so as to satisfy the business objectives (such as performance goals, data and resource sharing constraints, etc.) prescribed by the SR contracts. A capacity planning study might be completed in the preliminary phase of an SR startup. In this way, the Resource Management layer "can be aware" of the maximum capacity that each SR has to provide in terms of computational power, throughput, memory and storage. To this end, this layer can include a scheduler and placement controller for the initial allocation of the services to the physical resources, and a runtime load balancer for possible dynamic re-allocations. As each SR can count on a set of (locally and geographically) distributed data and computational resources, we find it convenient to consider the following three alternatives for the SR deployment, all of which can be supported by our framework:
• SR-owned platform
The computational resources of each SR are owned by its members, although one member is deputized as SR administrator. The computational platform of an SR is fully dedicated to the complex event processing and applications of that SR.
• Third party-owned platform
The computational resources are owned by a third party. They can be shared among the complex event processing and applications of the SRs. The collocation and data flows can be subject to some restrictions specified by the SR contract. This alternative corresponds to the so-called "SR as a service".
• Mixed platform
The computational resources of each SR are owned by the members of that SR. This platform runs the logic of its SR, but it can occasionally offer hosting services for running the logic of other SRs. This case is allowed only upon an explicit request from another SR whose complex event processing and applications exceed the capacity of its own computational platform, and it implies some business (and trust) relationship among the involved SRs.
Metrics Monitoring
The Metrics Monitoring layer is responsible for monitoring the architecture in order to assess whether the requirements specified by the SR contract are effectively met. As the Metrics Monitoring is a transversal layer of the framework, it operates at both the SR Management and Complex Event Processing and Applications layers. In particular, at the SR Management layer it is in charge of periodically collecting monitoring information related to the management of SRs (e.g., SR membership information) in order to detect whether that management violates the requirements included in the SR contracts. The Metrics Monitoring keeps track of the dynamic behavior of the SRs and checks whether or not the SRs and the SR members themselves are honoring their respective contracts. In case the Monitoring detects that SR contracts are close to being violated, it interacts with the SR Management components in order to trigger proper reconfiguration activities.
At the Complex Event Processing and Applications layer (see below), the Metrics Monitoring is in charge of periodically evaluating whether or not the resource management required by this layer is effectively able to support the execution carried out within this layer. In addition, it is responsible for detecting whether or not the processing execution violates the requirements specified in the SR contracts. The Metrics Monitoring uses "sensors", possibly located at the physical resource and container levels, in order to obtain the set of information required for enforcing the metrics of interest (in our current implementation we favored the use of the Nagios monitoring technology [12] for metrics monitoring purposes).
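The core loop of such contract compliance checking can be reduced to comparing monitored values against contract thresholds and invoking a reconfiguration hook on violation. This is a schematic sketch (the real system delegates collection to sensors such as Nagios); the SLA dictionary shape and callback signature are assumptions.

```python
def check_sla(measurements, sla, on_violation):
    """Compare monitored metric values against contract upper bounds.
    measurements: dict metric -> observed value
    sla:          dict metric -> maximum allowed value
    on_violation: callback(metric, value, limit) triggering reconfiguration."""
    violations = []
    for metric, limit in sla.items():
        value = measurements.get(metric)
        if value is not None and value > limit:
            violations.append(metric)
            on_violation(metric, value, limit)
    return violations
```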
2.2.3 SR Complex Event Processing and Applications
This layer consists of applications implementing the data processing and sharing logic required to
support the SR functionality. A typical application hosted in an SR needs to fuse and analyze
large volumes of incoming raw data produced by numerous heterogeneous and possibly widely
distributed sources, such as sensors, intrusion and anomaly detection systems, firewalls, monitoring
systems, etc. The incoming data are either analyzed in real time, possibly with the assistance of
analytical models, or stored for subsequent off-line analysis and intelligence extraction. This
suggests a characterization of the applications supported by an SR, whose runtime instances can
be hosted within various runtime container components. In particular, the application containers,
which can be either standalone or clustered, are the following:
• Event Processing Container
This container is responsible for supporting event-processing applications in a distributed
environment. The applications manipulate and/or extract patterns from streams of event
data arriving in real time from possibly widely distributed sources, and need to support
stringent guarantees in terms of response time and/or throughput.
• Analytics Container
This container is responsible for supporting parallel processing and querying of massive data sets
on a cluster of machines. It is used for supporting the analytics and data warehousing
applications hosted in the SR.
• Web Container
This container provides basic web capabilities to support the runtime needs of the web
applications hosted within an SR. These applications support the logic enabling the interaction
between the client-side presentation-level artifacts (such as web browser based consoles,
dashboards, widgets, rich web clients, etc.) and the processing applications.
Different implementations of the Complex Event Processing and Applications layer can be supported
by our framework. In particular, it is possible to deploy an SR that uses a central
server for the implementation of both the event processing and analytics containers. In this case,
the financial institutions of the SR send their own data to the central server. The central engine
performs the correlation and analysis of the data and sends the generated processed data back to the
financial institutions, letting each financial institution adopt its own countermeasures in a timely fashion.
However, although this solution is fully supported by our framework, it suffers from the inherent
drawbacks of a centralized system. The central server may become a single point of failure or security
vulnerability: if the server crashes or is compromised by a security attack, the complex event
processing computation it carries out can become unavailable or jeopardized. In addition, the volume
of events the central server can process per time unit is limited by the server's processing and
bandwidth capacities, thus limiting the system's scalability. Therefore, in our current implementation
of the architecture we favored the use of technologies that allow us to realize decentralized
complex event processing.
Chapter 3
SR Life Cycle Management
Most of my contribution to the CoMiFin project has been focused on the development of the SR
Manager component. It is in charge of managing the life cycle of the SRs and all the interactions
with the Partners and users of the CoMiFin system. A more detailed description of the SR model and
the involved users is provided so that the design choices become clearer. Since the SR Manager is the
main element for the management of the SR life cycle, its interactions with the other CoMiFin
components are also fundamental for better understanding its internal organization.
This chapter is organized according to the aforementioned topics.
• Firstly, the requirements specific to the SR Manager are identified and refined. All the matters
concerning the SR abstraction are addressed, from the details of the SR life cycle to the
structure of the contract. An exhaustive explanation is also given of the way users can be
managed in order to apply SoD principles.
• Subsequently, a high-level design is presented, based on the interactions with the other
components as well as with the end users, so that a functional overview of the SR Manager can
be provided.
• Then, an architecture is presented, together with a formal description of the data model on
which the SR Manager is based.
• Finally, a few words are spent on some implementation details, such as the employed
technologies.
3.1 Requirements
This section lays the groundwork for the subsequent functional design of the component. Concepts
central to the SR model, such as the Basic Contract and the Contract Schema, are precisely defined.
The life cycle of an SR is then completely specified, so that more formal approaches can be used to
implement the related software modules. The way SoD actually applies to user management is also
described.
3.1.1 SR Model
So far, the concepts related to SRs have not been elaborated enough to allow proceeding with the
design of the related management functionalities. In fact, the life cycle of a generic SR should be
formally defined. Before an SR can be instantiated, a contract must be specified, so precise
knowledge of the contract structure should be provided. Moreover, an explanation of how a generic
organization is expected to participate in CoMiFin is missing. In order to tell the whole story, this
section is structured as follows:
• Basic Semantic Room and registration: explains the way an interested actor can take part in
CoMiFin;
• Contract and schemas: defines the structure of a contract and introduces the important
distinction between SR Schema and SR Instance;
• SR life cycle: the various phases of the existence of an SR are specified.
Basic Semantic Room and Registration
In the previous chapter, the CoMiFin Authority (CA) entity was introduced. The fact that
it is trusted by every Partner is crucial and can be exploited to enforce restricted rules for
participating in CoMiFin. With the aim of reusing existing concepts, the participation of an entity
in CoMiFin has been modeled employing the notion of SR. The system comes with a fixed SR
called Basic Semantic Room whose Administrator is the CoMiFin Authority. An entity willing to
become a Partner has to join the Basic SR, signing the related Basic Contract. At the moment, the
exact content of such a contract has not been decided; it will surely become clearer once the system
goes into production. The key point here is that becoming a CoMiFin Partner corresponds to
joining the Basic SR.
Obviously, the mechanisms used to let an organization join the Basic SR are different from those
employed to make a Partner join a generic SR, because in the first case the organization must be
registered in the system and given the required accounts. Moreover, to prevent arbitrary companies
from participating in CoMiFin, there must be a point where a decision can be taken about whether
or not to accept a new Partner. To address this issue, a registration phase has been included, in
which an entity interested in joining CoMiFin specifies its general information and submits its
request to participate to the CA, which in turn decides whether such an entity has all the
characteristics required to take part in the system and acts accordingly.
Contract and Schemas
This section deals with SR contracts and, more generally, with the information needed to create
a brand new SR, with the aim of understanding the best way of formulating such functionality.
Before going on, it is important to give a more formal specification of the SR abstraction. It is
defined by the following three principal elements:
• the contract
Each SR is regulated by a contract that defines the set of processing and data sharing services
provided by the SR, along with other QoS requirements. The contract also contains the
hardware and software requirements a member has to provision in order to be admitted into
the SR.
• the objective
Each SR has a specific strategic objective to meet. For instance, there can exist SRs created
for implementing large-scale stealthy scan detection, or SRs created for detecting Man-In-
The-Middle attacks.
• the deployment
The event processing engine of each SR is implemented using a specific technology and a
particular set of software.
A simple example can help to introduce the next choices. Let us assume that there exists an SR, with
its own members, aimed at tackling DDoS attacks, and that its event processing engine is implemented
with a certain technology. Afterwards, other CoMiFin Partners decide to form another SR for facing
the same DDoS threat, but with an event processing engine using a different technology, maybe
because a better technology has come out or, for example, because the license of the other is not
suitable for these Partners. It is clear that the two SRs share many characteristics; in fact, they
are likely to differ only in the technology employed and maybe in the value of some QoS
parameters.
As another example, consider again an SR for DDoS, but constrained to have only Italian members.
This could happen, for example, because Italian banks and interested telco providers have found a
convenient common agreement for this collaboration. Then, assume that a similar agreement is
also reached in Spain, so that another SR for DDoS is created, this time forced to accept only
Spanish members. In this case too, the two SRs are almost equal, in that they differ only in the
membership.
These examples show the potential need to instantiate the same SR in different ways. In
order to address this requirement, a distinction has been made between the schema of an SR and the
instance of an SR. The idea is that when an SR is created, an SR schema is defined, specifying a set
of information that will be discussed later. The instantiation of an SR boils down to the creation
of an SR instance starting from an SR schema. Such an instantiation consists in the definition of
further information beyond that derived from the SR schema. Another distinction that arises
naturally is the one between a contract schema and a contract instance. An SR schema is associated
with a contract schema that defines a set of information. An SR instance derived from that SR
schema is associated with a contract instance that is based on that contract schema. Furthermore,
an SR schema can define a set of possible deployments, each of which corresponds to a different
set of software that sets up an instance. When an SR is instantiated, one of those deployments is
chosen.
Summarizing what was said in the previous chapter about the content of a contract, and putting
together the new concepts described so far, it is possible to structure the information contained in
a contract in a more precise way:
1. details of the involved parties
This section lists the Administrator, the Members and the Clients of the SR;
2. contractual statements
Here the following data are placed:
• the objective
• a unique identifier (SRID)
• the list of provided services
• the list of rules to obey in order to obtain the aforementioned services
3. Service Level Specifications
The SLSs are first grouped into several categories, for example "Data processing" or "Security
and trust"; then each SLS is defined with its name, its unit, its threshold value and possibly
other attributes of interest.
The schema of a contract defines or suggests some of this information. While the membership and
the SRID are specific to the single instance, all the other information can be specified, or at least
structured, in the schema. More precisely:
• the objective is defined in the schema but can be changed in the contract instance, to address
situations like the one presented before about a DDoS SR for Italian members and another
one for Spanish members; in fact, the contract schema may specify "SR for DDoS attacks" and the
contract instance may change it to "SR for DDoS attacks in Italy/Spain";
• the list of provided services as well as the list of rules are defined in the schema and cannot be
modified in the instance;
• the schema defines the list of the SLSs that will be available for the instances; in the contract
instance, the sublist of SLSs to include will be selected among the available ones, and for
each of them a threshold value and a possible penalty will be specified.
A contract instance can be presented as an XML document, so the contract schema can be thought
of as something that specifies how such an XML document has to be structured. The natural
mapping that takes place is between a contract schema and an XML schema.
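To make the schema/instance distinction concrete, a contract instance might look like the following sketch; all element names and values are illustrative assumptions, since the actual CoMiFin contract format has not been fixed here.

```xml
<!-- Illustrative sketch of a contract instance; element names and values
     are assumptions, not the actual CoMiFin contract format. -->
<contract srid="SR-DDOS-IT-001">
  <parties>
    <administrator>PartnerA</administrator>
    <member>PartnerB</member>
    <client>PartnerC</client>
  </parties>
  <statements>
    <!-- Objective refined from the schema's "SR for DDoS attacks" -->
    <objective>SR for DDoS attacks in Italy</objective>
    <service>Early DDoS alert dissemination</service>
    <rule>Members must share sanitized traffic logs</rule>
  </statements>
  <serviceLevelSpecifications>
    <category name="Data processing">
      <sls name="AlertLatency" unit="seconds" threshold="30" penalty="warning"/>
    </category>
  </serviceLevelSpecifications>
</contract>
```

The corresponding contract schema would then be an XML schema constraining which elements may appear, which services and rules are fixed, and which SLSs an instance may select.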
SR Life Cycle
This section describes all the operations related to the life cycle of an SR that can be executed
with the support of the SR Manager.
The first operation is the creation of an SR Schema suitable for the purpose. During such a creation,
all the information described in the previous section is specified by an end user. To instantiate an
SR, an SR schema has to be chosen and the missing information (updated objective, SLSs to include,
deployment to use) has to be specified. The Partner that executed the instantiation operation
becomes the Administrator of the SR.
Once instantiated, an SR can be joined by CoMiFin Partners that are interested in the services
it provides. In order to effectively become a member or a client of an SR, its Administrator has
to accept the join. In this way, a sort of authority is introduced to control the membership of
the SRs. A member/client of an SR can then leave the SR on its own, or can be forced to leave
by the Administrator, for example because several violations of SLSs due to its actions have been
detected.
Finally, an instantiated SR can be disbanded by its Administrator to terminate the SR's work.
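The life cycle just described can be summarized as a small state machine; the sketch below is a hypothetical model in Java (the implementation language adopted for the SR Manager), with state names taken from the prose rather than from the actual code.

```java
import java.util.EnumMap;
import java.util.EnumSet;
import java.util.Map;

// Hypothetical sketch of the SR life cycle as a small state machine.
// State names mirror the prose; they are not taken from the actual
// SR Manager implementation.
public class SrLifeCycle {
    public enum State { SCHEMA_DEFINED, INSTANTIATED, DISBANDED }

    private static final Map<State, EnumSet<State>> TRANSITIONS =
            new EnumMap<>(State.class);
    static {
        // A schema can be instantiated; an instantiated SR can be disbanded.
        TRANSITIONS.put(State.SCHEMA_DEFINED, EnumSet.of(State.INSTANTIATED));
        TRANSITIONS.put(State.INSTANTIATED, EnumSet.of(State.DISBANDED));
        TRANSITIONS.put(State.DISBANDED, EnumSet.noneOf(State.class));
    }

    public static boolean canMove(State from, State to) {
        return TRANSITIONS.get(from).contains(to);
    }

    /** Join, leave and forced leave only make sense on an instantiated SR. */
    public static boolean membershipOpAllowed(State s) {
        return s == State.INSTANTIATED;
    }
}
```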
3.1.2 Segregation of Duties
In order to meet the requirements of privacy and security illustrated in section 1.2.3, the concepts of
Segregation of Duties (SoD) [15] have been taken into account in the design of each functionality.
As already stated, the basic notion of SoD is that, for each critical operation, the intervention of
at least two different users is required. This prevents the misbehavior of a single legitimate
user, or the use of a stolen identity by a malicious user, from disrupting the whole system.
The first step in enforcing SoD is the identification of the operations that are to be considered
critical. Each of them has to be split into at least two tasks, so that its completion actually requires
the execution of several distinct and consecutive tasks. The next step is forcing these tasks to be
fulfilled by different users. In this regard, a Role-Based Access Control (RBAC) model can be
effectively employed.
Before going on with how SoD is practically applied in the SR Manager, it is necessary to define
the profiling of SR Manager users, because it also concerns role assignment, which is the
information needed to fully understand the remainder of this section. The end users of the
SR Manager access its functionalities using a web browser. A username-password authentication is
required to begin the interaction with the SR Manager, so a legitimate user is given an account with
the needed credentials. An account refers to a Partner, meaning that the user to which the account
has been given works for that Partner, at least from the point of view of CoMiFin. A Partner
can have many related accounts, while a single account refers to a single Partner only.
Similarly to a Partner playing one or more roles within a certain SR (Administrator, Member,
Client), an account plays one or more roles within its Partner. Each role enables the account
holding it to access specific functionalities of the SR Manager. This is the list of the roles supported
by the SR Manager:
• CoMiFin Authority
This role is given at installation time to a predefined account that is used to carry out the
actions of the CoMiFin Authority Partner, for example accepting new Partners;
• Creator
This role enables an account to manage software configurations and SR Schemas and can be
given to any account;
• Business Manager
There is exactly one account per Partner with this role, which is given to the account that
is defined when a new Partner is registered; this account is in charge of initiating all the
operations related to the SR life cycle (instantiation, join, leave, disband);
• Technical Administrator
There is exactly one account per Partner with this role, which is given to an account defined
by the Business Manager of the Partner itself; this account is in charge of managing all the
technical aspects of joining the SRs (configuring the resources to employ in a join, downloading
and installing SR software, etc.);
• Operator
There is no limit at all on the number of accounts in a Partner with this role; accounts that
already play the CoMiFin Authority, Business Manager or Technical Administrator roles
cannot have the Operator role; this role enables the account to access simple browsing
functionalities, without any means for configuring/managing other aspects.
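The role-compatibility rule above, stating that Operator is incompatible with the three privileged roles, can be sketched as a simple check. The names are illustrative, and the per-Partner uniqueness of Business Manager and Technical Administrator is omitted for brevity.

```java
import java.util.Set;

// Hypothetical sketch of the role-compatibility rule described above:
// an account holding the CoMiFin Authority, Business Manager, or
// Technical Administrator role cannot also hold the Operator role,
// and vice versa. Creator can be combined with anything.
public class RoleRules {
    public enum Role { COMIFIN_AUTHORITY, CREATOR, BUSINESS_MANAGER,
                       TECHNICAL_ADMINISTRATOR, OPERATOR }

    public static boolean mayGrant(Set<Role> current, Role requested) {
        if (requested == Role.OPERATOR) {
            // Operator is incompatible with the three privileged roles.
            return !(current.contains(Role.COMIFIN_AUTHORITY)
                  || current.contains(Role.BUSINESS_MANAGER)
                  || current.contains(Role.TECHNICAL_ADMINISTRATOR));
        }
        if (current.contains(Role.OPERATOR)) {
            // Conversely, an Operator account cannot acquire them.
            return requested == Role.CREATOR;
        }
        return true;
    }
}
```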
At the moment, the principles of SoD have been applied to the join operation. This operation
has been chosen for two reasons. First of all, it should be considered a critical operation because
it enables the participation of an organization in a running SR, and this could be risky both for the
Partners that are already members/clients of the SR and for the Partner willing to join. In fact, the
collaboration supported by the SR abstraction relies on the integration of the resources provided
by each participant, exposing them to possibly untrusted companies. The other reason concerns
the different skills required for carrying out the whole join. Business knowledge is needed to properly
choose the right SR, understand its contract and assess the nature of its members. Conversely,
a more technical profile is suitable for configuring the resources the Partner is willing to share with
the cloud. So, the SoD here is a choice driven by both security and areas of expertise.
Putting together both the SoD and the fact that a join needs to be accepted by the Administrator
(that is, its Business Manager) of the SR, the whole join operation can be split into the following
tasks, each of them with the indication of the role required for its execution:
• begin the join
The Business Manager of the Partner that wants to join the SR can begin the join;
• configure the join
The Technical Administrator of the Partner that wants to join the SR can then configure the
resources the Partner is willing to join the SR with;
• accept the join
The Business Manager of the Administrator of the SR can accept the join;
• complete the join
The Technical Administrator of the Partner that wants to join the SR can download the
software required to join the SR and install it on the previously configured resources; at
this point the join can be marked as completed.
The correct application of SoD principles is evidenced by the fact that three different
accounts are necessary for completing the join operation.
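The task/role mapping above can be summarized in code; this is a hypothetical sketch of the four-task workflow, not the SR Manager implementation.

```java
// Hypothetical sketch of the four-task join workflow with its SoD
// constraints: each task requires a specific role, and only the ACCEPT
// task is executed on the SR Administrator's side.
public class JoinWorkflow {
    public enum Task { BEGIN, CONFIGURE, ACCEPT, COMPLETE }
    public enum Role { BUSINESS_MANAGER, TECHNICAL_ADMINISTRATOR }

    /** Role required to execute each task, per the list above. */
    public static Role requiredRole(Task t) {
        switch (t) {
            case BEGIN:
            case ACCEPT:
                return Role.BUSINESS_MANAGER;
            default: // CONFIGURE, COMPLETE
                return Role.TECHNICAL_ADMINISTRATOR;
        }
    }

    /** BEGIN, CONFIGURE and COMPLETE belong to the joining Partner;
     *  ACCEPT belongs to the SR Administrator. */
    public static boolean executedByJoiningPartner(Task t) {
        return t != Task.ACCEPT;
    }
}
```

Since the joining Partner contributes two accounts (its Business Manager and its Technical Administrator) and the Administrator contributes a third (its own Business Manager), three distinct accounts are involved, as noted above.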
3.2 High-Level Design
The requirements pointed out in chapter 1 and refined so far, together with the general architecture
introduced in chapter 2, drive the detailed design of the SR Manager. This section identifies the macro-
functionalities that the SR Manager shall provide, describing each of them with respect to the
interactions with the other components and with the users; UML sequence diagrams [17] are used
to model such interactions. The final result of this section is the UML component diagram of the
CoMiFin components from the point of view of the SR Management modules.
The following macro-functionalities have been distinguished, as also depicted in Figure 3.1:
• Partner and Account management
• Software management
• SR Schema management
• SR life cycle management
• SLA violation management
A specific subsection is dedicated to each of them. This section ends with another subsection
which describes the aforementioned component diagram.
Figure 3.1: SR Manager macro-functionalities
3.2.1 Partner and Account Management
This macro-functionality regards all the operations related to the registration of a new Partner and
the configuration of its accounts. The considered functionalities are
34 Chapter 3: SR Life Cycle Management
� Partner registration
� Operator registration
� Account profile editing
� Creator role granting
� Account enabling/disabling
� Business Manager enabling/disabling
Partner Registration
Figure 3.2 illustrates the interactions that occur among the SR Manager, the user that registers the
Partner and the CoMiFin Authority that is in charge of confirming such registration.
Figure 3.2: Partner registration
The interaction takes place through the web GUI provided by the SR Manager. The creation
(registration) process consists of three separate phases, carried out by two different users, so that the
SoD requirements can be met. Here we are interested in the high-level interactions between CoMiFin
components, so we do not go into detail about the way the user actually specifies the information
about the Partner and the Business Manager. This is the reason why we model all these operations
with a single message, register(partner, business manager). This message represents all the
interactions between the user and the SR Manager (through the web GUI) for this first phase. The
store(business manager) message actually stores the Business Manager account as unconfirmed,
preventing it from logging into the system. Later, the CoMiFin Authority will confirm such account
so that it can log in. The ackRegistration(partner, business manager) message has been included to
model the obvious fact that the user actually receives a visual confirmation that the creation process
has been successfully completed. The notifyRegistration(partner, business manager) message
represents the notification (by e-mail) to the CoMiFin Authority that a new Partner has been
registered and that the CoMiFin Authority's confirmation is required to complete this operation.
Upon such confirmation, represented by the confirm(business manager) message, the SR Manager
updates the Business Manager account through the update(business manager) message. At this
point, the CoMiFin Authority receives from the SR Manager an ackConfirmation(business manager)
message, meaning that the Business Manager account has just been confirmed. Moreover, the
Business Manager is informed about this by the notifyConfirmation(business manager) message,
sent by e-mail. Now the Business Manager can log in and create the Technical Administrator
account. This interaction between the Business Manager and the SR Manager through the web GUI
is modeled by the create(technical administrator) message. Afterwards, the Technical Administrator
account is stored with the store(technical administrator) message and the Partner is finally enabled
through the enable(partner) message. The last message, ackCreation(technical administrator),
represents the visual confirmation that the whole operation has been successfully completed. It is
worth noticing that the procedure can be considered complete only after the definition of the
Technical Administrator. In fact, if a Partner had its Business Manager as its only account, it
could not join any SR.
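The ordering constraint of the registration phases can be sketched as follows; the class is a hypothetical model of the flow, with illustrative names, not the actual implementation.

```java
// Hypothetical sketch of the registration steps: the Business Manager
// account starts unconfirmed, the CoMiFin Authority confirms it, and only
// after the Technical Administrator is created is the Partner enabled.
public class Registration {
    private boolean bmConfirmed = false;
    private boolean taCreated = false;

    /** Models the CoMiFin Authority's confirm(business manager) step. */
    public void confirmBusinessManager() { bmConfirmed = true; }

    /** Models create(technical administrator), only valid after confirmation. */
    public void createTechnicalAdministrator() {
        if (!bmConfirmed)
            throw new IllegalStateException("Business Manager not confirmed yet");
        taCreated = true;
    }

    /** A Partner whose only account is the Business Manager cannot join any
     *  SR, so it is enabled only once both steps are done. */
    public boolean partnerEnabled() { return bmConfirmed && taCreated; }
}
```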
Operator Registration
Once the Technical Administrator has been registered, the Business Manager can configure any
number of Operators.
Account Profile Editing
Any registered account can edit its own profile, for example to update its personal data or change
the password.
Creator Role Granting
The Creator role can be given to any account. The CoMiFin Authority is the only user enabled to
grant it.
36 Chapter 3: SR Life Cycle Management
Account Enabling/Disabling
The Business Manager of a Partner can enable/disable any of the Partner's accounts, preventing
it from logging into the system.
Business Manager Enabling/Disabling
The CoMiFin Authority can enable/disable the Business Manager of any CoMiFin Partner,
preventing it from logging into the system.
3.2.2 Software Management
Software is a very important entity for SR Management. In fact, when an SR Schema is defined,
the Creator has to configure all the software needed for each of the expected deployments. Each
piece of software has to be described and uploaded to the system, so that the Technical Administrator
can download it when its Partner is going to join an SR. The usual operations of addition, update
and deletion are provided by the SR Manager to users having the Creator role.
3.2.3 SR Schema Management
An SR Schema is a sort of template that enables the instantiation of SRs which differ in only a
few details. The SR Manager provides a wizard that guides the end user through the definition of
a new SR Schema. Only users with the Creator role can access such a wizard.
The wizard consists of five steps:
1. objective and description definition
This first step is for writing a textual description of what the derived instances are expected
to do; in particular, the objective should describe the actual goal the future members will be
willing to achieve;
2. services definition
The next step is for defining the list of the services provided by the SR;
3. rules definition
Then, the rules for obtaining the aforementioned services are to be listed;
4. SLSs definition
In this step, the Creator has to choose the SLSs that will be included in the contract of the
derived instances; once an SLS has been selected, a flag to make it mandatory can be set and
a penalty can be associated with run-time violations of such an SLS;
5. deployments definition
This final step is for specifying the deployments, which are the possible ways an SR can be
deployed once instantiated; it is worth noticing that a deployment can be associated with any
subset of the software already configured in the system, and that deployments can be
added/removed using the corresponding buttons.
3.2.4 SR Life Cycle Management
Although these operations have already been described before, the focus here is on the coordinated
interaction between the SR Manager and the other CoMiFin components. A sequence diagram is
provided for each operation.
SR Instantiation
Figure 3.3 highlights that, once the Business Manager has completed the instantiation, the
information about the brand new SR is stored in the SR Registry and a message is published on a
specific topic to notify the interested components that another SR is available. Such a message is
then delivered to the SLA Manager, which reads the related contract and derives the SLAs from it.
These are stored in the SR Registry as well, allowing the MeMo to get them for enforcing its
monitoring activities.
Figure 3.3: SR instantiation
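The notification step can be illustrated with a minimal in-memory publish/subscribe sketch; the real system uses a JMS topic, so this stand-in is only meant to show the interaction pattern between the SR Manager and subscribers such as the SLA Manager.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Minimal publish/subscribe sketch of the instantiation notification:
// the SR Manager publishes an "SR instantiated" event on a topic, and
// subscribers such as the SLA Manager react to it. A plain in-memory
// observer stands in for the actual JMS topic.
public class SrEventsTopic {
    private final List<Consumer<String>> subscribers = new ArrayList<>();

    public void subscribe(Consumer<String> s) { subscribers.add(s); }

    /** Delivers the identifier of the newly instantiated SR to all subscribers. */
    public void publish(String srId) {
        for (Consumer<String> s : subscribers) s.accept(srId);
    }
}
```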
SR Join
Figure 3.4 models what has already been stated about the join and gives an idea of its complexity.
Besides the high number of human users involved, it is worth observing that
• proper e-mails are sent to the user that should act next;
• the information about the SR is updated in the SR Registry;
• the MeMo asks the SR Manager which resources are to be monitored for that SR, since they
are likely to have changed as a consequence of this update.
Figure 3.4: SR join
SR Leave
Figure 3.5 presents the spontaneous leave performed by the Business Manager of the Partner
wanting to exit the SR. This operation affects the membership section of the contract, so these
changes have to be applied to the data stored in the SR Registry. A notification is also published,
because the MeMo is likely to be interested in the fact that some resources no longer have to be monitored.
SR Forced Leave
Figure 3.5: SR leave
When the Administrator makes a Partner leave its SR, the only differences are the name of the
operation, which is in fact called Forced Leave, and the user allowed to execute it. All the other
interactions remain the same.
SR Disband
An Administrator can make its SR stop working. Figure 3.6 shows such an operation. Again, the
related data are updated in the SR Registry.
Figure 3.6: SR disband
3.2.5 SLA Violation Management
When the SLA Manager detects a violation of an SLS defined in a contract, it notifies the SR
Manager so that a possible penalty is applied to the responsible Partner. The action undertaken by
the SR Manager obviously depends on what has been configured in the SR Schema from which the
SR has been instantiated.
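A hypothetical sketch of this behavior, with the penalty lookup driven by the schema-derived configuration; the class and penalty names are assumptions.

```java
import java.util.Map;

// Hypothetical sketch of violation handling: the penalty applied upon an
// SLS violation is looked up in the configuration inherited from the
// SR Schema; if no penalty was configured for that SLS, nothing is applied.
public class ViolationHandler {
    private final Map<String, String> penaltiesBySls;

    public ViolationHandler(Map<String, String> penaltiesBySls) {
        this.penaltiesBySls = penaltiesBySls;
    }

    /** Returns the action to take against the responsible Partner,
     *  or null when the schema configured no penalty for this SLS. */
    public String onViolation(String slsName, String partner) {
        String penalty = penaltiesBySls.get(slsName);
        return penalty == null ? null
             : "apply " + penalty + " to " + partner;
    }
}
```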
3.2.6 Component Diagram
The interactions pointed out so far have provided a more detailed picture of how the SR Manager
fits within the whole CoMiFin system. Figure 3.7 expresses this idea in a more formal way using a
UML component diagram [17].
Figure 3.7: Component diagram
The interfaces exposed by the SR Manager are:
• UserProfilePort
The Dashboard is in charge of displaying much visual information about monitored metrics.
It also enforces some authorization rules, exploiting the related configuration set up in the
SR Manager. This interface serves to let the Dashboard get such configuration.
• SLAViolationPort
When the SLA Manager detects a violation, it uses this interface to notify the SR Manager.
• ResourceMapPort
Once a new member has joined an SR, the resources the Partner has declared to participate
with have to be monitored by the MeMo. This interface lets the MeMo read the details about
such resources.
• SREvents
This is actually a topic on which the SR Manager publishes all the news about SR life cycle
events: instantiation, join, leave and disband.
The only interface the SR Manager uses is SRPort, exposed by the SR Registry to let SR data
be inserted, updated and removed.
3.3 Low-Level Design
This section presents the low-level design of the SR Manager. An architecture is first proposed,
together with the description of its layers and sub-components; then its data model is formally
defined.
3.3.1 Architecture
In order to obtain a better separation of concerns and introduce indirection levels that may turn
out to be quite useful, the proposed architecture has been structured by laying the sub-components
out in several layers.
Figure 3.8: SR Manager architecture
Figure 3.8 shows such architecture. The way the layers have been conceived has been heavily
affected by the technology chosen for the implementation. The whole component has been designed
with JBoss Application Server [9] as the target middleware. This choice is motivated by the will to
use a cross-platform language such as Java and to exploit as much as possible the services usually
provided by a middleware solution. One of the most valuable services offered by JBoss is an
implementation of the EJB 3.0 specification [5]. In particular, an EJB container is available that
takes care of enabling the use of Entity Beans and Session Beans. The Data and Data Management
layers at the bottom of the architecture have been designed to fit the EJB support. The boxes in
the Data layer group the entities managed by the SR Manager. More details are provided in the
next section. The Data Management layer contains several Stateless Session Beans that enable
access to the persistent data managed by the layer below.
On top of the architecture, two layers are placed. Both exploit other technologies supported
by JBoss AS:
• web services [18]
• publish/subscribe topics [8]
• servlets and JSP [16]
In the Communication & Presentation layer there is a sub-component for each of the exposed
interfaces, which have already been described in the previous section. Moreover, the layer includes
all the software modules related to dynamic web page creation. The side-effect operations the web
GUI provides are mediated by several servlets, grouped on the basis of the macro-functionality
they refer to.
The middle Business Logic layer serves as a glue, merging the core functionalities supported by
the EJB container with the external world that interacts with the SR Manager by means of web
services, topics and web pages.
Besides the horizontal division into layers, a vertical one is also proposed, which highlights the
mapping between modules and functionalities.
3.3.2 Data Model
The data managed by the SR Manager are placed in the bottommost layer. A more formal
description of these data is given in Figure 3.9. An Account may be granted several Roles, and
a Role may be granted to several different Accounts. There actually are some higher-level
constraints in this regard that cannot be easily captured in UML [17]. For example, the same
account cannot be given both the Business Manager and the Technical Administrator role, otherwise
SoD would become unfeasible.
It is worth noticing that a Partner can have a twofold relationship with an SR: a Partner can
be the Administrator of an SR and/or join it as a member of a client. The latter relationship
is actually an association class, since it relates to a set of Resources specifying which physical
hosts the Partner wants to employ in the join. In turn, each Resource is linked to the set of
Software installed on it. Only the Software included in the SR can be downloaded and installed
on these Resources. This constraint too is not captured in the diagram but is enforced by the
business logic that works on top of the Entity Bean layer. The same reasoning applies to the
constraint that the Software of an SR has to be the same as the Software linked to the Deployment
chosen for that SR.
Figure 3.9: SR Data Model
SR and SR Schema both have relationships with PenaltyTemplate and SLSTemplate. The latter
represents a metric that the system is capable of monitoring; the former denotes an action the SR
Manager is expected to execute. SLSSchema is another association class, modeling the fact that a
certain SLS can be included in an SR and whether it is mandatory. SLS, instead, expresses that a
particular SLS has actually been put into the contract of an instantiated SR, with a specific
threshold to check.
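The SoD constraint mentioned above (the same account cannot hold both the Business Manager and the Technical Administrator role) is exactly the kind of check that lives in the business logic rather than in the data model. A minimal sketch of how such a check could look follows; the role names come from the text, but the method names and the set of conflicting pairs are illustrative assumptions.

```java
import java.util.EnumSet;
import java.util.Set;

public class RoleGrantCheck {
    enum Role { BUSINESS_MANAGER, TECHNICAL_ADMINISTRATOR, CREATOR }

    // Role pairs that SoD forbids on the same account (illustrative:
    // only the pair named in the text is encoded here).
    static boolean conflicts(Role a, Role b) {
        return (a == Role.BUSINESS_MANAGER && b == Role.TECHNICAL_ADMINISTRATOR)
            || (a == Role.TECHNICAL_ADMINISTRATOR && b == Role.BUSINESS_MANAGER);
    }

    // Returns true iff granting `requested` to an account that already
    // holds `granted` keeps Segregation of Duties feasible.
    static boolean mayGrant(Set<Role> granted, Role requested) {
        for (Role r : granted) {
            if (conflicts(r, requested)) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        Set<Role> granted = EnumSet.of(Role.BUSINESS_MANAGER);
        System.out.println(mayGrant(granted, Role.CREATOR));                 // true
        System.out.println(mayGrant(granted, Role.TECHNICAL_ADMINISTRATOR)); // false
    }
}
```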
3.4 Implementation
This section gives some information about the technologies used and includes a short how-to about
the usage of the web interface.
3.4.1 Technologies
The usage of JBoss Application Server [9] has already been motivated in section 3.3.1; version 5.1
has been employed. For data persistence, the RDBMS MySQL version 5.1 [11] has been chosen.
The target OS where an instance of the SR Manager has been deployed is CentOS 5.4 [2], a
community-supported, mainly free operating system based on Red Hat Enterprise Linux [14]. As
development environment, Eclipse Galileo (version 3.5) [4] has been extensively used.
The choice of all the aforementioned tools has been driven by the need to use open source software.
3.4.2 HowTo
Without claiming to be a complete user guide, this subsection reports some notes and screenshots
about the following functionalities, accessible through the web interface:
• registration
• login
• SR Schema creation
• SR instantiation
• SR join
Registration
Figure 3.10 shows the registration form. You have to fill in all the required fields, read and
acknowledge the Terms of Service and Privacy Policy, and then press the Confirm registration
button. You then have to wait until the CoMiFin Authority checks and accepts your registration,
after which you can log into the system using the credentials you specified before. The next step
is the definition of the Technical Administrator, as already explained in section 3.2.
Login
Figure 3.11 displays the login form. Simply type the username and password you provided at
registration time.
SR Schema Creation
Only a user with the Creator role can access this functionality, as explained before in section 3.2.3.
Figure 3.12 shows the wizard step for defining the SLSs, that is, choosing which SLSs to include
in future derived SR instances. Figure 3.13 shows the step for specifying the deployments that
will be available for the instances.
SR Instantiation
In order to instantiate a new SR, a user with the Business Manager role is needed, as stated in
section 3.2.4. You have to browse the available SR Schemas and choose the one you want to
instantiate. A wizard is then started, with the following steps:
• refine the objective and description previously defined in the Schema
• choose which SLSs to include and their values
• finally, decide the deployment to adopt
Figure 3.10: Registration
SR Join
This is a complex functionality, already described in section 3.2.4. The required steps are detailed
below. Let X be the SR to join, P the partner that wants to join X, and A the administrator of X.
• begin the join
Log in as Business Manager of P. Click Semantic Rooms in the left menu. On the Instantiated
Semantic Rooms page, click details on the row of X. On the Semantic Room details page,
press the Begin Join button and confirm your choice.
• configure resources
Log in as Technical Administrator of P. Click Semantic Rooms in the left menu. On the
Joined Semantic Rooms page, click details on the row of X. On the Semantic Room details
page, press the Configure Join button. On the Join configuration page, insert the requested
data, then press the Ok button and confirm your choice.
Figure 3.11: Login
• accept join
Log in as Business Manager of A. Click Semantic Rooms in the left menu. On the Instantiated
Semantic Rooms page, click details on the row of X. On the Semantic Room details page,
press the View membership button. Press the Accept Join button on the row of P and confirm
your choice.
• download software and complete the join
Log in as Technical Administrator of P. Click Semantic Rooms in the left menu. On the
Joined Semantic Rooms page, click details on the row of X. On the Semantic Room details
page, press the Download Software button. On the Downloadable Softwares page, click
download on the relevant rows and save the files somewhere. After having installed such
software, return to the Semantic Room details page, press the Complete Join button and
confirm your choice.
Figure 3.12: SR Schema creation - SLSs definition
Figure 3.13: SR Schema creation - Deployments definition
Chapter 4
An SR for Intrusion Detection
One of the SRs implemented so far carries out Intrusion Detection (ID) [20], based on the analysis
of stealthy port scanning activities. After a brief explanation of how ID can be enhanced with a
collaborative approach, the Agilis system is introduced. The processing steps performed by a
dedicated SR are then presented, and the main innovations brought by Agilis are described. This
chapter ends with a description of the work on performance monitoring.
4.1 Collaborative Intrusion Detection
The subjects of this kind of attack are the web servers handling the external web connectivity of
the participating FIs. Those web servers typically run outside the corporate firewall (in a DMZ),
and are therefore frequently targeted by attackers. The goal of the attack is to identify TCP ports
that might have been left open at the attacked subjects. The attack is carried out by initiating a
series of TCP connections to ranges of ports at each of the targeted DMZ servers. The ports that
are detected as open can be used as intrusion vectors at a later time.
Detection is based on identifying patterns of an unusually high number of TCP SYN requests,
possibly targeting an unusually high number of ports, originating from the same external IP
address. Such detection relies on thresholds that set the maximum number of requests which can
be issued before a source is classified as malicious. The problem is that the attacker spreads these
requests over several target hosts so that those thresholds are never exceeded. Here the
collaboration comes into play: the statistics are collected and analyzed across the entire set of
ID-SR participants, thus improving the chances of identifying low-volume activities, which would
have gone undetected if the individual participants were exclusively relying on their local
protection systems. In addition, to minimize the amount of false positives, the real-time suspicions
are periodically calibrated through a reputation system which maintains a site ranking based on
the past history of malicious activities originating from those sites.
4.2 The Agilis System
To support the required data analysis, a distributed event processing system called Agilis has been
implemented. It consists of a distributed network of processing and storage elements hosted on
a cluster of machines allocated from the ID-SR hardware pool. The processing is based on
Hadoop's MapReduce framework [1] [10]. The processing logic is specified in a high-level language,
called Jaql [7], which compiles into a series of MapReduce jobs. To improve detection latency, the
input data is gathered through buffers stored in an in-memory distributed storage system, called
WebSphere eXtreme Scale (WXS) [19]. The individual components of the Agilis framework are
illustrated in Figures 4.1 and 4.2, and described in detail below.
Figure 4.1: MapReduce-based Semantic Room
4.2.1 WebSphere eXtreme Scale
WebSphere eXtreme Scale (WXS) [19] is a distributed in-memory database implemented in Java.
It allows the user data to be organized into a collection of maps consisting of either relational
records, or key-value pairs. At runtime, the data are stored in Data Servers or containers hosted
on a cluster of machines. The clients can query the stored data using either a simple get/set API,
or full-blown SQL queries. The queries can be executed either on the client, or within a container
using an embedded SQL engine.
For scalability, a map's data can be broken into a fixed number of partitions, which are then
evenly distributed among the WXS containers by the WXS runtime. In addition, for
Figure 4.2: The Components of the Agilis Runtime
fault tolerance, each map partition can be replicated on a configured number of containers. The
information about the operational containers as well as the layout of hosted map partitions and
their replicas is maintained at runtime in the WXS Catalog service, which is typically replicated
for high availability.
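The partitioning and replication idea can be illustrated with a toy sketch. This is not the WXS API: the hash-based partition assignment and the ring-style replica placement below are illustrative assumptions about how a grid of this kind could distribute data.

```java
public class PartitionSketch {
    // Map a key to one of a fixed number of partitions by hashing,
    // as an in-memory data grid might do (toy, not the WXS API).
    static int partitionOf(Object key, int numPartitions) {
        return Math.floorMod(key.hashCode(), numPartitions);
    }

    // Evenly place a partition and its replicas on a ring of containers:
    // the primary goes to container (partition mod n), each replica to
    // the next container on the ring, so no copy shares a container.
    static int[] containersFor(int partition, int numContainers, int numReplicas) {
        int[] placement = new int[numReplicas + 1];
        for (int i = 0; i <= numReplicas; i++) {
            placement[i] = (partition + i) % numContainers;
        }
        return placement;
    }

    public static void main(String[] args) {
        int p = partitionOf("10.0.0.7", 13);
        int[] where = containersFor(p, 4, 1);
        System.out.println("partition " + p + " -> primary on container "
                + where[0] + ", replica on container " + where[1]);
    }
}
```

In WXS the analogous placement decisions are made by the runtime and recorded in the Catalog service, which clients interrogate to locate partitions.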
4.2.2 Hadoop and MapReduce
MapReduce is a programming model introduced by Google for processing and generating large data
sets. Users specify a map function that processes a key/value pair to generate a set of intermediate
key/value pairs, and a reduce function that merges all intermediate values associated with the same
intermediate key. Many real world tasks are expressible in this model.
Programs written in this functional style are automatically parallelized and executed on a large
cluster of commodity machines. The run-time system takes care of the details of partitioning the
input data, scheduling the program’s execution across a set of machines, handling machine failures,
and managing the required inter-machine communication. This allows programmers without any
experience with parallel and distributed systems to easily utilize the resources of a large distributed
system.
Apache Hadoop is a software framework that supports data-intensive distributed applications under
a free license. It enables applications to work with thousands of nodes and petabytes of data.
Hadoop was inspired by Google’s MapReduce and Google File System (GFS) papers. Hadoop is
a top-level Apache project being built and used by a global community of contributors, using the
Java programming language. Yahoo! has been the largest contributor to the project, and uses
Hadoop extensively across its businesses.
The usual interaction with the external world consists of the user submitting a MapReduce job
which defines both the map and reduce steps. The system is then responsible for splitting the job
into a set of concurrent map tasks and reduce tasks that are allocated to the available processing
nodes.
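The model can be sketched in miniature in plain Java. No Hadoop is involved here: this is a sequential toy whose only purpose is to make the map / group-by-key / reduce structure concrete; the example use (counting requests per source IP) anticipates the ID-SR processing described later.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;
import java.util.function.Function;

public class MiniMapReduce {
    // Runs the MapReduce model sequentially: map each input record to
    // (key, value) pairs, group the pairs by key, then reduce each group.
    static <I, K, V, R> Map<K, R> run(List<I> input,
                                      Function<I, List<Map.Entry<K, V>>> map,
                                      BiFunction<K, List<V>, R> reduce) {
        Map<K, List<V>> groups = new HashMap<>();
        for (I record : input) {
            for (Map.Entry<K, V> kv : map.apply(record)) {
                groups.computeIfAbsent(kv.getKey(), k -> new ArrayList<>())
                      .add(kv.getValue());
            }
        }
        Map<K, R> out = new HashMap<>();
        for (Map.Entry<K, List<V>> g : groups.entrySet()) {
            out.put(g.getKey(), reduce.apply(g.getKey(), g.getValue()));
        }
        return out;
    }

    public static void main(String[] args) {
        // Count requests per source IP: map emits (ip, 1), reduce counts.
        List<String> log = List.of("10.0.0.1", "10.0.0.2", "10.0.0.1");
        Map<String, Integer> counts = run(log,
                ip -> List.of(Map.entry(ip, 1)),
                (ip, ones) -> ones.size());
        System.out.println(counts.get("10.0.0.1")); // prints 2
    }
}
```

A real runtime parallelizes the map and reduce phases across machines and shuffles the grouped pairs over the network, but the programming contract seen by the user is exactly the two functions above.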
4.2.3 Processing Framework
The processing is carried out on the machines within the ID-SR cluster, and orchestrated through
the optimized Hadoop scheduling framework. The latter consists of a centralized Job Tracker (JT)
which coordinates the local execution of mappers and reducers on each of the ID-SR nodes through
a collection of Task Trackers (TT) (one per machine).
Most of the scheduling optimizations were targeted at improving locality of processing by
scheduling the map tasks close to the WXS partitions holding their respective input splits. To
match the input splits with the WXS partitions, a new implementation of Hadoop's InputFormat
interface has been provided, packaged with every Agilis MapReduce job submitted to the
JobTracker. The getSplits method of this interface is used by the JT to determine the split
locations at runtime (obtained by interrogating the WXS Catalog service), and the
createRecordReader method is used to create an instance of RecordReader that reads the data from
the corresponding WXS partition. To further improve locality, the implementation of RecordReader
recognizes SQL select, project, and aggregate queries (by interacting with the Jaql interpreter)
and delegates their execution to the SQL engine embedded into the WXS container.
In many cases, this approach resulted in a substantial reduction in the volume of intermediate
data reaching the reducers, thus improving latency and bandwidth utilization and reducing
processing costs. It also made it possible to further enforce the privacy of the input data submitted
by the individual ID-SR members, by scheduling the initial map processing on machines residing
within their administrative boundaries.
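The locality idea behind getSplits can be sketched as follows. This is a toy: the real implementation queries the WXS Catalog service and implements Hadoop's InputFormat interface, neither of which is reproduced here; the catalog is simply modeled as a map from partition id to host.

```java
import java.util.List;
import java.util.Map;

public class LocalitySketch {
    // One input split: a partition of input data and the host holding it
    // (in Agilis this information comes from the WXS Catalog service).
    record Split(int partition, String host) {}

    // Build the split list from a toy catalog of partition locations,
    // playing the role of InputFormat.getSplits.
    static List<Split> getSplits(Map<Integer, String> catalog) {
        return catalog.entrySet().stream()
                .map(e -> new Split(e.getKey(), e.getValue()))
                .toList();
    }

    // Preferred host for a map task: the host already holding the split's
    // data, so the input never needs to cross the network.
    static String preferredHost(Split split) {
        return split.host();
    }

    public static void main(String[] args) {
        Map<Integer, String> catalog = Map.of(0, "node-a", 1, "node-b");
        for (Split s : getSplits(catalog)) {
            System.out.println("map task for partition " + s.partition()
                    + " -> schedule on " + preferredHost(s));
        }
    }
}
```

A scheduler using this hint places each map task on (or near) its preferred host, which is exactly the optimization the text describes.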
4.2.4 Long-Term Data Storage
The Hadoop Distributed File System (HDFS) [6] is used to provide storage services for the massive
amounts of data that should be preserved over time (e.g., historical data keeping track of past
attacks). The data stored in HDFS can be injected into Hadoop through the provided HDFS
InputFormat implementation, and combined with the WXS data using the Jaql I/O constructs.
HDFS is managed and kept consistent by Hadoop's Chunk Manager (CM) and ZooKeeper (ZK)
services.
4.2.5 The Jaql Language
The processing logic is expressed in a high-level language called Jaql [7]. It supports SQL-like
query constructs that can be combined into flows, and it can interact with a large variety of data
sources thanks to its use of the standardized JSON data model. As shown in Figure 4.2, in Agilis
the locally compiled Jaql flows are first augmented with the input and output formats to interact
with WXS, and then submitted to the modified Hadoop scheduler, which orchestrates their
execution on the ID-SR machines as explained above.
4.3 The ID-SR Processing Steps
The processing steps followed by the ID-SR implementation are depicted in Figure 4.3. In the
first step, the raw data capturing the current networking activity at each of the participating
machines is collected using the tcpdump utility and forwarded to the local ID-SR gateway. Each
gateway then normalizes the incoming raw data, producing a stream of LogEvent records of the
form <SOURCEIP, DESTINATIONIP, SOURCEPORT, DESTINATIONPORT, BYTESSENT,
BYTESRECEIVED, RETURNSTATUS>. The LogEvent records are stored in a WXS partition
hosted on a locally deployed WXS container.
Figure 4.3: Data flow for Port Scan Detection in ID-SR
The incoming LogEvent records are then processed by a collection of MapReduce jobs handled by
Agilis. The processing logic consists of the following steps. First, the input records are subjected
to the Summarization flow, which consists of two processing steps surrounded by two I/O steps
(for reading the input and writing the results). The outcome of the two processing steps is a
collection of summary records of the form <SOURCEIP, PORTSNUM, REQNUM>, representing
for each source IP address (SOURCEIP) the number of distinct ports (PORTSNUM) accessed
from SOURCEIP along with the total number of requests (REQNUM) originating from SOURCEIP.
The summary records are then fed into the Blacklisting flow (see Figure 4.4), which blacklists a
source IP address if the number of requests and the number of distinct ports accessed from that
IP address exceed fixed limits. In addition, the summary records are also joined with historical
records of the form <SOURCEIP, RANK>, using SOURCEIP as a key, to adjust the long-term
rank representing the IP address threat level. The historical records are used to periodically
calibrate the blacklist by excluding the IP addresses whose ranks fall below a fixed threshold.
Figure 4.4: Jaql query fragments used for Port Scan Detection in ID-SR
4.4 Main Innovations
Agilis brings some innovations with respect to the usual use of the frameworks it builds on.
Firstly, Hadoop has been designed simply to support distributed computation: in the usual
scenario, a user submits a MapReduce job specifying where the input data are placed. Agilis
employs Hadoop as a computing engine for distributed event processing, where the computation
is activated by the arrival of new events instead of by the explicit action of a human user.
The other aspect, which really depends on the previous one, concerns the attempt to shift the
timeliness properties provided by Hadoop. Hadoop is designed for batch processing; here the goal
is to make it suitable for near-real-time processing, so that possible attack patterns can be
recognized early enough to allow proper countermeasures to be undertaken.
4.5 Performance Monitoring
Part of my contribution to the processing engine has consisted in designing and implementing
a simple system, called HadoopMeter, for monitoring the performance of an Agilis deployment.
PlanetLab [13] nodes have been used to set up the cluster. PlanetLab is a global research network
that supports the development of new network services.
The main need is to flexibly provide the following information about the performance of Agilis:
• response time: the average time it takes to complete a job
• throughput: the number of jobs completed in an hour
This information can be inferred from the log files produced by the JobTracker, which include
both the submission time and the completion time of each executed job.
It has been chosen to move the computation away from the node where the JobTracker is running,
in order to avoid further load on a node that is already a bottleneck, since the JobTracker is the
centralized scheduler for all the map and reduce tasks. For this reason, the logs are first quickly
scanned to extract the submission and completion times of just-completed jobs; this information
is then sent to another node where the rest of the processing is carried out. An end user can
access these statistics through a web site that, given a time window, displays the average response
time and the throughput of the jobs completed within such a range. Figure 4.5 shows a sketch of
the HadoopMeter architecture. The HadoopMeterSender component is in charge of reading fresh
logs to extract the required information about newly completed jobs and sending it to the
HadoopMeterReceiver component, which in turn saves it to a storage module called HadoopMeterDB.
HadoopMeterGUI enables end users to access and query the stored information through a web
interface.
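Once the (submission, completion) pairs have been extracted from the JobTracker logs, the two statistics are straightforward to compute, as in the following sketch (class and record names are hypothetical; HadoopMeter's actual log parsing is not reproduced):

```java
import java.util.List;

public class HadoopMeterStats {
    // Submission and completion times of one completed job, in ms
    // (in HadoopMeter these are extracted from the JobTracker logs).
    record Job(long submittedAt, long completedAt) {}

    // Average response time (ms) over the completed jobs.
    static double avgResponseTimeMs(List<Job> jobs) {
        return jobs.stream()
                .mapToLong(j -> j.completedAt() - j.submittedAt())
                .average()
                .orElse(0);
    }

    // Throughput as jobs completed per hour within the window [from, to).
    static double jobsPerHour(List<Job> jobs, long from, long to) {
        long done = jobs.stream()
                .filter(j -> j.completedAt() >= from && j.completedAt() < to)
                .count();
        return done * 3_600_000.0 / (to - from);
    }

    public static void main(String[] args) {
        List<Job> jobs = List.of(new Job(0, 60_000), new Job(10_000, 40_000));
        System.out.println(avgResponseTimeMs(jobs));         // prints 45000.0
        System.out.println(jobsPerHour(jobs, 0, 3_600_000)); // prints 2.0
    }
}
```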
Figure 4.5: HadoopMeter architecture
Chapter 5
Conclusions
This chapter concludes the thesis, summarizing the whole work and giving some hints about
possible future directions of research and application of what has been described so far.
5.1 Summary
This work can be seen as a top-down survey that begins from the advantages and opportunities
of collaborative environments, then moves through a practical application to the financial context
within the CoMiFin European project, and finally forks in two different directions:
• the design of the SR Manager component
• the analysis of the Agilis system
In this section, this survey is briefly retraced, highlighting the main aspects that have been touched.
More and more operations nowadays are carried out over the Internet, with a huge amount of
critical data and high-value information on the move over the net. The consequence of this trend
is that the Internet is becoming more and more a Critical Infrastructure (CI). But unlike other
CIs, this one is openly accessible by everyone all over the world, which is a very dangerous risk.
In the context of critical infrastructure protection, the need arises to properly protect the Internet
from attacks that are getting increasingly frequent, complex and very hard to face. The CoMiFin
European project addresses this need, focusing on the protection of Financial Institutions (FIs)
and of all the critical infrastructures that make FIs work.
The collaboration among financial actors can help in this direction. The correlation of traffic data
provided by several organizations is a very appealing way of coping with Internet attacks. Two
threats are mainly tackled by CoMiFin:
• Distributed Denial of Service (DDoS)
• Man in the Middle (MitM)
The Semantic Room abstraction is an interesting means for making FIs cooperate, sharing data
and resources with the aim of getting back useful information and advice for their own protection.
CoMiFin is actually aimed at implementing this abstraction, providing all the required
management functionalities and facing all the issues related to privacy and trustworthiness in a
financial context. In fact, strong guarantees are needed to convince an FI to move its sensitive
data through the Internet and to other FIs. Moreover, in order to enhance the security of the
CoMiFin system itself, the principles of Segregation of Duties (SoD) are applied. A very important
role in this project is played by the actual processing engine. Due to the huge amount of data to
process, the complexity of the detection algorithms to implement and the timeliness requirements
to meet, a flexible distributed event-driven computing paradigm is required.
These requirements have driven the design of an adaptable architecture that supports the Semantic
Room abstraction. In this architecture, two principal layers can be identified: the SR management
layer, which is responsible for the management of the SRs, and the Complex Event Processing
and Applications layer, which realizes the SR processing and sharing logic. In addition, all the
architectural components of both layers can utilize various commodity services for monitoring
activities, resource management and storage.
Most of my work has concerned the SR Manager component, chiefly in charge of supporting the
life cycle of SRs. It lets users define the contract of an SR in order to regulate the behavior of
its participants, and gives the possibility to create SR Schemas, a sort of template for obtaining
similar SRs. In order to improve the security of this component, users and related roles have been
conceived with the basics of SoD in mind.
I have spent some effort in the analysis of an SR for Intrusion Detection based on the recognition
of stealthy port scanning. It employs IBM Agilis as event processing engine. A simple monitoring
system has been implemented to maintain useful statistics about Agilis performance.
5.2 Future Directions
Although the scope of the work has been precisely defined, the previous survey has left space for
other topics that could turn out to be quite interesting to focus on. This section tries to give some
hints about such topics.
5.2.1 Other Scenarios
The Semantic Room abstraction supported by the proposed architecture is general and could be
applied to scenarios other than the financial one. In fact, it is really a means for enabling
collaborative environments, so whenever a cooperation among networked entities is suitable, an
SR-like approach can be feasible.
An interesting example scenario is mobility. Computing the fastest or the shortest path to reach
a certain destination is becoming an everyday action. This computation is usually based only on
road topology and does not consider the current situation of the roads, which can significantly
influence the time needed to cover them. But if we put together sensors for detecting traffic
conditions, other sources able to provide real-time information about road conditions (for example
accidents), and the existing topology, we can probably provide a better input for computing the
best path to the desired destination. Connecting these sources and correlating the data they
provide are operations that completely fit the functionalities offered by an SR.
5.2.2 Botnet Detection
This area concerns an important improvement to the Intrusion Detection algorithm executed by
Agilis. Currently, it produces a list of IP addresses suspected of being intruders. It would be very
interesting and useful to go a step further and try to recognize a whole botnet instead of single
IP addresses.
A botnet [22] is a number of Internet computers that, although their owners are unaware of it,
have been set up to forward transmissions (including spam or viruses) to other computers on the
Internet. Any such computer is referred to as a zombie. Most computers compromised in this
way are home-based. Often, botnets comprising hundreds of thousands of computers are rented
by criminal organizations to execute a DDoS attack aimed at undermining the reputation of the
target or at extorting money. This example gives an idea of the practical value of identifying
entire botnets.
5.2.3 Resource Allocation and Scheduling Optimization
The employment of cloud computing as a dynamic resource provider is very appealing for
collaborative environments. The ability to allocate resources to users so as to maximize hardware
utilization and react to sudden changes in the load, while at the same time fulfilling requirements
about fairness among served users and response time constraints, is currently an ongoing research
area that can be further investigated in the new scenarios enabled by collaborative environments.
Moreover, the way the various tasks that form a distributed computation are scheduled on the
available processing nodes is a key element for the performance of the whole parallel computing
framework. Within the CoMiFin context, we have seen the JobTracker component (section 4.5),
which performs the allocation of map and reduce tasks to TaskTrackers with available slots. We
have already pointed out that it constitutes a bottleneck for Hadoop performance and a single
point of failure for the entire system. Redesigning the JobTracker as a distributed component
could overcome this limitation and generally improve the quality of the entire architecture.
5.2.4 Privacy and Trustworthiness
A major obstacle to the practical usage of collaboration within the financial context is the lack of
proper guarantees about data privacy. The information managed by financial actors is very
sensitive and high-value, and any unexpected disclosure could cause huge economic losses and
severely undermine reputation. This risk prevents financial actors from sharing their data, cutting
off what actually would make the collaboration work well.
There are mainly two directions for addressing this issue. On the one hand, better technologies
for anonymizing data could be investigated, in order to encourage potential participants to share
their data by assuring them that the confidentiality of their information can be completely
preserved. The other way is increasing the level of trustworthiness among all the participants, so
that they become more confident in each other and inclined to provide their data.
Bibliography
[1] Apache Hadoop. http://hadoop.apache.org.
[2] CentOS – Community ENTerprise Operating System. http://www.centos.org.
[3] CoMiFin (Communication Middleware for Monitoring Financial Critical Infrastructure). http://www.
comifin.eu/.
[4] Eclipse Galileo. http://www.eclipse.org/galileo.
[5] Enterprise Java Bean 3.0. http://java.sun.com/products/ejb/docs.html.
[6] Hadoop Distributed File System. http://hadoop.apache.org/hdfs.
[7] Jaql - a query language designed for Javascript Object Notation (JSON). http://code.google.com/
p/jaql.
[8] Java Message Service. http://java.sun.com/products/jms.
[9] JBoss Application Server. http://www.jboss.org/jbossas.
[10] MapReduce: Simplified Data Processing on Large Clusters. http://labs.google.com/papers/
mapreduce.html.
[11] MySQL - The world’s most popular open source database. http://www.mysql.com.
[12] Nagios - The Industry Standard In Open Source Monitoring. http://www.nagios.org.
[13] PlanetLab - an open platform for developing, deploying and accessing planetary-scale services. http:
//www.planet-lab.org.
[14] Red Hat Enterprise Linux - The world’s leading open source application platform. http://www.redhat.
com/rhel.
[15] Segregation of Duties. http://en.wikipedia.org/wiki/Separation_of_duties.
[16] Servlet and Java Server Pages. http://java.sun.com/products/servlet.
[17] Unified Modeling Language. http://www.uml.org.
[18] Web Services. http://www.w3.org/TR/ws-arch/.
[19] WebSphere eXtreme Scale - An essential for elastic scalability and next-generation cloud environments.
http://www.ibm.com/software/webservers/appserv/extremescale.
[20] Chenfeng Vincent Zhou, Christopher Leckie, Shanika Karunasekera. A survey of coordinated attacks
and collaborative intrusion detection. Computers & Security, 29:124–140, 2010.
[21] Donald Smith. D-Link router based worm? available online at http://isc.sans.org/diary.html?
storyid=4175.
[22] Evan Cooke, Farnam Jahanian, Danny McPherson. The Zombie Roundup: Understanding, Detecting,
and Disrupting Botnets. Proc. of USENIX Workshop on Steps to Reducing Unwanted Traffic on the
Internet (SRUTI ’05), Boston, 2005.
[23] Federal Trade Commission. Consumer Fraud and Identity Theft Complaint Data January - December
2007. available online at http://www.ftc.gov/opa/2008/02/fraud.pdf, February 2008.
[24] Dan Ilett. US to force firms to ’fess up on data loss. available online at http://software.silicon.
com/security/0,39024655,39157787,00.htm, April 2006. Security Strategy.
[25] Kim Davies. 2008 DNS cache poisoning vulnerability. available online at http://www.iana.org/about/
presentations/davies-cairo-vulnerability-081103.pdf.
[26] Michael Armbrust, Armando Fox, Rean Griffith, Anthony D. Joseph, Randy Katz, Andy Konwinski,
Gunho Lee, David Patterson, Ariel Rabkin, Ion Stoica and Matei Zaharia. A View of Cloud Computing.
Communications of the ACM, 53:50–58, April 2010.
[27] S. Cuganesan, D. Lacey. Identity fraud in Australia: An evaluation of its nature, cost and extent.
Standards Australia International Ltd. Sydney, 2003.
[28] Susan Orr. DDoS Threatens Financial Institutions - Get Prepared! available online at http://www.
orrandorrconsulting.com/articles/DDoS-Threatens-Financial-Institutions.pdf, 2005.
[29] Tim Wilson. For Sale: Phishing Kit. available online at http://www.darkreading.com/security/
management/showArticle.jhtml?articleID=208804288.
[30] UK Home Office. Updated estimate of the cost of identity fraud to the UK economy. Available online
at http://www.identity-theft.org.uk/IDfraudtable.pdf, February 2006.
List of Figures
1.1 Structure of business entity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2 CoMiFin in a Financial Scenario . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.1 SR data handling . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.2 SR Members and Clients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
2.3 The framework . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
3.1 SR Manager macro-functionalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
3.2 Partner registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
3.3 SR instantiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3.4 SR join . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
3.5 SR leave . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.6 SR disband . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.7 Component diagram . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.8 SR Manager architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.9 SR Data Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.10 Registration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.11 Login . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.12 SR Schema creation - SLSs definition . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.13 SR Schema creation - Deployments definition . . . . . . . . . . . . . . . . . . . . . . 48
4.1 MapReduce-based Semantic Room . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
4.2 The Components of the Agilis Runtime . . . . . . . . . . . . . . . . . . . . . . . . . 51
4.3 Data flow for Port Scan Detection in ID-SR . . . . . . . . . . . . . . . . . . . . . . . 53
4.4 Jaql query fragments used for Port Scan Detection in ID-SR . . . . . . . . . . . . . . 54
4.5 HadoopMeter architecture . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56