Introduction to Distributed Systems Prof. Walter Kriha, fall term 2013/14 Hochschule der Medien

Introduction to Distributed Systems

Prof. Walter Kriha, fall term 2013/14

Hochschule der Medien

Overall Goals• Learn the basic concepts of Distributed Systems

like concurrency and remoteness• Understand different programming models for

Distributed Systems• Acquire a theoretical foundation of computability

in distributed systems• Learn to design distributed systems with a focus

on performance, availability, scalability and security

• Understand the constraints imposed by hardware

Timeline• Introduction to DS

• Socket based DS and delivery guarantees

• Remote procedure calls

• Remote objects

• Theoretical foundations of DS (FLP, time, causality and consensus)

• Distributed Services 1 and 2

• Distributed Security 1 and 2

• Distributed Components

• System Management in DS

• Design of Distributed Systems: Methodology and Examples

• Web Services: SOA and REST

• Peer-to-Peer Systems

• Ultra-large-scale Systems

Goal for today

• Give an overview of distributed systems. Later lectures will dig into the gory details like security, transactions, remote calling mechanisms etc.

Introduction

• What is a Distributed System (DS)

• Why distribute?

• Types of DS

• Characteristics of DS

• Middleware for DS

• Resources

• Exercises

Definition of a Distributed System:

Independent agents repeatedly interacting in a way that a coherent behavior („system“) emerges

Agents are intelligent, with incomplete local information, differ in capabilities, suffer from random events (local or network). See: A.B.Downey, Think Complexity

Why learn about Distributed Systems?

Because most IT systems ARE ALREADY DISTRIBUTED SYSTEMS (and not only the IT Systems)

The demands of a consumer Web site/API or multitenant enterprise application simply exceed the computing capacity of any one machine. [MC]

An enterprise moves an existing application, such as a three-tier system, onto a cloud service provider in order to save on hardware/data-center costs. [MC]

See: M.Cavage, There's Just No Getting around It: You're Building a Distributed System, ACM Queue, April. 2013

Why Distribute?

• Robustness: avoid single points of failures (e.g. Use hot stand-by data centers) with replication

• Performance: Split processing • Throughput: use many connections• Security: create different security domains• Price per request: use cheaper horizontal scaling or

free resources

Types of Distributed Systems

• Energy grid, telcom net

• Villages, towns and big cities

• It-Infrastructure of large companies

• High-performance clusters

• The WWW

• The human body, organizations, states

• A flock of birds

The energy grid now: hub and spoke

Power-plant

factory

officehome home

home

Electricity flows in one direction only, with a lot of it lost during transport. Control resides with the power plant.

www.wired.com/wired/archive/9.07/juice.html

The future grid : micropower

Power-plant

factory

officehome home

home

Power flows many directions, controlled by independent sensors in the grid. A tenfold increase of transactions. Modern GRID computing allows users to tap into a wealth of distributed computing resources. http://www.thegridreport.com/

Fuelcell

Fuelcell

Fuelcell

Fuelcell

Fuelcell

carcar

car

The power law of settlements

There are many villages, quite a few towns but only a small number of big cities

IT-Infrastructure of large corporations

Customer data zone

Un-trusted clients

Processing zone

Warehouse-Scale Machines

Datacenters are huge distributed systems, requiring special middleware. Diagram from: Barroso et.al., The Datacenter as a Computer

World Wide Web

Internet

Intranet

Dialup clients live at the „edges“ of the internet (no fixed IP address, slow upload). How many graphs are layered on top of the physical network structure? (hyperlinks, search-engines, DNS)

DNS

The New Web

Internet

Aggregation of external information and collaboration based on social networks will bring new forms of content production and consumption and consumer areas will influence companies („consumerization“, Gartner Group). More interconnection of different net-types brings more emergent phenomenons.

Sensors

P2p overlay

Locationbased service

mashups

Social network overlay

Mobile PAN network

Real world

Cams

Distributed Application Structure

Views of Distributed Systems

1) Enterprise View: the role within an enterprise

2) Information View: the flow of information in the DS (information architecture)

3) Computational View: the processing of information in the DS (logical architecture)

4) Engineering View: the system infrastructure (nodes, connections, system management, replicas etc.) (physical or distribution architecture)

5) Technology View: the specific technology used to build the DS

(From Open Distributed Processing plus my architecture categories)

Characteristics of Distributed Systems

• Influence of distribution topology and remoteness

• emergent behavior and computational irreducibility

• No analytic solutions

•Simulation is proportional to prediction time

• heterogeneous components

• a strong need for security

• Concurrency, Parallelism and replication

Don‘t worry, we‘ll dig into all this another day!

IT-Problems with Distributed Systems

• Badly supported in programming language concepts

• heterogeneous components

• a strong need for security

• Scalability and Availability

• Network problems

• Global naming/adressing

• Consensus

• transactions across nodes

• no global state

• no centralized control

• many points of failure

• Asynchronous communication

Programming Languages and Distributed Systems

DS:

-system defines security

-Objects are versioned

-Global identity

-Components

PL: -Design security (private, protected etc.)- no versioning, no components as types- memory address used for identity- platform dependent basic type size (int)

PL and DS are orthogonal.

Distribution Topology

The „small world“ effect:

It takes only a small number of intermediate persons to connect any person on this world to any other one. (A knows B, B knows C, .... F knows G.)

From: The Milgram experiments on social networks. (Andy Oram, Peer-To-Peer, Harnessing the power of disruptive technologies). OpenBC or LinkedIn create a social network from distributed participants.

Look at the facebook profiler by Thommy Fankhauser at http://profiler.mi.hdm-stuttgart.de/

Systems showing the small world effect

How efficient can this DS transport messages? Queries?

How robust is it against random attacks on nodes, targeted attacks on the important connecting nodes?

High local clustering

The power law of DS – a law of nature?

Cities, Companies, Power, social networks etc seem to exhibit the power law. Each size is a tenth of the next bigger size but has ten times more instances. „In defense of cities“, Clay Shirky, www.openp2p.com

Metcalfe’s law - Network Effects

• The usefulness of a network grows by the square of the number of users (think about a fax machine – how useful is one?)

• The adoption rate of a network increases in proportion to the utility provided by the network. (That‘s why companies give away software e.g.)

Emergent Behavior – a flock of birds

There is no central controller, no „Super-bird“. No bird has a representation of the figure in its head. Instead, every bird follows very simple rules. The resulting figure shows EMERGENT behavior. Many distributed systems show it as well – for good or for bad. (Kevin Kelly, Out of Control – The biology of the new machines. Peter Wegner, Interaction vs Algorithm.) Picture: http://www.pdfnet.dk/ (PD)

http://www.pdfnet.dk/

Emergence

„An emergent property is a characteristic of a system that results from the interaction of its components, not from their properties. [] Emergent properties are surprising: it is hard to predict the behavior of the system, even if we know all the rules“. A.B. Downey, Think Complexity (www.thinkcomplexity.com)

Heterogeneous Components

Hardware unreliable

Frequent downtimes

Little endian byte order

Java Data Types

No callbacks

Slow, no access control

Fault tolerant hardware

System management

Big endian byte order

C++ data types

Fast, access controlled

Security in Distributed Systems

Authentication

Authorization

Authentication

Authorization

Integrity, Confidentiality

But: sometimes anonymity is needed!!

(peer-to-peer systems)

Security Topics

• Authentication (who are you?)

• Authorization (what can you do?)

• Confidentiality (can someone spy on us?)

• Integrity (Did somebody change your message?)

• Non-repudiation (It was you who ordered X)

• Privacy/Anonymity

• Firewalls

• Certificates, Public Key Infrastructure, Digital Signature

• Encryption (methods and devices)

• Software Architecture

• Intrusion Detection

• Sniffing

• PGP, SSL etc.

• Denial of Service attacks

Theoretical Foundations of DS

• No global time (logical clocks, vector clocks)

• FLP theorem of asynchronous systems

• The problem of failure detection and timeout

• Concurrency and deadlocks

• CAP theorem: consistency, availabilty and partitioning: choose two only!

• End-to-end argument

• Consensus, leader selection, etc.

Important Programming Terms for DS

• Identity

• Value vs. Reference

• Exception

• Interface vs. Implementation

• Interface Definition Language (IDL)

• Quality of Service (QOS)

• Stubs/Proxies

Distributed System Design

• Common Problems (performance, fail-over, maintenance, policies, security integration)

• Information Architecture (define and qualify the information fragments and flows)

• Distribution Architecture (create a map of all participating systems and their quality of service)

Middleware for Distributed Systems

What is Middleware?

software that helps two separate systems communicate seamlessly. (www.knownow.com/middleware/lexicon.html)

In a strict sense middleware is transport software that is used to move information from one program to one or more other programs, shielding the developer from dependencies on communication protocols, operating systems and hardware platform (“plumbing”) (www.talarian.com)

Positioning Middleware

General structure of a distributed system as middleware.

1-22

From: van Steen/Tanenbaum

The Transparency Dogma

• Middleware is supposed to hide remote-ness and concurrency by hiding distribution behind local programming language constructs

Critique: Jim Waldo, SUN

Full transparency is impossible and the price is too high

Distribution Transparencies

• Access: mask differences in data representation and invocation mechanisms between heterogeneous systems

• Failure: mask failures to enable fault tolerance (e.g. Intelligent load-balancing)

• Location: use logical, not physical names to access services

• Migration: hide the true location of a service or object from clients. If the location changes, the client won‘t notice it.

Distribution Transparencies cont‘d

• Replication: hide a group of equal objects behind an interface (performance, availability)

• Persistence: hide the storage mechanisms and internal policies from a client. Make a remote object look like it is persistently activated.

• Transaction: hide the complex coordination necessary to achieve consistency.

Source: ISO/IEC 10746-1 Open Distributed Processing, www.iso.org

Where do we find Middleware?

Application ServerWeb-Tier

Application ServerEJBTier

LDAP or DCE

E-bank

News

Quotes

DistributedCache

WebServer

CORBA

RMIXML-RPC

WebServiceJMS

JDBC

Part of a Portal running on a Web Cluster.

Directory

JNDI

Classification• Socket Based Services• Remote Procedure Calls (RPCs)• Object Request Brokers (CORBA, RMI)• Message Oriented Middleware (MOMs) and

Event-Driven-Systems• Web-Services (XML-RPC, SOAP,UDDI) and

SOA• Component Systems (Enterprise Java Beans,

J2EE)• Peer-To-Peer (Napster, Gnutella, Freenet,

seti@home)• Agent based (Jini, Aglets)• Tuple-Spaces, distributed blackboards

The “ilities”

• Reliability

• Availability

• Security

• Scalability

• Quality

• Performance

• Maintainability

Before using a specific middleware, always make sure that the “ilities” aka non-functional requirements are met. Middleware almost always differs implementation quality between vendors.

Real-World Problems

• Skills/Understanding: Best practice patterns?

• Single-Point-Of-Failures: replication, load-balancing etc.

• Tooling: generators, deployment tools

• “Brittle-ness” if interfaces change (Compiler “illusion”)

RPC type Middleware

• E.g. Sun-RPC, OSF DCE, SOAP

• Main idea: distribute functions, use concurrent processing

• On top of it: Distributed Directory, File system, Security (cells, principals)

• XML-RPC over http (www.userland.com)

• Apache Thrift, jnb.ociweb.com/jnb/jnbJun2009.html

Layer foundations: UUIDs, value vs. reference, marshaling, versioning etc.

http://www.userland.com/

Distributed Objects

• Object Request Broker

• Multi-language support (platform independence)

• Interface Definition language

• Wire Protocoll: IIOP, GIOP

• Java only (e.g.Introspection used)

• Lightweight method call semantics

• Java Implementations

• Wire Protocoll now mostly: RMI over IIOP

Both try to preserve object semantics. Interface/Implementation separation

CORBA RMI

Distributed Components

• Objects are too granular: performance and maintenance problems

• Programmers need more help: separation of concerns and context

Solutions:

• Enterprise Java Beans

• CORBA Components

• COM+

• .NET

• Spring, Ruby on Rails etc.

Example: Enterprise Java Beans

Automatic Transaction Management

System Management defines Data Sources and Containers

EJB Framework (Separation of concerns):

Deployment (Separation of context):

Concurrency Control

Automatic, method level Security

System Management defines Pool sizes

System Management defines Role/User Binding

EJB Container

Client Entity Beaninvoke

delegate

At the point of interception the container provides the following services to the bean: Resource management, life-cycle, state-

management, transactions, security, persistence

Load/

persist

Distributed Messages (MOM)

Asynchronous, loosely-coupled (fault tolerant), persistent messages with either publish/subscribe (topics) or queuing semantics. Scales well. Delivery guarantees differ.

Pub

Sub

Sub

Sub

Topic

publish send

MOM

Pub Sub

Sub

M2

Put (M1,M2)

Get

M1

Get

queue

MOM

Distributed Code I (Agents, Aglets)

OS

AgentRuntime

Agent

Channel

OS

AgentRuntime

Agent

Serialized Agentpack unpack

Perform work, come back with results

The Problem: who wants a new runtime system?

Distributed Code II (Jini) – The End of Protocols?

Jini LookupService

Jini Client Jini Service

Service Proxy Code

Proxy moves to lookup service during registration

Proxy moves to client during service lookup

Service private protocol

Peer 2 Peer

INTERNET DNS

INTERNET DNSNodes have no

fixed IP address and frequent down-times

P2P uses cycles, provides file sharing and anonymity because no central servers are used

Seti@home, freenet JXTA etc.

ISP

ISP

ISP

Problems: How do you version files? Overhead?

WebServices

Use your “de-hyper” generously!

XML Syntax/HTTP

Universal Description, Discovery and Integration

Web Services Description Language

SOAP

Registry (advertise)

Service features

exchange messagesWire Format/ Transport

Promises de-coupling of service provider and requester, document interfaces, machine-to-machine communication and ease of use compared to distributed objects.

Service Granularity? Application, Component, Object or Request?

Web Server Broker

Security, Transactions etc. Core services

GRID Computing

The abstraction: single system image

different companies

A Grid providing OGSA

The problem: requires a massive amount of system resources

Grid computing promises the just-in-time availability of vast amounts of computing resources, easily accessible through a single system image. Scientific simulation or even game construction (www.butterfly.net ) are possible applications. See: http://www.globus.org/research/papers/anatomy.pdf, the anatomy to the grid.

Cloud Computing

Lots of additional software like deployment tools are becoming popular. Unsolved problems include security of data and platform lock-in.

Software as a Service (office apps, CRM apps)

Platform as a Service (google app engine)

Infrastructure as a Service (amazon EC2)

(Tuple) Spaces

The abstraction: Anything can be stored as long as it is addressable

A space providing tuple storage

users or agents storing or finding tuples

The worlds largest space is the WWW. Other spaces are WIKI-WIKI collaboration systems or more traditional tuple spaces like tspaces or jspaces. The principle is always the same: a few simple methods (put/take/find) which lets users or machines store or find content. The content itself is returned as a representation of a resource. That‘s why some people call those systems REST (Representational State Transfer Architecture), after a theses from Roy Fielding, the father of http.

users or agents interacting through the space

Others• Parallel Processing: PVM, MPI

(e.g. for Linux Beowulf cluster)• Wireless: mobile communication,

Bluetooth• System Management:

Jiro/FMA/JMX• Group Computing (virtual

synchrony): Horus, iBus, javagroups (www.javagroups.org , good for building distributed caches or HA infrastructures)

• Distribution Subsystem (DSS) middleware library

• Simjava (discrete event simulation)• Gridsim (grid simulation package)• Teatime (www.opencroquet.org)

• Internet Games

• Portal Architectures

• Java Communicating Sequential Processes (JCSP). A library implementing Hoares CSP.

• Pi-Calculus for mobility• Mozart/OZ http://www.mozart-

oz.org/

• E-language, http://www.erights.org/

• Erlang language for distributed telco systems with asynchr. message passing http://www.erlang.org/

• Cloud gaming

• Giga-Spaces

http://www.erlang.org/

Future Applications• Collective Intelligence: collaborative production of content• Mashups: dynamic integration of external sources• Social Networks Analysis: the use of information and knowledge from many

people and their personal networks. • Sensor Mesh Networks: ad hoc (self-organizing) networks formed by

dynamic meshes of peer nodes, each of which includes simple networking, computing and sensing capabilities. (Real-world-web)

• Event-driven Applications: an architectural style for distributed applications, in which certain discrete functions are packaged into modular, encapsulated, shareable components, some of which are triggered by the arrival of one or more event objects.

• Web 2.0 represents a broad collection of recent trends in Internet technologies and business models. Particular focus has been given to user-created content, lightweight technology, service-based access and shared revenue models. (technical base: AJAX, RubyOnRails etc)

Roughly taken from: Gartner's Emerging Technologies http://www.gartner.com/it/page.jsp?id=495475

Resources (Technologies)

• Jim Waldo, A note on distributed computing (please read till next session)

• www.highscalability.com on large scale distributed systems and technologies

• www.infoq.com on DS frameworks

• www.theserverside.com

on Enterprise Java Beans

• Clay Shirky, What is P2P and what Isn’t (www.openp2p.com)

• S.Tai, I.Rouvellou, Strategies for Integrating Messaging and Distributed Object Transactions

http://www.highscalability.com/

http://www.infoq.com/

Resources (Programming)

• Wolfgang Emmerich, Engineering Distributed Objects (www.distributed-objects.com) With slides and tests.

• Marco Boger, Java in verteilten Systemen

• Mastering Enterprise Java Beans (www.theserverside.com) free!

• Ted Neward, Java Server Side Programming (sockets, servlets etc.) www.manning.com/neward

• www.swarm.org, portal for swarm programming. Used also as simulation tools for research in economics and finance

Apache Thrift, http://jnb.ociweb.com/jnb/jnbJun2009.html

Overall Goals

• Learn the basic concepts of Distributed Systems like concurrency and remoteness

• Understand different programming models for Distributed Systems

• Acquire a theoretical foundation of computability in distributed systems

• Learn to design distributed systems with a focus on performance, availability, scalability and security

Resources (Systems)• Coulouris, e.al., Distributed Systems• Andrew Tanenbaum, Maarten van Steen, Distributed Systems.

Get this one or Coulouris for a long term effect . (http://ajax.prenhall.com/divisions/esm/app/author_tanenbaum/custom/dist_sys_1e/index.html (slides and book chapters)

• Ken Birman, Building secure and reliable Network Applications

• Grey/Reuter, Transaction Processing• Jiro/Federated Management Architecture (FMA)• Open Grid Service Architecture

http://www.gridforum.org/ogsi-wg/drafts/ogsa_draft2.9_2002-06-22.pdf , explains the services needed in the new GRID computing paradigm

Resources (Theory)

• Designing Distributed Systems, A Conversation with Ken Arnold, Part III, http://www.artima.com/intv/distribP.html , shows importance of failures and state in DS

• The Paradigm Shift from Algorithms to Interaction, Peter Wegner, 1996, a provocative short essay on why interactive systems are much more powerful than turing machines. Shows that DS is more than just concurrency and remoteness. The basics of emergence and non-algorithmic behavior. Good for agent systems as well.

• Phillip J. Windley, Digital Identity, Contains architecture of identity repositories including federation aspects. Network effects and its effects against bilateral identity management.

• Nancy A. Lynch, Distributed Algorithms (proofs and concepts)• M.Cavage, There's Just No Getting around It: You're Building a

Distributed System, ACM Queue, April. 2013, http://portal.acm.org/ft_gateway.cfm?id=2482856&type=pdf

Resources (Scale-Free)

• Stability and topology of scale-free networks under attack and defense strategies, Lazaros K. Gallos u.a. http://xxx.lanl.gov/pdf/cond-mat/0505201

• Albert-Laszlo Barabasi, Linked. Investigates small worlds, scale free networks etc. Basically moves from random networks to hub/spoke architectures. Discovered how the WWW space is organized (in/out/core/islands etc.). A must read for everybody interested in the effects of topology (e.g. on virus spreads)

Resources (Web)

• Tim Oreilly’s famous article on Web2.0: http://www.oreillynet.com/pub/a/oreilly/tim/news/2005/09/30/what-is-web-20.html

• Gartner's 2006 Emerging Technologies Hype Cycle Highlights Key Technology Themes http://www.gartner.com/it/page.jsp?id=495475

• Mashups: Duane Merrill, http://www-128.ibm.com/developerworks/library/x-mashups.html?ca=dnw-727

• L.A.Barroso, J Clidaras, U.Hölzle, The Datacenter as a Computer – An Introduction to the Design of Warehouse-Scale Machines, 2nd Edition 2013 (a google book) http://www.morganclaypool.com/doi/pdf/10.2200/S00516ED2V01Y201306CAC024

http://www-128.ibm.com/developerworks/library/x-mashups.html?ca=dnw-727

http://www-128.ibm.com/developerworks/library/x-mashups.html?ca=dnw-727

Resources (Events, Simulation)

• Simjava, discrete event simulation package. Tutorial at: http://www.dcs.ed.ac.uk/home/simjava/tutorial/

• GridSim, Grid Simulation Package, http://www.gridbus.org/gridsim/gridsim2.2/

Documents

Introduction to Distributed Systems Prof. Walter Kriha, fall term 2013/14 Hochschule der Medien