
Lecture 6 - Other Distributed Systems

CSE 490h – Introduction to Distributed Computing, Spring 2007

Except as otherwise noted, the content of this presentation is licensed under the Creative Commons Attribution 2.5 License.

Outline

DNS
BOINC
PlanetLab
OLPC & Ad-hoc Mesh Networks
Lecture content wrap-up

DNS: The Distributed System in the Distributed System

Domain Name System

Mnemonic identifiers work considerably better for humans than IP addresses

“www.google.com? Surely you mean 66.102.7.99!”

Who maintains the mappings from name → IP?

A Manageable Problem

© 2006 Computer History Museum. All rights reserved. www.computerhistory.org

In the beginning…

Every machine had a file named hosts.txt
Each line contained a name/IP mapping
New hosts files were updated and distributed via email

… This clearly wasn’t going to scale

DNS Implementations

Modern DNS system first proposed in 1983

First implementation in 1984 (Paul Mockapetris)

BIND (Berkeley Internet Name Domain) written by four Berkeley students in 1985.

Many other implementations today

Hierarchical Naming

DNS names are arranged in a hierarchy:

www.cs.washington.edu

Entries are either subdomains or hostnames
Subdomains contain more subdomains, or hosts (up to 127 levels deep!)
Hosts have individual IP addresses

Mechanics: Theory

The DNS recurser (client) parses the address from right to left

It asks a root server (with a known, static IP address) for the name server of the first subdomain

It contacts successive DNS servers until it finds the host
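The right-to-left walk can be sketched in a few lines of illustrative Python. This only computes the chain of zones a recurser would consult, not the actual network queries; the function name is hypothetical:

```python
def resolution_chain(hostname):
    """Return the zones a DNS recurser would consult, right to left,
    starting at the root and ending at the full hostname.
    Illustrative sketch only; real resolvers also handle caching,
    CNAMEs, and delegation details."""
    labels = hostname.rstrip(".").split(".")
    chain = ["."]  # resolution starts at a root server with a known IP
    for i in range(len(labels) - 1, -1, -1):
        chain.append(".".join(labels[i:]))
    return chain

print(resolution_chain("www.cs.washington.edu"))
# ['.', 'edu', 'washington.edu', 'cs.washington.edu', 'www.cs.washington.edu']
```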

Mechanics: In Practice

ISPs provide a DNS recurser for clients
DNS recursers cache lookups for a period of time after a request

Greatly speeds up retrieval of entries and reduces system load
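The caching idea can be sketched as a toy time-to-live cache. The class and the `resolve_fn` callback are hypothetical helpers, not the API of any real resolver:

```python
import time

class TTLCache:
    """Minimal sketch of recurser-style caching: answers are kept for a
    time-to-live and re-fetched only after they expire."""
    def __init__(self, ttl_seconds):
        self.ttl = ttl_seconds
        self.entries = {}  # name -> (ip, expiry timestamp)

    def lookup(self, name, resolve_fn):
        entry = self.entries.get(name)
        now = time.time()
        if entry and entry[1] > now:
            return entry[0]            # cache hit: no upstream query
        ip = resolve_fn(name)          # cache miss: do the full resolution
        self.entries[name] = (ip, now + self.ttl)
        return ip

cache = TTLCache(ttl_seconds=300)
calls = []
def fake_resolver(name):
    calls.append(name)
    return "66.102.7.99"

cache.lookup("www.google.com", fake_resolver)
cache.lookup("www.google.com", fake_resolver)
print(len(calls))  # only one upstream query despite two lookups
```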

BOINC

What is BOINC?

“Berkeley Open Infrastructure for Network Computing”

Platform for Internet-wide distributed applications

Volunteer computing infrastructure
Relies on many far-flung users volunteering spare CPU power

Some Facts

1,000,000+ active nodes
521 TFLOPS of computing power
20 active projects (SETI@Home, Folding@Home, Malaria Control…) and several more in development

(Current as of March 2007)

Comparison to MapReduce

Both are frameworks on which “useful” systems can be built
BOINC does not prescribe a particular programming style
Much more heterogeneous architecture
No formal aggregation step
Designed for much longer-running systems (months/years vs. minutes/hours)

Architecture

Central server runs LAMP architecture for web + database

End-users run client application with modules for actual computation

BitTorrent used to distribute data elements efficiently

System Features

Homogeneous redundancy
Work unit “trickling”
Locality scheduling
Distribution based on host parameters

Client software

Available as regular application, background “service”, or screensaver

Can be administered locally or LAN-administered via RPC

Can be configured to use only “low priority” cycles

Client/Task Interaction

Client software runs on a variety of operating systems, each with different IPC mechanisms

Uses shared memory message passing to transmit information from “manager” to actual tasks and vice versa

Why Participate?

Sense of accomplishment, community involvement, or scientific duty

Stress testing machines/networks
Potential for fame (if your computer “finds” an alien planet, you can name it!)
“Bragging rights” for computing more units (“BOINC Credits”)

Credit & Cobblestones

Work done is rewarded with “cobblestones”
100 cobblestones = 1 day of CPU time on a computer with performance equaling 1,000 double-precision floating-point MIPS (Whetstone) and 1,000 integer VAX MIPS (Dhrystone)

Computers are benchmarked by the BOINC system and receive credit appropriate to their machine
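The credit definition above can be turned into a small worked example. Averaging the two benchmark ratios is an assumption for illustration; BOINC's real crediting formula has varied across versions:

```python
def cobblestones(cpu_days, whetstone_mips, dhrystone_mips):
    """Sketch of the credit definition on this slide: 100 cobblestones
    per CPU-day on the reference machine (1,000 Whetstone MIPS and
    1,000 Dhrystone MIPS). Averaging the two benchmark ratios is an
    assumption, not BOINC's exact formula."""
    benchmark_factor = (whetstone_mips / 1000 + dhrystone_mips / 1000) / 2
    return 100 * cpu_days * benchmark_factor

print(cobblestones(1, 1000, 1000))  # the reference machine earns 100.0 per day
print(cobblestones(2, 2000, 2000))  # a machine twice as fast, for two days: 400.0
```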

Anti-Cheating Measures

Work units are computed redundantly by several different machines, and results are compared by the central server for consistency

Credit is awarded after the internal server validates the returned work units

Work units must be returned before a deadline
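The redundancy check can be sketched as a majority vote over returned results. All names here are illustrative, not BOINC's actual validator API:

```python
from collections import Counter

def validate(results, quorum=2):
    """Sketch of redundant-result validation: the same work unit runs on
    several machines, and credit is granted only for the answer that at
    least `quorum` machines agree on."""
    counts = Counter(results.values())
    answer, votes = counts.most_common(1)[0]
    if votes < quorum:
        return None, []                      # no consensus: reissue the unit
    credited = [host for host, r in results.items() if r == answer]
    return answer, credited

answer, credited = validate({"hostA": 42, "hostB": 42, "hostC": 7})
print(answer, sorted(credited))  # the cheating or faulty hostC gets no credit
```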

Conclusions

Versatile infrastructure
SETI tasks take a few hours
Climate simulation tasks take months
Network monitoring tasks are not CPU-bound at all!

Scales extremely well to internet-wide applications
Provides another flexible middleware layer to base distributed applications on
Volunteer computing comes with additional considerations (rewards, cheating)

PlanetLab

What if you wanted to:

Test a new version of BitTorrent that might generate GBs and GBs of data?

Design a new distributed hashtable algorithm for thousands of nodes?

Create a gigantic caching structure that mirrored web pages in several sites across the USA?

Problem Similarities

Each of these problems requires:
Hundreds or thousands of servers
Geographic distribution
An isolated network for testing and controlled experiments

Developing one-off systems to support these would be:
Costly
Redundant

PlanetLab

A multi-university effort to build a network for large-scale simulation, testing, and research

“Simulate the Internet”

Usage Stats

Servers: 722+
Slices: 600+
Users: 2,500+
Bytes per day: 3–4 TB
IP flows per day: 190M
Unique IP addrs per day: 1M

As of Fall 2006

Project Goals

Supports short- and long-term research goals

System put up “as fast as possible”
PlanetLab’s design evolves over time to meet changing needs
PlanetLab is a process, not a result

Simultaneous Research

Projects must be isolated from one another
Code from several researchers:
Untrustworthy? Possibly buggy? Intellectual property issues?

Time-sensitive experiments must not interfere with one another

Must provide realistic workload simulations

Architecture

Built on Linux, ssh, and other standard tools
Provides a “normal” environment for application development
Hosted at multiple universities with separate admins
Requires trust relationships with respect to previous goals

Architecture (cont.)

Network is divided into “slices” – server pools created out of virtual machines

Trusted intermediary “PLC” system grants access to network resources
Allows universities to specify who can use slices at each site
Distributed trust relationships
Central system control
Federated control

Resource allocation:
PLC authenticates users and understands relationships between principals; issues tickets
SHARP system at each site validates the ticket and returns a lease

(Diagram: the user sends a request to the PLC and receives a ticket; the user presents the ticket to the slice and receives a lease)
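The ticket-and-lease handshake can be modeled as a toy sketch. All class names, fields, and methods here are illustrative inventions, not PlanetLab's real interfaces:

```python
class PLC:
    """Toy model of the flow above: the central PLC authenticates a user
    and issues a ticket; the site's SHARP agent validates the ticket and
    returns a lease."""
    def __init__(self, known_users):
        self.known_users = set(known_users)
        self.issued = set()

    def request_ticket(self, user, slice_name):
        if user not in self.known_users:
            raise PermissionError("unknown principal")
        ticket = (user, slice_name)
        self.issued.add(ticket)
        return ticket

class SharpAgent:
    def __init__(self, plc, site):
        self.plc, self.site = plc, site

    def redeem(self, ticket):
        # The site trusts the PLC's decision, not the user directly
        if ticket not in self.plc.issued:
            raise PermissionError("invalid ticket")
        user, slice_name = ticket
        return {"user": user, "slice": slice_name, "site": self.site}

plc = PLC(known_users={"alice"})
ticket = plc.request_ticket("alice", "coralcache")
lease = SharpAgent(plc, site="uw").redeem(ticket)
print(lease["slice"])  # coralcache
```

The point of the split is federation: the central PLC vouches for principals, while each site keeps the final say over its own machines by validating tickets locally.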

User Verification

Public-key cryptography used to sign modules entered into PlanetLab
X.509 + SSL keys are used by PLC + slices to verify user authenticity
Keys distributed “out of band” ahead of time

Final Thoughts

Large system with complex relationships
Currently upgrading to version 4.0
New systems (GENI) are being proposed
Still provides lots of resources to researchers
CoralCache and several other projects run on PlanetLab

OLPC

“They want to deliver vast amounts of information over the Internet. And again, the Internet is not something you just dump something on. It's not a big truck. It's a series of tubes.”

The Internet is a series of tubes

The Internet is composed of a lot of infrastructure:
Clients and servers
Routers and switches
Fiber optic trunk lines, telephone lines, tubes and trucks

And if we map the density of this infrastructure…

… it probably looks something like this

Photo: cmu.edu

How do we distribute knowledge when there are no tubes?
What if we wanted to share a book?
Pass it along, door-to-door.
What if we wanted to share 10,000 books?
Build a community library.
How about 10 million books? Or 300 copies of one book?
A very large library?

Solutions

We need to build infrastructure to make large-scale distribution easy (i.e., computers and networking equipment)

We need to be cheap
Most of those dark spots don’t have much money

We need reliability where reliable power is costly
Again, did you notice that there weren’t so many lights? It’s because there’s no electricity!

The traditional solution: a shared computer with Internet

India:
75% of people live in rural villages
90% of phones are in urban areas

Many villagers share a single phone, usually located in the town post office

Likewise, villages typically share a few computers, located at the school (or somewhere with reliable power)

What’s the downside to this model?
It might provide shared access to a lot of information, but it doesn’t solve the “300 copies of a book” case

The distributed solution: the XO

AKA: Children’s Machine, OLPC, $100 laptop
A cheap (~$150) laptop designed for children in developing countries

OLPC = One Laptop Per Child

Photo: laptop.org

XO design

Low power consumption
No moving parts (flash memory, passive cooling)
Dual-mode display
In color, the XO consumes 2–3 watts
In high-contrast monochrome, less than 1 watt
Can be human powered by a foot pedal
Rugged, child-friendly design
Low material costs
Open-source software

XO networking

The XO utilizes far-reaching, low-power wireless networking to create ad-hoc mesh networks
If any single XO is connected to the Internet, other nearby computers can share the connection in a peer-to-peer scheme

Networks can theoretically sprawl as far as ten miles, even connecting nearby villages
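The connection-sharing idea reduces to a reachability question: an XO is online if any path through the mesh reaches a laptop with a direct Internet link. A plain breadth-first search stands in here for real mesh routing; the function and data layout are illustrative assumptions:

```python
from collections import deque

def has_internet(mesh, start, gateways):
    """Sketch of ad-hoc connection sharing: `mesh` maps each laptop to
    its radio neighbors; `gateways` are laptops with a direct Internet
    connection. BFS checks whether any gateway is reachable."""
    seen, frontier = {start}, deque([start])
    while frontier:
        node = frontier.popleft()
        if node in gateways:
            return True
        for neighbor in mesh.get(node, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append(neighbor)
    return False

mesh = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
print(has_internet(mesh, "a", gateways={"c"}))  # True: a -> b -> c
```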

XO storage and sharing

The XO relies on the network for content and collaboration
Content is stored on a central server:
Textbooks
Cached websites (Wikipedia)
User content

Software makes it easy to see other users on the network and share content

XO distribution

XO must be purchased in orders of 1 million units by governments in developing nations (economies of scale help to lower costs)

Governments are responsible for distribution of laptops

Laptops are only for children, designed solely as a tool for learning

XO downfalls

Distribution downfalls
What about children in developed nations?

Sell to developed markets at a higher price to subsidize costs for developing nations.

Can governments effectively distribute? What about black markets?

OLPC could perhaps partner with local schools and other NGOs to aid in distribution, training and maintenance

Too expensive?
Some nations can only afford as much as $20 per child per year. How can we cater to them?

What can the XO achieve?

Today, only 16 percent of the world’s population is estimated to have access to the Internet

Develop new markets

Microcredit

Make small loans to the impoverished without requiring collateral

Muhammad Yunus and the Grameen Bank won the 2006 Nobel Peace Prize for their work here

The power of the village economy
As millions of users come online in developing nations, there will be many new opportunities for commerce.
Helps those in developing nations to advance their economies and develop stronger economic models

Why give the XO to children?

UN Millennium Development Goal #2: “achieve universal primary education”

Empower children to think and compete in a global space
Children are a nation’s greatest resource
Backed by a bolstered economy, they will grow to solve other issues (infrastructure, poverty, famine)

The Course Again (in 5 minutes)

So what did we see in this class?
Moore’s law is starting to fail
More computing power means more machines
This means breaking problems into sub-problems
Sub-problems cannot interfere with or depend on one another
Have to “play nice” with shared memory

MapReduce

MapReduce is one paradigm for breaking problems up
Makes the “playing nice” easy by enforcing a decoupled programming model
Handles lots of the behind-the-scenes work
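The decoupled model can be sketched as a toy word count, the classic MapReduce example. This is a single-machine illustration of the programming style only, not the framework itself:

```python
from collections import defaultdict

def map_phase(document):
    # Map: emit (word, 1) pairs with no shared state between calls,
    # so map tasks cannot interfere with one another
    return [(word, 1) for word in document.split()]

def reduce_phase(pairs):
    # Shuffle + reduce: group pairs by key, then sum each group's values
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return {key: sum(values) for key, values in groups.items()}

pairs = []
for doc in ["the quick fox", "the lazy dog"]:
    pairs.extend(map_phase(doc))
print(reduce_phase(pairs)["the"])  # 2
```

The "behind-the-scenes work" the framework handles is everything this sketch omits: partitioning input, scheduling tasks across machines, shuffling pairs to reducers, and recovering from failures.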

Distributed Systems & Networks

The network is a fundamental part of a distributed system
Have to plan for bandwidth, latency, etc.

We’d like to think of the network as an abstraction
Sockets = pipes
RPC looks like a normal procedure call, handles tricky stuff under the hood
Still have to plan for failures of all kinds

Distributed Filesystems

The network allows us to make data available across many machines
Network file systems can hook into existing infrastructure
Specialized file systems (like GFS) can offer better performance with a loss of generality
Raises issues of concurrency, process isolation, and how to combat stale data

And finally…

There are lots of distributed systems out there

MapReduce, BOINC, MPI, several other architectures, styles, problems to solve

You might be designing an important one soon yourself!