CS550
• Advanced Operating Systems (Distributed Operating Systems)
• Instructor: Xian-He Sun
– Email: [email protected], Phone: (312) 567-5260
– Office hours: 1:30pm-2:30pm Tuesday, Thursday at SB229C, or by appointment
X.Sun (IIT) CS550: Advanced OS Lecture 1 Page 1
• TA: TBA
– Email: [email protected]
– Office: xxx
– Office hours: TBA
• Blackboard:
– http://blackboard.iit.edu
• Class Web site
– http://www.cs.iit.edu/~sun/cs550.html
Outline
• Course information
• Key issues of distributed operating systems
• Hardware concepts
– Multiprocessors
– Multicomputers
– Distributed systems
• Software concepts
– Uniprocessor OS
– Distributed OS
– Network OS
– Middleware
What This Course is About
• Understanding the fundamental concepts of
distributed operating systems, and distributed
systems in general
• Learning distributed programming techniques
– Multithreading, RPC, RMI, Sockets, MPI, etc.
• Understanding the general principles of
distributed paradigms
– MPI, JINI, NFS, Web Service, Grid, etc.
Prerequisite
• CS450 “Operating Systems”
• Familiarity with
– Programming in C/C++ or Java
– UNIX tools and development environment
• Command
• Editors (vi, emacs), compilers (gcc), makefiles (GNU make)
– Network programming
• Sockets
• Multithreading
• RPC, Java RMI
– Basic concepts of computer architecture
Course Materials
• Required:
– “Distributed Systems: Principles and Paradigms (2nd edition)” by Tanenbaum and Van Steen, Pearson/Prentice Hall 2007
• Recommended:
– “Distributed Systems: Concepts and Design (4th edition)” by George Coulouris, Jean Dollimore, and Tim Kindberg, Addison-Wesley, 2005
• Supplemental readings
– “Virtual Machines: Versatile Platforms for Systems and Processes” by Jim Smith and Ravi Nair, Morgan Kaufmann, 2005
Misc. Course Details
• You are expected to attend all of the lectures and presentations
• Grading
– Written and programming assignments (35%): individual work
– One exam (35%)
– Final project (30%): individual or group with 2-3 students
• Use the course blackboard
– Announcements
– Lecture notes
– Assignments
– Discussion
– …
Policies
• Collaboration
– Encouraged for high-level concepts and
understanding the course materials
– but …..
• Cheating
– Copying all or part of another student's homework
– Allowing another student to copy all or part of your
homework
– Copying all or part of code found in a book,
magazine, the Internet, or other resource
Any Questions?
Personal Introduction
• Research interests
– Middleware
– High End Computing
– Performance Analysis and Modeling
• Research group:
– Scalable Computing Software Laboratory (SCS)
– http://www.cs.iit.edu/~scs/
– Weekly Research seminar
Distributed Computing at SCS
Many workstations are made available for graduate students
Scalable Computing Software (SCS) Lab.
[Figure: I-WIRE Distributed Optical Testbed (Grid); sites: NU-E, UIC, ANL, NCSA/UIUC, U of C, NU-C, Star Tap, IIT]
Parallel Computers at SCS
OMNI, I-WIRE
Pervasive Computing Environments at SCS
Scalable Systems
• A computer system is called scalable if it can scale up to accommodate ever-increasing performance and functionality demands
• Software is called scalable if it can maintain its functionality and efficiency while the underlying computer system and problem scale up
Evolution of Computing
Bigger becomes even bigger
Smaller becomes ever smaller, & connected
Japan’s Earth Simulator
• 640 processor nodes (PNs)
• Each PN is a system with 8 vector-type arithmetic processors (APs)
• Peak performance 40 Tflops
[Figure labels: 1.4m x 1m x 2m; approx. 50m x 65m x 17m]
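As a quick check of the Earth Simulator figures, the following sketch derives the total number of arithmetic processors and the implied peak rate per processor. It uses only the numbers on the slide (640 PNs, 8 APs per PN, 40 Tflops peak); the variable names are illustrative, and the per-AP figure is an approximation because the slide's peak value is rounded.

```python
# Figures taken from the slide; names are illustrative.
processor_nodes = 640      # PNs
aps_per_node = 8           # vector-type arithmetic processors per PN
peak_tflops = 40.0         # quoted (rounded) peak performance

total_aps = processor_nodes * aps_per_node
# Convert Tflops to Gflops, then divide across all APs.
gflops_per_ap = peak_tflops * 1000 / total_aps

print(f"Total APs: {total_aps}")
print(f"Implied peak per AP: {gflops_per_ap:.2f} Gflops")
```

The machine therefore aggregates several thousand vector processors, each contributing only a few Gflops: high performance comes from scale rather than from any single processor.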
Scalable Computing
The way to high performance
BlueGene/L
• The fastest supercomputer in November 2007
• Scalable ultra-computer targeted for 106,496 compute nodes
• BlueGene/L is now running with 212,992 processors and 478.2 TFlops on LINPACK
• Peak performance: 596 TFlops
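The BlueGene/L numbers allow a simple efficiency estimate: the ratio of sustained LINPACK performance to peak performance. The sketch below uses only the slide's figures (478.2 TFlops sustained, 596 TFlops peak, 212,992 processors); the variable names are illustrative.

```python
# Figures taken from the slide; names are illustrative.
linpack_tflops = 478.2     # sustained performance on LINPACK
peak_tflops = 596.0        # theoretical peak
processors = 212_992

# Fraction of peak actually achieved on the benchmark.
efficiency = linpack_tflops / peak_tflops
# Sustained rate per processor, in GFlops.
gflops_per_processor = linpack_tflops * 1000 / processors

print(f"LINPACK efficiency: {efficiency:.1%}")
print(f"Sustained per processor: {gflops_per_processor:.2f} GFlops")
```

An efficiency of roughly 80% on LINPACK is a best case; applications that stress memory or communication typically achieve a smaller fraction of peak.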
Multicore Adds Another Dimension
• Cell
– 1 PPE and 8 SPEs
– Shared L2 cache
– EIB
IBM Multicore
• Power6
– Dual core
– 5 GHz
Multi-Core
• Motivation for Multi-Core
– Exploits improved feature-size and density
– Increases functional units per chip (spatial efficiency)
– Limits energy consumption per operation
– Constrains growth in processor complexity
• Challenges resulting from multi-core
– Aggravates memory wall
• Memory bandwidth
– Way to get data out of memory banks
– Way to get data into multi-core processor array
• Memory latency
• Fragments L3 cache
– Relies on effective exploitation of multiple-thread parallelism
• Need for parallel computing model and parallel programming model
– Pins become strangle point
• Rate of pin growth projected to slow and flatten
• Rate of bandwidth per pin (pair) projected to grow slowly
– Requires mechanisms for efficient inter-processor coordination
• Synchronization
• Mutual exclusion
• Context switching
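To make the coordination mechanisms concrete, here is a minimal sketch of mutual exclusion and synchronization between threads, written with Python's standard threading module. The function name, thread count, and iteration count are hypothetical choices for illustration; the slide itself does not prescribe any particular API.

```python
import threading


def run_counter(num_threads=4, increments=10_000):
    """Increment a shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # Mutual exclusion: only one thread updates the counter at a time.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    # Synchronization: wait for every worker to finish before reading the result.
    for t in threads:
        t.join()
    return counter


total = run_counter()
print(total)
```

The lock serializes the read-modify-write on the shared counter, so no update is lost. That serialization is also the cost of coordination: the more often threads contend for the lock, the less parallel speedup remains.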
Distributed Computing: What Is New
• Supercomputers become ever more powerful
• Communities of “virtual organizations” (VOs) are formed
• No VO possesses all required skills and resources
• From “community sharing” to “information grid”
Integrated VOs: the Grid
Mimic the electrical power grid
• Higher Quality of Service
• Increased Efficiency
• Reduced Complexity & Cost
• Increased Productivity
• Improved Resiliency
The Challenge of Grid Computing
Virtualization and Resource Management
• Many sources of data, services, computation
• Registries organize services of interest to a community
• Data integration activities may require access to, & exploration/analysis of, data at many locations
• Exploration & analysis may involve complex, multi-step workflows
• Resource management is needed to ensure progress & arbitrate competing demands
• Security & policy must underlie access & management decisions
[Figure labels: R, RM, Discovery, Access, Security service, Policy service]
Cloud Computing
Mimic the electrical power grid
• Higher Quality of Service
• Increased Efficiency
• Reduced Complexity & Cost
• Increased Productivity
• Improved Resiliency
What is Cloud Computing?
• A computing paradigm in which tasks are assigned to a combination of connections, software and services accessed over a network
• The network of servers and connections is collectively known as the cloud
• Other terms
– Mesh Computing
– Elastic Cloud Computing
– Network Computing
2009-1-21 21
What Are the Differences Between Cloud & Grid?
A commercial version of Grid computing
More likely under single management (VO)
More likely to provide computing resources than resource sharing
More likely to serve lots of modest-size jobs
Billions of dollars are being spent by the likes of Amazon, Google, and Microsoft to create real commercial grids
The prospect of needing only a credit card to get on-demand access to 100,000+ computers in tens of data centers distributed throughout the world
Cloud: Integrated Resource
Provide virtual computing environments on demand
Virtualization and Virtual Machine
• Virtual service and virtual machine: the key for data
center and Cloud computing
• Virtual machine (in a distributed environment): a
hosting platform where each user can create and
operate in private machine(s), based on
Grid/distributed infrastructure, achieving:
– Virtualization
– Isolation and Protection
– Privacy
– Accountability and QoS
– On-demand creation and provisioning
Distributed (Dynamic) Virtual Machine
[Figure labels: DVM, DVM’, Virtual service node, DVM Host (physical)]
Virtualization: Key Technique
• Two-level OS structure
– Host OS
– Guest OS
• Strong isolation
– Administration isolation
– Installation isolation
– Fault / attack isolation
– Recovery, migration, and reconfiguration
• Virtual service node
– DVM Service (DVMS)
– Guest OS
– Internetworking enabled
[Figure labels: One DVM host; Host OS; Guest OS … Guest OS; DVMS1 … DVMSn]
Embedded Systems: What Is New
• Devices become smaller and more powerful
• Devices are coordinated via network
• From “autonomous computing” to coordinated “human-centered computing”
Pervasive Computing
MIT’s view of pervasive computing
Evolution of Computing
[Figure: Distributed Computing, Grid Computing, Mobile Computing, Pervasive Computing, Global Smart Space (Cloud). Labels: Remote Comm., Coordination, High Availability, Security & Fault Tolerance; Federated Communities, Virtualization, Standardization, Uneven Conditions; Mobility, Mobile Network, Adaptive & Reflective, Energy Aware system; Smart Space, Invisibility, Localized Scalability, Context Awareness]
Service Oriented Computing
Computing as a service
• Internet computing: Web service
• Grid computing: Grid service, which is merging with WS
• Pervasive computing: Human-centered service
• Mobile computing: Phone service
[Figure labels: Started far apart in applications & technology; Have been converging; WSRF; WS-I Compliant Technology Stack; Web service; Convergence of Core Technology Standards allows Common base for Business and Technology Services]
Future Computing: Human-centered Service
• Devices become smaller and more powerful
• A device is an entry to the cyber world
• They are connected to form ‘smart space’
• Grids link ‘smart spaces’ to support ‘global smartness’
• A new IT boom is coming
The Third Wave of Computing Revolutions
• Network, communication, and interconnectivity
• Began in the late 1990s and continues today
• Machine/machine, software/software,
people/people
• Anytime, anywhere, WWW
• The communications landscape is shifting
• Promising, but still a work in progress
How do we get there?
– Many Challenges ahead!
Resource Management & Task Scheduling
• DVM provider selection:
– Among a set of DVM providers, which one should be
chosen to host a DVM?
• DVMS selection:
– Among a set of potential tenants (DVMSes), which
ones to host? (for QoS, resource utilization, security…)
• The Grid Harvest Service (GHS) System
– A long-term application-level performance
prediction and task scheduling system for non-
dedicated distributed (Grid) environments
– Reservation-based versus shared resources
– Good, but more issues, such as QoS and fault tolerance (FT), remain to be solved
Processor-memory performance gap
• Processor performance increases rapidly
– Uni-processor: ~52% per year until 2004, ~25% since then
– New trend: multi-core/many-core architecture
• Intel TeraFlops chip, 2007
– Aggregate processor performance much higher
• Memory: ~9% per year
• Processor-memory speed gap keeps increasing
[Figure: performance (log scale, 1 to 100,000) versus year (1980 to 2010) for uni-processor, multi-core/many-core processor, and memory; annotated growth rates 52%, 25%, 20%, 60%, 9%. Sources: Intel, OCZ]
The Memory-wall problem requires a rethinking of the design of OS
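The widening gap follows from compounding two different growth rates. The sketch below assumes constant annual rates taken from the slide (~52% for uni-processor performance, ~9% for memory) and computes how much faster the processor becomes relative to memory after a given number of years. The function name and the constant-rate assumption are illustrative simplifications, not part of the original material.

```python
def relative_gap(years, cpu_rate=0.52, mem_rate=0.09):
    """Ratio of processor to memory performance growth after `years`,
    assuming both start equal and grow at constant annual rates."""
    return ((1 + cpu_rate) / (1 + mem_rate)) ** years


for years in (1, 5, 10):
    print(f"After {years:2d} years the gap has grown {relative_gap(years):.1f}x")
```

Under these assumptions the gap grows by about 39% each year, which compounds to roughly 28x over a decade. This is why the memory wall cannot be treated as a constant overhead.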
Moore’s Law
Gordon Moore: “the number of transistors that can be fabricated on a single integrated circuit at a reasonable cost doubles every year…”
• How?
– Material techniques such as extreme ultraviolet lithography (<100 nm)
• Corollary
– Processor speed doubles at same rate
• Problem
– Moore’s law has almost reached its limit
– Then, what is the impact? Who and which technology will lead?
Any Questions?