Advanced Operating Systems
Lecture 3: OS design
By: Dr. Nasser Yazdani
University of Tehran, Dept. of EE and Computer Engineering
Univ. of Tehran Distributed Operating Systems


How to design an OS
- Some general guides and experiences.
- References
  - “The Computer for the 21st Century”, Mark Weiser
  - “Exokernel: An Operating System Architecture for Application-Level Resource Management”, Dawson R. Engler, M. Frans Kaashoek, et al.
  - “On Micro-Kernel Constructions”

Outline
- New applications/requirements
- Organizing operating systems
- Some microkernel examples
- Object-oriented organizations
  - Spring
- Organization for multiprocessors

New vision
- Two important problems: location and scale.
- Ubiquitous computing: tiny kernels of functionality
- Virtual Reality
- Mobility
- Intelligent devices
- Distributed computing: make networks appear like disks, memory, or other nonnetworked devices.

Ubiquitous computing
- Transparent computing is the ultimate goal
- Computers should disappear into the background
- Computation becomes part of the environment
- Computing everywhere
  - Desktop, Laptop, Palmtop
  - Cars, Cell phones
  - Shoes, Clothing, Walls (paper / paint)
- Connectivity everywhere
  - Broadband
  - Wireless
- Mobile everywhere
  - Users move around
  - Disposable devices

Ubiquitous Computing
- Structure
  - Resource and service discovery critical
  - User location an issue
  - Interface discovery
  - Disconnected operation
  - Ad-hoc organization
- Security
  - Small devices with limited power
  - Intermittent connectivity
- Agents
- Sensor Networks

Grid Computing
- Federated system
  - No single controlling authority
- Scheduling
  - Processors, bandwidth and other resources
- Policy is an important issue
  - Reliability, security, who can use what, and what one is willing to use.
- Systems
  - Globus toolkit
  - Condor
  - Related but not grid: CORBA, DCOM, DCE
- Applications
  - Distributed supercomputing

Peer-to-Peer Computing
- Locating
- Cooperative elements
- Scalability
- OS support
- Security
- Policies

P2P File Sharing Issues
- Naming
- Data discovery
- Availability
- Security
  - Encryption
- Fault tolerance
  - Conflict resolution
  - Replication

Other Peer to Peer Technologies
- Ad-hoc networking
  - Untrusted nodes used to relay messages
  - Multiple routes (distributed and replicated)
  - Extends range, reduces power, increases aggregate bandwidth.
  - Increases latency, management more difficult.
- Sensor networks
  - An application of ad-hoc networking
  - Add processing/reduction in the network

What is the big deal?
- Performance
  - Border crossings are expensive
    - Change in locality
    - Copying between user and kernel buffers
- Application requirements differ in terms of resource management

Operating System Organization
- What is the best way to design an operating system?
- Put another way, what are the important software characteristics of an OS?
- What should go in the OS kernel and what in applications, i.e., how should functionality be partitioned? Is there a minimal set for the kernel?

Important OS Software Characteristics
- Correctness and simplicity
- Power and completeness
- Performance
- Extensibility and portability
- Flexibility
- Scalability
- Suitability for distributed and parallel systems
- Compatibility with existing systems
- Security and fault tolerance

Common OS Organizations
- Monolithic
- Virtual machine
- Layered designs
- Kernel designs
- Microkernels
- Object-Oriented
- Note that individual OS components can be organized these ways
- Trade off between generality and specialization

What are we shooting for?
- OS should be thin (like a microkernel), providing only mechanisms, not embodying policies (i.e. management)
- Fine-grain access to system resources while avoiding border crossings as much as possible (like DOS)
- Allow flexible extensions for management of resources (like a microkernel) without sacrificing safety (like a monolithic kernel)

Monolithic OS Design
- Build OS as single combined module
  - Hopefully using data abstraction, compartmentalized function, etc.
- OS lives in its own, single address space
- Examples
  - DOS
  - early Unix systems
  - most VFS file systems

Pros/Cons of Monolithic OS Organization
+ Highly adaptable (at first . . .)
+ Little planning required
+ Potentially good performance
– Hard to extend and change
– Eventually becomes extremely complex
– Eventually performance becomes poor
– Highly prone to bugs

Virtual Machine Organizations
- A base operating system provides services in a very generic way
- One or more other operating systems live on top of the base system
  - Using the services it provides
  - To offer different views of system to users
- Examples: IBM’s VM/370, the Java interpreter

Pros/Cons of Virtual Machine Organizations
+ Allows multiple OS personalities on a single machine
+ Good OS development environment
+ Can provide good portability of applications
– Significant performance problems
  – Especially if more than 2 layers
– Lacking in flexibility

Old idea
- VM 370
  - Virtualization for binary support for legacy apps
- Why resurgence today?
  - Companies want a share of everybody’s pie
  - IBM zSeries “mainframes” support virtualization for server consolidation
    - Enables billing and performance isolation while hosting several customers
  - Microsoft has announced virtualization plans to allow easy upgrades and hosting Linux!
- You can see the dots connecting up
  - From extensibility (a la SPIN) to virtualization

Possible virtualization approaches
- Standard OS (such as Linux, Windows)
  - Meta services (such as grid) for users to install files and run processes
  - Administration, accountability, and performance isolation become hard
- Retrofit performance isolation into OSs
  - Linux/RK, QLinux, SILK
  - Accounting resource usage correctly can be an issue unless done at the lowest level (e.g. Exokernel)
- Xen approach
  - Multiplex physical resource at OS granularity

Full virtualization
- Virtual hardware identical to real one
  - Relies on hosted OS trapping to the VMM for privileged instructions
- Pros: run unmodified OS binary on top
- Cons:
  - Supervisor instructions can fail silently in some hardware platforms (e.g. x86)
    - Solution in VMware: dynamically rewrite portions of the hosted OS to insert traps
  - Need for hosted OS to see real resources: real time, page coloring tricks for optimizing performance, etc.
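The trap-and-emulate idea on this slide, and the silent-failure problem that motivates binary rewriting, can be illustrated with a toy model. Everything below is a hypothetical sketch: `ToyCPU`, `VMM`, `rewrite`, and the `vmtrap` pseudo-instruction are invented for illustration and are not VMware's or any real hypervisor's interface. The only hardware behaviour assumed is the one the slide names: some sensitive instructions do nothing, and do not trap, when run deprivileged.

```python
class Trap(Exception):
    """Raised by the toy CPU when deprivileged code hits a trapping instruction."""

    def __init__(self, instr, operand=None):
        super().__init__(instr)
        self.instr = instr
        self.operand = operand


class ToyCPU:
    """Hypothetical CPU model: the guest kernel always runs deprivileged."""

    TRAPPING = {"cli", "sti", "vmtrap"}  # privileged: these trap to the VMM
    SILENT = {"popf"}                    # sensitive, but silently ignored

    def execute_deprivileged(self, instr, operand=None):
        if instr in self.TRAPPING:
            raise Trap(instr, operand)
        if instr in self.SILENT:
            return  # no trap and no effect: the VMM never sees it
        # Any other instruction is harmless in this model.


class VMM:
    def __init__(self, cpu):
        self.cpu = cpu
        self.virtual_if = True  # the guest's virtual interrupt-enable flag

    def run(self, guest_code):
        for instr, operand in guest_code:
            try:
                self.cpu.execute_deprivileged(instr, operand)
            except Trap as trap:
                self.emulate(trap.instr, trap.operand)

    def emulate(self, instr, operand):
        if instr == "vmtrap":  # inserted by rewrite(): unwrap the original
            instr, operand = operand
        if instr == "cli":
            self.virtual_if = False
        elif instr == "sti":
            self.virtual_if = True
        elif instr == "popf":
            self.virtual_if = bool(operand)


def rewrite(guest_code, silent=ToyCPU.SILENT):
    """Replace silently failing instructions with explicit traps."""
    return [("vmtrap", (i, op)) if i in silent else (i, op)
            for i, op in guest_code]


guest = [("cli", None), ("add", None), ("popf", True)]

plain = VMM(ToyCPU())
plain.run(guest)             # popf is lost: the virtual flag stays cleared

patched = VMM(ToyCPU())
patched.run(rewrite(guest))  # popf now reaches the VMM and is emulated
```

Without rewriting, the guest believes it re-enabled interrupts while the VMM's virtual flag still says otherwise; after rewriting, every sensitive instruction reaches the VMM.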

Xen principles
- Support for unmodified application binaries
- Support for multi-application OS
  - Complex server configuration within a single OS instance
- Paravirtualization for strong resource isolation on uncooperative hardware (x86)
- Paravirtualization to enable optimizing guest OS performance and correctness

Xen: VM management
- What would make VM virtualization easy
  - Software TLB
  - Tagged TLB => no TLB flush on context switch
  - x86 does not have either
- Xen approach
  - Guest OS responsible for allocating and managing hardware page tables (PT)
  - Xen occupies the top 64MB of every address space. Why?
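The "tagged TLB => no TLB flush" point can be made concrete with a small simulation. This is a hedged sketch, not a model of any real MMU: the `TLB` class, the address-space identifiers, and the page tables are all made up for illustration. It counts misses when two address spaces alternate, once with an untagged TLB that must be flushed on every switch and once with a tagged one.

```python
class TLB:
    """Toy translation cache; `tagged` controls whether entries carry an ASID."""

    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}   # (asid, virtual page) -> physical frame
        self.asid = 0       # current address-space identifier
        self.misses = 0

    def context_switch(self, asid):
        if not self.tagged:
            self.entries.clear()  # untagged entries would be ambiguous: flush
        self.asid = asid

    def translate(self, vpn, page_table):
        key = (self.asid, vpn)
        if key not in self.entries:
            self.misses += 1                    # walk the page table
            self.entries[key] = page_table[vpn]
        return self.entries[key]


def run(tlb, page_tables, rounds=3):
    """Alternate between address spaces, touching every page each time."""
    for _ in range(rounds):
        for asid, table in page_tables.items():
            tlb.context_switch(asid)
            for vpn in table:
                tlb.translate(vpn, table)
    return tlb.misses


page_tables = {
    1: {0: 10, 1: 11, 2: 12, 3: 13},
    2: {0: 20, 1: 21, 2: 22, 3: 23},
}
untagged_misses = run(TLB(tagged=False), page_tables)
tagged_misses = run(TLB(tagged=True), page_tables)
print(untagged_misses, tagged_misses)
```

In this toy workload the untagged TLB misses on every access after each switch, while the tagged TLB only takes the initial cold misses. That is the cost a design on untagged hardware has reason to avoid paying on every transition.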

Layered OS Design
- Design tiny innermost layer of software
- Next layer out provides more functionality
  - Using services provided by inner layer
- Continue adding layers until all functionality required has been provided
- Examples
  - Multics
  - Fluke
  - layered file systems and comm. protocols

Pros/Cons of Layered Organization
+ More structured and extensible
+ Easy model and development
– Performance: Layer crossing can be expensive
– In some cases, unnecessary layers, duplicated functionality.

Kernel OS Designs
- Similar to layers, but only two OS layers
  - Kernel OS services
  - Non-kernel OS services
- Move certain functionality outside kernel
  - file systems, libraries
- Unlike virtual machines, kernel doesn’t stand alone
- Examples: Most modern Unix systems

Pros/Cons of Kernel OS Organization
+ Many advantages of layering, without disadvantage of too many layers
+ Easier to demonstrate correctness
– Not as general as layering
– Offers no organizing principle for other parts of OS, user services
– Kernels tend to grow to monoliths

Object-Oriented OS Design
- Design internals of OS as set of privileged objects, using OO methods
- Sometimes extended into application space
- Tends to lead to client/server style of computing
- Examples
  - Mach (internally)
  - Spring (totally)

Object-Oriented Organizations
- Object-oriented organization is increasingly popular
- Well suited to OS development, in some ways
  - OSes manage important data structures
  - OSes are modularizable
  - Strong interfaces are good in OSes

Object-Orientation and Extensibility
- One of the main advantages of object-oriented programming is extensibility
- Operating systems increasingly need extensibility
- So, again, object-oriented techniques are a good match for operating system design

How object-oriented should an OS be?
- Many OSes have been built with object-oriented techniques
  - E.g., Mach and Windows NT
- But most of them leave object orientation at the microkernel boundary
  - No attempt to force object orientation on out-of-kernel modules

Pros/Cons of Object Oriented OS Organization
+ Offers organizational model for entire system
+ Easily divides system into pieces
+ Good hooks for security
– Can be a limiting model
– Must watch for performance problems
– Not widely used yet

Microkernel OS Design
- Like kernels, but with fewer abstractions exported (threads, address space, communication channel)
- Try to include only small set of required services in the microkernel
- Moves even more out of innermost OS part
  - Like parts of VM, IPC, paging, etc.
- System services (e.g. VM manager) implemented as servers on top
- High communication overhead between services implemented at user level and the microkernel limits extensibility in practice
- Examples: Mach, Amoeba, Plan 9, Windows NT, Chorus, Spring, etc.

Pros/Cons of Microkernel Organization
+ Those of kernels, plus:
+ Minimizes code for most important OS services
+ Offers model for entire system
– Microkernels tend to grow into kernels
– Requires very careful initial design choices
– Serious danger of bad performance

Organizing the Total System
- In microkernel organizations, much of the OS is outside the microkernel
- But that doesn’t answer the question of how the system as a whole gets organized
- How do you fit together the components to build an integrated system?
  - While maintaining all the advantages of the microkernel

Some Important Microkernel Designs
- Micro-ness is in the eye of the beholder
  - Mach
  - Spring
  - Amoeba
  - Plan 9
  - Windows NT

Mach
- Mach didn’t start life as a microkernel
  - Became one in Mach 3.0
- Object-oriented internally
  - Doesn’t force OO at higher levels
- Microkernel focus is on communications facilities
- Much concern with parallel/distributed systems

Mach Model
[Figure: user processes in user space run on a software emulation layer (4.3BSD, SysV, HP/UX, and other emulators); the microkernel runs in kernel space.]

What’s In the Mach Microkernel?
- Tasks & Threads
- Ports and Port Sets
- Messages
- Memory Objects
- Device Support
- Multiprocessor/Distributed Support

Mach Tasks
- An execution environment providing basic unit of resource allocation
- Contains
  - Virtual address space
  - Port set
  - One or more threads

Mach Task Model
[Figure: Mach task model. Labels: process (address space, thread); process port, bootstrap port, exception port, registered ports; user space / kernel boundary.]

Mach Threads
- Basic unit of Mach execution
- Runs in context of one task
- All threads in one task share its resources
- Unix process similar to Mach task with single thread

Task and Thread Scheduling
- Very flexible
- Controllable by kernel or user-level programs
- Threads of single task can execute in parallel
  - On single processor
  - Multiple processors
- User-level scheduling can extend to multiprocessor scheduling

Mach Ports
- Basic Mach object reference mechanism
  - Kernel-protected communication channel
- Tasks communicate by sending messages to ports
  - Threads in receiving tasks pull messages off a queue
- Ports are location independent
- Port queues protected by kernel; bounded

Port Rights
- Mechanism by which tasks control who may talk to their ports
- Kernel prevents messages being sent to a port unless the sender has its port rights
- Port rights also control which single task receives on a port
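A minimal sketch of how a kernel might enforce port rights on a bounded, kernel-protected queue. This is not the Mach API: `ToyKernel`, `Port`, and their methods are hypothetical names, tasks are plain strings, and two simplifications are assumed (only the receiver hands out send rights, and a full queue rejects the message where a real kernel could block the sender).

```python
from collections import deque


class Port:
    def __init__(self, receiver, capacity):
        self.receiver = receiver  # the single task holding the receive right
        self.capacity = capacity  # port queues are bounded
        self.queue = deque()


class ToyKernel:
    def __init__(self):
        self.send_rights = {}     # task name -> set of ports it may send to

    def allocate_port(self, task, capacity=2):
        port = Port(task, capacity)
        self.send_rights.setdefault(task, set()).add(port)
        return port

    def grant_send_right(self, owner, port, to_task):
        # Simplification: only the receiving task may grant send rights.
        if port.receiver != owner:
            raise PermissionError("only the receiver may grant rights here")
        self.send_rights.setdefault(to_task, set()).add(port)

    def send(self, task, port, message):
        if port not in self.send_rights.get(task, set()):
            raise PermissionError(f"{task} holds no send right for this port")
        if len(port.queue) >= port.capacity:
            return False          # queue full; a real kernel could block instead
        port.queue.append((task, message))
        return True

    def receive(self, task, port):
        if port.receiver != task:
            raise PermissionError(f"{task} does not hold the receive right")
        return port.queue.popleft() if port.queue else None


kernel = ToyKernel()
port = kernel.allocate_port("file_server")
kernel.grant_send_right("file_server", port, "client")
kernel.send("client", port, "open /etc/motd")
```

A task that was never granted a send right gets a `PermissionError` from `send`, and only `file_server` can dequeue with `receive`.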

Port Sets
- A group of ports sharing a common message queue
- A thread can receive messages from a port set
  - Thus servicing multiple ports
- Messages are tagged with the actual port
- A port can be a member of at most one port set
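The port-set rules above (one shared queue, messages tagged with the originating port, membership in at most one set) fit in a few lines. The classes and functions below are illustrative assumptions, not Mach's interface.

```python
from collections import deque


class PortSet:
    def __init__(self):
        self.queue = deque()   # shared by every member port
        self.members = set()


class Port:
    def __init__(self, name):
        self.name = name
        self.port_set = None   # at most one set at a time
        self.queue = deque()   # used only while the port is in no set


def add_to_set(port, port_set):
    if port.port_set is not None:
        raise ValueError(f"{port.name} already belongs to a port set")
    port.port_set = port_set
    port_set.members.add(port)
    while port.queue:          # move pending messages into the shared queue
        port_set.queue.append((port.name, port.queue.popleft()))


def send(port, message):
    if port.port_set is not None:
        # Tag the message so the receiver knows which port it came in on.
        port.port_set.queue.append((port.name, message))
    else:
        port.queue.append(message)


def receive_from_set(port_set):
    """One receive call services every member port."""
    return port_set.queue.popleft() if port_set.queue else None


requests, control = Port("requests"), Port("control")
service = PortSet()
add_to_set(requests, service)
add_to_set(control, service)
send(requests, "read block 7")
send(control, "shutdown")
first = receive_from_set(service)   # ("requests", "read block 7")
```

A single server thread can loop on `receive_from_set` and dispatch on the tag instead of polling each port separately.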

Mach Messages
- Typed collection of data objects
  - Unlimited size
- Sent to particular port
- May contain actual data or pointer to data
- Port rights may be passed in a message
- Kernel inspects messages for particular data types (like port rights)

Mach Memory Objects
- A source of memory accessible by tasks
- May be managed by user-mode external memory manager
  - e.g., a file managed by a file server
- Accessed by messages through a port
- Kernel manages physical memory as cache of contents of memory objects
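To illustrate "physical memory as a cache of memory-object contents", here is a rough sketch in which a kernel-side cache faults pages in from a user-level pager and writes dirty pages back on eviction. The names `FilePager`, `KernelCache`, `data_request`, and `data_write` are hypothetical, only loosely modelled on the idea of an external memory manager. The direct method calls stand in for messages sent through a port, and the LRU eviction is an assumption of this sketch.

```python
from collections import OrderedDict


class FilePager:
    """Hypothetical user-level memory manager backed by a dict of pages."""

    def __init__(self, pages):
        self.pages = dict(pages)
        self.requests = 0

    def data_request(self, page_no):
        self.requests += 1
        return self.pages.get(page_no, b"\0")

    def data_write(self, page_no, data):
        self.pages[page_no] = data


class KernelCache:
    """Physical frames acting as a cache over one memory object."""

    def __init__(self, pager, frames):
        self.pager = pager
        self.frames = frames
        self.resident = OrderedDict()   # page_no -> [data, dirty], LRU first

    def _fault_in(self, page_no):
        if page_no in self.resident:
            self.resident.move_to_end(page_no)
        else:
            if len(self.resident) >= self.frames:
                victim, (data, dirty) = self.resident.popitem(last=False)
                if dirty:               # only dirty pages go back to the pager
                    self.pager.data_write(victim, data)
            self.resident[page_no] = [self.pager.data_request(page_no), False]
        return self.resident[page_no]

    def read(self, page_no):
        return self._fault_in(page_no)[0]

    def write(self, page_no, data):
        entry = self._fault_in(page_no)
        entry[0], entry[1] = data, True


pager = FilePager({0: b"a", 1: b"b", 2: b"c"})
cache = KernelCache(pager, frames=2)
cache.read(0)
cache.write(1, b"B")   # dirty in the cache, pager still holds b"b"
cache.read(2)          # evicts clean page 0
cache.read(0)          # evicts dirty page 1, which is written back
```

The point of the split is that the kernel decides which frames to reuse, while the pager decides where the contents come from and go to.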

Mach Device Support
- Devices represented by ports
- Messages control the device and its data transfer
- Actual device driver outside the kernel in an external object

Mach Multiprocessor and DS Support
- Messages and ports can extend across processor/machine boundaries
  - Location transparent entities
- Kernel manages distributed hardware
  - Per-processor data structures, but also structures shared across the processors
  - Intermachine messages handled by a server that knows about network details

Mach’s NetMsgServer
- User-level capability-based networking daemon
- Handles naming and transport for messages
- Provides world-wide name service for ports
- Messages sent to off-node ports go through this server
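The forwarding role described above can be sketched as follows. This is a toy model under stated assumptions, not the real NetMsgServer: the `Network` object stands in for both the wire and the global name service, ports are identified by plain strings, and there are no capabilities or security checks.

```python
class Network:
    """Stands in for the wire plus the global port-name service."""

    def __init__(self):
        self.home_of = {}   # global port name -> name of the node hosting it
        self.servers = {}   # node name -> that node's NetMsgServer


class NetMsgServer:
    """User-level daemon: names ports globally and transports messages."""

    def __init__(self, node, network):
        self.node = node
        self.network = network
        self.forwarded = 0
        network.servers[node.name] = self

    def register(self, port_name):
        self.network.home_of[port_name] = self.node.name

    def forward(self, port_name, message):
        home = self.network.home_of[port_name]      # name lookup
        self.forwarded += 1
        self.network.servers[home].deliver(port_name, message)

    def deliver(self, port_name, message):
        self.node.ports[port_name].append(message)  # hand to the local port


class Node:
    def __init__(self, name, network):
        self.name = name
        self.ports = {}     # local port name -> queued messages
        self.netmsg = NetMsgServer(self, network)

    def create_port(self, port_name):
        self.ports[port_name] = []
        self.netmsg.register(port_name)

    def send(self, port_name, message):
        if port_name in self.ports:
            self.ports[port_name].append(message)   # local: no daemon involved
        else:
            self.netmsg.forward(port_name, message) # off-node: via the daemon


net = Network()
a, b = Node("A", net), Node("B", net)
b.create_port("printer")
a.send("printer", "job 1")   # crosses machines through both NetMsgServers
b.send("printer", "job 2")   # stays local on B
```

The sender uses the same `send` call in both cases, which is the location transparency the slides attribute to ports.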

NetMsgServer in Action
[Figure: on both the sender and the receiver machine, a user process and a NetMsgServer run in user space above kernel space.]

Mach and User Interfaces
- Mach was built for the UNIX community
- UNIX programs don’t know about ports, messages, threads, and tasks
- How do UNIX programs run under Mach?
- Mach typically runs a user-level server that offers UNIX emulation
- Either provides UNIX system call semantics internally or translates them to Mach primitives

Windows NT
- More layered than some microkernel designs
- NT Microkernel provides base services
- Executive builds on base services via modules to provide user-level services
- User-level services used by
  - privileged subsystems (parts of OS)
  - true user programs

Windows NT Diagram
[Figure: user processes and protected subsystems (Win32, POSIX) run in user mode; the Executive and the microkernel run in kernel mode above the hardware.]

NT Microkernel
- Thread scheduling
- Process switching
- Exception and interrupt handling
- Multiprocessor synchronization
- Only NT part not preemptible or pageable
- All other NT components run in threads

NT Executive
- Higher level services than microkernel
- Runs in kernel mode
  - but separate from the microkernel itself
  - ease of change and expansion
- Built of independent modules
  - all preemptible and pageable

NT Executive Modules
- Object manager
- Security reference monitor
- Process manager
- Local procedure call facility (a la RPC)
- Virtual memory manager
- I/O manager

Typical Activity in NT
[Figure: a client process and the Win32 protected subsystem above the kernel/Executive, which sits on the hardware.]

Windows NT Threads
- Executable entity running in an address space
- Scheduled by kernel
- Handled by kernel’s dispatcher
- Kernel works with stripped-down view of thread: the kernel thread object
- Multiple process threads can execute on distinct processors, even Executive ones

Microkernel Process Objects
- A microkernel proxy for the real process
- Microkernel’s interface to the real process
- Contains pointers to the various resources owned by the process
  - e.g., threads and address spaces
- Alterable only by microkernel calls

Microkernel Thread Objects
- As microkernel process objects are proxies for the real object, microkernel thread objects are proxies for the real thread
  - One per thread
- Contains minimal information about thread
  - Priorities, dispatching state
- Used by the microkernel for dispatching

More On Microkernels
- Microkernels were the research architecture of the 80s
- But few commercial systems of the 90s really use microkernels
- To some extent, “microkernel” is now a dirty word in OS design
- Why?

Microkernel Construction
- Most microkernels do not perform well
  - Is that inherent in the approach or in the implementation?
- IPC, the microkernel bottleneck, can be implemented an order of magnitude faster.
- The microkernel does not supervise memory
  - Minimal address space management: grant, map, flush.
- Fast kernel-user switch: usually 20-30 us, but 3 us in the L3 implementation

Exokernel
- Traditional operating systems fix the interface and implementation of OS abstractions.
- Abstractions must be overly general to work with diverse application needs.
[Figure: applications sit above a FIXED interface of abstractions, which sits above the hardware.]

Example

[Figure: in a traditional OS, Apache and SQL Server both run on the same FIXED interface and abstractions above the hardware.]

The Issues

Performance: denies applications the advantages of domain-specific optimizations

Flexibility: restricts the flexibility of application builders

Functionality: discourages changes to the implementations of existing abstractions

Performance

Example: a DB can have predictable data access patterns that do not fit the OS's LRU page replacement, causing bad performance.

Cao et al. found that application-controlled file caching can reduce running time by as much as 45%.

There is no single way to abstract physical resources or to implement an abstraction that is best for all applications.

The OS is forced to make trade-offs

Performance improvements from application-specific policies could be substantial
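
The mismatch between LRU and a predictable access pattern can be sketched with a small simulation. This is an illustration only, assuming a hypothetical database that scans 5 pages in a loop with room for 4 in memory; the function name `count_hits` and the choice of MRU as the application-specific policy are assumptions, not something taken from the slides or from Cao et al.

```python
from collections import OrderedDict

def count_hits(accesses, capacity, policy):
    """Count cache hits for a page trace under 'LRU' or 'MRU' eviction."""
    cache = OrderedDict()  # ordered from least to most recently used
    hits = 0
    for page in accesses:
        if page in cache:
            hits += 1
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= capacity:
                # LRU evicts the oldest entry, MRU evicts the newest one.
                cache.popitem(last=(policy == "MRU"))
            cache[page] = True
    return hits

# Hypothetical workload: four sequential scans over pages 0..4.
scan = [0, 1, 2, 3, 4] * 4
lru_hits = count_hits(scan, 4, "LRU")
mru_hits = count_hits(scan, 4, "MRU")
print("LRU hits:", lru_hits, "MRU hits:", mru_hits)
```

With a looping scan slightly larger than memory, LRU always evicts the page that will be needed soonest, which is the kind of case where an application that knows its own access pattern could choose better.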

Flexibility

Fixed high-level abstractions hide information from applications.

This makes it difficult or impossible for applications to implement their own resource management abstractions.

Functionality

Only one interface is available between applications and hardware resources.

Because all applications must share one set of abstractions, changes to these abstractions occur rarely, if ever.

The Solution

Separate protection from management

Allow user level to manage resources

Application libraries implement OS abstractions

Exokernel exports resources

Low-level interface

Protects, does not manage

Exposes hardware

Exokernel Philosophy

Applications know better than operating systems what the goal of their resource management decisions should be.

Applications should be given as much control as possible over those decisions.

Implementation view:

[Figure: the exokernel sits directly on the hardware and exposes Frame Buffer | TLB | Network | Memory | Disk.]

Example

Exokernel: application-level resource management

[Figure: SQL Server runs on a library OS customized for SQL Server, and Apache runs on a library OS chosen from those available; each library OS provides its own interface and abstractions on top of the exokernel, which runs on the hardware.]

Implementation Overview

A library O.S. uses the low-level exokernel interface to implement higher-level abstractions.

[Figure: a Library O.S. on top of the exokernel (Frame Buffer | TLB | Network | Memory | Disk), which runs on the hardware.]

Implementation Overview

Applications link to a library O.S., leveraging its higher-level abstractions.

[Figure: two applications, each linked to its own Library O.S., on top of the exokernel (Frame Buffer | TLB | Network | Memory | Disk), which runs on the hardware.]

End-to-End Argument

“If something has to be done by the user program itself, it is wasteful to do it in a lower level as well.”

Why should the OS do anything that the user program can do itself?

In other words, all an OS should do is securely allocate resources.

Exokernel design

Exokernel tasks

Track ownership

Guard all resources through bind points

Revoke access to resources

Design principle

Expose hardware (securely)

Expose allocation

Expose names

Expose revocation

Secure binding

Decouples authorization from use

Allows the kernel to protect resources without understanding their semantics

Example: TLB entry. The virtual-to-physical mapping is computed in the library (above the exokernel); the binding is loaded into the kernel and used multiple times

Example: packet filter. Predicates are loaded into the kernel and checked on each packet arrival
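
The packet-filter example can be sketched as a toy model of bind time versus access time. This is not Aegis code: the names `bind_filter` and `deliver`, the dictionary packet format, and the use of Python lambdas as predicates are all assumptions made for illustration.

```python
# Hypothetical model: the "kernel" keeps one predicate per bound application.
filters = {}  # application name -> predicate over a packet (dict of header fields)

def bind_filter(app, predicate):
    """Bind time: the kernel accepts the predicate once."""
    filters[app] = predicate

def deliver(packet):
    """Access time: runs on each arriving packet; returns the owner or None."""
    for app, predicate in filters.items():
        if predicate(packet):
            return app
    return None

bind_filter("web", lambda p: p["proto"] == "tcp" and p["dst_port"] == 80)
bind_filter("dns", lambda p: p["proto"] == "udp" and p["dst_port"] == 53)

print(deliver({"proto": "udp", "dst_port": 53}))
```

The kernel side never interprets what "web" or "dns" mean; it only evaluates the predicates it was handed, which is the sense in which it protects a resource without understanding its semantics.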

Implementing secure bindings

Hardware mechanisms: capability for the physical pages of a file; frame buffer regions (SGI)

Software caching: the exokernel keeps a large software TLB overlaying the hardware TLB

Downloading code into the kernel: avoids expensive boundary crossings; similar to the SPIN idea

Examples of secure binding

Physical memory allocation (hardware-supported binding)

The library allocates a physical page

The exokernel records the allocator and the permissions and returns a “capability” (an encrypted token)

Every access to this page by the library requires this capability

Page fault:
•The kernel fields it
•Kicks it up to the library
•The library allocates a page and gets an encrypted capability
•The library calls the kernel to enter a particular translation into the TLB by presenting the capability
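
A rough sketch of the capability check in this flow follows. The slides only say the capability is "encrypted"; using an HMAC tag as the unforgeable part is an assumption made here for illustration, as are the names `alloc_page` and `tlb_insert` and the tuple layout of a capability.

```python
import hashlib
import hmac
import os

KERNEL_KEY = os.urandom(16)  # secret known only to the simulated kernel
tlb = {}                     # virtual page -> physical page

def _mac(page, perms):
    """Tag binding a physical page number to its permissions."""
    msg = f"{page}:{perms}".encode()
    return hmac.new(KERNEL_KEY, msg, hashlib.sha256).digest()

def alloc_page(page, perms):
    """Bind time: hand the library a capability for the page."""
    return (page, perms, _mac(page, perms))

def tlb_insert(vpage, capability):
    """Access time: install a translation only if the capability is valid."""
    page, perms, tag = capability
    if not hmac.compare_digest(tag, _mac(page, perms)):
        return False
    tlb[vpage] = page
    return True

cap = alloc_page(7, "rw")
forged = (8, "rw", cap[2])  # reuses the tag for a different page

print(tlb_insert(0x10, cap), tlb_insert(0x11, forged))
```

The library can hold and pass around the capability freely, but it cannot alter the page number without invalidating the tag, so the kernel check stays a cheap comparison.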

Download code into the kernel to establish a secure binding

Packet filter for demultiplexing network packets; very similar to SPIN

How to ensure authenticity? Only trusted servers (library OS) can download code into the kernel

Other uses of downloaded code: execute code on behalf of an app that is not currently scheduled, e.g. an application handler for garbage collection could be installed in the kernel

Visible resource revocation

Most resources are visibly revoked, e.g. processor, physical page

The library can then perform the necessary action before relinquishing the resource, e.g. saving needed state for a processor or updating the page table

Abort protocol

A repossession exception is passed to the library OS

Repossession vector: gives info to the library OS as to what was repossessed so that corrective action can be taken

The library OS can seed the vector to enable the exokernel to autosave (e.g. the disk blocks to which a physical page being repossessed should be written)

Aegis – an exokernel

Aegis – processor time slice

Linear vector of time slots, scheduled round robin

An application can mark its “position” in the vector for scheduling

Timer interrupt: marks the beginning and end of time slices; control is transferred to a library-specified handler for the actual saving/restoring

The time to save/restore is bounded. Penalty for exceeding it? Loss of a time slice next time!

Aegis – processor environments

Exception context: program generated

Interrupt context: external, e.g. timer

Protected entry context: cross-domain calls

Addressing context: guaranteed mappings implemented by a software TLB mimicking the library OS page table

Aegis performance

Aegis – Address translation

On a TLB miss:

For guaranteed mappings, the kernel installs the hardware TLB entry from the software TLB

Otherwise the application handler is called

The application establishes the mapping

The TLB entry, with its associated capability, is presented to the kernel

The kernel installs it and resumes execution of the application
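
The miss path above can be sketched as a small decision procedure. This is a simplified model, not Aegis code: the names are hypothetical, the set `owned_pages` stands in for the capability check, and the page-table contents are made up for the example.

```python
hardware_tlb = {}
software_tlb = {0: 100}    # guaranteed mappings cached by the kernel
owned_pages = {200, 201}   # physical pages the application may map

def app_handler(vpage):
    """Hypothetical library-OS page table lookup; returns a physical page or None."""
    page_table = {1: 200, 2: 999}  # page 999 is not owned by this application
    return page_table.get(vpage)

def on_tlb_miss(vpage):
    """Return who resolved the miss: 'kernel', 'application', or 'fault'."""
    if vpage in software_tlb:          # fast path, no upcall needed
        hardware_tlb[vpage] = software_tlb[vpage]
        return "kernel"
    ppage = app_handler(vpage)         # the application establishes the mapping
    if ppage in owned_pages:           # stand-in for checking the capability
        hardware_tlb[vpage] = ppage
        return "application"
    return "fault"

print(on_tlb_miss(0), on_tlb_miss(1), on_tlb_miss(2))
```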

ExOS – library OS

IPC abstraction

VM

Remote communication using ASH (application-specific safe handlers)

Takeaway: significant performance improvement is possible compared to a monolithic implementation

The Exokernel

A thin veneer that multiplexes and exports physical resources securely.

Simplicity allows efficiency

The lower the level of a primitive, the more efficiently it can be implemented, and the more latitude it grants to implementers of higher-level abstractions.

The Exokernel

Resource management is restricted to allocation, revocation, sharing, and ownership tracking

Library operating systems

Use the low-level exokernel interface

Higher-level abstractions

Special-purpose implementations

An application can choose the library which best suits its needs, or even build its own.

Another Example

Design Challenge

How can an exokernel allow libOSes to freely manage physical resources while protecting them from each other?

Track ownership of resources

Secure bindings: a libOS can securely bind to machine resources

Guard all resource usage

Revoke access to resources

Secure Bindings

The exokernel allows libOSes to bind resources using secure bindings

Multiplex resources securely

Protection for mutually distrusting apps

Efficient

Secure Bindings

Secure binding: a protection mechanism that decouples authorization from actual use of a resource

Allows the kernel to protect resources without having to understand them

Guard all resource usage

Invisible resource revocation
-Efficient: application layer not involved
-Traditional OS

Visible resource revocation
-Allows libOS to guide deallocation and track availability of resources
-Exokernel

Revoke access to resources

Abort protocol: allows the exokernel to break the secure bindings of an uncooperative libOS by force

Conclusion

An exokernel securely multiplexes the available raw hardware among applications

Application-level library operating systems implement higher-level traditional OS abstractions

LibOSes can specialize an implementation to suit a particular application

Conclusion

The lower the level of a primitive…

…the more efficiently it can be implemented

…the more latitude it gives to higher-level abstractions

So, separate management from protection and…

…implement protection at a low level (exokernel)

…implement management at a higher level (libOS)

Some Features

It is possible to have different libOSes; for example, one could export a Unix API and another a Windows API

Exokernel vs. Microkernel

A micro-kernel provides abstractions over the hardware, such as files, sockets, graphics, etc.

An exokernel provides almost raw access to the hardware.

Exokernel: Implementation Overview

Allows the extension, specialization, and even replacement of abstractions.

Example: page table implementations can vary from libOS to libOS, and applications can choose whichever is most suitable for their needs.

Exokernel: Implementation Principles

Provide libOSes maximum freedom while protecting them from each other. This is achieved through separation of protection and resource management.

Resources should only be managed to the extent required for protection. LibOSes handle how best to use resources, with the exokernel arbitrating between competing libraries.

LibOSes should be able to request specific physical resources (like specific physical pages).

Resources should not be implicitly allocated; the libOS should participate in every allocation.

Exokernel Design

Secure Bindings

Downloading Code

Visible Revocation

Abort Protocol

Exokernel: Secure Bindings

Protection mechanism that decouples authorization (bind time) from actual use of the resource (access time).

Authorization is performed at bind time.

Expressed in simple operations that the exokernel can implement quickly and efficiently.

Can protect resources without understanding them.

Example: when a page fault occurs, the virtual-to-physical address mapping is computed, the mapping is loaded into the exokernel (bind time), and then used multiple times (access time).

Exokernel: Downloading Code

Code can be downloaded into the exokernel for execution at defined events (like packet arrival).

Reduces kernel crossings.

Can execute even when the application isn't scheduled.

Can initiate events (e.g. initiate a response message to a packet).

Example: a packet filter is downloaded into the exokernel (bind time) and then run on every incoming packet to determine the intended target application (access time); it can even initiate a response.

Exokernel: Visible Resource Revocation

Traditionally, OSes revoke (deallocate) resources invisibly, without application involvement (e.g. physical memory).

Advantage: lower latency

Disadvantage: applications cannot guide deallocation

The exokernel uses visible revocation for most resources. The library OS is notified of the intention to deallocate and can guide the process.

Example: when a libOS is told that the exokernel will deallocate physical page “5”, it can use this information to update its page table, or even to suggest a less important page for deallocation.

Exokernel: Abort Protocol

Mechanism to take away resources when libOSes fail to respond satisfactorily to visible revocation requests.

A repossession vector is used to keep track of forcibly deallocated resources. Library OSes can pre-load the vector with information that can be used to write state or data about the resource when it is deallocated (e.g. define disk blocks for memory paging).

OSes normally require certain allocations to be permanent, so the exokernel can guarantee a small number of resources that cannot be forcibly deallocated. Example: page tables, exception areas
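
How visible revocation falls back to the abort protocol can be sketched with two toy classes. Everything here is a simplified model under stated assumptions: the class and method names are hypothetical, a libOS "cooperates" by returning True from `on_revoke`, and the repossession vector is reduced to a per-libOS list of pages taken by force.

```python
class LibOS:
    def __init__(self, name, cooperative):
        self.name = name
        self.cooperative = cooperative
        self.page_table = {}  # virtual page -> physical page

    def on_revoke(self, page):
        """Visible revocation upcall; returns True if the page was given up."""
        if self.cooperative:
            # Drop our own mappings to the page before relinquishing it.
            self.page_table = {v: p for v, p in self.page_table.items() if p != page}
        return self.cooperative

class Exokernel:
    def __init__(self):
        self.owner = {}        # physical page -> owning LibOS
        self.repossessed = {}  # libOS name -> pages taken by force

    def revoke(self, page):
        lib = self.owner[page]
        if lib.on_revoke(page):            # ask first
            del self.owner[page]
            return "released"
        # Abort protocol: break the binding and record it for the libOS.
        del self.owner[page]
        self.repossessed.setdefault(lib.name, []).append(page)
        return "repossessed"

kernel = Exokernel()
good, bad = LibOS("good", True), LibOS("bad", False)
kernel.owner = {5: good, 6: bad}
good.page_table = {0: 5}
r1 = kernel.revoke(5)
r2 = kernel.revoke(6)
print(r1, r2, kernel.repossessed)
```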

Exokernel: Implementation

Aegis: exokernel. Exports: processor, physical memory, TLB, exceptions, interrupts, and network interface.

ExOS: library OS. Implements: processes, virtual memory, user-level exceptions, interprocess abstractions, and network protocols (ARP, IP, UDP, NFS)

Compared to Ultrix

Exokernel: Aegis Processor Time Slices

Time slices are partitioned and allocated at the clock granularity.

Scheduled using round robin.

Advanced scheduling can be implemented by a libOS by requesting specific positions in the vector of time slices.

Long-running apps can allocate contiguous time slices, while interactive apps can allocate several equidistant slices.
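
The time-slice vector can be pictured with a short sketch. The vector length of 8, the function name `allocate`, and the application names are assumptions chosen for illustration; the real Aegis interface is not reproduced here.

```python
SLOTS = 8
vector = [None] * SLOTS  # linear vector of time slots; None means free

def allocate(app, positions):
    """Claim specific slots; fails without side effects if any is taken."""
    if any(vector[p] is not None for p in positions):
        return False
    for p in positions:
        vector[p] = app
    return True

ok_batch = allocate("batch", [0, 1, 2])   # contiguous slices for a long-running job
ok_editor = allocate("editor", [3, 7])    # equidistant slices (4 apart, cyclically)
ok_late = allocate("late", [2, 4])        # slot 2 is already owned

# Round robin: the kernel walks the vector and runs each slot's owner in turn.
one_round = [vector[t % SLOTS] for t in range(SLOTS)]
print(one_round)
```

Because applications pick positions rather than priorities, the scheduling policy lives in the libOS while the kernel only enforces slot ownership.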

Exokernel: Aegis

Exceptions

Interrupts

Address Translations: guarantees address mappings for a small number of pages, to simplify bootstrapping.

Protected Control Transfers: for IPC abstractions. Changes the program counter to an agreed location, sets appropriate data for the callee's context, and donates the current time slice.

Dynamic Packet Filter

Exokernel: ExOS

IPC Abstractions: pipe. ExOS uses a shared memory buffer, an order of magnitude faster than Ultrix, which uses standard Unix pipes.

Application-Level Virtual Memory: a 150x150 integer matrix multiplication, which doesn't use any special ExOS or Aegis abilities, shows that application-level VM doesn't incur noticeable overhead (0.1 second difference). All other tests perform comparably with Ultrix (reading pages, flipping protection bits, etc.)

Downloaded code for networking handler: round-trip latency for RPC is lower than with FRPC

Exokernel: ExOS Extensibility

Extensible page-table structures: implemented inverted page tables

Extensible schedulers: stride scheduling (proportional-share scheduling). The processes are successfully scheduled at a ratio of 3:2:1
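
The 3:2:1 outcome can be reproduced with a minimal stride scheduler. This is a sketch of the general technique, not the ExOS implementation: the constant `stride1 = 60`, the client names, and starting every pass value at 0 (rather than at the client's stride) are simplifying assumptions.

```python
import heapq

def stride_schedule(tickets, quanta, stride1=60):
    """Return how many quanta each client receives under stride scheduling."""
    # Heap entries are (pass value, name); the lowest pass value runs next.
    heap = [(0, name) for name in tickets]
    heapq.heapify(heap)
    counts = {name: 0 for name in tickets}
    for _ in range(quanta):
        pass_value, name = heapq.heappop(heap)
        counts[name] += 1
        # A client's stride is inversely proportional to its tickets.
        heapq.heappush(heap, (pass_value + stride1 // tickets[name], name))
    return counts

counts = stride_schedule({"A": 3, "B": 2, "C": 1}, 60)
print(counts)
```

Clients with more tickets advance their pass value in smaller steps, so they are selected proportionally more often.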

Exokernel: Conclusion

Experiments with Aegis and ExOS show:

Simple exokernel primitives can be implemented efficiently

Fast low-level hardware multiplexing can be implemented efficiently

Traditional OS abstractions can be implemented at user level

Applications can create special-purpose implementations by modifying libraries

Exokernel: Other Exokernel Work

“Porting Multithreading Libraries to an Exokernel System”, Ernest Artiaga, Albert Serra, Marisa Gil, Dept. of Computer Architecture, Universitat Politecnica de Catalunya. ACM SIGOPS European Workshop, ACM 2000, pp. 121-126

Ported Cthreads to Exokernel

Slightly faster execution than without threading

Exokernel: Other Exokernel Work

“Fast and Flexible Application-Level Networking on Exokernel Systems”, Gregory Ganger, Dawson Engler, et al., CMU, Stanford, MIT and Vividon, Inc. ACM Transactions on Computer Systems, vol. 20, no. 1, pp. 49-83, 2002

Implemented TCP, an HTTP server, and a web benchmarking tool

TCP: 50-300% higher throughput

HTTP: 3-8 times higher throughput

Benchmarking: can produce loads 2-8 times heavier

Key points of the paper

A microkernel should provide minimal abstractions: address space, threads, IPC

Abstractions are machine independent, but the implementation is hardware dependent for performance

Myths about the inefficiency of microkernels stem from inefficient implementations and NOT from the microkernel approach

What abstractions?

Determining criterion: functionality, not performance

Hardware and microkernel should be trusted, but applications are not

Hardware provides page-based virtual memory

The kernel builds on this to provide protection for services above and outside the microkernel

Principles of independence and integrity: subsystems are independent of one another; the integrity of channels between subsystems is protected from other subsystems

Microkernel Concepts

Hardware provides the address space: a mapping from virtual page to physical page, implemented by page tables and the TLB

Microkernel concept of address spaces: hides the hardware address spaces and provides an abstraction that supports Grant, Map, and Flush

These primitives allow building a hierarchy of protected address spaces
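
The three primitives can be sketched as operations on a small mapping database. This is a toy model under stated assumptions, not L3/L4 code: entries are `(space, vpage)` tuples, the function names are hypothetical, and `parent` records which entry each mapping was derived from so that flush can find everything downstream.

```python
frames = {}  # (space, vpage) -> physical frame
parent = {}  # (space, vpage) -> entry it was derived from (None for the root)

def map_page(src, dst):
    """Map: dst gains the page and src keeps it."""
    frames[dst] = frames[src]
    parent[dst] = src

def grant_page(src, dst):
    """Grant: dst gains the page and src loses it."""
    frames[dst] = frames.pop(src)
    parent[dst] = parent.pop(src)
    for entry, p in parent.items():  # mappings derived from src now hang off dst
        if p == src:
            parent[entry] = dst

def flush_page(src):
    """Flush: every mapping derived from src disappears; src keeps the page."""
    for entry in [e for e, p in parent.items() if p == src]:
        flush_page(entry)
        del frames[entry], parent[entry]

frames[("A0", 1)] = 42
parent[("A0", 1)] = None
map_page(("A0", 1), ("A1", 5))    # A0 shares the page with A1
grant_page(("A1", 5), ("A2", 9))  # A1 passes it on to A2 and loses it
map_page(("A2", 9), ("A3", 2))    # A2 shares it with A3
flush_page(("A0", 1))             # A0 reclaims exclusive use
print(frames)
```

Because every derived mapping is reachable from the entry it came from, a memory manager higher in the hierarchy can always take back what it handed out.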

Address spaces

[Figure: map, grant, and flush applied to physical page R across address spaces A1, A2, and A3 (page tables P1-P3, virtual pages V1-V3); entries that lose the page are shown as NIL.]

Power and flexibility of address spaces

The initial memory manager for address space A0 appears by magic (similar to the SPIN core service BUT outside the kernel) and encompasses the physical memory

Allows creation of stackable memory managers (all outside the kernel)

Pagers can be part of a memory manager or outside the memory manager

All address space changes (map, grant, flush) are orchestrated via the kernel for protection

A device driver can be implemented as a special memory manager outside the kernel as well

[Figure: memory managers M0, M1, and M2, each with its address space (A0-A2) and page table (PT, P0-P2), above the microkernel and processor and connected by map/grant.]

Threads and IPC

A thread executes in an address space: PC, SP, processor registers, and state info (such as the address space)

IPC is cross-address-space communication, supported by the microkernel

The classic method is message passing between threads via the kernel

The sender sends info; the receiver decides if it wants to receive it, and if so where

Address space operations such as map, grant, and flush need IPC

Higher-level communication (e.g. RPC) is built on top of basic IPC

Interrupts?

Each hardware device is a thread from the kernel’s perspective

An interrupt is a null message from a hardware thread to the software thread

The kernel transforms the hardware interrupt into a message; it does not know or care about the semantics of the interrupt

Device-specific interrupt handling happens outside the kernel

Clearing hardware state (if privileged) is then carried out by the kernel upon the driver thread’s next IPC

TLB handler?

In theory the software TLB handler can be outside the microkernel

In practice the first-level TLB handler is inside the microkernel or in hardware

Unique IDs

The kernel provides UIDs, unique over space and time, for threads and IPC channels

Breaking some performance myths
  Kernel-user switches
  Address space switches
  Thread switches and IPC
  Memory effects
Base system: 486 (50 MHz), 20 ns cycle time


Kernel-user switches
Machine instructions for entering and exiting the kernel: 107 cycles
Mach measures 900 cycles for a kernel-user switch
  Why?
Empirical proof
  L3 kernel: ~123 cycles (accounting for some TLB and cache misses)
Where did the remaining ~800 cycles go in Mach?
  Kernel overhead (construction of the kernel, and inherent in the approach)
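The numbers on this slide can be checked with a few lines of arithmetic, using the 20 ns cycle time of the 50 MHz 486 base system. The variable and function names below are chosen for this example only.

```python
CYCLE_NS = 20  # 486 at 50 MHz: 1 / 50 MHz = 20 ns per cycle


def cycles_to_us(cycles):
    """Convert a cycle count to microseconds on the base system."""
    return cycles * CYCLE_NS / 1000


hardware_cycles = 107  # bare enter/exit instructions
mach_cycles = 900      # measured Mach kernel-user switch
l3_cycles = 123        # measured L3 kernel-user switch

# Cycles spent beyond what the hardware strictly requires.
mach_overhead = mach_cycles - hardware_cycles  # the slide's "remaining 800"
l3_overhead = l3_cycles - hardware_cycles

hardware_us = cycles_to_us(hardware_cycles)
mach_us = cycles_to_us(mach_cycles)
l3_us = cycles_to_us(l3_cycles)
```

So the hardware minimum is about 2.14 µs, L3 adds 16 cycles on top of it, and Mach adds 793.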


Address space switches
Primer on TLBs
  AS-tagged TLB (MIPS R4000) vs. untagged TLB (486)
  Untagged TLB requires a flush on AS switch
Instruction and data caches
  Usually physically tagged in most modern processors, so a TLB flush has no effect on them
Address space switch
  Complete reload of the Pentium TLB: ~864 cycles
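The difference between tagged and untagged TLBs can be seen in a small simulation. This is a simplified model under stated assumptions: the TLB has unlimited capacity, `ToyTLB` and `run` are hypothetical names, and the page tables are made-up data.

```python
class ToyTLB:
    """Simplified TLB model with unlimited capacity (an assumption)."""

    def __init__(self, tagged):
        self.tagged = tagged
        self.entries = {}  # (address space tag or None, virtual page) -> frame
        self.current_as = 0
        self.misses = 0

    def switch_to(self, asid):
        if not self.tagged:
            # Untagged: entries cannot be told apart, so they must be flushed.
            self.entries.clear()
        self.current_as = asid

    def translate(self, vpage, page_tables):
        key = (self.current_as if self.tagged else None, vpage)
        if key not in self.entries:
            self.misses += 1  # miss: walk the page table and refill
            self.entries[key] = page_tables[self.current_as][vpage]
        return self.entries[key]


# Two address spaces mapping the same virtual pages to different frames.
page_tables = {0: {0: 10, 1: 11}, 1: {0: 20, 1: 21}}


def run(tlb):
    # Alternate between the two address spaces three times,
    # touching two pages after every switch.
    for _ in range(3):
        for asid in (0, 1):
            tlb.switch_to(asid)
            for vpage in (0, 1):
                tlb.translate(vpage, page_tables)
    return tlb.misses


untagged_misses = run(ToyTLB(tagged=False))
tagged_misses = run(ToyTLB(tagged=True))
```

In this toy workload the untagged TLB misses on every access after a switch, while the tagged TLB misses only the first time each (address space, page) pair is touched.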


Do we need a TLB flush always?
Implementation issue of “protection domains”
  SPIN implements protection domains as Modula names within a single hardware address space
  Liedtke suggests a similar approach in the microkernel, in an architecture-specific manner
    PowerPC: use segment registers => no flush
    Pentium or 486: share the linear hardware address space among several user address spaces => no flush
  There are some caveats in terms of the size of a user space and how many can be “packed” into a 2**32 global space
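The packing caveat is easy to quantify. The sketch below assumes equal-size regions and an optional reserved area at the bottom of the linear space; `pack`, `to_linear`, and the chosen sizes are illustrative assumptions, not values from the lecture.

```python
LINEAR_SPACE = 2**32  # size of the shared linear address space
MB = 2**20


def pack(region_size, reserved=0):
    """Return base addresses of equal-size regions that fit in the linear space."""
    count = (LINEAR_SPACE - reserved) // region_size
    return [reserved + i * region_size for i in range(count)]


def to_linear(bases, region_size, space_index, user_addr):
    """Translate a user address to a linear one, segment base/limit style."""
    if not 0 <= user_addr < region_size:
        raise ValueError("address outside the protection domain")
    return bases[space_index] + user_addr


small = pack(4 * MB)      # many small protection domains
medium = pack(256 * MB)   # far fewer medium-size ones
with_kernel = pack(256 * MB, reserved=1024 * MB)  # assumed 1 GB reserved

linear = to_linear(small, 4 * MB, space_index=2, user_addr=0x10)
```

The trade-off is direct: the larger each user space, the fewer of them can share the linear space without a flush.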


Upshot?
  Address space switching among medium or small protection domains can ALWAYS be made efficient by careful construction of the microkernel
  Switches between large address spaces are ALWAYS going to be expensive due to cache and TLB effects, so the switching cost itself is not the most critical issue


Thread switches and IPC


Segment switch (instead of AS switch) makes cross domain calls cheap


Memory Effects – System


Capacity induced MCPI


Portability vs. Performance
A microkernel on top of abstract hardware, while portable,
  Cannot exploit hardware features
  Cannot take precautions to avoid performance problems specific to an architecture
  Incurs a performance penalty due to the abstraction layer


Examples of non-portability
Same processor family
  Address space switch implementation
    TLB flush method preferable for 486
    Segment register switch preferable for Pentium
    => 50% change of microkernel!
  IPC implementation
    Details of the cache layout (associativity) require different handling of IPC buffers on 486 and Pentium
Incompatible processors
  Exokernel on R4000 (tagged TLB) vs. 486 (untagged TLB)
=> Microkernels are inherently non-portable


Summary
Minimal set of abstractions in the microkernel
Microkernels are processor-specific (at least in implementation) and non-portable
The right abstractions and a processor-specific implementation lead to efficient processor-independent abstractions at higher layers


Performance


Key points
Goal: extensibility, akin to the goals of SPIN and Exokernel
Main difference: support running several commodity operating systems on the same hardware simultaneously without sacrificing performance or functionality
Why?
  Application mobility
  Server consolidation
  Co-located hosting facilities
  Distributed web services
  …


Multiprocessor OS
  Synchronization
  Communication
  Scheduling
We have seen these issues already in the other readings in this section of the course


Key Issues
Modern parallel machines
  Large system sizes stress bottlenecks in system software (e.g. global data structures)
  Higher memory latencies
  NUMA effects (i.e. the symmetric assumption does not hold)
Cache hierarchy
  Write sharing is expensive due to coherence traffic
  False sharing due to large cache lines
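False sharing arises when unrelated variables land on the same cache line, so a write to one invalidates the other in remote caches. The sketch below only does the address arithmetic; the 64-byte line size and the example addresses are assumptions, and `line_of`, `falsely_shared`, and `pad_to_line` are hypothetical helper names.

```python
CACHE_LINE = 64  # bytes; an assumed line size for illustration


def line_of(addr):
    """Index of the cache line that holds this byte address."""
    return addr // CACHE_LINE


def falsely_shared(addr_a, addr_b):
    """True if two distinct variables sit on the same cache line."""
    return addr_a != addr_b and line_of(addr_a) == line_of(addr_b)


def pad_to_line(size):
    """Round a size up to a whole number of cache lines."""
    return -(-size // CACHE_LINE) * CACHE_LINE


base = 0x1000
counter_size = 8  # two 8-byte per-processor counters

# Packed back to back: both counters share one line.
packed = falsely_shared(base, base + counter_size)

# Each counter padded to its own line: no false sharing.
stride = pad_to_line(counter_size)
padded = falsely_shared(base, base + stride)
```

Padding trades memory for fewer coherence misses, which is why larger cache lines make the problem worse.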


Thesis of Tornado paper
In designing a multiprocessor OS
  Pay attention to locality
  Reduce shared system data structures
  Reduce the distance between the accessing processor and the target memory module


Effect of global data structure – shared counter


Tornado design approach
Object-oriented design for scalability
  Clustered objects
  Protected procedure call, with a view to preserving locality while ensuring concurrency
  Semi-automatic garbage collection for localizing locking
OS objects have multiple implementations
  Low-overhead version when scalability is not required
  Resort to a scalable implementation when performance is critical
Optimize the common case
  Object invocation should be fast; object creation/destruction can be slower
  Page fault handling should be fast; memory region creation/deletion can be slower
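The clustered-object idea can be illustrated with a counter that keeps one representative per processor, so the common operation touches only local state. This is a single-threaded sketch of the structure, not Tornado's implementation; `ClusteredCounter` and its methods are hypothetical names.

```python
class ClusteredCounter:
    """One logical counter backed by a representative per processor."""

    def __init__(self, num_cpus):
        # In a real system each representative would live in memory local
        # to its processor; here it is just one list slot per processor.
        self.reps = [0] * num_cpus

    def increment(self, cpu):
        # Common case: update only the local representative, so no
        # shared data is written.
        self.reps[cpu] += 1

    def value(self):
        # Rare case: visit every representative to get the global value.
        return sum(self.reps)


counter = ClusteredCounter(num_cpus=4)
for cpu in (0, 1, 1, 3, 3, 3):
    counter.increment(cpu)
total = counter.value()
```

This follows the "optimize the common case" rule above: increments are cheap and local, while the less frequent read pays the cost of gathering.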


Next Lecture
Process and Thread
  “Cooperative Task Management Without Manual Stack Management”, Atul Adya, et al.
  “Capriccio: Scalable Threads for Internet Services”, Rob von Behren, et al.
  “The Performance Implications of Thread Management Alternatives for Shared-Memory Multiprocessors”, Thomas E. Anderson, et al.