Center for Experimental Research in Computer Systems
Spring 2007 IAB Meeting
Karsten Schwan, Calton Pu, Douglas Blough, Sudhakar Yalamanchili
IUCR CERCS: NSF Industry University Co-operative Research Center
Mission
[Figure labels: Scientific/Grid, Enterprise, Embedded]
Lead the innovation of new information and computing technologies, to construct the interactive information systems of the future, and to create the intellectual capital that
can advance these technologies and fuel future advances.
Information anytime, anywhere: Timeliness! Quality! Security! Robustness!
Remote access to Information System
Strategic Thrusts - Highlights
– Scientific/Technical Computing -- Dynamic Data Management and
• GT: IHPCL Laboratory (e.g., new cluster machines, including substantial Intel donations for education and for multicore computing initiative)
• DOE: ORNL, Sandia: High Performance I/O initiative; involvement with startups
• Cisco (MPI and IB QoS); Dell; HP/Intel (Gelato, Itanium donations); IBM (VMM power management, Cell SUR grant), RNet communication processor design
• News: Multicore Focus (HP, IBM, Intel); Ongoing I/O Initiative, Virtualization in HPC (ORNL, Sandia, UNM)
Strategic Thrusts - Highlights
– Enterprise Computing -- Autonomic/Adaptive and Service-Oriented Systems:
• IBM, Intel, TCS (autonomic and critical enterprise systems, dynamic content distribution/event-based systems, SOA, virtualization – hypervisor scaleout, I/O virtualization, trusted passages, metering, power management, failure diagnosis and fault containment)
• HP (deployment and management, system monitoring, risk-based control, stream data mining)
• Worldspan (runtime behavior detection and QoI, virtualization)
• LogicBlox (dynamic code generation for efficient data access)
• Delta, Raytheon (policy/performance robustness, runtime behavior modeling)
• Cisco, Intel (network data services, heterogeneous multicore, IB network virtualization)
• News/Outreach: IBM SUR (joint with Ohio State), OSU industry partners, OSU IAB meeting, exploring new links: Benchmark, Earthlink, McKesson, NSF CRI
[Figure labels: Airport LAN; Cluster Computing; Real-Time Information Processing; Optimization; Baggage Status; Baggage Displays; Crew and Equipment Status; FAA Flight Data; Passenger paging and response; Gate Readers; Wide-area Transport; Real-time Decision Tools; Scalable Robust Services; Storage; Databases; Real-Time Information Transport (capture, display, transport, filter, transform); Visualization; Real-time Situation Assessment; High Performance Computing; Operational Flight Displays]
Strategic Thrusts - Highlights
– Embedded Systems/Architecture
• Boeing (testing, software correctness)
• Intel (in-vehicle computing and lightweight methods for system virtualization, system-level power management, computer architecture)
• Motorola (middleware for pervasive and mobile applications)
• Federal: pervasive applications (transportation, robotics), upcoming Cyberphysical Systems program
• IBM, Intel (network processors and heterogeneous multicore)
• Sony (gaming applications)
• News: Korea program in Embedded Systems, Samsung educational program, Robotics Center liaison, Sensor (OSU) and MobiEMU testbeds
[Figure labels: image quality, end-to-end delay, jitter, loss rate; throughput, response times]
CERCS Personnel
• Faculty
– Mustaque Ahamad, Mostafa Ammar, Doug Blough, Constantinos Dovrolis, Greg Eisenhauer, Richard Fujimoto, Ada Gavrilovska, Alexander Gray, Mary Jean Harrold, Hsien-Hsin Lee, Wenke Lee, Ling Liu, Gabriel Loh, Pete Manolios, Alex Orso, Henry Owen, Santosh Pande, Milos Prvulovic, Calton Pu, Kishore Ramachandran, Jay Ramanathan (Ohio State), Rajiv Ramnath (Ohio State), George Riley, David Schimmel, Karsten Schwan, Olin Shivers, Matthew Wolf, Hongyan Zha, Sudhakar Yalamanchili, Ellen Zegura
• Research Staff
– Steve Ferenci, David Hilley
– Supported by DARPA, DOE, NSF, (CoC), (ECE)
• Associated Faculty/Researchers
– David Bader, Tucker Balch (Robotics), Patrick Bridges (UNM), Robert Butera, Steve DeWeerth, Irfan Essa, Phil Hutto, Byron Jeff (Clayton State), Scott Klasky (ORNL), Kang Li, Sung Kyu Lim, Arthur Maccabe (UNM), Vincent Mooney, Jeff Nichols (ORNL), Krishna Palem, Kalyan Perumalla (ORNL), Jeff Vetter (ORNL), Patrick Widener (UNM)
Industrial Relations
• IUCR CERCS Center
– Contributors (GT): Boeing, Cisco, Delta, DOE, HP, IBM, Intel, LogicBlox, TCS, Worldspan
– Industry Workshops and Industrial Advisory Board
• Joint initiatives - e.g., TIE grant with UFL, expansion to Ohio State (joint curriculum/facility efforts), planned expansion to UNM
• Internship Program
– Amazon, ATT, CISCO, Delta, (DoCoMo), DOE, Google, HP, IBM, Intel, Microsoft, Motorola, NetApp, Radisys, TCS, VMWare, Worldspan
• Evolving relationships:
– ATT, DoCoMo, Microsoft, Motorola, NetApp, Netronome, Raytheon, RNet, VMWare, Xilinx
Overview - Current Industry Engagements
• One slide per ongoing project
• Federal projects (e.g., joint work with DOE ORNL) elided, so few HPC efforts described
• Same order: HPC, Enterprise, Embedded
HPC: Cisco - Infiniband-based Research
Gavrilovska, Schwan, Wolf
• Mechanisms for delivering end-to-end QoS levels in challenging settings:
– Multi-core nature of future HPC nodes
– I/O limitations in high-performance infrastructures
– End-to-end virtualized environments
• Two main efforts currently under investigation:
– Data virtualization: ‘Datatap’ mechanism on top of the low-level IB verb interface, providing (1) data extraction from the IB infrastructure, (2) middleware mechanisms to support dynamic extensions for service-oriented applications, and (3) dynamic, resource-aware routing and data distribution to meet application QoS requirements.
– Platform virtualization: (1) integrate x86-based virtualization solutions into Infiniband settings, (2) develop mechanisms for end-to-end QoS for VM-to-VM interactions through improved and dynamic resource management and scheduling.
Additional effort: RNET – high-end NIC for science applications
Enterprise: Elba Project – HP Labs
Calton Pu
• Apply code generation techniques to automate large system deployment, measurement, evaluation, and management
• Collaborative work (1 faculty-Pu, 1 industry-Sahai, 6 PhD stud., 4 MS stud., 1 undergrad.)
• 8 published papers in 2 years, several more in the pipeline
Automated System Mgmt
(1) Design
(2) Code Generation
(3) Deployment
(4) Evaluation & Analysis
(5) Reconfiguration
Current work (step 4 above): Evaluation of 3-tier benchmark (RUBiS) using generated scripts (millions of lines of deployment, measurement, and analysis scripts)
Enterprise: Robust Delivery of Quality Data - Worldspan
[Figure labels: Client 1, Client 2, Client 3; Message queue; Clearinghouse (CW); Airlines; GDS]
Karsten Schwan, Mohamed Mansour, Jay Lofstead
Problem: Complex GDS with potentially unanticipated behaviors
Example: Variable search times due to caching effects
Solution: Runtime behavior detection, model construction, and mitigation
Specific approach: Mitigation via request reordering
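The slide names request reordering as the mitigation but does not say how requests are reordered. The following is a minimal sketch of one possible reading, assuming that requests sharing a cache key are cheaper when served back to back. The request shape `(sequence number, cache key)`, the function name `reorder_window`, and the fixed window size are all hypothetical and are not taken from the Worldspan work.

```python
from collections import OrderedDict
from typing import Hashable, Iterable, List, Tuple

# Hypothetical request shape: (arrival sequence number, cache key).
Request = Tuple[int, Hashable]


def reorder_window(requests: Iterable[Request], window: int) -> List[Request]:
    """Group requests with the same cache key inside fixed-size windows.

    Reordering never crosses a window boundary, so a request is delayed
    by at most window - 1 positions. Arrival order is kept within a key.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    pending = list(requests)
    reordered: List[Request] = []
    for start in range(0, len(pending), window):
        groups: "OrderedDict[Hashable, List[Request]]" = OrderedDict()
        for req in pending[start:start + window]:
            groups.setdefault(req[1], []).append(req)
        for same_key in groups.values():
            reordered.extend(same_key)
    return reordered


queue = [(0, "ATL-JFK"), (1, "ATL-LAX"), (2, "ATL-JFK"), (3, "ATL-ORD"), (4, "ATL-LAX")]
result = reorder_window(queue, window=4)
```

Bounding the window is a design choice in this sketch: it trades some cache locality for a limit on how long any single request can be pushed back.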
Enterprise: Runtime Behavior Diagnosis – Delta Air Lines
Sandip Agarwala, Mohamed Mansour, Karsten Schwan
• Investigation of multiple enterprise architectures
– Revenue Pipeline, delta.com, DNS
• Path detection in complex systems
• Autonomic workflows
• Monitoring and management in SOA systems (proposed work)
Enterprise: Collaboration with IBM Research
Ling Liu
• Distributed Systems and Software
– Dynamic Content Dissemination: Architectures and Optimizations
– Collaborators: Arun Iyengar, Fred Douglis, Isabelle Rouvellou
• Event Streams and Security
– Sensor Stream Processing and Optimization (e.g., load shedding, load balancing, motion adaptive indexing)
– Event Stream Mining
– Collaborators: Philip Yu, Bugra Gedik, Rong Chang
• Service Oriented Computing
– Secure publish-subscribe systems, Secure Event Dissemination
– Collaborators: Arun Iyengar, Liang Jie Zhang
Enterprise/HPC: High Productivity Computing
• Stream computing programming model
– Kernels expressed in a declarative programming language
– Custom hardware for accelerating data intensive kernels
– Explicit interaction model: non-coherent shared memory
• Focus on applications such as retail forecasting and data analytics
[Figure labels: Application; Run-time; Kernel (Inputs, Outputs); CPU; Network (e.g., Hypertransport); ACC; DMA; FIFO; Local Memory; Cache; Memory]
Sudha Yalamanchili, with LogicBlox Inc.
Enterprise/HPC: Databus: Runtime Rule Generation - LogicBlox
Greg Eisenhauer
• Fine grain retail data analysis (what-if calculations)
• Rule-based declarative language
• Compile down to interpreted “FactBus”
• Rule objects and variable objects. Apply rules top to bottom, fallback on failure.
• Use DCG for “Just-In Time” compilation
• Initial results, speedups of 3.
• Additional improvements anticipated.
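The "apply rules top to bottom, fallback on failure" behavior can be illustrated with a small interpreter loop. This is a sketch under assumptions, not the FactBus implementation: rules are modeled as plain Python callables that return `None` on failure, and the names `apply_rules`, `margin_from_price_and_cost`, and `margin_default` are hypothetical.

```python
from typing import Callable, Dict, Optional, Sequence

Facts = Dict[str, float]
# Hypothetical rule shape: return a value on success, None on failure.
Rule = Callable[[Facts], Optional[float]]


def apply_rules(rules: Sequence[Rule], facts: Facts) -> Optional[float]:
    """Try each rule in order; fall back to the next one when a rule fails."""
    for rule in rules:
        result = rule(facts)
        if result is not None:
            return result
    return None  # every rule failed


def margin_from_price_and_cost(facts: Facts) -> Optional[float]:
    # Fails (returns None) when either input fact is missing.
    if "price" in facts and "cost" in facts:
        return facts["price"] - facts["cost"]
    return None


def margin_default(facts: Facts) -> Optional[float]:
    # Last-resort rule that always succeeds.
    return 0.0


rules = [margin_from_price_and_cost, margin_default]
computed = apply_rules(rules, {"price": 5.0, "cost": 3.0})
fallback = apply_rules(rules, {"price": 5.0})
```

An interpreter like this pays a dispatch cost on every rule application, which is the kind of overhead that just-in-time compilation would aim to remove.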
[Figure labels: Rules (Write, Test, Iterator, Lookup, BinOp, BinOp); DCG subset; apply; FactBus Variables (DBString, int, int, bool, double, double)]
Enterprise/HPC: Scalable Hypervisors - Intel
Karsten Schwan
Enterprise: Power Management in Virtualized Systems
[Figure labels, per platform: Platform HW; VMM; Dom0; VM 1, VM 2, VM 3 (each with OS and Application); VPM Channel; VPM Mechanisms; PM Policy]
Heterogeneity-aware Allocation Policy
Leverage heterogeneity in:
• Performance capabilities
• Power efficiency of resources
• Power management support
Coordinate virtualized system management:
• Enable VM management independence
• Decouple virtual and physical resources for management
• Introduce “soft” scaling for flexible management
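As an illustration of what a heterogeneity-aware allocation policy might look like, the sketch below places a VM on the most power-efficient host that still has enough capacity. It is a simplified greedy policy written for this summary, not the policy from the project: the `Host` fields, the abstract "performance units", and the function `place_vm` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Host:
    name: str
    capacity: float        # free performance units (hypothetical metric)
    watts_per_unit: float  # power cost per performance unit (hypothetical)


def place_vm(hosts: List[Host], demand: float) -> Optional[Host]:
    """Pick the most power-efficient host that can satisfy the demand.

    Returns None when no host has enough free capacity. The chosen
    host's free capacity is reduced in place.
    """
    candidates = [h for h in hosts if h.capacity >= demand]
    if not candidates:
        return None
    best = min(candidates, key=lambda h: h.watts_per_unit)
    best.capacity -= demand
    return best


hosts = [Host("older-node", 4.0, 10.0), Host("newer-node", 8.0, 6.0)]
first = place_vm(hosts, 3.0)   # both fit; the more efficient host wins
second = place_vm(hosts, 6.0)  # neither host has 6.0 units free any more
```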
Ripal Nathuji, Karsten Schwan
IBM + Intel
Enterprise: Trusted Passages on Virtualized Platforms – Intel/NSF
[Figure: trusted passage between Host1 (Overlay node1) and Host2 (Overlay node2) across networks. Per-host labels: Hypervisor, Service VM, Guest VM1, Guest VM2, Trust Controller, App., FE, BE, NIC]
Mustaque Ahamad, Greg Eisenhauer, Wenke Lee, Karsten Schwan
Run trusted services across untrusted platforms:
• Trust models and trust controller mechanisms for evolving node trust
• Virtual machine monitoring and introspection to support trust controllers
• Data interception and redirection as remedial measures
Embedded/Enterprise: Aristotle Research Group (Mary Jean Harrold)
Testing Evolving Software (TCS) (with Alex Orso)
Problem: Changes
• require rapid modification and testing for quick release
• causing released software to have many defects
Research Question: How can we test well (to gain confidence in changes before release of changed software)?
MaTRIX
• Computes conditions test cases must satisfy to test changes well
Fault Propagation for Safety (Boeing)
Problem: Critical avionics systems
• now use integrated modular avionics
• making fault analysis for the entire system difficult
Research Question: How can we perform fault analysis at the system-model level and make this information accessible to developers?
FauPA
• Propagates injected faults forward to determine impact
• Traces faulty components backward to find root cause
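The two FauPA directions, forward propagation for impact and backward tracing for root cause, can be pictured as reachability over a component dependency graph. The sketch below assumes such a graph model; it is not FauPA itself, and the component names and the functions `impact_of` and `root_cause_candidates` are hypothetical.

```python
from collections import deque
from typing import Dict, List, Set

# Hypothetical model: component -> components that consume its output.
Graph = Dict[str, List[str]]


def reachable(graph: Graph, start: str) -> Set[str]:
    """Breadth-first search; returns every node reachable from start."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return seen - {start}


def reverse(graph: Graph) -> Graph:
    """Flip every edge so consumers point back at their producers."""
    rev: Graph = {}
    for src, dsts in graph.items():
        rev.setdefault(src, [])
        for dst in dsts:
            rev.setdefault(dst, []).append(src)
    return rev


def impact_of(graph: Graph, faulty: str) -> Set[str]:
    """Forward: components a fault injected at `faulty` could affect."""
    return reachable(graph, faulty)


def root_cause_candidates(graph: Graph, failing: str) -> Set[str]:
    """Backward: upstream components that could explain a failure."""
    return reachable(reverse(graph), failing)


system = {
    "sensor": ["filter"],
    "filter": ["autopilot", "display"],
    "gps": ["autopilot"],
}
impact = impact_of(system, "sensor")
causes = root_cause_candidates(system, "autopilot")
```

Plain reachability over-approximates both answers; a real analysis would presumably also use the system model to rule out paths along which a fault cannot actually propagate.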
Embedded: Correct Software Assemblies - Boeing