SIGOPS European Workshop 2002 1
The Case for Cyber ForagingThe Case for Cyber ForagingRajesh Krishna BalanRajesh Krishna BalanCarnegie Mellon UniversityCarnegie Mellon University
Joint work with:Joint work with: J. Flinn J. Flinn (Univ. Of Michigan)(Univ. Of Michigan), ,
M. Satyanarayanan M. Satyanarayanan (CMU)(CMU), , S. Sinnamohideen S. Sinnamohideen (CMU)(CMU) , , H. Yang H. Yang (CMU)(CMU)
2
2 Broad class of applications2 Broad class of applications Personal Productivity ApplicationsPersonal Productivity Applications
– EmailEmail– CalendarCalendar
Computationally Intensive Interactive Computationally Intensive Interactive ApplicationsApplications– Speech RecognitionSpeech Recognition– Language TranslationLanguage Translation– Augmented RealityAugmented Reality
3
Motivation: Handhelds are weak!Motivation: Handhelds are weak!- Resource - Resource
intensiveintensiveAppApp
- Huge Data - Huge Data SetsSets
2 GHz, 1 GB,2 GHz, 1 GB,3-D graphics3-D graphics2 GB of data2 GB of data
Resource-poorResource-poorwearablewearable
200 MHz, 32 MB, 200 MHz, 32 MB, no 3-D, no FPUno 3-D, no FPU
32 MB Flash32 MB Flash
Poor performance!Poor performance!
4
Solution: Cyber ForagingSolution: Cyber Foraging
““To live off the land”To live off the land” Use resources in Use resources in
environment to environment to augment device augment device capabilities by using capabilities by using surrogatessurrogates
2 methods 2 methods – Data StagingData Staging– Remote ExecutionRemote Execution
5
The Big PictureThe Big Picture Data Staging Data Staging
– Caching of large amounts of data Caching of large amounts of data – Handhelds with limited storage can access this data fastHandhelds with limited storage can access this data fast– Security and authenticationSecurity and authentication
Remote ExecutionRemote Execution– Uses remote servers to augment computational Uses remote servers to augment computational
capabilities of handheldscapabilities of handhelds– Enables computationally intensive applicationsEnables computationally intensive applications
Service DiscoveryService Discovery– Discover servers used by previous two mechanismsDiscover servers used by previous two mechanisms
6
RoadmapRoadmap Data StagingData Staging Remote ExecutionRemote Execution Service DiscoveryService Discovery
7
Data Staging: MotivationData Staging: Motivation End-to-end latency across the Internet isn’t getting betterEnd-to-end latency across the Internet isn’t getting better
– Physical limitsPhysical limits– Routers, firewallsRouters, firewalls– Shows up in interactive file access delaysShows up in interactive file access delays– Crucial for small to medium filesCrucial for small to medium files
Can overcome this by caching & prefetching, but …Can overcome this by caching & prefetching, but …– Handheld clients don’t have enough resourcesHandheld clients don’t have enough resources– Cache consistency Cache consistency
Can untrusted and unmanaged computers help?Can untrusted and unmanaged computers help?
Yes!!Yes!!
8
Data staging: MechanismData staging: Mechanism
StagingServer
High-bandwidth, high-latencyLow-latency RF
Distant server
Coda clients speculatively prefetch data :Coda clients speculatively prefetch data :– Nearby Nearby surrogatesurrogate runs runs staging serverstaging server– Used like a second level cacheUsed like a second level cache– Cache misses serviced by staging serverCache misses serviced by staging server
Surrogates deployed in high-usage areasSurrogates deployed in high-usage areas
9
SecuritySecurity Must provide level of security users expectMust provide level of security users expect But surrogate is untrustedBut surrogate is untrusted
Use end-to-end encryptionUse end-to-end encryption– Only store encrypted data on surrogateOnly store encrypted data on surrogate– Client caches keys and checksumsClient caches keys and checksums– Only need access control for keysOnly need access control for keys
10
Ease of managementEase of management Surrogate is simple to manage and deploySurrogate is simple to manage and deploy
– No per-client persistent state on surrogateNo per-client persistent state on surrogate– Consistency maintained by clientConsistency maintained by client– Minimalistic set of operationsMinimalistic set of operations– Uses commodity software (Apache)Uses commodity software (Apache)
No modifications to base filesystemNo modifications to base filesystem– Proxy based approachProxy based approach– Proxy encapsulates FS specific codeProxy encapsulates FS specific code
11
The “Gory” DetailsThe “Gory” Details
FileFileServerServer
Data PumpData Pump StagingStagingServerServer
FileFileSystemSystemPr
oxy
Prox
y
Normal file trafficNormal file traffic
File dataFile data
Encrypted file Encrypted file datadata
Staged file Staged file traffictraffic
Handheld clientHandheld client
SurrogateSurrogate
ServerServer
File keys
File keys
(via secure channel
(via secure channel))
12
Benefit for image viewingBenefit for image viewing
0
500
1000
1500
2000
2500
Cold / No Staging Cold / Staging Warm / No Staging Warm / Staging
Cum
ulat
ive
Del
ay (s
ec.)
Data staging reduces cumulative delay up to 64%
Compaq iPaq, 16MB file cache, 60ms latency to serverCompaq iPaq, 16MB file cache, 60ms latency to server 3ms latency to surrogate 3ms latency to surrogate
13
RoadmapRoadmap Data StagingData Staging Remote ExecutionRemote Execution Service DiscoveryService Discovery
14
Motivation: mobile interactive Motivation: mobile interactive applicationsapplications
speech recognition, language translation, speech recognition, language translation, augmented reality, …augmented reality, …– Resource-heavy, but need bounded response timeResource-heavy, but need bounded response time
Columbia U. MARS projectColumbia U. MARS project
15
Solution: Remote ExecutionSolution: Remote Execution Augment capabilities of handhelds by using nearby Augment capabilities of handhelds by using nearby
serversservers
But how do you make legacy applications use But how do you make legacy applications use remote execution?remote execution?
And get good performance as well?And get good performance as well?
16
Strawman SolutionStrawman Solution
Heavily modify each application to use Heavily modify each application to use remote executionremote execution– Tweak every last drop of performanceTweak every last drop of performance
Requires ~3- 4 grad student months per Requires ~3- 4 grad student months per reasonably sized applicationreasonably sized application– Grad students have nothing else to do anyway Grad students have nothing else to do anyway
right?? right?? Method does not scale and is not agileMethod does not scale and is not agile
17
Solution: TacticsSolution: Tactics Concise description of application’s remote execution Concise description of application’s remote execution
capabilitiescapabilities– Only the useful remote partitions are describedOnly the useful remote partitions are described– Can be captured in a compact declarative formCan be captured in a compact declarative form– Allows use of stub generators to ease Allows use of stub generators to ease
programming burdenprogramming burden
Tradeoff between code migration and static Tradeoff between code migration and static partitioningpartitioning
18
Example TacticExample TacticAPPLICATIONAPPLICATION pangloss-lite;pangloss-lite;
/* RPC Specifications *//* RPC Specifications */RPC server_dict (IN string line, OUT string dict_out);RPC server_dict (IN string line, OUT string dict_out);RPC server_ebmt (IN string line, OUT string ebmt_out);RPC server_ebmt (IN string line, OUT string ebmt_out);RPC server_lm (IN string gloss_out, IN string dict_out, RPC server_lm (IN string gloss_out, IN string dict_out, IN IN
string ebmt_out, OUT string translation);string ebmt_out, OUT string translation);
/* Tactics (Useful Ways to Combine the RPCs) *//* Tactics (Useful Ways to Combine the RPCs) */TACTIC dict = server_dict & server_lm;TACTIC dict = server_dict & server_lm;TACTIC ebmt = server_ebmt & server_lm;TACTIC ebmt = server_ebmt & server_lm;TACTIC dict_ebmt = (server_dict, server_ebmt) & server_lm;TACTIC dict_ebmt = (server_dict, server_ebmt) & server_lm;
19
ChromaChroma
20
But is it FAST??But is it FAST?? Yes!!Yes!! Tactics capture useful remote partitions of Tactics capture useful remote partitions of
applicationapplication– No time spent looking at non-useful partitionsNo time spent looking at non-useful partitions
Performance is comparable to hand-modified Performance is comparable to hand-modified applicationapplication– Constant overhead added by runtime to best Constant overhead added by runtime to best
statically decided casestatically decided case– Overhead is low and reasonableOverhead is low and reasonable
21
Adjusting to EnvironmentAdjusting to Environment Availability of compute servers varies wildlyAvailability of compute servers varies wildly
– Mobility Mobility Cannot expect just one situation Cannot expect just one situation
Tactics allow us to automatically use extra Tactics allow us to automatically use extra resources in environmentresources in environment– Without modifying the applicationWithout modifying the application
Resource Poor Resource Rich(Smart Rooms etc.)
22
Tactics: Using Extra ServersTactics: Using Extra Servers
Still able to get performance improvements with loaded servers!!
23
RoadmapRoadmap Data StagingData Staging Remote ExecutionRemote Execution Service DiscoveryService Discovery
24
One Ring to Bind Them All?One Ring to Bind Them All?
Abundance of service discovery protocolsAbundance of service discovery protocols Every site has its own favouriteEvery site has its own favourite
– Mobile applications cannot assume any particular mechanismMobile applications cannot assume any particular mechanism We built a middleware service toWe built a middleware service to
– Leverage existing mechanismsLeverage existing mechanisms– Isolate applications from changes in underlying mechanismIsolate applications from changes in underlying mechanism
uPnPJini
SLP
Cooltown Squirt
Salutation
25
Current WorkCurrent Work Data Staging:Data Staging:
– Proxies for other file systems – e.g. NFSProxies for other file systems – e.g. NFS– Cache management policiesCache management policies– Prediction mechanismsPrediction mechanisms
Remote Execution:Remote Execution:– Creation of resource allocation policiesCreation of resource allocation policies– Adapting software engineering methods to ease Adapting software engineering methods to ease
development timesdevelopment times
26
ConclusionsConclusions Cyber Foraging is a vision of mobile Cyber Foraging is a vision of mobile
computingcomputing Presented two methods of doing itPresented two methods of doing it
– Data StagingData Staging– Remote ExecutionRemote Execution
27
Tactics Don’t HurtTactics Don’t Hurt
28
Middleware Service (VERSUDS)Middleware Service (VERSUDS) Virtual layer on top of Virtual layer on top of
existing service existing service discovery mechanismsdiscovery mechanisms– Supports common Supports common
functionalityfunctionality Provides standard API Provides standard API
to applicationsto applications Provides isolation to Provides isolation to
mobile applications mobile applications
29
Related WorkRelated Work Data StagingData Staging
– OceanstoreOceanstore– Peer to Peer Systems (FreeNet, Gnutella …)Peer to Peer Systems (FreeNet, Gnutella …)– CDNs (Akamai)CDNs (Akamai)
Remote ExecutionRemote Execution– Odyssey, Spectra, Puppeteer, Abacus …Odyssey, Spectra, Puppeteer, Abacus …– Declarative languages (4GLs, Little languages)Declarative languages (4GLs, Little languages)– Corba, Java RMICorba, Java RMI