Data-Driven Inference of API Mappings
Amruta Gokhale, Daeyoung Kim, Vinod Ganapathy
Department of Computer Science, Rutgers University
PROMOTO 2014

  • Slide 1
  • Data-Driven Inference of API Mappings. Amruta Gokhale, Daeyoung Kim, Vinod Ganapathy. Department of Computer Science, Rutgers University. PROMOTO 2014
  • Slide 2
  • Personal story: Change in environment is hard! From Nagpur, India to New Jersey, USA.
  • Slide 3
  • Personal story: Change in environment is hard! From Nagpur, India (45 °C) to New Jersey, USA (-10 °C).
  • Slide 4
  • Mobile app for a single platform: iPhone app
  • Slide 5
  • Challenge: Porting apps across multiple mobile platforms. The iPhone app has to become an Android app, a Windows Phone app, and a BlackBerry 10 app.
  • Slide 6
  • Porting assistance. Porting to Windows Phone: developer guides for porting; discussion forums on porting.
  • Slide 7
  • Challenges in porting apps: different SDKs for app development; different programming languages; different development environments; different debugging aids. Every mobile platform exposes its own programming API.
    Platform | Language | Development tools
    Android | Java | Eclipse
    iOS | Objective-C | Xcode
    Windows Phone | C# | Visual Studio
  • Slide 8
  • Using API documentation to write an app (iPhone app)
    iOS class | iOS method | Description
    CGGeometry | CGRect CGRectMake(CGFloat x, y, width, height) | Returns a rectangle with the specified coordinate and size values.
    CGGeometry | bool CGRectContainsPoint(CGRect rect, CGPoint point) | Returns whether a rectangle contains a specified point.
  • Slide 9
  • Using API documentation to write an app (Android app, Android phone)
    Android class | Android method | Description
    android.graphics | void drawRect(Rect r, Paint paint) | Draws the specified Rect using the specified Paint.
    android.graphics | bool contains(int x, int y) | Returns true if (x, y) is inside the rectangle.
  • Slide 10
  • Can we do better than searching API documentation for each new platform?
  • Slide 11
  • APIs often have similar functionality
    Android class name | Android method name
    android.graphics | void drawRect(Rect r, Paint paint)
    android.graphics | bool contains(int x, int y)
    iOS class name | iOS method name
    CGGeometry | CGRect CGRectMake(CGFloat x, y, width, height)
    CGGeometry | bool CGRectContainsPoint(CGRect rect, CGPoint point)
  • Slide 12
  • API mapping databases: API mapping databases map methods in a source API to methods in a target API.
    iOS method | Android method
    CGGeometry.CGRectMake() | android.graphics.drawRect()
    CGGeometry.CGRectContainsPoint() | android.graphics.contains()
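A mapping database of this kind can be pictured as a lookup table from source methods to candidate target methods. The sketch below is only an illustration: the class name `ApiMappingDatabase` and its methods are hypothetical and do not come from the slides; the two entries are the ones shown on the slide.

```python
from collections import defaultdict


class ApiMappingDatabase:
    """Hypothetical in-memory store: source-API method -> candidate target-API methods."""

    def __init__(self):
        self._targets = defaultdict(list)

    def add(self, source_method, target_method):
        # Keep insertion order and avoid duplicate entries.
        if target_method not in self._targets[source_method]:
            self._targets[source_method].append(target_method)

    def lookup(self, source_method):
        # .get() avoids creating an empty entry for an unknown method.
        return list(self._targets.get(source_method, []))


db = ApiMappingDatabase()
db.add("CGGeometry.CGRectMake", "android.graphics.drawRect")
db.add("CGGeometry.CGRectContainsPoint", "android.graphics.contains")
```

A developer porting an iOS app would query `db.lookup("CGGeometry.CGRectContainsPoint")` instead of searching the Android documentation by hand.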
  • Slide 13
  • Platform APIs ~ Natural languages: source API ~ unknown source language; target API ~ unknown target language
  • Slide 14
  • Word mappings: English language text and Spanish language text -> NLP toolkit -> word mappings
    English word | Spanish word
    north | norte
    exit | salida
  • Slide 15
  • Mappings between English and Spanish words
    English: enlargement, society, state, control, importance
    Spanish: amplificacion, estado, sociedad, importancia, control
  • Slide 16
  • API method mappings: iOS API methods text and Android API methods text -> NLP toolkit -> API mappings
    iPhone API method | Android API method
    CGRectMake | drawRect
    CGRectContainsPoint | contains
  • Slide 17
  • iOS and Android API method mappings
    iOS: CGRectGetHeight, CGRectGetWidth, CGRectMake, CGRectContainsPoint, CGContextFillRect
    Android: height, drawRect, width, setStyle, contains
  • Slide 18
  • API mapping tools: windowsphone.interoperabilitybridges.com/porting provides API mappings from Android and iPhone to Windows Phone.
  • Slide 19
  • Creating API mapping databases: Mapping databases are populated manually by domain experts. This is painstaking, error-prone, and expensive. It is hard to evolve API mapping databases as the corresponding APIs evolve.
  • Slide 20
  • Our contribution: We propose to automatically create API mapping databases. Prototyped in a tool called DDR (Data-Driven Rosetta). Creates mappings between the iOS API and the Android API. Leverages an NLP approach to identify likely API mappings.
  • Slide 21
  • Workflow of DDR
    Source apps -> source program path extraction -> source program paths (e.g. GetWidth(), GetHeight(), RectMake())
    Target apps -> target program path extraction -> target program paths (e.g. height(), width(), setStyle(), drawRect())
    Source and target program paths -> NLP inference engine -> output mappings
    Source method | Target method | PR
    CGRectMake() | drawRect() | 0.60
    GetWidth() | width() | 0.45
  • Slide 22
  • Program path extraction: mobile app binary -> disassembler -> intermediate code representation -> control flow graph constructor -> control flow graph -> program path extractor -> program paths
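The last stage of this pipeline can be sketched as a walk over a control flow graph that records the API calls seen along each path. The slides do not say how DDR enumerates paths, so everything here is an assumption: the function `extract_paths`, the dictionary encoding of the graph, and the choice to follow only acyclic paths are all hypothetical.

```python
def extract_paths(cfg, calls, entry, max_paths=1000):
    """Enumerate acyclic paths from `entry` and return the API-call sequence of each.

    cfg   : dict mapping a basic-block id to a list of successor block ids
    calls : dict mapping a basic-block id to the API methods invoked in it
    A path ends at a block that has no successor not already on the path,
    so loop back-edges are never followed (an assumption of this sketch).
    """
    paths = []
    stack = [(entry, (entry,))]
    while stack and len(paths) < max_paths:
        node, visited = stack.pop()
        successors = [s for s in cfg.get(node, []) if s not in visited]
        if not successors:
            # End of path: flatten the API calls of every block on it.
            paths.append([c for block in visited for c in calls.get(block, [])])
            continue
        for s in successors:
            stack.append((s, visited + (s,)))
    return paths


# Toy graph with one branch; the method names are taken from the slides.
cfg = {"entry": ["then", "else"], "then": ["exit"], "else": ["exit"], "exit": []}
calls = {
    "entry": ["CGRectMake"],
    "then": ["CGRectGetWidth"],
    "else": ["CGRectGetHeight"],
    "exit": ["CGContextFillRect"],
}
paths = extract_paths(cfg, calls, "entry")
```

Each resulting path plays the role of a "sentence" whose "words" are API methods, which is what lets a bilingual-lexicon technique be applied to it.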
  • Slide 23
  • NLP inference engine: Matching Canonical Correlation Analysis (MCCA) [ACL '08*]. 1. Define a generative model. 2. Inference on the model is done via the Expectation-Maximization (EM) algorithm. * Learning Bilingual Lexicons from Monolingual Corpora, Haghighi et al., ACL '08
  • Slide 24
  • Generative model. Components: source feature extraction (source word features), target feature extraction (target word features), seed mappings, generative model.
  • Slide 25
  • Generative model. Features computed from individual languages: 1. Frequency of words. 2. Substring properties. 3. Context counts. Features form the observed data, explained via a generative process.
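The three feature families listed above can be computed per API method from the extracted program paths. This is a minimal sketch under stated assumptions: the function `method_features`, the one-step context window, and the use of character trigrams for "substring properties" are illustrative choices, not details given in the slides.

```python
from collections import Counter


def method_features(paths, window=1, n=3):
    """Compute frequency, substring, and context-count features for each method.

    paths  : list of API-call sequences (lists of method names)
    window : how many neighbors on each side count as context (assumed value)
    n      : length of the character n-grams used as substring features (assumed value)
    """
    freq = Counter()
    context = {}
    for path in paths:
        for i, method in enumerate(path):
            freq[method] += 1
            ctx = context.setdefault(method, Counter())
            lo, hi = max(0, i - window), min(len(path), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    ctx[path[j]] += 1
    features = {}
    for method in freq:
        name = method.lower()
        ngrams = {name[k:k + n] for k in range(len(name) - n + 1)}
        features[method] = {
            "frequency": freq[method],
            "substrings": ngrams,
            "context": context[method],
        }
    return features


paths = [
    ["CGRectMake", "CGRectGetWidth", "CGContextFillRect"],
    ["CGRectMake", "CGRectGetHeight", "CGContextFillRect"],
]
feats = method_features(paths)
```

Running the same function over the Android paths gives the target-side features, so both APIs are described in the same feature space before inference.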
  • Slide 26
  • Generative model: relating a pair of mapped methods. The source method features (CGRectMake) and the target method features (drawRect) are generated from a common, hidden concept.
  • Slide 27
  • Inference algorithm. E-step: find the maximum weighted (partial) bipartite matching. M-step: find the best parameters of the model by performing canonical correlation analysis (CCA).
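The E-step can be illustrated on a toy example. The sketch below finds a maximum weighted partial bipartite matching by exhaustive search, which is exact but only practical for a handful of methods; a real system would use a polynomial-time matching algorithm. The function name `best_partial_matching` is hypothetical, and apart from the 0.60 and 0.45 weights shown on the workflow slide, the edge weights are made up for illustration. The M-step (CCA) is not shown.

```python
def best_partial_matching(weights):
    """Exhaustively find a maximum weighted partial bipartite matching.

    weights : dict mapping (source, target) pairs to edge weights
    Returns (total_weight, {source: target}). A source may stay unmatched,
    and each target is used at most once.
    """
    sources = sorted({s for s, _ in weights})

    def search(i, used):
        if i == len(sources):
            return 0.0, {}
        # Option 1: leave sources[i] unmatched.
        best_total, best_match = search(i + 1, used)
        s = sources[i]
        # Option 2: match sources[i] to any free target with positive weight.
        for (src, tgt), w in weights.items():
            if src == s and tgt not in used and w > 0:
                total, match = search(i + 1, used | {tgt})
                if total + w > best_total:
                    best_total, best_match = total + w, {**match, s: tgt}
        return best_total, best_match

    return search(0, frozenset())


# Hypothetical edge weights; only 0.60 and 0.45 appear in the slides.
weights = {
    ("CGRectMake", "drawRect"): 0.60,
    ("CGRectMake", "width"): 0.10,
    ("CGRectGetWidth", "width"): 0.45,
    ("CGRectGetWidth", "drawRect"): 0.50,
    ("CGRectGetHeight", "height"): 0.40,
}
total, matching = best_partial_matching(weights)
```

In the full algorithm the matching found here would feed the M-step, whose re-estimated parameters produce new edge weights for the next E-step.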
  • Slide 28
  • Our modifications to the inference algorithm. String similarity function: method names instead of method signatures. Output: a list of the top 10 mappings, sorted in decreasing order of edge weights.
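Both modifications can be sketched in a few lines. The slides do not name the string similarity function, so `difflib.SequenceMatcher` from the Python standard library is used here purely as a stand-in; the helper names `method_name`, `name_similarity`, and `top_mappings` are hypothetical.

```python
from difflib import SequenceMatcher


def method_name(signature):
    """Reduce a signature such as 'void drawRect(Rect r, Paint paint)' to 'drawRect'."""
    return signature.split("(")[0].split()[-1]


def name_similarity(a, b):
    """Similarity in [0, 1] computed on method names only, ignoring types and parameters."""
    return SequenceMatcher(None, method_name(a).lower(), method_name(b).lower()).ratio()


def top_mappings(edges, k=10):
    """Return the k highest-weight (source, target, weight) edges, best first."""
    return sorted(edges, key=lambda edge: edge[2], reverse=True)[:k]


edges = [
    ("CGRectGetWidth", "width", 0.45),
    ("CGRectMake", "drawRect", 0.60),
]
ranked = top_mappings(edges)
```

Comparing names rather than full signatures keeps parameter and return types, which differ systematically between Objective-C and Java, from dominating the similarity score.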
  • Slide 29
  • Implementation: Collected 50 Android apps and 50 iOS apps; 3,414 unique iOS API methods; 2,229 unique Android API methods. Evaluation in progress!
  • Slide 30
  • Conclusion: It is becoming increasingly important to port apps to a variety of platforms. Key challenge: different platforms use different programming APIs. API mapping databases help, but they are created manually by domain experts. We presented a methodology to automate the creation of API mapping databases.