R and Cloud Computing for Higher Education and Research Cloud Era Ltd 9 July 2013 [email protected] Karim Chine UseR 2013 - Tutorial

Use r 2013 tutorial - r and cloud computing for higher education and research

Page 1: Use r 2013   tutorial - r and cloud computing for higher education and research

R and Cloud Computing for Higher Education and Research

Cloud Era Ltd 9 July 2013 [email protected]

Karim Chine

UseR 2013 - Tutorial

Page 2: Use r 2013   tutorial - r and cloud computing for higher education and research

Outline

Introduction
Rethinking virtual research and teaching
Elastic-R: Towards a universal platform for data science
Elastic-R: Design and technologies overview
Elastic-R: The scriptability framework
Demo
Conclusion

Page 3: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

Science and the 4th paradigm

Experimental Science
Theoretical Science
Computational Science
e-Science / Data-intensive Science

The e-Research dancing bears

The townspeople gather to see the wondrous sight as the massive, lumbering beast shambles and shuffles from paw to paw. The bear is really a terrible dancer, and the wonder isn't that the bear dances well but that the bear dances at all.

Page 4: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

The rise of data science

Data science incorporates varying elements and builds on techniques and theories from many fields, including math, statistics, data engineering, pattern recognition and learning, advanced computing, visualization, uncertainty modeling, data warehousing, and high performance computing, with the goal of extracting meaning from data and creating data products. *

* http://en.wikipedia.org/wiki/Data_science

Page 5: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

www.scilab.org

http://root.cern.ch

www.sagemath.org

www.sas.com

office.microsoft.com

www.mathworks.com

www.scipy.org

www.spss.com

www.wolfram.com

www.python.org

Open-source (GPL) software environment for statistical computing and graphics

Lingua franca of data analysis. Repositories of contributed R packages related to a variety of problem domains in life sciences, social sciences, finance, econometrics, chemometrics, etc. are growing at an exponential rate.

R is Super Glue

Fragmentation and friction in the data science arena

Page 6: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

Arduino / Raspberry Pi: Democratizing electronics

Elastic-R: Democratizing data science

The Next Generation Data Science Platform

Page 7: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

The cloud and its capabilities

5 Essential Characteristics

On-demand self-service
Rapid elasticity
Resource pooling
Measured service
Broad network access

Page 8: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

The cloud and its capabilities

Page 9: Use r 2013   tutorial - r and cloud computing for higher education and research

Introduction

The cloud and its capabilities

3D printers are becoming commonplace: creating a three-dimensional solid object of virtually any shape from a digital model.

Scripting the physical world: sharing physical reality on Facebook is now as easy as sharing your holiday pictures.

Page 10: Use r 2013   tutorial - r and cloud computing for higher education and research

Rethinking virtual research and teaching

Free researchers from their IT services dictatorship. Give them self-service access to the IT resources they need.

Make real-time collaboration a free, reliable and ubiquitous service.

Allow researchers to share without restrictions. Make the cloud become an ecosystem for Open Science where all research artifacts can be produced, discovered and reused.

Allow researchers to produce and publish to the web advanced applications/services without recourse to developers/admins.

Page 11: Use r 2013   tutorial - r and cloud computing for higher education and research

Rethinking virtual research and teaching

Provide platforms for Science-as-a-Service.

Allow researchers to "sell" the software/models/algorithms/techniques they invent seamlessly: create a marketplace for data science artifacts and applications.

Provide capabilities for making data analysis and computational research traceable and reproducible.

Bridge the gap between the different computational research tools: interconnect SCEs, workflow workbenches, document editors…

Page 12: Use r 2013   tutorial - r and cloud computing for higher education and research

Rethinking virtual research and teaching

Provide affordable and reliable tools for remote education:
  High-quality voice and video chat for a large number of users
  Self-service collaboration tools: editors, whiteboards, IDEs, etc.
  Modules for traceability/reproducibility

Extend existing online course platforms to include capabilities such as:
  Companion software environments in SaaS mode
  Collaborative problem-solving tools
  Interactive courses
  Tokens for ready-to-run e-Learning applications
  Visual designers for e-Learning environments

Page 13: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Towards a universal platform for data science

Computational Components: R packages; wrapped C, C++, Fortran code; Python modules; Matlab toolkits… Open source or commercial.

Computational Resources: Clusters, grids, private or public clouds. Free or pay-per-use.

Computational GUIs: HTML5 and Desktop Workbench. Built-in views / Plugins / Collaborative views. Open source or commercial.

Computational Scripts: R / Python / Matlab / Groovy

Computational APIs: Java / SOAP / REST, stateless and stateful

Computational Storage: Local, NFS, FTP, Amazon S3, EBS

Generated Computational Web Services: Stateful or stateless, mapping of R objects/functions

Elastic-R

Page 14: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Towards a universal platform for data science

Robot submarine dives to the deepest part of the ocean, controlled by a 7-mile cable as thin as a single human hair

Page 15: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Towards a universal platform for data science

Page 16: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Towards a universal platform for data science

Public Clouds

Private Cloud

Page 17: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Design and technologies overview

You are here

Your data is here

Page 18: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Design and technologies overview

Remote Java/R Processes

Events-driven Remote Objects/Engines: R, Python, Mathematica, Matlab, Scilab, ...

Collaborative Spreadsheets

Collaborative Scientific Graphics Canvas

Collaborative Dashboard with collaborative widgets

Page 19: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Design and technologies overview

Elastic-R Amazon Machine Images

Elastic-R AMI 1: R 2.10, BioC 2.5
Elastic-R AMI 2: R 2.9, BioC 2.3
Elastic-R AMI 3: R 2.8, BioC 2.0

Amazon Elastic Block Stores

Elastic-R EBS 1: Data Set XXX
Elastic-R EBS 2: Data Set YYY
Elastic-R EBS 3: Data Set ZZZ
Elastic-R EBS 4: Data Set VVV

Elastic-R.org

Elastic-R AMI 2 (R 2.9, BioC 2.3) + Elastic-R EBS 4 (Data Set VVV)
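
The diagram appears to show one machine image (a fixed R/BioC software stack) being combined with one block-store volume (a data set) to form a working engine. A minimal sketch of that pairing idea follows. It is an illustration only: `EnginePairingSketch`, `MachineImage`, `DataVolume` and `describePairing` are hypothetical names invented for this example, the catalogue values are copied from the slide, and nothing here calls AWS or any Elastic-R API.

```java
import java.util.Arrays;
import java.util.List;

class EnginePairingSketch {

    /** Hypothetical description of a machine image: a label plus the R and BioC versions it carries. */
    static final class MachineImage {
        final String name;
        final String rVersion;
        final String biocVersion;

        MachineImage(String name, String rVersion, String biocVersion) {
            this.name = name;
            this.rVersion = rVersion;
            this.biocVersion = biocVersion;
        }
    }

    /** Hypothetical description of a block-store volume: a label plus the data set it holds. */
    static final class DataVolume {
        final String name;
        final String dataSet;

        DataVolume(String name, String dataSet) {
            this.name = name;
            this.dataSet = dataSet;
        }
    }

    // Catalogue entries copied from the slide.
    static final List<MachineImage> IMAGES = Arrays.asList(
            new MachineImage("Elastic-R AMI 1", "2.10", "2.5"),
            new MachineImage("Elastic-R AMI 2", "2.9", "2.3"),
            new MachineImage("Elastic-R AMI 3", "2.8", "2.0"));

    static final List<DataVolume> VOLUMES = Arrays.asList(
            new DataVolume("Elastic-R EBS 1", "XXX"),
            new DataVolume("Elastic-R EBS 2", "YYY"),
            new DataVolume("Elastic-R EBS 3", "ZZZ"),
            new DataVolume("Elastic-R EBS 4", "VVV"));

    /** Finds the image with the requested R version and the volume with the requested data set. */
    static String describePairing(String rVersion, String dataSet) {
        for (MachineImage image : IMAGES) {
            if (!image.rVersion.equals(rVersion)) {
                continue;
            }
            for (DataVolume volume : VOLUMES) {
                if (volume.dataSet.equals(dataSet)) {
                    return image.name + " (R " + image.rVersion + ", BioC " + image.biocVersion
                            + ") + " + volume.name + " (Data Set " + volume.dataSet + ")";
                }
            }
        }
        throw new IllegalArgumentException(
                "No image/volume pair for R " + rVersion + " and data set " + dataSet);
    }

    public static void main(String[] args) {
        // Reproduces the pairing shown on the slide: AMI 2 with EBS 4.
        System.out.println(describePairing("2.9", "VVV"));
    }
}
```

Keeping the software stack and the data set as separate, independently chosen pieces is what would let the same data be reopened later under the exact R and BioC versions used originally.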

Page 20: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: Design and technologies overview

Modes of access

Individuals with AWS accounts
  Standard AMIs (Amazon Machine Images): paid-per-use to Amazon. For academic use and trial purposes.
  Paid AMI: paid-per-use software model. For business users.

Individuals without AWS accounts
  Trial tokens, purchased tokens, tokens granted by other users
  Resources (data science engines) shared by other users
  Individual subscriptions

Companies/Educational & Research Institutions
  Dedicated platform and AMIs on an Amazon VPC (Virtual Private Cloud). Paid via subscription.

Page 21: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: The scriptability framework

Command Line

Web Console

SDK

API

Page 22: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: The scriptability framework

Command Line

Web Console

SDK

API

Page 23: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: The scriptability framework

WS generator

rws.war
+ mapping.jar
+ pooling framework
+ R Java Bridge
+ JAX-WS
- Servlets
- Generated artifacts

Script / globals.r

    square <- function(x) {
      return(x^2)
    }
    typeInfo(square) <- SimultaneousTypeSpecification(TypedSignature(x = "numeric"), returnType = "numeric")

Script / rjmap.xml

    <rj>
      <publish>
        <functions>
          <function name="square" forWeb="true"/>
        </functions>
      </publish>
      <scripts>
        <initScript name="globals.r" embed="true"/>
      </scripts>
    </rj>

Deploy

Eclipse Web Service Client Generator

    public static void main(String[] args) throws Exception {
        // Obtain the generated port for the published R functions.
        RGlobalEnvFunctionWeb g = new RGlobalEnvFunctionWebServiceLocator().getrGlobalEnvFunctionWebPort();
        // Wrap the input as an R numeric vector of length one.
        RNumeric x = new RNumeric();
        x.setValue(new Double[]{6.0});
        // Call the remote R function square() and print the first element of the result.
        System.out.println(g.square(x).getValue()[0]);
    }
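
The round trip on this slide (an R function published as a web service, then called from a generated Java client) can be mimicked locally without a deployed service. The sketch below is only an illustration under stated assumptions: `RNumericStandIn` and `SquareServiceStandIn` are hypothetical stand-ins written for this example, not the classes produced by the WS generator or by Eclipse. They copy only the `setValue`/`getValue` shape visible in the client code, and `square` reimplements the R expression `x^2` element-wise in plain Java.

```java
class SquareRoundTripSketch {

    /** Hypothetical stand-in for the generated RNumeric wrapper: holds a numeric vector. */
    static final class RNumericStandIn {
        private Double[] value = new Double[0];

        Double[] getValue() {
            return value;
        }

        void setValue(Double[] value) {
            this.value = value;
        }
    }

    /** Hypothetical stand-in for the remote port: applies x^2 to each element, as square() in globals.r does. */
    static final class SquareServiceStandIn {
        RNumericStandIn square(RNumericStandIn x) {
            Double[] in = x.getValue();
            Double[] out = new Double[in.length];
            for (int i = 0; i < in.length; i++) {
                out[i] = in[i] * in[i];
            }
            RNumericStandIn result = new RNumericStandIn();
            result.setValue(out);
            return result;
        }
    }

    public static void main(String[] args) {
        // Same call sequence as the generated client, but against the local stand-in.
        SquareServiceStandIn g = new SquareServiceStandIn();
        RNumericStandIn x = new RNumericStandIn();
        x.setValue(new Double[]{6.0});
        System.out.println(g.square(x).getValue()[0]); // prints 36.0
    }
}
```

The typed wrapper matters because the declared signature (`x = "numeric"`, `returnType = "numeric"`) is what allows a strongly typed client to be generated; an untyped R function would give the generator nothing to map.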

Page 24: Use r 2013   tutorial - r and cloud computing for higher education and research

Elastic-R: The scriptability framework

API: SOAP and RESTful Web Services
  Data analysis engines control
  Data analysis engines management (life cycle, etc.)
  Virtual appliances/artifacts management
  Platform administration
  Generated web services from R functions

SDKs: Java, R, Microsoft Office (VBA)

Command line: elasticR package

HTML5 Workbench
  Elastic-R artifacts management interface
  Engines control interface

Page 25: Use r 2013   tutorial - r and cloud computing for higher education and research

Demo

Register on the Elastic-R academic and trial portal (www.elastic-r.org)

Create data science engines using trial tokens

Work with R, Python and scientific spreadsheets in the browser

Share a data science engine and collaborate

Use the Visual and Collaborative Scientific Applications designer to create and publish to the web an interactive dashboard

Connect to the remote data science engine from within a local R session, push and pull data, execute commands and show the impact on the dashboard

Page 26: Use r 2013   tutorial - r and cloud computing for higher education and research

Conclusion

Elastic-R unlocks the potential of the cloud for data scientists and educators.

With Elastic-R, the cloud becomes a cyberspace for collaborative research and sharing, and an ecosystem suited for open science, open innovation and open education.

Elastic-R dramatically improves the productivity of data scientists: the entire data science factory chain, from resource acquisition to publishing services and applications, comes under their direct control.

Elastic-R provides an Analytics-as-a-Service platform that can extend any existing portal or application.

Page 27: Use r 2013   tutorial - r and cloud computing for higher education and research

What to do next

Register on Elastic-R and try the HTML5 Workbench and the collaboration features.

Download the R package elasticR and use it to access the cloud from local R sessions.

Download the Java SDK and try to create your first analytical application using AWS and the most advanced tools for programming with data.

Get in touch with me to explore potential collaborations.

Page 28: Use r 2013   tutorial - r and cloud computing for higher education and research

Contact details

Karim Chine
[email protected]