Revolution Confidential
Revolution R: 100% R and More
Presented by: David Smith (@revodavid), VP Marketing and Community, Revolution Analytics
Poll Question
Which stats package do you use most?
March 13, 2013: Welcome!
Thanks for coming. Slides and replay available (soon) at: http://bit.ly/YbfQo1
David Smith
VP Marketing & Community, Revolution Analytics
Editor, Revolutions blog: http://blog.revolutionanalytics.com
Twitter: @revodavid
In today’s webcast:
About Revolution Analytics and R
How Revolution R Enterprise enhances R
Resources for getting more from R
Q&A
Revolution Analytics is the leading commercial provider of software and support for the open-source R statistical computing language.
Revolution R Enterprise is:
Enterprise-ready
Multi-platform
Scalable from desktop to big data
Delivers high performance analytics
Easier to build and deploy analytic applications
What is R?
Data analysis software
  A powerful programming language
  Development platform designed by and for statisticians
A complete environment
  Huge library of algorithms for data access, data manipulation, analysis and graphics
An open-source software project
  Free, open, and active
A vibrant community
  Thousands of contributors, 2 million users
  Resources and help in every domain
Download the White Paper
R is Hot bit.ly/r-is-hot
R is exploding in popularity and functionality
Source: http://r4stats.com/popularity; “Why R is a name to know in 2011”, Forbes; number of packages is now 4,250
“A key benefit of R is that it provides near-instant availability of new and
experimental methods created by its user base — without waiting for the
development/release cycle of commercial software. SAS recognizes the value of R
to our customer base…”
Product Marketing Manager SAS Institute, Inc
“I’ve been astonished by the rate at which R has been adopted. Four years ago,
everyone in my economics department [at the University of Chicago] was using
Stata; now, as far as I can tell, R is the standard tool, and students learn it first.”
Deputy Editor for New Products at Forbes
A Vibrant R User Community
More: The R Ecosystem
bit.ly/R-ecosystem
Local R User Groups (93) Local R User Groups (102)
Revolution Analytics Scales R to the Enterprise
Revolution R Enterprise
Power: Distributed high performance analytics
Productivity: Build & deploy analytics applications easily
Enterprise Readiness: Enterprise landscape; full-service customer support, consulting and training
Revolution R Enterprise: High Performance, Multi-Platform Analytics Platform
ScaleR: High Performance Big Data Analytics
RevoR: Performance Enhanced Open Source R
Open Source R packages
ConnectR: High Speed Connectors (HDFS, HBase, ODBC, SAS)
PlatformR: Parallel Distributed Computing (IBM/Netezza, IBM/Platform LSF, MS HPC Server, MS Azure Burst)
DevelopR: Integrated Development Environment
DeployR: Web Services
Revolution R Enterprise:
Enterprise Deployment
Performance
Productivity
Big Data Analysis
Training & Consulting
Technical Support
Open Source
Performance Enhancements
Greater Productivity & Ease of Use
Tackle “Big Data”
IT-Friendly Enterprise Deployment
On-Call Experts
Revolution R Enterprise
Productivity
The standard R interface
DevelopR Integrated Development Environment
Script with type ahead and code snippets
Solutions window for organizing code and data
Packages installed and loaded
Objects loaded in the R Environment
Object details
Sophisticated debugging with breakpoints, variable values, etc.
http://www.revolutionanalytics.com/demos/revolution-productivity-environment/demo.htm
Revolution R Enterprise
Performance
Performance: Multi-threaded Math
Open Source R
Revolution R Enterprise
Computation (4-core laptop)         Open Source R   Revolution R   Speedup
Linear Algebra [1]
  Matrix Multiply                   176 sec         9.3 sec        18x
  Cholesky Factorization            25.5 sec        1.3 sec        19x
  Linear Discriminant Analysis      189 sec         74 sec         3x
General R Benchmarks [2]
  R Benchmarks (Matrix Functions)   22 sec          3.5 sec        5x
  R Benchmarks (Program Control)    5.6 sec         5.4 sec        Not appreciable
[1] http://www.revolutionanalytics.com/why-revolution-r/benchmarks.php
[2] http://r.research.att.com/benchmarks/
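As a reading aid for the benchmark table above, here is a minimal Python sketch of the arithmetic behind a speedup figure. It assumes speedup means the Open Source R time divided by the Revolution R time; the slide does not define the column. The function name `speedup` is illustrative only. Because the timings on the slide are themselves rounded, ratios recomputed this way need not match the Speedup column exactly.

```python
# Timings in seconds, copied from the benchmark table above:
# (Open Source R, Revolution R).
timings = {
    "Matrix Multiply": (176.0, 9.3),
    "Cholesky Factorization": (25.5, 1.3),
    "Linear Discriminant Analysis": (189.0, 74.0),
    "R Benchmarks (Matrix Functions)": (22.0, 3.5),
    "R Benchmarks (Program Control)": (5.6, 5.4),
}


def speedup(baseline_sec: float, improved_sec: float) -> float:
    """Return how many times faster the improved run is than the baseline.

    Assumption: speedup = baseline time / improved time.
    """
    if improved_sec <= 0:
        raise ValueError("improved_sec must be positive")
    return baseline_sec / improved_sec


if __name__ == "__main__":
    for name, (open_source_r, revolution_r) in timings.items():
        ratio = speedup(open_source_r, revolution_r)
        print(f"{name}: {ratio:.1f}x")
```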
Revolution R Enterprise
Big Data Analysis with ScaleR
RevoScaleR brings the power of Big Data to R
Distributed Statistical Algorithms: Parallel External Memory Algorithms exploit available compute resources (cores & computers) independent of platform
Communications Framework: Abstracted communications layer provides portability of code between platforms (server, cluster, or in-database)
Data Source API: Use the high-speed local data mart (XDF), or stream data from SAS, ODBC, HDFS or other remote data sources
R Language Interface: Familiar, high-productivity programming environment for R users
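To make the idea of a parallel external memory algorithm concrete, here is a minimal sketch in Python rather than R. It is not the RevoScaleR implementation; every name in it (`Summary`, `summarize_chunk`, `chunks`, `streaming_mean_variance`) is hypothetical. The data is read one block at a time, each block is reduced to a small summary, and the summaries are merged using the pairwise update of Chan et al., so the full data set never has to fit in memory. Because each block is summarized independently, blocks could in principle be handled by different cores or machines.

```python
from dataclasses import dataclass
from typing import Iterable, Iterator, List, Tuple


@dataclass
class Summary:
    """Summary of one block: count, mean, and M2 (sum of squared deviations)."""
    n: int = 0
    mean: float = 0.0
    m2: float = 0.0

    def merge(self, other: "Summary") -> "Summary":
        """Combine two block summaries without revisiting the raw rows."""
        if other.n == 0:
            return self
        if self.n == 0:
            return other
        n = self.n + other.n
        delta = other.mean - self.mean
        mean = self.mean + delta * other.n / n
        m2 = self.m2 + other.m2 + delta * delta * self.n * other.n / n
        return Summary(n, mean, m2)


def summarize_chunk(chunk: List[float]) -> Summary:
    """Reduce one in-memory block to its summary."""
    n = len(chunk)
    if n == 0:
        return Summary()
    mean = sum(chunk) / n
    m2 = sum((x - mean) ** 2 for x in chunk)
    return Summary(n, mean, m2)


def chunks(values: Iterable[float], size: int) -> Iterator[List[float]]:
    """Yield successive blocks of at most `size` values from a stream."""
    block: List[float] = []
    for value in values:
        block.append(value)
        if len(block) == size:
            yield block
            block = []
    if block:
        yield block


def streaming_mean_variance(values: Iterable[float],
                            chunk_size: int = 1000) -> Tuple[float, float]:
    """Return (mean, sample variance) while holding one block at a time."""
    total = Summary()
    for block in chunks(values, chunk_size):
        total = total.merge(summarize_chunk(block))
    variance = total.m2 / (total.n - 1) if total.n > 1 else float("nan")
    return total.mean, variance


if __name__ == "__main__":
    mean, variance = streaming_mean_variance(range(1, 11), chunk_size=3)
    print(mean, variance)
```

The merge step is what lets the work be split up: summaries can be combined in any grouping, so the number of blocks and where they are processed do not change the result beyond floating-point rounding.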
ScaleR Addresses Performance and Capacity Limitations of Open Source R
High Performance Big Data Analytics with ScaleR
Statistical Tests
Machine Learning
Simulation
Descriptive Statistics
Data Visualization
R Data Step
Predictive Models
Sampling
Revolution R Enterprise ScaleR: High Performance Big Data Analytics
Data Prep, Distillation & Descriptive Analytics
R Data Step
  Data import – Delimited, Fixed, SAS, SPSS, ODBC
  Variable creation & transformation
  Recode variables
  Factor variables
  Missing value handling
  Sort
  Merge
  Split
  Aggregate by category (means, sums)
Descriptive Statistics
  Min / Max
  Mean
  Median (approx.)
  Quantiles (approx.)
  Standard Deviation
  Variance
  Correlation
  Covariance
  Sum of Squares (cross product matrix for set variables)
  Pairwise Cross tabs
  Risk Ratio & Odds Ratio
  Cross-Tabulation of Data (standard tables & long form)
  Marginal Summaries of Cross Tabulations
Statistical Tests
  Chi Square Test
  Kendall Rank Correlation
  Fisher’s Exact Test
  Student’s t-Test
Sampling
  Subsample (observations & variables)
  Random Sampling
Revolution R Enterprise ScaleR: High Performance Big Data Analytics
Statistical Modeling
Predictive Models
  Sum of Squares (cross product matrix for set variables)
  Multiple Linear Regression
  Generalized Linear Models (GLM)
  - All exponential family distributions: binomial, Gaussian, inverse Gaussian, Poisson, Tweedie. Standard link functions including: cauchit, identity, log, logit, probit. User defined distributions & link functions.
  Covariance & Correlation Matrices
  Logistic Regression
  Classification & Regression Trees
  Predictions/scoring for models
  Residuals for all models
Data Visualization
  Histogram
  Line Plot
  Scatter Plot
  Lorenz Curve
  ROC Curves (actual data and predicted values)
Machine Learning
Cluster Analysis
  K-Means
Classification
  Decision Trees
Simulation
  Monte Carlo
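The pairing of a cross-product matrix with linear regression in the list above can be illustrated with a small Python sketch. It is an assumption-laden illustration, not ScaleR code: it handles a single predictor, and the names `accumulate_cross_products` and `fit_line` are hypothetical. Sums of x, y, x squared, and x times y are accumulated one chunk at a time, and the least-squares line is solved from those sums alone, so the raw rows can be discarded after each chunk.

```python
from typing import Iterable, List, Tuple

Row = Tuple[float, float]  # one observation: (x, y)


def accumulate_cross_products(
    chunks: Iterable[List[Row]],
) -> Tuple[int, float, float, float, float]:
    """Accumulate n, sum(x), sum(y), sum(x*x), sum(x*y) over all chunks."""
    n = 0
    sx = sy = sxx = sxy = 0.0
    for chunk in chunks:
        for x, y in chunk:
            n += 1
            sx += x
            sy += y
            sxx += x * x
            sxy += x * y
    return n, sx, sy, sxx, sxy


def fit_line(n: int, sx: float, sy: float,
             sxx: float, sxy: float) -> Tuple[float, float]:
    """Solve the one-predictor normal equations; return (intercept, slope)."""
    denom = n * sxx - sx * sx
    if n < 2 or denom == 0:
        raise ValueError("need at least two rows with distinct x values")
    slope = (n * sxy - sx * sy) / denom
    intercept = (sy - slope * sx) / n
    return intercept, slope


if __name__ == "__main__":
    # Two chunks of rows lying on the line y = 1 + 2x.
    data_chunks = [[(0.0, 1.0), (1.0, 3.0)], [(2.0, 5.0), (3.0, 7.0)]]
    intercept, slope = fit_line(*accumulate_cross_products(data_chunks))
    print(intercept, slope)
```

With several predictors the same idea applies with a full cross-product matrix in place of the five scalar sums. Production code would typically prefer a more numerically stable formulation than raw sums.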
Revolution R Enterprise
Enterprise Deployment
Create custom, on-demand analytics applications
Some examples:
On-demand sales forecasting
Real-time social media sentiment analysis
Leveraging the power of R from Microsoft tools
Revolution R Enterprise DeployR integrates R with applications
Seamless: Bring the power of R to any web-enabled application
Simple: Leverage common APIs including JS, Java, .NET
Scalable: Robustly scale user and compute workloads
Secure: Manage enterprise security with LDAP & SSO
R / Statistical Modeling Expert
DeployR
Data Analysis
Business Intelligence
Mobile Web Apps
Cloud / SaaS
Deployment Expert
Revolution R Enterprise Architecture
Use a connected MPP server or cluster for:
Data exploration
On-demand R applications
Big-data predictive models
Offline (batch) operations
Code generation for real-time deployment
ConnectR for Hadoop: Stream data from Hadoop to Revolution R Enterprise
On-Call Technical Support
Consulting: Migration | Analytics | Applications | Validation
Training: R | Revolution R | Statistical Topics
Systems Integration: BI | ERP | Databases | Cloud
Poll Question
What interests you most about Revolution R Enterprise?
Why customers choose Revolution R Enterprise
INNOVATION
MULTI-PLATFORM
TIME-to-VALUE
VALUE
Thank You!
Download slides, replay: http://bit.ly/YbfQo1
Resources for getting started with R: http://bit.ly/ZnZGt2
Get Revolution R Enterprise
Contact Sales: http://bit.ly/hey-revo
Free to Academics: www.revolutionanalytics.com/academic
We’re Hiring! www.revolutionanalytics.com/careers
Thank you.
www.revolutionanalytics.com 650.646.9545 Twitter: @RevolutionR
The leading commercial provider of software and support for the popular open source R statistics language.