IBM commissioned Forrester Consulting to conduct a study
The topic: How are IT managers and data scientists using open source platforms like Apache Spark to create predictive models?
What do they need to build and deploy these models in real time?
What challenges do they face?
IT Managers and Data Scientists are concerned about:
The amount of data to analyze
Security while transferring data
The time spent wrangling data
IT Managers
67% of IT managers say real-time analytics during transactions would be valuable
Data Scientists
62% of data scientists use predictive analytics for data analysis
91% have an interest in real-time data use for modeling, but 51% of their time is spent preparing and understanding data
Spark is the most popular open source platform for data scientists
77% of IT managers agree that Spark should be used for transactional data analysis
Running Apache Spark helps companies build models…
Quickly, accurately, and securely
All in memory – without the need to move data from its original source.
Data gravity does all the heavy lifting
With Spark deployed on the platform where your data resides, you can analyze the data in real time
No delays or moving expenses
So what should you do?
Based on these findings, Forrester suggests companies:
Re-examine long-held views of the mainframe
Consider adopting Apache Spark as a common standard for analysis and model building
Stop copying data for analysis when it originates on the mainframe
Copying is no longer necessary for the workflow
Analyzing in place improves time-to-value and accuracy