hsiang-hsuan-hung
Pipeline
Flask
Batch process
Problem: the raw data is not ordered by time; it is 220 GB in size and contains 13 billion events
Challenges
• Connector between Cassandra and Spark
• Design primary keys for data query
• Cleaning data
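One way to picture the primary-key challenge: in Cassandra, a primary key is made of a partition key (which picks the partition a row lives in) and optional clustering columns (which fix the row order inside that partition). Putting the event time in a clustering column lets time-unordered raw data be read back in time order per partition. The sketch below imitates that layout in plain Python. It is only an illustration under assumptions: the field names `user_id`, `ts`, and `payload`, and the helper names `build_table` and `query`, are hypothetical and do not come from the slides, which do not state the actual schema.

```python
from collections import defaultdict


def build_table(events):
    """Group events by partition key and keep each partition sorted by time.

    `events` is an iterable of (user_id, ts, payload) tuples; all three
    field names are hypothetical stand-ins for the real schema.
    """
    table = defaultdict(list)
    for user_id, ts, payload in events:
        # user_id plays the role of the partition key
        table[user_id].append((ts, payload))
    for rows in table.values():
        # ts plays the role of the clustering column: rows are ordered by time
        rows.sort(key=lambda row: row[0])
    return dict(table)


def query(table, user_id, start, end):
    """Return one partition's events with start <= ts < end, in time order."""
    return [(ts, payload) for ts, payload in table.get(user_id, []) if start <= ts < end]


# Raw events arrive in no particular time order, as in the problem statement.
raw = [("u2", 30, "click"), ("u1", 20, "view"), ("u1", 10, "login"), ("u2", 5, "login")]
table = build_table(raw)
print(query(table, "u1", 0, 25))
```

A query that names one partition and a time range only touches that partition and needs no extra sorting, which is the property a well-chosen primary key is meant to give.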
About Me
• UCSD, Physics PhD 2011
• U Illinois, ECE 2011-2012
• U Texas Austin, Physics 2012-2015
• Computational material science: HPC, e.g. quantum Monte Carlo…
• Programming, travel, fitness….