imc-institute
Schedule
13 March
– 16.00 - 18.00 Workshop / Demo on Big Data Analytics using Amazon EMR
– 18.00 Registration opens for those interested in running the cluster for 30 hours; account access to Amazon EMR will be given
14 March
– 06.00 Amazon EMR cluster opens
– Participants hold discussions online / via social media
15 March (@ EGA Office)
– 12.00 Amazon EMR cluster closes
– 13.00 Presentation of results by each competitor
– 15.30 Winner announcement
Hadoop Cluster for the challenge
10 AWS m3.xlarge EC2 servers, each with 4 vCPUs, 15 GB of memory, and 80 GB of SSD storage
A sample data set with more than 10 million records will be given
Challenge rules
A competitor can analyse the sample data with Hive, Pig or Map/Reduce
In addition, a competitor can use their own large data set.
The winner will be chosen from those with the best innovation / result from the analytics.
Those who would just like to try using the cluster are also welcome
Judging Criteria:
Complexity of the problem & Data Set 30%
Benefit to the society 20%
Innovation 30%
Presentation 20%
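As a rough illustration of how the four weights could combine into one total, here is a minimal Python sketch. The 0-10 marking scale, the function name `weighted_score`, and the example marks are assumptions for illustration only; the slides do not say how the judges actually mark each criterion.

```python
# Weights copied from the judging criteria above (they sum to 100%).
WEIGHTS = {
    "complexity": 0.30,    # Complexity of the problem & data set
    "benefit": 0.20,       # Benefit to society
    "innovation": 0.30,    # Innovation
    "presentation": 0.20,  # Presentation
}


def weighted_score(marks):
    """Combine per-criterion marks (assumed scale: 0-10) into one total."""
    missing = set(WEIGHTS) - set(marks)
    if missing:
        raise ValueError(f"missing marks for: {sorted(missing)}")
    return sum(WEIGHTS[name] * marks[name] for name in WEIGHTS)


# Hypothetical marks for one competitor.
example = {"complexity": 8, "benefit": 6, "innovation": 9, "presentation": 7}
print(round(weighted_score(example), 2))
```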
Judges
Assoc. Prof. Dr. Jirapun Daengdej
Mr. Danairat Thanabodithammachari
Dr. Thanachart Numnonda
Ms. Nantawan Wongkachonkitti
Awards
The overall winner will receive an Apple TV.
Two winners will be selected for two free training courses:
– Big Data using Hadoop Workshop; 30-31 March 2015
– Business Intelligence Design and Process; 18-20, 25-26 May 2015
Starbucks Card 200 Baht
Thanachart Numnonda, [email protected], Feb 2015. Big Data Hadoop on Amazon EMR – Hands On Workshop
Select EMR
Creating a cluster in EMR
Creating a cluster in EMR (cont.)
Name the cluster and also specify Log folder
Creating a cluster in EMR (cont.)
Leave the Software Configuration as default
Creating a cluster in EMR (cont.)
Leave the Hardware Configuration as default
Choose an existing EC2 key pair
Creating a cluster in EMR (cont.)
Leave the others as default
Select Create Cluster
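The console steps above can also be scripted. The sketch below only builds the request for boto3's EMR `run_job_flow` call; it is not the workshop's own procedure. The helper name `build_cluster_request`, the choice of Hive, Pig and Hue as applications, the use of the default role names `EMR_EC2_DefaultRole` and `EMR_DefaultRole`, and the instance defaults (borrowed from the challenge cluster description) are all assumptions. The release label is left as a required argument because the slides do not state one, and a given release may not support every instance type.

```python
def build_cluster_request(name, log_uri, key_name, release_label,
                          instance_type="m3.xlarge", instance_count=10):
    """Return keyword arguments for boto3's EMR `run_job_flow` call.

    Mirrors the console form: cluster name, log folder, EC2 key pair.
    """
    return {
        "Name": name,
        "LogUri": log_uri,                # e.g. an s3:// folder for logs
        "ReleaseLabel": release_label,    # caller-supplied, not from the slides
        "Applications": [{"Name": "Hive"}, {"Name": "Pig"}, {"Name": "Hue"}],
        "Instances": {
            "MasterInstanceType": instance_type,
            "SlaveInstanceType": instance_type,
            "InstanceCount": instance_count,
            "Ec2KeyName": key_name,       # an existing EC2 key pair
            "KeepJobFlowAliveWhenNoSteps": True,  # keep the cluster running
        },
        "JobFlowRole": "EMR_EC2_DefaultRole",  # assumed default roles
        "ServiceRole": "EMR_DefaultRole",
        "VisibleToAllUsers": True,
    }


# Submitting the request needs boto3 and AWS credentials, so it is left
# commented out here:
#   import boto3
#   response = boto3.client("emr").run_job_flow(**request)
#   print(response["JobFlowId"])
```

Building the request as a plain dictionary keeps the sketch runnable without AWS access and makes the settings easy to inspect before any cluster (and cost) is created.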
EMR Cluster Details
Take note of the Master public DNS
To see details on how to connect to the Master Node using SSH, click SSH
Set Up an SSH Tunnel to the Master Node
– See instructions at
– http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-ssh-tunnel.html
SSH Instruction for Mac/Linux
SSH Instruction for Windows
Connect to the master node
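For reference, the Mac/Linux connection usually comes down to one `ssh` command that uses the key pair chosen at cluster creation and the `hadoop` login on the master node. A minimal Python sketch that assembles that command follows; the helper name `ssh_command`, the key file name `mykeypair.pem`, and the placeholder host name are assumptions, not values from the slides.

```python
import shlex


def ssh_command(master_dns, key_path, user="hadoop"):
    """Build the argument list for an SSH login to the EMR master node."""
    return ["ssh", "-i", key_path, f"{user}@{master_dns}"]


# Placeholder values; substitute the Master public DNS and your own key file.
cmd = ssh_command("master-public-dns-name", "mykeypair.pem")
print(shlex.join(cmd))

# To actually connect, pass the list to subprocess.run(cmd). The key file
# typically needs restrictive permissions (chmod 400) or ssh will refuse it.
```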
Launch the Hue Web Interface
Set Up an SSH Tunnel to the Master Node
– See instructions at
– http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-ssh-tunnel.html
Configure Proxy Settings to View Websites
– See instructions at
– http://docs.aws.amazon.com/ElasticMapReduce/latest/DeveloperGuide/emr-connect-master-node-proxy.html
Launch the Hue Web Interface (Cont.)
http://master-public-dns-name:8888/
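The tunnel and the Hue address fit together roughly as follows: an SSH dynamic port forward opens a local SOCKS proxy, the browser is pointed at that proxy, and Hue is then reachable on port 8888 of the master node. This is a sketch under stated assumptions: the helper names are hypothetical, and local port 8157 is simply the example port used in the AWS tunnel instructions; any free local port should work.

```python
def tunnel_command(master_dns, key_path, local_port=8157, user="hadoop"):
    """Build an ssh command that opens a SOCKS proxy via the master node.

    -N: do not run a remote command, only forward ports
    -D: dynamic (SOCKS) port forwarding on the given local port
    """
    return ["ssh", "-i", key_path, "-N", "-D", str(local_port),
            f"{user}@{master_dns}"]


def hue_url(master_dns, port=8888):
    """Return the Hue web interface address on the master node."""
    return f"http://{master_dns}:{port}/"


# Placeholder values; substitute the real Master public DNS and key file.
print(" ".join(tunnel_command("master-public-dns-name", "mykeypair.pem")))
print(hue_url("master-public-dns-name"))
```

With the tunnel running, the browser's proxy settings still have to send traffic for the cluster's host names through `localhost` on the chosen port, as described in the proxy instructions linked above.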
Web Interfaces Hosted on the EMR Cluster
MovieLens Data
http://grouplens.org/datasets/movielens/
MovieLens 10M
(http://files.grouplens.org/datasets/movielens/ml-10m.zip)
– ratings.dat
– tags.dat
– movies.dat
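Before loading the files into Hive or Pig, it can help to check the record layout locally. The sketch below assumes the layout described in the MovieLens 10M README: fields separated by `::`, with `ratings.dat` as UserID::MovieID::Rating::Timestamp and `movies.dat` as MovieID::Title::Genres (genres separated by `|`). The function names and the sample lines are illustrative only.

```python
from collections import defaultdict


def parse_rating(line):
    """Parse one ratings.dat line: UserID::MovieID::Rating::Timestamp."""
    user_id, movie_id, rating, timestamp = line.rstrip("\n").split("::")
    return int(user_id), int(movie_id), float(rating), int(timestamp)


def parse_movie(line):
    """Parse one movies.dat line: MovieID::Title::Genres."""
    movie_id, title, genres = line.rstrip("\n").split("::")
    return int(movie_id), title, genres.split("|")


def mean_rating_by_movie(rating_lines):
    """Average the ratings per movie id over an iterable of lines."""
    totals = defaultdict(lambda: [0.0, 0])  # movie id -> [sum, count]
    for line in rating_lines:
        _, movie_id, rating, _ = parse_rating(line)
        totals[movie_id][0] += rating
        totals[movie_id][1] += 1
    return {movie: total / count for movie, (total, count) in totals.items()}


# Illustrative lines in the assumed format, not a real excerpt.
sample_ratings = [
    "1::122::5::838985046\n",
    "2::122::4::868245777\n",
    "1::185::4.5::838983525\n",
]
print(mean_rating_by_movie(sample_ratings))
print(parse_movie("1::Toy Story (1995)::Adventure|Animation|Children\n"))
```

To run it on the real file, pass an open file object instead of the list, for example `mean_rating_by_movie(open("ratings.dat"))`.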
Flight Details Data
http://stat-computing.org/dataexpo/2009/the-data.html
Data Description
Snapshot of Dataset
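As a small local check of the flight data before running it on the cluster, the sketch below averages arrival delay per carrier. It assumes the Data Expo 2009 CSV layout: a header row that includes the `UniqueCarrier` and `ArrDelay` columns, with missing values written as `NA`. The function name and the sample rows are made up for illustration, and only a few of the real columns are shown.

```python
import csv
import io
from collections import defaultdict


def mean_arrival_delay(csv_file):
    """Return the mean ArrDelay (minutes) per UniqueCarrier, skipping NA."""
    totals = defaultdict(lambda: [0.0, 0])  # carrier -> [sum, count]
    for row in csv.DictReader(csv_file):
        delay = row["ArrDelay"]
        if delay == "NA":  # cancelled or diverted flights have no delay
            continue
        totals[row["UniqueCarrier"]][0] += float(delay)
        totals[row["UniqueCarrier"]][1] += 1
    return {carrier: total / count
            for carrier, (total, count) in totals.items()}


# Made-up rows with a subset of the columns, for illustration only.
sample = io.StringIO(
    "Year,Month,UniqueCarrier,ArrDelay,Origin,Dest\n"
    "2008,1,WN,-14,IAD,TPA\n"
    "2008,1,WN,2,IAD,TPA\n"
    "2008,1,XE,NA,IAH,ABQ\n"
)
print(mean_arrival_delay(sample))  # {'WN': -6.0}
```

The same grouping is what a Hive `GROUP BY` or a Map/Reduce job would compute at scale; the local version is only useful on a small extract of a yearly file.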
Registration
Provide your name, organization, mobile number, and e-mail address
On-site registration at 17.00, 13 March
E-mail: [email protected]
Facebook message to Thanachart Numnonda
Your username, password, key, and public DNS will be sent to your e-mail by 06.00, 14 March
On-line communication
Facebook Group: Hadoop-Thailand
Line group
Facebook message
E-mail to [email protected]