Advanced Software Engineering
PROJECT
November 2015
1. Spark on Yarn and Mesos
Yarn/Mesos + Spark + Spark Streaming
Mesos: fine-grained and coarse-grained modes
Application: retrieve Wikipedia posts as a stream
Demonstrate how Yarn and Mesos help with resource scheduling
Visualization through Ganglia
Performance analysis and comparisons
Max. 2 students
2. Sparkathon
Build super-smart weather apps that harness the power of IBM Bluemix™ and Apache Spark™.
http://sparkathon.devpost.com/
Worldwide competition; submit by January 20, 2016 (5:00pm Eastern Time)
Max. 4 students
If your entry is shortlisted, the project gets full marks!
3. US Road Network Analysis
https://snap.stanford.edu/data/#road
GraphX + MLlib + Spark Streaming + Tableau
Think about what you can do ^^
Community detection: J. Leskovec, K. Lang, A. Dasgupta, M. Mahoney. Community Structure in Large Networks: Natural Cluster Sizes and the Absence of Large Well-Defined Clusters. Internet Mathematics 6(1) 29--123, 2009.
Max. 3 students
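As a starting point for the community-detection idea, here is a minimal sketch in plain Python that scores a candidate community by its conductance (cut edges divided by the smaller of the two sides' total degree). Lower values suggest a better-separated community. It assumes the road-network files are edge lists with `#` comment lines and one whitespace-separated node pair per line; the toy edges and the names `parse_edge_list` and `conductance` are made up for illustration. At full scale the same computation would presumably be done in GraphX rather than in memory.

```python
from collections import defaultdict


def parse_edge_list(lines):
    """Build an undirected adjacency map from edge-list lines.

    Assumed format: '#' starts a comment line; every other non-empty
    line holds two whitespace-separated node ids.
    """
    adj = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        u, v = line.split()[:2]
        if u != v:  # ignore self-loops
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def conductance(adj, community):
    """Cut size divided by the smaller volume (sum of degrees) of the two sides."""
    inside = set(community)
    cut = sum(1 for u in inside for v in adj.get(u, ()) if v not in inside)
    vol_in = sum(len(adj.get(u, ())) for u in inside)
    vol_out = sum(len(nbrs) for u, nbrs in adj.items() if u not in inside)
    smaller = min(vol_in, vol_out)
    return cut / smaller if smaller else float("inf")


# Toy data (hypothetical): two triangles joined by the single edge 3-4.
toy_lines = [
    "# FromNodeId ToNodeId",
    "1 2", "2 3", "1 3",
    "4 5", "5 6", "4 6",
    "3 4",
]
toy_adj = parse_edge_list(toy_lines)
phi_good = conductance(toy_adj, {"1", "2", "3"})  # one triangle
phi_bad = conductance(toy_adj, {"1", "4"})        # an arbitrary pair
print(phi_good, phi_bad)
```

The triangle scores much lower than the arbitrary pair, which is the kind of comparison a community-quality plot would be built from.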
4. Images on Internet
http://www.sogou.com/labs/dl/p2.html
Can you predict seasonal trends? News trends? Political trends? Or human behaviors?
Data visualization (Tableau)
Max. 3 students
5. Your Proposal!
Your turn to choose one
Technically sound
Application oriented
Talk with me next lecture!
Arrangement
Nov. 23: let me know your choice in class
Week 1-2: define your roles in the group, start literature research, and decide what to do
Week 2-4: propose solutions (architectures, frameworks, open-source tools, steps)
Week 5-8: implementation, performance analysis, and obtaining results
Week 9: wrap up and spend a few days writing your report
Jan. 16: project report to [email protected]
Jan. 18: project presentation
About Result Figures/Tables
Each figure:
  Clear x-y labeling
  At least 2 lines (one yours, one comparison)
  Each line has at least 5 connected points
Each table:
  Clear x-y labeling
  At least 2 algorithms (one yours, one comparison)
  Each row/column has at least 5 fields
Report Outline
Introduction: background, why this topic, a brief explanation of what others have done
Related Work: 10+ references (after 2012); describe what others have done and their advantages/disadvantages
System Architecture
Detailed Design: component design, algorithm design
Simulation Results: figures/tables, discussions
Conclusions and Future Work
Written in English: +5!
Attention!
Not just an engineering project
Aiming to train your research capability
Look at what others have done. Are there existing problems? How to improve? Performance?
Performance Analysis (10+ figures)
  Yours: accuracy, throughput, delay, system bound
  Comparisons: other algorithms, other systems
OPEN SOURCE is the key!
Grading is based on YOUR contributions!
IEEE Xplore: http://ieeexplore.ieee.org/
ACM Digital Library: http://dl.acm.org
Social Network Analysis
Advanced Software Engineering
Key Players
How to identify key/central nodes in a network
Cohesion
How to characterize a network's structure
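The two questions above can be made concrete with the simplest measure for each: degree centrality for key players and edge density for cohesion. The sketch below uses plain Python on a made-up four-node star graph. The function names and the toy data are illustrative assumptions, and many other centrality and cohesion measures exist.

```python
def degree_centrality(adj):
    """Fraction of the other nodes that each node is directly connected to."""
    n = len(adj)
    if n <= 1:
        return {u: 0.0 for u in adj}
    return {u: len(nbrs) / (n - 1) for u, nbrs in adj.items()}


def density(adj):
    """Existing edges divided by possible edges in an undirected simple graph."""
    n = len(adj)
    if n <= 1:
        return 0.0
    edges = sum(len(nbrs) for nbrs in adj.values()) / 2  # each edge is stored twice
    return edges / (n * (n - 1) / 2)


# Toy star graph (hypothetical): 'hub' is linked to three leaves.
star = {
    "hub": {"a", "b", "c"},
    "a": {"hub"},
    "b": {"hub"},
    "c": {"hub"},
}
centrality = degree_centrality(star)
key_player = max(centrality, key=centrality.get)
cohesion = density(star)
print(key_player, centrality, cohesion)
```

Here the hub is the obvious key player, while the low density shows that the leaves are not connected to each other.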
Example
Facebook: 5.8 million users (2009), avg. 5.73 degrees of separation, max. 12 degrees
Twitter: 5.2 billion relationships, avg. 4.67 degrees of separation
  50% of users are only 4 steps away
  Almost everyone is fewer than 5 steps away
  For any 1,500 random users, 3.435 steps
Erdős Number: collaborative distance through paper co-authoring
Experiment: forwarding letters in the US
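Degrees of separation and Erdős numbers are both shortest-path lengths in an unweighted graph, so a breadth-first search is enough to compute them. The following is a small sketch on a made-up co-authorship chain. The author names and function names are hypothetical, and the all-pairs average shown here is only practical for small graphs; at the scale quoted above one would sample random source nodes instead.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop count from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_separation(adj):
    """Mean shortest-path length over all connected ordered pairs of distinct nodes."""
    total = 0
    pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy co-authorship chain (hypothetical): Erdos - Alice - Bob - Carol.
coauthors = {
    "Erdos": {"Alice"},
    "Alice": {"Erdos", "Bob"},
    "Bob": {"Alice", "Carol"},
    "Carol": {"Bob"},
}
erdos_numbers = bfs_distances(coauthors, "Erdos")
avg_steps = average_separation(coauthors)
print(erdos_numbers, avg_steps)
```

In this chain Carol has Erdős number 3, since she reaches Erdős only through Bob and Alice.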
Example: Social Evolution data set by MIT Media Lab
80 undergraduates with smart devices, moving around the campus.
Phone usage and student locations were collected from October 2008 to June 2009.
Phone usage:
  3.15 million records of Bluetooth scans
  3.63 million scans of WLAN access points
  61,100 call records
  47,700 logged SMS events
Students provided offline, self-reported answers related to their health habits, diet and exercise, weight changes, and political opinions during the presidential election campaign.
Contact graph: only links with more than 2,000 contacts between two students are shown. Bigger nodes indicate a higher betweenness centrality value for the corresponding participant. Thicker edges indicate a higher contact frequency between the connected nodes.
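A contact graph like this one could be built roughly as follows: keep only pairs with more than 2,000 contacts, then compute betweenness centrality on what remains. The sketch below uses Brandes' algorithm for unweighted graphs, which is one common choice; the slide does not say how the values were actually computed. The contact counts, student ids, and function names are invented for illustration.

```python
from collections import defaultdict, deque


def filter_contacts(contact_counts, min_contacts=2000):
    """Keep only pairs with more than min_contacts contacts; return an adjacency map."""
    adj = defaultdict(set)
    for (u, v), count in contact_counts.items():
        if count > min_contacts:
            adj[u].add(v)
            adj[v].add(u)
    return dict(adj)


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an undirected, unweighted graph."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # nodes in order of non-decreasing distance from s
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted from both endpoints.
    return {v: b / 2 for v, b in bc.items()}


# Hypothetical contact counts between four students.
contacts = {
    ("s1", "s2"): 5000,
    ("s2", "s3"): 2500,
    ("s1", "s3"): 100,   # below the threshold, so this link is dropped
    ("s3", "s4"): 3000,
}
graph = filter_contacts(contacts)
scores = betweenness(graph)
print(scores)
```

After filtering, the toy graph is the path s1-s2-s3-s4, so the two middle students carry all of the shortest paths between the others and would be drawn as the bigger nodes.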