BIG DATA – WHAT IS IT?


Page 1: Big data - What is It?

BIG DATA – WHAT IS IT?

Page 2: Big data - What is It?

Big data is a term used to describe the collection, processing and availability of huge volumes of streaming data in real-time.

There are some things so big they have implications for everyone, whether we want it or not.

Big Data is one of those things: it is completely transforming the way we do business and is affecting most other parts of our lives.

The basic idea behind the phrase Big Data is that everything we do is increasingly leaving a digital trace (or data) which we (and others) can use and analyse.

WHAT IS BIG DATA?

Page 3: Big data - What is It?

Big Data is everywhere and inevitable. It ranges from suggesting what movie to watch on Netflix/YouTube to predicting natural disasters. Big Data surrounds us, ultimately influencing decisions that we make every day. For instance, when shopping on Amazon, products are recommended to us based on our shopping patterns. It is also used for weather predictions, including whether or not we will get cyclones.

Page 4: Big data - What is It?

“From the dawn of civilization until 2003, humankind generated 5 exabytes of data. Now we produce 5 exabytes every two days… and the pace is accelerating.”

Eric Schmidt, Executive Chairman, Google

Page 5: Big data - What is It?

SOME BIG DATA STATS

• Walmart handles more than 1 million customer transactions every hour.
• Facebook handles 40 billion photos from its user base.
• Decoding the human genome originally took 10 years to process; now it can be achieved in one week.
• The analyst firm Gartner says that by 2020 there will be over 26 billion connected devices.
• We sent men to the moon in 1969 on a tiny fraction of the data that's in the average laptop. [1]

1. HTTP://WWW.BUSINESSINSIDER.COM/MIND-BLOWING-GROWTH-AND-POWER-OF-BIG-DATA-2015-6

Page 6: Big data - What is It?

CONTINUED…

• 701,389 logins on Facebook
• 1.8 million likes per minute on Facebook
• 350 GB of data generated
• 69,444 hours watched on Netflix
• 51,000 app downloads on Apple’s App Store
• 347,222 tweets on Twitter
• 28,194 new posts to Instagram
• 38,052 hours of music listened to on Spotify
• 2.78 million video views on YouTube
• 2,083,333 minutes used on Skype calls

Page 7: Big data - What is It?
Page 8: Big data - What is It?

With datafication comes big data, which is often described using the four Vs:

• Volume
• Velocity
• Variety
• Veracity

Page 9: Big data - What is It?

VOLUME

• Refers to the huge amounts of data generated every second.
• We are not talking terabytes but zettabytes or brontobytes.
• If we take all the data generated in the world between the beginning of time and 2000, the same amount of data will soon be generated every minute.
• A typical PC might have had 10 gigabytes of storage in 2000.
• Today Facebook ingests 500 terabytes of new data every day.
• A Boeing 737 will generate 240 terabytes of flight data during a single flight across the US.
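To put the storage figures above on one scale, here is a minimal Python sketch. It assumes decimal (SI) prefixes, which the slide does not specify, and the helper name `to_bytes` is made up for illustration.

```python
# Decimal (SI) storage units in bytes.
# Assumption: the slide's figures use decimal prefixes (1 GB = 10**9 bytes).
UNITS = {"GB": 10**9, "TB": 10**12, "PB": 10**15, "EB": 10**18, "ZB": 10**21}


def to_bytes(value, unit):
    """Convert a quantity such as (500, 'TB') into bytes."""
    return value * UNITS[unit]


# Figures quoted on the slide.
pc_in_2000 = to_bytes(10, "GB")       # typical PC storage in 2000
facebook_daily = to_bytes(500, "TB")  # new data ingested by Facebook per day
boeing_flight = to_bytes(240, "TB")   # flight data from one 737 flight

# How many year-2000 PCs would each volume fill?
pcs_per_facebook_day = facebook_daily // pc_in_2000
pcs_per_flight = boeing_flight // pc_in_2000

print(pcs_per_facebook_day)  # 50000
print(pcs_per_flight)        # 24000
```

On these numbers, one day of Facebook ingestion would fill 50,000 typical PCs from the year 2000.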

Page 10: Big data - What is It?

VELOCITY

• Refers to the speed at which new data is generated and the speed at which data moves around.
• Example: high-frequency stock trading algorithms reflect market changes within microseconds.
• Example: online gaming systems support millions of concurrent users, each producing multiple inputs per second.
• 50,000 GB/second is the estimated rate of global internet traffic by 2018.

Page 11: Big data - What is It?

VARIETY

• Refers to the different types of data we can now use.
• In the past we only focused on structured data that fitted neatly into tables or relational databases, such as financial data, but now 80% of the world's data is unstructured (3D data, images, video, voice etc.).
• With big data technology we can now analyse and bring together data of different types, such as messages, social media conversations, photos, sensor data, and video or voice recordings.
• 1 in 3 business leaders don’t trust the information they use to make decisions.
• $3.1 trillion is the estimated amount of money that poor data quality costs the US economy per year.

Page 12: Big data - What is It?

VERACITY

• Refers to the messiness or trustworthiness of the data.
• With many forms of big data, quality and accuracy are less controllable (just think of Twitter posts with hashtags, abbreviations, typos and colloquial speech, as well as the reliability and accuracy of content), but technology now allows us to work with this type of data.

Page 13: Big data - What is It?

DIFFERENT TYPES OF DATA

ACTIVITY DATA
Digital music players and eBooks collect data on our activities. Your smartphone collects data on how you use it, and your web browser collects information on what you are searching for. Your credit card company collects data on where you shop, and your shop collects data on what you buy. It is hard to imagine any activity that does not generate data.

PHOTO AND VIDEO IMAGE DATA
Just think about all the pictures we take on our smartphones or digital cameras. We upload and share hundreds of thousands of them on social media sites every second.

A growing number of CCTV cameras record video images, and we also upload hundreds of hours of video to YouTube and other sites every minute.

Page 14: Big data - What is It?

SENSOR DATA

We are increasingly surrounded by sensors that collect and share data.

Take your smartphone, for instance: it contains a GPS sensor to track exactly where you are every second of the day, and an accelerometer to track the speed and direction at which you are travelling.

Page 15: Big data - What is It?

BIG DATA SOURCES

1. Users: creating data via Facebook, Twitter, internal company systems etc

2. Applications: Automatically create logs of who has changed/accessed what within the system and more.

3. Systems: Monitoring systems on aircraft generate gigabytes of data each flight monitoring different parts of the plane every second of the flight.

4. Sensors: Door entries, temperature controls etc.

Page 16: Big data - What is It?

DATA GENERATION EXAMPLES

• Mobile Devices [Phones/Tablets/Ebook readers…]
• Readers/Scanners
• Science Facilities/Programs/Software
• Social Media
• Cameras [Geo-tagging of Photos]

Page 17: Big data - What is It?

THE STRUCTURE OF BIG DATA

• Structured: Most traditional data sources.
• Semi-structured: Many sources of big data.
• Unstructured: Video data, audio data.

• 90% of generated data is “unstructured”. This includes tweets, photos, customer purchase history and customer service calls.
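As a rough illustration of the three categories above, the following Python sketch handles one tiny sample of each. All sample values are invented for the example, and the hashtag extraction is only a stand-in for real text analysis.

```python
import csv
import io
import json

# Structured: fixed columns, fits neatly into a table (hypothetical CSV row).
structured = "customer_id,amount\n42,19.99\n"
rows = list(csv.DictReader(io.StringIO(structured)))

# Semi-structured: self-describing, but fields can vary from record to record.
semi_structured = '{"user": "alice", "tags": ["#bigdata"], "geo": null}'
record = json.loads(semi_structured)

# Unstructured: free text with no schema; we have to go looking for meaning.
unstructured = "Loving the new phone!! #bigdata gr8 battery"
hashtags = [word for word in unstructured.split() if word.startswith("#")]

print(rows[0]["amount"])  # 19.99
print(record["tags"])     # ['#bigdata']
print(hashtags)           # ['#bigdata']
```

The structured and semi-structured samples can be read with a standard parser, while the free text needs custom logic before it yields anything usable.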

Page 18: Big data - What is It?

BENEFITS – 8 DIMENSIONS TO MEASURE VALUE FROM YOUR DATA

• What makes data valuable to an organization?
• Data with the following qualities can fuel big returns, but most organizations are struggling to make this a reality.

Page 19: Big data - What is It?
Page 20: Big data - What is It?
Page 21: Big data - What is It?

BIG DATA = BIG OPPORTUNITIES

Today many organizations have initiated projects and a subset have truly innovated. But many will fail to convert these projects to knowledge and business value.

Page 22: Big data - What is It?
Page 23: Big data - What is It?

THE INTERNET OF THINGS

We now have smart TVs that are able to collect and process data, as well as smart watches and smart alarms. The Internet of Things connects these devices so that, in future, traffic sensors in the road will be able to send data to your alarm clock, which will wake you up earlier than planned because a blocked road means you will have to leave earlier to make your 9am meeting.
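The alarm clock scenario above boils down to a small calculation. Here is a minimal Python sketch of it; the function name, the commute times and the idea that a traffic feed reports the current commute in minutes are all assumptions made for illustration.

```python
from datetime import datetime, timedelta


def adjusted_alarm(planned_alarm, normal_commute_min, current_commute_min):
    """Move the alarm earlier by however much longer the commute is today.

    The alarm is never moved later: a faster commute leaves it unchanged.
    """
    extra_minutes = max(0, current_commute_min - normal_commute_min)
    return planned_alarm - timedelta(minutes=extra_minutes)


# Hypothetical values: alarm set for 07:00, commute normally 30 minutes,
# but road sensors report 55 minutes because of a blocked road.
planned = datetime(2017, 1, 9, 7, 0)
new_alarm = adjusted_alarm(planned, normal_commute_min=30, current_commute_min=55)

print(new_alarm.strftime("%H:%M"))  # 06:35
```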

Page 24: Big data - What is It?

LAWS ARE ALREADY BEING PASSED TO DEAL WITH THE IOT & INTERCONNECTIVITY

The US is moving to require that cars be able to talk to each other. The National Highway Traffic Safety Administration (NHTSA) has formally proposed a rule requiring a uniform industry-wide system that would be put in all new cars. If the rule is approved, NHTSA said, it would take two to four years for the technology to be in all new cars. Even with human drivers, the technology could help avoid 80% of crashes involving sober drivers, according to NHTSA. [1]

1. HTTP://MONEY.CNN.COM/2016/12/13/TECHNOLOGY/NHTSA-VEHICLE-TO-VEHICLE-COMMUNICATION-RULE/INDEX.HTML?SECTION=MONEY_TOPSTORIES

Page 25: Big data - What is It?

ADVANTAGES OF BIG DATA & IOT

• Supply chain or delivery route optimization - using data from geographic positioning and radio frequency identification sensors. This also allows cities to optimize traffic flows based on real-time traffic information and to minimize jams.
• Improving Health - ability to monitor and predict epidemics and disease outbreaks.
• Police forces use big data tools to catch criminals and even predict criminal activity, and credit card companies use big data analytics to detect fraudulent transactions.
• Improving Sports Performance - track athletes while playing on the field, as well as use fitness trackers to track activity and sleep.

Page 26: Big data - What is It?

MAIN BENEFITS

1. Safety, Comfort, Efficiency
Now imagine monotonous tasks being automated and done by machines. For example, smart assembly lines could report misconfigurations and errors in real time, producing higher yields and less downtime. The result is more time for productive and rewarding work. This would drive higher employee satisfaction and retention, while dramatically improving profit margins.

2. Better Decision Making
If you can analyze larger trends from empirical data, you can make smarter decisions. This takes assumptions out of the equation, giving you data-backed visibility into every aspect of your business. For example, testing cycles would radically shorten, lowering the costs to optimize a process. Additionally, the visibility into system behaviors can yield new insights and ideas, guiding your business like never before.

3. Revenue Generation
At first, the above benefits from the IoT will impact your bottom line simply by reducing expenses and improving efficiency.

Page 27: Big data - What is It?

DISADVANTAGES OF BIG DATA

1. Security and Privacy
• Companies being hacked: for example, in the US 91% of all healthcare organizations have had at least one data breach in the last two years, and the US federal government had 61,000 cyber-security breaches in 2014 alone.
• Identities stolen through stolen SSNs, credit cards etc.
• Default device settings often equate to wide open, and even when access controls are present many organizations don’t have strong security protocols in place. This is the IoT equivalent of having a username/password combo of “admin” and “password”.

2. Data and Complexity
• The IoT generates countless bytes of data, but business value is measured not in bytes but in the analysis of trends and patterns.
• Now imagine the complexity of thousands of sensors collecting data each hour across a single organization. If you don’t have a plan to process and analyze these huge quantities of data, you won’t be able to translate any of these findings into better business practices.

3. Business and IT Buy-in
Given the above concerns about security and complexity, persuading stakeholders to buy into the IoT can be difficult. The perceived costs and risks of simply laying a foundation or running a single experiment can hold back progress.

THREATS

Page 28: Big data - What is It?

REAL WORLD EXAMPLES

Peddamail
Peddamail gives an example of a grocery team struggling to understand why sales of a particular produce item were unexpectedly declining. Once their data was in the hands of the Cafe analysts, it was established very quickly that the decline was directly attributable to a pricing error. The error was immediately rectified and sales recovered within days. [2]

Sales across different stores in different geographical areas can also be monitored in real time. One Halloween, Peddamail recalls, sales figures of novelty cookies were being monitored when analysts saw that there were several locations where they weren’t selling at all. This enabled them to trigger an alert to the merchandising teams responsible for those stores, who quickly realized that the products hadn’t even been put on the shelves. Not exactly a complex algorithm, but it wouldn’t have been possible without real-time analytics.

2. HTTP://WWW.FORBES.COM/SITES/BERNARDMARR/2016/08/25/THE-MOST-PRACTICAL-BIG-DATA-USE-CASES-OF-2016/#59B5BDCA7533
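The novelty cookie check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the system the analysts used: the event format `(store_id, product, units)`, the store IDs and the function name are all hypothetical.

```python
from collections import Counter


def stores_without_sales(sales_events, product, stocked_stores):
    """Return the stores that should stock `product` but have sold none.

    sales_events: iterable of (store_id, product, units) tuples.
    stocked_stores: store IDs expected to carry the product.
    """
    units_sold = Counter()
    for store_id, sold_product, units in sales_events:
        if sold_product == product:
            units_sold[store_id] += units
    # A Counter returns 0 for stores that never appear in the events.
    return sorted(s for s in stocked_stores if units_sold[s] == 0)


# Hypothetical feed of sales events for one day.
events = [
    ("store-01", "novelty cookies", 12),
    ("store-02", "milk", 30),
    ("store-03", "novelty cookies", 7),
    ("store-01", "novelty cookies", 4),
]
alerts = stores_without_sales(
    events, "novelty cookies", ["store-01", "store-02", "store-03", "store-04"]
)

print(alerts)  # ['store-02', 'store-04']
```

Each store in the result would trigger an alert to the merchandising team responsible for it.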

Page 29: Big data - What is It?

Rolls-Royce
Rolls-Royce put Big Data processes to use in three key areas of their operations: design, manufacture and after-sales support.

Design: “We generate tens of terabytes of data on each simulation of one of our jet engines. We then have to use some pretty sophisticated computer techniques to look into that massive dataset and visualize whether that particular product we’ve designed is good or bad.”

Manufacture: Manufacturing systems are increasingly networked and communicate with each other in the drive towards an Internet of Things (IoT) industrial environment, for example by linking the company's UK manufacturing plants in Rotherham and Sunderland.

After-Sales Support: Rolls-Royce engines and propulsion systems are all fitted with hundreds of sensors that record every tiny detail about their operation and report any changes in data in real time to engineers, who then decide the best course of action. [3]

3. HTTP://WWW.FORBES.COM/SITES/BERNARDMARR/2016/08/25/THE-MOST-PRACTICAL-BIG-DATA-USE-CASES-OF-2016/2/#7B8B0545F431

Page 30: Big data - What is It?

WHY SHOULD WE CARE?

• We already have sources of large amounts of data that we can make use of, such as our in-house databases, payroll, and electronic bills from vendors [Vodafone/Telephone]. By asking the right questions and analyzing the data, we may be able to make big financial savings.
• Find out which months and times see the most Mondayitis and sick leave, and when overtime peaks, so that we can manage staff and overtime better and identify whether we need more or fewer staff in certain locations.
• Can be used to perform data analytics on customers, for example which types of accounts regularly run to 60 or 90 days overdue, and whether these are customers in certain industries or at certain times of the year. This can help us plan our cash flow better and possibly renegotiate agreements with some customers.
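The sick leave and overdue account questions above are simple counting exercises once the data is in hand. The Python sketch below shows one way to frame them; the sample dates, industries, bucket boundaries and function names are all assumptions for illustration, not figures from our systems.

```python
from collections import Counter
from datetime import date

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def sick_days_by_weekday(sick_dates):
    """Count sick-leave days per weekday to spot 'Mondayitis'."""
    return Counter(WEEKDAYS[d.weekday()] for d in sick_dates)


def overdue_bucket(days_overdue):
    """Place an account into an ageing bucket (boundaries are assumed)."""
    if days_overdue >= 90:
        return "90+"
    if days_overdue >= 60:
        return "60-89"
    return "under 60"


def overdue_by_industry(accounts):
    """Count accounts per (industry, ageing bucket) pair.

    accounts: iterable of (industry, days_overdue) tuples.
    """
    return Counter((industry, overdue_bucket(days)) for industry, days in accounts)


# Hypothetical sample data.
sick_leave = [date(2017, 1, 9), date(2017, 1, 16), date(2017, 1, 11), date(2017, 1, 23)]
accounts = [("Construction", 95), ("Construction", 62), ("Retail", 30), ("Construction", 91)]

by_weekday = sick_days_by_weekday(sick_leave)
by_industry = overdue_by_industry(accounts)

print(by_weekday.most_common(1))           # [('Monday', 3)]
print(by_industry[("Construction", "90+")])  # 2
```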

Page 31: Big data - What is It?

FUTURE POSSIBILITIES

Vehicle Tracking
• Prices for monitoring equipment are getting lower all the time. We can monitor fuel costs, vehicle maintenance and spare parts, and see which vehicles cost less to maintain.

Asset Tracking
• Ability to optimize purchasing procedures and costs for other items, such as stationery and vehicle spare parts, across branches in an effort to cut down on costs and on possible double or triple handling by different people.
• The ability is already there to monitor toner usage from printers. The data can be used to analyze which brands are more reliable or offer a better cost per page in terms of printing and support.
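The cost-per-page comparison mentioned above can be sketched as follows. The brand names, prices and page counts are invented for the example, and a real comparison would also need to account for support and reliability.

```python
def cost_per_page(cartridge_price, pages_printed):
    """Return the toner cost per printed page."""
    if pages_printed <= 0:
        raise ValueError("pages_printed must be positive")
    return cartridge_price / pages_printed


# Hypothetical usage data: brand -> (cartridge price, pages printed per cartridge).
toner_usage = {
    "Brand A": (80.0, 2000),
    "Brand B": (60.0, 1200),
    "Brand C": (95.0, 2500),
}

# Rank brands from cheapest to most expensive per page.
ranking = sorted(toner_usage, key=lambda brand: cost_per_page(*toner_usage[brand]))

for brand in ranking:
    print(brand, round(cost_per_page(*toner_usage[brand]), 3))
```

With these sample figures the cheapest cartridge (Brand B) turns out to be the most expensive per page.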

Page 32: Big data - What is It?

IN CONCLUSION…

The rate of data growth will continue to increase, and from an increasing number of different sources, whether we pay attention to it or not.

The main question should be: can we use some of this data so that we can work smarter and more efficiently, and cut down costs or make more money?

Page 33: Big data - What is It?

FURTHER READING & RESOURCES

Big Data – What is It?
http://www.slideshare.net/BernardMarr/140228-big-data-slide-share

2016 Update: What Happens in One Internet Minute?
http://www.excelacom.com/resources/blog/2016-update-what-happens-in-one-internet-minute

What is Big Data? What are the Benefits of Big Data?
https://marketingtechblog.com/benefits-of-big-data/

3 Threats and 3 Benefits of the Internet of Things
https://www.atlanticbt.com/blog/3-threats-and-3-benefits-of-the-internet-of-things/

Identity Theft: The Risky Side of Big Data
https://dzone.com/articles/identity-theft-the-risky-side-of-big-data

Identity Theft + Big Data = Identity Reconstruction
http://blogging.avnet.com/ts/advantage/2015/12/identity-theft-big-data-identity-reconstruction/

Apache Hadoop
http://hadoop.apache.org/

Apache Spark
http://spark.apache.org/

RapidMiner
https://rapidminer.com/