Taeyoung Kim 003711-0016

Advisor: Ms. Alice Franco

Evolution of Data Analysis and Big Data: How It Changed the Decision-making Paradigm and Corporate Culture in Business Operations.



Acknowledgement


First and foremost, I would like to thank my advisors, Ms. Alice Franco and Ron Bialkowski, for their valuable guidance and advice. They inspired me greatly to work on this research, and this project would not have been possible without their support.

Secondly, I would like to thank Ms. Deniz Hanson, our IB coordinator, for the countless extensions she granted me.

Last but not least, I wish to take this opportunity to express my gratitude and love to my friends for their support, strength, help, and everything else (in no particular order): Lois Dzebissov, Ravi Maddali, Adrian McClure, Michael Rusnak, Andrew Morgan, Nicole de la Montanya, and Abhishyant Khare. (I know I omitted many names, and I mean no offence.)


Table of Contents

1. Abstract
2. Introduction
3. Body
4. Conclusion
5. Bibliography


Abstract

The purpose of my research is to identify the impact of the evolution of data analysis and Big Data on the managerial decision-making paradigm of data-driven enterprises. It traces the evolution of data analysis from the Analytics 1.0 of the early 2000s, to the Analytics 2.0 of the mid-2000s, to today’s Analytics 3.0 and Big Data. It examines each phase’s characteristics, how each phase developed, and how the phases differ from one another.

The research examines how corporate culture is changing, and how it should change, to create an atmosphere where workers can make better decisions on their own, making operations more efficient and outcomes more effective. Ultimately, my research argues that data analysis technologies are constantly evolving and will bring revolutionary changes to business operations.


Introduction

As Google's director of research, Peter Norvig, said in 2007: "We don’t have better algorithms. We just have more data." In the past decade, we have gone through several phases of data analytics, mainly because of the enormous amount of data being created every second. Given this torrent of data, the need for proper analytic technologies has never been greater. As a result, advanced analytic technologies have been developed, and today we have reached the era of Big Data. People might not realize it, but Big Data has had a huge impact on the paradigm of managerial decision making. Data analysis technology is constantly evolving and is bringing revolutionary change to business operations and corporate cultures.


Body

Analytics 1.0 was the first phase of data analytics and was dominant until the mid-2000s. It was marked by enterprises adopting business intelligence systems and expertise to drive reporting and descriptive analytics¹. Analytics 2.0 emerged in the mid-2000s and involved large, fast-moving, external, and unstructured data from various new and interesting sources². The business impact of Analytics 1.0 and 2.0 was limited and not always clear. However, since the late 2000s, the paradigm has been shifting. The era of Analytics 3.0 represents a stage of maturity: leading organizations in Silicon Valley began to realize measurable business impact from the combination of traditional analytics and Big Data³.

Analytics 1.0 basically meant reporting existing data from an existing database, and it only addressed what had happened in the past. It did not offer any analysis of the data or any prediction of the future. This lack of explanation resulted in a sense of indifference towards data. Data was helpful at the time, but it was not as useful as it is by today’s standards. Today’s “competing on analytics” edge did not arrive until later.

The emergence of internet-based social network sites and entertainment sites in the mid-2000s marked the opening of the second phase of analytics. Companies like Google, Facebook, and LinkedIn began to analyze new kinds of information. However, the era of Big Data had not arrived yet. Soon “bigger” data started to be distinguished from “small data”. It was usually generated in public, rather than inside an organization. “Bigger” data came in various types, for example, public initiatives like the Human Genome Project and captured audio and video recordings.

1 Analytics 3.0

2 Analytics 3.0

3 Analytics 3.0

As analytics entered the 2.0 phase, companies needed powerful new tools that could probe enormous amounts of data. It became clear that data could yield large profits, so companies began to build new capabilities and infrastructure. With more data, companies were able to offer new features. LinkedIn, for example, has created numerous data products, including People You May Know, Jobs You May Be Interested In, Groups You May Like, Companies You May Want to Follow, Network Updates, and Skills and Expertise⁴.

The industry was in desperate need of talented developers; companies began to hire smart people and created new positions such as data scientist. Innovative technologies for data analytics had to be created, developed, and mastered as quickly as possible. A single server was not capable of handling such massive amounts of data, so companies turned to a new class of database known as NoSQL⁵. Most of the data was stored in public or private cloud-computing environments, and new technologies such as “in-memory” and “in-database” analytics appeared for faster number crunching.

Analytics 3.0 differs greatly from the previous phases. Companies attract new people to their websites using superior algorithms, recommendations from friends and colleagues, suggestions for products to buy, and highly targeted, personalized advertisements based on search history⁶. All of these are driven by analytics on the enormous amount of data created by users. Every activity on the internet creates data. Even within companies, talking to customers, working with customers, shipping, and device usage all leave trails of data. Embedded analytics and optimization have been integrated into every business decision a data-driven company makes. Every piece of data in the Analytics 3.0 era brings new challenges and opportunities.

4 Analytics 3.0

5 A NoSQL (often interpreted as Not Only SQL) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases.

6 Analytics 3.0

Big data is a relative term describing a situation where the volume, velocity, and variety of data exceed an organization’s storage or computing capacity for accurate and timely decision making. Methods have therefore been created to extract useful information from this excess of data.

Much of the data is held in transactional data stores within organizations, a byproduct of very fast-growing online activity. The volume of data in recent years has made it impossible for people to process it manually, which has led to machine-to-machine processing in areas such as call detail records, metering, RFID systems, and environmental sensing. A large share of data is also collected from social media websites with enormous user bases, like Facebook and Twitter, and this data is usually unstructured or semi-structured.

By some estimates, a typical organization in any sector has at least 100 terabytes of data. However, Big Data cannot be defined only in terms of volume, which is a constantly moving target. Rather, it is increasingly about variety, velocity, variability, and complexity.

The term ‘Big Data’ implies atypical massiveness in the following five areas: volume, variety, velocity, variability, and complexity. Volume refers to the actual size of the data, from terabytes to petabytes and onward. Variety describes differences in data types and sources, as data can come in many forms (e.g., text, video, and audio) that must be converted into structured, comparable data before being relied on for decision-making. Velocity, meanwhile, refers to the speed of the data flow in all directions. With recent advances, data flows faster than ever before, and companies must build the necessary infrastructure to react fast enough to the massive data inflow. Variability covers the possible inconsistencies in data flows with periodic peaks and troughs, such as those which occur seasonally or are triggered by certain events. Lastly, complexity describes the difficulties that come from having multiple data sources with different data types, and the processes necessary to convert them into comparable informational sources and understand the relationships between them.

Variety - Up to 85% of an organization’s data is unstructured and not in numeric form. This data needs to be transformed into a structured set of quantitative figures in a database before proper analytics can be carried out. There are diverse categories of data: text, video, audio, GPS signals, sensor readings, and other unstructured data. Since the data is acquired from different platforms, different technologies are required to analyze it.

7 Big Data Meets Big Data Analytics


Volume - As of 2012, about 2.5 exabytes⁸ of data were created each day, and that number was doubling about every 40 months⁹. Companies now work with many petabytes of data in a single data set, and the data does not come just from the internet; it comes from both internal and external sources, such as customer activities, transactions, and project records.

Velocity - The 21st century is an era of ever-faster technologies, and the need to handle this huge torrent of data with agility has never been greater. For many applications, the speed at which data is created and analyzed is more important than the volume of the data. Companies must compete on real-time or nearly real-time information. This puts tremendous pressure on organizations that lack the infrastructure or technology to build the architecture and skill base needed to react quickly enough to all the data they acquire.

Variability - In addition to the speed and volume of incoming data, there can be a high level of variability in the data due to daily or seasonal factors, or due to a special event. Those variations are hard to manage and need special attention.

Complexity - Companies can face countless technical difficulties while dealing with massive amounts of data. As the venues and sources of data expand, they need to link, match, and transform the data across different systems to perform high-level analytics. Any organization that aims to use data to make managerial decisions must understand the relationships and correlations between different data and systems, as well as complex hierarchies and data linkages.
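To make the transform, link, and match steps described under Variety and Complexity more concrete, here is a minimal Python sketch. It is not taken from any of the sources cited in this essay: the two data sources, their field names and values, and the helper functions `parse_ticket` and `tickets_per_customer` are all hypothetical and invented purely for illustration.

```python
from dataclasses import dataclass

# Hypothetical records from two systems that describe the same customers
# in different shapes. All names and values are invented for illustration.
CRM_ROWS = [
    {"customer_id": "C001", "name": "Ada Park", "signup": "2013-04-02"},
    {"customer_id": "C002", "name": "Ben Ortiz", "signup": "2014-01-15"},
]
SUPPORT_LOG = [
    "2014-03-01|c001|Login problem, resolved",
    "2014-03-09|C002|Billing question",
    "2014-03-10|c001|Feature request",
]


@dataclass
class Ticket:
    date: str
    customer_id: str
    note: str


def parse_ticket(line: str) -> Ticket:
    """Transform one semi-structured log line into a structured record."""
    date, raw_id, note = line.split("|", 2)
    # Normalize the ID so that "c001" and "C001" refer to the same customer.
    return Ticket(date=date, customer_id=raw_id.strip().upper(), note=note.strip())


def tickets_per_customer(crm_rows, log_lines):
    """Link the two sources on the normalized customer ID and count tickets."""
    counts = {row["customer_id"]: 0 for row in crm_rows}
    for ticket in map(parse_ticket, log_lines):
        if ticket.customer_id in counts:  # match step: skip unknown IDs
            counts[ticket.customer_id] += 1
    return counts


if __name__ == "__main__":
    print(tickets_per_customer(CRM_ROWS, SUPPORT_LOG))
```

Even in this toy case, the log lines only become comparable with the customer table after they are parsed and their identifiers normalized, which is the kind of work the complexity dimension refers to.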

8 1 EB = 1000^6 bytes = 10^18 bytes = 1,000,000,000,000,000,000 B = 1,000 petabytes = 1 million terabytes = 1 billion gigabytes.

9 Big Data: The Management Revolution
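The volume figures quoted under Volume can be checked with simple arithmetic. The short Python sketch below converts 2.5 exabytes into petabytes and shows what doubling every 40 months would imply if the rate stayed constant, which is an assumption made here for illustration only. The function names are hypothetical.

```python
# Figures quoted in the essay (as of 2012): about 2.5 exabytes per day,
# doubling roughly every 40 months. Everything else is illustrative.
DAILY_EXABYTES_2012 = 2.5
DOUBLING_MONTHS = 40

BYTES_PER_EXABYTE = 1000 ** 6   # 10**18 bytes (decimal units)
BYTES_PER_PETABYTE = 1000 ** 5  # 10**15 bytes


def projected_daily_exabytes(months_after_2012: float) -> float:
    """Extrapolate daily volume, ASSUMING the doubling rate stays constant."""
    return DAILY_EXABYTES_2012 * 2 ** (months_after_2012 / DOUBLING_MONTHS)


def annual_growth_factor() -> float:
    """Growth factor per 12 months implied by doubling every 40 months."""
    return 2 ** (12 / DOUBLING_MONTHS)


if __name__ == "__main__":
    daily_bytes = DAILY_EXABYTES_2012 * BYTES_PER_EXABYTE
    print(f"2.5 EB = {daily_bytes / BYTES_PER_PETABYTE:,.0f} PB per day")
    print(f"Implied annual growth factor: {annual_growth_factor():.3f}")
    for months in (0, 40, 80, 120):
        print(months, "months:", projected_daily_exabytes(months), "EB per day")
```

Doubling every 40 months corresponds to growth of roughly 23% per year, so under this assumption the daily volume would reach 5 exabytes 40 months after 2012.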


A data-friendly environment is hard to build. An extreme in any of the dimensions above, especially when two or more occur simultaneously, requires advanced technology and high costs. But it is important for companies to realize that not all data is relevant. Companies must learn to separate the useless data from the useful, and focus on the data that really matters to them.

According to a recent study, companies that identified themselves as data-driven performed better on objective measures of financial and operational results¹⁰. On average, companies in the top third of their industry that characterized themselves as data-driven were 5% more productive and 6% more profitable than their competitors¹¹. The evidence that data-driven companies perform better remained statistically significant even after the study accounted for the contributions of labor, assets, purchased services, and traditional IT infrastructure.

It is clear that data-driven companies perform better in many ways than traditional companies, but why? How do they exploit big data? What does it mean to be data-driven? How do data-driven companies operate differently from traditional ones? These companies use different types of data in varied situations whenever possible, and use that data to develop a better understanding of their world and its condition. Hence, they are able to deal with uncertainty reasonably well, because they can draw predicted outcomes from the data to support their decisions. Since data is invaluable when it comes to decision-making, these companies recognize the importance of high-quality data and invest in improving the quality of the data they create and receive. They also have experienced researchers and data scientists working to improve the data and data models. Furthermore, they realize that making a decision is only one step in a process, and they make managerial decisions at the lowest level possible; executives do not need to weigh in on every decision.

10 Big Data: The Management Revolution

11 Big Data: The Management Revolution

Moreover, they strive to incorporate new skills and to bring new data and data technologies (e.g., big data, predictive analytics, and metadata management) into their companies. They learn from their errors, which also become useful data that can further improve their data analytics and data models and prevent future mistakes.

Again, data-driven companies must place high demands on their data and data sources, which requires them to invest in high-quality data and cultivate data sources they can trust¹². Timely data is imperative, and companies must also be prepared to handle data as it becomes available. Analytics based on high-quality data has fewer uncertainties, and its variations are easier to understand. Furthermore, high-quality data makes it easier for others to follow the decision-maker’s logic.

Every decision made brings in more data, including data on the decision itself. As a result, data-driven companies must continuously re-evaluate their models and data along the way. They can react quickly to each decision, and reverse course when the data suggests that a decision is wrong. They must realize that relying on static data is not sustainable, and that they need to learn as they go.

Lastly, in this kind of data-driven environment, companies must pay attention to where their data comes from, who shares the same data, how it is being used, with whom it is being shared, whether they have access to the most valuable information, and how others are using the same data they have access to¹³. As we approach the next level of analytics, the competition for the right data will become just as important as competition with data.

12 Are You Data Driven? Take a Hard Look in the Mirror

The technological aspect of big data can be challenging for many companies, especially those that are just entering the new world of analytics, but the managerial challenge of finding the right corporate culture for it is greater. While some observers recognize the importance of company culture alongside advanced data analytics, others are skeptical.

However, “The Need for Culture” provides evidence that an advanced analytics culture is the most important factor in data analysis, ahead of other factors such as data management technology and skills¹⁵. Fundamentally, organizational culture is what makes data-driven companies competitive, and it makes the difference between competitive parity and competitive advantage in the market. Nevertheless, a data analytics culture rests on the management processes, technologies, and overall skills of the company. Analytics culture, along with these factors, forms a company’s “Analytics Capabilities”.

13 The Analytics Mandate

14 The Analytics Mandate

15 The Analytics Mandate

An analytics culture of course differs from the culture of a typical company. It integrates information management and business analytics to formulate a strategy. Peers use data collaboratively across the company’s lines of business, and promotion is personalized based on data analytics. Managers planning to invest in analytics technology must find and develop talent and skills among existing employees before hiring outsiders. There is also constant pressure from senior management to become more data-driven and analytical.

Within such a company, data is treated as a core asset, and analytics is a top-down mandate driven by executives¹⁶. Analytics provides insights that guide the future strategy of the company, and data analysis tends to outweigh even extensive management experience when executives address key business issues. The company is open to new ideas and approaches challenges with the data at hand. As a result of this Big Data culture, analytics changes the way business is conducted and can cause a power shift in the organization¹⁷.

One of the central aspects of big data is its impact on how decisions are made and who gets to make them¹⁸. When data is expensive to obtain, scarce, or not available in digital form, it is reasonable to let well-paid top executives make managerial decisions. They do so on the basis of the experience they have built up and the patterns and relationships they have observed and internalized at the company over the years. Some people call this “intuition”, while others call it “gut feeling”. Those executives start by laying out their opinions on what is going to happen or how well something will work, and lower-level staff formulate plans of action based on those opinions. For particularly important decisions, the decision-makers are either high up in the company or expensive outside consultants hired to deal with the issue. Despite the torrent of data available nowadays, even many data-driven companies still leave many decisions to the “HiPPO”, the highest-paid person’s opinion¹⁹.

16 The Analytics Mandate

17 The Analytics Mandate

18 Big Data: The Management Revolution

Numerous executives are genuinely data-driven and willing to override their intuition based on data analytics, but it is surprising how often the two conflict. It can be problematic when a company’s operation relies too much on experience and intuition and not enough on data. This can jeopardize the operation, because quantitative figures backed by proper analysis and models tend to be right most of the time. Even when they are wrong, it is possible to pull the plug promptly.

With Analytics 3.0, both executive culture and company culture began to change. Most importantly, the thinking process has changed. The first question in a data-driven company’s problem solving is no longer “What do we think?” but “What do we know?” This requires a shift away from acting solely on the intuition of HiPPOs. More questions follow, such as “Where did the data come from?”, “What kinds of analyses were conducted?”, or “How confident are we in the results?”²⁰ HiPPOs must allow themselves to be overruled by data. A company should not hesitate to act on the data just because it has disproved a senior executive’s hunch. Big data will surely result in a shift in power and in the role or domain of experts. Top executives will be valued not so much for their intuition or HiPPO-style answers as for knowing what questions to ask.

19 Big Data: The Management Revolution

20 Big Data: The Management Revolution

Conclusion

It is clear that data-driven decisions tend to be better than others. Whether people like it or not, it is time to embrace the changes that advanced data analytics bring. The decision-making paradigms of companies, regardless of size or industry, are shifting, and company cultures are changing as well. This is not a time for companies to pretend to be data-driven, or to pretend to be more data-driven than they are. As we approach the pinnacle of Analytics 3.0, it is time to adapt to these changes in order to perform better.


Bibliography

1. Davenport, Thomas H. "Analytics 3.0." Harvard Business Review. Harvard Business Review, Dec. 2013. Web. 22 Oct. 2014.

2. Kiron, David, Pamela Kirk Prentice, and Renee Boucher Ferguson. The Analytics Mandate. Rep. Cambridge: Massachusetts Institute of Technology, 2014. Print.

3. McAfee, Andrew, and Erik Brynjolfsson. "Big Data: The Management Revolution." Harvard Business Review. Harvard Business Review, Oct. 2012. Web. 04 Mar. 2014. <http://hbr.org/2012/10/big-data-the-management-revolution/ar/1>.

4. SAS. Big Data Meets Big Data Analytics. Publication. N.p.: n.p., n.d. Print.

5. Herrin, Angelia. "Analytics 3.0: Measurable Business Impact From Analytics & Big Data." Harvard Business Review. Harvard Business Review, 11 Nov. 2013. Web. 04 Mar. 2014. <http://blogs.hbr.org/2013/11/analytics-3-0-measurable-business-impact-from-analytics-big-data/>.

6. Redman, Thomas C. "Are You Data Driven? Take a Hard Look in the Mirror." Harvard Business Review. Harvard Business Review, 11 July 2013. Web. 22 Oct. 2014. <http://blogs.hbr.org/2013/07/are-you-data-driven-take-a-har/>.
