EMPOWERING THE CITIZEN DATA SCIENTIST WITH SELF-SERVICE ADVANCED ANALYTICS

PROVIDING THE POWER TO ACCELERATE SPEED TO INSIGHT

TABLE OF CONTENTS

OVERVIEW

EXPANDING ROLE OF THE DATA SCIENTIST

RISE OF CITIZEN DATA SCIENTIST

BENEFITS OF THE CITIZEN DATA SCIENTIST

EMPOWERING THE CITIZEN DATA SCIENTIST

OVERCOMING RESISTANCE

SUMMARY

ABOUT LAVASTORM ANALYTICS

OVERVIEW

The growth of social media, the Internet of Things, and digital business is dramatically increasing the volume, velocity, and variety of data flowing into business organizations. This presents both opportunities and challenges as business analysts and IT groups struggle to keep up. Most businesses recognize the value of detailed information about customer and operational activities, but establishing and managing a process to convert the mass of data into actionable decision-making can be daunting.

According to Gartner, the trend is now toward Big Data Discovery – maximizing the organizational impact and value of Big Data investments by bringing business and IT closer together. Collecting and analyzing complex data by highly specialized technologists is no longer sufficient. The power to quickly gain insight requires putting these massive and diverse sets of data into business context. Organizations must involve those closest to the business to truly leverage the data for maximum strategic and tactical advantage.

Many businesses have a number of valuable super users/data crunchers already in the organization who are striving to address this business intelligence need. Gartner has relabeled these individuals “citizen data scientists”, and the role they perform can handle the analytics needs of most businesses. The key is identifying them and increasing their productivity by automating their repetitive menial tasks so they can reach the next level of effectiveness.

The citizen data scientist role is a hybrid position that bridges the gap between the high-level, technical data scientist and the more business-oriented data discovery analyst. Recognizing this new level of analyst acknowledges that more people in the organization are accessing data and helping to increase the positive impact that valuable information can have on the business. This is an increasingly important development for the many businesses that don’t yet employ high-level data scientists.

To truly empower citizen data scientists, however, businesses must enhance the role by providing them with more direct data access, as well as data manipulation and advanced analytics capabilities that have generally been the sole domain of IT staff and the high-level data scientist. Organizations must also provide appropriate tools that combine the power of IT’s toolbox with the ease of use that fosters broader participation in the concept of the citizen data scientist and the value it delivers. Self-service data preparation and advanced analytics are key to helping the citizen data scientist navigate the turbulent sea of data existing in today’s complex environment.

This paper will examine the role of the citizen data scientist and what businesses need to do to leverage this newly recognized, highly valuable, key resource.

EXPANDING ROLE OF THE DATA SCIENTIST

In recent years, the significance of the role of data scientist has grown exponentially in response to the need to cultivate information from the vast amount of data a business accumulates. The position requires analyzing data to solve business problems, while also identifying patterns and context to help uncover opportunities.

The data scientist must have an extensive background in computer science and coding languages, machine learning, statistical methods, and business. He or she must also be able to communicate with both business and IT leaders. Because of this highly specialized skill set, data scientists are hard to find or grow internally and are usually highly compensated. Many businesses are not yet in a position to hire these highly trained – and expensive – experts.

As the demand for data discovery has grown, so has the workload of data scientists. Their key responsibility has been designing and implementing processes for managing and analyzing complex data sets used for modeling, data mining, and research. Because of the increasing demand for analytics, the stage at which the data scientist must be involved has become a bottleneck in many organizations. The development of self-service data preparation and advanced analytics platforms has helped relieve that stress point, but that hasn’t been enough to free up the data scientist for the important work that needs to be done. There simply are not enough data scientists to meet the increasing need.

RISE OF CITIZEN DATA SCIENTIST

In the modern age, businesses are no longer content with simply forensically analyzing data from the past, but instead are also using available data to provide insight into the future. To achieve that goal, the data needs to be set up for predictive modeling – traditionally and solely the domain of the data scientist. Because of the backlog of initiatives, in many cases, the questions and problems requiring predictive modeling and analytics are being routed to traditional business analysts who aren’t fully prepared to handle the task. They understand there’s a problem and have a good grasp of the analytics process, but they’re not equipped with the right tools to fulfill the requests.

The answer is to provide these business analysts with tools that allow them to perform some of the analytics functions traditionally reserved for data scientists. Technology has advanced to the point where these analysts can be empowered to perform advanced analytic tasks that previously would have required the skill set and training of a data scientist. Armed with the appropriate self-service tools, these individuals are increasingly becoming “citizen data scientists” and are able to perform complex data-driven tasks on their own.

High-level data scientists are strong technically in programming and math, with a “big picture” understanding of the business. They aren’t likely to have the same in-depth knowledge of the departmental and functional components of the business as someone who has been involved in a particular area for some time. That’s where the citizen data scientists fit in. They are normally intimately involved with functional business areas like finance, sales, operations, and customer support, and have a greater understanding of the challenges being faced.

That’s why the most effective citizen data scientists are often found within the business functional areas. Current team members are best suited to add valuable business context to analytics initiatives, as well as prioritization based on potential business value. Individuals with solid business domain knowledge, familiarity with data challenges, and the willingness to embrace new methods of analysis are ideal candidates for this role.

BENEFITS OF BECOMING A CITIZEN DATA SCIENTIST

Empowering individuals with intuitive advanced and predictive analytics tools provides the business with additional, valuable resources to tackle the growing list of analytics projects and relieve current analysis bottlenecks. Embracing this model also reduces the load on data scientists, allowing them to focus on being more productive and driving innovation. Business analysts, even without a data science degree, can now offer new insights and uncover new questions that point to the next set of critical business issues to tackle.

With self-service tools, data analysis can be seen more as a team sport. Data prep can take up an inordinate amount of time for data scientists, but citizen data scientists can be particularly effective in handling the upfront data prep – gathering data, cleaning it, blending it, and applying business logic before bringing it forward for more advanced analysis and modeling by the data scientist. Data can’t just be randomly combined. It has to be joined together in a sensible way that conforms to how the business actually operates. The value of citizen data scientists is that they understand why it makes sense to connect and transform certain pieces of data in particular ways.
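The data prep steps described above (gather, clean, blend, apply business logic) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout, the field names, and the 100.0 "high value" threshold are hypothetical and do not come from any particular tool or from this paper.

```python
# Hypothetical source data: field names and values are illustrative assumptions.
customers = [
    {"customer_id": "C001", "region": "East"},
    {"customer_id": "C002", "region": "West"},
]
orders = [
    {"customer_id": " c001 ", "amount": "120.50"},  # messy key: spaces, lowercase
    {"customer_id": "C002", "amount": ""},          # missing amount
    {"customer_id": "C002", "amount": "40.00"},
    {"customer_id": "C999", "amount": "15.00"},     # no matching customer record
]


def clean_orders(rows):
    """Normalize customer keys and drop rows whose amount is missing."""
    cleaned = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip incomplete records
        cleaned.append({
            "customer_id": row["customer_id"].strip().upper(),
            "amount": float(amount),
        })
    return cleaned


def blend(order_rows, customer_rows, high_value_threshold=100.0):
    """Join orders to customers on customer_id, then apply a business rule."""
    by_id = {c["customer_id"]: c for c in customer_rows}
    blended = []
    for order in order_rows:
        customer = by_id.get(order["customer_id"])
        if customer is None:
            continue  # keep only orders that match a known customer
        blended.append({
            **order,
            "region": customer["region"],
            # Assumed business rule: flag orders at or above the threshold.
            "high_value": order["amount"] >= high_value_threshold,
        })
    return blended


prepared = blend(clean_orders(orders), customers)
```

The join key and the business rule are where domain knowledge enters: someone who knows how customer identifiers are actually recorded, and what counts as a high-value order, is the one who can decide how the data should be connected.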

There is always a risk of the data scientist operating as an island and going down a path of very complex, sophisticated analysis that might not be relevant, practical, or even usable for the intended business goal. Citizen data scientists, with their extensive knowledge of the business function, can bring the power of analysis to bear where it will do the most good. They know where the problems or opportunities might be and where the valuable information resides, which is not always readily apparent to the traditional data scientist.

The contextual knowledge of the citizen data scientist is vital. Data scientists can be focused on the numbers, but they’re not necessarily hearing the customer complaints or management pain points. This new relationship between data scientist and citizen data scientist is analogous to the dynamic between engineers and product managers. The engineers are technical, but not in tune with the way the product needs to be designed to meet customer needs. The product managers provide the critical context of what the product needs to accomplish.

Democratizing the ability to analyze data means the business can move forward without having to wait for a select few “data gurus” to run the necessary models and analytics. The citizen data scientist delivers much more insight than traditional business analysts have in the past. This enables organizations to relieve analytics bottlenecks by empowering citizen data scientists, rather than having to recruit and hire highly trained and expensive data scientists who are in short supply.

EMPOWERING THE CITIZEN DATA SCIENTIST

Businesses must try to accelerate the process of enabling the citizen data scientist to take on this expanded role. A major challenge for some organizations when attempting to develop a team of citizen data scientists is the lack of an appropriate training plan. Training citizen data scientists in the process and educating them in the use of emerging tools is critical to the success of the model.

One of the major considerations when developing the toolset to empower the citizen data scientist is ease of use. The traditional tools used by the data scientist require extensive training in various scripting languages to perform the necessary analytics and modeling. Most individuals within the organization do not have that depth of knowledge. They are, however, familiar and comfortable with point-and-click interfaces that need minimal configuration. Providing a tool that works through this type of interface allows the citizen data scientist to overcome the major hurdle of presumed technical knowledge. With the appropriate tools in hand, citizen data scientists can get to the level where they can extract meaning from data to solve business problems or take advantage of opportunities.

TRANSPARENCY AND TRACEABILITY ARE CRUCIAL FOR ANY COMPLIANCE GROUP TASKED WITH REPORTING TO GOVERNMENT BODIES.

For example, if a business is trying to figure out why customers are leaving, these tools allow the citizen data scientist to run a logistic regression on demographic variables to identify the customer attributes associated with a higher likelihood of leaving. That delivers actionable insight for changing business processes or designing promotions to retain customers. The next step in such an analysis may be to assess the profitability of these “at risk” customers to determine which ones are worth keeping.
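To make the churn example concrete, the following is a minimal sketch of a logistic regression fitted by plain gradient descent, using only the Python standard library. The customer records, the two demographic variables (age and renter status), and the learning settings are all hypothetical assumptions chosen for illustration. A self-service tool would normally hide these mechanics behind its interface.

```python
import math

# Hypothetical customer records: (age, is_renter, churned). Illustrative only.
records = [
    (25, 1, 1), (30, 1, 1), (28, 0, 1), (35, 1, 1),
    (45, 0, 0), (50, 0, 0), (55, 1, 0), (60, 0, 0),
]

# Rescale age so both features are on a similar scale (assumed preprocessing).
X = [[(age - 40) / 10, renter] for age, renter, _ in records]
y = [churned for _, _, churned in records]


def sigmoid(z):
    """Map a linear score to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    n, n_features = len(X), len(X[0])
    w, b = [0.0] * n_features, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n_features, 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j in range(n_features):
                grad_w[j] += err * xi[j]
            grad_b += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def churn_probability(w, b, age, is_renter):
    """Predicted probability that a customer with these attributes leaves."""
    x = [(age - 40) / 10, is_renter]
    return sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b)


w, b = fit_logistic(X, y)
p_young_renter = churn_probability(w, b, age=25, is_renter=1)
p_older_owner = churn_probability(w, b, age=60, is_renter=0)
```

The sign of each fitted weight indicates whether an attribute is associated with a higher or lower likelihood of leaving in this toy data. Scoring customers with `churn_probability` gives the "at risk" list that a profitability analysis could then rank.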

When doing a market basket analysis, you may want to figure out what types of things happen together that might suggest some other action. For example, if customers are buying speakers and televisions together, the data could reveal that they are likely to buy a state-of-the-art gaming system as well. That kind of actionable insight could feed into a purchasing recommendation system, for special offers or pricing changes.
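A market basket analysis of this kind usually comes down to a few ratios computed over transactions. The sketch below uses the standard support, confidence, and lift measures for an association rule. The baskets themselves are invented for illustration and are not data from this paper.

```python
# Hypothetical purchase baskets, one set of items per transaction.
baskets = [
    {"speakers", "television", "gaming system"},
    {"speakers", "television", "gaming system"},
    {"speakers", "television", "gaming system"},
    {"speakers", "television"},
    {"television"},
    {"speakers", "cables"},
    {"gaming system"},
    {"cables"},
]


def support(transactions, items):
    """Share of transactions that contain every item in `items`."""
    items = set(items)
    return sum(1 for t in transactions if items <= t) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    base = support(transactions, antecedent)
    if base == 0:
        return 0.0
    return support(transactions, set(antecedent) | set(consequent)) / base


def lift(transactions, antecedent, consequent):
    """How much more often the consequent appears with the antecedent than overall."""
    overall = support(transactions, consequent)
    if overall == 0:
        return 0.0
    return confidence(transactions, antecedent, consequent) / overall


rule_from = {"speakers", "television"}
rule_to = {"gaming system"}
rule_confidence = confidence(baskets, rule_from, rule_to)
rule_lift = lift(baskets, rule_from, rule_to)
```

In this toy data, three of the four baskets containing both speakers and a television also contain a gaming system, and the lift is above 1, which is the kind of signal a recommendation or special-offer rule could be built on.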

In addition to providing the right tools, there’s also a need to help the citizen data scientist gain sufficient knowledge to know how to find the right method in the right context. That requires a comprehensive education program to go along with the expanded toolset. This helps them determine whether to use a random forest or other classification method, for example. There are many alternatives and each is dramatically different with specific use cases, but without the right information, it’s difficult to expect the citizen data scientist to know which one is the most appropriate.

OVERCOMING RESISTANCE

There might be concern that data scientists will object to other analysts encroaching on their territory, but, in fact, they will likely embrace the move to citizen data scientists as a way to help handle the growing workload. By empowering the citizen data scientist, organizations are simply dividing the analytics labor. The data scientist is currently presented with a set of repetitive needs that can become tedious over time. The data scientist persona is typically more interested in trying out new things and exploring different methods of analyzing data, with the goal of getting to the cutting edge of data science.

The citizen data scientist can take conventional business analysis to the next level and handle the regressions, clustering methods, and other analysis tasks that are deployed over and over. This frees up the data scientist to handle tasks that are less conventional and require more progressive thinking.

ONCE COMPANIES SATISFY COMPLIANCE REQUIREMENTS, THEY WILL START TO MINE THIS VALUABLE DATA FOR COMPETITIVE INSIGHTS.

SUMMARY

The rise of the citizen data scientist is coming from the recognition, elevation, and empowerment of the go-to data crunchers in the enterprise. As the volume of data continues to rise, businesses must adapt to take full advantage of the insights it offers. Centering the analytics process on a small group of highly technical data scientists is no longer an effective way to accomplish that goal. Getting more qualified people involved – especially those with intimate knowledge of the business and its customers – will be more beneficial in mining the valuable nuggets of insight that lie within the data.

Citizen data scientists can perform this critical function, but only if they are empowered with the right tools and processes. They need to be able to analyze the data without having extensive technical knowledge of the underlying IT applications and infrastructure. Self-service data prep and advanced analytics tools provide them with the ability to perform the crucial task of combining Big Data with issues and opportunities tightly focused on their organizations.

The democratization of data analytics, and the line-of-business-driven self-service functions that support it, are firmly taking hold and will continue to define the new business intelligence paradigm. With the proper tools and training, the citizen data scientist can be a key player in this evolution of data analysis.

ABOUT LAVASTORM ANALYTICS

Lavastorm accelerates and automates self-service data preparation and advanced analytics for the enterprise. Our software makes it easy to blend complex data from multiple sources to empower technical and non-technical users to quickly build governed analytic applications, as well as harness our robust predictive analytics capabilities. Lavastorm’s intuitive data flows, collaboration tools and extensive pre-built libraries enable our customers to deliver the fastest, most accurate insights to the business. The company’s proven technology is used by thousands of analysts at leading global companies to solve their complex, data-driven problems where speed and accuracy are mission-critical to maintain competitive advantage.

We empower business professionals and analysts with the fastest, most accurate way to discover and transform insights into business improvements, while providing IT with control over data governance. Our visual, discovery-based data flows enable organizations to reduce analytic development time by up to 90%. The core strength of Lavastorm is its ability to easily blend complex data from multiple sources and then apply business rules to ensure data quality and handle exceptions. Businesses can then continuously run the visual analytic models they create, allowing the automation of various analytic processes, such as data cleansing and data quality. The end result is an analytics infrastructure that supports larger amounts of data, responds with more speed, adapts to greater change, and empowers a growing number of decision makers. It can be a crucial tool in the drive to empower the citizen data scientist and democratize data analytics.

For more information, please visit www.lavastorm.com.