
3rd-Party Data Management Strategy Guide

Discover New Ways to Maximize the Value of 3rd-Party Data


“92 percent of data analytics professionals said their firms needed to increase use of external data sources.”

– Deloitte Insights

© Harbr Group Limited 2020. All rights reserved.


Introduction

For almost every data-driven organization, third-party data has become a critical asset. It provides much-needed context to support the use of internal data and derive value.

While organizations inevitably amass a data footprint that provides some insight into their direct sphere of influence, third-party data is required to provide additional perspective and enrichment. No matter how large a company is, its own data footprint is a tiny proportion of the footprint available across all the organizations in the world. Understanding a sector, market or particular phenomenon will often require data from many sources, including the organization’s own.

Taking a topical example, how many organizations have been able to adequately assess the impact of COVID-19 by simply looking at their own data? They may have an insight into what their employees, existing customers, and direct suppliers are doing, but understanding how a given community, supply chain, sector, or jurisdiction has been affected requires more data. Ironically, the speed of COVID-19 and the sudden desire for data to understand and contextualize it unearthed a general failure on both the supply and demand side to quickly exchange and collaborate on data.

The ability to exchange and collaborate on data within and between large organizations has become a core competency for every data-driven organization. This need is being compounded by the need to fuel data-intensive technologies, such as machine learning, intelligent automation, and artificial intelligence. Concurrently, the supply of new data has been accelerating dramatically in line with digital transformation, something that has itself accelerated dramatically over the last six months.

Despite this, many organizations still struggle to optimize their interaction with third-party data. Sourcing is skewed to established providers that can be challenging to navigate due to their size and scale. Slow, manual assessment processes based on samples lead to imperfect results and imperfect decisions. Laborious contract and pricing negotiations bring little or no repeatability. Organizations invest a tremendous amount of effort refining and integrating data to deliver value, with little or no re-use of that work.

It’s time for large organizations to take a fresh look at their capability to source, assess, acquire and use third-party data. Are their people, processes and technologies sufficiently prepared for the speed at which the supply of and demand for data are changing? Processes will need to be highly scalable. Reliance on manual work will need to fall. Duplication must be avoided.

Third-party data underpins much of an organization’s capability, and the focus on it is only likely to increase in line with expectations about the ability to extract value from data. The following strategies offer practical guidance for those looking to gain a competitive advantage, whether through general improvements or by pioneering a fundamentally different approach. The five phases associated with acquiring and extracting value from data are: Discovery, Assessment, Acquisition, Adaptation and Use.

Figure: the five phases (Discovery, Assessment, Acquisition, Adaptation, Use).


Discovery


Discovery is critical because it sets the scope for what will be possible. If the scope of your inquiry is limited, your potential outcomes are too. That suggests going very wide and deep to discover data that’s optimal for your use case – ideally multiple use cases – but that’s difficult and time-consuming. How do you know you’ve investigated far and wide enough? How do you know whether what’s being offered is relevant and useful? What have people in your organization used for similar use cases? What’s the cost of a given option? These are all difficult questions to answer, and more so when there is an urgent business need. As a result, discovery can be overlooked and rushed, resulting in poor outcomes.

Historically, discovery relied on marketing by data vendors and the use of data brokers, but it has evolved in the last decade. After a slew of failed data marketplaces, there was an explosion in ‘alternative data’ driven by financial investment managers seeking exclusive insights to compete in capital markets and by data owners aiming to monetize their data. Companies like Quandl (acquired by Nasdaq), EagleAlpha, and BattleFin supported this activity and introduced new concepts like collaboration, buyer-seller networking, and try-before-you-buy. While this led to new challenges, such as how to adequately assess and integrate a bewildering range of options, it provides an insight into the approaches that can be taken for general third-party data discovery.

So, what would ‘great’ discovery look like for a large organization?

Figure (labels recovered from the original graphic): six numbered items: Ingest any data from any source; Assign data ownership and access; Publish data products in the store; Re-publish engineered data products; Export and maintain automated data pipelines; Create secure collaborative work spaces. Three further labels: Granular Control, Custom Branded, Enterprise Security.


• Make it easy. One option is to use a data hunter/scout to help teams find what they’re looking for. This is a specialized role with extensive experience and knowledge of the industry and could be either an employee or an outsourced role. Another option is to create a marketplace or catalog of what third-party data is available. This creates an intuitive experience for the consumer and lays the foundation for other benefits such as accelerating assessment and licensing.

• Compare apples with apples. With limited vocabulary to describe data and high inherent complexity, it’s crucial to create a yardstick to compare third-party data. There are no industry standards for this, so you’ll need your own. There are two ways of doing this: the first is to establish your own spec and invite providers to map to it; the second is to consistently profile data and then map it to your needs. Whichever approach you choose, your teams will be able to quickly understand what data is potentially useful, and this metadata can easily be recycled to support future discovery efforts.

• Crowd-source. Capturing knowledge of what data has previously been acquired and rejected by people in your organization and what adequately satisfied a given use case provides a significant advantage. While likely to be distributed across the organization, this knowledge is highly valuable in helping people rapidly discover third-party data they can more easily trust. Making this available within your marketplace/catalog not only exposes it to the widest audience but helps to drive active contributions.

• Build data-driven estimates. Maintain a record of what third-party data has cost for the various use cases, so teams can gauge the likely cost of data and tackle the market accordingly. Data has wildly different value propositions depending on the type and scope of the use case. Data vendors understand this and price accordingly to best serve the various markets. Understanding what is ‘fair market value’ for your organization during the discovery phase avoids issues later on.

Discovery is the first phase of acquiring and extracting value from data and potentially a great place for any data-driven enterprise to start focusing. Once discovery is complete and a set of options identified, the task of assessing the data begins. Not only can this be extremely time-consuming, but it can also easily lead to flawed decision-making.
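To make the “compare apples with apples” strategy above more concrete, the following is a minimal sketch of the second approach: profiling every vendor sample in the same way and then mapping the profile to an in-house spec. Everything in it is an illustrative assumption rather than part of the guide: the field names, the `SPEC` dictionary, the 90% fill-rate threshold, and the two tiny vendor samples are hypothetical.

```python
# Hypothetical in-house spec: field name -> expected coarse type.
SPEC = {"company_id": "text", "revenue": "number", "country": "text"}


def _infer_type(values):
    """Infer a coarse type label ("number", "text", "empty") from raw values."""
    non_empty = [v for v in values if v not in ("", None)]
    if not non_empty:
        return "empty"
    try:
        for v in non_empty:
            float(v)
        return "number"
    except (TypeError, ValueError):
        return "text"


def profile(rows):
    """Profile a sample given as a list of dicts (one dict per record)."""
    columns = sorted({key for row in rows for key in row})
    result = {}
    for col in columns:
        values = [row.get(col) for row in rows]
        filled = [v for v in values if v not in ("", None)]
        result[col] = {
            "type": _infer_type(values),
            "fill_rate": len(filled) / len(rows) if rows else 0.0,
            "distinct": len(set(filled)),
        }
    return result


def compare_to_spec(profile_result, spec, min_fill_rate=0.9):
    """List the gaps between a sample profile and the in-house spec."""
    gaps = []
    for field, expected_type in spec.items():
        col = profile_result.get(field)
        if col is None:
            gaps.append(f"{field}: missing")
        elif col["type"] != expected_type:
            gaps.append(f"{field}: expected {expected_type}, found {col['type']}")
        elif col["fill_rate"] < min_fill_rate:
            gaps.append(
                f"{field}: fill rate {col['fill_rate']:.0%} below {min_fill_rate:.0%}"
            )
    return gaps


# Two made-up vendor samples, profiled with the same yardstick.
vendor_a = [
    {"company_id": "A1", "revenue": "120.5", "country": "GB"},
    {"company_id": "A2", "revenue": "98", "country": ""},
]
vendor_b = [
    {"company_id": "B1", "revenue": "n/a"},
    {"company_id": "B2", "revenue": "n/a"},
]

gaps_a = compare_to_spec(profile(vendor_a), SPEC)
gaps_b = compare_to_spec(profile(vendor_b), SPEC)
print("Vendor A gaps:", gaps_a)
print("Vendor B gaps:", gaps_b)
```

Because each sample is reduced to the same small profile, the output can be stored alongside the catalog entry and reused in later discovery efforts instead of being recomputed.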


Assessment

While Discovery is critical to get right because everything follows from it, the process of assessing even a small number of different data products can be extremely time-consuming and lead to flawed decision-making. In an ideal world, you would have two near-identical products with all the metadata necessary to easily compare them and understand their differences. You would also have a clear and consistent understanding of the nuances of the data, such as how certain fields had been calculated, and instant access to the tools and infrastructure necessary to run your analysis. In the real world that rarely happens.

A common situation is to have multiple potential data products from a range of providers. Each will provide a sample that showcases their data, but rarely the whole corpus. The samples will not necessarily overlap, and will likely have different schemas and potentially different formats. They will be sent via a range of mechanisms including upload/download, email, API, and cloud bucket syncs. None of the data products are likely to be ready to combine with your proprietary data.

You wanted data and you ended up with a large, time-consuming data engineering/science project just to work out what data is best for your needs. Getting this wrong has potentially serious consequences, but the cards are stacked against you. This appears to be the most intractable part of acquiring third-party data, but it doesn’t have to be (and by the way, the data vendors are as frustrated as you are!). Here’s how to get it right:

• Rinse and repeat. Data acquisition is not an ad hoc task. Any work done here (and work will always need to be done here) should be recyclable. When acquiring samples, ensure you can maintain access to them for future needs. Document your findings, so they can be reused. Share the code that converted the sample into something that could be joined to your data, or share the output that was created. Make sure that no one has to do the same work again unless there is a specific need.

• Ask for more. In the same way that sharing tables alone doesn’t deliver value, providing a sample of a table creates overhead for the consumer. Ask for a data dictionary, structural and content metadata, details of historic changes/versions, the typical volume of deltas, and anything else that might be relevant to your use case. The more information you can capture, the better the decision.

• Find the middle ground. A collaborative data exchange provides a neutral territory where you can work collaboratively on data with your third-party suppliers. This means they can share the full corpus and all the supporting assets while maintaining ownership and visibility, avoiding the problems associated with samples. You can also work together on the data while you’re trying to assess it. They know the data, you know the use case, so working together accelerates progress and creates better outcomes.

Assessing data is one of the most challenging steps and can seem overwhelming due to the number of parties involved, the complexity of the task, and the impact of the decision. This area potentially requires the most focus for data-driven enterprises if they are to significantly enhance how they interact with third-party data. Once addressed, the focus can shift to the process of acquiring it, which can be a complex dance between Sales and Procurement with Legal and IT in supporting roles.


Acquisition

The acquisition phase begins with a hand-off. The commercial negotiations around pricing and terms, coupled with the creation of legal documents, require very different people and skills from the analytical and technical work in the stages before and after. As a result, it is easy for this stage to lack integration, resulting in delays and leading to issues that can impact the actual data being acquired.

Challenges include defining precisely what is required when, in most cases, only a sample has been accessed. Typically the price of a data product will vary depending on the scope of what you consume, so it can easily become a bargaining chip. Use cases also affect the price, so this is often the focus of much attention and defined tightly, sometimes too tightly, to prevent unauthorized usage. Ironically this limits the potential for data to be trialed and used more widely within an organization as part of an extensible agreement. Due to the length of the sales cycle (at this stage there will have been months of elapsed time and significant investment for both sides), contracts are typically long, ranging from 12 to 36 months. This is a significant commitment, particularly if you have only assessed a sample, leading to more complexity and increasing the overall risk.

This is an area that feels intractable, like the assessment phase, because it is built on widely accepted business processes. Unlike the assessment phase, there is no option to shift the paradigm by using a technology like a collaborative data exchange, but there are opportunities for improvement.

• Clearly drawn lines. An output of the assessment phase should be a clearly articulated scope of what data is required, to frame the negotiation. This is easier if a full-corpus trial has been enabled, but even after a flawed, sample-based assessment it’s helpful. The scope should also include format, frequency, method of transfer, and SLAs. In a collaborative data exchange, the desired output can actually be created and productized, which is the most precise and least intensive way of doing this.

• Plan for the future. It is in the interest of both parties for new use cases of the data product to be easier to establish than the first. So consider adding use cases such as demos, (full-volume) trials, and ‘limited’ innovation in addition to the specific use case that is understood today.

• Collaborate effectively. At this stage knowledge and expertise are dispersed across multiple people, departments, and organizations with varying skills and technical understanding. If you can collaborate effectively, it will help to drive faster, better outcomes that enable the use case rather than detract from it.

Acquiring data involves a significant amount of process that is widely dispersed and challenging to coordinate. By understanding this, being prepared, and enabling future needs, better outcomes can be achieved. Once the data is acquired, attention will quickly shift to adapting it for use.


Adaptation

With discovery, assessment and acquisition completed, the adaptation phase begins. This marks a return to more technically oriented people, who will be actively working with the data, but also technicians supporting data movement. Depending on how well the previous phases were managed and whether or not a technology solution such as a collaborative data exchange is used to manage the end-to-end process, there will be dramatically different challenges. For many, this may be the first time they access the full data product, quickly followed by the first time they experience an update. While that is happening, work is underway to make the data product fit the specific use case.

There are myriad issues that can arise here: format conversion, volume of data, timeliness of updates, volume of change, and cleanliness, few of which can be preemptively managed if a static sample has been used during the assessment phase. If insufficient assessment work has been done, you can also experience issues with matching/joining to your data, missing values, inaccurate data elements, and various other suitability problems. Some of these issues can be so severe that the data may simply not work for the intended use case. There is also the need to move the data, which can involve many different technologies and (often manual) ways of managing them, generating administrative and technical overhead that can cause weeks of delay.

Unlike some other phases, there is a lot that can be done to improve the situation, but much of that opportunity resides within the Discovery and Assessment phases. However, if you find yourself in a painful situation, here are some options:

• Build bridges. If you start struggling with the data, remember that the provider should know their data best and their other customers may have experienced similar issues. Try to avoid using calls or emails to diagnose and resolve issues. Instead, find a way for both parties to get hands-on with the data in a secure environment. This will dramatically increase the likelihood of a successful outcome, and the data provider may be able to fix issues at source rather than you fixing them in isolation.

• Share and share alike. If you’re having problems or just making adaptations, it is likely other users of that data product in your organization will have the same experience. If you are able to share your work with others, it will significantly help your organization move faster. However, this only really works if the sharing happens in a technology that is highly accessible, not a point solution, and it works best if licensing is closely coupled with where the adaptations are being created.

• Modular engineering. Significant adaptations to a data product are best done in a modular fashion so that different aspects can be shared and forked. The basic work done on formatting, cleansing and filtering may be widely re-used, whereas the more use case-specific work will not. Building in a modular way will allow re-use of as much as possible.

• Robotic movements. Investing in the automation of data movement makes sense because data will always need to be moved unless your entire organization runs on a single database! This is a feature of collaborative data exchanges, which seek to remove the technical complexity and overhead by providing multiple mechanisms for moving data, eliminating costs and managing risks.

Adapting data is inevitable when trying to extract value from third-party data. Much of the pain can be avoided by fundamentally altering what happens at the discovery and assessment phases and by enabling your organization to better collaborate internally and externally. Once the data has been adapted it can be put to use, but how do you know if it’s working?


Use

Now, the ‘Use’ phase begins. This should be far less active than the previous phases. The challenges of finding and assessing data are a distant memory. The frustrations of acquisition and making data fit for purpose have been overcome. The data product is now being used (likely in a point solution) to deliver against the specific value proposition for which it was originally sought. Things seem fine, but is it really working? Did you make the right decision? What’s the return on investment? Did you think this far ahead when you started?

At this point, there is little you can do to change the outcome, good or bad. Without supporting technology like a collaborative data exchange, most organizations will have spent significant time and effort and will now be locked into a multi-year licensing agreement. If the data hasn’t worked out as hoped, there are few levers you can pull to change things. Whether the outcome is good or bad, you can spend some time understanding it to determine the impact of your efforts and gain valuable insight into how you might improve.

The return on investment for a given data product can be very clear or very ambiguous when you try to measure it scientifically. Understanding the total resource that went into the various phases of acquiring and extracting value from data can also be difficult. Here are some thoughts on what to look for:

• Quantifiable business value. Some use cases easily lend themselves to measurement as they are directly aligned to a measurable outcome that can be tested with or without a given data product. In many circumstances, the relationship is ambiguous, so precise estimates can be elusive. Direct end-user feedback, A/B testing, and stack-ranking alternatives are good options to explore.

• Total cost of ownership. Third-party data costs more than the price of the license. When you assess the cost, you need to think about people’s time, the tools and technologies used, and the storage and processing necessary to extract value. You also need to consider opportunity costs. What if you had not bought the data? What if you had got the data faster? Thinking about costs helps you examine your decision-making and third-party data capabilities.

• False economies. Your technology project will have a run-rate. If a lack of data leads to expensive resources sitting idle, you may be suffering from false economies. Was there ‘analysis paralysis’ during the assessment phase? Did drawn-out negotiations on licensing costs really generate a better financial outcome?

• Future returns. What have you done that can be reused by your organization? How much of your hard work can be shared by others? Can you syndicate costs efficiently if there is new demand for the product you are using? Will you be able to realize economies of scale? Regardless of your outcome, you can help others avoid mistakes and have a better chance of success in what can often be a high-cost, high-risk endeavor.

Effective strategies for maximizing the value of third-party data require a collaborative, organization-wide mindset. To be truly effective they also need enabling technologies like collaborative data exchanges to shift the current paradigm. Most of all they require attention and focus, without which nothing will change and many organizations will find themselves very far behind where they hoped to be. Data is only valuable when it’s used, and getting to that point as efficiently as possible should be a business imperative for every large data-driven organization.
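The “Total cost of ownership” and “False economies” points above boil down to simple arithmetic, which the following sketch illustrates. It is only one possible way to structure the calculation: the cost categories follow the text (license, people’s time, tools, storage and processing, idle run-rate), but the class name, field names and every figure are hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class DataProductCost:
    """Hypothetical cost record for one third-party data product (one year)."""

    license_fee: float          # price of the license
    people_hours: float         # hours spent across all five phases
    hourly_rate: float          # blended cost of an hour of people's time
    tooling: float              # tools and technologies used
    storage_and_compute: float  # storage and processing to extract value
    idle_days: float = 0.0      # days a project waited for the data
    daily_run_rate: float = 0.0 # project run-rate per day while idle

    def total_cost_of_ownership(self) -> float:
        """License plus people, tooling, infrastructure and idle-time costs."""
        people = self.people_hours * self.hourly_rate
        idle = self.idle_days * self.daily_run_rate
        return (
            self.license_fee
            + people
            + self.tooling
            + self.storage_and_compute
            + idle
        )


def return_on_investment(value_delivered: float, cost: float) -> float:
    """ROI as a fraction of cost: (value - cost) / cost."""
    return (value_delivered - cost) / cost


# Made-up figures: the license is only part of the total.
cost = DataProductCost(
    license_fee=100_000.0,
    people_hours=400.0,
    hourly_rate=75.0,
    tooling=5_000.0,
    storage_and_compute=10_000.0,
    idle_days=20.0,
    daily_run_rate=2_000.0,
)
tco = cost.total_cost_of_ownership()
roi = return_on_investment(value_delivered=250_000.0, cost=tco)
print(f"TCO: {tco:,.0f}")
print(f"ROI: {roi:.1%}")
```

In this invented example the idle-time cost (20 days at a run-rate of 2,000 per day) exceeds the cost of people’s time, which is the kind of false economy the text warns about when assessment or negotiation drags on.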


About Harbr

Harbr’s collaborative data exchange platform unlocks the value of data through distributed data ownership, controlled data sharing and secure collaboration. Used by some of the world’s largest data vendors and most innovative companies, Harbr brings people, data and tools together to create successful commercial and enterprise data exchanges. Harbr takes care of the technology, so you can focus on what matters most: your business.

www.harbrdata.com
