Many organizations have invested in the creation of a Hadoop cluster or other big data platform to aggregate the large amount of data generated by their core business, and now want to determine how this data can be leveraged to contribute significantly to earnings.
In general, uses of data fall into two categories: internal and external. Internal uses include those that could reduce costs or optimize some aspect of operations, such as marketing or the supply chain. External opportunities could include licensing data or providing an analytic service to organizations outside the company.
There are now a number of tools for working with data, covering not only collection and storage but also in-depth analysis. In particular, Machine Learning algorithms in tools like Spark can be trained to recognize patterns that would not be apparent to humans, and then apply what was learned to new data within milliseconds. By leveraging these algorithms, companies can unlock the value in their data to reduce operational costs, find new sources of revenue, or both.
Opportunities for monetization follow the same two categories: Internal, meaning those that provide value to the business itself, and External, meaning those that provide value to others. Applying data to improve the efficiency of operations, resulting in measurable cost savings for the business, is an example of Internal monetization. Creating an analysis that is sold or licensed to others in adjacent industries is an example of External monetization. Well-established use cases exist for Hadoop in both categories, including the following:
- Internal: Sales prediction, predictive maintenance, product or company sentiment analysis, customer classification and segmentation, production forecasting, targeted marketing, server log analysis, recommendation engines.
- External: Licensing of raw data, client analytic dashboards, market analysis, customer behavior insight.
Most of the examples above benefit from the application of Machine Learning techniques made possible by algorithms included with Hadoop and Spark. In fact, any use of data that includes a prediction of some kind is almost certainly a product of Machine Learning.
Examples from Others
- American Express: Amex provides excellent examples of both Internal and External monetization. They are currently responsible for some 25% of all credit card transactions worldwide, collecting data from the purchases of 107 million card members. In 2012 Amex deployed Hadoop with the purpose of improving their operations, starting with fraud detection, and they estimate the resulting savings at over $2 billion. Turning their focus outward, Amex then began to provide merchant customers with trend analysis in addition to the usual usage reporting statistics, improving merchant loyalty.
- Delta Air Lines: In addition to collecting the travel data of their customers (and in particular, members of their Medallion frequent flier program), Delta enriches that data with data provided by brokers like Experian to create a deeper understanding of the individual customer. A few years ago they also began collecting unstructured data from social media to determine how certain aspects of their operations (delays, number of empty seats on a flight, baggage fees) would affect customer sentiment, and to what extent that sentiment would influence revenue.
- Cisco: The nature of Cisco’s equipment sales business is that customers buy in small amounts as a steady state, but then “refresh” a large amount of their infrastructure all at once. Cisco wanted to know if there was a way to predict when these refreshes would happen and how much the customer would spend, so they employed Machine Learning techniques to analyze massive amounts of disparate data. They discovered that approximately 2-3 months before a major purchase, a customer’s IP address space would show up in Cisco’s web server logs at a much higher rate than usual. In other words, there was a direct correlation between a customer’s use of cisco.com and their upcoming spend. This allowed Cisco to predict a large purchase and engage the customer much sooner than their competitors.
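The web log signal in the Cisco example can be illustrated with a minimal Python sketch. This is not Cisco's actual method, which the source does not describe in detail; it simply compares each customer's recent weekly request count against that customer's own historical baseline. The function name, the sample customers, the two-week window and the threshold of 3 standard deviations are all hypothetical choices made for illustration.

```python
from statistics import mean, pstdev


def flag_traffic_spikes(weekly_hits, recent_weeks=2, z_threshold=3.0):
    """Return customers whose recent web traffic is far above their own baseline.

    weekly_hits maps a customer name to a list of weekly request counts,
    oldest first. All names and thresholds are illustrative assumptions.
    """
    flagged = {}
    for customer, counts in weekly_hits.items():
        # Split history into a baseline period and the most recent weeks.
        baseline, recent = counts[:-recent_weeks], counts[-recent_weeks:]
        if len(baseline) < 2:
            continue  # not enough history to define "usual" traffic
        base_mean = mean(baseline)
        base_std = pstdev(baseline) or 1.0  # fall back to 1.0 for a flat history
        # How many baseline standard deviations above normal is recent traffic?
        z_score = (mean(recent) - base_mean) / base_std
        if z_score >= z_threshold:
            flagged[customer] = round(z_score, 1)
    return flagged


# Hypothetical weekly hit counts per customer IP address space.
hits = {
    "acme": [100, 110, 90, 105, 95, 100, 240, 260],
    "globex": [50, 55, 45, 52, 48, 50, 51, 49],
}
print(flag_traffic_spikes(hits))
```

Here only the customer whose traffic jumps sharply in the last two weeks is flagged. A production system would more likely feed a signal like this into a trained model alongside other data, rather than rely on a fixed threshold.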
Common Obstacles to Internal Monetization
Internal monetization is all about utilizing data to make better decisions, and this requires getting the right data into the hands of decision makers at the right time. Some of the more common issues encountered include the following:
- Skills Gap: Many organizations simply don’t have enough personnel with the skills to create data products. Research by TDWI indicates that 46 percent of businesses admit to having inadequate staffing for analytics, while Gartner projects that demand for data scientists and others with data skills will continue to soar for years to come, with industries struggling to fill the need.
- Lack of Executive Sponsorship: Implementing pervasive data use usually involves a culture change. Gaining internal value from data assets requires a commitment from the top, particularly if the organization is not used to managing data as an asset.
- Data Silos and Turf Wars: Data is unfortunately not immune to the age-old impediment of fiefdoms, and control of data can be an issue for some organizations where individual business units are used to operating autonomously.
Common Obstacles to External Monetization
External monetization provides value to those outside the organization, and typically the licensing of raw data is not enough. Companies that succeed in external monetization wrap an extra layer of value around data. Companies seeking to benefit from external monetization typically must overcome a handful of common issues:
- Different Buyers: Even at existing customers, the buyer of data products is often not the same person who buys the company’s other products. New relationships must be formed, and trust and rapport developed.
- Lack of Brand Equity around Data: For established companies, particularly those with mainly physical products, establishing clout around data products may take time. In other cases, there is a natural connection between the core business and its data that resonates with potential buyers.
- Different Skill Set: Selling data products often requires a different sales skill set from a company’s other offerings.
- Intellectual Property Protection: Theft or improper use of Intellectual Property is a very real threat in an arena where customers may not even realize that what they are doing is damaging to the vendor.
- Conflicting Laws: Governments in various parts of the world treat privacy and data very differently, with Europe typically having the more restrictive laws while the US and Asian countries are more relaxed. Even in the US, however, healthcare data laws can be very restrictive, making the creation of data products difficult.
Hadoop and other big data tools are excellent for the collection and analysis of data at a scale previously impossible, but the collection of data is only half the story. While the simple reporting techniques of typical Business Intelligence applications can add value to data, it is Machine Learning that takes value creation to a much higher level and enables monetization at a larger scale. The ability to recognize and predict trends is a definite advantage to any organization, and those who can provide that benefit to themselves or as a service to others put themselves in a leadership position in their industry.