By Jen Sweeney
The hype surrounding big data is over, according to Gartner, and the practices the term describes are now considered essential components of business. Still, many organizations have yet to figure out how to make big data work for them.
Before businesses can truly become data-driven and innovative, they will need to tackle a number of challenges, strategic and otherwise. One issue in particular—ensuring data quality—may prove to be the most challenging.
That’s because, even with a plan in place, data is rarely in a crunchable state from the start. Data is inherently messy—it’s often inconsistent, incomplete and nonstandard. It is estimated that 80 percent of data analysis time is spent on cleaning up the mess.
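To make the cleanup burden concrete, here is a minimal sketch of what that work looks like in practice. The records, field names and alias table below are hypothetical, invented for illustration; real pipelines face the same three problems shown here—inconsistent values, mixed formats and missing fields—at far larger scale.

```python
from datetime import datetime

# Hypothetical raw customer records: stray whitespace, inconsistent
# state abbreviations, mixed date formats and a missing email --
# typical examples of "inconsistent, incomplete and nonstandard" data.
raw = [
    {"name": "  Alice Smith ", "state": "ny",   "signup": "2015-03-01", "email": "alice@example.com"},
    {"name": "Bob Jones",      "state": "N.Y.", "signup": "03/02/2015", "email": "bob@example.com"},
    {"name": "Carol White",    "state": "NY",   "signup": "2015-03-03", "email": ""},
]

STATE_ALIASES = {"ny": "NY", "n.y.": "NY", "new york": "NY"}
DATE_FORMATS = ("%Y-%m-%d", "%m/%d/%Y")

def parse_date(text):
    """Try each known date format; return an ISO date string or None."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    return None

def clean(records):
    cleaned = []
    for r in records:
        if not r.get("email"):  # drop incomplete records
            continue
        cleaned.append({
            "name": " ".join(r["name"].split()),  # collapse extra whitespace
            "state": STATE_ALIASES.get(r["state"].strip().lower(),
                                       r["state"].strip().upper()),
            "signup": parse_date(r["signup"]),
            "email": r["email"].lower(),
        })
    return cleaned

result = clean(raw)
```

After cleaning, the two complete records share one state code and one date format, and the record with no email has been dropped. Even this toy version needs per-field rules and an alias table—multiply that by hundreds of fields and sources, and the 80 percent figure becomes easy to believe.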
In an Experian Data Quality study released earlier this year, 83 percent of respondents in commercial companies said they believe inaccurate and incomplete customer or prospect data costs their organization money. In addition, businesses surveyed estimated that 23 percent of their revenue is wasted as a result of inaccurate or incomplete data—an increase over 2014’s 19 percent.
With the amount of data generated today, the cleanup problem grows even larger. The difference between today’s big data and the plain old data of the past is that today’s is not only bigger in volume but also faster in velocity and wider in variety (otherwise known as the “3 Vs of big data,” a framework for understanding and dealing with big data that’s attributed to data pioneer and Gartner Research analyst Douglas Laney). Cleanup, therefore, is more complicated.
Cleanup becomes even more challenging when the current data skills shortage and projections for the near future are factored in. Research conducted by the McKinsey Global Institute projects that by 2018, the United States could face a shortage of 140,000 to 190,000 people with advanced training in statistics and machine learning, plus 1.5 million managers and analysts who understand how to use the analysis of big data to make effective decisions. Universities around the world are responding to the demand by creating data science curricula, but not quickly enough to meet the need.
Organizations should not let an off-balance supply and demand for advanced data skills stifle their efforts. They can work toward their long-term goals with small but strategic steps. In a recent report, McKinsey Quarterly notes the increased investment and innovation in data analytics tools and approaches, and highlights a few that can have a more direct—and faster—impact.
- Targeted solutions from analytics-based software and service providers: Models are built by analytics specialists and are targeted to specific use cases. These tools have a clear business focus and can be implemented quickly, but will require user training.
- Self-service tools that “democratize” the use of analytics: These tools enable business users to extract value from data without having to know any coding. Users can link data from multiple sources, apply predictive analytics and use visualization tools to define data exploration needs.
- Hands-on experience with the help of an expert: With guidance, business users can build their confidence using data and increase the role of data in their decision-making processes.
Extracting real value from data analytics requires substantial time and effort—from business users and management alike. (See this set of guidelines by MIT’s Erik Brynjolfsson and Andrew McAfee on big data’s management challenges.) After settling on a plan, organizations need to begin the long but critical process of changing company culture and building employees’ skills.
It’s no longer a question of whether data can deliver value, but of how. Formulating a plan that’s based on proven strategies and expert opinion—and that keeps the core issue of data quality front and center—can bring a company closer to making every decision a data-driven one.