Thursday, September 3, 2015

Before you spend on data analytics and big data

Big data is the answer, but can you please repeat the question?

The last decade has seen an explosion in the amount of digital data being created over the Internet. Companies find themselves with huge data sets on their hands. And of course this huge mine of data has kick-started big data and data analytics consulting, which remains a fast-growing service.

However, business users need to critically consider the following three key points before approving spending on big data or extensive data analytics projects:

Process maturity: For big data or data analytics to be useful, the underlying data sets first have to be accurate. This means the processes that generate the data need to be mature and to have been functioning properly for some time before a data analyst can help. Where multiple data sources are involved, thinking early about how they will link together, and about how duplicate, linked and supplementary data will be identified, would save a lot of pain later.
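
To make the linking question concrete, here is a minimal sketch in Python of the kind of check that can be run before any analytics work starts. Everything in it is an assumption made for illustration: the two sources (crm_records, billing_records), the choice of email as the shared key, and the helper names are hypothetical, and real data usually needs far more careful matching rules.

```python
from collections import Counter


def normalize_key(email):
    """Normalise an email so trivially different spellings match."""
    return email.strip().lower()


def find_duplicates(records, key="email"):
    """Return normalised keys that occur more than once within one source."""
    counts = Counter(normalize_key(r[key]) for r in records)
    return sorted(k for k, n in counts.items() if n > 1)


def link_sources(primary, secondary, key="email"):
    """Split keys into: present in both sources, only in primary, only in secondary."""
    a = {normalize_key(r[key]) for r in primary}
    b = {normalize_key(r[key]) for r in secondary}
    return {
        "linked": sorted(a & b),
        "primary_only": sorted(a - b),
        "secondary_only": sorted(b - a),
    }


# Hypothetical sample data, invented for this sketch.
crm_records = [
    {"email": "ann@example.com", "name": "Ann"},
    {"email": "Ann@Example.com ", "name": "Ann B."},
    {"email": "bob@example.com", "name": "Bob"},
]
billing_records = [
    {"email": "bob@example.com", "amount": 120},
    {"email": "cat@example.com", "amount": 75},
]

duplicates = find_duplicates(crm_records)
report = link_sources(crm_records, billing_records)
print(duplicates)
print(report)
```

If a report like this shows many duplicates or many records that exist in only one source, that is a sign the underlying processes need attention before an analytics budget is approved.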

Statistics: Crunching big data is not always the answer. Breaking the data into representative sets and applying statistical analysis can serve the same purpose in many instances. If anything, bulldozing through tonnes of data without understanding its distribution will most probably give you the wrong results.
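
As a rough illustration of the sampling idea, the Python sketch below estimates an average from a simple random sample and reports an approximate 95% margin of error, then compares it with the average over the full data set. The data is synthetic and the function name sample_mean_with_margin is made up for this example. It also assumes a simple random sample is representative, which may not hold for heavily skewed data, where a stratified sample would be the safer choice.

```python
import random
import statistics


def sample_mean_with_margin(values, sample_size, seed=0):
    """Estimate the mean from a simple random sample.

    Returns (mean, margin), where margin is an approximate 95% margin
    of error based on the normal approximation (1.96 standard errors).
    """
    rng = random.Random(seed)
    sample = rng.sample(values, sample_size)
    mean = statistics.mean(sample)
    std_err = statistics.stdev(sample) / sample_size ** 0.5
    return mean, 1.96 * std_err


# Synthetic "large" data set, invented for this sketch.
data_rng = random.Random(42)
population = [data_rng.gauss(100, 15) for _ in range(100_000)]

estimate, margin = sample_mean_with_margin(population, 2_000)
full_mean = statistics.mean(population)

print(f"sample estimate: {estimate:.2f} +/- {margin:.2f}")
print(f"full data mean:  {full_mean:.2f}")
```

Here the sample is 2% of the data. Whether the resulting margin of error is acceptable is a business decision, but it is a much cheaper question to answer than building out a full big data pipeline.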

Cost-benefit: Sure, there can be benefit in milking the data. But at what cost? With spending on a full-fledged big data project likely to reach 8 figures, has anybody done a cost-benefit analysis? Can we achieve the insights without necessarily taking the big data route (e.g. by breaking data into representative sets and applying statistical analysis)? Are the processes mature enough to yield accurate data? Do we have a data map, so that we at least understand where the data is generated and where it is stored? What in-house technical expertise do we have to deal with the demands of such a project?


Very few data sets would truly qualify as big data if they were first split or linked sensibly. And for data analytics to work, it is important that a company first has ample confidence in the accuracy of its data. These are simple lessons which, if remembered, would do a lot of good.