Thursday, September 3, 2015

Before you spend on data analytics and big data

Big data is the answer, but can you please repeat the question?

The last decade has seen an explosion in the amount of digital data being created over the Internet. Companies find themselves with huge data sets on their hands. And of course this huge mine of data has kick-started the big data and data analytics consulting, which remains a fast growing service. 

However, business users need to critically consider the following 3 key points before approving the spend on big data/ extensive data analytics projects:

Process maturity: For big data or data analytics to be useful, underlying data sets have to first be accurate. This means the processes that lead to data generation need to be mature and functioning properly for some time before a data analyst can help. Where multiple data sources are involved, thinking about how they will link together/ identify duplicate data/ linked data/ supplementary data would save a lot of pain later.

Statistics: Crunching big data is not always the answer. Thinking about breaking the data into representative sets and applying statistical analysis can serve the same purpose in many instances. If anything, bulldozing through tonnes of data without understanding data distribution will most probably give you the wrong results.

Cost-benefit: Sure there can be benefit in milking the data. But at what cost? With spending for a full-fledged big data project likely to reach 8 figures, has anybody considered the cost-benefit analysis? Can we achieve the insights without necessarily taking the big data route (e.g. by breaking data into representative sets, statistical analysis)? Are the processes mature enough to yield accurate data? Do we have a data map, so that we at least understand where the data is generated and where it is stored? What is our in-house technical expertise to deal with the demands of such a project? 

Very few data sets would truly qualify as big data if they were first split or linked sensibly. And for data analytics to work, it is important that a company first has ample confidence in the accuracy of their data. Simple lessons which if remembered would do a lot of good.

Thursday, May 28, 2015

QlikView vs Tableau

Spent a few days playing around with QlikView. Here are my initial thoughts on how it stacks against Tableau.

Cool factor: Tableau charts feel more slick and modern compared to QlikView. The options provided are more focussed, and Tableau even suggests optimal visualisations based on data selection. The ability to share an interactive dashboard with users is common between the two. 
Overall winner: Tableau

Usability: Tableau seems to have done a better homework on Usability and visualisation. QlikView can do as much, but it is not as business user friendly: the sheer number of options in QlikView feel over-whelming to start with for a non-technical user. Consequently, you would spend more on training on QlikView and would need a more technical minded person to be able to build dashboards with it. 
Overall winner: Tableau

Ease of Data Transformation/Extraction/Access: Both need data in database format. However, at the first look it is not immediately obvious whether QlikView can integrate with the same range of data sources as Tableau. Business users would need a data analyst/ technical person to extract, transform, and load data in both, but more so in Qlikview. Tableau provides access to OLAP cubes while QlickView doesn't. 
Overall winner: Tableau

Raw Power: In terms of technical power, QlikView appears more technical and customisable than Tableau for a power user familiar with databases and SQL.
Overall winner: QlikView 

Pricing: Oveall complex pricing plans, but I understand that Tableau may be overall cheaper.
Overall winner: Tableau

Ease of exploration & training: QlikView provides an unlimited personal use edition and Tableau provides free public desktop application. There is good support for training.However, QlikView appears to have a more well defined learning path than Tableau. For example, I could find a well-defined certification track for QlikView, but not for Tableau (as on 28 May 2015).
Overall winnerQlikView

Summary: If you have a team who knows how to build visualisations and is reasonably strong technically, go for QlikView for the sheer power of the tool. You will have to spend some time training for it. If you are in a hurry and want to get to a good looking solution fast, Tableau is the answer. Personally, I am inclinded to overall favour Tableau of the two for its ease of use for the business user. 

The above review was done as a first time business user based on initial impressions by trialing the products over 14 days.

Friday, May 22, 2015

14 Days of Tableau

I recently trialed Tableau, a data visualisations tool used for Analytics. My initial impression as a novice user (based on loading data from an excel and doing a few online trainings) is that it is a fantastic tool for following scenarios:
  • We have database acceess, or want to analyse data coming directly off a known client system (SAP, Salesforce)
  • The data available is structured inline with database tables
  • We need to monitor KPIs over a period of time
  • To extract regular insights from predictable management information
  • It is a dedicated tool with sole purpose of getting insights out of data, and is therefore more powerful than excel for charting/ data visualisation
  • The ability to connect with and combine varied data sources makes it a class apart for large data sets that would simply kill excel
  • The ability to build graphical, interactive storyboards that can be shared online take analysis and engagement to a new level
  • Very strong for the four scenarios identified above, especially for building dashboards that refresh based on an underlying, changing data
  • Needs data in a database format. This may need extra effort and planning. E.g. Converting a simple monthly Management Information for balance sheet to a database friendly format can be quiet a task. This would especially hurt if the model changes often, or the number of rows is very large.
  • Missing data points and incomplete excel data can be tricky to deal with
  • Without training, too many options for data cutting can be confusing
  • May not always be worth the effort as there may not be enough time or resources for it (vs doing a simpler excel based analysis)

After this, I intend to trial Qlikview (another BI tool) and compare. Will keep posting my thoughts.

Wednesday, May 20, 2015

Making Sense of Data

With the ever improving ability of companies to track and store data, it is only natural that business intelligence (BI) tools have gained increasing importance in C-suites. In fact, the market is glutted with umpteen BI tools, vying for this extremely lucrative space. Needless to say, it has been an equally lucrative opportunity for consulting companies as well.  

However, the businesses need to be very careful. Data analysis can often be the right way to reach a wrong conclusion. To truly unlock value from mountains of data, they would need multi-disciplinary teams that understand the importance of the task and are willing to work together. The same is true for consulting firms that want to cash-in on the desire of businesses to use their data more effectively.

From business perspective, there are four key challenges on the road to using data effectively:
  1. Capturing and organising data
  2. Making the data useful and accessible for the business user
  3. Ensuring that data analysis is statistically significant
  4. Making the whole process timely and cost effective 
Based on these, logically it would help if the team has the following profiles:
  • business owner to fund the project & articulate the answers sought from data
  • business analyst to bridge the gap between techology and business
  • trainer to teach business users about how to go about accessing/ manipulating the data
  • data analyst to help business owners to analyse data and decide if the data interpretation is actually significant statistically. Many modern BI tools can help business owner to manipulate data by himself, but he may still need support to extract, transform, and load data into the system. Given the technical nature of things, help may be needed to troubleshoot as well
  • business consultant with industry specific expertise to help business owners to refine the questions they seek answers to, and create processes to ensure data integrity
  • project manager to coordinate the team, budget, and execution

Needless to say, one person can play multiple roles. 

Wednesday, July 30, 2014

Getting IDE to Work

Was exploring a bit of platform by Salesforce and started by setting up the Eclipse IDE. A few key things that I can point out are:

  • Ensure both your Eclipse and the JDK are the same "bit" version (32/64)
  • If you cannot get the "" perspective to appear, instead of adding a vm arg to eclipse ini file, set "JAVA_HOME" and "PATH" system variables on your machine: eclipse should pick it up. I have linked the documentation for setting PATH. You can use the same process for JAVA_HOME 

Monday, October 28, 2013

Dynamic Filtering

Dim filterValues() As String

'add all required values to the filter
'actvate the sheet that has filter
'you can use a dynamic range and filter column as well
'Field corresponds to column used to filter

ActiveSheet.Range("$A$1:$Z$6000").AutoFilter Field:=6,_
 Criteria1:=filterValues, Operator:=xlFilterValues

Friday, October 4, 2013

Handy Functions for Tables (ListObject)

Functions to get row number or column number for an active cell selected in the table

Function GetActiveCellTableRowNum() As Long
    Dim currTable As ListObject
    Set currTable = ActiveCell.ListObject
    If currTable Is Nothing Then
        GetActiveCellTableRowNum = -1
        GetActiveCellTableRowNum = _ 
            ActiveCell.Row - currTable.HeaderRowRange.Row
    End If
End Function

Function GetActiveCellTableColNum() As Long
    Dim currTable As ListObject
    Set currTable = ActiveCell.ListObject
    If currTable Is Nothing Then
        GetActiveCellTableColNum = -1
        GetActiveCellTableColNum = _
           currTable.ListColumns(ActiveCell.Column - _
           currTable.HeaderRowRange.Cells(1).Column + 1).Index
    End If
End Function