Big Data and Data Mining

Big Data

Big Data is the term given to the rapid accumulation of vast amounts of data from a variety of sources and in a multitude of different forms.

Big data is produced on a daily basis. Your company is probably already producing it without you even knowing: transactions, smart devices, commercial and industrial equipment, metadata and social media all generate it, and almost anything that involves information transfer can contribute to big data production.

Big Data is therefore often defined by the three V’s: volume (how much of it there is), velocity (how quickly it arrives) and variety (how many different forms it takes).


Data Mining

Data mining is the process by which raw data (often big data) is interrogated, explored and converted into useful information by identifying the patterns and relationships within it.

Data mining is therefore used to filter out unnecessary, repetitive or chaotic data from large data sets and to identify the data which has meaning and whose analysis offers value.

For this reason, it is often applied extensively to the interrogation and analysis of Big Data.
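The filtering step described above can be sketched in a few lines of Python. This is only a minimal illustration, not a definitive method: the field names customer_id and amount, and the function name clean_records, are hypothetical and chosen purely for the example. It assumes each record is a flat dictionary with hashable values, drops exact repeats and discards rows that are missing a required field.

```python
from typing import Iterable

# Hypothetical field names, assumed for this example only.
REQUIRED_FIELDS = ("customer_id", "amount")


def clean_records(records: Iterable[dict]) -> list[dict]:
    """Return records with incomplete rows and exact duplicates removed."""
    seen = set()
    cleaned = []
    for record in records:
        # Discard incomplete rows: a required field is absent or None.
        if any(record.get(field) is None for field in REQUIRED_FIELDS):
            continue
        # Discard repeats: identical key/value pairs already seen.
        # Assumes all values are hashable (numbers, strings, and so on).
        key = tuple(sorted(record.items()))
        if key in seen:
            continue
        seen.add(key)
        cleaned.append(record)
    return cleaned


raw = [
    {"customer_id": 1, "amount": 20.0},
    {"customer_id": 1, "amount": 20.0},   # exact repeat
    {"customer_id": 2, "amount": None},   # missing value
    {"customer_id": 3, "amount": 5.5},
]
cleaned = clean_records(raw)
```

Here cleaned keeps only the first and last records. Real data sets usually need fuzzier rules than exact matching, so treat this as a starting point.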


The three V’s of big data introduce huge challenges to its analysis: it is constantly being added to, either in batches or in real time; it is coded and stored in different ways (and even in different locations); and its sheer size makes standard approaches slow and tedious.

The challenges, however, are not insurmountable. Data mining techniques cut through the complexity to bring out valuable insights and key themes which can directly impact institutional performance.

Some of the areas in which big data and data mining offer insights include:

  • Root cause analyses
  • Calculating risk
  • Identifying outliers and fraud
  • Informing strategy through decision tree generation
  • Identifying patterns and relationships which allow for predictions
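
As one illustration of the list above, identifying outliers can be sketched with a robust statistic. The example below is a minimal sketch rather than a recommended fraud model: it uses the modified z-score based on the median absolute deviation (MAD), with the commonly used constant 0.6745 and cut-off 3.5 assumed as defaults. The function name find_outliers and the sample amounts are made up for the example.

```python
from statistics import median


def find_outliers(values: list[float], threshold: float = 3.5) -> list[float]:
    """Return values whose modified z-score exceeds the threshold."""
    med = median(values)
    # Median absolute deviation: a spread measure that extreme
    # values distort far less than the standard deviation.
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # At least half the values are identical; the score is undefined here.
        return []
    return [v for v in values if 0.6745 * abs(v - med) / mad > threshold]


# Made-up transaction amounts; one is far larger than the rest.
amounts = [21.0, 19.5, 22.3, 20.1, 18.9, 20.7, 950.0]
flagged = find_outliers(amounts)
```

With these sample amounts, only 950.0 is flagged. A flagged value is a candidate for closer inspection, not proof of fraud.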