Tuesday, October 22, 2013

Data Mining and Big Data

Marketing is a key business function that is directly affected by Data Mining. In the past, companies had to rely on outdated demographic statistics and consumer surveys to make important decisions about marketing strategy. This posed a problem: because the world is constantly changing, these strategies could become outdated or unsuccessful before managers even had a chance to implement them. "Data-driven" is a term used to describe today's economy, in which data comes not only from transaction records but also from new sources such as social media, mobile devices, emails, and more (Johnson 2012). It is almost impossible for a person to make a decision without leaving a digital footprint, whether a Tweet, a receipt, or an email. Because of this massive inflow of data, businesses are turning to Data Mining when developing their marketing strategies, since it offers a new method of defining and measuring consumer demand. However, Data Mining is only successful if the input data is accurate and relevant.

How do companies ensure that they are using relevant and accurate data? Relevance depends on the situation or the problem being addressed. It is important for companies to take the time to determine which variables are relevant before inputting data into a Data Mining algorithm. As for accuracy, a rule of thumb is to ensure that the data qualifies as "Big Data". Big Data refers to more than just large volume; in fact, three specific characteristics are required for data to be distinguished as Big Data: Volume, Velocity, and Variety (Collett 2011).

Volume refers to the size of the data set. But just how "big" is Big Data? Generally, the volume ranges from one terabyte to one petabyte or more. To put this in perspective, consider that Wal-Mart stores handle over one million customer transactions every hour, feeding databases estimated at over 2.5 petabytes (Johnson 2012).

Variety refers to the different forms of data and their sources. Big Data consists of internal and external data, as well as structured, semi-structured, and unstructured data. It also comes in several different formats and from many sources (IBM 2012).

Velocity refers to the increasing speed at which new data is created and to how current the data is. It can be measured as the time between when data is created and when it can be analyzed. The higher the velocity, the shorter that gap and the closer the data is to real time, which allows businesses to make better-informed decisions.
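To make the idea of velocity a little more concrete, here is a minimal Python sketch that measures the gap between when a record is created and when it becomes available for analysis. The function name and the timestamps are made up for illustration; they do not come from any of the sources cited here.

```python
from datetime import datetime


def data_latency_seconds(created_at, analyzed_at):
    """Return the seconds between data creation and analysis (hypothetical helper)."""
    return (analyzed_at - created_at).total_seconds()


# Illustrative timestamps only: a purchase recorded at 9:00:00
# and available to analysts 30 seconds later.
created = datetime(2013, 10, 22, 9, 0, 0)
analyzed = datetime(2013, 10, 22, 9, 0, 30)

latency = data_latency_seconds(created, analyzed)
print(f"Latency: {latency} seconds")
```

The smaller this number, the closer the data is to real time.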

By using data that is current, large in volume, and gathered from a variety of sources, a business is able to use Data Mining more successfully. Data Mining allows companies to find common patterns that they otherwise would not have noticed, and to make decisions based on the conclusions drawn (TRA 2000). The first step to using Data Mining and Big Data successfully is clarifying the goal the company wishes to achieve. What is your opinion on Big Data and Data Mining? In what other key business functions could companies use Data Mining to make decisions? Do you see any issues or concerns surrounding this topic?
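For readers curious about what "finding common patterns" can look like in practice, here is a small Python sketch that counts which pairs of items are bought together most often. It is only one simple pattern-finding technique (pair co-occurrence counting), not the method used by any company mentioned in this post. The baskets, the function name, and the `min_support` threshold are all invented for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Count item pairs that appear together in at least min_support baskets."""
    counts = Counter()
    for basket in baskets:
        # Sort and de-duplicate so each pair is counted once per basket
        # and always appears in the same order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


# Made-up shopping baskets, purely for illustration.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "milk", "diapers"],
]

pairs = frequent_pairs(baskets, min_support=2)
for pair, n in sorted(pairs.items(), key=lambda kv: -kv[1]):
    print(pair, n)
```

On real data, a marketer might use pairs like these to decide which products to promote together, provided the input data is accurate and relevant, as discussed above.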


Sources:

Collett, S. (2011). Why Big Data is a big deal. Computerworld, 45(20), 18.

IBM. (2012). What is Big Data? Bringing Big Data to the Enterprise. Retrieved April 2, 2013 from http://www-01.ibm.com/software/data/bigdata/

Johnson, J. E. (2012). Big Data + Big Analytics = Big Opportunity. Financial Executive, 28(6), 50-53.

TRA, Inc. (2000, September). TRA, Inc. Awarded New Patent for Improvements in Using Big Data for Television Advertising Targeting. Business Wire (English).
