What Is Big Data?

Today’s A:360 discusses what the term big data really means. There are many interpretations and misconceptions about big data. In this podcast, we will provide a solid definition of big data and discuss what most people are referring to when they say the term “big data”.

Read the Transcribed Audio

Hey everyone. Welcome to today’s A:360. My name is Brewster Knowlton, and today we’re going to be answering the question: what is “big data”?

I’ll be honest, I was looking forward to doing this podcast because big data is my least favorite buzzword there is. Don’t get me wrong, I love the concept of big data, but I hate how frequently it’s misused. More often than not, people are referring to traditional business intelligence when they say the term big data. People often equate big data with analytics when, in reality, big data is just a subset of business intelligence and focuses on some very specific applications. These specific applications can differ quite significantly from traditional business intelligence.

[In this podcast], we’re going to talk about what big data really is and how it differs from traditional business intelligence.

Let’s describe what big data is according to some pretty popular and well-respected resources. I’m going to describe three definitions. There are some things I like and some things I dislike about each one. We will go through these three definitions to get an idea of how other people interpret big data, and then I’m going to provide what I believe is a concise definition of big data that addresses the major points but is specific enough to differentiate it from traditional business intelligence.

According to the SAS Institute, “Big data is the term that describes the large volume of data, both structured and unstructured, that inundates a business on a day-to-day basis.”

Gartner says that big data is “high-volume, high-velocity and/or high-variety information assets that demand cost-effective, innovative forms of information processing that enable enhanced insight, decision making and process automation.”

McKinsey says that “Big data refers to data sets whose size is beyond the ability of typical database software tools to capture, manage, store and analyze.”

If you listen back to each of those definitions, you’ll realize that each of them is a little bit different. Even though all three sources are highly respected, the interpretation and definition of big data varies depending on who is explaining it. That’s not to say that one is right and one is wrong, but it does show that this term is a little bit up for interpretation.

I personally tend to agree most with the SAS definition because it specifically calls out unstructured data, which, in my opinion, is really the crux of big data. I’ve combined these three definitions, taking what I believe is the best of each, to create a definition of big data that covers the major points. My definition of big data is: “Big data refers to data sets large in volume, incorporating both structured and unstructured data, that demand non-traditional database systems and business intelligence solutions to process, store and analyze.”

To me, this definition addresses all the major points big data represents.

First, it states that the data is large. Makes sense, right? It’s called big data, not little data or tiny data. So, it would imply data sets that are large in volume.

[This definition] also incorporates both structured and unstructured data. This is critical because that, to me, is one of the major differences between big data and traditional BI. And the fact that these data sets demand non-traditional database systems and BI solutions implies that big data is a newer evolution of what would be considered traditional business intelligence. That is where we get into the real realm of big data.

Based on the definitions that I’ve put together and that I’ve read from McKinsey, SAS and Gartner, big data boils down to:

  • Datasets large in volume
  • Incorporating both structured and unstructured data
  • Demanding non-traditional database systems and BI solutions

You’ll hear things like the five V’s of big data, talking about velocity and variety and things like that. In the end it boils down to the previous three points that separate big data from traditional BI. That’s what makes big data different and is what defines big data.

That’s it for today. Thanks again for listening to today’s A:360.