In today’s world, every action
we take can be recorded as data. The number of times you search “Pizza
Hut” on Google, how often you scan your gym membership RFID tag, even the
number of texts you send per hour: it is all data. In fact, the idea of being
surrounded by this universe of data is referred to as “big data”. Our daily actions can be recorded,
sorted, and analyzed at relatively low cost, but is the holy grail of business
answers really buried in the mess? Is it worth the time and effort to find the needle
in the haystack? These are the “big” questions when it comes to “big data”.
Big data is a relatively new idea,
and there seems to be confusion at every level of the concept. One of the first
problems is pinning down a standard definition of big data. A recurring
definition I have seen describes big data as “a collection of data from
traditional and digital sources inside and outside your company that represents
a source for ongoing discovery and analysis” (Forbes).
Wow, that’s quite a bit to chew on. How do we wrap our heads around the amount of
data that is out there? Is it possible?
Big data is generally defined by volume,
velocity, and variety (the 3 V’s). Volume
is the amount of data we are dealing with. Velocity is how fast the data is
moving, that is, the rate at which it is being collected. Variety is the
type of data we are analyzing: structured or unstructured. Structured data is data that has been
arranged in a defined format, but it’s not that simple: not every format counts as
an appropriate relational structure. Good examples of structured data include
relational databases, data warehouses, and complex enterprise systems. Unstructured data is data that needs
more interpretation from more complex correlation systems. Some examples may
surprise you, such as an Excel spreadsheet without exact specifications, a Word
document, or even the tweet you sent 15 seconds ago. It is all considered
unstructured data.
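To make the structured versus unstructured distinction a little more concrete, here is a small Python sketch. Everything in it is made up for illustration: the `gym_visits` table, the sample rows, the tweet text, and the helper names `count_visits` and `interpret_tweet` are my own assumptions, not anything a real analytics product uses.

```python
import re
import sqlite3


def count_visits(rows):
    """Structured data: rows fit a fixed schema, so one SQL query answers the question."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gym_visits (member_id INTEGER, visit_date TEXT)")
    conn.executemany("INSERT INTO gym_visits VALUES (?, ?)", rows)
    result = dict(conn.execute(
        "SELECT member_id, COUNT(*) FROM gym_visits GROUP BY member_id"
    ))
    conn.close()
    return result


def interpret_tweet(text):
    """Unstructured data: free text has no schema, so we must decide what to look for."""
    return {
        "hashtags": re.findall(r"#(\w+)", text),
        "mentions_pizza_hut": "pizza hut" in text.lower(),
    }


# Made-up sample data, for illustration only.
visits = [(1, "2013-01-07"), (1, "2013-01-09"), (2, "2013-01-09")]
tweet = "Friday night, ordering Pizza Hut again #pizza #noregrets"

print(count_visits(visits))
print(interpret_tweet(tweet))
```

The gym visits already sit in named columns, so counting them is a single query. The tweet is just a string, and I had to choose what counts as meaningful (hashtags, a brand name) before I could pull anything out of it. That extra interpretation step is what makes unstructured data harder to work with.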
Who has the time and money to
keep track of all this structured and unstructured data? Large analytics companies such as IBM
and SAS
offer “data mining services” to help sort through this tangled web of potentially
useful information. According to SAS’s video “Big Data…What it Means to You”, in 2012 the amount of data
stored in the world exceeded 2.8 zettabytes (1 zettabyte is a trillion gigabytes, or roughly 931 billion
binary gigabytes). By the year 2020 that number is predicted to be 50 times larger! Despite
this enormous volume, only 0.5% of this data is analyzed! This leads to
unprecedented levels of competition to find the “secret business plan” or the “solution
of all solutions”. There is no way of knowing whether the information we are
looking for is out there; finding out takes the resources and patience to sort
through it. It will be important to keep your eyes and mind open to the
prospect of big data analytics. It seems to be where our world is headed, and
we all need to buckle our seat belts for the ride.
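For anyone who wants to check the unit arithmetic behind figures like these, here is a rough Python sketch. The function names are my own, and the 0.5% and 50x defaults simply reuse the numbers quoted above; it works per zettabyte rather than asserting any particular world total.

```python
ZETTABYTE = 10**21   # bytes in one zettabyte (SI definition)
GIGABYTE = 10**9     # bytes in one decimal (SI) gigabyte
GIBIBYTE = 2**30     # bytes in one binary gigabyte (gibibyte)


def zettabytes_to_gigabytes(zb, binary=False):
    """Convert zettabytes to gigabytes, in decimal or binary units."""
    unit = GIBIBYTE if binary else GIGABYTE
    return zb * ZETTABYTE / unit


def analyzed_share(total_zb, fraction=0.005):
    """Zettabytes actually analyzed if only `fraction` of the total is looked at."""
    return total_zb * fraction


def projected(total_zb, growth=50):
    """Total after growing by a factor of `growth`."""
    return total_zb * growth


print(zettabytes_to_gigabytes(1))               # decimal gigabytes per zettabyte
print(zettabytes_to_gigabytes(1, binary=True))  # binary gigabytes per zettabyte
print(analyzed_share(1))                        # zettabytes analyzed per zettabyte stored
```

One zettabyte comes out to a trillion decimal gigabytes, or about 931 billion binary gigabytes, which is where the "931 billion" figure comes from. At 0.5%, only 0.005 zettabytes out of every zettabyte stored ever get analyzed.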
Posted by Andrew Miller