Monday, September 15, 2014

Big Data, Small World

In today’s world, every action we take can be recorded as data. The number of times you search “Pizza Hut” on Google, how often you scan your gym membership RFID tag, even the number of texts you send per hour: it is all data. In fact, this idea of being surrounded by a universe of data is referred to as “big data”. Though our daily actions can be recorded, sorted, and analyzed at relatively low cost, is the holy grail of business answers buried in the mess? Is it worth the time and effort to find the needle in the haystack? These are the “big” questions when it comes to “big data”.

Big data is a relatively new idea, and there seems to be confusion about it at every level. One of the initial problems is pinning down a standard definition. A recurring definition I have seen describes big data as “a collection of data from traditional and digital sources inside and outside your company that represents a source for ongoing discovery and analysis” (Forbes). Wow, that’s quite a bit to chew on. How do we wrap our heads around the amount of data that is out there? Is it possible?

Big data is generally defined by volume, velocity, and variety (the 3 V’s). Volume is the amount of data we are dealing with. Velocity is how fast the data is moving and the rate at which it is being collected. Variety is the type of data we are analyzing: structured or unstructured. Structured data is data that has been arranged in a defined format, but it’s not that simple: not every format counts as an appropriate relational structure. Good examples of structured data include relational databases, data warehouses, and complex enterprise systems. Unstructured data is data that needs further interpretation by more complex correlation systems. Some examples may surprise you, such as an Excel spreadsheet without exact specifications, a Word document, or even the tweet you sent 15 seconds ago. It is all considered unstructured data.
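To make the structured versus unstructured distinction a little more concrete, here is a minimal Python sketch. The gym records, the tweet, and both helper functions are made up purely for illustration; they do not come from any real system.

```python
import re

# Structured (hypothetical data): every record has the same named fields,
# much like rows in a relational table.
gym_visits = [
    {"member_id": 101, "visits": 12},
    {"member_id": 102, "visits": 3},
]

# Unstructured (hypothetical data): free text with no fixed fields, like a tweet.
tweet = "Hit the gym 3 times this week, then grabbed pizza twice. Worth it!"


def total_visits(rows):
    """Structured data can be queried directly by field name."""
    return sum(row["visits"] for row in rows)


def numbers_in_text(text):
    """A crude interpretation step: pull out runs of digits from free text."""
    return [int(n) for n in re.findall(r"\d+", text)]


if __name__ == "__main__":
    print(total_visits(gym_visits))
    print(numbers_in_text(tweet))
```

The structured records answer "how many visits?" with a simple sum. The tweet only gives up the digit 3; the word "twice" slips through entirely, which is the kind of extra interpretation unstructured data tends to need.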

Who has the time and money to keep track of all this structured and unstructured data? Large analytics companies such as IBM and SAS offer “data mining services” to help sort through this tangled web of potentially useful information. According to SAS’s video “Big Data…What it Means to You”, in 2012 the amount of data stored in the world exceeded 2.8 zettabytes (1 zettabyte is roughly 931 billion gigabytes in binary units). By the year 2020 that number is predicted to be 50 times larger! Despite this enormous volume, only 0.5% of this data is analyzed! This leads to unprecedented levels of competition to find the “secret business plan” or the “solution of all solutions”. There is no way of knowing in advance whether the information we are looking for is out there; finding out takes the resources and patience to sort through it. It will be important to keep your eyes and mind open to the prospect of big data analytics. It seems to be where our world is headed, and we all need to buckle our seat belts for the ride.
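For anyone curious where the "931 billion gigabytes" figure comes from, here is a rough Python sketch of the arithmetic. It assumes the standard definition of a zettabyte as 10^21 bytes and shows both the decimal gigabyte (10^9 bytes) and the binary one (2^30 bytes). The function names are my own, and the 0.5% share is simply the figure quoted above.

```python
ZETTABYTE_BYTES = 10**21  # SI zettabyte
GIGABYTE_BYTES = 10**9    # decimal gigabyte
GIBIBYTE_BYTES = 2**30    # binary gigabyte (GiB)


def zettabytes_to(zb, unit_bytes):
    """Convert a number of zettabytes into a smaller unit of the given byte size."""
    return zb * ZETTABYTE_BYTES / unit_bytes


def analyzed_portion(total, share=0.005):
    """Portion of a data volume that gets analyzed, assuming the quoted 0.5% share."""
    return total * share


if __name__ == "__main__":
    decimal_gb = zettabytes_to(1, GIGABYTE_BYTES)
    binary_gb = zettabytes_to(1, GIBIBYTE_BYTES)
    print(f"1 ZB = {decimal_gb:,.0f} GB (decimal)")
    print(f"1 ZB = {binary_gb:,.0f} GiB (binary)")
    print(f"Analyzed share of 1 ZB = {analyzed_portion(decimal_gb):,.0f} GB")
```

Counted in binary units, one zettabyte comes out to a little over 931 billion gigabytes; in decimal units it is an even 1 trillion. Either way, 0.5% of a single zettabyte is still billions of gigabytes.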

Posted by Andrew Miller



