Tuesday, March 18, 2014

Big Data: A Revolution That Will Transform How We Live, Work and Think

Staff Review by Chris Saliba

In this compelling explanation of how the enormous data flow of the internet age is processed and used, Big Data proves indispensable reading for those who want to understand the direction our world is heading in.

When most of us type searches into Google, leave posts on Facebook or send emails, we presume that once our internet business is done, our digital footprint either evaporates or is sunk in a pile of data landfill. Big Data tells the story of what happens to all of those billions and billions of pieces of personal information we feed on an everyday basis into the servers of the big tech companies.

In the early days of manual computation, before the advent of computers with their huge memory and processing capacity, modelling and predictions were extrapolated from small data sets. For example, you could estimate how many people in the population liked orange juice by taking a small sample of people, then scaling the result up to a population-wide figure.
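The orange juice arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name and all the figures are made up for the example and do not come from the book.

```python
def extrapolate_from_sample(sample_likes: int, sample_size: int, population: int) -> int:
    """Scale a count observed in a small sample up to a population-wide estimate."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    # Multiply before dividing so the sample proportion is applied in one step.
    return round(sample_likes * population / sample_size)


# Hypothetical survey: 130 of 200 people asked say they like orange juice,
# and the population is assumed to be one million.
estimate = extrapolate_from_sample(sample_likes=130, sample_size=200, population=1_000_000)
print(estimate)
```

The estimate is only as good as the sample, which is why the small-data approach depends so heavily on each answer being recorded accurately.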

In the world of big data, rather than extrapolating from a small data set, the volume of data used can be pretty much comprehensive. Every tweet, Facebook comment, email and Google search can be sifted, analysed and fed through various algorithms to figure out how many people like orange juice. It can then be further sifted for more nuanced results, such as what time of day people like to drink orange juice. However, there is a small negative. In the past, when extrapolating from smaller data sets, the collected information had to be 100% accurate for the predictions to work. When using 'big data', such accuracy isn't necessary. This is called 'messiness': there may be some bad data around the edges, but because the capture is so huge these small inaccuracies can be tolerated. In other words, 2 plus 2 can equal 3.9 in big data computations, and this is deemed close enough.

What does the coming era of sweeping data collection and analysis mean for us mere mortals? Big Data opens with a fascinating example of how Google was able to predict flu outbreaks quicker than the United States' Centers for Disease Control and Prevention (CDC). How was this possible? Google takes three billion search queries a day and saves them all. It simply processed this data and discovered in which geographic locations people were making the most flu-related queries. One of the big themes of the book is how the collection and processing of such enormous amounts of data will give us great tools for making accurate predictions across a broad range of areas.
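As a toy illustration of the idea, and not Google's actual method, counting flu-related searches by location might look something like the Python sketch below. The keyword list, the function name and the sample log are all invented for the example.

```python
from collections import Counter

# Hypothetical list of flu-related search words; a real system would
# have to work out which terms matter.
FLU_TERMS = ("flu", "fever", "cough")


def flu_queries_by_region(queries):
    """Count queries containing a flu-related word, grouped by region."""
    counts = Counter()
    for region, text in queries:
        words = text.lower().split()
        if any(term in words for term in FLU_TERMS):
            counts[region] += 1
    return counts


# Made-up search log of (region, query text) pairs.
sample_log = [
    ("Ohio", "flu symptoms"),
    ("Ohio", "cheap flights"),
    ("Ohio", "medicine for cough and fever"),
    ("Texas", "fever remedy"),
    ("Texas", "orange juice recipes"),
]

counts = flu_queries_by_region(sample_log)
print(counts.most_common())
```

With billions of real queries instead of five invented ones, the regions at the top of such a tally would be the places to watch for an outbreak.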

Another example. The huge retailer Target uses sophisticated technology to track customer purchases made on credit cards or with loyalty cards. Its analytics team studied what pregnant women buy and could map out what type of purchases a pregnant woman would typically make, then send out coupons for baby products to these shoppers. One day a father walked into Target fuming that his daughter had been sent baby coupons and insisted she wasn't pregnant. But it later turned out that his daughter was indeed pregnant. The analytics team had got it right from the history of her purchases. Hence, it seems that our collective data knows more about us than we know about ourselves as individuals. The authors even write, quite seriously, “Soon big data may be able to tell whether we're falling in love.”

Perhaps the best analogy for all of this data collection and processing is the Neo-Impressionist painting technique, Pointillism. Each dot of paint on its own means nothing in particular, but when all the dots of paint are put together and looked at from a distance, a picture emerges. This is what the analysts of big data do. They create sophisticated algorithms to sift, sort and process the data, finding out key aspects of our behaviour and then using them for various purposes, in a lot of cases commercial. American retailer Walmart even figured out at what times people preferred strawberry-flavoured Pop-Tarts and promoted them accordingly. It seems that the end game of big data will be to predict our every move.

As I read Big Data, H. G. Wells' The Time Machine (1895) kept coming to mind. The big data boffins were like the Morlocks, predicting our every desire and then fulfilling it. Meanwhile we internet users went frolicking carefree like the Eloi while all our data was sucked up for commercial and government use. The best example of this is probably Facebook. The company's market research firm figured out that every user was worth about $100 to the company. As technology writer Jaron Lanier has said, when you sign up for Facebook it's you who are actually the product. The world we are moving to seems to be one where a small handful of companies will have huge power and influence over what we do and consume. We're all under the microscope of big data.

I bristled at a lot of what I read in Big Data. The authors, Viktor Mayer-Schönberger and Kenneth Cukier, are experts in their field, and write in a positive vein about the way that computing technology will change our lives. (It must be admitted that they do include a few chapters on the possible negative effects of big data.) They do a lot of predicting themselves about what they feel is inevitable for society and us as individuals. Perhaps I bristled because I wanted to keep to the fiction that I could remain an individual in the face of this powerful force. Maybe my response to the book was like the outrage many Americans felt when whistleblower Edward Snowden revealed the government was collecting data on citizens. The age of big data is slowly revealing itself to be the era of Big Brother.

Another scary facet of big data is that, as the authors readily admit, the technology is becoming so sophisticated and extraordinarily complex that ultimately few people, if any, will be able to understand it. It will control us, but we won't have the foggiest idea of how it functions. For example, the authors note that the builders of the complex algorithms behind Google's translation service cannot understand the languages the service is so expert in translating.

Despite my own arguments with Big Data, I found this a succinct and clear explanation of how big data is transforming our world. It's essential reading for anyone who uses the internet, which is just about all of us, and raises important questions about data privacy (which the authors seem to think almost impossible) and where we want to take this technology as a society.

Big Data: A Revolution That Will Transform How We Live, Work and Think, by
Viktor Mayer-Schönberger and Kenneth Cukier. Published by John Murray. ISBN: 9781848547926. RRP: $19.99