Understanding ‘big data’: the dormant potential of data
30th June 2015. The visit of physicist Antoni Trías Bonet to ideas4all has served to clarify some of the mysteries behind big data. As a specialist in the field, Trías Bonet helped us understand and observe the great potential of this budding industry.
Big data brings great business opportunities to several sectors, but it also brings an enormous volume of low-cost data, and extracting value from it requires analysing massive amounts of information. In fact, many might consider a big data project akin to crossing a minefield.
There is a common basis to all big data applications, which is data itself. The difference between one instance of big data and another lies in the processing workflow and analysis of that data, depending on its practical application. Therefore, we might say that the challenges we face have more to do with human, rather than technological, capabilities.
Today’s technology allows us to work with large volumes of raw data in a matter of seconds.
Every trace of our actions on the Internet, from our browsing preferences to the time at which we access specific digital services, forms a digital trail that allows companies to discover more about us and our interests.
In that regard, for example, big data can be a powerful tool in the production of market surveys, although it may clash directly with values such as privacy.
Big data and its ethical dilemma.
The separation of public and private space, or even intimacy, and the guarantee of values like freedom or equality are some of the red lines that must be respected by people working with big data and that should be protected by legislators. The task of safeguarding these values is a commendable effort, according to Trías Bonet, that must serve to regulate an expanding industry with a potential for development that is still unknown.
For example, whereas the Stasi (the political police of the German Democratic Republic) dedicated many years and resources to collecting information and creating a huge database that it used to control citizens, today’s technology would make it possible to carry out such a task in seconds.
According to critics, the fear of an Orwellian ‘Big Brother’ becomes very real with big data, opening the door to ethical dilemmas.
For example, as Trías Bonet explained, would it be ethical for a bank to trace an individual’s financial movements on the Internet in order to grant or deny a loan?
Other critics refer to a possible ‘dictatorship of data’ at the service of the elite, to benefit a minority at the expense of the majority. There have already been cases of people who have asked technology giants to delete their digital trail on the Internet.
But apart from these misgivings, how are we to work with big data? According to Trías Bonet, these uncertainties demand that we work with data analysis professionals, as is evident from the increase in professional profiles in the job market like data scientists and experts in legislation on the subject, which is still relatively undeveloped.
How to read and process practical data.
Working with a large volume of information requires processing systems that allow us to stay clear of the minefields Trías Bonet refers to. To do this, data can be classified into categories or ‘packages’ for analysis, which serve as smaller, more readily accessible pieces.
This is what is known as ‘datification’, which can be applied to multiple fields. For example, words can be reduced and grouped in packages of syllables, or song lyrics may be reduced to extract common grouping patterns using sound wavelengths (microsounds).
In such cases, subjective elements that cloud the analysis, like personal style or the mastery of different performers when tackling the same song, can be set aside if the element of analysis is the musical score, common to all musicians.
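To make the idea of ‘packages’ more concrete, here is a minimal Python sketch. It is only an illustration, not the method Trías Bonet described: real syllable splitting depends on the language, so the sketch assumes fixed-size chunks of two letters as a crude stand-in for syllables. The function names to_packages and package_counts are hypothetical.

```python
from collections import Counter


def to_packages(word: str, size: int = 2) -> list[str]:
    """Split a word into consecutive chunks of `size` letters.

    Assumption: fixed-size chunks stand in for real syllables.
    """
    return [word[i:i + size] for i in range(0, len(word), size)]


def package_counts(text: str, size: int = 2) -> Counter:
    """Count how often each package appears across all words in the text."""
    counts = Counter()
    for word in text.lower().split():
        # Keep letters only, so punctuation does not create extra packages.
        cleaned = "".join(ch for ch in word if ch.isalpha())
        if cleaned:
            counts.update(to_packages(cleaned, size))
    return counts


counts = package_counts("banana bandana")
# Packages that repeat across words are the common grouping patterns.
repeated = {package for package, n in counts.items() if n > 1}
```

Once the text is reduced to packages, the analysis works on the counts rather than on the raw words, which is the point of the exercise: the unit of analysis is small, uniform and easy to compare.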
The ‘long tail’ theory: when the tail leads the head
When managing a large volume of data we must also learn to select data, since the most frequent values are not always the most important: sometimes the real gems lie hidden in the least repeated values.
In simple terms, sometimes we disregard the ‘tail’ in the graph (see above) even when the sum of all its values is more relevant than the values that are most common or repeated.
This is what is known as Chris Anderson’s Long Tail theory. Applied to business logic, it allows us to weigh mass markets (few products with great demand, the peak on the graph) against market niches, where many products coexist with little demand each but together add up to a much greater market.
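The arithmetic behind the long tail can be sketched in a few lines of Python. The figures below are invented purely for illustration (two ‘hit’ products and one hundred niche products), and the function name head_and_tail_totals is hypothetical; the sketch simply ranks products by demand and compares the total of the head with the total of the tail.

```python
def head_and_tail_totals(demand: dict[str, int], head_size: int) -> tuple[int, int]:
    """Return (total demand of the top `head_size` products, total of the rest)."""
    ranked = sorted(demand.values(), reverse=True)
    return sum(ranked[:head_size]), sum(ranked[head_size:])


# Hypothetical catalogue: two mass-market hits and 100 niche products.
demand = {"hit_a": 500, "hit_b": 300}
demand.update({f"niche_{i}": 10 for i in range(100)})

head_total, tail_total = head_and_tail_totals(demand, head_size=2)
tail_is_bigger = tail_total > head_total
```

With these made-up numbers the two hits account for 800 units, while the niche products, at only 10 units each, add up to 1,000. No single niche product matters much, yet together they outweigh the head.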
In conclusion, how should we work with data?
In view of everything Antoni Trías Bonet told us, there are several things we have learnt:
More than the data itself, what is important is how we read it, and that is where we should focus our resources when we work with big data.
For this to be possible, we can work with techniques like those provided by ‘datification’.
Another important point is knowing what we are dealing with as we manage big data. In a complex and relatively new field, according to Trías Bonet, it is particularly important to rely on legal advisers to help us in our work.
Somewhat more personal is respect for the lines drawn by ethics. In this sense, the advice Trías Bonet gives us is clear: if you work with big data, you must do so with ethics and humility.