Why Big Data is Bullshit

I have a problem. A huge problem. And I can’t seem to get away from this problem because it’s always in my face: “big data this, big data that.” I just don’t get it. What is the big deal about big data?

In 1987, Milan Zeleny introduced the concept of the knowledge hierarchy, which was then still in its infancy. In 1988, Russell Ackoff took this concept and developed it further, describing a hierarchy of five tiers: data, information, knowledge, understanding, and wisdom. In later years, others adapted this hierarchy by dropping understanding, creating the DIKW paradigm, which depicts the relationship between data, information, knowledge, and wisdom. Today, I present an addition to the pyramid, or rather, an alternative approach: the Noisy DIK Paradigm.

The idea is that data doesn’t just exist on its own. Data is a derivation of noise. And noise is omnipresent. That is, if you believe in the Big Bang theory. You know when you turn on an old analog TV and all you see is static? That’s noise. In fact, it’s the most incredible noise there is. About 1% of the static on the screen is the afterglow of the Big Bang. And this afterglow is the cosmic microwave background, which lurks in every crevice of the universe.


The point is that noise is inescapable. It can be disruptive, irrational, scattered, and annoying, yet we, as humans, have the ability to turn a lot of this noise into something logical.

Let’s take sound for example.

When sound has no structure or direction, it is simply noise. It is chaos that is impossible to understand because it lacks logic and rationality. Yet we, as humans, have the capability to take this disordered sound and turn it into something meaningful, something beautiful. Music.

And no matter how complex the musical piece may be, and regardless of whether it is improvised, it will always be the output of a previously established algorithm whose independent variables derive from theory (riffs, scales, sequences, chords) combined with human reasoning and intuition.

So if we can do this with sounds, what is preventing us from doing it with other types of noise?

At its core, all data is noise. In fact, I would argue that the notorious term “Big Data” is misleading. The notion that unstructured and meaningless qualitative and quantitative signals are data is simply untrue, because that is the exact definition of noise.

A signal becomes data the moment it is contextualized and assigned a role. In other words, data exists only when it has the potential to become a variable. But even if you’ve managed to extract data from the noise, it is still not enough, because data doesn’t provide you with any direct value until it has been organized, structured, processed, and interpreted. Only then does data evolve into information, and then evolve once more into knowledge.

So why is Big Data such a beloved term? Why do businesses insist on parading the fact that they really bring nothing to the table? And I ask you, truly and humbly, what actions can you really take with data?

I’m not saying that data is not important. On the contrary, knowledge cannot exist without data. But data is just one step in a process that turns noise into knowledge.


I feel like there is a serious misunderstanding in the industry that, if addressed, could expedite innovation and new discoveries. And that misunderstanding is the focus on the D and not on the link from I to K.



The hierarchy doesn’t stop at information because, although information is structured and organized data, it still doesn’t provide the end user with any insights or knowledge. And no matter how sophisticated a piece of software may be, or how visually stimulating its data visualization techniques, no computer can ever provide knowledge. Because knowledge is a reaction between the brain and the information it receives. And this reaction is always unique, as each individual has their own methods of reasoning and their own intuitions. And once this knowledge has been acquired, the only place it can be stored is in the brain. It is the only place that can understand what is being processed and what decisions are to be made based on this knowledge.

Businesses should begin focusing on being knowledge providers so that the end users can actually accomplish what the business set out to do in the first place. Focusing on the link between information and knowledge means providing a toolset for the end users. It means that businesses that pride themselves on their products should shift to priding themselves on the service they provide.

Because at the end of the day, if I don’t understand how to use the data I get from your product, your product is worthless.