“Data is” or “Data are”

“Data is” or “Data are” - Teranalytics

If you go to the opus, like octopodes salad, pizze and beeves, you might have a strong opinion about this perennial debate: “data is” or “data are”. Purists (whom we like) refer to the etymology of the word, arguing that it is the plural form of the Latin singular “datum” and therefore that the verb must be conjugated appropriately, hence “data are”. And they are right. Pragmatists instead (whom we are) accept that a language is evolving, they look at the data through a natural language processing lens and conclude that common usage has made “data is” acceptable. And they are not wrong. So indeed, going to the opus instead of the opera, liking octopodes salads instead of octopuses salads, pizze instead of pizzas, and beeves instead of beefs, is all a matter of choice and purists alongside pragmatists might have to go to an exedra to settle the issue ad nauseam, over a good meal.

We certainly do not mean to settle the debate here but decimate the egregious claim that people are nice… and that language is evolving indeed. Take for example the sentence you just read. Shocking? Or is it because you forgot the original meaning of these words? In today’s language, we should have written “[…] reduce by 10% the positive notion that people are ignorant” (we are a data analytics company after all, we know how to reduce things by 10%). As a matter of fact, “decimate” actually means to cut by 10%, egregious use to mean “remarkably good” (or positive in our sentence), and “nice” – can you believe it – comes from nescius meaning ignorant or stupid. Language is not immutable and that gives me the perfect excuse for not understanding half the sentences Shakespeare wrote in his plays, instead of making me feel nice. Yes you got it, I meant ignorant. But I wouldn’t have guessed that “girl” use to mean a young person from either sex (bad example, that was before Shakespeare).

The world of data analytics, albeit heavily grounded in long established mathematics, is still rather new and in large part driven by Millennia (sorry, Millennials), or the Thumbelina generation as coined by the French philosopher Michel Serres, always active on social mediums (sorry, social media – ha! that one seems accepted, go figure). This world is so fast moving that even our agendum (sorry, agenda, that one is not) cannot keep up, with, sadly, barely enough time to breath the past at local musea (hmm, museums). Generational phenomena, irresistible phenomenons.

If it is any consolation, for more than a century the English aristocracy in fact could not speak English (a long time ago, certainly). European languages have always evolved, mingled, cross-pollinated and, according to others, cross-polluted. France has a very strict French Academy that accepts new words every year. On the list: “big data”, with a French translation though, “megadonnées”, that bypasses any debate around singular vs plural. So our conclusion is simply to pick your battle: if “data is” versus “data are” matters to you, sharpen your arguments and confront the other side ad libitum, which you will find. Otherwise, you may skip the (great) discussion and go straight to the bar to enjoy your martinus.

Post a Comment