I was tempted to title this post as “Journo uses statistics as a lamp post. Here’s How.” as a homage to the clickbaity titles that are frequently employed by the same media outlets that this post is about.
The horrific rape and murder of a young doctor at RG Kar Hospital in Kolkata earlier in August 2024 has shaken the collective conscience of the nation, yet again. The urgent issues of the safety of women, the safety of healthcare professionals, and safety in our Indian cities generally are once again a focus of discussion, and of nationwide protests.
Like in every other situation, electronic media outlets are trying to outdo one another in providing the maximum information, including facts that aren’t always necessary, or verified, or both. Often, the media presents one side of a story in a magnified form, probably as a way to generate excitement and hype that translates into clicks, social shares, eyeballs, and ultimately, revenue.
However, the media also has to at least appear credible to its readers, many of whom are merely keyboard warriors. And what better way to appear credible than to pepper an article with statistics?
I read an article with a shocking headline:
The link to the article is here (behind a paywall).
(I am not going to reproduce the complete article here since it is a member-only article. However, I will share a few necessary excerpts from it here.)
The article looked at the search volumes for keywords relating to the incident, and/or the victim, on Google Trends.
Whenever there is any data relating to a percentage increase, one should always ask the question: “increase from what?” In other words, what is the base that we are talking about?
As a completely hypothetical example, suppose a keyword abc gets a search volume of 1 per million, which later rises to 5 per 2 million (that is, 2.5 per million). The increase is a whopping 150%. However, while that percentage in isolation might seem huge, the figure looks far less dramatic in context: the rate is still only 2.5 searches per million.
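For anyone who wants to check the arithmetic, here is a minimal sketch in Python. The function names are my own, and the numbers are the same hypothetical ones as in the example; they are not taken from the article or from Google Trends.

```python
def rate_per_million(searches, population_millions):
    """Searches per one million people."""
    return searches / population_millions


def percent_increase(old, new):
    """Relative change from old to new, expressed as a percentage."""
    if old == 0:
        raise ValueError("percentage increase is undefined for a base of zero")
    return (new - old) / old * 100


# Hypothetical figures from the example: 1 per million, then 5 per 2 million.
before = rate_per_million(1, 1)
after = rate_per_million(5, 2)
increase = percent_increase(before, after)

print(f"Rate before: {before} per million")
print(f"Rate after:  {after} per million")
print(f"Increase:    {increase}%")
```

The headline number is the last line, but it is the middle line, the rate itself, that tells you whether the increase matters.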
The article shows the search trends for “… rape video” (the blanked out area contains the victim’s name)
What’s more unfortunate, the data shows that the search term spikes every night. While this is utterly shameful and shocking, it does not give the complete picture.
Let us add another keyword about the case – “kolkata rape murder case” – and analyse both together:
At this point, we must pause to understand what the scales really mean. The y-axis shows the popularity of a search term relative to its peak, for a given area and a given period of time. Popularity is normalised to a scale of 0-100, and therefore says nothing about absolute search volumes (in other words, how frequently the search is actually being performed).
Clearly, in isolation, the search popularity does not convey much information. However, when we look at another keyword, and one that’s more likely to be popular, the information is more meaningful.
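To see why a second keyword changes the picture, here is a simplified sketch of this kind of 0-100 scaling. It is only an illustration of the idea, not Google's actual method, and the search counts are made up for the purpose.

```python
def normalise_to_100(series, peak=None):
    """Scale a series so that the peak maps to 100.

    If no peak is given, the series is scaled against its own maximum.
    """
    if peak is None:
        peak = max(series)
    if peak == 0:
        return [0 for _ in series]
    return [round(100 * value / peak) for value in series]


# Made-up daily search counts: one popular keyword, one niche keyword.
popular = [2000, 5000, 4000, 3000]
niche = [20, 50, 40, 30]

# Viewed in isolation, each series is scaled against its own peak.
alone_popular = normalise_to_100(popular)
alone_niche = normalise_to_100(niche)

# Viewed together, both are scaled against the single highest value.
shared_peak = max(max(popular), max(niche))
joint_niche = normalise_to_100(niche, peak=shared_peak)

print(alone_popular)  # [40, 100, 80, 60]
print(alone_niche)    # [40, 100, 80, 60]
print(joint_niche)    # [0, 1, 1, 1]
```

On their own, the two charts look identical even though one keyword is searched a hundred times more often. Only when both share a scale does the niche keyword shrink to its real proportion.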
Another important visualisation is the column chart on the left in the above image, showing average search interest on the same 0-100 scale. Here, the 7-day average for the keywords “kolkata rape murder case” was 44, whereas the interest for the keyword relating to the video was 1 over the same time period.
Data does not convey the qualitative aspects of why a particular search was carried out. It does not differentiate between someone searching for a video with a perverted state of mind, versus someone who was simply looking for videos relating to the case. Of course, we are a voyeuristic lot, so there is bound to be a lot of interest in intrusive content as well.
When I replicated the trend search on Google Trends, I was able to spot the nightly spike that the article talks about. On the face of it, there seems to be a perverse link between the search and the time of day.
However, again, to get the perspective right, let us compare this against another keyword, something that’s bound to be more popularly searched. When I added the keywords “rg kar case” to the same exploration, here’s what Google Trends showed me:
Sure, the search for the video goes up at night, but the search for the simple “rg kar case” also has clear and predictable spikes at night. In fact, the spikes for both keywords largely match. Seen in this light, the timing of the search might simply be due to the fact that a lot of people want to know more about the case at the end of the day, when they are done with their other chores. Perhaps many of the searches for video are innocuous, looking for news-related videos?
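One rough way to put a number on “the spikes largely match” is a correlation coefficient between the two series. The sketch below uses made-up values that merely mimic the shape of two curves peaking at the same time; they are not the actual Google Trends figures, and the helper function is my own.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of the same length, at least 2 points")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical interest values over eight time slots, afternoon to early morning.
video_keyword = [0, 0, 1, 1, 2, 1, 0, 0]
case_keyword = [10, 12, 30, 45, 60, 40, 15, 10]

r = pearson(video_keyword, case_keyword)
print(f"Correlation between the two series: {r:.2f}")
```

A value close to 1 would only say that the two keywords rise and fall together. It says nothing about intent, which is exactly the point: the nightly pattern is shared by the harmless keyword too.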
Of course, not for a moment am I saying that there are no perverted minds in our society who might be wanting to watch videos of brutalities being committed. All that I am saying is that holding up individual bits of information while obscuring context is yellow journalism.
Statistics is like a streetlamp. You can use it to light your way ahead, or you can use it the way a drunk man does: for support.
The relentless pursuit of sensationalism in electronic media undermines the very foundation of journalism, transforming it from a beacon of truth into a circus of misinformation. By prioritizing eye-catching headlines over factual accuracy, these outlets not only distort reality but also erode public trust. The half-truths and speculative narratives they peddle are designed not to inform but to manipulate, encouraging readers to form opinions based on incomplete and often misleading information. This practice ultimately devalues serious journalism and hampers the public’s ability to make informed decisions.
Note: All the trend visualisations in this post were run on Google Trends on August 22, 2024.