Data in the news: lessons from the stories behind the headlines

We asked some of the smartest minds in the sector to uncover the data lessons from recent news stories.

A quick glance at current events will confirm that data is everywhere.

It’s permeated all sectors of society and almost every facet of the working world. As such, barely a week goes by without the subject popping up in the global news cycle.

To find out what lessons we can glean from the ways data is being used in our world – and the ways it’s impacting life – we asked some of the smartest minds in the sector to take a look at some recent stories and uncover the data lessons hidden within.

Helena Schwenk is Exasol’s Market Intelligence Lead and a former data analyst. Peter Jackson is co-founder of the CDO Summer School, author of the CDO Playbook and Chief Data and Analytics Officer at Exasol.

Let’s dive in.

The headline:
Only 32 people hospitalised with Covid in UK after having vaccine

The story:

In late April, The Metro newspaper in London reported:

Just 32 of the UK’s recent coronavirus hospitalisations had been vaccinated before they were admitted, new figures show.

In an effort to see the real-life effects of the immunisation programme, the Government is being urged to publish data on how many vaccinated Brits are still being hospitalised or dying because of the virus.

Now, figures have revealed that out of the 74,405 coronavirus patients admitted to hospital between last September and this March, just 32 of them had been jabbed.

The data lesson from Helena Schwenk:

At face value, 32 hospital admissions appears incredibly low given the accelerated rollout of the UK’s vaccination program. However, there is a data lesson to be learned: a headline figure read on its own can be misleading, so it should be treated with caution.

If we look behind the 32 hospitalizations, there are a number of things to be aware of when interpreting this number.

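First, the time period assessed is misaligned. Using data from September 2020 to March 2021 doesn’t take into account that the UK vaccination program only started in December, so for the first few months of the period almost nobody admitted to hospital could have been vaccinated.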

Second, the study only includes a fraction of all admissions, so the true number of hospitalizations is likely to be much higher. 

And third, the cut-off date for analysis – March 2, 2021 – didn’t provide enough time to assess if people vaccinated in early 2021, when the program got into its stride, ended up in hospital.  

In short, it’s really important to understand the context of the data when interpreting headlines. While hospital admissions are thankfully low for those receiving vaccinations, they are not as low as this headline suggests. Understanding the data sample used for analysis is vital if you are to get a true reflection of the hospitalization situation.
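To make the first point concrete, here is a minimal sketch of how the choice of time window changes a headline share. The admission records below are invented purely for illustration and are not real figures; the function name vaccinated_share and the exact rollout date are assumptions too (the story says only that the program started in December).

```python
from datetime import date

# Hypothetical admission records: (admission_date, vaccinated_before_admission).
# These rows are made up for illustration only; they are not real data.
admissions = [
    (date(2020, 9, 15), False),
    (date(2020, 10, 20), False),
    (date(2020, 11, 5), False),
    (date(2020, 12, 20), False),
    (date(2021, 1, 10), True),
    (date(2021, 2, 1), False),
    (date(2021, 2, 20), True),
]

# Assumption: the first of December stands in for the start of the rollout.
ROLLOUT_START = date(2020, 12, 1)


def vaccinated_share(records, start=None):
    """Share of admissions that were vaccinated, optionally counting
    only admissions on or after `start`."""
    if start is not None:
        records = [r for r in records if r[0] >= start]
    if not records:
        return 0.0
    vaccinated = sum(1 for _, jabbed in records if jabbed)
    return vaccinated / len(records)


# Same numerator, different denominator: the window drives the answer.
whole_period = vaccinated_share(admissions)
aligned_period = vaccinated_share(admissions, start=ROLLOUT_START)
print(f"whole period: {whole_period:.0%}, aligned period: {aligned_period:.0%}")
```

The number of vaccinated admissions is identical in both calculations. Only the denominator changes, which is why a figure quoted without its time window can look smaller than it really is.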

The headline:
Coronavirus: ‘High risk’ list misses off thousands of people

The story:

The BBC recently reported that…

Thousands of people have been missed off the government’s high risk list for Covid-19 – despite meeting the criteria. Among them have been transplant patients, people with asthma and some with rare lung diseases.

Prof Martin Marshall, chair of the Royal College of General Practitioners, said: “GP practices are working hard to notify patients who are considered to be in a vulnerable group, and therefore at higher-risk of getting Covid-19, about measures they should take to keep as safe as possible.” 

He said the information had been collected from GP computer systems and added “every effort is being made to ensure that the data is accurate”.

The data lesson from Peter Jackson:

This is clearly a case of poor data governance.

Ideally, data should be validated upon input; we’ve all encountered that when filling in forms. If this isn’t possible, then there should be appropriate controls to ensure that data is accurate and complete (it is a requirement of GDPR, after all), and records should be kept up to date.

Everyone involved in this process should understand the ‘data value-chain’ and the implications of getting that wrong, which became very apparent in this example.
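As a rough illustration of validation on input, the sketch below checks a record for completeness, plausibility and freshness before it is accepted. Everything in it is hypothetical: the PatientRecord fields, the validate function and the one-year review threshold are assumptions made for the example, not a description of any real GP system.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional

# Assumption: a record not reviewed within a year counts as out of date.
MAX_REVIEW_AGE_DAYS = 365


@dataclass
class PatientRecord:
    """Hypothetical record shape, invented for this example."""
    patient_id: str
    date_of_birth: Optional[date]
    last_reviewed: Optional[date]
    conditions: List[str] = field(default_factory=list)


def validate(record: PatientRecord, today: date) -> List[str]:
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    # Completeness: required fields must be present.
    if not record.patient_id.strip():
        problems.append("missing patient_id")
    if record.date_of_birth is None:
        problems.append("missing date_of_birth")
    # Accuracy: reject values that cannot be true.
    elif record.date_of_birth > today:
        problems.append("date_of_birth is in the future")
    # Currency: flag records that have not been reviewed recently.
    if record.last_reviewed is None:
        problems.append("record has never been reviewed")
    elif (today - record.last_reviewed).days > MAX_REVIEW_AGE_DAYS:
        problems.append("record is out of date")
    return problems


today = date(2021, 5, 1)
good = PatientRecord("P001", date(1970, 3, 2), date(2021, 1, 15), ["asthma"])
stale = PatientRecord("", None, date(2019, 6, 1))
print(validate(good, today))
print(validate(stale, today))
```

Returning a list of problems rather than stopping at the first failure lets whoever is entering the data fix everything in one pass.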

The headline:
Bad weather forecasts are a climate crisis disaster

The story:

In early May, a WIRED feature explored the work of Jack Kelly, co-founder of Open Climate Fix, a non-profit focused on reducing greenhouse gas emissions using machine learning:

Kelly’s idea is to use machine learning to improve what is known as solar ‘nowcasting’ – predicting solar electricity generation less than a few hours in advance. Rather than working out what the weather will generally be like in a given area, to get really precise solar forecasts, Kelly needs to know precisely where each cloud will be located relative to a solar array, and how the size and shape of the clouds influence how much sunlight gets through to the panels.

The data lesson from Helena Schwenk:

There are three key data lessons to draw out here.

First, the article demonstrates the seismic and transformational power of machine learning as a way of tackling some of our biggest climate change challenges. In this case machine learning can be applied to deliver more accurate and precise solar forecasts, potentially saving 100,000 tonnes of UK carbon dioxide emissions each year. 

Second, it underlines the very acute need for real-time data analysis and interpretation. The power of machine learning comes not only from being able to accurately predict the location, size and shape of clouds, but from doing this within a time window where impactful action can be taken, i.e., where the use of back-up power generators is optimized.

And third, it also points to the perennial issue of data quality and disparate data, as the location and capacity of many solar panels are often unknown. The good news, however, is that machine learning can be applied here too, this time to identify solar panels in satellite and aerial imagery. Knowing exactly where solar installations are is yet another way in which good-quality data can help improve the accuracy of solar forecasts.


If you’ve seen a data lesson in the news recently, be sure to let us know on social media. We’d love to hear your thoughts on what we can all learn from how data is used and applied in the world.