Those of you who do international business or research know the difficulties and risks of working with data reported under different methodologies, institutions, policies, and indicators. Many issues interfere with national and international data reporting. Some are intentional, like hiding the real numbers, manipulating public opinion, or avoiding panic, while others are unintentional, like lack of access to information, coding errors, outdated data…

The methodologies and intentions behind reporting on the coronavirus and COVID-19 are affected by the same biases and errors, if not more. Here is the warning about the data from the World Health Organization, the top source of information on the topic: “Due to differences in reporting methods, retrospective data consolidation, and reporting delays, the number of new cases may not always reflect the exact difference between yesterday’s and today’s totals. WHO COVID-19 Situation Reports present official counts of confirmed COVID-19 cases, thus differences between WHO reports and other sources of COVID-19 data using different inclusion criteria and different data cutoff times are to be expected.”
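
To make the WHO warning concrete, here is a minimal sketch of how you might check a series for that kind of mismatch. The function name `reconcile` and all the numbers are made up for illustration; they are not WHO figures or any official method.

```python
def reconcile(totals, reported_new):
    """Compare reported new cases with the day-over-day change in totals.

    Returns one (day, implied_new, reported_new, gap) tuple per day,
    starting from day 1. A non-zero gap hints at retrospective
    corrections or reporting delays.
    """
    if len(totals) != len(reported_new):
        raise ValueError("totals and reported_new must have the same length")
    rows = []
    for day in range(1, len(totals)):
        implied = totals[day] - totals[day - 1]  # change in cumulative total
        gap = reported_new[day] - implied        # what the daily figure misses
        rows.append((day, implied, reported_new[day], gap))
    return rows


# Hypothetical numbers, invented purely for illustration.
cumulative_totals = [100, 130, 150, 200]
reported_new_cases = [10, 25, 20, 55]

for day, implied, reported, gap in reconcile(cumulative_totals, reported_new_cases):
    print(f"day {day}: implied {implied}, reported {reported}, gap {gap}")
```

A gap that keeps changing sign is more likely a timing or consolidation issue than a steady undercount, but a check like this only flags the mismatch; it cannot tell you the cause.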

So what are some of the variables you need to consider when analyzing coronavirus statistics, comparing the data across countries or, even more importantly, modeling them for forecasting purposes? The essential step is to create a table of the variables that you need to weigh and control for, some of which can include:

  • The method of reporting infected cases: based on positive tests, symptoms, hospitalized individuals…
  • The percentage of tests performed per total population, per people hospitalized…
  • The length of time to report the results of a test
  • The method of reporting deaths: again, based on positive tests, symptoms, or hospitalized individuals, and whether individuals dying at home or in collective residences are reported…
  • Quarantine and isolation methods…
  • And, let’s not forget, any possible intentions or motivations to manipulate the data, including for positive reasons, to prevent panic, or for political gains.
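
A table like this can start very small. The sketch below is only one way to lay it out, assuming you record each country's definitions next to its counts. The class `CountryReport`, its fields, the helper `comparable`, and the two countries with their numbers are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class CountryReport:
    name: str
    population: int
    tests_performed: int
    confirmed_cases: int
    case_definition: str       # e.g. "positive test", "symptoms", "hospitalized"
    death_definition: str      # same idea, applied to reported deaths
    reporting_delay_days: int  # time between test and reported result

    def tests_per_1000(self) -> float:
        """Tests performed per 1,000 inhabitants."""
        return 1000 * self.tests_performed / self.population

    def positivity_rate(self) -> float:
        """Share of tests that came back positive."""
        return self.confirmed_cases / self.tests_performed


def comparable(a: CountryReport, b: CountryReport) -> bool:
    """Raw counts are only comparable when the definitions match."""
    return (a.case_definition == b.case_definition
            and a.death_definition == b.death_definition)


# Invented figures for two made-up countries, for illustration only.
reports = [
    CountryReport("Country A", 10_000_000, 200_000, 10_000,
                  "positive test", "positive test", 1),
    CountryReport("Country B", 10_000_000, 20_000, 4_000,
                  "hospitalized", "hospitalized", 3),
]

for r in reports:
    print(f"{r.name}: {r.tests_per_1000():.1f} tests per 1,000 people, "
          f"{r.positivity_rate():.0%} positive, cases = {r.case_definition}")
print("Directly comparable:", comparable(reports[0], reports[1]))
```

In this made-up example Country B reports fewer cases, yet it tests ten times less and only counts hospitalized patients, so its lower total says little about the actual spread.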

So you get the idea. This is why reporting and aggregation efforts such as this amazing website from Johns Hopkins University are incredibly helpful, but they might not always reflect what happens in the real world.