The Effect of Smoothing on the Interpretation of Time Series Data: A COVID-19 Case Study

Oded Stein, University of Southern California, Massachusetts Institute of Technology

Alec Jacobson, University of Toronto

Fanny Chevalier, University of Toronto

Figure 1. COVID-19 case count data is often presented as bars accompanied by a smoothed 7-day average line (e.g. Google News (a)). In one task of our study, participants were presented with the bars only and asked to draw a line representing the smooth trend as well as its extrapolation. Panel (b) visualizes the participants' drawings for this task on one of our plots. We also test other visualization conditions (c), where only the smoothed line is presented (top) and where both bars and line are visualized (middle); in these conditions, participants were asked to extend the line as they believed the data would continue in the future.

Abstract

We conduct a controlled crowd-sourced experiment on COVID-19 case data visualization to study whether and how different plotting methods, time windows, and the nature of the data influence people's interpretation of real-world COVID-19 data and their predictions of how the data will evolve in the future. We find that a line smoothed with a 7-day backward average successfully reduces the distraction of periodic data patterns compared to unsmoothed bar data alone. Additionally, we find that the presence of a smoothed line helps readers form a consensus on how the data will evolve in the future. We also find that the fixed 7-day smoothing window size leads to different amounts of perceived recurring patterns in the data depending on the time period plotted. This suggests that varying the smoothing window size together with the plot window size might be a promising strategy to influence the perception of spurious patterns in the plot.
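For readers unfamiliar with the term, a 7-day backward average replaces each day's count with the mean of that day and the six days before it. The following is a minimal sketch of that idea and not the code used in the study: the function name `backward_average`, the handling of the first six days (averaging over however many days are available), and the synthetic counts are all assumptions made for illustration.

```python
def backward_average(counts, window=7):
    """Trailing mean: each output averages the current value and up to
    window - 1 preceding values.

    Assumption: for the first window - 1 days, the mean is taken over the
    days available so far; other conventions (e.g. leaving them undefined)
    are equally plausible.
    """
    if window < 1:
        raise ValueError("window must be at least 1")
    smoothed = []
    running_sum = 0.0
    for i, value in enumerate(counts):
        running_sum += value
        if i >= window:
            # Drop the value that just left the trailing window.
            running_sum -= counts[i - window]
        days_in_window = min(i + 1, window)
        smoothed.append(running_sum / days_in_window)
    return smoothed


# Synthetic daily counts with a weekly reporting dip (made-up numbers,
# not data from the study).
daily = [100, 110, 120, 115, 105, 40, 30] * 3
smooth = backward_average(daily, window=7)
```

Because the synthetic series repeats exactly every seven days, the smoothed values are constant once a full window is available, which is the sense in which a window matched to the period suppresses a weekly pattern. The `window` parameter is exposed so that a different window size can be tried for a different plotted time range.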

Cite as

TBD

Acknowledgements

This work was supported in part by the Swiss National Science Foundation’s Early Postdoc.Mobility fellowship. This work was supported in part by a grant from NSERC (RGPIN-2018-05072). This research was funded in part by NSERC Discovery (RGPIN-2022-04680), the Ontario Early Research Award program, the Canada Research Chairs Program, a Sloan Research Fellowship, the DSI Catalyst Grant program, and gifts from Adobe Inc.

We thank Yvonne Jansen and Souti Chattopadhyay for valuable comments which helped improve the manuscript.