
Before I get to my story, I would like to say that you can download the notebooks I created and ran in Azure Databricks here: https://github.com/thesqlpro/blog/tree/master/notebooks
The source of my data was: https://covidtracking.com/data
I chose not to source my data directly from Maryland's State Government site because the format was not easy to use. The official Maryland Government data has each day as a column and each ZIP code as a row, which is harder to work with than the data provided by the site above. There may be a few discrepancies between the two sources on a day-to-day basis, but the totals are identical. On that site you can also read about the methodology used to retrieve data from the various official State Government websites, and about the quality of each.
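To illustrate why the one-column-per-day layout is awkward, here is a minimal sketch of how such a table could be reshaped into one row per ZIP code per day with pandas. This is not taken from my notebooks; the column names (`zip_code`, `date`, `cases`) and the numbers are made up purely for illustration.

```python
import pandas as pd

# Hypothetical wide table: one row per ZIP code, one column per day.
# The ZIP codes and counts are invented for this example.
wide = pd.DataFrame({
    "zip_code": ["20601", "20602"],
    "2020-04-12": [10, 7],
    "2020-04-13": [12, 9],
})

# Reshape to long format: one row per (ZIP code, date) pair.
long = wide.melt(id_vars="zip_code", var_name="date", value_name="cases")
long["date"] = pd.to_datetime(long["date"])
long = long.sort_values(["zip_code", "date"]).reset_index(drop=True)

# With the long layout, a total per day is a simple group-by.
daily_totals = long.groupby("date")["cases"].sum()
```

Once the data is in the long layout, filtering by date range or plotting a time series no longer requires touching column names.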
This post is in no way intended to attack anyone, be part of a political movement, promote any political, financial, or social agenda, or support any cause except one: highlighting how data and statistics can be used to tell stories. As data professionals, we need to care about both the quality of the data and the quality of the reporting built on it. In short, this is a lesson in data visualization and telling stories with data.