Introduction to COVID-19 Research and Findings
While the world is experiencing a life-altering pandemic, Evive’s Data Scientists are searching for answers to guide our customers through these unprecedented times. Many sources are reporting case counts and infection rates regularly, but the pace of change is so rapid that a given day’s data may not be adequate to structure business plans. So we built statistical models to predict the number of cases that our customers should plan for and anticipate.
Customers can leverage these three predictive models: Operational Risk, Predictive Risk, and Social Determinants. These will help them better understand infection and transmission rates at a county level, providing them with the insights to make well-informed operational and return-to-work decisions.
Clinical and Operational Risk Scores
This map highlights counties where our customers can anticipate the greatest impact on their business. It is based on the number of employees residing in each county, cases of COVID-19 per 1,000 residents, and population density.
Employers can use this county-level data to make strategic decisions about which facilities or offices can remain operational and which to close down due to high contagion rates.
The operational risk score highlights areas most likely to be impacted from a resource and demand standpoint. It combines the number of employees residing in a county with factors affecting COVID-19 spread (cases per 1,000 residents and population density). The natural log of population density was taken so that density does not dominate the score and substantial weight remains on the distribution of employees. Counties were then ranked by risk score to simplify prioritization.
While the operational risk score accounts for the broader effects of COVID-19 on the business, the clinical risk score focuses on the effect of COVID-19 on an employer’s high-risk population. For the purposes of our analysis, high-risk members are individuals with one or more of these pre-existing conditions: diabetes, hypertension, heart conditions, respiratory conditions, and autoimmune or immunocompromised conditions. In addition to the distribution of high-risk members, the score considers each county's ability to care for advanced cases of disease in terms of hospital resources.
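The scoring steps described above can be sketched in a few lines of Python. This is only an illustration, not Evive's actual formula: the source does not say how the three inputs are combined, so the multiplicative form, the use of ln(1 + density) to avoid negative values in very sparse counties, the function name, and the field names are all assumptions.

```python
import math

def operational_risk_scores(counties):
    """Score and rank counties (hypothetical combination rule).

    Each county is a dict with the keys: name, employees, cases,
    population, area_sq_mi. All field names are illustrative.
    """
    scored = []
    for c in counties:
        cases_per_1000 = 1000.0 * c["cases"] / c["population"]
        density = c["population"] / c["area_sq_mi"]
        # Assumption: multiply the factors together. The log damps density
        # so that the employee count keeps substantial weight. log1p is
        # ln(1 + x), which stays non-negative when density is below 1.
        score = c["employees"] * cases_per_1000 * math.log1p(density)
        scored.append((c["name"], score))

    # Rank 1 is the county with the highest score.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [(rank, name, score)
            for rank, (name, score) in enumerate(scored, start=1)]

# Made-up input data, for illustration only.
example_counties = [
    {"name": "A", "employees": 500, "cases": 2000,
     "population": 1_000_000, "area_sq_mi": 500},
    {"name": "B", "employees": 50, "cases": 300,
     "population": 100_000, "area_sq_mi": 1000},
    {"name": "C", "employees": 200, "cases": 100,
     "population": 50_000, "area_sq_mi": 2000},
]

for rank, name, score in operational_risk_scores(example_counties):
    print(rank, name, round(score, 1))
```

In this made-up example, county B has the highest case rate but ranks last because few employees live there, which is the effect the log transform is meant to preserve.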
Predictive Risk Scores
This map displays the risk level for each county over the next 5 to 30 days, based on case counts and death counts. The projections draw on the IHME, ARIMA, and logistic growth models to cover both short- and long-term horizons with high accuracy.
Employers can use this model to gain a quantifiable understanding of the severity of the COVID-19 situation in each county as far as a month into the future, allowing for long-term business planning and responsible re-opening.
The red counties are flagged as the highest risk for future COVID-19 cases, based on predicted case counts and death counts.
The IHME model estimates the growth trajectories for cumulative COVID-19 cases and cumulative COVID-19 deaths across locations hit earlier by the disease. It makes the strong assumption that the overall shape of the trajectory is invariant across locations, but it has the advantage of using government responses as a parameter directly within the model with only minor tweaking.
The ARIMA model accurately forecasts COVID-19 case and death counts by county for the next 5 days.
The logistic growth model is often used by epidemiologists to represent the typical S-curve growth pattern: case counts grow exponentially after the beginning of the outbreak, then taper off as the contagion nears its maximal reach.
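The S-curve can be written as C(t) = K / (1 + exp(-r (t - t0))), where K is the maximal reach, r is the growth rate, and t0 is the day of fastest growth. The sketch below evaluates that standard curve; the parameter names and values are made up for illustration and are not fitted to any real county.

```python
import math

def logistic_cumulative_cases(t, capacity, growth_rate, midpoint):
    """Cumulative cases on day t under a logistic (S-curve) model.

    capacity    -- K, the maximal reach of the outbreak
    growth_rate -- r, the early exponential growth rate per day
    midpoint    -- t0, the day on which growth is fastest
    """
    return capacity / (1.0 + math.exp(-growth_rate * (t - midpoint)))

def daily_new_cases(t, capacity, growth_rate, midpoint):
    """New cases on day t, as the difference between consecutive days."""
    today = logistic_cumulative_cases(t, capacity, growth_rate, midpoint)
    yesterday = logistic_cumulative_cases(t - 1, capacity, growth_rate, midpoint)
    return today - yesterday

# Illustrative parameters only (not fitted to real data).
params = {"capacity": 10_000, "growth_rate": 0.2, "midpoint": 30}

for day in (0, 15, 30, 45, 60):
    print(day, round(logistic_cumulative_cases(day, **params)))
```

Early in the outbreak the curve is close to exponential growth; at the midpoint exactly half of the maximal reach has been infected, and daily new cases shrink afterward.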
Read the full methodology documentation.
Social Determinants of Health
Individual lifestyles and societal influences play a vital role in the spread rate of COVID-19.
This research identifies those specific social determinants of health that have the largest effect on the spread of COVID-19, including income, commute time, traffic volume, and associations.
Employers can use this data, in combination with the above risk models, to customize return-to-work policies for different regional socio-economic needs.
Click on the map to see an interactive version that shows the SDOH score and rank for each county.
Research shows that low income, high population density, and longer commutes contribute to a higher spread rate for the disease.
Census data, the main source for population health data, and COVID-19 case data were transformed and synthesized to create a matrix for each county. After multiple rounds of testing, with adjustments in each iteration, we established a relationship between COVID-19 incidence and social determinants of health. From there, we created a set-up that is optimized for COVID-19 and fits all the SDOH variables to the spread of COVID-19.
A few variables were found to explain the spread of infection with a high degree of confidence. These variables were quantified and combined to generate a score, and the scores were then used to rank each county. We concluded that counties with a high rank, which are considered to have poorer social determinants of health, are experiencing a higher rate of COVID-19 cases.
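One common way to combine several variables into a single score and rank is to standardize each variable and sum the results. The sketch below shows that generic approach; it is not the method used in this research, which the text does not specify. Equal weights, z-score standardization, the convention that rank 1 means the poorest SDOH, and all names and numbers are assumptions.

```python
from statistics import mean, pstdev

def zscores(values):
    """Standardize a list of numbers to mean 0 and standard deviation 1."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return [0.0] * len(values)
    return [(v - mu) / sigma for v in values]

def sdoh_scores(counties, directions):
    """Combine variables into one score per county, then rank.

    directions maps a variable name to +1 if higher values indicate
    poorer SDOH (e.g. commute time) or -1 if lower values do (e.g. income).
    Assumption: every variable carries equal weight.
    """
    totals = [0.0] * len(counties)
    for variable, sign in directions.items():
        standardized = zscores([c[variable] for c in counties])
        for i, z in enumerate(standardized):
            totals[i] += sign * z

    names = [c["name"] for c in counties]
    ordered = sorted(zip(names, totals), key=lambda p: p[1], reverse=True)
    # Rank 1 is the county with the highest (poorest) score.
    return [(rank, name, score)
            for rank, (name, score) in enumerate(ordered, start=1)]

# Made-up county data, for illustration only.
sample = [
    {"name": "X", "median_income": 30_000, "commute_minutes": 40},
    {"name": "Y", "median_income": 60_000, "commute_minutes": 25},
    {"name": "Z", "median_income": 90_000, "commute_minutes": 15},
]
sample_directions = {"median_income": -1, "commute_minutes": +1}

for rank, name, score in sdoh_scores(sample, sample_directions):
    print(rank, name, round(score, 2))
```

Standardizing first keeps a variable measured in dollars from swamping one measured in minutes; a fitted model would typically replace the equal weights with estimated ones.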
Additional Resources from Evive
Data Science Team
Shatakshi Gariya, Data Analyst
Abhishek Ghosh, Data Analyst
Shravan KM, Data Scientist
Keerthi Prasad, Lead, Machine Learning and Implementation
Sarah Sturino, Outcomes & Analytics Associate
Dr. Vivek Mishra, Senior Data Scientist
Dr. Arun Rajagopalan, Senior Director, Data Sciences
Vijendra S.K, Manager, Data Sciences
Rakshit Bhalla, Software Developer, Data Science