Please note that this page is a work in progress! We will continue to update this page as we update and improve our models. Click here for an excellent primer on many of the terms used below.

State-level SEIR Model

The model used here was developed in an effort to better understand the past, current, and future transmission dynamics of SARS-CoV-2 in North Carolina. The models were developed and are maintained by researchers at UNC Chapel Hill and North Carolina State using data from the North Carolina Department of Health and Human Services (NCDHHS) curated by The Raleigh News & Observer, WRAL News, and the Duke-Margolis Center for Health Policy.

Model Overview

As a base, we use a classic compartmental susceptible (S), exposed (E)¹, infectious (I), recovered (R) model, or SEIR model, to estimate SARS-CoV-2 transmission and recovery. In the SEIR model, infection dynamics are modeled via people moving between these compartments, i.e., S > E > I > R. Because we are modeling over a short time period, the model does not incorporate births or non-COVID-19 deaths that affect population dynamics in North Carolina (click here for more information on SEIR models). We chose this relatively straightforward model because it requires estimating fewer parameters, which is important given the high level of uncertainty surrounding SARS-CoV-2 and COVID-19.
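For reference, a common textbook form of an SEIR system without births or deaths is shown here. The notation is illustrative: N is the total population, \(\beta\) is the transmission rate, \(\delta\) is the rate at which exposed people become infectious, and \(\gamma\) is the recovery rate. The equations as implemented in our model may differ in detail.

\[
\begin{aligned}
\frac{dS}{dt} &= -\beta \frac{S I}{N} \\
\frac{dE}{dt} &= \beta \frac{S I}{N} - \delta E \\
\frac{dI}{dt} &= \delta E - \gamma I \\
\frac{dR}{dt} &= \gamma I
\end{aligned}
\]

The four right-hand sides sum to zero, so S + E + I + R stays equal to N, which is consistent with leaving out births and deaths.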

Time Periods

At the time of the initial model development, visual inspection of the statewide cumulative and daily infections suggested that SARS-CoV-2 transmission changed at three distinct time points in North Carolina, which corresponded to the beginning of the statewide Stay-At-Home order (March 30, 2020), the beginning of Phase 1 of the “Staying Ahead of the Curve” reopening plan (May 8, 2020), and the beginning of Phase 2 of the reopening plan (May 22, 2020); see our Timeline. Thus, we originally implemented the SEIR model in four distinct time periods, which allowed the Effective Reproduction Number (Rt, the average number of people infected by each infected person)² to vary over time as changes in the restrictiveness of social distancing measures and in people’s behavior changed the transmission dynamics of SARS-CoV-2 in North Carolina.

Time Period Updates

As the pandemic has continued and transmission dynamics have changed in North Carolina, we have added two additional inflection points to our SEIR model, breaking the data into six time periods. The fifth time period began on July 6, 2020, following Governor Cooper’s order that face coverings be worn in public spaces (June 29, 2020) and the 4th of July holiday weekend. The sixth time period began on September 4, 2020, the beginning of Phase 2.5 of the state’s reopening plan.
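One way to picture the time periods is as a step function that returns the Rt in force on a given date. The sketch below is illustrative only: the names `PERIOD_STARTS`, `PERIOD_RT`, and `rt_on` are hypothetical, the first start date is the model initialization date given later on this page, and the Rt values are copied from the "Tuned" column of the updated parameter table.

```python
from bisect import bisect_right
from datetime import date

# Start date of each of the six time periods (first entry: model initialization).
PERIOD_STARTS = [
    date(2020, 2, 24),  # initial period
    date(2020, 3, 30),  # Stay-At-Home order
    date(2020, 5, 8),   # Phase 1
    date(2020, 5, 22),  # Phase 2
    date(2020, 7, 6),   # Phase 2 + mask order
    date(2020, 9, 4),   # Phase 2.5
]

# Illustrative Rt per period, taken from the updated tuned-parameter table.
PERIOD_RT = [2.73, 1.17, 1.25, 1.21, 0.99, 1.19]


def rt_on(day: date) -> float:
    """Return the Rt of the time period that contains `day`."""
    if day < PERIOD_STARTS[0]:
        raise ValueError("date is before the model start")
    # bisect_right counts how many period starts are <= day.
    index = bisect_right(PERIOD_STARTS, day) - 1
    return PERIOD_RT[index]
```

Adding a new time period then amounts to appending one start date and one Rt value.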

Parameters

Transmission

An SEIR model requires parameters to estimate the flow of people between compartments (S > E > I > R). These are \(\beta\), \(\delta\), and \(\gamma\), which can be calculated from the SARS-CoV-2 incubation period (length of time that a person is infected but not yet infectious), infectious period (length of time that a person is infected and infectious, i.e., able to spread the infection), and Rt. An SEIR model also requires the proportion of the population in each compartment at the beginning of the model. In our model, the incubation period and infectious period are fixed over time (which means \(\delta\) and \(\gamma\) are fixed). However, we allowed Rt to vary across the four time periods, which essentially allowed \(\beta\) to vary.
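The conversion from epidemiological quantities to rates can be sketched as follows. This assumes the usual textbook relationships (\(\delta\) = 1 / incubation period, \(\gamma\) = 1 / infectious period, \(\beta\) = Rt × \(\gamma\)); the exact formulas used in our code are not spelled out on this page, and the function name `seir_rates` is hypothetical.

```python
def seir_rates(incubation_days: float, infectious_days: float, rt: float):
    """Convert periods (in days) and Rt into per-day SEIR rates.

    Assumed relationships (textbook form, not confirmed by this page):
      delta = 1 / incubation period   (E -> I)
      gamma = 1 / infectious period   (I -> R)
      beta  = Rt * gamma              (S -> E)
    """
    if incubation_days <= 0 or infectious_days <= 0:
        raise ValueError("periods must be positive")
    delta = 1.0 / incubation_days
    gamma = 1.0 / infectious_days
    beta = rt * gamma
    return beta, delta, gamma


# Example with the initially tuned values (5.85 days, 6.09 days, Rt 2.67):
# beta is roughly 0.438, delta roughly 0.171, gamma roughly 0.164 per day.
beta, delta, gamma = seir_rates(5.85, 6.09, 2.67)
```

Because \(\delta\) and \(\gamma\) depend only on the fixed periods, changing Rt between time periods changes only \(\beta\).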

Initial Date and Conditions

The NCDHHS data begin reporting cases on March 2, 2020 (with 1 new case). However, it is likely that SARS-CoV-2 was circulating prior to this first positive test result. Furthermore, we needed to take into account that there is a temporal lag between when a person becomes infected and when they are reported as a lab confirmed case. Initial inquiries and estimates suggested that this lag is roughly 7 days. Thus, we shifted the date of the lab confirmed infections back seven days to better reflect when people developed symptoms. As such, we initialized our model on February 24, 2020 with 200 people (0.002%) in the infectious compartment and 400 people (0.004%) in the exposed compartment (percentages based on a population of 10,488,084 in North Carolina).
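To make the initial conditions concrete, the sketch below builds the starting state and steps a basic SEIR system forward with a simple forward-Euler scheme. The solver choice, the function names, and the rate values in the example are assumptions for illustration; this page does not state which numerical solver we use.

```python
N = 10_488_084  # North Carolina population used for the percentages above


def initial_state(n=N, exposed=400, infectious=200):
    """Return (S, E, I, R) on the initialization date (February 24, 2020)."""
    return (n - exposed - infectious, exposed, infectious, 0)


def simulate(state, beta, delta, gamma, days, steps_per_day=10):
    """Forward-Euler integration of a basic SEIR model.

    Returns a list with one (S, E, I, R) tuple per day, including day 0.
    """
    s, e, i, r = (float(x) for x in state)
    n = s + e + i + r
    dt = 1.0 / steps_per_day
    history = [(s, e, i, r)]
    for _day in range(days):
        for _step in range(steps_per_day):
            new_exposed = beta * s * i / n * dt   # S -> E
            new_infectious = delta * e * dt       # E -> I
            new_recovered = gamma * i * dt        # I -> R
            s -= new_exposed
            e += new_exposed - new_infectious
            i += new_infectious - new_recovered
            r += new_recovered
        history.append((s, e, i, r))
    return history


# Illustrative rates only (per day); 35 days covers February 24 to March 30.
history = simulate(initial_state(), beta=0.438, delta=0.171, gamma=0.164, days=35)
```

Each flow is subtracted from one compartment and added to the next, so the total population is conserved up to floating-point error.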

Testing

Another hurdle with SARS-CoV-2 modeling efforts is that we do not have measured data on the true number of people who have been infected, as some infected people are asymptomatic or have very mild symptoms. Further, the lack of testing infrastructure (especially early in the epidemic) has hindered (and continues to hinder) testing. What we do have are data on lab confirmed infections, which are subject to a number of limitations because they are a function of the availability of testing, the number of tests performed, the efficiency of the testing approach (who gets tested), the limitations of the tests themselves, and the number of people actually infected, all of which appear to vary over time. Importantly, however, infection dynamics are driven by true infections, not lab confirmed cases. Thus, we modeled true infections and used these estimates to calculate the number of lab confirmed cases (using a time-varying percentage value), which allowed us to compare the model result to the observed data for North Carolina. Rather than attempt to model the separate factors that could influence the percent of all infections that were lab confirmed, we chose starting and ending percent values, and used the testing data to approximate the percent of true infections captured in the daily lab confirmed case data.
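As a rough illustration of a time-varying percent confirmed, the sketch below interpolates linearly between two anchor points tied to the daily number of tests. The linear form, the clamping outside the anchors, and the function names are all assumptions; the anchor values (1,000 tests and 40,000 tests) are taken from the "Tuned" column of the updated parameter table on this page.

```python
def percent_confirmed(tests, low_tests=1_000, high_tests=40_000,
                      low_pct=10.89, high_pct=37.13):
    """Percent of true infections that are lab confirmed, given daily tests.

    Assumption: linear interpolation between the two anchors, held constant
    below the low anchor and above the high anchor.
    """
    fraction = (tests - low_tests) / (high_tests - low_tests)
    fraction = min(1.0, max(0.0, fraction))
    return low_pct + fraction * (high_pct - low_pct)


def modeled_confirmed_cases(new_infections, tests):
    """Scale modeled true infections down to expected lab confirmed cases."""
    return new_infections * percent_confirmed(tests) / 100.0
```

With this form, a day with more tests maps the same number of true infections to more lab confirmed cases, which is what lets the model output be compared with the observed case counts.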

Initial Ranges for Estimated Model Parameters

| Parameter | Low | High | Unit |
| --- | --- | --- | --- |
| Incubation Period | 4 | 8 | Days |
| Infectious Period | 5 | 12 | Days |
| Infections Lab Confirmed, early | 4 | 10 | Percent |
| Infections Lab Confirmed, current | 10 | 25 | Percent |
| Rt (initial) | 2 | 3 | Unitless |
| Rt (Stay at Home) | 1 | 2 | Unitless |
| Rt (Phase 1) | 1 | 2 | Unitless |
| Rt (Phase 2) | 1 | 2 | Unitless |

Initial Parameter Estimates

In the initial modeling phase, we ran the time-varying SEIR with data up to June 11, 2020 (the most up-to-date observed data at the time this step was implemented) using initial parameter values drawn from the ranges provided in the table. For each model run, we randomly drew a value from a uniform distribution for each parameter and solved the model. We then calculated the accuracy of that set of parameters by calculating the root mean squared error (RMSE) of the model-derived daily lab confirmed cases compared to the 7-day floating average of the observed daily lab confirmed cases (we used the 7-day average because of the high amount of noise in the raw daily observed data). We performed this calculation 300,000 times, using Latin Hypercube Sampling in an effort to evaluate the full range of the set of potential parameter values. From the 300,000 model runs, we subset to the 5,000 that best fit the observed 7-day floating average data (had the lowest RMSE), and calculated the mean and standard deviation of each of the 8 parameters.
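The sampling and scoring steps can be sketched in plain Python as follows. This is a minimal illustration rather than our production code: the function names are hypothetical, the Latin Hypercube routine is a basic stratify-and-shuffle version, and the trailing 7-day window is an assumption (the page does not say whether the floating average is trailing or centered).

```python
import math
import random


def latin_hypercube(ranges, n_samples, rng):
    """Draw n_samples parameter sets; `ranges` maps name -> (low, high).

    Each parameter's range is cut into n_samples equal strata, one uniform
    draw is taken inside each stratum, and the draws are shuffled so strata
    are paired at random across parameters.
    """
    columns = {}
    for name, (low, high) in ranges.items():
        points = [low + (high - low) * (k + rng.random()) / n_samples
                  for k in range(n_samples)]
        rng.shuffle(points)
        columns[name] = points
    return [{name: columns[name][k] for name in ranges}
            for k in range(n_samples)]


def trailing_average(values, window=7):
    """Trailing moving average; early days use the values available so far."""
    averaged = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1):k + 1]
        averaged.append(sum(chunk) / len(chunk))
    return averaged


def rmse(modeled, observed):
    """Root mean squared error between two equal-length series."""
    if len(modeled) != len(observed):
        raise ValueError("series must have the same length")
    squared = [(m - o) ** 2 for m, o in zip(modeled, observed)]
    return math.sqrt(sum(squared) / len(squared))


def best_runs(samples, loss, keep):
    """Return the `keep` parameter sets with the lowest loss."""
    return sorted(samples, key=loss)[:keep]
```

In this sketch, `loss` would be a function that runs the SEIR model for one parameter set and returns the RMSE against the smoothed observed cases; the mean and standard deviation of each parameter would then be computed over the sets returned by `best_runs`.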

Tuned Model Parameters

| Parameter | Mean | Std Dev | Tuned | Unit |
| --- | --- | --- | --- | --- |
| Incubation Period | 5.77 | 1.11 | 5.85 | Days |
| Infectious Period | 7.42 | 1.63 | 6.09 | Days |
| Infections Lab Confirmed, early | 7.87 | 1.55 | 7.44 | Percent |
| Infections Lab Confirmed, current | 17.33 | 4.28 | 20.38 | Percent |
| Rt (initial) | 2.69 | 0.23 | 2.67 | Unitless |
| Rt (Stay at Home) | 1.28 | 0.16 | 1.13 | Unitless |
| Rt (Phase 1) | 1.26 | 0.17 | 1.17 | Unitless |
| Rt (Phase 2) | 1.36 | 0.18 | 1.27 | Unitless |

Tuned Parameter Estimates

We calculated a tuned set of parameter estimates by again running the time-varying SEIR model, this time 30,000 times. For these runs, we drew from a normal distribution for each parameter based on the mean and standard deviation from the best 5,000 results of the initial model runs. From those 30,000 model runs, we chose the single set of parameter values that was most accurate (lowest RMSE) compared to the observed 7-day floating average values. The input and output parameter values from our best fit (tuned) model are provided in the table, and the observed and modeled data for this initial modeling process are graphed below. The tuned model output fit the observed data very well, with an RMSE value of 25.6 (roughly translating to an average error of 25.6 cases per day for March 13 to June 11).
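The tuning step is essentially a random search around the earlier estimates. A minimal sketch, assuming a hypothetical `tune` function and a caller-supplied `loss` that runs the model and returns an RMSE:

```python
import random


def tune(summary, loss, n_draws, rng):
    """Random search around earlier estimates.

    `summary` maps parameter name -> (mean, standard deviation). Each draw
    samples every parameter from its normal distribution; the draw with the
    lowest loss is kept.
    """
    best_params, best_loss = None, float("inf")
    for _ in range(n_draws):
        candidate = {name: rng.gauss(mean, sd)
                     for name, (mean, sd) in summary.items()}
        value = loss(candidate)
        if value < best_loss:
            best_params, best_loss = candidate, value
    return best_params, best_loss


# Toy usage: the loss here is a stand-in, not an SEIR run.
rng = random.Random(1)
best, score = tune({"rt_initial": (2.69, 0.23)},
                   lambda p: abs(p["rt_initial"] - 2.67),
                   n_draws=2000, rng=rng)
```

Passing a `summary` with a single entry turns the same routine into a one-parameter search, as when only the most recent Rt is re-estimated.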

Daily Updates

Every day, we incorporate the new case count and number of tests to update our model. For these updates, we only update the most recent temporal period. To accomplish this, we draw from a normal distribution centered on the most recent Rt value, with a standard deviation taken from the tuned estimates, while the other eight parameters are held constant. We run this 30,000 times and identify the Rt that produces the lowest RMSE.

Updated Tuned Model Parameters

| Parameter | Mean | Std Dev | Tuned | Unit |
| --- | --- | --- | --- | --- |
| Incubation Period | 5.85 | 0.54 | 5.99 | Days |
| Infectious Period | 5.80 | 0.52 | 6.05 | Days |
| Lab Confirmed Infections (%), 1,000 tests | 7.31 | 1.68 | 10.89 | Percent |
| Lab Confirmed Infections (%), 40,000 tests | 35.03 | 2.88 | 37.13 | Percent |
| Rt (initial) | 2.74 | 0.11 | 2.73 | Unitless |
| Rt (Stay at Home) | 1.15 | 0.07 | 1.17 | Unitless |
| Rt (Phase 1) | 1.16 | 0.09 | 1.25 | Unitless |
| Rt (Phase 2) | 1.25 | 0.07 | 1.21 | Unitless |
| Rt (Phase 2 + Mask Order) | 0.98 | 0.04 | 0.99 | Unitless |
| Rt (Phase 2.5 and 3) | 1.16 | 0.10 | 1.19 | Unitless |
| Rt (Thanksgiving to New Year) | NA | NA | 1.44 | Unitless |
| Rt (post-New Year) | NA | NA | 1.01 | Unitless |
| Rt (eased restrictions) | NA | NA | 1.61 | Unitless |
| Rt (fewer restrictions, delta variant) | NA | NA | 3.76 | Unitless |
| Rt (fewer restrictions, delta variant) | NA | NA | 2.94 | Unitless |
| Rt (reduced transmission, delta variant) | NA | NA | 2.33 | Unitless |

Major Updates

On July 14, 2020, we changed how our model estimates the percent of infections confirmed via testing, which resulted in a relatively large shift in our future estimates of lab-confirmed cases (but not in our estimates of future infections).

On July 28, 2020, we added the fifth temporal period to the model.

Using data until October 12th, 2020, we reimplemented the entire modeling process with some updates that included narrower initial ranges for the first parameter estimation step (based on our prior results), a more straightforward approach to modeling the percent of infections confirmed and the number of tests, and the addition of a sixth time period corresponding to the beginning of Phase 2.5 of the NC reopening plan.

On January 8th, 2021, we added the seventh temporal period to the model corresponding to the increase in transmission observed around Thanksgiving.

On February 4th, 2021, we added the eighth temporal period to the model corresponding to the decrease in transmission observed after New Year’s Day.

On March 25th, 2021, we modified the model to account for the vaccinated population; furthermore, we added a ninth temporal period to the model corresponding to the increase in transmission following the easing of some statewide social distancing restrictions.

On July 8th, 2021, we added a tenth temporal period to the model corresponding to the increase in transmission following the further easing of restrictions.

On August 31st, 2021, we added an eleventh temporal period to the model corresponding to a slight reduction in transmission in early August.

On September 27th, 2021, we added a twelfth temporal period to the model corresponding to further reductions in transmission in mid-September.

Current Model Fit

Model Estimates and Observed Data, Daily Lab Confirmed Cases. Note: the popup text lists the model estimates of lab confirmed cases.


  1. People who are infected with SARS-CoV-2 are not immediately infectious (able to transmit the virus). An infected person goes through an incubation period where they cannot transmit the virus to another person. People in this incubation period are “exposed” in this model.

  2. Because we are referring to a dynamic variable that changes over time, we use the term “Effective Reproduction Number” and symbol “Rt”, rather than the term “Basic Reproduction Number” and symbol “R0”. See this article for more information.