Weakly Coupled Dynamic Programs, Fluid Limits, and COVID-19: How Optimization Can Help Inform Policymaking

By Huikang Liu and Wolfram Wiesemann

Editor's Note: This article originally appeared in SIAM News on December 22, 2021 (https://sinews.siam.org/Details-Page/weakly-coupled-dynamic-programs-fluid-limits-and-covid-19-how-optimization-can-help-inform-policymaking).

Health systems around the world currently face the dual challenge of providing emergency care to intermittent waves of COVID-19 patients while simultaneously scheduling elective (i.e., planned) procedures and caring for other patients. Many countries have addressed this challenge by prioritizing COVID-19 cases, to the detriment of other patients. In fact, some healthcare systems have canceled elective procedures and rationed access to life-saving care. As of October 2021, the U.K. faces an estimated backlog of 5.98 million patients who are waiting for elective treatment; the number of patients who have waited for more than one year is 237 times higher than it was two years ago [2]. In response to these developments, the Nuffield Council on Bioethics issued a national statement that emphasizes the need for a coordinated patient prioritization policy that considers all disease groups [1].

Here we summarize our recent work that develops a national patient prioritization scheme by combining data analysis and optimization [3, 4]. Overall, we believe that these findings provide an encouraging example of using applied mathematics to study the entire health system (rather than individual components) and suggest decisions that are based on the estimated capacity to benefit while rigorously accounting for system dynamics and resource constraints. Doing so could help to improve current medical practices.

Mathematical Optimization Problem

We model the patients in our health system as finite horizon dynamic programs \((\mathcal{S}, \mathcal{A}, q, p, r)\) whose time horizon \(\mathcal{T}= \{ 1, \ldots, 56 \}\) records the weeks of the upcoming year (we add the last four periods to alleviate end-of-horizon effects). The finite state set \(\mathcal{S} = \{ 1, \ldots, S \}\) documents the patient’s health state (i.e., elective procedure or emergency, recovered or deceased) and treatment state (i.e., waiting for treatment, in general and acute or critical care), and the finite action set \(\mathcal{A} = \{ 1, \ldots, A \}\) describes the treatment options (i.e., admit and move to general and acute or critical care, deny care, or discharge from hospital). The initial state is described by \(q (s)\), \(s \in \mathcal{S}\), and the (non-stationary) state transition probabilities are described by \(p_t (s' \, | \, s, a)\), \(t \in \mathcal{T}\), \(s\), \(s' \in \mathcal{S}\), and \(a \in \mathcal{A}\). In contrast, \(r_t (s, a)\) denotes the expected rewards (typically the years of life saved when compared to the do-nothing policy) that are earned in week \(t \in \mathcal{T}\) if the patient is in state \(s \in \mathcal{S}\) and one applies action \(a \in \mathcal{A}\). We seek a policy for this dynamic program—that is, a period-wise mapping from states to actions—that maximizes the expected total rewards. Figure 1 provides a simplified illustration of our patient dynamic program.

Figure 1. A simplified version of our patient dynamic program. Images correspond to patient states, arcs denote stochastic transitions (we omit the associated probabilities, which differ by patient group), and arc labels refer to different actions. Initially, many of the patients are in an artificial waiting state (1) since they have not yet entered the system. Patients then arrive either as elective appointments (2) or emergencies (3). While elective appointments can be postponed, we assume that emergency patients will die if not treated in the same week. Patients require treatment in either general and acute care (4) or critical care (5). Patients that need critical care may still be treated in general and acute care due to a lack of resources, in which case they enter a designated state (6) that reflects less favorable future transition probabilities. Ultimately, patients either recover (7) or die (8). Figure courtesy of Wolfram Wiesemann.

Our numerical results for England consider approximately 10 million individual patient dynamic programs — a number that reflects current patient estimates for the country. We subdivide these dynamic programs into 3,360 homogenous patient groups \(j \in \mathcal{J} = \{ 1, \ldots, 3\text{,}360 \}\) with associated dynamic programs \((\mathcal{S}_j, \mathcal{A}_j, q_j, p_j, r_j)\) that are characterized by the same arrival time \(t \in \mathcal{T}\), same age group (subdivided into three different brackets), and same disease type (out of 20 different possibilities). Because all patients compete for the same resources (most notably senior and junior doctors, nurses, and hospital beds), we cannot study their dynamic programs in isolation. We therefore define a grouped weakly coupled dynamic program whose states \(\sigma = (\sigma_j)_{j \in \mathcal{J}}\) with \(\sigma_j : \mathcal{S}_j \rightarrow \mathbb{N}_0\) record the number of patients in each patient group \(j \in \mathcal{J}\) who are in each of the patient states \(s \in \mathcal{S}_j\). The actions \(\alpha = (\alpha_j)_{j \in \mathcal{J}}\) with \(\alpha_j : \mathcal{S}_j \times \mathcal{A}_j \rightarrow \mathbb{N}_0\) record the number of patients in each patient group \(j \in \mathcal{J}\) and each patient state \(s \in \mathcal{S}_j\) who receive action \(a \in \mathcal{A}_j\). The initial state and transition probabilities of the weakly coupled dynamic program are governed by multinomial distributions that one can derive from the individual patients’ initial states and transition probabilities via an independence assumption. The expected rewards from the action are equal to the sums of the respective expected rewards of the individual patient dynamic programs. As before, we seek a policy that maximizes the expected total rewards (now over all patients) — subject to the satisfaction of the resource constraints.

Fluid Approximation

The aforementioned grouped weakly coupled dynamic program is too large to solve exactly. However, one can show that replacing the stochastic patient evolutions with their expectations and relaxing the integrality requirement—which stipulates that each patient must be assigned a single action at any point in time—results in a relaxed grouped weakly coupled dynamic program that is solved exactly by the following linear program:

\[ \begin{align} \underset{\sigma, \; \pi}{\textrm{maximize}} \enspace \enspace \enspace & \sum_{t \in \mathcal{T}} \sum_{j \in \mathcal{J}} \sum_{s \in \mathcal{S}_j} \sum_{a \in \mathcal{A}_j} r_{jt} (s, a) \cdot \pi_{tj} (s, a) \\

\textrm{subject to} \enspace \enspace \enspace & \sigma_{1j}(s) = n_j \cdot q_j (s) & \forall j \in \mathcal{J}, \; \forall s \in \mathcal{S}_j \\

& \sigma_{t+1, j}(s') = \sum_{s \in \mathcal{S}_j} \sum_{a \in \mathcal{A}_j} p_{jt} (s' \, | \, s, a) \cdot \pi_{tj} (s, a) & \forall j \in \mathcal{J}, \; \forall s' \in \mathcal{S}_j, \; \forall t \in \mathcal{T} \setminus \{ T \} \\

& \sum_{j \in \mathcal{J}} \sum_{s \in \mathcal{S}_j} \sum_{a \in \mathcal{A}_j} c_{tlj} (s, a) \cdot \pi_{tj}(s, a) \leq b_{tl} & \forall l \in \mathcal{L}, \; \forall t \in \mathcal{T} \\

& \sum_{a \in \mathcal{A}_j} \pi_{tj} (s, a) = \sigma_{tj} (s) & \forall j \in \mathcal{J}, \; \forall s \in \mathcal{S}_j, \; \forall t \in \mathcal{T} \\

&\sigma_{tj} (s), \; \pi_{tj} (s, a) \geq 0 &\forall j \in \mathcal{J}, \; \forall (s, a) \in \mathcal{S}_j \times \mathcal{A}_j, \; \forall t \in \mathcal{T}.

\end{align} \]

This linear program has a natural interpretation. The decision variables \(\sigma_{tj} (s)\) record the number of patients in each patient group \(j \in \mathcal{J}\) that are in state \(s \in \mathcal{S}_j\) during time period \(t \in \mathcal{T}\), and the decision variables \(\pi_{tj} (s, a)\) determine how often one applies each admissible action \(a \in \mathcal{A}_j\) to the patients that are accounted for by \(\sigma_{tj} (s)\). The objective function maximizes the expected total rewards over all time periods \(t \in \mathcal{T}\), all patient groups \(j \in \mathcal{J}\), and all state-action pairs \((s, a) \in \mathcal{S}_j \times \mathcal{A}_j\). The first constraint ensures that \(\sigma_{1j}\) adheres to the initial state distributions \(q_j\); here, \(n_j\) denotes the number of patients in patient group \(j \in \mathcal{J}\). The second constraint stipulates that the one-step patient evolutions respect the transition probabilities \(p_{jt}\) and the selected policy \(\pi_{tj}\). The third constraint ensures that the cumulative resource consumption across all patient groups stays within the weekly capacities \(b_{tl}\) for all resources \(l \in \mathcal{L}\). Finally, the penultimate constraint guarantees that no patients “disappear” in any time period. Despite its nontrivial size, one can reliably solve this fluid linear program on a standard laptop with open-source linear programming solvers in under two minutes. This feat is mainly due to the fact that neither the number of variables nor the number of constraints in the problem scale with the number \(\Sigma_{j \in \mathcal{J}} n_j\) of patients in the health system.

One can show that the fluid linear program constitutes a relaxation of the grouped weakly coupled dynamic program — that is, its optimal objective value overestimates the achievable expected total reward. At the same time, an optimal solution \((\sigma^\star, \pi^\star)\) to the fluid linear program enables the recovery of a feasible policy for the grouped weakly coupled dynamic program whose suboptimality (relative to an optimal policy) is small with high probability; this suboptimality vanishes on a relative scale as the health system grows in size. These properties give us confidence that the fluid linear program’s solution can provide meaningful decision-making advice.

Policy Implications

Our numerical study aims to optimally schedule elective and emergency procedures across England over a 56-week planning horizon to maximize the overall years of life that are gained when compared to the current policy. We consider a total of 10.45 million non-COVID-19 patients (3.9 million electives and 6.55 million emergencies), as well as 349,279 COVID-19 patients (all emergencies). Our results indicate that our optimized schedule can lead to a gain of 50,750 to 5,891,608 years of life—depending on the additional provided capacity and assumptions about the inflows of emergency admissions and pandemic trajectories—over prioritization policies like the ones that England implemented during the ongoing pandemic. Notable health gains exist for neoplasms, diseases of the digestive system, injuries, and poisoning. Figure 2 summarizes some of our findings and demonstrates the advantages of prioritizing specific non-COVID-19 patients who have a higher capacity to benefit.

Figure 2. Years of life that our optimized schedule gains relative to England’s implemented policy for a typical scenario, categorized by disease group. Figure courtesy of [4].

Our study outlines life-saving prioritization principles that health systems can implement — even those that lack sufficient amounts of available data to solve our optimization problem. Such principles are as follows:

Consider the relative capacity to benefit of patients with different diseases and ages
Postpone elective treatments for which disease progression is mild and whose patients have lower chances of being admitted to emergency care
Prioritize access to critical care based on capacity to benefit rather than default prioritization of one disease (for example, COVID-19) over other patients.

As data becomes more readily available, we believe that the combination of data analysis and optimization—as we have presented here—will find further applications to improve the efficiency of health systems worldwide.

Wolfram Wiesemann presented this research during a minisymposium at the 2021 SIAM Conference on Optimization, which took place virtually earlier this year in conjunction with the 2021 SIAM Annual Meeting.

References
[1] Archard, D., & Whittall, H. (2021, January 13). Statement: The need for national guidance on resource allocation decisions in the COVID-19 pandemic. Nuffield Council on Bioethics. Retrieved from https://www.nuffieldbioethics.org/news/statement-the-need-for-national-guidance-on-resource-allocation-decisions-in-the-covid-19-pandemic.
[2] The British Medical Association. (2021). Pressure points in the NHS. BMA. Retrieved from https://www.bma.org.uk/advice-and-support/nhs-delivery-and-workforce/pressures/pressure-points-in-the-nhs.
[3] D’Aeth, J.C., Ghosal, S., Grimm, F., Haw, D., Koca, E., Lau, K., … Miraldo, M. (2021). Optimal national prioritization policies for hospital care during the SARS-CoV-2 pandemic. Nat. Comput. Sci., 1(8), 521-531.
[4] D’Aeth, J.C., Ghosal, S., Grimm, F., Haw, D., Koca, E., Lau, K., … Wiesemann, W. (2021). Optimal hospital care scheduling during the SARS-CoV-2 pandemic. Optimization Online.

Huikang Liu is an assistant professor at Shanghai University of Finance and Economics. He received his B.S.E. in mathematics from the University of Science and Technology of China and his Ph.D. in systems engineering and engineering management from the Chinese University of Hong Kong. Liu has subsequently been a postdoctoral researcher at the Chinese University of Hong Kong and Imperial College Business School. His research interests revolve around continuous optimization and its application to machine learning, statistics, and signal processing.

Wolfram Wiesemann is a professor of analytics and operations as well as a fellow of the KPMG Centre for Advanced Business Analytics at Imperial College Business School. He currently serves on the editorial boards of six journals, including the SIAM Journal on Optimization. Wiesemann’s research interests revolve around the methodological aspects of decision-making under uncertainty and applications in logistics, energy, and healthcare.

Categories: Magazine, Articles

Tags:

Please login or register to post comments.

Name:

Email:

Subject:

Message:

Weakly Coupled Dynamic Programs, Fluid Limits, and COVID-19: How Optimization Can Help Inform Policymaking

Mathematical Optimization Problem

Fluid Approximation

Policy Implications

Please login or register to post comments.

Archive