There is a crisis within healthcare technology research and development, wherein historically marginalized groups are under-researched in preclinical studies, under-represented in clinical trials, misunderstood by clinical practitioners, and harmed by biased medical technology. These issues in turn contribute to costly disparities in healthcare outcomes, leading to losses of $93 billion a year in excess medical-care costs, $42 billion a year in lost productivity, and $175 billion a year due to premature deaths. COVID-19 put these disparities into especially sharp focus. In December 2020, pulse oximeters, critical for healthcare monitoring during the pandemic, were shown to be much less accurate in patients with darker skin, thereby putting those patients at a greater risk of organ damage. The Food and Drug Administration (FDA) responded by issuing a safety communication, but not with any changes to regulation of pulse oximeters.
Especially for an administration that has embedded equity throughout its policy agenda, this situation is unacceptable. The Biden-Harris Administration must act to address bias in medical technology at the development, testing and regulation, and market-deployment and evaluation phases. This will require coordinated effort across multiple agencies. In the development phase, science-funding agencies should crack down on federally funded studies that do not conduct mandatory subgroup analysis for diverse populations. Funding agencies should also expand funding for under-resourced research areas. In the testing and regulation phase, the FDA should raise the threshold for evaluation of medical technologies, make diversity requirements binding, and expand data-auditing processes. In the market-deployment and evaluation phases, the FDA should strengthen reporting mechanisms for adverse outcomes, the Federal Trade Commission (FTC) should require impact assessments of deployed technologies, and the Agency for Healthcare Research and Quality (AHRQ) should identify technologies that could address healthcare disparities.
Challenge and Opportunity
Bias is regrettably endemic in medical innovation. Drugs are incorrectly dosed to people assigned female at birth due to historical exclusion of women from clinical trials. Medical algorithms make healthcare decisions based on biased health data, clinically disputed race-based corrections, and/or model choices that exacerbate healthcare disparities. Much medical equipment is not accessible, thus violating the Americans with Disabilities Act. Biased studies, technology, and equipment inevitably produce disparate outcomes in U.S. healthcare.
The problem of bias in medical innovation manifests in multiple ways: cutting across technological sectors in clinical trials, pervading the commercialization pipeline, and impeding equitable access to critical healthcare advances.
Bias in medical innovation cuts across technology sectors
The 1993 National Institutes of Health (NIH) Revitalization Act required federally funded clinical studies to (i) include women and racial minorities as participants, and (ii) break down results by sex and race or ethnicity. Yet a 2019 study found that only 13.4% of NIH-funded trials performed the mandatory subgroup analysis. Moreover, the increasing share of industry-funded studies is not subject to Revitalization Act mandates; these studies are governed only by non-binding FDA recommendations for clinical-trial diversity. As a result, they frequently fail to report differences in outcomes by patient population. The resulting disparities in clinical-trial representation are stark: African Americans represent 12% of the U.S. population but only 5% of clinical-trial participants, Hispanics make up 16% of the population but only 1% of clinical-trial participants, and sex distribution in some trials is 67% male. Finally, many medical technologies approved prior to 1993 have not been reassessed for potential bias. One outcome of such inequitable representation is evident in drug dosing protocols: sex-aware prescribing guidelines exist for only a third of all drugs.
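The representation figures above can be put on a common scale by dividing each group's share of trial participants by its share of the population, so that 1.0 means parity. The sketch below is illustrative only: the function and variable names are assumptions introduced here, and the only data used are the percentages quoted in the preceding paragraph.

```python
# Population and trial shares quoted in the paragraph above (percent).
SHARES = {
    "African American": {"population": 12.0, "trial": 5.0},
    "Hispanic": {"population": 16.0, "trial": 1.0},
}


def representation_ratio(trial_share: float, population_share: float) -> float:
    """Return trial share divided by population share (1.0 means parity)."""
    if population_share <= 0:
        raise ValueError("population_share must be positive")
    return trial_share / population_share


def summarize(shares: dict) -> dict:
    """Map each group to its representation ratio."""
    return {
        group: representation_ratio(s["trial"], s["population"])
        for group, s in shares.items()
    }


if __name__ == "__main__":
    for group, ratio in summarize(SHARES).items():
        print(f"{group}: {ratio:.2f} of parity")
```

On these figures, both groups participate in trials at well under half the rate their population share would imply.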
Bias in U.S. medical innovation — perpetuated by weak or weakly enforced federal regulations — extends beyond clinical trials. As explained below, bias pervades medical algorithms, medical devices, and the pharmaceutical sector as well.
Regulation of medical algorithms varies based on end application, as defined in the 21st Century Cures Act. Only algorithms that (i) acquire and analyze medical data and (ii) could have adverse outcomes are subject to FDA regulation. Thus, clinical decision-support software is not regulated even though these technologies make important clinical decisions in 90% of U.S. hospitals.
Even when a medical algorithm is regulated, regulation may occur through relatively permissive de novo pathways and 510(k) pathways. A de novo pathway is used for novel devices determined to be low to moderate risk, and thus subject to a lower burden of proof with respect to safety and equity. A 510(k) pathway can be used to approve a medical device exhibiting “substantial equivalence” to a previously approved device, i.e., it has the same intended use and/or same technological features. Different technical features can be approved so long as no questions are raised about safety and effectiveness.
Medical devices approved through de novo pathways can be used as predicates for approval of devices through 510(k) pathways. Moreover, a device approved through a 510(k) pathway can remain on the market even if its predicate device was recalled. Widespread use of 510(k) approval pathways has generated a “collapsing building” phenomenon, wherein many technologies currently in use are based on failed predecessors. Indeed, 97% of devices recalled between 2008 and 2017 were approved via 510(k) clearance.
Even more alarming is evidence showing that machine learning can further entrench medical inequities. Because machine learning medical algorithms are powered by data from past medical decision-making, which is rife with human error, these algorithms can perpetuate racial, gender, and economic bias. Even algorithms demonstrated to be unbiased at the time of approval can evolve in biased ways over time, with little to no oversight from the FDA. As technological innovation progresses, an intentional focus on this problem will be required.
Finally, there is no list of approved medical algorithms on the market, making it difficult for researchers to assess them for bias.
Currently, the Medical Device User Fee Act requires the FDA to consider the least burdensome appropriate means for manufacturers to demonstrate the effectiveness of a medical device or to demonstrate a device’s substantial equivalence. This requirement was reinforced by the 21st Century Cures Act, which also designated a category for “breakthrough devices” subject to far less-stringent data requirements. Such legislation shifts the burden of clinical data collection to physicians and researchers, who might discover bias years after FDA approval. This legislation also makes it difficult to require assessments on the differential impacts of technology.
Like medical algorithms, many medical devices are approved through 510(k) exemptions or de novo pathways. The FDA has taken steps since 2018 to increase requirements for 510(k) approval and ensure that Class III (high-risk) medical devices are subject to rigorous pre-market approval, but problems posed by equivalence and limited diversity requirements remain.
The 1993 Revitalization Act governs only clinical trials for pharmaceuticals and does not make recommendations for adequate sex or genetic diversity in preclinical research. As a result, a disproportionately high number of male animals are used in research, and only 5% of cell lines used for pharmaceutical research are of African descent. Programs like All of Us, an effort to build diverse health databases through data collection, are promising steps towards improving equity and representation in pharmaceutical research and development (R&D). But stronger enforcement is needed to ensure that preclinical data (which informs function in clinical trials) reflects the diversity of our nation.
Bias in medical innovation exists throughout the commercialization pipeline
Bias occurs not only in multiple medical innovation sectors, but also across the development, testing and regulation, and market-deployment and evaluation phases of the medical innovation pipeline. This can be understood through the example of pulse oximeters.
Development
Pulse oximetry was developed by Biox and given FDA approval in 1980. The technology works by shining a light through the skin and measuring the difference in light absorbance to estimate arterial oxygen saturation. Melanin absorbs visible and infrared light and will interfere at all wavelengths. No algorithm has yet been developed to account for melanin attenuation. Hence pulse oximeter calibration data does not accurately reflect Black patients.
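To make the role of calibration concrete, the sketch below shows a commonly taught simplification of the absorbance calculation described above: a "ratio of ratios" of pulsatile (AC) to steady (DC) absorbance at red and infrared wavelengths, mapped to a saturation estimate with the textbook linear approximation SpO2 ≈ 110 − 25R. This is an assumption-laden illustration, not any manufacturer's algorithm; commercial devices rely on proprietary, empirically fitted calibration curves, and the function names here are hypothetical.

```python
def ratio_of_ratios(ac_red: float, dc_red: float, ac_ir: float, dc_ir: float) -> float:
    """Normalized pulsatile absorbance at red light divided by that at infrared."""
    if min(dc_red, dc_ir, ac_ir) <= 0:
        raise ValueError("dc_red, dc_ir, and ac_ir must be positive")
    return (ac_red / dc_red) / (ac_ir / dc_ir)


def spo2_textbook_estimate(r: float) -> float:
    """Textbook linear approximation SpO2 ~ 110 - 25 * R, clamped to 0-100%.

    Illustrative only: real devices use calibration curves fitted to data
    from study volunteers, so the estimate is only as representative as
    the people in that calibration data.
    """
    return max(0.0, min(100.0, 110.0 - 25.0 * r))


if __name__ == "__main__":
    # Hypothetical signal amplitudes, chosen only to exercise the functions.
    r = ratio_of_ratios(ac_red=0.02, dc_red=1.0, ac_ir=0.04, dc_ir=1.0)
    print(f"R = {r:.2f}, estimated SpO2 = {spo2_textbook_estimate(r):.1f}%")
```

Because the mapping from R to saturation is fitted rather than derived, any absorber that is unevenly represented in the calibration population can shift readings for the people left out.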
Testing and regulation
The first pulse oximeter was approved by the FDA at a time when clinical trials did not require gender and racial diversity. Thus, the foundational, 1980s-era pulse oximeter technology upon which subsequent 510(k) clearance for pulse oximeters has been granted is one that was tested almost exclusively on white, male patient populations.
Under 510(k) clearance, only 10 people are required in a study of any new pulse oximeter’s efficacy. The FDA states that pulse oximetry study populations should have a range of skin pigmentations and must include at least two darkly pigmented individuals or 15% of the participant pool, whichever is larger. But the FDA does not provide an objective standard for “darkly pigmented”. Moreover, this requirement (i) does not have the statistical power necessary to detect differences between demographic groups, and (ii) does not represent the composition of the U.S. population. Finally, FDA guidance is silent on how pulse oximetry technology should be calibrated; it does not, for instance, specifically recommend studies on melanin interference.
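The statistical-power concern can be illustrated with a rough calculation. The sketch below uses a normal approximation to the power of a two-sided, two-sample test comparing mean measurement error between two groups. Everything numerical here is an assumption made for illustration: the effect size of 0.5 standard deviations, the 5% significance level, and the treatment of each participant as a single independent observation. The function name is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def two_group_power(effect_size: float, n_a: int, n_b: int, alpha: float = 0.05) -> float:
    """Approximate power of a two-sided, two-sample z-test.

    effect_size is the true difference in group means divided by the common
    standard deviation. Each participant is treated as one independent
    observation, which is a simplifying assumption.
    """
    if n_a < 1 or n_b < 1:
        raise ValueError("each group needs at least one participant")
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Expected value of the test statistic under the assumed true difference.
    shift = effect_size / sqrt(1 / n_a + 1 / n_b)
    # Probability of landing in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)


if __name__ == "__main__":
    # Hypothetical scenario: 10 participants, 2 of them darkly pigmented.
    print(f"n = 2 vs 8:     power = {two_group_power(0.5, 2, 8):.2f}")
    print(f"n = 100 vs 100: power = {two_group_power(0.5, 100, 100):.2f}")
```

Under these assumptions, a 2-versus-8 split has power of roughly 10%, far below the conventional 80% target, and the normal approximation is if anything optimistic for samples this small.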
Market deployment and evaluation
To clinical practitioners, pulse oximeters are a metaphorical “black box”, with oxygenation calculations hidden by proprietary algorithms. When errors or biases occur in oximeter data (if they are even noticed), the practitioner may blame the patient for their lifestyle rather than the technology used for assessment. This in turn leads to worse clinical outcomes for patients with darker skin tones, as they are at greater risk of becoming sicker before receiving care. The problem is exacerbated by the fact that clinicians who use oximeter technology for the first time (as was the case during COVID-19) generally are not trained to spot factors that cause inaccurate measurements. This leads to underreporting of adverse events to the FDA, which is already a problem due to the voluntary nature of adverse-event reporting. When problems are ultimately identified during market deployment and evaluation of a given technology, government can be slow to respond. The pulse oximeter’s limitations in monitoring oxygenation levels across diverse skin tones were identified as early as the 1990s. Thirty-one years later, despite repeated follow-up studies indicating biases, no manufacturer has incorporated skin-tone-adjusted calibration algorithms into pulse oximeters. It required the large Sjoding study, and the media coverage it garnered, for the FDA to issue a safety communication. Even then, the safety communication has not been followed with any additional regulatory action.
Inequitable access to medical innovation represents a form of bias
Americans face wildly different levels of access to new medical innovations. As many new innovations have high cost points, these devices exist outside the price range of many smaller healthcare institutions and/or federally funded healthcare services, including Veterans Affairs, health centers, and the Indian Health Service. Emerging care-delivery strategies might not be covered by Medicare and Medicaid, meaning that patients under those systems cannot access the most cutting-edge treatments. Finally, the shift to digital health in response to COVID-19 has compromised access to healthcare in rural communities without reliable broadband access.
The Advanced Research Projects Agency for Health (ARPA-H) has committed to having all of its programs and projects consider equity in their design. To fulfill ARPA-H’s commitment, there is a need for action across the federal government to ensure that medical technologies are developed fairly, tested with rigor, deployed safely, and made affordable and accessible to everyone.
Plan of Action
The Biden-Harris Administration should launch “Healthcare Technology for All Americans” (HTAA), a government-wide initiative to address systemic inequities in U.S. healthcare wrought by biased medical technology. Through a comprehensive approach that addresses bias in all medical sectors, at all stages of the commercialization pipeline, and in all geographies, the initiative will strive to ensure unbiased, equitable care delivery across the entire medical-innovation ecosystem. HTAA should be a joint mandate of Health and Human Services (HHS) and the Office of Science and Technology Policy (OSTP) to work with federal agencies on priorities of health equity, and initiative leadership should sit at both HHS and OSTP.
This initiative will require involvement of multiple federal agencies, as summarized in the table below. Additional detail is provided in the subsequent sections describing how the federal government can mitigate bias in the development phase; testing, regulation, and approval phases; and market deployment and evaluation phases.
Three guiding principles should underlie the initiative:
- Equity should drive action. Actions should seek to improve the health of those who have been historically excluded from medical research and development. We should design standards that repair past exclusion and prevent future exclusion.
- Coordination and cooperation are necessary. The executive and legislative branches must collaborate to address the full scope of the problem of bias in medical technology, from federal processes to new regulations. Legislative leadership should task the Government Accountability Office (GAO) to engage in ongoing assessment of progress towards the goal of achieving equity in medical innovation.
- Transparent, evidence-based decision making is paramount. There is abundant peer-reviewed literature that examines bias in drugs, devices, and algorithms used in healthcare settings; this literature should form the basis of an equity-driven approach to medical innovation. Gaps in evidence should be addressed through targeted research funding. Moreover, as algorithms become ubiquitous in medicine, every effort should be made to ensure that these algorithms are trained on data representative of those experiencing a given healthcare condition.
|Agency|Recommended role|
|---|---|
|Advanced Research Projects Agency for Health (ARPA-H)|The nascent ARPA-H has already committed to tackling health equity in biomedical research, and to aligning each project it undertakes with that goal. As such, ARPA-H should lead the charge in developing processes for equity in medical technology, from idea conceptualization to large-scale rollout, and serve as a model for other federally funded healthcare programs.|
|Agency for Healthcare Research and Quality (AHRQ)|AHRQ, a component of HHS, should identify areas where technology bias is leading to disparate healthcare outcomes and report its findings to Congress, the White House, and agency leaders for immediate action.|
|Centers for Disease Control and Prevention (CDC)|CDC’s expertise in health-data collection should be mobilized to identify research and development gaps.|
|Centers for Medicare and Medicaid Services (CMS)|CMS oversees the coordination of coverage, coding, and payment processes with respect to new technologies and procedures. Thus, CMS should focus on ensuring all new technologies developed through federal funding, like those that will be built by ARPA-H and its industry partners, are covered by Medicare and Medicaid. In addition, CMS and its accrediting partners can require compliance with federal regulatory standards, which should be extended to assess medical technologies.|
|Department of Commerce (DOC)|Given its role in enforcing U.S. trade laws and regulations, DOC can do much to incentivize equity in medical device design and delivery. The National Institute of Standards and Technology (NIST) should play a key role in crafting standards for identifying and managing bias across key medical-technology sectors.|
|Department of Defense (DOD)|DOD has formalized relationships with FDA to expedite medical products useful to American military personnel. Because expanding diversity and inclusion in the armed forces is a DOD priority, these medical products should be assessed for bias that limits safety and efficacy.|
|Department of Education (ED)|ED should work with medical schools to develop and implement learning standards and curricula on bias in medical technology.|
|Federal Trade Commission (FTC)|FTC should protect America’s medical technology consumers by auditing high-risk medical innovations, such as decision-making algorithms.|
|Food and Drug Administration (FDA)|FDA should take a more active role in uncovering bias in medical innovation, given its role as a regulatory checkpoint for all new medical technologies. This should include more rigorous evaluation protocols as well as better tracking of emergent bias in medical technologies post-approval.|
|Government Accountability Office (GAO)|The GAO should prepare a comprehensive roadmap for addressing bias endemic to the cycle of medical technology development, testing, and deployment, with a focus on mitigating bias in “black box” algorithms used in medical technology.|
|Health Resources and Services Administration (HRSA)|HRSA should coordinate with federally qualified health centers on digital health technologies, taking advantage of the broadband expansion outlined in the bipartisan infrastructure bill.|
|National Institutes of Health (NIH)|NIH should fund research that addresses health-data gaps, investigates algorithmic and data bias, and assesses bias embedded in medical technical tools. Simultaneously, NIH should create standards for diversity in samples and/or datasets for preclinical research. Finally, NIH must strongly enforce the 1993 NIH Revitalization Act’s diversity provisions.|
|National Science Foundation (NSF)|NSF should collaborate with NIH on cross-agency programs that fund R&D specific to mitigating bias of technologies like AI.|
|Office of the National Coordinator for Health Information Technology (ONC)|ONC publishes standards for effective use of healthcare information technology that ensure quality care delivery. Its standards-setting should also extend to ensuring equitable care delivery of novel AI/ML algorithms being used in healthcare IT.|
|Office of Management and Budget (OMB)|OMB should work with HTAA leadership to design a budget for HTAA implementation, including for R&D funding, personnel for programmatic expansion, data collectives, education, and regulatory enforcement.|
|Office of Science and Technology Policy (OSTP)|OSTP should develop processes and standards for ensuring that individual rights are not violated by biased medical technologies. This work can build on the AI Bill of Rights Initiative.|
|Veterans Affairs (VA)|The VA should work with ARPA-H and its industry partners to establish cost-effective rollout of new innovations to VA-run hospitals.|
Addressing bias at the development phase
The following actions should be taken to address bias in medical technology at the development phase:
- Enforce parity in government-funded research. For clinical research, NIH should examine the widespread lack of adherence to regulations requiring that government-funded clinical trials report breakdowns of trial participants by sex and by race or ethnicity. Funding should be reevaluated for non-compliant trials. For preclinical research, NIH should require gender parity in animal models and representation of diverse cell lines used in federally funded studies.
- Deploy funding to address research gaps. Where data sources for historically marginalized people are lacking, such as for women’s cardiovascular health, NIH should deploy strategic, targeted funding programs to fill these knowledge gaps. This should include resources for underrepresented groups to participate in research and clinical trials. Results should be added to a publicly available database so they can be accessed by designers of new technologies. Funding programs should also be created to fill gaps in technology, such as in diagnostics and treatments for high-prevalence and high-burden uterine diseases like endometriosis (found in 10% of reproductive-aged people with uteruses).
- Invest in research into healthcare algorithms and databases. Given the explosion of algorithms in healthcare decision-making, NIH and NSF should launch a new research program focused on the study, evaluation, and application of algorithms in healthcare delivery, and on how artificial intelligence and machine learning (AI/ML) can exacerbate healthcare inequities. The initial request for proposals should focus on design strategies for medical algorithms that mitigate bias from data or model choices.
- Task ARPA-H with developing metrics for equitable medical technology development. ARPA-H should prioritize developing a set of procedures and metrics for equitable development of medical technology. Once developed, these processes should be rapidly deployed across ARPA-H, as well as published for potential adoption by additional federal agencies, industry, and other stakeholders. ARPA-H could also collaborate with NIST and ONC on relevant standards setting. For instance, NIST is developing a standard for managing bias in AI, and the ONC engages in setting standards that achieve equity by design. CMS could use resultant standards for Medicare and Medicaid reimbursements.
Addressing bias at the testing, regulation, and approval phases
The following actions should be taken to address bias in medical innovation at the testing, regulation, and approval phases:
- Raise the threshold for FDA evaluation of devices and algorithms. Equivalency necessary to receive 510(k) clearance should be narrowed. For algorithms, this would involve consideration of whether the datasets or machine learning tactics used by the new device and its predicate are similar. For devices (including those that use algorithms), this would require tightening the definition of “same intended use” (currently defined as a technology having the same functionality as one previously approved by the FDA) as well as eliminating the approval of new devices with “different technological characteristics” (the application of one technology to a new area of treatment in which that technology is untested).
- Evaluate FDA’s guidance on specific technology groups for equity. Requirements for the safety of a given drug, medical device, or algorithm should have the statistical power necessary to detect differences between demographic groups and represent all end-users of the technology.
- Make FDA guidance on diversity in testing binding. Currently, FDA guidance makes ensuring diversity of clinical trials and datasets a voluntary step in industry-funded development of a new medical technology. FDA should make this guidance mandatory for all medical technologies it regulates.
- Establish a data bank for auditing medical algorithms. The newly established Office of Digital Transformation within the FDA should create a “data bank” of healthcare images and datasets representative of the U.S. population. Medical technology developers could use the data bank to assess the performance of medical algorithms across patient populations. Regulators could use the data bank to ground claims made by those submitting a technology for FDA approval.
- Allow data submitted to the FDA to be examined by the broader scientific community. Currently, data submitted to the FDA as part of its regulatory-approval process is kept as a trade secret and not released pre-authorization to researchers. Releasing the data via an FDA-invited “peer review” step in the regulation of high-risk technologies, like automated decision-making algorithms, Class III medical devices, and drugs, will ensure that additional, external rigor is applied to the technologies that could cause the most harm due to potential biases.
- Establish an AI Bill of Rights. The federal government and Congress should create protections for necessary uses of artificial intelligence (AI) identified by OSTP. Federally funded healthcare centers, like facilities that are part of the Veterans Health Administration, could refuse to buy software or technology products that violate this “AI Bill of Rights” through changes to the Federal Acquisition Regulation (FAR).
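The data-bank audit described in the list above amounts to comparing an algorithm's performance across patient populations. A minimal sketch of one such check follows, assuming each audit record holds a group label, whether the patient truly has the condition, and whether the algorithm flagged it. The record layout, group labels, and function names are hypothetical and do not represent an FDA procedure; sensitivity is used here as just one of several metrics an audit might compare.

```python
from collections import defaultdict


def sensitivity_by_group(records):
    """Return, per group, the share of true cases the algorithm flagged.

    records is an iterable of (group, has_condition, flagged) tuples.
    Groups with no true cases are omitted because sensitivity is undefined.
    """
    true_cases = defaultdict(int)
    detected = defaultdict(int)
    for group, has_condition, flagged in records:
        if has_condition:
            true_cases[group] += 1
            if flagged:
                detected[group] += 1
    return {group: detected[group] / true_cases[group] for group in true_cases}


def largest_gap(rates):
    """Difference between the best and worst group-level rates."""
    if not rates:
        raise ValueError("no groups with true cases to compare")
    return max(rates.values()) - min(rates.values())


if __name__ == "__main__":
    # Made-up audit records: (group, has_condition, flagged).
    audit = [
        ("group_a", True, True), ("group_a", True, True),
        ("group_a", True, True), ("group_a", True, True),
        ("group_b", True, True), ("group_b", True, False),
        ("group_b", True, True), ("group_b", True, False),
        ("group_b", False, False),
    ]
    rates = sensitivity_by_group(audit)
    print(rates, "gap:", largest_gap(rates))
```

A regulator or developer could set a tolerance for the gap and require further study whenever an algorithm exceeds it on representative data.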
Addressing bias at the market deployment and evaluation phases
The following actions should be taken to address bias in medical innovation at the market deployment and evaluation phases:
- Strengthen reporting mechanisms at the FDA. Healthcare providers, who are often closest to the deployment of medical technologies, should be made mandatory reporters to the FDA of all witnessed adverse events related to bias in medical technology. In addition, the FDA should require the inclusion of unique device identifiers (UDIs) in adverse-response reporting. Using this data, Congress should create a national and publicly accessible registry that uses UDIs to track post-market medical outcomes and safety.
- Train physicians to identify bias in medical technologies and identify new areas of specialization. ED should work with medical schools to develop curricula training physicians to identify potential sources of bias in medical technologies and ensuring that physicians understand how to report adverse events to the FDA. In addition, ED should consider creating new medical specialties that work at the intersection of technology and care delivery.
- Require impact assessments of deployed technologies. Congress must establish systems of accountability for medical technologies, like algorithms, that can evolve over time. Such work could be done by passing the 2022 Algorithmic Accountability Act (an update of the 2019 Algorithmic Accountability Act), which would require companies that create “high-risk automated decision systems” to conduct impact assessments reviewed by the FTC as frequently as necessary (yearly at minimum). Impact assessments should be expanded to medical devices — specifically, to include requirements for companies to assess clinical outcomes in diverse patient populations one year post-implementation in the market, using UDIs as a tracking mechanism.
- Assess disparities in patient outcomes to direct technical auditing. AHRQ should be given the funding needed to fully investigate patient-outcome disparities that could be caused by biases in medical technology, such as further investigation into algorithms identified in the agency’s March 2021 Request for Information. The results of this research should be used to identify technologies that the FDA should audit post-market for efficacy. CMS and its accrediting agencies can monitor these technologies and assess whether they should receive Medicare funding.
- Ensure that technologies developed by ARPA-H have an enforceable access plan. ARPA-H will produce cutting-edge technologies that must be made accessible to all Americans. ARPA-H should collaborate with the Center for Medicare and Medicaid Innovation to develop strategies for equitable delivery of these new technologies. A cost-effective deployment strategy must be identified to service Veterans Affairs hospitals, federally funded health centers, and facilities that are part of the Indian Health Service, among others.
- Create a fund to support digital health technology infrastructure in rural hospitals. To capitalize on the $65 billion expansion of broadband access allocated in the bipartisan infrastructure bill, the HRSA should deploy strategic funding to federally qualified health centers and rural health clinics to support digital health strategies — such as telehealth and mobile health monitoring — and patient education for technology adoption.
A comprehensive road map is needed
In January 2021, Senators Elizabeth Warren, Cory Booker, and Ron Wyden called for an FDA review of pulse oximetry measurements and their skin tone bias, citing the lack of understanding about clinical outcomes of this bias in their call to action. The GAO should go a step beyond this call to action and conduct a comprehensive investigation of “black box” medical technologies utilizing algorithms that are not transparent to end users, medical providers, and patients. The investigation should inform a national strategic plan for equity and inclusion in medical innovation that relies heavily on algorithmic decision-making. The plan should include identification of noteworthy medical algorithms exacerbating inequities, creation of enforceable regulatory standards, development of new sources of research funding to address knowledge gaps, development of enforcement mechanisms for bias reporting, and ongoing assessment of equity goals.
Timeline for action
Realizing HTAA will require mobilization of federal funding, introduction of regulation and legislation, and coordination of stakeholders from federal agencies, industry, healthcare providers, and researchers around a common goal of mitigating bias in medical technology. Such an initiative will be a multi-year undertaking and require funding to enact R&D expenditures, expand data capacity, assess enforcement impacts, create educational materials, and deploy personnel to staff all the above.
Near-term steps that can be taken to launch HTAA include issuing a public request for information, gathering stakeholders, engaging the public and relevant communities in conversation, and preparing a report outlining the roadmap to accomplishing the policies outlined in this memo.
Medical innovation is central to the delivery of high-quality healthcare in the United States. Ensuring equitable healthcare for all Americans requires ensuring that medical innovation is equitable across all sectors, phases, and geographies. Through a bold and comprehensive initiative, the Biden-Harris Administration can ensure that our nation continues leading the world in medical innovation while crafting a future where healthcare delivery works for all.