Dessislava Fessenko argues that while artificial intelligence might revolutionize drug development, its use comes with risks and potential ethical implications. Iterative ethics oversight from the outset is needed to address these looming concerns.
__________________________________________
ChatGPT has been making headlines for months now because of its ability to interact with users in a conversational way and to churn out relatively coherent answers or entire essays from scant initial prompts. This chatbot is only one member of a larger family of so-called “large language models” (LLMs). These are artificial intelligence models that generate text output (e.g. predictions, recommendations, content) by drawing on the statistical patterns and correlations they detect in the vast amounts of data they are trained on (e.g. text, images, code).
LLMs such as GPT and BERT are increasingly deployed in the medical domain, and in drug discovery in particular. Research outfits and biomedical companies rely on LLMs for target identification, clinical trial design, and pharmacovigilance in drug development. AI-assisted drug discovery holds the promise of a speed and accuracy that conventional biomedical research cannot achieve.

Photo Credit: julientromeur/pixabay. Image Description: A computer-generated illustration of an individual and a syringe.
The use of LLMs in drug discovery comes, however, with certain risks. In a recent paper, ethicists and researchers at DeepMind, a leading AI development outfit, identify six major categories of risk arising from the operation of LLMs. Four of those categories are particularly pertinent to drug discovery. The first relates to risks of discrimination or exclusion resulting from biased input data. The second involves risks to individual privacy due to leaks of personal data or LLMs’ ability to infer personal (including sensitive health) attributes from the input data. The third includes risks of inaccuracy that arise from LLMs’ propensity to credit false, misleading or inaccurate information as “truthful” merely because it appears more probable according to the statistical patterns and correlations detected in the input data. The fourth involves the safety and security risks that arise when malicious actors “hijack” an LLM and use it to harmful ends (e.g. to develop biological weapons rather than medicines).
All these risks could have tangible ethical implications in drug discovery. Discrimination and exclusion undermine the fairness of research. For example, if a data set does not include health data on all groups of a drug’s potential patients, this compromises the representativeness of the data set and, thus, the fairness of the overall research process and its outcomes. Privacy and security incidents may undermine individual dignity and autonomy. For example, inferences or leaks of personal data may expose individuals whose data has been used in the research in ways that they neither expected nor consented to. An LLM might produce inaccurate projections of a developed drug’s effects, or falsely assign greater or lesser probabilistic value to certain therapeutic effects. Such inaccuracies might unjustly expose some groups to risk when the drug undergoes clinical trials in humans. Finally, malicious use of an LLM may cause harm by manipulating therapeutic outcomes or by concealing actual adverse side effects from investigators.
These concerns open up a significant ethical gap between what the use of LLMs might entail and what fair, equitable and socially responsible biomedical research demands. Such a gap cannot be closed without ethical deliberation on, and evaluation of, the deployment of LLMs in biomedical research. Bridging the gap between what AI technology can offer and what science must live up to therefore necessitates ethics review in AI-assisted drug development.
Ethics review would need to evaluate the methodological and ethical soundness of a candidate LLM and the data sets to be employed. This evaluation is effectively a prerequisite for choosing a trustworthy LLM and, as a result, a reliable study design for a project. Ethics oversight should not, however, stop there. It needs to be iterative, given the evolving nature of AI-driven research and the unanticipated outputs it may generate. Iterative ethical oversight would help control for new ethical implications arising from the LLM’s actual performance on key metrics, such as accuracy, reliability, safety, and privacy. In particular, as results from AI-assisted drug discovery come in, the research team should screen for these factors at reasonable intervals. A final ethics sanity check after the research is complete may also be needed if the end results of an AI-assisted drug development project cast doubt on its scientific or ethical soundness. For example, the LLM-modeled therapeutic effects of a drug might not accurately predict its efficacy, safety or toxicity in clinical trials.
Recognizing the ethical gap in AI-assisted drug discovery from the outset, and closing it, is a precursor to ensuring the field’s scientific validity and ethical integrity. Leaving the gap unaddressed would likely backfire later on, when the developed drug is tested in clinical trials – at the expense of fairness, a favourable risk-benefit balance or the protection of research subjects’ interests. Hence, ethics review has a prominent role to play in closing the gap in AI-powered drug development.
__________________________________________
Dessislava Fessenko is a Master of Bioethics candidate in the Center for Bioethics at Harvard Medical School. She is also an antitrust and technology lawyer and a policy researcher working on AI policy and governance. @DessiFessenko


