Recently there has been a flurry of announcements from AI-led biotechs around the potential of Large Language Models (LLM) in early drug discovery. In the second of a three-part series, Dr Raminderpal Singh presents an example of usage of ChatGPT, which demonstrates how accessible LLMs have become for lab scientists.
In our previous article, we summarised the role and challenges of LLMs for early drug discovery. In this article we provide a simple case example to download and practise with ChatGPT, or other accessible LLM systems. This shows the power that LLMs offer to improve scientists’ daily tasks, despite their caveats and challenges. You can download all the source files to use yourself1 – (see Simple ChatGPT Exercise). Thank you to Nina Truter2 for her support in building this example. The example should work with any LLM but has been tested with ChatGPT.3
About the example
- Goal of the example: Using extracted measurements from 10 papers on acarbose treated mice to improve the recommendations made from the results of the primary study.
- Key outputs required from the example: Recommendation on dose, participants and measurements based on results from the primary study4,5 and papers on acarbose treated mice, with supporting data points.
- Challenges faced in implementing the example: Creating prompts to accurately extract information to support recommendations, accurately describing the content of multiple files and papers.
Importantly, you should be aware that commonly accessible LLM systems often share inputs you provide, so it is recommended not to enter confidential information.
To help ChatGPT provide useful insights, there needs to be some ‘prompt engineering’. This is a technical term for best-practices in the way prompts are written. As an example, the first prompt in this example is only to provide background and context to ChatGPT:
“You are a drug discovery scientist looking to make decisions on dose, participants and measurements when taking an existing diabetes drug into the ageing-related diseases field. You have experimental results from a mouse study that show the effects of acarbose on lifespan, body weight, body composition, fat pads, glucose, grip strength, grip duration, rotarod and pathology. You also have several relevant scientific publications with studies investigating the effects of acarbose on different measurements in mice. You now want to interrogate your study results (which are in Excel files and images) and the publications separately for insights, and then together to get the best set of recommendations for your colleagues who are looking to perform early clinical trials with acarbose on ageing-related diseases. To do this, you will now process a series of specific user-entered ChatGPT prompts.”
The screenshot below shows the results from the last prompt. There are some nuances ChatGPT has not picked up on. For example, in female mice, the lifespan is not extended as much compared to male mice, but their physical measurements are improved. Improved prompts will aid the generation of more nuanced results.
Figure from Dr Raminderpal Singh, illustrating prompt results.
Please comment below to share your findings from the example. Tell us if you managed to improve the output and, if so, how?
The next article in this series, published Monday 24 July, will discuss key challenges in the effective use of LLMs for early drug discovery, and present some practical approaches to address them.
References
1 Reading. HitchhikersAI.org. Available at: https://www.hitchhikersai.org/reading
2 Nina Truter. LinkedIn. Available at: https://www.linkedin.com/in/nina-truter/
3 ChatGPT. Available at: https://chatgpt.com/
4 Alavez S, et al. Acarbose improves health and lifespan in aging HET3 mice. Aging Cell. 18(2) (2019 April). Available at: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6413665/
5 Harrison DE, et al. ITP: Interventions Testing Program: Effects of various treatments on lifespan and related phenotypes in genetically heterogenous mice (UM-HET3) (2004-2023). Mouse Phenome Database. Available at: https://phenome.jax.org/projects/ITP1
About the author
Dr Raminderpal Singh
Dr Raminderpal Singh is a recognised key opinion leader in the techbio industry. He has over 30 years of global experience leading and advising teams on building computational modelling systems that are both cost-efficient and have significant IP value. His passion is to help early to mid-stage life sciences companies achieve novel biological breakthroughs through the effective use of computational modelling.
Raminderpal is currently leading the HitchhikersAI.org open-source community, accelerating the adoption of AI technologies in early drug discovery. He is also CEO and co-Founder of Incubate Bio – a techbio providing a service to life sciences companies who are looking to accelerate their research and lower their wet lab costs through in silico modelling.
Raminderpal has extensive experience building businesses in both Europe and the US. As a business executive at IBM Research in New York, Dr Singh led the go-to-market for IBM Watson Genomics Analytics. He was also Vice President and Head of the Microbiome Division at Eagle Genomics Ltd, in Cambridge. Raminderpal earned his PhD in semiconductor modelling in 1997. He has published several papers and two books and has twelve issued patents. In 2003, he was selected by EE Times as one of the top 13 most influential people in the semiconductor industry.
For more: http://raminderpalsingh.com ; http://hitchhikersAI.org ; http://incubate.bio