How to Make Drug Developability Predictions With Machine Learning


In this blog post, we'll look at how ValGenesis Consulting collaborated with a customer to implement a comprehensive developability scoring system using machine learning.

This project was truly groundbreaking. It not only saved time and resources but also sped up the release of new protein drug candidates.

What Was the Objective? 

Our client aimed to predict, at an early stage, how likely novel protein drug candidates are to succeed: a developability assessment.

By achieving this, they could streamline drug development, reduce development costs, and expedite patient access to life-saving medications. All in all, the goal was to improve the drug discovery process.

The ValGenesis Consulting team worked closely with the client, engaging in regular technical meetings.

This joint effort led to the development of a tailor-made application.  

Applying a Knowledge-Driven Approach 

The first steps of the process consisted of: 

  1. Identifying critical variables
  2. Attributing them to developability attributes
  3. Defining the developability target profile
  4. Preprocessing the data

By calculating the developability score from normalized variable ranges and attribute weights, ValGenesis consultants developed a knowledge-driven approach to scoring.
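To make the idea concrete, here is a minimal sketch of how such a score could be computed. It assumes the score is a weighted average of variables linearly normalized to a 0-1 range. The variable names, ranges, and weights are hypothetical and are not the client's actual target profile.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Variable:
    """One critical variable and its target range (all values hypothetical)."""
    name: str
    worst: float   # value that maps to a normalized score of 0
    best: float    # value that maps to a normalized score of 1
    weight: float  # relative importance of the attribute it belongs to


def normalize(value: float, worst: float, best: float) -> float:
    """Linearly map a value onto [0, 1], clipping anything outside the range."""
    fraction = (value - worst) / (best - worst)
    return max(0.0, min(1.0, fraction))


def developability_score(measurements: Dict[str, float],
                         profile: List[Variable]) -> float:
    """Weighted average of the normalized variables (assumed scoring rule)."""
    total_weight = sum(v.weight for v in profile)
    weighted_sum = sum(
        v.weight * normalize(measurements[v.name], v.worst, v.best)
        for v in profile
    )
    return weighted_sum / total_weight


# Hypothetical target profile: lower aggregation is better, higher is better
# for the other two variables.
PROFILE = [
    Variable("aggregation_pct", worst=10.0, best=0.0, weight=2.0),
    Variable("melting_temp_c", worst=50.0, best=80.0, weight=1.0),
    Variable("titer_g_per_l", worst=0.0, best=5.0, weight=1.0),
]

candidate = {"aggregation_pct": 2.0, "melting_temp_c": 65.0, "titer_g_per_l": 5.0}
score = developability_score(candidate, PROFILE)
```

Because every candidate is scored against the same ranges and weights, the resulting numbers can be compared directly when ranking candidates.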

Adding the Data-Driven Approach 

ValGenesis consultants harnessed analytical and in silico data, employed dimension reduction and data imputation techniques, and used various statistical and machine learning models to predict the developability score. 

This approach allowed for effective predictions based on amino acid sequences and homology models. 
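The post does not name the specific imputation or dimension reduction techniques used, so the sketch below swaps in two common stand-ins: column-mean imputation and a single principal component estimated by power iteration. The data values are made up for illustration.

```python
import math
from typing import List, Optional


def impute_column_means(rows: List[List[Optional[float]]]) -> List[List[float]]:
    """Replace each missing value (None) with the mean of its column."""
    n_cols = len(rows[0])
    means = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if row[j] is not None]
        means.append(sum(observed) / len(observed))
    return [[means[j] if row[j] is None else row[j] for j in range(n_cols)]
            for row in rows]


def center(rows: List[List[float]]) -> List[List[float]]:
    """Subtract the column mean from every value."""
    n_rows, n_cols = len(rows), len(rows[0])
    means = [sum(row[j] for row in rows) / n_rows for j in range(n_cols)]
    return [[row[j] - means[j] for j in range(n_cols)] for row in rows]


def first_principal_component(centered: List[List[float]],
                              iterations: int = 200) -> List[float]:
    """Estimate the leading principal axis by power iteration."""
    n_cols = len(centered[0])
    # Scatter matrix (covariance up to a constant factor).
    scatter = [[sum(row[a] * row[b] for row in centered) for b in range(n_cols)]
               for a in range(n_cols)]
    axis = [1.0] * n_cols
    for _ in range(iterations):
        product = [sum(scatter[a][b] * axis[b] for b in range(n_cols))
                   for a in range(n_cols)]
        norm = math.sqrt(sum(x * x for x in product))
        axis = [x / norm for x in product]
    return axis


# Hypothetical analytical data: two correlated measurements, one value missing.
raw = [[1.0, 2.0], [2.0, 4.0], [3.0, None], [4.0, 8.0]]
imputed = impute_column_means(raw)
centered = center(imputed)
axis = first_principal_component(centered)

# One reduced feature per candidate: the projection onto the leading axis.
reduced = [sum(v * w for v, w in zip(row, axis)) for row in centered]
```

The reduced feature could then be passed to any regression model that predicts the developability score. In practice, variables on different scales would usually be standardized before this step.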

Creating the In Silico Developability Prediction Model 

Using in silico homology-based models, ValGenesis consultants considered different structural conformations of proteins. This made it possible to evaluate properties such as hydrophobic patch areas, net charge, and radius of gyration.
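As an illustration of one of these properties, the sketch below computes an unweighted radius of gyration (the root-mean-square distance of the atoms from their centroid) and averages it over several conformations. Treating all atoms as having equal mass and averaging across conformations are simplifying assumptions, and the coordinates are toy values rather than a real homology model.

```python
import math
from typing import List, Tuple

Coord = Tuple[float, float, float]


def radius_of_gyration(coords: List[Coord]) -> float:
    """Root-mean-square distance of the points from their centroid.

    Assumes equal mass for every atom, which is a simplification.
    """
    n = len(coords)
    cx = sum(c[0] for c in coords) / n
    cy = sum(c[1] for c in coords) / n
    cz = sum(c[2] for c in coords) / n
    squared = sum((x - cx) ** 2 + (y - cy) ** 2 + (z - cz) ** 2
                  for x, y, z in coords)
    return math.sqrt(squared / n)


def mean_radius_of_gyration(conformations: List[List[Coord]]) -> float:
    """Average the radius of gyration over several conformations."""
    values = [radius_of_gyration(coords) for coords in conformations]
    return sum(values) / len(values)


# Two toy "conformations": a compact square and the same square expanded 2x.
compact = [(1.0, 1.0, 0.0), (1.0, -1.0, 0.0), (-1.0, 1.0, 0.0), (-1.0, -1.0, 0.0)]
expanded = [(2.0 * x, 2.0 * y, 2.0 * z) for x, y, z in compact]

rg_compact = radius_of_gyration(compact)
rg_mean = mean_radius_of_gyration([compact, expanded])
```

The expanded conformation gives a radius of gyration twice that of the compact one, which shows how the descriptor tracks how extended a structure is.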

Additionally, they explored statistical models based on sequence-based descriptors, which led to better results. A set of machine learning models was tested and tuned to achieve the best possible performance.
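The post does not list the descriptors that were used. As a rough sketch of what sequence-based descriptors can look like, the example below derives a few simple ones from an amino acid sequence: length, a crude net charge, the hydrophobic fraction, and the mean Kyte-Doolittle hydropathy (GRAVY). The choice of descriptors, the hydrophobic residue set, and the example sequence are assumptions made for illustration.

```python
from typing import Dict

# Kyte-Doolittle hydropathy scale.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Assumed grouping for illustration; other definitions exist.
HYDROPHOBIC = set("AILMFVW")


def sequence_descriptors(sequence: str) -> Dict[str, float]:
    """Compute a few simple descriptors from a one-letter amino acid sequence."""
    seq = sequence.upper()
    length = len(seq)
    # Crude net charge near neutral pH: K and R count +1, D and E count -1.
    # Histidine and the chain termini are ignored in this simplification.
    positive = sum(seq.count(residue) for residue in "KR")
    negative = sum(seq.count(residue) for residue in "DE")
    hydrophobic = sum(1 for residue in seq if residue in HYDROPHOBIC)
    gravy = sum(KYTE_DOOLITTLE[residue] for residue in seq) / length
    return {
        "length": float(length),
        "net_charge": float(positive - negative),
        "hydrophobic_fraction": hydrophobic / length,
        "gravy": gravy,
    }


# Hypothetical short sequence, not a real drug candidate.
descriptors = sequence_descriptors("MKKLRDA")
```

Descriptors like these need only the sequence, so they are available for every candidate before any structure is modeled or any experiment is run.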

Developing a Tailor-Made App for Our Client 

The final step was the development of an R Shiny app that encompassed the entire analysis process.

Our team tailored the app to the customer's needs. It reproduces the project's analysis and allows the user to create custom datasets, select predictors, build new models, and make new predictions.

Most exciting of all, the app can keep improving its predictive power: a model lifecycle management workflow evaluates model performance over time. This gives the user the autonomy to run future analyses and applications independently.
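The app itself is built in R Shiny and its internals are not described here, so the following is only a language-agnostic idea sketched in Python. It assumes performance is tracked as the root-mean-square error (RMSE) of each new batch of predictions and that a batch is flagged when its error exceeds a tolerance relative to a baseline. The metric, the tolerance, and the batch data are all hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Batch = Tuple[str, Sequence[float], Sequence[float]]


def rmse(predicted: Sequence[float], observed: Sequence[float]) -> float:
    """Root-mean-square error between predicted and observed scores."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def performance_report(batches: List[Batch],
                       baseline_rmse: float,
                       tolerance: float = 1.5) -> List[Tuple[str, float, bool]]:
    """Return (label, rmse, needs_review) for each batch.

    A batch is flagged when its error exceeds tolerance * baseline_rmse.
    """
    report = []
    for label, predicted, observed in batches:
        error = rmse(predicted, observed)
        report.append((label, error, error > tolerance * baseline_rmse))
    return report


# Hypothetical batches: (label, predicted scores, scores measured later).
batches = [
    ("batch_1", [0.8, 0.6], [0.7, 0.7]),
    ("batch_2", [0.9, 0.5], [0.6, 0.8]),
]
report = performance_report(batches, baseline_rmse=0.1)
```

In this toy data, the second batch is flagged because its error is about three times the baseline, which is the kind of signal that would prompt retraining with the newer data.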

A Case of Success 

ValGenesis' innovative approach pushed the boundaries of science. Through it, we enabled unbiased ranking of drug candidates and training of machine-learning models. 

Predicting development and manufacturing properties in early-stage drug development is now feasible, enabling faster and leaner workflows.

Looking to learn more about this project? Our colleague Daniel Pais will present it at the 2023 PDA BioManufacturing Conference. 

At ValGenesis, we are at the vanguard of implementing machine learning in pharmaceutical processes, with our team of highly experienced consulting experts.

If you have any questions, don't hesitate to reach out to us.

The opinions, information and conclusions contained within this blog should not be construed as conclusive fact, ValGenesis offering advice, nor as an indication of future results.