In the past decade or so, artificial intelligence has gone from the pages of science fiction novels to a very real power that has disrupted — or threatens to disrupt — nearly every process on earth. AI helps our cars, aircraft, and spacecraft navigate, offers you movie suggestions on Netflix, and facilitates dozens of other disruptions, both grand and mundane.
Why, then, has the pharmaceutical industry, an industry that literally holds life and death in its hands, shown almost no sign of disruption, despite ready access to computers and computational tools such as AI? Experts suggest that the pharmaceutical industry remains one of the most inefficient industries, a last holdout against technological disruption. As evidence, they point out that while other industries are becoming more productive and more efficient, the efficiency of the pharmaceutical industry has been declining since the 1950s. For example, it now costs over $2.6 billion to bring a drug, or New Molecular Entity (NME), to market. This cost, and even the costs of the many failed drug attempts, are eventually passed on to you and me: patients, customers, and taxpayers.
In this piece, which I hope will fall squarely between the unrealistic hype and the equally unrealistic skepticism about AI, I would like to discuss the challenging environment of traditional drug discovery, the current path of AI in drug discovery and, finally, the potential of new technologies and processes to revolutionize the field.
Place Your Bets: Traditional Drug Discovery
To understand the potential and the limitations of AI in small molecule drug discovery, it is important to understand how pharmaceutical companies have traditionally handled the drug discovery process.
As noted above, the pharmaceutical industry is one of the riskiest ventures on the planet. Small molecule drug discovery involves several steps: scientists form a disease hypothesis, identify a target, design a molecule, and then conduct preclinical studies. This stage takes five years on average and may cost hundreds of millions of dollars. Clinical development can take another five years and add hundreds of millions of dollars more. During clinical development, the intervention is tested in Phase I (safety), Phase II (efficacy), and Phase III (safety and efficacy at scale).
Drug discovery, then, is better described as a molecular casino. On this roulette wheel, there are over two thousand druggable targets and several thousand diseases, and every patient is in some way unique. The complexity of selecting the right target for a specific patient subpopulation produces ridiculous odds. That is why this roulette so rarely gives big payouts and the players must get used to failure.
Even though the pharmaceutical industry is a roulette wheel where the best and brightest people in the world are making the bets, they still lose 99% of the time. Every bet is a gamble with a payout eight or more years away. In the first four years it is still possible to change the bets; in the second four years, during clinical trials, the wheel is already spinning and it is only possible to cut losses or bet more on additional clinical programs. And usually the people who make the bets in the first four years are not the same people who decide to cut or to double down in the clinical phase.
AI Help, AI Hope, or AI Hype
With odds this bad and a data-intensive environment, you might think that artificial intelligence is a perfect fit for pharmaceutical companies looking to improve their chances of finding marketable drugs. However, despite technological advances that have led to major disruptions elsewhere, including mobile and personal computing, the Internet, and genome sequencing, the cost of developing a drug keeps rising.
In fact, the idea that AI could be used to even the odds has turned out to be a good news-bad news situation for the pharmaceutical industry. On one hand, it has brought more investment and more talent into the space. On the other, as the hype rises right alongside skyrocketing drug costs, it has led to considerable skepticism. Pharmaceutical veterans have seen promising technological breakthroughs fail to significantly improve R&D, so they prefer to incrementally develop internal capabilities across the entire spectrum of the drug discovery process instead of making big bets on specific enabling technologies.
This tension between AI hope and AI hype continues. In fact, ever since I started working in AI-powered drug discovery, not a day goes by without one or more articles or an analytical report discussing the hype and hope of AI-powered drug discovery. On one hand, the AI hypers predict revolution, while, on the other hand, the more skeptical drug discovery and development experts discount all of the recent advances as incremental and hype.
This is one of the reasons why most industry experts remain skeptical about the promise of deep learning, one type of artificial intelligence.
Using Deep Learning to Break Through the Hype
There are many reasons to temper the hype that is often injected into conversations about AI as the potential savior of the pharmaceutical industry, but I see hope in deep-learning-based models such as generative adversarial networks, or GANs, as we call them. You might also hear this technology referred to as “AI imagination”, “creative AI”, or “AI curiosity”.
While some of the ideas may be traced to the 1990s, the first paper on “Generative Adversarial Nets” was published in 2014 by Ian Goodfellow, now referred to as the “Father of GANs”, and his colleagues. The concept of generative adversarial networks is therefore relatively new. As its name suggests, a GAN can be thought of as a competition between two deep neural networks. One is a generator that creates novel content meeting a desired set of criteria; the other, called the discriminator, tests whether the generator’s output looks real or fake. Almost immediately, this technology powered some interesting results. In 2016, a few teams used GANs to create photorealistic images from natural language. For example, one could give a description such as “this small bird has a pink breast and crown, and black primaries and secondaries” and the GAN would generate, or “imagine”, a large number of images of birds with those properties.
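To make the generator-versus-discriminator competition concrete, here is a minimal toy sketch in plain Python. It is an illustration only, not the method from the 2014 paper's experiments and not anything used for images or molecules. The "real" data are just numbers drawn from a bell curve, both networks are shrunk to a single linear unit, and every name and number in it (`train_toy_gan`, the target mean of 4.0, the learning rate) is a hypothetical choice made for this example.

```python
import math
import random


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def train_toy_gan(steps=2000, lr=0.01, seed=0):
    """Train a one-dimensional toy GAN and return (a, b, w, c).

    Hypothetical setup: "real" data are numbers drawn from N(4.0, 0.5).
    Generator:     G(z) = a * z + b, with noise z ~ N(0, 1)
    Discriminator: D(x) = sigmoid(w * x + c), its estimate that x is real
    """
    rng = random.Random(seed)
    a, b = 1.0, 0.0   # generator parameters
    w, c = 0.1, 0.0   # discriminator parameters

    for _ in range(steps):
        x_real = rng.gauss(4.0, 0.5)
        z = rng.gauss(0.0, 1.0)
        x_fake = a * z + b

        # Discriminator step: gradient ascent on log D(real) + log(1 - D(fake)).
        d_real = sigmoid(w * x_real + c)
        d_fake = sigmoid(w * x_fake + c)
        w += lr * ((1.0 - d_real) * x_real - d_fake * x_fake)
        c += lr * ((1.0 - d_real) - d_fake)

        # Generator step: gradient ascent on log D(fake), i.e. try to fool
        # the freshly updated discriminator.
        d_fake = sigmoid(w * x_fake + c)
        grad_x = (1.0 - d_fake) * w   # d log D(x) / dx at x = x_fake
        a += lr * grad_x * z
        b += lr * grad_x

    return a, b, w, c


if __name__ == "__main__":
    a, b, w, c = train_toy_gan()
    print(f"generator: G(z) = {a:.3f} * z + {b:.3f}")
    print(f"discriminator: D(x) = sigmoid({w:.3f} * x + {c:.3f})")
```

The two update steps are the whole idea: the discriminator is nudged toward telling real samples from generated ones, and the generator is nudged toward outputs the discriminator scores as real. A linear discriminator can only compare very simple statistics, so this toy says nothing about how well real GANs train; practical systems replace both single units with deep networks.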
Around the same time, our team at Insilico began to investigate whether GANs could be used to find novel chemical structures or molecules that the pharmaceutical industry could use. It may sound like a slightly illogical step to go from producing bird pictures and DeepFakes to creating ultra-precise designs for new molecules, but we’ve experienced considerable success with some of the early peer-reviewed papers published in 2016. Since then, we have published a large number of generative approaches and also started combining them with deep reinforcement learning, a form of AI learning strategy. But despite dozens of papers and presentations at conferences, we still face skepticism from many computational chemists and medicinal chemists in the pharmaceutical industry. And this skepticism is not without merit. The only way to clearly demonstrate that generative approaches can significantly impact the pharmaceutical industry is to select a disease that affects millions of people, not just a rare disease, and use the AI approach to identify a novel biological target in that disease in a completely “driverless” fashion. The next step is to use AI to generate novel molecules for the AI-selected target, also in a “driverless” fashion, and then to validate those molecules in biological assays, in animal studies, and, hopefully, in humans.
A feat like that would be virtually impossible in academia because it is very costly and requires a very diverse set of expertise, including assay development and chemical synthesis. For the same reasons, it is quite difficult to do in a startup. My prediction is that we will get to this point this year or next: an absolutely novel target, an absolutely novel molecule, and experimental validation in a disease-relevant assay for one of the major diseases. Two to three years later, we will see these molecules in Phase II clinical studies. Only then will the skeptics be satisfied, and it will take a few years until that happens.
The Future of AI in the Pharmaceutical Industry
I am optimistic about the future of AI approaches to produce badly needed medicine to improve health and treat diseases. The combination and integration of methods like generative reinforcement learning — and the intriguing prospects of quantum computing — are all reasons to be excited about the future. But, let’s be perfectly transparent about the challenge we are facing. Biology is very complex, chemistry is complex, and clinical trials are complex. Being successful at all three at once is quite the task!
I also think that the key to success in pharma AI is the massive integration of systems that identify biological targets, systems that help design novel molecules, and systems that personalize treatments and predict clinical trial outcomes.
We need one big pharma brain which can span the discovery and development cycles that take ten years or even longer and can integrate clinical data back into target discovery.
And it may take years to master these tasks. AI-powered drug discovery scientists will need to be the mixed martial artists of drug discovery, combining many strategies and styles in order to develop systems that will significantly accelerate small molecule drug discovery.
The recent COVID-19 pandemic demonstrated the impotence of today’s traditional and AI-powered approaches. I estimate that in just four months about ten percent of all FDA-approved drugs (as well as bleach, UV light, and even just some powerful light) were proposed for repurposing as possible treatments for COVID-19, while the new discovery efforts have not yet resulted in promising preclinical candidates. A lot more work needs to be done in AI and laboratory automation to significantly accelerate drug discovery.