Beyond Text: Why LLMs Alone Won't Discover Your Next Cancer Drug
Key Takeaways
- LLMs can rapidly synthesize research and identify novel connections across scientific literature, but they cannot replace the hands-on experimental validation that drug discovery requires
- The critical gap in AI-for-pharma is the inability to close the experimental loop computationally; language models alone cannot generate the new data that wet labs and physics simulations produce
- Real-world drug discovery pipelines require 99%+ per-step reliability; LLM hallucinations that are merely embarrassing in chatbots can waste millions of dollars and half a year of development work in pharma
Summary
In a critical analysis of AI's role in pharmaceutical drug discovery, PAULING.AI founder Javier Tordable argues that while large language models excel at synthesizing research and identifying connections across scientific literature, they fundamentally cannot replace the experimental process required to develop new drugs. He highlights a gap in current AI-for-pharma narratives: LLMs can read thousands of papers per second and reason about biological mechanisms, but they cannot conduct the wet lab assays, physics-based simulations, and experimental validation that actually produce drug candidates.
Tordable emphasizes that the true challenge in applying AI to drug discovery is not building impressive demos but achieving the reliability that real-world pharmaceutical workflows demand. He notes that drug discovery pipelines chain dozens of sequential steps, so if each step operates at 90% reliability, end-to-end success drops below 10%, far too low for actual drug development programs (the short calculation below makes this concrete). PAULING.AI's approach bridges this gap by connecting LLM reasoning to computational chemistry tools (molecular docking, molecular dynamics, ADMET prediction) that generate new data, allowing the AI system to test hypotheses rather than merely hypothesize.
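The compounding arithmetic is easy to verify. Here is a minimal Python sketch; the specific step counts are illustrative assumptions, since the essay says only "dozens":

```python
# End-to-end success of a pipeline of n sequential steps, each
# succeeding independently with probability p, is simply p ** n.
def pipeline_reliability(p: float, n: int) -> float:
    return p ** n

for p in (0.90, 0.99):
    for n in (12, 24, 36):  # illustrative "dozens" of steps
        print(f"per-step {p:.0%}, {n} steps -> {pipeline_reliability(p, n):.1%}")

# per-step 90%, 24 steps -> 8.0%   (below 10%, as Tordable notes)
# per-step 99%, 24 steps -> 78.6%  (why he insists on 99%+ per step)
```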
The distinction Tordable draws is between AI that impresses boardrooms and AI that actually advances drug discovery programs. Achieving the latter requires driving step reliability to 99% or above, handling edge cases and ambiguous outputs autonomously, and eliminating the need for constant human oversight at each transition point. His argument reframes the AI-in-pharma conversation around the unglamorous but essential engineering work of building autonomous, reliable systems rather than around ever more sophisticated language models.
- PAULING.AI integrates LLM reasoning with computational chemistry tools to enable AI systems that test hypotheses and generate new data autonomously, rather than merely synthesizing existing knowledge (see the sketch after this list)
- Success in AI-for-pharma depends on unglamorous engineering work around system reliability and autonomous edge-case handling, not impressive demos or boardroom-ready pitches
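To make the hypothesize-and-test loop concrete, here is a minimal sketch of the pattern. Every name in it (the stubbed propose_candidates, dock, and predict_admet functions, the EGFR target, the acceptance cutoffs) is an illustrative assumption; in a real system the stubs would call an LLM and physics-based simulation packages, and none of this reflects PAULING.AI's actual implementation:

```python
import random

def propose_candidates(target: str, evidence: list[dict]) -> list[str]:
    """LLM step (stubbed): propose molecules given the target and prior data."""
    return ["CCO", "c1ccccc1", "CC(=O)O"]  # placeholder SMILES strings

def dock(smiles: str, target: str) -> float:
    """Docking step (stubbed): binding score in kcal/mol, lower is tighter."""
    return random.uniform(-12.0, -4.0)

def predict_admet(smiles: str) -> dict:
    """ADMET step (stubbed): illustrative toxicity risk in [0, 1]."""
    return {"tox_risk": random.random()}

def discovery_loop(target: str, rounds: int = 3) -> list[str]:
    """Alternate LLM-style reasoning with tool calls that generate new data."""
    evidence: list[dict] = []
    survivors: list[str] = []
    for _ in range(rounds):
        for smiles in propose_candidates(target, evidence):
            result = {
                "smiles": smiles,
                "affinity": dock(smiles, target),
                "admet": predict_admet(smiles),
            }
            evidence.append(result)  # new data feeds the next round of reasoning
            if result["affinity"] < -9.0 and result["admet"]["tox_risk"] < 0.2:
                survivors.append(smiles)  # illustrative acceptance cutoffs
    return survivors

if __name__ == "__main__":
    print(discovery_loop("EGFR"))
```

The design point is that each tool call produces data that did not exist before, so each round of reasoning starts from more evidence than the last.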
Editorial Opinion
Tordable's essay cuts through the hype around generative AI in drug discovery with refreshing clarity and technical rigor. His insistence that the real challenge lies in achieving 99%+ reliability across multi-step workflows, rather than in building chatbots that can read papers, reframes how we should evaluate AI applications in high-stakes domains. The distinction between computational loops that can close entirely in software and pharmaceutical workflows that require physical experimentation is crucial and underexplored in most AI vendor narratives. This argument should shift investor and industry focus from flashy LLM demonstrations toward the harder, less marketable engineering work of building trustworthy, autonomous systems for science.


