De novo design uses generative AI to create entirely new molecules tailored to your protein target. LiteFold’s de novo capabilities accelerate lead generation and optimization by exploring vast chemical space.

Overview

Generate molecules that:
  • Fit precisely into protein binding pockets
  • Optimize multiple properties simultaneously
  • Explore novel chemical scaffolds
  • Satisfy drug-likeness criteria
  • Are synthetically accessible

Design Approaches

Pocket-Based Design

Generate molecules to fill a specific binding pocket. Best for structure-based design.

Scaffold Hopping

Create chemically distinct scaffolds with similar activity. Expand chemical diversity.

Fragment Growing

Start with a fragment and grow it into a full ligand. Best for fragment-based drug design.

Lead Optimization

Optimize existing compounds for improved properties while maintaining activity.

Quick Start: Generate Molecules for a Pocket

1

Define Your Target

Upload a protein structure or use a LiteFold prediction. The binding pocket is identified automatically.
2

Configure Generation

Set parameters:
  • Number of molecules (10-1000)
  • Molecular weight range
  • Drug-likeness filters
  • Diversity settings
3

Select Generation Model

  • DiffSBDD: Diffusion model for structure-based design
  • TargetDiff: Optimized for druggable pockets
  • Pocket2Mol: Graph-based generation
4

Generate Molecules

Click “Generate”. Generation typically takes 10-30 minutes, depending on the number of molecules requested.
5

Review and Filter

  • Visual inspection of top molecules
  • Docking scores
  • ADMET predictions
  • Synthetic accessibility
6

Export Candidates

Export selected molecules as SMILES, SDF, or send directly to synthesis planning.
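
The settings from steps 2 and 3 can be collected in a small configuration object and checked before a run is started. The sketch below is illustrative only: `GenerationConfig`, its field names, and its default values are assumptions made for this example and are not a documented LiteFold API. Only the 10-1000 molecule range and the three model names come from the steps above.

```python
from dataclasses import dataclass

# Model names listed in step 3 of the quick start.
SUPPORTED_MODELS = ("DiffSBDD", "TargetDiff", "Pocket2Mol")


@dataclass
class GenerationConfig:
    """Hypothetical container for quick-start generation settings."""

    model: str = "DiffSBDD"
    num_molecules: int = 100
    mw_range: tuple = (250.0, 500.0)  # molecular weight window in Da (assumed default)
    apply_lipinski: bool = True       # drug-likeness filter toggle (assumed)
    diversity: float = 0.5            # 0 = focused, 1 = diverse (assumed scale)

    def validate(self) -> None:
        """Raise ValueError if a setting falls outside the allowed ranges."""
        if self.model not in SUPPORTED_MODELS:
            raise ValueError(f"unknown model: {self.model}")
        if not 10 <= self.num_molecules <= 1000:
            raise ValueError("num_molecules must be between 10 and 1000")
        low, high = self.mw_range
        if low >= high:
            raise ValueError("mw_range must be (low, high) with low < high")
        if not 0.0 <= self.diversity <= 1.0:
            raise ValueError("diversity must be between 0 and 1")


config = GenerationConfig(model="TargetDiff", num_molecules=500)
config.validate()  # passes silently for valid settings
```

Validating locally like this catches out-of-range values before a long generation job is submitted.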

Generative Models

Diffusion Models

How they work: Gradually denoise random molecular structures into valid, pocket-fitted molecules. Models:
  • DiffSBDD: General structure-based design
  • TargetDiff: Enhanced for drug-like molecules
  • DiffLinker: Links molecular fragments
Best for:
  • Exploring diverse chemical space
  • Novel scaffolds
  • Complex pocket geometries

Graph Neural Networks

How they work: Build molecules atom-by-atom or fragment-by-fragment using graph representations. Models:
  • Pocket2Mol: Pocket-conditioned generation
  • GraphGA: Genetic algorithm with GNN scoring
Best for:
  • Fragment-based design
  • Scaffold decoration
  • Specific chemistry constraints

Transformer Models

How they work: Generate SMILES strings using language model approaches. Models:
  • REINVENT: Reinforcement learning optimization
  • ChemFormer: Pre-trained transformer
Best for:
  • Multi-objective optimization
  • Fine-tuning on custom data
  • Large-scale generation

Design Workflows

Structure-Based De Novo Design

Start with a protein structure and generate ligands optimized for its binding pocket.
1

Pocket Analysis

LiteFold analyzes binding pocket:
  • Volume and shape
  • Hydrophobic/hydrophilic regions
  • Key interaction sites (H-bond donors/acceptors)
  • Subpocket identification
2

Generation Constraints

Specify:
  • Molecular weight: 250-500 Da
  • LogP: 0-5
  • Required interactions (e.g., H-bond to Asp855)
  • Forbidden substructures (e.g., PAINS)
3

Generate Library

Create 100-1000 molecules optimized for the pocket.
4

Scoring and Filtering

LiteFold automatically:
  • Docks all molecules
  • Predicts ADMET
  • Calculates synthetic accessibility
  • Ranks by multi-objective score
5

Top Candidates

Select 10-20 for synthesis based on:
  • Docking score
  • Predicted ADMET
  • Synthetic accessibility
  • Chemical novelty
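
The page does not say how the multi-objective score is computed. One common approach is a weighted sum of normalized terms, sketched below. The weights, the docking clipping range, and the function name `composite_score` are assumptions for illustration, not LiteFold's actual scoring function.

```python
def composite_score(docking, admet_pass_fraction, sa_score, novelty,
                    weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine four criteria into one 0-1 score (higher is better).

    docking: docking score where more negative is better (assumed convention);
             clipped to the range [-12, 0] before scaling.
    admet_pass_fraction: fraction of ADMET endpoints predicted favorable (0-1).
    sa_score: synthetic accessibility on the 1-10 scale (lower is easier).
    novelty: 0-1 value where higher means less similar to known compounds.
    """
    dock_term = min(max(-docking, 0.0), 12.0) / 12.0
    sa_term = (10.0 - sa_score) / 9.0
    terms = (dock_term, admet_pass_fraction, sa_term, novelty)
    return sum(w * t for w, t in zip(weights, terms))


# Made-up candidates for illustration only.
candidates = {
    "mol_A": dict(docking=-10.2, admet_pass_fraction=0.8, sa_score=3.1, novelty=0.6),
    "mol_B": dict(docking=-11.0, admet_pass_fraction=0.5, sa_score=7.5, novelty=0.9),
    "mol_C": dict(docking=-8.4, admet_pass_fraction=0.9, sa_score=2.4, novelty=0.3),
}
ranked = sorted(candidates,
                key=lambda name: composite_score(**candidates[name]),
                reverse=True)
```

With these weights, `mol_B` ranks last despite having the best docking score, because its poor synthetic accessibility and ADMET profile outweigh the affinity gain.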

Fragment-Based Design

Start with fragment hits, grow into lead-like molecules.
1

Fragment Placement

Place validated fragment in binding site (from crystallography or NMR).
2

Define Growing Vector

Specify atom(s) to grow from and direction into pocket.
3

Growing Strategy

Choose:
  • Greedy growing: Optimize affinity at each step
  • De novo linking: Connect fragments with linkers
  • Decoration: Add substituents to core scaffold
4

Generate Elaborations

LiteFold creates 50-500 elaborations of the fragment.
5

Validation

Dock and score elaborations. Select most promising for synthesis.

Scaffold Hopping

Find chemically distinct scaffolds with similar binding.
1

Reference Compound

Provide known active compound as reference.
2

Define Pharmacophore

Extract key features:
  • H-bond donors/acceptors
  • Hydrophobic centers
  • Aromatic rings
  • Charge centers
3

Generate Alternatives

LiteFold creates molecules matching pharmacophore but with different scaffolds.
4

Diversity Selection

Cluster by scaffold and select diverse representatives.
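
As a rough illustration of the diversity selection step, the sketch below groups molecules by a precomputed scaffold key and keeps the best-scoring member of each group. The scaffold labels are assumed to come from an upstream tool (for example, a Bemis-Murcko scaffold calculation). `select_diverse` and the sample data are hypothetical.

```python
from collections import defaultdict


def select_diverse(molecules, per_scaffold=1):
    """Group molecules by scaffold and keep the best-scoring ones in each group.

    molecules: iterable of (mol_id, scaffold_key, score); lower score is better.
    """
    groups = defaultdict(list)
    for mol_id, scaffold, score in molecules:
        groups[scaffold].append((score, mol_id))
    selected = []
    for scaffold in sorted(groups):  # sorted for a reproducible order
        best = sorted(groups[scaffold])[:per_scaffold]
        selected.extend(mol_id for _, mol_id in best)
    return selected


# Made-up hits: (id, scaffold label, docking score).
hits = [
    ("m1", "quinazoline", -9.8),
    ("m2", "quinazoline", -10.4),
    ("m3", "pyrimidine", -9.1),
    ("m4", "indole", -8.7),
    ("m5", "pyrimidine", -9.6),
]
representatives = select_diverse(hits)
```

Here each of the three scaffolds contributes exactly one representative, so no single chemotype crowds out the others.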

Lead Optimization

Optimize existing lead compound for better properties.
1

Define Optimization Goals

Select properties to optimize:
  • ↑ Binding affinity
  • ↑ Solubility
  • ↓ CYP inhibition
  • ↑ Brain penetration
  • ↓ Molecular weight
2

Multi-Objective Optimization

LiteFold uses reinforcement learning to optimize multiple objectives simultaneously.
3

Generate Analogs

Create 100-1000 analogs optimized for objectives.
4

Pareto Frontier

Visualize trade-offs between objectives and select optimal balance.
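
A Pareto frontier contains the analogs that no other analog beats on every objective at once. The minimal sketch below computes it for two objectives. The objective choice, values, and function names are illustrative assumptions. Objectives to be minimized (such as molecular weight) would need to be negated first.

```python
def dominates(a, b):
    """True if a is at least as good as b everywhere and strictly better somewhere.

    Both arguments are tuples of objective values where higher is better.
    """
    return (all(x >= y for x, y in zip(a, b))
            and any(x > y for x, y in zip(a, b)))


def pareto_front(points):
    """Return the names of points that no other point dominates."""
    return [
        name for name, objs in points.items()
        if not any(dominates(other, objs)
                   for other_name, other in points.items()
                   if other_name != name)
    ]


# Objectives: (predicted affinity, predicted solubility); higher is better for both.
analogs = {
    "a1": (8.5, -5.0),
    "a2": (7.9, -3.5),
    "a3": (7.5, -4.0),  # dominated by a2 on both objectives
    "a4": (6.8, -2.9),
}
front = pareto_front(analogs)
```

Choosing among `a1`, `a2`, and `a4` is then a project decision about how much affinity to trade for solubility.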

Design Constraints

Drug-Likeness Filters

Apply standard filters:
  • Lipinski’s Rule of 5: MW ≤ 500, LogP ≤ 5, HBD ≤ 5, HBA ≤ 10
  • Veber Rules: Rotatable bonds ≤ 10, TPSA ≤ 140 Å²
  • Lead-like: MW ≤ 350, LogP ≤ 3.5
  • Fragment-like: MW ≤ 250, rotatable bonds ≤ 3
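
The thresholds above translate directly into simple predicate functions. The sketch below assumes descriptors have already been computed by a cheminformatics toolkit and are passed in as a dictionary. The key names and the example values are assumptions for illustration. Note that this version requires every Lipinski criterion to hold, which is stricter than the common practice of tolerating one violation.

```python
def passes_lipinski(d):
    """Lipinski's Rule of 5 using the thresholds listed above (no violations allowed)."""
    return d["mw"] <= 500 and d["logp"] <= 5 and d["hbd"] <= 5 and d["hba"] <= 10


def passes_veber(d):
    """Veber rules: rotatable bonds and topological polar surface area (in Å²)."""
    return d["rot_bonds"] <= 10 and d["tpsa"] <= 140


def passes_lead_like(d):
    return d["mw"] <= 350 and d["logp"] <= 3.5


def passes_fragment_like(d):
    return d["mw"] <= 250 and d["rot_bonds"] <= 3


# Made-up descriptor values for a single molecule.
mol = {"mw": 393.4, "logp": 3.3, "hbd": 1, "hba": 7, "rot_bonds": 8, "tpsa": 74.7}
report = {
    "lipinski": passes_lipinski(mol),
    "veber": passes_veber(mol),
    "lead_like": passes_lead_like(mol),
    "fragment_like": passes_fragment_like(mol),
}
```

The example molecule is drug-like by Lipinski and Veber but too heavy for the lead-like and fragment-like windows.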

Custom Constraints

Define your own:
  • Required substructures: Force inclusion of specific groups
  • Forbidden substructures: Exclude toxicophores, PAINS
  • Specific interactions: Require H-bond to Asp855
  • Property ranges: LogP 2-4, MW 300-450

Chemical Space Restrictions

  • Allowed reactions: Limit to high-yielding chemistry
  • Available building blocks: Use in-stock reagents
  • Synthetic routes: Prefer ≤ 5 step syntheses
  • Stereochemistry: Control chiral centers

Evaluation Metrics

Docking Score

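Predicted binding affinity from molecular docking. Lower (more negative) scores indicate stronger predicted binding, as in the < -9 cutoff used in the EGFR example.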

Synthetic Accessibility (SA Score)

Estimates synthesis difficulty on a 1-10 scale (lower is easier):
  • 1-3: Easy to synthesize
  • 4-6: Moderate difficulty
  • 7-10: Very difficult or impractical
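
SA scores are usually fractional, so the integer bands above leave gaps (for example, 3.5). The sketch below shows one way to map a score to a band. Assigning in-between values to the harder band is an assumption made here, not a documented LiteFold rule.

```python
def sa_category(sa_score):
    """Map an SA score (1-10) to the difficulty bands listed above.

    Values between bands (e.g. 3.5) fall into the harder band (assumed convention).
    """
    if not 1.0 <= sa_score <= 10.0:
        raise ValueError("SA score must be between 1 and 10")
    if sa_score <= 3.0:
        return "easy"
    if sa_score <= 6.0:
        return "moderate"
    return "difficult"


labels = [sa_category(s) for s in (2.1, 3.5, 6.0, 8.7)]
```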

QED (Quantitative Estimate of Drug-likeness)

Overall drug-likeness score (0-1):
  • > 0.67: Drug-like
  • 0.49-0.67: Moderate
  • < 0.49: Non-drug-like
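
To summarize a generated library against these QED bands, a simple tally is often enough. The sketch below assumes QED values were computed upstream. The function names and sample values are illustrative.

```python
from collections import Counter


def qed_category(qed):
    """Map a QED value (0-1) to the bands listed above."""
    if not 0.0 <= qed <= 1.0:
        raise ValueError("QED must be between 0 and 1")
    if qed > 0.67:
        return "drug-like"
    if qed >= 0.49:
        return "moderate"
    return "non-drug-like"


# Made-up QED values for a small library.
qed_values = [0.82, 0.71, 0.55, 0.49, 0.31, 0.67]
summary = Counter(qed_category(q) for q in qed_values)
```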

ADMET Predictions

  • Absorption: Caco-2 permeability, HIA
  • Distribution: LogD, plasma protein binding
  • Metabolism: CYP substrate/inhibitor
  • Excretion: Clearance
  • Toxicity: hERG, Ames, hepatotoxicity

Novelty Score

Measures chemical novelty vs. known compounds:
  • Tanimoto similarity to nearest ChEMBL compound
  • Scaffold novelty: Is core structure new?
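
The Tanimoto coefficient between two fingerprints is the number of shared on-bits divided by the number of bits set in either. The sketch below represents each fingerprint as a set of on-bit indices and reports novelty as one minus the similarity to the nearest reference. Expressing novelty as `1 - similarity` and the toy fingerprints are assumptions for illustration. In practice the reference set would be fingerprints of known compounds such as those in ChEMBL.

```python
def tanimoto(fp_a, fp_b):
    """Tanimoto coefficient between two fingerprints given as sets of on-bit indices."""
    if not fp_a and not fp_b:
        return 0.0  # convention assumed here for two empty fingerprints
    return len(fp_a & fp_b) / len(fp_a | fp_b)


def novelty(candidate_fp, reference_fps):
    """One minus the similarity to the nearest reference compound."""
    nearest = max(tanimoto(candidate_fp, ref) for ref in reference_fps)
    return 1.0 - nearest


# Toy fingerprints; real ones typically have hundreds of on-bits.
candidate = {1, 4, 7, 9, 15}
references = [{1, 4, 7, 20}, {2, 3, 9}, {1, 4, 7, 9, 15, 30}]
score = novelty(candidate, references)
```

The third reference shares five of six bits with the candidate, so it is the nearest neighbor and the novelty is low.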

Post-Generation Workflow

1

Initial Filtering

Apply drug-likeness and ADMET filters. Typically reduces 1000 → 200 molecules.
2

Docking Validation

Dock all molecules that pass the filters. Select the top 50 by docking score.
3

Visual Inspection

Manually review binding modes:
  • Do interactions make sense?
  • Any steric clashes?
  • Favorable interactions captured?
4

Diversity Selection

Cluster by scaffold, select 20 diverse representatives.
5

Synthesis Planning

For top 20, generate synthetic routes. Prioritize by:
  • Synthetic accessibility
  • Building block availability
  • Number of steps
6

Final Selection

Select 5-10 for synthesis and testing.
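
The funnel above can be sketched as a single function that applies each automated step in order. The dictionary keys, the function name `run_funnel`, and the synthetic library are assumptions for illustration. Visual inspection is manual and building block availability is not modeled here.

```python
def run_funnel(molecules, top_n_docking=50, n_diverse=20, n_final=10):
    """Apply the post-generation steps in order and return the final picks."""
    # Step 1: drug-likeness and ADMET flags are assumed to be precomputed.
    passing = [m for m in molecules if m["drug_like"] and m["admet_ok"]]
    # Step 2: keep the best docking scores (more negative is better).
    docked = sorted(passing, key=lambda m: m["docking"])[:top_n_docking]
    # Step 3 (visual inspection) is manual and not modeled here.
    # Step 4: walk down the ranked list, keeping the first hit per scaffold.
    diverse, seen = [], set()
    for m in docked:
        if m["scaffold"] not in seen:
            seen.add(m["scaffold"])
            diverse.append(m)
        if len(diverse) == n_diverse:
            break
    # Steps 5-6: prioritize easier syntheses, then shorter routes.
    return sorted(diverse, key=lambda m: (m["sa_score"], m["n_steps"]))[:n_final]


# Deterministic synthetic library of 1000 entries, for illustration only.
library = [
    {"id": f"mol_{i}", "drug_like": i % 5 != 0, "admet_ok": i % 3 != 0,
     "docking": -6.0 - (i % 50) / 10, "scaffold": f"s{i % 40}",
     "sa_score": 1 + i % 9, "n_steps": 2 + i % 6}
    for i in range(1000)
]
final = run_funnel(library)
```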

Example: Designing EGFR Inhibitors

Let’s design novel EGFR kinase inhibitors:
1

Target Preparation

  • Protein: EGFR with T790M resistance mutation
  • Binding site: ATP pocket
  • Known inhibitors: Erlotinib (T790M confers resistance), osimertinib (retains activity against T790M)
2

Design Strategy

Generate molecules that:
  • Fit in T790M mutant pocket
  • Avoid steric clash with Met790
  • Maintain the hinge H-bond to Met793 and stay positioned to engage Cys797
3

Generation

  • Model: DiffSBDD
  • Count: 500 molecules
  • Constraints: MW 300-500, LogP < 5, covalent warhead targeting Cys797
  • Time: 25 minutes
4

Filtering

  • Drug-like: 387/500 pass
  • ADMET: 241/387 favorable
  • SA score < 6: 156/241
  • Docking score < -9: 43/156
5

Top Candidates

  • Cluster into 12 scaffolds
  • Select 2 from each scaffold (24 total)
  • Visual inspection: 18 chemically sensible
  • Synthesis planning: 10 feasible (≤ 5 steps)
6

Outcome

  • Synthesize 10 compounds
  • 7/10 have IC50 < 100 nM vs. T790M EGFR
  • 3 selected for cell-based validation
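
The filtering counts in step 4 show where most of the attrition happens. The short calculation below derives the stage-by-stage pass rates from those counts. Only the stage names and counts come from the example, and the helper function is illustrative.

```python
# Counts taken from the filtering step of the EGFR example.
stages = [
    ("Generated", 500),
    ("Drug-like", 387),
    ("ADMET favorable", 241),
    ("SA score < 6", 156),
    ("Docking score < -9", 43),
]


def stage_pass_rates(stages):
    """Percentage of molecules surviving each stage relative to the previous one."""
    rates = []
    for (_, previous), (name, count) in zip(stages, stages[1:]):
        rates.append((name, round(100 * count / previous, 1)))
    return rates


rates = stage_pass_rates(stages)
overall = round(100 * stages[-1][1] / stages[0][1], 1)

for name, pct in rates:
    print(f"{name}: {pct}%")
print(f"Overall: {overall}%")
```

The docking cutoff is the most selective stage (27.6% pass), and 8.6% of the generated molecules survive all four filters.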

Advanced Features

Conditional Generation

Generate molecules conditioned on:
  • Activity: Generate only high-affinity binders
  • Selectivity: Active on target, inactive on off-target
  • Properties: Optimize for BBB penetration, oral bioavailability

Multi-Target Design

Generate molecules active against multiple targets:
  • Dual inhibitors (e.g., EGFR + HER2)
  • Polypharmacology
  • Avoiding anti-targets (e.g., hERG)

Generative Optimization

Iterative design cycles:
  1. Generate molecules
  2. Evaluate (dock, predict properties)
  3. Retrain model on best molecules
  4. Generate improved next generation
  5. Repeat
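
The cycle above has the same shape as an elite-selection optimization loop. The toy sketch below makes that structure concrete. The "model" is just a Gaussian over a single number and `evaluate` is a stand-in objective, so nothing here reflects how LiteFold retrains its generative models. All names and parameters are assumptions for illustration.

```python
import random


def evaluate(x):
    """Toy objective standing in for docking plus property prediction (peak at x = 3)."""
    return -(x - 3.0) ** 2


def design_cycle(n_rounds=5, n_samples=200, n_elite=20, seed=0):
    """Run generate / evaluate / retrain rounds and track the best score so far."""
    rng = random.Random(seed)
    mean, spread = 0.0, 2.0  # the toy "model": a Gaussian sampler
    best_so_far = float("-inf")
    history = []
    for _ in range(n_rounds):
        samples = [rng.gauss(mean, spread) for _ in range(n_samples)]  # 1. generate
        scored = sorted(samples, key=evaluate, reverse=True)           # 2. evaluate
        elite = scored[:n_elite]
        mean = sum(elite) / n_elite                                    # 3. retrain on best
        spread = max(0.2, spread * 0.7)                                # narrow the search
        best_so_far = max(best_so_far, evaluate(scored[0]))
        history.append(best_so_far)                                    # 4-5. next round
    return mean, history


final_mean, history = design_cycle()
```

Each round re-centers sampling on the best candidates from the previous round, which is the essential idea behind retraining on top-scoring molecules.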

Integration with Synthesis Planning

Top candidates automatically flow to synthesis planning:
  • Retrosynthesis: Identify synthetic routes
  • Building block search: Check commercial availability
  • Route scoring: Rank by feasibility and cost
  • Step-by-step protocols: Reaction conditions

Best Practices

Start broad, then filter: Generate 500-1000 molecules, then progressively filter. Don’t over-constrain generation.
Include known actives as controls: Generate variants of known binders to validate model performance.
Balance novelty and feasibility: Very novel molecules may be hard to synthesize. Find the sweet spot.
Validate in silico predictions: Docking and ADMET predictions are estimates. Experimental validation is essential.
Synthetic accessibility is critical: A molecule with perfect predicted properties but SA score 9 is useless. Prioritize synthesizable compounds.

Limitations

Current de novo design limitations:
  • Synthetic feasibility: Models may suggest hard-to-make molecules
  • Activity prediction accuracy: In silico predictions need experimental validation
  • Scaffold bias: Models may favor overrepresented scaffolds in training data
  • Specificity: Hard to guarantee selectivity vs. off-targets
  • ADMET prediction errors: Some toxicities hard to predict computationally

Next Steps

Molecular Docking

Validate generated molecules with docking

Molecular Dynamics

Confirm binding stability with MD

Drug Discovery Workflow

Integrate de novo design in full campaigns

Compound Screening

Combine with virtual screening for comprehensive coverage