Main features

Literature review

Provide evidence-based answers by searching and synthesizing scientific publications.

What it does: Searches the PubMed database, synthesizes findings across papers (abstracts only), identifies key researchers and institutions, tracks emerging trends, and returns cited information with source links.

How it works: The service maintains a Qdrant vector database containing embeddings of PubMed article abstracts (not full texts). When a query arrives, the service: (1) converts the query to a vector embedding, (2) performs a semantic search against the stored embeddings, (3) re-ranks the results, and (4) generates a synthesis with citations.
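
The four steps above can be sketched with a toy, dependency-free pipeline. This is only an illustration of the control flow, not the service's implementation: the bag-of-words `embed` function stands in for a real embedding model, the in-memory dictionary stands in for the Qdrant collection, and the corpus, identifiers, and function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical placeholder corpus (identifier -> text); not real PubMed records.
ABSTRACTS = {
    "DOC-1": "CD70 signalling and immune regulation",
    "DOC-2": "KRAS G12C inhibitors in non-small cell lung cancer",
    "DOC-3": "CD70 as a therapeutic target in cancer",
}


def embed(text):
    """Toy bag-of-words 'embedding'; a real system would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(u, v):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query, top_k=2):
    """Steps 1-2: embed the query and keep the top_k most similar documents."""
    q = embed(query)
    scored = sorted(
        ((cosine(q, embed(text)), doc_id) for doc_id, text in ABSTRACTS.items()),
        reverse=True,
    )
    return [doc_id for score, doc_id in scored[:top_k] if score > 0]


def rerank(query, doc_ids):
    """Step 3 (toy re-ranker): order candidates by distinct query terms matched."""
    q_terms = set(embed(query))
    return sorted(
        doc_ids,
        key=lambda d: len(q_terms & set(embed(ABSTRACTS[d]))),
        reverse=True,
    )


def synthesize(doc_ids):
    """Step 4 (stand-in for the LLM synthesis): one cited line per document."""
    return "\n".join(f"{ABSTRACTS[d]} [{d}]" for d in doc_ids)


def answer(query):
    return synthesize(rerank(query, search(query)))
```

Calling `answer("CD70 function")` returns one line per CD70 document, each ending with its bracketed identifier, while the unrelated KRAS document is filtered out at the search step.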

Example queries:

  • “What are the latest findings on checkpoint inhibitor resistance mechanisms in melanoma?” → Synthesizes recent publications on resistance pathways.

  • “Summarize the evidence for KRAS G12C inhibitors in non-small cell lung cancer.” → Provides a literature-based evidence summary.

  • “What’s the function of CD70 in the literature?” → Returns a synthesis of CD70’s known roles in immune regulation, cancer biology, and therapeutic targeting, with PubMed citations.

Gene knowledge exploration

Provide comprehensive biological context about genes based on publicly available, curated databases.

What it does: Provides access to comprehensive biological information about genes, proteins, and other molecular targets from trusted public databases and resources.

How it works: The service stores prior knowledge in local database tables (gene interactions, pathway annotations, expression profiles). When queried, it uses LiteLLM to generate SQL from natural language, executes the query, and formats the results with LLM-generated explanations.
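
A minimal sketch of this generate, execute, and format loop is shown below, using an in-memory SQLite database. Everything here is an assumption made for illustration: the `gene_role` table and its two demo rows are hypothetical, `generate_sql` is a hard-coded stub standing in for the LiteLLM call, and the `SELECT`-only guard is a design choice of this sketch rather than a documented behaviour of the service.

```python
import sqlite3


def build_demo_db():
    """Create a hypothetical one-table knowledge base in memory (demo rows only)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gene_role (symbol TEXT PRIMARY KEY, role TEXT)")
    conn.executemany(
        "INSERT INTO gene_role VALUES (?, ?)",
        [("TP53", "tumor suppressor"), ("KRAS", "oncogene")],
    )
    return conn


def generate_sql(question):
    """Stub for the LLM step: returns a parameterised query and its arguments.

    A real implementation would prompt a model with the schema and the question.
    """
    symbol = next((s for s in ("TP53", "KRAS") if s in question.upper()), None)
    return "SELECT symbol, role FROM gene_role WHERE symbol = ?", (symbol,)


def run_query(conn, question):
    sql, params = generate_sql(question)
    # Assumed safeguard: refuse anything that is not a read-only SELECT.
    if not sql.lstrip().upper().startswith("SELECT"):
        raise ValueError("only read-only SELECT statements are executed")
    return conn.execute(sql, params).fetchall()


def format_answer(rows):
    """Stand-in for the LLM-generated explanation."""
    if not rows:
        return "No matching record found."
    return "; ".join(f"{symbol}: {role}" for symbol, role in rows)
```

With the demo database, `format_answer(run_query(conn, "Is TP53 a known oncogene or tumor suppressor?"))` yields `TP53: tumor suppressor`. Passing the gene symbol as a bound parameter rather than interpolating it into the SQL string keeps user text out of the statement itself.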

Key information covered:

  1. Biological Information:

    • Synonym description: the name aliases and description of each gene.

    • Protein Family: the protein family (and superfamily) of each gene.

    • Localisation: the subcellular localisation of the protein encoded by each gene, with a confidence level.

    • Panessentiality: whether the gene is a pan-cancer core essential gene.

    • Oncogenicity: whether the gene has previously been reported as an oncogene or a tumor suppressor gene.

    • Immunity: whether the gene is involved in Reactome immune pathways, or linked to immune evasion by cancer and other cancer-related dysregulation of the immune system.

    • Transcription Factors: whether the gene encodes a transcription factor (TF).

    • Protein complexes: whether the protein encoded by the gene is part of one or more complexes.

    • Hallmarks of cancer: whether the gene is involved in any cancer hallmark pathway.

  2. Safety and Toxicity:

    • OpenTargets Baseline Expression

    • GTEx specificity

    • Safety

    • Tolerability

  3. Target Tractability:

    • OpenTargets Tractability

Example queries:

  • “Where is BRCA expressed in the cell?” → Returns subcellular localization, tissue expression, and pathway context.

  • “What immune pathways involve PD-L1?” → Maps the gene to relevant signaling cascades.

  • “Is TP53 a known oncogene or tumor suppressor?” → Provides classification with supporting evidence.

MultiOmics patient data analysis

Analyze complex patient datasets across multiple data modalities, enabling exploration of relationships between molecular features and clinical outcomes.

What it does: Generates advanced analyses and interactive visualizations that help researchers identify patterns, compare patient groups, and test hypotheses. The detailed list of analyses supported is available in the Visualisation capabilities section.

How it works: When a user requests a visualization, the agent: (1) interprets the query with an LLM, (2) discovers the available data via the Data Discovery Agent, (3) constructs the appropriate SQL or data requests, (4) sends the plot specification to the Plotter service via MCP, and (5) returns the interactive Plotly visualization.
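
As a rough illustration of steps (3) and (4), the sketch below turns query results into a plot specification shaped like a Plotly figure dictionary (`data` traces plus `layout`). The row values are made up, the function names are hypothetical, and `send_to_plotter` only serialises the specification; the actual MCP message format used by the Plotter service is not described here.

```python
import json
from collections import defaultdict

# Hypothetical rows as a data request might return them:
# (indication code, log2(TPM+1) expression). Values are invented for the demo.
ROWS = [("BRCA", 5.1), ("BRCA", 4.7), ("LUAD", 6.2), ("LUAD", 5.9), ("LUAD", 6.4)]


def build_box_spec(gene, rows):
    """Group expression values by indication and emit one box trace per group."""
    groups = defaultdict(list)
    for indication, value in rows:
        groups[indication].append(value)
    traces = [
        {"type": "box", "name": indication, "y": values}
        for indication, values in sorted(groups.items())
    ]
    layout = {
        "title": {"text": f"{gene} expression by indication"},
        "yaxis": {"title": {"text": "log2(TPM+1)"}},
    }
    return {"data": traces, "layout": layout}


def send_to_plotter(spec):
    """Stand-in for the MCP call: serialise the specification to JSON."""
    return json.dumps(spec)
```

Keeping the specification as plain JSON-serialisable data means the agent never has to import a plotting library; only the service that renders the figure does.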

Example queries and outputs:

  • “Plot an overview of YAP1 expression over M indications” → Generates a box/violin plot showing YAP1 expression (log2 TPM+1) across 33 cancer types.

  • “Show survival analysis for high vs. low ERBB2 expressors in breast cancer” → Generates a Kaplan-Meier survival curve with log-rank test statistics.

  • “Compare the spatial distribution of T cells and malignant cells in patient X’s tissue” → Generates a spatial transcriptomics colocalization visualization.

  • “What is the differential expression between good responders (PFS > 4y) and non-responders (PFS < 1.5y) in TCGA breast cancer?” → Generates a volcano plot with DESeq2/t-test results.
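
For readers unfamiliar with the survival example above, a Kaplan-Meier curve is computed with the standard product-limit estimator. The function below is a generic textbook sketch in plain Python, not the platform's implementation; the function name and input format are assumptions, and it omits the log-rank test and confidence intervals.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up duration for each patient
    events: 1 if the event was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        observed = 0  # events occurring exactly at time t
        removed = 0   # patients leaving the risk set at t (events + censored)
        while i < len(data) and data[i][0] == t:
            observed += data[i][1]
            removed += 1
            i += 1
        if observed:
            survival *= 1 - observed / n_at_risk
            curve.append((t, survival))
        n_at_risk -= removed
    return curve
```

Running it once per group (for example, high versus low expressors) gives the two step curves that a Kaplan-Meier plot overlays.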
