The Problem: Querying a Large Oncology Database with Generative AI
Querying large biomedical databases is difficult because their ontologies and schemas are complex. Tempus maintains a large cohort of multimodal oncology records for research, but navigating this data can be daunting, even for experienced users.
Tempus Lens, a software-as-a-service platform, simplifies this process through a drag-and-drop interface. Biomedical concepts are represented by filters grouped into pills, which users can easily manipulate. However, scaling this solution to less experienced users has been a challenge.
Generative AI shows substantial promise in addressing these challenges by facilitating natural language querying across multiple domains. Lens Cohort Builder extends Tempus Lens by letting users interact with the data using only natural language prompts, with large language models (LLMs) abstracting away the complexity of biomedical ontologies.
Lens Cohort Builder: Methods and Architecture
Each filter in Lens Cohort Builder is tied to a specific LLM call with a custom-designed prompt. These prompts ensure that the most relevant matches are returned for each associated filter concept. Filters are processed in parallel, and a subsequent LLM call groups these filters into logical relationships based on the user’s input. The resulting query is populated in the user interface, where users can choose to apply or modify the suggested query pills.
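
The fan-out and fan-in pattern described above can be sketched roughly as follows. This is a minimal illustration, not the Lens Cohort Builder implementation: the prompt templates, the `call_llm` stub, and the names `FilterMatch`, `match_filter`, `group_filters`, and `build_query` are all hypothetical, and the stub returns canned answers instead of calling a real model.

```python
import asyncio
from dataclasses import dataclass


@dataclass
class FilterMatch:
    filter_name: str
    values: list[str]


# Hypothetical per-filter prompt templates; the real prompts are not published.
FILTER_PROMPTS = {
    "Primary Diagnosis": "List the diagnoses mentioned in: {query}",
    "Line of Therapy": "List the therapies and their line in: {query}",
}


async def call_llm(prompt: str) -> list[str]:
    """Stand-in for a real LLM client; returns canned answers for the demo."""
    await asyncio.sleep(0)  # yield control, as a network call would
    if prompt.startswith("List the diagnoses"):
        return ["Lung Cancer"]
    if prompt.startswith("List the therapies"):
        return ["Chemotherapy (First Line)"]
    return []


async def match_filter(filter_name: str, query: str) -> FilterMatch:
    """One LLM call per filter, using that filter's own prompt."""
    prompt = FILTER_PROMPTS[filter_name].format(query=query)
    return FilterMatch(filter_name, await call_llm(prompt))


async def group_filters(matches: list[FilterMatch], query: str) -> dict:
    """Stand-in for the follow-up grouping call; here it ANDs all non-empty matches."""
    return {"operator": "AND", "filters": [m for m in matches if m.values]}


async def build_query(query: str) -> dict:
    # Fan out: one call per filter, run concurrently.
    matches = await asyncio.gather(
        *(match_filter(name, query) for name in FILTER_PROMPTS)
    )
    # Fan in: a single call decides how the matched filters relate.
    return await group_filters(list(matches), query)


result = asyncio.run(
    build_query(
        "Find patients with lung cancer who received chemotherapy "
        "as a first-line treatment."
    )
)
print(result)
```

Running the per-filter calls concurrently keeps latency close to that of the slowest single call rather than the sum of all calls, which is one plausible reason to process filters in parallel.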

Example Workflow
- User Input: Researchers input natural language text, such as “Find patients with lung cancer who received chemotherapy as a first-line treatment.”
- LLM Mapping: The system maps this text to various filters, such as “Primary Diagnosis: Lung Cancer” and “Line of Therapy: Chemotherapy (First Line).”
- Query Assembly: Filters are grouped into logical relationships and displayed as pills in the UI for user approval or refinement.
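
To make the workflow concrete, the assembled query for the example prompt might be represented along these lines. This is only a sketch: the `Filter` and `Pill` classes, the `operator` field, and the `render` helper are assumptions made for illustration, and the actual Lens data model is not described in the source.

```python
from dataclasses import dataclass, field


@dataclass
class Filter:
    concept: str  # e.g. "Primary Diagnosis"
    value: str    # e.g. "Lung Cancer"


@dataclass
class Pill:
    operator: str  # assumed logical relationship between filters: "AND" or "OR"
    filters: list[Filter] = field(default_factory=list)


def render(pill: Pill) -> str:
    """Render a pill as a readable boolean expression for user review."""
    parts = [f"{f.concept}: {f.value}" for f in pill.filters]
    return f" {pill.operator} ".join(parts)


pill = Pill(
    operator="AND",
    filters=[
        Filter("Primary Diagnosis", "Lung Cancer"),
        Filter("Line of Therapy", "Chemotherapy (First Line)"),
    ],
)
print(render(pill))
# prints: Primary Diagnosis: Lung Cancer AND Line of Therapy: Chemotherapy (First Line)
```

Keeping the suggestion as plain structured data means the user can drop or edit a single filter before applying the query, which matches the approve-or-refine step in the workflow.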

Testing and Results
Beta testing was conducted with internal users to evaluate the tool’s accuracy and usability. Researchers and product managers generated queries and assessed Lens Cohort Builder’s responses based on their subject matter expertise. Additionally, users provided qualitative feedback through surveys.
Key Findings:
- Utility: A total of 33 users evaluated 1,916 queries, with 63.3% deemed accurate or mostly accurate and 36.7% rated as inaccurate or mostly inaccurate.
- Unknown Scope: Approximately 320 queries were identified as outside the tool’s intended scope.
Filter-Specific Performance:
| Filter Name | Precision | Recall | F1 | Accuracy | 
|---|---|---|---|---|
| Overall | 0.775 | 0.82 | 0.797 | 0.663 | 
| Primary Diagnosis | 0.886 | 0.986 | 0.933 | 0.875 | 
| Somatic Variant Genes | 0.774 | 0.96 | 0.857 | 0.75 | 
| DNA Modality | 0.458 | 0.88 | 0.603 | 0.431 | 
| Biopsy Modality | 0.625 | 0.909 | 0.741 | 0.588 | 
| RNA Modality | 0.826 | 0.864 | 0.844 | 0.731 | 
| Line of Therapy | 1 | 0.714 | 0.833 | 0.714 | 
| Drug Class | 1 | 0.75 | 0.857 | 0.75 | 
| Tumor Stage | 0.667 | 0.857 | 0.75 | 0.6 | 
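The results indicate strong performance for commonly used filters, particularly “Primary Diagnosis” (F1 0.933), “Somatic Variant Genes” (F1 0.857), and “Drug Class” (F1 0.857). “DNA Modality” was the weakest filter (F1 0.603), mainly because of its low precision (0.458).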
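
As a sanity check on the table, the F1 column can be recomputed from the precision and recall columns, assuming F1 is the usual harmonic mean of the two. The sketch below also tests an observation that the source does not state: the Accuracy values are numerically consistent with TP / (TP + FP + FN), that is, accuracy computed without true negatives. The function names are illustrative only.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def accuracy_without_true_negatives(precision: float, recall: float) -> float:
    """TP / (TP + FP + FN), rewritten in terms of precision and recall.

    Assumption: this is only a formula that happens to fit the table;
    the source does not say how Accuracy was computed.
    """
    return 1.0 / (1.0 / precision + 1.0 / recall - 1.0)


# (precision, recall, reported F1, reported accuracy), copied from the table.
ROWS = {
    "Overall": (0.775, 0.82, 0.797, 0.663),
    "Primary Diagnosis": (0.886, 0.986, 0.933, 0.875),
    "Somatic Variant Genes": (0.774, 0.96, 0.857, 0.75),
    "DNA Modality": (0.458, 0.88, 0.603, 0.431),
    "Biopsy Modality": (0.625, 0.909, 0.741, 0.588),
    "RNA Modality": (0.826, 0.864, 0.844, 0.731),
    "Line of Therapy": (1.0, 0.714, 0.833, 0.714),
    "Drug Class": (1.0, 0.75, 0.857, 0.75),
    "Tumor Stage": (0.667, 0.857, 0.75, 0.6),
}

TOLERANCE = 0.002  # the table values are rounded to three decimals

for name, (p, r, reported_f1, reported_acc) in ROWS.items():
    f1_ok = abs(f1_score(p, r) - reported_f1) < TOLERANCE
    acc_ok = abs(accuracy_without_true_negatives(p, r) - reported_acc) < TOLERANCE
    print(f"{name}: F1 consistent={f1_ok}, accuracy consistent={acc_ok}")
```

Under these assumptions every row passes both checks within rounding, so the table is internally consistent.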
Impact
New users who are unfamiliar with Tempus’ complex data model can now use generative AI to build cohorts more easily and efficiently.
Next Steps
By continuing to refine accuracy and usability, Tempus aims to unlock even greater value from its data, driving innovation in biomedical research and personalized medicine.
