Preparing your data

For datasets to be usable and interoperable in K, they must be transformed to match the K specifications outlined in the section above. By default, K is currently optimised for the following data modalities: clinical, bulk, single cell, WES, spatial transcriptomics, histology, and proteomics. This list can be expanded based on client use cases. We have multiple tools at our disposal to do this:

Data Transformation Agent (DTA)

Our automated Data Transformation Agent prepares datasets for immediate use in K:

  • Supported Data Types: Clinical data, preprocessed bulk RNA-seq, and single-cell datasets (aligned with the scope of the self-service upload interface). Note: this transformation step adapts to the variety of data formats clients may provide, including their specific data schemas, data dictionaries, and data preparation techniques.

  • Processing Approach: Uses deterministic pipelines and libraries to standardize, validate, and format datasets according to K specifications

  • Key Benefits: Fast turnaround and automated validation to ensure data compatibility with K's analytical tools
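To make the standardize, validate, and format steps concrete, here is a minimal sketch of what a deterministic pipeline for clinical records could look like. It is an illustration only, not the DTA's actual implementation: the target schema, the client data dictionary, and every function name below are hypothetical, and the real K specifications are the ones described in the section above.

```python
# Hypothetical target schema (NOT the real K specification):
# field name -> (type, required)
TARGET_SCHEMA = {
    "patient_id": (str, True),
    "age": (int, True),
    "sex": (str, False),
}

# Hypothetical client data dictionary: client column name -> target field.
CLIENT_DICTIONARY = {
    "PatientID": "patient_id",
    "Age (years)": "age",
    "Gender": "sex",
}


def standardize(record, dictionary):
    """Rename client columns to target fields and drop unmapped columns."""
    return {dictionary[k]: v for k, v in record.items() if k in dictionary}


def validate(record, schema):
    """Return a list of problems; an empty list means the record is valid."""
    problems = []
    for field, (expected_type, required) in schema.items():
        value = record.get(field)
        if value is None or value == "":
            if required:
                problems.append(f"missing required field: {field}")
            continue
        try:
            expected_type(value)
        except (TypeError, ValueError):
            problems.append(
                f"{field}: cannot convert {value!r} to {expected_type.__name__}"
            )
    return problems


def format_record(record, schema):
    """Cast values to the schema types and emit fields in schema order."""
    out = {}
    for field, (expected_type, _) in schema.items():
        value = record.get(field)
        out[field] = None if value in (None, "") else expected_type(value)
    return out


def transform(records, dictionary=CLIENT_DICTIONARY, schema=TARGET_SCHEMA):
    """Run standardize -> validate -> format; collect rejected rows with reasons."""
    valid, rejected = [], []
    for index, raw in enumerate(records):
        record = standardize(raw, dictionary)
        problems = validate(record, schema)
        if problems:
            rejected.append((index, problems))
        else:
            valid.append(format_record(record, schema))
    return valid, rejected
```

Because each step is a pure function of its input, the same source file and the same data dictionary always yield the same output and the same list of rejected rows, which is what makes automated validation reproducible.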

Manual Curation Services

For datasets requiring specialized handling or enhanced quality, our expert curation team provides tailored data preparation:

  • Advanced Modalities: Handles data types and processing levels not currently supported by the DTA, such as imaging data, spatial transcriptomics, or custom experimental formats

  • Expert Review: Team of data scientists and biology experts ensures the highest level of data quality, accuracy, and biological relevance
