Bulk RNA-seq files

Bulk metadata file: bkrnaseq_metadata_table

  • Objective of the file: Contains sample metadata & UMAP coordinates.

  • Requirements:

  • Named bkrnaseq_metadata_table

  • In parquet format

  • bk_sample_id as primary key

  • Contains the mandatory columns in the table below

  • File columns & description:

| Name | Type | Mandatory | Description |
|---|---|---|---|
| bk_sample_id | str | TRUE | Bulk RNA-seq sample identifier |
| patient_id | str | TRUE | Patient identifier. Maps to the linked clinical data file. There can be multiple bk_sample_id values for one patient_id. |
| sample_id | str | FALSE | Maps to the linked sample metadata file. There is a 1-to-1 relationship between bk_sample_id and sample_id. |
| UMAP_1 | | FALSE | |
| UMAP_2 | | FALSE | |
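The key constraints above (mandatory columns filled, bk_sample_id unique, sample_id 1 to 1 with bk_sample_id) could be checked along the following lines. This is a minimal sketch, not the pipeline's actual validator: it assumes the parquet file has already been loaded into a list of dictionaries, and the function name check_metadata_rows and the sample values are hypothetical.

```python
from collections import Counter, defaultdict

# Columns marked Mandatory = TRUE in the table above.
MANDATORY_COLUMNS = ("bk_sample_id", "patient_id")


def _is_missing(value):
    """Treat None and empty strings as missing values."""
    return value is None or value == ""


def check_metadata_rows(rows):
    """Return a list of problems found in metadata rows (hypothetical helper)."""
    problems = []

    # Every row needs a value for each mandatory column.
    for i, row in enumerate(rows):
        for col in MANDATORY_COLUMNS:
            if _is_missing(row.get(col)):
                problems.append(f"row {i}: missing {col}")

    # bk_sample_id is the primary key, so it must not repeat.
    counts = Counter(row.get("bk_sample_id") for row in rows)
    for key, n in counts.items():
        if not _is_missing(key) and n > 1:
            problems.append(f"duplicate bk_sample_id: {key}")

    # sample_id is optional; when present, each value may belong
    # to only one bk_sample_id (1-to-1 relationship).
    owners = defaultdict(set)
    for row in rows:
        if not _is_missing(row.get("sample_id")):
            owners[row["sample_id"]].add(row.get("bk_sample_id"))
    for sample_id, bk_ids in owners.items():
        if len(bk_ids) > 1:
            problems.append(f"sample_id {sample_id} maps to several bk_sample_id")

    return problems


# Illustrative rows: two bulk samples from the same patient are allowed.
rows = [
    {"bk_sample_id": "BK1", "patient_id": "P1", "sample_id": "S1"},
    {"bk_sample_id": "BK2", "patient_id": "P1", "sample_id": "S2"},
]
problems = check_metadata_rows(rows)
```

An empty list means the rows passed these three checks; the checks do not cover column types or the optional UMAP columns.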

Patient counts file: bkrnaseq_count_table

  • Objective of the file: Contains TPM-normalized expression values.

  • Requirements:

  • Named bkrnaseq_count_table

  • In parquet format

  • bk_sample_id as primary key

  • gene_name values conform to the HGNC ontology

  • count_tpm values stored in long format (one row per sample and gene), with "." as the decimal separator

  • Contains the mandatory columns in the table below

  • File columns & description:

| Name | Type | Mandatory | Description |
|---|---|---|---|
| bk_sample_id | str | TRUE | Maps to the linked clinical data file. There can be multiple bk_sample_id values for one patient_id. |
| gene_name | str | TRUE | Gene name conforming to the HGNC ontology |
| count_tpm | pl.Float32 | TRUE | TPM count value |
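To illustrate the long format described above, the sketch below flattens a wide sample-by-gene mapping into one row per sample and gene, using the three columns of the table. The helper names wide_to_long and duplicated_pairs, the input layout, and the TPM values are all hypothetical. Treating each (bk_sample_id, gene_name) pair as unique is an assumption of this sketch, not something the specification states.

```python
def wide_to_long(wide):
    """Flatten {bk_sample_id: {gene_name: tpm}} into long-format rows."""
    rows = []
    for bk_sample_id, genes in wide.items():
        for gene_name, tpm in genes.items():
            rows.append(
                {
                    "bk_sample_id": bk_sample_id,
                    "gene_name": gene_name,
                    # Stored as a float; Python floats print with "." as separator.
                    "count_tpm": float(tpm),
                }
            )
    return rows


def duplicated_pairs(rows):
    """Return (bk_sample_id, gene_name) pairs that occur more than once.

    Assumption of this sketch: each sample/gene pair appears at most once.
    """
    seen = set()
    duplicates = set()
    for row in rows:
        pair = (row["bk_sample_id"], row["gene_name"])
        if pair in seen:
            duplicates.add(pair)
        seen.add(pair)
    return sorted(duplicates)


# Illustrative values only; TP53 and EGFR are HGNC gene symbols.
wide = {
    "BK1": {"TP53": 12.5, "EGFR": 0.0},
    "BK2": {"TP53": 7.25, "EGFR": 3.0},
}
long_rows = wide_to_long(wide)
```

Here long_rows holds four rows, one per sample and gene, which is the shape the count table expects before it is written to parquet.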
