Clinical data

In this section, users can analyse how their genes of interest impact patient cohorts using several exploration methods.

Cohort selection and view the data

A cohort can be selected at the top of the page. Once selected, its details are shown on the right. The data can be viewed as a spreadsheet-like table in the “View the data” section. Because the gene expression table is usually large, only the first 1000 lines are shown by default to avoid excessive memory usage. Users can choose to show all the data if they want.

Survival analysis

This section allows you to examine the association between gene expression and survival outcomes within a selected cohort.

To get started, enter the gene names one per line in the input box, taking care not to include extra spaces before or after each name. The gene names must exactly match those used in your gene expression dataset. Alternatively, you can also choose a geneset from the custom genesets registered in OmicsBridge.
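If your gene list comes from a spreadsheet or another tool, it can help to tidy it before pasting it into the input box. The following is a minimal sketch and not part of OmicsBridge; the function name `clean_gene_list` and the example gene symbols are chosen for illustration only.

```python
def clean_gene_list(text):
    """Return one gene name per entry, without surrounding spaces,
    blank lines, or repeated names. Letter case is left untouched,
    because names must match the expression dataset exactly."""
    seen = set()
    genes = []
    for line in text.splitlines():
        name = line.strip()
        if name and name not in seen:
            seen.add(name)
            genes.append(name)
    return genes


raw = " TP53 \n\nKRAS\nTP53\n"
print("\n".join(clean_gene_list(raw)))
```

The printed result can be pasted directly into the input box, one gene per line.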

Then choose how to group the samples: either by a median split, dividing them into high and low expression groups, or by comparing the top 25% and bottom 25% of expression levels (quartile split). Next, select the clinical event to use for the analysis, such as Overall Survival (OS) or Progression-Free Survival (PFS).
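To make the two grouping options concrete, here is a small sketch of how samples could be assigned to groups. It is an illustration under stated assumptions, not the exact procedure used by OmicsBridge: the handling of samples that fall exactly on a cutoff and the quartile interpolation method are assumptions, and the sample names and values are made up.

```python
from statistics import median, quantiles


def median_split(expression):
    """Label each sample 'high' or 'low' relative to the cohort median."""
    cutoff = median(expression.values())
    # Assumption: samples exactly at the median go to the 'low' group.
    return {s: ("high" if v > cutoff else "low") for s, v in expression.items()}


def quartile_split(expression):
    """Keep only the top 25% ('high') and bottom 25% ('low') of samples."""
    q1, _, q3 = quantiles(expression.values(), n=4, method="inclusive")
    groups = {}
    for sample, value in expression.items():
        if value >= q3:
            groups[sample] = "high"
        elif value <= q1:
            groups[sample] = "low"
        # Samples between the two quartiles are left out of the comparison.
    return groups


expression = {"P1": 1.0, "P2": 2.0, "P3": 3.0, "P4": 4.0,
              "P5": 5.0, "P6": 6.0, "P7": 7.0, "P8": 8.0}
print(median_split(expression))
print(quartile_split(expression))
```

Note that the quartile split discards the middle half of the cohort, so it compares more extreme groups but with fewer samples per group.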

Once the settings are in place, click the Start button to run the analysis. A results table will be generated, showing the p-value and hazard ratio for each gene (sorted by p-value). Clicking on any gene in the table will display its Kaplan–Meier survival curve on the right.
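For readers unfamiliar with Kaplan–Meier curves, the estimator behind such a plot can be sketched in a few lines. This is an illustrative sketch only: how OmicsBridge computes its curves, p-values and hazard ratios is not described here, and the function name and toy data below are assumptions.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] for each time with >= 1 event.

    times  : follow-up time per sample (e.g. days)
    events : 1 if the event occurred at that time, 0 if censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect every sample that leaves the risk set at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Toy data for one expression group: two censored samples (event = 0).
times = [5, 8, 12, 12, 20]
events = [1, 0, 1, 1, 0]
for t, s in kaplan_meier(times, events):
    print(t, round(s, 3))
```

Running the same function on the high- and low-expression groups separately gives the two step curves that a Kaplan–Meier plot compares.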

Adjustable graph parameters

The size (width and height) of the figure.
The font size of the X and Y axes/labels.
The size of the legend title.
The colours for the high- and low-expression groups.

Gene correlation

This section allows you to explore the correlation between gene expression levels within a selected cohort.

Begin by entering the name of the target gene, which will be shown on the Y-axis of the scatter plots.

You can choose between two analysis modes under Explore type:

Explore one gene's correlation with specific genes (default): After entering the target gene, you manually specify one or more genes to compare with it. These genes will appear on the X-axis. You can input them one per line in the text box (avoiding extra spaces), or select them from a saved Custom Geneset.
Explore one gene's correlation with all the genes: Only the target gene is needed; the system will automatically calculate correlations between the target gene and all genes available in the selected cohort. Note that this takes a few minutes.

Next, choose the correlation method: Pearson (for linear relationships) or Spearman (for rank-based relationships). Click the Start button to run the analysis.
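The difference between the two methods can be seen on a small example. The sketch below uses only the Python standard library and made-up values; it is meant to illustrate the two coefficients, not to reproduce the implementation used by OmicsBridge.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# A monotonic but non-linear relationship (y = x squared).
target = [1.0, 2.0, 3.0, 4.0, 5.0]
other = [1.0, 4.0, 9.0, 16.0, 25.0]
print(round(pearson(target, other), 3))
print(round(spearman(target, other), 3))
```

Here Spearman is exactly 1 because the ordering of the samples is identical for both genes, while Pearson is slightly lower because the relationship is not a straight line.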

The output will include a table showing the correlation coefficient and p-value for each gene. Clicking on a gene in the table will display a scatter plot on the right, showing the correlation between that gene and the target gene.

Adjustable graph parameters

The size (width and height) of the figure.
The font size of the X and Y axes/labels.
The background colour (change from grey to white).
The colour of the dots.
Whether to show the correlation line.

Mutation analysis

This section allows you to explore and compare the frequency of gene mutations within a selected cohort, provided that mutation data has been uploaded.

To begin, specify the genes you wish to analyse. There are three input options: 1) Enter gene names one per line in the text box (avoiding extra spaces), 2) Choose to analyse all genes in the dataset, or 3) Select genes from a saved Custom Geneset.

By default, the analysis includes all samples in the cohort. However, you may filter the samples based on metadata categories (e.g. treatment group, subtype, or gender) to compare mutation frequencies across different groups.

After clicking the Start button, a table will be generated showing the number of samples with mutations in each gene and the corresponding mutation frequency. In addition, a bar plot will be displayed, visualising the top genes either by mutation count or by mutation frequency, depending on the selected display option.
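The relationship between the count and the frequency in the results table can be sketched as follows. This is a minimal illustration with made-up sample IDs and mutation calls; the data structure and function name are assumptions, not the internal format used by OmicsBridge.

```python
def mutation_summary(mutated_samples, cohort_samples):
    """Return (gene, count, frequency) rows, most frequently mutated first.

    mutated_samples : dict mapping gene name -> sample IDs carrying a mutation
    cohort_samples  : sample IDs included in the analysis (after any filtering)
    """
    included = set(cohort_samples)
    total = len(included)
    rows = []
    for gene, samples in mutated_samples.items():
        count = len(set(samples) & included)
        rows.append((gene, count, count / total))
    rows.sort(key=lambda row: row[1], reverse=True)
    return rows


cohort = ["S1", "S2", "S3", "S4"]
mutated = {"KRAS": ["S2"], "TP53": ["S1", "S2", "S3"]}

# Whole cohort.
print(mutation_summary(mutated, cohort))
# Subset of the cohort, e.g. after filtering on a metadata category.
print(mutation_summary(mutated, ["S1", "S2"]))
```

Filtering the samples changes both the count and the denominator, which is why frequencies can differ markedly between metadata groups.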

Adjustable graph parameters

Whether to show the number of samples with mutations or the mutation frequency.
Whether to show the score on top of each bar.
The number of genes to show in the bar plot.
The size (width and height) of the figure.
The font size of the X and Y axes/labels.
The font sizes of the legend and the score.
Whether to use a white background.
The colour of the bars.

Gene expression across subtypes

When metadata for the cohort is provided and patients can be divided into subtypes, users can compare gene expression across these groups.

Enter your genes of interest and select a category for subtyping from the "Group by" menu. Click "Start comparing" to analyse gene expression across subtypes using statistical tests. For two subtypes, the tool uses the Wilcoxon rank-sum test; for three or more subtypes, it uses the Kruskal-Wallis test. The results table shows the test statistics (W values for two subtypes, H values for three or more) and p-values, sorted by p-value, so genes at the top of the table show the most statistically significant expression differences between subtypes. Click any row to display a visualisation on the right. You can choose between Box plot, Violin plot, Swarm plot, or Violin + Swarm plot formats.
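To give an idea of what the H value represents, the Kruskal-Wallis statistic can be computed by hand from the pooled ranks. The sketch below is a simplified illustration with made-up values: it omits the correction for tied values, and whether OmicsBridge applies such a correction is not stated here. The p-value, which is usually obtained from a chi-squared distribution with (number of groups - 1) degrees of freedom, is not computed.

```python
def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average
        i = j + 1
    return result


def kruskal_h(groups):
    """Kruskal-Wallis H statistic (without tie correction)."""
    pooled = [value for group in groups for value in group]
    n_total = len(pooled)
    pooled_ranks = average_ranks(pooled)
    weighted = 0.0
    start = 0
    for group in groups:
        rank_sum = sum(pooled_ranks[start:start + len(group)])
        weighted += rank_sum ** 2 / len(group)
        start += len(group)
    return 12.0 / (n_total * (n_total + 1)) * weighted - 3 * (n_total + 1)


# Expression of one gene in three hypothetical subtypes.
subtype_a = [1.0, 2.0, 3.0]
subtype_b = [4.0, 5.0, 6.0]
subtype_c = [7.0, 8.0, 9.0]
print(round(kruskal_h([subtype_a, subtype_b, subtype_c]), 2))
```

A larger H means the rank sums of the subtypes differ more than expected by chance; identical rank distributions give an H close to 0.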

Adjustable graph parameters

The size (width and height) of the figure.
The font size of the X and Y axes/labels.
The size of the graph title.
The colour for the high- and low-expression group.

Signature analysis

Deconvolution analysis

Compare cohorts

Cancer Gene Census (COSMIC)

Manage the cohort database

Select a cohort dataset to view its details on the right. Three tables will be displayed in the "View the data" section: Gene expression, Patient survival information, and Metadata. You can also upload your own cohort from the "upload own cohort" sub-section.

Pre-installed cohort

TCGA data (34 cancer types, see the table below) is available as pre-installed cohorts. This includes mRNA sequencing results, clinical information, metadata and mutation data downloaded from UCSC Xena, with gene expression values transformed as log2(RSEM normalised count+1).
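The transformation mentioned above is simple to reproduce or invert, which is useful when comparing the pre-installed cohorts with your own data. The following sketch only illustrates the arithmetic; the function names are chosen for illustration.

```python
from math import log2


def log2_plus_one(count):
    """Transform a normalised count as log2(count + 1)."""
    return log2(count + 1)


def inverse_log2_plus_one(value):
    """Recover the normalised count from a log2(count + 1) value."""
    return 2 ** value - 1


print(log2_plus_one(0))            # 0.0
print(log2_plus_one(1023))         # 10.0
print(inverse_log2_plus_one(10))   # 1023
```

Adding 1 before taking the logarithm keeps genes with a count of zero defined (they map to 0 rather than to negative infinity).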

TCGA abbreviation

Abbreviation	Cancer type
TCGA_ACC	Adrenocortical carcinoma
TCGA_BLCA	Bladder Urothelial Carcinoma
TCGA_BRCA	Breast invasive carcinoma
TCGA_CESC	Cervical squamous cell carcinoma and endocervical adenocarcinoma
TCGA_CHOL	Cholangiocarcinoma
TCGA_COAD	Colon adenocarcinoma
TCGA_DLBC	Lymphoid Neoplasm Diffuse Large B-cell Lymphoma
TCGA_ESCA	Esophageal carcinoma
TCGA_GBM	Glioblastoma multiforme
TCGA_HNSC	Head and Neck squamous cell carcinoma
TCGA_KICH	Kidney Chromophobe
TCGA_KIRC	Kidney renal clear cell carcinoma
TCGA_KIRP	Kidney renal papillary cell carcinoma
TCGA_LAML	Acute Myeloid Leukemia
TCGA_LGG	Brain Lower Grade Glioma
TCGA_LIHC	Liver hepatocellular carcinoma
TCGA_LUAD	Lung adenocarcinoma
TCGA_LUSC	Lung squamous cell carcinoma
TCGA_MESO	Mesothelioma
TCGA_PAAD	Pancreatic adenocarcinoma
TCGA_PCPG	Pheochromocytoma and Paraganglioma
TCGA_PRAD	Prostate adenocarcinoma
TCGA_READ	Rectum adenocarcinoma
TCGA_SARC	Sarcoma
TCGA_SKCM	Skin Cutaneous Melanoma
TCGA_TGCT	Testicular Germ Cell Tumors
TCGA_THCA	Thyroid carcinoma
TCGA_THYM	Thymoma
TCGA_UCEC	Uterine Corpus Endometrial Carcinoma
TCGA_UCS	Uterine Carcinosarcoma
TCGA_UVM	Uveal Melanoma
TCGA_COADREAD	Colon and Rectal Cancer
TCGA_GBMLGG	Lower Grade Glioma and Glioblastoma
TCGA_LUNG	Lung Cancer

How to upload your own cohort

Users can upload their own cohort and analyse it here. Three files (Gene expression, Clinical data and Metadata) must be uploaded. Optionally, mutation data can be added. Each file must follow the format described below.

1. Gene expression

A tab-delimited table of gene expression for each sample (genes × samples, where each sample is a patient) from bulk RNA-seq (or microarray).

Ensure the data is already normalised before uploading, as the interface does not perform normalisation automatically.
Rows (index): gene names.
Columns (headers): sample names that match those used in your clinical data.
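Before uploading, you may want to check that your file has the expected layout. The sketch below reads a small tab-delimited table with Python's standard `csv` module. The gene names, sample names, values and the label of the top-left header cell are all made up for illustration, and the checks are assumptions about what a well-formed file looks like rather than the validation performed by OmicsBridge.

```python
import csv
import io


def read_expression(handle):
    """Read a genes x samples table; return (sample_names, {gene: values})."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader)
    samples = header[1:]          # first cell labels the gene column
    table = {}
    for row in reader:
        if not row:
            continue              # skip empty lines
        gene, values = row[0], row[1:]
        if len(values) != len(samples):
            raise ValueError(
                f"Row for {gene!r} has {len(values)} values, "
                f"expected {len(samples)}"
            )
        table[gene] = [float(v) for v in values]  # fails on non-numeric cells
    return samples, table


# Hypothetical file content: two genes, three samples.
example = (
    "gene\tS1\tS2\tS3\n"
    "GENE_A\t5.1\t6.3\t4.8\n"
    "GENE_B\t0.0\t2.4\t1.7\n"
)
samples, table = read_expression(io.StringIO(example))
print(samples)
print(table["GENE_A"])
```

For a real file, replace `io.StringIO(example)` with an opened file handle, for example `open("expression.tsv", newline="")`, where the file name is a placeholder.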

Example

2. Patient survival information

A tab-delimited table containing survival information such as overall survival and progression-free survival (the data needed to generate a Kaplan-Meier curve or run a survival analysis). Please follow these rules.

The first column must contain sample IDs and should have the header named sample (in all lowercase). All sample IDs should exactly match those used in your gene expression and clinical data.
All other columns must represent pairs of event data: one column for the event status (censoring), with binary values 1 (event occurred) or 0 (censored), and one corresponding column for the event time (in days), labelled with the same event name followed by .time.
For example, for Overall Survival (OS), use one column named OS for the event status and another column named OS.time for the number of days until the event or censoring. Similarly, for other types of events (e.g. DSS, DFI, PFI), follow the same format: DSS and DSS.time, DFI and DFI.time, PFI and PFI.time, etc.
You may include other columns in the dataset that do not follow the event/time pair format. These columns will be safely ignored and will not affect the analysis.
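The rules above can be checked with a short script before uploading. This is a sketch under stated assumptions: the table content is made up, the function names are hypothetical, and the script assumes there are no missing values. It does not reproduce the checks performed by OmicsBridge itself.

```python
import csv
import io


def find_event_pairs(header):
    """Return event names that have both a status and a '.time' column."""
    if not header or header[0] != "sample":
        raise ValueError("The first column must be named 'sample'")
    columns = set(header)
    return [name for name in header[1:]
            if not name.endswith(".time") and f"{name}.time" in columns]


def check_rows(rows, events):
    """Check that each status is 0 or 1 and each time is numeric."""
    for row in rows:
        for event in events:
            if row[event] not in ("0", "1"):
                raise ValueError(f"{row['sample']}: {event} must be 0 or 1")
            float(row[f"{event}.time"])  # raises ValueError if not numeric


# Made-up table; 'age' is an extra column that is not part of any pair.
example = (
    "sample\tOS\tOS.time\tPFI\tPFI.time\tage\n"
    "S1\t1\t350\t1\t200\t61\n"
    "S2\t0\t1200\t0\t1200\t54\n"
)
reader = csv.DictReader(io.StringIO(example), delimiter="\t")
events = find_event_pairs(reader.fieldnames)
check_rows(list(reader), events)
print(events)
```

Columns such as `age` in this example are not reported as events because they have no matching `.time` column.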

Example

3. Metadata

Please upload a tab-delimited (.tsv) table containing metadata for the samples (patients) in your cohort. This may include information such as treatment condition, gender, grade, or cancer subtype.

The first column must contain the sample IDs, and the header for this column must be sample (all lowercase). All sample IDs should exactly match those used in your gene expression and clinical data.

If you do not have any metadata to include, please upload a .tsv file that contains only the sample IDs in the first column with the header sample. This ensures consistency and allows the interface to process the data correctly.
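Such a placeholder file can be produced from your list of sample IDs in a few lines. The sketch below is illustrative only; the sample IDs, the function name and the output file name are assumptions.

```python
import csv
import io


def minimal_metadata(sample_ids):
    """Return the text of a .tsv file with a single 'sample' column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["sample"])
    for sample_id in sample_ids:
        writer.writerow([sample_id])
    return buffer.getvalue()


content = minimal_metadata(["S1", "S2", "S3"])
print(content, end="")

# To save it, write the text to a file (the file name is a placeholder):
# with open("metadata.tsv", "w", newline="") as handle:
#     handle.write(content)
```

Using the sample IDs taken from the header of your gene expression table helps keep the two files consistent.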

Example
