The code and documentation for my BSc project in the area of cancer genomics
Alireza Tajmirriahi 5f7ebec844 Update '1-Preprocessing/README.md' 10 months ago

Preprocess data

This stage involves four scripts:

  • expressionSummarizer.r prepares the expression data columns and ID values.
  • geneIdConverter.r uses a Bioconductor annotation package to unify gene symbols.
  • prepare_mutation.py prepares the mutation data by removing unneeded columns and unwanted rows.
  • preprocessing.py generates a summary for each cancer, which the next stage aggregates.
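
As a rough illustration of the kind of cleanup described for prepare_mutation.py, the sketch below drops extra columns and filters out unwanted rows from a tab-separated table. It is not the script's actual logic: the column names, the filter rule, and the two sample rows are all assumptions made up for this example.

```python
import csv
import io

# Hypothetical column names and filter rule; the real script may differ.
KEEP_COLUMNS = ["Hugo_Symbol", "Variant_Classification", "Tumor_Sample_Barcode"]
DROP_CLASSES = {"Silent"}


def prepare_mutations(rows):
    """Keep only KEEP_COLUMNS and skip rows whose class is in DROP_CLASSES."""
    cleaned = []
    for row in rows:
        if row["Variant_Classification"] in DROP_CLASSES:
            continue  # unwanted row
        cleaned.append({col: row[col] for col in KEEP_COLUMNS})
    return cleaned


# Invented sample input, used only to exercise the function.
raw = io.StringIO(
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\tCenter\n"
    "TP53\tMissense_Mutation\tS1\tX\n"
    "KRAS\tSilent\tS2\tX\n"
)
result = prepare_mutations(csv.DictReader(raw, delimiter="\t"))
```

Here `result` holds a single row for TP53, without the `Center` column.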

Run either Python script with the --help flag to learn more about its input arguments.
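
For readers unfamiliar with how a --help flag usually comes about in Python, here is a minimal sketch using the standard argparse module, which adds -h/--help automatically. The argument names `--input` and `--output` are hypothetical and are not taken from the scripts in this directory.

```python
import argparse


def build_parser():
    """Build a parser with two hypothetical arguments (not the real scripts' flags)."""
    parser = argparse.ArgumentParser(description="Hypothetical preprocessing script.")
    parser.add_argument("--input", required=True, help="path to the raw input file")
    parser.add_argument("--output", required=True, help="path for the processed output")
    return parser


parser = build_parser()

# Parse an explicit argument list instead of sys.argv so the example is self-contained.
args = parser.parse_args(["--input", "raw.tsv", "--output", "clean.tsv"])

# The same text that would be printed when the script is run with --help.
help_text = parser.format_help()
```

Each `help=` string appears next to its flag in the generated help text, which is why running a script with --help is a quick way to discover its inputs.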