The file used in this assignment is the one combining the clinical data and genetic data provided in the two previous modules (BRCAMerged.csv).


A number of functions have been provided in cells #1-15 of the attached notebook (AssignmentWeek6.pdf). The corresponding script is provided as a Jupyter notebook and an R script, both called AssignmentWeek6.


The task in this assignment is to compare two clusterings of this dataset (KMeans and PAM) and to map selected genes to biological pathways, in a manner similar to the worksheet.


1) Download the attached files and place them in the same folder:



  • BRCAMerged.csv

  • AssignmentWeek6.ipynb

  • AssignmentWeek6.R


2) Run the script either as AssignmentWeek6.ipynb (Jupyter Notebook installation) or as AssignmentWeek6.R (RStudio installation).


3) Add the scripts in cells #8 to #19, taking inspiration from the corresponding scripts in the worksheet:



  • Cell #8: select features by running bsswss between the two classes (cancer / normal) and keep the 150 top-ranked features in mrnaDataReduced

  • Cell #9: cluster mrnaDataReduced into 2 clusters with KMeans

  • Cell #10: display a table showing how the two classes are distributed across the two clusters

  • Cell #11: plot the two clusters

  • Cell #12: cluster mrnaDataReduced into 2 clusters with PAM

  • Cell #13: display a table showing how the two classes are distributed across the two clusters

  • Cell #14: plot the two clusters

  • Cell #15: perform gene set enrichment on the 22 top-ranked genes. Note: because the gene names are separated by a period instead of |, two of the lines in this cell need to be changed:
    hugoNames

    # extract HUGO names
    entrezNames

    # extract Entrez names

  • Cell #16: display a bar plot of the selected pathways

  • Cell #17: display a gene concept network with cnetplot

  • Cell #18: display a gene concept network with cnetplot in circular format

  • Cell #19: display an enrichment map
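The feature-selection and clustering steps in cells #8 to #13 can be sketched roughly as follows. This is a minimal illustration on synthetic data, not the notebook's own code: the helper `bssWssRatio`, the variables `exprData`, `classLabels` and `reducedData`, and the choice of 10 features are all assumptions made for the example (the assignment itself uses the provided bsswss function and keeps 150 features from BRCAMerged.csv). It relies only on `kmeans` from base R and `pam` from the `cluster` package.

```r
library(cluster)  # provides pam()

set.seed(42)

# Synthetic stand-in for the expression matrix: 60 samples x 200 features.
# Only the first 10 features differ between the two classes (assumption for the demo).
nSamples    <- 60
nFeatures   <- 200
classLabels <- rep(c(0, 1), each = nSamples / 2)
exprData    <- matrix(rnorm(nSamples * nFeatures), nrow = nSamples)
exprData[classLabels == 1, 1:10] <- exprData[classLabels == 1, 1:10] + 4
colnames(exprData) <- paste0("gene", seq_len(nFeatures))

# Hypothetical helper (not the notebook's bssWssFast):
# ratio of between-class to within-class sum of squares, per feature.
bssWssRatio <- function(X, classVec) {
  overallMean <- colMeans(X)
  bss <- numeric(ncol(X))
  wss <- numeric(ncol(X))
  for (k in unique(classVec)) {
    Xk        <- X[classVec == k, , drop = FALSE]
    classMean <- colMeans(Xk)
    bss <- bss + nrow(Xk) * (classMean - overallMean)^2
    wss <- wss + colSums(sweep(Xk, 2, classMean)^2)
  }
  bss / wss
}

# Rank features by the ratio and keep the top 10 (the assignment keeps 150).
ratios      <- bssWssRatio(exprData, classLabels)
topFeatures <- order(ratios, decreasing = TRUE)[1:10]
reducedData <- exprData[, topFeatures]

# KMeans with 2 clusters, then a class-by-cluster table.
kmFit   <- kmeans(reducedData, centers = 2, nstart = 10)
kmTable <- table(class = classLabels, cluster = kmFit$cluster)

# PAM with 2 clusters, then the same kind of table.
pamFit   <- pam(reducedData, k = 2)
pamTable <- table(class = classLabels, cluster = pamFit$clustering)

print(kmTable)
print(pamTable)
```

Comparing the two tables side by side is one simple way to see whether KMeans and PAM assign the classes to clusters differently. Cluster numbering is arbitrary, so cluster 1 under KMeans need not correspond to cluster 1 under PAM.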


4) Turn in the assignment as a plain R script file (not Jupyter Notebook file), attached to your submission.


Note: the file BRCAMerged.csv can also be downloaded from Google Drive: https://drive.google.com/file/d/1I8yySge8gTfKR2WlpQ_Q1SSAR-O8dtwn/view?usp=sharing
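Regarding the note under Cell #15 about gene names separated by a period: one possible way to separate a symbol from an identifier is sketched below. It assumes, purely for illustration, that each column name has the form SYMBOL.EntrezID; the vector `geneLabels` and the names `hugoExample` and `entrezExample` are hypothetical and are not the lines from the notebook. The split is made at the last period, because a symbol may itself contain a period once R has sanitized the column names (for example, a hyphen may be turned into a period when the file is read).

```r
# Hypothetical column names in the assumed "SYMBOL.EntrezID" style.
geneLabels <- c("TP53.7157", "BRCA1.672", "HLA.A.3105")

# Everything before the last period is treated as the HUGO symbol.
hugoExample <- sub("\\.[^.]*$", "", geneLabels)

# Everything after the last period is treated as the Entrez ID.
entrezExample <- sub("^.*\\.", "", geneLabels)

print(data.frame(label = geneLabels, hugo = hugoExample, entrez = entrezExample))
```

Note that `strsplit(x, ".")` without `fixed = TRUE` treats the period as a regular expression matching any character, which is a common source of errors when adapting code that previously split on `|`.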


Answered 1 day after Aug 14, 2021

Saravana answered on Aug 15 2021
# cell #1 - run the first time only, then only run from cell #2
install.packages("cluster")
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
BiocManager::install("ReactomePA")
BiocManager::install("org.Hs.eg.db")
BiocManager::install("DOSE")
install.packages("class")
# cell #2
library(cluster)
library(ReactomePA)
library(org.Hs.eg.db)
library(class)
# cell #3 load the dataset, which has patients as rows and variables as columns
# adjust the path below to the folder that contains BRCAMerged.csv
setwd("/media/priyan/files/GreyNodes/Assignment36")
mrnaNorm <- read.table("BRCAMerged.csv", header = TRUE, sep = ",")
class(mrnaNorm)
sampClass <- lapply(mrnaNorm[,"type"], function(t) (if (t == "MN") return("0") else return("1")))
mrnaClass <- as.data.frame(sampClass)
dim(mrnaClass)
table(unlist(sampClass))
#   0    1
# 112 1100
# cell #4
sampClassNum <- lapply(mrnaNorm[,"type"], function(t) (if (t == "MN") return(0) else return(1)))
mrnaClassNum <- as.data.frame(sampClassNum)
table(unlist(mrnaClassNum))
#   0    1
# 112 1100
# cell #5
geneNames <- as.data.frame(colnames(mrnaNorm[,-c(1:40)])) # extract the gene names from mrnaNorm as column names after column 40
dim(geneNames)
# 20531 genes
# cell #6
mrnaData <- mrnaNorm[,-c(1:40)]
dim(mrnaData)
# 1212 patients and 20531 gene expression values
gc()
# cell #7
bssWssFast <- function (X, givenClassArr, numClass=2)
# between squares / within square feature selection
{
    classVec <-...