- The leukemia gene expression dataset available at the URL below consists of 72 subjects (patients) and 3571 genes.
http://www.ams.sunysb.edu/~pfkuan/Teaching/AMS597/Data/leukemiaDataSet.txt
Each patient has either type ALL (acute lymphocytic leukemia) or type AML (acute myelogenous leukemia). Using the genes as covariates, we will construct a model that predicts the type of leukemia as follows:
- First, randomly split the data into two subsets containing 50 (trainData) and 22 (testData) subjects, respectively:
dat <- read.delim("http://www.ams.sunysb.edu/~pfkuan/Teaching/AMS597/Data/leukemiaDataSet.txt", header=TRUE, sep="\t")
str(dat)
set.seed(123)
trainID <- sample(1:72, round(0.7*72)) ### round(0.7*72) = 50 training subjects
trainData <- dat[trainID,]
testData <- dat[-trainID,]
- Build your best model on trainData using the Group variable as response and the genes as predictors/covariates.
- Evaluate your model from the previous step on testData by computing the percentage of AML cases correctly predicted, the percentage of ALL cases correctly predicted, and the overall percentage of AML and ALL cases correctly predicted.
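The three percentages in the evaluation step can be computed directly from the true and predicted labels. The following is only a minimal sketch: the helper name `classRates` and the toy label vectors are hypothetical, and it assumes both vectors hold the labels "ALL" and "AML".

```r
# Hypothetical helper: percentage correctly predicted per class and overall.
# `truth` and `pred` are assumed to be equal-length vectors of "ALL"/"AML" labels.
classRates <- function(truth, pred) {
  stopifnot(length(truth) == length(pred))
  truth <- as.character(truth)
  pred <- as.character(pred)
  c(AML = 100 * mean(pred[truth == "AML"] == "AML"),   # % of AML correctly predicted
    ALL = 100 * mean(pred[truth == "ALL"] == "ALL"),   # % of ALL correctly predicted
    overall = 100 * mean(pred == truth))               # % of all subjects correctly predicted
}

# Toy labels, made up for illustration (not results from the leukemia data)
truth <- c("ALL", "ALL", "ALL", "AML", "AML")
pred  <- c("ALL", "ALL", "AML", "AML", "AML")
rates <- classRates(truth, pred)
print(rates)
```

With a fitted model, `truth` would presumably be `testData$Group` and `pred` the class labels predicted for testData.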
- Write a function that will generate and return a random sample of size n from the two-parameter exponential distribution $\mathrm{Exp}(\lambda, \eta)$ for arbitrary n, $\lambda$, and $\eta$ using the inverse transform method. Note that the pdf of $X \sim \mathrm{Exp}(\lambda, \eta)$ is
$f(x) = \lambda e^{-\lambda (x - \eta)}, \quad x \ge \eta,$
and $\lambda > 0$, $\eta > 0$. Generate a random sample of size 1000 from the Exp(2,1) distribution.
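One way to approach the inverse transform step is sketched below. It assumes the first parameter is a rate (written `lambda`) and the second a location shift (written `eta`), so that the CDF is F(x) = 1 - exp(-lambda (x - eta)) for x >= eta and its inverse is eta - log(1 - u) / lambda. The function name `rexp2` is hypothetical.

```r
# Hypothetical sampler for the two-parameter exponential via inverse transform.
# Assumes pdf f(x) = lambda * exp(-lambda * (x - eta)) for x >= eta.
rexp2 <- function(n, lambda, eta) {
  stopifnot(n >= 1, lambda > 0)
  u <- runif(n)                 # U ~ Uniform(0, 1)
  eta - log(1 - u) / lambda     # inverse CDF applied to U
}

set.seed(1)
x <- rexp2(1000, lambda = 2, eta = 1)
# Under these assumptions the theoretical mean is eta + 1/lambda = 1.5,
# so the sample mean should be close to that value.
mean(x)
```

A histogram of `x` with the density curve overlaid is a quick visual check that the sample has the expected shape.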
- Write a function to generate a random sample of size n from the standard Cauchy distribution with pdf $f(x) = \frac{1}{\pi (1 + x^2)}$, $-\infty < x < \infty$, using the inverse transform method. Generate a random sample of size 1000 from the standard Cauchy distribution.
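For the standard Cauchy distribution the CDF is F(x) = 1/2 + arctan(x)/pi, so the inverse is tan(pi (u - 1/2)). A minimal sketch follows; the function name `rcauchyInv` is hypothetical.

```r
# Hypothetical sampler for the standard Cauchy via inverse transform.
rcauchyInv <- function(n) {
  stopifnot(n >= 1)
  u <- runif(n)            # U ~ Uniform(0, 1)
  tan(pi * (u - 0.5))      # inverse CDF applied to U
}

set.seed(1)
x <- rcauchyInv(1000)
# The Cauchy distribution has no mean, so compare quantiles instead:
# the theoretical quartiles are -1, 0 and 1.
quantile(x, c(0.25, 0.5, 0.75))
```

Because of the heavy tails, the sample mean is not a useful check here; comparing sample quantiles with `qcauchy` is more informative.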
- Write a function to generate a random sample of size n from the Beta(a,b) distribution by the acceptance-rejection method. Generate a random sample of size 1000 from the Beta(3,2) distribution.
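A possible acceptance-rejection sketch uses Uniform(0,1) as the proposal, which is a choice made here and not one prescribed by the problem. It assumes a > 1 and b > 1, so the Beta density is bounded with its maximum at the mode (a - 1)/(a + b - 2). The normalizing constant cancels in the acceptance ratio, so only the kernel x^(a-1) (1-x)^(b-1) is needed. The function name `rbetaAR` is hypothetical.

```r
# Hypothetical acceptance-rejection sampler for Beta(a, b) with a Uniform(0, 1) proposal.
# Assumes a > 1 and b > 1 so that the target density is bounded.
rbetaAR <- function(n, a, b) {
  stopifnot(n >= 1, a > 1, b > 1)
  xmode <- (a - 1) / (a + b - 2)                         # where the density peaks
  kernel <- function(x) x^(a - 1) * (1 - x)^(b - 1)      # unnormalized Beta density
  out <- numeric(0)
  while (length(out) < n) {
    y <- runif(n)                                        # candidates from the proposal
    u <- runif(n)
    # accept when u <= f(y) / (c * g(y)), with g = 1 and c = f(xmode)
    out <- c(out, y[u <= kernel(y) / kernel(xmode)])
  }
  out[1:n]
}

set.seed(1)
x <- rbetaAR(1000, a = 3, b = 2)
# Beta(3, 2) has mean 3/5 = 0.6; the sample mean should be close to it.
mean(x)
```

For Beta(3,2) the bounding constant under this proposal is c = f(2/3) = 16/9, so roughly 9/16 of the candidates are accepted.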
- Write a function to generate a random sample of size n from the Gamma($\alpha$,1) distribution by the acceptance-rejection method. Generate a random sample of size 1000 from the Gamma(3,1) distribution. (Hint: you may use $g(x) \sim \mathrm{Exp}(\lambda = 1/\alpha)$ as your proposal distribution, where $\lambda$ is the rate parameter. Figure out the appropriate constant $c$.)
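A sketch of the acceptance-rejection step for the Gamma problem follows. It assumes a shape parameter `alpha` >= 1 and an exponential proposal with rate 1/alpha, as the hint suggests. Under these assumptions the ratio f(x)/g(x) is maximized at x = alpha, giving c = alpha^alpha exp(1 - alpha) / Gamma(alpha), and the acceptance ratio simplifies to (y/alpha)^(alpha-1) exp(-(1 - 1/alpha)(y - alpha)). The function name `rgammaAR` is hypothetical.

```r
# Hypothetical acceptance-rejection sampler for Gamma(alpha, 1).
# Assumes shape alpha >= 1 and an Exp(rate = 1/alpha) proposal.
rgammaAR <- function(n, alpha) {
  stopifnot(n >= 1, alpha >= 1)
  out <- numeric(0)
  while (length(out) < n) {
    y <- rexp(n, rate = 1 / alpha)     # candidates from the proposal
    u <- runif(n)
    # f(y) / (c * g(y)) with c = alpha^alpha * exp(1 - alpha) / gamma(alpha)
    ratio <- (y / alpha)^(alpha - 1) * exp(-(1 - 1 / alpha) * (y - alpha))
    out <- c(out, y[u <= ratio])       # keep accepted candidates
  }
  out[1:n]
}

set.seed(1)
x <- rgammaAR(1000, alpha = 3)
# Gamma(3, 1) has mean 3 and variance 3; the sample values should be close.
c(mean = mean(x), var = var(x))
```

For alpha = 3 this gives c = 27 exp(-2) / 2, about 1.83, so a little over half of the candidates are accepted.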