#########################################################
################### BTRY 4520 ###########################
################## Homework 5 ###########################
############## Due: May 3, 2019 #########################
#########################################################
# Instructions: save this file in the format
# Complete each question using code below the question number.
# You need only upload this file to CMS.
# Note, we assume your working directory contains any files
# that accompany this one.
# Further note: 10% will be deducted if your file produces
# an error when run. If your code produces an error and you
# cannot find it, comment out that portion of the code and we
# will give partial credit for it.
# Do not use either the function set.seed() or rm(list=ls())
# in your code.
####################################
# Question 1: Logistic Regression #
####################################
# Logistic regression is used to model a binary outcome in terms
# of covariates. That is, we have data (Y_i,X_i) where Y_i is
# either 0 or 1, and we need to model P(Y = 1|X).
# Here we observe the constraints that this is a probability
# and must be between 0 and 1. A common way to achieve this is
# to write
#
# P(Y = 1|X,b) = g( b0 + b1*X1 + ... + bp*Xp )
#
# for some coefficients b = (b0, b1, ..., bp).
# The function g now makes sure that the input is between 0 and 1.
# A classical choice of this is
#
# g(z) = exp(z)/(1+exp(z))
#
# Below you may also find it useful to observe that
#
# g'(z) = g(z)(1 - g(z))
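# As a quick illustration (g.example is just for this comment; R's
# built-in plogis computes the same function):
g.example = function(z) exp(z)/(1 + exp(z))
g.example(0)  # 0.5: a log-odds of zero corresponds to probability 1/2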
# For this question we will use the data in
grains = read.table("SpiderBeach.txt", header=TRUE)
# which records the size of sand grains on 28 Japanese beaches
# and whether the burrowing wolf spider was found there. If
# we plot
plot( spiders ~ grain.size, data = grains)
# we can see an apparent increase in the likelihood of coming
# across spiders as grain sizes increase.
# A) Implement a function to take a matrix X, and a vector b and
# return the vector of probabilities P(Y=1|X)
LogistProb = function(X,b)
{
}
# B) Given data, we would like to estimate b. To do this we will
# maximize the likelihood of our observed outcomes. That is
#
# L(b) = prod P(Y=1|X,b)^Y * (1 - P(Y=1|X,b))^(1-Y)
#
# (if you think about this, the terms in the product are
# P(Y=1|X,b) if Y is 1, and P(Y=0|X,b) if Y is 0).
#
# However, we generally use the log of the likelihood
#
# l(b) = sum Y log P(Y=1|X,b) + (1-Y) log( 1 - P(Y=1|X,b) )
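#
# A numerical illustration on made-up values (ll.p and ll.y are
# hypothetical probabilities and outcomes, not part of the assignment):
ll.p = plogis(c(-1, 0.5, 2))  # plogis is exactly the g above
ll.y = c(0, 1, 1)
sum( ll.y*log(ll.p) + (1 - ll.y)*log(1 - ll.p) )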
# Write functions to evaluate l(b) (here we assume that X is
# NOT given with a column of 1s for the b0 term, but you can
# put it in if you like).
Logist.loglik = function(b,X,Y)
{
}
# its gradient with respect to b.
Logist.dloglik = function(b,X,Y)
{
}
# And its hessian with respect to b
Logist.d2loglik = function(b,X,Y)
{
}
# Note, it may help to represent these as matrix multiplications.
# You will likely get an expression of the form
#
# H_jk = sum_i w_i X_ij X_ik
#
# which is also obtained by t(X)%*%W%*%X if W is diagonal.
# C) Write a function to maximize the log likelihood
# using a Newton-Raphson method. Assume that you start
# from b = 0. You need only return the maximizing value of b.
Logist.NR = function(X,Y,tol=1e-8,maxit=1000)
{
}
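# As a sanity check for your Newton-Raphson logic, here is a minimal
# self-contained sketch maximizing f(b) = -(b - 2)^2, whose gradient is
# -2*(b - 2) and Hessian is -2; the same solve(Hessian, gradient) update
# applies with the logistic gradient and Hessian above:
nr.demo = function(b = 0, tol = 1e-8, maxit = 100) {
  for (it in 1:maxit) {
    step = solve(-2, -2*(b - 2))  # Newton step: Hessian^{-1} times gradient
    b = b - step
    if (max(abs(step)) < tol) break
  }
  b
}
nr.demo()  # converges to 2 (in one step, since f is quadratic)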
# Provide an estimate of the parameters for the SpiderBeach
# data. You can check this with the glm function (make sure to
# specify family=binomial for logistic regression).
# Plot the data and add a line giving the predicted probability
# as grain size changes, based on 101 equally spaced points over
# the range of x values. Upload this as HW5Q1C.png.
# D) Write a function to start with an estimated b and
# conduct a parametric bootstrap. That is, construct new
# data by simulating new Ys from P(Y=1|X,b), then re-estimate
# bs and construct percentile bootstrap confidence intervals
# for b, using the input bs as your estimate.
# In this case you should also produce a bias correction. Your
# function should return a 2-by-p matrix with columns indicating
# the different entries in b.
Logist.boot = function(X,b,nboot = 1000)
{
}
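# The resampling step inside Logist.boot is just a Bernoulli draw from the
# fitted probabilities; a tiny illustration with made-up probabilities
# (prob.demo is hypothetical):
prob.demo = c(0.2, 0.7, 0.9)
rbinom(length(prob.demo), size = 1, prob = prob.demo)  # one simulated Y vector
# Percentile intervals then come from quantile(..., c(0.025, 0.975)) applied
# to each coordinate of the bootstrap estimates.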
# Provide confidence intervals for b0 and b1 in the SpiderBeach
# data.
# BONUS: The lm function in R will give estimates that minimize
# a weighted sum of squared errors:
#
# sum W_i (Y_i - X_i b)^2
#
# with estimate
#
# bhat = (t(X)%*%W%*%X)^{-1} t(X)%*%W%*%Y
#
# where W is a diagonal matrix. Describe how you could use lm
# in place of explicit Newton-Raphson calculations.
source("HW5Q1.R")
# Part A
all.equal( as.vector(LogistProb(matrix(c(0.4,0.6,0.8),3,1),c(1,2))),
c(0.8581489,0.9002495,0.9308616),tol=1e-6)
# Part B
all.equal( Logist.loglik(c(0,0.25),matrix(grains$grain.size,28,1),grains$spiders), -18.56395,tol=1e-6)
all.equal( Logist.dloglik(c(0,0.25),matrix(grains$grain.size,28,1),grains$spiders),
matrix(c(4.097711, 3.099247),2,1),tol=1e-6)
all.equal( Logist.d2loglik(c(0,0.25),matrix(grains$grain.size,28,1),grains$spiders)[1,],
c(-6.965371, -3.592547),tol=1e-6)
# Part C
all.equal( Logist.NR(matrix(grains$grain.size[-c(5,10)],26,1),grains$spiders[-c(5,10)]),
matrix(c(-2.205532,5.951406),2,1), tol=1e-6)
# Part D
set.seed(3042019)
all.equal( Logist.boot(matrix(grains$grain.size[-c(5,10)],26,1),c(-2,5))[,1],
c(-3.113813, 3.842143),tol=1e-2)
################################################
# Question 2: Constrained Levenberg-Marquardt #
################################################
# So far, our optimization methods have all assumed that the maximum is inside
# the region that we are looking at. However, it's possible that we want to define
# a boundary. If the function is still increasing at the boundary, you'd use the
# boundary (or some point along it) as your maximum.
#
# Note that it is important that you be able to return a point that is exactly on
# the boundary.
#
# In statistics, it's often natural to assume that at least some parameters are
# positive (variances, for example, are always non-negative).
#
# Here we will examine the problem for beta = (beta_1, ..., beta_p)
#
# beta = argmax f(beta)   such that   beta_j >= 0   for some particular j
#
# and we will use a modification of the Levenberg-Marquardt algorithm (see code in
# Lecture 10) to do this. Here we will use a three-dimensional
#
# beta = (beta_0,beta_1,beta_2)
#
# where we want beta_2 > 0. Note that the indices in R shift this so that
# beta_2 is given by beta[3] in a vector in R.
# However, we will start by producing a Levenberg-Marquardt modification
# to the Gauss-Newton algorithm without any constraints.
# Our motivation for this is the data
troutdata = read.table("troutpcb.txt", header=T)
# which contains records of PCB levels of trout in Cayuga lake in 1972. There
# are 28 trout with their PCB levels and age in the data.
# We will want to minimize the squared error for the model
#
# log(PCB) = beta_0 + beta_1 * Age/(1 + beta_2 + Age) + e
#
# in other words
#
# y = f(beta,Age) + e
#
# where beta_2 > 0. We will fit this with a Levenberg-Marquardt modification
# of Gauss-Newton methods, so we will need functions to calculate the vector
# of f(beta,Age) and J(beta,Age) = df/dbeta. These will be the arguments of
# fn and J in the functions below.
#
# As with Gauss-Newton methods, you can approximate the Hessian of squared error:
# H = t(J)%*%J
# The derivative of squared error is of course t(J)%*%(y - f)
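# Putting these together, one common form of the damped step solves
# (t(J)%*%J + lambda*I) %*% delta = t(J)%*%(y - f). A self-contained
# illustration of that linear algebra on made-up values (Jdemo, rdemo and
# lambda.demo are all hypothetical; your lecture code may damp differently):
Jdemo = matrix(c(1,1,1, 1,2,3), 3, 2)
rdemo = c(0.1, -0.2, 0.05)   # stand-in for the residual vector y - f
lambda.demo = 1
solve( t(Jdemo)%*%Jdemo + lambda.demo*diag(2), t(Jdemo)%*%rdemo )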
# Ai. Write functions fn to calculate f(beta,Age) and J to calculate
# J(beta,Age), implementing the formulas above.
# fn should return a vector of values, one for each element of the vector Age;
# J should return an n-by-3 matrix, with one row for each element of the vector Age.
fn = function(beta,Age){
}
J = function(beta,Age){
}
# Aii. Implement the Levenberg-Marquardt modification of the Gauss-Newton algorithm.
# That is, we use
#
# t(J(beta,Age))%*%(log(PCB) - fn(beta,Age))
#
# for the negative gradient of squared error and
#
# t(J(beta,Age))%*%J(beta,Age)
#
# for the Hessian. You may evaluate
#
# sum( (log(PCB) - fn(beta,Age))^2 )
#
# within your algorithm without coding a function to calculate this. Remember that
# in this case you are minimizing, so the Hessian should be positive definite,
# not negative definite. Your Levenberg-Marquardt algorithm should be modified to
# make sure this is the case. (Yes, you need to work out what that modification is.)
# Your function should return the solution and iteration number as in implementations
# of GaussNewton or LevenbergMarquardt in Lecture10_code.r
GNLevMarq = function(beta,X,Y,fn,J,tol=1e-8,maxit=100){
return(list(beta=beta,iter=iter,iterhist=iterhist))
}
# Try this on the troutpcb data using a starting point
beta0 = c(-1,5,0.1)
result1 = GNLevMarq(beta0,troutdata$Age,log(troutdata$PCB),fn,J)
# Now we will consider adding a boundary condition.
#
# To enforce the boundary conditions, we modify the algorithm in the following manner
# for the current estimate beta:
#
# If beta is _not_ on the boundary (ie beta_j > 0)
# Calculate the Levenberg-Marquardt step with the current value of lambda.
# That is, calculate beta* = beta + g(lambda), where g is the step to be taken.
#
# If this step takes you over the boundary (beta*_j < 0), shrink the
# step until it hits the boundary. Ie, choose
# beta* = beta + alpha*g(lambda)
# where alpha is chosen so that beta*_j >= 0 (and at least one is exactly 0);
# see the numeric sketch after this description.
#
#
# If beta is on the boundary already:
# Calculate the Levenberg-Marquardt step with the current value of lambda.
# That is, calculate beta* = beta + g(lambda), where g is the step to be taken.
#
# If beta* moves back inside the boundary, take the step.
#
# If beta* crosses the boundary, calculate the Levenberg-Marquardt step dropping
# the dimensions of beta that are on the boundary. That is, if beta_2 = 0, keep
# beta_2 = 0 and proceed as though you were only optimizing beta_0 and beta_1
#
#
# Try the appropriate one of these with the current value of lambda. If it improves squared
# error, make this move and divide lambda by 2. If it does not, multiply lambda by 2 and try
# the appropriate move (from the old value of beta). Keep doubling lambda until you do
# improve squared error.
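# The numeric sketch promised above: the pull-back factor alpha solves
# beta[j] + alpha*g[j] = 0 when the full step would cross the boundary
# (beta.demo and g.demo are made-up values):
beta.demo = c(-1, 1, 0.02)
g.demo = c(0.5, 0.2, -0.1)                 # a hypothetical LM step
alpha = min(1, -beta.demo[3]/g.demo[3])    # 0.2 here, so we stop at the boundary
beta.demo + alpha*g.demo                   # third coordinate is exactly 0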
# Bi. Write a function to implement this step when beta is in the interior
LMStepInterior = function(beta,X,Y,fn,J,lambda)
{
}
# You can test this with value
beta0 = c(-1,1,0.02)
# At lambda = 0 you should stay in the interior of the space
LMStepInterior(beta0,troutdata$Age,log(troutdata$PCB),fn,J,0)
# At lambda = 50 you should try to cross the boundary and pull back to it
LMStepInterior(beta0,troutdata$Age,log(troutdata$PCB),fn,J,50)
# Bii. Write a function to implement the step when beta is on the boundary
LMStepBoundary = function(beta,X,Y,fn,J,lambda)
{
return( betastar )
}
# You can test this by considering
beta1 = c(-1,1,0)
# With lambda = 0 you should move back into the interior of the space,
LMStepBoundary(beta1,troutdata$Age,log(troutdata$PCB),fn,J,0)
# with lambda = 50 you should stay on the boundary
LMStepBoundary(beta1,troutdata$Age,log(troutdata$PCB),fn,J,50)
# C. Put these together in a constrained Levenberg-Marquardt algorithm. For
# our purposes, specify the starting value of lambda rather than choosing it
# based on the Hessian.
ConstrainedLevenbergMarquardt = function(beta,X,Y,fn,J,lambda=50,tol=1e-8,maxit=100){
return( list(sol=beta, iter=iter) )
}
# D. Apply this to the trout data with starting values given by
# beta0 above and lambda = 50
source("HW5Q2.R")
# Part A
all.equal(fn(c(1,1,1),c(1,4,7)), c(4/3, 5/3, 16/9) )
all.equal( J(c(1,1,1),c(1,4,7))[2,], c(1,2/3,-1/9) )
all.equal(GNLevMarq(beta0,troutdata$Age,log(troutdata$PCB),fn,J,maxit=3)$beta,
matrix(c(-0.9452889,5.4935518,4.7699064),3,1),tol=1e-6)
# Part B
all.equal( LMStepInterior(beta0,troutdata$Age,log(troutdata$PCB),fn,J,0),
matrix(c(3.1723159,0.5947732,14.5796836),3,1),tol=1e-6)
all.equal( LMStepInterior(beta0,troutdata$Age,log(troutdata$PCB),fn,J,50),
matrix(c(-0.7943829,1.1842198,0),3,1),tol=1e-6)
all.equal( LMStepBoundary(beta1,troutdata$Age,log(troutdata$PCB),fn,J,0),
matrix(c(3.3586664,0.3999969,14.6399720),3,1),tol=1e-6)
all.equal( LMStepBoundary(beta1,troutdata$Age,log(troutdata$PCB),fn,J,50),
c(-0.5176988,1.4328901,0),tol=1e-6)
# Part C
# NOTE: here I check the first few steps of the algorithm. You, of course, should run it to convergence
# in your solutions
all.equal( ConstrainedLevenbergMarquardt(beta0,troutdata$Age,log(troutdata$PCB),fn,J,lambda=100,maxit=1)$sol,
matrix(c(-0.4248085 , 1.5205827 , 0.0000000),3,1),tol=1e-6)
all.equal( ConstrainedLevenbergMarquardt(beta0,troutdata$Age,log(troutdata$PCB),fn,J,lambda=100,maxit=2)$sol,
matrix(c(-0.17607800 , 1.81721084 , 0.01521779),3,1),tol=1e-6)
all.equal( ConstrainedLevenbergMarquardt(beta0,troutdata$Age,log(troutdata$PCB),fn,J,lambda=100,maxit=3)$sol,
matrix(c(-0.1211046 , 2.0503818 , 0.1583751),3,1),tol=1e-6)
all.equal( ConstrainedLevenbergMarquardt(beta0,troutdata$Age,log(troutdata$PCB),fn,J,lambda=100,maxit=4)$sol,
matrix(c(-0.2096945,2.3396091,0.4523016),3,1),tol=1e-6)
###########################################
# Question 3: Varying Coefficient Models #
###########################################
# This question will consider a variation on local linear regression.
# Here we will instead look at what is described as a varying
# coefficient model.
# To describe this model, we use the data in
ethanol = read.csv("ethanol.csv")
# which contains readings of nitrogen oxides (NOx) produced by a
# gasoline engine, along with the compression of the gas (C) in
# the engine and E, a measure of the air/ethanol mix.
# For these data, the relationship between NOx and C changes
# depending on E. If we examine
plot(ethanol$C[ethanol$E<0.8], ethanol$NOx[ethanol$E<0.8])
plot(ethanol$C[ethanol$E>1], ethanol$NOx[ethanol$E>1])
# We see somewhat different relationships. For that reason
# we consider a model of the form
#
# NOx = beta0(E) + beta1(E)*C
#
# That is, NOx has a linear relationship with C, but that
# relationship depends in a non-parametric way on E.
# a) We will be interested in being able to find beta0(e) and
# beta1(e) for any value e in the range of E. To do this, we
# can perform a linear regression for Y ~ C (using the lm()
# function) with weights given by dnorm(e-E, sd=h) for bandwidth h.
#
# Write a function to compute a varying coefficient estimate
# of beta0(e) and beta1(e) at a given e and return the two-vector
# of coefficients.
VarCoef = function(e,data,h)
{
}
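# For instance, at e = 0.9 with h = 0.1, the weighted fit VarCoef should
# perform internally is (a sketch, not the required implementation):
fit.demo = lm(NOx ~ C, data = ethanol, weights = dnorm(0.9 - ethanol$E, sd = 0.1))
coef(fit.demo)  # estimated (beta0(0.9), beta1(0.9))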
# Use this to plot the estimated varying coefficient
# estimates for the ethanol data over the range of
# E, using h = 0.1.
# b) To choose a bandwidth, we can again apply cross
# validation. It's possible to represent our predicted
# values as a linear smoother yhat = S(h)Y, but here
# you can just do it manually with a for loop.
#
# Write a function to calculate the cross-validated
# squared error for predicting Y from these data.
#
# In this case, unlike in local linear regression,
# we haven't centered by C and do need to use the whole
# model to predict each Y, rather than just an intercept.
VarCoefCV = function(h,data){
}
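# The cross-validation loop follows the usual leave-one-out pattern; a
# self-contained sketch with an ordinary unweighted lm fit standing in for
# the varying-coefficient fit:
cv.demo = 0
for (i in 1:nrow(ethanol)) {
  fit.i = lm(NOx ~ C, data = ethanol[-i, ])
  pred.i = predict(fit.i, newdata = ethanol[i, ])
  cv.demo = cv.demo + (ethanol$NOx[i] - pred.i)^2
}
cv.demo  # leave-one-out squared error for the stand-in model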
# Use this to choose an optimal bandwidth from among those in
hs = seq(0.01,0.2,by = 0.01)
# c) Fixing the bandwidth at the optimum you found in b,
# perform a residual bootstrap for the values of
# beta0(e) and beta1(e) for each of the 61 values of e
# in
e = seq(0.6,1.2,by=0.01)
# (you should calculate all 61 values for each bootstrap
# re-sample).
#
# Write a function to take this sequence e, the data, and
# the bandwidth h, and produce length(e)-by-3
# matrices called conf.band.b0 and conf.band.b1 with
# columns giving
#
# estimate - 1.96*bootstrap.se,  estimate,  estimate + 1.96*bootstrap.se
#
# for each value in e where bootstrap.se is the standard
# deviation over nboot = 200 bootstrap samples.
#
# You should also calculate the percentage of bootstrap
# samples for which the bootstrap beta0(e) is inside the
# confidence interval for ALL of the values of e (this is
# like a multiple testing problem).
VarCoefBoot = function(e,data,h,nboot)
{
  return( list( conf.band.b0 = conf.band.b0, conf.band.b1 = conf.band.b1,
                numinside.b0 = numinside.b0, numinside.b1 = numinside.b1 ) )
}
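# The residual bootstrap pattern itself, sketched with a plain lm fit as a
# stand-in (resample residuals, add them back to the fitted values):
fit0.demo = lm(NOx ~ C, data = ethanol)
y.star = fitted(fit0.demo) + sample(resid(fit0.demo), replace = TRUE)
head(y.star)  # one bootstrap response vector; refit the model to these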
# Use the result to plot our estimated beta0(e) and beta1(e) with
# confidence bands around them. Are these a good representation
# of the variability of your estimates? Upload your plot as HW5Q3c.png.
# d) The intervals in Part c are described as being POINTWISE
# confidence intervals because they cover the value of beta0(e)
# (say) at each e. But it's possible that every individual
# bootstrap curve goes outside the bands at some value of e, even
# if at any given e, 95% of the bootstrap curves are contained
# in the interval (e.g., bootstrap 1 is outside at e[45], bootstrap 2
# at e[22], etc.).
# Instead, UNIFORM confidence bands are intended to ensure that
# 95% of bootstrap curves are completely contained within the band.
# To produce these, we take an idea from Lab 5 and do the following.
#
# 1. From your bootstrap, calculate the t-statistics
#
# beta0.t(e) = (beta0.boot(e) - beta0.est(e))/sd(beta0.boot(e))
#
# where beta0.est is the original estimate, and beta0.boot is
# the estimate for each bootstrap sample. That is, you should have
# length(e)-by-nboot t statistics.
#
# 2. Find the maximum value
#
# max.beta0.t = max |beta0.t(e)|
#
# over e for each bootstrap, to give nboot values of max.beta0.t.
#
# 3. Obtain Q as the 95th quantile of max.beta0.t; that is,
# at most 5% of the curves are ever more than Q*sd(beta0.boot(e))
# away from beta0.est(e).
#
# 4. Return the bands
#
# [beta0.est(e) - Q*sd(beta0.boot(e)), beta0.est(e) + Q*sd(beta0.boot(e))]
#
#
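# A self-contained numeric sketch of steps 1-3, using a made-up
# 5-by-100 matrix of bootstrap estimates (boot.demo is hypothetical):
boot.demo = matrix(rnorm(500), 5, 100)
est.demo = rowMeans(boot.demo)
sd.demo = apply(boot.demo, 1, sd)
t.demo = abs(boot.demo - est.demo)/sd.demo      # |t| for each e (row) and bootstrap (column)
Q.demo = quantile(apply(t.demo, 2, max), 0.95)  # step 3
Q.demo  # typically larger than 1.96, which is why uniform bands are wider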
# Write a function to repeat your calculations in part c, but
# replace 1.96 with Q calculated as above.
VarCoefBoot2 = function(e,data,h,nboot)
{
  return( list( conf.band.b0 = conf.band.b0, conf.band.b1 = conf.band.b1,
                numinside.b0 = numinside.b0, numinside.b1 = numinside.b1 ) )
}
# Do 95% of your bootstrap curves fall inside these bands?
# Produce a plot of your estimate and uniform confidence bands.
# Upload this plot as HW5Q3d.png
# Small bonus: Why do we bootstrap t-statistics instead of just looking
# at a band of fixed width C (ie [beta0(e) C,beta0(e) + C]
# with C chosen so that 95% of curves lie in this band?
# Big Bonus: Describe how to estimate this model with splines
# instead of kernels. What would you do to estimate the
# partially linear model
#
# Y = beta0(E) + beta1*C?
source("HW5Q3.R")
# Part a
all.equal(VarCoef(1,ethanol,0.1),c(2.51603346,0.00683997),tol=1e-6)
all.equal(VarCoef(0.5,ethanol,0.3),c(1.908030111,0.003452755), tol=1e-6)
# Part b
all.equal(VarCoefCV(hs[5],ethanol),6.395744,tol=1e-6)
# Part c
set.seed(19532706)
results = VarCoefBoot(c(0.6,1,1.2),ethanol,0.3,100)
all.equal(results$conf.band.b0[2,],c(1.400634,2.146129,2.891624),tol=1e-2)
all.equal(results$conf.band.b1[3,],c(-0.08096679,-0.017721659,0.04552347),tol=1e-2)
results$numinside.b0 == 89
# Part d
set.seed(28071929)
results = VarCoefBoot2(c(0.6,1,1.2),ethanol,0.3,100)
all.equal(results$conf.band.b0[3,],c(1.2337180,2.147877,3.062035),tol=1e-2)
all.equal(results$conf.band.b1[1,],c(-0.07719028,0.007852045,0.09289437),tol=1e-2)
results$numinside.b1 == 95
#########################################
# Question 4: 2-Dimensional Quadrature #
#########################################
# A) From JMR Exercise 11.4.4:
# Consider
#
# I = integral_0^1 3/2*sqrt(x) dx = 1
#
# Let Tn be the approximation to I given by the trapezoid rule with a partition
# of size n and let Sn be the approximation given by Simpson's rule
# with a partition of size n.
# Let nT(epsilon) be the smallest value of n for which |Tn-I| < epsilon and let
# nS(epsilon) be the smallest value of n for which |Sn-I| < epsilon.
#
# Write functions that calculate nT(epsilon) and nS(epsilon).
nT = function(epsilon){
}
nS = function(epsilon){
}
# Use these to plot nT(epsilon) and nS(epsilon) against log(epsilon) for
# epsilon = 2^(-k), k = 2, ..., 16.
# Upload this plot as HW5Q4A.png.
# B) Now consider a two-dimensional integral. We know that
#
# integral_(-1)^1 integral_(-1)^1 sqrt(1 - x^2 - y^2) dx dy = 2*pi/3
#
# where the square root is taken to be positive and set to zero when it isn't defined.
# This is because it solves the equation for the unit sphere: x^2 + y^2 + z^2 = 1, for
# z, so we are calculating the volume of a half sphere.
# i. Write a function to extend Simpson's method to two dimensions
# (hint: do the inner integral first, then the outer integral). It should take as inputs
# n, the number of grid points in each dimension, and a function to calculate f(x,y),
# and return the integral. Use the same number, n, of points in x and y.
Simpson2d = function(fn,n){
}
# ii. You can evaluate 2*pi/3 in R; how many grid points do you need in your partition
# so that your numerical integral is accurate to within 10^(-4)?
N.Needed = NA  # replace with your answer
# iii. Now consider extending this scheme to integrate over p dimensions. For a fixed
# partition size, how does the numerical cost of your scheme grow with p? You should
# answer this in comments.
# BONUS: Provide an analysis of the approximation accuracy of your two-
# dimensional scheme in terms of the size of the partition.
source("HW5Q4.R")
# Part A
nT(5e-3) == 17
nS(5e-3) == 11
# Part B
all.equal( Simpson2d( function(x,y){ exp(x)*cos(x+y) }, 21 ), 3.253831, tol=1e-6)
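# For reference, a minimal one-dimensional composite Simpson sketch
# (simpson1d.demo is illustrative only; Simpson2d applies this idea along
# x for each fixed y, then once more along y):
simpson1d.demo = function(f, a, b, n = 21) {  # n must be odd
  x = seq(a, b, length.out = n)
  h = x[2] - x[1]
  w = c(1, rep(c(4, 2), (n - 3)/2), 4, 1)  # Simpson weights 1,4,2,...,2,4,1
  sum(w*f(x))*h/3
}
simpson1d.demo(function(x) 1.5*sqrt(x), 0, 1)  # close to the true value 1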