# [Solved] AMSC808N Project 1

30 \$

SKU: [Solved] AMSC808N Project 1 Category: Tag:

Rate this product

The goals of this project are

• to acquire some practice of working with real data;

1. SVN. First consider a subset of data taken from all 58 CA counties. (In Project1main.m, uncomment line 22 and line 27 and comment out lines 40–59.) The initial guess for the dividing plane is obtained by running inexact (due to the use of CG) Newton’s method. (Subsampled Inexact Newton’s method becomes a regular Inexact Newton’s method as 58 < 64 (64 is my default batch size – see SINewton.m). Run the code Project1main.m and you obtain w. Its first three components make up w and the last one is b. See the next item for more details on it.)

• Set up the SVN constrained minimization problem with soft margins (Eqs. (52)–(54) in 2-OptProb&Methods4Classification.pdf) and solve it using the

1

active-set method (my routine ASM should work just fine on this set). Compare the dividing planes produced by minimizing the loss function and solving the SVN problem with soft margins. Which one looks more reasonable?

• Increase the dataset by adding counties from other states (e.g. OR, WA, …). Comment on the performance of the SVN. Do your observations motivate you to switch to unconstrained minimization of a reasonable loss function?

2. SG. Take the dataset consisting of all pro-democratic counties and an equal number of randomly chosen pro-republican counties. (In Project1main.m, lines 22–27 must be commented out and lines 40–59 must be uncommented.) This dataset contains 972 counties. Denote the n × 3 data matrix by X (XX in Project1main.m), the label vector by y (label in Project1main.m). Define Y ∶= (y ⋅ 11×n) ⊙ [X, 1n×1] ∈ Rn×4 and w ∶= [w; b] ∈ R4. Then Y w ≡ y ⊙ (Xw + b). Consider the following loss function:

1nλf(w)∶= n∑log(1+exp(−Yiw))+ 2∥w∥2 (1)

i=1where Yi is the ith row of Y and λ is Tikhonov regularization parameter. The

term λ∥w∥2 makes the loss function strongly convex. I set λ = 0.01. Note that 2

log(1 + exp(−x)) is nearly 0 for positive x and grows approximately linearly as −x increases.

Implement the SG algorithm. Experiment with various batch sizes and stepsize decreasing strategies. For each batch size and each stepsize decreasing strategy, accumulate statistics (e.g. do 1000 runs for each) and plot average function value vs iteration number and vs runtime. Comment on how the batch size affects the performance.

Experiment with various values of the parameter λ. Comment on how this parameter affects the results.

3. Subsampled Inexact Newton. Experiment with various batch sizes for subsam- pled inexact Newton’s method (you can use my routine SINewton or write your own). Plot average function value vs iteration number and vs runtimes. Compare these plots with those for SG. Comment on how these two approaches (SG and Newton) compare.

4. Stochastic L-BFGS. Program stochastic L-BFGS method. Experiment with choos- ing stepsize, batch sizes for the gradient and the pairs (s, y), and with the frequency of updating the pairs. Set m = 5. Compare its performance with stochastic Newton and SG. My codes LBFGS.m and finddirection.m implementing the regular L-BFGS are added on ELMS to Files/Codes.

## Reviews

There are no reviews yet.

Only logged in customers who have purchased this product may leave a review.

Shopping Cart
[Solved] AMSC808N Project 1
30 \$