Homework 2
Q1
10 Points
In typical gradient descent, we take steps using a constant step size η, so that:
$$\theta_{t+1} = \theta_t - \eta \nabla_\theta f(\theta_t).$$
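For reference, here is a minimal numpy sketch of this update (the quadratic objective and its gradient used in the example are hypothetical placeholders, not part of the problem):

```python
import numpy as np

def gradient_descent_step(theta, grad_f, eta):
    """One constant-step-size gradient descent update:
    theta_{t+1} = theta_t - eta * grad_f(theta_t)."""
    return theta - eta * grad_f(theta)

# Hypothetical example: f(theta) = ||theta||^2, whose gradient is 2 * theta.
grad_f = lambda theta: 2 * theta
theta = np.array([1.0, 1.0])
theta_next = gradient_descent_step(theta, grad_f, eta=0.1)  # -> array([0.8, 0.8])
```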
In the following, assume that f is an arbitrary differentiable function. For very very small η, what will generally be true?
- $f(\theta_t) \ge f(\theta_{t+1})$
- $f(\theta_t) \le f(\theta_{t+1})$
- cannot say
Describe concisely your reasoning behind the choice.
For very very big η, what will generally be true?
- $f(\theta_t) \ge f(\theta_{t+1})$
- $f(\theta_t) \le f(\theta_{t+1})$
- cannot say
Describe concisely your reasoning behind the choice.
Grady would like to pick a perfect step size on every step. He proposes a new update rule that selects $\eta^*$ to be the value of the step size $\eta$ that decreases the objective as much as possible along the direction $-\nabla_\theta f(\theta_t)$, and then uses $\eta^*$ as the step size:
$$\eta^* = \arg\min_{\eta}\; f\big(\theta_t - \eta \nabla_\theta f(\theta_t)\big)$$
$$\theta_{t+1} = \theta_t - \eta^* \nabla_\theta f(\theta_t)$$
For Grady’s rule, what will generally be true?
- $f(\theta_t) \ge f(\theta_{t+1})$
- $f(\theta_t) \le f(\theta_{t+1})$
- cannot say
Describe concisely your reasoning behind the choice.
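One way to picture Grady's rule is as an exact line search over the scalar $\eta$. Below is a minimal sketch using scipy's one-dimensional minimizer; the objective and gradient shown are hypothetical placeholders, and in practice the inner minimization is usually only solved approximately:

```python
import numpy as np
from scipy.optimize import minimize_scalar

def exact_line_search_step(theta, f, grad_f):
    """One step of the rule above: pick the step size eta that minimizes
    f(theta - eta * grad_f(theta)), then take that step."""
    g = grad_f(theta)
    # One-dimensional minimization over the scalar step size eta.
    result = minimize_scalar(lambda eta: f(theta - eta * g))
    eta_star = result.x
    return theta - eta_star * g

# Hypothetical example objective: f(theta) = ||theta||^2, gradient 2 * theta.
f = lambda th: float(np.dot(th, th))
grad_f = lambda th: 2 * th
theta_next = exact_line_search_step(np.array([1.0, 1.0]), f, grad_f)
# For this quadratic the best step size is 0.5, which lands exactly on the minimum.
```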
Q4
10 Points
Robots want to find a location for a meeting that minimizes the sum of squared distances from the rooms of a group of friends (assume everyone is in their room) to the location of the meeting. Assume for now that they can host the gathering at any location in the hallway.
Assuming that the robots live in a 1-dimensional hallway, pose this problem as an (unconstrained) optimization problem. Assume there are $n$ robots (1 robot per friend) and that the $i$-th friend is located at position $l_i$. Denote the location of the meeting by $p$. What is the objective as a function of $p$? Write it down.
Hint: If you are having trouble deriving the general case, try the n = 2
case first.
Q5
10 Points
Robots want to find a location for a meeting that minimizes the sum of squared distances from the rooms of a group of friends (assume everyone is in their room) to the location of the meeting. Assume for now that they can host the gathering at any location in the hallway.
Assuming that the robots live in a 1-dimensional hallway, determine the gradient (write down/show your computation). Use it to find the optimal location $p_{\text{opt}}$ for the meeting by setting the gradient equal to zero and solving.
Q6
10 Points
Robots want to find a location for a meeting that minimizes the sum of squared distances from the rooms of a group of friends (assume everyone is in their room) to the location of the meeting. Assume for now that they can host the gathering at any location in the hallway.
Assuming that the robots live in a 1-dimensional hallway, mark each of the following as True / False for this particular objective function (i.e., the objective function you derived in Q4). Provide the reasoning behind your choice:
- There is necessarily a unique location that minimizes the objective function.
- The optimization problem may have local minima that are not global minima.
- The optimal location for the meeting will always be one of the rooms.
- There is necessarily a choice of step size that makes gradient descent converge for this problem.
Q7
10 Points
Let’s consider gradient descent when the dimension of the input is 2.
$$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$$
So, we have $f(\theta)$ and we are trying to find the values of $\theta_1$ and $\theta_2$ that minimize it. Suppose
$$f(\theta) = -3\theta_1 - \theta_1\theta_2 + 2\theta_2 + \theta_1^2 + \theta_2^2$$
If we started at θ = (1, 1) and took a step of gradient descent with step-size 0.1, what would the next value of θ be?
Enter a tuple (No need to show any calculation).
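If you want to sanity-check your arithmetic, here is a minimal numpy sketch that evaluates this $f$ and takes one gradient-descent step (the analytic gradient in grad_f is our own derivation from the formula above, so re-check it by hand):

```python
import numpy as np

def f(theta):
    """The objective above: f(theta) = -3*t1 - t1*t2 + 2*t2 + t1^2 + t2^2."""
    t1, t2 = theta
    return -3 * t1 - t1 * t2 + 2 * t2 + t1 ** 2 + t2 ** 2

def grad_f(theta):
    """Analytic gradient of f (our own derivation; verify by hand)."""
    t1, t2 = theta
    return np.array([-3 - t2 + 2 * t1, -t1 + 2 + 2 * t2])

theta = np.array([1.0, 1.0])
eta = 0.1
theta_next = theta - eta * grad_f(theta)  # one gradient-descent step from (1, 1)
```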
Q8
10 Points
Let’s consider gradient descent when the dimension of the input is 2.
$$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$$
So, we have $f(\theta)$ and we are trying to find the values of $\theta_1$ and $\theta_2$ that minimize it. Suppose
$$f(\theta) = -3\theta_1 - \theta_1\theta_2 + 2\theta_2 + \theta_1^2 + \theta_2^2$$
What is $f([1.2, 0.7])$?
Enter a numerical value.
Q9
10 Points
Let’s consider gradient descent when the dimension of the input is 2.
$$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$$
So, we have $f(\theta)$ and we are trying to find the values of $\theta_1$ and $\theta_2$ that minimize it. Suppose
$$f(\theta) = -3\theta_1 - \theta_1\theta_2 + 2\theta_2 + \theta_1^2 + \theta_2^2$$
If we started at θ = (1, 1) and took a step of gradient descent with step-size 1.0, what would the next value of θ be?
Enter a tuple (No need to show any calculation).
Q10
10 Points
Let’s consider gradient descent when the dimension of the input is 2.
$$\theta = \begin{bmatrix} \theta_1 \\ \theta_2 \end{bmatrix}$$
So, we have $f(\theta)$ and we are trying to find the values of $\theta_1$ and $\theta_2$ that minimize it. Suppose
$$f(\theta) = -3\theta_1 - \theta_1\theta_2 + 2\theta_2 + \theta_1^2 + \theta_2^2$$
What is $f([3, -2])$? Enter a numerical value.