APS1070 Lab4 mini-batch implementation notes

Question
Implement a mini-batch version of gradient descent with a batch-size constant B. When B = 1, it is stochastic gradient descent; when B = the number of training data points, it is full-batch gradient descent. Anything in between is mini-batch.
1. Choose B = 1, B = 16, B = 128, B = 256, and B = the number of data points, and plot the training error as a function of the number of gradient updates, and separately as a function of wall-clock time, for each value of B.
2. All B values should appear as lines on the same plot. Which B leads to the fastest convergence in terms of the number of gradient updates, and which in terms of wall-clock time?
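One way this experiment could be set up is sketched below. It is only a sketch under assumptions: it uses a completed batchGradientDescent (the skeleton is given in the Sample code section below), a synthetic X_train/y_train stand-in for the lab's actual training data, and illustrative values for lr and iterations.

import numpy as np
import matplotlib.pyplot as plt

# Hypothetical stand-in data; replace with the lab's actual X_train, y_train.
rng = np.random.default_rng(0)
X_train = rng.standard_normal((1024, 5))
y_train = X_train @ rng.standard_normal(5) + 0.1 * rng.standard_normal(1024)

m, n = X_train.shape
batch_sizes = [1, 16, 128, 256, m]        # the B values from the question

fig, (ax_updates, ax_time) = plt.subplots(1, 2, figsize=(12, 4))
for B in batch_sizes:
    w0 = np.zeros(n)                      # start every run from the same weights
    w, costHistory, timeHistory = batchGradientDescent(
        X_train, y_train, w0, lr=0.01, iterations=50, reg=0, bs=B)
    updates_per_epoch = int(np.ceil(m / B))            # gradient updates in one epoch
    updates = np.arange(1, len(costHistory) + 1) * updates_per_epoch
    ax_updates.plot(updates, costHistory, label=f"B={B}")
    ax_time.plot(np.cumsum(timeHistory), costHistory, label=f"B={B}")

ax_updates.set_xlabel("# of gradient updates")
ax_time.set_xlabel("wall-clock time (s)")
for ax in (ax_updates, ax_time):
    ax.set_ylabel("training error")
    ax.legend()
plt.show()

With this layout, each B contributes one line to the updates plot and one to the wall-clock plot, so the two convergence comparisons in part 2 can be read directly off the axes.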
Mini-batch
Assume you have 400 data points:
1. If your batch size is 1, then for every iteration (epoch) the number of batches = 400/1 = 400. This means you have to update the weights 400 times.
2. If your batch size is 200, then for every iteration (epoch) the number of batches = 400/200 = 2. This means you have to update the weights 2 times.
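As a quick sanity check of this arithmetic, the short snippet below counts weight updates per epoch using ceiling division, so a leftover partial batch (e.g. 400 points with batch size 256) still counts as one update. The point count and batch sizes are just the illustrative numbers from above.

import math

n_points = 400                                     # example number of training points
for bs in (1, 200, 256):
    batches_per_epoch = math.ceil(n_points / bs)   # a leftover partial batch counts too
    print(f"batch size {bs}: {batches_per_epoch} weight updates per epoch")
# batch size 1: 400 weight updates per epoch
# batch size 200: 2 weight updates per epoch
# batch size 256: 2 weight updates per epoch (one batch of 256, one of 144)

The range(0, m, bs) loop in the skeleton below produces exactly this many updates per epoch.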
For a more detailed explanation, see:
https://machinelearningmastery.com/gentle-introduction-mini-batch-gradient-descent-configure-batch-size/
Sample code
In order to make the assignment easier, we provide the skeleton of the function for you.
import time
import numpy as np

def batchGradientDescent(X, y, w, lr=0.01, iterations=100, reg=0, bs=1):
    # X, y: training data
    # w: weights
    # lr: learning rate
    # iterations: number of epochs
    # reg: regularization parameter
    # bs: batch size

    # m is the number of data points
    m, n = X.shape
    # these are used to store the cost and time
    costHistory = np.zeros(iterations)
    timeHistory = np.zeros(iterations)
    for i in range(iterations):
        # 1. Store the current time
        # 2. Randomize your X and y. Make sure they are shuffled with the same permutation.
        #    You may want to use np.random.permutation
        # Your code goes here
        for j in range(0, m, bs):
            # 1. Get the current mini-batch of X and y
            # 2. Calculate the current prediction
            # 3. Update the weights (you may want to use np.dot).
            #    The update is similar to Q4.
            # Your code goes here
            pass
        # 1. Calculate the current cost using w, X and y
        # 2. Store the current time and calculate the time difference
        # 3. Store the time difference and cost in timeHistory and costHistory
        # Your code goes here
    return w, costHistory, timeHistory
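For reference, here is one possible way the placeholders could be filled in. It is only a sketch under assumptions: a linear-regression prediction np.dot(X, w), a mean-squared-error cost with an L2 penalty weighted by reg, y stored as a 1-D array, and per-epoch timing with time.time(); your Q4 update rule and cost may differ.

import time
import numpy as np

def batchGradientDescent(X, y, w, lr=0.01, iterations=100, reg=0, bs=1):
    m, n = X.shape
    costHistory = np.zeros(iterations)
    timeHistory = np.zeros(iterations)
    for i in range(iterations):
        epochStart = time.time()                       # 1. store the current time
        perm = np.random.permutation(m)                # 2. shuffle X and y together
        X_shuf, y_shuf = X[perm], y[perm]
        for j in range(0, m, bs):
            X_batch = X_shuf[j:j + bs]                 # 1. current mini-batch
            y_batch = y_shuf[j:j + bs]
            pred = np.dot(X_batch, w)                  # 2. current prediction
            grad = np.dot(X_batch.T, pred - y_batch) / len(y_batch) + reg * w
            w = w - lr * grad                          # 3. gradient step on the mini-batch
        # cost on the full training set after this epoch (MSE plus L2 penalty)
        residual = np.dot(X, w) - y
        costHistory[i] = np.dot(residual, residual) / (2 * m) + reg * np.dot(w, w) / 2
        timeHistory[i] = time.time() - epochStart      # elapsed time for this epoch
    return w, costHistory, timeHistory

In this sketch, timeHistory holds per-epoch durations, so the cumulative wall-clock time used for plotting is np.cumsum(timeHistory), as in the plotting sketch in the Question section.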