In this assignment, you will implement a neural language model.
N-Gram Neural Language Modelling
First, download the code for N-Gram Language Modelling from https://sheffieldnlp.github.io/com45136513/labs/word_embeddings_tutorial.py. Second, read its documentation at http://pytorch.org/tutorials/beginner/nlp/word_embeddings_tutorial.html
- Run word_embeddings_tutorial.py on a Linux/Unix-based machine, e.g. Ubuntu or macOS, with PyTorch installed (see the instructions in Lab 0). Note 1: You do not have to implement the CBOW model presented at the end of the file. [0 marks]
- Describe in your report the neural network language model using mathematical equations, fully detailing the dimensionality of each parameter (Hint: it is a kind of multilayer perceptron). You could use a table to summarise the dimensionality of each layer of the network. [2 marks]
- Modify the code given in word_embeddings_tutorial.py to model the following toy training set (Tip: You need to create a list of sentences, where each sentence is represented as a list of tokens. You need to include start/end of sentence tokens):
- The mathematician ran .
- The mathematician ran to the store .
- The physicist ran to the store .
- The philosopher thought about it .
- The mathematician solved the open problem .
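The tip above (a list of sentences, each a list of tokens with start/end markers) could be set up roughly as follows. This is a minimal sketch, not the tutorial's code: the marker strings START and END, the variable names, and the choice of a single start marker per sentence are all assumptions made for illustration.

```python
# Assumed boundary markers; the assignment does not prescribe their spelling.
START, END = "START", "END"

raw_sentences = [
    "The mathematician ran .",
    "The mathematician ran to the store .",
    "The physicist ran to the store .",
    "The philosopher thought about it .",
    "The mathematician solved the open problem .",
]

# Each sentence becomes a list of tokens wrapped in the boundary markers.
sentences = [[START] + s.split() + [END] for s in raw_sentences]

# Trigrams as ([w_{i-2}, w_{i-1}], w_i), built sentence by sentence so that
# no context window spans two different sentences.
trigrams = [
    ([sent[i], sent[i + 1]], sent[i + 2])
    for sent in sentences
    for i in range(len(sent) - 2)
]

# Vocabulary and word-to-index mapping (hypothetical names).
vocab = sorted({tok for sent in sentences for tok in sent})
word_to_ix = {word: i for i, word in enumerate(vocab)}
```

Building the trigrams per sentence, rather than over one concatenated token stream, keeps the end of one sentence from acting as context for the start of the next.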
- Run a sanity check: make sure your model can learn to correctly predict your training data. Take the sentence
- The mathematician ran to the store .
and check that for every trigram (i.e. context and prediction) you get the right answer. Does it work? You will need to experiment with the hyper-parameters, such as the learning rate and the number of epochs. You will observe some variance in the results, so find and report hyper-parameters that give the correct results in 5 consecutive runs. Among other things, your model should predict the word mathematician for the context START The. Why does it predict this instead of physicist? [3 marks]
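One way to organise the sanity check is a small helper that walks over the trigrams of a tokenised sentence and records every context the model gets wrong. The sketch below is an assumption-laden illustration: the function name failed_trigrams is hypothetical, and it assumes the model is a callable that takes a LongTensor of two context indices and returns scores of shape (1, vocab_size), which you should verify against your own model.

```python
import torch


def failed_trigrams(model, tokens, word_to_ix):
    """Return (context, expected, predicted) for every trigram predicted wrongly.

    Assumes `model` maps a LongTensor of 2 context indices to a tensor of
    shape (1, vocab_size) holding scores or log-probabilities.
    """
    ix_to_word = {ix: word for word, ix in word_to_ix.items()}
    failures = []
    with torch.no_grad():  # evaluation only, no gradients needed
        for i in range(len(tokens) - 2):
            context, expected = tokens[i:i + 2], tokens[i + 2]
            context_idxs = torch.tensor(
                [word_to_ix[w] for w in context], dtype=torch.long
            )
            scores = model(context_idxs)
            predicted = ix_to_word[scores.argmax(dim=1).item()]
            if predicted != expected:
                failures.append((context, expected, predicted))
    return failures
```

An empty return value means every trigram of the sentence was predicted correctly. To cover the requirement of 5 consecutive runs, one option is to retrain from scratch five times with the same hyper-parameters and confirm that the helper returns an empty list each time.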
- Test: Given a sentence with a gap
- The ______ solved the open problem.
which is more likely to fill it in: physicist or philosopher?
Get the model to predict this correctly by changing the hyper-parameters (and report them). Discuss whether this would be possible with the bigram ML model from lab 2. Ensure that the model is predicting correctly for the right reason, i.e. that the embeddings for physicist and mathematician are closer together than the embeddings for philosopher and mathematician. Use cosine similarity for this (it is implemented in PyTorch, check the API documentation). [3 marks]
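The embedding comparison could be sketched with torch.nn.functional.cosine_similarity as below. The helper name embedding_similarity is hypothetical, and the toy embedding weights are hand-set purely to show the call; they are not results from a trained model. In your own code you would pass the embedding layer of your trained network and your own word_to_ix mapping.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def embedding_similarity(embedding, word_to_ix, word_a, word_b):
    """Cosine similarity between the embedding vectors of two words."""
    vec_a = embedding.weight[word_to_ix[word_a]]
    vec_b = embedding.weight[word_to_ix[word_b]]
    # Both vectors are 1-D, so the similarity is taken along dim 0.
    return F.cosine_similarity(vec_a, vec_b, dim=0).item()


# Illustration only: a 3-word vocabulary with hand-set 2-D vectors.
toy_word_to_ix = {"mathematician": 0, "physicist": 1, "philosopher": 2}
toy_embedding = nn.Embedding(3, 2)
with torch.no_grad():
    toy_embedding.weight.copy_(
        torch.tensor([[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]])
    )

sim_physicist = embedding_similarity(
    toy_embedding, toy_word_to_ix, "physicist", "mathematician"
)
sim_philosopher = embedding_similarity(
    toy_embedding, toy_word_to_ix, "philosopher", "mathematician"
)
```

With a trained model, the check the task asks for amounts to confirming that the first similarity is larger than the second.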