Programming Assignment 1 – Solving TSP by Searching
CMSC 421 Summer 2024
Due Date: 11:59 PM July 31st 2024
Welcome to your first programming assignment. In this assignment you are going to explore searching algorithms on the Traveling Salesman Problem (TSP). The goal of this assignment is for you to explore various mathematical and algorithmic trade offs. So you are encouraged to collaborate, discuss and compare results with your classmates. You may work in groups of three at most. The final code and lab report submitted should include the name of all the team members.
Stuff you will need to do:
-
Familiarize yourself with the AI:MA code repository and make use of it as you see fit. Feel free to adapt/copy the code from this repository or any other online resources with proper citation. Hints, tips and findings from the code repository you are encouraged to share on Piazza.
-
Represent a graph as an adjacency matrix.
-
Read an N ∗ N adjacency matrix that defines an undirected N -node graph with up to N 2 edges. Your code is expected to read from the command line an adjacency matrix in the following format: a.out < infile.txt, where the infile.txt looks like Figure 1. Take
the TSP problem in Figure 2 as an example, Figure 3 is its corresponding infile matrix. As graphs will be undirected, matrices will be symmetric. For consistency, the full matrix will be provided although all that is needed is the “upper triangle”.
-
Write your output in a csv file, where the name of the file is the function name. For example, for the hill-climbing algorithm, the output file should be hillClimbing.csv. The file should contain 4 entries in one line: total cost of the best solution, number of nodes expanded, CPU run time, and real-world run time.
-
Perform basic statistics: compute average, standard deviation, max, min, etc.
-
Use your output data to produce a graph, or produce a graph image directly using a GUI or other means.
-
Perform experiments and conduct some empirical experimentation to explore various math- ematical and algorithmic trade offs.
Grading rubrics:
-
Presentation of the results in graphical form. Several experiments are defined for each of which there is a capstone plot or chart showing tradeoffs in solution quality and accuracy and measures of time/space used to reach that solution (submitted via GRADESCOPE).
Are these graphs illustrative of the phenomena expected?
Figure 1: Format of .txt
Figure 3: Example of infile.txt
Figure 2: TSP Graph
-
Discussion of the experiments. For each of the experiments a short narrative explaining the phenomena observed will be submitted (via GRADESCOPE). Does this narrative accurately describe the mathematical and algorithmic phenomena behind the observations?
-
Part 1: Exploring Searching Algorithm
Implement A∗ on a randomly generated TSP problem using the following heuristics:
-
h(n) = 0. This is basically Uniform Cost Search.
-
Grab edges randomly (i.e., grab a random number based on remaining nodes, start with grabbing n edges, then n-1 edges, etc. . . )
-
Cheapest remaining edges, n-d edges. For the experiment:
-
Randomly create a family of 30 TSP graphs/matrices for each size 5, 10, 15, 20, …
-
Run the above algorithms on each of the family of size 5, of size 10, etc….
-
For each family of 30 graphs/matrices you’ll compute the AVERAGE/MIN/MAX of total cost, number of nodes, CPU and real-world runtime.
-
Use those data to plot 2 graphs – one for total cost and number of nodes, and the other for CPU and real-world runtime. The x-axis of the plot is the size of the graphs/matrices (size 5, 10, 15, …). There are two y-axis in the plot – total cost and number of nodes, and CPU and real-world runtime respectively. Use either different color, shape, or etc. to represent each searching algorithm.
Compare and discuss the results in a short paragraph. Questions you can explore are for example:
-
Which algorithm provides solution with the lowest cost? What’s the difference of their best solutions, and how that changes when the size of the graph increases?
-
Which algorithm has the least runtime and how do their runtimes change with the size of the graph increases?
-
Is there difference between GPU and real-world runtime?
-
-
What to submit:
Code: Your implementation of three functions: A uniformCost(). A randomEdge() and A cheapestEdge().
Each function should first read in the graph/matrix from infile.txt. Then perform the algorithm and finally return the cost of the best solution.
Report: Two graph plots of your experiment results and your discussion.
Part 2: Explore Local Search Algorithm
In this exercise, we explore the use of local search methods to approach TSPs.
-
Implement and test a hill-climbing method to approach TSPs.
-
Implement and test a simulated annealing method to approach TSPs.
-
Implement and test a genetic algorithm method to approach TSPs. (see section 4.3 of Larranaga et al. 1999)
Compare these results with each other and with the optimal solutions obtained from the A∗ algo- rithm with the MST heuristic. Compare not just the quality of the results but the time it takes to obtain them. Give an assessment of the relationship between computation time and solution
quality, i.e., if all I wanted was a 75% solution, which method is quickest? How about a 90% solution? Is there are general rule here?
For the experiment:
-
Randomly create a family of 30 TSP graphs/matrices for each size 30 (or you can experiment with the size).
-
Run the A∗ algorithm with MST on those 30 TSP graphs to get the optimal solutions.
-
Run the above algorithms on those graphs.
-
Try different assignments to parameters like:
-
number of restarts for hill-climbing
-
number of restarts, initial temperature, cooling ratio α (you can assume the cooling function is g(T ) = α ∗ T ) for simulated annealing
-
number of generations, selection approach, probability of crossover, length of crossover, and mutation rate for genetic algorithm
-
-
Fix your parameters. Then for each graph/matrix you’ll compute the total cost, number of nodes, CPU and real-world runtime.
-
Use those data to plot three graphs – one for each searching algorithm. For each graph, the x-axis is the CPU runtime, while the y-axis is the the difference of total cost between the local searching algorithm and the optimal from A∗ with MST. Use different shape/color to
represent AVERAGE/MIN/MAX of those differences.
-
Maybe repeat the experiment couple times for difference sizes of TSP graphs.
What to submit:
Code: Your implementation of three functions: hillClimbing(), simuAnnealing() and genetic(). Report: At least three graph plots of your experiment results and a short paragraph of your discussion.
-
Extra Credit
Use your implementation from Part 2 to solve the Knapsack problem. See the 0-1 knapsack problem definition.
-
Submission Instructions
Submit a single zip file to Gradescope by 11:59 PM on July 31st. This should include your implementations of the following functions:
-
A uniformCost()
-
A randomEdge()
-
A cheapestEdge()
-
hillClimbing()
-
simuAnnealing()
-
genetic()
-
It should also include any additional code necessary for running these functions.
Reviews
There are no reviews yet.