Name: [SOLVED] Cop4521 assignment 1-game simulation and evaluation of strategies p0
Brand: Assignment Chef
SKU: 61693
Price: 25 USD
Availability: InStock
Rating: 5 (1 reviews)

Objectives:
• Practice problem solving using Python

Description:

Consider a simple game that works as follows: the game involves two players who can choose to either cooperate or defect. If both players cooperate, each receives a reward of 3 coins. If one player cooperates and the other one defects, the defector receives 5 coins, and the cooperator receives 0 coins. If both defect, then each receives 1 coin. If the game is played only once, which is the better strategy, to cooperate or to defect?
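The payoff rules above can be written as a small lookup table. This is a minimal sketch, not part of the assignment's required code: the table name REWARD_TABLE and the helper reward() are hypothetical, and the 0/1 encoding of the two actions follows the defect/cooperate constants used in the sample code later in this description.

```python
defect = 0
cooperate = 1

# (my action, opponent's action) -> coins I receive in that round
REWARD_TABLE = {
    (cooperate, cooperate): 3,
    (cooperate, defect): 0,
    (defect, cooperate): 5,
    (defect, defect): 1,
}

def reward(myAction, oppAction):
    """Return the coins earned by the player who chose myAction (hypothetical helper)."""
    return REWARD_TABLE[(myAction, oppAction)]

# Compare both actions against each possible opponent choice in a single round.
for oppAction, label in ((cooperate, "cooperates"), (defect, "defects")):
    print(f"opponent {label}: cooperate earns {reward(cooperate, oppAction)}, "
          f"defect earns {reward(defect, oppAction)}")
```

Reading the two printed lines side by side shows what each action earns against either opponent choice in a one-round game.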

Now, consider playing this game over many iterations against different types of players using various strategies. The goal is to accumulate as many coins as possible. In this assignment, you will write a Python program to study different strategies based on the history of both the player’s and the opponent’s choices. You will implement the following strategies (note: your implementation must follow the specifications described here exactly):
• alwaysCooperate: The player always chooses to cooperate.
• alwaysDefect: The player always chooses to defect.
• probeAndLock: The player defects for the first 20 rounds, then cooperates for the next 20 rounds. After these 40 probing rounds, the player compares the total rewards from each phase. If the 20 rounds of defection yielded a higher reward than the 20 rounds of cooperation, the player will stick with defection for the remainder of the game. Otherwise, the player will switch to cooperation for the rest of the game.
• continuousProbe: The player defects in the first round and cooperates in the second round. After that, before each round, the player calculates the average reward obtained when choosing defection and the average reward when choosing cooperation. The player then chooses the action that has yielded the higher average reward so far.
• defectUntilCooperate: The player always defects until the opponent cooperates. Once the opponent cooperates for the first time, the player switches to always cooperating for the rest of the game.
• opponentCooperatePercentage: The player decides based on the percentage of times the opponent has chosen to cooperate in the game so far. If the opponent’s cooperation rate exceeds a certain threshold, the player chooses to cooperate; otherwise, the player defects.
You will implement three variations of this strategy using different threshold values:
• opponentCooperate10Percentage (threshold = 10%)
• opponentCooperate50Percentage (threshold = 50%)
• opponentCooperate90Percentage (threshold = 90%)
• random50: The player randomly chooses between cooperation and defection, each with a probability of 50%.
In addition to the 9 strategies described above, you will also design and implement one original strategy of your own—ideally, the most effective strategy when competing against the 9 predefined strategies.
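As an illustration of how the history-based descriptions translate into code, here is a rough sketch of three of the strategies. Several details are assumptions rather than part of the specification: the helper names reward() and cooperationExceeds(), the choice to defect when the opponent has no history yet, and the choice to cooperate when the two averages in continuousProbe are equal (mirroring how the probeAndLock sample breaks ties). Check these choices against the sample output before relying on them.

```python
defect = 0
cooperate = 1

def reward(myAction, oppAction):
    """Coins for one round (hypothetical helper based on the game rules)."""
    if myAction == cooperate:
        return 3 if oppAction == cooperate else 0
    return 5 if oppAction == cooperate else 1

def strategy_continuousProbe(myHistory, oppHistory):
    if len(myHistory) == 0:
        return defect          # round 1: probe defection
    if len(myHistory) == 1:
        return cooperate       # round 2: probe cooperation
    totals = {defect: 0, cooperate: 0}
    counts = {defect: 0, cooperate: 0}
    for mine, theirs in zip(myHistory, oppHistory):
        totals[mine] += reward(mine, theirs)
        counts[mine] += 1
    # Guard against an action that was never played (not expected after round 2).
    avgDefect = totals[defect] / counts[defect] if counts[defect] else 0
    avgCooperate = totals[cooperate] / counts[cooperate] if counts[cooperate] else 0
    # Assumption: ties go to cooperation.
    return defect if avgDefect > avgCooperate else cooperate

def strategy_defectUntilCooperate(myHistory, oppHistory):
    # Cooperate from the round after the opponent's first cooperation onward.
    return cooperate if cooperate in oppHistory else defect

def cooperationExceeds(oppHistory, thresholdPercent):
    """True if the opponent's cooperation rate is strictly above the threshold."""
    if not oppHistory:
        return False           # assumption: no history counts as 0% cooperation
    # Integer comparison avoids floating-point edge cases at exactly 10/50/90%.
    return oppHistory.count(cooperate) * 100 > thresholdPercent * len(oppHistory)

def strategy_opponentCooperate10Percentage(myHistory, oppHistory):
    return cooperate if cooperationExceeds(oppHistory, 10) else defect

def strategy_opponentCooperate50Percentage(myHistory, oppHistory):
    return cooperate if cooperationExceeds(oppHistory, 50) else defect

def strategy_opponentCooperate90Percentage(myHistory, oppHistory):
    return cooperate if cooperationExceeds(oppHistory, 90) else defect
```

Each function receives the player's own history first and the opponent's history second, matching the parameter order of the probeAndLock sample.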

Implementation details:

Each strategy should be implemented as a separate function that takes the action history of both players as parameters. The function name should follow the format: strategy_strategyname.
For example, the probeAndLock strategy can be implemented as follows (include the following code segment in your submission):
defect = 0
cooperate = 1

def strategy_probAndLock(myHistory, oppHistory):
    if (len(myHistory) < 20):
        return defect
    elif (len(myHistory) < 40):
        return cooperate
    else:
        reward1 = rangeReward(0, 20, myHistory, oppHistory)
        reward2 = rangeReward(20, 40, myHistory, oppHistory)
        if (reward1 > reward2):
            return defect
        else:
            return cooperate

In the code, rangeReward(beg, end, myHistory, oppHistory) computes the total rewards in the range of rounds from beg to end-1.
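The assignment does not show rangeReward itself. One possible sketch, assuming a hypothetical per-round helper reward() that encodes the payoff rules of the game, is:

```python
defect = 0
cooperate = 1

def reward(myAction, oppAction):
    """Coins for one round (hypothetical helper based on the game rules)."""
    if myAction == cooperate:
        return 3 if oppAction == cooperate else 0
    return 5 if oppAction == cooperate else 1

def rangeReward(beg, end, myHistory, oppHistory):
    """Total reward earned in rounds beg .. end-1 (0-based indices into the histories)."""
    total = 0
    for i in range(beg, end):
        total += reward(myHistory[i], oppHistory[i])
    return total
```

For instance, rangeReward(0, 2, [defect, cooperate], [cooperate, cooperate]) adds 5 for the first round and 3 for the second.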
To evaluate the strategies, you will implement the logic that simulates a game between two strategies. For each strategy, the program will simulate the game for a specified number of rounds (provided via a command-line argument) against each of the other strategies and calculate the total rewards. The total rewards for each strategy will be output at the end of the game.
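The simulation logic might be organized along the following lines. This is a sketch under stated assumptions, not the instructor's implementation: the names REWARDS, playMatch, and runTournament are hypothetical, and the choice to play each pair of distinct strategies exactly once (with no self-play) is inferred from the sample output, where alwaysDefect scores 20 = 4 × 5 in the two-strategy, four-round run.

```python
defect = 0
cooperate = 1

# (action of A, action of B) -> (coins for A, coins for B)
REWARDS = {
    (cooperate, cooperate): (3, 3),
    (cooperate, defect): (0, 5),
    (defect, cooperate): (5, 0),
    (defect, defect): (1, 1),
}

def strategy_alwaysCooperate(myHistory, oppHistory):
    return cooperate

def strategy_alwaysDefect(myHistory, oppHistory):
    return defect

def playMatch(strategyA, strategyB, numIterations):
    """Play one match and return the total coins of each player."""
    historyA, historyB = [], []
    totalA = totalB = 0
    for _ in range(numIterations):
        # Both players decide from the history before this round.
        actionA = strategyA(historyA, historyB)
        actionB = strategyB(historyB, historyA)
        historyA.append(actionA)
        historyB.append(actionB)
        rewardA, rewardB = REWARDS[(actionA, actionB)]
        totalA += rewardA
        totalB += rewardB
    return totalA, totalB

def runTournament(strategies, numIterations):
    """Play every pair of distinct strategies once and sum the rewards per strategy."""
    totals = [0] * len(strategies)
    for i in range(len(strategies)):
        for j in range(i + 1, len(strategies)):
            rewardI, rewardJ = playMatch(strategies[i], strategies[j], numIterations)
            totals[i] += rewardI
            totals[j] += rewardJ
    return totals

strategies = [strategy_alwaysCooperate, strategy_alwaysDefect]
for strategy, total in zip(strategies, runTournament(strategies, 4)):
    # Strip the "strategy_" prefix to print the bare strategy name.
    print(f"{strategy.__name__[len('strategy_'):]}: {total}")
```

Note that each strategy function is called with its own history first, so the two histories are swapped for the second player.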
Your program should accept two optional command-line arguments:
• num_of_iterations: Specifies the number of rounds played between two strategies.
• num_of_strategies: Specifies the number of strategies to evaluate (starting from the top of the strategy list).
Default values:
• num_of_iterations = 2000
• num_of_strategies = 8
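A minimal sketch of the argument handling, assuming the two values are given positionally in the order shown in the sample runs (iterations first, then strategies). The function names parseArguments and main are hypothetical, and no validation of bad input is attempted here.

```python
import sys

DEFAULT_ITERATIONS = 2000
DEFAULT_STRATEGIES = 8

def parseArguments(argv):
    """Return (num_of_iterations, num_of_strategies), using defaults for missing values."""
    numIterations = int(argv[0]) if len(argv) > 0 else DEFAULT_ITERATIONS
    numStrategies = int(argv[1]) if len(argv) > 1 else DEFAULT_STRATEGIES
    return numIterations, numStrategies

def main(argv=None):
    # In the real script, argv would default to the command-line arguments.
    if argv is None:
        argv = sys.argv[1:]
    numIterations, numStrategies = parseArguments(argv)
    print(f"num_of_iterations = {numIterations}, num_of_strategies = {numStrategies}")
    return numIterations, numStrategies

# Example call with explicit arguments, equivalent to "python3 program.py 4 2".
main(["4", "2"])
```

In an actual submission, main() would be called without arguments at the bottom of the file so that it reads sys.argv.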
The program can be executed as follows (you can use this output as test cases):
<linprog7:318> python3 sample_assignment1.py 4 2
num_of_iterations = 4, num_of_strategies = 2

alwaysCooperate: 0
alwaysDefect: 20
<linprog7:319> python3 sample_assignment1.py 4 3
num_of_iterations = 4, num_of_strategies = 3

alwaysCooperate: 0
alwaysDefect: 24
probeAndLock: 24
<linprog7:320> python3 sample_assignment1.py 4 4
num_of_iterations = 4, num_of_strategies = 4

alwaysCooperate: 3
alwaysDefect: 32
probeAndLock: 32
continuousProbe: 24
<linprog7:321> python3 sample_assignment1.py 4 5
num_of_iterations = 4, num_of_strategies = 5

alwaysCooperate: 12
alwaysDefect: 36
probeAndLock: 36
continuousProbe: 35
defectUntilCooperate: 28
<linprog7:322> python3 sample_assignment1.py 4 6
num_of_iterations = 4, num_of_strategies = 6

alwaysCooperate: 21
alwaysDefect: 40
probeAndLock: 40
continuousProbe: 46
defectUntilCooperate: 32
opponentCooperate10Percentage: 32
<linprog7:323> python3 sample_assignment1.py 4 7
num_of_iterations = 4, num_of_strategies = 7

alwaysCooperate: 30
alwaysDefect: 44
probeAndLock: 44
continuousProbe: 49
defectUntilCooperate: 36
opponentCooperate10Percentage: 36
opponentCooperate50Percentage: 38
<linprog7:324> python3 sample_assignment1.py 4 8
num_of_iterations = 4, num_of_strategies = 8

alwaysCooperate: 39
alwaysDefect: 48
probeAndLock: 48
continuousProbe: 52
defectUntilCooperate: 40
opponentCooperate10Percentage: 40
opponentCooperate50Percentage: 42
opponentCooperate90Percentage: 42
<linprog7:326> python3 sample_assignment1.py
num_of_iterations = 2000, num_of_strategies = 8

alwaysCooperate: 24051
alwaysDefect: 22084
probeAndLock: 29792
continuousProbe: 30096
defectUntilCooperate: 19970
opponentCooperate10Percentage: 21964
opponentCooperate50Percentage: 18086
opponentCooperate90Percentage: 18086

Submission: Name your program lastname_firstinitial_assignment1.py and submit on Canvas.

Grading (60 points total):
• A program gets at most 6 points if the program generates a runtime error.
• Include basic header (template at the course website) for assignment, name your program lastname_firstinitial_assignment1.py, follow the naming specification for the strategies, and include the sample code segment for the implementation of the probeAndLock strategy (10 points)
• 4 iterations test case 2 strategies (20 points)
• 4 iterations test case 3 strategies (3 points)
• 4 iterations test case 4 strategies (3 points)
• 4 iterations test case 5 strategies (3 points)
• 4 iterations test case 6 strategies (3 points)
• 4 iterations test case 7 strategies (3 points)
• 4 iterations test case 8 strategies (3 points)
• 2000 iterations test case 8 strategies (3 points)
• Correct random strategy (3 points)
• Your own strategy is original and works correctly (3 points). Changing a threshold or parameter value of the 9 strategies will not be considered as original. Your strategy can be very similar to one of the 9 strategies, but it must do something differently.
• Your own strategy is the best among the 10 strategies with 2000 iterations per match (3 points)
• Extra points: We will merge all of the original strategies developed by students in this assignment along with the first 8 pre-defined strategies (random50 will not be included). We will then have a competition among these strategies using 1000 iterations per pair of strategies. Bonus points will be awarded to the top 3 strategies:
o 1st place: +20 extra points
o 2nd place: +10 extra points
o 3rd place: +5 extra points
In the case of a tie, the tied strategies will share the corresponding bonus points equally. For example, if 10 strategies tie for the 1st place, each will receive (20 + 10 + 5) / 10 = 3.5 bonus points.
• Extra points: The first person who reports a bug in the sample program (which produced the test results in this assignment description) will get 3 extra points.
Note:
• This program is longer and more challenging than I would like it to be. Start early.
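For the bonus competition, the tie rule in the grading section can be read as follows: a group of tied strategies pools the bonus points for the places it occupies and splits them equally. The sketch below shows that reading only; the function name bonusPoints and the handling of ties that extend past 3rd place are assumptions, not something the grading section spells out beyond its single example.

```python
def bonusPoints(scores, prizes=(20, 10, 5)):
    """Map each strategy name to its bonus, sharing prizes equally among ties.

    scores: dict of strategy name -> total coins in the competition.
    """
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    bonuses = {name: 0.0 for name in scores}
    position = 0
    while position < len(ranked) and position < len(prizes):
        tiedScore = ranked[position][1]
        tied = [name for name, score in ranked if score == tiedScore]
        # The tied group occupies places position .. position + len(tied) - 1;
        # places beyond the prize list contribute nothing to the pool.
        pool = sum(prizes[position:position + len(tied)])
        for name in tied:
            bonuses[name] = pool / len(tied)
        position += len(tied)
    return bonuses

# Ten strategies tied for 1st place, as in the example from the grading section.
tenWayTie = {f"s{i}": 100 for i in range(10)}
print(bonusPoints(tenWayTie)["s0"])  # 3.5
```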
