CIVE285 Coursework 2
Background
Santander Cycles is a public bike-hiring scheme in London. You can hire a bike from one of the 750+ docking stations (Figure 1), and return the bike to any docking station.
Figure 1. Santander Cycles docking station (https://www.geograph.org.uk/photo/4598838)
Transport for London has recorded all the bike-hiring journeys of Santander Cycles since September 2015, and made the data publicly available at https://cycling.data.tfl.gov.uk/.
Data
The attached file bike_hiring.txt contains the records of bike-hiring journeys from a certain day. Each line in the file corresponds to one bike-hiring journey and contains four characteristics of that journey. The four characteristics are separated by commas and appear in the following order:
Duration of the journey (in seconds, an integer);
The ID of the bike used for the journey (an integer);
The name of the docking station that the journey started from; and
The name of the docking station that the journey ended at. Note that each physical docking station has a unique name.
For example, the first line of the file is
180,5784,Pott Street,Granby Street
which means:
The duration of the journey is 180 seconds,
The ID of the bike used is 5784,
The journey started from Pott Street docking station, and
The journey ended at Granby Street docking station.
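The record format described above can be illustrated with a minimal sketch. The function name parse_record is hypothetical, and the sketch assumes that every line has exactly four fields and that station names contain no commas.

```python
def parse_record(line):
    """Split one comma-separated journey record into typed fields."""
    # Assumption: exactly four fields, and no commas inside station names.
    duration, bike_id, start, end = line.strip().split(",")
    return int(duration), int(bike_id), start, end


record = parse_record("180,5784,Pott Street,Granby Street\n")
print(record)  # (180, 5784, 'Pott Street', 'Granby Street')
```

Converting the first two fields to int at this stage means later calculations can treat them as numbers rather than text.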
Tasks
Create a Python programme to read and analyse the data in the bike_hiring.txt file to complete the following four tasks/questions. You can use any built-in modules or third-party packages in Python.
1. Plot a histogram (using 15 bins) for the duration of all bike-hiring journeys in the file.
2. How many bike-hiring journeys are recorded in the file, and what is the mean value (in seconds, rounded to integer) of the journey duration?
3. How many different bike IDs are recorded in the file? If we assume that each bike has a unique ID, this question is equivalent to asking how many different bikes were used on that day.
4. In terms of the number of journeys that started from each docking station, what are the two most popular starting stations, and how many journeys started from each of them? Assume that there is only one most popular station and only one second-most popular station.
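As a rough illustration of the quantities asked for in Tasks 2 and 3, the sketch below works on a small made-up sample rather than the real file. The sample values and variable names are assumptions, and plotting the histogram for Task 1 is not covered.

```python
# Hypothetical sample records: (duration in seconds, bike ID).
sample = [(180, 5784), (240, 1021), (600, 5784), (300, 77)]

durations = [d for d, _ in sample]
bike_ids = [b for _, b in sample]

n_journeys = len(durations)
# Note: round() sends exact halves to the nearest even integer.
mean_duration = round(sum(durations) / n_journeys)
# A set keeps only one copy of each ID, so its size is the number of distinct IDs.
n_bikes = len(set(bike_ids))

print(n_journeys, mean_duration, n_bikes)  # 4 330 3
```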
Your programme should print the information to the console in the format given by the template (see next page), where each [how many] and [what station] should be replaced by the results from your programme (do not print the square brackets). For Task 4, the most popular station should be printed first, followed by the second-most popular station.
Your programme should work not only for the attached file, but also for other files that have the same file name and contain the same type of data in the same format. We will use the journey records of a different day to test your programme.
# Template of your programme output
[how many] journeys are recorded in the file.
The mean value of the journey duration is [how many] seconds.
[how many] different bike IDs are recorded in the file.
Most popular: [how many] journeys started from [what station].
Second-most popular: [how many] journeys started from [what station].
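One possible way to fill in the template is with Python f-strings, sketched below. All values and variable names are hypothetical placeholders, and the placement of line breaks should follow the template itself.

```python
# Hypothetical results, used only to show the formatting.
n_journeys = 4
mean_duration = 330
n_bikes = 3
# (station, journeys) pairs, most popular first.
top = [("Pott Street", 2), ("Granby Street", 1)]

lines = [
    f"{n_journeys} journeys are recorded in the file.",
    f"The mean value of the journey duration is {mean_duration} seconds.",
    f"{n_bikes} different bike IDs are recorded in the file.",
    f"Most popular: {top[0][1]} journeys started from {top[0][0]}.",
    f"Second-most popular: {top[1][1]} journeys started from {top[1][0]}.",
]
print("\n".join(lines))
```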
Additional hint
For Task 4, the .index() operation below might be useful.
Assume L is a list and x is some value. If x is an element of L, L.index(x) will return the index of the first occurrence of x in L. This is illustrated by the following example:
Name = ['Dave', 'Ann', 'Emily', 'Jack', 'Ann']
a = Name.index('Ann')
print(a)  # prints 1, the index of the first 'Ann'
If x is not an element of L, L.index(x) will raise an error (a ValueError). This is illustrated by the following example:
Name = ['Dave', 'Ann', 'Emily', 'Jack', 'Ann']
b = Name.index('Andy')  # raises ValueError because 'Andy' is not in the list
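To show how .index() could support Task 4, here is a minimal sketch that keeps two parallel lists, one of station names and one of journey counts. The sample data (including the station "Hyde Park") and all variable names are hypothetical. It is one possible approach among several, and it relies on the stated assumption that there are no ties.

```python
# Hypothetical start stations, one entry per journey.
starts = ["Pott Street", "Granby Street", "Pott Street",
          "Hyde Park", "Pott Street", "Hyde Park"]

names = []   # each distinct station name, in order of first appearance
counts = []  # counts[i] is the number of journeys that started from names[i]
for station in starts:
    if station in names:
        counts[names.index(station)] += 1  # index() locates the station
    else:
        names.append(station)
        counts.append(1)

# Most popular: the position of the largest count.
first = counts.index(max(counts))
first_name, first_count = names[first], counts[first]

# Second-most popular: repeat the search on copies without the winner.
rest_names = names[:first] + names[first + 1:]
rest_counts = counts[:first] + counts[first + 1:]
second = rest_counts.index(max(rest_counts))
second_name, second_count = rest_names[second], rest_counts[second]

print(first_name, first_count)    # Pott Street 3
print(second_name, second_count)  # Hyde Park 2
```

Checking `station in names` before calling `names.index(station)` avoids the ValueError raised for a missing element. The standard-library class collections.Counter, with its most_common() method, offers a shorter alternative.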
Submission instruction
You need to submit your programme and a report.
Your programme should be placed in a .py file named
cw2_xxxx.py
where xxxx is your full Student ID number. Your code should include detailed comments to help the readers understand your code.
The report should be in .pdf format. The file name of your report should be in the format of Student-ID_last-name_first-name.pdf
with no spaces, e.g., 123456789_Ye_Hongbo.pdf.
Report requirement
The report must not exceed 3 pages.
It does not matter whether you have a cover page or not. The cover page will not be counted in the page limit.
Indicate your name and student ID on the cover/first page of your report.
The report should cover:
Aims: The aims of the coursework.
Programme design: How the programme is designed and structured, including but not limited to the key variables and key processes used (such as conditionals, loops, functions, modules, packages, and so on).
Output of your programme.
Discussions: Discussions on the results, the coding and debugging process, and anything
you learnt from the process of finishing the coursework.
Appendix (not counted in the page limit): The code (including the comments) in your
submitted .py file should be copied to the end of the report. When the code is copied into your report, it is alright if a line of code is too long and occupies more than one line in the report.
Marking scheme
This coursework has a value of 15 points.
10 points are on the code and the programme execution.
5 points are on the report.
Any of the following will result in 0 marks for the coursework:
Not attaching your code to the end of the report, or not attaching it as text.
The code attached to the end of the report is not the same as the code in the .py file
submitted.
Exceeding the page limit.
Both the report and the code will be submitted as Turnitin assignments, so they will be checked for plagiarism. Do not copy code from your classmates.