COMP1730/COMP6730 Programming for Scientists
Data science

Announcements
Example questions for the mid-semester exam are available on the course web site.
Example solutions for selected lab exercises are also available online.
Homework 3 will be checked in this week's lab.
Homework 4 is now available and is due on Monday of week 8 instead of week 7.
There will be a lecture tomorrow on debugging.

Data analysis
Reading data files
Representing tables
Working with data: selecting, visualising
Interpretation

Working example
The table shows how often each model fits best to each test data set.
We want to answer: which model is the best?
Question: which Python data type can we use to process tables?

NumPy Arrays
numpy.ndarray is a sequence type, and can also represent n-dimensional arrays (tables).
Fast math operations on arrays/matrices.
Plotting via matplotlib.
All values in an array must be of the same type.
Element-wise operators and functions on arrays.
Read/write functions for some file formats.
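
As a small illustration of the points above, the sketch below builds two arrays and applies element-wise operators and a function to them. The array contents and variable names are made up for this example; they are not part of the course data.

```python
import numpy as np

# Two small 1-dimensional arrays (illustrative values only).
a = np.array([1, 2, 3, 4])
b = np.array([10, 20, 30, 40])

total = a + b        # element-wise addition
scaled = a * 2.5     # element-wise multiplication; result is a float array
roots = np.sqrt(a)   # functions are applied to every element

# All values share one type: mixing int and float gives a float array.
mixed = np.array([1, 2.5, 3])

print(total, scaled, roots, mixed.dtype)
```

Note that no loop is needed: the operator is applied to every element of the array at once.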

Data files
There are many data file formats (e.g., Excel, CSV, JSON, binary). We'll use the following CSV file.
Model,test1,test2,test3,test4,test5,test6,test7,test8
1,40,571,353,9,95,41,1428,350
2,16,200,108,2,495,434,88,0
3,7,352,216,9,1201,1897,9,0
4,10,187,202,280,704,215,47,0
5,52,616,204,2,47,17,122,5
6,4,147,146,0,3646,536,0,0
7,80,914,373,4,45,2,161,60
8,67,406,778,1,9,2,3,30
9,52,635,303,1,5,0,5,860
10,121,712,595,0,19,0,1,53
11,51,1914,449,0,29,18,4,50

Reading data files with NumPy
import numpy as np
data = np.genfromtxt("data.csv",
                     dtype=int, delimiter=",", skip_header=1)
NOTE: NumPy's int is more limited (64-bit) than Python's int.
More about reading and writing files later in the course.
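
The call above can be tried without a file on disk. The sketch below assumes the first three rows of the CSV file shown earlier and feeds them to np.genfromtxt through io.StringIO, which stands in for data.csv here; that substitution is only for illustration.

```python
import io
import numpy as np

# First three data rows of the CSV file, held in a string instead of data.csv.
csv_text = """Model,test1,test2,test3,test4,test5,test6,test7,test8
1,40,571,353,9,95,41,1428,350
2,16,200,108,2,495,434,88,0
3,7,352,216,9,1201,1897,9,0
"""

# skip_header=1 drops the line of column names; every value is read as an int.
data = np.genfromtxt(io.StringIO(csv_text),
                     dtype=int, delimiter=",", skip_header=1)

print(data.shape)                 # rows and columns read from the text
print(np.iinfo(data.dtype).max)   # largest value this fixed-size int can hold
print(2 ** 100)                   # Python's own int has no such limit
```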

Array operations
A table is stored as a 2-dimensional array:
data[i]: row i
data[i][j]: row i, column j
data[i,j]: row i, column j
Indexing an n-d array returns an (n-1)-d array.
data.ndim returns number of dimensions.
data.shape returns the dimensions of the array as a tuple.
data.dtype returns the type of elements.
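
A minimal sketch of these operations, assuming a small 3-by-3 table taken from the first columns of the CSV data:

```python
import numpy as np

# Columns: Model, test1, test2 (first three rows of the table).
data = np.array([[1, 40, 571],
                 [2, 16, 200],
                 [3, 7, 352]])

row = data[1]        # indexing a 2-d array gives a 1-d array (row 1)
value = data[1, 2]   # row 1, column 2
same = data[1][2]    # the same element, using two indexing steps

print(row, value, same)
print(data.ndim, data.shape, data.dtype)
```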

Slicing
data[i,:]: row i
data[:,k]: column k
data[i:j]: rows i, i+1, ..., j-1
data[:,k:l]: columns k, k+1, ..., l-1
data[i:j,k:l]: rows i, ..., j-1 and columns k, ..., l-1
NOTE: Slicing returns a view (a reference) of the data. What happens with the following?
b = data[1:4,3:5]
b[0,0] = 100
b[:] = 100
To copy the data: b = data.copy()
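
The question above can be checked directly. The sketch below uses a made-up 5-by-6 array in place of the course data; it shows that writing through a slice changes the original array, while writing to a copy does not.

```python
import numpy as np

# Illustrative array with values 0..29 arranged in 5 rows and 6 columns.
data = np.arange(30).reshape(5, 6)

b = data[1:4, 3:5]   # a view: b shares memory with data
b[0, 0] = 100        # also changes data[1, 3]
changed_one = data[1, 3]

b[:] = 100           # assigns to every element of the view, in place
slice_after = data[1:4, 3:5].copy()

c = data[1:4, 3:5].copy()   # an independent copy
c[:] = -1                   # data is not affected

print(changed_one)
print(data)
```

By contrast, writing b = 100 would only rebind the name b to the integer 100 and leave data untouched.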

Descriptive statistics
np.min(data) or data.min()
np.max(data) or data.max()
np.mean(data) or data.mean()
np.median(data)
Column-wise statistics: np.min(data,0) or data.min(0)
Row-wise statistics: np.min(data,1) or data.min(1)
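
A short sketch of these functions, assuming the counts for models 1 to 3 from the CSV file with the Model column left out. Using np.argmax to find which row holds the largest count in each column is an addition for this example, as one possible step towards the question of which model is best.

```python
import numpy as np

# Rows: models 1-3; columns: test1..test8 (Model column omitted).
counts = np.array([[40, 571, 353, 9, 95, 41, 1428, 350],
                   [16, 200, 108, 2, 495, 434, 88, 0],
                   [7, 352, 216, 9, 1201, 1897, 9, 0]])

overall_min = counts.min()       # over the whole table
overall_max = np.max(counts)

col_min = np.min(counts, 0)      # one value per column (per test)
row_max = counts.max(1)          # one value per row (per model)
row_mean = counts.mean(1)

# Row index of the largest count in each column.
best_row = np.argmax(counts, 0)

print(overall_min, overall_max)
print(col_min, row_max, row_mean, best_row)
```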

Visualisation
The purpose of visualisation is to see or show information, not to draw pretty pictures!
Different kinds of plots show different things:
bar plot
pie chart
histogram or cumulative distribution
scatter plot
line and area plot
Use one that best makes the point!
Choose your dimensions carefully.
Label axes, lines, etc.

Matplotlib
Matplotlib is a Python 2D plotting library, which produces publication quality figures.
Matplotlib makes easy things easy and hard things possible.
Documentation: matplotlib.org

Using matplotlib
import matplotlib.pyplot as plot
plot.bar(data[:,0], data[:,1])
plot.xlabel("Model")
plot.ylabel("Best frequency")
plot.show()
plot.pie(data[:,1], labels=data[:,0],
         autopct="%1.1f%%")
plot.show()
Documentation: matplotlib.org
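
The bar plot above can also be written so that it runs without a display. The sketch below is one way to do that, under some assumptions: it selects the non-interactive "Agg" backend, uses only models 1 to 3 with their test1 counts, and saves the figure to a temporary file (the name best_frequency.png is made up) instead of calling plot.show().

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")   # assumption: no display available, draw to a file
import matplotlib.pyplot as plot
import numpy as np

# Columns: Model, test1 (first three rows of the CSV data).
data = np.array([[1, 40],
                 [2, 16],
                 [3, 7]])

fig, ax = plot.subplots()
bars = ax.bar(data[:, 0], data[:, 1])
ax.set_xlabel("Model")
ax.set_ylabel("Best frequency (test1)")
ax.set_xticks(data[:, 0])   # one tick per model number

# Hypothetical output location, chosen only for this example.
out_path = os.path.join(tempfile.mkdtemp(), "best_frequency.png")
fig.savefig(out_path)
plot.close(fig)
```

In an interactive session, plot.show() can be used in place of fig.savefig as in the slide code.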

Interpretation
What is this telling us?

Take-home message
Python is powerful for data analysis.
Think carefully about visualisation: how can people quickly interpret the results?
We have only scratched the surface of NumPy and Matplotlib. Extensive documentation: https://www.numpy.org and https://matplotlib.org.
Just google it!
