Q1: Pick a real-world network dataset with more than 100 nodes from https://snap.stanford.edu/data/index.html
Represent the network both as an adjacency matrix and as an edge list.
Briefly describe the dataset chosen and report the following:
- Number of Nodes
- Number of Edges
- Average in-degree
- Average out-degree
- Node with max in-degree
- Node with max out-degree
- Density of the network
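The two representations and the statistics listed above could be sketched in plain Python as follows. This is only an illustration under stated assumptions: the toy `edges` list is hypothetical and stands in for a parsed SNAP file, the graph is treated as directed with no self-loops or duplicate edges, and density is taken as edges / (n * (n - 1)).

```python
# Hypothetical toy edge list of (source, target) pairs; a real SNAP file
# would first be parsed into this form.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]

# Map node IDs to contiguous matrix indices.
nodes = sorted({node for edge in edges for node in edge})
index = {node: i for i, node in enumerate(nodes)}
n = len(nodes)

# Adjacency matrix: adj[i][j] == 1 means an edge from nodes[i] to nodes[j].
adj = [[0] * n for _ in range(n)]
for u, v in edges:
    adj[index[u]][index[v]] = 1

num_edges = sum(sum(row) for row in adj)
out_degree = [sum(row) for row in adj]                            # row sums
in_degree = [sum(adj[i][j] for i in range(n)) for j in range(n)]  # column sums

avg_in_degree = sum(in_degree) / n
avg_out_degree = sum(out_degree) / n
max_in_node = nodes[max(range(n), key=lambda j: in_degree[j])]
max_out_node = nodes[max(range(n), key=lambda i: out_degree[i])]

# Assumed definition for a directed graph without self-loops.
density = num_edges / (n * (n - 1))

print(n, num_edges, avg_in_degree, avg_out_degree, max_in_node, max_out_node, density)
```

In a directed graph every edge contributes one in-degree and one out-degree, so the two averages are always equal; that makes a quick sanity check on the parsed data.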
Further, perform the following tasks:
- Plot degree distribution of the network
- Calculate the clustering coefficient of each node
- Compute any one centrality measure for each node
NOTE: You are not allowed to use any library for this question.
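A library-free sketch of the three tasks above might look like the following. Several choices here are assumptions rather than requirements of the question: edge direction is ignored when computing the clustering coefficient, degree centrality is picked as the one centrality measure, the toy `edges` list is hypothetical, and the degree distribution is printed as a text histogram on the assumption that plotting libraries are ruled out as well.

```python
from collections import Counter  # standard library only

# Hypothetical toy edge list; replace with the parsed dataset.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]

# Undirected neighbour sets (direction ignored, self-loops dropped).
neighbors = {}
for u, v in edges:
    if u != v:
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

def clustering(node):
    """Local clustering coefficient: fraction of neighbour pairs that are linked."""
    nbrs = neighbors[node]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a in nbrs for b in nbrs if a < b and b in neighbors[a])
    return 2 * links / (k * (k - 1))

clustering_coeff = {node: clustering(node) for node in neighbors}

# Degree centrality: degree divided by the maximum possible degree (n - 1).
n = len(neighbors)
degree_centrality = {node: len(nbrs) / (n - 1) for node, nbrs in neighbors.items()}

# Degree distribution: how many nodes have each degree.
degree_counts = Counter(len(nbrs) for nbrs in neighbors.values())
for degree in sorted(degree_counts):
    print(f"{degree:>3} | {'#' * degree_counts[degree]}")
```

The pairwise loop in `clustering` is quadratic in a node's degree, which is fine for a subsampled network but may be slow for hubs in a full SNAP dataset.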
Q2: For the dataset chosen in the above question, calculate the following:
- PageRank score for each node
- Authority and Hub score for each node
Compare the results obtained from the two parts.
NOTE: You CAN use libraries like networkx (https://networkx.github.io/) to solve this question.
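With networkx, the `pagerank` and `hits` functions cover both parts directly. To show what those scores compute, here is a minimal power-iteration sketch in plain Python. It is not networkx's implementation: the damping factor of 0.85, the fixed iteration count, the even redistribution of rank from nodes without out-links, and the toy `edges` list are all assumptions made for illustration.

```python
# Hypothetical toy edge list; replace with the (subsampled) dataset.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]

nodes = sorted({node for edge in edges for node in edge})
out_links = {u: [] for u in nodes}
in_links = {u: [] for u in nodes}
for u, v in edges:
    out_links[u].append(v)
    in_links[v].append(u)

def pagerank(damping=0.85, iterations=100):
    """Power iteration; rank of nodes with no out-links is spread evenly."""
    n = len(nodes)
    rank = {u: 1.0 / n for u in nodes}
    for _ in range(iterations):
        dangling = sum(rank[u] for u in nodes if not out_links[u])
        rank = {
            v: (1 - damping) / n
            + damping * (dangling / n
                         + sum(rank[u] / len(out_links[u]) for u in in_links[v]))
            for v in nodes
        }
    return rank

def hits(iterations=100):
    """Alternate authority and hub updates, normalising each to sum to 1."""
    hub = {u: 1.0 for u in nodes}
    auth = {u: 1.0 for u in nodes}
    for _ in range(iterations):
        auth = {v: sum(hub[u] for u in in_links[v]) for v in nodes}
        total = sum(auth.values())
        auth = {v: a / total for v, a in auth.items()}
        hub = {u: sum(auth[v] for v in out_links[u]) for u in nodes}
        total = sum(hub.values())
        hub = {u: h / total for u, h in hub.items()}
    return hub, auth

pr = pagerank()
hub, auth = hits()
for u in nodes:
    print(u, round(pr[u], 4), round(auth[u], 4), round(hub[u], 4))
```

On this toy graph, PageRank and the authority score both rank node 2 first, while node 0 has the highest hub score; a side-by-side table like the one printed here is one simple way to carry out the comparison.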
For both questions, you are allowed to subsample the dataset so that it can be processed on your machine. Use an approach such as a random walk to subsample the nodes, so that the resulting network is connected.
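One possible way to do the random-walk subsampling is sketched below. The function name `random_walk_sample`, the restart probability, the step limit, and the toy data are all hypothetical choices. The walk runs on undirected neighbour sets, so the sample is connected when edge direction is ignored.

```python
import random

def random_walk_sample(neighbors, target_size, seed=0, restart_prob=0.15):
    """Collect nodes visited by a random walk with restarts to the start node.

    Every new node is reached over an edge from an already visited node,
    so the visited set induces a connected (undirected) subgraph.
    """
    rng = random.Random(seed)
    start = rng.choice(sorted(neighbors))
    current = start
    visited = {start}
    max_steps = 100 * target_size  # assumed safety limit for small components
    for _ in range(max_steps):
        if len(visited) >= target_size:
            break
        if rng.random() < restart_prob or not neighbors[current]:
            current = start  # restart to avoid getting stuck far from the start
        else:
            current = rng.choice(sorted(neighbors[current]))
            visited.add(current)
    return visited

# Hypothetical toy data: directed edges plus undirected neighbour sets.
edges = [(0, 1), (0, 2), (1, 2), (2, 0), (3, 2)]
neighbors = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}

sample = random_walk_sample(neighbors, target_size=3)
# Keep only the edges whose endpoints were both sampled.
sub_edges = [(u, v) for u, v in edges if u in sample and v in sample]
print(sorted(sample), sub_edges)
```

If the walk starts in a component smaller than `target_size`, the function returns fewer nodes than requested, so it is worth checking the size of the result and retrying with a different seed when needed.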