Assignment 1 (CS 4884: Computing the Brain)
Deadline: 11:59pm, February 17, 2022

The goal of this assignment is to determine if different brain networks have the small world property in order to replicate the results in some of the papers we will discuss in class. You will have to write code for this assignment. You can use any programming language you want. Feel free to use matrix or graph libraries that already implement helpful classes and functions, unless I instruct you otherwise. Work on this assignment individually.

Your code will have bugs. It may take a long time to run. Therefore, start well in advance of the deadline!

Analysis of brain connectomes

You will analyze specific brain connectomes for their small-world properties. The networks you will consider are the following. The first five networks are from the Brain Connectivity Toolbox:

  • The weighted and directed interareal connectome for the macaque cerebral cortex used in Cortical High-Density Counterstream Architectures.
    • This dataset is formed by collating tract tracing measurements from multiple subjects, identified in the CASE and MONKEY columns. The first column (CASE) should change if and only if the second column (MONKEY) changes. In other words, each case is a different monkey.
    • For a fixed CASE/MONKEY, the TARGET column should have a single value because the experiments are based on retrograde tracers: we inject a tracer in the target region and the tracer moves back through the nerves to the source region. More than one monkey may have the same target region, in which case the same edge (SOURCE, TARGET pair) may appear for more than one monkey. In that case, I suggest that you average the FLNe values (the edge weights).
    • Another problem you need to solve yourself is the following. The FLNe values lie between 0 and 1 and behave like probabilities: the higher the FLNe value, the stronger the connection. In a good path between two regions of the brain, we therefore want the sum of the FLNe values to be as large as possible, akin to computing the longest path. In a shortest path, however, we want the sum of the edge weights to be as small as possible. These two goals conflict. You will have to figure out a way to address this issue. Explain your strategy in your write-up.
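
The averaging step suggested above could be sketched as follows in Python. This is only an illustration under stated assumptions: it assumes the dataset has been exported as a comma-separated file with a header row, the function name `average_flne` is hypothetical, and the sample rows are made up rather than taken from the real data.

```python
import csv
import io
from collections import defaultdict


def average_flne(rows):
    """Average FLNe over all rows that share the same (SOURCE, TARGET) pair.

    `rows` is any iterable of dicts with SOURCE, TARGET and FLNe keys,
    for example a csv.DictReader (assumed file layout, see lead-in).
    """
    values = defaultdict(list)
    for row in rows:
        values[(row["SOURCE"], row["TARGET"])].append(float(row["FLNe"]))
    return {edge: sum(v) / len(v) for edge, v in values.items()}


# Made-up rows for illustration only; these are NOT values from the dataset.
sample = io.StringIO(
    "CASE,MONKEY,SOURCE,TARGET,FLNe\n"
    "1,M1,V2,V1,0.50\n"
    "2,M2,V2,V1,0.25\n"
    "2,M2,V4,V1,0.125\n"
)
edges = average_flne(csv.DictReader(sample))
print(edges)
```

The result maps each (SOURCE, TARGET) pair to one averaged weight, so an edge that was measured in two monkeys appears only once in the graph.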
  1. For each network, write a function to read the corresponding file into a data structure. You may use existing libraries and packages, e.g., the NetworkX library in Python, to store the network. Some libraries may also have their own functions to read in a graph from a file, although you must be careful to ensure that the format of the file is supported by the function. Here are some points to keep in mind as you implement this function.
    • The graph may be directed or undirected and each edge may or may not have a weight; this weight is usually in the third column of the file.
    • You can assume that a graph is undirected if every edge \((u,v)\) appears twice in the file, once with \(u\) in the first column and \(v\) in the second column and once with the order flipped, provided that the edge weight is the same in both appearances of the edge.
    • Your function to read the graph must include tests to determine if the graph is directed or undirected and unweighted or weighted.
    • Parsing the final dataset and creating a graph from it will require some thought and specific decisions. Feel free to discuss these decisions on the class mailing list on Canvas.
  2. After reading in the network, compute two properties: the average shortest path length and the average clustering coefficient. Implement these functions with your own code, even if the graph library you use contains these functions. The one exception to this rule is that you may use a library function to compute the shortest path between a pair of nodes or from one node to every other node. Recall that to compute the average shortest path length in an undirected graph with \(n\) nodes, you must compute the length of the shortest path between all \(\binom{n}{2}\) pairs of nodes and then calculate the average of these values. Similarly, you must compute the clustering coefficient of every node in the graph and then take the average of these values.
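
The directed/undirected and weighted/unweighted tests described in step 1 could look roughly like the following Python sketch. The function name `classify_edge_list` and the in-memory `(u, v, w)` tuple format are hypothetical. It also makes one assumption that you may want to change: a graph counts as weighted only if its edges carry at least two distinct weights, so a file whose third column is all ones is treated as unweighted.

```python
def classify_edge_list(edges):
    """Return (is_directed, is_weighted) for a list of (u, v, w) tuples.

    w is None when the file has no weight column.
    """
    weight = {(u, v): w for u, v, w in edges}

    # Undirected only if every edge also appears flipped with the same weight.
    is_directed = any(
        (v, u) not in weight or weight[(v, u)] != w
        for (u, v), w in weight.items()
    )

    # Assumption: "weighted" means at least two distinct weights are present.
    is_weighted = len(set(weight.values())) > 1
    return is_directed, is_weighted


# Toy edge lists for illustration only.
symmetric = [("a", "b", 1.0), ("b", "a", 1.0), ("b", "c", 1.0), ("c", "b", 1.0)]
asymmetric = [("a", "b", 0.2), ("b", "a", 0.7)]
one_way = [("a", "b", None)]

print(classify_edge_list(symmetric))
print(classify_edge_list(asymmetric))
print(classify_edge_list(one_way))
```

Note that the second toy graph is reported as directed even though both orientations of the edge are present, because the two weights differ.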

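For reference, the two quantities in step 2 can be written as follows for an unweighted, undirected, connected graph with \(n\) nodes. These are the standard textbook definitions; the symbols \(d(u,v)\), \(k_v\) and \(e_v\) are notation introduced here, and setting \(C_v = 0\) for nodes with fewer than two neighbours is one common convention rather than a requirement of this assignment. Directed or weighted graphs need an adapted definition.

```latex
% d(u,v): number of edges on a shortest path between u and v
% k_v:    degree of node v
% e_v:    number of edges between the neighbours of v
\[
  L = \frac{1}{\binom{n}{2}} \sum_{\{u,v\},\, u \neq v} d(u,v),
  \qquad
  C_v = \frac{2\, e_v}{k_v (k_v - 1)},
  \qquad
  C = \frac{1}{n} \sum_{v} C_v .
\]
```

As a quick check, in a triangle every pair of nodes is at distance 1 and every node's two neighbours are connected, so \(L = 1\) and \(C = 1\).
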
Submission via Canvas

Turn in a typeset (not handwritten) report in PDF format describing your results for each graph. Specifically, for each graph, report the following:

  • Is it undirected/directed, weighted/unweighted?
  • How many nodes and how many edges does it contain?
  • What is the average shortest path length?
  • What is the average clustering coefficient?

Mention any difficulties you encountered and how you addressed them. Were there any surprises, i.e., results or trends you did not expect? Finally, write a paragraph on what you learnt from this assignment.