Handling Graphs with NetworkX¶

Python offers the library NetworkX for manipulating graphs. You can learn more here:

https://networkx.github.io/

https://networkx.github.io/documentation/stable/tutorial.html

In [17]:
import networkx as nx
import matplotlib.pyplot as plt

%matplotlib inline

Creating a graph¶

In [18]:
G = nx.Graph()

Add nodes to the graph

In [19]:
G.add_node(1)
G.add_nodes_from([2,3])
G.add_node('Alice')
G.add_node('Bob')
print(G.nodes())
[1, 2, 3, 'Alice', 'Bob']

Draw the graph

In [20]:
nx.draw_networkx(G)
No description has been provided for this image

Add edges to the graph

In [21]:
G.add_edge(1,2)
G.add_edges_from([(1,3),('Alice','Bob')])
e = (1,'Alice')
G.add_edge(*e)
In [22]:
nx.draw_networkx(G)
No description has been provided for this image

Adding an edge with a new node will create the node Charlie in the graph

In [23]:
G.add_edge('Alice','Charlie')
print(G.edges())
print(G.nodes())
nx.draw_networkx(G)
[(1, 2), (1, 3), (1, 'Alice'), ('Alice', 'Bob'), ('Alice', 'Charlie')]
[1, 2, 3, 'Alice', 'Bob', 'Charlie']
No description has been provided for this image

Creating a graph from edges

This is the most common way to create a graph

In [24]:
G2 = nx.Graph()
G2.add_edges_from([(1,2),(1,3),('Alice','Bob'),(1,'Alice')])
print(G2.nodes())
print(G2.edges())
nx.draw_networkx(G2)
[1, 2, 3, 'Alice', 'Bob']
[(1, 2), (1, 3), (1, 'Alice'), ('Alice', 'Bob')]
No description has been provided for this image

Reading a graph from a file with list of edges

https://networkx.org/documentation/stable/reference/readwrite/index.html

In [25]:
#Read a graph from a list of edges
G3 = nx.read_edgelist('graph_edges.txt')
print(G3.nodes())
print(G3.edges())
nx.draw_networkx(G3)
['1', '2', '3', 'Alice', 'Bob', 'Charlie']
[('1', '2'), ('1', '3'), ('1', 'Alice'), ('2', '3'), ('Alice', 'Bob'), ('Bob', 'Charlie')]
No description has been provided for this image

Removing edges and nodes

In [26]:
G2.remove_edge(1,3)
G2.remove_node(3)
G2.remove_node(1)
print(G2.nodes())
print(G2.edges())
nx.draw_networkx(G2)
[2, 'Alice', 'Bob']
[('Alice', 'Bob')]
No description has been provided for this image

The graph object, nodes and edges¶

A graph can be thought of as a dictionary with nodes as the keys

Each node is a dictionary with the neighbors as the keys, and the edge properties as values

In [27]:
type(G)
Out[27]:
networkx.classes.graph.Graph
def __init__(incoming_graph_data=None, **attr)
/usr/local/lib/python3.10/dist-packages/networkx/classes/graph.pyBase class for undirected graphs.

A Graph stores nodes and edges with optional data, or attributes.

Graphs hold undirected edges.  Self loops are allowed but multiple
(parallel) edges are not.

Nodes can be arbitrary (hashable) Python objects with optional
key/value attributes, except that `None` is not allowed as a node.

Edges are represented as links between nodes with optional
key/value attributes.

Parameters
----------
incoming_graph_data : input graph (optional, default: None)
    Data to initialize graph. If None (default) an empty
    graph is created.  The data can be any format that is supported
    by the to_networkx_graph() function, currently including edge list,
    dict of dicts, dict of lists, NetworkX graph, 2D NumPy array, SciPy
    sparse matrix, or PyGraphviz graph.

attr : keyword arguments, optional (default= no attributes)
    Attributes to add to graph as key=value pairs.

See Also
--------
DiGraph
MultiGraph
MultiDiGraph

Examples
--------
Create an empty graph structure (a "null graph") with no nodes and
no edges.

>>> G = nx.Graph()

G can be grown in several ways.

**Nodes:**

Add one node at a time:

>>> G.add_node(1)

Add the nodes from any container (a list, dict, set or
even the lines from a file or the nodes from another graph).

>>> G.add_nodes_from([2, 3])
>>> G.add_nodes_from(range(100, 110))
>>> H = nx.path_graph(10)
>>> G.add_nodes_from(H)

In addition to strings and integers any hashable Python object
(except None) can represent a node, e.g. a customized node object,
or even another Graph.

>>> G.add_node(H)

**Edges:**

G can also be grown by adding edges.

Add one edge,

>>> G.add_edge(1, 2)

a list of edges,

>>> G.add_edges_from([(1, 2), (1, 3)])

or a collection of edges,

>>> G.add_edges_from(H.edges)

If some edges connect nodes not yet in the graph, the nodes
are added automatically.  There are no errors when adding
nodes or edges that already exist.

**Attributes:**

Each graph, node, and edge can hold key/value attribute pairs
in an associated attribute dictionary (the keys must be hashable).
By default these are empty, but can be added or changed using
add_edge, add_node or direct manipulation of the attribute
dictionaries named graph, node and edge respectively.

>>> G = nx.Graph(day="Friday")
>>> G.graph
{'day': 'Friday'}

Add node attributes using add_node(), add_nodes_from() or G.nodes

>>> G.add_node(1, time="5pm")
>>> G.add_nodes_from([3], time="2pm")
>>> G.nodes[1]
{'time': '5pm'}
>>> G.nodes[1]["room"] = 714  # node must exist already to use G.nodes
>>> del G.nodes[1]["room"]  # remove attribute
>>> list(G.nodes(data=True))
[(1, {'time': '5pm'}), (3, {'time': '2pm'})]

Add edge attributes using add_edge(), add_edges_from(), subscript
notation, or G.edges.

>>> G.add_edge(1, 2, weight=4.7)
>>> G.add_edges_from([(3, 4), (4, 5)], color="red")
>>> G.add_edges_from([(1, 2, {"color": "blue"}), (2, 3, {"weight": 8})])
>>> G[1][2]["weight"] = 4.7
>>> G.edges[1, 2]["weight"] = 4

Warning: we protect the graph data structure by making `G.edges` a
read-only dict-like structure. However, you can assign to attributes
in e.g. `G.edges[1, 2]`. Thus, use 2 sets of brackets to add/change
data attributes: `G.edges[1, 2]['weight'] = 4`
(For multigraphs: `MG.edges[u, v, key][name] = value`).

**Shortcuts:**

Many common graph features allow python syntax to speed reporting.

>>> 1 in G  # check if node in graph
True
>>> [n for n in G if n < 3]  # iterate through nodes
[1, 2]
>>> len(G)  # number of nodes in graph
5

Often the best way to traverse all edges of a graph is via the neighbors.
The neighbors are reported as an adjacency-dict `G.adj` or `G.adjacency()`

>>> for n, nbrsdict in G.adjacency():
...     for nbr, eattr in nbrsdict.items():
...         if "weight" in eattr:
...             # Do something useful with the edges
...             pass

But the edges() method is often more convenient:

>>> for u, v, weight in G.edges.data("weight"):
...     if weight is not None:
...         # Do something useful with the edges
...         pass

**Reporting:**

Simple graph information is obtained using object-attributes and methods.
Reporting typically provides views instead of containers to reduce memory
usage. The views update as the graph is updated similarly to dict-views.
The objects `nodes`, `edges` and `adj` provide access to data attributes
via lookup (e.g. `nodes[n]`, `edges[u, v]`, `adj[u][v]`) and iteration
(e.g. `nodes.items()`, `nodes.data('color')`,
`nodes.data('color', default='blue')` and similarly for `edges`)
Views exist for `nodes`, `edges`, `neighbors()`/`adj` and `degree`.

For details on these and other miscellaneous methods, see below.

**Subclasses (Advanced):**

The Graph class uses a dict-of-dict-of-dict data structure.
The outer dict (node_dict) holds adjacency information keyed by node.
The next dict (adjlist_dict) represents the adjacency information and holds
edge data keyed by neighbor.  The inner dict (edge_attr_dict) represents
the edge data and holds edge attribute values keyed by attribute names.

Each of these three dicts can be replaced in a subclass by a user defined
dict-like object. In general, the dict-like features should be
maintained but extra features can be added. To replace one of the
dicts create a new graph class by changing the class(!) variable
holding the factory for that dict-like structure.

node_dict_factory : function, (default: dict)
    Factory function to be used to create the dict containing node
    attributes, keyed by node id.
    It should require no arguments and return a dict-like object

node_attr_dict_factory: function, (default: dict)
    Factory function to be used to create the node attribute
    dict which holds attribute values keyed by attribute name.
    It should require no arguments and return a dict-like object

adjlist_outer_dict_factory : function, (default: dict)
    Factory function to be used to create the outer-most dict
    in the data structure that holds adjacency info keyed by node.
    It should require no arguments and return a dict-like object.

adjlist_inner_dict_factory : function, (default: dict)
    Factory function to be used to create the adjacency list
    dict which holds edge data keyed by neighbor.
    It should require no arguments and return a dict-like object

edge_attr_dict_factory : function, (default: dict)
    Factory function to be used to create the edge attribute
    dict which holds attribute values keyed by attribute name.
    It should require no arguments and return a dict-like object.

graph_attr_dict_factory : function, (default: dict)
    Factory function to be used to create the graph attribute
    dict which holds attribute values keyed by attribute name.
    It should require no arguments and return a dict-like object.

Typically, if your extension doesn't impact the data structure all
methods will inherit without issue except: `to_directed/to_undirected`.
By default these methods create a DiGraph/Graph class and you probably
want them to create your extension of a DiGraph/Graph. To facilitate
this we define two class variables that you can set in your subclass.

to_directed_class : callable, (default: DiGraph or MultiDiGraph)
    Class to create a new graph structure in the `to_directed` method.
    If `None`, a NetworkX class (DiGraph or MultiDiGraph) is used.

to_undirected_class : callable, (default: Graph or MultiGraph)
    Class to create a new graph structure in the `to_undirected` method.
    If `None`, a NetworkX class (Graph or MultiGraph) is used.

**Subclassing Example**

Create a low memory graph class that effectively disallows edge
attributes by using a single attribute dict for all edges.
This reduces the memory used, but you lose edge attributes.

>>> class ThinGraph(nx.Graph):
...     all_edge_dict = {"weight": 1}
...
...     def single_edge_dict(self):
...         return self.all_edge_dict
...
...     edge_attr_dict_factory = single_edge_dict
>>> G = ThinGraph()
>>> G.add_edge(2, 1)
>>> G[2][1]
{'weight': 1}
>>> G.add_edge(2, 2)
>>> G[2][1] is G[2][2]
True
In [28]:
nx.draw_networkx(G)
No description has been provided for this image
In [29]:
for n in G: print(n,':',G[n])
1 : {2: {}, 3: {}, 'Alice': {}}
2 : {1: {}}
3 : {1: {}}
Alice : {'Bob': {}, 1: {}, 'Charlie': {}}
Bob : {'Alice': {}}
Charlie : {'Alice': {}}
In [30]:
G[1]
Out[30]:
AtlasView({2: {}, 3: {}, 'Alice': {}})
In [31]:
print(G.nodes)
print(G.edges)
[1, 2, 3, 'Alice', 'Bob', 'Charlie']
[(1, 2), (1, 3), (1, 'Alice'), ('Alice', 'Bob'), ('Alice', 'Charlie')]
In [32]:
#For every node we can obtain the properties of the node (in this example this is empty)
print(G.nodes['Alice'])
#For every edge we can obtain the properties of the edge (in this example this is empty)
print(G.edges[(1,2)])
{}
{}

Graph attributes¶

You can assign attributes and values to the nodes of the graph

In [33]:
G3.nodes['Alice']['gender'] = 'female'
G3.nodes['Bob']['gender'] = 'male'
G3.nodes['Charlie']['gender'] = 'male'
G3.nodes['1']['value'] = 1
G3.nodes['2']['value'] = -1
G3.nodes['3']['value'] = 0
for n in G3.nodes():
    print(n,':',G3.nodes[n])
1 : {'value': 1}
2 : {'value': -1}
3 : {'value': 0}
Alice : {'gender': 'female'}
Bob : {'gender': 'male'}
Charlie : {'gender': 'male'}
In [34]:
for n in G3.nodes():
    print(n,':',G3[n])
1 : {'2': {}, '3': {}, 'Alice': {}}
2 : {'1': {}, '3': {}}
3 : {'1': {}, '2': {}}
Alice : {'Bob': {}, '1': {}}
Bob : {'Alice': {}, 'Charlie': {}}
Charlie : {'Bob': {}}
In [35]:
G3.nodes['Alice']['value'] = 1
G3.nodes['Bob']['value'] = -1
G3.nodes['Charlie']['value'] = 1
for n in G3.nodes():
    print(n+ ":" + str(G3.nodes[n]['value']))

for n in G3.nodes():
    print(n,':',G3.nodes[n])
1:1
2:-1
3:0
Alice:1
Bob:-1
Charlie:1
1 : {'value': 1}
2 : {'value': -1}
3 : {'value': 0}
Alice : {'gender': 'female', 'value': 1}
Bob : {'gender': 'male', 'value': -1}
Charlie : {'gender': 'male', 'value': 1}

You can also add attributes and values to the edges in the graph

In [36]:
G3['Alice']['Bob']['label'] = 'strong'
print(G3['Bob']['Alice'])
print(G3['Alice'])
print(G3['Bob'])
{'label': 'strong'}
{'Bob': {'label': 'strong'}, '1': {}}
{'Alice': {'label': 'strong'}, 'Charlie': {}}
In [37]:
for e in G3.edges:
    print(e,':',G3.edges[e])
('1', '2') : {}
('1', '3') : {}
('1', 'Alice') : {}
('2', '3') : {}
('Alice', 'Bob') : {'label': 'strong'}
('Bob', 'Charlie') : {}
In [38]:
print(G3.edges[('Alice','Bob')])
print(G3.edges[('Alice','Bob')]['label'])
{'label': 'strong'}
strong
In [39]:
# The method get_edge_attributes returns a dictionary with keys the edges
#and values the values of the attribute we requested (for the edges that have it)
nx.get_edge_attributes(G3,'label')
Out[39]:
{('Alice', 'Bob'): 'strong'}
In [40]:
pos=nx.spring_layout(G3)
nx.draw_networkx(G3, pos)
my_labels = nx.get_edge_attributes(G3,'label')
print(my_labels)
nx.draw_networkx_edge_labels(G3, pos, edge_labels=my_labels)
#plt.show()
{('Alice', 'Bob'): 'strong'}
Out[40]:
{('Alice', 'Bob'): Text(-0.37570925580541625, -0.37061980394875893, 'strong')}
No description has been provided for this image

Weighted graphs¶

A special attribute of a an edge is the "weight". When adding weighted edges, you enter triples consisting of the two edge endpoints and the weight of the edge. This weight is stored in an attribute "weight" by default.

In [41]:
G4 = nx.Graph()
G4.add_weighted_edges_from([(1,2,0.5),(2,3,0.1),(3,4,0.7)])
for (a,b) in G4.edges():
    print (a,b,':',G4[a][b])
for (a,b,w) in G4.edges(data =True): #data=True returns weight as well
    print (str(a)+" "+ str(b) + " " + str(w['weight']))
for n in G4:
    print(n,':',G4[n])
for e in G4.edges:
    print(e,':',G4.edges[e])
1 2 : {'weight': 0.5}
2 3 : {'weight': 0.1}
3 4 : {'weight': 0.7}
1 2 0.5
2 3 0.1
3 4 0.7
1 : {2: {'weight': 0.5}}
2 : {1: {'weight': 0.5}, 3: {'weight': 0.1}}
3 : {2: {'weight': 0.1}, 4: {'weight': 0.7}}
4 : {3: {'weight': 0.7}}
(1, 2) : {'weight': 0.5}
(2, 3) : {'weight': 0.1}
(3, 4) : {'weight': 0.7}
In [42]:
weights = nx.get_edge_attributes(G4,'weight')
print(weights)
{(1, 2): 0.5, (2, 3): 0.1, (3, 4): 0.7}
In [43]:
pos=nx.spring_layout(G4)
nx.draw_networkx(G4, pos)
nx.draw_networkx_edge_labels(G4, pos, edge_labels=weights)
Out[43]:
{(1, 2): Text(0.7524855355227522, -0.022460304876532503, '0.5'),
 (2, 3): Text(-0.014391343375042887, 0.00039467951111531174, '0.1'),
 (3, 4): Text(-0.7524894595304484, 0.022460421295667984, '0.7')}
No description has been provided for this image
In [44]:
print(weights)
{(1, 2): 0.5, (2, 3): 0.1, (3, 4): 0.7}

Directed Graphs¶

In [45]:
DG1=nx.DiGraph()
DG1.add_edges_from([(1,2), (2,1), (3,1), (1,4), (2,4), (4,3)])
pos=nx.spring_layout(DG1)
nx.draw_networkx(DG1,pos)
No description has been provided for this image
In [46]:
DG2=nx.DiGraph()
DG2.add_weighted_edges_from([(1,2,0.5), (3,1,0.75), (1,4,0.1)])
print(DG2.edges())
for n in DG2:
    print(n, DG2[n])

pos=nx.spring_layout(DG2)
nx.draw_networkx(DG2,pos)
weights = nx.get_edge_attributes(DG2,'weight')
nx.draw_networkx_edge_labels(DG2, pos, edge_labels=weights)
[(1, 2), (1, 4), (3, 1)]
1 {2: {'weight': 0.5}, 4: {'weight': 0.1}}
2 {}
3 {1: {'weight': 0.75}}
4 {}
Out[46]:
{(1, 2): Text(0.5344112176558196, 0.07453179365090234, '0.5'),
 (1, 4): Text(-0.40025893727504, -0.2823709501419644, '0.1'),
 (3, 1): Text(0.06531809932901345, 0.15992679438194546, '0.75')}
No description has been provided for this image

You can transform a directed graph into an undirected on using to_undirected()

In [47]:
UG1 = DG1.to_undirected()
print(UG1.edges())
for n in UG1:
    print(n, UG1[n])
nx.draw_networkx(UG1)
[(1, 2), (1, 4), (1, 3), (2, 4), (3, 4)]
1 {2: {}, 4: {}, 3: {}}
2 {1: {}, 4: {}}
3 {1: {}, 4: {}}
4 {1: {}, 2: {}, 3: {}}
No description has been provided for this image
In [48]:
UG2 = DG2.to_undirected()
print(UG2.edges())
for n in UG2:
    print(n, UG2[n])
nx.draw_networkx(UG2)
[(1, 2), (1, 4), (1, 3)]
1 {2: {'weight': 0.5}, 4: {'weight': 0.1}, 3: {'weight': 0.75}}
2 {1: {'weight': 0.5}}
3 {1: {'weight': 0.75}}
4 {1: {'weight': 0.1}}
No description has been provided for this image

Graph Operations¶

Some common graph functions and algorithms are already implemented in networkx library.

Neighbors, degrees and adjancency matrix

In [49]:
nx.draw_networkx(G)
No description has been provided for this image
In [50]:
A = nx.adjacency_matrix(G)
print(A)
#the adjacency matrix is stored as a sparse matrix
print(type(A))
  (0, 1)	1
  (0, 2)	1
  (0, 3)	1
  (1, 0)	1
  (2, 0)	1
  (3, 0)	1
  (3, 4)	1
  (3, 5)	1
  (4, 3)	1
  (5, 3)	1
<class 'scipy.sparse._csr.csr_array'>
In [51]:
print(G.neighbors(1)) # returns the neighbors of a node
print(list(G.neighbors(1)))
for x in G.neighbors(1):
    print(x)
print('degree of 1:',G.degree(1)) # returns the degree of a node
<dict_keyiterator object at 0x78bddc3e2610>
[2, 3, 'Alice']
2
3
Alice
degree of 1: 3
In [52]:
print(G4[3])
print(G4.degree(3, weight='weight'))
print(G4.degree(3))
{2: {'weight': 0.1}, 4: {'weight': 0.7}}
0.7999999999999999
2

Neighbors and degrees for directed or weighted graphs

In [53]:
pos=nx.spring_layout(DG2)
nx.draw_networkx(DG2,pos)
weights = nx.get_edge_attributes(DG2,'weight')
nx.draw_networkx_edge_labels(DG2, pos, edge_labels=weights)
Out[53]:
{(1, 2): Text(0.5272487293673744, -0.03409400608090196, '0.5'),
 (1, 4): Text(-0.3972168205417217, -0.12617068508745777, '0.1'),
 (3, 1): Text(0.0755496192025753, 0.2051282745726466, '0.75')}
No description has been provided for this image
In [54]:
print(list(DG2.successors(1)))
print(list(DG2.neighbors(1)))
print(list(DG2.predecessors(1)))
print(DG2.out_degree(1))
print(DG2.in_degree(1))
print(DG2.out_degree(1,weight='weight'))
[2, 4]
[2, 4]
[3]
2
1
0.6

Connected components¶

In [55]:
G5 = nx.Graph()
G5.add_edges_from([(1,2),(1,3),(1,4),(2,3),('Alice','Bob'),('Bob','Charlie'),('Alice','Charlie')])
nx.draw_networkx(G5)
No description has been provided for this image
In [56]:
C = nx.connected_components(G5)
print(type(C))
for c in C:
    print(c)
<class 'generator'>
{1, 2, 3, 4}
{'Alice', 'Charlie', 'Bob'}

Get the connected component subgraphs

In [57]:
connected_subgraphs = [nx.subgraph(G5,c) for c in nx.connected_components(G5)]
for GC in connected_subgraphs:
    print('Connected Component')
    print(GC.nodes())
    print(GC.edges())
    print(len(GC))
Connected Component
[1, 2, 3, 4]
[(1, 2), (1, 3), (1, 4), (2, 3)]
4
Connected Component
['Alice', 'Charlie', 'Bob']
[('Alice', 'Bob'), ('Alice', 'Charlie'), ('Charlie', 'Bob')]
3

Get the largest connected component

In [58]:
# Get the nodes

largest_cc = max(nx.connected_components(G5), key=len)
print(largest_cc)
{1, 2, 3, 4}
In [59]:
#Get the subgraph

largest_cc = max(nx.connected_components(G5), key=len)
print(largest_cc)
CC_max = nx.subgraph(G5,largest_cc)
nx.draw_networkx(CC_max)
{1, 2, 3, 4}
No description has been provided for this image

Strongly Connected Components¶

In [60]:
DG = nx.DiGraph()
DG.add_edges_from([(1,2),(2,3),(3,4),(4,2),(4,3),(4,5)])
nx.draw_networkx(DG)
No description has been provided for this image
In [61]:
largest_SCC = max(nx.strongly_connected_components(DG), key=len)
#print(largest_SCC)
SCC_max = nx.subgraph(DG,largest_SCC)
nx.draw_networkx(SCC_max)
No description has been provided for this image

Shortest paths¶

In [62]:
G6 = nx.Graph()
G6.add_edges_from([(1,2),(1,3),(1,4),(2,3),(1,'Alice'),('Alice','Bob'),('Alice','Charlie'),('Bob','Charlie')])
nx.draw_networkx(G6)
sp = nx.shortest_path(G6,3,'Bob')
print(sp)
print(len(sp)-1)
print(nx.shortest_path_length(G6,3,'Bob'))
[3, 1, 'Alice', 'Bob']
3
3
No description has been provided for this image
In [63]:
SP1 = nx.single_source_shortest_path(G6,1)
print(SP1)
#print(nx.single_source_shortest_path_length(G3,'1'))
{1: [1], 2: [1, 2], 3: [1, 3], 4: [1, 4], 'Alice': [1, 'Alice'], 'Bob': [1, 'Alice', 'Bob'], 'Charlie': [1, 'Alice', 'Charlie']}
In [64]:
SP = nx.all_pairs_shortest_path(G6)
print(type(SP))
SP = dict(SP)
print(dict(SP))
<class 'generator'>
{1: {1: [1], 2: [1, 2], 3: [1, 3], 4: [1, 4], 'Alice': [1, 'Alice'], 'Bob': [1, 'Alice', 'Bob'], 'Charlie': [1, 'Alice', 'Charlie']}, 2: {2: [2], 1: [2, 1], 3: [2, 3], 4: [2, 1, 4], 'Alice': [2, 1, 'Alice'], 'Bob': [2, 1, 'Alice', 'Bob'], 'Charlie': [2, 1, 'Alice', 'Charlie']}, 3: {3: [3], 1: [3, 1], 2: [3, 2], 4: [3, 1, 4], 'Alice': [3, 1, 'Alice'], 'Bob': [3, 1, 'Alice', 'Bob'], 'Charlie': [3, 1, 'Alice', 'Charlie']}, 4: {4: [4], 1: [4, 1], 2: [4, 1, 2], 3: [4, 1, 3], 'Alice': [4, 1, 'Alice'], 'Bob': [4, 1, 'Alice', 'Bob'], 'Charlie': [4, 1, 'Alice', 'Charlie']}, 'Alice': {'Alice': ['Alice'], 1: ['Alice', 1], 'Bob': ['Alice', 'Bob'], 'Charlie': ['Alice', 'Charlie'], 2: ['Alice', 1, 2], 3: ['Alice', 1, 3], 4: ['Alice', 1, 4]}, 'Bob': {'Bob': ['Bob'], 'Alice': ['Bob', 'Alice'], 'Charlie': ['Bob', 'Charlie'], 1: ['Bob', 'Alice', 1], 2: ['Bob', 'Alice', 1, 2], 3: ['Bob', 'Alice', 1, 3], 4: ['Bob', 'Alice', 1, 4]}, 'Charlie': {'Charlie': ['Charlie'], 'Alice': ['Charlie', 'Alice'], 'Bob': ['Charlie', 'Bob'], 1: ['Charlie', 'Alice', 1], 2: ['Charlie', 'Alice', 1, 2], 3: ['Charlie', 'Alice', 1, 3], 4: ['Charlie', 'Alice', 1, 4]}}
In [65]:
print(SP1['Bob'])
print(SP[1]['Bob'])
[1, 'Alice', 'Bob']
[1, 'Alice', 'Bob']

Link Analysis¶

https://networkx.github.io/documentation/stable/reference/algorithms/link_analysis.html

In [66]:
DG2 = nx.DiGraph()
DG2.add_edges_from([(1,2),(1,3),(3,2),(2,5),(4,1),(4,2),(4,3),(5,1),(5,4)])
nx.draw_networkx(DG2)
No description has been provided for this image

Pagerank¶

The result of the Pagerank algorithm is a dictionary with the node ids as keys, and the pagerank values as values.

image.png

In [67]:
pr = nx.pagerank(DG2)
print(pr)
print('\n')
pr = nx.pagerank(G6)
print(pr)
{1: 0.18064505060873784, 2: 0.27131643087724033, 3: 0.14665711544131713, 5: 0.26061906832422166, 4: 0.140762334748483}


{1: 0.2420308196543696, 2: 0.12671382905463274, 3: 0.12671382905463274, 4: 0.07286059991932929, 'Alice': 0.17987731223897474, 'Bob': 0.12590180503903042, 'Charlie': 0.12590180503903042}

HITS¶

image.png

In [68]:
[h,a] = nx.hits(DG2)
print(h)
print(a)
print(a[2])
{1: 0.3028419093958839, 2: 2.6560584919688065e-17, 3: 0.16745199268671332, 5: 0.1254412261267391, 4: 0.40426487179066356}
{1: 0.23681287910395027, 2: 0.3909843250829289, 3: 0.3161224561036186, 5: 6.201641558282901e-17, 4: 0.05608033970950216}
0.3909843250829289

Pesronalized Pagerank¶

We will see the example we presented in class

In [69]:
G7 = nx.read_edgelist('graph-example.txt')
nx.draw_networkx(G7)
No description has been provided for this image

To define the personalized Pagerank, we will use the "personalization" parameter, were we define the jump vector as a dictionary, with node ids as the keys, and the jump probability to each node as the values

In [70]:
print(nx.pagerank(G7))
print(nx.pagerank(G7, personalization = {'1':1}))
print(nx.pagerank(G7, personalization = {'6':1}))
{'1': 0.12755099866537586, '2': 0.18232887836414127, '3': 0.2394858218322733, '4': 0.18433868256410274, '5': 0.1326787034099837, '6': 0.13361691516412283}
{'1': 0.2583391334031776, '2': 0.20094608623757965, '3': 0.24190126489812175, '4': 0.14028792201015408, '5': 0.0833530743914921, '6': 0.07517251905947459}
{'1': 0.07517273333798744, '2': 0.12532699696556127, '3': 0.1866533325726192, '4': 0.18958031704255388, '5': 0.15407113246217508, '6': 0.26919548761910295}

We will now change the jump probability. To change the jump probability we will set the parameter alpha.

Attention: The parameter alpha is not the jump probability as we have described in class, but the probability of following an out-link. The jump probability is 1-alpha.

In [71]:
print(nx.pagerank(G7, alpha = 0.5))
print(nx.pagerank(G7, alpha = 0.5, personalization = {'1':1}))
print(nx.pagerank(G7, alpha = 0.5, personalization = {'6':1}))
{'1': 0.1390335398088414, '2': 0.1741686298166486, '3': 0.2133752383994728, '4': 0.17643097727782997, '5': 0.14740281022602297, '6': 0.14958880447118414}
{'1': 0.5510064790190851, '2': 0.16964493982349704, '3': 0.18185949066208873, '4': 0.054964764048023335, '5': 0.026690643658381363, '6': 0.015833682788924434}
{'1': 0.015833407896886878, '2': 0.039357701756408944, '3': 0.07419162031220411, '4': 0.15675161523961573, '5': 0.150192049257395, '6': 0.5636736055374894}

Betweeness¶

In [72]:
BC = nx.edge_betweenness_centrality(G6)
print(BC)
{(1, 2): 0.23809523809523808, (1, 3): 0.23809523809523808, (1, 4): 0.2857142857142857, (1, 'Alice'): 0.5714285714285714, (2, 3): 0.047619047619047616, ('Alice', 'Bob'): 0.23809523809523808, ('Alice', 'Charlie'): 0.23809523809523808, ('Bob', 'Charlie'): 0.047619047619047616}

Drawing Graphs¶

https://networkx.org/documentation/stable/reference/drawing.html

In [73]:
nx.draw_networkx(G6)
No description has been provided for this image
In [74]:
nx.draw(G6, with_labels = True)
No description has been provided for this image
In [75]:
nx.draw_circular(G6)
No description has been provided for this image
In [76]:
nx.draw_spectral(G6, with_labels=True)
No description has been provided for this image
In [77]:
nx.draw_spring(G6)
No description has been provided for this image
In [78]:
nx.draw_spring(DG2)
No description has been provided for this image

An example¶

In [ ]:
karate=nx.read_gml("karate.gml",label='id')
nx.draw_networkx(karate)
No description has been provided for this image
In [ ]:
nx.draw_spring(karate,with_labels=True)
No description has been provided for this image

Change the size of the nodes depending on their pagerank value

In [ ]:
pr = nx.pagerank(karate)
nx.draw_networkx(karate,node_size=[10000*pr[x] for x in pr])
No description has been provided for this image

A PageRank implementation¶

We will now do our own implementation of Pagerank. Pagerank values for node $i$ are computed by iterating the following formula:

$$p_i = 0.85\sum_{j\rightarrow i}\frac{p_j}{d_{out}(j)} +0.15\frac 1n$$

We will associate each node with two values: the old pagerank in the previous step and the new one that is currently computed. We initialize the old pagerank to $1/n$

In [ ]:
for x in karate.nodes:
    karate.nodes[x]['old_pr'] = 1/len(karate.nodes)
    karate.nodes[x]['pr'] = 0

The algorithm goes over the edges in the graph, and for each edge (x,y) transfers a fraction of the Pagerank of x to y (and vice versa since the graph is undirected).

For convergece check we want the maximum difference between old and new Pagerank values to be less than eps.

In [ ]:
eps = 0.0000001
while (True):
    for (x,y) in karate.edges:
        karate.nodes[y]['pr'] += karate.nodes[x]['old_pr']/karate.degree(x)
        karate.nodes[x]['pr'] += karate.nodes[y]['old_pr']/karate.degree(y)

    diff = 0
    for x in karate.nodes:
        karate.nodes[x]['pr'] = karate.nodes[x]['pr']*0.85 + 0.15/len(karate.nodes)
        diff = max(diff, abs(karate.nodes[x]['pr'] - karate.nodes[x]['old_pr']))
    if diff < eps: break

    for x in karate.nodes:
        karate.nodes[x]['old_pr'] = karate.nodes[x]['pr']
        karate.nodes[x]['pr'] = 0

print({x:karate.nodes[x]['pr'] for x in karate.nodes})
{1: 0.09699746639416912, 2: 0.05287697716512755, 3: 0.05707850530246944, 4: 0.03585989886187514, 5: 0.02197802085797975, 6: 0.02911125727906845, 7: 0.02911125727906845, 8: 0.024490521601363044, 9: 0.029766036239543145, 10: 0.014309384124315726, 11: 0.02197802085797975, 12: 0.009564757473502347, 13: 0.014644911694305714, 14: 0.029536468425771824, 15: 0.014535968166646367, 16: 0.014535968166646367, 17: 0.016784065326985863, 18: 0.014558694786535084, 19: 0.014535968166646367, 20: 0.019604641610943857, 21: 0.014535968166646367, 22: 0.014558694786535084, 23: 0.014535968166646367, 24: 0.03152244833588808, 25: 0.021075999887935973, 26: 0.02100616170212189, 27: 0.01504401086799329, 28: 0.0256397267700412, 29: 0.019573438474829792, 30: 0.02628848291686834, 31: 0.024590131585182876, 32: 0.037158037426708354, 33: 0.07169310822672904, 34: 0.10091903290492975}

We got essentially the same values as for the Pagerank vector.

In [ ]:
pr
Out[ ]:
{1: 0.09700181758983706,
 2: 0.052878391037427,
 3: 0.05707842304763673,
 4: 0.035860643223064786,
 5: 0.021979406974834498,
 6: 0.02911334166344221,
 7: 0.029113341663442205,
 8: 0.02449075803950918,
 9: 0.029765339186167028,
 10: 0.014308950284462798,
 11: 0.021979406974834494,
 12: 0.009564916863537146,
 13: 0.014645186487916188,
 14: 0.02953631497720298,
 15: 0.014535161524273824,
 16: 0.014535161524273824,
 17: 0.016785378110253487,
 18: 0.01455885977424349,
 19: 0.014535161524273824,
 20: 0.01960441671193729,
 21: 0.014535161524273824,
 22: 0.01455885977424349,
 23: 0.014535161524273824,
 24: 0.03152091531163227,
 25: 0.021075455001162945,
 26: 0.02100562817474579,
 27: 0.015043395360629753,
 28: 0.025638803528350497,
 29: 0.019572960509438537,
 30: 0.026287262837112076,
 31: 0.024589336534292478,
 32: 0.037156635922679405,
 33: 0.07169213006588289,
 34: 0.1009179167487121}

Pagerank implementation using matrix operations

In [ ]:
import numpy as np
A = nx.adjacency_matrix(karate)
d = np.array([karate.degree(n) for n in karate.nodes])
D = np.diag(1/d)
P = D@A
N = len(karate.nodes)
u = np.ones(N)/N
p = p_old = u
while(True):
    p = (0.85*p_old.T@P+0.15*u.T).T
    if np.linalg.norm(p-p_old) < eps: break
    p_old = p
In [ ]:
p
Out[ ]:
array([0.09699741, 0.05287696, 0.05707851, 0.03585989, 0.021978  ,
       0.02911123, 0.02911123, 0.02449052, 0.02976604, 0.01430939,
       0.021978  , 0.00956476, 0.01464491, 0.02953647, 0.01453598,
       0.01453598, 0.01678405, 0.01455869, 0.01453598, 0.01960464,
       0.01453598, 0.01455869, 0.01453598, 0.03152247, 0.02107601,
       0.02100617, 0.01504402, 0.02563974, 0.01957344, 0.0262885 ,
       0.02459014, 0.03715806, 0.07169312, 0.10091905])