I have many adjacency matrices that I’ve turned into edge lists (stored as a list) using the below (I’m open to better edge list code, though that is not the point of this post):
import pandas

def edges(mat):
    edgelist = []
    n = len(mat)
    for i in range(n):
        for j in range(i + 1, n):  # upper triangle only (i < j)
            if mat[i][j] > 0.0001:
                edgelist.append([abs(float(mat[i][j])), i, j])
    # Columns: 0 = weight, 1 = row index, 2 = column index
    edgeframe = pandas.DataFrame(edgelist)
    return edgeframe
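Since better edge list code is welcome, here is a minimal vectorized sketch. It assumes each matrix is square and can be converted to a NumPy array; the name `edges_vectorized` and the `threshold` parameter are made up for illustration. It keeps the same column layout (0 = weight, 1 = row, 2 = column) and the same row order as the loop version.

```python
import numpy as np
import pandas as pd

def edges_vectorized(mat, threshold=0.0001):
    # Hypothetical helper: assumes `mat` is a square adjacency matrix.
    arr = np.asarray(mat, dtype=float)
    # Index pairs (i, j) with i < j, in the same order as the nested loops.
    rows, cols = np.triu_indices(arr.shape[0], k=1)
    weights = arr[rows, cols]
    keep = weights > threshold
    return pd.DataFrame({0: np.abs(weights[keep]), 1: rows[keep], 2: cols[keep]})

example = [[0, 0.5, 0],
           [0.5, 0, 0.2],
           [0, 0.2, 0]]
example_edges = edges_vectorized(example)
```

This avoids the Python-level double loop, which may matter when there are many large matrices.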
I would like to assess the edge differences between edge_lists[0] and edge_lists[4].
Specifically, I want to know which exact entries disappeared from the first edge list to the next edge list. My first thought was to see how many rows and columns stayed the same using this:
adj1 = edge_lists[0]
adj2 = edge_lists[4]
np.sum(adj1 == adj2)
0 0
1 56
2 39
dtype: int64
But this does not work, because `==` compares the two edge lists position by position. The first adjacency matrix could have a nonzero entry in, say, row 5, column 6, and that entry could be the 20th row of the first edge list. If the second adjacency matrix also has a nonzero entry in row 5, column 6, but that entry is the 21st row of the second edge list, the comparison misses the match.
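To make the problem concrete, here is a small made-up example (the two frames are invented data, not my real edge lists). Both share the edges (0, 1) and (2, 3), but an extra edge at the top of the second frame shifts every row down by one, so the positional comparison finds no fully matching rows.

```python
import pandas as pd

# Invented edge lists: columns are weight, row index, column index.
a = pd.DataFrame([[0.5, 0, 1], [0.7, 2, 3], [0.2, 4, 5]])
b = pd.DataFrame([[0.9, 0, 2], [0.5, 0, 1], [0.7, 2, 3]])

# Position-by-position comparison: counts rows equal in every column.
positional_matches = int((a == b).all(axis=1).sum())

# Comparison by (row, column) pair, ignoring where each pair sits in the frame.
pairs_a = set(zip(a[1].tolist(), a[2].tolist()))
pairs_b = set(zip(b[1].tolist(), b[2].tolist()))
shared_pairs = pairs_a & pairs_b

print(positional_matches)
print(sorted(shared_pairs))
```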
Is there a way to assign a unique value to each entry of an edge list that represents its exact combination of row and column? In Excel I would create a "row&column" key for each edge list and compare the two. I’m not sure how to do that in Python.
Thanks!
>Solution :
import pandas as pd
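def assign_edge_labels(edgeframe):
    # Build a "row&column" key per edge. apply(axis=1) may upcast each row to
    # float, so cast back to int to get "5&6" rather than "5.0&6.0".
    edgeframe['edge_label'] = edgeframe.apply(lambda row: f"{int(row[1])}&{int(row[2])}", axis=1)
    return edgeframe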
# Create edge lists
edge_lists = [edges(mat1), edges(mat2)] # Replace mat1 and mat2 with your adjacency matrices
# Assign unique labels to each edge in the edge lists
edge_lists[0] = assign_edge_labels(edge_lists[0])
edge_lists[1] = assign_edge_labels(edge_lists[1])
# Find the edges that disappeared from the first edge list to the second edge list
disappeared_edges = edge_lists[0][~edge_lists[0]['edge_label'].isin(edge_lists[1]['edge_label'])]
print(disappeared_edges)
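As an alternative to string labels, the same comparison can be sketched with an outer merge on the (row, column) pair and `indicator=True`, which also reports edges that newly appeared. This assumes the column layout produced by `edges` (1 = row index, 2 = column index); the function name `compare_edges` and the sample frames are made up for illustration.

```python
import pandas as pd

def compare_edges(first, second):
    # Hypothetical helper: assumes columns 1 and 2 hold the row/column indices.
    left = first.rename(columns={1: "row", 2: "col"})[["row", "col"]]
    right = second.rename(columns={1: "row", 2: "col"})[["row", "col"]]
    merged = left.merge(right, on=["row", "col"], how="outer", indicator=True)
    # "left_only" rows exist only in the first list, "right_only" only in the second.
    disappeared = merged.loc[merged["_merge"] == "left_only", ["row", "col"]]
    appeared = merged.loc[merged["_merge"] == "right_only", ["row", "col"]]
    return disappeared.reset_index(drop=True), appeared.reset_index(drop=True)

# Invented sample data: weight, row index, column index.
first = pd.DataFrame([[0.5, 0, 1], [0.7, 2, 3], [0.2, 4, 5]])
second = pd.DataFrame([[0.9, 0, 2], [0.5, 0, 1], [0.7, 2, 3]])
gone, new = compare_edges(first, second)
print(gone)
print(new)
```

Merging on the two integer columns avoids building strings for every edge, and swapping the arguments is not needed to see changes in both directions.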