Python & NetworkX: Set node attributes from Pandas DataFrame

It’s been a couple of years since I first used NetworkX in Python. I expected adding node attributes to a network to be truly obvious, preferably as part of a from_pandas_… call. However, that wasn’t the case. In this article, we’ll create a network from an edge list and then add its node attributes.

First, let’s create a simple network of people who had romantic relationships with each other. The edges and the nodes each come from their own Pandas DataFrame. Colors are added by mapping the values of the Gender column to color names.

import pandas as pd
import networkx as nx

# Edges
df_edges = pd.DataFrame({
    'source': ['John', 'John', 'Jane', 'Alice', 'Tom', 'Helen', 'William'],
    'target': ['Jane', 'Helen', 'Tom', 'Tom', 'Helen', 'William', 'Alice'],
    'years': [2, 3, 5, 1, 2, 8, 3]
})

# Nodes
df_nodes = pd.DataFrame({
    'Name': ['John', 'Jane', 'Alice', 'Tom', 'Helen', 'William'],
    'Gender': ['Male', 'Female', 'Female', 'Male', 'Female', 'Male']
})

# Map each gender to a color
node_colors = {'Male': 'blue', 'Female': 'red'}
df_nodes['node_color'] = df_nodes['Gender'].map(node_colors)

Next, we’ll create a network from the edge list.

G = nx.from_pandas_edgelist(df_edges, source = 'source', target = 'target', edge_attr = 'years')
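A quick check, sketched here on a shortened version of the edge list, shows what from_pandas_edgelist carries over and what it doesn’t: the years column ends up on the edges, while every node’s attribute dictionary stays empty.

```python
import pandas as pd
import networkx as nx

# A shortened version of the edge list used in this article
df_edges = pd.DataFrame({
    'source': ['John', 'John', 'Jane'],
    'target': ['Jane', 'Helen', 'Tom'],
    'years': [2, 3, 5],
})

G = nx.from_pandas_edgelist(df_edges, source='source', target='target', edge_attr='years')

# Edge attributes come from the edge_attr column
edge_years = G.edges['John', 'Jane']['years']

# The nodes exist, but each one has an empty attribute dict
node_data = dict(G.nodes(data=True))

print(edge_years)
print(node_data)
```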

Now we have our network. We created it from the edge list, so it doesn’t have any node attributes yet. To add them, we need the set_node_attributes function. This function requires a values parameter:

If values is a dict or a dict of dict, it should be keyed by node to either an attribute value or a dict of attribute key/value pairs used to update the node’s attributes.

NetworkX Documentation – set_node_attributes
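To make the two accepted shapes concrete, here is a minimal sketch on a toy two-node graph (the graph and values are made up for illustration). A flat dictionary needs the attribute name passed separately, while a dictionary of dictionaries already contains the attribute names.

```python
import networkx as nx

G = nx.Graph()
G.add_edge('John', 'Jane')

# Shape 1: node -> value; the attribute name is given through `name`
nx.set_node_attributes(G, {'John': 'Male', 'Jane': 'Female'}, name='Gender')

# Shape 2: node -> {attribute: value}; no `name` needed
nx.set_node_attributes(G, {
    'John': {'node_color': 'blue'},
    'Jane': {'node_color': 'red'},
})

print(G.nodes['John'])  # {'Gender': 'Male', 'node_color': 'blue'}
```

The second shape is the one we’re after, because it lets us set several attributes per node in a single call.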

How can we convert our df_nodes DataFrame so that it matches the input requirements of the set_node_attributes function? Well, Pandas has a method for that: to_dict(). It converts a DataFrame to a dictionary, and its orient parameter lets you specify exactly what shape that dictionary should have. The one we’re looking for is:

‘index’ : dict like {index -> {column -> value}}

Pandas Documentation – to_dict
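As a small illustration on a two-row DataFrame (made up for this example), the outer keys of the ‘index’ orientation are whatever the DataFrame’s index holds. That’s why the index has to contain the node names first.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['John', 'Jane'],
    'Gender': ['Male', 'Female'],
})

# With the default index, the outer keys are the row numbers 0 and 1
by_position = df.to_dict(orient='index')

# After set_index, the outer keys are the names
by_name = df.set_index('Name').to_dict(orient='index')

print(by_position)  # {0: {'Name': 'John', 'Gender': 'Male'}, 1: {'Name': 'Jane', 'Gender': 'Female'}}
print(by_name)      # {'John': {'Gender': 'Male'}, 'Jane': {'Gender': 'Female'}}
```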

So here’s what we need to do: set the index to the node name, and then convert the DataFrame to a dictionary with the ‘index’ orientation.

nodes_attr = df_nodes.set_index('Name').to_dict(orient = 'index')
nx.set_node_attributes(G, nodes_attr)
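One thing to watch out for: set_node_attributes skips names that aren’t in the graph, and nodes that don’t appear in the node table simply keep an empty attribute dictionary. The sketch below uses a hypothetical, deliberately mismatched node table (Tom is missing, Zoe has no edges) to show a simple way of spotting such gaps.

```python
import pandas as pd
import networkx as nx

df_edges = pd.DataFrame({'source': ['John', 'Jane'], 'target': ['Jane', 'Tom']})

# Hypothetical node table: 'Tom' is missing and 'Zoe' is not part of any edge
df_nodes = pd.DataFrame({
    'Name': ['John', 'Jane', 'Zoe'],
    'Gender': ['Male', 'Female', 'Female'],
})

G = nx.from_pandas_edgelist(df_edges, source='source', target='target')
nx.set_node_attributes(G, df_nodes.set_index('Name').to_dict(orient='index'))

# Nodes in the graph that did not receive a 'Gender' attribute
missing = [n for n, data in G.nodes(data=True) if 'Gender' not in data]

print(missing)     # ['Tom']
print('Zoe' in G)  # False: setting attributes never adds new nodes
```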

Now, we can use these node attributes to add color to our graph.

nx.draw(
    G,
    pos = nx.kamada_kawai_layout(G, weight = 'years'),
    node_size = 200,
    node_color = [G.nodes[n]['node_color'] for n in G.nodes],
    width = [G.edges[e]['years'] for e in G.edges],
    with_labels = True)
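The list comprehension for node_color raises a KeyError as soon as one node has no node_color attribute. If that can happen in your data, one option is to let NetworkX fill in a default while building the list. This is a sketch on a toy graph, and the fallback color ‘gray’ is just an assumption for the example.

```python
import networkx as nx

G = nx.Graph()
G.add_node('John', node_color='blue')
G.add_node('Jane', node_color='red')
G.add_node('Tom')  # no node_color set

# G.nodes(data=..., default=...) yields (node, value) pairs,
# substituting the default where the attribute is missing
colors = [color for _, color in G.nodes(data='node_color', default='gray')]

print(colors)  # ['blue', 'red', 'gray']
```

The resulting list is in node order, so it can be passed as node_color in the same way as the list comprehension above.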

The resulting visualization:

By the way, I didn’t necessarily come up with this solution myself. Although I’m grateful you’ve visited this blog post, you should know I get a lot from websites like StackOverflow and I have a lot of coding books. This one by Matt Harrison (on Pandas 1.x!) has been updated in 2020 and is an absolute primer on Pandas basics. If you want something broad, ranging from data wrangling to machine learning, try “Mastering Pandas” by Stefanie Molin.
