2010 – April – NodeXL – Twitter – Cisco FR No Components Lay Out

Ordering smaller components in a graph – a NodeXL feature tip

From this: to this: in just a few clicks.

Many network graphs contain disconnected smaller graphs, called “components”, within them.
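To make the idea of a component concrete, here is a minimal Python sketch that finds the components of a small, made-up graph with a breadth-first search. The function name `connected_components` and the sample vertex names are assumptions for illustration only; this is not NodeXL code.

```python
from collections import defaultdict, deque

def connected_components(vertices, edges):
    """Return a list of components, each a sorted list of vertex names."""
    # Build an undirected adjacency map from the edge list.
    neighbors = defaultdict(set)
    for a, b in edges:
        neighbors[a].add(b)
        neighbors[b].add(a)

    seen = set()
    components = []
    for start in vertices:
        if start in seen:
            continue
        # Breadth-first search collects everything reachable from `start`.
        seen.add(start)
        queue = deque([start])
        members = []
        while queue:
            v = queue.popleft()
            members.append(v)
            for w in neighbors[v]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        components.append(sorted(members))
    return components

# Hypothetical sample data: one triad, one dyad, and one isolate.
vertices = ["ann", "bob", "cat", "dan", "eve", "fay"]
edges = [("ann", "bob"), ("bob", "cat"), ("dan", "eve")]
components = connected_components(vertices, edges)
print(components)
# [['ann', 'bob', 'cat'], ['dan', 'eve'], ['fay']]
```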

Most layout algorithms do a poor job of giving each component its own space. Instead, components are often laid over one another, suggesting connections that are not real.

A simple solution we have implemented in NodeXL is to offer to sweep up all the smaller components in the graph and order them in neat rows at the bottom of the canvas. This feature was mentioned in a previous post, but it may not be obvious where to find it:
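The "sweep into rows" idea can be sketched roughly as follows. This is only an illustration of the general technique, not NodeXL's actual implementation: the function `place_small_components`, the `cell` size, and the `max_small_size` cutoff are all hypothetical names and values chosen for the example.

```python
def place_small_components(components, canvas_width, canvas_height,
                           cell=40, max_small_size=3):
    """Assign each small component a square cell in rows along the canvas bottom.

    Returns (cells, top): `cells` maps a component's index to the (x, y)
    top-left corner of its cell, and `top` is the y coordinate where the rows
    begin, so everything above it is left for the larger components.
    """
    small = [i for i, c in enumerate(components) if len(c) <= max_small_size]
    # Smallest components first, so isolates come before dyads and triads.
    small.sort(key=lambda i: len(components[i]))

    per_row = max(1, canvas_width // cell)
    rows = -(-len(small) // per_row)  # ceiling division
    top = canvas_height - rows * cell

    cells = {}
    for n, i in enumerate(small):
        row, col = divmod(n, per_row)
        cells[i] = (col * cell, top + row * cell)
    return cells, top

# Hypothetical example: one five-node component plus four small ones.
components = [["a", "b", "c", "d", "e"], ["f"], ["g", "h"], ["i"], ["j"]]
cells, top = place_small_components(components, canvas_width=80, canvas_height=200)
print(cells)  # {1: (0, 120), 3: (40, 120), 4: (0, 160), 2: (40, 160)}
print(top)    # 120
```

The main layout algorithm would then only need to arrange the remaining larger components in the area above `top`.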

From the NodeXL network graph canvas toolbar, select the drop down menu next to the selected layout type.

This will display the following menu of layout choices and options:

Select the last option: “Layout Options…”

This opens the following dialog:

Select the option: “Put the graph’s smaller components at the bottom of the graph”. This dialog also presents other options that control how long the Fruchterman-Reingold layout should run and how strong the repulsive force that pushes nodes away from one another should be. You may find that changing these values improves the FR layout for your data.
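For readers curious about what these two settings influence, here is a minimal Python sketch of a textbook Fruchterman-Reingold loop. It is not NodeXL's code: the parameter names `iterations` and `repulsion`, the cooling rate, and the canvas size are assumptions made for the example. Every pair of nodes repels with force k²/d, connected nodes attract with force d²/k, and a shrinking "temperature" caps how far a node may move per pass.

```python
import math
import random

def fruchterman_reingold(vertices, edges, iterations=10, repulsion=3.0,
                         width=100.0, height=100.0, seed=0):
    """Return {vertex: [x, y]} after a fixed number of force-directed passes."""
    rng = random.Random(seed)
    pos = {v: [rng.uniform(0, width), rng.uniform(0, height)] for v in vertices}
    # Ideal distance between nodes; a larger `repulsion` spreads nodes out more.
    k = repulsion * math.sqrt(width * height / len(vertices))
    temperature = width / 10.0

    for _ in range(iterations):
        disp = {v: [0.0, 0.0] for v in vertices}
        # Repulsion: every pair of nodes pushes apart with force k^2 / d.
        for i, v in enumerate(vertices):
            for u in vertices[i + 1:]:
                dx = pos[v][0] - pos[u][0]
                dy = pos[v][1] - pos[u][1]
                d = math.hypot(dx, dy) or 1e-9
                push = k * k / d
                disp[v][0] += dx / d * push
                disp[v][1] += dy / d * push
                disp[u][0] -= dx / d * push
                disp[u][1] -= dy / d * push
        # Attraction: connected nodes pull together with force d^2 / k.
        for a, b in edges:
            dx = pos[a][0] - pos[b][0]
            dy = pos[a][1] - pos[b][1]
            d = math.hypot(dx, dy) or 1e-9
            pull = d * d / k
            disp[a][0] -= dx / d * pull
            disp[a][1] -= dy / d * pull
            disp[b][0] += dx / d * pull
            disp[b][1] += dy / d * pull
        # Move each node, limited by the temperature and the canvas bounds.
        for v in vertices:
            length = math.hypot(disp[v][0], disp[v][1]) or 1e-9
            step = min(length, temperature)
            pos[v][0] = min(width, max(0.0, pos[v][0] + disp[v][0] / length * step))
            pos[v][1] = min(height, max(0.0, pos[v][1] + disp[v][1] / length * step))
        temperature *= 0.9  # cool down so later passes make smaller moves
    return pos

# Hypothetical sample graph: a three-node chain plus one isolate.
layout = fruchterman_reingold(["a", "b", "c", "d"], [("a", "b"), ("b", "c")])
```

Because unconnected nodes feel only the repulsive force, nothing in this loop pulls a small component toward a sensible resting place, which is part of why a separate pass for the smaller components is useful.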

Here is a graph that is mapped without the component ordering feature selected.  Many components are scattered around the chart.

This image represents the connections among a population of Twitter users who mentioned the term “Cisco”. This chart was created using the Fruchterman-Reingold layout. It is noisy and messy given the nature of the graph it has to render.

The Harel-Koren layout option is better but has a significant flaw: all the isolates are jumbled on top of one another in that smear at the center of the ring in the upper left of the graph.

Here is the same graph created with the Harel-Koren layout and with the “Put the graph’s smaller components at the bottom of the graph” option selected:

All the many lightly connected Twitter authors are lined up in size order (size is mapped to the number of followers each user has on Twitter). This keeps them from getting in the way of the “giant component”, the big connected group of Twitter users who both tweet the word “cisco” and follow, mention, or reply to someone else who also mentioned the word “cisco”. The core of this group is visible, along with some peripheral groups or people who both mention the company and talk to other people who do as well. The isolates mention Cisco but do not do so as part of a larger conversation (as seen at the time of this snapshot).

An additional tip: nodes are plotted on the screen in NodeXL in an order governed by the “Layout Order” column in the Vertices worksheet. If we use the “Autofill Columns” feature we can easily set the Vertex Layout Order to the same value to which Vertex Size was set. This has the effect of lining up the nodes by size, making a kind of histogram. All the singletons or isolates, the nodes with no connections to any other node, line up first, then the dyads, the triads, and the quads. Among components with the same number of nodes, the ordering runs from smallest to largest by the size of the largest node in the component.
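The resulting order can be sketched in a few lines of Python. This reflects one reading of the behavior described above rather than NodeXL's own code; the function `order_components` and the follower counts are made up for the example.

```python
def order_components(components, vertex_size):
    """Sort components by member count, then by the size of their largest vertex."""
    return sorted(
        components,
        key=lambda c: (len(c), max(vertex_size[v] for v in c)),
    )

# Hypothetical follower counts, standing in for the Vertex Size column.
followers = {"ann": 900, "bob": 40, "cat": 15, "dan": 300, "eve": 7, "fay": 120}
components = [["ann", "bob"], ["fay"], ["cat"], ["dan"], ["eve"]]
ordered = order_components(components, followers)
print(ordered)
# [['eve'], ['cat'], ['fay'], ['dan'], ['ann', 'bob']]
```

The isolates come first, sorted from fewest to most followers, and the dyad follows them.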