Automate your Wikipedia network analysis!

The following NodeXL Pro “data recipes” are designed to analyze Wikipedia network data with just a few clicks. Each recipe file contains all the option settings needed to automate the tasks required for a full-scale social network and content analysis.

You can easily customize these data recipes to your own needs and save them for future use. You can learn how to automate NodeXL Pro by reading this page, looking at this tutorial and/or watching this video.

Download the official NodeXL Pro recipe bundle below, then unzip and save the folder to your machine.

Wikipedia Page Network 01 – standard

This recipe is designed for Wikipedia Article-Article networks (1.5 degrees) imported with the MediaWiki Page Network importer. It performs all relevant steps of a full-scale social network analysis. Content analysis is run on the Content column, which contains the first paragraphs of each page. The graph shows an image and a label for every page in the network.

Vertex: Wikipedia article page
Edge: Link to articles mentioned on the page
Vertex size: Betweenness centrality
Group clustering algorithm: Clauset-Newman-Moore
Group labels: Top 10 most frequently used words
Sentiment language: English
Layout algorithm: Harel-Koren Fast Multiscale
Box layout algorithm: Group-in-a-Box, Treemap
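To illustrate what sizing vertices by betweenness centrality means, here is a minimal Python sketch. It is not how NodeXL Pro computes the metric: it assumes an unweighted, directed link graph and uses Brandes' algorithm without normalization. The page names, the `betweenness` and `scale_sizes` helpers, and the size range are all hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness (Brandes' algorithm) for an unweighted directed graph.

    adj maps every vertex to the list of vertices it links to.
    """
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []
        preds = {v: [] for v in adj}   # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}    # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:                   # breadth-first search from s
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                   # accumulate dependencies, farthest first
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    return score


def scale_sizes(score, smallest=1.0, largest=10.0):
    """Map scores linearly onto a size range (hypothetical range, not NodeXL's)."""
    top = max(score.values()) or 1.0
    return {v: smallest + (largest - smallest) * s / top for v, s in score.items()}


# Hypothetical article-article links.
links = {
    "Graph theory": ["Network science"],
    "Network science": ["Centrality", "Social network"],
    "Centrality": [],
    "Social network": [],
}
scores = betweenness(links)
sizes = scale_sizes(scores)
print(scores)
print(sizes)
```

In this toy network, "Network science" is the only page that sits on shortest paths between other pages, so it would be drawn largest.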

Wikipedia Page Network Graph

Wikipedia Page Network 02 – large

This recipe is designed to visualize large Wikipedia Article-Article networks (2.0 degrees) imported with the MediaWiki Page Network importer. The graph shows an image and a label only for pages with high Indegree.

Vertex size: Indegree
Group clustering algorithm: Clauset-Newman-Moore
Layout algorithm: Harel-Koren Fast Multiscale
Box layout algorithm: Group-in-a-Box, force-directed
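The idea of labeling only high-Indegree pages can be sketched in a few lines of Python. This is an illustration only, not the recipe's internal logic: the edge list, the `indegree` helper, and the cutoff value `MIN_INDEGREE` are assumptions made for the example.

```python
from collections import Counter


def indegree(edges):
    """Count incoming links per page from (source, target) pairs."""
    counts = Counter()
    for source, target in edges:
        counts[source] += 0   # make sure pages without incoming links appear
        counts[target] += 1
    return counts


# Hypothetical article-article links.
edges = [
    ("Graph theory", "Network science"),
    ("Centrality", "Network science"),
    ("Social network", "Network science"),
    ("Network science", "Graph theory"),
]

MIN_INDEGREE = 2  # hypothetical cutoff for showing an image and a label

counts = indegree(edges)
labeled = {page for page, degree in counts.items() if degree >= MIN_INDEGREE}
print(counts)
print(labeled)
```

Restricting labels this way keeps a large graph readable, since most pages in a 2.0-degree network receive few incoming links.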

Wikipedia User Network 01 – standard

This recipe is designed for larger Wikipedia Article-Article networks (2.0 degrees) imported with the MediaWiki Page Network importer. Text analysis is run on the Comment column. The graph shows a disk for every user in the discussion.

Vertex: Wikipedia User
Edge: Comment
Vertex size: Betweenness centrality
Group clustering algorithm: Clauset-Newman-Moore
Group labels: Top 10 most frequently used words
Text analysis: Top words/word pairs
Layout algorithm: Harel-Koren Fast Multiscale
Box layout algorithm: Group-in-a-Box, Treemap
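As a rough illustration of a "top words/word pairs" analysis on a Comment column, the following Python sketch counts single words and adjacent word pairs. It rests on assumptions: the tokenization is simplistic, no stop words are skipped, and a "pair" means two neighboring words in the same comment. NodeXL Pro's own text analysis may differ, and the comments and the `top_words_and_pairs` helper are hypothetical.

```python
import re
from collections import Counter


def top_words_and_pairs(comments, n=10):
    """Return the n most frequent words and adjacent word pairs."""
    words = Counter()
    pairs = Counter()
    for comment in comments:
        tokens = re.findall(r"[a-z']+", comment.lower())
        words.update(tokens)
        pairs.update(zip(tokens, tokens[1:]))  # neighboring words only
    return words.most_common(n), pairs.most_common(n)


# Hypothetical edit comments.
comments = [
    "please cite a source",
    "cite a source for this claim",
    "reverted vandalism",
]
top_words, top_pairs = top_words_and_pairs(comments)
print(top_words)
print(top_pairs)
```

The same word counts, computed per cluster instead of over the whole column, would give the kind of "top 10 most frequently used words" used as group labels.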