The Reddit Data Importer for NodeXL Pro
Reddit is a social network where users can submit content, such as text posts, links, images, and videos, to various thematic forums known as “subreddits.” Each subreddit has its unique focus and community, ranging from news, politics, and technology to hobbies, entertainment, and virtually any niche interest imaginable.
Reddit is composed of numerous subreddits, each prefixed with “r/”. For example, r/science is dedicated to discussing scientific topics, while r/movies focuses on films. Content can be upvoted or downvoted by users. Posts that receive many upvotes rise to the top of their respective subreddits and, if they gain enough traction, can also make it to Reddit’s front page. Each post on Reddit has a comment section where users can discuss the content, ask questions, or share thoughts. Like posts, comments can also be upvoted or downvoted. Further, Reddit users can also reply to comments.
With NodeXL Pro, you can create user networks around any search term across all of Reddit, or restrict the search to a single subreddit.
Besides a NodeXL Pro user license, the only requirement is a Reddit user account, which is needed to download the data.
The current importer can only connect to the official Reddit API, which returns at most the 250 most recent Reddit posts for a given search term, together with all comments and replies to these posts. The resulting networks may contain a few thousand vertices.
When this importer was first introduced, it could also connect to the Pushshift.io API, which allowed the import of large historical datasets. This service was deprecated in June 2023, and it is not clear whether it will ever be reinstated.
Select NodeXL > Data > Import > …From Reddit Search Network (Beta)
- Enter a search term of your choice and choose whether to search all subreddits or a specified subreddit.
- By default, the importer collects the “Basic network” of posts, comments, and replies. If you would like to add users who are tagged in the posts, comments, and replies, select the option “Basic network plus tags”.
- Click OK.
There are four types of network edges: Posted, Commented, Replied to, and Tagged.
- Posted: A self-loop created for every post.
- Commented: A user comments on a post.
- Replied to: A user replies to a comment.
- Tagged: A user tags another user in a post, comment, or reply (only created if you select “Basic network plus tags” in the importer setup).
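To make the four edge types concrete, here is a minimal Python sketch that derives them from a nested post/comment/reply structure. It is not NodeXL Pro's implementation. The `Item` class, the `build_edges` function, the edge direction (from the acting user to the user being addressed), and the rule that a "tag" is a `u/username` mention in the text are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass, field
from typing import List, Tuple

# Assumption: a "tag" is a u/username (or /u/username) mention in the text.
MENTION = re.compile(r"(?<![\w/])/?u/([A-Za-z0-9_-]+)")


@dataclass
class Item:
    """Hypothetical record for a post, a comment, or a reply."""
    author: str
    text: str
    children: List["Item"] = field(default_factory=list)


def build_edges(posts: List[Item], include_tags: bool = False) -> List[Tuple[str, str, str]]:
    """Return (source_user, target_user, relationship) tuples."""
    edges: List[Tuple[str, str, str]] = []

    def add_tags(item: Item) -> None:
        # Only emitted for the "Basic network plus tags" style of import.
        if include_tags:
            for name in MENTION.findall(item.text):
                edges.append((item.author, name, "Tagged"))

    def walk(parent: Item, depth: int) -> None:
        for child in parent.children:
            # Direct children of a post are comments; anything deeper is a reply.
            relation = "Commented" if depth == 0 else "Replied to"
            edges.append((child.author, parent.author, relation))
            add_tags(child)
            walk(child, depth + 1)

    for post in posts:
        # Every post produces a self-loop on its author.
        edges.append((post.author, post.author, "Posted"))
        add_tags(post)
        walk(post, 0)
    return edges


posts = [
    Item("alice", "Look at this", [
        Item("bob", "Nice, cc u/carol", [
            Item("alice", "Thanks"),
        ]),
    ]),
]
for edge in build_edges(posts, include_tags=True):
    print(edge)
```

With `include_tags=False`, the same input yields only the Posted, Commented, and Replied to edges, which corresponds to the “Basic network” option.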
Learn about Task automation and use one of the data recipes to generate a network report and map: “Reddit User Network.NodeXLOptions”
Whether you’re an academic researcher diving into the intricate world of online interactions or a business aiming to unravel the vast web of social connections relevant to your brand, NodeXL with its diverse importers is your go-to solution. Experience the power of network analysis tailored to your needs.
Feedback and Support: We continuously strive to enhance our tools. Should you have feedback or require assistance, contact our support team.