Tutorial: Exploring YouTube Recommended Video Networks
NodeXL Pro offers several ways to access the official YouTube API (v3). With the NodeXL Pro YouTube data importers you can
- explore the YouTube recommendation algorithm with Video-to-Video Recommendation Networks or
- analyze discussions around videos and search terms with User-to-User Video Comment Networks or
- explore User-to-User Channel subscription networks (very limited).
In this tutorial we will focus on Video-to-Video Recommendation Networks.
Basic knowledge about Task Automation is required. The NodeXL Pro data recipes used in this tutorial can be found in the official recipe bundle on the page Automate NodeXL Pro. You can easily learn how to automate NodeXL Pro by reading this page, looking at this tutorial and/or watching this video.
Step 1: Create YouTube API keys
The NodeXL Pro YouTube data importers have an integrated API key with 100k units per day which can be used by any NodeXL user, but these units are consumed very quickly. API calls for recommended video lists are especially expensive, which is why quota management is very important.
So before getting started we strongly recommend that you create your own YouTube API keys for your project. Here is a guide on how to quickly receive up to 10 API keys with a daily limit of 10,000 units per key. You can apply for an upgrade to 100,000 units, but you need to go through a review process.
If your API key runs out of quota during data import, you will receive the following error message: “The network couldn’t be obtained. The request cannot be completed because you have exceeded your quota”. As a result you will see only a partial dataset or no data at all in your workbook.
Step 2: Preparation
The first step is to import the data recipe with the file name “YouTube Video Network 01 – description.NodeXLOptions”. This recipe performs all relevant steps to conduct a full-scale social network analysis and runs text analysis on the video description column. You can also choose the recipe “YouTube Video Network 02 – tags.NodeXLOptions” which uses different layout options and analyzes the video tags column.
Before getting started, it is helpful to save the workbook under a filename that records the basic import settings, e.g. NodeXL Video Network 50 rel 50-1 2020-10-20. This filename means that the search term NodeXL was used to import 50 videos sorted by relevance with 50 recommendations each in a 1.0 network on Oct 20, 2020.
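If you run many imports, you can generate such filenames consistently with a small script. The following is a minimal sketch in Python; the function name workbook_name and its parameters are our own illustration and not part of NodeXL Pro.

```python
from datetime import date


def workbook_name(term, videos, sort, recommendations, level, day):
    """Build a workbook filename that records the import settings.

    Hypothetical helper: the pattern mirrors the example filename
    in this tutorial, nothing more.
    """
    # Format the network level compactly: 1.0 -> "1", 1.5 -> "1.5", 2.0 -> "2"
    level_label = f"{level:g}"
    return (
        f"{term} Video Network {videos} {sort} "
        f"{recommendations}-{level_label} {day.isoformat()}"
    )


name = workbook_name("NodeXL", 50, "rel", 50, 1.0, date(2020, 10, 20))
print(name)  # NodeXL Video Network 50 rel 50-1 2020-10-20
```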
Step 3: Data importer setup
Open the YouTube Video Network importer via Data > Import > From YouTube Video Network.
On the left you see two options to import data: either by entering a search term or by entering a video ID or a list of video IDs. When entering video IDs, you first need to identify the video ID within each video URL. For example, in the URL https://www.youtube.com/watch?v=mjAq8eA7uOM the video ID is mjAq8eA7uOM. Note that some video URLs contain tracking codes right after the ID; these need to be removed.
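For a longer list of URLs, extracting the IDs by hand is error-prone. Here is a minimal Python sketch that pulls the video ID out of a URL and drops any extra parameters after it. The function name extract_video_id is hypothetical, and the sketch only assumes the two common URL shapes (watch?v=... and youtu.be/...); other URL formats would need additional handling.

```python
from urllib.parse import urlparse, parse_qs


def extract_video_id(url):
    """Return the video ID from a YouTube URL, or None if none is found."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment
        candidate = parsed.path.lstrip("/").split("/")[0]
    else:
        # Standard watch URLs carry the ID in the "v" query parameter;
        # any other parameters (tracking codes etc.) are ignored
        candidate = parse_qs(parsed.query).get("v", [""])[0]
    return candidate or None


urls = [
    "https://www.youtube.com/watch?v=mjAq8eA7uOM",
    # Illustrative URLs with extra parameters appended after the ID
    "https://www.youtube.com/watch?v=mjAq8eA7uOM&feature=share",
    "https://youtu.be/mjAq8eA7uOM?si=example",
]
for url in urls:
    print(extract_video_id(url))  # mjAq8eA7uOM each time
```

The resulting IDs can then be pasted into the importer as a list.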
In the section “Add an edge for each” check the box “Recommended video” and leave the other box blank. With “Pair of videos commented on by the same user” unchecked, the limiters on comments and replies below are not relevant and thus deactivated. We suggest not mixing these two edge types in one analysis.
In the Options section you need to limit the number of videos and recommendations. Depending on the popularity of the search term, the API may not return the full number of requested videos, usually no more than about 200 per search. The number of recommendations per video may also vary between about 20 and 100.
A. Setup for the 1.0 search network
With one 10k API key, the largest possible import is 95 videos limited to 50 recommendations each in a 1.0 network. This importer setup results in a network of at most 4,750 recommended videos (95 × 50), but the actual number is usually much lower because the recommendations overlap.
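The difference between the upper bound and the actual network size can be illustrated with a few lines of Python. This is only a sketch of the arithmetic; the function names and the toy data are our own and do not reflect how NodeXL Pro counts internally.

```python
def upper_bound_1_0(videos, recommendations_per_video):
    """Largest possible number of recommended videos in a 1.0 network."""
    return videos * recommendations_per_video


def unique_recommended(recommendations):
    """Count distinct recommended videos across all seed videos."""
    return len({video for recs in recommendations.values() for video in recs})


print(upper_bound_1_0(95, 50))  # 4750

# Toy example: two seed videos that share one recommendation ("y")
toy = {"seed_a": ["x", "y"], "seed_b": ["y", "z"]}
print(upper_bound_1_0(2, 2))    # 4
print(unique_recommended(toy))  # 3
```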
B. Setup for the 1.0 video ID list network
This setup is similar to the 1.0 search network above. You can enter up to 95 video IDs limited to 50 recommendations each in a 1.0 network without exceeding the quota of a 10k API key. This importer setup also results in a network of at most 4,750 recommendations.
C. Setup for the 2.0 search network
The scope of the 2.0 networks is very limited. If you choose a 2.0 search network, you cannot select more than 5 videos with 6 recommendations each without exceeding the quota of a 10k API key. The resulting network contains a maximum of 266 edges.
D. Setup for the 2.0 video ID list network
If you choose a 2.0 video ID list network, you need to limit the recommendations to 15 for just one single video ID, or 12 recommendations for two video IDs, or 10 recommendations for three video IDs.
Step 4: Automate
When you are finished with the importer setup, click OK to download the data. After the download click Graph > Automate > Run to analyze the dataset. If you select the option “Automate the graph after the data is imported” under Data > Import > Import Options before opening the importer, the analysis will start automatically.
Step 5: Review the data
When Task Automation is done, the workbook is ready to explore. Have a look at the vertices spreadsheet, which has been populated with vertex centrality metrics and also contains the metadata of the collected videos. By sorting the spreadsheet by Out-Degree you can identify the videos for which recommendations have been collected. Further to the right, the metadata includes the following columns: Title, Description, Tags, Author, Created Date (UTC), Views, Comments, Likes Count, Dislikes Count and a link to the video.
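Why Out-Degree identifies these videos can be seen in a small example: only videos whose recommendations were requested have outgoing edges, while videos that merely appear as recommendations have an out-degree of 0. The Python sketch below works on a made-up edge list; in practice you would export the edges from the workbook first, and the names used here are hypothetical.

```python
from collections import Counter


def out_degrees(edges):
    """Count outgoing edges per source video in a (source, target) edge list."""
    return Counter(source for source, _target in edges)


# Made-up edges: each pair reads "source video recommends target video"
edges = [
    ("seed1", "a"),
    ("seed1", "b"),
    ("seed2", "a"),
]

degrees = out_degrees(edges)
# Videos sorted by out-degree, highest first; pure targets do not appear
seeds = [video for video, _count in degrees.most_common()]
print(seeds)         # ['seed1', 'seed2']
print(degrees["a"])  # 0, because "a" only occurs as a recommendation
```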
Additional step 1: Getting more data with multiple imports
In order to analyze larger networks, you can add multiple queries to one workbook by using a fresh API key for each new query. Depending on your search term there may be an overlap in the returned videos and their recommendations. This overlap is represented in the edge weight column.
To use this approach you first need to uncheck the option “Clear the NodeXL workbook before the data is imported” via Data > Import > Import Options. Use Task Automation after all queries have been collected.
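The idea behind the edge weight can be sketched in Python as follows. We assume here that the weight of an edge is simply the number of times the same source-target pair occurs across all imports; this is an illustration of the concept with made-up data, not a description of NodeXL Pro's internal merging.

```python
from collections import Counter


def merge_imports(*edge_lists):
    """Merge several (source, target) edge lists into weighted edges."""
    weights = Counter()
    for edges in edge_lists:
        # Every repeated pair raises the weight of that edge by one
        weights.update(edges)
    return weights


# Two made-up imports that share the edge ("v1", "v2")
import_1 = [("v1", "v2"), ("v1", "v3")]
import_2 = [("v1", "v2"), ("v4", "v2")]

merged = merge_imports(import_1, import_2)
for (source, target), weight in sorted(merged.items()):
    print(source, target, weight)
```

Edges with a weight above 1 point to videos that were returned, together with the same recommendation, by more than one query.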
A. Import several search queries one by one
You can import several different search queries one by one using the Setup for the 1.0 search network (90 videos with 50 recommendations each). Here is an example around the topic network analysis that contains five different searches integrated into one workbook: social network analysis, social media network, network science, network visualization, network graph. The resulting network has 17,885 edges and 10,197 vertices.
B. Import the same search query with different sorting options
A second approach is to import the same search query five times, each time with a different sorting option for the requested videos: Relevance, Date, Rating, View Count, Title. In this example we used the search term “flat earth”, which results in a network of 8,399 videos with 22,229 edges.
Additional step 2: Create a video tag network
In our tutorial “Semantic networks” we show how you can create a network based on video tags which may help to better understand the data. This tag network has been created from the network analysis query above.
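To give a rough idea of what a tag network is, the Python sketch below connects two tags whenever they occur on the same video and counts how often that happens. This is a generic co-occurrence approach under our own assumptions (lowercased tags, one count per video); the method described in the “Semantic networks” tutorial may differ, and the tag lists are made up.

```python
from collections import Counter
from itertools import combinations


def tag_cooccurrence(video_tags):
    """Count how often each pair of tags appears on the same video."""
    weights = Counter()
    for tags in video_tags:
        # Normalize and deduplicate the tags of one video
        unique = sorted({tag.strip().lower() for tag in tags if tag.strip()})
        # Every unordered pair of tags on this video becomes one edge
        weights.update(combinations(unique, 2))
    return weights


# Made-up tag lists for two videos
video_tags = [
    ["Network", "SNA"],
    ["sna", "network", "graph"],
]

tag_edges = tag_cooccurrence(video_tags)
for (tag_a, tag_b), weight in sorted(tag_edges.items()):
    print(tag_a, tag_b, weight)
```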