#7743
Marc Smith
Keymaster

It is possible but it is also likely to be slow.

What are the limits of the Twitter data importers?

The use of NodeXL Pro Twitter data importers requires a Twitter account. Before your first data import you need to authorize NodeXL by entering a token number, which is automatically sent to you during the authorization process.

Moreover, Twitter’s public free API has many limits. NodeXL Basic and Pro are both affected by these limits. Data is available only for the past 8-9 days. Queries cannot return more than 18,000 tweets. The follower network is further restricted: the rate at which queries about who follows whom can be asked is low.

NodeXL Pro does not enable the collection of data beyond these limits. That said, NodeXL can process data from commercial data providers (like Crimson Hexagon or Radian6). While these are expensive options, they may be the only way to get historical data from Twitter.

Commercial services like Radian6 and Crimson Hexagon provide archival data – but not cheaply!

You may be able to get a little more data from the public Twitter API by using the since: and until: operators, for example: QUERYTERM since:2016-01-21 until:2016-01-27. The since: and until: operators scope the time frame of the query.

Twitter controls its API and throttles it based on parameters we cannot see. We notice that the greater the volume of tweets matching a query, the fewer tweets are delivered. One alternative is to collect day-long slices and append them in order to maximize the data available from Twitter.
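As a rough illustration of the day-long slice idea, here is a minimal Python sketch that builds one query string per day from the since: and until: operators and then appends the results. The function names (day_slices, append_slices) and the "id" field on each tweet record are hypothetical, not part of NodeXL, and the sketch assumes until: excludes the date it names. Paste each query string into the importer's search box, or pass it to whatever collection tool you use.

```python
from datetime import date, timedelta


def day_slices(term, start, end):
    """Build one search string per day from start up to (not including) end."""
    queries = []
    day = start
    while day < end:
        next_day = day + timedelta(days=1)
        # Assumption: until: is exclusive, so each slice covers exactly one day.
        queries.append(f"{term} since:{day.isoformat()} until:{next_day.isoformat()}")
        day = next_day
    return queries


def append_slices(slices):
    """Append per-day result lists, dropping tweets already seen (by 'id')."""
    seen = set()
    merged = []
    for tweets in slices:
        for tweet in tweets:
            if tweet["id"] not in seen:  # 'id' is a hypothetical field name
                seen.add(tweet["id"])
                merged.append(tweet)
    return merged


queries = day_slices("QUERYTERM", date(2016, 1, 21), date(2016, 1, 27))
for q in queries:
    print(q)
```

Running one query per day means each slice gets its own chance at the per-query cap, which is why slicing can yield more data in total than a single query over the whole week.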

You may also be interested in the Connected Action Graph Server Importer, which enables NodeXL to connect to the “STREAM” API from Twitter (which sometimes delivers larger volumes of data).

Twitter applies additional limits to the Follows and Followers data query — it is “throttled” to a slower rate of data delivery than the other “Tweet” based queries.

If you are patient it may be possible to do this collection. That said, it is often the case that one or two users in a list of followers are celebrities who have very large numbers of followers. It is not practical to collect many tens of thousands of followers in a desktop application. Millions are not viable at all. NodeXL Pro offers the ability to set an upper threshold (say 5000) on the number of followers that will be collected. Note that the typical Twitter user has about 200 followers (many of which are bots or spam accounts).
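To get a feel for why the follower threshold matters, here is a small back-of-the-envelope sketch in Python. The rate figures in it (5,000 follower IDs per request, 15 requests per 15-minute window) are assumptions modelled on Twitter's historical follower-ID endpoint, not numbers taken from NodeXL, and the function name is hypothetical. Adjust the constants to whatever limits apply to your account.

```python
import math

# Assumed throttle figures; not stated by NodeXL and subject to change by Twitter.
IDS_PER_REQUEST = 5000
REQUESTS_PER_WINDOW = 15
WINDOW_MINUTES = 15


def estimate_collection(follower_counts, cap=5000):
    """Return (requests needed, minutes spent waiting on rate-limit windows).

    cap is the upper threshold on followers collected per user;
    pass cap=None to estimate an uncapped collection.
    """
    requests = 0
    for count in follower_counts:
        wanted = count if cap is None else min(count, cap)
        requests += math.ceil(wanted / IDS_PER_REQUEST)
    # The first window is free; every further window costs one full wait.
    waits = (requests - 1) // REQUESTS_PER_WINDOW if requests else 0
    return requests, waits * WINDOW_MINUTES


# 99 typical users with about 200 followers each, plus one celebrity.
counts = [200] * 99 + [2_000_000]
capped = estimate_collection(counts, cap=5000)
uncapped = estimate_collection(counts, cap=None)
print("capped:", capped)
print("uncapped:", uncapped)
```

Under these assumed limits a single celebrity account accounts for most of the requests in the uncapped case, which is the situation the threshold is meant to avoid.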

In practice, the Follow network may not be as useful as the “SEARCH” network based on an active “Mention” or “Reply” from one user to another. A “Follow” may not indicate any awareness of or exposure to the other user’s content. In contrast, Reply and Mention are stronger indicators of content exposure and engagement.
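For readers curious what a Mention or Reply edge looks like as data, here is a simplified Python sketch that pulls such edges out of tweet text. It is not how NodeXL builds its network. The (author, text) input format and the rule that a leading @name counts as a reply are assumptions made for illustration only.

```python
import re

# Twitter screen names are at most 15 word characters; this pattern is a
# simplification and will also match things like e-mail domains.
MENTION = re.compile(r"@(\w{1,15})")


def mention_edges(tweets):
    """Turn (author, text) pairs into (source, target, kind) edges."""
    edges = []
    for author, text in tweets:
        names = MENTION.findall(text)
        for position, name in enumerate(names):
            # Assumption: a tweet that opens with @name is a reply to that user.
            is_reply = position == 0 and text.startswith("@")
            kind = "reply" if is_reply else "mention"
            edges.append((author.lower(), name.lower(), kind))
    return edges


sample = [
    ("alice", "@bob thanks, cc @carol"),
    ("dave", "nothing to see here"),
]
edges = mention_edges(sample)
for edge in edges:
    print(edge)
```

Each edge here reflects something a user actively wrote, which is why these networks tend to say more about engagement than a list of who follows whom.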