Twitter Stream

Attribute             Details
Source Name           twitter
Data Source           HealthTweets
Geographic Levels     National, HHS regions, Census divisions, and US states (see Geographic Codes)
Temporal Granularity  Daily and Weekly (Epiweek)
Reporting Cadence     Inactive - No longer updated since 2020w31 (2020-12-07)
Temporal Scope Start  2011w48 (2011-11-27)
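
Epiweeks appear throughout this page as six-digit values such as 2011w48 or 201501. The following is a minimal sketch of how an epiweek could be mapped to its first day. It assumes the MMWR convention (weeks run Sunday to Saturday, and week 1 is the week containing January 4), which this page does not spell out; `epiweek_start` is a hypothetical helper, not part of any Delphi client.

```python
from datetime import date, timedelta


def epiweek_start(epiweek: int) -> date:
    """Return the Sunday that starts an epiweek given as YYYYWW (e.g. 201148)."""
    year, week = divmod(epiweek, 100)
    # Assumption: MMWR rule, where week 1 is the Sunday-to-Saturday week
    # that contains January 4.
    jan4 = date(year, 1, 4)
    # date.weekday() is Monday=0 ... Sunday=6, so this steps back to the
    # Sunday on or before January 4.
    week1_start = jan4 - timedelta(days=(jan4.weekday() + 1) % 7)
    return week1_start + timedelta(weeks=week - 1)


print(epiweek_start(201148))  # 2011-11-27, the scope start listed above
```

Under that assumption the helper reproduces the 2011w48 start date given in the table.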

Overview

Estimate of influenza activity based on analysis of language used in tweets.

General topics not specific to any particular endpoint are discussed in the API overview. Such topics include: contributing, citing, and data licensing.

Table of contents

  1. The API
    1. Parameters
      1. Required
    2. Response
  2. Example URLs
    1. Twitter on 2015w01 (national)
  3. Citing the Survey
  4. Code Samples
    1. Legacy Clients

Note: Restricted access: Delphi doesn’t have permission to share this dataset.

The API

The base URL is: https://api.delphi.cmu.edu/epidata/twitter/

Parameters

Required

Parameter  Description                  Type
auth       password                     string
locations  locations                    list of location codes: nat, HHS regions, Census divisions, or state codes (see Geographic Codes)
dates      dates (see Date Formats)     list of dates
epiweeks   epiweeks (see Date Formats)  list of epiweeks

Note: Only one of dates and epiweeks is required. If both are provided, epiweeks is ignored.
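
As an illustration of how these parameters combine into a request, here is a minimal sketch that assembles a query URL. `build_twitter_url` is a hypothetical helper, not part of any Delphi client, and it assumes that list values are sent comma-separated. It mirrors the note above by preferring `dates` when both time parameters are supplied.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.delphi.cmu.edu/epidata/twitter/"


def build_twitter_url(auth, locations, dates=None, epiweeks=None):
    """Build a request URL (hypothetical helper for illustration only)."""
    if dates is None and epiweeks is None:
        raise ValueError("one of dates or epiweeks is required")
    params = {
        "auth": auth,
        # Assumption: lists are encoded as comma-separated values.
        "locations": ",".join(locations),
    }
    if dates is not None:
        # When both are given, dates is used and epiweeks is dropped,
        # matching the behavior described in the note.
        params["dates"] = ",".join(str(d) for d in dates)
    else:
        params["epiweeks"] = ",".join(str(w) for w in epiweeks)
    # safe="," keeps the commas readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=",")


print(build_twitter_url("auth_token", ["nat"], epiweeks=[201501]))
```

The placeholder `auth_token` stands in for a real password, which is required because access to this dataset is restricted.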

Response

Field               Description                                                       Type
result              result code: 1 = success, 2 = too many results, -2 = no results   integer
epidata             list of results                                                   array of objects
epidata[].location  location label                                                    string
epidata[].date      date (yyyy-MM-dd)                                                 string
epidata[].epiweek   epiweek                                                           integer
epidata[].num       number of tweets (the numerator of percent)                       integer
epidata[].total     total tweets (the denominator of percent)                         integer
epidata[].percent   percent of tweets (100 * num / total, per the example below)      float
message             success or error message                                          string
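
The result codes above suggest a small amount of client-side handling. The sketch below is one possible approach, not the behavior of any Delphi client: `rows_from_response` is a hypothetical helper that returns rows on success, treats "no results" as an empty list, and raises otherwise. The sample inputs are illustrative dictionaries, not real API output.

```python
def rows_from_response(response: dict) -> list:
    """Return the epidata rows, or raise if the call did not succeed."""
    result = response.get("result")
    if result == 1:
        # Success: the rows live under "epidata".
        return response.get("epidata", [])
    if result == -2:
        # Design choice in this sketch: "no results" is an empty list,
        # not an error.
        return []
    # Covers 2 (too many results) and any other code.
    raise RuntimeError(f"API returned result={result}: {response.get('message')}")


ok = {"result": 1, "epidata": [{"location": "nat", "epiweek": 201501}], "message": "success"}
empty = {"result": -2, "message": "no results"}
print(len(rows_from_response(ok)), len(rows_from_response(empty)))
```

Raising on code 2 is a conservative choice; a caller could instead narrow the query and retry.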

Example URLs

Twitter on 2015w01 (national)

https://api.delphi.cmu.edu/epidata/twitter/?auth=...&locations=nat&epiweeks=201501

{
  "result": 1,
  "epidata": [
    {
      "location": "nat",
      "num": 3067,
      "total": 443291,
      "epiweek": 201501,
      "percent": 0.6919
    }
  ],
  "message": "success"
}
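
The relationship between the numeric fields can be checked against the response above. This sketch parses that same JSON and compares `percent` with `100 * num / total`. The formula is inferred from the example values, not stated by the API documentation.

```python
import json

# The example response shown above, copied verbatim.
body = """
{
  "result": 1,
  "epidata": [
    {
      "location": "nat",
      "num": 3067,
      "total": 443291,
      "epiweek": 201501,
      "percent": 0.6919
    }
  ],
  "message": "success"
}
"""

payload = json.loads(body)
row = payload["epidata"][0]

# Inferred relationship: percent is num as a percentage of total.
computed = 100 * row["num"] / row["total"]
print(round(computed, 4), row["percent"])

# The two values agree to the four decimals reported by the API.
assert abs(computed - row["percent"]) < 1e-4
```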

Citing the Survey

Researchers who use the Twitter Stream data are asked to credit and cite the data source in publications based on the data. Specifically, we ask that you cite the paper describing the HealthTweets platform:

Mark Dredze, Renyuan Cheng, Michael J Paul, David A Broniatowski. HealthTweets.org: A Platform for Public Health Surveillance using Twitter. AAAI Workshop on the World Wide Web and Public Health Intelligence, 2014.

Code Samples

Libraries are available for R and Python. The following samples show how to import the library and fetch Twitter data at the national level for epiweek 201501.

In Python, install the package using pip:

pip install -e "git+https://github.com/cmu-delphi/epidatpy.git#egg=epidatpy"

# Import
from epidatpy import EpiDataContext
# Fetch data
epidata = EpiDataContext()
res = epidata.pvt_twitter(auth='auth_token', locations=['nat'], time_type="week", time_values=[201501])
print(res)

In R, load the epidatr package:

library(epidatr)
# Fetch data
res <- pvt_twitter(auth = 'auth_token', locations = 'nat',
                   time_type = "week", time_values = 201501)
print(res)

Legacy Clients

We recommend using the modern client libraries mentioned above. Legacy clients are also available for Python, R, and JavaScript.

In Python, optionally install the package using pip (or pipenv):

pip install delphi-epidata

Otherwise, place delphi_epidata.py from this repo next to your Python script.

# Import
from delphi_epidata import Epidata
# Fetch data
res = Epidata.twitter('auth_token', ['nat'], epiweeks=[201501])
print(res['result'], res['message'], len(res['epidata']))

In R, place delphi_epidata.R from this repo next to your R script.

source("delphi_epidata.R")
# Fetch data
res <- Epidata$twitter(auth = "auth_token", locations = list("nat"), epiweeks = list(201501))
print(res$message)
print(length(res$epidata))

In JavaScript, include delphi_epidata.js in your page:

<!-- Imports -->
<script src="delphi_epidata.js"></script>
<!-- Fetch data -->
<script>
  EpidataAsync.twitter('auth_token', ['nat'], EpidataAsync.range(201501, 201510)).then((res) => {
    console.log(res.result, res.message, res.epidata != null ? res.epidata.length : 0);
  });
</script>