Data & Software
  1. Impute Race/Ethnicity and Gender From Name:

  2. Search a long list of names (patterns) in a large text corpus systematically and quickly

  3. Categorize the Content of Domains
    Data & Software | Related Blog Posts: (1), (2) | Related Note

  4. AutoSum: Summarize Publications Automatically and Discover Miscitations

  5. Adjust Naive Estimates of Learning for Guessing
    R package | Related Paper

  6. Weather Data:
    Please read this before downloading any of the following scripts.

    • Find nearest zip codes given a list of weather stations (COOP and GHCND) via
      GeoNames: Data & Scripts
    • Find nearest weather stations given a list of zip codes: Data & Scripts
    • Get data from nearest weather station given a list of zip codes and date range
    • Get data from nearest weather station given a list of zip codes and date range
      using the NOAA webservice:  Script
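
    The nearest-station lookups above can be sketched as a great-circle
    distance search. This is a minimal illustration, not the linked scripts:
    the function names, the tuple layout, and the sample station IDs and
    coordinates are all hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_station(zip_coord, stations):
    """Return the (station_id, lat, lon) tuple closest to zip_coord = (lat, lon)."""
    lat, lon = zip_coord
    return min(stations, key=lambda s: haversine_km(lat, lon, s[1], s[2]))


# Hypothetical stations and zip-code centroid, for illustration only.
stations = [("STATION_A", 40.78, -73.97), ("STATION_B", 34.05, -118.24)]
print(nearest_station((40.75, -73.99), stations)[0])
```

    A linear scan like this is fine for a few thousand stations; for larger
    lists a spatial index would likely be a better fit.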

  7. Image to Text:
    Please read this before downloading any of the following scripts.

  8. Edit Distance Based Search and Replace
    Software | Related Note
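
    One way to read "edit distance based search and replace" is: replace any
    word within a small Levenshtein distance of a target word. The sketch
    below assumes that reading; it is not the linked software, and
    `fuzzy_replace` and its `max_dist` parameter are hypothetical names.

```python
def levenshtein(a, b):
    """Levenshtein (insert/delete/substitute) distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def fuzzy_replace(text, target, replacement, max_dist=1):
    """Replace whitespace-separated words within max_dist edits of target."""
    words = text.split()
    return " ".join(
        replacement if levenshtein(w.lower(), target.lower()) <= max_dist else w
        for w in words
    )


print(fuzzy_replace("the goverment said", "government", "government"))
```

    Splitting on whitespace collapses runs of spaces and treats attached
    punctuation as part of a word, so real text would need a proper
    tokenizer first.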

  9. Text as Data:

    • Normalize text, remove stop words, punctuation, numbers, stem, lemmatize
    • Subset, Randomly Sample, Summarize: Script
    • Create TDM with various weighting schemes: Script
    • Sentiment Analysis: Script
    • Supervised Learning: Classification, Regression
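
    The first three steps above (normalize, drop stop words, build a
    term-document matrix) can be sketched with the standard library alone.
    This is an illustration under stated assumptions, not the linked scripts:
    the stop-word list is a tiny placeholder, stemming and lemmatization are
    omitted, and the only weighting shown is raw term frequency.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption, not a standard list).
STOP_WORDS = {"the", "a", "an", "and", "of", "to", "in", "is"}


def normalize(text):
    """Lowercase, strip punctuation and numbers, and drop stop words."""
    text = re.sub(r"[^a-z\s]", " ", text.lower())
    return [tok for tok in text.split() if tok not in STOP_WORDS]


def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts vocab[i] in docs[j]."""
    counts = [Counter(normalize(d)) for d in docs]
    vocab = sorted(set().union(*counts))
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix


docs = ["The cat sat.", "A cat and 2 dogs!"]
vocab, tdm = term_document_matrix(docs)
print(vocab)
print(tdm)
```

    Other weighting schemes, such as tf-idf, would rescale the raw counts in
    each row of the matrix.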

  10. Database on Ideology, Money in Politics, and Elections (DIME):

  11. Clarifai: Understand (Moving) Images
    R package | Analysis of Politicians' Instagrams | Infer Gender Based on First Name

  12. Biographical Information of Indian Politicians
    Data & Software | Related Note

  13. Metadata and Abstracts of Articles Published in Major Political Science Journals
    Data & Software | Related Note

  14. Precinct Level Election Returns

  15. Military Experience of US Presidents and UK Prime Ministers
    Data | Related Blog Post

  16. tuber: Access YouTube from R
    R package
    REVIEW: 'Thank you very much for the package ... it has made my life easy ....'

  17. tubern: R Client for the YouTube Analytics and Reporting API
    R package

  18. Match Level Data on 43,000+ Cricket Matches
    Data & Scripts | Related Article | Related Blog Post

  19. virustotal: R Client for the VirusTotal Public API 2.0
    R package

  20. aws.alexa: Access Amazon Alexa from R
    R package

  21. Geographic Information on Designated Market Areas (DMAs)
    Shape files, Crosswalks to counties and zip codes

  22. Cable TV Penetration in Designated Market Areas (DMAs): 1983–2010

  23. Cable Operators and Channels Offered in Towns (from TV & Cable Factbook) (1997–2002)
    Data | Related Blog Post

  24. UK Digital TV Coverage
    Data & Scripts

  25. Complaints to the Independent Press Standards Organisation (IPSO)
    Data & Scripts | Related Blog Post

  26. Politifact
    Data & Scripts | Related Blog Post

  27. Collecting Data from the Streets:

  28. Race and Gender of Victims and Perpetrators on Law & Order
    Data & Scripts | Related Paper

  29. 70+ years of Network TV Schedules; Race & Gender of People Involved
    Data, scripts, and analyses

  30. State and Local Public Employee Salaries