Data & Software
  1. Categorize the Content of Domains
    Data & Software | Related Blog Posts: (1), (2) | Related Note

  2. AutoSum: Summarize Publications Automatically and Discover Miscitations
    Software

  3. Adjust Naive Estimates of Learning for Guessing
    R package | Related Paper

  4. Weather Data:
    Please read this before downloading any of the following scripts.

    • Find nearest zip codes given a list of weather stations (COOP and GHCND) via
      GeoNames: Data & Scripts
    • Find nearest weather stations given a list of zip codes: Data & Scripts
    • Get data from nearest weather station given a list of zip codes and date range:
      Script
    • Get data from nearest weather station given a list of zip codes and date range
      using the NOAA web service: Script

  5. Image to Text:
    Please read this before downloading any of the following scripts.

  6. Edit Distance Based Search and Replace
    Software | Related Note

  7. Search for a long list of names (patterns) in a large text corpus systematically and quickly
    Software

  8. Text as Data:

    • Normalize text (remove stop words, punctuation, and numbers; stem; lemmatize):
      Script
    • Subset, Randomly Sample, Summarize: Script
    • Create TDM with various weighting schemes: Script
    • Sentiment Analysis: Script
    • Supervised Learning: Classification, Regression

  9. Database on Ideology, Money in Politics, and Elections (DIME):

  10. Clarifai: Understand (Moving) Images
    R package | Analysis of Politicians' Instagrams | Infer Gender Based on First Name

  11. Biographical Information of Indian Politicians
    Data & Software | Related Note

  12. Metadata and Abstracts of Articles Published in Major Political Science Journals
    Data & Software | Related Note

  13. Precinct Level Election Returns
    Data

  14. Military Experience of US Presidents and UK Prime Ministers
    Data | Related Blog Post

  15. tuber: Access YouTube from R
    R package
    REVIEW: 'Thank you very much for the package ... it has made my life easy ....'

  16. Match Level Data on 43,000+ Cricket Matches
    Data & Scripts | Related Article | Related Blog Post

  17. virustotal: R Client for the VirusTotal Public API 2.0
    R package

  18. aws.alexa: Access Amazon Alexa from R
    R package

  19. Infer Gender Based on First Name for International Names
    R Data Package | Infer Gender of First Name Using Google Image Search and Clarifai

  20. Geographic Information on Designated Market Areas (DMAs)
    Shape files, Crosswalks to counties and zip codes

  21. Cable TV Penetration in Designated Market Areas (DMAs): 1983–2010
    Data

  22. Cable Operators and Channels Offered in Towns (from TV & Cable Factbook) (1997–2002)
    Data | Related Blog Post

  23. Complaints to the Independent Press Standards Organisation (IPSO)
    Data & Scripts | Related Blog Post

  24. Politifact
    Data & Scripts | Related Blog Post

  25. Collecting Data from the Streets:

  26. Race and Gender of Victims and Perpetrators on Law & Order
    Data & Scripts | Related Note

  27. 70+ years of Network TV Schedules; Race & Gender of People Involved
    Data, scripts, and analyses

  28. Impute Race and Ethnicity Based on Name
    R Package