Data & Software
  1. Categorize the Content of Domains
    Data & Software | Related Blog Posts: (1), (2) | Related Note

  2. AutoSum: Summarize Publications Automatically and Discover Miscitations

  3. Adjust Naive Estimates of Learning for Guessing
    R package | Related Paper

  4. Weather Data:
    Please read this before downloading any of the following scripts.

    • Find nearest zip codes given a list of weather stations (COOP and GHCND) via
      GeoNames: Data & Scripts
    • Find nearest weather stations given a list of zip codes: Data & Scripts
    • Get data from nearest weather station given a list of zip codes and date range
    • Get data from nearest weather station given a list of zip codes and date range
      using the NOAA webservice:  Script

  5. Image to Text:
    Please read this before downloading any of the following scripts.

  6. Edit Distance Based Search and Replace
    Software | Related Note

  7. Search a long list of names (patterns) in a large text corpus systematically and quickly

  8. Text as Data:

    • Normalize text: remove stop words, punctuation, and numbers; stem; lemmatize
    • Subset, Randomly Sample, Summarize: Script
    • Create TDM with various weighting schemes: Script
    • Sentiment Analysis: Script
    • Supervised Learning: Classification, Regression

  9. Database on Ideology, Money in Politics, and Elections (DIME):

  10. Clarifai: Understand (Moving) Images
    R package | Analysis of Politicians' Instagrams | Infer Gender Based on First Name

  11. Biographical Information of Indian Politicians
    Data & Software | Related Note

  12. Metadata and Abstracts of Articles Published in Major Political Science Journals
    Data & Software | Related Note

  13. Precinct Level Election Returns

  14. Military Experience of US Presidents and UK Prime Ministers
    Data | Related Blog Post

  15. tuber: Access YouTube from R
    R package
    REVIEW: 'Thank you very much for the package ... it has made my life easy ....'

  16. tubern: R Client for the YouTube Analytics and Reporting API
    R package

  17. Match Level Data on 43,000+ Cricket Matches
    Data & Scripts | Related Article | Related Blog Post

  18. virustotal: R Client for the VirusTotal Public API 2.0
    R package

  19. aws.alexa: Access Amazon Alexa from R
    R package

  20. Infer Gender Based on First Name for International Names
    R Data Package | Infer Gender of First Name Using Google Image Search and Clarifai

  21. Geographic Information on Designated Market Areas (DMAs)
    Shape files, Crosswalks to counties and zip codes

  22. Cable TV Penetration in Designated Market Areas (DMAs): 1983–2010

  23. Cable Operators and Channels Offered in Towns (from TV & Cable Factbook): 1997–2002
    Data | Related Blog Post

  24. Complaints to the Independent Press Standards Organisation (IPSO)
    Data & Scripts | Related Blog Post

  25. Politifact
    Data & Scripts | Related Blog Post

  26. Collecting Data from the Streets:

  27. Race and Gender of Victims and Perpetrators on Law & Order
    Data & Scripts | Related Note

  28. 70+ Years of Network TV Schedules; Race & Gender of People Involved
    Data, scripts, and analyses

  29. Impute Race and Ethnicity Based on Name
    R Package | Python Package

  30. State and Local Public Employee Salaries