Several dictionaries exist that can be loaded into TAPA. If you have your own dictionary that you would like to make available to others, please send me an e-mail to have it uploaded to this site.

*Note: The Warriner et al. ratings found below are the same that are loaded into the software by default.

**Note: For all of the dictionary files on this page, you may want to set TAPA’s “Text Encoding” option to utf-8 when reading the files in. Your system’s default encoding may also be fine.

Dictionary Files

Right Click and “Save Link As…” to download a dictionary file. Alternatively, click the “download” link, then copy and paste the dictionary contents into a .txt file on your hard drive.

Bestgen & Vincze (2012) – DIC-LSA norms (link) (download)

Buchanan et al. (2012) – Single word norms (link) (download)

Brysbaert, Warriner, & Kuperman (2014) – Concreteness norms (link) (download) (rescaled version)

Brysbaert et al. (2014) – Concreteness norms for Dutch (link) (download)

Engelthaler & Hills (2017) – Humor norms (link) (download)

Kovács, Carroll, & Lehman (2013) – Authenticity norms (link) (download)

Kuperman et al. (2012) – Age of Acquisition norms (link) (download)

Kwan et al. – Embodiment ratings for 687 English verbs (link) (download)

Lynott & Connell (2009) – Adjective Modality norms (link) (download)

Lynott & Connell (2013) – Noun Modality norms (link) (download)

Lynott & Connell – Adjective and Noun Modality norms combined (download)

Stadthagen-Gonzalez et al. (2016) – Spanish Emotional Norms (link) (download)

Warriner et al. (2013) – Affective rating norms (link) (download)

Pre-trained Word Vectors

Note: These dictionaries consist of extremely large numbers of words and dimensions. All of the following files are extremely large and are extremely memory-intensive. You will most likely need at least 32GB of RAM in your computer to use any of the following files. Some may require 64+ GB of RAM.

You may also consider downloading the “first 100K” or “first 500k” versions of these dictionary files. They are shortened versions of the full dictionaries and may be more viable on less cutting-edge systems.

Pennington et al. (2014) – GloVe pre-trained vectors (link)

⇒ Wikipedia 2014 + Gigaword 5, 100-dimensional version, de-duplicated (download)

⇒ Twitter, 25 through 200 dimension versions, cleaned (download)
(first 500K words version) (first 100K words version)

⇒ Common Crawl, 42B token version, 300 dimensions, uncased, cleaned (download)
(first 500K words version) (first 100K words version)