norare-data/datasets.tsv at master · concepticon/norare-data · GitHub

ID	AUTHOR	YEAR	TAGS	SOURCE_LANGUAGE	TARGET_LANGUAGE	URL	REFS	NOTE	ALIAS
Bond-2013-OMW	Bond, Francis and Foster, Ryan	2013	relations,ontology	English	ENGLISH	http://compling.hss.ntu.edu.sg/omw/	Bond2013	This is an automated mapping to the [Open Multilingual Wordnet](http://compling.hss.ntu.edu.sg/omw/), derived from the Princeton Wordnet ([Fellbaum 1998](:bib:Fellbaum1998)). The pre-selection of lexical items itself is based on a selection of *core* items from the Open Multilingual Wordnet, which are supposed to represent basic vocabulary ([Boyd-Graber et al. 2006](:bib:BoydGraber2006)).	OMW
Alonso-2015-AoA	Alonso, M. A. and Fernandez, A. and Diez, E.	2015	ratings	Spanish (Spain)	SPANISH	https://doi.org/10.3758/s13428-014-0454-2	Alonso2015	This list includes subjective estimations of age of acquisition (AoA) for Spanish words. The ratings were collected from college students in Spain. Oral frequency norms are taken from [Alonso et al. (2011)](:bib:Alonso2011).
Brysbaert-2009-Frequency	Brysbaert, Marc and New, Boris	2009	norms	English (US)	ENGLISH	https://doi.org/10.3758/BRM.41.4.977	Brysbaert2009	This list includes word frequencies based on television and film subtitles in US American English.	SUBTLEX-US
Brysbaert-2011-Frequency	Brysbaert, Marc and Buchmeier, Matthias and Conrad, Markus and Jacobs, Arthur M. and Boelte, Jens and Boehl, Andrea	2011	norms	German	GERMAN	https://www.ncbi.nlm.nih.gov/pubmed/21768069	Brysbaert2011	This list includes word frequencies based on television and film subtitles in German.	SUBTLEX-DE
Brysbaert-2014a-Concreteness	Brysbaert, Marc and Warriner, Amy Beth and Kuperman, Victor	2014	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-013-0403-5	Brysbaert2014	This list includes concreteness ratings  from over 4,000 participants by means of a norming study using Internet crowdsourcing for data collection. The ratings were obtained on a 5-point rating scale going from abstract to concrete.
Brysbaert-2019-Prevalence	Brysbaert, Marc and Mandera, Pawel and McCormick, Samantha F. and Keuleers, Emmanuel	2019	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-018-1077-9	Brysbaert2019	This list includes word prevalence ratings. Word prevalence refers to the number of people who know the word. The measure was obtained on the basis of an online crowdsourcing study involving over 220,000 people.
Cai-2010-Frequency	Cai, Q. and Brysbaert, M.	2010	norms	Chinese	CHINESE	http://www.ugent.be/pp/experimentele-psychologie/en/research/documents/subtlexch/cai.pdf	Cai2010	This list includes word frequencies based on television and film subtitles in Chinese.	SUBTLEX-CH
Cuetos-2011-Frequency	Cuetos, Fernando and Glez-Nosti, Maria and Barbon, Analia and Brysbaert, Marc	2011	norms	Spanish	SPANISH	http://crr.ugent.be/papers/CUETOS%20et%20al%202011.pdf	Cuetos2011	This list includes Spanish word frequencies taken from contemporary movies and TV series (screened between 1990 and 2009).	SUBTLEX-ESP
Desrochers-2009-SubjFrequency	Desrochers, Alain and Thompson, Glenn L.	2009	ratings	French	FRENCH	https://doi.org/10.3758/BRM.41.2.546	Desrochers2009	This list includes subjective frequency and imageability estimates for French nouns. The data were collected from two independent groups of 72 young adults each. They rated the words on a 7-point scale.
Engelthaler-2018-Humor	Engelthaler, Tomas and Hills, Thomas T.	2018	ratings	English	ENGLISH	https://link.springer.com/article/10.3758/s13428-017-0930-6	Engelthaler2018	This list includes humor ratings for English words. The data was collected from 821 participants using an online crowd-sourcing platform. Each participant rated 211 words on a scale from 1 (humorless) to 5 (humorous).
Juhasz-2013-SER	Juhasz, Barbara J. and Yap, Melvin J.	2013	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-012-0242-9	Juhasz2013	This list includes sensory experience ratings (SER) for English words. Sensory experience ratings reflect the extent to which a word evokes a sensory and/or perceptual experience in the mind of the reader. Participants were asked to rate the degree to which each word evoked a sensory experience, on a 1 to 7 scale, with higher numbers indicating a greater sensory experience.
Keuleers-2010-Frequency	Keuleers, Emmanuel and Brysbaert, Marc and New, Boris	2010	norms	Dutch	DUTCH	https://doi.org/10.3758/BRM.42.3.643	Keuleers2010	This list includes word frequencies based on television and film subtitles in Dutch.	SUBTLEX-NL
Kuperman-2012-AoA	Kuperman, Victor and Stadthagen-Gonzalez, Hans and Brysbaert, Marc	2012	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-012-0210-4	Kuperman2012	This list includes age-of-acquisition (AoA) ratings for English content words (nouns, verbs, and adjectives). For data collection, this megastudy used the Web-based crowdsourcing technology offered by the Amazon Mechanical Turk. Since the download link used for this dataset is no longer functional, the data can also partly be found in the dataset folder Green-2025b-AoA.
Riegel-2015-AffectiveRatings	Riegel, Monika and Wierzba, Malgorzata and Wypych, Marek and Zurawski, Lukasz and Jednorog, Katarzyna and Grabowska, Anna and Marchewka, Artur	2015	ratings	Polish	POLISH	https://doi.org/10.3758/s13428-014-0552-1	Riegel2015	This list includes the Nencki Affective Word List (NAWL). The items were translated from German to Polish based on the stimuli in the Berlin Affective Word List-Reloaded (BAWL-R: [Vo et al. 2009](:bib:Vo2009)). The data include nouns, verbs, and adjectives, with ratings of emotional valence, arousal, and imageability.	NAWL
Scott-2019-Ratings	Scott, Graham G. and Keitel, Anne and Becirspahic, Marc and Yao, Bo and Sereno, Sara C.	2019	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-018-1099-3	Scott2019	This list includes the Glasgow Norms: a set of normative ratings for English words on nine psycholinguistic dimensions: arousal, valence, dominance, concreteness, imageability, familiarity, age of acquisition, semantic size, and gender association. The first three values (arousal, valence, dominance) are rated on a 9-point scale, all others are rated on a 7-point scale.	Glasgow Norms
StadthagenGonzalez-2017-ValenceArousal	Stadthagen-Gonzalez, Hans and Imbault, Constance and Sanchez, Miguel A Perez and Brysbaert, Marc	2017	ratings	Spanish	SPANISH	https://doi.org/10.3758/s13428-015-0700-2	StadthagenGonzalez2017	This list includes valence and arousal ratings for Spanish words. Participants rated the words on a 9-point scale.
Starostin-2000-Sense	Starostin, Sergei	2000	relations	English	ENGLISH	http://starling.rinet.ru/program.php?lan=en	Starostin2000a	This list includes sense relations for English submitted with the STARLING database program.
Warriner-2013-AffectiveRatings	Warriner, Amy Beth and Kuperman, Victor and Brysbaert, Marc	2013	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-012-0314-x	Warriner2013	This list includes ratings on valence (the pleasantness of a stimulus), arousal (the intensity of emotion provoked by a stimulus), and dominance (the degree of control exerted by a stimulus). Participants rated the words on a 9-point scale.
Cortese-2008-AoA	Cortese, M. J. and Khanna, M. M.	2008	ratings	English	ENGLISH	https://doi.org/10.3758/BRM.40.3.791	Cortese2008	This list includes age of acquisition (AoA) ratings made on a 1-7 scale for monosyllabic words of English. The data were obtained from 32 participants.
Keuleers-2012-LexicalDecision	Keuleers, Emmanuel and Lacey, Paula and Rastle, Kathleen and Brysbaert, Marc	2012	norms	English (British)	ENGLISH	https://doi.org/10.3758/s13428-011-0118-4	Keuleers2012	This list includes lexical decision times for English words, for which two groups of British participants each responded to monosyllabic and disyllabic words.	British Lexicon Project
Ferrand-2010-LexicalDecision	Ferrand, Ludovic and New, Boris and Brysbaert, Marc and Keuleers, Emmanuel and Bonin, Patrick and Meot, Alain and Augustinova, Maria and Pallier, Christophe	2010	norms	French	FRENCH	https://doi.org/10.3758/BRM.42.2.488	Ferrand2010	This list includes lexical decision times for French words.	French Lexicon Project
GonzalezNosti-2014-LexicalDecision	Gonzalez-Nosti, Maria and Barbon, Analia and Rodriguez-Ferreiro, Javier and Cuetos, Fernando	2014	norms, ratings	Spanish (Spain)	SPANISH	https://doi.org/10.3758/s13428-013-0383-5	GonzalezNosti2014	This list includes lexical decision times for Spanish words. In addition, the study collected AoA ratings from 25 psychology students on a 7-point Likert scale in which 1 corresponded to ages between 0 and 2 years old, 2 to ages between 2 and 4, and so on up to 7, which corresponded to ages over 12 years old.
Tsang-2018-LexicalDecision	Tsang, Yiu-Kei and Huang, Jian and Lui, Ming and Xue, Mingfeng and Chan, Yin-Wah Fiona and Wang, Suiping and Chen, Hsuan-Chih	2018	norms	Chinese	CHINESE	https://doi.org/10.3758/s13428-017-0944-0	Tsang2018	This list includes lexical decision times for Chinese words.	MELD-SCH
Keuleers-2015-Prevalence	Keuleers, Emmanuel and Stevens, Michael and Mandera, Pawel and Brysbaert, Marc	2015	ratings	Dutch (Belgium, Netherlands)	DUTCH	https://doi.org/10.1080/17470218.2015.1022560	Keuleers2015	This list includes word prevalence ratings. Word prevalence refers to the number of people who know the word. The measure was obtained on the basis of an online crowdsourcing study involving nearly 300,000 Dutch speakers in Belgium and the Netherlands.
StadthagenGonzalez-2018-DiscreteEmotions	Stadthagen-Gonzalez, Hans and Ferre, Pilar and Perez-Sanchez, Miguel A. and Imbault, Constance and Hinojosa, Jose Antonio	2018	ratings	Spanish (Spain)	SPANISH	https://doi.org/10.3758/s13428-017-0962-y	StadthagenGonzalez2018	This list includes ratings for discrete emotion categories (happiness, sadness, anger, fear, and disgust). The ratings were obtained on a scale from 1-5 for each category. In addition, the dataset includes norms on PoS taken from [Duchon et al. (2013)](:bib:Duchon2013).
Alonso-2016-AoA	Alonso, Maria Angeles and Diez, Emiliano and Fernandez, Angel	2016	ratings	Spanish (Spain)	SPANISH	https://doi.org/10.3758/s13428-015-0675-z	Alonso2016	This list includes subjective estimations of age-of-acquisition (AoA) for Spanish verbs. The ratings were collected from college students in Spain.
Imbir-2021-Ratings	Imbir, Kamil K.	2021	ratings	Polish	POLISH	https://doi.org/10.3389/fpsyg.2021.707540	Imbir2021	This list includes ratings on psycholinguistic measures of Valence, Arousal, Dominance, Origin, Significance, Concreteness, Imageability, and subjective Age of Acquisition. The participants were students (excluding psychology students). The ratings were based on different Self-Assessment Manikin (SAM) scales. The author published a [Corrigendum](https://www.frontiersin.org/articles/10.3389/fpsyg.2021.707540/full) and updated the original data set in Imbir ([2016](:bib:Imbir2016)).
Ferre-2017-DiscreteEmotions	Ferre, Pilar and Guasch, Marc and Martinez-Garcia, Natalia and Fraga, Isabel and Hinojosa, Jose Antonio	2017	ratings	Spanish	SPANISH	https://doi.org/10.3758/s13428-016-0768-3	Ferre2017	This list includes ratings on discrete emotions for Spanish words in five discrete emotion categories: happiness, anger, fear, disgust, and sadness. The participants rated the words on a 5-point scale.
Wierzba-2015-DiscreteEmotions	Wierzba, Malgorzata and Riegel, Monika and Wypych, Marek and Jednorog, Katarzyna and Turnau, Pawel and Grabowska, Anna and Marchewka, Artur	2015	ratings	Polish	POLISH	https://doi.org/10.1371/journal.pone.0132305	Wierzba2015	This list includes ratings on discrete emotions for Polish words in five discrete emotion categories: happiness, anger, fear, disgust, and sadness. The participants rated the words on a 7-point scale. The items were based on the stimuli in [Riegel et al. (2015)](:bib:Riegel2015).	NAWL BE
Alonso-2011-OralFrequency	Alonso, Maria Angeles and Fernandez, Angel and Diez, Emiliano	2011	norms	Spanish	SPANISH	https://doi.org/10.3758/s13428-011-0062-3	Alonso2011	This list includes frequency norms for spoken words based on a corpus of over three million units, representing present-day use of the language in Spain.
Lynott-2020-Sensorimotor	Lynott, Dermot and Connell, Louise and Brysbaert, Marc and Brand, James and Carney, James	2020	ratings	English	ENGLISH	https://doi.org/10.3758/s13428-019-01316-z	Lynott2020	This list includes ratings across six perceptual modalities (touch, hearing, smell, taste, vision, and interoception) and five action effectors (mouth/throat, hand/arm, foot/leg, head excluding mouth/throat, and torso), gathered from a total of 3,500 individual participants using the Amazon Mechanical Turk platform. The ratings were based on a 5-point scale.	Lancaster Sensorimotor Norms
Kapucu-2018-EmotionRatings	Kapucu, Aycan and Kilic, Asli and Ozkilic, Yildiz and Saribaz, Bengisu	2018	ratings	Turkish	TURKISH	https://doi.org/10.1177/0033294118814722	Kapucu2018	This list includes ratings on two major dimensions of emotion: arousal and valence, as well as on five basic emotion categories of happiness, sadness, anger, fear, and disgust. In addition, ratings on concreteness were collected. The items were translated from the Affective Norms for English Words (ANEW: [Bradley and Lang 1999](:bib:Bradley1999)) to Turkish.
Briesemeister-2011-DiscreteEmotions	Briesemeister, Benny B. and Kuchinke, Lars and Jacobs, Arthur M.	2011	ratings	German	GERMAN	https://doi.org/10.3758/s13428-011-0059-y	Briesemeister2011	This list includes discrete emotions ratings for the categories happiness, sadness, anger, fear, and disgust. The ratings on German nouns were collected from university students (including psychology) on a 5-point Likert scale (1 = low intensity, 5 = strong intensity). The list is based on the Berlin Affective Word List-Reloaded (BAWL-R: [Vo et al. 2009](:bib:Vo2009)).	DENN–BAWL
Mandera-2015-Frequency	Mandera, Pawel and Keuleers, Emmanuel and Wodniecka, Zofia and Brysbaert, Marc	2015	norms	Polish	POLISH	https://doi.org/10.3758/s13428-014-0489-4	Mandera2015	This list includes word frequencies based on television and film subtitles in Polish.	SUBTLEX-PL
Moors-2013-Ratings	Moors, Agnes and De Houwer, Jan and Hermans, Dirk and Wanmaker, Sabine and Van Schie, Kevin and Van Harmelen, Anne-Laura and De Schryver, Maarten and De Winne, Jeffrey and Brysbaert, Marc	2013	ratings	Dutch	DUTCH	https://doi.org/10.3758/s13428-012-0243-8	Moors2013	This list includes ratings on valence, arousal and age-of-acquisition for Dutch words. The ratings were conducted on a  7-point Likert scale. The participants were students from the Netherlands and Belgium.
Wu-2020-CoreVocabulary	Wu, Winston and Nicolai, Garrett and Yarowsky, David	2020	relations	Global	ENGLISH	https://www.aclweb.org/anthology/2020.lrec-1.519.pdf	Wu2020	This list was created automatically to present a core vocabulary. The automatic creation was based on the relative coverage of each target concept across 1895 bilingual dictionaries in the LanguageNet multilingual lexicon [(Baldwin et al., 2010)](:bib:Baldwin2010). The LINE_IN_SOURCES column represents the ranking of the words in the original list.
Mohammad-2018-AffectiveRatings	Mohammad, Saif M.	2018	ratings	English	ENGLISH	https://doi.org/10.18653/v1/P18-1017	Mohammad2018a	This list includes valence, arousal, and dominance ratings for English words. The words were rated by participants on a crowd-sourcing platform. The ratings were obtained by the annotation of best-worst scaling for four words. In addition, the author offers translations of the words for various languages (http://sentiment.nrc.ca/lexicons-for-research/).	NRC VAD Lexicon
Mohammad-2018-EmotionIntensity	Mohammad, Saif M.	2018	ratings	English	ENGLISH	https://www.aclweb.org/anthology/L18-1027.pdf	Mohammad2018b	This list includes ratings of dominant emotion (anger, fear, joy, sadness, disgust, anticipation, trust, surprise) and intensity for English words. The dominant emotion is based on pointwise mutual information in the Hashtag Emotion Corpus [(Mohammad 2012)](:bib:Mohammad2012). The words were rated for emotion intensity by participants on a crowd-sourcing platform. The ratings were obtained by the annotation of best-worst scaling for four words. In addition, the author offers translations of the words for various languages (http://sentiment.nrc.ca/lexicons-for-research/).	NRC Affect Intensity Lexicon
Clark-2004-ImageryFamiliarity	Clark, James M. and Paivio, Allan	2004	ratings	English	ENGLISH	https://doi.org/10.3758/BF03195584	Clark2004	This list includes ratings for imageability and familiarity for English words. The data set is an extension of the norms in [Paivio et al. (1968)](:bib:Paivio1968). The words were rated by psychology students on a 7-point scale. Imageability ratings for words with a number higher than 0 in the column PAVIO_NORMS were taken from [Paivio et al. (1968)](:bib:Paivio1968).
Abdaoui-2017-EmoLex	Abdaoui, Amine and Azé, Jérôme and Bringay, Sandra and Poncelet, Pascal	2017	relations	French	FRENCH	https://doi.org/10.1007/s10579-016-9364-5	Abdaoui2017	This list represents the French Expanded Emotion Lexicon (FEEL). It includes the results of a sentiment analysis of texts according to the emotions associated with the text. The authors provide information for the dominant polarity of a given word (negative/positive) and the dominant emotion(s) associated with the word (joy, anger, surprise, sadness, disgust, fear) presented as binary values (1/0).	FEEL
Matisoff-2015-STEDT	Matisoff, James A.	2015	relations	English	Proto-Sino-Tibetan	https://stedt.berkeley.edu/	Matisoff2015	This list represents the semantic categorization of the glosses in the Sino-Tibetan Etymological Dictionary and Thesaurus (STEDT). It is an etymological dictionary of Proto-Sino-Tibetan (PST), the ancestor language of the large Sino-Tibetan language family. This family includes Chinese, Tibetan, Burmese, and over 200 other languages spoken in South and Southeast Asia.	STEDT
Kiss-1973-EAT	Kiss, G. and Armstrong, Christine and Milroy, R. and Piper, J.	1973	relations	English	English	http://vlado.fmf.uni-lj.si/pub/networks/data/dic/eat/Eat.htm	Kiss1973	The Edinburgh Associative Thesaurus offers user ratings for a large list of English words. The data themselves are no longer officially available, since the website went down, but we found parts of the data distributed along with the Pajek project for network analysis and visualization. We computed weighted degree and unweighted degree per concept from the data themselves.	EAT
Wikidata		2020	relations	English	Global	https://www.wikidata.org/	Wikidata	This list includes matches between the words in Wikidata and Concepticon concepts. The words are extracted automatically, and the mapping was semi-automated in that double mappings were checked by hand. The Wikidata repository consists mainly of items, each one having a label, a description and any number of aliases.	Wikidata
OmegaWiki		2020	relations	English	Global	http://www.omegawiki.org	OmegaWiki	This list includes matches between the words in the OmegaWiki and Concepticon concepts. OmegaWiki aims to create a dictionary of all words of all languages, including lexical, terminological and ontological information.	OmegaWiki
Babelnet		2020	relations	English	Global	http://babelnet.org	BabelNet	This list includes matches between the words in BabelNet and Concepticon concepts. BabelNet is an extension of WordNet and offers word senses in multiple languages.	Babelnet
Numerals		2020	relations	Global	Global			This list includes matches between integer numbers and Concepticon concepts.
Crepaldi-2015-Frequency	Crepaldi, D. and Amenta, S. and Pawel, M. and Keuleers, E. and Brysbaert, Marc	2015	norms	Italian	Italian	https://lrlac.sissa.it/publications/subtlex-it-subtitle-based-word-frequency-estimates-italian	Crepaldi2015	This list includes word frequencies based on television and film subtitles in Italian.	SUBTLEX-IT
VanHeuven-2014-Frequency	Van Heuven, Walter J.B. and Mandera, Pawel and Keuleers, Emmanuel and Brysbaert, Marc	2014	norms	English	English	http://crr.ugent.be/archives/1423	VanHeuven2014	This list includes word frequencies based on television and film subtitles in UK-English.	SUBTLEX-UK
Medler-2005-Perceptual	Medler, David A. and Arnoldussen, Aimee and Binder, Jeffrey R. and Seidenberg, Mark S.	2005	ratings	English	English	http://www.neuro.mcw.edu/ratings/	Medler2005	This database contains mean perceptual attribute ratings in 4 sensory-motor domains (Sound, Color, Manipulation, Motion) for 1402 words, as well as Emotion ratings reflecting intensity and valence of emotional associations for the same words.	Wisconsin Perceptual Attribute Rating Database
Gilhooly-1980-Ratings	Gilhooly, Kenneth J. and Logie, Robert H.	1980	ratings	English	English	https://link.springer.com/content/pdf/10.3758/BF03201693.pdf	Gilhooly1980	This list contains age-of-acquisition, imagery, concreteness, familiarity, and ambiguity measures for 1,944 English words (nouns) of varying length and frequency.
Speed-2022-Sensorimotor	Speed, Laura J. and Brysbaert, Marc	2022	ratings	Dutch	Dutch	https://doi.org/10.3758/s13428-021-01656-9	Speed2022	This list contains sensory modality ratings for 24,000 Dutch words. The modalities include audition, gustation, haptics, olfaction, vision, and interoception.
Chedid-2019-Familiarity	Chedid, Georges and Wilson, Maximilliano A. and Bedetti, Christophe and Rey, Amandine E. and Vallet, Guillaume T. and Brambati, Simona Maria	2019	ratings	French	French		Chedid2019a	This list contains familiarity ratings for 3,596 French nouns. In addition, reaction times were collected.
Bolognesi-2022-Specificity	Bolognesi, Marianna Marcella and Caselli, Tommaso	2022	ratings	English, Italian	Italian		Bolognesi2022	This list contains specificity ratings for Italian nouns, adjectives, and verbs on a 5-point scale.	ANEW-ITA
Montefinese-2014-AffectiveRatings	Montefinese, Maria and Ambrosini, Ettore and Fairfield, Beth and Mammarella, Nicola	2014	ratings	English, Italian	Italian		Montefinese2014	This list contains affective ratings for Italian nouns, adjectives, and verbs on a 9-point scale.
DiNatale-2021a-AffectiveColexifications	Di Natale, Anna and Pellert, Max and Garcia, David	2021	ratings	English	Global		DiNatale2021	This list contains a lexicon based on an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. The lexicon is constructed with CLICS3 (Rzymski et al. [2020](:bib:Rzymski2020)) and the NRC VAD Lexicon (Mohammad [2018a](:bib:Mohammad2018a)).
DiNatale-2021b-AffectiveColexifications	Di Natale, Anna and Pellert, Max and Garcia, David	2021	ratings	English	Global		DiNatale2021	This list contains a lexicon based on an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. The lexicon is constructed with CLICS3 (Rzymski et al. [2020](:bib:Rzymski2020)) and the affective ratings by Warriner et al. ([2013](:bib:Warriner2013)).
DiNatale-2021c-AffectiveColexifications	Di Natale, Anna and Pellert, Max and Garcia, David	2021	ratings	English	Global		DiNatale2021	This list contains a lexicon based on an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. The lexicon is constructed with FreeDict (freedict.org) and the NRC VAD Lexicon (Mohammad [2018a](:bib:Mohammad2018a)).
DiNatale-2021d-AffectiveColexifications	Di Natale, Anna and Pellert, Max and Garcia, David	2021	ratings	English	Global		DiNatale2021	This list contains a lexicon based on an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. The lexicon is constructed with FreeDict (freedict.org) and the affective ratings by Warriner et al. ([2013](:bib:Warriner2013)).
DiNatale-2021e-AffectiveColexifications	Di Natale, Anna and Pellert, Max and Garcia, David	2021	ratings	English	Global		DiNatale2021	This list contains a lexicon based on an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. The lexicon is constructed with [OmegaWiki](:bib:OmegaWiki) and the NRC VAD Lexicon (Mohammad [2018a](:bib:Mohammad2018a)).
DiNatale-2021f-AffectiveColexifications	Di Natale, Anna and Pellert, Max and Garcia, David	2021	ratings	English	Global		DiNatale2021	This list contains a lexicon based on an unsupervised method of affective lexicon extension that uses colexification network data to interpolate the affective ratings of words that are not included in the original lexicon. The lexicon is constructed with [OmegaWiki](:bib:OmegaWiki) and the affective ratings by Warriner et al. ([2013](:bib:Warriner2013)).
Montefinese-2019-AoA	Montefinese, Maria and Vinson, David and Vigliocco, Gabriella and Ambrosini, Ettore	2019	ratings	English, Italian	Italian		Montefinese2019	This list contains age-of-acquisition ratings for Italian nouns, adjectives, and verbs. Adult participants were asked to estimate the age at which they thought they had learned the word.	ItAoA
Speed-2024-Emotions	Speed, Laura J. and Brysbaert, Marc	2024	ratings	Dutch	Dutch	https://doi.org/10.3758/s13428-023-02239-6	Speed2024	This list includes the arousal, valence, and discrete emotion ratings (happiness, anger, fear, disgust, and sadness) for 24,000 Dutch words, which were rated on a 5-point scale. The study contains additional variables, such as word length, frequency, age of acquisition, concreteness, imageability, and lexical decision data, which come from other studies and were therefore not included in the data set here.
Winter-2024-Iconicity	Winter, Bodo and Lupyan, Gary and Perry, Lynn K. and Dingemanse, Mark and Perlman, Markus	2024	ratings	English	English	https://doi.org/10.3758/s13428-023-02112-6	Winter2024	This list includes the iconicity ratings of 14,000+ English words. Items were rated on a 7-point scale (1 = not iconic at all, 7 = very iconic). Further, playfulness, sensory modality, structural markedness and age of acquisition were compared in the study.
Vankrunkelsven-2024-SemanticGender	Vankrunkelsven, Hendrik and Yang, Yang and Brysbaert, Marc and De Deyne, Simon and Storms, Gert	2024	ratings	Dutch	Dutch	https://doi.org/10.3758/s13428-022-02032-x	Vankrunkelsven2024	This list includes semantic gender ratings for 24,000 Dutch words. Items were rated on a 5-point scale (1 = very feminine, 2 = rather feminine, 3 = neutral, 4 = rather masculine, 5 = very masculine). The study contains additional variables, such as concreteness, age of acquisition, valence, arousal and dominance, which come from other studies and were therefore not included in the data set here.
Pexman-2019-Sensorimotor	Pexman, Penny M. and Muraki, Emiko and Sidhu, David M. and Siakaluk, Paul D. and Yap, Melvin J.	2019	ratings	English	English	https://doi.org/10.3758/s13428-018-1171-z	Pexman2019	This list includes ratings of body-object interaction (BOI), i.e., the extent to which the word refers to an object or thing a human body can easily interact with, for 9,000+ English words. Items were rated on a 7-point scale (1 = low body–object interaction, 2-6 = intermediate body–object interaction, 7 = high body–object interaction).
Schmidtke-2014-AffectiveRatings	Schmidtke, David S. and Schröder, Tobias and Jacobs, Arthur M. and Conrad, Markus	2014	ratings	German	German	https://doi.org/10.3758/s13428-013-0426-y	Schmidtke2014	This list contains ratings of valence, arousal, imageability, dominance and potency for German words. The study includes terms from the Affective Norms for English Words (ANEW) list ([Bradley and Lang 1999](:bib:Bradley1999)) as well as the Berlin Affective Word List (BAWL)  ([Vo et al. 2009](:bib:Vo2009)). Valence and imageability were rated on 7-point scales while potency, dominance and arousal for words in the ANEW list were rated on a 9-point scale. Arousal for words in the BAWL was rated on a 5-point scale. The study contains additional variables, such as data on word frequency, grammatical class, number of letters, number of syllables, and number of orthographic neighbors, which come from other studies and were therefore not included in the data set here.	ANGST
Guasch-2016-AffectiveRatings	Guasch, Marc and Ferré, Pilar and Fraga, Isabel	2016	ratings	Spanish	Spanish	https://doi.org/10.3758/s13428-015-0684-y	Guasch2016	This list contains ratings of valence, arousal, concreteness, imageability, context availability, and familiarity of 1,400 Spanish words. Ratings for valence and arousal were given on a 9-point scale. Concreteness, imageability, availability and familiarity were ranked on a 7-point scale. The participants were undergraduate students who were fluent in Spanish.
Bonin-2018-Concreteness	Bonin, Patrick and Méot, Alain and Bugaiska, Aurélia	2018	ratings	French	French	https://doi.org/10.3758/s13428-018-1014-y	Bonin2018	This list contains ratings of concreteness, context availability, valence and arousal for 1,400+ French words. All ratings were given on a 5-point scale.
Coso-2023-Emotions	Ćoso, Bojana and Guasch, Marc and Bogunovic, Irena and Ferré, Pilar and Hinojosa, José A.	2023	ratings	English	Croatian	https://doi.org/10.3758/s13428-022-02003-2	Coso2023	This list contains ratings of five concrete emotions ("happiness", "anger", "sadness", "fear", "disgust") for 3000+ Croatian words. All ratings were given on a 5-point scale. While the original study was conducted with Croatian words, the present mapping is based on the English translations given for these items.	CROWD-5e
Repetto-2023-Sensorimotor	Repetto, Claudia and Rodella, Claudia and Conca, Francesca and Santi, Gaia Chiara and Catricalá, Eleonora	2023	ratings	English	Italian	https://doi.org/10.3758/s13428-022-02004-1	Repetto2023	This list contains ratings of valence, arousal, dominance, familiarity, imageability and concreteness on a 9-point scale. Further, five effectors (hand-arm, foot-leg, torso, mouth, head) as well as six perceptual modalities (touch, hearing, smell, taste, vision, and interoception) were rated on a 6-point scale. Exclusivity ratings for perception, action and overall sensorimotor values were given ranging from 0-1, to be interpreted as percentages.
Syssau-2009-Valence	Syssau, Arielle and Monnier, Catherine	2009	ratings	English, French	French	https://doi.org/10.3758/BRM.41.1.213	Syssau2009	This list contains ratings of valence given by children age five, age seven and age nine for 600 French items. The data comprise the percentage distribution of children within each respective age group who provided ratings on a scale ranging from negative to neutral to positive, with each age group being treated as making up 100%. It is additionally divided by gender (male, female). The present list further contains ratings of the same items given by adults in [Syssau and Font (2005)](:bib:Syssau2005). The items were chosen based on the emotional databases by [Bonin, Méot et al. 2003](:bib:Bonin2003) and [Syssau and Font (2005)](:bib:Syssau2005), licensed as Emotional Source in the dataset. While these original studies were conducted using pictures, the present study used only the corresponding words, either spoken (age group five) or written (age groups seven and nine).
Dingemanse-2020-IconicityHumor	Dingemanse, Mark and Thompson, Bill	2020	ratings	English	English	https://doi.org/10.1017/langcog.2019.49	Dingemanse2020	This list contains ratings of iconicity and humor, given both by human participants and as computational ratings. We included only the computational ratings and the iconicity ratings from [Perry et al. (2018)](:bib:Perry2018) since the humor ratings from [Engelthaler and Hills (2018)](:bib:Engelthaler2018) were added as a separate dataset. The human ratings were compared to the ratings given by the computational model developed for the present study, which was trained on large natural language corpora. The study also includes ratings of aversion and taboos, which were not included here.
Yi-2025-AffectiveRatings	Yi, Wei and Xu, Haitao and Man, Kaiwen	2025	ratings	Chinese	Chinese	https://doi.org/10.3758/s13428-024-02580-4	Yi2025	This list contains ratings of valence, arousal and perceptual experiences for Chinese words translated from [Warriner et al. 2013](:bib:Warriner2013). Further, ratings of familiarity ([Su et al. 2023a](:bib:Su2023a)) and imageability ([Su et al. 2023b](:bib:Su2023b)) were included here since this data is only available in PDF format elsewhere or the file was corrupted.
Amenta-2025-LexicalRecognition	Amenta, Simona and de Varda, Andrea Gregor and Mandera, Pawel and Keuleers, Emmanuel and Brysbaert, Marc and Marelli, Marco	2025	norms	Italian	Italian	https://doi.org/10.3758/s13428-024-02548-4	Amenta2025	This list contains lexical recognition times for 130,495 Italian words from the Italian Crowdsourcing Project (ICP). Words were originally selected from SUBTLEX-IT ([Crepaldi et al. 2015](:bib:Crepaldi2015)), enriched with inflected forms, rare/morphologically complex words, dictionary entries, and some modern neologisms, then cleaned to remove names, punctuation, and offensive content. Ratings were given by Italian native speakers. Unlike in classic lexical decision tasks, participants were not prompted to react as quickly as possible but rather to indicate whether they recognized a given word in their own time.	ICP
Redondo-2007-AffectiveRatings	Redondo, Jaime and Fraga, Isabel and Padrón, Isabel and Comesaña, Montserrat	2007	ratings	English	Spanish	https://doi.org/10.3758/BF03193031	Redondo2007	This list contains ratings of valence, arousal and dominance of Spanish words. The data is presented as the mean of all participants' responses, female participants' and male participants' responses. The dataset is a translation of the ANEW (Affective Norms for English Words) list ([Bradley and Lang 1999](:bib:Bradley1999)). Additional data from the Spanish lexical database LEXESP by [Sebastian-Galles et al. (2000)](:bib:Sebastian2000) in this dataset include concreteness and imageability ratings. Two additional LEXESP variables, namely objective frequency and familiarity, can be found in a separate dataset (see [Desrochers et al. (2010)](:bib:Desrochers2010)).
Ljubesic-2020-Emotions	Ljubešić, Nikola and Markov, Ilia and Fišer, Darja and Daelemans, Walter	2020	ratings	English	Dutch, Croatian, Slovene	https://aclanthology.org/2020.peoples-1.15/	Ljubesic2020	This list contains ratings for associations of discrete emotions (anger, anticipation, disgust, fear, joy, sadness, surprise and trust) as well as their positive/negative valence. The original ratings were conducted in English (Mohammad & Turney [2010](:bib:Mohammad2010)). This study also provided translations for Dutch, Croatian, and Slovene which were generated with automated translation tools. The present study conducted manual translations for each Slovene and Croatian word and for each Dutch word that has an association rating in at least one column. Translators were asked to take the sentiment and emotion labels that were already associated with the words into account. It appears that the ratings were taken from the original study in English (Mohammad & Turney [2010](:bib:Mohammad2010)) and the novelty of the dataset lies in the updated translations. However, the authors do not make it clear how the dataset was created, so caution is advised.	LiLaH
Martinez-2025-AffectiveRatings	Martínez, Gonzalo and Molero, Juan Diego and González, Sandra and Conde, Javier and Brysbaert, Marc and Reviriego, Pedro	2025	ratings	English	English	https://doi.org/10.3758/s13428-024-02515-z	Martinez2025	This list contains ratings of concreteness, valence and arousal for English items by GPT-4o. The list of words that were rated was sourced from [Warriner et al. 2013](:bib:Warriner2013) and extended. Concreteness was rated on a 5-point scale while valence and arousal were rated on 9-point scales. Soundness of the LLM ratings was ensured by collecting real human ratings and comparing them to the answers given by GPT-4o.
Dunabeitia-2022-MultiPic	Duñabeitia, Jon Andoni and Baciero, Ana and Antoniou, Kyriakos and Antoniou, Mark and Ataman, Esra and Baus, Cristina and Ben-Shachar, Michal and Çağlar, Ozan Can and Chromý, Jan and Comesaña, Montserrat and Filip, Maroš and Filipović Đurđević, Dušica and Gillon Dowens, Margaret and Hatzidaki, Anna and Januška, Jiří and Jusoh, Zuraini and Kanj, Rama and Kim, Say Young and Kırkıcı, Bilal and Leminen, Alina and Lohndal, Terje and Yap, Ngee Thai and Renvall, Hanna and Rothman, Jason and Royle, Phaedra and Santesteban, Mikel and Sevilla, Yamila and Slioussar, Natalia and Vaughan-Evans, Awel and Wodniecka, Zofia and Wulff, Stefanie and Pliatsikas, Christos	2022	ratings	English	American English, Australian English, Basque, Western Flemish, British English, Catalan, Cypriot Greek, Czech, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Korean, Lebanese Arabic, Malay, Malaysian English, Mandarin Chinese, Dutch, Norwegian, Polish, Portuguese, Quebec French, Rioplatense Spanish, Russian, Serbian, Slovak, Spanish, Turkish, Welsh	https://figshare.com/articles/dataset/Untitled_Item/19328939	Dunabeitia2022	This is the 500 item version of the MultiPic dataset, now with 32 languages and varieties (American English, Australian English, Basque, Western Flemish, British English, Catalan, Cypriot Greek, Czech, Finnish, French, German, Greek, Hebrew, Hungarian, Italian, Korean, Lebanese Arabic, Malay, Malaysian English, Mandarin Chinese, Dutch, Norwegian, Polish, Portuguese, Quebec French, Rioplatense Spanish, Russian, Serbian, Slovak, Spanish, Turkish, Welsh) in total and with familiarity ratings for 500 chosen picture stimuli of the original 750. The pictures were standardized for name agreement and visual complexity as well.
Cameirao-2010-AoA	Cameirão, Manuela L. and Vicente, Selene G.	2010	ratings	Portuguese	Portuguese	https://doi.org/10.3758/BRM.42.2.474	Cameirao2010	This list contains age of acquisition ratings for 1,749 Portuguese words. Participants were asked to rate the age at which they thought they had learned a given word on a 9-point scale ranging from 2 years old to +13 years old. The dataset also includes ratings for familiarity ([Marques 2004](:bib:Marques2004)), concreteness and imageability ([Marques 2005](:bib:Marques2005)) which were added here. The words in the original dataset are capitalized, resulting in the format in the present list.
SanMiguelAbella-2020-MotorContent	San Miguel Abella, Romina A. and González-Nosti, María	2020	ratings	Spanish	Spanish	https://doi.org/10.3758/s13428-019-01241-1	SanMiguelAbella2020	This list contains motor content ratings, i.e. the amount of mobility an action entails, for 4,565 Spanish verbs given on a 7-point scale. The dataset further includes age of acquisition data from [Alonso et al. 2016](:bib:Alonso2016) which is already included in NoRaRe and not duplicated here. Additionally, the original dataset includes data on frequency, familiarity, imageability and concreteness, sourced from the EsPal database ([Duchon et al. 2013](:bib:Duchon2013)), which were not included here. The verbs in the original dataset are capitalized, resulting in the format in the present list.
Grandy-2020-EmotionImageability	Grandy, Thomas H. and Lindenberger, Ulman and Schmiedek, Florian	2020	ratings	German	German	https://doi.org/10.3758/s13428-019-01294-2	Grandy2020	This list contains ratings of emotionality and imageability of about 2500 German nouns given by younger (21-31 years old) and older (70-86 years old) adults. Ratings for imageability were given on a 100-point sliding scale while ratings for emotionality were given on a 200-point sliding scale. Participants used their mouse cursor to select or slide to a point on the scale; the exact numeric value of their selection was displayed for reference.
Winter-2022-SemanticChange	Winter, Bodo and Srinivasan, Mahesh	2022	relations	English	Global	https://doi.org/10.1080/10926488.2021.1945419	Winter2022	This network contains asymmetrical, cross-linguistically attested semantic changes between semantically related concepts. It is based on the study by [Urban 2011](:bib:Urban2011).
Zalizniak-2024-DatSemShift	Anna Zalizniak and Anna Smirnitskaya and Maksim Russo (Rousseau) and Ilya Gruntov and Timur Maisak and Dmitry Ganenkov and Maria Bulakh and Maria Orlova and Marina Bobrik-Fremke and Oksana Dereza and Tatiana Mikhailova and Maria Bibaeva and Mikhail Voronov	2024	relations	English	Global		Zalizniak2024	This network is based on the Database of Semantic Shifts (DatSemShift), retrieved on February 5, 2024, and converted to CLDF. It captures documented semantic shifts across the world’s languages, linking related concepts where semantic change is attested. Duplicates in the source data are marked with an asterisk and excluded from Concepticon linking and network construction.	DatSemShift
List-2023-Colexifications	List, Johann-Mattis	2023	relations	English	Global	https://doi.org/10.3389/fpsyg.2023.1156540	List2023	This network contains data on partial colexification patterns across multiple languages. The study offers three kinds of colexification graphs, two undirected ones, stored in the column LINKED_CONCEPTS, and one directed graph, stored in the column TARGET_CONCEPTS.
Rubehn-2025-ConceptEmbeddings	Rubehn, Arne and List, Johann-Mattis	2025	relations	English	Global	https://aclanthology.org/2025.acl-long.1004/	Rubehn2025	This list contains concept embeddings inferred from cross-lingual (partial) colexification networks.
Xu-2020-Concreteness	Xu, Xu and Li, Jiayin	2020	ratings	Chinese	Chinese	https://doi.org/10.1371/journal.pone.0232133	Xu2020	This list contains ratings of concreteness/abstractness on a continuum for 9,877 two-character Chinese words. 1,140 participants completed the survey. The list was compiled from the MEgastudy of Lexical Decision in Simplified CHinese (MELD-SCH, [Tsang et al. 2018](:bib:Tsang2018)).
Brysbaert-2014-AoA	Brysbaert, Marc and Stevens, Michaël and De Deyne, Simon and Voorspoels, Wouter and Storms, Gert	2014	ratings	Dutch	Dutch	https://doi.org/10.1016/j.actpsy.2014.04.010	Brysbaert2014b	This list contains ratings of age of acquisition (AoA) for 30,000 Dutch words given by 74 students and scientific collaborators of the University of Ghent. Items were taken from [Moors et al. (2013)](:bib:Moors2013).
Brysbaert-2014b-Concreteness	Brysbaert, Marc and Stevens, Michaël and De Deyne, Simon and Voorspoels, Wouter and Storms, Gert	2014	ratings	Dutch	Dutch	https://doi.org/10.1016/j.actpsy.2014.04.011	Brysbaert2014b	This list contains ratings of concreteness for 30,000 Dutch words given by 75 students and scientific collaborators of the University of Leuven. Items were taken from [Moors et al. (2013)](:bib:Moors2013).
Xu-2021-AoA	Xu, Xu and Li, Jiayin and Guo, Shulun	2021	ratings	Chinese	Chinese	https://doi.org/10.3758/s13428-020-01455-8	Xu2021	This list contains ratings of age of acquisition (AoA) for 19,716 simplified Chinese words given by 1765 native speakers of Mandarin Chinese.
Soares-2012-AffectiveRatings	Soares, Ana Paula and Comesaña, Montserrat and Pinheiro, Ana P and Simões, Alberto and Frade, Carla Sofia	2012	ratings	Portuguese	Portuguese	https://doi.org/10.3758/s13428-011-0131-7	Soares2012	This list contains ratings of valence, arousal and dominance for 1,034 Portuguese words. These words were selected from the Affective Norms for English Words (ANEW) database [(Bradley & Lang, 1999)](:bib:Bradley1999) and compared to the Spanish translation used for a similar study conducted by [Redondo et al. (2007)](:bib:Redondo2007). The Portuguese translation used here was therefore heavily influenced by the Spanish one used previously. The original dataset provides ratings separately for all, male, and female participants. The present mappings include only the ratings for all participants.
Chedid-2019-AuditoryStrength	Chedid, Georges and Brambati, Simona Maria and Bedetti, Christophe and Rey, Amandine E. and Wilson, Maximilliano A. and Vallet, Guillaume T.	2019	ratings	French	French	https://doi.org/10.3758/s13428-019-01254-w	Chedid2019b	This list contains ratings of auditory perceptual strength for 3,596 French Canadian words. The same study also collected ratings for visual perceptual strength, which have been added here in a separate dataset. It is a companion study to [Chedid et al. (2019a)](:bib:Chedid2019a).
Chedid-2019-VisualStrength	Chedid, Georges and Brambati, Simona Maria and Bedetti, Christophe and Rey, Amandine E. and Wilson, Maximilliano A. and Vallet, Guillaume T.	2019	ratings	French	French	https://doi.org/10.3758/s13428-019-01254-w	Chedid2019b	This list contains ratings of visual perceptual strength for 3,596 French Canadian words. The same study also collected ratings for auditory perceptual strength, which have been added here in a separate dataset. It is a companion study to [Chedid et al. (2019a)](:bib:Chedid2019a).
Green-2025a-AoA	Green, Clarence and Kong, Anthony and Brysbaert, Marc and Keogh, Kathleen	2025	ratings	English	English	https://doi.org/10.3758/s13428-025-02843-8	Green2025	This list contains age of acquisition ratings for English words that received a score of 10 years or lower in [Kuperman et al. (2012)](:bib:Kuperman2012). Three studies were conducted: Study 1 crowdsourced AoA ratings for print, i.e., written and read words, extending the study by [Kuperman et al. (2012)](:bib:Kuperman2012). Study 2 tested the extent to which the results obtained in Study 1 are replicable by an untrained LLM (GPT-4o). Study 3 extended the LLM method applied in Study 2. It was used to fine-tune the LLM with regard to the human ratings and get refined ratings. The original study by [Green et al. (2025)](:bib:Green2025) further includes ratings of the full [Kuperman et al. (2012)](:bib:Kuperman2012) list by a trained LLM and ratings of the English Crowdsourcing Project [(Mandera et al. 2020)](:bib:Mandera2020) by both a trained and an untrained LLM. These datasets can be found in the dataset folders Green-2025b-AoA and Green-2025c-AoA, respectively.
Green-2025b-AoA	Green, Clarence and Kong, Anthony and Brysbaert, Marc and Keogh, Kathleen	2025	ratings	English	English	https://doi.org/10.3758/s13428-025-02843-8	Green2025	This list contains age of acquisition (AoA) ratings for English words given by a trained LLM (GPT-4o). The list of words was taken from [Kuperman et al. (2012)](:bib:Kuperman2012) with the goal to replicate their original ratings in the LLM. Further, AoA ratings for print, i.e., written and read words, were given by the LLM. This dataset also includes the original ratings given in [Kuperman et al. (2012)](:bib:Kuperman2012). The original study by [Green et al. (2025)](:bib:Green2025) further includes crowdsourced AoA ratings as well as untrained LLM scores for print for words which received a score below 10 years old in [Kuperman et al. (2012)](:bib:Kuperman2012). In addition, ratings of the English Crowdsourcing Project [(Mandera et al. 2020)](:bib:Mandera2020) by both a trained and an untrained LLM are included in the original study. These datasets can be found in the dataset folders Green-2025a-AoA and Green-2025c-AoA, respectively.
Green-2025c-AoA	Green, Clarence and Kong, Anthony and Brysbaert, Marc and Keogh, Kathleen	2025	ratings	English	English	https://doi.org/10.3758/s13428-025-02843-8	Green2025	This list contains age of acquisition (AoA) ratings for English words given by an untrained as well as a trained LLM (GPT-4o). The words rated here were all words included in the English Crowdsourcing Project (ECP) [(Mandera et al. 2020)](:bib:Mandera2020) that did not obtain an AoA rating in [Kuperman et al. (2012)](:bib:Kuperman2012). The original study by [Green et al. (2025)](:bib:Green2025) further includes crowdsourced AoA ratings as well as untrained LLM scores for print for words which received a score below 10 years of age in [Kuperman et al. (2012)](:bib:Kuperman2012). In addition, ratings of the full [Kuperman et al. (2012)](:bib:Kuperman2012) list by a trained LLM, as well as the original ratings given in [Kuperman et al. (2012)](:bib:Kuperman2012) were included. These datasets can be found in the dataset folders Green-2025a-AoA and Green-2025b-AoA, respectively.
Brysbaert-2025-Familiarity	Brysbaert, Marc and Martínez, Gonzalo and Reviriego, Pedro	2025	ratings	English	English	https://doi.org/10.3758/s13428-024-02561-7	Brysbaert2025	This dataset contains familiarity ratings given by GPT-4o as well as Multilex [(van Paridon & Thompson 2021](:bib:vanParidon); [Gimenes & New 2016)](:bib:Gimenes2016) frequencies for single words. Familiarity ratings were given on a 1–7 scale (1 = very unfamiliar, 7 = very familiar). The Multilex variable combines word frequencies from subtitles, Twitter, blogs, and news sources.
Iaroshenko-2025-EmoLex	Iaroshenko, Polina V. and Loukachevitch, Natalia V.	2025	relations	Russian	Russian	https://doi.org/10.22363/2687-0088-44439	Iaroshenko2025	The Russian Emotion Lexicon (RusEmoLex) contains 1,274 words that appear in at least two sources from a larger original list of 7,937 candidate words. Words are categorized by semantic class: Радость ('joy'), Грусть ('sadness'), Страх ('fear'), Злость ('anger'), and Удивление ('surprise'). The original sources included dictionaries, corpora, and emotion-word association surveys, with some words appearing in up to seven different sources. Only the Class variable is included here.	RusEmoLex
Binder-2016-AffectiveRatings	Binder, Jeffrey R. and Conant, Lisa L. and Humphries, Colin J. and Fernandino, Leonardo and Simons, Stephen B. and Aguilar, Mario and Desai, Rutvik H.	2016	ratings	English	English	https://doi.org/10.1080/02643294.2016.1147426	Binder2016	This dataset provides sensorimotor, cognitive, emotional, spatial, and impact ratings for English words. Different instructions were given for nouns, verbs, and adjectives within each variable. Some queries were intentionally nonsensical for certain words (e.g., asking about events for a static object). Note that in these cases “Not Applicable” responses were converted to 0. Ratings were given on a 6-point scale (0 = not at all, 3 = somewhat, 6 = very much). Semantic and ontological annotations are included. Frequencies were taken from [Shaoul & Westbury (2013)](:bib:Shaoul2013). Imageability ratings were compiled from different sources [(Bird et al. 2001](:bib:Bird2001); [Clark & Paivio 2004](:bib:Clark2004); [Cortese & Khanna 2008](:bib:Cortese2008); [Wilson 1988)](:bib:Wilson1988) for the original dataset and are also included here.
Bird-2001-ImageabilityAoA	Bird, Helen and Franklin, Sue and Howard, David	2001	ratings	English	English	https://doi.org/10.3758/BF03195349	Bird2001	This list contains ratings of age of acquisition (AoA) as well as imageability as given by 78 participants. Further, AoA and imageability ratings from the MRC Psycholinguistic Database [(Coltheart 1981)](:bib:Coltheart1981) and logarithmic frequency measures from the CELEX database [(Baayen et al. 1996)](:bib:Baayer1996) were included.
Ravelli-2025-Specificity	Ravelli, Andrea Amelio and Bolognesi, Marianna Marcella and Caselli, Tommaso	2025	ratings	English	English	https://doi.org/10.1007/s10339-024-01239-4	Ravelli2025	This list contains specificity ratings for English words, provided by native speakers. The list used for rating was compiled from the ANEW dataset ([Bradley and Lang 1999](:bib:Bradley1999)).
Su-2023-AffectiveRatings	Su, I-Fan and Yum, Yen Na and Lau, Dustin Kai-Yan	2023	ratings	Chinese	Chinese	https://doi.org/10.3758/s13428-022-01928-y	Su2023	This list contains ratings of age of acquisition (AoA), imageability, familiarity and concreteness for 4376 traditional Chinese characters given by 20 native speakers of Cantonese. Further, logarithmic frequency and number of strokes per character were provided. The original list also contains semantic radical transparency ratings.
Dimitropoulou-2010-Frequency	Dimitropoulou, Maria and Duñabeitia, Jon Andoni and Avilés, Alberto and Corral, José and Carreiras, Manuel	2010	norms	Greek	Greek	https://doi.org/10.3389/fpsyg.2010.00218	Dimitropoulou2010	This list includes word frequencies based on subtitles from 5,508 television series and films in Greek.	SUBTLEX-GR
Kiritchenko-2017a-Valence	Kiritchenko, Svetlana and Mohammad, Saif	2017	ratings	English	English	https://doi.org/10.18653/v1/P17-2074	Kiritchenko2017	This list contains valence ratings given on a best-worst scale (BWS). Participants were asked to select the most positive and the most negative word out of a 4-tuple. Using the counting procedure, each term’s score was calculated as the percentage of times it was chosen as most positive minus the percentage of times the term was chosen as most negative. The scores range from −1 (most negative) to 1 (most positive). Using the same words, a similar experiment was also conducted with a 9-point rating scale. Those data are provided in a separate dataset, see Kiritchenko-2017b-Valence.
Kiritchenko-2017b-Valence	Kiritchenko, Svetlana and Mohammad, Saif	2017	ratings	English	English	https://doi.org/10.18653/v1/P17-2074	Kiritchenko2017	This list contains valence ratings given on a 9-point scale (-4 = extremely negative, 4 = extremely positive). The scores were then converted and range from 1 to 9 in the present data. Using the same words, a similar experiment was also conducted with a best-worst scale (BWS). Those data are provided in a separate dataset, see Kiritchenko-2017a-Valence.
Wang-2022-AffectiveRatings	Wang, Shaonan and Zhang, Yunhao and Zhang, Xiaohan and Sun, Jingyuan and Lin, Nan and Zhang, Jiajun and Zong, Chengqing	2022	ratings	Chinese	Chinese	https://doi.org/10.18112/openneuro.ds004301.v1.0.0	Wang2022	This dataset provides ratings for Chinese nouns, verbs and adjectives across perceptual, sensorimotor, spatial, temporal, causal, social, cognitive and affective domains given by 30 participants. Different instructions were given for nouns, verbs, and adjectives within each variable, so ratings reflect part-of-speech-specific interpretations. Some variables were queried only for certain word classes. Ratings were given on a 7-point scale (0 = not at all, 3 = somewhat, 7 = very much). The dataset includes mean ratings for each variable. The list is based on [Binder et al. (2016)](:bib:Binder2016), though 13 variables from the original English study were omitted in the Chinese version: motion, biomotion, shape, texture, audition, low, high, speech, time, social, harm, pleasant and unpleasant.
MartinezTomas-2026-AffectiveRatings	Martínez-Tomás, Celia and Guasch, Marc and Ferré, Pilar and Lázaro, Miguel and Hinojosa, José Antonio	2026	ratings	Spanish	Spanish	https://doi.org/10.3758/s13428-026-02976-4	MartinezTomas2026	This list contains ratings of valence and arousal for 1,200 Spanish words. The original study also included 4,800 pseudowords which were rated for wordlikeliness in addition to valence and arousal. Further, valence and arousal ratings for the real words were listed from multiple sources [(Ferré et al. 2012](:bib:Ferre2012), [Guasch et al. 2016](:bib:Guasch2016), [Hinojosa et al. 2016](:bib:Hinojosa2016), [Stadthagen-Gonzalez et al. 2017)](:bib:StadthagenGonzalez2017), which were not included here but can be found in separate lists.