Analyzing the Cognitive Plausibility of Deep Language Models

LaCo, Advanced, week 1, each day

Lisa Beinborn (Vrije Universiteit Amsterdam)
Nora Hollenstein (University of Copenhagen)
Willem Zuidema (University of Amsterdam)


Abstract: Computational models of language serve as an increasingly popular tool to examine research hypotheses related to language processing. Representing language in a computationally processable way forces us to operationalize our underlying assumptions in a concrete, implementable and falsifiable manner.
In the last decade, distributional representations that encode words, phrases, sentences, and even full stories as high-dimensional vectors in a semantic space have become very popular. They are obtained by training language models on large corpora to optimally encode contextual information. Whereas the most widely known model, word2vec, only provides static representations for isolated words (Mikolov, Sutskever, Chen, Corrado, & Dean, 2013), more recent models represent words in context. In this tutorial, we introduce participants to the workings of state-of-the-art contextualized language models and discuss methods for evaluating their cognitive plausibility. We explore a range of evaluation scenarios using cognitive data (e.g., eye tracking, EEG, and fMRI) and provide practical examples. In a second step, we aim to open up the black box and introduce methods for analyzing the hidden representations of a computational model (e.g., diagnostic classifiers, representational similarity analysis). Participants can explore practical coding examples to analyze and visualize the similarity between computational and human representations of language.
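
To give a flavour of the representational similarity analysis mentioned above, the following is a minimal sketch and not the tutorial's own material. It assumes that model and human responses are available as matrices with one row per stimulus; the function names (`rdm`, `rsa_score`) and the randomly generated data are purely illustrative. Each matrix is turned into a representational dissimilarity matrix (1 minus the Pearson correlation between stimuli), and the two dissimilarity structures are then compared with a rank correlation.

```python
import numpy as np


def rdm(representations):
    """Dissimilarity matrix: 1 - Pearson correlation between stimulus rows."""
    return 1.0 - np.corrcoef(representations)


def upper_triangle(matrix):
    """Off-diagonal upper-triangle entries (each stimulus pair once)."""
    i, j = np.triu_indices(matrix.shape[0], k=1)
    return matrix[i, j]


def rank(values):
    """Ranks of the values; assumes no ties (fine for continuous data)."""
    return np.argsort(np.argsort(values)).astype(float)


def rsa_score(model_reps, human_reps):
    """Spearman-style correlation between the two dissimilarity structures."""
    a = rank(upper_triangle(rdm(model_reps)))
    b = rank(upper_triangle(rdm(human_reps)))
    return float(np.corrcoef(a, b)[0, 1])


# Hypothetical data: 20 stimuli, 50 model dimensions, 30 "brain" dimensions.
rng = np.random.default_rng(0)
model_reps = rng.normal(size=(20, 50))
related = model_reps @ rng.normal(size=(50, 30))   # derived from the model
unrelated = rng.normal(size=(20, 30))              # independent noise

print("related:  ", rsa_score(model_reps, related))
print("unrelated:", rsa_score(model_reps, unrelated))
```

Because the comparison happens at the level of stimulus-by-stimulus dissimilarities, the two representations do not need to have the same dimensionality, which is what makes this kind of analysis convenient for comparing model activations with cognitive signals.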