Jose Camacho Collados (Cardiff University)
Mohammad Taher Pilehvar (University of Cambridge)
Abstract: Embeddings were the dominant buzzword of the 2010s in Natural Language Processing (NLP). Representing knowledge as a low-dimensional vector that is easily integrated into modern machine learning algorithms has played a central role in the development of the field. Embedding techniques initially focused on words, but attention soon shifted to other forms: from graph structures, such as knowledge bases, to other types of textual content, such as sentences and documents.
This course will provide a high-level synthesis of the main embedding techniques in NLP, in the broad sense. We will start with conventional word embeddings (e.g. Word2Vec and GloVe) and then move on to other types of embeddings, such as sense-specific and graph-based alternatives. We will conclude with an overview of recent, successful contextualized representations (e.g. ELMo, BERT) and explain their potential in NLP.