Transformers: Applications in Language and Communication

This website accompanies the Utrecht University Applied Data Science Master course Transformers: Applications in Language and Communication.

This course is an introduction to Transformers, the architecture proposed in [VSP+17]. With the release of ChatGPT in November 2022, the Transformer, the T in GPT, firmly took its position as the number one workhorse in AI. It changed the field of natural language processing (the field of this course’s lecturers) overnight, and its revolutionary impact on industry may prove huge and lasting. Yet the workhorse at its core can only do one thing: predict the next word. How can a next-word predictor do well on school tests? How is it able to reason? What are the problematic aspects of this technology, and what are the current developments?

The Applied Data Science Master program is a one-year Master program. The live course is held yearly from February to April. These webpages reflect the course content of the 2025 edition.

The content was created by Antal van den Bosch, Lisa Bylinina, Yingjin Song, and Yupei Du. Guest lectures were given by Lukas Edman, Fabian Ferrari, and Jakub Zavrel (Zeta Alpha). Tijmen Baarda and Arjan Mossel were the teaching assistants. We have made an effort to quote and credit the work of others properly; please notify Antal van den Bosch if the current material needs a correction or a credit.

Some of the Jupyter notebooks accompanying the course (optimized to run on Google Colab) are based on the notebooks that accompany the book Natural Language Processing with Transformers by Lewis Tunstall, Leandro von Werra, and Thomas Wolf. Other sources of inspiration are credited in the respective lecture slides and Jupyter notebooks.

The images containing imaginary Transformer bots in various lab or outside settings were created in Midjourney, with prompts that always included “1950s science fiction book cover style”.

The materials on this website are CC-BY-4.0 licensed.