UvA Foundation Models Course


A course in the MSc in Artificial Intelligence at the University of Amsterdam.


About


Foundation models are a revolutionary class of AI models that provide impressive abilities to generate content (text, images, sound, video, protein structures, and more) from interactive prompts in a seemingly creative manner. These foundation models are often autoregressive, self-supervised, multimodal, transformer-based models that are pre-trained on large volumes of data, typically collected from the web. They already form the basis of all state-of-the-art systems in computer vision and natural language processing across a wide range of tasks and have shown impressive few-shot learning abilities. The perceived intelligence and adaptability of models like ChatGPT, Stable Diffusion, Flamingo, and GPT-4 impress, but their propensity to produce inaccurate, misleading, or false information (and to present it confidently and convincingly) makes them unsuitable for any task of importance and poses serious societal concerns.

In this course we consider fundamental algorithmic questions concerning the multimodal learning and reasoning abilities of foundation models, as well as the ethical, legal, and social aspects of their deployment in society. Topics include, but are not limited to: the theory of foundation models; how foundation models appear to give rise to emergent behavior that was not observed during training; how the hallucination problem (the term for their lack of factual grounding) arises and whether it can be mitigated by learning to retrieve and reference information from trusted sources; how to ensure foundation model objectives are aligned with their operators' intentions; and how a more decentralized, compositional, and incremental approach to training and serving foundation models might reduce their data and compute hunger.

The course is research-focused, with lectures on state-of-the-art advancements, reading and presenting recent papers, and, as the final deliverable, a team research project in which students study a foundation-model topic of their own choosing, culminating in a research paper submitted to an academic venue.

This course is taught in the MSc program in Artificial Intelligence at the University of Amsterdam by Professor Cees Snoek and Assistant Professor Yuki Asano. The teaching assistants are Aritra Bhowmik, Mohammad Mahdi Derakhshani, Michael Dorkenwald, Yingjun Du, Danilo de Goede, Leon Lang, Phillip Lippe, Ivona Najdenkoska, Kien Nguyen, Mohammadreza Salehi, and Zehao Xiao.




Lectures


Week 1

Lecturer: Cees Snoek
Date: April 2, 2024

This lecture introduces the course and reviews the basics behind Transformers and their application to NLP and vision.
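
To make the lecture's starting point concrete, here is a minimal sketch of single-head scaled dot-product self-attention, the core operation behind Transformers. This is an illustrative PyTorch snippet, not course material; all names and dimensions are made up:

```python
import math
import torch

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention (illustrative sketch).

    x:   (batch, seq_len, d_model) token embeddings
    w_*: (d_model, d_head) projection matrices
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))  # query-key similarities
    weights = torch.softmax(scores, dim=-1)                   # attention distribution per query
    return weights @ v                                        # weighted sum of values

# Toy usage: 2 sequences of 5 tokens, 16-dim embeddings, 8-dim head.
x = torch.randn(2, 5, 16)
w_q, w_k, w_v = (torch.randn(16, 8) for _ in range(3))
print(self_attention(x, w_q, w_k, w_v).shape)  # torch.Size([2, 5, 8])
```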

Documents:

Lecture recordings:

No recordings.

Lecturer: Cees Snoek
Date: April 5, 2024

This lecture discusses pre-training methods for language (GPT, LLaMA) and vision (DINO, MAE, SAM), as well as scaling laws.
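
As a taste of what a scaling law looks like, the sketch below evaluates the parametric loss L(N, D) = E + A/N^alpha + B/D^beta fitted by Hoffmann et al. (2022) for compute-optimal ("Chinchilla") training. The coefficients are the published fits, but treat the exact numbers as illustrative rather than authoritative:

```python
def chinchilla_loss(n_params, n_tokens,
                    E=1.69, A=406.4, B=410.7, alpha=0.34, beta=0.28):
    """Parametric scaling law L(N, D) = E + A/N^alpha + B/D^beta.

    Coefficients as fitted by Hoffmann et al. (2022); illustrative only.
    """
    return E + A / n_params**alpha + B / n_tokens**beta

# Predicted loss for a 70B-parameter model trained on 1.4T tokens
# (roughly Chinchilla's own setting):
print(round(chinchilla_loss(70e9, 1.4e12), 3))
```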

Documents:

Lecture recordings:

No recordings.

Week 2

Lecturer: Phillip Lippe
Date: April 9, 2024

This lecture gives a deep dive into the ML engineering behind scaling models on single- and multi-device setups. It discusses per-device optimizations such as compilation, kernel optimization (e.g. FlashAttention), gradient checkpointing and accumulation, and profiling. The second part covers the basics behind data parallelism, pipeline parallelism, and tensor parallelism.
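
As one small example of the per-device tricks discussed here, the sketch below shows gradient accumulation in PyTorch: several micro-batch backward passes accumulate into .grad before a single optimizer step, simulating a larger batch than fits in memory. The model, sizes, and hyperparameters are placeholders:

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10)                 # stand-in for a large model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()
accum_steps = 4                            # effective batch = 4 micro-batches

optimizer.zero_grad()
for step in range(16):
    x = torch.randn(8, 128)                # micro-batch of 8 examples
    y = torch.randint(0, 10, (8,))
    loss = loss_fn(model(x), y) / accum_steps  # scale so accumulated grads average
    loss.backward()                        # gradients add up in param.grad
    if (step + 1) % accum_steps == 0:
        optimizer.step()                   # one update per accum_steps micro-batches
        optimizer.zero_grad()
```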

Documents:

Lecture recordings:

No recordings.

Lecturer: Cees Snoek
Date: April 12, 2024

The most powerful foundation models known today are intrinsically multimodal, exploit a very long context, and exhibit remarkable abilities. Yet much of their underlying technology is closed. In this lecture we give an overview of large-scale multimodal pre-training methods from the last few years, which should provide hints about the architectures, model specifics, and datasets used by their modern counterparts. A recent trend is mixture-of-experts modelling; this will be covered in a separate lecture later in the course.
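
As one concrete instance of such a pre-training objective, the sketch below implements a CLIP-style symmetric contrastive loss over a batch of matched image-text embedding pairs. The encoders are replaced by random stand-ins and all dimensions are illustrative:

```python
import torch
import torch.nn.functional as F

def clip_contrastive_loss(img_emb, txt_emb, temperature=0.07):
    """Symmetric InfoNCE loss over matched image-text pairs (CLIP-style sketch)."""
    img = F.normalize(img_emb, dim=-1)            # unit-norm image embeddings
    txt = F.normalize(txt_emb, dim=-1)            # unit-norm text embeddings
    logits = img @ txt.t() / temperature          # (batch, batch) cosine similarities
    targets = torch.arange(len(img))              # pair i is the positive for row/column i
    return (F.cross_entropy(logits, targets) +
            F.cross_entropy(logits.t(), targets)) / 2

# Toy usage with stand-in encoder outputs (batch of 32, 512-dim embeddings):
loss = clip_contrastive_loss(torch.randn(32, 512), torch.randn(32, 512))
```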

Documents:

Lecture recordings:

No recordings.

Week 3

Lecturer: Cees Snoek
Date: April 16, 2024

This lecture covers adaptation and alignment techniques, including in-context learning, prompting, instruction tuning, and RLHF.
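
To make in-context learning tangible, here is a hedged sketch of how a few-shot prompt might be assembled before being sent to a language model. The task, examples, and formatting are made up for illustration, not a prescribed template:

```python
# Hypothetical few-shot sentiment task; the model infers the pattern
# from the in-context examples, with no weight updates.
examples = [
    ("The movie was a delight.", "positive"),
    ("I want my money back.", "negative"),
]

def build_prompt(query, shots):
    lines = ["Classify the sentiment of each review."]
    for text, label in shots:
        lines.append(f"Review: {text}\nSentiment: {label}")
    lines.append(f"Review: {query}\nSentiment:")  # model completes the label
    return "\n\n".join(lines)

print(build_prompt("A tedious, overlong mess.", examples))
```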

Documents:

Lecture recordings:

No recordings.

Lecturer: Yuki Asano
Date: April 19, 2024

We discuss methods for parameter-efficient fine-tuning of large foundation models.
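
One widely used example of such a method is LoRA (Hu et al., 2021), which freezes the pre-trained weights and trains only a low-rank update. Below is a minimal, illustrative PyTorch sketch; the rank, scaling, and initialization follow common practice but this is not a reference implementation:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Frozen linear layer plus a trainable low-rank update (LoRA-style sketch)."""

    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False           # pre-trained weights stay frozen
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))  # zero init: update starts at zero
        self.scale = alpha / rank

    def forward(self, x):
        # W x + scale * (x A) B: only A and B, a tiny fraction of the
        # parameters, receive gradients during fine-tuning.
        return self.base(x) + (x @ self.A @ self.B) * self.scale

layer = LoRALinear(nn.Linear(768, 768))
out = layer(torch.randn(4, 768))
```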

Documents:

Lecture recordings:

No recordings.

Week 4

Lecturer: Cees Snoek, Samuele Papa
Date: April 23, 2024

The lecture first introduces the theoretical fundamentals of denoising diffusion probabilistic models (DDPMs), and then covers guided diffusion, cascades, and latent diffusion.
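
For intuition, the sketch below implements the closed-form forward (noising) process of a DDPM, q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I), using the linear beta schedule of Ho et al. (2020). The denoising network that learns to predict the added noise is omitted, and shapes are illustrative:

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)           # linear noise schedule (Ho et al., 2020)
alpha_bars = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal-retention factors

def q_sample(x0, t):
    """Sample x_t directly from x_0 at timestep t (closed-form forward process)."""
    eps = torch.randn_like(x0)                  # the noise a DDPM learns to predict
    ab = alpha_bars[t]
    return ab.sqrt() * x0 + (1 - ab).sqrt() * eps, eps

x0 = torch.randn(1, 3, 32, 32)                  # stand-in for a batch of images
xt, eps = q_sample(x0, t=500)                   # heavily noised version of x0
```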

Documents:

Lecture recordings:

No recordings.

Lecturer: Leon Lang, Tim Bakker, Leonard Bereska (UvA)
Date: April 26, 2024

In this three-part lecture, we discuss alignment, mechanistic interpretability, and developmental interpretability.

Documents:

Lecture recordings:

No recordings.

Week 6

Lecturer: Catelijne Muller (ALLAI)
Date: May 7, 2024

TBD.

Documents:

No documents.

Lecture recordings:

No recordings.

Week 7

Lecturer: Cees Snoek
Date: May 14, 2024

TBD.

Documents:

No documents.

Lecture recordings:

No recordings.

Lecturer: Yuki Asano
Date: May 17, 2024

TBD.

Documents:

No documents.

Lecture recordings:

No recordings.

Week 8

Lecturer: Paul Verhagen
Date: May 21, 2024

TBD.

Documents:

No documents.

Lecture recordings:

No recordings.

Lecturer: Cees Snoek and TAs
Date: May 24, 2024

TBD.

Documents:

No documents.

Lecture recordings:

No recordings.

Week 9

Lecturer: Cees Snoek
Date: May 29, 2024

TBD.

Documents:

No documents.

Lecture recordings:

No recordings.

Contact us!


If you have any questions or recommendations for the website or the course, you can always drop us a line! Knowledge should be free, so also feel free to use any of the material provided here (but please be so kind as to cite us).