Sam Motamed

I am an ELLIS Ph.D. student at INSAIT, where I am advised by Prof. Luc Van Gool and Dr. Iro Laina. Previously, I was a Machine Learning Researcher Intern at Netflix and, from May to November 2024, a Student Researcher at Google DeepMind in Toronto, working with Robert Geirhos.

Before my Ph.D. journey began, I was a visiting researcher at CMU's Human Sensing Lab from 2021 to 2023, working with the amazing Fernando De La Torre. I also spent 7 wonderful years at the University of Toronto's Computer Science department, where I earned my HBSc and MS degrees.

Email / CV / Google Scholar / Twitter / GitHub

7 Magic Mountains - Nevada

WACV 2026 - Arizona

Lego Sam says eat your fruits

Research

I'm broadly interested in video generation and video-language models, with a current focus on improving their generation and understanding of physically plausible scenes. I also work on enabling user-intuitive control over generative models and adapting large vision and language models to solve personalized tasks using limited data. Relevant work is highlighted here.

Research Projects

Physics Plausibility in Video Models: Improving how video generation and video-language models understand and produce physically plausible content.
Generative Vision for Video Synthesis: Developing user-intuitive controls for generative video models to enable interactive and personalized content creation.
Adaptation of Large Models: Adapting large-scale vision and language models to solve personalized tasks efficiently using limited data.

Publications

VOID: Video Object and Interaction Deletion

Saman Motamed, William Harvey, Benjamin Klein, Luc Van Gool, Zhuoning Yuan, Ta-Ying Cheng

arXiv, 2026

project page / code / arxiv / Gradio Demo

A video inpainting model that removes not only an object but also the effects it induces in the scene.

Physics IQ

Do generative video models learn physical principles?

Saman Motamed, Laura Culp, Kevin Swersky, Priyank Jaini, Robert Geirhos

WACV, 2026

project page / code / arxiv

A benchmark of real videos for testing the physics understanding of generative video models.

TRAVL: A Recipe for Making Video-Language Models Better Judges of Physics Implausibility

Saman Motamed, Minghao Chen, Luc Van Gool, Iro Laina

arXiv, 2025

project page / code / arxiv

A recipe for making video-language models (VLMs) better at understanding physics, and a rigorous benchmark for testing VLMs on physics understanding.

Lego

Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance in Text-to-Image Diffusion Models

Saman Motamed, Danda Pani Paudel, Luc Van Gool

ECCV, 2024

code / arxiv

A method for textual inversion of adjectives and verbs in text-to-image diffusion models.

Video Editing

Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing of Text-to-Video Diffusion Models

Saman Motamed, Wouter Van Gansbeke, Luc Van Gool

CVPR Generative Models Workshop, 2024

code / arxiv

Zero-shot control over object shape, position and movement in text-to-video models via cross-attention maps.

PATMAT

PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting

Saman Motamed, Jianjin Xu, Chen Henry Wu, Fernando De la Torre

ICCV, 2023

ICCV / code / arxiv

A tuning method for personalizing inpainting of the face and preserving the identity of a subject.

PromptGen

Generative Visual Prompt: Unifying Distributional Control of Pre-Trained Generative Models

Chen Henry Wu, Saman Motamed, Shaunak Srivastava, Fernando De La Torre

NeurIPS, 2022

NeurIPS / code / arxiv

A framework for defining control over latent-based generative models.

Talks

Invited talks and presentations.

Dec 16, 2025

Improving VLMs' understanding of physically implausible scenes.

Invited Talk · Vector Institute, Toronto
Invited by Dr. Babak Taati · Journal Club

Jan 30, 2025

How to benchmark video generative models for their physics understanding?

Invited Talk · Stability AI Reading Group
Invited by Rahim Entezari

Film Photography

Somewhat related to computer vision and content creation, I enjoy film photography on 35 mm and medium format film.

Turret Arch, Arches National Park, Utah

Goblin Valley, Utah

Utah landscape

Athens, Greece

Highway 1, California

Big Sur, California

Big Sur, California

Port Meadow, Oxford, UK

Haleakalā National Park, Hawaii

Mesa Arch, Utah