Research
I'm broadly interested in generative vision models for content creation
and am currently focused on video synthesis. My research aims to better
understand how to give users intuitive control over generative models.
I am also interested in bias mitigation and in harnessing large vision
and language models by adapting them to personalized tasks with limited data.
Relevant work is highlighted here.
Research Projects
- Generative Vision for Video Synthesis: Developing user-intuitive controls for generative video models to enable interactive and personalized content creation.
- Bias Mitigation in Vision Models: Investigating strategies to mitigate bias in generative models while preserving their generalization capabilities.
- Adaptation of Large Models: Adapting large-scale vision and language models to solve personalized tasks efficiently using limited data.
Publications
Lego: Learning to Disentangle and Invert Concepts Beyond Object Appearance
in Text-to-Image Diffusion Models
Saman Motamed, Danda Pani Paudel, Luc Van Gool
ECCV, 2024
code / arxiv
A method for textual inversion of adjectives and verbs in text-to-image diffusion models.
Investigating the Effectiveness of Cross-Attention to Unlock Zero-Shot Editing
of Text-to-Video Diffusion Models
Saman Motamed, Wouter Van Gansbeke, Luc Van Gool
CVPR Generative Models Workshop, 2024
code / arxiv
Zero-shot control over object shape, position and movement
in text-to-video models via cross-attention maps.
A Unified and Interpretable Emotion Representation and Expression Generation
Reni Paskaleva, Mykyta Holubakha, Andela Ilic,
Saman Motamed, Luc Van Gool, Danda Paudel
CVPR, 2024
arxiv
Fine-grained expression generation in conjunction with other textual inputs,
together with a new label space for emotions.
D3GU: Multi-Target Active Domain Adaptation via Enhancing Domain Alignment
Lin Zhang, Linghan Xu, Saman Motamed,
Shayok Chakraborty, Fernando De la Torre
WACV, 2024
arxiv
A Multi-Target Active Domain Adaptation (MT-ADA) framework for image classification.
Personalized Face Inpainting With Diffusion Models by Parallel Visual Attention
Jianjin Xu, Saman Motamed,
Praneetha Vaddamanu, Chen Henry Wu, Christian Haene,
Jean-Charles Bazin, Fernando De la Torre
WACV, 2024
code / arxiv
Fast, identity-preserving face inpainting with diffusion models.
PATMAT: Person Aware Tuning of Mask-Aware Transformer for Face Inpainting
Saman Motamed, Jianjin Xu, Chen Henry Wu, Fernando De la Torre
ICCV, 2023
ICCV / code / arxiv
A tuning method for personalized face inpainting
that preserves the subject's identity.
Generative Visual Prompt: Unifying Distributional Control
of Pre-Trained Generative Models
Chen Henry Wu, Saman Motamed,
Shaunak Srivastava, Fernando De la Torre
NeurIPS, 2022
NeurIPS / code / arxiv
A framework for defining control over latent-based generative models.
Somewhat related to computer vision and content creation, I enjoy shooting 35 mm and medium-format film. You can view some of my photos below.
Happenings
- Oct 2023: Two papers accepted at WACV 2024. Details will be posted soon.
- Oct 2023: I volunteered at ICCV 2023 and presented PATMAT.