# SadTalker

**Repository Path**: dingjianfei/SadTalker

## Basic Information

- **Project Name**: SadTalker
- **Description**: Mirror
- **Primary Language**: Python
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 1
- **Forks**: 1
- **Created**: 2023-04-03
- **Last Updated**: 2023-10-04

## Categories & Tags

**Categories**: Uncategorized
**Tags**: None

## README
- 🔥 Several new modes, e.g., `still mode`, `reference mode`, and `resize mode`, are now online for better and more customizable applications.
- 🔥 Happy to see our method being used in various talking and singing avatars; check out these wonderful demos on [bilibili](https://search.bilibili.com/all?keyword=sadtalker&from_source=webtop_search&spm_id_from=333.1007&search_source=3) and [twitter #sadtalker](https://twitter.com/search?q=%23sadtalker&src=typed_query).
## 📋 Changelog (the previous changelog can be found [here](docs/changlelog.md))
- __[2023.03.30]__: Launched the beta version of the full-body mode.
- __[2023.03.30]__: Launched a new feature: by using reference videos, our algorithm can generate videos with more natural eye blinking and some eyebrow movement.
- __[2023.03.29]__: `resize mode` is online via `python inference.py --preprocess resize`! It produces a larger crop of the image, as discussed in https://github.com/Winfredy/SadTalker/issues/35. A combined example of these modes follows this list.
- __[2023.03.29]__: The local Gradio demo is online! Run `python app.py` to start it. A new `requirements.txt` is used to avoid the bugs in `librosa`.
- __[2023.03.28]__: The online demo is launched on [Hugging Face Spaces](https://huggingface.co/spaces/vinthony/SadTalker), thanks to AK!
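As a sketch of how these modes might combine in a single call (only `--preprocess resize` is documented above; `--still`, `--ref_eyeblink`, and the `<...>` placeholder paths are our assumptions about the CLI and may differ from the actual script):

```bash
# Hypothetical combined invocation: still mode, reference mode, and resize mode.
# --still and --ref_eyeblink are assumed flag names; replace <...> with real paths.
python inference.py --driven_audio <audio.wav> \
                    --source_image <image.png> \
                    --ref_eyeblink <reference.mp4> \
                    --still \
                    --preprocess resize
```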
## 🎼 Pipeline

> Our method uses the coefficients of a 3DMM as the intermediate motion representation. To this end, we first generate
> realistic 3D motion coefficients (facial expression β, head pose ρ)
> from audio; these coefficients are then used to implicitly modulate
> a 3D-aware face renderer for the final video generation.
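A minimal sketch of this two-stage design (the function and class names below are hypothetical, for illustration only, and are not the repository's actual API):

```python
# Hypothetical sketch of the two-stage pipeline described above.
# audio_to_coeffs and render_video are illustrative names, not the real API.
import numpy as np

def audio_to_coeffs(audio: np.ndarray) -> dict:
    """Stage 1: map audio to per-frame 3DMM motion coefficients:
    facial expression (beta) and head pose (rho)."""
    num_frames = len(audio) // 640           # e.g., 25 fps at 16 kHz audio
    return {
        "beta": np.zeros((num_frames, 64)),  # expression coefficients
        "rho": np.zeros((num_frames, 6)),    # head rotation + translation
    }

def render_video(source_image: np.ndarray, coeffs: dict) -> list:
    """Stage 2: a 3D-aware face renderer modulated by the coefficients."""
    frames = []
    for beta, rho in zip(coeffs["beta"], coeffs["rho"]):
        # In the real model, beta and rho implicitly modulate the renderer's
        # features; this stub simply returns the source image per frame.
        frames.append(source_image)
    return frames
```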
## 🚧 TODO
> Kindly make sure to unmute the audio, as GitHub plays videos muted by default.
#### Generating 4D free-view talking examples from audio and a single image
We use `--input_yaw`, `--input_pitch`, and `--input_roll` to control the head pose. For example, `--input_yaw -20 30 10` means the head yaw changes from -20 to 30 degrees and then from 30 to 10 degrees.
```bash
# Completed with placeholders; only --driven_audio and --input_yaw appear in this README.
python inference.py --driven_audio <audio.wav> --source_image <image.png> --input_yaw -20 30 10
```
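The three angle sequences can also be combined in one call (a sketch; the `--input_*` flag names are taken from the paragraph above, while `--source_image` and the `<...>` paths are placeholders):

```bash
# Combines yaw, pitch, and roll sequences into one free-view rendering pass.
python inference.py --driven_audio <audio.wav> \
                    --source_image <image.png> \
                    --input_yaw -20 30 10 \
                    --input_pitch -10 0 10 \
                    --input_roll -5 0 5
```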