# TVTS

**Repository Path**: blueshh/TVTS

## Basic Information

- **Project Name**: TVTS
- **Description**: Uses automatic speech recognition (ASR) to convert the audio in videos into text. The text can then be organized, searched, and analyzed to extract useful information.
- **Primary Language**: Unknown
- **License**: Not specified
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2023-11-23
- **Last Updated**: 2023-11-23

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Turning to Video for Transcript Sorting

This repo contains the official implementations of two papers:

1. [Learning Transferable Spatiotemporal Representations from Natural Script Knowledge](https://arxiv.org/abs/2209.15280)
2. [TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale](https://arxiv.org/abs/2305.14173)

## News

+ **[2023.02]** 🎉 TVTS is accepted to CVPR 2023.
+ **[2023.03]** The official code of TVTS has been released.
+ **[2023.05]** 🚀 TVTSv2 is coming out! Please refer to [this link](https://arxiv.org/abs/2305.14173) for details.
+ **[2023.08]** The official code of TVTSv2 and the pre-trained models have been released. All zero-shot evaluations can be run on a single GPU. We provide scripts for extracting your own video features. Try it now 😎!

## Introduction

### Quickstart

Folder [v1](v1) contains the official code of TVTS. See [v1-README](v1/README.md) for details.

Folder [v2](v2) contains the official code of TVTSv2, an upgraded version of TVTS that produces powerful video representations for *out-of-the-box* usage. See [v2-README](v2/README.md) for details.

### Citation

If you find our work helpful, please cite our papers.

```tex
@InProceedings{Zeng_2023_CVPR,
    author    = {Zeng, Ziyun and Ge, Yuying and Liu, Xihui and Chen, Bin and Luo, Ping and Xia, Shu-Tao and Ge, Yixiao},
    title     = {Learning Transferable Spatiotemporal Representations From Natural Script Knowledge},
    booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)},
    month     = {June},
    year      = {2023},
    pages     = {23079-23089}
}
```

```tex
@misc{zeng2023tvtsv2,
    title={TVTSv2: Learning Out-of-the-box Spatiotemporal Visual Representations at Scale},
    author={Ziyun Zeng and Yixiao Ge and Zhan Tong and Xihui Liu and Shu-Tao Xia and Ying Shan},
    year={2023},
    eprint={2305.14173},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}
```