Stars
3
Forks
1
Language
None
Last Updated
Dec 06, 2023
Similar Repos
Language | Stars | Description | Updated At | |
---|---|---|---|---|
Python | 16 | Prompting Whisper for Audio-Visual Speech Recognition, Code-Switched Speech Recognition, and Zero-Shot Speech Translation | May 20, 2023 | |
Python | 39 | Zero-shot Audio Classification using Whisper | Apr 25, 2023 | |
Jupyter Notebook | 4 | Audio-visual speech recognition models | May 27, 2022 | |
Python | 33 | Whisper + OpenAI + Speech Recognition | Apr 08, 2023 | |
Python | 23 | Simple Python audio transcriber using OpenAI's Whisper speech recognition model | Mar 26, 2023 | |
Python | 44 | Zero-Shot Visual Recognition using Semantic-Preserving Adversarial Embedding Networks | Apr 19, 2022 | |
Python | 6 | Code for paper "Cross-Modal Global Interaction and Local Alignment for Audio-Visual Speech Recognition" | Jun 25, 2023 | |
HTML | 2 | Code for benchmarking VLMs on zero and few-shot activity recognition | May 19, 2023 | |
Python | 53 | Zero-shot multimodal punctuation insertion and truecasing using Whisper | Apr 25, 2023 | |
Python | 76 | Audio-Visual Speech Recognition using Sequence to Sequence Models | Sep 24, 2022 | |
C | 45 | Transcribe audio files using the "Whisper" Automatic Speech Recognition model from R | Apr 24, 2023 | |
Python | 767 | Zero-shot Image-to-Image Translation | May 09, 2023 | |
Jupyter Notebook | 47 | [WACV 2023] Audio-Visual Efficient Conformer (AVEC) for Robust Speech Recognition | May 23, 2023 | |
Python | 16 | This repository contains the code and instructions needed to reproduce the dataset splits for out … | Aug 03, 2022 | |
Python | 3 | Whisper is a general-purpose speech recognition model. It is trained on a large dataset of … | Apr 01, 2023 | |
Python | 16 | Github repository for Zero Shot Visual Storytelling | Nov 12, 2022 | |
Python | 38 | Whisper realtime streaming for long speech-to-text transcription and translation | Apr 23, 2023 | |
None | 2 | A PyTorch implementation of the Deep Audio-Visual Speech Recognition paper. | Jan 24, 2021 | |
MATLAB | 7 | Domain-Invariant Projection Learning for Zero-Shot Recognition | Jul 07, 2020 | |
Python | 32 | Unofficial Implementation of Zero-Shot Text-to-Speech for Text-Based Insertion in Audio Narration | Jun 27, 2022 | |
Jupyter Notebook | 168 | Wav2Vec for speech recognition, classification, and audio classification | May 29, 2023 | |
JavaScript | 3 | QCRI Speech Recognition and Machine Translation APIs for Hackathons | Apr 19, 2021 | |
Python | 34 | Python toolkit for Visual Speech Recognition | Dec 22, 2022 | |
Jupyter Notebook | 3 | Zero-shot and Translation Experiments on XQuAD, MLQA and TyDiQA | Nov 08, 2022 | |
Python | 4 | This repository contains the code for our ECCV 2022 paper "Temporal and cross-modal attention for … | Aug 09, 2022 | |
Shell | 2 | repository for paper "Audio-Visual Speech Recognition in MISP2021 Challenge: Dataset Release and Deep Analysis" | Jun 29, 2022 | |
Python | 3 | [EMNLP 2022] CROP: Zero-shot Cross-lingual Named Entity Recognition with Multilingual Labeled Sequence Translation | Feb 20, 2023 | |
TypeScript | 2 | A CLI speech recognition tool, using OpenAI Whisper, supports audio file transcription and near-realtime microphone … | Oct 20, 2023 | |
Python | 7 | Audio-Visual Model for Speech Separation. | Dec 26, 2019 | |
Kotlin | 2 | Text-to-speech translation Android application with speech recognition | Mar 17, 2023 | |
Python | 207 | Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch | Apr 20, 2023 | |
Python | 4 | Implementation of Natural Speech 2, Zero-shot Speech and Singing Synthesizer, in Pytorch | Jul 12, 2023 | |
Python | 9 | Visual speech recognition with face inputs: code and models for F&G 2020 paper "Can We … | Apr 02, 2022 | |
Python | 2 | This is an OpenAI Whisper automatic speech recognition microservice | Apr 06, 2023 | |
TypeScript | 5 | Speech recognition using Google Cloud Speech, specified for Japanese audio | Oct 20, 2022 | |
Python | 4 | Code for Invariant and consistent: Unsupervised representation learning for few-shot visual recognition. Neurocomputing 2023 | Mar 06, 2023 | |
CSS | 3 | Official documentation of Whisper-Based Real-time Speech Recognition, an Unreal Engine plugin for real-time speech-to-text transcription … | Apr 14, 2023 | |
Jupyter Notebook | 7 | Demo combining Whisper for speech recognition and Google TTS for speech synthesis to interact with … | Apr 01, 2023 | |
Python | 53 | Code and Data for paper: Zero-shot Visual Question Answering using Knowledge Graph [ ISWC 2021 … | May 08, 2023 | |
Python | 2 | Code for Multimodal Cross Attention Network for Audio Visual Emotion Recognition | Nov 01, 2022 | |
JavaScript | 248 | Web Audio Speech Synthesis / Recognition for p5.js | Aug 22, 2022 | |
Jupyter Notebook | 146 | Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper | Apr 11, 2023 | |
Python | 303 | Presenting Low-shot Visual Recognition by Shrinking and Hallucinating Features | Jul 16, 2022 | |
C++ | 5 | Code to train a translation model using speech recognition lattices and their written translations. | Nov 10, 2021 | |
Python | 16 | This repository contains the code for our CVPR 2022 paper on "Audio-visual Generalised Zero-shot Learning … | Aug 09, 2022 | |
None | 2 | Code and Data for the paper: Zero-shot Visual Question Answering using Knowledge Graph [ ISWC … | Apr 15, 2022 | |
Python | 5 | bimodal speech recognition based on acoustic and visual data | May 02, 2022 | |
Python | 4 | Simple demo of OpenAI's Whisper speech recognition model for HF 🤗 Spaces. | Mar 03, 2023 | |
Python | 7 | Code for the AAAI 2021 paper "Attributes-Guided and Pure-Visual Attention Alignment for Few-Shot Recognition". | Jun 07, 2022 | |
Python | 105 | A realtime speech transcription and translation application using Whisper OpenAI and free translation API. Interface … | Apr 25, 2023 |