
VALL-E
https://valle-demo.github.io
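>A very high-impact TTS paper has come out, so here is a quick introduction.
>VALL-E: a zero-shot speech synthesis system. It incorporates Meta's EnCodec and plays to the Transformer's strengths by treating speech synthesis as a language-modeling (LLM) task. As the paper title also suggests, it positions itself as the TTS counterpart of the DALL-E paper.
>Demo: https://t.co/mdBOMiVvuh 逆瀬川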
>

Text2Speech

I think it sounds very natural.
>Looks like a community reproduction of VALL-E may come before the official release (no ETA or commitment from MSFT yet).
>
>We may be able to clone anyone’s voice to synthesize any speech on @huggingface soon 😮
>
>Link: https://t.co/sCYZ0PEOAL. Not lucidrains this time😄 Jim Fan
>
Huh, so an unofficial implementation is already out.

#音声合成AI