
ImageBind
https://imagebind.metademolab.com/ ImageBind: a new way to ‘link’ AI across the senses
https://ai.facebook.com/blog/imagebind-six-modalities-binding-ai/?s=09 blog
https://arxiv.org/abs/2305.05665 ImageBind: One Embedding Space To Bind Them All
https://github.com/facebookresearch/ImageBind repo/ckpt
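An LMM (large multimodal model) that integrates information from six modalities
Text / image and video / audio / depth / infrared (thermal) / IMU
All of these are trained into a single embedding, i.e. a shared representation space
Images, rather than text, serve as the central bridge between modalities
This is because the web holds large amounts of data paired with images, and images co-occur with a wide variety of modalities
The exact opposite of the approach taken by Microsoft's TaskMatrix.AI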
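To make the "image as the bridge" idea concrete, here is a minimal sketch of a contrastive (InfoNCE-style) loss that pulls each image embedding toward the embedding of its paired sample from another modality. This is not code from the ImageBind repo: the function names, the toy 3-dimensional vectors, and the temperature value of 0.07 are all assumptions made for illustration, and only the image-to-other direction is computed.

```python
import math


def normalize(v):
    """Scale a vector to unit length so dot products become cosine similarities."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def info_nce(image_embs, other_embs, temperature=0.07):
    """Mean InfoNCE loss, treating (image_embs[i], other_embs[i]) as the positive pair.

    Hypothetical helper for illustration; temperature=0.07 is an assumed value.
    """
    imgs = [normalize(v) for v in image_embs]
    others = [normalize(v) for v in other_embs]
    total = 0.0
    for i, query in enumerate(imgs):
        # Similarity of this image to every sample of the other modality.
        logits = [sum(a * b for a, b in zip(query, key)) / temperature
                  for key in others]
        # Numerically stable log-sum-exp over all candidates.
        top = max(logits)
        log_denom = top + math.log(sum(math.exp(l - top) for l in logits))
        # Cross-entropy with the paired sample as the correct class.
        total += log_denom - logits[i]
    return total / len(imgs)


# Toy embeddings (made up): three images and three audio clips.
image = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.1, 0.0, 1.0]]
audio_aligned = [[0.9, 0.2, 0.0], [0.1, 0.9, 0.0], [0.0, 0.1, 0.9]]
# Same audio vectors, but paired with the wrong images.
audio_shuffled = [audio_aligned[1], audio_aligned[2], audio_aligned[0]]

loss_aligned = info_nce(image, audio_aligned)
loss_shuffled = info_nce(image, audio_shuffled)
print(loss_aligned < loss_shuffled)
```

Applying the same kind of loss to (image, text), (image, audio), (image, depth) and so on places every modality in one space, so pairs that were never trained together, such as audio and text, can still be compared by cosine similarity.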

Meta AI