Beyond ChatGPT: NExT-GPT Is an Open-Source Model That Lets You Master AI With Audio, Video and Text


NExT-GPT, a multimodal large language model developed by the National University of Singapore and Tsinghua University, can process and generate combinations of text, images, audio, and video. Pitched as an “any-to-any” system, the open-source model allows for more natural interactions than text-only models. It uses a technique called “modality-switching instruction tuning” to improve cross-modal reasoning, and it relies on unique tokens to handle the different input types. NExT-GPT offers an open-source alternative to multimodal AI products from tech giants like Google and OpenAI.

Read more at Decrypt…
