Moshi AI Chatbot With Real-Time Voice Features Launched by Kyutai Labs as GPT-4o Rival
New Delhi: Kyutai Labs has announced the launch of Moshi AI, an artificial intelligence (AI) chatbot capable of real-time verbal responses. Developed entirely in-house by the French AI firm, Moshi’s audio language model uses voice modulation to convey emotions and can respond in various speaking styles. The AI model is available to the public free of charge, though conversations are currently limited to five minutes. Notably, OpenAI has announced similar speech features for GPT-4o, although they have not yet been released.
Moshi AI features
Kyutai Labs has revealed that Moshi AI was developed in six months by a team of eight people. At an unveiling event in Paris, the company clarified that Moshi is not an AI assistant but a prototype designed to facilitate the development of tools for various use cases. The chatbot is publicly accessible; users join a queue by registering with an email address. The platform’s interface is minimalist and straightforward. Users can monitor the loudness of their voice while speaking, and a text box displays only the AI’s responses. Another box at the top of the interface shows technical details such as audio duration, latency, and missed audio. A button to disconnect the call is located at the very top, and calls are currently limited to five minutes. The description page emphasizes that Moshi can think, speak, and listen simultaneously, which improves the conversational flow.
The AI exhibits extremely low latency and often responds almost instantly. However, responses were occasionally delayed by 10 to 15 seconds or more, likely due to server load. In some instances, verbal prompts were not registered even when the volume meter was nearly full. The AI model can respond with an emotive voice and can employ various speaking styles and voice modulations. Because it is connected to the Internet, Moshi can fetch information for queries that require web searches. Notably, the chatbot does not support text prompts and relies solely on voice interactions. Kyutai Labs has announced plans to open-source the AI model, though the model weights and code have not yet been published. Once they are available, users will be able to download and install the model locally, allowing the AI to run on devices without an Internet connection.