Speech Learning Model

Alibaba Updates Speech Translation Model, Triples Language Coverage

Alibaba expands its AI live speech translation model from 18 to 60 languages, adding real-time voice cloning and reducing ...

InfoQ

Google's Universal Speech Model Performs Speech Recognition on Hundreds of Languages

A monthly overview of things you need to know as an architect or aspiring architect. Unlock the full InfoQ experience by logging in! Stay updated with your favorite authors and topics, engage with ...

CU Boulder News & Events

Fine-tuning a Strong Language model to Enable Classroom Speech Recognition

Postdoctorate Viet Anh Trinh led a project within Strand 1 to develop a novel neural network architecture that can both recognize and generate speech. He has since moved on from iSAT to a role at ...

InfoQ

Google AI Updates Universal Speech Model to Scale Automatic Speech Recognition beyond 100 Languages

VentureBeat

aiOla drops ultra-fast ‘multi-head’ speech recognition model, beats OpenAI Whisper

Want smarter insights in your inbox? Sign up for our weekly newsletters to get only what matters to enterprise AI, data, and security leaders. Subscribe Now Today, Israeli AI startup aiOla announced ...

VentureBeat

Meta Introduces Spirit LM open source model that combines text and speech inputs/outputs

Just in time for Halloween 2024, Meta has unveiled Meta Spirit LM, the company’s first open-source multimodal language model capable of seamlessly integrating text and speech inputs and outputs.

Hosted on MSN

Studies and AI advances reveal new paths in speech learning

Shared learning traits: Princeton research shows human infants and zebra finches both rely on social feedback to develop complex vocal sequences. Early disorder detection: A University of Vermont ...

Medical Xpress

How the senses intertwine to help store new speech patterns

We don't usually realize it, but every word we speak depends on a series of complex brain processes working behind the scenes. One important part of this is speech motor learning, the brain's ability ...

Ars Technica

Meta’s “massively multilingual” AI model translates up to 100 languages, speech or text

On Tuesday, Meta announced SeamlessM4T, a multimodal AI model for speech and text translations. As a neural network that can process both text and audio, it can perform text-to-speech, speech-to-text, ...

TechCrunch

Largest text-to-speech AI model yet shows ’emergent abilities’

Researchers at Amazon have trained the largest ever text-to-speech model yet, which they claim exhibits “emergent” qualities improving its ability to speak even complex sentences naturally. The ...

Some results have been hidden because they may be inaccessible to you

Show inaccessible results