Unveiling Language Development: Insights from The Little Prince and AI Research

Children need only a few million words to master a language, yet the brain mechanisms underpinning this process remain poorly understood. A recent study by Meta* AI, in collaboration with the Rothschild Hospital in Paris, sheds light on how linguistic representations form in the brain, revealing striking similarities to large language models (LLMs) in artificial intelligence.

Researchers monitored the brain activity of 46 French-speaking participants aged between 2 and 46 years old. All participants had electrodes implanted for epilepsy treatment. While they listened to the audiobook "The Little Prince," neural activity was recorded from over 7,400 electrodes, with the aim of tracking how speech is processed in the brain.

The findings indicated that even children aged 2 to 5 exhibited clear responses to speech sounds, such as "b" and "k." These responses were localized in specific auditory regions of the brain and occurred at predictable times. However, the processing of complete words—their meanings and grammatical structures—was only observed in older children, in more advanced brain areas.

As children grow, the patterns of speech processing expand to larger regions of the brain. The response to words begins earlier, lasts longer, and becomes more pronounced, indicating that speech processing becomes increasingly complex with age.

To gain deeper insights into how these representations develop, the team compared the neural data with the activation patterns of two language models: wav2vec 2.0 (which learns speech representations directly from sound) and Llama 3.1 (a text-based language model). Both models were evaluated before and after training.

Post-training, the models came to resemble human brain activity more closely. wav2vec 2.0, which was trained on raw audio recordings, developed a stepwise processing hierarchy—starting from simple sounds and advancing to more complex meanings. Llama 3.1, by contrast, processed entire words from the outset, similar to how older children and adults operate.
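The article does not spell out the study's analysis pipeline, but comparisons of this kind are typically done with an encoding model: a linear (ridge) regression is fit from a model's hidden activations to the recorded neural signals, and scored by the correlation between predicted and actual activity on held-out data. A minimal sketch with synthetic stand-in data (all array shapes, names, and parameters here are illustrative assumptions, not taken from the study):

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-ins: 1000 time points, 64-dimensional "model activations",
# and 8 "electrode" channels driven by a linear mix of those activations plus noise.
n_samples, n_features, n_channels = 1000, 64, 8
activations = rng.standard_normal((n_samples, n_features))
true_weights = rng.standard_normal((n_features, n_channels))
neural = activations @ true_weights + 0.5 * rng.standard_normal((n_samples, n_channels))

# Hold out 20% of the time points for evaluation.
X_tr, X_te, y_tr, y_te = train_test_split(
    activations, neural, test_size=0.2, random_state=0
)

# Encoding model: ridge regression from activations to neural signals.
model = Ridge(alpha=1.0).fit(X_tr, y_tr)
pred = model.predict(X_te)

# "Brain score": per-channel Pearson correlation between predicted and actual signals.
scores = [np.corrcoef(pred[:, c], y_te[:, c])[0, 1] for c in range(n_channels)]
print(f"mean held-out correlation: {np.mean(scores):.2f}")
```

With real data, the activations would come from each layer of wav2vec 2.0 or Llama 3.1 aligned in time to the audiobook, and the score would be computed per electrode; an untrained model's activations would be expected to predict the recordings far less well.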

The researchers noted that representations akin to Llama 3.1 were only observed in older children and adults, not in toddlers aged 2 to 5, who instead resembled an early, untrained AI model. It was only after increased exposure to language that the brain displayed activations similar to those of LLMs.

According to the study’s authors, including Jean-Rémy King from Meta, the evolution of speech processing in the brain and the maturation of language models during training illustrate structural similarities. Both biological and artificial systems appear to create comparable hierarchies of language representations, although LLMs necessitate far more data.

Despite these parallels, notable differences exist. Children acquire language using only a few million words, while LLMs require billions. Many cognitive abilities, such as understanding syntactic dependencies or semantic nuances, still elude AI.

Nonetheless, the study’s results suggest that AI models could assist scientists in gaining a better understanding of language development in the human brain. They offer a novel method for tracking speech processing across different age groups and comparing the internal workings of biological and artificial systems.

One significant limitation is that children under two years old could not participate in the study for medical reasons, despite this being a crucial period for language development.

*Meta and its products (Instagram, Facebook) are banned in the Russian Federation.

[Source](https://the-decoder.com/how-the-little-prince-and-ai-help-us-better-understand-language-development-in-the-brain/)