Meta Aims to Downsize Its AI-Language Models
While large language AI models continue to capture attention, the real focus may be shifting to smaller models. Meta seems to be betting on this shift, as highlighted in a recent paper by its research team. Massive models like ChatGPT, Gemini, and Llama rely on billions or even trillions of parameters, making them too bulky to run efficiently on mobile devices. Meta's scientists pointed out that there is a growing need for efficient language models suitable for mobile deployment, driven by rising cloud costs and concerns over latency.

In their paper, the team outlined how they developed high-quality language models with fewer than a billion parameters, a size well suited to mobile platforms. Challenging the common belief that sheer parameter count is what primarily drives model quality, the scientists produced results that in some areas rival Meta's Llama LLM.

"There's a prevailing paradigm that 'bigger is better,' but this shows it's really about how the parameters are utilized," said Nick DeGiacomo, CEO of Bucephalus, an AI-powered e-commerce supply chain platform based in New York City. "This opens the door to broader adoption of on-device AI," he told TechNewsWorld.
A Crucial Step
The significance of Meta's research lies in its challenge to the prevailing reliance on cloud-based AI, where data is typically processed in distant data centers, according to Darian Shimy, CEO and founder of FutureFund, a venture capital firm in San Francisco.

By bringing AI processing directly onto the device, Shimy told TechNewsWorld, Meta could reduce the environmental impact of data transmission and processing in large, energy-intensive data centers, while establishing device-based AI as a major player in the tech ecosystem.

Yashin Manraj, CEO of Pvotal Technologies, an end-to-end security software developer in Eagle Point, Ore., emphasized that the research is the first comprehensive, publicly shared effort of its scale. He called it a crucial first step toward a balanced approach to data processing, whether in the cloud or on the device, paving the way for AI-powered applications to realize their potential for support, automation, and assistance.

Meta's scientists have also made significant progress in shrinking the models themselves. Nishant Neekhra, senior director of mobile marketing at Skyworks Solutions, a semiconductor company in Irvine, Calif., noted that the proposed reduction in model size makes AI more accessible for wearables, hearables, and mobile phones, while opening up new applications and ways for AI to interact with the real world. Downsizing also addresses one of the biggest challenges facing language models: deploying them on edge devices.
High Impact on Health Care
One area where small language models could make a significant impact is medicine. According to Danielle Kelvas, a physician advisor at IT Medical, a global medical software development company, the research could bring generative AI to mobile devices, which are widely used in today's healthcare landscape for remote monitoring and biometric assessments.

By demonstrating that SLMs with fewer than a billion parameters can match larger models on certain tasks, she continued, the researchers are paving the way for the widespread adoption of AI in everyday health monitoring and personalized patient care.

Kelvas explained that SLMs can keep the processing of sensitive health data on the device, enhancing patient privacy. They can also enable real-time health monitoring and intervention, which is crucial for patients with chronic conditions or those in need of continuous care.

She added that the models could lower the technological and financial barriers to deploying AI in healthcare settings, potentially making advanced health monitoring technologies accessible to a wider population.
Reflecting Industry Trends
Caridad Muñoz, a professor specializing in new media technology at CUNY LaGuardia Community College, explained that Meta's emphasis on small AI models for mobile devices reflects the industry's broader trend towards optimizing AI for efficiency and accessibility. She highlighted that this shift not only tackles practical challenges but also matches the increasing concerns about the environmental impact of large-scale AI operations.
Muñoz also added that Meta is establishing a standard for sustainable and inclusive AI development by advocating smaller, more efficient models.
The focus on small language models also aligns with the edge computing trend, which aims to bring AI capabilities closer to users. DeGiacomo remarked that the large language models from organizations like OpenAI and Anthropic are often overkill for the task at hand, and that specialized, tuned models can be more efficient and cost-effective for specific jobs. Many mobile applications simply don't need cutting-edge AI, he stressed: sending a text message doesn't require a supercomputer.
DeGiacomo further explained that this approach allows the device to concentrate on managing the routing between what can be answered using the SLM and specialized use cases, drawing a parallel to the relationship between generalist and specialist doctors.
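The routing idea DeGiacomo describes can be sketched in a few lines of Python. Everything here is illustrative rather than drawn from Meta's paper: the task names, the length threshold, and the `route` function are hypothetical stand-ins for whatever policy a real device would use to decide between its local SLM and a cloud LLM.

```python
# Hypothetical sketch of on-device routing between a local SLM and a cloud LLM.
# Task names, thresholds, and the routing policy are illustrative assumptions.

from dataclasses import dataclass


@dataclass
class Query:
    text: str
    needs_multilingual: bool = False  # complex, multi-language queries escalate


# Routine tasks assumed to be within a sub-billion-parameter model's reach.
LOCAL_TASKS = {"dictation", "reply_suggestion", "summarize_notification"}


def route(query: Query, task: str) -> str:
    """Return 'on_device_slm' or 'cloud_llm' for a given query."""
    if query.needs_multilingual:
        return "cloud_llm"  # cutting-edge value still comes from the cloud
    if task in LOCAL_TASKS and len(query.text) < 500:
        return "on_device_slm"  # routine work stays local: lower latency, no data egress
    return "cloud_llm"


# A short reply suggestion is handled locally.
print(route(Query("Sounds good, see you at 3."), "reply_suggestion"))  # on_device_slm
```

The generalist/specialist split lives in the `route` function: the device answers what it can and escalates the rest, much as a general practitioner refers a patient to a specialist.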
Profound Effect on Global Connectivity
Shimy emphasized the significant impact SLMs could have on global connectivity. He pointed out that as on-device AI becomes more advanced, there will be less need for continuous internet connectivity, potentially bringing about a major shift in the tech landscape in areas with inconsistent or expensive internet access. This could lead to wider access to advanced technologies, making cutting-edge AI tools available in diverse global markets.
Although Meta is at the forefront of SLM development, Manraj noted that developing nations are watching closely as a way to keep their AI development costs under control. China, Russia, and Iran, he said, have shown keen interest in offloading computation to local devices, particularly when high-tech AI hardware chips are under embargo or hard to obtain.
He does not expect the shift toward an on-device "last mile" model to bring immediate or drastic change, since cloud-based LLMs will still be needed to deliver cutting-edge value for complex, multi-language queries. The change can, however, lighten the load on LLMs by handling smaller tasks, shortening feedback loops, and enriching data locally.
Ultimately, the beneficiaries will be end users, as the shift would enable a new generation of capabilities on their devices and a promising overhaul of front-end applications and how they interact with the world.
Manraj also cautioned that while known entities are driving innovation in this sector, potentially touching everyone's daily lives, SLMs could pose a threat by enabling models to harvest data and metadata at an unprecedented level. With the right precautions, he added, he hopes these efforts can be steered toward a positive outcome.