AI & robotics · Technological advances in AI

SmolVLA from Hugging Face: Artificial Intelligence propels robotics towards greater agility and accessibility

Hugging Face, a major player in open source artificial intelligence, recently unveiled SmolVLA, a robotic model that combines a light footprint, solid performance and accessibility. Developed in collaboration with the open source community, the project illustrates a paradigm shift in artificial intelligence applied to robotics: favoring lean, adaptable and economical models over massive, costly architectures.

With this initiative, Hugging Face poses a strategic question: could the future of intelligent robotics lie in computational simplicity and frugality?

SmolVLA (Small Vision-Language Action) stands out for its ability to understand natural language instructions, analyze images or videos, and generate appropriate robotic actions. Unlike giant models requiring heavy infrastructure, SmolVLA can be deployed on compact robots or low-power embedded systems.

  • Modest parameter count, proven effectiveness: SmolVLA operates with fewer than 200 million parameters while retaining competitive inference performance on simple visual and motor tasks.
  • Integrated multimodality: the model is based on a vision-language-action architecture, capable of simultaneously taking into account an image of the environment, a textual command and the robot’s state.
  • Open source and community-driven: the project is fully available on GitHub, along with fine-tuning tools, documentation and demonstration videos featuring robots such as Unitree's quadrupeds and Boston Dynamics' Spot.
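To make the vision-language-action idea concrete, here is a minimal, self-contained Python sketch of the interface such a model exposes: an observation bundling an image, a text instruction and the robot's state goes in, and a robot action comes out. All names here (`Observation`, `Action`, `vla_policy`) are illustrative stand-ins, not the real SmolVLA API, and the "policy" is a toy that only inspects the instruction.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class Observation:
    image: List[List[float]]   # camera frame (toy grayscale grid)
    instruction: str           # natural language command
    joint_state: List[float]   # current robot joint positions

@dataclass
class Action:
    joint_deltas: List[float]  # offsets to apply to each joint
    gripper_open: bool         # whether to open the gripper

def vla_policy(obs: Observation) -> Action:
    """Toy stand-in for a vision-language-action model: it maps
    (image, instruction, state) to an action. A real model fuses
    all three modalities; this sketch only branches on the text."""
    text = obs.instruction.lower()
    open_gripper = "put" in text or "release" in text
    # No learned motion here: return zero deltas, one per joint.
    deltas = [0.0] * len(obs.joint_state)
    return Action(joint_deltas=deltas, gripper_open=open_gripper)

obs = Observation(
    image=[[0.0] * 8 for _ in range(8)],
    instruction="put this object in the blue box",
    joint_state=[0.1, -0.2, 0.3],
)
action = vla_policy(obs)
print(action.gripper_open)  # True: a "put ..." command releases the object
```

The point of the sketch is the shape of the contract, one observation in, one low-level action out, which is what lets a single compact model replace separate perception, language and control modules.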

This approach encourages widespread adoption by researchers, educators, makers and start-ups looking for intelligent robotic solutions without the need for costly cloud infrastructures.

SmolVLA opens up prospects for concrete applications in fields where robotics has hitherto remained difficult to access:

  • Education and research: many universities can now train multimodal robotic models without intensive GPU resources, facilitating the teaching of cognitive robotics.
  • Light logistics: on low-cost robots, SmolVLA enables simple objects to be handled on visual or voice command (e.g. “put this object in the blue box”).
  • Domestic or medical assistance: coupled with on-board visual sensors, the model enables robots to accompany a person in a wheelchair, detect a fallen object or follow a remote command.
  • Rapid prototyping in industrial robotics: SmolVLA facilitates the development of customized human-robot interfaces, even for small industrial structures without advanced AI computing centers.

The SmolVLA initiative is part of a wider movement to redefine priorities in artificial intelligence. Rather than producing ever larger and more energy-hungry models, Hugging Face defends an approach geared towards modularity, interpretability and accessibility, one that is gaining acceptance in the scientific and industrial communities.

According to a Stanford HAI study published in 2024¹, nearly 60% of all academic robotics projects now involve smaller models, optimized for edge deployment. In parallel, initiatives such as Open X-Embodiment or RT-Agents are pushing in the same direction, integrating generative robotic capabilities at low computational cost².

Intelligent robotics has long been the preserve of large corporations and well-funded laboratories. By making models more compact, open source and compatible with inexpensive hardware, Hugging Face and its partners are setting a process of technological democratization in motion. This trend could lead to a structural transformation of robotics value chains.

SmolVLA is not just another model: it embodies the political and technical will to bring artificial intelligence down from the cloud to the field, from laboratories to workshops, from research centers to classrooms.

1. Stanford HAI. (2024). AI Index Report 2024 – Robotics Section.
https://aiindex.stanford.edu/report/

2. Google DeepMind. (2023). RT-Agents: A New Standard for Multimodal Robotic Models.
https://www.deepmind.com/publications/rt-agents
