Apertus: a fully open, transparent, multilingual language model

By Melissa Anchisi and Florian Meyer

In July, EPFL, ETH Zurich, and the Swiss National Supercomputing Centre (CSCS) announced their joint initiative to build a large language model (LLM). Now, this model is available and serves as a building block for developers and organisations for future applications such as chatbots, translation systems, or educational tools.

The model is named Apertus – Latin for “open” – highlighting its distinctive feature: the entire development process, including its architecture, model weights, and training data and recipes, is openly accessible and fully documented.

AI researchers, professionals, and experienced enthusiasts can either access the model through the strategic partner Swisscom or download it from Hugging Face – a platform for AI models and applications – and deploy it for their own projects. Apertus is freely available in two sizes – 8 billion and 70 billion parameters – with the smaller model better suited to individual use. Both models are released under a permissive open-source license, allowing use in education and research as well as broad societal and commercial applications.

A fully open-source LLM

As a fully open language model, Apertus allows researchers, professionals and enthusiasts to build upon the model and adapt it to their specific needs, as well as to inspect any part of the training process. This distinguishes Apertus from models that make only selected components accessible.

“With this release, we aim to provide a blueprint for how a trustworthy, sovereign, and inclusive AI model can be developed,” says Martin Jaggi, Professor of Machine Learning at EPFL and member of the Steering Committee of the Swiss AI Initiative. The model will be regularly updated by the development team which includes specialized engineers and a large number of researchers from CSCS, ETH Zurich and EPFL.

A driver of innovation

With its open approach, EPFL, ETH Zurich and CSCS are venturing into new territory. “Apertus is not a conventional case of technology transfer from research to product. Instead, we see it as a driver of innovation and a means of strengthening AI expertise across research, society and industry,” says Thomas Schulthess, Director of CSCS and Professor at ETH Zurich. In line with their tradition, EPFL, ETH Zurich and CSCS are providing both foundational technology and infrastructure to foster innovation across the economy.

Trained on 15 trillion tokens across more than 1,000 languages – 40% of the data is non-English – Apertus includes many languages that have so far been underrepresented in LLMs, such as Swiss German, Romansh, and many others.

“Apertus is built for the public good. It stands among the few fully open LLMs at this scale and is the first of its kind to embody multilingualism, transparency, and compliance as foundational design principles”, says Imanol Schlag, technical lead of the LLM project and Research Scientist at ETH Zurich.

“Swisscom is proud to be among the first to deploy this pioneering large language model on our sovereign Swiss AI Platform. As a strategic partner of the Swiss AI Initiative, we are supporting access to Apertus during the Swiss {ai} Weeks. This underscores our commitment to shaping a secure and responsible AI ecosystem that serves the public interest and strengthens Switzerland’s digital sovereignty”, commented Daniel Dobos, Research Director at Swisscom.

Accessibility

While setting up Apertus is straightforward for professionals and proficient users, additional components such as servers, cloud infrastructure or specific user interfaces are required for practical use. The upcoming Swiss {ai} Weeks hackathons will be the first opportunity for developers to experiment hands-on with Apertus, test its capabilities, and provide feedback for improvements to future versions.

Swisscom will provide a dedicated interface to hackathon participants, making it easier to interact with the model. As of today, Swisscom business customers can access the Apertus model via Swisscom’s sovereign Swiss AI platform.

Furthermore, for people outside of Switzerland, the Public AI Inference Utility will make Apertus accessible as part of a global movement for public AI. “Currently, Apertus is the leading public AI model: a model built by public institutions, for the public interest. It is our best proof yet that AI can be a form of public infrastructure like highways, water, or electricity,” says Joshua Tan, Lead Maintainer of the Public AI Inference Utility.

Transparency and compliance

Apertus is designed with transparency at its core, ensuring full reproducibility of the training process. Alongside the models, the research team has published a range of resources: comprehensive documentation and source code for the training process, the datasets used, and the model weights including intermediate checkpoints – all released under a permissive open-source license that also allows commercial use. The terms and conditions are available via Hugging Face.

Apertus was developed with due consideration for Swiss data protection law, Swiss copyright law, and the transparency obligations under the EU AI Act. Particular attention has been paid to data integrity and ethical standards: the training corpus builds only on publicly available data. It is filtered to respect machine-readable opt-out requests from websites, even retroactively, and to remove personal data and other undesired content before training begins.

The beginning of a journey

“Apertus demonstrates that generative AI can be both powerful and open,” says Antoine Bosselut, Professor and Head of the Natural Language Processing Laboratory at EPFL and Co-Lead of the Swiss AI Initiative. “The release of Apertus is not a final step, rather it’s the beginning of a journey, a long-term commitment to open, trustworthy, and sovereign AI foundations, for the public good worldwide. We are excited to see developers engage with the model at the Swiss {ai} Weeks hackathons. Their creativity and feedback will help us to improve future generations of the model.”

Future versions aim to expand the model family, improve efficiency, and explore domain-specific adaptations in fields like law, climate, health and education. They are also expected to integrate additional capabilities, while maintaining strong standards for transparency.

Robotic system offers hidden window into collective bee behavior

The robotic system is shown in an experimental hive © Artificial Life Lab/U. of Graz/Hiveopolis

By Celia Luterbacher

Honeybees are famously finicky when it comes to being studied. Research instruments, unusual conditions, and even unfamiliar smells can disrupt a colony’s behavior. Now, a joint research team from the Mobile Robotic Systems Group in EPFL’s School of Engineering and School of Computer and Communication Sciences and the Hiveopolis project at Austria’s University of Graz has developed a robotic system that can be unobtrusively built into the frame of a standard honeybee hive.

Composed of an array of thermal sensors and actuators, the system measures and modulates honeybee behavior through localized temperature variations.

“Many rules of bee society – from collective and individual interactions to raising a healthy brood – are regulated by temperature, so we leveraged that for this study,” explains EPFL PhD student Rafael Barmak, first author on a paper on the system recently published in Science Robotics. “The thermal sensors create a snapshot of the bees’ collective behavior, while the actuators allow us to influence their movement by modulating thermal fields.”

“Previous studies on the thermal behavior of honeybees in winter have relied on observing the bees or manipulating the outside temperature,” adds Martin Stefanec of the University of Graz. “Our robotic system enables us to change the temperature from within the cluster, emulating the heating behavior of core bees there, and allowing us to study how the winter cluster actively regulates its temperature.”

A ‘biohybrid superorganism’ to mitigate colony collapse

Bee colonies are challenging to study in winter since they are sensitive to cold, and opening their hives risks harming them in addition to influencing their behavior. But thanks to the researchers’ biocompatible robotic system, they were able to study three experimental hives, located at the Artificial Life Lab at the University of Graz, during winter and to control them remotely from EPFL. Inside the device, a central processor coordinated the sensors, sent commands to the actuators, and transmitted data to the scientists, demonstrating that the system could be used to study bees with no intrusion – or even cameras – required.
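The control loop described above – a processor reading thermal sensors and driving actuators to shape the temperature field – can be sketched roughly as follows. This is purely illustrative: the grid size, temperatures, and gain are invented and not taken from the actual device.

```python
# Illustrative sketch of a sensor-read / actuator-command cycle like the one
# described above. All constants here are made-up values, not the real system's.

GRID_W, GRID_H = 8, 4        # hypothetical sensor/actuator grid in the hive frame
TARGET_TEMP = 28.0           # desired local temperature in deg C (illustrative)

def control_step(sensor_temps, target_cell):
    """Return actuator heating powers (0..1) for one sensor reading,
    warming a single target cell to nudge the cluster toward it."""
    powers = [[0.0] * GRID_W for _ in range(GRID_H)]
    r, c = target_cell
    error = TARGET_TEMP - sensor_temps[r][c]
    # simple proportional heating at the target cell, clamped to [0, 1]
    powers[r][c] = max(0.0, min(1.0, 0.2 * error))
    return powers

# one uniformly cold reading (10 deg C everywhere); warm up cell (1, 3)
reading = [[10.0] * GRID_W for _ in range(GRID_H)]
powers = control_step(reading, (1, 3))
```

In the real system this cycle would run continuously, with the processor also logging sensor data for the researchers.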

Mobile Robotic Systems Group head Francesco Mondada explains that one of the most important aspects of the system – which he calls a ‘biohybrid superorganism’ for its combination of robotics with a colony of individuals acting as a living entity – is its ability to simultaneously observe and influence bee behavior.

“By gathering data on the bees’ position and creating warmer areas in the hive, we were able to encourage them to move around in ways they would never normally do in nature during the winter, when they tend to huddle together to conserve energy. This gives us the possibility to act on behalf of a colony, for example by directing it toward a food source, or discouraging it from dividing into too-small groups, which can threaten its survival.”

The robotic system is shown in an experimental hive © MOBOTS / EPFL / Hiveopolis

The scientists were able to prolong the survival of a colony following the death of its queen by distributing heat energy via the actuators. The system’s ability to mitigate colony collapse could have implications for bee survivability, which has become a growing environmental and food security concern as the pollinators’ global populations have declined.

Never-before-seen behaviors

In addition to its potential to support colonies, the system has shed light on honeybee behaviors that have never been observed, opening new avenues in biological research.

“The local thermal stimuli produced by our system revealed previously unreported dynamics that are generating exciting new questions and hypotheses,” says EPFL postdoctoral researcher and corresponding author Rob Mills. “For example, currently, no model can explain why we were able to encourage the bees to cross some cold temperature ‘valleys’ within the hive.”

The researchers now plan to use the system to study bees in summertime, which is a critical period for raising young. In parallel, the Mobile Robotic Systems Group is exploring systems using vibrational pathways to interact with honeybees.

“The biological acceptance aspect of this work is critical: the fact that the bees accepted the integration of electronics into the hive gives our device great potential for different scientific or agricultural applications,” says Mondada.


This work was supported by the EU H2020 FET project HIVEOPOLIS (no. 824069), coordinated by Thomas Schmickl, and by the Field of Excellence COLIBRI (Complexity of Life in basic Research and Innovation) at the University of Graz.

How to compete with robots

When it comes to the future of intelligent robots, the first question people ask is often: how many jobs will they make disappear? Whatever the answer, the second question is likely to be: how can I make sure that my job is not among them?

In a study just published in Science Robotics, a team of roboticists from EPFL and economists from the University of Lausanne offers answers to both questions. By combining the scientific and technical literature on robotic abilities with employment and wage statistics, they have developed a method to calculate which currently existing jobs are most at risk of being performed by machines in the near future. They have also devised a method for suggesting career transitions to jobs that are less at risk and require the smallest retraining effort.

“There are several studies predicting how many jobs will be automated by robots, but they all focus on software robots, such as speech and image recognition, financial robo-advisers, chatbots, and so forth. Furthermore, those predictions oscillate wildly depending on how job requirements and software abilities are assessed. Here, we consider not only artificial intelligence software, but also real intelligent robots that perform physical work, and we developed a method for a systematic comparison of human and robotic abilities used in hundreds of jobs”, says Prof. Dario Floreano, Director of EPFL’s Laboratory of Intelligent Systems, who led the study at EPFL.

The key innovation of the study is a new mapping of robot capabilities onto job requirements. The team looked into the European H2020 Robotic Multi-Annual Roadmap (MAR), a strategy document by the European Commission that is periodically revised by robotics experts. The MAR describes dozens of abilities that are required of current robots or may be required of future ones, organised in categories such as manipulation, perception, sensing, and interaction with humans. The researchers went through research papers, patents, and descriptions of robotic products to assess the maturity of each robotic ability, using the “technology readiness level” (TRL), a well-known scale for measuring the level of technology development.

For human abilities, they relied on the O*net database, a widely used resource on the US job market that classifies approximately 1,000 occupations and breaks down the skills and knowledge most crucial for each of them.

After selectively matching the human abilities from the O*net list to the robotic abilities from the MAR document, the team could calculate how likely each existing occupation is to be performed by a robot. Say, for example, that a job requires a human to work with millimetre-level precision of movement. Robots are very good at that, so the TRL of the corresponding ability is the highest. If a job requires enough such skills, it will be more likely to be automated than one that relies on abilities such as critical thinking or creativity.
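The gist of that calculation can be sketched in a few lines. The ability names and TRL values below are invented for illustration and are not the study’s actual mapping or data:

```python
# Hypothetical sketch of the risk calculation described above: each job lists
# the human abilities it requires; each ability is mapped to the TRL (1-9) of
# the closest matching robotic ability. All names and numbers are illustrative.

ABILITY_TRL = {
    "arm-hand steadiness": 9,   # precise manipulation: mature in robotics
    "near vision": 8,
    "critical thinking": 2,     # barely demonstrated in robots
    "originality": 1,
}

def automation_risk(required_abilities):
    """Average normalised TRL over a job's required abilities
    (0 = hard to automate, 1 = highly automatable)."""
    trls = [ABILITY_TRL[a] / 9 for a in required_abilities]
    return sum(trls) / len(trls)

meat_packer_risk = automation_risk(["arm-hand steadiness", "near vision"])
physicist_risk = automation_risk(["critical thinking", "originality"])
```

In this toy version the manual job scores far higher than the analytical one, mirroring the ranking the study reports.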

The result is a ranking of the roughly 1,000 jobs, with “Physicists” facing the lowest risk of being replaced by a machine and “Slaughterers and Meat Packers” the highest. In general, jobs in food processing, building and maintenance, and construction and extraction appear to be at the highest risk.

“The key challenge for society today is how to become resilient against automation,” says Prof. Rafael Lalive, who co-led the study at the University of Lausanne. “Our work provides detailed career advice for workers who face high risks of automation, which allows them to take on more secure jobs while re-using many of the skills acquired in their old job. Through this advice, governments can support society in becoming more resilient against automation.”

The authors then created a method to find, for any given job, alternative jobs that have a significantly lower automation risk and are reasonably close to the original one in terms of the abilities and knowledge they require – thus keeping the retraining effort minimal and making the career transition feasible. To test how that method would perform in real life, they used data from the US workforce and simulated thousands of career moves based on the algorithm’s suggestions, finding that it would indeed allow workers in the occupations with the highest risk to shift towards medium-risk occupations, while undergoing a relatively low retraining effort.
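The transition idea – among lower-risk jobs, pick the one whose ability profile is closest to the current job’s – can be sketched as a nearest-neighbour search. All job names, ability vectors, and risk scores below are made up for illustration:

```python
import math

# Illustrative sketch of the career-transition method described above: among
# jobs with a lower automation risk, suggest the one whose ability profile is
# closest to the current job's, keeping retraining effort small. All data here
# is invented, not the study's.

JOBS = {
    # name: (automation risk, ability-profile vector)
    "meat packer":    (0.92, [0.9, 0.8, 0.1, 0.1]),
    "butcher-retail": (0.55, [0.8, 0.7, 0.3, 0.2]),
    "food inspector": (0.40, [0.5, 0.6, 0.6, 0.4]),
    "physicist":      (0.05, [0.1, 0.2, 0.9, 0.9]),
}

def suggest_transition(current, max_risk):
    """Return the lower-risk job with the most similar ability profile."""
    _, profile = JOBS[current]
    candidates = [
        (name, math.dist(profile, p))           # Euclidean distance between profiles
        for name, (r, p) in JOBS.items()
        if name != current and r <= max_risk
    ]
    # smallest profile distance = smallest retraining effort
    return min(candidates, key=lambda c: c[1])[0]
```

Here `suggest_transition("meat packer", max_risk=0.6)` would pick the nearby butcher profile rather than the distant physicist one, even though the physicist job is safer.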

The method could be used by governments to measure how many workers face automation risks and to adjust retraining policies; by companies to assess the costs of increasing automation; by robotics manufacturers to better tailor their products to market needs; and by the public to identify the easiest route to reposition themselves on the job market.

Finally, the authors translated the new methods and data into an algorithm that predicts the risk of automation for hundreds of jobs and suggests resilient career transitions at minimal retraining effort, publicly accessible at http://lis2.epfl.ch/resiliencetorobots.

This research was funded by the CROSS (Collaborative Research on Science and Society) Program in EPFL’s College of Humanities; by the Enterprise for Society Center at EPFL; as a part of NCCR Robotics, a National Centres of Competence in Research, funded by the Swiss National Science Foundation (SNSF grant number 51NF40_185543); by the European Commission through the Horizon 2020 projects AERIAL-CORE (grant agreement no. 871479) and MERGING (grant agreement no. 869963); and by SNSF grant no. 100018_178878.

Helping drone swarms avoid obstacles without hitting each other

Enrica Soria, a PhD student at LIS © Alain Herzog / 2021 EPFL

By Clara Marc

There is strength in numbers. That’s true not only for humans, but for drones too. By flying in a swarm, they can cover larger areas and collect a wider range of data, since each drone can be equipped with different sensors.

Preventing drones from bumping into each other
One reason why drone swarms haven’t been used more widely is the risk of gridlock within the swarm. Studies on the collective movement of animals show that each agent tends to coordinate its movements with the others, adjusting its trajectory so as to keep a safe inter-agent distance or to travel in alignment, for example.

“In a drone swarm, when one drone changes its trajectory to avoid an obstacle, its neighbors automatically synchronize their movements accordingly,” says Dario Floreano, a professor at EPFL’s School of Engineering and head of the Laboratory of Intelligent Systems (LIS). “But that often causes the swarm to slow down, generates gridlock within the swarm or even leads to collisions.”

Not just reacting, but also predicting
Enrica Soria, a PhD student at LIS, has come up with a new method for getting around that problem. She has developed a predictive control model that allows drones not just to react to others in a swarm, but also to anticipate their own movements and predict those of their neighbors. “Our model gives drones the ability to determine when a neighbor is about to slow down, meaning the slowdown has less of an effect on their own flight,” says Soria. The model works by programming in simple, locally controlled rules, such as a minimum inter-agent distance to maintain, a set velocity to keep, or a specific direction to follow. Soria’s work has just been published in Nature Machine Intelligence.
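The local rules mentioned above can be illustrated with a minimal sketch. Note that the actual model solves a predictive optimisation over a time horizon; this one-step reactive version only shows the rule terms (separation, cruise speed, shared direction), and all gains and distances are invented values:

```python
import numpy as np

# Minimal one-step sketch of the locally controlled rules described above.
# Not the published predictive controller: no prediction horizon, toy gains.

MIN_DIST = 1.0                   # minimum inter-agent distance to maintain
CRUISE_SPEED = 0.5               # set speed to keep
GOAL_DIR = np.array([1.0, 0.0])  # shared direction to follow

def step(pos, vel, dt=0.1):
    """One velocity/position update for all drones (pos, vel: N x 2 arrays)."""
    new_vel = vel.copy()
    for i in range(len(pos)):
        steer = GOAL_DIR * CRUISE_SPEED          # follow the migration direction
        for j in range(len(pos)):
            if i == j:
                continue
            offset = pos[i] - pos[j]
            d = np.linalg.norm(offset)
            if d < MIN_DIST:                     # push away from too-close neighbours
                steer += offset / d * (MIN_DIST - d)
        new_vel[i] = steer
    return pos + new_vel * dt, new_vel

pos = np.array([[0.0, 0.0], [0.2, 0.0]])  # two drones, closer than MIN_DIST
vel = np.zeros_like(pos)
pos, vel = step(pos, vel)                 # after one step they have separated
```

Each drone computes its command from neighbours it can sense locally, which is what lets the swarm operate without a central computer issuing trajectories.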

With Soria’s model, drones are much less dependent on commands issued by a central computer. Drones in aerial light shows, for example, get their instructions from a computer that calculates each one’s trajectory to avoid a collision. “But with our model, drones are commanded using local information and can modify their trajectories autonomously,” says Soria.

A model inspired by nature
Tests run at LIS show that Soria’s system improves the speed, order and safety of drone swarms in areas with a lot of obstacles. “We don’t yet know if, or to what extent, animals are able to predict the movements of those around them,” says Floreano. “But biologists have recently suggested that the synchronized direction changes observed in some large groups would require a more sophisticated cognitive ability than what has been believed until now.”
