Robots and Chatbots

Before ChatGPT, human-looking robots defined AI in the public imagination. That might be true again in the near future. With AI models online, it’s awesome to have AI automate our writing and art, but we still have to wash the dishes and chop the firewood.

That may change soon. AI is finding bodies fast as AI and Autonomy merge. Autonomy (the field I lead at Boeing) is made of three parts: code, trust and the ability to interact with humans.

Let’s start with code. Code is getting easier to write, and new tools are accelerating development across the board. You can crank out Python scripts, tests, and web apps fast, but the really exciting superpowers are the ones that let you create AI software. Unsupervised learning allows code to be grown rather than written: expose sensors to the real world and let the model weights adapt into a high-performance system.

Recent history is well known. Frank Rosenblatt’s work on perceptrons in the 1950s set the stage. In the 1980s, Geoffrey Hinton and David Rumelhart’s popularization of backpropagation made training deep networks feasible.

The real game-changer came with the rise of powerful GPUs, thanks to companies like NVIDIA, which allowed for processing large-scale neural networks. The explosion of digital data provided the fuel for these networks, and deep learning frameworks like TensorFlow and PyTorch made advanced models more accessible.

In the early 2010s, building on Hinton’s earlier work on deep belief networks, the success of AlexNet in the 2012 ImageNet competition demonstrated the potential of deep learning. This was followed by the introduction of transformers in 2017 by Vaswani and colleagues, which revolutionized natural language processing with the attention mechanism.

Transformers allow models to focus on relevant parts of the input sequence dynamically and process the data in parallel. This mechanism helps models understand the context and relationships within data more effectively, leading to better performance in tasks such as translation, summarization, and text generation. This breakthrough has enabled the creation of powerful language models, transforming language applications and giving us magical software like BERT and GPT.
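
To make the attention idea concrete, here is a minimal sketch of scaled dot-product attention in plain Python/NumPy. Shapes and names are illustrative only, not any particular library’s API.

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: arrays of shape (seq_len, d_k)."""
    d_k = Q.shape[-1]
    # Score every pair of positions: how relevant is token j to token i?
    scores = Q @ K.T / np.sqrt(d_k)                      # (seq_len, seq_len)
    # Softmax turns scores into attention weights that sum to 1 per row.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    # Each output is a weighted mix of the value vectors, computed for all
    # positions at once -- the parallelism that makes transformers fast.
    return weights @ V

x = np.random.randn(4, 8)                                # 4 tokens, d_k = 8
print(scaled_dot_product_attention(x, x, x).shape)       # (4, 8) self-attention
```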

The impact of all this is that you can build a humanoid robot by moving its arms and legs in diverse enough ways to grow the AI inside. (This is called sensor-to-servo machine learning.)
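
As a rough illustration of the sensor-to-servo idea, simplified here to supervised regression on logged data with made-up dimensions, a sketch might look like this:

```python
import torch
import torch.nn as nn

sensor_dim, servo_dim = 64, 12        # assumed sizes, for illustration only

policy = nn.Sequential(
    nn.Linear(sensor_dim, 128), nn.ReLU(),
    nn.Linear(128, 128), nn.ReLU(),
    nn.Linear(128, servo_dim),        # predicted servo commands
)

# Stand-in for a real log of (sensor reading, servo command) pairs gathered
# while the robot is moved through diverse motions.
sensors = torch.randn(10_000, sensor_dim)
servos = torch.randn(10_000, servo_dim)

opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
for step in range(1_000):
    idx = torch.randint(0, sensors.shape[0], (256,))
    loss = nn.functional.mse_loss(policy(sensors[idx]), servos[idx])
    opt.zero_grad()
    loss.backward()                   # "grow" the weights to fit the behavior
    opt.step()
```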

This all gets very interesting with the arrival of multimodal models that combine language, vision, and sensor data. Vision-Language-Action Models (VLAMs) enable robots to interpret their environment and predict actions based on combined sensory inputs. This holistic approach reduces errors and enhances the robot’s ability to act in the physical world: combining vision and language processing with robotic control lets a robot interpret complex instructions and carry them out.

PaLM-E from Google Research provides an embodied multimodal language model that integrates sensor data from robots with language and vision inputs. This model is designed to handle a variety of tasks involving robotics, vision, and language by transforming sensor data into a format compatible with the language model. PaLM-E can generate plans and decisions directly from these multimodal inputs, enabling robots to perform complex tasks efficiently. The model’s ability to transfer knowledge from large-scale language and vision datasets to robotic systems significantly enhances its generalization capabilities and task performance.
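
A hedged sketch of that idea, with every module name and size invented for illustration: continuous image and sensor features get projected into the language model’s embedding space and interleaved with the text prompt, so one model attends over all of it.

```python
import torch
import torch.nn as nn

d_model = 512                               # assumed LM embedding width
text_embed = nn.Embedding(32_000, d_model)  # stand-in for the LM's token embeddings
image_proj = nn.Linear(2048, d_model)       # stand-in for a ViT feature projector
sensor_proj = nn.Linear(32, d_model)        # stand-in for a proprioception encoder

def build_multimodal_prompt(text_ids, image_feats, sensor_feats):
    """Return one sequence of embeddings a decoder-only LM can attend over."""
    text_tokens = text_embed(text_ids)          # (T, d_model)
    image_tokens = image_proj(image_feats)      # (N_img, d_model)
    sensor_tokens = sensor_proj(sensor_feats)   # (N_sens, d_model)
    # Interleave however the prompt template dictates; here: image, sensors, text.
    return torch.cat([image_tokens, sensor_tokens, text_tokens], dim=0)

seq = build_multimodal_prompt(
    torch.randint(0, 32_000, (16,)),   # token ids for "pick up the green block ..."
    torch.randn(4, 2048),              # four image patch features
    torch.randn(1, 32),                # one joint-state reading
)
print(seq.shape)                       # torch.Size([21, 512])
```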

So code is getting awesome; now let’s talk about trust, because explainability is exploding too. When models, including embodied AI in robots, can explain their actions, they are easier to program, debug, and, most importantly, trust. There has been some great work in this area. I’ve used interpretable models, attention mechanisms, saliency maps, and post-hoc explanation techniques like LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations). I got to be on the ground floor of DARPA’s Explainable Artificial Intelligence (XAI) program, but Anthropic really surprised me last week with their paper “Scaling Monosemanticity: Extracting Interpretable Features from Claude 3 Sonnet.”
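
For a flavor of what the post-hoc techniques look like in practice, here is a minimal SHAP example on a toy scikit-learn model. The dataset and model are placeholders; the point is the explanation step, not the task.

```python
import shap
from sklearn.datasets import load_diabetes
from sklearn.ensemble import RandomForestRegressor

X, y = load_diabetes(return_X_y=True, as_frame=True)
model = RandomForestRegressor(n_estimators=200, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles.
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X.iloc[:100])   # (100, n_features)

# Which features pushed each prediction up or down?
shap.summary_plot(shap_values, X.iloc[:100])
```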

In that paper, Anthropic identified specific combinations of neurons within Claude 3 Sonnet that activate in response to particular concepts or features. For instance, when Claude encounters text or images related to the Golden Gate Bridge, a specific set of neurons becomes active. This discovery is pivotal because it allows researchers to precisely tune these features, increasing or decreasing their activation and observing corresponding changes in the model’s behavior.

When the activation of the “Golden Gate Bridge” feature is increased, Claude’s responses heavily incorporate mentions of the bridge, regardless of the query’s relevance. This demonstrates the ability to control and predict the behavior of the AI based on feature manipulation. For example, queries about spending money or writing stories all get steered towards the Golden Gate Bridge, illustrating how tuning specific features can drastically alter output.
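
Conceptually (and this is my sketch, not Anthropic’s code), steering boils down to adding a scaled feature direction to a layer’s activations. The model, layer choice, and feature vector below are all hypothetical.

```python
import torch

def make_steering_hook(feature_direction: torch.Tensor, strength: float):
    """Forward hook that nudges a layer's output along one feature direction."""
    def hook(module, inputs, output):
        # Assumes the hooked module returns a plain activation tensor.
        return output + strength * feature_direction   # broadcasts across tokens
    return hook

# Hypothetical usage with some decoder-only transformer `model`:
#   layer = model.transformer.h[20]              # pick a middle block
#   handle = layer.register_forward_hook(
#       make_steering_hook(golden_gate_direction, strength=8.0))
#   ... generate text and watch completions drift toward the bridge ...
#   handle.remove()                              # restore normal behavior
```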

So this is all fun, but these techniques have significant implications for AI safety and reliability. By understanding and controlling feature activations, researchers can manage safety-related features such as those linked to dangerous behaviors, criminal activity, or deception. This control could help mitigate risks associated with AI and ensure models behave more predictably and safely. This is a critical capability to enable AI in physical systems. Read the paper, it’s incredible.

OpenAI is doing stuff too. In 2019, they introduced activation atlases, which build on the concept of feature visualization. This technique allows researchers to map out how different neurons in a neural network activate in response to specific concepts. For instance, they can visualize how a network distinguishes between frying pans and woks, revealing that the presence of certain foods, like noodles, can influence the model’s classification. This helps identify and correct spurious correlations that could lead to errors or biases in AI behavior.
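
The underlying trick, feature visualization, is easy to sketch: start from noise and ascend the gradient of one channel’s activation to see what excites it. Real tooling adds regularizers and image transformations; this shows only the core loop, with an arbitrary layer and channel chosen for illustration.

```python
import torch
from torchvision.models import resnet50, ResNet50_Weights

model = resnet50(weights=ResNet50_Weights.DEFAULT).eval()

acts = {}
model.layer3.register_forward_hook(lambda m, i, o: acts.update(out=o))  # grab activations

img = torch.randn(1, 3, 224, 224, requires_grad=True)   # start from noise
opt = torch.optim.Adam([img], lr=0.05)
channel = 42                                             # arbitrary feature to visualize

for _ in range(200):
    model(img)
    loss = -acts["out"][0, channel].mean()   # maximize this channel's mean activation
    opt.zero_grad()
    loss.backward()
    opt.step()
# `img` now approximates the pattern this channel responds to.
```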

The final accelerator is the ability to learn quickly through imitation and generalize skills across different tasks. This is critical because the core skills needed to interact with the real world are flexibility and adaptability. You can’t expose a model during training to every scenario it will encounter in the real world. Models like RT-2 leverage internet-scale data to perform tasks they were not explicitly trained for, showing impressive generalization and emergent capabilities.

RT-2 also has an RT-X variant, RT-2-X, trained as part of the Open X-Embodiment project, which combines data from multiple robotic platforms to train generalizable robot policies. By leveraging a diverse dataset of robotic experiences, the RT-X models demonstrate positive transfer, improving the capabilities of multiple robots through shared learning. This approach allows them to generalize skills across different embodiments and tasks, making them highly adaptable to varied real-world scenarios.

I’m watching all this very closely, and it’s super cool: as AI escapes the browser and starts improving our physical world, there are all kinds of lifestyle and economic benefits around the corner. Of course, there are lots of risks too. I’m proud to be working at a company and in an industry obsessed with ethics and safety. All considered, I’m extremely optimistic, if also more than a little tired trying to track the various actors on a stage that keeps changing, with no one sure what the next act will be.


Power and AI

The Growing Power Needs for Large Language Models

In 2024, AI is awesome, empowering, and available to everyone. Unfortunately, while AI is free to consumers, these models are expensive to train and operate at scale. Training frontier models is on track to become one of the most expensive engineering efforts in history, drawing comparisons to the Manhattan Project and the Apollo program. No wonder companies with their own massive compute are dominating this space.

By 2025, the cost to train a frontier LLM is projected to surpass that of the Apollo program, a historical benchmark for massive expenditure. This projection emphasizes the increasing financial burden and resource demands of advancing AI capabilities, underscoring the need for more efficient and sustainable approaches to AI research and development. The data points to a future where the financial and energy requirements of AI could become unsustainable without significant technological breakthroughs or shifts in strategy.

Why?

Because of how deep learning works and how it’s trained. The first era of AI compute growth, marked by steady progress, followed a trend aligned with Moore’s Law, with training compute doubling approximately every two years. Notable milestones during this period include early models like the Perceptron and later advances such as NETtalk and TD-Gammon.

The second era, beginning around 2012 with the advent of deep learning, shows a dramatic increase in compute usage, following a much steeper trajectory in which training compute doubles approximately every 3.4 months. This surge is driven by the development of more complex models like AlexNet, ResNets, and AlphaGo Zero. Key factors behind the acceleration include massive datasets, advances in GPUs and specialized hardware, and significant investment in AI research. As models have become more sophisticated, the demand for computational resources has skyrocketed, driving innovation and an increased emphasis on sustainable, efficient energy sources to support the growth.

Training LLMs involves massive computational resources. For instance, models like GPT-3, with 175 billion parameters, require extensive parallel processing using GPUs. Training such a model on a single Nvidia V100 GPU would take an estimated 288 years, emphasizing the need for large-scale distributed computing setups to make the process feasible in a reasonable timeframe. This leads to higher costs, both financially and in terms of energy consumption.
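
A back-of-envelope check using the common approximation that training compute is roughly 6 × parameters × tokens lands in the same ballpark as that single-GPU figure. The token count and utilization below are assumptions, so treat this as order-of-magnitude only.

```python
params = 175e9             # GPT-3 parameter count
tokens = 300e9             # approximate training tokens (assumption)
flops = 6 * params * tokens                 # ~3.15e23 FLOPs total

v100_peak = 125e12         # V100 FP16 tensor-core peak, FLOP/s
utilization = 0.30         # assumed sustained fraction of peak
seconds = flops / (v100_peak * utilization)
print(seconds / (3600 * 24 * 365))          # roughly 266 years on one GPU
```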

Recent studies have highlighted the dramatic increase in computational power needed for AI training, which is rising at an unprecedented rate. Between 2012 and 2018, the compute used in the largest training runs grew roughly 300,000-fold, underscoring the escalating costs associated with these advancements. This increase not only affects financial expenditures but also contributes to higher carbon emissions, posing environmental concerns.
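
The arithmetic behind the two eras is easy to sanity-check: 300,000-fold growth is about 18 doublings, which at one doubling every 3.4 months spans roughly five years, whereas Moore’s-Law-style doubling would deliver only a handful of doublings over the same period.

```python
import math

growth = 300_000                        # reported compute growth since 2012
doublings = math.log2(growth)           # ~18.2 doublings
years = doublings * 3.4 / 12            # ~5.2 years at one doubling per 3.4 months
moores_law = 2 ** (years * 12 / 24)     # ~6x over the same span at 24-month doublings
print(doublings, years, moores_law)
```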

Infrastructure and Efficiency Improvements

To address these challenges, companies like Cerebras and Cirrascale are developing specialized infrastructure solutions. For example, Cerebras’ AI Model Studio offers a rental model that leverages clusters of CS-2 nodes, providing a scalable and cost-effective alternative to traditional cloud-based solutions. This approach aims to deliver predictable pricing and reduce the costs associated with training large models.

Moreover, researchers are exploring various optimization techniques to improve the efficiency of LLMs. These include model approximation, compression strategies, and innovations in hardware architecture. For instance, advancements in GPU interconnects and supercomputing technologies are critical to overcoming bottlenecks related to data transfer speeds between servers, which remain a significant challenge.
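
As one concrete example of a compression strategy, post-training dynamic quantization in PyTorch stores Linear weights as int8, which can shrink memory use and speed up CPU inference. A toy model stands in for a real LLM here.

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Linear(4096, 4096), nn.ReLU(), nn.Linear(4096, 4096))

quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8    # quantize only the Linear layers
)

x = torch.randn(1, 4096)
print(quantized(x).shape)                    # same interface, much smaller weights
```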

Implications for Commodities and Nuclear Power

The increasing power needs for AI training have broader implications for commodities, particularly in the energy sector. As AI models grow, the demand for electricity to power the required computational infrastructure will likely rise. This could drive up the prices of energy commodities, especially in regions where data centers are concentrated. Additionally, the need for advanced hardware, such as GPUs and specialized processors, will impact the supply chains and pricing of these components.

To address the substantial energy needs of AI, particularly in powering the growing number of data centers, various approaches are being considered. One notable strategy involves leveraging nuclear power. This approach is championed by tech leaders like OpenAI CEO Sam Altman, who views AI and affordable, green energy as intertwined essentials for a future of abundance. Nuclear startups, such as Oklo, which Altman supports, are working on advanced nuclear reactors designed to be safer, more efficient, and smaller than traditional plants. Oklo’s projects include a 15-megawatt fission reactor and a grant-supported initiative to recycle nuclear waste into new fuel.

However, integrating nuclear energy into the tech sector faces significant regulatory challenges. The Nuclear Regulatory Commission (NRC) denied Oklo’s application for its Idaho plant design due to insufficient safety information, and the Air Force rescinded a contract for a microreactor pilot program in Alaska. These hurdles highlight the tension between the rapid development pace of AI technologies and the methodical, decades-long process traditionally required for nuclear energy projects.

The demand for sustainable energy solutions is underscored by the rising energy consumption of AI servers, which could soon exceed the annual energy use of some small nations. Major tech firms like Microsoft, Google, and Amazon are investing heavily in nuclear energy to secure stable, clean power for their operations. Microsoft has agreements to buy nuclear-generated electricity for its data centers, while Google and Amazon have invested in fusion startups.
