Amazon Launches Nova Act to Advance AI Agent Tech


Amazon has introduced Nova Act, a new AI model built to push the boundaries of what intelligent agents can do, especially when it comes to performing complex tasks in web browsers without constant human input or extensive API integration.

While many AI models today rely on Retrieval-Augmented Generation (RAG) to answer queries or fetch information, Amazon’s ambition stretches beyond that. With Nova Act, the company aims to recast AI agents from passive responders into proactive digital workers capable of handling multi-step tasks in both virtual and real-world environments.

“Our dream is for agents to take on highly involved tasks, whether it’s planning a wedding or managing IT workflows for enterprises,” Amazon shared.

Most current-generation AI agents hit roadblocks when required to work autonomously. Many depend on detailed APIs or require human supervision to function reliably. Nova Act aims to remove those dependencies by delivering a browser-native agent that is reliable, precise, and capable of executing workflows independently.

As part of this vision, Amazon also launched a research preview of the Nova Act SDK, which allows developers to prototype and build agents that can automate everyday browser tasks like:

  • Sending out-of-office emails
  • Scheduling calendar events
  • Navigating checkouts (while skipping upsells)
  • Interacting with dropdowns, icons, or popups

At the heart of Nova Act’s automation capabilities lies the concept of “atomic commands”—precise instructions that break down complex workflows into manageable, executable parts. These can be enhanced through detailed parameters and integrated with Playwright, API calls, Python scripts, and parallel threading to optimize responsiveness and handle page load delays.
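To make the idea of atomic commands concrete, here is a minimal Python sketch of splitting workflows into small steps and running independent workflows in parallel threads. It does not use the Nova Act SDK: `StubAgent`, `run_workflow`, `run_all`, and the example commands are hypothetical stand-ins invented for illustration, and the stub only records each command instead of driving a browser.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass, field


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records commands."""
    name: str
    log: list = field(default_factory=list)

    def act(self, command: str) -> str:
        # A real agent would drive a browser here; this stub just logs the step.
        self.log.append(command)
        return f"{self.name}: done: {command}"


def run_workflow(name: str, commands: list) -> list:
    """Run one workflow as an ordered sequence of small, atomic commands."""
    agent = StubAgent(name)
    return [agent.act(command) for command in commands]


# Each workflow is split into short, specific steps rather than one broad prompt.
# The steps themselves are made-up examples.
WORKFLOWS = {
    "checkout": [
        "add the first search result to the cart",
        "decline the extended warranty offer",
        "proceed to checkout",
    ],
    "calendar": [
        "open the calendar",
        "create an event titled 'Team sync' for Friday at 10 am",
    ],
}


def run_all(workflows: dict) -> dict:
    """Run independent workflows in parallel threads, one agent per workflow."""
    with ThreadPoolExecutor(max_workers=len(workflows)) as pool:
        futures = {
            name: pool.submit(run_workflow, name, steps)
            for name, steps in workflows.items()
        }
        return {name: future.result() for name, future in futures.items()}


if __name__ == "__main__":
    for name, results in run_all(WORKFLOWS).items():
        print(name, results)
```

Steps inside a workflow stay sequential because each one depends on the page state left by the previous step. Only workflows that do not share state are run side by side, so one slow page load does not block the others.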

Amazon says Nova Act sets itself apart not just with its broad capabilities but also its exceptional reliability—a key differentiator in today’s crowded AI space.

Amazon reports that the model scored above 90% on its internal evaluations and that it leads popular competitors on some benchmarks. Highlights include:

  • 0.939 on the ScreenSpot Web Text benchmark, which tests accuracy on text-based interactions (e.g., adjusting font sizes). Claude 3.7 Sonnet and OpenAI’s CUA trailed at 0.900 and 0.883, respectively.
  • 0.879 on the ScreenSpot Web Icon benchmark, measuring performance in identifying and interacting with UI icons.

Even in tests where Nova Act slightly lagged, such as the GroundUI Web benchmark, which challenges agents on a range of interface complexities, Amazon sees an opportunity for further iteration.

The company emphasizes that real-world dependability is more important than simply achieving high test scores. That’s why Nova Act agents can be deployed headlessly, integrated as APIs, or scheduled to run asynchronous automations, such as ordering food every Tuesday without needing user prompts.
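As an illustration of the "every Tuesday" scheduling idea, the following Python sketch computes when a weekly job should next run. It is independent of Nova Act: the function name `next_weekly_run` and the 18:00 default run time are assumptions made for this example, and a real deployment would hand the resulting time to whatever scheduler launches the headless agent.

```python
from datetime import datetime, timedelta

TUESDAY = 1  # datetime.weekday(): Monday is 0, Tuesday is 1


def next_weekly_run(now: datetime, weekday: int = TUESDAY, hour: int = 18) -> datetime:
    """Return the next run time strictly after `now` for a weekly job.

    The 18:00 default is an arbitrary choice for illustration.
    """
    # Start from today's date at the target hour, then move forward to the
    # target weekday (0 to 6 days ahead).
    candidate = now.replace(hour=hour, minute=0, second=0, microsecond=0)
    candidate += timedelta(days=(weekday - now.weekday()) % 7)
    # If that moment has already passed (or is exactly now), wait a full week.
    if candidate <= now:
        candidate += timedelta(days=7)
    return candidate


if __name__ == "__main__":
    print(next_weekly_run(datetime.now()))
```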

Amazon also highlights the model’s ability to generalize to new environments. In tests, it performed tasks in browser-based games despite never being explicitly trained on them. This adaptive UI understanding could support a wide range of use cases, from consumer productivity to enterprise automation.

Nova Act is already making its way into Alexa+, where it enables browser-based navigation even in the absence of full API access, extending what AI assistants can accomplish without human intervention.

This launch marks the first phase in Amazon’s long-term vision to develop truly intelligent, scalable AI agents. Instead of overfitting models to narrow demonstrations, Amazon is training Nova agents through reinforcement learning in dynamic environments. This approach builds agents that are not only accurate, but resilient and practical in real-world applications.

“The most valuable use cases for agents haven’t even been imagined yet,” Amazon stated. “This research preview is an invitation to developers to build the future with us.”
