Amazon Nova Act: Advancing Intelligent AI Agents for the Web
Wednesday, Apr 2, 2025

Amazon has unveiled Nova Act, a sophisticated AI model crafted to empower intelligent agents capable of performing tasks within web browsers.
While large language models brought the idea of agents to life as query-answering or information-retrieving tools through methods like Retrieval-Augmented Generation (RAG), Amazon has a broader vision. The company describes agents as more than responders: they are meant to carry out concrete, multi-step tasks across varied digital and physical domains.
Amazon aspires to have agents that can manage a range of intricate, multi-step activities such as wedding planning or solving complex IT issues to boost business efficiency, according to the company.
Present-day solutions frequently fall short, with many agents needing constant human oversight while their usefulness hinges on extensive API integration, something that's not viable for all tasks. Nova Act serves as Amazon's remedy to these challenges.
Along with the model, Amazon is providing a research preview of the Amazon Nova Act SDK. With the SDK, developers can build agents that automate web tasks such as sending out-of-office notifications, managing calendar events, or setting up automatic email replies.
The SDK's goal is to simplify complicated workflows into reliable atomic commands such as searching, checking out, or interacting with specific interface elements like dropdowns or popups. Developers can elaborate on these commands to direct an agent, for instance, to bypass an insurance offer during checkout.
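As a rough illustration of this idea, the sketch below breaks a checkout workflow into small, explicit commands. It does not use the Nova Act SDK itself: `StubAgent`, its `act` method, and `buy_item` are hypothetical stand-ins invented for this example, and the stub only records the instructions it receives.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class StubAgent:
    """Hypothetical stand-in for a browser agent; it only records commands."""
    log: List[str] = field(default_factory=list)

    def act(self, instruction: str) -> None:
        # A real agent would drive the browser here; this stub just logs.
        self.log.append(instruction)


def buy_item(agent: StubAgent, item: str) -> List[str]:
    """Express one checkout workflow as a sequence of small, explicit steps."""
    agent.act(f"search for {item}")
    agent.act("select the first result")
    agent.act("add the item to the cart")
    # The extra detail steers the agent past the insurance upsell.
    agent.act("proceed to checkout; do not accept the insurance offer")
    return agent.log


if __name__ == "__main__":
    for step in buy_item(StubAgent(), "coffee maker"):
        print(step)
```

Keeping each command narrow makes it easier to see which step failed and to add detail only where the agent needs it.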
To further boost precision, the SDK supports browser manipulation using tools like Playwright, API requests, Python integrations, and parallel threading to deal with web page loading delays.
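To show why parallel threads help with slow page loads, here is a minimal sketch using Python's standard `concurrent.futures` module. The `run_session` function is a hypothetical placeholder that simulates a page-load delay with `time.sleep`; it is not part of the Nova Act SDK, and the site names are made up.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List


def run_session(site: str) -> str:
    """Hypothetical placeholder for one browser session on one site."""
    time.sleep(0.1)  # simulated page-load delay
    return f"done: {site}"


def run_in_parallel(sites: List[str]) -> Dict[str, str]:
    """Run one session per site in its own thread and collect the results."""
    with ThreadPoolExecutor(max_workers=max(1, len(sites))) as pool:
        # pool.map yields results in the same order as the inputs.
        results = pool.map(run_session, sites)
        return dict(zip(sites, results))


if __name__ == "__main__":
    print(run_in_parallel(["site-a.example", "site-b.example", "site-c.example"]))
```

Because each session spends most of its time waiting, the threads overlap their delays instead of running them back to back.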
In contrast to other generative models that offer mediocre precision on complex tasks, Nova Act emphasizes trustworthiness. Amazon highlights the model's scores of over 90% in internal evaluations for specific capabilities that commonly pose difficulties for other models.
Nova Act came close to a perfect score with 0.939 on the ScreenSpot Web Text benchmark, which evaluates natural language directions for text-based tasks, such as changing font sizes. Rivals like Claude 3.7 Sonnet (0.900) and OpenAI's CUA (0.883) trail by notable margins.
Likewise, Nova Act scored 0.879 in the ScreenSpot Web Icon benchmark, which tests interactions with visual elements like rating stars or icons. Although the GroundUI Web test, aimed at evaluating an AI's ability to navigate various user interface elements, showed Nova Act slightly lagging behind competitors, Amazon sees this as a promising area for improvement as the model evolves.
Amazon emphasizes its commitment to delivering practical reliability. Once an agent created with Nova Act performs as intended, developers can deploy it in a headless mode, integrate it as an API, or even schedule its tasks to run asynchronously. In an example use case, an agent automatically orders a salad for delivery every Tuesday evening without needing ongoing user input.
One of Nova Act's distinguishing features is its capability to transfer its understanding of user interfaces to new settings with minimal extra training. Amazon shared a case where Nova Act performed exceptionally well in browser-based games, even though it wasn't trained on video game scenarios. This adaptability positions Nova Act as a versatile solution for diverse applications.
This feature is already being utilized within Amazon's own ecosystem. Through Alexa+, Nova Act facilitates self-directed web navigation to perform tasks for users, even when API access falls short. This marks progress towards more intelligent AI assistants that can operate independently and put their skills to use in dynamic ways.
Amazon asserts that Nova Act is just the beginning of a wider effort to craft smart, reliable AI agents capable of handling progressively complex, multi-step activities.
Moving beyond basic instructions, Amazon focuses on training agents through reinforcement learning across diverse, real-world situations rather than simplistic demonstrations. This foundational model acts as a checkpoint in the long-term training curriculum for Nova models, reflecting the company's ambition to revolutionize the AI agent industry.
"The most valuable use cases for agents haven't been realized yet," Amazon stated. "Top developers and designers will uncover them. This research preview of our Nova Act SDK allows us to iterate alongside these creators through swift prototyping and iterative feedback."
Nova Act represents progress in making AI agents truly beneficial for intricate digital tasks. By rethinking benchmarks and prioritizing reliability, its design philosophy focuses on enabling developers to exceed what's achievable with today's tools.