5 Groundbreaking Features in Amazon's Nova LLM You Need to Try Now
About This Episode
Discover how the groundbreaking Nova Lite, Nova Pro, and especially Nova Act models are reshaping what’s possible with AI-driven tools.
From handling multimodal inputs to achieving superior accuracy in complex web interactions, these models represent a significant leap forward in large language model capabilities.
We explore five game-changing features including Nova Act’s advanced browser automation, parallel task execution, and seamless SDK integration for developers.
Whether it's generating photorealistic images, processing multi-step online actions, or managing web-based tasks with near-human precision, the Nova platform sets a new standard for AI agents.
Tune in to learn how to harness these tools and what this means for the future of AI-powered productivity.
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
⏰ TIMESTAMPS:
0:00 - Introduction To Nova Act
0:55 - Overview Of Amazon’s LLMs
1:37 - Browser Automation Capabilities
2:35 - SDK Integration With Python
3:36 - Real World Use Case Demos
5:03 - Voice Commands And Alexa Integration
6:04 - Image And Video Generation Tools
7:00 - Model Comparisons And Onboarding
7:57 - Image Editing With AI
8:14 - Final Thoughts And Experiment Ideas
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Sign up for free ➡️ https://link.jotform.com/j2tl15Ck1r
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Follow us on:
Twitter ➡️ https://x.com/aiagentspodcast
Instagram ➡️ https://www.instagram.com/aiagentspodcast
TikTok ➡️ https://www.tiktok.com/@aiagentspodcast
▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬▬
Transcript
It's public to try out. I know I keep mentioning that with the SDKs. And what I find interesting as well, and this comes from the article, is that Nova Act is the first step in their vision for building the key capabilities that will ensure useful agents at scale. It's an early checkpoint from a much larger training curriculum they are pursuing with Nova models to truly make agents smart and reliable for increasingly complex multi-step tasks. It's saying now that with the Alexa AI web action SDK, you can now take actions on the web, right? Hi, my name is Demetri Panici and I'm a content creator, agency owner, and AI enthusiast. You're listening to the AI Agents Podcast, brought to you by Jotform and featuring our very own CEO and founder, Aytekin Tank. This is the show where artificial intelligence meets innovation, productivity, and the tools
shaping the future of work. Enjoy the show. So, Amazon actually just launched its new LLM platform featuring multiple models, including Nova Lite, Nova Pro, and Nova Act, with support for over 200 languages, exceeding the typical offering of around 100 languages. The platform is designed to cater to a wide array of applications and boasts a context window of over 300,000 tokens, with future plans to expand this to 2 million tokens. Of those three models that I just mentioned, Nova Lite is essentially their mid-tier model. Nova Pro is a more advanced model suitable for complex tasks and multimodal inputs, which means text, images, and videos. And Nova Act is an innovative AI agent capable of interacting with web browsers, performing tasks like form filling and navigation with high accuracy. So Nova Act is the most interesting one here from my perspective, because Nova Act essentially has sophisticated browser
automation, right? So, Nova Act can handle complex web interactions such as multi-step processes and visual recognition of elements like drop-down menus. In benchmark tests, we've seen really high performance out of this. It achieves 93.9% accuracy in text interactions, outpacing models like Claude 3.7 and OpenAI's offerings. So, as you can see, they're using Nova Act to build out these agents, where it can interact similarly to what we've seen with other tools, but at a higher level. So, it's going to break down how to complete the task, consider the outcome of each step, and then plan on to the next one. I just think it's incredible that this is even possible. We were in a world a year or two ago where we weren't seeing things like this. So, the SDK can integrate seamlessly with all your different Python
tools out there. And it's just going to be way easier to do these things if you are in the development category of things. So, for example, what it's doing here is it's going to extract that information and output JSON based on the different categories of things. So we'll see the output here in a second. The request here is to find the nearest Caltrain station, make sure it's within biking distance, then use Google Maps to calculate the distance. So what it's now doing is it's doing it in parallel rather than one at a time. So you can use something called a thread pool to run it in multiple browsers. And what's really intriguing about this is that this is something that would take hours for you to do yourself, but is done instantaneously with this new model. The SDK is essentially yours to explore at this point. You
can see the comparisons that it has in regards to the other ones. I said that it had a higher score: Amazon Nova Act has this 0.939 versus 0.9 and 0.883. So these things are getting better at understanding what's going on on the screen. Same with web icon. As these models improve, I'm going to be interested to see what approaches can be taken because, I don't know, works could just be different. It can handle date pickers. It can handle search filters. It can handle drop-downs. These are all incredible. Obviously, they use personal use cases. You can see as well that they've recently showcased a couple of different things on their website that talk about it from a personal use case perspective. Obviously, the last use case we just went through was personal. But this is interesting. This is essentially trying
to get a Sweetgreen delivery on Tuesday night. And I mean, I appreciate this. I don't really understand why they're using personal use cases, because you're probably not going to get individual people to pay for this. But we're going through the process of getting a Sweetgreen bowl here. I love the Shroom bowl, actually. It's very good. And I'm just entertained at the fact that (a) salads cost that much and (b) you're able to have this order be delivered through this AI. Like, that's cool, but what's the reason for someone to want to use this personally? It's public to try out. I know I keep mentioning that with the SDKs. And what I find interesting as well is that with the Alexa AI web action SDK, you can now take actions on the web, right? So maybe you can imagine a world where, for all those personal use cases, we could be
like, "Hey Alexa, please order me XYZ thing," or you can be like, "Hey Alexa, order me a new camera mount from Amazon," or "Order me Sweetgreen, this specific thing that I want from Sweetgreen. Get me that." It's cool. It's a little disconcerting, though. I mean, I'm one of those people who have an awareness of the fact that a credit card makes it easier for you to spend money. And now we're going to be able to just ask the AI, and not even go through the clicking process ourselves; it'll just do it by itself. I think that could be a step too far, but that's not for me to judge, I guess. So, what you can do with Nova, though, is you can use Pro, Lite, Micro, right? It's pretty much just another language model. You can try these
different prompts. "Summarize the main events of World War II," for example. I'm going to get to another tab for exploring the video gallery and stuff like that. So, "World War II, the deadliest and most..." Okay, that's a little deep. So, it's got sources from different websites. That's cool. You can ask it to regenerate. Copy. Similar layout to a lot of things. Okay. So, we've got generate image here. I'm trying "some bread and pastry and milk on a dining table with a toaster in the center, photorealistic 4K image." All right. And then Nova Reel. What is this? It's a video generation model that supports the creation of short videos from text and image input. What? How do I get access to this? "Available now in Amazon Bedrock." I see. It's not really available for public consumption in the same way, but that's cool. And then you
can actually do a side-by-side comparison of multiple models. That's the first time I've seen that. Use live information. Same thing for searching. "What was the reason Duke basketball lost in the semifinals this year?" Or I guess Final Four, whichever. So we've got Nova Lite, Nova Pro. Yeah, that was an embarrassing one. So as you can see here on the left-hand side, it's a little bit longer, a little bit more context, and it broke it out. This one was just paragraphs. There were a lot of turnovers and missed opportunities. Writing style seems decently similar. "Myriad" is definitely the most AI term I've heard in a while. Business jargon nonsense. But overall, they use the same sources, and I think it was pretty good on both sides. There's obviously the Micro version, which is not the multimodal option. And then here, what you'll see is they have an entire onboarding guide
for using Act, which is interesting. And I could walk through this with you. However, we don't have the most technical audience, but I do think it was interesting to point it out for those of you that are interested in this. Amazon's getting into the AI game, peeps. A little bit weird. Oh, look at that. The image was created. Convert to video. What? Takes a couple minutes, but still, that's cool. And by the way, you can edit this image: remove the background, replace the background, generate variations, replace objects, shape the image. All of these are much more in-depth editing capabilities than the other platforms that exist. Wow. So, I can, for example, replace the milk on the left next to the bread with orange juice. Okay. Replace, remove object. I'm interested to see how this goes. Oh, so it didn't listen directly, but it did
replace the milk on the right with orange juice, which is cool. So, pretty solid overall. Decent model. I guess if you have an Amazon account, you can try this out. I don't know. I guess everybody's just got a decent model now. That's kind of the moral of the story. I think everyone's got a decent model now. What the heck is going on? So, if you're interested in trying out this model, I would definitely just go to your Amazon account and try it. And if you are more technical, try out Nova Act. It's going to be one of the, quote, game changers. This is the example workflow with "search for a coffee maker." I hope that it doesn't get crazy with this, because I can just see it getting out of hand pretty fast with people finding fun stuff to use. All right, with
that being said, thank you so much for watching this video. We appreciate each and every one of you. Make sure to listen, like, subscribe, all that good stuff to this episode, the next episode, and all of them to come. We'll see you in the next one. Bye.
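
For the more technical listeners, the parallel run described in the episode (checking several listings at once and getting JSON back) can be sketched roughly like this. It is a minimal sketch under stated assumptions, not Nova Act's actual API: `check_listing`, `check_all`, the sample listings, and the 3-mile biking cutoff are all made up for illustration, and the stand-in function just reads a stored number where a real agent would drive a browser. The only real piece is `ThreadPoolExecutor` from Python's standard library, which is the thread pool idea mentioned in the demo.

```python
import json
from concurrent.futures import ThreadPoolExecutor, as_completed

# Hypothetical listings; in the demo these came from a live apartment search.
LISTINGS = [
    {"address": "12 Oak St", "miles_to_station": 1.2},
    {"address": "98 Pine Ave", "miles_to_station": 4.8},
    {"address": "5 Elm Ct", "miles_to_station": 2.9},
]

MAX_BIKING_MILES = 3.0  # assumed cutoff for "biking distance"


def check_listing(listing: dict) -> dict:
    """Stand-in for one browser session (hypothetical).

    A real agent would look the distance up on a map site; here we
    simply read the number stored in the sample data.
    """
    miles = listing["miles_to_station"]
    return {
        "address": listing["address"],
        "miles_to_station": miles,
        "bikeable": miles <= MAX_BIKING_MILES,
    }


def check_all(listings: list[dict], max_workers: int = 3) -> list[dict]:
    """Run one check per listing in parallel; return results sorted by distance."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(check_listing, item) for item in listings]
        # Collect results in whatever order the workers finish.
        results = [future.result() for future in as_completed(futures)]
    return sorted(results, key=lambda r: r["miles_to_station"])


if __name__ == "__main__":
    # Emit the combined results as JSON, one object per listing.
    print(json.dumps(check_all(LISTINGS), indent=2))
```

Each listing gets its own worker, so the slow part (in a real run, the browser session) happens side by side rather than one at a time, and sorting at the end keeps the output stable no matter which worker finishes first.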