61% Reliability In Agent Mode

Anthropic is out with an upgrade to its flagship AI that offers 61% reliability when used as an agent for everyday computing tasks.

Essentially, that means when you use Sonnet 4.5 as an agent to complete an assignment featuring multi-step tasks like opening apps, editing files, navigating Web pages and filling out forms, it will complete those assignments for you 61% of the time.

One caveat: That reliability metric – a score on the OSWorld-Verified benchmark – is based on Sonnet 4.5’s performance in a sandbox environment, where researchers pit the AI against a set of pre-programmed, digital encounters that never change.

Out on the Web – where things can get unpredictable very quickly – performance could be worse.

Bottom line: If an AI agent that finishes three out of every five tasks turns your crank, this could be the AI you’ve been looking for.

In other news and analysis on AI writing:

*LinkedIn’s CEO: ‘I Write Virtually All My Emails With AI Now’: Crediting AI for making him sound ‘super smart’ when it comes to emails, LinkedIn CEO Ryan Roslansky says he writes nearly all of his emails using AI now.

Observes writer Sherin Shibu: “Roslansky, who has led LinkedIn for the past five years, said that using AI is like tapping into ‘a second brain’ personalized just for him.”

*Another ‘AI Writing Humanizer’ Tool Launches: JustDone has just rolled out an ‘AI humanizer’ tool that transforms the sometimes robotic writing of chatbots like ChatGPT into more human-sounding text.

Sounds good in theory.

But truth be told, you can do your own ‘humanizing’ with ChatGPT simply by including writing style directions in your prompt.

For example: Simply add phrases like, “write in a warm, witty, conversational style” or “write at the level of a college freshman, but be sure to inject plenty of deadpan humor in your writing.”

Essentially: Simply experiment with describing the precise kind of writing you’d like from ChatGPT, and you won’t need to pay for a ‘humanizer.’
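For anyone who builds prompts in a script rather than typing them by hand, the same idea can be sketched in a few lines of Python. This is only an illustration: the helper name `with_style` and the sample task are made up for the example, and the snippet simply assembles the prompt text without calling any chatbot.

```python
def with_style(task: str, style: str) -> str:
    """Return the task prompt with a writing style direction appended.

    Hypothetical helper for illustration; it only builds a string.
    """
    return f"{task.strip()}\n\nStyle: {style.strip()}"


# Sample task (invented) paired with one of the style phrases quoted above.
prompt = with_style(
    "Draft a 150-word product update for our newsletter.",
    "write in a warm, witty, conversational style",
)
print(prompt)
```

Swap in a different style phrase and paste the result into the chatbot to compare how the writing changes.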

That said, for best results, write — and humanize your writing — using ChatGPT-4.0.

The reason: ChatGPT-5 and other chatbots often resist or water down prompting that attempts to alter writing style.

*New Microsoft 365 ‘Premium’ Tier Promising Advanced AI: Microsoft has rolled out a ‘luxury’ version of its productivity suite, billed at $20/month, that offers:

–Higher usage limits with AI

–GPT-4 image generation from OpenAI

–Deep research, vision and actions

–Standard apps that have been with 365 for years, such as Word, Excel, PowerPoint and Outlook

*OpenAI Launches New Social Media Video App: Video fans just got another text-to-video tool from ChatGPT’s maker – which is designed to compete with the likes of TikTok, Instagram Reels and YouTube Shorts.

The feature setting users’ imaginations ablaze: The ability to drop an image of yourself – or anyone else – into any video the app creates.

Even better: The social media app uses Sora 2, OpenAI’s new video creator, which offers enhanced precision in the creation of complex movement, sound, dialogue and effects for short videos.

*AI Chat, Talking Avatar Style: If chatting with an AI-powered animated character is on your bucket list, Microsoft has the solution.

It’s just rolled out 40 experimental characters you can chat with under its $20/month Copilot Pro subscription.

Observes writer Lance Whitney: “You can choose from among 40 portraits, all with different genders, races, and nationalities.”

*‘Instant Checkout’ Opens for Business in ChatGPT: Now you can buy goods and services while remaining in the ChatGPT app, thanks to a new checkout service from the AI.

Just underway – currently, you can only shop at Etsy in ChatGPT – the AI’s maker is promising to soon bring Shopify, which hosts a million-plus merchants, onto the new feature.

Observes writer Chance Townsend: “OpenAI also revealed that the underlying technology will be open source to help bring agentic commerce to more merchants and developers.”

*Now AI Reports on Police Bodycam Footage, Too: While scores of police agencies have been using AI to write up standard reports, some have also begun using the tech to report on bodycam footage.

Observes DigWatch: “The tool, Draft One, analyzes Axon body-worn camera footage to generate draft reports for specific calls, including theft, trespassing and DUI incidents.”

*No Good at AI? Hasta La Vista, Baby: Early AI adopter Accenture, a consulting firm, has issued a stern warning to staff – get with the AI program, or get another job.

Observes writer Joe Wilkins: “If Accenture workers fail to appease their overlords, the CEO says they’ll be dumped like yesterday’s trash.

“In their place, the IT firm will hire people who already have the AI ‘skills’ necessary to appease stockholders.”

*AI BIG PICTURE: Trump To Taiwan: Produce 50% of Chips in U.S., or You’re on Your Own: In a move bringing new definition to the phrase ‘heavy-handed,’ U.S. President Donald Trump has told Taiwan it needs to move half of its chip production to the U.S. if it wants U.S. help against a Chinese invasion.

Observes writer Ashley Belanger: “To close the deal with Taiwan, (U.S. Commerce Secretary Howard) Lutnick suggested that the U.S. would offer some kind of security guarantee so that they can expect that moving their supply chain into the U.S. won’t eliminate Taiwan’s so-called silicon shield where countries like the U.S. are willing to protect Taiwan because we need their silicon, their chips, so badly.”

Share a Link: Please consider sharing a link to https://RobotWritersAI.com from your blog, social media post, publication or emails. More links leading to RobotWritersAI.com help everyone interested in AI-generated writing.

–Joe Dysart is editor of RobotWritersAI.com and a tech journalist with 20+ years’ experience. His work has appeared in 150+ publications, including The New York Times and the Financial Times of London.

The post New Claude Sonnet 4.5: appeared first on Robot Writers AI.