BotBeat

Coasty
PRODUCT LAUNCH, 2026-03-03

Coasty Claims Top Spot on OSWorld Benchmark at 82%, Surpassing Major AI Labs

Key Takeaways

  • Coasty reports an 82% score on the OSWorld benchmark, claiming the #1 position ahead of AI agents from Anthropic (Claude Sonnet 4.5 at 62.9%), ByteDance (Seed-1.8 at 61.9%), and other major labs
  • The service positions itself as a virtual assistant replacement at $50/month versus $3,000-$5,000 per month for human alternatives, with 24/7 availability and instant setup
  • Coasty's AI agent operates through a visual computer interface, performing tasks like web browsing, spreadsheet work, and email management with complete action logging
Source: Hacker News, https://coasty.ai/

Summary

Coasty, an AI computer automation startup, announced that it has reached the #1 position on the OSWorld benchmark with an 82% success rate, reportedly outperforming AI agents from established players including Anthropic, ByteDance, Moonshot AI, and UiPath. The OSWorld benchmark measures real-world computer task completion across browsers, office applications, and system operations. The reported score puts Coasty 9.4 percentage points ahead of the second-place Agent S3 from Simular, which scored 72.6% using Opus 4.5 and GPT-5 models.

The company is positioning its AI agent as a cost-effective alternative to human virtual assistants, claiming it can perform tasks like spreadsheet analysis, web browsing, form filling, and email management for as little as $50 per month compared to typical virtual assistant costs of $3,000-$5,000 monthly. Coasty emphasizes that its agent operates on real computers through a visual interface, clicking, typing, and navigating like a human user, with all actions logged for audit purposes. The service runs on isolated virtual machines for security and offers 24/7 availability.

The product targets startup founders, operations managers, solopreneurs, and agency owners handling repetitive administrative tasks. Coasty offers a freemium pricing model starting at $0, with paid tiers ranging from $19 to $100 monthly for individual users, plus custom enterprise pricing. The company provides demonstration videos showing the agent completing tasks such as solving CAPTCHAs, drawing circles, filling spreadsheets, and sending emails autonomously.

The platform also offers a self-correcting agent that can detect and adapt to mistakes during task execution.

Editorial Opinion

Coasty's benchmark claim deserves scrutiny: the 82% OSWorld score represents a 9.4 percentage point lead over second place, a gap that seems unusually large given the competitive landscape of AI agents. The company's marketing emphasizes cost savings over human workers, which raises important questions about workforce displacement and about whether an 82% success rate is sufficient for mission-critical tasks. The technical achievement is noteworthy if verified, but the presentation feels more focused on disrupting labor markets than on advancing the underlying AI research, and independent validation of the benchmark results would strengthen the company's credibility.
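For readers who want to check the gap arithmetic, here is a minimal sketch that recomputes the leads from the scores quoted in this article. The dictionary labels and the helper name `lead_in_points` are illustrative assumptions, not anything published by Coasty or OSWorld, and the scores themselves are the reported figures, not independently verified results.

```python
# Scores as quoted in this article (percent of OSWorld tasks completed).
# Labels and the helper name are hypothetical, chosen for illustration only.
REPORTED_SCORES = {
    "Coasty": 82.0,
    "Agent S3 (Simular)": 72.6,
    "Claude Sonnet 4.5 (Anthropic)": 62.9,
    "Seed-1.8 (ByteDance)": 61.9,
}


def lead_in_points(scores, leader):
    """Return the leader's lead over each other entry, in percentage points."""
    top = scores[leader]
    return {
        name: round(top - score, 1)  # round to one decimal to hide float noise
        for name, score in scores.items()
        if name != leader
    }


leads = lead_in_points(REPORTED_SCORES, "Coasty")
for name, gap in leads.items():
    print(f"{name}: {gap} points behind Coasty")
```

Note that these are differences in percentage points, not relative improvements. The 9.4-point lead over Agent S3 corresponds to a relative gain of roughly 13% (9.4 divided by 72.6).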

Tags: AI Agents, Machine Learning, Startups & Funding, Jobs & Workforce Impact, Product Launch
