BotBeat

PRODUCT LAUNCH | Coasty | 2026-03-03

Coasty Claims Top Spot on OSWorld Benchmark at 82%, Surpassing Major AI Labs

Key Takeaways

  • Coasty achieved 82% on the OSWorld benchmark, claiming the #1 position and beating AI agents from Anthropic (Claude Sonnet 4.5 at 62.9%), ByteDance (Seed-1.8 at 61.9%), and other major labs
  • The service positions itself as a virtual assistant replacement at $50/month versus $3,000-$5,000 for human alternatives, with 24/7 availability and instant setup
  • Coasty's AI agent operates through a visual computer interface, performing tasks like web browsing, spreadsheet work, and email management with complete action logging

Source: Hacker News (https://coasty.ai/)

Summary

Coasty, an AI computer automation startup, announced that it has reached the #1 position on the OSWorld benchmark with an 82% success rate, reportedly outperforming AI agents from established players including Anthropic, ByteDance, Moonshot AI, and UiPath. The OSWorld benchmark measures real-world computer task completion across browsers, office applications, and system operations. Coasty's reported score puts it 9.4 percentage points ahead of the second-place entry, Agent S3 from Simular, which scored 72.6% using Opus 4.5 and GPT-5 models.

The company is positioning its AI agent as a cost-effective alternative to human virtual assistants, claiming it can perform tasks like spreadsheet analysis, web browsing, form filling, and email management for as little as $50 per month compared to typical virtual assistant costs of $3,000-$5,000 monthly. Coasty emphasizes that its agent operates on real computers through a visual interface, clicking, typing, and navigating like a human user, with all actions logged for audit purposes. The service runs on isolated virtual machines for security and offers 24/7 availability.

The product targets startup founders, operations managers, solopreneurs, and agency owners handling repetitive administrative tasks. Coasty offers a freemium pricing model starting at $0, with paid tiers ranging from $19 to $100 monthly for individual users, plus custom enterprise pricing. The company provides demonstration videos showing the agent completing tasks such as solving CAPTCHAs, drawing circles, filling spreadsheets, and sending emails autonomously.

The platform also offers a self-correcting agent that can detect and adapt to mistakes during task execution.

Editorial Opinion

Coasty's benchmark claim deserves scrutiny: the 82% OSWorld score is a 9.4 percentage point lead over second place, a gap that seems unusually large given how competitive the AI agent field is. The company's marketing emphasizes cost savings over human workers, which raises important questions about workforce displacement and about whether an 82% success rate is sufficient for mission-critical tasks. The technical achievement is noteworthy if verified, but the presentation feels more focused on disrupting labor markets than on advancing the underlying AI research, and independent validation of the benchmark results would strengthen its credibility.
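
The 9.4-point gap cited above follows directly from the scores reported in this article. As a minimal sketch, the snippet below recomputes Coasty's lead over each entry the article names. The dictionary layout and the `lead_over` helper are illustrative assumptions, not part of any OSWorld tooling, and the figures are the self-reported ones quoted here, not independently verified results.

```python
# OSWorld success rates (percent) exactly as quoted in this article.
# These are company-reported figures, not independently verified.
reported_scores = {
    "Coasty": 82.0,
    "Agent S3 (Simular)": 72.6,
    "Claude Sonnet 4.5 (Anthropic)": 62.9,
    "Seed-1.8 (ByteDance)": 61.9,
}


def lead_over(leader: str, scores: dict[str, float]) -> dict[str, float]:
    """Hypothetical helper: percentage-point gap between the leader and each other entry."""
    top = scores[leader]
    # Round to one decimal place to match the precision of the quoted scores.
    return {name: round(top - score, 1) for name, score in scores.items() if name != leader}


gaps = lead_over("Coasty", reported_scores)
for name, gap in gaps.items():
    print(f"{name}: {gap} percentage points behind Coasty")
```

These are percentage-point differences between reported success rates; they say nothing about how the runs were configured or whether the results have been reproduced.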

AI Agents | Machine Learning | Startups & Funding | Jobs & Workforce Impact | Product Launch
