Feed aggregator

Fine, I'll Try AI

Hacker News - Mon, 02/09/2026 - 7:14pm
Categories: Hacker News

Show HN: Insurance AI Benchmark – 510 scenarios from production

Hacker News - Mon, 02/09/2026 - 7:04pm

We published the first public benchmark for insurance AI agents on HuggingFace.

What it contains:
- 510 real insurance scenarios
- 10 categories across 9 insurance lines
- Train/val/test splits (357/76/77)
- 4 routing decisions per scenario: AI handles, AI with verification, human handoff, hybrid collaboration
- 3 evaluation metrics: intent accuracy, routing accuracy, action completeness
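For readers who want a feel for the three metrics, here is a minimal sketch in Python. It is not the benchmark's own scoring code: the `Example` record, its field names, and the route label strings are hypothetical, and the definitions are assumptions (exact-match accuracy for intent and routing, and the mean fraction of expected actions that were predicted for action completeness).

```python
from dataclasses import dataclass, field

# The four routing decisions named in the post; these label strings are assumed.
ROUTES = ("ai_handles", "ai_with_verification", "human_handoff", "hybrid_collaboration")


@dataclass
class Example:
    """Hypothetical record: one scenario's intent, route, and action set."""
    intent: str
    route: str
    actions: set[str] = field(default_factory=set)


def accuracy(gold: list[Example], pred: list[Example], attr: str) -> float:
    """Fraction of scenarios where the predicted attribute equals the gold one."""
    if not gold:
        return 0.0
    hits = sum(getattr(g, attr) == getattr(p, attr) for g, p in zip(gold, pred))
    return hits / len(gold)


def action_completeness(gold: list[Example], pred: list[Example]) -> float:
    """Assumed definition: mean share of expected actions found in the prediction."""
    if not gold:
        return 0.0
    scores = []
    for g, p in zip(gold, pred):
        if not g.actions:
            scores.append(1.0)  # nothing was required, so nothing is missing
        else:
            scores.append(len(g.actions & p.actions) / len(g.actions))
    return sum(scores) / len(scores)


gold = [
    Example("file_claim", "human_handoff", {"open_claim", "notify_adjuster"}),
    Example("billing_question", "ai_handles", {"send_invoice"}),
]
pred = [
    Example("file_claim", "ai_with_verification", {"open_claim"}),
    Example("billing_question", "ai_handles", {"send_invoice"}),
]

print(accuracy(gold, pred, "intent"))   # 1.0
print(accuracy(gold, pred, "route"))    # 0.5
print(action_completeness(gold, pred))  # 0.75
```

The same functions could be run separately on each split. The published splits add up as stated: 357 + 76 + 77 = 510.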

Why it matters: Insurance is precision work. A wrong routing decision costs money and trust. Most AI benchmarks miss this. They don't test what matters in production.

The data comes from a production voice AI system: years of customer calls and actual insurance decisions. The scenarios are messy because they are real.

Open source: Apache 2.0 license. Ready to use.

Implementation: https://github.com/pavelsukhachev/hybrid-orchestrator
Paper: TechRxiv (IEEE) - "The Hybrid Orchestrator: A Framework for Coordinating Human-AI Teams"

Comments URL: https://news.ycombinator.com/item?id=46953463

Points: 1

# Comments: 0

Categories: Hacker News

Show HN: Hybrid Orchestrator – Reliable AI agents for finance

Hacker News - Mon, 02/09/2026 - 7:04pm

I built AI systems for banking and insurance. They failed too often. So I created a framework for human-AI teams.

The Hybrid Orchestrator has four patterns: (1) session state that survives context window limits, (2) multi-channel communication routing, (3) activity monitoring with triggers, and (4) human escalation pathways.
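To make two of these patterns concrete, here is a rough Python sketch of session state kept outside the model's context window, combined with a simple escalation trigger. It is not taken from the Hybrid Orchestrator code: the `SessionState` class, its fields, and the "consecutive misunderstood turns" rule are all hypothetical choices made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class SessionState:
    """Hypothetical session store: durable facts plus a bounded turn window."""
    max_turns: int = 3                                    # assumed window size
    facts: dict[str, str] = field(default_factory=dict)   # survives trimming
    recent_turns: deque = field(default_factory=deque)    # trimmed to max_turns
    failed_turns: int = 0                                 # consecutive misses

    def add_turn(self, text: str, understood: bool = True) -> None:
        """Record a turn, drop the oldest ones, and track consecutive failures."""
        self.recent_turns.append(text)
        while len(self.recent_turns) > self.max_turns:
            self.recent_turns.popleft()
        self.failed_turns = 0 if understood else self.failed_turns + 1

    def should_escalate(self, max_failures: int = 2) -> bool:
        """Assumed trigger: hand off to a human after repeated failures."""
        return self.failed_turns >= max_failures


state = SessionState(max_turns=2)
state.facts["policy_id"] = "P-1"  # made-up value for the example
state.add_turn("I want to update my policy")
state.add_turn("No, the other one", understood=False)
state.add_turn("That is still wrong", understood=False)

print(list(state.recent_turns))  # ['No, the other one', 'That is still wrong']
print(state.facts["policy_id"])  # P-1
print(state.should_escalate())   # True
```

The idea is that facts extracted early in a call stay available after older turns have been dropped, while the failure counter gives a monitoring hook that can route the session to a human.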

These patterns come from a production voice AI system for insurance applications. The code is written in Python, has 97 tests, and is licensed under Apache 2.0.

I also published a research paper on TechRxiv (IEEE) describing the architecture.

Feedback welcome. What patterns do you use for reliable AI agents?

Comments URL: https://news.ycombinator.com/item?id=46953458

Points: 1

# Comments: 0

Categories: Hacker News

Data Modeling Is Changing

Hacker News - Mon, 02/09/2026 - 7:01pm
Categories: Hacker News

G's Last Exam

Hacker News - Mon, 02/09/2026 - 7:00pm
Categories: Hacker News

Secrets Don't Belong in a Sandbox

Hacker News - Mon, 02/09/2026 - 6:40pm

Article URL: https://vault.oshu.dev/

Comments URL: https://news.ycombinator.com/item?id=46953223

Points: 1

# Comments: 0

Categories: Hacker News

Trump Accounts

Hacker News - Mon, 02/09/2026 - 6:35pm

Article URL: https://trumpaccounts.gov

Comments URL: https://news.ycombinator.com/item?id=46953181

Points: 1

# Comments: 0

Categories: Hacker News
