Hacker News

Show HN: OCR pipeline for ML training (tables, diagrams, math, multilingual)

Hacker News - 4 hours 38 min ago

Hi HN,

I’ve been working on an OCR pipeline specifically optimized for machine learning dataset preparation. It’s designed to process complex academic materials — including math formulas, tables, figures, and multilingual text — and output clean, structured formats like JSON and Markdown.

Some features:
• Multi-stage OCR combining DocLayout-YOLO, Google Vision, MathPix, and Gemini Pro Vision
• Extracts and understands diagrams, tables, LaTeX-style math, and multilingual text (Japanese/Korean/English)
• Highly tuned for ML training pipelines, including dataset generation and preprocessing for RAG or fine-tuning tasks
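To illustrate the multi-stage idea, here is a minimal sketch of how detected regions might be routed to type-specific recognizers and then emitted as Markdown and JSON. It is not taken from the repository: `Region`, `ROUTES`, `process_page`, and the recognizer functions are hypothetical names, and the recognizers are simple stand-ins for the layout, OCR, math, and vision services listed above.

```python
import json
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Region:
    """One layout region; hypothetical stand-in for a detector's output."""
    kind: str      # "text", "math", "table", or "figure"
    content: str   # placeholder for the cropped image a real pipeline would pass on


# Stand-in recognizers (assumptions). A real pipeline would call an OCR,
# math-OCR, or vision-language service here instead of reformatting strings.
def recognize_text(region: Region) -> str:
    return region.content.strip()


def recognize_math(region: Region) -> str:
    # Wrap the recognized LaTeX in display-math delimiters.
    return f"$${region.content.strip()}$$"


def recognize_table(region: Region) -> str:
    # Treat the content as comma-separated rows and build a Markdown table.
    rows = [line.split(",") for line in region.content.strip().splitlines()]
    if not rows:
        return ""
    lines = ["| " + " | ".join(cell.strip() for cell in rows[0]) + " |"]
    lines.append("| " + " | ".join("---" for _ in rows[0]) + " |")
    for row in rows[1:]:
        lines.append("| " + " | ".join(cell.strip() for cell in row) + " |")
    return "\n".join(lines)


def describe_figure(region: Region) -> str:
    return f"[Figure: {region.content.strip()}]"


# Stage 2 routing table: region type -> recognizer.
ROUTES: Dict[str, Callable[[Region], str]] = {
    "text": recognize_text,
    "math": recognize_math,
    "table": recognize_table,
    "figure": describe_figure,
}


def process_page(regions: List[Region]) -> List[dict]:
    """Run each region through its recognizer; unknown kinds fall back to text."""
    records = []
    for region in regions:
        handler = ROUTES.get(region.kind, recognize_text)
        records.append({"kind": region.kind, "markdown": handler(region)})
    return records


def to_markdown(records: List[dict]) -> str:
    return "\n\n".join(record["markdown"] for record in records)


def to_json(records: List[dict]) -> str:
    # ensure_ascii=False keeps Japanese/Korean text readable in the output.
    return json.dumps(records, ensure_ascii=False, indent=2)


if __name__ == "__main__":
    page = [
        Region("text", "次の方程式を解け。"),
        Region("math", "x^2 = 4"),
        Region("table", "x,y\n1,2"),
    ]
    records = process_page(page)
    print(to_markdown(records))
    print(to_json(records))
```

Keeping the routing in a plain dictionary means a new region type or a different backend can be swapped in without touching the loop that walks the page.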

Sample outputs and real exam-based examples are included (EJU Biology, UTokyo Math, etc.). Would love to hear any feedback or ideas for improvement.

GitHub: https://github.com/ses4255/Versatile-OCR-Program

Comments URL: https://news.ycombinator.com/item?id=43565239

Points: 2

# Comments: 0

Categories: Hacker News

Lorentz Invariance Violation?

Hacker News - 4 hours 39 min ago

Article URL: https://arxiv.org/abs/2504.01830

Comments URL: https://news.ycombinator.com/item?id=43565234

Points: 1

# Comments: 0

Categories: Hacker News

Chawan: Web Browser in TUI

Hacker News - 4 hours 46 min ago

Article URL: https://sr.ht/~bptato/chawan/

Comments URL: https://news.ycombinator.com/item?id=43565194

Points: 1

# Comments: 0

Categories: Hacker News

Show HN: Compare success rates and services across fertility clinics in the US

Hacker News - 4 hours 50 min ago

YourIVFPath is a tool to help prospective parents find and compare IVF clinics across the U.S. using transparent, data-driven insights from the CDC.

IVF is a deeply personal, emotionally taxing, and financially significant journey. Yet for many, finding the right clinic feels like navigating in the dark: scattered information, unclear success rates, and overwhelming choices. YourIVFPath aims to change that. The idea is to bring together publicly available clinic data in a clean, accessible format so families can make informed decisions with clarity and confidence.

If you know someone exploring fertility treatment options, I hope this helps bring a little more transparency to their path.

Feedback and suggestions welcome!

Tech stack: I used Supabase to store the data, Ruby to scrape it, bolt.new from StackBlitz for the website design and most of the React code, Vercel to deploy, and of course ChatGPT at every step of the way. All data is available from the CDC.

Comments URL: https://news.ycombinator.com/item?id=43565161

Points: 1

# Comments: 0

Categories: Hacker News

Compliance Chaos? How MokaHR Keeps Your Hiring Legally Sound

Hacker News - 4 hours 54 min ago

HR compliance isn’t just about ticking boxes—it’s about protecting your company from costly lawsuits, penalties, and reputational damage. Yet, with labor laws constantly evolving, many businesses struggle to keep up. Is your hiring process airtight, or are you leaving your company exposed?

The Hidden Compliance Risks in Hiring

From anti-discrimination laws to data privacy regulations, recruitment today is a legal minefield. A simple misstep, like asking the wrong interview question or mishandling candidate data, can lead to hefty fines. Consider this:

A major tech company was fined for posting job ads requiring “native English speakers.”

A financial firm faced lawsuits for failing to disclose hiring decisions properly.

Several global corporations have been penalized under GDPR for mishandling candidate data.

How MokaHR Keeps You Compliant

MokaHR isn’t just another HR tool; it’s a compliance safeguard built into your hiring process. Our HR compliance software ensures that every step, from job postings to final offers, aligns with legal best practices. Here’s how:

Automated compliance checks: Flags potential violations in job descriptions, interview scripts, and offer letters.

Document tracking & audit trails: Keep every piece of compliance data organized and accessible.

Real-time regulatory updates: Stay ahead of labor law changes without constantly researching them yourself.

Compliance Without the Headache

Most HR teams spend hours manually checking for compliance issues. MokaHR automates this process, freeing up your time while minimizing legal risks. With seamless integration into recruitment workflows, compliance becomes effortless, not an afterthought.

The Future of Hiring Is Risk-Free

As labor laws tighten, businesses that fail to adapt will face increasing scrutiny. MokaHR provides a future-proof solution that lets you hire with confidence, knowing that every process is legally sound and audit-ready.

Don’t wait for a compliance disaster—secure your hiring process with MokaHR today.

Comments URL: https://news.ycombinator.com/item?id=43565131

Points: 1

# Comments: 0

Categories: Hacker News

Privacy Isn't Optional–How MokaHR Protects Candidate Data in Recruitment

Hacker News - 5 hours 2 min ago

Recruitment platforms promise efficiency, but at what cost? In a world where data breaches and privacy concerns are escalating, job applicants and HR teams alike are asking: Who actually owns candidate data, and how secure is it?

The Hidden Risk in Hiring Tech

Most recruitment systems collect vast amounts of personal data: resumes, contact details, work histories, even interview recordings. This data is gold for employers but also a prime target for leaks, misuse, and regulatory penalties. Many platforms prioritize features over security, leaving sensitive information exposed.

MokaHR’s Privacy-First Approach

MokaHR takes a different stance. Rather than treating privacy as an afterthought, we’ve built our system around strict data protection protocols and compliance with global privacy laws. Our recruitment platform ensures that:

Only essential data is collected, minimizing exposure risks.

Encryption and access controls safeguard candidate information.

Compliance with GDPR, CCPA, and other regulations is baked into the platform.

Transparency Over Data Ownership

One of the biggest concerns in HR tech is who owns candidate data: the employer, the software provider, or the job board? With MokaHR, there’s no ambiguity. Your data stays in your control, and we never use it beyond the scope of your hiring needs.

The Trade-Off: Privacy vs. Convenience?

Some argue that strict privacy measures slow down hiring, but that’s a false choice. MokaHR proves you can have both security and efficiency, with automated workflows, AI-driven candidate matching, and seamless interview scheduling without compromising privacy.

The Future of Privacy in Recruitment

As regulations tighten and candidates become more aware of data risks, companies using outdated, insecure recruitment tools will face growing scrutiny. MokaHR offers a future-proof solution that keeps privacy at the core of hiring, not just as a compliance checkbox.

Hiring shouldn’t come at the cost of privacy. Is your recruitment platform protecting your data, or exposing it?

Comments URL: https://news.ycombinator.com/item?id=43565066

Points: 1

# Comments: 0

Categories: Hacker News
