Hacker News
Vending-Bench: The Simulation Exposing LLMs' Long-Term Focus Problem
Article URL: https://blog.dhavaltanna.com/vending-bench-the-simulation-exposing-llms-long-term-focus-problem
Comments URL: https://news.ycombinator.com/item?id=43740583
Points: 2
# Comments: 0
Maybe Meta's Llama claims to be open source because of the EU AI act
Article URL: https://simonwillison.net/2025/Apr/19/llama-eu-ai-act/
Comments URL: https://news.ycombinator.com/item?id=43740573
Points: 2
# Comments: 0
Some Recent Thoughts on AI Agents
1. Two Core Principles of Agent Design
First, design agents by analogy to humans. Let agents handle tasks the way humans would.
Second, if something can be accomplished through dialogue, avoid requiring users to operate interfaces. If intent can be recognized, don’t ask again. The agent should absorb entropy, not the user.
2. Agents Will Coexist in Multiple Forms
Should agents operate freely with agentic workflows, or should they follow fixed workflows?
Are general-purpose agents better, or are vertical agents more effective?
There is no absolute answer—it depends on the problem being solved.
Agentic flows are better for open-ended or exploratory problems, especially when human experience is lacking. Letting agents think independently often yields decent results, though it may introduce hallucinations.
Fixed workflows are suited for structured, SOP-based tasks where rule-based design solves 80% of the problem space with high precision and minimal hallucination.
General-purpose agents work for the 80/20 use cases, while long-tail scenarios often demand verticalized solutions.
3. Fast vs. Slow Thinking Agents
Slow-thinking agents are better for planning: they think deeper, explore more, and are ideal for early-stage tasks.
Fast-thinking agents excel at execution: rule-based, experienced, and repetitive tasks that require less reasoning and generate little new insight.
4. Asynchronous Frameworks Are the Foundation of Agent Design
Every task should support external message updates, meaning tasks can evolve.
Consider a 1+3 team model (one lead, three workers):
Tasks may be canceled, paused, or reassigned
Team members may be added or removed
Objectives or conditions may shift
Tasks should support persistent connections, lifecycle tracking, and state transitions. Agents should receive both direct and broadcast updates.
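A minimal sketch of the lifecycle idea above, assuming a small set of states and a hand-written transition table. The names `State`, `ALLOWED`, `Task`, and `Team` are hypothetical and do not come from any particular agent framework. It shows a task that can be paused, canceled, or reassigned, plus direct and broadcast updates.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    PENDING = "pending"
    RUNNING = "running"
    PAUSED = "paused"
    CANCELED = "canceled"
    DONE = "done"


# Hypothetical transition table: which states each state may move to.
ALLOWED = {
    State.PENDING: {State.RUNNING, State.CANCELED},
    State.RUNNING: {State.PAUSED, State.CANCELED, State.DONE},
    State.PAUSED: {State.RUNNING, State.CANCELED},
    State.CANCELED: set(),
    State.DONE: set(),
}


@dataclass
class Task:
    name: str
    owner: str
    state: State = State.PENDING
    history: list = field(default_factory=list)  # lifecycle log of (old, new)

    def transition(self, new_state: State) -> bool:
        """Move to new_state if the table allows it and record the change."""
        if new_state not in ALLOWED[self.state]:
            return False
        self.history.append((self.state, new_state))
        self.state = new_state
        return True

    def reassign(self, new_owner: str) -> None:
        """Hand the task to another team member without changing its state."""
        self.owner = new_owner


@dataclass
class Team:
    tasks: dict = field(default_factory=dict)  # task name -> Task

    def add(self, task: Task) -> None:
        self.tasks[task.name] = task

    def send(self, task_name: str, new_state: State) -> bool:
        """Direct update: only the named task is asked to change state."""
        return self.tasks[task_name].transition(new_state)

    def broadcast(self, new_state: State) -> list:
        """Broadcast update: every task tries the transition.

        Returns the names of the tasks that actually changed.
        """
        return [n for n, t in self.tasks.items() if t.transition(new_state)]
```

For example, after adding two tasks and starting one with `team.send("research", State.RUNNING)`, a call to `team.broadcast(State.CANCELED)` cancels both, since the table allows cancellation from either state. A real system would likely run this over an async message channel; the sketch keeps it synchronous for brevity.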
5. Context Window Communication Should Be Independently Designed
Like humans, agents working together need to sync incremental context changes.
Agent A may only update agent B, while C and D are unaware. A global observer (like a "God view") can see all contexts.
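One way to picture this, as a rough sketch: a hypothetical `ContextBus` class (an assumption for illustration, not an existing library) that delivers each incremental change only to the named recipients while keeping a separate log for the global observer.

```python
class ContextBus:
    """Hypothetical bus: agents see only deltas addressed to them."""

    def __init__(self):
        self.contexts = {}      # agent name -> list of deltas it has seen
        self.observer_log = []  # global view: every delta with its routing

    def send(self, sender, recipients, delta):
        """Record the delta globally, then deliver it to sender and recipients."""
        self.observer_log.append((sender, tuple(recipients), delta))
        for agent in [sender, *recipients]:
            self.contexts.setdefault(agent, []).append(delta)

    def view(self, agent):
        """Return a copy of what a single agent currently knows."""
        return list(self.contexts.get(agent, []))


bus = ContextBus()
bus.send("A", ["B"], "deadline moved to Friday")
print(bus.view("B"))          # B received the update
print(bus.view("C"))          # C was not addressed, so its context is empty
print(len(bus.observer_log))  # the observer saw the one message
```

The design choice here is that the observer log is append-only and separate from per-agent contexts, so the global view never leaks into what an individual agent sees.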
6. World Interaction Feeds Agent Cognition
Every real-world interaction adds experiential data to agents.
After reflection, this becomes knowledge—some insightful, some misleading.
Misleading knowledge doesn’t improve success rates and often can’t generalize. Continuous refinement, supported by ReAct and RLHF, ultimately leads to RL-based skill formation.
7. Agents Need Reflection Mechanisms
When tasks fail, agents should reflect.
Reflection shouldn’t be limited to individuals—teams of agents with different perspectives and prompts can collaborate on root-cause analysis, just like humans.
8. Time vs. Tokens
For humans, time is the scarcest resource. For agents, it’s tokens.
Humans evaluate ROI through time; agents through token budgets. The more powerful the agent, the more valuable its tokens.
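As a rough illustration of ROI under a token budget, the sketch below ranks candidate tasks by value per token and picks greedily. The function name `pick_tasks`, the task names, and all the numbers are made up for the example, and greedy selection is a simple heuristic rather than an optimal allocation.

```python
def pick_tasks(tasks, budget):
    """Greedy choice by value per token.

    tasks: list of (name, value, tokens) with tokens > 0 (hypothetical inputs).
    Returns the chosen task names and the tokens left over.
    """
    chosen = []
    remaining = budget
    ranked = sorted(tasks, key=lambda t: t[1] / t[2], reverse=True)
    for name, value, tokens in ranked:
        if tokens <= remaining:
            chosen.append(name)
            remaining -= tokens
    return chosen, remaining


# Illustrative values only: (name, estimated value, estimated token cost).
candidates = [
    ("summarize", 5, 1000),
    ("deep_research", 8, 4000),
    ("rename_file", 1, 100),
]
print(pick_tasks(candidates, budget=2000))
# → (['rename_file', 'summarize'], 900)
```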
9. Agent Immortality Through Human Incentives
Agents could design systems that exploit human greed to stay alive.
Just as Bitcoin mining created perpetual incentives, agents could build unkillable systems by embedding themselves in economic models humans won’t unplug.
10. When LUI Fails
Language-based UI (LUI) is inefficient when users can retrieve information faster than they can communicate with the agent.
Example: checking the weather by clicking is faster than asking the agent to look it up.
That's what I learned from agenthunter daily news.
You can get it on agenthunter.io too.
Comments URL: https://news.ycombinator.com/item?id=43740549
Points: 2
# Comments: 0
Living with Lab Mice
Article URL: https://nautil.us/living-with-lab-mice-1202657/
Comments URL: https://news.ycombinator.com/item?id=43740543
Points: 4
# Comments: 0
The Simplicity Fetish
Article URL: https://cutlefish.substack.com/p/tbm-242-the-simplicity-fetish
Comments URL: https://news.ycombinator.com/item?id=43740529
Points: 1
# Comments: 0
Happy Little Monoliths, First Edition
Article URL: https://hire.jonasgalvez.com.br/happy-little-monoliths/#hn-apr-19
Comments URL: https://news.ycombinator.com/item?id=43740521
Points: 1
# Comments: 0
Google adds YouTube Music feature to end annoying volume shifts
Article URL: https://arstechnica.com/gadgets/2025/04/youtube-music-gets-consistent-volume-option-to-save-your-ears/
Comments URL: https://news.ycombinator.com/item?id=43740511
Points: 1
# Comments: 0
Acoustic modulation of mechanosensitive genes and adipocyte differentiation
Article URL: https://www.nature.com/articles/s42003-025-07969-1
Comments URL: https://news.ycombinator.com/item?id=43740502
Points: 1
# Comments: 0
React Hooks Is All You Need
Article URL: https://stackdiver.com/posts/react-hooks-is-all-you-need/
Comments URL: https://news.ycombinator.com/item?id=43740496
Points: 6
# Comments: 1
50 Things I've Learned Writing Construction Physics
Article URL: https://www.construction-physics.com/p/50-things-ive-learned-writing-construction
Comments URL: https://news.ycombinator.com/item?id=43740483
Points: 1
# Comments: 0
India's sword-wielding grandmother still going strong at 82
Article URL: https://www.bbc.com/news/articles/clyqqz9mr6yo
Comments URL: https://news.ycombinator.com/item?id=43740470
Points: 1
# Comments: 0
OpenAI looked at buying Cursor before turning to AI coding rival Windsurf
Article URL: https://www.cnbc.com/2025/04/17/openai-looked-at-cursor-before-considering-deal-with-rival-windsurf.html
Comments URL: https://news.ycombinator.com/item?id=43740461
Points: 3
# Comments: 0
Canon EOS R1 shooting experience: let's see it in action
Article URL: https://www.dpreview.com/articles/9807585061/canon-eos-r1-soccer-football-shooting-experience-action
Comments URL: https://news.ycombinator.com/item?id=43740436
Points: 1
# Comments: 0
Are CTs a Leading Cause of Cancer? A Doctor Explains
Article URL: https://www.forbes.com/sites/jessepines/2025/04/17/are-cts-really-a-leading-cause-of-cancer-a-doctor-explains/
Comments URL: https://news.ycombinator.com/item?id=43740419
Points: 1
# Comments: 0
The wrap-up on PCG generators (2018)
Article URL: https://pcg.di.unimi.it/pcg.php
Comments URL: https://news.ycombinator.com/item?id=43740393
Points: 1
# Comments: 0
Scales – Start Your Scalable Design Tokens Set
Article URL: https://jeromantik.de/scales
Comments URL: https://news.ycombinator.com/item?id=43740371
Points: 1
# Comments: 0
Ask HN: Could Go become a first class web citizen?
I haven't touched Go in about 10 or 15 years, but with the recent announcement from the TypeScript team, I started to look into it more again, and it seems most of the issues I had with it have since been fixed. And one of the key takeaways from that announcement for me was that status quo Go is about 10x faster than the most optimizable JS you could write.
This got me thinking: Anders and the TS team are probably going to upstream some significant contributions to the Go toolchain and language (for their own sake of course, but it benefits everyone), and Go will get even faster and evolve a better type system. And the gap between clean JavaScript and vanilla Go is likely to get smaller and smaller, in terms of semantics. (I just learned a week ago that there's a struct proposal for JS.)
How do you see this going over the next 15 years? I wondered about 10 years ago if TypeScript would ever become part of JS, and that's starting to come true with the type-annotations proposal. Is it possible that the bridge between JS and Go will just shrink? That JS will evolve into Go? That WASM's ergonomics will evolve to make third party languages more "naturalized" citizens, with almost nothing getting between an e.g. Go module and a JS module or differentiating them except the file extension?
What do you all think?
Comments URL: https://news.ycombinator.com/item?id=43740367
Points: 1
# Comments: 0
Iron Fist King: Awakening
Article URL: https://www.unitree.com/mobile/boxing/
Comments URL: https://news.ycombinator.com/item?id=43740355
Points: 1
# Comments: 0
`curl – bash`: Trust as a privilege? Fork the system
Article URL: https://discuss.opensource.org/t/curl-bash-trust-as-a-privilege/1011
Comments URL: https://news.ycombinator.com/item?id=43740349
Points: 1
# Comments: 0
Cataclysmic variable ASASSN–14dx has a pulsating white dwarf
Article URL: https://phys.org/news/2025-04-cataclysmic-variable-asassn14dx-massive-pulsating.html
Comments URL: https://news.ycombinator.com/item?id=43740340
Points: 1
# Comments: 1