Hacker News Evening Brief: 2026-05-28
The evening briefing for May 28th spans Anthropic’s latest model release, a damning look at private equity’s grip on public services, a surprising backlash against AI-powered search, and a Soviet programming language that feels like Pascal in Cyrillic. From formal verification in Rust to horse-racing board games of pure luck, here are thirty stories that caught the Hacker News community’s attention.
AI & Tech Policy
Claude Opus 4.8
Summary: Anthropic released Claude Opus 4.8, building on Opus 4.7 with improvements across coding, agentic skills, reasoning, and knowledge-work benchmarks at the same price point. New features include user-controllable effort levels on claude.ai, Claude Code “dynamic workflows” for large-scale problems, and a fast mode running at 2.5× speed that is now three times cheaper than prior models. Early testers report Opus 4.8 catches its own mistakes more reliably and is more forthcoming about flagging uncertainties rather than overclaiming progress.
HN Discussion: Commenters compared the iterative improvements to smartphone upgrade cycles—incremental but sufficient to retain the user base. Multiple users welcomed the ability to disable adaptive thinking in the web UI, citing persistent issues with thinking not triggering properly. Some pushed back on Anthropic’s framing of model honesty, noting the lab describes its own models as if discovering phenomena rather than shipping products.
US’s big bet on quantum computing may not be legal
Summary: The US government announced $2 billion in quantum computing investments, allocating $100 million each to startups in exchange for equity, but legal experts question whether the funding mechanism complies with existing appropriations law. The deal also launched the first dedicated quantum foundry company, raising questions about whether market demand justifies such a facility. Critics argue the funds may have been diverted from other authorised programs, and any legal challenge would take so long to resolve that the money would already be spent.
HN Discussion: Commenters compared the deal to IBM offloading dead-end research onto taxpayers. One noted the quantum industry is “grift-filled,” with companies like IonQ trading at 200× earnings and CEOs claiming to be the next NVIDIA. Others questioned the precedent this would set: that spending money quickly enough renders any later legal challenge moot.
Security & Privacy
ICE has spent over $25M on iris scanners in no-bid contracts
Summary: ICE has spent over $25 million on iris recognition technology through no-bid contracts, deploying hundreds of scanning devices nationwide to identify undocumented immigrants. Privacy experts warn that DHS is amassing a large-scale biometric database, with scans reportedly occurring even before arrest and being used on protestors. The no-bid nature of the contracts raises procurement legality concerns under the Federal Acquisition Regulation.
HN Discussion: One commenter who saw iris scanners used in Iraq twenty years ago predicted the technology would eventually be deployed domestically. The no-bid contracts were called out as likely kickbacks, given that the FAR prohibits them without justification. Concerns were raised that future resolution improvements could enable iris scanning at a distance through ordinary surveillance cameras.
System Administration
Indoor Wi-Fi Roaming with OpenWRT
Summary: A detailed homelab walkthrough covers how to improve Wi-Fi roaming across multiple OpenWRT access points using usteer, 802.11k neighbour reports, and careful band separation. The author deliberately keeps 2.4 GHz and 5 GHz SSIDs separate to accommodate legacy IoT devices while optimising roaming for modern clients. Practical tips include tuning transmission power, configuring hostapd neighbour reports, and the reasoning behind not merging all SSIDs into one.
HN Discussion: One commenter found that 802.11r with equal-channel APs and lowered transmit power yielded roughly 75 ms handoffs on iOS. Others reported usteer causing rapid battery drain on Android phones due to aggressive steering. Several debated whether splitting or merging SSIDs is the better default for mixed-device households.
Academic & Research
Disagreement Among Frontier LLMs on Real-World Fact-Checks
Summary: Research from Lenz tested 1,000 real user-submitted claims against five frontier LLMs and found that 67% of claims had at least one model dissenting from the panel majority. A full 34% of claims showed a two-or-more bucket gap between the most-disagreeing pair, indicating substantive disagreement rather than minor calibration differences. The four-verdict rubric (True / Mostly True / Misleading / False) leaves no room for “unknown,” forcing models to pick even on inherently unanswerable claims.
HN Discussion: Simon Willison loaded the data into Datasette Lite for exploration, revealing specific disagreement examples like extraterrestrial life claims where ground truth is “nobody knows.” Critics questioned whether the research itself was LLM-written and whether the claim list and ground truth methodology were sufficiently disclosed. The forced-choice rubric was identified as a key limitation—some claims are genuinely unresolvable.
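The two headline metrics in the summary above can be computed from per-claim verdict lists. The following is a minimal sketch, assuming each claim carries one verdict per model and that the rubric is treated as an ordinal scale. The function names and the sample panel are hypothetical and are not taken from the Lenz dataset.

```python
# Ordinal position of each verdict in the four-bucket rubric.
BUCKETS = {"True": 0, "Mostly True": 1, "Misleading": 2, "False": 3}


def has_dissent(verdicts):
    """True when the panel is not unanimous, so at least one model dissents."""
    return len(set(verdicts)) > 1


def max_bucket_gap(verdicts):
    """Ordinal distance between the two most-disagreeing models."""
    positions = [BUCKETS[v] for v in verdicts]
    return max(positions) - min(positions)


# Hypothetical verdicts from a five-model panel (not data from the study).
panel = {
    "claim_a": ["True", "True", "True", "True", "True"],
    "claim_b": ["True", "True", "Mostly True", "True", "True"],
    "claim_c": ["True", "Misleading", "False", "Mostly True", "True"],
    "claim_d": ["False", "False", "False", "Misleading", "False"],
}

dissent_share = sum(has_dissent(v) for v in panel.values()) / len(panel)
wide_gap_share = sum(max_bucket_gap(v) >= 2 for v in panel.values()) / len(panel)
print(f"dissent: {dissent_share:.0%}, gap of two or more: {wide_gap_share:.0%}")
```

Treating any non-unanimous panel as "dissent" is a simplification: it sidesteps the question of what counts as the majority when verdicts split evenly.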
Citing ‘severe’ math deficits, UC faculty demand a return to SAT tests for STEM
Summary: UC math professors are publicly demanding the reinstatement of SAT testing for STEM admissions, citing preparation gaps so extreme that instructors must reteach middle-school maths alongside university-level material. The professors argue that without standardised testing, grade inflation makes a 4.0 GPA from a weak school indistinguishable from one earned at a rigorous school. They warn that dropping the SAT paradoxically harms underprivileged students most, since test prep requires only a book and internet, whereas building the extracurricular profile now weighted more heavily is far more resource-intensive.
HN Discussion: Commenters debated whether “equity” policies that eliminate accelerated maths have backfired by dragging down overall preparation levels. A former high school maths teacher argued that digital device distraction in classrooms has compounded the problem. Several questioned why university instructors feel compelled to reteach prerequisites rather than enforcing existing course requirements.
A Eureka machine that thinks like nature and explores what AI cannot
Summary: IISc researchers propose a neuromorphic computing approach using Ising-model spin systems with Fowler–Nordheim annealing dynamics for combinatorial optimisation problems that gradient descent struggles with. The system uses a neuromorphic autoencoder and controlled search process designed to avoid premature trapping in local minima, scaling to higher-order combinatorial problems. The approach is positioned as an alternative computing paradigm that could complement conventional AI rather than replace it.
HN Discussion: Commenters noted the title is buzzword-heavy and demanded concrete benchmarks rather than marketing claims. One pointed out that spin systems are no more “nature” than transistors, questioning the framing. Another invoked Sutton’s Bitter Lesson, arguing specialised hardware approaches historically lose to scaled general-purpose compute.
Seeing Around Corners Using Smartphone-Grade Lidar
Summary: Researchers demonstrated non-line-of-sight imaging using only smartphone-grade lidar—the kind found in FaceID hardware—detecting objects hidden around corners by analysing diffuse light reflections. The technique uses the walls opposite a corner as a reflective surface, reconstructing hidden geometry from multi-bounce lidar returns. Previously this required expensive research-grade equipment; the finding suggests consumer hardware can now reproduce formerly lab-only capabilities.
HN Discussion: One commenter argued the implications are significant: if consumer hardware can replicate research-level techniques, many other fields may see similar democratisation. A practical suggestion was to simply place a mirror at 45 degrees in the corner as a simpler alternative. Questions were raised about the environmental constraints—the approach requires opposing walls and specific geometry.
Investigating how prompt politeness affects LLM accuracy (2025)
Summary: The paper tested how five politeness levels in prompts affect GPT-4o accuracy on 50 multiple-choice questions across maths, science, and history. Contrary to expectations, impolite prompts consistently outperformed polite ones, with accuracy ranging from 80.8% for Very Polite to 84.8% for Very Rude. The results contradict earlier studies that found polite prompts improved performance, suggesting the relationship between tone and accuracy is model-dependent.
HN Discussion: One commenter argued the four-point spread is within noise and not practically significant. Another noted the “polite” prompts may not have effectively established a collaborative roleplay context, instead sounding like customer-service interactions. The study was criticised for testing only GPT-4o when prior work shows significant inter-model differences in tone sensitivity.
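The "within noise" objection can be checked with a rough back-of-the-envelope calculation. The sketch below assumes each tone condition is a single independent pass over the 50 questions and uses a plain binomial standard error. That is an assumption made for illustration: it does not model the paper's actual design, such as repeated runs or the fact that the same questions are reused across tones.

```python
import math


def binomial_standard_error(accuracy, n_questions):
    """Standard error of an accuracy estimated from n independent questions."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_questions)


# Figures quoted in the summary; the independence assumption is ours.
very_polite = 0.808
very_rude = 0.848
n_questions = 50

se_polite = binomial_standard_error(very_polite, n_questions)
se_rude = binomial_standard_error(very_rude, n_questions)

# Standard error of the difference between two independent estimates.
se_diff = math.sqrt(se_polite**2 + se_rude**2)
spread = very_rude - very_polite
z_score = spread / se_diff

print(f"spread: {spread:.3f}, standard error of difference: {se_diff:.3f}")
print(f"spread in standard errors: {z_score:.2f}")
```

Under these assumptions the spread is well under one standard error, which is the intuition behind the commenter's point. A paired analysis over repeated runs could reach a different conclusion.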
All of human cooking compressed into 2 megabytes
Summary: Epicure is a family of skip-gram ingredient embeddings trained on 4.14 million recipes from 11 sources across seven languages, normalised to 1,790 canonical ingredients via an LLM-augmented pipeline. The paper constructs an NPMI ingredient-ingredient graph and discovers emergent geometric relationships—universal pairings like tomato and beef that transcend cuisine boundaries. The entire embedding family fits in roughly 2 MB, making it practical for on-device food recommendation and recipe generation.
HN Discussion: One commenter corrected the title: the work captures ingredients, not cooking methods or proportions, so “all of human ingredients into 1,800 primitives” would be more accurate. Others noted the seven-language coverage hardly constitutes “all of human cooking,” omitting African, South American, and many Asian cuisines. Links were shared to similar projects compressing recipes into schematic representations and public-domain recipe archives.
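For readers unfamiliar with NPMI (normalised pointwise mutual information), the edge weights of such an ingredient graph can be sketched as follows. This uses the standard NPMI definition, PMI divided by the negative log of the joint probability. The counts are hypothetical and the function is not taken from the Epicure pipeline.

```python
import math


def npmi(count_xy, count_x, count_y, total):
    """Normalised PMI in [-1, 1] from co-occurrence counts over `total` recipes."""
    if count_xy == 0:
        return -1.0  # the pair never co-occurs
    p_xy = count_xy / total
    p_x = count_x / total
    p_y = count_y / total
    if p_xy == 1.0:
        return 0.0  # undefined in this case; 0.0 is a convention for this sketch
    pmi = math.log(p_xy / (p_x * p_y))
    return pmi / -math.log(p_xy)


# Hypothetical counts: 1,000 recipes, tomato in 200, beef in 100, both in 60.
score = npmi(count_xy=60, count_x=200, count_y=100, total=1000)
print(f"NPMI(tomato, beef) = {score:.3f}")
```

A score near 1 means two ingredients almost always appear together, a score near 0 means they co-occur about as often as chance predicts, and a score near -1 means they are rarely or never paired.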
Matrix Multiplications on GPUs Run Faster When Given “Predictable” Data (2024)
Summary: Horace He investigates why GPU matrix multiplications run faster on “predictable” (sequential or patterned) data than on random data, after finding that CUTLASS’s profiler benchmarks used sequential data by default while real-world PyTorch workloads use random data. The performance difference is substantial: CUTLASS showed 288 TFlops with predictable data but only 257 TFlops with random data, revealing a widespread benchmarking blind spot. The article traces the gap to power draw rather than arithmetic special-casing: predictable data flips fewer bits, so the GPU consumes less dynamic power and can sustain higher clocks under its power limit.
HN Discussion: Commenters expected branch prediction to be the cause, but the actual mechanism is more nuanced, involving power throttling driven by how often bits switch. One noted the 88 W idle power consumption of the test GPU as surprisingly high. Others guessed at multiply-by-zero or multiply-by-one shortcuts, which turned out not to be the primary explanation.
Tech Tools & Projects
Creusot helps you prove your Rust code is correct
Summary: Creusot is an open-source verification tool for Rust that translates programs into WhyML, enabling formal proofs of correctness using the Why3 platform. It supports Rust’s ownership model and allows developers to write specifications alongside their code, then mechanically verify that implementations meet those specs. Named after the French industrial town Le Creusot, the project aims to make formal verification more accessible to everyday Rust developers.
HN Discussion: One commenter asked how Creusot differs from Verus, another Rust verification tool, signalling growing interest in the formal verification ecosystem. A developer asked practical questions about applicability to CRUD apps and how to integrate verification into existing codebases. The discussion reflected genuine curiosity about where formal methods deliver the most value versus the cost of adoption.
Ruby vs. Java vs. TypeScript: my experience on building a Cowork DOCX plugin
Summary: The author built the same DOCX plugin for Claude Cowork three times—first in Ruby, then Java, then TypeScript—to compare the developer experience. Java won on library maturity and built-in ZIP/XML support in its standard runtime, but TypeScript was ultimately chosen for potential MCPB support that could reduce binary size by 99%. Bun was used to produce a single executable, though source map integration with PostHog remained an unresolved pain point.
HN Discussion: Older developers noted that Java’s ZIP/XML support stems from its dial-up-era design philosophy where the runtime had to work fully offline. Several commenters suggested Kotlin as a middle ground—Ruby-like expressiveness with JVM maturity—and .NET for its DocumentFormat.OpenXml library. Go and Rust were flagged as surprising omissions from the comparison, and Deno was suggested as an alternative to Bun.
Biff is a command line datetime Swiss army knife
Summary: Biff, by BurntSushi (author of ripgrep), is a CLI tool for datetime arithmetic, parsing, and formatting, built on the Rust Jiff library. Its key design principle is treating civil time and absolute instants as distinct types—avoiding the silent DST ambiguity that plagues most datetime APIs. The tool supports complex operations like sorting files by most-recent-change timestamp across timezones with correct DST handling.
HN Discussion: The author demonstrated sorting repo files by last-changed timestamp as a use case that is surprisingly hard with existing tools. One commenter noted the name collision with the classic BSD “biff” email notification utility. Another praised the civil-time/absolute-instant type distinction, crediting the TC39 Temporal proposal as inspiration.
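The civil-time versus absolute-instant distinction is easiest to see across a DST transition. The sketch below uses Python's standard library rather than Biff or Jiff, so it only illustrates the underlying problem and says nothing about Biff's own commands or output. The chosen date and timezone are our own example.

```python
from datetime import datetime, timedelta, timezone
from zoneinfo import ZoneInfo

NEW_YORK = ZoneInfo("America/New_York")

# Noon on the day before the 2024 US spring-forward transition (10 March).
start = datetime(2024, 3, 9, 12, 0, tzinfo=NEW_YORK)

# Civil arithmetic: the same wall-clock time on the next calendar day.
civil_next_day = start + timedelta(days=1)

# Absolute arithmetic: exactly 24 elapsed hours, computed in UTC.
absolute_24h = (start.astimezone(timezone.utc) + timedelta(hours=24)).astimezone(NEW_YORK)

# Real elapsed time between the two civil timestamps, measured in UTC.
elapsed = civil_next_day.astimezone(timezone.utc) - start.astimezone(timezone.utc)

print("civil +1 day :", civil_next_day.isoformat())
print("absolute +24h:", absolute_24h.isoformat())
print("elapsed between civil timestamps:", elapsed)
```

The two results differ by an hour because the civil day spanning the transition is only 23 hours long. An API that keeps the two notions as separate types forces the caller to say which one is meant.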
Libwce: The entropy layer of a wavelet codec, on its own
Summary: Libwce is a 500-line Rust library implementing only the entropy coding layer of a wavelet image codec, stripping away all the framing, profile parsers, and boilerplate found in full codecs like JPEG 2000. It uses a patent-clean Bit-Plane Count approach inspired by JPEG XS, with no external dependencies beyond stdlib. The author wrote it to make the entropy coding step—the conversion of wavelet coefficients to bits—understandable and inspectable in isolation.
HN Discussion: One commenter noted the author’s aside that the demo decodes every test case without crashes, remarking on how LLM-generated code often treats crashes as acceptable. Discussion was sparse but appreciative of the educational clarity of the single-file approach.
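To give a feel for what a bit-plane-count entropy layer does, here is a toy sketch of the general idea: coefficients are coded in small groups, and each group spends only as many magnitude bits as its largest coefficient needs. The group size of four, the 4-bit header, and the bit layout are all assumptions made for illustration. This is neither libwce's format nor the JPEG XS bitstream.

```python
def bit_plane_count(group):
    """Number of bit planes needed for the largest magnitude in the group."""
    return max(abs(c) for c in group).bit_length()


def encode_group(group):
    """Encode one group as a bit string: header, magnitudes, then signs."""
    planes = bit_plane_count(group)
    if planes > 15:
        raise ValueError("toy 4-bit header supports at most 15 bit planes")
    bits = format(planes, "04b")
    if planes:
        for c in group:
            bits += format(abs(c), f"0{planes}b")
    # A sign bit is only spent on non-zero coefficients.
    for c in group:
        if c != 0:
            bits += "1" if c < 0 else "0"
    return bits


def decode_group(bits, group_size=4):
    """Invert encode_group for a group of known size."""
    planes = int(bits[:4], 2)
    pos = 4
    magnitudes = []
    for _ in range(group_size):
        magnitudes.append(int(bits[pos:pos + planes], 2) if planes else 0)
        pos += planes
    decoded = []
    for magnitude in magnitudes:
        if magnitude:
            negative = bits[pos] == "1"
            pos += 1
            decoded.append(-magnitude if negative else magnitude)
        else:
            decoded.append(0)
    return decoded


# Hypothetical quantised wavelet coefficients.
busy_group = [3, -1, 0, 2]
flat_group = [0, 0, 0, 0]
print(len(encode_group(busy_group)), "bits for", busy_group)
print(len(encode_group(flat_group)), "bits for", flat_group)
```

Groups of all-zero coefficients, which are common after a wavelet transform and quantisation, cost only the header in this scheme. That is where most of the compression comes from.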
Mini Micro Fantasy Computer
Summary: Mini Micro is a “fantasy computer”: a self-contained, simulated 8-bit-style machine that runs MiniScript, a lightweight scripting language designed for learning and game development. It provides a curated environment with built-in graphics, sound, and file APIs, abstracting away OS complexity while preserving the feel of retro computing. It can be played in the browser or downloaded as a desktop app for Windows, Mac, and Linux.
HN Discussion: Commenters compared it to PICO-8 and Picotron as alternative fantasy consoles, noting MiniScript feels similar to Lua. One developer wished for a bare-metal ESP32 or Raspberry Pi version to recapture the feeling of fully controlling the hardware. Criticism focused on the choice of C++ over C and CMake over POSIX make for the underlying implementation.
Web & Infrastructure
DuckDuckGo search saw 28% more visits after Google said people love AI mode
Summary: DuckDuckGo’s AI-free search page saw visits increase 22.7% week-on-week, peaking at 27.7% on May 24, following Google’s public insistence that users love its AI mode. The DuckDuckGo mobile app saw US installs spike 18.1% on average, with growth sustained over six days and peaking at 30.5%. The surge suggests a meaningful backlash against AI-infused search among users seeking traditional link-based results.
HN Discussion: Multiple commenters reported non-technical friends actively seeking Google alternatives for the first time, driven by frustration with AI being pushed into search. One argued that people who want AI answers will use a chat app directly, making AI-in-search a category error. A contrarian view held that Google’s AI mode is actually convenient for quick questions via the address bar, provided it is fast enough.
Incident with Pull Requests, Issues, Git Operations and API Requests
Summary: GitHub experienced a major incident affecting Pull Requests, Issues, Git Operations, and API Requests—the latest in a string of reliability problems during what commenters called “an impressively bad month.” Particularly concerning was that PRs on both the web UI and API were not consistently reflecting all commits or branch changes, risking incomplete code reviews. The incident followed another major outage from just days earlier, compounding developer frustration.
HN Discussion: A tongue-in-cheek “revert GitHub to June 2018” PR was suggested alongside calls to tie executive bonuses to three-nines availability. One developer linked to isgithubcooked.com, a tracker showing escalating incident frequency. Multiple commenters expressed worry that inconsistent PR diffs could lead to merging code without seeing the full changes.
Last.fm is now independent
Summary: Last.fm announced it has become an independent company following a change in ownership from CBS/Paramount, which acquired it in 2007. User accounts, scrobble history, data, privacy settings, and Pro subscriptions all remain unchanged; the same team continues to operate the service. The independence is positioned as allowing Last.fm to focus fully on building listening insights and community features.
HN Discussion: Long-time users shared deeply personal stories—meeting partners through the platform and maintaining over 20 years of continuous scrobbling history. One commenter noted Last.fm has been technically superseded by Spotify’s recommendations but retains unique value as a cross-platform listening record. Community-built visualisation tools like lastfmviz.netlify.app were highlighted as examples of the platform’s enduring ecosystem.
History & Science
Boston and Bermuda
Summary: An aviation writer reflects on the golden age of Boston-to-Bermuda flights in the late 1970s, when the route was surprisingly popular among working-class families in the Boston area. Bermuda sits only 650 miles off the Carolinas at roughly Atlanta’s latitude—far closer than most people assume—and its proximity made it an accessible vacation destination. The essay weaves personal childhood memories with the broader history of regional airlines that once served the Boston-Bermuda corridor.
HN Discussion: The nostalgic tone and vintage family photo drew appreciation from readers who remembered the era of accessible regional air travel. Light discussion with few contentious themes; the piece resonated as a charming aviation memoir.
More Whimsical OEIS Sequences
Summary: Jeremy Kun surveys whimsical OEIS sequences, including one inspired by XKCD 2016 that sorts integers by their pixel width when printed in Helvetica; the sequence was published just two days after the comic. Other highlights include A366192, “Peter’s List: Fractions nobody needs,” and the “screaming sequence” A325911, whose hex representation is all Fs. The post celebrates the quirky, human side of the On-Line Encyclopedia of Integer Sequences.
HN Discussion: One commenter pointed readers to the edit history of the “nonsense sequence” A133451 as particularly entertaining. The XKCD comic that inspired the Helvetica-width sequence was linked and appreciated. Discussion was light and playful, matching the whimsical tone of the article.
Rapira (Рапира) – Soviet programming language interpreter
Summary: Rapira was a Soviet educational programming language developed in the 1980s, featuring Russian keywords, a syntax influenced by SETL, and dynamic typing of the kind later popularised by Python. The language supported compound data types, including sets and records (associative arrays), and was designed as a more capable alternative to BASIC for teaching programming in Soviet schools. A companion language, Robic, was similar to Logo but featured multiple actors (a Train, an Ant, a Painter) rather than a single turtle.
HN Discussion: One commenter compared the syntax to “Pascal in Cyrillic,” demonstrating how the constructs map to Western languages. Another noted that 1C (1S), a Cyrillic programming language still in wide use in Russia for ERP systems, has a special keyboard layout designed for it. A Russian-speaking developer provided historical context on how Rapira’s design compared to contemporary Western educational languages.
Business & Industry
EU fines Temu €200M for allowing sale of illegal products
Summary: The EU imposed a €200 million fine on Temu for facilitating the sale of illegal products, including unsafe chargers that failed basic electrical safety tests and baby toys containing chemicals above legal limits. The fine is one of the largest imposed under the EU’s Digital Services Act framework, targeting platform accountability for third-party seller compliance. Temu was found to have inadequate systems for detecting and removing non-compliant products from its marketplace.
HN Discussion: Commenters noted that Amazon likely has similar issues with unsafe third-party products, questioning whether enforcement is selective. One argued the EU’s approach is regulatory whack-a-mole that cannot test its way to quality on Chinese imports. Others pointed out Temu fills a genuine need in parts of Europe where local intermediaries sell the same Chinese goods at much higher margins.
New York Passes Tax on the Ultra-Wealthy
Summary: New York passed the “pied-à-terre tax” on non-primary residences valued at $1 million or more, expected to raise $500 million to help close the city’s budget gap. The tax will more than double property taxes for many luxury second-home owners, though NYC’s antiquated assessment system often values properties at just 10% or less of market value, reducing the effective burden. The actual policy is a targeted second-home surcharge, not a broad wealth tax, despite the editorialised headline.
HN Discussion: Commenters corrected the editorialised title, noting this is a second-home tax, not a general wealth tax. One suggested a self-assessment mechanism where owners declare their property value and the government can choose to buy it at that price. Others debated whether the tax would create housing-market liquidity or simply function as a revenue source.
We replaced Zendesk
Summary: TradeCore, a CRM provider for FX/CFD brokers, claims to have replaced Zendesk with a custom solution built in 48 hours using AI-assisted development. The company argues that for organisations already running an internal CRM, the Zendesk integration overhead means they have already done most of the work needed to build their own support ticketing. The post positions this as evidence that AI-assisted development has significantly lowered the barrier to replacing SaaS platforms.
HN Discussion: One commenter warned that vibe-coded solutions work at 48 hours but the real test is what happens at 72 hours and beyond. Another described Zendesk’s post-private-equity behaviour as aggressive—pretending not to receive cancellation notices to lock in another contract year. A support automation veteran noted Zendesk has little real moat, being essentially a CRUD app with features most customers never use.
Private equity bought America’s essential services
Summary: The article opens with a fatal Chicago fire where a malfunctioning aerial ladder—traced to PE-owned maintenance cuts—delayed rescue by one minute, contributing to four deaths including a pregnant woman and her five-year-old. It traces how the $9.4 trillion private equity industry controls roughly 11,500 companies and 11 million jobs, systematically extracting profit from essential services like fire equipment maintenance, healthcare, and housing. The PE business model relies on leveraged buyouts, cost-cutting, and dividend recaps that transfer value from service quality to investor returns.
HN Discussion: One commenter drew a parallel to Crassus’s Roman fire brigade, which would let buildings burn until the owner sold cheaply. Another noted the irony that pension funds are major PE investors, meaning current living standards are being stripped to fund retiree cheques. The strip-mining of social capital—PE gobbling up mom-and-pop businesses—was called out as the most under-discussed consequence.
Other
Show HN: Continue? Y/N: A 60-second game about AI agent permission fatigue
Summary: A browser-based 60-second game that simulates the experience of approving or denying AI agent permission prompts, testing how carefully users read commands before clicking approve. Players receive a rapid stream of permission requests and must decide Y/N, with scoring that rewards security-conscious decisions. The game highlights the real-world problem of permission fatigue when working with AI coding agents and tools.
HN Discussion: Multiple players discovered you can “cheat” by denying everything and still get a top security score, exposing a flaw in the game’s incentive design. Commenters compared it to Papers, Please and suggested grouping permission requests into realistic batches to better simulate how fatigue builds. One user noted the questions jump context too much to be representative of real agent workflows.
The Permanent Upper Crow
Summary: An interactive browser game uses a crow metaphor to illustrate wealth inequality and conspicuous consumption—players must purchase increasingly expensive status symbols to ascend. The game loops in a cycle of earning and spending, satirising the treadmill of consumerism where upgrades never truly satisfy. Minimalist design with hand-drawn crow characters and escalating economic absurdity rounds out the experience.
HN Discussion: One player found they could “escape suffering” by simply not buying the top hat, drawing parallels to real-world conspicuous consumption traps. Another commenter extended the analogy to debt financing and insurance, noting how borrowed money amplifies the consumption treadmill. The pun “CAWn’t believe how hard this hits” captured the thread’s appreciation for the satire.
I analysed 20 years of my chats
Summary: The author processed 1.2 million chat messages spanning 20 years into a structured Obsidian vault, using LLM-driven sentiment analysis to map emotional bandwidth, endearment cycles, and friendship half-lives. Key findings included Dunbar-number-style tiers (15 close friends, 50 regular contacts, 150 active acquaintances) and the discovery that most friendships have a predictable decay curve. The project was partly inspired by WaitButWhy’s “Your Life in Weeks” grid, aiming to make sense of relationships beyond simple biometric tracking.
HN Discussion: Some commenters shared deeply personal reactions—having zero close friends themselves—and reflected on how modern life makes it simultaneously easier and harder to maintain friendships. Others debated whether to keep or delete chat history, with some archiving everything since 2001 and others auto-deleting after 6 months. A recurring theme was the nostalgic value of old IRC and MSN Messenger logs, even when half the contacts are no longer identifiable.
My new obsession: A horse-racing board game of pure luck
Summary: The author describes a horse-racing board game with no player agency—participants are dealt cards, horses move based on random draws, and betting is mechanically determined. The game has been released under many different names (Dubble Kross, Horse Race Game, etc.) with no clear original author, suggesting it may be a folk game in the public domain. Despite having zero skill involved, the game is compulsively playable, combining the relaxation of following a fixed algorithm with the social fun of communal fortune-reading.
HN Discussion: Commenters compared it to character creation in the Traveller RPG and to fortune-telling with tea leaves, where the fun comes from humans automatically assigning narrative meaning to random outcomes. Ready Set Bet was recommended as a more accessible alternative with real-time betting mechanics. One recalled mechanical arcade horse-racing games with 20-plus seats around an announcer.