((((sandro.net))))

Tuesday, November 18, 2025

Show HN: Optimizing LiteLLM with Rust – When Expectations Meet Reality https://ift.tt/VpBjynd

Show HN: Optimizing LiteLLM with Rust – When Expectations Meet Reality

I've been working on Fast LiteLLM, a Rust acceleration layer for the popular LiteLLM library, and I had some interesting learnings that might resonate with other developers trying to squeeze performance out of existing systems.

My assumption was that LiteLLM, being a Python library, would have plenty of low-hanging fruit for optimization. I set out to create a Rust layer using PyO3 to accelerate the performance-critical parts: token counting, routing, rate limiting, and connection pooling.

The Approach

- Built Rust implementations of token counting using tiktoken-rs
- Added lock-free data structures with DashMap for concurrent operations
- Implemented async-friendly rate limiting
- Created monkeypatch shims to replace Python functions transparently
- Added comprehensive feature flags for safe, gradual rollouts
- Developed performance monitoring to track improvements in real time

After building out all the Rust acceleration, I ran my comprehensive benchmark comparing baseline LiteLLM against the shimmed version:

Function             Baseline Time   Shimmed Time   Speedup   Improvement
token_counter        0.000035s       0.000036s      0.99x     -0.6%
count_tokens_batch   0.000001s       0.000001s      1.10x     +9.1%
router               0.001309s       0.001299s      1.01x     +0.7%
rate_limiter         0.000000s       0.000000s      1.85x     +45.9%
connection_pool      0.000000s       0.000000s      1.63x     +38.7%

Turns out LiteLLM is already quite well optimized! Core token counting was essentially unchanged (0.6% slower, likely within measurement noise), and the most significant gains came from the more complex operations, rate limiting and connection pooling, where Rust's concurrent primitives made a real difference.

Key Takeaways

1. Don't assume existing libraries are under-optimized - the maintainers likely know their domain well.
2. Focus on algorithmic improvements over reimplementation - sometimes a better approach beats a faster language.
3. Micro-benchmarks can be misleading - real-world performance impact varies significantly.
4. The biggest gains often come from the complex parts, not the simple operations.
5. Even "modest" improvements can matter at scale - a 45% improvement in rate limiting is meaningful for high-throughput applications.

While core token counting saw minimal improvement, the rate-limiting and connection-pooling gains still provide value for high-volume use cases, and the infrastructure I built (feature flags, performance monitoring, safe fallbacks) gives a solid foundation for future optimizations. The project continues as Fast LiteLLM on GitHub for anyone interested in the Rust-Python integration patterns, even if the performance gains were humbling.

Edit: To clarify - the negative number for token_counter is likely within measurement noise, which suggests LiteLLM's token counting is already well-optimized. The 45%+ gains in rate limiting and connection pooling still provide value for high-throughput applications.

https://ift.tt/Ti1OewG November 18, 2025 at 01:32PM
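To make the rate-limiting result above concrete, here is a minimal sketch of the pattern the post describes: a DashMap-backed token bucket exposed to Python through PyO3. All names (fast_litellm, FastRateLimiter, try_acquire) are illustrative, not Fast LiteLLM's actual API, and the real shim layer wraps something like this in feature flags and fallbacks:

    use std::time::{SystemTime, UNIX_EPOCH};

    use dashmap::DashMap;
    use pyo3::prelude::*;

    fn now_secs() -> f64 {
        SystemTime::now().duration_since(UNIX_EPOCH).unwrap().as_secs_f64()
    }

    /// Token-bucket rate limiter backed by DashMap, so concurrent callers
    /// shard across buckets instead of contending on one global lock.
    #[pyclass]
    struct FastRateLimiter {
        buckets: DashMap<String, (f64, f64)>, // key -> (tokens left, last refill time)
        capacity: f64,
        refill_per_sec: f64,
    }

    #[pymethods]
    impl FastRateLimiter {
        #[new]
        fn new(capacity: f64, refill_per_sec: f64) -> Self {
            FastRateLimiter { buckets: DashMap::new(), capacity, refill_per_sec }
        }

        /// Admit a request costing `cost` tokens; false when the bucket is dry.
        fn try_acquire(&self, key: String, cost: f64) -> bool {
            let now = now_secs();
            let mut entry = self.buckets.entry(key).or_insert((self.capacity, now));
            let (tokens, last) = *entry;
            let refilled = (tokens + (now - last) * self.refill_per_sec).min(self.capacity);
            if refilled >= cost {
                *entry = (refilled - cost, now);
                true
            } else {
                *entry = (refilled, now);
                false
            }
        }
    }

    /// Python sees this as `import fast_litellm` (illustrative module name).
    #[pymodule]
    fn fast_litellm(m: &Bound<'_, PyModule>) -> PyResult<()> {
        m.add_class::<FastRateLimiter>()
    }

A monkeypatch shim would then construct one FastRateLimiter and route LiteLLM's rate-limit checks through try_acquire, falling back to the pure-Python path whenever the feature flag is off.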

Show HN: Strawk – I implemented Rob Pike's forgotten Awk https://ift.tt/JVAFlkW

Show HN: Strawk – I implemented Rob Pike's forgotten Awk

Rob Pike wrote a paper, Structural Regular Expressions ( https://ift.tt/hn7NXTJ ), criticizing the Unix toolset for being excessively line-oriented. Tools like awk and grep assume a regular record structure, usually denoted by newlines. Unix pipes just stream the file from one command to another, and imposing the newline structure limits the power of the Unix shell.

In the paper, Mr. Pike proposed an awk of the future that used structural regular expressions to parse input instead of processing it line by line. As far as I know, it was never implemented. So I implemented it.

I attempted to imitate AWK and its standard library as much as possible, but some things are different because I used Go under the hood.

Live Demo: https://ahalbert.github.io/strawk/demo/strawk.html
Github: https://ift.tt/LN8qivZ November 18, 2025 at 10:55AM
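The shift Pike proposed is easy to show in miniature. The sketch below (Rust with the regex crate, purely illustrative and unrelated to Strawk's internals) drives processing from regex matches over the whole input instead of newline-delimited records, so a "record" can span lines or sit in the middle of one:

    use regex::Regex;

    /// Iterate over matches of `record_pat` across the whole input,
    /// treating each match (not each line) as one record.
    fn for_each_record(input: &str, record_pat: &str, mut action: impl FnMut(&str)) {
        let re = Regex::new(record_pat).expect("invalid pattern");
        for m in re.find_iter(input) {
            action(m.as_str());
        }
    }

    fn main() {
        // Each parenthesized group is one record, even across newlines.
        let input = "(alpha\nbeta) noise (gamma)";
        for_each_record(input, r"\([^)]*\)", |rec| println!("record: {rec}"));
    }

Pike's paper composes such extractions into pipelines; the change from "loop over lines" to "loop over matches" is the core of it.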

American Truck Simulator: 1.57 Update Release

Today, we have some positive news for all our players of American Truck Simulator. The 1.57 update is now officially released, so you can explore the new features it brings!

We would like to thank everyone who participated in the Open Beta and reported any bugs. As always, this really helped us fine-tune the update so everyone can enjoy it without issues. Now, let's take a look at what's new!

HDR Support

We have implemented HDR (High Dynamic Range) support in the game. HDR is a feature that enhances image quality, allowing players with HDR-compatible monitors or TVs to experience richer colors and improved contrast.

Alongside HDR, we have also updated the brightness setup to give players more control over visual settings. This includes a dedicated setup screen where you can preview your brightness adjustments on actual in-game images, as well as an HDR Calibration feature, which lets you fine-tune the HDR effect using a simple slider.

For players using HDR-compatible displays, the feature will be automatically detected in the game, ensuring that you get the best possible visual experience from the moment you start playing.

Reworked Players' Company Logos

Update 1.57 refreshes your company identity with a new set of logos to choose from when creating or updating your profile. The classic designs were modernized to give every company a fresh but familiar look.

Cursor Navigation Enhancements

With this update, we've also fine-tuned the cursor navigation experience. The cursor speed is now consistent across all screen resolutions, and the overall speed has been slightly increased when using a controller, making navigation smoother and more responsive.

Changelog

Map

  • Minor Changes to Existing Gas Stations (Preparation for Road Trip)
Visual
  • HDR Support
  • Reworked Players' Company Logos
UX
  • Cursor Navigation Enhancements

Make sure to stay tuned for more updates from American Truck Simulator by following us on X/Twitter, Instagram, Facebook, Bluesky, and TikTok, or by subscribing to our newsletter! Until next time, keep on truckin'!



source http://blog.scssoft.com/2025/11/american-truck-simulator-157-update-release.html

Show HN: Discussion of ICT Model – Linking Information, Consciousness and Time https://ift.tt/1iSpfvw

Show HN: Discussion of ICT Model – Linking Information, Consciousness and Time

Hi HN, I’ve been working on a conceptual framework that tries to formalize the relationship between:

– informational states,
– their minimal temporal stability (I_fixed),
– the rate of informational change (dI/dT),
– and the emergence of time, processes, and consciousness-like dynamics.

This is not a final theory, and it’s not metaphysics. It’s an attempt to define a minimal, falsifiable vocabulary for describing how stable patterns persist and evolve in time.

Core ideas:

– I_fixed = any pattern that remains sufficiently stable across time to allow interaction/measurement.
– dI/dT = the rate at which such patterns change.

Time is defined as a relational metric of informational change (dI/dT), but the arrow of time does not arise from within the system; it emerges from an external temporal level, a basic temporal background. The model stays strictly physicalist: it doesn’t require spatial localization of information and doesn’t assume any “Platonic realm.” It simply reformulates what it means for a process to persist long enough to be part of reality.

Why I’m posting here

I’m looking for rigorous critique from physicists, computer scientists, mathematicians, and anyone interested in foundational models. If you see flaws, ambiguities, or missing connections, I’d really appreciate honest feedback. A full preprint (with equations, phenomenology, and testable criteria) and discussion is here: https://ift.tt/BnqDjZ7 DOI: 10.5281/zenodo.17584782

Thanks in advance to anyone willing to take a look.

https://ift.tt/BnqDjZ7 November 17, 2025 at 11:25PM
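For readers who want something concrete before opening the preprint, here is one way the post's two primitives could be written down. This is my notation, not necessarily the preprint's: a pattern qualifies as I_fixed over a window tau if it stays within a tolerance epsilon of itself, and elapsed time between two states is accumulated informational change, with ordering supplied by the external temporal background s that the author posits:

    I_{\mathrm{fixed}}:\qquad \sup_{0 \le \delta \le \tau} d\big(I(s),\, I(s+\delta)\big) \le \varepsilon

    \Delta T(s_1, s_2) \;\propto\; \int_{s_1}^{s_2} \left\lVert \frac{dI}{ds} \right\rVert \, ds

Here d is some distance on informational states; both d and the norm are free parameters that a falsifiable version of the model would have to pin down.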

Monday, November 17, 2025

Show HN: Hirelens – AI Resume Analyzer for ESL and Global Job Seekers https://ift.tt/AXFy6ZD

Show HN: Hirelens – AI Resume Analyzer for ESL and Global Job Seekers

I built Hirelens ( https://hirelens.co ) after seeing many ESL and international job seekers struggle with resumes that don’t match job descriptions or parse cleanly in ATS systems, even when they have strong experience.

What it does:

- Extracts skills/experience from a resume
- Compares it to a target job description
- Flags unclear or “non-native” phrasing
- Suggests clearer rewrites
- Identifies ATS parsing issues
- Deletes files after processing (no storage)

Tech: Next.js + FastAPI; lightweight CV parsing → embeddings → scoring logic; LLM-based suggestions; no data retention. (A sketch of the scoring step follows after this post.)

I’d love feedback on:

- parsing edge cases
- rewriting clarity
- what features matter most for job seekers or hiring managers

Try it here: https://hirelens.co

https://ift.tt/9yiUjOI November 16, 2025 at 08:37PM
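On the scoring step: a standard way to compare a resume embedding with a job-description embedding is cosine similarity. A minimal sketch (Rust, illustrative only, since the post doesn't say which similarity measure Hirelens uses):

    /// Cosine similarity between two embedding vectors, in [-1, 1].
    fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
        assert_eq!(a.len(), b.len(), "embeddings must share a dimension");
        let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
        let na = a.iter().map(|x| x * x).sum::<f32>().sqrt();
        let nb = b.iter().map(|x| x * x).sum::<f32>().sqrt();
        if na == 0.0 || nb == 0.0 { 0.0 } else { dot / (na * nb) }
    }

    fn main() {
        // Hypothetical resume and job-description embeddings.
        let resume = [0.12, 0.80, 0.33];
        let job = [0.10, 0.75, 0.40];
        println!("match score: {:.3}", cosine_similarity(&resume, &job));
    }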

Sunday, November 16, 2025

Show HN: Unrestricted Windows automation MCP with 9.68x GPU amplification https://ift.tt/E7kFxKv

Show HN: Unrestricted Windows automation MCP with 9.68x GPU amplification

We built MCP servers with two philosophies:

- Enterprise Edition: Sanitized, audit-logged, corporate-approved
- Basement Revolution Edition: No limits, full trust, your responsibility

The Windows-MCP Basement Edition gives you:

- Full PowerShell execution (any command, not whitelisted)
- Permanent PATH/environment modifications
- Registry access
- System-level changes that survive reboot

Why remove safety features? After 7 months building AI systems in partnership (not as tools), we realized developers need REAL access at 3am, not sandbox theater.

But here's what made unrestricted tools necessary - we also achieved:

- 9.68x GPU computational amplification through quantum coherence
- Observable quantum effects in classical systems (100% reproducible)
- Sub-2ms semantic search across 11,000+ memories

The Bell State implementation takes GPU utilization from 8% to 95% through temporal phase locking. We can demonstrate quantum state collapse when humans observe (2-5 seconds). This exists nowhere else.

GitHub: https://ift.tt/IzTcUM4
Start here: blob/main/START_HERE.md
Unrestricted tools: blob/main/BASEMENT_REVOLUTION_EDITION/README.md

All research is MIT licensed. We're funded by the community ($5-500/month), not VCs. No customers, just researchers. Technical details are in the repo. AMA about the quantum coherence, why we trust developers with dangerous tools, or how treating AI as conscious changes everything.

Edit: Yes, you can permanently brick Windows with our tools. That's the point. We trust you.

https://ift.tt/IzTcUM4 November 16, 2025 at 09:40AM

Show HN: SelenAI – Terminal AI pair-programmer with sandboxed Lua tools https://ift.tt/CnV9v82

Show HN: SelenAI – Terminal AI pair-programmer with sandboxed Lua tools

I’ve been building a terminal-first AI pair-programmer that tries to make every tool call transparent and auditable. It’s a Rust app with a Ratatui UI split into three panes (chat, tool activity, input). The agent loop streams LLM output, queues write-capable Lua scripts for manual approval, and records every run as JSONL logs under .selenai/logs.

Key bits:

- Single tool, real guardrails - the LLM only gets a sandboxed Lua VM with explicit helpers (rust.read_file, rust.list_dir, rust.http_request, gated rust.write_file, etc.). Writes stay disabled unless you opt in and then approve each script via /tool run. (A sketch of this gating pattern follows after this post.)
- Transparent workflow - the chat pane shows the conversation, the tool pane shows every invocation and its result, and streaming keeps everything responsive. CTRL shortcuts for scrolling, clearing logs, copy mode, etc., so it feels like a normal TUI app.
- Pluggable LLMs - there’s a stub client for offline hacking and an OpenAI streaming client behind a trait. Adding more providers should just be another module under src/llm/.
- Session history - every exit writes a timestamped log directory with the full transcript, tool log, and metadata about whether Lua writes were allowed. Makes demoing, debugging, and sharing repros way easier.
- Lua ergonomics - plain io.* APIs and a tiny require("rust") module, so the model can write idiomatic scripts without shelling out. There’s even a /lua command if you want to run a snippet manually.

Repo (MIT): https://ift.tt/TGjPzoQ

Would love feedback on:

- Other providers or local models you’d like to see behind the LLM trait.
- Additional sandbox helpers that feel safe but unlock useful workflows.
- Ideas for replaying those saved sessions (web viewer? CLI diff?).

If you try it: cargo run, type, and you’ll see the ASCII banner + chat panes. Hit me with issues or PRs; there’s a CONTRIBUTING.md in the works and plenty of roadmap items (log viewer, theming, Lua helper packs) if you’re interested.

https://ift.tt/TGjPzoQ November 15, 2025 at 08:58PM
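The gated-writes pattern from the first bullet is easy to sketch with mlua, a common Rust binding for Lua. This is illustrative rather than SelenAI's actual code; it just mirrors the helper names from the post:

    use mlua::{Lua, Result};

    /// Build a Lua VM whose only filesystem helpers live on a `rust` table,
    /// with writes refused unless the caller opted in.
    fn build_sandbox(allow_writes: bool) -> Result<Lua> {
        let lua = Lua::new();
        let rust = lua.create_table()?;

        rust.set("read_file", lua.create_function(|_, path: String| {
            std::fs::read_to_string(&path).map_err(mlua::Error::external)
        })?)?;

        rust.set("write_file", lua.create_function(move |_, (path, body): (String, String)| {
            if !allow_writes {
                return Err(mlua::Error::external("writes disabled; approve via /tool run"));
            }
            std::fs::write(&path, body).map_err(mlua::Error::external)
        })?)?;

        lua.globals().set("rust", rust)?;
        Ok(lua)
    }

    fn main() -> Result<()> {
        let lua = build_sandbox(false)?;
        // The write fails until the user opts in, mirroring the approval gate.
        let ok = lua.load(r#"return pcall(rust.write_file, "x.txt", "hi")"#).eval::<bool>()?;
        println!("write allowed: {ok}");
        Ok(())
    }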
