((((sandro.net))))

Thursday, April 3, 2025

Show HN: All Books, All Languages (ABAL) https://ift.tt/Bbd1sJE

"All Books, All Languages", or ABAL for short, is a modern, web-based, parallel-text language-learning application that I've been working on as a side project for about a year. I've already received a lot of great feedback on the application from language learners.

In addition to what I hope is an intuitive UI/UX, one of the unique things about ABAL is the way that pronunciations are customized to a user's native and target languages. Systems like IPA (https://ift.tt/FpOz75I ...) don't make pronunciation any easier to get right for most, if not all, language learners. ABAL solves this by showing phonetic hints that are written in the user's native language script and are meant to be pronounced in the native language's voice.

The name All Books, All Languages is intentionally quixotic, as it serves to set the ambitious and unreachable goal of hosting all of humanity's written works while making them available to all language learners and readers across the globe! Today, it hosts a number of generated short stories categorized by subject and CEFR level. I'm currently working on supporting (private) self-uploaded files and works in the public domain. Work is also going into adding support for more languages, while evaluating and optimizing cross-language translation accuracy.

It currently supports 56 languages (including Klingon and Latin!), which comes out to 3080 permutations of native and target language (56 × 55), or 3080 unique user profiles, as I like to think of it. Both the marketing and application sites are fully internationalized; even the messages in 4xx responses go through i18n. Quite a bit of work went into automating the i18n diff generation system to make sure I can release UI/UX changes quickly.

I hope you'll find it useful if you're learning a new language or need to practice one you already know. If you're not learning a new language, but wish you could read classics in a language you're fluent in, stay tuned!

Marketing site: https://www.abal.ai
Application site: https://read.abal.ai
X for updates: @abal_ai
https://www.abal.ai April 3, 2025 at 12:04AM
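
The 3080 figure is the number of ordered (native, target) pairs of distinct languages. A minimal sketch of that arithmetic follows; the four language codes are an illustrative stand-in, not ABAL's actual language list, and `count_profiles` is a hypothetical helper name.

```python
from itertools import permutations

def count_profiles(num_languages: int) -> int:
    """Count ordered (native, target) pairs where the two languages differ."""
    return num_languages * (num_languages - 1)

# Illustrative subset only; not ABAL's actual language list.
sample = ["en", "pt", "ja", "la"]
pairs = list(permutations(sample, 2))

print(len(pairs))          # 12 ordered pairs for 4 languages
print(count_profiles(56))  # 3080
```

Order matters here: ("en", "pt") and ("pt", "en") are different learners, which is why the count is 56 × 55 rather than half of that.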

Show HN: OCR pipeline for ML training (tables, diagrams, math, multilingual) https://ift.tt/z6prSuN

Hi HN, I’ve been working on an OCR pipeline specifically optimized for machine learning dataset preparation. It’s designed to process complex academic materials, including math formulas, tables, figures, and multilingual text, and output clean, structured formats like JSON and Markdown.

Some features:
• Multi-stage OCR combining DocLayout-YOLO, Google Vision, MathPix, and Gemini Pro Vision
• Extracts and understands diagrams, tables, LaTeX-style math, and multilingual text (Japanese/Korean/English)
• Highly tuned for ML training pipelines, including dataset generation and preprocessing for RAG or fine-tuning tasks

Sample outputs and real exam-based examples are included (EJU Biology, UTokyo Math, etc.). Would love to hear any feedback or ideas for improvement.

GitHub: https://ift.tt/CZqAnXe
April 3, 2025 at 02:48AM
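
To make the "multi-stage, structured output" idea concrete, here is a rough sketch of one way such a pipeline could route detected regions to type-specific recognizers and emit JSON. It is not the project's code: `Region`, `RECOGNIZERS`, `process_page`, and the stub functions are all hypothetical, and the stubs only tag their input where a real pipeline would call an OCR, math, or vision backend.

```python
import json
from dataclasses import dataclass

@dataclass
class Region:
    kind: str     # assumed layout label: "text", "math", or "table"
    content: str  # stands in for an image crop; a plain string in this sketch

# Hypothetical recognizer stubs: each returns a structured record.
def recognize_text(region: Region) -> dict:
    return {"type": "text", "text": region.content}

def recognize_math(region: Region) -> dict:
    return {"type": "math", "latex": region.content}

def parse_table(region: Region) -> dict:
    rows = [[cell.strip() for cell in line.split("|")]
            for line in region.content.splitlines()]
    return {"type": "table", "rows": rows}

RECOGNIZERS = {"text": recognize_text, "math": recognize_math, "table": parse_table}

def process_page(regions: list[Region]) -> list[dict]:
    """Dispatch each region to the recognizer registered for its kind."""
    return [RECOGNIZERS[r.kind](r) for r in regions]

page = [
    Region("text", "細胞の構造"),
    Region("math", r"\int_0^1 x^2\,dx"),
    Region("table", "organelle | function\nnucleus | stores DNA"),
]
records = process_page(page)
# ensure_ascii=False keeps Japanese/Korean text readable in the JSON output.
print(json.dumps(records, ensure_ascii=False, indent=2))
```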

Wednesday, April 2, 2025

Show HN: Await-Tree – Visualize Async Rust Task Execution in Real-Time https://ift.tt/LHydbP7

https://ift.tt/srzuKhZ April 2, 2025 at 05:46AM

Show HN: I made Confetti: a configuration language file format https://ift.tt/OA7YLig

Hello everyone, I created Confetti: a simple, typeless, and localization-friendly configuration language designed for human-editable configuration files. In my opinion, JSON works well for data interchange, but it's overused for configuration, it's not localization-friendly, and it's too syntactically noisy. INI is simple but lacks hierarchical structures and doesn't have a formal specification. Confetti is intended to bridge the gap. I aim to keep Confetti simple and minimalistic, while encouraging others to extend it. Think of it like Markdown for configuration files: there's a core specification, but you're welcome to create your own variations that suit your needs. https://ift.tt/xT8NopY March 31, 2025 at 09:34AM
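
The JSON-versus-INI trade-off described above can be seen with Python's standard library. This sketch does not show Confetti's own syntax, since the post does not include any; the config keys and the `server.tls` section name are made up for illustration.

```python
import configparser
import json

# JSON nests naturally, at the cost of quotes, braces, and commas.
json_text = '{"server": {"host": "example.org", "tls": {"enabled": true, "port": 443}}}'
config = json.loads(json_text)
tls_port = config["server"]["tls"]["port"]

# INI is flat: one level of sections holding key/value strings.
ini_text = """
[server]
host = example.org

[server.tls]
enabled = true
port = 443
"""
parser = configparser.ConfigParser()
parser.read_string(ini_text)

# "server.tls" is only a section name; the dot is a naming convention,
# not real nesting, so the parser sees two unrelated sections.
print(parser.sections())
ini_port = parser.getint("server.tls", "port")
print(tls_port == ini_port)
```

Note that `configparser` stores every value as a string and leaves conversion to the caller (`getint` here), which is one reading of what "typeless" means for a configuration format.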

Show HN: I vibecoded a 35k LoC recipe app https://ift.tt/GQrpgZs

Over the last 2-3 weeks, I vibecoded the recipe app that I always wished existed - recipeninja.ai. It now includes a fully interactive voice assistant so you don't need to get your dirty hands all over your new iPad when you're cooking.

Background: I’m a startup founder turned investor. I taught myself (bad) PHP in 2000, and picked up Ruby on Rails in 2011. I’d guess 2015 was the last time I wrote a line of Ruby professionally. Last month, I decided to use Windsurf to build a Rails 8 API backend and React front-end app, using OpenAI's realtime API for voice-to-voice responses. Over the last few days, I also used Claude Code and Gemini 2.5 Pro for some of the trickier features. 35,000 LoC later, this is what I built!

The site uses function-calling to navigate the site in realtime as you chat with the voice assistant, which I think is pretty neat. For the long version, see https://ift.tt/OAqEI71... I'd love any feedback you have!

Demo video of the voice assistant: https://www.youtube.com/watch?v=kRhVc9D5kcg
Generate and edit new recipes: https://www.youtube.com/watch?v=VwwZF6dHcHg
https://ift.tt/T6UZi9p April 1, 2025 at 10:57PM
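
For readers unfamiliar with function-calling as a navigation mechanism, here is a rough sketch of the general pattern: the model is given a tool description, it replies with a tool name plus JSON arguments, and the app maps that to a route. This is not the app's code (which is Rails and React); `NAVIGATE_TOOL`, `ROUTES`, and `handle_tool_call` are hypothetical, and the schema is only similar in spirit to what function-calling APIs accept.

```python
import json

# Hypothetical tool description in JSON Schema style.
NAVIGATE_TOOL = {
    "name": "navigate_to",
    "description": "Open a page of the recipe site.",
    "parameters": {
        "type": "object",
        "properties": {"page": {"type": "string", "enum": ["home", "recipes", "favorites"]}},
        "required": ["page"],
    },
}

# Hypothetical route table; the real site's paths are not known.
ROUTES = {"home": "/", "recipes": "/recipes", "favorites": "/favorites"}

def handle_tool_call(name: str, arguments_json: str) -> dict:
    """Validate a model-issued tool call and translate it into a route."""
    if name != NAVIGATE_TOOL["name"]:
        return {"ok": False, "error": f"unknown tool: {name}"}
    args = json.loads(arguments_json)
    page = args.get("page")
    if page not in ROUTES:
        return {"ok": False, "error": f"unknown page: {page}"}
    return {"ok": True, "path": ROUTES[page]}

# A model response would carry the tool name and a JSON string of arguments.
print(handle_tool_call("navigate_to", '{"page": "favorites"}'))
print(handle_tool_call("navigate_to", '{"page": "checkout"}'))
```

Validating the page against a fixed route table keeps a misheard or hallucinated destination from sending the user somewhere that does not exist.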

Tuesday, April 1, 2025

Show HN: Emissary – Rust implementation of the I2P protocol stack https://ift.tt/WqMJUc7

https://ift.tt/MseGh6y April 1, 2025 at 05:08AM

Show HN: Duolingo-style exercises but with real-world content like the news https://ift.tt/1gOIxCa

I've been working on a little side project that combines Duolingo-like listening comprehension exercises with real content. Every video is transcribed to get much better transcripts than the closed captions. I filter for high-quality transcripts, and afterwards an LLM selects only plausible segments for the exercises. This seems to work well for quality control and seems to be reliable enough for these short exercises. Would love your thoughts! https://ift.tt/F6oxLEN April 1, 2025 at 02:46AM
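
The two-step quality control described above (filter transcripts, then pick plausible segments) could look roughly like the sketch below. It is an assumption-heavy illustration: the post does not say how quality is measured, so `avg_confidence` and the thresholds are invented, and a simple word-count heuristic stands in for the LLM selection step.

```python
from dataclasses import dataclass

@dataclass
class Segment:
    text: str
    avg_confidence: float  # hypothetical transcription confidence in [0, 1]

def is_high_quality(seg: Segment, min_conf: float = 0.9) -> bool:
    """Step 1: keep only segments the transcriber was confident about."""
    return seg.avg_confidence >= min_conf

def is_plausible_exercise(seg: Segment, min_words: int = 4, max_words: int = 15) -> bool:
    """Step 2 stand-in: a length heuristic replacing the LLM's judgment."""
    return min_words <= len(seg.text.split()) <= max_words

def select_segments(segments: list[Segment]) -> list[Segment]:
    return [s for s in segments if is_high_quality(s) and is_plausible_exercise(s)]

segments = [
    Segment("The central bank raised interest rates again today", 0.97),
    Segment("uh", 0.99),                                   # too short
    Segment("Ministers met to discuss the budget", 0.62),  # low confidence
]
for seg in select_segments(segments):
    print(seg.text)
```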

DJ Sandro

http://sandroxbox.listen2myradio.com