((((sandro.net))))

Wednesday, December 25, 2024

Show HN: I made a website to semantically search ArXiv papers https://ift.tt/ophVLEY

As a grad student (and an ADHDer), I had trouble doing literature reviews systematically. To combat this, I made a website that finds similar papers based on the meaning of what I am looking for.

I used MixedBread's [1] embedding model to generate vectors from the abstracts. I store and search the vectors using Milvus [2], and use Gradio [3] to serve the frontend. I update the vector database weekly by pulling the metadata dataset from Kaggle [4]. To speed up search on my free Oracle instance, I binarise the embeddings and use Hamming distance as the metric.

I would love your feedback on the site :) Happy Holidays!

[1]: https://ift.tt/EUa27xJ...
[2]: https://milvus.io/
[3]: https://www.gradio.app/
[4]: https://ift.tt/0LmeFNB

https://ift.tt/5H4sDmi December 25, 2024 at 02:44AM
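The binarisation and Hamming-distance search mentioned in the post can be sketched roughly as follows. This is a minimal, dependency-free illustration, not the site's actual code: the function names (`binarise`, `hamming`, `search`), the sign-based threshold at zero, and the toy 4-dimensional vectors are all assumptions made for the example. In the real setup the vector database would perform the search itself.

```python
from typing import List, Sequence


def binarise(embedding: Sequence[float]) -> int:
    """Pack the sign pattern of an embedding into one integer.

    Assumption: a bit is 1 where the component is positive, else 0.
    """
    bits = 0
    for value in embedding:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits


def hamming(a: int, b: int) -> int:
    """Count the bit positions where two binarised embeddings differ."""
    return bin(a ^ b).count("1")


def search(query: int, corpus: Sequence[int], k: int = 3) -> List[int]:
    """Return the indices of the k corpus entries closest to the query."""
    ranked = sorted(range(len(corpus)), key=lambda i: hamming(query, corpus[i]))
    return ranked[:k]


# Toy 4-dimensional "embeddings"; real models produce far more dimensions.
papers = [
    [0.9, -0.2, 0.4, -0.7],  # binarises to 1010
    [-0.5, 0.3, -0.1, 0.8],  # binarises to 0101
    [0.7, -0.1, 0.2, 0.6],   # binarises to 1011
]
corpus = [binarise(p) for p in papers]
query = binarise([0.8, -0.3, 0.5, -0.1])  # binarises to 1010

print(search(query, corpus, k=2))  # prints [0, 2]
```

The appeal of this approach on a small free instance is that each dimension shrinks from a float to a single bit, and comparing two vectors becomes an XOR followed by a bit count, at the cost of some ranking precision.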


DJ Sandro

http://sandroxbox.listen2myradio.com