About:

ML Engineer currently leading applied AI research and engineering in Europe.

MMTEB is a comprehensive evaluation framework for assessing the performance of text embedding models. It covers over 500 quality-controlled evaluation tasks across 250+ languages, making it the largest multilingual collection of e...
The blog post discusses the challenge of maintaining focus in the rapidly evolving field of AI, where new developments in NLP, LLMs, CV, and ML emerge frequently. The author questions whether it's better to chase every new innovation o...
The blog post discusses the challenges and solutions for serving concurrent requests with quantized Large Language Models (LLMs). The author highlights the importance of handling multiple user requests in production LLM applicatio...
The blog post discusses the challenges of transitioning a Retrieval Augmented Generation (RAG) system from a proof of concept to a production-ready system. It emphasizes the importance of observability in detecting issues, perform...
The blog post discusses metrics for evaluating the performance of LLM (Large Language Model) serving systems. It highlights common metrics used in production services like Requests Per Second (RPS), uptime, and latency, and compar...
The blog post discusses the challenges of running large language models (LLMs) locally on machines with limited resources, focusing on quantization and offloading techniques. Quantization reduces the computational and memory costs...
The blog post explores the benefits of running large language models (LLMs) locally using Ollama, an open-source app for macOS and Linux. It highlights the advantages of local LLMs, such as data privacy and reduced reliance on clo...