About:

Sankalp is an Applied AI engineer with interests in AI and distributed systems, focusing on Generative and Applied AI.

Specializations:

Interests:

AI, Distributed systems, Generative AI, Applied AI

The blog post discusses the intricacies of prompt caching in large language models (LLMs), particularly focusing on the vLLM engine and its paged attention mechanism. The author shares personal experiences of optimizing a feature ...
The blog post explores the creative potential of various large language models (LLMs) in generating visual prompts for Midjourney, focusing on the differences in outputs based on the models' training and configurations. It highlig...
A detailed guide on Claude Code 2.0, offering insights, features, and practical tips for optimizing the use of coding agents for both technical and non-technical users.
This blog post discusses the importance of prompt caching in optimizing inference costs and response times when using language models. It highlights practical tips for improving cache hits, such as maintaining a stable prefix, rem...
The author shares their experiences using Claude Code, a coding assistant, after transitioning from Cursor. They discuss the limitations of rate-limiting on Cursor, their workflow with Claude Code, and the advantages of using diff...
Effective Twitter growth relies on regular posting, meaningful interactions, and understanding algorithm nuances to foster genuine connections rather than just building an audience.
The text is a verbatim transcript of an interview with Sholto and Trenton on the Dwarkesh Podcast. They discuss the importance of agency in work, choosing high-leverage problems, and the significance of research taste. They emphasi...
The text discusses the release of Claude Code and OpenAI Codex CLI, expressing skepticism about their long-term potential. It speculates on the motivations behind the releases and the potential for innovation in code editors. The ...
The author is on a small employment break and is trying to learn basic GPU programming and low-level memory stuff. They are trying to learn flash attention and are competing in a GPU mode server competition. They are learning trit...
The text is about learning GPU programming basics and flash attention. The author is on a small employment break and is trying to pick up some things. The real competition starts from the first week. The author is learning some Tr...
The author discusses how self-help videos have prompted change in their life. They talk about how they don't consume much self-help content but snack on them once in a while for guidance. They share their thoughts on videos that c...
The text discusses the evolution of AI-assisted coding features and developer interaction patterns. It explores the historical narrative, interaction patterns, and the gears analogy. It also delves into the evolution of autocomple...
The blog post discusses the optimization of the CodeQA code, focusing on the changes made to achieve a 2.5x speedup. It details the bottlenecks, optimizations, and the implementation of a two-stage approach with concurrent process...
The text discusses the release of OpenAI's o-series models and the impact it will have on the field of software engineering. It emphasizes the need to adapt quickly to the changes brought about by AI advancements and the ...

prompt caching
2025-11-17

songs to wind down
2026-03-11
The text discusses learning LLM optimization, including the vLLM docs, speculative decoding, and kernels. The author is considering going deeper than the API layer for more ideas.
The text discusses learning machine learning in 2024, focusing on the experiences of students on Twitter and the resources available for learning. It emphasizes the importance of learning from both sides, building projects,...
The text is about creative coding with Claude 3.5 Sonnet Artifacts and p5.js. It explains how to upload charts and modify them, and the support for basic web dev rendering. It also discusses the limitations of Artifacts and how to...