Anup Jadhav

2026-02-26 • quality assurance large language models evaluation system testing ai development and strategy

Regression testing is crucial for LLM systems because it ensures that prompt changes do not cause unnoticed quality drops. The post advocates a systematic evaluation approach.

2026-02-23 • development machine learning software testing large language models evaluation

Evaluation-Driven Development (EDD) offers a systematic approach to testing LLM systems, addressing the shortcomings of traditional software testing methods by focusing on quality criteria and iterative evaluation.

2026-01-14 • state management graph multi-agent systems orchestration temporal

A two-layer architecture combining Temporal and LangGraph enhances multi-agent system reliability and performance by separating orchestration from agent logic.

2025-11-24 • machine learning natural language processing enterprise rag information retrieval ai

The article discusses the limitations of naive Retrieval-Augmented Generation (RAG) systems when deployed in production environments, particularly under high traffic conditions. It outlines the four-step process of naive RAG and i...

2025-10-30 • neural networks machine learning deep learning transformers ai

This guide provides a comprehensive understanding of transformers by tracing the historical development of neural network architectures, including RNNs, LSTMs, and CNNs, leading to the creation of transformers. It explains key con...

2026-02-24 • machine learning large language models testing data evaluation golden dataset

A golden dataset is essential for evaluating LLM responses, defining correctness through input/output pairs rather than exact string matching.

2026-01-30 • learning python code education ai

Using AI coding assistants may hinder developers' learning, as reliance on AI leads to poorer comprehension and debugging skills compared to traditional learning methods.

2025-12-24 • economic inequality transparency environmental and social impact ai technologies job loss and replacement

The AI industry in 2025 showcases significant breakthroughs but is plagued by contradictions in economics, job creation, environmental impact, and transparency.

2025-12-10 • software development technical debt vibe coding ai context engineering

The post argues against 'Vibe Coding' and promotes 'Context Engineering' to ensure reliable software development with AI, emphasizing structured workflows over casual interactions.

2025-11-20 • user interface design generative ai large language models google search ai

The blog post discusses the paper 'Generative UI: LLMs are Effective UI Generators' from Google Research, which argues that large language models (LLMs) should generate complete user interfaces instead of just text responses. The ...

2026-02-25 • machine learning large language models evaluation deterministic builds and checks semantic similarity

The post provides a guide on evaluation techniques for LLM systems, emphasizing cost-effective methods to ensure quality output.

2025-11-03 • vector graphics database postgresql data science and machine learning pgvector

The article discusses the limitations of using Postgres for storing vectors, as highlighted by Alex Jacobs in 'The Case Against pgvector.' It emphasizes that while it may seem convenient to use a single database for vector storage...

2026-02-20 • machine learning code quality and management stripe software engineering ai

Stripe's 'minions' system highlights the critical role of structured constraints and human oversight in managing AI-generated code for improved reliability.

2026-03-01 • technology machine learning ai export tools

The Claude Code team emphasizes that evolving AI agent tools requires simplicity and adaptability, and that agents often benefit from fewer, more expressive tools rather than added complexity.

2026-02-15 • large language models object-oriented programming rust programming markov language compiler issues

Davis Haupt's Markov language aims to optimize programming for machine fluency, potentially enhancing human readability and addressing the trade-off between ease of writing and comprehension.

2026-02-28 • software development code review ai and it tools engineering best practices

Garry Tan's /plan-exit-review command for Claude Code ensures thorough self-review of coding plans, enhancing quality and reducing scope creep.

2025-11-13 • productivity software development code cursor ai

A study on Cursor, an AI coding assistant, reveals that junior developers utilize autocomplete features more, while senior developers prefer the Cursor Agent for delegating tasks. The research indicates that senior developers, wit...

2025-11-24 • technology ethics code orchestration ai

The article discusses the shift in the role of coders towards becoming orchestrators in the context of AI and agentic coding. It emphasizes the importance of human judgment in the orchestration process, highlighting the need for c...

2025-11-19 • finance machine learning large language models trading ai

The article discusses an experiment where six large language models (LLMs) were given $10,000 each to trade perpetual futures autonomously, revealing distinct trading behaviors despite identical conditions. Notable patterns emerge...

2026-02-15 • strategy data management generative ai ai

Many companies falsely claim to be AI-powered when they merely use existing tools, while true value creation requires a strategic mindset shift and original development.

2025-12-20 • programming software development code best practices ai

AI-assisted coding advice often contradicts itself, reflecting diverse user experiences and the need for personal adaptation and understanding of coding fundamentals.

2026-02-16 • openai anthropic machine learning inference framework ai

Anthropic's and OpenAI's contrasting fast-inference strategies highlight that speed may be less important than accuracy in AI performance.

2026-02-17 • software development job market universal blue existential risk ai

The term 'Deep Blue' encapsulates the existential dread software engineers face as AI threatens their profession, highlighting a unique tension between identity and productivity.

2026-02-15 • software engineering coding standards and guidelines claude code ai boris cherny

Boris Cherny provides essential tips for using Claude Code, highlighting its adaptability and the advantage experienced developers have in utilizing AI tools effectively.