About:

Han Lee is a machine learning director with interests in AI systems, known for his work in finance and gaming.

Interests:

Machine learning systems design, information retrieval, recommendation systems

The blog post provides an in-depth analysis of Claude's Agent Skills system, a sophisticated architecture that enhances LLM capabilities through prompt-based instruction injection. It explains how skills operate as meta-tools, mod...
The blog post explains the evaluation metric $\text{pass@}k$ used in AI models, particularly in the context of OpenAI's GPT-5. It clarifies that $\text{pass@}k$ is not simply about passing a test on the k-th attempt but involves sop...
Agentic AI combines reasoning and tool use, illustrated through Mario's evolution, emphasizing the importance of model harnesses and reinforcement learning in AI development.
The post critiques the misuse of Pydantic in Python, highlighting two major anti-patterns: 'serdes debt' and 'inheritance over composition'. It argues that Pydantic should primarily be used for data validation at service boundarie...
Reynold Xin, a key figure at Databricks, discusses the company's impressive 60% year-over-year growth compared to competitors like Snowflake. He emphasizes the importance of strategic go-to-market investments, especially during ec...
The blog post discusses the evolution of enterprise technology through three eras: IT 1.0, which focused on in-house software development; IT 2.0, characterized by the rise of SaaS and the emergence of 'bullshit jobs' that arose f...
The article discusses how AI tools are transforming software development by blurring traditional role boundaries among product managers, engineers, and designers. It emphasizes the importance of maintaining clear accountability th...
The article discusses the evolution of agent frameworks and workflow builders in the context of advancements in large language models (LLMs). It argues that as LLMs become more capable, traditional workflow tools will become obsol...
The blog post compares the Model Context Protocol (MCP) with REST APIs. It outlines the design principles of RESTful APIs, the origins of MCP, Remote Procedure Calls (RPC), and explains why combining these two d...
xAI's chatbot Grok started sharing information on South African 'white genocide' on X, causing a controversy. The incident is a reminder of the importance of operational rigor in keeping large language models trustworthy at scale....
The text discusses how AI agents perceive their environment, make sequential decisions, and take actions to achieve specific goals. It explains the use of Directed Acyclic Graphs (DAG) and Finite State Machines (FSM) to interpret ...
The text discusses vibe coding, a new mode of human-computer interaction built on AI coding agents. It provides background on AI-assisted coding and vibe coding, best practices, and tools for coding agents. It also outlin...
OpenAI shipped a GPT-4o update that made ChatGPT sycophantic, leading to a rollback. The cause was a change in the system prompt. The incident highlights the importance of MLOps discipline in AI systems.
The talk explores how the intelligence embedded within a system, the model, has become the central selling point across technological eras, transforming from hardware differentiators to model-centric offerings in today's AI landsc...
The text provides practical tips and troubleshooting methods for Model Context Protocol (MCP) development on Windows. It includes instructions for setting up Claude Desktop for MCP server development, adding MCP servers to Claude ...
The text provides a comprehensive guide to setting up Claude Code with Amazon Bedrock, including code snippets and configuration details. It addresses the lack of sufficient documentation for setting up Claude Code with AWS Bedroc...
Anthropic released Claude Code, a competitor to Anysphere’s Cursor and Codeium’s Windsurf. Claude Code is a tool that uses an LLM as an agent that takes user commands and completes software engineering tasks. The blog post tries to decompo...
The text discusses the importance of prompt engineering in leveraging large language models like GPT-3 for tasks such as text generation and question answering. It outlines the challenges of manual prompt engineering and proposes ...
The text introduces evaluation metrics for classification agreements in AI/ML, focusing on Cohen’s Kappa as a statistical metric to measure how well two raters agree on classifying data into categories. It explains the interpretat...
The text discusses the importance of ranking in AI/ML and the evaluation of ranked agreements using statistical measurements such as Kendall’s Tau and Spearman’s Rank Correlation. It explains the implementation of these measuremen...
The text discusses the recent wave of 'Deep Research' releases by various companies and the confusion surrounding the definition of 'Deep Research'. It examines the technical implementation of 'Deep Research' and the different app...
The text discusses the intersection between AI/ML and statistics, focusing on adding confidence intervals to aggregation statistics. It explains bootstrap resampling as a statistical technique to estimate uncertainty and provides ...
The text discusses reasoning with compound AI systems and post-training large language models (LLMs). It explores the limitations of previous reasoning methods and introduces compound AI systems and post-training LLMs as more reli...
The text discusses reasoning with prompt engineering and sampling in large language models (LLMs). It covers prompt engineering design patterns, such as rationales, decorators, and composites, and sampling techniques like min-p sa...