About:

Richard Demsyn-Jones is a machine learning researcher and blogger interested in ML, engineering, and human behavior modeling.

Interests:

Machine learning, engineering, marketplaces, modeling human behavior
The post explores the evolution and techniques behind function calling in large language models (LLMs). It discusses how LLMs have improved in natural language tasks, code understanding, and tool usage through various methods such...
The blog post discusses the concept of agentic AI, which refers to systems that can set goals, plan, reason, and interact with the outside world autonomously. It contrasts agentic systems with traditional tools, highlighting the d...
Effective strategy is crucial for overcoming challenges, as highlighted by Richard Rumelt's insights in 'Good Strategy Bad Strategy', which the author interprets through personal experience.
Claude Code revolutionizes data projects by streamlining coding and analysis, though effective use requires domain knowledge and careful management of instructions.
The text discusses the inconsistency between online and offline machine learning models. It highlights the bugs and issues that arise due to the differences in the two settings, using DoorDash's example. It also suggests solutions...
The text discusses a variation of the Monty Hall problem called the golden goat variation. It explains the scenario and the optimal approach to solve the problem using Bayes' theorem. It also compares the original problem with the...
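The excerpt doesn't include the details of the golden goat variation, but the original Monty Hall problem it compares against can be checked empirically. A minimal simulation sketch of the classic game (the variation's rules are not reproduced here):

```python
import random

def play(switch: bool) -> bool:
    """Simulate one round of the classic Monty Hall game.

    The car is behind a random door; the player picks a door; the host
    then opens a door that holds neither the car nor the player's pick.
    Returns True if the player's final pick wins the car.
    """
    doors = [0, 1, 2]
    car = random.choice(doors)
    pick = random.choice(doors)
    # Host opens some door that is neither the pick nor the car.
    opened = next(d for d in doors if d != pick and d != car)
    if switch:
        # Switching means taking the one remaining unopened door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

random.seed(0)
n = 100_000
stay_rate = sum(play(False) for _ in range(n)) / n
switch_rate = sum(play(True) for _ in range(n)) / n
print(f"stay: {stay_rate:.3f}, switch: {switch_rate:.3f}")
```

The simulation recovers the standard result that staying wins about 1/3 of the time and switching about 2/3, which is the baseline the post's Bayes'-theorem analysis of the variation starts from.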
The text discusses the debate on whether machines can think and create art. It explores the capabilities of machines and the concept of artificial intelligence. It also delves into the training of large language models (LLMs) and ...
The text discusses the importance of postmortems in software development, highlighting common anti-patterns and best practices. It emphasizes the need for accurate documentation, learning from mistakes, and improving systems. The ...
The text discusses the parallels between accumulating vendor relationships and accumulating internal technology, particularly in the context of building out a modern ML stack. It highlights the challenges of identifying the best c...
The text discusses the limitations of benchmark datasets for learning-to-rank (LTR) and the dominance of the Yahoo dataset in LTR literature. The author argues that the Yahoo dataset has key limitations and proposes the need for m...
The text discusses the importance of benchmark datasets in machine learning literature, particularly in the field of learning-to-rank (LTR). It highlights the impact of benchmark datasets on research, and provides detailed insight...
The text discusses the issue of position bias in features and its impact on machine learning algorithms. It explains how items that have historically been shown high up on lists will have received a lot of attention from users, an...
The text discusses the challenges of validating language models to ensure they do not generate inappropriate content. It describes the process of testing the language model's autocompletions and the efforts made to ensure it never...
The text discusses the concept of a 'model of everything' and its application in business. It explains the properties of such models, the challenges in building them, and the example of a system built at Lyft. The text also highli...
The text discusses inductive bias and expressiveness in machine learning models. It contrasts architectural decisions in machine learning and explains the structure of a model and the optimization algorithm. It also delves into in...
The text discusses Deep & Cross Networks (DCNs) and their application in machine learning models. It explains the concept of cross layers, the architecture of DCN-V2, and the advantages of using cross layers in neural networks. Th...
The text discusses three papers from the last few years in the learning-to-rank (LTR) literature, focusing on two-tower models for ranking problems. The papers contain models and evaluation methods that could be useful for those w...
The text discusses the rise of GELU as an activation function in large language model (LLM) architectures. The author explains the importance of activation functions in neural networks and how GELU became popular. The GELU paper is ...
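The excerpt doesn't reproduce the post's analysis, but GELU itself has a standard definition: GELU(x) = x · Φ(x), where Φ is the standard normal CDF, and the GELU paper also gives a common tanh-based approximation. A minimal sketch of both:

```python
import math

def gelu_exact(x: float) -> float:
    # GELU(x) = x * Phi(x), where Phi is the standard normal CDF,
    # written here via the error function: Phi(x) = 0.5 * (1 + erf(x / sqrt(2))).
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # The widely used tanh approximation from the GELU paper.
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x**3)))

for x in (-2.0, -0.5, 0.0, 0.5, 2.0):
    print(f"x={x:+.1f}  exact={gelu_exact(x):+.4f}  tanh={gelu_tanh(x):+.4f}")
```

Unlike ReLU, GELU is smooth and non-zero for small negative inputs, which is part of why the exact and approximate forms are compared at all; the two agree to within about 1e-3 over typical input ranges.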
The text discusses the concept of profit-maximizing experimentation regimes, critiquing the use of a p < 0.05 criterion and suggesting that a maniacal adherence to minimizing experiment false positives is unproductive. It reviews a...
The text discusses the use of the p-value threshold of 0.05 in experiment analysis and decision-making. It argues that this threshold is arbitrary and may lead to false positives, emphasizing the need to consider false negatives a...
The term 'embedding' has become a central concept in machine learning. The author discusses the evolution of the term and its meaning, and how it has become hard to define. The text explores the properties and uses of embeddings, ...
The text discusses log loss and cross entropy in binary and multiclass problems. It explains the differences between log loss and cross entropy, and the author's concerns about cross entropy. The author also discusses the use of l...
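The excerpt doesn't state the post's specific concerns, but the relationship it discusses is standard: binary log loss is exactly the two-class special case of multiclass cross entropy. A minimal sketch demonstrating the equivalence:

```python
import math

def binary_log_loss(y: int, p: float) -> float:
    """Binary log loss for one example with true label y in {0, 1}
    and predicted probability p = P(y = 1)."""
    return -(y * math.log(p) + (1 - y) * math.log(1 - p))

def cross_entropy(true_dist, pred_dist) -> float:
    """Cross entropy -sum_k t_k * log(q_k) between a (one-hot) true
    distribution and a predicted distribution over K classes."""
    return -sum(t * math.log(q) for t, q in zip(true_dist, pred_dist) if t > 0)

# A binary problem written both ways gives the same number:
p = 0.8
a = binary_log_loss(1, p)                      # label y = 1, P(y=1) = 0.8
b = cross_entropy([0.0, 1.0], [1 - p, p])      # one-hot over classes {0, 1}
print(a, b)
```

Writing the binary case as a two-class distribution makes the correspondence explicit: the (1 - y) term of log loss is just the cross-entropy contribution of the "class 0" slot.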
The author discusses the importance of testing in product development and how a small change in the code review process led to a reduction in customer-facing bugs and experiment restarts. The change involved adding a 'Tested' labe...
The text discusses the problem of sharing constants across programming languages and repositories, and the bugs that can arise from inconsistent spelling of strings. It explores the use of Protobuf and Thrift to define constants a...