Emir

2025-10-30 • statistics bayesian statistics probability frequentist methods sensor fusion dempster-shafer theory

The post discusses a scheme for estimating probabilities of subsets of binary variables using Dempster-Shafer theory, which allows for the assignment of probabilities directly to subsets of events. It contrasts this with Bayesian ...

2026-03-08 • deep learning jax attention mechanism differentiable optimization optiq

Differentiable memory leverages linear algebra to create a compressible key-value store, enhancing deep learning's attention mechanism for various applications.

2026-01-17 • python natural language processing openstreetmap evidence theory tags

An unsupervised query tagger is developed using evidence theory to enhance query understanding by tagging user queries with relevant labels based on OpenStreetMap data.

2026-01-04 • statistics probability mathematics resources board games

A statistical analysis of Snakes & Ladders reveals the expected number of turns to finish the game using Markov theory and transition probabilities.
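As a rough illustration of the Markov approach mentioned in this summary, the sketch below computes the expected number of turns on a board by iterating the equation E[s] = 1 + average of E[next square] over all die rolls. The function name `expected_turns`, the tiny 20-square board, and the rule that overshooting the last square still finishes the game are all assumptions made for the example, not details taken from the post.

```python
def expected_turns(last_square, jumps, sides=6, tol=1e-12, max_iter=100_000):
    """Expected number of die rolls to reach `last_square` from square 0.

    `jumps` maps a landing square to its destination (ladder: up, snake: down).
    Assumption: a roll that overshoots the last square still finishes the game.
    """
    # E[s] = expected remaining turns from square s; the goal needs 0 more turns.
    E = [0.0] * (last_square + 1)
    for _ in range(max_iter):
        delta = 0.0
        for s in range(last_square - 1, -1, -1):
            total = 0.0
            for roll in range(1, sides + 1):
                nxt = min(s + roll, last_square)   # overshoot counts as finishing
                nxt = jumps.get(nxt, nxt)          # follow a snake or ladder, if any
                total += E[nxt]
            new_value = 1.0 + total / sides        # one turn plus the average future cost
            delta = max(delta, abs(new_value - E[s]))
            E[s] = new_value
        if delta < tol:                            # values stopped changing
            break
    return E[0]


# Hypothetical 20-square boards, for illustration only.
plain = expected_turns(20, {})
with_ladder = expected_turns(20, {3: 15})
with_snake = expected_turns(20, {15: 3})
```

On these toy boards the ladder lowers the expected game length and the snake raises it, which is a quick sanity check on the transition model.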

2025-08-17 • data analysis uncertainty statistics bayesian statistics r programming

The blog post discusses methods for sorting fractions under uncertainty, focusing on the binomial distribution and confidence intervals for estimating the fraction of successful trials. It presents two approaches: a Bayesian metho...

2026-01-01 • python computational models evidence theory data science and machine learning dempster-shafer theory

The post presents the pyevidence repository, offering tools to implement evidence theory while addressing its computational challenges.

2025-05-10 • array programming heuristics linear regression data labeling maximum likelihood estimation

The text presents a weak supervision paradigm called 'data programming' which uses maximum likelihood estimation to produce soft labels from heuristics. It includes a simple example to show that the methods work and discusses the ...

2025-03-30 • data analysis machine learning hacker events

The author uses 500 Hacker News titles and an LLM to derive an article ranking model from a user-supplied preference description. The LLM supplies the labelled data, whilst Ridge regression and cheap sentence transformer embedding...

2025-01-12 • numerical optimization betting strategies kelly criterion geometric mean

The post explains the Kelly criterion and how to derive it, then shows a simple way to extend it to simultaneous independent binary bets, along with a Python function that implements the extension.
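For readers unfamiliar with the criterion, here is a minimal sketch of the textbook single-bet case only. For a bet won with probability p that pays net odds b, the stake fraction that maximises expected log wealth is f* = p - (1 - p) / b. The function names `kelly_fraction` and `expected_log_growth` are chosen for this example and are not taken from the post, whose own function for simultaneous bets is not reproduced here.

```python
import math


def kelly_fraction(p, b):
    """Kelly stake as a fraction of bankroll for one binary bet.

    p: probability of winning; b: net odds (profit per unit staked on a win).
    Returns 0 when the bet has no positive edge.
    """
    return max(0.0, p - (1.0 - p) / b)


def expected_log_growth(f, p, b):
    """Expected log growth of the bankroll when staking fraction f."""
    return p * math.log(1.0 + f * b) + (1.0 - p) * math.log(1.0 - f)


# A 60% chance of winning at even odds gives a stake of about 20% of bankroll.
f_star = kelly_fraction(0.6, 1.0)
```

Evaluating `expected_log_growth` slightly above and below `f_star` shows the growth rate falling on either side, which is the geometric-mean argument behind the formula.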

2024-04-28 • machine learning linear regression fourier transform rbf kernel kernel methods

The text discusses the use of RBF kernel approximation with random Fourier features in machine learning. It explains the problems with linear methods and how kernel regression can address these issues. It also introduces the rando...

2024-04-24 • data analysis statistics equations metric learning

The text discusses linear metric learning and how to find a transformation A that makes the sum of squared differences between i and j similar, regardless of whether it is calculated in terms of x or y. It also explains how to appro...

2024-03-24 • openmp and opencl fortran high-performance computing billion row challenge

The text discusses the 'Billion Row Challenge' in Fortran, which involves processing 1 billion rows of weather station data to obtain min/max/mean for each station as quickly as possible. The author documents their journey from a ...

2024-02-02 • advent of code python scala haskell prolog

The author discusses their experience solving Advent of Code puzzles using Prolog, Haskell, Python, and Scala. They compare the ease of coding in each language, noting that Prolog was the most difficult but also the most mind-expa...

2023-11-19 • puzzle game development prolog domino's

The text introduces a novel logic puzzle called 'Domicles' using domino tiles. The author explains the rules of the game, provides examples, and presents a Prolog implementation. The difficulty of the puzzles is discussed, and a ...

2023-10-18 • simulation prolog stochastic optimization meta-interpreter

The text discusses a minimal proof-of-concept for a stochastic simulator in Prolog via a meta-interpreter. It explains the implementation, syntax, semantics, and conclusions of the interpreter, and provides examples and sim...

2023-10-15 • prolog

The author discusses the use of logic programming for data analysis, specifically analyzing diamond prices using a symbolic approach. The post covers data preparation, domain knowledge, consistency and coverage checking, and price...

2023-10-06 • javascript algorithm prolog domino's

The text describes the construction of a Block Dominoes playing algorithm for a hidden-information variant of the game. The author built a game simulator, learned from a heuristic algorithm, and developed some play-out based algorit...

2023-08-12 • data analysis machine learning data visualization

The text analyzes the data job market using 'Ask HN: Who is hiring?' posts from 2013 to the present. It suggests that the Data Scientist role is in decline and that skills such as data mining and visualization are also out of favo...

2023-07-30 • probability quant riddles expectation management

The text discusses a riddle about drawing playing cards and the optimal stopping rule to maximize expected payoff. The author shares their thought process and the statistical approach they used to solve the problem.

2023-07-05 • poisson distribution mark-recapture gym population estimate

The author discusses the use of a mark and recapture experiment to estimate the total number of gym members based on the number of people repeatedly seen at the gym. They explore the Lincoln-Petersen estimator and Poisson regressi...
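The Lincoln-Petersen estimator named in this summary can be sketched in a few lines: if n1 individuals are seen on a first visit, n2 on a second, and m on both, the population is estimated as n1 * n2 / m. The function names and the example counts below are hypothetical, and the Chapman correction is included as a common small-sample variant rather than something the post is known to use.

```python
def lincoln_petersen(n1, n2, m):
    """Population estimate from two samples.

    n1: individuals seen (marked) in the first sample
    n2: individuals seen in the second sample
    m:  individuals seen in both samples (recaptures)
    """
    if m == 0:
        raise ValueError("no recaptures: the estimate is undefined")
    return n1 * n2 / m


def chapman(n1, n2, m):
    """Chapman's variant, which is less biased for small samples and defined when m == 0."""
    return (n1 + 1) * (n2 + 1) / (m + 1) - 1


# Hypothetical counts: 50 people seen on visit one, 40 on visit two, 10 on both.
estimate = lincoln_petersen(50, 40, 10)
corrected = chapman(50, 40, 10)
```

Both functions assume a closed population and equal chances of being seen, which a gym with regulars and occasional visitors is unlikely to satisfy exactly.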

2023-06-18 • data analysis statistical models online experiences

The text explains the methods of blocking, optimal design, and covariate adjustment to improve the power of experiments. It emphasizes the importance of these methods for data scientists working with online experiments, and provid...

2023-05-12 • galaxy clustering prolog semi-supervised clustering

The text discusses the use of logic programming for clustering, emphasizing its suitability for general commercial use cases. It presents artisanal clustering algorithms in Prolog demonstrated on mock data and explains how domain ...

2023-04-30 • data analysis reasoning prolog linear regression isotonic regression

The text discusses the integration of Prolog as a critical component in data science analysis, using analytic methods to generate properties about the data and Prolog to reason about the data via the generated properties. It inclu...

2022-12-28 • m4 data lakehouse sql database schema databases

The author discusses the challenges of working with large SQL codebases and the need for a composable SQL. They explore Logica and the use of the M4 macro preprocessor to create shared libraries and abstract common parts of SQL queri...