About:

Sherman Chann is a developer interested in programming and cybersecurity; he organizes CTFs and enjoys programming puzzles.

Interests:

Programming, Cybersecurity, Open source contributions, Competitive programming

Outgoing Links:

Dan Luu
The post critiques the declining value of intellectual pursuits and explores the implications of AI on human activities and societal roles.
A candid reflection on a disheartening year, exploring health, career disillusionment, and the psychological impacts of technology and social isolation.

Time To Think

2025-02-03

The author discusses the struggle of thinking with more than 60 seconds of context, the degradation of their writing ability, and the impact of AI cognition speed. They reflect on the release of GPT-4 and the limitations of langua...

Are exploits free?

2026-02-13

...
The text discusses the cost of replicating the Google DeepMind paper Scaling Exponents Across Parameterizations and Optimizers. It details the experiments conducted, the cost of each experiment, and the problems with the expe...
The post covers DeepSeek Core Readings 0 - Coder: sourcing pretraining data from GitHub, constructing the pretraining dataset, and evaluating the model's performance. It also covers the architecture, training objective...
The post covers DeepSeek Core Readings 1 - LLM, which includes details about the pretraining, scaling laws, tokenization, model architecture, LR scheduler, infrastructure, scaling experiments, alignment, DPO, eva...
The text provides tips for remaining conscious and avoiding mental unawareness, including reading, surrounding yourself with conscious people, sleeping on the floor, disabling notifications, adding barriers to distractions, tracki...

2023

2023-12-31

The author reflects on the year 2023, which was materially and situationally better but subjectively worse in spirit and health, lists five resolutions for 2024, and discusses predictions and history. The author also provid...
The text discusses the mixture-of-experts paradigm and whether dense models might replace it. It covers issues with fine-tuning, VRAM limitations, and the potential benefits of MoE models for certain users. The author also pre...
The text explains the mechanism of token dropping in GPT-4 and the concept of Mixture-of-Experts (MoE) in detail. It discusses the routing strategies, cutoff dates for GPT-4, GPT-4 leaks, and the process of token choice. It also e...
The text discusses the non-deterministic behavior of GPT-4, attributing it to the Sparse MoE architecture. The author presents a hypothesis that batched inference in Sparse MoE models is the root cause of non-determinism in the GP...
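The mechanism behind these two summaries, capacity-limited expert routing with token dropping, is concrete enough to sketch. Below is a minimal, hypothetical illustration in plain NumPy (invented gate logits and capacity numbers; GPT-4's actual routing is not public) of top-2 token-choice routing with a hard per-expert capacity: the same two tokens can be assigned different experts, or dropped outright, depending on which other sequences share the batch. That batch-dependence is the non-determinism hypothesis in a nutshell.

```python
import numpy as np

def top2_route(logits, capacity):
    # Greedy top-2 token-choice routing with a hard per-expert capacity,
    # in the style of GShard/Switch-family MoE layers. A token loses any
    # assignment to an expert that is already full (it is "dropped").
    n_tokens, n_experts = logits.shape
    top2 = np.argsort(-logits, axis=1)[:, :2]   # each token's 2 preferred experts
    load = np.zeros(n_experts, dtype=int)
    assigned = []
    for t in range(n_tokens):                   # position within the batch matters
        kept = [int(e) for e in top2[t] if load[e] < capacity]
        for e in kept:
            load[e] += 1
        assigned.append(kept)
    return assigned

rng = np.random.default_rng(0)
n_experts, capacity = 4, 2

# The same two tokens ("our" sequence) placed in two different batches,
# using randomly generated gate logits as a stand-in for a real router.
seq = rng.normal(size=(2, n_experts))
batch_a = np.vstack([seq, rng.normal(size=(2, n_experts))])
batch_b = np.vstack([rng.normal(size=(6, n_experts)), seq])

print(top2_route(batch_a, capacity)[:2])   # our tokens routed first: get their picks
print(top2_route(batch_b, capacity)[-2:])  # routed last: experts already full
```

In the larger batch, the other sequences' tokens fill expert capacity first, so our tokens receive different (or no) expert assignments and hence different outputs. Under this hypothesis, an identical prompt yields different completions depending on what else happens to be batched alongside it at inference time.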

Blog Refurbishment

2022-08-13

The author discusses the process of refurbishing their blog, the challenges they faced, and the changes they made to the site. They also talk about their motivation for the redesign and their plans for future content.