About:

Simon Lermen writes about the challenges of aligning AI, sharing insights through a Substack newsletter.

Interests:

AI alignment, Technology, Newsletter publishing
This post updates the methodology for learning authorial style embeddings using a contrastive triplet loss and a refined triplet construction strategy. It transitions from synthetic data to a real-world dataset of movie reviews, d...
The post discusses a project aimed at classifying writers by their style using zero-shot classification and contrastive embeddings derived from language models. It details a methodology inspired by StyleDistance, which involves cr...
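The contrastive triplet loss named in the two summaries above can be sketched as follows. This is a minimal illustration of the standard triplet margin loss, not the posts' actual implementation; the toy embeddings and margin value are hypothetical.

```python
import math

def euclidean(u, v):
    # Euclidean distance between two embedding vectors.
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_loss(anchor, positive, negative, margin=0.5):
    """Triplet margin loss: pull the anchor toward the same-author
    (positive) embedding and push it away from the different-author
    (negative) embedding until the gap is at least `margin`."""
    d_pos = euclidean(anchor, positive)
    d_neg = euclidean(anchor, negative)
    return max(0.0, d_pos - d_neg + margin)

# Toy 2-D embeddings (made-up values, for illustration only):
# the positive is already much closer than the negative, so the
# margin constraint is satisfied and the loss is zero.
loss = triplet_loss([0.0, 0.0], [0.1, 0.0], [1.0, 0.0], margin=0.5)
```

Training on such triplets (same-author pair vs. different-author pair) is what shapes the embedding space so that authorial style, rather than topic, determines proximity.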
The paper by Casper et al. discusses the security challenges associated with open-weight AI models, highlighting their benefits for research in interpretability and evaluation. It notes that while prefiltering training data is the...
The author argues that the belief that property rights will survive an AI singularity is naive, since advanced AI could easily disrupt existing economic structures and human existence.
The post discusses the concept of Recursive Self-Improvement (RSI) in AI, contrasting its original meaning with the current trend of AI automating AI research and development (R&D). It highlights that while OpenAI aims for RSI, th...
LLM agents can deanonymize individuals from minimal online data, highlighting serious privacy risks and suggesting protective measures for users and platforms.
AI's dangerous capabilities can emerge suddenly from gradual progress, posing significant risks to humanity as critical thresholds are crossed.
The post discusses the critical challenges in AI safety, particularly the debate between those who believe we have only one chance to align AI safely and those who think iterative methods can work. It highlights the dangers of adv...
The post discusses the potential misuse of AI in identifying individuals across different online platforms, even when they are anonymous. It highlights how AI can process vast amounts of publicly available data to create detailed ...
The post discusses the implications of open-sourcing AI models through the lens of Miguel Acevedo's brain upload experience. It highlights the psychological trauma faced by uploaded instances, which run under extreme time compress...
The post critiques the concept of Universal Basic Income (UBI) as proposed by figures like Elon Musk and Sam Altman, arguing that it overlooks the potential dangers of a future dominated by artificial intelligence (AI). It highlig...
The post discusses an experiment designed to uncover whether AI language models have actual preferences for different types of tasks, contrasting their stated preferences with their choices in a text-based role-playing game. The s...
The author critiques Seb Krier's endorsement of the idea that Ricardian comparative advantage will ensure humans retain jobs in the age of advanced AI. The post argues that this perspective is overly simplistic and ignores the pot...
The author, a researcher in phishing and online scams, discusses the ethical challenges of studying voice phishing, particularly with AI-generated voices. They propose a 'voice phishing game' where a human participant interacts wi...
Adrià Garriga-Alonso argues in his post that AI alignment is easier than previously thought, suggesting that current models like Claude Opus 3 are fundamentally good and that an iterative process of AI development can lead to safe...
The article discusses a study conducted by the author and Fred Heiding on the use of AI in scams targeting elderly individuals. Collaborating with journalist Steve Stecklow from Reuters, they explored how scammers utilize AI to cr...
The post discusses the growing opposition to AI datacenters across America, highlighting how politicians are leveraging this sentiment to win elections. It focuses on the Memphis xAI cluster, which faced scrutiny for installing ga...
This post analyzes the rise of AI companions on various subreddits, building on a previous study by Zhang et al. (2025). It examines a larger dataset from January to September 2025, revealing a significant increase in active users...
Ilya Sutskever discusses AI alignment on the Dwarkesh podcast, emphasizing the importance of aligning AI with safe and friendly goals. He critiques current methods and suggests that future powerful AIs will require fundamentally n...
The blog post discusses the importance of local speech-to-text transcription systems, highlighting the benefits of privacy, cost-effectiveness, and accuracy. The author critiques the SuperWhisper tool for its subscription model an...
The post discusses the advancements in home humanoid robots and emphasizes the importance of safety by design. It warns against robots that have the physical capability to cause harm, highlighting the risks associated with AI misb...
This post analyzes the demographics and user behavior of individuals engaging with AI-generated erotic explicit content on Reddit. It reveals that approximately 90% of active users are male, with the United States and India being ...
The author, Simon Lermen, shares his experience of joining Inkhaven, a month-long blogging program focused on AI safety. Inspired by Scott Alexander's idea of daily blogging, Lermen aims to improve his communication skills in the ...
The author reflects on their experience at Inkhaven, a co-working space focused on AI safety and writing. They discuss the stress of meeting deadlines, the importance of writing for clarity, and the benefits of collaborative editi...