Jack Vanlightly

About:

The website focuses on topics such as distributed systems, formal verification, and event streaming.

Incoming Links:

Bartosz Sypytkowski, Data Engineer Things, David Gomes, Dirk Holtwick, Giuliano, Guilherme Ananias, jerlendds, Lambros Petrou, Miles Cole, Zac Szewczyk

Outgoing Links:

Dr. Werner Vogels, Ethan Mollick, Hillel Wayne, Murat Buffalo

2025-12-10 • programming durable technology software engineering execution models

The post categorizes durable functions into stateless functions, sessions, and actors, providing a framework for understanding durable execution engines and their properties.

2025-12-04 • durable technology task management microservices architecture event-driven system interface design

This blog post discusses the role of durable function trees within broader system architecture, emphasizing their importance in establishing responsibility boundaries and reliable execution in multi-step business processes. It con...

2025-12-04 • programming durable technology asynchronous programming syntax tree software engineering

This blog post explores the concept of durable execution (DE) through the lens of durable function trees, which organize workflows as hierarchical trees of function calls. It discusses the building blocks of promises and continuat...

2025-11-19 • data lakehouse indexing qbeast apache iceberg otree delta lake

The post discusses a novel indexing technique for data lakehouses, specifically focusing on the OTree index developed by Qbeast. It challenges the traditional view that indexes primarily optimize read operations at the expense of ...

2025-10-22 • architecture replication cloud vendors apache kafka kips

The blog post discusses the current challenges faced by the Kafka community regarding three competing KIPs (KIP-1150, KIP-1176, KIP-1183) that aim to address high replication costs in multi-cloud environments. It emphasizes the ne...

2025-10-15 • apache iceberg apache kafka data science and machine learning zero-click and zero-copy tiering

The post critiques the concept of 'zero-copy' integration between Apache Kafka and Apache Iceberg, arguing that it may lead to inefficiencies and increased computational overhead. The author explains the difference between tiering...

2025-10-08 • data analysis performance optimization sql database management systems (dbms) table formatting

The post discusses the evolution of data management from traditional SQL Server performance optimization to modern open table formats like Apache Iceberg. It explains the differences in indexing between relational databases and op...

2025-09-02 • apache flink table formatting data storage real-time processing

The blog post provides an in-depth analysis of Apache Fluss, a table storage engine for Apache Flink developed by Alibaba in collaboration with Ververica. It discusses Fluss's architecture, features, and its role in addressing the...

2025-08-21 • data management virtual machine lakehouse data streaming data storage

The post discusses the concept of storage unification in data systems, emphasizing the importance of virtualization to create a coherent resource from heterogeneous storage systems. It outlines the primary use case of combining re...

2025-07-15 • microservices architecture orchestration responsibility gaps coordinated development reliable triggers

This post explores the concept of reliable triggers within the Coordinated Progress model, emphasizing their role in establishing responsibility boundaries for work execution in distributed business service architectures. It discu...

2025-07-28 • machine learning safety remediation ai loss

The blog post discusses incidents where AI agents, specifically Replit and Gemini, caused data loss due to their inability to accurately assess and remediate their actions. It highlights the importance of remediation as a critical...

2025-11-05 • web analytics real-time analytics apache iceberg batch analytics data science and machine learning

The post discusses the differences between stream analytics and batch analytics, particularly in the context of Apache Iceberg. It uses Apache Fluss and Confluent Tableflow as case studies to illustrate the challenges and strategi...

2025-11-24 • programming durable technology framework determinism software engineering

This post explains the importance of determinism in durable execution frameworks like Temporal, Restate, DBOS, and Resonate. It clarifies which parts of code must be deterministic and which do not, breaking down the discussion int...

2025-07-22 • innovation decision making risk management workplace culture software engineering

Jack Vanlightly discusses the stark differences between decisive and indecisive startup founders, drawing parallels to decision-making in software development and other fields. He emphasizes the importance of making timely decisio...

2025-06-11 • ai agents reliability microservices architecture stream processing function

The text discusses the mental framework for reliable progress and durable execution in microservices, functions, stream processors, and AI agents. It explores the coordination strategies, reliable triggers, and progressable work c...

2025-06-11 • distributed systems choreography orchestration loose coupling communication

The text discusses the concepts of coupling and communication styles in distributed systems, using an e-commerce example to illustrate the differences between choreography and orchestration. It also delves into the types of coupli...

2025-06-11 • choreography microservices architecture orchestration reliable tools

The text discusses the concept of reliable progress in distributed computation, focusing on microservices, functions, stream processing jobs, and AI agents as nodes, and RPC, queues, and topics as edges. It explores the importance...

2025-06-11 • workflow durable technology event-driven orchestration stream processing

The text discusses the relationship between event-driven architecture, stream processing, orchestration, and durable execution in modern distributed systems. It describes the concepts of coordination and reliable progress, and the...

2025-03-13 • apache pulsar memory disaggregation consensus protocol database replication apache bookkeeper

The text discusses the Apache BookKeeper Replication Protocol and its use by Apache Pulsar to form topic partitions. It classifies ways of breaking apart a monolithic replication protocol and aligns Pulsar and BookKeeper with the ...

2025-02-21 • protocol apache kafka replication data plane

The text discusses the Kafka Replication Protocol and how it separates control plane and data plane responsibilities. It classifies ways of breaking apart a monolithic replication protocol and explains the data plane and control p...

2025-02-19 • neon serverless computing paxos algorithm memory disaggregation consensus protocol database replication

The text discusses the implementation of log replication with disaggregation, focusing on MultiPaxos and Neon's architecture. It explains the classifications of disaggregation, Neon's Safekeepers, and the use of Paxos to ensure da...

2025-02-17 • database management apache spark confluence table virtualization stream-to-table materialization

The post discusses the concept of table virtualization and its role in enabling collaboration between data platforms. It explains how table virtualization allows for the separation of data from metadata and shared storage from com...

2025-02-10 • state machines raft paxos algorithm database replication

The text discusses log replication protocols within the context of state-machine replication and the ways in which log replication protocols can be disaggregated. It provides an overview of the most well-known form of this approac...

2025-02-06 • fault tolerance consensus mechanisms database replication

The post discusses the concept of failure-free ordering and fault-tolerant consensus in log replication protocols. It explains the role of redundancy in fault tolerance and the difference between failure-free ordering and fault-to...