About:
The website focuses on topics such as distributed systems, formal verification, and event streaming.
The post categorizes durable functions into stateless functions, sessions, and actors, providing a framework for understanding durable execution engines and their properties.
This blog post discusses the role of durable function trees within broader system architecture, emphasizing their importance in establishing responsibility boundaries and reliable execution in multi-step business processes. It con...
This blog post explores the concept of durable execution (DE) through the lens of durable function trees, which organize workflows as hierarchical trees of function calls. It discusses the building blocks of promises and continuat...
The post discusses a novel indexing technique for data lakehouses, specifically focusing on the OTree index developed by Qbeast. It challenges the traditional view that indexes primarily optimize read operations at the expense of ...
The blog post discusses the current challenges faced by the Kafka community regarding three competing KIPs (KIP-1150, KIP-1176, KIP-1183) that aim to address high cross-AZ replication costs in cloud environments. It emphasizes the ne...
The post critiques the concept of 'zero-copy' integration between Apache Kafka and Apache Iceberg, arguing that it may lead to inefficiencies and increased computational overhead. The author explains the difference between tiering...
The post discusses the evolution of data management from traditional SQL Server performance optimization to modern open table formats like Apache Iceberg. It explains the differences in indexing between relational databases and op...
The blog post provides an in-depth analysis of Apache Fluss, a table storage engine for Apache Flink developed by Alibaba in collaboration with Ververica. It discusses Fluss's architecture, features, and its role in addressing the...
The post discusses the concept of storage unification in data systems, emphasizing the importance of virtualization to create a coherent resource from heterogeneous storage systems. It outlines the primary use case of combining re...
This post explores the concept of reliable triggers within the Coordinated Progress model, emphasizing their role in establishing responsibility boundaries for work execution in distributed business service architectures. It discu...
The blog post discusses incidents where AI agents, specifically Replit and Gemini, caused data loss due to their inability to accurately assess and remediate their actions. It highlights the importance of remediation as a critical...
The post discusses the differences between stream analytics and batch analytics, particularly in the context of Apache Iceberg. It uses Apache Fluss and Confluent Tableflow as case studies to illustrate the challenges and strategi...
This post explains the importance of determinism in durable execution frameworks like Temporal, Restate, DBOS, and Resonate. It clarifies which parts of code must be deterministic and which do not, breaking down the discussion int...
Jack Vanlightly discusses the stark differences between decisive and indecisive startup founders, drawing parallels to decision-making in software development and other fields. He emphasizes the importance of making timely decisio...
The post discusses a mental framework for reliable progress and durable execution in microservices, functions, stream processors, and AI agents. It explores the coordination strategies, reliable triggers, and progressable work c...
The post discusses coupling and communication styles in distributed systems, using an e-commerce example to illustrate the differences between choreography and orchestration. It also delves into the types of coupli...
The post discusses reliable progress in distributed computation, treating microservices, functions, stream processing jobs, and AI agents as nodes and RPC, queues, and topics as edges. It explores the importance...
The post discusses the relationship between event-driven architecture, stream processing, orchestration, and durable execution in modern distributed systems. It describes the concepts of coordination and reliable progress, and the...
The post discusses the Apache BookKeeper Replication Protocol and its use by Apache Pulsar to form topic partitions. It classifies ways of breaking apart a monolithic replication protocol and aligns Pulsar and BookKeeper with the ...
The post discusses the Kafka Replication Protocol and how it separates control plane and data plane responsibilities. It classifies ways of breaking apart a monolithic replication protocol and explains the data plane and control p...
The post discusses the implementation of log replication with disaggregation, focusing on MultiPaxos and Neon's architecture. It explains the classifications of disaggregation, Neon's Safekeepers, and the use of Paxos to ensure da...
The post discusses the concept of table virtualization and its role in enabling collaboration between data platforms. It explains how table virtualization allows for the separation of data from metadata and shared storage from com...
The post discusses log replication protocols in the context of state-machine replication and the ways in which they can be disaggregated. It provides an overview of the most well-known form of this approac...
The post discusses the concept of failure-free ordering and fault-tolerant consensus in log replication protocols. It explains the role of redundancy in fault tolerance and the difference between failure-free ordering and fault-to...