About:

Dan Goldin is the author of the Twing Data newsletter, a Substack publication. The newsletter contains product and feature announcements as well as thoughts on the data world, and it has hundreds of subscribers.

Rill offers a streamlined approach to AWS cost monitoring, automating data ingestion and providing intuitive dashboards for effective cost analysis.
Apache Iceberg is an open-source table format that brings SQL table-like functionality to large analytic datasets on inexpensive storage. It provides ACID transactions, schema evolution, time travel, and scalability to petabytes. ...
The text discusses the key numbers to know when sizing big data workloads, based on the experiences of Twing Data in handling various data systems. It provides a quick reference table for architecting a data workload and emphasize...
The text discusses the challenges of handling personal data deletion in AdTech data systems, due to the large volume and complex storage format of the data. It explores various strategies for data deletion, including storing perso...
Data Council 2025 was an event that showcased the next generation AI-native data stack, with a focus on real-time analytics, low-latency data infrastructure, and the unification of OLTP and OLAP. The event highlighted the importan...
The article outlines Twing Data's experience with integrating into the AWS Marketplace, the technical approach taken, and the benefits for buyers. It also provides an overview of the technical implementation of the integration.
The text discusses the evolution of the mobile-first economy and the rise of the cross-platform app economy. It focuses on Snowflake's Native App Marketplace, its unique features, and the advantages it offers to users and developers. ...
The text is about the development of a Snowflake Native app called Twing Pulse, which is now available on the Snowflake Marketplace. It provides useful queries from the original product, Twing Data, to Snowflake users. The author ...
The text discusses the challenges of data engineering work with APIs and the use of DuckDB to simplify the process. It explains how to use DuckDB to query semi-structured JSON data and enrich data from APIs, making the workflow mo...
The article discusses the impact of business intelligence tools on data warehouse operations and costs. It highlights the importance of intelligent caching and the storage of extract or cached data in different BI tools. It emphas...
The text discusses the limitations of using a single data warehouse for all data needs and suggests using specialized tools for specific tasks. It emphasizes the importance of being intentional about the use of data and implementi...
The post explores Snowflake's pricing models and optimizations to reduce cost and improve performance. It discusses the flexible pricing model based on compute, storage, and cloud services, and offers proven best practices to get ...
The text discusses three Snowflake queries that provide valuable metadata for analyzing data warehouses and identifying optimization opportunities. It explains how to use these queries to explore table usage, identify optimization...
The text discusses the real cost of BigQuery, a managed system that abstracts and integrates several proprietary elements, as well as many independent Google Cloud Platform (GCP) services. It explains the pricing models, optimizat...
The text discusses the importance of evaluating security when choosing data infrastructure tools, focusing on aspects such as data retention, access controls, encryption, and third-party risks. It provides guidelines for ensuring ...
The text discusses Redshift Serverless pricing and its differences from a true serverless system. It explains the basics of Redshift Serverless, its pricing policies, and the impact of its minimum charge policy on warehouse costs....
The text discusses the challenges faced by data teams, including lack of trust in reporting and data points, and the disconnect between data teams and product management. It suggests starting manually to validate the need and solu...
The text discusses the pricing models and configuration options of Amazon Redshift, a major data warehouse solution. It explains the pricing structure, system options, reserved capacity, automated features, storage management, and...
The text explores Redshift's approach to query hashing and compares it to Snowflake's. It discusses how Redshift's query hashing is more thoughtful, parsing and analyzing the query plan to generate the query hash. It provides exa...
The text discusses the importance of tracking unit costs to avoid surprise cloud bills. It emphasizes the need to tag everything and track unit costs to understand the primary drivers of service costs and make accurate forecasts. ...
The text explores the limitations of Snowflake's query_parameterized_hash, highlighting how Snowflake handles query hashing and the shortcomings of its approach. It also discusses how Twing Data implements a more thoughtful and ag...
The text discusses the benefits and challenges of using Snowflake, a cloud-based data warehousing platform. It highlights the cost implications of using Snowflake and suggests a multi-account strategy to optimize resource allocati...
Twing Data has made major developments, including revamping their backend infrastructure, enhancing data extraction capabilities, and launching a self-serve offering. They aim to empower companies of all sizes to leverage their da...
The post discusses the challenges of handling truncated insert queries using regex and ChatGPT. It explains how Twing Data parses every query on a customer's data warehouse, and how they handled truncated queries from a customer u...