About:

Mark Litwintschik is the author of the Tech Blog, which provides benchmarks and tips for big data technologies, including Hadoop, AWS, Google Cloud, PostgreSQL, Spark, Python, and more.

The post discusses the release of the GlobalBuildingAtlas (GBA) dataset by researchers at the Technical University of Munich, which estimates the number of buildings on Earth to be around 2.75 billion, contrasting with the UN's es...
The blog post discusses Planet Labs' hyperspectral satellite Tanager-1, which captures a wide range of spectral data beyond the typical RGB bands used in standard satellite imagery. It highlights the satellite's capabilities, incl...
The post reviews Microsoft's updated GlobalMLBuildingFootprints database, detailing the author's technical setup and the complexities of processing and analyzing the dataset.
The blog post discusses Digital Earth Australia's Tidal Composites dataset, which is a cloud-free mosaic of Australia's coasts, estuaries, and reefs created from Sentinel-2 satellite imagery from 2016 to 2023. The dataset, totalin...
The post reviews various level 0 administrative boundary datasets, discussing their accuracy, geometry, and metadata. It includes examples from datasets like OpenStreetMap, Overture, UN COD, and others, detailing the number of rec...
The blog post discusses the release of the OpenBuildingMap (OBM) dataset, which includes building footprints, heights, usage categories, and floorspace for 2.7 billion buildings worldwide. It details the dataset's creation using e...
Data Crew's Route Optimiser Framework offers solutions for routing problems like the Chinese Postman and Traveling Salesman, demonstrated through practical examples in Argentina.
The post analyzes a dataset published by Business Insider regarding the locations, ownership, and resource consumption of data centers in the U.S. It details the methodology used to gather and process the data, including the use o...
The post analyzes the latest All The Places release, which holds location data for thousands of brands, detailing the technical process and the brands' geographical distribution.
The Layercake project provides updated Parquet files of OpenStreetMap (OSM) data, including buildings, highways, and settlements, which can be downloaded efficiently. The author details their high-performance workstation setup and...
This post outlines the conversion of Google's Street View dataset into Parquet format, detailing the technical setup and analysis of geospatial coverage trends.
The post discusses the Global Mining Dataset released by the International Council on Mining and Metals (ICMM), which includes over 8,000 mines and related assets worldwide. The author details their workstation setup, including ha...
The blog post discusses the refreshed Open Database of Buildings (ODB) dataset released by Statistics Canada, which compiles building data from 530 datasets across 107 government sources. The author details their analysis process ...
Public Safety Canada released a dataset called Canada Structures, containing over 13 million building footprints across Canada, including metadata on heights and usage. The data was sourced from OpenStreetMap, Microsoft Building F...
The post discusses Alberta's significant role in Canada's oil production, highlighting that it produces 85% of the country's oil and ranks 5th globally among oil exporters. It details the extensive pipeline network in Alberta, wit...
The post details the conversion of Canada's wind turbine dataset into Parquet format, highlighting technical processes and insights on turbine manufacturers and installations.
Overture Maps' latest Places dataset features over 72 million global points of interest, detailing their categories, operating statuses, and confidence levels for data analysis.
The blog post discusses the release of version 8.1 of the U.S. Wind Turbine Database (USWTDB) by the US Geological Survey, detailing the dataset's features, including the location and capabilities of over 76,000 wind turbines acro...
Jake Stid, a postdoctoral research associate at Michigan State University, introduces the Ground-Mounted Solar Energy in the United States (GM-SEUS) dataset, which includes 15,000 arrays and 2.9 million panels from solar farms acr...
The Overture Maps Foundation publishes a global dataset every month containing the world's buildings, roads, addresses and places of interest. The data is shared via Parquet files hosted on AWS S3 and Microsoft's Azure. The softwa...
Apple has supported depth maps in the images its iPhones capture since 2017. These depth maps, along with other imagery, are stored in High Efficiency Image File Format (HEIF) container files. Finn Jaeger, head of VFX at Replayboy...
Apple's DepthPro, a fast depth estimation model, is being tested against Maxar's 2025 satellite imagery of Bangkok, Thailand. The author details the hardware and software setup used for the analysis, the process of tiling Maxar's ...
The author discusses a Python package called Building Regulariser that can help correct the issue of wobbly building outlines detected by GeoDeep in satellite imagery. The package is developed by Nick Wright, a Senior Research Sci...
The post discusses the features of ArcGIS Pro 3.5, including system specifications, Python upgrades, Parquet support, 3D clouds, histogram ergonomics, SAR support, vector-based PDFs, label engine updates, and raster to polygon fun...