Insights · 6 min read

Parquet vs CSV: when analysts still reach for flat files

Columnar storage wins in warehouses; CSV remains the handoff format for humans, Excel, and legacy tools.

Published March 21, 2025 · Table

Parquet stores data column by column, with an explicit type and compression for each column, which suits Spark, DuckDB, and cloud warehouses. CSV stays the lowest common denominator for email attachments, regulatory submissions, and quick human review.
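To make the typing point concrete, here is a minimal sketch that round-trips two records through CSV using only the Python standard library. The column names and values are hypothetical, chosen purely for illustration. It shows what a typed format avoids: after the round trip every value is a string, and a missing value becomes an empty string.

```python
import csv
import io
from datetime import date

# Hypothetical sample rows; the column names are illustrative only.
rows = [
    {"order_id": 1001, "amount": 19.99, "shipped": date(2025, 3, 1)},
    {"order_id": 1002, "amount": 5.0, "shipped": None},
]


def csv_round_trip(records):
    """Write records to CSV text, then read them back as dicts."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    buffer.seek(0)
    return list(csv.DictReader(buffer))


restored = csv_round_trip(rows)
for original, after in zip(rows, restored):
    # Integers, floats, dates, and None all come back as plain strings.
    print(original, "->", after)
```

Whoever receives the CSV has to guess the types again, which is fine for a quick review but is one reason to avoid CSV as the canonical copy.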

Split the workflow

  • Store canonical tables in Parquet/Iceberg inside the lake.
  • Emit bounded CSV slices for stakeholders who will not write SQL.
  • Use a viewer for those slices instead of re-importing to Sheets.
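One way to read "bounded" in the list above is a capped row count plus an explicit column list. The sketch below assumes that reading and uses only the standard library. The function name `write_csv_slice`, its parameters, and the sample data are hypothetical; in practice the rows would come from a Parquet reader (a library such as pyarrow or DuckDB), which is not shown here.

```python
import csv
import io
from itertools import islice


def write_csv_slice(rows, columns, out, max_rows=1000):
    """Write at most max_rows rows, restricted to columns, as CSV.

    rows is any iterable of dicts; returns the number of data rows written.
    """
    # extrasaction="ignore" drops keys that are not in the column list.
    writer = csv.DictWriter(out, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    written = 0
    for row in islice(rows, max_rows):
        writer.writerow(row)
        written += 1
    return written


# Hypothetical stand-in for a canonical table read from the lake.
canonical = (
    {"id": i, "region": "EMEA" if i % 2 else "APAC", "internal_note": "n/a"}
    for i in range(5000)
)

buffer = io.StringIO()
count = write_csv_slice(canonical, ["id", "region"], buffer, max_rows=3)
print(count)
print(buffer.getvalue())
```

Capping rows and naming columns up front keeps the attachment small and keeps internal-only fields out of the handoff file, while the full table stays in Parquet.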
