Product guide · 6 min read

ML feature CSVs: eyeballing training exports before notebooks run

Spot constant columns, label leakage, and impossible ranges in flat files before the data reaches scikit-learn or PyTorch.

Published March 21, 2025 · Table

A human scan complements automated profiling: sort each numeric feature so extreme values surface at the top and bottom, search for sentinel strings such as "unknown", and verify label cardinality before training.
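The scan above can be roughed out in a few lines of standard-library Python. This is a minimal sketch, not a replacement for a profiling tool: the function name scan_csv, the SENTINELS list, and the sample columns are all assumptions made for illustration, and the numeric check simply treats any cell that parses as a finite float as a number.

```python
import csv
import io
import math
from collections import Counter

# Assumed sentinel spellings; extend for your own exports.
SENTINELS = {"", "unknown", "n/a", "na", "null", "none", "-999"}


def _to_float(text):
    """Return a finite float, or None if the cell is not numeric."""
    try:
        value = float(text)
    except ValueError:
        return None
    return value if math.isfinite(value) else None


def scan_csv(text, label_col):
    """Summarize constant columns, sentinel hits, numeric ranges, label counts."""
    reader = csv.DictReader(io.StringIO(text))
    rows = list(reader)
    report = {"constant": [], "sentinels": {}, "ranges": {}, "label_counts": {}}
    if not rows:
        return report
    for col in reader.fieldnames or []:
        cells = [(row[col] or "").strip() for row in rows]
        # A column with a single distinct value carries no signal.
        if len(set(cells)) == 1:
            report["constant"].append(col)
        # Count cells that match a known placeholder spelling.
        hits = sum(1 for cell in cells if cell.lower() in SENTINELS)
        if hits:
            report["sentinels"][col] = hits
        if col == label_col:
            report["label_counts"] = dict(Counter(cells))
            continue
        # Min and max are what sorting the column would put at either end.
        numbers = [n for n in map(_to_float, cells) if n is not None]
        if numbers:
            report["ranges"][col] = (min(numbers), max(numbers))
    return report


# Hypothetical export, for illustration only.
SAMPLE = """age,country,source,label
34,US,web,1
-999,unknown,web,0
51,DE,web,1
"""

report = scan_csv(SAMPLE, "label")
print(report)
```

On this sample the report flags source as constant, counts one sentinel each in age and country, and shows an age minimum of -999, which is the kind of impossible range a sorted column makes obvious.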

Red flags

  • Future-dated columns present alongside the target (leakage).
  • IDs whose sort order lines up perfectly with the labels (often a merge bug).
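Both red flags can be checked mechanically once you know which columns to look at. The sketch below assumes ISO-formatted dates, integer IDs, and a hypothetical snapshot_date column marking the point at which a prediction would be made; none of these come from a particular export, and the function names are illustrative.

```python
import csv
import io
from datetime import date


def future_dated_columns(rows, date_cols, cutoff_col):
    """Return date columns holding any value later than the row's cutoff date."""
    flagged = []
    for col in date_cols:
        for row in rows:
            if not row[col]:
                continue  # blank cells are skipped, not treated as leaks
            if date.fromisoformat(row[col]) > date.fromisoformat(row[cutoff_col]):
                flagged.append(col)
                break
    return flagged


def ids_sort_with_labels(rows, id_col, label_col):
    """True if sorting by ID groups every label into one contiguous block."""
    ordered = sorted(rows, key=lambda row: int(row[id_col]))
    labels = [row[label_col] for row in ordered]
    # Count how often the label changes between neighbouring rows.
    switches = sum(1 for a, b in zip(labels, labels[1:]) if a != b)
    distinct = len(set(labels))
    return distinct > 1 and switches == distinct - 1


# Hypothetical export, for illustration only.
SAMPLE = """id,snapshot_date,signup_date,closed_date,label
1,2024-06-01,2024-01-10,,0
2,2024-06-01,2024-02-03,,0
3,2024-06-01,2024-03-15,2024-07-20,1
4,2024-06-01,2024-04-22,2024-08-02,1
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
leaky = future_dated_columns(rows, ["signup_date", "closed_date"], "snapshot_date")
suspicious_ids = ids_sort_with_labels(rows, "id", "label")
print(leaky, suspicious_ids)
```

Here closed_date is flagged because it holds dates after the snapshot, and the IDs separate the two labels cleanly. On a file this small a clean split can be coincidence; on a real export with thousands of rows it usually points to tables concatenated per class or a join that leaked ordering.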
