I built a CLI tool to quickly sanity-check CSV files (tidypeek)
Source: Dev.to
Overview
Working with CSV files can be frustrating. You often wonder:
- Are there missing values?
- Are there duplicate rows?
- Which column is the actual ID?
- Is the dataset clean enough to work with?
I found myself repeating the same basic checks, so I created tidypeek, a lightweight command‑line tool that provides a quick sanity check of any CSV file.
Installation
pip install tidypeek
Usage
tidypeek yourfile.csv
The tool analyzes the dataset and reports:
- Total rows and columns
- Column types
- Missing values
- Duplicate rows
- Likely identifier columns
- Duplicate IDs
- Simple insights about the data
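To make the checks above concrete, here is a minimal sketch of how several of them could be computed with only the Python standard library. It is not tidypeek's actual implementation: the function name `quick_profile`, the result keys, and the 0.9 uniqueness cutoff for "likely identifier" are all assumptions chosen for illustration, and column type inference is left out.

```python
import csv
import io
from collections import Counter

# Hypothetical cutoff for illustration; not taken from tidypeek.
ID_UNIQUENESS_THRESHOLD = 0.9


def quick_profile(csv_text):
    """Return basic sanity-check stats for CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = list(reader.fieldnames or [])
    # Normalize every record to a tuple of stripped strings ("" means missing).
    rows = [
        tuple((record.get(col) or "").strip() for col in columns)
        for record in reader
    ]

    missing = {}
    likely_ids = []
    duplicate_ids = {}
    for index, col in enumerate(columns):
        values = [row[index] for row in rows if row[index] != ""]
        missing[col] = len(rows) - len(values)
        if not values:
            continue
        # Share of distinct values among the non-empty cells of this column.
        uniqueness = len(set(values)) / len(values)
        if uniqueness >= ID_UNIQUENESS_THRESHOLD:
            likely_ids.append(col)
            duplicate_ids[col] = len(values) - len(set(values))

    # Every repeat of an already-seen row counts as one duplicate.
    duplicate_rows = sum(count - 1 for count in Counter(rows).values())

    return {
        "rows": len(rows),
        "columns": len(columns),
        "missing": missing,
        "duplicate_rows": duplicate_rows,
        "likely_ids": likely_ids,
        "duplicate_ids": duplicate_ids,
    }
```

Calling `quick_profile(open("yourfile.csv", encoding="utf-8").read())` returns a plain dictionary, which keeps the sketch easy to print or test. A real tool would likely stream large files instead of reading them fully into memory.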
Features
- Fast – one command, with results printed straight to the terminal.
- Simple – no heavy profiling libraries required.
- Terminal‑based – ideal for quick inspections before deeper analysis.
Typical output examples:
- “4 columns have high missing values”
- “Column ‘name’ appears to be an identifier but contains duplicates”
- “12 columns have low uniqueness — useful for grouping”
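Messages like these can be produced by comparing per-column ratios against thresholds. The sketch below shows one possible approach, not how tidypeek does it: the function names, the 0.5 "high missing" cutoff, and the 0.05 "low uniqueness" cutoff are hypothetical values picked for the example.

```python
def plural(count, noun):
    """Format a count with a singular or plural noun, e.g. '1 column'."""
    return f"{count} {noun}" + ("" if count == 1 else "s")


def build_insights(missing_ratio, uniqueness, high_missing=0.5, low_uniqueness=0.05):
    """Turn per-column ratios (column name -> float in [0, 1]) into messages.

    The thresholds are illustrative assumptions, not tidypeek's values.
    """
    messages = []

    # Columns where at least `high_missing` of the cells are empty.
    high = [col for col, ratio in missing_ratio.items() if ratio >= high_missing]
    if high:
        verb = "has" if len(high) == 1 else "have"
        messages.append(f"{plural(len(high), 'column')} {verb} high missing values")

    # Columns with few distinct values relative to their size.
    low = [col for col, ratio in uniqueness.items() if ratio <= low_uniqueness]
    if low:
        verb = "has" if len(low) == 1 else "have"
        messages.append(
            f"{plural(len(low), 'column')} {verb} low uniqueness (useful for grouping)"
        )

    return messages
```

Keeping the thresholds as keyword arguments makes it easy to tune what counts as "high" or "low" for a particular dataset.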
Use Cases
- Quick dataset inspection
- Data cleaning workflows
- Learning data analysis
Links
- GitHub:
- PyPI:
Feedback and feature requests are welcome!