CSV to Excel Conversion: Why Opening a CSV in Excel Corrupts Your Data

Published: (March 24, 2026 at 10:04 PM EDT)
3 min read
Source: Dev.to

Source: Dev.to

How Excel corrupts CSV data

When Excel opens a CSV file directly, it applies automatic type detection to every column. There is no confirmation dialog and no way to preview the results before they are applied. The most common corruptions:

  • Leading zeros stripped – Zip code 07302 becomes 7302. Any numeric identifier with leading zeros loses them.
  • Dates created from non‑dates"OCT4" (a gene name) becomes October 4th. "3-5" becomes March 5th. "1/2" becomes January 2nd. This issue caused enough problems in genomics research that 27 human genes were renamed.
  • Scientific notation – Long numbers like credit‑card numbers (4111111111111111) display as 4.11111E+15, and digits beyond the 15th are replaced with zeros.
  • Unicode mangling – Characters outside the ASCII range may display incorrectly depending on the encoding. Excel assumes a default encoding that varies by locale.

The correct approach

The safe way to import CSV data into Excel preserves your original data types.

In Excel – Use Data > From Text/CSV (or Get Data > From File > From Text/CSV). This opens an import wizard that lets you set each column’s data type before importing. Set problematic columns to Text to preserve exact values.

Programmatic conversion – Use a library that gives you explicit control over data types. In Python, pandas lets you specify dtypes:

import pandas as pd

df = pd.read_csv('data.csv', dtype={'zip_code': str, 'product_id': str})
df.to_excel('data.xlsx', index=False)

The dtype={'zip_code': str} parameter forces those columns to be treated as text, preventing any type coercion.

CSV encoding issues

CSV files have no standard encoding declaration. A file might be UTF‑8, Latin‑1, Windows‑1252, or UTF‑16. Excel guesses, and when it guesses wrong, accented characters, currency symbols, and non‑Latin scripts display as garbled text.

The safest approach is to know your encoding. If you created the CSV, use UTF‑8 with BOM (Byte Order Mark). The BOM is a special character at the start of the file that tells Excel to use UTF‑8. Without it, Excel often defaults to Windows‑1252 on Western systems.

What proper conversion preserves

A well‑done CSV‑to‑Excel conversion:

  • Preserves all data exactly as it appears in the CSV
  • Sets appropriate column widths for readability
  • Preserves Unicode characters with correct encoding
  • Optionally applies formatting (number formats, date formats) without changing underlying values
  • Handles quoted fields correctly (commas inside quoted strings are not treated as delimiters)
  • Preserves empty cells (empty strings vs. null)

The converter

For quick conversion without opening Excel or writing code, try a CSV to Excel converter that handles encoding detection, preserves data types, and produces a clean .xlsx file. No data corruption, no type coercion, no mangled Unicode.

0 views
Back to Blog

Related posts

Read more »