Fix Python ValueError: could not convert string to float '1,234' — Locale Number Formats

beginner | Python | 2026-06-20 | Python 3.x, pandas 1.x / 2.x, any OS (Windows, Linux, macOS)

Error Message

ValueError: could not convert string to float: '1,234'
Tags: python, float, string, pandas, data-processing, locale

The Error

You're trying to convert a numeric string to float, but the value has a comma in it. Python's float() accepts plain decimal strings such as '1234', '1234.56', or '1e3' (surrounding whitespace is ignored). A thousands separator or a currency symbol raises ValueError immediately.

This hits you when loading data from CSVs, Excel files, database exports, or APIs where numbers are formatted for human readability: '1,234', '$1,234.56', or '1.234,56' if the source uses European formatting.

Root Cause

That comma in '1,234' is a thousands separator, not a decimal point. Different locales format numbers differently:

  • US/UK: 1,234.56 (comma = thousands, period = decimal)
  • Much of continental Europe (DE/ES/IT, etc.): 1.234,56 (period = thousands, comma = decimal)
  • French-style: 1 234,56 (space = thousands, comma = decimal)

Python's float() doesn't know which convention your data uses. It just fails. When data comes from a spreadsheet, a locale-aware export, or user input, these extra characters will break the conversion every time.
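To see this behaviour directly, here is a small sketch that feeds float() a few representative strings. The helper name try_float is made up for illustration; it only wraps the built-in so the loop can keep going after a failure.

```python
def try_float(text):
    """Return the parsed float, or the error text if float() rejects the input."""
    try:
        return float(text)
    except ValueError as exc:
        return f"ValueError: {exc}"

# Plain decimals (with or without surrounding whitespace) parse;
# anything with separators or currency symbols does not.
samples = ["1234", "1234.56", " 1234 ", "1,234", "1.234,56", "$1,234.56"]
for text in samples:
    print(repr(text), "->", try_float(text))
```

The first three samples convert; the last three come back as error strings, regardless of which locale convention they follow.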

Step-by-Step Fix

Fix 1: Strip the comma manually (simplest case)

Your source uses US-style formatting — comma as thousands separator, period as decimal. Just strip it:

value = '1,234'
result = float(value.replace(',', ''))
print(result)  # 1234.0

Got currency symbols or extra whitespace? Chain the replacements:

value = '$1,234.56'
result = float(value.replace(',', '').replace('$', '').strip())
print(result)  # 1234.56
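If the same cleanup is needed in several places, the chained replacements can be wrapped in a small helper. This is a sketch, not a general-purpose parser: the name parse_us_number and the set of stripped currency symbols are assumptions, and it only makes sense for US-style input.

```python
import re

# Assumption: US-style input (comma = thousands, period = decimal).
# Strips commas, whitespace, and a few common currency symbols.
_UNWANTED = re.compile(r"[,\s$€£]")

def parse_us_number(text):
    """Remove separators and currency symbols, then convert with float()."""
    cleaned = _UNWANTED.sub("", text)
    return float(cleaned)  # still raises ValueError for text like 'N/A'

print(parse_us_number("1,234"))        # 1234.0
print(parse_us_number("$1,234.56"))    # 1234.56
print(parse_us_number(" -1,000.5 "))   # -1000.5
```

Letting the ValueError propagate for anything that is still not a number keeps bad input visible instead of silently turning it into a value.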

Fix 2: Clean a pandas column with str.replace + pd.to_numeric

The most common scenario: a DataFrame column loaded as object dtype instead of float64. Clean it in one line:

import pandas as pd

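df = pd.DataFrame({'price': ['1,234', '5,678', '9,012']})

# Remove the thousands separator, then convert. astype(float) guarantees float64
# even when every value is a whole number (pd.to_numeric alone returns int64 here).
df['price'] = pd.to_numeric(df['price'].str.replace(',', '', regex=False)).astype(float)
print(df['price'])
# 0    1234.0
# 1    5678.0
# 2    9012.0
# Name: price, dtype: float64
print(df.dtypes)
# price    float64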

Fix 3: Pass thousands parameter when reading CSV

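Better yet, handle it at load time. The thousands parameter tells pandas about the separator up front, so numeric columns are parsed as int64 or float64 instead of object. In a comma-delimited file, values such as "1,234" must be quoted; otherwise the comma is read as a column delimiter.

import pandas as pd

df = pd.read_csv('data.csv', thousands=',')
print(df.dtypes)  # numeric columns are now int64/float64, not object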

European-format files where the comma is the decimal separator need both parameters:

df = pd.read_csv('data.csv', decimal=',', thousands='.')
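European exports often use a semicolon as the column delimiter, precisely because the comma is taken by the decimal. The sketch below assumes such a file; the sample text, the column names, and the use of io.StringIO in place of a real file are all made up for illustration.

```python
import io
import pandas as pd

# Made-up sample: semicolon-delimited, German-style numbers.
raw = "item;price\nwidget;1.234,56\ngadget;987,10\n"

df_eu = pd.read_csv(
    io.StringIO(raw),  # stands in for a file path such as 'data.csv'
    sep=";",           # column delimiter (assumption about the file)
    decimal=",",       # comma is the decimal mark
    thousands=".",     # period is the thousands separator
)
print(df_eu["price"].tolist())
print(df_eu["price"].dtype)
```

With these settings the price column should load as float64, with '1.234,56' read as roughly 1234.56 rather than left as a string.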

Fix 4: Handle dirty or mixed data with errors='coerce'

Production data is messy. A single column can hold valid numbers, empty strings, and labels like 'N/A' — all at once. Instead of crashing on the bad rows, errors='coerce' converts what it can and turns the rest into NaN:

import pandas as pd

df = pd.DataFrame({'amount': ['1,234', 'N/A', '5,678', '', '9,012']})

df['amount'] = pd.to_numeric(
    df['amount'].str.replace(',', '', regex=False),
    errors='coerce'
)

print(df['amount'])
# 0    1234.0
# 1       NaN
# 2    5678.0
# 3       NaN
# 4    9012.0
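Because errors='coerce' hides the failures, it can help to list which raw values were turned into NaN before deciding what to do with them. A minimal sketch, reusing the same sample values; the variable names are illustrative:

```python
import pandas as pd

raw = pd.Series(['1,234', 'N/A', '5,678', '', '9,012'], name='amount')
parsed = pd.to_numeric(raw.str.replace(',', '', regex=False), errors='coerce')

# Rows that became NaN even though the raw cell was not empty or missing.
failed = parsed.isna() & raw.notna() & (raw.str.strip() != '')
print(raw[failed].tolist())  # ['N/A']
```

Anything that shows up here is text that looked like data but could not be parsed, which is usually worth a look before it is dropped or filled.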

Fix 5: Use the locale module for proper locale-aware parsing

When the source locale is known and consistent, locale.atof() handles the separator rules automatically — no manual string wrangling needed:

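import locale

# Locale names are platform-specific: the ones below work on most Linux/macOS
# systems if the locale is installed; Windows may need different names.
# setlocale() raises locale.Error when the requested locale is unavailable.

# US format: '1,234' → 1234.0
locale.setlocale(locale.LC_NUMERIC, 'en_US.UTF-8')
result = locale.atof('1,234')
print(result)  # 1234.0

# German format: '1.234,56' → 1234.56
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')
result = locale.atof('1.234,56')
print(result)  # 1234.56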

One catch: locale.setlocale() changes the setting process-wide and isn't thread-safe. In web apps or async code, stick with the explicit str.replace() approach instead.
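A locale-free alternative is to pass the separators explicitly, so nothing depends on process-wide state. The function below is a sketch; parse_number and its parameter names are hypothetical, not part of any library.

```python
def parse_number(text, thousands=",", decimal="."):
    """Convert a formatted number string using explicitly named separators."""
    # Drop thousands separators first, then normalise the decimal mark.
    cleaned = text.strip().replace(thousands, "")
    if decimal != ".":
        cleaned = cleaned.replace(decimal, ".")
    return float(cleaned)

print(parse_number("1,234.56"))                              # 1234.56
print(parse_number("1.234,56", thousands=".", decimal=","))  # 1234.56
```

The order of the two replacements matters: swapping the decimal mark before removing the thousands separator would turn '1.234,56' into '1.234.56', which float() rejects.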

Verify the Fix

Quick sanity check for a single value:

value = '1,234'
result = float(value.replace(',', ''))
assert result == 1234.0, f"Unexpected: {result}"
print("OK:", result)  # OK: 1234.0

For a DataFrame, confirm the dtype flipped to numeric and count any unexpected NaN values:

print(df['price'].dtype)          # a numeric dtype such as float64, not object
print(df['price'].isna().sum())   # 0  (or the expected count for truly missing rows)
print(df['price'].describe())     # sanity-check min/max/mean

Tips

  • Inspect raw data first: run df['col'].head(20) or df['col'].unique()[:20] before writing any cleanup code. You need to know exactly what characters are present — commas, spaces, currency symbols, em-dashes.
  • European decimals: if '1,5' means 1.5 in your data, use value.replace(',', '.'). But only when there's no thousands separator in the same column — otherwise you'll silently corrupt values like '1.234,56'.
  • Excel files: numbers in .xlsx files typically arrive as proper floats already. When they don't, the same str.replace() + pd.to_numeric() pipeline works on pd.read_excel() output.
  • After cleaning, handle NaN: if you used errors='coerce', decide whether to drop those rows (df.dropna(subset=['amount'])) or fill them (df['amount'].fillna(0)) before further processing.
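For the European-decimals tip, a column that mixes thousands separators and decimal commas can be converted safely by doing the two replacements in a fixed order. This is a sketch with made-up sample values, assuming the period is only ever a thousands separator in the column:

```python
import pandas as pd

eu = pd.Series(['1.234,56', '1,5', '987'])

clean = (
    eu.str.replace('.', '', regex=False)    # drop thousands separators first
      .str.replace(',', '.', regex=False)   # then turn the decimal comma into a point
)
values = pd.to_numeric(clean)
print(values)
```

If the column might contain US-style values as well, this order would corrupt them, so inspect the raw data before applying it.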
