The Error
ValueError: could not convert string to float: '1,234'
You're trying to convert a number string to float, but the value has a comma baked into it. Python's float() accepts plain numeric strings like '1234' or '1234.56' (it also tolerates surrounding whitespace and exponent notation such as '1e3'), but anything with a thousands separator raises ValueError immediately.
This hits you when loading data from CSVs, Excel files, database exports, or APIs where numbers are formatted for human readability: '1,234', '$1,234.56', or '1.234,56' if the source uses European formatting.
Root Cause
That comma in '1,234' is a thousands separator, not a decimal point. Different locales format numbers differently:
- US/UK: 1,234.56 (comma = thousands, period = decimal)
- European (DE/ES/IT, etc.): 1.234,56 (period = thousands, comma = decimal)
Python's float() doesn't know which convention your data uses. It just fails. When data comes from a spreadsheet, a locale-aware export, or user input, these extra characters will break the conversion every time.
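To see where the line falls, here is a small sketch that runs a few sample strings through float() and reports which ones parse. The `try_float` helper and the sample values are illustrative only, not part of any library.

```python
def try_float(text):
    """Return the parsed float, or None if float() rejects the string."""
    try:
        return float(text)
    except ValueError:
        return None

# Illustrative inputs: the first four parse, the last three do not.
samples = ['1234', '1234.56', ' 42 ', '1e3', '1,234', '$1,234.56', '1.234,56']
for s in samples:
    print(repr(s), '->', try_float(s))
```

Every string containing a separator or currency symbol comes back as None here, regardless of which locale convention it follows.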
Step-by-Step Fix
Fix 1: Strip the comma manually (simplest case)
If your source uses US-style formatting (comma as thousands separator, period as decimal), just strip the comma:
value = '1,234'
result = float(value.replace(',', ''))
print(result) # 1234.0
Got currency symbols or extra whitespace? Chain the replacements:
value = '$1,234.56'
result = float(value.replace(',', '').replace('$', '').strip())
print(result) # 1234.56
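If the chain of replace() calls keeps growing, one option is to fold it into a small helper. The sketch below is a hypothetical `parse_us_number` function, assuming US-style input and a short, explicit list of characters to remove (commas, whitespace, and the symbols $, € and £). Anything outside that list is left alone so that float() still rejects genuinely bad values.

```python
import re

# Characters to drop: thousands commas, whitespace, and a few currency symbols.
# This list is an assumption; extend it to match what your data contains.
_STRIP_PATTERN = re.compile(r'[,$€£\s]')

def parse_us_number(text):
    """Parse a US-formatted number string such as '$1,234.56' into a float.

    Hypothetical helper: assumes comma = thousands, period = decimal.
    Raises ValueError for strings that are not numbers after cleaning.
    """
    return float(_STRIP_PATTERN.sub('', text))

print(parse_us_number('$1,234.56'))  # 1234.56
print(parse_us_number(' 1,234 '))    # 1234.0
```

Removing only known characters, rather than everything that isn't a digit, keeps a label like 'N/A' failing loudly instead of turning into an empty or misleading number.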
Fix 2: Clean a pandas column with str.replace + pd.to_numeric
The most common scenario: a DataFrame column loaded as object dtype instead of a numeric dtype. Clean it in one line:
import pandas as pd
df = pd.DataFrame({'price': ['1,234', '5,678', '9,012']})
df['price'] = pd.to_numeric(df['price'].str.replace(',', '', regex=False))
print(df['price'])
# 0 1234
# 1 5678
# 2 9012
# (whole numbers come back as int64; append .astype('float64') if you need floats)
print(df.dtypes)
# price int64
Fix 3: Pass thousands parameter when reading CSV
Better yet, handle it at load time. The thousands parameter tells pandas about the separator upfront, so numeric columns come out as int64 or float64 automatically instead of object:
import pandas as pd
df = pd.read_csv('data.csv', thousands=',')
print(df.dtypes) # numeric columns will already be int64 or float64
European-format files where the comma is the decimal separator need both parameters:
df = pd.read_csv('data.csv', decimal=',', thousands='.')
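To try the European variant without a file on disk, the sketch below feeds a made-up two-row CSV through io.StringIO. The sample data and the semicolon delimiter (`sep=';'`) are assumptions for illustration; exports that use a decimal comma often switch the field delimiter to a semicolon, but check your own file. It needs pandas installed.

```python
import io
import pandas as pd

# Made-up sample: semicolon-delimited, period = thousands, comma = decimal.
csv_text = "item;price\nwidget;1.234,56\ngadget;789,10\n"

df = pd.read_csv(io.StringIO(csv_text), sep=';', decimal=',', thousands='.')

print(df['price'].dtype)  # float64
print(df)
```

If the file is comma-delimited instead, values containing a decimal comma have to be quoted in the file, otherwise the parser splits them into two fields.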
Fix 4: Handle dirty or mixed data with errors='coerce'
Production data is messy. A single column can hold valid numbers, empty strings, and labels like 'N/A' — all at once. Instead of crashing on the bad rows, errors='coerce' converts what it can and turns the rest into NaN:
import pandas as pd
df = pd.DataFrame({'amount': ['1,234', 'N/A', '5,678', '', '9,012']})
df['amount'] = pd.to_numeric(
df['amount'].str.replace(',', '', regex=False),
errors='coerce'
)
print(df['amount'])
# 0 1234.0
# 1 NaN
# 2 5678.0
# 3 NaN
# 4 9012.0
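Outside pandas, the same "convert what you can, mark the rest as NaN" idea can be sketched in plain Python. The `to_float_or_nan` helper below is hypothetical and assumes US-style commas, mirroring the sample column above.

```python
import math

def to_float_or_nan(text):
    """Strip thousands commas and convert; return NaN when conversion fails."""
    try:
        return float(text.replace(',', ''))
    except ValueError:
        return math.nan

raw = ['1,234', 'N/A', '5,678', '', '9,012']
amounts = [to_float_or_nan(v) for v in raw]
print(amounts)  # [1234.0, nan, 5678.0, nan, 9012.0]

# List the raw values that failed, so they can be reviewed rather than ignored.
failed = [v for v, a in zip(raw, amounts) if math.isnan(a)]
print(failed)  # ['N/A', '']
```

Keeping the list of failed raw values around makes it easier to tell genuinely missing data apart from a formatting case the cleanup missed.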
Fix 5: Use the locale module for proper locale-aware parsing
When the source locale is known and consistent, locale.atof() handles the separator rules automatically — no manual string wrangling needed:
import locale
# US format: '1,234' → 1234.0
locale.setlocale(locale.LC_NUMERIC, 'en_US.UTF-8')
result = locale.atof('1,234')
print(result) # 1234.0
# German format: '1.234,56' → 1234.56
locale.setlocale(locale.LC_NUMERIC, 'de_DE.UTF-8')
result = locale.atof('1.234,56')
print(result) # 1234.56
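Two catches: locale.setlocale() changes the setting process-wide and isn't thread-safe, and it raises locale.Error if the named locale isn't installed on the machine (locale names also differ between platforms). In web apps or async code, stick with the explicit str.replace() approach instead.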
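For code that can't touch the process-wide locale, a minimal sketch of the explicit approach might look like the following. The `parse_number` function and its `thousands` / `decimal` parameters are hypothetical names chosen for this example. The caller states the convention, and no global state is changed.

```python
def parse_number(text, thousands=',', decimal='.'):
    """Convert a formatted number string to float using explicit separators.

    Hypothetical helper: the caller must know which convention the data uses.
    """
    # Remove the thousands separator first, then normalize the decimal mark.
    cleaned = text.strip().replace(thousands, '')
    if decimal != '.':
        cleaned = cleaned.replace(decimal, '.')
    return float(cleaned)

print(parse_number('1,234.56'))                              # 1234.56
print(parse_number('1.234,56', thousands='.', decimal=','))  # 1234.56
```

The order of the two replacements matters: stripping the thousands separator before converting the decimal mark avoids turning '1.234,56' into a string with two periods.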
Verify the Fix
Quick sanity check for a single value:
value = '1,234'
result = float(value.replace(',', ''))
assert result == 1234.0, f"Unexpected: {result}"
print("OK:", result) # OK: 1234.0
For a DataFrame, confirm the dtype flipped to numeric and count any unexpected NaN values:
print(df['price'].dtype) # int64 or float64, not object
print(df['price'].isna().sum()) # 0 (or the expected count for truly missing rows)
print(df['price'].describe()) # sanity-check min/max/mean
Tips
- Inspect raw data first: run df['col'].head(20) or df['col'].unique()[:20] before writing any cleanup code. You need to know exactly what characters are present: commas, spaces, currency symbols, em-dashes.
- European decimals: if '1,5' means 1.5 in your data, use value.replace(',', '.'). But only when there's no thousands separator in the same column; otherwise you'll silently corrupt values like '1.234,56'.
- Excel files: numbers in .xlsx files typically arrive as proper floats already. When they don't, the same str.replace() + pd.to_numeric() pipeline works on pd.read_excel() output.
- After cleaning, handle NaN: if you used errors='coerce', decide whether to drop those rows (df.dropna(subset=['amount'])) or fill them (df['amount'].fillna(0)) before further processing.
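As a complement to eyeballing head() or unique(), the first tip can be sketched as a quick character count. The `unexpected_chars` helper and its `allowed` default are hypothetical; it tallies every character that isn't a digit, period, or minus sign, which shows at a glance what the cleanup code has to deal with.

```python
from collections import Counter

def unexpected_chars(values, allowed='0123456789.-'):
    """Count every character in `values` that is not in `allowed`."""
    counts = Counter()
    for v in values:
        counts.update(ch for ch in str(v) if ch not in allowed)
    return counts

# Illustrative sample; with pandas, pass something like df['col'].dropna().
sample = ['1,234', '$5,678.90', ' 42', 'N/A']
print(unexpected_chars(sample))
```

Here the comma shows up twice and the dollar sign, space, and the letters of 'N/A' once each, which points directly at the replacements and the errors='coerce' handling the column needs.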

