TL;DR: The Quick Fixes
If you're in a hurry, try these in order:
- Native Repair: Open Excel → File → Open → Browse → select the file → click the arrow next to Open → Open and Repair.
- Save as Different Format: If it opens after the repair, immediately save it as .xlsb (Binary) or .csv to strip out the corrupted XML junk, then save it back to .xlsx. Keep in mind that .csv holds only the active sheet's values, so formulas, formatting, and other sheets are lost on that route.
- Google Sheets: Upload the file to Google Drive. Sometimes Google's parser is more forgiving than Excel's and can re-export a clean version.
The Scenario: 2 AM Production Crisis
It’s late, and a critical report just turned into a brick. You try to open the .xlsx file and Excel hits you with: "We found a problem with some content in 'filename.xlsx'. Do you want us to try to recover as much as we can?"
You click "Yes," hoping for a miracle, but you get a blank workbook or a log file saying "Repaired Records: Drawing from /xl/drawings/drawing1.xml part." Half your data is gone, and the formatting looks like a glitch in the Matrix. This usually happens because the XML inside the compressed Excel archive has become malformed, typically after a crashed save, a network interruption, or a third-party library generating invalid tags.
Root Cause: It's Just a ZIP File
An .xlsx file is not a binary blob; it’s a renamed ZIP archive containing a specific folder structure of XML files. When Excel says there is a "problem with some content," it usually means an XML tag isn't closed properly, or there’s an invalid character (like a null byte) where it shouldn't be. Specifically, xl/sharedStrings.xml or xl/worksheets/sheet1.xml are the usual suspects.
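Because the workbook is an ordinary ZIP archive, Python's standard zipfile module can look inside it without any renaming. The following is a minimal sketch, not a full diagnostic; the helper names list_parts and first_bad_member are made up for illustration. It lists every part with its uncompressed size and reports the first member whose CRC check fails.

```python
import zipfile


def list_parts(xlsx_path):
    """Return (name, uncompressed_size) for every part inside the workbook."""
    with zipfile.ZipFile(xlsx_path) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


def first_bad_member(xlsx_path):
    """Return the name of the first member that fails its CRC check, or None."""
    with zipfile.ZipFile(xlsx_path) as zf:
        return zf.testzip()
```

A suspiciously small xl/sharedStrings.xml or sheet part in the list_parts output is often a hint that a save was cut short. If the archive itself is damaged, zipfile.ZipFile raises zipfile.BadZipFile, which tells you the problem sits at the ZIP level rather than in the XML.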
The "Surgeon" Approach: Manual XML Repair
When the built-in repair fails, you have to go inside. Here is how to perform manual surgery on the file structure.
Step 1: Unpack the Workbook
- Make a backup copy of your file. Never work on the original.
- Change the file extension from .xlsx to .zip.
- Extract the contents to a folder using 7-Zip, WinRAR, or your OS native utility.
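If you would rather script the backup and extraction, here is a rough Python equivalent of the steps above. It assumes the archive is still readable as a ZIP; unpack_workbook and the .bak suffix are illustrative choices, not anything Excel requires.

```python
import shutil
import zipfile
from pathlib import Path


def unpack_workbook(xlsx_path, out_dir):
    """Copy the workbook to <name>.bak, then extract its parts into out_dir."""
    src = Path(xlsx_path)
    backup = src.with_name(src.name + ".bak")
    shutil.copy2(src, backup)      # keep an untouched copy before doing anything
    with zipfile.ZipFile(src) as zf:
        zf.extractall(out_dir)     # zipfile ignores the extension, so no rename
    return backup
```

Calling unpack_workbook("report.xlsx", "report_unpacked") would leave report.xlsx.bak next to the original and the _rels, docProps, and xl folders under report_unpacked (both names are placeholders).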
Step 2: Locate the Corruption
Navigate to the xl/ folder. Most errors reside in:
- xl/sharedStrings.xml: This stores every unique string used in the workbook. If this is corrupt, the whole file fails.
- xl/worksheets/: Contains sheet1.xml, sheet2.xml, etc.
Use a tool like VS Code or Notepad++ to open these files. If the file is huge, avoid standard Notepad, which may hang or crash on very large files.
Step 3: Validate and Fix the XML
If you are on Linux or macOS, you can use xmllint to find the exact line of corruption:
xmllint --noout xl/sharedStrings.xml
If it returns an error like Opening and ending tag mismatch, you've found the problem. In VS Code, you can use an XML extension to "Format Document." The formatter will usually stop or highlight the exact spot where the syntax breaks.
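On Windows, or anywhere xmllint isn't installed, Python's built-in XML parser can do a similar well-formedness check. This is a sketch with a hypothetical helper name, find_xml_error; it reports only the first error, and its messages are worded differently from xmllint's.

```python
import xml.etree.ElementTree as ET


def find_xml_error(path):
    """Return None if the file is well-formed, else (line, column, message)."""
    try:
        ET.parse(path)
    except ET.ParseError as exc:
        line, column = exc.position   # where the parser gave up
        return line, column, str(exc)
    return None
```

Running find_xml_error on the extracted xl/sharedStrings.xml gives you a line and column to jump to in your editor. Excel usually writes these parts as one very long line, so the column number tends to be the useful half.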
Common issues include:
- Duplicate attributes within a single tag.
- Invalid characters: ASCII control characters other than tab, line feed, and carriage return are not allowed in XML 1.0, not even as escaped character references.
- A truncated file where the final </sst> or </worksheet> tags are missing.
Manually fix the tag or delete the corrupted XML node, then save the file.
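Hunting for stray control characters by eye is painful, so that one is worth scripting. The sketch below assumes the part is UTF-8 encoded (which is what Excel normally writes) and removes every byte that XML 1.0 forbids: all C0 controls except tab, line feed, and carriage return. The name strip_illegal_bytes is made up for this example.

```python
import re

# Bytes XML 1.0 forbids: C0 controls except tab (0x09), LF (0x0A), CR (0x0D).
ILLEGAL = re.compile(rb"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def strip_illegal_bytes(src_path, dst_path):
    """Write a cleaned copy of src_path to dst_path; return the number of bytes removed."""
    with open(src_path, "rb") as f:
        data = f.read()
    cleaned, removed = ILLEGAL.subn(b"", data)
    with open(dst_path, "wb") as f:
        f.write(cleaned)
    return removed
```

Working on raw bytes is safe for UTF-8 because these byte values never occur inside a multi-byte sequence. For a UTF-16 file this approach would do damage, so check the encoding in the XML declaration first.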
Step 4: Rebuild the Archive
This is the tricky part. You must zip the contents of the folder, not the folder itself.
- Go inside your extracted folder.
- Select all files (_rels, docProps, xl, [Content_Types].xml).
- Right-click → Send to → Compressed (zipped) folder.
- Rename the resulting .zip back to .xlsx.
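The "contents, not the folder" rule is easy to get wrong by hand, so here is a hedged Python sketch of the same rebuild. It stores every file under a name relative to the extracted folder, which puts [Content_Types].xml at the top level of the archive. rebuild_workbook is an illustrative name.

```python
import zipfile
from pathlib import Path


def rebuild_workbook(extracted_dir, xlsx_path):
    """Zip the contents of extracted_dir (not the directory itself) into xlsx_path."""
    root = Path(extracted_dir)
    with zipfile.ZipFile(xlsx_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Names are relative to root and use forward slashes,
                # e.g. "xl/worksheets/sheet1.xml", never "folder/xl/...".
                zf.write(path, path.relative_to(root).as_posix())
```

Write the output somewhere outside the extracted folder; otherwise the half-written .xlsx would be swept up into itself.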
Verification: Confirming the Fix
Open the new file in Excel. If it opens without the "We found a problem" prompt, you've succeeded. Immediately perform a Save As so that Excel rewrites all the internal parts itself, which brings the package back in line with the Open XML standard.
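Before handing the file to Excel, you can run a quick sanity pass in Python. This sketch (check_package is a hypothetical helper) checks two things only: that [Content_Types].xml sits at the archive root, which catches the "zipped the folder itself" mistake, and that every .xml and .rels part parses as well-formed XML. Passing both is no guarantee that Excel will be happy, since schema violations are not checked.

```python
import zipfile
import xml.etree.ElementTree as ET


def check_package(xlsx_path):
    """Return a list of problems found; an empty list means the basic checks passed."""
    problems = []
    with zipfile.ZipFile(xlsx_path) as zf:
        names = zf.namelist()
        if "[Content_Types].xml" not in names:
            problems.append("[Content_Types].xml is not at the archive root")
        for name in names:
            if name.endswith((".xml", ".rels")):
                try:
                    ET.fromstring(zf.read(name))
                except ET.ParseError as exc:
                    problems.append(f"{name}: {exc}")
    return problems
```

An empty list from check_package("report_fixed.xlsx") (placeholder filename) means it is worth trying the file in Excel; anything else points you back to Step 3 or Step 4.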
Check for Data Loss
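If you had to delete a corrupted <si> (string item) tag in sharedStrings.xml, check the result carefully. Cells reference shared strings by zero-based position, so removing an <si> shifts every later string down by one: affected cells silently show the wrong text, and cells that now point past the end of the table may be empty or show a 0. Where possible, replace the damaged entry with an empty <si><t/></si> instead of deleting it, so the positions stay intact. Then run a quick Ctrl+F for any values you know were near the corrupted area to verify integrity.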
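Part of this check can be automated on the extracted files. The sketch below counts the <si> entries and lists every string cell in a sheet whose index has no matching entry. dangling_string_cells is an illustrative name, and the namespace is the standard SpreadsheetML main namespace.

```python
import xml.etree.ElementTree as ET

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def dangling_string_cells(shared_strings_path, sheet_path):
    """Return refs (e.g. "B7") of cells whose shared-string index has no matching <si>."""
    n_strings = len(ET.parse(shared_strings_path).getroot().findall(f"{NS}si"))
    bad = []
    for cell in ET.parse(sheet_path).getroot().iter(f"{NS}c"):
        if cell.get("t") != "s":
            continue  # only t="s" cells point into the shared string table
        value = cell.findtext(f"{NS}v")
        if value is None or not value.strip().isdigit() or int(value) >= n_strings:
            bad.append(cell.get("r"))
    return bad
```

This only catches indexes that fall off the end of the table. It cannot tell you whether an in-range cell now shows a neighbour's text, so the manual spot check still matters.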
Further Reading & Tools
- Open XML SDK Productivity Tool: A Microsoft tool that can validate .xlsx files and point out exactly which part of the schema is violated.
- SST (Shared String Table) Limits: Large files often break when the uniqueCount attribute in sharedStrings.xml doesn't match the actual number of strings.
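The uniqueCount mismatch is simple to test for on an extracted sharedStrings.xml. A minimal sketch, with unique_count_mismatch as a made-up helper name; it treats a missing attribute as "nothing to compare", since uniqueCount is optional in the schema.

```python
import xml.etree.ElementTree as ET

NS = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def unique_count_mismatch(shared_strings_path):
    """Return (declared, actual) if uniqueCount disagrees with the <si> count, else None.

    declared is the raw attribute string; actual is the counted number of <si> items.
    """
    root = ET.parse(shared_strings_path).getroot()
    declared = root.get("uniqueCount")
    actual = len(root.findall(f"{NS}si"))
    if declared is None:
        return None  # attribute is optional, so there is nothing to compare
    if declared.isdigit() and int(declared) == actual:
        return None
    return declared, actual
```

If this returns a pair such as ("1204", 1203), editing uniqueCount in the <sst> opening tag to match the real count is a small change worth trying before anything more drastic.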

