You exported the file. It imported with errors. Finance says the vendor totals don't match. Now you're hunting invisible merged cells across 380,000 rows while the deadline ticks closer. You've been here before. You'll be here again — unless the cleaning happens before the import.
Why Excel Cleaning Breaks Pipelines
The file looks fine in Excel. Then it hits your import and everything breaks.
14 Operations. One Pass.
Everything a data analyst needs to make an Excel file safe to import — in a single browser session.
14 Cleaning Operations
Strip formatting, flatten formulas, remove merged cells, trim whitespace, normalize text case, remove empty rows/columns — all in one pass, no code required.
Fuzzy Deduplication
Levenshtein distance matching (0.50–0.99 threshold) plus Soundex phonetic matching. Catches near-duplicates that Excel's Remove Duplicates misses entirely.
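To make the threshold concrete, here is a minimal sketch of Levenshtein-based similarity scoring. It is an illustration only, not SplitForge's implementation: the function names and the normalization (one minus edit distance divided by the longer string's length) are assumptions chosen for the example.

```python
def levenshtein(a: str, b: str) -> int:
    """Minimum number of single-character inserts, deletes, or substitutions."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,         # delete ch_a
                current[j - 1] + 1,      # insert ch_b
                previous[j - 1] + cost,  # substitute (free when equal)
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]; 1.0 means identical after case-folding and trimming."""
    a, b = a.casefold().strip(), b.casefold().strip()
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


def is_near_duplicate(a: str, b: str, threshold: float = 0.85) -> bool:
    """Assumed rule for this sketch: a pair matches at or above the threshold."""
    return similarity(a, b) >= threshold
```

Under this scoring, "Accenture" and "Accentur" differ by one edit over nine characters (about 0.89), so they match at a 0.85 threshold. "Accenture" and "ACCENTURE INC." score about 0.64, which suggests why standardizing business suffixes alongside fuzzy matching helps.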
Column Profiling & Health Report
Automatic data quality scoring per column — null counts, type distribution, outlier detection (3σ), top values, and numeric stats. Know what's broken before you clean.
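As a rough illustration of what a per-column report involves, the sketch below computes null counts, type distribution, top values, and 3σ outliers with the Python standard library. The function name, the report keys, and the choice of population standard deviation are assumptions for this example, not a description of the tool's internals.

```python
from collections import Counter
from statistics import mean, pstdev


def profile_column(values, sigma=3.0, top_n=3):
    """Summarize one column: nulls, types, top values, and sigma-based outliers."""
    # Treat None and blank strings as nulls (an assumption for this sketch).
    non_null = [v for v in values if v is not None and str(v).strip() != ""]
    numbers = [
        v for v in non_null
        if isinstance(v, (int, float)) and not isinstance(v, bool)
    ]
    report = {
        "null_count": len(values) - len(non_null),
        "type_counts": dict(Counter(type(v).__name__ for v in non_null)),
        "top_values": Counter(non_null).most_common(top_n),
        "outliers": [],
    }
    if len(numbers) >= 2:
        mu, sd = mean(numbers), pstdev(numbers)
        report["mean"], report["stdev"] = mu, sd
        if sd > 0:
            # A value is an outlier when it sits more than `sigma` deviations from the mean.
            report["outliers"] = [v for v in numbers if abs(v - mu) > sigma * sd]
    return report
```

For example, a column of twenty 10s, one 1000, and two blanks reports two nulls and flags 1000 as the only outlier.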
Split & Merge Columns
Split by delimiter, fixed position, or regex. Merge columns with custom separators. Keep or delete originals. No code required.
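For readers who want to see what the three split modes and a merge amount to, here is a small sketch. The helper names and the choice to skip empty parts when merging are hypothetical, picked only to illustrate the idea.

```python
import re


def split_by_delimiter(value: str, delimiter: str) -> list[str]:
    """Split on every occurrence of a literal delimiter."""
    return value.split(delimiter)


def split_fixed(value: str, position: int) -> list[str]:
    """Split into two parts at a fixed character position."""
    return [value[:position], value[position:]]


def split_regex(value: str, pattern: str) -> list[str]:
    """Split wherever the regular expression matches."""
    return re.split(pattern, value)


def merge(parts: list[str], separator: str) -> str:
    """Join parts with a custom separator, skipping empty parts (an assumption)."""
    return separator.join(p for p in parts if p != "")
```

A value such as "Smith, Jane" splits on ", " into a surname and a first name, while "INV20250331" splits at position 3 into a prefix and a date string.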
Conditional Rules Engine
Build IF/THEN rules with AND/OR logic — replace values, delete rows, or flag records. Equivalent to nested Excel formulas without the formula complexity.
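The idea behind IF/THEN rules with AND/OR logic can be sketched in a few lines. Everything here is hypothetical (the `Rule` class, the condition helpers, the `_flag` column) and is meant to show the shape of such an engine, not how SplitForge implements it.

```python
from dataclasses import dataclass
from typing import Any, Callable

Row = dict[str, Any]
Condition = Callable[[Row], bool]


def equals(column: str, value: Any) -> Condition:
    return lambda row: row.get(column) == value


def is_blank(column: str) -> Condition:
    return lambda row: row.get(column) is None or str(row.get(column)).strip() == ""


def all_of(*conditions: Condition) -> Condition:  # AND
    return lambda row: all(c(row) for c in conditions)


def any_of(*conditions: Condition) -> Condition:  # OR
    return lambda row: any(c(row) for c in conditions)


@dataclass
class Rule:
    when: Condition
    action: str          # "replace", "delete", or "flag"
    column: str = ""     # target column for "replace"
    value: Any = None    # replacement value or flag text


def apply_rules(rows: list[Row], rules: list[Rule]) -> list[Row]:
    """Apply rules in order to a copy of each row; deleted rows are dropped."""
    cleaned = []
    for original in rows:
        row = dict(original)
        keep = True
        for rule in rules:
            if not rule.when(row):
                continue
            if rule.action == "delete":
                keep = False
                break
            if rule.action == "replace":
                row[rule.column] = rule.value
            elif rule.action == "flag":
                row["_flag"] = rule.value
        if keep:
            cleaned.append(row)
    return cleaned
```

A rule such as `Rule(equals("price", 0), "flag", value="zero price")` marks zero-price rows for review, and combining conditions with `all_of` or `any_of` covers the AND/OR cases without nested formulas.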
CRM & Contact Cleaning
One-click CRM preset: normalize phones, lowercase emails, proper-case names, fuzzy deduplicate contacts, expand abbreviations.
Designed for Sensitive Workflows
SplitForge processes files entirely in your browser using Web Workers and the File API. No bytes reach a server. This architecture is designed to support workflows where files cannot leave the device — including environments governed by HIPAA, GDPR, or internal data handling policies.
You don't have to take our word for it. Open Chrome DevTools, go to the Network tab, drop your file in, and you will see zero outbound data requests. The proof is auditable by anyone.
No server endpoint exists in this tool's architecture.
How It Compares
| Feature | SplitForge | Excel (manual) | OpenRefine | Tableau Prep |
|---|---|---|---|---|
| Data upload required | Never | N/A (local app) | Local install | Yes (cloud) |
| Install required¹ | No (browser) | Yes | Yes (Java) | Yes |
| Max rows before slowdown³ | Tested reliably to 1M+ rows | ~100K | ~500K | Server-limited |
| Fuzzy deduplication¹ | Yes (Levenshtein) | Exact match only | Yes | Limited |
| Column profiling / health¹ | Yes (auto) | Manual (formulas) | Basic | Yes (paid tier) |
| Conditional rules engine | Yes | VBA / manual | GREL scripting | Yes (complex) |
| CRM preset (1-click) | Yes | No | No | No |
| Merged cell handling | Auto-fill + remove | Manual | N/A | Limited |
| Date normalization | Auto (14 patterns) | Manual or formula | Via GREL | Yes |
| Privacy by architecture² | Yes | Yes (local) | Yes (local) | Contractual only |
| Workflow export (reusable) | Yes (JSON export) | Macro (.xlsm) | Yes (.json) | Yes (.tfl) |
| Before/after preview | Yes | No | Yes | Yes |
| No technical knowledge needed¹ | Yes | Partially | No (GREL) | Partially |

¹ Install requirements and feature availability sourced from G2 reviews and vendor public documentation, Feb 2026.
² Tableau Prep cloud upload applies to Tableau Cloud (formerly Tableau Online); Tableau Desktop processes locally.
³ OpenRefine row limit estimates based on community benchmarks and project documentation. SplitForge data from internal testing; see the full performance page for methodology.
Time Savings Estimate
Enter your current manual cleaning workload to estimate the time difference.
Assumes ~40 seconds per file (100K-row standard clean). Actual time varies by file size, complexity, and operations selected.
What This Looks Like in Practice
Representative workflow scenarios — not customer case studies. Times reflect mixed operation sets including fuzzy matching; analysis-only speeds are higher.
Quarterly Vendor Reconciliation
380K-row vendor ledger from SAP — merged cells in every group header, multiple date formats across fiscal year columns, vendor name variations (e.g., "Accenture" vs "Accentur" vs "ACCENTURE INC.").
Merged cells removed, dates normalized to ISO 8601, fuzzy deduplication (0.85 threshold) collapses near-duplicate vendor entries, business suffixes standardized.
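Normalizing mixed date formats to ISO 8601 usually comes down to trying a list of known patterns in order. The sketch below shows that approach with a short, assumed pattern list; it is not the tool's 14-pattern set, and the ordering decides how ambiguous values such as 03/04/2025 are read.

```python
from datetime import datetime
from typing import Optional

# Hypothetical pattern list for illustration; earlier patterns win on ambiguity.
# %b and %B depend on the active locale (English month names assumed here).
ASSUMED_PATTERNS = [
    "%Y-%m-%d",   # 2025-03-31
    "%m/%d/%Y",   # 03/31/2025 (US order assumed)
    "%d-%b-%Y",   # 31-Mar-2025
    "%B %d, %Y",  # March 31, 2025
]


def to_iso_date(text: str, patterns=ASSUMED_PATTERNS) -> Optional[str]:
    """Return YYYY-MM-DD for the first pattern that parses, else None."""
    cleaned = text.strip()
    for pattern in patterns:
        try:
            return datetime.strptime(cleaned, pattern).date().isoformat()
        except ValueError:
            continue  # try the next pattern
    return None
```

Values that match none of the patterns come back as `None`, so they can be flagged for review instead of being silently guessed.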
Patient Registry Normalization
520K-row registry with phone numbers in 8+ formats, addresses as single strings, mixed data types across 34 columns, hidden rows from previous imports.
Phones normalized, addresses parsed into structured fields, numbers-as-text converted, hidden rows surfaced. Processed without upload — required for PHI workflows.
Multi-Supplier Product Catalog Merge
750K-row catalog from 4 supplier feeds — inconsistent column headers, prices with embedded currency symbols, exact and near-duplicate product entries.
Headers normalized, currency symbols stripped, exact and fuzzy duplicates removed, conditional rules flag items with $0 price for pre-import review.
Technical Deep Dives
How the hard problems are actually solved.
When SplitForge Is Not the Right Tool
For automated pipelines, use Python + openpyxl/pandas. For large-scale fuzzy clustering, use OpenRefine. For enterprise ETL orchestration, use Tableau Prep or AWS Glue.
File too large to process in one pass? Excel Splitter breaks it into chunks first. Need to mask sensitive columns before cleaning? Data Masking uses the same browser-only architecture. Working with CSV instead of Excel? CSV Data Cleaner covers the same 14 operations for flat files.
Performance Overview
Measured (filled bar, 1M rows): Chrome stable, Windows 11, Intel i7-12700K, 32GB RAM, February 2026. 10 runs were timed, the highest and lowest results discarded, and the remaining 8 averaged. Analysis only (no cleaning operations applied).
Estimated (semi-transparent bars): calculated from the 72,043 rows/sec baseline. Results vary by hardware, browser, and file complexity (±15–25%).
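The arithmetic behind these figures can be reproduced with a short sketch. It assumes the estimate is simply rows divided by the 72,043 rows/sec baseline and that the stated variance applies symmetrically to the resulting time; both are assumptions made for illustration, and the function names are hypothetical.

```python
BASELINE_ROWS_PER_SEC = 72_043  # analysis-only baseline quoted above


def estimate_seconds(rows: int, variance: float = 0.25):
    """Return (low, nominal, high) time estimates in seconds for a row count."""
    nominal = rows / BASELINE_ROWS_PER_SEC
    return nominal * (1 - variance), nominal, nominal * (1 + variance)


def trimmed_mean(timings: list[float]) -> float:
    """Average after discarding the single highest and lowest timing."""
    if len(timings) < 3:
        raise ValueError("need at least 3 timings to trim both ends")
    kept = sorted(timings)[1:-1]
    return sum(kept) / len(kept)
```

At the baseline rate, 1,000,000 rows works out to roughly 13.9 seconds of analysis time, and `trimmed_mean` mirrors the 10-run procedure by averaging the middle 8 timings.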
Frequently Asked Questions
Is my data private?
What Excel file formats are supported?
What file size can it handle?
What does the Data Health Report show?
What is fuzzy deduplication and when should I use it?
What is the conditional rules engine?
Can I preview changes before downloading?
What is the workflow export?
Can I clean multiple files at once?
When should I use this instead of OpenRefine or Tableau Prep?
Does SplitForge preserve Excel formulas?
Stop Cleaning Manually. Start in 10 Seconds.
Drop your file. Get a health report. Clean in one click. Download a file you can actually import.