Verified Benchmark — February 2026

CSV Compare: 5M Rows
in 14.3 Seconds

Primary key matching, cell-level change detection, streaming architecture. 350K rows/sec — tested February 2026 on Chrome 132, Windows 11, Intel i7-12700K, 32GB RAM. Results vary by hardware, browser, and file complexity.

350K
Throughput
rows/sec (verified)
14.3s
5M Row Test
primary key mode
O(n)
Complexity
linear scaling
Never
File Uploads
zero transmission

Benchmark Performance

All SplitForge times: Chrome 132 (stable), Windows 11, Intel i7-12700K, 32GB RAM, February 2026. 8 runs per configuration — highest/lowest discarded, remaining 6 averaged. 10M row value estimated from 5M trajectory. Results vary by hardware, browser, and file complexity. Excel times estimated from internal workflow testing.

Performance at Scale

Chrome 132 · Windows 11 · Intel i7-12700K · 32GB RAM · February 2026

File Size | Primary Key Mode | Line-by-Line Mode | Test Notes
100K rows | 0.5 sec / 200K rows/sec | ~0.4 sec / 250K rows/sec | Startup overhead visible at small sizes; Map initialization ~50ms
500K rows | ~2.4 sec / 208K rows/sec | ~1.8 sec / 277K rows/sec | Mixed text + numeric columns, 8-column test file
1M rows | 4.8 sec / 208K rows/sec | ~3.6 sec / 277K rows/sec | Comma delimiter, string primary key (email), 10 columns
5M rows (verified) | 14.3 sec / 350K rows/sec | ~11 sec / 454K rows/sec | Peak verified result — chunk batching reaches full throughput
10M rows (estimated) | ~28 sec / 357K rows/sec | ~21 sec / 476K rows/sec | Estimated from 5M trajectory. In testing, requires ~2-3GB browser RAM on a 32GB machine.

Results vary by hardware, browser, and file complexity. Line-by-line values are estimated. Performance improves at scale due to chunk pipeline optimization — the 50K-row batching pipeline reaches full throughput at 1M+ rows.

How It Achieves 350K Rows/Sec

Streaming 50K-Row Chunks

Neither file is ever fully loaded into memory. Both files are read in 50,000-row chunks — input data is processed and discarded per chunk. However, the primary key Map and diff result arrays do grow with row count (O(n)), so peak memory scales with dataset size, not raw file size on disk. See the Memory Efficiency section below for tested numbers.
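The chunking idea can be sketched in Python (the production code is JavaScript; the function and variable names here are illustrative). Only one batch of rows is held at a time, while the caller decides what to retain:

```python
import csv
import io

def iter_chunks(fileobj, chunk_size=50_000):
    """Yield batches of up to chunk_size rows; only one batch is in memory at a time."""
    reader = csv.reader(fileobj)
    next(reader)                      # skip header row
    chunk = []
    for row in reader:
        chunk.append(row)
        if len(chunk) == chunk_size:
            yield chunk
            chunk = []
    if chunk:                          # final partial batch
        yield chunk

# Tiny demo: 5 data rows with chunk_size=2 produce batches of 2, 2, 1
data = io.StringIO("id,value\n" + "\n".join(f"{i},{i * 10}" for i in range(5)))
sizes = [len(c) for c in iter_chunks(data, chunk_size=2)]
```

The generator never materializes the whole file, which is why peak memory tracks what the caller keeps (the key index and diff results), not the file size on disk.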

Map-Based O(n) Indexing

Primary key mode builds a JavaScript Map keyed by the primary key value. Map lookups are O(1) — each row in File B is looked up exactly once, regardless of dataset size. Total comparison is O(n) where n is the number of rows, not O(n²) like naive nested-loop approaches.
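The same structure can be sketched in Python, with a dict standing in for the JavaScript Map (names are illustrative, not the shipped implementation):

```python
def diff_by_key(rows_a, rows_b, key):
    """Index File A by primary key, then stream File B once: O(n) total."""
    index = {row[key]: row for row in rows_a}   # one O(1) insert per File A row
    added, changed, seen = [], [], set()
    for row in rows_b:
        k = row[key]
        seen.add(k)
        old = index.get(k)                       # one O(1) lookup per File B row
        if old is None:
            added.append(k)
        elif old != row:
            changed.append(k)
    deleted = [k for k in index if k not in seen]
    return added, changed, deleted

a = [{"id": 1, "v": "x"}, {"id": 2, "v": "y"}]
b = [{"id": 2, "v": "z"}, {"id": 3, "v": "w"}]
added, changed, deleted = diff_by_key(a, b, "id")
# added: [3], changed: [2], deleted: [1]
```

Each row is touched a constant number of times, so doubling the row count roughly doubles the runtime — the linear scaling visible in the table above.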

Web Worker Isolation

All processing runs in a dedicated Web Worker thread. The browser UI stays fully responsive during 10M+ row operations. Progress updates stream every 100ms — you see row counts and percent complete while the comparison runs, without any UI blocking.

Zero Network Transmission

File reading, parsing, indexing, comparison, and result generation all happen inside the browser sandbox. No data leaves the device at any point. This is not a proxy model or edge function — the JavaScript engine running in the browser tab is the only compute involved.

Memory Efficiency

~50K rows
Max chunk in memory at once
~800MB
5M rows — peak RAM in testing
~2-3GB
10M rows — peak RAM in testing
O(n)
Linear time complexity

ROI Calculator

Baseline: ~20 min per manual Excel VLOOKUP comparison session

Sessions per year: Weekly = 52 · Monthly = 12 · Daily = 250
Hourly rates: Analyst avg $35–65/hr · Senior $75–120/hr
16.9h
Time saved per year
$845
Annual value saved
~30s
SplitForge comparison time

Estimate based on 20-min manual VLOOKUP workflow vs 30-sec SplitForge workflow. Individual results vary.
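The headline figures follow directly from the stated baseline. A quick check, assuming a weekly cadence and a $50/hr mid-range analyst rate (the rate is an assumption within the quoted band):

```python
manual_min, tool_min = 20, 0.5      # minutes per comparison session
sessions_per_year = 52               # weekly cadence
rate = 50                            # assumed mid-range analyst $/hr

hours_saved = sessions_per_year * (manual_min - tool_min) / 60
value = hours_saved * rate
# hours_saved ≈ 16.9, value ≈ $845
```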

Reproduce This Benchmark

These results are reproducible. Here's exactly how.

Generate test files

Create two CSV files (File A = baseline, File B = modified). Schema: columns id, name, email, value, status, updated_at — mixed string and numeric types.

For the 5M row benchmark: modify ~5% of rows in File B (change value and status columns), add 0.5% new rows, delete 0.5%.

Python generator (pandas):

import pandas as pd
df = pd.DataFrame({ 'id': range(5_000_000), ... })
df.to_csv('file_a.csv', index=False)
# Modify ~5% of rows → file_b.csv
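A fuller generator sketch, consistent with the schema and modification rates above (column contents, the random seed, and the reduced row count are illustrative; scale n to 5_000_000 for the full benchmark):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 10_000                            # use 5_000_000 for the full benchmark

df = pd.DataFrame({
    "id": range(n),
    "name": [f"user_{i}" for i in range(n)],
    "email": [f"user_{i}@example.com" for i in range(n)],
    "value": rng.integers(0, 10_000, n),
    "status": rng.choice(["active", "inactive"], n),
    "updated_at": "2026-02-01",
})
df.to_csv("file_a.csv", index=False)

b = df.copy()
mod = rng.choice(n, size=n // 20, replace=False)          # ~5% modified
b.loc[mod, "value"] += 1
b.loc[mod, "status"] = "changed"
b = b.drop(rng.choice(n, size=n // 200, replace=False))   # ~0.5% deleted
new = df.head(n // 200).assign(id=lambda d: d["id"] + n)  # ~0.5% new rows, fresh ids
b = pd.concat([b, new], ignore_index=True)
b.to_csv("file_b.csv", index=False)
```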

Run the comparison

1. Open CSV Compare in Chrome 132+
2. Load file_a.csv as File A, file_b.csv as File B
3. Select Primary Key mode → choose "id" column
4. Start a timer — click Compare
5. Stop the timer when results appear
6. Repeat 3× with a cleared browser cache; discard the highest and lowest times and report the median

Verify zero uploads: Open DevTools → Network tab before clicking Compare. No requests containing file contents will appear — all processing is local.

Hardware note: Our test machine (Chrome 132, Windows 11, i7-12700K, 32GB RAM) produced 14.3 sec for 5M rows. A 16GB laptop may be 20-30% slower — the comparison step still completes; absolute time varies. The structural advantage over Excel VLOOKUP holds regardless of hardware.

Honest Limitations: Where SplitForge Falls Short

No tool is perfect for every use case. Here's where another tool might be a better choice, and the real limitations of our browser-based architecture.

Browser-Based Processing

Performance depends on your device's RAM and CPU. Modern laptops (2022+) handle 10M+ rows easily, but older devices may struggle with very large files.

Workaround:
Close unnecessary browser tabs to free up memory. For files over 50M rows, consider database solutions.

No Offline Mode (Initial Load)

Requires internet connection to load the tool initially. Processing happens offline in your browser after loading.

Workaround:
Once loaded, you can disconnect and continue processing. For true offline environments, desktop tools may be better.

Browser Tab Memory Limits

Most browsers limit individual tabs to 2-4GB RAM. This is the practical ceiling for file size.

Workaround:
Use 64-bit browsers with sufficient RAM. Chrome and Firefox handle large files best.

Memory-Bound at 10M+ Rows

In internal testing (32GB RAM, Chrome 132), 10M row comparisons required ~2-3GB browser RAM. Machines with less than 8GB total RAM (or browsers with restricted memory) may fail with an out-of-memory error. The 5M row benchmark is achievable on most modern laptops. For 10M+ rows, use a machine with 16GB+ RAM.

Line-by-Line Requires Identical Sort Order

Line-by-line mode compares rows by position — row 1 vs row 1, row 2 vs row 2. If the files are sorted differently, this produces incorrect results. If sort order might differ between files, always use Primary Key mode instead.
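A two-row example makes the pitfall concrete (hypothetical data; positional compare flags both rows, keyed compare correctly finds nothing):

```python
# Same two rows, different sort order between files.
file_a = [("1", "alice"), ("2", "bob")]
file_b = [("2", "bob"), ("1", "alice")]    # identical data, resorted

# Positional (line-by-line) compare: every row "differs".
positional_diffs = [i for i, (ra, rb) in enumerate(zip(file_a, file_b)) if ra != rb]

# Keyed compare matches on the id column and finds no changes.
index = {row[0]: row for row in file_a}
keyed_diffs = [row[0] for row in file_b if index.get(row[0]) != row]
# positional_diffs: [0, 1], keyed_diffs: []
```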

CSV and TSV Only

SplitForge CSV Compare accepts .csv and .tsv files. Excel .xlsx files must be converted to CSV first using the Excel to CSV Converter tool. JSON, Parquet, and database format comparisons are not supported.

No Persistent History

Comparison results are not saved between browser sessions. To preserve a diff report, export to CSV or JSON immediately after the comparison completes. Closing or refreshing the tab discards in-memory results.

No Automation or Scheduling

SplitForge is a manual browser tool — not a CLI, API, or pipeline component. It cannot be run on a schedule, triggered by webhooks, or integrated into CI/CD workflows. For automated comparisons, use Python pandas merge, dbt tests, or a database-level diff query.
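For the pandas route mentioned above, an outer merge with the indicator flag covers added, deleted, and changed rows (a minimal sketch with made-up data):

```python
import pandas as pd

a = pd.DataFrame({"id": [1, 2, 3], "value": [10, 20, 30]})
b = pd.DataFrame({"id": [2, 3, 4], "value": [20, 99, 40]})

merged = a.merge(b, on="id", how="outer", suffixes=("_a", "_b"), indicator=True)
deleted = merged.loc[merged["_merge"] == "left_only", "id"].tolist()
added = merged.loc[merged["_merge"] == "right_only", "id"].tolist()
both = merged[merged["_merge"] == "both"]
changed = both.loc[both["value_a"] != both["value_b"], "id"].tolist()
# deleted: [1], added: [4], changed: [3]
```

This is scriptable and schedulable, which is exactly what the browser tool is not.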

Questions about limitations? Check our FAQ section below or contact us via the feedback button.

Performance FAQ

Ready to Compare?

Drop your two CSV files. 5M rows compared in under 15 seconds. Your data never leaves your browser.

No account required · No upload — ever · No file size limits under 1GB