10M rows in 28.4 sec — verified February 2026
File contents never uploaded

CSV Splitter Performance
Benchmarks & Methodology

How fast does SplitForge process CSV files? Two verified results; everything else is clearly labeled as a projection. Hardware configuration, measurement protocol, and honest limitations are fully documented.

351,853/s
Throughput
rows per second (10M test)
28.4 sec
10M Row Test
verified benchmark
4
Split Modes
with overhead breakdown
Never
Upload
browser-only
How to Read This Page

Two results are verified measurements: the 1-million-row By Row Count test at 2.9 seconds and the 10-million-row By Row Count test at 28.42 seconds (351,853 rows/sec). All other times shown on this page are projections calculated by dividing the target row count by 351,853, with overhead multipliers derived from the algorithmic complexity of each split mode. They represent expected performance, not measured results. Your actual results will vary based on hardware, browser, file complexity, and available system memory. Always verify with your own files before committing to a time-sensitive workflow.
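Every projection on this page reduces to one formula: projected seconds = row count / 351,853 × overhead multiplier. A minimal sketch of that arithmetic (the function name and dictionary are ours, for illustration; the multipliers come from the scalability table):

```python
BASELINE_ROWS_PER_SEC = 351_853  # verified: 10M rows in 28.42 s

# Overhead multipliers are algorithmic estimates, not measurements
OVERHEAD = {"by_rows": 1.00, "by_size": 1.15, "by_parts": 1.08, "by_column": 1.55}

def projected_seconds(rows: int, mode: str = "by_rows") -> float:
    """Projected processing time per the methodology described above."""
    return rows / BASELINE_ROWS_PER_SEC * OVERHEAD[mode]
```

By construction, `projected_seconds(10_000_000)` reproduces the verified 28.42 s baseline, and the By Column mode at 10M rows lands at the ~44.1 s shown in the scalability table.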

Processing Time by Row Count

By Row Count mode, comma delimiter, 1M rows per output chunk, single header row preserved. The 1M and 10M entries are measured; 100K and 5M are projected from the 351,853 rows/sec baseline.

[Chart: processing time in seconds (0–32) by row count: 100K, 1M, 5M, 10M]
Verified measurement (Feb 2026)
Projected from 351,853 rows/sec baseline
0.3s
100K rows
PROJECTED
2.9s
1M rows
MEASURED
14.2s
5M rows
PROJECTED
28.42s
10M rows
MEASURED

Scalability by Split Mode

By Row Count is fastest (baseline). By Column Value is slowest — it must scan the full file, build a unique-value map, then write one file per group. Overhead multipliers are algorithmic estimates, not measured results.

File Size | By Rows | By Size | By Parts | By Column
Overhead factor | 1.00x (baseline) | ~1.15x | ~1.08x | ~1.55x
100K rows | 0.3s projected | 0.3s projected | 0.3s projected | 0.4s projected
500K rows | 1.4s projected | 1.6s projected | 1.5s projected | 2.2s projected
1M rows | 2.9s measured | 3.3s projected | 3.1s projected | 4.4s projected
10M rows | 28.42s measured | 32.7s projected | 30.7s projected | 44.1s projected
All values except the 1M and 10M By Row Count results are projected calculations. Results vary by hardware, browser, file complexity, and available system memory. Verify with your own files before committing to time-sensitive workflows.

Split Mode: Algorithmic Overhead

Each split mode has different algorithmic complexity. These overhead ratios explain why By Column Value takes roughly 55% longer than By Row Count for the same row count.

By Row Count
1.00x — Baseline

Stream file in 60MB chunks → count rows per chunk → flush to output Blob when threshold reached → start new chunk. Adaptive threshold uses row size sampling from first 2K rows for ~10× accuracy vs byte tracking.

1. Sample first 2K rows to estimate row size in bytes
2. Calculate rows-per-flush based on target row count
3. Stream in 60MB chunks, flush output when row threshold reached
4. Write header row to every new output file
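SplitForge itself runs as JavaScript in the browser, but the four steps above translate directly to any language. A hypothetical Python sketch of the same stream-and-flush pattern (function names are ours; the csv module handles quoting edge cases):

```python
import csv, io

def split_by_row_count(text: str, rows_per_chunk: int) -> list[str]:
    """Stream rows, flush a new output chunk at the threshold, repeat the header."""
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    chunks, buf = [], []
    for row in reader:
        buf.append(row)
        if len(buf) == rows_per_chunk:  # step 3: flush at the row threshold
            chunks.append(_write_chunk(header, buf))
            buf = []
    if buf:  # final partial chunk
        chunks.append(_write_chunk(header, buf))
    return chunks

def _write_chunk(header: list[str], rows: list[list[str]]) -> str:
    out = io.StringIO()
    w = csv.writer(out, lineterminator="\n")
    w.writerow(header)  # step 4: header in every output file
    w.writerows(rows)
    return out.getvalue()
```

Splitting five data rows at two rows per chunk yields three output chunks, each beginning with the header row.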
By File Size
~1.15x — +15% overhead

Sample first rows to estimate byte density per row → calculate target row count per output file → stream and chunk at calculated threshold. Extra sampling pass adds ~15% overhead vs By Row Count.

1. Sample first 500 rows, measure byte size
2. Calculate target rows per MB
3. Stream in chunks, flush output at byte-estimated row boundary
4. Output size varies ±20% due to row density variation
Output sizes vary ±20% from target because row density varies within the dataset.
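The sampling step is plain averaging: bytes per row estimated from the sample, divided into the target size. A hedged sketch (the function name is ours; the tool samples 500 rows as described above):

```python
def rows_per_target_size(sample_rows: list[str], target_bytes: int) -> int:
    """Estimate rows per output file from a small sample of raw CSV lines.
    Real chunk sizes drift within roughly +/-20% because row density varies."""
    avg_bytes = sum(len(r.encode("utf-8")) + 1 for r in sample_rows) / len(sample_rows)  # +1 for newline
    return max(1, int(target_bytes / avg_bytes))
```

With rows averaging ~50 bytes and a 10MB target, the estimate lands near 200,000 rows per output file.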
By Equal Parts
~1.08x — +8% overhead

Quick pre-pass to count total rows → divide by N → stream and split at calculated row threshold. Pre-pass row count adds ~8% overhead vs direct streaming.

1. Quick scan to count total rows (PapaParse fast-parse mode)
2. Calculate rows per chunk = totalRows / N
3. Stream file, flush output at each calculated boundary
4. Header row preserved in every output chunk
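The arithmetic behind By Equal Parts is ceiling division, so any remainder shortens the last part instead of spilling into an N+1th file. A one-line sketch (function name ours):

```python
import math

def rows_per_part(total_rows: int, n_parts: int) -> int:
    """Rows per chunk after the pre-pass row count: ceil(total / N)."""
    return math.ceil(total_rows / n_parts)
```

For example, 10M rows split into 4 parts gives 2,500,000 rows per part; 10 rows into 3 parts gives chunks of 4, 4, and 2.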
By Column Value
~1.55x — +55% overhead

Two-pass approach: Pass 1 scans entire file to find all unique column values. Pass 2 streams and routes each row to the corresponding output Blob. Building and maintaining the value map is the dominant overhead.

1. Pass 1: Stream full file, collect unique values from target column into Map
2. Validate count ≤ 2,000 output file cap
3. Pass 2: Stream file again, route each row to matching output Blob
4. Write all output Blobs to ZIP on completion
Hard cap: 2,000 output files per operation. High-cardinality columns (>2K unique values) require grouping first.
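The two-pass routing above can be sketched in a few lines. This illustrative Python version (ours, not the tool's JS implementation; output quoting is ignored for brevity) shows why the value map and per-group buffers dominate the cost:

```python
import csv, io

MAX_OUTPUT_FILES = 2_000  # hard cap noted above

def split_by_column(text: str, col: str) -> dict[str, str]:
    """Two-pass split: pass 1 builds the unique-value map, pass 2 routes rows."""
    # Pass 1: scan the full input, collect unique values of the target column
    values = {row[col] for row in csv.DictReader(io.StringIO(text))}
    if len(values) > MAX_OUTPUT_FILES:
        raise ValueError(f"{len(values)} groups exceed the {MAX_OUTPUT_FILES}-file cap")
    # Pass 2: scan again, appending each row to its group's buffer
    reader = csv.DictReader(io.StringIO(text))
    header = reader.fieldnames
    outputs = {v: [",".join(header)] for v in values}
    for row in reader:
        outputs[row[col]].append(",".join(row[h] for h in header))
    return {v: "\n".join(lines) + "\n" for v, lines in outputs.items()}
```

Each returned string is one output file, header included, keyed by the column value it was grouped on.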
These overhead ratios are estimated from algorithmic complexity, not from independent measurements. Actual overhead depends on column data types, file structure, available CPU cache, and chunk size.

Hardware and Memory Impact

Performance degrades predictably with lower-spec hardware. These estimates are based on typical CPU and memory differentials — not independently measured on target hardware.

High-End Desktop
Intel i7-12700K+, 32GB+ RAM, Chrome stable
1.0x (benchmark conditions)
10M rows: ~28 sec. Memory pressure is rare for files under 2GB.
Mid-Range Laptop
Intel i5 / Ryzen 5, 16GB RAM, Chrome stable
~1.8–2.5x slower
10M rows: ~50–70 sec estimated. Files over 1GB may cause memory pressure.
Budget / Older Machine
Core i3 / older CPU, 8GB RAM, any browser
~3–5x slower
10M rows: ~85–140 sec estimated. Files over 500MB may hit memory limits.
Hardware multipliers are estimates based on typical CPU clock speed and memory bandwidth differentials. Streaming architecture means memory-constrained machines hit limits earlier — not just slower. Results vary by hardware, browser, and file complexity.

Test Configuration

Hardware

Processor: Intel Core i7-12700K (12th Gen, 12 cores)
RAM: 32GB DDR4-3200
Storage: NVMe SSD (Samsung 970 EVO)
OS: Windows 11 Pro 23H2
Browser: Chrome stable (latest as of Feb 2026)

Test File

Rows: 10,001,000 (includes header)
Columns: 8 (UUID, names, email, company, city, country, value)
File size: ~1.1GB (uncompressed CSV)
Delimiter: Comma (auto-detected)
Encoding: UTF-8

Calculate Your Time Savings

Manual baseline: a manual workflow (open the file in Excel, hit the row limit or a crash, copy/paste rows, save each chunk, rename files) can take 1–3 hours depending on file size and system stability. SplitForge tool time assumes a 10M-row file (~28 seconds, verified); smaller files finish proportionally faster. Adjust the manual minutes to match your actual workflow.

Calculator assumptions: splits per month (weekly = 4, daily = 22); manual minutes per split (default 20; adjust to your actual workflow); hourly rate (data analyst average: $45–75/hr).

Manual Time / Month: 2.7 hours
Annual Time Saved: 31 hours per year
Annual Labor Savings: $1,719 per year at $55/hr

Honest Limitations: Where SplitForge CSV Splitter Falls Short

No tool is perfect for every use case. Here's where Python pandas / shell split / AWS Glue might be a better choice, and the real limitations of our browser-based architecture.

Browser-Based Processing

Performance depends on your device's RAM and CPU. Modern laptops (2022+) handle 10M+ rows easily, but older devices may struggle with very large files.

Workaround:
Close unnecessary browser tabs to free up memory. For files over 50M rows, consider database solutions.

No Offline Mode (Initial Load)

Requires internet connection to load the tool initially. Processing happens offline in your browser after loading.

Workaround:
Once loaded, you can disconnect and continue processing. For true offline environments, desktop tools may be better.

Browser Tab Memory Limits

Most browsers limit an individual tab to roughly 2–4GB of RAM; this is the practical ceiling for file size.

Workaround:
Use 64-bit browsers with sufficient RAM. Chrome and Firefox handle large files best.

Browser Memory Ceiling (~2–4GB Files)

Streaming architecture keeps memory usage constant, but very large files near or above available browser memory will cause out-of-memory errors. Practical limit: ~2GB on 8GB RAM machines, ~4GB+ on 32GB RAM machines.

Workaround:
Pre-split using shell: split -l 1000000 large.csv chunk_ (Linux/Mac/WSL). Python pandas with chunksize handles arbitrary file sizes without browser memory constraints.

No API or Pipeline Automation

Browser-only tool — no REST API, CLI, webhook, or scheduled job support. Cannot be integrated into automated ETL workflows or run headlessly.

Workaround:
Python pandas (chunksize) for automated pipelines. Shell split for simple size/row splits. AWS Glue for cloud-scale ETL with orchestration.
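For the pandas route, chunked reading reproduces By Row Count mode inside a scriptable pipeline. A minimal sketch (the function name and output naming scheme are ours):

```python
import pandas as pd

def split_csv(path: str, rows_per_chunk: int, out_prefix: str) -> int:
    """Automated row-count split via pandas' chunked reader; returns file count."""
    n = 0
    for i, chunk in enumerate(pd.read_csv(path, chunksize=rows_per_chunk)):
        # to_csv writes the header into every output file by default
        chunk.to_csv(f"{out_prefix}_{i:04d}.csv", index=False)
        n += 1
    return n
```

Each chunk carries its own header, matching the tool's header-preservation behavior, and the function drops straight into a cron or Airflow task for scheduled splits.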

Column Value Split Hard Cap (2,000 Files)

By Column Value mode capped at 2,000 output files to prevent browser memory exhaustion.

Workaround:
Group by parent category first. Python pandas groupby() has no file count ceiling for programmatic splitting.
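The pandas equivalent of By Column Value mode is a groupby() loop with no output-file cap. A hypothetical sketch (names are ours; sanitize the value before using it as a filename in real pipelines):

```python
import os
import pandas as pd

def split_by_column_value(path: str, col: str, out_dir: str) -> int:
    """One output CSV per unique value in `col`; returns the group count."""
    df = pd.read_csv(path)
    for value, group in df.groupby(col):
        group.to_csv(os.path.join(out_dir, f"{value}.csv"), index=False)
    return df[col].nunique()
```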

Single File Per Session

Processes one CSV at a time. No batch operation across multiple files.

Workaround:
Process files sequentially — each operation takes seconds. Python glob() + pandas for 50+ file batch workflows.
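For the 50+ file case, a glob() loop over pandas' chunked reader covers the batch workflow. A sketch under the same assumptions (function and naming scheme are ours):

```python
import glob, os
import pandas as pd

def batch_split(pattern: str, rows_per_chunk: int, out_dir: str) -> int:
    """Split every CSV matching `pattern`; returns total output files written."""
    count = 0
    for path in sorted(glob.glob(pattern)):
        stem = os.path.splitext(os.path.basename(path))[0]
        for i, chunk in enumerate(pd.read_csv(path, chunksize=rows_per_chunk)):
            chunk.to_csv(os.path.join(out_dir, f"{stem}_part{i}.csv"), index=False)
            count += 1
    return count
```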

When to Use Python pandas / shell split / AWS Glue Instead

You need automated CSV splitting in a scheduled pipeline

No API or CLI — browser-only tool.

💡 Python pandas (chunksize), shell split, or AWS Glue.

Your files regularly exceed 2GB on low-spec hardware

Browser memory limits make very large files unreliable on <16GB RAM machines.

💡 Shell split command (Linux/Mac/WSL) or Python pandas chunksize.

You need to split 50+ files in a batch

One file per session — batch processing requires repeated operations.

💡 Python glob() + pandas loop or shell for loop.

Questions about limitations? Check our FAQ section below or contact us via the feedback button.

Frequently Asked Questions

Try It on Your Files

Benchmarks are one thing. Your files are another. Drop yours in and see. No uploads, no account, no install required.

File contents never leave your device
Verified: 10M rows in 28.4 seconds
Headers preserved in every output chunk
Free — no account required

Related: 10M Rows in 12 Seconds (v1 benchmark) · Split Large CSV Files Guide · 1GB CSV Benchmark (v1) · CSV Merger