Two results are verified measurements: the 1-million-row By Row Count test at 2.9 seconds and the 10-million-row By Row Count test at 28.42 seconds (351,853 rows/sec). All other times shown on this page are projections calculated by dividing the target row count by 351,853, with overhead multipliers derived from the algorithmic complexity of each split mode. They represent expected performance, not measured results. Your actual results will vary based on hardware, browser, file complexity, and available system memory. Always verify with your own files before committing to a time-sensitive workflow.
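The projection arithmetic described above can be sketched in a few lines of JavaScript as a sanity check. The baseline constant and overhead multipliers are this page's own figures; the function and property names are illustrative:

```javascript
// Projection model used on this page: measured baseline throughput
// scaled by a per-mode overhead multiplier.
const BASELINE_ROWS_PER_SEC = 351_853; // measured: 10M rows in 28.42s

// Algorithmic overhead estimates per split mode (not measured).
const OVERHEAD = { byRows: 1.0, bySize: 1.15, byParts: 1.08, byColumn: 1.55 };

function projectSeconds(rowCount, mode) {
  return (rowCount / BASELINE_ROWS_PER_SEC) * OVERHEAD[mode];
}
```

For example, `projectSeconds(10_000_000, "byColumn")` comes out at roughly 44.1 seconds, consistent with the table below.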
Processing Time by Row Count
By Row Count mode, comma delimiter, 1M rows per output chunk, single header row preserved. The 1M and 10M entries are measured — 100K and 500K are projected from the 351,853 rows/sec baseline.
Scalability by Split Mode
By Row Count is fastest (baseline). By Column Value is slowest — it must scan the full file, build a unique-value map, then write one file per group. Overhead multipliers are algorithmic estimates, not measured results.
| Row Count | By Rows | By Size | By Parts | By Column |
|---|---|---|---|---|
| Overhead factor | 1.00x (baseline) | ~1.15x | ~1.08x | ~1.55x |
| 100K rows | 0.3s (projected) | 0.3s (projected) | 0.3s (projected) | 0.4s (projected) |
| 500K rows | 1.4s (projected) | 1.6s (projected) | 1.5s (projected) | 2.2s (projected) |
| 1M rows | 2.9s (measured) | 3.3s (projected) | 3.1s (projected) | 4.4s (projected) |
| 10M rows | 28.42s (measured) | 32.7s (projected) | 30.7s (projected) | 44.1s (projected) |
Split Mode: Algorithmic Overhead
Each split mode has different algorithmic complexity. These overhead ratios explain why By Column Value takes roughly 55% longer than By Row Count for the same row count.
**By Row Count:** Stream the file in 60MB chunks → count rows per chunk → flush to an output Blob when the threshold is reached → start a new chunk. The adaptive threshold samples row sizes from the first 2K rows, giving ~10× better accuracy than raw byte tracking.
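A minimal sketch of that streaming flow, assuming decoded text chunks arrive as an async iterable. The function name and interface are illustrative, not SplitForge's actual code, and each yielded string stands in for an output Blob:

```javascript
// Split streamed CSV text into outputs of at most `rowsPerFile` data
// rows, repeating the header row at the top of each output.
async function* splitByRowCount(chunks, rowsPerFile) {
  let header = null;
  let carry = "";  // partial line spanning a chunk boundary
  let rows = [];
  for await (const chunk of chunks) {
    const lines = (carry + chunk).split("\n");
    carry = lines.pop(); // last piece may be incomplete; hold it over
    for (const line of lines) {
      if (header === null) { header = line; continue; }
      rows.push(line);
      if (rows.length === rowsPerFile) {
        yield [header, ...rows].join("\n"); // flush one output file
        rows = [];
      }
    }
  }
  if (carry) rows.push(carry);  // final row without trailing newline
  if (rows.length) yield [header, ...rows].join("\n");
}
```

The carry buffer is the important detail: a 60MB read almost never ends exactly on a row boundary, so the trailing fragment must be prepended to the next chunk.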
**By File Size:** Sample the first rows to estimate byte density per row → calculate a target row count per output file → stream and chunk at the calculated threshold. The extra sampling pass adds ~15% overhead vs By Row Count.
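The sampling step reduces a size target to a row threshold. A sketch, with illustrative names and UTF-8 byte lengths via `TextEncoder`:

```javascript
// Estimate average bytes per row from sample rows, then derive how
// many rows fit within the target output file size.
function rowsPerFileForTargetSize(sampleRows, targetBytes) {
  const enc = new TextEncoder(); // UTF-8 byte lengths, incl. newline
  const sampleBytes = sampleRows.reduce(
    (sum, row) => sum + enc.encode(row + "\n").length, 0);
  const avgBytesPerRow = sampleBytes / sampleRows.length;
  return Math.max(1, Math.floor(targetBytes / avgBytesPerRow));
}
```

From there the split proceeds exactly like By Row Count, which is why the only extra cost is the sampling pass itself.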
**By Number of Parts:** Quick pre-pass to count total rows → divide by N → stream and split at the calculated row threshold. The pre-pass row count adds ~8% overhead vs direct streaming.
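The divide-by-N step is a one-liner, but the rounding direction matters: rounding up keeps the output at exactly N files, with the last file absorbing the remainder (illustrative name):

```javascript
// Rows per output part after the pre-pass has counted totalRows.
// Math.ceil guarantees at most `parts` files; the last may be smaller.
function rowThresholdForParts(totalRows, parts) {
  return Math.ceil(totalRows / parts);
}
```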
**By Column Value:** Two-pass approach: pass 1 scans the entire file to find all unique column values; pass 2 streams and routes each row to the corresponding output Blob. Building and maintaining the value map is the dominant overhead.
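A sketch of the grouping logic, collapsed to a single in-memory pass for brevity (the real streaming code would do the two passes described above). The naive `split(",")` ignores quoted fields, and the names are mine; the 2,000-file cap is the one documented later on this page:

```javascript
// Route each data row into a per-value bucket, one bucket per output
// file, with the header repeated in each. Throws at the file cap.
function splitByColumnValue(lines, colIndex, maxFiles = 2000) {
  const [header, ...rows] = lines;
  const groups = new Map(); // value -> array of output lines
  for (const row of rows) {
    const key = row.split(",")[colIndex]; // naive; real CSV needs quoting rules
    if (!groups.has(key)) {
      if (groups.size === maxFiles) throw new Error("output file cap exceeded");
      groups.set(key, [header]);
    }
    groups.get(key).push(row);
  }
  return groups; // one output file per Map entry
}
```

The Map of buckets is exactly the "value map" whose upkeep makes this the slowest mode: every row pays a lookup, and memory grows with the number of distinct values.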
Hardware and Memory Impact
Performance degrades predictably with lower-spec hardware. These estimates are based on typical CPU and memory differentials — not independently measured on target hardware.
Honest Limitations: Where SplitForge CSV Splitter Falls Short
No tool is perfect for every use case. Here's where Python pandas / shell split / AWS Glue might be a better choice, and the real limitations of our browser-based architecture.
Browser-Based Processing
Performance depends on your device's RAM and CPU. Modern laptops (2022+) handle 10M+ rows easily, but older devices may struggle with very large files.
No Offline Mode (Initial Load)
Requires internet connection to load the tool initially. Processing happens offline in your browser after loading.
Browser Tab Memory Limits
Most browsers limit an individual tab to 2–4GB of RAM. This is the practical ceiling on input file size.
Browser Memory Ceiling (~2–4GB Files)
The streaming architecture keeps working memory roughly constant, but files near or above available browser memory can still trigger out-of-memory errors. Practical limit: ~2GB on 8GB-RAM machines, ~4GB+ on 32GB-RAM machines.
No API or Pipeline Automation
Browser-only tool — no REST API, CLI, webhook, or scheduled job support. Cannot be integrated into automated ETL workflows or run headlessly.
Column Value Split Hard Cap (2,000 Files)
By Column Value mode capped at 2,000 output files to prevent browser memory exhaustion.
Single File Per Session
Processes one CSV at a time. No batch operation across multiple files.
When to Use Python pandas / shell split / AWS Glue Instead
You need automated CSV splitting in a scheduled pipeline
No API or CLI — browser-only tool.
Your files regularly exceed 2GB on low-spec hardware
Browser memory limits make very large files unreliable on <16GB RAM machines.
You need to split 50+ files in a batch
One file per session — batch processing requires repeated operations.
Questions about limitations? Check our FAQ section below or contact us via the feedback button.
Frequently Asked Questions
Try It on Your Files
Benchmarks are one thing. Your files are another. Drop yours in and see. No uploads, no account, no install required.
Related: 10M Rows in 12 Seconds (v1 benchmark) · Split Large CSV Files Guide · 1GB CSV Benchmark (v1) · CSV Merger