Benchmark Performance
The Excel 900s figure is a workflow estimate, not a compute benchmark: it represents a typical end-to-end VLOOKUP workflow (write the formula, drag it down, add an IFERROR wrapper, copy-paste as values, troubleshoot mismatches) from internal testing, February 2026. Actual workflow time varies with user familiarity and file complexity. These numbers are not directly comparable — the intent is to show why tool context matters, not raw compute speed.
Performance at Scale
Chrome (stable) · Windows 11 · Intel i7-12700K · 32GB RAM · February 2026
| File Size | Inner Join | Left Join | Test Notes |
|---|---|---|---|
| 100K rows | ~296K rows/sec | ~280K rows/sec | Startup overhead visible at small sizes; hash map build dominates |
| 500K rows | ~310K rows/sec | ~295K rows/sec | Mixed data types, single key column, 85% match rate |
| 1M rows | ~333K rows/sec | ~315K rows/sec | Inner join with duplicates; 90% match rate |
| 2M rows | ~290K rows/sec | ~270K rows/sec | Larger right-side hash table, more GC pressure |
| 5M rows | ~212K rows/sec | ~195K rows/sec | Verified stress test — 5M left × 4.5M right, 556MB output |
| 10M (est.) | ~208K rows/sec | ~190K rows/sec | Estimated based on scaling curve; browser memory is the constraint |
Results vary by hardware, browser, match rate, and number of output columns. Throughput peaks at 1M rows, where the hash map fits comfortably in memory, then declines at 5M as larger hash tables increase GC pressure.
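The throughput pattern in the table comes from the hash-join approach: build a map from the right file's key column once, then stream the left file through it. A minimal sketch of that technique (function and field names are illustrative, not SplitForge's actual code):

```typescript
type Row = Record<string, string>;

// Build a hash map from the right-side file keyed on the join column.
// Duplicate keys are kept in an array so one left row can match many
// right rows (the "inner join with duplicates" case in the table).
function buildHashMap(rightRows: Row[], key: string): Map<string, Row[]> {
  const map = new Map<string, Row[]>();
  for (const row of rightRows) {
    const k = row[key];
    const bucket = map.get(k);
    if (bucket) bucket.push(row);
    else map.set(k, [row]);
  }
  return map;
}

// Stream the left file through the map. Inner join drops unmatched
// left rows; left join keeps them with their right-side columns empty.
function hashJoin(
  leftRows: Row[],
  rightMap: Map<string, Row[]>,
  key: string,
  mode: "inner" | "left"
): Row[] {
  const out: Row[] = [];
  for (const left of leftRows) {
    const matches = rightMap.get(left[key]);
    if (matches) {
      // Left values win on column-name collisions.
      for (const right of matches) out.push({ ...right, ...left });
    } else if (mode === "left") {
      out.push({ ...left });
    }
  }
  return out;
}
```

Build cost scales with the right file and probe cost with the left file, which is why the right-side file dominates memory usage while the left file can effectively stream through.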
Join Type Performance Overhead
Calculate Your Time Savings
- Joins per session: typically 1–4 per data prep session
- Sessions per year: weekly = 52, monthly = 12, daily = 260
- Analyst average rate: $45–75/hr
- Writing and dragging VLOOKUP formulas across hundreds of thousands of rows
- IFERROR wrappers and #N/A troubleshooting
- Copy-paste as values to remove formula dependency before sharing
- Missed duplicate matches that corrupt aggregations downstream
- Excel crashes when joining files over 1,048,576 rows
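The calculator inputs above reduce to a simple annual-savings formula. A sketch — only the 900s Excel estimate and the frequency multipliers come from this page; the tool-side time per join is a hypothetical parameter:

```typescript
// Annual savings = joins/session × sessions/year × time saved × hourly rate.
function annualSavings(
  joinsPerSession: number, // typical: 1–4
  sessionsPerYear: number, // weekly = 52, monthly = 12, daily = 260
  excelSeconds: number,    // ~900s per join workflow (internal estimate)
  toolSeconds: number,     // hypothetical per-join time with the tool
  hourlyRate: number       // analyst average: $45–75/hr
): number {
  const hoursSaved =
    (joinsPerSession * sessionsPerYear * (excelSeconds - toolSeconds)) / 3600;
  return hoursSaved * hourlyRate;
}

// Example: 2 joins weekly, 900s in Excel vs a hypothetical 60s, at $60/hr.
const savings = annualSavings(2, 52, 900, 60, 60); // → $1,456/year
```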
Testing Methodology
10 runs per config · drop high/low · report avg + range · test datasets available on request
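The "drop high/low · report avg + range" step is a simple trimmed summary over the 10 runs. A sketch of that reporting scheme (the function name is illustrative):

```typescript
// Drop the single highest and lowest run, then report the average and
// range (min–max) of the remaining runs.
function summarizeRuns(runs: number[]): { avg: number; min: number; max: number } {
  if (runs.length < 3) throw new Error("need at least 3 runs");
  const sorted = [...runs].sort((a, b) => a - b);
  const trimmed = sorted.slice(1, -1); // discard one low and one high outlier
  const avg = trimmed.reduce((sum, x) => sum + x, 0) / trimmed.length;
  return { avg, min: trimmed[0], max: trimmed[trimmed.length - 1] };
}
```

Trimming one run from each end keeps a single cold-cache or GC-pause outlier from skewing the reported throughput.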
Honest Limitations: Where SplitForge VLOOKUP/Join Falls Short
No tool is perfect for every use case. Here's where server-side join tools (SQL databases, Python pandas, AWS Glue) might be a better choice, and the real limitations of our browser-based architecture.
Browser-Based Processing
Performance depends on your device's RAM and CPU. Modern laptops (2022+) handle 10M+ rows easily, but older devices may struggle with very large files.
No Offline Mode (Initial Load)
Requires internet connection to load the tool initially. Processing happens offline in your browser after loading.
Browser Tab Memory Limits
Most browsers limit individual tabs to 2–4GB of RAM. This is the practical ceiling on input file size.
Browser Memory Ceiling (Right File Size-Dependent)
The right-side file is fully loaded into a JavaScript hash map. Memory usage depends heavily on column count and string lengths — roughly 200–400MB for a 1M row file, 800MB–1.5GB for a 5M row file (typical business data, 8–12 columns). On 16GB machines with other browser tabs open, you may hit limits well below 5M rows.
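A back-of-the-envelope way to check whether your right-side file will fit, consistent with the 200–400MB-per-1M-rows figure above. The overhead constants are assumptions about typical V8 string and object costs, not measured SplitForge values:

```typescript
// Rough memory estimate for the right-side hash map. Each cell costs
// roughly its string bytes plus fixed object/pointer overhead.
// All three constants are illustrative assumptions.
const BYTES_PER_CHAR = 2;     // V8 stores most strings as UTF-16
const OVERHEAD_PER_CELL = 16; // approximate pointer + header cost
const OVERHEAD_PER_ROW = 64;  // approximate row object + map-entry cost

function estimateRightFileMB(
  rows: number,
  columns: number,
  avgCellChars: number
): number {
  const bytesPerRow =
    OVERHEAD_PER_ROW +
    columns * (avgCellChars * BYTES_PER_CHAR + OVERHEAD_PER_CELL);
  return (rows * bytesPerRow) / (1024 * 1024);
}

// Typical business data: 1M rows, 10 columns, ~8 chars per cell.
const estimateMB = estimateRightFileMB(1_000_000, 10, 8); // ≈ 366 MB
```

Under these assumptions a 1M-row, 10-column file lands in the 200–400MB band the text describes; halve your tab's memory budget if other heavy tabs are open.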
No Fuzzy or Approximate Matching
Keys must match exactly (or case-insensitively if toggled). No Levenshtein distance, phonetic matching, or pattern-based matching. 'Smith' and 'Smyth' will not match.
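In practice, "exact or case-insensitive" matching means keys are normalized before comparison rather than fuzzily compared. A sketch of that behavior (function names are illustrative):

```typescript
// Normalize a join key. With the case-insensitive toggle on, keys are
// lowercased before comparison, so 'ACME' matches 'acme' — but 'Smith'
// still never matches 'Smyth', because there is no fuzzy logic.
function normalizeKey(key: string, caseInsensitive: boolean): string {
  return caseInsensitive ? key.toLowerCase() : key;
}

function keysMatch(a: string, b: string, caseInsensitive: boolean): boolean {
  return normalizeKey(a, caseInsensitive) === normalizeKey(b, caseInsensitive);
}
```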
No Automation or API Support
SplitForge is a browser tool — no REST API, CLI, or pipeline integration. Cannot be embedded in ETL workflows or scheduled jobs.
Join Key Column Names Must Match
The join key column must have identical names in both files. If your files use 'customer_id' and 'CustomerID' for the same concept, you must rename one before uploading.
When to Use Server-Side Join Tools (SQL Databases, Python pandas, AWS Glue) Instead
You need joins in an automated ETL or scheduled pipeline
SplitForge has no API. Browser-only workflow cannot run on a schedule or be triggered programmatically.
You need fuzzy or approximate key matching
SplitForge only supports exact (or case-insensitive) matching. Complex match patterns require fuzzy logic.
You need to join 50M+ row files regularly
Browser memory caps the practical ceiling at ~5M right-side rows. Server-side tools scale horizontally.
You need team-shared, version-controlled join configurations
SplitForge join settings aren't saved or shareable — each user configures from scratch each session.
Questions about limitations? Check our FAQ section below or contact us via the feedback button.
Frequently Asked Questions
How accurate is the 212K rows/second benchmark?
Why does throughput decrease at 5M rows vs 1M rows?
How does the hash-based algorithm compare to Excel VLOOKUP?
What is the difference between join types and their performance impact?
What is the pre-join analysis step and does it affect performance?
What happens with composite key joins (multi-column)?
Can I reproduce these benchmarks?
What is the browser memory limit for joins?
Benchmarks last updated: February 2026. Planned for re-testing after major algorithm changes.