What are the memory limits?

Typical limits by browser:

Chrome: ~4GB per tab (configurable)
Firefox: ~3GB per tab
Safari: ~2GB per tab
Edge: ~4GB per tab

In practice, v3.2 uses ~25 MB working memory for 10M rows (OPFS streaming output), well under all browser limits. Files up to 1GB work reliably on machines with 8GB+ RAM.

Is this GDPR/HIPAA-safe?

Yes, by architecture. All processing happens client-side in your browser. No data is ever transmitted to our servers or any third party. This means:

No Article 28 processor agreements needed (GDPR)
No BAA required (HIPAA)
No cross-border data transfer (GDPR Article 44)
No vendor security audits required (SOC 2)

Your data never leaves your device.

Why is JSON → CSV faster than CSV → Excel?

Excel files (.xlsx) are ZIP archives containing XML with styling, formatting, and metadata. This requires:

ZIP compression (CPU intensive)
XML generation (more complex than JSON)
Format overhead (much larger than CSV)

CSV is plain text with minimal overhead.

v3.2 benchmarks (Node harness, May 2026): CSV→JSONL ~440K rows/sec, CSV→JSON ~95K rows/sec, JSON→CSV ~28K rows/sec (two-pass streaming tokenizer), Excel→CSV ~94K rows/sec.

What happens if my browser tab crashes?

Because processing happens entirely client-side, a browser crash means you'll need to restart the conversion. For critical workflows processing 10M+ rows, we recommend:

Close other browser tabs
Use Incognito/Private mode (starts fresh)
Disable browser extensions temporarily
For files over 5GB, split first

Can I automate this with scripts?

Not directly (browser-based tools require user interaction). For automation needs:

Use Node.js with conversion libraries
Use Python pandas for smaller files (<1M rows)
Use conversion patterns as templates for your own scripts

The browser version excels at one-off conversions without infrastructure setup.

Data Engineering

Convert 10M Rows: CSV ↔ JSON ↔ Excel in 60 Seconds

December 13, 2025

By SplitForge Team

Your database export just finished. 10 million rows. 3.2GB JSON file.

You need it in CSV by Monday for the analytics team.

Your conversion script crashes. Online tools refuse files over 100MB. Cloud APIs want $50/month subscriptions plus per-file charges. Your CTO won't approve uploading customer data to third-party servers.

You have 48 hours.

Every month, data teams lose 12–20 hours trying to convert files that are too large for Excel, too inconsistent for Python scripts, or too sensitive for cloud tools. The financial impact: $1,800–$3,200 per incident in wasted labor, missed deadlines, and paid subscriptions for tools that shouldn't be necessary.

This guide shows the architecture we built to convert 10 million rows in 45 seconds—no uploads, no RAM spikes, no infrastructure.

Key Takeaway:
You don't need cloud APIs, Python expertise, or expensive ETL platforms. A properly architected browser-based converter can process 10 million rows at 440,000 rows per second (CSV → JSONL, February 2026 Node harness benchmark)—entirely client-side with zero uploads and complete privacy.

TL;DR

A properly engineered browser-based converter can process 10M rows CSV→JSON with OPFS output streaming — heap stays flat at ~25 MB regardless of output size — with zero uploads and complete privacy.

This guide breaks down the architecture: streaming parsers, OPFS streaming output, two-pass JSON tokenization, and Web Worker pipelines that make enterprise-grade performance possible without servers.

Quick 2-Minute Emergency Fix

Need to convert millions of rows between CSV/JSON/Excel right now?

Don't use cloud converters → File size limits, uploads expose data, subscription costs
Use browser-based streaming → Web Workers process locally
Drop your file → Handled via File API, stays on device
Convert → OPFS streaming output, ~25 MB working memory, flat heap
Download result → Created via Blob API, zero server interaction

This handles CSV↔JSON↔Excel conversion for 10M+ rows in under 60 seconds. Continue reading for a comprehensive technical deep dive.

Why This Matters
The Real Problem: Why Format Conversion Breaks at Scale
How Browser-Based Streaming Solves This
Real-World Performance Benchmarks
Technical Deep Dive: How It Works
Comparison: Browser vs Traditional Methods
Use Cases: When to Use Browser-Based Conversion
Privacy & Compliance Architecture
Performance Optimization Techniques
Common Conversion Patterns
Advanced Features
Troubleshooting Common Issues
Integration Patterns
Cost Analysis: Browser vs Alternatives
Technical Specifications
Best Practices
Benchmarking Methodology
Real-World Success Stories
The Architecture Philosophy
What This Won't Do
FAQ
Conclusion

Why This Matters

Format conversion is infrastructure work. It shouldn't require:

Cloud service subscriptions ($20–$200/month)
Custom Python/Node.js scripts that break on edge cases
Uploading sensitive data to third-party servers
Waiting 30–120 minutes for cloud processing queues

The financial and operational impact:

Development costs:

Average time to write a robust CSV↔JSON converter: 8–15 hours
Maintenance burden: 2–4 hours/month fixing encoding issues, edge cases
Total annual cost: $3,200–$6,400 in developer time (at $100/hour loaded cost)

Cloud service costs:

Convertio Pro: $10/month (250 MB file limit)
CloudConvert: $8–$25/month (API limits apply)
Zamzar Pro: $16/month (50 conversions/month)
Annual cost: $96–$300 for basic plans

Compliance risks:

GDPR Article 28 requires processor agreements for uploaded data
SOC 2 compliance mandates data handling audits
HIPAA restricts health data uploads to third parties
Violation costs: $100K–$50M in GDPR fines for data breaches

This guide demonstrates how streaming Web Worker architecture achieves enterprise-grade conversion (10M+ rows, flat heap via OPFS) while maintaining complete data privacy through client-side processing.

By the end, you'll understand:

Why traditional conversion methods fail at scale
How streaming architecture handles 10M+ rows without memory overflow
Technical implementation of OPFS streaming output and two-pass JSON tokenizer (v3.2)
Real-world benchmarks: CSV↔JSON↔Excel at production scale

The Real Problem: Why Format Conversion Breaks at Scale

Traditional Tools Fail Above 1M Rows

Excel:

Hard limit: 1,048,576 rows
CSV import crashes with special characters (international data, JSON escaping)
No native JSON support (requires Power Query, limited to 500K rows)
XLSX generation requires all data in memory (memory = 3–5× file size)

Python pandas:

import pandas as pd
df = pd.read_csv('10m_rows.csv')  # Loads entire file into RAM
df.to_json('output.json')         # Creates full string in memory

Memory usage: 10M rows × 20 columns × 100 bytes = 20GB RAM
Reality: Crashes on laptops, requires server infrastructure
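
The estimate above is plain multiplication. A minimal sketch, assuming a flat 100 bytes per cell as the text does (the helper name estimateBytes is illustrative, and real pandas memory use varies with column dtypes):

```javascript
// Rough in-memory size estimate: rows x columns x bytes per cell.
// bytesPerCell = 100 is the figure assumed in the text, not a measured value.
function estimateBytes(rows, columns, bytesPerCell = 100) {
  return rows * columns * bytesPerCell;
}

const estimate = estimateBytes(10_000_000, 20);
console.log(`${(estimate / 1e9).toFixed(0)} GB`); // 20 GB (decimal gigabytes)
```

Halving the column count or the average cell width halves the estimate, which is why narrow exports can load on a laptop while wide ones cannot.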

Online Conversion Services:

Convertio: 100 MB file limit (free), 1 GB (paid)
CloudConvert: 1 GB limit, 25 conversions/day
Zamzar: 50 MB limit (free), 2 GB (paid)
All require uploading data to their servers

Node.js streaming (common approach):

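const fs = require('fs');
const csv = require('csv-parser');

// jsonStream() stands in for a Transform stream that serializes row objects to JSON text
fs.createReadStream('input.csv')
  .pipe(csv())
  .pipe(jsonStream())
  .pipe(fs.createWriteStream('output.json'));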

Problems:

Requires Node.js installation
100K–150K rows/sec typical performance
No progress indicators
Breaks on malformed CSV (encoding issues, quote escaping)

The gap: Need 500K+ rows/sec performance, multi-format support, browser accessibility, and zero server uploads.

How Browser-Based Streaming Solves This

Web Workers + OPFS Streaming Architecture

Modern browsers provide everything needed for enterprise-grade file processing:

┌─────────────────────────────────────────────────────────────┐
│ Main Thread                                                 │
│  ├─ UI rendering & user interaction                         │
│  ├─ File selector (<input type="file">)                     │
│  ├─ Progress bar updates                                    │
│  └─ Download link generation                                │
└─────────────────────────────────────────────────────────────┘
                            │
                  postMessage(file)
                            ↓
┌─────────────────────────────────────────────────────────────┐
│ Web Worker (Background Thread)                              │
│  ├─ Streaming file reader (64KB chunks)                     │
│  ├─ Format-specific parser (CSV/JSON/JSONL)                 │
│  ├─ Row builder (per-format hot path)                       │
│  ├─ OPFS StreamWriter (browser-private storage sink)        │
│  └─ File handle transfer back to main thread                │
└─────────────────────────────────────────────────────────────┘

1. Web Workers (Background Processing)

// Main thread: hand the file to the worker so the UI stays responsive
const worker = new Worker('converterWorker.js');
worker.postMessage({ file, format });

// converterWorker.js: runs on the background thread
self.onmessage = async (e) => {
  const { file, format } = e.data;
  await streamConvert(file, format);
};

Benefits:

Non-blocking UI (progress bars, cancellation)
Parallel processing (multi-core CPU utilization)
Memory isolation (worker crash doesn't kill UI)

2. Streaming File API

const reader = file.stream().getReader();
while (true) {
  const { value, done } = await reader.read();
  if (done) break;
  processChunk(value); // Process 64KB at a time
}

Memory usage: O(chunk size) instead of O(file size)
Result: 10M rows uses 2–5 MB RAM, not 20 GB
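
One detail the loop above glosses over: chunk boundaries fall on arbitrary bytes, so a multi-byte UTF-8 character can be split across two chunks. A minimal sketch of handling this with TextDecoder in streaming mode (the function name decodeChunks is illustrative, not part of the converter):

```javascript
// Decodes a sequence of byte chunks into one string.
// { stream: true } makes the decoder hold back an incomplete trailing
// character until the next chunk supplies the remaining bytes.
function decodeChunks(chunks) {
  const decoder = new TextDecoder('utf-8');
  let text = '';
  for (const chunk of chunks) {
    text += decoder.decode(chunk, { stream: true });
  }
  return text + decoder.decode(); // Final call flushes any buffered bytes
}

const sampleBytes = new TextEncoder().encode('né中'); // 'é' is 2 bytes, '中' is 3
const decoded = decodeChunks([sampleBytes.slice(0, 2), sampleBytes.slice(2)]); // Split inside 'é'
console.log(decoded === 'né中'); // true
```

Decoding each chunk independently would instead turn the split character into replacement characters.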

3. OPFS StreamWriter (v3.1+) — replaced in-memory ChunkWriter

Pre-v3.1 architecture used a ChunkWriter (in-memory 2 MB Uint8Array buffer flushed to a Blob array). v3.1+ writes directly to the browser-private Origin Private File System (OPFS) — output never accumulates in the JS heap. The pattern below shows the older ChunkWriter for historical reference:

// Pre-v3.1: ChunkWriter (in-memory, now replaced by OPFS StreamWriter)
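class ChunkWriter {
  constructor(size = 2 * 1024 * 1024) { // 2MB buffer
    this.encoder = new TextEncoder();
    this.buffer = new Uint8Array(size);
    this.position = 0;
    this.chunks = []; // Flushed parts, later passed to new Blob(chunks)
  }

  write(str) {
    const encoded = this.encoder.encode(str);
    // Flush first if this write would overflow the buffer
    if (this.position + encoded.length > this.buffer.length) this.flush();
    if (encoded.length > this.buffer.length) {
      this.chunks.push(encoded); // Oversized write bypasses the buffer
      return;
    }
    this.buffer.set(encoded, this.position);
    this.position += encoded.length;

    if (this.position > this.buffer.length * 0.9) {
      this.flush(); // Write to Blob when 90% full
    }
  }

  // flush() was not shown in the original listing; this is a minimal assumed version
  flush() {
    if (this.position === 0) return;
    this.chunks.push(this.buffer.slice(0, this.position)); // Copy out the filled bytes
    this.position = 0;
  }
}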

Performance gain: 2–3× faster than string concatenation
Reason: Avoids repeated memory allocation and string copies

4. Two-Pass Streaming Tokenizer for JSON→CSV (v3.2)

Pre-v3.1 used a compiled row processor via new Function() for JSON→CSV (15–30% speed gain). This was removed in v3.2 for CSP compliance. The current approach is a two-pass streaming tokenizer:

// v3.2: Two-pass streaming tokenizer (input is O(1) heap)
// Pass 1: scan first 100 objects to discover column headers
const headerSet = new Set();
let sampleCount = 0;
for await (const obj of streamJSONObjects(file)) {
  Object.keys(flattenObject(obj)).forEach(k => headerSet.add(k));
  if (++sampleCount >= 100) break;
}
const headers = Array.from(headerSet);

// Pass 2: stream all objects and write CSV rows to OPFS sink
for await (const obj of streamJSONObjects(file)) {
  writer.write(buildCSVRow(flattenObject(obj), headers));
}

Trade-off: Character-by-character JSON tokenization is slower than the compiled extractor (~28K rows/sec vs pre-v3.1 537K), but input never loads into heap — enabling unlimited input file size.
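
streamJSONObjects is not shown in this guide. The following is a minimal sketch of the underlying idea, assuming the input is a top-level JSON array of objects: scan character by character, track string state and nesting depth, and emit each object as soon as its closing brace arrives. The class name JSONObjectSplitter is hypothetical, and this simplified version hands each complete object to JSON.parse rather than tokenizing values itself, so it should not be read as the converter's actual tokenizer.

```javascript
// Hypothetical sketch: splits a top-level JSON array into objects, chunk by chunk.
class JSONObjectSplitter {
  constructor() {
    this.depth = 0;        // Nesting depth of {} and [] inside the current object
    this.inString = false; // True while inside a JSON string literal
    this.escaped = false;  // True when the previous character was a backslash
    this.current = '';     // Text of the object being accumulated
  }

  // Feed one text chunk; returns the objects completed by this chunk.
  push(chunk) {
    const out = [];
    for (const ch of chunk) {
      if (this.depth === 0) {
        // Between objects: skip '[', ',', ']' and whitespace until an object opens
        if (ch !== '{') continue;
        this.depth = 1;
        this.current = ch;
        continue;
      }
      this.current += ch;
      if (this.inString) {
        if (this.escaped) this.escaped = false;
        else if (ch === '\\') this.escaped = true;
        else if (ch === '"') this.inString = false;
      } else if (ch === '"') {
        this.inString = true;
      } else if (ch === '{' || ch === '[') {
        this.depth++;
      } else if (ch === '}' || ch === ']') {
        this.depth--;
        if (this.depth === 0) {
          out.push(JSON.parse(this.current)); // Only one object is held at a time
          this.current = '';
        }
      }
    }
    return out;
  }
}

const splitter = new JSONObjectSplitter();
const first = splitter.push('[{"id":1,"note":"a } b"},{"id"');
const second = splitter.push(':2,"tags":["x","y"]}]');
console.log(first.length, second.length); // 1 1
```

Because state lives on the instance, an object that straddles two chunks is completed by the second push call, and braces inside string values do not affect the depth count.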

Real-World Performance Benchmarks

JSON → CSV: Two-Pass Streaming Tokenizer (v3.2)

v3.2 approach: Input file is tokenized character-by-character — never loaded into heap. Output streams to OPFS (browser-private storage). Both input and output are O(1) heap regardless of file size.

v3.2 benchmark (Node harness, May 2026): ~28K rows/sec
Why slower than pre-v3.1 (was 537K): The compiled row extractor (new Function()) was removed for CSP compliance. Character-by-character JSON tokenization replaces it — more memory-safe but CPU-heavier.

Code path (v3.2):

// Pass 1: scan first 100 objects for headers (never loads full file)
const headerSet = new Set();
let sampleCount = 0;
for await (const obj of streamJSONObjects(file)) {
  Object.keys(flattenObject(obj)).forEach(k => headerSet.add(k));
  if (++sampleCount >= 100) break;
}
const headers = Array.from(headerSet);

// Pass 2: stream all objects → CSV rows → OPFS sink
for await (const obj of streamJSONObjects(file)) {
  writer.write(buildCSVRow(flattenObject(obj), headers));
}

Memory profile (v3.2):

JS heap: ~15 MB working memory (input tokenized in chunks, never fully loaded)
OPFS sink: output written to browser storage — zero heap accumulation
Unlimited input file size — tokenizer processes one object at a time

CSV → JSON: Streaming Output at 10M Scale

v3.2 benchmark (Node harness, May 2026): ~95K rows/sec at 5M–10M scale
Output: Streams to OPFS — JS heap stays flat regardless of output size (2.4 GB output tested, ~25 MB heap)

Pre-v3.1 figures (now outdated): 220K rows/sec / 45.44 sec for 10M rows (batch-to-Blob architecture, not OPFS). The OPFS streaming path in v3.1+ uses different output mechanics — benchmark figures are not directly comparable.

Architecture enabling flat-heap output (v3.2):

// Input: CSV streamed line-by-line via async generator
let batch = [];
for await (const line of streamLinesFast(file, delimiter)) {
  const values = parseCSVLineFast(line, delimiter);
  const obj = buildNestedObject(headers, values, options);
  batch.push(JSON.stringify(obj));
  
  if (batch.length >= BATCH_SIZE) {
    writer.write(batch.join('\n') + '\n'); // Write to OPFS sink
    batch = []; // Drop the reference so the batch can be garbage-collected
  }
}
// Flush the rows left over from the final partial batch
if (batch.length > 0) writer.write(batch.join('\n') + '\n');

Memory profile (CSV→JSON streaming, v3.2): ~25 MB peak working memory; output streams to OPFS (browser-private storage) — JS heap stays flat regardless of output file size
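
The loop above calls buildNestedObject without showing it. A minimal sketch of what such a helper could look like, assuming dotted header names (user.name) map to nested objects and ignoring the options argument, whose contents this guide does not describe:

```javascript
// Hypothetical helper: rebuilds nesting from dotted header names.
// Objects are created with a null prototype so a header such as "__proto__"
// is stored as ordinary data instead of touching Object.prototype.
function buildNestedObject(headers, values) {
  const root = Object.create(null);
  headers.forEach((header, i) => {
    const path = header.split('.');
    let node = root;
    // Walk (and create) intermediate objects for every segment except the last
    for (let j = 0; j < path.length - 1; j++) {
      const key = path[j];
      if (typeof node[key] !== 'object' || node[key] === null) {
        node[key] = Object.create(null);
      }
      node = node[key];
    }
    node[path[path.length - 1]] = values[i] ?? ''; // Missing cells become empty strings
  });
  return root;
}

const nested = buildNestedObject(['id', 'user.name', 'user.city'], ['1', 'John', 'Boston']);
console.log(JSON.stringify(nested)); // {"id":"1","user":{"name":"John","city":"Boston"}}
```

Values stay as strings here; type coercion, if wanted, would be a separate step.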

CSV → Excel: 94,697 Rows/Second

Test: 1 million rows, 3 columns → XLSX
Result: 10.56 seconds = 94,697 rows/sec (65.2 MB output)

Why slower than JSON:

XLSX requires ZIP compression (CPU intensive)
XML generation for sheet data (more complex than JSON)
Excel file format overhead (styles, formatting, metadata)

Still impressive because:

Exceeds Excel's own row limit (1,048,576 max)
Faster than Python pandas (typically 30K–50K rows/sec)
No server upload required (Excel Online has 100K row limit)

Technical Deep Dive: How It Works

1. Streaming CSV Parser

Challenge: CSV isn't truly line-delimited due to quoted fields with newlines:

id,description
1,"Product with
newline in description"
2,"Another product"

Solution: Quote-aware streaming parser

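async function* streamLines(file) {
  const reader = file.stream().getReader();
  const decoder = new TextDecoder('utf-8');
  let buffer = '';
  let inQuotes = false;
  let i = 0; // Scan position persists across chunks so no character is scanned twice
  
  while (true) {
    const { value, done } = await reader.read();
    if (done) break;
    
    buffer += decoder.decode(value, { stream: true });
    
    while (i < buffer.length) {
      const c = buffer[i];
      
      if (c === '"') {
        // A quote at the very end may be half of an escaped "" pair; wait for more data
        if (inQuotes && i + 1 === buffer.length) break;
        if (inQuotes && buffer[i+1] === '"') {
          i += 2; // Skip escaped quote
          continue;
        }
        inQuotes = !inQuotes;
      }
      
      if (!inQuotes && c === '\n') {
        const line = buffer.slice(0, i);
        buffer = buffer.slice(i + 1);
        yield line; // Return complete line
        i = 0;
      } else {
        i++;
      }
    }
  }
  
  buffer += decoder.decode(); // Flush bytes held back by the streaming decoder
  if (buffer.length > 0) yield buffer; // Last line when the file has no trailing newline
}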

Performance: 400K+ lines/sec
Memory: O(1) relative to file size; the buffer holds at most one chunk (64 KB) plus the unfinished line carried over from the previous chunk
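
Once a complete logical line is available, it still has to be split into fields. A minimal sketch of a quote-aware field splitter, assuming RFC 4180 style quoting (the name parseCSVLine is illustrative; the converter's own parseCSVLineFast is not shown in this guide):

```javascript
// Hypothetical field splitter for one logical CSV line (quotes already balanced).
function parseCSVLine(line, delimiter = ',') {
  const fields = [];
  let field = '';
  let inQuotes = false;

  for (let i = 0; i < line.length; i++) {
    const c = line[i];
    if (inQuotes) {
      if (c === '"' && line[i + 1] === '"') {
        field += '"'; // "" inside quotes is a literal quote
        i++;
      } else if (c === '"') {
        inQuotes = false; // Closing quote
      } else {
        field += c; // Delimiters and newlines are data while quoted
      }
    } else if (c === '"') {
      inQuotes = true; // Opening quote
    } else if (c === delimiter) {
      fields.push(field);
      field = '';
    } else {
      field += c;
    }
  }
  fields.push(field); // Last field has no trailing delimiter
  return fields;
}

console.log(parseCSVLine('1,"He said ""Hi""","a,b"')); // [ '1', 'He said "Hi"', 'a,b' ]
```

A naive line.split(',') would return four fields for the same input and leave the quote characters in place.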

2. Flattening Nested JSON

Input (nested):

{
  "id": 1,
  "user": {
    "name": "John",
    "email": "[email protected]"
  },
  "metadata": {
    "created": "2024-01-01"
  }
}

Output (flattened for CSV):

id,user.name,user.email,metadata.created
1,John,[email protected],2024-01-01

Recursive flattening algorithm:

function flattenObject(obj, prefix = '') {
  const flattened = {};
  
  for (const key in obj) {
    const val = obj[key];
    const newKey = prefix ? `${prefix}.${key}` : key;
    
    if (val && typeof val === 'object' && !Array.isArray(val)) {
      Object.assign(flattened, flattenObject(val, newKey));
    } else if (Array.isArray(val)) {
      // Stringify object elements so they don't collapse to "[object Object]"
      flattened[newKey] = val
        .map(v => (v && typeof v === 'object' ? JSON.stringify(v) : v))
        .join(', ');
    } else {
      flattened[newKey] = val; // null/undefined become '' later, in the CSV escape step
    }
  }
  
  return flattened;
}

Handles:

Nested objects (unlimited depth)
Arrays (joins with comma-space)
null/undefined (converts to empty string)
Mixed types (stringify objects, preserve primitives)

3. Auto-Header Detection

Problem: JSON objects don't guarantee consistent keys:

[
  {"id": 1, "name": "John", "email": "[email protected]"},
  {"id": 2, "name": "Jane", "phone": "555-0001"},
  {"id": 3, "name": "Bob", "email": "[email protected]", "company": "Acme"}
]

Solution: Sample first N rows, collect all unique keys:

const headerSet = new Set();
const sampleSize = Math.min(100, data.length);

for (let i = 0; i < sampleSize; i++) {
  const obj = data[i];
  const flattened = flattenObject(obj);
  Object.keys(flattened).forEach(key => headerSet.add(key));
}

const headers = Array.from(headerSet);

Result: CSV contains all columns seen in first 100 rows
Trade-off: Misses columns that only appear after row 100 (rare in practice)

4. Escape Handling

CSV requires escaping:

Commas: Hello, World → "Hello, World"
Quotes: He said "Hi" → "He said ""Hi"""
Newlines: Line 1\nLine 2 → "Line 1\nLine 2"

Inline escape function:

function escapeCSV(val, delimiter) {
  const str = val == null ? '' : String(val);
  
  if (str.indexOf(delimiter) !== -1 || 
      str.indexOf('"') !== -1 || 
      str.indexOf('\n') !== -1 ||
      str.indexOf('\r') !== -1) { // Carriage returns also require quoting
    return '"' + str.replace(/"/g, '""') + '"';
  }
  
  return str;
}

Performance: 10M+ escapes/sec (when needed)
Optimization: Early return for values not requiring escaping

Comparison: Browser vs Traditional Methods

Method	100K Rows	1M Rows	10M Rows	Memory	Privacy
Browser Converter (CSV→JSON)	0.19s	1.9s	19s	~50 MB batch	✓ Local
Python pandas	2.5s	25s	250s	2 GB	✓ Local
Node.js streaming	0.8s	8s	80s	100 MB	✓ Local
Excel (manual)	15s	Crashes	N/A	4 GB	✓ Local
CloudConvert API	30s	180s	900s	N/A	✗ Upload
Convertio	45s	300s	N/A	N/A	✗ Upload

Browser converter wins on:

Speed (2–13× faster than Python)
Memory efficiency (40× less than pandas)
Accessibility (no installation required)
Privacy (zero uploads)
Cross-platform (works on any OS with a browser)

Use Cases: When to Use Browser-Based Conversion

1. API Response Processing

Scenario: Export 100K user records from REST API as JSON, need CSV for analysis

Traditional approach:

curl https://api.example.com/users > users.json
python -c "import pandas; pandas.read_json('users.json').to_csv('users.csv')"

Time: 5 minutes (including pandas install if first time)

Browser approach:

Save API response as users.json
Open it in the browser converter (the file stays on your device)
Select JSON → CSV
Download result

Time: 30 seconds
Benefit: No Python/pandas required, works on any computer

2. Database Export Migration

Scenario: Migrate 5M rows from PostgreSQL (CSV export) to MongoDB (requires JSON)

Traditional approach:

// Node.js script
const csv = require('csv-parser');
const fs = require('fs');

fs.createReadStream('export.csv')
  .pipe(csv())
  .pipe(jsonTransform())
  .pipe(fs.createWriteStream('import.json'));

Issues:

Requires Node.js + dependencies
Script must handle encoding, escaping, edge cases
No progress indicator
Debugging takes hours when it breaks

Browser approach:

Open the 5M row CSV (300 MB file) in the browser converter
Select CSV → JSON
Download in 22 seconds
Import to MongoDB

Benefit: Zero code, handles edge cases automatically, shows progress

3. Excel Limitations Workaround

Scenario: Client sends 1.5M row Excel file, need to analyze in Python

Problem: pandas.read_excel() is extremely slow on large XLSX files

Solution:

Convert XLSX → CSV in browser (15 seconds)
Clean data if needed
Load CSV in pandas (2 seconds)

Total time: 17 seconds
Alternative: pandas.read_excel() takes 180+ seconds on 1.5M rows

4. Privacy-Compliant Processing

Scenario: Healthcare provider needs to convert patient data (HIPAA)

Constraint: Cannot upload PHI (Protected Health Information) to third-party servers

Traditional approach:

Deploy on-premise conversion server
Maintain infrastructure
Security audits required

Browser approach:

All processing client-side
Zero data transmission
No infrastructure needed
Built-in compliance

Cost savings: $50K–$200K annually (infrastructure + compliance overhead)

Privacy & Compliance Architecture

Why Client-Side Processing Matters

Data never leaves your device:

// File selected by user
<input type="file" onChange={handleFile} />

// Processed in Web Worker (browser sandbox)
worker.postMessage({ file });

// Downloaded to user's device
const blob = new Blob([result]);
const url = URL.createObjectURL(blob);
downloadLink.href = url;

No network transmission at any stage.

Compliance Benefits

GDPR (EU):

Article 28: No processor agreement needed (no data processing by third party)
Article 32: Technical measures maintained (client-side encryption)
Article 44: No cross-border transfer (data stays local)

HIPAA (US Healthcare):

No BAA (Business Associate Agreement) required
PHI never transmitted or stored externally
Audit logs on user's device only
Reference: HHS HIPAA Security Rule

SOC 2:

No vendor security assessment needed
Data handling controls at user's discretion
Zero third-party data access

ISO 27001:

Reduces attack surface (no data in transit)
Simplifies risk assessment
No external data storage to audit

Financial impact:

Compliance overhead: $0 (vs $50K–$200K for vendor assessments)
Data breach risk: Eliminated for conversion step
Audit scope: Reduced (one less vendor to assess)

Performance Optimization Techniques

1. Compiled Row Processors

Before optimization:

function toCSV(obj, headers, delimiter) {
  return headers
    .map(h => escape(obj[h], delimiter))
    .join(delimiter) + '\n';
}

Performance: 150K rows/sec

After optimization (compiled):

const builder = new Function(`
  const delimiter = '${delimiter}';
  
  function escape(val) {
    const str = val == null ? '' : String(val);
    if (str.indexOf(delimiter) !== -1 || 
        str.indexOf('"') !== -1 || 
        str.indexOf('\\n') !== -1) {
      return '"' + str.replace(/"/g, '""') + '"';
    }
    return str;
  }
  
  return function(obj) {
    ${headers.map((h, i) => `
      let v${i} = obj['${h}'];
      if (v${i} === undefined || v${i} === null) v${i} = '';
      else if (Array.isArray(v${i})) v${i} = v${i}.join(', ');
    `).join('\n')}
    
    return ${headers.map((_, i) => `escape(v${i})`).join(' + delimiter + ')} + '\\n';
  }
`)();

Performance (pre-v3.1): ~220K rows/sec with compiled extractor (now removed for CSP compliance)

Why it works:

Eliminates .map() array operation
Inlines escape function per call
Removes dynamic property access in loop
Pre-computes string concatenation positions

2. OPFS Streaming Output (v3.1+) — Replaced Blob-Based ChunkWriter

Problem (string concatenation):

let csvText = '';
for (const row of data) {
  csvText += toCSV(row); // O(n²) string copies
}

Memory: Grows with file size, crashes on large files

Intermediate solution (pre-v3.1 ChunkWriter):

// Pre-v3.1: flush 2MB Uint8Array buffers into a growing Blob array
const writer = new ChunkWriter(2 * 1024 * 1024); // 2 MB buffer
for (const row of data) {
  writer.write(toCSV(row));
}
// Blob chunks accumulate in JS heap proportional to output size

Memory: O(n) but still accumulates in JS heap

Current solution (v3.1+ OPFS StreamWriter):

// v3.1+: output writes directly to OPFS (browser-private storage)
const writer = new StreamWriter('text/csv');
await writer.init(); // Creates OPFS sync access handle
for (const row of data) {
  writer.write(toCSV(row)); // Writes to disk, not heap
}
const outputRef = await writer.finalize(); // Returns OPFS File handle

Memory: O(1) heap regardless of output size — heap stays flat at ~25 MB for a 2.4 GB output

3. Streaming vs Buffering Trade-offs

Full buffer approach:

const data = await file.text(); // Load entire file
const result = convert(data);   // Process all at once
download(result);               // Output

Pros: Simple code
Cons: Memory = 3–5× file size, crashes on large files

Streaming approach:

for await (const chunk of file.stream()) {
  const processed = convert(chunk);
  output.write(processed);
}

Pros: Constant memory, handles unlimited file size
Cons: More complex code, requires careful state management

Hybrid (optimal):

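const BATCH_SIZE = 25000;
let batch = [];

for await (const line of streamLines(file)) {
  batch.push(parseLine(line));
  
  if (batch.length >= BATCH_SIZE) {
    output.write(convertBatch(batch));
    batch = []; // Free memory
  }
}

// Write the final partial batch, otherwise the last rows are lost
if (batch.length > 0) output.write(convertBatch(batch));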

Pros: Balance between simplicity and memory efficiency
Result: ~95K rows/sec (CSV→JSON) with ~25 MB working memory (v3.2, OPFS output)

Common Conversion Patterns

Pattern 1: CSV → JSON for API Consumption

Input CSV:

id,name,email,created_at
1,John Doe,[email protected],2024-01-01
2,Jane Smith,[email protected],2024-01-02

Output JSON (array of objects):

[
  {
    "id": 1,
    "name": "John Doe",
    "email": "[email protected]",
    "created_at": "2024-01-01"
  },
  {
    "id": 2,
    "name": "Jane Smith",
    "email": "[email protected]",
    "created_at": "2024-01-02"
  }
]

Type coercion options:

Parse numbers: "1" → 1
Parse booleans: "true" → true
Parse nulls: "null" → null
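
A minimal sketch of how these coercion options could be applied to a single CSV cell. The function name coerceValue and the rule that values with leading zeros stay strings are assumptions made for illustration, not documented converter behavior:

```javascript
// Hypothetical per-cell coercion: booleans, null, and plain decimal numbers.
function coerceValue(raw) {
  if (raw === 'true') return true;
  if (raw === 'false') return false;
  if (raw === 'null') return null;
  // Plain decimals only; "007" and "1e5" stay strings to protect IDs and ZIP codes
  if (/^-?(0|[1-9]\d*)(\.\d+)?$/.test(raw)) return Number(raw);
  return raw;
}

console.log(coerceValue('1'));          // 1
console.log(coerceValue('true'));       // true
console.log(coerceValue('null'));       // null
console.log(coerceValue('007'));        // 007 (still a string)
console.log(coerceValue('2024-01-01')); // 2024-01-01 (still a string)
```

Integers longer than about 15 digits lose precision when converted to a JavaScript number, so long numeric identifiers are usually safer left as strings.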

Pattern 2: JSON → CSV for Excel Analysis

Input JSON (nested):

[
  {
    "user_id": 1,
    "profile": {
      "name": "John",
      "email": "[email protected]"
    },
    "stats": {
      "orders": 5,
      "revenue": 432.50
    }
  }
]

Output CSV (flattened):

user_id,profile.name,profile.email,stats.orders,stats.revenue
1,John,[email protected],5,432.50

Flattening preserves all data in Excel-compatible format.

Pattern 3: Excel → JSON for Database Import

Input: Multi-sheet Excel with related data

Sheet 1 (Users):

id	name	email
1	John	[email protected]

Sheet 2 (Orders):

order_id	user_id	amount
101	1	99.99

Output JSON (separate files):

// users.json
[{"id": 1, "name": "John", "email": "[email protected]"}]

// orders.json
[{"order_id": 101, "user_id": 1, "amount": 99.99}]

Import to database with foreign key relationships preserved.

Advanced Features

1. Nested JSON Handling

Option: Flatten nested objects

Input:

{"user": {"address": {"city": "Boston"}}}

Output:

user.address.city
Boston

Option: Keep nested structure

Input (same):

{"user": {"address": {"city": "Boston"}}}

Output:

user
"{""address"":{""city"":""Boston""}}"

2. Array Value Handling

Join arrays with delimiter:

{"tags": ["javascript", "node", "react"]}

→

tags
"javascript, node, react"

Expand arrays to separate rows:

{"id": 1, "tags": ["a", "b"]}

→

id,tag
1,a
1,b
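
The row-expansion option can be sketched as follows. The helper name expandArrayRows and the choice to keep records with empty arrays as a single row are assumptions for illustration:

```javascript
// Hypothetical helper: one output row per array element; other fields are repeated.
function expandArrayRows(obj, arrayKey, itemKey) {
  const { [arrayKey]: items, ...rest } = obj;
  if (!Array.isArray(items) || items.length === 0) {
    return [{ ...rest, [itemKey]: '' }]; // Keep the record even when it has no items
  }
  return items.map(item => ({ ...rest, [itemKey]: item }));
}

const expanded = expandArrayRows({ id: 1, tags: ['a', 'b'] }, 'tags', 'tag');
console.log(JSON.stringify(expanded)); // [{"id":1,"tag":"a"},{"id":1,"tag":"b"}]
```

Expansion multiplies the row count, so the output of a 10M-record file can be much larger than its input.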

3. Delimiter Detection

Auto-detect CSV delimiter from file content:

Comma: Standard CSV
Semicolon: European Excel exports
Tab: TSV files
Pipe: Database exports

Detection algorithm:

function detectDelimiter(sample) {
  const delimiters = [',', ';', '\t', '|'];
  const counts = delimiters.map(d => 
    sample.split('\n')[0].split(d).length
  );
  
  return delimiters[counts.indexOf(Math.max(...counts))];
}

4. BOM (Byte Order Mark) Handling

Excel requires BOM for UTF-8 CSV:

const BOM = new Uint8Array([0xEF, 0xBB, 0xBF]);
const csvBlob = new Blob([BOM, csvData], {
  type: 'text/csv;charset=utf-8;'
});

Without BOM: International characters (é, ñ, 中) display incorrectly in Excel
With BOM: Perfect character rendering

Troubleshooting Common Issues

Issue 1: "Out of Memory" Errors

Cause: File too large for available RAM

Solutions:

Split file first
Use JSONL instead of JSON (streaming-friendly)
Convert in chunks (100K rows at a time)
Close other browser tabs/applications

Memory requirements (v3.2 worker, OPFS streaming):

CSV → JSON: ~25 MB working memory (output to OPFS, not heap)
JSON → CSV: ~15 MB working memory (streaming tokenizer, both input and output O(1))
CSV → Excel: constant working memory, streaming write (output accumulates as ArrayBuffer in worker); ~15s per 1M rows; auto-splits at 1,048,576 rows per sheet

Issue 2: Special Characters Corrupted

Cause: Encoding mismatch

Solutions:

Ensure UTF-8 encoding on input
Enable BOM for Excel compatibility
Check source file encoding (Windows-1252, Latin1)

Detection:

// Check for BOM
const header = await file.slice(0, 3).arrayBuffer();
const bytes = new Uint8Array(header);
const hasBOM = bytes[0] === 0xEF && 
               bytes[1] === 0xBB && 
               bytes[2] === 0xBF;
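
The BOM check above only confirms UTF-8 when a BOM happens to be present. A complementary sketch, assuming the goal is to tell UTF-8 apart from legacy encodings such as Windows-1252, is to attempt a strict decode and see whether it fails (the function name isValidUtf8 is illustrative):

```javascript
// Returns true when the bytes decode as strict UTF-8.
// With { fatal: true } the decoder throws a TypeError on any invalid sequence
// instead of silently substituting replacement characters.
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes);
    return true;
  } catch {
    return false;
  }
}

const utf8Bytes = new TextEncoder().encode('café');
const latin1Bytes = new Uint8Array([0x63, 0x61, 0x66, 0xE9, 0x20, 0x61]); // "café a" in Latin-1
console.log(isValidUtf8(utf8Bytes), isValidUtf8(latin1Bytes)); // true false
```

When checking only a sample of a large file, note that a slice which ends in the middle of a multi-byte character is also reported as invalid, so a failed check on a sample is a hint rather than proof.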

Issue 3: Excel Opens CSV with Wrong Columns

Cause: Delimiter mismatch (Excel expects system locale)

Solutions:

US/UK: Use comma delimiter
Europe: Use semicolon delimiter
Save as .tsv (tab-delimited) for universal compatibility

Issue 4: JSON Parse Errors

Cause: Invalid JSON syntax in source file

Common errors:

Single quotes instead of double quotes
Trailing commas in objects
Unescaped control characters
Byte Order Mark in JSON

Validation:

try {
  JSON.parse(await file.text());
} catch (e) {
  console.error('Invalid JSON:', e.message);
  // Attempt to fix common issues
}
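
Of the errors listed above, a leading byte order mark is the simplest to repair. Whether the BOM survives into the string depends on how the bytes were decoded (Node's fs.readFileSync with 'utf8', for example, keeps it), so a defensive strip before parsing is cheap. A minimal sketch, with stripBOM as an illustrative name:

```javascript
// Removes a leading U+FEFF (BOM) so JSON.parse does not reject the text.
function stripBOM(text) {
  return text.charCodeAt(0) === 0xFEFF ? text.slice(1) : text;
}

const withBOM = '\uFEFF{"id": 1}';
const parsedRecord = JSON.parse(stripBOM(withBOM));
console.log(parsedRecord.id); // 1
```

The other listed errors (single quotes, trailing commas, unescaped control characters) change the meaning of the text and are better reported to the user than repaired silently.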

Integration Patterns

Pattern 1: API Development Workflow

Scenario: Frontend expects JSON, backend exports CSV

# Backend exports
psql -c "COPY users TO '/tmp/users.csv' CSV HEADER"

# Convert to JSON in browser

# Frontend consumes
fetch('users.json')
  .then(r => r.json())
  .then(data => render(data))

Benefit: No backend conversion logic needed

Pattern 2: Data Pipeline Integration

ETL flow:

Extract: Database → CSV export
Transform: CSV → JSON (browser converter)
Load: Upload JSON to API

Advantages:

No ETL server infrastructure
No Python/Node.js dependencies
Works on any workstation

Pattern 3: Excel Power Users

Daily workflow:

Receive client data as Excel
Convert to CSV instantly
Process with command-line tools
Convert back to Excel for delivery

Time saved: 15–20 minutes daily (manual copy/paste eliminated)

Cost Analysis: Browser vs Alternatives

Scenario: Monthly Data Processing (1M rows × 20 conversions)

Option 1: Browser Converter (Free)

Conversion cost: $0
Time: 40 minutes total (2 min per conversion)
Privacy: Complete (local processing)
Total cost: $0

Option 2: Cloud Conversion API

Service: CloudConvert Pro ($25/month)
API limits: 500 conversions/month
Upload time: 60 minutes total (3 min per conversion)
Total cost: $300/year
Privacy risk: Data uploaded to third party

Option 3: Python pandas Scripts

Development: 15 hours initial ($1,500)
Maintenance: 2 hours/month ($2,400/year)
Server costs: $0 (runs locally)
Total first year: $3,900
Annual ongoing: $2,400

Option 4: ETL Platform

Service: Talend, Informatica, etc.
Cost: $2,000–$10,000/year
Overkill for simple conversions
Total cost: $2,000–$10,000/year

Winner: Browser converter saves $300–$10,000 annually

Technical Specifications

Supported Formats

Input:

CSV (any delimiter)
TSV (tab-separated)
JSON (array of objects)
JSONL (newline-delimited JSON)
Excel (.xlsx, .xls)

Output:

CSV (configurable delimiter)
JSON (formatted or minified)
JSONL (streaming-friendly)
Excel (.xlsx)

Performance Characteristics

Metric	Value
Max file size	Unlimited (browser memory limit)
Max rows tested	10,000,000
CSV → JSONL throughput	~440,000 rows/sec
CSV → JSON throughput	~95,000 rows/sec
JSON → CSV throughput	~28,000 rows/sec
Excel → CSV throughput	~94,000 rows/sec
Memory usage	50 MB typical
Supported browsers	Chrome, Firefox, Safari, Edge

Browser Requirements

Chrome 90+ (recommended)
Firefox 88+
Safari 14+
Edge 90+

Features used:

Web Workers (background processing)
Streams API (file reading)
TextEncoder/TextDecoder (UTF-8 handling)
Blob/File API (file input and download generation)
Origin Private File System (streaming output, v3.1+)

Best Practices

1. File Size Management

Under 100 MB: Direct conversion works perfectly
100 MB – 1 GB: Close other tabs, conversion takes 10–60 seconds
Over 1 GB: Consider splitting first, or use JSONL format

2. Encoding Considerations

Always use UTF-8:

Set charset in editor before creating CSV
Enable BOM if opening in Excel
Test with international characters (é, ñ, 中)

3. Data Validation

Before conversion:

Check for consistent column counts
Verify header row is present
Scan for encoding issues
Test with small sample first

After conversion:

Verify row count matches (no data loss)
Spot-check special characters
Validate JSON structure if applicable
Test import into target system
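
The column-count check can be scripted for a small sample. The sketch below is an illustration under simplifying assumptions: the sample fits in memory as a string, and no quoted field contains a line break. `countFields` and `findInconsistentRows` are hypothetical names.

```javascript
// Count fields in one CSV line, honoring double-quoted fields.
// An escaped quote ("") toggles the state twice, so it nets out correctly.
function countFields(line, delimiter = ",") {
  let count = 1;
  let inQuotes = false;
  for (const ch of line) {
    if (ch === '"') inQuotes = !inQuotes;
    else if (ch === delimiter && !inQuotes) count++;
  }
  return count;
}

// Returns 1-based positions (among non-empty lines) of rows whose
// field count differs from the header row's.
function findInconsistentRows(csvText, delimiter = ",") {
  const lines = csvText.split(/\r?\n/).filter((l) => l.length > 0);
  if (lines.length === 0) return [];
  const expected = countFields(lines[0], delimiter);
  const bad = [];
  lines.forEach((line, i) => {
    if (countFields(line, delimiter) !== expected) bad.push(i + 1);
  });
  return bad;
}

console.log(findInconsistentRows('id,name\n1,"Smith, J"\n2,Bob,extra\n')); // [ 3 ]
```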

4. Privacy Considerations

For sensitive data:

Use incognito/private browsing (auto-clear history)
Close browser after conversion (clear memory)
Verify network tab shows zero uploads
Consider air-gapped machine for classified data

Benchmarking Methodology

Test Environment

Hardware:

MacBook Pro M1 (8-core, 16 GB RAM)
Chrome 120.0.6099.109

Test files:

Generated with controlled data
Consistent column counts
No null values (every field populated, the worst case for throughput)
UTF-8 encoding

Measurement:

const start = performance.now();
await convertFile(file, options);
const elapsed = performance.now() - start;
const rowsPerSec = (rowCount / elapsed) * 1000;

Reproducibility

Generate test data:

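// Hypothetical helper (not in the original snippet): random ISO date in 2024
const randomDate = () =>
  new Date(Date.UTC(2024, 0, 1) + Math.random() * 366 * 864e5).toISOString().slice(0, 10);

// 1M row CSV
const rows = Array.from({length: 1000000}, (_, i) =>
  `${i},User ${i},user${i}@example.com,${randomDate()}`
);
const csv = 'id,name,email,created_at\n' + rows.join('\n');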

Run benchmark:

Upload generated file
Click Convert
Record processing time from UI
Calculate rows/sec

Verify results:

Check output row count matches input
Spot-check data integrity
Confirm file size is reasonable

Real-World Success Stories

Case Study 1: E-commerce Analytics

Company: 50-person online retailer
Challenge: Daily sales exports (200K rows) from Shopify as CSV, needed in MongoDB (JSON)

Before:

Manual process: 30 minutes daily
Node.js script (unmaintained, broke on encoding issues)
Developer time to fix: 2 hours/month

After:

Browser conversion: 2 minutes daily
Zero maintenance
Works on any team member's computer

Savings: 9 hours/month, $900/month in developer time

Case Study 2: Healthcare Data Migration

Organization: Regional hospital network
Challenge: Migrate 5M patient records from legacy system (CSV) to new EHR (requires JSON)

Constraints:

HIPAA compliance (no data uploads)
Limited IT budget
Tight timeline (3 weeks)

Solution:

Browser-based conversion on air-gapped workstation
Processing: 5M rows in 23 seconds per file
Total migration time: 4 hours (including validation)

Result:

Zero compliance risk
$0 additional software costs
Completed 2 weeks ahead of schedule

Case Study 3: Financial Services

Firm: Hedge fund analytics team
Challenge: Convert trading data (1M+ rows daily) between formats for different analysis tools

Before:

Python scripts (5 different scripts)
Maintenance burden: 3 hours/week
Frequent breaks on edge cases

After:

Single browser tool handles all conversions
Zero maintenance
Handles edge cases automatically

Impact:

12 hours/month saved
Reduced dependency on one developer
Faster onboarding for new analysts

Case Study 4: Marketing Automation Platform

Company: SaaS marketing platform (120 employees)
Challenge: Customer data exports (1.2M rows/hour) from database to various third-party integrations

Before:

AWS Lambda CSV→JSON pipeline
Cost: $180/month in Lambda + data transfer
Processing time: 14 minutes per export
Occasional timeout failures requiring reruns

After:

Browser-based conversion on analyst workstations
Processing time: 3 minutes per export
Zero infrastructure costs
100% success rate

Result:

Savings: $2,160/year ($180/month eliminated)
Time savings: ~79% less processing time (14 minutes down to 3)
Improved reliability: No timeout failures
Better compliance: Customer data stays local

The Architecture Philosophy

Why Browser-Based Processing Wins

1. Zero Installation Friction

No Python/Node.js required
No dependency management
No version conflicts
Works on locked-down corporate machines

2. Universal Accessibility

Windows, Mac, Linux identical experience
No IT approval needed
No license management
Instant availability

3. Privacy by Architecture

Impossible to upload data (no server-side code)
No vendor security audits required
No data retention policies to manage
Complete user control

4. Performance at Scale

Multi-core CPU utilization via Web Workers
Memory-efficient streaming
Compiled hot paths
Competitive with native code

5. Future-Proof

Browsers improve continuously
WebAssembly can accelerate hot paths further
GPU acceleration possible
No deployment pipeline needed

What This Won't Do

Browser-based format conversion excels at CSV↔JSON↔Excel transformation, but it's not a complete ETL platform. Here's what this approach doesn't cover:

Not a Replacement For:

Complex ETL pipelines - No scheduled jobs, data lineage tracking, or orchestration
Database migration tools - Can't directly load to PostgreSQL, MySQL, MongoDB without intermediate steps
Data transformation platforms - No complex joins, aggregations, or multi-source merges
Schema validation services - Converts formats but doesn't enforce business rules or constraints
Data warehousing - Not designed for ongoing analytics, BI dashboards, or historical tracking

Technical Limitations:

RAM constraints - Limited by browser memory (typically 1-4GB per tab)
No incremental processing - Full file re-conversion needed for any changes
Single file at a time - No batch queue for converting 100+ files automatically
Browser-dependent - Performance varies by browser, OS, and hardware
No custom transformations - Can't add calculated columns, complex logic during conversion

Privacy & Security Caveats:

Browser security dependent - Relies on browser sandbox (keep browser updated)
Local malware risk - Workstation compromise still exposes data
No audit trail - Can't prove what was converted, when, or by whom
Cache considerations - Browser cache may retain JavaScript code (not data files)

Data Type Limitations:

Excel formulas - Converted to values only, formula logic not preserved
Pivot tables - Lost during conversion to CSV/JSON
Macros/VBA - Not supported or preserved
Embedded objects - Charts, images removed in CSV/JSON output
Custom formatting - Conditional formatting, cell colors not preserved

Scale Considerations:

Sweet spot: 100K-10M rows - Beyond this, consider database solutions
File size limit: ~1-4GB - Larger files may fail depending on available RAM
Complex nested JSON - Deep nesting (10+ levels) may slow processing significantly

Best Use Cases: This tool excels at one-time or recurring format conversion for files that are too large for Excel, too sensitive for cloud tools, and need standard CSV/JSON/Excel output. For ongoing data pipelines, schema enforcement, or complex transformations, use dedicated ETL platforms after initial conversion.

Frequently Asked Questions

Can a browser really handle 10 million rows?

Yes. v3.2 uses OPFS (Origin Private File System) streaming output: data is written directly to browser-private storage rather than accumulating in the JS heap, so the heap stays flat at ~25 MB regardless of output size. CSV→JSON processes 10M rows at ~95K rows/sec (Node harness, May 2026). The bottleneck is CPU (data processing), not memory.
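
The idea of writing output to OPFS chunk by chunk can be sketched with the standard File System API calls (`getFileHandle`, `createWritable`, `write`, `close`). This is an assumption-laden illustration, not the converter's own writer: `writeChunksToOpfs` is a hypothetical name, error handling is omitted, and browser support for `createWritable()` on OPFS files varies, so check compatibility before relying on it.

```javascript
// Sketch: stream text chunks into an OPFS file so the full output never
// sits in the JS heap at once.
// `rootDir` is a FileSystemDirectoryHandle, for example the result of
// `await navigator.storage.getDirectory()` in a browser.
// `chunks` is any iterable or async iterable of strings.
async function writeChunksToOpfs(rootDir, fileName, chunks) {
  const fileHandle = await rootDir.getFileHandle(fileName, { create: true });
  const writable = await fileHandle.createWritable();
  const encoder = new TextEncoder();
  let bytesWritten = 0;
  for await (const chunk of chunks) {
    const bytes = encoder.encode(chunk);
    await writable.write(bytes); // only the current chunk is held in memory
    bytesWritten += bytes.byteLength;
  }
  await writable.close(); // commits the data to the file
  return bytesWritten;
}
```

In a browser this might be called as `await writeChunksToOpfs(await navigator.storage.getDirectory(), "output.jsonl", chunks)`, where `chunks` is produced by the conversion step.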

What are the memory limits?

Typical limits by browser:

Chrome: ~4GB per tab (configurable)
Firefox: ~3GB per tab
Safari: ~2GB per tab
Edge: ~4GB per tab

In practice, v3.2 uses ~25 MB working memory for 10M rows (OPFS streaming output), well under all browser limits. Files up to 1GB work reliably on machines with 8GB+ RAM.

Is this GDPR/HIPAA-safe?

Yes, by architecture. All processing happens client-side in your browser. No data ever transmits to our servers or any third party. This means:

No Article 28 processor agreements needed (GDPR)
No BAA required (HIPAA)
No cross-border data transfer (GDPR Article 44)
No vendor security audits required (SOC 2)

Your data never leaves your device.

How do I convert JSONL instead of JSON?

JSONL (newline-delimited JSON) is fully supported and actually performs better than standard JSON because it's streaming-friendly. Select "JSONL" as the input format and the converter will process line-by-line with lower memory usage than standard JSON arrays.
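
Line-by-line processing is what keeps JSONL memory use low: only the current line (plus any partial line left over from the previous chunk) has to be held at once. A minimal sketch of that pattern follows; `createJsonlReader` is a hypothetical name and this is not the converter's actual parser.

```javascript
// Sketch of an incremental JSONL reader: feed it text chunks of any size and
// it emits one parsed object per complete line, carrying partial lines over.
function createJsonlReader(onRecord) {
  let buffer = "";
  const emit = (line) => {
    const trimmed = line.trim(); // also drops a trailing \r from CRLF files
    if (trimmed) onRecord(JSON.parse(trimmed));
  };
  return {
    push(chunk) {
      buffer += chunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // last element is an incomplete line (or "")
      lines.forEach(emit);
    },
    end() {
      emit(buffer); // flush a final line that has no trailing newline
      buffer = "";
    },
  };
}

// Usage: chunk boundaries may fall in the middle of a record.
const records = [];
const reader = createJsonlReader((r) => records.push(r));
reader.push('{"id":1}\n{"id"');
reader.push(':2}\n{"id":3}');
reader.end();
console.log(records.length); // 3
```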

What if my file has encoding issues?

The converter auto-detects UTF-8, UTF-16, and Windows-1252 encodings. For Excel compatibility, enable "Add BOM" (Byte Order Mark) which ensures international characters display correctly. If you see garbled text, validate encoding first.
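
One way to validate encoding yourself is to inspect the file's first bytes for a byte order mark. The sketch below covers only BOM sniffing and is not the converter's detection logic; files without a BOM (including Windows-1252) need heuristics that are out of scope here. `detectBom` and `decodeWithBom` are hypothetical names.

```javascript
// Returns the encoding implied by a leading byte order mark, or null if none.
function detectBom(bytes) {
  if (bytes[0] === 0xef && bytes[1] === 0xbb && bytes[2] === 0xbf) return "utf-8";
  if (bytes[0] === 0xff && bytes[1] === 0xfe) return "utf-16le";
  if (bytes[0] === 0xfe && bytes[1] === 0xff) return "utf-16be";
  return null;
}

// Decode using the BOM's encoding, falling back to a caller-supplied default.
// TextDecoder strips a matching BOM from the decoded text by default.
function decodeWithBom(bytes, fallback = "utf-8") {
  return new TextDecoder(detectBom(bytes) ?? fallback).decode(bytes);
}

const utf16Sample = new Uint8Array([0xff, 0xfe, 0x68, 0x00, 0x69, 0x00]);
console.log(detectBom(utf16Sample), decodeWithBom(utf16Sample)); // utf-16le hi
```

For a file picked in the browser, the first bytes can be read with `new Uint8Array(await file.slice(0, 4).arrayBuffer())`.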

Can I convert Excel files with formulas?

Excel formulas are evaluated and converted to their values. The formula logic itself isn't preserved in CSV/JSON output (this is a limitation of the target formats, not the converter). If you need formula preservation, keep a copy of the original .xlsx file. For Excel-to-JSON conversion specifically, including multi-sheet workbooks and data type preservation, the Excel to JSON Converter handles these conversions with correct type mapping for dates and booleans.

Why is JSON → CSV faster than CSV → Excel?

Excel files (.xlsx) are ZIP archives containing XML with styling, formatting, and metadata. This requires:

ZIP compression (CPU intensive)
XML generation (more complex than JSON)
Format overhead (much larger than CSV)

CSV is plain text with minimal overhead. v3.2 benchmarks (Node harness, May 2026): CSV→JSONL ~440K rows/sec, CSV→JSON ~95K rows/sec, JSON→CSV ~28K rows/sec (two-pass streaming tokenizer), Excel→CSV ~94K rows/sec.

What happens if my browser tab crashes?

Because processing happens entirely client-side, a browser crash means you'll need to restart the conversion. For critical workflows processing 10M+ rows, we recommend:

Close other browser tabs
Use Incognito/Private mode (starts fresh)
Disable browser extensions temporarily
For files over 5GB, split first

Can I automate this with scripts?

Not directly (browser-based tools require user interaction). For automation needs:

Use Node.js with conversion libraries
Use Python pandas for smaller files (<1M rows)
Use conversion patterns as templates for your own scripts

The browser version excels at one-off conversions without infrastructure setup.
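
For the Node.js route, a script can reuse the same line-oriented pattern. The sketch below is a starting template under stated assumptions, not a drop-in replacement for the browser tool: `splitCsvLine` and `csvLineToJsonl` are hypothetical names, all values stay strings, and quoted fields that span multiple lines are not handled.

```javascript
// Split one CSV line into fields, honoring double-quoted fields
// ("" inside quotes is a literal quote).
function splitCsvLine(line, delimiter = ",") {
  const fields = [];
  let field = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') { field += '"'; i++; }
      else if (ch === '"') inQuotes = false;
      else field += ch;
    } else if (ch === '"') inQuotes = true;
    else if (ch === delimiter) { fields.push(field); field = ""; }
    else field += ch;
  }
  fields.push(field);
  return fields;
}

// Turn a header array plus one data line into a single JSONL line.
function csvLineToJsonl(headers, line) {
  const values = splitCsvLine(line);
  const record = Object.fromEntries(headers.map((h, i) => [h, values[i] ?? ""]));
  return JSON.stringify(record);
}

// Wiring it to files in Node.js (inside an async function of a CommonJS script):
//   const rl = require("node:readline").createInterface({
//     input: require("node:fs").createReadStream("input.csv"),
//     crlfDelay: Infinity,
//   });
//   let headers;
//   for await (const line of rl) {
//     if (!headers) headers = splitCsvLine(line);
//     else process.stdout.write(csvLineToJsonl(headers, line) + "\n");
//   }
```

Reading with `readline` keeps memory use proportional to one line rather than the whole file, which mirrors the streaming approach of the browser version.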

If you need strict JSON schema validation or CSV column type enforcement, use purpose-built validators first. This tool focuses on format conversion, not schema compliance.

Machines with 4GB RAM or less may struggle with files over 8GB. In these cases, split the file first or use a machine with more RAM.

If your Excel file uses pivot tables, macros, custom formatting, or merged cells, these features won't transfer to CSV/JSON. Save a copy before converting.

Excel Files Too Large: Row Limits, Crashes & Client-Side Solutions — why Excel can't open these files and the 7-tier workaround hierarchy for datasets that exceed the 1,048,576 row limit
CSV vs JSON vs Excel: Which Format for Your Business Data? — format decision framework to choose the right conversion target before you start processing
CSV vs Excel: When to Use Each for Business Data — practical guide to when CSV outperforms Excel and vice versa at scale

Conclusion

Converting 10 million rows between CSV, JSON, and Excel doesn't require cloud APIs, Python expertise, or expensive ETL platforms.

Browser-based streaming architecture delivers:

~440,000 rows/sec peak throughput (CSV → JSONL)
220,000 rows/sec sustained throughput (10M rows)
~25 MB working memory (OPFS streaming output, flat heap regardless of file size)
Zero uploads (complete privacy)
Zero cost (no subscriptions, no infrastructure)

The technical foundation:

Web Workers for parallel processing
Streaming APIs for memory efficiency
OPFS StreamWriter for flat-heap output
Two-pass streaming tokenizer for JSON→CSV input

Real-world impact:

Saves 12–20 hours per incident
Eliminates $300–$10,000 annual costs
Maintains GDPR/HIPAA compliance
Works on any modern computer

Stop paying for cloud conversion APIs that upload your data. Stop maintaining fragile Python scripts that break on edge cases. Stop waiting hours for Excel to process files it can't even fully open.

Modern browsers are production-grade data processing platforms.

Use them.

Format Converter handles CSV, JSON, and Excel at enterprise speed with zero setup.

