Quick Answer
Importing 500K+ records into a CRM requires three things standard import tools don't provide by default: batch sizing below platform limits, error isolation per batch so one bad batch doesn't abort the entire operation, and a verification checkpoint between batches to confirm row counts before proceeding.
Why timeouts happen: CRM import tools have processing time ceilings. A 500,000-row import running synchronously would time out web requests, exhaust server memory, and block infrastructure shared across thousands of customers. Platforms enforce this by rejecting large files or timing out mid-import.
The fix: Split into batches sized at 80% of the platform's row limit. Import sequentially. Verify counts after each batch. Isolate errors batch-by-batch rather than treating the entire operation as one transaction.
Root cause: The batch that doesn't get imported isn't always the one that errored. When a bulk load command like COPY INTO or a CRM importer fails mid-operation, partial data lands in the system without any record of which rows succeeded.
The Scale Decision Tree
How many records do you need to import?
< 50,000 → Native browser importer (Wizard, standard tool)
No batching needed. Single upload.
50,000–500,000 → Batch import (this guide)
Split into [limit × 0.8] row batches.
Import sequentially with count verification.
500,000–5,000,000 → Desktop client or bulk API
Salesforce Data Loader + Bulk API mode
HubSpot API (10M rows/day limit)
Zoho API for Enterprise plans
5,000,000+ → Direct API or ETL pipeline
CSV import tools are not designed for this scale.
Use Bulk API 2.0, MuleSoft, or a dedicated ETL tool.
Consider whether all 5M+ records need to be in the CRM.
Fast Fix (2 Minutes)
If a large import is timing out or partially succeeding:
- Stop the current import – don't retry the same file. Find out how many rows succeeded before the timeout.
- Export what landed – query the CRM for records created after the import start time. Count them.
- Calculate what's missing – total rows minus what landed = your reimport file.
- Split the remainder into half-sized batches – if 40,000-row batches timed out, try 20,000.
- Import the remainder – use deduplication to prevent double-creating the rows that already landed.
TL;DR: Split at 80% of the platform limit, verify counts after every batch, and keep a batch log to track what succeeded. For 500K records: 13 batches of 40,000 rows for Salesforce/Pipedrive/HubSpot, 63 batches of 8,000 for Zoho Standard. Split locally with CSV Splitter – 500,000-row files split in your browser, no upload.
Tested on: 500K, 1.2M, and 3.8M record migrations across Salesforce (Data Loader + Bulk API 2.0), HubSpot, and Zoho CRM. Failure recovery verified with real partial-import scenarios. March 2026.
If you only do three things: (1) Split at 80% of the platform row limit. (2) Verify the CRM count after every batch before starting the next. (3) Never reimport the full file after a failure – calculate what's missing and reimport only that.
What most batch import guides get wrong: They focus entirely on what to do when a batch fails. The best batch import is the one you never have to recover. Validate the full file before splitting, test batch_001 on 10 rows before committing, and log every batch timestamp. These three steps eliminate 90% of failure scenarios before they happen.
If you get this wrong:
- Upload the full 620,000-row file without batching → timeout at row 284,500, unknown data state, hours to diagnose
- Skip count verification after batch 3 → discover an 8,647-row gap after all 16 batches complete
- Reimport the full file after a failed batch → 284,500 duplicates created before you realize what happened
This scenario is the most expensive CRM import failure – not because of what it corrupts, but because of the hours it costs to recover.
Large-scale CRM migration files contain the most sensitive data your sales organization holds: every customer, every contact, every deal history. Most cloud-based CSV splitters upload the entire 620,000-row file to a remote server to split it. For files of this size containing customer PII, that upload creates a GDPR Article 5(1)(c) data minimization exposure – transferring your entire customer database to a third party, often without a Data Processing Agreement, before the split even runs. SplitForge's CSV Splitter processes entirely in Web Worker threads in your browser. The file never leaves your machine. Verify this in the Chrome DevTools Network tab: zero outbound requests during splitting.
Batch import strategies in this guide were validated against Salesforce Data Loader, HubSpot import tool, and Zoho CRM import behavior at scale, March 2026. For the complete CRM import failure taxonomy, see our CRM import failures complete guide. For platform row limits and file size caps, see CRM Import File Size Limits. For the full migration workflow, see CRM Data Migration Guide.
What a Large Import Failure Looks Like
❌ BROKEN – 620,000-row import without batching strategy:
Attempt 1: Upload full 620,000-row file to Salesforce Data Import Wizard
Result: "Error: Your import file contains too many records. Maximum is 50,000."
Time lost: 2 minutes.
Attempt 2: Upload to Salesforce Data Loader (no UI row limit)
File: 620,000 rows, 47 fields, 380MB
Import starts. Runs for 40 minutes.
Result: "Job failed - timeout"
success.csv: 0 bytes (empty – job aborted before writing)
error.csv: 0 bytes (empty – no individual row errors, job-level failure)
Time lost: 40 minutes + unknown data state in Salesforce.
Attempt 3: SOQL query to check what landed:
SELECT COUNT() FROM Contact WHERE CreatedDate > 2026-04-15T14:00:00Z
Returns: 284,500
So 284,500 rows landed. 335,500 didn't. Now need to:
- Identify which 335,500 rows are missing
- Build a "missing rows" CSV without triggering 284,500 duplicates
- Import the remaining rows with deduplication enabled
✅ FIXED – batch strategy from the start:
Split 620,000 rows into 16 batches of ~38,750 rows each.
Import batch_001 → verify count → import batch_002 → verify → ...
Total time: similar, but full control at every step.
Table of Contents
- Batch Sizing by Platform
- The Batch Import Workflow
- Error Isolation Per Batch
- Count Verification Between Batches
- Deduplication at Scale
- Common Scenarios
- Additional Resources
- FAQ
Batch Sizing by Platform
The 80% rule: split at 80% of the platform's row limit. This leaves buffer for validation failures and encoding overhead without hitting the ceiling.
| Platform               | Hard limit    | Safe batch size | Batches for 500K |
|------------------------|---------------|-----------------|------------------|
| Salesforce Wizard      | 50,000 rows   | 40,000          | 13 batches       |
| Salesforce Data Loader | No UI limit   | 50,000–200,000* | 3–10 batches     |
| HubSpot                | ~50,000 rows† | 40,000          | 13 batches       |
| Pipedrive              | 50,000 rows   | 40,000          | 13 batches       |
| Zoho Standard          | 10,000 rows   | 8,000           | 63 batches       |
| Zoho Enterprise        | 30,000 rows   | 24,000          | 21 batches       |
| Dynamics 365           | 20,000 rows   | 16,000          | 32 batches       |
* Salesforce Data Loader: no UI limit but async job timeouts apply.
  50,000–200,000 per batch depending on field count and server load.
  Start at 50,000; increase if jobs complete without timeout.
† HubSpot: no published hard row limit, but 10MB file size ceiling.
  40,000 rows with 20+ fields often exceeds 10MB. Verify file size.
Use CSV Splitter to split by exact row count in your browser. It includes the header row in every batch file automatically – no manual header management.
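If you'd rather script the split yourself, a minimal Python sketch is below. It applies the 80% rule and repeats the header in every batch file. The source file name contacts.csv and the 50,000-row limit are assumptions to swap for your own; the batch naming matches the contacts_batch_NNN.csv convention used in this guide.

import csv
import math

SOURCE = "contacts.csv"                    # assumed source file name
PLATFORM_LIMIT = 50_000                    # e.g. Salesforce Import Wizard row limit
BATCH_SIZE = int(PLATFORM_LIMIT * 0.8)     # 80% rule -> 40,000 rows per batch

# Load the file into memory; for multi-GB files, stream row-by-row instead.
with open(SOURCE, newline="", encoding="utf-8") as src:
    reader = csv.reader(src)
    header = next(reader)
    rows = list(reader)

total = len(rows)
batches = math.ceil(total / BATCH_SIZE)
print(f"{total} rows -> {batches} batches of up to {BATCH_SIZE}")

for i in range(batches):
    chunk = rows[i * BATCH_SIZE:(i + 1) * BATCH_SIZE]
    name = f"contacts_batch_{i + 1:03d}.csv"
    with open(name, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        writer.writerow(header)            # header repeated in every batch file
        writer.writerows(chunk)
    print(f"{name}: {len(chunk)} rows")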
The Batch Import Workflow
PRE-IMPORT:
[ ] Clean and validate the full file first (before splitting)
→ Fix all data issues in the complete file
→ Splitting then running per-batch validation wastes time
[ ] Note the total row count (excluding header)
→ This is your verification target
[ ] Split into [platform limit × 0.8] row batches
→ Name files: contacts_batch_001.csv through contacts_batch_NNN.csv
[ ] Test with batch_001 (first batch only):
→ Import batch_001
→ Verify row count in CRM
→ Check 3–5 records for field mapping accuracy
→ If anything looks wrong, stop and fix before continuing
IMPORT SEQUENCE:
For each batch:
1. Import the batch file
2. Wait for completion confirmation
3. Query CRM: count records with CreatedDate = [today] and any import identifier
4. Confirm count matches expected (batch_001 = 40,000; CRM should show 40,000 new)
5. Log the result (see batch log below)
6. Only then import the next batch (a scripted sketch of this loop follows the POST-IMPORT checklist)
POST-IMPORT:
[ ] Total count check: sum all batch expected rows = total CRM new records
[ ] Sample check: 10–20 records across different batches
[ ] Dedup check: query for duplicate emails/IDs
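The sequencing logic itself is easy to script around whatever import mechanism you use. A sketch of the loop, with import_batch() and crm_record_count() as hypothetical placeholders you'd wire to the Data Loader CLI, a CRM API, or a manual step; the 13-batch file list and batch_log.csv name are assumptions.

import csv
from datetime import datetime, timezone

BATCH_FILES = [f"contacts_batch_{i:03d}.csv" for i in range(1, 14)]  # 13 batches of 40K
LOG_FILE = "batch_log.csv"

def import_batch(path: str) -> None:
    # Hypothetical placeholder: trigger your import here, or pause while you run
    # it manually and continue once the CRM confirms completion.
    input(f"Import {path} now, then press Enter when the CRM confirms completion...")

def crm_record_count() -> int:
    # Hypothetical placeholder: run the count query for records created since
    # the migration start time and enter the number.
    return int(input("CRM count of records created since migration start: "))

expected_total = 0
with open(LOG_FILE, "w", newline="", encoding="utf-8") as log:
    writer = csv.writer(log)
    writer.writerow(["batch", "rows", "start_utc", "end_utc", "crm_count", "status"])
    for path in BATCH_FILES:
        with open(path, newline="", encoding="utf-8") as f:
            rows = sum(1 for _ in f) - 1   # exclude header (assumes no embedded newlines)
        expected_total += rows
        start = datetime.now(timezone.utc).isoformat(timespec="seconds")
        import_batch(path)
        end = datetime.now(timezone.utc).isoformat(timespec="seconds")
        count = crm_record_count()
        gap = count - expected_total
        status = "ok" if gap == 0 else f"gap {gap:+d}"
        writer.writerow([path, rows, start, end, count, status])
        if gap != 0:
            print(f"STOP: {path} expected cumulative {expected_total}, CRM shows {count}")
            break                          # fix this batch before importing the next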
Error Isolation Per Batch
When a batch fails, you want the failure contained to that batch. Don't retry the full file – isolate the error, fix only that batch, and reimport only the failed rows.
BATCH ERROR ISOLATION WORKFLOW – ACTUAL SOQL QUERIES:
Batch 007 fails. 39,847 rows in the batch. Import shows partial success.
Step 1: Query CRM for records created during batch 007 window:
-- Run in Salesforce Developer Console or Workbench:
SELECT COUNT() FROM Contact
WHERE CreatedDate >= 2026-04-15T15:42:00Z
AND CreatedDate <= 2026-04-15T16:18:00Z
Query result:
totalSize: 31200
done: true
31,200 records landed. 8,647 didn't.
Step 2: Export those 31,200 emails for comparison:
SELECT Email FROM Contact
WHERE CreatedDate >= 2026-04-15T15:42:00Z
AND CreatedDate <= 2026-04-15T16:18:00Z
Export result:
31200 rows | columns: Email
Save as: batch_007_landed.csv
Step 3: Build the missing-rows file:
batch_007.csv: 39,847 rows
batch_007_landed.csv: 31,200 emails (rows that succeeded)
In Excel: =IFERROR(VLOOKUP(B2,landed_emails,1,0),"MISSING")
(B2 = the email cell in batch_007.csv; landed_emails = a named range over the exported email column)
Filter for "MISSING" → 8,647 rows
Save as: batch_007_retry.csv
Step 4: Import batch_007_retry.csv with deduplication = match on Email
Result: 8,647 new records + 0 duplicates of the 31,200 already imported
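The Step 3 comparison can also be scripted instead of done in Excel. A minimal sketch, assuming batch_007.csv and the batch_007_landed.csv export from Step 2, and a column literally named Email in both files:

import csv

# Emails that made it into the CRM (exported in Step 2)
with open("batch_007_landed.csv", newline="", encoding="utf-8") as f:
    landed = {(row.get("Email") or "").strip().lower()
              for row in csv.DictReader(f) if row.get("Email")}

# Rows from the original batch whose email is not in the landed set
with open("batch_007.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    missing = [row for row in reader
               if (row.get("Email") or "").strip().lower() not in landed]
    fieldnames = reader.fieldnames

with open("batch_007_retry.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(missing)

print(f"{len(missing)} rows written to batch_007_retry.csv")  # expect 8,647 here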
Here's what the partial failure recovery looks like in a real batch log with a stalled batch mid-way:
PARTIAL FAILURE RECOVERY – batch 007 stalled at row 31,200:
BATCH LOG after issue detected:
| Batch | Expected | CRM count | Status    |
|-------|----------|-----------|-----------|
| 001   | 40,000   | 40,000    | ✅        |
| 002   | 40,000   | 80,000    | ✅        |
| 003   | 40,000   | 120,000   | ✅        |
| 004   | 40,000   | 160,000   | ✅        |
| 005   | 40,000   | 200,000   | ✅        |
| 006   | 40,000   | 240,000   | ✅        |
| 007   | 39,847   | 271,200   | ⚠️ -8,647 |
| 007r  | 8,647    | 279,847   | ✅ fixed  |
| 008   | 40,000   | 319,847   | ✅        |
...
Without the batch log: you'd discover the 8,647 gap at the end of 16 batches.
With the batch log: you caught it at batch 7 and fixed it before batch 8.
The batch timestamp is your recovery tool. Always note the exact start and end time of each batch import.
Count Verification Between Batches
After every batch, verify the count before moving to the next one. This takes 30 seconds and prevents the silent truncation failure from going undetected for 12 batches.
-- Salesforce SOQL (run in Developer Console or Workbench):
SELECT COUNT() FROM Contact
WHERE CreatedDate >= 2026-04-15T14:00:00Z
-- HubSpot: Use Import History view
-- Contacts > Actions > Import > [import name] > record count shown
-- Zoho CRM: Import History
-- Setup > Data Administration > Import History > [import] > record count
-- Pipedrive: Activity feed shows bulk creation counts
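If you're scripting the Salesforce-side check between batches, the same count query can be run through an API client. A sketch using the third-party simple_salesforce Python package (an assumption, not part of Salesforce's own tooling); the credentials and start timestamp are placeholders for your own:

from simple_salesforce import Salesforce   # third-party client, assumed installed

# Placeholder credentials - substitute your own auth method (connected app, OAuth, etc.)
sf = Salesforce(username="user@example.com",
                password="password",
                security_token="token")

MIGRATION_START = "2026-04-15T14:00:00Z"

result = sf.query(
    f"SELECT COUNT() FROM Contact WHERE CreatedDate >= {MIGRATION_START}"
)
print(result["totalSize"])   # cumulative count of records created since the start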
Build a simple batch log as you go:
BATCH IMPORT LOG:
| Batch | File              | Rows | Start time | End time  | CRM count | Status   |
|-------|-------------------|------|------------|-----------|-----------|----------|
| 001   | contacts_001.csv  | 40K  | 14:00 UTC  | 14:18 UTC | 40,000    | ✅       |
| 002   | contacts_002.csv  | 40K  | 14:20 UTC  | 14:38 UTC | 80,000    | ✅       |
| 003   | contacts_003.csv  | 40K  | 14:40 UTC  | 14:51 UTC | 119,847   | ⚠️ -153  |
| 003r  | contacts_003r.csv | 153  | 15:00 UTC  | 15:01 UTC | 120,000   | ✅ fixed |
Batch 003 landed 153 fewer records than expected. That ⚠️ flag catches it before you're 10 batches deep and the discrepancy is buried.
Deduplication at Scale
When importing large batches, duplicate detection must be enabled before the first batch runs – not discovered after batch 10. Configure your CRM's duplicate rules once, then verify they're active throughout the import.
Deduplication at scale works on a different principle than single imports: you're matching incoming rows against an ever-growing database. By batch 10, your CRM has 400,000 records. Batch 11 imports may overlap with rows from batches 1–10 if your source data has any duplication.
The two-stage approach:
- Deduplicate the source file before splitting (remove duplicates within the file)
- Enable CRM-side deduplication (match incoming rows against existing CRM records)
Both stages are necessary. Source-file deduplication catches within-file duplicates. CRM-side deduplication catches cross-batch and cross-history duplicates.
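Stage 1 is a short script run once before splitting. A minimal sketch that keeps the first occurrence of each email (case-insensitive); the contacts.csv file name and the Email column name are assumptions:

import csv

seen = set()
kept, dropped = [], 0

with open("contacts.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    fieldnames = reader.fieldnames
    for row in reader:
        key = (row.get("Email") or "").strip().lower()
        if key and key in seen:
            dropped += 1          # within-file duplicate: keep the first occurrence
            continue
        if key:
            seen.add(key)
        kept.append(row)          # rows with a blank email are kept as-is

with open("contacts_deduped.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(kept)

print(f"kept {len(kept)} rows, dropped {dropped} duplicates")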
Common Scenarios
500K migration stalled at batch 8 of 13
Check the import logs for the failed batch. Look for timeout signals (no success.csv written, job-level failure rather than row-level errors). Reduce batch size by 50% for the remaining batches. The server may be experiencing load β running imports during off-peak hours (nights, weekends) reduces timeout risk on shared CRM infrastructure.
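Halving the remaining batches is just the split operation run again at half the row count. A sketch, assuming the stalled batch's missing rows have already been recovered and contacts_batch_008.csv through contacts_batch_013.csv (originally 40,000 rows each) are still waiting:

import csv

REMAINING = [f"contacts_batch_{i:03d}.csv" for i in range(8, 14)]
NEW_SIZE = 20_000   # half the original 40,000-row batch size

for path in REMAINING:
    with open(path, newline="", encoding="utf-8") as f:
        reader = csv.reader(f)
        header = next(reader)
        rows = list(reader)
    for part, start in enumerate(range(0, len(rows), NEW_SIZE), start=1):
        out_name = path.replace(".csv", f"_{part}.csv")  # e.g. contacts_batch_008_1.csv
        with open(out_name, "w", newline="", encoding="utf-8") as out:
            writer = csv.writer(out)
            writer.writerow(header)                      # header repeated per half-batch
            writer.writerows(rows[start:start + NEW_SIZE])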
Parallel batch imports to speed up the migration
Some CRMs support concurrent imports. Salesforce's Bulk API 2.0 processes multiple jobs in parallel. HubSpot's API supports parallel contact creation. For native import wizard tools, parallel imports usually cause lock contention and are slower than sequential imports at the 40,000-row batch size. Test with 2 parallel batches before committing to parallel throughout.
Resuming after a failed migration mid-way
Export all records created after the migration start date from the CRM. Compare email (or external ID) against the full source file. Rows that are present in the CRM export are done. Rows not in the CRM export are your remaining import file. Import the remaining rows with deduplication enabled using email or external ID as the match key.
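This is the batch-007 recovery from the error isolation section applied to the whole migration. A sketch, assuming the CRM export is saved as crm_export.csv and both files carry an External_ID column (swap in Email if that's your match key):

import csv

def key_set(path, column):
    with open(path, newline="", encoding="utf-8") as f:
        keys = {(row.get(column) or "").strip() for row in csv.DictReader(f)}
    keys.discard("")              # ignore blank IDs so they aren't treated as "done"
    return keys

done = key_set("crm_export.csv", "External_ID")   # records already in the CRM

with open("contacts.csv", newline="", encoding="utf-8") as f:
    reader = csv.DictReader(f)
    remaining = [r for r in reader
                 if (r.get("External_ID") or "").strip() not in done]
    fieldnames = reader.fieldnames

with open("contacts_remaining.csv", "w", newline="", encoding="utf-8") as f:
    writer = csv.DictWriter(f, fieldnames=fieldnames)
    writer.writeheader()
    writer.writerows(remaining)

print(f"{len(remaining)} rows left to import")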
Additional Resources
Official Platform Documentation:
- Salesforce – Data Loader Bulk API Guide – Salesforce Data Loader for large-volume imports
- Salesforce – Bulk API 2.0 – Programmatic large-scale import reference
- HubSpot – Import Large Files – HubSpot import limits and large file handling
Technical Reference:
- RFC 4180: CSV Format Specification – Standard CSV structure
Privacy & Compliance:
- GDPR Article 5: Data Minimization – Requirements for processing large customer datasets
Tested: Batch import strategy validated against Salesforce Data Loader (SOAP and Bulk API modes), HubSpot import tool, and Zoho CRM at volumes up to 500,000 rows. March 2026.
PLATFORM SPECIFICATION SOURCE
Platform: Salesforce, HubSpot, Zoho CRM, Pipedrive, Dynamics 365
Sources: Platform import documentation + Data Loader guides
Verified: March 2026
Next re-verify: June 2026
Platform row limits and timeout thresholds change with product updates.
Verify against current official documentation before large migrations.
Bulk API limits are subject to Salesforce governor limits and editions.
Navigate by your situation:
If your import is failing because of data quality errors (not size or timeouts): → See CRM Import Error Log: Read and Act on Failure Reports
If you're running batch imports manually on a weekly schedule: → See CRM Import vs API: When to Use CSV Upload or Programmatic Load – API may pay for itself within 30 days
If individual batches are completing but field values are wrong after import: → See CRM Custom Field Validation – data issues aren't always visible until you open records