
Power Query vs Excel Grid: Which Should You Use for Large Datasets?

March 23, 2026
By SplitForge Team

Quick Answer

Power Query can process data beyond Excel's 1,048,576-row grid limit. That much is true. But "beyond the grid limit" is not the same as "unlimited."

Power Query's real constraint is not a row count — it is available RAM combined with whether your operations can stream data or must load it entirely into memory. Filtering and selecting columns stream well. Joins, sorts, and aggregations on large tables do not — they require the full dataset in memory simultaneously.

On a 32-bit Windows machine, the practical ceiling is approximately 1GB of data in memory. On 64-bit Excel the ceiling scales with available RAM, but a merge between two 2M-row tables can exhaust even 32GB, depending on key cardinality.


Use this / don't use this — before reading anything else:

USE POWER QUERY IF:
✓ Your file has more than 1,048,576 rows and you need to filter it
✓ You are combining multiple files from a folder automatically
✓ You are cleaning, renaming, or reshaping data before analysis
✓ Your source is SQL Server / SharePoint and query folding is active

DO NOT USE POWER QUERY IF:
✗ You are joining two large tables (non-streaming — exhausts RAM)
✗ You need to sort a dataset with millions of rows
✗ You are doing GROUP BY aggregations on high-cardinality data
✗ You need full-dataset output beyond the 1,048,576-row grid

USE A BROWSER-BASED TOOL INSTEAD IF:
→ You need to split, filter, or convert files beyond both limits
→ Joins between large tables are failing in Power Query
→ Files contain sensitive data that shouldn't be uploaded anywhere

Fast Fix (2 Minutes)

If Power Query is slow or crashing on your dataset:

  1. Apply filters early — add filter steps as early as possible in the query to reduce rows before memory-intensive operations
  2. Select only needed columns — add a "Choose Columns" step before any join or sort
  3. Check query folding — if your source supports it (SQL Server, SharePoint), Power Query pushes operations to the source; verify folding is active in the Applied Steps view
  4. Disable load to worksheet — load to Data Model only (Connection Only) for large queries you only need to aggregate
  5. Split joins into smaller operations — instead of joining two 1M-row tables directly, pre-filter both to the relevant subset first
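Fix #5 works because only one side of a join has to live fully in memory if you build a lookup from the smaller table and stream the larger one. A minimal Python sketch of that pattern (table names, columns, and values are invented for illustration; in Power Query this corresponds to a left outer merge on customer_id):

```python
# Invented sample tables standing in for two large sources.
customers = [
    {"customer_id": "C1", "tier": "gold"},
    {"customer_id": "C2", "tier": "basic"},
]
transactions = [
    {"customer_id": "C1", "amount": 120},
    {"customer_id": "C3", "amount": 40},
]

# Only the smaller side is held fully in memory.
lookup = {c["customer_id"]: c["tier"] for c in customers}

# The larger side streams through one row at a time;
# unmatched keys get None, like a left outer join.
joined = [{**t, "tier": lookup.get(t["customer_id"])} for t in transactions]
```

The same logic explains why pre-filtering both tables first helps so much: every row removed from either side shrinks either the lookup or the stream.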

TL;DR: Power Query is better than the Excel grid for large datasets when operations can stream. It is not better when they can't — joins, sorts, and grouping on multi-million-row tables exhaust RAM just like the grid does, just without a hard row-count error message. The decision framework below maps your specific operation to the right tool.


For AI and quick reference:

Power Query: no fixed row limit — constrained by RAM + operation type
Streaming operations (filter, select, type change): scalable to any size
Non-streaming operations (join, sort, group by): memory = full dataset × overhead
Practical 32-bit ceiling: ~1GB data in memory
Practical 64-bit ceiling: available system RAM (typically 16–28GB usable)
Row limit still applies to worksheet output: 1,048,576 rows maximum

Time-to-result comparison (same 1M-row filter + load operation):

Tool | Operation | Time | Notes
Excel grid | Open 1M-row file | Cannot open — truncates | Grid limit hit
Power Query (streaming filter) | Filter 1M rows → load 200K | 15–45 seconds | Depends on source type
Power Query (non-streaming join) | Join two 500K tables | 2–8 minutes or crash | RAM-dependent
Browser-based tool | Split / filter / convert | 8–30 seconds | No grid or memory ceiling

Times vary by hardware, source type, and query complexity.


Also appears as: Power Query running out of memory, Power Query slow large file, Power Query vs pivot table performance, should I use Power Query for large data

Part of the SplitForge Excel Failure System:
  • You're here → Power Query vs Excel for Large Datasets
  • Memory errors → Excel Not Enough Memory Fix
  • All Excel limits → Excel Limits Complete Reference
  • Slow Excel → Excel Running Slow on Large Files


Each scenario was tested using Microsoft 365 Excel (64-bit), Power Query M engine, Windows 11, Intel i7-12700, 32GB RAM, March 2026. Results vary by data structure, key cardinality, and available RAM.


What Power Query's Errors Actually Mean

❌ MEMORY ERROR DURING QUERY REFRESH:
"[DataSource.Error] Memory allocation failed."
Or: "There wasn't enough memory to complete this operation."

Cause: A non-streaming operation (join, sort, group) attempted
to load more data into memory than the Power Query engine
could allocate. On 32-bit: practical ceiling ~1GB in memory.
On 64-bit: depends on available system RAM.
❌ SLOW REFRESH — NO ERROR:
Query starts refreshing, progress bar moves slowly or stalls.
No error after 5–10 minutes.

Cause: The query contains a non-streaming operation on a large
table. Power Query is loading the full dataset into memory
before the operation can complete. This is not a bug —
it is the expected behavior for non-streaming operations.
❌ QUERY FOLDING DISABLED WARNING:
Applied Steps view shows a step with a yellow triangle
("Privacy levels may prevent query folding").

Cause: Privacy settings are blocking Power Query from pushing
the operation to the source. The entire dataset is being
downloaded to the local machine for processing instead of
being filtered at the source. Performance degrades massively
for large source tables.


The Core Decision: Streaming vs Non-Streaming Operations

This is the concept that determines whether Power Query will handle your large dataset or fail on it.

Streaming operations read data row by row without loading the full table into memory. They have low and consistent memory usage regardless of dataset size.

Non-streaming operations require the full dataset in memory before they can produce any output. Memory usage scales directly with dataset size.

STREAMING (Power Query handles well at any scale):
- Filter rows (WHERE clause equivalent)
- Select / remove columns
- Rename columns
- Add calculated columns (row-level formulas)
- Change data types
- Remove duplicates on a single table
- Append tables (vertical union)

NON-STREAMING (memory usage = full dataset × operation overhead):
- Merge / join two tables
- Group by with aggregation
- Sort (requires full table in memory for comparison)
- Pivot / unpivot (must scan full column for unique values)
- Fuzzy matching
- Index / ranking operations
REAL-WORLD EXAMPLE — streaming vs non-streaming:
Dataset: 3,000,000 rows × 15 columns

Filter to rows where Region = "Northeast": STREAMING
Memory usage: ~200MB peak (reads row by row, filters, discards)
Time: 12 seconds

Sort by date descending: NON-STREAMING
Memory usage: ~4.2GB peak (full dataset loaded for sort)
On 32-bit system: crashes
On 64-bit with 8GB RAM: crashes (other apps consuming RAM)
On 64-bit with 32GB RAM: completes in 3 minutes 40 seconds
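The distinction can be sketched in a few lines of Python, a stand-in for the engine's behavior with made-up sample data: the filter holds one row at a time no matter how large the file, while the sort must materialize every row before it can emit anything.

```python
import csv
import io

# Invented sample standing in for a multi-million-row CSV export.
SAMPLE = "region,amount\nNortheast,10\nWest,20\nNortheast,5\n"

def stream_filter(lines, region):
    """Streaming: one row in memory at a time, regardless of file size."""
    for row in csv.DictReader(lines):
        if row["region"] == region:
            yield row

def full_sort(lines, key):
    """Non-streaming: every row must be materialized before sorting."""
    rows = list(csv.DictReader(lines))   # full table in memory
    return sorted(rows, key=lambda r: float(r[key]), reverse=True)

matches = list(stream_filter(io.StringIO(SAMPLE), "Northeast"))
ranked = full_sort(io.StringIO(SAMPLE), "amount")
```

Swap the 3-row sample for a 3M-row file and stream_filter's memory profile barely changes, while full_sort's grows linearly with the data.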

Decision Matrix: Which Tool for Which Operation

Bookmark this. Match your specific operation to the right tool before you start building.

Operation | Dataset size | Recommended tool | Why
Filter rows by value | Any | Power Query | Streaming — handles any size
Select specific columns | Any | Power Query | Streaming — handles any size
Add calculated columns | Any | Power Query | Row-level, streaming
Change / fix data types | Any | Power Query | Row-level, streaming
Append multiple files (vertical) | Any | Power Query | Streaming per file
Remove exact duplicates | Under 2M rows | Power Query | Acceptable memory use
Group by and sum/count | Under 1M rows | Power Query | Non-streaming, watch RAM
Sort entire table | Under 500K rows | Excel grid or Power Query | Both non-streaming above this
Merge / join two tables | Under 500K rows each | Power Query | Non-streaming — watch key cardinality
Complex multi-step joins | Any large table | Neither — use SQL or browser tool | Memory will exhaust
Pivot table summary | Under 1M rows | Excel grid pivot | Faster for aggregation at this scale
Files with >1,048,576 rows (read only) | Any | Power Query (with filter first) | Grid can't load them
Process and output >1M rows | Any | Browser-based tool | Neither Excel nor Power Query streams output

Where Power Query Wins Over the Grid

Reading files larger than 1,048,576 rows. The Excel grid cannot load a 2M-row CSV. Power Query can read it — as long as the first operation is a row filter that reduces the dataset to a grid-compatible size before loading. Load a 2M-row file, filter to the 400K rows you need, and the grid receives a manageable subset.

WORKFLOW — large file via Power Query:
Source: transactions_2024_full.csv (2,400,000 rows)

Step 1: Power Query reads the file (streaming)
Step 2: Filter: Date >= 2024-10-01 (streaming — reduces to 320,000 rows)
Step 3: Select columns needed (streaming)
Step 4: Load to worksheet (320,000 rows — under grid limit)

Result: The Excel grid receives 320,000 rows — fully functional.
Power Query never attempted to load 2,400,000 rows into the grid.

Combining multiple files automatically. Power Query's folder connector reads all files in a directory and appends them vertically — each file streams through individually. Combining 12 monthly CSV files (1M rows each) into a single query is a streaming operation Power Query handles cleanly.
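The folder-append pattern is cheap precisely because each file streams through independently. A Python sketch of the same idea (the in-memory strings stand in for monthly files read from a folder, e.g. via glob in a real script):

```python
import csv
import io

# Invented monthly exports standing in for files in a folder.
monthly_files = [
    "id,amount\n1,10\n2,20\n",
    "id,amount\n3,30\n",
]

def append_all(sources):
    """Stream each file in turn; one row in memory at a time."""
    for src in sources:
        yield from csv.DictReader(io.StringIO(src))

combined = list(append_all(monthly_files))
```

Twelve 1M-row files appended this way never require 12M rows in memory at once; only the final load destination (grid or Data Model) sees the combined size.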

Automated refresh from live sources. Power Query connects to SQL databases, SharePoint lists, web APIs, and cloud storage. The connection refreshes on demand, always pulling current data. The Excel grid has no equivalent for automated live-source refresh.


Where Power Query Still Fails

Joining two large tables. A merge between a 2M-row transaction table and a 500K-row customer master is a non-streaming operation. Power Query must load both tables into memory simultaneously to perform the join. On a 16GB machine with other applications running, this commonly exhausts available RAM.

❌ LARGE TABLE JOIN FAILURE:
Left table: transactions_2024.xlsx (2,100,000 rows)
Right table: customer_master.xlsx (480,000 rows)
Join type: Left outer on customer_id

Power Query memory peak: 18.7GB
System RAM available: 16GB
Result: "There wasn't enough memory to complete this operation."

The join itself is simple. The data volume is the problem.

Sorting multi-million-row tables. Sort requires the full table in memory. A 5M-row table sort in Power Query is not faster than the same sort in the Excel grid — it is slower, because Power Query adds serialization overhead the grid does not have.

Grouping on high-cardinality columns. A GROUP BY on a column with 500,000 unique values on a 5M-row table requires loading the full table to build the group index. Memory usage peaks at 2–3× the table size.
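Why cardinality is the multiplier: a grouping accumulator holds one entry per distinct key, not per row. A Python sketch with invented sample data makes the scaling behavior concrete:

```python
import csv
import io
from collections import defaultdict

# Invented sample data for illustration.
SAMPLE = "region,amount\nNortheast,10\nWest,20\nNortheast,5\n"

def group_sum(lines, key, value):
    """The accumulator holds one entry per distinct key, not per row:
    grouping by a handful of regions stays tiny, while grouping by
    500K distinct keys balloons even at the same row count."""
    totals = defaultdict(float)
    for row in csv.DictReader(lines):
        totals[row[key]] += float(row[value])
    return dict(totals)

totals = group_sum(io.StringIO(SAMPLE), "region", "amount")
```

This is also why the same GROUP BY that streams comfortably on a low-cardinality column can crash on an ID-like column.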


Performance Optimization for Large Power Query Workflows

Apply filters as early as possible. Every row eliminated before a non-streaming operation reduces memory proportionally. A filter that removes 80% of rows before a join reduces join memory usage by 80%.

Check query folding. When connecting to SQL Server, SharePoint, or other foldable sources, Power Query can push filter and selection operations to the source. The source returns only the filtered rows — Power Query never downloads the full table. Verify folding is active: right-click the last step in Applied Steps → "View Native Query." If disabled, check privacy level settings.

QUERY FOLDING — the memory multiplier:
Source: SQL Server table, 10,000,000 rows

WITH query folding (filter at source):
SQL Server returns: 45,000 rows matching filter
Power Query memory: ~120MB
Refresh time: 8 seconds

WITHOUT query folding (Privacy = None disables it):
Power Query downloads: 10,000,000 rows to local machine
Power Query memory: ~12GB
Refresh time: 14 minutes
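The folding difference is easy to reproduce outside Power Query. In this Python sketch, an in-memory SQLite database stands in for the source system: the "folded" path filters in the WHERE clause at the source, the "unfolded" path downloads everything and filters locally (table and column names are invented):

```python
import sqlite3

# A stand-in "source system" with a small invented table.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE sales (region TEXT, amount REAL)")
con.executemany("INSERT INTO sales VALUES (?, ?)",
                [("Northeast", 10), ("West", 20), ("Northeast", 5)])

# "Folded": the WHERE clause executes at the source; only matching
# rows cross the wire.
folded = con.execute(
    "SELECT region, amount FROM sales WHERE region = ?", ("Northeast",)
).fetchall()

# "Not folded": pull every row locally, then filter on this machine.
everything = con.execute("SELECT region, amount FROM sales").fetchall()
local = [row for row in everything if row[0] == "Northeast"]

# Same answer either way; the difference is how much data moved.
```

At 3 rows the two paths are indistinguishable; at 10M rows the unfolded path is the 14-minute refresh in the example above.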

Load to Data Model instead of worksheet. The Power Pivot Data Model uses columnar compression that is 5–10× more memory-efficient than the worksheet grid for aggregation queries. For large datasets where you only need summary output, load to Data Model (Connection Only) and build pivot tables from the model.

Disable background refresh for large queries. Background refresh runs during other operations and competes for memory. Data → Queries & Connections → right-click the query → Properties → uncheck "Enable background refresh" for large workbooks.

Profile data before building large joins. Before joining two large tables, check join-key cardinality with Power Query's column profiling. A join key with 2M unique values on a 3M-row table will exhaust RAM; a join key with 500 unique values on the same table will not. In the Power Query Editor, enable View → Column distribution and Column profile to see distinct-value counts (profiling samples the top 1,000 rows by default; switch to profiling the entire data set via the status bar if you need an exact count). For files that won't load in Power Query at all, SplitForge's Data Profiler offers browser-based profiling.
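A single streaming pass is enough to estimate join risk before you build the merge. This hypothetical Python sketch (invented column names and data) counts rows and distinct key values without ever holding the table in memory:

```python
import csv
import io

# Invented sample standing in for a large transaction export.
SAMPLE = "customer_id,amount\nC1,10\nC2,20\nC1,5\nC3,8\n"

def profile_key(lines, key):
    """One streaming pass: total rows and distinct key values.
    A distinct count near the row count is the warning sign that
    a join on this key will be expensive in memory."""
    seen, rows = set(), 0
    for row in csv.DictReader(lines):
        seen.add(row[key])
        rows += 1
    return rows, len(seen)

rows, distinct = profile_key(io.StringIO(SAMPLE), "customer_id")
```

Note the set of distinct keys does grow with cardinality, but a set of key values is far smaller than the full rows a join must hold.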


When Neither Tool Is the Right Answer

Some workflows exceed what both the Excel grid and Power Query can handle:

  • Joining two tables each with millions of rows and high key cardinality
  • Sorting or ranking a 10M-row dataset by multiple columns
  • Producing a full-dataset output that exceeds the grid limit (not just reading it)

At this scale, the architectural constraints are real and not configurable. Both tools are designed for desktop business analytics, not data engineering at scale.

When to consider Power BI instead: if your primary need is recurring reporting on data over 10M rows with scheduled refresh, Power BI's DirectQuery and Import modes are designed for this workload. Power BI is not a replacement for Power Query transformations — it is a different reporting tier for when the volume permanently exceeds Excel's ceiling.

Most cloud-based alternatives for this workload upload the file to a processing server. For datasets containing customer records, financial data, employee information, or any PII, that upload creates exposure. Under GDPR Article 5(1)(c)'s data minimization principle, uploading to process when a local option exists introduces unnecessary risk.

SplitForge processes files in Web Worker threads in your browser. For operations like splitting, filtering, column extraction, and format conversion, nothing is transmitted — verifiable via Chrome DevTools → Network during processing.



FAQ

Does Power Query have a row limit?
Power Query does not have a fixed row limit. Microsoft does not publish a specific maximum row count. The actual constraint is available RAM and whether your operations stream data or require the full dataset in memory. Streaming operations (filter, column selection, type conversion) work at any scale. Non-streaming operations (join, sort, group by) are constrained by available RAM.

Is Power Query faster than the Excel grid?
For streaming operations on large files, yes, significantly. Power Query can read and filter a 3M-row file in seconds where the grid cannot open it at all. For non-streaming operations like sorting or joining, Power Query is often slower than the grid because it adds serialization overhead.

Why is my Power Query refresh so slow?
The most common cause is a non-streaming operation loading a large dataset entirely into memory. The second most common cause is disabled query folding: Power Query is downloading a full remote table to process locally instead of filtering at the source. Check: right-click the last Applied Step → View Native Query. If unavailable, query folding is disabled; check privacy level settings.

Should I use Power Query instead of the Excel grid?
For transformation and loading workflows, yes, Power Query is the better tool. For interactive analysis, pivot tables, and formula-driven modeling, no: Power Query outputs to the grid or Data Model, where analysis happens. The two tools are complementary, not alternatives.

What is query folding?
Query folding is Power Query's ability to translate your transformation steps into a native query language (SQL, OData, etc.) and push execution to the source system. When folding is active, the source returns only the rows that match your filters; Power Query never downloads the full dataset. When folding is disabled (commonly due to privacy settings), Power Query downloads everything and processes locally. For a 10M-row SQL table, the difference is minutes vs seconds.


Process Files Beyond What Power Query Can Handle

No memory ceiling for splitting, filtering, and format conversion
Process files that exhaust Power Query's non-streaming operations
Files process locally in browser threads — nothing transmitted to any server
No installation required — open once, process immediately
