
Convert Excel to JSON for AI APIs and LLM Pipelines (2026)

May 28, 2026
By SplitForge Team

Convert your Excel file to JSON or JSONL locally before sending it to an AI pipeline. The shape you need depends on the use case, and most online converters upload your file to a third-party server before producing the output. For OpenAI fine-tuning and batch APIs, JSONL is required. For direct prompts and API context blocks, a JSON array works. The conversion runs in your browser; the spreadsheet stays on your device. For the format comparison between JSON and JSONL for interactive prompts versus pipeline workflows, see Best Format for Feeding Data Into ChatGPT or Claude.

TL;DR: AI APIs and LLM pipelines expect JSON, not Excel, and the JSON shape varies by use case: JSONL (one object per line) for OpenAI fine-tuning and batch APIs, JSON arrays for direct ChatGPT/Claude prompts and context blocks. Online converters upload your spreadsheet before converting it; browser-local conversion keeps the source workbook on your device and sends only the JSON output to the AI pipeline.



You have an Excel workbook of labeled examples (5,000 rows of user queries and ideal assistant responses) and you need to feed it to OpenAI's fine-tuning API. You search "excel to json converter," upload your file to the first result, and download the output. Your training data, which may contain proprietary prompts, customer interactions, or business-sensitive examples, now sits on a conversion service's server while you wait for the download link. The conversion itself is a trivial serialization operation. The exposure was not necessary.


Tested: JSON shapes verified against OpenAI fine-tuning documentation, OpenAI batch API documentation, and Anthropic message batches API documentation, May 2026. Browser-local conversion validated in Chrome 132.



Why Excel → JSON for AI? (and Why JSONL Specifically)

AI APIs and LLM pipeline tools (fine-tuning endpoints, batch processing APIs, RAG ingestion libraries) expect structured JSON input, not tabular spreadsheets. Excel workbooks carry formatting metadata, formula records, and multi-sheet structure that programmatic APIs cannot parse directly; JSON serializes each row as a plain key-value object that any language and any pipeline can consume. For large-scale AI workflows such as fine-tuning a model on labeled examples, ingesting a knowledge base into a vector database, or running batch evaluations, JSON conversion is a required upstream step, not an optional optimization.

JSONL (JSON Lines) is the specific variant most AI pipeline endpoints require. Where a JSON array is a single document containing all records wrapped in [...], JSONL writes one complete, self-contained JSON object per line with no outer structure. This makes it streaming-friendly: a consumer reads line one, processes it, reads line two, and continues without holding the full dataset in memory. OpenAI's fine-tuning API and OpenAI's batch API both take JSONL input files, and a standard JSON array uploaded in their place will be rejected. Anthropic's message batches API works differently: requests are submitted as a JSON body containing a requests list, and the results come back as JSONL.
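
The difference between the two layouts is easiest to see side by side. The following is a minimal Python sketch using only the standard library; the rows are hypothetical sample data standing in for whatever a worksheet export produces.

```python
import json

# Hypothetical rows, as they might look after reading a worksheet.
rows = [
    {"id": 1, "question": "What is your return policy?"},
    {"id": 2, "question": "How do I track my order?"},
]

# JSON array: one document, all records wrapped in [...].
json_array = json.dumps(rows)

# JSONL: one self-contained object per line, no outer brackets.
jsonl = "\n".join(json.dumps(row) for row in rows) + "\n"

# A JSONL consumer can parse each line independently of the others.
parsed = [json.loads(line) for line in jsonl.splitlines()]
```

Both strings carry the same records; only the framing differs, which is why a file in one layout cannot simply be renamed into the other.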

The shape of the JSON matters beyond just format. A flat object-per-row export ({"column1": "value", "column2": "value"}) is correct for RAG ingestion and some batch workflows, but wrong for OpenAI fine-tuning, which requires a specific messages array structure per line. Getting the shape wrong means the API rejects the submission or misinterprets the data even when the file is technically valid JSONL. The decision table below maps each AI use case to the correct format and shape before you convert a single cell.


The JSON Shape by Use Case

The right JSON structure depends entirely on the downstream system. Use the table below to identify the required format and shape before converting: the format (JSON vs JSONL) and the schema (field names and nesting) both matter.

| AI use case | Format | Required JSON shape | Tool path |
| --- | --- | --- | --- |
| Fine-tuning (OpenAI GPT-4o, GPT-3.5) | JSONL | {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]} per line | Excel Splitter → JSONL export + post-processing |
| Batch API (OpenAI / Anthropic) | JSONL | One request object per line per API spec (for Anthropic, the objects go in the request body's requests list) | Excel Splitter → JSONL export |
| RAG ingestion (vector database) | JSONL or JSON array | {"id": "...", "text": "...", "metadata": {...}} per chunk | Either; JSONL preferred for large datasets |
| Direct ChatGPT / Claude prompt | JSON array | [{"field": "value", ...}, ...] | Excel to JSON Converter |
| API context block (single-shot) | JSON array | Flat or nested structured object | Excel to JSON Converter |
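
As an illustration of the RAG row in the table, the sketch below reshapes one flat spreadsheet row into the id / text / metadata shape. The function name and the column names (doc_id, body, category) are hypothetical; the field names your vector database expects may differ, so treat this as a starting point only.

```python
import json

def row_to_rag_record(row, id_field, text_field):
    """Reshape one flat row into an {"id", "text", "metadata"} record."""
    # Every column that is not the id or the text is kept as metadata.
    metadata = {k: v for k, v in row.items() if k not in (id_field, text_field)}
    return {"id": str(row[id_field]), "text": row[text_field], "metadata": metadata}

# Hypothetical row with assumed column names.
row = {"doc_id": 17, "body": "Returns are accepted within 30 days.", "category": "policy"}
record = row_to_rag_record(row, id_field="doc_id", text_field="body")
line = json.dumps(record)  # one JSONL line for this chunk
```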

For non-AI conversion use cases (API data feeds, application development, general data pipeline integrations), see Convert Excel to JSON for the general workflow.


How to Convert Excel to JSON for AI (Step-by-Step)

  1. Identify the target use case. Determine where the JSON output is going: OpenAI fine-tuning, batch API, RAG ingestion, or a direct prompt. Use the decision table above to confirm both the required format (JSON array vs JSONL) and the required schema (field names, nesting depth). The schema matters as much as the format: a correctly structured JSONL file with wrong field names will be rejected by fine-tuning endpoints.

  2. Pick the tool based on output format. Standard JSON array (for direct prompts and API context): use Excel to JSON Converter at /tools/excel-json-converter. JSONL output (for fine-tuning and batch APIs): use Excel Splitter at /tools/excel-splitter and select JSONL as the export format; it generates one JSON object per data row, with Excel column headers as field names.

  3. Load the workbook locally. Open the tool and select your file. Neither tool uploads the workbook to a server: processing runs via a Web Worker in your browser, and the source file stays on your device throughout the operation. For multi-sheet workbooks, select the sheet containing the target data; other sheets are not included in the output.

  4. Review the column-to-field mapping. Excel column headers become JSON field names in the output. Verify that column names are clean identifiers (no special characters, no leading or trailing spaces) before converting, because malformed field names carry through to the JSON output and may be rejected by APIs that validate field names strictly. For fine-tuning datasets that require the messages array structure, the flat-object output (one object per row) will need a post-processing step to reshape column values into the role and content fields that OpenAI's fine-tuning endpoint expects.
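
One way to check headers before converting is to normalize them programmatically. The sketch below is an assumption about what "clean" means (lowercase snake_case, ASCII letters and digits only), not a rule imposed by any API; the function names are hypothetical.

```python
import re

def clean_header(header):
    """Turn a spreadsheet header into a plain snake_case field name."""
    name = header.strip().lower()
    # Any run of characters other than a-z and 0-9 becomes one underscore.
    name = re.sub(r"[^a-z0-9]+", "_", name)
    return name.strip("_")

def clean_headers(headers):
    """Clean every header and fail loudly on empty or duplicate names."""
    cleaned = [clean_header(h) for h in headers]
    if "" in cleaned:
        raise ValueError("a header is empty after cleaning")
    if len(set(cleaned)) != len(cleaned):
        raise ValueError(f"duplicate field names after cleaning: {cleaned}")
    return cleaned
```

Note that this version drops non-ASCII letters, so headers in other scripts would need a different rule.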

Example: OpenAI Fine-Tuning JSONL Schema

If your Excel workbook stores training examples as rows, with one column for the user turn and one for the ideal assistant response, the flat-object export from Excel Splitter will look like this:

{"user_input":"What is your return policy?","ideal_response":"We accept returns within 30 days of purchase with a receipt."}
{"user_input":"How do I track my order?","ideal_response":"Log in to your account and click 'Order History' to view real-time tracking."}

OpenAI's fine-tuning API requires a specific messages array structure, with each training example expressed as a conversation of role-labeled turns:

{"messages": [{"role": "user", "content": "What is your return policy?"}, {"role": "assistant", "content": "We accept returns within 30 days of purchase with a receipt."}]}
{"messages": [{"role": "user", "content": "How do I track my order?"}, {"role": "assistant", "content": "Log in to your account and click 'Order History' to view real-time tracking."}]}

The transformation maps your user_input column value to messages[0].content (role: user) and your ideal_response column value to messages[1].content (role: assistant). Naming your spreadsheet columns user and assistant in advance simplifies the mapping step. A short Python or JavaScript script handles the reshape for hundreds or thousands of examples in seconds. OpenAI also provides a data preparation and validation script in their fine-tuning documentation that catches schema errors before submission.
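
The reshape described above can be sketched in a few lines of Python. The function names are hypothetical, and the field names user_input and ideal_response are taken from the example rows; adjust them to match your own column headers.

```python
import json

def to_chat_example(row, user_field="user_input", assistant_field="ideal_response"):
    """Map one flat row to the messages structure shown above."""
    return {
        "messages": [
            {"role": "user", "content": row[user_field]},
            {"role": "assistant", "content": row[assistant_field]},
        ]
    }

def reshape_jsonl(flat_lines):
    """Yield one messages-format JSON line per non-empty flat JSONL line."""
    for line in flat_lines:
        if line.strip():
            yield json.dumps(to_chat_example(json.loads(line)), ensure_ascii=False)

flat = [
    '{"user_input":"What is your return policy?","ideal_response":"We accept returns within 30 days of purchase with a receipt."}',
    '{"user_input":"How do I track my order?","ideal_response":"Log in to your account and click \'Order History\' to view real-time tracking."}',
]
reshaped = list(reshape_jsonl(flat))
```

Because reshape_jsonl accepts any iterable of lines, the same function works on an open file handle, writing each yielded line followed by a newline to the output file.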

  5. Configure output settings and export. For JSON array output: select pretty-print if a human will inspect the file; minified if it goes directly to an API. For JSONL output: each line is a self-contained object; no additional formatting options are needed. Download the output file to your device.

  6. Validate a sample before full pipeline ingestion. Open the output and check 5–10 records against the target API's schema documentation. For OpenAI fine-tuning, verify the messages array format against the fine-tuning guide; OpenAI provides a validation script that catches schema errors before submission. For Anthropic batch processing, check against the message batches documentation. A pre-submission validation pass on a small sample catches format and schema problems before they surface halfway through a large batch job.
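
A sample check like the one in the last step can also be scripted. The sketch below is not OpenAI's validation script; it is a simplified stand-in that covers only plain-text chat examples, and the allowed role set is an assumption that should be checked against the current fine-tuning guide.

```python
import json

# Assumption: plain-text examples using only these roles.
ALLOWED_ROLES = {"system", "user", "assistant"}

def check_example(line):
    """Return a list of problems found in one fine-tuning JSONL line."""
    try:
        obj = json.loads(line)
    except json.JSONDecodeError as exc:
        return [f"invalid JSON: {exc.msg}"]
    if not isinstance(obj, dict):
        return ["line is not a JSON object"]
    messages = obj.get("messages")
    if not isinstance(messages, list) or not messages:
        return ["missing or empty 'messages' list"]
    problems = []
    for i, msg in enumerate(messages):
        if not isinstance(msg, dict):
            problems.append(f"message {i}: not an object")
            continue
        if msg.get("role") not in ALLOWED_ROLES:
            problems.append(f"message {i}: unexpected role {msg.get('role')!r}")
        content = msg.get("content")
        if not isinstance(content, str) or not content.strip():
            problems.append(f"message {i}: content must be a non-empty string")
    if not any(isinstance(m, dict) and m.get("role") == "assistant" for m in messages):
        problems.append("no assistant message")
    return problems

def check_sample(lines, sample_size=10):
    """Check the first sample_size non-empty lines; map line number to problems."""
    report, checked = {}, 0
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        if checked >= sample_size:
            break
        checked += 1
        problems = check_example(line)
        if problems:
            report[number] = problems
    return report
```

An empty report from check_sample means the sampled lines passed these basic checks, not that the full file is valid.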


The Privacy Trade-Off With Online Converters

Online Excel-to-JSON converters upload the source spreadsheet to a remote server to perform the conversion, then return the JSON for download. For files containing customer data, employee records, proprietary training examples, or any business-sensitive content, this creates a data exposure before the data reaches the AI vendor. The conversion itself, serializing rows to JSON objects, is a trivial browser-side operation that requires no server; the upload is an infrastructure choice, not a technical necessity.

The double-exposure problem compounds for AI-pipeline workflows specifically. If you are converting a fine-tuning dataset derived from customer support interactions, those interactions pass through the conversion service's infrastructure and then through the AI vendor's infrastructure when submitted for training. Commercial API tiers handle the second leg differently than consumer chat: the OpenAI API and Anthropic API are explicitly exempt from training-by-default under their commercial terms, meaning data submitted via API is not used to improve base models without a separate opt-in agreement. Consumer plans (ChatGPT Free/Plus, Claude.ai Free/Pro/Max) operate under different defaults, where uploaded data may be used to improve the model unless you opt out in account settings. For the broader format and privacy context, see Best Format for Feeding Data Into ChatGPT or Claude.

Local conversion eliminates the intermediate server entirely. The source Excel file does not leave the browser; only the converted JSON or JSONL output is sent downstream to the AI pipeline. For files containing data subject to GDPR Article 5 data minimization requirements or contractual data handling obligations, reducing the number of systems that process the raw spreadsheet is a practical risk reduction step. For masking sensitive columns before conversion, see How to Remove PII From a CSV Before Using AI.


Note: For general Excel-to-JSON conversion outside AI workflows (API feeds, application data, development use), see Convert Excel to JSON. For the broader format decision (CSV vs JSON vs Excel for AI tools), see Best Format for Feeding Data Into ChatGPT or Claude. This post covers AI-pipeline-specific ingestion formats and the privacy case for local conversion.


Additional Resources

How this guide was built: JSONL specification from https://jsonlines.org/. OpenAI fine-tuning format requirements from platform.openai.com/docs/guides/fine-tuning. OpenAI batch API format from platform.openai.com/docs/guides/batch. Anthropic message batches API from docs.anthropic.com/en/docs/build-with-claude/message-batches. Commercial API training exemptions from OpenAI and Anthropic API Terms of Service. All API format specifications verified against current documentation, May 2026.


FAQ

What is the difference between JSON and JSONL?

JSON (JavaScript Object Notation) is a single structured document, typically an array of objects or a nested object, that must be fully loaded and parsed before any individual record can be accessed. JSONL (JSON Lines) writes one complete, self-contained JSON object per line, with no wrapping brackets or document structure. JSONL processes naturally line by line: a consumer reads one record, handles it, and reads the next without holding the full file in memory. OpenAI fine-tuning and the OpenAI batch API require JSONL input files, and Anthropic message batches return their results as JSONL; direct ChatGPT and Claude prompts typically use JSON arrays.

What format does OpenAI fine-tuning require?

OpenAI fine-tuning requires JSONL with a specific messages-array structure per line: {"messages": [{"role": "user", "content": "..."}, {"role": "assistant", "content": "..."}]}. Each line represents one training example as a complete conversation. Standard JSON arrays are not accepted; the input must be line-delimited with one object per line and no outer brackets. OpenAI provides a data preparation and validation script in their fine-tuning documentation; running it on a sample before submitting a full dataset catches format and schema errors before they abort the job.

Can I convert Excel to JSONL without uploading the file?

Yes. Browser-based tools using Web Workers can process Excel files entirely on your device and produce JSONL output locally. The source workbook is read in-browser, rows are serialized to JSON objects, and the JSONL file is generated and downloaded without any server-side step. The raw spreadsheet, including any sensitive training data, customer records, or proprietary content, never leaves your browser during the conversion.

Why not use an online Excel-to-JSON converter?

Online converters upload the Excel file to a remote server before converting it. For files containing customer data, employee records, proprietary training examples, or any business-sensitive content, this creates a data exposure before the file reaches the AI vendor. Local conversion eliminates that intermediate server, so only the converted JSON or JSONL reaches the downstream system. The conversion itself is a trivial serialization operation that requires no server; the upload is an infrastructure choice, not a technical necessity.

Does Anthropic train on data sent through the message batches API?

No. Commercial API usage, including the Anthropic message batches API, is explicitly exempt from training-by-default under Anthropic's commercial API terms. Data submitted via the API is not used to train Anthropic's base models without a separate opt-in agreement. This is different from consumer plans on claude.ai (Free, Pro, Max), where conversations and uploaded files may be used to improve Anthropic's models unless you disable that in account settings. If your workflow calls the Anthropic API directly for batch processing, your data is not used for training by default.

How large an Excel file can be converted in the browser?

Browser-based conversion is bounded by available device memory rather than a fixed file-size cap. Files up to several hundred MB typically process without issue on modern hardware; very large workbooks with millions of rows across multiple sheets may hit memory limits depending on the browser and available RAM. For workbooks that exceed the practical browser limit, export the target sheet to CSV first (Excel File Too Big for AI? covers the export workflow), then convert from CSV if JSON or JSONL output is needed for the pipeline.
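
For the CSV fallback, a row-at-a-time conversion keeps memory use flat regardless of file size. This is a minimal Python sketch using the standard csv and json modules; the function name is hypothetical, and it assumes a UTF-8 CSV with a header row.

```python
import csv
import json

def csv_to_jsonl(csv_path, jsonl_path):
    """Stream a CSV file to JSONL one row at a time; return the row count."""
    count = 0
    # utf-8-sig also strips the byte-order mark that Excel's "CSV UTF-8" export adds.
    with open(csv_path, newline="", encoding="utf-8-sig") as src, \
         open(jsonl_path, "w", encoding="utf-8") as dst:
        for row in csv.DictReader(src):
            dst.write(json.dumps(row, ensure_ascii=False) + "\n")
            count += 1
    return count
```

Every value read from CSV arrives as a string, so numeric or boolean columns need an explicit conversion step if the downstream schema expects typed values.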

Can I send an Excel file directly to an AI API?

Generally, no. OpenAI's fine-tuning and batch APIs require JSONL; they do not accept XLSX, XLS, or CSV as direct input. Anthropic's message batches API similarly requires structured JSON request objects, not spreadsheet files. ChatGPT's Data Analysis tool and Claude's web interface accept XLSX and CSV for conversational analysis, but that is a different interface from the programmatic API. For any pipeline that calls an AI API directly (fine-tuning, batch inference, RAG ingestion), convert the Excel file to the required format before submission; expect the API to reject the request if the file type or structure does not match its specification.
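
For the OpenAI batch case, each JSONL line is a full request object instead of a bare data row. The sketch below follows the request layout described in OpenAI's batch guide (custom_id, method, url, body); the helper name and the model name are placeholders, and the layout should be confirmed against the current documentation before use.

```python
import json

def to_batch_request(row_index, prompt, model):
    """Wrap one prompt in an OpenAI batch request object for one JSONL line."""
    return {
        "custom_id": f"row-{row_index}",  # lets you match results back to rows
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Hypothetical prompts taken from one spreadsheet column.
prompts = ["Summarize ticket 1001.", "Summarize ticket 1002."]
lines = [
    json.dumps(to_batch_request(i, p, model="gpt-4o-mini"))  # placeholder model name
    for i, p in enumerate(prompts, start=1)
]
```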


Convert Locally, Send Clean

Match the output format to the use case: JSONL for fine-tuning and batch APIs, JSON array for direct prompts and context blocks
Convert in your browser: the source Excel workbook stays on your device; only the JSON output reaches the AI pipeline
Validate a sample before full pipeline ingestion: check a few records against the target API's schema before submitting a large batch
Mask sensitive columns before conversion if the output is going to a consumer AI plan that trains on uploads by default
