
Privacy by Design for Data Analysts: Build GDPR Article 25 Into Your CSV Workflow From Day One

March 18, 2026

By SplitForge Team

Quick Answer

GDPR Article 25 requires privacy by design and by default — meaning privacy must be embedded into your processing workflows from the design stage, not added as a fix after the fact. For data analysts, this is not a theoretical obligation. Every time you design a new CSV export, build a data pipeline, or set up a recurring report, you are creating a processing activity that Article 25 applies to. The ICO updated its Article 25 guidance on 5 February 2026 to reflect the Data (Use and Access) Act 2025. The standard is not "we have a privacy policy." It's "privacy is designed into how we work."

Fast Fix (2 Minutes)

If you're building or reviewing a CSV workflow involving personal data right now:

Ask: what's the minimum data this workflow actually needs? Not what's available in the export, but what the downstream task requires. That's your Article 25(2) privacy-by-default question.
Remove every column that doesn't answer "yes" to: does the recipient or next step in the pipeline need this field?
Ask: where does this file go? If it touches a cloud tool or leaves the organisation, check for DPA, SCCs, and BAA as appropriate.
Document the workflow — purpose, data categories, retention period. One paragraph is sufficient to satisfy Article 30 requirements for the activity.
Use SplitForge Data Cleaner to strip unnecessary fields locally before any export or upload.

TL;DR: Article 25 is a legal requirement, not guidance. It applies to data analysts designing CSV workflows, not just to engineers building systems. The ICO updated its implementation guidance in February 2026. The core obligation is simple: collect only what you need, protect it from the design stage, and be able to demonstrate you did both. Getting this right doesn't require a privacy team — it requires a habit.

Most GDPR compliance programmes focus on consent banners, privacy policies, and data breach procedures. They frequently miss the place where most data exposure actually starts: the analyst pulling a CRM export to answer a question, building a weekly report that CCs three people, or setting up a pipeline that dumps customer data into a shared folder.

Sambla Group was fined €950,000 in 2025 specifically for Article 25 violations — missing data protection measures from the system design stage, delayed response to identified unsafe processes, and multi-year duration of unresolved deficiencies. The fine wasn't for a breach. It was for designing systems without privacy in mind.

Article 25 is a legal requirement that applies to every controller, regardless of size. The ICO updated its guidance on 5 February 2026 to reflect the Data (Use and Access) Act 2025 changes, including a new children's higher protection duty. For data analysts, it applies every time you design a new workflow — not just once at system setup.

Each workflow in this post was assessed against GDPR Article 25(1) and (2), the EDPB Guidelines 4/2019 on Data Protection by Design and by Default, and the ICO's updated guidance of February 2026. The assessment reflects the position as of March 2026.

What "Privacy by Design" Actually Means for a Data Analyst

Article 25 has two components that work together.

Article 25(1) — Privacy by Design: Implement appropriate technical and organisational measures designed to implement data protection principles effectively from the design stage. This means: before you build the workflow, not after you've built it and discovered a problem.

Article 25(2) — Privacy by Default: Ensure that by default, only personal data necessary for each specific purpose is processed. This covers amount collected, extent of processing, period of storage, and accessibility.

For a data analyst, "design stage" means: the moment you decide what columns to pull, what tool to use, who gets access, and how long the file stays on the shared drive. That is your design stage.

❌ PRIVACY RETROFIT (common pattern — fails Article 25):
Week 1: Analyst exports full customer table for a campaign analysis
         (name, email, phone, dob, address, purchase history, health_flag, credit_score)
Week 3: Marketing director shares file with agency via email
Week 6: Privacy team reviews and flags sensitive fields as unnecessary
Week 8: Privacy officer asks analyst to redo the export with fewer fields

This is reactive. Article 25 requires the privacy question BEFORE week 1.

PRIVACY BY DESIGN:
Before pulling the export, analyst asks:
- What does the campaign analysis actually need? (email + purchase_category + recency_band)
- Who sees the output? (internal team only)
- How long is it needed? (2 weeks for campaign planning)
- What tool processes it? (local only — no cloud upload needed)

Export contains 3 columns. No sensitive fields. No downstream exposure.
This is Article 25 compliance built into the workflow.

The rule that changes behaviour: If you're designing a new CSV workflow, privacy questions come first — not last. Not as a checklist at the end. Before you write the first query.

Table of Contents

What "Privacy by Design" Actually Means for a Data Analyst
The Five Privacy Design Questions for Every CSV Workflow
Privacy by Default: The Minimum Data Rule in Practice
Choosing Tools: Article 25 and Your Processing Stack
Recurring Workflows: When Privacy by Design Decays
Documenting Your Workflow: The Minimum Viable Record
Operator Rules: Privacy by Design for Analysts
Additional Resources
FAQ

The Five Privacy Design Questions for Every CSV Workflow

Ask these before pulling any export involving personal data. They take 5 minutes. They satisfy Article 25.

Question 1: What is the specific purpose of this workflow? Not "we need customer data for analysis." Specifically: "We need email addresses and purchase recency scores to identify customers who haven't purchased in 90 days for a reactivation campaign." The more specific the purpose, the clearer the data minimum.

Question 2: What is the minimum data needed for that specific purpose? Map the purpose to the required columns. If the purpose is identifying inactive customers: email + last_purchase_date + unsubscribed_flag. Not full names. Not addresses. Not phone numbers. Not purchase amounts.

Question 3: Who will access this data and via what path? Every step where data travels is a privacy event. Internal analyst only: low risk. Shared with three team members via Slack: medium risk. Sent to external agency via email: higher risk requiring DPA review.

Question 4: What tool processes this file? If a cloud tool, where are its servers? Does a DPA or BAA exist? Is it DPF-certified for EU data? These questions need answers before the first upload. If the tool doesn't have DPA documentation, process locally.

Question 5: How long is this data needed, and what happens at the end? "I'll delete it when the campaign is done" is a policy. Document it. "I'll archive it indefinitely in the shared folder" is a retention problem. Personal data must not be kept longer than necessary under GDPR Article 5(1)(e).

What this means for your CSV workflow: Answering these 5 questions before every new workflow takes 5 minutes. Missing one is where exposure starts.
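One way to make the five questions hard to skip is to capture the answers in a small structure and refuse to proceed while any of them is blank. The sketch below is illustrative only: `DesignReview`, its field names, and `unanswered` are hypothetical names chosen for this example, not part of any tool mentioned in this post.

```python
from dataclasses import dataclass, fields


@dataclass
class DesignReview:
    """Answers to the five design questions (hypothetical field names)."""
    purpose: str = ""              # Question 1: specific purpose
    minimum_columns: tuple = ()    # Question 2: minimum data needed
    access_path: str = ""          # Question 3: who accesses it, and how
    processing_tool: str = ""      # Question 4: which tool processes the file
    retention_plan: str = ""       # Question 5: how long, and what happens after


def unanswered(review):
    """Return the names of the questions that are still blank."""
    missing = []
    for field in fields(review):
        value = getattr(review, field.name)
        if isinstance(value, str):
            value = value.strip()  # whitespace-only text counts as blank
        if not value:
            missing.append(field.name)
    return missing
```

A workflow script could call `unanswered(review)` before running the first query and stop if the returned list is not empty. The check only confirms that each question has an answer; judging whether the answers are adequate remains a human decision.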

Privacy by Default: The Minimum Data Rule in Practice

Article 25(2) is the most actionable obligation for analysts. It requires that by default, only what is necessary for the specific purpose is processed.

This applies to four dimensions:

Dimension	Default Privacy Requirement	Analyst Action
Amount collected	Only necessary fields	Remove columns not needed for the specific task
Extent of processing	Only necessary operations	Don't join tables or enrich data beyond what the task requires
Period of storage	Only as long as necessary	Set a deletion date at the time of creation
Accessibility	Only accessible to those who need it	Don't share to shared drives or CC unnecessary recipients
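
The "period of storage" row can be made concrete by recording a deletion date when each export is created. A minimal sketch, assuming you keep a simple list of exports with their creation dates and retention periods. The `ExportEntry` class and `overdue_exports` function are hypothetical names for illustration, and the file path in the usage note is invented.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class ExportEntry:
    """One exported file and the retention period agreed at creation time."""
    path: str
    created: date
    retention_days: int

    @property
    def delete_after(self) -> date:
        # Last day on which the file may still exist.
        return self.created + timedelta(days=self.retention_days)


def overdue_exports(entries, today):
    """Return the entries whose deletion date has already passed."""
    return [entry for entry in entries if today > entry.delete_after]
```

For example, an entry such as `ExportEntry("exports/reactivation.csv", date(2026, 3, 18), 30)` has a `delete_after` date of 17 April 2026 and shows up in `overdue_exports` from 18 April onwards. The sketch only reports overdue files; whether deletion is then manual or automated is a separate decision.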

The column audit — before every export:

Before exporting, list your query columns:
SELECT first_name, last_name, email, phone, dob, address, city, state,
       zip, purchase_history, account_notes, credit_score, health_flag,
       last_login, device_id, ip_address, campaign_source
FROM customers
WHERE last_purchase < DATE_SUB(NOW(), INTERVAL 90 DAY)

Now ask for each column: does the reactivation campaign need this?

first_name → NO (campaign personalises by purchase category, not by name)
last_name → NO
email → YES
phone → NO (email campaign only)
dob → NO
address → NO
city → NO
state → NO
zip → NO
purchase_history → YES (needed to personalise with category)
account_notes → NO
credit_score → NO
health_flag → NO (sensitive PI — absolutely not)
last_login → NO
device_id → NO
ip_address → NO
campaign_source → YES (attribution)

MINIMUM EXPORT:
SELECT email, purchase_history, campaign_source
FROM customers
WHERE last_purchase < DATE_SUB(NOW(), INTERVAL 90 DAY)

Three columns instead of seventeen. The exposure reduction is not incremental — it's categorical. A file with 3 columns has a fundamentally different risk profile than a file with 17.

What this means for your CSV workflow: Every column in a CSV export is a decision. The default should be: don't include it unless you can name the specific reason it's needed.
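
When the export already exists as a CSV file, the same "default is don't include it" rule can be applied with an allowlist rather than a list of columns to drop. The following is a minimal sketch using Python's standard `csv` module. The function name `minimise_csv` and the `ALLOWED_COLUMNS` constant are assumptions made for this example, with the column names taken from the audit above.

```python
import csv
import io

# Allowlist taken from the column audit above (hypothetical constant name).
ALLOWED_COLUMNS = ["email", "purchase_history", "campaign_source"]


def minimise_csv(source, dest, allowed=ALLOWED_COLUMNS):
    """Copy only allowlisted columns from source to dest.

    source and dest are open text file objects. Returns the list of
    column names that were present in the source but not written out.
    """
    reader = csv.DictReader(source)
    header = reader.fieldnames or []
    missing = [name for name in allowed if name not in header]
    if missing:
        # Fail loudly instead of silently writing empty columns.
        raise ValueError(f"Expected columns not found: {missing}")
    # extrasaction="ignore" discards every key that is not in fieldnames.
    writer = csv.DictWriter(dest, fieldnames=list(allowed), extrasaction="ignore")
    writer.writeheader()
    for row in reader:
        writer.writerow(row)
    return [name for name in header if name not in allowed]
```

An allowlist is used deliberately: if a new field is later added to the source, it is excluded unless someone names it explicitly, whereas a drop list would let it through by default. When working with real files, open them with `newline=""` as the `csv` module documentation recommends.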

Choosing Tools: Article 25 and Your Processing Stack

Article 25 applies to your choice of processing tools — not just what you do with the data.

The ICO's February 2026 guidance notes: "If you use products and services that don't help you to do this, you may have to take more steps to be sure that your processing complies." When you select a CSV processing tool, you are making an Article 25 design decision.

Questions to ask about any tool before using it with personal data:

Does it process locally or on a remote server?
Does it offer a signed DPA (required if it's a processor under GDPR Article 28)?
What is the file retention period in its terms of service?
For EU data: is it DPF-certified or does it offer SCCs?

Most consumer-grade cloud CSV tools upload your file to remote servers, retain it for debugging or caching purposes, and don't offer DPA documentation by default. Using one of these tools for a CSV containing customer personal data is a processing decision that may not satisfy Article 25.

Many CSV processing tools require uploading personal data to remote servers before any processing occurs. Under Article 25, that upload is itself a processing activity, and it happens before you've applied any privacy measures. SplitForge processes files in Web Worker threads in your browser. If no raw file contents are transmitted to a server, the processing has no server-side footprint. That's privacy by design at the tool selection level.

What this means for your CSV workflow: Tool selection is a privacy decision. Including it in your design review is required, not optional.

Recurring Workflows: When Privacy by Design Decays

One-time exports are easy to review. The harder problem is recurring workflows.

A weekly CRM export that was set up in 2022 was designed under different privacy regulations, different data categories, and a different team. In 2026, it may be pulling sensitive PI that triggers CCPA 2026 risk assessment requirements. The original 12-column export may now contain 20 columns because someone added fields to the CRM.

Privacy by design decays when workflows are set up and forgotten.

Recurring workflow review checklist (quarterly minimum):

What columns does this export still contain? Have any new fields been added to the source?
Is the purpose still the same as when it was set up?
Has the recipient changed? Are they still covered by the original DPA?
Is the tool still the same? Has its ToS or DPA changed?
Is the retention period still appropriate? Are old files being deleted?
Does this workflow now trigger CCPA 2026 significant-risk requirements?

The most common decay pattern:

❌ DECAYED WORKFLOW (common after 12+ months):
2022 setup: Weekly export of email + purchase_date for loyalty programme
2023: Health questionnaire added to CRM → health_flag now in all exports
2024: Marketing team added credit_tier field → now in all exports
2025: New CCPA 2026 regulations finalised, taking effect in 2026
2026: Workflow now exports 2 sensitive fields (health_flag, credit_tier) with no updated assessment

The workflow looks the same. The data profile is completely different.

What this means for your CSV workflow: Set a calendar reminder for every recurring CSV workflow: quarterly at minimum, and more often for high-volume exports. Review the column list, not just the process.
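
The column review in the checklist above can be partly automated by comparing each export's header row with the column list approved when the workflow was designed. A minimal sketch, assuming the approved list is stored alongside the workflow record. The function name `detect_column_drift` is hypothetical.

```python
import csv
import io


def detect_column_drift(csv_file, approved_columns):
    """Compare a CSV header row with the approved column list.

    csv_file is an open text file object. Returns a dict with the
    columns that were added to, or are missing from, the export.
    """
    header = next(csv.reader(csv_file), [])  # empty file gives an empty header
    approved = set(approved_columns)
    return {
        "added": [name for name in header if name not in approved],
        "missing": [name for name in approved_columns if name not in header],
    }
```

Run against the decayed workflow described above, with `email` and `purchase_date` as the approved columns, this would report `health_flag` and `credit_tier` as added. A non-empty `added` list is a signal to repeat the design questions, not a judgement about whether the new fields are justified.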

For a complete pre-sharing review, see our privacy review before sharing CSV guide. For vendor evaluation against Article 25 standards, see our CSV tool security checklist.

Documenting Your Workflow: The Minimum Viable Record

Under GDPR's accountability principle (Article 5(2)), you must be able to demonstrate Article 25 compliance, not just comply. For analysts, that means documentation.

The minimum viable record for any CSV workflow involving personal data:

Workflow: Weekly inactive customer reactivation export
Date created: 2026-03-18
Purpose: Identify customers inactive >90 days for email reactivation campaign
Data categories: Email address, last purchase category, campaign source
Lawful basis: Legitimate interest (existing customer relationship, marketing opt-in)
Recipients: Internal marketing team only
Tool used: SplitForge (local processing, no server transmission)
Retention period: Deleted after campaign completion (max 30 days)
CCPA: No sensitive PI categories — risk assessment not triggered
GDPR cross-border: No transfer — local processing only
Review due: 2026-06-18

One paragraph. One file in a shared folder. This is your Article 30 record and your Article 25 demonstration in one.

If a regulator asks "can you demonstrate that you implemented privacy by design for this workflow?" — this document is the answer.
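
If you maintain several workflows, generating the record from structured fields keeps the format consistent and makes the review date hard to forget. The sketch below is one possible approach, not a prescribed format: `WorkflowRecord`, `add_months`, and the three-month default review interval are assumptions for illustration, and the sketch covers only some of the fields in the example record above.

```python
import calendar
from dataclasses import dataclass
from datetime import date


def add_months(day, months):
    """Shift a date by whole months, clamping to the end of shorter months."""
    index = day.month - 1 + months
    year, month = day.year + index // 12, index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


@dataclass
class WorkflowRecord:
    """Minimum viable record for one CSV workflow (hypothetical structure)."""
    workflow: str
    created: date
    purpose: str
    data_categories: list
    lawful_basis: str
    recipients: str
    tool: str
    retention: str
    review_months: int = 3  # assumed quarterly review interval

    def render(self):
        """Return the record as plain text, one field per line."""
        review_due = add_months(self.created, self.review_months)
        return "\n".join([
            f"Workflow: {self.workflow}",
            f"Date created: {self.created.isoformat()}",
            f"Purpose: {self.purpose}",
            f"Data categories: {', '.join(self.data_categories)}",
            f"Lawful basis: {self.lawful_basis}",
            f"Recipients: {self.recipients}",
            f"Tool used: {self.tool}",
            f"Retention period: {self.retention}",
            f"Review due: {review_due.isoformat()}",
        ])
```

With a creation date of 2026-03-18 and the default interval, `render()` ends with `Review due: 2026-06-18`, which matches the example record above.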

Operator Rules: Privacy by Design for Analysts

Short. Non-negotiable. Reference before designing any new CSV workflow.

Privacy questions come first — before the first query, not after
Every column is a decision — default is don't include it
Tool selection is a privacy decision: local processing supports Article 25 by architecture, but does not satisfy it alone
Recurring workflows decay — review column lists quarterly, not just processes
Documentation is mandatory — one paragraph per workflow satisfies Article 30
"We've always exported it that way" is not an Article 25 defence
The design stage ends the moment you run the first query — get the privacy questions answered before that

Additional Resources

GDPR Primary Sources:

GDPR Article 25 — Data Protection by Design and by Default — Full legal text
EDPB Guidelines 4/2019 on Article 25 Data Protection by Design — Authoritative implementation guidance

ICO Guidance (Updated February 2026):

ICO: Data Protection by Design and by Default — Updated 5 February 2026, includes DUAA 2025 changes

GDPR Cross-Reference:

GDPR Article 5 — Data Minimization Principle — The data minimization obligation underlying Article 25(2)
GDPR Article 30 — Records of Processing Activities — Documentation requirements

Disclaimer: This post is for informational purposes only and does not constitute legal advice. Article 25 obligations depend on the nature, scope, context, and purposes of your specific processing activities. Consult qualified legal counsel before drawing compliance conclusions.

FAQ

Does Article 25 actually apply to individual data analysts, or just to DPOs and architects?

Article 25 applies to controllers: the organisation responsible for determining purposes and means of processing. In practice, individual analysts act on behalf of the controller every time they design a new workflow, pull an export, or set up a recurring report. The DPO oversees compliance, but the analyst is the person making design decisions. If an analyst designs a workflow that pulls 20 unnecessary columns, that's an Article 25 problem regardless of whether a DPO signed off on it.

What counts as the "design stage" for a CSV export?

The design stage is the point at which you decide what data to collect and how to process it, before you execute the workflow. For a CSV export, it's the moment you decide which columns to include, which tool to use, and who gets access. That's when Article 25 applies. Running the query first and reviewing for privacy after is a retrofit, not design.

We have a privacy team. Do I still need to think about this myself?

Yes. Privacy teams set policy and review high-risk processing. They can't review every analyst query in real time. Article 25 requires privacy to be embedded into the culture and habits of everyone who handles personal data, not just reviewed by a central team. The five design questions in this post are the analyst's contribution to that system.

How do I handle a request to pull data for a purpose I haven't processed before?

Stop before pulling. New purpose = new processing activity. Ask: what's the minimum data needed for this purpose? Does processing it require a new lawful basis? Does it involve sensitive PI? Does it require a DPIA? For complex requests, escalate to the privacy team before proceeding. For routine requests, the five design questions are your checkpoint.

What's the minimum documentation I need to satisfy Article 25?

You need to be able to demonstrate that you considered privacy at the design stage and implemented appropriate measures. A one-paragraph record per workflow covering purpose, data categories, lawful basis, recipient, tool, and retention period satisfies both Article 25 (demonstrate implementation) and Article 30 (records of processing). It doesn't need to be complex; it needs to exist before you're asked for it.

Does using a local processing tool like SplitForge satisfy Article 25?

Using a local-first tool that doesn't transmit file contents to a server addresses one Article 25 dimension: the technical measure of minimising data exposure during processing. It doesn't satisfy Article 25 on its own: you still need data minimisation (only necessary columns), access controls, and documentation. But it's a meaningful architectural choice that reduces processing footprint and supports Article 25 compliance at the tool selection level.

Build Privacy Into Your CSV Workflows From Day One

Answer the 5 privacy design questions before every new CSV workflow

Remove unnecessary columns at the query stage — not as a retrofit

Process files locally — no server transmission means no server-side exposure by architecture

Document each workflow in one paragraph — satisfy both Article 25 and Article 30




