If you're building AI systems or handling customer data in Quebec, there's a distinction in Law 25 that can make or break your compliance strategy: the difference between de-identified and anonymized data. Many organizations assume they're the same. They're not. And the consequences of getting it wrong are severe.
Here's the key insight most enterprises miss: de-identified data is still personal information under Law 25. It remains fully regulated. Anonymized data is not. Understanding this distinction is fundamental to building AI systems that are both innovative and compliant.
The Three Categories: A Clear Framework
Law 25 recognizes three distinct categories of data. Each carries different obligations, different risks, and different opportunities for use in AI systems.
Personal Information
Any information that can identify a natural person, directly or indirectly, either alone or in combination with other data.
De-identified (Depersonalized) Data
Personal information that has been modified so the individual can no longer be directly identified. However, re-identification remains possible. Still regulated as personal information.
Anonymized Data
Information that has been irreversibly altered so the individual can no longer be identified directly or indirectly. No longer personal information.
Personal Information: What Counts?
Under Law 25, personal information is broadly defined. It includes any information that can identify a natural person directly or indirectly. The key word is "indirectly." Even if data doesn't contain a name, it may still be personal information if it could be combined with other available information to identify someone.
Common Examples
- Direct identifiers: Full name, social insurance number, health insurance number, driver's license number
- Contact information: Email address, phone number, home address
- Digital identifiers: IP addresses, device IDs, cookies, user account IDs
- Biometric data: Fingerprints, facial recognition data, voice prints, retinal scans
- Sensitive categories: Medical records, financial information, racial or ethnic origin, political opinions, religious beliefs
Common Mistake: Assuming that removing someone's name makes the data non-personal. A customer ID, purchase history, and postal code may be enough to re-identify an individual. If re-identification is possible, it's still personal information.
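As a rough illustration of the point above, the sketch below tags the fields of a record as direct or indirect identifiers and treats the record as personal information if either kind is present. The field lists and function names are assumptions made for this example, not a taxonomy defined by Law 25, and a real assessment would also consider what other data the record could be combined with.

```python
# Hypothetical field lists for illustration only; Law 25 does not define them.
DIRECT_IDENTIFIERS = {"name", "email", "phone", "social_insurance_number"}
INDIRECT_IDENTIFIERS = {"customer_id", "postal_code", "purchase_history", "ip_address"}


def classify_fields(record):
    """Group a record's field names by how they could identify a person."""
    groups = {"direct": [], "indirect": [], "other": []}
    for field_name in record:
        if field_name in DIRECT_IDENTIFIERS:
            groups["direct"].append(field_name)
        elif field_name in INDIRECT_IDENTIFIERS:
            groups["indirect"].append(field_name)
        else:
            groups["other"].append(field_name)
    return groups


def is_personal_information(record):
    """Conservative rule: any direct or indirect identifier keeps the record regulated."""
    groups = classify_fields(record)
    return bool(groups["direct"] or groups["indirect"])


# The name has been removed, but indirect identifiers remain.
record_without_name = {
    "customer_id": "C-78234",
    "postal_code": "H3B 1A1",
    "purchase_history": ["order-1", "order-2"],
}
print(classify_fields(record_without_name))
print(is_personal_information(record_without_name))
```

The rule is deliberately conservative: removing the name changes nothing as long as any indirect identifier is left in the record.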
Concrete Example: E-commerce Customer
Consider a customer record with:
- Name: Marie Tremblay
- Email: marie.tremblay@email.com
- Address: 1234 Rue Sainte-Catherine, Montreal, H3B 1A1
- Purchase history: 47 transactions over 3 years
- Customer ID: C-78234
This is clearly personal information. Every field either directly identifies Marie or contributes to identifying her.
De-identified Data: The Trap Most Organizations Fall Into
De-identification (also called depersonalization in Quebec law) involves modifying personal information so that the individual can no longer be directly identified. The critical word here is "directly."
Under Law 25, de-identified data is defined as personal information that "no longer allows the person concerned to be directly identified." Note what this definition does not say: it does not say the person cannot be identified at all.
Critical Point: De-identified data remains personal information under Law 25. All privacy obligations continue to apply: consent requirements, security measures, breach notification, retention limits, and access rights. De-identification is a security measure, not an escape from regulation.
Concrete Example: De-identified Customer
Taking the same customer record and de-identifying it:
- Name: [Removed]
- Email: [Removed]
- Address: Montreal, H3B (first 3 characters of postal code)
- Purchase history: 47 transactions over 3 years
- Customer ID: C-78234
Is this still personal information? Yes. The customer ID alone allows re-identification by cross-referencing internal databases. Even without the customer ID, the combination of location and detailed purchase history could potentially identify Marie, especially for unusual purchase patterns.
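The re-identification path described above can be sketched in a few lines. This is a minimal illustration, assuming a hypothetical internal lookup table (`crm`) keyed by customer ID; the function names and record layout are invented for the example.

```python
from typing import Optional


def deidentify(record: dict) -> dict:
    """Drop direct identifiers and coarsen the postal code to its first 3 characters."""
    return {
        "customer_id": record["customer_id"],
        "region": record["postal_code"][:3],
        "transactions": record["transactions"],
    }


def reidentify(deidentified: dict, crm: dict) -> Optional[str]:
    """Recover the name by looking the customer ID up in an internal table."""
    match = crm.get(deidentified["customer_id"])
    return match["name"] if match else None


original = {
    "customer_id": "C-78234",
    "name": "Marie Tremblay",
    "email": "marie.tremblay@email.com",
    "postal_code": "H3B 1A1",
    "transactions": 47,
}
crm = {original["customer_id"]: original}  # hypothetical internal database

shared = deidentify(original)
print(shared)                   # no name or email in the shared record
print(reidentify(shared, crm))  # the customer ID still leads back to the person
```

Anyone with access to the internal table can undo the de-identification with a single lookup, which is why the record stays regulated.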
Why De-identification Fails for AI Training
Many organizations assume that de-identifying training data allows them to use it freely for AI development. This is incorrect under Law 25:
- De-identified data requires the same legal basis as personal data
- Consent obtained for one purpose (e.g., service delivery) doesn't automatically extend to AI training
- If the AI model could theoretically be used to re-identify individuals, you're processing personal information
Anonymized Data: The High Bar
Anonymization is fundamentally different from de-identification. Under Law 25, information is anonymized when it is, at all times, reasonably foreseeable in the circumstances that it irreversibly no longer allows the person to be identified directly or indirectly.
The two key requirements are:
- Irreversibility: There is no practical way to reverse the process and recover the original identity
- No indirect identification: Even combining the data with other available information cannot identify the individual
The Reward: Once data is truly anonymized according to Law 25's requirements, it is no longer considered personal information, and Law 25's obligations for personal information no longer apply to it. You can use it for AI training, share it with partners, or publish it without consent obligations.
Concrete Example: Anonymized Statistics
Instead of individual customer records, consider aggregated statistics:
- Customers in Montreal H3B area: 2,847
- Average transactions per customer (2023-2026): 34
- Most popular product category: Electronics
- Average basket size: $127
This is anonymized data. No individual can be identified from these statistics. The transformation is irreversible. You cannot work backwards from "2,847 customers" to identify Marie Tremblay.
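A minimal sketch of how such statistics might be produced is shown below. It assumes records that carry only a region and a transaction count, and it suppresses any group smaller than a threshold so that no published figure describes just a handful of people. The threshold of 10 and the function name are assumptions for illustration; Law 25 does not prescribe a specific value, and aggregation with a threshold does not by itself guarantee that the result qualifies as anonymized.

```python
from collections import defaultdict

MIN_GROUP_SIZE = 10  # hypothetical threshold, not a value taken from Law 25


def aggregate_by_region(records, min_group_size=MIN_GROUP_SIZE):
    """Return per-region counts and averages, dropping groups that are too small."""
    groups = defaultdict(list)
    for record in records:
        groups[record["region"]].append(record["transactions"])

    stats = {}
    for region, counts in groups.items():
        if len(counts) < min_group_size:
            continue  # suppress small cells that could single out individuals
        stats[region] = {
            "customers": len(counts),
            "avg_transactions": round(sum(counts) / len(counts), 1),
        }
    return stats


# Toy data, invented for this example.
records = [{"region": "H3B", "transactions": 30 + (i % 9)} for i in range(50)]
records.append({"region": "G1R", "transactions": 47})  # a group of one

summary = aggregate_by_region(records)
print(summary)  # the single-customer region is not published
```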
The Practical Reality: Anonymization Is Extremely Difficult
Here's what most vendors won't tell you: true anonymization is nearly impossible for granular data. Diane Poitras, president of Quebec's Commission d'accès à l'information (CAI), has stated that the Commission considers it "virtually impossible" to anonymize personal information, except for aggregated statistics.
Why? Re-identification attacks are increasingly sophisticated:
- Linkage attacks: Combining your "anonymized" data with external datasets to re-identify individuals
- Inference attacks: Using machine learning to infer identities from patterns in the data
- Differencing attacks: Comparing datasets before and after an individual's data was added to identify them
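To make the last item in the list concrete, here is a small sketch of a differencing attack on aggregate releases. The data and function names are invented for the example; the point is only that two harmless-looking totals can isolate one person when the underlying population differs by a single individual.

```python
def publish_stats(records):
    """Release only aggregate figures: customer count and total spend."""
    return {
        "customers": len(records),
        "total_spend": sum(r["spend"] for r in records),
    }


def differencing_attack(stats_before, stats_after):
    """If exactly one person was added, the difference is that person's spend."""
    if stats_after["customers"] - stats_before["customers"] != 1:
        return None
    return stats_after["total_spend"] - stats_before["total_spend"]


# Toy data, invented for this example.
january = [{"customer": i, "spend": 100 + i} for i in range(1000)]
february = january + [{"customer": 1000, "spend": 842}]

leaked_spend = differencing_attack(publish_stats(january), publish_stats(february))
print(leaked_spend)  # the new customer's individual spend
```

Neither release contains an individual record, yet subtracting one from the other reveals exactly what the newly added customer spent.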
A famous example: research by Latanya Sweeney estimated that 87% of the U.S. population could be uniquely identified using just three data points: 5-digit ZIP code, birth date, and gender.
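One way to check a dataset for this kind of exposure is to count how many records are unique on a set of quasi-identifiers. The sketch below is an assumption-laden illustration with toy data: it reports the share of unique records and the size of the smallest group (the "k" in k-anonymity). The column names are hypothetical.

```python
from collections import Counter


def group_sizes(records, quasi_identifiers):
    """Count how many records share each combination of quasi-identifier values."""
    return Counter(tuple(r[q] for q in quasi_identifiers) for r in records)


def uniqueness_rate(records, quasi_identifiers):
    """Fraction of records whose quasi-identifier combination appears only once."""
    sizes = group_sizes(records, quasi_identifiers)
    unique = sum(1 for r in records
                 if sizes[tuple(r[q] for q in quasi_identifiers)] == 1)
    return unique / len(records)


def smallest_group(records, quasi_identifiers):
    """Size of the smallest group, i.e. the k in k-anonymity."""
    return min(group_sizes(records, quasi_identifiers).values())


# Toy data, invented for this example.
people = [
    {"zip": "02138", "birth_date": "1980-01-01", "gender": "F"},
    {"zip": "02138", "birth_date": "1980-01-01", "gender": "F"},
    {"zip": "02138", "birth_date": "1975-06-30", "gender": "M"},
    {"zip": "02139", "birth_date": "1990-03-12", "gender": "F"},
]
quasi = ["zip", "birth_date", "gender"]
print(uniqueness_rate(people, quasi))  # share of records that stand alone
print(smallest_group(people, quasi))   # 1 means at least one person is unique
```

A smallest group of 1 means at least one record can be singled out by anyone who knows those three attributes about the person.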
The Comparison Table
| Aspect | Personal Information | De-identified | Anonymized |
|---|---|---|---|
| Still personal info? | Yes | Yes | No |
| Law 25 applies? | Fully | Fully | No |
| Consent required? | Yes | Yes | No |
| Breach notification? | Required | Required | Not applicable |
| Re-identification possible? | Already identifiable | Possible | Not reasonably foreseeable |
| Use for AI training? | Needs consent | Needs consent | Free to use |
| Example | Marie Tremblay, marie.tremblay@email.com | Customer C-78234, Montreal H3B | 2,847 customers, avg 34 transactions |
Implications for AI Systems
If you're building AI systems that process data from Quebec residents, here's what this means in practice:
Training Data
- De-identified training data is still regulated. You need proper consent or legal basis.
- Only truly anonymized data (for example, aggregated statistics, or synthetic data that cannot be traced back to real individuals) can be used without consent.
- If your model memorizes training data and could potentially reveal it, you're still processing personal information.
Model Outputs
- If your AI can output information that identifies individuals, the outputs are personal information.
- Recommendation systems that use personal data need proper consent, even if the user doesn't see the raw data.
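As a hedged illustration of checking model outputs, the sketch below scans generated text for two obvious identifier formats (email addresses and Canadian postal codes) and redacts them. The patterns and function names are assumptions for this example. Pattern matching catches only well-formed identifiers, so a clean result does not show that an output is free of personal information.

```python
import re

# Simplified patterns for illustration; real identifiers take many more forms.
EMAIL_PATTERN = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
# Canadian postal code such as "H3B 1A1" (letter-digit-letter, digit-letter-digit).
POSTAL_CODE_PATTERN = re.compile(r"\b[A-Za-z]\d[A-Za-z][ -]?\d[A-Za-z]\d\b")


def find_identifiers(text):
    """Return the email addresses and postal codes found in a piece of text."""
    return {
        "emails": EMAIL_PATTERN.findall(text),
        "postal_codes": POSTAL_CODE_PATTERN.findall(text),
    }


def redact(text):
    """Replace matched identifiers with placeholders before the text is shown."""
    text = EMAIL_PATTERN.sub("[email removed]", text)
    return POSTAL_CODE_PATTERN.sub("[postal code removed]", text)


model_output = "Ship to marie.tremblay@email.com at H3B 1A1."
print(find_identifiers(model_output))
print(redact(model_output))
```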
Data Sharing
- Sharing de-identified data with third parties (including AI vendors) requires the same safeguards as sharing personal data.
- Contractual protections, data processing agreements, and security requirements all apply.
Re-identification is an Offense: Under Law 25, attempting to re-identify a person using de-identified or anonymized information without authorization is a specific offense with significant penalties. This applies to anyone, including AI researchers and data scientists.
Practical Recommendations
1. Default to assuming it's personal information. Unless you can prove irreversible anonymization meeting the CAI's strict standards, treat your data as regulated.
2. Use synthetic data for AI development. Generate artificial data that mimics statistical properties without containing real personal information.
3. Implement privacy-preserving techniques. Differential privacy, federated learning, and secure multi-party computation can enable AI development without exposing personal data.
4. Document your anonymization process. Law 25 requires organizations to document how anonymization was performed and maintain records demonstrating compliance.
5. Get expert review. Before claiming data is anonymized, have privacy experts or the CAI review your methodology. The consequences of getting it wrong are severe.
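Of the techniques named in recommendation 3, differential privacy is the easiest to show in miniature. The sketch below applies the Laplace mechanism to a single count query using only the standard library. The function names and the epsilon value are assumptions for illustration; a production system would also need to track the privacy budget across all queries, and using this mechanism does not by itself establish that a release is anonymized under Law 25.

```python
import random


def laplace_noise(scale, rng):
    """Sample Laplace(0, scale) as the difference of two exponential draws."""
    return rng.expovariate(1 / scale) - rng.expovariate(1 / scale)


def dp_count(true_count, epsilon, rng=None):
    """Return a noisy count; smaller epsilon means more noise and more privacy."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    sensitivity = 1.0  # adding or removing one person changes a count by at most 1
    return true_count + laplace_noise(sensitivity / epsilon, rng)


# Hypothetical query: how many customers are in a region?
noisy = dp_count(2847, epsilon=0.5)
print(round(noisy))  # close to the true count, but not exact
```

Because the noise is calibrated to the most that one individual can change the answer, the difference between two releases no longer isolates a single person's contribution the way exact totals can.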
The Bottom Line
The distinction between de-identified and anonymized data is one of the most misunderstood aspects of Quebec's privacy framework. Many organizations have built AI systems on the assumption that de-identification frees them from privacy obligations. It does not.
True anonymization is possible, but it's the exception, not the rule. For most practical purposes, if your data started as personal information, it remains personal information. Plan accordingly.
Building AI systems that respect these boundaries isn't just about compliance. It's about building trust with your users and creating sustainable, ethical AI practices that will serve your organization for the long term.