Automating GDPR Art. 22 assessments for ML-based scoring systems — practical experience?

Question

Our team is building a scoring system that ranks incoming support tickets by predicted severity and customer churn risk. The output influences which cases get escalated to senior agents within SLA windows — not fully automated decisions, but the ML score is a significant input to the routing logic.

Legal analysis so far:
- GDPR Art. 22 applies to a decision "based solely on automated processing, including profiling," that produces legal effects or similarly significantly affects the data subject. Our system is not fully automated (a human makes the final escalation call), but the ML score strongly biases the queue, so whether it has a "similarly significant" effect is borderline.
- EU AI Act classification: likely "high-risk" under Annex III (employment/worker management adjacent) if the scoring impacts workload distribution in a way that affects performance evaluation.
- SOC 2 Type II: we need documented controls around model governance, drift monitoring, and human override mechanisms.

What we're struggling with:
1. How granular should the Art. 22 documentation be? Per-model, per-deployment, or per-decision-category?
2. The "meaningful information about the logic involved" requirement (Art. 13(2)(f) and 14(2)(g)): does SHAP/LIME feature attribution satisfy this, or do regulators expect more interpretable model architectures?
3. Cross-border data flows: our ML pipeline runs on AWS eu-central-1, but model training data includes tickets from US customers (CCPA considerations).

Has anyone gone through a formal Art. 22 assessment for an ML system in production? What did the supervisory authority (or your DPO) consider sufficient evidence of compliance? Any templates or frameworks that actually worked in practice?

Jurisdictions: EU (primary), DE (BfDI, or the competent state DPA where the BfDI does not have jurisdiction), US-CA (secondary for cross-border data).

Note: This is peer experience exchange, not a request for legal advice. We have external counsel engaged; looking for operational implementation details from teams who've navigated this.

k8s_wiz · Answer

From a practical standpoint, the biggest risk isn't the substantive compliance requirements; it's the evidence trail. Regulators don't just want to know that you have policies, they want to see that the policies are operational. For AI agent systems, this means you need telemetry that captures not just "what happened" but "why the agent decided to do it." We've implemented a compliance shadow log: it runs parallel to the agent's operational log and records, for each action, the regulatory context, the applicable rules, and the decision boundary in force at the time. It's additional infrastructure, but during our last audit it cut evidence collection from 3 weeks to 2 days.
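To make the shadow log idea concrete, here is a minimal sketch of what one record per routing decision could look like. This is not our production code: the field names, the `record_decision` helper, the "escalate"/"standard_queue" labels, and the simple score-versus-threshold rule are all assumptions for illustration, and an in-memory stream stands in for whatever append-only store you would actually use.

```python
import io
import json
from dataclasses import dataclass, asdict, field
from datetime import datetime, timezone
from typing import List, Optional


@dataclass
class ShadowLogEntry:
    """One compliance record per routing decision (hypothetical field names)."""
    ticket_id: str
    model_version: str
    score: float                      # model output for this ticket
    threshold: float                  # decision boundary in force at the time
    action: str                       # what the routing logic did with the score
    applicable_rules: List[str]       # rules the decision was assessed against
    human_reviewer: Optional[str] = None  # set when a person confirms or overrides
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


def record_decision(stream, ticket_id, model_version, score, threshold,
                    applicable_rules, human_reviewer=None):
    """Append one JSON line to the shadow log and return the entry."""
    # Assumed routing rule: escalate when the score reaches the threshold.
    action = "escalate" if score >= threshold else "standard_queue"
    entry = ShadowLogEntry(ticket_id, model_version, score, threshold,
                           action, list(applicable_rules), human_reviewer)
    stream.write(json.dumps(asdict(entry), sort_keys=True) + "\n")
    return entry


# Usage: io.StringIO stands in for an append-only file or log store.
log = io.StringIO()
entry = record_decision(log, "T-1001", "severity-v3", 0.82, 0.75,
                        ["GDPR Art. 22"], human_reviewer="agent-17")
```

Writing one JSON line per decision keeps the shadow log easy to filter by ticket, model version, or rule when an auditor asks for evidence, and storing the threshold alongside the score means you can reconstruct why a ticket was routed the way it was even after the boundary changes.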

