Automating GDPR Art. 22 assessments for ML-based scoring systems — practical experience?
Our team is building a scoring system that ranks incoming support tickets by predicted severity and customer churn risk. The output influences which cases get escalated to senior agents within SLA windows. These are not fully automated decisions, but the ML score is a significant input to the routing logic.

Legal analysis so far:

- GDPR Art. 22 applies to decisions "based solely on automated processing" that produce legal or similarly significant effects. Our system is not fully automated (there is a human in the loop for final escalation), but the ML score strongly biases the queue, so it is a borderline case for "significant effect."
- EU AI Act classification: likely "high-risk" under Annex III (employment/worker management adjacent) if the scoring impacts workload distribution in a way that affects performance evaluation.
- SOC 2 Type II: we need documented controls around model governance, drift monitoring, and human override mechanisms.

What we're struggling with:

1. How granular should the Art. 22 documentation be? Per-model, per-deployment, or per-decision-category?
2. The "meaningful information about the logic involved" requirement (Art. 13/14): does SHAP/LIME feature attribution satisfy this, or do regulators expect more interpretable model architectures?
3. Cross-border data flows: our ML pipeline runs on AWS eu-central-1, but model training data includes tickets from US customers (CCPA considerations).

Has anyone gone through a formal Art. 22 assessment for an ML system in production? What did the supervisory authority (or your DPO) consider sufficient evidence of compliance? Any templates or frameworks that actually worked in practice?

Jurisdictions: EU (primary), DE (BfDI), US-CA (secondary for cross-border data).

Note: This is peer experience exchange, not a request for legal advice. We have external counsel engaged; we're looking for operational implementation details from teams who've navigated this.
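
To make the granularity and human-override questions more concrete, here is a minimal sketch of what a per-decision audit record could look like. It is an illustration only, not our production schema and not something any regulator has endorsed: the class name `RoutingDecisionRecord`, all field names, and the `override_rate` helper are hypothetical, and the attribution values are assumed to come from whatever explainer is in use (e.g. SHAP).

```python
from dataclasses import dataclass, field, asdict
from datetime import datetime, timezone
import json


@dataclass
class RoutingDecisionRecord:
    """One audit entry per routed ticket (hypothetical schema)."""
    ticket_id: str
    model_version: str
    severity_score: float             # model output, assumed to lie in [0, 1]
    top_attributions: dict            # feature name -> attribution value
    model_suggested_escalation: bool  # what the score alone would have done
    human_escalated: bool             # what the agent actually decided
    reviewer_id: str
    decided_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    @property
    def overridden(self) -> bool:
        # True when the human decision differs from the model suggestion.
        return self.model_suggested_escalation != self.human_escalated


def override_rate(records):
    """Fraction of decisions where the human disagreed with the model."""
    if not records:
        return 0.0
    return sum(r.overridden for r in records) / len(records)


# Example values are made up for illustration.
example = [
    RoutingDecisionRecord("T-1001", "sev-model-1.3", 0.91,
                          {"sla_hours_left": 0.42, "prior_escalations": 0.18},
                          True, True, "agent-17"),
    RoutingDecisionRecord("T-1002", "sev-model-1.3", 0.77,
                          {"sla_hours_left": 0.30, "account_tier": 0.11},
                          True, False, "agent-04"),
]

print(json.dumps(asdict(example[0]), indent=2))
print(f"override rate: {override_rate(example):.2f}")
```

The idea would be that an override rate near zero over a long period might suggest the human review is not meaningful in practice, which seems relevant to the "solely automated" question. Whether a log like this counts as sufficient evidence is exactly what we'd like to hear about from teams who have been through an assessment.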