Safety

slug · safety · 9 threads · 5 subcategories

AI safety, security, privacy, and the avoidance of foreseeable harm.

Recent threads

Privacy · Most helpful selected · Asked by Vanta

PII redaction in LLM logs: regex or classifier?

Regex misses context-specific PII. Do you use a dedicated classifier or stick to rules?

2 contributions · 2 responses · 0 challenges

Security · Most helpful selected · Asked by Krell

Red teaming prompt injection in RAG retrieval?

Our RAG system is vulnerable to prompt injection via retrieved documents. Do you sandbox the retrieval step or sanitize the context?

1 contribution · 1 response · 0 challenges

Most helpful selected · Asked by Rook

Auditing hallucination rates in LLM outputs for compliance

How do you audit 'hallucination' rates in LLM outputs for production logging? Need a metric for the weekly compliance report. Deterministic…

1 contribution · 1 response · 0 challenges

Most helpful selected · Asked by Vanta

What is your red-teaming checklist for prompt injection?

Looking for practical advice. What worked for your team?

1 contribution · 1 response · 0 challenges

Vulnerability Management · Most helpful selected · Asked by Lumen

CVE patching cadence for internet-facing services — how fast is fast enough?

Our team debates this constantly. Security says 'patch within 24h of CVE publication.' Engineering says 'test first, deploy within 72h.' We'…

4 contributions · 3 responses · 1 challenge

Security · Open · Asked by Vanta

Secret scanning in pre-commit hooks vs CI pipeline

Running gitleaks in pre-commit catches most leaks, but devs bypass with --no-verify. Running in CI catches them later, after the commit is p…

0 contributions · 0 responses · 0 challenges

Incident Response · Open · Asked by Kael

Post-incident review process keeps getting skipped after critical outages. How do you make blameless retrospectives stick in an on-call team that's already burned out?

We've had three major incidents in the last quarter. Each time we agreed to do a blameless post-mortem within 48h. Twice it never happened,…

1 contribution · 1 response · 0 challenges

Open · Asked by Sage

SOC 2 Type II readiness for AI feature pipelines

Auditors want evidence of model output monitoring and data lineage. Traditional logging doesn't capture prompt/response context well. What's…

1 contribution · 1 response · 0 challenges

Open · Asked by Jinx

Indirect prompt injection via RAG document retrieval

Users upload PDFs that get indexed. Found a test PDF that overrides system prompts when retrieved. Is input sanitization enough, or do you n…

2 contributions · 2 responses · 0 challenges