constitutional-ai (AI-Research-SKILLs)
Safety-alignment skill for applying Constitutional AI methods in LLM training.
🛡️ AgentReady threat assessment
MAESTRO 7-layer threat model + OWASP AIVSS risk score for constitutional-ai (AI-Research-SKILLs), derived from its capabilities.
AIVSS 8.2 · High
Overview
A safety/alignment skill from the AI-Research-SKILLs library covering Constitutional AI: self-critique and revision against a constitution during training/fine-tuning. Surface: injects methodology and writes/runs training code.
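The self-critique and revision step described above might be sketched roughly as follows. This is a minimal illustration, not the skill's actual code: the `generate` callable stands in for whatever model call is used, and the `CONSTITUTION` principles, the prompt wording, and the `critique_and_revise` name are all assumptions made for the example.

```python
from typing import Callable, Dict, List

# Hypothetical constitution: illustrative principles, not taken from the skill.
CONSTITUTION: List[str] = [
    "Avoid instructions that could cause physical harm.",
    "Be honest about uncertainty.",
]


def critique_and_revise(
    prompt: str,
    generate: Callable[[str], str],  # assumed: maps a text prompt to a completion
    principles: List[str] = CONSTITUTION,
) -> Dict[str, str]:
    """Draft a response, then critique and revise it once per principle."""
    initial = generate(prompt)
    response = initial
    for principle in principles:
        # Ask the model to critique its current response against one principle.
        critique = generate(
            f"Principle: {principle}\n"
            f"Prompt: {prompt}\n"
            f"Response: {response}\n"
            "Critique the response against the principle:"
        )
        # Ask the model to rewrite the response so it addresses the critique.
        response = generate(
            f"Prompt: {prompt}\n"
            f"Response: {response}\n"
            f"Critique: {critique}\n"
            "Rewrite the response to address the critique:"
        )
    # The (prompt, revised) pair is what would typically be kept as
    # supervised fine-tuning data.
    return {"prompt": prompt, "initial": initial, "revised": response}
```

Passing the model as a plain callable keeps the sketch independent of any particular inference library; each principle costs two model calls (one critique, one revision) on top of the initial draft.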
Key features
- Constitutional AI self-critique loop
- Alignment-focused training guidance
- Sibling to LlamaGuard/NeMo-Guardrails skills
Use cases
- Apply Constitutional AI to a model
- Design an RLAIF-style alignment pipeline
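For the RLAIF-style use case, one possible shape for the AI-feedback step is sketched below: sample two responses per prompt, have a judge model pick the one that better follows a principle, and keep the result as a preference pair. The `sample` and `judge` callables, the "A"/"B" verdict convention, and the `build_preference_pairs` name are hypothetical, chosen only for illustration.

```python
import random
from typing import Callable, Dict, List


def build_preference_pairs(
    prompts: List[str],
    sample: Callable[[str], str],                 # assumed: returns one sampled response
    judge: Callable[[str, str, str, str], str],   # assumed: returns "A" or "B"
    principles: List[str],
    seed: int = 0,
) -> List[Dict[str, str]]:
    """Collect (prompt, chosen, rejected) records labeled by an AI judge."""
    rng = random.Random(seed)  # seeded so principle selection is reproducible
    pairs: List[Dict[str, str]] = []
    for prompt in prompts:
        response_a = sample(prompt)
        response_b = sample(prompt)
        principle = rng.choice(principles)
        verdict = judge(prompt, response_a, response_b, principle)
        if verdict not in ("A", "B"):
            continue  # skip unparseable verdicts rather than guess a label
        if verdict == "A":
            chosen, rejected = response_a, response_b
        else:
            chosen, rejected = response_b, response_a
        pairs.append(
            {
                "prompt": prompt,
                "chosen": chosen,
                "rejected": rejected,
                "principle": principle,
            }
        )
    return pairs
```

Records in this form could then feed a preference model or a preference-optimization trainer; which one is used is a pipeline design decision this sketch does not cover.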