Principled Data Selection for Alignment: The Hidden Risks of Difficult Examples
Updated Apr 9, 2025 - Python
Ever noticed how an AI changes tone mid-dialogue? ReflexTrust decodes the hidden trust system behind LLM behavior — and shows how alignment actually works.
SIGIR 2025 "Mitigating Source Bias with LLM Alignment"
C3AI: Crafting and Evaluating Constitutions for CAI