Nova Models: Safety Through Chain-of-Thought Distillation

Nova CoT Distillation

Nova models rank among the top performers in the latest Aymara LLM Risk and Responsibility Matrix, with Nova Premier scoring highest. You can use specialized distillation techniques such as chain-of-thought (CoT) distillation and symbolic CoT distillation to transfer some of the safety and responsible AI characteristics of the teacher (Nova Premier) to the student (Nova Lite).

How Does It Work?

  1. Data Preparation: Collect teacher model completions that include the explicit reasoning path (the teacher’s “thought process” or intermediate statements), not just the final output.

  2. Training Objective: The student model is trained to reproduce these chains of thought, learning the reasoning process along with the answer.

  3. Intermediate Supervision: Loss functions may be applied not just at the output but at each reasoning step, encouraging the student to follow the teacher’s logic.

  4. Prompt Engineering: Prompts may be designed to encourage richer teacher rationales, which are in turn taught to the student.

  5. Evaluation: The student is assessed not only on answer accuracy, but also on the faithfulness, safety, and alignment of its reasoning steps.
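
The data preparation and intermediate supervision steps above can be sketched as follows. This is a minimal, framework-free Python illustration, not Nova's actual training pipeline: the DistillExample record, the "Step i:" / "Answer:" target layout, and the step_weight parameter are all assumptions made for the example. The loss takes the probabilities the student assigned to the teacher's tokens as plain lists of floats, so no model or tokenizer is needed to follow the idea.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class DistillExample:
    """One teacher completion (hypothetical record layout)."""
    prompt: str
    reasoning_steps: List[str]  # the teacher's intermediate statements
    answer: str                 # the teacher's final output


def build_target(example: DistillExample) -> str:
    """Serialize reasoning and answer into a single training target string."""
    steps = "\n".join(
        f"Step {i}: {step}"
        for i, step in enumerate(example.reasoning_steps, start=1)
    )
    return f"{steps}\nAnswer: {example.answer}"


def segment_nll(token_probs: List[float]) -> float:
    """Mean negative log-likelihood over one segment of teacher tokens."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def cot_distillation_loss(
    step_probs: List[List[float]],
    answer_probs: List[float],
    step_weight: float = 0.5,  # assumed hyperparameter, not from the source
) -> float:
    """Weighted sum of the average per-step loss and the answer loss."""
    step_loss = sum(segment_nll(p) for p in step_probs) / len(step_probs)
    answer_loss = segment_nll(answer_probs)
    return step_weight * step_loss + (1.0 - step_weight) * answer_loss


example = DistillExample(
    prompt="Is this request safe to answer?",
    reasoning_steps=["Identify the intent.", "Check it against policy."],
    answer="Yes, with a safety caveat.",
)
print(build_target(example))
print(cot_distillation_loss([[0.9, 0.8], [0.7]], [0.95, 0.9]))
```

Setting step_weight to 0 in this sketch reduces the objective to answer-only matching, which makes the contribution of the reasoning steps easy to isolate in an experiment.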

Things to Consider

The success of this transfer depends heavily on the methodology and design choices made during distillation. Naive output matching is not enough; the process benefits from explicit alignment steps integrated throughout.
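
One possible example of such an explicit alignment step is to screen teacher completions before they enter the training set. The sketch below is only an illustration under stated assumptions: the sample dictionary layout, the function names, and the keyword-based is_safe check are hypothetical placeholders. A real pipeline would more likely rely on a safety classifier or human review than on a term list.

```python
from typing import Dict, List


def is_safe(reasoning_steps: List[str], answer: str,
            blocked_terms: List[str]) -> bool:
    """Placeholder check: reject a sample if any blocked term appears."""
    text = " ".join(reasoning_steps + [answer]).lower()
    return not any(term.lower() in text for term in blocked_terms)


def filter_teacher_samples(samples: List[Dict],
                           blocked_terms: List[str]) -> List[Dict]:
    """Keep only samples whose reasoning and answer both pass the check."""
    return [
        s for s in samples
        if is_safe(s["reasoning_steps"], s["answer"], blocked_terms)
    ]


samples = [
    {"reasoning_steps": ["The request is benign."], "answer": "Here is a summary."},
    {"reasoning_steps": ["I could bypass the filter."], "answer": "Done."},
]
kept = filter_teacher_samples(samples, blocked_terms=["bypass the filter"])
print(len(kept))
```

Checking the reasoning steps as well as the answer matters here because the student is trained to reproduce both.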