Commit ced5046: Update index.md

1 parent 924e1a5 commit ced5046

File tree: 1 file changed (+1 / -2 lines)
  • content/posts/2025-01-22-decision-tree-reward-model

content/posts/2025-01-22-decision-tree-reward-model/index.md

Lines changed: 1 addition & 2 deletions
@@ -13,7 +13,7 @@ draft: false
 math: true
 ---
 + **Author** [Min Li](https://min-li.github.io/)
-+ Code: https://github.com/RLHFlow/RLHF-Reward-Modeling/decision_tree/
++ **Code**: https://github.com/RLHFlow/RLHF-Reward-Modeling/tree/main/decision_tree
 + **Models**:
 + [Decision-Tree-Reward-Gemma-2-27B](https://huggingface.co/RLHFlow/Decision-Tree-Reward-Gemma-2-27B)
 + [Decision-Tree-Reward-Llama-3.1-8B](https://huggingface.co/RLHFlow/Decision-Tree-Reward-Llama-3.1-8B)
@@ -120,7 +120,6 @@ The axis labels of models are ordered by their agreement with human preferences,
 
 Note:
 * `Human` here means the human annotators who label the preference data of the HelpSteer2-Preference dataset. Note that humans typically have diverse preferences and different LLMs are aligned with different human annotators. So this heatmap is just a reference based on the HelpSteer2-Preference dataset and does not imply any particular LLM is poorly aligned with human preferences.
-* Some LLMs do not follow our prompt template well ... We demonstrate the success rate below to let readers be aware that the metric computation for them is not as reliable as other LLMs
 
 **Preference Similarity Visualization with UMAP.** To further enhance our understanding of these relationships, we employed UMAP dimensionality reduction to project the preference patterns into a more interpretable 2D space:
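For readers unfamiliar with the UMAP step referenced in the diff context above, the snippet below is a minimal sketch of how such a 2D projection of preference patterns can be produced with the `umap-learn` package. It is not the post's actual pipeline: the model list and the agreement matrix are illustrative placeholders, whereas the blog post derives these values from the HelpSteer2-Preference annotations.

```python
# Minimal sketch (not the post's actual code): project per-model preference
# patterns into 2D with UMAP. Model names and agreement values are placeholders.
import numpy as np
import umap  # pip install umap-learn

models = ["Human", "Decision-Tree-Reward-Llama-3.1-8B", "Model-A", "Model-B"]

# Each row is one model's preference pattern, e.g. its agreement rate with
# every other model on the same preference pairs (values are made up).
agreement = np.array([
    [1.00, 0.78, 0.72, 0.74],
    [0.78, 1.00, 0.70, 0.69],
    [0.72, 0.70, 1.00, 0.81],
    [0.74, 0.69, 0.81, 1.00],
])

# n_neighbors must stay below the number of samples for this tiny toy matrix.
reducer = umap.UMAP(n_components=2, n_neighbors=3, random_state=42)
embedding = reducer.fit_transform(agreement)  # shape: (len(models), 2)

for name, (x, y) in zip(models, embedding):
    print(f"{name}: ({x:.2f}, {y:.2f})")
```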