---
layout: page
title: Implementing the DAS3H model for student learning and retention with spaced repetition
description:
img:
importance: 1
category: work
related_publications: true
github:
---

# Modeling Student Learning with Spaced Repetition: Implementing a DAS3H Model

**This post is an accessible guide to implementing the DAS3H model for modeling student learning and memory using spaced repetition techniques.**

---
## 🎯 Motivation

Understanding how students learn and forget is key to designing intelligent educational systems. The **DAS3H** model (item Difficulty, student Ability, Skill, and Student Skill practice History) provides a principled way to capture the effects of **time**, **practice history**, and **knowledge component (KC)** difficulty on memory retention. It builds on earlier memory models by explicitly incorporating the idea that memory decays over time unless refreshed.

This makes DAS3H particularly suitable for modeling **spaced repetition**, the technique used by apps like Anki and Duolingo, where practice is scheduled to optimize long-term retention.
---

## 🧠 What is the DAS3H Model?

The DAS3H model estimates the probability that a student will recall a given knowledge component (KC) based on:

- **Student ability**
- **KC difficulty**
- **KC-specific retention/forgetting rate**
- **Time since last correct attempt**
- **Practice history (correct/incorrect responses)**
Mathematically, a simplified version of the model expresses the probability of a correct response as a logistic function of the log-time since last practice and the other parameters:

\[
P(\text{correct}) = \sigma\left(\theta_s - \delta_k + \gamma_k \cdot \log(1 + t)\right)
\]

Where:
- \( \theta_s \) = student ability
- \( \delta_k \) = KC difficulty
- \( \gamma_k \) = KC-specific decay coefficient (learned; it comes out negative when recall fades with time)
- \( t \) = time since the last correct answer on that KC (the \( 1 + t \) keeps the logarithm defined at \( t = 0 \))
- \( \sigma(x) \) = sigmoid function

The full DAS3H model additionally counts past attempts and successes within multiple time windows; the single log-time term above is a common simplification. Either way, the model assumes **each KC has its own forgetting curve**, which makes it particularly suited to domains where retention rates vary between concepts (e.g., math formulas vs. vocabulary words).
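To make the formula concrete, here is a minimal sketch of the recall probability as a function of the gap since last practice. The parameter values are made up for illustration:

```python
import math

def p_recall(theta, delta, gamma, t):
    """Recall probability under the simplified logistic model.

    theta: student ability, delta: KC difficulty,
    gamma: KC decay coefficient (negative means recall fades with time),
    t: time since the last correct attempt (e.g., in hours).
    """
    logit = theta - delta + gamma * math.log1p(t)
    return 1.0 / (1.0 + math.exp(-logit))

# An able student (theta = 1.0) on an average KC (delta = 0.0) with decay gamma = -0.3:
print(round(p_recall(1.0, 0.0, -0.3, 1), 3))    # one hour after practice -> 0.688
print(round(p_recall(1.0, 0.0, -0.3, 168), 3))  # one week later -> 0.368
```

Because \( \gamma_k \) is negative, predicted recall decays smoothly as the practice gap grows.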

---

## 🛠️ Implementation Overview

Here’s a quick overview of how to implement DAS3H.

### 1. **Data Format**

The input dataset should contain the following fields:

- `student_id`
- `kc_id` (knowledge component)
- `timestamp` of the attempt
- `is_correct` (binary)
- `attempt_number`

Optionally, store time since last correct response per `(student, KC)` pair.
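Computing the time since the last correct response is the only non-trivial preprocessing step. A possible pandas sketch (column names match the fields above; the toy data is made up):

```python
import pandas as pd

# Toy attempt log with the fields listed above (timestamps in seconds).
df = pd.DataFrame({
    "student_id": [0, 0, 0, 1],
    "kc_id":      [7, 7, 7, 7],
    "timestamp":  [0, 3600, 86400, 100],
    "is_correct": [1, 0, 1, 1],
})
df = df.sort_values("timestamp")

# For each row, the timestamp of the most recent *correct* attempt by the same
# student on the same KC: keep timestamps only where the attempt was correct,
# shift so a row never sees itself, then forward-fill within each group.
last_correct = (
    df["timestamp"].where(df["is_correct"] == 1)
      .groupby([df["student_id"], df["kc_id"]])
      .transform(lambda s: s.shift().ffill())
)
df["delta_t"] = df["timestamp"] - last_correct  # NaN on first exposure to a KC
```

Incorrect attempts are skipped over by the forward-fill, so `delta_t` always measures the gap back to the last *success*, matching the definition of \( t \) above.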

### 2. **Parameter Initialization**

Each KC has its own:
- Difficulty \( \delta_k \)
- Forgetting rate \( \gamma_k \)

Each student has:
- Ability \( \theta_s \)

You can randomly initialize these parameters or use heuristics from early training performance.
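One possible heuristic (an assumption of this post, not part of the original model): set each KC's difficulty from its observed error rate on the logit scale, start abilities at zero, and start decay coefficients slightly negative:

```python
import numpy as np
import pandas as pd

def init_params(df, num_students, num_kcs, eps=1e-2):
    """Heuristic starting values: KC difficulty from observed error rates
    (clipped to avoid infinite logits), abilities neutral, decay mildly negative."""
    delta = np.zeros(num_kcs)
    for kc, grp in df.groupby("kc_id"):
        p = float(np.clip(grp["is_correct"].mean(), eps, 1 - eps))
        delta[kc] = -np.log(p / (1 - p))  # hard KC (low p) -> large positive delta
    theta = np.zeros(num_students)        # neutral starting ability
    gamma = np.full(num_kcs, -0.1)        # mild forgetting prior
    return theta, delta, gamma

# Tiny demo: KC 0 is answered correctly 80% of the time, KC 1 only 20%.
demo = pd.DataFrame({
    "kc_id":      [0] * 5 + [1] * 5,
    "is_correct": [1, 1, 1, 1, 0, 1, 0, 0, 0, 0],
})
theta0, delta0, gamma0 = init_params(demo, num_students=3, num_kcs=2)
```

Starting near the data in this way usually just speeds up convergence; random initialization works too.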

### 3. **Training**

We minimize the **binary cross-entropy** loss between predicted and actual correctness using gradient descent.

You can use PyTorch or JAX for a flexible implementation. Here's a simplified PyTorch sketch:

```python
import torch
import torch.nn as nn

class DAS3H(nn.Module):
    def __init__(self, num_students, num_kcs):
        super().__init__()
        self.ability = nn.Embedding(num_students, 1)     # theta_s, one scalar per student
        self.difficulty = nn.Embedding(num_kcs, 1)       # delta_k, one scalar per KC
        self.forgetting_rate = nn.Embedding(num_kcs, 1)  # gamma_k; trains toward negative values as recall decays

    def forward(self, student_ids, kc_ids, log_time_since_last_correct):
        theta = self.ability(student_ids).squeeze(-1)
        delta = self.difficulty(kc_ids).squeeze(-1)
        gamma = self.forgetting_rate(kc_ids).squeeze(-1)
        logit = theta - delta + gamma * log_time_since_last_correct
        return torch.sigmoid(logit)
```

You’ll need to batch your training data and compute log(1 + t) from the timestamps (the + 1 avoids taking the log of zero for back-to-back attempts).
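A minimal training loop might look as follows. Synthetic data stands in for a real batch loader, and the raw-tensor parameters mirror the embeddings in the module above:

```python
import torch

torch.manual_seed(0)
num_students, num_kcs, n = 50, 10, 2000

# Synthetic training batch (replace with batches from your real attempt log).
student_ids = torch.randint(num_students, (n,))
kc_ids = torch.randint(num_kcs, (n,))
log_t = torch.log1p(torch.rand(n) * 168.0)  # gaps of up to one week, in hours
is_correct = torch.randint(0, 2, (n,)).float()

# Same parameters as the module above, held as raw tensors for brevity.
theta = torch.zeros(num_students, requires_grad=True)
delta = torch.zeros(num_kcs, requires_grad=True)
gamma = torch.full((num_kcs,), -0.1, requires_grad=True)

opt = torch.optim.Adam([theta, delta, gamma], lr=0.05)
loss_fn = torch.nn.BCEWithLogitsLoss()  # fuses the sigmoid for numerical stability

losses = []
for epoch in range(100):
    opt.zero_grad()
    logits = theta[student_ids] - delta[kc_ids] + gamma[kc_ids] * log_t
    loss = loss_fn(logits, is_correct)
    loss.backward()
    opt.step()
    losses.append(loss.item())
```

Working on logits with `BCEWithLogitsLoss` rather than applying the sigmoid first is a standard PyTorch idiom that avoids saturating gradients.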

---

## 📊 Evaluating the Model

To evaluate the performance of the DAS3H model, you can use several standard metrics from binary classification tasks, as well as metrics specific to educational data:

- **Accuracy**: Measures the percentage of correctly predicted responses.
- **AUC-ROC (Area Under the ROC Curve)**: Captures the ability of the model to distinguish between correct and incorrect responses.
- **Log-loss (Cross-Entropy Loss)**: Measures the calibration of the predicted probabilities.
- **Mean Absolute Error on Recall Time**: Optional, if you simulate recall prediction over time.

Additionally, you can segment performance by:
- **Time since last attempt** to see if the model degrades reasonably with time.
- **Knowledge component (KC)** to verify KC-specific learning/forgetting curves.
- **Student proficiency levels** to ensure the model generalizes across learner types.

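The classification metrics are all available in scikit-learn. A small sketch on made-up held-out data (`y_true` and `y_prob` are placeholders for your test labels and model outputs):

```python
import numpy as np
from sklearn.metrics import accuracy_score, log_loss, roc_auc_score

# Made-up held-out labels and predicted recall probabilities.
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_prob = np.array([0.9, 0.2, 0.7, 0.6, 0.4, 0.8, 0.3, 0.5])

acc = accuracy_score(y_true, y_prob >= 0.5)  # threshold probabilities at 0.5
auc = roc_auc_score(y_true, y_prob)          # ranking quality, threshold-free
ll = log_loss(y_true, y_prob)                # calibration of the probabilities
```

For the segmented analyses, the same calls can be applied per bucket after a `groupby` on time gap, KC, or student.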
---

## 💡 Insights

Implementing and analyzing the DAS3H model reveals several key insights:

- **Temporal modeling is critical**: Time since last practice is a powerful predictor of recall. DAS3H leverages this directly through the log-time decay term.
- **Different concepts decay differently**: The model learns that not all knowledge components (KCs) are retained equally. Some require more frequent review.
- **Students vary in ability and retention**: The separation of student ability and KC decay rates allows personalization without overfitting.
- **Supports spaced repetition planning**: DAS3H can be used to estimate optimal review times, supporting intelligent tutoring systems that adapt over time.

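The last point can be made concrete: inverting the simplified recall formula gives the time at which predicted recall drops to a target threshold. A sketch under this post's assumptions (log(1 + t) time term, negative decay coefficient):

```python
import math

def next_review_time(theta, delta, gamma, target=0.9):
    """Time t at which sigmoid(theta - delta + gamma * log(1 + t)) = target.

    Assumes gamma < 0 (recall decays) and that recall starts above `target`,
    so the returned t is positive.
    """
    logit_target = math.log(target / (1 - target))
    return math.exp((logit_target - (theta - delta)) / gamma) - 1

# Schedule the next review when predicted recall falls to 60%.
t_star = next_review_time(theta=1.0, delta=0.0, gamma=-0.3, target=0.6)
```

A tutoring system would compute `t_star` per `(student, KC)` pair and queue each review accordingly.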
---

## 🔮 Future Directions

There are many ways to build on and extend the DAS3H model:

- **Meta-learning extensions**: Learn how forgetting rates vary across student profiles to dynamically adjust model parameters.
- **Bayesian DAS3H**: Place priors over student and KC parameters to improve robustness in low-data settings.
- **Item-level effects**: Incorporate question-level embeddings to account for difficulty or bias in specific items.
- **Curriculum planning**: Use predicted recall probabilities to optimize what the student should review next.
- **Active learning for education**: Select the next KC to quiz on based on expected information gain about student knowledge.

---

## 🧰 Related Tools and Libraries

Here are some tools that are helpful for implementing or experimenting with knowledge tracing and memory models:

- [`pyBKT`](https://github.com/CAHLR/pyBKT): A Python library for Bayesian Knowledge Tracing (BKT), suitable for simpler models.
- [`EduMPL`](https://github.com/educational-ml/eduml): Educational Machine Learning library with multi-task learning capabilities.
- [`Torch-KT`](https://github.com/woojihoon/torch-kt): A PyTorch-based framework for implementing knowledge tracing models, including deep KT.
- [Riiid Answer Correctness Prediction (Kaggle)](https://www.kaggle.com/c/riiid-test-answer-prediction/data): Useful for large-scale experiments with temporal learning data.

---
