1 Introduction
With the increasing popularity of online education, various teaching resources are flourishing on the Internet, which greatly facilitates students’ learning and makes online learning acceptable and adaptable to students. However, as network resources grow, how to recommend suitable resources [1–3] to students has become a hot research topic. As one of the most important teaching resources, exercises are vital in consolidating and improving students’ conceptual knowledge. In recent years, numerous research efforts have been dedicated to exercise recommendation [4–6]. Classical exercise recommendation algorithms attempt to learn vector representations (i.e., embeddings) of students and exercises. Hybrid deep collaborative filtering (HB-DeepCF) [7] is a typical hybrid recommendation method built on a deep collaborative filtering (CF) model, which obtains a student’s (an exercise’s) embedding by mapping from pre-existing ratings. More recently, researchers have adopted cognitive diagnosis models to reveal students’ mastery of knowledge concepts and thereby advance recommendation performance [8,9]. For instance, exercise recommendation based on knowledge concept prediction (KCP-ER) [10] designs a knowledge concept prediction module that obtains embeddings of knowledge concept coverage and student mastery; an exercise filtering module is then used to filter the exercises. The LSTM-based contextual knowledge tracing approach (LSTMCQP) [11] designs a knowledge tracing method to capture students’ knowledge states and then adopts a personalized long short-term memory (LSTM) method to perform exercise recommendation. We argue that an inherent limitation of such methods is that finer-grained portraits of students and exercises are not encoded properly. As such, the resultant recommendations may not be sufficient to capture ‘suitable’ preferences for students.
Exercise recommendation suggests ‘suitable’ exercises for students. An ideal exercise recommender system should satisfy novelty, accuracy, and diversity. Fig. 1 illustrates the importance of these three elements during the recommendation process. Novelty means that a recommended exercise may contain new knowledge concepts that the student does not yet know (or has answered incorrectly before), which helps the student learn new knowledge. For example, in Figs. 1 (a) and (b), e2 and e4 are recommended to Ann to practice the new knowledge concepts k5 (equivalent fractions) and k6 (unit rate). Accuracy suggests that the exercises recommended to students should be of appropriate difficulty. Exercises that are too difficult may exert a negative impact on students’ learning motivation, while exercises that are too easy will degrade students’ learning interest. Thus, the difficulty of the recommended exercises should change smoothly relative to what students have already solved. As seen from Figs. 1 (a) and (b), Ann has answered e1 correctly, and e4 is of similar difficulty to e1, so it is reasonable to recommend e4 to Ann. Diversity reflects that the knowledge concepts of the recommended exercises should differ from one another. Recommending a list of exercises with different knowledge concepts keeps students engaged and eventually helps them master the knowledge comprehensively. In summary, the exercises we recommend for Ann are e2 and e4.

Figure 1. Exercise recommendation: (a) students’ mastery of knowledge with past response records and (b) difficulty of exercises pertinent to each knowledge concept and the exercise-knowledge concept indication.
We believe the interactions between similar students (exercises) are key to generating suitable exercise recommendation results. This is illustrated with Ann in Fig. 1 (a), where Ann and Bob share similar mastery of knowledge. Thus, the exercises performed by Bob carry significant reference value for Ann. Ann has correctly answered e1 and e3 in the past. Hence, e2, which Bob has answered correctly but Ann has not attempted (or not answered correctly), will be recommended to Ann. As suggested in Fig. 1 (b), e1 and e4 exhibit similar difficulty levels. Given that Ann answers e1 correctly, e4 may be a suitable recommendation candidate for her.
Motivated by these insights, in this work we keep abreast of the ongoing developments in cognitive diagnosis and relevant random walk techniques [12,13] and propose personalized exercise recommendation with student and exercise portraits (PERP). We aim to incorporate the above three recommendation goals into the recommendation process. Along this line, we portray a student with knowledge mastery and knowledge concept coverage, and an exercise is represented by blending its difficulty with its relevant knowledge concepts. Afterward, a collaborative student exercise graph (CSEG) is constructed from the similarity between student (exercise) portraits together with the explicit relationships in the student-exercise interactions. Subsequently, a joint random walk is executed on CSEG, where the student-student (exercise-exercise) relationship is fully explored owing to the spectral properties of nearly uncoupled Markov chains [14]. This allows us to obtain recommendations with both accuracy and novelty. Finally, the list of target recommendation exercises is obtained by solving an optimization problem to satisfy diversity. The effectiveness of the method is verified by extensive experiments on two real-world datasets. Our main contributions are as follows.
1) We highlight the importance of explicitly modeling the finer-grained portraits to construct CSEG, which is conducive to better recommendation results.
2) A new method, PERP, which exploits the joint random walk mechanism with the spectral properties of nearly uncoupled Markov chains, is developed.
3) Two public datasets are analyzed to demonstrate that PERP can provide students with novel, accurate, and diverse exercises.
2 Related works
2.1 Cognitive diagnosis
Cognitive diagnosis aims to discover students’ proficiency levels in specific knowledge concepts and has been recognized as a crucial task in resource recommendation applications [15–17]. Research on online educational systems has shown that cognitive diagnosis greatly impacts adaptive learning [18–20]. Existing approaches typically capture the interactions between students and exercises with manually designed linear functions (e.g., the logistic function or the inner product), as in the typical unidimensional item response theory (IRT) model [21] and the multidimensional deterministic inputs, noisy and gate (DINA) model [22]; however, such functions are not sufficient to capture the complex relationships between students and exercises. Recently, relation map driven cognitive diagnosis [23] takes students, exercises, and concepts into account thoroughly, naturally modeling the student-exercise interaction. Neural cognitive diagnosis (NeuralCD) [24] borrows concepts from educational psychology and employs neural networks to learn a high-order non-linear interaction function, yielding accurate and interpretable diagnostic results. Once students’ knowledge states are measured, the model recommends exercises for each student using fine-grained rules [25]. We adopt NeuralCD for the initial modeling of students and exercises since it can estimate students’ knowledge states and the difficulty of exercises.
2.2 Exercise recommendation system
In recent years, recommender systems have achieved great success in various domains, such as e-commerce [26,27], point-of-interest services, and social networks [28,29], owing to their strength in filtering overwhelming content. However, due to differences in behavioral motivation, recommendation in a learning system is quite different from classic recommendation and should be deliberately designed. Technically, traditional recommendation methods are widely used for learning resource recommendation, such as content-based filtering (CBF), CF, and hybrid filtering (HF) methods. A content-filtered exercise recommendation method was proposed based on the similarity between exercise attributes and learning target attributes [30]. A method for balanced exercise recommendation based on a student’s interactive exercise response system is proposed in Ref. [31], which can recommend new exercises to qualified students in their areas of expertise or interest. The method in Ref. [32] is an HF recommendation that combines CF and CBF, which uses multiple criteria related to student and course information to recommend the most appropriate course to students. Deep learning techniques have recently improved performance through deep structures for capturing complex student-exercise relationships. For example, exercise Q-networks with a recurrent manner (EQNR) [16] further consider the effects of text contents, and multiple objectives are carefully established. Model-agnostic adaptive testing (MAAT) [15] considers the weight of each knowledge concept so that the recommendation results help students learn knowledge concepts more comprehensively. We argue that an inherent limitation of such methods is that finer-grained representations of students and exercises are not encoded properly. As such, the resultant recommendations may not be sufficient to capture ‘suitable’ preferences for students. Therefore, PERP models finer-grained portraits of students and exercises, and the relationships between students and exercises are fully explored through a modified random walk to obtain a ‘suitable’ list of recommendations.
3 Notations and problem statement
The important notations and their definitions are first summarized, followed by the concept of CSEG.
Let $S=\left\{s_1{\mathrm{,}}\; s_2{\mathrm{,}}\; \cdots{\mathrm{,}}\; s_N\right\} $ be the set of N students, $E=\left\{e_1{\mathrm{,}}\; e_2{\mathrm{,}}\; \cdots{\mathrm{,}}\; e_M\right\} $ be the set of M exercises, and $K=\left\{k_1{\mathrm{,}}\; k_2{\mathrm{,}}\; \cdots{\mathrm{,}}\; k_Z\right\} $ be the set of Z knowledge concepts. The student-exercise interaction matrix is $\mathbf{R}=\left[R_{i{\mathrm{,}} j}\right] \in\{0{\mathrm{,}} \; 1\}^{N \times M} $, where $ R_{i{\mathrm{,}} j}=1 $ indicates that the student $s_i$ correctly answers the exercise $e_j$; otherwise, $R_{i{\mathrm{,}} j}=0 $. Besides, the exercise-knowledge concept incidence matrix is defined as $\mathbf{Q}=\left[Q_{i{\mathrm{,}} j}\right] \in\{0{\mathrm{,}}\;1\}^{M \times Z} $, where $ Q_{i{\mathrm{,}}j}=1 $ if exercise $e_i$ relates to the knowledge concept $k_j$ and otherwise $ Q_{i{\mathrm{,}}j}=0 $.
1) Student-exercise bipartite graph: It is defined as $ {G_1} = \left( {S \cup E{\mathrm{,}}{\text{ }}{\varepsilon _1}} \right) $, where $ {\varepsilon_1} $ denotes the set of edges that connect a student and an exercise based on R, indicating whether there is an observed interaction between a student s and an exercise e.
2) Student-student similarity graph: The student-student similarity graph is defined as ${G_2} = \left( {S{\mathrm{,}}{\text{ }}{\varepsilon_2}} \right)$, where ${\varepsilon_2}$ denotes the set of edges. An edge in ${\varepsilon_2}$ is built if and only if the similarity between the students exceeds a predefined threshold $\omega $.
3) Exercise-exercise similarity graph: The exercise-exercise similarity graph is defined as ${G_3} = \left( {E{\mathrm{,}}{\text{ }}{\varepsilon_3}} \right)$, where ${\varepsilon_3}$ denotes the set of edges. An edge in ${\varepsilon_3}$ is built if and only if the similarity between the exercises exceeds a predefined threshold $\theta $.
4) CSEG: CSEG conceptualizes students’ response behaviors and the student (exercise) relationship as a unified heterogeneous graph $G = \left( {S \cup E{\mathrm{,}}{\text{ }}{\varepsilon_1} \cup {\varepsilon_2} \cup {\varepsilon_3}} \right)$, as shown in Fig. 2.

Figure 2. Model framework of the presented PERP, which consists of three main tasks: (a) finer-grained portraits of the student and the exercise and construction of CSEG, (b) importance ranking of the exercises through a joint random walk, and (c) final list of exercise recommendations with multi-objective optimization.
5) Exercise recommendations: Given the student-exercise response matrix R and the exercise-knowledge concept association matrix Q, the goal of the exercise recommendation task is to provide students with exercises of novelty, accuracy, and diversity through finer-grained portraits of students and exercises. The key notations used in this paper are summarized in Table 1.

Table 1. Several important mathematical notations.
Notation | Description
S, E, K | The sets of students/exercises/knowledge concepts
R | Student-exercise response matrix
Q | Exercise-knowledge concept incidence matrix
ms | The degree of student mastery of knowledge concepts
cs | The knowledge concept coverage of student responses
de | The exercise difficulty
qe | The knowledge association of the exercise
Ws, We | The student/exercise similarity matrix
J | The probability transition matrix
D0 | The set of candidate exercises
D | The final list of recommended exercises
4 Model
The exercise recommendation task is formulated as follows: Given the student-exercise response matrix R and the exercise-knowledge concept association matrix Q, the goal is to provide each student with suitable exercises, i.e., exercises of proper novelty, accuracy, and diversity. With recent advances in cognitive diagnosis and random walks on graphs, many studies have applied cognitive diagnosis and random walk frameworks to graph data and performed well for exercise recommendation. Therefore, a new exercise recommendation model, PERP, is developed; its overall framework is shown in Fig. 2.
4.1 Construction of CSEG
We next build portraits for students and exercises using key factors extracted from past response records. On the one hand, the two most important factors in deciding the exercise preference for a student are the knowledge proficiency level and knowledge coverage. On the other hand, the recommended exercises are determined simultaneously by both their difficulty level and their related knowledge concepts. Thus, it is natural to portray their profiles from the above aspects.
NeuralCD [24], a widely used cognitive diagnosis method for obtaining accurate and interpretable diagnosis results, is adopted. It is an effective way to parameterize students’ proficiency on knowledge concepts and to represent exercises with difficulty vectors. After parameter estimation training of NeuralCD, both the students’ proficiency matrix and the exercises’ difficulty matrix are obtained, which serve as parts of the student and exercise profiles, respectively. Specifically, we denote the proficiency matrix $ {\bf{A}} \in {\mathbb{R}^{{ {N}} \times Z}} $ for all students and the difficulty matrix ${\bf{B}} \in {\mathbb{R}^{M \times Z}}$ for all exercises. Note that each entry of A (B) is continuous in [0, 1], indicating the student’s proficiency (the exercise’s difficulty level) in each knowledge concept.
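For concreteness, the sketch below outlines a NeuralCD-style interaction function in PyTorch from which the matrices A and B could be read out after training. The class name, layer sizes, and training details are illustrative assumptions based on the published description of NeuralCD, not the implementation used in this work.

import torch
import torch.nn as nn

# A minimal NeuralCD-style sketch (assumed layout, not the authors' code).
class NeuralCDSketch(nn.Module):
    def __init__(self, n_students, n_exercises, n_concepts):
        super().__init__()
        self.h_s = nn.Embedding(n_students, n_concepts)      # student proficiency (A after training)
        self.h_diff = nn.Embedding(n_exercises, n_concepts)  # exercise difficulty (B after training)
        self.h_disc = nn.Embedding(n_exercises, 1)           # exercise discrimination
        self.mlp = nn.Sequential(                            # NeuralCD also constrains these weights
            nn.Linear(n_concepts, 64), nn.Sigmoid(),         # to be non-negative; omitted for brevity
            nn.Linear(64, 1), nn.Sigmoid(),
        )

    def forward(self, student_ids, exercise_ids, q_rows):
        # q_rows: corresponding rows of Q, shape (batch, n_concepts)
        prof = torch.sigmoid(self.h_s(student_ids))
        diff = torch.sigmoid(self.h_diff(exercise_ids))
        disc = torch.sigmoid(self.h_disc(exercise_ids))
        x = q_rows * (prof - diff) * disc                    # concept-masked interaction
        return self.mlp(x).squeeze(-1)                       # probability of a correct response

# After training on (student, exercise, response) triples with a BCE loss,
# A = sigmoid(h_s.weight) and B = sigmoid(h_diff.weight) serve as the profiles.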
4.1.1 Similarity between students
With all students’ mastery of knowledge concepts A, a student can be partly represented by the degree of mastery of knowledge concepts as ${{\bf{m}}_{{s_i}}} = {{\mathrm{softmax}}} \left( {{\bf{x}}_{{s_i}}^{\mathrm{T}}{\bf{A}}} \right)$, where $ {{\bf{x}}_{{s_i}}} $ indicates the one-hot vector of the student $s_i$. We then denote the knowledge concept coverage of a student, another feature portrait of students, as ${{\bf{c}}_{{s_i}}} = {{\mathrm{softmax}}} \left( {{\bf{x}}_{{s_i}}^{\mathrm{T}}{\bf{RQ}}} \right)$. After obtaining the two representations ${{\bf{m}}_{{s_i}}}$ and ${{\bf{c}}_{{s_i}}}$, a criterion for similar students should be defined to provide students with suitable exercises. Typically, similar students share similar levels of knowledge mastery, which enables them to refer to each other. However, to recommend novel exercises, the difference between the exercises correctly answered by two students should be as large as possible, since such students are inclined to reveal novel exercises to each other. Thus, similar students are defined as those with similar mastery of knowledge concepts who have correctly answered exercises with large differences in knowledge concept coverage. Therefore, the similarity between the students ${s_i}$ and ${s_j}$ is defined with the Gaussian kernel function as
$ w_{i{\mathrm{,}}j}^s = \exp \left( { - \frac{{{{\left\| {{{\bf{m}}_{{s_i}}} - {{\bf{m}}_{{s_j}}}} \right\|}^2}}}{{2{\sigma ^2}}}} \right) - \exp \left( { - \frac{{{{\left\| {{{\bf{c}}_{{s_i}}} - {{\bf{c}}_{{s_j}}}} \right\|}^2}}}{{2{\sigma ^2}}}} \right) $ (1)
where $\sigma $ is the bandwidth that controls the local scope of the Gaussian kernel. Students whose similarity $w_{i{\mathrm{,}}j}^s$ is higher than $\omega $ are defined as similar students, and these values form the student similarity matrix ${\bf{W}}^s$ with entries $w_{i{\mathrm{,}}j}^s$; otherwise, $w_{i{\mathrm{,}}j}^s = 0$ is set.
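For illustration, a minimal NumPy sketch of the student similarity in (1) with thresholding is given below; the function name, default parameter values, and the removal of self-loops are assumptions for exposition.

import numpy as np

def student_similarity(M, C, sigma=1.0, omega=0.5):
    """M: (N, Z) mastery portraits m_s; C: (N, Z) coverage portraits c_s."""
    dm = np.square(M[:, None, :] - M[None, :, :]).sum(-1)   # pairwise squared distances of mastery
    dc = np.square(C[:, None, :] - C[None, :, :]).sum(-1)   # pairwise squared distances of coverage
    W = np.exp(-dm / (2 * sigma**2)) - np.exp(-dc / (2 * sigma**2))   # Eq. (1)
    W[W <= omega] = 0.0            # keep only pairs whose similarity exceeds the threshold
    np.fill_diagonal(W, 0.0)       # no self-loops
    return W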
4.1.2 Similarity between exercises
Exercise factors are divided into two categories. The first factor is exercise difficulty, which is crucial to keeping the recommendation results accurate. We take each row of the exercise difficulty matrix B as ${{\bf{d}}_{{e_j}}} = {{\mathrm{sigmoid}}} \left( {{\bf{x}}_{{e_j}}^{\rm{T}}{\bf{B}}} \right)$, where ${{\bf{x}}_{{e_j}}}$ indicates the one-hot vector of the exercise $e_j$. The second factor is the knowledge relevance. This information is known and indicates the connection between the exercise and the knowledge concepts; we denote it as ${{\bf{q}}_{{e_j}}} = {\bf{x}}_{{e_j}}^{\mathrm{T}}{\bf{Q}}$, which records the correlation between the exercise $e_j$ and each knowledge concept ${k_i}$. The comprehensive representation ${{\bf{e}}_j}$ of exercise ${e_j}$ is acquired by combining the difficulty distribution ${{\bf{d}}_{{e_j}}}$ and the knowledge association vector ${{\bf{q}}_{{e_j}}}$ of the exercise, i.e., ${{\bf{e}}_j} = {{\bf{d}}_{{e_j}}} \odot {{\bf{q}}_{{e_j}}}$, where $ \odot $ is element-wise multiplication.
We then define the similarity between exercise representation ${e_i}$ and ${e_j}$ with a Gaussian kernel function as
$ w_{i{\mathrm{,}}j}^e = \exp \left( { - \frac{{{{\left\| {{{\bf{e}}_i} - {{\bf{e}}_j}} \right\|}^2}}}{{2{\sigma ^2}}}} \right) {\mathrm{.}} $ (2)
Exercises for which $w_{i{\mathrm{,}}j}^e$ is higher than $\theta $ are defined as similar exercises, forming the exercise similarity matrix ${\bf{W}}^e$ with entries $w_{i{\mathrm{,}}j}^e$; otherwise, we set $w_{i{\mathrm{,}}j}^e = 0$.
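A corresponding sketch of the exercise portrait composition and the similarity in (2) is shown below; again, the names and thresholding details are assumptions.

import numpy as np

def exercise_similarity(B, Q, sigma=1.0, theta=0.5):
    """B: (M, Z) difficulty matrix from NeuralCD; Q: (M, Z) exercise-concept incidence."""
    E = B * Q                                        # e_j = d_{e_j} ⊙ q_{e_j}
    d = np.square(E[:, None, :] - E[None, :, :]).sum(-1)
    W = np.exp(-d / (2 * sigma**2))                  # Eq. (2)
    W[W <= theta] = 0.0                              # keep only sufficiently similar exercises
    np.fill_diagonal(W, 0.0)
    return W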
4.2 Candidate recommendation exercises
4.2.1 Joint random walk
We propose a novel joint random walk framework on CSEG for exercise recommendation, equipped with the implicit potential of student/exercise associations. The joint random walk inherently propagates these relationships in the student/exercise space and exploits the rich network of interactions they form. Consequently, a walker jumps between students and exercises as well as between students and students (exercises and exercises). The students/exercises with similar portraits then give valuable clues to the walker, making it possible to fully explore similar students/exercises. The procedure is as follows.
Initially, the block transition matrix ${\bf{H}}$ of the student-exercise bipartite graph is defined as
$ {\bf{H}} = \left( {\begin{array}{*{20}{c}} {\boldsymbol{0}}&{\bf{R}} \\ {{{\bf{R}}^{\mathrm{T}}}}&{\boldsymbol{0}} \end{array}} \right){\mathrm{.}} $ (3)
${\bf{H}}$ is normalized to ${{\bf{J}}_R}$ with
$ {{\bf{J}}_R} = {{\mathrm{Diag}}} {({\bf{H1}})^{ - 1}}{\bf{H}} = \left( {\begin{array}{*{20}{c}} {\boldsymbol{0}}&{\overline {\bf{R}} } \\ {{{\mathrm{Diag}}} {{\left( {{{\bf{R}}^{\mathrm{T}}}{\boldsymbol{1}}} \right)}^{ - 1}}{{\bf{R}}^{\mathrm{T}}}}&{\boldsymbol{0}} \end{array}} \right) $ (4)
where ${\boldsymbol{1}}$ denotes the all-ones vector and $\overline {\bf{R}} $ represents the row-normalized matrix ${\bf{R}}$. Then, a block transition matrix ${\bf{M}}$ with the student similarity matrix ${{\bf{W}}^s}$ and the exercise similarity matrix ${{\bf{W}}^e}$ is constructed,
$ {\bf{M}} = \left( {\begin{array}{*{20}{c}} {{{\bf{W}}^s}}&{\boldsymbol{0}} \\ {\boldsymbol{0}}&{{{\bf{W}}^e}} \end{array}} \right){\mathrm{.}} $ (5)
Equally, a similar normalization operation is performed on ${\bf{M}}$ to get ${{\bf{J}}_{{\text{SE}}}}$, or
$ {{\bf{J}}_{{\text{SE}}}} = {{\mathrm{Diag}}} {({\bf{M1}})^{ - 1}}{\bf{M}} = \left( {\begin{array}{*{20}{c}} {{{\overline {\bf{W}} }^s}}&{\boldsymbol{0}} \\ {\boldsymbol{0}}&{{{\overline {\bf{W}} }^e}} \end{array}} \right){\mathrm{.}} $ (6)
The transition probability matrix ${\bf{J}}$ controlling the walk behavior can be effectively represented as a weighted sum of the above transition probability matrices ${{\bf{J}}_R}$ and ${{\bf{J}}_{{\text{SE}}}}$ as follows:
$ {\bf{J}} \triangleq \partial {{\bf{J}}_R} + (1 - \partial ){{\bf{J}}_{{\text{SE}}}} = \left( {\begin{array}{*{20}{c}} {(1 - \partial ){{\overline {\bf{W}} }^s}}&{\partial \overline {\bf{R}} } \\ {\partial {{\mathrm{Diag}}} {{\left( {{{\bf{R}}^{\mathrm{T}}}{\boldsymbol{1}}} \right)}^{ - 1}}{{\bf{R}}^{\mathrm{T}}}}&{(1 - \partial ){{\overline {\bf{W}} }^e}} \end{array}} \right) $ (7)
where $\partial $ controls the degree of involvement of these two components in the final random walk process.
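As a concrete illustration, the sketch below assembles J from R, W^s, and W^e following (4)-(7); the helper names, the handling of zero rows, and the variable alpha standing for ∂ are assumptions.

import numpy as np

def row_normalize(A):
    s = A.sum(axis=1, keepdims=True)
    s[s == 0] = 1.0                                  # leave all-zero rows unchanged
    return A / s

def build_transition_matrix(R, Ws, We, alpha=0.1):
    """R: (N, M) responses; Ws: (N, N), We: (M, M) similarities; alpha plays the role of ∂."""
    N, M = R.shape
    J_R = np.zeros((N + M, N + M))
    J_R[:N, N:] = row_normalize(R)                   # students -> exercises
    J_R[N:, :N] = row_normalize(R.T)                 # exercises -> students
    J_SE = np.zeros((N + M, N + M))
    J_SE[:N, :N] = row_normalize(Ws)                 # students -> similar students
    J_SE[N:, N:] = row_normalize(We)                 # exercises -> similar exercises
    return alpha * J_R + (1 - alpha) * J_SE          # Eq. (7)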
Finally, we apply the restart policy to the joint random walk, which is defined as
$ {\bf{v}}_s^{t + 1} = \beta {\bf{Jv}}_s^t + (1 - \beta ){\bf{v}}_s^0 $ (8)
where ${\bf{v}}_s^0$ is a one-hot vector in which the element at the position corresponding to the student for whom recommendations are generated is 1 and the others are 0; ${\bf{v}}_s^{t + 1}$ is the node visiting probability vector at step $t+1$. A higher value in ${\bf{v}}_s^{t + 1}$ indicates that the corresponding exercise is more likely to be recommended to the student. $ \beta $ is the hyper-parameter that balances random walks and restarts.
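A sketch of the resulting iteration is given below, assuming the transition matrix from the previous sketch; the convergence tolerance and step limit are assumptions.

import numpy as np

def random_walk_with_restart(J, student_idx, beta=0.8, max_steps=100, tol=1e-8):
    n = J.shape[0]
    v0 = np.zeros(n)
    v0[student_idx] = 1.0                            # restart distribution: the target student
    v = v0.copy()
    for _ in range(max_steps):
        v_next = beta * (J @ v) + (1 - beta) * v0    # Eq. (8)
        if np.abs(v_next - v).sum() < tol:           # stop once the scores stabilize
            return v_next
        v = v_next
    return v                                         # visiting scores over students and exercises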
It is worth noting that the above recommendation paradigm explores CSEG with the advantage of the spectral properties of the nearly uncoupled Markov chain, which can be clearly stated in the following subsection.
4.2.2 Nearly uncoupled Markov chains
A key property of the joint random walk is that the chain becomes nearly uncoupled into two blocks when $\partial $ is small. Separating the resulting slow-mixing and fast-mixing components slows the dynamics of the random walk toward equilibrium. In summary, the joint random walk prolongs the influence of the personalized initialization on the successive K-step landing probabilities of the walker, and it thus produces exercise recommendations with novelty and accuracy without the need to stop the walks early.
Nearly uncoupled Markov chains are discrete-time chains whose transition matrices are almost block diagonal. If ${\bf{S}} \in {\mathbb{R}^{{ {n}} \times { {n}}}}$ is the probability transition matrix of an irreducible and aperiodic Markov chain, ${\bf{S}}$ can be rewritten as ${\bf{S}} = \widehat {\bf{S}} + \varepsilon {\bf{C}}$, where $\widehat {\bf{S}}$ is a block diagonal square matrix of order $n$, $ \widehat {\bf{S}} = {{\mathrm{Diag}}} \left( {{{\widehat {\bf{S}}}_{11}}{\mathrm{,}}{\text{ }}{{\widehat {\bf{S}}}_{22}}{\mathrm{,}}{\text{ }} \cdots {\mathrm{,}}{\text{ }}{{\widehat {\bf{S}}}_{LL}}} \right) $, and each ${\widehat {\bf{S}}_{{ {ii}}}}$ denotes an irreducible stochastic matrix of order $n(i)$. Since ${\bf{S}}$ and $\widehat {\bf{S}}$ are probability transition matrices, the row sums of ${\bf{C}}$ must be 0. The parameter $\varepsilon $ is defined as the maximum degree of coupling between blocks.
When $\varepsilon $ is sufficiently small, the Markov chain with probability transition matrix ${\bf{S}}$ is nearly decoupled into $L$ blocks. The discrete-time Markov chain defined by ${\bf{J}}$ is finite and aperiodic; hence, as the number of iterative steps $K$ increases, the joint random walk transition probability converges to its limit distribution. Generally, the convergence rate of the transition probability matrix for the joint random walk depends on the modulus of the second dominant eigenvalue: the distance to the limit distribution decays asymptotically as $ {\left| {{\lambda _2}({\bf{J}})} \right|^k} \to 0 $. More intuitively, the smaller the value of $ \left| {{\lambda _2}({\bf{J}})} \right| $ is, the faster the importance score vector loses the personalized features of the query node; in other words, for any query vertex, the random walk would then favor vertices with higher degrees as approximately relevant results. The spectral properties of the matrix ${\bf{J}}$ are further explained as follows.
Theorem 4.1. Let ${\bf{J}}$ be the joint transition probability matrix with $\partial \in (0{\mathrm{,}}{\text{ }}1)$ defined over CSEG, and let $\lambda ({\bf{J}})$ be the set of the eigenvalues of ${\bf{J}}$. Irrespective of the similar students (exercises) used to define the matrix ${{\bf{W}}^s}$ (${{\bf{W}}^e}$), it holds that
1) $1 - 2\partial \in \lambda ({\bf{J}})$;
2) The Markov chain with a probabilistic transition matrix ${\bf{J}}$ will be almost decoupled into at least 2 blocks when $\partial $ is sufficiently small.
Proof. When the student-exercise bipartite graph is connected, the matrix ${{\bf{J}}_R}$ is irreducible. Furthermore, a simple random walk on the student-exercise bipartite graph results in a periodic Markov chain with period $d = 2$ since the graph is bipartite. From the Perron-Frobenius theorem, it can be obtained that
$ {\lambda _1}\left( {{{\bf{J}}_R}} \right) = 1 $ (9a)
$ {\lambda _2}\left( {{{\bf{J}}_R}} \right) = - 1 {\mathrm{.}} $ (9b)
The so-called Perron eigenvalue ${\lambda _1}\left( {{{\bf{J}}_R}} \right)$ is associated with the right eigenvector ${\boldsymbol{1}}$, whereas the eigenvalue ${\lambda _2}\left( {{{\bf{J}}_R}} \right)$ is associated with a right eigenvector that we denote as ${\bf{l}}$.
Since ${{\bf{J}}_R} \in {\mathbb{R}^{(N + M) \times (N + M)}}$ and $ {\bf{l}} \in {\mathbb{R}^{(N + M) \times 1}} $, let the structure of ${\bf{l}}$ be
$ {\bf{l}} \triangleq [\underbrace {1{\mathrm{,}}{\text{ }}1{\mathrm{,}}{\text{ }} \cdots {\mathrm{,}}{\text{ }}1}_{N}{\mathrm{,}}{\text{ }}\underbrace { - 1{\mathrm{,}}{\text{ }} - 1{\mathrm{,}}{\text{ }} \cdots {\mathrm{,}}{\text{ }} - 1}_{M}]^{\mathrm{T}} {\mathrm{.}} $ (10)
Indeed, $ {\bf{l}} $ is an eigenvector of both matrices ${{\bf{J}}_R}$ and ${{\bf{J}}_{{\text{SE}}}}$:
$ {{\bf{J}}_R}{\bf{l}} = \left( {\begin{array}{*{20}{c}} {\boldsymbol{0}}&{\overline {\bf{R}} } \\ {{{\mathrm{Diag}}} {{\left( {{{\bf{R}}^{\mathrm{T}}}{\boldsymbol{1}}} \right)}^{ - 1}}{{\bf{R}}^{\mathrm{T}}}}&{\boldsymbol{0}} \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{{\boldsymbol{1}}_N}} \\ { - {{\boldsymbol{1}}_M}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} { - {{\boldsymbol{1}}_N}} \\ {{{\boldsymbol{1}}_M}} \end{array}} \right) = - {\bf{l}} {\mathrm{.}} $ (11)
Therefore, –1 is an eigenvalue of ${{\bf{J}}_R}$ with eigenvector ${\bf{l}}$.
$ {{\bf{J}}_{{\text{SE}}}}{\bf{l}} = \left( {\begin{array}{*{20}{c}} {{{\overline {\bf{W}} }^s}}&{\boldsymbol{0}} \\ {\boldsymbol{0}}&{{{\overline {\bf{W}} }^e}} \end{array}} \right)\left( {\begin{array}{*{20}{c}} {{{\boldsymbol{1}}_N}} \\ { - {{\boldsymbol{1}}_M}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} {{{\overline {\bf{W}} }^s}{{\boldsymbol{1}}_N}} \\ { - {{\overline {\bf{W}} }^e}{{\boldsymbol{1}}_M}} \end{array}} \right) = {\bf{l}} {\mathrm{.}} $ (12)
Thus, 1 is an eigenvalue of $ {{\bf{J}}_{{\text{SE}}}} $ with eigenvector ${\bf{l}}$.
Next, a non-singular matrix ${\bf{N}} \triangleq ({\boldsymbol{1}}{\mathrm{,}}{\text{ }}{\bf{l}}{\mathrm{,}}{\text{ }}{\mathbf{X}})$ is considered, whose first two columns are the eigenvectors ${\boldsymbol{1}}$ and ${\bf{l}}$. Let
$ {{\bf{N}}^{ - 1}} \triangleq \left( {\begin{array}{*{20}{c}} {{\bf{y}}_1^{\text{T}}} \\ {{\bf{y}}_2^{\text{T}}} \\ {{{\bf{Y}}^{\text{T}}}} \end{array}} \right) {\mathrm{.}} $ (13)
By definition, it holds that ${{\bf{N}}^{ - 1}}{\bf{N}} = {\bf{I}}$, or
$ \left( {\begin{array}{*{20}{c}} {{\bf{y}}_1^{\text{T}}{\boldsymbol{1}}}&{{\bf{y}}_1^{\text{T}}{\bf{l}}}&{{\bf{y}}_1^{\text{T}}{\bf{X}}} \\ {{\bf{y}}_2^{\text{T}}{\boldsymbol{1}}}&{{\bf{y}}_2^{\text{T}}{\bf{l}}}&{{\bf{y}}_2^{\text{T}}{\bf{X}}} \\ {{{\bf{Y}}^{\text{T}}}{\boldsymbol{1}}}&{{{\bf{Y}}^{\text{T}}}{\bf{l}}}&{{{\bf{Y}}^{\mathrm{T}}}{\bf{X}}} \end{array}} \right) = \left( {\begin{array}{*{20}{c}} 1&0&{\boldsymbol{0}} \\ 0&1&{\boldsymbol{0}} \\ {\boldsymbol{0}}&{\boldsymbol{0}}&{\bf{I}} \end{array}} \right){\mathrm{.}} $ (14)
Now, considering the similarity transformation ${{\bf{N}}^{ - 1}}{\bf{JN}}$ of the joint random walk transition probability matrix, together with (11), (12), and the identities in (14), we have
$ {{\bf{N}}^{ - 1}}{\bf{JN}} = {{\bf{N}}^{ - 1}}\left( {\partial {{\bf{J}}_{{R}}} + (1 - \partial ){{\bf{J}}_{{\text{SE}}}}} \right){\bf{N}} = \cdots = \left( {\begin{array}{*{20}{c}} 1&0&{\partial {\bf{y}}_1^{\mathrm{T}}{{\bf{J}}_R}{\bf{X}} + (1 - \partial ){\bf{y}}_1^{\mathrm{T}}{{\bf{J}}_{{\text{SE}}}}{\bf{X}}} \\ 0&{1 - 2\partial }&{\partial {\bf{y}}_2^{\mathrm{T}}{{\bf{J}}_R}{\bf{X}} + (1 - \partial ){\bf{y}}_2^{\mathrm{T}}{{\bf{J}}_{{\text{SE}}}}{\bf{X}}} \\ 0&0&{\partial {{\bf{Y}}^{\mathrm{T}}}{{\bf{J}}_R}{\bf{X}} + (1 - \partial ){{\bf{Y}}^{\mathrm{T}}}{{\bf{J}}_{{\text{SE}}}}{\bf{X}}} \end{array}} \right) . $ (15)
Then, it follows that ${\text{1}} - 2\partial $ is an eigenvalue of the matrix ${\bf{J}}$, and the first part of the theorem is proved.
To prove the second condition of Theorem 4.1, we only need to verify the existence of some state-space blocks such that the upper bound of the maximum transition probability between blocks is $\partial $. Consider the following state partition case,
$ \mathcal{A} \triangleq \{ S{\mathrm{,}}{\text{ }}E\} {\mathrm{.}} $ (16)
With the definition of the matrix ${\bf{J}}$ in (7), the probability of leaving the block $S$ upon a transition is equal to $\partial $ for every $s \in S$. Thus,
$ {\mathrm{Pr}} \{ {\text{ }}{\mathrm{jump}}\; {\mathrm{from}}{\text{ }}s \in S{\text{ }}{\mathrm{to}}\; {\mathrm{any}}{\text{ }}i \notin S\} = \sum\limits_{i \notin S} {{P_{s{\mathrm{,}}i}}} = \sum\limits_{i \notin S} \partial {J_{{R_{s{\mathrm{,}}i}}}} = \partial {\mathrm{.}} $ (17)
Similarly, the probability of leaving the block $E$ upon a transition is
$ {\mathrm{Pr}} \{ {\text{ }}{\mathrm{jump}}\; {\mathrm{from}}{\text{ }}e \in E \;{\mathrm{to}}\; {\mathrm{any}}{\text{ }}j \notin E\} = \sum\limits_{j \notin E} {{P_{e{\mathrm{,}}j}}} = \sum\limits_{j \notin E} \partial {J_{{R_{{ {e{\mathrm{,}}j}}}}}} = \partial {\mathrm{.}} $ (18)
Therefore, the joint random walk chain can always be decomposed according to $\mathcal{A}$. As a result, the maximum degree of coupling between the involved blocks is equal to $\partial $. Hence, $\partial $ is set to be sufficiently small to guarantee that the chain will be nearly uncoupled into (at least) 2 blocks.
It has been demonstrated that the relationships between similar students (exercises) are explored sufficiently when $\partial $ is sufficiently small. In other words, our recommendations include exercises performed by similar students as well as exercises of similar difficulty to those already completed by the target students. This is consistent with the expected results.
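As an informal numerical sanity check (not part of the paper), the snippet below verifies on a random toy instance that 1 − 2∂ indeed appears among the eigenvalues of J, as stated in Theorem 4.1; it assumes the build_transition_matrix sketch given earlier, and the instance sizes are arbitrary.

import numpy as np

rng = np.random.default_rng(0)
N, M, alpha = 5, 7, 0.15
R = (rng.random((N, M)) < 0.4).astype(float)
R[R.sum(axis=1) == 0, 0] = 1.0                       # every student answers at least one exercise
R[0, R.sum(axis=0) == 0] = 1.0                       # every exercise is answered at least once
Ws = rng.random((N, N)); Ws = (Ws + Ws.T) / 2; np.fill_diagonal(Ws, 0.0)
We = rng.random((M, M)); We = (We + We.T) / 2; np.fill_diagonal(We, 0.0)

J = build_transition_matrix(R, Ws, We, alpha)        # from the earlier sketch
eigvals = np.linalg.eigvals(J)
print(np.min(np.abs(eigvals - (1 - 2 * alpha))))     # ~0, i.e., 1 - 2*alpha is an eigenvalue of J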
4.3 Optimizing the recommendation list
In this subsection, a diverse list of recommendations is derived from the stationary distribution of the random walk vector. First, the exercises with the top-$P$ largest scores are chosen as the candidate recommendation list, denoted as ${D_0}$. We regard ${D_0}$ as a high-dimensional space, where each exercise in ${D_0}$ represents a point in that space. Then, with the simulated annealing algorithm [33], we solve a combinatorial optimization problem to obtain a diverse recommendation list. Some exercises are replaced randomly, the optimization factors are compared, and a new version is continuously generated by updating the exercises in the list one by one. More specifically, a replicated version of the top-$L$ exercises selected from the candidate exercise set ${D_0}$ forms the set ${D_1}$. We first replace some exercises in the original list ${D_1}$ to obtain ${D_2}$. Then we calculate ${\bf{D}}_1^\prime $ and ${\bf{D}}_2^\prime $, the distance matrices of the exercises in ${D_1}$ and ${D_2}$, respectively. If the mean exercise distance of ${D_2}$ is larger than that of ${D_1}$, we accept ${D_2}$ as the new version. Otherwise, we compare the acceptance probability $\gamma $ with a randomly generated $r$ to decide whether to accept ${D_2}$. $\gamma $ is calculated by
$ \gamma = \exp \left( {\frac{{ - \left( {{{\mathrm{mean}}} \left( {{\bf{D}}_1^\prime } \right) - {{\mathrm{mean}}} \left( {{\bf{D}}_2^\prime } \right)} \right)}}{{{k_B}T}}} \right) $ (19)
where ${{\mathrm{mean}}} \left( {{\bf{D}}_1^\prime } \right)$ indicates the average value of the distance matrix of the list ${D_1}$, ${k_B}$ is the Boltzmann constant, and $T$ represents the temperature. The mean distances of all accepted versions are compared, and the version with the largest distance divergence is kept as the output of the optimization, i.e., the final diverse recommendation list $D$. As shown in Table 2, Algorithm 1 summarizes our PERP approach.

Table 2. Description of PERP algorithm.
Algorithm 1: PERP algorithm
Input: R, Q
Output: Final recommendation list D
1. A, B ← train the NeuralCD model;
2. for i = 0 to N do
3.   ${{\bf{m}}_{{s_i}}} = {{\mathrm{softmax}}} \left( {{\bf{x}}_{{s_i}}^{\mathrm{T}}{\bf{A}}} \right)$
4.   ${{\bf{c}}_{{s_i}}} = {{\mathrm{softmax}}} \left( {{\bf{x}}_{{s_i}}^{\mathrm{T}}{\bf{RQ}}} \right)$
5. end for
6. Ws ← student similarity matrix;
7. for j = 0 to M do
8.   ${{\bf{d}}_{{e_j}}} = {{\mathrm{sigmoid}}} \left( {{\bf{x}}_{{e_j}}^{\rm{T}}{\bf{B}}} \right)$
9.   ${{\bf{q}}_{{e_j}}} = {\bf{x}}_{{e_j}}^{\mathrm{T}}{\bf{Q}}$
10.  ${{\bf{e}}_j} = { {\bf{d} }_{ {e_j} } } \odot { {\bf{q} }_{ {e_j} } }$
11. end for
12. We ← exercise similarity matrix;
13. J ← transition probability matrix J by (7);
14. for each walk step t do
15.  ${\bf{v}}_s^{t + 1} = \beta {\bf{Jv}}_s^t + (1 - \beta ){\bf{v}}_s^0$
16. end for
17. D0 ← choose exercises with the top-P largest scores;
18. while the termination criterion is not satisfied do
19.  D1 ← the top-L exercises of D0;
20.  D2 ← replace some of the exercises in D1;
21.  ${\bf{D}}_1' $ ← distance matrix of the exercises in D1;
22.  ${\bf{D}}_2' $ ← distance matrix of the exercises in D2;
23.  if mean(${\bf{D}}_2' $) > mean(${\bf{D}}_1' $)
24.   D1 ← D2
25.  else
26.   $\gamma = \exp \left( {\frac{ { - \left( { {{\rm{mean}}} \left( {{\bf{D}}_1^\prime } \right) - {{\rm{mean}}} \left( {{\bf{D}}_2^\prime } \right)} \right)} }{ { {k_B}T} } } \right)$
27.   r ← random(0, 1)
28.   if r < $\gamma $
29.    D1 ← D2
30.   else
31.    D1 ← D1
32.   end if
33.  end if
34. end while
35. D ← the accepted version with the largest distance
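For concreteness, a sketch of this simulated annealing step is given below; the cooling schedule, the single-swap move, and the absorption of k_B into the temperature are assumptions rather than details taken from the paper.

import numpy as np

def mean_pairwise_distance(E_list):
    """E_list: (L, Z) portraits of the exercises currently in the list."""
    d = np.linalg.norm(E_list[:, None, :] - E_list[None, :, :], axis=-1)
    L = len(E_list)
    return d.sum() / (L * (L - 1))                   # average off-diagonal distance

def diversify(candidates, E, L=6, T=1.0, cooling=0.95, n_iter=200, seed=0):
    """candidates: indices of the top-P exercises (D0); E: (M, Z) exercise portraits."""
    rng = np.random.default_rng(seed)
    D1 = list(candidates[:L])                        # start from the top-L exercises
    best, best_score = D1[:], mean_pairwise_distance(E[D1])
    for _ in range(n_iter):
        D2 = D1[:]
        pos = int(rng.integers(L))                   # swap one exercise for an unused candidate
        unused = [c for c in candidates if c not in D2]
        D2[pos] = unused[int(rng.integers(len(unused)))]
        s1, s2 = mean_pairwise_distance(E[D1]), mean_pairwise_distance(E[D2])
        if s2 > s1 or rng.random() < np.exp(-(s1 - s2) / T):   # Metropolis acceptance, cf. Eq. (19)
            D1 = D2
        score = mean_pairwise_distance(E[D1])
        if score > best_score:
            best, best_score = D1[:], score          # keep the most diverse version seen so far
        T *= cooling                                 # gradually lower the temperature
    return best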
5 Experiments
In this section, we conduct extensive experiments on two real-world datasets to evaluate the effectiveness of the presented PERP. We mainly focus on answering the following research questions:
RQ1: How does PERP perform compared with the baseline methods?
RQ2: How much does each component in PERP contribute?
RQ3: How do the hyper-parameters in PERP affect the performance?
RQ4: How does the incorrect response record affect the performance?
RQ5: Can PERP provide students with “suitable” exercises?
5.1 Datasets
To evaluate the performance of the proposed PERP, we conduct comprehensive experiments on two real-world datasets: ASSISTments 2009-2010 [34] and Algebra 2006-2007 [35]. Detailed dataset statistics are presented in Table 3.

Table 3. Real dataset statistics.
Dataset | ASSISTments 2009-2010 | Algebra 2006-2007
Students | 4163 | 1338
Exercises | 17746 | 91771
Knowledge concepts | 123 | 491
Records | 278607 | 222314
ASSISTments 2009-2010: ASSISTments is an open dataset collected from the ASSISTments platform, an online tutoring system that simultaneously teaches and assesses students in grade-school mathematics.
Algebra 2006-2007: This dataset is derived from the 2010 KDD Cup educational data mining challenge. The dataset consists of sequences of algebra exercises collected in 2006 and 2007.
5.2 Baselines
To demonstrate the effectiveness of PERP, we compare it with the following state-of-the-art methods from three groups: classical recommendation methods, which treat students as users and exercises as items (e.g., K-nearest neighbor (KNN) [36], the top-N collaborative filtering recommendation algorithm based on knowledge graph embedding (KGEB-CF) [37], and the multi-behavior hypergraph-enhanced transformer (MBHT) [38]); cognitive diagnosis models, which model the hidden knowledge states of students, including deep knowledge tracing (DKT) [39], NeuralCD [24], and the diagnostic transformer (DTransformer) [40]; and exercise recommendation models dedicated to exercise recommendation, such as HB-DeepCF [7] and KCP-ER [10].
• KNN: It finds a predefined number of learners nearest to the target student by comparing the cosine distance of their learning paths and decides what the new learner should learn next.
• KGEB-CF: It treats students and exercises as entities and students’ response records as relations. Through knowledge graph embedding, a representation is learned for each student, exercise, and response result, retaining the semantic information and structure of the original graph. These representations are then incorporated into the recommendation process by combining them with the CF recommendation method.
• MBHT: This approach utilizes a multi-scale transformer with low-rank self-attention to encode fine-grained and coarse-grained levels of behavioral sequential patterns. It incorporates global multi-behavior dependencies into the hypergraph neural architecture in a customized manner to capture hierarchical long-range item correlation. Doing so successfully captures short-term and long-term cross-type behavioral dependencies for recommendation purposes.
• DKT: The classic knowledge tracing model leverages students’ knowledge states to predict the probability of students correctly answering questions. In this paper, the student’s knowledge state in DKT is taken as the embedding representation of the student for exercise recommendation. Since this model does not provide difficulty information about the exercises, the percentage of students who correctly answer each exercise is employed as its difficulty representation. Finally, the recommendation list is obtained by CF.
• NeuralCD: It is a general neural cognitive diagnostic framework adopting neural networks to learn complex interaction functions between students and exercises to obtain students’ knowledge states and exercise difficulty. The subsequent operation is the same as for DKT, except that the exercise representation is the difficulty of the exercises estimated by the model.
• DTransformer: It is a new architecture and training paradigm for accurately assessing learner performance. DTransformer diagnoses knowledge proficiency from question mastery states and employs contrastive learning for stable knowledge state diagnosis. The subsequent operation is the same as for DKT.
• HB-DeepCF: It is a typical hybrid recommendation method integrating an autoencoder with a classical recommendation model. The representation of exercises and students is obtained through the former. With the latter, the autoencoder is instructed to learn more semantic features and make recommendations.
• KCP-ER: It uses a recursive neural network to predict the range of knowledge concepts and deep knowledge tracing to predict students’ mastery of knowledge concepts based on their exercise records. The final step uses the prediction results to filter the exercises and an optimization algorithm to obtain the complete list of recommended exercises.
5.3 Evaluation metrics
Unlike traditional recommendations, making a comprehensive evaluation for exercise recommendations is more complex. Generally speaking, exposure to exercises with new knowledge is beneficial for students to acquire comprehensive knowledge continuously, and the difficulty of the recommended exercises should be smooth with the previously finished exercises. Besides, we are expecting a recommendation set containing diverse candidates. Therefore, we evaluate our method and baselines from the three aspects of novelty, accuracy, and diversity, respectively.
Novelty: The novelty reflects that the recommendation contains new knowledge. The set of knowledge concepts contained in an exercise is expressed as ${{\bf{q}}_e}$, and the set of knowledge concepts covered by the exercises that the student has correctly answered is denoted as ${{\bf{c}}_s}$. Then, the novelty of the recommended exercises can be expressed as
$ {{\rm{Nov}}} (D) = \frac{{\displaystyle\sum\limits_{e \in D} {\left( {1 - {\text{Jaccsim}}\left( {{{\bf{q}}_e}{\mathrm{,}}{\text{ }}{{\bf{c}}_s}} \right)} \right)} }}{{\left| D \right|}} {\rm{.}} $ (20)
Accuracy: While recommending exercises to students, overly difficult or easy problems can negatively affect student learning. Therefore, exercises with the desired difficulty $\xi $ should be recommended. Then, the overall accuracy of the obtained recommendation list is
$ {{\rm{Acc}}} (D) = \frac{{\displaystyle\sum\limits_{e \in D} {\left( {1 - \left| {\xi - {\mathrm{max}} \left( {{{\bf{d}}_e}} \right)} \right|} \right)} }}{{\left| D \right|}} $ (21)
where the relative distance $\left| {\xi - {\rm{max}} \left( {{{\bf{d}}_e}} \right)} \right|$ between the difficulty of each exercise in the recommended list and desired difficulty of the exercise $\xi $ is calculated to test the accuracy of recommended exercises.
Diversity: The diversity indicates the variability of the exercises in the recommended list, represented by calculating the similarity between the exercises in the recommended list. Then, the diversity of the whole recommended list can be expressed as
$ {{\mathrm{Div}}} (D) = \frac{{\displaystyle\sum\limits_{{e_i} \in D} {\displaystyle\sum\limits_{{e_j} \in D} {\left( {1 - \dfrac{{{\bf{e}}_i^{\mathrm{T}}{{\bf{e}}_j}}}{{\left\| {{{\bf{e}}_i}} \right\|\left\| {{{\bf{e}}_j}} \right\|}}} \right)} } }}{{\left| D \right|(\left| D \right| - 1)}} . $ (22)
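The sketch below implements the three metrics as defined in (20)-(22); the function names and the default desired difficulty ξ are assumptions.

import numpy as np

def novelty(D, Q, c_s):
    """D: recommended exercise indices; Q: (M, Z) incidence; c_s: set of concept indices the student answered correctly."""
    scores = []
    for e in D:
        q_e = set(np.flatnonzero(Q[e]))
        jacc = len(q_e & c_s) / len(q_e | c_s) if (q_e | c_s) else 0.0
        scores.append(1.0 - jacc)                    # Eq. (20)
    return float(np.mean(scores))

def accuracy(D, B, xi=0.6):
    """B: (M, Z) difficulty matrix; xi: desired difficulty."""
    return float(np.mean([1.0 - abs(xi - B[e].max()) for e in D]))   # Eq. (21)

def diversity(D, E):
    """E: (M, Z) exercise portraits."""
    total, L = 0.0, len(D)
    for i in D:
        for j in D:
            if i != j:
                cos = E[i] @ E[j] / (np.linalg.norm(E[i]) * np.linalg.norm(E[j]) + 1e-12)
                total += 1.0 - cos                   # Eq. (22)
    return total / (L * (L - 1))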
5.4 Performance comparison (RQ1)
In this subsection, the PERP method is compared with the baseline methods. In Table 4, bold values indicate the best performance among all methods, and underlined values indicate the best performance among the baseline methods. The default candidate exercise set size is 30, and the final recommendation list size is 6. Table 4 compares PERP with the eight baseline methods on two real datasets with respect to novelty, accuracy, and diversity. The results in Table 4 reveal that the proposed method consistently produces the best overall performance on both datasets, verifying that our method can indeed recommend a list of exercises with novelty, accuracy, and diversity simultaneously.

Table 4. Performance of all methods on all datasets.
Dataset | ASSISTments 2009-2010 | | | Algebra 2006-2007 | |
Model | Novelty | Accuracy | Diversity | Novelty | Accuracy | Diversity
KNN | 0.934 | 0.888 | 0.254 | 0.783 | 0.747 | 0.407
KGEB-CF | 0.912 | 0.879 | 0.524 | 0.676 | 0.631 | 0.674
MBHT | 0.946 | 0.893 | 0.343 | 0.798 | 0.859 | 0.468
DKT | 0.602 | 0.880 | 0.466 | 0.621 | 0.855 | 0.583
NeuralCD | 0.583 | 0.894 | 0.495 | 0.645 | 0.859 | 0.668
DTransformer | 0.713 | 0.892 | 0.452 | 0.661 | 0.861 | 0.516
HB-DeepCF | 0.914 | 0.823 | 0.758 | 0.739 | 0.695 | 0.619
KCP-ER | 0.957 | 0.895 | 0.765 | 0.818 | 0.863 | 0.743
PERP | 0.959 | 0.897 | 0.781 | 0.821 | 0.865 | 0.758
From the result, we summarize several important observations.
• NeuralCD and DKT consistently demonstrate poor performance in terms of novelty on both datasets. The cognitive models adopted in both cannot capture the intricate relationships between students and exercises, limiting their effectiveness. However, the enhanced accuracy achieved by the cognitive diagnostic model DTransformer positively impacts novelty: a more precise estimation of students’ mastery levels improves student profiling. KNN, KGEB-CF, and MBHT consistently outperform DKT, NeuralCD, and DTransformer, respectively, highlighting the significance of modeling implicit relationships between students and exercises for effective recommendation. The superior performance of MBHT over KNN and KGEB-CF indicates that exploring multiple dimensions of the relationships between students and exercises can result in a more personalized list of recommended exercises that aligns with students’ learning situations. However, it is important to note that these models do not account for the specific nature of education, which ultimately leads to subpar recommendation results.
• In terms of accuracy, HB-DeepCF performs the poorest on both datasets. This can be attributed to the fact that this method learns semantic features through a hybrid recommendation paradigm, which may decrease accuracy. This approach fails to fully explore the associations between students and exercises, leading to suboptimal performance. On the other hand, the performance of other baseline models confirms that incorporating the connections between students and exercises improves accuracy. For example, the MBHT model outperforms other recommendation models regarding accuracy. This indicates that considering global behavior by modeling students can lead to more accurate student representations and, consequently, more tailored recommendations that align with students’ specific learning needs.
• So far as diversity is concerned, KNN exhibits the poorest performance. One possible reason is that KNN heavily relies on the quality of answer records, which can lead to one-sided recommendation results. Furthermore, the classical recommendation methods do not consider the specific characteristics of educational scenarios, which results in similar recommendation outcomes. In contrast, cognitive models that are trained to obtain a more comprehensive representation of students and exercises have the potential to generate a more diverse set of recommendation results. By capturing a broader range of student and exercise characteristics, cognitive models can offer recommendations that better reflect the diversity of learning needs and preferences.
• PERP offers several notable advantages over existing baselines, as outlined below. 1) Personalized dependency modeling: PERP adopts tailored portraits for students and exercises, allowing for personalized dependency modeling. This customized approach improves the accuracy and precision of the model, resulting in enhanced performance on both datasets. 2) Enhanced influence prolongation: PERP incorporates a joint random walk paradigm that leverages the spectral properties of nearly uncoupled Markov chains. This design choice prolongs users’ personalized initialization, facilitating a more comprehensive understanding and representation of their preferences. The improved influence prolongation leads to better performance and robustness than other methods. 3) Diversity-driven optimization: PERP integrates the simulated annealing algorithm to optimize for diversity. This algorithmic approach ensures the model explores various possible solutions, promoting a more diverse and comprehensive recommendation system. PERP enhances its adaptability and reliability by considering diversity as an optimization criterion. Overall, PERP outperforms existing baselines in terms of performance on both datasets. Its advantages, such as personalized dependency modeling, enhanced influence prolongation, and diversity-driven optimization, establish the effectiveness and superiority of this proposed approach for exercise recommendation systems.
5.5 Ablation study (RQ2)
In this subsection, we perform the ablation study to better understand the effects of different components of our model.
5.5.1 Similar students (exercises)
To verify the effectiveness of integrating student (exercise) dependency under a random walk prototype, we build three variants of PERP by removing part of its modules: 1) The variant PERP-s is built by removing the student-student relation; 2) the variant PERP-e removes the exercise-exercise relation; 3) the variant PERP-s-e is constructed by removing both the student-student relation and the exercise-exercise relation.
We present the evaluation results in Fig. 3 with the following key observations. PERP consistently outperforms its variants, which confirms the effectiveness of our portrait modeling paradigm in capturing the complex dependencies between students (exercises). The removal of the student-student relation exerts a more significant impact on accuracy, which validates that exercises performed by students with similar levels of knowledge mastery provide imperative reference value to the target students. Moreover, PERP-e underperforms PERP-s with respect to diversity, which indicates that removing the exercise-exercise relation has a negative impact on diversity.

Figure 3.Influence of similar students (exercises).
5.5.2 Portraits of students (exercises)
The four variants, PERP-m, PERP-c, PERP-d, and PERP-q, are designed to verify the usefulness of the fine-grained portraits of students and exercises. PERP-m removes the knowledge mastery component from the student portrait; PERP-c removes the knowledge coverage component from the student portrait; PERP-d removes the difficulty component from the exercise portrait; PERP-q removes the knowledge association component from the exercise portrait.
As observed in Fig. 4, PERP is consistently superior to all variants, which illustrates the importance of all four components of the portrait of students and exercises. It is worth noting that knowledge mastery and knowledge coverage in PERP significantly impact the accuracy of the results. This indicates they form a more reasonable student representation, helping the model recommend more accurate exercises.

Figure 4.Influence of portraits of students (exercises).
5.6 Sensitivity evaluation (RQ3)
We evaluate how different hyper-parameter settings affect the performance of PERP, especially the candidate recommendation list size and the final recommendation list size. The effect of the similar student (exercise) thresholds on the experimental results is also of great interest.
5.6.1 Recommended list size
We evaluate the effect of the candidate recommendation list size top-P and the recommendation list size top-L in PERP.
To verify the effect of the candidate exercise set and the final recommendation list on the recommendation results, we vary top-P in the range of {20, 30, 40, 50} and top-L in the range of {2, 4, 6, 8, 10}. The performance comparison on the ASSISTments 2009-2010 dataset is illustrated in Fig. 5. When top-P increases from 20 to 30, the diversity performance improves, demonstrating that too few candidate exercises are not beneficial to the diversity of recommendations. When top-P increases from 30 to 50, the performance becomes poorer, which makes sense since too many candidates bring in inappropriate exercises. When top-L increases with top-P fixed, the performance rises first and then drops, because the increased diversity of exercises decreases accuracy and novelty. When the candidate exercise set size is 30 and the final recommendation list size is 6, the results are most consistent with our requirements for novelty, accuracy, and diversity.

Figure 5. Impacts of the candidate exercise number (top-P) and the recommended exercise number (top-L) on performance.
5.6.2 Similarity threshold
The similarity threshold is searched from 0 to 1 with an increment of 0.2 to verify the effect of the student similarity threshold and the exercise similarity threshold on the PERP method. The performance comparison results on the ASSISTments 2009-2010 dataset are presented in Fig. 6.

Figure 6. Performance comparison with the change in the student (exercise) similarity threshold.
Obviously, as the student similarity threshold or the exercise similarity threshold becomes larger, the recommendation effect improves until the threshold reaches 0.6 and then decreases. Lower thresholds cause most students’ influence to be added to the calculation, resulting in too much noisy data that harms the recommendation performance. Higher thresholds filter out most of the similar students (exercises); consequently, the model receives too little information, leading to a decrease in the performance of the recommendation results.
5.7 Wrong answer record reference (RQ4)
We next study how incorrect response records facilitate the recommendation results. Towards this end, we use incorrect response records to construct CSEG and perform the joint random walk to obtain a recommendation list for each student. In this way, two recommendation lists are obtained, based on correct and incorrect response records, providing students with review and pre-learning suggestions, respectively. Table 5 shows that the recommendation result based on correct records consistently outperforms that based on incorrect records with respect to accuracy and diversity, while underperforming it with respect to novelty. The possible reason is that a recommendation list obtained from correct records is more likely to recommend exercises from students whose mastery is similar to that of the target student, which may filter out some useful information. In addition, making recommendations with incorrect response records preserves differences between students and thus yields final recommendations with novelty.

Table 5. Right and wrong answer recommendations on the ASSISTments 2009-2010 dataset.
ASSISTments 2009-2010 | PERP+ | PERP–
Novelty | 0.959 | 0.961
Accuracy | 0.897 | 0.887
Diversity | 0.781 | 0.779
5.8 Case study (RQ5)
A student with ID 219, who has completed 21 exercises covering 14 knowledge concepts in the ASSISTments 2009-2010 dataset, is selected as an example. Table 6 shows the recommendation results derived from KNN, NeuralCD, and PERP, respectively. The exercises recommended by the KNN method involve three new knowledge concepts, while the exercises recommended by NeuralCD contain one new knowledge concept, ‘circle graph’. In contrast, the exercises recommended by our method involve the new knowledge concepts of ‘box and whisker’, ‘congruence’, ‘ordering integers’, ‘square root’, and ‘equivalent fractions’. This demonstrates that PERP can recommend diverse exercises for students while ensuring novelty.

Table 6. Student with ID.219 answered and recommended the situation.
 | Exercise number | Knowledge concepts
Actual answer records | 7, 33, 960, 962, 1098, 1831, 3090, 3102, 3145, 3151, 7398, 7465, 7516, 8724, 8729, 10274, 12339, 12358, 12811, 17125, 17162 | Equation solving, Histogram as table or graph, Number line, Line plot, Stem and leaf plot, Table, Mode, Addition and subtraction fractions, Ordering fractions, Conversion of fraction decimals percents, Finding percents, Scale factor, Unit rate, Pattern finding
KNN | 8318, 8306, 8304, 8252, 14864, 8254 | Multiplication and division integers, Addition whole numbers, Division fractions
NeuralCD | 10224, 52, 70, 8286, 37, 7380, 8307 | Equation solving, Stem and leaf plot, Addition and subtraction fractions, Circle graph, Finding percents
PERP | 196, 246, 202, 235, 184, 200 | Box and whisker, Congruence, Ordering integers, Square root, Equivalent fractions
6 Conclusion
A novel approach that models portraits of students and exercises for exercise recommendation is proposed. First, fine-grained modeling of students is achieved through their knowledge mastery and knowledge coverage to determine similar students. Fine-grained modeling of the exercises is completed via the difficulty distribution and the knowledge association vector of the exercises, and similar exercises are identified. The collaborative student exercise graph is then constructed from similar students, similar exercises, and student-exercise interaction records. Afterward, a joint random walk exploiting the nearly uncoupled Markov chain property is performed to acquire a ranked list of exercises. Finally, a set of exercises with novelty, accuracy, and diversity is obtained by solving the optimization problem. Compared with existing recommendation methods, the advantages of PERP are validated on two real-world educational data mining datasets.
Disclosures
The authors declare no conflicts of interest.