
Digital textbook (e-book) systems record student interactions with textbooks as a sequence of events called EventStream data. Researchers have extracted meaningful features from EventStream data and used them as inputs for downstream tasks such as grade prediction and student behavior modeling. Previous studies mainly evaluated models with statistical features derived from EventStream logs, such as the number of operation types or access frequencies. While these features provide certain insights, they lack the temporal information needed to capture fine-grained differences in learning behavior among students. This study proposes E2Vec, a novel feature representation method based on word embeddings. The proposed method treats each student's operation logs and the time intervals between them as a string of characters and generates a student vector of learning activity features that incorporates time information. We applied fastText to generate an embedding vector for each of 305 students in a dataset drawn from two years of computer science courses. We then investigated the effectiveness of E2Vec in an at-risk detection task, demonstrating its potential for generalizability and performance.
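To make the idea of treating operation logs and time intervals as a character string more concrete, the following is a minimal sketch of one possible encoding. It is not the encoding used by E2Vec: the operation names, the character table `OP_CHARS`, the interval thresholds in `GAP_SYMBOLS`, and the helper functions are all hypothetical choices made for illustration only.

```python
from typing import List, Tuple

# Hypothetical operation-to-character table (an assumption, not taken from the paper).
OP_CHARS = {"OPEN": "O", "NEXT": "N", "PREV": "P", "ADD_MARKER": "M", "CLOSE": "C"}

# Hypothetical interval buckets: (exclusive upper bound in seconds, symbol).
# Gaps under 10 s add no symbol; longer gaps add "s" or "m".
GAP_SYMBOLS = [(10, ""), (60, "s"), (300, "m")]

# Gaps of 300 s or more insert a space, which splits the string into "words".
LONG_GAP = " "


def gap_symbol(seconds: float) -> str:
    """Map the time elapsed between two events to a symbol."""
    for upper, symbol in GAP_SYMBOLS:
        if seconds < upper:
            return symbol
    return LONG_GAP


def encode_events(events: List[Tuple[float, str]]) -> str:
    """Turn (timestamp_seconds, operation) pairs into a single string.

    Events are sorted by timestamp; unknown operations are written as "?".
    """
    parts = []
    prev_time = None
    for timestamp, operation in sorted(events):
        if prev_time is not None:
            parts.append(gap_symbol(timestamp - prev_time))
        parts.append(OP_CHARS.get(operation, "?"))
        prev_time = timestamp
    return "".join(parts)


if __name__ == "__main__":
    log = [(0, "OPEN"), (5, "NEXT"), (35, "NEXT"), (155, "ADD_MARKER"), (1000, "CLOSE")]
    print(encode_events(log))
```

Under these assumptions, each student's log becomes a whitespace-separated sequence of "words" that a word-embedding tool such as fastText could in principle consume; the embedding step itself is left out of this sketch.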
Published in the proceedings of the 17th Educational Data Mining Conference (EDM 2024)
FOS: Computer and information sciences, Computer Science - Computers and Society, Computer Science - Machine Learning, Computer Science - Computation and Language, Artificial Intelligence (cs.AI), feature representation; fastText; digital textbooks; e-book EventStream; at-risk prediction; educational data mining, Computer Science - Artificial Intelligence, Computers and Society (cs.CY), Computation and Language (cs.CL), Machine Learning (cs.LG)
