Upload
others
View
2
Download
0
Embed Size (px)
Citation preview
Progression Trajectory-Based Student Modeling for NoviceBlock-Based Programming
Fahmid Morshed Fahid
North Carolina State University
Xiaoyi Tian
University of Florida
Andrew Emerson
North Carolina State University
Joseph B. Wiggins
University of Florida
Dolly Bounajim
North Carolina State University
Andy Smith
North Carolina State University
Eric Wiebe
North Carolina State University
Bradford Mott
North Carolina State University
Kristy Elizabeth Boyer
University of Florida
James Lester
North Carolina State University
ABSTRACTBlock-based programming environments are widely used in com-
puter science education. However, these environments pose signifi-
cant challenges for student modeling. Given a series of problem-
solving actions taken by students in block-based programming envi-
ronments, student models need to accurately infer problem-solving
students’ programming abilities in real time to enable adaptive
feedback and hints that are tailored to students’ abilities. While
student models for block-based programming offer the potential
to support student-adaptivity, creating student models for these
environments is challenging because students can develop a broad
range of solutions to a given programming activity. To address
these challenges, we introduce a progression trajectory-based stu-
dent modeling framework for modeling novice student block-based
programming across multiple learning activities. Student trajecto-
ries utilize a time series representation that employs code analysis
to incrementally compare student programs to expert solutions
as students undertake block-based programming activities. This
paper reports on a study in which progression trajectories were
collected from more than 100 undergraduate students engaging in
a series of block-based programming activities in an introductory
computer science course. Using progression trajectory-based stu-
dent modeling, we identified three distinct trajectory classes: Early
Quitting, High Persistence, and Efficient Completion. Analysis of
these trajectories revealed that they exhibit significantly different
characteristics with respect to students’ actions and can be used to
Permission to make digital or hard copies of all or part of this work for personal or
classroom use is granted without fee provided that copies are not made or distributed
for profit or commercial advantage and that copies bear this notice and the full citation
on the first page. Copyrights for components of this work owned by others than ACM
must be honored. Abstracting with credit is permitted. To copy otherwise, or republish,
to post on servers or to redistribute to lists, requires prior specific permission and/or a
fee. Request permissions from [email protected].
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands© 2021 Association for Computing Machinery.
ACM ISBN 978-1-4503-8366-0/21/06. . . $15.00
https://doi.org/10.1145/3450613.3456833
accurately predict students’ programming behaviors on future pro-
gramming activities compared to competing baseline models. The
findings suggest that progression trajectory-based student models
can accurately model students’ block-based programming prob-
lem solving and hold potential for informing adaptive support in
block-based programming environments.
CCS CONCEPTS• Social and professional topics→ Professional topics; Comput-
ing education; • Applied computing→ Education.
KEYWORDSTrajectory-based Student Modeling, Block-based Programming,
Time Series Clustering
ACM Reference Format:Fahmid Morshed Fahid, Xiaoyi Tian, Andrew Emerson, Joseph B. Wiggins,
Dolly Bounajim, Andy Smith, Eric Wiebe, Bradford Mott, Kristy Elizabeth
Boyer, and James Lester. 2021. Progression Trajectory-Based Student Mod-
eling for Novice Block-Based Programming. In Proceedings of the 29th ACMConference on User Modeling, Adaptation and Personalization (UMAP ’21),June 21–25, 2021, Utrecht, Netherlands. ACM, New York, NY, USA, 12 pages.
https://doi.org/10.1145/3450613.3456833
1 INTRODUCTIONIntroductory computer science courses can be challenging for
novice students who have limited or no prior programming experi-
ence, resulting in high failure and attrition rates [2]. An important
ability that students must develop is computational thinking, a
complex mental process that involves logically formulating and
solving problems [58]. In addition to computational thinking, tradi-
tional text-based programming courses also teach students about
language-specific syntax and structure. Novice students typically
struggle with programming language syntax and structure, which
results in a steep learning curve in their introductory courses [55].
To ease the learning process, block-based programming languages
such as Scratch [46], Snap! [14] and Blockly [13] are frequently
189
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands Fahmid Fahid et al.
used in introductory computer science courses [29], eliminating
many of the common syntactic, language-specific challenges. These
visual, block-based languages can reduce students’ cognitive load
while increasing their engagement in the learning process [41, 56].
However, even with the use of block-based programming languages,
students still struggle in introductory classes [30, 42].
Informing student-adaptive support in the form of tailored ad-
vice, guidance, hints, and feedback for block-based programming
calls for the creation of student models that can accurately ana-
lyze student problem solving in block-based programming environ-
ments. Programming tasks are typically iterative in nature where
programmers progress through the phase of planning, writing, test-
ing, and revising their code. To better understand this iterative
process, researchers have studied programming trajectories, which
represent the incremental code changes in a programming activity.
To understand such programming trajectories, several code analy-
sis methods have been used [9, 34, 53], to provide better support
to students [7, 42, 48, 57], develop automatic grading [8, 54], or
predict learning performance [11, 53, 54]. Some studies also found
patterns in students’ programming trajectories that can be used
to identify learning behaviors [12, 22, 39]. However, limited work
has investigated students’ programming trajectory patterns across
multiple programming activities to enable early prediction of their
learning behavior in future programming activities.
In this work, we introduce progression trajectory-based studentmodels that utilize a time series representation employing code
analysis to incrementally analyze student programs as students cre-
ate them. We explore progression trajectory-based student models
for the programming trajectories of 149 undergraduate students
in an introductory computer science course using a block-based
programming environment across five programming activities. We
investigate the following three research questions (RQs):
• RQ1: Are there distinct patterns in student’s block-based
progression trajectories?
• RQ2: Are these patterns associated with meaningful differ-
ences in programming behaviors and learning outcomes?
• RQ3: Are these patterns predictive of students’ future pro-
gramming behavior (i.e., progression trajectory)?
To answer RQ1, we analyzed a total of 745 progression trajecto-
ries across five different programming activities using time series-
based clustering. The analyses revealed that there are three distinct
clusters in students’ progression trajectories. To understand how
these clusters differ from one another (RQ2), we used different
response variables across the clusters and found that the clusters
exhibit significantly different characteristics across programming
activity completion, hint requests, and activity restarts. We also
found that students primarily associated with a particular cluster
across multiple programming activities have significantly different
pre- and post-test scores than students associated with other clus-
ters. To answer RQ3, we trained several predictive classifiers using
features collected from student programming trajectories and event
logs. The prediction results showed that the models can accurately
predict whether a student will change their behavior (i.e., transition
to a different cluster) in subsequent programming activities.
2 RELATEDWORKRecent years have seen increasing adoption of block-based pro-
gramming environments in introductory computer science courses.
However, limited work has investigated student models that infer
student programming competences from student behaviors in block-
based programming environments. Price et al. created iSnap [43], a
block-based programming environment that can generate adaptive
contextual hints using students’ programming behavior. Further
studies showed that such hints can lead to positive learning outcome
in undergraduate-level introductory programming courses [42] and
to better help-seeking behavior as well as reduction of hint abuse
[44]. Specific hint types have also been studied. For example, Zhi et
al., found that worked examples reduce cognitive load but do not
result in significant learning gains [59]. Several other studies with
students in MOOCs showed the efficacy of data-driven approaches
to provide effective feedback to students [19, 28, 31, 36]. Despite
these advances, a recent study comparing machine-generated feed-
back to human-authored feedback showed that students develop
a better conceptual understanding of programming concepts and
perform better in class when assisted by human-authored feedback
[25].
Student modeling can provide insight into students’ future learn-
ing behaviors and performance. For example, Emerson et al. showed
that Bayesian hierarchical student models can account for indi-
vidual student differences and successfully predict students’ post-
test programming performance [9]. Other work has used Bayesian
student models to identify concepts that are difficult for students
[47] and to predict mastery learning [38, 53]. Machine learning
approaches were found to be effective in modeling students’ compe-
tencies and goals in game-based learning environments using game
trace data [23, 32, 33]. Multimodal data streams have also been used
to model students’ performance and affective states [18, 27]. Barria-
Pineda et al. designed open learner-based student modeling to
visually inform students about their conceptual knowledge level on
topics and found that students make more informed decisions when
navigating through the programming activities while increasing
their overall learning performance [1]. Another study by Troiano
et al. clustered the developments of programming concepts among
K-12 students in Scratch using automated metric-based assessments
and found four groups of progressions: quick-then-steady, steady-
all-the-way, slow-and-still-developing, no-improvement-necessary
[52]. In our work, we have modeled students’ progress towards
solutions using their block-based programs to find and predict their
progress patterns in future activities.
Modeling students in open-ended programming activities is crit-
ical as there are often multiple paths to a solution, and program
accuracy alone does not reveal students’ learning behavior [17]. An
empirical study by Hosseini et al. showed that intermediate pro-
gramming steps can be used to effectively induce student models
and can be useful for providing improved feedback to students [20].
The authors analyzed the interaction logs of JavaParser and found
that there is a strong association between conceptual change (i.e.,
start with a small program and gradually add new programming
concepts) and the correctness of the program. A similar association
between programming concepts and performance has been found
by using programming trajectories [4, 40]. In another empirical
190
Progression Trajectory-Based Student Modeling for Novice Block-Based Programming UMAP ’21, June 21–25, 2021, Utrecht, Netherlands
Figure 1: Screenshot of the Prime environment.
study, Blikstein found three patterns of programming behaviors
among students, namely copy-paste, self-sufficient, and a combi-
nation of both [3]. A recent study by Estey et al. found that inter-
mediate programming steps can successfully identify struggling
students [12]. To understand how a student changes their program-
ming pattern in a programming activity, Glassman et al. used a
clustering technique to visualize students’ behaviors [15].
Much prior work has used edit distance-based code embed-
dings to investigate students’ programming behaviors. For exam-
ple, Paaßen et al. generated adaptive hints in student programs
using their edit distance to expert solutions to predict future be-
haviors of students in block-based programming activities [35]. A
similar approach has been used in other work [48] to investigate
programming patterns. Emerson et al. clustered student programs
to identify common misconceptions among students in block-based
programming activities [10]. Another clustering study by Perkins
et al. identified three kinds of novice programmers: stoppers,movers,and tinkerers. As the cluster names suggest, stoppers give little effortafter they reach a difficult point, whereasmovers gradually work to-wards a solution, and tinkerers iterate frequently with the program
[39]. Similar types of clusters have also been identified in an analysis
of program syntax errors [21] and learning behaviors in an intelli-
gent logic tutor [30]. Jiang et al. conducted a hierarchical clustering
of programming trajectories in a block-based programming envi-
ronment and found four different clusters for a single programming
activity, namely: quitters, approachers, knowers, and solvers [22].Their analysis suggests that both quitters and approachers struggledto solve the given programming activity but have different paths
while failing. A more recent study found five different clusters of
hint usage when looking across several programming activities of
students in a block-based programming environment [57], all of
which have different help-seeking behaviors. For example, students
with prior programming knowledge only requested hints at the
very end of each programming activity, whereas students with no
prior knowledge asked for hints uniformly. However, there has
been limited work on student modeling for students’ programming
trajectories across multiple programming activities to predict their
progress.
In this work, we present a student modeling approach based on
students’ progression trajectories using intermediate block-based
programs on multiple programming activities by iteratively calcu-
lating edit distances between their intermediate code states and
expert solutions. Using time series clustering, we find meaningful
patterns in student progression trajectories that are predictive of
future programming behavior.
3 METHODSWe investigate progression trajectory-based student modeling in
the context of introductory college-level computer science with
a focus on block-based programming for non-majors. We use in-
teraction data collected from undergraduates interacting with a
block-based programming environment during the first few weeks
of an introductory programming course.
3.1 Prime Learning EnvironmentPrime is an adaptive learning environment that leverages Google’s
block-based programming framework, Blockly [13]. The adaptive
learning environment includes twenty programming activities, each
designed to reinforce the skills learned from previous programming
activities. Students are encouraged to complete activities in increas-
ing order, but they can skip ahead and come back later to attempt
or practice again. The programming activities are organized into
three units with the following computer science competencies: 1)
input/output, numeric data types, expressions, variables, and iter-
ation; 2) abstraction, functions, and parameters; and 3) Boolean
data types, conditionals, indefinite iteration, and debugging. Stu-
dents are encouraged to use the learning environment both during
class, as well as for self-practice, lab, and homework assignments.
The curriculum of Prime was designed based on the introductory
programming courses taught in top-rated undergraduate computer
science programs in the United States [51].
191
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands Fahmid Fahid et al.
Table 1: Short descriptions of each programming activity
Programming
Activity
New Concepts Introduced Short Description
3 Variable, Math Expression Print the sum of two numbers using variables
4 Input Take two numbers from user and print the sum
5 Logical Take five numbers from user using two variables only and print the sum
6 Loop Complete Programming Activity 5 using a for loop
7 Logical Create a counter that countdown to 0
To support students during their programming activities, the
primary interface of the environment consists of a Program panel,
a Console panel, a Feedback panel, and an Instruction panel (Figure
1). The Program panel consists of a toolbox of Blockly program
blocks that can be drag-and-dropped to the coding workspace. The
toolbox categorizes the blocks by type. The categories of blocks
are incrementally populated to reduce the cognitive complexity of
tasks [45] and to increase the interface usability [49].
The default workspace contains an unremovable Start block thatserves as an entry point. The Console panel contains a Run button
to run the program connected to the Start block. The output is
printed on the Console panel each time the Run button is pressed.
The input console is also shown in the Console panel when needed.
The Instruction panel contains step-by-step instructions for each
programming activity and is fairly common in intelligent tutoring
systems for computer science education [6]. The Feedback panel
provides a “Get Hint” button that enables students to request textual
hints, which are suggestions to make small changes to the program
towards a solution.
3.2 Study DesignWe collected student programming interaction data on block-based
programming activities at a large public university in the United
States during two semesters (fall 2019 and spring 2020). The study
was IRB-approved for an introductory engineering course, in which
students received extra credit for completing the programming ac-
tivities. As the learning environment was accessible online, transi-
tioning to remote learning was straightforward when the pandemic
began in spring 2020. The study focused on five programming activ-
ities from Unit 1 (Programming Activities 3 through 7) that teaches
students the concept of variables, input/output and loops (Table
1). We focused on these programming activities because they are
sufficiently challenging (62% average completion rate) while ensur-
ing maximum participants. We did not use the first two activities
(Programming Activities 1 and 2), as they serve as an introduction
to the programming environment and have a very high completion
rate (99% and 87% respectively). Although Programming Activity 3
also has a significantly high competition rate (83%), we included
this programming activity in the study to facilitate early prediction
of Programming Activity 4.
A total of 407 students participated in the study, of which 156
students attempted all five programming activities in Unit 1 (Pro-
gramming Activities 3 through 7). Among these 156 students, 149
students also completed both pre-test and post-test surveys to as-
sess their computer science competencies. The study focuses on
these 149 participants having an average age of 19.1 (SD=2.2), with74% reporting as male and 26% as female. Approximately 90% of
the students were non-computer science engineering majors. A
total of 62% of these students reported their race as White, 16%
as Asian, 9% as African American, 6% as Hispanic, 2% as Native
American, and 4% as Other. The 149 students attempted a total of
745 programming activities (one student per activity for Program-
ming Activities 3 through 7) of which 459 (62%) were successfully
completed. Among these five programming activities, the median
value of successful programming activity completion per student is
3 (M=3.08, SD=1.57, min=0, max=5).
3.3 DatasetThe data was collected from student interaction logs with the learn-
ing environment. The data consists of workspace snapshots cap-
tured with an XML representation of student programs of each new
change made in the code and 16 system-level user actions, including
program runs and user input. The workspace snapshots represent
the current block-based program for a particular programming ac-
tivity. The environment automatically takes workspace snapshots
every 30 seconds or if certain actions are taken, such as running or
saving the program. We removed repeated snapshots of the same
workspace to limit redundancy. As this is a browser-based envi-
ronment, the calculation of elapsed time is challenging because
students sometimes leave their browser open for extended periods
of time without making any changes to the workspace. To correct
for such cases, we replaced any time gaps of more than five minutes
between two consecutive workspace snapshots with the median
value (29.96 seconds) of consecutive time gaps between snapshots.
The five-minute threshold was selected based on the third quartile
value of time gap between two consecutive workspace snapshots.
The final dataset contains a total of 745 trajectories, one for each
student-activity pair, having a median length of 10 snapshots per
programming activity (M=19.62, SD=29.95). The time spent on each
programming activity on average is 508.98 seconds (SD=850.38)with a median value of 216.05 seconds.
The dataset also uses students’ pre-test and post-test survey be-
fore and after interacting with the learning environment to measure
their computer science competencies. These multiple-choice based
assessments were validated by three content area experts, having
0.88 Cronbach’s alpha for pre-test and 0.90 Cronbach’s alpha for
post-test. Among these 149 students, the median value of the pre-
test score was 58.33 (M=57.5, SD=22.6,min=12.5,max=100) and themedian value of their post-test score was 66.7 (M=61.5, SD=23.9,min=4.2, max=100).
192
Progression Trajectory-Based Student Modeling for Novice Block-Based Programming UMAP ’21, June 21–25, 2021, Utrecht, Netherlands
Figure 2: Ten randomly sampled progression trajectories for each of the three clusters. The x-axis is the time in seconds andthe y-axis is the similarity percentage between an expert solution and the current workspace (higher is better).
4 RESULTSWe describe findings on the time series-based progression trajectory
clustering and early predictions. We first introduce the progres-
sion trajectory-based student modeling and then discusses the time
series-based clusters we have found. We then investigate and char-
acterize each of the clusters based on multiple attributes. Finally,
we evaluate two early prediction-based modeling tasks that can be
used to predict students’ future programming behaviors.
4.1 Progression Trajectory PatternsTo generate the progression trajectory of a programming activity at-
tempt, we calculated the similarity percentage between workspace
snapshots and an expert authored solution for that activity. Each
workspace snapshot represents one block-based programming code
state whereas the expert-authored solution is the final workspace
snapshot of the solution to a particular programming activity. The
expert-authored solutions were based on the block-based program-
ming best practices such as minimizing the number of steps towards
each solution. To calculate the similarity percentage, we first con-
verted the student’s workspace snapshots and the expert solution
into Python code. Then, using the program’s abstract syntax tree
(AST), we calculated the edit distance using Damerau–Levenshtein
[5] distance between the workspace and the expert solution. The
Damerau–Levenshtein distance is calculated by the minimum num-
ber of operations (insert, delete, substitute, transpose) needed to
convert one string to another string. Finally, the edit distance was
converted to a percentage to calculate the similarity between the
current workspace and the expert solution. For converting Python
codes to ASTs and calculating the similarity percentages, we used
the pyastsim package.1We used the edit distance to calculate sim-
ilarity because the majority of the programming activities and
their corresponding Python code are relatively simple. Similar ap-
proaches have been effectively utilized in prior studies to observe
students’ progression through individual programming activities
[22].
As there can be many correct solutions to most programming
activities, some students who correctly completed a programming
activity may not have 100% similarity with the expert solution. To
ensure these snapshots were treated as correct, we updated their
final similarity score to 100% for all the programming activities that
were completed successfully. We also removed any change made
in their workspace after reaching 100% similarity to focus on their
first correct solution.
We clustered the progression trajectories using k-means clus-
tering. As the progression trajectories are time series data with
varying lengths and shapes, using generic distance measures will
not reflect the true distance between two trajectories. Thus, we
used the dynamic time warping (DTW) algorithm to calculate the
distance between trajectories. DTW is a dynamic programming
approach where a distance or cost function (typically Euclidean
distance) is minimized by finding an optimal alignment. This can
be done by rearranging and/or repeating sequence points of two
sequential trajectories [50].
The distortion elbow is often used as a data-driven approach for
finding the optimal number of clusters in a dataset [24]. Thus, to
find the optimal value of k, we calculate the distortion elbow using
the average DTW distance from the closest cluster center. Visually
inspecting the "elbow" (the inflection point of the curve), we chose
3 as the optimal number of clusters of progression trajectories. We
name these clusters Early Quitting (EQ), High Persistence (HP), andEfficient Completion (EC). Ten randomly sampled progression trajec-
tories from each cluster are shown in Figure 2. Note that each line
represents one student-activity pair and one student’s progression
trajectories for different programming activities could be assigned
to different clusters. With regard to the RQ1, we found three distinct
time-series clusters: Early Quitting (EQ), High Persistence (HP), andEfficient Completion (EC) in students’ progression trajectories.
As show in Figure 2, all three clusters have distinct progression
trajectories. For example, progression trajectories in the HP clusterspend a large amount of time (M=1756.69 seconds, SD=1514.82)and change their similarity percentage frequently, whereas progres-
sion trajectories in the EQ cluster do not make as much progress
within their shorter period of time in the programming activity
(M=195.56 seconds, SD=358.59), and progression trajectories in
the EC cluster quickly make progress towards the solution like an
1https://pypi.org/project/pyastsim/
193
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands Fahmid Fahid et al.
Figure 3: Number of progression trajectories that completed and did not complete each programming activity for cluster EQ,HP, and EC (from left to right)
expert (M=319.53 seconds, SD=371.10). The EC cluster has almost
twice as many student-activity pairs (473) than the HP cluster andthe EQ cluster combined (108 and 164 respectively). This is reason-
able, as the programming activities (e.g., Programming Activities 3
and 4) are introductory and short, and many students were able to
make progress through them quickly (as seen by the EC cluster).
4.2 Analyzing ClustersThe three clusters that separate students’ progression trajectories
into different groups have distinct properties. For example, the
EQ cluster could be characterized as lack of persistence/quick-to-
dropout, which might indicate the student is stuck or unsure how to
begin solving the programming activity, while the EC cluster might
indicate high prior knowledge. A deeper analysis of the impact of
different characteristics on each individual cluster is detailed below.
First, we investigated the overall performance of each cluster
based on successfully completing each programming activity. Figure
3 shows that the EQ cluster has a higher number of programming
activities with incomplete count; thus, a progression trajectory that
belongs to the EQ cluster is most likely to leave the programming
activity unfinished. The HP cluster has an almost equal number of
completed vs. incomplete programming activity counts (52 vs. 56),
indicating that it is uncertain that a student will successfully com-
plete the programming activity or not. The EC cluster has a highernumber of completions. Thus, progression trajectories assigned
to the EC cluster are more likely to finish successfully for a given
programming activity. Even though there are a small number of
incomplete programming activities in the EC cluster, these studentsprogressed through the programming activities fairly quickly and
were close to the expert solutions.
Next, we compared how these clusters are distributed across
programming activities. Figure 4 shows that early programming
activities have a higher number of progression trajectories that
belong to the EC cluster whereas the later programming activities
have an increased number of progression trajectories in the EQcluster. This is expected, as many students will make quick progress
through earlier programming activities due to their low difficulty
and the later programming activities have a higher rate of incom-
pletion. Programming Activity 5 has the highest occurrences of the
HP cluster, suggesting that students spend a significantly long time
in Programming Activity 5.
To examine the differences in student actions across clusters, we
investigated the frequency of requesting hints and restarting the
current programming activity from scratch. We also performed a
pairwise Tukey honestly significant difference test to verify statis-
tically significant results. As shown in Figure 5, progression trajec-
tories in the HP cluster requests significantly more hints (M=7.18,SD=8.37, min=0, max=49) than those in the EQ cluster (M=0.81,SD=1.78, min=0, max=10, p<0.001) and the EC cluster (M=1.11,SD=2.63, min=0, max=22, p<0.001). This can be explained as pro-
gression trajectories in the HP cluster spend a significantly longer
time progressing through the programming activities (Figure 2).
Trajectories in the EQ cluster and the EC cluster either completes the
programming activity shortly or leaves the programming activity
unfinished without sufficient effort, so their low number of hint
requests are expected as well. It is also interesting to observe that
there is no significant difference between hint requests in the EQcluster and the EC cluster (p=0.650).
When progressing through a programming activity, students
sometimes remove all blocks from the workspace to start fresh.
We describe this behavior as a restart. To count the number of
restarts, we selected a threshold of 10%, meaning whenever similar-
ity percentage falls below 10% and is then followed by an increase,
we categorize this as a single restart. Considering the number of
restarts in progression trajectories across clusters, the HP clusterhas the highest number of restarts (M=3.42, SD=2.86), whereas theEQ cluster (M=0.37, SD=1.41) and the EC cluster (M=0.33, SD=0.67)have significantly low numbers of restarts (p<0.001 for both). Thisis also expected as trajectories in the HP cluster show significant
effort to finish a programming activity and thus restarts more fre-
quently than other clusters. Progression trajectories that belong
to the EQ cluster tend to restart less frequently. This suggests that
students following these projection trajectories are less willing to
start again, leaving the programming activity unfinished. Moderate
numbers of restarts are seen in trajectories that belong to the ECcluster, implying the tendency to start fresh when stuck.
To understand how these clusters are associated with prior
knowledge (i.e., pre-test) and overall performance (i.e., post-test) of
194
Progression Trajectory-Based Student Modeling for Novice Block-Based Programming UMAP ’21, June 21–25, 2021, Utrecht, Netherlands
Figure 4: Number of progression trajectories in each cluster for Programming Activity 3, 4, 5, 6, and 7 (from left to right)
each student, we assigned each student to a student group based on
their majority association with a particular cluster across all five
programming activities. Students assigned to the same cluster in
three or more programming activities (out of five) are tagged as
having that cluster as a majority. Among 149 students, 98 belong
to Early Completion major (EC Major), 25 belong to Early Quittingmajor (EQ Major), 6 belong to High Persistence major (HP Major),
and the remaining 20 have mixed clusters (Mixed) in their progres-
sion through the programming activities. Analysis of the pre-test
scores using Tukey honestly significant difference test (Figure 5)
shows no significant differences between HP Major, EQ Major,
and Mixed student groups. But EC Major groups have significantly
higher (p<0.001) pre-test scores (M=66.26, SD=18.81) than EQMajor
(M=33.05, SD=14.83) and Mixed (M=47.58, SD=23) student groups.
This is expected as students with higher prior knowledge should
progress through the programming activities more easily (early
completion) than others.
For the post-test scores, there is a significant difference between
HP Major and EQ Major (p=0.014), between EC Major and Mixed
(p=0.009), between EQ Major and EC Major (p<0.001), and betweenEQ Major and Mixed (p=0.028) student groups. From this, it is
evident that students progressing through programming activi-
ties using specific progression trajectories have strong associations
with their post-test performance. The ECMajor student group has a
higher post-test score (M=69.53, SD=21.46) than EQ Major student
groups (M=35.97, SD=16.08) and Mixed student groups (M=53.33,SD=23.08) but have no significant difference with the HP Major
student group (M=64.58, SD=6.7). In other words, students whose
majority of the progression trajectories belong to the HP clustersor the EC clusters have the highest post-test performance, suggest-
ing significant learning outcomes for these two groups, whereas
the EQ Major student group has the lowest post-test performance
compared to the other groups. The Mixed student group, having
no major association with any particular progression trajectory
cluster, demonstrates moderate post-test performance with high
standard deviation.
A notable observation here is that ECMajor students have almost
identical pre-test scores (M=66.26, SD=18.81) and post-test scores
(M=69.53, SD=21.46), suggesting no significant learning (paired ttest p=0.051) was observed. Same is also true for EQ Major students
(mean pre-test score of 33.05, mean post-test score of 35.97, pairedt test p=0.172). But the Mixed student group and the HP Major
student group have noteworthy differences (paired t test p=0.042and p=0.029 respectively) between their pre-test and post-test score.
With regard to the second research question (RQ2), the analysis
shows that these three clusters have significantly different char-
acteristics in programming activity completion, number of hint
requests, and number of restarts. Students majorly associated with
these clusters also show significantly different pre-test and post-test
scores.
4.3 Early Prediction of Next-Activity ClusterInvestigating the behavior of individual students across program-
ming activities, we found that 34 (23%) students did not change
clusters across all five programming activities. Among 596 program-
ming activity transitions (no transition going into Programming
Activity 3), 65% of transitions remained in the same cluster in con-
secutive programming activities. More granular details can be found
in Table 2
To understand how students transition between clusters across
all five programming activities, we conducted sequential pattern
mining using the prefixspan2Python package on the clusters [16].
Table 3 depicts the five most frequent patterns when progress-
ing through the programming activities. As we can see, 33 (22%)
students remained in the EC cluster across all five programming
2https://pypi.org/project/prefixspan/
195
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands Fahmid Fahid et al.
Figure 5: The top row shows the number ofHint Requests (top-left) and the number of Restarts (top-right) across three clusters.The bottom row shows the Pre-test (bottom-left) and Post-test (bottom-right) scores across all four student groups.
Table 2: Students remaining in the same cluster in consecu-tive programming activity
Programming Activity Transition Students with Same Cluster
3 to 4 107 (71.81%)
4 to 5 84 (56.38%)
5 to 6 97 (61.1%)
6 to 7 100 (67.11%)
Total 388 (65.1%)
activities. Another 12 students started their progression trajectories
in the EC cluster and changed to EQ cluster after Programming Ac-
tivity 4, implying that these students might have lost interest or lack
the prior knowledge to complete more challenging programming
activities. A total of 12 other students remain in the EC cluster andonly transitioned to the HP cluster once in Programming Activity
5. These students appeared to expend significant effort when first
faced with a difficult programming activity and learned to quickly
progress through the rest.
On average, two-thirds (65%) of the students do not change their
progression trajectory patterns (i.e., remain in the same cluster) in
consecutive programming activities. This is noteworthy and can
Table 3: Top five patterns of cluster change across all fiveprogramming activities
Number of Students Cluster Change from Activity 3 to 7
33 (22.15%) EC -> EC -> EC -> EC -> EC
12 (8.05%) EC -> EC -> HP -> EC -> EC
12 (8.05%) EC -> EC -> EQ -> EQ -> EQ
8 (5.37%) EC -> EC -> EC -> EC -> EQ
6 (4.03%) EC -> EC -> HP -> EQ -> EQ
be used to accurately predict the behavior of students in consec-
utive programming activities. To demonstrate this, we designed
two evaluations to do early prediction of students’ future behavior.
The first evaluation (TASK Cluster Change) was a binary classifi-
cation task to predict if a student would change her progression
trajectory pattern in the next programming activity or not (binary
class). The second evaluation (TASK Next Cluster) was a multi-
class classification task to predict the progression trajectory (i.e.,
cluster) of a student in the next programming activity. For both
of these evaluations, we trained four commonly used off-the-shelf
classifiers using scikit-learn [37]: random forests (RF), decision
196
Progression Trajectory-Based Student Modeling for Novice Block-Based Programming UMAP ’21, June 21–25, 2021, Utrecht, Netherlands
Table 4: (TASK Cluster Change) F1 weighted macro score for predicting if a student will change their current progressiontrajectory cluster (binary) in the next programming activity
Programming Activity Base Prev Base Major RF DT LR SVC
3 to 4 60% 60% 81% 79% 84% 87%4 to 5 44% 41% 61% 60% 56% 61%5 to 6 65% 51% 74% 60% 70% 78%6 to 7 55% 54% 67% 65% 62% 70%All 58% 51% 71% 67% 69% 72%
trees (DT), logistic regression (LR), and support vector classifica-
tion (SVC). We used a total of 30 features to train the models, which
included three sets of features: system log, prior performance, and
current activity progress. The system log features contain 16 event
logs, which are counts of the following actions: hint request, totalnumber of interactions with the environment, load last save, loadprevious exercise code, next instruction, previous instruction, run code,save workspace, change block, create block, delete block, move block,number of toolbox interactions, create variable, delete variable, andrename variable. The prior performance features group includes
9 features: total time spent up to the current moment, total pro-
gramming activities completed so far, total number of restarts so
far (as defined in 4.2), total number of trajectories belonged to each
cluster, total workspace cleared per restart so far, total number
of clusters changed so far, total cumulative final similarity so far.
These features are calculated using the cumulative sum of previous
programming activities for a particular student. Lastly, the current
activity progress features group includes 5 features: the number of
restarts in the current programming activity, current cluster, time
spent in current programming activity, is the current programming
activity successfully completed, and the final similarity percentage
of the current progression trajectory. We standardized all the fea-
tures to remain consistent across different feature scales. As the
distribution of clusters is imbalanced (see 4.1), the only parameter
manually tuned was the class_weight balanced (where applicable).
The remainder of the hyperparameters were set to the default val-
ues.
In the evaluations, we used two baselines. The first baseline,
BASE Major, predicts the majority class in both of the classification
tasks. The second baseline, BASE Prev, predicts the same class in
consecutive programming activities. For TASK Cluster Change, this
means carrying over the current cluster changing behavior in the
next programming activity, and for TASK Next Cluster, this means
carrying over the current cluster in the next programming activity.
For TASK Cluster Change, as both of the classes (students who will
remain in the same cluster, and students who will not) are equally
important to predict, and for TASK Next Cluster, as all three clusters
(EQ, HP, and EC) are equally important, we have reported the F1
scores, instead of precision or recall. We calculated F1 weighted
macro (average F1 score based on support on each class) to account
for class imbalance. For all evaluations, we performed 10-fold cross-
validation, repeated 10 times (total of 100 runs for each classifier),
and reported the median F1 scores (mean values are affected by
outliers [26]).
TASK Cluster Change: As shown in Table 4, when training
on All programming activity transitions, all classifiers outperform
(9 to 14% higher) the best baseline (BASE Prev). Examining each
programming activity transitions, we can see that it is easier to
predict the cluster change in Programming Activity 3 to 4 (19 to
27% higher than both baselines) and in Programming Activity 4 to
5 (12 to 17% higher than the best baseline), but it is more difficult to
predict cluster changes in later programming activity transitions,
specifically from Programming Activity 5 to 6 (-5 to 13% higher than
best baseline) and from ProgrammingActivity 6 to 7 (7 to 15% higher
than best baseline). In all cases, almost all the classifiers significantly
outperform the baselines. Also, note that SVC outperforms every
other classifier in all cases.
The fact that predicting cluster changes from Programming Ac-
tivity 5 to 6 is difficult is noteworthy. Looking into the programming
activity description (Table 1), Programming Activity 6 introduces
loops for the first time. It might be the case that students find this
concept hard to learn and struggle. As such, classification based on
prior knowledge is not sufficient.
TASK Next Cluster: As shown in Table 5, predicting specific
clusters (EQ, HP, and EC) is a challenging task because the three
classes are highly imbalanced. For All programming activity tran-
sitions, the F1 scores of the classifiers are ranging between 62%
(DT) to 68% (RF) whereas the Base Prev, which is the best baseline,
has a score of 64%. Improvements can be seen when predicting
clusters for Programming Activity 5 using data from Programming
Activity 4 (3 to 11% improvement over best baseline). Apart from
DT predicting clusters in Programming Activity 7, all the classifiers
perform somewhat similarly to the best baselines, if not better, in
all cases. Investigating further, we found that most of the time the
clusters are wrongly predicted as EC clusters. For example, for ran-
dom forest classifiers, the HP cluster gets misclassified as EC cluster76% of the time and EQ cluster gets misclassified as EC cluster 25%of the time. Similar results are seen across other classifiers. This
can be explained by the fact that the classes are mostly imbalanced,
having 60% of the progression trajectories belong to the EC clusterand only 16% belong to HP cluster. This makes the prediction of the
classifiers biased towards the EC cluster. Also note that HP cluster,being the uncertain cluster (Figure 3), can go either way and is
predicted towards the majority class (EC cluster).With regard to the final research question (RQ3), the two eval-
uations suggest that we can accurately predict students’ cluster
changing behaviors in consecutive programming activities, but pre-
dicting the exact cluster is somewhat difficult due to imbalanced
data.
197
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands Fahmid Fahid et al.
Table 5: (TASK Next Cluster) F1 weighted macro score for predicting the progression trajectory cluster (EQ, HP, or EC) of astudent in the next programming activity
Programming Activity Base Prev Base Major RF DT LR SVC
3 to 4 74% 75% 77% 72% 80% 76%
4 to 5 50% 34% 59% 53% 61% 59%
5 to 6 67% 41% 65% 65% 63% 67%6 to 7 69% 32% 68% 54% 62% 61%
All 64% 45% 68% 62% 66% 65%
5 DISCUSSIONUnderstanding students’ progression trajectories can provide in-
sight to guide the design of more effective adaptive learning envi-
ronments. Our analysis found three generic progression trajectory
patterns when students progress through individual programming
activities in block-based programming environments. Similar pat-
terns have also been seen in earlier studies across different pro-
gramming environments [22, 23, 39], but unlike this work, these
studies found student-specific patterns, rather than their behav-
ior in individual programming activities. However, our findings
echo the findings of previous studies. For example, students who
primarily belong to the EQ cluster (giving up early without spend-
ing much time or effort) across multiple programming activities
would likely be identified as stoppers or quitters whereas studentswho primarily belong to the HP cluster (spent significantly more
time to trying to solve a programming activity) across multiple
programming activities would likely be identified as tinkerers or ap-proachers. Similarly, student progression trajectories that primarily
belong to the EC cluster could be identified as movers, knowers, orsolvers. Overall, these student progression trajectory clusters offer
a fine-grained understanding of student behaviors in individual
programming activities and have significant potential to improve
learning environments in future.
The analysis in this paper also shows these clusters to have
significantly different characteristics across multiple dimensions.
Their distributions across programming activities show that earlier
programming activities with lower difficulties have lower number
of progression trajectories in EQ cluster whereas later programming
activities with higher difficulties have increasingly higher number
of progression trajectories in EQ clusters. This implies that students
tend to early-quit more often as difficulty increases. Similarly, stu-
dent progression trajectories in the HP cluster show significantly
more hint requests and restarts. Thus, students primarily belonging
to the HP cluster across multiple programming activities (HP Major)
are more persistent in the learning process and their outcomes are
reflected by their positive changes in distributions for pre-test and
post-test scores. As such, it can be seen that difficult activities need
better support or timely motivation to reduce early quitting while
ensuring higher persistence for better learning outcomes.
Thus, predicting these clusters early for individual students can
inform the design of adaptive support in learning environments.
For example, by identifying a student who will change their current
behavior in the upcoming activity, the environment can adapt its
support system to provide more or less student-adaptive hints or
feedback accordingly. Environments can also utilize these predictive
future behavior patterns to motivate students to increase their
engagement or to discourage early quitting.
6 CONCLUSIONStudent modeling for block-based programming environments can
inform student-adaptive learning experiences with feedback and
guidance that is tailored to individual students to help them mas-
ter programming concepts. However, devising student models for
block-based programming poses significant challenges because stu-
dents can pursue a multiplicity of paths to solutions. To address
these issues, we have introduced progression trajectory-based stu-
dent models for novice block-based programming environments.
Progression trajectory-based student models utilize a time-series
representation that employs code analysis to incrementally compare
the intermediate programming steps of students’ evolving programs
to expert programs. We investigated progression trajectory-based
student models with student programming interactions in an in-
troductory college-level computer science course. Analysis of the
models found that 1) there are three distinct patterns among such
trajectories, Early Quitting (EQ), High Persistence (HP), and EfficientCompletion (EC), which exhibit significantly different characteris-
tics across multiple dimensions, and 2) the models can be used to
accurately predict learning behaviors.
The results suggest several promising directions for future work.
First, it will be important to consider different embeddings tech-
niques of intermediate code states. For example, vector-based em-
beddings of the intermediate code states can reflect more granu-
lar and structured understanding of students’ program evolution.
Second, future work should investigate approaches to further im-
proving the predictability of next programming activity patterns
by considering different feature sets including incoming knowl-
edge, perceived skills, and program code. To this end, devising
feature extraction techniques appears promising. Third, it will be
instructive to investigate integrating trajectory-based student mod-
els into adaptive learning environments to understand how they
can most effectively contribute to student-adaptivity to improve
student learning.
ACKNOWLEDGMENTSThis work is supported by the National Science Foundation through
grants DUE-1626235 and DUE-1625908. Any opinions, findings,
and conclusions or recommendations expressed in this material are
those of the author(s) and do not necessarily reflect the views of
the National Science Foundation.
198
Progression Trajectory-Based Student Modeling for Novice Block-Based Programming UMAP ’21, June 21–25, 2021, Utrecht, Netherlands
REFERENCES[1] Jordan Barria-Pineda, Julio Guerra-Hollstein, and Peter Brusilovsky. 2018. A
fine-grained open learner model for an introductory programming course. In Pro-ceedings of the 26th Conference on User Modeling, Adaptation and Personalization,53–61.
[2] Jens Bennedsen and Michael E Caspersen. 2019. Failure rates in introductory
programming: 12 years later. ACM Inroads 10, 2 (2019), 30–36.[3] Paulo Blikstein. 2011. Using learning analytics to assess students’ behavior in
open-ended programming tasks. In Proceedings of the 1st International Conferenceon Learning Analytics and Knowledge, 110–116.
[4] Paulo Blikstein, Marcelo Worsley, Chris Piech, Mehran Sahami, Steven Cooper,
and Daphne Koller. 2014. Programming pluralism: Using learning analytics to
detect patterns in the learning of computer programming. Journal of LearningScience 23, 4 (2014), 561–599.
[5] Eric Brill and Robert C Moore. 2000. An improved error model for noisy channel
spelling correction. In Proceedings of the 38th Annual Meeting of the Associationfor Computational Linguistics, 286–293.
[6] Tyne Crow, Andrew Luxton-Reilly, and Burkhard Wuensche. 2018. Intelligent
tutoring systems for programming education: a systematic review. In Proceedingsof the 20th Australasian Computing Education Conference, 53–62.
[7] Nicholas Diana, Michael Eagle, John Stamper, Shuchi Grover, Marie Bienkowski,
and Satabdi Basu. 2017. An instructor dashboard for real-time analytics in in-
teractive programming assignments. In Proceedings of the Seventh InternationalLearning Analytics & Knowledge Conference, 272–279.
[8] Nicholas Diana, Michael Eagle, John Stamper, Shuchi Grover, Marie Bienkowski,
and Satabdi Basu. 2018. Data-driven generation of rubric criteria from an ed-
ucational programming environment. In Proceedings of the 8th InternationalConference on Learning Analytics and Knowledge, 16–20.
[9] Andrew Emerson, Michael Geden, Andy Smith, Eric Wiebe, Bradford Mott,
Kristy Elizabeth Boyer, and James Lester. 2020. Predictive Student Modeling
in Block-Based Programming Environments with Bayesian Hierarchical Models.
In Proceedings of the 28th ACM Conference on User Modeling, Adaptation andPersonalization, 62–70.
[10] Andrew Emerson, Andy Smith, Fernando J. Rodriguez, Eric N. Wiebe, BradfordW.
Mott, Kristy Elizabeth Boyer, and James C. Lester. 2020. Cluster-based analysis of
novice coding misconceptions in block-based programming. In Proceedings of the51st ACM Technical Symposium on Computer Science Education (2020), 825–831.
DOI:https://doi.org/10.1145/3328778.3366924
[11] Andrew Emerson, Andy Smith, Cody Smith, Fernando Rodríguez, Eric Wiebe,
Bradford Mott, Kristy Boyer, and James Lester. 2019. Predicting Early and Often:
Predictive Student Modeling for Block-Based Programming Environments. In
Proceedings of the 12th International Conferece on Educational Data Mining (2019),
39–48.
[12] Anthony Estey, Hieke Keuning, and Yvonne Coady. 2017. Automatically classify-
ing students in need of support by detecting changes in programming behaviour.
In Proceedings of the 2017 ACM SIGCSE Technical Symposium on Computer ScienceEducation, 189–194.
[13] Neil Fraser. 2013. Blockly: A visual programming editor. URL https//code. google.com/p/blockly 42, (2013).
[14] Dan Garcia, Brian Harvey, and Tiffany Barnes. 2015. The beauty and joy of
computing. ACM Inroads 6, 4 (2015), 71–79.[15] Elena L Glassman, Jeremy Scott, Rishabh Singh, Philip J Guo, and Robert C Miller.
2015. OverCode: Visualizing variation in student solutions to programming
problems at scale. ACM Transactions on Computer-Human Interactions (TOCHI)22, 2 (2015), 1–35.
[16] Jiawei Han, Jian Pei, Behzad Mortazavi-Asl, Helen Pinto, Qiming Chen, Umesh-
war Dayal, and Meichun Hsu. 2001. Prefixspan: Mining sequential patterns effi-
ciently by prefix-projected pattern growth. In Proceedings of the 17th InternationalConference on Data Engineering, IEEE Washington, DC, USA, 215–224.
[17] Juha Helminen, Petri Ihantola, Ville Karavirta, and Lauri Malmi. 2012. How
do students solve parsons programming problems? an analysis of interaction
traces. In Proceedings of the Ninth Annual International Conference on InternationalComputing Education Research, 119–126.
[18] Nathan Henderson, Vikram Kumaran, Wookhee Min, Bradford Mott, Ziwei Wu,
Danielle Boulden, Trudi Lord, Frieda Reichsman, Chad Dorsey, and Eric Wiebe.
2020. Enhancing Student Competency Models for Game-Based Learning with a
Hybrid Stealth Assessment Framework. In Proceedings of the 13th InternationalConference on Educational Data Mining (2020), 93-103.
[19] RoyaHosseini, Peter Brusilovsky, Michael Yudelson, and Arto Hellas. 2017. Stereo-
type modeling for Problem-Solving performance predictions in MOOCs and tradi-
tional courses. In Proceedings of the 25th Conference on User Modeling, Adaptationand Personalization, 76–84.
[20] Roya Hosseini, Arto Vihavainen, and Peter Brusilovsky. 2014. Exploring problem
solving paths in a Java programming course. In Psychology of ProgrammingInterest Group Annual Conference 2014, University of Pittsburgh, 65.
[21] Matthew C Jadud. 2006. Methods and tools for exploring novice compilation
behaviour. In Proceedings of the Second International Workshop on Computing
Education Research, 73–84.[22] Bo Jiang, Wei Zhao, Nuan Zhang, and Feiyue Qiu. 2019. Programming trajectories
analytics in block-based programming language learning. Interactive Learning En-vironments 4820, (2019), 1–14. DOI:https://doi.org/10.1080/10494820.2019.1643741
[23] Shamya Karumbaiah, Ryan S Baker, and Valerie Shute. 2018. Predicting Quitting
in Students Playing a Learning Game. In Proceedings of the 11th InternationalConference on Educational Data Mining (2018).
[24] Trupti M Kodinariya and Prashant R Makwana. 2013. Review on determining
number of Cluster in K-Means Clustering. International Journal 1, 6 (2013), 90–95.[25] Abe Leite and Saúl A Blanco. 2020. Effects of Human vs. Automatic Feedback on
Students’ Understanding of AI Concepts and Programming Style. In Proceedingsof the 51st ACM Technical Symposium on Computer Science Education, 44–50.
[26] Christophe Leys, Christophe Ley, Olivier Klein, Philippe Bernard, and Laurent Li-
cata. 2013. Detecting outliers: Do not use standard deviation around the mean, use
absolute deviation around the median. Journal of Experimental Social Psychology49, 4 (2013), 764–766.
[27] Zitao Liu, Songfan Yang, Jiliang Tang, Neil Heffernan, and Rose Luckin. 2020.
Recent advances in multimodal educational data mining in k-12 education. In
Proceedings of the 26th ACM SIGKDD International Conference on KnowledgeDiscovery & Data Mining, 3549–3550.
[28] Yuetian Luo and Zachary A. Pardos. 2018. Diagnosing University student sub-
ject proficiency and predicting degree completion in vector space. 32nd AAAIConference on Artificial Intelligence 32, 1, (2018), 7920–7927.
[29] Andrew Luxton-Reilly, Ibrahim Albluwi, Brett A Becker, Michail Giannakos, Am-
ruth N Kumar, Linda Ott, James Paterson, Michael James Scott, Judy Sheard, and
Claudia Szabo. 2018. Introductory programming: a systematic literature review.
In Proceedings Companion of the 23rd Annual ACM Conference on Innovation andTechnology in Computer Science Education, 55–106.
[30] Mehak Maniktala, Christa Cody, Amy Isvik, Nicholas Lytle, Min Chi, and Tiffany
Barnes. 2020. Extending the Hint Factory for the assistance dilemma: A novel,
data-driven HelpNeed Predictor for proactive problem-solving help. arXiv Prepr.arXiv2010.04124 (2020).
[31] Victor J Marin, Tobin Pereira, Srinivas Sridharan, and Carlos R Rivero. 2017.
Automated personalized feedback in introductory Java programming MOOCs.
In 2017 IEEE 33rd International Conference on Data Engineering (ICDE), IEEE,1259–1270.
[32] Wookhee Min, Megan H Frankosky, Bradford W Mott, Jonathan P Rowe, Eric
Wiebe, Kristy Elizabeth Boyer, and James C Lester. 2015. DeepStealth: leveraging
deep learning models for stealth assessment in game-based learning environ-
ments. In Proceedings of the 17th International Conference on Artificial Intelligencein Education, Springer, 277–286.
[33] Wookhee Min, Bradford Mott, Jonathan Rowe, Barry Liu, and James Lester.
2016. Player goal recognition in open-world digital games with long short-term
memory networks. In Porceedings of the 25th International Joint Conference onArtificial Intelligence (2016), 2590–2596.
[34] Benjamin Paaßen, Claudio Gallicchio, Alessio Micheli, and Barbara Hammer.
2018. Tree edit distance learning via adaptive symbol embeddings. In Proceedingof the 35th International Conference on Machine Learning, PMLR, 3976–3985.
[35] Benjamin Paaßen, Barbara Hammer, ThomasWilliam Price, Tiffany Barnes, Sebas-
tian Gross, and Niels Pinkwart. 2017. The continuous hint factory-providing hints
in vast and sparsely populated edit distance spaces. arXiv Prepr. arXiv1708.06564(2017).
[36] Zachary A Pardos, Steven Tang, Daniel Davis, and Christopher Vu Le. 2017.
Enabling real-time adaptivity in MOOCs with a personalized next-step recom-
mendation framework. In Proceedings of the Fourth (2017) ACM Conference onLearning@ Scale, 23–32.
[37] Fabian Pedregosa, Gaël Varoquaux, Alexandre Gramfort, Vincent Michel,
Bertrand Thirion, Olivier Grisel, Mathieu Blondel, Peter Prettenhofer, Ron Weiss,
and Vincent Dubourg. 2011. Scikit-learn: Machine learning in Python. The Journalof Machine Learning Research 12, (2011), 2825–2830.
[38] Radek Pelánek and Jiří Řihák. 2017. Experimental analysis of mastery learning
criteria. In Proceedings of the 25th Conference on User Modeling, Adaptation andPersonalization, 156–163.
[39] D N Perkins and Fay Martin. 1986. Fragile knowledge and neglected strategies
in novice programmers. In at Empirical Studies of Programmers, 1st Workshop,Washington, DC, 213–229.
[40] Chris Piech, Mehran Sahami, Daphne Koller, Steve Cooper, and Paulo Blikstein.
2012. Modeling how students learn to program. In Proceedings of the 43rd ACMtechnical symposium on Computer Science Education, 153–160.
[41] ThomasWPrice and Tiffany Barnes. 2015. Comparing textual and block interfaces
in a novice programming environment. In Proceedings of the Eleventh AnnualInternational Conference on International Computing Education Research, 91–99.
[42] Thomas W Price, Yihuan Dong, and Tiffany Barnes. 2016. Generating Data-
DrivenHints for Open-Ended Programming. In Proceedings of the 9th InternationalConference on Educational Data Mining (2016), 191-198.
[43] Thomas W Price, Yihuan Dong, and Dragan Lipovac. 2017. iSnap: towards intelli-
gent tutoring in novice programming environments. In Proceedings of the 2017ACM SIGCSE Technical Symposium on Computer Science Education, 483–488.
199
UMAP ’21, June 21–25, 2021, Utrecht, Netherlands Fahmid Fahid et al.
[44] Thomas W Price, Rui Zhi, and Tiffany Barnes. 2017. Hint generation under
uncertainty: The effect of hint quality on help-seeking behavior. In Proceedings ofthe 18th International Conference on Artificial Intelligence in Education, Springer,311–322.
[45] Alexander Renkl and Robert K Atkinson. 2003. Structuring the transition from
example study to problem solving in cognitive skill acquisition: A cognitive load
perspective. Educational Psychologist 38, 1 (2003), 15–22.[46] Mitchel Resnick, JohnMaloney, Andrés Monroy-Hernández, Natalie Rusk, Evelyn
Eastmond, Karen Brennan, Amon Millner, Eric Rosenbaum, Jay Silver, and Brian
Silverman. 2009. Scratch: programming for all. Communication of the ACM 52, 11
(2009), 60–67.
[47] Kelly Rivers, Erik Harpstead, and Kenneth R Koedinger. 2016. Learning curve
analysis for programming: Which concepts do students struggle with? In Proceed-ings of the 2016 ACM Conference on International Computing Education Research,143–151.
[48] Kelly Rivers and Kenneth R Koedinger. 2017. Data-driven hint generation in
vast solution spaces: a self-improving python programming tutor. InternationalJournal of Artificial Intelligence in Education 27, 1 (2017), 37–64.
[49] Fernando J Rodríguez, Kimberly Michelle Price, Joseph Isaac, Kristy Elizabeth
Boyer, and Christina Gardner-McCune. 2017. How block categories affect learner
satisfaction with a block-based programming interface. In 2017 IEEE Symposiumon Visual Languages and Human-Centric Computing (VL/HCC), IEEE, 201–205.
[50] Pavel Senin. 2008. Dynamic time warping algorithm review. Information andComputer Science Department University of Hawaii at Manoa Honolulu, USA 855,
1–23 (2008), 40.
[51] Melissa Stanger and Emmie Martin. 2016. The 50 Best Computer-Science and
Engineering Schools in America, 2015.
[52] Giovanni Maria Troiano, Sam Snodgrass, Erinç Argımak, Gregorio Robles,
Gillian Smith, Michael Cassidy, Eli Tucker-Raymond, Gillian Puttick, and Casper
Harteveld. 2019. Is my game OK Dr. Scratch? Exploring programming and com-
putational thinking development via metrics in student-designed serious games
for STEM. In Proceedings of the 18th ACM International Conference on InteractionDesign and Children, 208–219.
[53] Lisa Wang, Angela Sy, Larry Liu, and Chris Piech. 2017. Learning to represent
student knowledge on programming exercises using deep learning. In Proceedingsof the 10th International Conference on Educational Data Mining, EDM 2017 (2017),
324–329.
[54] Christiane Gresse VonWangenheim, Jean C RHauck, Matheus Faustino Demetrio,
Rafael Pelle, Nathalia da Cruz Alves, Heliziane Barbosa, and Luiz Felipe Azevedo.
2018. CodeMaster–Automatic Assessment and Grading of App Inventor and
Snap! Programs. Informatics in Education. 17, 1 (2018), 117–150.[55] Christopher Watson and Frederick W B Li. 2014. Failure rates in introductory
programming revisited. In Proceedings of the 2014 Conference on Innovation &Technology in Computer Science Education, 39–44.
[56] David Weintrop and Uri Wilensky. 2017. Comparing block-based and text-based
programming in high school computer science classrooms. ACM Transactions onComputer Education 18, 1 (2017), 1–25.
[57] Joseph B Wiggins, Fahmid M Fahid, Andrew Emerson, Madeline Hinckle, Andy
Smith, Kristy Elizabeth Boyer, Bradford Mott, Eric Wiebe, and James Lester. 2021.
Exploring Novice Programmers’ Hint Requests in an Intelligent Block-Based
Coding Environment. In Proceedings of the 52nd ACM Technical Symposium onComputer Science Education (2021), 52-58.
[58] Jeannette M Wing. 2014. Computational thinking benefits society. 40th Anniver-sary Blog on Social Issues in Computing (2014), 26.
[59] Rui Zhi, Thomas W Price, Samiha Marwan, Alexandra Milliken, Tiffany Barnes,
and Min Chi. 2019. Exploring the impact of worked examples in a novice pro-
gramming environment. In Proceedings of the 50th ACM Technical Symposium onComputer Science Education, 98–104.
200