It’s not a bug, it’s a feature: New study explores mentoring program activities

Lyons, M. D., & McQuillin, S. D. (2021). It’s Not a Bug, It’s a Feature: Evaluating Mentoring Programs with Heterogeneous Activities. Child & Youth Care Forum.

Summarized by Ariel Ervin

Notes of Interest: 

  • Despite the prevalence of mentoring relationships between non-parental figures and youth, many evaluations of these relationships show only modest effects on youth-related outcomes.
  • This paper argues that heterogeneity in mentoring activities reflects how mentoring is commonly understood and is an attractive feature for programs and researchers (it’s not a “bug” that needs to be fixed).
    • It also argues that heterogeneous mentoring activities can make it harder for programs and researchers to evaluate the effects interventions have on youth-related outcomes.
  • Findings suggest that…
    • Treatment effects tend to be smaller whenever mentoring activities aren’t measured properly. 
    • Observed and predicted treatment effects mirror each other when there is a good match between mentees and the treatment activities.  
    • If there isn’t a good match between program practices and desired youth-related outcomes, null effects occur. 
  • Researchers need to focus on evaluating mentoring program practices rather than just assessing mentoring programs as a whole.     
    • This helps ensure that mentoring activities align well with desired outcomes.

Introduction (Reprinted from the Abstract)


Mentoring programs pair non-familial adults with children and adolescents for the purposes of promoting positive youth development. Although these programs are widely popular, evaluations tend to show that mentoring programs have, on average, modest effects on youth outcomes. Some researchers have suggested that mentoring programs should homogenize mentoring activities as a means for increasing effect sizes of programs. 


This paper describes why heterogeneity of mentoring activities should not necessarily be regarded as a problem (i.e., a bug) that needs correction; rather it is more representative of the construct of mentoring as it is popularly understood and also desirable because of the potential to improve access and quality of prevention services (i.e., it is a feature).


We present different simulated scenarios demonstrating how evaluations of mentoring programs may change the estimates of treatment effects depending on how evaluators measure programmatic activities and approach analyses.


Analyses illustrate that, under commonly used evaluation strategies, treatment effects may be underestimated when mentoring activities are not measured and are paired with common analytic approaches (e.g., intent-to-treat analyses). Simulated scenarios also highlight alternative approaches for defining programmatic elements and evaluating programs to produce a more robust estimate of effects.


The optimal strategy for evaluating mentoring services depends on the particular features of the program as well as the goals of the evaluation. One approach researchers might take is to evaluate specific mentoring practices, before evaluating mentoring programs, to begin to understand program impact.

Implications (Reprinted from the Discussion)

Mentoring programs are one type of prevention program characterized by heterogeneous (and, often, unspecified) program practices designed to target a wide range of positive, developmental outcomes for youth (Rhodes et al. 2006). Mentoring programs, for example, often aim to reduce behavioral problems in school, improve grades, strengthen peer relationships, and reduce truancy and delinquency (Garringer et al. 2017). The rationale for mentoring is largely based on developmental research demonstrating that youth-adult relationships are critical for preventing unwanted outcomes and promoting positive development (Rhodes et al. 2006). As a result, mentoring programs are often defined in terms of relational qualities rather than specific mentoring practices (i.e., actions that mentors and mentees take to achieve desired outcomes) (McQuillin et al. 2020). However, when researchers do not define program practices, researchers cannot make inferences about what program practices (if any) caused a change within the mentee. In addition, failing to define program practices means that other mentoring programs cannot replicate the program practices, nor can researchers understand why mentoring programs work (Gottfredson et al. 2015). The purpose of this paper was to illustrate how failing to specify program practice biases estimates of the treatment effects and present alternatives for programs to consider when developing and evaluating prevention programs meant to target heterogeneous outcomes through heterogeneous practices. The results from the simulations presented provide insight into the tradeoffs that programs might make with respect to matching, training, and supporting mentors to achieve desired outcomes.

It is a Bug: Reducing Mentoring Adaptations

The first simulated scenario illustrated common practice in which mentor-mentee activities are unmeasured (i.e., lacking treatment construct validity) and unaccounted for (i.e., lacking statistical conclusion validity). The results show that overall treatment effects were smaller than the expected effects when mentoring activities were unaccounted for. Smaller treatment effects were observed because a portion of mentors in this scenario engaged in ineffective mentoring practices while a portion engaged in effective practices. Thus, failing to measure the activities in which mentors and mentees engage means that inferences cannot be made about the effects of mentors on desired outcomes (i.e., it is a bug of the intervention). The underestimation of effects in real-world evaluations is likely much worse, as it is typical that many more outcomes are evaluated and the possible activities range in their number, effectiveness, and implementation (as illustrated in the scenario replicating SMP results). Thus, it is not reasonable to expect any scientifically or practically useful information to be gained from testing unspecified treatment packages with multiple outcomes, as is the case in most evaluations of mentoring.
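
The dilution described in this scenario can be illustrated with a small simulation. The sketch below is not the authors' code, and every number in it is an assumption chosen for illustration: half of the mentors use an effective practice (a 0.5 standard-deviation benefit), the other half use an ineffective one (no benefit), and outcomes are standard normal. The function name simulate_itt and its parameters are hypothetical. Under these assumptions, the pooled intent-to-treat estimate should land near 0.5 × 0.5 + 0.5 × 0 = 0.25, roughly half of what the effective practice delivers.

```python
import random
import statistics


def simulate_itt(n_per_arm=20000, share_effective=0.5,
                 effect_effective=0.5, effect_ineffective=0.0, seed=0):
    """Toy simulation (all values are illustrative assumptions).

    Returns two estimates of the mentoring effect, in standard-deviation units:
    - itt: all mentored youth vs. controls, with activities unmeasured
    - by_practice: only youth whose mentors used the effective practice vs. controls
    """
    rng = random.Random(seed)

    # Control group: no mentoring, outcomes centered on zero.
    control = [rng.gauss(0.0, 1.0) for _ in range(n_per_arm)]

    treated = []
    used_effective = []
    for _ in range(n_per_arm):
        # Each mentor happens to use either the effective or the ineffective practice.
        effective = rng.random() < share_effective
        shift = effect_effective if effective else effect_ineffective
        treated.append(rng.gauss(shift, 1.0))
        used_effective.append(effective)

    control_mean = statistics.fmean(control)

    # Intent-to-treat: pool every mentored youth, ignoring what mentors did.
    itt = statistics.fmean(treated) - control_mean

    # Practice-level estimate: only possible because the activity was recorded.
    effective_outcomes = [y for y, e in zip(treated, used_effective) if e]
    by_practice = statistics.fmean(effective_outcomes) - control_mean

    return itt, by_practice


if __name__ == "__main__":
    itt, by_practice = simulate_itt()
    print(f"Intent-to-treat estimate:     {itt:.2f}")
    print(f"Effective-practice estimate:  {by_practice:.2f}")
```

In this toy setup, the practice a mentor uses is assigned at random, so comparing the effective-practice subgroup with the control group is unbiased. In a real evaluation, mentors choose their activities, so a comparison like this would require measured activities and a design that addresses selection.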
