Diabetes mellitus is one of the most prevalent chronic diseases worldwide. Type 1 diabetes (T1D) differs from type 2 diabetes (T2D) in its autoimmune origin and typically earlier onset. Accurately identifying T1D cases in primary care electronic medical record (EMR) databases matters to healthcare providers, researchers, and policymakers alike. An exploratory study led by researchers from the University of Calgary and the University of Alberta developed and validated a case definition for T1D using primary care EMRs. This article summarizes their findings, published in CMAJ Open.

The Study at a Glance

The study, “Developing a case definition for type 1 diabetes mellitus in a primary care electronic medical record database: an exploratory study” (DOI: 10.9778/cmajo.20180142), used EMR data from the Southern Alberta Primary Care Network, part of the Canadian Primary Care Sentinel Surveillance Network (CPCSSN), covering 2008 to 2016. The aim was to develop and validate a case definition that could reliably distinguish T1D from T2D in these records.

Methodology and Findings

The researchers identified 1,309 individuals with diabetes, and with the aid of family physicians, confirmed 110 T1D cases to create a reference standard. They employed three decision-tree classification algorithms and least absolute shrinkage and selection operator (LASSO) logistic regression to pinpoint variables that accurately differentiated T1D from T2D cases.
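To illustrate how a decision tree can single out a discriminating variable, the sketch below picks the age threshold that best separates T1D from non-T1D cases by minimizing Gini impurity, a criterion commonly used by CART-style trees. It is a minimal illustration, not the authors' code: the function names and the ten-patient cohort are hypothetical, and a real tree would search many variables and split recursively.

```python
from typing import List, Tuple


def gini(labels: List[bool]) -> float:
    """Gini impurity of a set of binary labels (0.0 means perfectly pure)."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)  # share of T1D cases in this group
    return 2 * p * (1 - p)


def best_age_split(ages: List[float], is_t1d: List[bool]) -> Tuple[float, float]:
    """Return (threshold, weighted impurity) for the best 'age < threshold' split."""
    values = sorted(set(ages))
    # Candidate thresholds are midpoints between consecutive distinct ages.
    candidates = [(a + b) / 2 for a, b in zip(values, values[1:])]
    best = (float("nan"), float("inf"))
    for t in candidates:
        left = [y for a, y in zip(ages, is_t1d) if a < t]
        right = [y for a, y in zip(ages, is_t1d) if a >= t]
        # Weight each side's impurity by the number of patients it holds.
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(ages)
        if score < best[1]:
            best = (t, score)
    return best


# Hypothetical toy cohort: age at first diabetes diagnosis and reference label.
ages = [9, 14, 19, 21, 34, 45, 52, 60, 67, 71]
is_t1d = [True, True, True, True, False, False, True, False, False, False]

threshold, impurity = best_age_split(ages, is_t1d)
print(f"best split: age < {threshold} (weighted Gini {impurity:.3f})")
```

On this toy cohort the chosen threshold is simply where the made-up labels happen to separate best; it is not meant to reproduce the study's age cut-off.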

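The first algorithm classified a patient as having T1D if “type 1” appeared in the chart's text or if the patient was younger than 22 years at the time of initial diabetes diagnosis; it achieved 42.7% sensitivity and 99.3% specificity. The second algorithm, which combined free-text terms, insulin prescriptions, and age, reached 87.3% sensitivity and 85.4% specificity. In other words, the first rule rarely mislabels T2D patients but misses more than half of true T1D cases, while the second finds most T1D cases at the cost of more false positives.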
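The first rule and the two reported metrics can be sketched in a few lines. This is an illustration under stated assumptions, not the study's implementation: the `Patient` record, its field names, and the eight-patient cohort are hypothetical, and the naive substring match ignores negations such as "not type 1". The helper `approx_counts` converts the reported rates into rough patient counts, assuming all 1,309 patients (110 T1D, 1,199 others) were in the validation set; those counts are derived approximations, not figures taken from the paper.

```python
from typing import Dict, Iterable, NamedTuple, Tuple


class Patient(NamedTuple):
    """Hypothetical, simplified EMR record; field names are illustrative."""
    notes: str               # free text from the chart
    age_at_diagnosis: float  # age in years at first diabetes diagnosis
    has_t1d: bool            # reference-standard label from physician review


def rule_one(p: Patient) -> bool:
    """Flag T1D if the notes mention "type 1" or diagnosis came before age 22."""
    return "type 1" in p.notes.lower() or p.age_at_diagnosis < 22


def sensitivity_specificity(patients: Iterable[Patient]) -> Tuple[float, float]:
    """Compare rule_one against the reference labels."""
    tp = fn = tn = fp = 0
    for p in patients:
        flagged = rule_one(p)
        if p.has_t1d:
            tp += flagged       # true T1D case, correctly flagged
            fn += not flagged   # true T1D case, missed
        else:
            fp += flagged       # non-T1D patient, wrongly flagged
            tn += not flagged   # non-T1D patient, correctly left alone
    return tp / (tp + fn), tn / (tn + fp)


def approx_counts(sens: float, spec: float, n_pos: int, n_neg: int) -> Dict[str, int]:
    """Turn reported rates into approximate confusion-matrix counts."""
    tp = round(sens * n_pos)
    tn = round(spec * n_neg)
    return {"tp": tp, "fn": n_pos - tp, "fp": n_neg - tn, "tn": tn}


# Hypothetical toy cohort, invented for illustration only.
cohort = [
    Patient("Type 1 DM, on insulin pump", 30, True),
    Patient("diabetes, insulin", 15, True),
    Patient("diabetes mellitus, basal-bolus insulin", 35, True),
    Patient("diabetes mellitus", 41, True),
    Patient("type 2 DM, metformin", 58, False),
    Patient("T2DM, diet controlled", 64, False),
    Patient("diabetes, metformin", 19, False),
    Patient("type 2 diabetes", 47, False),
]

print(sensitivity_specificity(cohort))
print(approx_counts(0.427, 0.993, 110, 1199))  # first algorithm
print(approx_counts(0.873, 0.854, 110, 1199))  # second algorithm
```

Under that assumption, the first algorithm would flag roughly 47 of the 110 T1D patients with about 8 false positives, while the second would flag roughly 96 of them along with about 175 patients who do not have T1D.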

Implications and Future Directions

These findings could improve the detection and classification of T1D in electronic health records, which in turn supports disease surveillance, research validity, and personalized patient care. However, the authors recommend further validation and testing with larger and more diverse cohorts to establish how well these algorithms generalize.

Importance for T1D Care and Surveillance

Accurate T1D identification methods matter for diabetes care for several reasons:

1. Enhanced Patient Management: Proper classification empowers healthcare professionals to tailor treatment plans more effectively and manage T1D more efficiently.
2. Improved Research: Accurate case definitions enable researchers to carry out epidemiological studies, clinical trials, and outcomes research with greater precision.
3. Policy and Planning: For healthcare systems and policymakers, accurately identifying T1D prevalence informs resource allocation, program development, and policy-making.


This article draws on the following references:

1. Lipscombe LL, Hux JE. “Trends in diabetes prevalence, incidence and mortality in Ontario, Canada 1995–2005: a population-based study.” Lancet. 2007;369:750–6.
2. Hux JE, Ivis F, Flintoft V, et al. “Diabetes in Ontario: determination of prevalence and incidence using a validated administrative data algorithm.” Diabetes Care. 2002;25:512–6.
3. Amed S, Vanderloo SE, Metzger D, et al. “Validation of diabetes case definitions using administrative claims data.” Diabet Med. 2011;28:424–7.
4. Garies S, Birtwhistle R, Drummond N, et al. “Data resource profile: national electronic medical record data from the Canadian Primary Care Sentinel Surveillance Network (CPCSSN).” Int J Epidemiol. 2017;46:1091–1092f.
5. Therneau TM, Atkinson EJ. “An introduction to recursive partitioning using the RPART routines.” Rochester (MN): Mayo Foundation; 2018.




Diabetes care is changing as digital tools and data analytics methods mature. The work by the University of Calgary and University of Alberta team offers a starting point for using primary care EMRs to improve the quality of diabetes care. Accurate T1D classification is more than an academic exercise: it is groundwork for combining machine learning with clinical care to support more tailored patient management and better outcomes for people living with diabetes.