The aim of this study was to evaluate the inter-rater reliability of the Phillips checklist, a proposed framework for the quality assessment of modeling studies. Six raters evaluated nine modeling studies from three different medical specialties. The intra-class correlation (ICC) and the corresponding variance components were estimated from these studies. Raters were also asked to comment on their experience with the framework. Although the overall analysis showed no significant rater effect (mean ICC = 0.69, p = 0.064), presumably as a result of lower study variability, there was a significant rater effect for clopidogrel only (p