Methods & Evidence
Troika Assessment Score
A transparent summary of the psychometric foundation behind the 63-item formation diagnostic — built on the ABC Leadership Model and anchored in Galatians 5:22-23.
“Formation is the metric. Fruit is the evidence. Character is the credential.”
What It Measures
Nine virtues of the Fruit of the Spirit, organized into three co-equal, concurrent, and inseparable dimensions.
Attitude
Inner Compass
- Love (Agape)
- Joy (Chara)
- Peace (Eirēnē)
Behavior
Visible Practice
- Patience (Makrothumia)
- Kindness (Chrēstotēs)
- Goodness (Agathōsunē)
Character
Moral Core
- Faithfulness (Pistis)
- Gentleness (Prautēs)
- Self-Control (Enkrateia)
Instrument at a Glance
| Metric | Value |
|---|---|
| Total items | 63 |
| Scored items | 54 |
| Virtues | 9 |
| Items per virtue | 6 |
| Social desirability (SD) items | 9 |
| Scale | 6-point Likert |
Item Composition
| Component | Items | Purpose |
|---|---|---|
| Primary scoring items | 36 | 4 positively-worded items per virtue |
| Reverse-coded items | 18 | 2 reverse-worded items per virtue (response consistency) |
| Social desirability items | 9 | 1 per virtue (impression management detection) |
| Total administered | 63 | 54 scored + 9 validity |
How Scores Work
Per-Virtue Score (0–100)
First reverse-code the two reverse-worded items (recoded = 7 − raw score). Then take the mean of the virtue's 6 scored items and rescale it from the 1–6 response range to 0–100: score = (mean − 1) / 5 × 100
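The per-virtue calculation can be sketched in a few lines of Python. This is a minimal illustration, not the production scoring code: the function name `virtue_score` and the `reverse_flags` argument are hypothetical, and responses are assumed to be coded 1 to 6, as the formula implies.

```python
from statistics import mean

SCALE_MAX = 6  # assumption: 6-point Likert responses are coded 1..6


def virtue_score(responses, reverse_flags):
    """Return the 0-100 score for one virtue from its six scored items.

    responses     -- raw answers (1..6) for the six scored items
    reverse_flags -- True where the item is reverse-worded
    """
    if len(responses) != 6 or len(reverse_flags) != 6:
        raise ValueError("expected 6 scored items per virtue")
    if any(not 1 <= r <= SCALE_MAX for r in responses):
        raise ValueError("responses must be between 1 and 6")

    # Reverse-code flagged items: 7 - raw score.
    recoded = [
        (SCALE_MAX + 1 - r) if is_reversed else r
        for r, is_reversed in zip(responses, reverse_flags)
    ]
    # Rescale the 1..6 mean onto 0..100.
    return (mean(recoded) - 1) / 5 * 100


# Hypothetical respondent: four positively worded items, then two reverse-worded.
example = virtue_score([5, 6, 5, 4, 2, 1], [False] * 4 + [True] * 2)
print(round(example, 1))
```

In this example the two reverse-worded answers (2 and 1) become 5 and 6 after recoding, so low agreement with a reverse-worded item counts toward a higher score.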
Dimension Score
Mean of the 3 virtue scores within each dimension (Attitude, Behavior, Character).
Total Score
Mean of all 9 virtue scores.
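As a rough sketch of the roll-up, the dimension and total scores could be computed from a mapping of virtue names to 0–100 scores. The `DIMENSIONS` mapping follows the grouping listed under What It Measures; the function names and the sample scores are hypothetical.

```python
from statistics import mean

# Virtue-to-dimension grouping as listed in this document.
DIMENSIONS = {
    "Attitude": ["Love", "Joy", "Peace"],
    "Behavior": ["Patience", "Kindness", "Goodness"],
    "Character": ["Faithfulness", "Gentleness", "Self-Control"],
}


def dimension_scores(virtue_scores):
    """Mean of the three virtue scores within each dimension."""
    return {
        dimension: mean(virtue_scores[v] for v in virtues)
        for dimension, virtues in DIMENSIONS.items()
    }


def total_score(virtue_scores):
    """Mean of all nine virtue scores."""
    return mean(virtue_scores.values())


# Hypothetical virtue scores for one respondent.
sample = {
    "Love": 80.0, "Joy": 70.0, "Peace": 90.0,
    "Patience": 60.0, "Kindness": 50.0, "Goodness": 70.0,
    "Faithfulness": 75.0, "Gentleness": 65.0, "Self-Control": 85.0,
}
print(dimension_scores(sample))
print(round(total_score(sample), 1))
```

Because every dimension holds exactly three virtues, the total score also equals the mean of the three dimension scores.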
Interpretation Bands
| Band | Score | Meaning |
|---|---|---|
| Beginning | 0–25 | Foundational awareness is emerging |
| Developing | 26–50 | Practices are taking root |
| Maturing | 51–75 | Fruit is becoming visible |
| Flourishing | 76–100 | Fruit is evident to others |
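A band lookup might be sketched as below. The bands are published with whole-number edges, while computed scores can be fractional, so this sketch makes an assumption the source does not state: each upper edge is treated as inclusive, and a score such as 25.5 falls into the next band. The name `interpretation_band` is hypothetical.

```python
# (inclusive upper edge, band label); treating edges as inclusive is an assumption.
BANDS = [
    (25, "Beginning"),
    (50, "Developing"),
    (75, "Maturing"),
    (100, "Flourishing"),
]


def interpretation_band(score):
    """Map a 0-100 score to its interpretation band."""
    if not 0 <= score <= 100:
        raise ValueError("score must be between 0 and 100")
    for upper_edge, label in BANDS:
        if score <= upper_edge:
            return label


print(interpretation_band(64.2))
```

A different rule, such as rounding to the nearest whole number before the lookup, would move some fractional scores into the lower band; the documented ranges alone do not settle which rule applies.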
Fracture Detection
A gap of 20+ points between any two dimension scores is flagged as a “fracture.” A fracture is:
- A signal, not a verdict
- An invitation to explore one area of formation more intentionally
- Common: most people show some variation across dimensions
- Recoverable: fractures point toward growth, not failure
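The 20-point rule can be illustrated with a short sketch that compares every pair of dimension scores. The function name `find_fractures` and the sample scores are hypothetical; only the threshold comes from the description above.

```python
from itertools import combinations

FRACTURE_THRESHOLD = 20  # gap of 20+ points, per the description above


def find_fractures(dimension_scores, threshold=FRACTURE_THRESHOLD):
    """Return (dimension_a, dimension_b, gap) for each pair at or above the threshold."""
    fractures = []
    for (name_a, score_a), (name_b, score_b) in combinations(dimension_scores.items(), 2):
        gap = abs(score_a - score_b)
        if gap >= threshold:
            fractures.append((name_a, name_b, gap))
    return fractures


# Hypothetical dimension scores for one respondent.
scores = {"Attitude": 80.0, "Behavior": 58.0, "Character": 75.0}
print(find_fractures(scores))
```

Here only the Attitude and Behavior pair is flagged (a 22-point gap); the 17-point gap between Behavior and Character stays below the threshold.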
Built-In Safeguards
Three built-in checks help protect the integrity of every administration.
Social Desirability Scale
9 items (one per virtue) detect impression management — when someone presents an overly favorable self-image rather than responding authentically.
Response Consistency Check
Compares primary and reverse-coded items within each virtue (rcPair). Flags discrepancies of 4+ points — the same intra-scale approach used by NEO-PI-R and BFI.
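One way such a check could work is sketched below. The source does not spell out how the primary and reverse-coded items are compared, so this sketch assumes the comparison is between the mean of the four primary items and the mean of the two reverse-worded items after recoding. The function names are hypothetical and are not the `rcPair` implementation itself.

```python
from statistics import mean

DISCREPANCY_THRESHOLD = 4  # flag discrepancies of 4+ points


def consistency_gap(primary, reverse_raw):
    """Absolute gap between primary items and recoded reverse items (assumed method)."""
    recoded = [7 - r for r in reverse_raw]  # same 7 - raw recoding used in scoring
    return abs(mean(primary) - mean(recoded))


def is_inconsistent(primary, reverse_raw, threshold=DISCREPANCY_THRESHOLD):
    """True when the gap for one virtue reaches the flagging threshold."""
    return consistency_gap(primary, reverse_raw) >= threshold


# Hypothetical respondent who strongly agrees with every item, including reverse-worded ones.
print(is_inconsistent([6, 6, 6, 6], [6, 5]))
# Hypothetical respondent whose answers agree once the reverse items are recoded.
print(is_inconsistent([5, 5, 4, 6], [2, 1]))
```

On a 1–6 scale the largest possible gap is 5, so a threshold of 4 only flags near-contradictory response patterns.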
Internal Reliability
Cronbach's alpha computed per administration for total and subscale scores as a real-time quality check.
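For readers unfamiliar with the statistic, Cronbach's alpha is α = k / (k − 1) × (1 − Σ item variances / variance of total scores), where k is the number of items. The sketch below applies that standard formula to a small, made-up response matrix. Alpha depends on variation across respondents, so the input is assumed to be a group of completed responses that have already been reverse-coded; the function name and data are hypothetical.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of k item scores."""
    n_items = len(rows[0])
    if n_items < 2 or len(rows) < 2:
        raise ValueError("need at least 2 items and 2 respondents")
    # Sample variance of each item across respondents.
    item_variances = [variance(column) for column in zip(*rows)]
    # Sample variance of each respondent's summed score.
    total_variance = variance([sum(row) for row in rows])
    return n_items / (n_items - 1) * (1 - sum(item_variances) / total_variance)


# Made-up data: four respondents answering three items.
responses = [
    [4, 5, 4],
    [2, 3, 3],
    [5, 5, 6],
    [3, 2, 3],
]
print(round(cronbach_alpha(responses), 3))
```

The result would then be compared with the α ≥ 0.70 threshold used in the reliability evidence.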
Reliability Evidence
Reliability refers to the consistency of scores. We use three complementary methods.
| Method | Purpose | Threshold |
|---|---|---|
| Cronbach's Alpha | Internal consistency — do items within each scale hang together? | α ≥ 0.70 |
| McDonald's Omega | Model-based reliability that accounts for varying item strengths | ω ≥ 0.70 |
| Test-Retest | Temporal stability — do scores remain consistent when conditions haven't changed? | r ≥ 0.70 |
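The test-retest criterion is a Pearson correlation between two administrations to the same people. A minimal sketch follows, using made-up total scores; the function name, the data, and the choice to compute r by hand rather than with a statistics library are illustrative only.

```python
from math import sqrt

RETEST_THRESHOLD = 0.70  # r >= 0.70, from the table above


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    covariance_sum = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariance_sum / (spread_x * spread_y)


# Made-up total scores for five respondents at two time points.
first_sitting = [62, 75, 48, 81, 55]
second_sitting = [65, 72, 50, 85, 58]
r = pearson_r(first_sitting, second_sitting)
print(round(r, 2), r >= RETEST_THRESHOLD)
```

McDonald's omega is not sketched here because it requires fitting a factor model, which goes beyond a short standard-library example.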
Validity Evidence
Following the Standards for Educational and Psychological Testing (AERA/APA/NCME, 2014).
| Analysis | Question | Threshold |
|---|---|---|
| Exploratory Factor Analysis | Do items group into the expected Attitude/Behavior/Character dimensions? | Loading ≥ 0.30 |
| Confirmatory Factor Analysis | Does the 3-factor or 9-factor model fit the data adequately? | CFI ≥ 0.90, RMSEA ≤ 0.08 |
| Convergent Validity | Do scores relate to self-rated spiritual growth and practice frequency? | r ≥ 0.30 |
| Known-Groups | Do leaders score differently from non-leaders, as the framework predicts? | Cohen's d ≥ 0.20 |
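For the known-groups comparison, Cohen's d expresses the difference between two group means in units of their pooled standard deviation. The sketch below uses the common pooled-variance form with made-up scores; the function name and data are hypothetical, and the source does not say which variant of d is used.

```python
from math import sqrt
from statistics import mean, variance

KNOWN_GROUPS_THRESHOLD = 0.20  # Cohen's d >= 0.20, from the table above


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled sample standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / sqrt(pooled_variance)


# Made-up total scores for two small groups.
leaders = [70, 75, 80]
non_leaders = [65, 70, 75]
d = cohens_d(leaders, non_leaders)
print(d, d >= KNOWN_GROUPS_THRESHOLD)
```

With these numbers both groups have a standard deviation of 5 and their means differ by 5 points, so d works out to 1.0.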
Fairness Commitment
Scores must function equivalently across demographic groups. We check for systematic advantages or disadvantages through:
- Group mean comparisons by gender, age, and other demographics: differences exceeding 0.5 standard deviations are flagged
- Differential Item Functioning (DIF): as sample sizes grow, individual items are tested for group bias
- Content review: items are reviewed for cultural sensitivity, gender neutrality, and theological breadth
- Criterion-referenced interpretation: all bands are absolute, not normed on any demographic group
What This Means for You
Formation Guides & Pastors
The Troika Assessment gives you a structured starting point for formation conversations. It reveals patterns — not just a number — and always points toward invitation, never judgment.
Respondents
Your scores reflect your self-perception at a moment in time. They are a gift to yourself — an honest mirror held up to your formation journey. There are no “bad” scores, only honest ones.
Researchers & Institutions
The full Technical Manual with detailed statistical tables is available upon request. We are committed to transparency and welcome research collaboration.
Ongoing Validation
The Troika Assessment Score is under active validation, so the psychometric evidence base will grow as more people complete the assessment. We report all findings transparently, including limitations, and refine the instrument based on evidence, not marketing.
The assessment is designed for use within a pastoral formation context. It is not a clinical diagnostic tool, performance evaluation, or gatekeeping instrument. Results are always interpreted as formation invitations.
“But the fruit of the Spirit is love, joy, peace, forbearance, kindness, goodness, faithfulness, gentleness and self-control.” (Galatians 5:22-23)
