How does Typo calculate PR health?

PR Health Scorer is a modular scoring engine that quantifies the overall “health” of a Pull Request (PR) based on a mix of code-level metrics and critical-issue signals.

It produces a weighted score (0–100) and assigns a risk label (Excellent → Very Risky).

It is invoked during PR review to provide a holistic quality indicator that reflects:

  • PR scope and size (diff stats)

  • Code complexity (cyclomatic complexity)

  • Presence of critical issues (from validation phase)

Core Configuration

Categories that affect the score:

CATEGORIES = { 
    "diff_size", 
    "files_changed", 
    "code_complexity_score" 
}
  • Diff size - large diffs are risky.

  • Files changed - more files = higher review complexity.

  • Cyclomatic complexity - indicates logical complexity and maintainability risk.

Each metric is normalized to a 0–100 scale internally, then combined as a weighted average.

Label Definitions

LABELS = {
    "EXCELLENT": "`Excellent` 🔥",
    "GOOD": "`Good` ✅",
    "NEEDS_ATTENTION": "`Needs Attention` ⚠",
    "RISKY": "`Risky` 🚫",
    "VERY_RISKY": "`Very Risky` 🛑",
}

These are human-readable labels used directly in the AI Code Review summary. They represent progressively decreasing confidence in the PR’s stability and reviewability.

Scoring & Aggregation Logic

  1. Category Scoring

Each raw metric is normalized to a 0–100 score. Diff size and file count use hard thresholds designed around empirical risk points; cyclomatic complexity uses a linear decay.

Lines Changed | Score | Interpretation
< 350         | 100   | Compact, easy to review
< 700         | 80    | Manageable
< 900         | 60    | Slightly heavy
< 1280        | 30    | Harder to review
< 1500        | 15    | Very heavy
> 1500        | 0     | Overloaded PR
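
The diff-size tiers above can be sketched as a simple step function. This is a minimal illustration, not Typo's actual implementation: the names `DIFF_SIZE_TIERS` and `score_diff_size` are hypothetical, and treating exactly 1500 changed lines as the lowest tier is an assumption, since the table leaves that boundary unspecified.

```python
# Hypothetical sketch of the diff-size tiers listed above.
# Each entry is (exclusive upper bound on lines changed, score).
DIFF_SIZE_TIERS = [
    (350, 100),
    (700, 80),
    (900, 60),
    (1280, 30),
    (1500, 15),
]


def score_diff_size(lines_changed: int) -> int:
    """Return the 0-100 score for a PR that changes `lines_changed` lines."""
    for upper_bound, score in DIFF_SIZE_TIERS:
        if lines_changed < upper_bound:
            return score
    # Assumption: 1500 lines or more falls into the lowest tier.
    return 0
```

For example, `score_diff_size(420)` returns 80 and `score_diff_size(2000)` returns 0.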

  1. File Count

Files Changed | Score | Interpretation
< 5           | 100   | Atomic PR
< 15          | 80    | Slightly broad
< 30          | 50    | Needs caution
< 50          | 20    | Very large surface
> 50          | 0     | Unreviewable PR
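
The file-count tiers can be expressed as a table lookup. The sketch below uses Python's standard `bisect` module to find the tier. The names `FILE_COUNT_BOUNDS`, `FILE_COUNT_SCORES`, and `score_files_changed` are hypothetical, and placing exactly 50 files in the lowest tier is an assumption, since the table does not say which side of the boundary it falls on.

```python
from bisect import bisect_right

# Hypothetical sketch of the file-count tiers listed above.
FILE_COUNT_BOUNDS = [5, 15, 30, 50]       # exclusive upper bound of each tier
FILE_COUNT_SCORES = [100, 80, 50, 20, 0]  # one more score than bounds


def score_files_changed(files_changed: int) -> int:
    """Return the 0-100 score for a PR that touches `files_changed` files."""
    # bisect_right counts how many bounds are <= files_changed,
    # which is the index of the tier the PR falls into.
    # Assumption: exactly 50 files lands in the lowest tier.
    tier = bisect_right(FILE_COUNT_BOUNDS, files_changed)
    return FILE_COUNT_SCORES[tier]
```

For example, `score_files_changed(4)` returns 100 and `score_files_changed(15)` returns 50.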

  1. Cyclomatic Complexity

A linear decay function is used to calculate this score: the score decreases in proportion as cyclomatic complexity increases.
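
The exact decay parameters are not stated here, so the following is only a sketch of what a clamped linear decay could look like. The function name `score_complexity` and the bounds `low` and `high` (the complexity values at which the score is 100 and 0) are placeholder assumptions chosen for illustration, not Typo's real values.

```python
def score_complexity(complexity: float, low: float = 5.0, high: float = 25.0) -> float:
    """Linearly decay from 100 at `low` to 0 at `high`, clamped to 0-100.

    `low` and `high` are hypothetical placeholder values.
    """
    if high <= low:
        raise ValueError("high must be greater than low")
    # Fraction of the way from `high` (worst) down to `low` (best).
    fraction = (high - complexity) / (high - low)
    # Clamp so values outside [low, high] stay within 0-100.
    return 100.0 * max(0.0, min(1.0, fraction))
```

With these placeholder bounds, `score_complexity(15)` returns 50.0, since 15 sits halfway between 5 and 25.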

Combining Scores

Each category (diff size, files changed, and complexity) is assigned a weight that reflects its importance. The scorer multiplies each category’s score by its weight, then combines the results as a weighted average to get one overall health score.

This final score is a number between 0 and 100 that represents the overall quality and risk of the PR. Higher numbers mean the PR is healthier and easier to review.
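
A weighted average of this kind can be sketched as follows. The actual weights are not given in this article, so the values in `WEIGHTS` are purely hypothetical, as is the function name `combine_scores`. Only the category keys come from the `CATEGORIES` set above.

```python
# Hypothetical weights for illustration only; the real values are not documented here.
WEIGHTS = {
    "diff_size": 2,
    "files_changed": 1,
    "code_complexity_score": 2,
}


def combine_scores(category_scores: dict, weights: dict = WEIGHTS) -> float:
    """Combine per-category 0-100 scores into one weighted average."""
    total_weight = sum(weights[name] for name in category_scores)
    if total_weight == 0:
        raise ValueError("total weight must be positive")
    weighted_sum = sum(score * weights[name] for name, score in category_scores.items())
    # Dividing by the total weight keeps the result on the same 0-100 scale.
    return weighted_sum / total_weight
```

With these example weights, scores of 80 (diff size), 100 (files changed), and 50 (complexity) combine to (160 + 100 + 100) / 5 = 72.0.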

Label Assignment Logic

Once the composite score is computed, it’s mapped to one of the five risk tiers:

Final Score | Label Key       | Meaning
> 90        | EXCELLENT       | Clean, reviewable PR
> 75        | GOOD            | Solid, minor concerns
> 50        | NEEDS_ATTENTION | Needs some caution
> 25        | RISKY           | Significant risk
< 25        | VERY_RISKY      | High complexity or volume
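
The tier mapping can be sketched as a chain of threshold checks. The function name `assign_label` is hypothetical, and assigning a score of exactly 25 to the lowest tier is an assumption, since the table does not cover that value.

```python
def assign_label(score: float) -> str:
    """Map a 0-100 composite score to one of the five label keys."""
    if score > 90:
        return "EXCELLENT"
    if score > 75:
        return "GOOD"
    if score > 50:
        return "NEEDS_ATTENTION"
    if score > 25:
        return "RISKY"
    # Assumption: a score of exactly 25 falls into the lowest tier.
    return "VERY_RISKY"
```

For example, `assign_label(72.0)` returns `"NEEDS_ATTENTION"`. The returned key can then be looked up in `LABELS` to get the display string.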
