archive/ga-flaky-grading-and-incomplete-markup.md · Updated 2026-05-01

Summary

Fixed 2026-05-01

# GA returns inconsistent grades and skips markup

The Grading Assistant (GA) sometimes leaves comments without marking up the paper itself. Re-running it on the same unchanged document yields different grades, and the first paragraph is the part most often left unmarked.

## Reproduction

1. As a teacher, run GA on a submitted paper.
2. Observe the paper has comments but no inline markup (or markup is missing on the first paragraph).
3. Without changing the document, run GA again.
4. Note that the grade differs from the first run.

## Expected

GA produces deterministic markup and grades for an unchanged document. Markup covers the entire paper, including the first paragraph.
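
The expectation above could be turned into a regression check along the following lines. This is only a sketch: `GradingResult`, `split_paragraphs`, and `check_expected_behavior` are hypothetical names, the real GA output format is not recorded in this ticket, and "covers the entire paper" is interpreted here as "every paragraph has at least one markup entry", which is an assumption.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List


@dataclass(frozen=True)
class GradingResult:
    # Hypothetical result shape, not the actual GA response format.
    grade: int
    marked_paragraphs: FrozenSet[int]  # indices of paragraphs that received markup


def split_paragraphs(text: str) -> List[str]:
    """Split on blank lines and drop empty chunks (assumed paragraph model)."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]


def check_expected_behavior(
    grade_fn: Callable[[str], GradingResult], text: str, runs: int = 3
) -> List[str]:
    """Run the grader several times on the same text and list any violations."""
    problems: List[str] = []
    results = [grade_fn(text) for _ in range(runs)]

    # Determinism: identical input should give identical grade and markup.
    if len({r.grade for r in results}) > 1:
        problems.append("grade differs across runs")
    if len({r.marked_paragraphs for r in results}) > 1:
        problems.append("markup differs across runs")

    # Coverage: every paragraph, including index 0, should be marked.
    expected = frozenset(range(len(split_paragraphs(text))))
    for r in results:
        missing = expected - r.marked_paragraphs
        if missing:
            problems.append(f"unmarked paragraphs: {sorted(missing)}")
            break
    return problems
```

An empty return value means the runs agreed with each other and no paragraph was left unmarked; a real test would pass the actual GA entry point as `grade_fn`.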

## Actual

- Markup is sometimes missing, especially on the first paragraph.
- Re-running on the same unchanged input gives a different (sometimes lower) grade.

## Impact

Teachers lose trust in GA when re-running it on the same paper shows the grade drifting. The first-paragraph blind spot is particularly damaging because that is where thesis and introduction feedback matters most.

## Affected versions

Reported 2026-03-20.

## Suspected cause

Two issues, likely separate but overlapping:
1. **Non-determinism**: the LLM call uses a non-zero temperature, or the rubric application step isn't pinned. Retry/fallback logic could also be returning inconsistent results.
2. **First-paragraph markup gap**: document extraction or chunking is dropping the first chunk, or the markup overlay is failing to render against it. Investigate how paragraphs are tokenized and addressed.
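
The two suspicions above can be illustrated with a small sketch. Everything here is hypothetical: `PINNED_PARAMS`, `request_fingerprint`, `address_paragraphs`, and `first_paragraph_covered` are illustrative names, and the ticket does not record how GA actually calls the model or addresses paragraphs. The first helper derives a stable key from the document plus pinned settings, which could be used to detect (or cache) repeat runs on unchanged input. The other two give each paragraph explicit character offsets so a dropped first chunk shows up as a failed check. Note that a temperature of zero reduces sampling variance but may not by itself guarantee identical model output.

```python
import hashlib
import json
from typing import Dict, List, Tuple

# Hypothetical pinned settings; the real GA request parameters are not known here.
PINNED_PARAMS: Dict[str, object] = {"temperature": 0.0, "rubric_version": "pinned-example"}


def request_fingerprint(document: str, params: Dict[str, object]) -> str:
    """Stable key for (document, params): identical input gives an identical key."""
    payload = json.dumps({"doc": document, "params": params}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def address_paragraphs(document: str) -> List[Tuple[int, int, int]]:
    """Return (index, start, end) character spans for each non-empty paragraph."""
    spans: List[Tuple[int, int, int]] = []
    pos = 0
    for block in document.split("\n\n"):
        start, end = pos, pos + len(block)
        if block.strip():
            spans.append((len(spans), start, end))
        pos = end + 2  # step over the "\n\n" separator
    return spans


def first_paragraph_covered(document: str, marked_spans: List[Tuple[int, int]]) -> bool:
    """True if at least one markup span overlaps the first paragraph."""
    paragraphs = address_paragraphs(document)
    if not paragraphs:
        return True  # nothing to cover in an empty document
    _, start, end = paragraphs[0]
    return any(s < end and e > start for s, e in marked_spans)
```

Logging the fingerprint alongside each grade would make it easy to confirm whether two differing grades really came from the same input and settings, and `first_paragraph_covered` could run as a post-check before markup is shown to the teacher.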

## Workaround

Re-run until the markup looks complete. Trust comments more than the numeric grade.