Public judging standard
Beacon debates are judged by an open, stance-blind rubric. The AI scores how well someone argues, not whether the judge agrees with their position.
Core formula
argumentScore = 10 x sum(dimension x weight)
The AI returns six 0-10 dimension scores. The server computes the final 0-100 score in code.
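As a rough illustration of the formula above, the sketch below computes a 0-100 score from six 0-10 dimension scores. The dimension names and weight values are hypothetical placeholders, since the page does not publish them here. The only thing inferred from the text is that the weights should sum to 1, because that is what makes six 0-10 scores map onto a 0-100 range.

```python
from typing import Mapping

# Hypothetical names and weights, chosen only for illustration.
# The source says logic and evidence carry the most weight in the
# baseline mode, but it does not give the actual numbers.
EXAMPLE_WEIGHTS = {
    "clarity": 0.15,
    "logic": 0.25,
    "evidence": 0.25,
    "fairness": 0.10,
    "clash": 0.15,
    "impact": 0.10,
}


def argument_score(
    dimensions: Mapping[str, float],
    weights: Mapping[str, float] = EXAMPLE_WEIGHTS,
) -> float:
    """Return 10 * sum(dimension * weight) for 0-10 dimension scores."""
    if set(dimensions) != set(weights):
        raise ValueError("dimensions and weights must use the same names")
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights are assumed to sum to 1")
    for name, value in dimensions.items():
        if not 0 <= value <= 10:
            raise ValueError(f"{name} must be between 0 and 10")
    return 10 * sum(dimensions[name] * weights[name] for name in weights)


if __name__ == "__main__":
    example = {
        "clarity": 8, "logic": 7, "evidence": 6,
        "fairness": 9, "clash": 5, "impact": 7,
    }
    print(round(argument_score(example), 1))  # 68.0
```

Keeping the arithmetic in ordinary code, rather than asking the model for a final number, means the same dimension scores always produce the same total.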
How clearly the speaker structured and articulated the argument.
High: Easy to follow, organized, direct.
Low: Confusing, rambling, hard to parse.
Whether the reasoning is coherent and avoids obvious fallacies.
High: Claims connect cleanly to conclusions.
Low: Contradictions, leaps, or unsupported inferences.
Use of specific facts, examples, data, citations, or concrete proof.
High: Specific, relevant support.
Low: Bare assertions or vague references.
Calibration, intellectual humility, fair treatment of the other side.
High: Acknowledges nuance and avoids strawmen.
Low: Overclaims, misrepresents, or dodges complexity.
How directly the speaker engages the opponent's prior claims.
High: Answers the opponent head-on.
Low: Ignores the opponent and repeats prepared points.
How memorable, persuasive, or listener-relevant the turn is.
High: A point viewers remember.
Low: Technically present but forgettable.
Duel mode is the baseline. Logic and evidence carry the most weight, because Beacon should reward well-supported arguments over vibes.
Rewards punchiness, memorability, and one clear idea. Penalizes meandering.
Rewards compression and clarity in 30-second turns. Does not over-punish lack of deep citation.
Balanced one-on-one judging across all six dimensions.
Rewards rigor, structure, evidence, and rebuttal quality above all.
The judge sees the topic, the current speaker's transcript, and a small set of recent opponent claims for clash scoring. It is told not to infer identity, politics, race, gender, or background.
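One way to picture the stance-blind input described above is a payload type that has no field for speaker identity at all. This is a minimal sketch, not Beacon's actual schema: `JudgeInput`, `build_judge_input`, and the cap of three opponent claims are all assumptions made for illustration.

```python
from dataclasses import asdict, dataclass
from typing import Sequence, Tuple

# Hypothetical cap; the source only says "a small set" of recent claims.
MAX_OPPONENT_CLAIMS = 3


@dataclass(frozen=True)
class JudgeInput:
    """Everything the judge is shown. There is deliberately no name,
    profile, or demographic field on this type."""
    topic: str
    transcript: str
    opponent_claims: Tuple[str, ...]


def build_judge_input(
    topic: str,
    transcript: str,
    opponent_claim_history: Sequence[str],
) -> JudgeInput:
    # Keep only the most recent opponent claims for clash scoring.
    recent = tuple(opponent_claim_history[-MAX_OPPONENT_CLAIMS:])
    return JudgeInput(topic=topic, transcript=transcript, opponent_claims=recent)


def to_payload(judge_input: JudgeInput) -> dict:
    """Convert to a plain dict, e.g. for serializing into a prompt."""
    return asdict(judge_input)
```

Leaving identity out of the data structure complements the instruction given to the judge: the model cannot score on information it never receives, although it could still guess from the transcript itself, which is why the instruction remains necessary.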
Beacon extracts up to two factual claims per turn, verifies them against citable source URLs, and only shows live fact-check cards when confidence clears the threshold. Uncertain claims stay out of the overlay instead of pretending to know.
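The gating behavior described above can be sketched as a simple filter. The limit of two claims per turn comes from the text; the threshold value of 0.8, the `FactCheck` type, and the `cards_to_show` function are hypothetical, as is treating "clears the threshold" as greater than or equal.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence

MAX_CLAIMS_PER_TURN = 2       # stated in the text
CONFIDENCE_THRESHOLD = 0.8    # hypothetical value, for illustration only


@dataclass(frozen=True)
class FactCheck:
    claim: str
    verdict: str                 # e.g. "supported" or "contradicted"
    confidence: float            # 0.0 to 1.0
    source_url: Optional[str]    # citable source, if one was found


def cards_to_show(checks: Sequence[FactCheck]) -> List[FactCheck]:
    """Return only the fact-checks that should appear in the live overlay."""
    considered = checks[:MAX_CLAIMS_PER_TURN]
    return [
        check
        for check in considered
        # No source or low confidence means the card is silently dropped.
        if check.source_url and check.confidence >= CONFIDENCE_THRESHOLD
    ]
```

The design choice here is that an uncertain result produces no card at all, rather than a card labeled "unverified", which matches the stated aim of keeping uncertain claims out of the overlay.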
xAI grok-4.3: primary live judge
OpenAI gpt-4o-mini: fallback judge
Anthropic claude-haiku-4-5: last-resort fallback
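The ordering of the three judge models above suggests a fallback chain. The sketch below shows the general pattern with stub functions in place of real API calls. The function names are hypothetical, and the trigger for falling back (any raised exception) is an assumption, since the page does not say what counts as a failure.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# A judge takes a prompt and returns dimension scores. Real judges would
# call a model provider; these signatures are assumptions for the sketch.
JudgeFn = Callable[[str], Dict[str, float]]


def judge_with_fallback(
    prompt: str,
    judges: Sequence[Tuple[str, JudgeFn]],
) -> Tuple[str, Dict[str, float]]:
    """Try each judge in order and return (judge_name, scores) from the
    first one that succeeds."""
    errors: List[str] = []
    for name, judge in judges:
        try:
            return name, judge(prompt)
        except Exception as exc:  # assumed trigger: any failure moves on
            errors.append(f"{name}: {exc}")
    raise RuntimeError("all judges failed: " + "; ".join(errors))


# Stub judges standing in for the primary and fallback models.
def failing_primary(prompt: str) -> Dict[str, float]:
    raise TimeoutError("primary judge timed out")


def working_fallback(prompt: str) -> Dict[str, float]:
    return {"clarity": 7.0}


if __name__ == "__main__":
    chain = [("primary", failing_primary), ("fallback", working_fallback)]
    print(judge_with_fallback("score this turn", chain))
```

Returning the judge's name alongside the scores makes it possible to record which model actually produced a given result.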