Tactics Journal report prompt experiments

Baseline exp5-concrete-title → current exp26-live-gated-style-repair-full. Updated 2026-05-30 23:08 UTC

The feature-flagged cron canary is ready but not globally enabled. Set REPORT_STYLE_REPAIR_ENABLED=1 for a single canary report-worker run.
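
A minimal sketch of how a worker could gate the style-repair step on that flag. The report does not show how report-worker actually reads REPORT_STYLE_REPAIR_ENABLED, so the function name `style_repair_enabled` and the strict comparison against "1" are assumptions made for illustration only.

```python
import os

# Flag name taken from the report; everything else here is hypothetical.
FLAG_NAME = "REPORT_STYLE_REPAIR_ENABLED"


def style_repair_enabled(environ=None):
    """Return True only when the flag is set to exactly "1".

    Assumption: any other value (unset, "0", "true") leaves the repair
    step disabled, which keeps the feature off by default.
    """
    env = os.environ if environ is None else environ
    return env.get(FLAG_NAME, "") == "1"
```

Passing a plain dict instead of `os.environ` makes the gate easy to check without touching the real process environment, for example `style_repair_enabled({"REPORT_STYLE_REPAIR_ENABLED": "1"})`.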

Previous best: 20.92
Baseline rejudge: 21.33
Safe QA repair: 15.58 (rejudge 16.42)
Production-mode gated: 16.58 (exp26)
Delta vs rejudge: -4.75
Repairs: 28/42 (accepted / total)
Fixture gates: 10/12 accepted
Guardrail: exp27 caffeine regression rejected

Fixture scores

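| Fixture | exp5 | exp5 rejudge | exp26 | Δ | accepted | rejected | gate |
|---|---|---|---|---|---|---|---|
| 001-extra-attacker | 13.00 | 15.00 | 5.00 | -10.00 | 4 | 0 | accepted |
| 002-lukaku-rehab | 48.00 | 41.00 | 33.00 | -8.00 | 7 | 1 | accepted |
| 003-role-discovery | 33.00 | 35.00 | 29.00 | -6.00 | 2 | 2 | accepted |
| 004-drone-training | 45.00 | 46.00 | 37.00 | -9.00 | 2 | 2 | accepted |
| 005-press-goalkeeper | 25.00 | 23.00 | 14.00 | -9.00 | 5 | 1 | accepted |
| 006-youth-circulate-vertical | 10.00 | 11.00 | 4.00 | -7.00 | 1 | 1 | accepted |
| 007-sterile-possession | 18.00 | 19.00 | 17.00 | -2.00 | 2 | 0 | accepted |
| 008-caffeine-load | 12.00 | 13.00 | 18.00 | +5.00 | 1 | 1 | accepted |
| 009-submaximal-vo2max | 18.00 | 17.00 | 18.00 | +1.00 | 0 | 4 | rejected |
| 010-training-sanctions | 17.00 | 23.00 | 15.00 | -8.00 | 2 | 1 | accepted |
| 011-loan-option | 11.00 | 11.00 | 8.00 | -3.00 | 2 | 1 | accepted |
| 012-insufficient-positional-structure | 1.00 | 2.00 | 1.00 | -1.00 | 0 | 0 | rejected |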
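
The summary figures at the top of the report appear to be recoverable from the fixture table above. The sketch below recomputes them, assuming each headline score is the unweighted mean of the twelve fixture scores and that Δ is exp26 minus the exp5 rejudge; neither rule is stated in the report, and the names `FIXTURES` and `summarize` are hypothetical.

```python
# Rows copied from the fixture table:
# (fixture, exp5, exp5_rejudge, exp26, delta, accepted, rejected, gate)
FIXTURES = [
    ("001-extra-attacker", 13.0, 15.0, 5.0, -10.0, 4, 0, "accepted"),
    ("002-lukaku-rehab", 48.0, 41.0, 33.0, -8.0, 7, 1, "accepted"),
    ("003-role-discovery", 33.0, 35.0, 29.0, -6.0, 2, 2, "accepted"),
    ("004-drone-training", 45.0, 46.0, 37.0, -9.0, 2, 2, "accepted"),
    ("005-press-goalkeeper", 25.0, 23.0, 14.0, -9.0, 5, 1, "accepted"),
    ("006-youth-circulate-vertical", 10.0, 11.0, 4.0, -7.0, 1, 1, "accepted"),
    ("007-sterile-possession", 18.0, 19.0, 17.0, -2.0, 2, 0, "accepted"),
    ("008-caffeine-load", 12.0, 13.0, 18.0, 5.0, 1, 1, "accepted"),
    ("009-submaximal-vo2max", 18.0, 17.0, 18.0, 1.0, 0, 4, "rejected"),
    ("010-training-sanctions", 17.0, 23.0, 15.0, -8.0, 2, 1, "accepted"),
    ("011-loan-option", 11.0, 11.0, 8.0, -3.0, 2, 1, "accepted"),
    ("012-insufficient-positional-structure", 1.0, 2.0, 1.0, -1.0, 0, 0, "rejected"),
]


def mean(values):
    """Unweighted arithmetic mean (assumed aggregation rule)."""
    values = list(values)
    return sum(values) / len(values)


def summarize(rows):
    """Recompute the headline numbers from per-fixture rows."""
    exp5 = mean(r[1] for r in rows)
    rejudge = mean(r[2] for r in rows)
    exp26 = mean(r[3] for r in rows)
    accepted = sum(r[5] for r in rows)
    rejected = sum(r[6] for r in rows)
    gates = sum(1 for r in rows if r[7] == "accepted")
    return {
        "previous_best": round(exp5, 2),
        "baseline_rejudge": round(rejudge, 2),
        "production_gated": round(exp26, 2),
        "delta_vs_rejudge": round(exp26 - rejudge, 2),
        "repairs": f"{accepted}/{accepted + rejected}",
        "fixture_gates": f"{gates}/{len(rows)}",
    }


def deltas_consistent(rows):
    """Check that every Δ equals exp26 minus the exp5 rejudge."""
    return all(r[3] - r[2] == r[4] for r in rows)


SUMMARY = summarize(FIXTURES)
```

Under those assumptions `SUMMARY` reproduces the figures reported for Previous best, Baseline rejudge, Production-mode gated, Delta vs rejudge, Repairs and Fixture gates, which suggests the headline numbers are plain per-fixture averages and counts.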

001-extra-attacker

exp5 rejudge 15.00 → exp26 5.00 (Δ -10.00); gate=accepted []

Repair decisions
accepted
− An overlapping full-back can turn a winger-versus-full-back duel into a wide 2v1.
+ An overlapping full-back can give the winger a wide 2v1 against the full-back.
accepted
− Counterattacks are the danger behind the decision.
+ If both full-backs are high, the first pass after a turnover can expose the channels behind them.
accepted
− That is the live decision: whether the extra attacker changes the opponent’s defensive line enough to create shots, and whether the first turnover after the change still leaves the team with pressure on the ball and cover behind it.
+ Barcelona’s staff had to decide whether the extra attacker changed Frankfurt’s defensive line enough to create shots, and whether the first turnover after the change still left the team with pressure on the ball and cover behind it.
accepted
− The public evidence supports a practical coaching read, not a universal minute, scoreline or automatic instruction to release both full-backs.
+ Xiao and Zhang’s substitution evidence, Forcher’s formation-change study and the EPL counterattack study help coaches read the game, but do not give a universal minute, scoreline or automatic instruction to release both full-backs.
Unified diff
--- exp5/001-extra-attacker.md
+++ exp26/001-extra-attacker.md
@@ -14,7 +14,7 @@
## Route to goal
-An overlapping full-back can turn a winger-versus-full-back duel into a wide 2v1. [Coaches’ Voice’s overloads guide](https://learning.coachesvoice.com/cv/overloads-football-tactics-explained-guardiola-liverpool-messi/) also describes a pivot dropping into the back line to let full-backs push forward, and warns that an overload in one area leaves an opponent free somewhere else. FIFA’s Tokyo 2020 Technical Study Group made the same role flexible: [Steve McClaren said full-back “adaptability” was the key word](https://inside.fifa.com/tournaments/mens/mensolympic/tokyo2020/news/tactical-trends-at-the-mens-olympic-tournament), with full-backs overlapping, underlapping, inverting and preparing counter-pressing positions.
+An overlapping full-back can give the winger a wide 2v1 against the full-back. [Coaches’ Voice’s overloads guide](https://learning.coachesvoice.com/cv/overloads-football-tactics-explained-guardiola-liverpool-messi/) also describes a pivot dropping into the back line to let full-backs push forward, and warns that an overload in one area leaves an opponent free somewhere else. FIFA’s Tokyo 2020 Technical Study Group made the same role flexible: [Steve McClaren said full-back “adaptability” was the key word](https://inside.fifa.com/tournaments/mens/mensolympic/tokyo2020/news/tactical-trends-at-the-mens-olympic-tournament), with full-backs overlapping, underlapping, inverting and preparing counter-pressing positions.
Axel Witsel dropped into Dortmund’s back line in [Coaches’ Voice’s Dortmund 3-2 Bayern analysis](https://learning.coachesvoice.com/tactical-analysis-borussia-dortmund-3-bayern-munich-2/), which let both Dortmund full-backs move higher. Bayern later pushed both full-backs very high, but neither Javi Martínez nor Leon Goretzka dropped into the back line, and Dortmund attacked the space behind them on the break.
@@ -22,12 +22,12 @@
Five attackers usually need five outfield players underneath or around the ball. [Coaches’ Voice’s rest-defence guide](https://learning.coachesvoice.com/cv/rest-defence-explained/) says teams often attack with five and keep five in rest defence, but may move to six attackers and four rest-defence players when desperate for a goal. The same guide describes common 2-3 and 3-2 structures behind the attack.
-Counterattacks are the danger behind the decision. González-Rodenas and colleagues’ [2017-18 EPL study of 1,971 possessions](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226978) found counterattacks were more effective than combinative attacks for creating scoring opportunities, with odds ratio 3.428. A [FIFA Training Centre defensive-transition session](https://www.fifatrainingcentre.com/en/practice/elite-sessions/transition-to-defending/defensive-transitions.php) gives the coachable response: the closest player presses the ball carrier, teammates recover compactly, central areas are protected, and the team tries to create a numerical advantage around the ball.
+If both full-backs are high, the first pass after a turnover can expose the channels behind them. González-Rodenas and colleagues’ [2017-18 EPL study of 1,971 possessions](https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0226978) found counterattacks were more effective than combinative attacks for creating scoring opportunities, with odds ratio 3.428. A [FIFA Training Centre defensive-transition session](https://www.fifatrainingcentre.com/en/practice/elite-sessions/transition-to-defending/defensive-transitions.php) gives the coachable response: the closest player presses the ball carrier, teammates recover compactly, central areas are protected, and the team tries to create a numerical advantage around the ball.
## Live check
Bench staff can answer the full-back question with four clips: the last settled attack before the change, the first settled attack after it, the first turnover after it, and the first opponent counter after it. Pause before the turnover. If both full-backs are high, who is behind them: a dropping pivot, a centre-back covering across, or an inverted full-back protecting the middle? If the opponent’s first pass breaks the central screen or finds the vacated full-back channel with a 3v2, holding one full-back deeper is safer than adding another runner.
-Flick credited Barcelona’s staff after the Frankfurt change, saying his coaches and analysts “did a great job” after showing the team the spaces they wanted. That is the live decision: whether the extra attacker changes the opponent’s defensive line enough to create shots, and whether the first turnover after the change still leaves the team with pressure on the ball and cover behind it.
+Flick credited Barcelona’s staff after the Frankfurt change, saying his coaches and analysts “did a great job” after showing the team the spaces they wanted. Barcelona’s staff had to decide whether the extra attacker changed Frankfurt’s defensive line enough to create shots, and whether the first turnover after the change still left the team with pressure on the ball and cover behind it.
-Xiao and Zhang’s substitution evidence is World Cup evidence, Forcher’s formation-change study is one Bundesliga team, and the EPL counterattack study is one league season sample. The public evidence supports a practical coaching read, not a universal minute, scoreline or automatic instruction to release both full-backs.
+Xiao and Zhang’s substitution evidence is World Cup evidence, Forcher’s formation-change study is one Bundesliga team, and the EPL counterattack study is one league season sample. Xiao and Zhang’s substitution evidence, Forcher’s formation-change study and the EPL counterattack study help coaches read the game, but do not give a universal minute, scoreline or automatic instruction to release both full-backs.

002-lukaku-rehab

exp5 rejudge 41.00 → exp26 33.00 (Δ -8.00); gate=accepted []

Repair decisions
accepted
− The dispute began when the place of rehab stopped matching the club’s expected line of sight.
+ Napoli said Lukaku would return to Naples to improve his fitness, but the rehab location became the dispute.
accepted
− It does show the management route: a club can allow rehab away from base only if the club still knows the medical state, the expected return date, and the conditions for selection.
+ Under that compromise, a club can allow rehab away from base only if it still knows the medical state, the expected return date, and the conditions for selection.
accepted
− An injured player near selection is being judged under uncertainty: re-injury risk, competition pressure, the player’s wishes, and the club’s availability need all pull on the decision.
+ Staff deciding whether an injured player trains or plays have to weigh re-injury risk, competition pressure, the player’s wishes, and the club’s availability need.
accepted
− The club-country solution should make the decision visible: what are the options, what information is being used, and who has accepted the risk.
+ The club and national team should record the options, the information being used, and who has accepted the risk.
accepted
− A football return should be cleared by criteria, not by the World Cup date or the next league match alone.
+ The doctor should clear a football return by criteria even when the World Cup date or the next league match is close.
rejected — after_repeats_next_sentence:0.60
− The national-team target can sit inside the plan, but it should not become the clearance test.
+ A tournament date can guide the plan while the player’s symptoms, strength and sprint benchmarks, football-specific training, and response after each step up still decide clearance.
accepted
− Didier Deschamps’ reply shows why informal trust is not enough.
+ Deschamps defended France’s protocol, so a club still needs written limits.
accepted
− Rehab disputes happen inside that calendar pressure, not outside it.
+ The crowded calendar puts more pressure on clubs to set review dates before international windows.
Unified diff
--- exp5/002-lukaku-rehab.md
+++ exp26/002-lukaku-rehab.md
@@ -4,17 +4,17 @@
<!---more--->
-Napoli diagnosed Romelu Lukaku with a high-grade rectus femoris injury in his left thigh on August 18th, 2025; [Napoli’s August 18th medical update](https://sscnapoli.it/en/bollettino-medico-le-condizioni-di-lukaku/) said he had already begun rehabilitation and would have a surgical consultation. On March 24th, 2026, [Napoli’s note](https://sscnapoli.it/en/lukaku-lascia-il-ritiro-della-nazionale-belga/) said Lukaku would miss Belgium’s friendlies against the United States and Mexico and return to Naples to improve his fitness. The dispute began when the place of rehab stopped matching the club’s expected line of sight.
+Napoli diagnosed Romelu Lukaku with a high-grade rectus femoris injury in his left thigh on August 18th, 2025; [Napoli’s August 18th medical update](https://sscnapoli.it/en/bollettino-medico-le-condizioni-di-lukaku/) said he had already begun rehabilitation and would have a surgical consultation. On March 24th, 2026, [Napoli’s note](https://sscnapoli.it/en/lukaku-lascia-il-ritiro-della-nazionale-belga/) said Lukaku would miss Belgium’s friendlies against the United States and Mexico and return to Naples to improve his fitness. Napoli said Lukaku would return to Naples to improve his fitness, but the rehab location became the dispute.
Lukaku later said he had checks in Belgium that showed inflammation and liquid on his hip flexor muscle, and that he chose to rehab in Belgium because he needed to be “clinically 100 percent.” [Sportstar’s AFP report](https://sportstar.thehindu.com/football/lukaku-injury-update-treatment-row-belgium-napoli-world-cup-friendlies/article70803144.ece) quoted him saying he could “never turn my back on Napoli” and wanted to help both Napoli and Belgium. Napoli then escalated. [Napoli’s March 31st statement](https://sscnapoli.it/en/nota-del-club-5/) said Lukaku did not respond to the call to return to training and that the club reserved the right to consider disciplinary action.
-Lukaku and Napoli eventually moved from a location fight to a negotiated solution. [BBC Sport reported on April 20th](https://www.bbc.com/sport/football/articles/cr4199zr2g4o) that Lukaku held “calm, collaborative and constructive” talks with Napoli, updated the club on his recovery, and would continue rehabilitation in Belgium after an amicable solution. That compromise does not prove the private terms. It does show the management route: a club can allow rehab away from base only if the club still knows the medical state, the expected return date, and the conditions for selection.
+Lukaku and Napoli eventually moved from a location fight to a negotiated solution. [BBC Sport reported on April 20th](https://www.bbc.com/sport/football/articles/cr4199zr2g4o) that Lukaku held “calm, collaborative and constructive” talks with Napoli, updated the club on his recovery, and would continue rehabilitation in Belgium after an amicable solution. That compromise does not prove the private terms. Under that compromise, a club can allow rehab away from base only if it still knows the medical state, the expected return date, and the conditions for selection.
## Readiness before urgency
-An injured player near selection is being judged under uncertainty: re-injury risk, competition pressure, the player’s wishes, and the club’s availability need all pull on the decision. [Yung, Ardern, Serpiello and Robertson’s open-access review](https://link.springer.com/article/10.1186/s40798-022-00440-z) describes return-to-sport decisions as complex and affected by time pressure from competition schedules and social pressure from coaches, families and supporters. The club-country solution should make the decision visible: what are the options, what information is being used, and who has accepted the risk.
+Staff deciding whether an injured player trains or plays have to weigh re-injury risk, competition pressure, the player’s wishes, and the club’s availability need. [Yung, Ardern, Serpiello and Robertson’s open-access review](https://link.springer.com/article/10.1186/s40798-022-00440-z) describes return-to-sport decisions as complex and affected by time pressure from competition schedules and social pressure from coaches, families and supporters. The club and national team should record the options, the information being used, and who has accepted the risk.
-A football return should be cleared by criteria, not by the World Cup date or the next league match alone. [Barça Innovation Hub’s rehabilitation overview](https://barcainnovationhub.fcbarcelona.com/blog/sports-rehabilitation-return-to-play-protocols/) says hamstring recurrence can reach 38% in the first six months after clearance and argues for objective return-to-play checks: strength, functional performance, confidence, load tolerance, GPS monitoring, conditioned group training, and limited first minutes. For a club, the usable checks are concrete: sprint exposure against pre-injury values, change-of-direction work without compensation, at least seven consecutive days tolerating training load, and a minutes cap for the first match back.
+The doctor should clear a football return by criteria even when the World Cup date or the next league match is close. [Barça Innovation Hub’s rehabilitation overview](https://barcainnovationhub.fcbarcelona.com/blog/sports-rehabilitation-return-to-play-protocols/) says hamstring recurrence can reach 38% in the first six months after clearance and argues for objective return-to-play checks: strength, functional performance, confidence, load tolerance, GPS monitoring, conditioned group training, and limited first minutes. For a club, the usable checks are concrete: sprint exposure against pre-injury values, change-of-direction work without compensation, at least seven consecutive days tolerating training load, and a minutes cap for the first match back.
The national-team target can sit inside the plan, but it should not become the clearance test. If the player wants a tournament and the club needs league availability, staff still need the same record: symptoms after load, strength and sprint benchmarks, football-specific training completed, and the response after each step up.
@@ -22,13 +22,13 @@
PSG named the failure case after Ousmane Dembélé and Désiré Doué were injured with France in September 2025. [France24’s AFP report](https://www.france24.com/en/live-news/20250907-psg-call-for-change-after-dembele-and-doue-international-duty-injuries) said PSG claimed it had given France “concrete medical information” on acceptable workload and injury risks, regretted that recommendations were not taken into account, and asked for “systematic, documented and reciprocal exchanges” between club and national-team medical staffs.
-Didier Deschamps’ reply shows why informal trust is not enough. He said there is “no such thing as zero risk,” defended France’s protocol, and said he considers “the player’s feelings” when deciding whether to play someone. A club cannot remove all match risk, but it can require the national team to acknowledge the risk in writing: maximum training load, forbidden exercises, match-minute ceiling, pain or fatigue stop rules, and who calls the club doctor before the player trains again.
+Deschamps defended France’s protocol, so a club still needs written limits. He said there is “no such thing as zero risk,” defended France’s protocol, and said he considers “the player’s feelings” when deciding whether to play someone. A club cannot remove all match risk, but it can require the national team to acknowledge the risk in writing: maximum training load, forbidden exercises, match-minute ceiling, pain or fatigue stop rules, and who calls the club doctor before the player trains again.
## Calendar and insurance
[FIFA Council’s international calendar release](https://inside.fifa.com/organisation/fifa-council/media-releases/fifa-council-approves-international-match-calendars?ref=ed_direct) sets formal men’s windows for 2025-2030: March, June, late September/early October, and November, with the 2026 World Cup mandatory release period starting on May 25th, 2026 after the last official club match on May 24th. Clubs know these dates early enough to schedule rehab decision points before the player leaves: medical review before call-up, load plan on arrival, reassessment before any match, and post-window handover.
-[FIFPRO’s 2023/24 Player Workload Monitoring report](https://www.fifpro.org/en/articles/2024/09/workload-demands-on-players-spiral-as-competitions-expand-and-governing-bodies-fail-to-meet-duty-of-care) found 54% of 1,500 monitored players faced excessive or high workload, 31% had 55 or more matchday squad inclusions, 17% made more than 55 appearances, and 30% had at least six straight weeks with two or more games per week. FIFPRO also cited Takumi Minamino having one day of recovery after returning from Japan duty at the Asian Cup before resuming club work with Monaco. Rehab disputes happen inside that calendar pressure, not outside it.
+[FIFPRO’s 2023/24 Player Workload Monitoring report](https://www.fifpro.org/en/articles/2024/09/workload-demands-on-players-spiral-as-competitions-expand-and-governing-bodies-fail-to-meet-duty-of-care) found 54% of 1,500 monitored players faced excessive or high workload, 31% had 55 or more matchday squad inclusions, 17% made more than 55 appearances, and 30% had at least six straight weeks with two or more games per week. FIFPRO also cited Takumi Minamino having one day of recovery after returning from Japan duty at the Asian Cup before resuming club work with Monaco. The crowded calendar puts more pressure on clubs to set review dates before international windows.
[EFC’s FIFA Club Protection Programme explainer](https://www.efcfootball.com/en/fifa-club-protection-programme) says clubs can receive salary protection when a player is accidentally injured on official national-team duty, with payment after the first 28 days, up to 365 days, and up to $7.5m per accident. It also says illness, permanent injury, death, medical treatment costs, and most pre-existing injuries are excluded. Insurance can reduce wage exposure; it cannot give the coach the player back or reduce recurrence risk.

003-role-discovery

exp5 rejudge 35.00 → exp26 29.00 (Δ -6.00); gate=accepted []

Repair decisions
rejected — after_broad_should_opener
− Recruitment models should compare players by phase-role exposure, not listed position or season per-90s, so scouts can separate proven fit, missing opportunity, and projection risk.
+ Scouts should compare players by phase-role exposure before using listed position or season per-90s to separate proven fit, missing opportunity, and projection risk.
rejected — after_repeats_previous_sentence:0.82
− Low output can mean low ability, but it can also mean low exposure.
+ A player may not show inverted-full-back metrics because he lacks the ability, or because his current team never asks him to play there.
accepted
− The evidence supports a better scouting denominator, not an automatic answer.
+ Reeves’s example gives scouts a better denominator for role exposure while leaving the transfer-success answer unproved.
accepted
− In the meeting, the empty rows matter as much as the good clips, because they show which parts of the buying club’s role have not yet been tested.
+ In the meeting, scouts should mark the empty rows alongside the good clips because those rows show which parts of the buying club’s role have not yet been tested.
Unified diff
--- exp5/003-role-discovery.md
+++ exp26/003-role-discovery.md
@@ -24,7 +24,7 @@
[Hudl StatsBomb’s recruitment workflow](https://blogarchive.statsbomb.com/articles/soccer/using-statsbomb-iq-for-player-recruitment/) already points in this direction with role-specific filters, more than 100 metrics, On-Ball Value, Similar Player Search, pressure context, and tactical evaluation. The data can build the shortlist, but the evaluation still asks where the player defends, where he creates chances, how he passes under pressure, and what type of chances he receives.
-Everton’s Head of Performance Insights Charlie Reeves said [SkillCorner tracking data](https://skillcorner.com/articles/everton-add-skillcorner-game-intelligence-data-to-extended-agreement) helps measure “huge parts of the game that we can’t get from event data,” including off-ball work, team shape, build-up, and player fit. Reeves is describing a club workflow, not proving that any model predicts transfer success. The evidence supports a better scouting denominator, not an automatic answer.
+Everton’s Head of Performance Insights Charlie Reeves said [SkillCorner tracking data](https://skillcorner.com/articles/everton-add-skillcorner-game-intelligence-data-to-extended-agreement) helps measure “huge parts of the game that we can’t get from event data,” including off-ball work, team shape, build-up, and player fit. Reeves is describing a club workflow, not proving that any model predicts transfer success. Reeves’s example gives scouts a better denominator for role exposure while leaving the transfer-success answer unproved.
## Read output through exposure
@@ -38,4 +38,4 @@
| Missing exposure | The current team rarely created that phase-role for him. | Look for transferable cues in nearby tasks and mark the role as projection. |
| Transfer risk | The new role adds untested scanning, receiving, running, or defending demands. | Decide whether the price assumes proof or potential. |
-The matrix does not predict the transfer by itself; it tells scouts where the evidence ends. In the meeting, the empty rows matter as much as the good clips, because they show which parts of the buying club’s role have not yet been tested.
+The matrix does not predict the transfer by itself; it tells scouts where the evidence ends. In the meeting, scouts should mark the empty rows alongside the good clips because those rows show which parts of the buying club’s role have not yet been tested.

004-drone-training

exp5 rejudge 46.00 → exp26 37.00 (Δ -9.00); gate=accepted []

Repair decisions
accepted
− Clubs should treat the last 72 hours before a knockout match as a match-information security window: hide set plays and team clues, audit sightlines, control access, plan for drones and preserve evidence quickly.
+ In the last 72 hours before a knockout match, clubs should hide set plays and team clues, audit sightlines, control access, plan for drones and preserve evidence quickly.
rejected — after_repeats_previous_sentence:0.86
− That is the model: shield the valuable session, not every minute of ordinary training.
+ The screen should close during tactical sessions and stay open most of the time for grass growth and residents.
accepted
− Canada’s Olympic drone case shows why airspace cannot be treated like a person on a hill.
+ A drone can fly over training sites, so staff cannot treat airspace like a person on a hill.
rejected — after_broad_should_opener
− The football plan cannot wait for the disciplinary plan.
+ Clubs should decide on the football plan before the hearing.
Unified diff
--- exp5/004-drone-training.md
+++ exp26/004-drone-training.md
@@ -1,6 +1,6 @@
# How clubs should protect training-session information before knockout matches
-Clubs should treat the last 72 hours before a knockout match as a match-information security window: hide set plays and team clues, audit sightlines, control access, plan for drones and preserve evidence quickly.
+In the last 72 hours before a knockout match, clubs should hide set plays and team clues, audit sightlines, control access, plan for drones and preserve evidence quickly.
<!---more--->
@@ -26,7 +26,7 @@
## Drones need a separate plan
-Canada’s Olympic drone case shows why airspace cannot be treated like a person on a hill. [FIFA’s Appeal Committee](https://ipt.fifa.com/legal/judicial-bodies/news/fifa-appeal-committee-decision-on-the-canadian-soccer-association-and-its-officials) deducted six points from Canada, fined the federation CHF 200,000 and suspended Bev Priestman, Joseph Lombardi and Jasmine Mander for one year after a breach connected to the prohibition on flying drones over training sites.
+A drone can fly over training sites, so staff cannot treat airspace like a person on a hill. [FIFA’s Appeal Committee](https://ipt.fifa.com/legal/judicial-bodies/news/fifa-appeal-committee-decision-on-the-canadian-soccer-association-and-its-officials) deducted six points from Canada, fined the federation CHF 200,000 and suspended Bev Priestman, Joseph Lombardi and Jasmine Mander for one year after a breach connected to the prohibition on flying drones over training sites.
UK drone privacy rules also make the evidence problem different. The [ICO’s drone guidance](https://ico.org.uk/for-organisations/uk-gdpr-guidance-and-resources/cctv-and-video-surveillance/guidance-on-video-surveillance-including-cctv/additional-considerations-for-technologies-other-than-cctv/unmanned-aerial-systems-uas-drones/) says drone footage can collect personal data, including people not meant to be recorded, while the [Civil Aviation Authority](https://www.caa.co.uk/drones/open-category/moving-on-to-more-advanced-flying/privacy-rules-when-flying-drones/) says photos or recordings identifying people may fall under GDPR and the Data Protection Act 2018. A club should brief staff on who calls security, who records the time and direction of the drone, who checks CCTV for the pilot, and who contacts police or the league.

005-press-goalkeeper

exp5 rejudge 23.00 → exp26 14.00 (Δ -9.00); gate=accepted []

Repair decisions
accepted
− PSG turned Yann Sommer into the trigger in the 2025 Champions League final.
+ PSG pressed when the ball went back to Yann Sommer in the 2025 Champions League final.
accepted
− The usual reaction is for the goalkeeper to play long under pressure, often losing possession, but the same article warns that teams can reverse the trap by creating options behind the jump as the press starts.
+ The goalkeeper often plays long under pressure and loses possession; the same article warns that teams can create options behind the jump as the press starts.
rejected — after_repeats_next_sentence:0.60
− Benfica showed why the goalkeeper press also asks questions behind the first line.
+ Benfica punished Bayern Munich’s high press by playing long behind the first line.
accepted
− The pressing team has to decide whether it is hurrying the goalkeeper into a bad pass or inviting him into the best pass on the pitch.
+ The pressing team has to decide whether it is hurrying the goalkeeper into a bad pass or giving him a clean long pass.
accepted
− The examples come from different competitions, seasons, and providers, so they support a decision rule rather than a league-wide trend claim.
+ The clips are examples from different competitions, seasons, and providers, and they do not prove a league-wide trend claim.
accepted
− Those numbers do not decide the press by themselves, but they tell the coach whether the likely outcome is a short escape, a rushed clearance, or a planned long pass.
+ The coach can use those numbers to prepare for a short escape, a rushed clearance, or a planned long pass.
Unified diff
--- exp5/005-press-goalkeeper.md
+++ exp26/005-press-goalkeeper.md
@@ -4,13 +4,13 @@
<!---more--->
-PSG turned Yann Sommer into the trigger in the 2025 Champions League final. [Coaches’ Voice’s PSG-Inter analysis](https://learning.coachesvoice.com/cv/psg-inter-2025-champions-league-final-tactics/) describes Ousmane Dembélé jumping onto Sommer after Francesco Acerbi passed back and moved into midfield; PSG’s front line and higher midfielders then went player-to-player as Inter tried to play over them. Inter won many first contacts, but Willian Pacho and Marquinhos followed the two forwards and PSG’s midfield stayed goal-side for the second action. Luis Enrique’s summary was plain: PSG put the “right pressure” on Inter and did not let them play their normal football.
+PSG pressed when the ball went back to Yann Sommer in the 2025 Champions League final. [Coaches’ Voice’s PSG-Inter analysis](https://learning.coachesvoice.com/cv/psg-inter-2025-champions-league-final-tactics/) describes Ousmane Dembélé jumping onto Sommer after Francesco Acerbi passed back and moved into midfield; PSG’s front line and higher midfielders then went player-to-player as Inter tried to play over them. Inter won many first contacts, but Willian Pacho and Marquinhos followed the two forwards and PSG’s midfield stayed goal-side for the second action. Luis Enrique’s summary was plain: PSG put the “right pressure” on Inter and did not let them play their normal football.
Goalkeepers are targeted because build-up teams use them to make the free player. [Ralph Hasenhüttl’s Elite Soccer session](https://elitesoccercoaching.net/in-possession/using-the-goalkeeper-in-build-up-play) says using the goalkeeper gives a numerical advantage against player-oriented high pressing. The session trains the goalkeeper to wait for pressure, keep the ball controlled, and find the spare player; its coaching cue is “straight pressure = diagonal bounce, diagonal pressure = vertical bounce.”
## The trigger is only the start
-A backward pass to the goalkeeper from a higher defensive line is a common pressing trigger, [Spielverlagerung’s Timing Game article](https://spielverlagerung.com/2025/03/31/the-timing-game-mh/) says. The usual reaction is for the goalkeeper to play long under pressure, often losing possession, but the same article warns that teams can reverse the trap by creating options behind the jump as the press starts.
+A backward pass to the goalkeeper from a higher defensive line is a common pressing trigger, [Spielverlagerung’s Timing Game article](https://spielverlagerung.com/2025/03/31/the-timing-game-mh/) says. The goalkeeper often plays long under pressure and loses possession; the same article warns that teams can create options behind the jump as the press starts.
Dominic Solanke’s curved run toward Jan Oblak shows the short-exit part of the plan. In [Football Analysis Hub’s Tottenham-Atlético example](https://www.footballanalysishub.com/post/the-high-press-explained-risk-vs-reward), a back pass triggered the run, Solanke blocked the right centre-back, Spurs marked ball-side options, and Oblak was forced into a turnover.
@@ -24,12 +24,12 @@
Brad Guzan punished early jumps by stepping outside his box as Atlanta’s centre-backs split wider. [Tactical Football Analysis’s build-up piece](https://tacticalfootballanalysis.com/tactical-theory-goalkeepers-build-up-tactics-analysis/) says he created a three-against-two, took a touch between the strikers when forwards jumped, and then played long from a higher central position. The same piece describes Manuel Neuer moving side to side, stepping into line with Bayern’s centre-backs, and using an aggressive touch to pin a pressing forward so teammates had time to receive and drive past Celtic’s press.
-The pressing team has to decide whether it is hurrying the goalkeeper into a bad pass or inviting him into the best pass on the pitch. If the first runner cannot block one short lane and arrive close enough to affect the kick, the goalkeeper may become more dangerous after the press starts.
+The pressing team has to decide whether it is hurrying the goalkeeper into a bad pass or giving him a clean long pass. If the first runner cannot block one short lane and arrive close enough to affect the kick, the goalkeeper may become more dangerous after the press starts.
## Scouting the jump
-The examples come from different competitions, seasons, and providers, so they support a decision rule rather than a league-wide trend claim. Goalkeeper passing data can tell analysts what kind of escape to expect. [Analytics FC’s risk-versus-reward study](https://analyticsfc.co.uk/blog/2023/03/03/goalkeeper-passing-risk-versus-reward/) used 2021/22 and 2022/23 Premier League data to assess pass length, retention, xG, xGA, xT, and xTA after goalkeeper touches; it also notes that xG is often too sparse near goalkeeper distribution, while xT can better capture the threat created or conceded.
+The clips are examples from different competitions, seasons, and providers, and they do not prove a league-wide trend claim. Goalkeeper passing data can tell analysts what kind of escape to expect. [Analytics FC’s risk-versus-reward study](https://analyticsfc.co.uk/blog/2023/03/03/goalkeeper-passing-risk-versus-reward/) used 2021/22 and 2022/23 Premier League data to assess pass length, retention, xG, xGA, xT, and xTA after goalkeeper touches; it also notes that xG is often too sparse near goalkeeper distribution, while xT can better capture the threat created or conceded.
-[Premier League/Opta’s 2024/25 goalkeeper numbers](https://www.premierleague.com/en/news/4293708) show why opponent profiles matter. Guglielmo Vicario had 88.8% passing accuracy and 75.4% of his goal kicks find a teammate before leaving his own penalty area. Jordan Pickford attempted 776 long balls at 41.2% accuracy and led keepers with 118 passes into the final third. Those numbers do not decide the press by themselves, but they tell the coach whether the likely outcome is a short escape, a rushed clearance, or a planned long pass.
+[Premier League/Opta’s 2024/25 goalkeeper numbers](https://www.premierleague.com/en/news/4293708) show why opponent profiles matter. Guglielmo Vicario had 88.8% passing accuracy and 75.4% of his goal kicks find a teammate before leaving his own penalty area. Jordan Pickford attempted 776 long balls at 41.2% accuracy and led keepers with 118 passes into the final third. The coach can use those numbers to prepare for a short escape, a rushed clearance, or a planned long pass.
Pause the clip as the ball travels back to the goalkeeper. The cue is not just the striker’s sprint. The coach needs to see the curved run denying one centre-back, the nearest pivot marked, the centre-back assigned to contest the target forward, and a midfielder underneath the second ball. If those jobs are not named, the press should wait for the next pass.

006-youth-circulate-vertical

exp5 rejudge 11.00 → exp26 4.00 -7.00; gate=accepted []

Repair decisions
rejected — after_generated_pattern:/^public evidence\b/i
− Direct interviews with a broad set of academies and long-term tracking are missing, so the public evidence supports common practice design rather than a global survey of youth teams.
+ Public evidence lacks direct interviews with a broad set of academies and long-term tracking, so it fits common practice design rather than a global survey of youth teams.
accepted
− The next progression is a directional game with pressure and a target line; the feedback is whether the player scanned before the first touch, received side-on, saw the closest defender, and chose the safe forward pass or kept the ball.
+ Move the player into a directional game with pressure and a target line, then check whether the player scanned before the first touch, received side-on, saw the closest defender, and chose the safe forward pass or kept the ball.
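The rejection reason above names a regex that the replacement sentence matched. A minimal sketch of how such an opener check could work follows; the function name and the pattern list are hypothetical, and only the two regexes themselves come from rejection reasons recorded in this report.

```python
import re

# Assumption: the validator holds a list of opener patterns. These two regexes
# are copied from rejection reasons in this report; the list is illustrative.
GENERATED_OPENERS = [
    re.compile(r"^public evidence\b", re.IGNORECASE),
    re.compile(r"^the public evidence\b", re.IGNORECASE),
]


def generated_pattern_reasons(after: str) -> list[str]:
    """Return one rejection reason per opener pattern the replacement matches."""
    text = after.strip()
    return [
        f"after_generated_pattern:/{pattern.pattern}/i"
        for pattern in GENERATED_OPENERS
        if pattern.search(text)
    ]
```

An empty list would mean the replacement passes this particular check; other validators could still reject it.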
Unified diff
--- exp5/006-youth-circulate-vertical.md
+++ exp26/006-youth-circulate-vertical.md
@@ -30,4 +30,4 @@
England’s [FA age-phase priorities](https://www.thefa.com/bootroom/resources/england-dna/the-future-england-player/age-phase-priorities) move from ball mastery and combining in the foundation phase to youth players staying connected to retain possession and open compact defences. At professional development age, the expectation becomes more specific: retain with intent, recognise when to penetrate quickly, control tempo and change the speed of play.
-Direct interviews with a broad set of academies and long-term tracking are missing, so the public evidence supports common practice design rather than a global survey of youth teams. The next progression is a directional game with pressure and a target line; the feedback is whether the player scanned before the first touch, received side-on, saw the closest defender, and chose the safe forward pass or kept the ball.
+Direct interviews with a broad set of academies and long-term tracking are missing, so the public evidence supports common practice design rather than a global survey of youth teams. Move the player into a directional game with pressure and a target line, then check whether the player scanned before the first touch, received side-on, saw the closest defender, and chose the safe forward pass or kept the ball.

007-sterile-possession

exp5 rejudge 19.00 → exp26 17.00 -2.00; gate=accepted []

Repair decisions
accepted
− Shot count is the final filter, not the answer.
+ Shot totals should be checked against the chance-quality context that Opta and StatsBomb list.
accepted
− Long possessions can be overvalued when chain length stands in for threat.
+ Models can rate long passing moves too highly when they treat pass-chain length as attacking danger.
Unified diff
--- exp5/007-sterile-possession.md
+++ exp26/007-sterile-possession.md
@@ -36,11 +36,11 @@
Pause the clip as the receiver takes the first touch. The useful tags are nearest defender, defenders goalside, body shape, first touch direction and next action. A between-line pass is penetration if the receiver can face forward, slip the next runner, carry into the box or set a cutback. If the receiver has to bounce the ball backwards under immediate pressure, the entry was probably only nominal.
-Shot count is the final filter, not the answer. [Opta's xG explainer](https://theanalyst.com/articles/what-is-expected-goals-xg) says chance quality depends on distance, angle, goalkeeper position, pressure, shot type, pattern of play and the previous action. [StatsBomb's freeze-frame case study](https://blogarchive.statsbomb.com/news/statsbomb-data-case-studies-freeze-frames-and-defender-locations/) adds defender and goalkeeper locations, pressure, blockers and the keeper's view of goal. Against a low block, a cutback to a free runner and a blocked shot through six defenders should not be reported as equal shots.
+Shot totals should be checked against the chance-quality context that Opta and StatsBomb list. [Opta's xG explainer](https://theanalyst.com/articles/what-is-expected-goals-xg) says chance quality depends on distance, angle, goalkeeper position, pressure, shot type, pattern of play and the previous action. [StatsBomb's freeze-frame case study](https://blogarchive.statsbomb.com/news/statsbomb-data-case-studies-freeze-frames-and-defender-locations/) adds defender and goalkeeper locations, pressure, blockers and the keeper's view of goal. Against a low block, a cutback to a free runner and a blocked shot through six defenders should not be reported as equal shots.
## Limits and report output
-Long possessions can be overvalued when chain length stands in for threat. [StatsBomb's OBV explainer](https://blogarchive.statsbomb.com/news/introducing-on-ball-value-obv/) values each event by its change in scoring and conceding probability, includes pressure and action context, and warns that models using possession-history features can overvalue longer chains because stronger teams tend to have longer possessions.
+Models can rate long passing moves too highly when they treat pass-chain length as attacking danger. [StatsBomb's OBV explainer](https://blogarchive.statsbomb.com/news/introducing-on-ball-value-obv/) values each event by its change in scoring and conceding probability, includes pressure and action context, and warns that models using possession-history features can overvalue longer chains because stronger teams tend to have longer possessions.
Provider definitions differ, and public match pages rarely show all defensive positions. Event data can count entries, carries, shots and cutbacks; tracking, 360 data or video is needed to judge whether the block moved, whether the receiver had space, and whether the rest defence protected the first pass after a loss.

008-caffeine-load

exp5 rejudge 13.00 → exp26 18.00 +5.00; gate=accepted []

Repair decisions
rejected — number_tokens_changed, after_repeats_next_sentence:0.56
− Twenty male soccer players in a treadmill protocol gave a football-specific version of the same pattern.
+ In 20 male soccer players, 6 mg/kg caffeine increased treadmill time to exhaustion and lowered RPE.
accepted
− The stronger conclusion is that caffeine may help some football-relevant high-intensity protocols, not that it reliably improves every repeated-sprint or match situation.
+ Caffeine may help some football-relevant high-intensity protocols, although it does not reliably improve every repeated-sprint or match situation.
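The two rejection reasons in this fixture, number_tokens_changed and after_repeats_next_sentence, can be illustrated with a rough sketch. The report does not show how either check is computed, so everything below is an assumption: the numeral regex, the word-overlap ratio, and the function names are hypothetical, and the ratio is not claimed to reproduce the 0.56 recorded above.

```python
import re
from collections import Counter

# Assumption: "number tokens" means digit runs such as 20, 0.16 or 41.2.
NUMBER_RE = re.compile(r"\d+(?:[.,]\d+)*")
WORD_RE = re.compile(r"[a-z0-9]+")


def number_tokens(text: str) -> Counter:
    """Count each numeral that appears in the sentence."""
    return Counter(NUMBER_RE.findall(text))


def number_tokens_changed(before: str, after: str) -> bool:
    """True when the replacement adds, drops, or alters any numeral."""
    return number_tokens(before) != number_tokens(after)


def repeat_ratio(after: str, next_sentence: str) -> float:
    """Share of the replacement's distinct words that recur in the next sentence."""
    after_words = set(WORD_RE.findall(after.lower()))
    next_words = set(WORD_RE.findall(next_sentence.lower()))
    if not after_words:
        return 0.0
    return len(after_words & next_words) / len(after_words)
```

Under these assumptions, the rejected replacement introduces the numerals 20 and 6 that the spelled-out original lacks, so the numeral check would flag it, while the accepted replacement contains no numerals on either side.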
Unified diff
--- exp5/008-caffeine-load.md
+++ exp26/008-caffeine-load.md
@@ -24,7 +24,7 @@
Sixteen soccer-player RCTs in a [2021 systematic review and meta-analysis](https://pubmed.ncbi.nlm.nih.gov/33666113/) did not show significant pooled caffeine benefits for repeated sprint tests, vertical jump, aerobic endurance, reaction-time agility, or RPE. The RPE difference was small and non-significant (MD 0.16 points, 95% CI -0.55 to 0.87), and the certainty of evidence was very low to low.
-Seven caffeine studies in a [repeated-sprint meta-analysis](https://pubmed.ncbi.nlm.nih.gov/31036532/) also found no improvement in total work (Hedges' g = -0.01, 95% CI -0.32 to 0.31), best sprint, or last sprint. The stronger conclusion is that caffeine may help some football-relevant high-intensity protocols, not that it reliably improves every repeated-sprint or match situation.
+Seven caffeine studies in a [repeated-sprint meta-analysis](https://pubmed.ncbi.nlm.nih.gov/31036532/) also found no improvement in total work (Hedges' g = -0.01, 95% CI -0.32 to 0.31), best sprint, or last sprint. Caffeine may help some football-relevant high-intensity protocols, although it does not reliably improve every repeated-sprint or match situation.
## Dose and monitoring

009-submaximal-vo2max

exp5 rejudge 17.00 → exp26 18.00 +1.00; gate=rejected ['no_accepted_replacements', 'missing_after_judge', 'style_score_not_improved:16.00->16.00']

Repair decisions
rejected — number_tokens_changed, after_repeats_next_sentence:0.91
− Male recreational footballers give one cleaner conversion case.
+ Castagna, Krustrup and Póvoas tested 66 male recreational players against the HRmax/HRrest ratio method.
rejected — number_tokens_changed, after_repeats_next_sentence:0.88
− Professional football also shows why staff should not borrow a familiar formula without checking the sample.
+ Bangsbo’s distance-based formula underestimated VO2max in 17 Italian third-division players.
rejected — number_tokens_changed, after_repeats_next_sentence:0.67
− Female soccer evidence gives the same warning from another group.
+ The 18 German second-division players had lower direct YYIR1 and formula VO2max values than lab VO2max.
rejected — number_tokens_changed
− Heart-rate equations add another error source.
+ ACSM puts HRmax regression-equation standard errors at around ±3–12 bpm.
Unified diff

No text changes.
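The gate reasons listed for this fixture suggest a per-fixture acceptance check that runs after the sentence-level repairs. The sketch below is one way such a gate could be written; the function name, its parameters, and the fallback to the before score when no after-judge exists are assumptions chosen to match the reason strings shown above, not the pipeline's actual code.

```python
from typing import Optional


def gate_fixture(
    accepted_count: int,
    before_score: float,
    after_score: Optional[float],
) -> tuple[bool, list[str]]:
    """Return (gate_accepted, reasons); lower style scores are better."""
    reasons: list[str] = []
    if accepted_count == 0:
        reasons.append("no_accepted_replacements")
    if after_score is None:
        reasons.append("missing_after_judge")
        # Assumption: with no after-judge run, compare against the before score.
        after_score = before_score
    if after_score >= before_score:
        reasons.append(
            f"style_score_not_improved:{before_score:.2f}->{after_score:.2f}"
        )
    return (not reasons, reasons)
```

With zero accepted replacements and no after-judge score, this sketch yields the same three reasons recorded for 009-submaximal-vo2max, and the fixture keeps its baseline text.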

010-training-sanctions

exp5 rejudge 23.00 → exp26 15.00 -8.00; gate=accepted []

Repair decisions
accepted
− In the EFL, normal opposition analysis becomes a sporting-sanctions problem when a club observes or tries to observe another club’s training session inside the 72 hours before a match.
+ The EFL can punish a club that observes or tries to observe another club’s training session inside the 72 hours before a match.
accepted
− The EFL used sporting remedies because the breach sat inside a live promotion competition.
+ The EFL changed who played for promotion because the breach happened during the play-offs.
rejected — number_tokens_changed
− That matters for clubs now because ordinary preparation can stay inside match analysis; watching the closed session now sits under a specific EFL rule.
+ Clubs can keep ordinary preparation inside match analysis; Regulation 127 covers watching a closed session inside 72 hours.
Unified diff
--- exp5/010-training-sanctions.md
+++ exp26/010-training-sanctions.md
@@ -1,6 +1,6 @@
# When watching an opponent train becomes a sporting-sanctions problem
-In the EFL, normal opposition analysis becomes a sporting-sanctions problem when a club observes or tries to observe another club’s training session inside the 72 hours before a match.
+The EFL can punish a club that observes or tries to observe another club’s training session inside the 72 hours before a match.
<!---more--->
@@ -20,7 +20,7 @@
Southampton later admitted more than the Middlesbrough incident. [Sky Sports](https://www.skysports.com/football/news/11095/13545415/southampton-expelled-from-championship-play-offs-over-spygate-with-middlesbrough-reinstated) reported that the admitted breaches related to Oxford United in December 2025, Ipswich Town in April 2026, and Middlesbrough in May 2026. [BBC Sport](https://www.bbc.com/sport/football/articles/cn4p284ny2ko) reported that Southampton admitted spying on three rivals’ training sessions, lost their appeal, were expelled from the play-offs, and received a four-point deduction for 2026-27.
-The EFL used sporting remedies because the breach sat inside a live promotion competition. Middlesbrough had lost the semi-final 2-1 on aggregate, but were reinstated to face Hull City. Sky also noted the final was potentially worth about £200m in revenue. The punishment did not just fine a club after the event; it changed who played for promotion.
+The EFL changed who played for promotion because the breach happened during the play-offs. Middlesbrough had lost the semi-final 2-1 on aggregate, but were reinstated to face Hull City. Sky also noted the final was potentially worth about £200m in revenue. The punishment did not just fine a club after the event; it changed who played for promotion.
## Why Leeds was different

011-loan-option

exp5 rejudge 11.00 → exp26 8.00 -3.00; gate=accepted []

Repair decisions
accepted
− Clubs use option-to-buy loans to separate short-term access from permanent commitment: they test the player in their team, keep a pre-agreed route to buy, and can return him if the football or price is wrong.
+ With an option-to-buy loan, clubs test the player in their team, keep a pre-agreed route to buy, and can return him if the football or price is wrong.
accepted
− FIFA rules also make loans a squad-building resource, not free trial space.
+ FIFA rules limit how many professionals clubs can loan in and out.
rejected — after_generated_pattern:/^the public evidence\b/i
− The public evidence supports a mechanism, not a market-wide success rate.
+ The public evidence covers how clubs use the option after seeing the player and does not give a market-wide success rate.
Unified diff
--- exp5/011-loan-option.md
+++ exp26/011-loan-option.md
@@ -1,6 +1,6 @@
# How clubs use option-to-buy loans to reduce recruitment uncertainty
-Clubs use option-to-buy loans to separate short-term access from permanent commitment: they test the player in their team, keep a pre-agreed route to buy, and can return him if the football or price is wrong.
+With an option-to-buy loan, clubs test the player in their team, keep a pre-agreed route to buy, and can return him if the football or price is wrong.
<!---more--->
@@ -28,6 +28,6 @@
The borrowing club still carries costs during the loan. Loans can include split wages or a loan fee, and the Coutinho example shows that wage exposure clearly: Barcelona said Aston Villa would pay part of the player’s wages during the loan.
-FIFA rules also make loans a squad-building resource, not free trial space. The [2022 loan regulations](https://inside.fifa.com/organisation/media-releases/fifa-to-introduce-new-loan-regulations) require written loan terms covering duration and financial conditions, ban sub-loaning, and from July 2024 limit clubs to six professionals loaned in and six loaned out, with exemptions for under-21 and club-trained players.
+FIFA rules limit how many professionals clubs can loan in and out. The [2022 loan regulations](https://inside.fifa.com/organisation/media-releases/fifa-to-introduce-new-loan-regulations) require written loan terms covering duration and financial conditions, ban sub-loaning, and from July 2024 limit clubs to six professionals loaned in and six loaned out, with exemptions for under-21 and club-trained players.
The public evidence supports a mechanism, not a market-wide success rate. FCN’s Adel deal is the clearest case here: the club used the option only after seeing 10 matches, two goals, one assist, adaptation, dressing-room fit and playing-style fit in its own environment.

012-insufficient-positional-structure

exp5 rejudge 2.00 → exp26 1.00 -1.00; gate=rejected ['no_accepted_replacements', 'missing_after_judge', 'style_score_not_improved:2.00->2.00']

Repair decisions

No repair decisions.

Unified diff

No text changes.

Experiment history

| Run | Loss | Status | Description |
| --- | --- | --- | --- |
| exp27-live-gated-style-repair-008-validator-smoke | 13.00 | guardrail_pass | caffeine-load production-mode smoke after reliably-improve validator; gate rejected repair and preserved baseline, avoiding exp26 +5 regression |
| exp26-live-gated-style-repair-full | 16.58 | production_mode_candidate | single-draft live judge+repair+after-judge gate; 28 accepted, 14 rejected, 10/12 fixtures gate-accepted; found caffeine regression fixed by added validator in exp27 smoke |
| exp25-qa-strict-sentence-repair-rejudge | 16.42 | confirmation_pass | same exp25 drafts rejudged; still clear win vs exp5 20.92/21.33 with safer diffs than exp24 |
| exp25-qa-strict-sentence-repair | 15.58 | repair_candidate_safe | stricter validators after manual QA; 34 accepted, 5 rejected; URLs preserved; opener hits 20 to 15 |
| exp24-evaluator-guided-sentence-repair-rejudge | 14.50 | confirmation_pass | same exp24 drafts rejudged; confirmed large win vs exp5 20.92 and exp5 rejudge 21.33 |
| exp24-evaluator-guided-sentence-repair | 15.00 | repair_candidate_best | full visible exact sentence repair from severity>=2 judge examples; 38 accepted, 3 rejected, URLs preserved; opener hits 20 to 14 |
| exp24-evaluator-guided-sentence-repair-smoke | 19.33 | smoke_pass | six-fixture exact sentence repair smoke improved subset mean 29.83 to 19.33 vs exp5 rejudge |
| exp23-gated-selective-style-eval | 21.33 | cleanup_eval_failed | accepted only 004 and 010 after per-fixture acceptance gate; opener hits 20 to 18 but style_loss stayed at 21.33 vs rejudge and above exp5 20.92; not promoted |
| exp22-selective-cleaned-style-eval | 20.67 | cleanup_eval_mixed | high-confidence opener cleanup improved mean vs exp5 rejudge 21.33 but regressed 001 +4, 005 +6, 009 +3; not promoted |
| exp5-rejudge-existing | 21.33 | rejudge | existing exp5 drafts rejudged for apples-to-apples selective cleanup comparison |
| exp21-cleaned-style-eval | 23.17 | cleanup_eval_failed | existing exp21 cleaned drafts removed forced openers but worsened full visible style_loss vs exp5 20.92 |
| exp5-visible-plus-holdout | 22.93 | holdout_check | visible exp5 plus separate holdout fixture checks |