Subagent C — Cross-Klub Duplicate / Stale-Transfer Detection

  • Run timestamp: 2026-05-05 08:36 batch
  • Scope: pgz_sport.clanovi cross-klub duplicates
  • Pre-run row count: 3240 (after Subagents A and B)

Strict-Criteria Results

| Detector | Cases found |
|---|---|
| Same hns_igrac_id across multiple klub_id | 0 |
| Same lower(ime)+lower(prezime)+datum_rodenja across multiple klub_id | 0 |
| Total confirmed cross-klub duplicates requiring action | 0 |
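The two strict detectors can be sketched as grouped queries. This is a minimal illustration, not the SQL actually run (that is in C_sql_transcript.sql): it uses an in-memory SQLite table as a stand-in for pgz_sport.clanovi, and the sample rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed minimal subset of the clanovi columns named in this report.
conn.execute(
    "CREATE TABLE clanovi (id INTEGER PRIMARY KEY, klub_id INTEGER, "
    "ime TEXT, prezime TEXT, datum_rodenja TEXT, hns_igrac_id TEXT)"
)
# Invented sample rows (ASCII names, since SQLite lower() is ASCII-only).
conn.executemany(
    "INSERT INTO clanovi VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "Ana", "Test", "2000-01-01", None),
        (2, 20, "ana", "TEST", "2000-01-01", None),
        (3, 10, "Ana", "Test", None, "H1"),
        (4, 10, "Ana", "Test", None, "H1"),  # same ID, same klub: not cross-klub
    ],
)

# Detector 1: one hns_igrac_id appearing under more than one klub_id.
BY_HNS_ID = """
SELECT hns_igrac_id, COUNT(DISTINCT klub_id)
FROM clanovi
WHERE hns_igrac_id IS NOT NULL
GROUP BY hns_igrac_id
HAVING COUNT(DISTINCT klub_id) > 1
"""

# Detector 2: same lowercased name plus DOB under more than one klub_id.
BY_NAME_DOB = """
SELECT lower(ime), lower(prezime), datum_rodenja, COUNT(DISTINCT klub_id)
FROM clanovi
WHERE datum_rodenja IS NOT NULL
GROUP BY lower(ime), lower(prezime), datum_rodenja
HAVING COUNT(DISTINCT klub_id) > 1
"""

hns_hits = conn.execute(BY_HNS_ID).fetchall()
name_dob_hits = conn.execute(BY_NAME_DOB).fetchall()
```

On the invented rows, the first detector finds nothing (both H1 rows sit in the same klub) and the second flags the one name+DOB pair that spans two klubs.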

Notes:

  • Only 3 rows in clanovi have a populated hns_igrac_id (Subagent A already merged the 3 same-ID-same-klub duplicates). None of the surviving rows share an HNS ID across klubs.
  • The brief specified datum_rodjenja, but the canonical column that actually holds data is datum_rodenja (no 'j'), with 684 rows populated; datum_rodjenja (with 'j') has only 1 populated row. Both columns were checked: zero cross-klub matches by name+DOB.
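One way to check both DOB columns in a single pass is to resolve them to one value per row before matching. The helper below is a hypothetical sketch, not the code used in this run; the function name and the choice to raise on conflicting values are assumptions.

```python
def effective_dob(row):
    """Return the row's date of birth, preferring datum_rodenja.

    Falls back to the sparsely populated datum_rodjenja column.
    Raising on a conflict is an assumption of this sketch, so that
    disagreeing values are never silently merged.
    """
    primary = row.get("datum_rodenja")
    legacy = row.get("datum_rodjenja")
    if primary and legacy and primary != legacy:
        raise ValueError(f"conflicting DOB values on row {row.get('id')}")
    return primary or legacy
```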

Soft Match (Review-Only, NO Mutation)

A weaker name-only check (same lower(ime)+lower(prezime), ignoring DOB) returned 56 candidate groups / 117 rows spanning multiple klub_ids. Per brief instruction "halt if unsure → write to review-only", these were NOT modified.
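The name-only check can be sketched in a few lines of Python. This is an illustrative reconstruction under the assumption that rows are available as dictionaries; the David Pekar rows use the ids and klub_ids listed in the samples in this report, and the third row is invented.

```python
from collections import defaultdict

def soft_match_groups(rows):
    """Group rows by lowercased (ime, prezime), ignoring DOB.

    Only groups whose rows span more than one klub_id are returned.
    """
    by_name = defaultdict(list)
    for row in rows:
        by_name[(row["ime"].lower(), row["prezime"].lower())].append(row)
    return {
        name: members
        for name, members in by_name.items()
        if len({m["klub_id"] for m in members}) > 1
    }

sample = [
    {"id": 464, "klub_id": 2200, "ime": "David", "prezime": "Pekar"},
    {"id": 1021, "klub_id": 428, "ime": "David", "prezime": "Pekar"},
    {"id": 9001, "klub_id": 428, "ime": "Ana", "prezime": "Test"},  # invented
]

groups = soft_match_groups(sample)
group_count = len(groups)
row_count = sum(len(members) for members in groups.values())
```

The two counters correspond to the "candidate groups / rows" figures reported above; the result is read-only and feeds the review list, not any mutation.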

Why review-only and not stale-purge:

  • Different source pipelines (godisnjak_2025_HOO, hbs_savez, hns_semafor, klub_web, klub_web_v2, manual) index the SAME real person under DIFFERENT klub_id rows, because federations (savezi) and individual clubs are distinct legal entities. A water-polo player listed in the HBS savez (klub_id 2599 = the savez "klub" container) AND in the HOO godisnjak (klub_id 544) is not a transfer: he is the same active player viewed from two registries.
  • Croatian names like "Ivan Vuletić", "Marko Komadina", "Tomislav Katalenić" are common; without DOB confirmation, soft matches are unreliable.
  • All 117 rows have aktivni_status='aktivan' and were created within ~5 days of each other (2026-04-29 to 2026-05-03), which fits the brief's edge case "both active AND created_at within 30 days → LEGITIMATE in-season".
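The edge-case rule quoted in the last bullet can be expressed as a small predicate. This is a sketch under assumptions: created_at is taken to be a Python date, the function name is hypothetical, and 'neaktivan' is an invented status value used only to show the negative case.

```python
from datetime import date

def is_in_season(rows, window_days=30):
    """True when every row is active and all created_at values fall
    within window_days of each other."""
    all_active = all(r["aktivni_status"] == "aktivan" for r in rows)
    created = [r["created_at"] for r in rows]
    spread_days = (max(created) - min(created)).days
    return all_active and spread_days <= window_days

pair = [
    {"aktivni_status": "aktivan", "created_at": date(2026, 4, 29)},
    {"aktivni_status": "aktivan", "created_at": date(2026, 5, 3)},
]
in_season = is_in_season(pair)

# Negative case: 'neaktivan' is an invented value, not taken from the schema.
mixed = [pair[0], {"aktivni_status": "neaktivan", "created_at": date(2026, 5, 3)}]
mixed_in_season = is_in_season(mixed)
```

Passing this predicate only establishes that a group matches the in-season edge case; it says nothing about whether the rows describe the same person.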

Decisions

| Decision | Count |
|---|---|
| LEGITIMATE transfer (tagged secondary_klub) | 0 |
| STALE transfer (purged + reparented) | 0 |
| REVIEW_ONLY (soft match, awaiting human review) | 56 groups (117 rows) |
| Hard delete | 0 |

No mutations were performed. Backup pgz_sport.clanovi_backup_20260505_0836 untouched. Row count stays at 3240.
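A no-mutation claim like this can be checked by comparing the live table against its backup. The sketch below is an assumption about how such a check might look, not the check used in this run; it uses SQLite in memory with invented rows, and the function name is hypothetical.

```python
import sqlite3

def tables_identical(conn, table, backup):
    """Return (row_count, identical) for two tables with the same columns."""
    # Identifiers are interpolated into the SQL, so pass only trusted names.
    count = conn.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    only_live = conn.execute(
        f"SELECT COUNT(*) FROM "
        f"(SELECT * FROM {table} EXCEPT SELECT * FROM {backup}) AS d"
    ).fetchone()[0]
    only_backup = conn.execute(
        f"SELECT COUNT(*) FROM "
        f"(SELECT * FROM {backup} EXCEPT SELECT * FROM {table}) AS d"
    ).fetchone()[0]
    return count, only_live == 0 and only_backup == 0

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE clanovi (id INTEGER PRIMARY KEY, klub_id INTEGER, ime TEXT)")
conn.executemany("INSERT INTO clanovi VALUES (?, ?, ?)", [(1, 10, "Ana"), (2, 20, "Iva")])
conn.execute("CREATE TABLE clanovi_backup AS SELECT * FROM clanovi")

before = tables_identical(conn, "clanovi", "clanovi_backup")
conn.execute("UPDATE clanovi SET klub_id = 30 WHERE id = 2")  # simulated mutation
after = tables_identical(conn, "clanovi", "clanovi_backup")
```

A matching row count alone would miss in-place updates, which is why the sketch also compares row contents in both directions.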

Sample Soft Matches (full list in C_TRANSFERS.json)

  1. Niko Janković — 4 rows, 2 distinct klubs ({1,2362}). One row (id 4132) has datum_rodenja=2001-08-25 from hns_semafor; others lack DOB. Same person across HOO godisnjak / klub_web / hns_semafor / manual ingestion of klub_id 2362. Soft match only — needs DOB-fill before any merge.
  2. Cherno Saho — 3 rows, 2 distinct klubs ({2362,3840}). One row has datum_rodenja=2005-01-07. The two klub_id 2362 rows are likely intra-klub dups (Subagent A scope, hns_igrac_id was missing). The 3840 row may be an actual transfer or a savez/klub indexing split. Needs human review.
  3. David Pekar (id 464 klub 2200, id 1021 klub 428). Row 464 has datum_rodenja=2008-11-24 from hns_semafor; row 1021 from hbs_savez lacks DOB. Likely the same youth player ingested from two federations (savezi). Cannot be confirmed without DOB on both rows, so soft match only.

Reasoning Summary

The brief's hard rule is "NEVER act on a case without recording reasoning in the JSON. Halt if unsure". Under that rule, the strict criteria yielded zero actionable cases, and the 56 soft-match groups all fall under at least one safe-haven rule:

  • Both rows aktivni_status='aktivan' and created_at within 30 days → LEGITIMATE in-season → tag-only allowed, but tagging without DOB confirmation could mis-tag distinct people sharing a name. Therefore REVIEW_ONLY.
  • No transfer table exists in pgz_sport beyond clan_sezona (season stats, keyed on clan_id+sezona+natjecanje+klub_naziv, no source-clan pointer). Cannot programmatically infer "transfer" vs "duplicate".
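The reasoning above can be condensed into a decision function that always attaches a reason to its outcome. This is a hypothetical sketch: the function name, the dictionary keys and the reason strings are assumptions, and the actual record layout of C_TRANSFERS.json is not shown in this report. The STALE branch is deliberately left out because its criteria are not spelled out here.

```python
import json

def decide(dob_confirmed: bool, in_season: bool) -> dict:
    """Return a decision plus its recorded reasoning (never one without the other)."""
    if not dob_confirmed:
        # Name-only matches may be distinct people who share a name.
        return {
            "decision": "REVIEW_ONLY",
            "reasoning": "name-only match; DOB missing on at least one row",
        }
    if in_season:
        return {
            "decision": "LEGITIMATE",
            "reasoning": "DOB confirmed; all rows active and created within 30 days",
        }
    # Anything else is treated as unsure, so it halts at review.
    return {
        "decision": "REVIEW_ONLY",
        "reasoning": "DOB confirmed but not in-season; needs human review",
    }

record = decide(dob_confirmed=False, in_season=True)
serialized = json.dumps(record, ensure_ascii=False)
```

With no DOB confirmation available for any soft-match group, every group in this run would land in the first branch.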

Post-run Counts

| Metric | Value |
|---|---|
| pgz_sport.clanovi rows | 3240 (unchanged) |
| Rows mutated | 0 |
| Rows soft-deleted to clanovi_purged | 0 |
| sys_audit rows added | 1 (C_DETECTION_RUN summary) |

Errors

None.

Files

  • C_TRANSFERS.json — structured decisions and full review-only cases
  • C_sql_transcript.sql — SQL run during this subagent