e7102c720d
CASE WHEN ... ILIKE '%X%' patterns conflicted with %s param placeholder. Escaped to %%X%%. Endpoint now returns 200 with full klubovi list + inferred davatelj_naziv (RSS / Županijski / Grad Rijeka / fallback).
4.2 KiB
Subagent C — Cross-Klub Duplicate / Stale-Transfer Detection
- Run timestamp: 2026-05-05 08:36 batch
- Scope: pgz_sport.clanovi cross-klub duplicates
- Pre-run row count: 3240 (after Subagents A and B)
Strict-Criteria Results
| Detector | Cases found |
|---|---|
| Same hns_igrac_id across multiple klub_id | 0 |
| Same lower(ime)+lower(prezime)+datum_rodenja across multiple klub_id | 0 |
| Total confirmed cross-klub duplicates requiring action | 0 |
Notes:
- Only 3 rows in clanovi have a populated `hns_igrac_id` (Subagent A already merged the 3 same-ID-same-klub duplicates). None of the surviving rows share an HNS ID across klubs.
- The brief specified `datum_rodjenja`. The canonical column with data is `datum_rodenja` (no 'j'), with 684 rows populated; `datum_rodjenja` (with 'j') has only 1 row. Both columns were checked: zero cross-klub matches by name+DOB.
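The two strict detectors can be sketched as grouped queries. The block below is a minimal illustration, not the SQL from C_sql_transcript.sql: it runs against an in-memory SQLite table instead of PostgreSQL, the sample rows are invented, and covering both DOB spellings with `COALESCE` is an assumption about how "both columns checked" was done.

```python
import sqlite3

# Hypothetical miniature of pgz_sport.clanovi with only the columns named in this report.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE clanovi (
        id INTEGER PRIMARY KEY,
        klub_id INTEGER NOT NULL,
        ime TEXT,
        prezime TEXT,
        hns_igrac_id TEXT,
        datum_rodenja TEXT,   -- canonical DOB column
        datum_rodjenja TEXT   -- alternate spelling, almost empty
    )
""")
conn.executemany(
    "INSERT INTO clanovi VALUES (?,?,?,?,?,?,?)",
    [
        (1, 10, "Ana", "Test", "H1", "2001-01-01", None),
        (2, 20, "Ana", "Test", None, None, None),  # same name, no DOB: not a strict match
        (3, 30, "Ivo", "Primjer", "H2", None, "1999-05-05"),
    ],
)

# Detector 1: the same HNS player ID appearing under more than one klub.
SAME_HNS_ID = """
    SELECT hns_igrac_id, COUNT(DISTINCT klub_id) AS n_klubs
    FROM clanovi
    WHERE hns_igrac_id IS NOT NULL AND hns_igrac_id <> ''
    GROUP BY hns_igrac_id
    HAVING COUNT(DISTINCT klub_id) > 1
"""

# Detector 2: the same lowercased name plus DOB under more than one klub.
# Note: SQLite's lower() only folds ASCII; PostgreSQL's lower() is locale-aware.
SAME_NAME_DOB = """
    SELECT lower(ime), lower(prezime),
           COALESCE(datum_rodenja, datum_rodjenja) AS dob,
           COUNT(DISTINCT klub_id) AS n_klubs
    FROM clanovi
    WHERE COALESCE(datum_rodenja, datum_rodjenja) IS NOT NULL
    GROUP BY lower(ime), lower(prezime), COALESCE(datum_rodenja, datum_rodjenja)
    HAVING COUNT(DISTINCT klub_id) > 1
"""

hns_hits = conn.execute(SAME_HNS_ID).fetchall()
dob_hits = conn.execute(SAME_NAME_DOB).fetchall()
print(len(hns_hits), len(dob_hits))  # 0 0 for the sample rows above
```

Rows without a DOB are excluded from the second detector, which is why a name-only pair such as rows 1 and 2 in the sample never counts as a strict match.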
Soft Match (Review-Only, NO Mutation)
A weaker name-only check (same lower(ime)+lower(prezime), ignoring DOB) returned 56 candidate groups / 117 rows spanning multiple klub_ids. Per brief instruction "halt if unsure → write to review-only", these were NOT modified.
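A read-only version of the name-only check might look like the sketch below. It again uses an in-memory SQLite table with invented rows as a stand-in for pgz_sport.clanovi, and only issues a `SELECT`, in line with the no-mutation rule.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE clanovi (id INTEGER PRIMARY KEY, klub_id INTEGER, ime TEXT, prezime TEXT)"
)
conn.executemany(
    "INSERT INTO clanovi VALUES (?,?,?,?)",
    [
        (1, 10, "Ana", "Test"),
        (2, 20, "ana", "test"),     # same name, different klub: soft-match candidate
        (3, 20, "Ana", "Test"),     # third row of the same group
        (4, 30, "Ivo", "Primjer"),  # unique name: not a candidate
        (5, 40, "Eva", "Uzorak"),
        (6, 40, "Eva", "Uzorak"),   # same name, same klub: intra-klub, out of scope here
    ],
)

# Group by lowercased name only (DOB ignored) and keep groups spanning several klubs.
SOFT_MATCH = """
    SELECT lower(ime) AS ime_l, lower(prezime) AS prezime_l,
           COUNT(*) AS n_rows, COUNT(DISTINCT klub_id) AS n_klubs
    FROM clanovi
    GROUP BY lower(ime), lower(prezime)
    HAVING COUNT(DISTINCT klub_id) > 1
"""

groups = conn.execute(SOFT_MATCH).fetchall()
total_rows = sum(g[2] for g in groups)
print(f"{len(groups)} candidate groups / {total_rows} rows")  # 1 candidate groups / 3 rows
```

Counting both groups and rows mirrors how the result is reported here (56 groups / 117 rows).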
Why review-only and not stale-purge:
- Different source pipelines (`godisnjak_2025_HOO`, `hbs_savez`, `hns_semafor`, `klub_web`, `klub_web_v2`, `manual`) index the SAME real person under DIFFERENT klub_id rows because savezi (federations) and individual clubs are distinct legal entities. A water-polo player listed in HBS savez (klub_id 2599 = the savez "klub" container) AND in HOO godisnjak (klub_id 544) is not a transfer: he is the same active player viewed from two registries.
- Croatian names like "Ivan Vuletić", "Marko Komadina", "Tomislav Katalenić" are common; without DOB confirmation, soft matches are unreliable.
- All 117 rows have `aktivni_status='aktivan'` and were created within ~5 days of each other (2026-04-29 to 2026-05-03), which fits the brief's edge case "both active AND created_at within 30 days → LEGITIMATE in-season".
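The edge-case test in the last bullet can be expressed as a small predicate. This is a sketch under one stated assumption: "created_at within 30 days" is read as "all rows in the group were created within 30 days of each other", which is how the ~5-day spread is interpreted above. The function name and the row layout are hypothetical.

```python
from datetime import date, timedelta


def is_legit_in_season(rows, window_days=30):
    """True when every row is active and all created_at dates lie within window_days of each other."""
    if len(rows) < 2:
        return False  # a single row cannot form a cross-klub pair
    all_active = all(r["aktivni_status"] == "aktivan" for r in rows)
    created = [r["created_at"] for r in rows]
    span = max(created) - min(created)
    return all_active and span <= timedelta(days=window_days)


# The creation window reported for the 117 soft-match rows.
group = [
    {"aktivni_status": "aktivan", "created_at": date(2026, 4, 29)},
    {"aktivni_status": "aktivan", "created_at": date(2026, 5, 3)},
]
print(is_legit_in_season(group))  # True
```

A `True` result only says the group matches the in-season pattern; it does not confirm that the rows describe the same person.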
Decisions
| Decision | Count |
|---|---|
| LEGITIMATE transfer (tagged secondary_klub) | 0 |
| STALE transfer (purged + reparented) | 0 |
| REVIEW_ONLY (soft match, awaiting human review) | 56 groups (117 rows) |
| Hard delete | 0 |
No mutations were performed. Backup pgz_sport.clanovi_backup_20260505_0836 untouched. Row count stays at 3240.
Sample Soft Matches (full list in C_TRANSFERS.json)
- Niko Janković: 4 rows, 2 distinct klubs ({1,2362}). One row (id 4132) has `datum_rodenja=2001-08-25` from `hns_semafor`; others lack DOB. Same person across HOO godisnjak / klub_web / hns_semafor / manual ingestion of klub_id 2362. Soft match only; needs DOB-fill before any merge.
- Cherno Saho: 3 rows, 2 distinct klubs ({2362,3840}). One row has `datum_rodenja=2005-01-07`. The two klub_id 2362 rows are likely intra-klub dups (Subagent A scope, hns_igrac_id was missing). The 3840 row may be an actual transfer or a savez/klub indexing split. Needs human review.
- David Pekar (id 464 klub 2200, id 1021 klub 428). Row 464 has `datum_rodenja=2008-11-24` from `hns_semafor`; row 1021 from `hbs_savez` lacks DOB. Likely the same youth player ingested from two savezi. Cannot confirm without DOB on both; soft match only.
Reasoning Summary
This run follows the brief's hard rule: "NEVER act on a case without recording reasoning in the JSON. Halt if unsure". The strict criteria yielded zero actionable cases. The 56 soft-match groups all fall under at least one safe-haven rule:
- Both rows `aktivni_status='aktivan'` and created_at within 30 days → LEGITIMATE in-season → tag-only allowed, but tagging without DOB confirmation could mis-tag distinct people sharing a name. Therefore REVIEW_ONLY.
- No transfer table exists in pgz_sport beyond `clan_sezona` (season stats, keyed on clan_id+sezona+natjecanje+klub_naziv, no source-clan pointer). Cannot programmatically infer "transfer" vs "duplicate".
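The "no action without recorded reasoning" rule can be enforced in code by refusing to build a decision record that has no reasoning text. The sketch below is illustrative only: the field names (`decision`, `row_ids`, `reasoning`) and both function names are hypothetical and do not describe the actual layout of C_TRANSFERS.json. The sample rows are the David Pekar pair listed above.

```python
import json

ALLOWED_DECISIONS = {"LEGITIMATE", "STALE", "REVIEW_ONLY"}


def record_decision(decision, row_ids, reasoning):
    """Build a decision record; reject any record that has no reasoning."""
    if decision not in ALLOWED_DECISIONS:
        raise ValueError(f"unknown decision: {decision}")
    if not reasoning or not reasoning.strip():
        raise ValueError("reasoning is required before any decision is recorded")
    return {"decision": decision, "row_ids": sorted(row_ids), "reasoning": reasoning.strip()}


def decide_soft_match(rows):
    """Name-only matches are never acted on; they are routed to human review."""
    ids = [r["id"] for r in rows]
    missing_dob = [r["id"] for r in rows if not r.get("datum_rodenja")]
    if missing_dob:
        why = f"Name-only match; rows {missing_dob} lack DOB, so identity cannot be confirmed."
    else:
        why = "Name-only match; DOB present on all rows, re-check with the strict name+DOB detector."
    return record_decision("REVIEW_ONLY", ids, why)


rows = [
    {"id": 464, "klub_id": 2200, "datum_rodenja": "2008-11-24"},
    {"id": 1021, "klub_id": 428, "datum_rodenja": None},
]
record = decide_soft_match(rows)
print(json.dumps(record, ensure_ascii=False, indent=2))
```

Raising on empty reasoning, rather than defaulting to a placeholder string, keeps an unexplained purge or tag from ever reaching the output file.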
Post-run Counts
| Metric | Value |
|---|---|
| pgz_sport.clanovi rows | 3240 (unchanged) |
| Rows mutated | 0 |
| Rows soft-deleted to clanovi_purged | 0 |
| sys_audit rows added | 1 (C_DETECTION_RUN summary) |
Errors
None.
Files
- `C_TRANSFERS.json`: structured decisions and full review-only cases
- `C_sql_transcript.sql`: SQL run during this subagent