Dashboard top-primatelji: psycopg2 LIKE escape fix (%% in CASE WHEN)
The CASE WHEN ... ILIKE '%X%' patterns conflicted with psycopg2's %s parameter placeholder, so the literal percent signs are now escaped as %%X%%. The endpoint now returns 200 with the full klubovi (clubs) list plus an inferred davatelj_naziv (RSS / Županijski / Grad Rijeka / fallback).
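
A minimal sketch of why the doubling is needed, assuming the query is sent with a parameter tuple. Plain Python %-formatting stands in for psycopg2's client-side substitution here, so no database or driver is required; the column `naziv` and table `t` are hypothetical.

```python
# Illustration only: Python's %-operator mimics the placeholder pass that
# psycopg2 applies when parameters are supplied. Names are hypothetical.
unescaped = "SELECT CASE WHEN naziv ILIKE '%RSS%' THEN 'RSS' END FROM t WHERE id = %s"
escaped = "SELECT CASE WHEN naziv ILIKE '%%RSS%%' THEN 'RSS' END FROM t WHERE id = %s"


def render(query, params):
    """Substitute params with plain %-formatting (no real SQL quoting)."""
    return query % params


try:
    render(unescaped, (7,))
    unescaped_ok = True
except (ValueError, TypeError):
    # '%R' is read as a (malformed) format specifier, so substitution fails.
    unescaped_ok = False

rendered = render(escaped, (7,))
print(unescaped_ok)
print(rendered)
```

In the escaped form each `%%` collapses to a single literal `%`, so the database still receives the pattern `'%RSS%'`.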
@@ -0,0 +1,68 @@
# Subagent C — Cross-Klub Duplicate / Stale-Transfer Detection
Run timestamp: 2026-05-05 08:36 batch
Scope: pgz_sport.clanovi cross-klub duplicates
Pre-run row count: 3240 (after Subagents A and B)
## Strict-Criteria Results
| Detector | Cases found |
|---|---|
| Same `hns_igrac_id` across multiple `klub_id` | 0 |
| Same `lower(ime)+lower(prezime)+datum_rodenja` across multiple `klub_id` | 0 |
| **Total confirmed cross-klub duplicates requiring action** | **0** |

Notes:
- Only 3 rows in clanovi have a populated `hns_igrac_id` (Subagent A already merged the 3 same-ID-same-klub duplicates). None of the surviving rows share an HNS ID across klubs.
- Brief specified `datum_rodjenja`. The canonical column with data is `datum_rodenja` (no 'j'); 684 rows populated. `datum_rodjenja` (with 'j') has only 1 row. Both columns checked — zero cross-klub matches by name+DOB.
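
The two strict detectors can be sketched as grouped queries. This is an illustration under assumptions, not the SQL from `C_sql_transcript.sql`: it runs against an in-memory SQLite table with made-up rows, and folding the two DOB columns together with `COALESCE` is one possible way to "check both columns".

```python
import sqlite3

# Hypothetical fixture: column names follow the report, the rows are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE clanovi (
           id INTEGER PRIMARY KEY, klub_id INTEGER, ime TEXT, prezime TEXT,
           hns_igrac_id TEXT, datum_rodenja TEXT, datum_rodjenja TEXT)"""
)
conn.executemany(
    "INSERT INTO clanovi VALUES (?, ?, ?, ?, ?, ?, ?)",
    [
        (1, 10, "Ana", "Horvat", "H1", "2001-01-01", None),
        (2, 50, "Ana", "Horvat", "H1", "2001-01-01", None),  # same HNS ID, other klub
        (3, 10, "Ivo", "Novak", None, None, None),
        (4, 20, "ivo", "novak", None, None, None),           # same name, no DOB
        (5, 30, "Luka", "Kovac", None, "2003-03-03", None),
        (6, 40, "LUKA", "KOVAC", None, None, "2003-03-03"),  # DOB in the 'j' column
    ],
)

# Detector 1: one hns_igrac_id appearing under more than one klub_id.
by_id = conn.execute(
    """SELECT hns_igrac_id, COUNT(DISTINCT klub_id)
       FROM clanovi
       WHERE hns_igrac_id IS NOT NULL
       GROUP BY hns_igrac_id
       HAVING COUNT(DISTINCT klub_id) > 1"""
).fetchall()

# Detector 2: same lowercased name and DOB under more than one klub_id.
by_name_dob = conn.execute(
    """SELECT lower(ime), lower(prezime),
              COALESCE(datum_rodenja, datum_rodjenja), COUNT(DISTINCT klub_id)
       FROM clanovi
       WHERE COALESCE(datum_rodenja, datum_rodjenja) IS NOT NULL
       GROUP BY lower(ime), lower(prezime), COALESCE(datum_rodenja, datum_rodjenja)
       HAVING COUNT(DISTINCT klub_id) > 1
       ORDER BY lower(ime), lower(prezime)"""
).fetchall()

print(by_id)
print(by_name_dob)
```

Rows without any DOB (ids 3 and 4 in the fixture) never reach the second detector, which is why name-only matches need the separate soft check.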
## Soft Match (Review-Only, NO Mutation)
A weaker name-only check (same `lower(ime)+lower(prezime)`, ignoring DOB) returned **56 candidate groups / 117 rows** spanning multiple klub_ids. Per brief instruction "halt if unsure → write to review-only", these were NOT modified.

Why review-only and not stale-purge:
- Different source pipelines (`godisnjak_2025_HOO`, `hbs_savez`, `hns_semafor`, `klub_web`, `klub_web_v2`, `manual`) index the same real person under different klub_id rows, because federations (savezi) and individual clubs are distinct legal entities. A water-polo player listed in the HBS savez (klub_id 2599, the savez "klub" container) and in the HOO godisnjak (klub_id 544) is not a transfer: he is the same active player viewed from two registries.
- Croatian names like "Ivan Vuletić", "Marko Komadina", "Tomislav Katalenić" are common; without DOB confirmation, soft matches are unreliable.
- All 117 rows have `aktivni_status='aktivan'` and were created within ~5 days of each other (2026-04-29 to 2026-05-03) — fits the brief's edge case "both active AND created_at within 30 days → LEGITIMATE in-season".
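
The soft match and the in-season edge case can be sketched in plain Python. The `Clan` dataclass and the sample rows are hypothetical, and reading "within 30 days" as "all rows created within 30 days of each other" is an assumption about the brief's wording.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass
class Clan:
    """Hypothetical in-memory view of one clanovi row."""
    id: int
    klub_id: int
    ime: str
    prezime: str
    aktivni_status: str
    created_at: date


def soft_match_groups(rows):
    """Group by lowercased name; keep groups that span more than one klub_id."""
    groups = defaultdict(list)
    for r in rows:
        groups[(r.ime.lower(), r.prezime.lower())].append(r)
    return {k: v for k, v in groups.items() if len({r.klub_id for r in v}) > 1}


def in_season_edge_case(group, window_days=30):
    """True when every row is 'aktivan' and creation dates span <= window_days."""
    all_active = all(r.aktivni_status == "aktivan" for r in group)
    created = [r.created_at for r in group]
    return all_active and (max(created) - min(created)).days <= window_days


# Invented rows; only the date range mirrors the report.
rows = [
    Clan(1, 10, "Ana", "Horvat", "aktivan", date(2026, 4, 29)),
    Clan(2, 20, "ana", "horvat", "aktivan", date(2026, 5, 3)),
    Clan(3, 10, "Ivo", "Novak", "aktivan", date(2026, 5, 1)),
]
groups = soft_match_groups(rows)
for name, members in groups.items():
    print(name, [m.id for m in members], in_season_edge_case(members))
```

Passing the edge-case check only says the rows look like an in-season pair; it does not confirm that they describe the same person.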
## Decisions
| Decision | Count |
|---|---|
| LEGITIMATE transfer (tagged secondary_klub) | 0 |
| STALE transfer (purged + reparented) | 0 |
| REVIEW_ONLY (soft match, awaiting human review) | 56 groups (117 rows) |
| Hard delete | 0 |

No mutations were performed. Backup `pgz_sport.clanovi_backup_20260505_0836` untouched. Row count stays at 3240.
## Sample Soft Matches (full list in C_TRANSFERS.json)
1. **Niko Janković** — 4 rows, 2 distinct klubs ({1,2362}). One row (id 4132) has `datum_rodenja=2001-08-25` from `hns_semafor`; others lack DOB. Same person across HOO godisnjak / klub_web / hns_semafor / manual ingestion of klub_id 2362. Soft match only — needs DOB-fill before any merge.
2. **Cherno Saho** — 3 rows, 2 distinct klubs ({2362,3840}). One row has `datum_rodenja=2005-01-07`. The two klub_id 2362 rows are likely intra-klub dups (Subagent A scope, hns_igrac_id was missing). The 3840 row may be an actual transfer or a savez/klub indexing split. Needs human review.
3. **David Pekar** (id 464 klub 2200, id 1021 klub 428). Row 464 has `datum_rodenja=2008-11-24` from `hns_semafor`; row 1021 from `hbs_savez` lacks DOB. Likely the same youth player ingested from two federations (savezi). Cannot be confirmed without a DOB on both rows, so this stays a soft match only.
## Reasoning Summary
The brief's hard rule is "NEVER act on a case without recording reasoning in the JSON. Halt if unsure". The strict criteria yielded zero actionable cases. The 56 soft-match groups all fall under at least one safe-haven rule:
- Both rows `aktivni_status='aktivan'` and created_at within 30 days → LEGITIMATE in-season → tag-only allowed, but tagging without DOB confirmation could mis-tag distinct people sharing a name. Therefore REVIEW_ONLY.
- No transfer table exists in pgz_sport beyond `clan_sezona` (season stats, keyed on clan_id+sezona+natjecanje+klub_naziv, no source-clan pointer). Cannot programmatically infer "transfer" vs "duplicate".
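
The reasoning above can be read as a small decision function that always returns a reason alongside the decision. This is a sketch only: the `decide` helper, its reason strings, and the JSON record layout are hypothetical and do not reflect the actual structure of `C_TRANSFERS.json`.

```python
import json
from typing import Optional, Sequence, Tuple


def decide(dobs: Sequence[Optional[str]], in_season: bool) -> Tuple[str, str]:
    """Return (decision, reasoning) for one soft-match group; mutates nothing."""
    if any(d is None for d in dobs):
        return "REVIEW_ONLY", "name-only match; DOB missing on at least one row"
    if len(set(dobs)) > 1:
        return "REVIEW_ONLY", "DOBs differ; possibly distinct people sharing a name"
    if in_season:
        return "LEGITIMATE", "same DOB, all rows active and created within the window"
    return "REVIEW_ONLY", "no transfer table to tell a stale transfer from a duplicate"


# Niko Janković per the report: 4 rows, only one of them carries a DOB.
decision, reasoning = decide(["2001-08-25", None, None, None], in_season=True)
record = json.dumps(
    {"name": "Niko Janković", "decision": decision, "reasoning": reasoning},
    ensure_ascii=False,
)
print(record)
```

Every branch that is not fully confirmed falls back to `REVIEW_ONLY`, which matches the "halt if unsure" instruction quoted from the brief.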
## Post-run Counts
| Metric | Value |
|---|---|
| pgz_sport.clanovi rows | 3240 (unchanged) |
| Rows mutated | 0 |
| Rows soft-deleted to clanovi_purged | 0 |
| sys_audit rows added | 1 (C_DETECTION_RUN summary) |
## Errors
None.
## Files
- `C_TRANSFERS.json` — structured decisions and full review-only cases
- `C_sql_transcript.sql` — SQL run during this subagent