Dashboard top-primatelji: psycopg2 LIKE escape fix (%% in CASE WHEN)

CASE WHEN ... ILIKE '%X%' patterns conflicted with %s param placeholder.
Escaped to %%X%%. Endpoint now returns 200 with full klubovi list +
inferred davatelj_naziv (RSS / Županijski / Grad Rijeka / fallback).
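
To illustrate the conflict, here is a minimal sketch using Python's own printf-style `%` operator as a stand-in for client-side parameter binding. psycopg2 documents the same convention: when parameters are passed, a bare `%` in the query text starts a placeholder and `%%` yields a literal `%`. The helper name `render` and the query fragments are hypothetical, and no database is involved.

```python
def render(template: str, params: tuple) -> str:
    """Apply printf-style substitution as a stand-in for query parameter binding."""
    return template % params


# Hypothetical query fragments; only the escaping differs.
unescaped = "WHERE napomena ILIKE '%PGZ%' AND godina = %s"
escaped = "WHERE napomena ILIKE '%%PGZ%%' AND godina = %s"

try:
    render(unescaped, (2025,))
    unescaped_ok = True
except (TypeError, ValueError):
    # '%P' is read as a (malformed) placeholder, so substitution fails.
    unescaped_ok = False

# With doubled percent signs the pattern survives and %s is still bound.
rendered = render(escaped, (2025,))
```

An alternative that avoids escaping altogether is to pass each pattern as its own parameter (`ILIKE %s` with a value such as `'%PGZ%'`), at the cost of a longer parameter tuple.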
2026-05-05 09:01:25 +02:00
parent b95b2e8423
commit e7102c720d
4 changed files with 2139 additions and 0 deletions
File diff suppressed because it is too large.
@@ -0,0 +1,68 @@
# Subagent C — Cross-Klub Duplicate / Stale-Transfer Detection
Run timestamp: 2026-05-05 08:36 batch
Scope: pgz_sport.clanovi cross-klub duplicates
Pre-run row count: 3240 (after Subagents A and B)
## Strict-Criteria Results
| Detector | Cases found |
|---|---|
| Same `hns_igrac_id` across multiple `klub_id` | 0 |
| Same `lower(ime)+lower(prezime)+datum_rodenja` across multiple `klub_id` | 0 |
| **Total confirmed cross-klub duplicates requiring action** | **0** |
Notes:
- Only 3 rows in clanovi have a populated `hns_igrac_id` (Subagent A already merged the 3 same-ID-same-klub duplicates). None of the surviving rows share an HNS ID across klubs.
- Brief specified `datum_rodjenja`. The canonical column with data is `datum_rodenja` (no 'j'); 684 rows populated. `datum_rodjenja` (with 'j') has only 1 row. Both columns checked — zero cross-klub matches by name+DOB.
## Soft Match (Review-Only, NO Mutation)
A weaker name-only check (same `lower(ime)+lower(prezime)`, ignoring DOB) returned **56 candidate groups / 117 rows** spanning multiple klub_ids. Per brief instruction "halt if unsure → write to review-only", these were NOT modified.
Why review-only and not stale-purge:
- Different source pipelines (`godisnjak_2025_HOO`, `hbs_savez`, `hns_semafor`, `klub_web`, `klub_web_v2`, `manual`) index the SAME real person under DIFFERENT klub_id rows because federations (savezi) and individual clubs are distinct legal entities. A water-polo player listed in HBS savez (klub_id 2599 = the savez "klub" container) AND in HOO godisnjak (klub_id 544) is not a transfer; he is the same active player viewed from two registries.
- Croatian names like "Ivan Vuletić", "Marko Komadina", "Tomislav Katalenić" are common; without DOB confirmation, soft matches are unreliable.
- All 117 rows have `aktivni_status='aktivan'` and were created within ~5 days of each other (2026-04-29 to 2026-05-03) — fits the brief's edge case "both active AND created_at within 30 days → LEGITIMATE in-season".
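
The name-only check above can be sketched in Python as follows. This is an illustration under assumptions, not the code that produced the 56 groups: the function name `soft_match_groups` and the sample rows are hypothetical, and Python's `str.lower()` may fold non-ASCII characters differently from PostgreSQL's `lower()` depending on the database locale.

```python
from collections import defaultdict


def soft_match_groups(rows):
    """Group rows by lowercased (ime, prezime); keep groups spanning more than one klub_id."""
    groups = defaultdict(list)
    for row in rows:
        # Mirror the SQL filter: skip rows with a missing first or last name.
        if row["ime"] is None or row["prezime"] is None:
            continue
        groups[(row["ime"].lower(), row["prezime"].lower())].append(row)
    return {
        key: members
        for key, members in groups.items()
        if len({m["klub_id"] for m in members}) > 1
    }


# Hypothetical sample rows, not taken from pgz_sport.clanovi.
sample = [
    {"id": 1, "klub_id": 10, "ime": "Ana", "prezime": "Primjer"},
    {"id": 2, "klub_id": 20, "ime": "ana", "prezime": "PRIMJER"},
    {"id": 3, "klub_id": 10, "ime": "Iva", "prezime": "Test"},
    {"id": 4, "klub_id": 10, "ime": "Iva", "prezime": "Test"},
]
matches = soft_match_groups(sample)
```

Rows 3 and 4 share a name but sit in the same klub, so only the first pair is reported; same-klub duplicates are outside this detector's scope.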
## Decisions
| Decision | Count |
|---|---|
| LEGITIMATE transfer (tagged secondary_klub) | 0 |
| STALE transfer (purged + reparented) | 0 |
| REVIEW_ONLY (soft match, awaiting human review) | 56 groups (117 rows) |
| Hard delete | 0 |
No mutations were performed. Backup `pgz_sport.clanovi_backup_20260505_0836` untouched. Row count stays at 3240.
## Sample Soft Matches (full list in C_TRANSFERS.json)
1. **Niko Janković** — 4 rows, 2 distinct klubs ({1,2362}). One row (id 4132) has `datum_rodenja=2001-08-25` from `hns_semafor`; others lack DOB. Same person across HOO godisnjak / klub_web / hns_semafor / manual ingestion of klub_id 2362. Soft match only — needs DOB-fill before any merge.
2. **Cherno Saho** — 3 rows, 2 distinct klubs ({2362,3840}). One row has `datum_rodenja=2005-01-07`. The two klub_id 2362 rows are likely intra-klub dups (Subagent A scope, hns_igrac_id was missing). The 3840 row may be an actual transfer or a savez/klub indexing split. Needs human review.
3. **David Pekar** (id 464 klub 2200, id 1021 klub 428). Row 464 has `datum_rodenja=2008-11-24` from `hns_semafor`; row 1021 from `hbs_savez` lacks DOB. Likely the same youth player ingested from two federations (savezi). Cannot confirm without DOB on both, so this stays a soft match only.
## Reasoning Summary
The brief's hard rule applies: "NEVER act on a case without recording reasoning in the JSON. Halt if unsure". The strict criteria yielded zero actionable cases. The 56 soft-match groups all fall under at least one safe-haven rule:
- Both rows `aktivni_status='aktivan'` and created_at within 30 days → LEGITIMATE in-season → tag-only allowed, but tagging without DOB confirmation could mis-tag distinct people sharing a name. Therefore REVIEW_ONLY.
- No transfer table exists in pgz_sport beyond `clan_sezona` (season stats, keyed on clan_id+sezona+natjecanje+klub_naziv, no source-clan pointer). Cannot programmatically infer "transfer" vs "duplicate".
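
The decision logic described above could look roughly like this. It is a sketch under assumptions: the function name `decide`, the row shape, and the sample data are hypothetical, and because this report does not state the brief's stale-transfer criteria, anything not covered by the in-season rule falls back to `REVIEW_ONLY` in keeping with the halt-if-unsure rule.

```python
from datetime import date


def decide(rows, window_days=30):
    """Classify one name-matched group; never returns a mutating decision."""
    dobs = {r["datum_rodenja"] for r in rows}
    # Without a single shared, non-null DOB the rows may be different people.
    if None in dobs or len(dobs) != 1:
        return "REVIEW_ONLY"
    all_active = all(r["aktivni_status"] == "aktivan" for r in rows)
    created = [r["created_at"] for r in rows]
    within_window = (max(created) - min(created)).days <= window_days
    if all_active and within_window:
        return "LEGITIMATE"
    # Stale-transfer criteria are not given here, so stay on the safe side.
    return "REVIEW_ONLY"


# Hypothetical groups: one with a missing DOB, one with a confirmed DOB.
missing_dob = [
    {"datum_rodenja": date(2008, 1, 1), "aktivni_status": "aktivan", "created_at": date(2026, 4, 29)},
    {"datum_rodenja": None, "aktivni_status": "aktivan", "created_at": date(2026, 5, 3)},
]
confirmed_dob = [
    {"datum_rodenja": date(2008, 1, 1), "aktivni_status": "aktivan", "created_at": date(2026, 4, 29)},
    {"datum_rodenja": date(2008, 1, 1), "aktivni_status": "aktivan", "created_at": date(2026, 5, 3)},
]
```

Under this sketch `decide(missing_dob)` yields `REVIEW_ONLY` and `decide(confirmed_dob)` yields `LEGITIMATE`, which matches the outcome recorded for all 56 groups: none had a DOB on every row.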
## Post-run Counts
| Metric | Value |
|---|---|
| pgz_sport.clanovi rows | 3240 (unchanged) |
| Rows mutated | 0 |
| Rows soft-deleted to clanovi_purged | 0 |
| sys_audit rows added | 1 (C_DETECTION_RUN summary) |
## Errors
None.
## Files
- `C_TRANSFERS.json` — structured decisions and full review-only cases
- `C_sql_transcript.sql` — SQL run during this subagent
@@ -0,0 +1,119 @@
-- Subagent C — Cross-Klub Duplicate / Stale-Transfer detection
-- Run: 2026-05-05 08:36 batch
-- DB: rinet_v3, schema pgz_sport
-- Result: 0 strict mutations. 56 soft-match groups recorded review-only.
-- ====================================================================
-- 0. Pre-run sanity
-- ====================================================================
SELECT count(*) AS clanovi_rowcount FROM pgz_sport.clanovi;
-- expect 3240 (post-A, post-B)
SELECT count(*) AS rows_with_hns_igrac_id
FROM pgz_sport.clanovi
WHERE hns_igrac_id IS NOT NULL AND hns_igrac_id != '';
-- 3
-- ====================================================================
-- 1. Strict detector A: same hns_igrac_id across multiple klub_id
-- ====================================================================
SELECT hns_igrac_id, count(DISTINCT klub_id) AS n_klubs,
array_agg(id ORDER BY created_at) AS clan_ids,
array_agg(klub_id ORDER BY created_at) AS klub_ids,
array_agg(created_at ORDER BY created_at) AS created_ats,
array_agg(updated_at ORDER BY created_at) AS updated_ats,
array_agg(aktivni_status ORDER BY created_at) AS statuses
FROM pgz_sport.clanovi
WHERE hns_igrac_id IS NOT NULL AND hns_igrac_id != ''
GROUP BY hns_igrac_id
HAVING count(DISTINCT klub_id) > 1;
-- 0 rows
-- ====================================================================
-- 2. Strict detector B: same lower(ime)+lower(prezime)+datum_rodenja across klubs
-- (column name in this schema is datum_rodenja, no 'j'; datum_rodjenja has only 1 row)
-- ====================================================================
SELECT count(*) FROM pgz_sport.clanovi WHERE datum_rodjenja IS NOT NULL; -- 1
SELECT count(*) FROM pgz_sport.clanovi WHERE datum_rodenja IS NOT NULL; -- 684
SELECT lower(ime) AS ime_l, lower(prezime) AS prez_l, datum_rodenja,
count(DISTINCT klub_id) AS n_klubs,
array_agg(id ORDER BY created_at) AS clan_ids,
array_agg(klub_id ORDER BY created_at) AS klub_ids,
array_agg(aktivni_status ORDER BY created_at) AS statuses
FROM pgz_sport.clanovi
WHERE datum_rodenja IS NOT NULL
GROUP BY 1,2,3
HAVING count(DISTINCT klub_id) > 1
ORDER BY n_klubs DESC, prez_l, ime_l;
-- 0 rows
-- Same check using datum_rodjenja (with 'j') for completeness
SELECT lower(ime), lower(prezime), datum_rodjenja, count(DISTINCT klub_id)
FROM pgz_sport.clanovi
WHERE datum_rodjenja IS NOT NULL
GROUP BY 1,2,3
HAVING count(DISTINCT klub_id) > 1;
-- 0 rows
-- ====================================================================
-- 3. Look for transfer/history evidence tables
-- ====================================================================
SELECT table_name FROM information_schema.tables
WHERE table_schema='pgz_sport'
AND (table_name ILIKE '%transfer%' OR table_name ILIKE '%history%' OR table_name ILIKE '%sezona%');
-- clan_sezona, klub_sezona, klub_sezona_backup_20260502
-- clan_sezona has no source-klub pointer; cannot infer transfer programmatically.
-- ====================================================================
-- 4. Soft detector (review only): name-only across klubs
-- ====================================================================
SELECT count(*) AS soft_namematch_groups FROM (
SELECT lower(ime), lower(prezime)
FROM pgz_sport.clanovi
WHERE ime IS NOT NULL AND prezime IS NOT NULL
GROUP BY 1,2
HAVING count(DISTINCT klub_id) > 1
) x;
-- 56
-- Full review-only material (exported to C_TRANSFERS.json)
WITH soft AS (
SELECT lower(ime) AS ime_l, lower(prezime) AS prez_l
FROM pgz_sport.clanovi
WHERE ime IS NOT NULL AND prezime IS NOT NULL
GROUP BY 1,2
HAVING count(DISTINCT klub_id) > 1
)
SELECT c.id, c.klub_id, c.ime, c.prezime, c.datum_rodenja, c.aktivni_status,
c.created_at::date, c.source, c.hns_igrac_id
FROM pgz_sport.clanovi c
JOIN soft s ON lower(c.ime)=s.ime_l AND lower(c.prezime)=s.prez_l
ORDER BY s.prez_l, s.ime_l, c.created_at;
-- 117 rows. All aktivni_status='aktivan'. All created within 5-day window.
-- ====================================================================
-- 5. Audit log: record the run, no mutation
-- ====================================================================
INSERT INTO pgz_sport.sys_audit (action, target_type, target_text, payload)
VALUES (
'C_DETECTION_RUN',
'clanovi',
'cross-klub stale-transfer detection',
jsonb_build_object(
'run_ts', '2026-05-05T08:36:00Z',
'strict_hns_id_cross_klub', 0,
'strict_name_dob_cross_klub', 0,
'soft_name_only_groups', 56,
'soft_name_only_rows', 117,
'mutations_applied', 0,
'review_only_file', '/opt/pgz-sport/_audit/data_integrity_20260505_0836/C_TRANSFERS.json',
'reason', 'Strict criteria yielded 0 cases. Soft matches treated as REVIEW_ONLY per brief halt-if-unsure rule.'
)
);
-- ====================================================================
-- 6. Post-run sanity
-- ====================================================================
SELECT count(*) AS clanovi_rowcount_after FROM pgz_sport.clanovi;
-- expect 3240 (unchanged)
@@ -296,6 +296,42 @@ def api_kpi():
    }


@app.get("/api/dashboard/top-primatelji")
def dashboard_top_primatelji(godina: int = 2025, limit: int = 50):
    """Top recipients of public-needs funding: all clubs that received grants in the given year."""
    # Literal % in the ILIKE patterns is doubled (%%) because the query
    # is executed with %s parameters.
    rows = fetch("""
        SELECT
            pn.naziv_kluba,
            pn.klub_id,
            pn.iznos,
            pn.napomena,
            pn.godina,
            COALESCE(k.sport, 'n/a') AS sport,
            COALESCE(s.naziv, '') AS savez_naziv,
            COALESCE(k.razina, '') AS razina,
            COALESCE(k.grad, '') AS grad,
            CASE
                WHEN pn.napomena ILIKE '%%županijski%%' OR pn.napomena ILIKE '%%PGZ%%' OR pn.napomena ILIKE '%%PGŽ%%' THEN 'Županijski sportski savez PGŽ'
                WHEN pn.napomena ILIKE '%%riječki%%' OR pn.napomena ILIKE '%%RSS%%' THEN 'Riječki sportski savez'
                WHEN pn.napomena ILIKE '%%grad rijeka%%' THEN 'Grad Rijeka'
                ELSE 'Riječki sportski savez'
            END AS davatelj_naziv
        FROM pgz_sport.potpore_nositelji pn
        LEFT JOIN pgz_sport.klubovi k ON k.id = pn.klub_id
        LEFT JOIN pgz_sport.savezi s ON s.id = k.savez_id
        WHERE pn.godina = %s
        ORDER BY pn.iznos DESC NULLS LAST
        LIMIT %s
    """, (godina, limit))
    return {
        "godina": godina,
        "count": len(rows),
        "rows": rows,
        "ukupno": sum((r.get("iznos") or 0) for r in rows),
    }


@app.get("/api/dashboard/ekosustav")
def dashboard_ekosustav():
    """PGŽ sports ecosystem: coverage stats for enrichment from the FINA registry."""