# SA-3 DB Integrity Probe

**Generated:** 2026-05-05T06:24:49Z
**DB:** rinet_v3 @ 10.10.0.2:6432
**Schema:** pgz_sport

## 1. Row counts

```sql
-- n_live_tup is the statistics collector's estimate, not an exact COUNT(*).
SELECT relname, n_live_tup
FROM pg_stat_user_tables
WHERE schemaname='pgz_sport'
ORDER BY n_live_tup DESC;
```

Live (production) tables come first, followed by the backup tables. **Note:** the schema contains a large number of `*_backup_*` / `*_premerge_*` / `*_dedup_*` / `*_pre_*` snapshot tables (clean-up debris). The first table lists only the canonical production tables; the snapshot tables are listed in the subsection after it.

| Table | Rows |
|---|---:|
| clanovi | 3248 |
| klubovi | 2244 |
| sportski_objekti | 106 |
| savezi | 246 |
| dokumenti | 7073 |
| dokument_chunks | 2850 |
| utakmice_log | 9267 |
| rno_bilanca | 6500 |
| rno_prras | 6500 |
| clan_godisnjak | 2398 |
| clan_nagrada | 2028 |
| natjecanja_tablice | 959 |
| clan_sezona | 689 |
| hns_klubovi_natjecanje | 635 |
| klub_sezona | 631 |
| sys_audit | 627 |
| enrichment_log | 616 |
| dokument_primjena | 439 |
| natjecanja | 428 |
| clanovi_deleted_empty | 372 |
| clanstvo_kategorije | 313 |
| natjecanje_tablica | 304 |
| vijesti | 286 |
| savez_stats_oficijalno | 284 |
| najbolji_sportasi | 243 |
| user_sessions | 235 |
| sys_role_permissions | 220 |
| audit_events | 193 |
| potpore_nositelji | 182 |
| savez_statistika_clanstvo | 177 |
| statistika_saveza | 169 |
| osobe_funkcije | 159 |
| sport_facts | 135 |
| audit_feed | 131 |
| dobne_kategorije | 127 |
| manifestacije | 113 |
| sufinanciranje_sport | 110 |
| alertovi | 89 |
| ai_grad_distances | 78 |
| hns_natjecanja | 74 |
| notifications | 66 |
| sys_permissions | 54 |
| zsp_dokumenti | 54 |
| uloga_katalog | 49 |
| clanarine | 48 |
| mediji | 42 |
| treneri | 38 |
| account_codes | 31 |
| audit_log | 29 |
| suci | 27 |
| rno_sportske_udruge | 21 |
| users | 18 |
| lijecnicki_pregledi | 16 |
| form_templates | 15 |
| invoices | 14 |
| specijalisti_med | 13 |
| akademski_sport | 11 |
| proracun | 11 |
| hoo_pravilnici | 8 |
| alert_rules | 8 |
| roles | 7 |
| scraper_runs | 6 |
| invoice_uploads | 5 |
| payments | 5 |
| user_action_tokens | 5 |
| tenants | 5 |
| polygon_seals | 5 |
| expense_reports | 4 |
| javne_potrebe | 4 |
| user_klub_links | 4 |
| form_submissions | 3 |
| email_templates | 3 |
| gdpr_erasure_requests | 3 |
| sportas_specifika | 2 |
| gdpr_consent | 2 |
| user_roles | 1 |
| putni_nalog_racuni | 1 |
| user_2fa | 1 |
| invoice_lines | 1 |
| llm_extracted_facts | 0 |
| scrape_jobs | 0 |
| clan_utakmica | 0 |
| natjecanja_utakmice | 0 |
| user_permissions | 0 |
| sponzori | 0 |

### Backup/snapshot tables (candidates for archival drop)

These are stale workflow artefacts that hold a large number of rows; application code should not query them:

| Table | Rows |
|---|---:|
| clanovi_pre_godisnjak_backup | 25944 |
| klubovi_garbage_backup_1777750740 | 10072 |
| klubovi_dedup_v2_1777750793 | 9920 |
| klubovi_dedup_v3_1777750848 | 9672 |
| clanovi_backup_20260430 | 9572 |
| klubovi_premerge_20260503c | 8976 |
| klubovi_premerge_20260503b | 8976 |
| klubovi_pre_cleanup_20260430 | 8120 |
| klubovi_pre_dedup_20260430 | 5960 |
| klubovi_premerge_20260503 | 2572 |
| klubovi_backup_20260505 | 2244 |
| clanovi_purge_backup_20260429 | 1576 |
| clanovi_dedup_20260502_v2 | 1384 |
| klub_sezona_backup_20260502 | 1092 |
| clanovi_dedup_backup_20260429 | 532 |
| klubovi_sport_rename_backup_1777756941 | 396 |
| klubovi_dedup_20260502 | 140 |
| sponzori_mock_backup_1777756941 | 88 |
| klubovi_finaldd_backup_1777752742 | 72 |
| klubovi_garbage_backup_20260502 | 36 |
| rno_organizacije | 1482 *(may be production)* |
| sys_users_deprecated_20260429 | 9 |
| klubovi_dedup_haok_backup_20260505 | 3 |
| sys_user_klub_links_deprecated_20260429 | 2 |
| klubovi_garbage_backup_1777752698 | 0 |
| sys_sessions_deprecated_20260429 | 0 |
| sys_user_permissions_deprecated_20260429 | 0 |

Total rows held by these tables: ~107,000 excluding `rno_organizacije` (the nine largest tables alone hold ~97,000), roughly **33x** the 3,248 rows in `clanovi`.

## 2. NULL/empty critical columns

```sql
SELECT 'clanovi.ime',
       COUNT(*) FILTER (WHERE ime IS NULL),
       COUNT(*) FILTER (WHERE ime = '')
FROM pgz_sport.clanovi
UNION ALL
SELECT 'clanovi.prezime',
       COUNT(*) FILTER (WHERE prezime IS NULL),
       COUNT(*) FILTER (WHERE prezime = '')
FROM pgz_sport.clanovi
UNION ALL
SELECT 'klubovi.naziv',
       COUNT(*) FILTER (WHERE naziv IS NULL),
       COUNT(*) FILTER (WHERE naziv = '')
FROM pgz_sport.klubovi
UNION ALL
SELECT 'savezi.naziv',
       COUNT(*) FILTER (WHERE naziv IS NULL),
       COUNT(*) FILTER (WHERE naziv = '')
FROM pgz_sport.savezi
UNION ALL
SELECT 'sportski_objekti.naziv',
       COUNT(*) FILTER (WHERE naziv IS NULL),
       COUNT(*) FILTER (WHERE naziv = '')
FROM pgz_sport.sportski_objekti;
```

| Column | NULLs | Empties |
|---|---:|---:|
| clanovi.ime | 0 | 0 |
| clanovi.prezime | 0 | 0 |
| klubovi.naziv | 0 | 0 |
| savezi.naziv | 0 | 0 |
| sportski_objekti.naziv | 0 | 0 |

**Verdict:** clean. After the recent dedup/cleanup passes, none of the five name columns checked contains a NULL or an empty string.

## 3. Orphan FKs

```sql
SELECT 'clanovi.klub_id->klubovi', COUNT(*)
FROM pgz_sport.clanovi c
WHERE c.klub_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM pgz_sport.klubovi k WHERE k.id=c.klub_id)
UNION ALL
SELECT 'klubovi.savez_id->savezi', COUNT(*)
FROM pgz_sport.klubovi k
WHERE k.savez_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM pgz_sport.savezi s WHERE s.id=k.savez_id)
UNION ALL
SELECT 'sys_audit.user_id->users', COUNT(*)
FROM pgz_sport.sys_audit a
WHERE a.user_id IS NOT NULL
  AND NOT EXISTS (SELECT 1 FROM pgz_sport.users u WHERE u.id=a.user_id);
```

| Constraint | Orphan rows |
|---|---:|
| clanovi.klub_id -> klubovi.id | 0 |
| klubovi.savez_id -> savezi.id | 0 |
| sys_audit.user_id -> users.id | 0 |

**Verdict:** clean. All three FK relationships checked are intact.

## 4. Duplicate OIBs

```sql
SELECT oib, count(*), string_agg(naziv, ' | ')
FROM pgz_sport.klubovi
WHERE oib IS NOT NULL AND oib ~ '^[0-9]{11}$'
GROUP BY oib
HAVING count(*)>1;
```

**48 distinct OIBs are shared by 2-4 klubovi rows each (103 rows in total by the counts below).** This is the single largest data-quality issue.

| OIB | Count | Names |
|---|---:|---|
| 86603390999 | 3 | Juniorska ekipa Sv.Rok-Klana \| Boćarski Klub Sv. Rok Klana \| Sveti Rok-Klana |
| 80500347365 | 3 | HNK Orijent \| Hrvatski Nogometni Klub Orijent \| HNK Orijent 1919 (Sušak) |
| 44908060737 | 3 | Boćarski Klub Krimeja \| Krimeja \| BK Krimeja |
| 19490107091 | 3 | BOĆARSKI KLUB "LOVRAN" \| Kadetska ekipa BK Lovran \| Boćarski klub Lovran |
| 29964028897 | 4 | Boćarski klub Kastav \| Kadetska ekipa BK Kastav 2 \| Kadetska ekipa BK Kastav \| Boćarski klub Kastav |
| 17563258345 | 3 | Plivački Klub Primorje Rijeka \| KLUB DALJINSKOG PLIVANJA "PRIMORJE" \| KLUB UMJETNIČKOG PLIVANJA „PRIMORJE AQUA MARIS" RIJEKA |
| 15986803554 | 2 | Košarkaški Klub Kvarner \| Košarkaški klub KVARNER 2010 |
| 35549440954 | 2 | Muški Odbojkaški Klub "Gornja Vežica" \| Muški Odbojkaški Klub Gornja Vežica |
| 37941242606 | 2 | Muški Boćarski Klub Hreljin \| Boćarski klub Hreljin |
| 56273001018 | 2 | Nogometni klub Turbina Bakar \| Nogometni Klub Turbina Tribalj |
| 67434497493 | 2 | Odbojkaški Klub Rab \| Odbojkaški Klub "Rab" |
| 47139832980 | 2 | Hrvatski Akademski Odbojkaški Klub "Rijeka" \| HRVATSKI AKADEMSKI ODBOJKAŠKI KLUB "RIJEKA" |
| 19514046928 | 2 | Lovačko društvo "JELEN" Čavle \| LOVAČKO DRUŠTVO "JELEN" ČAVLE |
| 83495265520 | 2 | Odbojkaški Klub "Kastav 1998" \| Odbojkaški Klub Kastav 1998 |
| 14384540738 | 2 | Boćarski klub Kostrena \| Boćarski Klub Kostrena |
| 17639054753 | 2 | Streljački Klub Gluhih Galeb \| Streljački klub gluhih "Galeb" |
| 40538276343 | 2 | Odbojkaški Klub "Odbojkaška Akademija Petica" \| Odbojkaški klub Odbojkaška Akademija Petica |
| 76273502221 | 2 | Boćarski Klub Srdoči 1983 \| Srdoči 1983 |
| 17934350916 | 2 | NOGOMETNI KLUB "KLANA" \| NK Klana |
| 81511316706 | 2 | Odbojkaški Klub Kostrena Kostrena \| Odbojkaški Klub "Kostrena" Kostrena |
| 27991069782 | 2 | Boćarski Klub Čavle Šb Čavle \| Juniorska ekipa Čavle ŠB |
| 44509762938 | 2 | Kadetska ekipa BK Sveti Jakov \| Boćarski Klub Sveti Jakov Jadranovo |
| 38093446162 | 2 | Lovranska Draga \| Boćarski Klub Lovranska Draga |
| 56132503774 | 2 | Nogometni Klub Draga-Mošćenička Draga \| NK Draga |
| 40936837495 | 2 | Lovačko društvo "KAMENJARKA" Kukuljanovo \| LOVAČKO DRUŠTVO "KAMENJARKA" KUKULJANOVO-ŠKRLJEVO |
| 02999668483 | 2 | ŠK Goranka \| KK Goranka |
| 35883230704 | 2 | Lovačko društvo "MEDVIĐAK" Drivenik Tribalj \| LOVAČKO DRUŠTVO "MEDVIĐAK" DRIVENIK |
| 27420052480 | 2 | Krenovac \| Boćarski Klub Krenovac |
| 17195966673 | 2 | Ženski Odbojkaški Klub "Crikvenica" \| Ženski Odbojkaški Klub Crikvenica |
| 51108883738 | 2 | NK Risnjak \| Nogometni Klub Risnjak Lokve |
| 13794801696 | 2 | Ženski nogometni klub Rijeka Jack Pot \| Ženski nogometni klub Rijeka |
| 33154520914 | 2 | Malonogometni klub gluhih "Galeb" \| Malonogometni Klub Gluhih Galeb |
| 52818156657 | 2 | Parastreljački Klub Paraolimpijac \| Parastreljački klub "Paraolimpijac" |
| 42449645267 | 2 | Paraatletski Klub Rijeka \| Paraatletski klub "Srce" Rijeka |
| 75947125821 | 2 | Boćarski klub Opatija \| Boćarski Klub Opatija |
| 43219260850 | 2 | Ženski Akademski Odbojkaški Klub Škurinje Rijeka \| Ženski Akademski Odbojkaški Klub Škurinje Rijeka |
| 85575561127 | 2 | SPORTSKO-REKREACIJSKO DRUŠTVO VIŠEVICA \| rekreacijsko društvo VIŠEVICA |
| 19353575292 | 2 | Odbojkaški Klub "Sveti Matej 06" - Viškovo \| Odbojkaški Klub Sveti Matej 06 - Viškovo |
| 86232456523 | 2 | Boćarski klub Krk \| Boćarski klub Krk |
| 74630525187 | 2 | Nogometni klub Omladinac \| NK Omladinac Vrata |
| 83261523211 | 2 | Odbojkaški Klub Opatija Volley \| ODBOJKAŠKI KLUB OPATIJA VOLLEY |
| 98146784649 | 2 | Boćarski Klub Draga Mošćenička Draga \| Draga – Mošćenička Draga |
| 39250096592 | 2 | Boćarski klub Brod Moravice \| Boćarski Klub Brod Moravice |
| 76221716576 | 2 | Kuglački Klub Gluhih Galeb \| Kuglački klub gluhih "Galeb" |
| 10132566066 | 2 | Vaterpolo klub PRIMORJE-ERSTE BANKA-ženska ekipa \| Vaterpolo klub PRIMORJE-ERSTE BANKA-muška ekipa |
| 39123612806 | 2 | Stolnoteniski klub Rijeka \| Parastolnoteniski Klub Rijeka |
| 70928157464 | 2 | Ženski Boćarski Klub Hreljin \| ŽBK Hreljin |
| 77066352874 | 2 | Nogometni Klub Vinodol \| NK Vihor |

**Patterns:**

- Casing/whitespace duplicates (`Boćarski klub Kostrena` vs `Boćarski Klub Kostrena`): pure duplicates, merge.
- Quoting variants (`"Rab"` vs `Rab`): same treatment.
- "Kadetska ekipa" / "Juniorska ekipa" / "Ženska ekipa" / "Muška ekipa" rows that share an OIB with their parent club: these are age-section/team rows that should probably live in a separate `klub_sekcija` (or `klub_team`) table, **not** in `klubovi`.
- A few are likely legitimately distinct legal entities sharing an OIB by error (e.g. Vinodol vs Vihor; NK Risnjak vs NK Risnjak Lokve): flag for human review.

## 5. Placeholder values

```sql
-- klubovi
SELECT 'klubovi.naziv placeholders', COUNT(*)
FROM pgz_sport.klubovi
WHERE naziv ILIKE '%[VERIFY]%' OR naziv ILIKE '%[UNRESOLVED]%'
   OR naziv ILIKE '%TBD%' OR naziv ILIKE '%TODO%'
   OR naziv ILIKE '%unknown%' OR naziv ILIKE '%godisnjak_%';

-- savezi
SELECT 'savezi.naziv placeholders', COUNT(*)
FROM pgz_sport.savezi
WHERE naziv ILIKE '%[VERIFY]%' OR naziv ILIKE '%[UNRESOLVED]%'
   OR naziv ILIKE '%TBD%' OR naziv ILIKE '%TODO%'
   OR naziv ILIKE '%unknown%' OR naziv ILIKE '%godisnjak_%';

-- clanovi
SELECT 'clanovi.ime/prezime placeholders', COUNT(*)
FROM pgz_sport.clanovi
WHERE ime ILIKE '%[VERIFY]%' OR ime ILIKE '%[UNRESOLVED]%'
   OR ime ILIKE '%TBD%' OR ime ILIKE '%TODO%'
   OR ime ILIKE '%unknown%' OR ime ILIKE '%godisnjak_%'
   OR prezime ILIKE '%[VERIFY]%' OR prezime ILIKE '%[UNRESOLVED]%'
   OR prezime ILIKE '%TBD%' OR prezime ILIKE '%TODO%'
   OR prezime ILIKE '%unknown%' OR prezime ILIKE '%godisnjak_%';

-- metadata flag
SELECT 'manual_review_true', COUNT(*)
FROM pgz_sport.klubovi
WHERE metadata->>'manual_review' = 'true';
```

| Bucket | Count |
|---|---:|
| klubovi.naziv with placeholder marker | 3 |
| savezi.naziv with placeholder marker | 0 |
| clanovi.ime/prezime with placeholder marker | 6 |
| klubovi.metadata.manual_review = 'true' | 3 |

### klubovi placeholder rows

| id | naziv |
|---|---|
| 2630 | [VERIFY] Odbojkaški Klub Opatija |
| 2619 | [VERIFY] Odbojkaški Klub Čavle |
| 4426 | [UNRESOLVED] empty naziv & grad — id 4426 |

### clanovi placeholder rows (false positives)

All six matches share the surname `Todorović`. The substring check is too loose, and none of them is a real placeholder.

```sql
SELECT id, ime, prezime FROM pgz_sport.clanovi
WHERE prezime ILIKE '%unknown%' OR ime ILIKE '%unknown%' OR ...
```

| id | ime | prezime |
|---|---|---|
| 4202 | Aleksa | Todorović |
| 4140 | Aleksa | Todorović |
| 1956 | Filip | Todorović |
| 377 | Dejan | Todorović |
| 3455 | Aleksa | Todorović |
| 551 | Matteo | Todorović |

These six are **false positives**. The trigger is neither `%unknown%` nor `%godisnjak_%`: it is the `%TODO%` pattern, which matches the leading `Todo` of "Todorović" because `ILIKE` is case-insensitive. They are real surnames, not placeholders. (The three `Aleksa Todorović` rows also look like duplicate `clanovi` rows worth investigating; see Recommendations.)

**Verdict:** placeholder pollution is essentially nil. Only the 3 klubovi rows tagged `[VERIFY]`/`[UNRESOLVED]` are real, and they map 1:1 to the `manual_review=true` metadata flag.

## 6. Low-coverage klubovi (filled < 4 of 12)

```sql
WITH cov AS (
  SELECT id, naziv,
    ( CASE WHEN naziv IS NOT NULL AND naziv <>'' THEN 1 ELSE 0 END
    + CASE WHEN sport IS NOT NULL AND sport <>'' THEN 1 ELSE 0 END
    + CASE WHEN grad IS NOT NULL AND grad <>'' THEN 1 ELSE 0 END
    + CASE WHEN oib IS NOT NULL AND oib <>'' THEN 1 ELSE 0 END
    + CASE WHEN predsjednik IS NOT NULL AND predsjednik<>'' THEN 1 ELSE 0 END
    + CASE WHEN tajnik IS NOT NULL AND tajnik <>'' THEN 1 ELSE 0 END
    + CASE WHEN email IS NOT NULL AND email <>'' THEN 1 ELSE 0 END
    + CASE WHEN telefon IS NOT NULL AND telefon <>'' THEN 1 ELSE 0 END
    + CASE WHEN COALESCE(web, web_stranica) IS NOT NULL AND COALESCE(web, web_stranica)<>'' THEN 1 ELSE 0 END
    + CASE WHEN COALESCE(sjediste, adresa) IS NOT NULL AND COALESCE(sjediste, adresa)<>'' THEN 1 ELSE 0 END
    + CASE WHEN ciljevi IS NOT NULL AND ciljevi <>'' THEN 1 ELSE 0 END
    + CASE WHEN opis_djelatnosti IS NOT NULL AND opis_djelatnosti<>'' THEN 1 ELSE 0 END
    ) AS filled
  FROM pgz_sport.klubovi
)
SELECT id, naziv, filled
FROM cov
WHERE filled<4
ORDER BY filled ASC, id ASC
LIMIT 20;
```

**Total klubovi with filled < 4 / 12 (i.e. <33%): 518** (≈23% of the 2244 production klubovi).
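
The coverage score from the query above can also be computed on the application side, for example before a row is queued for enrichment. The following is a minimal sketch, assuming a club row is available as a plain dict keyed by the column names used in the query; `COVERAGE_FIELDS` and `coverage_score` are hypothetical names, not part of the existing codebase.

```python
from typing import Mapping, Optional

# Hypothetical helper mirroring the 12-field coverage formula of the SQL query.
# Each entry is a tuple of column names; the first non-NULL one is used,
# mirroring COALESCE(web, web_stranica) and COALESCE(sjediste, adresa).
COVERAGE_FIELDS = [
    ("naziv",), ("sport",), ("grad",), ("oib",),
    ("predsjednik",), ("tajnik",), ("email",), ("telefon",),
    ("web", "web_stranica"), ("sjediste", "adresa"),
    ("ciljevi",), ("opis_djelatnosti",),
]


def coverage_score(row: Mapping[str, Optional[str]]) -> int:
    """Return how many of the 12 coverage fields are non-NULL and non-empty."""
    filled = 0
    for columns in COVERAGE_FIELDS:
        # COALESCE semantics: take the first value that is not None.
        # An empty string is not NULL, so it wins over a later populated column.
        value = next(
            (row.get(c) for c in columns if row.get(c) is not None), None
        )
        if value is not None and value != "":
            filled += 1
    return filled


# Illustrative row with made-up values: only two fields are populated.
sample = {"naziv": "Example Club", "sport": "example"}
print(coverage_score(sample))       # 2
print(coverage_score(sample) < 4)   # True, so it would count as low-coverage
```

Keeping the field list in one place makes it easier to keep the SQL and the application logic in step if the set of coverage columns changes.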
### Sample 20 worst (filled = 1 or 2)

| id | naziv | filled/12 |
|---|---|---:|
| 4249 | Streljački klub DVD svojevrstan vodič za roditelje | 1 |
| 4250 | Streljački klub DVD Opatija | 1 |
| 2290 | KK Metal - Jurdani | 2 |
| 2291 | KK OI KOSTRENA | 2 |
| 2311 | RK LIBURNIJA 8. u II HRL Zapad od 12 | 2 |
| 2312 | RK MORNAR 3. u II HRL Zapad od 10 | 2 |
| 2315 | RK PŠR SELCE 5. u III HRL Zapad od 8 | 2 |
| 2324 | RK ČAVLE 2. u II HRL Zapad od 10 | 2 |
| 2325 | RK ČAVLE 7. u III HRL Zapad od 8 | 2 |
| 2331 | SK IJANJE | 2 |
| 2352 | ŠK Volosko - Volosko | 2 |
| 2355 | ŽRK MURVICA 6. u II HRL Zapad od 12 | 2 |
| 2356 | ŽRK MURVICA 6. u II HRL Zapad od 9 | 2 |
| 2360 | ŽRK ZAMET II 3. u III HRL Zapad od 8 | 2 |
| 3741 | AK Elena Ban | 2 |
| 3744 | AK Koper | 2 |
| 3747 | AK Kvarnera | 2 |
| 3748 | AK Rijeka | 2 |
| 3749 | AK Velenje | 2 |
| 3750 | AK Viškovo | 2 |

**Patterns:**

- `RK N. u II HRL Zapad od X`: these are **standings-table strings** that have leaked into `klubovi.naziv`. They are handball league rankings, not clubs. They should be deleted from klubovi (and redirected to `natjecanja_tablice`).
- `Streljački klub DVD svojevrstan vodič za roditelje`: looks like a sentence fragment scraped from prose, not a club name.
- `AK ` rows: athletics (atletski) clubs, some from outside the county (Koper and Velenje are in Slovenia). Those are likely in scope as competitors only, not as PGŽ entities.

## 7. Suspicious clanovi (junk imports)

```sql
SELECT id, ime, prezime
FROM pgz_sport.clanovi
WHERE ime ~ '@|^\d+$'
LIMIT 20;
```

**Result: 0 rows.** No emails or pure-numeric strings have leaked into the `ime` field. The `clanovi` table appears to have been thoroughly cleaned: the existence of `clanovi_pre_godisnjak_backup` (25,944 rows) and `clanovi_purge_backup_20260429` (1,576 rows) suggests that heavy deduplication has already happened.

## 8. sys_audit health

```sql
SELECT 'total', COUNT(*)::text FROM pgz_sport.sys_audit
UNION ALL
SELECT 'today', COUNT(*)::text FROM pgz_sport.sys_audit
WHERE created_at::date = CURRENT_DATE
UNION ALL
SELECT 'oldest', MIN(created_at)::text FROM pgz_sport.sys_audit
UNION ALL
SELECT 'newest', MAX(created_at)::text FROM pgz_sport.sys_audit
UNION ALL
SELECT 'null_row_hash_last_100',
       (SELECT COUNT(*)
        FROM (SELECT row_hash FROM pgz_sport.sys_audit ORDER BY id DESC LIMIT 100) t
        WHERE row_hash IS NULL)::text;
```

| Metric | Value |
|---|---|
| Total rows | 627 |
| Rows today (2026-05-05) | 531 |
| Oldest entry | 2026-04-28 21:39:45 +02 |
| Newest entry | 2026-05-05 08:23:14 +02 |
| NULL row_hash in last 100 | 0 |

**Verdict:** none of the last 100 rows has a NULL `row_hash`, so the hash chain shows no gaps there (this probe does not recompute the hashes themselves). However, the audit log is **only 7 days old**: either there has been a recent re-init, or audit was switched on only on 2026-04-28. It is worth confirming with the platform owner that no earlier history was lost. The spike "today" (531 of 627 rows) most likely reflects today's clean-up activity rather than user traffic.

## Recommended fixes (top 10)

1. **Drop the 26 backup/snapshot tables listed in section 1 (~107k rows).** `clanovi_pre_godisnjak_backup` (25.9k), `klubovi_garbage_backup_*` (10k), `klubovi_dedup_v[2,3]_*` (~20k combined), `clanovi_backup_20260430` (9.5k), and the rest of the `*_backup_*` / `*_premerge_*` / `*_pre_*` / `*_deprecated_*` set. Either move them to a `pgz_sport_archive` schema or `DROP TABLE` them after taking a `pg_dump` snapshot of those tables (a `--schema-only` dump would keep the definitions but not the rows). This reclaims table and index storage and stops accidental queries against stale data. Confirm first whether `rno_organizacije` is production.

2. **Resolve 48 duplicate-OIB clusters in `klubovi`** (103 rows). Recommended SQL pattern:

    ```sql
    -- For each OIB cluster, keep one row and merge its children
    -- (clanovi.klub_id, klub_sezona.klub_id, etc.) onto it.
    -- As written, keep_id is the lowest id in the cluster; to keep the row with
    -- the highest filled-coverage instead, derive keep_id from the coverage
    -- query in section 6.
    WITH dups AS (
        SELECT oib, MIN(id) AS keep_id
        FROM pgz_sport.klubovi
        WHERE oib ~ '^[0-9]{11}$'
        GROUP BY oib
        HAVING COUNT(*)>1
    ),
    moves AS (
        SELECT k.id AS drop_id, d.keep_id
        FROM pgz_sport.klubovi k
        JOIN dups d USING (oib)
        WHERE k.id<>d.keep_id
    )
    UPDATE pgz_sport.clanovi c
    SET klub_id = m.keep_id
    FROM moves m
    WHERE c.klub_id = m.drop_id;
    -- repeat for klub_sezona, hns_klubovi_natjecanje, etc.
    -- then DELETE the drop_ids from klubovi.
    ```

    Run interactively via `/opt/pgz-sport/scripts/dedup_klubovi_by_oib.py` (create if absent) with `--dry-run` first.

3. **Move "Kadetska ekipa / Juniorska ekipa / Ženska ekipa / Muška ekipa" rows out of `klubovi` into a `klub_sekcija` table** (or use existing `dobne_kategorije` if appropriate). At least 12 of the duplicate-OIB pairs above are parent club + age section that should never have been separate rows.

4. **Delete the 14 standings-string klubovi rows (`RK ... N. u II HRL Zapad od X`).** These are scoreboard strings that leaked into `klubovi.naziv`. SQL:

    ```sql
    DELETE FROM pgz_sport.klubovi
    WHERE naziv ~ '\d+\. u (I{1,3}|IV) HRL .* od \d+';
    ```

    Verify count first (`SELECT COUNT(*) ... `).

5. **Resolve the 3 `[VERIFY]`/`[UNRESOLVED]` klubovi** (ids 2619, 2630, 4426). They are already flagged via `metadata->>'manual_review'='true'`; surface them in the `/audit` UI for human triage.

6. **Run `/opt/pgz-sport/scripts/enrichment_worker.py`** against the **518 klubovi with coverage <33%**. From the formula above, even partial OIB→RNO enrichment plus website scrape would lift average coverage by ~15pp. Suggested batch:

    ```bash
    python3 /opt/pgz-sport/scripts/enrichment_worker.py --filter "filled<4" --limit 100 --concurrency 4
    ```

7. **Deduplicate `Aleksa Todorović` (and similar) in `clanovi`.** ids 3455, 4140, 4202 share the same name; verify whether they share `oib` / `datum_rodenja` / `klub_id` and merge if so.

8. **Confirm `sys_audit` retention policy.** Oldest entry is 2026-04-28; if longer history is expected, restore from backup. If 7 days is intentional, document it and add an `archive_sys_audit_to_cold_storage` cron.

9. **Add a partial UNIQUE INDEX on `klubovi.oib` for valid 11-digit OIBs:**

    ```sql
    CREATE UNIQUE INDEX CONCURRENTLY klubovi_oib_unique_valid
    ON pgz_sport.klubovi (oib)
    WHERE oib ~ '^[0-9]{11}$';
    ```

    This will physically prevent issue (2) from regressing once cleaned. The statement will fail until issue (2) is resolved, which is intended. Note that a failed `CREATE INDEX CONCURRENTLY` leaves an invalid index behind, so drop it before retrying.

10. **Add a CHECK constraint preventing leading/trailing and repeated internal whitespace in `klubovi.naziv` and `clanovi.ime/prezime`** (the duplicate-OIB clusters above contain names like `Odbojkaški Klub Kostrena Kostrena` with a double space; these should never make it past INSERT):

    ```sql
    ALTER TABLE pgz_sport.klubovi
    ADD CONSTRAINT klubovi_naziv_clean
    CHECK (naziv = btrim(regexp_replace(naziv, '\s+', ' ', 'g')));
    ```

    Existing rows must be normalised first, otherwise adding the constraint fails validation.
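
The whitespace rule in recommendation 10 can also be applied on the application side, so that import scripts normalise names before they reach the database. The following is a minimal sketch of that idea; `normalize_name` and `is_clean` are hypothetical helpers, and it assumes Python's `\s` and PostgreSQL's `\s` agree on what counts as whitespace, which may not hold for some non-ASCII space characters depending on the database locale.

```python
import re

# Matches any run of one or more whitespace characters.
_WS_RUN = re.compile(r"\s+")


def normalize_name(value: str) -> str:
    """Collapse whitespace runs to one space, then trim spaces at both ends.

    Mirrors btrim(regexp_replace(naziv, '\\s+', ' ', 'g')) from the CHECK
    constraint: btrim without a second argument removes spaces only, which is
    sufficient because every whitespace run has already become a single space.
    """
    return _WS_RUN.sub(" ", value).strip(" ")


def is_clean(value: str) -> bool:
    """True if the value would pass the CHECK constraint unchanged."""
    return value == normalize_name(value)


# Illustrative values only.
print(normalize_name("  NK  Klana "))   # NK Klana
print(is_clean("NK Klana"))             # True
print(is_clean("NK  Klana"))            # False
```

Normalising in the importer and enforcing the same rule with the constraint gives two layers: the importer fixes what it can, and the database rejects anything that slips through another path.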