Reports in _audit/:
audit_FRONTEND_COVERAGE.md — SA-1 (Explore): 9 HTML files, 0 orphan handlers (clean)
audit_API_GAP.md — SA-2 (Explore): 356 backend routes vs 54 frontend paths
23 missing routes / 39 call sites
audit_DB_INTEGRITY.md — SA-3 (general-purpose): 8 SQL probes, FKs/NULLs clean,
48 dup-OIB clusters, 518 low-cov klubovi
audit_CONSOLIDATED.md — top 10 critical with owner matrix (cc1/cc4/cc5/cc6)
Headlines:
Frontend: clean (post-R3 refactors landed)
API gap: CRM module systemic — 16 of 23 missing routes need /crm prefix in crm.html
6 missing routes are trailing-slash bugs in crm.html
DB: 48 OIB dup clusters in klubovi (~100 rows) need merge+unique-index
518/2244 klubovi (23%) <33% coverage → enrichment_worker target list
14 scoreboard-string klubovi rows (RK ... HRL Zapad od X) → DELETE
~30 backup tables (~97k rows) cluttering pgz_sport schema
Owner allocation:
cc1 → #6 backup-table archival, #8 verify, #9 sportas trailing-slash
cc4 → #1 OIB dedup script, #4 scoreboard DELETE, #10 schema CHECKs
cc5 → #2 /crm prefix sweep on crm.html, #3 trailing-slash sweep, #7 notif endpoint
cc6 → #5 enrichment_worker batch on filled<4 klubovi
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
21 KiB
SA-3 DB Integrity Probe
Generated: 2026-05-05T06:24:49Z DB: rinet_v3 @ 10.10.0.2:6432 Schema: pgz_sport
1. Row counts
SELECT relname, n_live_tup FROM pg_stat_user_tables
WHERE schemaname='pgz_sport' ORDER BY n_live_tup DESC;
Top live (production) tables and key backups. Note: the schema contains a large number of *_backup_* / *_premerge_* / *_dedup_* / *_pre_* snapshot tables (clean-up debris). Only the canonical production tables are highlighted below; the rest are listed beneath.
| Table | Rows |
|---|---|
| clanovi | 3248 |
| klubovi | 2244 |
| sportski_objekti | 106 |
| savezi | 246 |
| dokumenti | 7073 |
| dokument_chunks | 2850 |
| utakmice_log | 9267 |
| rno_bilanca | 6500 |
| rno_prras | 6500 |
| clan_godisnjak | 2398 |
| clan_nagrada | 2028 |
| natjecanja_tablice | 959 |
| clan_sezona | 689 |
| hns_klubovi_natjecanje | 635 |
| klub_sezona | 631 |
| sys_audit | 627 |
| enrichment_log | 616 |
| dokument_primjena | 439 |
| natjecanja | 428 |
| clanovi_deleted_empty | 372 |
| clanstvo_kategorije | 313 |
| natjecanje_tablica | 304 |
| vijesti | 286 |
| savez_stats_oficijalno | 284 |
| najbolji_sportasi | 243 |
| user_sessions | 235 |
| sys_role_permissions | 220 |
| audit_events | 193 |
| potpore_nositelji | 182 |
| savez_statistika_clanstvo | 177 |
| statistika_saveza | 169 |
| osobe_funkcije | 159 |
| sport_facts | 135 |
| audit_feed | 131 |
| dobne_kategorije | 127 |
| manifestacije | 113 |
| sufinanciranje_sport | 110 |
| alertovi | 89 |
| ai_grad_distances | 78 |
| hns_natjecanja | 74 |
| notifications | 66 |
| sys_permissions | 54 |
| zsp_dokumenti | 54 |
| uloga_katalog | 49 |
| clanarine | 48 |
| mediji | 42 |
| treneri | 38 |
| account_codes | 31 |
| audit_log | 29 |
| suci | 27 |
| rno_sportske_udruge | 21 |
| users | 18 |
| lijecnicki_pregledi | 16 |
| form_templates | 15 |
| invoices | 14 |
| specijalisti_med | 13 |
| akademski_sport | 11 |
| proracun | 11 |
| hoo_pravilnici | 8 |
| alert_rules | 8 |
| roles | 7 |
| scraper_runs | 6 |
| invoice_uploads | 5 |
| payments | 5 |
| user_action_tokens | 5 |
| tenants | 5 |
| polygon_seals | 5 |
| expense_reports | 4 |
| javne_potrebe | 4 |
| user_klub_links | 4 |
| form_submissions | 3 |
| email_templates | 3 |
| gdpr_erasure_requests | 3 |
| sportas_specifika | 2 |
| gdpr_consent | 2 |
| user_roles | 1 |
| putni_nalog_racuni | 1 |
| user_2fa | 1 |
| invoice_lines | 1 |
| llm_extracted_facts | 0 |
| scrape_jobs | 0 |
| clan_utakmica | 0 |
| natjecanja_utakmice | 0 |
| user_permissions | 0 |
| sponzori | 0 |
Backup/snapshot tables (candidates for archival drop)
These are stale workflow artefacts taking up significant rows; they should not be queried by app code:
| Table | Rows |
|---|---|
| clanovi_pre_godisnjak_backup | 25944 |
| klubovi_garbage_backup_1777750740 | 10072 |
| klubovi_dedup_v2_1777750793 | 9920 |
| klubovi_dedup_v3_1777750848 | 9672 |
| clanovi_backup_20260430 | 9572 |
| klubovi_premerge_20260503c | 8976 |
| klubovi_premerge_20260503b | 8976 |
| klubovi_pre_cleanup_20260430 | 8120 |
| klubovi_pre_dedup_20260430 | 5960 |
| klubovi_premerge_20260503 | 2572 |
| klubovi_backup_20260505 | 2244 |
| clanovi_purge_backup_20260429 | 1576 |
| clanovi_dedup_20260502_v2 | 1384 |
| klub_sezona_backup_20260502 | 1092 |
| clanovi_dedup_backup_20260429 | 532 |
| klubovi_sport_rename_backup_1777756941 | 396 |
| klubovi_dedup_20260502 | 140 |
| sponzori_mock_backup_1777756941 | 88 |
| klubovi_finaldd_backup_1777752742 | 72 |
| klubovi_garbage_backup_20260502 | 36 |
| rno_organizacije | 1482 (may be production) |
| sys_users_deprecated_20260429 | 9 |
| klubovi_dedup_haok_backup_20260505 | 3 |
| sys_user_klub_links_deprecated_20260429 | 2 |
| klubovi_garbage_backup_1777752698 | 0 |
| sys_sessions_deprecated_20260429 | 0 |
| sys_user_permissions_deprecated_20260429 | 0 |
Total backup rows held: ~97,000+ (about 30x the canonical row count).
2. NULL/empty critical columns
SELECT 'clanovi.ime', COUNT(*) FILTER (WHERE ime IS NULL), COUNT(*) FILTER (WHERE ime = '') FROM pgz_sport.clanovi
UNION ALL SELECT 'clanovi.prezime', COUNT(*) FILTER (WHERE prezime IS NULL),COUNT(*) FILTER (WHERE prezime = '') FROM pgz_sport.clanovi
UNION ALL SELECT 'klubovi.naziv', COUNT(*) FILTER (WHERE naziv IS NULL), COUNT(*) FILTER (WHERE naziv = '') FROM pgz_sport.klubovi
UNION ALL SELECT 'savezi.naziv', COUNT(*) FILTER (WHERE naziv IS NULL), COUNT(*) FILTER (WHERE naziv = '') FROM pgz_sport.savezi
UNION ALL SELECT 'sportski_objekti.naziv',COUNT(*) FILTER (WHERE naziv IS NULL), COUNT(*) FILTER (WHERE naziv = '') FROM pgz_sport.sportski_objekti;
| Column | NULLs | Empties |
|---|---|---|
| clanovi.ime | 0 | 0 |
| clanovi.prezime | 0 | 0 |
| klubovi.naziv | 0 | 0 |
| savezi.naziv | 0 | 0 |
| sportski_objekti.naziv | 0 | 0 |
Verdict: clean. The recent dedup/cleanup passes have eliminated all NULL/empty primary identifiers.
3. Orphan FKs
SELECT 'clanovi.klub_id->klubovi', COUNT(*) FROM pgz_sport.clanovi c
WHERE c.klub_id IS NOT NULL AND NOT EXISTS (SELECT 1 FROM pgz_sport.klubovi k WHERE k.id=c.klub_id)
UNION ALL
SELECT 'klubovi.savez_id->savezi', COUNT(*) FROM pgz_sport.klubovi k
WHERE k.savez_id IS NOT NULL AND NOT EXISTS (SELECT 1 FROM pgz_sport.savezi s WHERE s.id=k.savez_id)
UNION ALL
SELECT 'sys_audit.user_id->users', COUNT(*) FROM pgz_sport.sys_audit a
WHERE a.user_id IS NOT NULL AND NOT EXISTS (SELECT 1 FROM pgz_sport.users u WHERE u.id=a.user_id);
| Constraint | Orphan rows |
|---|---|
| clanovi.klub_id -> klubovi.id | 0 |
| klubovi.savez_id -> savezi.id | 0 |
| sys_audit.user_id -> users.id | 0 |
Verdict: clean. All FK chains are intact.
4. Duplicate OIBs
SELECT oib, count(*), string_agg(naziv, ' | ')
FROM pgz_sport.klubovi
WHERE oib IS NOT NULL AND oib ~ '^[0-9]{11}$'
GROUP BY oib HAVING count(*)>1;
48 distinct OIBs are shared by 2-4 klubovi rows each (~100 duplicate rows total). This is the single largest data-quality issue.
| OIB | Count | Names |
|---|---|---|
| 86603390999 | 3 | Juniorska ekipa Sv.Rok-Klana | Boćarski Klub Sv. Rok Klana | Sveti Rok-Klana |
| 80500347365 | 3 | HNK Orijent | Hrvatski Nogometni Klub Orijent | HNK Orijent 1919 (Sušak) |
| 44908060737 | 3 | Boćarski Klub Krimeja | Krimeja | BK Krimeja |
| 19490107091 | 3 | BOĆARSKI KLUB "LOVRAN" | Kadetska ekipa BK Lovran | Boćarski klub Lovran |
| 29964028897 | 4 | Boćarski klub Kastav | Kadetska ekipa BK Kastav 2 | Kadetska ekipa BK Kastav | Boćarski klub Kastav |
| 17563258345 | 3 | Plivački Klub Primorje Rijeka | KLUB DALJINSKOG PLIVANJA "PRIMORJE" | KLUB UMJETNIČKOG PLIVANJA „PRIMORJE AQUA MARIS" RIJEKA |
| 15986803554 | 2 | Košarkaški Klub Kvarner | Košarkaški klub KVARNER 2010 |
| 35549440954 | 2 | Muški Odbojkaški Klub "Gornja Vežica" | Muški Odbojkaški Klub Gornja Vežica |
| 37941242606 | 2 | Muški Boćarski Klub Hreljin | Boćarski klub Hreljin |
| 56273001018 | 2 | Nogometni klub Turbina Bakar | Nogometni Klub Turbina Tribalj |
| 67434497493 | 2 | Odbojkaški Klub Rab | Odbojkaški Klub "Rab" |
| 47139832980 | 2 | Hrvatski Akademski Odbojkaški Klub "Rijeka" | HRVATSKI AKADEMSKI ODBOJKAŠKI KLUB "RIJEKA" |
| 19514046928 | 2 | Lovačko društvo "JELEN" Čavle | LOVAČKO DRUŠTVO "JELEN" ČAVLE |
| 83495265520 | 2 | Odbojkaški Klub "Kastav 1998" | Odbojkaški Klub Kastav 1998 |
| 14384540738 | 2 | Boćarski klub Kostrena | Boćarski Klub Kostrena |
| 17639054753 | 2 | Streljački Klub Gluhih Galeb | Streljački klub gluhih "Galeb" |
| 40538276343 | 2 | Odbojkaški Klub "Odbojkaška Akademija Petica" | Odbojkaški klub Odbojkaška Akademija Petica |
| 76273502221 | 2 | Boćarski Klub Srdoči 1983 | Srdoči 1983 |
| 17934350916 | 2 | NOGOMETNI KLUB "KLANA" | NK Klana |
| 81511316706 | 2 | Odbojkaški Klub Kostrena Kostrena | Odbojkaški Klub "Kostrena" Kostrena |
| 27991069782 | 2 | Boćarski Klub Čavle Šb Čavle | Juniorska ekipa Čavle ŠB |
| 44509762938 | 2 | Kadetska ekipa BK Sveti Jakov | Boćarski Klub Sveti Jakov Jadranovo |
| 38093446162 | 2 | Lovranska Draga | Boćarski Klub Lovranska Draga |
| 56132503774 | 2 | Nogometni Klub Draga-Mošćenička Draga | NK Draga |
| 40936837495 | 2 | Lovačko društvo "KAMENJARKA" Kukuljanovo | LOVAČKO DRUŠTVO "KAMENJARKA" KUKULJANOVO-ŠKRLJEVO |
| 02999668483 | 2 | ŠK Goranka | KK Goranka |
| 35883230704 | 2 | Lovačko društvo "MEDVIĐAK" Drivenik Tribalj | LOVAČKO DRUŠTVO "MEDVIĐAK" DRIVENIK |
| 27420052480 | 2 | Krenovac | Boćarski Klub Krenovac |
| 17195966673 | 2 | Ženski Odbojkaški Klub "Crikvenica" | Ženski Odbojkaški Klub Crikvenica |
| 51108883738 | 2 | NK Risnjak | Nogometni Klub Risnjak Lokve |
| 13794801696 | 2 | Ženski nogometni klub Rijeka Jack Pot | Ženski nogometni klub Rijeka |
| 33154520914 | 2 | Malonogometni klub gluhih "Galeb" | Malonogometni Klub Gluhih Galeb |
| 52818156657 | 2 | Parastreljački Klub Paraolimpijac | Parastreljački klub "Paraolimpijac" |
| 42449645267 | 2 | Paraatletski Klub Rijeka | Paraatletski klub "Srce" Rijeka |
| 75947125821 | 2 | Boćarski klub Opatija | Boćarski Klub Opatija |
| 43219260850 | 2 | Ženski Akademski Odbojkaški Klub Škurinje Rijeka | Ženski Akademski Odbojkaški Klub Škurinje Rijeka |
| 85575561127 | 2 | SPORTSKO-REKREACIJSKO DRUŠTVO VIŠEVICA | rekreacijsko društvo VIŠEVICA |
| 19353575292 | 2 | Odbojkaški Klub "Sveti Matej 06" - Viškovo | Odbojkaški Klub Sveti Matej 06 - Viškovo |
| 86232456523 | 2 | Boćarski klub Krk | Boćarski klub Krk |
| 74630525187 | 2 | Nogometni klub Omladinac | NK Omladinac Vrata |
| 83261523211 | 2 | Odbojkaški Klub Opatija Volley | ODBOJKAŠKI KLUB OPATIJA VOLLEY |
| 98146784649 | 2 | Boćarski Klub Draga Mošćenička Draga | Draga – Mošćenička Draga |
| 39250096592 | 2 | Boćarski klub Brod Moravice | Boćarski Klub Brod Moravice |
| 76221716576 | 2 | Kuglački Klub Gluhih Galeb | Kuglački klub gluhih "Galeb" |
| 10132566066 | 2 | Vaterpolo klub PRIMORJE-ERSTE BANKA-ženska ekipa | Vaterpolo klub PRIMORJE-ERSTE BANKA-muška ekipa |
| 39123612806 | 2 | Stolnoteniski klub Rijeka | Parastolnoteniski Klub Rijeka |
| 70928157464 | 2 | Ženski Boćarski Klub Hreljin | ŽBK Hreljin |
| 77066352874 | 2 | Nogometni Klub Vinodol | NK Vihor |
Patterns:
- Casing/whitespace duplicates (
Boćarski klub KostrenavsBoćarski Klub Kostrena) — pure dupes, merge. - Quoting variants (
"Rab"vsRab) — same. - "Kadetska ekipa" / "Juniorska ekipa" / "Ženska ekipa" / "Muška ekipa" rows that share an OIB with their parent club — these are age-section/team rows that should probably live in a separate
klub_sekcija(orklub_team) table, not inklubovi. - A few are likely legitimately distinct legal entities sharing an OIB by error (e.g. Vinodol vs Vihor; NK Risnjak vs NK Risnjak Lokve) — flag for human review.
5. Placeholder values
-- klubovi
SELECT 'klubovi.naziv placeholders', COUNT(*) FROM pgz_sport.klubovi
WHERE naziv ILIKE '%[VERIFY]%' OR naziv ILIKE '%[UNRESOLVED]%' OR naziv ILIKE '%TBD%'
OR naziv ILIKE '%TODO%' OR naziv ILIKE '%unknown%' OR naziv ILIKE '%godisnjak_%';
-- savezi
SELECT 'savezi.naziv placeholders', COUNT(*) FROM pgz_sport.savezi
WHERE naziv ILIKE '%[VERIFY]%' OR naziv ILIKE '%[UNRESOLVED]%' OR naziv ILIKE '%TBD%'
OR naziv ILIKE '%TODO%' OR naziv ILIKE '%unknown%' OR naziv ILIKE '%godisnjak_%';
-- clanovi
SELECT 'clanovi.ime/prezime placeholders', COUNT(*) FROM pgz_sport.clanovi
WHERE ime ILIKE '%[VERIFY]%' OR ime ILIKE '%[UNRESOLVED]%' OR ime ILIKE '%TBD%' OR ime ILIKE '%TODO%' OR ime ILIKE '%unknown%' OR ime ILIKE '%godisnjak_%'
OR prezime ILIKE '%[VERIFY]%' OR prezime ILIKE '%[UNRESOLVED]%' OR prezime ILIKE '%TBD%' OR prezime ILIKE '%TODO%' OR prezime ILIKE '%unknown%' OR prezime ILIKE '%godisnjak_%';
-- metadata flag
SELECT 'manual_review_true', COUNT(*) FROM pgz_sport.klubovi WHERE metadata->>'manual_review' = 'true';
| Bucket | Count |
|---|---|
| klubovi.naziv with placeholder marker | 3 |
| savezi.naziv with placeholder marker | 0 |
| clanovi.ime/prezime with placeholder marker | 6 |
| klubovi.metadata.manual_review = 'true' | 3 |
klubovi placeholder rows
| id | naziv |
|---|---|
| 2630 | [VERIFY] Odbojkaški Klub Opatija |
| 2619 | [VERIFY] Odbojkaški Klub Čavle |
| 4426 | [UNRESOLVED] empty naziv & grad — id 4426 |
clanovi placeholder rows (matched the pattern via Todorović surname containing do...unkn...? — check is loose; these are false positives in fact)
SELECT id, ime, prezime FROM pgz_sport.clanovi
WHERE prezime ILIKE '%unknown%' OR ime ILIKE '%unknown%' OR ...
| id | ime | prezime |
|---|---|---|
| 4202 | Aleksa | Todorović |
| 4140 | Aleksa | Todorović |
| 1956 | Filip | Todorović |
| 377 | Dejan | Todorović |
| 3455 | Aleksa | Todorović |
| 551 | Matteo | Todorović |
These six are false positives — prezime "Todorović" matches %dor% token that overlaps %godisnjak_% is not the trigger; the actual trigger is %TODO% substring inside "ToDOrović" (case-insensitive ILIKE). They are real surnames, not placeholders. (Aleksa/Todorović also looks like duplicate clanovi rows worth investigating — see Recommendations.)
Verdict: placeholder pollution is essentially nil. Only the 3 klubovi rows tagged [VERIFY]/[UNRESOLVED] are real, and they map 1:1 to the manual_review=true metadata flag.
6. Low-coverage klubovi (filled < 4 of 12)
WITH cov AS (
SELECT id, naziv,
(CASE WHEN naziv IS NOT NULL AND naziv <>'' THEN 1 ELSE 0 END +
CASE WHEN sport IS NOT NULL AND sport <>'' THEN 1 ELSE 0 END +
CASE WHEN grad IS NOT NULL AND grad <>'' THEN 1 ELSE 0 END +
CASE WHEN oib IS NOT NULL AND oib <>'' THEN 1 ELSE 0 END +
CASE WHEN predsjednik IS NOT NULL AND predsjednik<>'' THEN 1 ELSE 0 END +
CASE WHEN tajnik IS NOT NULL AND tajnik <>'' THEN 1 ELSE 0 END +
CASE WHEN email IS NOT NULL AND email <>'' THEN 1 ELSE 0 END +
CASE WHEN telefon IS NOT NULL AND telefon <>'' THEN 1 ELSE 0 END +
CASE WHEN COALESCE(web, web_stranica) IS NOT NULL AND COALESCE(web, web_stranica)<>'' THEN 1 ELSE 0 END +
CASE WHEN COALESCE(sjediste, adresa) IS NOT NULL AND COALESCE(sjediste, adresa)<>'' THEN 1 ELSE 0 END +
CASE WHEN ciljevi IS NOT NULL AND ciljevi <>'' THEN 1 ELSE 0 END +
CASE WHEN opis_djelatnosti IS NOT NULL AND opis_djelatnosti<>'' THEN 1 ELSE 0 END
) AS filled
FROM pgz_sport.klubovi
)
SELECT id, naziv, filled FROM cov WHERE filled<4 ORDER BY filled ASC, id ASC LIMIT 20;
Total klubovi with filled < 4 / 12 (i.e. <33%): 518 (≈23% of the 2244 production klubovi).
Sample 20 worst (filled = 1 or 2)
| id | naziv | filled/12 |
|---|---|---|
| 4249 | Streljački klub DVD svojevrstan vodič za roditelje | 1 |
| 4250 | Streljački klub DVD Opatija | 1 |
| 2290 | KK Metal - Jurdani | 2 |
| 2291 | KK OI KOSTRENA | 2 |
| 2311 | RK LIBURNIJA 8. u II HRL Zapad od 12 | 2 |
| 2312 | RK MORNAR 3. u II HRL Zapad od 10 | 2 |
| 2315 | RK PŠR SELCE 5. u III HRL Zapad od 8 | 2 |
| 2324 | RK ČAVLE 2. u II HRL Zapad od 10 | 2 |
| 2325 | RK ČAVLE 7. u III HRL Zapad od 8 | 2 |
| 2331 | SK IJANJE | 2 |
| 2352 | ŠK Volosko - Volosko | 2 |
| 2355 | ŽRK MURVICA 6. u II HRL Zapad od 12 | 2 |
| 2356 | ŽRK MURVICA 6. u II HRL Zapad od 9 | 2 |
| 2360 | ŽRK ZAMET II 3. u III HRL Zapad od 8 | 2 |
| 3741 | AK Elena Ban | 2 |
| 3744 | AK Koper | 2 |
| 3747 | AK Kvarnera | 2 |
| 3748 | AK Rijeka | 2 |
| 3749 | AK Velenje | 2 |
| 3750 | AK Viškovo | 2 |
Patterns:
RK <CLUB> N. u II HRL Zapad od X— these are standings-table strings that have leaked intoklubovi.naziv. They're not clubs at all, they're handball league rankings. Should be deleted from klubovi (and redirected tonatjecanja_tablice).Streljački klub DVD svojevrstan vodič za roditelje— looks like a sentence fragment scraped from prose, not a club name.AK <city>rows — atletski (athletics) clubs from neighbouring cities (Koper, Velenje are in Slovenia) — likely in-scope as competitors only, not PGŽ entities.
7. Suspicious clanovi (junk imports)
SELECT id, ime, prezime FROM pgz_sport.clanovi
WHERE ime ~ '@|^\d+$' LIMIT 20;
Result: 0 rows. No emails or pure-numeric strings have leaked into the ime field. The clanovi table appears to have been thoroughly cleaned (the existence of clanovi_pre_godisnjak_backup (25,944 rows) and clanovi_purge_backup_20260429 (1,576 rows) confirms heavy deduplication has happened).
8. sys_audit health
SELECT 'total', COUNT(*)::text FROM pgz_sport.sys_audit
UNION ALL SELECT 'today', COUNT(*)::text FROM pgz_sport.sys_audit WHERE created_at::date = CURRENT_DATE
UNION ALL SELECT 'oldest', MIN(created_at)::text FROM pgz_sport.sys_audit
UNION ALL SELECT 'newest', MAX(created_at)::text FROM pgz_sport.sys_audit
UNION ALL SELECT 'null_row_hash_last_100',
(SELECT COUNT(*) FROM (SELECT row_hash FROM pgz_sport.sys_audit ORDER BY id DESC LIMIT 100) t WHERE row_hash IS NULL)::text;
| Metric | Value |
|---|---|
| Total rows | 627 |
| Rows today (2026-05-04) | 531 |
| Oldest entry | 2026-04-28 21:39:45 +02 |
| Newest entry | 2026-05-05 08:23:14 +02 |
| NULL row_hash in last 100 | 0 |
Verdict: chain integrity intact, but the audit log is only 7 days old — there's been a recent re-init, or audit was switched on only on 2026-04-28. Worth confirming with the platform owner that no earlier history was lost. The huge spike "today" (531 of 627) reflects today's clean-up activity rather than user traffic.
Recommended fixes (top 10)
-
Drop ~30 backup tables (~97k rows).
clanovi_pre_godisnjak_backup(25.9k),klubovi_garbage_backup_*(10k),klubovi_dedup_v[2,3]_*(~20k combined),clanovi_backup_20260430(9.5k), and the rest of the*_backup_*/*_premerge_*/*_pre_*/*_deprecated_*set. Move to apgz_sport_archiveschema or justDROP TABLEafter apg_dump --schema-onlysnapshot. Saves index size and stops accidental queries against stale data. -
Resolve 48 duplicate-OIB clusters in
klubovi(~100 rows). Recommended SQL pattern:-- For each OIB cluster, keep the row with highest filled-coverage and -- merge children (clanovi.klub_id, klub_sezona.klub_id, etc.) onto it. WITH dups AS (SELECT oib, MIN(id) AS keep_id FROM pgz_sport.klubovi WHERE oib ~ '^[0-9]{11}$' GROUP BY oib HAVING COUNT(*)>1), moves AS (SELECT k.id AS drop_id, d.keep_id FROM pgz_sport.klubovi k JOIN dups d USING (oib) WHERE k.id<>d.keep_id) UPDATE pgz_sport.clanovi c SET klub_id = m.keep_id FROM moves m WHERE c.klub_id = m.drop_id; -- repeat for klub_sezona, hns_klubovi_natjecanje, etc. -- then DELETE the drop_ids from klubovi.Run interactively via
/opt/pgz-sport/scripts/dedup_klubovi_by_oib.py(create if absent) with--dry-runfirst. -
Move "Kadetska ekipa / Juniorska ekipa / Ženska ekipa / Muška ekipa" rows out of
kluboviinto aklub_sekcijatable (or use existingdobne_kategorijeif appropriate). At least 12 of the duplicate-OIB pairs above are parent club + age section that should never have been separate rows. -
Delete the 14 standings-string klubovi rows (
RK ... N. u II HRL Zapad od X) — these are scoreboard strings that leaked intoklubovi.naziv. SQL:DELETE FROM pgz_sport.klubovi WHERE naziv ~ '\d+\. u (I{1,3}|IV) HRL .* od \d+';Verify count first (
SELECT COUNT(*) ...). -
Resolve the 3
[VERIFY]/[UNRESOLVED]klubovi (ids 2619, 2630, 4426). Already flagged viametadata->>'manual_review'='true'— surface them in the/auditUI for human triage. -
Run
/opt/pgz-sport/scripts/enrichment_worker.pyagainst the 518 klubovi with coverage <33%. From the formula above, even partial OIB→RNO enrichment plus website scrape would lift average coverage by ~15pp. Suggested batch:python3 /opt/pgz-sport/scripts/enrichment_worker.py --filter "filled<4" --limit 100 --concurrency 4 -
Deduplicate
Aleksa Todorović(and similar) inclanovi. ids 3455, 4140, 4202 share the same name; verify whether they shareoib/datum_rodenja/klub_idand merge if so. -
Confirm
sys_auditretention policy. Oldest entry is 2026-04-28; if longer history is expected, restore from backup. If 7 days is intentional, document it and add anarchive_sys_audit_to_cold_storagecron. -
Add a CHECK or partial UNIQUE INDEX on klubovi.oib for valid 11-digit OIBs:
CREATE UNIQUE INDEX CONCURRENTLY klubovi_oib_unique_valid ON pgz_sport.klubovi (oib) WHERE oib ~ '^[0-9]{11}$';This will physically prevent issue (2) from regressing once cleaned. Will fail until issue (2) is resolved — that's a feature.
-
Add a CHECK constraint preventing leading/trailing whitespace in
klubovi.nazivandclanovi.ime/prezime(the duplicate-OIB clusters above contain pairs like"Boćarski Klub Kostrena Kostrena"with double-space — these should never make it past INSERT):ALTER TABLE pgz_sport.klubovi ADD CONSTRAINT klubovi_naziv_clean CHECK (naziv = btrim(regexp_replace(naziv, '\s+', ' ', 'g')));