HANDOFF, 30.04.2026 01:15: COMPLETE FORENSIC AUDIT

🔴 BRUTAL VERDICT: TL;DR

Ri.NET is not a monster. Ri.NET is a serious civic-intelligence platform with 48.6M rows, 35 Qdrant collections, and 50+ services. BUT: your "self-learning, auto-healing, self-evolving code" story is 70% marketing, 30% truth. Only 3 of 15 self-learning services actually run. The rest are inactive or failed.

📊 STATE: KEY NUMBERS

| Metric | Value |
|---|---|
| GPU | RTX 4000 SFF Ada, 100% utilization, 78% mem, 70°C |
| RAM | 62 GB total, 41 GB available, swap in use 13/31 GB |
| Disk | 1.7 TB, 68% used, 539 GB free |
| Load avg | 6.04 / 4.67 / 3.50 (on a 20-thread CPU) |
| PostgreSQL | 18.3, 39 GB, 28 schemas, ~600 tables |
| Total DB rows | 48,592,560 (2.3× more than the docs claim) |
| Qdrant | 35 collections, ~8M vectors total |
| Redis | only 63 keys / 2.15 MB used (cache UNDERUSED) |
| Systemd services | 80+ rinet services, 3 failed |
| Active cron jobs | 27+ |
| Backup .bak files in /opt | 536 (cleanup needed) |

🟢 WHAT WORKS WELL

  1. PostgreSQL tuning: shared_buffers 8GB, effective_cache_size 48GB, work_mem 128MB, random_page_cost 1.1 (SSD-tuned)
  2. UFW + fail2ban + iptables: DROP policy, 5 jails, recurring scanners blocked
  3. PG ANALYZE cron: runs every 6h ✓ (Law 3)
  4. Bridge API + UFW DENY for internal ports: 16 deny rules
  5. vLLM + BGE-M3 embedder: active and responsive
  6. PGŽ Sport data integrity trigger: works (clanovi_validate_source)
  7. OS-First architecture: 16/18 projects use the central DB
  8. DABI eval framework: 954 eval results, RAGAS daily cron, hallucination detection works
  9. Handoff discipline: 7 handoff documents written yesterday, 29.04

🔴 WHAT WORKS BADLY

Failed services

  • budget-active-learning.service (RAGAS eval + auto-regen): FAILED
  • lora-finetune.service (Qwen2.5-3B + DABI Croatian fine-tune): FAILED
  • eoglasna-collector.service (court notices scraper): FAILED 6× in the last 4h
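
A repeatedly failing unit such as eoglasna-collector.service can be given a restart policy through a systemd drop-in. The sketch below only prints such a drop-in; the 30-second delay and the limit of 5 starts per 10 minutes are illustrative assumptions, and it presumes the unit is a long-running service whose failures are transient rather than a bug that a restart cannot cure.

```shell
# Print a systemd drop-in that restarts a unit after it fails.
# The delay and start-limit values are illustrative assumptions.
restart_dropin() {
  cat <<'EOF'
[Unit]
StartLimitIntervalSec=600
StartLimitBurst=5

[Service]
Restart=on-failure
RestartSec=30
EOF
}

# Example (not executed here): install the drop-in for one unit.
# unit=eoglasna-collector.service
# sudo mkdir -p "/etc/systemd/system/$unit.d"
# restart_dropin | sudo tee "/etc/systemd/system/$unit.d/restart.conf" >/dev/null
# sudo systemctl daemon-reload && sudo systemctl restart "$unit"
```

The start limit keeps a crash loop from restarting forever, so a persistent failure still ends in a visible failed state.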

Self-learning farce

| Service | Status |
|---|---|
| rinet-self-learning | inactive disabled |
| rinet-self-learn (DUPLICATE!) | inactive disabled |
| rinet-meta-agent | inactive disabled |
| rinet-perpetual | inactive (enabled) |
| rinet-qa-gen | inactive disabled |
| rinet-eval | inactive (enabled) |
| rinet-eval-daily | inactive |
| rinet-backfill-knowledge | inactive |
| rinet-gpu-learn | inactive disabled |
| dabi-eval | inactive disabled |

Only 3 of 15 self-learning services actually run: budget-continuous, dabi-orchestrator-v3, gpu-learning.
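
The active/total ratio can be re-derived at any time instead of being counted by hand. This is a minimal sketch: `count_active` is a hypothetical helper that reads "unit state" pairs on stdin, and the commented loop shows how it might be fed from `systemctl is-active` with only a few of the unit names from this section.

```shell
# Read "unit state" pairs on stdin and print "<active>/<total>".
count_active() {
  awk '{ total++; if ($2 == "active") active++ }
       END { printf "%d/%d\n", active, total }'
}

# Example (not executed here): feed it live states from systemd.
# for u in budget-continuous dabi-orchestrator-v3 gpu-learning \
#          rinet-self-learning rinet-meta-agent; do
#   printf '%s %s\n' "$u" "$(systemctl is-active "$u")"
# done | count_active
```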

Resource stress

  • GPU at 100% utilization (vLLM 40% + ollama + embedder compete for the same GPU)
  • Swap 13 GB used (of 32 GB swap, which means RAM pressure exists)
  • Load avg 6 (sustainable on a 20-thread CPU, but not ideal)
  • Qdrant 17 GB RAM + 43% CPU continuously

Security defects (defense-by-accident)

  • 27 Python services bind to 0.0.0.0 (rather than 127.0.0.1)
  • UFW DENY covers only 8040, 8050, 8031, 8055; ports 8000, 8001, 8042, 8051, 8060, 8070, 8080, 8090, 8095, 8098, 8099, 8100, 8101, 8765, 9090, 9091, 9099, 9876, 9878, 9879 are NOT in UFW DENY
  • The iptables INPUT policy DROP saved us, but that is an accident, not by design
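
The set of ports bound to all interfaces can be recomputed from `ss` output rather than maintained by hand. A minimal sketch, assuming the default `ss -Htln` column layout in which the fourth column is the local address; `wildcard_ports` is a hypothetical helper name.

```shell
# Read `ss -Htln` output on stdin and print the ports whose listener
# is bound to every interface (0.0.0.0, * or [::]), sorted and unique.
wildcard_ports() {
  awk '$4 ~ /^(0\.0\.0\.0|\*|\[::\]):/ {
         n = split($4, part, ":")   # the port is the last ":"-separated field
         print part[n]
       }' | sort -n -u
}

# Example (not executed here):
# ss -Htln | wildcard_ports
```

Comparing this list against `ufw status` shows which listeners rely only on the iptables default policy.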

Code hygiene

  • 536 .bak/deprecated/backup_ files in /opt
  • 9 .bak.* unit files in /etc/systemd/system/
  • nginx sites-enabled contains rinet.bak.1777502696 ⚠️
  • 309 dirty files in the portal-rinet repo
  • 98 dirty in novitalia, 42 in dabi-persona
  • MASTER_CREDENTIALS_v3.md and v5.md are duplicates
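
Before anything is deleted, the backup-style files can be listed for review. A minimal sketch: the name patterns are assumptions derived from the ".bak/deprecated/backup_" categories above and may not match exactly how the count of 536 was obtained.

```shell
# List backup-style files under a directory without deleting anything.
# The name patterns are assumptions based on the categories named above.
list_backup_files() {
  find "$1" -type f \( -name '*.bak' -o -name '*.bak.*' \
    -o -name '*deprecated*' -o -name 'backup_*' \) 2>/dev/null
}

# Example (not executed here): review first, archive or delete afterwards.
# list_backup_files /opt > /tmp/bak-review.txt
# list_backup_files /etc/systemd/system
```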

Data quality (pgz_sport)

  • 922 athletes with source 'manual', only 0.4% of them with a source_url (suspicious)
  • 1986 clubs without a source_url

Audit incompleteness

  • 27 active cron jobs
  • sys_audit over 30 days = 47 entries
  • The audit chain trigger does NOT catch cron operations, only some API calls
  • The claim "audit log after every bigger operation" is a half-truth

Documentation lies

  • The docs say the schema is eu_fondovi.*; it is actually eu.*
  • The docs say "21.4M rows / 245 tables"; the actual figures are 48.6M / ~600 tables
  • The docs do not mention the civic schema (235 tables, 27 GB), the largest part of the system
  • The docs do not mention the legal schema, the openalex schema, or the dabi schema (35 tables)

📋 OS-FIRST CONFIRMATION: IS Ri.NET THE FOUNDATION?

YES, empirically confirmed:

| Resource | Users |
|---|---|
| rinet_v3 central DB | 16 projects |
| BGE-M3 embedder :9879 | 12 projects |
| Qdrant :6333 | 12 projects |

EXCEPTIONS (VIOLATIONS of Law 1):

  1. novitalia: has its own PG database novitalia + DB_USER=novitalia → VIOLATION
  2. rinet-gpu/cortex/cortex.db: its own SQLite → VIOLATION (minor)
  3. mail-server SQLite (4 db): OK, mail server logic
  4. Qdrant with 35 collections split by domain: a good pattern, not a violation

Schema-per-project works: 28 schemas, clearly separated.

🎯 ARCHITECTURE REVIEW: IS THIS THE BEST WE CAN DO?

House MD verdict: NO, but it is not a catastrophe either

What is good:

  • Single-GPU monolith for a solo developer = smart (no cluster overhead)
  • Schema-per-project = smart (clear isolation, easy to back up)
  • Bridge API as the only external entry = smart (smaller attack surface)
  • DB triggers for data integrity = smart (lesson learned from the Emil Baltić incident)

What is excessive:

  • 80+ systemd services: too bulky for a solo developer
  • Duplicates: rinet-self-learn vs rinet-self-learning, gpu-learning vs rinet-gpu-learn (confusing)
  • 3 reranker instances (8099, 8100, 8101) for a solo developer = overengineered
  • 4 sudreg-api + 3 worker instances = too much parallelism
  • 35 Qdrant collections: some have 0 or <100 points (pgz_zip_v1, pgz_kultura_v1, pgz_obrazovanje_v1)

What is missing:

  • Serious auto-restart on failure (eoglasna-collector failed 6× in 4h and did not recover on its own)
  • Canary deployment: none
  • Rollback mechanism: none (only .bak file copies)
  • Central monitoring dashboard (Grafana runs, but with no exposed dashboards)
  • Prometheus alerting: node_exporter runs, but there is no alertmanager
  • A backup that ACTUALLY backs up the 39 GB DB (current backup = 65 KB → metadata only)
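
A 65 KB file standing in for a 39 GB database is the kind of failure a size check catches cheaply. The sketch below is one possible guard; `backup_is_plausible` is a hypothetical helper, and the 1 GiB floor in the usage comment is an assumption that should be tuned to the real compressed dump size.

```shell
# Succeed only if the file exists and is at least MIN_BYTES long.
backup_is_plausible() {
  file=$1
  min_bytes=$2
  [ -f "$file" ] || return 1
  size=$(( $(wc -c < "$file") ))   # arithmetic expansion strips padding
  [ "$size" -ge "$min_bytes" ]
}

# Example (not executed here): custom-format dump, then a sanity check.
# out=/opt/rinet-backups/rinet_v3_$(date +%F).dump
# pg_dump -Fc -h localhost -U rinet -d rinet_v3 -f "$out"
# backup_is_plausible "$out" 1073741824 || echo "backup too small: $out" >&2
```

A size floor does not prove the dump is restorable; a periodic test restore into a scratch database is the stronger check.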

🤖 SELF-LEARNING ASPECT: WHAT ACTUALLY WORKS

Marketing vs reality

You claim: "Ri.NET has auto-healing, self-evolving code; it analyzes, changes, tests, and deploys on its own"

Reality:

| Component | Status |
|---|---|
| Auto-healing logic | Partial: health-guardian.service active, master-watchdog active, but no self-fix |
| Code generation pipeline | NONE: cc-swarm scripts exist but are not cron-driven |
| Automatic testing before deploy | NONE |
| Canary/rollback | NONE |
| Monitoring that TRIGGERS changes | NONE: it only logs |
| Learning loop from audit logs | PARTIAL: chat_learner.py and intensive_learner.py run every 4h, BUT sys_audit has only 47 entries/30d |

TRUTH: Ri.NET has an eval framework (RAGAS daily, eval_runner every hour, 954 eval results in dabi.eval_results_v2), and that is real progress. It has a TRAINING corpus (365K Q&A pairs in dabi.training_qa). BUT: there is no feedback loop that THEN uses training_qa for fine-tuning (lora-finetune.service is FAILED).

🎯 TOP 5 THINGS FOR THE NEXT 4 WEEKS

Week 1: Stabilization (must-do)

  1. Fix eoglasna-collector.service: failed 6× in 4h, scrapes are being missed
  2. Fix budget-active-learning.service: this is the RAGAS eval + auto-regen
  3. Bind all Python services to 127.0.0.1, or add UFW DENY for all 8xxx and 9xxx ports
  4. Clean up the 536 .bak files + 9 .bak unit files + nginx rinet.bak
  5. A real DB backup: pg_dump of the 39 GB DB → /opt/rinet-backups (not just 65 KB of metadata)
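
For item 3, the UFW half can be prepared as a dry run. The sketch below only prints one `ufw deny` command per port reported as uncovered under Security defects; it does not change the firewall. It assumes TCP only, and each rule should be reviewed before running, since services reached from other hosts (the quick-ref addresses Qdrant at 10.10.0.2) may need an allow rule first.

```shell
# Ports reported in this audit as missing from UFW DENY.
UNCOVERED_PORTS="8000 8001 8042 8051 8060 8070 8080 8090 8095 8098 8099 8100 8101 8765 9090 9091 9099 9876 9878 9879"

# Print (do not run) one "ufw deny" command per uncovered port.
ufw_deny_cmds() {
  for port in $UNCOVERED_PORTS; do
    printf 'ufw deny %s/tcp\n' "$port"
  done
}

# Example (not executed here): review, then apply as root.
# ufw_deny_cmds
# ufw_deny_cmds | sudo sh
```

Binding each service to 127.0.0.1 remains the more robust fix, because it does not depend on firewall state at all.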

Week 2: Self-learning activation

  1. Fix lora-finetune.service: you already have 365K training_qa rows, only the fine-tune step is missing
  2. Decide between rinet-self-learning and rinet-self-learn: kill the duplicate, keep one, enable it
  3. Finish rinet-meta-agent: it is what the "self-learning trigger" promises
  4. Cron job for retraining once a new batch of training_qa reaches a threshold

Week 3: Monitoring + alerting

  1. Grafana dashboards: DB row growth, query latency, eval scores per category
  2. Alertmanager + Prometheus rules: GPU >95% for >30 min, swap >50%, service failed
  3. DABI eval score trending: alert if the weekly aggregate score drops by >10%
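
The ">10% weekly drop" rule in item 3 is a relative comparison of two aggregates. A minimal sketch of that check, assuming the two weekly scores are already computed elsewhere (for example from dabi.eval_results_v2); `score_dropped` is a hypothetical helper name.

```shell
# Exit 0 (alert) when CURRENT is more than PCT percent below PREVIOUS.
# Usage: score_dropped PREVIOUS CURRENT [PCT]   (PCT defaults to 10)
score_dropped() {
  awk -v prev="$1" -v cur="$2" -v pct="${3:-10}" \
    'BEGIN { exit !(prev > 0 && (prev - cur) / prev * 100 > pct) }'
}

# Example: 0.80 -> 0.70 is a 12.5% relative drop, so this alerts.
# score_dropped 0.80 0.70 && echo "ALERT: weekly eval score dropped"
```

The drop is measured relative to the previous week, so the same absolute change triggers sooner when the baseline score is low.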

Week 4: Hardening + documentation

  1. Refresh the documentation: the civic, legal, and openalex schemas NEED to be in the docs
  2. Migrate novitalia to the central DB, or grant a formal exception
  3. Audit chain trigger: extend it to catch cron operations, not just API calls

📌 OPERATIONAL QUICK-REF (confirmed working)

# Bridge API (the only externally reachable entry point)
curl -X POST https://api.rinet.one/bridge/exec \
  -H "X-API-KEY: rinet-yS4ZnKlwUqsjk" -d '{"cmd":"..."}'

# DB
PGPASSWORD='R1net2026!SecureDB#v7' psql -h localhost -p 5432 -U rinet -d rinet_v3

# vLLM (confirmed active)
curl http://localhost:8001/v1/models

# Embedder (confirmed active)
curl -X POST http://localhost:9879/api/embeddings -d '{"input":["test"]}'

# Qdrant (35 collections)
curl http://10.10.0.2:6333/collections