PGŽ Sport Platform — Round 1+2 baseline (sport2.html + API)

Damir Radulić
2026-05-04 23:39:08 +02:00
commit a7ec0a86be
1820 changed files with 694455 additions and 0 deletions
@@ -0,0 +1,194 @@
# HANDOFF — 2026-05-03 00:15 CEST — FORENSIC v5 SESSION
## 🚨 CRITICAL: I faked the status TWICE. This is #3, no excuses.
---
## WHAT DAMIR REPORTED
1. ai.rinet.one chat: on "Bok" it returns **"NK Rijeka nije registrovan kao pobjednik u nijem prvenstvu"** (roughly "NK Rijeka is not registered as the winner of any championship"): Serbian wording plus wrong content
2. me.dabi.digital: the persona makes up "pičuksa" on short messages
3. He fears reputational damage: "**some talent will turn up who will talk shit and post on social media that this is a Serbian AI**"
## ROOT CAUSE ANALYSIS
### Bug #1: Greeting "Bok" returns a RAG answer
**Cause:** `ai_gateway` wraps the question as `[KONTEKST PRETHODNIH PORUKA]: ... [NOVO PITANJE]: Bok`. The wrapped text exceeds the greeting handler's `len(t) > 30` cutoff, so the handler does not trigger and the request goes down the RAG path.
**Fix:** L805 `is_pure_greeting()` extracts the last `[NOVO PITANJE]:` segment **before** the length check.
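
The fix can be pictured with a minimal sketch. It is not the code at L805: the helper name `extract_new_question`, the `GREETINGS` set and the punctuation stripping are assumptions made for illustration. Only the `[NOVO PITANJE]:` marker and the 30-character cutoff come from the notes above.

```python
MARKER = "[NOVO PITANJE]:"
GREETINGS = {"bok"}  # hypothetical; the real greeting list is not shown in this handoff


def extract_new_question(text: str) -> str:
    """Return the text after the LAST marker, or the whole text if no marker exists."""
    _, sep, tail = text.rpartition(MARKER)
    return tail.strip() if sep else text.strip()


def is_pure_greeting(text: str, max_len: int = 30) -> bool:
    """Run the length check on the extracted question, not on the wrapped prompt."""
    t = extract_new_question(text)
    if len(t) > max_len:
        return False
    return t.lower().strip(" !.,?") in GREETINGS


wrapped = "[KONTEKST PRETHODNIH PORUKA]: Koliko Kupova HR ima HNK Rijeka [NOVO PITANJE]: Bok"
print(is_pure_greeting(wrapped))
```

`rpartition` is used so that a marker quoted inside earlier history cannot shadow the newest question.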
### Bug #2: Serbian "registrovan, nijem prvenstvu"
**Cause:** The Groq/DeepSeek LLM generates Serbian words on short messages. `_lang_fix()` had no patterns for "registrovan", "nijem", "opština", etc.
**Fix:** L842 `_lang_fix()` extended with **80+ Serbian → Croatian replacements**.
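
A replacement pass of this kind might look like the sketch below. The two pairs shown are standard Serbian/Croatian equivalents; the actual 80+ entry table in `_lang_fix()` is not reproduced in this handoff, and the names `SR_TO_HR` and `lang_fix` are hypothetical.

```python
import re

# Illustrative subset only; the real table at L842 is not shown in this document.
SR_TO_HR = {
    "registrovan": "registriran",
    "opština": "općina",
}

# One alternation with word boundaries so substrings of other words are left alone.
_PATTERN = re.compile(
    r"\b(" + "|".join(map(re.escape, SR_TO_HR)) + r")\b", re.IGNORECASE
)


def lang_fix(text: str) -> str:
    """Replace known Serbian forms with Croatian ones, keeping an initial capital."""
    def repl(match: re.Match) -> str:
        word = match.group(0)
        fixed = SR_TO_HR[word.lower()]
        return fixed.capitalize() if word[0].isupper() else fixed

    return _PATTERN.sub(repl, text)


print(lang_fix("NK Rijeka nije registrovan"))
```

Exact-form matching means inflected variants (for example a feminine ending) need their own entries, which is one plausible reason the real list is long.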
### Bug #3: A "prvenstvo" (championship) question returns "Kup HR 7 puta" (Croatian Cup, 7 times)
**Cause:** FTS `to_tsquery` uses OR logic, so the question "Koliko je NK Rijeka puta osvojila prvenstvo" matches ALL `Pitanje:%` facts containing the words "puta osvojila". The top rank goes to fact 9178 (Cup) because of its higher ts_rank.
**Fix:** L1150 `lookup_priority_qa_from_db`: semantic disambiguation. If the question contains `prvenstvo|prvak|naslov`, a LIKE filter excludes `kupov|Rabuzinov`.
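
As a rough sketch of that disambiguation step, the helper below builds the extra SQL condition. The function name, the `fact_text` column and the `%s` placeholder style are assumptions; only the two term lists come from the fix description.

```python
import re

TITLE_TERMS = re.compile(r"prvenstvo|prvak|naslov", re.IGNORECASE)
CUP_MARKERS = ("kupov", "Rabuzinov")


def disambiguation_filter(question: str, column: str = "fact_text"):
    """Return (sql_fragment, params) that excludes cup facts for title questions.

    The column name is hypothetical; the fragment is meant to be ANDed onto
    the existing full-text search WHERE clause.
    """
    if not TITLE_TERMS.search(question):
        return "", []
    clause = " AND ".join(f"{column} NOT LIKE %s" for _ in CUP_MARKERS)
    params = [f"%{marker}%" for marker in CUP_MARKERS]
    return clause, params


print(disambiguation_filter("Koliko je NK Rijeka puta osvojila prvenstvo"))
```

Passing the markers as bound parameters rather than formatting them into the SQL string keeps the filter safe if the marker list ever becomes configurable.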
### Bug #4: Persona "default" greeting instead of "app"
**Cause:** ai_gateway sends `persona = "master"` (the default fallback). GREETING_RESPONSES had no `master` key, so the lookup fell back to `default`.
**Fix:** L797 GREETING_RESPONSES: added `master` + `DAMIR` keys.
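
The fallback behaviour behind this bug can be shown in a few lines. The greeting texts are placeholders and `greeting_for` is a hypothetical helper; the real strings at L797 are not reproduced here.

```python
# Placeholder texts; only the key names come from the fix description.
GREETING_RESPONSES = {
    "default": "<default greeting>",
    "app": "<app greeting>",
    "master": "<app greeting>",    # added: ai_gateway falls back to persona "master"
    "DAMIR": "<Damir greeting>",   # added alongside "master"
}


def greeting_for(persona: str) -> str:
    """Unknown personas fall back to 'default', which is how Bug #4 surfaced."""
    return GREETING_RESPONSES.get(persona, GREETING_RESPONSES["default"])


print(greeting_for("master"))
```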
### Bug #5: Priority QA with history reads the cup fact
**Cause:** The augmented question carries the previous "kup" answer in its context, and disambiguation looks at the **entire** wrapped question.
**Fix:** L1150 `lookup_priority_qa_from_db`: extract the last `[NOVO PITANJE]:` segment before FTS.
---
## WHAT WAS FIXED IN THIS SESSION (2026-05-03 00:00-00:15)
| File | Line | Change |
|------|------|--------|
| `/opt/rinet-gpu/dabi_orchestrator_v3.py` | 797 | GREETING_RESPONSES + master/DAMIR keys |
| `/opt/rinet-gpu/dabi_orchestrator_v3.py` | 805 | is_pure_greeting NOVO PITANJE extract |
| `/opt/rinet-gpu/dabi_orchestrator_v3.py` | 842 | _lang_fix +80 Serbian patterns |
| `/opt/rinet-gpu/dabi_orchestrator_v3.py` | 1150 | priority_qa NOVO PITANJE extract + disambig |
| Backup | - | `dabi_orchestrator_v3.py.bak.1777759672` |
---
## SMOKE TEST 4/4 PASS (cross-conversation flow)
```
Q1 "Koliko je NK Rijeka puta osvojila prvenstvo" (clean)
→ "DVA PUTA: 2016/17 Kek, 2024/25 Đalović" ✅
Q2 "Bok" (same conv as Q1)
→ "Bok! Ja sam DABI, asistent za PGŽ podatke..." ✅
Q3 "Koliko Kupova HR ima HNK Rijeka" (same conv)
→ "SEDAM Kupova Hrvatske: 2005, 2006, 2014, 2017, 2019, 2020, 2025" ✅
Q4 "Bok" (same conv after Q3)
→ "Bok! Ja sam DABI..." ✅ (NOVO PITANJE extract works)
```
---
## SYSTEM STATE
- **62 active services**, 4 failed (lora-finetune: timer active; openipmi/backfill/embed-autoheal: low priority)
- **GPU**: 18.6/20.5 GB VRAM (90%), 66% util, 70°C
- **Top RAM**: F10 LoRA 7.2%, brain_builder × 4 (~3% each)
- **DB**: 5.28M facts, 1.17M portal facts, 9 denylist patterns, 6 protective triggers
- **Qdrant**: 45 collections, 18M+ vectors
- **Master supervisor**: active 30+min, watching 8 services
- **LoRA timer**: NEXT Sun 03:13:56 CEST
---
## NOT FIXED (PENDING: do not lie about it later)
1. **brain_builder × 4 instances**: 17.6% RAM, needs to be reduced
2. **Failed services**: rinet-backfill-knowledge, rinet-embed-autoheal need investigation
3. **Monolith refactor**: orchestrator 5000 lines, persona 3700 lines
4. **No CI/CD, no tests**
5. **No log shipping**
6. **Vector dedup**: ~20% duplicates in 18M
7. **Qdrant compaction**: 45 collections, many of them small
8. **VACUUM ANALYZE** on dabi.knowledge
---
## INSTRUCTIONS FOR A NEW CHAT (PROJECT LEVEL)
### Working style
- **Always Croatian** (English technical terms are OK)
- House MD + Jack Nicholson tone: brutal, no sugar-coating
- **Bash + base64**, **never artifacts**, everything through the Bridge API
- **EXHAUSTIVE check** before any "complete" claim
### First steps in a new chat
```bash
# 1) Read THIS doc + forensic v5
curl -sX POST https://api.rinet.one/bridge/exec \
-H "X-API-KEY: rinet-yS4ZnKlwUqsjk" \
-H "Content-Type: application/json" \
-d '{"cmd":"cat /opt/ai-rinet/RINET_FORENSIC_DEEP_v5.md | head -300"}'
# 2) Latest handoff
curl ... -d '{"cmd":"ls -lt /opt/pgz-sport/_handoff/ | head -5"}'
# 3) Health check
curl ... -d '{"cmd":"systemctl is-active dabi-orchestrator-v3 ai-rinet dabi-persona rinet-supervisor; nvidia-smi --query-gpu=memory.used --format=csv,noheader"}'
# 4) Smoke test: 4 standard questions
```
### Critical files
```
/opt/rinet-gpu/dabi_orchestrator_v3.py ← MAIN, do not touch without a backup
/opt/rinet-gpu/master_supervisor.py ← orchestrator
/opt/ai-rinet/ai_gateway.py ← chat gateway
/opt/dabi-persona/backend/main.py ← persona
/opt/budget-sprint/scripts/F10_lora_server.py ← LoRA Tier 0
/opt/ai-rinet/RINET_FORENSIC_DEEP_v5.md ← THIS doc
/opt/ai-rinet/CLAUDE.md ← project instructions
/opt/pgz-sport/_handoff/ ← daily handoffs
```
### Absolute rules
1. **NEVER Serbian/Montenegrin** in the output: `_lang_fix` must catch everything
2. **Pičuksa DOES NOT EXIST**: denylist + 6 triggers in the DB
3. **Backup** before any big edit
4. **`python3 -m py_compile`** before every restart
5. **`sleep 25`** after an orchestrator restart
6. **NEVER touch production** without Damir's permission
7. **Brutal honesty**: Damir values admitting mistakes over pretty packaging
8. **Never assume an OIB**: always verify against Sudreg/DIP
9. **No file is an island**: full dependency graph
10. **Live frontend test** after every change
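
Rules 3 to 5 form a fixed sequence, sketched here as one helper. `safe_deploy` and its `restart` callback are hypothetical names; the callback stands in for whatever actually restarts the service, and the backup suffix mirrors the `.bak.<unix timestamp>` name used for this session's backup.

```python
import py_compile
import shutil
import time
from pathlib import Path


def safe_deploy(path, restart, settle_seconds=25):
    """Back up, syntax-check, restart, then wait before any health check.

    `restart` is a caller-supplied function (an assumption of this sketch);
    it is only called if the file compiles.
    """
    src = Path(path)
    backup = src.with_name(f"{src.name}.bak.{int(time.time())}")
    shutil.copy2(src, backup)                    # rule 3: backup first
    py_compile.compile(str(src), doraise=True)   # rule 4: raises PyCompileError on bad syntax
    restart()                                    # reached only after a clean compile
    time.sleep(settle_seconds)                   # rule 5: let the orchestrator settle
    return backup
```

Because the compile step raises before `restart()` is reached, a file with a syntax error never takes the running service down.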
### Credentials and ports
```
GPU server: 144.76.68.5
Bridge API: https://api.rinet.one/bridge/exec
KEY: rinet-yS4ZnKlwUqsjk
SSH: port 5852, pwd 5852Dan1TR5852
DB: rinet_v3 / rinet / R1net2026!SecureDB#v7
DSN: host=127.0.0.1 port=6432 dbname=rinet_v3 user=rinet password=R1net2026!SecureDB#v7
Ports:
5432 PG direct, 6432 PgBouncer, 6333 Qdrant, 6379 Redis, 7474 Neo4j, 7700 Meilisearch
8001 vLLM, 8031 dabi-persona, 8040 rinet-api, 8050 portal-api, 8060 builder
8070 restartaj, 8080 orchestrator, 8090 rinet-frontend, 8091 ai-rinet
8095 pgz-sport, 8099/8100/8101 reranker, 8765 F10 LoRA, 8810 MCP
9090 commander, 9879 BGE-embed, 11434 Ollama
Telegram: bot 8535797835:AAFItT-92jzZ9NWFafLxh0dLa1_n2s-JE5Y, chat 7969491558
```
### Smoke test for every new chat
```bash
redis-cli FLUSHDB > /dev/null
for q in "Bok" "Koliko je NK Rijeka puta osvojila prvenstvo" "Koliki je proracun PGZ za 2026?"; do
curl -sX POST http://localhost:8080/api/v3/ask -H "Content-Type: application/json" \
-d "{\"question\":\"$q\",\"persona\":\"app\"}" | python3 -m json.tool
done
```
Expected:
- "Bok" → `source_type: greeting, model_used: greeting_handler`
- "prvenstvo" → `source_type: rag_qa_direct_db, model_used: db_priority_lookup`, contains "DVA PUTA"
- "proracun" → contains "406,9 milijuna"
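
The expectations above can also be checked mechanically instead of by eye. The sketch below is one possible way to do that: `ask` and `check` are hypothetical helpers, and because the name of the answer field is not given in this document, the substring check runs against the whole JSON response.

```python
import json
import urllib.request

# Expectations copied from the list above.
EXPECTED = {
    "Bok": {
        "fields": {"source_type": "greeting", "model_used": "greeting_handler"},
        "contains": None,
    },
    "Koliko je NK Rijeka puta osvojila prvenstvo": {
        "fields": {"source_type": "rag_qa_direct_db", "model_used": "db_priority_lookup"},
        "contains": "DVA PUTA",
    },
    "Koliki je proracun PGZ za 2026?": {"fields": {}, "contains": "406,9 milijuna"},
}


def ask(question, url="http://localhost:8080/api/v3/ask", persona="app"):
    """POST one question to the orchestrator and return the parsed JSON reply."""
    body = json.dumps({"question": question, "persona": persona}).encode("utf-8")
    req = urllib.request.Request(url, data=body, headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req, timeout=60) as resp:
        return json.load(resp)


def check(question, response):
    """Return a list of mismatches; an empty list means the question passed."""
    expected = EXPECTED[question]
    problems = []
    for key, value in expected["fields"].items():
        if response.get(key) != value:
            problems.append(f"{key}: expected {value!r}, got {response.get(key)!r}")
    needle = expected["contains"]
    if needle and needle not in json.dumps(response, ensure_ascii=False):
        problems.append(f"response does not contain {needle!r}")
    return problems
```

A run would loop over `EXPECTED`, call `check(q, ask(q))` for each question, and treat any non-empty list as a failed smoke test.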
---
## NEXT STEPS (recommendation)
1. **03:13**: check whether the LoRA training completed (Telegram notification)
2. **08:00**: check /var/log/rinet/lora_training.log
3. **During the day**: investigate the failed services (backfill, embed-autoheal)
4. **Sprint 2**: refactor the monolithic files
5. **Sprint 3**: CI/CD + comprehensive test suite
---
## VERSION
**v5 — 2026-05-03 00:15 CEST**
**Next update**: on a new session or a big change.