M1+M2+M10 (CC2 R3): JWT auth + admin users + GDPR backend

- auth/auth_v2.py: JWT login/refresh/logout/me + bcrypt + tenant_id/role/tier claims
- auth/admin_users.py: /api/admin/users CRUD + invite/role/suspend + bulk CSV
- auth/gdpr.py: cookie consent + Art.20 export + Art.17 erasure + admin queue
- auth/seed_demo.py: 3 demo tenants + 4 users (damir@pgz.hr / PGZ2026!)
- Removed legacy /api/auth/login + /api/auth/me from pgz_sport_api.py
- Wired auth/admin/gdpr routers into FastAPI

5/5 live curl tests pass: damir@pgz.hr login → JWT with tenant_id=1, role=pgz_admin, tier=0
Damir Radulić
2026-05-05 00:09:09 +02:00
parent c12a8e9698
commit 492c8fdd87
23 changed files with 21518 additions and 49 deletions
@@ -0,0 +1,164 @@
# HANDOFF — FULL MIGRATION + CLEANUP
**Date:** 04.05.2026 23:50 CEST
**Author:** Damir Radulić (via a Claude session)
**Version:** v1.0
## TL;DR
The migration from the GPU server (144.76.68.5) to Server B (10.10.0.2) is **COMPLETE**. The local PG is **stopped+disabled**. The system runs 100% from Server B. About 30GB of disk was recovered. Cron timeouts were added to prevent further stuck processes.
## What was done tonight
### 1. Dependency migration (pgz_sport + others)
- `pgz_sport_api.py`: DSN `localhost:5432` → `10.10.0.2:6432`
- `pgz_sport_v2_router.py`: same fix applied
- `learn_loop.py`: checked, already points to Server B
- `reembed_phase2.py`: DSN fix → 10.10.0.2:6432
- `reembed_knowledge_v2.py`: fixed the import from the docstring (`DB_DSN` was undefined)
### 2. EnvironmentFile fix (MAIN BUG)
These units were missing `EnvironmentFile=/opt/rinet-gpu/.env.master`:
- `dabi-orchestrator-v3.service`
- `rinet-mcp.service`
- `rinet-supervisor.service`
- `rinet-heartbeat.service`

Consequence: the env vars (QDRANT_URL, GROQ_API_KEY, ANTHROPIC_API_KEY, DEEPSEEK_API_KEY) never reached the processes.
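
A check like the following could catch this class of bug before a deploy. It is a minimal sketch, not the procedure used in this session: the function names and the idea of scanning unit files from Python are assumptions, and it ignores systemd drop-in directories and the optional `-` prefix that `EnvironmentFile=` accepts.

```python
from pathlib import Path

# Directive taken from the handoff note; everything else here is illustrative.
REQUIRED = "EnvironmentFile=/opt/rinet-gpu/.env.master"


def has_environment_file(unit_text: str, directive: str = REQUIRED) -> bool:
    """Return True if `directive` appears as a line inside the [Service] section."""
    section = None
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";")):
            continue  # skip blank lines and comments
        if line.startswith("[") and line.endswith("]"):
            section = line  # entering a new section such as [Unit] or [Service]
            continue
        if section == "[Service]" and line == directive:
            return True
    return False


def units_missing_env(unit_dir: str, names: list) -> list:
    """Return the unit names whose file is absent or lacks the directive."""
    missing = []
    for name in names:
        path = Path(unit_dir) / name
        if not path.is_file() or not has_environment_file(path.read_text()):
            missing.append(name)
    return missing
```

Called as `units_missing_env("/etc/systemd/system", ["rinet-mcp.service"])`, it returns the units that still need the fix; the unit directory is an assumption, since the note does not say where the unit files live.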
### 3. Mass-fix Qdrant URL (35+ scripts)
- `localhost:6333` → `10.10.0.2:6333` in **55+ active files**
- Covered: /opt/rinet-gpu, /opt/ai-rinet, /opt/pgz-sport, /opt/dabi-persona, /opt/portal-rinet
- Remaining: backup files (pre_b_switch, .bak.*), left untouched
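
A bulk rewrite of this kind can be sketched as below. This is an illustration under assumptions, not the command that was actually run: it only looks at `*.py` files, the backup test is a simple name match on `pre_b_switch` and `.bak`, and the function names are hypothetical. The dry-run default lists affected files without touching them.

```python
from pathlib import Path

OLD, NEW = "localhost:6333", "10.10.0.2:6333"


def is_backup(path: Path) -> bool:
    """Assumed backup naming, based on the patterns listed in the note."""
    return "pre_b_switch" in path.name or ".bak" in path.name


def rewrite_qdrant_url(root: str, dry_run: bool = True) -> list:
    """Return the files containing OLD; rewrite them only when dry_run is False."""
    changed = []
    for path in Path(root).rglob("*.py"):
        if not path.is_file() or is_backup(path):
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # unreadable or non-text file: leave it alone
        if OLD in text:
            changed.append(path)
            if not dry_run:
                path.write_text(text.replace(OLD, NEW), encoding="utf-8")
    return sorted(changed)
```

Running it once with `dry_run=True` per directory and reviewing the list before the real pass keeps the change auditable.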
### 4. TG spam blocking
- Global Python monkey-patch in `/usr/lib/python3/dist-packages/usercustomize.py`
- Intercepts every `requests.post("api.telegram.org/...")` in all Python processes
- Sends through the rate-limited `rinet-notify` helper (max 5/h, dedup 30 min)
- Bash wrapper `/usr/local/bin/rinet-curl-tg`
- Disabled the cron monitors (embed_monitor.sh, embed_monitor_p2.sh)
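
The "max 5/h, dedup 30min" policy can be sketched as a small in-memory limiter. This is not the `rinet-notify` implementation, which is not shown in this commit; the class name and its API are assumptions, and a helper shared by many processes would need state on disk or in Redis rather than in memory.

```python
import time
from collections import deque


class NotifyLimiter:
    """Allow at most `max_per_hour` messages and drop repeats within `dedup_seconds`."""

    def __init__(self, max_per_hour=5, dedup_seconds=30 * 60, clock=time.monotonic):
        self.max_per_hour = max_per_hour
        self.dedup_seconds = dedup_seconds
        self.clock = clock          # injectable so the policy can be tested
        self.sent_at = deque()      # timestamps of messages sent in the last hour
        self.last_seen = {}         # message text -> time it was last sent

    def allow(self, message: str) -> bool:
        now = self.clock()
        # Forget sends that are an hour old or more.
        while self.sent_at and now - self.sent_at[0] >= 3600:
            self.sent_at.popleft()
        last = self.last_seen.get(message)
        if last is not None and now - last < self.dedup_seconds:
            return False            # same text sent too recently
        if len(self.sent_at) >= self.max_per_hour:
            return False            # hourly budget exhausted
        self.sent_at.append(now)
        self.last_seen[message] = now
        return True
```

A caller would wrap the actual Telegram send in `if limiter.allow(text): ...`, so suppressed messages are simply dropped.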
### 5. Anthropic Tier 4 (LAST in the waterfall) ✅
Lines 484+496 in `dabi_orchestrator_v3.py`:
```
Tier 0: dabi-budget LoRA (port 8765)
Tier 1: vLLM Qwen 7B (port 8001)
Tier 2: Groq llama-4-scout
Tier 3: DeepSeek V3
Tier 4: Anthropic Claude ← LAST
```
ENV var bug fix: `CLAUDE_API_KEY` → `ANTHROPIC_API_KEY`
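
The tier order above amounts to a fallback chain: try each backend in turn and stop at the first usable answer. The sketch below shows that pattern only; it is not the code in `dabi_orchestrator_v3.py`, and `ask_waterfall` and the backend callables are hypothetical stand-ins for the real tier clients.

```python
from typing import Callable, Sequence, Tuple

# A backend is a (name, callable) pair; the callable takes a prompt and returns text.
Backend = Tuple[str, Callable[[str], str]]


def ask_waterfall(prompt: str, backends: Sequence[Backend]) -> Tuple[str, str]:
    """Try backends in order; return (tier_name, answer) from the first that succeeds."""
    errors = []
    for name, call in backends:
        try:
            answer = call(prompt)
        except Exception as exc:  # any failure means "fall through to the next tier"
            errors.append(f"{name}: {exc}")
            continue
        if answer:
            return name, answer
        errors.append(f"{name}: empty answer")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))
```

With the most expensive backend placed last in the list, it is only reached when every cheaper tier has failed or returned nothing.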
### 6. Multi-language support (HR/EN/DE/IT)
- `_translate_to()` + `_detect_query_lang()` in `/opt/ai-rinet/ai_gateway.py`
- HR: native ✅
- EN: works ✅
- DE: works ✅
- IT: intermittent (Groq rate-limit issue)
### 7. Sport scrapers: all started
Five were INACTIVE; now ALL are ACTIVE:
- sport-pgz-deep-loop ✅
- sport-master-loop ✅
- sport-extra-loop ✅
- sport-fed-scrapers ✅
- sport-oib-loop ✅
- sport-dabi-quiz ✅

`pgz_sport_deep.py`: keyword filter expanded from **8 → 26 keywords** (sport, klub, savez, sportaš, kup, prvenstvo, liga, utakmica, igrač, trener, olimpij, paraolimpij, turn, medalj, pobjed, rijeka, pgž, primorsko, subvenc, natječaj, odluka, proračun, rebal...)
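
Several of the listed keywords are word stems (`olimpij`, `medalj`, `subvenc`), which suggests substring matching. The sketch below assumes case-insensitive substring matching and uses only a subset of the keywords; how `pgz_sport_deep.py` actually matches is not shown in this commit.

```python
# Subset of the keywords from the note; the real list has 26 entries.
KEYWORDS = ("sport", "klub", "savez", "kup", "liga", "trener",
            "olimpij", "medalj", "pgž", "proračun")


def matches_keywords(text: str, keywords=KEYWORDS) -> bool:
    """Return True if any keyword stem occurs anywhere in the text, ignoring case."""
    lowered = text.casefold()
    return any(stem in lowered for stem in keywords)
```

Stem matching keeps inflected forms such as "olimpijski" in scope, at the cost of false positives on short stems ("kup" also matches "kupus").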
### 8. Reembed processes: running
- `tmux 'reembed'`: 89% done, rate 55-173k/s ⭐
- `reembed_phase2.py`: PID 1790646, 85-102k/h, court_notices_v2 + rsv_enriched_v2
### 9. LoRA daily timer: REVIVED ⭐
**Bug**: the timer had been dead since 03.05.2026!
**Fix**: `systemctl enable lora-finetune.timer` + start
Training started at 23:24 with 100,000 examples + 309 eval
### 10. KPI Dashboard — LIVE
- JSON: https://sport.rinet.one/admin/api/kpi
- HTML: https://sport.rinet.one/admin/api/kpi-page (auto-refresh 30s)
### 11. Continuous loops (15 cron)
| Cron | Loop | Timeout |
|---|---|---|
| */2 min | lora_watchdog | - |
| */5 min | smoke_test | 60s ⭐ |
| */5 min | kpi_snapshot | 30s ⭐ |
| */10 min | latency_alert | 30s ⭐ |
| */15 min | halu_scanner | 60s ⭐ |
| */20 min | learn_from_errors | 90s ⭐ |
| */30 min | capture_to_training | 120s ⭐ |
| */30 min | scraper_health | 90s ⭐ |
| */45 min | regression_test | 90s ⭐ |
| 0 * | hourly_status | 30s ⭐ |
| 0 8 | daily_learning | - |
| 0 4 daily | RAGAS eval | - |
| 0 2 daily | overnight_learning | - |
| daily 03:00 | LoRA fine-tune | - |
| daily 03:07 | master_backup 22TB | - |
⭐ = timeout added tonight (prevents stuck processes)
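
For jobs launched from Python, the same guard can be expressed with `subprocess.run(..., timeout=...)`. This is a sketch of the idea, not how the cron entries above were changed (the note does not say which mechanism was used); the function name is hypothetical, and the exit code 124 mirrors the convention of the coreutils `timeout` command.

```python
import subprocess
import sys


def run_with_timeout(cmd: list, seconds: float) -> int:
    """Run cmd; return its exit code, or 124 if it had to be killed on timeout."""
    try:
        return subprocess.run(cmd, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child and waits for it before raising.
        return 124


if __name__ == "__main__":
    # Example: a job that sleeps longer than its 1-second budget.
    code = run_with_timeout([sys.executable, "-c", "import time; time.sleep(5)"], 1)
    print("exit code:", code)
```

Only the direct child is killed, so a job that spawns its own workers may still leave processes behind.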
### 12. Local PG: STOPPED + DISABLED
- `systemctl stop postgresql`
- `systemctl disable postgresql`
- Listen 5432: NONE
- Schema backup in `/mnt/cold/local_pg_schema_backup_20260504_2343.sql.gz` (109K)
- Data dir `/var/lib/postgresql/18/main` (47GB) **NOT DELETED** (waiting for the 24h verification)
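
During the 24h verification window, a TCP probe can confirm that nothing has started listening on the old port again. This is a minimal sketch with a hypothetical helper name; it only tests whether a connection is accepted, not whether the listener is PostgreSQL.

```python
import socket


def is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False
```

`is_listening("127.0.0.1", 5432)` should stay `False` for as long as the local PG remains stopped.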
### 13. Stuck processes killed
- 46× smoke_test stuck → 0
- 8× scraper_health stuck → 0
- 5× hourly_status stuck → 0
- 1× duplicate master_scraper_coordinator → 0
- **Total: 60 stuck processes eliminated**
### 14. Disk cleanup (~30GB recovered)
- `/tmp/ocr_resized` (15GB)
- `/tmp/sprint` (13GB)
- `/tmp/rinet_v3_backup.dump` (2.2GB old PG dump)
- `/root/.cache/uv` (6.1GB)
- 201× .bak files older than 14 days
- 113× __pycache__ dirs
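
Selecting `.bak` files by age can be sketched as follows. It only lists candidates and deletes nothing; the glob pattern `*.bak*` and the function name are assumptions, since the note does not show the command that was used.

```python
import time
from pathlib import Path


def old_backups(root: str, days: int = 14, now=None) -> list:
    """Return files matching *.bak* under root whose mtime is older than `days` days."""
    cutoff = (time.time() if now is None else now) - days * 86400
    return sorted(
        p for p in Path(root).rglob("*.bak*")
        if p.is_file() and p.stat().st_mtime < cutoff
    )
```

Reviewing the returned list before removing anything keeps the cleanup reversible up to the last step.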
## Current state
```
PG: Server B 10.10.0.2:6432 (5,315,161 facts)
Local 5432 STOPPED + DISABLED
PgBouncer: 127.0.0.1:6432 → host=10.10.0.2 port=5432 (proxy to Server B)
Qdrant: Server B 10.10.0.2:6333 (46 collections, 14M+ vectors)
Local 6333: DOES NOT EXIST
Redis: Local 6379 (cache)
Neo4j: Local 7687 (615,580 nodes, 756,333 relations)
Embed: Local 9879 (BGE-M3, dim 1024)
Reranker: Local 8099/8100/8101 (3 instances)
vLLM: Local 8001 (Qwen2.5-7B-Instruct-AWQ)
F10 LoRA: Local 8765 (dabi-budget-lora-q4)
Ollama: Local 11434 (qwen3:14b, llama3.2:3b)
MCP: Local 8810 (7 tools)
```
## What remains to be done
1. **24h dry run with the local PG stopped**: verify that everything is OK, then delete `/var/lib/postgresql/18/main` (47GB)
2. **`drop_gpu_pg.sh`**: prepared earlier, **do NOT run** until the dry run confirms
3. **Multi-lang IT/DE retry**: intermittent Groq rate-limit issue
4. **9 facts without a source**: the UPDATE was interrupted by a Bridge timeout and needs to be rerun
5. **Neo4j integration into RAG**: the orchestrator does not yet use the knowledge graph (756k relations sit unused)
## Tests passed
- Smoke, 4 questions: 3/4 PASS (Bok, NK Rijeka predsjednik, Kup HR; PGŽ proracun timed out via Bridge)
- vLLM: response OK
- Embed BGE-M3: dim 1024 OK
- RAG: tier 1 vLLM + tier 2 Groq + tier 0 DB priority all work
- Server B PG via PgBouncer: 5,315,161 facts ✅
- Sport+PGŽ embed: 99.97% / 99.92% ✅
- Hallucinations in 24h: 0 ✅
- Sport scrapers: 6 active ✅
## Bridge stability notes
- Bridge timeouts during the session (server under load)
- Main cause: GPU at 100% utilization (LoRA training), 18+ parallel scrapers
- Load average peak: 126 (now 11)