Commit Graph

6 Commits

Author SHA1 Message Date
Damir Radulic aca5051418 feat: /api/v2/analiza/* endpoints - sport analytics backend 2026-05-16 00:28:12 +02:00
Damir Radulić f9ebcddf28 CC2 R6: middleware-wide JWT, avatar demo mode, mock mailer, login rate limit
#1 JWT middleware extended:
- Was: /api/admin/* only
- Now: any POST/PUT/PATCH/DELETE under /api/* requires Bearer JWT
- Whitelist (still anonymous): /api/auth/login, /refresh, /forgot-password,
  /password/reset, /reset-password, /setup-password, /google;
  /api/gdpr/consent; any path ending /avatar
- 14 mutating endpoints verified to return 401 without token

#2 Avatar upload demo mode (routers/clan_panel_router.py):
- Anonymous → returns {demo_mode:true, slika_url:null,
  message:'Demo mode — slika nije spremljena. Prijavite se za pravu pohranu.'},
  no FS write, no DB write
- Authenticated (valid JWT, allowed role) → real save as before
- Auth check now uses auth.auth_v2.decode_token (proper secret + revocation)
  instead of the broken local _resolve_role

#3 Mock mailer (auth/mailer.py):
- send_email writes RFC 822 .eml to /tmp/pgz_mailbox + appends to INDEX.jsonl
- send_password_reset, send_invite helpers with HR text + HTML alt
- Real SMTP active when PGZ_SMTP_HOST is set (env-driven, off by default)
- forgot-password and admin invite both call mailer; audit logs mail status

#5 Rate limiting on /api/auth/login:
- Per-user: 5 wrong attempts → 5-minute DB-backed lockout
  (was 5 → 15 min). Configurable via PGZ_LOGIN_LOCK_THRESHOLD/MINUTES.
- Per-IP: 10 fails / 5-min sliding window in-memory → HTTP 429
  Configurable via PGZ_LOGIN_IP_THRESHOLD/WINDOW_SEC. Successful
  login clears the IP counter.
- Failed attempts respond '(N/5) — račun je zaključan na 5 minuta'
- New audit actions: login.ratelimit.ip; login.fail meta now
  includes fails count, locked, lock_minutes

#4 Live test report: 46/46 across 6 demo users — login, JWT gate on 14
   mutating endpoints, public path whitelist, demo-mode avatar +
   real save, forgot-password e-mail to mailbox, no-leak unknown email,
   5-fail lockout, 423 during lockout, audit coverage.
2026-05-05 01:42:53 +02:00
CC6 Worker faf6beb536 M12.6 SF: sport-aware enrichment + federation map (HBS, HKS, HRS, HOS, HVS, HPS, HBS bocanje…)
- data/sport_federations.json: 24 Croatian sport federations + aliases +
  PGŽ local media (Novi list, Glas Istre, Rijeka.danas).
- enrich_router._sport_fed/_normalize_sport/_load_sport_feds: cached
  loader that picks up file changes via mtime.
- _research_links() now sport-aware: when row.sport maps to a known fed,
  the dynamic links list shows that fed (national + PGŽ regional) plus the
  three PGŽ local-media search URLs in place of the static HNS Semafor +
  transfermarkt fallback.
- scrape_sport_federation(sport, ime, prezime): generic profile-page
  scraper (slug pattern OR search-results crawl) → returns
  {profile_url, slika_url, datum_rodenja, mjesto_rodenja, klub_naziv}.
- _propose_for_sportas() now routes through the federation scraper before
  HNS Semafor; HNS path is gated to nogomet or rows already linked.
- _load_row(sportas) JOINs klubovi to fall back to klub.sport when
  c.sport is empty.
- Tested on 1024 Marijan Alkić (boćanje): proposed profile_url +
  datum_rodenja from hrvatski-bocarski-savez.hr; /apply persisted them.
- Tested on 3335 Toni Jelenković (košarka) and 3379 Niko Miknić
  (plivanje): research_links surface HKS/KS PGŽ and HPS respectively.

Worker:
- _pick_sportas now selects on coverage<70 across ALL sports (sport
  set OR known external linkage), not just hns_*.
- _SOURCE_WEIGHTS extended with 16 federation hosts at 0.88-0.92.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 01:30:16 +02:00
CC6 Worker 9c5116eaa3 M12.5 R4: coverage<70 picker + confidence>=0.7 gate + /var/log target
- Coverage computed in SQL (filled_keys * 100 / total_keys); only rows below
  threshold (default 70%, override ENRICHER_COVERAGE_MAX) are queued.
- Per-row confidence is the max of source weights (semafor.hns.family=0.95,
  wikipedia.hr=0.80, sport-pgz.hr=0.55) plus a small evidence-count bonus.
  Below threshold (default 0.70, override ENRICHER_CONFIDENCE), only 'hard'
  structured fields (profile_url, source_url, slika_url, hns_igrac_id) are
  applied — never an LLM-synthesised biografija.
- Logs now mirrored to /var/log/pgz-sport-enricher.log alongside the project
  log, so 'tail /var/log/pgz-sport-enricher.log' works as the brief asks.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 00:45:48 +02:00
Damir Radulić f5c6570d47 CC2 R4 #2+#5: remove legacy unauth /api/admin/users — close 401 gap
The bare @app.get/post('/api/admin/users') decorators in pgz_sport_api.py
were registered before app.include_router(admin_users_router) and shadowed
the JWT-protected M2 routes, leaking user list to anyone.

Removed all three: GET /api/admin/users, POST /api/admin/users,
POST /api/admin/users/{uid}/toggle. The auth.admin_users router now owns
this prefix exclusively and gates every method with require_user.

Verified: no-auth → 401, invalid token → 401, valid Bearer → 200.
2026-05-05 00:44:50 +02:00
CC6 Worker ece556de11 M12.4: real HNS Semafor scraper for sportas + 24/7 enrichment worker
Critical bug fix: /v2/enrich/sportas/{id} returned proposed:{} for athletes
because the v3 pipeline was still relying on Wikipedia-only evidence and never
actually fetched semafor.hns.family/igraci/.

- enrich_router._propose_for_sportas now:
  • Resolves a HNS Semafor URL from profile_url, source_url, hns_igrac_id,
    vanjski_id JSONB ('hns_comet'+'hns_slug'), or source='hns_semafor'+source_id.
  • Fetches and parses the player page (BS4, regex fallback) and proposes
    profile_url, source_url, slika_url, hns_igrac_id, datum_rodenja,
    mjesto_rodenja, broj_dresa, biografija (DeepSeek synthesis from HNS+Wiki).
- _load_row(sportas) widened to read every relevant column + vanjski_id.
- _TABLE_MAP['sportas'] writeback whitelist expanded to 12 fields.
- workers/enrichment_worker.py: 24/7 daemon, picks under-enriched
  clanovi/klubovi/savezi every 5 min via SQL, calls /apply for each.
- systemd unit pgz-sport-enricher.service installed + enabled.
- Tested end-to-end: id=2222 (Abdija) and id=449 (Zec) now have
  profile_url, slika_url, hns_igrac_id, biografija persisted.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
2026-05-05 00:36:57 +02:00