TL;DR
- Identity attacks (session/token theft and lateral movement across IdPs), API/GraphQL abuse, AI/LLM supply‑chain exposures, and cloud‑native malware are the highest‑likelihood risks for 2025–2026.
- Quick wins in 30 minutes: revoke stale tokens, enforce least‑privilege roles, lock down public buckets, apply API rate/complexity limits, enable image signing and runtime alerts.
- Make it durable: adopt attestations (SLSA ≥ L3), per‑tenant key rotation, runtime anomaly baselines for containers/FaaS, and continuous posture monitoring.
As cloud adoption deepens and AI workloads explode, threat actors are pivoting to the weakest links: identity, APIs, and the software and data that train your models. This guide updates our 2024 piece with the most relevant cloud threats for 2025–2026 and the exact mitigations your team can implement this quarter.
Attacks on Cloud‑Based AI Platforms (LLM/RAG)
Why it matters in 2025–2026
AI services expose new attack surfaces: model endpoints, vector databases (RAG), fine‑tuning datasets, and orchestration glue (functions, API gateways, schedulers).
Threats
- Prompt/Indirect Injection against model endpoints and RAG pipelines to extract secrets or override system instructions.
- Vector‑DB Exposure (embeddings + metadata) via public misconfig or weak auth.
- Model/Data Theft (weights, fine‑tuning datasets) from permissive storage roles.
- Training Data Poisoning to bias outputs or create logic bombs.
- Endpoint Enumeration & Abuse (no auth, shared tokens, weak CORS) to exfiltrate data at scale.
Example (composite)
An attacker hit a fintech’s public LLM endpoint with a prompt injection that forced the RAG pipeline to fetch sensitive S3 objects referenced in embeddings. The attacker then replayed a leaked token, pivoted to the vector store, and scraped PII.
Mitigations (do these this quarter)
- Tier & gate model access (public, partner, internal) with per‑tenant keys and time‑boxed tokens (see the sketch after this list).
- Isolate RAG: separate indexes per tenant; disable cross‑tenant searches; store only minimal metadata.
- Validate inputs: strip/escape tool‑calling instructions; enforce output schemas; rate‑limit generation.
- Secrets hygiene: remove secrets from prompts; rotate all model/API keys at least monthly; bind tokens to client + IP where feasible.
- Egress guardrails: block model tools from reaching sensitive networks unless explicitly allow‑listed.
- Monitor model behavior: drift/outlier detection on prompts, completions, and retrieval patterns.
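A minimal sketch of the per‑tenant, time‑boxed tokens and input sanitization described above. The signing key, TTL, and override patterns are illustrative assumptions; a production deployment would pull keys from a KMS and layer model‑side guardrails on top of any string filtering.

```python
import hmac, hashlib, re, time

SIGNING_KEY = b"replace-with-kms-managed-key"  # assumption: fetched from your KMS, never hard-coded
TOKEN_TTL_SECONDS = 900                        # time-boxed: 15-minute tokens

def issue_tenant_token(tenant_id: str) -> str:
    """Issue a short-lived, per-tenant token bound to an expiry timestamp."""
    payload = f"{tenant_id}:{int(time.time()) + TOKEN_TTL_SECONDS}"
    sig = hmac.new(SIGNING_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}:{sig}"

def verify_tenant_token(token: str) -> str | None:
    """Return the tenant_id if the token is valid and unexpired, else None."""
    try:
        tenant_id, expiry, sig = token.rsplit(":", 2)
        expected = hmac.new(SIGNING_KEY, f"{tenant_id}:{expiry}".encode(), hashlib.sha256).hexdigest()
        if hmac.compare_digest(sig, expected) and int(expiry) > time.time():
            return tenant_id
    except ValueError:
        pass
    return None

# Naive prompt sanitizer: strip obvious instruction-override phrases before they
# reach the model. Patterns are illustrative, not exhaustive.
OVERRIDE_PATTERNS = re.compile(
    r"(ignore (all )?previous instructions|system prompt|disregard the above)",
    re.IGNORECASE,
)

def sanitize_prompt(user_input: str) -> str:
    return OVERRIDE_PATTERNS.sub("[removed]", user_input)
```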
For deeper coverage, see our guide on vulnerability management for AI.
Software Supply Chain Risks (cloud‑first pipelines)
Why it matters
Build systems span SaaS repos, artifact registries, and multi‑cloud deploy stages—one weak link compromises downstream services.
Threats
- Compromised updates & registries (typosquats, dependency confusion).
- Unsigned artifacts and mutable tags that allow swap‑outs at deploy time.
- Build‑time egress that leaks credentials and signing keys.
- Open‑source drift—transitive deps adding risky licenses or telemetry.
Mitigations
- Provenance & Attestations: adopt SLSA‑aligned build attestations; require signed OCI artifacts; fail closed on missing/invalid signatures.
- SBOM + diffing: keep SBOMs per image/service and alert on changes (new deps, CVEs, licenses); see the sketch after this list.
- Registry hygiene: pin to digests; disallow the mutable "latest" tag; restrict who can push; quarantine new images until scanned.
- Build isolation: no outbound internet by default; egress only via proxies; store secrets in short‑lived build tokens.
- Typosquat/name‑collision detection in private mirrors; prefer vetted internal packages.
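A minimal sketch of SBOM diffing, assuming CycloneDX‑style JSON with a top‑level components array. It prints new or changed dependencies and fails the pipeline so a human reviews the drift; the file paths and exit behavior are illustrative.

```python
import json, sys

def load_components(sbom_path: str) -> dict[str, str]:
    """Map component name -> version from a CycloneDX-style SBOM JSON file."""
    with open(sbom_path) as f:
        sbom = json.load(f)
    return {c["name"]: c.get("version", "unknown") for c in sbom.get("components", [])}

def diff_sboms(previous_path: str, current_path: str) -> None:
    old, new = load_components(previous_path), load_components(current_path)
    added = sorted(set(new) - set(old))
    changed = sorted(n for n in new if n in old and new[n] != old[n])
    for name in added:
        print(f"NEW DEPENDENCY: {name}=={new[name]}  (review license & provenance)")
    for name in changed:
        print(f"VERSION CHANGE: {name} {old[name]} -> {new[name]}")
    if added or changed:
        sys.exit(1)  # fail closed so the drift gets reviewed before deploy

if __name__ == "__main__":
    diff_sboms(sys.argv[1], sys.argv[2])
```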
Cloud‑Native Malware (containers, serverless, control plane)
Patterns we now see
- Container‑escape → control‑plane pivots
- Serverless persistence via scheduled invocations
- eBPF‑aware evasion
- Queue/ETL abuse to move data covertly
Mitigations
- Minimal images: drop setuid, run as non‑root, read‑only FS; sign images and verify at deploy.
- Network controls: default‑deny egress; explicitly allow APIs per workload; add DNS logging.
- Secrets: use cloud KMS; never bake creds in images; rotate on every incident.
- Runtime baselines: alert on unusual syscalls, new child processes, crypto‑mining patterns (sketch after this list).
- Function hardening: least‑privilege IAM, short timeouts, ephemeral storage only; disable unused runtimes.
- Incident playbooks: isolate namespace/account quickly; revoke tokens; redeploy from clean images.
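A rough sketch of a runtime process baseline. The workload names and allow‑lists are hypothetical, and a real deployment would feed process telemetry from an agent or eBPF sensor rather than calling ps on the host directly.

```python
import subprocess

# Assumption: a per-workload baseline of expected process names, built during a
# known-good observation window. Anything outside it triggers an alert.
BASELINE = {
    "checkout-api": {"python", "gunicorn"},
    "etl-worker":   {"python", "aws"},
}

def running_processes() -> set[str]:
    """List process names via `ps` (illustrative; production telemetry would
    come from a runtime agent or eBPF sensor scoped to the container)."""
    out = subprocess.run(["ps", "-eo", "comm="], capture_output=True, text=True)
    return {line.strip() for line in out.stdout.splitlines() if line.strip()}

def check_workload(workload: str) -> list[str]:
    unexpected = sorted(running_processes() - BASELINE.get(workload, set()))
    for name in unexpected:
        print(f"ALERT [{workload}]: unexpected process '{name}' (possible miner or dropped tool)")
    return unexpected

if __name__ == "__main__":
    check_workload("checkout-api")
```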
Identity Threat Detection & Response (ITDR)
Why it matters
Compromised identities and mis‑scoped roles remain the fastest path to a wide blast radius.
Threats
- Session/token theft and replay across cloud + SaaS
- Over‑privileged roles enabling stealthy escalation
- Federation drift (IdP → cloud) and stale trust relationships
Mitigations
- JIT & time‑boxed roles for admin actions; require ticket/approval trails.
- Conditional access baselines: MFA everywhere, device posture checks; IP + geo guards for admin.
- Session management: rotate/sign tokens frequently; revoke on policy changes; detect refresh storms.
- Detect identity anomalies: impossible travel, privilege spikes, new API families per identity (see the sketch after this list).
- Break‑glass controls: separate accounts with hardware keys; monitored and rotated after drills.
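A simplified sketch of impossible‑travel detection over login events. The 900 km/h threshold and the event fields are assumptions; a production ITDR pipeline would also weigh VPN egress points, device posture, and privilege changes.

```python
from dataclasses import dataclass
from datetime import datetime
from math import radians, sin, cos, asin, sqrt

@dataclass
class LoginEvent:
    identity: str
    timestamp: datetime
    lat: float
    lon: float

def distance_km(a: LoginEvent, b: LoginEvent) -> float:
    """Great-circle distance between two login locations (haversine formula)."""
    dlat, dlon = radians(b.lat - a.lat), radians(b.lon - a.lon)
    h = sin(dlat / 2) ** 2 + cos(radians(a.lat)) * cos(radians(b.lat)) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(h))

def impossible_travel(events: list[LoginEvent], max_kmh: float = 900.0) -> list[tuple]:
    """Flag consecutive logins by the same identity that imply travel faster
    than a commercial flight (default 900 km/h)."""
    flagged, last_seen = [], {}
    for ev in sorted(events, key=lambda e: e.timestamp):
        prev = last_seen.get(ev.identity)
        if prev:
            hours = (ev.timestamp - prev.timestamp).total_seconds() / 3600
            if hours > 0 and distance_km(prev, ev) / hours > max_kmh:
                flagged.append((ev.identity, prev.timestamp, ev.timestamp))
        last_seen[ev.identity] = ev
    return flagged
```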
Check out this blog: AWS IAM Best Practices
API & GraphQL Abuse in Multi‑Cloud
Common failures
- Broken auth (hard‑coded/shared keys)
- Excessive data exposure from overly broad fields
- Unlimited query depth/complexity; introspection left on in prod
- Key leakage in CI logs, mobile apps, or public repos
Mitigations
- Positive security model: schema‑aware allow‑lists; separate read/write endpoints.
- Depth/complexity limits and query timeouts; turn off introspection in prod (sketch after this list).
- Authentication: mTLS or signed client tokens; rotate keys automatically; device posture for sensitive APIs.
- Quotas + anomaly detection on per‑key basis (RPS, payload size, error spikes).
- Honey‑tokens and intentional canary keys to catch abuse.
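A rough sketch of a depth guard that rejects deeply nested queries before they reach resolvers. The brace‑counting heuristic and the limit of 6 are assumptions; a production gateway should use schema‑aware validation from its GraphQL library instead.

```python
MAX_DEPTH = 6  # assumption: tune per schema

def estimate_query_depth(query: str) -> int:
    """Rough selection-set depth estimate based on brace nesting.
    Ignores strings/comments; good enough to illustrate the guard, not to enforce it."""
    depth = max_depth = 0
    for ch in query:
        if ch == "{":
            depth += 1
            max_depth = max(max_depth, depth)
        elif ch == "}":
            depth -= 1
    return max_depth

def reject_if_too_deep(query: str) -> None:
    depth = estimate_query_depth(query)
    if depth > MAX_DEPTH:
        raise ValueError(f"query depth {depth} exceeds limit {MAX_DEPTH}")

# Example: a deeply nested query that gets blocked before hitting resolvers.
try:
    reject_if_too_deep("{ user { orders { items { product { vendor { bank { account } } } } } } }")
except ValueError as err:
    print(f"blocked: {err}")
```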
AI/LLM Supply Chain & Data Leakage
New realities
- Model weights and fine‑tune datasets are high‑value artifacts.
- RAG pipelines often leak via embeddings or mis‑scoped storage.
- Synthetic‑data and third‑party datasets can introduce hidden PII/licensing risk.
Mitigations
- Classify & encrypt model artifacts; store in restricted projects with KMS‑bound access.
- Per‑tenant indexes and PII‑minimized embeddings; rotate and salt embeddings where feasible.
- Data contracts for fine‑tuning sets; automated PII scans; licensing checks during ingestion (see the sketch after this list).
- Attestations for models: record training data lineage and hyper‑params; sign & verify before deploy.
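A minimal sketch of a PII gate for fine‑tuning data ingestion. The regex patterns and the zero‑tolerance threshold are illustrative only; real pipelines should use a dedicated classification service and locale‑specific rules.

```python
import re

# Illustrative patterns only: emails, phone-like strings, card-like digit runs.
PII_PATTERNS = {
    "email":       re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone":       re.compile(r"\+?\d[\d\s()-]{8,}\d"),
    "card_number": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
}

def scan_record(text: str) -> dict[str, int]:
    """Count PII-like matches in a single fine-tuning record."""
    return {name: len(hits) for name, p in PII_PATTERNS.items() if (hits := p.findall(text))}

def gate_dataset(records: list[str], max_flagged: int = 0) -> bool:
    """Return True if the dataset passes the ingestion gate."""
    flagged = [(i, scan_record(rec)) for i, rec in enumerate(records) if scan_record(rec)]
    for idx, findings in flagged:
        print(f"record {idx}: possible PII {findings}")
    return len(flagged) <= max_flagged
```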
Read this blog to learn more: Vulnerability Management in Cloud Security
Confidential Computing & GPU Hijacking
What’s changed
- Wider availability of confidential VMs/containers; emerging enclave side‑channels.
- GPU quota theft for cryptomining or unauthorized AI workloads.
Mitigations
- Use attested workloads for sensitive training/inference; verify quotes & policies at startup.
- GPU guardrails: per‑namespace quotas, anomaly alerts on utilization and unusual kernels (sketch after this list).
- Tenant isolation: separate clusters/accounts for regulated workloads; periodic attestation audits.
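A toy sketch of GPU‑hijacking heuristics over utilization samples. The approved namespaces, hours, and thresholds are assumptions; the samples would come from your metrics pipeline (for example an NVIDIA DCGM exporter or cloud monitoring).

```python
from datetime import datetime

APPROVED_NAMESPACES = {"ml-training", "ml-inference"}  # assumption: illustrative names
APPROVED_HOURS = range(6, 22)                          # assumption: 06:00-21:59 local time

def flag_gpu_sample(namespace: str, utilization_pct: float, ts: datetime) -> list[str]:
    """Return reasons this GPU utilization sample looks like quota theft."""
    reasons = []
    if namespace not in APPROVED_NAMESPACES:
        reasons.append(f"unapproved namespace '{namespace}'")
    if ts.hour not in APPROVED_HOURS and utilization_pct > 10:
        reasons.append("high utilization outside approved hours")
    return reasons

suspicious = flag_gpu_sample("default", 97.5, datetime(2025, 7, 4, 3, 12))
if suspicious:
    print("GPU hijacking suspected:", "; ".join(suspicious))
```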
30‑Minute Cloud Hardening (Checklist)
- Revoke all unused tokens/keys older than 30 days (see the sketch after this checklist).
- Enforce least‑privilege roles; remove wildcard (*) permissions.
- Turn on image signing + verification in CI/CD.
- Lock down public buckets; add object‑level logging.
- Add API quotas + GraphQL depth/complexity limits.
- Disable introspection in prod GraphQL.
- Require MFA + device posture for admins.
- Enable runtime alerts for crypto‑mining/process‑spawn anomalies.
- Put RAG indexes in separate projects per tenant; rotate keys.
- Block egress from builds by default; proxy required exceptions.
- Create break‑glass accounts with hardware keys; test quarterly.
- Document a revocation playbook (tokens, sessions, roles) and run a drill.
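As a concrete example of the first checklist item, here is a sketch that deactivates AWS access keys older than 30 days that have not been used since the cutoff, using boto3. The cutoff, scope, and dry‑run behavior are assumptions to adapt to your environment; run it in dry‑run mode and review the output before enforcing.

```python
from datetime import datetime, timedelta, timezone
import boto3  # assumes credentials with IAM read + iam:UpdateAccessKey permissions

CUTOFF = datetime.now(timezone.utc) - timedelta(days=30)
iam = boto3.client("iam")

def deactivate_stale_keys(dry_run: bool = True) -> None:
    """Deactivate active access keys created >30 days ago and unused since the cutoff."""
    for page in iam.get_paginator("list_users").paginate():
        for user in page["Users"]:
            for key in iam.list_access_keys(UserName=user["UserName"])["AccessKeyMetadata"]:
                if key["Status"] != "Active" or key["CreateDate"] > CUTOFF:
                    continue
                last = iam.get_access_key_last_used(AccessKeyId=key["AccessKeyId"])
                last_used = last["AccessKeyLastUsed"].get("LastUsedDate")
                if last_used is None or last_used < CUTOFF:
                    print(f"stale: {user['UserName']} / {key['AccessKeyId']} (last used {last_used})")
                    if not dry_run:
                        iam.update_access_key(
                            UserName=user["UserName"],
                            AccessKeyId=key["AccessKeyId"],
                            Status="Inactive",
                        )

if __name__ == "__main__":
    deactivate_stale_keys(dry_run=True)
```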
How Cy5 Helps
- ITDR: Detect risky role escalation & stolen sessions across AWS/GCP/Azure; suggest least‑privilege policies; time‑box high‑risk roles automatically.
- AI Security: Discover model endpoints and vector DBs; scan prompts & embeddings for secrets; detect prompt/indirect injections.
- Supply Chain: Ingest SBOMs, verify signatures/attestations, and block unsigned images from deploying.
- Runtime Protection: eBPF‑powered baselines for containers/FaaS; detect escape techniques, crypto‑mining, and covert data movement.
- API/GraphQL: Observe schema changes, enforce depth/complexity guards, auto‑rotate leaked keys.
For compliance mapping, review the CERT-In Guidelines 2025 and related regulatory updates.
See Cy5 stop token replay in under 3 minutes—book a live demo.
FAQs: Cloud Security Threats (2025–2026)
What malware patterns target cloud‑native workloads?
Attackers focus on function‑as‑a‑service persistence, eBPF‑aware evasion, and control‑plane pivots.
How is ITDR different from XDR?
ITDR focuses on identities, sessions, and authorization paths; XDR focuses on endpoint/host telemetry. They complement each other.
Can we safely expose public LLM endpoints?
Yes—with strict gating, prompt/response validation, rate limits, and isolation of RAG indexes.
How do we curb API and GraphQL abuse?
Set per‑key quotas and add GraphQL depth/complexity limits; rotate keys automatically.
How do we secure the software supply chain?
Use signed attestations (SLSA‑aligned), SBOMs per service, and deployment‑time verification.
What are the warning signs of GPU hijacking?
Sudden utilization spikes, unknown kernels, and workloads running outside approved namespaces or hours.
Where should we start?
Do the 30‑minute checklist, then prioritize ITDR and API abuse controls.
Ready to reduce cloud blast‑radius?
See how Cy5 detects token theft, blocks unsigned images, and hardens APIs in minutes.
Book a live demo