Security

Security model for the LGTM stack (Loki + Grafana + Tempo + Mimir). Covers multi-tenant isolation, OTLP authentication, object storage encryption, and per-tenant overrides. See also: observability/lgtm/index, observability/lgtm/architecture.

Authentication Architecture

In a production LGTM deployment, no backend component (Mimir, Loki, Tempo) should be directly exposed to untrusted networks. Authentication and authorization are enforced at two layers:

Edge proxy (Nginx, Envoy, Grafana Enterprise Gateway) validates identity and injects tenant headers.
Component-level multi-tenancy uses the X-Scope-OrgID header to isolate data.

flowchart TD
    subgraph "External Clients"
        Alloy[Grafana Alloy<br/>Metrics/Logs/Traces]
        Users[Grafana Users]
    end

    subgraph "Auth Layer"
        Gateway[Auth Gateway<br/>Envoy / Nginx / GEM Gateway]
    end

    subgraph "LGTM Backends"
        Mimir[Mimir<br/>Metrics Backend]
        Loki[Loki<br/>Logs Backend]
        Tempo[Tempo<br/>Traces Backend]
    end

    subgraph "Object Storage"
        S3[S3 / GCS / Azure Blob<br/>SSE-S3 or SSE-KMS]
    end

    Alloy -->|OTLP + X-Scope-OrgID| Gateway
    Users -->|HTTP + Auth| Gateway
    Gateway -->|remote_write + tenant header| Mimir
    Gateway -->|OTLP + tenant header| Loki
    Gateway -->|OTLP + tenant header| Tempo
    Mimir --> S3
    Loki --> S3
    Tempo --> S3

Multi-Tenant Isolation

X-Scope-OrgID Header

All three backends use the X-Scope-OrgID HTTP header as the tenant identifier. This is the fundamental isolation mechanism across the LGTM stack.

Mimir: -auth.multitenancy-enabled=true enforces the header on every request.
Loki: auth_enabled: true in the configuration activates tenant isolation.
Tempo: Multi-tenancy is configured via the TempoStack CRD (Tempo Operator) or multitenancy_enabled: true in tempo.yaml.

# Mimir multi-tenancy configuration (YAML equivalent of -auth.multitenancy-enabled)
multitenancy_enabled: true

# OpenTelemetry Collector OTLP export with tenant header
exporters:
  otlp:
    endpoint: tempo.example.com:4317
    headers:
      x-scope-orgid: tenant-engineering
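
Clients that query a backend over plain HTTP set the same header. The following is a minimal sketch using only the Python standard library. The gateway URL and the helper name `tenant_request` are assumptions for illustration, and the request is built but never sent.

```python
import urllib.parse
import urllib.request

TENANT_HEADER = "X-Scope-OrgID"


def tenant_request(base_url: str, promql: str, tenant_id: str) -> urllib.request.Request:
    """Build (but do not send) a Prometheus-compatible instant query for one tenant."""
    if not tenant_id:
        raise ValueError("tenant_id must not be empty")
    query = urllib.parse.urlencode({"query": promql})
    # /prometheus/api/v1/query is Mimir's default Prometheus-compatible query path.
    url = f"{base_url.rstrip('/')}/prometheus/api/v1/query?{query}"
    return urllib.request.Request(url, headers={TENANT_HEADER: tenant_id})


# Hypothetical gateway URL; a real client would pass req to urllib.request.urlopen.
req = tenant_request("https://gateway.example.com", "up", "tenant-engineering")
```

In a deployment that follows the architecture above, the gateway would normally overwrite this header based on the caller's identity, so a client-supplied value is only meaningful on trusted internal paths.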

Tenant ID Rules

Property	Loki	Mimir	Tempo
Max length	150 bytes	150 bytes	Configurable
Allowed chars	Alphanumeric, `!`, `-`, `_`, `.`, `*`, `'`, `(`, `)`	Same as Loki	Alphanumeric
Invalid values	`.` and `..`	`.`, `..`, `__mimir_cluster`, empty string	Empty string
Multi-tenant query	Pipe-separated: `A|B|C`	Pipe-separated with `-tenant-federation.enabled=true`	Per-tenant overrides only
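
A gateway can reject malformed tenant IDs before they reach a backend. The sketch below encodes the Loki column of the table (length limit, character set, reserved values). The function name `is_valid_tenant_id` is hypothetical and the check is illustrative rather than a copy of any backend's validation code.

```python
import re

# Character set and length limit taken from the Loki column of the table above.
_ALLOWED = re.compile(r"[A-Za-z0-9!\-_.*'()]+")
MAX_TENANT_ID_BYTES = 150


def is_valid_tenant_id(tenant_id: str) -> bool:
    """Return True if tenant_id satisfies the Loki rules listed in the table."""
    if tenant_id in (".", ".."):
        return False  # reserved values that could escape a storage prefix
    if len(tenant_id.encode("utf-8")) > MAX_TENANT_ID_BYTES:
        return False
    # fullmatch also rejects the empty string because the pattern needs 1+ chars.
    return _ALLOWED.fullmatch(tenant_id) is not None
```

Note that the pipe character is not in the allowed set, which is what lets it act as the separator in multi-tenant queries.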

Data Isolation Guarantees

Each backend stores tenant data in separate prefixes or paths within object storage:

Mimir: Tenant data stored under <bucket>/<tenant-id>/ prefix. TSDB blocks are tenant-scoped.
Loki: Chunks and indexes separated by tenant. Compactor runs per-tenant.
Tempo: Trace data partitioned by tenant in object storage blocks.

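Tenants cannot read each other's data unless cross-tenant federation is explicitly enabled. Because the backends trust whatever X-Scope-OrgID value they receive, this guarantee holds only if the gateway is the sole path to them and sets the header itself.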
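
The prefix layout can be illustrated with a small helper. This sketch assumes the Mimir-style `<tenant-id>/` layout described above. The function names and sample object keys are hypothetical, and the code only demonstrates the idea of prefix scoping, not any backend's storage client.

```python
import posixpath


def tenant_prefix(tenant_id: str) -> str:
    """Object-key prefix for one tenant, following the <bucket>/<tenant-id>/ layout."""
    if not tenant_id or "/" in tenant_id or tenant_id in (".", ".."):
        raise ValueError(f"unsafe tenant id: {tenant_id!r}")
    return f"{tenant_id}/"


def key_belongs_to(key: str, tenant_id: str) -> bool:
    """True if an object key sits under the tenant's prefix after normalization."""
    # normpath collapses ".." segments so a crafted key cannot escape the prefix.
    normalized = posixpath.normpath(key)
    return normalized.startswith(tenant_prefix(tenant_id))
```

Including the trailing slash in the prefix matters: without it, `tenant-a` would also match keys that belong to `tenant-ab`.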

OTLP Authentication

Ingestion Authentication

OTLP ingestion endpoints should not accept unauthenticated traffic. Common patterns:

mTLS (Mutual TLS): Client certificates verify identity at the gateway level. Tempo supports mTLS directly on its gRPC receiver.

# Tempo OTLP gRPC receiver with mutual TLS
distributor:
  receivers:
    otlp:
      protocols:
        grpc:
          endpoint: 0.0.0.0:4317
          tls:
            cert_file: /etc/tempo/certs/server.crt
            key_file: /etc/tempo/certs/server.key
            client_ca_file: /etc/tempo/certs/ca.crt  # setting this requires client certificates

Token-based authentication: An auth gateway validates bearer tokens or API keys and injects the appropriate X-Scope-OrgID header before forwarding to backends.

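# Envoy token validation via an external authorization service
http_filters:
- name: envoy.filters.http.ext_authz
  typed_config:
    "@type": type.googleapis.com/envoy.extensions.filters.http.ext_authz.v3.ExtAuthz
    http_service:
      server_uri:
        uri: auth-service:8080
        cluster: auth_service  # assumed name; must match a cluster defined in this Envoy config
        timeout: 0.5s
      authorization_response:
        allowed_upstream_headers:
          patterns:
          - exact: x-scope-orgid  # forward the tenant header set by the auth service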
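
The decision an auth service makes for each request can be sketched as follows. This is an assumption-laden illustration: the in-memory `TOKEN_TO_TENANT` table, the token value, and the function name `upstream_headers` are all hypothetical, and a real service would consult an identity provider instead of a dictionary.

```python
import hmac
from typing import Dict, Mapping, Optional

TENANT_HEADER = "X-Scope-OrgID"

# Hypothetical token store for illustration only.
TOKEN_TO_TENANT = {"s3cr3t-eng-token": "tenant-engineering"}


def upstream_headers(incoming: Mapping[str, str]) -> Dict[str, str]:
    """Validate the bearer token and return the headers to forward to a backend.

    Raises PermissionError when the token is missing or unknown.
    """
    lowered = {k.lower(): v for k, v in incoming.items()}
    scheme, _, token = lowered.get("authorization", "").partition(" ")
    if scheme.lower() != "bearer" or not token:
        raise PermissionError("missing bearer token")
    tenant: Optional[str] = None
    for known, tenant_id in TOKEN_TO_TENANT.items():
        # Constant-time comparison avoids leaking token contents through timing.
        if hmac.compare_digest(known.encode(), token.encode()):
            tenant = tenant_id
    if tenant is None:
        raise PermissionError("unknown token")
    # Drop the credential and any client-supplied tenant header, then inject ours.
    blocked = ("authorization", TENANT_HEADER.lower())
    forwarded = {k: v for k, v in incoming.items() if k.lower() not in blocked}
    forwarded[TENANT_HEADER] = tenant
    return forwarded
```

Discarding the client-supplied tenant header before injecting the validated one is the important design choice: otherwise a caller with a valid token for one tenant could write into another.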

Mimir TLS Configuration

Mimir supports mutual TLS on both HTTP and gRPC interfaces:

-server.http-tls-cert-path=/certs/server.crt
-server.http-tls-key-path=/certs/server.key
-server.http-tls-client-auth="RequireAndVerifyClientCert"
-server.http-tls-ca-path="/certs/ca.crt"
-server.grpc-tls-cert-path=/certs/server.crt
-server.grpc-tls-key-path=/certs/server.key
-server.grpc-tls-client-auth="RequireAndVerifyClientCert"
-server.grpc-tls-ca-path="/certs/ca.crt"
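
Clients that connect to an endpoint configured this way must present a certificate of their own. The following is a minimal client-side sketch using Python's standard `ssl` module. The function name and its parameters are hypothetical, and certificate paths are supplied by the caller.

```python
import ssl
from typing import Optional


def mtls_client_context(ca_path: Optional[str] = None,
                        cert_path: Optional[str] = None,
                        key_path: Optional[str] = None) -> ssl.SSLContext:
    """Build a client TLS context that verifies the server and presents a client cert."""
    # create_default_context turns on certificate verification and hostname checks.
    ctx = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_path)
    ctx.minimum_version = ssl.TLSVersion.TLSv1_2
    if cert_path and key_path:
        # This certificate is what RequireAndVerifyClientCert checks on the server side.
        ctx.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return ctx
```

The returned context can be passed to `urllib.request.urlopen(..., context=ctx)` or to any HTTP client that accepts an `ssl.SSLContext`.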

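Object Storage Encryption

Server-Side Encryption

All LGTM backends store data in object storage (S3, GCS, Azure Blob, or an S3-compatible service). The options below use S3 terminology. SSE-S3 and SSE-KMS can be set as the bucket default, SSE-C keys are supplied with each request, and CSE happens in the client before upload: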

Method	Description	Key Management
SSE-S3	AWS-managed keys	AWS handles rotation
SSE-KMS	Customer-managed KMS key	Full key lifecycle control
SSE-C	Customer-provided key	Client-side key management
CSE	Client-side encryption	Application encrypts before upload

Recommendation

Use SSE-KMS with a customer-managed key per LGTM component. This provides audit trails via CloudTrail for key usage and allows independent key rotation for Mimir, Loki, and Tempo data.

Bucket Isolation

Use separate buckets per component to prevent cross-signal data leakage and simplify lifecycle policies:

observability-mimir-<env>    # Metrics data
observability-loki-<env>     # Logs data
observability-tempo-<env>    # Traces data

Bucket policies should grant access only to the component's IAM role:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Principal": {"AWS": "arn:aws:iam::123456789012:role/mimir-role"},
      "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"],
      "Resource": ["arn:aws:s3:::observability-mimir-prod", "arn:aws:s3:::observability-mimir-prod/*"]
    }
  ]
}
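
Because the policy differs only in the component name, it can be generated rather than hand-written for each bucket. This sketch assumes the `observability-<component>-<env>` bucket naming and the `<component>-role` role naming used in the examples above. The account ID is a placeholder and `bucket_policy` is a hypothetical helper.

```python
import json

COMPONENTS = ("mimir", "loki", "tempo")
ACTIONS = ["s3:GetObject", "s3:PutObject", "s3:DeleteObject", "s3:ListBucket"]


def bucket_policy(component: str, env: str, account_id: str) -> dict:
    """Return an S3 bucket policy granting one component role access to its own bucket."""
    if component not in COMPONENTS:
        raise ValueError(f"unknown component: {component}")
    bucket_arn = f"arn:aws:s3:::observability-{component}-{env}"
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {"AWS": f"arn:aws:iam::{account_id}:role/{component}-role"},
            "Action": ACTIONS,
            # ListBucket applies to the bucket ARN, object actions to the /* ARN.
            "Resource": [bucket_arn, f"{bucket_arn}/*"],
        }],
    }


if __name__ == "__main__":
    for name in COMPONENTS:
        print(json.dumps(bucket_policy(name, "prod", "123456789012"), indent=2))
```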

Per-Tenant Overrides

Each LGTM backend supports runtime-configurable per-tenant limits. These prevent noisy-neighbor scenarios and enforce fair resource allocation.

Mimir Per-Tenant Limits

# Runtime overrides file (loaded via -runtime-config.file)
overrides:
  "tenant-engineering":
    ingestion_rate: 100000
    ingestion_burst_size: 200000
    max_global_series_per_user: 500000
    max_fetched_chunks_per_query: 2000000

# Main configuration file: defaults for tenants without an explicit override
limits:
  ingestion_rate: 50000
  max_global_series_per_user: 150000

Loki Per-Tenant Limits

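# Runtime overrides file (limits_config.per_tenant_override_config)
overrides:
  "tenant-frontend":
    ingestion_rate_mb: 10
    max_streams_per_user: 100000
    max_chunks_per_query: 100000
    retention_period: 720h

# Main configuration file: defaults for tenants without an explicit override
limits_config:
  ingestion_rate_mb: 5
  max_streams_per_user: 50000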

Tempo Per-Tenant Limits

overrides:
  "tenant-payments":
    ingestion:
      rate_limit_bytes: 50000000
      burst_size_bytes: 100000000
      max_traces_per_user: 10000
    global:
      max_bytes_per_trace: 5000000
  "*":  # Wildcard: applies to tenants without an explicit override
    ingestion:
      rate_limit_bytes: 20000000
      max_traces_per_user: 5000

Overrides Runtime Reload

Per-tenant overrides are loaded from a separate YAML file and reloaded periodically at runtime, without restarting the component. Loki and Tempo take the file path in per_tenant_override_config; Mimir takes it in -runtime-config.file (runtime_config.file in YAML).
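
How a limit is resolved for a given tenant can be sketched in a few lines. The example assumes flat limit keys and per-key fallback to the defaults, which matches the Mimir and Loki model. It illustrates lookup semantics only and does not reproduce how any backend implements reloading. The names `effective_limits`, `DEFAULTS`, and `OVERRIDES` are hypothetical.

```python
from typing import Any, Dict

Limits = Dict[str, Any]


def effective_limits(defaults: Limits, overrides: Dict[str, Limits], tenant_id: str) -> Limits:
    """Merge a tenant's overrides on top of the defaults; unset keys fall back."""
    merged = dict(defaults)  # copy so the shared defaults are never mutated
    merged.update(overrides.get(tenant_id, {}))
    return merged


# Values borrowed from the Loki example above.
DEFAULTS = {"ingestion_rate_mb": 5, "max_streams_per_user": 50000}
OVERRIDES = {"tenant-frontend": {"ingestion_rate_mb": 10, "max_streams_per_user": 100000}}
```

A tenant with no entry in the overrides file gets the defaults unchanged, so onboarding a new tenant does not require touching the file unless its limits differ.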

Tempo Multi-Tenancy with OIDC

The Tempo Operator supports OIDC-based multi-tenancy with static RBAC:

apiVersion: tempo.grafana.com/v1alpha1
kind: TempoStack
metadata:
  name: production
spec:
  template:
    gateway:
      enabled: true
    queryFrontend:
      jaegerQuery:
        enabled: true
  tenants:
    mode: static
    authentication:
      - tenantName: engineering
        tenantId: eng-team
        oidc:
          issuerURL: https://dex.YOUR_DOMAIN/dex
          redirectURL: https://tempo-gateway.example.com/oidc/eng-team/callback
          usernameClaim: email
          secret:
            name: tempo-oidc-secret
    authorization:
      roles:
        - name: read-write
          permissions: [read, write]
          resources: [traces]
          tenants: [engineering]
      roleBindings:
        - name: engineers
          roles: [read-write]
          subjects:
            - kind: user
              name: <user-email>
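
The static roles and role bindings above amount to a simple lookup: a request is allowed when the user is bound to a role that covers the tenant, the permission, and the resource. The sketch below models that reading of the CRD. It is not the operator's implementation, and the class names, the `is_allowed` function, and the sample user are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Role:
    name: str
    permissions: List[str]
    resources: List[str]
    tenants: List[str]


@dataclass(frozen=True)
class RoleBinding:
    name: str
    roles: List[str]
    subjects: List[str]  # user names, e.g. the value of the email claim


def is_allowed(user: str, tenant: str, permission: str, resource: str,
               roles: List[Role], bindings: List[RoleBinding]) -> bool:
    """True if any role bound to the user grants the permission on the tenant's resource."""
    bound = {r for b in bindings if user in b.subjects for r in b.roles}
    return any(
        role.name in bound
        and tenant in role.tenants
        and permission in role.permissions
        and resource in role.resources
        for role in roles
    )


ROLES = [Role("read-write", ["read", "write"], ["traces"], ["engineering"])]
BINDINGS = [RoleBinding("engineers", ["read-write"], ["alice@example.com"])]
```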

Grafana Integration Security

Data Source Configuration

When connecting Grafana to multi-tenant LGTM backends:

Configure the X-Scope-OrgID header in each data source's custom HTTP headers.
Use separate data sources per tenant or configure Grafana's data source proxy to inject the correct header.
Enable basic auth or bearer token auth on the data source if the gateway requires it.
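
One data source per tenant can be generated programmatically. The sketch below builds a data source definition using Grafana's `httpHeaderName1` / `httpHeaderValue1` custom-header fields. The function name, data source name, and gateway URL are assumptions for illustration, and submitting the payload to Grafana is left out.

```python
import json


def loki_datasource(name: str, url: str, tenant_id: str) -> dict:
    """Build a Grafana data source definition that sends X-Scope-OrgID on every query."""
    return {
        "name": name,
        "type": "loki",
        # "proxy" routes queries through the Grafana server, keeping the header off the browser.
        "access": "proxy",
        "url": url,
        "jsonData": {"httpHeaderName1": "X-Scope-OrgID"},
        # Values under secureJsonData are stored encrypted by Grafana.
        "secureJsonData": {"httpHeaderValue1": tenant_id},
    }


if __name__ == "__main__":
    # Hypothetical names; print the payload for review before applying it.
    ds = loki_datasource("Loki (engineering)", "https://gateway.example.com", "tenant-engineering")
    print(json.dumps(ds, indent=2))
```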

Query Security

Loki: Cross-tenant queries expose the source tenant of each result in the __tenant_id__ label, which can also be used as a filter (e.g., {app="api", __tenant_id__=~"eng.+"}).
Multi-tenant queries require multi_tenant_queries_enabled: true in the querier config.
Push (POST /loki/api/v1/push) and tail (GET /loki/api/v1/tail) endpoints reject multi-tenant requests.
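
A gateway or client library can apply the same rules before a request reaches Loki. This sketch builds the pipe-separated header value and flags the two endpoints named above. The function names are hypothetical, and the check mirrors the documented behavior rather than Loki's own code.

```python
from typing import Iterable

# Endpoints that reject a multi-tenant X-Scope-OrgID, as listed above.
SINGLE_TENANT_ONLY = {
    ("POST", "/loki/api/v1/push"),
    ("GET", "/loki/api/v1/tail"),
}


def org_id_header(tenants: Iterable[str]) -> str:
    """Join tenant IDs into one pipe-separated X-Scope-OrgID value."""
    ids = sorted(set(tenants))  # de-duplicate and keep the output stable
    if not ids:
        raise ValueError("at least one tenant is required")
    if any(not t or "|" in t for t in ids):
        raise ValueError("tenant IDs must be non-empty and must not contain '|'")
    return "|".join(ids)


def allows_org_id(method: str, path: str, org_id: str) -> bool:
    """False when a multi-tenant header is used on an endpoint that rejects it."""
    is_multi = "|" in org_id
    return not (is_multi and (method.upper(), path) in SINGLE_TENANT_ONLY)
```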

Network Hardening

Component	Default Ports	Recommended Exposure
Mimir distributor	8080 (HTTP), 9095 (gRPC)	Internal only; behind gateway
Loki distributor	3100 (HTTP)	Internal only; behind gateway
Tempo distributor	3200 (HTTP), 9095 (gRPC), 4317/4318 (OTLP)	Internal only; behind gateway
Object storage	443 (HTTPS)	Private endpoint or VPC-only
Grafana	3000 (HTTP)	Behind a TLS-terminating proxy

Hardening Checklist

Area	Recommendation
Network isolation	All backends in private VPC/subnet; no direct internet access
TLS everywhere	mTLS between components; TLS at gateway edge
Auth gateway	Centralize authentication; inject X-Scope-OrgID
Per-tenant limits	Configure rate limiting and resource quotas for every tenant
Bucket isolation	Separate S3 bucket per component per environment
Encryption at rest	SSE-KMS with customer-managed keys
Secrets management	Store storage credentials and TLS keys in Vault or cloud secret manager
Audit logging	Enable access logs on gateway and object storage