Loki is multitenant by default, meaning that it can ingest, store, and query for different users (teams, organizations, etc.) in the same process. This has some attractive benefits, namely that it’s much more economical to run a single database with multiple tenants than multiple databases with single tenants.
Loki does not have any sophisticated authentication of its own, so tenancy is determined by a special X-Scope-OrgID HTTP header attached to all requests. A request with a given tenant ID in this header will not be able to see or interfere with data from any other tenant.
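As a minimal sketch of how a client might attach the tenant header, the snippet below builds (but does not send) a log push request. The base URL, the tenant name team-a, and the helper build_push_request are assumptions made for illustration; the request targets Loki's /loki/api/v1/push endpoint with a JSON body.

```python
import json
import time
import urllib.request

# Assumption: a Loki instance reachable at this address.
LOKI_URL = "http://localhost:3100"


def build_push_request(tenant_id, line, labels):
    """Build a push request scoped to one tenant (hypothetical helper)."""
    payload = {
        "streams": [
            {
                "stream": labels,
                # Each value is [timestamp in nanoseconds as a string, log line].
                "values": [[str(time.time_ns()), line]],
            }
        ]
    }
    return urllib.request.Request(
        f"{LOKI_URL}/loki/api/v1/push",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # The tenant is identified only by this header.
            "X-Scope-OrgID": tenant_id,
        },
        method="POST",
    )


req = build_push_request("team-a", "hello from team-a", {"app": "demo"})
# urllib.request.urlopen(req) would send it to a running Loki.
```

Because the header is the only thing separating tenants, it is usually set by a trusted proxy or gateway in front of Loki rather than by end users.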
One of the harder problems in building multi-tenant systems is building them in such a way that no one tenant can adversely affect other tenants (called the noisy neighbor problem). Loki handles this in a few ways:
Query Quality of Service
The first way Loki protects against noisy neighbors is by ensuring that no one tenant can consume all the querying capacity in a cluster. To do this, each tenant is given an independent queue where they can enqueue queries. Loki will then select a queue at random, dequeue a query, and process it on one of the querier replicas. Let’s look at a couple scenarios:
- If the cluster is not under load and all tenants are enqueuing queries, they’ll all be processed.
- If the cluster is under load and all tenants are enqueuing queries, they’ll all be subject to queueing delays or cancellations.
- If the cluster is not under load and one tenant is enqueuing many queries, that tenant will be able to utilize the extra cluster capacity and all the queries will be processed.
- If the cluster is under moderate load and one tenant is enqueuing many queries, only that tenant will see queue delays/cancellations; the other tenants will not.
In this way, the cluster can be scaled according to overall read usage and it will automatically limit tenants trying to consume more than their fair share of resources, but only when resources are in high demand. In low-demand scenarios, a single tenant can be allowed to take advantage of this extra processing power.
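The queueing behavior described above can be pictured with a small simulation. This is a simplified model of the idea (one queue per tenant, random selection among non-empty queues), not Loki's scheduler code; the class TenantQueues, the function simulate, and the tenant names are all hypothetical.

```python
import random
from collections import Counter, deque


class TenantQueues:
    """One FIFO queue per tenant; dequeue picks a non-empty queue at random."""

    def __init__(self, seed=None):
        self._queues = {}
        self._rng = random.Random(seed)

    def enqueue(self, tenant, query):
        self._queues.setdefault(tenant, deque()).append(query)

    def dequeue(self):
        non_empty = [t for t, q in self._queues.items() if q]
        if not non_empty:
            return None
        tenant = self._rng.choice(non_empty)
        return tenant, self._queues[tenant].popleft()


def simulate(capacity, seed=0):
    """Process at most `capacity` queries and count how many each tenant got."""
    queues = TenantQueues(seed)
    for i in range(100):  # one tenant floods the cluster
        queues.enqueue("noisy", f"noisy-{i}")
    for tenant in ("team-a", "team-b"):  # two tenants with light usage
        for i in range(5):
            queues.enqueue(tenant, f"{tenant}-{i}")
    served = Counter()
    for _ in range(capacity):
        item = queues.dequeue()
        if item is None:
            break
        served[item[0]] += 1
    return served


print(simulate(capacity=30))   # limited capacity: the noisy tenant's backlog waits
print(simulate(capacity=200))  # spare capacity: every query is processed
```

With limited capacity, the light tenants compete with the noisy one on a per-queue basis rather than a per-query basis, so their few queries tend to be served early while most of the noisy tenant's backlog stays queued. With spare capacity, the noisy tenant simply uses what the others leave idle.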
Not all tenants need be the same. It’s common for some tenants to be much bigger than others and therefore be treated differently.
Loki supports a reloadable overrides configuration file which can specify these differences:

overrides:
  medium_user:
    ingestion_rate_mb: 10 # 10MB/s =~ 25TB/month
    ingestion_burst_size_mb: 20 # biggest payload
    max_query_parallelism: 32 # each query can execute 32 subqueries in parallel
    split_queries_by_interval: '30m' # each query can be split into 30m intervals and executed in parallel
  big_user:
    ingestion_rate_mb: 30 # ~75TB/month
    max_query_parallelism: 64 # each query can execute 64 subqueries in parallel
    split_queries_by_interval: '15m' # each query can be split into 15m intervals and executed in parallel

In the above example, Loki will accept up to 10MB/s from medium_user, over which it will rate limit/reject writes. The big_user, however, won’t see rate limits until past 30MB/s.
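One common way to enforce a sustained rate with a burst allowance is a token bucket. The sketch below illustrates how ingestion_rate_mb and ingestion_burst_size_mb could interact for medium_user; the TokenBucket class and the monthly_tb helper are hypothetical and are not taken from Loki's source. Time is passed in explicitly so the example is deterministic.

```python
class TokenBucket:
    """Token bucket: refills at rate_mb per second, holds at most burst_mb."""

    def __init__(self, rate_mb, burst_mb):
        self.rate = rate_mb
        self.capacity = burst_mb
        self.tokens = burst_mb  # start full, so an initial burst is allowed
        self.last = 0.0

    def allow(self, size_mb, now):
        # Refill for the elapsed time, capped at the burst size.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if size_mb <= self.tokens:
            self.tokens -= size_mb
            return True
        return False  # the write would be rate limited/rejected


def monthly_tb(rate_mb):
    """Sustained MB/s converted to TB over a 30-day month (decimal units)."""
    return rate_mb * 86_400 * 30 / 1_000_000


medium = TokenBucket(rate_mb=10, burst_mb=20)
first = medium.allow(20, now=0.0)   # a full 20MB burst fits
second = medium.allow(10, now=0.5)  # only 5MB refilled so far, so this is rejected
third = medium.allow(10, now=2.0)   # the bucket has refilled, so this is accepted
print(first, second, third, monthly_tb(10), monthly_tb(30))
```

The monthly figures in the configuration comments follow from the same arithmetic: 10MB/s sustained for 30 days is about 25.9TB, and 30MB/s is about 77.8TB.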
Bonus: these files can also be edited/redeployed and Loki will notice the changes without needing to restart!
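The split_queries_by_interval and max_query_parallelism settings in the overrides are easier to picture with a small example. The following sketch splits a query's time range into fixed-size pieces that could then be run in parallel; split_by_interval is a hypothetical helper, and a real implementation may align pieces to interval boundaries, which this simplified version does not.

```python
from datetime import datetime, timedelta


def split_by_interval(start, end, interval):
    """Split [start, end) into consecutive pieces of at most `interval`."""
    pieces = []
    cursor = start
    while cursor < end:
        upper = min(cursor + interval, end)
        pieces.append((cursor, upper))
        cursor = upper
    return pieces


start = datetime(2024, 1, 1, 0, 0)
end = datetime(2024, 1, 1, 2, 0)  # a two-hour query

medium_pieces = split_by_interval(start, end, timedelta(minutes=30))
big_pieces = split_by_interval(start, end, timedelta(minutes=15))
print(len(medium_pieces), len(big_pieces))
```

For the same two-hour query, the 15m setting yields twice as many pieces as the 30m setting, which is why a larger tenant with a higher max_query_parallelism can put more of the cluster to work on a single query.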