Tenancy¶
Sometimes you might want to query a different database or schema because might be convenient to get data from two different sources.
Querying two different databases or schemas its not so uncommon as you might think and to this we can sometimes refer to multi-tenancy or simply two different data sources.
What is multi-tenancy¶
When implementing multi tenancy architecture there are many factors to consider and there are three common ways:
- Shared schemas - The data of all users are shared within the same schema and filtered by common IDs or whatever that is unique to the platform. This is not so great for GDPR (Europe) or similar in different countries.
- Shared database, different Schemas - The user's data is split by different schemas but live on the same database.
- Different databases - The user's data or any data live on different databases.
Saffier has three different ways of achieving this in a simple and clean fashion.
- Using the using in the queryset.
- Using the using_with_db in the queryset.
- Using the set_tenant as global.
You can also use the Saffier helpers for schemas if you need to use it.
Using¶
This is probably the one that is more commonly used and probably the one you will be using more often when querying different schemas or databases.
The using is simply an instruction telling to use this schema to query the data instead of the default set in the registry.
Parameters:
- schema - A string parameter with the name of the schema to query.
The syntax is quite simple.
<Model>.query.using(<SCHEMA-NAME>).all()
This is not limited to the all()
at all, you can use any of the available query types
as well.
Example¶
Let us assume we have two different schemas inside the same database and those schemas have a table
called User
.
- The schema
default
- The one that is automatically used if no schema is specified in the registry. - The schema
other
- The one we also want to query.
import saffier
from saffier import Database, Registry
database = Database("<YOUR-CONNECTION-STRING>")
models = Registry(database=database)
class User(saffier.Model):
id = saffier.IntegerField(primary_key=True)
is_active = saffier.BooleanField(default=False)
class Meta:
registry = models
Now we want to query the users from each schema.
Querying the default
As per normal approach, the query looks like this.
User.query.all()
Querying the main
Query the users table from the main
schema.
User.query.using('main').all()
And that is it, really. Using its a simple shortcut that allows querying different schemas without a lot of boilerplate.
Using with database¶
Now here it is where the things get interesting. What if you need/want to query a schema but from
a different database instead? Well, that is possible with the use of the using_with_db
.
This is the part that makes a whole difference if you are thinking about querying a specific database using a diffent connection.
What does that even mean? Imagine you have a main database public
(default) and a database copy somewhere
else called alternative
(or whatever name you choose) and both have the model User
.
You now want to query the alternative
to gather some user data that was specifically stored
in that database where the connection string is different.
The way Saffier operates is by checking if that alternative connection exists in the extra
parameter of the registry and then uses that connection to connect and query to the desired database.
Warning
To use the alternative
database, the connection must be declared in the registry
of the
model or else it will raise an AssertationError
.
The way of doing that is by using the using_with_db
of the queryset. This is particularly useful
if you want to do some tenant applications or simply
connecting to a different database to gather your data.
Simple right?
Nothing like a good example to simplify those possible confusing thoughts.
Let us assume we want to bulk_create
some users in the
alternative
database instead of the default
.
import saffier
from saffier.core.db import fields
from saffier.testclient import DatabaseTestClient as Database
database = Database("<YOUR-CONNECTION-STRING>")
alternative = Database("<YOUR-ALTERNATIVE-CONNECTION-STRING>")
models = saffier.Registry(database=database, extra={"alternative": alternative})
class User(saffier.Model):
id = fields.IntegerField(primary_key=True)
name = fields.CharField(max_length=255)
email = fields.CharField(max_length=255)
class Meta:
registry = models
As you can see, the alternative
was declared in the extra
parameter of the registry
of the
model as required.
Now we can simply use that connection and create the data in the alternative
database.
import saffier
from saffier.core.db import fields
from saffier.testclient import DatabaseTestClient as Database
database = Database("<YOUR-CONNECTION-STRING>")
alternative = Database("<YOUR-ALTERNATIVE-CONNECTION-STRING>")
models = saffier.Registry(database=database, extra={"alternative": alternative})
class User(saffier.Model):
id = fields.IntegerField(primary_key=True)
name = fields.CharField(max_length=255)
email = fields.CharField(max_length=255)
class Meta:
registry = models
async def bulk_create_users() -> None:
"""
Bulk creates some users.
"""
await User.query.using_with_db("alternative").bulk_create(
[
{"name": "Edgy", "email": "saffier@example.com"},
{"name": "Edgy Alternative", "email": "saffier.alternative@example.com"},
]
)
Did you notice the alternative
name in the using_with_db
? Well, that should match the name
given in the extra
declaration of the registry.
You can have as many connections declared in the extra as you want, there are no limits.
Using with activate_schema
¶
Warning
This feature is experimental and might be inconsistent with the intended results. Use it at your own discretion.
This is an alternative to using and serves solely as the purpose of avoiding
writing all the time Model.query.using(...)
.
You can use activate_schema(...)
and deactivate_schema()
to tell the query to always query
a specific tenant, in other words, using the activate_schema()
you don't need to constantly
write using(...)
.
Importing is as simple as this:
from saffier.core.db.querysets.mixins import activate_schema, deativate_schema
Let us see an example:
With the classic .using()
# Using the 'main' schema
User.query.using('main').all()
User.query.using('main').filter(email__icontains="user@example.com")
User.query.using('main').get(pk=1)
Using the activate_schema
# Using the 'main' schema
activate_schema("main")
# Query the 'User' from the 'main' schema
User.query.all()
User.query.filter(email__icontains="user@example.com")
User.query.get(pk=1)
# Deactivate the schema and default to the public
deactivate_schema("main")
Set tenant¶
This is another way to create a global tenant
for your application. Instead if using or
using_with_db you simply want to make sure that in your application you
want every request for a specific tenant
to always hit their corresponding tenant data.
This is specially useful for multi-tenant applications where your tenant users will only see their own data.
To use the set_tenant
you can import it via:
from saffier.core.db import set_tenant
Tip
Use the set_tenant
in things like application middlewares or interceptors, right before
reaching the API.
Practical case¶
The set_tenant
can be somehow confusing without a proper example so let us run one 😁.
As usual, for this example Esmerald will be used. This can be applied to any framework of your choice of course.
What are we building:
- Models - Some models that will help us out mapping a user with a tenant.
- Middleware - Intercept the request and set the corresponding tenant.
- API - The API that returns the data for a given tenant.
Models¶
Let us start with some models where we have a Tenant
, a User
model as well as a Product
where we will be adding some data for different tenants.
The TenantUser
model will serve as the link between a database schema (tenant) and the User
.
We will want to exclude some models from being created in every schema. The Tenant
on save it will
generate the schema
for a user in the database and it will automatically generate the database
models.
Warning
This is for explanation purposes, just do in the way you see fit.
from typing import Any, Dict, List, Type, Union
import sqlalchemy
from loguru import logger
import saffier
from saffier.core.db.models.model import Model
from saffier.testclient import DatabaseTestClient as Database
database = Database("<YOUR-CONNECTION-STRING>")
registry = saffier.Registry(database=database)
class Tenant(saffier.Model):
schema_name = saffier.CharField(max_length=63, unique=True, index=True)
tenant_name = saffier.CharField(max_length=100, unique=True, null=False)
class Meta:
registry = registry
def table_schema(
self,
model_class: Type["Model"],
schema,
) -> sqlalchemy.Table:
return model_class.build(schema)
async def create_tables(
self,
registry: "saffier.Registry",
tenant_models: Dict[str, Type["Model"]],
schema,
exclude: Union[List[str], None],
) -> None:
for name, model in tenant_models.items():
if name in exclude:
continue
table = self.table_schema(model, schema)
logger.info(f"Creating table '{name}' for schema: '{schema}'")
try:
async with registry.engine.begin() as connection:
await connection.run_sync(table.create)
await registry.engine.dispose()
except Exception as e:
logger.error(str(e))
async def create_schema(self) -> None:
await registry.schema.create_schema(self.schema_name)
async def save(
self: Any,
force_save: bool = False,
values: Dict[str, Any] = None,
**kwargs: Any,
) -> Type[Model]:
tenant = await super().save(force_save, values, **kwargs)
try:
await self.create_schema(schema=tenant.schema_name, if_not_exists=True)
await self.create_tables(
registry, registry.models, tenant.schema_name, exclude=["Tenant", "TenantUser"]
)
except Exception as e:
message = f"Rolling back... {str(e)}"
logger.error(message)
await self.delete()
return tenant
class User(saffier.Model):
id = saffier.IntegerField(primary_key=True)
name = saffier.CharField(max_length=255)
class Meta:
registry = registry
class Product(saffier.Model):
id = saffier.IntegerField(primary_key=True)
name = saffier.CharField(max_length=255)
user = saffier.ForeignKey(User, null=True)
class Meta:
registry = registry
This is a lot to unwrap is it? Well, that was explained before at the top and this is just the declaration of the models for some general purposes.
Generate example data¶
Now it is time to generate some example data and populate the tables previously created.
async def create_data():
"""
Creates mock data.
"""
# Create some users in the main users table
esmerald = await User.query.create(name="esmerald")
# Create a tenant for Saffier (only)
tenant = await Tenant.query.create(
schema_name="saffier",
tenant_name="saffier",
)
# Create a user in the `User` table inside the `saffier` tenant.
saffier = await User.query.using(tenant.schema_name).create(
name="Saffier schema user",
)
# Products for Saffier (inside saffier schema)
for i in range(10):
await Product.query.using(tenant.schema_name).create(
name=f"Product-{i}",
user=saffier,
)
# Products for Esmerald (no schema associated, defaulting to the public schema or "shared")
for i in range(25):
await Product.query.create(name=f"Product-{i}", user=esmerald)
We now have models
and mock data for those. You will realise that we created a user
inside the
shared
database (no schema associated) and one specifically inside the newly saffier
schema.
Middleware¶
It is time to create a middleware that will take advantage of our new models and tenants and set the tenant automatically.
The middleware will receive some headers with the tenant information and it will lookup if the tenant exist.
Danger
Do not use this example in production, the way it is done it is not safe. A real lookup example would need more validations besides a direct headers check.
from typing import Any, Coroutine
from esmerald import Request
from esmerald.protocols.middleware import MiddlewareProtocol
from lilya.types import ASGIApp, Receive, Scope, Send
from saffier.core.db import set_tenant
from saffier.exceptions import ObjectNotFound
class TenantMiddleware(MiddlewareProtocol):
def __init__(self, app: "ASGIApp"):
super().__init__(app)
self.app = app
async def __call__(
self, scope: Scope, receive: Receive, send: Send
) -> Coroutine[Any, Any, None]:
"""
Receives a header with the tenant information and lookup in
the database if exists.
Sets the tenant if true, or none otherwise.
"""
request = Request(scope=scope, receive=receive, send=send)
tenant_header = request.headers.get("tenant", None)
try:
user = await Tenant.query.get(schema_name=tenant_header)
tenant = user.schema_name
except ObjectNotFound:
tenant = None
set_tenant(tenant)
await self.app(scope, receive, send)
Now this is getting somewhere! As you could now see, this is where we take advantage of the set_tenant.
In the middleware, the tenant is intercepted and all the calls in the API will now query only
the tenant data, which means that there is no need for using
or using_with_db
anymore.
API¶
Now it is time to simply create the API that will read the created products from the database and assemble everything.
This will create an Esmerald application, assemble the routes
and add the
middleware created in the previous step.
from typing import List
from esmerald import Esmerald, Gateway, JSONResponse, get
import saffier
database = saffier.Database("<TOUR-CONNECTION-STRING>")
models = saffier.Registry(database=database)
@get("/products")
async def products() -> JSONResponse:
"""
Returns the products associated to a tenant or
all the "shared" products if tenant is None.
"""
products = await Product.query.all()
products = [product.pk for product in products]
return JSONResponse(products)
app = Esmerald(
routes=[Gateway(handler=products)],
on_startup=[database.connect],
on_shutdown=[database.disconnect],
middleware=[TenantMiddleware],
)
Query the API¶
If you query the API, you should have similar results to this:
import httpx
async def query():
response = await httpx.get("/products", headers={"tenant": "saffier"})
# Total products created for `saffier` schema
assert len(response.json()) == 10
# Response for the "shared", no tenant associated.
response = await httpx.get("/products")
assert len(response.json()) == 25
The data generated for each schema (shared
and saffier
) should match
the response total returned.
Notes¶
As you could see in the previous step-by=step example, using the set_tenant can be extremely useful mostrly for those large scale applications where multi-tenancy is a must so you can actually take advantage of this.