So you have a multi-tenant SaaS application that is using PostgreSQL as the database of choice. As you are serving multiple customers, how do you protect each customer’s data? How do you provide full data isolation (logical and physical) between different customers? How do you minimize impact of attack vectors such as SQL Injection? How do you retain the flexibility to potentially move the customer to a higher hosting tier or higher SLAs?
1. One DB per customer
Instead of putting every customer’s data in one database, simply create one database per customer. This allows for physical isolation of data within your Postgres cluster. So, for every new customer that registers, do this as part of the workflow:
1 |
CREATE DATABASE customer_A WITH TEMPLATE customer_template_v1; |
In the example above customer_template_v1
is a custom database template with all the tables, schemas, procedures pre-created.
Note: You can use Schema or Row Level Security (v9.5) to effect isolation. However, Schema and Row Level Security would only allow for logical isolation. You could go the other extreme and use a DB cluster (as opposed to a database) per customer to effect complete data isolation. But the management overhead makes it a less than ideal option in most cases.
2. Separate DB user(s) per customer
After the Database is created as mentioned above, create a unique Database user as well. This user only would have permission to one (and only one) database: customer_A
.
1 2 3 4 |
CREATE ROLE customer_A_user with option NOSUPERUSER NOCREATEDB LOGIN ENCRYPTED PASSWORD '' REVOKE ALL ON DATABASE customer_A FROM PUBLIC; GRANT CONNECT ON DATABASE customer_A TO customer_A_user; GRANT ALL ON SCHEMA public TO customer_A_user WITH GRANT OPTION; |
Now, in your middleware code, make sure to connect to customer_A
database only using customer_A_user
. In other words, when a user from customer_A
organization logs into your SaaS application, use appropriate database and database user name.
If you wish, you can even create separate READ and WRITE users. So, to create a read user for database: customer_A
1 2 3 4 5 |
CREATE ROLE customer_A_read_user with option NOSUPERUSER NOCREATEDB LOGIN ENCRYPTED PASSWORD '' GRANT USAGE ON SCHEMA public TO customer_A_read_user; GRANT CONNECT, TEMPORARY ON DATABASE customer_A TO customer_A_read_user; GRANT SELECT ON ALL TABLES IN SCHEMA public TO customer_A_read_user; ALTER DEFAULT PRIVILEGES IN SCHEMA public GRANT SELECT ON TABLES TO customer_A_read_user; |
With the above you have fine grained control in terms of database access privileges and every activity from the middleware needs to decide carefully as to which role (read or read/write) needs to be used for access.
So, what DB User/Role do you use to create the new customer database in the first place? Create a special DB User (say create_db_user
) just for this purpose. Audit and monitor this user’s activity closely. Don’t use this DB User for anything else. Or you can create a new user for each new database and simply specify that at database creation time. Whatever happens, don’t use the Postgres root user for your web connections!
1 2 |
CREATE ROLE customer_B_user with option NOSUPERUSER NOCREATEDB NOCREATEROLE LOGIN ENCRYPTED PASSWORD 'ABGF$%##89'; CREATE DATABASE customer_B WITH TEMPLATE customer_template_v1 OWNER=customer_B_user; |
As you may have noticed, a number of SaaS applications give vanity URLs (example: https://customerA.example.com) to their customers. Some other SaaS applications have a concept of ‘customerId’ which is a required field for authentication into SaaS application. The benefit is two fold:
- As the user logs into the SaaS application, the middleware code knows exactly which database to connect to.
- This also helps to keep the URL space isolated, allowing the SaaS application to start isolation at the web server level itself.
3. Separate crypto keys per customer
If you are doing any encryption within the database (say with pgcrypto
), make sure to use separate encryption keys for each customer. This adds cryptographic isolation between your customer data. Finally, when it comes to encryption and key management, avoid these common encryption errors developers keep making.
Comment and do let us know what other best practices make sense for multi-tenant SaaS access with PostgreSQL.
While this is a solid way to set this up, it will potentially lead to performance and scalability issues. Because you can not use a connection pool to pool connections between DB users, this setup can lead to a large number of DB connections which can be a massive performance killer.
True. However, if providing isolation is important (especially in regulatory environments), then connection pooling problem can be solved by moving more chatty customers to a more powerful pg cluster (and maybe to a higher pricing tier). The risk profiles and attack vectors should dictate the decisions I think.
If you do per-database segmentation, you should also disallow connecting to the wrong database via pg_hba.conf. I recommend creating a no-login role (aka a group) for each database. Setup pg_hba.conf so that only users with that role can connect to that database, and then grant that role to whatever other login roles you need for that database.
Thanks for the tip @Jim. We’ll incorporate this into the article soon.
We are doing one db per X customers and in that db each user has a separate schema.
This (with the addition of pg_hba.conf as noted above) is exactly how we do things in our SaaS platform, and it works really well. there are all sorts of great outcomes from doing things this way, including being able to easily give certain customers a restoreable backup. This is the first time I’ve seen the technique published and we’ve often wondered if anyone else did things this way, so many thanks for validating our approach
Good to hear that Mark!