Snowflake Separation of Environments
In a healthy DataSecOps operation, an important part of planning a secure Snowflake deployment is planning for the separation of environments. This is especially true for an organization with a new Snowflake deployment. Still, it is never too late to amend the way you structure your accounts, and in particular to set up new accounts for the separation of environments.
When Do You Separate Your Environments?
There are several situations where it makes sense to have several different environments within the same organization. By “a separate environment,” I mean separate schemas, separate databases, either of those combined with separate virtual warehouses, or even separate accounts in the same company. I will soon discuss the different types of separation, but first, here are the main reasons for separation I’ve encountered:
- Different environments for different business units. This really depends on your organization: how budgets are handled between departments, how big the departments are, how deep data sharing is within the organization, and the regional aspect (with its corresponding regulations). In some cases, the same organization (e.g., the same company) wants to set up Snowflake accounts for different business units that need to be separate, either fully or partially.
- Separation due to data sensitivity levels. In some cases, organizations want to keep sensitive data in separate environments so that it’s apparent who has access to which environment, and in some cases to operate these accounts in different configurations, such as different encryption settings, PrivateLink connections, etc.
- Multi-tenancy and contractual restrictions. In some cases, a company holds data of several customers. They want to separate the different customers' data, either to prevent the risk of data exposure or due to contractual obligations.
- Separation due to regional issues. In some cases, the most (in)famous of them being GDPR, there are certain restrictions on data of data subjects in a specific country or region. These can also be other regulations or even customer requirements, but in such cases, some companies hold accounts in several different regions.
- Operational efficiency when working with multiple clouds. Sometimes companies work with two or more of the public clouds, or have data to ingest in several regions, so it makes sense to split the organization into separate accounts in different clouds or regions.
- Separation for testing, staging, and production environments. This is something I hope to see more of. In many organizations, tests are not done at all, are done sporadically, or are run on production data or “almost production data” (such as copying a table to another table to test a certain flow).
Please understand that unless you’re just experimenting, you’re doing DataOps, which means that your pride should be in well-defined automation and tests. In the same way that software is deployed to production through a controlled process, ideally, you’d like to do the same with data, especially if it changes frequently. So you want to set up a staging environment where you make sure everything is working as planned before applying changes to production data schemas and data sets.
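One way to stand up such a staging environment, offered here as a minimal sketch rather than the only approach, is Snowflake's zero-copy cloning. Cloning is not discussed elsewhere in this article, and the database, schema, table, and column names below (acme_prod, acme_staging, sales.orders, discount_pct) are hypothetical.

```sql
-- Hypothetical names: acme_prod is production, acme_staging is the staging copy.
-- CLONE creates a zero-copy snapshot; extra storage is used only as the copies diverge.
CREATE OR REPLACE DATABASE acme_staging CLONE acme_prod;

-- Try a schema change against staging before touching production.
ALTER TABLE acme_staging.sales.orders ADD COLUMN discount_pct NUMBER(5,2);

-- Run whatever checks the change needs, for example confirming the table is still readable.
SELECT COUNT(*) AS row_count FROM acme_staging.sales.orders;

-- Remove the staging copy once the change has been validated.
DROP DATABASE IF EXISTS acme_staging;
```

Because the clone lives in the same account, this covers the "separate database" level of separation; a separate staging account would need the data replicated or shared into it instead.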
How Do You Separate Your Environments?
When you decide to separate your data cloud, you need to understand how deep that separation should be. For example, an organization with a strong base of data democratization, where different teams are involved in data consumption, should try not to split the operation into separate accounts. It makes more sense to invest in access control within the account itself, add fine-grained security if needed, and, if required, even provide a self-service data access portal.
Operationally, depending on the company, it may make sense to use different virtual warehouses to keep compute billing separate. Depending on the usage and the amount of data engineering effort to be invested, though, consolidating operations onto fewer virtual warehouses, or even implementing queueing over the virtual warehouses, may make sense.
For an organization with little sharing between different business units, and if this is also not planned to change, it may make sense to use completely different accounts managed by the same organization administrator. If there is only a little sharing to be done, it can also be done with cross-account data sharing or reader accounts, depending on the exact use case.
When sensitive data is involved, the right choice depends on the resources you have for handling data access operations. Setting sensitive data aside may sound tempting for your Snowflake security, but other options, like proper role separation, controls such as data masking, and good visibility over your data, can be more effective than moving sensitive data to a separate account, especially when several users or teams do require access to that data.
In some of these cases, as mentioned above, you may want to operate several accounts. In Snowflake, you can do so with SQL commands, as a self-service operation. In some cases (where the investment is justified), you can even create, modify, and delete accounts using a custom application.
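As a rough illustration of access control within a single account, the sketch below gives one team read access to one schema. The role, database, schema, and user names (marketing_analyst, analytics, marketing, jdoe) are assumptions made for the example.

```sql
-- Hypothetical role for one team of data consumers.
CREATE ROLE IF NOT EXISTS marketing_analyst;

-- Let the role see the database and one schema, and read that schema's tables.
GRANT USAGE ON DATABASE analytics TO ROLE marketing_analyst;
GRANT USAGE ON SCHEMA analytics.marketing TO ROLE marketing_analyst;
GRANT SELECT ON ALL TABLES IN SCHEMA analytics.marketing TO ROLE marketing_analyst;

-- Cover tables created in the schema later as well.
GRANT SELECT ON FUTURE TABLES IN SCHEMA analytics.marketing TO ROLE marketing_analyst;

-- Assign the role to a (hypothetical) user.
GRANT ROLE marketing_analyst TO USER jdoe;
```

Other teams would get their own roles scoped to their own schemas, which keeps the data in one account while still making it clear who can read what.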
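A minimal sketch of per-team compute might look like the following. The warehouse names and the marketing_analyst role are hypothetical, and the role is assumed to exist already; the sizes and suspend timings are placeholders to tune.

```sql
-- One small warehouse per team, so credits can be attributed per warehouse.
CREATE WAREHOUSE IF NOT EXISTS marketing_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60            -- suspend after 60 seconds of inactivity
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

CREATE WAREHOUSE IF NOT EXISTS finance_wh
  WAREHOUSE_SIZE = 'XSMALL'
  AUTO_SUSPEND = 60
  AUTO_RESUME = TRUE
  INITIALLY_SUSPENDED = TRUE;

-- Only the owning team's role may run queries on its warehouse.
GRANT USAGE ON WAREHOUSE marketing_wh TO ROLE marketing_analyst;

-- Credits consumed per warehouse over the last 30 days.
SELECT warehouse_name, SUM(credits_used) AS credits_30d
FROM snowflake.account_usage.warehouse_metering_history
WHERE start_time >= DATEADD(day, -30, CURRENT_TIMESTAMP())
GROUP BY warehouse_name
ORDER BY credits_30d DESC;
```

The last query is what makes the separation useful for billing: if the per-warehouse numbers are never consulted, fewer shared warehouses may be simpler to run.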
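Cross-account sharing between two accounts could be sketched as follows. All object names are hypothetical, and the account identifiers (acme.acme_finance as provider, acme.acme_marketing as consumer) assume the organization-name.account-name form; substitute your own.

```sql
-- On the provider account: create a share and expose one table through it.
CREATE SHARE finance_share;
GRANT USAGE ON DATABASE finance TO SHARE finance_share;
GRANT USAGE ON SCHEMA finance.reporting TO SHARE finance_share;
GRANT SELECT ON TABLE finance.reporting.monthly_revenue TO SHARE finance_share;

-- Make the share available to the consumer account.
ALTER SHARE finance_share ADD ACCOUNTS = acme.acme_marketing;

-- On the consumer account: mount the share as a read-only database.
CREATE DATABASE finance_shared FROM SHARE acme.acme_finance.finance_share;
```

The consumer queries finance_shared with its own warehouses, so the data is not copied and compute costs stay with the consuming business unit.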
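As an illustration of the masking alternative, here is a minimal sketch of a dynamic data masking policy. The policy, role, and table names (email_mask, PII_READER, crm.public.customers) are hypothetical. Note that masking policies require Enterprise Edition or higher.

```sql
-- Show the real value only to a privileged role; everyone else sees a placeholder.
CREATE MASKING POLICY IF NOT EXISTS email_mask AS (val STRING) RETURNS STRING ->
  CASE
    WHEN CURRENT_ROLE() IN ('PII_READER') THEN val
    ELSE '***MASKED***'
  END;

-- Attach the policy to the sensitive column.
ALTER TABLE crm.public.customers
  MODIFY COLUMN email SET MASKING POLICY email_mask;
```

With this in place, the table can stay in the shared account: analysts can still join and aggregate on it, while only holders of the privileged role see the raw email addresses.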
Self-Service Accounts Management
When managing accounts, you need the special ORGADMIN role to manage organization settings (such as setting up new accounts). At the time of writing, this role can only be enabled by opening a support ticket. As an example, when a user with the ORGADMIN role wants to list the accounts in the organization, they can run the following query:
SHOW ORGANIZATION ACCOUNTS;
An example of creating a new account is the following:
-- Replace the placeholder password; the admin must change it on first login.
CREATE ACCOUNT acme_dev_staging
  ADMIN_NAME = stageadmin
  ADMIN_PASSWORD = 'CHANGE_ME_19932221222!!'
  FIRST_NAME = stage
  LAST_NAME = admin
  MUST_CHANGE_PASSWORD = TRUE
  EMAIL = 'dataops@acme.corp'
  EDITION = standard
  REGION = aws_us_east_1;
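Modifying and deleting accounts can be done in the same self-service way. The following is a sketch only: acme_staging is a hypothetical new name, the grace period is an arbitrary choice, and you should check the current Snowflake documentation for the allowed grace-period range before relying on it.

```sql
-- Account management commands are run with the ORGADMIN role.
USE ROLE ORGADMIN;

-- Rename the account created above (acme_staging is a hypothetical new name).
ALTER ACCOUNT acme_dev_staging RENAME TO acme_staging;

-- Drop the account; it can be restored until the grace period expires.
DROP ACCOUNT acme_staging GRACE_PERIOD_IN_DAYS = 7;
```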
For more information about ORGADMIN, refer to the Snowflake documentation.