Rules
The rules block of a policy
Rules specify who can interact with which data, and what actions they can take on that data. Inside the rules block:
- Every rule except your default rule has an identities specification that specifies the people, applications, or groups this rule applies to.
- Every rule contains a set of contexted rules, one for each type of access: reads, updates, and/or deletes. A contexted rule specifies the policy enforcement actions that will ensure users can only see and operate on data as allowed by your policy. Each contexted rule applies only in the context of its specified operation type. For example, the reads rule applies only when someone tries to retrieve data. The rules block does not need to include all three operation types; actions you omit are disallowed.
- A rule may optionally contain a hosts specification that limits access to only those users connecting from a certain network location.
Unless you create a default rule, users and groups only have the rights you explicitly grant them.
The default rule
A default rule is an optional rule without an identity specification (identities field). It applies to any user whose username or group affiliation failed to match any other rule. Without a default rule, the policy only allows those actions explicitly granted in the identities-based rules.
The following default rule from the sample policy specifies that any person who failed to match the other rules will be allowed to read only 1 row of EMAIL at a time. Updates and deletes are disallowed for such users, since the default rule contains no updates or deletes permissions.
reads:
  - data: [EMAIL]
    rows: 1
The identities specification in a rule
Cyral policy rules can determine what a user can do in a repository, based on the authenticated user's identity. For each rule, you specify the set of identities (people, applications, and groups) to which the rule applies. If you omit the identity specification, this rule becomes the default rule.
- users ([string]): one or more individual users, identified by the user account they use to sign in:
  - for SSO users registered with email, this string will be the user's email address, like nancy.drew@hhiu.us.
  - for SSO users registered with a username, this string will be the SSO username, like nancydrew.
  - for native database accounts, this string is the database username (as shown in the examples on this page: [bob, sara]).
- services ([string]): one or more applications:
  - for users going through Looker, use the service name looker.
  - for custom services, use the application name provided in the connection URL when connecting to the database.
- groups ([string]): one or more groups, identified by their SSO group name. Group names are defined in your enterprise SSO service, such as GSuite or Okta. For a policy rule to match, the group name listed here must match the group name of the access rule that granted the user access.
For example, the following identity specification indicates that the rule will apply to users bob and sara, any users going through the service looker, and any users belonging to the user group analyst.
identities:
  users: [bob, sara]
  services: [looker]
  groups: [analyst]
In a policy, a limit of one rule per user or group
Within a given policy, make sure you only create one rule per user or group. In other words, no two rules in a single policy can contain the same user, group, or service. In our example, this means that the user bob can only appear in one rule for a given policy.
Specifically, the following limits apply in order to prevent conflicts within a policy:
- Each person must have only one rule that specifically applies to that person by username.
- Each group must have only one rule that specifically applies to that group by name.
- A person may have both a rule applied to them by username, and one or more rules that apply to them based on group affiliation. In this case, the rule that applies to them by username takes precedence.
  - Looking at the sample policy, we can see that one rule applies to the user bob and another applies to the user group analyst. If bob happens to be a member of the group analyst, then when Bob attempts to perform a data operation, we will apply the rule specified for the user bob and ignore the rule specified for the group analyst. In overlap cases like this, Cyral enforces a single rule with the following precedence: user > group > service.
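The precedence described above (user, then group, then service, then the default rule) can be pictured with a short Python sketch. This is only an illustration of the selection order, not Cyral's implementation: the in-memory rule structure, the rule names, and the select_rule function are all hypothetical.

```python
from typing import Optional

# Hypothetical in-memory form of a policy's rules. The keys mirror the
# identities block (users, groups, services); the "name" key is an assumption
# added only so the selected rule is easy to see.
RULES = [
    {"name": "bob-rule", "identities": {"users": ["bob"]}},
    {"name": "analyst-rule", "identities": {"groups": ["analyst"]}},
    {"name": "looker-rule", "identities": {"services": ["looker"]}},
    {"name": "default-rule"},  # no identities block, so this is the default rule
]


def select_rule(rules, user: str, groups: list, service: Optional[str]):
    """Return the single rule to enforce, using user > group > service precedence."""

    def first_match(kind, values):
        for rule in rules:
            listed = rule.get("identities", {}).get(kind, [])
            if any(value in listed for value in values):
                return rule
        return None

    match = (
        first_match("users", [user])
        or first_match("groups", groups)
        or first_match("services", [service] if service else [])
    )
    if match is not None:
        return match
    # Nothing matched: fall back to the default rule, if the policy has one.
    for rule in rules:
        if "identities" not in rule:
            return rule
    return None
```

With these rules, select_rule(RULES, "bob", ["analyst"], None) picks the rule for the user bob even though bob is also in the analyst group, and a user who matches nothing falls through to the default rule.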
The hosts specification in a rule
The hosts specification is optional. It lists the host addresses that are allowed to connect to the data locations governed by this rule. If you do not include a hosts block, Cyral does not enforce limits based on the connecting client's host address.
To specify a hosts block, provide addresses as a comma-separated list of IP addresses and network blocks in CIDR notation. When a user tries to perform a data operation while connected from any host other than those you list here, the rule blocks the action.
For example, the hosts specification shown below ensures that data locations in this rule can be accessed only while connected from a host at 192.0.2.22 or one of the hosts in the 203.0.113.16/28 block.
hosts: [192.0.2.22, 203.0.113.16/28]
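To make the matching concrete, here is a minimal Python sketch of how a client address could be tested against such a list, using the standard ipaddress module. The host_allowed function is a hypothetical helper for illustration; it shows the CIDR arithmetic only and says nothing about how the sidecar performs the check.

```python
import ipaddress

# The hosts list from the example above.
HOSTS = ["192.0.2.22", "203.0.113.16/28"]


def host_allowed(client_ip: str, hosts: list) -> bool:
    """Return True if client_ip matches any listed address or CIDR block."""
    if not hosts:
        # No hosts block: no limit is enforced based on the client's address.
        return True
    addr = ipaddress.ip_address(client_ip)
    # ip_network also accepts a bare address and treats it as a one-host network.
    return any(addr in ipaddress.ip_network(entry, strict=False) for entry in hosts)
```

The block 203.0.113.16/28 spans 203.0.113.16 through 203.0.113.31, so host_allowed("203.0.113.30", HOSTS) is true while host_allowed("203.0.113.32", HOSTS) is false.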
Contexted rules
A contexted rule is where you specify the enforcement actions that will ensure users can only see and operate on data as allowed by your policy.
Enforcement actions in contexted rules
Cyral policy enforcement actions can limit or change what data a user sees in response to a query request, and they set limits on what data users can update or delete.
The available enforcement actions in a contexted rule are described in the sections that follow:
- Blocking blocks access to a table or location.
- Row limiting limits how many rows are returned per query.
- Rate limiting limits the number of rows returned per hour.
- Dataset rewriting filters the set of rows returned.
- Masking hides or replaces specific field values in results.
- Additional checks let you add conditions that must be satisfied in order for results to be returned.
Below, we explain Cyral policy enforcement actions and how they interact.
Where do I specify the enforcement action?
For most types of enforcement actions, you'll add the enforcement action in a contexted rule in your policy.
info
What if I don't have a policy for the repository?
Basic access control to a repository does not require a policy. Once you've set up SSO authentication for a repository, only the allowed SSO users and groups can connect to that repository.
Likewise, preconfigured alerts don't require a policy. Cyral notifies you when suspicious activity is detected on a repository.
Data scope
Inside a contexted rule, the data block lists the data labels or tags of the data locations protected by this rule:
- Specify locations using data labels you've established in your Data Map.
- Specify a value of any to grant access to all the data locations protected by the current policy.
For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the tracked data locations (labels EMAIL, CCN, and SSN). They can also write 1 row at a time to the locations EMAIL and CCN, and they can delete 1 row at a time from any of the tracked locations.
identities:
  groups: [analyst]
reads:
  - data: any
    rows: 10
updates:
  - data: [EMAIL, CCN]
    rows: 1
    severity: medium
deletes:
  - data: any
    rows: 1
    severity: medium
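One way to read the data scope of this rule is sketched below in Python. The dictionary is a hand-written copy of the rule above, and row_limit is a hypothetical helper; the sketch assumes that any expands to every label tracked by the policy and that a label not listed under an operation is simply not granted.

```python
# Labels tracked by the sample policy.
POLICY_LABELS = ["EMAIL", "CCN", "SSN"]

# Hand-written copy of the analyst rule shown above.
ANALYST_RULE = {
    "reads": [{"data": "any", "rows": 10}],
    "updates": [{"data": ["EMAIL", "CCN"], "rows": 1}],
    "deletes": [{"data": "any", "rows": 1}],
}


def row_limit(rule, operation, label, policy_labels):
    """Return the rows value granted for label under operation, or None if not granted."""
    for contexted in rule.get(operation, []):
        data = contexted["data"]
        # "any" covers every data location protected by the policy.
        scope = policy_labels if data == "any" else data
        if label in scope:
            return contexted.get("rows")
    return None
```

Here row_limit(ANALYST_RULE, "updates", "SSN", POLICY_LABELS) returns None, because the updates contexted rule lists only EMAIL and CCN.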
Optional fields in a contexted rule
Users can also specify the following optional fields in a contexted rule:
- additionalChecks (string): constraints on the data access, specified in Rego. See Additional checks.
- severity (string): severity level that's recorded when someone triggers this rule. This is an informational value that will be written to the query log. Settings: (low | medium | high). If not specified, the severity is considered to be low.
Example with optional fields
For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the data locations covered by this policy (EMAIL, CCN, and SSN). They can write 1 row at a time to the locations EMAIL and CCN. Finally, they can delete 1 row at a time from any of the data locations covered by this policy, provided they are using the psql application to do it.
rules:
  - identities:
      groups: [analyst]
    reads:
      - data: any
        rows: 10
    updates:
      - data: [EMAIL, CCN]
        rows: 1
        severity: medium
    deletes:
      - data: any
        rows: 1
        severity: medium
        additionalChecks: |
          is_valid_request {
            client.applicationName == "psql"
          }
Blocking access
The block on violation enforcement action stops the user from receiving the database results for a query, if that query violates your policy.
A Cyral policy is inherently a blocking policy, meaning you do not need to include a keyword in your policy to block access to a data location. Instead, create a rule whose identities scope covers the users or groups to be blocked, set the data scope to include the data labels or tags of the data locations to be protected, and then do not add further instructions to the rule.
caution
You must enable blocking on each repository where you wish to block access to data locations.
When does the blocking occur?
Depending on the type of request, Cyral may block the request before it's submitted to the repository, or it may block the response from the repository.
- For read operations such as SELECTs:
  - Cyral blocks the request if the query refers to a forbidden data label.
  - Cyral blocks the response if the result set would contain more rows than allowed for the referenced data labels.
- For UPDATE and DELETE attempts, Cyral blocks the request if it refers to a forbidden data label.
Blocking example
In the following example, all CCN data is blocked for all the users in the level-1-support user group.
rules:
  - identities:
      groups: [level-1-support]
    reads:
      - data: [CCN]
In other words, no rule declaration in a contexted rule is needed to block access to a data location. Instead, blocking is what happens when there's no rule granting a user access to a labeled data location.
Row limiting
Use the rows keyword in a contexted rule to limit the number of rows or documents a user can retrieve in a single query statement. Specify a value of any to allow an unlimited number of records to be accessed or affected in a single statement.
For example, to ensure that a member of level-2-support can retrieve at most 10 rows per query, create a rule like:
rules:
  - identities:
      groups: [level-2-support]
    reads:
      - data: any
        rows: 10
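The comparison behind a rows setting is simple enough to sketch in a few lines of Python. The exceeds_row_limit function below is a hypothetical helper, and it assumes that a result set exactly at the limit is still allowed, which the wording "at most 10 rows" suggests.

```python
def exceeds_row_limit(rows_setting, row_count: int) -> bool:
    """Return True if row_count is over the rule's rows setting."""
    if rows_setting == "any":
        # "any" means an unlimited number of records per statement.
        return False
    return row_count > int(rows_setting)
```

For the rule above, a query returning 10 rows stays within the limit, while one returning 11 rows exceeds it.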
Rate limiting
Use the rateLimit keyword in a contexted rule to limit the rate at which a user can read, update, and/or delete data. The limit is expressed as the number of rows or documents per hour.
To set this up, add a rateLimit field to the contexted rule for each operation you wish to limit (in the reads, updates, and/or deletes sections of the rule).
For example, to set a limit of 20 records per user per hour from the CCN data location, you would add:
rules:
  - reads:
      - data: [CCN]
        rows: any
        rateLimit: 20
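As a rough mental model of a per-user hourly limit, the Python sketch below counts rows over a rolling one-hour window. The rolling window is an assumption made for illustration; this page does not say how the hour is measured, and the HourlyRowLimiter class is hypothetical.

```python
from collections import defaultdict, deque


class HourlyRowLimiter:
    """Track rows per user over a rolling one-hour window (an assumed window model)."""

    WINDOW_SECONDS = 3600

    def __init__(self, rate_limit: int):
        self.rate_limit = rate_limit
        self._events = defaultdict(deque)  # user -> deque of (timestamp, rows)

    def allow(self, user: str, rows: int, now: float) -> bool:
        """Record the access and return True if it fits in the user's hourly budget."""
        events = self._events[user]
        # Drop accesses that have aged out of the one-hour window.
        while events and now - events[0][0] >= self.WINDOW_SECONDS:
            events.popleft()
        used = sum(count for _, count in events)
        if used + rows > self.rate_limit:
            return False
        events.append((now, rows))
        return True
```

With HourlyRowLimiter(20), a user who has already read 15 rows in the current hour can read 5 more but not 6, and each user has a separate budget.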
Dataset rewriting
The dataset rewriting enforcement action lets you specify a filter on the set of rows that the user is allowed to access.
This action rewrites table expressions in the user query, replacing them with a substitute query that you've specified in the policy. Rewriting is typically used to filter the set of rows the user can see. The most common use case is to specify a query of the form SELECT * FROM table WHERE ... in which the WHERE clause specifies a filter allowing only the rows that the user is allowed to access. However, you also have the option to supply a more complex replacement query.
caution
You must enable dataset rewrites on each repository where you wish to perform rewrites.
tip
See policy evaluation to understand how dataset rewriting interacts with other actions like blocking, rate limiting, and masking data.
Procedure
- Turn on dataset rewrites for this repository.
- Specify how the dataset will be rewritten by adding the datasetRewrites field in your contexted rule. The datasetRewrites field contains an array of objects with the following structure:
  - repo (string): the name of the repository that the rewrite applies to.
  - dataset (string): the dataset or data location that should be rewritten. This name is case insensitive. For example, if you specify a table name, orders, it will also match a table called Orders in your database.
    - For most database types, this is a fully qualified table name in the form <schema>.<table>
    - For Snowflake, this is a fully qualified table name in the form <database>.<schema>.<table>
  - parameters ([string]): the set of parameters used in the substitution request. These are references to fields in the activity log, as described in the Additional checks section.
  - substitution (string): the request used to substitute references to the dataset.
Example
For example, the following contexted rule specifies a rewrite that is triggered in the event a request reads EMAIL data. As a result, in this case, any references to the fully qualified table myDb.finance.customers will be replaced with the subquery SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:, where :identity.endUser: would be replaced with the value in the identity.endUser field in the activity log.
reads:
  - data: [EMAIL]
    rows: 10
    datasetRewrites:
      - repo: claims
        dataset: myDb.finance.customers
        parameters: [identity.endUser]
        substitution: "SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:"
As a more specific example, suppose an individual makes the following query that tries to read more than the 10 row limit specified above. Suppose also that the individual has accessed the repository using SSO authentication, and is identified as the end user nancy.drew@hhiu.us.
SELECT * FROM myDb.finance.customers;
Given the dataset rewrite specification in the example above, the Cyral sidecar would rewrite the query such that the receiving database sees the following query.
SELECT * FROM (SELECT * FROM myDb.finance.customers WHERE email='nancy.drew@hhiu.us');
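The rewrite shown above can be approximated with plain string operations. The Python sketch below is only a toy model, not the sidecar's implementation: rewrite_query is a hypothetical function, the quoting of the substituted value is a naive assumption, and a real rewriter would work on parsed SQL rather than on regular expressions.

```python
import re

# The datasetRewrites entry from the example, copied into a dictionary.
REWRITE = {
    "repo": "claims",
    "dataset": "myDb.finance.customers",
    "parameters": ["identity.endUser"],
    "substitution": "SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:",
}


def rewrite_query(query: str, rewrite: dict, activity_log: dict) -> str:
    """Replace references to rewrite['dataset'] with the parameterized substitution."""
    substitution = rewrite["substitution"]
    for param in rewrite["parameters"]:
        # Walk the dotted path (e.g. identity.endUser) through the activity log.
        value = activity_log
        for key in param.split("."):
            value = value[key]
        # Naive quoting for illustration only: doubles embedded single quotes.
        quoted = "'" + str(value).replace("'", "''") + "'"
        substitution = substitution.replace(f":{param}:", quoted)
    # Dataset names match case-insensitively; the lookarounds avoid matching
    # inside a longer identifier.
    pattern = re.compile(
        r"(?<![\w.])" + re.escape(rewrite["dataset"]) + r"(?![\w.])",
        re.IGNORECASE,
    )
    return pattern.sub(lambda _match: f"({substitution})", query)
```

Calling rewrite_query("SELECT * FROM myDb.finance.customers;", REWRITE, {"identity": {"endUser": "nancy.drew@hhiu.us"}}) produces the rewritten query shown above.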
note
Currently, parameter substitutions take place even within string literals. For example, the substitution "SELECT * FROM myDb.finance.customers WHERE greeting = 'Hello, :identity.endUser:'" contains the string literal 'Hello, :identity.endUser:'. During dataset rewriting, the sidecar will substitute :identity.endUser: with whatever value is in the identity.endUser field in the activity log associated with the data access.
Masking data
The data masking enforcement action hides or replaces specific field values in each row returned, rather than filtering the set of rows returned.
To mask the contents of a data location, use one of the mask keywords in your contexted rule in the format <mask_type>(<data_label>, <mask_argument>), where mask_type is one of:
- mask to replace the field's contents with a semi-randomized string;
- constant_mask to replace the field's contents with a value you provide; or
- null_mask to replace the field's contents with a null value.
For example, to mask both the EMAIL and CCN fields for all members of level-3-support, you would add a rule like:
data:
  - EMAIL
  - CCN
rules:
  - identities:
      groups: [level-3-support]
    reads:
      - data:
          - mask(EMAIL)
          - constant_mask(CCN, "***")
caution
You must enable masking and install helper functions in each repository where you wish to perform masking. See Mask data for instructions.
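The effect of the three mask types on a returned row can be illustrated with a small Python sketch. These functions are hypothetical stand-ins, not the helper functions Cyral installs in the repository. In particular, the way mask randomizes a value here (keeping length and punctuation, replacing letters and digits) is an assumption; this page says only that the result is a semi-randomized string.

```python
import random
import string


def mask(value: str, seed=None) -> str:
    """Replace each letter or digit with a random one of the same kind (assumed behavior)."""
    rng = random.Random(seed)
    out = []
    for ch in value:
        if ch.isdigit():
            out.append(rng.choice(string.digits))
        elif ch.isalpha():
            out.append(rng.choice(string.ascii_letters))
        else:
            out.append(ch)  # keep punctuation such as "@" and "."
    return "".join(out)


def constant_mask(value, replacement):
    """Replace the field's contents with a value you provide."""
    return replacement


def null_mask(value):
    """Replace the field's contents with a null value."""
    return None


def apply_masks(row: dict, masks: dict) -> dict:
    """Return a copy of row with each field named in masks replaced."""
    return {
        field: masks[field](value) if field in masks else value
        for field, value in row.items()
    }
```

Masking changes field values but not the set of rows: apply_masks returns one output row per input row, with unmasked fields left untouched.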
Additional checks
Beyond specifying which and how much data can be accessed in the data and rows fields, you can impose more sophisticated constraints by adding the additionalChecks field to a contexted rule. The additionalChecks field contains a rule you'll write in the Rego language. The checks you specify in this field will be evaluated each time the contexted rule applies to an access request. Specify each check in the form of a Rego rule named is_valid_request, which needs to evaluate to true for the access attempt to be considered an allowed request. Otherwise the request will be considered a policy violation.
Each rule can evaluate attributes of the access request that are made available in the activity log. This information is exposed in the context of the Rego rule through the following variables that represent top-level fields in the activity log:
- identity: information about the entity performing the observed data access
- client: information about the client application from which the data is accessed
- repo: information about the repository being accessed
- request: information about the request itself
- tags: values provided in the request comment via the pass-through CyralTags
Attributes nested inside these top-level fields can be accessed using dot notation (e.g. identity.endUser, client.applicationName, repo.type, and so on).
As an example, the following additional check denotes that whatever access the check is specified for is only valid if the access is through a psql client. A more sophisticated example is provided in the Examples section at the end of this document (Example 6: Only allow users to see data pertaining to themselves).
additionalChecks: |
  is_valid_request {
    client.applicationName == "psql"
  }
In the above example, we use the | operator, which denotes a multiline string in YAML. See this page for more information on specifying multiline strings in YAML.
note
The Rego language defines a Rego module as comprising a Package declaration, a set of Import statements for declaring data dependencies, and a set of Rules. In this context, users need only specify Rules, omitting Package declaration and Import statements.
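For readers less familiar with Rego, the psql check above can be mirrored in Python to show what is being evaluated. This is an analogy only: the real check is written in Rego, and the get_attr helper and the dictionary shape used for the activity log are assumptions made for illustration.

```python
def get_attr(activity_log: dict, path: str):
    """Resolve a dotted path such as client.applicationName; return None if absent."""
    value = activity_log
    for key in path.split("."):
        if not isinstance(value, dict) or key not in value:
            return None
        value = value[key]
    return value


def is_valid_request(activity_log: dict) -> bool:
    """Python analogue of the Rego rule: valid only when the client is psql."""
    return get_attr(activity_log, "client.applicationName") == "psql"
```

A request whose activity log has no client.applicationName value fails this check, in the same way that the request is treated as a policy violation when the Rego rule does not evaluate to true.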