Rules
The rules block of a policy
Rules specify who can interact with which data, and what actions they
can take on that data. Inside the rules
block:
- Every rule except your default rule has an identities specification that specifies the people, applications, or groups this rule applies to.
- Every rule consists of a set of contexted rules, one for each type of access: reads, updates, and/or deletes. Each contexted rule applies only in the context of its specified operation type. For example, the reads rule applies only when someone tries to retrieve data. The rules block does not need to include all three operation types; actions you omit are disallowed.
- A rule may optionally contain a hosts specification that limits access to only those users connecting from a certain network location.
Unless you create a default rule, users and groups only have the rights you explicitly grant them.
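As a rough sketch of how these pieces fit together, the fragment below shows a rules block with one identity-based rule and one default rule. It assumes that rules are written as a YAML list under a top-level rules key, which this section implies but does not show. The group name, labels, and row limits are borrowed from the sample policy excerpts later in this section.

```yaml
# Sketch only: the list layout under `rules:` is an assumption.
rules:
  # Identity-based rule: applies to members of the group analyst.
  - identities:
      groups: [analyst]
    reads:
      - data: any
        rows: 10
    updates:
      - data: [EMAIL, CCN]
        rows: 1
    # No deletes entry here, so deletes are disallowed under this rule.

  # Default rule: no identities field, so it applies to anyone who
  # matched no other rule.
  - reads:
      - data: [EMAIL]
        rows: 1
```

If the default rule were removed, users outside the analyst group would have no access at all under this policy.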
The default rule
A default rule is an optional rule without an identity specification (identities field). It applies to any user whose username or group affiliation failed to match any other rule. Without a default rule, the policy only allows those actions explicitly granted in the identities-based rules.
The following default rule from the sample policy specifies that any person who failed to match the other rules will be allowed to read only 1 row of EMAIL at a time. Updates and deletes are disallowed for such users, since the default rule contains no updates or deletes permissions.
reads:
  - data: [EMAIL]
    rows: 1
The identities specification in a rule
For each rule, you can specify the set of identities (people, applications, and groups) to which the rule applies. If you omit the identity specification, this rule becomes the default rule.
- users ([string]): individual users
- services ([string]): applications
  - for users going through Looker, use the service name looker
  - for custom services, use the application name provided in the connection URL when connecting to the database
- groups ([string]): user groups defined in your enterprise SSO service, such as GSuite or Okta
For example, the following identity specification indicates that the rule will apply to users bob and sara, any users going through the service looker, and any users belonging to the user group analyst.
identities:
  users: [bob, sara]
  services: [looker]
  groups: [analyst]
In a policy, a limit of one rule per user or group
Within a given policy, make sure you only create one rule per user or group. In other words, no two rules in a single policy can contain the same user/group/service. In our example, this means that the user bob can only appear in one rule for a given policy.
Specifically, the following limits apply in order to prevent conflicts within a policy:
- Each person must have only one rule that specifically applies to that person by username.
- Each group must have only one rule that specifically applies to that group by name.
- A person may have both a rule applied to them by username, and one or more rules that apply to them based on group affiliation. In this case, the rule that applies to them by username takes precedence.
  - Looking at the sample policy, we can see that one rule applies to the user bob and another applies to the user group analyst. If bob happens to be a member of the group analyst, then when Bob attempts to perform a data operation, we will apply the rule specified for the user bob and ignore the rule specified for the group analyst. In overlap cases like this, Cyral enforces a single rule with the following precedence: user > group > service.
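The overlap case described above can be pictured with two rules like the following. This is an illustrative sketch, not the sample policy itself: the list layout under rules is assumed, and the permissions granted to bob are hypothetical.

```yaml
# Sketch only: the list layout under `rules:` is an assumption, and the
# permissions shown for bob are hypothetical.
rules:
  # Matches bob by username. For bob, this is the rule that is enforced.
  - identities:
      users: [bob]
    reads:
      - data: [EMAIL]
        rows: any

  # Matches members of the group analyst. If bob is a member, this rule
  # is ignored for him because user takes precedence over group.
  - identities:
      groups: [analyst]
    reads:
      - data: any
        rows: 10
```

Under these two rules, bob could read EMAIL without a row limit but would not inherit the analyst group's access to the other data locations, since only the username rule is applied to him.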
The hosts specification in a rule
The hosts specification is optional. It lists the host addresses that are allowed to connect to the data locations governed by this rule. If you do not include a hosts block, Cyral does not enforce limits based on the connecting client's host address.
To specify a hosts block, provide addresses as a comma-separated list of IP addresses and network blocks in CIDR notation. When a user tries to perform a data operation while connected from any host other than those you list here, the rule blocks the action.
For example, the hosts specification shown below ensures that data locations in this rule can be accessed only while connected from a host at 192.0.2.22 or one of the hosts in the 203.0.113.16/28 block.
hosts: [192.0.2.22, 203.0.113.16/28]
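To show where the hosts field might sit, here is a minimal sketch of a single rule that combines it with an identities block and a reads rule. Placing hosts at the same level as identities is an assumption based on its description as a rule-level specification; the group and limits are reused from the other examples in this section.

```yaml
# Sketch only: the position of `hosts` beside `identities` is assumed.
identities:
  groups: [analyst]
# One single address plus one CIDR block.
# 203.0.113.16/28 covers 203.0.113.16 through 203.0.113.31.
hosts: [192.0.2.22, 203.0.113.16/28]
reads:
  - data: any
    rows: 10
```

With this rule, an analyst connecting from 203.0.113.20 would fall inside the allowed block, while one connecting from 203.0.113.40 would be blocked.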
Contexted rules
Each contexted rule comprises these fields describing the allowed access for a given access type:
- data ([string]): the data locations protected by this rule.
  - Specify locations using LABELs you've established in your data map.
  - Specify a value of any to grant access to all the data locations protected by the current policy.
- rows (int): the number of records (for example, rows or documents) that can be accessed/affected in a single statement.
  - Specify a value of any to allow an unlimited number of records to be accessed/affected in a single statement.
- other optional fields, like additional checks and request rewriting.
For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the tracked data locations (EMAIL, CCN, and SSN). They can also write 1 row at a time to the locations EMAIL and CCN, and they can delete 1 row at a time from any of the tracked locations.
identities:
  groups: [analyst]
reads:
  - data: any
    rows: 10
updates:
  - data: [EMAIL, CCN]
    rows: 1
    severity: medium
deletes:
  - data: any
    rows: 1
    severity: medium
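Because each access type holds a YAML list, it seems reasonable to give different data locations different row limits within one access type. The sketch below assumes that multiple list entries are accepted under reads, which this section does not state explicitly; the labels are the ones used in the sample policy.

```yaml
# Sketch only: using more than one entry under `reads` is an assumption
# based on the list syntax of contexted rules.
reads:
  # No per-statement limit on EMAIL.
  - data: [EMAIL]
    rows: any
  # At most 1 record per statement for the more sensitive locations.
  - data: [CCN, SSN]
    rows: 1
```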
Optional fields in a contexted rule
Users can also specify the following optional fields in a contexted rule:
- additionalChecks (string): constraints on the data access, specified in Rego. See Additional checks.
- datasetRewrites ([object]): defines how requests should be rewritten in the case of policy violations. See Request rewriting.
- severity (string): severity level that's recorded when someone violates this rule. This is an informational value. Settings: (low | medium | high). If not specified, the severity is considered to be low.
Example with optional fields
For example, the following rule from the sample policy specifies that individuals belonging to the user group analyst can read 10 rows at a time from any of the data locations covered by this policy (EMAIL, CCN, and SSN). They can write 1 row at a time to the locations EMAIL and CCN. Finally, they can delete 1 row at a time from any of the data locations covered by this policy, provided they are using the psql application to do it.
identities:
  groups: [analyst]
reads:
  - data: any
    rows: 10
updates:
  - data: [EMAIL, CCN]
    rows: 1
    severity: medium
deletes:
  - data: any
    rows: 1
    severity: medium
    additionalChecks: |
      is_valid_request {
        client.applicationName == "psql"
      }
Additional checks
Beyond specifying which and how much data can be accessed in the data and rows fields, you can impose more sophisticated constraints by adding the additionalChecks field to a contexted rule. The additionalChecks field contains a rule you'll write in the Rego language. The checks you specify in this field will be evaluated each time the contexted rule applies to an access request. Specify each check in the form of a Rego rule named is_valid_request, which needs to evaluate to true for the access attempt to be considered an allowed request. Otherwise the request will be considered a policy violation.
Each rule can evaluate attributes of the access request that are made available in the activity log. This information is exposed in the context of the Rego rule through the following variables that represent top-level fields in the activity log:
- identity: information about the entity performing the observed data access
- client: information about the client application from which the data is accessed
- repo: information about the repository being accessed
- request: information about the request itself
- tags: values provided in the request comment via the pass-through CyralTags
Attributes nested inside these top-level fields can be accessed using dot notation (e.g. identity.endUser, client.applicationName, repo.type, and so on).
As an example, the following additional check makes the access it is attached to valid only if the request comes through a psql client. A more sophisticated example is provided in the Examples section at the end of this document (Example 6: Only allow users to see data pertaining to themselves).
additionalChecks: |
  is_valid_request {
    client.applicationName == "psql"
  }
In the above example, we use the | operator, which denotes a multiline string in YAML. See this page for more information on specifying multiline strings in YAML.
note
The Rego language defines a Rego module as comprising a Package declaration, a set of Import statements for declaring data dependencies, and a set of Rules. In this context, users need only specify Rules, omitting Package declaration and Import statements.
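Since the expressions in a Rego rule body are joined with logical AND, one check can test several of the activity log variables at once. The following is a minimal sketch under that reading; the end user value is hypothetical and is only there to illustrate the identity.endUser attribute.

```yaml
# Sketch only: the end user value below is a hypothetical example.
deletes:
  - data: any
    rows: 1
    additionalChecks: |
      is_valid_request {
        # Every expression in the body must hold for the rule to be true.
        client.applicationName == "psql"
        identity.endUser == "nancy.drew@hhiu.us"
      }
```

Here a delete would be treated as allowed only when it comes through psql and is made by that one end user; any other combination would count as a policy violation.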
Request rewriting
You can specify how a read request should be rewritten when that request would otherwise violate your policy. This allows you to place constraints on what the data user can retrieve.
Specify this by adding the datasetRewrites field in your contexted rule. The datasetRewrites field contains an array of objects with the following structure:
- repo (string): the name of the repository that the rewrite applies to
- dataset (string): the dataset that should be rewritten
  - in the case of Snowflake, this denotes a fully qualified table name in the form <database>.<schema>.<table>
- parameters ([string]): the set of parameters used in the substitution request; these are references to fields in the activity log, as described in the Additional checks section above
- substitution (string): the request used to substitute references to the dataset
For example, the following contexted rule specifies a rewrite that is triggered when a request that reads EMAIL data would produce a policy violation. In that case, any references to the fully qualified table myDb.finance.customers will be replaced with the subquery SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:, where :identity.endUser: is replaced with the value of the identity.endUser field in the activity log.
reads:
  - data: [EMAIL]
    rows: 10
    datasetRewrites:
      - repo: claims
        dataset: myDb.finance.customers
        parameters: [identity.endUser]
        substitution: "SELECT * FROM myDb.finance.customers WHERE email=:identity.endUser:"
As a more specific example, suppose an individual makes the following query, which would cause a policy violation by reading more than the 10-row limit specified above. Suppose also that the individual has accessed the repository using SSO authentication, and is identified as the end user nancy.drew@hhiu.us.
SELECT * FROM myDb.finance.customers;
Given the dataset rewrite specification in the example above, the Cyral sidecar would rewrite the query such that the receiving database sees the following query.
SELECT * FROM (SELECT * FROM myDb.finance.customers WHERE email='nancy.drew@hhiu.us');
note
Currently, parameter substitutions take place even within string literals. For example, the substitution "SELECT * FROM myDb.finance.customers WHERE greeting = 'Hello, :identity.endUser:'" contains the string literal 'Hello, :identity.endUser:'. During request rewriting, the sidecar will substitute :identity.endUser: with whatever value is in the identity.endUser field in the activity log associated with the data access.