Data Map
The Data Map is an inventory of your data locations that allows you to establish a short name, called a LABEL, for each data location (like a table, collection, or S3 bucket) that you want to protect. When you write a policy rule, you'll use LABELs rather than specific table and column names to specify which data the rule protects.
Each LABEL maps to one or more specific locations (for example, a specific column in a specific database). Because a single LABEL can refer to many locations in many repositories, the Data Map gives you the ability to write a policy that treats your data consistently, even when that data is spread across many data repositories.
tip
Cyral can automatically watch for database columns and other locations that contain data that you might want to protect. See Automatic Data Map.
info
Data Maps are used to label data locations so that you can manage and monitor them individually or in groups. In particular, once a table or other location is listed in your Data Map, Cyral can:
- control access to it according to your policy
- let users request access via the Cyral chatbot
- log its data activity
Add or edit a Data Map
There are three ways to work on your Data Maps in Cyral:
- create and edit them in the UI
- create and edit them in a YAML file
- have Cyral's Automatic Data Map watch for sensitive data locations and prompt you to add them to a Data Map.
Add or edit a Data Map through the UI
In the Cyral control plane UI:
Click Data Repos ➡️ your repo's name ➡️ Data Map ➡️ Mappings
Add or edit:
- To add a new mapping, click Add Mapping
- To edit a mapping, find the LABEL you want to edit, click the 🔽 to expand it, and click ✏️ to edit.
In the Edit Mapping window, specify the label's name and the data location it will apply to:
- Label: If you're creating a new LABEL, name it here. Think of this as a shorthand name you can use in your policy to refer to one or many database columns or other locations.
- The remaining fields describe a location to be included in this label:
  - Schema or Database: name of the schema, database, or Dremio space
  - Table, Collection, or Bucket: name of the database table, MongoDB collection, or AWS bucket
  - Column, Field, or Object Key: database column name or AWS object key
Click Create or Save to save your Data Map.
Add or edit a Data Map as a YAML file
In the Cyral control plane UI:
Click Data Repos ➡️ your repo's name ➡️ Data Map ➡️ View as YAML
Click the Edit button.
Use the text editing field to type or paste your Data Map. See the Structure section below for help with the syntax.
Click Save to save your Data Map.
Structure
The Data Map follows this structure:
{ LABEL }:
  attributes: [{ ATTRIBUTE_LOCATION }, ...]
And the fields are defined as follows:
- {LABEL} (string): label given to the data specified in the corresponding list. Important: See the section Limits on how you create and use labels, below.
- The value assigned to each label is an object; the field described here is:
  - attributes ([string]): a list of data locations in this repo that will be included in this LABEL. You specify each attribute using the format {SCHEMA}.{TABLE}.{FIELD}, where:
    - SCHEMA is the name of the database schema, or Dremio space, if any
    - TABLE is the name of the database table, MongoDB collection, or AWS bucket
    - FIELD is the database column name or AWS object key
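As a minimal sketch of this structure, the following Data Map assigns a single label to two locations in the same repo. The first location comes from the example later on this page; the second (finance.vendors.contact_email) is a hypothetical name used only for illustration.

```yaml
# One label, two locations, each written as {SCHEMA}.{TABLE}.{FIELD}
EMAIL:
  attributes:
    - finance.customers.email        # schema: finance, table: customers, field: email
    - finance.vendors.contact_email  # hypothetical second location
```

A policy rule that refers to EMAIL would then cover both columns without naming either one directly.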
tip
Dremio users: When you refer to data in a Dremio repository, please include the complete location, with each nested Dremio space separated by a . (dot). For example, an attribute my_attr contained by table my_tbl within space inner_space within space outer_space would be referenced as outer_space.inner_space.my_tbl.my_attr.
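For instance, the nested Dremio location described in the tip above might appear in a Data Map as follows. The label name MY_LABEL is a placeholder chosen for this sketch; the attribute path is the one given in the tip.

```yaml
# MY_LABEL is a hypothetical label name
MY_LABEL:
  attributes:
    # outer space, then inner space, then table, then attribute
    - outer_space.inner_space.my_tbl.my_attr
```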
Data Map example
In the example below, we assign labels to data in two repos, claims and loans. The label CCN is assigned to the attribute ccn in the table customers in the finance schema of the claims repository, as well as to the attribute credit_card_number in the table customers in the applications schema of the loans repository. The labels EMAIL and SSN are assigned to the email and social security number data in each repo, respectively, following the same pattern.
Data Map for the claims repo:
CCN:
  attributes:
    - finance.customers.ccn
EMAIL:
  attributes:
    - finance.customers.email
SSN:
  attributes:
    - finance.customers.ssn
Data Map for the loans repo:
CCN:
  attributes:
    - applications.customers.credit_card_number
EMAIL:
  attributes:
    - applications.customers.email
SSN:
  attributes:
    - applications.customers.social_security_number
In the next section, we'll show a sample policy that sets access rules for each of these data labels (CCN, EMAIL, and SSN). The policy applies to all repositories included in the Data Map.
Limits on how you create and use labels
When creating and using a LABEL, please observe these limits:
- A LABEL can refer to one or many attributes (for example, tables, fields, or columns) in one or many repositories.
- A given repository location (a table, collection, field, column, or bucket) must be included in only one LABEL.
- Each LABEL must be used in only one policy. You may use the LABEL in one or many rules in the policy.
If your Data Maps and policies violate any of these limits, the policy update will fail. These limits prevent users from writing conflicting rules about the same data.
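To illustrate the second limit, here is a sketch of a Data Map that lists the same location under two labels. The label PAYMENT_CARD is hypothetical and exists only to show the conflict. Based on the limits above, a policy update that relies on a map like this would be expected to fail.

```yaml
# INVALID sketch: finance.customers.ccn appears under two labels
CCN:
  attributes:
    - finance.customers.ccn
PAYMENT_CARD:                  # hypothetical label, for illustration only
  attributes:
    - finance.customers.ccn    # same location as above, which violates the limit
```

Removing the location from one of the two labels resolves the conflict, since each location then belongs to exactly one LABEL.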