Skip to main content
Version: v2

Masking Data

In this tutorial, you will learn how to use Canner Enterprise's SQL Explorer and UDF functions to mask your data and work with the workspace authorization mechanism to ensure compliance with sensitive data usage regulations.

Before starting

Before starting this tutorial, you will need to have completed Connecting Data Sources in Canner Enterprise and learned about Import Data to your Workspace.

What is Data Masking?

Data masking is a processing method of masking or replacing data to protect specific sensitive or private information, reducing the risk of data leakage while maintaining usability.

Basics: Data Masking in SQL Explorer

  1. Enter the workspace, and add the data you want to mask in 【Tables】 >【Create a Table】.

  2. Go to 【SQL Explorer】, enter ANSI SQL Query in the editor to mask: For example, you have a raw Table customer_table

    NamePhone
    David Chen25-989-741-2988
    Jennifer Lin23-768-687-3665
    Henry Wu11-719-748-3364
    Amber Lee14-128-190-5944

    The following query replaces the customer's name with an abbreviation and hides the last four digits of the phone number as "****"

    SELECT
    substr(split_part(Name,' ',1),1,1)||'. '||substr(split_part(Name,' ',2),1,1)||'.' AS Name_abbr,
    substr(Phone,1,11)||'****'AS Phone_masked
    FROM customer_table
  3. Write and mask 【SQL】 > Click 【Run】, and the result after masking will be displayed in 【Result】 >【View(magnifying glass)】.

    1_data_masking

    sql_table

  4. Click 【Save as (save mark)】, and you can save the masked data as View or Materialized View for subsequent output and application.

    2_data_masking

Advanced: with UDF function

If your masking logic is complex, or you want all users to use the same business logic to process data, you can use the UDF function of Canner Enterprise to help.

  1. First, upload your custom mask function on the 【UDF】 page: The function you upload needs to be written in Java; the following example MaskFunction function will replace the entire specified string field with *

    package com.brandboat.udfs.scalar;

    import io.airlift.slice.Slice;
    import io.trino.spi.function.Description;
    import io.trino.spi.function.ScalarFunction;
    import io.trino.spi.function.SqlType;
    import io.trino.spi.type.StandardTypes;

    import static io.airlift.slice.Slices.utf8Slice;

    public class MaskFunction
    {
    private MaskFunction() {}

    @Description("mask value in column")
    @ScalarFunction("mask_column")
    @SqlType(StandardTypes. VARCHAR)
    public static Slice maskColumnSlice(@SqlType(StandardTypes.VARCHAR) Slice value)
    {
    return utf8Slice("*****");
    }
    }

    After writing the function, you must save the java file in .jar format and upload it.

  2. After the upload is complete, click to enter the function file you want to use in 【Action】 >【Edit Access Control】 enable Enable and set 【Privilege Type】 to Public, All workspaces can use this function. If you only want to allow specific workspaces to be used, please set 【Privilege Type】 to Private, and select which workspaces you want to share the specified function to in 【Share to workspace】.

  3. Enter the workspace, and add the data you want to mask in 【Tables】 >【Create a Table】.

  4. Go to 【SQL Explorer】, enter ANSI SQL Query in the editor, and use the uploaded UDF function to mask: For example, you have a raw Table customer_table

    NamePhone
    David Chen25-989-741-2988
    Jennifer Lin23-768-687-3665
    Henry Wu11-719-748-3364
    Amber Lee14-128-190-5944

    The following query uses the MaskFunction function to replace the customer's name with an "*" number and mask the last four digits of the phone

    SELECT
    mask_column(Name) AS Name_masked,
    substr(Phone,1,11)||'****' AS Phone_masked
    FROM customer_table
  5. Write and mask 【SQL】 > Click 【Run】, and the result after masking will be displayed in 【Result】 >【View(magnifying glass)】.

    1_data_masking

    udf_table

  6. Click 【Save as (save mark)】, and you can save the masked data as View or Materialized View for subsequent output and application.

    2_data_masking

Next step

You can use How Access Control works in Canner Enterprise and use the Using Semantic Layers to Manage Data Semantics in the workspace to specify that other users can only use your The masked data achieves data security requirements.