Blind Report
Blind Report allows you to position a database-backed query with predefined configurable parameters. Users can configure the query using these predefined options, have it run against your database, and receive a report table.
This is a powerful operation that lets the data steward permit only specific, controlled access to the data they want exposed to the consumer. For example, a report could be defined that allows a user to select a year and month along with a company division to generate salary statistics by ethnicity or gender for use in compliance reporting.
Any number of variables and queries of any complexity are supported. See examples/Blind_Report for documentation and more information.
Blind Report is a Safe operation (see Privacy Assurances and Risk in the Getting Started section of the User Guide).
Operation
- See the ReportAsset documentation for positioning methods.
- When using add_agreement() to forge an agreement for a counterparty to use the Blind Report, use the positioned asset’s UUID for the operation parameter.
- When using create_job() to run the report in a process, use the positioned asset for the operation parameter.
Parameters
Positioning parameters
Blind Reports are positioned using create methods that accept connection details similar to their DatabaseDataset counterparts. Additional parameters include:
query_template: str
- The query template uses {{brackets}} to identify which parameters will be exposed as configurable by a user.
params: List[ReportParameter]
- ReportParameter methods (create_string, create_float, and create_int) should be used to generate the acceptable format.
- Configurable options should be added using ParameterOption.
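To make the {{bracket}} substitution concrete, here is a minimal Python sketch of how a template restricted to owner-approved options could be rendered. It does not use the TripleBlind SDK, and the platform performs the actual rendering; ALLOWED_OPTIONS, render_template, the payroll table, and the column names are all hypothetical and exist only for illustration.

```python
import re

# Hypothetical allow-list: each template parameter maps to the values
# the data owner is willing to expose.
ALLOWED_OPTIONS = {
    "demographic": {"Gender", "Ethnicity"},
    "pay": {"Salary", "Overtime"},
}

# Hypothetical template and table name, modeled on the {{bracket}} syntax.
QUERY_TEMPLATE = (
    "SELECT Dept_Name, {{demographic}}, AVG({{pay}}) AS average_{{pay}} "
    "FROM payroll GROUP BY Dept_Name, {{demographic}};"
)

PLACEHOLDER = re.compile(r"\{\{(\w+)\}\}")


def render_template(template, values):
    """Replace each {{name}} with its value, rejecting values off the allow-list."""

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"missing value for parameter {name!r}")
        value = values[name]
        if value not in ALLOWED_OPTIONS.get(name, set()):
            raise ValueError(f"{value!r} is not an allowed option for {name!r}")
        return value

    return PLACEHOLDER.sub(substitute, template)


rendered = render_template(
    QUERY_TEMPLATE, {"demographic": "Gender", "pay": "Salary"}
)
print(rendered)
```

Because only allow-listed values are ever substituted, a user cannot smuggle arbitrary SQL into the query through a parameter, which is the point of restricting parameters to predefined options.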
Report parameters
operation: ReportAsset
- When running the Blind Report, the Asset UUID of the positioned report should be supplied here as the algorithm to be run.
dataset: []
- This should be left blank when running a Blind Report, i.e. dataset=[].
params: Dict{"report_values": {"param_1": "value_1"}, …, {}}
- This is a JSON string supplying the desired parameters to be run within the Blind Report job.
- Use get_report_params() to understand the configurable parameters and their options.
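The shape of the params value can be illustrated in plain Python. The parameter names and values below (demographic, pay) are assumed for illustration; the real names and options come from the positioned report, as returned by get_report_params(). Both the dictionary and its serialized JSON form are shown, since which one a given SDK version expects is not confirmed here.

```python
import json

# Hypothetical choices for a report that exposes "demographic" and "pay".
report_values = {"demographic": "Gender", "pay": "Salary"}

# The chosen values are nested under the "report_values" key.
params = {"report_values": report_values}

# Serialized form, in case a JSON string is required.
params_json = json.dumps(params)
print(params_json)
```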
Limitations
- Blind Report is not supported for file-based assets like CSVs or Amazon S3.
- Blind Report is not supported for MongoDB assets.
- This operation does not permit the use of sql_transform preprocessors by the data user.
k-Grouping
Operations that return data (e.g. Blind Query, Blind Join, and Blind Stats) usually have embedded k-Grouping safeguards that reduce the risk of leaking data when fewer than a threshold number of records make up a group or the total output of an operation. Unlike these operations, Blind Report is not automatically protected by k-Grouping, as it is fully defined by the data owner.
ℹ️ As a best practice, we encourage using a SQL HAVING clause to enact a purposeful k-Grouping safeguard within the parameterized query in your Blind Report. For instance, the query in the example script (examples/Blind_Report/1_position_bigquery_report.py) is:
query_template = """
    SELECT Dept_Name, {{demographic}}, AVG({{pay}}) as average_{{pay}}
    FROM tripleblind_datasets.city_of_somerville_payroll
    GROUP BY Dept_Name, {{demographic}};
"""
This can be modified to respect a k-Grouping safeguard by introducing a clause that only returns groups with at least a certain number of records:
query_template = """
    SELECT Dept_Name, {{demographic}}, AVG({{pay}}) as average_{{pay}}
    FROM tripleblind_datasets.city_of_somerville_payroll
    GROUP BY Dept_Name, {{demographic}}
    HAVING COUNT({{demographic}}) >= 5;
"""
With this clause, each returned group contains at least 5 members, so the report is less likely to inadvertently give a malicious actor the means to discern potentially personally identifiable information from its contents (e.g. the average salary of a single individual).
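The effect of the HAVING safeguard can be tried locally. The sketch below uses Python's built-in sqlite3 module with a small made-up payroll table in place of the BigQuery dataset; the table, column names, and rows are invented for illustration, and the template parameters are already filled in with Gender and Salary.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE payroll (Dept_Name TEXT, Gender TEXT, Salary REAL)")

# Made-up data: five employees in Finance/F, a single employee in Legal/M.
rows = [("Finance", "F", 50000.0 + 1000 * i) for i in range(5)]
rows.append(("Legal", "M", 90000.0))
conn.executemany("INSERT INTO payroll VALUES (?, ?, ?)", rows)

BASE_QUERY = (
    "SELECT Dept_Name, Gender, AVG(Salary) AS average_Salary "
    "FROM payroll GROUP BY Dept_Name, Gender"
)

# Without the safeguard, the single-person Legal group is returned,
# exposing that individual's exact salary.
unprotected = conn.execute(BASE_QUERY + " ORDER BY Dept_Name").fetchall()

# With the safeguard, groups of fewer than 5 records are suppressed.
protected = conn.execute(
    BASE_QUERY + " HAVING COUNT(Gender) >= 5 ORDER BY Dept_Name"
).fetchall()

print(unprotected)
print(protected)
```

Note that COUNT applied to a column counts only non-null values, so rows with a missing demographic value do not count toward the threshold; COUNT(*) would count every row in the group instead.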