Module tripleblind.report_asset
Specialized Asset representing a parameterized database report.
The ReportAsset wraps a generic database query, providing a mechanism to publish a limited query (such as an SQL database query) with very tightly controlled options. These ReportAssets can then be easily executed by others to perform the limited variety of queries upon demand. Like all other Assets, the report can be manually approved or access can be granted via an Agreement.
Reports only allow changes to the limited parameters and validate provided parameters before usage, ensuring the report consumer can only utilize the intended range of options. This enables powerful but strongly controlled reporting against even the most sensitive of databases.
Parameters must be an integer, float or a string from an explicit list of options. Multiple parameters can be used by a report, including optional values and default. The parameters are used by the query_template, utilizing the Mustache templating language (see https://mustache.github.io).
Connection strings can be templated using Mustache as well, allowing for secrets to be used in the connection string. For example, a connection string could be defined as: "mssql+pyodbc://{{secret_username}}:{{secret_password}}@myserver:3306/payroll" Or secrets could be included in the parameters: username="{{secret_username}}", password="{{secret_password}}" if using a create method which accepts those parameters. See "Using Named Secrets" under https://deveval.tripleblind.app/portal/docs/user-guide/asset-owner-operations for more details.
Example report definition:
demographic_param = ReportParameter.create_string(
name="demographic",
description="Specific demographic to filter the report on",
options=[
ParameterOption("Gender"),
ParameterOption("Ethnicity"),
],
)
query_template = '''
select Dept_Name, {{demographic}}, avg(Total_Gross) as average_Pay
from tripleblind_datasets.city_of_somerville_payroll
group by Dept_Name, {{demographic}};
'''
report = tb.report_asset.BigQueryReport.create(
gcp_project_id,
bigquery_dataset,
credentials,
query_template=query_template,
params=[demographic_param],
name=f"EXAMPLE- Payroll Report",
)
Example report usage:
# Run a payroll report summarizing by Gender.
report = tb.ReportAsset.find("EXAMPLE- Payroll Report")
print(report.run({"demographic": "Gender"}).raw_content)
Classes
class AggregationMethod (value, names=None, *, module=None, qualname=None, type=None, start=1)
-
The aggregation methods available for use in a federated report.
Ancestors
- enum.Enum
Class variables
var COUNT
var MAX
var MEAN
var MIN
var SUM
class AggregationRules (group_by: Union[str, List[str], ForwardRef(None)], aggregates: Optional[Mapping[str, AggregationMethod]], sort_columns: Optional[List[str]] = None, sort_order: Optional[str] = 'asc')
-
Federated Blind Reports require an AggregationRules object which defines the way the results from the federation members get combined. A single report can report multiple values, and each value can be aggregated in a different way. Grouping is optional and will be applied to all the results the same.
Class variables
var aggregates : Optional[Mapping[str, AggregationMethod]]
-
A map of column names to the aggregation method to use.
var group_by : Union[str, List[str], ForwardRef(None)]
-
The column(s) to group the results by.
var sort_columns : Optional[List[str]]
-
The columns to sort the results by.
var sort_order : Optional[str]
-
The order to sort the results by. Default is 'asc'.
Static methods
def create(group_by: Optional[List[str]] = None, aggregates: Optional[Mapping[str, AggregationMethod]] = None, sort_columns: Optional[List[str]] = None, sort_order: Optional[str] = 'asc')
-
Define the aggregation rules for a federated report. At least one group_by or aggregate is required.
Args
group_by
:str
orList[str]
- The column(s) to group the results by.
aggregates
:Mapping[str, AggregationMethod]
- A map of column names to the aggregation method to use.
sort_columns
:List[str]
, optional- The columns to sort the results by.
sort_order
:str
, optional- The order to sort the results by. Default is 'asc'.
Returns
AggregationRules
- The new AggregationRules object.
def from_dict(kvs: Union[dict, list, str, int, float, bool, ForwardRef(None)], *, infer_missing=False) -> ~A
def from_json(s: Union[str, bytes, bytearray], *, parse_float=None, parse_int=None, parse_constant=None, infer_missing=False, **kw) -> ~A
def schema(*, infer_missing: bool = False, only=None, exclude=(), many: bool = False, context=None, load_only=(), dump_only=(), partial: bool = False, unknown=None) -> dataclasses_json.mm.SchemaF[~A]
Methods
def to_dict(self, encode_json=False) -> Dict[str, Union[dict, list, str, int, float, bool, ForwardRef(None)]]
def to_json(self, *, skipkeys: bool = False, ensure_ascii: bool = True, check_circular: bool = True, allow_nan: bool = True, indent: Union[int, str, ForwardRef(None)] = None, separators: Tuple[str, str] = None, default: Callable = None, sort_keys: bool = False, **kw) -> str
class BigQueryReport (uuid: UUID, query_template: str = '', params: List[preprocessor.report_parameters.ReportParameter] = typing.List, connection: Optional[str] = None, connection_opts: Optional[dict] = None, federation_group: Union[str, uuid.UUID, ForwardRef(None)] = None)
-
A parameterized database report asset backed by a view from a BigQuery database.
Ancestors
Static methods
def create(gcp_project: str, bigquery_dataset: str, credentials: Union[str, pathlib.Path], query_template: str, params: List[preprocessor.report_parameters.ReportParameter], name: str, desc: str, is_discoverable: Optional[bool] = False, allow_overwrite: Optional[bool] = False, session: Optional[Session] = None, validate_sql: Optional[bool] = True, post_processing: Optional[str] = None) -> DatabaseReport
-
Creates a parameterized database report using a connection to a BigQuery database.
Args
gcp_project
:str
- The project name of your Google Cloud Project which will be used to cover any query access costs.
bigquery_dataset
:str
- The BigQuery dataset name.
credentials
:str
orPath
- The path of your keyfile.json. See the Google documentation for more details. These credentials will be stored securely on your Access Point; neither TripleBlind nor anyone using your dataset will have access to it.
query_template
:str
- The SQL query template defined using the Mustache templating language for generating the report.
params
:List[ReportParameter]
- List of ReportParameters defining the configurable report parameters.
name
:str
- Name of the new asset.
desc
:str
- Description of the new asset (can include markdown)
is_discoverable
:bool
, optional- Should this asset be listed in the Router index to be found and used by others?
allow_overwrite
:bool
, optional- If False an exception will be thrown if the asset name already exists. If True, an existing asset will be overwritten.
session
:Session
, optional- A connection session. If not specified, the default session is used.
validate_sql
:bool
, optional- If True (the default) the query syntax is checked for common SQL syntax errors.
post_processing
:str
, optional- A string holding a Python function or a filename containing a Python function to use as a post-processing operation on the report output.
Raises
SystemExit
- SQL syntax errors were found in query.
Returns
DatabaseReport
- New DatabaseReport asset on the Router, or None on failure
Inherited members
DatabaseReport
:accesspoint_filename
activate_date
add_agreement
archive
cast
deactivate_date
delete
desc
download
federation_members
filename
find
find_all
get_federation_members
get_report_params
hash
is_active
is_discoverable
is_valid
k_grouping
list_agreements
metadata
name
namespace
position
publish_to_team
retrieve
run
team
team_id
upload
uuid
class DatabaseReport (uuid: UUID, query_template: str = '', params: List[preprocessor.report_parameters.ReportParameter] = typing.List, connection: Optional[str] = None, connection_opts: Optional[dict] = None, federation_group: Union[str, uuid.UUID, ForwardRef(None)] = None)
-
A parameterized database report asset backed by an SQL view of a database
When running a Federated Report, the same SQL query is distributed to be executed against multiple databases. Most SQL is compatible with all databases, but some specific situations require accessing features are not completely standardized. To handle this, the query_template can use DIALECT_xxx values to customize the SQL for the specific database. For example, to use the SQL Server TOP keyword in SQL Server, but LIMIT in Postgres, you could use the following query_template:
SELECT * FROM my_table {{#DIALECT_MSSQL}} TOP {{limit}} {{/DIALECT_MSSQL}} {{#DIALECT_POSTGRESQL}} LIMIT {{limit}} {{/DIALECT_POSTGRESQL}}
Or you could only use the TOP keyword in SQL Server and LIMIT elsewhere:
SELECT * FROM my_table {{#DIALECT_MSSQL}} TOP {{limit}} {{/DIALECT_MSSQL}} {{^DIALECT_MSSQL}} LIMIT {{limit}} {{/DIALECT_MSSQL}}
Supported DIALECT_ values are:
- DIALECT_BIGQUERY
- DIALECT_DATABRICKS
- DIALECT_MONGO
- DIALECT_MSSQL
- DIALECT_MYSQL
- DIALECT_ORACLE
- DIALECT_POSTGRESQL
- DIALECT_REDSHIFT
- DIALECT_SNOWFLAKE
- DIALECT_SQLITE
The active database will be given a value of "1", others will be undefined.
Ancestors
Subclasses
Class variables
var connection : Optional[str]
var connection_opts : Optional[dict]
var federation_group : Union[str, uuid.UUID, ForwardRef(None)]
var params : List[preprocessor.report_parameters.ReportParameter]
var query_template : str
Static methods
def cast(asset: Asset) -> DatabaseReport
-
Convert a generic Asset into a DatabaseReport
This should only be used on an asset known to be a DatabaseReport, no validation occurs during the cast.
Args
asset
:Asset
- A generic Asset
Returns
DatabaseReport
- A DatabaseReport object
def create(query_template: str, params: List[preprocessor.report_parameters.ReportParameter], name: str, desc: str, connection: Optional[str] = None, connection_opts: Optional[dict] = None, credentials_info: Optional[dict] = None, is_discoverable: Optional[bool] = False, allow_overwrite: Optional[bool] = False, session: Optional[Session] = None, validate_sql: Optional[bool] = True, federation_group: Union[ForwardRef('FederationGroup'), str, uuid.UUID, ForwardRef(None)] = None, federation_aggregation: Optional[AggregationRules] = None, post_processing: Optional[str] = None)
-
Creates a parameterized database report.
Args
connection
:str
- The connection string to the remote database connection. Secrets can be included in the connection string using Mustache templating of named secrets, e.g. "mssql+pyodbc://{{secret_username}}:{{secret_password}}@myserver.acme.com:3306/payroll". Any portion of the connection string can be templated, including the host, username, password, database, etc.
query_template
:str
- The SQL query template defined using the Mustache templating language for generating the report. In addition to your query parameters, your template can also include DIALECT_xxx values to customize the SQL for different databases. See the DatabaseReport documentation for more details.
params
:List[ReportParameter]
- List of ReportParameters defining the configurable report parameters.
name
:str
- Name of the new asset.
desc
:str
- Description of the new asset (can include markdown).
connection_opts
:dict
, optional- Optional dictionary of database connection options.
credentials_info
:dict
, optional- Dictionary of credentials information if not provided in the connection string.
is_discoverable
:bool
, optional- Should this asset be listed in the Router index to be found and used by others?
allow_overwrite
:bool
, optional- If False an exception will be thrown if the asset name already exists. If True, an existing asset will be overwritten.
session
:Session
, optional- A connection session. If not specified, the default session is used.
validate_sql
:bool
, optional- If True (the default) the query syntax is checked for common SQL syntax errors.
federation_group
:FederationGroup, str
orUUID
, optional- The group, group name or UUID of the federation group to use when executing this report. If not specified, this is a standalone report.
federation_aggregation
:AggregationRules
, optional- Defines the aggregation to use when executing this report. Required for federated reports, invalid otherwise.
post_processing
:str, Optional
- A Python function or the filename
containing a function to run after the report has been executed.
The function must have the signature:
def postprocess(df: pd.Dataframe, ctx: dict)
The two arguments are the report output data frame and a dict holding the user-selected report parameters as context.
Raises
Exception
- Name must be unique for each parameter.
SystemExit
- SQL syntax errors were found in query.
Returns
DatabaseReport
- New DatabaseReport asset on the Router, or None on failure
Inherited members
ReportAsset
:accesspoint_filename
activate_date
add_agreement
archive
deactivate_date
delete
desc
download
federation_members
filename
find
find_all
get_federation_members
get_report_params
hash
is_active
is_discoverable
is_valid
k_grouping
list_agreements
metadata
name
namespace
position
publish_to_team
retrieve
run
team
team_id
upload
uuid
class DatabricksReport (uuid: UUID, query_template: str = '', params: List[preprocessor.report_parameters.ReportParameter] = typing.List, connection: Optional[str] = None, connection_opts: Optional[dict] = None, federation_group: Union[str, uuid.UUID, ForwardRef(None)] = None)
-
A table asset backed by a view from a Databricks database.
Ancestors
Static methods
def create(access_token: str, server_hostname: str, http_path: str, catalog: str, schema: str, query_template: str, params: List[preprocessor.report_parameters.ReportParameter], name: str, desc: str, is_discoverable: Optional[bool] = False, allow_overwrite: Optional[bool] = False, session: Optional[Session] = None, validate_sql: Optional[bool] = True, post_processing: Optional[str] = None) -> DatabaseReport
-
Create a connection to a Databricks database
You can find the connection details for your Databricks cluster in the Databricks UI. Under the Compute in the sidebar, choose your target cluster. Under the Configuration tab for that cluster expand Advanced Options and choose the JDBC/ODBC tab, where you will find the needed values. See the Databricks documentation for more details: https://docs.databricks.com/en/integrations/compute-details.html
Args
access_token
:str
- A Databricks access token or a secret name. For example, "dapi1234567890abcdef"
server_hostname
:str
- The Databricks server name or a secret name. For example, "community.cloud.databricks.com"
http_path
:str
- The Databricks server name or a secret name. For example, "/sql/protocolv1/o/1234567890123456/0123-456789-abc123"
catalog
:str
- The Databricks catalog name or a secret name. For example, "default"
schema
:str
- The Databricks schema name or a secret name. For example, "default"
query_template
:str
- The SQL query template defined using the Mustache templating language for generating the report.
params
:List[ReportParameter]
- List of ReportParameters defining the configurable report parameters.
name
:str
- Name of the new asset.
desc
:str
- Description of the new asset (can include markdown)
is_discoverable
:bool
, optional- Should this asset be listed in the Router index to be found and used by others?
allow_overwrite
:bool
, optional- If False an exception will be thrown if the asset name already exists. If True, an existing asset will be overwritten.
session
:Session
, optional- A connection session. If not specified, the default session is used.
validate_sql
:bool
, optional- If True (the default) the query syntax is checked for common SQL syntax errors.
post_processing
:str
, optional- A string holding a Python function or a filename containing a Python function to use as a post-processing operation on the report output.
Raises
SystemExit
- SQL syntax errors were found in query.
Returns
DatabaseReport
- New DatabaseReport asset on the Router, or None on failure
Inherited members
DatabaseReport
:accesspoint_filename
activate_date
add_agreement
archive
cast
deactivate_date
delete
desc
download
federation_members
filename
find
find_all
get_federation_members
get_report_params
hash
is_active
is_discoverable
is_valid
k_grouping
list_agreements
metadata
name
namespace
position
publish_to_team
retrieve
run
team
team_id
upload
uuid
class MSSQLReport (uuid: UUID, query_template: str = '', params: List[preprocessor.report_parameters.ReportParameter] = typing.List, connection: Optional[str] = None, connection_opts: Optional[dict] = None, federation_group: Union[str, uuid.UUID, ForwardRef(None)] = None)
-
A parameterized database report backed by a view of a Microsoft SQL database.
Ancestors
Static methods
def create(host: str, port: int, database: str, query_template: str, params: List[preprocessor.report_parameters.ReportParameter], name: str, desc: str, username: Optional[str] = None, password: Optional[str] = None, options: Optional[dict] = {}, is_discoverable: Optional[bool] = False, allow_overwrite: Optional[bool] = False, session: Optional[Session] = None, validate_sql: Optional[bool] = True, post_processing: Optional[str] = None) -> DatabaseReport
-
Creates a parameterized database report using a connection to a Microsoft SQL database.
Args
host
:str
- The host name of the Microsoft SQL database or a secret name. Example: testsqlserver123.database.windows.net
port
:int
- The port number of the Microsoft SQL database.
database
:str
- The name of the Microsoft SQL database to connect to or a secret name. Example: "dev" or "{{secret_database_name}}".
query_template
:str
- The SQL query template defined using the Mustache templating language for generating the report.
params
:List[ReportParameter]
- List of ReportParameters defining the configurable report parameters.
name
:str
- Name of the new asset.
desc
:str
- Description of the new asset (can include markdown)
username
:str
, optional- Username to use in the database connec