Decision Tree

Securely and privately train a Decision Tree model over vertically-partitioned datasets, then run secure federated inference on the trained model.

Operation

When using add_agreement() to forge an agreement on a trained model, use the positioned asset’s UUID for the operation parameter.

When using add_agreement() to allow a counterparty to use your dataset for model training, or using create_job() to train a Decision Tree, use the appropriate operation parameter below.

PSI Vertical Decision Tree

Use Operation.PSI_VERTICAL_DECISION_TREE_TRAIN to identify an overlap of matching records across datasets, and then train a Decision Tree classification or regression model on the vertically-partitioned intersection.
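To make "vertically-partitioned intersection" concrete, here is a minimal plain-Python sketch. It is not the secure protocol: it joins the rows in the clear purely to show which records and columns end up in the training set. The function name, the sample rows, and the column names are all hypothetical.

```python
# Illustrative only: a plain, NON-private join. The real operation finds the
# overlap without revealing non-matching records to the other party.

def vertical_intersection(left_rows, right_rows, left_key, right_key):
    """Join two lists of dict rows on their match columns, keeping only IDs present in both."""
    right_by_id = {row[right_key]: row for row in right_rows}
    joined = []
    for row in left_rows:
        match = right_by_id.get(row[left_key])
        if match is None:
            continue  # record exists only on the left side, so it is dropped
        merged = dict(row)
        # Add the right side's columns, skipping its copy of the match column.
        merged.update({k: v for k, v in match.items() if k != right_key})
        joined.append(merged)
    return joined


# Hypothetical data: each party holds different columns for overlapping IDs.
bank = [
    {"ID": 1, "income": 52000},
    {"ID": 2, "income": 61000},
    {"ID": 3, "income": 48000},
]
retailer = [
    {"ID": 2, "spend": 310, "churned": 0},
    {"ID": 3, "spend": 95, "churned": 1},
    {"ID": 4, "spend": 40, "churned": 1},
]

overlap = vertical_intersection(bank, retailer, "ID", "ID")
```

Only IDs 2 and 3 appear in both datasets, so the model would train on those two rows, with features drawn from both parties.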

Parameters

When running the training protocol explicitly (PSI_VERTICAL_DECISION_TREE_TRAIN) using create_job():

decision_tree: Dict{ "regression": bool, "max_depth": int }

  • Set regression to False for classification.
  • Setting max_depth to 3 or more increases execution time sharply.

psi: Dict{ "match_column": Union[str, List[str]] }

  • Name of the column to match. If the name is not the same in all datasets, provide a list of the matching column names, one for each dataset, in order.
  • If a single field name is provided, each dataset must use that same name for its match column, e.g., "ID".
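The match_column rules above can be sketched as a small normalizing helper. This is an assumption-laden illustration: resolve_match_columns is a hypothetical local function, not part of the SDK, and the length check is a guess at sensible validation rather than documented behavior.

```python
from typing import List, Union


def resolve_match_columns(match_column: Union[str, List[str]], n_datasets: int) -> List[str]:
    """Return one match column name per dataset, in dataset order (hypothetical helper)."""
    if isinstance(match_column, str):
        # A single name means every dataset uses that same column name.
        return [match_column] * n_datasets
    if len(match_column) != n_datasets:
        raise ValueError(
            f"expected {n_datasets} match column names, got {len(match_column)}"
        )
    return list(match_column)
```

For example, resolve_match_columns("ID", 2) yields one "ID" entry per dataset, while a list such as ["id0", "id1"] is passed through unchanged.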

target_column: str

  • The name of the target column for the training.
  • If multiple target columns are found with the same name, an exception will be thrown.
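As a rough sketch, the training parameters listed above could be assembled into one dictionary before being handed to create_job(). The helper name build_train_params, its defaults, and the validation are hypothetical. The top-level placement of target_column next to decision_tree and psi is also an assumption. How the dictionary is actually passed to create_job() is not shown in this section, so that call is left out.

```python
def build_train_params(match_column, target_column, regression=False, max_depth=2):
    """Assemble training parameters as a plain dict (hypothetical helper, not an SDK call)."""
    if not isinstance(regression, bool):
        raise TypeError("regression must be a bool")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(max_depth, bool) or not isinstance(max_depth, int) or max_depth < 1:
        raise ValueError("max_depth must be a positive integer")
    return {
        "decision_tree": {"regression": regression, "max_depth": max_depth},
        "psi": {"match_column": match_column},
        "target_column": target_column,
    }


# Classification on a shared "ID" column; "churned" is a made-up target name.
params = build_train_params("ID", "churned")
```

The default max_depth of 2 is an arbitrary choice that stays below the depth at which execution time is noted to rise sharply.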

Inference parameters

psi: Dict{ "match_column": List[str] = ["id0", "id1"] }

Limitations

  • Supported for up to 100,000 samples.
  • The owned dataset must be supplied as the first (or left-side) dataset asset.
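A local pre-flight check along these lines could catch both limitations before a job is submitted. This is only a sketch: preflight_problems and its arguments are hypothetical, and it assumes the 100,000-sample limit applies to the row count you pass in, which this section does not spell out.

```python
MAX_SAMPLES = 100_000  # limit stated in this section


def preflight_problems(dataset_ids, owned_id, n_samples):
    """Return a list of human-readable problems; an empty list means none were found."""
    problems = []
    if n_samples > MAX_SAMPLES:
        problems.append(f"{n_samples} samples exceeds the supported {MAX_SAMPLES}")
    if not dataset_ids or dataset_ids[0] != owned_id:
        problems.append("the owned dataset must be first in the dataset list")
    return problems
```

Calling preflight_problems(["mine", "theirs"], "mine", 50_000) returns an empty list, while putting the counterparty's dataset first or exceeding the sample limit adds a message for each issue.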