Random Forest

TripleBlind supports Random Forest models using the RandomForestClassifier and RandomForestRegressor classes of scikit-learn. Both Federated and SMPC inference are supported on these models. Inference is supported for scikit-learn models as well as for a subset of models exported in PMML format.

Operation

  • When using add_agreement() to forge an agreement on a trained Random Forest, use the positioned asset’s UUID for the operation parameter.
  • When using add_agreement() to allow a counterparty to use your dataset for model training, or when using create_job() to train a Random Forest, use Operation.RANDOM_FOREST_TRAIN for the operation parameter.

Parameters

Training parameters

  • train_type: str = "classification" # "classification" or "regression"
  • random_forest_params: dict = defaultdict
  • test_size: float = 0.0

Inference parameters

  • infer_type: str = "classification" # "classification" or "regression"
  • security: str = "fed" # "fed" or "smpc"
  • output: str = "classification" # "classification" or "probability"
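
The inference parameters listed above can be assembled and checked before a job is submitted. The following is a minimal sketch using only the Python standard library. The helper build_inference_params is hypothetical and is not part of the TripleBlind SDK; only the parameter names, defaults, and allowed values are taken from the list above.

```python
# Hypothetical helper, not part of the TripleBlind SDK.
# Parameter names, defaults, and allowed values come from the list above.
ALLOWED_VALUES = {
    "infer_type": {"classification", "regression"},
    "security": {"fed", "smpc"},
    "output": {"classification", "probability"},
}
DEFAULTS = {
    "infer_type": "classification",
    "security": "fed",
    "output": "classification",
}


def build_inference_params(**overrides):
    """Merge overrides onto the documented defaults and reject unknown values."""
    params = dict(DEFAULTS)
    for name, value in overrides.items():
        if name not in ALLOWED_VALUES:
            raise KeyError(f"unknown inference parameter: {name!r}")
        if value not in ALLOWED_VALUES[name]:
            allowed = sorted(ALLOWED_VALUES[name])
            raise ValueError(f"{name} must be one of {allowed}, got {value!r}")
        params[name] = value
    return params


# Example: SMPC inference that returns class probabilities.
params = build_inference_params(security="smpc", output="probability")
print(params)
```

Validating the values locally surfaces a typo such as "smcp" before any job is created. How the resulting dictionary is passed to the SDK is not shown here, since that depends on the SDK call being used.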

Predictive Model Markup Language (PMML)

Predictive Model Markup Language (PMML) is the leading standard for statistical and data mining models and is supported by over 30 vendors and organizations. With PMML, it is easy to develop a model on one system using one application and deploy the model on another system using another application, simply by transmitting an XML configuration file.

In general, TripleBlind will be a consumer of PMML models as a means for a user to load a model into the TripleBlind system (i.e., create a model asset owned by the user).

TripleBlind has added support for Predictive Model Markup Language (PMML) models to perform inference with both Federated and SMPC security. Currently, a subset of the full PMML specification is supported, including General Regression and Tree Models.
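
Because only part of the PMML specification is supported, it can help to check which model element a PMML file contains before creating an asset from it. The sketch below uses Python's standard xml.etree.ElementTree module. The function names and the SAMPLE document are illustrative assumptions, not part of the TripleBlind SDK. The element names (Header, DataDictionary, TreeModel, and so on) come from the PMML schema.

```python
import xml.etree.ElementTree as ET

# Children of the <PMML> root that are not model definitions (per the PMML schema).
NON_MODEL_ELEMENTS = {
    "Header",
    "MiningBuildTask",
    "DataDictionary",
    "TransformationDictionary",
    "Extension",
}


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts in front of tag names."""
    return tag.rsplit("}", 1)[-1]


def model_elements(pmml_text):
    """Return the names of the model elements directly under the PMML root."""
    root = ET.fromstring(pmml_text)
    if local_name(root.tag) != "PMML":
        raise ValueError("not a PMML document")
    return [
        local_name(child.tag)
        for child in root
        if local_name(child.tag) not in NON_MODEL_ELEMENTS
    ]


# Illustrative, heavily truncated PMML document (not a usable model).
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/>
  <DataDictionary numberOfFields="0"/>
  <TreeModel functionName="classification"/>
</PMML>"""

print(model_elements(SAMPLE))
```

For the sample document this reports a single TreeModel element. A file whose model element is something other than a General Regression or Tree Model may fall outside the supported subset.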

Usage Notes

  • Data transformations should be defined with TripleBlind preprocessors and not inside the PMML definition.
  • To use the infer() method, it is recommended that the asset be cast to the ModelAsset class.
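
As a rough way to apply the first usage note, a PMML file can be scanned for transformations before it is uploaded. In PMML, transformations are declared as DerivedField elements inside TransformationDictionary or LocalTransformations. The sketch below lists them using the standard library. The function name and the SAMPLE document are illustrative assumptions, not part of the TripleBlind SDK.

```python
import xml.etree.ElementTree as ET


def derived_fields(pmml_text):
    """List the names of DerivedField elements, i.e. transformations defined in the PMML."""
    root = ET.fromstring(pmml_text)
    return [
        element.get("name")
        for element in root.iter()
        # ElementTree prefixes tags with '{namespace}', so compare the local name only.
        if element.tag.rsplit("}", 1)[-1] == "DerivedField"
    ]


# Illustrative, heavily truncated PMML document with one embedded transformation.
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/>
  <DataDictionary numberOfFields="0"/>
  <TransformationDictionary>
    <DerivedField name="age_scaled" optype="continuous" dataType="double"/>
  </TransformationDictionary>
</PMML>"""

print(derived_fields(SAMPLE))
```

An empty list suggests the file defines no transformations of its own. Any names returned point to steps that would need to be expressed with TripleBlind preprocessors instead.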

Limitations

  • Currently we support inference for random forest binary classification models exported from R’s randomForest package.
  • Only numerical data is supported.
  • The scikit-learn parameter n_estimators (number of trees in the forest) is not supported. This is due to our privacy-preserving implementation.
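
The numerical-data limitation above can be checked against a PMML file's DataDictionary. The sketch below assumes that the limitation applies to input fields only, and that "numerical" corresponds to the PMML data types integer, float, and double. Both are assumptions rather than documented TripleBlind behavior, and the function name and SAMPLE document are illustrative.

```python
import xml.etree.ElementTree as ET

# Assumption: these PMML dataType values count as "numerical".
NUMERIC_TYPES = {"integer", "float", "double"}


def non_numeric_inputs(pmml_text):
    """Return the names of non-target DataFields whose dataType is not numeric."""
    root = ET.fromstring(pmml_text)

    def local_name(tag):
        # ElementTree prefixes tags with '{namespace}'; keep the local part only.
        return tag.rsplit("}", 1)[-1]

    # Fields marked as the prediction target are excluded from the check.
    targets = {
        element.get("name")
        for element in root.iter()
        if local_name(element.tag) == "MiningField"
        and element.get("usageType") in ("target", "predicted")
    }
    return [
        element.get("name")
        for element in root.iter()
        if local_name(element.tag) == "DataField"
        and element.get("name") not in targets
        and element.get("dataType") not in NUMERIC_TYPES
    ]


# Illustrative, heavily truncated PMML document (not a usable model).
SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header/>
  <DataDictionary numberOfFields="3">
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="city" optype="categorical" dataType="string"/>
    <DataField name="label" optype="categorical" dataType="string"/>
  </DataDictionary>
  <TreeModel functionName="classification">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="city"/>
      <MiningField name="label" usageType="target"/>
    </MiningSchema>
  </TreeModel>
</PMML>"""

print(non_numeric_inputs(SAMPLE))
```

In the sample, the string-typed city field is flagged, while the label field is skipped because it is the target.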

Inference parameters

  • security: str = "fed" # "fed" or "smpc"