Random Forest
TripleBlind supports Random Forest models using the RandomForestClassifier and RandomForestRegressor classes of scikit-learn. Both Federated and SMPC inference are supported on these models. Inference works with scikit-learn models as well as a subset of models exported in PMML format.
Operation
- When using add_agreement() to forge an agreement on a trained Random Forest, use the positioned asset's UUID for the operation parameter.
- When using add_agreement() to allow a counterparty to use your dataset for model training, or when using create_job() to train a Random Forest, use Operation.RANDOM_FOREST_TRAIN for the operation parameter.
Parameters
Training parameters
train_type: str = "classification"
# "classification" or "regression"
random_forest_params: dict = defaultdict
test_size: float = 0.0
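The training parameters above can be assembled and sanity-checked before a job is created. The following is a minimal sketch, not part of the TripleBlind SDK: `build_training_params` is a hypothetical helper, the assumption that `test_size` is a held-out fraction in [0.0, 1.0) is borrowed from scikit-learn conventions, and `max_depth` is used only as an example of a scikit-learn Random Forest argument (whether a given argument is honored is not stated here).

```python
VALID_TRAIN_TYPES = {"classification", "regression"}


def build_training_params(train_type="classification",
                          random_forest_params=None,
                          test_size=0.0):
    """Hypothetical helper: validate and assemble the training parameters."""
    if train_type not in VALID_TRAIN_TYPES:
        raise ValueError(
            f"train_type must be one of {sorted(VALID_TRAIN_TYPES)}, got {train_type!r}"
        )
    # Assumption: test_size is the fraction of rows held out for testing.
    if not 0.0 <= test_size < 1.0:
        raise ValueError("test_size must be in the range [0.0, 1.0)")
    return {
        "train_type": train_type,
        # Copy so later changes by the caller do not alter the job parameters.
        "random_forest_params": dict(random_forest_params or {}),
        "test_size": float(test_size),
    }


params = build_training_params(
    train_type="regression",
    random_forest_params={"max_depth": 5},  # example scikit-learn argument
    test_size=0.2,
)
print(params)
```

Validating locally catches a misspelled train_type before any job is submitted.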
Inference parameters
infer_type: str = "classification"
# "classification" or "regression"
security: str = "fed"
# "fed" or "smpc"
output: str = "classification"
# "classification" or "probability"
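To illustrate the difference between the two output values, the sketch below follows scikit-learn's documented behavior for RandomForestClassifier: class probabilities are the mean of the per-tree probabilities, and the predicted class is the one with the highest mean. It is an assumption that TripleBlind's "probability" and "classification" outputs follow the same convention; `forest_output` and the sample values are hypothetical.

```python
def forest_output(per_tree_probs, output="classification"):
    """Combine per-tree class probabilities for a single input row.

    per_tree_probs: one list per tree, each holding one probability per class.
    """
    if output not in ("classification", "probability"):
        raise ValueError('output must be "classification" or "probability"')
    n_trees = len(per_tree_probs)
    n_classes = len(per_tree_probs[0])
    # Average each class's probability across all trees.
    mean_probs = [
        sum(tree[c] for tree in per_tree_probs) / n_trees
        for c in range(n_classes)
    ]
    if output == "probability":
        return mean_probs
    # "classification": index of the class with the highest mean probability.
    return max(range(n_classes), key=lambda c: mean_probs[c])


# Three trees voting on a binary problem (made-up numbers).
TREES = [[0.9, 0.1], [0.3, 0.7], [0.2, 0.8]]
print(forest_output(TREES, output="probability"))
print(forest_output(TREES, output="classification"))
```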
Predictive Model Markup Language (PMML)
Predictive Model Markup Language (PMML) is the leading standard for statistical and data mining models and is supported by over 30 vendors and organizations. With PMML, it is easy to develop a model on one system using one application and deploy the model on another system using another application, simply by transmitting an XML configuration file.
In general, TripleBlind will be a consumer of PMML models as a means for a user to load a model into the TripleBlind system (i.e., create a model asset owned by the user).
TripleBlind has added support for Predictive Model Markup Language (PMML) models to perform inference with both Federated and SMPC security. Currently, a subset of the full PMML specification is supported, including General Regression and Tree Models.
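Because only part of the PMML specification is supported, it can help to check which model elements a file contains before uploading it. The following sketch uses only the Python standard library. It assumes that the supported subset corresponds to the PMML elements GeneralRegressionModel and TreeModel, and that forests appear as a MiningModel wrapping several TreeModel segments; `count_elements` and the trimmed sample document are hypothetical and not a scoreable model.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <MiningModel functionName="classification">
    <Segmentation multipleModelMethod="majorityVote">
      <Segment id="1"><True/><TreeModel functionName="classification"/></Segment>
      <Segment id="2"><True/><TreeModel functionName="classification"/></Segment>
    </Segmentation>
  </MiningModel>
</PMML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def count_elements(pmml_text, names):
    """Count how often each element name in `names` occurs in the document."""
    root = ET.fromstring(pmml_text)
    counts = {name: 0 for name in names}
    for element in root.iter():
        name = local_name(element.tag)
        if name in counts:
            counts[name] += 1
    return counts


counts = count_elements(
    SAMPLE, ["MiningModel", "TreeModel", "GeneralRegressionModel"]
)
print(counts)
```

Matching on the local tag name keeps the check independent of the PMML version in the namespace URI.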
Usage Notes
- Data transformations should be defined with TripleBlind preprocessors and not inside the PMML definition.
- To use the infer() method, it is recommended that the asset be cast to the ModelAsset class.
Limitations
- Currently, inference is supported for random forest binary classification models exported from R's randomForest package.
- Only numerical data is supported.
- The scikit-learn parameter n_estimators (number of trees in the forest) is not supported. This is due to our privacy-preserving implementation.
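The numerical-data limitation can be checked ahead of time by reading the PMML data dictionary. The sketch below is a hypothetical pre-flight check using only the standard library. It assumes that "numerical" means a PMML dataType of integer, float, or double, and that the target field (typically a categorical label in a classifier) is exempt; neither assumption is confirmed by the text above.

```python
import xml.etree.ElementTree as ET

NUMERIC_TYPES = {"integer", "float", "double"}
TARGET_USAGE = {"target", "predicted"}  # "predicted" is the older PMML spelling

SAMPLE = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="region" optype="categorical" dataType="string"/>
    <DataField name="label" optype="categorical" dataType="string"/>
  </DataDictionary>
  <MiningModel functionName="classification">
    <MiningSchema>
      <MiningField name="age"/>
      <MiningField name="region"/>
      <MiningField name="label" usageType="target"/>
    </MiningSchema>
  </MiningModel>
</PMML>"""


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree adds to tag names."""
    return tag.rsplit("}", 1)[-1]


def non_numeric_inputs(pmml_text):
    """Return the names of non-target data fields whose dataType is not numeric."""
    root = ET.fromstring(pmml_text)
    targets = set()
    data_types = {}
    for element in root.iter():
        name = local_name(element.tag)
        if name == "MiningField" and element.get("usageType") in TARGET_USAGE:
            targets.add(element.get("name"))
        elif name == "DataField":
            data_types[element.get("name")] = element.get("dataType")
    return sorted(
        field for field, data_type in data_types.items()
        if field not in targets and data_type not in NUMERIC_TYPES
    )


print(non_numeric_inputs(SAMPLE))
```

An empty list suggests every input field is numeric under these assumptions; any names returned would need to be encoded or dropped first.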
Inference parameters
security: str = "fed"
# "fed" or "smpc"