TripleBlind User Guide

Privacy Assurances and Risk

TripleBlind offers a suite of capabilities to serve many different privacy needs. Privacy assurances depend on many factors, including the actual content of Datasets, and it would be misleading to state that risk is zero, even when operating blindly. This section explains the risk associated with each capability offered.

Risk Levels

To simplify understanding of risk, the following icons are used to designate risk level.

Safest

Provides HIPAA-level data privacy protection and meets GDPR standards for data privacy protection. For more information, reference the external expert opinion, Privacy Analysis of TripleBlind’s Technology.

Safe

Provides high privacy protection with virtually no risk of inadvertent disclosure, but there are possible “edge cases” that prevent full assurance of privacy.

Safe with Care

Provides high privacy protection when set up and governed with proper procedure. Incorrect usage or procedure could result in a privacy leak.

Risk Summary

TripleBlind’s capabilities can be grouped into three primary categories: Machine Learning, Data Analysis, and Data Operations. The following table summarizes the risk level associated with each capability available in each category.


Columns: Capability, Safest, Safe, Safe with Care

Machine Learning:
  • Training
  • Inference

Data Analysis:
  • Blind Sample
  • Outlier Detection
  • Exploratory Data Analysis (EDA)

Data Operations:
  • Sentiment Analysis
  • Blind Match
  • Blind Report
  • Blind Stats
  • Blind String Search
  • Blind Join
  • Blind Query

Access Point Configuration

By default, a TripleBlind Access Point enables only the capabilities identified as Safe and Safest. If you want access to the Safe with Care capabilities, and understand what is required to use them in a manner that complies with applicable regulations, they can be enabled on your Access Point.
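The default described above can be pictured as a simple gate on risk level. The following is a minimal sketch only: RiskLevel, DEFAULT_ENABLED, and is_enabled are hypothetical names chosen for illustration, and this is not the actual Access Point configuration format or a TripleBlind API.

```python
from enum import Enum


class RiskLevel(Enum):
    """The three risk levels used in this guide (hypothetical type)."""
    SAFEST = "Safest"
    SAFE = "Safe"
    SAFE_WITH_CARE = "Safe with Care"


# Assumption: the default policy is modeled as a fixed set of allowed levels.
DEFAULT_ENABLED = frozenset({RiskLevel.SAFEST, RiskLevel.SAFE})


def is_enabled(level, allow_safe_with_care=False):
    """Return True if a capability at this risk level may run.

    Safe with Care capabilities run only after an explicit opt-in.
    """
    if level in DEFAULT_ENABLED:
        return True
    return allow_safe_with_care and level is RiskLevel.SAFE_WITH_CARE
```

In this sketch, is_enabled(RiskLevel.SAFE_WITH_CARE) is False until the opt-in flag is set, which mirrors the default behavior described above.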

Machine Learning

Training

TripleBlind supports a large number of model types and training methods. This list includes:

  • Blind Learning (for shallow and deep learning networks)
  • Distributed Blind Learning (conventional machine learning; e.g., logistic regression)
  • Federated Learning (for neural networks)
  • Region of Interest Training
  • Random Forest Training
  • XGBoost Training
  • Recommender System Training (using both neural networks and conventional methods)
  • BERT Training
  • Regression

All of these methods are designed to provide the highest level of privacy protection. The techniques used in generating these models reveal no raw data outside of the Dataset Owner’s Organization.

Inference

The TripleBlind inference toolset supports a wide range of Algorithms, from basic statistical queries to deep neural networks. Inferences made with the toolset preserve the privacy of both the model and the data. TripleBlind uses state-of-the-art, mathematically backed methods to guarantee the privacy of all involved parties. Additionally, the Audit and Agreements features help protect against a wide range of model and data attacks, such as frequency, membership, and reconstruction attacks.

Data Analysis

Blind Sample

Blind Sample generates a realistic, privacy-preserving sample that resembles the real data. Strings have similar lengths, integers fall in the same range, and floating point numbers have the same precision. When a column has been unmasked, a real value sampled from the Dataset is returned in that column. This value is returned out of context, with no link to the row from which it was sampled.
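The behavior described above can be illustrated with a short sketch. It is not TripleBlind's implementation: blind_sample, mock_value, and the unmasked parameter are hypothetical names, and the generation rules (uniform draws within the observed range, random lowercase letters for strings) are assumptions chosen to match the description.

```python
import random
import string
from decimal import Decimal


def _decimals(x):
    """Number of digits after the decimal point in x's shortest repr."""
    return max(0, -Decimal(repr(x)).as_tuple().exponent)


def mock_value(values, rng):
    """Return a synthetic value shaped like the values of one column."""
    example = rng.choice(values)
    if isinstance(example, int):
        # Integers: stay within the observed range.
        return rng.randint(min(values), max(values))
    if isinstance(example, float):
        # Floats: same range, rounded to the column's precision.
        places = max(_decimals(v) for v in values)
        return round(rng.uniform(min(values), max(values)), places)
    # Strings: random letters with a length seen in the column.
    return "".join(rng.choice(string.ascii_lowercase) for _ in range(len(example)))


def blind_sample(rows, unmasked=(), n=3, seed=0):
    """Build n synthetic rows; unmasked columns receive real values."""
    rng = random.Random(seed)
    columns = {name: [row[name] for row in rows] for name in rows[0]}
    sample = []
    for _ in range(n):
        out = {}
        for name, values in columns.items():
            if name in unmasked:
                # Real value, drawn independently of every other column,
                # so it cannot be tied back to its source row.
                out[name] = rng.choice(values)
            else:
                out[name] = mock_value(values, rng)
        sample.append(out)
    return sample
```

Because each unmasked value is drawn per column rather than per row, a returned row never reproduces a real record as a whole in this sketch, except by coincidence.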

Outlier Detection

The private analysis returns the indices of outlier rows, but never the actual data. Only the Dataset Owner can use these indices to look up the actual outlying values.
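The index-only result can be sketched as follows. The guide does not state which detection method is used, so the z-score rule, the threshold, and the name outlier_indices are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def outlier_indices(values, threshold=2.0):
    """Return positions of values whose z-score exceeds the threshold.

    Only row positions are returned, never the values themselves.
    """
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, so nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical private column held by the Dataset Owner.
private_column = [10.1, 9.8, 10.3, 10.0, 55.0, 9.9, 10.2]

# What the requester receives: indices only.
indices = outlier_indices(private_column)

# Only the Dataset Owner, who holds the data, can resolve them.
owner_view = [private_column[i] for i in indices]
```

Here the requester learns that row 4 is an outlier, while the value 55.0 is visible only to the party that already holds the column.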

Exploratory Data Analysis (EDA)