TripleBlind User Guide
Privacy Assurances and Risk
TripleBlind offers a suite of capabilities to serve many different privacy needs. Privacy assurances depend on many factors, including the actual content of Datasets, and it would be misleading to state that risk is zero, even when operating blindly. This section explains the risk associated with each capability offered.
Risk Levels
To simplify understanding of risk, each capability is assigned one of the following risk levels.
Safest
Provides HIPAA-level data privacy protection and meets GDPR standards for data privacy protection. For more information, see the external expert opinion, Privacy Analysis of TripleBlind’s Technology.
Safe
Provides high privacy protection with virtually no risk of inadvertent disclosure, but there are possible “edge cases” that prevent full assurance of privacy.
Safe with Care
Provides high privacy protection when set up and governed with proper procedure. Incorrect usage or procedure could result in a privacy leak.
Risk Summary
TripleBlind’s capabilities can be grouped into three primary categories: Machine Learning, Data Analysis, and Data Operations. The following table summarizes the risk level associated with each capability available in each category.
Capability | Safest | Safe | Safe with Care
Machine Learning | | |
Training | | |
Inference | | |
Data Analysis | | |
Blind Sample | | |
Outlier Detection | | |
Exploratory Data Analysis (EDA) | | |
Data Operations | | |
Sentiment Analysis | | |
Blind Match | | |
Blind Report | | |
Blind Stats | | |
Blind String Search | | |
Blind Join | | |
Blind Query | | |
Access Point Configuration
By default, a TripleBlind Access Point is configured to enable only the capabilities identified as Safest and Safe. If you want access to the Safe with Care capabilities and understand what is required to use them in a manner that complies with regulations, they can be enabled on your Access Point.
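The default policy can be pictured as a simple gate on risk level. The sketch below is illustrative only: `RiskLevel`, `is_enabled`, the `allow_safe_with_care` flag, and the capability names are hypothetical and are not part of the TripleBlind SDK or the Access Point configuration format.

```python
from enum import Enum


class RiskLevel(Enum):
    SAFEST = "Safest"
    SAFE = "Safe"
    SAFE_WITH_CARE = "Safe with Care"


# Risk levels enabled on an Access Point by default, per the text above.
DEFAULT_ENABLED = frozenset({RiskLevel.SAFEST, RiskLevel.SAFE})


def is_enabled(level, allow_safe_with_care=False):
    """Return True if a capability at this risk level may run."""
    if level in DEFAULT_ENABLED:
        return True
    # Safe with Care runs only after an explicit opt-in.
    return allow_safe_with_care and level is RiskLevel.SAFE_WITH_CARE


# Placeholder capability names (assumption); the real assignment of
# capabilities to risk levels is given in the Risk Summary table.
example_capabilities = {
    "capability_a": RiskLevel.SAFEST,
    "capability_b": RiskLevel.SAFE,
    "capability_c": RiskLevel.SAFE_WITH_CARE,
}

default_view = {
    name: is_enabled(level) for name, level in example_capabilities.items()
}
opted_in_view = {
    name: is_enabled(level, allow_safe_with_care=True)
    for name, level in example_capabilities.items()
}
```

In this sketch, `default_view` leaves the Safe with Care placeholder disabled, while `opted_in_view` enables all three.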
Machine Learning
Training
TripleBlind supports a large number of model types and training methods, including:
- Blind Learning (for shallow and deep learning networks)
- Distributed Blind Learning (conventional machine learning; e.g., logistic regression)
- Federated Learning (for neural networks)
- Region of Interest Training
- Random Forest Training
- XGBoost Training
- Recommender System Training (using both neural networks and conventional methods)
- BERT Training
- Regression
All of these methods are designed to provide the highest level of privacy protection. The techniques used in generating these models reveal no raw data outside of the Dataset Owner’s Organization.
Inference
The TripleBlind inference toolset supports a wide range of Algorithms, from basic statistical queries to deep neural networks. Inferences made with the toolset preserve the privacy of both the model and the data. TripleBlind uses state-of-the-art, mathematically backed methods to guarantee the privacy of all involved parties. Additionally, the Audit and Agreements features help protect against a wide range of model and data attacks, such as frequency, membership, and reconstruction attacks.
Data Analysis
Blind Sample
Blind Sample generates a realistic, privacy-preserving sample similar to the real data. Strings have similar lengths, integers are in the same range, and floating point numbers have the same precision. When a column has been unmasked, a real sample value taken from the Dataset is returned in that column. This value is returned out of context, detached from the row from which it was sampled.
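To make the shape of such a sample concrete, here is a minimal sketch using only the Python standard library. It is not TripleBlind's implementation and carries no privacy guarantee of its own; the function names (`decimal_places`, `mock_value`, `mock_row`) and the example table are assumptions made for illustration. Each column is drawn independently, which is why a value from an unmasked column is detached from its original row.

```python
import random
import string


def decimal_places(x):
    """Digits after the decimal point in the shortest repr of a float."""
    text = repr(float(x))
    if "e" in text or "." not in text:
        return 0
    return len(text.split(".")[1])


def mock_value(values, rng, unmasked=False):
    """Produce one look-alike value for a column of same-typed values."""
    if unmasked:
        # Unmasked column: return a real value, chosen independently of any row.
        return rng.choice(values)
    first = values[0]
    if isinstance(first, str):
        # Random letters with the length of some real string in the column.
        length = len(rng.choice(values))
        return "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
    if isinstance(first, int):
        # Random integer within the observed range.
        return rng.randint(min(values), max(values))
    if isinstance(first, float):
        # Random float within the observed range, rounded to the column's precision.
        places = max(decimal_places(v) for v in values)
        return round(rng.uniform(min(values), max(values)), places)
    raise TypeError(f"unsupported column type: {type(first).__name__}")


def mock_row(table, rng, unmasked_columns=()):
    """Build one sample row; every column is drawn independently."""
    return {
        name: mock_value(column, rng, unmasked=name in unmasked_columns)
        for name, column in table.items()
    }


rng = random.Random(0)
table = {
    "name": ["Alice", "Bob", "Carol"],
    "age": [34, 51, 29],
    "score": [0.25, 0.5, 0.75],
    "state": ["KS", "MO", "NE"],
}
row = mock_row(table, rng, unmasked_columns={"state"})
print(row)
```

Here only `state` is treated as unmasked, so the printed row contains a real state code next to synthetic values for the other columns.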
Outlier Detection
The private analysis returns the indices of outlier rows, but never the actual data. Only the Dataset Owner can use these indices to determine the actual outlying data values.
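The idea of returning positions rather than values can be sketched as follows. The z-score rule, the `outlier_indices` name, and the threshold are assumptions chosen for illustration; the source does not state which detection method TripleBlind uses.

```python
import statistics


def outlier_indices(values, threshold=2.0):
    """Return the positions of values more than `threshold` standard
    deviations from the mean. Only indices are returned, never the values."""
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        # All values are identical, so nothing can be an outlier.
        return []
    return [
        index
        for index, value in enumerate(values)
        if abs(value - mean) / spread > threshold
    ]


# Example data held by the Dataset Owner (made up for this sketch).
readings = [10, 11, 9, 10, 12, 95, 10, 11]
flagged = outlier_indices(readings)
print(flagged)  # only index 5 is flagged
```

A caller receiving `flagged` learns which rows are unusual but not what they contain; mapping an index back to a value requires access to `readings`, which stays with the Dataset Owner.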