Identify records whose values deviate from the expected distribution. The operation runs in a privacy-preserving manner: it never exposes the content of the dataset and outputs only the record identifiers of the outliers, not raw data.
This operation privately analyzes the given dataset values. Each value is compared to the mean of the dataset, and rows are flagged when the value lies more than the given number of standard deviations from that mean. The distance of a value from the mean, measured in standard deviations, is known as its z-score. For a normal distribution, about 99.7% of values fall within 3 standard deviations of the mean.
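As a rough illustration of the statistics described above, the following sketch computes z-scores on plain, local data and checks the 99.7% figure. It is not the operation's private implementation; the function names `z_scores` and `fraction_within` are hypothetical, and the use of the population standard deviation is an assumption.

```python
from statistics import NormalDist, mean, pstdev


def z_scores(values):
    """Return the z-score of each value: (value - mean) / standard deviation."""
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        # All values are identical, so nothing deviates from the mean.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    std_normal = NormalDist()  # mean 0, standard deviation 1
    return std_normal.cdf(k) - std_normal.cdf(-k)


# Made-up sample data with one obviously unusual value at the end.
scores = z_scores([10, 12, 11, 13, 12, 11, 10, 12, 11, 95])
coverage = fraction_within(3)
print(f"largest z-score: {max(scores):.3f}")
print(f"share of a normal distribution within 3 std: {coverage:.4f}")
```

Note that in a very small sample a single extreme value also inflates the standard deviation, so its z-score can stay lower than intuition suggests.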
- Use the
- When using `add_agreement()` to allow a counterparty to use your dataset, or using `create_job()` to perform outlier detection, use
outlier_algorithm: str = ""
outlier_params: dict = defaultdict
identifier_column: str = ""
- Currently only the
- Rows whose absolute z-score is larger than the specified `std` parameter will be returned.
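The thresholding rule can be sketched on local data as follows. This is a minimal, non-private illustration and not the library's code: `find_outlier_ids`, the `value_column` argument, the sample rows, and the assumption that the threshold is passed as `outlier_params["std"]` are all hypothetical choices made for this example.

```python
from statistics import mean, pstdev


def find_outlier_ids(rows, value_column, identifier_column, outlier_params):
    """Return identifiers of rows whose absolute z-score exceeds the threshold."""
    threshold = outlier_params["std"]  # assumed key, mirroring the `std` parameter
    values = [row[value_column] for row in rows]
    mu = mean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    if sigma == 0:
        return []  # identical values: no row can be an outlier
    return [
        row[identifier_column]
        for row in rows
        if abs((row[value_column] - mu) / sigma) > threshold
    ]


# Made-up records; only the identifiers are returned, never the values.
amounts = [10, 12, 11, 13, 12, 11, 10, 12, 11, 95]
rows = [{"record_id": f"r{i + 1}", "amount": a} for i, a in enumerate(amounts)]

outlier_ids = find_outlier_ids(rows, "amount", "record_id", {"std": 2})
strict_ids = find_outlier_ids(rows, "amount", "record_id", {"std": 3})
print(outlier_ids)
print(strict_ids)
```

Returning only the values of the identifier column mirrors the behavior described above, where the output contains record identifiers rather than raw data.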