Glossary


Here you will find brief definitions of commonly used terms in the Seldonian framework:

Behavioral constraints

Criteria for fairness or safety provided by the user. Each behavioral constraint consists of a constraint function and a confidence level. In many cases, the constraint function can be constructed from a constraint string provided by the user. The Seldonian algorithm ensures that each behavioral constraint is met with probability at least $1-\delta$, where $\delta$ is the confidence level the user provided for that constraint.
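
In symbols, this guarantee can be written as follows. The notation is an assumption borrowed from common presentations of the framework, not from this glossary: $a$ is the algorithm, $D$ is the dataset, $g_i$ is the $i$-th constraint function (with $g_i \le 0$ meaning the constraint is satisfied), and $\delta_i$ is its confidence level.

$$\Pr\big(g_i(a(D)) \le 0\big) \ge 1 - \delta_i, \quad i = 1, \dots, n$$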

Candidate selection

One of the three major components of a Seldonian algorithm. Using a fraction of the dataset (called the candidate dataset), it searches for a solution that optimizes the primary objective (i.e., the loss function) and is also predicted to satisfy the behavioral constraints on the safety dataset, which is the remaining fraction of the dataset. The candidate dataset is analogous to the training set in the standard supervised machine learning paradigm.
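
The split between the candidate dataset and the safety dataset can be illustrated with a minimal sketch. The function name `split_candidate_safety` and the parameter `safety_fraction` are hypothetical and chosen for illustration; they are not taken from the library.

```python
import random


def split_candidate_safety(data, safety_fraction=0.6, seed=0):
    """Shuffle a copy of `data` and split it into (candidate, safety) lists.

    `safety_fraction` is a hypothetical parameter: the share of points
    held out for the safety test. The rest is used for candidate selection.
    """
    if not 0.0 < safety_fraction < 1.0:
        raise ValueError("safety_fraction must be strictly between 0 and 1")
    rng = random.Random(seed)   # fixed seed so the split is reproducible
    shuffled = list(data)       # copy, so the caller's data is not reordered
    rng.shuffle(shuffled)
    n_safety = int(round(len(shuffled) * safety_fraction))
    safety = shuffled[:n_safety]
    candidate = shuffled[n_safety:]
    return candidate, safety


candidate, safety = split_candidate_safety(range(10), safety_fraction=0.6)
print(len(candidate), len(safety))
```

The two parts are disjoint, so the solution found on the candidate dataset never sees the points later used by the safety test.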

Confidence level

Often denoted $\delta$. Provided by the user, the confidence level defines the maximum acceptable probability that the Seldonian algorithm violates a behavioral constraint. For example, $\delta = 0.05$ means the constraint must hold with probability at least $0.95$.

Interface

The system the user interacts with to provide the behavioral constraints and other inputs to the Seldonian algorithm. Examples include simple command line interfaces, scripts, and more complicated graphical user interfaces (GUIs).

Measure function

A component of a behavioral constraint that, when it appears in a constraint string, is recognized by the engine as a statistical function with special meaning. Examples are "Mean_Squared_Error", used in regression problems; "FPR", which stands for false positive rate and is used in classification problems; and "J_pi_new_IS", which stands for the performance of the new policy in reinforcement learning problems, as evaluated by ordinary importance sampling.
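
To make the statistics behind two of these names concrete, here is a minimal sketch of the mean squared error and the false positive rate in plain Python. The function names are hypothetical, the false positive rate is computed from hard 0/1 labels for simplicity, and the engine's own implementations may differ.

```python
def mean_squared_error(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def false_positive_rate(y_true, y_pred):
    """Fraction of true negatives (label 0) that were predicted positive (1)."""
    preds_on_negatives = [p for t, p in zip(y_true, y_pred) if t == 0]
    if not preds_on_negatives:
        raise ValueError("FPR is undefined without any negative examples")
    return sum(1 for p in preds_on_negatives if p == 1) / len(preds_on_negatives)


mse = mean_squared_error([1.0, 2.0], [1.0, 4.0])    # (0 + 4) / 2
fpr = false_positive_rate([0, 0, 1, 0], [1, 0, 1, 0])  # 1 of 3 negatives flagged
print(mse, fpr)
```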

Primary objective function

The objective function (also called the loss function) that the machine learning model would optimize on its own in the absence of behavioral constraints. The Seldonian machine learning model seeks to optimize the primary objective function while satisfying the behavioral constraints. Depending on the problem, some performance on the objective function may be sacrificed to satisfy the behavioral constraints.

Regime

The broad category of machine learning problem, e.g., supervised learning or reinforcement learning.

Safety test

One of the three major components of a Seldonian algorithm. It is the component that, given a solution determined during candidate selection, tests whether that solution satisfies the behavioral constraints on the held-out safety dataset that was not used to find the solution. The safety dataset is analogous to the test set in the standard supervised machine learning paradigm.
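
The idea of a safety test can be sketched with a high-confidence upper bound. The example below uses Hoeffding's inequality, chosen here only because it is short to state; it is not necessarily the bound the library uses. It also assumes the convention that a constraint is satisfied when its constraint function $g \le 0$, and that per-sample estimates of $g$ on the safety dataset lie in a known range. All names are hypothetical.

```python
import math


def hoeffding_upper_bound(samples, delta, lower, upper):
    """(1 - delta)-confidence upper bound on the mean of bounded i.i.d. samples.

    Assumes every sample lies in [lower, upper].
    """
    n = len(samples)
    if n == 0:
        raise ValueError("need at least one sample")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must be strictly between 0 and 1")
    mean = sum(samples) / n
    return mean + (upper - lower) * math.sqrt(math.log(1.0 / delta) / (2.0 * n))


def passes_safety_test(g_samples, delta, lower, upper):
    """Pass only if we are (1 - delta)-confident that the expected g is <= 0."""
    return hoeffding_upper_bound(g_samples, delta, lower, upper) <= 0.0


# Same sample mean (-0.2), different amounts of safety data.
few = passes_safety_test([-0.2] * 10, delta=0.05, lower=-1.0, upper=1.0)
many = passes_safety_test([-0.2] * 1000, delta=0.05, lower=-1.0, upper=1.0)
print(few, many)
```

With only 10 samples the bound is too loose and the test fails even though the sample mean is negative; with 1000 samples the same mean passes. A failed test in this sketch means "not enough evidence", not that the constraint is known to be violated.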

Seldonian algorithm

An algorithm designed to enforce high-probability constraints in a machine learning problem.

Sensitive attributes

In a fairness constraint, a sensitive attribute is one against which the model should not discriminate. Gender and race are common examples. Sensitive attributes are also sometimes called protected attributes.

Subregime

A more specific category of problem within a regime. Within supervised learning, the subregimes supported by this library are classification (binary and multiclass) and regression. Reinforcement learning does not currently have subregimes in this library.