Monday, 24 August 2020

How does predict_proba() in LightGBM work internally?

This is in reference to understanding, internally, how the probabilities for a class are predicted using LightGBM.

Other packages, like sklearn, provide thorough detail for their classifiers. For example:

Probability estimates.

The returned estimates for all classes are ordered by the label of classes.

For a multi_class problem, if multi_class is set to be “multinomial” the softmax function is used to find the predicted probability of each class. Else use a one-vs-rest approach, i.e calculate the probability of each class assuming it to be positive using the logistic function. and normalize these values across all the classes.

Predict class probabilities for X.

The predicted class probabilities of an input sample are computed as the mean predicted class probabilities of the trees in the forest. The class probability of a single tree is the fraction of samples of the same class in a leaf.

There are additional Stack Overflow questions which provide additional details, such as for:

I am trying to uncover those same details for LightGBM's predict_proba function. The documentation does not list the details of how the probabilities are calculated.

The documentation simply states:

Return the predicted probability for each class for each sample.

The source code is below:

def predict_proba(self, X, raw_score=False, start_iteration=0, num_iteration=None,
                  pred_leaf=False, pred_contrib=False, **kwargs):
    """Return the predicted probability for each class for each sample.

    Parameters
    ----------
    X : array-like or sparse matrix of shape = [n_samples, n_features]
        Input features matrix.
    raw_score : bool, optional (default=False)
        Whether to predict raw scores.
    start_iteration : int, optional (default=0)
        Start index of the iteration to predict.
        If <= 0, starts from the first iteration.
    num_iteration : int or None, optional (default=None)
        Total number of iterations used in the prediction.
        If None, if the best iteration exists and start_iteration <= 0, the best iteration is used;
        otherwise, all iterations from ``start_iteration`` are used (no limits).
        If <= 0, all iterations from ``start_iteration`` are used (no limits).
    pred_leaf : bool, optional (default=False)
        Whether to predict leaf index.
    pred_contrib : bool, optional (default=False)
        Whether to predict feature contributions.

        .. note::

            If you want to get more explanations for your model's predictions using SHAP values,
            like SHAP interaction values,
            you can install the shap package (https://github.com/slundberg/shap).
            Note that unlike the shap package, with ``pred_contrib`` we return a matrix with an extra
            column, where the last column is the expected value.

    **kwargs
        Other parameters for the prediction.

    Returns
    -------
    predicted_probability : array-like of shape = [n_samples, n_classes]
        The predicted probability for each class for each sample.
    X_leaves : array-like of shape = [n_samples, n_trees * n_classes]
        If ``pred_leaf=True``, the predicted leaf of every tree for each sample.
    X_SHAP_values : array-like of shape = [n_samples, (n_features + 1) * n_classes] or list with n_classes length of such objects
        If ``pred_contrib=True``, the feature contributions for each sample.
    """
    result = super(LGBMClassifier, self).predict(X, raw_score, start_iteration, num_iteration,
                                                 pred_leaf, pred_contrib, **kwargs)
    if callable(self._objective) and not (raw_score or pred_leaf or pred_contrib):
        warnings.warn("Cannot compute class probabilities or labels "
                      "due to the usage of customized objective function.\n"
                      "Returning raw scores instead.")
        return result
    elif self._n_classes > 2 or raw_score or pred_leaf or pred_contrib:
        return result
    else:
        return np.vstack((1. - result, result)).transpose()

How can I understand how exactly the predict_proba function for LightGBM is working internally?



from How does predict_proba() in LightGBM work internally?

No comments:

Post a Comment