Algorithms and Parallelism

As has been discussed, in nearly all cases Shapley values are impossible to compute exactly, both due to computational intractibility and lack of knowledge of the true distribution.

This package's API allows users to use a common interface. It consists of Algorithm objects which store parameters associated with the particular algorithm.

Additionally, these objects describe what parallelism, if any, should be used in the computation. These are associated with the algorithms because different algorithms may have different ways of utilizing parallelism.

This package uses ComputationalResources.jl to describe which computational resources, and which degree of parallelism should be used in a computation.

For the time being, the package only provides the Shaplye.MonteCarlo algorithm.

Available Algorithms

Shapley.MonteCarlo

Shapley.MonteCarloType
MonteCarlo{P<:AbstractResource,R<:AbstractRNG}

A Shapley.Algorithm object for describing the computation of Shapley values via the monte carlo algorithm introduced by Strumbelj, Kononenko 2014.

The res argument in the constructors specifies a computational resource for executing the computation. Currently available options are CPU1() (single threaded) and CPUThreads() (all threads available to the Julia process). These are provided by the ComputationalResources.jl package.

Constructors

  • MonteCarlo(res, M, rng=Random.GLOBAL_RNG)
  • MonteCarlo(M, rng=Random.GLOBAL_RNG)

Arguments

  • res: An AbstractResource computational resource. Pass CPUThreads() to run multi-threaded.
  • M: The total number of iterations (samples).
  • rng: A random number generator object to use.

Computational Resources:

  • CPU1(): Single threaded.
  • CPUThreads(): Independent iterations of the monte carlo algorithm will be computed on separate threads. See the Julia multi-threading documentation for instructions on how to set the number of CPU threads available to the Julia process.

Shapley.KernelSHAP

Shapley.KernelSHAPType
KernelSHAP{P<:AbstractResource,R<:AbstractRNG,NT<:NamedTuple}

A Shapley.Algorithm object for describing the computation of Shapley values via the Kernel SHAP algorithm.

The res argument in the constructors specfies a computational resource, currently either CPU1() or CPUThreads() is available.

This algorithm uses the lasso regression from Lasso.jl to perform regularized linear regressions. Note that because this algorithm involves a lasso fit for each individual data point, it is incredibly slow.

Constructors

  • KernelSHAP(res, ν, M, rng=Random.GLOBAL_RNG; ϕ₀_colname=:ϕ₀, kwargs...)
  • KernelSHAP(res, ν, M, rng=Random.GLOBAL_RNG; ϕ₀_colname=:ϕ₀, kwargs...)

Arguments

  • res: An AbstractResource computational resource. Pass CPUThreads() to run multi-threaded.
  • ν: The number of distinct coalitions to use. It cannot excede floor(N/2) where N is the number of features.
  • M: The multiplicity of sample data points per coalition.
  • ϕ₀_colname: The column name to use for the intercept Shapley value. If nothing, colum will not be included.
  • kwargs: Keyword arguments used by the fit. The fit is performed by Lasso.jl, see here for available arguments.

Computational Resources:

  • CPU1(): Single threaded.
  • CPUThreads(): Linear regressions will be performed on different threads.
Note

Using KernelSHAP effictively will require a good understanding of the "lasso" ($L_1,L_2$ regularized linear regressions) regression from the Lasso.jl package. In particular see the documentation for fit. The fit method called by KernelSHAP is fit(LassoModel, X, y; kwargs...).