Muppet: A Modular and Constructive Decomposition for Perturbation-based Explanation Methods
Perturbation-based explanation methods — such as LIME, SHAP, and Anchors — have become the de facto standard for interpreting black-box machine learning models. Yet these methods are typically presented as monolithic algorithms, making it difficult to compare them systematically or combine their strengths.
In this paper, we propose Muppet, a modular decomposition framework that breaks any perturbation-based explainer into four composable blocks: Perturbation, Representation, Similarity, and Aggregation. We show that all major explanation methods can be expressed as specific configurations of these blocks — and that novel, high-performing configurations emerge from recombining them.
The four blocks
Perturbation
The perturbation block defines how input features are modified to observe their effect on the model's output. Common strategies include:
- Masking — Replacing features with a baseline value (zero, mean, or random)
- Coalitional — Sampling subsets of features as in Shapley value estimation
- Generative — Using a learned model to generate realistic perturbations
from muppet import perturbation
# Masking strategy with mean baseline
masker = perturbation.MaskPerturbation(baseline="mean", n_samples=1000)
# Coalitional strategy (Shapley-style)
coalitional = perturbation.CoalitionalPerturbation(n_samples=2048, paired=True)
# Generative strategy using a VAE
generative = perturbation.GenerativePerturbation(model=trained_vae, n_samples=500)
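For intuition, the masking strategy itself fits in a few lines of NumPy. This is an illustrative sketch of what a mean- or zero-baseline masker does, not the library's actual implementation; the function name and signature are ours:

```python
import numpy as np

def mask_perturb(x, baseline, n_samples=1000, seed=None):
    """Generate perturbed copies of x: each feature is independently
    kept or replaced by its baseline value with probability 0.5.
    Returns the perturbed samples and the boolean keep-mask."""
    rng = np.random.default_rng(seed)
    keep = rng.integers(0, 2, size=(n_samples, x.shape[0])).astype(bool)
    return np.where(keep, x, baseline), keep

# Example with a zero baseline; a mean baseline would use X_train.mean(axis=0)
x = np.array([1.0, 5.0, -2.0])
samples, keep = mask_perturb(x, baseline=np.zeros(3), n_samples=8, seed=0)
```

Each row of `samples` is the original input with a random subset of features reset to the baseline; the model is then queried on these rows to observe how the output shifts.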
Representation
The representation block determines the feature space in which explanations are computed. For tabular data, this is straightforward (individual columns). For images and text, the choice is more complex:
from muppet import representation
# Superpixel representation for images
superpixel = representation.SuperpixelRepresentation(n_segments=50, compactness=10)
# Token-level representation for text
token_rep = representation.TokenRepresentation(tokenizer="spacy", granularity="word")
# Semantic grouping (clusters of related features)
semantic = representation.SemanticGroupRepresentation(embedding_model="all-MiniLM-L6-v2")
Similarity
The similarity block determines how much weight each perturbed sample receives, based on its proximity to the original input. This is where methods diverge most significantly:
- LIME uses an exponential kernel in the representation space
- SHAP uses Shapley value weighting based on coalition size
- Anchors uses a binary precision threshold
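To make the contrast concrete, the three weighting schemes can be sketched as plain functions. These follow the standard published formulas (LIME's exponential kernel, KernelSHAP's coalition weight, Anchors' precision threshold); the function names are ours and this is not the Muppet API:

```python
import math

def lime_kernel(distance, width=0.75):
    """LIME-style exponential kernel: closer perturbations weigh more."""
    return math.exp(-(distance ** 2) / (width ** 2))

def shapley_kernel(n_features, coalition_size):
    """KernelSHAP weight for a coalition of size s out of M features:
    (M - 1) / (C(M, s) * s * (M - s)); empty and full coalitions are
    constrained exactly, conventionally given infinite weight."""
    M, s = n_features, coalition_size
    if s == 0 or s == M:
        return float("inf")
    return (M - 1) / (math.comb(M, s) * s * (M - s))

def anchors_weight(precision, threshold=0.95):
    """Anchors-style binary weighting: a candidate rule passes or it doesn't."""
    return 1.0 if precision >= threshold else 0.0
```

Note the qualitative difference: LIME's weight decays smoothly with distance, SHAP's depends only on coalition size, and Anchors' is all-or-nothing.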
Aggregation
Finally, the aggregation block combines weighted observations into a final attribution vector. Linear regression (LIME), weighted averaging (SHAP), and rule extraction (Anchors) are all specific aggregation strategies.
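The two numeric strategies can likewise be written out directly. Below is a NumPy sketch of a weighted least-squares fit (LIME-style) and a weighted average of marginal contributions (SHAP-style), assuming the weights come from the similarity block; these are illustrative, not the library's internals:

```python
import numpy as np

def linear_regression_aggregation(Z, y, weights):
    """LIME-style aggregation: fit a weighted linear model of the model
    outputs y on the perturbed samples Z (n_samples x n_features).
    The fitted coefficients are the attribution vector."""
    sw = np.sqrt(np.asarray(weights, dtype=float))
    # Weighted least squares via lstsq on the row-scaled system.
    beta, *_ = np.linalg.lstsq(sw[:, None] * Z, sw * y, rcond=None)
    return beta

def weighted_average_aggregation(contributions, weights):
    """SHAP-style aggregation: weighted average of per-sample marginal
    contributions (n_samples x n_features)."""
    weights = np.asarray(weights, dtype=float)
    return (weights[:, None] * contributions).sum(axis=0) / weights.sum()
```

With uniform weights and outputs generated by a linear model, the regression recovers that model's coefficients exactly, which is the sense in which LIME's explanation is a local linear surrogate.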
Key findings
By systematically exploring the combinatorial space of block configurations, we discovered that:
- LIME's perturbation + SHAP's aggregation yields attributions that are 15% more faithful than either method alone on tabular benchmarks
- Generative perturbations dramatically reduce out-of-distribution artifacts — the primary failure mode of masking-based approaches
- The choice of similarity kernel matters more than the aggregation method for image explanations, contradicting the common focus on the latter
The modular perspective reveals that the differences between explanation methods are less fundamental than they appear. Most of the "innovation" in recent papers amounts to changing a single block while keeping the others fixed.
The Muppet library
We release Muppet as an open-source Python library that implements this decomposition:
from muppet import Explainer, perturbation, representation, similarity, aggregation
# Compose a custom explainer
explainer = Explainer(
perturbation=perturbation.CoalitionalPerturbation(n_samples=2048),
representation=representation.SuperpixelRepresentation(n_segments=50),
similarity=similarity.ExponentialKernel(width=0.75),
aggregation=aggregation.LinearRegression(regularization="ridge")
)
# Explain a prediction
explanation = explainer.explain(model, input_sample)
explanation.visualize()
Muppet supports tabular, image, and text data out of the box, and is designed to be extended with custom blocks. The library is available on PyPI (pip install muppet-xai) and on GitHub.
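To show how the four blocks interact end to end, here is a complete toy pipeline in plain NumPy, independent of the library: masking perturbation, a binary keep-mask representation, exponential-kernel similarity, and linear-regression aggregation, applied to a known linear model. All names here are ours, chosen for illustration:

```python
import numpy as np

def explain(model, x, baseline, n_samples=2000, width=0.75, seed=0):
    rng = np.random.default_rng(seed)
    d = x.shape[0]
    # Perturbation: randomly mask each feature back to the baseline.
    keep = rng.integers(0, 2, size=(n_samples, d)).astype(bool)
    Z = np.where(keep, x, baseline)
    # Representation: binary indicators of which features were kept.
    Zb = keep.astype(float)
    # Similarity: exponential kernel on the fraction of masked features.
    dist = 1.0 - Zb.mean(axis=1)
    w = np.exp(-(dist ** 2) / (width ** 2))
    # Aggregation: weighted least squares of model outputs on indicators.
    y = model(Z)
    sw = np.sqrt(w)
    beta, *_ = np.linalg.lstsq(sw[:, None] * Zb, sw * y, rcond=None)
    return beta

# Toy linear model whose true per-feature contributions are known.
model = lambda Z: Z @ np.array([3.0, 0.0, -2.0])
attr = explain(model, x=np.array([1.0, 1.0, 1.0]), baseline=np.zeros(3))
```

Because the toy model is linear and the baseline output is zero, the recovered attributions match the model's coefficients; swapping any single block (e.g. the kernel width, or the aggregation) changes the explanation without touching the rest of the pipeline, which is the point of the decomposition.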