Introspectors API
Data Introspectors
- class deepview.introspectors.Familiarity(meta_key, _distributions)
An algorithm that fits a density model to model responses and produces an object that can score responses.
Like other introspectors, use Familiarity.introspect to instantiate.
- Parameters:
meta_key – do not instantiate Familiarity directly, use Familiarity.introspect
_distributions – do not instantiate Familiarity directly, use Familiarity.introspect
- class Strategy[source]¶
Bundled Familiarity computation strategies. See
- class GMM(*, gaussian_count=5, convergence_threshold=0.001, max_iterations=200, covariance_type=GMMCovarianceType.DIAG, _random_state=None)¶
A strategy that fits a mixture of multivariate Gaussian distributions on the introspected responses using sklearn.mixture.GaussianMixture.
- Parameters:
gaussian_count – [keyword arg, optional] Number of gaussian distributions to be fitted in the mixture model.
convergence_threshold – [keyword arg, optional] Convergence threshold used when fitting the mixture model.
max_iterations – [keyword arg, optional] Maximum number of iterations to use when fitting the mixture model.
covariance_type – [keyword arg, optional] Covariance type, GMMCovarianceType.DIAG by default. See sklearn’s GaussianMixture docs for extra information.
- covariance_type: GMMCovarianceType = 'diag'
Covariance type, GMMCovarianceType.DIAG by default. See sklearn’s GaussianMixture docs for extra information.
- static introspect(producer, *, strategy=None, batch_size=1024)
Examines the producer to fit a model for classifying familiarity of another set of responses.
- Parameters:
producer (Producer) – the producer of model responses to fit the familiarity model to
strategy (FamiliarityStrategyType | None) – [keyword arg, optional] familiarity strategy for producing the model. If None, the default strategy is used.
batch_size (int) – [keyword arg, optional] batch size to use when reading data from the producer
- Returns:
a Familiarity instance that, when added into a pipeline, will score responses against the familiarity model fit to the input producer and attach the score as metadata using its meta_key.
- Return type:
- meta_key: DictMetaKey[FamiliarityResult]¶
Metadata key used to access the familiarity result (FamiliarityResult). This is accessible via:
Example
results = batch.metadata[familiarity_processor.meta_key]['response_a'] # type of results: t.Sequence[FamiliarityResult]
- class deepview.introspectors.FamiliarityStrategyType(*args, **kwargs)[source]¶
Protocol for a class/function that takes a producer and produces a per-layer mapping of FamiliarityDistribution.
- metadata_key: ClassVar[DictMetaKey[FamiliarityResult]]
Key that will be used to view the metadata for a particular strategy.
- class deepview.introspectors.FamiliarityResult(*args, **kwargs)[source]¶
Protocol for the result of applying a FamiliarityDistribution to a response.
- class deepview.introspectors.GMMCovarianceType(value)[source]¶
Covariance type to be learnt from data. Typically, use FULL for low dimensional data and DIAG for high dimensional data.
The main problem with FULL in high dimensions is that the algorithm learns dim x dim parameters for each gaussian, and so overfitting or degenerate solutions may be a problem.
The boundary between low and high dimensional data is fuzzy, and the choice of covariance type also depends on the application, data distribution or amount of data available.
A general rule is:
If there are concerns about overfitting due to a lack of data, or the dimensionality is high relative to the data available, use DIAG. This is typically the case when working with DNN embeddings.
Otherwise, use FULL, for example when fitting 2D data.
For more information about covariance types, refer to the sklearn GMM covariances page.
- DIAG = 'diag'¶
Diagonal covariance type, only the diagonal parameters will be learnt from data.
- FULL = 'full'¶
Full covariance type, all dim x dim parameters will be learnt from data.
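To make the trade-off between the two covariance types concrete, the sketch below counts the covariance entries learnt per Gaussian. The helper `covariance_parameter_count` is hypothetical and not part of deepview; it only mirrors the dim x dim reasoning given above.

```python
def covariance_parameter_count(dim: int, covariance_type: str) -> int:
    """Covariance entries learnt per Gaussian (hypothetical helper, not a deepview API)."""
    if dim < 1:
        raise ValueError("dim must be a positive integer")
    if covariance_type == "diag":
        # DIAG: only the diagonal entries are learnt
        return dim
    if covariance_type == "full":
        # FULL: every entry of the dim x dim matrix is learnt
        return dim * dim
    raise ValueError(f"unknown covariance type: {covariance_type!r}")


for dim in (2, 1024):
    diag = covariance_parameter_count(dim, "diag")
    full = covariance_parameter_count(dim, "full")
    print(f"dim={dim}: diag={diag}, full={full}")
```

For a 1024-dimensional embedding, FULL involves 1024 x 1024 = 1,048,576 entries per Gaussian against 1,024 for DIAG, which is why DIAG is the safer choice when data is scarce relative to the dimensionality.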
- class deepview.introspectors.FamiliarityDistribution(*args, **kwargs)[source]¶
The per-response result of a FamiliarityStrategyType. An instance of this represents the distribution for a single layer and can evaluate the contents of a response.
- compute_familiarity_score(x)
Compute and return the Familiarity score for each data point in x.
- Parameters:
x (ndarray) – input data samples to score according to the built distribution
- Returns:
Familiarity score for each data sample
- Return type:
Dimensionality Reduction¶
- class deepview.introspectors.DimensionReduction(_reducers)[source]¶
An introspector that reduces the dimensionality of Batch fields (usually model responses).
Like other introspectors, use DimensionReduction.introspect to instantiate.
- class Strategy
Bundled dimension reduction strategies. See DimensionReductionStrategyType.
The available options are:
PCA – an Incremental PCA algorithm from sklearn that can process data incrementally without accumulating the dataset
StandardPCA – PCA algorithm from sklearn that requires accumulating the full dataset in memory
TSNE – t-SNE algorithm from sklearn that requires accumulating the full dataset in memory
UMAP – the UMAP algorithm from umap-learn that requires accumulating the full dataset in memory
PaCMAP – the PaCMAP algorithm that requires accumulating the full dataset in memory
- class PCA(target_dimensions=2)¶
Principal Component Analysis based dimension reduction using SKLearn IncrementalPCA.
This does not require reading all of the responses into memory to compute the model. A larger batch size will improve the quality of the fit at the cost of additional memory. The incremental approach produces an approximation of PCA, but is documented to be very close and testing backs this up. StandardPCA can be used if exact computation of PCA is necessary.
- Parameters:
target_dimensions – [optional] Target dimensionality of the data.
- class PaCMAP(target_dimensions=2, *, _parameters=None, **kwargs)¶
PaCMAP (Pairwise Controlled Manifold Approximation) is a dimensionality reduction method built with PaCMAP. PaCMAP can be used for visualization, preserving both local and global structure of the data in original space.
This dimension reduction strategy requires reading all of the data into memory before producing the projection. Typically the input data should be reduced from high dimension to low, e.g. 1024 -> 40, before applying PaCMAP.
- Parameters:
target_dimensions – [optional] Target dimensionality of the data.
kwargs – [optional] Any additional PaCMAP keyword args
- class StandardPCA(target_dimensions=2)¶
Principal Component Analysis based dimension reduction using the PCA algorithm from sklearn.
This dimension reduction strategy requires reading all of the data into memory before producing the projection. PCA is preferred for its lower memory use.
- Parameters:
target_dimensions – [optional] Target dimensionality of the data.
- class TSNE(target_dimensions=2, *, _parameters=None, **kwargs)¶
t-distributed Stochastic Neighbor Embedding (t-SNE) using SKLearn t-SNE.
This dimension reduction strategy requires reading all of the data into memory before producing the projection. Typically the input data should be reduced from high dimension to low, e.g. 1024 -> 40, before applying t-SNE.
- Parameters:
target_dimensions – [optional] Target dimensionality of the data.
kwargs – [optional] Any additional SKLearn t-SNE keyword args
- class UMAP(target_dimensions=2, *, _parameters=None, **kwargs)¶
UMAP based dimension reduction using umap-learn.
This dimension reduction strategy requires reading all of the data into memory before producing the projection. Typically the input data should be reduced from high dimension to low, e.g. 1024 -> 40, before applying UMAP.
- Parameters:
target_dimensions – [optional] Target dimensionality of the data.
kwargs – [optional] Any additional umap-learn args.
- Raises:
DeepViewException – if a layer’s response shape does not have exactly 2 dimensions.
- target_dimensions: int = 2¶
The dimension of the space to embed into. This defaults to 2 to provide straightforward visualization, but can reasonably be set to any integer value in the range 2 to 100 (from the umap-learn documentation).
- static introspect(producer, *, strategies, batch_size=None)[source]¶
Perform dimension reduction using training data generated by the producer, and return a DimensionReduction that can perform dimensionality reduction in a pipeline.
The producer must produce 1d vectors, e.g. the Batch will be of dimension BxN. See Flattener if multi-dimensional data is used.
Note: some strategies will need to read all of the response data into memory to fit their model. Currently only the PCA algorithm runs in a streaming fashion.
- Parameters:
producer (Producer) – the source of data to train the dimension reduction on
strategies (DimensionReductionStrategyType | Mapping[str, DimensionReductionStrategyType]) – [keyword arg] which dimension reduction strategy to use, or a mapping from field name to strategy (for running a different dimension reduction per layer).
batch_size (int | None) – [keyword arg, optional] size of batch to read out; this must be >= the target dimension. For some strategies like PCA, a larger value will improve the quality of the dimension reduction. If None, the batch_size is selected automatically.
- Raises:
DeepViewException – if a layer’s response shape does not have exactly 2 dimensions.
DeepViewException – if the batch_size is smaller than the target dimensions.
- Return type:
- OneOrManyDimStrategies¶
alias of Union[DimensionReductionStrategyType, Mapping[str, DimensionReductionStrategyType]]
- class deepview.introspectors.DimensionReductionStrategyType(*args, **kwargs)[source]¶
Strategy for performing dimension reduction on a single layer. This is initialized with the target dimensions.
The fit_incremental() method is called repeatedly for each batch that is processed. When all data has been visited, the fit_complete() method is called. Algorithms that require the full data set in memory may collect values with the first call and then combine and process in fit_complete().
transform() is used to transform high dimensional data into the target dimensions.
- check_batch_size(batch_size)
Validate the batch_size and throw an error if there is an issue.
- Parameters:
batch_size (int) – batch size to validate
- Return type:
- fit_incremental(data)[source]¶
Fit the reducer to the incremental data.
- Parameters:
data (ndarray) – data to fit the reducer to
- Return type:
- property is_one_shot: bool¶
Returns True if this can transform input data via transform(), or if the entire input data set is transformed at once via transform_one_shot().
- transform(data)[source]¶
Transform the given high dimensional data into the target dimensions. See is_one_shot().
- transform_one_shot()[source]¶
Returns the input data transformed per the reducer. See is_one_shot().
- Return type:
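The call order described for DimensionReductionStrategyType (repeated fit_incremental(), one fit_complete(), then transform()) can be illustrated with a minimal stand-alone sketch. `TopVarianceReducer` is a hypothetical class that does not import or subclass anything from deepview; it uses plain variance-based column selection as a deliberately simple stand-in for a real reduction algorithm, and it raises ValueError where the library is documented to raise DeepViewException.

```python
from typing import List, Optional, Sequence

Row = Sequence[float]


class TopVarianceReducer:
    """Hypothetical reducer for one layer: keeps the highest-variance columns."""

    def __init__(self, target_dimensions: int = 2) -> None:
        self.target_dimensions = target_dimensions
        self._rows: List[Row] = []
        self._columns: Optional[List[int]] = None

    @property
    def is_one_shot(self) -> bool:
        # This sketch transforms new data batch by batch via transform().
        return False

    def check_batch_size(self, batch_size: int) -> None:
        if batch_size < self.target_dimensions:
            raise ValueError("batch_size must be >= target_dimensions")

    def fit_incremental(self, data: Sequence[Row]) -> None:
        # Called once per batch: this algorithm needs all data, so only collect.
        self._rows.extend(data)

    def fit_complete(self) -> None:
        # Called after the last batch: rank columns by population variance.
        if not self._rows:
            raise RuntimeError("fit_incremental() was never called with data")
        count = len(self._rows)
        width = len(self._rows[0])
        variances = []
        for c in range(width):
            column = [row[c] for row in self._rows]
            mean = sum(column) / count
            variances.append(sum((v - mean) ** 2 for v in column) / count)
        ranked = sorted(range(width), key=lambda c: variances[c], reverse=True)
        self._columns = sorted(ranked[: self.target_dimensions])

    def transform(self, data: Sequence[Row]) -> List[List[float]]:
        if self._columns is None:
            raise RuntimeError("fit_complete() must be called before transform()")
        return [[row[c] for c in self._columns] for row in data]


reducer = TopVarianceReducer(target_dimensions=2)
reducer.check_batch_size(2)
reducer.fit_incremental([[0.0, 1.0, 10.0], [0.0, 2.0, 20.0]])
reducer.fit_incremental([[0.0, 3.0, 30.0], [0.1, 4.0, 40.0]])
reducer.fit_complete()
reduced = reducer.transform([[9.0, 8.0, 7.0]])
print(reduced)
```

Collecting in fit_incremental() and doing the real work in fit_complete() is the pattern the description above suggests for algorithms that need the full data set in memory.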
- class deepview.introspectors.Duplicates(results, count)[source]¶
Introspector for finding duplicate data in a Producer. This uses an approximate nearest neighbor algorithm to build clusters of nearby samples, Duplicates.DuplicateSetCandidate. Specifically, it uses the Annoy (Approximate Nearest Neighbors Oh Yeah) algorithm.
Like other introspectors, use Duplicates.introspect to instantiate.
- Parameters:
results – do not instantiate Duplicates directly, use Duplicates.introspect
count – do not instantiate Duplicates directly, use Duplicates.introspect
- class DuplicateSetCandidate(std, mean, projection, indices, batch)[source]¶
- Parameters:
- class KNNStrategy[source]¶
Bundled K Nearest Neighbours computation strategies. See
- class KNNAnnoy¶
Strategy for computing duplicates using the Annoy library.
- class KNNFaiss¶
Strategy for computing duplicates using the FAISS library.
- class ThresholdStrategy[source]¶
- class Percentile(percentile)¶
Strategy that determines the closeness threshold by taking the nth percentile distance number in the sorted distances. For example a value of
would use a threshold such that 98.5% of the points were not considered close.
- Parameters:
percentile – n_th percentile to use for “closeness” in the sorted distances
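As a rough illustration of a percentile-based threshold, the sketch below takes the nearest-rank percentile of the sorted distances and treats everything at or below it as close. `percentile_threshold` is a hypothetical helper, and both the ascending sort order and the nearest-rank method are assumptions; the library's own Percentile strategy may count from the other end or interpolate differently.

```python
import math
from typing import List, Sequence, Tuple


def percentile_threshold(distances: Sequence[float], percentile: float) -> Tuple[float, int]:
    """Nearest-rank percentile of the sorted distances (hypothetical helper)."""
    if not distances:
        raise ValueError("distances must not be empty")
    if not 0 < percentile <= 100:
        raise ValueError("percentile must be in (0, 100]")
    ordered = sorted(distances)
    # Nearest-rank: the smallest value with at least `percentile` percent
    # of the data at or below it.
    rank = math.ceil(percentile * len(ordered) / 100)
    index = max(rank, 1) - 1
    return ordered[index], index


distances = [0.9, 0.1, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4, 0.6, 1.0]
threshold, index = percentile_threshold(distances, 20)
close: List[float] = [d for d in distances if d <= threshold]
print(threshold, index, close)
```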
- class Slope(sensitivity=5)¶
Given an array of distances, find the “close” threshold – the distance where points are close to each other.
This strategy determines the closeness threshold dynamically using a sensitivity value. A lower sensitivity (down to 2) will consider more items to be close (less sensitive to the curve of distances). A value of 5 will use a sliding window 1/5 the size of the distance array (related to the size of the dataset) and is a good default. A sensitivity of 20 will use a window 1/20 the size of the distance array and is a reasonable large value.
The distances are likely a sharp up-slope followed by an elbow and finally a long, possibly rising, tail. The target delta will be computed from the difference between the 25th and 75th percentile values. A sliding window will be run over the data with a size of len(distances) // sensitivity to find when the delta in the window exceeds the middle delta. This will approximate the tail end of the elbow.
This returns the threshold value and the index into the distances array where it was found.
- Parameters:
sensitivity – [optional] a lower value considers more items to be close, a larger value considers fewer items to be close.
- Raises:
ValueError – if
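The two quantities that drive the Slope strategy, the sliding window size and the target delta, can be computed as in the sketch below. `slope_window_and_target` is a hypothetical helper: the index-based quartiles and the input validation are assumptions, and the sliding-window search itself is intentionally left out because its exact rules are not spelled out here.

```python
from typing import Sequence, Tuple


def slope_window_and_target(distances: Sequence[float], sensitivity: int = 5) -> Tuple[int, float]:
    """Window size and 25th-to-75th percentile delta (hypothetical helper)."""
    ordered = sorted(distances)
    count = len(ordered)
    # Assumption: reject inputs that would give an empty window.
    if sensitivity < 1 or count < sensitivity:
        raise ValueError("need at least `sensitivity` distances")
    window = count // sensitivity
    # Simple index-based quartiles; the library may compute percentiles differently.
    q25 = ordered[count // 4]
    q75 = ordered[(3 * count) // 4]
    return window, q75 - q25


distances = [float(i) for i in range(100)]
print(slope_window_and_target(distances, sensitivity=5))
print(slope_window_and_target(distances, sensitivity=20))
```

With 100 distances, a sensitivity of 5 gives a window of 20 samples and a sensitivity of 20 gives a window of 5, matching the 1/5 and 1/20 proportions described above.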
- static introspect(producer, *, batch_size=32, strategy=None, threshold=None)[source]¶
Uses an approximate nearest neighbor to build a distance matrix for all samples and build clusters from the closest samples.
Although this works on data of any dimension, the performance is linear in the number of samples in the producer AND the number of dimensions. Consider using DimensionReduction to reduce the number of dimensions before detecting duplicates. If the dimensions are already being reduced for Familiarity, the same reduction can be used here; otherwise a reduction to 40 still gives good results.
The data from the producer is L2 normalized per-column, which helps keep one column from dominating the distance metric. See also this explanation about how and why this is done.
producer = ...  # a Producer of responses
duplicates = Duplicates.introspect(producer)
for response_name, clusters in duplicates.results.items():
    # sort by the mean distance to the centroid
    clusters = sorted(clusters, key=lambda x: x.mean)
    ...
- Parameters:
producer (Producer) – producer of data
batch_size (int) – [optional] size of batch to read while collecting data from the producer
strategy (DuplicatesStrategyType | None) – [optional] strategy to use for finding the nearest neighbors. Default is
threshold (DuplicatesThresholdStrategyType | None) – [optional] strategy to use for finding the distance between points that are considered duplicates. Default is
- Returns:
a Duplicates instance, which contains candidate duplicates for each response name
- Return type:
- results: Mapping[str, Sequence[DuplicateSetCandidate]]¶
Mapping from response name to a list of candidate duplicates.
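The per-column L2 normalization mentioned for Duplicates.introspect can be sketched in a few lines. `l2_normalize_columns` is a hypothetical helper written with plain lists; it is not the library's implementation, and the handling of all-zero columns is an assumption.

```python
import math
from typing import List, Sequence


def l2_normalize_columns(rows: Sequence[Sequence[float]]) -> List[List[float]]:
    """Scale every column to unit L2 norm (hypothetical helper)."""
    if not rows:
        return []
    width = len(rows[0])
    norms = [math.sqrt(sum(row[c] ** 2 for row in rows)) for c in range(width)]
    # Assumption: an all-zero column stays zero instead of dividing by zero.
    return [
        [row[c] / norms[c] if norms[c] > 0 else 0.0 for c in range(width)]
        for row in rows
    ]


data = [[3.0, 100.0], [4.0, 0.0]]
normalized = l2_normalize_columns(data)
print(normalized)
```

Before normalization the second column (values up to 100) would dominate any Euclidean distance; afterwards both columns have unit norm and contribute on the same scale.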
- class deepview.introspectors.DuplicatesStrategyType(*args, **kwargs)[source]¶
Protocol for code that takes an array of vectors (embeddings) and computes a list of duplicates for each point.
Dataset Report¶
- class deepview.introspectors.DatasetReport(data, _report_save_data_path=PosixPath('report_save_data.pkl'))[source]¶
A report built to inspect a dataset for a given model from the perspective of fairness.
Like other introspectors, use DatasetReport.introspect to instantiate, or load a saved report using DatasetReport.from_disk.
This report is particularly useful for introspecting datasets that have various class labels attached. See overall DatasetReport page in docs to learn more.
The following components can be run (default to all), configured using a ReportConfig:
- Summarize overall dataset, including by metadata labels, if they exist
- Find near duplicate data samples, see Duplicates
- Find most / least representative data overall and per metadata label, see Familiarity
- Project the data down to visualize overall in a 2D scatterplot
The input Producer to this class’s instantiation is expected to have fields of model responses (likely a layer towards the end of the model but not the last response). These responses can come either from loading data and running it through a DeepView Model, or by loading the responses directly from file into a Producer. In each Batch's metadata, this report looks for identifiers and optional labels attached as metadata using Batch.StdKeys.IDENTIFIER metadata keys.
Note
For the moment, the identifier should be a path to the image data.
This class creates a DataFrame full of the data needed to build the UI for the DatasetReport, which can then be exported into a standalone static site to explore. The different components built in the UI interact with each other.
# Build all components of the dataset report using default configuration.
# This output can then be used to visualize the results with Canvas:
# (1) as a standalone web dashboard to explore interactively
# (2) inline in a Jupyter notebook to explore interactively
# Please see the Canvas documentation for an example.
report = DatasetReport.introspect(producer)
- Parameters:
data – do not instantiate DatasetReport directly, use DatasetReport.introspect
- data: DataFrame¶
DataFrame of introspection results for responses and report components
- static from_disk(directory)[source]¶
Load a DatasetReport object from a report save directory
- static introspect(producer, *, config=None, batch_size=1024)[source]¶
Build relevant DatasetReport components from input Producer.
- Parameters:
producer (Producer) – response producer (separate caching not needed as responses are cached in this function)
config (ReportConfig | None) – [keyword arg, optional] ReportConfig. Set components to None to omit them from report.
batch_size (int) – [keyword arg, optional] number of samples to batch at once
- Returns:
a DatasetReport whose results can be exported into different formats
- Return type:
- class deepview.introspectors.ReportConfig(projection=<factory>, duplicates=<factory>, familiarity=<factory>, dim_reduction=None, split_familiarity_min=50)[source]¶
Configuration for which components to build into the DatasetReport, and what strategies to use to build those components. Default config corresponds to running all components with default strategies (projection, duplicates, and familiarity).
When running familiarity, “split” familiarity is also run, which means that a familiarity model is built for each label, for each label category, and then that subgroup of data is evaluated according to the model.
- Parameters:
projection (DimensionReductionStrategyType | Mapping[str, DimensionReductionStrategyType] | None) – [optional] see projection
duplicates (DuplicatesThresholdStrategyType | None) – [optional] see duplicates
familiarity (FamiliarityStrategyType | None) – [optional] see familiarity
dim_reduction (DimensionReductionStrategyType | Mapping[str, DimensionReductionStrategyType] | None) – [optional] see dim_reduction
split_familiarity_min (int) – [optional] see split_familiarity_min
- dim_reduction: DimensionReductionStrategyType | Mapping[str, DimensionReductionStrategyType] | None = None¶
Dimension reduction applied before running familiarity and/or projection. If None, a default is used; else provide a DimensionReduction.Strategy.
- duplicates: DuplicatesThresholdStrategyType | None¶
Omitted from the report if None, else provide a Duplicates.ThresholdStrategy (default is Slope).
- familiarity: FamiliarityStrategyType | None¶
Omitted from the report if None, else provide a Familiarity.Strategy to apply to overall and split familiarity.
- property n_stages: int¶
How many stages of multi introspect need to be run (not counting stub introspectors)
- projection: DimensionReductionStrategyType | Mapping[str, DimensionReductionStrategyType] | None¶
Omitted from the report if None, else provide a DimensionReduction.Strategy that projects down to 2 dimensions, for visualization (default is DimensionReduction.Strategy.UMAP).
- split_familiarity_min: int = 50¶
If running familiarity, the minimum amount of data that must exist per label for fitting individual models to subgroups of data determined by label (“split” familiarity).
Model Introspectors¶
Principal Filter Analysis¶
- class deepview.introspectors.PFA(failed_responses, _covariance_result_by_response)[source]¶
Like other introspectors, use PFA.introspect to instantiate.
Use PFA to discover highly correlated filter (or, more generically, unit) responses within layers of a neural network. Exploit data to guide network compression in order to decrease inference time and memory footprint while improving generalization. See the DeepView docs for more information.
- Parameters:
failed_responses – do not instantiate PFA directly, use PFA.introspect
- class Strategy[source]¶
Bundled PFA strategies. To implement a custom strategy, see PFAStrategyType.
- class Energy(energy_threshold, min_kept_count=0)
Energy strategy for generating PFA recipes – this targets a given spectral energy to keep.
- Parameters:
energy_threshold – The spectral energy to keep
min_kept_count – [optional] The minimum number of outputs to keep per response
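The idea of keeping a given share of spectral energy can be sketched as follows. `units_for_energy` is a hypothetical helper, not the library's Energy strategy; it assumes energy_threshold is a fraction in (0, 1] of the summed eigenvalues, which is not stated above.

```python
from typing import Sequence


def units_for_energy(
    eigenvalues: Sequence[float],
    energy_threshold: float,
    min_kept_count: int = 0,
) -> int:
    """Smallest number of leading eigenvalues reaching the energy share (hypothetical)."""
    if not 0 < energy_threshold <= 1:
        raise ValueError("energy_threshold is assumed to be a fraction in (0, 1]")
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    if total <= 0:
        raise ValueError("eigenvalues must have a positive sum")
    kept = 0
    running = 0.0
    for value in ordered:
        running += value
        kept += 1
        if running / total >= energy_threshold:
            break
    # Never keep fewer than min_kept_count, and never more than exist.
    return min(max(kept, min_kept_count), len(ordered))


eigenvalues = [4.0, 3.0, 2.0, 1.0]
print(units_for_energy(eigenvalues, 0.75))
print(units_for_energy(eigenvalues, 0.3))
print(units_for_energy(eigenvalues, 0.3, min_kept_count=2))
```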
- class KL(interpolation_function=None)¶
KL strategy for generating PFA recipes.
- Parameters:
interpolation_function – [optional] the interpolation function to use, see KLInterpolationFunction
- class KLInterpolationFunction(*args, **kwargs)¶
A protocol to map a KL divergence to the ratio of the number of units in the layer. The KL divergence is that between the distribution of eigenvalues of the covariance matrix of model responses and the uniform distribution.
- class LinearInterpolation(*args, **kwargs)¶
A concrete KLInterpolationFunction that performs its intended mapping by linearly interpolating [kl_divergence, max_kl_divergence] to [0, 1]
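The quantity that a KL interpolation function maps, the KL divergence between the eigenvalue distribution and the uniform distribution, can be sketched as below. Both helpers are hypothetical: the direction KL(p || uniform) is an assumption, and `normalized_kl` only rescales the divergence linearly into [0, 1]. How the library turns that value into a ratio of units to keep is not shown here.

```python
import math
from typing import Sequence


def kl_to_uniform(eigenvalues: Sequence[float]) -> float:
    """KL(p || uniform), where p is the eigenvalues normalized to sum to 1 (assumed direction)."""
    count = len(eigenvalues)
    total = sum(eigenvalues)
    if count == 0 or total <= 0:
        raise ValueError("need eigenvalues with a positive sum")
    divergence = 0.0
    for value in eigenvalues:
        p = value / total
        if p > 0:
            # Uniform probability is 1 / count, so p / q == p * count.
            divergence += p * math.log(p * count)
    return divergence


def normalized_kl(eigenvalues: Sequence[float]) -> float:
    """Linearly rescale the divergence from [0, log(count)] to [0, 1]."""
    count = len(eigenvalues)
    if count < 2:
        return 0.0
    return kl_to_uniform(eigenvalues) / math.log(count)


print(normalized_kl([1.0, 1.0, 1.0, 1.0]))  # uncorrelated units: flat spectrum
print(normalized_kl([5.0, 0.0, 0.0, 0.0]))  # fully correlated units: one dominant eigenvalue
```

A flat spectrum gives a divergence of zero and a single dominant eigenvalue gives the maximum, log(count), so the normalized value spans the [0, 1] interval between those two extremes.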
- class Size(relative_size, min_kept_count=0, epsilon_energy=1e-08)¶
Size strategy for generating PFA recipes – this targets a given relative size by finding a cross-layer energy threshold that produces that result.
- Parameters:
relative_size – The relative amount of channels to keep (in 0..1)
min_kept_count – [optional] The minimum number of output to keep per response
epsilon_energy – [optional] Minimum level of energy
- class UnitSelectionStrategy[source]¶
Strategy for selecting the maximally correlated units. To implement a custom strategy, see PFAUnitSelectionStrategyType.
- class AbsMax
Given a correlation matrix, choose units based on the one with the greatest coefficient
- distance: _DirectionalDistanceCalculation¶
Distance function
- class AbsMin¶
Given a correlation matrix, choose units based on the one with the lowest coefficient
- distance: _DirectionalDistanceCalculation¶
Distance function
- class L1Max¶
Given a correlation matrix, choose units based on the one with the greatest L1 norm
- distance: _DirectionalDistanceCalculation¶
Distance function
- class L1Min¶
Given a correlation matrix, choose units based on the one with the lowest L1 norm
- distance: _DirectionalDistanceCalculation¶
Distance function
- class VisType[source]¶
Type of visualization modality for PFA, available to visualize via PFA.show
- failed_responses: Sequence[str]¶
The names of any responses that failed to generate output. This is caused by layers with insufficient data to support the analysis.
- get_recipe(*, strategy=None, unit_strategy=None)[source]¶
Generate a recipe using the given algorithm and unit strategy. For more information refer to the PFA documentation page.
- Parameters:
strategy (PFAStrategyType | None) – [keyword arg, optional] The algorithm to use, see PFA.Strategy. The default value is PFA.Strategy.KL
unit_strategy (PFAUnitSelectionStrategyType | None) – [keyword arg, optional] the PFA.UnitSelectionStrategy to use, default is PFA.UnitSelectionStrategy.L1Max
- Returns:
a mapping from response name to PFARecipe for the given algorithm and unit strategy.
- Return type:
- static introspect(producer, *, batch_size=32, epsilon_inactive=1e-08)[source]¶
Perform Principal Filter Analysis on the responses generated by the producer.
The responses generated by the producer are assumed to be 2D (Batch x C). Thus it might be necessary to pipeline together the Producer with a Processor that transforms each individual response from multi-dimensional to mono-dimensional.
- Parameters:
producer (Producer) – The producer of the responses to be analyzed
batch_size (int) – [keyword arg, optional] the batch size to use when consuming the responses
epsilon_inactive (float) – [keyword arg, optional] factor used to identify inactive units (whose var < epsilon_inactive * np.max(var))
- Returns:
an instance of PFA that can generate PFARecipes using a PFAStrategyType.
- Return type:
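The inactive-unit rule quoted for epsilon_inactive (var < epsilon_inactive * np.max(var)) can be sketched without any dependencies. `inactive_units` is a hypothetical helper that works on a small Batch x C list of lists and uses the population variance, which is an assumption.

```python
from typing import List, Sequence


def inactive_units(
    responses: Sequence[Sequence[float]],
    epsilon_inactive: float = 1e-08,
) -> List[int]:
    """Indices of units whose variance is below epsilon_inactive * max variance (hypothetical)."""
    count = len(responses)
    width = len(responses[0])
    variances = []
    for c in range(width):
        column = [row[c] for row in responses]
        mean = sum(column) / count
        variances.append(sum((v - mean) ** 2 for v in column) / count)
    cutoff = epsilon_inactive * max(variances)
    return [c for c, variance in enumerate(variances) if variance < cutoff]


# Three inputs (rows) and three units (columns); the middle unit never changes.
responses = [
    [1.0, 0.5, 0.0],
    [2.0, 0.5, 3.0],
    [3.0, 0.5, 6.0],
]
print(inactive_units(responses))
```

Because the cutoff is relative to the largest variance in the layer, a unit is flagged only when it is nearly constant compared with the most active unit, not when the whole layer has a small scale.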
- static show(recipe_result, *, vis_type='table', include_columns=None, exclude_columns=None)[source]¶
Create table or chart to visualize PFA results in iPython / Jupyter notebook.
Note: Requires pandas (for the table vis_type) or matplotlib (for the chart vis_type), which can be installed with pip install "deepview[notebook]"
- Parameters:
recipe_result (Mapping[str, PFARecipe] | Collection[Mapping[str, PFARecipe]]) – result of PFA.get_recipe, mapping of layer to PFARecipe. When plotting for vis_type PFA.VisType.CHART, a sequence of t.Mapping[str, PFARecipe] can be passed in to compare multiple results.
vis_type (str) – [keyword arg, optional] determines visualization type. PFA.VisType.TABLE for pandas dataframe result or PFA.VisType.CHART for matplotlib pyplot of recommended vs. original unit counts
include_columns (Sequence[str] | None) – [keyword arg, optional] For PFA.VisType.TABLE only. If included, only return pandas.DataFrame with these columns. Defaults to include all columns (value None). Options are: [layer name, original count, recommended count, units to keep, KL divergence, PFA strategy, units ratio, kept energy].
exclude_columns (Sequence[str] | None) – [keyword arg, optional] For PFA.VisType.TABLE only. If included, return pandas.DataFrame without these columns (irrelevant if include_columns is specified). Defaults to None. Options are: [layer name, original count, recommended count, units to keep, KL divergence, PFA strategy, units ratio, kept energy].
- Returns:
table or chart of PFA results from input recipe_result
- Return type:
- class deepview.introspectors.PFAKLDiagnostics(kl_divergence, units_ratio)[source]¶
Diagnostic information for
- Parameters:
kl_divergence (float) – see kl_divergence
units_ratio (float) – see units_ratio
- class deepview.introspectors.PFAEnergyDiagnostics(total_kept_energy)[source]¶
Diagnostic information for
- Parameters:
total_kept_energy (float) – see total_kept_energy
- class deepview.introspectors.PFARecipe(original_output_count, recommended_output_count, maximally_correlated_units, number_inactive_units, diagnostics)[source]¶
Recommendation about a specific model response. This will likely never be instantiated directly, and instead an instance will be returned from PFA.get_recipe.
- Parameters:
original_output_count (int) –
recommended_output_count (int) –
number_inactive_units (int) –
diagnostics (PFAKLDiagnostics | PFAEnergyDiagnostics | None) –
- diagnostics: PFAKLDiagnostics | PFAEnergyDiagnostics | None¶
Per algorithm diagnostic information
Maximally correlated units found with this recommendation.
- class deepview.introspectors.PFAUnitSelectionStrategyType(*args, **kwargs)[source]¶
Given a correlation matrix and a number of units to keep, choose which units are maximally correlated.
- __call__(covariances, *, num_units_to_keep)[source]¶
- Parameters:
covariances (PFACovariancesResult) – the covariance data for the layer
num_units_to_keep (int) – [keyword arg, optional] number of recommended units to be kept
- Returns:
the list of indexes of the units that are maximally correlated (the first part of the list contains the indices of the inactive units). The number of inactive units can be found in covariances.inactive_units.shape[0]
- Return type:
- class deepview.introspectors.PFAStrategyType(*args, **kwargs)[source]¶
Protocol for PFA strategies (PFA.Strategy). These examine per-layer PFACovariancesResult and produce per-layer PFARecipe.
This takes all layers and produces a result for each of the layers, but the algorithm operates on each layer independently.
- __call__(covariances)[source]¶
- Parameters:
covariances (Mapping[str, PFACovariancesResult]) – mapping from layer (response) name to PFACovariancesResult for that layer
- Return type:
- class deepview.introspectors.PFACovariancesResult(covariances, eigenvalues, eigenvectors, original_output_count, inactive_units)[source]¶
Encapsulates the results of the covariance calculation
- Parameters:
covariances (ndarray) – see covariances
eigenvalues (ndarray) – see eigenvalues
eigenvectors (ndarray) – see eigenvectors
original_output_count (int) – see original_output_count
inactive_units (ndarray) – see inactive_units
- covariances: ndarray¶
The covariances matrix. This is a two dimensional square array of size
- eigenvalues: ndarray¶
The eigenvalues of the covariances. This is a one dimensional array of size
- eigenvectors: ndarray¶
The eigenvectors of the covariances. This is a two dimensional square array of size
Inactive Unit Analysis¶
- class deepview.introspectors.IUA(_layer_counts, _unit_counts, _total_probe_counts)[source]¶
An introspector that evaluates responses to compute inactive unit statistics.
Like other introspectors, use IUA.introspect to instantiate.
- class Result(mean_inactive, std_inactive, inactive, unit_inactive_count, unit_inactive_proportion)
Inactive Unit Analysis result.
- Parameters:
mean_inactive (float) – see mean_inactive
std_inactive (float) – see std_inactive
unit_inactive_count (Sequence[float]) – see unit_inactive_count
unit_inactive_proportion (Sequence[float]) – see unit_inactive_proportion
- inactive: Sequence[float]¶
sequence tracking the number of inactive units in the layer per batch input, used to compute mean_inactive and std_inactive
- class VisType[source]¶
Type of visualization modality for IUA, available to visualize via IUA.show
- static introspect(producer, *, batch_size=32, rtol=1e-05, atol=1e-08)[source]¶
Compute inactive unit statistics (mean, standard deviation, counts, and unit frequency) for each layer in the input producer of model responses.
- Parameters:
producer (Producer) – The producer of the model responses to be introspected
batch_size (int) – [keyword arg, optional] number of inputs to pull from the producer at a time
rtol (float) – [keyword arg, optional] float relative tolerance parameter
atol (float) – [keyword arg, optional] float absolute tolerance parameter
- Returns:
IUA instance that can provide information about inactive units in the model
- Return type:
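The statistics listed for IUA.introspect can be approximated with a short stand-alone sketch. `inactive_unit_stats` is a hypothetical helper: it assumes a unit is inactive when its response is within atol + rtol * |0| of zero, and it uses the population standard deviation; the library's exact comparison and estimator are not documented here.

```python
import statistics
from typing import List, Sequence, Tuple


def inactive_unit_stats(
    responses: Sequence[Sequence[float]],
    rtol: float = 1e-05,
    atol: float = 1e-08,
) -> Tuple[List[int], float, float, List[int]]:
    """Per-input inactive counts, their mean and std, and per-unit counts (hypothetical)."""
    target = 0.0
    # Assumed closeness test: |value - target| <= atol + rtol * |target|.
    tolerance = atol + rtol * abs(target)

    def is_inactive(value: float) -> bool:
        return abs(value - target) <= tolerance

    per_input = [sum(1 for value in row if is_inactive(value)) for row in responses]
    width = len(responses[0])
    per_unit = [sum(1 for row in responses if is_inactive(row[c])) for c in range(width)]
    return per_input, statistics.fmean(per_input), statistics.pstdev(per_input), per_unit


# Three inputs (rows) for a layer with four units (columns).
responses = [
    [0.0, 1.2, 0.0, 3.4],
    [0.0, 0.0, 0.0, 0.1],
    [5.0, 1e-12, 2.0, 0.7],
]
inactive, mean_inactive, std_inactive, unit_inactive_count = inactive_unit_stats(responses)
print(inactive, mean_inactive, std_inactive, unit_inactive_count)
```

The names of the returned values mirror the fields of IUA.Result, but the correspondence is only illustrative.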
- property results: Mapping[str, Result]¶
A per-layer Result encapsulating Inactive Unit Analysis results.
- static show(iua, *, vis_type='table', response_names=None)[source]¶
Create table or chart to visualize IUA results in iPython / Jupyter notebook.
Note: Requires pandas (for the table vis_type) or matplotlib (for the chart vis_type), which can be installed with pip install "deepview[notebook]"
- Parameters:
iua (IUA) – result of IUA.introspect, instance of IUA
vis_type (str) – [keyword arg, optional] determines visualization type. IUA.VisType.TABLE for pandas dataframe result or IUA.VisType.CHART for matplotlib pyplot of inactive units
response_names (Sequence[str] | None) – [keyword arg, optional] For IUA.VisType.CHART vis. Sequence of response names to visualize (defaults to None for showing all responses)
- Returns:
table or chart of IUA results
- Return type: