
Introspection is the examination of the activations of a neural network as data passes through. Introspecting networks and data can help improve an ML pipeline’s efficiency, robustness, and fairness.

DeepView Introspectors are the core algorithms of DeepView. You can see all available introspectors on the following pages:

As noted previously, DeepView uses lazy evaluation, so each Introspector has an .introspect() method that triggers the Producers to generate data and the pipelines to consume and process it. This is demonstrated in the diagram below.

An animated diagram illustrating the DeepView pipeline. A single batch at a time is fed through the entire pipeline, from Producer to Introspector.
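
The batch-at-a-time behavior described above can be illustrated with plain Python generators. This is only a minimal sketch of the general idea and does not use the DeepView API: the names producer, scale, and introspect are hypothetical stand-ins for a Producer, a pipeline stage, and an Introspector.

```python
from typing import Iterable, Iterator, List

calls: List[str] = []  # records the order in which work actually happens


def producer(n_batches: int) -> Iterator[List[float]]:
    """Hypothetical producer: yields one small batch at a time."""
    for i in range(n_batches):
        calls.append(f"produce {i}")
        yield [float(i), float(i) + 0.5]


def scale(batches: Iterable[List[float]], factor: float) -> Iterator[List[float]]:
    """Hypothetical pipeline stage: transforms each batch as it arrives."""
    for i, batch in enumerate(batches):
        calls.append(f"scale {i}")
        yield [x * factor for x in batch]


def introspect(batches: Iterable[List[float]]) -> float:
    """Hypothetical introspector: consuming the batches drives the pipeline."""
    total, count = 0.0, 0
    for batch in batches:
        total += sum(batch)
        count += len(batch)
    return total / count


lazy_pipeline = scale(producer(2), factor=2.0)
work_before = list(calls)  # still empty: building the pipeline ran nothing
mean_value = introspect(lazy_pipeline)
```

After the call to introspect, calls shows the stages interleaved (produce 0, scale 0, produce 1, scale 1): each batch travels through the whole pipeline before the next one is produced.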

Exploring Results

To explore the result of an introspection, DeepView’s network introspectors (PFA, IUA) have a built-in .show() method that can be run in a Jupyter notebook to view the results live. Pass the result of the .introspect() call as the first argument to .show().

For instance, an example for IUA:

iua_result = IUA.introspect(producer, batch_size=64)  # introspect!

# Show inactive unit analysis results (with default params)
IUA.show(iua_result)

A Pandas DataFrame showing the IUA result per layer, consisting of the mean and standard deviation of inactive units.

The results of DeepView data introspectors can be visualized and explored in different ways. If the data introspectors are run as part of the Dataset Report, the introspection results can be fed directly to the Canvas Framework, which is part of the DeepView ToolKit, and explored interactively there. If an introspector is run outside of the Dataset Report, the DeepView notebook examples show one of many possible ways each result may be visualized.

Best Practices

Preparing Inputs for Introspectors

There are various ways in which DeepView introspection can be tailored for different use cases. Here are some common things for users to think about:

  • Which intermediate layer(s) to extract model responses from

  • Whether to attach metadata to batches (e.g., labels and unique IDs), for instance to refer back to the original data samples with a unique identifier

  • Whether to pool responses or reduce dimensionality before running model responses through the introspector
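
To make the last two considerations concrete, here is a minimal sketch in plain Python rather than the DeepView API. It assumes a hypothetical batch stored as a dictionary with "ids", "labels", and "responses" keys, and a hypothetical helper global_average_pool that averages a height x width x channels response over its spatial positions.

```python
from statistics import fmean
from typing import Dict, List

# One model response in height x width x channels (HWC) layout.
Response = List[List[List[float]]]


def global_average_pool(response: Response) -> List[float]:
    """Average over all spatial positions, leaving one value per channel."""
    positions = [pixel for row in response for pixel in row]
    n_channels = len(positions[0])
    return [fmean(p[c] for p in positions) for c in range(n_channels)]


# Hypothetical batch: metadata travels alongside the responses.
batch: Dict[str, list] = {
    "ids": ["img-0001"],
    "labels": ["cat"],
    "responses": [
        [[[1.0, 2.0, 3.0], [3.0, 4.0, 5.0]],
         [[5.0, 6.0, 7.0], [7.0, 8.0, 9.0]]],
    ],
}

# Key each pooled response by its unique ID so that results can be traced
# back to the original data sample.
pooled = {
    sample_id: global_average_pool(response)
    for sample_id, response in zip(batch["ids"], batch["responses"])
}
```

Here a 2 x 2 x 3 response (12 values) is reduced to 3 values, one per channel, and the unique ID makes it possible to look up the source sample for any result.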

Selecting Model Responses

To use an introspector, responses from certain intermediate layer(s) of the network are typically used rather than the final outputs (or predictions). These layers are requested by name, which requires finding the correct layer names. It’s possible to inspect a dictionary of responses with the model’s response_infos attribute:

model = ... # load model here, e.g. with load_tf_model_from_path
response_infos = model.response_infos
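
Once a dictionary of responses is available, candidate layer names can be narrowed down by filtering its keys. The sketch below uses an ordinary dictionary as a hypothetical stand-in (names mapped to shapes); the actual values stored in response_infos may be structured differently, and the layer names shown are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a mapping from response name to response shape.
# These names and shapes are illustrative, not taken from a real model.
example_infos: Dict[str, Tuple[int, ...]] = {
    "input_1": (224, 224, 3),
    "conv_1": (112, 112, 32),
    "conv_2": (56, 56, 64),
    "predictions": (1000,),
}


def names_matching(infos: Dict[str, Tuple[int, ...]], keyword: str) -> List[str]:
    """Return the response names that contain the given keyword."""
    return [name for name in infos if keyword in name]


conv_layers = names_matching(example_infos, "conv")
```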

DeepView also provides a utility for finding the input layers of a Model:

model = ... # load model here, e.g. with load_tf_model_from_path
input_layers = model.input_layers
input_layer_names = list(input_layers.keys())

Caching responses from pipelines

When running DeepView in a Jupyter notebook, a good rule of thumb is to cache (temporarily store on disk) responses at the point in the pipeline after which it doesn’t make sense to recompute everything each time the pipeline is processed (e.g. via introspect). This can be done by adding a Cacher as a PipelineStage. For instance:

from deepview_tensorflow import TFDatasetExamples, TFModelExamples
from deepview.base import pipeline
from deepview.introspectors import Familiarity
from deepview.processors import ImageResizer, Cacher

# Load data, model, and set up batch pipeline
cifar10 = TFDatasetExamples.CIFAR10()
mobilenet = TFModelExamples.MobileNet()
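response_producer = pipeline(
     cifar10,
     ImageResizer(pixel_format=ImageResizer.Format.HWC, size=(224, 224)),
     # Run MobileNet inference; the layer name here is an assumed example,
     # choose one from the model's response_infos
     mobilenet(requested_responses=['conv_pw_13']),

     # Cache responses from MobileNet inference
     Cacher(),
)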

In this code, the CIFAR10 dataset will only be pulled through the MobileNet model a single time, regardless of how many times response_producer is used later. The response_producer can then be fed to various introspectors, or used for post-processing by creating new pipelines with response_producer as the producer. It is up to the user to decide whether caching will use significant space on their machine and whether it is worth the speed-up. For instance, caching a single model response per data sample (caching after model inference) will take up less space than caching large video data samples before model inference.
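
The compute-once, replay-later idea behind caching can be sketched in plain Python. This is not how DeepView's Cacher is implemented; DiskCache and expensive_inference are hypothetical names, and for brevity the sketch writes all batches to a single pickle file instead of streaming them one at a time.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Callable, Iterator, List

Batch = List[float]
inference_runs = 0  # counts how often the expensive stage actually executes


def expensive_inference() -> Iterator[Batch]:
    """Hypothetical stand-in for model inference over three batches."""
    global inference_runs
    for i in range(3):
        inference_runs += 1
        yield [float(i), float(i) ** 2]


class DiskCache:
    """Hypothetical cache stage: stores batches on the first pass, replays them later."""

    def __init__(self, source: Callable[[], Iterator[Batch]], directory: Path) -> None:
        self._source = source
        self._path = directory / "batches.pkl"

    def __iter__(self) -> Iterator[Batch]:
        if not self._path.exists():
            # First pass: run the expensive stage once and store its output.
            with self._path.open("wb") as f:
                pickle.dump(list(self._source()), f)
        # Every pass (including the first) reads batches back from disk.
        with self._path.open("rb") as f:
            yield from pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    cached = DiskCache(expensive_inference, Path(tmp))
    first_pass = list(cached)   # triggers inference and writes the cache
    second_pass = list(cached)  # served from disk, no inference
    cache_bytes = (Path(tmp) / "batches.pkl").stat().st_size
```

Both passes return the same batches, but inference_runs stays at 3, one per batch. The value of cache_bytes is the disk cost paid for that speed-up, which is the trade-off described above.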

For a list of available pipeline stage objects, see the Batch Processors section.