Note

This page was generated from a Jupyter notebook. The original can be downloaded from here.

Principal Filter Analysis (PFA): Compression Basic Example for MobileNet on CIFAR-10¶

This notebook provides an example of how to apply PFA to a model (MobileNet) using data from CIFAR-10 and obtain the recipes that PFA recommends to follow in order to re-architect the network and obtain a smaller model while trying to preserve accuracy.

1. Use DeepView to run inference and collect responses from a model¶

In order to run PFA, it’s necessary to run inference using some data and collect the responses from the layers to analyze and compress. See the docs for more information about how to load a model into DeepView.

[1]:

import pandas as pd

from deepview.base import pipeline, ImageFormat, ResponseInfo
from deepview_tensorflow import TFModelExamples, TFDatasetExamples
from deepview.processors import ImageResizer, Pooler

from deepview.exceptions import enable_deprecation_warnings
enable_deprecation_warnings(error=True)  # treat DeepView deprecation warnings as errors

1.a Download a model, MobileNet, and store it locally¶

This will be the model to analyze with PFA. TFModelExamples is used to load the model; under the hood, it loads the model from memory using deepview_tensorflow.load_tf_model_from_memory.

[2]:

# Load MobileNet
mobilenet = TFModelExamples.MobileNet()

1.b Find input layers¶

The name of the input placeholder is needed to tell DeepView where data needs to be fed. The following list comprehension filters all layers to find the input name.

In the output below, the input name is input_layer; this information is used later, when inference is run.

[3]:

possible_input_layers = [
    info.name
    for info in mobilenet.response_infos.values()
    if info.layer.kind is ResponseInfo.LayerKind.PLACEHOLDER
    and 'input' in info.name
]

print(possible_input_layers)

['input_layer']

1.c Find Convolution layers¶

As in the prior cell, all layers are filtered here in search of Conv2D layers. This is the set of layers to analyze using PFA. If the list of names of the layers to analyze is already known, it can be passed directly to the loaded model (see step 1.f).

Notice that the output layer (whose name is conv_preds) is excluded, since its size is determined by the number of classes of this problem.

[4]:

conv2d_responses = [
    info.name
    for info in mobilenet.response_infos.values()
    if info.layer.kind == ResponseInfo.LayerKind.CONV_2D
    and 'preds' not in info.name
]

for name in conv2d_responses:
    info = mobilenet.response_infos[name]
    print(f"{info.name} {info.shape}")

conv1 (None, 112, 112, 32)
conv1_bn (None, 112, 112, 32)
conv1_relu (None, 112, 112, 32)
conv_dw_1 (None, 112, 112, 32)
conv_dw_1_bn (None, 112, 112, 32)
conv_dw_1_relu (None, 112, 112, 32)
conv_pw_1 (None, 112, 112, 64)
conv_pw_1_bn (None, 112, 112, 64)
conv_pw_1_relu (None, 112, 112, 64)
conv_pad_2 (None, 113, 113, 64)
conv_dw_2 (None, 56, 56, 64)
conv_dw_2_bn (None, 56, 56, 64)
conv_dw_2_relu (None, 56, 56, 64)
conv_pw_2 (None, 56, 56, 128)
conv_pw_2_bn (None, 56, 56, 128)
conv_pw_2_relu (None, 56, 56, 128)
conv_dw_3 (None, 56, 56, 128)
conv_dw_3_bn (None, 56, 56, 128)
conv_dw_3_relu (None, 56, 56, 128)
conv_pw_3 (None, 56, 56, 128)
conv_pw_3_bn (None, 56, 56, 128)
conv_pw_3_relu (None, 56, 56, 128)
conv_pad_4 (None, 57, 57, 128)
conv_dw_4 (None, 28, 28, 128)
conv_dw_4_bn (None, 28, 28, 128)
conv_dw_4_relu (None, 28, 28, 128)
conv_pw_4 (None, 28, 28, 256)
conv_pw_4_bn (None, 28, 28, 256)
conv_pw_4_relu (None, 28, 28, 256)
conv_dw_5 (None, 28, 28, 256)
conv_dw_5_bn (None, 28, 28, 256)
conv_dw_5_relu (None, 28, 28, 256)
conv_pw_5 (None, 28, 28, 256)
conv_pw_5_bn (None, 28, 28, 256)
conv_pw_5_relu (None, 28, 28, 256)
conv_pad_6 (None, 29, 29, 256)
conv_dw_6 (None, 14, 14, 256)
conv_dw_6_bn (None, 14, 14, 256)
conv_dw_6_relu (None, 14, 14, 256)
conv_pw_6 (None, 14, 14, 512)
conv_pw_6_bn (None, 14, 14, 512)
conv_pw_6_relu (None, 14, 14, 512)
conv_dw_7 (None, 14, 14, 512)
conv_dw_7_bn (None, 14, 14, 512)
conv_dw_7_relu (None, 14, 14, 512)
conv_pw_7 (None, 14, 14, 512)
conv_pw_7_bn (None, 14, 14, 512)
conv_pw_7_relu (None, 14, 14, 512)
conv_dw_8 (None, 14, 14, 512)
conv_dw_8_bn (None, 14, 14, 512)
conv_dw_8_relu (None, 14, 14, 512)
conv_pw_8 (None, 14, 14, 512)
conv_pw_8_bn (None, 14, 14, 512)
conv_pw_8_relu (None, 14, 14, 512)
conv_dw_9 (None, 14, 14, 512)
conv_dw_9_bn (None, 14, 14, 512)
conv_dw_9_relu (None, 14, 14, 512)
conv_pw_9 (None, 14, 14, 512)
conv_pw_9_bn (None, 14, 14, 512)
conv_pw_9_relu (None, 14, 14, 512)
conv_dw_10 (None, 14, 14, 512)
conv_dw_10_bn (None, 14, 14, 512)
conv_dw_10_relu (None, 14, 14, 512)
conv_pw_10 (None, 14, 14, 512)
conv_pw_10_bn (None, 14, 14, 512)
conv_pw_10_relu (None, 14, 14, 512)
conv_dw_11 (None, 14, 14, 512)
conv_dw_11_bn (None, 14, 14, 512)
conv_dw_11_relu (None, 14, 14, 512)
conv_pw_11 (None, 14, 14, 512)
conv_pw_11_bn (None, 14, 14, 512)
conv_pw_11_relu (None, 14, 14, 512)
conv_pad_12 (None, 15, 15, 512)
conv_dw_12 (None, 7, 7, 512)
conv_dw_12_bn (None, 7, 7, 512)
conv_dw_12_relu (None, 7, 7, 512)
conv_pw_12 (None, 7, 7, 1024)
conv_pw_12_bn (None, 7, 7, 1024)
conv_pw_12_relu (None, 7, 7, 1024)
conv_dw_13 (None, 7, 7, 1024)
conv_dw_13_bn (None, 7, 7, 1024)
conv_dw_13_relu (None, 7, 7, 1024)
conv_pw_13 (None, 7, 7, 1024)
conv_pw_13_bn (None, 7, 7, 1024)
conv_pw_13_relu (None, 7, 7, 1024)

1.d Create a DeepView Dataset wrapping CIFAR-10¶

Download CIFAR-10 data and only use 2000 images for the example, so inference is faster.

In order to be able to use CIFAR-10 in DeepView, it’s necessary to wrap the data into a Producer (if these were normal images, ImageProducer could be used with an input path).

Moreover, MobileNet accepts images of size 224x224, so the CIFAR-10 images need to be pre-processed by resizing them from 32x32 to 224x224. DeepView provides a set of processors in the module deepview.processors. ImageResizer is used and passed into a new pipeline that applies such pre-processing on top of CIFAR-10.

Note: TFDatasetExamples are used here to load the data. To learn how to write a custom Producer that conforms to the Producer protocol, see load data in the documentation.

[5]:

# Load CIFAR10 from DeepView's TF examples
cifar10_dataset = TFDatasetExamples.CIFAR10(max_samples=2000)
cifar10_dataset.shuffle()

# Here, use the standard MobileNet preprocessor,
# which was stored in the "preprocessing" attribute when MobileNet was loaded:
mobilenet_preprocessor = mobilenet.preprocessing
assert mobilenet_preprocessor is not None

# Next, define an ImageResizer to 224x224.
resizer = ImageResizer(pixel_format=ImageFormat.HWC, size=(224, 224))

# Wrap the dataset that was just created with preprocessors,
# so DataBatches are pre-processed accordingly every time they are requested.
dataset = pipeline(cifar10_dataset, mobilenet_preprocessor, resizer)

1.f Run inference¶

Now run the data generated by the pipeline and collect statistics regarding the activation of the units in the layers to analyze.

Here it is important to know how to map the data field "images" from the Batch to the input of the network (input_layer in the output of step 1.b). DeepView's FieldRenamer performs this kind of mapping. The batches then feed into the model loaded in step 1.a (mobilenet) to perform inference.

PFA expects data to be of the form (number_of_samples x number_of_units) for each layer. This means that the response obtained after inference needs to be post-processed. The typical post-processing operation used is max-pooling. This is the task performed using a DeepView Processor, specifically, a Pooler.
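
As an illustration of the shape change, here is a minimal NumPy sketch of spatial max-pooling. It does not use the DeepView Pooler; the random array merely stands in for a convolutional response in (samples, height, width, channels) layout, which is an assumption made for this example.

```python
import numpy as np

# Stand-in for a batch of conv responses: 8 samples, 14x14 spatial grid, 512 channels.
rng = np.random.default_rng(0)
responses = rng.normal(size=(8, 14, 14, 512))

# Max over the two spatial axes (1 and 2) leaves one value per sample and unit.
pooled = responses.max(axis=(1, 2))

print(pooled.shape)  # (8, 512)
```

The axes (1, 2) mirror the dim=(1, 2) argument passed to the Pooler in the cell below.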

The inference and pooling steps are repeated for all input data and the pooled responses are collected.

Note: although this section is titled “run inference”, it only defines how to run inference. Inference is not actually run until the data is pulled through the pipeline, later, when PFA introspection is run.

[6]:

from deepview.processors import Pooler, FieldRenamer

producer = pipeline(
    # the resized images
    dataset,

    # the loaded tensorflow model -- tell the model which responses to collect (computed earlier)
    mobilenet.model(conv2d_responses),

    # perform spatial max pooling on the result
    Pooler(dim=(1, 2), method=Pooler.Method.MAX),
)

2. Run PFA introspection¶

Notice that, up to this point, DeepView has only been told how to load data, run inference, and post-process the responses; nothing has been executed yet. Only once the Introspector calls introspect are all the operations actually executed.

Notice also that if the responses were already stored somewhere, there would be no need to re-run inference and re-compute them. A CachedProducer can load them from disk, or a custom Producer can be written that loads the responses (and post-processes them, if needed) and yields them (similar to how the images were loaded earlier).

Once all responses have been collected, the covariance matrix of the pooled responses is computed for each layer, and its eigenvalues are extracted. The next step uses them to determine how to compress each layer.
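
The computation for a single layer can be sketched with NumPy as follows. This is an illustration under assumptions, not DeepView's implementation: the pooled responses are synthetic, and one unit is deliberately made an almost exact copy of another so that the effect on the eigenvalues is visible.

```python
import numpy as np

# Synthetic pooled responses: 200 samples x 6 units.
rng = np.random.default_rng(0)
pooled = rng.normal(size=(200, 6))
# Make unit 1 almost a scaled copy of unit 0 (strongly correlated units).
pooled[:, 1] = 2.0 * pooled[:, 0] + 0.01 * rng.normal(size=200)

# Covariance between units (columns), then its eigenvalues, largest first.
cov = np.cov(pooled, rowvar=False)
eigenvalues = np.linalg.eigvalsh(cov)[::-1]

# Fraction of the total carried by each eigenvalue.
print(np.round(eigenvalues / eigenvalues.sum(), 3))
```

Because two of the six units are nearly redundant, the smallest eigenvalue is close to zero, which is the kind of signal PFA uses to recommend fewer units.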

[7]:

from deepview.introspectors import PFA, PFARecipe

# this runs the pipeline defined for producer and analyzes the results

print('Analyzing responses, this may take a few minutes...')
pfa = PFA.introspect(producer, batch_size=64)
print('Done.')

Analyzing responses, this may take a few minutes...

Done.

3. PFA-Energy¶

With the eigenvalues, the key ingredient of PFA, now computed, recipes can be generated. PFA provides three algorithms to compress a network: Energy, KL and Size. First, start with PFA.Strategy.Energy. The energy threshold tells PFA how much of the original energy to preserve: for each layer, PFA computes how many eigenvalues must be kept to capture the desired amount of energy. With an energy threshold of 0.8 (i.e., 80% of the original energy), for example, a different number of eigenvalues is selected in each layer so that their total energy amounts to 80% of that layer's original energy.
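
The counting step can be sketched as follows. This is a minimal illustration, not DeepView's code: count_for_energy is a hypothetical helper, the energy of a set of eigenvalues is assumed to be their sum, and min_kept_count is assumed to act as a lower bound on the result.

```python
import numpy as np

def count_for_energy(eigenvalues, energy_threshold, min_kept_count=1):
    """Smallest number of leading eigenvalues whose share of the total
    reaches energy_threshold (hypothetical helper, for illustration only)."""
    ev = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]  # largest first
    cumulative = np.cumsum(ev) / ev.sum()                     # energy fraction kept
    # Index of the first cumulative fraction that reaches the threshold.
    count = int(np.searchsorted(cumulative, energy_threshold)) + 1
    # Assumed semantics: never go below min_kept_count or above the layer size.
    return min(len(ev), max(count, min_kept_count))

eigenvalues = [5.0, 3.0, 1.0, 1.0]  # cumulative fractions: 0.5, 0.8, 0.9, 1.0
print(count_for_energy(eigenvalues, 0.85))                    # 3
print(count_for_energy(eigenvalues, 0.4, min_kept_count=3))   # 3
```

A layer whose energy is concentrated in a few eigenvalues reaches the threshold quickly, so fewer units are recommended for it.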

The result is a dictionary of response name to PFARecipe. The recipe provides:

the original number of units;
the suggested number of units according to the current energy threshold;
additional diagnostics such as the actual energy level preserved.

If there is insufficient data to compute the covariance, a layer may be omitted – this can also be queried via pfa.failed_responses.

The PFA.show method visualizes the results and, by default, produces a pandas DataFrame. Here, the recipes for two energy levels are shown together in one DataFrame, with an additional column that records the energy threshold used for each row.

[8]:

# produce a list of energy, layer, counts that can be examined

# Run energy levels 0.8 and 0.99
energy_8_recipes = pfa.get_recipe(
    strategy=PFA.Strategy.Energy(energy_threshold=0.8, min_kept_count=3)
)
energy_99_recipes = pfa.get_recipe(
    strategy=PFA.Strategy.Energy(energy_threshold=0.99, min_kept_count=3)
)
print("Done running PFA Energy")

# Display both results in the same data frame using PFA.show
results_table = PFA.show((energy_8_recipes, energy_99_recipes))

# Add a column to display the energy level
results_table['Energy'] = ['0.8']*len(energy_8_recipes) + ['0.99']*len(energy_99_recipes)
print(results_table.head())

Done running PFA Energy
       layer name  original count  recommended count  \
0    conv_pw_8_bn             512                158
1       conv_dw_2              64                  7
2  conv_pw_2_relu             128                 12
3   conv_dw_11_bn             512                102
4       conv_dw_8             512                133

                                       units to keep Energy
0  {1, 6, 7, 12, 13, 14, 15, 23, 26, 28, 30, 33, ...    0.8
1                         {1, 6, 42, 44, 47, 28, 31}    0.8
2  {35, 7, 104, 41, 76, 111, 17, 114, 90, 59, 60,...    0.8
3  {15, 23, 27, 30, 33, 41, 42, 44, 53, 60, 69, 7...    0.8
4  {2, 3, 4, 5, 16, 18, 24, 28, 31, 32, 37, 38, 4...    0.8

4. PFA KL¶

The KL strategy, a.k.a. PFA.Strategy.KL, is a recipe that does not require any user input parameter. The details of how this algorithm works are not needed to run PFA; however, here is a brief explanation. To understand how the recipe is computed, it’s important to understand what the “ideal” set of eigenvalues is. If a user desires decorrelated and equally contributing units, then the empirical eigenvalue distribution should be flat: this means that all units should be preserved. The opposite scenario is when only a single eigenvalue is non-zero: this means that the same task can be performed equally well by a single unit.

In practice the observed distribution will be in-between the two extreme cases. In order to determine how many units should be kept given an observed distribution the “distance” (the Kullback-Leibler divergence, KL, in this case) between the observed and the ideal distribution is computed. If the distance is 0 then keep all the units. If the distance is equal to the distance between the maximally correlated and the ideal distribution then keep only 1 unit. In all the intermediate cases, interpolate between the two extremes in order to map a distance “x” to the number of units to keep “b”.
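
The mapping described above can be sketched as follows. This is an illustration under assumptions, not DeepView's implementation: units_to_keep is a hypothetical helper, the eigenvalues are normalized to sum to 1, the divergence is taken from the observed distribution to the flat one, and the interpolation between the two extremes is assumed to be linear.

```python
import numpy as np

def kl_to_flat(eigenvalues):
    """KL divergence between the normalized eigenvalues and a flat distribution."""
    p = np.asarray(eigenvalues, dtype=float)
    p = p / p.sum()
    nonzero = p > 0  # terms with p == 0 contribute nothing
    return float(np.sum(p[nonzero] * np.log(p[nonzero] * p.size)))

def units_to_keep(eigenvalues):
    """Map the distance x to a unit count b by (assumed) linear interpolation."""
    n = len(eigenvalues)
    if n == 1:
        return 1
    x = kl_to_flat(eigenvalues)
    x_max = np.log(n)  # distance when a single eigenvalue is non-zero
    b = n - (n - 1) * x / x_max
    return int(round(b))

print(units_to_keep([1.0, 1.0, 1.0, 1.0]))   # flat distribution: keep all 4
print(units_to_keep([5.0, 0.0, 0.0, 0.0]))   # one non-zero eigenvalue: keep 1
print(units_to_keep([10.0, 1.0, 1.0, 1.0]))  # in between
```

The two extreme cases reproduce the behavior stated in the text; how intermediate distances are rounded is a choice made for this sketch.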

[9]:

# can also be written as pfa.get_recipe(strategy=PFA.Strategy.KL())
pfa_kl_recipe = pfa.get_recipe()

# Display kl recipe results
print(PFA.show(pfa_kl_recipe).head())

       layer name  original count  recommended count  \
0    conv_pw_8_bn             512                421
1       conv_dw_2              64                 38
2  conv_pw_2_relu             128                 78
3   conv_dw_11_bn             512                370
4       conv_dw_8             512                409

                                       units to keep
0  {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15,...
1  {1, 2, 3, 6, 8, 10, 12, 13, 14, 16, 17, 18, 19...
2  {1, 3, 7, 8, 10, 11, 12, 13, 16, 17, 18, 19, 2...
3  {0, 3, 4, 5, 6, 9, 11, 13, 15, 16, 18, 19, 20,...
4  {0, 1, 2, 3, 4, 5, 7, 8, 10, 11, 13, 15, 16, 1...

Note: PFA.show can display either a subset of the information or all of it:

select specific columns to include, or
select all available information to show

[10]:

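# Select specific columns to include
# (column names must match the table headers exactly, e.g. 'layer name')
print(PFA.show(
    pfa_kl_recipe,
    include_columns=['layer name', 'original count', 'recommended count']
).head())

       layer name  original count  recommended count
0    conv_pw_8_bn             512                421
1       conv_dw_2              64                 38
2  conv_pw_2_relu             128                 78
3   conv_dw_11_bn             512                370
4       conv_dw_8             512                409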

[11]:

# Show all data by setting 'include_columns' to '[]'
print(PFA.show(
    pfa_kl_recipe,
    include_columns=[]
).head())

       layer name  original count  recommended count  \
0    conv_pw_8_bn             512                421
1       conv_dw_2              64                 38
2  conv_pw_2_relu             128                 78
3   conv_dw_11_bn             512                370
4       conv_dw_8             512                409

                                       units to keep  KL divergence  \
0  {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 12, 13, 14, 15,...       1.115867
1  {1, 2, 3, 6, 8, 10, 12, 13, 14, 16, 17, 18, 19...       1.737101
2  {1, 3, 7, 8, 10, 11, 12, 13, 16, 17, 18, 19, 2...       1.926695
3  {0, 3, 4, 5, 6, 9, 11, 13, 15, 16, 18, 19, 20,...       1.735300
4  {0, 1, 2, 3, 4, 5, 7, 8, 10, 11, 13, 15, 16, 1...       1.265859

  PFA strategy  units ratio kept energy
0       PFA KL     0.821127         N/A
1       PFA KL     0.582315         N/A
2       PFA KL     0.602910         N/A
3       PFA KL     0.721832         N/A
4       PFA KL     0.797084         N/A

Note: print is unnecessary in the preceding cells. For example, the following is also appropriate:

PFA.show(
    pfa_kl_recipe,
    include_columns=[]
)

print is included only to assist with displaying the notebook output within the DeepView documentation.

Unit Selection¶

The recipes computed earlier specify how many units each analyzed layer should have and provide some additional diagnostic information that could be useful for introspection.

Something they do not provide is which units should be kept and which should be removed. This task is performed by the unit selection strategy.

Again, the details of how these algorithms work are not needed to run PFA with unit selection, however, here is a brief explanation.

All unit selection strategies are based on Pearson’s correlation coefficients, which can be extracted from the covariance matrix computed earlier. Pearson’s correlation coefficients provide a measure of the strength of the linear relationship between pairs of variables (in this case pairs of units): the higher the absolute value of the coefficient, the stronger the correlation.
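
For reference, the coefficients follow from the covariance matrix by dividing each entry by the product of the two units' standard deviations. The sketch below uses NumPy only; pearson_from_covariance is a hypothetical helper and the data is synthetic.

```python
import numpy as np

def pearson_from_covariance(cov):
    """corr[i, j] = cov[i, j] / (std[i] * std[j]); assumes no zero-variance unit."""
    cov = np.asarray(cov, dtype=float)
    std = np.sqrt(np.diag(cov))
    return cov / np.outer(std, std)

# Synthetic pooled responses: 500 samples x 4 units, with unit 1 tied to unit 0.
rng = np.random.default_rng(0)
samples = rng.normal(size=(500, 4))
samples[:, 1] += 0.9 * samples[:, 0]

cov = np.cov(samples, rowvar=False)
corr = pearson_from_covariance(cov)
print(np.round(corr, 2))
```

The result agrees with np.corrcoef(samples, rowvar=False), which computes the same quantity directly from the data.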

PFA is equipped with a few strategies, but here two of them are explored: AbsMax (https://satishlokkoju.github.io/deepview/api/deepview/introspectors.html#deepview.introspectors.PFA.UnitSelectionStrategy.AbsMax) and L1Max (https://satishlokkoju.github.io/deepview/api/deepview/introspectors.html#deepview.introspectors.PFA.UnitSelectionStrategy.L1Max).

[12]:

print("Starting selection. This may take several seconds")

# Run two strategies, AbsMax and L1Max
abs_max_recipes = pfa.get_recipe(unit_strategy=PFA.UnitSelectionStrategy.AbsMax())
l1_max_recipes = pfa.get_recipe(unit_strategy=PFA.UnitSelectionStrategy.L1Max())

# Display results via pandas dataframe + add strategy column
results_table = PFA.show((abs_max_recipes, l1_max_recipes))
results_table['strategy'] = ['ABS MAX']*len(abs_max_recipes) + ['L1 MAX']*len(l1_max_recipes)

Starting selection. This may take several seconds

[13]:

results_table.head()

[13]:

	layer name	original count	recommended count	units to keep	strategy
0	conv_pw_8_bn	512	421	{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 13, 14, 15...	ABS MAX
1	conv_dw_2	64	38	{1, 3, 6, 7, 10, 12, 13, 14, 15, 16, 19, 21, 2...	ABS MAX
2	conv_pw_2_relu	128	78	{1, 2, 4, 7, 8, 10, 12, 13, 14, 16, 17, 19, 20...	ABS MAX
3	conv_dw_11_bn	512	370	{0, 3, 5, 6, 8, 15, 16, 18, 19, 20, 22, 23, 26...	ABS MAX
4	conv_dw_8	512	409	{0, 1, 2, 3, 4, 5, 7, 8, 9, 10, 11, 13, 15, 16...	ABS MAX

AbsMax¶

The PFA.UnitSelectionStrategy.AbsMax strategy iteratively selects the pair of units with the highest correlation coefficient (in absolute value). To decide which unit of the selected pair should be removed, look at the second, third, etc. coefficients of each unit until a choice can be made. Remove that unit from the covariance, recompute the coefficients, and repeat until the number of units recommended by the recipe is reached.
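
One possible reading of this procedure is sketched below. It is not DeepView's implementation: abs_max_select is a hypothetical helper, and the tie-break rule (drop the unit of the pair whose remaining coefficients, sorted from largest to smallest, are larger) is an assumption about what "second, third, etc." means.

```python
import numpy as np

def abs_max_select(cov, n_keep):
    """Return the set of unit indices kept (illustrative sketch only)."""
    cov = np.asarray(cov, dtype=float)
    kept = list(range(cov.shape[0]))
    while len(kept) > n_keep:
        # Recompute absolute correlations on the units still kept.
        sub = cov[np.ix_(kept, kept)]
        std = np.sqrt(np.diag(sub))
        corr = np.abs(sub / np.outer(std, std))
        np.fill_diagonal(corr, 0.0)
        # Most correlated pair.
        i, j = (int(k) for k in np.unravel_index(np.argmax(corr), corr.shape))
        # Assumed tie-break: compare each unit's other coefficients, largest first.
        others_i = np.sort(np.delete(corr[i], [i, j]))[::-1].tolist()
        others_j = np.sort(np.delete(corr[j], [i, j]))[::-1].tolist()
        drop = i if others_i >= others_j else j
        kept.pop(drop)
    return set(kept)

cov = np.array([[1.0, 0.8, 0.1],
                [0.8, 1.0, 0.5],
                [0.1, 0.5, 1.0]])
print(abs_max_select(cov, n_keep=2))  # {0, 2}
```

In the example, units 0 and 1 form the most correlated pair; unit 1 is also more correlated with unit 2 than unit 0 is, so unit 1 is the one removed.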

L1Max¶

The PFA.UnitSelectionStrategy.L1Max strategy iteratively selects the unit with the highest sum of all its correlation coefficients. Remove that unit from the covariance, recompute the coefficients, and repeat until the number of units recommended by the recipe is reached.
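
A minimal sketch of this loop follows. It is not DeepView's implementation: l1_max_select is a hypothetical helper, and summing the absolute values of the coefficients (as the L1 in the name suggests) is an assumption.

```python
import numpy as np

def l1_max_select(cov, n_keep):
    """Return the set of unit indices kept (illustrative sketch only)."""
    cov = np.asarray(cov, dtype=float)
    kept = list(range(cov.shape[0]))
    while len(kept) > n_keep:
        # Recompute absolute correlations on the units still kept.
        sub = cov[np.ix_(kept, kept)]
        std = np.sqrt(np.diag(sub))
        corr = np.abs(sub / np.outer(std, std))
        np.fill_diagonal(corr, 0.0)  # ignore each unit's correlation with itself
        # Drop the unit with the largest total correlation to the others.
        worst = int(np.argmax(corr.sum(axis=1)))
        kept.pop(worst)
    return set(kept)

cov = np.array([[1.0, 0.8, 0.1],
                [0.8, 1.0, 0.5],
                [0.1, 0.5, 1.0]])
print(l1_max_select(cov, n_keep=2))  # {0, 2}
```

Unit 1 has the largest row sum (0.8 + 0.5), so it is removed first.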

Visualization¶

As a wrap up, take a look at the compression achieved using, for example, PFA KL.

Here is a plot with the number of units per layer for the original model and the compressed one. The figure shows the amount of compression per layer recommended by PFA.

Interestingly, layer conv_pw_11 gets compressed a lot compared to other layers, meaning that a high amount of correlation is present in that layer. Keep in mind that only 2,000 images out of 50,000 were used. An interesting experiment (encouraged) is to increase the number of images, or just feed images from a single class, in order to get more insights from PFA.

Hint: By feeding images of only one class one should expect higher compression.

This matplotlib plot can be viewed using PFA.show, with added parameter vis_type set to PFA.VisType.CHART instead of the default PFA.VisType.TABLE.

[14]:

PFA.show(pfa_kl_recipe, vis_type=PFA.VisType.CHART)

[14]:

<Axes: xlabel='layer name', ylabel='number of units'>

[Figure: chart of the number of units per layer (x-axis: layer name, y-axis: number of units) for the original model and for the PFA KL recipe.]