.. _data_introspection:

==================
Data Introspection
==================

Data introspectors observe intermediate model responses and process data in
batches when :attr:`.introspect() ` is called.

:ref:`Dataset Report `
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The :ref:`DatasetReport ` bundles the :ref:`Familiarity `, :ref:`Duplicates `
and :ref:`Dimension Reduction ` introspectors (described below) into an
interactive interface with various visualization options.

:ref:`Familiarity `
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:ref:`Familiarity ` quantifies how *familiar* a data point is to a specific
dataset or subset. It fits a probability distribution to the activations of
the specified layer(s), then evaluates the probability of any data sample
under that distribution.

:ref:`Duplicates `
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Finds near-duplicate data. Uses an approximate nearest-neighbor search to
build a distance matrix for all samples, then clusters the closest samples.

:ref:`Dimension Reduction `
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Projects high-dimensional activation data to a lower dimension, usually for
consumption by a different introspector or for 2D or 3D visualization.

.. toctree::
   :hidden:
   :maxdepth: 1

   Dataset Report
   Familiarity
   Duplicates
   Dimension Reduction
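
The three ideas described on this page can be sketched in plain Python on synthetic activations. This is an illustrative sketch only, not the library's API: every function name here is hypothetical. It assumes a diagonal Gaussian for the familiarity model (the text above only says "a probability distribution"), uses exact brute-force distances in place of an approximate nearest-neighbor index, and uses a Gaussian random projection as one simple example of dimension reduction.

```python
import math
import random
from statistics import fmean, pvariance


def fit_diagonal_gaussian(activations):
    """Fit an independent Gaussian to each activation dimension (assumed model)."""
    columns = list(zip(*activations))
    means = [fmean(col) for col in columns]
    # A small floor keeps every variance strictly positive.
    variances = [pvariance(col, mu) + 1e-6 for col, mu in zip(columns, means)]
    return means, variances


def log_density(sample, means, variances):
    """Log-probability density of one sample under the fitted Gaussian."""
    return sum(
        -0.5 * (math.log(2 * math.pi * var) + (x - mu) ** 2 / var)
        for x, mu, var in zip(sample, means, variances)
    )


def near_duplicate_clusters(activations, threshold):
    """Group samples whose Euclidean distance is below `threshold`.

    Exact pairwise search stands in for an approximate nearest-neighbor index.
    """
    parent = list(range(len(activations)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(activations)):
        for j in range(i + 1, len(activations)):
            if math.dist(activations[i], activations[j]) < threshold:
                parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(activations)):
        clusters.setdefault(find(i), []).append(i)
    # Only groups with more than one member are near-duplicate clusters.
    return [members for members in clusters.values() if len(members) > 1]


def random_projection(activations, out_dim, seed=0):
    """Project samples to `out_dim` dimensions with a random Gaussian matrix."""
    rng = random.Random(seed)
    in_dim = len(activations[0])
    matrix = [
        [rng.gauss(0, 1) / math.sqrt(out_dim) for _ in range(in_dim)]
        for _ in range(out_dim)
    ]
    return [
        [sum(w * x for w, x in zip(row, sample)) for row in matrix]
        for sample in activations
    ]


# Synthetic "activations": 50 random 8-dimensional samples.
rng = random.Random(0)
activations = [[rng.gauss(0, 1) for _ in range(8)] for _ in range(50)]
# Append a near-duplicate of sample 0 (it becomes index 50).
activations.append([x + 1e-4 for x in activations[0]])
outlier = [10.0] * 8

means, variances = fit_diagonal_gaussian(activations)
typical_score = log_density(activations[1], means, variances)
outlier_score = log_density(outlier, means, variances)
clusters = near_duplicate_clusters(activations, threshold=0.01)
projected = random_projection(activations, out_dim=2)
```

In this sketch a sample far from the fitted data receives a much lower log-density than a typical one, the appended copy is grouped with its original, and each projected sample has two coordinates suitable for a 2D plot. A real introspector would operate on layer activations captured from a model rather than on random numbers.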