deepview_data
Support for custom image datasets in DeepView
- class deepview_data.CustomDatasets
Bases: object
Custom Datasets, each bundled as a DeepView Producer.
- class ImageFolderDataset(root_folder, image_size=(64, 64), train_split=0.8, valid_extensions=None, max_samples=-1, write_to_folder=False)
Bases: Producer, _Logged
A dataset that loads images from a directory structure where each subdirectory represents a class.
Example directory structure:
root_folder/
    class1/
        image1.jpg
        image2.jpg
    class2/
        image3.jpg
        image4.jpg
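The layout convention above can be illustrated with a small standard-library sketch. It does not use DeepView; `scan_image_folder` and `build_example_tree` are hypothetical helpers, and assigning class indices by sorted subdirectory name is an assumption of this sketch, not documented behavior of ImageFolderDataset.

```python
import tempfile
from pathlib import Path


def scan_image_folder(root_folder, valid_extensions=(".jpg", ".jpeg", ".png")):
    """Return (class_names, samples) for a class-per-subdirectory layout.

    samples is a list of (relative_path, class_index) pairs. Indexing classes
    by sorted directory name is an assumption made for this sketch.
    """
    root = Path(root_folder)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for index, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            # Keep only files whose extension is in the allowed set.
            if path.is_file() and path.suffix.lower() in valid_extensions:
                samples.append((path.relative_to(root).as_posix(), index))
    return class_names, samples


def build_example_tree(root):
    """Create the example layout from the text using empty placeholder files."""
    layout = {
        "class1": ["image1.jpg", "image2.jpg"],
        "class2": ["image3.jpg", "image4.jpg"],
    }
    for class_name, file_names in layout.items():
        class_dir = Path(root) / class_name
        class_dir.mkdir(parents=True)
        for file_name in file_names:
            (class_dir / file_name).touch()


with tempfile.TemporaryDirectory() as tmp:
    build_example_tree(tmp)
    class_names, samples = scan_image_folder(tmp)
```

Each sample's class is determined only by the subdirectory it sits in, which is why the subdirectory names must be the class names.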
- Parameters:
root_folder (str) – Path to the root directory containing class subdirectories
image_size (Tuple[int, int]) – Tuple of (height, width) to resize images to
train_split (float) – Fraction of data to use for training (default: 0.8)
valid_extensions (List[str]) – List of valid file extensions to include (default: ['.jpg', '.jpeg', '.png'])
max_samples (int) – Maximum number of samples to load (-1 for all, default: -1)
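write_to_folder (bool) – If True, this instance creates a dataset folder on disk, which cleanup() can later delete (default: False)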
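How train_split and max_samples might interact can be sketched as follows. `split_samples` is a hypothetical helper, not part of DeepView. This page does not say whether the library shuffles or stratifies, how it rounds the split point, or whether max_samples is applied before the split, so each of those choices below is an assumption.

```python
def split_samples(samples, train_split=0.8, max_samples=-1):
    """Split a list of samples into (train, test) parts.

    Assumptions of this sketch: max_samples is applied first, no shuffling
    is done, and the split index is truncated toward zero.
    """
    if not 0.0 <= train_split <= 1.0:
        raise ValueError("train_split must be between 0 and 1")
    if max_samples >= 0:
        # -1 means "keep everything"; any non-negative value caps the count.
        samples = samples[:max_samples]
    n_train = int(len(samples) * train_split)
    return samples[:n_train], samples[n_train:]


all_train, all_test = split_samples(list(range(10)))
capped_train, capped_test = split_samples(list(range(10)), max_samples=5)
```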
- __call__(batch_size)
Produce generic Batches from the loaded data, running through training and test sets.
- Parameters:
batch_size (int) – the length of batches to produce
- Returns:
yields Batches of the split_dataset of size batch_size. If self.attach_metadata is True, attaches metadata in format:
    Batch.StdKeys.IDENTIFIER: Use pathname as the identifier for each data sample, excluding base data directory
    Batch.StdKeys.LABELS: A dict with:
        "label": a NumPy array of label features (format specific to each dataset)
        "dataset": a NumPy array of ints, either 0 (for "train") or 1 (for "test")
- Return type:
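The batching and metadata scheme described for __call__ can be approximated with a standard-library generator. This is a sketch, not the DeepView implementation: plain dicts and lists stand in for Batch objects and NumPy arrays, and `iterate_batches` and the `/data/root` paths are hypothetical.

```python
from pathlib import PurePosixPath


def iterate_batches(base_dir, train_paths, test_paths, batch_size):
    """Yield dict "batches" over the training samples, then the test samples.

    Each batch carries identifiers (paths relative to base_dir) and a
    "dataset" flag per sample: 0 for train, 1 for test.
    """
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    paths = list(train_paths) + list(test_paths)
    flags = [0] * len(train_paths) + [1] * len(test_paths)
    # Identifier: the pathname with the base data directory stripped.
    identifiers = [str(PurePosixPath(p).relative_to(base_dir)) for p in paths]
    for start in range(0, len(paths), batch_size):
        stop = start + batch_size
        yield {"identifier": identifiers[start:stop], "dataset": flags[start:stop]}


batches = list(
    iterate_batches(
        "/data/root",
        ["/data/root/class1/image1.jpg", "/data/root/class2/image3.jpg"],
        ["/data/root/class1/image2.jpg"],
        batch_size=2,
    )
)
```

In this sketch the last batch may be shorter than batch_size, and a batch may mix train and test samples at the boundary. Whether DeepView behaves the same way in either respect is not stated here.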
- cleanup()
Explicitly clean up the dataset folder created by this instance.
This method attempts to delete the dataset folder if it exists and was created by this instance (write_to_folder=True). It uses a robust approach to handle potential file system locks.
- Returns:
True if cleanup was successful or not needed, False if cleanup failed
- Return type:
bool
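The return contract of cleanup() (True when the folder was removed or nothing needed removing, False on failure) can be sketched with the standard library. The "robust approach" used by DeepView is not specified on this page; the retry loop below is one possible strategy, and `cleanup_folder`, `created_by_us`, `retries`, and `delay` are hypothetical names.

```python
import shutil
import tempfile
import time
from pathlib import Path


def cleanup_folder(folder, created_by_us=True, retries=3, delay=0.1):
    """Delete folder if this code created it; return True on success or no-op."""
    path = Path(folder)
    if not created_by_us or not path.exists():
        # Nothing to clean up counts as success.
        return True
    for _ in range(retries):
        try:
            shutil.rmtree(path)
            return True
        except OSError:
            # A file may be temporarily locked; wait briefly and retry.
            time.sleep(delay)
    return not path.exists()


demo_dir = tempfile.mkdtemp()
Path(demo_dir, "image1.jpg").touch()
first_result = cleanup_folder(demo_dir)   # removes the folder
second_result = cleanup_folder(demo_dir)  # folder already gone, still True
```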
- raw_dataset: Tuple[Tuple[ndarray, ndarray], Tuple[ndarray, ndarray]]
Dataclass field that is not set through the constructor (init=False).