edflow.data.believers.sequence module

Summary

Classes:

SequenceDataset

Wraps around a dataset and returns sequences of examples.

UnSequenceDataset

Flattened version of a SequenceDataset.

Functions:

getSeqDataset

Loads and sequentializes a dataset without defining a dedicated dataset class: the base class and the length and step parameters are taken from the supplied config.

get_sequence_view

Generates a view on some base dataset given its sorted frame indices frame_ids.

Reference

edflow.data.believers.sequence.get_sequence_view(frame_ids, length, step=1, strategy='raise', base_step=1)[source]

Generates a view on some base dataset given its sorted frame indices frame_ids.

Parameters
  • frame_ids (np.ndarray) – An array of sorted frame indices. Must be of type int.

  • length (int) – Length of the returned sequences in frames.

  • step (int) – Step between returned frames. Must be >= 1.

  • strategy (str) – How to handle bad sequences, i.e. sequences starting with a fid_key > 0.
      - raise: raise a ValueError
      - remove: remove the sequence
      - reset: remove the sequence

  • base_step (int) – Step between base frames of returned sequences. Must be >=1.

This view will have len(dataset) - length * step entries and shape [len(dataset) - length * step, length].
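The documented shape can be pictured with a small NumPy sketch. This is not the library's implementation: sketch_sequence_view is a hypothetical helper, it ignores strategy and base_step, and it assumes that every sequence starts one frame after the previous one.

```python
import numpy as np


def sketch_sequence_view(n_frames, length, step=1):
    """Hypothetical helper: build an index view with the documented shape."""
    # Number of sequences as stated in the docs: len(dataset) - length * step
    n_sequences = n_frames - length * step
    starts = np.arange(n_sequences)[:, None]      # first frame of each sequence
    offsets = np.arange(length)[None, :] * step   # frame offsets inside one sequence
    return starts + offsets                       # broadcast to [n_sequences, length]


view = sketch_sequence_view(n_frames=10, length=3, step=2)
print(view.shape)  # (4, 3)
print(view[0])     # [0 2 4]
print(view[-1])    # [3 5 7]
```

Each row lists the base indices that make up one sequence, so indexing an array of frames with the view yields all sequences at once.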

class edflow.data.believers.sequence.SequenceDataset(dataset, length, step=1, fid_key='fid', strategy='raise', base_step=1)[source]

Bases: edflow.data.dataset_mixin.DatasetMixin

Wraps around a dataset and returns sequences of examples. Given the length of those sequences the number of available examples is reduced by this length times the step taken. Additionally each example must have a frame id fid_key specified in the labels, by which it can be filtered. This is to ensure that each frame is taken from the same video.

This class assumes that examples come sequentially with fid_key and that frame id 0 exists.

The SequenceDataset also exposes the attribute self.base_indices, which holds at each index i the indices of the base dataset elements contained in example i of the sequentialized dataset.

__init__(dataset, length, step=1, fid_key='fid', strategy='raise', base_step=1)[source]
Parameters
  • dataset (DatasetMixin) – Dataset from which single frame examples are taken.

  • length (int) – Length of the returned sequences in frames.

  • step (int) – Step between returned frames. Must be >= 1.

  • fid_key (str) – Key in labels, at which the frame indices can be found.

  • strategy (str) – How to handle bad sequences, i.e. sequences starting with a fid_key > 0.
      - raise: raise a ValueError
      - remove: remove the sequence
      - reset: remove the sequence

  • base_step (int) – Step between base frames of returned sequences. Must be >=1.

This dataset will have len(dataset) - length * step examples.
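The requirement that all frames of a sequence come from the same video can be illustrated with a toy check on the frame ids. The sketch below is one possible reading, not the class's actual logic: sketch_valid_starts is a hypothetical helper, and it assumes that frame ids restart at 0 for every video and increase by 1 per frame.

```python
import numpy as np


def sketch_valid_starts(fids, length, step=1):
    """Hypothetical helper: start indices whose window stays inside one video."""
    fids = np.asarray(fids)
    # Candidate count as stated in the docs: len(dataset) - length * step
    n_starts = max(len(fids) - length * step, 0)
    offsets = np.arange(length) * step
    valid = []
    for start in range(n_starts):
        window = fids[start + offsets]
        # Inside a single video the frame ids grow exactly by the offsets.
        if np.array_equal(window, window[0] + offsets):
            valid.append(start)
    return valid


# Two toy videos with four frames each; frame ids restart at 0.
fids = [0, 1, 2, 3, 0, 1, 2, 3]
valid = sketch_valid_starts(fids, length=2, step=1)
print(valid)  # [0, 1, 2, 4, 5]
```

Start index 3 is dropped because its window would pair the last frame of the first video with the first frame of the second one.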

class edflow.data.believers.sequence.UnSequenceDataset(seq_dataset)[source]

Bases: edflow.data.dataset_mixin.DatasetMixin

Flattened version of a SequenceDataset. Adds a new key seq_idx to each example, corresponding to the sequence index, and a key example_idx, corresponding to the original index. The ordering of the dataset is kept and sequence examples are ordered as in the sequence they are taken from.

Warning

This will not create the original non-sequence dataset! The new dataset contains sequence-length x len(SequenceDataset) examples.

If the original SequenceDataset were represented as a 2D numpy array, the UnSequence version of it would be the concatenation of all its rows:

import numpy as np

a = np.arange(12)
seq_dataset = a.reshape([3, 4])  # 3 sequences of length 4
unseq_dataset = np.concatenate(seq_dataset, axis=-1)  # rows joined end to end

np.all(a == unseq_dataset)  # True
__init__(seq_dataset)[source]
Parameters

seq_dataset (SequenceDataset) – A SequenceDataset with a length attribute.

get_example(i)[source]

Examples are gathered with the index i' = i // seq_len + i % seq_len
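The two terms of this index can be read as a sequence number and a position inside that sequence. The sketch below shows that decomposition on the 2D array analogy used above; split_flat_index is a hypothetical helper and the array is only a stand-in for a real SequenceDataset.

```python
import numpy as np

# Toy stand-in for a SequenceDataset: 3 sequences of length 4.
seq_dataset = np.arange(12).reshape([3, 4])
seq_len = seq_dataset.shape[1]
unseq_dataset = np.concatenate(seq_dataset, axis=-1)


def split_flat_index(i, seq_len):
    """Hypothetical helper: flat index -> (sequence number, position in sequence)."""
    return i // seq_len, i % seq_len


# Every flat index maps back to exactly one entry of one sequence.
for i in range(len(unseq_dataset)):
    sequence_number, position = split_flat_index(i, seq_len)
    assert unseq_dataset[i] == seq_dataset[sequence_number, position]

print(split_flat_index(6, seq_len))  # (1, 2)
```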

edflow.data.believers.sequence.getSeqDataset(config)[source]

Loads and sequentializes a dataset without defining a dedicated dataset class: the base class and the length and step parameters are taken from the supplied config.

A config passed to edflow would then look like this:

dataset: edflow.data.believers.sequence.getSeqDataset
model: Some Model
iterator: Some Iterator

seqdataset:
        dataset: import.path.to.your.basedataset
        length: 3
        step: 1
        fid_key: fid
        base_step: 1

getSeqDataset will import the base dataset and pass it to SequenceDataset together with length and step to build the dataset that is actually used.
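The import step can be pictured with a short sketch. It is not edflow's own import mechanism: resolve_dotted_path is a hypothetical helper built on importlib, and collections.OrderedDict merely stands in for import.path.to.your.basedataset so that the snippet runs on its own.

```python
import importlib

# The nested part of the config as a Python dict.
config = {
    "seqdataset": {
        # Stand-in target; a real config would name the base dataset class.
        "dataset": "collections.OrderedDict",
        "length": 3,
        "step": 1,
        "fid_key": "fid",
        "base_step": 1,
    }
}


def resolve_dotted_path(path):
    """Hypothetical helper: import 'package.module.Name' and return Name."""
    module_name, _, attr_name = path.rpartition(".")
    module = importlib.import_module(module_name)
    return getattr(module, attr_name)


seq_config = config["seqdataset"]
base_class = resolve_dotted_path(seq_config["dataset"])

# Remaining entries would be forwarded to SequenceDataset as keyword arguments.
sequence_kwargs = {
    key: seq_config[key] for key in ("length", "step", "fid_key", "base_step")
}
print(base_class.__name__, sequence_kwargs)
```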

Parameters

config (dict) – An edflow config with at least the key seqdataset and, nested inside it, the keys dataset, length and step.

Returns

A Sequence Dataset based on the basedataset.

Return type

SequenceDataset