edflow.data.processing.processed module

Summary

Classes:

ProcessedDataset

A dataset with data processing applied.

Reference

class edflow.data.processing.processed.ProcessedDataset(data, process, update=True)

Bases: edflow.data.dataset_mixin.DatasetMixin

A dataset with data processing applied.

__init__(data, process, update=True)

Applies process to the examples in data every time an example is requested.

Parameters
  • data (DatasetMixin) – The dataset to be processed.

  • process (Callable) –

    A function which expects all entries in the examples of data as keyword arguments and returns a dictionary.

    D = SomeDataset()
    print(D[42])  # {'a': 1, 'b': 2, 'index_': 42, 'foo': 'bar'}
    
    def process(a, b, **kwargs):
        return {'a': a+1, 'b': b**2}
    
    PD = ProcessedDataset(D, process)
    print(PD[42])  # {'a': 2, 'b': 4, 'index_': 42, 'foo', 'bar'}
    

  • update (bool) – If True (the default), the original example is updated with the dict returned by process and the merged example is returned. If False, only the dict returned by process is returned.
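
The difference between the two update modes can be sketched without edflow installed. The class below, SketchProcessedDataset, is a hypothetical stand-in that only mimics the behaviour described in the parameter list; it is not edflow's implementation, and the list-of-dicts dataset is likewise an assumption made for illustration.

```python
# Hypothetical stand-in mimicking the documented behaviour of
# ProcessedDataset; this is NOT edflow's source code.
class SketchProcessedDataset:
    def __init__(self, data, process, update=True):
        self.data = data
        self.process = process
        self.update = update

    def __len__(self):
        return len(self.data)

    def __getitem__(self, i):
        # Copy so the underlying example is left untouched.
        example = dict(self.data[i])
        # All entries of the example are passed as keyword arguments.
        processed = self.process(**example)
        if self.update:
            # update=True: merge the processed keys into the example.
            example.update(processed)
            return example
        # update=False: return only what process produced.
        return processed


def process(a, b, **kwargs):
    return {'a': a + 1, 'b': b ** 2}


# A plain list of dicts stands in for a real dataset (assumption).
data = [{'a': i, 'b': i + 1, 'index_': i, 'foo': 'bar'} for i in range(3)]

merged = SketchProcessedDataset(data, process, update=True)[1]
replaced = SketchProcessedDataset(data, process, update=False)[1]

print(merged)    # {'a': 2, 'b': 4, 'index_': 1, 'foo': 'bar'}
print(replaced)  # {'a': 2, 'b': 4}
```

With update=True, keys that process does not return (here index_ and foo) survive in the result; with update=False they are dropped.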

get_example(i)

Gets example i from data and applies process to it.
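
Because process receives every entry of the example as a keyword argument, a function that names only some of the keys needs a **kwargs catch-all. The following is a minimal sketch of that fetch-then-process step, assuming it amounts to calling process(**example); get_example_sketch, tolerant, and strict are hypothetical names, not part of edflow.

```python
# Hypothetical helper illustrating the described flow; NOT edflow's source.
def get_example_sketch(data, process, i):
    example = data[i]           # fetch the raw example
    return process(**example)   # every entry becomes a keyword argument


# A plain list of dicts stands in for a real dataset (assumption).
data = [{'a': 1, 'b': 2, 'index_': 0, 'foo': 'bar'}]


def tolerant(a, b, **kwargs):
    # Unused keys such as 'index_' and 'foo' are absorbed by **kwargs.
    return {'sum': a + b}


def strict(a, b):
    # No **kwargs: the extra keys cause a TypeError when called.
    return {'sum': a + b}


ok = get_example_sketch(data, tolerant, 0)

try:
    get_example_sketch(data, strict, 0)
    strict_failed = False
except TypeError:
    strict_failed = True

print(ok)             # {'sum': 3}
print(strict_failed)  # True
```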