py2store.utils.cumul_aggreg_write

Utilities for bulk writing: accumulate, aggregate, and write when some condition is met.

class py2store.utils.cumul_aggreg_write.CumulAggregWrite(store, cache_to_kv=<function mk_kv_from_keygen.<locals>.aggregate>, mk_cache=<class 'list'>)[source]
class py2store.utils.cumul_aggreg_write.CumulAggregWriteKvItems(store)[source]
class py2store.utils.cumul_aggreg_write.CumulAggregWriteWithAutoFlush(store, cache_to_kv=<function mk_kv_from_keygen.<locals>.aggregate>, mk_cache=<class 'list'>, flush_cache_condition=<function condition_flush_on_every_write>)[source]
py2store.utils.cumul_aggreg_write.condition_flush_on_every_write(cache)[source]

Boolean function used as flush_cache_condition to flush any time the cache is non-empty.
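The accumulate, aggregate, and flush cycle with a pluggable flush condition can be sketched roughly as follows. This is a self-contained illustration, not py2store's implementation: `SketchAutoFlushWriter`, its `append` and `flush` methods, `flush_when_non_empty`, and `flush_when_at_least` are hypothetical names, a plain dict stands in for the store, and the integer key scheme is an assumption.

```python
from itertools import count


def flush_when_non_empty(cache):
    """Flush condition: true whenever the cache holds at least one item."""
    return len(cache) > 0


def flush_when_at_least(n):
    """Hypothetical condition factory: flush once the cache holds n items."""
    def condition(cache):
        return len(cache) >= n
    return condition


class SketchAutoFlushWriter:
    """Illustrative stand-in (not the py2store class): accumulate items,
    then aggregate and write them to a store when a condition holds."""

    def __init__(self, store, flush_cache_condition=flush_when_non_empty):
        self.store = store
        self.flush_cache_condition = flush_cache_condition
        self._cache = []
        self._keygen = count()  # assumed key scheme: 0, 1, 2, ...

    def append(self, item):
        # Accumulate, then check the condition after every write
        self._cache.append(item)
        if self.flush_cache_condition(self._cache):
            self.flush()

    def flush(self):
        if self._cache:
            # Aggregate step: here the whole cache is written under one key
            self.store[next(self._keygen)] = self._cache
            self._cache = []


store = {}
writer = SketchAutoFlushWriter(store, flush_cache_condition=flush_when_at_least(3))
for item in "abcdefg":
    writer.append(item)
writer.flush()  # write whatever is left over
```

With the default condition, every append triggers a write; with `flush_when_at_least(3)`, items are written in batches of three and the final explicit `flush()` writes the remainder.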

py2store.utils.cumul_aggreg_write.mk_group_aggregator(item_to_kv, aggregator_op=<built-in function add>, initial=<py2store.utils.cumul_aggreg_write.NoInitial object>)[source]

Make a generator-transforming function that will (a) make a (key, value) pair for each given item, (b) group the values according to the key, and (c) aggregate each group with aggregator_op.

Parameters
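  • item_to_kv – Function that takes an item of the generator and outputs the (key, value) pair for it; the key is used to group the values

  • aggregator_op – The binary function used to aggregate two values together, applied by reduction to the values collected for a given group

  • initial – The “empty” element to start the reduce (aggregation) with, if necessary.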

Returns:

>>> # Collect words (as a csv string), grouped by the lower case of the first letter
>>> ag = mk_group_aggregator(lambda item: (item[0].lower(), item),
...                          aggregator_op=lambda x, y: ', '.join([x, y]))
>>> list(ag(['apple', 'bananna', 'Airplane']))
[('a', 'apple, Airplane'), ('b', 'bananna')]
>>> # Collect things (into a list), grouped by age
>>> ag = mk_group_aggregator(lambda item: (item['age'], item['thing']),
...                          aggregator_op=lambda x, y: x + [y],
...                          initial=[])
>>> list(ag([{'age': 0, 'thing': 'new'}, {'age': 42, 'thing': 'every'}, {'age': 0, 'thing': 'just born'}]))
[(0, ['new', 'just born']), (42, ['every'])]
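
The grouping and aggregation shown in these examples can be approximated with a dictionary of lists and functools.reduce. The sketch below is an illustrative re-creation, not the library's source: `sketch_group_aggregator` and the `_NO_INITIAL` sentinel are hypothetical names, and the assumption that groups come out in first-seen key order is taken from the example outputs above.

```python
from collections import defaultdict
from functools import reduce
from operator import add

_NO_INITIAL = object()  # hypothetical sentinel meaning "no initial value given"


def sketch_group_aggregator(item_to_kv, aggregator_op=add, initial=_NO_INITIAL):
    """Illustrative re-creation (not the library's code)."""
    def aggregator(items):
        # (a) and (b): compute a (key, value) pair per item and group by key
        groups = defaultdict(list)
        for item in items:
            key, value = item_to_kv(item)
            groups[key].append(value)
        # (c): reduce each group, with or without a starting element
        for key, values in groups.items():
            if initial is _NO_INITIAL:
                yield key, reduce(aggregator_op, values)
            else:
                yield key, reduce(aggregator_op, values, initial)
    return aggregator


ag = sketch_group_aggregator(lambda w: (w[0].lower(), w),
                             aggregator_op=lambda x, y: ', '.join([x, y]))
words = list(ag(['apple', 'bananna', 'Airplane']))

ag = sketch_group_aggregator(lambda d: (d['age'], d['thing']),
                             aggregator_op=lambda x, y: x + [y],
                             initial=[])
things = list(ag([{'age': 0, 'thing': 'new'},
                  {'age': 42, 'thing': 'every'},
                  {'age': 0, 'thing': 'just born'}]))
```

Note that the aggregator here builds a new list with `x + [y]` rather than mutating `x`, so the shared `initial=[]` is never modified between groups.
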

py2store.utils.cumul_aggreg_write.mk_group_aggregator_with_key_func(item_to_key, aggregator_op=<built-in function add>, initial=<py2store.utils.cumul_aggreg_write.NoInitial object>)[source]

Make a generator-transforming function that will (a) make a key for each given item, (b) group all items according to the key, and (c) aggregate each group with aggregator_op.

Parameters
  • item_to_key – Function that takes an item of the generator and outputs the key that should be used to group items

  • aggregator_op – The binary function used to aggregate two items together. It is passed as-is to functools.reduce, which applies it to the sequence of items collected for a given group

  • initial – The “empty” element to start the reduce (aggregation) with, if necessary.

Returns:

>>> # Collect words (as a csv string), grouped by the lower case of the first letter
>>> ag = mk_group_aggregator_with_key_func(lambda item: item[0].lower(),
...                          aggregator_op=lambda x, y: ', '.join([x, y]))
>>> list(ag(['apple', 'bananna', 'Airplane']))
[('a', 'apple, Airplane'), ('b', 'bananna')]
>>>
>>> # Collect (and concatenate) characters according to their ascii value modulo 3
... ag = mk_group_aggregator_with_key_func(lambda item: (ord(item) % 3))
>>> list(ag('abcdefghijklmnop'))
[(1, 'adgjmp'), (2, 'behkn'), (0, 'cfilo')]
>>>
>>> # sum all even and odd numbers separately
... ag = mk_group_aggregator_with_key_func(lambda item: (item % 2))
>>> list(ag([1, 2, 3, 4, 5]))  # sum of evens is 6, and sum of odds is 9
[(1, 9), (0, 6)]
>>>
>>> # if we wanted to collect all odds and evens, we'd need a different aggregator and initial
... ag = mk_group_aggregator_with_key_func(lambda item: (item % 2), aggregator_op=lambda x, y: x + [y], initial=[])
>>> list(ag([1, 2, 3, 4, 5]))
[(1, [1, 3, 5]), (0, [2, 4])]
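
Why the last example needs initial=[] while the summing example does not follows from how functools.reduce treats its optional starting value. The snippet below is a small standalone illustration using only the standard library; the variable names are arbitrary.

```python
from functools import reduce
from operator import add

odds = [1, 3, 5]

# Without an initial value, reduce starts from the first item: (1 + 3) + 5
total = reduce(add, odds)

# With an initial [], reduce starts from the empty list:
# (([] + [1]) + [3]) + [5]
collected = reduce(lambda x, y: x + [y], odds, [])

# Without the initial, the same aggregator would try 1 + [3] and fail
try:
    reduce(lambda x, y: x + [y], odds)
    failed = False
except TypeError:
    failed = True
```

In short, an initial element is needed whenever the aggregated result has a different type from the items themselves, as when collecting numbers into a list.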