py2store.utils.cumul_aggreg_write
Utilities for bulk writing: accumulate items, aggregate them, and write when some condition is met.
-
class py2store.utils.cumul_aggreg_write.CumulAggregWrite(store, cache_to_kv=<function mk_kv_from_keygen.<locals>.aggregate>, mk_cache=<class 'list'>)
-
class py2store.utils.cumul_aggreg_write.CumulAggregWriteWithAutoFlush(store, cache_to_kv=<function mk_kv_from_keygen.<locals>.aggregate>, mk_cache=<class 'list'>, flush_cache_condition=<function condition_flush_on_every_write>)
-
py2store.utils.cumul_aggreg_write.condition_flush_on_every_write(cache)
Boolean function, used as flush_cache_condition, that triggers a flush any time the cache is non-empty.
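The accumulate, aggregate, and write pattern behind these classes can be illustrated with a minimal, self-contained sketch. It does not use py2store: `BufferedWriter`, `flush_when_non_empty`, `flush_when_at_least`, and `mk_numbered_joiner` are hypothetical names, the store is a plain dict, and the behavior is an assumption based on the parameter names in the signatures above, not taken from the library's source.

```python
from itertools import count


def flush_when_non_empty(cache):
    """Flush condition: true whenever the cache holds at least one item."""
    return len(cache) > 0


def flush_when_at_least(n):
    """Build a flush condition that is true once the cache holds n items."""
    def condition(cache):
        return len(cache) >= n
    return condition


class BufferedWriter:
    """Hypothetical stand-in (not py2store's class): buffer items, write them as one record."""

    def __init__(self, store, cache_to_kv, mk_cache=list,
                 flush_cache_condition=flush_when_non_empty):
        self.store = store
        self.cache_to_kv = cache_to_kv
        self.mk_cache = mk_cache
        self.flush_cache_condition = flush_cache_condition
        self.cache = mk_cache()

    def append(self, item):
        # Accumulate, then check whether the condition asks for a write.
        self.cache.append(item)
        if self.flush_cache_condition(self.cache):
            self.flush()

    def flush(self):
        if len(self.cache) == 0:
            return  # nothing to write
        # Aggregate the whole cache into a single (key, value) pair and store it.
        key, value = self.cache_to_kv(self.cache)
        self.store[key] = value
        self.cache = self.mk_cache()


def mk_numbered_joiner():
    """Aggregate a cache of strings into (running index, comma-joined string)."""
    counter = count()
    def cache_to_kv(cache):
        return next(counter), ', '.join(cache)
    return cache_to_kv


store = {}
writer = BufferedWriter(store, mk_numbered_joiner(),
                        flush_cache_condition=flush_when_at_least(2))
for word in ['a', 'b', 'c']:
    writer.append(word)
writer.flush()  # write whatever is still buffered
print(store)  # {0: 'a, b', 1: 'c'}
```

With the default `flush_when_non_empty` condition, every `append` would be written immediately, which corresponds to flushing on every write.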
-
py2store.utils.cumul_aggreg_write.mk_group_aggregator(item_to_kv, aggregator_op=<built-in function add>, initial=<py2store.utils.cumul_aggreg_write.NoInitial object>)
Make a generator-transforming function that will (a) make a (key, value) pair for each given item, (b) group the values according to the key, and (c) aggregate each group with aggregator_op.
- Parameters
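item_to_kv – Function that takes an item of the generator and outputs a (key, value) pair: the key used to group items and the value to aggregate
aggregator_op – The binary function used to aggregate two values together
initial – The “empty” element to start the aggregation with, if necessary.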
Returns: A function that takes an iterable of items and produces (key, aggregated value) pairs.
>>> # Collect words (as a csv string), grouped by the lower case of the first letter
>>> ag = mk_group_aggregator(lambda item: (item[0].lower(), item),
...                          aggregator_op=lambda x, y: ', '.join([x, y]))
>>> list(ag(['apple', 'bananna', 'Airplane']))
[('a', 'apple, Airplane'), ('b', 'bananna')]
>>> # Collect the 'thing' values (as a list), grouped by 'age'
>>> ag = mk_group_aggregator(lambda item: (item['age'], item['thing']),
...                          aggregator_op=lambda x, y: x + [y],
...                          initial=[])
>>> list(ag([{'age': 0, 'thing': 'new'}, {'age': 42, 'thing': 'every'}, {'age': 0, 'thing': 'just born'}]))
[(0, ['new', 'just born']), (42, ['every'])]
-
py2store.utils.cumul_aggreg_write.mk_group_aggregator_with_key_func(item_to_key, aggregator_op=<built-in function add>, initial=<py2store.utils.cumul_aggreg_write.NoInitial object>)
Make a generator-transforming function that will (a) make a key for each given item, (b) group all items according to the key, and (c) aggregate each group with aggregator_op.
- Parameters
item_to_key – Function that takes an item of the generator and outputs the key that should be used to group items
aggregator_op – The binary function used to aggregate two items together. It is passed as is to functools.reduce, which is applied to the sequence of items collected for a given group
initial – The “empty” element to start the reduce (aggregation) with, if necessary.
Returns: A function that takes an iterable of items and produces (key, aggregated value) pairs.
>>> # Collect words (as a csv string), grouped by the lower case of the first letter
>>> ag = mk_group_aggregator_with_key_func(lambda item: item[0].lower(),
...                                        aggregator_op=lambda x, y: ', '.join([x, y]))
>>> list(ag(['apple', 'bananna', 'Airplane']))
[('a', 'apple, Airplane'), ('b', 'bananna')]
>>> # Collect (and concatenate) characters according to their ascii value modulo 3
>>> ag = mk_group_aggregator_with_key_func(lambda item: (ord(item) % 3))
>>> list(ag('abcdefghijklmnop'))
[(1, 'adgjmp'), (2, 'behkn'), (0, 'cfilo')]
>>> # Sum all even and odd numbers separately
>>> ag = mk_group_aggregator_with_key_func(lambda item: (item % 2))
>>> list(ag([1, 2, 3, 4, 5]))  # sum of evens is 6, and sum of odds is 9
[(1, 9), (0, 6)]
>>> # If we wanted to collect all odds and evens, we'd need a different aggregator and initial
>>> ag = mk_group_aggregator_with_key_func(lambda item: (item % 2),
...                                        aggregator_op=lambda x, y: x + [y],
...                                        initial=[])
>>> list(ag([1, 2, 3, 4, 5]))
[(1, [1, 3, 5]), (0, [2, 4])]
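The parameter descriptions above say that each group is aggregated with functools.reduce, optionally starting from an initial element. A minimal standard-library sketch of that idea follows. It is not py2store's implementation: `group_and_reduce` and the `_NO_INITIAL` sentinel are hypothetical names, and collecting each group into a list before reducing is an assumption made for clarity.

```python
from functools import reduce
from operator import add

_NO_INITIAL = object()  # hypothetical sentinel meaning "no initial value given"


def group_and_reduce(items, item_to_key, aggregator_op=add, initial=_NO_INITIAL):
    """Group items by key (in first-seen key order), then reduce each group."""
    groups = {}
    for item in items:
        groups.setdefault(item_to_key(item), []).append(item)
    for key, group in groups.items():
        if initial is _NO_INITIAL:
            # No initial element: reduce starts from the first item of the group.
            yield key, reduce(aggregator_op, group)
        else:
            # The initial element seeds the aggregation.
            yield key, reduce(aggregator_op, group, initial)


# Sum odds and evens separately (default aggregator is operator.add).
sums = list(group_and_reduce([1, 2, 3, 4, 5], lambda x: x % 2))
print(sums)  # [(1, 9), (0, 6)]

# Collect odds and evens into lists; acc + [x] builds a new list each step,
# so the shared initial [] is never mutated.
lists = list(group_and_reduce([1, 2, 3, 4, 5], lambda x: x % 2,
                              aggregator_op=lambda acc, x: acc + [x],
                              initial=[]))
print(lists)  # [(1, [1, 3, 5]), (0, [2, 4])]
```

The second call shows why an initial element is needed when the aggregate has a different type from the items: without it, the first reduction step would try to evaluate `1 + [3]`.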