dol.util

General util objects

class dol.util.Literal(val)[source]

An object to indicate that the value should be considered literally.

>>> t = Literal(42)
>>> t.get_val()
42
>>> t()
42
get_val()[source]

Get the value wrapped by Literal instance.

One might want to use literal.get_val() instead of literal() to get the value a Literal is wrapping, because .get_val is more explicit.

That said, with a bit of hesitation, we allow the literal() form as well since it is useful in situations where we need to use a callback function to get a value.
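To see why the callable form matters, here is a self-contained sketch of that callback pattern (the Literal and resolve below are minimal stand-ins for illustration, not dol's actual code): a resolver that calls callables to obtain values would otherwise swallow a callable you meant to store as a value.

```python
class Literal:
    """Minimal stand-in for dol.util.Literal (sketch)."""
    def __init__(self, val):
        self.val = val
    def get_val(self):
        return self.val
    __call__ = get_val  # allow literal() as a callback form

def resolve(x):
    # hypothetical resolver: call callables (factories, callbacks) to get a
    # value; since Literal is itself callable, literal() yields the wrapped
    # value, letting you pass a callable *as a value* without it being called
    return x() if callable(x) else x

assert resolve(42) == 42             # plain value, returned as-is
assert resolve(lambda: 7) == 7       # callable, called to get the value
assert resolve(Literal(sum)) is sum  # sum kept literally, not called
```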

class dol.util.Pipe(*funcs, **named_funcs)[source]

Simple function composition. That is, gives you a callable that implements input -> f_1 -> … -> f_n -> output.

>>> def foo(a, b=2):
...     return a + b
>>> f = Pipe(foo, lambda x: print(f"x: {x}"))
>>> f(3)
x: 5
>>> len(f)
2

You can name the functions, but this is only for documentation purposes; the names have no functional effect.

>>> g = Pipe(
...     add_numbers = lambda x, y: x + y,
...     multiply_by_2 = lambda x: x * 2,
...     stringify = str
... )
>>> g(2, 3)
'10'
>>> len(g)
3

Notes

  • Pipe instances don’t have a __name__ etc., so some expectations of normal functions are not met.

  • Pipe instances are picklable (as long as the functions that compose them are)

You can specify a single function:

>>> Pipe(lambda x: x + 1)(2)
3

but

>>> Pipe()
Traceback (most recent call last):
  ...
ValueError: You need to specify at least one function!

You can specify an instance name and/or doc with the special (reserved) argument names __name__ and __doc__ (which therefore can’t be used as function names):

>>> f = Pipe(map, add_it=sum, __name__='map_and_sum', __doc__='Apply func and add')
>>> f(lambda x: x * 10, [1, 2, 3])
60
>>> f.__name__
'map_and_sum'
>>> f.__doc__
'Apply func and add'
dol.util.add_as_attribute_of(obj, name=None)[source]

Decorator that adds a function as an attribute of a container object obj.

If no name is given, the __name__ of the function will be used, with a leading underscore removed. This is useful for adding helper functions to main “container” functions without polluting the namespace of the module, at least from the point of view of imports and tab completion.

>>> def foo():
...    pass
>>>
>>> @add_as_attribute_of(foo)
... def _helper():
...    pass
>>> hasattr(foo, 'helper')
True
>>> callable(foo.helper)
True

In reality, any object that has a __name__ can be added to the attribute of obj, but the intention is to add helper functions to main “container” functions.

dol.util.add_attrs(remember_added_attrs=True, if_attr_exists='raise', **attrs)[source]

Make a function that will add attributes to an obj. Originally meant to be used as a decorator of a function, to inject attributes into it.

>>> from dol.util import add_attrs
>>> @add_attrs(bar='bituate', hello='world')
... def foo():
...     pass
>>> [x for x in dir(foo) if not x.startswith('_')]
['bar', 'hello']
>>> foo.bar
'bituate'
>>> foo.hello
'world'
>>> foo._added_attrs  # Another attr was added to hold the list of attributes added (in case we need to remove them)
['bar', 'hello']
dol.util.chain_get(d: Mapping, keys, default=None)[source]

Returns the d[key] value for the first key in keys that is in d, and default if none are found

Note: Think of collections.ChainMap where you can look for a single key in a sequence of maps until we find it. Here we look for a sequence of keys in a single map, stopping as soon as we find a key that the map has.

>>> d = {'here': '&', 'there': 'and', 'every': 'where'}
>>> chain_get(d, ['not there', 'not there either', 'there', 'every'])
'and'

Notice how 'not there' and 'not there either' are skipped, 'there' is found and used to retrieve the value, and 'every' is not even checked (because 'there' was already found). If none of the keys are found, None is returned by default.

>>> assert chain_get(d, ('none', 'of', 'these')) is None

You can change this default though:

>>> chain_get(d, ('none', 'of', 'these'), default='Not Found')
'Not Found'
dol.util.copy_attrs(target, source, attrs, raise_error_if_an_attr_is_missing=True)[source]

Copy attributes from one object to another.

>>> class A:
...     x = 0
>>> class B:
...     x = 1
...     yy = 2
...     zzz = 3
>>> dict_of = lambda o: {a: getattr(o, a) for a in dir(A) if not a.startswith('_')}
>>> dict_of(A)
{'x': 0}
>>> copy_attrs(A, B, 'yy')
>>> dict_of(A)
{'x': 0, 'yy': 2}
>>> copy_attrs(A, B, ['x', 'zzz'])
>>> dict_of(A)
{'x': 1, 'yy': 2, 'zzz': 3}

But if you try to copy something that B (the source) doesn’t have, copy_attrs will complain:

>>> copy_attrs(A, B, 'this_is_not_an_attr')
Traceback (most recent call last):
    ...
AttributeError: type object 'B' has no attribute 'this_is_not_an_attr'

If you tell it not to complain, it’ll just ignore attributes that are not in source.

>>> copy_attrs(A, B, ['nothing', 'here', 'exists'], raise_error_if_an_attr_is_missing=False)
>>> dict_of(A)
{'x': 1, 'yy': 2, 'zzz': 3}
dol.util.fill_with_dflts(d, dflt_dict=None)[source]

Fed up with multiline handling of dict arguments? Fed up with repeating the if d is None: d = {} lines ad nauseam (because defaults can’t be dicts, since dicts are mutable, and the python kings don’t seem to think a mutable default is useful enough)? Well, my favorite solution would be built-in handling of the problem of complex/smart defaults, visible in the code and in the docs. But for now, here’s one of the tricks I use.

Main use is to handle defaults of function arguments. Say you have a function func(d=None) and you want d to be a dict that has at least the keys 'a' and 'b' with default values 7 and 42 respectively. Then, at the beginning of your function body, you’d write:

d = fill_with_dflts(d, {'a': 7, 'b': 42})

See examples to know how to use it.

ATTENTION: A shallow copy of the dict is made. Know how that affects you (or not).
ATTENTION: This is not recursive: it won’t fill any nested fields with defaults.

Parameters
  • d – The dict you want to “fill”

  • dflt_dict – What to fill it with (a {k: v, …} dict where, if k is missing in d, you’ll get a new field k with value v).

Returns

a dict with the new key: val entries added (for each key of dflt_dict that was missing in d)

>>> fill_with_dflts(None)
{}
>>> fill_with_dflts(None, {'a': 7, 'b': 42})
{'a': 7, 'b': 42}
>>> fill_with_dflts({}, {'a': 7, 'b': 42})
{'a': 7, 'b': 42}
>>> fill_with_dflts({'b': 1000}, {'a': 7, 'b': 42})
{'a': 7, 'b': 1000}
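The two ATTENTION caveats above (shallow copy, no recursion) can be seen with a minimal stand-in (a sketch for illustration, not dol's actual implementation):

```python
def fill_with_dflts(d, dflt_dict=None):
    # sketch: shallow merge where d's entries win over the defaults
    return dict(dflt_dict or {}, **(d or {}))

# not recursive: a present-but-empty nested dict is NOT filled with
# the nested defaults; only top-level missing keys are filled
out = fill_with_dflts({'cfg': {}}, {'cfg': {'depth': 1}, 'verbose': False})
assert out == {'cfg': {}, 'verbose': False}
```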
dol.util.flatten_pipe(pipe)[source]

Unravel nested Pipes to get a flat ‘sequence of functions’ version of input.

>>> def f(x): return x + 1
>>> def g(x): return x * 2
>>> def h(x): return x - 3
>>> a = Pipe(f, g, h)
>>> b = Pipe(f, Pipe(g, h))
>>> len(a)
3
>>> len(b)
2
>>> c = flatten_pipe(b)
>>> len(c)
3
>>> assert a(10) == b(10) == c(10) == 19
dol.util.format_invocation(name='', args=(), kwargs=None)[source]

Given a name, positional arguments, and keyword arguments, format a basic Python-style function call.

>>> print(format_invocation('func', args=(1, 2), kwargs={'c': 3}))
func(1, 2, c=3)
>>> print(format_invocation('a_func', args=(1,)))
a_func(1)
>>> print(format_invocation('kw_func', kwargs=[('a', 1), ('b', 2)]))
kw_func(a=1, b=2)
dol.util.groupby(items: Iterable[Any], key: Callable[[Any], Hashable], val: Optional[Callable[[Any], Any]] = None, group_factory=<class 'list'>) → dict[source]

Groups items according to group keys updated from those items through the given (item_to_)key function.

Parameters
  • items – iterable of items

  • key – The function that computes a key from an item. Needs to return a hashable.

  • val – An optional function that computes a val from an item. If not given, the item itself will be taken.

  • group_factory – The function used to make new (empty) group objects and accumulate group items: group_items = group_factory() will be called to make a new empty group collection, and group_items.append(x) will be called to add x to that collection. The default is list.

Returns: A dict of {group_key: items_in_that_group, …}

See Also: regroupby, itertools.groupby, and dol.source.SequenceKvReader

>>> groupby(range(11), key=lambda x: x % 3)
{0: [0, 3, 6, 9], 1: [1, 4, 7, 10], 2: [2, 5, 8]}
>>>
>>> tokens = ['the', 'fox', 'is', 'in', 'a', 'box']
>>> groupby(tokens, len)
{3: ['the', 'fox', 'box'], 2: ['is', 'in'], 1: ['a']}
>>> key_map = {1: 'one', 2: 'two'}
>>> groupby(tokens, lambda x: key_map.get(len(x), 'more'))
{'more': ['the', 'fox', 'box'], 'two': ['is', 'in'], 'one': ['a']}
>>> stopwords = {'the', 'in', 'a', 'on'}
>>> groupby(tokens, lambda w: w in stopwords)
{True: ['the', 'in', 'a'], False: ['fox', 'is', 'box']}
>>> groupby(tokens, lambda w: ['words', 'stopwords'][int(w in stopwords)])
{'stopwords': ['the', 'in', 'a'], 'words': ['fox', 'is', 'box']}
dol.util.has_enabled_clear_method(store)[source]

Returns True iff store has a clear method that is enabled (i.e. not disabled)
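No example is given above; the gist of such a check can be sketched as follows (an assumption about the semantics for illustration, not dol's actual code — in dol, a "disabled" clear is typically one that raises when called):

```python
def has_callable_clear(store):
    # hypothetical stand-in: True iff the store exposes a callable `clear`
    return callable(getattr(store, 'clear', None))

assert has_callable_clear({})      # dicts have a working clear
assert not has_callable_clear(())  # tuples have no clear at all
```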

dol.util.igroupby(items: Iterable[Any], key: Callable[[Any], Hashable], val: Optional[Callable[[Any], Any]] = None, group_factory: Callable[[], Iterable[Any]] = <class 'list'>, group_release_cond: Union[Callable[[Hashable, Iterable[Any]], bool], Callable[[dict, Hashable, Iterable[Any]], bool]] = <function <lambda>>, release_remainding=True, append_to_group_items: Callable[[Iterable[Any], Any], Any] = <method 'append' of 'list' objects>, grouper_mapping=<class 'collections.defaultdict'>) → dict[source]

The generator version of dol groupby. Groups items according to group keys updated from those items through the given (item_to_)key function, yielding the groups according to a logic defined by group_release_cond

Parameters
  • items – iterable of items

  • key – The function that computes a key from an item. Needs to return a hashable.

  • val – An optional function that computes a val from an item. If not given, the item itself will be taken.

  • group_factory – The function used to make new (empty) group objects and accumulate group items: group_items = group_factory() will be called to make a new empty group collection, and group_items.append(x) will be called to add x to that collection. The default is list.

  • group_release_cond – A boolean function applied, at every iteration, to the accumulated items of the group that was just updated; if it returns True, the (group_key, group_items) pair is yielded (and the group released). The default is equivalent to lambda group_key, group_items: False, i.e. nothing is released until the input is exhausted.

  • release_remainding – Once the input items have been consumed, there may still be some items in the grouping “cache”. release_remainding is a boolean that indicates whether the contents of this cache should be released or not.

Yields: (group_key, items_in_that_group) pairs

The following will group numbers according to their parity (0 for even, 1 for odd), releasing a list of numbers collected when that list reaches length 3:

>>> g = igroupby(items=range(11),
...             key=lambda x: x % 2,
...             group_release_cond=lambda k, v: len(v) == 3)
>>> list(g)
[(0, [0, 2, 4]), (1, [1, 3, 5]), (0, [6, 8, 10]), (1, [7, 9])]

If we specify release_remainding=False though, we won’t get those last, incomplete groups:

>>> g = igroupby(items=range(11),
...             key=lambda x: x % 2,
...             group_release_cond=lambda k, v: len(v) == 3,
...             release_remainding=False)
>>> list(g)
[(0, [0, 2, 4]), (1, [1, 3, 5]), (0, [6, 8, 10])]


Below we show that, with the default group_release_cond = lambda k, v: False and release_remainding=True, we have dict(igroupby(...)) == groupby(...)

>>> from functools import partial
>>> from dol import groupby
>>>
>>> kws = dict(items=range(11), key=lambda x: x % 3)
>>> assert (dict(igroupby(**kws)) == groupby(**kws)
...         == {0: [0, 3, 6, 9], 1: [1, 4, 7, 10], 2: [2, 5, 8]})
>>>
>>> tokens = ['the', 'fox', 'is', 'in', 'a', 'box']
>>> kws = dict(items=tokens, key=len)
>>> assert (dict(igroupby(**kws)) == groupby(**kws)
...         == {3: ['the', 'fox', 'box'], 2: ['is', 'in'], 1: ['a']})
>>>
>>> key_map = {1: 'one', 2: 'two'}
>>> kws.update(key=lambda x: key_map.get(len(x), 'more'))
>>> assert (dict(igroupby(**kws)) == groupby(**kws)
...         == {'more': ['the', 'fox', 'box'], 'two': ['is', 'in'], 'one': ['a']})
>>>
>>> stopwords = {'the', 'in', 'a', 'on'}
>>> kws.update(key=lambda w: w in stopwords)
>>> assert (dict(igroupby(**kws)) == groupby(**kws)
...         == {True: ['the', 'in', 'a'], False: ['fox', 'is', 'box']})
>>> kws.update(key=lambda w: ['words', 'stopwords'][int(w in stopwords)])
>>> assert (dict(igroupby(**kws)) == groupby(**kws)
...         == {'stopwords': ['the', 'in', 'a'], 'words': ['fox', 'is', 'box']})
class dol.util.imdict[source]

A frozen hashable dict
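One common way such a frozen hashable dict can be implemented (a sketch of the idea; dol's actual implementation may differ):

```python
class FrozenDict(dict):
    # hashable: hash derived from the (fixed) item set
    def __hash__(self):
        return hash(frozenset(self.items()))

    def _immutable(self, *args, **kwargs):
        raise TypeError('object is immutable')

    # disable all mutating dict methods
    __setitem__ = __delitem__ = clear = _immutable
    update = pop = popitem = setdefault = _immutable

d = FrozenDict(a=1, b=2)
assert hash(d) == hash(FrozenDict(b=2, a=1))  # usable as a dict key
try:
    d['a'] = 0  # any mutation raises
except TypeError:
    pass
```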

dol.util.inject_method(obj, method_function, method_name=None)[source]
method_function could be:
  • a function

  • a {method_name: function, …} dict (for multiple injections)

  • a list of functions or (function, method_name) pairs
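The single-function case boils down to binding a function to the instance; here is a self-contained sketch of that mechanism using the standard types.MethodType (an illustration of the idea, not dol's actual code):

```python
import types

class Store:
    pass

def describe(self, prefix):
    # behaves like a regular method once bound: self is the instance
    return f'{prefix}: {type(self).__name__}'

store = Store()
# mimic inject_method(store, describe): bind the function to this instance
store.describe = types.MethodType(describe, store)
assert store.describe('obj') == 'obj: Store'
```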

dol.util.instance_checker(*types)[source]

Makes a filter function that checks the type of an object.

>>> f = instance_checker(int, float)
>>> f(1)
True
>>> f(1.0)
True
>>> f('1.0')
False
class dol.util.lazyprop(func)[source]

A descriptor implementation of lazyprop (cached property). Made based on David Beazley’s “Python Cookbook” book and enhanced with boltons.cacheutils ideas.

>>> class Test:
...     def __init__(self, a):
...         self.a = a
...     @lazyprop
...     def len(self):
...         print('generating "len"')
...         return len(self.a)
>>> t = Test([0, 1, 2, 3, 4])
>>> t.__dict__
{'a': [0, 1, 2, 3, 4]}
>>> t.len
generating "len"
5
>>> t.__dict__
{'a': [0, 1, 2, 3, 4], 'len': 5}
>>> t.len
5
>>> # But be careful when using lazyprop: no one should change the value of a without deleting the property first
>>> t.a = [0, 1, 2]  # if we change a...
>>> t.len  # ... we still get the old cached value of len
5
>>> del t.len  # if we delete the len prop
>>> t.len  # ... then len gets recomputed
generating "len"
3
class dol.util.lazyprop_w_sentinel(func)[source]

A descriptor implementation of lazyprop (cached property) that also inserts a sentinel attribute flagging that the cache is active (visible as 'sentinel_of__len' in the example below).

>>> class Test:
...     def __init__(self, a):
...         self.a = a
...     @lazyprop_w_sentinel
...     def len(self):
...         print('generating "len"')
...         return len(self.a)
>>> t = Test([0, 1, 2, 3, 4])
>>> lazyprop_w_sentinel.cache_is_active(t, 'len')
False
>>> t.__dict__  # let's look under the hood
{'a': [0, 1, 2, 3, 4]}
>>> t.len
generating "len"
5
>>> lazyprop_w_sentinel.cache_is_active(t, 'len')
True
>>> t.len  # notice there's no 'generating "len"' print this time!
5
>>> t.__dict__  # let's look under the hood
{'a': [0, 1, 2, 3, 4], 'len': 5, 'sentinel_of__len': True}
>>> # But be careful when using lazyprop that no one should change the value of a without deleting the property first
>>> t.a = [0, 1, 2]  # if we change a...
>>> t.len  # ... we still get the old cached value of len
5
>>> del t.len  # if we delete the len prop
>>> t.len  # ... then len gets recomputed
generating "len"
3
dol.util.max_common_prefix(a)[source]

Given a list of strings (or other sliceable sequences), returns the longest common prefix

Parameters

a – list-like of strings

Returns

the longest common prefix of all strings in a
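A self-contained sketch of one standard way to compute this (not necessarily dol's implementation): the common prefix of the lexicographic min and max of the collection is the common prefix of the whole collection.

```python
def max_common_prefix(a):
    # sketch: only the two lexicographic extremes need to be compared;
    # every other string shares at least their common prefix
    if not a:
        return ''
    lo, hi = min(a), max(a)
    i = 0
    while i < len(lo) and lo[i] == hi[i]:
        i += 1
    return lo[:i]

assert max_common_prefix(['swimming', 'swimmer', 'swim']) == 'swim'
```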

dol.util.norm_kv_filt(kv_filt: Callable[[Any], bool])[source]

Prepare a boolean function to be used with filter when fed an iterable of (k, v) pairs.

So you have a mapping. Say a dict d. Now you want to go through d.items(), filtering based on the keys, or the values, or both.

It’s not hard to do, really. If you’re using a dict you might use a dict comprehension, or in the general case you might do filter(lambda kv: my_filt(kv[0], kv[1]), d.items()) if you have a my_filt that works with k and v, etc.

But though simple, it can become a bit muddled. norm_kv_filt simplifies this by letting you bring your own filtering boolean function, whether it’s key-based, value-based, or key-value-based, and it will make a ready-to-use-with-filter function for you.

Only thing: your function’s arguments need to be named k and v. But hey, it’s alright: if you have a function that names things differently, just do something like

new_filt_func = lambda k, v: your_filt_func(..., key=k, ..., value=v, ...)

and all will be fine.

Parameters

kv_filt – a callable with signature (k), (v), or (k, v), returning a boolean

Returns

A normalized callable.

>>> d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> list(filter(norm_kv_filt(lambda k: k in {'b', 'd'}), d.items()))
[('b', 2), ('d', 4)]
>>> list(filter(norm_kv_filt(lambda v: v > 2), d.items()))
[('c', 3), ('d', 4)]
>>> list(filter(norm_kv_filt(lambda k, v: (v > 1) & (k != 'c')), d.items()))
[('b', 2), ('d', 4)]
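The normalization can be sketched by dispatching on the user function's parameter names (an illustration of the idea; dol's actual implementation may differ):

```python
from inspect import signature

def norm_kv_filt(kv_filt):
    # sketch: inspect the parameter names of the user's filter and wrap
    # it into a function that takes a single (k, v) pair
    params = list(signature(kv_filt).parameters)
    if params == ['k']:
        return lambda kv: kv_filt(kv[0])  # key-based filter
    elif params == ['v']:
        return lambda kv: kv_filt(kv[1])  # value-based filter
    return lambda kv: kv_filt(*kv)        # key-and-value filter

d = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
assert list(filter(norm_kv_filt(lambda v: v > 2), d.items())) == [('c', 3), ('d', 4)]
```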
dol.util.not_a_mac_junk_path(path: str)[source]

A function that tells you whether the path is not a mac junk path. More precisely, whether it doesn’t end with ‘.DS_Store’ and doesn’t have a __MACOSX folder somewhere along its way.

This is usually meant to be used with filter or filt_iter to “filter in” only those actually wanted files (not the junk that mac writes to your filesystem).

These files annoyingly show up often in zip files, and are usually unwanted.

See https://apple.stackexchange.com/questions/239578/compress-without-ds-store-and-macosx

>>> paths = ['A/normal/path', 'A/__MACOSX/path', 'path/ending/in/.DS_Store', 'foo/b']
>>> list(filter(not_a_mac_junk_path, paths))
['A/normal/path', 'foo/b']
dol.util.num_of_args(func)[source]

Number of arguments (parameters) of the function.

Contrast the behavior below with that of num_of_required_args.

>>> num_of_args(lambda a, b, c: None)
3
>>> num_of_args(lambda a, b, c=3: None)
3
>>> num_of_args(lambda a, *args, b, c=1, d=2, **kwargs: None)
6
dol.util.num_of_required_args(func)[source]

Number of REQUIRED arguments of a function.

Contrast the behavior below with that of num_of_args, which counts all parameters, including the variadics and defaulted ones.

>>> num_of_required_args(lambda a, b, c: None)
3
>>> num_of_required_args(lambda a, b, c=3: None)
2
>>> num_of_required_args(lambda a, *args, b, c=1, d=2, **kwargs: None)
2
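Both counts can be obtained from inspect.signature; the following sketch reproduces the doctests above (assuming these semantics, which match them; not necessarily dol's exact code):

```python
from inspect import signature, Parameter

def num_of_args(func):
    # all parameters, including variadics and defaulted ones
    return len(signature(func).parameters)

def num_of_required_args(func):
    # parameters with no default, excluding *args / **kwargs
    return sum(
        1
        for p in signature(func).parameters.values()
        if p.default is Parameter.empty
        and p.kind not in (Parameter.VAR_POSITIONAL, Parameter.VAR_KEYWORD)
    )

assert num_of_args(lambda a, *args, b, c=1, d=2, **kw: None) == 6
assert num_of_required_args(lambda a, *args, b, c=1, d=2, **kw: None) == 2
```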
dol.util.partialclass(cls, *args, **kwargs)[source]

What partial(cls, *args, **kwargs) does, but returning a class instead of an object.

Parameters
  • cls – Class to get the partial of

  • args – The positional arguments to fix

  • kwargs – The keyword arguments to fix

The raison d’être of partialclass is that it returns a type, so let’s have a look at that with a useless class.

>>> from inspect import signature
>>> class A:
...     pass
>>> assert isinstance(A, type) == isinstance(partialclass(A), type) == True
>>> class A:
...     def __init__(self, a=0, b=1):
...         self.a, self.b = a, b
...     def mysum(self):
...         return self.a + self.b
...     def __repr__(self):
...         return f"{self.__class__.__name__}(a={self.a}, b={self.b})"
>>>
>>> assert isinstance(A, type) == isinstance(partialclass(A), type) == True
>>>
>>> assert str(signature(A)) == '(a=0, b=1)'
>>>
>>> a = A()
>>> assert a.mysum() == 1
>>> assert str(a) == 'A(a=0, b=1)'
>>>
>>> assert A(a=10).mysum() == 11
>>> assert str(A()) == 'A(a=0, b=1)'
>>>
>>>
>>> AA = partialclass(A, b=2)
>>> assert str(signature(AA)) == '(a=0, *, b=2)'
>>> aa = AA()
>>> assert aa.mysum() == 2
>>> assert str(aa) == 'A(a=0, b=2)'
>>> assert AA(a=1, b=3).mysum() == 4
>>> assert str(AA(3)) == 'A(a=3, b=2)'
>>>
>>> AA = partialclass(A, a=7)
>>> assert str(signature(AA)) == '(*, a=7, b=1)'
>>> assert AA().mysum() == 8
>>> assert str(AA(a=3)) == 'A(a=3, b=1)'

Note in the last partial that since a was fixed, you need to specify the keyword AA(a=3). AA(3) won’t work:

>>> AA(3)  
Traceback (most recent call last):
  ...
TypeError: __init__() got multiple values for argument 'a'

On the other hand, you can use *args to specify the fixtures:

>>> AA = partialclass(A, 22)
>>> assert str(AA()) == 'A(a=22, b=1)'
>>> assert str(signature(AA)) == '(b=1)'
>>> assert str(AA(3)) == 'A(a=22, b=3)'
dol.util.regroupby(items, *key_funcs, **named_key_funcs)[source]

Recursive groupby. Applies the groupby function recursively, using a sequence of key functions.

Note: The named_key_funcs argument names don’t have any external effect; they just give a name to the key function, for code-reading clarity.

See Also: groupby, itertools.groupby, and dol.source.SequenceKvReader

>>> # group by how big the number is, then by its mod 3 value
>>> # note that the named_key_funcs argument names don't have any external effect (they just name the function)
>>> regroupby([1, 2, 3, 4, 5, 6, 7], lambda x: 'big' if x > 5 else 'small', mod3=lambda x: x % 3)
{'small': {1: [1, 4], 2: [2, 5], 0: [3]}, 'big': {0: [6], 1: [7]}}
>>>
>>> tokens = ['the', 'fox', 'is', 'in', 'a', 'box']
>>> stopwords = {'the', 'in', 'a', 'on'}
>>> word_category = lambda x: 'stopwords' if x in stopwords else 'words'
>>> regroupby(tokens, word_category, len)
{'stopwords': {3: ['the'], 2: ['in'], 1: ['a']}, 'words': {3: ['fox', 'box'], 2: ['is']}}
>>> regroupby(tokens, len, word_category)
{3: {'stopwords': ['the'], 'words': ['fox', 'box']}, 2: {'words': ['is'], 'stopwords': ['in']}, 1: {'stopwords': ['a']}}
dol.util.str_to_var_str(s: str) → str[source]

Make a valid python variable string from the input string. Left untouched if already valid.

>>> str_to_var_str('this_is_a_valid_var_name')
'this_is_a_valid_var_name'
>>> str_to_var_str('not valid  #)*(&434')
'not_valid_______434'
>>> str_to_var_str('99_ballons')
'_99_ballons'
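The behavior shown above can be reproduced with a small regex-based sketch (for illustration; not necessarily dol's exact code):

```python
import re

def str_to_var_str(s):
    # sketch: replace every non-identifier character with an underscore,
    # and prefix an underscore if the result starts with a digit
    out = re.sub(r'\W', '_', s)
    if out and out[0].isdigit():
        out = '_' + out
    return out

assert str_to_var_str('not valid  #)*(&434') == 'not_valid_______434'
```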