dol.base¶
Base classes for making stores. In the language of the collections.abc module, a store is a MutableMapping that is configured to work with a specific representation of keys, serialization of objects (python values), and persistence of the serialized data.
That is, stores offer the same interface as a dict, but where the actual implementation of writes, reads, and listing are configurable.
Consider the following example. Your store is meant to store waveforms as wav files on a remote server. Say waveforms are represented in python as a tuple (wf, sr), where wf is a list of numbers and sr is the sample rate (an int). The __setitem__ method will specify how to store bytes on a remote server, but you’ll need to specify how to SERIALIZE (wf, sr) to the bytes that constitute that wav file: _data_of_obj specifies that. You might also want to read those wav files back into a python (wf, sr) tuple. The __getitem__ method will get you those bytes from the server, but the store will need to know how to DESERIALIZE those bytes back into a python object: _obj_of_data specifies that.
Further, say you’re storing these .wav files in /some/folder/on/the/server/, but you don’t want the store to use these full paths as the keys. For one, they’re annoying to type and harder to read. But more importantly, the folder is an irrelevant implementation detail that shouldn’t be exposed. The _id_of_key and _key_of_id pair is what allows you to add this key interface layer.
These key converters and object serialization methods default to the identity (i.e. they return the input as is). This means that you don’t have to implement them at all, and can choose to implement these concerns within the storage methods themselves.
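The waveform scenario above can be sketched with the standard library alone. The code below is an illustration, not dol's implementation: SketchStore and WaveformStore are hypothetical names, a plain dict stands in for the remote server, and JSON stands in for real wav encoding. Only the four hook names (_id_of_key, _key_of_id, _data_of_obj, _obj_of_data) come from the text above.

```python
import json
from collections.abc import MutableMapping


class SketchStore(MutableMapping):
    """Hypothetical stand-in for a store: a backend plus four conversion hooks."""

    def __init__(self, backend=None):
        # A dict plays the role of the remote server in this sketch.
        self.backend = {} if backend is None else backend

    # The four hooks default to the identity.
    def _id_of_key(self, k):
        return k

    def _key_of_id(self, _id):
        return _id

    def _data_of_obj(self, obj):
        return obj

    def _obj_of_data(self, data):
        return data

    def __getitem__(self, k):
        return self._obj_of_data(self.backend[self._id_of_key(k)])

    def __setitem__(self, k, obj):
        self.backend[self._id_of_key(k)] = self._data_of_obj(obj)

    def __delitem__(self, k):
        del self.backend[self._id_of_key(k)]

    def __iter__(self):
        return (self._key_of_id(_id) for _id in self.backend)

    def __len__(self):
        return len(self.backend)


class WaveformStore(SketchStore):
    ROOT = '/some/folder/on/the/server/'

    def _id_of_key(self, k):
        return self.ROOT + k + '.wav'

    def _key_of_id(self, _id):
        return _id[len(self.ROOT):-len('.wav')]

    def _data_of_obj(self, obj):
        wf, sr = obj
        # Assumption: JSON bytes stand in for an actual wav encoder.
        return json.dumps({'wf': wf, 'sr': sr}).encode()

    def _obj_of_data(self, data):
        d = json.loads(data.decode())
        return d['wf'], d['sr']


s = WaveformStore()
s['beep'] = ([0, 1, 0, -1], 44100)
```

The user only ever sees the key 'beep' and the (wf, sr) tuple, while the backend holds bytes under the full path.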
-
class
dol.base.
Collection
[source]¶ The same as collections.abc.Collection, with some modifications: - Addition of a
head
-
class
dol.base.
KeyValidationABC
[source]¶ An ABC for an object writer. Single purpose: store an object under a given key. How the object is serialized and or physically stored should be defined in a concrete subclass.
-
class
dol.base.
KvPersister
[source]¶ Acts as a MutableMapping abc, but disables the clear and __reversed__ methods, and computes __len__ by iterating over all keys and counting them.
Note that KvPersister is a MutableMapping, and as such, is dict-like. But that doesn’t mean it’s a dict.
For instance, consider the following code:
s = SomeKvPersister()
s['a']['b'] = 3
If s is a dict, this would have the effect of adding a ('b', 3) item under 'a'. But in the general case, this might:
- fail, because s['a'] doesn’t support item assignment (doesn’t have a __setitem__)
- or, worse, pass silently but not actually persist the write as expected (e.g. LocalFileStore)
Another example: s.popitem() will pop a (k, v) pair off of the s store. That is, it retrieves the v for k, deletes the entry for k, and returns (k, v). Note that, unlike modern dicts, which return the last item that was stored (that is, LIFO, last-in, first-out, order), KvPersisters give no assurance as to which item will be returned, since that depends on the backend storage system and/or how the persister was implemented.
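The "passes silently but does not persist" case can be reproduced with a small standard-library sketch. JsonPersister is a hypothetical class, not part of dol; it keeps values serialized, so every read hands back a fresh object, which is the situation a file-backed store would typically be in.

```python
import json
from collections.abc import MutableMapping


class JsonPersister(MutableMapping):
    """Hypothetical persister: values are kept as JSON strings in a dict backend."""

    def __init__(self):
        self._backend = {}

    def __getitem__(self, k):
        return json.loads(self._backend[k])  # a fresh object on every read

    def __setitem__(self, k, v):
        self._backend[k] = json.dumps(v)

    def __delitem__(self, k):
        del self._backend[k]

    def __iter__(self):
        return iter(self._backend)

    def __len__(self):
        return len(self._backend)


s = JsonPersister()
s['a'] = {}
s['a']['b'] = 3          # mutates a temporary copy; nothing is written back
silently_lost = s['a']   # the stored value is unchanged

# Read, modify, then write back explicitly to persist the change.
v = s['a']
v['b'] = 3
s['a'] = v
persisted = s['a']
```

The read-modify-write pattern at the end is the safe way to update a nested value when the store may serialize its contents.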
-
clear
()¶ The clear method is disabled to make dangerous operations difficult. You don’t want to delete your whole DB by accident. If you really want to delete all your data, you can do so by doing something like this:
for k in self:
    del self[k]
or (in some cases)
for k in self:
    try:
        del self[k]
    except KeyError:
        pass
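One caveat when adapting these loops: with some backends (a plain dict, for instance), deleting while iterating over the live key iterator raises a RuntimeError. A cautious sketch, using a hypothetical helper named delete_all, takes a snapshot of the keys first:

```python
def delete_all(store):
    """Delete every item of a mutable mapping (hypothetical helper, not a dol function)."""
    # list(store) snapshots the keys so the deletions cannot disturb the iteration.
    for k in list(store):
        try:
            del store[k]
        except KeyError:
            # The key may have been removed by someone else in the meantime.
            pass


d = {'a': 1, 'b': 2}
delete_all(d)
```

The snapshot costs memory proportional to the number of keys, which may matter for very large stores.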
-
-
class
dol.base.
KvReader
[source]¶ Acts as a Mapping abc, but with a default __len__ (implemented by counting keys) and a head method to get the first (k, v) item of the store.
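As a rough illustration of what "default __len__ by counting keys" and head amount to, here is a standard-library sketch. SketchReader is a hypothetical class, and returning None from head on an empty store is an assumption of this sketch, not a documented behavior.

```python
from collections.abc import Mapping


class SketchReader(Mapping):
    """Hypothetical read-only store over any object supporting [] and iteration."""

    def __init__(self, src):
        self._src = src

    def __getitem__(self, k):
        return self._src[k]

    def __iter__(self):
        return iter(self._src)

    def __len__(self):
        # No native length is assumed: count the keys one by one.
        return sum(1 for _ in self)

    def head(self):
        # Return the first (k, v) item; None if the store is empty (an assumption).
        for k in self:
            return k, self[k]
        return None


r = SketchReader({'foo': 33, 'bar': 65})
first_item = r.head()
n_items = len(r)
```

Counting keys is linear in the size of the store, so a concrete reader with a cheaper way to get its size would normally override __len__.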
-
dol.base.
KvStore
¶ alias of
dol.base.Store
-
dol.base.
Persister
¶ alias of
dol.base.KvPersister
-
dol.base.
Reader
¶ alias of
dol.base.KvReader
-
class
dol.base.
Store
(store=<class 'dict'>)[source]¶ By store we mean key-value store. This could be files in a filesystem, objects in s3, or a database. Where and how the content is stored should be specified, but StoreInterface offers a dict-like interface to this.
__getitem__ calls: _id_of_key, _obj_of_data
__setitem__ calls: _id_of_key, _data_of_obj
__delitem__ calls: _id_of_key
__iter__ calls: _key_of_id
>>> # Default store: no key or value conversion #####################################
>>> from dol import Store
>>> s = Store()
>>> s['foo'] = 33
>>> s['bar'] = 65
>>> assert list(s.items()) == [('foo', 33), ('bar', 65)]
>>> assert list(s.store.items()) == [('foo', 33), ('bar', 65)]  # see that the store contains the same thing
>>>
>>> #################################################################################
>>> # Now let's make stores that have a key and value conversion layer ##############
>>> # input keys will be upper cased, and output keys lower cased ###################
>>> # input values (assumed int) will be converted to ascii string, and vice versa ##
>>> #################################################################################
>>>
>>> def test_store(s):
...     s['foo'] = 33  # write 33 to 'foo'
...     assert 'foo' in s  # __contains__ works
...     assert 'no_such_key' not in s  # `not in` works too
...     s['bar'] = 65  # write 65 to 'bar'
...     assert len(s) == 2  # there are indeed two elements
...     assert list(s) == ['foo', 'bar']  # these are the keys
...     assert list(s.keys()) == ['foo', 'bar']  # the keys() method works!
...     assert list(s.values()) == [33, 65]  # the values() method works!
...     assert list(s.items()) == [('foo', 33), ('bar', 65)]  # these are the items
...     assert list(s.store.items()) == [('FOO', '!'), ('BAR', 'A')]  # but note the internal representation
...     assert s.get('foo') == 33  # the get method works
...     assert s.get('no_such_key', 'something') == 'something'  # return a default value
...     del s['foo']  # you can delete an item given its key
...     assert len(s) == 1  # see, only one item left!
...     assert list(s.items()) == [('bar', 65)]  # here it is
>>>
>>> # We can introduce this conversion layer in several ways. Here are a few... ####################
>>> # by subclassing ###############################################################################
>>> class MyStore(Store):
...     def _id_of_key(self, k):
...         return k.upper()
...     def _key_of_id(self, _id):
...         return _id.lower()
...     def _data_of_obj(self, obj):
...         return chr(obj)
...     def _obj_of_data(self, data):
...         return ord(data)
>>> s = MyStore(store=dict())  # note that you don't need to specify dict(), since it's the default
>>> test_store(s)
>>>
>>> # by assigning functions to converters ##########################################################
>>> class MyStore(Store):
...     def __init__(self, store, _id_of_key, _key_of_id, _data_of_obj, _obj_of_data):
...         super().__init__(store)
...         self._id_of_key = _id_of_key
...         self._key_of_id = _key_of_id
...         self._data_of_obj = _data_of_obj
...         self._obj_of_data = _obj_of_data
...
>>> s = MyStore(dict(),
...             _id_of_key=lambda k: k.upper(),
...             _key_of_id=lambda _id: _id.lower(),
...             _data_of_obj=lambda obj: chr(obj),
...             _obj_of_data=lambda data: ord(data))
>>> test_store(s)
>>>
>>> # using a Mixin class #############################################################################
>>> class Mixin:
...     def _id_of_key(self, k):
...         return k.upper()
...     def _key_of_id(self, _id):
...         return _id.lower()
...     def _data_of_obj(self, obj):
...         return chr(obj)
...     def _obj_of_data(self, data):
...         return ord(data)
...
>>> class MyStore(Mixin, Store):  # note that the Mixin must come before Store in the mro
...     pass
...
>>> s = MyStore()  # no dict()? No, because default anyway
>>> test_store(s)
>>>
>>> # adding wrapper methods to an already made Store instance #########################################
>>> s = Store(dict())
>>> s._id_of_key = lambda k: k.upper()
>>> s._key_of_id = lambda _id: _id.lower()
>>> s._data_of_obj = lambda obj: chr(obj)
>>> s._obj_of_data = lambda data: ord(data)
>>> test_store(s)
Note on defining your own “Mapping Views”.
When you do a .keys(), a .values() or .items() you’re getting a MappingView instance; an iterable and sized container that provides some methods to access particular aspects of the wrapped mapping.
If you need to customize the behavior of these instances, you should avoid overriding the keys, values or items methods directly, but instead override the KeysView, ValuesView or ItemsView classes that they use.
For more, see: https://github.com/i2mint/dol/wiki/Mapping-Views
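The idea of swapping the view class rather than overriding keys() can be sketched with the standard library. Everything named here (SketchMapping, SortedKeysView, SortedKeysMapping) is hypothetical, and the way the mapping exposes its view class as a KeysView attribute is an assumption made for illustration; the wiki page linked above is the reference for how dol does it.

```python
from collections.abc import Mapping, KeysView as BaseKeysView


class SketchMapping(Mapping):
    """Hypothetical mapping whose keys() builds its view from a class attribute."""

    KeysView = BaseKeysView  # subclasses can swap this class out

    def __init__(self, d):
        self._d = d

    def __getitem__(self, k):
        return self._d[k]

    def __iter__(self):
        return iter(self._d)

    def __len__(self):
        return len(self._d)

    def keys(self):
        return self.KeysView(self)


class SortedKeysView(BaseKeysView):
    """A keys view that iterates in sorted order."""

    def __iter__(self):
        # self._mapping is set by collections.abc.MappingView.__init__
        return iter(sorted(self._mapping))


class SortedKeysMapping(SketchMapping):
    KeysView = SortedKeysView  # customize the view, leave keys() alone


m = SortedKeysMapping({'b': 1, 'a': 2})
sorted_keys = list(m.keys())
raw_keys = list(m)
```

Because only the view class changes, the behavior of keys() stays uniform across subclasses, and the customization is easy to reuse.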
-
class
dol.base.
Stream
(stream)[source]¶ A layer-able version of the stream interface
__iter__ calls: _obj_of_data(map)
>>> from io import StringIO
>>>
>>> src = StringIO(
... '''a, b, c
... 1,2, 3
... 4, 5,6
... '''
... )
>>>
>>> from dol.base import Stream
>>>
>>> class MyStream(Stream):
...     def _obj_of_data(self, line):
...         return [x.strip() for x in line.strip().split(',')]
...
>>> stream = MyStream(src)
>>>
>>> list(stream)
[['a', 'b', 'c'], ['1', '2', '3'], ['4', '5', '6']]
>>> stream.seek(0)  # oh!... but we consumed the stream already, so let's go back to the beginning
0
>>> list(stream)
[['a', 'b', 'c'], ['1', '2', '3'], ['4', '5', '6']]
>>> stream.seek(0)  # rewind again
0
>>> next(stream)
['a', 'b', 'c']
>>> next(stream)
['1', '2', '3']
Let’s add a filter! There are two kinds you can use: one that is applied to the line before the data is transformed by _obj_of_data, and another that is applied after (to the obj).
>>> from dol.base import Stream
>>> from io import StringIO
>>>
>>> src = StringIO(
... '''a, b, c
... 1,2, 3
... 4, 5,6
... ''')
>>> class MyFilteredStream(MyStream):
...     def _post_filt(self, obj):
...         return str.isnumeric(obj[0])
>>>
>>> s = MyFilteredStream(src)
>>>
>>> list(s)
[['1', '2', '3'], ['4', '5', '6']]
>>> s.seek(0)
0
>>> list(s)
[['1', '2', '3'], ['4', '5', '6']]
>>> s.seek(0)
0
>>> next(s)
['1', '2', '3']
Recipes:
_pre_iter: involving itertools.islice to skip header lines
_pre_iter: involving enumerate to get line indices in stream iterator
_pre_iter = functools.partial(map, line_pre_proc_func) to preprocess all lines with line_pre_proc_func
_pre_iter: include filter before obj
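Two of the recipes listed above can be sketched as follows. SketchStream is a hypothetical stand-in that only mirrors the hook names used in this section (_pre_iter, _obj_of_data, _post_filt); the order in which it applies them is an assumption of the sketch, so check the Stream source before relying on it.

```python
from io import StringIO
from itertools import islice


class SketchStream:
    """Hypothetical stand-in for a layered stream with three hooks."""

    def __init__(self, stream):
        self.stream = stream

    def _pre_iter(self, lines):
        return lines

    def _obj_of_data(self, line):
        return line

    def _post_filt(self, obj):
        return True

    def __iter__(self):
        for line in self._pre_iter(self.stream):
            obj = self._obj_of_data(line)
            if self._post_filt(obj):
                yield obj


class SkipHeaderStream(SketchStream):
    def _pre_iter(self, lines):
        return islice(lines, 1, None)  # skip the first (header) line

    def _obj_of_data(self, line):
        return [x.strip() for x in line.strip().split(',')]


class NumberedStream(SketchStream):
    def _pre_iter(self, lines):
        return enumerate(lines)  # pair each line with its index

    def _obj_of_data(self, indexed_line):
        i, line = indexed_line
        return i, line.strip()


rows = list(SkipHeaderStream(StringIO('a, b, c\n1,2, 3\n4, 5,6\n')))
numbered = list(NumberedStream(StringIO('x\ny\n')))
```

Note that once _pre_iter changes the shape of the items (as enumerate does), _obj_of_data has to expect that new shape.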
-
dol.base.
delegator_wrap
(delegator: Callable, obj: Union[type, Any], class_trans=None, delegation_attr: str = 'store')[source]¶ Wrap an obj (type or instance) with delegator.
If obj is not a type, trivially returns delegator(obj).
The interesting case of delegator_wrap is when obj is a type (a class). In this case, delegator_wrap returns a callable (class or function) that has the same signature as obj, but that produces instances that are wrapped by delegator.
- Parameters
delegator – An instance wrapper. A Callable (type or function, with only one required input) that will return a wrapped version of its input instance.
obj – The object (class or instance) to be wrapped.
- Returns
A wrapped object
Let’s demo this on a simple Delegator class.
>>> from dol.base import delegator_wrap
>>> class Delegator:
...     i_think = 'therefore I am delegated'  # this is just to verify that we're in a Delegator
...     def __init__(self, wrapped_obj):
...         self.wrapped_obj = wrapped_obj
...     def __getattr__(self, attr):  # delegation: just forward attributes to wrapped_obj
...         return getattr(self.wrapped_obj, attr)
...     wrap = classmethod(delegator_wrap)  # this is a useful recipe to have the Delegator carry its own wrapping method
The only difference between a wrapped object Delegator(obj) and the original obj is that the wrapped one has an i_think attribute. The wrapped object should otherwise behave the same (on all but special (dunder) methods). So let’s test this on dictionaries, using the following test function:
>>> def test_wrapped_d(wrapped_d, original_d):
...     '''A function to test a wrapped dict'''
...     assert not hasattr(original_d, 'i_think')  # verify that original_d doesn't have an i_think attribute
...     assert list(wrapped_d.items()) == list(original_d.items())  # verify that wrapped_d has an items that gives us the same thing as original_d
...     assert hasattr(wrapped_d, 'i_think')  # ... but wrapped_d has an i_think attribute
...     assert wrapped_d.i_think == 'therefore I am delegated'  # ... and it's what we set it to be
Let’s try delegating a dict INSTANCE first:
>>> d = {'a': 1, 'b': 2}
>>> wrapped_d = delegator_wrap(Delegator, d)
>>> test_wrapped_d(wrapped_d, d)
If we ask delegator_wrap to wrap a dict type, we get a subclass of Delegator (NOT dict!) whose instances will have the behavior exhibited above:
>>> WrappedDict = delegator_wrap(Delegator, dict, delegation_attr='wrapped_obj')
>>> assert issubclass(WrappedDict, Delegator)
>>> wrapped_d = WrappedDict(a=1, b=2)
>>> test_wrapped_d(wrapped_d, wrapped_d.wrapped_obj)
Now we’ll demo/test the wrap = classmethod(delegator_wrap) trick …
… with instances
>>> wrapped_d = Delegator.wrap(d)
>>> test_wrapped_d(wrapped_d, wrapped_d.wrapped_obj)
… with classes
>>> WrappedDict = Delegator.wrap(dict, delegation_attr='wrapped_obj')
>>> wrapped_d = WrappedDict(a=1, b=2)
>>> test_wrapped_d(wrapped_d, wrapped_d.wrapped_obj)
>>> class A(dict):
...     def foo(self, x):
...         pass
>>> hasattr(A, 'foo')
True
>>> WrappedA = Delegator.wrap(A)
>>> hasattr(WrappedA, 'foo')
True
-
dol.base.
has_kv_store_interface
(o)[source]¶ Check if object has the KvStore interface (that is, has the kv wrapper methods).
- Parameters
o – object (class or instance)
Returns: True if o has the four key (in/out) and value (in/out) transformation methods
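One plausible way to express such a check is a simple attribute test. The function below, looks_like_kv_store, is a hypothetical sketch and not the library's implementation; it assumes the "four transformation methods" are the hooks named earlier in this page.

```python
# Assumption: these are the four key/value transformation hooks being checked.
KV_HOOKS = ('_id_of_key', '_key_of_id', '_data_of_obj', '_obj_of_data')


def looks_like_kv_store(o):
    """Return True if o (class or instance) has all four hook attributes."""
    return all(hasattr(o, name) for name in KV_HOOKS)


class WithHooks:
    def _id_of_key(self, k):
        return k

    def _key_of_id(self, _id):
        return _id

    def _data_of_obj(self, obj):
        return obj

    def _obj_of_data(self, data):
        return data


class Plain:
    pass
```

Since hasattr works on classes and instances alike, the same function covers both cases mentioned in the parameter description.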
-
dol.base.
kv_walk
(v: collections.abc.Mapping, yield_func: Callable[[PT, KT, VT], Any] = <function asis>, walk_filt: Callable[[PT, KT, VT], bool] = <function val_is_mapping>, pkv_to_pv: Callable[[PT, KT, VT], Tuple[PT, VT]] = <function tuple_keypath_and_val>, p: PT = ()) → Iterator[Any][source]¶ Walks a nested structure of mappings, yielding stuff on the way.
- Parameters
v – A nested structure of mappings
yield_func – (pp, k, vv) -> whatever you want the generator to yield
walk_filt – (p, k, vv) -> (bool) whether to explore the nested structure v further
pkv_to_pv – (p, k, v) -> (pp, vv) where pp is a form of p + k (update of the path with the new node k) and vv is the value that will be used by both walk_filt and yield_func
p – The path to v (used internally, mainly, to keep track of the path)
>>> d = {'a': 1, 'b': {'c': 2, 'd': 3}}
>>> list(kv_walk(d))
[(('a',), 'a', 1), (('b', 'c'), 'c', 2), (('b', 'd'), 'd', 3)]
>>> list(kv_walk(d, lambda p, k, v: '.'.join(p)))
['a', 'b.c', 'b.d']
The walk_filt argument allows you to control which of the values encountered during the walk should themselves be walked through. This also means that this function is what controls when to stop the recursive traversal of the tree and yield an actual “leaf”.
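To make the roles of the three callables concrete, here is a simplified recursive sketch written from the parameter descriptions above. sketch_walk is a hypothetical name and this is not the library's code; its defaults are assumptions chosen to reproduce the documented output of kv_walk(d).

```python
from collections.abc import Mapping


def sketch_walk(
    v,
    yield_func=lambda pp, k, vv: (pp, k, vv),
    walk_filt=lambda p, k, vv: isinstance(vv, Mapping),
    pkv_to_pv=lambda p, k, v: (p + (k,), v),
    p=(),
):
    """Simplified walk over nested mappings (illustration only)."""
    for k, child in v.items():
        pp, vv = pkv_to_pv(p, k, child)  # extend the path, pick the value to use
        if walk_filt(p, k, vv):
            # Keep exploring: vv is treated as a nested mapping.
            yield from sketch_walk(vv, yield_func, walk_filt, pkv_to_pv, pp)
        else:
            # Stop here: vv is a "leaf" as far as this walk is concerned.
            yield yield_func(pp, k, vv)


d = {'a': 1, 'b': {'c': 2, 'd': 3}}
walked = list(sketch_walk(d))
dotted = list(sketch_walk(d, lambda p, k, v: '.'.join(p)))
shallow = list(sketch_walk(d, walk_filt=lambda p, k, v: False))
```

With a walk_filt that always returns False, the walk never descends, so the nested mapping under 'b' is yielded as a leaf.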
Say we want to get (path, values) items from a nested mapping/store based on a levels argument that determines what the desired values are. This can be done as follows:
>>> def mk_level_walk_filt(levels):
...     return lambda p, k, v: len(p) < levels - 1
...
>>> def leveled_map_walk(m, levels):
...     yield from kv_walk(
...         m,
...         yield_func=lambda p, k, v: (p, v),
...         walk_filt=mk_level_walk_filt(levels)
...     )
>>> m = {
...     'a': {'b': {'c': 42}},
...     'aa': {'bb': {'cc': 'dragon_con'}}
... }
>>>
>>> assert (
...     list(leveled_map_walk(m, 3))
...     == [
...         (('a', 'b', 'c'), 42),
...         (('aa', 'bb', 'cc'), 'dragon_con')
...     ]
... )
>>> assert (
...     list(leveled_map_walk(m, 2))
...     == [
...         (('a', 'b'), {'c': 42}),
...         (('aa', 'bb'), {'cc': 'dragon_con'})
...     ]
... )
>>>
>>> assert (
...     list(leveled_map_walk(m, 1))
...     == [
...         (('a',), {'b': {'c': 42}}),
...         (('aa',), {'bb': {'cc': 'dragon_con'}})
...     ]
... )
Tip: If you want to use kv_walk to search and extract stuff from a nested mapping, you can have your yield_func return a sentinel (say, None) to indicate that the value should be skipped, and then filter out the None values from your results.
>>> mm = {
...     'a': {'b': {'c': 42}},
...     'aa': {'bb': {'cc': 'meaning_of_life'}},
...     'aaa': {'bbb': 314},
... }
>>> return_path_if_int_leaf = lambda p, k, v: (p, v) if isinstance(v, int) else None
>>> list(filter(None, kv_walk(mm, yield_func=return_path_if_int_leaf)))
[(('a', 'b', 'c'), 42), (('aaa', 'bbb'), 314)]
This “path search” functionality is available as a function in the recipes module, as search_paths.
Inspiration: kv_walk was inspired by remap from the boltons package. You may consider using that instead, as it has much more extensive documentation: see https://sedimental.org/remap.html for example.