class py2store.misc.MiscGetter(store=<py2store.persisters.local_files.PathFormatPersister object>, incoming_val_trans_for_key={'.bin': <function identity_method>, '.cnf': <function <lambda>>, '.conf': <function <lambda>>, '.config': <function <lambda>>, '.csv': <function <lambda>>, '.gz': <function decompress>, '.gzip': <function decompress>, '.ini': <function <lambda>>, '.json': <function <lambda>>, '.pickle': <function <lambda>>, '.pkl': <function <lambda>>, '.txt': <function <lambda>>, '.zip': <class 'py2store.slib.s_zipfile.FilesOfZip'>}, dflt_incoming_val_trans=<function identity_method>, func_key=<function MiscGetter.<lambda>>)[source]

An object to read (and only read) from a store (default: local files) with automatic deserialization according to a property of the key (default: file extension).

>>> from py2store.misc import get_obj, misc_objs_get
>>> import os
>>> import json
>>> pjoin = lambda *p: os.path.join(os.path.expanduser('~'), *p)
>>> path = pjoin('tmp.json')
>>> d = {'a': {'b': {'c': [1, 2, 3]}}}
>>> json.dump(d, open(path, 'w'))  # putting a json file there, the normal way, so we can use it later
>>> k = path
>>> t = get_obj(k)  # if you'd like to use a function
>>> assert t == d
>>> tt = misc_objs_get[k]  # if you'd like to use an object (note: can get, but nothing else (no list, set, del, etc))
>>> assert tt == d
>>> t
{'a': {'b': {'c': [1, 2, 3]}}}
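The extension-based dispatch described above can be pictured roughly as follows. This is a minimal standalone sketch using only the standard library, not py2store's implementation; `deserializers`, `ext_of` and `read_obj` are hypothetical names introduced for illustration.

```python
import json
import os
import pickle
import tempfile

# Hypothetical mapping from file extension to a (bytes -> object) function
deserializers = {
    '.json': lambda b: json.loads(b.decode()),
    '.txt': lambda b: b.decode(),
    '.pkl': pickle.loads,
}


def ext_of(key):
    """Property of the key used to choose a deserializer (here: the file extension)."""
    return os.path.splitext(key)[1]


def read_obj(path, trans_for_key=deserializers, dflt_trans=lambda b: b):
    """Read raw bytes from path, then decode them according to the extension."""
    with open(path, 'rb') as fp:
        raw = fp.read()
    # Unknown extensions fall back to the default transformation (identity here)
    return trans_for_key.get(ext_of(path), dflt_trans)(raw)


# Usage: write a JSON file the normal way, then read it back through the dispatcher
tmp_path = os.path.join(tempfile.mkdtemp(), 'tmp.json')
with open(tmp_path, 'w') as fp:
    json.dump({'a': {'b': {'c': [1, 2, 3]}}}, fp)
loaded = read_obj(tmp_path)
```

Keeping the dispatch table as a plain dict is what makes it easy to override or extend per extension, as the mixin examples further down do.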
class py2store.misc.MiscGetterAndSetter(store=<py2store.persisters.local_files.PathFormatPersister object>, incoming_val_trans_for_key={'.bin': <function identity_method>, '.cnf': <function <lambda>>, '.conf': <function <lambda>>, '.config': <function <lambda>>, '.csv': <function <lambda>>, '.gz': <function decompress>, '.gzip': <function decompress>, '.ini': <function <lambda>>, '.json': <function <lambda>>, '.pickle': <function <lambda>>, '.pkl': <function <lambda>>, '.txt': <function <lambda>>, '.zip': <class 'py2store.slib.s_zipfile.FilesOfZip'>}, outgoing_val_trans_for_key={'.bin': <function identity_method>, '.cnf': <function <lambda>>, '.conf': <function <lambda>>, '.config': <function <lambda>>, '.csv': <function csv_fileobj>, '.gz': <function compress>, '.gzip': <function compress>, '.ini': <function <lambda>>, '.json': <function <lambda>>, '.pickle': <function <lambda>>, '.pkl': <function <lambda>>, '.txt': <function <lambda>>}, dflt_incoming_val_trans=<function identity_method>, func_key=<function MiscGetterAndSetter.<lambda>>)[source]

An object to read and write (and nothing else) to a store (default local) with automatic (de)serialization according to a property of the key (default: file extension).

>>> from py2store.misc import set_obj, misc_objs  # the function and the object
>>> import json
>>> import os
>>> pjoin = lambda *p: os.path.join(os.path.expanduser('~'), *p)
>>> d = {'a': {'b': {'c': [1, 2, 3]}}}
>>> misc_objs[pjoin('tmp.json')] = d
>>> filepath = os.path.expanduser('~/tmp.json')
>>> assert misc_objs[filepath] == d  # yep, it's there, and can be retrieved
>>> assert json.load(open(filepath)) == d  # in case you don't believe it's an actual json file
>>> # using pickle
>>> misc_objs[pjoin('tmp.pkl')] = d
>>> assert misc_objs[pjoin('tmp.pkl')] == d
>>> # using txt
>>> misc_objs[pjoin('tmp.txt')] = 'hello world!'
>>> assert misc_objs[pjoin('tmp.txt')] == 'hello world!'
>>> # using csv
>>> misc_objs[pjoin('tmp.csv')] = [[1,2,3], ['a','b','c']]
>>> assert misc_objs[pjoin('tmp.csv')] == [['1','2','3'], ['a','b','c']]  # yeah, well, not numbers, but you deal with it
>>> # using bin
... misc_objs[pjoin('tmp.bin')] = b'let us pretend these are bytes of an audio waveform'
>>> assert misc_objs[pjoin('tmp.bin')] == b'let us pretend these are bytes of an audio waveform'
class py2store.misc.MiscReaderMixin(incoming_val_trans_for_key=None, dflt_incoming_val_trans=None, func_key=None)[source]

Mixin to transform incoming vals according to the key they're under. Warning: If used as a subclass, this mixin should (in general) be placed before the store in the list of base classes.

>>> # make a reader that will wrap a dict
>>> class MiscReader(MiscReaderMixin, dict):
...     def __init__(self, d,
...                         incoming_val_trans_for_key=None,
...                         dflt_incoming_val_trans=None,
...                         func_key=None):
...         dict.__init__(self, d)
...         MiscReaderMixin.__init__(self, incoming_val_trans_for_key, dflt_incoming_val_trans, func_key)
>>> incoming_val_trans_for_key = dict(
...     MiscReaderMixin._incoming_val_trans_for_key,  # take the existing defaults...
...     **{'.bin': lambda v: [ord(x) for x in v.decode()], # ... override how to handle the .bin extension
...      '.reverse_this': lambda v: v[::-1]  # add a new extension (and how to handle it)
...     })
>>> import pickle
>>> d = {
...     'a.bin': b'abc123',
...     'a.reverse_this': b'abc123',
...     'a.csv': b'event,year\n Magna Carta,1215\n Guido,1956',
...     'a.txt': b'this is not a text',
...     'a.pkl': pickle.dumps(['text', [str, map], {'a list': [1, 2, 3]}]),
...     'a.json': '{"str": "field", "int": 42, "float": 3.14, "array": [1, 2], "nested": {"a": 1, "b": 2}}',
... }
>>> s = MiscReader(d=d, incoming_val_trans_for_key=incoming_val_trans_for_key)
>>> list(s)
['a.bin', 'a.reverse_this', 'a.csv', 'a.txt', 'a.pkl', 'a.json']
>>> s['a.bin']
[97, 98, 99, 49, 50, 51]
>>> s['a.reverse_this']
b'321cba'
>>> s['a.csv']
[['event', 'year'], [' Magna Carta', '1215'], [' Guido', '1956']]
>>> s['a.pkl']
['text', [<class 'str'>, <class 'map'>], {'a list': [1, 2, 3]}]
>>> s['a.json']
{'str': 'field', 'int': 42, 'float': 3.14, 'array': [1, 2], 'nested': {'a': 1, 'b': 2}}
class py2store.misc.MiscStoreMixin(incoming_val_trans_for_key=None, outgoing_val_trans_for_key=None, dflt_incoming_val_trans=None, dflt_outgoing_val_trans=None, func_key=None)[source]

Mixin to transform incoming and outgoing vals according to the key they're under. Warning: If used as a subclass, this mixin should (in general) be placed before the store in the list of base classes.

See also: preset and postget args from wrap_kvs decorator from py2store.trans.

>>> # Make a class to wrap a dict with a layer that transforms written and read values
>>> class MiscStore(MiscStoreMixin, dict):
...     def __init__(self, d,
...                         incoming_val_trans_for_key=None, outgoing_val_trans_for_key=None,
...                         dflt_incoming_val_trans=None, dflt_outgoing_val_trans=None,
...                         func_key=None):
...         dict.__init__(self, d)
...         MiscStoreMixin.__init__(self, incoming_val_trans_for_key, outgoing_val_trans_for_key,
...                                 dflt_incoming_val_trans, dflt_outgoing_val_trans, func_key)
>>> outgoing_val_trans_for_key = dict(
...     MiscStoreMixin._outgoing_val_trans_for_key,  # take the existing defaults...
...     **{'.bin': lambda v: ''.join([chr(x) for x in v]).encode(), # ... override how to handle the .bin extension
...        '.reverse_this': lambda v: v[::-1]  # add a new extension (and how to handle it)
...     })
>>> ss = MiscStore(d={},  # store starts empty
...                incoming_val_trans_for_key={},  # overriding incoming trans so we can see the raw data later
...                outgoing_val_trans_for_key=outgoing_val_trans_for_key)
>>> # here's what we're going to write in the store
>>> data_to_write = {
...      'a.bin': [97, 98, 99, 49, 50, 51],
...      'a.reverse_this': b'321cba',
...      'a.csv': [['event', 'year'], [' Magna Carta', '1215'], [' Guido', '1956']],
...      'a.txt': 'this is not a text',
...      'a.pkl': ['text', [str, map], {'a list': [1, 2, 3]}],
...      'a.json': {'str': 'field', 'int': 42, 'float': 3.14, 'array': [1, 2], 'nested': {'a': 1, 'b': 2}}}
>>> # write this data in our store
>>> for k, v in data_to_write.items():
...     ss[k] = v
>>> list(ss)
['a.bin', 'a.reverse_this', 'a.csv', 'a.txt', 'a.pkl', 'a.json']
>>> # Looking at the contents (what was actually stored/written)
>>> for k, v in ss.items():
...     if k != 'a.pkl':
...         print(f"{k}: {v}")
...     else:  # need to verify pickle data differently, since printing contents is problematic in doctest
...         assert pickle.loads(v) == data_to_write['a.pkl']
a.bin: b'abc123'
a.reverse_this: b'abc123'
a.csv: b'event,year\r\n Magna Carta,1215\r\n Guido,1956\r\n'
a.txt: b'this is not a text'
a.json: b'{"str": "field", "int": 42, "float": 3.14, "array": [1, 2], "nested": {"a": 1, "b": 2}}'
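The warning about placing the mixin before the store comes down to Python's method resolution order. The following is a small sketch of that effect with a hypothetical `JsonValuesMixin` (not part of py2store) wrapping a plain dict.

```python
import json


class JsonValuesMixin:
    """Hypothetical mixin: serialize values on write, deserialize them on read."""

    def __getitem__(self, k):
        return json.loads(super().__getitem__(k))

    def __setitem__(self, k, v):
        super().__setitem__(k, json.dumps(v))


class JsonDict(JsonValuesMixin, dict):  # mixin first, store second
    pass


class WrongOrder(dict, JsonValuesMixin):  # store first: dict's methods win
    pass


store = JsonDict()
store['a.json'] = {'int': 42, 'array': [1, 2]}
decoded = store['a.json']                # read through the mixin: a dict again
raw = dict.__getitem__(store, 'a.json')  # what is actually stored: a JSON string

wrong = WrongOrder()
wrong['k'] = [1]
bypassed = dict.__getitem__(wrong, 'k')  # the mixin never ran: still a list
```

With the mixin listed first, `super()` inside the mixin resolves to the dict methods, so the transformation wraps the storage. With the order reversed, the dict methods are found first and the mixin is silently skipped.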
py2store.misc.get_obj(k, store=<py2store.persisters.local_files.PathFormatPersister object>, incoming_val_trans_for_key={'.bin': <function identity_method>, '.cnf': <function <lambda>>, '.conf': <function <lambda>>, '.config': <function <lambda>>, '.csv': <function <lambda>>, '.gz': <function decompress>, '.gzip': <function decompress>, '.ini': <function <lambda>>, '.json': <function <lambda>>, '.pickle': <function <lambda>>, '.pkl': <function <lambda>>, '.txt': <function <lambda>>, '.zip': <class 'py2store.slib.s_zipfile.FilesOfZip'>}, dflt_incoming_val_trans=<function identity_method>, func_key=<function <lambda>>)[source]

A quick way to get an object, with default… everything (but the key, you know, a clue of what you want)

py2store.misc.set_obj(k, v, store=<py2store.persisters.local_files.PathFormatPersister object>, outgoing_val_trans_for_key={'.bin': <function identity_method>, '.cnf': <function <lambda>>, '.conf': <function <lambda>>, '.config': <function <lambda>>, '.csv': <function csv_fileobj>, '.gz': <function compress>, '.gzip': <function compress>, '.ini': <function <lambda>>, '.json': <function <lambda>>, '.pickle': <function <lambda>>, '.pkl': <function <lambda>>, '.txt': <function <lambda>>}, func_key=<function <lambda>>)[source]

A quick way to set an object, with default… everything (but the key and the value: where to write, and what)


utils for testing

py2store.test.util.random_dict_gen(fields=('a', 'b', 'c'), word_size_range=(1, 10), alphabet='abcdefghijklmnopqrstuvwxyz', n: int = 100)[source]

Random dict (of strings) generator

  • fields – Field names for the random dicts

  • word_size_range – An int, 2-tuple of ints, or list-like object that defines the choices of word sizes

  • alphabet – A string or iterable defining the alphabet to draw from

  • n – The number of elements the generator will yield


Returns: Random dict (of strings) generator

py2store.test.util.random_formatted_str_gen(format_string='root/{}/{}_{}.test', word_size_range=(1, 10), alphabet='abcdefghijklmnopqrstuvwxyz', n=100)[source]

Random formatted string generator

  • format_string – A format string

  • word_size_range – An int, 2-tuple of ints, or list-like object that defines the choices of word sizes

  • alphabet – A string or iterable defining the alphabet to draw from

  • n – The number of elements the generator will yield


Yields random strings of the format defined by format_string


# >>> list(random_formatted_str_gen('root/{}/{}_{}.test', (2, 5), 'abc', n=5))
# [('root/acba/bb_abc.test',), ('root/abcb/cbbc_ca.test',), ('root/ac/ac_cc.test',), ('root/aacc/ccbb_ab.test',), ('root/aab/abb_cbab.test',)]

>>> # The following is made non-random (by restricting the constraints to "no choice"),
>>> # so that we get consistent outputs to assert for the doc test.
>>> # Example with automatic specification
>>> list(random_formatted_str_gen('root/{}/{}_{}.test', (3, 4), 'a', n=2))
[('root/aaa/aaa_aaa.test',), ('root/aaa/aaa_aaa.test',)]
>>> # Example with manual specification
>>> list(random_formatted_str_gen('indexed field: {0}: named field: {name}', (2, 3), 'z', n=1))
[('indexed field: zz: named field: zz',)]
py2store.test.util.random_string(length=7, alphabet='abcdefghijklmnopqrstuvwxyz')[source]

Same as random_word, but optimized for strings (5-10% faster for words of length 7, 25-30% faster for words of length 1000)

py2store.test.util.random_tuple_gen(tuple_length=3, word_size_range=(1, 10), alphabet='abcdefghijklmnopqrstuvwxyz', n: int = 100)[source]

Random tuple (of strings) generator

  • tuple_length – The length of the tuples generated

  • word_size_range – An int, 2-tuple of ints, or list-like object that defines the choices of word sizes

  • alphabet – A string or iterable defining the alphabet to draw from

  • n – The number of elements the generator will yield


Returns: Random tuple (of strings) generator

py2store.test.util.random_word(length, alphabet, concat_func=<built-in function add>)[source]

Make a random word by concatenating randomly drawn elements from alphabet together

  • length – Length of the word

  • alphabet – Alphabet to draw from

  • concat_func – The concatenation function (e.g. + for strings and lists)

Note: Repeated elements in alphabet will have more chances of being drawn.


Returns: A word (whose type depends on what concatenating elements from alphabet produces).

Not making this a proper doctest because I don't know how to seed the global random temporarily

>>> t = random_word(4, 'abcde');  # e.g. 'acae'
>>> t = random_word(5, ['a', 'b', 'c']);  # e.g. 'cabba'
>>> t = random_word(4, [[1, 2, 3], [40, 50], [600], [7000]]);  # e.g. [40, 50, 7000, 7000, 1, 2, 3]
>>> t = random_word(4, [1, 2, 3, 4]);  # e.g. 13 (because adding numbers...)
>>> # ... sometimes it's what you want:
>>> t = random_word(4, [2 ** x for x in range(8)]);  # e.g. 105 (binary combination)
>>> t = random_word(4, [1, 2, 3, 4], concat_func=lambda x, y: str(x) + str(y));  # e.g. '4213'
>>> t = random_word(4, [1, 2, 3, 4], concat_func=lambda x, y: int(str(x) + str(y)));  # e.g. 3432
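One way to get repeatable draws without touching the global random state is to pass in a private, seeded `random.Random` instance. The sketch below reimplements the idea of random_word under that assumption; `random_word_sketch` and its `rng` parameter are hypothetical and are not part of py2store.

```python
import random
from functools import reduce
from operator import add


def random_word_sketch(length, alphabet, concat_func=add, rng=random):
    """Concatenate `length` (>= 1) randomly drawn elements of `alphabet` with concat_func."""
    alphabet = list(alphabet)
    return reduce(concat_func, (rng.choice(alphabet) for _ in range(length)))


# Two generators seeded identically produce the same word
w1 = random_word_sketch(4, 'abcde', rng=random.Random(42))
w2 = random_word_sketch(4, 'abcde', rng=random.Random(42))

# With numbers, the default concat_func (add) sums the draws
total = random_word_sketch(4, [1, 2, 3, 4], rng=random.Random(0))
```

Because the seeded generator is local to the call, other code relying on the global `random` module is unaffected.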

py2store.test.util.random_word_gen(word_size_range=(1, 10), alphabet='abcdefghijklmnopqrstuvwxyz', n=100)[source]

Random string generator

  • word_size_range – An int, 2-tuple of ints, or list-like object that defines the choices of word sizes

  • alphabet – A string or iterable defining the alphabet to draw from

  • n – The number of elements the generator will yield


Returns: Random string generator



test files



scrap code


General util objects


Simple access to docx (Word Doc) elements.


Stores to talk to gitlab, using requests.

Example:

```
ogl = GitLabAccessor(base_url="http://…", project_name=None)
print(ogl.get_project_names())  # prints all project names
ogl.set_project("PROJECT_NAME")  # sets the project to "PROJECT_NAME"
print(…)  # gets the branch names of current project (as set previously)
print(…)  # gets a json of information about the master branch of current project.
```


a data object layer for HDF files


py2store Extensions, Add-ons, etc. We kept py2store purely dependency-less, using only built-ins for everything but storage system connectors.

That said, in order to provide the user with more power, and show him/her how py2store tools can be used to build powerful data accessors, we provide specialized modules that do require more than builtins. These dependencies are not listed in the module, but we wrap their imports with informative ImportError handlers.


a data object layer for matlab




a data object layer for github


Data as pandas.DataFrame from various sources


Utils to load stores from store specifications. Includes the logic to allow configurations (and defaults) to be parametrized by external environmental variables and files.

Every data-sourced problem has its problem-relevant stores. Once you get your stores right, along with the right access credentials, indexing, serialization, caching, filtering etc., you'd like to be able to name, save and/or share this specification, and easily get access to it later on.

Here are tools to help you out.

There are two main key-value stores: One for configurations the user wants to reuse, and the other for the user’s desired defaults. Both have the same structure:

  • first level key: Name of the resource (should be a valid python variable name)

  • The remainder is more or less free form (until the day we lay out some schemas for this)

The system will look for the specification of user_configs and user_defaults in a json file. The filepath to this json file can be specified in environment variables


respectively. By default, they are:

~/.py2store_configs.json and ~/.py2store_defaults.json
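The lookup described above (an environment variable naming a json file, with a default path as fallback) could look roughly like this. It is a sketch only: `load_json_spec` and the environment variable name `HYPOTHETICAL_CONFIGS_JSON_FILEPATH` are made up for illustration, and returning an empty dict when the file is missing is an assumption, not documented behavior.

```python
import json
import os
import tempfile


def load_json_spec(env_var_name, default_filepath):
    """Load a json spec from the path held in env_var_name, else from default_filepath."""
    filepath = os.path.expanduser(os.environ.get(env_var_name, default_filepath))
    if not os.path.isfile(filepath):
        return {}  # assumption: no file means no specifications
    with open(filepath) as fp:
        return json.load(fp)


# Demo: point the (hypothetical) environment variable at a temporary json file
demo_path = os.path.join(tempfile.mkdtemp(), 'configs.json')
with open(demo_path, 'w') as fp:
    json.dump({'my_store': {'rootdir': '/tmp/data'}}, fp)
os.environ['HYPOTHETICAL_CONFIGS_JSON_FILEPATH'] = demo_path

user_configs = load_json_spec('HYPOTHETICAL_CONFIGS_JSON_FILEPATH', '~/.py2store_configs.json')
```

The first-level key (`my_store` here) plays the role of the resource name; everything under it is free form.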



Make a function that is the composition of the input functions

py2store.access.dflt_func_loader(f) → callable[source]

Loads and returns the function referenced by f, which could be a callable or a DOTPATH_TO_MODULE.FUNC_NAME dotpath string to one, or a pipeline of these

py2store.access.dotpath_to_func(f: (<class 'str'>, <built-in function callable>)) → callable[source]

Loads and returns the function referenced by f, which could be a callable or a DOTPATH_TO_MODULE.FUNC_NAME dotpath string to one.


Loads and returns the object referenced by the string DOTPATH_TO_MODULE.OBJ_NAME
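Resolving a DOTPATH_TO_MODULE.OBJ_NAME string can be done with the standard `importlib` module. The sketch below shows one plausible way to do it; `load_obj_from_dotpath` and `load_func` are hypothetical names, not the py2store functions themselves.

```python
import importlib


def load_obj_from_dotpath(dotpath):
    """Split 'pkg.module.name' at the last dot, import the module, fetch the attribute."""
    module_path, _, obj_name = dotpath.rpartition('.')
    module = importlib.import_module(module_path)
    return getattr(module, obj_name)


def load_func(f):
    """Return f if it is already callable, else resolve it as a dotpath string."""
    return f if callable(f) else load_obj_from_dotpath(f)


dumps = load_func('json.dumps')  # resolved from a string
same = load_func(len)            # callables pass through unchanged
pi = load_obj_from_dotpath('math.pi')
```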

py2store.access.fakit(fak, func_loader=<function dflt_func_loader>)[source]

Execute a fak with given f, a, k and function loader.

Essentially returns func_loader(f)(*a, **k)

  • fak – A (f, a, k) specification. Could be a tuple or a dict (with ‘f’, ‘a’, ‘k’ keys). All but f are optional.

  • func_loader – A function returning a function. This is where you specify any validation of func specification f, and/or how to get a callable from it.

Returns: A python object.
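As a rough illustration of the (f, a, k) convention, here is a minimal sketch that accepts either a tuple or a dict and calls func_loader(f)(*a, **k). The name `fakit_sketch` is hypothetical, and the identity function is used as the loader here for simplicity (an assumption; the documented default is dflt_func_loader).

```python
def fakit_sketch(fak, func_loader=lambda f: f):
    """Call func_loader(f)(*a, **k) for a (f, a, k) tuple or a dict with 'f', 'a', 'k' keys."""
    if isinstance(fak, dict):
        f, a, k = fak['f'], fak.get('a', ()), fak.get('k', {})
    else:
        f, *rest = fak
        a = rest[0] if len(rest) > 0 else ()  # positional arguments are optional
        k = rest[1] if len(rest) > 1 else {}  # keyword arguments are optional
    return func_loader(f)(*a, **k)


r1 = fakit_sketch((sorted, ([3, 1, 2],), {'reverse': True}))
r2 = fakit_sketch({'f': int, 'a': ('ff',), 'k': {'base': 16}})
r3 = fakit_sketch((dict,))  # only f given
```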

py2store.access.getenv(name, default=None)[source]

Like os.getenv, but removes a trailing \r character if present (problem with some env var systems)
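Assuming the suffix in question is a carriage return ('\r', as left behind by files with Windows line endings), the behavior can be sketched as below. `getenv_sketch` and the variable names are hypothetical.

```python
import os


def getenv_sketch(name, default=None):
    """Like os.getenv, but drop a single trailing carriage return from the value."""
    value = os.getenv(name, default)
    if isinstance(value, str) and value.endswith('\r'):
        value = value[:-1]
    return value


os.environ['SKETCH_DEMO_VAR'] = '/some/path\r'
cleaned = getenv_sketch('SKETCH_DEMO_VAR')
missing = getenv_sketch('SKETCH_DEMO_VAR_NOT_SET', 'fallback')
```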


Your portal to many Data Object Layer goodies

py2store.__init__.ihead(store, n=1)[source]

Get the first item of an iterable, or a list of the first n items

py2store.__init__.kvhead(store, n=1)[source]

Get the first item of a kv store, or a list of the first n items
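A possible reading of these two helpers, sketched with `itertools.islice`: return the item itself when n is 1, and a list otherwise. The names `ihead_sketch` and `kvhead_sketch` are hypothetical, and the n == 1 special case is an interpretation of the descriptions above.

```python
from itertools import islice


def ihead_sketch(iterable, n=1):
    """First item if n == 1, else a list of the first n items."""
    if n == 1:
        return next(iter(iterable))
    return list(islice(iterable, n))


def kvhead_sketch(store, n=1):
    """Same as ihead_sketch, but over the (key, value) pairs of a mapping."""
    return ihead_sketch(store.items(), n)


store = {'d': 4, 'a': 1, 'c': 3}
first_key = ihead_sketch(store)
first_item = kvhead_sketch(store)
first_two_items = kvhead_sketch(store, 2)
```

Neither helper materializes the whole store, which matters when the store is backed by a slow or very large source.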


stores to operate on local files

class py2store.stores.local_store.AutoMkDirsOnSetitemMixin[source]

A mixin that will automatically create directories on setitem, when missing.

class py2store.stores.local_store.AutoMkPathformatMixin(path_format=None, max_levels=None)[source]

A mixin that will choose a path_format if none given

class py2store.stores.local_store.DirStore(rootdir)[source]

A store for local directories. Keys are directory names and values are subdirectory DirStores.

>>> from py2store import __file__
>>> import os
>>> root = os.path.dirname(__file__)
>>> s = DirStore(root)
>>> assert set(s).issuperset({'stores', 'persisters', 'serializers', 'key_mappers'})
class py2store.stores.local_store.LocalBinaryStore(path_format, max_levels=None)[source]

Local files store for binary data

class py2store.stores.local_store.LocalJsonStore(path_format, max_levels=None)[source]

Local files store for text data. Data is assumed to be a JSON string, and is loaded with json.loads and dumped with json.dumps

class py2store.stores.local_store.LocalPickleStore(path_format, max_levels=None, fix_imports=True, protocol=None, pickle_encoding='ASCII', pickle_errors='strict', **open_kwargs)[source]

Local files store with pickle serialization


alias of py2store.stores.local_store.QuickPickleStore

class py2store.stores.local_store.LocalTextStore(path_format, max_levels=None)[source]

Local files store for text data

class py2store.stores.local_store.MakeMissingDirsStoreMixin[source]

Will make a local file store automatically create the directories needed to create a file. Should be placed before the concrete persister in the MRO, but in such a manner that it receives full paths.

class py2store.stores.local_store.PathFormatStore(path_format, max_levels: int = inf, mode='', **open_kwargs)[source]

Local file store using templated relative paths.

>>> from tempfile import gettempdir
>>> import os
>>> def write_to_key(fullpath_of_relative_path, relative_path, content):  # a function to write content in files
...    with open(fullpath_of_relative_path(relative_path), 'w') as fp:
...        fp.write(content)
>>> # Preparation: Make a temporary rootdir and write two files in it
>>> rootdir = os.path.join(gettempdir(), 'path_format_store_test' + os.sep)
>>> if not os.path.isdir(rootdir):
...     os.mkdir(rootdir)
>>> # recreate directory (remove existing files, delete directory, and re-create it)
>>> for f in os.listdir(rootdir):
...     fullpath = os.path.join(rootdir, f)
...     if os.path.isfile(fullpath):
...         os.remove(os.path.join(rootdir, f))
>>> if os.path.isdir(rootdir):
...     os.rmdir(rootdir)
>>> if not os.path.isdir(rootdir):
...    os.mkdir(rootdir)
>>> filepath_of = lambda p: os.path.join(rootdir, p)  # a function to get a fullpath from a relative one
>>> # and make two files in this new dir, with some content
>>> write_to_key(filepath_of, 'a', 'foo')
>>> write_to_key(filepath_of, 'b', 'bar')
>>> # point the obj source to the rootdir
>>> s = PathFormatStore(path_format=rootdir)
>>> # assert things...
>>> assert s._prefix == rootdir  # the _prefix is the rootdir given in the constructor
>>> assert s[filepath_of('a')] == 'foo'  # (the filepath for) 'a' contains 'foo'
>>> # two files under rootdir (as long as the OS didn't create its own under the hood)
>>> len(s)
2
>>> assert list(s) == [filepath_of('a'), filepath_of('b')]  # there are two files in s
>>> filepath_of('a') in s  # rootdir/a is in s
True
>>> filepath_of('not_there') in s  # rootdir/not_there is not in s
False
>>> filepath_of('not_there') not in s  # rootdir/not_there is not in s
True
>>> assert list(s.keys()) == [filepath_of('a'), filepath_of('b')]  # the keys (filepaths) of s
>>> sorted(list(s.values())) # the values of s (contents of files)
['bar', 'foo']
>>> assert list(s.items()) == [(filepath_of('a'), 'foo'), (filepath_of('b'), 'bar')]  # the (path, content) items
>>> assert s.get('this key is not there', None) is None  # trying to get the val of a non-existing key returns None
>>> s.get('this key is not there', 'some default value')  # ... or whatever you say
'some default value'
>>> # add more files to the same folder
>>> write_to_key(filepath_of, 'this.txt', 'this')
>>> write_to_key(filepath_of, 'that.txt', 'blah')
>>> write_to_key(filepath_of, 'the_other.txt', 'bloo')
>>> # see that you now have 5 files
>>> len(s)
5
>>> # and these files contain values:
>>> sorted(s.values())
['bar', 'blah', 'bloo', 'foo', 'this']
>>> # but if we make an obj source to only take files whose extension is '.txt'...
>>> s = PathFormatStore(path_format=rootdir + '{}.txt')
>>> rootdir_2 = os.path.join(gettempdir(), 'obj_source_test_2' + os.sep)  # get another rootdir
>>> if not os.path.isdir(rootdir_2):
...    os.mkdir(rootdir_2)
>>> filepath_of_2 = lambda p: os.path.join(rootdir_2, p)
>>> # and make the same three files in this new dir, with the same content
>>> write_to_key(filepath_of_2, 'this.txt', 'this')
>>> write_to_key(filepath_of_2, 'that.txt', 'blah')
>>> write_to_key(filepath_of_2, 'the_other.txt', 'bloo')
>>> ss = PathFormatStore(path_format=rootdir_2 + '{}.txt')
>>> assert s != ss  # though pointing to identical content, s and ss are not equal since the paths are not equal!
class py2store.stores.local_store.PathFormatStoreWithPrefix(*args, **kwargs)[source]

alias of py2store.stores.local_store.LocalPickleStore

class py2store.stores.local_store.QuickBinaryStore(path_format=None, max_levels=None)[source]

Local files store for binary data with default temp root and auto dir generation on write.

class py2store.stores.local_store.QuickJsonStore(path_format=None, max_levels=None)[source]

Local files store for text data with default temp root and auto dir generation on write. Data is assumed to be a JSON string, and is loaded with json.loads and dumped with json.dumps

class py2store.stores.local_store.QuickLocalStoreMixin(path_format=None, max_levels=None)[source]

A mixin that will choose a path_format if none given, and will automatically create directories on setitem, when missing.

class py2store.stores.local_store.QuickPickleStore(path_format=None, max_levels=None)[source]

Local files store with pickle serialization with default temp root and auto dir generation on write.


alias of py2store.stores.local_store.QuickPickleStore

class py2store.stores.local_store.QuickTextStore(path_format=None, max_levels=None)[source]

Local files store for text data with default temp root and auto dir generation on write.

class py2store.stores.local_store.RelativeDirPathFormatKeys(*args, **kwargs)[source]
class py2store.stores.local_store.RelativePathFormatStore2(*args, **kwargs)[source]


a package of various stores




Forwards to dol.core:

Core tools


utils to work with URIs

py2store.utils.uri_utils.build_uri(scheme, database='', username=None, password=None, host='localhost', port=None)[source]

Reverse of parse_uri function. Builds a URI string from provided params.


Parses DB URI string into a dict of params.

  • uri – string formatted as: "scheme://username:password@host:port/database"

Returns: a dict with these params parsed.
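To make the parse/build round trip concrete, here is a sketch built on the standard `urllib.parse.urlsplit`. The function names and the exact dict keys are assumptions chosen to mirror the build_uri signature; py2store's own implementation may differ.

```python
from urllib.parse import urlsplit


def parse_uri_sketch(uri):
    """Split 'scheme://username:password@host:port/database' into a dict of params."""
    parts = urlsplit(uri)
    return {
        'scheme': parts.scheme,
        'username': parts.username,
        'password': parts.password,
        'host': parts.hostname,
        'port': parts.port,  # an int, or None when absent
        'database': parts.path.lstrip('/'),
    }


def build_uri_sketch(scheme, database='', username=None, password=None,
                     host='localhost', port=None):
    """Reverse of parse_uri_sketch: assemble the URI string from its params."""
    auth = ''
    if username:
        auth = username + (f':{password}' if password else '') + '@'
    netloc = auth + host + (f':{port}' if port else '')
    return f'{scheme}://{netloc}/{database}'


uri = 'mongodb://user:secret@localhost:27017/mydb'
params = parse_uri_sketch(uri)
rebuilt = build_uri_sketch(**params)
```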


utils to make stores based on the input data itself

class py2store.utils.explicit.ExplicitKeymapReader(store, key_of_id=None, id_of_key=None)[source]

Wrap a store (instance) so that it gets its keys from an explicit iterable of keys.

>>> s = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> id_of_key = {'A': 'a', 'C': 'c'}
>>> ss = ExplicitKeymapReader(s, id_of_key=id_of_key)
>>> list(ss)
['A', 'C']
>>> ss['C']  # will look up 'C', find 'c', and call the store on that.
3
class py2store.utils.explicit.ExplicitKeys(key_collection: Collection)[source]

py2store.base.Keys implementation that gets its keys explicitly from a collection given at initialization time. The key_collection must be a Collection (such as list, tuple, set, etc.)

>>> keys = ExplicitKeys(key_collection=['foo', 'bar', 'alice'])
>>> 'foo' in keys
True
>>> 'not there' in keys
False
>>> list(keys)
['foo', 'bar', 'alice']
class py2store.utils.explicit.ExplicitKeysSource(key_collection: Collection, _obj_of_key: Callable)[source]

An object source that uses an explicit keys collection and a specified function to read contents for a key.

class py2store.utils.explicit.ExplicitKeysStore(store, key_collection)[source]

Wrap a store (instance) so that it gets its keys from an explicit iterable of keys.

>>> s = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
>>> list(s)
['a', 'b', 'c', 'd']
>>> ss = ExplicitKeysStore(s, ['d', 'a'])
>>> len(ss)
2
>>> list(ss)
['d', 'a']
>>> list(ss.values())
[4, 1]
>>> ss.head()
('d', 4)
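The wrapping idea can be sketched with `collections.abc.Mapping`: iteration and length come from the explicit key collection, while values are still fetched from the wrapped store. `ExplicitKeysView` is a hypothetical stand-in, and restricting lookups to the listed keys is a choice made for this sketch rather than documented behavior.

```python
from collections.abc import Mapping


class ExplicitKeysView(Mapping):
    """Hypothetical read-only view of a store, restricted to an explicit key collection."""

    def __init__(self, store, key_collection):
        self._store = store
        self._keys = list(key_collection)

    def __getitem__(self, k):
        if k not in self._keys:  # keys outside the collection are hidden
            raise KeyError(k)
        return self._store[k]

    def __iter__(self):
        return iter(self._keys)  # order is that of the key collection

    def __len__(self):
        return len(self._keys)


s = {'a': 1, 'b': 2, 'c': 3, 'd': 4}
ss = ExplicitKeysView(s, ['d', 'a'])
```

This is useful when listing the keys of the underlying store is expensive or impossible, but the keys of interest are already known.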
class py2store.utils.explicit.ExplicitKeysWithPrefixRelativization(key_collection, _prefix=None)[source]

py2store.base.Keys implementation that gets its keys explicitly from a collection given at initialization time. The key_collection must be a Collection (such as list, tuple, set, etc.)

>>> from py2store.base import Store
>>> s = ExplicitKeysWithPrefixRelativization(key_collection=['/root/of/foo', '/root/of/bar', '/root/for/alice'])
>>> keys = Store(store=s)
>>> 'of/foo' in keys
True
>>> 'not there' in keys
False
>>> list(keys)
['of/foo', 'of/bar', 'for/alice']
class py2store.utils.explicit.ObjReader(_obj_of_key: Callable)[source]

A reader that uses a specified function to get the contents for a given key.

>>> # define a contents_of_key that reads stuff from a dict
>>> data = {'foo': 'bar', 42: "everything"}
>>> def read_dict(k):
...     return data[k]
>>> pr = ObjReader(_obj_of_key=read_dict)
>>> pr['foo']
'bar'
>>> pr[42]
'everything'
>>> # define contents_of_key that reads stuff from a file given its path
>>> def read_file(path):
...     with open(path) as fp:
...         return fp.read()
>>> pr = ObjReader(_obj_of_key=read_file)
>>> file_where_this_code_is = __file__  # it should be THIS file you're reading right now!
>>> print(pr[file_where_this_code_is][62:155])  # print some characters of this file
from collections.abc import Mapping
from typing import Callable, Collection as CollectionType
py2store.utils.explicit.invertible_maps(mapping=None, inv_mapping=None)[source]

Returns two maps that are inverse of each other. Raises a ValueError if both maps are None, and an AssertionError if the maps are not inverse of each other

Get a pair of invertible maps

>>> invertible_maps({1: 11, 2: 22})
({1: 11, 2: 22}, {11: 1, 22: 2})
>>> invertible_maps(None, {11: 1, 22: 2})
({1: 11, 2: 22}, {11: 1, 22: 2})

If two maps are given and invertible, you just get them back

>>> invertible_maps({1: 11, 2: 22}, {11: 1, 22: 2})
({1: 11, 2: 22}, {11: 1, 22: 2})

Or if they're not invertible

>>> invertible_maps({1: 11, 2: 22}, {11: 1, 22: 'ha, not what you expected!'})
Traceback (most recent call last):
AssertionError: mapping and inv_mapping are not inverse of each other!


>>> invertible_maps(None, None)
Traceback (most recent call last):
ValueError: You need to specify one or both maps
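The behavior shown in these examples can be reproduced with a few lines of plain Python. The following is a sketch consistent with the doctests, not the library's source; `invertible_maps_sketch` is a hypothetical name, and the sketch does not check that a single given map is injective.

```python
def invertible_maps_sketch(mapping=None, inv_mapping=None):
    """Return (mapping, inv_mapping), deriving whichever one is missing."""
    if mapping is None and inv_mapping is None:
        raise ValueError("You need to specify one or both maps")
    if inv_mapping is None:
        inv_mapping = {v: k for k, v in mapping.items()}
    elif mapping is None:
        mapping = {v: k for k, v in inv_mapping.items()}
    else:
        # Both given: every (k, v) must appear as (v, k) in the other map
        is_inverse = len(mapping) == len(inv_mapping) and all(
            v in inv_mapping and inv_mapping[v] == k for k, v in mapping.items()
        )
        assert is_inverse, "mapping and inv_mapping are not inverse of each other!"
    return mapping, inv_mapping


forward_only = invertible_maps_sketch({1: 11, 2: 22})
backward_only = invertible_maps_sketch(None, {11: 1, 22: 2})

try:
    invertible_maps_sketch({1: 11, 2: 22}, {11: 1, 22: 'ha, not what you expected!'})
    mismatch_detected = False
except AssertionError:
    mismatch_detected = True

try:
    invertible_maps_sketch(None, None)
    missing_detected = False
except ValueError:
    missing_detected = True
```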


Tools to cache time-series data.

class py2store.utils.timeseries_caching.RegularTimeseriesCache(data_rate=1, time_rate=1, maxlen=None)[source]

A type that pretends to be a (possibly very large) list, but where contents of the list are populated as they are needed. Further, the indexing of the list can be overwritten for the convenience of the user.

The canonical application is where we have segments of continuous waveform indexed by utc microseconds timestamps.

It is convenient to be able to read segments of this waveform as if it was one big waveform (handling the discontinuities gracefully), and have the choice of using (relative or absolute) integer indices or utc indices.


utils for bulk writing – accumulate, aggregate and write when some condition is met

class py2store.utils.cumul_aggreg_write.CumulAggregWrite(store, cache_to_kv=<function mk_kv_from_keygen.<locals>.aggregate>, mk_cache=<class 'list'>)[source]
class py2store.utils.cumul_aggreg_write.CumulAggregWriteKvItems(store)[source]
class py2store.utils.cumul_aggreg_write.CumulAggregWriteWithAutoFlush(store, cache_to_kv=<function mk_kv_from_keygen.<locals>.aggregate>, mk_cache=<class 'list'>, flush_cache_condition=<function condition_flush_on_every_write>)[source]

Boolean function used as flush_cache_condition to flush anytime the cache is non-empty

py2store.utils.cumul_aggreg_write.mk_group_aggregator(item_to_kv, aggregator_op=<built-in function add>, initial=<py2store.utils.cumul_aggreg_write.NoInitial object>)[source]

Make a generator transforming function that will (a) make a (key, value) pair for each given item, (b) group the values according to the key, and (c) aggregate the values of each group with aggregator_op

  • item_to_kv

  • aggregator_op

  • initial


>>> # Collect words (as a csv string), grouped by the lower case of the first letter
>>> ag = mk_group_aggregator(lambda item: (item[0].lower(), item),
...                          aggregator_op=lambda x, y: ', '.join([x, y]))
>>> list(ag(['apple', 'bananna', 'Airplane']))
[('a', 'apple, Airplane'), ('b', 'bananna')]
>>> # Collect things (in a list), grouped by age
>>> ag = mk_group_aggregator(lambda item: (item['age'], item['thing']),
...                          aggregator_op=lambda x, y: x + [y],
...                          initial=[])
>>> list(ag([{'age': 0, 'thing': 'new'}, {'age': 42, 'thing': 'every'}, {'age': 0, 'thing': 'just born'}]))
[(0, ['new', 'just born']), (42, ['every'])]
py2store.utils.cumul_aggreg_write.mk_group_aggregator_with_key_func(item_to_key, aggregator_op=<built-in function add>, initial=<py2store.utils.cumul_aggreg_write.NoInitial object>)[source]

Make a generator transforming function that will (a) make a key for each given item, (b) group all items according to the key, and (c) aggregate the items of each group with aggregator_op

  • item_to_key – Function that takes an item of the generator and outputs the key that should be used to group items

  • aggregator_op – The aggregation binary function that is used to aggregate two items together. The function is used as is by the functools.reduce, applied to the sequence of items that were collected for a given group

  • initial – The “empty” element to start the reduce (aggregation) with, if necessary.


>>> # Collect words (as a csv string), grouped by the lower case of the first letter
>>> ag = mk_group_aggregator_with_key_func(lambda item: item[0].lower(),
...                          aggregator_op=lambda x, y: ', '.join([x, y]))
>>> list(ag(['apple', 'bananna', 'Airplane']))
[('a', 'apple, Airplane'), ('b', 'bananna')]
>>> # Collect (and concatenate) characters according to their ascii value modulo 3
... ag = mk_group_aggregator_with_key_func(lambda item: (ord(item) % 3))
>>> list(ag('abcdefghijklmnop'))
[(1, 'adgjmp'), (2, 'behkn'), (0, 'cfilo')]
>>> # sum all even and odd number separately
... ag = mk_group_aggregator_with_key_func(lambda item: (item % 2))
>>> list(ag([1, 2, 3, 4, 5]))  # sum of evens is 6, and sum of odds is 9
[(1, 9), (0, 6)]
>>> # if we wanted to collect all odds and evens, we'd need a different aggregator and initial
... ag = mk_group_aggregator_with_key_func(lambda item: (item % 2), aggregator_op=lambda x, y: x + [y], initial=[])
>>> list(ag([1, 2, 3, 4, 5]))
[(1, [1, 3, 5]), (0, [2, 4])]
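
The group-then-reduce behaviour shown in these examples can be approximated in a few lines. This is only a minimal sketch, not py2store's implementation: group_aggregate_sketch and _NO_INITIAL are hypothetical names, and the sketch collects each group in a list before reducing, which the real function may well avoid.

```python
from functools import reduce
from operator import add

_NO_INITIAL = object()  # hypothetical stand-in for the library's NoInitial marker


def group_aggregate_sketch(item_to_key, aggregator_op=add, initial=_NO_INITIAL):
    """Return a function that groups items by key and reduces each group."""

    def aggregate(items):
        groups = {}  # key -> list of items, in first-seen key order
        for item in items:
            groups.setdefault(item_to_key(item), []).append(item)
        for key, members in groups.items():
            if initial is _NO_INITIAL:
                yield key, reduce(aggregator_op, members)
            else:
                yield key, reduce(aggregator_op, members, initial)

    return aggregate


sum_by_parity = group_aggregate_sketch(lambda item: item % 2)
collect_by_parity = group_aggregate_sketch(
    lambda item: item % 2, aggregator_op=lambda x, y: x + [y], initial=[]
)
print(list(sum_by_parity([1, 2, 3, 4, 5])))      # [(1, 9), (0, 6)]
print(list(collect_by_parity([1, 2, 3, 4, 5])))  # [(1, [1, 3, 5]), (0, [2, 4])]
```

Passing initial matters whenever the aggregate has a different type from the items, as in the list-collecting case.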


general utils


descriptors to cache data


CachedProperties. This is usable directly as a decorator, with or without names. Any of these patterns will work:

  • @CachedProperty

  • @CachedProperty()

  • @CachedProperty('n','n2')

  • def thing(self: …; thing = CachedProperty(thing)

  • def thing(self: …; thing = CachedProperty(thing, 'n')

class py2store.utils.cache_descriptors.Lazy(func, name=None)[source]

Lazy Attributes.
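
For readers unfamiliar with the pattern, a lazy attribute is usually a non-data descriptor that computes a value on first access and then stores it on the instance. The sketch below illustrates that general technique under stated assumptions: LazySketch and Report are hypothetical names, and nothing here is taken from the actual Lazy source.

```python
class LazySketch:
    """Non-data descriptor: compute once, then store the value on the instance."""

    def __init__(self, func, name=None):
        self.func = func
        self.name = name or func.__name__

    def __get__(self, instance, owner=None):
        if instance is None:
            return self  # accessed on the class, not on an instance
        value = self.func(instance)
        # The instance attribute now shadows the descriptor, so later
        # lookups never reach __get__ again.
        instance.__dict__[self.name] = value
        return value


class Report:
    calls = 0  # counts how often the expensive function really runs

    @LazySketch
    def total(self):
        Report.calls += 1
        return sum(range(10))


r = Report()
first, second = r.total, r.total
print(first, second, Report.calls)  # 45 45 1
```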

class py2store.utils.cache_descriptors.cachedIn(attribute_name)[source]

Cached property with given cache attribute.


utils to add append and extend functionality to KV stores


utils to carry out affine transformations (of indices)

class py2store.utils.affine_conversion.AffineConverter(scale=1.0, offset=0.0)[source]

Getting a callable that will perform an affine conversion. Note, it does it as

(val - offset) * scale

(Note: this is not slope-intercept style, though there is the .from_slope_and_intercept constructor method for that.)

Inverse is available through the inv method, performing:

val / scale + offset

>>> convert = AffineConverter(scale=0.5, offset=1)
>>> convert(0)
-0.5
>>> convert(10)
4.5
>>> convert.inv(4)
9.0
>>> convert.inv(4.5)
10.0
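
To make the two formulas concrete, here is a standalone sketch of the arithmetic. The function names are hypothetical and this is not the class's code; it also shows how a slope-intercept pair maps onto the (scale, offset) convention, under the assumption that slope is non-zero.

```python
def affine_convert(val, scale=1.0, offset=0.0):
    # Forward conversion as documented: (val - offset) * scale
    return (val - offset) * scale


def affine_invert(val, scale=1.0, offset=0.0):
    # Inverse conversion as documented: val / scale + offset
    return val / scale + offset


def scale_offset_from_slope_intercept(slope, intercept):
    # slope * val + intercept == (val - offset) * scale
    # holds when scale == slope and offset == -intercept / slope.
    return slope, -intercept / slope


print(affine_convert(10, scale=0.5, offset=1))  # 4.5
print(affine_invert(4.5, scale=0.5, offset=1))  # 10.0

scale, offset = scale_offset_from_slope_intercept(2, 3)
print(affine_convert(5, scale, offset))         # 13.0, same as 2 * 5 + 3
```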
py2store.utils.affine_conversion.get_affine_converter_and_inverse(scale=1, offset=0, source_type_cast=None, target_type_cast=None)[source]
Getting two affine functions with given scale and offset, that are inverse of each other. Namely (for input val):

(val - offset) * scale and val / scale + offset

Note this is not “slope intercept” style!!

The source_type_cast and target_type_cast arguments (optional) allow the user to specify whether these transformations need to be further cast to a given type.

  • scale

  • offset

  • source_type_cast – function to apply to input

  • target_type_cast – function to apply to output

Returns two single-val functions: affine_converter, inverse_affine_converter

Note: The code is a lot more complex than the basic operations it performs. The reason is a concern for efficiency, since the returned functions are intended to be used in long loops.

See also: ocore.utils.conversion.AffineConverter

>>> affine_converter, inverse_affine_converter = get_affine_converter_and_inverse(scale=0.5,offset=1)
>>> affine_converter(0)
-0.5
>>> affine_converter(10)
4.5
>>> inverse_affine_converter(4)
9.0
>>> inverse_affine_converter(4.5)
10.0
>>> affine_converter, inverse_affine_converter = get_affine_converter_and_inverse(scale=0.5,offset=1,target_type_cast=int)
>>> affine_converter(10)
4


Deprecated: Forwards to py2store.signatures


utils to add sliceable functionality to stores

class py2store.utils.sliceable.iSliceStore(store)[source]

Wraps a store to make a reader that acts as if the store were a list (with integer keys, and sliceable). I say “list”, but the behavior is closer to that of range: keying with an integer returns a single element, while slicing returns an iterable object (for range, another range).

Here, a map object is returned when the sliceable store is sliced.

>>> s = {'foo': 'bar', 'hello': 'world', 'alice': 'bob'}
>>> sliceable_s = iSliceStore(s)
>>> sliceable_s[1]
'world'
>>> list(sliceable_s[0:2])
['bar', 'world']
>>> list(sliceable_s[-2:])
['world', 'bob']
>>> list(sliceable_s[:-1])
['bar', 'world']
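
The idea can be sketched independently of py2store. The class below is a hypothetical illustration, not iSliceStore itself: it materializes the key list on every access, which is simple but not necessarily what the library does.

```python
class SliceableValuesSketch:
    """Read-only view of a mapping's values, addressed by position."""

    def __init__(self, store):
        self.store = store

    def __getitem__(self, k):
        keys = list(self.store)  # positional order is the store's iteration order
        if isinstance(k, slice):
            # Slicing returns a lazy map object over the selected values.
            return map(self.store.__getitem__, keys[k])
        return self.store[keys[k]]


s = {'foo': 'bar', 'hello': 'world', 'alice': 'bob'}
view = SliceableValuesSketch(s)
print(view[1])            # world
print(list(view[0:2]))    # ['bar', 'world']
print(list(view[-2:]))    # ['world', 'bob']
```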


Utils to wrap any object into a mapping interface

class py2store.utils.mappify.LeafMappify(target, node_types=(<class 'dict'>, ), key_concat=<function Mappify.<lambda>>, names_of_literals=(), **kwargs)[source]

A dict-like interface to glom. Here, only leaf keys are taken into account.

>>> d = {
...     'a': 'simple',
...     'b': {'is': 'nested'},
...     'c': {'is': 'nested', 'and': 'has', 'a': [1, 2, 3]}
... }
>>> g = LeafMappify(d)
>>> assert list(g) == ['a', 'b.is', 'c.is', 'c.and', 'c.a']
>>> assert g['a'] == 'simple'
>>> assert g['b.is'] == 'nested'
>>> assert g['c.a'] == [1, 2, 3]
>>> for k, v in g.items():
...     print(f"{k}: {v}")
a: simple
b.is: nested
c.is: nested
c.and: has
c.a: [1, 2, 3]
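
A leaf-only, dotted-key view of a nested dict can be sketched with a short recursive generator. This is an illustration under simplifying assumptions (only dict counts as a node type, keys are strings, and '.' is the separator); iter_leaf_items is a hypothetical name and the code is not LeafMappify's implementation.

```python
def iter_leaf_items(d, prefix=()):
    """Yield (dotted_key, value) pairs for every non-dict value in d."""
    for key, value in d.items():
        path = prefix + (key,)
        if isinstance(value, dict):
            # Descend into nested dicts instead of yielding them.
            yield from iter_leaf_items(value, path)
        else:
            yield '.'.join(path), value


d = {
    'a': 'simple',
    'b': {'is': 'nested'},
    'c': {'is': 'nested', 'and': 'has', 'a': [1, 2, 3]},
}
leaves = dict(iter_leaf_items(d))
print(list(leaves))  # ['a', 'b.is', 'c.is', 'c.and', 'c.a']
```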
class py2store.utils.mappify.Mappify(target, node_types=(<class 'dict'>, ), key_concat=<function Mappify.<lambda>>, names_of_literals=(), **kwargs)[source]
>>> d = {
...     'a': 'simple',
...     'b': {'is': 'nested'},
...     'c': {'is': 'nested', 'and': 'has', 'a': [1, 2, 3]}
... }
>>> g = Mappify(d)
>>> assert list(g) == ['a', 'b.is', 'b', 'c.is', 'c.and', 'c.a', 'c']
>>> assert g['a'] == 'simple'
>>> assert g['b.is'] == 'nested'
>>> assert g['c.a'] == [1, 2, 3]
>>> for k, v in g.items():
...     print(f"{k}: {v}")
a: simple
b.is: nested
b: {'is': 'nested'}
c.is: nested
c.and: has
c.a: [1, 2, 3]
c: {'is': 'nested', 'and': 'has', 'a': [1, 2, 3]}


glom is a util to extract stuff from nested structures. It’s one of those excellent utils that I’ve written many times, but never got quite right. Mahmoud Hashemi got it right.


Copyright (c) 2018, Mahmoud Hashemi

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

  • Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

  • Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

  • The names of the contributors may not be used to endorse or promote products derived from this software without specific prior written permission.



Now, at the time of writing this, I’ve already transformed it to bend it to my liking. At some point it may become something else, but I wanted there to be a trace of what my seed was. Though I can’t promise I’ll maintain the same functionality as I transform this module, here’s a tutorial on how to use it in its original form:

I only took the main (core) module from the glom project. Here’s the original docs of this glom module.

If there was ever a Python example of “big things come in small packages”, glom might be it.

The glom package has one central entrypoint, glom.glom(). Everything else in the package revolves around that one function.

A couple of conventional terms you’ll see repeated many times below:

  • target - glom is built to work on any data, so we simply refer to the object being accessed as the “target”

  • spec - (aka “glomspec”, short for specification) The accompanying template used to specify the structure of the return value.

Now that you know the terms, let’s take a look around glom’s powerful semantics.

class py2store.utils.glom.Auto(spec=None)[source]

Switch to Auto mode (the default)

TODO: this seems like it should be a sub-class of class Spec() – if Spec() could help define the interface for new “modes” or dialects that would also help make match mode feel less duct-taped on

class py2store.utils.glom.Call(func=None, args=None, kwargs=None)[source]

Call specifies when a target should be passed to a function, func.

Call is similar to partial() in that it is no more powerful than lambda or other functions, but it is designed to be more readable, with a better repr.


func (callable) – a function or other callable to be called with the target

Call combines well with T to construct objects. For instance, to generate a dict and then pass it to a constructor:

>>> class ExampleClass(object):
...    def __init__(self, attr):
...        self.attr = attr
>>> target = {'attr': 3.14}
>>> glom(target, Call(ExampleClass, kwargs=T)).attr
3.14

This does the same as glom(target, lambda target: ExampleClass(**target)), but it’s easy to see which one reads better.


Call is mostly for functions. Use a T object if you need to call a method.


Call has a successor with a fuller-featured API, new in 19.3.0: the Invoke specifier type.

glomit(target, scope)[source]

run against the current target

class py2store.utils.glom.Check(spec=T, **kwargs)[source]

Check objects are used to make assertions about the target data, and either pass through the data or raise exceptions if there is a problem.

If any check condition fails, a CheckError is raised.

  • spec – a sub-spec to extract the data to which other assertions will be checked (defaults to applying checks to the target itself)

  • type – a type or sequence of types to be checked for exact match

  • equal_to – a value to be checked for equality match (“==”)

  • validate – a callable or list of callables, each representing a check condition. If one or more return False or raise an exception, the Check will fail.

  • instance_of – a type or sequence of types to be checked with isinstance()

  • one_of – an iterable of values, any of which can match the target (“in”)

  • default – an optional default value to replace the value when the check fails (if default is not specified, CheckError will be raised)

Aside from spec, all arguments are keyword arguments. Each argument, except for default, represents a check condition. Multiple checks can be passed, and if all check conditions are left unset, Check defaults to performing a basic truthy check on the value.

exception py2store.utils.glom.CheckError(msgs, check, path)[source]

This GlomError subtype is raised when target data fails to pass a Check’s specified validation.

An uncaught CheckError looks like this:

>>> target = {'a': {'b': 'c'}}
>>> glom(target, {'b': ('a.b', Check(type=int))})  
Traceback (most recent call last):
glom.CheckError: target at path ['a.b'] failed check, got error: "expected type to be 'int', found type 'str'"

If the Check contains more than one condition, there may be more than one error message. The string rendition of the CheckError will include all messages.

You can also catch the CheckError and programmatically access messages through the msgs attribute on the CheckError instance.


As of 2018-07-05 (glom v18.2.0), the validation subsystem is still very new. Exact error message formatting may be enhanced in future releases.

class py2store.utils.glom.Coalesce(*subspecs, **kwargs)[source]

Coalesce objects specify fallback behavior for a list of subspecs.

Subspecs are passed as positional arguments, and keyword arguments control defaults. Each subspec is evaluated in turn, and if none match, a CoalesceError is raised, or a default is returned, depending on the options used.


This operation may seem very familiar if you have experience with SQL, C#, or similar languages.

In practice, this fallback behavior’s simplicity is only surpassed by its utility:

>>> target = {'c': 'd'}
>>> glom(target, Coalesce('a', 'b', 'c'))
'd'

glom tries to get 'a' from target, but gets a KeyError. Rather than raise a PathAccessError as usual, glom coalesces into the next subspec, 'b'. The process repeats until it gets to 'c', which returns our value, 'd'. If our value weren’t present, we’d see:

>>> target = {}
>>> glom(target, Coalesce('a', 'b'))  
Traceback (most recent call last):
glom.CoalesceError: no valid values found. Tried ('a', 'b') and got (PathAccessError, PathAccessError) (at path [])

Same process, but because target is empty, we get a CoalesceError. If we want to avoid an exception, and we know which value we want by default, we can set default:

>>> target = {}
>>> glom(target, Coalesce('a', 'b', 'c'), default='d-fault')
'd-fault'

'a', 'b', and 'c' weren’t present so we got 'd-fault'.

  • subspecs – One or more glommable subspecs

  • default – A value to return if no subspec results in a valid value

  • default_factory – A callable whose result will be returned as a default

  • skip – A value, tuple of values, or predicate function representing values to ignore

  • skip_exc – An exception or tuple of exception types to catch and move on to the next subspec. Defaults to GlomError, the parent type of all glom runtime exceptions.

If all subspecs produce skipped values or exceptions, a CoalesceError will be raised. For more examples, check out the tutorial, which makes extensive use of Coalesce.
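
The fallback logic can be illustrated without glom. The sketch below is a simplified stand-in, not Coalesce's implementation: it handles only plain key lookups, coalesce_sketch and _MISSING are hypothetical names, and it raises LookupError where glom would raise CoalesceError.

```python
_MISSING = object()  # hypothetical marker meaning "no default supplied"


def coalesce_sketch(target, keys, default=_MISSING, skip=()):
    """Return the first usable target[key], trying keys in order."""
    for key in keys:
        try:
            value = target[key]
        except (KeyError, IndexError, TypeError):
            continue  # lookup failed, fall through to the next key
        if value in skip:
            continue  # value is present but explicitly ignored
        return value
    if default is _MISSING:
        raise LookupError(f"no valid values found. Tried {tuple(keys)!r}")
    return default


print(coalesce_sketch({'c': 'd'}, ['a', 'b', 'c']))               # d
print(coalesce_sketch({}, ['a', 'b', 'c'], default='d-fault'))    # d-fault
print(coalesce_sketch({'a': None, 'b': 1}, ['a', 'b'], skip=(None,)))  # 1
```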

exception py2store.utils.glom.CoalesceError(coal_obj, skipped, path)[source]

This GlomError subtype is raised from within a Coalesce spec’s processing, when none of the subspecs match and no default is provided.

The exception object itself keeps track of several values which may be useful for processing:

  • coal_obj (Coalesce) – The original failing spec, see Coalesce’s docs for details.

  • skipped (list) – A list of ignored values and exceptions, in the order that their respective subspecs appear in the original coal_obj.

  • path – Like many GlomErrors, this exception knows the path at which it occurred.

>>> target = {}
>>> glom(target, Coalesce('a', 'b'))  
Traceback (most recent call last):
glom.CoalesceError: no valid values found. Tried ('a', 'b') and got (PathAccessError, PathAccessError) ...
class py2store.utils.glom.Fill(spec=None)[source]

A specifier type which switches glom into “fill-mode”. For the spec contained within the Fill, glom will only interpret explicit specifier types (including T objects). Whereas the default mode has special interpretations for each of these builtins, fill-mode takes a lighter touch, making Fill great for “filling out” Python literals, like tuples, dicts, sets, and lists.

>>> target = {'data': [0, 2, 4]}
>>> spec = Fill((T['data'][2], T['data'][0]))
>>> glom(target, spec)
(4, 0)

As you can see, glom’s usual built-in tuple item chaining behavior has switched into a simple tuple constructor.

(Sidenote for Lisp fans: Fill is like glom’s quasi-quoting.)

exception py2store.utils.glom.GlomError[source]

The base exception for all the errors that might be raised from glom() processing logic.

By default, exceptions raised from within functions passed to glom (e.g., len, sum, any lambda) will not be wrapped in a GlomError.

class py2store.utils.glom.Glommer(**kwargs)[source]

All the wholesome goodness that it takes to make glom work. This type mostly serves to encapsulate the type registration context so that advanced uses of glom don’t need to worry about stepping on each other’s toes.

Glommer objects are lightweight and, once instantiated, provide the glom() method we know and love:

>>> glommer = Glommer()
>>> glommer.glom({}, 'a.b.c', default='d')
'd'
>>> Glommer().glom({'vals': list(range(3))}, ('vals', len))
3

Instances also provide a register() method for localized control over type handling.


register_default_types (bool) – Whether or not to enable the handling behaviors of the default glom(). These default actions include dict access, list and iterable iteration, and generic object attribute access. Defaults to True.

register(target_type, **kwargs)[source]

Register target_type so glom() will know how to handle instances of that type as targets.

  • target_type (type) – A type expected to appear in a glom() call target

  • get (callable) – A function which takes a target object and a name, acting as a default accessor. Defaults to getattr().

  • iterate (callable) – A function which takes a target object and returns an iterator. Defaults to iter() if target_type appears to be iterable.

  • exact (bool) – Whether or not to match instances of subtypes of target_type.


The module-level register() function affects the module-level glom() function’s behavior. If this global effect is undesirable for your application, or you’re implementing a library, consider instantiating a Glommer instance, and using the register() and Glommer.glom() methods instead.

class py2store.utils.glom.Inspect(*a, **kw)[source]

The Inspect specifier type provides a way to get visibility into glom’s evaluation of a specification, enabling debugging of those tricky problems that may arise with unexpected data.

Inspect can be inserted into an existing spec in one of two ways. First, as a wrapper around the spec in question, or second, as an argument-less placeholder wherever a spec could be.

Inspect supports several modes, controlled by keyword arguments. Its default, no-argument mode, simply echoes the state of the glom at the point where it appears:

>>> target = {'a': {'b': {}}}
>>> val = glom(target, Inspect('a.b'))  # wrapping a spec
path:   ['a.b']
target: {'a': {'b': {}}}
output: {}

Debugging behavior aside, Inspect has no effect on values in the target, spec, or result.

  • echo (bool) – Whether to print the path, target, and output of each inspected glom. Defaults to True.

  • recursive (bool) – Whether or not the Inspect should be applied at every level, at or below the spec that it wraps. Defaults to False.

  • breakpoint (bool) – This flag controls whether a debugging prompt should appear before evaluating each inspected spec. Can also take a callable. Defaults to False.

  • post_mortem (bool) – This flag controls whether exceptions should be caught and interactively debugged with pdb on inspected specs.

All arguments above are keyword-only to avoid overlap with a wrapped spec.


Just like pdb.set_trace(), be careful about leaving stray Inspect() instances in production glom specs.

class py2store.utils.glom.Invoke(func)[source]

Specifier type designed for easy invocation of callables from glom.


func (callable) – A function or other callable object.

Invoke is similar to functools.partial(), but with the ability to set up a “templated” call which interleaves constants and glom specs.

For example, the following creates a spec which can be used to check if targets are integers:

>>> is_int = Invoke(isinstance).specs(T).constants(int)
>>> glom(5, is_int)
True

And this composes like any other glom spec:

>>> target = [7, object(), 9]
>>> glom(target, [is_int])
[True, False, True]

Another example, mixing positional and keyword arguments:

>>> spec = Invoke(sorted).specs(T).constants(key=int, reverse=True)
>>> target = ['10', '5', '20', '1']
>>> glom(target, spec)
['20', '10', '5', '1']

Invoke also helps with evaluating zero-argument functions:

>>> glom(target={}, spec=Invoke(int))
0

(A trivial example, but from timestamps to UUIDs, zero-arg calls do come up!)


Invoke is mostly for functions, object construction, and callable objects. For calling methods, consider the T object.
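
The idea of a “templated” call that interleaves constants with values derived from the target can be sketched in plain Python. TemplatedCall is a hypothetical class written for illustration; in it, a “spec” is simply a one-argument callable applied to the target, which is far narrower than what glom accepts.

```python
class TemplatedCall:
    """Immutable call template mixing constants and target-derived arguments."""

    def __init__(self, func, args=()):
        self.func = func
        self._args = tuple(args)  # sequence of (kind, value) pairs

    def constants(self, *values):
        # Every builder call returns a new template, leaving self unchanged.
        return TemplatedCall(self.func, self._args + tuple(('C', v) for v in values))

    def specs(self, *getters):
        return TemplatedCall(self.func, self._args + tuple(('S', g) for g in getters))

    def __call__(self, target):
        args = [v if kind == 'C' else v(target) for kind, v in self._args]
        return self.func(*args)


identity = lambda target: target
is_int = TemplatedCall(isinstance).specs(identity).constants(int)
print(is_int(5))                                # True
print([is_int(x) for x in [7, object(), 9]])    # [True, False, True]
```

Returning a new object from each builder method is what lets a partially built template be shared and extended safely.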

constants(*a, **kw)[source]

Returns a new Invoke spec, with the provided positional and keyword argument values stored for passing to the underlying function.

>>> spec = Invoke(T).constants(5)
>>> glom(range, (spec, list))
[0, 1, 2, 3, 4]

Subsequent positional arguments are appended:

>>> spec = Invoke(T).constants(2).constants(10, 2)
>>> glom(range, (spec, list))
[2, 4, 6, 8]

Keyword arguments also work as one might expect:

>>> round_2 = Invoke(round).constants(ndigits=2).specs(T)
>>> glom(3.14159, round_2)
3.14

constants() and other Invoke methods may be called multiple times, just remember that every call returns a new spec.

classmethod specfunc(spec)[source]

Creates an Invoke instance where the function is indicated by a spec.

>>> spec = Invoke.specfunc('func').constants(5)
>>> glom({'func': range}, (spec, list))
[0, 1, 2, 3, 4]
specs(*a, **kw)[source]

Returns a new Invoke spec, with the provided positional and keyword arguments stored to be interpreted as specs, with the results passed to the underlying function.

>>> spec = Invoke(range).specs('value')
>>> glom({'value': 5}, (spec, list))
[0, 1, 2, 3, 4]

Subsequent positional arguments are appended:

>>> spec = Invoke(range).specs('start').specs('end', 'step')
>>> target = {'start': 2, 'end': 10, 'step': 2}
>>> glom(target, (spec, list))
[2, 4, 6, 8]

Keyword arguments also work as one might expect:

>>> multiply = lambda x, y: x * y
>>> times_3 = Invoke(multiply).constants(y=3).specs(x='value')
>>> glom({'value': 5}, times_3)
15

specs() and other Invoke methods may be called multiple times, just remember that every call returns a new spec.

star(args=None, kwargs=None)[source]

Returns a new Invoke spec, with args and/or kwargs specs set to be “starred” or “star-starred” (respectively)

>>> import os.path
>>> spec = Invoke(os.path.join).star(args='path')
>>> target = {'path': ['path', 'to', 'dir']}
>>> glom(target, spec)
  • args (spec) – A spec to be evaluated and “starred” into the underlying function.

  • kwargs (spec) – A spec to be evaluated and “star-starred” into the underlying function.

One or both of the above arguments should be set.

The star(), like other Invoke methods, may be called multiple times. The args and kwargs will be stacked in the order in which they are provided.

class py2store.utils.glom.Let(**kw)[source]

This specifier type assigns variables to the scope.

>>> target = {'data': {'val': 9}}
>>> spec = (Let(value=T['data']['val']), {'val': S['value']})
>>> glom(target, spec)
{'val': 9}
class py2store.utils.glom.Literal(value)[source]

Literal objects specify literal values in rare cases when part of the spec should not be interpreted as a glommable subspec. Wherever a Literal object is encountered in a spec, it is replaced with its wrapped value in the output.

>>> target = {'a': {'b': 'c'}}
>>> spec = {'a': 'a.b', 'readability': Literal('counts')}
>>> pprint(glom(target, spec))
{'a': 'c', 'readability': 'counts'}

Instead of accessing 'counts' as a key like it did with 'a.b', glom() just unwrapped the literal and included the value.

Literal takes one argument, the literal value that should appear in the glom output.

This could also be achieved with a callable, e.g., lambda x: 'literal_string' in the spec, but using a Literal object adds explicitness, code clarity, and a clean repr().

class py2store.utils.glom.Path(*path_parts)[source]

Path objects specify explicit paths when the default 'a.b.c'-style general access syntax won’t work or isn’t desirable. Use this to wrap ints, datetimes, and other valid keys, as well as strings with dots that shouldn’t be expanded.

>>> target = {'a': {'b': 'c', 'd.e': 'f', 2: 3}}
>>> glom(target, Path('a', 2))
3
>>> glom(target, Path('a', 'd.e'))
'f'

Paths can be used to join together other Path objects, as well as T objects:

>>> Path(T['a'], T['b'])
>>> Path(Path('a', 'b'), Path('c', 'd'))
Path('a', 'b', 'c', 'd')

Paths also support indexing and slicing, with each access returning a new Path object:

>>> path = Path('a', 'b', 1, 2)
>>> path[0]
>>> path[-2:]
Path(1, 2)

return the same path but starting from T

classmethod from_text(text)[source]

Make a Path from .-delimited text:

>>> Path.from_text('a.b.c')
Path('a', 'b', 'c')

Returns a tuple of (operation, value) pairs.

>>> Path(T.a.b, 'c', T['d']).items()
(('.', 'a'), ('.', 'b'), ('P', 'c'), ('[', 'd'))

Returns a tuple of values referenced in this path.

>>> Path(T.a.b, 'c', T['d']).values()
('a', 'b', 'c', 'd')
exception py2store.utils.glom.PathAccessError(exc, path, part_idx)[source]

This GlomError subtype represents a failure to access an attribute as dictated by the spec. The most commonly-seen error when using glom, it maintains a copy of the original exception and produces a readable error message for easy debugging.

If you see this error, you may want to:

  • Check the target data is accurate using Inspect

  • Catch the exception and return a semantically meaningful error message

  • Use glom.Coalesce to specify a default

  • Use the top-level default kwarg on glom()

In any case, be glad you got this error and not the one it was wrapping!

  • exc (Exception) – The error that arose when we tried to access path. Typically an instance of KeyError, AttributeError, IndexError, or TypeError, and sometimes others.

  • path (Path) – The full Path glom was in the middle of accessing when the error occurred.

  • part_idx (int) – The index of the part of the path that caused the error.

>>> target = {'a': {'b': None}}
>>> glom(target, 'a.b.c')  
Traceback (most recent call last):
glom.PathAccessError: could not access 'c', part 2 of Path('a', 'b', 'c'), got error: ...
class py2store.utils.glom.Spec(spec, scope=None)[source]

Spec objects serve three purposes, roughly ordered by utility:

  1. As a form of compiled or “curried” glom call, similar to Python’s built-in re.compile().

  2. A marker that an object represents a spec rather than a literal value, in certain cases where that might be ambiguous.

  3. A way to update the scope within another Spec.

In the second usage, Spec objects are the complement to Literal, wrapping a value and marking that it should be interpreted as a glom spec, rather than a literal value. This is useful in places where it would be interpreted as a value by default. (Such as T[key], Call(func) where key and func are assumed to be literal values and not specs.)

  • spec – The glom spec.

  • scope (dict) – additional values to add to the scope when evaluating this Spec

class py2store.utils.glom.TType[source]

T, short for “target”. A singleton object that enables object-oriented expression of a glom specification.


T is a singleton, and does not need to be constructed.

Basically, think of T as your data’s stunt double. Everything that you do to T will be recorded and executed during the glom() call. Take this example:

>>> spec = T['a']['b']['c']
>>> target = {'a': {'b': {'c': 'd'}}}
>>> glom(target, spec)
'd'

So far, we’ve relied on the 'a.b.c'-style shorthand for access, or used the Path objects, but if you want to explicitly do attribute and key lookups, look no further than T.

But T doesn’t stop with unambiguous access. You can also call methods and perform almost any action you would with a normal object:

>>> spec = ('a', (T['b'].items(), list))  # reviewed below
>>> glom(target, spec)
[('c', 'd')]

A T object can go anywhere in the spec. As seen in the example above, we access 'a', use a T to get 'b' and iterate over its items, turning them into a list.

You can even use T with Call to construct objects:

>>> class ExampleClass(object):
...    def __init__(self, attr):
...        self.attr = attr
>>> target = {'attr': 3.14}
>>> glom(target, Call(ExampleClass, kwargs=T)).attr
3.14

On a further note, while lambda works great in glom specs, and can be very handy at times, T and Call eliminate the need for the vast majority of lambda usage with glom.

Unlike lambda and other functions, T roundtrips beautifully and transparently:

>>> T['a'].b['c']('success')
T['a'].b['c']('success')

T-related access errors raise a PathAccessError during the glom() call.


While T is clearly useful, powerful, and here to stay, its semantics are still being refined. Currently, operations beyond method calls and attribute/item access are considered experimental and should not be relied upon.
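
The “stunt double” idea, recording operations now and replaying them against real data later, can be sketched as follows. Recorder and replay are hypothetical names; the sketch covers only item and attribute access and is not how TType is implemented.

```python
class Recorder:
    """Records item and attribute accesses instead of performing them."""

    def __init__(self, ops=()):
        self._ops = ops  # tuple of (operation, argument) pairs

    def __getitem__(self, key):
        return Recorder(self._ops + (('[', key),))

    def __getattr__(self, name):
        # Only reached for names not found normally; keep private and
        # dunder lookups failing so copy/pickle probes behave sanely.
        if name.startswith('_'):
            raise AttributeError(name)
        return Recorder(self._ops + (('.', name),))


def replay(recorder, target):
    """Apply the recorded operations, in order, to a real target."""
    current = target
    for op, arg in recorder._ops:
        current = current[arg] if op == '[' else getattr(current, arg)
    return current


R = Recorder()
spec = R['a']['b']['c']
print(replay(spec, {'a': {'b': {'c': 'd'}}}))  # d
print(replay(R['x'].real, {'x': 3}))           # 3
```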

class py2store.utils.glom.TargetRegistry(register_default_types=True)[source]

responsible for registration of target types for iteration and attribute walking

get_handler(op, obj, path=None, raise_exc=True)[source]

for an operation and object instance, obj, return the closest-matching handler function, raising UnregisteredTarget if no handler can be found for obj (or False if raise_exc=False)

register_op(op_name, auto_func=None, exact=False)[source]

add operations beyond the builtins (‘get’ and ‘iterate’ at the time of writing).

auto_func is a function that when passed a type, returns a handler associated with op_name if it’s supported, or False if it’s not.

See glom.core.register_op() for the global version used by extensions.
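
One common way to find a “closest-matching” handler is to walk the method resolution order of the object's type and return the first registered entry. The sketch below shows that technique only; HandlerRegistrySketch is a hypothetical class, it supports a single operation, and whether TargetRegistry resolves handlers this way is an assumption.

```python
from collections import OrderedDict
from operator import getitem


class HandlerRegistrySketch:
    """Maps types to handler functions, resolving by closest base class."""

    def __init__(self):
        self._handlers = {}

    def register(self, target_type, handler):
        self._handlers[target_type] = handler

    def get_handler(self, obj, raise_exc=True):
        # The MRO lists the object's own type first, then its bases,
        # so the first hit is the most specific registered type.
        for cls in type(obj).__mro__:
            if cls in self._handlers:
                return self._handlers[cls]
        if raise_exc:
            raise TypeError(f"no handler registered for {type(obj).__name__!r}")
        return False


registry = HandlerRegistrySketch()
registry.register(object, getattr)  # generic fallback: attribute access
registry.register(dict, getitem)    # dicts and dict subclasses: key access

handler = registry.get_handler(OrderedDict(a=1))
print(handler(OrderedDict(a=1), 'a'))  # 1
```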

exception py2store.utils.glom.UnregisteredTarget(op, target_type, type_map, path)[source]

This GlomError subtype is raised when a spec calls for an unsupported action on a target type. For instance, trying to iterate on a non-iterable target:

>>> glom(object(), ['a.b.c'])  
Traceback (most recent call last):
glom.UnregisteredTarget: target type 'object' not registered for 'iterate', expected one of registered types: (...)

It should be noted that this is a pretty uncommon occurrence in production glom usage. See the setup-and-registration section for details on how to avoid this error.

An UnregisteredTarget takes and tracks a few values:

  • op (str) – The name of the operation being performed (‘get’ or ‘iterate’)

  • target_type (type) – The type of the target being processed.

  • type_map (dict) – A mapping of target types that do support this operation

  • path – The path at which the error occurred.

py2store.utils.glom.glom(target, spec, **kwargs)[source]

Access or construct a value from a given target based on the specification declared by spec.

Accessing nested data, aka deep-get:

>>> target = {'a': {'b': 'c'}}
>>> glom(target, 'a.b')
'c'

Here the spec was just a string denoting a path, 'a.b'. As simple as it should be. The next example shows how to use nested data to access many fields at once, and make a new nested structure.

Constructing, or restructuring more-complicated nested data:

>>> target = {'a': {'b': 'c', 'd': 'e'}, 'f': 'g', 'h': [0, 1, 2]}
>>> spec = {'a': 'a.b', 'd': 'a.d', 'h': ('h', [lambda x: x * 2])}
>>> output = glom(target, spec)
>>> pprint(output)
{'a': 'c', 'd': 'e', 'h': [0, 2, 4]}

glom also takes a keyword-argument, default. When set, if a glom operation fails with a GlomError, the default will be returned, very much like dict.get():

>>> glom(target, 'a.xx', default='nada')
'nada'

The skip_exc keyword argument controls which errors should be ignored.

>>> glom({}, lambda x: 100.0 / len(x), default=0.0, skip_exc=ZeroDivisionError)
0.0
  • target (object) – the object on which the glom will operate.

  • spec (object) – Specification of the output object in the form of a dict, list, tuple, string, other glom construct, or any composition of these.

  • default (object) – An optional default to return in the case an exception, specified by skip_exc, is raised.

  • skip_exc (Exception) – An optional exception or tuple of exceptions to ignore and return default (None if omitted). If neither skip_exc nor default is set, glom lets errors propagate.

  • scope (dict) – Additional data that can be accessed via S inside the glom-spec.

It’s a small API with big functionality, and glom’s power is only surpassed by its intuitiveness. Give it a whirl!
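
For comparison, the simplest part of this behaviour, a dotted-path deep-get with an optional default, can be written by hand. deep_get and _MISSING are hypothetical names; the sketch handles only item access on nested containers and does not attempt glom's attribute access, spec composition, or error wrapping.

```python
_MISSING = object()  # hypothetical marker meaning "no default supplied"


def deep_get(target, dotted_path, default=_MISSING):
    """Follow a '.'-delimited path of keys through nested containers."""
    current = target
    for part in dotted_path.split('.'):
        try:
            current = current[part]
        except (KeyError, IndexError, TypeError):
            if default is _MISSING:
                raise  # no default given, so let the lookup error surface
            return default
    return current


target = {'a': {'b': 'c', 'd': 'e'}, 'f': 'g', 'h': [0, 1, 2]}
print(deep_get(target, 'a.b'))                   # c
print(deep_get(target, 'a.xx', default='nada'))  # nada
```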


Similar in nature to callable(), is_iterable returns True if an object is iterable, False if not.

>>> is_iterable([])
True
>>> is_iterable(1)
False

py2store.utils.glom.make_sentinel(name='_MISSING', var_name=None)[source]

Creates and returns a new instance of a new class, suitable for usage as a “sentinel”, a kind of singleton often used to indicate a value is missing when None is a valid input.

  • name (str) – Name of the Sentinel

  • var_name (str) – Set this name to the name of the variable in its respective module to enable pickleability.

>>> make_sentinel(var_name='_MISSING')

The most common use cases here in boltons are as default values for optional function arguments, partly because of its less-confusing appearance in automatically generated documentation. Sentinels also function well as placeholders in queues and linked lists.


By design, additional calls to make_sentinel with the same values will not produce equivalent objects.

>>> make_sentinel('TEST') == make_sentinel('TEST')
False
>>> type(make_sentinel('TEST')) == type(make_sentinel('TEST'))
False
py2store.utils.glom.register(target_type, **kwargs)[source]

Register target_type so glom() will know how to handle instances of that type as targets.

  • target_type (type) – A type expected to appear in a glom() call target

  • get (callable) – A function which takes a target object and a name, acting as a default accessor. Defaults to getattr().

  • iterate (callable) – A function which takes a target object and returns an iterator. Defaults to iter() if target_type appears to be iterable.

  • exact (bool) – Whether or not to match instances of subtypes of target_type.


The module-level register() function affects the module-level glom() function’s behavior. If this global effect is undesirable for your application, or you’re implementing a library, consider instantiating a Glommer instance, and using the register() and Glommer.glom() methods instead.

py2store.utils.glom.register_op(op_name, **kwargs)[source]

For extension authors needing to add operations beyond the builtin ‘get’ and ‘iterate’ to the default scope. See TargetRegistry for more details.








base classes to work with local files

class py2store.persisters.local_files.DirReader(rootdir)[source]

KV Reader whose keys (AND VALUES) are directory full paths of the subdirectories of rootdir.

class py2store.persisters.local_files.DirpathFormatKeys(path_format: str, max_levels: int = inf)[source]
class py2store.persisters.local_files.FileReader(rootdir)[source]

KV Reader whose keys are paths and values are:

  • Another FileReader if a path points to a directory

  • The bytes of the file if the path points to a file

class py2store.persisters.local_files.FilepathFormatKeys(path_format: str, max_levels: int = inf)[source]
exception py2store.persisters.local_files.FolderNotFoundError[source]
class py2store.persisters.local_files.LocalFileRWD(mode='', **open_kwargs)[source]

A class providing get, set and delete functionality using local files as the storage backend.

class py2store.persisters.local_files.LocalFileStreamGetter(**open_kwargs)[source]

A class to get stream objects of local open files. The class can only get keys, and the streams it returns are opened only to read or to write (destructively or by appending).

>>> from py2store.persisters.local_files import LocalFileStreamGetter, PathFormatPersister
>>> from tempfile import mkdtemp
>>> import os
>>> rootdir = mkdtemp()
>>> appendable_stream = LocalFileStreamGetter(mode='a+')
>>> reader = PathFormatPersister(rootdir)
>>> filepath = os.path.join(rootdir, 'tmp.txt')
>>> with appendable_stream[filepath] as fp:
...     fp.write('hello')
>>> print(reader[filepath])
hello
>>> with appendable_stream[filepath] as fp:
...     fp.write(' world')
>>> print(reader[filepath])
hello world
class py2store.persisters.local_files.PathFormatPersister(path_format, max_levels: int = inf, mode='', **open_kwargs)[source]
class py2store.persisters.local_files.PrefixedDirpathsRecursive[source]

Keys collection for local files, where the keys are full filepaths RECURSIVELY under a given root dir _prefix. This mixin adds iteration (__iter__), length (__len__), and containment (__contains__(k)).

class py2store.persisters.local_files.PrefixedFilepaths[source]

Keys collection for local files, where the keys are full filepaths DIRECTLY under a given root dir _prefix. This mixin adds iteration (__iter__), length (__len__), and containment (__contains__(k)).

class py2store.persisters.local_files.PrefixedFilepathsRecursive[source]

Keys collection for local files, where the keys are full filepaths RECURSIVELY under a given root dir _prefix. This mixin adds iteration (__iter__), length (__len__), and containment (__contains__(k)).

py2store.persisters.local_files.ensure_slash_suffix(path: str)[source]

Add a file separator (/ or \) at the end of the path string, if not already present.
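The behavior described above can be sketched as follows. This is an illustration only, not the library's implementation; the function name `ensure_sep_suffix` and its `sep` parameter are assumptions introduced here.

```python
import os


def ensure_sep_suffix(path: str, sep: str = os.path.sep) -> str:
    """Append sep to path unless path already ends with it."""
    return path if path.endswith(sep) else path + sep


added = ensure_sep_suffix('some/folder', sep='/')
unchanged = ensure_sep_suffix('some/folder/', sep='/')
```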





This module contains key-value views of disparate sources.


Layers introspection


functions to pickle objects


Generates a reader and writer using marshal. That is, a pair of parametrized loads and dumps

>>> read, write = mk_marshal_rw_funcs()
>>> d = {'a': 'simple', 'and': {'a': b'more', 'complex': [1, 2.2]}}
>>> serialized_d = write(d)
>>> deserialized_d = read(serialized_d)
>>> assert d == deserialized_d
py2store.serializers.pickled.mk_pickle_rw_funcs(fix_imports=True, protocol=None, pickle_encoding='ASCII', pickle_errors='strict')[source]

Generates a reader and writer using pickle. That is, a pair of parametrized loads and dumps

>>> read, write = mk_pickle_rw_funcs()
>>> d = {'a': 'simple', 'and': {'a': b'more', 'complex': [1, 2.2, dict]}}
>>> serialized_d = write(d)
>>> deserialized_d = read(serialized_d)
>>> assert d == deserialized_d
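A "pair of parametrized loads and dumps" can be built with functools.partial. The sketch below shows one plausible way to do so with the standard pickle module; the name `make_pickle_rw` is hypothetical, and the actual function may be implemented differently.

```python
import pickle
from functools import partial


def make_pickle_rw(fix_imports=True, protocol=None, encoding='ASCII', errors='strict'):
    """Return (read, write): pickle.loads and pickle.dumps with fixed parameters."""
    read = partial(pickle.loads, fix_imports=fix_imports, encoding=encoding, errors=errors)
    write = partial(pickle.dumps, protocol=protocol, fix_imports=fix_imports)
    return read, write


read, write = make_pickle_rw()
d = {'a': 'simple', 'and': {'a': b'more', 'complex': [1, 2.2, dict]}}
round_tripped = read(write(d))
```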



a package of serializers




Tools to add caching layers to stores.




stores that implement various write caching algorithms


The cache timestamps (with system clock) every item on insertion (append) and uses the min timestamp as a key for storage.


modules demoing various uses of py2store


Note: Moved to umpyre (pip install umpyre)

Get stats about packages. Your own, or other’s. Things like…

# >>> import collections
# >>> modules_info_df(collections)
#                       lines  empty_lines  ...  num_of_functions  num_of_classes
# collections.__init__   1273          189  ...                 1               9
#                           3            1  ...                 0              25
# <BLANKLINE>
# [2 rows x 7 columns]
# >>> modules_info_df_stats(
# lines                      1276.000000
# empty_lines                 190.000000
# comment_lines                73.000000
# docs_lines                  133.000000
# function_lines              138.000000
# num_of_functions              1.000000
# num_of_classes               34.000000
# empty_lines_ratio             0.148903
# comment_lines_ratio           0.057210
# function_lines_ratio          0.108150
# mean_lines_per_function     138.000000
# dtype: float64
# >>> stats_of(['urllib', 'json', 'collections'])
#                               urllib         json  collections
# empty_lines_ratio           0.157034     0.136818     0.148903
# comment_lines_ratio         0.074142     0.038432     0.057210
# function_lines_ratio        0.213907     0.449654     0.108150
# mean_lines_per_function    13.463768    41.785714   138.000000
# lines                    4343.000000  1301.000000  1276.000000
# empty_lines               682.000000   178.000000   190.000000
# comment_lines             322.000000    50.000000    73.000000
# docs_lines                425.000000   218.000000   133.000000
# function_lines            929.000000   585.000000   138.000000
# num_of_functions           69.000000    14.000000     1.000000
# num_of_classes             55.000000     3.000000    34.000000


walking through kv stores

class py2store.examples.kv_walking.SrcReader(src, src_to_keys, key_to_obj)[source]

Updates the _keys_cache by calling its {} method

py2store.examples.kv_walking.conjunction(*args, **kwargs)[source]

The conjunction will be equal to func_1(*args, **kwargs) & … & func_n(*args, **kwargs) for all args, kwargs.

py2store.examples.kv_walking.kv_walk(v:, yield_func=<function asis>, walk_filt=<function val_is_mapping>, pkv_to_pv=<function tuple_keypath_and_val>, p=())[source]
  • v

  • yield_func – (pp, k, vv) -> whatever you want the generator to yield

  • walk_filt – (p, k, vv) -> (bool) whether to explore the nested structure v further

  • pkv_to_pv – (p, k, v) -> (pp, vv) where pp is a form of p + k (update of the path with the new node k) and vv is the value that will be used by both walk_filt and yield_func

  • p – The path to v

>>> d = {'a': 1, 'b': {'c': 2, 'd': 3}}
>>> list(kv_walk(d))
[(('a',), 'a', 1), (('b',), 'b', {'c': 2, 'd': 3}), (('b', 'c'), 'c', 2), (('b', 'd'), 'd', 3)]
>>> list(kv_walk(d, lambda p, k, v: '.'.join(p)))
['a', 'b', 'b.c', 'b.d']
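The default behavior shown in the example above (yield every node, descend into nested mappings) can be approximated with a short recursive generator. This is a simplified sketch, not kv_walk itself: the function name `walk` is hypothetical, and the configurable yield_func, walk_filt, and pkv_to_pv hooks are replaced by fixed choices.

```python
from collections.abc import Mapping


def walk(v, path=()):
    """Yield (path, key, value) for every node, descending into nested mappings."""
    for k, vv in v.items():
        pp = path + (k,)  # the path updated with the new node k
        yield pp, k, vv
        if isinstance(vv, Mapping):  # fixed stand-in for the walk filter
            yield from walk(vv, pp)


d = {'a': 1, 'b': {'c': 2, 'd': 3}}
nodes = list(walk(d))
dotted_paths = ['.'.join(p) for p, k, v in walk(d)]
```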

functionalities meant to be configurable

define stores (and functions) so they give you data as you want it, depending on the extension


Transformation/wrapping tools


utils from strings


Get the sets of indices and names used in manual specification of format strings, or None, None if auto spec.


format_string – A format string (i.e. a string with {…} to mark parameter placement and formatting)


None, None if format_string is an automatic specification; set_of_indices_used, set_of_fields_used if it is a manual specification

>>> format_string = '{0} (no 1) {2}, {see} this, {0} is a duplicate (appeared before) and {name} is string-named'
>>> assert args_and_kwargs_indices(format_string) == ({0, 2}, {'name', 'see'})
>>> format_string = 'This is a format string with only automatic field specification: {}, {}, {} etc.'
>>> assert args_and_kwargs_indices(format_string) == (set(), set())
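One way to collect the indices and names is to walk the output of the standard library's string.Formatter.parse. The sketch below is an assumption about how this could be done, not the library's code; the name `indices_and_names` is hypothetical, and compound field names such as `0.attr` or `a[0]` are not handled.

```python
import string


def indices_and_names(format_string):
    """Return (set of integer indices, set of field names) used in format_string."""
    indices, names = set(), set()
    for _, field_name, _, _ in string.Formatter().parse(format_string):
        if field_name:  # skips None (no field) and '' (automatic field)
            if field_name.isdigit():
                indices.add(int(field_name))
            else:
                names.add(field_name)
    return indices, names


manual = indices_and_names(
    '{0} (no 1) {2}, {see} this, {0} is a duplicate (appeared before) and {name} is string-named'
)
automatic = indices_and_names('Only automatic field specification: {}, {}, {} etc.')
```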

Get an auto field version of the format_str


format_str – A format string


A transformed format_str that has no names {inside} {formatting} {braces}.

>>> auto_field_format_str('R/{0}/{one}/{}/{two}/T')

The (quasi-)inverse of string.Formatter.parse.

  • parsed – iterator of (literal_text, field_name, format_spec, conversion) tuples, as yielded by string.Formatter.parse


A format string that would produce such a parsed input.

>>> import string
>>> s =  "ROOT/{}/{0!r}/{1!i:format}/hello{:0.02f}TAIL"
>>> assert compile_str_from_parsed(string.Formatter().parse(s)) == s
>>> # Or, if you want to see more details...
>>> parsed = list(string.Formatter().parse(s))
>>> for p in parsed:
...     print(p)
('ROOT/', '', '', None)
('/', '0', '', 'r')
('/', '1', 'format', 'i')
('/hello', '', '0.02f', None)
('TAIL', None, None, None)
>>> compile_str_from_parsed(parsed)
'ROOT/{}/{0!r}/{1!i:format}/hello{:0.02f}TAIL'
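Rebuilding a format string from parsed tuples amounts to re-wrapping each field in braces and re-attaching its conversion and format spec. The following is a minimal sketch of that idea, with the hypothetical name `compile_from_parsed`; the library's own function may differ in details.

```python
import string


def compile_from_parsed(parsed):
    """Rebuild a format string from (literal_text, field_name, format_spec, conversion) tuples."""
    parts = []
    for literal_text, field_name, format_spec, conversion in parsed:
        # parse() unescapes doubled braces in literals, so escape them again
        parts.append(literal_text.replace('{', '{{').replace('}', '}}'))
        if field_name is not None:
            field = field_name
            if conversion:
                field += '!' + conversion
            if format_spec:
                field += ':' + format_spec
            parts.append('{' + field + '}')
    return ''.join(parts)


s = "ROOT/{}/{0!r}/{1!i:format}/hello{:0.02f}TAIL"
rebuilt = compile_from_parsed(string.Formatter().parse(s))
```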

Get the “parameter” indices/names of the format_string


format_string – A format string (i.e. a string with {…} to mark parameter placement and formatting)


A list of parameter indices used in the format string, in the order they appear, with repetition. Parameter indices could be integers, strings, or None (to denote “automatic field numbering”).

>>> format_string = '{0} (no 1) {2}, and {0} is a duplicate, {} is unnamed and {name} is string-named'
>>> format_params_in_str_format(format_string)
[0, 2, 0, None, 'name']
>>> parsed = parse_str_format("all/{}/is/{2}/position/{except}{this}{0}")
>>> get_explicit_positions(parsed)
{0, 2}

Says if the format_params is from an automatic specification.

See Also: is_manual_format_params and is_hybrid_format_params


Says if the format_string uses an automatic specification.

See Also: is_manual_format_params

>>> is_automatic_format_string('Manual: indices: {1} {2}, named: {named} {fields}')
False
>>> is_automatic_format_string('Auto: only un-indexed and un-named: {} {}...')
True
>>> is_automatic_format_string('Hybrid: at least a {}, and a {0} or a {name}')
False
>>> is_manual_format_string('No formatting is both manual and automatic formatting!')
True


Says if the format_params is from a hybrid of auto and manual. Note: Hybrid specifications are considered non-valid and can’t be formatted with format_string.format(…). Yet, it can be useful for flexibility of expression (but will need to be resolved to be used).

See Also: is_manual_format_params and is_automatic_format_params


Says if the format_params is from a hybrid of auto and manual. Note: Hybrid specifications are considered non-valid and can’t be formatted with format_string.format(…). Yet, it can be useful for flexibility of expression (but will need to be resolved to be used).

>>> is_hybrid_format_string('Manual: indices: {1} {2}, named: {named} {fields}')
>>> is_hybrid_format_string('Auto: only un-indexed and un-named: {} {}...')
>>> is_hybrid_format_string('Hybrid: at least a {}, and a {0} or a {name}')
>>> is_manual_format_string('No formatting is both manual and automatic formatting (so hybrid is both)!')
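The three categories can be told apart by looking at the field names that string.Formatter.parse reports: an empty name means an automatic field, anything else a manual one. The sketch below is an assumption about how such checks could be written; the function names are hypothetical, and treating a string with no fields as non-hybrid is a choice made here, not something the text above confirms.

```python
import string


def field_names(format_string):
    """List the field names in order; '' denotes an automatic field."""
    parsed = string.Formatter().parse(format_string)
    return [name for _, name, _, _ in parsed if name is not None]


def is_automatic(format_string):
    return all(name == '' for name in field_names(format_string))


def is_manual(format_string):
    return all(name != '' for name in field_names(format_string))


def is_hybrid(format_string):
    names = field_names(format_string)
    return any(n == '' for n in names) and any(n != '' for n in names)


hybrid_example = 'Hybrid: at least a {}, and a {0} or a {name}'
hybrid_flags = (is_automatic(hybrid_example), is_manual(hybrid_example), is_hybrid(hybrid_example))
```

A string without any fields satisfies both `is_automatic` and `is_manual` in this sketch, which matches the "both manual and automatic" examples above.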

Says if the format_params is from a manual specification.

See Also: is_automatic_format_params


Says if the format_string uses a manual specification.

See Also: is_automatic_format_string

>>> is_manual_format_string('Manual: indices: {1} {2}, named: {named} {fields}')
True
>>> is_manual_format_string('Auto: only un-indexed and un-named: {} {}...')
False
>>> is_manual_format_string('Hybrid: at least a {}, and a {0} or a {name}')
False
>>> is_manual_format_string('No formatting is both manual and automatic formatting!')
True


Get an auto field version of the format_str


format_str – A format string


A transformed format_str that has no names {inside} {formatting} {braces}.

>>> auto_field_format_str('R/{0}/{one}/{}/{two}/T')

The number of parameters

py2store.key_mappers.str_utils.name_fields_in_format_str(format_str, field_names=None)[source]

Get a manual field version of the format_str

  • format_str – A format string

  • field_names – An iterable that produces enough strings to fill all of format_str fields


A transformed format_str

>>> name_fields_in_format_str('R/{0}/{one}/{}/{two}/T')
>>> # Note here that we use the field name to inject a field format as well
>>> name_fields_in_format_str('R/{foo}/{0}/{}/T', ['42', 'hi:03.0f', 'world'])


Tools to map tuple-structured keys. That is, converting from any of the following kinds of keys:

  • tuples (or list-like)

  • dicts

  • formatted/templated strings

  • dsv (Delimiter-Separated Values)

py2store.key_mappers.tuples.dsv_of_list(d, sep=',')[source]

Converting a list of strings to a dsv (delimiter-separated values) string.

Note that unlike most key mappers, there is no schema imposing size here. If you wish to impose a size validation, do so externally (we suggest using a decorator for that).

  • d – A list of component strings

  • sep – The delimiter text used to separate a string into a list of component strings


The delimiter-separated values (dsv) string for the input tuple

>>> dsv_of_list(['a', 'brown', 'fox'], sep=' ')
'a brown fox'
>>> dsv_of_list(('jumps', 'over'), sep='/')  # for filepaths (and see that tuple inputs work too!)
'jumps/over'
>>> dsv_of_list(['Sat', 'Jan', '1', '1983'], sep=',')  # csv: the usual delimiter-separated values format
'Sat,Jan,1,1983'
>>> dsv_of_list(['First', 'Last'], sep=':::')  # a longer delimiter
'First:::Last'
>>> dsv_of_list(['singleton'], sep='@')  # when the list has only one element
'singleton'
>>> dsv_of_list([], sep='@')  # when the list is empty
''
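The note above suggests imposing size validation externally with a decorator. Here is a minimal sketch of what that could look like; `validate_size` and `path_of_pair` are hypothetical names, and the plain `sep.join` stands in for the documented function.

```python
from functools import wraps


def validate_size(n):
    """Decorator factory: require the first argument to have exactly n components."""
    def decorator(func):
        @wraps(func)
        def wrapper(d, *args, **kwargs):
            if len(d) != n:
                raise ValueError(f'expected {n} components, got {len(d)}')
            return func(d, *args, **kwargs)
        return wrapper
    return decorator


@validate_size(2)
def path_of_pair(d, sep='/'):
    """Join exactly two components with sep (stand-in for a dsv conversion)."""
    return sep.join(d)


joined = path_of_pair(('jumps', 'over'))
```

Calling `path_of_pair(('a', 'b', 'c'))` raises ValueError, since the schema is enforced by the wrapper rather than by the conversion itself.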
py2store.key_mappers.tuples.list_of_dsv(d, sep=',')[source]

Converting a dsv (delimiter-separated values) string to the list of its components.

  • d – A (delimiter-separated values) string

  • sep – The delimiter text used to separate the string into a list of component strings


A list of component strings corresponding to the input delimiter-separated values (dsv) string

>>> list_of_dsv('a brown fox', sep=' ')
['a', 'brown', 'fox']
>>> tuple(list_of_dsv('jumps/over', sep='/'))  # for filepaths
('jumps', 'over')
>>> list_of_dsv('Sat,Jan,1,1983', sep=',')  # csv: the usual delimiter-separated values format
['Sat', 'Jan', '1', '1983']
>>> list_of_dsv('First:::Last', sep=':::')  # a longer delimiter
['First', 'Last']
>>> list_of_dsv('singleton', sep='@')  # when the list has only one element
>>> list_of_dsv('', sep='@')  # when the string is empty

Make a function that transforms a string to an object. This factory makes the inverses of what mk_str_from_obj makes.


constructor – The function (or class) that will be used to make objects from the **kwargs parsed out of the string.


A function factory.


Make a function that transforms objects to strings, using specific attributes of object.


attrs – Attributes that should be read off of the object to make the parameters of the string


A transformation function

>>> from dataclasses import dataclass
>>> @dataclass
... class A:
...     foo: int
...     bar: str
>>> a = A(foo=0, bar='rin')
>>> a
A(foo=0, bar='rin')
>>> str_from_obj = mk_str_of_obj(['foo', 'bar'])
>>> str_from_obj(a, 'ST{foo}/{bar}/G')
py2store.key_mappers.tuples.str_of_tuple(d, str_format)[source]

Convert tuple to str. It’s just str_format.format(*d). Why even write such a function?

(1) To have a consistent interface for key conversions

(2) We want a KeyValidationError to occur here

  • d – tuple of params to str_format

  • str_format – Auto fields format string. If you have manual fields, consider auto_field_format_str to convert.


parametrized string

>>> str_of_tuple(('hello', 'world'), "Well, {} dear {}!")
'Well, hello dear world!'
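The second motivation, raising a dedicated validation error, can be sketched as follows. The exception class below is a hypothetical stand-in defined for this example (the library's KeyValidationError may have a different base class), and `str_of_tuple_sketch` is not the library's implementation.

```python
class KeyValidationError(ValueError):
    """Hypothetical stand-in for the library's key validation error."""


def str_of_tuple_sketch(d, str_format):
    """Format the tuple d into str_format, translating a size mismatch into a key error."""
    try:
        return str_format.format(*d)
    except IndexError as e:  # raised when d has fewer items than the format has fields
        raise KeyValidationError(f'{d!r} does not fit {str_format!r}') from e


greeting = str_of_tuple_sketch(('hello', 'world'), 'Well, {} dear {}!')
```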


Module that forwards to py2store.paths, kept for back-compatibility


This module only forwards to py2store.naming, and is deprecated.


key mapping


Forwards to dol.errors:

Error objects and utils


Data Object Layer for configparser standard lib.


modules for standard libs


a data object layer for zipfile

exception py2store.slib.s_zipfile.EmptyZipError[source]
class py2store.slib.s_zipfile.FileStreamsOfZip(zip_file, prefix='', open_kws=None)[source]

Like FilesOfZip, but object returns are file streams instead. So you use it like this:

```
z = FileStreamsOfZip(rootdir)
with z[relpath] as fp:
    ...  # do stuff with fp, like fp.readlines() or such...
```


class py2store.slib.s_zipfile.FilesOfZip(zip_file, prefix='', open_kws=None)[source]
class py2store.slib.s_zipfile.FlatZipFilesReader(rootdir, subpath='.+\\.zip', pattern_for_field=None, max_levels=0, zip_reader=<class 'py2store.slib.s_zipfile.ZipReader'>, **zip_reader_kwargs)[source]

Read the union of the contents of multiple zip files. A local file reader whose keys are the zip filepaths of the rootdir and values are corresponding ZipReaders.

exception py2store.slib.s_zipfile.OverwriteNotAllowed[source]

alias of py2store.slib.s_zipfile.ZipFilesReader

class py2store.slib.s_zipfile.ZipFileStreamsReader(rootdir, subpath='.+\\.zip', pattern_for_field=None, max_levels=0, *, zip_reader=<class 'py2store.slib.s_zipfile.FileStreamsOfZip'>, **zip_reader_kwargs)

Like ZipFilesReader, but objects returned are file streams instead.

class py2store.slib.s_zipfile.ZipFilesReader(rootdir, subpath='.+\\.zip', pattern_for_field=None, max_levels=0, zip_reader=<class 'py2store.slib.s_zipfile.ZipReader'>, **zip_reader_kwargs)[source]

A local file reader whose keys are the zip filepaths of the rootdir and values are corresponding ZipReaders.

class py2store.slib.s_zipfile.ZipFilesReaderAndBytesWriter(rootdir, subpath='.+\\.zip', pattern_for_field=None, max_levels=0, zip_reader=<class 'py2store.slib.s_zipfile.ZipReader'>, **zip_reader_kwargs)[source]

Like ZipFilesReader, but with the ability to write bytes (assumed to be valid bytes of the zip format) to a key.

class py2store.slib.s_zipfile.ZipReader(zip_file, prefix='', open_kws=None, file_info_filt=None)[source]

A KvReader to read the contents of a zip file. Provides a KV perspective of

ZipReader has two value categories: Directories and Files. Both categories are distinguishable by the keys, through the “ends with slash” convention.

When the key is a file, the value returned is bytes, as usual.

When the key is a directory, the value returned is a ZipReader itself, with all params the same, except for the prefix, which serves to specify the subfolder (that is, prefix acts as a filter).

Note: If you get data zipped by a mac, you might get some junk along with it. Namely, __MACOSX folders and .DS_Store files. I won’t rant about it, since others have. But you might find it useful to remove them from view. One choice is to use py2store.trans.filt_iter to get a filtered view of the zip’s contents. In most cases, this should do the job:

    # applied to store instance or class:
    store = filt_iter(filt=lambda x: not x.startswith('__MACOSX') and '.DS_Store' not in x)(store)

Another option is just to remove these from the zip file once and for all. In unix-like systems:

    zip -d __MACOSX/\*
    zip -d \*/.DS_Store


# >>> s = ZipReader('/path/to/')
# >>> len(s)
# 53432
# >>> list(s)[:3]  # the first 3 elements (well... their keys)
# ['odir/', 'odir/app/', 'odir/app/data/']
# >>> list(s)[-3:]  # the last 3 elements (well... their keys)
# ['odir/app/data/audio/d/1574287049078391/m/Ctor.json',
#  'odir/app/data/audio/d/1574287049078391/m/intensity.json',
#  'odir/app/data/run/status.json']
# >>> # getting a file (note that by default, you get bytes, so need to decode)
# >>> s['odir/app/data/run/status.json'].decode()
# b'{"test_phase_number": 9, "test_phase": "TestActions.IGNORE_TEST", "session_id": 0}'
# >>> # when you ask for the contents for a key that's a directory,
# >>> # you get a ZipReader filtered for that prefix:
# >>> s['odir/app/data/audio/']
# ZipReader('/path/to/', 'odir/app/data/audio/', {}, <function take_everything at 0x1538999e0>)
# >>> # Often, you only want files (not directories)
# >>> # You can filter directories out using the file_info_filt argument
# >>> s = ZipReader('/path/to/', file_info_filt=ZipReader.FILES_ONLY)
# >>> len(s)  # compare to the 53432 above, that contained dirs too
# 53280
# >>> list(s)[:3]  # first 3 keys are all files now
# ['odir/app/data/plc/d/1574304926795633/d/1574305026895702',
#  'odir/app/data/plc/d/1574304926795633/d/1574305276853053',
#  'odir/app/data/plc/d/1574304926795633/d/1574305159343326']
# >>>
# >>> # ZipReader.FILES_ONLY and ZipReader.DIRS_ONLY are just convenience filt functions
# >>> # Really, you can provide any custom one yourself.
# >>> # This filter function should take a ZipInfo object, and return True or False.
# >>> # (
# >>>
# >>> import re
# >>> p = re.compile('audio.*.json$')
# >>> my_filt_func = lambda fileinfo: bool(
# >>> s = ZipReader('/Users/twhalen/Downloads/', file_info_filt=my_filt_func)
# >>> len(s)
# 48
# >>> list(s)[:3]
# ['odir/app/data/audio/d/1574333557263758/m/Ctor.json',
#  'odir/app/data/audio/d/1574333557263758/m/intensity.json',
#  'odir/app/data/audio/d/1574288084739961/m/Ctor.json']
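The conventions described above (directories recognizable by a trailing slash, filters that take a ZipInfo and return a bool) come from the standard zipfile module, so they can be tried without py2store. The sketch below builds a small zip in memory; the entry names and the `files_only` helper are made up for this illustration and are not the library's FILES_ONLY.

```python
import io
import zipfile

# Build a tiny in-memory zip with a directory entry, a file, and some "mac junk"
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, 'w') as zf:
    zf.writestr('odir/', b'')  # directory entry: its name ends with a slash
    zf.writestr('odir/status.json', b'{"ok": true}')
    zf.writestr('__MACOSX/junk', b'')


def files_only(info):
    """Filter taking a zipfile.ZipInfo: keep entries that are not directories."""
    return not info.is_dir()


with zipfile.ZipFile(buffer) as zf:
    all_keys = [info.filename for info in zf.infolist()]
    file_keys = [info.filename for info in zf.infolist() if files_only(info)]
    # Same predicate as the filt_iter example in the note above
    clean_keys = [
        k for k in file_keys if not k.startswith('__MACOSX') and '.DS_Store' not in k
    ]
    status = zf.read('odir/status.json').decode()  # bytes by default, so decode
```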

class py2store.slib.s_zipfile.ZipStore(zip_filepath, compression=8, allow_overwrites=True, pwd=None)[source]

Zip reading and writing. When you want to read zips, there’s the FilesOfZip, ZipReader, or ZipFilesReader we know and love.

Sometimes though, you want to write to zips too. For this, we have ZipStore.

Since ZipStore can write to a zip, its read functionality is not going to assume static data and cache things, as your favorite zip readers did. This, and the acrobatics needed to disguise the weird zipfile into something more… key-value natural, makes for a not so efficient store, out of the box.

I advise using one of the zip readers if all you need to do is read, or subclassing or wrapping ZipStore with caching layers if it is appropriate to you.

py2store.slib.s_zipfile.func_conjunction(func1, func2)[source]

Returns a function that is equivalent to lambda x: func1(x) and func2(x)

py2store.slib.s_zipfile.mk_flatzips_store(dir_of_zips, zip_pair_path_preproc=<built-in function sorted>, mk_store=<class 'py2store.slib.s_zipfile.FlatZipFilesReader'>, **extra_mk_store_kwargs)[source]

A store so that you can work with a folder that has a bunch of zip files, as if they’ve all been extracted in the same folder. Note that zip_pair_path_preproc can be used to control how to resolve key conflicts (i.e. when you get two different zip files that have the same path in their contents). The last path encountered by zip_pair_path_preproc(zip_path_pairs) is the one that will be used, so one should make zip_pair_path_preproc act accordingly.
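The "last one encountered wins" rule can be illustrated with a plain dict. This sketch assumes the pairs are (zip file name, inner path) tuples, which is a guess about their structure; `flatten_last_wins` is a hypothetical helper, not part of the library.

```python
def flatten_last_wins(zip_path_pairs, preproc=sorted):
    """Map each inner path to the zip that provides it; later pairs override earlier ones."""
    flat = {}
    for zip_name, path in preproc(zip_path_pairs):
        flat[path] = zip_name
    return flat


pairs = [('b.zip', 'x.txt'), ('a.zip', 'x.txt'), ('a.zip', 'y.txt')]

# With sorted, ('b.zip', 'x.txt') comes last, so b.zip provides x.txt
default_resolution = flatten_last_wins(pairs)

# Reversing the order makes a.zip win the conflict instead
reversed_resolution = flatten_last_wins(pairs, preproc=lambda p: sorted(p, reverse=True))
```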


Forwards to dol.base:

Base classes for making stores. In the language of the module, a store is a MutableMapping that is configured to work with a specific representation of keys, serialization of objects (python values), and persistence of the serialized data.

That is, stores offer the same interface as a dict, but where the actual implementation of writes, reads, and listing are configurable.

Consider the following example. Your store is meant to store waveforms as wav files on a remote server. Say waveforms are represented in python as a tuple (wf, sr), where wf is a list of numbers and sr is the sample rate (an int). The __setitem__ method will specify how to store bytes on a remote server, but you’ll need to specify how to SERIALIZE (wf, sr) to the bytes that constitute that wav file: _data_of_obj specifies that. You might also want to read those wav files back into a python (wf, sr) tuple. The __getitem__ method will get you those bytes from the server, but the store will need to know how to DESERIALIZE those bytes back into a python object: _obj_of_data specifies that.

Further, say you’re storing these .wav files in /some/folder/on/the/server/, but you don’t want the store to use these as the keys. For one, it’s annoying to type and harder to read. But more importantly, it’s an irrelevant implementation detail that shouldn’t be exposed. The _id_of_key and _key_of_id pair are what allow you to add this key interface layer.

These key converters and object serialization methods default to the identity (i.e. they return the input as is). This means that you don’t have to implement these at all, and can choose to implement these concerns within the storage methods themselves.
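The layering described above can be sketched with a MutableMapping that wraps a backend dict. This is an illustration only, not the dol or py2store base class: `LayeredStore` is a hypothetical name, a dict stands in for the remote server, and JSON stands in for the wav encoding.

```python
import json
from collections.abc import MutableMapping


class LayeredStore(MutableMapping):
    """A dict-like store with a key layer and a serialization layer over a backend."""

    def __init__(self, backend=None, prefix='/some/folder/on/the/server/'):
        self._backend = {} if backend is None else backend  # stand-in for the server
        self._prefix = prefix

    # Key layer: the interface key is relative, the backend id is the full path
    def _id_of_key(self, k):
        return self._prefix + k

    def _key_of_id(self, _id):
        return _id[len(self._prefix):]

    # Value layer: (wf, sr) tuple <-> bytes (JSON here, a wav encoder in practice)
    def _data_of_obj(self, obj):
        return json.dumps(obj).encode()

    def _obj_of_data(self, data):
        wf, sr = json.loads(data.decode())
        return wf, sr

    def __getitem__(self, k):
        return self._obj_of_data(self._backend[self._id_of_key(k)])

    def __setitem__(self, k, v):
        self._backend[self._id_of_key(k)] = self._data_of_obj(v)

    def __delitem__(self, k):
        del self._backend[self._id_of_key(k)]

    def __iter__(self):
        return (self._key_of_id(_id) for _id in self._backend)

    def __len__(self):
        return len(self._backend)


store = LayeredStore()
store['a.wav'] = ([0, 1, -1], 44100)
```

The user sees the key `'a.wav'` and a `(wf, sr)` tuple, while the backend holds bytes under `/some/folder/on/the/server/a.wav`.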





