dol.naming
This module is about generating, validating, and operating on (parametrized) fields (i.e. strings, e.g. paths).
- class dol.naming.BigDocTest[source]
# TODO: Fix this test (maybe test assertions aren't correct)
# This happened when we changed some re.compile to safe_compile
# >>>
# >>> e_name = BigDocTest.mk_e_naming()
# >>> u_name = BigDocTest.mk_u_naming()
# >>> e_sref = 's3://bucket-GROUP/example/files/USER/SUBUSER/2017-01-24/1485272231982_1485261448469'
# >>> u_sref = "s3://uploads/GROUP/upload/files/USER/2017-01-24/SUBUSER/a_file.wav"
# >>> u_name_2 = "s3://uploads/ANOTHER_GROUP/upload/files/ANOTHER_USER/2017-01-24/SUBUSER/a_file.wav"
# >>>
# >>> ####### is_valid(self, name): ######
# >>> e_name.is_valid(e_sref)
# True
# >>> e_name.is_valid(u_sref)
# False
# >>> u_name.is_valid(u_sref)
# True
# >>>
# >>> ####### is_valid_prefix(self, name): ######
# >>> e_name.is_valid_prefix('s3://bucket-')
# True
# >>> e_name.is_valid_prefix('s3://bucket-GROUP')
# False
# >>> e_name.is_valid_prefix('s3://bucket-GROUP/example/')
# False
# >>> e_name.is_valid_prefix('s3://bucket-GROUP/example/files')
# False
# >>> e_name.is_valid_prefix('s3://bucket-GROUP/example/files/')
# True
# >>> e_name.is_valid_prefix('s3://bucket-GROUP/example/files/USER/SUBUSER/2017-01-24/')
# True
# >>> e_name.is_valid_prefix('s3://bucket-GROUP/example/files/USER/SUBUSER/2017-01-24/0_0')
# True
# >>>
# >>> ####### info_dict(self, name): ######
# >>> e_name.info_dict(e_sref)  # see that utc_ms args were cast to ints
# {'group': 'GROUP', 'user': 'USER', 'subuser': 'SUBUSER', 'day': '2017-01-24', 's_ums': 1485272231982, 'e_ums': 1485261448469}
# >>> u_name.info_dict(u_sref)  # returns None (because self was made for example!)
# {'group': 'GROUP', 'user': 'USER', 'day': '2017-01-24', 'subuser': 'SUBUSER', 'filename': 'a_file.wav'}
# >>> # but with a u_name, it will work
# >>> u_name.info_dict(u_sref)
# {'group': 'GROUP', 'user': 'USER', 'day': '2017-01-24', 'subuser': 'SUBUSER', 'filename': 'a_file.wav'}
# >>>
# >>> ####### extract(self, item, name): ######
# >>> e_name.extract('group', e_sref)
# 'GROUP'
# >>> e_name.extract('user', e_sref)
# 'USER'
# >>> u_name.extract('group', u_name_2)
# 'ANOTHER_GROUP'
# >>> u_name.extract('user', u_name_2)
# 'ANOTHER_USER'
# >>>
#
# >>> ####### mk_prefix(self, *args, **kwargs): ######
# >>> e_name.mk_prefix()
# 's3://bucket-'
# >>> e_name.mk_prefix(group='GROUP')
# 's3://bucket-GROUP/example/files/'
# >>> e_name.mk_prefix(group='GROUP', user='USER')
# 's3://bucket-GROUP/example/files/USER/'
# >>> e_name.mk_prefix(group='GROUP', user='USER', subuser='SUBUSER')
# 's3://bucket-GROUP/example/files/USER/SUBUSER/'
# >>> e_name.mk_prefix(group='GROUP', user='USER', subuser='SUBUSER', day='0000-00-00')
# 's3://bucket-GROUP/example/files/USER/SUBUSER/0000-00-00/'
# >>> e_name.mk_prefix(group='GROUP', user='USER', subuser='SUBUSER', day='0000-00-00',
# ... s_ums=1485272231982)
# 's3://bucket-GROUP/example/files/USER/SUBUSER/0000-00-00/1485272231982_'
# >>> e_name.mk_prefix(group='GROUP', user='USER', subuser='SUBUSER', day='0000-00-00',
# ... s_ums=1485272231982, e_ums=1485261448469)
# 's3://bucket-GROUP/example/files/USER/SUBUSER/0000-00-00/1485272231982_1485261448469'
# >>>
# >>> u_name.mk_prefix()
# 's3://uploads/'
# >>> u_name.mk_prefix(group='GROUP')
# 's3://uploads/GROUP/upload/files/'
# >>> u_name.mk_prefix(group='GROUP', user='USER')
# 's3://uploads/GROUP/upload/files/USER/'
# >>> u_name.mk_prefix(group='GROUP', user='USER', day='DAY')
# 's3://uploads/GROUP/upload/files/USER/DAY/'
# >>> u_name.mk_prefix(group='GROUP', user='USER', day='DAY')
# 's3://uploads/GROUP/upload/files/USER/DAY/'
# >>> u_name.mk_prefix(group='GROUP', user='USER', day='DAY', subuser='SUBUSER')
# 's3://uploads/GROUP/upload/files/USER/DAY/SUBUSER/'
# >>>
# >>> ####### mk(self, *args, **kwargs): ######
# >>> e_name.mk(group='GROUP', user='USER', subuser='SUBUSER', day='0000-00-00',
# ... s_ums=1485272231982, e_ums=1485261448469)
# 's3://bucket-GROUP/example/files/USER/SUBUSER/0000-00-00/1485272231982_1485261448469'
# >>> e_name.mk(group='GROUP', user='USER', subuser='SUBUSER', day='from_s_ums',
# ... s_ums=1485272231982, e_ums=1485261448469)
# 's3://bucket-GROUP/example/files/USER/SUBUSER/2017-01-24/1485272231982_1485261448469'
# >>>
# >>> ####### replace_name_elements(self, *args, **kwargs): ######
# >>> name = 's3://bucket-redrum/example/files/oopsy@domain.com/ozeip/2008-11-04/1225779243969_1225779246969'
# >>> e_name.replace_name_elements(name, user='NEW_USER', group='NEW_GROUP')
# 's3://bucket-NEW_GROUP/example/files/NEW_USER/ozeip/2008-11-04/1225779243969_1225779246969'
- class dol.naming.KeyMaps(key_of_id, id_of_key)
- id_of_key
Alias for field number 1
- key_of_id
Alias for field number 0
- dol.naming.LinearNaming
alias of StrTupleDictWithPrefix
- class dol.naming.PartialFormatter[source]
A string formatter that won't complain if the fields are only partially formatted. But note that you will lose the spec part of your template for any field you leave unformatted (e.g. in {foo:1.2f}, you'll lose the 1.2f if foo is not given, whereas a plain {foo} will remain as is).
>>> partial_formatter = PartialFormatter()
>>> str_template = 'foo:{foo} bar={bar} a={a} b={b:0.02f} c={c}'
>>> partial_formatter.format(str_template, bar="BAR", b=34)
'foo:{foo} bar=BAR a={a} b=34.00 c={c}'
Note: If you only need a formatting function (not the transformed formatting string), a simpler solution may be:
    import functools
    format_str = functools.partial(str_template.format, bar="BAR", b=34)
See https://stackoverflow.com/questions/11283961/partial-string-formatting for more options and discussions.
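The two ideas above can be tried with the standard library alone. The following is a minimal sketch, not the implementation of PartialFormatter: the KeepMissing helper is a hypothetical name introduced here for illustration, and it only handles fields without a format spec.

```python
import functools


class KeepMissing(dict):
    """Hypothetical helper: render a missing field back as '{field}'."""

    def __missing__(self, key):
        return '{' + key + '}'


# Only spec-free fields are left unfilled here: a missing field that carries a
# format spec (such as {b:0.02f}) would make this approach raise ValueError,
# because the spec would be applied to the placeholder string.
template = 'foo:{foo} bar={bar} a={a}'
partially_formatted = template.format_map(KeepMissing(bar='BAR'))

# The functools.partial route: fix some fields now, supply the rest later.
full_template = 'foo:{foo} bar={bar} b={b:0.02f}'
format_str = functools.partial(full_template.format, bar='BAR', b=34)
fully_formatted = format_str(foo='FOO')
```

The first route returns a new template string that can be formatted again later; the second returns a callable and never exposes the intermediate string.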
- class dol.naming.StrTupleDictWithPrefix(template: str | tuple | list, format_dict=None, process_kwargs=None, process_info_dict=None, named_tuple_type_name='NamedTuple', sep: str = '/')[source]
Converting from and to strings, tuples, and dicts, but with partial “prefix” specs allowed.
- Parameters:
template – The string format template
format_dict – A {field_name: field_value_format_regex, …} dict
process_kwargs – A function taking the field=value pairs and producing a dict of processed {field: value,…} pairs (where both fields and values could have been processed). This is useful when we need to process (format, default, etc.) fields, or their values, according to the other fields or values in the collection. A specification of {field: function_to_process_this_value,…} wouldn't allow the full powers we are allowing here.
process_info_dict – A sort of converse of format_dict. This is a {field_name: field_conversion_func, …} dict that is used to convert info_dict values before returning them.
name_separator – Used
>>> ln = StrTupleDictWithPrefix('/home/{user}/fav/{num}.txt',
...     format_dict={'user': '[^/]+', 'num': r'\d+'},
...     process_info_dict={'num': int},
...     sep='/'
... )
>>> ln.mk('USER', num=123)  # making a string (with args or kwargs)
'/home/USER/fav/123.txt'
>>> ####### prefix methods #######
>>> ln.is_valid_prefix('/home/USER/fav/')
True
>>> ln.is_valid_prefix('/home/USER/fav/12')  # False because too long
False
>>> ln.is_valid_prefix('/home/USER/fav')  # False because too short
False
>>> ln.is_valid_prefix('/home/')  # True because just right
True
>>> ln.is_valid_prefix('/home/USER/fav/123.txt')  # full path, so output same as is_valid() method
True
>>>
>>> ln.mk_prefix('ME')
'/home/ME/fav/'
>>> ln.mk_prefix(user='YOU', num=456)  # full specification, so output same as mk() method
'/home/YOU/fav/456.txt'
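To make the roles of template, format_dict and process_info_dict concrete, here is a rough standard-library sketch of the string-to-dict round trip. It is an illustration under assumptions, not the class's actual code: the regex is written by hand, and the mk and info_dict functions below are stand-ins that only mirror the method names used in the examples.

```python
import re

template = '/home/{user}/fav/{num}.txt'
# Hand-written equivalent of the template, with one named group per field.
pattern = re.compile(r'/home/(?P<user>[^/]+)/fav/(?P<num>\d+)\.txt')
# Per-field converters applied to the extracted (string) values.
process_info_dict = {'num': int}


def mk(**fields):
    """Build a full name from field values."""
    return template.format(**fields)


def info_dict(name):
    """Extract field values from a name, or return None if it does not match."""
    match = pattern.fullmatch(name)
    if match is None:
        return None
    return {
        field: process_info_dict.get(field, lambda value: value)(value)
        for field, value in match.groupdict().items()
    }


made = mk(user='USER', num=123)
parsed = info_dict(made)
```

Note how the converter turns the captured '123' back into the integer 123, which is the job process_info_dict is described as doing.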
- dol.naming.dict_to_namedtuple(d, namedtuple_obj=None)[source]
>>> from collections import namedtuple
>>> NT = namedtuple('MyTuple', ('foo', 'hello'))
>>> nt = NT(1, 42)
>>> nt
MyTuple(foo=1, hello=42)
>>> d = namedtuple_to_dict(nt)
>>> d
{'foo': 1, 'hello': 42}
>>> dict_to_namedtuple(d)
NamedTupleFromDict(foo=1, hello=42)
>>> dict_to_namedtuple(d, nt)
MyTuple(foo=1, hello=42)
- dol.naming.get_fields_from_template(template)[source]
Get the list of {item} fields of a template string.
- Parameters:
template – A "template" string (a string with {item} items, the kind used to mark tokens for str.format)
- Returns:
A list of the token items of the string, in the order they appear
>>> get_fields_from_template('this{is}an{example}of{a}template')
['is', 'example', 'a']
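For readers who want to see how such field extraction can work, the standard library's string.Formatter.parse yields the pieces of a format string. The sketch below is one possible approach and is not claimed to be how get_fields_from_template is implemented; fields_of is a hypothetical name.

```python
import string


def fields_of(template):
    """Return the field names of a str.format template, in order of appearance."""
    # parse() yields (literal_text, field_name, format_spec, conversion) tuples;
    # field_name is None for a trailing literal with no field after it.
    return [
        field_name
        for _, field_name, _, _ in string.Formatter().parse(template)
        if field_name
    ]


fields = fields_of('this{is}an{example}of{a}template')
```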
- dol.naming.mk_kwargs_trans(**trans_func_for_key)[source]
Make a dict transformer from functions that depend solely on keys (of the dict to be transformed). Used to easily make process_kwargs and process_info_dict arguments for LinearNaming.
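As an illustration of the idea, a key-driven dict transformer could look like the sketch below. This is an assumption-laden stand-in: make_key_transformer and the signature of the function it returns are hypothetical, and the real mk_kwargs_trans may differ in how it accepts its input and treats keys without a function.

```python
def make_key_transformer(**trans_func_for_key):
    """Return a function that applies a per-key function to matching values."""

    def transform(**kwargs):
        # Keys without a registered function are passed through unchanged
        # (an assumption made for this sketch).
        return {
            key: trans_func_for_key[key](val) if key in trans_func_for_key else val
            for key, val in kwargs.items()
        }

    return transform


process = make_key_transformer(num=int, user=str.upper)
result = process(user='me', num='42', other='x')
```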
- dol.naming.mk_pattern_from_template_and_format_dict(template, format_dict=None, sep='/')[source]
Make a compiled regex to match template.
- Parameters:
template – A format string
format_dict – A dict whose keys are template fields and values are regex strings to capture them
- Returns:
A compiled regex
>>> import os
>>> import re
>>> p = mk_pattern_from_template_and_format_dict('{here}/and/{there}')
>>> if os.name == 'nt':  # for windows
...     assert p == re.compile('(?P<here>[^\\\\]+)/and/(?P<there>[^\\\\]+)')
... else:
...     assert p == re.compile('(?P<here>[^/]+)/and/(?P<there>[^/]+)')
>>> p = mk_pattern_from_template_and_format_dict('{here}/and/{there}', {'there': r'\d+'})
>>> if os.name == 'nt':  # for windows
...     assert p == re.compile(r'(?P<here>[^\\\\]+)/and/(?P<there>\d+)')
... else:
...     assert p == re.compile(r'(?P<here>[^/]+)/and/(?P<there>\d+)')
>>> type(p)
<class 're.Pattern'>
>>> p.match('HERE/and/1234').groupdict()
{'here': 'HERE', 'there': '1234'}
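The template-to-regex step can be sketched with the standard library as follows. This is a simplified illustration, not the library's code: pattern_from_template is a hypothetical name, literal text is escaped with re.escape, and any field missing from format_dict is assumed to default to "one or more characters that are not the separator".

```python
import re
import string


def pattern_from_template(template, format_dict=None, sep='/'):
    """Compile a regex with one named group per template field."""
    format_dict = format_dict or {}
    default_field_regex = f'[^{re.escape(sep)}]+'
    parts = []
    for literal, field, _, _ in string.Formatter().parse(template):
        parts.append(re.escape(literal))  # literal text must match exactly
        if field:
            field_regex = format_dict.get(field, default_field_regex)
            parts.append(f'(?P<{field}>{field_regex})')
    return re.compile(''.join(parts))


p = pattern_from_template('{here}/and/{there}', {'there': r'\d+'})
groups = p.match('HERE/and/1234').groupdict()
```

Named groups are what make the later conversion to a dict (or namedtuple) straightforward, since groupdict() already returns the fields by name.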
- dol.naming.mk_store_from_path_format_store_cls(store=None, *, subpath='', store_cls_kwargs=None, key_type=<function namedtuple>, keymap=<class 'dol.naming.StrTupleDict'>, keymap_kwargs=None, name=None, __module__=None, __name__=None, __qualname__=None, __doc__=None, __annotations__=None, __defaults__=None, __kwdefaults__=None)[source]
Wrap a store (instance or class) that uses string keys to make it into a store that uses a specific key format.
- Parameters:
store – The instance or class to wrap
subpath – The subpath (defining the subset of the data pointed at by the URI)
store_cls_kwargs – If store is a class, the kwargs that you would have given the store_cls to make itself
key_type – The key type you want to interface with: dict, tuple, namedtuple, str or ‘dict’, ‘tuple’, ‘namedtuple’, ‘str’
keymap – The keymap instance or class you want to use to map keys
keymap_kwargs – If keymap is a cls, the kwargs to give it (besides the subpath)
name – The name to give the class the function will make here
Returns: An instance of a wrapped class
Example:
```
# Get a (session, bt) indexed LocalJsonStore
s = mk_store_from_path_format_store_cls(LocalJsonStore,
    os.path.join(root_dir, 'd'),
    subpath='{session}/d/{bt}',
    keymap_kwargs=dict(process_info_dict={'session': int, 'bt': int}))
```
- dol.naming.mk_tupled_store_from_path_format_store_cls(store=None, *, subpath='', store_cls_kwargs=None, key_type=<function namedtuple>, keymap=<class 'dol.naming.StrTupleDict'>, keymap_kwargs=None, name=None, __module__=None, __name__=None, __qualname__=None, __doc__=None, __annotations__=None, __defaults__=None, __kwdefaults__=None)
Wrap a store (instance or class) that uses string keys to make it into a store that uses a specific key format.
- Parameters:
store – The instance or class to wrap
subpath – The subpath (defining the subset of the data pointed at by the URI)
store_cls_kwargs – If store is a class, the kwargs that you would have given the store_cls to make itself
key_type – The key type you want to interface with: dict, tuple, namedtuple, str or ‘dict’, ‘tuple’, ‘namedtuple’, ‘str’
keymap – The keymap instance or class you want to use to map keys
keymap_kwargs – If keymap is a cls, the kwargs to give it (besides the subpath)
name – The name to give the class the function will make here
Returns: An instance of a wrapped class
Example:
```
# Get a (session, bt) indexed LocalJsonStore
s = mk_store_from_path_format_store_cls(LocalJsonStore,
    os.path.join(root_dir, 'd'),
    subpath='{session}/d/{bt}',
    keymap_kwargs=dict(process_info_dict={'session': int, 'bt': int}))
```
- dol.naming.namedtuple_to_dict(nt)[source]
>>> from collections import namedtuple
>>> NT = namedtuple('MyTuple', ('foo', 'hello'))
>>> nt = NT(1, 42)
>>> nt
MyTuple(foo=1, hello=42)
>>> d = namedtuple_to_dict(nt)
>>> d
{'foo': 1, 'hello': 42}
- dol.naming.update_fields_of_namedtuple(nt: tuple, *, name_of_output_type=None, remove_fields=(), **kwargs)[source]
Replace fields of namedtuple
>>> from collections import namedtuple
>>> NT = namedtuple('NT', ('a', 'b', 'c'))
>>> nt = NT(1,2,3)
>>> nt
NT(a=1, b=2, c=3)
>>> update_fields_of_namedtuple(nt, c=3000)  # replacing a single field
NT(a=1, b=2, c=3000)
>>> update_fields_of_namedtuple(nt, c=3000, a=1000)  # replacing two fields
NT(a=1000, b=2, c=3000)
>>> update_fields_of_namedtuple(nt, a=1000, c=3000)  # see that the original order doesn't change
NT(a=1000, b=2, c=3000)
>>> update_fields_of_namedtuple(nt, b=2000, d='hello')  # replacing one field and adding a new one
UpdatedNT(a=1, b=2000, c=3, d='hello')
>>> # Now let's try controlling the name of the output type, remove fields, and add new ones
>>> update_fields_of_namedtuple(nt, name_of_output_type='NewGuy', remove_fields=('a', 'c'), hello='world')
NewGuy(b=2, hello='world')
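For comparison, the standard library's namedtuple already covers part of this ground: _replace swaps values of existing fields and _asdict converts to a dict, but adding or removing fields requires building a new type. The sketch below shows those stdlib building blocks only; it is not a description of how update_fields_of_namedtuple works internally.

```python
from collections import namedtuple

NT = namedtuple('NT', ('a', 'b', 'c'))
nt = NT(1, 2, 3)

# Replacing existing fields keeps the same type and field order.
replaced = nt._replace(c=3000, a=1000)

# Adding a field needs a new namedtuple type with the extended field list.
Extended = namedtuple('UpdatedNT', NT._fields + ('d',))
extended = Extended(*nt, d='hello')

# Conversion to a plain mapping of field name to value.
as_dict = nt._asdict()
```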
- dol.naming.validate_kwargs(kwargs_to_validate, validation_dict, validation_funs=None, all_kwargs_should_be_in_validation_dict=False, ignore_misunderstood_validation_instructions=False)[source]
Utility to validate a dict. Its main use is to validate function arguments (expressing the validation checks in validation_dict) by doing validate_kwargs(locals()), usually at the beginning of the function (to avoid having more accumulated variables than we need in locals()).
- Parameters:
kwargs_to_validate – As the name implies…
validation_dict – A dict specifying what to validate. Keys are usually names of variables (when feeding locals()) and values are dicts, themselves specifying check:check_val pairs where check is a string that points to a function (see validation_funs argument) and check_val is an object that the kwargs_to_validate value will be checked against.
validation_funs – A dict of check:check_function(val, check_val) where check_function is a function returning True if val is valid (with respect to check_val).
all_kwargs_should_be_in_validation_dict – If True, will raise an error if kwargs_to_validate contains keys that are not in validation_dict.
ignore_misunderstood_validation_instructions – If True, will raise an error if validation_dict contains a key that is not in validation_funs (safer, since if you mistype a key in validation_dict, the function will tell you so!)
- Returns:
True if all the validations passed.
>>> validation_dict = {
...     'system': {
...         'be in': {'darwin', 'linux'}
...     },
...     'fv_version': {
...         'be a': int,
...         'be at least': 5
...     }
... }
>>> validate_kwargs({'system': 'darwin'}, validation_dict)
True
>>> try:
...     validate_kwargs({'system': 'windows'}, validation_dict)
... except AssertionError as e:
...     assert str(e).startswith('system must be in')  # omitting the set because inconsistent order
>>> try:
...     validate_kwargs({'fv_version': 9.9}, validation_dict)
... except AssertionError as e:
...     print(e)
fv_version must be a <class 'int'>
>>> try:
...     validate_kwargs({'fv_version': 4}, validation_dict)
... except AssertionError as e:
...     print(e)
fv_version must be at least 5
>>> validate_kwargs({'fv_version': 6}, validation_dict)
True
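The check:check_function(val, check_val) mechanism described above can be sketched as a small dispatch table. The code below is a simplified stand-in written for illustration: simple_validate and CHECK_FUNS are hypothetical names, only three checks are defined, and the optional flags of validate_kwargs are not modeled.

```python
# Each check name maps to a function (val, check_val) -> bool.
CHECK_FUNS = {
    'be in': lambda val, check_val: val in check_val,
    'be a': isinstance,
    'be at least': lambda val, check_val: val >= check_val,
}


def simple_validate(kwargs_to_validate, validation_dict):
    """Raise AssertionError on the first failed check, else return True."""
    for name, val in kwargs_to_validate.items():
        for check, check_val in validation_dict.get(name, {}).items():
            if not CHECK_FUNS[check](val, check_val):
                raise AssertionError(f'{name} must {check} {check_val}')
    return True


validation_dict = {
    'system': {'be in': {'darwin', 'linux'}},
    'fv_version': {'be a': int, 'be at least': 5},
}

ok = simple_validate({'system': 'darwin', 'fv_version': 6}, validation_dict)

try:
    simple_validate({'fv_version': 4}, validation_dict)
    error_message = None
except AssertionError as e:
    error_message = str(e)
```

Because the check name is spliced into the message, readable keys such as 'be at least' double as the error text.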