Generic_Util package#
Subpackages#
Submodules#
Generic_Util.benchmarking module#
Functions covering typical code-timing scenarios, such as a “with” statement context, an n-executions timer, and a convenient function for comparing and summarising n execution times of different implementations of the same function.
- Generic_Util.benchmarking.compare_implementations(fs_with_shared_args: dict[str, Callable], n=200, wait=1, verbose=True, fs_with_own_args: dict[str, tuple[Callable, list, dict]] | None = None, args: list | None = None, kwargs: dict | None = None)#
Benchmark multiple implementations of the same function called n times (each with the same args and kwargs), with a break between functions. Recommended later output view if verbose is False: print(table.to_markdown(index = False)). :param fs_with_own_args: alternative to fs_with_shared_args, args and kwargs arguments: meant for additional functions taking different *args and **kwargs.
- Generic_Util.benchmarking.merge_bench_tables(*tables, verbose=True)#
- Generic_Util.benchmarking.time_context(name: str = None)#
“with” statement context for timing the execution of the enclosed code block, i.e. with time_context(‘Name of code block’): …
- Generic_Util.benchmarking.time_n(f: Callable, n=2, *args, **kwargs)#
Run f (with given arguments) n times and return the execution intervals
Generic_Util.iter module#
Iterable-focussed functions, covering multiple varieties of flattening, iterable combining, grouping and predicate/property-based processing (including topological sorting), element-comparison-based operations, value interspersal, and finally batching.
- Generic_Util.iter.all_combinations(xs: Sequence, min_size=1, max_size=None) list#
Produce all combinations of elements of xs between min_size and max_size subset size (i.e. the powerset elements of certain sizes).
- Generic_Util.iter.batch_iter(n: int, xs: Iterable[_a]) Generator[_a, None, None]#
Batch an iterable in batches of size n (possibly except the last). If len(xs) is knowable use batch_seq instead.
- Generic_Util.iter.batch_iter_by(by: Callable[[_a], float], size: int, xs: Iterable[_a], keep_totals=False) Generator[_a, None, None]#
Batch an iterable by sum of some weight/score/cost of each element, with no batch exceeding size. :param keep_totals: if True, the function returns tuples of (batch_total_value, batch) instead of just batches
- Generic_Util.iter.batch_seq(n: int, xs: Sequence[_a], n_is_number_of_batches=False) Generator[_a, None, None]#
Batch an iterable of knowable length in batches of size n (possibly except the last). :param n_is_number_of_batches: If True, divide into n batches of equal length (possibly except the last) rather than in batches of n elements
- Generic_Util.iter.batch_seq_by_into(by: Callable[[_a], float], k: int, xs: Sequence[_a], keep_totals=False, optimal_but_reordered=False) Generator[_a, None, None]#
Batch an iterable of knowable length into k batches containing elements whose sum of some weight/score/cost (by) is roughly equal. :param keep_totals: if True, the function returns tuples of (batch_total_value, batch) instead of just batches :param optimal_but_reordered: if True, the function produces the optimal (hence not order-preserving) batching
of summable xs into a given number of batches containing roughly equal totals. If False, the process retains order but may return more batches than requested.
- Generic_Util.iter.deep_extract(xss_: Iterable[Iterable], *key_path) Generator#
Given a nested combination of iterables and a path of keys in it, return a deep_flatten-ed list of entries under said path. Note: deep_extract(xss_, *key_path) == deep_flatten(Generic_Util.operator.get_nested(xss_, *key_path))
- Generic_Util.iter.deep_flatten(xss_: Iterable) Generator#
Flatten any nested combination of iterables to its leaves. The flattening ignores keys of Mappings and stops at tuples, strings, bytes or non-Iterables. Same as a recursive chain.from_iterable but handling values of Mappings instead of keys.
- Generic_Util.iter.diff(xs: Iterable[_a], ys: Iterable[_a]) list[_a]#
Difference of iterables. .. rubric:: Notes
not a set difference, so strictly removing as many xs duplicate entries as there are in ys
preserves order in xs
- Generic_Util.iter.eq_elems(xs: Iterable[_a], ys: Iterable[_a]) bool#
Equality of iterables by their elements
- Generic_Util.iter.first(c: Callable[[_a], bool], xs: Iterable[_a], default: _a | None = None) _a#
Return the first value in xs which satisfies condition c.
- Generic_Util.iter.flatten(xss: Iterable[Iterable], as_generator=False) list | Generator#
Standard flattening (returning a list by default).
- Generic_Util.iter.foldq(f: Callable[[_b, _a], _b], g: Callable[[_b, _a, list[_a]], list[_a]], c: Callable[[_a], bool], xs: Sequence[_a], acc: _b) tuple[_b, list[_a]]#
Fold-like higher-order function where xs is traversed by consumption conditional on c, and remaining xs are updated by g (therefore consumption order is not known a priori):
the first/next item to be ingested is the first in the remaining xs to fulfil condition c
at every x ingestion, the item is removed from (a copy of) xs, and all the remaining ones are potentially modified by function g
this function always returns a tuple of (acc, remaining_xs), unlike the stricter foldq_, which raises an exception for leftover xs
Note: fold(f, xs, acc) == foldq(f, lambda acc, x, xs: xs, lambda x: True, xs, acc).
Sequence of suitable names leading to the current one: consumption_fold, condition_update_fold, cu_fold, q_fold, qfold or foldq
- Parameters:
f – ‘Traditional’ fold function :: acc -> x -> acc
g – ‘Update’ function for all remaining xs at every iteration :: acc -> x -> xs -> xs
c – ‘Condition’ function to select the next x (first which satisfies it) :: x -> Bool
xs – Structure to consume
acc – Starting value for the accumulator
- Returns:
(acc, remaining_xs)
- Generic_Util.iter.foldq_(f: Callable[[_b, _a], _b], g: Callable[[_b, _a, list[_a]], list[_a]], c: Callable[[_a], bool], xs: list[_a], acc: _b) _b#
Stricter version of foldq (see its description for details); only returns the accumulator and raises an exception on leftover xs. :raises ValueError on leftover xs
- Generic_Util.iter.group_by(f: Callable[[_a], _b], xs: Iterable[_a]) dict[_b, list[_a]]#
Generalisation of partition to any-output key-function. .. rubric:: Notes
‘Retrieval’ functions from the operator package are typical f values (itemgetter(…), attrgetter(…) or methodcaller(…))
This is NOT Haskell’s groupBy function
- Generic_Util.iter.intersperse(xs: Sequence[_a], ys: list[_a], n: int, prepend=False, append=False) Sequence[_a]#
Intersperse elements of ys every n elements of xs.
- Generic_Util.iter.intersperse_val(xs: Sequence[_a], y: _a, n: int, prepend=False, append=False) Sequence[_a]#
Intersperse y every n elements of xs.
- Generic_Util.iter.partition(p: Callable[[_a], bool], xs: Iterable[_a]) tuple[Iterable[_a], Iterable[_a]]#
Haskell’s partition function, partitioning xs by some boolean predicate p: partition p xs == (filter p xs, filter (not . p) xs).
- Generic_Util.iter.topological_sort(nodes_incoming_edges_tuples: Iterable[tuple[_a, list[_b]]]) list[_a]#
Topological sort, i.e. sort (non-uniquely) DAG nodes by directed path, e.g. sort packages by dependency order.
- Generic_Util.iter.unique(xs: Iterable[_a]) list[_a]#
Order-preserving uniqueness. If the values of xs are hashable, use unique_a instead.
- Generic_Util.iter.unique_a(xs: ndarray[Any, dtype[ScalarType]]) ndarray[Any, dtype[ScalarType]]#
Order-preserving uniqueness for an array (of hashable objects). Much faster than a compiled version of the non-hashing-using algorithm, even including the final sort.
- Generic_Util.iter.unique_by(f: Callable[[_a], Any], xs: Iterable[_a]) list[_a]#
Order-preserving uniqueness by some property f. If the values of xs are hashable, it is recommended to use unique_a.
- Generic_Util.iter.unzip(list_of_ntuples: Iterable[Iterable], as_generator=False) list[list] | Generator[list, None, None]#
Standard unzip (i.e. zip(*…)) but with concretised outputs (as lists).
- Generic_Util.iter.update_dict_with(d0: dict, d1: dict, f: Callable[[_a, _b], _a | _b]) dict[Any, Union[_a, _b]]#
Update a dictionary’s entries with those of another using a given function, e.g. appending (operator.add is ideal for this). NOTE: This modifies d0, so a prior copy or deepcopy is advisable.
- Generic_Util.iter.zip_maps(*maps: list[Mapping[_a, _b]]) Generator[tuple[_a, tuple], None, None]#
Generic_Util.misc module#
Functions with less generic purpose than in the above; currently mostly to do with min/max-based operations.
- Generic_Util.misc.interval_overlap(ab: tuple[float, float], cd: tuple[float, float]) float#
Compute the overlap of two intervals (expressed as tuples of start and end values)
- Generic_Util.misc.min_max(xs: Sequence[_a]) tuple[_a, _a]#
Mathematically most efficient joint identification of min and max (minimum comparisons = 3n/2 - 2). .. note:
- This function is numba-compilable, e.g. as `njit(nTup(f8,f8)(f8[::1]))(min_max)` (see `Generic_Util.numba.types` for `nTup` shorthand), - If using numpy arrays, min and max are cached for O(1) lookup, and one would imagine this is the used algorithm
Generic_Util.operator module#
Functions regarding item retrieval, and syntactic sugar for patterns of function application.
- Generic_Util.operator.fst(ab: tuple[_a, _b]) _a#
Same as operator.itemgetter(0), but intended (and type-annotated) specifically for 2-tuple use
- Generic_Util.operator.get_nested(xss_: Iterable[Iterable], *key_path) Generator#
Follow a path of keys for a nested combination of iterables
- Generic_Util.operator.on(f: Callable, xs: Iterable[_a], g: Callable[[_a, ...], _b], *args, **kwargs)#
Transform xs by element-wise application of g and call f with them as its arguments. E.g. on(operator.gt, (a, b), len) .. rubric:: Notes
- Generic_Util.operator.on_a(f: Callable, xs: Iterable, a: str)#
Extract attribute a from xs elements and call f with them as its arguments. E.g. on_a(operator.eq, [a, b], ‘__class__’)
- Generic_Util.operator.on_m(f: Callable, xs: Iterable, m: str, *args, **kwargs)#
Call method m on xs elements and call f with their results as its arguments. E.g. on_m(operator.gt, [a, b], ‘count’, ‘hello’) .. rubric:: Notes
- Generic_Util.operator.snd(ab: tuple[_a, _b]) _b#
Same as operator.itemgetter(1), but intended (and type-annotated) specifically for 2-tuple use
Module contents#
Copyright (c) 2023 Thomas Fletcher. All rights reserved.
Generic-Util: Convenient functions not found in Python’s Standard Library (regarding benchmarking, iterables, functional tools, operators and numba compilation)