runscripts.debug.compare_pickles

Compare two .cPickle files or all the .cPickle files in a pair of directories. Show the differences or optionally just a count of difference lines.

Usage (PATH is a path like ‘out/manual/intermediates’):

runscripts/debug/compare_pickles.py PATH1 PATH2

class runscripts.debug.compare_pickles.Repr(repr_)[source]

Bases: object

A Repr displays the given repr() string without quotes and compares unequal (!=) to any other value.
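
The behavior described above can be illustrated with a minimal sketch. The class name ReprSketch is hypothetical and the body is an assumption about how such a wrapper might work, not the module's actual code.

```python
class ReprSketch:
    """Hypothetical stand-in for Repr; an illustration, not the real class."""

    def __init__(self, repr_):
        self.repr_ = repr_

    def __repr__(self):
        # Return the text as-is, so it prints without surrounding quotes.
        return self.repr_

    def __eq__(self, other):
        # Never equal to anything, so a diff always reports it.
        return False

    def __ne__(self, other):
        return True


summary = ReprSketch("<array of 10000 floats>")
print(repr(summary))       # <array of 10000 floats>
print(summary != summary)  # True
```

Such a wrapper is handy as a placeholder in diff output: it reads like the summarized value but can never be mistaken for a match.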

runscripts.debug.compare_pickles._are_instances_of(a, b, a_type)[source]

Return True if a and b are both instances of the given type (or tuple of types).

runscripts.debug.compare_pickles.all_vars(obj)[source]

Return a dict of all the object’s instance variables, whether stored in ordinary fields or in compact slots. This expands on the built-in function vars(). If the object implements the pickling method __getstate__, that method is called instead to get the object’s defining state.
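
A rough sketch of the idea follows. The function name all_vars_sketch and the example classes are hypothetical, and the handling of __getstate__ (ignoring the default one that newer Python versions define on object) is an assumption, not necessarily what the module does.

```python
def all_vars_sketch(obj):
    """Collect instance variables from __dict__ and from __slots__."""
    # Prefer a class-defined __getstate__, skipping the default that
    # Python 3.11+ provides on object itself.
    custom = getattr(type(obj), "__getstate__", None)
    default = getattr(object, "__getstate__", None)
    if custom is not None and custom is not default:
        return obj.__getstate__()

    attrs = dict(getattr(obj, "__dict__", {}))
    for cls in type(obj).__mro__:
        slots = getattr(cls, "__slots__", ())
        if isinstance(slots, str):
            slots = (slots,)
        for name in slots:
            if name in ("__dict__", "__weakref__"):
                continue
            if hasattr(obj, name):  # an unset slot raises AttributeError
                attrs[name] = getattr(obj, name)
    return attrs


class SlotPoint:
    __slots__ = ("x", "y")

    def __init__(self):
        self.x = 1
        self.y = 2


class WithState:
    def __init__(self):
        self.cache = "scratch"

    def __getstate__(self):
        return {"essential": 42}


print(all_vars_sketch(SlotPoint()))  # {'x': 1, 'y': 2}
print(all_vars_sketch(WithState()))  # {'essential': 42}
```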

runscripts.debug.compare_pickles.compare_floats(f1, f2)[source]

Compare two floats, allowing some tolerance, NaN, and Inf values. This considers all types of NaN to match. Return 0.0 (which is falsey) if they match, else (f1, f2).
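
The return convention can be sketched as below. This is only an approximation: compare_floats_sketch is a hypothetical name, and it uses a relative tolerance via math.isclose as a stand-in for whatever tolerance the module actually applies.

```python
import math


def compare_floats_sketch(f1, f2, rel_tol=1e-9):
    """Return 0.0 (falsey) on a match, else the pair (f1, f2)."""
    # Any NaN matches any other NaN.
    if math.isnan(f1) and math.isnan(f2):
        return 0.0
    # isclose() treats equal infinities as close and rejects
    # inf versus finite, inf versus -inf, and NaN versus a number.
    if math.isclose(f1, f2, rel_tol=rel_tol):
        return 0.0
    return (f1, f2)


print(compare_floats_sketch(float("nan"), float("nan")))  # 0.0
print(compare_floats_sketch(1.0, 2.0))                    # (1.0, 2.0)
```

Returning a falsey value for a match lets callers write `if compare_floats_sketch(a, b):` and use the truthy result directly as the diff record.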

runscripts.debug.compare_pickles.compare_ndarrays(array1, array2)[source]

Compare two ndarrays, checking the shape and all elements, allowing for NaN values and non-numeric values. Return () if they match, else a tuple of diff info or just a diff description.

TODO(jerry): Allow tolerance for float elements of structured arrays and handle NaN and Inf values.

runscripts.debug.compare_pickles.diff_dirs(dir1, dir2, print_diff_lines=True)[source]

Diff the pickle files in the pair of named directories. Return the total diff line count.

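Parameters:
  • dir1, dir2 – The two directories whose pickle files are compared.

  • print_diff_lines – If True, print the detailed diff lines.

Return type: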

int

runscripts.debug.compare_pickles.diff_files(path1, path2, print_diff_lines=True)[source]

Diff the pair of named pickle files. Return the diff line count.

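Parameters:
  • path1, path2 – The two pickle files to compare.

  • print_diff_lines – If True, print the detailed diff lines.

Return type: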

int

runscripts.debug.compare_pickles.diff_trees(a, b)[source]

Find the differences between two trees or leaf nodes a and b. Return a falsey value if the inputs match OR a truthy value that explains or summarizes their differences, where each point in the tree where the inputs differ will be a tuple (a’s value, b’s value, optional description).

Floating point numbers are compared with the tolerance set by the constant NULP (Number of Units in the Last Place), allowing for NaN and infinite values. (Adjust the tolerance level NULP if needed.)

This operation is symmetrical.
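
The shape of such a recursive diff can be sketched as follows. This is a simplified illustration under stated assumptions: diff_trees_sketch is a hypothetical name, it handles only dicts, lists, tuples, and scalar leaves, and it uses a relative tolerance in place of the NULP-based comparison.

```python
import math


def diff_trees_sketch(a, b):
    """Return a falsey value if a and b match, else a summary of diffs."""
    if isinstance(a, dict) and isinstance(b, dict):
        diffs = {}
        for key in a.keys() | b.keys():
            if key not in a:
                diffs[key] = (None, b[key], "missing in a")
            elif key not in b:
                diffs[key] = (a[key], None, "missing in b")
            else:
                sub = diff_trees_sketch(a[key], b[key])
                if sub:
                    diffs[key] = sub
        return diffs  # an empty dict is falsey

    if isinstance(a, (list, tuple)) and isinstance(b, (list, tuple)):
        if len(a) != len(b):
            return (a, b, "lengths differ")
        diffs = []
        for index, (x, y) in enumerate(zip(a, b)):
            sub = diff_trees_sketch(x, y)
            if sub:
                diffs.append((index, sub))
        return diffs  # an empty list is falsey

    if isinstance(a, float) and isinstance(b, float):
        if math.isnan(a) and math.isnan(b):
            return ()
        return () if math.isclose(a, b, rel_tol=1e-9) else (a, b)

    return () if a == b else (a, b)


left = {"x": 1.0, "y": [1, 2]}
right = {"x": 1.0, "y": [1, 3]}
print(diff_trees_sketch(left, right))  # {'y': [(1, (2, 3))]}
```

Only the branches that differ survive in the result, so the output stays small even when the trees are large.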

runscripts.debug.compare_pickles.elide(value, max_len=200)[source]

Return a value with the same repr, but elided if its repr would be longer than max_len.
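
One plausible way to do this is sketched below. The names elide_sketch and _FixedRepr are hypothetical, and truncating to max_len characters plus "..." is an assumption about the elision format.

```python
class _FixedRepr:
    """Hypothetical helper whose repr() is exactly the given text."""

    def __init__(self, text):
        self.text = text

    def __repr__(self):
        return self.text


def elide_sketch(value, max_len=200):
    """Return value itself, or a short placeholder if its repr is long."""
    text = repr(value)
    if len(text) <= max_len:
        return value
    return _FixedRepr(text[:max_len] + "...")


print(repr(elide_sketch([1, 2, 3])))                    # [1, 2, 3]
print(repr(elide_sketch(list(range(1000)), max_len=12)))  # [0, 1, 2, 3,...
```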

runscripts.debug.compare_pickles.has_python_vars(obj)[source]

Return True if the given object has any Python instance variables, that is, ordinary fields or compact slots. If not, it’s presumably a built-in type or an extension type implemented entirely in C or Cython.
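
A minimal sketch of such a check, assuming it amounts to looking for __dict__ or __slots__ (has_python_vars_sketch is a hypothetical name, and the real test may differ):

```python
def has_python_vars_sketch(obj):
    """True if obj stores attributes in __dict__ or in __slots__."""
    return hasattr(obj, "__dict__") or hasattr(obj, "__slots__")


class Ordinary:
    pass


class Slotted:
    __slots__ = ("value",)


print(has_python_vars_sketch(Ordinary()))  # True
print(has_python_vars_sketch(Slotted()))   # True
print(has_python_vars_sketch(3.5))         # False
```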

runscripts.debug.compare_pickles.is_leaf(value, leaves=(<class 'unum.Unum'>, Bio.Seq.Seq, sympy.Basic, <class 'numbers.Number'>, <class 'functools.partial'>, <class 'function'>, sympy.matrices.dense.MutableDenseMatrix, <class 'wholecell.utils.unit_struct_array.UnitStructArray'>))[source]

Predicate to determine if we have reached the end of how deep we want to traverse through the object tree.

runscripts.debug.compare_pickles.list_pickles(directory)[source]

Return a map of .cPickle file names to paths in the given directory sorted by file modification time then by filename.

Parameters:

directory (str)

Return type:

Dict[str, str]
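
The ordering described above can be sketched with the standard library alone. The name list_pickles_sketch is hypothetical; the sketch relies on dicts preserving insertion order.

```python
import os


def list_pickles_sketch(directory):
    """Map .cPickle file names to paths, ordered by mtime then name."""
    entries = []
    for name in os.listdir(directory):
        path = os.path.join(directory, name)
        if name.endswith(".cPickle") and os.path.isfile(path):
            entries.append((os.path.getmtime(path), name, path))
    # Tuples sort by modification time first, then by file name.
    return {name: path for _, name, path in sorted(entries)}
```

Sorting by modification time makes the listing follow the order in which the files were written, which is usually the order a reader wants to compare them in.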

runscripts.debug.compare_pickles.load_fit_tree(out_subdir)[source]

Load the parameter calculator’s (Parca’s) output as an object_tree.

runscripts.debug.compare_pickles.load_tree(path)[source]

Load a .cPickle file as an object_tree.

runscripts.debug.compare_pickles.object_tree(obj, path='', debug=None)[source]

Diagnostic tool to inspect a complex data structure.

Given an object, exhaustively traverse down all attributes it contains until leaves are reached, and convert everything found into a dictionary or a list. The resulting dictionary will mirror the structure of the original object, but instead of attributes with values it will be a dictionary where the keys are the attribute names. The type of the dictionarified object will be encoded under the key !type which is assumed to not be in conflict with any other attributes. The result should aid in serialization and deserialization of the object and is intended to be a translation of a pickled object.

Parameters:
  • obj (object) – The object to inspect.

  • path (optional str) – The root path of this object tree. This will be built upon for each child of the current object found and reported if a value is provided for debug.

  • debug (optional str) – If provided, prints paths of the attributes encountered. If the value is ‘ALL’, it will print every path. If the value is ‘CALLABLE’, it will only print methods and functions it finds.
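
The traversal can be sketched in a few lines. This is a simplified illustration: object_tree_sketch and the Cell class are hypothetical, the leaf types are chosen for the example, the path and debug arguments are omitted, and storing the type name as a string under !type is an assumption.

```python
import numbers

LEAVES = (numbers.Number, str, bytes, type(None))


def object_tree_sketch(obj):
    """Convert an object graph into nested dicts and lists."""
    if isinstance(obj, LEAVES):
        return obj
    if isinstance(obj, dict):
        return {key: object_tree_sketch(val) for key, val in obj.items()}
    if isinstance(obj, (list, tuple)):
        return [object_tree_sketch(item) for item in obj]
    if hasattr(obj, "__dict__"):
        # Record the type, then one entry per attribute name.
        tree = {"!type": type(obj).__name__}
        for name, value in vars(obj).items():
            tree[name] = object_tree_sketch(value)
        return tree
    return obj  # anything else is treated as a leaf


class Cell:
    def __init__(self):
        self.mass = 1.5
        self.genes = ["a", "b"]


print(object_tree_sketch(Cell()))
# {'!type': 'Cell', 'mass': 1.5, 'genes': ['a', 'b']}
```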

runscripts.debug.compare_pickles.pprint_diffs(diffs, *, width=160, print_diff_lines=True, print_count=True)[source]

Pretty-print the diff info: optionally print the detailed diff lines, optionally print the diff line count as a single figure of merit; then return the line count.
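
A sketch of this pattern using the standard pprint module is shown below. The name pprint_diffs_sketch and the wording of the count line are assumptions; only the signature mirrors the one documented above.

```python
from pprint import pformat


def pprint_diffs_sketch(diffs, *, width=160, print_diff_lines=True,
                        print_count=True):
    """Optionally print the diff lines and their count; return the count."""
    # Treat a falsey diffs value (no differences) as zero lines.
    lines = pformat(diffs, width=width).splitlines() if diffs else []
    if print_diff_lines and lines:
        print("\n".join(lines))
    if print_count:
        print(f"==> lines of differences: {len(lines)}")
    return len(lines)


count = pprint_diffs_sketch({"mass": (1.0, 2.0)}, print_diff_lines=False)
```

Returning the count lets a caller sum it across many files and use the total as an exit status or a quick regression signal.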

runscripts.debug.compare_pickles.simplify_error_message(message)[source]

runscripts.debug.compare_pickles.size_tree(o, cutoff=0.1)[source]

Find the size of attributes in an object tree. Sizes greater than the cutoff (in MB) will be returned for display. Sizes include all values contained within an attribute (e.g., a dict is represented by the size of all its keys and values in addition to the size of the dict itself).
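
The inclusive sizing can be approximated with sys.getsizeof, as in this sketch. The names deep_size and size_tree_sketch are hypothetical, 1 MB is assumed to mean 2**20 bytes, and only top-level keys of a dict are reported, which may differ from the real function's output.

```python
import sys


def deep_size(o):
    """Approximate size in bytes of o plus everything it contains."""
    size = sys.getsizeof(o)
    if isinstance(o, dict):
        size += sum(deep_size(k) + deep_size(v) for k, v in o.items())
    elif isinstance(o, (list, tuple, set, frozenset)):
        size += sum(deep_size(item) for item in o)
    return size


def size_tree_sketch(o, cutoff=0.1):
    """Return {key: size in MB} for entries larger than cutoff MB."""
    sizes = {}
    for key, value in o.items():
        megabytes = deep_size(value) / 2**20
        if megabytes > cutoff:
            sizes[key] = megabytes
    return sizes


tree = {"big": bytes(2_000_000), "small": [1, 2, 3]}
print(sorted(size_tree_sketch(tree)))  # ['big']
```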

TODO: double check total size vs disk size - might be missing some types