noisify.faults package

noisify.faults.fault module

The base classes for faults.

class noisify.faults.fault.AttributeFault(*args, **kwargs)[source]

Derived base class for attribute_readers. It adds mapping behaviour, which enables attribute faults to be added at higher levels of data representation.

For example:

>>> from noisify.faults import GaussianNoise
>>> noise = GaussianNoise(sigma=0.5)
>>> noise.impact(100)
100.66812113455995
>>> noise.impact({'A group': 100, 'of numbers': 123})
{'of numbers': 122.83439465953323, 'A group': 99.69284150349345}
condition(triggering_object)[source]

Overrides the condition method to be constitutively active at the initial mapping stage.

Parameters: triggering_object
Returns:
map_fault(truth_object)[source]

Attempts to apply the fault to all subitems of the given object. In practice, this means calling the fault on all values of a dict.

Parameters: truth_object
Returns:
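
The mapping behaviour described above can be sketched in plain Python. This is only an illustration of the idea, not noisify's implementation: `map_over_values` and `add_gaussian_noise` are hypothetical helpers invented for this example.

```python
import random


def map_over_values(fault_function, truth_object):
    """Apply fault_function to every value of a dict, or directly to a non-dict."""
    if isinstance(truth_object, dict):
        return {key: fault_function(value) for key, value in truth_object.items()}
    return fault_function(truth_object)


def add_gaussian_noise(value, sigma=0.5):
    """Hypothetical stand-in for an attribute fault acting on one number."""
    return value + random.gauss(0, sigma)


# The keys are preserved; each value is perturbed independently.
noisy = map_over_values(add_gaussian_noise, {'A group': 100, 'of numbers': 123})
print(noisy)
```
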
class noisify.faults.fault.Fault(*args, **kwargs)[source]

Fault base class.

Requires implementations to be registered in its subclasses. Subclasses register implementations with the “register_implementation(priority=x)” decorator.

All implementations are attempted in a try/except loop that catches TypeError, AttributeError and ImportError. If no implementation succeeds, the Fault returns the original object, unchanged.

By default, faults are constitutively active. This can be overridden at instantiation by providing a ‘likelihood’ keyword argument, a float giving the probability of activation.

Example Usage:

>>> class AddOneFault(Fault):
...     def condition(self, triggering_object):
...         return True
...
...     @register_implementation(priority=2)
...     def make_uppercase(self, lowercase_string):
...         return lowercase_string.upper()
...
...     @register_implementation(priority=1)
...     def add_to_int_string(self, integer_object):
...         return int(str(integer_object) + "1")
...
>>> adder = AddOneFault()
>>> adder.impact("testing priority")
'TESTING PRIORITY'
>>> adder.impact(1234)
12341

This decorator will also honour any type hints in the decorated function.

Example:

>>> class AddOneFault(Fault):
...     @register_implementation(priority=1)
...     def make_uppercase(self, lowercase_string: str):
...         print('Called uppercase function')
...         return lowercase_string.upper()
...
...     @register_implementation(priority=2)
...     def add_to_int_string(self, integer_object: int):
...         print('Called integer adding function')
...         return int(str(integer_object) + "1")
...
>>> adder = AddOneFault()
>>> adder.impact("testing annotation")
Called uppercase function
'TESTING ANNOTATION'
>>> adder.impact(1234)
Called integer adding function
12341
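
One way such type-hint filtering could work is sketched below. It is a standalone illustration under stated assumptions, not the decorator's actual code: `accepts` and the module-level `impact` function are hypothetical, and plain functions stand in for the registered methods. The ordering of the two implementations mirrors the priorities in the example above, assuming the higher priority is tried first.

```python
import inspect


def accepts(implementation, candidate):
    """Return False only when the first parameter's type hint rules out candidate."""
    parameters = list(inspect.signature(implementation).parameters.values())
    if not parameters:
        return True
    annotation = parameters[0].annotation
    if annotation is inspect.Parameter.empty or not isinstance(annotation, type):
        return True  # no usable hint: let the call itself decide
    return isinstance(candidate, annotation)


def make_uppercase(lowercase_string: str):
    return lowercase_string.upper()


def add_to_int_string(integer_object: int):
    return int(str(integer_object) + "1")


def impact(candidate, implementations=(add_to_int_string, make_uppercase)):
    """Call the first implementation whose hint matches; else return candidate."""
    for implementation in implementations:
        if accepts(implementation, candidate):
            return implementation(candidate)
    return candidate


print(impact("testing annotation"))  # TESTING ANNOTATION
print(impact(1234))                  # 12341
```
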
apply(not_faulted_object)[source]

Applies the fault to an object. Returns self and the new object if the activation condition is met.

Parameters: not_faulted_object
Returns: self or None, impacted_object
condition(triggering_object)[source]

Base condition method. Applies the fault either constitutively or according to the likelihood argument given at instantiation.

Parameters: triggering_object – Can be used to create object-type dependent activation in overridden methods
Returns: Boolean indicating whether or not the fault applies
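
A likelihood-based activation check of this kind can be sketched as a single comparison against a uniform random draw. The function name `is_active` and its `rng` parameter are assumptions made for this example and are not part of noisify.

```python
import random


def is_active(likelihood=1.0, rng=random):
    """Return True with probability `likelihood`.

    rng.random() lies in [0, 1), so a likelihood of 1.0 always activates
    and a likelihood of 0 never does.
    """
    return rng.random() < likelihood


# Roughly a quarter of these draws should activate.
activations = sum(is_active(0.25) for _ in range(10_000))
print(activations)
```
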
impact(impacted_object)[source]

Attempts to apply the fault to an object, cycling through all implementations until one successfully executes. If none executes, it returns the original object, unharmed.

Parameters: impacted_object
Returns:
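
The fallback behaviour of impact can be sketched without the library. `SketchFault` below is a hypothetical stand-in, not the real Fault class; it assumes that higher priority values are tried first, which is inferred from the AddOneFault example rather than stated in the documentation.

```python
class SketchFault:
    """Hypothetical illustration of priority-ordered fallback."""

    def __init__(self, *implementations):
        # Each entry is a (priority, callable) pair; higher priority runs first
        # (an assumption inferred from the documented example).
        self.implementations = sorted(
            implementations, key=lambda pair: pair[0], reverse=True
        )

    def impact(self, impacted_object):
        for _, implementation in self.implementations:
            try:
                return implementation(impacted_object)
            except (TypeError, AttributeError, ImportError):
                continue  # this implementation does not fit; try the next one
        return impacted_object  # nothing succeeded: return the object unchanged


fault = SketchFault(
    (2, lambda text: text.upper()),
    (1, lambda number: int(str(number) + "1")),
)
upper_only = SketchFault((1, lambda text: text.upper()))

print(fault.impact("testing priority"))  # TESTING PRIORITY
print(fault.impact(1234))                # 12341
print(upper_only.impact(3.5))            # 3.5
```

Any exception outside the three caught types (a ValueError, for instance) would still propagate to the caller in this sketch.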

noisify.faults.attribute_faults module

Basic attribute-level faults, mostly simple numeric manipulations. A good place to get started.

class noisify.faults.attribute_faults.CalibrationFault(offset=0)[source]

Subclass of UnitFault, adds a constant offset to the input numeric.

>>> calibration_fault = CalibrationFault(10)
>>> calibration_fault.impact(200)
210
class noisify.faults.attribute_faults.GaussianNoise(sigma=0)[source]

Applies Gaussian noise to a numeric object.

>>> noise = GaussianNoise(sigma=0.5)
>>> noise.impact(27)
28.08656007204934

For NumPy array-like objects, noise is applied separately to each element.

>>> import numpy as np
>>> test = np.arange(5)
>>> noise.impact(test)
array([0.56983913, 0.92835482, 2.36240306, 2.87398093, 3.92371237])
numpy_array(array_like_object)[source]

Support for numpy arrays

pandas_df(data_frame)[source]

Support for pandas dataframes

pil_image(image_object)[source]

Support for PIL image objects. The noise is undetectable unless a high sigma is given.

python_numeric(python_numeric_object)[source]

Support for basic Python numeric types

class noisify.faults.attribute_faults.InterruptionFault(likelihood=0)[source]

Replaces input with None, activates according to set likelihood.

>>> interrupt = InterruptionFault(1.0)
>>> interrupt.impact('This can be anything')
>>>
impact_truth(truth)[source]

Basic behaviour, just returns None!

numpy_array(array_like_object)[source]

Support numpy arrays and pandas dataframes

pil_image(image_object)[source]

Support for PIL images

class noisify.faults.attribute_faults.TypographicalFault(likelihood=0, severity=0)[source]

Applies a rough misspelling to the input using faults.utilities.typo()

>>> from noisify.faults import TypographicalFault
>>> typo_fault = TypographicalFault(1.0, 1)
>>> typo_fault.impact('This is the original text')
'Thhiisith heiginal etxt'
impact_float(float_object: float)[source]

Scrambles floats, ensuring the result is still a valid float before returning

impact_int(int_object: int)[source]

Scrambles ints

impact_string(string_object: str)[source]

Scrambles strings

class noisify.faults.attribute_faults.UnitFault(likelihood=1.0, unit_modifier=None)[source]

Applies a user-defined adjustment to the input numeric object. Useful for modelling unit errors.

>>> def celsius_to_kelvin(celsius_value):
...     return celsius_value + 273.15
...
>>> kelvin_fault = UnitFault(unit_modifier=celsius_to_kelvin)
>>> kelvin_fault.impact(21)
294.15
numeric(numeric_object)[source]

Support for basic numeric types, including dataframes and numpy arrays

pil_image(image_object)[source]

Support for PIL images

noisify.faults.attribute_faults.get_mode_size(mode)[source]

Converts a PIL image mode string into a dimension cardinality

noisify.faults.attribute_faults.typo(string, severity)[source]

Roughly rearranges a string, occasionally missing a character, by applying Gaussian noise to the character indexes of the string and then rounding each to the closest index.

Parameters:
  • string
  • severity
Returns: mistyped string
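
The index-noise idea described for typo can be sketched as follows. This is an interpretation of the description, not the library's code: `sketch_typo` is a hypothetical name, severity is assumed to act as the standard deviation of the noise, and an index that lands outside the string is assumed to be the source of the "missed" characters.

```python
import random


def sketch_typo(string, severity, rng=random):
    """Rebuild `string` by reading each position from a noisy, rounded index."""
    scrambled = []
    for index in range(len(string)):
        source = round(rng.gauss(index, severity))
        if 0 <= source < len(string):
            scrambled.append(string[source])
        # An out-of-range index is skipped, so that character is lost.
    return ''.join(scrambled)


print(sketch_typo('This is the original text', 1))
```

With a severity of 0 every index maps to itself, so the string comes back unchanged.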

noisify.faults.report_faults module

Report-level faults typically depend on multiple attributes, for example by switching attribute values.

class noisify.faults.report_faults.ConfuseSpecificAttributes(attribute1, attribute2, likelihood=0)[source]

Swaps a specific pair of attribute values in a given object

impact_dictionary(dictionary_object)[source]

Support for dictionary like objects
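
Swapping a specific pair of values in a dictionary-like object can be sketched as below. `confuse_attributes` is a hypothetical function written for this example, not the class's implementation, and the attribute names in the usage are made up.

```python
import random


def confuse_attributes(report, attribute1, attribute2, likelihood=1.0, rng=random):
    """Return a copy of `report` with two values swapped, with given likelihood."""
    swapped = dict(report)  # copy, so the original report is left untouched
    if rng.random() < likelihood:
        swapped[attribute1] = report[attribute2]
        swapped[attribute2] = report[attribute1]
    return swapped


report = {'height': 180, 'weight': 75}
print(confuse_attributes(report, 'height', 'weight'))  # {'height': 75, 'weight': 180}
```
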

class noisify.faults.report_faults.LoseEntireReport(likelihood=0)[source]

Replaces entire report with None, activates according to set likelihood.

impact_truth(truth)[source]

Just returns None!

class noisify.faults.report_faults.ScrambleAttributes(likelihood=0.1, attribute_identifiers=None)[source]

Switches the values of different attribute_readers within the object. By default it will apply to all attribute_readers.

impact_dictionary(dictionary_object)[source]

Swaps random values in a dictionary

numpy_array(array_like)[source]

Swaps random cells in a numpy array-like object

pillow_image(pillow_image)[source]

Swaps random pixels in a PIL Image

noisify.faults.utilities module

Fault utility functions: general-purpose code that is used by multiple functions.

noisify.faults.utilities.dropped_scramble(collection, scrambledness, confusion_range)[source]

Scrambles objects in a collection, with a chance to lose some objects

Parameters:
  • collection
  • scrambledness – How likely two objects are to be switched
  • confusion_range – How far apart objects can be confused with one another
Returns:

noisify.faults.utilities.scramble(collection, scrambledness, confusion_range)[source]

Scrambles the order of objects in a collection using a Gaussian distribution. This can lead to duplicate objects.

Parameters:
  • collection
  • scrambledness – How likely two objects are to be switched
  • confusion_range – How far apart objects can be confused with one another
Returns:
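
The two scrambling utilities can be illustrated with a rough sketch. Everything here is an assumption layered on the parameter descriptions: `sketch_scramble` and `sketch_dropped_scramble` are hypothetical names, scrambledness is treated as a per-position probability, confusion_range as the standard deviation of the index noise, and out-of-range indexes are either clamped (giving duplicates) or dropped (losing objects). The real functions may differ in all of these details.

```python
import random


def _noisy_index(index, scrambledness, confusion_range, rng):
    """Return index itself, or a Gaussian-perturbed index with given probability."""
    if rng.random() < scrambledness:
        return round(rng.gauss(index, confusion_range))
    return index


def sketch_scramble(collection, scrambledness, confusion_range, rng=random):
    """Reorder by reading from noisy indexes; clamping can duplicate objects."""
    items = list(collection)
    result = []
    for index in range(len(items)):
        source = _noisy_index(index, scrambledness, confusion_range, rng)
        source = min(max(source, 0), len(items) - 1)  # clamp into range
        result.append(items[source])
    return result


def sketch_dropped_scramble(collection, scrambledness, confusion_range, rng=random):
    """Same idea, but an out-of-range index loses the object instead."""
    items = list(collection)
    result = []
    for index in range(len(items)):
        source = _noisy_index(index, scrambledness, confusion_range, rng)
        if 0 <= source < len(items):
            result.append(items[source])
    return result


print(sketch_scramble(range(10), 0.5, 2))
print(sketch_dropped_scramble(range(10), 0.5, 2))
```
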