A Python library for blind analysis of measurement data.
Do you need to blind your data?
Maybe not. But if you expect (or want?!) a measurement to produce a certain result, blinding can remove the temptation to tune your analysis.
Do you need a package to blind your data?
Nope. To randomly offset the values in a column of a pandas.DataFrame:

```python
import numpy as np
import pandas as pd

# parameters
fil = "./my_awesome_measurement.csv"
name = "A"
offset = 0.1

# load
df = pd.read_csv(fil)

# shift data values
df[name] = df[name] + offset * np.random.rand()
```

blindat extends this simple concept using transformation rules.
This library is an experiment in developing a reasonable workflow for blind analysis. It is not intended to be a universal solution for all forms of data or blind analysis techniques.
I assume the user wants to avoid bias and will not deliberately work around the blinding. That trust allows for a simple and reversible approach (stored data is never altered). Critical applications may call for a more paranoid scheme.
Requires python>=3.10, numpy and pandas.
Activate your Python analysis environment. Clone the source code and cd into the directory. Install using pip:

```bash
pip install .
```

Blind analysis can be as simple as applying an unknown transform to an appropriate column of data (e.g., laser wavelength or microwave frequency).
In this example, a random offset is selected from the range of 10 to 20 and added to column A of the pandas DataFrame, df.
```python
import blindat as bd

rules = bd.generate_rules("A", offset=(10.0, 20.0), random_seed=42)
df1 = bd.blind(df, rules)
df1.head()
```

| | A | B | C | D |
|---|---|---|---|---|
| 0 | 14.264812 | 0.030766 | 0.064909 | 0.930325 |
| 1 | 14.014989 | 0.562393 | 0.227109 | 0.202936 |
| 2 | 14.114655 | 0.579577 | 0.015450 | 0.534170 |
| 3 | 14.417311 | 0.868601 | 0.142738 | 0.573955 |
| 4 | 14.648785 | 0.921365 | 0.019821 | 0.263312 |
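
To make the idea concrete, here is a minimal sketch in plain NumPy and pandas of what a seeded, range-based offset could look like, and why such a transform is reversible. It is not blindat's implementation: the helpers `draw_offset`, `blind_column` and `unblind_column` are hypothetical names invented for this illustration, and the offset drawn here will not match the table above.

```python
import numpy as np
import pandas as pd


def draw_offset(low, high, seed):
    """Draw one hidden offset from [low, high) using a seeded generator."""
    rng = np.random.default_rng(seed)
    return rng.uniform(low, high)


def blind_column(df, name, offset):
    """Return a copy of df with `offset` added to one column."""
    out = df.copy()
    out[name] = out[name] + offset
    return out


def unblind_column(df, name, offset):
    """Undo blind_column by subtracting the same offset."""
    out = df.copy()
    out[name] = out[name] - offset
    return out


# toy data standing in for a real measurement
df = pd.DataFrame({"A": [0.1, 0.2, 0.3], "B": [1.0, 2.0, 3.0]})

offset = draw_offset(10.0, 20.0, seed=42)
blinded = blind_column(df, "A", offset)
restored = unblind_column(blinded, "A", offset)
```

Because the offset depends only on the range and the seed, the same value can be regenerated later to unblind. The original `df` is never modified, since each helper works on a copy.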
The @blindat decorator can be used to wrap functions that load a pandas.DataFrame, so that the loaded data is passed through the blind() function.
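
The general pattern of blinding at load time can be sketched with an ordinary Python decorator. The example below is an assumption-laden illustration, not the `@blindat` API: `blind_on_load` and its arguments are hypothetical, and the blinding step is a fixed offset written out by hand rather than a call to `blind()`.

```python
import functools

import pandas as pd


def blind_on_load(name, offset):
    """Hypothetical decorator: add `offset` to column `name` of the loaded data."""

    def decorator(loader):
        @functools.wraps(loader)
        def wrapper(*args, **kwargs):
            df = loader(*args, **kwargs)
            out = df.copy()  # leave the loader's own result untouched
            out[name] = out[name] + offset
            return out

        return wrapper

    return decorator


@blind_on_load("A", offset=12.5)
def load_measurement():
    # stand-in for pd.read_csv(...) so the sketch runs without a file
    return pd.DataFrame({"A": [0.0, 1.0], "B": [5.0, 6.0]})


df_blind = load_measurement()
```

Wrapping the loader means analysis code only ever sees blinded values, which removes the chance of forgetting the blinding step.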
The normalize() function blinds data by offsetting and scaling values to force a mean value of zero and a standard deviation of one.
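
The underlying transform is a standard z-score. The sketch below shows it in plain pandas under stated assumptions: `zscore_column` is a hypothetical helper, not the signature of `normalize()`, and it uses the sample standard deviation (pandas' default, `ddof=1`), which may differ from what blindat does.

```python
import pandas as pd


def zscore_column(df, name):
    """Return a copy of df with column `name` shifted and scaled to mean 0, std 1."""
    out = df.copy()
    col = out[name]
    out[name] = (col - col.mean()) / col.std()  # sample std (ddof=1)
    return out


# toy data: column A could be a laser wavelength in nm
df = pd.DataFrame({"A": [780.1, 780.2, 780.4, 780.3], "B": [1.0, 2.0, 3.0, 4.0]})
df_norm = zscore_column(df, "A")
```

After this transform the absolute value and scale of column A are hidden, while the shape of any dependence on the other columns is preserved.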
See docs for examples.