optframework.kernel_opt.opt_data module

Data-processing calculations used during optimization.

class optframework.kernel_opt.opt_data.OptData(base)[source]

Bases: object

read_exp(exp_data_path, t_vec, sheet_name=None)[source]

Reads experimental data from a specified path and processes it for use in the optimization.

Parameters

exp_data_path : str

Path to the experimental data file.

t_vec : array-like

The time vector corresponding to the desired time points for the experimental data.

Returns

tuple of arrays
  • x_uni_exp: An array of unique particle sizes from the experimental data.

  • raw_data_exp: An array of the sum of number concentrations for the unique particle sizes.

function_noise(ori_data)[source]

Adds noise to the original data based on the noise_type and noise_strength attributes. Supported noise types include Gaussian (‘Gaus’), Uniform (‘Uni’), Poisson (‘Po’), and Multiplicative (‘Mul’). The resulting noisy data is clipped to be non-negative.

Parameters

ori_data : array-like

The original data to which noise will be added.

Returns

array-like

The noised data.

Notes

The noise types behave as follows:
  • Gaussian (‘Gaus’): Adds noise with mean 0 and standard deviation noise_strength.

  • Uniform (‘Uni’): Adds noise uniformly distributed over [-noise_strength/2, noise_strength/2).

  • Poisson (‘Po’): Adds Poisson-distributed noise where noise_strength serves as lambda.

  • Multiplicative (‘Mul’): Applies Gaussian multiplicative noise with mean 1 and standard deviation noise_strength, multiplying the original data by the generated noise.

The resulting noised data is clipped to ensure no negative values.
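The four noise types listed above can be sketched with NumPy as follows. This is a minimal illustration, not the library's implementation: the standalone function add_noise and its rng argument are hypothetical, and noise_type and noise_strength are passed as arguments here, whereas the method reads them from instance attributes.

```python
import numpy as np

def add_noise(ori_data, noise_type="Gaus", noise_strength=0.01, rng=None):
    """Hypothetical stand-in for function_noise, following the documented rules."""
    rng = np.random.default_rng() if rng is None else rng
    ori_data = np.asarray(ori_data, dtype=float)
    shape = ori_data.shape

    if noise_type == "Gaus":
        # Additive Gaussian noise: mean 0, standard deviation noise_strength
        noised = ori_data + rng.normal(0.0, noise_strength, shape)
    elif noise_type == "Uni":
        # Additive uniform noise over [-noise_strength/2, noise_strength/2)
        noised = ori_data + rng.uniform(-noise_strength / 2, noise_strength / 2, shape)
    elif noise_type == "Po":
        # Additive Poisson noise with lambda = noise_strength
        noised = ori_data + rng.poisson(noise_strength, shape)
    elif noise_type == "Mul":
        # Multiplicative Gaussian noise: mean 1, standard deviation noise_strength
        noised = ori_data * rng.normal(1.0, noise_strength, shape)
    else:
        raise ValueError(f"Unsupported noise type: {noise_type!r}")

    # Clip so that no negative values remain
    return np.clip(noised, 0.0, None)
```

Passing a seeded generator, for example add_noise(data, "Uni", 0.2, np.random.default_rng(0)), makes the sketch reproducible.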

KDE_fit(x_uni_ori, data_ori, bandwidth='scott', kernel_func='epanechnikov')[source]

Fit a Kernel Density Estimation (KDE) model to the original data using the specified kernel function and bandwidth.

Parameters

x_uni_ori : array-like

The unique values of the data variable. Must be a one-dimensional array.

data_ori : array-like

The original data corresponding to x_uni_ori. Should be absolute values, not relative.

bandwidth : float or {‘scott’, ‘silverman’}, optional

The bandwidth of the kernel. If a float is provided, it defines the bandwidth directly. If a string (‘scott’ or ‘silverman’) is provided, the bandwidth is estimated using one of these methods. Defaults to ‘scott’.

kernel_func : {‘gaussian’, ‘tophat’, ‘epanechnikov’, ‘exponential’, ‘linear’, ‘cosine’}, optional

The kernel to use for the density estimation. Defaults to ‘epanechnikov’.

Returns

sklearn.neighbors.kde.KernelDensity

The fitted KDE model.

Notes

  • x_uni_ori must be reshaped into a column vector for compatibility with the KernelDensity class.

  • Any values in data_ori that are zero or less are adjusted to a small positive value (1e-20) to avoid numerical issues during KDE fitting.
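The two notes above can be illustrated with scikit-learn's KernelDensity. This is a sketch under assumptions, not the method's actual code: the function name kde_fit_sketch is hypothetical, and passing data_ori as sample_weight is an assumption about how the data enter the fit. A float bandwidth is used as the default here because string bandwidths (‘scott’, ‘silverman’) are only accepted by newer scikit-learn releases.

```python
import numpy as np
from sklearn.neighbors import KernelDensity

def kde_fit_sketch(x_uni_ori, data_ori, bandwidth=1.0, kernel_func="epanechnikov"):
    """Hypothetical sketch of a weighted KDE fit on one-dimensional data."""
    # KernelDensity expects shape (n_samples, n_features), hence the column vector
    x_col = np.asarray(x_uni_ori, dtype=float).reshape(-1, 1)

    # Replace zero or negative entries with a small positive value (1e-20)
    weights = np.asarray(data_ori, dtype=float).copy()
    weights[weights <= 0] = 1e-20

    kde = KernelDensity(kernel=kernel_func, bandwidth=bandwidth)
    # Assumption: the data values act as per-point weights
    kde.fit(x_col, sample_weight=weights)
    return kde
```

The returned object exposes score_samples, which gives log densities at new points.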

KDE_score(kde, x_uni_new)[source]

Evaluate the KDE model at new data points and normalize the result using the cumulative distribution (Q3).

Parameters

kde : sklearn.neighbors.kde.KernelDensity

The fitted KDE model returned by KDE_fit.

x_uni_new : array-like

New unique data points where the KDE model will be evaluated.

Returns

array-like

The smoothed and normalized data based on the KDE model.

Notes

  • The KDE model is evaluated on the new data points by calculating the log density, which is then exponentiated to get the actual density values.

  • The smoothed data is normalized by dividing by the last value of the cumulative distribution (Q3).
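The exponentiation and normalization steps described in these notes can be sketched with NumPy alone. The function name normalize_by_q3 is hypothetical, and building Q3 as a plain cumulative sum of the density values is an assumption; the source does not state how Q3 is computed.

```python
import numpy as np

def normalize_by_q3(log_density):
    """Hypothetical sketch: turn log densities into Q3-normalized values."""
    # score_samples-style output is a log density; exponentiate to get the density
    density = np.exp(np.asarray(log_density, dtype=float))

    # Assumption: Q3 is the running sum of the density values
    q3 = np.cumsum(density)

    # Divide by the last value of Q3
    return density / q3[-1]
```

Under this assumption the returned values sum to 1, so the corresponding cumulative distribution ends at 1.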

traverse_path(label, path_ori)[source]

Update the file path or list of file paths based on the given label.

This method modifies the provided file path or a list of file paths by appending or updating a numerical label (e.g., ‘_0’, ‘_1’) to distinguish different samples of the same test.

Parameters

label : int

The label to update or append to the file path(s). The label corresponds to the current sample or iteration number.

path_ori : str or list of str

The original file path or list of file paths to be updated.

Returns

str or list of str

The updated file path(s) with the new label.
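The append-or-update behavior can be sketched as follows. This is an illustration under assumptions rather than the method's code: the function name relabel_path is hypothetical, and placing the label before the file extension is an assumption, since the source only says that a label such as ‘_0’ or ‘_1’ is appended or updated.

```python
import os
import re

def relabel_path(label, path_ori):
    """Hypothetical sketch: append or replace a trailing '_<number>' label."""
    def _one(path):
        root, ext = os.path.splitext(path)
        # Drop an existing trailing label such as '_0' or '_12', if present
        root = re.sub(r"_\d+$", "", root)
        # Assumption: the label sits just before the file extension
        return f"{root}_{label}{ext}"

    # Accept a single path or a list of paths, as the method does
    if isinstance(path_ori, list):
        return [_one(p) for p in path_ori]
    return _one(path_ori)
```

One consequence of this pattern is that a file name that already ends in an underscore followed by digits, for example a date suffix, would be treated as carrying a label and have that suffix replaced.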