onebone.preprocessing

onebone.preprocessing.feature_selection

Feature Selection methods.

onebone.preprocessing.feature_selection.fs_crosscorrelation(x: ndarray, refer: ndarray, output_col_num: int) ndarray

Note

This method uses scipy.signal.correlate.

Reduce the dimensionality of the input data by removing the signals that have the smallest cross-correlation with the reference signal.

Parameters
  • x (numpy.ndarray of shape (data_length, n_features)) – The data.

  • refer (numpy.ndarray of shape (data_length,)) – The reference data.

  • output_col_num (int) – Number of columns after dimension reduction.

Returns

x_tr (numpy.ndarray of shape (data_length, output_col_num)) – The data after dimension reduction.

Examples

>>> t = np.linspace(0, 1, 1000)
>>> a = 1.0 * np.sin(2 * np.pi * 30.0 * t)
>>> b = 5.0 * np.sin(2 * np.pi * 30.0 * t)
>>> x = np.stack([a, b], axis=1)
>>> x.shape
(1000, 2)
>>> refer = 1.0 * np.sin(2 * np.pi * 10.0 * t)
>>> x_dimreduced = fs_crosscorrelation(x, refer, output_col_num=1)
>>> x_dimreduced.shape
(1000, 1)
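The selection logic can be sketched as follows. This is an illustrative, hypothetical implementation (`fs_crosscorrelation_sketch` is not part of onebone), assuming each column is scored by the peak magnitude of its cross-correlation with the reference:

```python
import numpy as np
from scipy.signal import correlate

def fs_crosscorrelation_sketch(x: np.ndarray, refer: np.ndarray, output_col_num: int) -> np.ndarray:
    """Keep the `output_col_num` columns of `x` whose cross-correlation
    with `refer` has the largest peak magnitude (illustrative sketch)."""
    # Score each column by the peak of |cross-correlation| with the reference.
    scores = np.array(
        [np.max(np.abs(correlate(x[:, i], refer, mode="full"))) for i in range(x.shape[1])]
    )
    # Keep the highest-scoring columns, preserving their original order.
    keep = np.sort(np.argsort(scores)[::-1][:output_col_num])
    return x[:, keep]

t = np.linspace(0, 1, 1000)
a = 1.0 * np.sin(2 * np.pi * 30.0 * t)
b = 5.0 * np.sin(2 * np.pi * 30.0 * t)
x = np.stack([a, b], axis=1)
refer = 1.0 * np.sin(2 * np.pi * 10.0 * t)
x_reduced = fs_crosscorrelation_sketch(x, refer, output_col_num=1)
print(x_reduced.shape)  # (1000, 1)
```

In the example above, both columns share the reference's phase structure, but `b` has five times the amplitude and hence a five-times-larger correlation peak, so the sketch keeps `b`.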

onebone.preprocessing.pd

Transform partial discharge (PD) data from PRPS (Phase Resolved Pulse Sequence) format to PRPD (Phase Resolved Partial Discharge) format.

onebone.preprocessing.pd.ps2pd(ps, range_amp: Tuple[int, int] = (0, 256), resol_amp: int = 128) ndarray

Transform a PRPS (phase resolved pulse sequence) into a PRPD (phase resolved partial discharge) pattern by marginalizing over the time dimension.

Parameters
  • ps (array_like of shape (n_resolution_phase, n_timestep)) – The data, e.g. the KEPCO standard shape is (3600, 128).

  • range_amp (tuple (min, max), default=(0, 256)) – Measurement range of the PD DAQ; refer to the DAQ manufacturer's specification.

  • resol_amp (int, default=128) – Desired amplitude resolution of the transformed PRPD.

Returns

pd (numpy.ndarray of shape (n_resolution_phase, n_resolution_amplitude)) – The transformed prpd.

Examples

>>> ps = np.random.random([3600, 128])
>>> ps2pd(ps)
array([[0., 0., 0., ..., 0., 0., 0.],
    [0., 0., 0., ..., 0., 0., 0.],
    ...,
    [0., 0., 0., ..., 0., 0., 0.]])
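One plausible way to marginalize the time dimension is a per-phase histogram of pulse amplitudes. The sketch below (`ps2pd_sketch`, a hypothetical helper, not the onebone implementation) assumes that binning scheme:

```python
import numpy as np

def ps2pd_sketch(ps, range_amp=(0, 256), resol_amp=128):
    """Sketch of a PRPS -> PRPD transform: for each phase row, histogram
    the pulse amplitudes over the time dimension into `resol_amp` bins
    spanning `range_amp`."""
    ps = np.asarray(ps)
    n_phase = ps.shape[0]
    bins = np.linspace(range_amp[0], range_amp[1], resol_amp + 1)
    pd = np.zeros((n_phase, resol_amp))
    for i in range(n_phase):
        pd[i], _ = np.histogram(ps[i], bins=bins)
    return pd

ps = np.random.random([3600, 128])  # amplitudes all in [0, 1)
pd = ps2pd_sketch(ps)
print(pd.shape)  # (3600, 128)
```

Under these assumptions, random amplitudes in [0, 1) with the default range (0, 256) all fall into the first 2-unit-wide bin, which would explain why the example output above is almost entirely zeros.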

onebone.preprocessing.scaling

Data scaling methods.

onebone.preprocessing.scaling.minmax_scaling(x, feature_range: Tuple[int, int] = (0, 1), axis: int = 0) ndarray

Note

This method uses sklearn.preprocessing.minmax_scale as is.

Transform features by scaling each feature to a given range.

\[x' = {(x - x_{min}) \over (x_{max} - x_{min})}\]
Parameters
  • x (array_like of shape (n_samples, n_features)) – The data.

  • feature_range (tuple (min, max), default=(0, 1)) – Desired range of transformed data.

  • axis (int, default=0) – Axis used to scale along.

Returns

x_tr (numpy.ndarray of shape (n_samples, n_features)) – The transformed data.

Examples

>>> a = list(range(9))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8]
>>> minmax_scaling(a)
array([0.   , 0.125, 0.25 , 0.375, 0.5  , 0.625, 0.75 , 0.875, 1.   ])
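The formula above can be written directly in NumPy. This sketch (a hypothetical `minmax_sketch`, not the library function) also shows the rescaling step needed for an arbitrary feature_range:

```python
import numpy as np

def minmax_sketch(x, feature_range=(0, 1), axis=0):
    """Pure-NumPy sketch of min-max scaling to an arbitrary feature range."""
    x = np.asarray(x, dtype=float)
    x_min = x.min(axis=axis, keepdims=True)
    x_max = x.max(axis=axis, keepdims=True)
    std = (x - x_min) / (x_max - x_min)  # scale to [0, 1]
    lo, hi = feature_range
    return std * (hi - lo) + lo          # rescale to [lo, hi]

a = list(range(9))
print(minmax_sketch(a))                          # 0, 0.125, ..., 1 as above
print(minmax_sketch(a, feature_range=(-1, 1)))   # -1, -0.75, ..., 1
```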
onebone.preprocessing.scaling.zscore_scaling(x, axis: int = 0)

Note

This method uses sklearn.preprocessing.scale as is.

Standardize the input data to zero mean and unit variance (z-score standardization).

\[x' = {(x - x_{mean}) \over x_{std}}\]
Parameters
  • x (array_like of shape (n_samples, n_features)) – The data.

  • axis (int, default=0) – Axis used to compute the means and standard deviations along.

Returns

x_tr (numpy.ndarray of shape (n_samples, n_features)) – The transformed data.

Examples

>>> a = list(range(9))
>>> a
[0, 1, 2, 3, 4, 5, 6, 7, 8]
>>> zscore_scaling(a)
array([-1.54919334, -1.161895  , -0.77459667, -0.38729833,  0.,
        0.38729833,  0.77459667,  1.161895  ,  1.54919334])
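For reference, the formula above can be sketched in pure NumPy (a hypothetical `zscore_sketch`; note that sklearn.preprocessing.scale uses the population standard deviation, ddof=0):

```python
import numpy as np

def zscore_sketch(x, axis=0):
    """Pure-NumPy sketch of z-score standardization: subtract the mean and
    divide by the population standard deviation (ddof=0) along `axis`."""
    x = np.asarray(x, dtype=float)
    mean = x.mean(axis=axis, keepdims=True)
    std = x.std(axis=axis, keepdims=True)
    return (x - mean) / std

a = list(range(9))
z = zscore_sketch(a)
print(z.mean(), z.std())  # ~0.0, ~1.0
```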