- the window length must be an odd number
- the polynomial order (1, 2, 3, ...) must be less than the window length
- the data points are assumed to be "evenly spaced" (for example in time, voltage, or magnetic field)
For example, when measuring current I as a function of applied voltage V from -10 to +30 V in equal steps of 0.1 V, specifying a window length of 21 points and a second-order polynomial fit corresponds to fitting a quadratic function to a set of 21 points spanning a 2 V range centered on a particular data point (including that point), then repeating that process for each subsequent data point. In the SciPy implementation, this repeated fitting over a moving window is done automatically for us.
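To make the moving-window idea concrete, here is a minimal sketch that performs the quadratic fit by hand for one interior point and compares it with the value returned by SciPy. The I(V) data, the noise level, and the chosen index k are made up for illustration only.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical I(V) data: a smooth curve plus noise,
# with V running from -10 V to +30 V in 0.1 V steps (401 points)
rng = np.random.default_rng(0)
V = np.linspace(-10.0, 30.0, 401)
I = 1e-3 * V**3 + rng.normal(scale=0.5, size=V.size)

window_length, polyorder = 21, 2
half = window_length // 2

# Manual version: fit a quadratic to the 21 points centered on index k
# and evaluate the fitted polynomial at the center point
k = 200
window = slice(k - half, k + half + 1)
coeffs = np.polyfit(V[window] - V[k], I[window], polyorder)
manual_value = np.polyval(coeffs, 0.0)

# SciPy version: the same fit is repeated automatically for every point
I_smooth = savgol_filter(I, window_length, polyorder)

print(manual_value, I_smooth[k])
```

For points far from the ends of the data set, the two numbers should agree to within floating-point rounding.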
As noted above, the number of points included in the subset and the order of the polynomial must be specified by the user. In all data smoothing routines, there is a tradeoff between the degree of smoothing and the ability to resolve small features in the underlying signal; data smoothing is not a substitute for working as carefully as possible to minimize the measurement noise and maximize the accuracy and precision of the originally measured data! In the end, the 'best values' for these parameters are usually determined by trial and error. Keep track of the values used when determining the resolution of your experiment!
scipy.signal.savgol_filter
This filter can be implemented in Python with just two lines of code: one to import the function savgol_filter from the scipy.signal module, and one more to carry out the data smoothing.
We first need to specify two parameters:
- window_length, the number of adjacent data points we wish to include in the fit
- polyorder, the order of the polynomial fit (where 0 = constant, 1 = linear, 2 = quadratic, and 3 = cubic)
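The two-line pattern described above might look like the following sketch. The noisy sine-wave data and the particular values window_length = 21 and polyorder = 3 are assumptions chosen for illustration, not recommended settings.

```python
import numpy as np
from scipy.signal import savgol_filter  # line 1: import the function

# Hypothetical noisy data: a sine wave with added Gaussian noise
rng = np.random.default_rng(1)
x = np.linspace(0.0, 2.0 * np.pi, 200)
clean = np.sin(x)
noisy = clean + rng.normal(scale=0.1, size=x.size)

# line 2: smooth using a 21-point window and a cubic polynomial
smooth = savgol_filter(noisy, window_length=21, polyorder=3)
```

The output array has the same length as the input, so it can be plotted against the same x values.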
If we also want to calculate the slope of the data at each point by taking a derivative, we then need to specify two additional parameters:
- deriv, the order of the derivative (where 0 = no derivative, 1 = first derivative, 2 = second derivative, ...)
- delta, the spacing of the samples to which the filter will be applied.
As an example, suppose we again want to determine \(\frac{dI}{dV}\) from a measurement of \(I\left(V\right)\) for applied voltages between -10 V and +30 V, and that the measurements were evenly spaced in voltage with a step size of 0.1 V. We then specify deriv = 1 and delta = 0.1 within the savgol_filter function.
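As a sketch of this derivative calculation, the example below uses a made-up, noise-free quadratic I(V) curve so that the result can be checked against the exact derivative. The coefficients a and b are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

# Evenly spaced voltages from -10 V to +30 V in 0.1 V steps
dV = 0.1
V = np.arange(401) * dV - 10.0

# Hypothetical noise-free I(V) curve: I = a*V**2 + b*V
a, b = 0.05, 0.2
I = a * V**2 + b * V

# First derivative dI/dV, with the sample spacing passed as delta
dIdV = savgol_filter(I, window_length=21, polyorder=2, deriv=1, delta=dV)

# Exact derivative of the model curve, for comparison
exact = 2.0 * a * V + b
max_error = np.max(np.abs(dIdV - exact))
print(max_error)
```

Because the model curve is itself a quadratic, a second-order fit recovers its slope essentially exactly; with real, noisy data the agreement would depend on the noise level and the window length.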
Finally, for the computed derivative to be physically and quantitatively meaningful,
- the order of the polynomial fit needs to be greater than the order of the derivative (!)
- the delta value needs to be correctly specified. If unspecified, delta = 1 is assumed.
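The effect of leaving out delta can be seen in a short sketch. The I(V) curve here is again hypothetical; the point is only the scaling between the two results.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical I(V) data sampled every 0.1 V
dV = 0.1
V = np.arange(401) * dV - 10.0
I = 0.05 * V**2 + 0.2 * V

# Correct: slope per volt, because the sample spacing is supplied
slope_per_volt = savgol_filter(I, 21, 2, deriv=1, delta=dV)

# delta omitted: it defaults to 1.0, so this is the slope per sample
slope_per_sample = savgol_filter(I, 21, 2, deriv=1)

# For a first derivative the two differ by exactly a factor of dV
print(np.allclose(slope_per_sample, slope_per_volt * dV))
```

For a first derivative, forgetting delta with a 0.1 V step therefore gives a slope that is ten times too small.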
As usual, there are additional optional settings, including options for how to handle data near the endpoints (see mode): with the default (mode = 'interp'), a polynomial of order polyorder is fit to the last window_length points at each end of the data, and that polynomial is used to evaluate the last window_length // 2 output values. For this and other details, see the SciPy reference manual page.
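As an illustration of how the endpoint handling matters, this sketch compares the default mode = 'interp' with mode = 'nearest' on a made-up straight-line data set. The data values are hypothetical.

```python
import numpy as np
from scipy.signal import savgol_filter

# Hypothetical noise-free straight line
y = 2.0 * np.arange(50) + 1.0

window_length, polyorder = 21, 2
half = window_length // 2

# Default: a polynomial fitted to the edge points fills in the ends
y_interp = savgol_filter(y, window_length, polyorder, mode="interp")

# Alternative: pad the data by repeating the first and last values
y_nearest = savgol_filter(y, window_length, polyorder, mode="nearest")

# Away from the ends, the two modes give the same result
interior_same = np.allclose(y_interp[half:-half], y_nearest[half:-half])

# At the ends, 'interp' follows the line while 'nearest' deviates from it
edge_error_interp = abs(y_interp[0] - y[0])
edge_error_nearest = abs(y_nearest[0] - y[0])
print(interior_same, edge_error_interp, edge_error_nearest)
```

The choice of mode only changes the first and last window_length // 2 output values, which is worth remembering when features of interest lie near the ends of a scan.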
Sample Python code
Here is an example of how to use the SciPy function savgol_filter for data smoothing.