Mark up time series
A time series contains a pattern. The pattern consists of several segments. Each segment can be represented by a simple parametric regression function. The whole pattern is a sequence of concatenated regression models.
The following information is given:
1) a one-dimensional time series;
2) a set of regression models.
Each model defines:
- the type of regression function (linear, exponential, Gaussian, etc.);
- the parameters of the function;
- the minimal and maximal number of samples in the segment;
- the maximal variance of the residuals.
One must compute the starts of the segments. Note that:
- there is no gap between the segments of a pattern;
- there may be a gap of any length between patterns;
- segments may vary in length;
- the series may contain broken or unknown patterns.
The main problem is how to define the optimal start point between two neighbouring segments.
The principle of the algorithm:
1) Each pattern begins with the dummy regression function 'any', which covers any gap before the pattern.
2) The first point (the start of the first segment) is the boundary between 'any' and the first segment of the pattern.
3) Continue in the same way for each pair of adjacent segments of the pattern.
How to find the point? Since each segment can vary its length from min to max, take the Cartesian product of the admissible lengths of both segments, the antecedent and the consequent. For each pair of lengths, calculate the sum of the variances of the two segments' residuals. The pair with the minimal sum defines the lengths of the segments and hence the point between them.
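The search over pairs of lengths can be sketched as follows. This is a minimal illustration, not the code of the markup function: the series is synthetic, the length limits `len1` and `len2` are made up, the two fits (a least-squares line and a constant level) are stand-ins for the library functions, and the check against the maximal residual variance is omitted.

```matlab
% Synthetic series: a ramp (antecedent) followed by a constant (consequent).
% The true boundary is after sample 12; all values here are illustrative.
x = [2*(1:12)'; 30*ones(20,1)] + 0.05*sin((1:32)');   % small deterministic ripple

% Hypothetical length limits for the two segments
len1 = 9:15;      % antecedent: min..max number of samples
len2 = 10:20;     % consequent: min..max number of samples

% Residual variance of a least-squares line and of a constant level
varLin   = @(s) var(s - polyval(polyfit((1:numel(s))', s, 1), (1:numel(s))'));
varConst = @(s) var(s - mean(s));

% Cartesian product of the admissible lengths; keep the pair with minimal cost
best = struct('cost', Inf, 'n1', NaN, 'n2', NaN);
for n1 = len1
    for n2 = len2
        if n1 + n2 > numel(x), continue; end          % pair does not fit the data
        cost = varLin(x(1:n1)) + varConst(x(n1+1:n1+n2));
        if cost < best.cost
            best = struct('cost', cost, 'n1', n1, 'n2', n2);
        end
    end
end
fprintf('boundary after sample %d (n1 = %d, n2 = %d)\n', best.n1, best.n1, best.n2);
```

The exhaustive loop costs one pair of fits per element of the product, which is affordable here because both length ranges are short.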
Load the time series
x = dlmread('xMarkUp.csv');
%x(1:1500) = [];
figure; hold on
plot(1:length(x),x,'g.');
axis tight
xlabel time
ylabel value
title('The time series to mark up');
Define regression functions for segments
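The models are stored in a structure array, mark, with the following fields: func (name of the regression function), w (parameters of the function), tmin and tmax (minimal and maximal length of the segment), and err (maximal variance of the regression residuals).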
% The dummy segment (for any possible gap between patterns)
mark(1).func = 'mark_any'; % function name, the function must be in the Matlab path
mark(1).w = []; % regression function parameters (there are no parameters for this function)
mark(1).tmin = 0; % minimal length of the segment
mark(1).tmax = 100; % maximal length of the segment
% NOTE! set the value no more than the length of the pattern
mark(1).err = 0.1e-6; % maximal variance of the regression residuals (0.1e-6 for this function)
% The 1st segment
mark(2).func = 'mark_anylin';
mark(2).w = 10.818;
mark(2).tmin = 9; %10;
mark(2).tmax = 12; %15;
mark(2).err = 5;
% The 2nd segment
mark(3).func = 'mark_const'; % the first constant
mark(3).w = 150;
mark(3).tmin = 28;
mark(3).tmax = 36; % 32 GD100;
mark(3).err = 5;
% The 3rd segment
mark(4).func = 'mark_anylin';
mark(4).w = 4.9548;
mark(4).tmin = 36; %24;
mark(4).tmax = 44; %46;
mark(4).err = 12;
% The 4th segment
mark(5).func = 'mark_anyconst'; % the second constant
mark(5).w = [];
mark(5).tmin = 40; %62 GD100; % 40 ONLY for IPOH
mark(5).tmax = 68; %80
mark(5).err = 8;
% The 5th (last) segment
mark(6).func = 'mark_anyexp';
mark(6).w = [298.7405 -0.0574];
mark(6).tmin = 80;
mark(6).tmax = 80;
mark(6).err = 7;
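A quick sanity check of the length limits may help before running the markup. The sketch below copies the tmin and tmax values from the definition above into two plain vectors (the names `tmin`, `tmax`, `patMin` and `patMax` are introduced here for illustration) and treats them as sample counts, which is one possible reading of the NOTE about the dummy segment.

```matlab
% Length limits copied from the definition above (dummy segment first)
tmin = [0    9 28 36 40 80];
tmax = [100 12 36 44 68 80];

% Every segment must have a non-empty range of admissible lengths
assert(all(tmin <= tmax), 'each tmin must not exceed its tmax');

% Shortest and longest possible pattern, excluding the dummy segment
patMin = sum(tmin(2:end));
patMax = sum(tmax(2:end));
fprintf('pattern length: %d..%d samples\n', patMin, patMax);

% The dummy segment should not be longer than the pattern itself
assert(tmax(1) <= patMin, 'dummy segment is longer than the shortest pattern');
```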
Discussion
The regression parameters are defined manually. There are two ways to mark a segment: 1) check the variance of the residuals; 2) check that the regression parameters have acceptable values. We chose the first way. Below we set the start of each segment manually and estimate its possible length and parameters.
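The two acceptance checks can be contrasted on a small synthetic segment. Everything in this sketch is an assumption made for illustration: the data, the threshold `maxVar`, and the admissible slope range `slopeRange`. The fit uses polyfit rather than the library functions.

```matlab
% Synthetic linear segment with a small deterministic ripple
t   = (1:12)';
seg = 10.8*t + 3 + 0.5*cos(3*t);

% Least-squares line: p(1) is the slope, p(2) is the intercept
p      = polyfit(t, seg, 1);
resVar = var(seg - polyval(p, t));

% Way 1: accept the segment if the residual variance is small enough
maxVar  = 5;                          % hypothetical threshold
okByVar = resVar <= maxVar;

% Way 2: accept the segment if the fitted slope lies in an expected range
slopeRange = [9 12];                  % hypothetical admissible slopes
okByParam  = p(1) >= slopeRange(1) && p(1) <= slopeRange(2);

fprintf('residual variance %.3f, slope %.3f\n', resVar, p(1));
```

The first check needs only one threshold per segment, whereas the second needs a range for every parameter of every model.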
Find the parameters of the 1-st segment, line
ptr_list = [0
    255
    255+12+1
    255+12+39+1
    255+12+39+30+1
    255+12+39+30+60+1
    ]; % Define starts of each segment
% 1) Observe the segments one-by-one
for segNum = [2 3 4 5 6]
    % 2) Set the start of the segment manually
    ptr = ptr_list(segNum);
    fprintf(1,'\nSegment %d\n', segNum-1);
    % 3) Calculate parameters and variance for each length
    vecErr = []; % vector of segment variances
    matW = []; % matrix of segment parameters
    figure; hold on
    for timRelative = mark(segNum).tmin : mark(segNum).tmax
        tim = [ptr:ptr+timRelative]'; % time ticks of the segment
        seg = x(tim); % segment values
        % 4) Get the parameters and variance
        [y, err, w] = feval(mark(segNum).func, mark(segNum).w, seg);
        vecErr = [vecErr; err];
        matW = [matW; w];
        % 5) Use the parameters to show possible variances for each segment length
        %[y, vecSig2(end+1) ] = feval(mark(segNum).func, mark(segNum).w, x(ptr:ptr+ti));
        % 6) Plot the result
        %subplot(5,1,segNum-1); hold on
        plot(tim, seg, 'b.');
        plot(tim, y, 'r-');
        xlabel('time'); ylabel('value');
        title(['Segment ', num2str(segNum-1)]);
        fprintf(1, ' len = %d, err = %0.2f, w = %s\n', timRelative, err, num2str(w));
    end
end
Segment 1
len = 9, err = 0.80, w = 10.6424
len = 10, err = 1.06, w = 10.5091
len = 11, err = 2.30, w = 10.1818
len = 12, err = 4.05, w = 9.7143
Segment 2
len = 28, err = 1.34, w = 150
len = 29, err = 1.32, w = 150
len = 30, err = 1.30, w = 150
len = 31, err = 1.27, w = 150
len = 32, err = 1.26, w = 150
len = 33, err = 1.24, w = 150
len = 34, err = 1.22, w = 150
len = 35, err = 1.20, w = 150
len = 36, err = 1.19, w = 150
Segment 3
len = 36, err = 5.83, w = 4.5737
len = 37, err = 6.83, w = 4.4762
len = 38, err = 7.82, w = 4.3757
len = 39, err = 8.93, w = 4.2662
len = 40, err = 9.82, w = 4.1641
len = 41, err = 10.71, w = 4.0621
len = 42, err = 11.68, w = 3.9544
len = 43, err = 12.48, w = 3.8548
len = 44, err = 13.25, w = 3.7568
Segment 4
len = 40, err = 0.62, w =
len = 41, err = 0.62, w =
len = 42, err = 0.61, w =
len = 43, err = 0.60, w =
len = 44, err = 0.60, w =
len = 45, err = 0.59, w =
len = 46, err = 0.58, w =
len = 47, err = 0.58, w =
len = 48, err = 0.57, w =
len = 49, err = 0.57, w =
len = 50, err = 0.56, w =
len = 51, err = 0.55, w =
len = 52, err = 0.55, w =
len = 53, err = 0.54, w =
len = 54, err = 0.54, w =
len = 55, err = 0.53, w =
len = 56, err = 0.53, w =
len = 57, err = 0.53, w =
len = 58, err = 0.52, w =
len = 59, err = 0.52, w =
len = 60, err = 0.57, w =
len = 61, err = 0.95, w =
len = 62, err = 2.23, w =
len = 63, err = 4.14, w =
len = 64, err = 6.44, w =
len = 65, err = 9.05, w =
len = 66, err = 11.75, w =
len = 67, err = 14.71, w =
len = 68, err = 17.76, w =
Segment 5
len = 80, err = 3.01, w = 298.9107 -0.05706107
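The listing above can be read against the thresholds in the mark structure. As a small illustration, the sketch below takes the Segment 3 values from the listing and keeps the lengths whose residual variance does not exceed mark(4).err = 12. The variable names are introduced here and are not part of the original script.

```matlab
% Values copied from the Segment 3 listing above
len    = 36:44;
errSeg = [5.83 6.83 7.82 8.93 9.82 10.71 11.68 12.48 13.25];

maxErr = 12;                         % mark(4).err
okLen  = len(errSeg <= maxErr);      % lengths that pass the variance check

fprintf('acceptable lengths: %s\n', num2str(okLen));
```

For this segment the lengths 36 to 42 pass, and the two longest candidates are rejected.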
The main mark up function
[patterns, err] = markup(mark, x, 1);
%[patterns, err, params] = markup(mark, x);
Plot the result of the marking with the fixed regression parameters
figure; hold on
plot(1:length(x),x,'g.');
axis tight
yminmax = ylim;
colors = {'r','b','y','c','m'};
for segTimeNaN = patterns'
    segTime = segTimeNaN(find (~isnan(segTimeNaN)) );
    for segNum = 1:length(segTime)-1
        tim = [segTime(segNum) : segTime(segNum+1) - 1]'; %NOTE minus one
        seg = x(tim);
        [segY, seg1sig] = feval(mark(segNum+1).func, mark(segNum+1).w, seg); % calculate the variance (fitness)
        plot(tim, seg, [colors{segNum},'.']);
        plot(tim, segY, [colors{segNum},'-']);
    end
end
xlabel time
ylabel value
title('All patterns, including the broken ones');
Plot only correct patterns
[patterns, err] = markup(mark, x);
figure; hold on
plot(1:length(x),x,'g.');
axis tight
yminmax = ylim;
colors = {'r','b','y','c','m'};
for segTimeNaN = patterns'
    if ~any(isnan(segTimeNaN))
        segTime = segTimeNaN;
        for segNum = 1:length(segTime)-1
            tim = [segTime(segNum) : segTime(segNum+1) - 1]'; %NOTE minus one
            seg = x(tim);
            [segY, seg1sig, w] = feval(mark(segNum+1).func, mark(segNum+1).w, seg); % calculate the variance (fitness)
            plot(tim, seg, [colors{segNum},'.']);
            plot(tim, segY, [colors{segNum},'-']);
        end
    end
end
xlabel time
ylabel value
title('There is only one correct pattern in the time series');
Appendix: library of the regression functions
Three functions are currently suggested: mark_any, mark_lin, and mark_exp. An example, mark_exp:
function [y, mse, w] = mark_exp(w, x)
% [y, mse, w] = mark_exp(w, x)
% mark time series x, the exponential function
%
% w [1,W] parameters of the model; here y = w(1) + w(2)*exp(w(3)*t), t = 1..m
% x [m,1] time series, the dependent variable of the regression fit
%
% y [m,1] the calculated dependent variable
% mse [scalar] the residual variance, MSE
% w [1,W] parameters of the function
%
% If the regression parameters are required (third output argument), the
% Levenberg-Marquardt method will be called, and the parameters will be
% returned together with the new MSE.
%
% Example
% x = [298; 294; 284; 272; 260; 248; 238; 226; 216; 206; 196; 186; 178; 170; 162; 154; 148; 140; 134; 128; 122; 118; 112; 108; 102; 98; 94; 90; 88; 84; 80; 78; 76; 72; 70; 68; 66; 64; 62; 60; 58; 56; 56; 54; 54; 52; 50; 50; 48; 48; 48; 46; 46; 46; 46; 44; 44; 44; 44; 42; 42; 42; 42; 42; 40; 40; 40; 40; 40; 40; 40; 40; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38; 38];
% w = [[30 300 -0.05]]
% [y1, sig2] = mark_exp(w, x)
% [y2, sig2, w] = mark_exp(w, x)
% tim = [1:length(x)]';
% figure; hold on
% plot(tim, x, 'k.');
% plot(tim, y1, 'b-');
% plot(tim, y2, 'r-');
% legend('source', 'manual parameters', 'optimised parameters');
% xlabel('time'); ylabel('value');

% the model depends on x only through its length
f = inline('w(1) + w(2) * exp(w(3) * (1:length(x))'')', 'w', 'x');
if nargout > 2
    % re-estimate the parameters only when they are requested
    w = nlinfit((1:length(x))', x, f, w);
end
y = f(w,x);
mse = var(x-y);
return
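If nlinfit is not available, the same model form can be fitted with a derivative-free search. The sketch below is an assumption-laden substitute, not the method used by mark_exp: it swaps Levenberg-Marquardt for the Nelder-Mead simplex search of fminsearch, uses synthetic noise-free data, and starts from a made-up guess `w0`.

```matlab
% Synthetic data generated from the model itself (illustrative values)
t = (1:60)';
x = 30 + 300*exp(-0.05*t);

% Same model form as in mark_exp, written as an anonymous function
f   = @(w, t) w(1) + w(2)*exp(w(3)*t);
sse = @(w) sum((x - f(w, t)).^2);      % sum of squared residuals

% Rough manual starting guess, then a simplex search for better parameters
w0 = [25 280 -0.06];
w  = fminsearch(sse, w0);

% Residual variance before and after the search
mse0 = var(x - f(w0, t));
mse1 = var(x - f(w, t));
fprintf('residual variance: %.4g (manual), %.4g (optimised)\n', mse0, mse1);
```

A simplex search usually needs more function evaluations than Levenberg-Marquardt, so it is mainly of interest when no least-squares solver is at hand.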