WGPDFIT Parameter estimates for Generalized Pareto data CALL: [phat,cov] = wgpdfit(data,method,plotflag) phat = Estimated parameters (see wgpdcdf) [k s] = [shape scale] cov = covariance matrix of estimates data = an one-dimensional data set method = a string, describing the method of estimation 'pkd' = Pickands' estimator (default) 'pwm' = Probability Weighted Moments 'mom' = Moment method 'ml' = Maximum Likelihood method plotflag = 1, plot the empiricial distribution function and the estimated cdf (default) = 0, do not plot Estimates parameters in the Generalized Pareto Distribution (GPD). The default method is PKD since it works for all values of the shape parameter. The ML is valid when shape<=1, the PWM when shape>-0.5, the MOM when shape>-0.25. The variances of the ML estimates are usually smaller than those of the other estimators. However, for small sample sizes it is recommended to use the PWM or MOM, if they are valid. Example: sample = wgpdrnd(0.3,1,0,200,1); [phat,cov] = wgpdfit(sample,'pkd') [phat,cov] = wgpdfit(sample,'ml') See also wgpdcdf, wgevfit, empdistr
Computes and plots the empirical CDF | |
Generalized Pareto cumulative distribution function | |
Parameter estimates for Generalized Pareto data | |
Internal routine for wgpdfit (ML estimates for GPD data) | |
Remove trailing blanks. | |
Display message and abort function. | |
Check if variables or functions are defined. | |
Scalar nonlinear zero finding. | |
Hold current graph. | |
Linearly spaced vector. | |
Convert string to lowercase. | |
Average or mean value. | |
Not-a-Number. | |
Convert number to string. (Fast version) | |
Create/alter OPTIM OPTIONS structure. | |
Linear plot. | |
Start timer(s) running. | |
Standard deviation. | |
Compare strings. | |
Graph title. | |
Display warning message; disable or enable warning messages. |
% CHAPTER5 contains the commands used in Chapter 5 of the tutorial | |
Extrapolate level crossing spectrum | |
Extrapolate level crossing spectrum | |
Extrapolates a sequence of turning points. | |
Parameter estimates for Generalized Pareto data |
001 function [parms,cov] = wgpdfitwgpdfit(data,method,plotflag) 002 %WGPDFIT Parameter estimates for Generalized Pareto data 003 % 004 % CALL: [phat,cov] = wgpdfit(data,method,plotflag) 005 % 006 % phat = Estimated parameters (see wgpdcdf) 007 % [k s] = [shape scale] 008 % cov = covariance matrix of estimates 009 % data = an one-dimensional data set 010 % method = a string, describing the method of estimation 011 % 'pkd' = Pickands' estimator (default) 012 % 'pwm' = Probability Weighted Moments 013 % 'mom' = Moment method 014 % 'ml' = Maximum Likelihood method 015 % 016 % plotflag = 1, plot the empiricial distribution 017 % function and the estimated cdf (default) 018 % = 0, do not plot 019 % 020 % Estimates parameters in the Generalized Pareto Distribution (GPD). 021 % The default method is PKD since it works for all values of the shape 022 % parameter. The ML is valid when shape<=1, the PWM when shape>-0.5, 023 % the MOM when shape>-0.25. The variances of the ML estimates are usually 024 % smaller than those of the other estimators. However, for small sample 025 % sizes it is recommended to use the PWM or MOM, if they are valid. 026 % 027 % Example: 028 % sample = wgpdrnd(0.3,1,0,200,1); 029 % [phat,cov] = wgpdfit(sample,'pkd') 030 % [phat,cov] = wgpdfit(sample,'ml') 031 % 032 % See also wgpdcdf, wgevfit, empdistr 033 034 % References 035 % 036 % Borg, S. (1992) 037 % XS - a statistical program package in Splus for extreme-value 038 % analysis. 039 % Dept. of Mathematical Statistics, Lund University, 1992:E2 040 % 041 % Davidson & Smith (1990) 042 % Models for Exceedances over high Threholds. 043 % Journal of the Royal Statistical Society B,52, pp. 393-442. 044 % 045 % Grimshaw, S. D. (1993) 046 % Computing the Maximum Likelihood Estimates for the Generalized Pareto Distribution. 047 % Technometrics, 35, pp. 185-191. 048 049 % Tested on; Matlab 5.3 050 % History: 051 % Revised by jr 22.12.1999 052 % Modified by PJ 08-Mar-2000 053 % Changed 'pgpd' to 'gpdcdf' for method 'pkd'. 054 % Hjälptext 055 % Added 'hold off' in plotting. 056 % Added line: 'data = data(:)';' 057 % revised ms 14.06.2000 058 % - updated header info 059 % - changed name to wgpdfit (from gpdfit) 060 % - added w* to used WAFO-files 061 % Updated by PJ 22-Jun-2000 062 % Added new method 'ml', maximum likelihood estimation of 063 % parameters in GPD. 064 % Correction by PJ 19-Jul-2000 065 % Found error in the covariance matrix for method 'ml'. Now correct! 066 % Some other small changes in method 'ml' 067 % Correction by PJ 30-Nov-2000 068 % Updated method 'ml'. New method for finding start values for numerical solving. 069 % Updated help text, and some other small changes. 070 % Correction by PJ 11-Dec-2000 071 % Now method 'ml' works with Matlab <5.3. ('fzero' changed format) 072 073 % Check input 074 ni = nargin; 075 no = nargout; 076 error(nargchk(1,3,ni)); 077 078 if ni < 2, method = []; end 079 if ni < 3, plotflag = []; end 080 081 % Default values 082 if isempty(method), method = 'pkd'; end 083 if isempty(plotflag), plotflag = 1; end 084 085 data = data(:)'; % Make sure data is a row vector, PJ 08-Mar-2000 086 n = length(data); 087 method = lower(method); 088 089 if strcmp(method,'pkd'), 090 091 epsilon=1.e-4; 092 xi = -sort(-data); % data in descending order 093 x = sort(data); 094 y = ((1:n)-0.5)/n; 095 096 EDF=[x' y']; 097 098 % Find the M that minimizes diff. between EDF and G, the estimated cdf. 099 n4=floor(n/4); 100 d = 2*ones(1,n4);; 101 102 for M=1:n4, 103 shape = - log((xi(M)-xi(2*M))/(xi(2*M)-xi(4*M)))/log(2); 104 scale = (xi(2*M)-xi(4*M)); 105 106 if (abs(shape) < epsilon), 107 scale = scale/log(2); 108 else 109 scale = scale*shape/(1-0.5^shape); 110 end 111 112 d(M) = max(abs(wgpdcdf(EDF(:,1), shape, scale)-EDF(:,2))); 113 114 end % end M-loop 115 116 least = find(d(:)==min(d)); 117 M = least(1); % possibly more than one M; in that case use the smallest. 118 119 shape = - log((xi(M)-xi(2*M))/(xi(2*M)-xi(4*M)))/log(2); 120 scale = (xi(2*M)-xi(4*M)); 121 122 if (abs(shape) < epsilon), 123 scale = 1000*scale/log(2); 124 else 125 scale = scale*shape/(1-1/(2^shape)); 126 end 127 128 cov=[]; 129 130 elseif strcmp(method,'mom'), 131 132 m=mean(data); s=std(data); 133 shape = ((m/s)^2 - 1)/2; 134 scale = m*((m/s)^2+1)/2; 135 if (shape<=-0.25), 136 cov=ones(2)*NaN; 137 warning([' The estimate of shape (' num2str(shape) ') is not within the range where the Moment estimator is valid (shape>-0.25).']) 138 else 139 Vshape = (1+2*shape)^2*(1+shape+6*shape^2); 140 Covari = scale*(1+2*shape)*(1+4*shape+12*shape^2); 141 Vscale = 2*scale^2*(1+6*shape+12*shape^2); 142 cov_f = (1+shape)^2/(1+2*shape)/(1+3*shape)/(1+4*shape)/n; 143 cov=cov_f*[Vshape Covari; Covari, Vscale]; 144 end 145 146 elseif strcmp(method,'pwm'), 147 148 a0 = mean(data); 149 xio = sort(data); 150 p = n+0.35-(1:n); 151 a1 = sum(p.*xio)/n/n; 152 shape = a0/(a0-2*a1)-2; 153 scale = 2*a0*a1/(a0-2*a1); 154 if (shape <= -0.5), 155 cov=ones(2)*NaN; 156 warning([' The estimate of shape (' num2str(shape) ') is not within the range where the PWM estimator is valid (shape>-0.5).']) 157 else 158 f = 1/(1+2*shape)/(3+2*shape); 159 Vscale = scale^2*(7+18*shape+11*shape^2+2*shape^3); 160 Covari = scale*(2+shape)*(2+6*shape+7*shape^2+2*shape^3); 161 Vshape = (1+shape)*(2+shape)^2*(1+shape+2*shape^2); 162 cov = f/n*[Vshape Covari; Covari, Vscale]; 163 end 164 165 elseif strcmp(method,'ml'), % Maximum Likelihood 166 % See Davidson & Smith (1990) and Grimshaw (1993) 167 168 max_data = max(data); 169 170 % Calculate start values 171 % The start value can not be less than 1/max_data , 172 % since if it happens then the upper limit of the GPD is 173 % lower than the highest data value 174 175 % Change variables, in order to avoid boundary problems in numerical solution 176 % Transformation: x = log(1/max_data - t), -Inf < t < 1/max_data 177 % Inverse Trans.: t = 1/max(data) - exp(x), -Inf < x < Inf 178 179 old_ver = 0; 180 if old_ver 181 % Old version 182 % Find a start value for the zero-search. 183 184 if (nargin<4) % Start values 185 t_start = []; 186 else % start values from input 187 if start<1/max_data 188 t_start=start; 189 else 190 t_start = []; 191 end 192 end 193 194 if isempty(t_start) % compute starting values by pwm or mom 195 parms0 = wgpdfit(data,'pwm',0); 196 t_start= parms0(1)/parms0(2); 197 end 198 if t_start>=1/max_data 199 parms0 = wgpdfit(data,'mom',0); 200 t_start= parms0(1)/parms0(2); 201 end 202 203 if t_start<1/max_data 204 x_start = log(1/max_data - t_start); 205 else 206 x_start = 0; 207 end 208 209 else 210 % New version 211 % Find an interval where the function changes sign. 212 % Gives interval for start value for the zero-search. 213 % Upper and lower limits are given in Grimshaw (1993) 214 215 X1=min(data); Xn=max(data); Xmean = mean(data); 216 Eps = 1e-6/Xmean; 217 t_L = 2*(X1-Xmean)/X1^2; % Lower limit 218 t_U = 1/Xn-Eps; % Upper limit 219 x_L = log(1/max_data - t_L); % Lower limit 220 x_U = log(1/max_data - t_U); % Upper limit 221 222 x=linspace(x_L,x_U,10); 223 for i=1:length(x) 224 f(i)=wgpdfit_ml(x(i),data); 225 end 226 227 I = find(f(1:end-1).*f(2:end) < 0); 228 if isempty(I) 229 x_start = []; 230 else 231 i = I(1); 232 x_start = [x(i) x(i+1)]; 233 end 234 end 235 236 if isempty(x_start) 237 shape=NaN; scale=NaN; cov=ones(2)*NaN; 238 warning([' Can not find an estimate. Probably the ML-estimator does not exist for this data. Try other methods.']) 239 else 240 % Solve the ML-equation 241 if exist('optimset') >= 2 % Function 'optimset' exists ? 242 % For Matlab 5.3 and higher ??? 243 xx = fzero('wgpdfit_ml',x_start,optimset('disp','off'),data); 244 %xx = fzero('wgpdfit_ml',x_start,optimset('disp','iter'),data); 245 else 246 % For Matlab 5.2 and lower ??? 247 xx = fzero('wgpdfit_ml',x_start,[],[],data); 248 end 249 250 251 % Extract estimates 252 [f,shape,scale] = wgpdfit_ml(xx,data); 253 254 % Calculate the covariance matrix by inverting the observed information. 255 if (shape >= 1.0), 256 cov=ones(2)*NaN; 257 warning([' The estimate of shape (' num2str(shape) ') is not within the range where the ML estimator is valid (shape<1).']) 258 elseif (shape >= 0.5), 259 cov=ones(2)*NaN; 260 warning([' The estimate of shape (' num2str(shape) ') is not within the range where the ML estimator is asymptotically normal (shape<1/2).']) 261 else 262 Vshape = 1-shape; 263 Vscale = 2*scale^2; 264 Covari = scale; 265 cov = (1-shape)/n*[Vshape Covari; Covari Vscale]; 266 end 267 end 268 % End ML 269 else 270 error(['Unknown method ' method '.']); 271 end 272 273 % --> The parameter estimates: <-- 274 275 parms = [shape scale]; 276 277 % Plotting, JR 9909 278 if plotflag 279 x = sort(data); 280 empdistr(data), hold on 281 plot(x, wgpdcdf(x,shape,scale)), hold off 282 if strcmp(method,'ml') 283 str = 'ml'; 284 elseif strcmp(method,'pwm') 285 str = 'pwm'; 286 elseif strcmp(method,'mom') 287 str = 'mom'; 288 else 289 str = 'pkd'; 290 end 291 title([deblank(['Empirical and GPD estimated cdf (',str,' method)'])]) 292 end 293
Comments or corrections to the WAFO group