KDEFUN Kernel Density Estimator. CALL: [f, hs] = kdefun(data,options,x1,x2,...,xd) f = kernel density estimate evaluated at x1,x2,...,xd. data = data matrix, size N x D (D = # dimensions) options = kdeoptions-structure or cellvector of named parameters with corresponding values, see kdeoptset for details. x1,x2..= vectors/matrices defining the points to evaluate the density KDEFUN gives a slow, but exact kernel density estimate evaluated at x1,x2,...,xd. Notice that densities close to normality appear to be the easiest for the kernel estimator to estimate and that the degree of estimation difficulty increases with skewness, kurtosis and multimodality. If D > 1 KDE calculates quantile levels by integration. An alternative is to calculate them by ranking the kernel density estimate obtained at the points DATA i.e. use the commands f = kde(data); r = kdefun(data,[],num2cell(data,1)); f.cl = qlevels2(r,f.PL); The first is probably best when estimating the pdf and the latter is the easiest and most robust for multidimensional data when only a visualization of the data is needed. For faster estimates try kdebin. Examples: data = wraylrnd(1,500,1); x = linspace(sqrt(eps),5,55); wnormplot((data).^(.5)) % gives a straight line => L2 = 0.5 reasonable f = kdefun(data,{'L2',.5},x); plot(x,f,x,wraylpdf(x,1),'r') See also kde, mkernel, kdebin
Kernel Density Estimator. | |
Create or alter KDE OPTIONS structure. | |
Multivariate Kernel Function, alternative version. | |
Transforms process X and up to four derivatives | |
Create cell array. | |
Create object or return object class. | |
Deal inputs to outputs. | |
Determinant. | |
Difference and approximate derivative. | |
Display message and abort function. | |
Matrix inverse. | |
True for cell array. | |
True if arrays are numerically equal. | |
Convert string to lowercase. | |
Convert numeric array into cell array. |
Kernel Density Estimator. | |
Kernel Density Estimator. | |
Computes and plots a kernel density estimate of PDF | |
Random sampling from a smoothed empirical distribution |
001 function [f, hs,lambda]= kdefunkdefun(A,options,varargin) 002 %KDEFUN Kernel Density Estimator. 003 % 004 % CALL: [f, hs] = kdefun(data,options,x1,x2,...,xd) 005 % 006 % f = kernel density estimate evaluated at x1,x2,...,xd. 007 % data = data matrix, size N x D (D = # dimensions) 008 % options = kdeoptions-structure or cellvector of named parameters with 009 % corresponding values, see kdeoptset for details. 010 % x1,x2..= vectors/matrices defining the points to evaluate the density 011 % 012 % KDEFUN gives a slow, but exact kernel density estimate evaluated at x1,x2,...,xd. 013 % Notice that densities close to normality appear to be the easiest for the kernel 014 % estimator to estimate and that the degree of estimation difficulty increases with 015 % skewness, kurtosis and multimodality. 016 % 017 % If D > 1 KDE calculates quantile levels by integration. An 018 % alternative is to calculate them by ranking the kernel density 019 % estimate obtained at the points DATA i.e. use the commands 020 % 021 % f = kde(data); 022 % r = kdefun(data,[],num2cell(data,1)); 023 % f.cl = qlevels2(r,f.PL); 024 % 025 % The first is probably best when estimating the pdf and the latter is the 026 % easiest and most robust for multidimensional data when only a visualization 027 % of the data is needed. 028 % 029 % For faster estimates try kdebin. 030 % 031 % Examples: 032 % data = wraylrnd(1,500,1); 033 % x = linspace(sqrt(eps),5,55); 034 % wnormplot((data).^(.5)) % gives a straight line => L2 = 0.5 reasonable 035 % f = kdefun(data,{'L2',.5},x); 036 % plot(x,f,x,wraylpdf(x,1),'r') 037 % 038 % See also kde, mkernel, kdebin 039 040 % Reference: 041 % B. W. Silverman (1986) 042 % 'Density estimation for statistics and data analysis' 043 % Chapman and Hall , pp 100-110 044 % 045 % Wand, M.P. and Jones, M.C. (1995) 046 % 'Kernel smoothing' 047 % Chapman and Hall, pp 43--45 048 049 050 051 052 %Tested on: matlab 5.2 053 % History: 054 % revised pab Feb2004 055 % -options moved into a structure 056 % revised pab Dec2003 057 % -removed some code 058 % revised pab 27.04.2001 059 % - changed call from mkernel to mkernel2 (increased speed by 10%) 060 % revised pab 01.01.2001 061 % - added the possibility that L2 is a cellarray of parametric 062 % or non-parametric transformations (secret option) 063 % revised pab 14.12.1999 064 % - fixed a small error in example in help header 065 % revised pab 28.10.1999 066 % - added L2 067 % revised pab 21.10.99 068 % - added alpha to input arguments 069 % - made it fully general for d dimensions 070 % - HS may be a smoothing matrix 071 % revised pab 21.09.99 072 % - adapted from kdetools by Christian Beardah 073 074 defaultoptions = kdeoptset; 075 % If just 'defaults' passed in, return the default options in g 076 if ((nargin==1) & (nargout <= 1) & isequal(A,'defaults')), 077 f = defaultoptions; 078 return 079 end 080 error(nargchk(1,inf, nargin)) 081 082 [n, d]=size(A); % Find dimensions of A, 083 % n=number of data points, 084 % d=dimension of the data. 085 if (nargin<2 | isempty(options)) 086 options = defaultoptions; 087 else 088 switch lower(class(options)) 089 case {'char','struct'}, 090 options = kdeoptset(defaultoptions,options); 091 case {'cell'} 092 093 options = kdeoptset(defaultoptions,options{:}); 094 otherwise 095 error('Invalid options') 096 end 097 end 098 kernel = options.kernel; 099 h = options.hs; 100 alpha = options.alpha; 101 L2 = options.L2; 102 hsMethod = options.hsMethod; 103 104 if isempty(h) 105 h=zeros(1,d); 106 end 107 108 L22 = cell(1,d); 109 k3=[]; 110 if isempty(L2) 111 L2=ones(1,d); % default no transformation 112 elseif iscell(L2) % cellarray of non-parametric and parametric transformations 113 Nl2 = length(L2); 114 if ~(Nl2==1|Nl2==d), error('Wrong size of L2'), end 115 [L22{1:d}] = deal(L2{1:min(Nl2,d)}); 116 L2 = ones(1,d); % default no transformation 117 for ix=1:d, 118 if length(L22{ix})>1, 119 k3=[k3 ix]; % Non-parametric transformation 120 else 121 L2(ix) = L22{ix}; % Parameter to the Box-Cox transformation 122 end 123 end 124 elseif length(L2)==1 125 L2=L2(:,ones(1,d)); 126 end 127 128 amin=min(A); 129 if any((amin(L2~=1)<=0)) , 130 f=[]; 131 error('DATA cannot be negative or zero when L2~=1') 132 end 133 134 135 nv=length(varargin); 136 if nv<d, 137 error('some or all of the evaluation points x1,x2,...,xd is missing') 138 end 139 140 xsiz = size(varargin{1}); % remember size of input 141 Nx = prod(xsiz); 142 X = zeros(Nx,d); 143 for ix=1:min(nv,d), 144 if (any(varargin{ix}<=0) & (L2(ix)~=1)), 145 f=[]; 146 error('xi cannot be negative or zero when L2~=1') 147 end 148 X(:,ix)=varargin{ix}(:); % make sure it is a column vector 149 end 150 151 152 %new call 153 lX = X; %zeros(Nx,d); 154 lA = A; %zeros(size(A)); 155 156 k1 = find(L2==0); % logaritmic transformation 157 if any(k1) 158 lA(:,k1)=log(A(:,k1)); 159 lX(:,k1)=log(X(:,k1)); 160 end 161 k2=find(L2~=0 & L2~=1); % power transformation 162 if any(k2) 163 lA(:,k2)=sign(L2(ones(n,1),k2)).*A(:,k2).^L2(ones(n,1),k2); 164 lX(:,k2)=sign(L2(ones(Nx,1),k2)).*X(:,k2).^L2(ones(Nx,1),k2); 165 end 166 % Non-parametric transformation 167 for ix = k3, 168 lA(:,ix) = tranproc(A(:,ix),L22{ix}); 169 lX(:,ix) = tranproc(X(:,ix),L22{ix}); 170 end 171 172 173 hsiz=size(h); 174 if (min(hsiz)==1)|(d==1) 175 if max(hsiz)==1, 176 h=h*ones(1,d); 177 else 178 h=reshape(h,[1,d]); % make sure it has the correct dimension 179 end; 180 ind=find(h<=0); 181 if any(ind) % If no value of h has been specified by the user then 182 h(ind)=feval(hsMethod,lA(:,ind),kernel); % calculate automatic values. 183 end 184 deth = prod(h); 185 else % fully general smoothing matrix 186 deth = det(h); 187 if deth<=0 188 error('bandwidth matrix h must be positive definit') 189 end 190 end 191 192 if alpha>0 193 Xn = num2cell(lA,1); 194 opt1 = kdeoptset('kernel',kernel,'hs',h,'alpha',0,'L2',1); 195 f2 = kdefun(lA,opt1,Xn{:}); % get a pilot estimate by regular KDE (alpha=0) 196 g = exp(sum(log(f2))/n); 197 198 lambda=(f2(:)/g).^(-alpha); 199 else 200 lambda=ones(n,1); 201 end 202 203 204 205 206 207 f=zeros(Nx,1); 208 if (min(hsiz)==1)|(d==1) 209 for ix=1:n, % Sum over all data points 210 Avec=lA(ix,:); 211 Xnn=(lX-Avec(ones(Nx,1),:))./(h(ones(Nx,1),:) *lambda(ix)); 212 f = f + mkernel2(Xnn,kernel)/lambda(ix)^d; 213 end 214 else % fully general 215 h1=inv(h); 216 for ix=1:n, % Sum over all data points 217 Avec=lA(ix,:); 218 Xnn=(lX-Avec(ones(Nx,1),:))*(h1/lambda(ix)); 219 f = f + mkernel2(Xnn,kernel)/lambda(ix)^d; 220 end 221 end 222 f=f/(n*deth); 223 224 % transforming back 225 if any(k1), % L2=0 i.e. logaritmic transformation 226 for ix=k1 227 f=f./X(:,ix); 228 end 229 if any(max(abs(diff(f)))>10) 230 disp('Warning: Numerical problems may have occured due to the logaritmic') 231 disp('transformation. Check the KDE for spurious spikes') 232 end 233 end 234 if any(k2) % L2~=0 i.e. power transformation 235 for ix=k2 236 f=f.*(X(:,ix).^(L2(ix)-1))*L2(ix)*sign(L2(ix)); 237 end 238 if any(max(abs(diff(f)))>10) 239 disp('Warning: Numerical problems may have occured due to the power') 240 disp('transformation. Check the KDE for spurious spikes') 241 end 242 end 243 if any(k3), % non-parametric transformation 244 oneC = ones(Nx,1); 245 for ix=k3 246 gn = L22{ix}; 247 %Gn = fliplr(L22{ix}); 248 %x0 = tranproc(lX(:,ix),Gn); 249 if any(isnan(X(:,ix))), 250 error('The transformation does not have a strictly positive derivative.') 251 end 252 hg1 = tranproc([X(:,ix) oneC],gn); 253 der1 = abs(hg1(:,2)); % dg(X)/dX = 1/(dG(Y)/dY) 254 % alternative 2 255 %pp = smooth(Gn(:,1),Gn(:,2),1,[],1); 256 %dpp = diffpp(pp); 257 %der1 = 1./abs(ppval(dpp,f.x{ix})); 258 % Alternative 3 259 %pp = smooth(gn(:,1),gn(:,2),1,[],1); 260 %dpp = diffpp(pp); 261 %%plot(hg1(:,1),der1-abs(ppval(dpp,x0))) 262 %der1 = abs(ppval(dpp,x0)); 263 if any(der1<=0), 264 error(['The transformation must have a strictly positive derivative']) 265 end 266 f = f.*der1; 267 end 268 if any(max(abs(diff(f)))>10) 269 disp('Warning: Numerical problems may have occured due to the power') 270 disp('transformation. Check the KDE for spurious spikes') 271 end 272 end 273 274 f=reshape(f,xsiz); % restore original shape 275 if nargout>1 276 hs=h; 277 end 278 279 280 281 282 283 284 285 286
Comments or corrections to the WAFO group