Zoltan Szabo committed d0b19a8

K and L divergence estimation: added; see 'DK_DKL_initialization.m', 'DK_DKL_estimation.m', 'DL_DKL_initialization.m', 'DL_DKL_estimation.m'. Handling 'Y==Q' in case of co.kNNmethod = 'knnsearch': included; see 'kNN_squared_distances.m'. Dimension verification: added to (i) meta estimators, see 'ITsallis_DTsallis_estimation.m', 'DEnergyDist_DMMD_estimation.m', 'DJdistance_estimation.m', 'DJensenRenyi_HRenyi_estimation.m', 'DJensenShannon_HShannon_estimation.m', 'DKL_CCE_HShannon_estimation.m'; (ii) utilities, see 'estimate_Dtemp1.m', 'estimate_Dtemp2.m'.


Files changed (15)

+v0.37 (May 12, 2013):
+-K divergence estimation: added; see 'DK_DKL_initialization.m' and 'DK_DKL_estimation.m'.
+-L divergence estimation: added; see 'DL_DKL_initialization.m' and 'DL_DKL_estimation.m'.
+-Handling 'Y==Q' in case of co.kNNmethod = 'knnsearch': included; see 'kNN_squared_distances.m'.
+-Dimension verification: added to 
+(i) meta estimators, see 'ITsallis_DTsallis_estimation.m', 'DEnergyDist_DMMD_estimation.m', 'DJdistance_estimation.m', 'DJensenRenyi_HRenyi_estimation.m', 'DJensenShannon_HShannon_estimation.m', 'DKL_CCE_HShannon_estimation.m'.
+(ii) utilities, see 'estimate_Dtemp1.m' and 'estimate_Dtemp2.m'.
+
 v0.36 (Apr 26, 2013):
 -Jensen-Renyi divergence estimation: added; see 'DJensenRenyi_HRenyi_initialization.m' and 'DJensenRenyi_HRenyi_estimation.m'.
 -Jensen-Shannon divergence estimation: added; see 'DJensenShannon_HShannon_initialization.m' and 'DJensenShannon_HShannon_estimation.m'.
 
 - `entropy (H)`: Shannon entropy, Rényi entropy, Tsallis entropy (Havrda and Charvát entropy), complex entropy,
 - `mutual information (I)`: generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon mutual information, L2 mutual information, Rényi mutual information, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi, Schweizer-Wolff's sigma and kappa, complex mutual information, Cauchy-Schwartz quadratic mutual information, Euclidean distance based quadratic mutual information, distance covariance, distance correlation, approximate correntropy independence measure,
-- `divergence (D)`: Kullback-Leibler divergence (relative entropy; I directed divergence), L2 divergence, Rényi divergence, Tsallis divergence, Hellinger distance, Bhattacharyya distance, maximum mean discrepancy (kernel distance, an integral probability metric), J-distance (symmetrised Kullback-Leibler divergence), Cauchy-Schwartz divergence, Euclidean distance based divergence, energy distance (especially the Cramér-von Mises distance), Jensen-Shannon divergence, Jensen-Rényi divergence,
+- `divergence (D)`: Kullback-Leibler divergence (relative entropy, I directed divergence), L2 divergence, Rényi divergence, Tsallis divergence, Hellinger distance, Bhattacharyya distance, maximum mean discrepancy (kernel distance, an integral probability metric), J-distance (symmetrised Kullback-Leibler divergence, J divergence), Cauchy-Schwartz divergence, Euclidean distance based divergence, energy distance (especially the Cramér-von Mises distance), Jensen-Shannon divergence, Jensen-Rényi divergence, K divergence, L divergence,
 - `association measures (A)`, including `measures of concordance`: multivariate extensions of Spearman's rho (Spearman's rank correlation coefficient, grade correlation coefficient), correntropy, centered correntropy, correntropy coefficient, correntropy induced metric, centered correntropy induced metric, multivariate extension of Blomqvist's beta (medial correlation coefficient), multivariate conditional version of Spearman's rho, lower/upper tail dependence via conditional Spearman's rho,
 - `cross quantities (C)`: cross-entropy.
 
 
 **Download** the latest release: 
 
-- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.36_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.36_code.tar.bz2), 
-- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.36_documentation.pdf).
+- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.37_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.37_code.tar.bz2), 
+- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.37_documentation.pdf).
 
 

code/H_I_D_A_C/meta_estimators/DEnergyDist_DMMD_estimation.m

 
 %co.mult:OK.
 
+%verification:
+    if size(Y1,1)~=size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
 D =  2 * ( D_estimation(Y1,Y2,co.member_co) )^2;
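
The same dimension guard is added to each of the meta estimators below; a minimal sketch of how it fires (the Gaussian data and sample sizes are illustrative; 'D_initialization' is ITE's generic divergence-estimator factory):

    co = D_initialization('EnergyDist_DMMD',1); %meta divergence estimator object
    Y1 = randn(3,1000); %samples of dimension 3
    Y2 = randn(2,1000); %samples of dimension 2: mismatch
    D = D_estimation(Y1,Y2,co); %-> error: 'The dimension of the samples in Y1 and Y2 must be equal.'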

code/H_I_D_A_C/meta_estimators/DJdistance_estimation.m

 
 %co.mult:OK.
 
+%verification:
+    if size(Y1,1)~=size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
 D_J =  D_estimation(Y1,Y2,co.member_co) + D_estimation(Y2,Y1,co.member_co);

code/H_I_D_A_C/meta_estimators/DJensenRenyi_HRenyi_estimation.m

 
 %co.mult:OK.
 
+%verification:
+    if size(Y1,1)~=size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
 w = co.w;
 mixtureY = mixture_distribution(Y1,Y2,w);
 D_JR =  H_estimation(mixtureY,co.member_co) - (w(1)*H_estimation(Y1,co.member_co) + w(2)*H_estimation(Y2,co.member_co));

code/H_I_D_A_C/meta_estimators/DJensenShannon_HShannon_estimation.m

 
 %co.mult:OK.
 
+%verification:
+    if size(Y1,1)~=size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
 w = co.w;
 mixtureY = mixture_distribution(Y1,Y2,w);
 D_JS =  H_estimation(mixtureY,co.member_co) - (w(1)*H_estimation(Y1,co.member_co) + w(2)*H_estimation(Y2,co.member_co));

code/H_I_D_A_C/meta_estimators/DKL_CCE_HShannon_estimation.m

 
 %co.mult:OK.
 
+%verification:
+    if size(Y1,1)~=size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
 CE = C_estimation(Y1,Y2,co.CE_member_co);
 H = H_estimation(Y1,co.H_member_co);
 D =  CE - H;

code/H_I_D_A_C/meta_estimators/DK_DKL_estimation.m

+function [D_K] = DK_DKL_estimation(Y1,Y2,co)
+%Estimates the K divergence of Y1 and Y2 using the relation: D_K(f_1,f_2) = D(f_1,(f_1+f_2)/2), where D denotes the Kullback-Leibler divergence.
+%
+%Note:
+%   1)We use the naming convention 'D<name>_estimation' to ease embedding new divergence estimation methods.
+%   2)This is a meta method: the Kullback-Leibler divergence estimator can be arbitrary.
+%
+%INPUT:
+%  Y1: Y1(:,t) is the t^th sample from the first distribution.
+%  Y2: Y2(:,t) is the t^th sample from the second distribution.
+%  co: divergence estimator object.
+%
+%REFERENCE:
+%  Jianhua Lin. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37:145-151, 1991.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+%verification:
+    [dY1,num_of_samplesY1] = size(Y1);
+    [dY2,num_of_samplesY2] = size(Y2);
+    if dY1~=dY2
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
+%mixture of Y1 and Y2 with 1/2, 1/2 weights:
+    w = [1/2;1/2];
+    %samples to the mixture (second part of Y1 and Y2; =:Y1m, Y2m):
+        num_of_samplesY1m = floor(num_of_samplesY1/2); %(max) number of samples to the mixture from Y1
+        num_of_samplesY2m = floor(num_of_samplesY2/2); %(max) number of samples to the mixture from Y2
+        Y1m = Y1(:,num_of_samplesY1m+1:end);
+        Y2m = Y2(:,num_of_samplesY2m+1:end);
+    mixtureY = mixture_distribution(Y1m,Y2m,w);
+
+D_K =  D_estimation(Y1(:,1:num_of_samplesY1m),mixtureY,co.member_co);

code/H_I_D_A_C/meta_estimators/DK_DKL_initialization.m

+function [co] = DK_DKL_initialization(mult)
+%Initialization of the K divergence estimator, defined according to the relation: D_K(f_1,f_2) = D(f_1,(f_1+f_2)/2), where D denotes the Kullback-Leibler divergence.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'D<name>_initialization' to ease embedding new divergence estimation methods.
+%   3)This is a meta method: the Kullback-Leibler divergence estimator can be arbitrary.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'K_DKL';
+    co.mult = mult;
+    
+%other fields:
+    co.member_name = 'KL_kNN_k'; %you can change it to any Kullback-Leibler divergence estimator
+    co.member_co = D_initialization(co.member_name,mult);
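
A quick usage sketch for the new estimator pair (toy Gaussian data, chosen only for illustration, not an accuracy test):

    co = DK_DKL_initialization(1); %K divergence via the default 'KL_kNN_k' member
    Y1 = randn(3,5000); %samples from f_1
    Y2 = randn(3,5000) + 1; %samples from f_2
    D_K = DK_DKL_estimation(Y1,Y2,co);

Note that half of Y1 and half of Y2 are diverted to form the mixture sample, so the member Kullback-Leibler estimator effectively works with about half of the provided samples.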

code/H_I_D_A_C/meta_estimators/DL_DKL_estimation.m

+function [D_L] = DL_DKL_estimation(Y1,Y2,co)
+%Estimates the L divergence of Y1 and Y2 using the relation: D_L(f_1,f_2) = D(f_1,(f_1+f_2)/2) + D(f_2,(f_1+f_2)/2), where D denotes the Kullback-Leibler divergence.
+%
+%Note:
+%   1)We use the naming convention 'D<name>_estimation' to ease embedding new divergence estimation methods.
+%   2)This is a meta method: the Kullback-Leibler divergence estimator can be arbitrary.
+%
+%INPUT:
+%  Y1: Y1(:,t) is the t^th sample from the first distribution.
+%  Y2: Y2(:,t) is the t^th sample from the second distribution.
+%  co: divergence estimator object.
+%
+%REFERENCE:
+%  Jianhua Lin. Divergence measures based on the Shannon entropy. IEEE Transactions on Information Theory, 37:145-151, 1991.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+%verification:
+    [dY1,num_of_samplesY1] = size(Y1);
+    [dY2,num_of_samplesY2] = size(Y2);
+    if dY1~=dY2
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
+%mixture of Y1 and Y2 with 1/2, 1/2 weights:
+    w = [1/2;1/2];
+    %samples to the mixture (second part of Y1 and Y2; =:Y1m, Y2m):
+        num_of_samplesY1m = floor(num_of_samplesY1/2); %(max) number of samples to the mixture from Y1
+        num_of_samplesY2m = floor(num_of_samplesY2/2); %(max) number of samples to the mixture from Y2
+        Y1m = Y1(:,num_of_samplesY1m+1:end);
+        Y2m = Y2(:,num_of_samplesY2m+1:end);
+    mixtureY = mixture_distribution(Y1m,Y2m,w);
+
+D_L =  D_estimation(Y1(:,1:num_of_samplesY1m),mixtureY,co.member_co) + D_estimation(Y2(:,1:num_of_samplesY2m),mixtureY,co.member_co);

code/H_I_D_A_C/meta_estimators/DL_DKL_initialization.m

+function [co] = DL_DKL_initialization(mult)
+%Initialization of the L divergence estimator, defined according to the relation: D_L(f_1,f_2) = D(f_1,(f_1+f_2)/2) + D(f_2,(f_1+f_2)/2), where D denotes the Kullback-Leibler divergence.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'D<name>_initialization' to ease embedding new divergence estimation methods.
+%   3)This is a meta method: the Kullback-Leibler divergence estimator can be arbitrary.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'L_DKL';
+    co.mult = mult;
+    
+%other fields:
+    co.member_name = 'KL_kNN_k'; %you can change it to any Kullback-Leibler divergence estimator
+    co.member_co = D_initialization(co.member_name,mult);
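
The L divergence counterpart follows the same pattern; since D_L(f_1,f_2) = D_K(f_1,f_2) + D_K(f_2,f_1), the estimate is symmetric in Y1 and Y2. With the uniform weights used here, Lin (1991) gives D_L = twice the Jensen-Shannon divergence, which offers a rough sanity check against 'DJensenShannon_HShannon_estimation.m':

    co = DL_DKL_initialization(1);
    Y1 = randn(3,5000);
    Y2 = randn(3,5000) + 1;
    D_L = DL_DKL_estimation(Y1,Y2,co); %symmetric in (Y1,Y2), unlike D_K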

code/H_I_D_A_C/meta_estimators/ITsallis_DTsallis_estimation.m

 
 %co.mult:OK.
 
+%verification:
+    if sum(ds) ~= size(Y,1)
+        error('The subspace dimensions are not compatible with Y.');
+    end
+
 [Y1,Y2] = div_sample_generation(Y,ds);
 I = D_estimation(Y1,Y2,co.member_co);
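
For this mutual information meta estimator the compatibility condition is sum(ds) == size(Y,1); a conforming input, for illustration:

    ds = [2;3]; %two subspaces, of dimension 2 and 3
    Y = randn(sum(ds),2000); %passes the check: sum(ds) == size(Y,1) == 5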

code/H_I_D_A_C/utilities/estimate_Dtemp1.m

 function [Dtemp1] = estimate_Dtemp1(X,Y,co)
-%Estimates Dtemp1 = \int p^{\alpha}(x)q^{1-\alpha}(x)dx, the Renyi and the Tsallis divergences are simple functions of this quantity.
+%Estimates Dtemp1 = \int p^{\alpha}(u)q^{1-\alpha}(u)du; the Renyi and the Tsallis divergences are simple functions of this quantity.
 %
 %INPUT:
-%   X: X(:,t) is the t^th sample from the first distribution.
-%   Y: Y(:,t) is the t^th sample from the second distribution.
+%   X: X(:,t) is the t^th sample from the first distribution (X~p).
+%   Y: Y(:,t) is the t^th sample from the second distribution (Y~q).
 %  co: cost object (structure).
 %
 %Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
 %
 %You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
 
-[d,num_of_samplesY] = size(Y);
-[d,num_of_samplesX] = size(X);
+%initialization:
+    [dY,num_of_samplesY] = size(Y);
+    [dX,num_of_samplesX] = size(X);
+
+%verification:
+    if dX~=dY
+        error('The dimension of the samples in X and Y must be equal.');
+    end
+
+%initialization - continued:
+    d = dX; %=dY
 
 squared_distancesXX = kNN_squared_distances(X,X,co,1);
 squared_distancesYX = kNN_squared_distances(Y,X,co,0);
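
For reference, the Rényi and Tsallis divergences are recovered from Dtemp1 by the standard identities (this is how the kNN-based estimators, e.g. 'DRenyi_kNN_k_estimation.m' and 'DTsallis_kNN_k_estimation.m', use this helper):

    D_{R,\alpha}(p\|q) = \frac{1}{\alpha-1}\log\int p^{\alpha}(u)\,q^{1-\alpha}(u)\,du = \frac{\log(\mathrm{Dtemp1})}{\alpha-1}, \qquad D_{T,\alpha}(p\|q) = \frac{\mathrm{Dtemp1}-1}{\alpha-1}.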

code/H_I_D_A_C/utilities/estimate_Dtemp2.m

 function [Dtemp2] = estimate_Dtemp2(X,Y,co)
-%Estimates Dtemp2 = \int p^a(x)q^b(x)dx; the Hellinger distance and the Bhattacharyya distance are simple functions of this quantity.
+%Estimates Dtemp2 = \int p^a(u)q^b(u)p(u)du; the Hellinger distance and the Bhattacharyya distance are simple functions of this quantity.
 %
 %INPUT:
-%   X: X(:,t) is the t^th sample from the first distribution.
-%   Y: Y(:,t) is the t^th sample from the second distribution.
+%   X: X(:,t) is the t^th sample from the first distribution (X~p).
+%   Y: Y(:,t) is the t^th sample from the second distribution (Y~q).
 %  co: cost object (structure).
 %
 %Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
 %You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
 
 %initialization:
-    [d,num_of_samplesY] = size(Y);
-    [d,num_of_samplesX] = size(X);
+    [dY,num_of_samplesY] = size(Y);
+    [dX,num_of_samplesX] = size(X);
+    
+%verification:
+    if dX~=dY
+        error('The dimension of the samples in X and Y must be equal.');
+    end
+
+%initialization - continued:
+    d = dX; %=dY
     a = co.a;
     b = co.b;
     k = co.k;
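
With the parameter choice a = -1/2, b = 1/2 (the values relevant for the Hellinger and Bhattacharyya estimators; an assumption about the caller, not something this helper enforces), Dtemp2 reduces to the Bhattacharyya coefficient, from which both distances follow:

    \mathrm{Dtemp2} = \int p^{-1/2}(u)\,q^{1/2}(u)\,p(u)\,du = \int \sqrt{p(u)\,q(u)}\,du = BC(p,q), \qquad D_H = \sqrt{1-BC}, \quad D_B = -\log BC.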

code/H_I_D_A_C/utilities/kNN_squared_distances.m

 
 switch co.kNNmethod
     case 'knnFP1'%fast pairwise distance computation and C++ partial sort
-        if Y_equals_to_Q %'Y==Q'
+        if Y_equals_to_Q %'Y==Q' => exclude the points themselves
             [squared_distances,indices] = knn(Q, Y, max(co.k)+1);%assumption below:max(co.k)+1 <= size(Q,1)[=size(Y,1)]
             squared_distances = squared_distances(2:end,:);
             indices = int32(indices(2:end,:));
             indices = int32(I(1:max(co.k),:));
         end
     case 'knnsearch' %Statistics Toolbox:Matlab
-        [indices,distances] = knnsearch(Y.',Q.','K',max(co.k),'NSMethod',co.NSmethod); %[double,...
-        indices = int32(indices.');%.': to be compatible with 'ANN'
-        squared_distances = (distances.').^2;%distances -> squared distances; .': to be compatible with 'ANN'
+        if Y_equals_to_Q %'Y==Q' => exclude the points themselves
+            [indices,distances] = knnsearch(Y.',Q.','K',max(co.k)+1,'NSMethod',co.NSmethod); %[double,...
+            indices = int32(indices(:,2:end).');%.': to be compatible with 'ANN'
+            squared_distances = (distances(:,2:end).').^2;%distances -> squared distances; .': to be compatible with 'ANN'
+        else
+            [indices,distances] = knnsearch(Y.',Q.','K',max(co.k),'NSMethod',co.NSmethod); %[double,...
+            indices = int32(indices.');%.': to be compatible with 'ANN'
+            squared_distances = (distances.').^2;%distances -> squared distances; .': to be compatible with 'ANN'
+        end
     case 'ANN'%ANN library/wrapper
         if working_environment_Matlab
             ann_object = ann(Y);
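
A quick way to see why the 'Y==Q' branch queries max(co.k)+1 neighbours and then drops the first column (assumes Matlab's Statistics Toolbox):

    Y = randn(2,100); %d x N, ITE's sample convention
    [idx,dist] = knnsearch(Y.',Y.','K',3); %query Y against itself
    isequal(idx(:,1).',1:100) %-> true: each point's first neighbour is itself...
    all(dist(:,1)==0) %-> true: ...at distance 0; columns 2:end are the genuine neighbours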