Commits

Zoltán Szabó committed eb0961c

f-divergence estimation based on second-order Taylor expansion + Pearson chi square divergence, and Shannon mutual information estimation based on KL divergence: added; see 'Df_DChiSquare_initialization.m', 'Df_DChiSquare_estimation.m', 'IShannon_DKL_initialization.m', 'IShannon_DKL_estimation.m'. Quick tests updated with the new estimators; see 'quick_test_IShannon.m', 'quick_test_Iimreg.m', 'quick_test_Iindependence.m', 'quick_test_Dequality.m'. ARfit download: updated to the new 'http://clidyn.ethz.ch/arfit/arfit.zip' url; see 'ITE_install.m'.


Files changed (12)

+v0.53 (Feb 2, 2014):
+
+-f-divergence estimation based on second-order Taylor expansion + Pearson chi square divergence: added; see 'Df_DChiSquare_initialization.m', 'Df_DChiSquare_estimation.m'.
+-Shannon mutual information estimation based on KL divergence: added; see 'IShannon_DKL_initialization.m', 'IShannon_DKL_estimation.m'. 
+-Quick tests updated with the new estimators; see 'quick_test_IShannon.m', 'quick_test_Iimreg.m', 'quick_test_Iindependence.m', 'quick_test_Dequality.m'.
+-ARfit download: updated to the new 'http://clidyn.ethz.ch/arfit/arfit.zip' url; see 'ITE_install.m'.
+
 v0.52 (Jan 9, 2014):
 
 -Sharma-Mittal divergence estimation: 
 
 - `entropy (H)`: Shannon entropy, Rényi entropy, Tsallis entropy (Havrda and Charvát entropy), complex entropy, Phi-entropy (f-entropy), Sharma-Mittal entropy,
 - `mutual information (I)`: generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon mutual information (total correlation, multi-information), L2 mutual information, Rényi mutual information, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi, Schweizer-Wolff's sigma and kappa, complex mutual information, Cauchy-Schwartz quadratic mutual information, Euclidean distance based quadratic mutual information, distance covariance, distance correlation, approximate correntropy independence measure, chi-square mutual information (Hilbert-Schmidt norm of the normalized cross-covariance operator, squared-loss mutual information,  mean square contingency), 
-- `divergence (D)`: Kullback-Leibler divergence (relative entropy, I directed divergence), L2 divergence, Rényi divergence, Tsallis divergence, Hellinger distance, Bhattacharyya distance, maximum mean discrepancy (kernel distance), J-distance (symmetrised Kullback-Leibler divergence, J divergence), Cauchy-Schwartz divergence, Euclidean distance based divergence, energy distance (specially the Cramer-Von Mises distance), Jensen-Shannon divergence, Jensen-Rényi divergence, K divergence, L divergence, certain f-divergences (Csiszár-Morimoto divergence, Ali-Silvey distance), non-symmetric Bregman distance (Bregman divergence), Jensen-Tsallis divergence, symmetric Bregman distance, Pearson chi square divergence (chi square distance), Sharma-Mittal divergence,
+- `divergence (D)`: Kullback-Leibler divergence (relative entropy, I directed divergence), L2 divergence, Rényi divergence, Tsallis divergence, Hellinger distance, Bhattacharyya distance, maximum mean discrepancy (kernel distance), J-distance (symmetrised Kullback-Leibler divergence, J divergence), Cauchy-Schwartz divergence, Euclidean distance based divergence, energy distance (specially the Cramer-Von Mises distance), Jensen-Shannon divergence, Jensen-Rényi divergence, K divergence, L divergence, f-divergence (Csiszár-Morimoto divergence, Ali-Silvey distance), non-symmetric Bregman distance (Bregman divergence), Jensen-Tsallis divergence, symmetric Bregman distance, Pearson chi square divergence (chi square distance), Sharma-Mittal divergence,
 - `association measures (A)`, including `measures of concordance`: multivariate extensions of Spearman's rho (Spearman's rank correlation coefficient, grade correlation coefficient), correntropy, centered correntropy, correntropy coefficient, correntropy induced metric, centered correntropy induced metric, multivariate extension of Blomqvist's beta (medial correlation coefficient), multivariate conditional version of Spearman's rho, lower/upper tail dependence via conditional Spearman's rho,
 - `cross quantities (C)`: cross-entropy,
-- `kernels on distributions (K)`: expected kernel, Bhattacharyya kernel, probability product kernel, Jensen-Shannon kernel, exponentiated Jensen-Shannon kernel, exponentiated Jensen-Renyi kernel(s), Jensen-Tsallis kernel, exponentiated Jensen-Tsallis kernel(s), and
+- `kernels on distributions (K)`: expected kernel (summation kernel, mean map kernel), Bhattacharyya kernel, probability product kernel, Jensen-Shannon kernel, exponentiated Jensen-Shannon kernel, exponentiated Jensen-Renyi kernel(s), Jensen-Tsallis kernel, exponentiated Jensen-Tsallis kernel(s), and
 - `+some auxiliary quantities`: Bhattacharyya coefficient (Hellinger affinity), alpha-divergence.
 
 ITE offers 
 
 **ITE mailing list**: You can [sign up](https://groups.google.com/d/forum/itetoolbox) here.
 
-**Follow ITE**: on [Bitbucket](https://bitbucket.org/szzoli/ite/follow), on [Twitter](https://twitter.com/ITEtoolbox) to be always up-to-date.
+**Follow ITE**: on [Bitbucket](https://bitbucket.org/szzoli/ite/follow), on [Twitter](https://twitter.com/ITEtoolbox).
 
 * * *
 
 
 **Download** the latest release: 
 
-- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.52_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.52_code.tar.bz2), 
-- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.52_documentation.pdf).
+- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.53_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.53_code.tar.bz2), 
+- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.53_documentation.pdf).

code/ITE_install.m

                 end
             %download arfit.zip, extract, delete .zip:
                 disp('ARfit package: downloading, extraction: started.');
-                    [FN,status] = urlwrite('http://www.gps.caltech.edu/~tapio/arfit/arfit.zip','arfit.zip');
+                    %[FN,status] = urlwrite('http://www.gps.caltech.edu/~tapio/arfit/arfit.zip','arfit.zip');
+                    [FN,status] = urlwrite('http://clidyn.ethz.ch/arfit/arfit.zip','arfit.zip'); %new ARfit url
                     if status %downloading: successful
                         unzip(FN,strcat(ITE_code_dir, '/shared/downloaded/ARfit'));
                         delete(FN);%delete the .zip file    

code/estimators/meta_estimators/Df_DChiSquare_estimation.m

+function [D] = Df_DChiSquare_estimation(Y1,Y2,co)
+%function [D] = Df_DChiSquare_estimation(Y1,Y2,co)
+%Estimates the f-divergence (D) of Y1 and Y2 using second-order Taylor expansion of f and Pearson chi square divergence.
+%
+%Note:
+%  1)We use the naming convention 'D<name>_estimation' to ease embedding new divergence estimation methods.
+%  2)This is a meta method: the Pearson chi square divergence estimator can be arbitrary.
+%
+%INPUT:
+%  Y1: Y1(:,t) is the t^th sample from the first distribution.
+%  Y2: Y2(:,t) is the t^th sample from the second distribution. Note: the number of samples in Y1 [=size(Y1,2)] and Y2 [=size(Y2,2)] can be different.
+%  co: divergence estimator object.
+%
+%REFERENCE: 
+%    Frank Nielsen and Richard Nock. On the chi square and higher-order chi distances for approximating f-divergences. IEEE Signal Processing Letters, 21:10-13, 2014.
+%    Neil S. Barnett, Pietro Cerone, Sever Silvestru Dragomir, and A. Sofo. Approximating Csiszar f-divergence by the use of Taylor's formula with integral remainder. Mathematical Inequalities and Applications, 5:417-432, 2002.
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK. The information theoretical quantity of interest can be (and is!) estimated exactly [co.mult=1]; the computational complexity of the estimation is essentially the same as that of the 'up to multiplicative constant' case [co.mult=0]. In other words, the estimation is carried out 'exactly' (instead of up to 'proportionality').
+
+%verification:
+    if size(Y1,1) ~= size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+    
+D = co.H/2 * D_estimation(Y1,Y2,co.member_co); 
+
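The relation this meta estimator exploits is that, for a twice-differentiable convex f with f(1) = 0, D_f(p, q) ≈ f''(1)/2 * χ²(p, q) when p and q are close. A minimal Python sanity check of this relation (not part of the toolbox; the function names are illustrative), using f(t) = t log t, so that D_f is the Kullback-Leibler divergence and f''(1) = 1:

```python
import numpy as np

# Discrete sanity check of the second-order Taylor relation:
# for convex f with f(1) = 0, D_f(p, q) ~= f''(1)/2 * chi^2(p, q) when p ~= q.
# Here f(t) = t*log(t), i.e. Kullback-Leibler divergence, so f''(1) = 1.

def kl_divergence(p, q):
    """D_f with f(t) = t*log(t): the Kullback-Leibler divergence."""
    return float(np.sum(p * np.log(p / q)))

def pearson_chi_square(p, q):
    """Pearson chi square divergence: chi^2(p, q) = sum (p - q)^2 / q."""
    return float(np.sum((p - q) ** 2 / q))

q = np.array([0.25, 0.25, 0.25, 0.25])
p = q + np.array([0.01, -0.01, 0.02, -0.02])  # small perturbation of q

f_second_derivative_at_1 = 1.0  # f''(1) for f(t) = t*log(t)
taylor_estimate = f_second_derivative_at_1 / 2 * pearson_chi_square(p, q)

print(kl_divergence(p, q), taylor_estimate)  # the two values nearly agree
```

The default `co.H = 2` in the initialization below corresponds to f(t) = (t-1)^2, for which the Taylor approximation recovers the Pearson chi square divergence itself.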

code/estimators/meta_estimators/Df_DChiSquare_initialization.m

+function [co] = Df_DChiSquare_initialization(mult,post_init)
+%function [co] = Df_DChiSquare_initialization(mult)
+%function [co] = Df_DChiSquare_initialization(mult,post_init)
+%Initialization of the second-order Taylor expansion and Pearson chi square divergence based f-divergence estimator. 
+%Assumption: f convex and f(1) = 0.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'D<name>_initialization' to ease embedding new divergence estimation methods.
+%   3)This is a meta method: the Pearson chi square divergence estimator can be arbitrary.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes (='exact' estimation), '=0' no (=estimation up to 'proportionality').
+%   post_init: {field_name1,field_value1,field_name2,field_value2,...}; cell array containing the names and the values of the cost object fields that are to be used
+%   (instead of their default values). For further details, see 'post_initialization.m'.
+%OUTPUT:
+%   co: cost object (structure).
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields (following the template structure of the estimators to make uniform usage of the estimators possible):
+    co.name = 'f_DChiSquare';
+    co.mult = mult;
+    
+%other fields:
+    co.H = 2; %=f^{(2)}(1), the second derivative of f at 1.
+    co.member_name = 'ChiSquare_kNN_k';  %you can change it to any Pearson chi square divergence estimator
+    
+%post initialization (put it _before_ initialization of the members in case of a meta estimator):    
+    if nargin==2 %there are given (name,value) cost object fields
+        co = post_initialization(co,post_init);
+    end     
+    
+%initialization of the member(s):
+    co.member_co = D_initialization(co.member_name,mult);

code/estimators/meta_estimators/IShannon_DKL_estimation.m

+function [I] = IShannon_DKL_estimation(Y,ds,co)
+%function [I] = IShannon_DKL_estimation(Y,ds,co)
+%Estimates Shannon mutual information (I) based on Kullback-Leibler divergence. The estimation is carried out according to the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).
+%
+%Note:
+%   1)We use the naming convention 'I<name>_estimation' to ease embedding new mutual information estimation methods.
+%   2)This is a meta method: the Kullback-Leibler divergence estimator can be arbitrary. 
+%
+%INPUT:
+%   Y: Y(:,t) is the t^th sample.
+%  ds: subspace dimensions. ds(m) = dimension of the m^th subspace, m=1,...,M (M=length(ds)).
+%  co: mutual information estimator object.
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK. The information theoretical quantity of interest can be (and is!) estimated exactly [co.mult=1]; the computational complexity of the estimation is essentially the same as that of the 'up to multiplicative constant' case [co.mult=0]. In other words, the estimation is carried out 'exactly' (instead of up to 'proportionality').
+
+%verification:
+    if sum(ds) ~= size(Y,1)
+        error('The subspace dimensions are not compatible with Y.');
+    end
+
+[Y1,Y2] = div_sample_generation(Y,ds);
+I = D_estimation(Y1,Y2,co.member_co);
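The identity the estimator above relies on, I(y^1,...,y^M) = D(f_y, \prod_{m=1}^M f_{y^m}), can be illustrated on a discrete example: Shannon mutual information is the KL divergence between the joint distribution and the product of its marginals. A short Python sketch (not ITE code; names are illustrative):

```python
import numpy as np

# Discrete illustration of I(y^1, y^2) = D_KL(joint, product of marginals).

def shannon_mi_from_kl(joint):
    """Mutual information of a 2-D discrete joint via the KL identity."""
    p1 = joint.sum(axis=1)      # marginal of y^1
    p2 = joint.sum(axis=0)      # marginal of y^2
    product = np.outer(p1, p2)  # product of the marginals
    mask = joint > 0
    return float(np.sum(joint[mask] * np.log(joint[mask] / product[mask])))

# independent variables: the joint factorizes, so I = 0
independent = np.outer([0.3, 0.7], [0.4, 0.6])
# perfectly dependent binary variables: I = log(2)
dependent = np.array([[0.5, 0.0], [0.0, 0.5]])

print(shannon_mi_from_kl(independent))  # ~0
print(shannon_mi_from_kl(dependent))    # ~log(2)
```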

code/estimators/meta_estimators/IShannon_DKL_initialization.m

+function [co] = IShannon_DKL_initialization(mult,post_init)
+%function [co] = IShannon_DKL_initialization(mult)
+%function [co] = IShannon_DKL_initialization(mult,post_init)
+%Initialization of the "meta" Shannon mutual information estimator based on Kullback-Leibler divergence.
+%Mutual information is estimated using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'I<name>_initialization' to ease embedding new mutual information estimation methods.
+%   3)This is a meta method: the Kullback-Leibler divergence estimator can be arbitrary.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes (='exact' estimation), '=0' no (=estimation up to 'proportionality').
+%   post_init: {field_name1,field_value1,field_name2,field_value2,...}; cell array containing the names and the values of the cost object fields that are to be used
+%   (instead of their default values). For further details, see 'post_initialization.m'.
+%OUTPUT:
+%   co: cost object (structure).
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields (following the template structure of the estimators to make uniform usage of the estimators possible):
+    co.name = 'Shannon_DKL';
+    co.mult = mult;
+	
+%other fields:    
+    co.member_name = 'KL_kNN_k'; %you can change it to any Kullback-Leibler divergence estimator
+
+%post initialization (put it _before_ initialization of the members in case of a meta estimator):    
+    if nargin==2 %there are given (name,value) cost object fields
+        co = post_initialization(co,post_init);
+    end  
+    
+%initialization of the member(s):
+    co.member_co = D_initialization(co.member_name,mult);
+   

code/estimators/quick_tests/quick_test_Dequality.m

              %cost_name = 'JensenTsallis_HTsallis';%d>=1
              %cost_name = 'symBregman_DBregman';   %d>=1
              %cost_name = 'BMMD_DMMD_Ustat';       %d>=1
-    
+             %cost_name = 'f_DChiSquare';          %d>=1
+
 %initialization:
     num_of_samples_max = num_of_samples_v(end);
     L = length(num_of_samples_v);

code/estimators/quick_tests/quick_test_HShannon.m

             %analytical value of Shannon entropy:
                 par.cov = C;
                 H = analytical_value_HShannon(distr,par);
-                H =  1/2 * log( (2*pi*exp(1))^d * det(C) ); %equals to: H = 1/2 * log(det(C)) + d/2*log(2*pi) + d/2
         otherwise
             error('Distribution=?');
     end  

code/estimators/quick_tests/quick_test_IShannon.m

     
 %parameters:
     distr = 'normal'; %fixed
-    ds = [1;1]; %subspace dimensions. ds(m) = dimension of the m^th subspace, m=1,...,M (M=length(ds)); M>=2
+    ds = [2;2]; %subspace dimensions. ds(m) = dimension of the m^th subspace, m=1,...,M (M=length(ds)); M>=2
     num_of_samples_v = [1000:1000:20*1000]; %sample numbers used for estimation
     %estimator (of Shannon mutual information), meta:
         cost_name = 'Shannon_HShannon';
+	    %cost_name = 'Shannon_DKL';
     
 %initialization:
     num_of_samples_max = num_of_samples_v(end);

code/estimators/quick_tests/quick_test_Iimreg.m

             %cost_name = 'dCov_IHSIC';       %h>=0
             %cost_name = 'ApprCorrEntr';     %h=0; computationally intensive
             %cost_name = 'ChiSquare_DChiSquare'; %h>=0
+	        %cost_name = 'Shannon_DKL'; %h>=0
          
 %initialization: 
     ds = (2*h+1)^2 *ones(2,1);

code/estimators/quick_tests/quick_test_Iindependence.m

             %cost_name = 'dCov_IHSIC';       %dm>=1,M =2
             %cost_name = 'ApprCorrEntr';     %dm =1,M =2
             %cost_name = 'ChiSquare_DChiSquare'; %dm>=1,M>=2
-        
+            %cost_name = 'Shannon_DKL';          %dm>=1, M=2
+
 %initialization:
     num_of_samples_max = num_of_samples_v(end);
     L = length(num_of_samples_v);