Commits

Zoltán Szabó  committed 5452501

Citing information (JMLR page numbers): updated in the documentation. Shannon entropy and cross-entropy estimation based on maximum likelihood estimation + analytical formula in the chosen exponential family: added; see 'HShannon_expF_initialization.m', 'HShannon_expF_estimation.m', 'CCE_expF_initialization.m', 'CCE_expF_estimation.m'. Quick tests: updated with the new estimators; see 'quick_test_HShannon.m', 'quick_test_Himreg.m' and 'quick_test_CCE.m'.

  • Parent commits 1fe7f4f
  • Tags release-0.55

Files changed (13)

File CHANGELOG.txt

+v0.55 (March 7, 2014):
+
+-Shannon entropy and cross-entropy estimation based on maximum likelihood estimation + analytical formula in the chosen exponential family: added; see 'HShannon_expF_initialization.m', 'HShannon_expF_estimation.m', 'CCE_expF_initialization.m', 'CCE_expF_estimation.m'.
+-Quick tests: updated with the new estimators, see 'quick_test_HShannon.m', 'quick_test_Himreg.m' and 'quick_test_CCE.m'.
+-Citing information (JMLR page numbers): updated in the documentation.
+
 v0.54 (Feb 24, 2014):
 
--Renyi entropy estimation based on maximum likelihood estimation (MLE) + analytical formula in the exponential family: added; see 'HRenyi_expF_initialization.m', 'HRenyi_expF_estimation.m'.
+-Renyi entropy estimation based on maximum likelihood estimation + analytical formula in the exponential family: added; see 'HRenyi_expF_initialization.m', 'HRenyi_expF_estimation.m'.
 -Tsallis entropy estimation based on MLE + analytical formula in the exponential family: added; see 'HTsallis_expF_initialization.m', 'HTsallis_expF_estimation.m'.
 -Quick tests: updated according to the new estimators; see 'quick_test_HRenyi.m', 'quick_test_HTsallis.m', 'quick_test_Himreg.m'.
 
 
 **Download** the latest release: 
 
-- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.54_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.54_code.tar.bz2), 
-- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.54_documentation.pdf).
+- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.55_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.55_code.tar.bz2), 
+- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.55_documentation.pdf).

File code/estimators/base_estimators/CCE_expF_estimation.m

+function [CE] = CCE_expF_estimation(Y1,Y2,co)
+%function [CE] = CCE_expF_estimation(Y1,Y2,co)
+%Estimates the cross-entropy (CE) of Y1 and Y2 using maximum likelihood estimation (MLE) + analytical formula associated with the chosen exponential family.
+%Assumption: k, the carrier measure is zero.
+%
+%We use the naming convention 'C<name>_estimation' to ease embedding new cross quantity estimation methods.
+%
+%INPUT:
+%  Y1: Y1(:,t) is the t^th sample from the first distribution.
+%  Y2: Y2(:,t) is the t^th sample from the second distribution. Note: the number of samples in Y1 [=size(Y1,2)] and Y2 [=size(Y2,2)] can be different.
+%  co: cross quantity estimator object.
+%
+%REFERENCE: 
+%    Frank Nielsen and Richard Nock. Entropies and cross-entropies of exponential families. In IEEE International Conference on Image Processing (ICIP), pages 3621–3624, 2010.
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK. The information theoretical quantity of interest can be (and is!) estimated exactly [co.mult=1]; the computational complexity of the estimation is essentially the same as that of the 'up to multiplicative constant' case [co.mult=0]. In other words, the estimation is carried out 'exactly' (instead of up to 'proportionality').
+
+%verification:
+    if size(Y1,1) ~= size(Y2,1)
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+
+%MLE:
+  np1 = expF_MLE(Y1,co.distr);
+  np2 = expF_MLE(Y2,co.distr);
+
+%the two terms of CE:  
+  term1 = expF_F(co.distr,np2); %F(np2)
+  term2 = expF_np1_np2_mult(np2,expF_gradF(co.distr,np1)); %<np2,nabla F(np1)>
+  
+CE = term1 - term2;
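
The two terms above follow from the standard exponential family identities. The block below is a short derivation sketch, not text from the toolbox: it assumes densities of the form exp(<theta,t(x)> - F(theta) + k(x)) with carrier measure k = 0, as stated in the header, and the normal-case line assumes the natural parametrization documented in 'expF_gradF.m' (np.t1 = C^{-1}*m, np.t2 = 1/2*C^{-1}).

```latex
% p_i(x) = \exp(\langle \theta_i, t(x) \rangle - F(\theta_i)), i = 1, 2  (carrier measure k = 0)
% Identity used: E_{p_1}[t(x)] = \nabla F(\theta_1)
\begin{align*}
\mathrm{CE}(p_1, p_2)
  &= -\int p_1(x) \log p_2(x)\,\mathrm{d}x \\
  &= F(\theta_2) - \langle \theta_2, E_{p_1}[t(x)] \rangle \\
  &= F(\theta_2) - \langle \theta_2, \nabla F(\theta_1) \rangle .
\end{align*}
% Normal case, p_i = N(m_i, C_i):
\begin{align*}
\mathrm{CE}(p_1, p_2)
  = \tfrac{1}{2} \log\det(2\pi C_2)
  + \tfrac{1}{2} \operatorname{tr}\!\left(C_2^{-1} C_1\right)
  + \tfrac{1}{2} (m_1 - m_2)^{\top} C_2^{-1} (m_1 - m_2).
\end{align*}
```

Setting p_1 = p_2 recovers the Shannon entropy expression F(theta) - <theta, grad F(theta)>.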
+

File code/estimators/base_estimators/CCE_expF_initialization.m

+function [co] = CCE_expF_initialization(mult,post_init)
+%function [co] = CCE_expF_initialization(mult)
+%function [co] = CCE_expF_initialization(mult,post_init)
+%Initialization of the exponential family based cross-entropy estimator (maximum likelihood + analytical formula associated to the chosen exponential family).
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'C<name>_initialization' to ease embedding new cross quantity estimation methods.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes (='exact' estimation), '=0' no (=estimation up to 'proportionality').
+%   post_init: {field_name1,field_value1,field_name2,field_value2,...}; cell array containing the names and the values of the cost object fields that are to be used
+%   (instead of their default values). For further details, see 'post_initialization.m'.
+%OUTPUT:
+%   co: cost object (structure).
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields (following the template structure of the estimators to make uniform usage of the estimators possible):
+    co.name = 'CE_expF';
+    co.mult = mult;
+    
+%other fields:
+    co.distr = 'normal'; %exponential family used for estimation; fixed
+    
+%post initialization (put it _before_ initialization of the members in case of a meta estimator):    
+    if nargin==2 %there are given (name,value) cost object fields
+        co = post_initialization(co,post_init);
+    end    

File code/estimators/base_estimators/HShannon_expF_estimation.m

+function [H] = HShannon_expF_estimation(Y,co)
+%function [H] = HShannon_expF_estimation(Y,co)
+%Estimates the Shannon entropy (H) of Y using maximum likelihood estimation (MLE) + analytical formula in the chosen exponential family.
+%Assumption: k, the carrier measure is zero.
+%
+%We use the naming convention 'H<name>_estimation' to ease embedding new entropy estimation methods.
+%
+%INPUT:
+%   Y: Y(:,t) is the t^th sample.
+%  co: entropy estimator object.
+%
+%REFERENCE: 
+%    Frank Nielsen and Richard Nock. A closed-form expression for the Sharma-Mittal entropy of exponential families. Journal of Physics A: Mathematical and Theoretical, 45:032003, 2012. (analytical formula)
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK. The information theoretical quantity of interest can be (and is!) estimated exactly [co.mult=1]; the computational complexity of the estimation is essentially the same as that of the 'up to multiplicative constant' case [co.mult=0]. In other words, the estimation is carried out 'exactly' (instead of up to 'proportionality').
+
+%MLE:
+  np = expF_MLE(Y,co.distr);
+  
+term1 = expF_F(co.distr,np); %F(theta)
+term2 = expF_np1_np2_mult(np,expF_gradF(co.distr,np)); %<theta,grad_F(theta)>
+H =  term1 - term2; %assumption: k, the carrier measure is zero
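
As a sanity check of the formula H = F(theta) - <theta, grad F(theta)>, the following self-contained sketch reproduces the computation for the normal family without calling the toolbox. The variable names and the expression used for the log-normalizer F are assumptions made for illustration (the standard Gaussian log-normalizer in the parametrization documented in 'expF_gradF.m'); they are not taken from 'expF_MLE.m' or 'expF_F.m', which are not part of this diff.

```matlab
% Sketch: Shannon entropy of a fitted normal via H = F(theta) - <theta, grad F(theta)>.
d = 3; T = 5000;
Y = randn(d,T);                 % Y(:,t) is the t-th sample

% maximum likelihood estimates of the mean and covariance (normalized by T)
m  = mean(Y,2);
Yc = bsxfun(@minus,Y,m);
C  = (Yc*Yc.')/T;

% natural parameters, as documented in expF_gradF.m
t1 = C\m;                       % C^{-1}*m
t2 = inv(C)/2;                  % 1/2*C^{-1}

% log-normalizer F (assumed form, see the note above)
I = inv(t2);
F = trace(I*(t1*t1.'))/4 - log(det(t2))/2 + d*log(pi)/2;

% gradient of F, same expressions as in expF_gradF.m
s  = I*t1;
g1 = s/2;
g2 = -I/2 - (s*s.')/4;

% inner product of the parameter structures, as in expF_np1_np2_mult.m / ip.m
innerp = sum(sum(t1.*g1)) + sum(sum(t2.*g2));

H_expF   = F - innerp;                      % exponential family formula
H_closed = log(det(2*pi*exp(1)*C))/2;       % closed form for N(m,C)

fprintf('H (exponential family formula): %.6f\n', H_expF);
fprintf('H (closed form)               : %.6f\n', H_closed);
```

The two printed values should agree up to floating-point error, since for the normal family the formula simplifies algebraically to 1/2*log det(2*pi*e*C).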

File code/estimators/base_estimators/HShannon_expF_initialization.m

+function [co] = HShannon_expF_initialization(mult,post_init)
+%function [co] = HShannon_expF_initialization(mult)
+%function [co] = HShannon_expF_initialization(mult,post_init)
+%Initialization of the exponential family based Shannon entropy estimator (maximum likelihood + analytical formula in the chosen exponential family).
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'H<name>_initialization' to ease embedding new entropy estimation methods.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes (='exact' estimation), '=0' no (=estimation up to 'proportionality').
+%   post_init: {field_name1,field_value1,field_name2,field_value2,...}; cell array containing the names and the values of the cost object fields that are to be used
+%   (instead of their default values). For further details, see 'post_initialization.m'.
+%OUTPUT:
+%   co: cost object (structure).
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields (following the template structure of the estimators to make uniform usage of the estimators possible):
+    co.name = 'Shannon_expF';
+    co.mult = mult;
+    
+%other fields:
+    co.distr = 'normal'; %exponential family used for estimation; fixed
+
+%post initialization (put it _before_ initialization of the members in case of a meta estimator):    
+    if nargin==2 %there are given (name,value) cost object fields
+        co = post_initialization(co,post_init);
+    end    
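
The 'post_init' argument is described above as a cell array of (name,value) pairs that replace default field values. 'post_initialization.m' itself is not shown in this diff, so the loop below is only a hypothetical stand-in that illustrates the documented behaviour, not the toolbox's implementation.

```matlab
% Sketch: overriding default cost object fields with (name,value) pairs.
co.name  = 'Shannon_expF';      % defaults, as set in the initialization above
co.mult  = 1;
co.distr = 'normal';

post_init = {'mult',0};         % hypothetical override: estimate up to proportionality

% stand-in for post_initialization(co,post_init): assign each named field
for k = 1 : 2 : numel(post_init)
    co.(post_init{k}) = post_init{k+1};
end

disp(co);
```

With the toolbox on the path, the corresponding call would presumably be `co = HShannon_expF_initialization(1,{'mult',0})`, given the two-argument signature above.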

File code/estimators/quick_tests/quick_test_CCE.m

     num_of_samples_v = [100:500:12*1000]; %sample numbers used for estimation
     %estimator (of cross-entropy), base:
         cost_name = 'CE_kNN_k'; %d>=1
+        %cost_name = 'CE_expF'; %d>=1
     
 %initialization:
     num_of_samples_max = num_of_samples_v(end);

File code/estimators/quick_tests/quick_test_HShannon.m

             %cost_name = 'Shannon_MaxEnt1';    %d=1; approximation around the normal distribution...
             %cost_name = 'Shannon_MaxEnt2';    %d=1; approximation around the normal distribution...
             %cost_name = 'Shannon_PSD_SzegoT'; %d=1
+            %cost_name = 'Shannon_expF';       %d>=1; distr = 'normal'
+            
         %meta:
             %cost_name = 'Shannon_DKL_N';  %d>=1
             %cost_name = 'Shannon_DKL_U';  %d>=1          

File code/estimators/quick_tests/quick_test_Himreg.m

             %cost_name = 'Shannon_KDP';      %h>=0
             %cost_name = 'Renyi_expF';       %h>=0
             %cost_name = 'Tsallis_expF';     %h>=0
+            %cost_name = 'Shannon_expF';     %h>=0
             
         %meta:
             %cost_name = 'ensemble';      %h>=0

File code/estimators/utilities/exp_family/expF_gradF.m

+function [gradF] = expF_gradF(distr,np) 
+%function [gradF] = expF_gradF(distr,np) 
+%Computes the gradient of the log-normalizer (gradF) at a given natural parameter value (np) for the input exponential family.
+%
+%INPUT:
+%   distr: 'normal'.
+%   np   : natural parameters.
+%          distr = 'normal': np.t1 = C^{-1}*m, np.t2 = 1/2*C^{-1}, where m is the mean, C is the covariance matrix.
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+switch distr
+    case 'normal' %Ref: Frank Nielsen, Vincent Garcia. Statistical exponential families: A digest with flash cards. "http://arxiv.org/abs/0911.4863"
+        I = inv(np.t2);
+        s = I * np.t1;
+        gradF.t1 = s / 2;
+        gradF.t2 = -I/2 - s * s.' / 4;
+    otherwise
+       error('Distribution=?');   
+end
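
For reference, the 'normal' branch can be read against the following formulas. This is a sketch under one assumption not stated in the file: the sufficient statistics are t(x) = (x, -x x^T), which is the convention consistent with np.t2 = 1/2*C^{-1}. Here t_1 and t_2 denote np.t1 and np.t2.

```latex
% Natural parameters: t_1 = C^{-1} m, \quad t_2 = \tfrac{1}{2} C^{-1}
\begin{align*}
F(t_1, t_2) &= \tfrac{1}{4}\, t_1^{\top} t_2^{-1} t_1
              - \tfrac{1}{2} \log\det t_2
              + \tfrac{d}{2} \log \pi, \\
\nabla_{t_1} F &= \tfrac{1}{2}\, t_2^{-1} t_1 = m, \\
\nabla_{t_2} F &= -\tfrac{1}{2}\, t_2^{-1}
                 - \tfrac{1}{4} \left(t_2^{-1} t_1\right)\left(t_2^{-1} t_1\right)^{\top}
               = -\left(C + m m^{\top}\right).
\end{align*}
```

In the code, I = t_2^{-1} and s = t_2^{-1} t_1, so gradF.t1 = s/2 and gradF.t2 = -I/2 - s*s.'/4 match the two gradient lines term by term.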

File code/estimators/utilities/exp_family/expF_np1_np2_mult.m

+function [innerp] = expF_np1_np2_mult(np1,np2)
+%function [innerp] = expF_np1_np2_mult(np1,np2)
+%Inner product of the 'natural parameter' variables, np1 and np2. (np1,np2: structures with the same fields; example: np1.t1, np1.t2, np2.t1, np2.t2).
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+innerp = 0;
+F1 = fieldnames(np1);
+for n = 1 : length(F1)
+    aF = F1{n}; %field name; np1 and np2 are assumed to have the same fields
+    innerp = innerp + ip(np1.(aF),np2.(aF));
+end

File code/estimators/utilities/ip.m

+function [inner_product] = ip(A,B)
+%function [inner_product] = ip(A,B)
+%Computes the (Frobenius) inner product of matrices A and B, i.e., the sum of their elementwise products.
+
+%Copyright (C) 2012-2014 Zoltan Szabo ("http://www.gatsby.ucl.ac.uk/~szabo/", "zoltan (dot) szabo (at) gatsby (dot) ucl (dot) ac (dot) uk")
+%
+%This file is part of the ITE (Information Theoretical Estimators) toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+inner_product = sum(sum(A.*B));
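
A small check of what this one-liner computes: for real matrices of equal size, the sum of elementwise products equals trace(A.'*B). The matrices below are arbitrary illustrative values.

```matlab
% Sketch: sum(sum(A.*B)) coincides with trace(A.'*B) for real matrices.
A = [1 2; 3 4];
B = [5 6; 7 8];

ip_elementwise = sum(sum(A.*B));   % 1*5 + 2*6 + 3*7 + 4*8 = 70
ip_trace       = trace(A.'*B);     % same value via the trace form

fprintf('%d %d\n', ip_elementwise, ip_trace);   % prints "70 70"
```

The elementwise form avoids the matrix product, and it also handles vector-valued fields such as np.t1 without any special casing.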

File doc/ITE_documentation.txt

 From v0.20, the documentation of a given release is available at 'https://bitbucket.org/szzoli/ite/downloads': Downloads tab: 'ITE-<release>_documentation.pdf'.
+Note: the contents of the .zip and .tar.bz2 code archives are identical.