Commits

Zoltan Szabo committed cb07f5b

2 k-nearest neighbor based Kullback-Leibler divergence estimators: added; see 'DKL_kNN_k_initialization.m', 'DKL_kNN_k_estimation.m', 'DKL_kNN_kiTi_initialization.m', 'DKL_kNN_kiTi_estimation.m'.
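A minimal usage sketch of the new 'S={k}' estimator (the Gaussian sample data and the parameter choices below are only illustrative; the call pattern follows the initialization/estimation convention of the files in this commit):

    d = 3; T1 = 2000; T2 = 3000;         %hypothetical dimension and sample sizes
    X = randn(d,T1);                     %samples from the first distribution (one sample per column)
    Y = randn(d,T2) + 1;                 %samples from the second distribution (shifted mean)
    co = DKL_kNN_k_initialization(1);    %mult = 1
    D = DKL_kNN_k_estimation(X,Y,co);    %estimated Kullback-Leibler divergence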


Files changed (7)

+v0.19 (Nov 16, 2012):
+-2 k-nearest neighbor based Kullback-Leibler divergence estimators: added. See 'DKL_kNN_k_initialization.m', 'DKL_kNN_k_estimation.m', 'DKL_kNN_kiTi_initialization.m', 'DKL_kNN_kiTi_estimation.m'.
 -compute_CDSS.cpp: 'sqrt(T)' -> 'sqrt(double(T))', to increase compatibility with compilers ('sqrt' applied to an integer argument is an ambiguous overloaded call for some C++ compilers).
 -Note on Jensen-Shannon divergence: deleted (doc).
 
 - multi-platform (tested extensively on Windows and Linux),
 - free and open source (released under the GNU GPLv3(>=) license).
 
-ITE can estimate Shannon-, Rényi-, Tsallis entropy; generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon-, L2-, Rényi-, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi, Schweizer-Wolff's sigma and kappa; complex variants of entropy and mutual information; L2-, Rényi-, Tsallis divergence; Hellinger-, Bhattacharyya distance; maximum mean discrepancy, and J-distance.
+ITE can estimate Shannon-, Rényi-, Tsallis entropy; generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon-, L2-, Rényi-, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi, Schweizer-Wolff's sigma and kappa; complex variants of entropy and mutual information; L2-, Rényi-, Tsallis-, Kullback-Leibler divergence; Hellinger-, Bhattacharyya distance; maximum mean discrepancy, and J-distance.
 
 ITE offers solution methods for 
 

code/H_I_D/base_estimators/DKL_kNN_k_estimation.m

+function [D] = DKL_kNN_k_estimation(X,Y,co)
+%Estimates the Kullback-Leibler divergence (D) of X and Y (X(:,t), Y(:,t) is the t^th sample)
+%using the kNN method (S={k}). The number of samples in X [=size(X,2)] and Y [=size(Y,2)] can be different. Cost parameters are provided in the cost object co.
+%
+%We make use of the naming convention 'D<name>_estimation', to ease embedding new divergence estimation methods.
+%
+%REFERENCE: 
+%   Fernando Perez-Cruz. Estimation of Information Theoretic Measures for Continuous Random Variables. Advances in Neural Information Processing Systems (NIPS), pp. 1257-1264, 2008.
+%   Nikolai Leonenko, Luc Pronzato, and Vippal Savani. A class of Renyi information estimators for multidimensional densities. Annals of Statistics, 36(5):2153-2182, 2008.
+%   Qing Wang, Sanjeev R. Kulkarni, and Sergio Verdu. Divergence estimation for multidimensional densities via k-nearest-neighbor distances. IEEE Transactions on Information Theory, 55:2392-2405, 2009.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+[dX,num_of_samplesX] = size(X);
+[dY,num_of_samplesY] = size(Y);
+
+if dX~=dY
+    error('The dimension of X and Y must be equal.');
+else
+    d = dX;
+    squared_distancesXX = kNN_squared_distances(X,X,co,1);%squared k-NN distances within X (the query point itself is excluded)
+    squared_distancesYX = kNN_squared_distances(Y,X,co,0);%squared k-NN distances of the X points among the Y samples
+    dist_k_XX = sqrt(squared_distancesXX(end,:));%distance to the k-th nearest neighbor within X
+    dist_k_YX = sqrt(squared_distancesYX(end,:));%distance to the k-th nearest neighbor among the Y samples
+    D = d * mean(log(dist_k_YX./dist_k_XX)) + log(num_of_samplesY/(num_of_samplesX-1));
+end
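For reference, the quantity returned above is the k-nearest-neighbor divergence estimator of the cited Perez-Cruz / Wang-Kulkarni-Verdu line of work; in the notation of the code (T_1 = num_of_samplesX, T_2 = num_of_samplesY):

    \hat{D}(X\|Y) = \frac{d}{T_1} \sum_{t=1}^{T_1} \log \frac{\nu_k(t)}{\rho_k(t)} + \log \frac{T_2}{T_1 - 1},

where \rho_k(t) = dist_k_XX(t) is the distance of X(:,t) to its k-th nearest neighbor within X (itself excluded), and \nu_k(t) = dist_k_YX(t) is its distance to the k-th nearest neighbor among the Y samples.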

code/H_I_D/base_estimators/DKL_kNN_k_initialization.m

+function [co] = DKL_kNN_k_initialization(mult)
+%Initialization of the kNN (k-nearest neighbor, S={k}) based Kullback-Leibler divergence estimator.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We make use of the naming convention 'D<name>_initialization', to ease embedding new divergence estimation methods.
+%
+%INPUT:
+%   mult: whether a multiplicative constant is relevant (needed) in the estimation; '=1' means yes, '=0' means no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'KL_kNN_k';
+    co.mult = mult;
+    
+%other fields:
+    %Possibilities for 'co.kNNmethod' (see 'kNN_squared_distances.m'): 
+        %I: 'knnFP1': fast pairwise distance computation and C++ partial sort; parameter: co.k.                
+        %II: 'knnFP2': fast pairwise distance computation; parameter: co.k. 						
+        %III: 'knnsearch' (Matlab Statistics Toolbox): parameters: co.k, co.NSmethod ('kdtree' or 'exhaustive').
+        %IV: 'ANN' (approximate nearest neighbor); parameters: co.k, co.epsi. 
+        %I:
+            co.kNNmethod = 'knnFP1';
+            co.k = 3;%number of nearest neighbors
+        %II:
+            %co.kNNmethod = 'knnFP2';
+            %co.k = 3;%number of nearest neighbors
+        %III:
+            %co.kNNmethod = 'knnsearch';
+            %co.k = 3;%number of nearest neighbors
+            %co.NSmethod = 'kdtree';
+        %IV:
+            %co.kNNmethod = 'ANN';
+            %co.k = 3;%number of nearest neighbors
+            %co.epsi = 0; %=0: exact kNN; >0: approximate kNN, the returned (not squared) distances may exceed the true nearest neighbor distances by at most a factor of (1+epsi).
+
+%initialize the ann wrapper in Octave, if needed:
+    initialize_Octave_ann_wrapper_if_needed(co.kNNmethod);
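Since the returned cost object is an ordinary structure, its fields can also be adjusted after initialization instead of editing this file; a small sketch (the k = 5 value is a hypothetical choice):

    co = DKL_kNN_k_initialization(1);
    co.k = 5;    %use the 5 nearest neighbors instead of the default 3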

code/H_I_D/base_estimators/DKL_kNN_kiTi_estimation.m

+function [D] = DKL_kNN_kiTi_estimation(X,Y,co)
+%Estimates the Kullback-Leibler divergence (D) of X and Y (X(:,t), Y(:,t) is the t^th sample)
+%using the kNN method (S_1={k_1}, S_2={k_2}; k_i = floor(sqrt(T_i)), where T_i is the respective sample size). The number of samples in X [=size(X,2)] and Y [=size(Y,2)] can be different. Cost parameters are provided in the cost object co.
+%
+%We make use of the naming convention 'D<name>_estimation', to ease embedding new divergence estimation methods.
+%
+%REFERENCE: 
+%   Qing Wang, Sanjeev R. Kulkarni, and Sergio Verdu. Divergence estimation for multidimensional densities via k-nearest-neighbor distances. IEEE Transactions on Information Theory, 55:2392-2405, 2009.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+[dX,num_of_samplesX] = size(X);
+[dY,num_of_samplesY] = size(Y);
+
+if dX~=dY
+    error('The dimension of X and Y must be equal.');
+else
+    d = dX;
+    k1 = floor(sqrt(num_of_samplesX));%neighbor order tied to the size of the X sample
+    k2 = floor(sqrt(num_of_samplesY));%neighbor order tied to the size of the Y sample
+    
+    co.k = k1;
+    squared_distancesXX = kNN_squared_distances(X,X,co,1);%squared k_1-NN distances within X (the query point itself is excluded)
+    
+    co.k = k2;
+    squared_distancesYX = kNN_squared_distances(Y,X,co,0);%squared k_2-NN distances of the X points among the Y samples
+    
+    dist_k_XX = sqrt(squared_distancesXX(end,:));%distance to the k_1-th nearest neighbor within X
+    dist_k_YX = sqrt(squared_distancesYX(end,:));%distance to the k_2-th nearest neighbor among the Y samples
+    D = d * mean(log(dist_k_YX./dist_k_XX)) + log( k1/k2 * num_of_samplesY/(num_of_samplesX-1) );
+end
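The estimator realized above differs from the S={k} variant in that the neighbor orders are tied to the sample sizes, k_i = floor(sqrt(T_i)), which makes the additive bias-correction term sample-size dependent (Wang-Kulkarni-Verdu, 2009):

    \hat{D}(X\|Y) = \frac{d}{T_1} \sum_{t=1}^{T_1} \log \frac{\nu_{k_2}(t)}{\rho_{k_1}(t)} + \log\left( \frac{k_1}{k_2} \cdot \frac{T_2}{T_1 - 1} \right).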

code/H_I_D/base_estimators/DKL_kNN_kiTi_initialization.m

+function [co] = DKL_kNN_kiTi_initialization(mult)
+%Initialization of the kNN (k-nearest neighbor, S_1={k_1}, S_2={k_2}) based Kullback-Leibler divergence estimator. Here, the k_i-s depend on the number of samples; they are set in 'DKL_kNN_kiTi_estimation.m'.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We make use of the naming convention 'D<name>_initialization', to ease embedding new divergence estimation methods.
+%
+%INPUT:
+%   mult: whether a multiplicative constant is relevant (needed) in the estimation; '=1' means yes, '=0' means no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'KL_kNN_kiTi';
+    co.mult = mult;
+    
+%other fields:
+    %Possibilities for 'co.kNNmethod' (see 'kNN_squared_distances.m'): 
+        %I: 'knnFP1': fast pairwise distance computation and C++ partial sort; parameter: co.k.                
+        %II: 'knnFP2': fast pairwise distance computation; parameter: co.k. 						
+        %III: 'knnsearch' (Matlab Statistics Toolbox): parameters: co.k, co.NSmethod ('kdtree' or 'exhaustive').
+        %IV: 'ANN' (approximate nearest neighbor); parameters: co.k, co.epsi. 
+        %I:
+            co.kNNmethod = 'knnFP1';
+        %II:
+            %co.kNNmethod = 'knnFP2';
+        %III:
+            %co.kNNmethod = 'knnsearch';
+            %co.NSmethod = 'kdtree';
+        %IV:
+            %co.kNNmethod = 'ANN';
+            %co.epsi = 0; %=0: exact kNN; >0: approximate kNN, the returned (not squared) distances may exceed the true nearest neighbor distances by at most a factor of (1+epsi).
+
+%initialize the ann wrapper in Octave, if needed:
+    initialize_Octave_ann_wrapper_if_needed(co.kNNmethod);
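Usage mirrors the S={k} variant; note that co.k need not be set here, since 'DKL_kNN_kiTi_estimation.m' overwrites it from the sample sizes (X and Y are again hypothetical sample matrices, one sample per column):

    co = DKL_kNN_kiTi_initialization(1);
    D = DKL_kNN_kiTi_estimation(X,Y,co);   %k1 = floor(sqrt(size(X,2))), k2 = floor(sqrt(size(Y,2))) set internally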

doc/ITE_documentation.pdf

Binary file modified.
