Zoltan Szabo committed d64cd16

Schweizer-Wolff's sigma and kappa: added. Hoeffding's Phi computation: scaled-up (C++).

Files changed (49)

+v0.12 (Oct 27, 2012):
+-Schweizer-Wolff's sigma and kappa: added; see 'ISW1_initialization.m', 'ISW1_estimation.m', 'ISWinf_initialization.m', 'ISWinf_estimation.m'.
+-Hoeffding's Phi computation: scaled-up (C++); see 'Hoeffding_term1.cpp'.
+-HRenyi_weightedkNN_initialization.m: a superfluous co.k parameter deleted.
+-chol_gauss.c: a superfluous 'double *y' variable deleted.
+-dY==dX verification added to 'DRenyi_kNN_k_estimation.m', 'DTsallis_kNN_k_estimation.m', 'DL2_kNN_k_estimation.m'.
+-some comments improved.
+
 v0.11 (Oct 20, 2012):
 -multivariate version of the Hoeffding's Phi estimator: added (see 'IHoeffding_initialization.m', 'IHoeffding_estimation.m').
 -ITE_install.m: updated to perform only the 'compiled' quick tests. I am a greedy user;) See #1.
 - multi-platform (tested extensively on Windows and Linux),
 - free and open source (released under the GNU GPLv3(>=) license).
 
-ITE can estimate Shannon-, Rényi entropy; generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon-, L2-, Rényi-, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi; complex variants of entropy and mutual information; L2-, Rényi-, Tsallis divergence, maximum mean discrepancy, and J-distance.
+ITE can estimate Shannon-, Rényi entropy; generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon-, L2-, Rényi-, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi, Schweizer-Wolff's sigma and kappa; complex variants of entropy and mutual information; L2-, Rényi-, Tsallis divergence, maximum mean discrepancy, and J-distance.
 
 ITE offers solution methods for 
 

code/H_I_D/I_estimation.m

 %ds(m)-dimensional; using the method/cost object co.
 %
 %INPUT:
-%   Y: Y(:,t) t^th sample.
-%  ds: subspace dimension.
+%   Y: Y(:,t) is the t^th sample.
+%  ds: subspace dimensions.
 %  co: mutual information estimator object (structure).
 %
 %Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")

code/H_I_D/base_estimators/DL2_kNN_k_estimation.m

 
 %co.mult:OK.
 
-[d,num_of_samplesY] = size(Y);
-[d,num_of_samplesX] = size(X);
+[dY,num_of_samplesY] = size(Y);
+[dX,num_of_samplesX] = size(X);
+
+if dX~=dY
+    error('The dimension of X and Y must be equal.');
+end
+d = dX;
 
 squared_distancesXX = kNN_squared_distances(X,X,co,1);
 squared_distancesYX = kNN_squared_distances(Y,X,co,0);

code/H_I_D/base_estimators/DRenyi_kNN_k_estimation.m

 
 %co.mult:OK.
 
+if size(X,1)~=size(Y,1)
+    error('The dimension of X and Y must be equal.');
+end
+
 D_alpha = estimate_Dalpha(X,Y,co);
 R_alpha = log(D_alpha) / (co.alpha-1);

code/H_I_D/base_estimators/DTsallis_kNN_k_estimation.m

 
 %co.mult:OK.
 
+if size(X,1)~=size(Y,1)
+    error('The dimension of X and Y must be equal.');
+end
+
 D_alpha = estimate_Dalpha(X,Y,co);
 T_alpha = (D_alpha-1)/(co.alpha-1);

code/H_I_D/base_estimators/HRenyi_weightedkNN_initialization.m

         %I: 'knnFP1': fast pairwise distance computation and C++ partial sort; parameter: co.k.
         %II: 'knnFP2': fast pairwise distance computation; parameter: co.k. 												 		
         %III: 'knnsearch' (Matlab Statistics Toolbox): parameters: co.k, co.NSmethod ('kdtree' or 'exhaustive').        
-        %IV: 'ANN' (approximate nearest neighbor); parameters: co.k, co.epsi.         
+        %IV: 'ANN' (approximate nearest neighbor); parameters: co.k, co.epsi. 
+		%Note: co.k (the number of neighbors) depends on the number of samples used for estimation; it is set in 'HRenyi_weightedkNN_estimation.m'.
 		%I:
             co.kNNmethod = 'knnFP1';
-            co.k = 3;%k-nearest neighbors
 		%II:
             %co.kNNmethod = 'knnFP2';
-            %co.k = 3;%k-nearest neighbors
         %III:
             %co.kNNmethod = 'knnsearch';
-            %co.k = 3;%k-nearest neighbors
             %co.NSmethod = 'kdtree';
         %IV:
             %co.kNNmethod = 'ANN';
-            %co.k = 3;%k-nearest neighbors
             %co.epsi = 0; %=0: exact kNN; >0: approximate kNN, the returned (not squared) distances may exceed the true distances by at most a factor of (1+epsi).
 
 %initialize the ann wrapper in Octave, if needed:

code/H_I_D/base_estimators/IHSIC_estimation.m

 %Estimates mutual information (I) using the HSIC (Hilbert-Schmidt independence criterion) method. 
 %
 %INPUT:
-%   Y: Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: initialized mutual information estimator object.
 %REFERENCE:

code/H_I_D/base_estimators/IHoeffding_estimation.m

 %Estimates mutual information (I) using the multivariate version of Hoeffding's Phi. 
 %
 %INPUT:
-%   Y: Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: initialized mutual information estimator object.
 %REFERENCE:
     U = copula_transformation(Y);
 
     %term1:
-        cumsumMi = 1 - bsxfun(@max,U(1,:).',U(1,:));
-        for i = 2 : d
-            cumsumMi = cumsumMi .* (1 - bsxfun(@max,U(i,:).',U(i,:)));
-        end
-        term1 = sum(sum(cumsumMi)) / num_of_samples^2;
+        term1 = Hoeffding_term1(U);
     %term2:
         if co.small_sample_adjustment
             term2 = -2 * sum(prod(1-U.^2-(1-U)/num_of_samples,1)) / (num_of_samples * 2^d);

code/H_I_D/base_estimators/ISW1_estimation.m

+function [I] = ISW1_estimation(Y,ds,co)
+%Estimates mutual information (I) using Schweizer-Wolff's sigma. 
+%
+%INPUT:
+%   Y: Y(:,t) is the t^th sample.
+%  ds: subspace dimensions.
+%  co: initialized mutual information estimator object.
+%REFERENCE:
+%  Sergey Kirshner and Barnabás Póczos. ICA and ISA Using Schweizer-Wolff Measure of Dependence. International Conference on Machine Learning (ICML), 464-471, 2008.
+%  B. Schweizer and E. F. Wolff. On Nonparametric Measures of Dependence for Random Variables. The Annals of Statistics 9:879-885, 1981.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+if one_dimensional_problem(ds) && length(ds)==2
+    %initialization:
+        bound = 1;
+        num_of_samples = size(Y,2);
+    if (num_of_samples <= co.max_num_of_samples_without_appr) %directly
+        r = rank_transformation(Y);
+        I = SW_sigma(r,bound);
+    else %approximation on a sparse grid
+        bin_size = ceil(num_of_samples/co.max_num_of_bins);
+        num_of_samples = floor(num_of_samples/bin_size)*bin_size;    
+        r = rank_transformation(Y(:,1:num_of_samples));
+        I = SW_sigma(r,bound,bin_size);
+    end    
+else
+    error('There must be two one-dimensional subspaces (coordinates) for this estimator.');
+end
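The 'SW_sigma' routine called above lives in the embedded SWICA package and is not part of this commit. As an illustrative cross-check, the rank-based statistic it evaluates (Schweizer-Wolff's sigma, 12 times the L1 distance between the empirical copula and the independence copula) can be sketched in NumPy as follows; the 12/(n^2-1) normalization is an assumption based on the Kirshner and Póczos reference, and it makes sigma exactly 1 for strictly monotone dependence.

```python
import numpy as np

def sw_sigma(x, y):
    """Empirical Schweizer-Wolff sigma of two 1-D samples (ties not handled)."""
    n = len(x)
    r = np.argsort(np.argsort(x))  # ranks 0..n-1 of x
    s = np.argsort(np.argsort(y))  # ranks 0..n-1 of y
    # empirical copula on the rank grid: C[i, j] = #{k : r_k <= i, s_k <= j} / n
    C = np.zeros((n, n))
    for k in range(n):
        C[r[k]:, s[k]:] += 1.0 / n
    u = np.arange(1, n + 1) / n
    return 12.0 / (n * n - 1) * np.abs(C - np.outer(u, u)).sum()
```

The O(n^2) grid is the reason the .m file above switches to a binned approximation once the sample count exceeds co.max_num_of_samples_without_appr.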

code/H_I_D/base_estimators/ISW1_initialization.m

+function [co] = ISW1_initialization(mult)
+%Initialization of the Schweizer-Wolff's sigma mutual information estimator.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We make use of the naming convention 'I<name>_initialization', to ease embedding new mutual information estimation methods.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'SW1';
+    co.mult = mult;
+    
+%other fields:
+    co.max_num_of_samples_without_appr = 2500; %maximum number of samples to be handled without approximation
+    co.max_num_of_bins = 2500; %maximum number of bins

code/H_I_D/base_estimators/ISWinf_estimation.m

+function [I] = ISWinf_estimation(Y,ds,co)
+%Estimates mutual information (I) using Schweizer-Wolff's kappa. 
+%
+%INPUT:
+%   Y: Y(:,t) is the t^th sample.
+%  ds: subspace dimensions.
+%  co: initialized mutual information estimator object.
+%REFERENCE:
+%  Sergey Kirshner and Barnabás Póczos. ICA and ISA Using Schweizer-Wolff Measure of Dependence. International Conference on Machine Learning (ICML), 464-471, 2008.
+%  B. Schweizer and E. F. Wolff. On Nonparametric Measures of Dependence for Random Variables. The Annals of Statistics 9:879-885, 1981.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+if one_dimensional_problem(ds) && length(ds)==2
+    %initialization:
+        bound = 1;
+        num_of_samples = size(Y,2);
+    if (num_of_samples <= co.max_num_of_samples_without_appr) %directly
+        r = rank_transformation(Y);
+        I = SW_kappa(r,bound);
+    else %approximation on a sparse grid
+        bin_size = ceil(num_of_samples/co.max_num_of_bins);
+        num_of_samples = floor(num_of_samples/bin_size)*bin_size;    
+        r = rank_transformation(Y(:,1:num_of_samples));
+        I = SW_kappa(r,bound,bin_size);
+    end    
+else
+    error('There must be two one-dimensional subspaces (coordinates) for this estimator.');
+end
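'SW_kappa' (also in the embedded SWICA package) evaluates the sup-norm variant, kappa = 4*sup|C(u,v) - uv|. A hedged NumPy sketch of the rank-based statistic, with the constant 4 assumed from the standard definition so that kappa equals 1 under strictly monotone dependence:

```python
import numpy as np

def sw_kappa(x, y):
    """Empirical Schweizer-Wolff kappa (sup-norm dependence) of two 1-D samples."""
    n = len(x)
    r = np.argsort(np.argsort(x))  # ranks 0..n-1
    s = np.argsort(np.argsort(y))
    C = np.zeros((n, n))           # empirical copula on the rank grid
    for k in range(n):
        C[r[k]:, s[k]:] += 1.0 / n
    u = np.arange(1, n + 1) / n
    return 4.0 * np.abs(C - np.outer(u, u)).max()
```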

code/H_I_D/base_estimators/ISWinf_initialization.m

+function [co] = ISWinf_initialization(mult)
+%Initialization of the Schweizer-Wolff's kappa mutual information estimator.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We make use of the naming convention 'I<name>_initialization', to ease embedding new mutual information estimation methods.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'SWinf';
+    co.mult = mult;
+    
+%other fields:
+    co.max_num_of_samples_without_appr = 2500; %maximum number of samples to be handled without approximation
+    co.max_num_of_bins = 2500; %maximum number of bins

code/H_I_D/meta_estimators/HRPensemble_estimation.m

 function [H] = HRPensemble_estimation(Y,co)
-%Estimates Shannon differential entropy (H) from the average of H
-%estimations on RP-ed (random projection) groups of samples; this a "meta" method, the applied H estimator can be arbitrary.
+%Estimates entropy (H) from the average of H estimations on RP-ed (random projection) groups of samples; this a "meta" method, the applied H estimator can be arbitrary.
 %
 %We make use of the naming convention 'H<name>_estimation', to ease embedding new entropy estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  co: entropy estimator object.
 %REFERENCE: 
 %	Zoltán Szabó, András Lőrincz: Fast Parallel Estimation of High Dimensional Information Theoretical Quantities with Low Dimensional Random Projection Ensembles. ICA 2009, pages 146-153.
 	R = randn(d_RP,d) / sqrt(d_RP);
     H = H + H_estimation(R*Y(:,(k-1)*g+1:k*g),co.member_co);
 end
-H = H / num_of_groups;
+H = H / num_of_groups;

code/H_I_D/meta_estimators/HRPensemble_initialization.m

 function [co] = HRPensemble_initialization(mult)
-%Initialization of the "meta" differential entropy estimator. The estimator is:
+%Initialization of the "meta" entropy estimator. The estimator is:
 %   1)constructed from the average of entropy estimations on groups of RP-ed (random projection) samples.
 %   2)treated as a cost object (co). 
 %
 %other fields:
     co.group_size = 500;%assumption: co.group_size <= number of samples
 	co.dim_RP = 2; %assumption: co.dim_RP <= dimension of the samples
-    co.member_name = 'Shannon_kNN_k'; %you can change it to any Shannon entropy estimator.
+    co.member_name = 'Shannon_kNN_k'; %you can change it to any entropy estimator.
     co.member_co = H_initialization(co.member_name,mult);
     

code/H_I_D/meta_estimators/Hcomplex_estimation.m

 function [H] = Hcomplex_estimation(Y,co)
-%Estimates entropy from a real-valued (vector) entropy estimator; this a "meta" method, the applied estimator can be arbitrary.
+%Estimates complex entropy from a real-valued (vector) entropy estimator; this a "meta" method, the applied estimator can be arbitrary.
 %
 %We make use of the naming convention 'H<name>_estimation', to ease embedding new entropy estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  co: entropy estimator object.
 %
 %Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")

code/H_I_D/meta_estimators/Hcomplex_initialization.m

 function [co] = Hcomplex_initialization(mult)
-%Initialization of the "meta" entropy estimator. The estimator is.
+%Initialization of the "meta" complex entropy estimator. The estimator is
 %   1)constructed from a real-valued (vector) entropy estimator.
 %   2)treated as a cost object (co). 
 %
     co.mul = mult;
     
 %other fields:
-    co.member_name = 'Renyi_kNN_k'; %you can change it to any Shannon entropy estimator.
+    co.member_name = 'Renyi_kNN_k'; %you can change it to any real entropy estimator.
     co.member_co = H_initialization(co.member_name,mult);
     

code/H_I_D/meta_estimators/Hensemble_estimation.m

 %We make use of the naming convention 'H<name>_estimation', to ease embedding new entropy estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  co: entropy estimator object.
 %REFERENCE: 
 %  Jan Kybic: High-dimensional mutual information estimation for image registration. ICIP 2004, pages 1779-1782.

code/H_I_D/meta_estimators/Hensemble_initialization.m

     
 %other fields:
     co.group_size = 500;%assumption: co.group_size <= number of samples
-    co.member_name = 'Shannon_kNN_k'; %you can change it to any Shannon entropy estimator.
+    co.member_name = 'Shannon_kNN_k'; %you can change it to any entropy estimator.
     co.member_co = H_initialization(co.member_name,mult);

code/H_I_D/meta_estimators/IL2_DL2_estimation.m

 function [I] = IL2_DL2_estimation(Y,ds,co)
-%Estimates mutual information (I) making use of an(y) estimator for L2 divergence; co is the cost object.
+%Estimates L2 mutual information (I) making use of an(y) estimator for L2 divergence; co is the cost object.
 %This is a  "meta" method, using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).
 %
 %We make use of the naming convention 'I<name>_estimation', to ease embedding new mutual information estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: mutual information estimator object.
 %REFERENCE:

code/H_I_D/meta_estimators/IL2_DL2_initialization.m

 function [co] = IL2_DL2_initialization(mult)
-%Initialization of the "meta" mutual information estimator, which is 
+%Initialization of the "meta" L2 mutual information estimator, which is 
 %   1)based on an(y) estimator for L2 divergence,
 %   2)is treated as a cost object (co). 
 %Mutual information is estimated using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).
     co.mul = mult;
 	
 %other fields:    
-    co.member_name = 'L2_kNN_k'; %you can change it to any L2-divergence estimator
+    co.member_name = 'L2_kNN_k'; %you can change it to any L2 divergence estimator
     co.member_co = D_initialization(co.member_name,mult);

code/H_I_D/meta_estimators/IRenyi_DRenyi_estimation.m

 function [I] = IRenyi_DRenyi_estimation(Y,ds,co)
-%Estimates mutual information (I) making use of an(y) estimator for Rényi divergence; co is the cost object.
+%Estimates Rényi mutual information (I) making use of an(y) estimator for Rényi divergence; co is the cost object.
 %This is a  "meta" method, using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).
 %
 %We make use of the naming convention 'I<name>_estimation', to ease embedding new mutual information estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: mutual information estimator object.
 %REFERENCE:

code/H_I_D/meta_estimators/IRenyi_DRenyi_initialization.m

 function [co] = IRenyi_DRenyi_initialization(mult)
-%Initialization of the "meta" mutual information estimator, which is 
+%Initialization of the "meta" Rényi mutual information estimator, which is 
 %   1)based on an(y) estimator for Rényi divergence,
 %   2)is treated as a cost object (co). 
 %Mutual information is estimated using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).

code/H_I_D/meta_estimators/IRenyi_HRenyi_estimation.m

 %We make use of the naming convention 'I<name>_estimation', to ease embedding new mutual information estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: mutual information estimator object.
 %REFERENCE:

code/H_I_D/meta_estimators/IRenyi_HRenyi_initialization.m

 function [co] = IRenyi_HRenyi_initialization(mult)
-%Initialization of the Rényi "meta" mutual information estimator. The estimator uses the identity: 
+%Initialization of the "meta" Rényi mutual information estimator. The estimator uses the identity: 
 %I_{alpha}(X) = -H_{alpha}(Z), where Z =[F_1(X_1);...;F_d(X_d)] is the copula transformation of X; F_i is the cdf of X_i.
 %
 %Note:

code/H_I_D/meta_estimators/IShannon_HShannon_estimation.m

 %We make use of the naming convention 'I<name>_estimation', to ease embedding new mutual information estimation methods.
 %
 %INPUT:
-%   Y: Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: mutual information estimator object.
 %REFERENCE: 

code/H_I_D/meta_estimators/IShannon_HShannon_initialization.m

 function [co] = IShannon_HShannon_initialization(mult)
-%Initialization of the Shannon "meta" mutual information estimator, which is 
+%Initialization of the "meta" Shannon mutual information estimator, which is 
 %   1)based on an(y) estimator for Shannon differential entropy,
 %   2)is treated as a cost object (co). 
 %Mutual information is estimated using the relation: 
 	%factor correct' H estimation (mult=1) below. Note: Rényi entropy (H_{alpha}) also gives the Shannon entropy (H) in the limit (alpha->1).
     co.member_co = H_initialization(co.member_name,1);%'1': since we use the relation '(*)' (multiplicative factors in entropy estimations are NOT allowed).
    
-    
+    

code/H_I_D/meta_estimators/ITsallis_DTsallis_estimation.m

 function [I] = ITsallis_DTsallis_estimation(Y,ds,co)
-%Estimates mutual information (I) making use of an(y) estimator for Tsallis divergence; co is the cost object.
+%Estimates Tsallis mutual information (I) making use of an(y) estimator for Tsallis divergence; co is the cost object.
 %This is a  "meta" method, using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).
 %
 %We make use of the naming convention 'I<name>_estimation', to ease embedding new mutual information estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: mutual information estimator object.
 %REFERENCE:

code/H_I_D/meta_estimators/ITsallis_DTsallis_initialization.m

 function [co] = ITsallis_DTsallis_initialization(mult)
-%Initialization of the "meta" mutual information estimator, which is 
+%Initialization of the "meta" Tsallis mutual information estimator, which is 
 %   1)based on an(y) estimator for Tsallis divergence,
 %   2)is treated as a cost object (co). 
 %Mutual information is estimated using the relation: I(y^1,...,y^M) = D(f_y,\prod_{m=1}^M f_{y^m}).

code/H_I_D/meta_estimators/Icomplex_estimation.m

 %We make use of the naming convention 'I<name>_estimation', to ease embedding new mutual information estimation methods.
 %
 %INPUT:
-%   Y:Y(:,t) t^th sample.
+%   Y: Y(:,t) is the t^th sample.
 %  ds: subspace dimensions.
 %  co: mutual information estimator object.
 %

code/H_I_D/utilities/Hoeffding_term1.cpp

+#include "mex.h"
+#include "math.h"
+
+/* .cpp version of 'Hoeffding_term1.m' */
+
+/* Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>. */
+
+/* The computational routine(s) */
+double max(double a,double b)
+{
+return (a>b)?a:b;
+}
+
+long factorial(int n)
+{
+  int c;
+  long result = 1;
+ 
+  for (c = 1; c <= n; c++)
+    result = result * c;
+ 
+  return result;
+}
+
+/* The gateway function */
+void mexFunction(
+    int nlhs, mxArray *plhs[],
+    int nrhs, const mxArray *prhs[])
+{
+
+    double *U;    /* pointer to the input matrix */    
+    mwSize d,T;    
+    
+    U = mxGetPr(prhs[0]); /* create a pointer to the input matrix */
+    d = mxGetM(prhs[0]);  /* number of rows */
+    T = mxGetN(prhs[0]);  /* number of columns */
+
+    double *I;    
+    /* Allocating array for I */
+    plhs[0] = mxCreateDoubleMatrix(1,1,mxREAL);
+    I = mxGetPr(plhs[0]);    
+    
+    mwSize j,k,i;
+    double temp,term1;
+    
+    term1 = 0;
+    for (j=1; j<T+1; j++) {   /* j: in Matlab sense*/
+    for (k=1; k<T+1; k++) {   /* k: in Matlab sense*/
+      temp = 1 - max(U[(j-1)*d],U[(k-1)*d]);  /*temp = 1-max(U(i,j),U(i,k)) */
+      for (i=2; i<d+1; i++) {      /* i: in Matlab sense*/
+          temp = temp * ( 1 - max(U[(i-1)+(j-1)*d],U[(i-1)+(k-1)*d]) );
+      }
+      term1 = term1 + temp;
+    }
+    }
+    term1 = term1 / (T*T);
+
+    *I = term1;
+}

code/H_I_D/utilities/Hoeffding_term1.m

+function [term1] = Hoeffding_term1(U)
+%Estimates the first term (term1) in Hoeffding's Phi based on the copula transformation (U).
+%
+%INPUT:
+%   U: U(:,t) is the t^th sample.
+%NOTE:
+%  For problems with a high number of samples, this computation can be somewhat slow and memory-consuming; acceleration possibility: compile 'Hoeffding_term1.cpp'.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+[d,num_of_samples] = size(U);
+
+cumsumMi = 1 - bsxfun(@max,U(1,:).',U(1,:));
+for i = 2 : d
+    cumsumMi = cumsumMi .* (1 - bsxfun(@max,U(i,:).',U(i,:)));
+end
+
+term1 = sum(sum(cumsumMi)) / num_of_samples^2;
+
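Since 'Hoeffding_term1.cpp' is meant as a drop-in accelerated replacement for this .m file, a cross-check against the defining double sum, term1 = (1/T^2) * sum_{j,k} prod_{i=1}^d (1 - max(U(i,j), U(i,k))), may be useful. An illustrative NumPy translation (not part of ITE; it shares the d*T^2 memory footprint noted above):

```python
import numpy as np

def hoeffding_term1(U):
    """term1 = (1/T^2) * sum_{j,k} prod_i (1 - max(U[i,j], U[i,k]))."""
    d, T = U.shape
    M = 1.0 - np.maximum(U[:, :, None], U[:, None, :])  # shape (d, T, T)
    return M.prod(axis=0).sum() / T**2
```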

code/H_I_D/utilities/I_similarity_matrix.m

 %
 %You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
 
+
+disp('Pairwise similarity matrix computation: started.');
 %similarity matrix(S):
     switch co.name
         case 'GV'
             end
             end
     end
-    
+disp('Pairwise similarity matrix computation: ready.');
+
 %Verify that S_ij >= 0 (negative values may appear, e.g., in case of (i) an entropy estimator with an additive constant, or (ii) a small number of samples).
     nonnegativeS = ( sum(sum(S>=0)) == prod(size(S)) );
     if ~nonnegativeS

code/H_I_D/utilities/copula_transformation.m

 %
 %You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
 
-[d,num_of_samples] = size(X);
-[temp,J] = sort(X,2);
-Z = zeros(size(J));%preallocation
-
-%I.:
-    for k = 1 : d
-        Z(k,J(k,:)) = [1:num_of_samples]; %inverse of permutation J(k,:)
-    end
-    
-%II.:
-    %I = repmat([1:d].',1,num_of_samples);
-    %Z(I+(J-1)*d) = repmat([1:num_of_samples],d,1);
-	
-Z = Z / num_of_samples;
+[Z] = rank_transformation(X);
+Z = Z / size(X,2);

code/H_I_D/utilities/estimate_Dalpha.m

 function [D_alpha] = estimate_Dalpha(X,Y,co)
-%Estimates D_alpha, the Rényi and the Tsallis divergences are simple functions of this quantity.
+%Estimates D_alpha = \int p^{\alpha}(x)q^{1-\alpha}(x)dx; the Rényi and the Tsallis divergences are simple functions of this quantity.
 %
 %INPUT:
-%   X:X(:,t) is the t^th sample from the first distribution.
-%   Y:Y(:,t) is the t^th sample from the second distribution.
+%   X: X(:,t) is the t^th sample from the first distribution.
+%   Y: Y(:,t) is the t^th sample from the second distribution.
 %  co: cost object (structure).
 %
 %Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
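Independently of the kNN machinery used by this estimator, D_alpha = \int p^{\alpha}(x) q^{1-\alpha}(x) dx equals E_q[(p(x)/q(x))^{\alpha}], which gives a quick Monte Carlo sanity check. An illustrative Python sketch with hypothetical unit-variance Gaussian densities (not part of ITE); for two unit-variance Gaussians shifted by mu, the closed form is exp(-alpha*(1-alpha)*mu^2/2):

```python
import numpy as np

def d_alpha_mc(logp, logq, xq, alpha):
    """Monte Carlo estimate of D_alpha = E_q[(p(x)/q(x))^alpha] from samples xq ~ q."""
    return float(np.mean(np.exp(alpha * (logp(xq) - logq(xq)))))

def log_gauss(mu):
    """Log-density of N(mu, 1); a hypothetical test density, not an ITE function."""
    return lambda x: -0.5 * (x - mu) ** 2 - 0.5 * np.log(2.0 * np.pi)
```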

code/H_I_D/utilities/rank_transformation.m

+function [Z] = rank_transformation(X)
+%Computes the rank transformation (Z) of signal X.
+%
+%INPUT:
+%   X: X(:,t) is the t^th sample.
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+[d,num_of_samples] = size(X);
+[temp,J] = sort(X,2);
+Z = zeros(size(J));%preallocation
+
+%I.:
+    for k = 1 : d
+        Z(k,J(k,:)) = [1:num_of_samples]; %inverse of permutation J(k,:)
+    end
+    
+%II.:
+    %I = repmat([1:d].',1,num_of_samples);
+    %Z(I+(J-1)*d) = repmat([1:num_of_samples],d,1);
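The inverse-permutation loop above (variant I; variant II is the vectorized form kept in the comments) assigns each coordinate its within-row rank 1..T. For comparison only, an equivalent sketch in NumPy (illustrative, not part of ITE); dividing the result by T gives the copula transformation used elsewhere in this commit:

```python
import numpy as np

def rank_transformation(X):
    """Row-wise ranks (1..T) of X, where X[:, t] is the t-th sample."""
    J = np.argsort(X, axis=1)                  # sorting permutation per row
    Z = np.empty_like(J)
    rows = np.arange(X.shape[0])[:, None]
    Z[rows, J] = np.arange(1, X.shape[1] + 1)  # invert the permutation
    return Z
```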

code/IPA/optimization/clustering_UD0.m

 %       a)base methods: 'Shannon_kNN_k', 'Renyi_kNN_k', 'Renyi_kNN_1tok', 'Renyi_kNN_S', 'Renyi_weightedkNN', 'Renyi_MST', 'Renyi_GSF'.
 %       b)meta methods: 'complex', 'ensemble', 'RPensemble'.
 %   ii)mutual information:
-%       a)base methods: 'GV', 'KCCA', 'KGV', 'HSIC', 'Hoeffding'.
+%       a)base methods: 'GV', 'KCCA', 'KGV', 'HSIC', 'Hoeffding', 'SW1', 'SWinf'.
 %       b)meta methods: 
 %           -Hilbert transformation based method: 'complex',
 %           -divergence based techniques: 'L2_DL2', 'Renyi_DRenyi', 'Tsallis_DTsallis', 'MMD_DMMD',

code/ITE_install.m

     compile_NCut = 1;%1=compile, 0=do not compile
     compile_ANN = 1;%1=compile, 0=do not compile
     compile_knn = 1;%1=compile, 0=do not compile
-	compile_chol_gauss = 1;%1=compile, 0=do not compile; (not necessary, but can further speed-up the computations; the package also contains the purely Matlab/Octave 'chol_gauss.m')
+	compile_TCA = 1;%1=compile, 0=do not compile (chol_gauss); not necessary, but can further speed-up the computations; the package also contains the purely Matlab/Octave 'chol_gauss.m'
+    compile_SWICA = 1; %1=compile, 0=do not compile; not necessary, but can accelerate computations; the package also contains the purely Matlab/Octave 'SW_kappa.m' and 'SW_sigma.m'
+    compile_Hoeffding_term1 = 1; %1=compile, 0=do not compile; not necessary, but can be more economical in memory use and can accelerate computations; the package also contains the purely Matlab/Octave 'Hoeffding_term1.m'
 	delete_Ncut_in_Octave = 0; %1=yes, 0=no
 
 disp('Installation: started.');
     end
 end
 
-%compile 'chol_gauss', if needed:	
-	if compile_chol_gauss    
+%compile TCA (chol_gauss), if needed:	
+	if compile_TCA
         cd(strcat(ITE_code_dir,'/shared/embedded/TCA'));%cd 'TCA'
-        disp('chol_gauss compilation: started.');
+        disp('TCA (chol_gauss.c) compilation: started.');
         mex chol_gauss.c;
-        disp('chol_gauss compilation: ready.');
-	end
+        disp('TCA (chol_gauss.c) compilation: ready.');
+    end
+    
+%compile SWICA (SW_kappa, SW_sigma), if needed:    
+    if compile_SWICA
+        cd(strcat(ITE_code_dir,'/shared/embedded/SWICA'));%cd 'SWICA'
+        disp('SWICA (SW_kappa.cpp, SW_sigma.cpp) compilation: started.');
+        build_SWICA;
+        disp('SWICA (SW_kappa.cpp, SW_sigma.cpp) compilation: ready.');
+    end
 
+%compile 'Hoeffding_term1.cpp', if needed:
+    if compile_Hoeffding_term1    
+        cd(strcat(ITE_code_dir,'/H_I_D/utilities'));
+        disp('Hoeffding_term1.cpp compilation: started.');
+        mex Hoeffding_term1.cpp;
+        disp('Hoeffding_term1.cpp compilation: ready.');
+    end
+    
 %compile 'knn', if needed:
     if compile_knn%if needed
         cd(strcat(ITE_code_dir,'/shared/embedded/knn'));
-        disp('knn compilation: started.');
+        disp('knn (top.cpp) compilation: started.');
         build; %compile the knn package (top.cpp)
-        disp('knn compilation: ready.');
+        disp('knn (top.cpp) compilation: ready.');
     end	
-
+    
 %change back to the stored working directory:
     cd(pd);
     

code/shared/embedded/NCut/compileDir_simple.m

 names = {'mex_w_times_x_symmetric.cpp', 'sparsifyc.cpp', 'spmtimesd.cpp'};
 for k = 1:length(names)
     cm = sprintf('mex -largeArrayDims %s',names{k});
-    disp(cm);
+    %disp(cm);
     eval(cm);
 end

code/shared/embedded/SWICA/LICENSE

+The license below is the BSD open-source license. See 
+	http://www.opensource.org/licenses/bsd-license.php
+with:
+	<OWNER> = Alberta Ingenuity Centre for Machine Learning, Sergey Kirshner, and Barnabas Poczos
+	<ORGANIZATION> = University of Alberta
+	<YEAR> = 2008
+
+This license allows free redistribution of the SWICA software as a stand-alone application or as a component in the redistributed software, provided the copyright notice and the disclaimer are also included.
+
+
+
+The SWICA License:
+
+Copyright (c) 2008, Alberta Ingenuity Centre for Machine Learning,
+Sergey Kirshner, and Barnabas Poczos
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
+
+    * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
+    * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
+    * Neither the name of the University of Alberta nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

code/shared/embedded/SWICA/README

+The SWICA (Schweizer-Wolff contrast for Independent Component Analysis) algorithm is implemented mostly in Matlab, with some subroutines using Mex/C++.  The Mex/C++ routines are also duplicated in Matlab code, but I STRONGLY suggest compiling the C++ files and using the binaries instead.
+
+List of files:
+SWICA_solve.m -- envelope routine for running the software;
+
+SW_contrast.m -- envelope routine for computing the SW measure of dependence, the contrast used in SWICA;
+
+SW_sigma.m    -- computation of SW sigma measure of dependence using empirical copulas;
+SW_sigma.cpp
+
+SW_kappa.m    -- computation of SW kappa measure of dependence using empirical copulas;
+SW_kappa.cpp
+
+SW_demo.m     -- a simple example of how to use SWICA
+
+
+SWICA is distributed under the BSD license.  See the file LICENSE for details.
+
+If you use the software, please use the following citation:
+Sergey Kirshner, Barnabas Poczos, "ICA and ISA Using Schweizer-Wolff Measure of Dependence," Proceedings of the Twenty-Fifth International Conference on Machine Learning (ICML), 2008.
+
+
+Sergey Kirshner (sergey@cs.ualberta.ca)
+May 7, 2008
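As a rough illustration of the quantity the SW_sigma routines above compute, here is a minimal, unoptimized Python sketch of the empirical Schweizer-Wolff sigma for a single pair of samples. The helper names `sw_sigma_naive` and `_rank` are hypothetical and not part of the package; the shipped SW_sigma.m/SW_sigma.cpp additionally support binning, an early-stopping bound, and all pairs of dimensions at once.

```python
def _rank(v):
    # Ranks 1..N of the entries of v (ties not handled; illustration only).
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def sw_sigma_naive(x, y):
    # Empirical Schweizer-Wolff sigma (L1):
    #   sigma_N = 12/(N^2-1) * sum_{i,j=1}^{N-1} |C_N(i/N, j/N) - i*j/N^2|,
    # where C_N is the empirical copula built from the ranks of x and y.
    N = len(x)
    rx, ry = _rank(x), _rank(y)
    total = 0.0
    for i in range(1, N):
        for j in range(1, N):
            # C_N(i/N, j/N): fraction of points whose ranks fall in the
            # lower-left box [1..i] x [1..j].
            C = sum(1 for a, b in zip(rx, ry) if a <= i and b <= j) / N
            total += abs(C - i * j / N ** 2)
    return 12.0 * total / (N ** 2 - 1)
```

A convenient sanity check: on strictly monotone pairs the empirical sigma works out to exactly 1 for any N, and values always lie in [0, 1].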

code/shared/embedded/SWICA/SW_kappa.cpp

+#include "mex.h"
+
+/*
+ * SW_kappa.cpp - accelerated computation of Schweizer-Wolff kappa (L-inf) measure of dependence 
+ * using empirical copula (given ranks of data).  Approximation is obtained by evaluating the
+ * empirical copula on a sparse regular grid.
+ *
+ * Sergey Kirshner (sergey@cs.ualberta.ca)
+ * May 7, 2008
+ */
+
+#include <math.h>
+#include <stdlib.h>
+
+void mexFunction( int nlhs, mxArray *plhs[],
+                  int nrhs, const mxArray *prhs[])
+{
+  double *r;        /* Input array of ranks */
+  long N,d;         /* Dimensions of the array of ranks */ 
+  long Nbins;       /* Number of bins used */
+  double Nbinsd;  
+  long Cbins;       /* Number of indices in each bin, Nbins*Cbins=N */
+  double Cbinsd;
+  double *kappa;    /* Output array of Schweizer-Wolff kappa */ 
+  long i,j,k,l;     /* Index variables */
+  long *r_sorted;   /* Temporary array of resorted ranks */ 
+  double *ecc;      /* Column of empirical copula */
+  double bound;     /* Upper bound on kappa (used for acceleration) */
+  long *becd,*bec;  /* Binned empirical copula density and copula */
+  long curr_bin_min;/* Index from which to start updating empirical copula column */
+  long temp_index;
+  double Linf;      /* Temporary absolute value of the difference between empirical copula
+		       and product copula */
+
+  /* SW_kappa takes three inputs:
+     r     -- dxN array of ranks for data
+     bound -- (optional, default=1) upper bound on kappa
+     Cbins -- (optional, default=1) number of data points falling between a pair of horizontal or vertical gridpoints
+
+     Output:
+     kappa -- (d-1)x(d-1) array of Schweizer-Wolff kappa dependence measures (for all pairs of dimensions)
+  */
+
+  if( nrhs<1 )
+    mexErrMsgTxt("SW_kappa: Need at least one input argument.");
+  else {
+    /* Getting a pointer to the rank array */
+    r=mxGetPr(prhs[0]);
+    
+    /* Getting the number of rows and columns */
+    d=mxGetM(prhs[0]); /* Number of rows */
+    N=mxGetN(prhs[0]); /* Number of columns */
+
+    if( nrhs>1 ) {
+      /* Getting bound on SW */
+      bound=(double)mxGetScalar(prhs[1]);
+      if( nrhs>2 ) {
+	/* Getting the number of points in each axis bin */
+        Cbins=(long)mxGetScalar(prhs[2]);
+      } else {
+        Cbins=1;
+      }
+    } else {
+      bound=1.0;
+    }
+  }
+  
+  /* Determining the number of evaluation points on the horizontal and vertical
+     axes with empirical copula evaluated at (i*Cbins/N,j*Cbins/N), i,j=1,Nbins. */
+   Nbins=N/Cbins;
+  
+  Nbinsd=(double)Nbins;
+  Cbinsd=(double)Cbins;
+  
+  /* Allocating array of SW kappa */
+  plhs[0]=mxCreateDoubleMatrix(d-1,d-1,mxREAL);
+  kappa=mxGetPr(plhs[0]);
+    
+  /* Rescaling the bound to be used inside the sum over empirical copula values */ 
+  bound/=4.0;
+  bound*=Nbinsd;
+
+  if( Cbins>1 ) {
+  /* Allocating binned empirical copula arrays */
+    becd=new long[Nbins];
+    bec=new long[Nbins];
+  }
+  
+  /* Allocating auxiliary arrays */
+  r_sorted=new long[N];
+  ecc=new double[Nbins-1];
+  
+  for( j=0; j<d-1; j++ ) {
+    for( k=j+1; k<d; k++ ) {
+      /* Considering pair of variables (j,k) */
+
+      /* Arranging indices for dimension k along sorted indices for dimension j */
+      /* Ranges 0..N-1 */
+      for( i=0; i<N; i++ )
+        r_sorted[(long)r[i*d+j]-1]=(long)r[i*d+k]-1;
+
+      /* Zeroing out current empirical copula column */ 
+      for( i=0; i<Nbins-1; i++ )
+        ecc[i]=0.0;
+
+      if( Cbins>1 ) {
+	/* Zeroing out binned empirical copula density column */
+	for( i=0; i<Nbins; i++ )
+	  becd[i]=0;
+      }
+
+      for( i=1; i<Nbins; i++ ) {
+	if( Cbins>1 ) {
+	  /* Initializing starting point for empirical copula column update */
+	  curr_bin_min=Nbins;
+	  
+	  /* Populating binned empirical copula density */
+	  for( l=0; l<Cbins; l++ ) {
+	    /* Determining which row of the binned empirical copula to assign data point
+	       with U-rank (i-1)*Cbins+(l+1) */
+	    temp_index=(r_sorted[(i-1)*Cbins+l])/Cbins;
+	    
+	    /* Updating the column of the binned empirical copula */
+	    becd[temp_index]++;
+	    
+	    /* Recording the index of the lowest row that was changed */
+	    if( temp_index<curr_bin_min )
+	      curr_bin_min=temp_index;
+	  }
+	  
+	  /* Computing current binned empirical copula column update */
+	  bec[curr_bin_min]=becd[curr_bin_min];
+	  for( l=curr_bin_min+1; l<Nbins; l++ )
+	    bec[l]=bec[l-1]+becd[l];
+	}
+
+	if( Cbins>1 ) {
+	  /* Updating current binned empirical copula column */
+	  for( l=curr_bin_min; l<Nbins-1; l++ ) {
+	    ecc[l]+=(double)bec[l];
+	  }
+	} else {
+	  /* Updating current empirical copula column */
+	  for( l=r_sorted[i-1]; l<N-1; l++ ) {
+	    ecc[l]+=1.0;
+	  }
+	}
+	  
+	/* Updating kappa */
+	for( l=1; l<Nbins; l++ ) {
+	  if( Cbins>1 )
+	    Linf=fabs(ecc[l-1]/Cbinsd-(double)i*(double)l/Nbinsd);
+	  else
+	    Linf=fabs(ecc[l-1]-(double)i*(double)l/Nbinsd);
+	  if( kappa[(k-1)*(d-1)+j]<Linf ) {
+	    /* New L-inf value is larger */
+	    kappa[(k-1)*(d-1)+j]=Linf;
+	  }
+	}
+	  
+	if( Cbins>1 ) {
+	  /* De-populating binned empirical copula density (much faster than zeroing out) */
+	  for( l=0; l<Cbins; l++ )
+	    becd[r_sorted[(i-1)*Cbins+l]/Cbins]--;
+	}
+
+	if( kappa[(k-1)*(d-1)+j]>bound )
+	  /* Already exceeded the bound */
+	  /* Stopping */
+	  i=Nbins;
+	}
+      
+      /* Rescaling kappa to fall in [0,1] interval */
+      kappa[(k-1)*(d-1)+j]*=4.0/Nbinsd;
+    }
+  }
+  
+  delete [] r_sorted;
+  delete [] ecc;
+
+  if( Cbins>1 ) {
+    /* Deallocating binned empirical copula arrays */
+    delete [] becd;
+    delete [] bec;
+  }
+  
+  return;
+}
+

code/shared/embedded/SWICA/SW_kappa.m

+function kappa=SW_kappa(r,bound,Cbins);
+%SW_kappa -- sample version of Schweizer-Wolff kappa measure of dependence
+% SW_kappa computes Schweizer-Wolff kappa (L-inf) measure of dependence using
+% empirical copula (given ranks of data). Approximation is obtained by evaluating
+% the empirical copula on a sparse regular grid.
+%
+% SW_kappa accelerates the computation in two possible ways.  One, it
+% provides an option of returning the minimum of the actual SW kappa or the
+% upper bound, saving computation by not evaluating the empirical copula at
+% all N^2 grid points.  Two, if the number of data points is too
+% large, an approximation is obtained by summing the empirical copula on 
+% a sparse regular grid.
+%
+% See Kirshner, S. and Poczos, B., "ICA and ISA Using Schweizer-Wolff Measure 
+% of Dependence," ICML-2008
+% For basic information on empirical copulas, see "Introduction to Copulas",
+% Nelsen 2006, pp. 207-209 and 219.
+% Reference for Schweizer-Wolff sigma and kappa: "On nonparametric measures of
+% dependence for random variables", B. Schweizer, E.F. Wolff, The Annals of
+% Statistics, 1981, v. 9(4), pp. 879-885.
+%
+% INPUTS:
+%   r     -- dxN array of ranks for data
+%   bound -- (optional, default=1) upper bound on kappa
+%   Cbins -- (optional, default=1) number of data points falling in a
+%            horizontal or a vertical cell
+%
+% OUTPUTS:
+%   kappa -- (d-1)x(d-1) upper triangular array of Schweizer-Wolff kappa
+%            measures of dependence (for all pairs of variables)
+%
+%Sergey Kirshner (sergey@cs.ualberta.ca)
+%May 7, 2008
+
+if( nargin<1 )
+  error( 'SW_kappa: Need at least one input argument.' );
+end
+
+if( nargin<3 )
+  Cbins=1;
+  if( nargin<2 )
+    bound=1;
+  end
+end
+
+% Determining the size of data
+[d,N]=size(r);
+  
+% Determining the number of evaluation points on the horizontal and vertical
+% axes with empirical copula evaluated at (i*Cbins/N,j*Cbins/N), i,j=1,Nbins.
+Nbins=N/Cbins;
+    
+% Initializing kappa
+kappa=zeros([d-1 d-1]);    
+
+% Rescaling the bound to be used inside the sum over empirical copula values 
+bound=bound*Nbins/4;
+  
+for j=1:d-1
+  for k=j+1:d
+    % Considering pair of variables (j,k)
+
+    % Arranging indices for dimension k along sorted indices for dimension j
+    for i=1:N
+      r_sorted(r(j,i))=r(k,i);
+    end
+
+    % Empirical copula column
+    ecc=zeros([Nbins 1]);
+
+    if( Cbins>1 )
+      % Binned empirical copula density column
+      becd=zeros([Nbins 1]);
+    end
+
+    for i=1:Nbins-1
+      if( Cbins>1 )
+        % Initializing starting point for empirical copula column update
+        curr_bin_min=Nbins;
+	  
+        % Populating binned empirical copula density
+        for l=1:Cbins
+          % Determining which row of the binned empirical copula to assign
+          % a data point
+          temp_index=ceil(r_sorted((i-1)*Cbins+l)/Cbins);
+	    
+          % Updating the column of the binned empirical copula
+          becd(temp_index)=becd(temp_index)+1;
+	    
+          % Recording the index of the lowest row that was changed
+          if( temp_index<curr_bin_min )
+            curr_bin_min=temp_index;
+          end
+        end
+          
+        % Computing current binned empirical copula column update
+        bec(curr_bin_min:Nbins-1,1)=cumsum(becd(curr_bin_min:Nbins-1));
+      end
+
+      if( Cbins>1 )
+        % Updating current binned empirical copula column
+        ecc(curr_bin_min:Nbins-1)=ecc(curr_bin_min:Nbins-1)+bec(curr_bin_min:Nbins-1);
+      else
+        % Updating current empirical copula column
+        ecc(r_sorted(i):Nbins-1)=ecc(r_sorted(i):Nbins-1)+ones([Nbins-r_sorted(i) 1]);
+      end
+
+      % Updating kappa    
+      if( Cbins>1 )
+        kappa(j,k-1)=max([kappa(j,k-1);abs(ecc(1:Nbins-1)/Cbins-i*[1:Nbins-1]'/Nbins)]);
+      else
+        kappa(j,k-1)=max([kappa(j,k-1);abs(ecc(1:Nbins-1)-i*[1:Nbins-1]'/Nbins)]);
+      end
+      
+      if( Cbins>1 )
+        % Depopulating binned empirical copula density (much faster than
+        % zeroing out)
+        for l=1:Cbins
+          becd(ceil(r_sorted((i-1)*Cbins+l)/Cbins))=becd(ceil(r_sorted((i-1)*Cbins+l)/Cbins))-1;
+        end
+      end
+
+      if( kappa(j,k-1)>bound )
+        % Already exceeded the bound -- stopping
+        break;
+      end
+    end    
+  end
+end
+
+% Rescaling kappa to fall in [0,1] interval
+kappa=4*kappa/Nbins;
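The kappa routine above takes an L-infinity maximum where sigma takes an L1 sum. Stripped of the binning and the early-stopping bound, the quantity can be sketched in Python as follows (`sw_kappa_naive` and `_rank` are hypothetical helper names for illustration, not part of the package):

```python
def _rank(v):
    # Ranks 1..N of the entries of v (ties not handled; illustration only).
    order = sorted(range(len(v)), key=lambda i: v[i])
    r = [0] * len(v)
    for rank, idx in enumerate(order, start=1):
        r[idx] = rank
    return r

def sw_kappa_naive(x, y):
    # Empirical Schweizer-Wolff kappa (L-inf):
    #   kappa_N = 4 * max_{i,j} |C_N(i/N, j/N) - i*j/N^2|,
    # with C_N the empirical copula built from the ranks of x and y.
    N = len(x)
    rx, ry = _rank(x), _rank(y)
    worst = 0.0
    for i in range(1, N):
        for j in range(1, N):
            C = sum(1 for a, b in zip(rx, ry) if a <= i and b <= j) / N
            worst = max(worst, abs(C - i * j / N ** 2))
    return 4.0 * worst
```

For even N, a perfectly monotone sample attains the maximal deviation 1/4 at i = j = N/2, so the empirical kappa is exactly 1 there; values always lie in [0, 1].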

code/shared/embedded/SWICA/SW_sigma.cpp

+#include "mex.h"
+
+/*
+ * SW_sigma.cpp - accelerated computation of Schweizer-Wolff sigma (L1) measure of dependence 
+ * using empirical copula (given ranks of data).  Approximation is obtained by summing the
+ * empirical copula on a sparse regular grid.
+ *
+ * Sergey Kirshner (sergey@cs.ualberta.ca)
+ * April 23, 2008
+ * Modified May 7, 2008
+ */
+
+#include <math.h>
+#include <stdlib.h>
+
+void mexFunction( int nlhs, mxArray *plhs[],
+                  int nrhs, const mxArray *prhs[])
+{
+  double *r;        /* Input array of ranks */
+  long N,d;         /* Dimensions of the array of ranks */ 
+  long Nbins;       /* Number of bins used */
+  double Nbinsd;  
+  long Cbins;       /* Number of indices in each bin, Nbins*Cbins=N */
+  double Cbinsd;
+  double *sigma;    /* Output array of Schweizer-Wolff sigma */ 
+  long i,j,k,l;     /* Index variables */
+  long *r_sorted;   /* Temporary array of resorted ranks */ 
+  double *ecc;      /* Column of empirical copula */
+  double bound;     /* Upper bound on sigma (used for acceleration) */
+  long *becd,*bec;  /* Binned empirical copula density and copula */
+  long curr_bin_min;/* Index from which to start updating empirical copula column */
+  long temp_index;
+
+  /* SW_sigma takes three inputs:
+     r     -- dxN array of ranks for data
+     bound -- (optional, default=1) upper bound on sigma
+     Cbins -- (optional, default=1) number of data points falling between a pair of horizontal or vertical gridpoints
+
+     Output:
+     sigma -- (d-1)x(d-1) array of Schweizer-Wolff sigma dependence measures (for all pairs of dimensions)
+  */
+
+  if( nrhs<1 )
+    mexErrMsgTxt("SW_sigma: Need at least one input argument.");
+  else {
+    /* Getting a pointer to the rank array */
+    r=mxGetPr(prhs[0]);
+    
+    /* Getting the number of rows and columns */
+    d=mxGetM(prhs[0]); /* Number of rows */
+    N=mxGetN(prhs[0]); /* Number of columns */
+
+    if( nrhs>1 ) {
+      /* Getting bound on SW */
+      bound=(double)mxGetScalar(prhs[1]);
+      if( nrhs>2 ) {
+	/* Getting the number of points in each axis bin */
+        Cbins=(long)mxGetScalar(prhs[2]);
+      } else {
+        Cbins=1;
+      }
+    } else {
+      bound=1.0;
+    }
+  }
+  
+  /* Determining the number of evaluation points on the horizontal and vertical 
+     axes with empirical copula evaluated at (i*Cbins/N,j*Cbins/N), i,j=1,Nbins. */
+  Nbins=N/Cbins;
+  
+  Nbinsd=(double)Nbins;
+  Cbinsd=(double)Cbins;
+  
+  /* Allocating array of SW sigma */
+  plhs[0]=mxCreateDoubleMatrix(d-1,d-1,mxREAL);
+  sigma=mxGetPr(plhs[0]);
+    
+  /* Rescaling the bound to be used inside the sum over empirical copula values */ 
+  bound/=12.0;
+  bound*=Nbinsd*(Nbinsd*Nbinsd-1.0);
+
+  if( Cbins>1 ) {
+  /* Allocating binned empirical copula arrays */
+    becd=new long[Nbins];
+    bec=new long[Nbins];
+  }
+  
+  /* Allocating auxiliary arrays */
+  r_sorted=new long[N];
+  ecc=new double[Nbins-1];
+  
+  for( j=0; j<d-1; j++ ) {
+    for( k=j+1; k<d; k++ ) {
+      /* Considering pair of variables (j,k) */
+
+      /* Arranging indices for dimension k along sorted indices for dimension j */
+      /* Ranges 0..N-1 */
+      for( i=0; i<N; i++ )
+        r_sorted[(long)r[i*d+j]-1]=(long)r[i*d+k]-1;
+
+      /* Zeroing out current empirical copula column */ 
+      for( i=0; i<Nbins-1; i++ )
+        ecc[i]=0.0;
+
+      if( Cbins>1 ) {
+	/* Zeroing out binned empirical copula density column */
+	for( i=0; i<Nbins; i++ )
+	  becd[i]=0;
+      }
+
+      for( i=1; i<Nbins; i++ ) {
+	if( Cbins>1 ) {
+	  /* Initializing starting point for empirical copula column update */
+	  curr_bin_min=Nbins;
+	  
+	  /* Populating binned empirical copula density */
+	  for( l=0; l<Cbins; l++ ) {
+	    /* Determining which row of the binned empirical copula to assign data point
+	       with U-rank (i-1)*Cbins+(l+1) */
+	    temp_index=(r_sorted[(i-1)*Cbins+l])/Cbins;
+	    
+	    /* Updating the column of the binned empirical copula */
+	    becd[temp_index]++;
+	    
+	    /* Recording the index of the lowest row that was changed */
+	    if( temp_index<curr_bin_min )
+	      curr_bin_min=temp_index;
+	  }
+	  
+	  /* Computing current binned empirical copula column update */
+	  bec[curr_bin_min]=becd[curr_bin_min];
+	  for( l=curr_bin_min+1; l<Nbins; l++ )
+	    bec[l]=bec[l-1]+becd[l];
+	}
+
+	if( Cbins>1 ) {
+	  /* Updating current binned empirical copula column */
+	  for( l=curr_bin_min; l<Nbins-1; l++ ) {
+	    ecc[l]+=(double)bec[l];
+	  }
+	} else {
+	  /* Updating current empirical copula column */
+	  for( l=r_sorted[i-1]; l<N-1; l++ ) {
+	    ecc[l]+=1.0;
+	  }
+	}
+	  
+	/* Updating sigma */
+	for( l=1; l<Nbins; l++ ) {
+	  if( Cbins>1 )
+	    sigma[(k-1)*(d-1)+j]+=fabs(ecc[l-1]/Cbinsd-(double)i*(double)l/Nbinsd);
+	  else
+	    sigma[(k-1)*(d-1)+j]+=fabs(ecc[l-1]-(double)i*(double)l/Nbinsd);
+	}
+	  
+	if( Cbins>1 ) {
+	  /* De-populating binned empirical copula density (much faster than zeroing out) */
+	  for( l=0; l<Cbins; l++ )
+	    becd[r_sorted[(i-1)*Cbins+l]/Cbins]--;
+	}
+
+	if( sigma[(k-1)*(d-1)+j]>bound )
+	  /* Already exceeded the bound */
+	  /* Stopping */
+	  i=Nbins;
+	}
+      
+      /* Rescaling sigma to fall in [0,1] interval */
+      sigma[(k-1)*(d-1)+j]=12.0*sigma[(k-1)*(d-1)+j]/(Nbinsd*(Nbinsd*Nbinsd-1.0));
+    }
+  }
+  
+  delete [] r_sorted;
+  delete [] ecc;
+
+  if( Cbins>1 ) {
+    /* Deallocating binned empirical copula arrays */
+    delete [] becd;
+    delete [] bec;
+  }
+  
+  return;
+}
+

code/shared/embedded/SWICA/SW_sigma.m

+function sigma=SW_sigma(r,bound,Cbins);
+%SW_sigma -- sample version of Schweizer-Wolff sigma measure of dependence
+% SW_sigma computes Schweizer-Wolff sigma (L1) measure of dependence using
+% empirical copula (given ranks of data). Approximation is obtained by summing
+% the empirical copula on a sparse regular grid.
+%
+% SW_sigma accelerates the computation in two possible ways.  One, it
+% provides an option of returning the minimum of the actual SW sigma or the
+% upper bound, saving computation by not performing the sum over all N^2
+% evaluations of the empirical copula.  Two, if the number of data points is too
+% large, an approximation is obtained by summing the empirical copula on 
+% a sparse regular grid.
+%
+% See Kirshner, S. and Poczos, B., "ICA and ISA Using Schweizer-Wolff Measure 
+% of Dependence," ICML-2008
+% For basic information on empirical copulas, see "Introduction to Copulas",
+% Nelsen 2006, pp. 207-209 and 219.
+% Reference for Schweizer-Wolff sigma: "On nonparametric measures of
+% dependence for random variables", B. Schweizer, E.F. Wolff, The Annals of
+% Statistics, 1981, v. 9(4), pp. 879-885.
+%
+% INPUTS:
+%   r     -- dxN array of ranks for data
+%   bound -- (optional, default=1) upper bound on sigma
+%   Cbins -- (optional, default=1) number of data points falling in a
+%            horizontal or a vertical cell
+%
+% OUTPUTS:
+%   sigma -- (d-1)x(d-1) upper triangular array of Schweizer-Wolff sigma
+%            measures of dependence (for all pairs of variables)
+%
+%Sergey Kirshner (sergey@cs.ualberta.ca)
+%May 7, 2008
+
+if( nargin<1 )
+  error( 'SW_sigma: Need at least one input argument.' );
+end
+
+if( nargin<3 )
+  Cbins=1;
+  if( nargin<2 )
+    bound=1;
+  end
+end
+
+% Determining the size of data
+[d,N]=size(r);
+  
+% Determining the number of evaluation points on the horizontal and vertical
+% axes with empirical copula evaluated at (i*Cbins/N,j*Cbins/N), i,j=1,Nbins.
+Nbins=N/Cbins;
+    
+% Initializing sigma
+sigma=zeros([d-1 d-1]);    
+
+% Rescaling the bound to be used inside the sum over empirical copula values 
+bound=bound*Nbins*(Nbins^2-1)/12;
+  
+for j=1:d-1
+  for k=j+1:d
+    % Considering pair of variables (j,k)
+
+    % Arranging indices for dimension k along sorted indices for dimension j
+    for i=1:N
+      r_sorted(r(j,i))=r(k,i);
+    end
+
+    % Empirical copula column
+    ecc=zeros([Nbins 1]);
+
+    if( Cbins>1 )
+      % Binned empirical copula density column
+      becd=zeros([Nbins 1]);
+    end
+
+    for i=1:Nbins-1
+      if( Cbins>1 )
+        % Initializing starting point for empirical copula column update
+        curr_bin_min=Nbins;
+	  
+        % Populating binned empirical copula density
+        for l=1:Cbins
+          % Determining which row of the binned empirical copula to assign
+          % a data point
+          temp_index=ceil(r_sorted((i-1)*Cbins+l)/Cbins);
+	    
+          % Updating the column of the binned empirical copula
+          becd(temp_index)=becd(temp_index)+1;
+	    
+          % Recording the index of the lowest row that was changed
+          if( temp_index<curr_bin_min )
+            curr_bin_min=temp_index;
+          end
+        end
+          
+        % Computing current binned empirical copula column update
+        bec(curr_bin_min:Nbins-1,1)=cumsum(becd(curr_bin_min:Nbins-1));
+      end
+
+      if( Cbins>1 )
+        % Updating current binned empirical copula column
+        ecc(curr_bin_min:Nbins-1)=ecc(curr_bin_min:Nbins-1)+bec(curr_bin_min:Nbins-1);
+      else
+        % Updating current empirical copula column
+        ecc(r_sorted(i):Nbins-1)=ecc(r_sorted(i):Nbins-1)+ones([Nbins-r_sorted(i) 1]);
+      end
+
+      % Updating sigma    
+      if( Cbins>1 )
+        sigma(j,k-1)=sigma(j,k-1)+sum(abs(ecc(1:Nbins-1)/Cbins-i*[1:Nbins-1]'/Nbins));
+      else
+        sigma(j,k-1)=sigma(j,k-1)+sum(abs(ecc(1:Nbins-1)-i*[1:Nbins-1]'/Nbins));
+      end
+      
+      if( Cbins>1 )
+        % Depopulating binned empirical copula density (much faster than
+        % zeroing out)
+        for l=1:Cbins
+          becd(ceil(r_sorted((i-1)*Cbins+l)/Cbins))=becd(ceil(r_sorted((i-1)*Cbins+l)/Cbins))-1;
+        end
+      end
+
+      if( sigma(j,k-1)>bound )
+        % Already exceeded the bound -- stopping
+        break;
+      end
+    end    
+  end
+end
+
+% Rescaling sigma to fall in [0,1] interval
+sigma=12*sigma/(Nbins*(Nbins^2-1));

code/shared/embedded/SWICA/build_SWICA.m

+mex SW_kappa.cpp;
+mex SW_sigma.cpp;

code/shared/embedded/TCA/chol_gauss.c

 int iter;
 int *pp;
 int nmax;
-double *x, *y, residual;
+double *x, residual;
 
 m = mxGetM(prhs[0]); /* dimension of input space might be greater than 1*/
 n = mxGetN(prhs[0]); /* number of samples */

code/shared/embedded/TCA/chol_gauss.mexw32

Binary file removed.


doc/ITE_documentation.pdf

Binary file modified.
