Zoltan Szabo committed 750bcf2

Energy distance estimation, distance covariance estimation via HSIC, and energy distance estimation via MMD: added; see 'DEnergyDist_initialization.m', 'DEnergyDist_estimation.m', 'IdCov_IHSIC_initialization.m', 'IdCov_IHSIC_estimation.m', 'DEnergyDist_DMMD_initialization.m', 'DEnergyDist_DMMD_estimation.m'. Distance correlation: the estimator previously returned the squared value; sqrt added, see 'IdCor_estimation.m'. The ARfit website is available again: 'ITE_install.m' restored to its original form.

Files changed (10)

+v0.26 (Dec 22, 2012):
+-Distance covariance estimation via HSIC: added; see 'IdCov_IHSIC_initialization.m', 'IdCov_IHSIC_estimation.m'.
+-Energy distance estimation via MMD: added; see 'DEnergyDist_DMMD_initialization.m', 'DEnergyDist_DMMD_estimation.m'.
+-Energy distance estimation: added; see 'DEnergyDist_initialization.m', 'DEnergyDist_estimation.m'.
+-Distance correlation: the squared value was computed previously; sqrt added, see 'IdCor_estimation.m'.
+-The ARfit website is again available: 'ITE_install.m' changed to its original form.
+
 v0.25 (Dec 15, 2012):
 -Distance covariance, distance correlation estimation: added; see 'IdCov_initialization.m', 'IdCov_estimation.m', 'IdCor_initialization.m', 'IdCor_estimation.m'.
 -Temporarily the homepage of the downloaded ARfit website seems to be unavailable. Download link changed to 'http://www.mathworks.com/matlabcentral/fileexchange/174-arfit?download=true'; see 'ITE_install.m'.
 
 - `entropy (H)`: Shannon entropy, Rényi entropy, Tsallis entropy (Havrda and Charvát entropy), complex entropy,
 - `mutual information (I)`: generalized variance, kernel canonical correlation analysis, kernel generalized variance, Hilbert-Schmidt independence criterion, Shannon mutual information, L2 mutual information, Rényi mutual information, Tsallis mutual information, copula-based kernel dependency, multivariate version of Hoeffding's Phi, Schweizer-Wolff's sigma and kappa, complex mutual information, Cauchy-Schwarz quadratic mutual information, Euclidean distance based quadratic mutual information, distance covariance, distance correlation,
-- `divergence (D)`: Kullback-Leibler divergence (relative entropy), L2 divergence, Rényi divergence, Tsallis divergence, Hellinger distance, Bhattacharyya distance, maximum mean discrepancy (kernel distance, an integral probability metric), J-distance (symmetrised Kullback-Leibler divergence), Cauchy-Schwarz divergence, Euclidean distance based divergence,
+- `divergence (D)`: Kullback-Leibler divergence (relative entropy), L2 divergence, Rényi divergence, Tsallis divergence, Hellinger distance, Bhattacharyya distance, maximum mean discrepancy (kernel distance, an integral probability metric), J-distance (symmetrised Kullback-Leibler divergence), Cauchy-Schwarz divergence, Euclidean distance based divergence, energy distance (in particular, the Cramér-von Mises distance),
 - `association measures (A)`, including `measures of concordance`: multivariate extensions of Spearman's rho (Spearman's rank correlation coefficient, grade correlation coefficient),
 - `cross quantities (C)`: cross-entropy.
 
 
 **Download** the latest release: 
 
-- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.25_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.25_code.tar.bz2), 
-- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.25_documentation.pdf).
+- code: [zip](https://bitbucket.org/szzoli/ite/downloads/ITE-0.26_code.zip), [tar.bz2](https://bitbucket.org/szzoli/ite/downloads/ITE-0.26_code.tar.bz2), 
+- [documentation (pdf)](https://bitbucket.org/szzoli/ite/downloads/ITE-0.26_documentation.pdf).
 
 

code/H_I_D_A_C/base_estimators/DEnergyDist_estimation.m

+function [D] = DEnergyDist_estimation(Y1,Y2,co)
+%Estimates the energy distance (D) using pairwise distances of the sample points.
+%
+%We use the naming convention 'D<name>_estimation' to ease embedding new divergence estimation methods.
+%
+%INPUT:
+%  Y1: Y1(:,t) is the t^th sample from the first distribution.
+%  Y2: Y2(:,t) is the t^th sample from the second distribution.
+%  co: divergence estimator object.
+%
+%REFERENCE:
+%   Gabor J. Szekely and Maria L. Rizzo. A new test for multivariate normality. Journal of Multivariate Analysis, 93:58-80, 2005. (metric space of negative type)
+%   Gabor J. Szekely and Maria L. Rizzo. Testing for equal distributions in high dimension. InterStat, 5, 2004. (R^d)
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+%verification:
+    [dY1,num_of_samplesY1] = size(Y1);
+    [dY2,num_of_samplesY2] = size(Y2);
+
+    if dY1~=dY2
+        error('The dimension of the samples in Y1 and Y2 must be equal.');
+    end
+    
+%Euclidean distances (square root of the squared pairwise distances):
+distances_Y1Y1 = sqrt(sqdistance(Y1));
+distances_Y2Y2 = sqrt(sqdistance(Y2));
+distances_Y1Y2 = sqrt(sqdistance(Y1,Y2));
+
+D =  2 * sum(sum(distances_Y1Y2)) / (num_of_samplesY1*num_of_samplesY2) -  sum(sum(distances_Y1Y1)) / (num_of_samplesY1^2) -  sum(sum(distances_Y2Y2)) / (num_of_samplesY2^2);
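
The estimator above can be checked by hand on a tiny example. The following is a minimal standalone sketch, not part of the toolbox: the helper `pdist_cols` is a hypothetical stand-in for the toolbox's `sqdistance` followed by `sqrt`, and the toy samples are chosen so that the result is easy to verify.

```matlab
% Standalone sketch of the energy distance formula (assumed helper names).
Y1 = [0 1];   % two 1-D samples from the first distribution (columns are samples)
Y2 = [2 3];   % two 1-D samples from the second distribution

% Pairwise Euclidean distances between the columns of A and B.
% max(...,0) guards against tiny negative values caused by rounding.
pdist_cols = @(A,B) sqrt(max(bsxfun(@plus, sum(A.^2,1)', sum(B.^2,1)) - 2*(A'*B), 0));

d11 = pdist_cols(Y1,Y1);
d22 = pdist_cols(Y2,Y2);
d12 = pdist_cols(Y1,Y2);

n1 = size(Y1,2);
n2 = size(Y2,2);

% 2*E|X-Y| - E|X-X'| - E|Y-Y'|, with expectations replaced by sample means.
D = 2*sum(d12(:))/(n1*n2) - sum(d11(:))/n1^2 - sum(d22(:))/n2^2;
```

For these samples the cross term is 2*(2+3+1+2)/4 = 4 and each within-sample term is 2/4 = 0.5, so `D` equals 3.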

code/H_I_D_A_C/base_estimators/DEnergyDist_initialization.m

+function [co] = DEnergyDist_initialization(mult)
+%Initialization of the energy distance estimator. The estimation is based on pairwise distances of the sample points.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'D<name>_initialization' to ease embedding new divergence estimation methods.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'EnergyDist';
+    co.mult = mult;
+  

code/H_I_D_A_C/base_estimators/IdCor_estimation.m

 B = compute_dCov_dCor_statistics(Y(ds(1)+1:ds(1)+ds(2),:),co.alpha);
 
 I = sum(sum(A.*B)) / sqrt(sum(sum(A.^2)) * sum(sum(B.^2))); %<A,B> / sqrt(<A,A><B,B>)
+I = sqrt(I);
 
 
 

code/H_I_D_A_C/meta_estimators/DEnergyDist_DMMD_estimation.m

+function [D] = DEnergyDist_DMMD_estimation(Y1,Y2,co)
+%Estimates the energy distance (D) according to the relation: D(f_1,f_2;rho) = 2 [MMD(f_1,f_2;k)]^2, where MMD denotes maximum mean discrepancy and k is a kernel that generates rho, a semimetric of negative type.
+%
+%Note:
+%   1)We use the naming convention 'D<name>_estimation' to ease embedding new divergence estimation methods.
+%   2)This is a meta method: the MMD estimator can be arbitrary.
+%
+%INPUT:
+%  Y1: Y1(:,t) is the t^th sample from the first distribution.
+%  Y2: Y2(:,t) is the t^th sample from the second distribution.
+%  co: divergence estimator object.
+%
+%REFERENCE:
+%   Dino Sejdinovic, Arthur Gretton, Bharath Sriperumbudur, and Kenji Fukumizu. Hypothesis testing using pairwise distances and associated kernels. International Conference on Machine Learning (ICML), pages 1111-1118, 2012. (semimetric space; energy distance <=> MMD, with a suitable kernel)
+%   Russell Lyons. Distance Covariance in metric spaces. Technical report, Indiana University, 2011. http://arxiv.org/abs/1106.5758. (energy distance, metric space of negative type; pre-equivalence to MMD)
+%   Gabor J. Szekely and Maria L. Rizzo. A new test for multivariate normality. Journal of Multivariate Analysis, 93:58-80, 2005. (energy distance; metric space of negative type)
+%   Gabor J. Szekely and Maria L. Rizzo. Testing for equal distributions in high dimension. InterStat, 5, 2004. (energy distance; R^d)
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+D =  2 * ( D_estimation(Y1,Y2,co.member_co) )^2;
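
The relation D = 2 [MMD]^2 used above can be illustrated numerically. The sketch below is an assumption-laden toy check, not the toolbox's code path: it uses 1-D data, the distance-induced kernel k(x,y) = (|x-x0| + |y-x0| - |x-y|)/2 centred at x0 = 0, and the biased (V-statistic) squared MMD, whereas the initialization in this commit defaults to the 'MMD_Ustat' member. All helper names (`dist`, `kern`, `mean_all`) are hypothetical.

```matlab
% Toy check of D = 2*MMD^2 with a distance-induced kernel (assumed setup).
Y1 = [0 1];   % 1-D samples, columns are samples
Y2 = [2 3];

dist = @(A,B) abs(bsxfun(@minus, A', B));                        % pairwise |a_i - b_j|
kern = @(A,B) 0.5*(bsxfun(@plus, abs(A'), abs(B)) - dist(A,B));  % kernel centred at 0
mean_all = @(M) sum(M(:))/numel(M);                              % mean over all matrix entries

% Biased (V-statistic) estimate of the squared MMD.
MMD2 = mean_all(kern(Y1,Y1)) + mean_all(kern(Y2,Y2)) - 2*mean_all(kern(Y1,Y2));

D_via_MMD = 2*MMD2;
D_direct  = 2*mean_all(dist(Y1,Y2)) - mean_all(dist(Y1,Y1)) - mean_all(dist(Y2,Y2));
```

With V-statistics on both sides the two quantities agree exactly (both are 3 for these samples); with a U-statistic MMD estimator the values would generally differ slightly in finite samples.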

code/H_I_D_A_C/meta_estimators/DEnergyDist_DMMD_initialization.m

+function [co] = DEnergyDist_DMMD_initialization(mult)
+%Initialization of the energy distance estimator. The computation is carried out according to the relation: D(f_1,f_2;rho) = 2 [MMD(f_1,f_2;k)]^2, where MMD denotes maximum mean discrepancy and k is a kernel that generates rho, a semimetric of negative type.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'D<name>_initialization' to ease embedding new divergence estimation methods.
+%   3)This is a meta method: the MMD estimator can be arbitrary.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'EnergyDist_DMMD';
+    co.mult = mult;
+    
+%other fields:
+    co.member_name = 'MMD_Ustat'; %you can change it to any MMD estimator
+    co.member_co = D_initialization(co.member_name,mult);

code/H_I_D_A_C/meta_estimators/IdCov_IHSIC_estimation.m

+function [I] = IdCov_IHSIC_estimation(Y,ds,co)
+%Estimates distance covariance based on the formula: [I(y^1,y^2;rho_1,rho_2)]^2 = 4 [HSIC(y^1,y^2;k)]^2, where HSIC stands for the Hilbert-Schmidt independence criterion, y=[y^1;y^2] has density f, each y^i has density f_i, and k=k_1 x k_2, where each k_i generates rho_i, a semimetric of negative type used in distance covariance.
+%
+%Note:
+%   1)We use the naming convention 'I<name>_estimation' to ease embedding new mutual information estimation methods.
+%   2)This is a meta method: the HSIC estimator can be arbitrary.
+%
+%INPUT:
+%   Y: Y(:,t) is the t^th sample.
+%  ds: subspace dimensions.
+%  co: mutual information estimator object.
+%
+%REFERENCE:
+%   Dino Sejdinovic, Arthur Gretton, Bharath Sriperumbudur, and Kenji Fukumizu. Hypothesis testing using pairwise distances and associated kernels. International Conference on Machine Learning (ICML), pages 1111-1118, 2012. (equivalence to HSIC)
+%   Russell Lyons. Distance Covariance in metric spaces. Technical report, Indiana University, 2011. http://arxiv.org/abs/1106.5758. (generalized distance covariance, rho_i; equivalence to HSIC)
+%   Gabor J. Szekely and Maria L. Rizzo. Brownian distance covariance. The Annals of Applied Statistics, 3:1236-1265, 2009. (distance covariance)
+%   Gabor J. Szekely, Maria L. Rizzo, and Nail K. Bakirov. Measuring and testing dependence by correlation of distances. The Annals of Statistics, 35:2769-2794, 2007. (distance covariance)
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%co.mult:OK.
+
+%verification:
+    if sum(ds) ~= size(Y,1)
+        error('The subspace dimensions are not compatible with Y.');
+    end
+    if length(ds)~=2
+        error('There must be two subspaces for this estimator.');
+    end
+
+I = 2 * abs(I_estimation(Y,ds,co.member_co));
+
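
The formula [I]^2 = 4 [HSIC]^2 behind this meta estimator can also be checked on toy data. The sketch below rests on several assumptions that are not taken from the toolbox: 1-D subspaces, distance-induced kernels centred at 0, the V-statistic form of the squared sample distance covariance, and the biased squared HSIC normalised by n^2. The data and the helper names (`dist`, `kern`) are hypothetical.

```matlab
% Toy check of dCov^2 = 4*HSIC^2 with distance-induced kernels (assumed setup).
Y = [0 1 3 6; 1 0 2 5];   % row 1: samples of y^1, row 2: samples of y^2
n = size(Y,2);

dist = @(a) abs(bsxfun(@minus, a', a));                     % pairwise distances, 1-D samples
kern = @(a) 0.5*(bsxfun(@plus, abs(a'), abs(a)) - dist(a)); % kernel generating the distance
H = eye(n) - ones(n)/n;                                     % centering matrix

% Squared sample distance covariance from double-centred distance matrices.
A = H*dist(Y(1,:))*H;
B = H*dist(Y(2,:))*H;
dCov2 = sum(sum(A.*B))/n^2;

% Biased squared HSIC with the product of the two distance-induced kernels.
HSIC2 = trace(kern(Y(1,:))*H*kern(Y(2,:))*H)/n^2;

dCov_from_HSIC = 2*sqrt(abs(HSIC2));   % mirrors I = 2*|HSIC| in the estimator above
```

Under these assumptions `dCov2` and `4*HSIC2` coincide up to rounding, because double centring the kernel matrix gives -1/2 times the double-centred distance matrix.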

code/H_I_D_A_C/meta_estimators/IdCov_IHSIC_initialization.m

+function [co] = IdCov_IHSIC_initialization(mult)
+%Initialization of the distance covariance estimator. The estimation is carried out based on the formula: [I(y^1,y^2;rho_1,rho_2)]^2 = 4 [HSIC(y^1,y^2;k)]^2, where HSIC stands for the Hilbert-Schmidt independence criterion, y=[y^1;y^2] has density f, each y^i has density f_i, and k=k_1 x k_2, where each k_i generates rho_i, a semimetric of negative type used in distance covariance.
+%
+%Note:
+%   1)The estimator is treated as a cost object (co).
+%   2)We use the naming convention 'I<name>_initialization' to ease embedding new mutual information estimation methods.
+%   3)This is a meta method: the HSIC estimator can be arbitrary.
+%
+%INPUT:
+%   mult: is a multiplicative constant relevant (needed) in the estimation; '=1' means yes, '=0' no.
+%OUTPUT:
+%   co: cost object (structure).
+%
+%Copyright (C) 2012 Zoltan Szabo ("http://nipg.inf.elte.hu/szzoli", "szzoli (at) cs (dot) elte (dot) hu")
+%
+%This file is part of the ITE (Information Theoretical Estimators) Matlab/Octave toolbox.
+%
+%ITE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by
+%the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
+%
+%This software is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of
+%MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for more details.
+%
+%You should have received a copy of the GNU General Public License along with ITE. If not, see <http://www.gnu.org/licenses/>.
+
+%mandatory fields:
+    co.name = 'dCov_IHSIC';
+    co.mult = mult;
+	
+%other fields:    
+    co.member_name = 'HSIC'; %you can change it to any HSIC estimator
+    co.member_co = I_initialization(co.member_name,mult);

code/ITE_install.m

 	
 if download_ARfit  %download and extract the ARfit package to '/shared/embedded/ARfit':
     disp('ARfit package: downloading, extraction: started.');
-    %[FN,status] = urlwrite('http://www.gps.caltech.edu/~tapio/arfit/arfit.zip','arfit.zip');%this webpage seems to unavailable temporarily
-    [FN,status] = urlwrite('http://www.mathworks.com/matlabcentral/fileexchange/174-arfit?download=true','arfit.zip');
+    [FN,status] = urlwrite('http://www.gps.caltech.edu/~tapio/arfit/arfit.zip','arfit.zip');
     if status %downloading: successful
         %create 'shared/downloaded/ARfit', if needed:
             if ~exist('shared/downloaded/ARfit')