Snippets

Lilian Besson (Naereen) Python 2/3 script to test kernels, for the Kernel Methods homework.

Updated by Lilian Besson (Naereen)

File test_kernel.py Modified

  • Ignore whitespace
  • Hide word diff
-#! /usr/bin/env python2
+#! /usr/bin/env python
 # -*- coding: utf-8; mode: python -*-
 """
-A Python 2 script to test kernels, for the Kernel Methods homework.
+A Python 2/3 script to test kernels, for the Kernel Methods homework.
 
-Source : http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/
+Source: http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/
 PDF: http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/fichiers/homework_mva_2016.pdf
 
 - *Date:* Saturday 06 February 2016.
Updated by Lilian Besson (Naereen)

File test_kernel.py Modified

  • Ignore whitespace
  • Hide word diff
 # From the slides
 @kernel("SMALLINT")
 def power_two_dot_k(x, y):
-    """ K(x, y) = 2 ** (x + y) """
-    return 2 ** (x + y)
+    """ K(x, y) = 2 ** (x * y) """
+    return 2 ** (x * y)
 
 
 # Here we could have an overflow error, so using small integers
 @kernel("SMALLINT")
 def power_two_sum_k(x, y):
-    """ K(x, y) = 2 ** (x * y) """
-    return 2 ** (x * y)
+    """ K(x, y) = 2 ** (x + y) """
+    return 2 ** (x + y)
 
 
 @kernel("REALPLUS")
Updated by Lilian Besson (Naereen)

File test_kernel.py Modified

  • Ignore whitespace
  • Hide word diff
 @kernel("SMALLINT")
 def power_two_dot_k(x, y):
     """ K(x, y) = 2 ** (x + y) """
-    return 2 ** (x * y)
+    return 2 ** (x + y)
 
 
 # Here we could have an overflow error, so using small integers
Created by Lilian Besson (Naereen)

File test_kernel.output.txt Added

  • Ignore whitespace
  • Hide word diff
+Script to test if functions are kernels, by Lilian Besson (C) 2016
+MIT Licensed (http://lbesson.mit-license.org)
+
+
+Using N = 1000, T = 5.
+For test domains:
+  - IR  ~= [-300.0, 300.0],
+  - IR+ ~= [ 0, 300.0],
+  - IN  ~= [ 0, 1000],
+  - IN  ~= [ 0, 30] (small to reduce overflow),
+Tolerance for small real values = 1e-12.
+Tolerance for success rates     = 0.01.
+And sampling the a_i from uniform in [-50.0, 50.0].
+
+Starting to test 14 kernels...
+
+- Testing the kernel abs_k  K(x, y) = < x, y > = x * y :
+    It has domain = REAL.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel abs_k...
+
+- Testing the kernel one_by_k  K(x, y) = 1 / (1 - x * y) :
+    It has domain = MONEONE.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel one_by_k...
+
+- Testing the kernel power_two_dot_k  K(x, y) = 2 ** (x + y) :
+    It has domain = SMALLINT.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel power_two_dot_k...
+
+- Testing the kernel power_two_sum_k  K(x, y) = 2 ** (x * y) :
+    It has domain = SMALLINT.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel power_two_sum_k...
+
+- Testing the kernel log_k  K(x, y) = log(1 + x * y) :
+    It has domain = REALPLUS.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 96.50%.
+    ===> I guess this kernel log_k  K(x, y) = log(1 + x * y)  is NOT a positive definite kernel !
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel log_k...
+
+- Testing the kernel exp_k  K(x, y) = exp(-(x-y)^2) :
+    It has domain = REAL.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel exp_k...
+
+- Testing the kernel cos_plus_k  K(x, y) = cos(x + y) :
+    It has domain = REAL.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 46.60%.
+    ===> I guess this kernel cos_plus_k  K(x, y) = cos(x + y)  is NOT a positive definite kernel !
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel cos_plus_k...
+
+- Testing the kernel cos_minus_k  K(x, y) = cos(x - y) :
+    It has domain = REAL.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel cos_minus_k...
+
+- Testing the kernel min_k  K(x, y) = min(x, y) :
+    It has domain = REALPLUS.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel min_k...
+
+- Testing the kernel max_k  K(x, y) = max(x, y) :
+    It has domain = REALPLUS.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 78.60%.
+    ===> I guess this kernel max_k  K(x, y) = max(x, y)  is NOT a positive definite kernel !
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel max_k...
+
+- Testing the kernel min_by_max_k  K(x, y) = min(x, y) / max(x, y) :
+    It has domain = REALPLUS.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel min_by_max_k...
+
+- Testing the kernel gcd_k  K(x, y) = gcd(x, y) :
+    It has domain = INTEGER.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 100.00%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel gcd_k...
+
+- Testing the kernel lcm_k  K(x, y) = lcm(x, y) :
+    It has domain = INTEGER.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 58.80%.
+    ===> I guess this kernel lcm_k  K(x, y) = lcm(x, y)  is NOT a positive definite kernel !
+    - Tested for definiteness : success rate 99.95%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel lcm_k...
+
+- Testing the kernel gcd_by_lcm_k  K(x, y) = gcd(x, y) / lcm(x, y) :
+    It has domain = INTEGER.
+  Results of the N = 1000 tests:
+    - Tested for positiveness : success rate 100.00%.
+    - Tested for definiteness : success rate 99.95%.
+    - Tested for   symmetry   : success rate 100.00%.
+  Finished for the kernel gcd_by_lcm_k...
+
+
+Done testing some kernels...
+
+- Apparently, the kernel log_k  K(x, y) = log(1 + x * y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 96.50%.
+
+- Apparently, the kernel cos_plus_k  K(x, y) = cos(x + y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 46.60%.
+
+- Apparently, the kernel max_k  K(x, y) = max(x, y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 78.60%.
+
+- Apparently, the kernel lcm_k  K(x, y) = lcm(x, y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 58.80%.
+
+- All the other kernel appeared to be positive definite. Now prove it.
+
+Finished for this script 'test_kernel.py'.

File test_kernel.py Added

  • Ignore whitespace
  • Hide word diff
+#! /usr/bin/env python2
+# -*- coding: utf-8; mode: python -*-
+"""
+A Python 2 script to test kernels, for the Kernel Methods homework.
+
+Source : http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/
+PDF: http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/fichiers/homework_mva_2016.pdf
+
+- *Date:* Saturday 06 February 2016.
+- *Author:* Lilian Besson, for the MVA Master, (C) 2015-16.
+- *Licence:* MIT Licence (http://lbesson.mit-license.org).
+"""
+
+from __future__ import print_function, division  # Python 2 compatibility if needed
+
+import numpy as np
+
+try:
+    from sympy import lcm, gcd
+except:
+    # if True:
+    print("Error: sympy is not available, switching back to fractions implementation of lcm/gcd")
+    from fractions import gcd
+
+    def lcm(x, y):
+        """ lcm(x, y) : lowest common multiple, = x * y / gcd(x, y). """
+        if gcd(x, y) == 0:
+            return 0
+        else:
+            return (x * y) / gcd(x, y)
+
+
+#: Number of random tests to do
+N = 1000
+#: Size of the sum_ij
+T = 5
+T = 50
+#: Lower bound for random number in IR
+LOWER_BOUND = -300.0
+#: Upper bound for random number in IR
+UPPER_BOUND = 300.0
+#: Upper bound for random number in IN
+UPPER_BOUND_INT = 1000
+#: Small upper bound for random number in IN
+SMALL_UPPER_BOUND_INT = 30
+#: Tolerance for real values tests to zero
+REAL_TOLERANCE = 1e-12
+#: Tolerance for probabilities check to 100%
+PROBA_TOLERANCE = 0.01
+
+#: Standard deviation of the weights
+WEIGHT_STD = 100
+#: Distribution for weights
+# WEIGHT_DISTRIBUTION = "NORMAL"
+# WEIGHT_DISTRIBUTION = "UNIFORM_0_1"  # Useless!
+WEIGHT_DISTRIBUTION = "UNIFORM_-1_1"
+
+#: Map a domain to a random generator on this domain
+random_generators = {
+    "REAL":
+        lambda: np.random.uniform(LOWER_BOUND, UPPER_BOUND),
+    "REALPLUS":
+        lambda: np.random.uniform(0.0, UPPER_BOUND),
+    "INTEGER":
+        lambda: np.random.randint(0, UPPER_BOUND_INT),
+    "SMALLINT":
+        lambda: np.random.randint(0, SMALL_UPPER_BOUND_INT),
+    "MONEONE":
+        lambda: np.random.uniform(-1.0, 1.0)
+}
+
+
+def test_kernel(K, N=N, T=T, verb=False):
+    """
+    Test if a function is a kernel on lots of random values (500 by default):
+
+     - test if K(x, y) = K(y, x) (symmetry),
+     - test if K(x, x) = 0 ==> x = 0 (definiteness),
+     - test if sum_ij a_i a_j K(x_i, x_j) >= 0 for 'all' t, and a_i in IR, x_i in DOMAIN (positiveness) (for all i,j in range(t), and t drawn in range(T) randomly).
+
+    Return numerical values, average of success rate on N = 500 tests.
+    """
+    # Get the domain of K and an appropriate random generator.
+    domain = K.domain
+    newvalue = random_generators[domain]
+    # Random generator for weights
+    def newweights(t):
+        if WEIGHT_DISTRIBUTION == "UNIFORM_0_1":
+            return np.random.rand(t) * WEIGHT_STD
+        if WEIGHT_DISTRIBUTION == "UNIFORM_-1_1":
+            return (2*np.random.rand(t) - 1) * (WEIGHT_STD / 2)
+        if WEIGHT_DISTRIBUTION == "NORMAL":
+            return WEIGHT_STD * np.random.randn(t)
+    # Test for symmetry
+    prop_sym = 0
+    # Test for definiteness
+    prop_def = 0
+    # Test for positiveness
+    prop_pos = 0
+    for i in range(N):
+        x = newvalue()
+        y = newvalue()
+        if verb:
+            print("       - Random values x = {}, y = {}.".format(x, y))
+        if K(x, y) == K(y, x):
+            prop_sym += 1
+        # Test : (A ==> B)  <=>  (B or not(A))
+        if (not (abs(K(x, x)) < REAL_TOLERANCE)) or (abs(x) > REAL_TOLERANCE):
+            prop_def += 1
+        if (not (abs(K(y, y)) < REAL_TOLERANCE)) or (abs(y) > REAL_TOLERANCE):
+            prop_def += 1
+        # Generate lots of samples for checking positiveness
+        t = np.random.randint(1, T)
+        X = [newvalue() for i in range(t)]
+        a = newweights(t)
+        if verb:
+            print("       - Random values t = {}, a = {}, X = {}.".format(t, a, X))
+        if sum(a[i] * a[j] * K(X[i], X[j]) for i in range(t) for j in range(t)) >= 0:
+            prop_pos += 1
+    if verb:
+        print("     - After {} random tests, sym = {:>3}, def = {:>3}, pos = {:>3}.".format(N, prop_sym, prop_def, prop_pos))
+    # Return the results
+    return {
+        "symmetry":     (prop_sym / N),
+        "definiteness": (prop_def / (2*N)),
+        "positiveness": (prop_pos / N),
+    }
+
+
+# %% Tweak to define a new kernel
+
+
+#: List of all kernels to test, dynamically created by @kernel decorator
+all_kernels = []
+
+
+def kernel(domain="REAL"):
+    """
+    Decorator for a new kernel (just an elegant way to add it to all_kernels).
+
+    The parameter 'domain' can be:
+
+     - "REAL"      for (-oo,+oo) = IR   real number,
+     - "REALPLUS"  for [0,+oo)   = IR+  positive real number,
+     - "INTEGER"   for [|0,+oo)  = IN   positive integer,
+     - "SMALLINT"  for [|0,+oo)  = IN   (small) positive integer,
+     - "MONEONE"   for (-1,+1)   = UR   the unit open disk of real values.
+    """
+    # global all_kernels
+    def kernel_decorator(func):
+        """
+        Decorator for a new kernel, with a certain domain.
+        """
+        global all_kernels
+        # Add a new kernel !
+        all_kernels += [func]
+        func.domain = domain
+        return func
+    return kernel_decorator
+
+
+# %% List of Kernels
+
+
+# Classical example of kernel: linear kernel!
+@kernel("REAL")
+def abs_k(x, y):
+    """ K(x, y) = < x, y > = x * y """
+    return x * y
+
+
+@kernel("MONEONE")
+def one_by_k(x, y):
+    """ K(x, y) = 1 / (1 - x * y) """
+    return 1 / (1 - x * y)
+
+
+# Here we could have an overflow error, so using small integers
+# From the slides
+@kernel("SMALLINT")
+def power_two_dot_k(x, y):
+    """ K(x, y) = 2 ** (x + y) """
+    return 2 ** (x * y)
+
+
+# Here we could have an overflow error, so using small integers
+@kernel("SMALLINT")
+def power_two_sum_k(x, y):
+    """ K(x, y) = 2 ** (x * y) """
+    return 2 ** (x * y)
+
+
+@kernel("REALPLUS")
+def log_k(x, y):
+    """ K(x, y) = log(1 + x * y) """
+    return np.log(1 + x * y)
+
+
+@kernel("REAL")
+def exp_k(x, y):
+    """ K(x, y) = exp(-(x-y)^2) """
+    return np.exp(-(x-y)**2)
+
+
+@kernel("REAL")
+def cos_plus_k(x, y):
+    """ K(x, y) = cos(x + y) """
+    return np.cos(x + y)
+
+
+@kernel("REAL")
+def cos_minus_k(x, y):
+    """ K(x, y) = cos(x - y) """
+    return np.cos(x - y)
+
+
+@kernel("REALPLUS")
+def min_k(x, y):
+    """ K(x, y) = min(x, y) """
+    return min(x, y)
+
+
+@kernel("REALPLUS")
+def max_k(x, y):
+    """ K(x, y) = max(x, y) """
+    return max(x, y)
+
+
+@kernel("REALPLUS")
+def min_by_max_k(x, y):
+    """ K(x, y) = min(x, y) / max(x, y) """
+    if abs(max(x, y)) < REAL_TOLERANCE:
+        return 0.0
+    else:
+        return min(x, y) / max(x, y)
+
+
+@kernel("INTEGER")
+def gcd_k(x, y):
+    """ K(x, y) = gcd(x, y) """
+    return gcd(x, y)
+
+
+@kernel("INTEGER")
+def lcm_k(x, y):
+    """ K(x, y) = lcm(x, y) """
+    return lcm(x, y)
+
+
+@kernel("INTEGER")
+def gcd_by_lcm_k(x, y):
+    """ K(x, y) = gcd(x, y) / lcm(x, y) """
+    if abs(lcm(x, y)) < REAL_TOLERANCE:
+        return 0.0
+    else:
+        return gcd(x, y) / lcm(x, y)
+
+
+# %% Testing all the kernels
+
+def main(kernels_to_test=all_kernels, T=T, verb=False):
+    """
+    Test a lot of different kernels.
+    """
+    nb = len(all_kernels)
+    print("\nStarting to test {} kernels...".format(nb))
+    # Store results
+    results = [1] * nb
+    not_kernels = []
+    # Start
+    for i, K in enumerate(all_kernels):
+        print("\n- Testing the kernel {} {}:".format(K.__name__, K.__doc__))
+        print("    It has domain = {}.".format(K.domain))
+        result = test_kernel(K, N=N, T=T, verb=verb)
+        print("  Results of the N = {} tests:".format(N))
+        # bad_test = None
+        # min_rate = None
+        for test, rate in result.items():
+            print("    - Tested for {:^12} : success rate {:.2%}.".format(test, rate))
+            if rate < 1.0 - PROBA_TOLERANCE:
+                print("    ===> I guess this kernel {} {} is NOT a positive definite kernel !".format(K.__name__, K.__doc__))
+                not_kernels.append({'K': K, 'test': test, 'rate': rate})
+                # bad_test, min_rate = test, rate
+        print("  Finished for the kernel {}...".format(K.__name__))
+        results[i] = result
+    print("\n\nDone testing some kernels...")
+    return results, not_kernels
+
+
+if __name__ == '__main__':
+    print("\nScript to test if functions are kernels, by Lilian Besson (C) 2016")
+    print("MIT Licensed (http://lbesson.mit-license.org)")
+    print("\n\nUsing N = {}, T = {}.".format(N, T))
+    print("For test domains:")
+    print("  - IR  ~= [{}, {}],".format(LOWER_BOUND, UPPER_BOUND))
+    print("  - IR+ ~= [ 0, {}],".format(UPPER_BOUND))
+    print("  - IN  ~= [ 0, {}],".format(UPPER_BOUND_INT))
+    print("  - IN  ~= [ 0, {}] (small to reduce overflow),".format(SMALL_UPPER_BOUND_INT))
+    print("Tolerance for small real values = {}.".format(REAL_TOLERANCE))
+    print("Tolerance for success rates     = {}.".format(PROBA_TOLERANCE))
+    if WEIGHT_DISTRIBUTION == "UNIFORM_0_1":
+        print("And sampling the a_i from uniform in [0, {}].".format(WEIGHT_STD))
+    if WEIGHT_DISTRIBUTION == "UNIFORM_-1_1":
+        print("And sampling the a_i from uniform in [-{}, {}].".format(WEIGHT_STD/2, WEIGHT_STD/2))
+    if WEIGHT_DISTRIBUTION == "NORMAL":
+        print("And sampling the a_i from a Gaussian of mean 0 and std {}.".format(WEIGHT_STD))
+    # all_kernels = [absk]
+    # results, not_kernels = main(kernels_to_test=all_kernels, verb=True)
+    results, not_kernels = main()
+    for item in not_kernels:
+        K = item['K']
+        test = item['test']
+        rate = item['rate']
+        print("\n- Apparently, the kernel {} {} is NOT a positive definite kernel : because {} was only observed with probability {:.2%}.".format(K.__name__, K.__doc__, test, rate))
+    print("\n- All the other kernel appeared to be positive definite. Now prove it.")
+    print("\nFinished for this script 'test_kernel.py'.")
+
+# End of test_kernel.py
HTTPS SSH

You can clone a snippet to your computer for local editing. Learn more.