Created by Lilian Besson (Naereen) last modified 2016-06-02

test_kernel.output.txt

Script to test if functions are kernels, by Lilian Besson (C) 2016
MIT Licensed (http://lbesson.mit-license.org)


Using N = 1000, T = 5.
For test domains:
  - IR  ~= [-300.0, 300.0],
  - IR+ ~= [ 0, 300.0],
  - IN  ~= [ 0, 1000],
  - IN  ~= [ 0, 30] (small to reduce overflow),
Tolerance for small real values = 1e-12.
Tolerance for success rates     = 0.01.
And sampling the a_i from uniform in [-50.0, 50.0].

Starting to test 14 kernels...

- Testing the kernel abs_k  K(x, y) = < x, y > = x * y :
    It has domain = REAL.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel abs_k...

- Testing the kernel one_by_k  K(x, y) = 1 / (1 - x * y) :
    It has domain = MONEONE.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel one_by_k...

- Testing the kernel power_two_dot_k  K(x, y) = 2 ** (x + y) :
    It has domain = SMALLINT.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel power_two_dot_k...

- Testing the kernel power_two_sum_k  K(x, y) = 2 ** (x * y) :
    It has domain = SMALLINT.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel power_two_sum_k...

- Testing the kernel log_k  K(x, y) = log(1 + x * y) :
    It has domain = REALPLUS.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 96.50%.
    ===> I guess this kernel log_k  K(x, y) = log(1 + x * y)  is NOT a positive definite kernel !
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel log_k...

- Testing the kernel exp_k  K(x, y) = exp(-(x-y)^2) :
    It has domain = REAL.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel exp_k...

- Testing the kernel cos_plus_k  K(x, y) = cos(x + y) :
    It has domain = REAL.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 46.60%.
    ===> I guess this kernel cos_plus_k  K(x, y) = cos(x + y)  is NOT a positive definite kernel !
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel cos_plus_k...

- Testing the kernel cos_minus_k  K(x, y) = cos(x - y) :
    It has domain = REAL.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel cos_minus_k...

- Testing the kernel min_k  K(x, y) = min(x, y) :
    It has domain = REALPLUS.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel min_k...

- Testing the kernel max_k  K(x, y) = max(x, y) :
    It has domain = REALPLUS.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 78.60%.
    ===> I guess this kernel max_k  K(x, y) = max(x, y)  is NOT a positive definite kernel !
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel max_k...

- Testing the kernel min_by_max_k  K(x, y) = min(x, y) / max(x, y) :
    It has domain = REALPLUS.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel min_by_max_k...

- Testing the kernel gcd_k  K(x, y) = gcd(x, y) :
    It has domain = INTEGER.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 100.00%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel gcd_k...

- Testing the kernel lcm_k  K(x, y) = lcm(x, y) :
    It has domain = INTEGER.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 58.80%.
    ===> I guess this kernel lcm_k  K(x, y) = lcm(x, y)  is NOT a positive definite kernel !
    - Tested for definiteness : success rate 99.95%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel lcm_k...

- Testing the kernel gcd_by_lcm_k  K(x, y) = gcd(x, y) / lcm(x, y) :
    It has domain = INTEGER.
  Results of the N = 1000 tests:
    - Tested for positiveness : success rate 100.00%.
    - Tested for definiteness : success rate 99.95%.
    - Tested for   symmetry   : success rate 100.00%.
  Finished for the kernel gcd_by_lcm_k...


Done testing some kernels...

- Apparently, the kernel log_k  K(x, y) = log(1 + x * y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 96.50%.

- Apparently, the kernel cos_plus_k  K(x, y) = cos(x + y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 46.60%.

- Apparently, the kernel max_k  K(x, y) = max(x, y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 78.60%.

- Apparently, the kernel lcm_k  K(x, y) = lcm(x, y)  is NOT a positive definite kernel : because positivenesswas only observed with probability 58.80%.

- All the other kernel appeared to be positive definite. Now prove it.

Finished for this script 'test_kernel.py'.

test_kernel.py

      #! /usr/bin/env python
# -*- coding: utf-8; mode: python -*-
"""
A Python 2/3 script to test kernels, for the Kernel Methods homework.

Source: http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/
PDF: http://lear.inrialpes.fr/people/mairal/teaching/2015-2016/MVA/fichiers/homework_mva_2016.pdf

- *Date:* Saturday 06 February 2016.
- *Author:* Lilian Besson, for the MVA Master, (C) 2015-16.
- *Licence:* MIT Licence (http://lbesson.mit-license.org).
"""

from __future__ import print_function, division  # Python 2 compatibility if needed

import numpy as np

try:
    from sympy import lcm, gcd
except:
    # if True:
    print("Error: sympy is not available, switching back to fractions implementation of lcm/gcd")
    from fractions import gcd

    def lcm(x, y):
        """ lcm(x, y) : lowest common multiple, = x * y / gcd(x, y). """
        if gcd(x, y) == 0:
            return 0
        else:
            return (x * y) / gcd(x, y)


#: Number of random tests to do
N = 1000
#: Size of the sum_ij
T = 5
T = 50
#: Lower bound for random number in IR
LOWER_BOUND = -300.0
#: Upper bound for random number in IR
UPPER_BOUND = 300.0
#: Upper bound for random number in IN
UPPER_BOUND_INT = 1000
#: Small upper bound for random number in IN
SMALL_UPPER_BOUND_INT = 30
#: Tolerance for real values tests to zero
REAL_TOLERANCE = 1e-12
#: Tolerance for probabilities check to 100%
PROBA_TOLERANCE = 0.01

#: Standard deviation of the weights
WEIGHT_STD = 100
#: Distribution for weights
# WEIGHT_DISTRIBUTION = "NORMAL"
# WEIGHT_DISTRIBUTION = "UNIFORM_0_1"  # Useless!
WEIGHT_DISTRIBUTION = "UNIFORM_-1_1"

#: Map a domain to a random generator on this domain
random_generators = {
    "REAL":
        lambda: np.random.uniform(LOWER_BOUND, UPPER_BOUND),
    "REALPLUS":
        lambda: np.random.uniform(0.0, UPPER_BOUND),
    "INTEGER":
        lambda: np.random.randint(0, UPPER_BOUND_INT),
    "SMALLINT":
        lambda: np.random.randint(0, SMALL_UPPER_BOUND_INT),
    "MONEONE":
        lambda: np.random.uniform(-1.0, 1.0)
}


def test_kernel(K, N=N, T=T, verb=False):
    """
    Test if a function is a kernel on lots of random values (500 by default):

     - test if K(x, y) = K(y, x) (symmetry),
     - test if K(x, x) = 0 ==> x = 0 (definiteness),
     - test if sum_ij a_i a_j K(x_i, x_j) >= 0 for 'all' t, and a_i in IR, x_i in DOMAIN (positiveness) (for all i,j in range(t), and t drawn in range(T) randomly).

    Return numerical values, average of success rate on N = 500 tests.
    """
    # Get the domain of K and an appropriate random generator.
    domain = K.domain
    newvalue = random_generators[domain]
    # Random generator for weights
    def newweights(t):
        if WEIGHT_DISTRIBUTION == "UNIFORM_0_1":
            return np.random.rand(t) * WEIGHT_STD
        if WEIGHT_DISTRIBUTION == "UNIFORM_-1_1":
            return (2*np.random.rand(t) - 1) * (WEIGHT_STD / 2)
        if WEIGHT_DISTRIBUTION == "NORMAL":
            return WEIGHT_STD * np.random.randn(t)
    # Test for symmetry
    prop_sym = 0
    # Test for definiteness
    prop_def = 0
    # Test for positiveness
    prop_pos = 0
    for i in range(N):
        x = newvalue()
        y = newvalue()
        if verb:
            print("       - Random values x = {}, y = {}.".format(x, y))
        if K(x, y) == K(y, x):
            prop_sym += 1
        # Test : (A ==> B)  <=>  (B or not(A))
        if (not (abs(K(x, x)) < REAL_TOLERANCE)) or (abs(x) > REAL_TOLERANCE):
            prop_def += 1
        if (not (abs(K(y, y)) < REAL_TOLERANCE)) or (abs(y) > REAL_TOLERANCE):
            prop_def += 1
        # Generate lots of samples for checking positiveness
        t = np.random.randint(1, T)
        X = [newvalue() for i in range(t)]
        a = newweights(t)
        if verb:
            print("       - Random values t = {}, a = {}, X = {}.".format(t, a, X))
        if sum(a[i] * a[j] * K(X[i], X[j]) for i in range(t) for j in range(t)) >= 0:
            prop_pos += 1
    if verb:
        print("     - After {} random tests, sym = {:>3}, def = {:>3}, pos = {:>3}.".format(N, prop_sym, prop_def, prop_pos))
    # Return the results
    return {
        "symmetry":     (prop_sym / N),
        "definiteness": (prop_def / (2*N)),
        "positiveness": (prop_pos / N),
    }


# %% Tweak to define a new kernel


#: List of all kernels to test, dynamically created by @kernel decorator
all_kernels = []


def kernel(domain="REAL"):
    """
    Decorator for a new kernel (just an elegant way to add it to all_kernels).

    The parameter 'domain' can be:

     - "REAL"      for (-oo,+oo) = IR   real number,
     - "REALPLUS"  for [0,+oo)   = IR+  positive real number,
     - "INTEGER"   for [|0,+oo)  = IN   positive integer,
     - "SMALLINT"  for [|0,+oo)  = IN   (small) positive integer,
     - "MONEONE"   for (-1,+1)   = UR   the unit open disk of real values.
    """
    # global all_kernels
    def kernel_decorator(func):
        """
        Decorator for a new kernel, with a certain domain.
        """
        global all_kernels
        # Add a new kernel !
        all_kernels += [func]
        func.domain = domain
        return func
    return kernel_decorator


# %% List of Kernels


# Classical example of kernel: linear kernel!
@kernel("REAL")
def abs_k(x, y):
    """ K(x, y) = < x, y > = x * y """
    return x * y


@kernel("MONEONE")
def one_by_k(x, y):
    """ K(x, y) = 1 / (1 - x * y) """
    return 1 / (1 - x * y)


# Here we could have an overflow error, so using small integers
# From the slides
@kernel("SMALLINT")
def power_two_dot_k(x, y):
    """ K(x, y) = 2 ** (x * y) """
    return 2 ** (x * y)


# Here we could have an overflow error, so using small integers
@kernel("SMALLINT")
def power_two_sum_k(x, y):
    """ K(x, y) = 2 ** (x + y) """
    return 2 ** (x + y)


@kernel("REALPLUS")
def log_k(x, y):
    """ K(x, y) = log(1 + x * y) """
    return np.log(1 + x * y)


@kernel("REAL")
def exp_k(x, y):
    """ K(x, y) = exp(-(x-y)^2) """
    return np.exp(-(x-y)**2)


@kernel("REAL")
def cos_plus_k(x, y):
    """ K(x, y) = cos(x + y) """
    return np.cos(x + y)


@kernel("REAL")
def cos_minus_k(x, y):
    """ K(x, y) = cos(x - y) """
    return np.cos(x - y)


@kernel("REALPLUS")
def min_k(x, y):
    """ K(x, y) = min(x, y) """
    return min(x, y)


@kernel("REALPLUS")
def max_k(x, y):
    """ K(x, y) = max(x, y) """
    return max(x, y)


@kernel("REALPLUS")
def min_by_max_k(x, y):
    """ K(x, y) = min(x, y) / max(x, y) """
    if abs(max(x, y)) < REAL_TOLERANCE:
        return 0.0
    else:
        return min(x, y) / max(x, y)


@kernel("INTEGER")
def gcd_k(x, y):
    """ K(x, y) = gcd(x, y) """
    return gcd(x, y)


@kernel("INTEGER")
def lcm_k(x, y):
    """ K(x, y) = lcm(x, y) """
    return lcm(x, y)


@kernel("INTEGER")
def gcd_by_lcm_k(x, y):
    """ K(x, y) = gcd(x, y) / lcm(x, y) """
    if abs(lcm(x, y)) < REAL_TOLERANCE:
        return 0.0
    else:
        return gcd(x, y) / lcm(x, y)


# %% Testing all the kernels

def main(kernels_to_test=all_kernels, T=T, verb=False):
    """
    Test a lot of different kernels.
    """
    nb = len(all_kernels)
    print("\nStarting to test {} kernels...".format(nb))
    # Store results
    results = [1] * nb
    not_kernels = []
    # Start
    for i, K in enumerate(all_kernels):
        print("\n- Testing the kernel {} {}:".format(K.__name__, K.__doc__))
        print("    It has domain = {}.".format(K.domain))
        result = test_kernel(K, N=N, T=T, verb=verb)
        print("  Results of the N = {} tests:".format(N))
        # bad_test = None
        # min_rate = None
        for test, rate in result.items():
            print("    - Tested for {:^12} : success rate {:.2%}.".format(test, rate))
            if rate < 1.0 - PROBA_TOLERANCE:
                print("    ===> I guess this kernel {} {} is NOT a positive definite kernel !".format(K.__name__, K.__doc__))
                not_kernels.append({'K': K, 'test': test, 'rate': rate})
                # bad_test, min_rate = test, rate
        print("  Finished for the kernel {}...".format(K.__name__))
        results[i] = result
    print("\n\nDone testing some kernels...")
    return results, not_kernels


if __name__ == '__main__':
    print("\nScript to test if functions are kernels, by Lilian Besson (C) 2016")
    print("MIT Licensed (http://lbesson.mit-license.org)")
    print("\n\nUsing N = {}, T = {}.".format(N, T))
    print("For test domains:")
    print("  - IR  ~= [{}, {}],".format(LOWER_BOUND, UPPER_BOUND))
    print("  - IR+ ~= [ 0, {}],".format(UPPER_BOUND))
    print("  - IN  ~= [ 0, {}],".format(UPPER_BOUND_INT))
    print("  - IN  ~= [ 0, {}] (small to reduce overflow),".format(SMALL_UPPER_BOUND_INT))
    print("Tolerance for small real values = {}.".format(REAL_TOLERANCE))
    print("Tolerance for success rates     = {}.".format(PROBA_TOLERANCE))
    if WEIGHT_DISTRIBUTION == "UNIFORM_0_1":
        print("And sampling the a_i from uniform in [0, {}].".format(WEIGHT_STD))
    if WEIGHT_DISTRIBUTION == "UNIFORM_-1_1":
        print("And sampling the a_i from uniform in [-{}, {}].".format(WEIGHT_STD/2, WEIGHT_STD/2))
    if WEIGHT_DISTRIBUTION == "NORMAL":
        print("And sampling the a_i from a Gaussian of mean 0 and std {}.".format(WEIGHT_STD))
    # all_kernels = [absk]
    # results, not_kernels = main(kernels_to_test=all_kernels, verb=True)
    results, not_kernels = main()
    for item in not_kernels:
        K = item['K']
        test = item['test']
        rate = item['rate']
        print("\n- Apparently, the kernel {} {} is NOT a positive definite kernel : because {} was only observed with probability {:.2%}.".format(K.__name__, K.__doc__, test, rate))
    print("\n- All the other kernel appeared to be positive definite. Now prove it.")
    print("\nFinished for this script 'test_kernel.py'.")

# End of test_kernel.py

    

Comments (3)

HTTPS

SSH

You can clone a snippet to your computer for local editing. Learn more.