
Results

MNIST - 6000 Training-Samples / 10000 Test-Samples (Multi-Task: Autoencoder+Classifier)

| Network | Config | Results | Error | Run | Comment |
| --- | --- | --- | --- | --- | --- |
| (Model1) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.66% | model1 | Classifier only |
| (Model2) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.63% | model2 | Classifier only + Hidden-Noise |
| (Model3) Image=Per-Pixel Mean Subtraction, Code=GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 5.65% | model3 | Classifier only + Input-Noise |
| (Model4) Image=Per-Pixel Mean Subtraction, Code=GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Dense(Image, 1000)+GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 5.56% | model4 | Classifier only + Input-Noise + Hidden-Noise |
| (Model5) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.4 AE, 0.6 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.90% | model5 | Classifier + AE |
| (Model6) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.4 AE, 0.6 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.83% | model5 | Classifier + AE + Hidden-Noise |
| (Model7) Image=Per-Pixel Mean Subtraction, Code=GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.4 AE, 0.6 Class), LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 120 Epochs | result_classification.txt | 5.67% | model5 | Classifier + AE + Input-Noise |
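
All seven models share one structure: a dense code layer of 1000 sigmoid units fed by the mean-subtracted image, a dense 28x28x1 reconstruction head, a 10-way softmax classification head, and Gaussian noise that is applied only during training, either before the code layer (input noise), after it (hidden noise), or both. The two losses are mixed with the weights given in the Config column. A minimal Keras sketch of that structure follows; the layer sizes and loss weights are taken from the table, while the optimizer choice and the rest of the training loop are illustrative assumptions rather than the project's actual code.

```python
import tensorflow as tf

def build_multitask_mnist(ae_weight=0.4, cls_weight=0.6, noise_std=0.1):
    """Sketch of the Model7-style setup: shared code, AE head + classifier head."""
    image = tf.keras.Input(shape=(28, 28, 1), name="image")
    flat = tf.keras.layers.Flatten()(image)

    # Input noise (Model3/4/7); GaussianNoise layers are active only in training.
    noisy = tf.keras.layers.GaussianNoise(noise_std)(flat)

    # Code layer: Dense(1000) + Sigmoid, as listed in the Network column.
    code = tf.keras.layers.Dense(1000, activation="sigmoid")(noisy)

    # Reconstruction head (28x28x1) and classification head (10 classes).
    recon = tf.keras.layers.Dense(28 * 28 * 1, name="reconstruction")(code)
    probs = tf.keras.layers.Dense(10, activation="softmax", name="classifier")(code)

    model = tf.keras.Model(image, {"reconstruction": recon, "classifier": probs})
    model.compile(
        # LR 0.001 is from the table; the optimizer type itself is an assumption.
        optimizer=tf.keras.optimizers.Adam(1e-3),
        loss={"reconstruction": "mse",
              "classifier": "sparse_categorical_crossentropy"},
        loss_weights={"reconstruction": ae_weight, "classifier": cls_weight},
    )
    return model
```

With `ae_weight=0.0, cls_weight=1.0` the same sketch reduces to the classifier-only runs (Model1 to Model4).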

MNIST - reduced dataset-size (Autoencoder)

| Network | Config | Results | Error | Run | Comment |
| --- | --- | --- | --- | --- | --- |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 0 Autoencoder-States + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 150 Epochs | result_classification.txt | 1.40% | reduced_dataset/0_ae_1_classifier | Time: 2:01h |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (without noise) + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt, result_classification.txt | 1.42% | reduced_dataset/1_ae_1_classifier | Time: 3:30h |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (with noise) + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt, result_classification.txt | 1.35% | 1_ae_noise_1_classifier | Time: 3:40h |

MNIST (Autoencoder)

| Network | Config | Results | Error | Run | Comment |
| --- | --- | --- | --- | --- | --- |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 0 Autoencoder-States + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_classification.txt | 0.76% | 0-ae-states_1-classifier-state | Time: 1:30h |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 4 Autoencoder-States + 2 Classifier-States using target-loss only, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt, result_classification.txt | 1.52% | 4-ae-states_2-classifier-states_target-loss | Time: 2:30h |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State + 2 Classifier-States using target-loss only, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt, result_classification.txt | 1.17% | 1-ae-state_2-classifier-states_target-loss | Time: 1:57h |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (with noise) + 1 Classifier-State using reconstruction-loss only, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt, result_classification.txt | 0.69% | 1-ae-state_1-classifier-states_target-loss | Time: 2:05h |
| Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (without noise) + 1 Classifier-State using reconstruction-loss only, LR: 0.001, LR-Decay: 0.5/15 Epoch, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt, result_classification.txt | 0.72% | 1-ae-state_without_noise_1-classifier-state_recon_loss | Time: 2:05h |
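
The Autoencoder-State / Classifier-State wording in these tables reads like a staged schedule: the convolutional encoder is first trained as a (denoising) autoencoder, then a classifier is trained on top of the learned code. That reading, and the exact shape of the "4x2 Conv. Layer + 1 Dense Layer" encoder, are assumptions on my part; the sketch below only illustrates the two-phase idea, not the project's state machinery.

```python
import tensorflow as tf

def make_encoder():
    """Stand-in for the '4x2 Conv. Layer + 1 Dense Layer' encoder (shapes assumed)."""
    return tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(32, 3, strides=2, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
        tf.keras.layers.Conv2D(64, 3, strides=2, padding="same", activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
    ])

encoder = make_encoder()
image = tf.keras.Input(shape=(28, 28, 1))

# Phase 1 ("autoencoder state"): reconstruct the clean image from a noisy input.
decoder = tf.keras.Sequential([
    tf.keras.layers.Dense(28 * 28, activation="sigmoid"),
    tf.keras.layers.Reshape((28, 28, 1)),
])
noisy = tf.keras.layers.GaussianNoise(0.1)(image)      # the "with noise" rows
autoencoder = tf.keras.Model(image, decoder(encoder(noisy)))
autoencoder.compile(optimizer=tf.keras.optimizers.Adam(1e-3), loss="mse")
# autoencoder.fit(x_train, x_train, ...)

# Phase 2 ("classifier state"): reuse the pretrained encoder, add a softmax head.
classifier_head = tf.keras.layers.Dense(10, activation="softmax")
classifier = tf.keras.Model(image, classifier_head(encoder(image)))
classifier.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
                   loss="sparse_categorical_crossentropy", metrics=["accuracy"])
# classifier.fit(x_train, y_train, ...)
```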

DeepDriving

| Network | Config | Mean | SD | Results | Run | Comment |
| --- | --- | --- | --- | --- | --- | --- |
| Per-Pixel-Normalization + Original-Net + Batch-Normalization + LRN | LR: 0.01; Decay: 0.90/40; WD: 0.005; 500 Epochs; Momentum-Optimizer | ? | ? | results.txt | run_1 | Quite good results |
| Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_2 | Same as run_1 |
| Per-Pixel-Normalization + Original-Net + Batch-Normalization + no sigmoid output-layer | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_3 | Extreme MAE in the beginning, slow convergence |
| Per-Pixel-Standardization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_4 | Slower convergence in the beginning; needs to run longer |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_5 | Good results, but only with HUE delta 0.05. With HUE delta 0.07 or bigger, there is strong divergence. |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; 240 Epochs; Nesterov-Momentum-Optimizer | ? | ? | results.txt | run_7 | No changes compared to run_1 |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.0; 240 Epochs; Nesterov-Momentum-Optimizer | ? | ? | results.txt | run_8 | No changes compared to run_1 |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/40; WD: 0.0; ? Epochs; AdaDelta-Optimizer | ? | ? | results.txt | run_9 | No changes compared to run_1, but weights are exploding. |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.0; ? Epochs; Adam-Optimizer | ? | ? | - | run_10 | Exploding weights, strong divergence! |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.1; Decay: 0.90/40; WD: 0.0001; ? Epochs; Adam-Optimizer | ? | ? | - | run_11 | No convergence! |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer | ? | ? | - | run_12 | Good convergence, comparable to run_1 |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer; Noise: 1.0 | ? | ? | - | run_13 | Exploding weights, divergence! |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer; Noise: 0.3 | ? | ? | - | run_14 | Exploding weights, bad error |
| Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer; Noise: 0.01 | ? | ? | - | run_15 | Comparable to run_1, higher standard deviation of error |
| Data-Augmentation + Per-Pixel-Normalization + VGG | LR: 1.0; Decay: 0.90/30; WD: 0.0; ? Epochs; AdaDelta-Optimizer; Noise: 0.01 | ? | ? | - | run_16 | Almost no convergence |
| Per-Pixel-Normalization + VGG | LR: 0.01; Decay: 0.90/30; WD: 0.0; ? Epochs; Adam-Optimizer; Noise: 0.01 | ? | ? | - | run_17 | Almost no convergence |
| Per-Pixel-Normalization + VGG | LR: 0.01; Decay: 0.90/40; WD: 0.0005; ? Epochs; Adam-Optimizer; Noise: 0.01 | ? | ? | - | run_18 | Almost no convergence |
| Per-Pixel-Normalization + VGG + Batch-Normalization | LR: 0.01; Decay: 0.10/100; WD: 0.0005; ? Epochs; Momentum-Optimizer; Noise: 0.01 | ? | ? | - | run_19 | Almost no convergence |
| Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 1220 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_20 | Good convergence, comparable to run_1 |
| Per-Pixel-Normalization + Original-Net without Dropout + Batch-Normalization | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 520 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_21 | Better convergence than with dropout, especially for training data |
| Per-Pixel-Normalization + Original-Net with corrected Dropout + Batch-Normalization | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 1850 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_22 | Best convergence on validation data so far! |
| Per-Pixel-Normalization + Original-Net + Dropout + Batch-Normalization (no Batch-Normalization or weight-decay for the output layer) | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 1860 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_23 | Best performance, better than the original net in many categories |
| Per-Pixel-Normalization + Original-Net + Dropout + Batch-Normalization (no Batch-Normalization or weight-decay for the output layer) + no sigmoid at the output | LR: 0.01; Decay: 0.50/300; WD: 0.0005; ? Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_24 | Very noisy validation, bad performance |
| Per-Pixel-Normalization + Custom-Net + Dropout + Batch-Normalization (no Batch-Normalization or weight-decay for the output layer) | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 2000 Epochs; Momentum-Optimizer; Noise: 0.0 | 16.41 | 16.67 | results.txt | run_25 | Very good performance, final version |
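
In the Config column, "Decay: 0.50/300" means the learning rate is multiplied by 0.5 every 300 epochs (likewise 0.90/40 means a factor of 0.9 every 40 epochs), starting from the listed LR; the runs that converged best use a plain momentum optimizer. A minimal way to express such a schedule in TensorFlow/Keras is sketched below; the momentum value and steps_per_epoch are placeholder assumptions, since the wiki only names the optimizer type.

```python
import tensorflow as tf

steps_per_epoch = 500   # placeholder; depends on dataset and batch size

# "LR: 0.01; Decay: 0.50/300": start at 0.01, halve every 300 epochs (staircase).
lr_schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.01,
    decay_steps=300 * steps_per_epoch,
    decay_rate=0.5,
    staircase=True,
)

# "Momentum-Optimizer" = SGD with momentum; the value 0.9 is assumed.
optimizer = tf.keras.optimizers.SGD(learning_rate=lr_schedule, momentum=0.9)
```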

Cifar-10

Pre-Processing

| Network | Config | Error | Results | Notes | Run |
| --- | --- | --- | --- | --- | --- |
| Tutorial-Net | LR: 0.005; Decay: 0.96/1; WD: 0; 30 Epochs | 38.39% | results.txt | Very sparse activation in conv1 | run_1 |
| Tutorial-Net | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 34.30% | results.txt | Very sparse activation in conv1 | run_2 |
| Tutorial-Net + Preprocessing (-0.5) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 31.32% | results.txt | Rich feature maps in conv1, but sparse activation in the following layers | run_3 |
| Tutorial-Net + Preprocessing (Color-Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 32.07% | results.txt | Fewer feature maps in layer conv1 | run_4 |
| Tutorial-Net + Preprocessing (PerPixel-Color-Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 27.68% | results.txt | Rich feature map in layer conv1 and less sparsity in layer 4 | run_5 |
| Tutorial-Net + Preprocessing (PerPixel-Mean-Subtraction) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 29.48% | results.txt | Very rich feature map in conv1, but a sparser conv2 layer; layer 4 is also sparser | run_6 |
| Tutorial-Net + Preprocessing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 29.54% | results.txt | Very rich feature maps in conv1 | run_7 |
| Tutorial-Net + Preprocessing (Per-Pixel Standardization + Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 28.59% | results.txt | Almost very rich feature map in conv1 | run_8 |
| Tutorial-Net + Data-Augmentation (Random cropping) + Preprocessing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 27.52% | results.txt | Less rich feature map in conv1, slightly less sparse feature map in conv2 | run_9 |
| Tutorial-Net + Data-Augmentation (Random cropping, Random flipping) + Preprocessing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 28.82% | results.txt | Sparsity in conv1 decreases in the long run | run_10 |
| Tutorial-Net + Data-Augmentation (Random cropping, Random flipping, Random brightness) + Preprocessing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 90.02% | results.txt | Images are extremely bright or dark | run_11 |
| Tutorial-Net + Preprocessing (Per-Pixel Standardization) + Data-Augmentation (Random cropping, Random flipping, Random brightness) + Preprocessing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 38.17% | results.txt | Images are extremely bright or dark | run_12 |
| Tutorial-Net + Data-Augmentation (Random cropping, Random flipping, Random brightness) + Preprocessing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 39.39% | results.txt | Images are extremely bright or dark | run_13 |
| Tutorial-Net + Data-Augmentation (Random brightness, contrast, saturation and HUE) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 30.52% | results.txt | | run_14 |
| Tutorial-Net + Data-Augmentation (Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 29.61% | results.txt | Rich feature maps in conv1 | run_15 |
| Tutorial-Net + Data-Augmentation (Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 47.50% | results.txt | Very rich feature map in conv1 but extremely poor feature map in conv2 | run_16 |
| Tutorial-Net + Data-Augmentation (Random-Cropping and Flipping, Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 28.50% | results.txt | Less rich feature map in conv1 but quite rich feature map in conv2 | run_17 |
| Tutorial-Net + Data-Augmentation (Random-Cropping and Flipping, Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.90% | results.txt | Almost sparse feature map in conv1 | run_18 |
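
The best runs above combine per-pixel standardization (subtract the per-pixel mean of the training set and divide by the per-pixel standard deviation) with random cropping, flipping and colour jitter, while avoiding the plain random-brightness augmentation that ruined run_11 to run_13. A sketch of such a pipeline with tf.image follows; the crop/pad sizes and jitter ranges are illustrative values, not the ones used in these runs.

```python
import numpy as np
import tensorflow as tf

def per_pixel_stats(x_train):
    """Per-pixel mean/std over the training images (x_train: [N, 32, 32, 3])."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0) + 1e-6
    return mean.astype(np.float32), std.astype(np.float32)

def augment_and_standardize(image, mean, std, training=True):
    """Random crop/flip/colour jitter followed by per-pixel standardization."""
    if training:
        image = tf.image.resize_with_crop_or_pad(image, 36, 36)   # pad ...
        image = tf.image.random_crop(image, size=[32, 32, 3])     # ... then crop
        image = tf.image.random_flip_left_right(image)
        image = tf.image.random_brightness(image, max_delta=0.2)
        image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
        image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
        image = tf.image.random_hue(image, max_delta=0.05)
    return (image - mean) / std
```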

Network structure

  • All networks use data-augmentation (Random-Cropping and Flipping, Random brightness, contrast, saturation and HUE) and pre-processing (per-pixel standardization)

| Network | Config | Error | Results | Notes | Run |
| --- | --- | --- | --- | --- | --- |
| Tutorial-Net | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.29% | results.txt | | run_1 |
| Tutorial-Net + Variable-Module for Conv1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.31% | results.txt | | run_2 |
| Tutorial-Net + Use own Conv2D function with Bias constant initializer of 0.1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 90.06% | results.txt | | run_3 |
| Tutorial-Net + Use own Conv2D function with Bias constant initializer of 0.0 | LR: 0.005; Decay: 0.96/10; WD: 0.0; 30 Epochs | 22.25% | results.txt | | run_4 |
| Tutorial-Net + Use own Conv2D, Activation and Pool function | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.70% | results.txt | | run_5 |
| Tutorial-Net + Use own Conv2D, Activation, Pool and LRN function in conv1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.32% | results.txt | | run_6 |
| Tutorial-Net + Use own Conv2D, Activation, Pool and LRN function for conv1 and conv2, conv2 bias is also initialized with 0.0 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 28.75% | results.txt | | run_7 |
| Tutorial-Net + Use own Conv2D, Activation, Pool and LRN function for conv1 and conv2, conv2 bias is initialized with 0.1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.40% | results.txt | | run_8 |
| Tutorial-Net + Use custom Conv2D, Activation, Pool and LRN function for conv1 and conv2 + custom Fully-Connected layer 3 (with stddev=0.02) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 25.34% | results.txt | | run_9 |
| Tutorial-Net + Use custom Conv2D, Activation, Pool and LRN function for conv1 and conv2 + custom Fully-Connected layer 3 (with stddev=0.04) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.38% | results.txt | | run_10 |
| Custom-Net (2 Conv-Layer, 2 FC-Layer, 1 Dense-Output) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.80% | results.txt | | run_11 |
| Custom-Net (3 Conv-Layer, 1 FC-Layer, 1 Dense-Output) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.17% | results.txt | | run_12 |
| Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 21.88% | results.txt | | run_13 |
| Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + no LRN | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.67% | results.txt | | run_14 |
| Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 19.66% | results.txt | High loss for validation in the beginning of training | run_15 |
| Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/25; WD: 0.0005; 30 Epochs | 27.84% | results.txt | | run_16 |
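
The gap between run_3 (90.06 %) and run_4 (22.25 %) points at the constant bias initializer of the custom Conv2D helper (0.1 diverged, 0.0 trained normally). A hypothetical helper with that knob could look like the sketch below; the function signature and the kernel-initializer stddev are my own choices, not the project's.

```python
import tensorflow as tf

def conv2d(x, filters, kernel_size=5, bias_init=0.0, name=None):
    """Custom Conv2D wrapper with a configurable constant bias initializer.

    run_3 used a bias initializer of 0.1 and ended at 90.06% error;
    run_4 used 0.0 and reached 22.25%.
    """
    return tf.keras.layers.Conv2D(
        filters, kernel_size, padding="same", activation=None,
        kernel_initializer=tf.keras.initializers.TruncatedNormal(stddev=0.05),
        bias_initializer=tf.keras.initializers.Constant(bias_init),
        name=name,
    )(x)
```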
  • The following runs use 60 Epochs for training.

| Network | Config | Error | Results | Notes | Run |
| --- | --- | --- | --- | --- | --- |
| Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 60 Epochs | 14.98% | results.txt | | run_17 |
| Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + no Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 60 Epochs | 23.16% | results.txt | | run_18 |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 60 Epochs | 16.10% | results.txt | | run_19 |

  • The following runs use 120 Epochs for training.

| Network | Config | Error | Results | Notes | Run |
| --- | --- | --- | --- | --- | --- |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 120 Epochs | 15.34% | results.txt | | run_19 |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 2 FC-Layer + no Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 120 Epochs | 14.56% | results.txt | | run_20 |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 2 FC-Layer + no Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.91% | results.txt | | run_21 |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 15.14% | results.txt | | run_22 |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.90% | results.txt | | run_23 |
| Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 1 FC-Layer + Dropout, 1 FC-Layer + no Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.38% | results.txt | | run_24 |
| Custom-Net (3 Conv-Layer (128 Filter), 1 FC-Layer + Dropout, 1 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.71% | results.txt | | run_25 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(512) + Dropout, 1 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.31% | results.txt | | run_26 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.13% | results.txt | | run_27 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256) + Dropout, 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.32% | results.txt | | run_28 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + no Dropout, 1 FC-Layer (256) + no Dropout, 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.01% | results.txt | | run_29 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.31% | results.txt | | run_30 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + PReLU | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.37% | results.txt | | run_31 |

  • The following runs were based on the tweaked meta-parameters of run_7 from the Optimizer-Arguments table below (LR: 0.003; Decay: 0.5/30).

| Network | Config | Error | Results | Notes | Run |
| --- | --- | --- | --- | --- | --- |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization + (classes for FC layers) | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.50% | results.txt | | run_32 |
| Custom-Net (3x2 Conv-Layer (128 Filter) + reduce kernel size to 3x3, 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization + (classes for FC layers) | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 10.31% | results.txt | | run_33 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization + (classes for all layers) | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 10.34% | results.txt | | run_34 |
| Custom-Net (3x2 Conv-Layer (128 Filter) + BN + ReLU, 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 8.79% | results.txt | | run_35 |
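
The best configuration (run_35, 8.79 %) stacks three blocks of two 128-filter convolutions, each followed by batch normalization and ReLU, and then fully connected layers of 1024 (with dropout), 256 and 64 units before the 10-way output, with Xavier (Glorot) initialization. The Keras sketch below reproduces that description; the 3x3 kernel size (introduced in run_33), pooling placement, dropout rate and the exact ordering of BN and ReLU are assumptions where the table does not say.

```python
import tensorflow as tf

def conv_block(x, filters=128):
    """Two convolutions, each with BN + ReLU; 2x2 max-pooling afterwards is assumed."""
    for _ in range(2):
        x = tf.keras.layers.Conv2D(filters, 3, padding="same",
                                   kernel_initializer="glorot_uniform")(x)
        x = tf.keras.layers.BatchNormalization()(x)
        x = tf.keras.layers.ReLU()(x)
    return tf.keras.layers.MaxPooling2D(2)(x)

inputs = tf.keras.Input(shape=(32, 32, 3))
x = inputs
for _ in range(3):                                  # "3x2 Conv-Layer (128 Filter)"
    x = conv_block(x)
x = tf.keras.layers.Flatten()(x)
x = tf.keras.layers.Dense(1024, activation="relu",
                          kernel_initializer="glorot_uniform")(x)
x = tf.keras.layers.Dropout(0.5)(x)                 # dropout rate assumed
x = tf.keras.layers.Dense(256, activation="relu")(x)
x = tf.keras.layers.Dense(64, activation="relu")(x)
outputs = tf.keras.layers.Dense(10, activation="softmax")(x)
model = tf.keras.Model(inputs, outputs)
```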

Optimizer-Arguments

| Network | Config | Error | Results | Notes | Run |
| --- | --- | --- | --- | --- | --- |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + ReLU | LR: 0.001; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 11.57% | results.txt | | run_1 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.001; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.37% | results.txt | Slight overfitting in loss | run_2 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.001; Decay: 0.5/30; WD: 0.0001; 120 Epochs | 11.93% | results.txt | Still some overfitting | run_3 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Use Xavier-Initialization for Conv-Layer | LR: 0.001; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.14% | results.txt | | run_4 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Use Xavier-Initialization | LR: 0.001; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.42% | results.txt | | run_5 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.002; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.09% | results.txt | | run_6 |
| Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 10.91% | results.txt | | run_7 |
