# Results
## MNIST - 6000 Training Samples / 10000 Test Samples (Multi-Task: Autoencoder + Classifier)
Network | Config | Results | Error | Run | Comment |
---|---|---|---|---|---|
(Model1) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.66% | model1 | Classifier only |
(Model2) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.63% | model2 | Classifier only + Hidden-Noise |
(Model3) Image=Per-Pixel Mean Subtraction, Code=GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 5.65% | model3 | Classifier only + Input-Noise |
(Model4) Image=Per-Pixel Mean Subtraction, Code=GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Dense(Image, 1000)+GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.0 AE, 1.0 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 5.56% | model4 | Classifier only + Input-Noise + Hidden-Noise |
(Model5) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.4 AE, 0.6 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.90% | model5 | Classifier + AE |
(Model6) Image=Per-Pixel Mean Subtraction, Code=Dense(Image, 1000)+GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.4 AE, 0.6 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 6.83% | model6 | Classifier + AE + Hidden-Noise |
(Model7) Image=Per-Pixel Mean Subtraction, Code=GaussianNoise(Mean=0.0, StdDev=0.1, TrainingOnly)+Dense(Image, 1000)+Sigmoid, Output=Dense(Code, 28x28x1), Class=Dense(Code, 10)+Softmax | 0 Autoencoder-States + 1 Classifier-State (Loss: 0.4 AE, 0.6 Class), LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 120 Epochs | result_classification.txt | 5.67% | model7 | Classifier + AE + Input-Noise |
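For reference, a minimal sketch of the Model7 layout (input noise before the code layer, reconstruction and classification heads sharing the code, loss weighted 0.4 AE / 0.6 Class). The `tf.keras` framing, the output names, and the Adam optimizer are assumptions; the wiki does not show the actual training code.

```python
import tensorflow as tf

# Sketch of Model7: Gaussian input noise -> 1000-unit sigmoid code
# -> reconstruction head + softmax classifier head.
inputs = tf.keras.Input(shape=(28, 28, 1))
x = tf.keras.layers.Flatten()(inputs)
# GaussianNoise is only active during training, matching "TrainingOnly".
x = tf.keras.layers.GaussianNoise(stddev=0.1)(x)
code = tf.keras.layers.Dense(1000, activation='sigmoid')(x)

recon = tf.keras.layers.Dense(28 * 28 * 1)(code)
recon = tf.keras.layers.Reshape((28, 28, 1))(recon)
klass = tf.keras.layers.Dense(10, activation='softmax')(code)

model = tf.keras.Model(inputs, {'recon': recon, 'class': klass})
model.compile(
    optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
    loss={'recon': 'mse', 'class': 'sparse_categorical_crossentropy'},
    loss_weights={'recon': 0.4, 'class': 0.6})  # Loss: 0.4 AE, 0.6 Class
```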
## MNIST - Reduced Dataset Size (Autoencoder)
Network | Config | Results | Error | Run | Comment |
---|---|---|---|---|---|
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 0 Autoencoder-States + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 150 Epochs | result_classification.txt | 1.40% | reduced_dataset/0_ae_1_classifier | Time: 2:01h |
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (without noise) + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt result_classification.txt | 1.42% | reduced_dataset/1_ae_1_classifier | Time: 3:30h |
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (with noise) + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt result_classification.txt | 1.35% | 1_ae_noise_1_classifier | Time: 3:40h |
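Every MNIST run above uses the same schedule: start at a learning rate of 0.001 and halve it every 15 epochs. A minimal sketch of that staircase decay; the `tf.keras` API, the Adam optimizer, and the batch size of 64 are assumptions:

```python
import tensorflow as tf

STEPS_PER_EPOCH = 6000 // 64  # assumed: 6000 training samples, batch size 64

# "LR: 0.001, LR-Decay: 0.5/15 Epochs" as a staircase schedule.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.001,
    decay_steps=15 * STEPS_PER_EPOCH,  # halve every 15 epochs
    decay_rate=0.5,
    staircase=True)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)
```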
## MNIST (Autoencoder)
Network | Config | Results | Error | Run | Comment |
---|---|---|---|---|---|
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 0 Autoencoder-States + 1 Classifier-State, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_classification.txt | 0.76% | 0-ae-states_1-classifier-state | Time: 1:30h |
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 4 Autoencoder-States + 2 Classifier-States using target-loss only, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt result_classification.txt | 1.52% | 4-ae-states_2-classifier-states_target-loss | Time: 2:30h |
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State + 2 Classifier-States using target-loss only, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt result_classification.txt | 1.17% | 1-ae-state_2-classifier-states_target-loss | Time: 1:57h |
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (with noise) + 1 Classifier-State using reconstruction-loss only, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt result_classification.txt | 0.69% | 1-ae-state_1-classifier-states_target-loss | Time: 2:05h |
Per-Pixel Mean Subtraction + 4x2 Conv. Layer + 1 Dense Layer | 1 Autoencoder-State (without noise) + 1 Classifier-State using reconstruction-loss only, LR: 0.001, LR-Decay: 0.5/15 Epochs, Weight-Decay: 0, 180 Epochs | result_reconstruction.txt result_classification.txt | 0.72% | 1-ae-state_without_noise_1-classifier-state_recon_loss | Time: 2:05h |
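The wiki does not spell out what a "State" is; one plausible reading is a full training phase that optimizes a single loss. Under that assumption (and assuming a two-headed model like the multi-task sketch above, with hypothetical output names `recon` and `class`), a run such as "4 Autoencoder-States + 2 Classifier-States" would look like:

```python
def run_states(model, x_train, y_train, n_ae, n_cls, epochs_per_state=30):
    """Hypothetical 'state' schedule: each Autoencoder-State trains on the
    reconstruction loss only, each Classifier-State on the classification
    loss only. Phase semantics and epoch counts are assumptions."""
    for _ in range(n_ae):  # Autoencoder-States
        model.compile(optimizer='adam', loss={'recon': 'mse'})
        model.fit(x_train, {'recon': x_train}, epochs=epochs_per_state)
    for _ in range(n_cls):  # Classifier-States
        model.compile(optimizer='adam',
                      loss={'class': 'sparse_categorical_crossentropy'})
        model.fit(x_train, {'class': y_train}, epochs=epochs_per_state)
```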
## DeepDriving
Network | Config | Mean Error | Error SD | Results | Run | Comment |
---|---|---|---|---|---|---|
Per-Pixel-Normalization + Original-Net + Batch-Normalization + LRN | LR: 0.01; Decay: 0.90/40; WD: 0.005; 500 Epochs; Momentum-Optimizer | ? | ? | results.txt | run_1 | Quite good results |
Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_2 | Same results as run_1 |
Per-Pixel-Normalization + Original-Net + Batch-Normalization + no sigmoid output-layer | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_3 | Extremely high MAE in the beginning; slow convergence |
Per-Pixel-Standardization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_4 | Slower convergence in the beginning; needs a longer run |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; ? Epochs; Momentum-Optimizer | ? | ? | results.txt | run_5 | Good results, but only with a HUE delta of 0.05; with a HUE delta of 0.07 or larger, there is strong divergence |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.005; 240 Epochs; Nesterov-Momentum-Optimizer | ? | ? | results.txt | run_7 | No changes compared to run_1 |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.0; 240 Epochs; Nesterov-Momentum-Optimizer | ? | ? | results.txt | run_8 | No changes compared to run_1 |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/40; WD: 0.0; ? Epochs; AdaDelta-Optimizer | ? | ? | results.txt | run_9 | No changes compared to run_1, but weights are exploding |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.90/40; WD: 0.0; ? Epochs; Adam-Optimizer | ? | ? | - | run_10 | Exploding weights, strong divergence! |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.1; Decay: 0.90/40; WD: 0.0001; ? Epochs; Adam-Optimizer | ? | ? | - | run_11 | No convergence! |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer | ? | ? | - | run_12 | Good convergence, comparable to run_1 |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer; Noise: 1.0 | ? | ? | - | run_13 | Exploding weights, divergence! |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer; Noise: 0.3 | ? | ? | - | run_14 | Exploding weights, bad error |
Data-Augmentation + Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 1.0; Decay: 0.90/30; WD: 0.0001; ? Epochs; AdaDelta-Optimizer; Noise: 0.01 | ? | ? | - | run_15 | Comparable to run_1, but higher standard deviation of the error |
Data-Augmentation + Per-Pixel-Normalization + VGG | LR: 1.0; Decay: 0.90/30; WD: 0.0; ? Epochs; AdaDelta-Optimizer; Noise: 0.01 | ? | ? | - | run_16 | Almost no convergence |
Per-Pixel-Normalization + VGG | LR: 0.01; Decay: 0.90/30; WD: 0.0; ? Epochs; Adam-Optimizer; Noise: 0.01 | ? | ? | - | run_17 | Almost no convergence |
Per-Pixel-Normalization + VGG | LR: 0.01; Decay: 0.90/40; WD: 0.0005; ? Epochs; Adam-Optimizer; Noise: 0.01 | ? | ? | - | run_18 | Almost no convergence |
Per-Pixel-Normalization + VGG + Batch-Normalization | LR: 0.01; Decay: 0.10/100; WD: 0.0005; ? Epochs; Momentum-Optimizer; Noise: 0.01 | ? | ? | - | run_19 | Almost no convergence |
Per-Pixel-Normalization + Original-Net + Batch-Normalization | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 1220 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_20 | Good convergence, comparable to run_1 |
Per-Pixel-Normalization + Original-Net without Dropout + Batch-Normalization | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 520 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_21 | Better convergence than with dropout, especially for training data |
Per-Pixel-Normalization + Original-Net with corrected Dropout + Batch-Normalization | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 1850 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_22 | Best convergence on validation data so far! |
Per-Pixel-Normalization + Original-Net + Dropout + Batch-Normalization (no Batch-Normalization or weight-decay on the output layer) | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 1860 Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_23 | Best performance; better than the original net in many categories |
Per-Pixel-Normalization + Original-Net + Dropout + Batch-Normalization (no Batch-Normalization or weight-decay on the output layer) + no sigmoid at the output | LR: 0.01; Decay: 0.50/300; WD: 0.0005; ? Epochs; Momentum-Optimizer; Noise: 0.0 | ? | ? | results.txt | run_24 | Very noisy validation; bad performance |
Per-Pixel-Normalization + Custom-Net + Dropout + Batch-Normalization (no Batch-Normalization or weight-decay on the output layer) | LR: 0.01; Decay: 0.50/300; WD: 0.0005; 2000 Epochs; Momentum-Optimizer; Noise: 0.0 | 16.41 | 16.67 | results.txt | run_25 | Very good performance; final version |
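The most actionable augmentation finding above is run_5: HUE jitter helps only up to a delta of 0.05, while 0.07 or larger diverges. The sketch below expresses that constraint with the standard `tf.image` ops; only the hue limit comes from the wiki, the other ranges are illustrative guesses:

```python
import tensorflow as tf

def augment(image):
    """Assumed color-augmentation pipeline for the DeepDriving runs."""
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
    # run_5: hue deltas of 0.07 or larger caused strong divergence,
    # so keep max_delta at 0.05 or below.
    image = tf.image.random_hue(image, max_delta=0.05)
    return image
```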
## CIFAR-10
### Pre-Processing
Network | Config | Error | Results | Notes | Run |
---|---|---|---|---|---|
Tutorial-Net | LR: 0.005; Decay: 0.96/1; WD: 0; 30 Epochs | 38.39% | results.txt | Very sparse activation in conv1 | run_1 |
Tutorial-Net | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 34.30% | results.txt | Very sparse activation in conv1 | run_2 |
Tutorial-Net + Preprocessing (-0.5) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 31.32% | results.txt | Rich feature maps in conv1, but sparse activation in the following layers | run_3 |
Tutorial-Net + Preprocessing (Color-Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 32.07% | results.txt | Fewer rich feature maps in conv1 | run_4 |
Tutorial-Net + Preprocessing (PerPixel-Color-Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 27.68% | results.txt | Rich feature maps in conv1 and less sparsity in layer 4 | run_5 |
Tutorial-Net + Preprocessing (PerPixel-Mean-Subtraction) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 29.48% | results.txt | Very rich feature maps in conv1, but a sparser conv2; layer 4 is also sparser | run_6 |
Tutorial-Net + Preprocessing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 29.54% | results.txt | Very rich feature maps in conv1 | run_7 |
Tutorial-Net + Preprocessing (Per-Pixel Standardization + Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 28.59% | results.txt | Almost as rich feature maps in conv1 | run_8 |
Tutorial-Net + Data-Augmentation (Random cropping) + Preprocessing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 27.52% | results.txt | Less rich feature maps in conv1, slightly less sparse feature maps in conv2 | run_9 |
Tutorial-Net + Data-Augmentation (Random cropping, Random flipping) + Preprocessing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 28.82% | results.txt | Sparsity in conv1 decreases in the long run | run_10 |
Tutorial-Net + Data-Augmentation (Random cropping, Random flipping, Random brightness) + Preprocessing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 90.02% | results.txt | Images are extremely bright or dark. | run_11 |
Tutorial-Net + Preprocessing (Per-Pixel Standardization) + Data-Augmentation (Random cropping, Random flipping, Random brightness) + Preprocessing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 38.17% | results.txt | Images are extremely bright or dark. | run_12 |
Tutorial-Net + Data-Augmentation (Random cropping, Random flipping, Random brightness) + Preprocessing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 39.39% | results.txt | Images are extremely bright or dark. | run_13 |
Tutorial-Net + Data-Augmentation (Random brightness, contrast, saturation and HUE) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 30.52% | results.txt | | run_14 |
Tutorial-Net + Data-Augmentation (Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 29.61% | results.txt | Rich feature maps in conv1. | run_15 |
Tutorial-Net + Data-Augmentation (Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Image Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 47.50% | results.txt | Very rich feature maps in conv1, but extremely poor feature maps in conv2 | run_16 |
Tutorial-Net + Data-Augmentation (Random-Cropping and Flipping, Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.004; 30 Epochs | 28.50% | results.txt | Less rich feature maps in conv1, but quite rich feature maps in conv2 | run_17 |
Tutorial-Net + Data-Augmentation (Random-Cropping and Flipping, Random brightness, contrast, saturation and HUE) + Pre-Processing (Per-Pixel Standardization) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.90% | results.txt | Somewhat sparse feature maps in conv1 | run_18 |
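The two standardization schemes above differ only in where the statistics come from: per-pixel standardization uses the mean and standard deviation of each pixel position over the whole training set, per-image standardization uses the statistics of each single image. A small NumPy sketch of both (the NHWC layout is an assumption):

```python
import numpy as np

def per_pixel_standardize(train_set, x):
    """Per-pixel standardization: one mean/std per pixel position and
    channel, computed over the training set (axis 0)."""
    mean = train_set.mean(axis=0)          # shape (H, W, C)
    std = train_set.std(axis=0) + 1e-8
    return (x - mean) / std

def per_image_standardize(x):
    """Per-image standardization: one mean/std per image, similar in
    spirit to tf.image.per_image_standardization."""
    mean = x.mean(axis=(1, 2, 3), keepdims=True)
    std = x.std(axis=(1, 2, 3), keepdims=True) + 1e-8
    return (x - mean) / std
```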
### Network Structure
- All networks use data-augmentation (Random-Cropping and Flipping, Random brightness, contrast, saturation and HUE) and pre-processing (per-pixel standardization)
Network | Config | Error | Results | Notes | Run |
---|---|---|---|---|---|
Tutorial-Net | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.29% | results.txt | | run_1 |
Tutorial-Net + Variable-Module for Conv1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.31% | results.txt | | run_2 |
Tutorial-Net + custom Conv2D function with bias constant-initializer of 0.1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 90.06% | results.txt | | run_3 |
Tutorial-Net + custom Conv2D function with bias constant-initializer of 0.0 | LR: 0.005; Decay: 0.96/10; WD: 0.0; 30 Epochs | 22.25% | results.txt | | run_4 |
Tutorial-Net + custom Conv2D, Activation and Pool functions | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.70% | results.txt | | run_5 |
Tutorial-Net + custom Conv2D, Activation, Pool and LRN functions in conv1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.32% | results.txt | | run_6 |
Tutorial-Net + custom Conv2D, Activation, Pool and LRN functions for conv1 and conv2; conv2 bias also initialized with 0.0 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 28.75% | results.txt | | run_7 |
Tutorial-Net + custom Conv2D, Activation, Pool and LRN functions for conv1 and conv2; conv2 bias initialized with 0.1 | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.40% | results.txt | | run_8 |
Tutorial-Net + custom Conv2D, Activation, Pool and LRN functions for conv1 and conv2 + custom fully-connected layer 3 (stddev=0.02) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 25.34% | results.txt | | run_9 |
Tutorial-Net + custom Conv2D, Activation, Pool and LRN functions for conv1 and conv2 + custom fully-connected layer 3 (stddev=0.04) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.38% | results.txt | | run_10 |
Custom-Net (2 Conv-Layer, 2 FC-Layer, 1 Dense-Output) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.80% | results.txt | | run_11 |
Custom-Net (3 Conv-Layer, 1 FC-Layer, 1 Dense-Output) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 22.17% | results.txt | | run_12 |
Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 21.88% | results.txt | | run_13 |
Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + no LRN | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 23.67% | results.txt | | run_14 |
Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0; 30 Epochs | 19.66% | results.txt | High loss for validation in the beginning of training | run_15 |
Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/25; WD: 0.0005; 30 Epochs | 27.84% | results.txt | | run_16 |
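The best run of this table (run_15, 19.66%) is the Custom-Net with 3 conv layers, 2 FC layers, a dense output, and batch normalization. A sketch of that layout; the filter and unit counts (borrowed from the TensorFlow CIFAR-10 tutorial net) and the 24x24 crop size are assumptions, since the wiki does not list them:

```python
import tensorflow as tf

# Sketch of run_15's Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output)
# with Batch-Normalization; filter/unit counts are assumptions.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(24, 24, 3)),  # 32x32 CIFAR images after cropping
    tf.keras.layers.Conv2D(64, 5, padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(64, 5, padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.MaxPool2D(),
    tf.keras.layers.Conv2D(64, 5, padding='same'),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.ReLU(),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(384, activation='relu'),
    tf.keras.layers.Dense(192, activation='relu'),
    tf.keras.layers.Dense(10),  # dense output (logits)
])
```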
- The following runs use 60 Epochs for training.
Network | Config | Error | Results | Notes | Run |
---|---|---|---|---|---|
Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 60 Epochs | 14.98% | results.txt | | run_17 |
Custom-Net (3 Conv-Layer, 2 FC-Layer, 1 Dense-Output) + no Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 60 Epochs | 23.16% | results.txt | | run_18 |
Custom-Net (2 Conv-Layer, 1 Conv-Layer + Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 60 Epochs | 16.10% | results.txt | | run_19 |
- The following runs use 120 Epochs for training.
Network | Config | Error | Results | Notes | Run |
---|---|---|---|---|---|
Custom-Net (2 Conv-Layer, 1 Conv-Layer + Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 120 Epochs | 15.34% | results.txt | | run_19 |
Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 2 FC-Layer + no Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0002; 120 Epochs | 14.56% | results.txt | | run_20 |
Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 2 FC-Layer + no Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.91% | results.txt | | run_21 |
Custom-Net (2 Conv-Layer, 1 Conv-Layer + Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 15.14% | results.txt | | run_22 |
Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 2 FC-Layer + Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.90% | results.txt | | run_23 |
Custom-Net (2 Conv-Layer, 1 Conv-Layer + no Dropout, 1 FC-Layer + Dropout, 1 FC-Layer + no Dropout, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.38% | results.txt | | run_24 |
Custom-Net (3 Conv-Layer (128 Filter), 1 FC-Layer + Dropout, 1 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.71% | results.txt | | run_25 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (512) + Dropout, 1 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.31% | results.txt | | run_26 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 2 FC-Layer, 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.13% | results.txt | | run_27 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256) + Dropout, 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.32% | results.txt | | run_28 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + no Dropout, 1 FC-Layer (256) + no Dropout, 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 13.01% | results.txt | | run_29 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.31% | results.txt | | run_30 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + PReLU | LR: 0.005; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 12.37% | results.txt | | run_31 |
- The following runs were based on the tweaked meta-parameters of run_7 from the Optimizer-Arguments section below.
Network | Config | Error | Results | Notes | Run |
---|---|---|---|---|---|
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization + (classes for FC layers) | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.50% | results.txt | | run_32 |
Custom-Net (3x2 Conv-Layer (128 Filter) + reduced kernel size of 3x3, 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization + (classes for FC layers) | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 10.31% | results.txt | | run_33 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization + (classes for all layers) | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 10.34% | results.txt | | run_34 |
Custom-Net (3x2 Conv-Layer (128 Filter) + BN + ReLU, 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 8.79% | results.txt | | run_35 |
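run_35 (8.79%) is the best CIFAR-10 result on this page: three stages of two 3x3 convolutions with 128 filters, batch normalization and ReLU after every convolution, Xavier initialization, and an FC stack of 1024 (with dropout), 256, and 64 units. A sketch of that layout; the pooling placement, crop size, and dropout rate are assumptions:

```python
import tensorflow as tf

def conv_stage(filters):
    """One '3x2 Conv-Layer' stage: two 3x3 convolutions, each followed by
    Batch-Normalization and ReLU, then pooling (pooling is an assumption)."""
    return [
        tf.keras.layers.Conv2D(filters, 3, padding='same',
                               kernel_initializer='glorot_uniform'),  # Xavier
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.Conv2D(filters, 3, padding='same',
                               kernel_initializer='glorot_uniform'),
        tf.keras.layers.BatchNormalization(),
        tf.keras.layers.ReLU(),
        tf.keras.layers.MaxPool2D(),
    ]

model = tf.keras.Sequential(
    [tf.keras.Input(shape=(24, 24, 3))]
    + conv_stage(128) + conv_stage(128) + conv_stage(128)
    + [tf.keras.layers.Flatten(),
       tf.keras.layers.Dense(1024, activation='relu',
                             kernel_initializer='glorot_uniform'),
       tf.keras.layers.Dropout(0.5),
       tf.keras.layers.Dense(256, activation='relu'),
       tf.keras.layers.Dense(64, activation='relu'),
       tf.keras.layers.Dense(10)])  # dense output
```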
### Optimizer-Arguments
Network | Config | Error | Results | Notes | Run |
---|---|---|---|---|---|
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + ReLU | LR: 0.001; Decay: 0.96/1; WD: 0.0000; 120 Epochs | 11.57% | results.txt | | run_1 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.001; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.37% | results.txt | slight overfitting in loss | run_2 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer(1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization | LR: 0.001; Decay: 0.5/30; WD: 0.0001; 120 Epochs | 11.93% | results.txt | still some overfitting | run_3 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization for Conv-Layer | LR: 0.001; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.14% | results.txt | | run_4 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.001; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.42% | results.txt | | run_5 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.002; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 11.09% | results.txt | | run_6 |
Custom-Net (3x2 Conv-Layer (128 Filter), 1 FC-Layer (1024) + Dropout, 1 FC-Layer (256), 1 FC-Layer (64), 1 Dense-Output) + Batch-Normalization + Xavier-Initialization | LR: 0.003; Decay: 0.5/30; WD: 0.0000; 120 Epochs | 10.91% | results.txt | | run_7 |
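run_7's settings (LR 0.003, decay 0.5 every 30 epochs, no weight decay) are the "tweaked meta-parameters" reused in runs 32-35 above. A sketch of how those knobs translate to code; the optimizer type, the batch size, and reading weight decay as per-layer L2 regularization are assumptions:

```python
import tensorflow as tf

STEPS_PER_EPOCH = 50000 // 128  # assumed CIFAR-10 batch size of 128

# "LR: 0.003; Decay: 0.5/30" as a staircase schedule.
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.003,
    decay_steps=30 * STEPS_PER_EPOCH,
    decay_rate=0.5,
    staircase=True)
optimizer = tf.keras.optimizers.SGD(learning_rate=schedule, momentum=0.9)

# run_3 instead used WD 0.0001, read here as L2 regularization per layer:
fc = tf.keras.layers.Dense(
    1024, activation='relu',
    kernel_regularizer=tf.keras.regularizers.l2(1e-4))
```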