
The Edge and Interest-Point Extraction Dialog

This dialog provides a user interface for selecting methods for image feature extraction and the parameters for these feature extraction methods. In the group Feature Extraction Mode, two methods for feature extraction can be selected:

  1. Canny: Select this option if you want to extract image edges using the Canny operator.

  2. Foerstner Operator: Select this option if you want to extract image points and/or edges using the framework for polymorphic feature extraction that is often referred to as the Foerstner Operator.

Depending on which feature extraction mode has been selected, different parameters will be accessible.

[Images: Left: the Feature Extraction Dialog for the Canny operator; right: the Feature Extraction Dialog for the Foerstner operator]

Image Edge Extraction using the Canny Operator

Image edge extraction using the Canny Operator consists of six phases:

  1. Computation of gradients: The image gradients are computed using a user-defined filter.

  2. Edge pixel classification: Pixels having a significant gradient strength are marked as edge pixels.

  3. Non-maxima suppression: Edge pixels not being local maxima are suppressed, which leaves the most significant edge pixels.

  4. Subpixel estimation: For the edge pixels, a subpixel estimation is carried out.

  5. Edge tracking: Neighbouring edge pixels are connected to edge pixel chains.

  6. Edge thinning: The edge pixel chains are thinned out and approximated by polygons.

Barista will store both the edge pixel chains and the final image edges:

  1. Edge pixel chains: The edge pixel chains will be added as a 2D Contour node to the image node in the thumbnail view. By default, the edge pixel chains will not be displayed, but they can be made active by clicking the 2D Contour node with the right mouse button and selecting Display On/Off in the context menu that appears.

  2. Image edges: The thinned-out image edges will be added as a 2D Line node to the image node in the thumbnail view. They will also be displayed in the images.

[Image: The Icons for the extracted 2D Contours and the extracted 2D Lines]

Computation of gradients for the Canny operator

This is the first phase of feature extraction. The original image is convolved with a derivative filter in both the row and the column direction. This convolution results in a gradient vector, the components of which are the first derivatives of the grey levels in the row and column directions. For images having more than one image band, the gradients are computed independently for each band, and an overall gradient is computed by weighting the individual bands according to their noise level. Finally, the length of the gradient vector is computed as an indicator for the occurrence of an edge. The following parameters can be selected (a sketch of the three filter kernels follows the list):

  1. Parameters for computing the first derivative: The user can select between three different filter kernels to compute the first derivatives:

    1. Simple Derivative: The derivatives will be computed using a simple derivative kernel:

dg / dx (x,y) = 0.5 * (g[x + 1, y] - g[x - 1, y])

dg / dy (x,y) = 0.5 * (g[x, y + 1] - g[x, y - 1])

This is the fastest option, but the one with the lowest degree of smoothing.

    2. Sobel Operator: A 3 x 3 Sobel operator kernel will be used for computing the derivatives. This option offers some degree of smoothing.

    3. Derivative of Gaussian: The first derivative of a 2D Gaussian function in the row and column direction, respectively, will be used for computing the image derivatives. This filter has a parameter Sigma, i.e. the width sigma of the Gaussian function in [pixels], which can be selected in the respective numeric field. The size of the filter kernel will be 2 x (3 x Sigma) + 1 pixels. The larger the parameter Sigma, the more noise will be suppressed, but the more details will be suppressed, too. Thus, selecting a large value for Sigma will result in only the most relevant edges of the image being extracted, whereas a small value for Sigma will result in more edges, but also in more noise. This is the theoretically optimal filter, but it is also the slowest one.
  2. Noise Model for Feature Extraction: The gradients are computed independently for each band of an image; the gradients for the individual bands are combined by a weighted average, the weights depending on the selected noise model. For more detail on the selection of that noise model, see the section Edge pixel classification below.
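The following Python sketch illustrates how the three derivative kernels might be applied to a single-band image. It is a minimal illustration, not Barista's actual implementation: the function name and the use of NumPy/SciPy are assumptions, and the noise-weighted combination of multiple bands is omitted.

```python
import numpy as np
from scipy.ndimage import convolve1d, gaussian_filter1d

def gradients(img, kernel="simple", sigma=1.0):
    """Return (dg/dx, dg/dy, gradient length) for a single-band image."""
    img = img.astype(float)
    if kernel == "simple":
        d = np.array([0.5, 0.0, -0.5])           # dg/dx = 0.5 * (g[x+1] - g[x-1])
        gx = convolve1d(img, d, axis=1)
        gy = convolve1d(img, d, axis=0)
    elif kernel == "sobel":
        d = np.array([1.0, 0.0, -1.0])           # derivative part of the 3 x 3 Sobel
        s = np.array([1.0, 2.0, 1.0]) / 4.0      # smoothing part of the 3 x 3 Sobel
        gx = convolve1d(convolve1d(img, d, axis=1), s, axis=0)
        gy = convolve1d(convolve1d(img, d, axis=0), s, axis=1)
    else:                                        # "gaussian": derivative of Gaussian
        # truncate=3.0 yields the kernel size 2 x (3 x Sigma) + 1 quoted above
        gx = gaussian_filter1d(
            gaussian_filter1d(img, sigma, axis=0, truncate=3.0),
            sigma, axis=1, order=1, truncate=3.0)
        gy = gaussian_filter1d(
            gaussian_filter1d(img, sigma, axis=1, truncate=3.0),
            sigma, axis=0, order=1, truncate=3.0)
    return gx, gy, np.hypot(gx, gy)              # gradient length = edge indicator
```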

[Image: Selection of the derivative kernel]

Edge pixel classification

The gradient vector lengths computed previously are compared to a threshold. Pixels with a gradient vector length smaller than that threshold are classified as homogeneous pixels, whereas pixels with a gradient vector length larger than that threshold are classified as edge pixels. The selection of the threshold depends on the noise model for feature extraction and on the selected significance level alpha. The user can select between three different models for characterising the image noise and selecting the threshold for edge pixel classification (a sketch of the three options follows the list):

  1. Percentile: The threshold will be computed as the alpha percentile of the distribution of the gradient vector lengths. The value for alpha can be entered in the field Alpha / Percentile. It has to be a value between 0 and 1. Using this model means that the 100 x alpha percent of the pixels having the smallest gradient lengths will be classified as homogeneous, whereas the 100 x (1 - alpha) percent of the pixels with the largest gradient lengths will be flagged as edges.

  2. Gaussian: If this model is selected, the noise variance of the image will be computed under the assumption of a Gaussian noise distribution, and the threshold will be computed as the alpha percentile of a Gaussian distribution having that noise variance. In this case, alpha is to be interpreted as a significance level, i.e. it is equal to the probability of erroneously classifying a homogeneous pixel as an edge pixel. Alpha has to be between 0 and 1. The larger alpha, the more pixels will be classified as edge pixels, which will result in more image details, but also in more noise. The smaller alpha, the fewer pixels will be classified as edge pixels, which will result in less noise and in more significant edges being extracted, but also in fewer details.

  3. Poisson: If this model is selected, the noise variance of the image will be computed under the assumption of a Poisson distribution for the noise, which means that the image noise depends on the grey levels. The Poisson distribution is approximated by a Gaussian distribution with signal-dependent noise:

sigma_n = a + b * g

where g is the grey level and the parameters a and b are estimated from the images. Choosing this option works very similarly to choosing Gaussian. It is theoretically sounder and might yield slightly better results in very dark or very bright areas.
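The three rules can be summarised in a sketch like the following. All names are illustrative assumptions; in particular, how Barista estimates the noise standard deviation and the Poisson parameters a and b is not shown here.

```python
import numpy as np
from scipy.stats import norm

def classification_threshold(grad_len, model="percentile", alpha=0.95,
                             sigma_noise=1.0, a=0.0, b=0.0, grey=None):
    """Sketch of the three threshold rules for edge pixel classification."""
    if model == "percentile":
        # the 100 * alpha % smallest gradient lengths become homogeneous pixels
        return np.quantile(grad_len, alpha)
    if model == "gaussian":
        # alpha as significance level: a homogeneous pixel exceeds the upper
        # alpha-quantile of the noise distribution with probability alpha
        return sigma_noise * norm.ppf(1.0 - alpha)
    # "poisson": sigma_n = a + b * g, so the threshold varies with the
    # grey level g (a per-pixel threshold image is returned)
    return (a + b * grey) * norm.ppf(1.0 - alpha)

# e.g. flag the 10 % of pixels with the largest gradient lengths as edges:
# edge_mask = grad_len > classification_threshold(grad_len, "percentile", 0.9)
```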

[Image: Selection of the noise model]

Non-maxima suppression

For all pixels classified as edge pixels, their gradient vector lengths are compared to the lengths of the two neighbouring pixels in gradient direction. If either of these two neighbours has a larger gradient length, the current pixel is not a local maximum of the gradient vector length, and thus the pixel is declared not to be an edge pixel. Thus, only the most significant edge pixels remain. No parameters can be selected.
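A nearest-neighbour version of this test might look as follows. This is a sketch under the assumption that the gradient direction is rounded to the nearest pixel; Barista may instead interpolate the gradient length between pixels.

```python
import numpy as np

def non_maxima_suppression(grad_len, gx, gy, edge_mask):
    """Keep only edge pixels whose gradient length is a local maximum
    along the (rounded) gradient direction."""
    rows, cols = grad_len.shape
    keep = np.zeros_like(edge_mask)
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if not edge_mask[r, c]:
                continue
            length = grad_len[r, c] + 1e-12
            dr = int(round(gy[r, c] / length))   # one pixel step along the
            dc = int(round(gx[r, c] / length))   # gradient direction
            if grad_len[r, c] >= grad_len[r + dr, c + dc] and \
               grad_len[r, c] >= grad_len[r - dr, c - dc]:
                keep[r, c] = True
    return keep
```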

Subpixel estimation

For the remaining edge pixels, a subpixel estimation of the actual edge position is carried out. The gradients in gradient direction are approximated by a second order polynomial, and the position of the edge point is estimated as the position of the maximum of that polynomial. No parameters can be selected.
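For three gradient lengths sampled at offsets -1, 0 and +1 along the gradient direction, the maximum of the fitted parabola gives the subpixel offset. The following sketch shows the standard three-point formula; the sampling of the gradient lengths along the gradient direction is omitted.

```python
def subpixel_offset(f_minus, f_0, f_plus):
    """Vertex of the parabola through (-1, f_minus), (0, f_0), (+1, f_plus).
    For a local maximum at f_0 the result lies in [-0.5, +0.5] pixels."""
    denom = f_plus + f_minus - 2.0 * f_0
    if denom == 0.0:                 # degenerate: the three values are collinear
        return 0.0
    return (f_minus - f_plus) / (2.0 * denom)
```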

Edge tracking

Neighbouring edge pixels are connected by an edge tracking algorithm, which results in edge pixel chains. These edge pixel chains will be split at junctions (i.e. where edges intersect) and at points of high curvature. After that, short edge pixel chains will be discarded. The Minimum Line Length in [pixels] can be selected in the corresponding numerical field.
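The following sketch shows the basic linking of 8-connected edge pixels and the rejection of short chains; the splitting at junctions and at points of high curvature is omitted for brevity, and all names are illustrative.

```python
import numpy as np

def track_edges(edge_mask, min_length=10):
    """Link 8-connected edge pixels into chains and discard short ones."""
    rows, cols = edge_mask.shape
    visited = np.zeros((rows, cols), dtype=bool)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, -1),
               (0, 1), (1, -1), (1, 0), (1, 1)]
    chains = []
    for r in range(rows):
        for c in range(cols):
            if not edge_mask[r, c] or visited[r, c]:
                continue
            chain, stack = [], [(r, c)]
            while stack:                          # depth-first linking
                y, x = stack.pop()
                if visited[y, x]:
                    continue
                visited[y, x] = True
                chain.append((y, x))
                for dy, dx in offsets:
                    ny, nx = y + dy, x + dx
                    if (0 <= ny < rows and 0 <= nx < cols
                            and edge_mask[ny, nx] and not visited[ny, nx]):
                        stack.append((ny, nx))
            if len(chain) >= min_length:          # Minimum Line Length test
                chains.append(chain)
    return chains
```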

[Image: The extracted edge pixel chains]

Edge thinning

Each edge pixel chain is approximated by a polygon in an iterative procedure (a sketch of the splitting step follows the list):

  1. Splitting: The edge pixel chain is split into straight line segments by a recursive algorithm. Splitting is finished when the maximum distance between any edge pixel and its nearest straight line segment is smaller than a threshold. This threshold (in [pixels]) can be selected in the numerical field Edge Thinning Threshold.

  2. Approximation: The parameters of the individual straight line segments are computed from the edge pixels assigned to them, and the intersection points between these line segments, i.e. the vertices of the thinned-out edge polygons, are computed.

  3. Merging: A hypothesis test for identity is carried out for each pair of neighbouring straight line segments. If the straight line segments are found to be identical, they are merged.

  4. Steps 2 and 3 are repeated until no further changes occur.

No further parameters can be selected.
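A sketch of the recursive splitting step (step 1) follows; the least-squares approximation and the merging test are omitted. The chain is split at the pixel farthest from the chord between its end points until every pixel lies within the Edge Thinning Threshold of its segment.

```python
import numpy as np

def split_chain(points, threshold=1.0):
    """Split an edge pixel chain (an (N, 2) array of row/column coordinates)
    into straight segments; returns a list of sub-chains."""
    points = np.asarray(points, dtype=float)
    if len(points) <= 2:
        return [points]
    p0, chord = points[0], points[-1] - points[0]
    n = np.linalg.norm(chord)
    if n == 0.0:
        return [points]
    d = points - p0
    # perpendicular distance of every pixel to the chord p0 -> p1
    dist = np.abs(chord[0] * d[:, 1] - chord[1] * d[:, 0]) / n
    k = int(np.argmax(dist))
    if dist[k] < threshold:                      # straight enough: stop splitting
        return [points]
    return split_chain(points[:k + 1], threshold) + \
           split_chain(points[k:], threshold)
```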

[Image: The extracted image edges]

Image Point and/or Edge Extraction using the Foerstner Operator

Image feature extraction using the Foerstner Operator consists of seven phases:

  1. Computation of gradients: The image gradients are computed using a user-defined filter.

  2. Computation of texture strength and directedness: From the gradients, the texture strength and the texture directedness are computed.

  3. Texture classification: Based on texture strength and texture directedness, a three-way classification is carried out for each pixel. Each pixel is classified as being either a point, an edge, or a homogeneous pixel.

  4. Non-maxima suppression: Edge and/or point pixels not being local maxima are suppressed, which leaves the most significant edge/point pixels.

  5. Subpixel estimation: A subpixel estimation is carried out for both edge and point pixels.

  6. Edge tracking: Neighbouring edge pixels are connected to edge pixel chains.

  7. Edge thinning: The edge pixel chains are thinned out and approximated by polygons.

Note that here the user can select which features are to be extracted: points, edges, or both. If no edges are to be extracted, only the stages relevant for point extraction will be carried out; if no points are to be extracted, only the stages relevant for edge extraction will be carried out.

Barista will store the extracted points, the edge pixel chains and the final image edges:

  1. Points: The extracted image points will be added to the 2D Points node of the image node in the thumbnail view. If the 2D Points node is active, the extracted image points will be superimposed on the image. Note that the point label is initialised with an empty string.

  2. Edge pixel chains: The edge pixel chains will be added as a 2D Contour node to the image node in the thumbnail view. By default, the edge pixel chains will not be displayed, but they can be made active by clicking the 2D Contour node with the right mouse button and selecting Display On/Off in the context menu that appears.

  3. Image edges: The thinned-out image edges will be added as a 2D Line node to the image node in the thumbnail view. They will also be displayed in the images.

[Image: The Icons for the extracted 2D Contours and the extracted 2D Lines]

Computation of gradients for the Foerstner operator

This is the first phase of feature extraction. The original image is convolved with a derivative filter in both the row and the column direction. This convolution results in a gradient vector, the components of which are the first derivatives of the grey levels in the row and column directions. For images having more than one image band, the gradients are computed independently for each band. The filter kernel can be selected (see also the sketch in the section Computation of gradients for the Canny operator):

  1. Simple Derivative: The derivatives will be computed using a simple derivative kernel:

dg / dx (x,y) = 0.5 * (g[x + 1, y] - g[x - 1, y])

dg / dy (x,y) = 0.5 * (g[x, y + 1] - g[x, y - 1])

This is the fastest option, but the one with the lowest degree of smoothing.

  2. Sobel Operator: A 3 x 3 Sobel operator kernel will be used for computing the derivatives. This option offers some degree of smoothing.

  3. Derivative of Gaussian: The first derivative of a 2D Gaussian function in the row and column direction, respectively, will be used for computing the image derivatives. This filter has a parameter Sigma, i.e. the width sigma of the Gaussian function in [pixels], which can be selected in the respective numeric field. The size of the filter kernel will be 2 x (3 x Sigma) + 1 pixels. The larger the parameter Sigma, the more noise will be suppressed, but the more details will be suppressed, too. Thus, selecting a large value for Sigma will result in only the most relevant edges of the image being extracted, whereas a small value for Sigma will result in more edges, but also in more noise. This is the theoretically optimal filter, but it is also the slowest one.

[Image: Selection of the derivative kernel]

Computation of texture strength and directedness

From the gradients, two measures describing the local variations of the grey levels are computed:

  1. Texture Strength W: This is the smoothed average squared gradient in a certain window. The user can select which window and which smoothing kernels are to be used for computing W. For colour images, the gradients for the individual bands are combined, using the noise variance of the respective bands as weights.

  2. Texture Directedness Q: This parameter is also computed from the smoothed squared gradients (and mixed products). Q is a measure for the local variation of the directional angles of the gradient vectors in a local neighbourhood, with Q being in the interval [0, 1]. Q is small for image edges and large for image points (see the classification rules in the section Texture classification below).

The following parameters can be selected:

  1. Kernel: The user can select one of three smoothing kernels for computing texture strength and directedness:

    1. Gaussian Filter: A 2D Gaussian function will be used for smoothing. This filter has a parameter Sigma, i.e. the width sigma of the Gaussian function in [pixels]. The size of the filter kernel will be 2 x (3 x Sigma) + 1 pixels. The larger the parameter Sigma, the more noise will be suppressed, but the more details will be suppressed, too. Thus, selecting a large value for Sigma will result in only the most relevant features of the image being extracted, whereas a small value for Sigma will result in more features, but also in more noise. This is the theoretically optimal filter, but it is also the slowest one.

    2. Binomial Filter: An n x n binomial filter kernel will be used for smoothing, where the parameter n is the extent of the kernel. This can be seen as an approximation of a Gaussian filter.

    3. Moving Average: An n x n moving average filter kernel will be used for smoothing, where the parameter n is the extent of the kernel.

  2. Integration Scale Edges: Here, the parameter of the filter kernel for edge extraction can be selected. If the kernel is a Gaussian Filter, it is the filter parameter Sigma; in the two other cases, it is the extent n in [pixels] of the filter kernel. If edges are to be extracted, this parameter will be the one used for texture classification, independently of whether points are to be extracted or not. If no edges are to be extracted, this parameter will be inaccessible.

  3. Integration Scale Points: Here, the parameter of the filter kernel for point extraction can be selected. If the kernel is a Gaussian Filter, it is the filter parameter Sigma; in the two other cases, it is the extent n in [pixels] of the filter kernel. If edges are to be extracted, this parameter will not be the one used for texture classification, but it will be used for the non-maxima suppression of points. If no edges are to be extracted, this will be the parameter used for texture classification. If no points are to be extracted, this parameter will be inaccessible.

  4. Noise Model for Feature Extraction: The gradients are computed independently for each band of an image; the gradients for the individual bands are combined by a weighted average, the weights depending on the selected noise model. For more detail on the selection of that noise model, see the section Texture classification below.

[Image: Selection of the smoothing kernel for the Foerstner operator]

Texture classification

A three-way classification is carried out for each pixel of the image, based on the texture strength W and the texture directedness Q and using two thresholds Wmin and Qmin:

  1. If W is smaller than Wmin, the pixel is classified as a homogeneous pixel, since the grey level gradients in its neighbourhood are not significant given the level of noise in the image, described by the noise variance.

  2. If W is larger than Wmin, the grey level gradients in the pixel's neighbourhood are significant, i.e., there is some significant texture. The pixel is further classified according to the texture directedness, using the threshold Qmin:

    1. If Q is smaller than Qmin, the grey level changes are anisotropic, and the pixel is classified as an edge pixel.

    2. If Q is larger than Qmin, the grey level changes are isotropic, and the pixel is classified as a point pixel.

The selection of the threshold Wmin depends on the noise model for feature extraction and the selected significance level alpha. The user can select between three different models for characterising image noise and selecting the threshold for pixel classification:

  1. Percentile: The threshold will be computed as the alpha percentile of the distribution of the gradient vector lengths. The value for alpha can be entered in the field Alpha / Percentile. It has to be a value between 0 and 1. Using this model means that the 100 x alpha percent of the pixels having the smallest gradient lengths will be classified as homogeneous, whereas the 100 x (1 - alpha) percent of the pixels with the largest gradient lengths will be flagged as non-homogeneous.

  2. Gaussian: If this model is selected, the noise variance of the image will be computed under the assumption of a Gaussian noise distribution, and the threshold will be computed as the alpha percentile of a Gaussian distribution having that noise variance. In this case, alpha is to be interpreted as a significance level, i.e. it is equal to the probability of erroneously classifying a homogeneous pixel as a non-homogeneous pixel. Alpha has to be between 0 and 1. The larger alpha, the more pixels will be classified as point or edge pixels, which will result in more image details, but also in more noise. The smaller alpha, the fewer pixels will be classified as point or edge pixels, which will result in less noise and in more significant features being extracted, but also in fewer details.

  3. Poisson: If this model is selected, the noise variance of the image will be computed under the assumption of a Poisson distribution for the noise, which means that the image noise depends on the grey levels. The Poisson distribution is approximated by a Gaussian distribution with signal-dependent noise:

sigma_n = a + b * g

where g is the grey level and the parameters a and b are estimated from the images. Choosing this option works very similarly to choosing Gaussian. It is theoretically sounder and might yield slightly better results in very dark or very bright areas.

The threshold Qmin can be selected in the corresponding numeric field. Qmin has to be between 0 and 1. The smaller Qmin, the more non-homogeneous pixels will be classified as points, but there will be fewer edge pixels; the larger Qmin, the more non-homogeneous pixels will be classified as edges, but there will be fewer point pixels. A sketch combining the computation of W and Q with this classification follows.
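The following sketch derives W and Q from the smoothed squared gradients and applies the three-way classification. Taking W as the trace of the smoothed gradient tensor and Q as 4 * det / trace^2 is an assumption that is consistent with the rules above, not necessarily Barista's exact definition.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def classify_texture(gx, gy, w_min, q_min, sigma=2.0):
    """Return a label image: 0 = homogeneous, 1 = edge, 2 = point."""
    m11 = gaussian_filter(gx * gx, sigma)        # smoothed squared gradients
    m22 = gaussian_filter(gy * gy, sigma)
    m12 = gaussian_filter(gx * gy, sigma)        # smoothed mixed products
    trace = m11 + m22
    det = m11 * m22 - m12 * m12
    W = trace                                    # texture strength
    Q = 4.0 * det / (trace * trace + 1e-12)      # directedness measure in [0, 1]
    labels = np.zeros(W.shape, dtype=np.uint8)
    textured = W >= w_min                        # significant texture only
    labels[textured & (Q < q_min)] = 1           # anisotropic -> edge pixel
    labels[textured & (Q >= q_min)] = 2          # isotropic   -> point pixel
    return labels, W, Q
```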

Non-maxima suppression

  1. Edges: For all pixels classified as edge pixels, their texture strengths are compared to the texture strengths of the two neighbouring pixels in gradient direction. If either of these two neighbours has a larger texture strength, the current pixel is not a local maximum of texture strength, and thus the pixel is declared not to be an edge pixel. Thus, only the most significant edge pixels remain. No parameters can be selected.

  2. Points: Local maxima of the texture strength are searched for. The size n of the search area can be selected in the numerical field Non-Max Suppression in [pixels]. Thus, if there is any point pixel within a window of size n x n centred at a certain pixel that has a greater texture strength than the current pixel, the current pixel is not a local maximum and will be declared not to be a point pixel. Thus, only the most significant point pixels remain (see the sketch below).
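The window test for points can be written compactly; this is a sketch in which `maximum_filter` implements the n x n search window.

```python
from scipy.ndimage import maximum_filter

def point_non_maxima_suppression(W, point_mask, n=5):
    """A point pixel survives only if it holds the largest texture
    strength W within the n x n window centred on it."""
    return point_mask & (W == maximum_filter(W, size=n))
```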

Subpixel estimation

  1. Edges: For the remaining edge pixels, a subpixel estimation of the actual edge position is carried out. The gradients in gradient direction are approximated by a second order polynomial, and the position of the edge point is estimated as the position of the maximum of that polynomial. No parameters can be selected.

  2. Points: Sub-pixel coordinates for points will be computed in a local window using two models, i.e. that the point is a corner point or that it is the centre of a circularly symmetric structure. The model achieving a significantly better fit is accepted; if the difference between the achieved accuracies is not significant, the point is assumed to be a corner. A sketch of the corner model follows.
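In the corner model, the point is estimated as the weighted least-squares intersection of the edge lines through the window, each line passing through a pixel with the gradient as its normal. The window size and all names below are illustrative; the circularly symmetric model and the model test are omitted.

```python
import numpy as np

def corner_subpixel(gx, gy, r0, c0, half=3):
    """Corner localisation in a (2*half+1) x (2*half+1) window around
    (r0, c0); assumes a well-textured window so that N is invertible."""
    N = np.zeros((2, 2))                         # normal equation matrix
    b = np.zeros(2)
    for r in range(r0 - half, r0 + half + 1):
        for c in range(c0 - half, c0 + half + 1):
            g = np.array([gy[r, c], gx[r, c]])   # gradient in (row, col) order
            A = np.outer(g, g)                   # squared-gradient weight
            N += A
            b += A @ np.array([r, c], dtype=float)
    return np.linalg.solve(N, b)                 # sub-pixel (row, col) position
```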

Edge tracking

Neighbouring edge pixels are connected by an edge tracking algorithm, which results in edge pixel chains. These edge pixel chains will be split at junctions (i.e. where edges intersect) and at points of high curvature. After that, short edge pixel chains will be discarded. The Minimum Line Length in [pixels] can be selected in the corresponding numerical field.

[Image: The extracted edge pixel chains]

Edge thinning

Each edge pixel chain is approximated by a polygon in an iterative procedure:

  1. Splitting: The edge pixel chain is split into straight line segments by a recursive algorithm. Splitting is finished when the maximum distance between any edge pixel and its nearest straight line segment is smaller than a threshold. This threshold (in [pixels]) can be selected in the numerical field Edge Thinning Threshold.

  2. Approximation: The parameters of the individual straight line segments are computed from the edge pixels assigned to them, and the intersection points between these line segments, i.e. the vertices of the thinned-out edge polygons, are computed.

  3. Merging: A hypothesis test for identity is carried out for each pair of neighbouring straight line segments. If the straight line segments are found to be identical, they are merged.

  4. Steps 2 and 3 are repeated until no further changes occur.

No further parameters can be selected.

[Image: The extracted image edges]

Back to Image View Handling
