- edited description
- attached Captura.JPG
All classes have 0 elements sampled in Classification Workflow
We are trying to run the Classification Workflow with our own data. We load our shapefile and select the labels column, but then the sampling process selects 0 samples for every class, even though we have selected a 100% sample size:
We have made 2 versions of the points data (attached zip files):
test.shp has one feature for each point (“single parts”),
test2 has one feature for each class (“multi part”).
In both cases we select the “level_1_id” field.
In both cases, the points are displayed over the image (ENVI format) in the EnMAP-Box Map window.
Comments (16)
-
reporter -
reporter - edited description
-
reporter I have found that the problem was that the CRS information in the .hdr file of the image did not match that in the .prj file of the shapefile.
While the .prj states the info for EPSG:25831, our .hdr had:
map info = {Arbitrary, 1, 1, 0, 0, 1, 1, 0, North}
Despite being prompted 4 or 5 times for the CRS of the image by EnMAP-Box, entering EPSG:25831 every time, and getting a correct overlay of the points in the Map display, it seems that the classification process is not actually able to extract the image values at the points.
The problem was solved by entering full CRS information in the hdr:
map info = {UTM, 1.000, 1.000, 0, 0, 1, 1, 31, North, ETRS-89, units=Meters}
coordinate system string = {PROJCS["ETRS89 / UTM zone 31N", GEOGCS["ETRS89", DATUM["European_Terrestrial_Reference_System_1989", SPHEROID["GRS 1980",6378137,298.257222101, AUTHORITY["EPSG","7019"]], TOWGS84[0,0,0,0,0,0,0], AUTHORITY["EPSG","6258"]], PRIMEM["Greenwich",0, AUTHORITY["EPSG","8901"]], UNIT["degree",0.0174532925199433, AUTHORITY["EPSG","9122"]], AUTHORITY["EPSG","4258"]], PROJECTION["Transverse_Mercator"], PARAMETER["latitude_of_origin",0], PARAMETER["central_meridian",3], PARAMETER["scale_factor",0.9996], PARAMETER["false_easting",500000], PARAMETER["false_northing",0], UNIT["metre",1, AUTHORITY["EPSG","9001"]], AXIS["Easting",EAST], AXIS["Northing",NORTH], AUTHORITY["EPSG","25831"]]}
Note that omitting the coordinate system string results in QGIS reading the image as EPSG:32631 (WGS-84 datum instead of ETRS-89). I do not know whether requiring explicit CRS information in the image (and thus ignoring what has been entered by the user when prompted) is intentional or a bug.
In any case, what is really needed is a clear error message in case of disagreement between the CRS information of the image and that of the vector, e.g. “ERROR: CRS information of input layers does not agree”, ideally followed by both CRS definitions as read by the classification process.
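For illustration, the requested check could be sketched as follows. This is not EnMAP-Box code; `epsg_code` and `check_crs_match` are hypothetical helper names, and the sketch assumes both CRS are available as WKT1 strings (where the outermost AUTHORITY node is usually the last one in the string):

```python
import re

def epsg_code(wkt):
    """Return the EPSG code of the outermost AUTHORITY node in a WKT1
    string (the last AUTHORITY["EPSG", ...] entry), or None if absent."""
    codes = re.findall(r'AUTHORITY\["EPSG",\s*"(\d+)"\]', wkt)
    return codes[-1] if codes else None

def check_crs_match(raster_wkt, vector_wkt):
    """Raise with a clear message when raster and vector CRS disagree,
    echoing both CRS as read, as suggested in the thread."""
    r, v = epsg_code(raster_wkt), epsg_code(vector_wkt)
    if r is None or v is None or r != v:
        raise ValueError(
            "ERROR: CRS information of input layers does not agree: "
            f"raster EPSG:{r}, vector EPSG:{v}")
```

A real implementation would compare `QgsCoordinateReferenceSystem` objects rather than regex-matching WKT, but the error-reporting shape would be the same.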
I thus think that this issue can be closed (I cannot find a way to close it myself), and I will open a request for a clear error message.
-
map info = {Arbitrary, 1, 1, 0, 0, 1, 1, 0, North}
@Agustin Lobo as you already discovered, the QGIS API requires more than the map info tag to derive a valid CRS definition. For such cases you can specify a default CRS that is used whenever the CRS is unknown (see #274).
Regarding the classification workflow, I suggest implementing a clear error message just for the case that one of the inputs has an invalid CRS.
If both CRS are valid, the vector reference should be warped on-the-fly into the raster CRS (this can easily be done in memory).
In case of artificial data sets (like that above) we might use a default projected metric CRS, e.g. EPSG:32626 (Atlantic)
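The suggested fallback logic could be sketched like this. It is a minimal illustration, not EnMAP-Box code: the function names are made up, and the UTM-zone-to-EPSG mapping is exactly the guess that produced the EPSG:32631 misreading discussed above when the datum is not stated:

```python
DEFAULT_EPSG = 32626  # projected metric fallback suggested in the comment

def parse_map_info(hdr_line):
    """Split an ENVI 'map info = {...}' line into its comma-separated fields."""
    inner = hdr_line.split("{", 1)[1].rsplit("}", 1)[0]
    return [f.strip() for f in inner.split(",")]

def crs_or_default(hdr_line):
    """Derive a CRS hint from map info alone; fall back to DEFAULT_EPSG
    for 'Arbitrary' projections, which carry no real georeferencing."""
    fields = parse_map_info(hdr_line)
    if fields[0].lower() == "arbitrary":
        return f"EPSG:{DEFAULT_EPSG}"
    if fields[0].upper() == "UTM":
        zone, hemisphere = fields[7], fields[8]
        # Without a coordinate system string this assumes WGS-84 UTM
        # (EPSG:326xx / 327xx) -- the very ambiguity reported above.
        base = 32600 if hemisphere.lower() == "north" else 32700
        return f"EPSG:{base + int(zone)}"
    return f"EPSG:{DEFAULT_EPSG}"
```

This also shows why map info alone is insufficient: the tag has no datum field in its fixed positions, so zone 31 North maps to EPSG:32631 unless the coordinate system string says ETRS89.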
-
-
assigned issue to
-
assigned issue to
-
reporter Actually, I would separate the extraction of information at the points into a previous step. The input to the classification process should be a table with the spectral values for each point, not the actual points shapefile. The user would take care of building that table before running the classification workflow. This would have 2 advantages:
- Separate eventual geometric problems from the actual classification process.
- Most important, let the user include information from different images (or even other sources) in the table, build the model, and apply it to the input image.
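The table the reporter describes could be built roughly like this. A minimal sketch only, assuming a north-up geotransform, points already in the raster CRS, and nearest-pixel lookup; `sample_table` is a hypothetical name, not part of any EnMAP-Box API:

```python
import numpy as np

def sample_table(bands, geotransform, points, labels):
    """Build (point_id, class, band_1..band_n) rows by nearest-pixel lookup.
    bands: array of shape (n_bands, rows, cols); geotransform: GDAL-style
    (x0, dx, 0, y0, 0, -dy) tuple; points: (x, y) map coordinates."""
    x0, dx, _, y0, _, dy = geotransform  # dy is negative for north-up
    table = []
    for i, ((x, y), cls) in enumerate(zip(points, labels)):
        col = int((x - x0) / dx)
        row = int((y - y0) / dy)
        table.append((i, cls, *bands[:, row, col].tolist()))
    return table

# Example: 2-band 4x4 raster, 10 m pixels, origin at (500000, 4600000)
bands = np.arange(32).reshape(2, 4, 4)
gt = (500000.0, 10.0, 0.0, 4600000.0, 0.0, -10.0)
table = sample_table(bands, gt,
                     [(500005, 4599995), (500035, 4599965)],
                     ["forest", "water"])
# table -> [(0, 'forest', 0, 16), (1, 'water', 15, 31)]
```

Doing this as an explicit step would make the geometric part (CRS agreement, pixel lookup) inspectable before any classifier is involved, which is exactly the separation argued for above.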
-
@Agustin Lobo the inclusion of samples from different images can be achieved using the SpectralLibrary View using the Import profiles from raster + vector sources option multiple times. Unfortunately, the Classification Workflow App is not yet able to use a spectral library as input.
If you like, you can provide me with some test data and I will implement your use case.
-
- changed status to on hold
-
reporter Andreas,
Any of the current test data would be enough. The point is being able to classify from training datasets that come from multiple images. I think the most straightforward would be an input in the form of a simple CSV file with columns ID, class, band1, band2, band3…
Perhaps you have reasons to prefer the spectral library over a CSV file (maybe keeping internal consistency across the package), and that would be fine with me (provided spectral libraries can be built in a non-interactive way from polygon or point vector files). But the current procedure (described in https://enmap-box.readthedocs.io/en/latest/usr_section/usr_cookbook/classification.html), in which the user must convert the training set from vector to raster (in a procedure with the confusing name of “Classification from Vectorraster”), is certainly very inconvenient.
Maybe we should move this discussion to a new ticket named “Allow for classification training sets to be built from multiple images”.
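The proposed CSV-based training could be sketched as below. This is an illustration with stdlib parsing and a toy nearest-centroid classifier standing in for a real one (e.g. a scikit-learn estimator); the column layout follows the reporter's suggestion, and all function names are mine:

```python
import csv
import io
from collections import defaultdict

def read_training_csv(text):
    """Parse the proposed 'ID, class, band1..bandN' CSV into labels and features."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip header
    labels, X = [], []
    for row in reader:
        labels.append(row[1])
        X.append([float(v) for v in row[2:]])
    return labels, X

def fit_centroids(labels, X):
    """Toy classifier 'fit': mean spectrum per class."""
    by_class = defaultdict(list)
    for cls, x in zip(labels, X):
        by_class[cls].append(x)
    return {cls: [sum(col) / len(rows) for col in zip(*rows)]
            for cls, rows in by_class.items()}

def predict(centroids, x):
    """Assign the class whose centroid is closest (squared Euclidean)."""
    return min(centroids,
               key=lambda c: sum((a - b) ** 2 for a, b in zip(centroids[c], x)))

csv_text = """ID,class,band1,band2,band3
1,water,0.02,0.05,0.01
2,water,0.03,0.06,0.02
3,forest,0.04,0.30,0.20
4,forest,0.05,0.28,0.22
"""
labels, X = read_training_csv(csv_text)
centroids = fit_centroids(labels, X)
```

Because the CSV carries only class labels and band values, rows can come from any number of images or other sources, which is the multi-image use case described above.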
-
+1
I like the idea of training classifiers from CSV input, as this is probably more convenient with other machine learning frameworks. I’ll add an export function to the Spectral Library to export 1 selected label column + n columns for n spectral bands as *.csv
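The export described here might look roughly like the following. A sketch only: the dict-based profile records and the function name are stand-ins, not the actual Spectral Library API:

```python
import csv
import io

def export_profiles_csv(profiles, label_field, n_bands):
    """Write one selected label column plus band1..bandN columns as CSV.
    `profiles` is a list of dicts like {"level_1_id": "forest",
    "values": [...n_bands floats...]} -- a stand-in for library entries."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow([label_field] + [f"band{i + 1}" for i in range(n_bands)])
    for p in profiles:
        writer.writerow([p[label_field]] + list(p["values"]))
    return out.getvalue()
```

The resulting file has exactly the 1 + n column shape proposed in the comment, so it can feed any external machine learning framework directly.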
-
reporter Great. But better 2 label columns: one for the ID of the pixel or polygon, and another for the class.
-
reporter Benjamin, are you working or going to work on this? Why is this ticket on hold?
-
- changed status to open
-
@Agustin Lobo @Andreas Janz has already implemented several improvements to the Classification Workflow. It should now handle projection differences better and allows different inputs:
- raster features + raster references
- raster features + vector references. It is actually required to use a vector layer with a CategorizedSymbolRenderer, from which the class info is derived.
- Spectral Library with Categorized Symbol Renderer
We also started to provide “developer” versions through the QGIS Plugin Repository. Go into the Plugin Manager settings and activate “Show also experimental plugins”; then you can install the “experimental” developer/test versions:
-
@Agustin Lobo we have a new release v3.7 planned for the end of October. I would recommend waiting for it.
This release will allow building a training dataset from multiple images via the SpectralView.
Regarding training via CSVs: if your format is compatible with the SpectralView import “CSV Table”, you can again train via the SpectralView.
Does that make sense?
-
- changed status to resolved