Classification Workflow: repeated n-fold cross-validation?
In the Classification Workflow, when cross-validation with n folds is selected, the classifier is trained on n-1 folds and used to predict the remaining fold. Is this done only once, or are several n-fold partitions tried? If so, how many? The usual approach is to repeat the procedure, e.g. 10 times for a 10-fold partition.
Comments (6)
-
reporter So, as I understand it, something like:
RepeatedKFold(n_splits=10, n_repeats=10, random_state=random_state)
-
No, internally the cross_val_predict function is used: https://scikit-learn.org/stable/modules/generated/sklearn.model_selection.cross_val_predict.html
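For reference, a minimal self-contained sketch of that call; the dataset and classifier here are illustrative placeholders, not the workflow's actual model:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_predict

# Toy data and model, purely for illustration
X, y = make_classification(n_samples=100, n_features=5, random_state=0)
model = RandomForestClassifier(random_state=0)

# A single 10-fold pass: each sample is predicted exactly once,
# by a model trained on the other nine folds
y_pred = cross_val_predict(model, X, y, cv=10)
print(y_pred.shape)  # one prediction per sample
```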
-
reporter So it would just be (for 10-fold):
cross_val_predict(model, X, y, cv=10)
?
Actually, how could I check this myself? Is there a way to know which specific *.py files are used by a given process (e.g. Cross-validation Accuracy Assessment in the Classification workflow)?
-
For external users it is admittedly not easy to find the specific lines of code behind a particular functionality. What you are looking for is located here:
-
- changed status to resolved
It’s done n times. Each time, one of the n folds is held out and its class labels are predicted. In the end we have an independent prediction for every sample in the dataset.
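A pure-Python sketch of that scheme (the fold assignment and "prediction" are placeholders; the point is the hold-out bookkeeping, not the model):

```python
# A single n-fold pass: each index lands in exactly one test fold,
# so every sample gets exactly one held-out prediction.
n_samples, n_folds = 10, 5
indices = list(range(n_samples))
folds = [indices[i::n_folds] for i in range(n_folds)]

predictions = {}
for k, test_fold in enumerate(folds):
    train = [i for i in indices if i not in test_fold]
    # ...fit a model on `train`, then predict `test_fold`...
    for i in test_fold:
        predictions[i] = f"predicted in fold {k}"

# One independent prediction per sample, with no repetition
assert sorted(predictions) == indices
```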