Gracefully handle files with varying column numbers

Issue #458 new
Robert Leach created an issue

USE CASE: WHAT DO YOU WANT TO DO?

Get feedback on poorly formatted files that will help me fix them quickly, or even give me a way to work around the errors.

STEPS TO REPRODUCE AN ISSUE (OR TRIGGER A NEW FEATURE)

  1. Open a file in which at least 1 row has a different number of tabs than the other rows

CURRENT BEHAVIOR

An exception occurs and the load progress bar freezes at 99%.

EXPECTED BEHAVIOR

A meaningful error should be presented to the user. The user should also have an option to fill missing values at the end of short rows with NaNs.
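The fill option could look roughly like the following. This is only a minimal sketch in Python, not the application's actual code: `pad_short_rows` and its parameters are hypothetical names, and it assumes each row has already been split into a list of cell values.

```python
import math


def pad_short_rows(rows, width, fill=math.nan):
    """Return copies of the rows, padding short ones at the end with `fill`.

    Rows that already have `width` or more cells are copied unchanged
    (hypothetical helper; too-long rows are assumed to be handled elsewhere).
    """
    # A negative repeat count yields an empty list, so long rows are not altered.
    return [list(row) + [fill] * (width - len(row)) for row in rows]


rows = [["1", "2", "3"], ["4", "5"]]
padded = pad_short_rows(rows, width=3)
```

Padding only at the end of a row matches the request above; it cannot recover a value that is missing from the middle of a row.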

DEVELOPERS ONLY SECTION

SUGGESTED CHANGE (Pseudocode optional)

Pre-process the file to determine whether the number of columns in each row is consistent. Save the maximum column count.
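A minimal sketch of that pre-processing pass, in Python for illustration only. The function name `scan_column_counts` is hypothetical, and the sketch assumes a tab-delimited file with no quoted fields that contain tabs.

```python
def scan_column_counts(lines, delimiter="\t"):
    """Return (column count of each row, maximum column count).

    `lines` is any iterable of text lines, e.g. an open file object.
    """
    counts = []
    for line in lines:
        # Drop the line terminator only, so trailing empty cells still count.
        counts.append(len(line.rstrip("\r\n").split(delimiter)))
    return counts, max(counts, default=0)


sample = ["a\tb\tc\n", "1\t2\t3\n", "4\t5\n"]
counts, max_columns = scan_column_counts(sample)
is_consistent = len(set(counts)) <= 1
```

Because the scan only counts delimiters, it can run before any values are parsed, which is what allows a problem to be reported before the load stalls.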

If there are column labels...

If some rows have fewer columns than there are column header labels (and no row has more columns than labels), present the user with a confirm dialog showing how many rows have too few columns, and ask whether they would like to fill the missing data with empty values and proceed, or cancel.

If any row has more columns than labels, present the user with an error stating the number of rows (out of the total) with too many columns, the number of labels, and the row numbers (and/or row labels, if they exist) of the first 1-5 offending rows. The error should tell the user that they must repair the file in a text editor and try again. In this case, the user is returned to the app window.
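The decision logic for the labeled case might be sketched as follows. Everything here is an assumption made for illustration: the function name, the returned dict, and the message wording are hypothetical, and rows are numbered from 1 counting data rows only.

```python
def check_rows_against_labels(counts, n_labels, max_examples=5):
    """Compare each data row's column count with the number of column labels.

    Returns a dict whose 'action' is 'ok', 'confirm_fill', or 'error'.
    """
    short = [i for i, n in enumerate(counts, start=1) if n < n_labels]
    too_long = [i for i, n in enumerate(counts, start=1) if n > n_labels]
    total = len(counts)

    if too_long:
        # Any over-long row is fatal: the user has to repair the file.
        message = (
            f"{len(too_long)} of {total} rows have more than {n_labels} columns "
            f"(first rows: {too_long[:max_examples]}). "
            "Repair the file in a text editor and try again."
        )
        return {"action": "error", "rows": too_long, "message": message}

    if short:
        # Only short rows: offer to fill the missing cells or cancel.
        message = (
            f"{len(short)} of {total} rows have fewer than {n_labels} columns. "
            "Fill the missing data with empty values and proceed?"
        )
        return {"action": "confirm_fill", "rows": short, "message": message}

    return {"action": "ok", "rows": [], "message": ""}


fillable = check_rows_against_labels([3, 3, 2], n_labels=3)
fatal = check_rows_against_labels([3, 4, 2, 5], n_labels=3)
```

Checking for over-long rows first means the confirm dialog is offered only when every inconsistency can be fixed by padding.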

If there are no column labels...

This should work essentially the same way, but instead of the number of column labels, indicate the rows with the maximum number of columns and the number of rows that differ from that maximum, along with the first 1-5 differing rows as examples.
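For the unlabeled case, the report could be assembled along these lines. Again this is a hedged sketch: `summarize_without_labels` and the keys of the returned dict are hypothetical, and rows are numbered from 1.

```python
def summarize_without_labels(counts, max_examples=5):
    """Summarize rows relative to the widest row when there are no labels."""
    max_columns = max(counts, default=0)
    at_max = [i for i, n in enumerate(counts, start=1) if n == max_columns]
    differing = [i for i, n in enumerate(counts, start=1) if n != max_columns]
    return {
        "max_columns": max_columns,
        "rows_at_max": at_max,
        "n_differing": len(differing),
        # Only the first few differing rows are shown to the user.
        "examples": differing[:max_examples],
    }


summary = summarize_without_labels([4, 4, 3, 4, 2, 3, 3, 1, 2])
```

With no labels to act as a reference, every differing row is shorter than the maximum by definition, so the same fill-or-cancel choice could be offered.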

FILES AFFECTED (where the changes will be implemented) - developers only

unknown

LEVEL OF EFFORT - developers only

medium

COMMENTS
