Translation of Excel table column name leads to "recover excel" dialogue box appearance
Got that "recover excel" dialogue after merging it back
To reproduce:
- I've attached a sample file. Download it
- Extract it (to XLIFF)
- Find "OrderDate" word on the
<target>
and replace it with something else (“NewOrderDate”, for example) - Merge it back
- Open the merged file. You should see the "recover excel" dialogue box
Comments (22)
-
-
Update: I’ve just found out that it’s happened not only on Arabic text, it’s also happened on normal text too
-
I was able to reproduce this error by changing “OrderDate” to “XOrderDate” and merge it. The simple round trip didn’t cause this, so I suspect some kind of inconsistency between files within the .xlsx.
-
@Kuro Kurosaka Can you send me the XLIFF you used when you were merging this? I’m having trouble reproducing.
-
- attached SampleData.xlsx.xlf
@Chase Tingley , I'm attaching the .xlsx.xlf file and .out.xlsx file. The Excel that I'm using is an unlicensed (read-only) version of Microsoft Excel for Mac Version 16.36. Interestingly, LibreOffice can open the .out.xlsx file without a problem.
-
- attached SampleData.out.xlsx
-
@Kuro Kurosaka Oh, I was also trying to open it in LibreOffice, maybe that’s why I didn’t see the problem. Possibly a place where Microsoft doesn’t follow their own spec.
cc @Denis Konovalyenko , when you have a chance
-
The company this Excel sheet is referring to, https://www.contextures.com/index.html, is probably the company that made this file. Since they are claiming to be the Excel expert, I am guessing that it might be using some uncommon features against which Okapi has never been tested.
-
@Chase Tingley , @Kuro Kurosaka , Handika D, a closer look at the problem described has been taken and the following nuance has been revealed:
The document has a table (
table1.xml
part) with<tableColumn... name="SomeName"/>
markup, wherename
attribute values represent the unique caption of the table column - this is what is displayed in the header row in the UI, and is referenced through functions. So, if any of the first-row cell values ofsheet2.xml
part is changed (translated), a correspondingtableColumn
name
attribute value has to be changed (translated) as well. This is not happening at the moment.So, a workaround would be to manually adjust every needed table column name to match it with the one that has been translated (see the
sharedStrings.xml
part). As an example, below you can find a screenshot with manually changed “OrderDate” to “NewOrderDate” value:For more information please refer to the
SampleDataManuallyAdjusted.out.xlsx
document. -
- attached SampleDataManuallyAdjusted.out.xlsx
-
So you mean I have to manually open the SampleData file and change the table header (also manually) in Word?
Thank you guys for this discussion anyway
-
Thank you @Denis Konovalyenko . I didn’t know there was a different type of things called table in Excel; I thought each sheet of Excel is a gigantic table. I’m learning everyday.
-
Handika D, I am afraid, the OpenXML filter does not support the behaviour with table parts involved. So, yes, the merged (translated) document would require additional steps to fix it. I will try to make them a bit more verbose:
- Unpack a document with
unzip
- Look at every
xl/tables/tableN.xml
part and find out what are the column names and their order -
Find related
xl/worksheets/sheetN.xml
parts and discover the indexes under which the shared strings are written in thexl/sharedStrings.xml
part. For instance,xml <row r="1" spans="1:7" x14ac:dyDescent="0.25"> <!-- the first row --> <c r="A1" s="11" t="s"> <!-- the cell number --> <v>16</v> <!-- this is a 0-based index in the sharedStrings.xml --> </c> ...
4. Copy all translated column names found in thexl/sharedString.xml
to their corresponding places in thexl/tables/tableN.xml
parts 5. Pack the document withzip
Hope this would be helpful.
@Kuro Kurosaka , indeed, OOXML is not an easy thing to deal with, especially when implementations deviate from the original specification to some extent (happens with Microsoft products from time to time)…
- Unpack a document with
-
Any plan this will be implemented into Okapi?
-
- changed title to Translation of Excel table column name leads to "recover excel" dialogue box appearance
- edited description
-
I’m now trying to solve this issue. Finally I have the time for this even though I don’t have any experience on building localization tools
I’ve inspected thoroughly the
xl/tables/table{N}.xml
for daysI find that the most complicated part of the Table feature is I can’t find the sheet and column related to the content. Have read SpreadsheetML on the official docs but still can’t find the explanation related to this thing.
@Kuro Kurosaka @Denis Konovalyenko Do you guys have opinion on this?
-
@Handika Dwi, I appreciate your attitude towards the resolution of this issue. You are a goal-oriented person. Could you please let me know whether you are willing to patch some documents by hand or there is an intention to dive into a related coding experience?
For the reasons of reducing the volume of data to store and the time to process Excel documents, the shared strings part is used for keeping unique strings as much as possible. They are referenced from worksheets by 0-based indexes.
Furthermore, a table element within a table part refers to a range of columns and rows a worksheet represents (using A1 style referencing). There is
ref=”A1:P701”
in your example. So, I expect that 16 column names should appear in the first row of a worksheet. And here the picture is assembled - we can match the actual values (strings).I assume you may wonder about how to link a worksheet with a particular table. This can be done with the help of the parts relationships. Just look for the information like this
<Relationship Id="rId1" Type="http://schemas.openxmlformats.org/officeDocument/2006/relationships/table" Target="../tables/table1.xml"/>
inworksheets/_rels/sheetX.xml.rels
, wheresheetX
is a worksheet, and a table available for this worksheet can be found in theTarget
attribute. To be more concise, there can be even more reference data in a sheet. It looks a bit extra to me but still someone could use this for coming up with an idea of what tables are linked to a sheet. E.g.<tableParts count="1"> <tablePart r:id="rId1"/> </tableParts>
As for the programmatic solution for this issue, it is going to be considered in the scope of the issue
#1059(I have been looking into this for some time). -
I want to channel my 3+ years of software engineering experience onto Okapi by solving this issue after I’ve used Okapi for the last 2 months. I’m keen on Java, Maven and its whatnot.
However before I delve into Okapi and this issue for that matter, what should I know first related to Okapi’s things?
-
- changed milestone to 1.42.0
-
assigned issue to
-
- changed milestone to 1.43.0
-
Related changes were added to the available pull request #578.
-
- changed status to resolved
A related solution merged in the scope of pull request #578.
- Log in to comment
This is my post. forgot to login
I’m using Tikal to produce this