Issue: 'Maximum attribute size limit exceeded' error from openxml package

Issue #1318 open
Di Hu created an issue

Hello,

We got a 'Maximum attribute size limit exceeded' error during extraction. I tried the method proposed in an existing issue, but it did not work as expected.

I created my okf_openxml@maxAttrubuteSize.fprm file and a customized configuration to create the OpenXMLFilter. I found that maxAttrubuteSize is read, but the value is not passed to P_MAX_ATTRIBUTE_SIZE by the code below in OpenXMLFilter.java:

setPropertyIfSupported(inputFactory, WstxInputProperties.P_MAX_ATTRIBUTE_SIZE, conditionalParameters.getMaxAttributeSize());

So during extraction, the exception

“Caused by: javax.xml.stream.XMLStreamException: Maximum attribute size limit (2097152) exceeded” still occurs.
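
For readers unfamiliar with the guarded property call quoted above, here is a minimal sketch of what a helper like setPropertyIfSupported might look like using only the standard StAX API. It is not Okapi's actual implementation. The class name StaxPropertySketch is hypothetical, and the property string is an assumption about the value of WstxInputProperties.P_MAX_ATTRIBUTE_SIZE. On a factory that does not recognise the property (for example the JDK's built-in one), the call is skipped without an error, so the limit only changes when Woodstox is the active StAX provider and the configured value actually reaches this call.

```java
import javax.xml.stream.XMLInputFactory;

class StaxPropertySketch {
    // Assumed value of the Woodstox constant WstxInputProperties.P_MAX_ATTRIBUTE_SIZE.
    static final String MAX_ATTRIBUTE_SIZE = "com.ctc.wstx.maxAttributeSize";

    /** Sets the property only if the factory recognises it; returns whether it was set. */
    static boolean setPropertyIfSupported(XMLInputFactory factory, String name, Object value) {
        if (!factory.isPropertySupported(name)) {
            return false; // unknown property: skip quietly instead of throwing
        }
        factory.setProperty(name, value);
        return true;
    }

    public static void main(String[] args) {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // 33333333 is the value from the .fprm file described in this report.
        boolean applied = setPropertyIfSupported(factory, MAX_ATTRIBUTE_SIZE, 33333333);
        System.out.println(factory.getClass().getName() + " applied=" + applied);
    }
}
```

Printing the factory class name and the returned flag is a quick way to see which StAX implementation is in use and whether the limit was applied at all.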

Can you help identify if anything is wrong? Thank you so much!

Details: I’ve done the following:

  1. Added an .fprm file, okf_openxml@maxAttrSize.fprm, containing maxAttrubuteSize.i=33333333
  2. Created CustomOpenXMLFilterConfiguration.java:
public class CustomOpenXMLFilterConfiguration {
    public static final String CUSTOM_OKAPI_FILTER_ID = "okf_openxml@maxAttrSize";
    private static final String OKAPI_OPENXML_FILTER_CLASS = "net.sf.okapi.filters.openxml.OpenXMLFilter";
    private static final String CONFIG_FILE_LOCATION = "/resources/okf_openxml@maxAttrSize.fprm";
    private static final String OPENXML_EXTENTIONS = ".docx;.docm;.dotx;.dotm;.pptx;.pptm;.ppsx;.ppsm;.potx;.potm;" +
        ".xlsx;.xlsm;.xltx;.xltm;.vsdx;.vsdm;";

    public static net.sf.okapi.common.filters.FilterConfiguration provideCustomOpenXMLFilterConfiguration() {
        return new FilterConfiguration(
            CUSTOM_OKAPI_FILTER_ID,
            MimeTypeMapper.XML_MIME_TYPE,
            OKAPI_OPENXML_FILTER_CLASS,
            "OPENXML (Customize MaxAttributeSize)",
            "Customize MaxAttributeSize",
            CONFIG_FILE_LOCATION,
            OPENXML_EXTENTIONS);
    }
}

  3. In FilterConfiguration.java, added the customized config to FILTER_CONFIGURATION_MAPPER and added the new filter ID to EXTENSIONS_MAP:

FILTER_CONFIGURATION_MAPPER.addConfiguration(CustomOpenXMLFilterConfiguration.provideCustomOpenXMLFilterConfiguration());
public static final ImmutableMap<FileContentType, String>
    EXTENSIONS_MAP = new ImmutableMap.Builder<FileContentType, String>()
    .put(FileContentType.HTML, CustomHTMLFilterConfiguration.CUSTOM_OKAPI_FILTER_ID)
    .put(FileContentType.XLIFF, "okf_xliff")
    .put(FileContentType.MOSES_TEXT, "okf_mosestext")
    .put(FileContentType.DOCX, CustomOpenXMLFilterConfiguration.CUSTOM_OKAPI_FILTER_ID)
    .put(FileContentType.XLSX, CustomOpenXMLFilterConfiguration.CUSTOM_OKAPI_FILTER_ID)
    .put(FileContentType.PPTX, CustomOpenXMLFilterConfiguration.CUSTOM_OKAPI_FILTER_ID)
    .put(FileContentType.TMX, "okf_tmx")
    .build();

Thanks in advance for the effort!!
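
One quick diagnostic for step 1 is to check which integer entries the .fprm text actually defines. The sketch below is a hypothetical stand-alone parser, not Okapi's own parameter loader. It assumes the name.i=value form shown in step 1 and an optional #v1 header line, which is also an assumption. Lookups in the resulting map are exact string matches, so the name spelled in the file has to be identical to whatever name the filter reads.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FprmKeyCheck {
    /** Parses "name.i=value" lines into a name-to-integer map; all other lines are ignored. */
    static Map<String, Integer> parseIntegerEntries(String fprmText) {
        Map<String, Integer> result = new LinkedHashMap<>();
        for (String line : fprmText.split("\\R")) {
            String trimmed = line.trim();
            int eq = trimmed.indexOf('=');
            if (trimmed.startsWith("#") || eq < 0) {
                continue; // header/comment line, blank line, or no key=value pair
            }
            String key = trimmed.substring(0, eq);
            if (!key.endsWith(".i")) {
                continue; // only integer-typed entries are of interest here
            }
            String name = key.substring(0, key.length() - 2);
            result.put(name, Integer.parseInt(trimmed.substring(eq + 1).trim()));
        }
        return result;
    }

    public static void main(String[] args) {
        // File content as given in step 1, with an assumed "#v1" header line.
        Map<String, Integer> entries = parseIntegerEntries("#v1\nmaxAttrubuteSize.i=33333333\n");
        System.out.println("entries: " + entries);
        System.out.println("has maxAttributeSize: " + entries.containsKey("maxAttributeSize"));
    }
}
```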

Comments (3)

  1. Di Hu reporter

    Hey,

    I added one line in FilterConfigurationMapper.java, and it seems to work now:

    if ( fc.parametersLocation != null ) {
       System.out.println(fc.parametersLocation);
    
       if ( fc.custom ) {
          System.out.println("fc.parametersLocation != null and fc.custom");
          params = getCustomParameters(fc, filter);
       } else if (fc.parametersLocation != null ) {
          // Note that we cannot assume the parameters are the same
          // if we re-used an existing filter, as we cannot compare the 
          // configuration identifiers
          params.load(filter.getClass().getResourceAsStream(fc.parametersLocation), false);
          // ADDED THIS LINE TO SET CUSTOM PARAMETERS
          filter.setParameters(params);
       }
    }
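
The thread does not establish why the extra setParameters call is needed. As a purely hypothetical illustration, the sketch below shows one scenario in which it would matter: a filter that derives its effective limit at the moment parameters are set, so values loaded into the parameters object afterwards are not picked up until the object is handed back. Params and Filter are stand-in types invented for this example, not Okapi classes, and the name maxAttributeSize is assumed from the getter quoted earlier in the report.

```java
import java.util.HashMap;
import java.util.Map;

class LoadThenSetSketch {
    /** Stand-in for a parameters object (not Okapi's IParameters). */
    static class Params {
        final Map<String, Integer> values = new HashMap<>();
    }

    /** Stand-in filter that snapshots the limit whenever parameters are set. */
    static class Filter {
        private Params params = new Params();
        private int effectiveLimit = 2097152; // limit shown in the reported error message

        Params getParameters() {
            return params;
        }

        void setParameters(Params p) {
            this.params = p;
            // Derived state is refreshed only here, not when the map changes later.
            this.effectiveLimit = p.values.getOrDefault("maxAttributeSize", effectiveLimit);
        }

        int effectiveLimit() {
            return effectiveLimit;
        }
    }

    public static void main(String[] args) {
        Filter filter = new Filter();
        Params params = filter.getParameters();
        params.values.put("maxAttributeSize", 33333333); // simulates params.load(...)
        System.out.println("after load only: " + filter.effectiveLimit());
        filter.setParameters(params);                    // the line added in the comment above
        System.out.println("after setParameters: " + filter.effectiveLimit());
    }
}
```

If the real filter behaves differently, the same two-step check (inspect the effective value after loading, then again after setParameters) still helps narrow down where the configured value is lost.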
