Validation fails with ConcurrentModificationException

Issue #253 resolved
Jan Pradac (bredy) created an issue

Hi,

our validation on frequently used service sometimes fails with ConcurrentModificationException:

OpenAPI location=REQUEST key=GET#/service-name/items/ABCD04 levels=ERROR messages=Validation failed.
[ERROR][REQUEST][GET /items/ABCD04 @path.itemId] An error occurred during schema validation - com.google.common.util.concurrent.UncheckedExecutionException: java.util.ConcurrentModificationException.

Where validated path parameter is very simple and is right:

components:
  parameters:
    itemIdParam:
      name: itemId
      in: path
      required: true
      description: ID of a param
      schema:
        $ref: "#/components/schemas/ItemId"
  schemas:
    ItemId:
      type: string
      minLength: 1
      maxLength: 20
      nullable: false
      description: "Id of item entity."
      example: "ABCD04"

Occurrence characteristics:

  • occurrence tents of exceptions per week
  • issue happens with latest “swagger-request-validator-2.8.1“.

After playing with it we was able to catch the exception which is by default absorbed by validation framework:

2019.11.29 09:42:09.422 [ForkJoinPool.commonPool-worker-13] DEBUG  c.a.o.v.s.SchemaValidator  - Error during schema validation
com.google.common.util.concurrent.UncheckedExecutionException: com.google.common.util.concurrent.UncheckedExecutionException: java.util.ConcurrentModificationException
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3976)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4960)
    at com.github.fge.jsonschema.core.processing.CachingProcessor.process(CachingProcessor.java:122)
    at com.github.fge.jsonschema.processors.validation.InstanceValidator.process(InstanceValidator.java:109)
    at com.github.fge.jsonschema.processors.validation.ValidationProcessor.process(ValidationProcessor.java:56)
    at com.github.fge.jsonschema.processors.validation.ValidationProcessor.process(ValidationProcessor.java:34)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.of(ProcessingResult.java:79)
    at com.github.fge.jsonschema.main.JsonSchemaImpl.doValidate(JsonSchemaImpl.java:77)
    at com.github.fge.jsonschema.main.JsonSchemaImpl.validate(JsonSchemaImpl.java:100)
    at com.atlassian.oai.validator.schema.SchemaValidator.validate(SchemaValidator.java:120)
    at com.atlassian.oai.validator.interaction.request.ParameterValidator.validate(ParameterValidator.java:67)
    at com.atlassian.oai.validator.interaction.request.RequestValidator.lambda$validatePathParameter$7(RequestValidator.java:232)
    at java.base/java.util.Optional.map(Optional.java:265)
    at com.atlassian.oai.validator.interaction.request.RequestValidator.validatePathParameter(RequestValidator.java:232)
    at com.atlassian.oai.validator.interaction.request.RequestValidator.lambda$validatePathParameters$5(RequestValidator.java:215)
    at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:195)
    at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
    at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1801)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:913)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:553)
    at com.atlassian.oai.validator.interaction.request.RequestValidator.validatePathParameters(RequestValidator.java:216)
    at com.atlassian.oai.validator.interaction.request.RequestValidator.validateRequest(RequestValidator.java:106)
    at com.atlassian.oai.validator.OpenApiInteractionValidator.lambda$validateRequest$2(OpenApiInteractionValidator.java:186)
    at java.base/java.util.function.Function.lambda$andThen$1(Function.java:88)
    at com.atlassian.oai.validator.OpenApiInteractionValidator.validateOnApiOperation(OpenApiInteractionValidator.java:243)
    at com.atlassian.oai.validator.OpenApiInteractionValidator.validateRequest(OpenApiInteractionValidator.java:183)
    at com.cisco.td.service.globalintelligence.core.init.RiseConditionShareWithThirdPartyTest.lambda$reproduceErrorHello$0(RiseConditionShareWithThirdPartyTest.java:28)
    at java.base/java.util.stream.StreamSpliterators$InfiniteSupplyingSpliterator$OfRef.tryAdvance(StreamSpliterators.java:1360)
    at java.base/java.util.stream.StreamSpliterators$UnorderedSliceSpliterator$OfRef.forEachRemaining(StreamSpliterators.java:1042)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:952)
    at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:926)
    at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
    at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:746)
    at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:290)
    at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1020)
    at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1656)
    at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1594)
    at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:177)
Caused by: com.google.common.util.concurrent.UncheckedExecutionException: java.util.ConcurrentModificationException
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2051)
    at com.google.common.cache.LocalCache.get(LocalCache.java:3953)
    at com.google.common.cache.LocalCache.getOrLoad(LocalCache.java:3976)
    at com.google.common.cache.LocalCache$LocalLoadingCache.get(LocalCache.java:4960)
    at com.github.fge.jsonschema.core.processing.CachingProcessor.process(CachingProcessor.java:122)
    at com.github.fge.jsonschema.processors.validation.ValidationChain.process(ValidationChain.java:114)
    at com.github.fge.jsonschema.processors.validation.ValidationChain.process(ValidationChain.java:57)
    at com.github.fge.jsonschema.core.processing.ProcessorMap$Mapper.process(ProcessorMap.java:166)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.of(ProcessingResult.java:79)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:141)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:133)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
    ... 44 common frames omitted
Caused by: java.util.ConcurrentModificationException: null
    at java.base/java.util.ArrayList$Itr.checkForComodification(ArrayList.java:1042)
    at java.base/java.util.ArrayList$Itr.next(ArrayList.java:996)
    at com.atlassian.oai.validator.schema.keyword.Nullable$NullableDigester.alreadySupportsNullType(Nullable.java:111)
    at com.atlassian.oai.validator.schema.keyword.Nullable$NullableDigester.lambda$setupNullableTypes$1(Nullable.java:72)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:176)
    at java.base/java.util.stream.ReferencePipeline$2$1.accept(ReferencePipeline.java:177)
    at java.base/java.util.ArrayList$ArrayListSpliterator.forEachRemaining(ArrayList.java:1654)
    at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:484)
    at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:474)
    at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:150)
    at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:173)
    at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
    at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:497)
    at com.atlassian.oai.validator.schema.keyword.Nullable$NullableDigester.setupNullableTypes(Nullable.java:73)
    at com.atlassian.oai.validator.schema.keyword.Nullable$NullableDigester.digest(Nullable.java:57)
    at com.github.fge.jsonschema.processors.digest.SchemaDigester.buildDigests(SchemaDigester.java:96)
    at com.github.fge.jsonschema.processors.digest.SchemaDigester.process(SchemaDigester.java:82)
    at com.github.fge.jsonschema.processors.digest.SchemaDigester.process(SchemaDigester.java:47)
    at com.github.fge.jsonschema.core.processing.ProcessorChain$ProcessorMerger.process(ProcessorChain.java:189)
    at com.github.fge.jsonschema.core.processing.ProcessorChain$ProcessorMerger.process(ProcessorChain.java:189)
    at com.github.fge.jsonschema.core.processing.ProcessingResult.of(ProcessingResult.java:79)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:141)
    at com.github.fge.jsonschema.core.processing.CachingProcessor$1.load(CachingProcessor.java:133)
    at com.google.common.cache.LocalCache$LoadingValueReference.loadFuture(LocalCache.java:3529)
    at com.google.common.cache.LocalCache$Segment.loadSync(LocalCache.java:2278)
    at com.google.common.cache.LocalCache$Segment.lockedGetOrLoad(LocalCache.java:2155)
    at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2045)
    ... 58 common frames omitted

From issue observation it seems that issue happens mostly just after web server starts when validator caches are not fully loaded and simultaneously incoming requests are validated. With this knowledge we was also able to create test case with which we can sometimes catch it (once, twice per 1-3 test runs):

Test schema “test-openapi.yaml“

openapi: "3.0.0"
info:
  version: 1.0.0
  title: Example API
  license:
    name: Example license

servers:
  - url: http://example:8080/example-service

paths:
  /hello/{helloId}:
    get:
      summary: Get hello for hello Id
      description: Returns hello for specified Id
      operationId: getHelloDef
      parameters:
        - $ref: "#/components/parameters/helloIdParam"
      responses:
        '200':
          description: Response with one hello message
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/HelloDef'
        '404':
          description: Hello for Id not found
        default:
          description: Unexpected error
          content:
            application/json:
              schema:
                $ref: '#/components/schemas/Error'

components:
  parameters:
    helloIdParam:
      name: helloId
      in: path
      required: true
      description: ID of a hello
      schema:
        $ref: "#/components/schemas/HelloId"

  schemas:
    HelloId:
      type: string
      minLength: 1
      maxLength: 20
      nullable: false
      description: "Id of hello entity."
      example: "HELL01"

    HelloDef:
      properties:
        helloKey:
          type: string
          nullable: true
          example: 'HELLO-123'

    Error:
      required:
        - code
        - message
      properties:
        code:
          type: integer
          format: int32
          example: 13
        message:
          type: string
          example: Error in hello example.

Test case:

public class RiseConditionShareWithThirdPartyTest {

    @Test
    public void reproduceErrorHello() {
        var reportSet = new HashSet<>();
        for (int i = 0; i < 1000; i++) {
            System.out.println(i + " - " + Thread.currentThread().getName());
            OpenApiInteractionValidator validator = new OpenApiInteractionValidator.Builder().withApiSpecification("file:///...UPDATE_PATH...test-openapi.yaml").build();
            Set<ValidationReport> collect = Stream.generate(() -> validator.validateRequest(buildHelloRequest())).parallel().limit(500).collect(Collectors.toSet());
            reportSet.addAll(collect.stream().filter(ValidationReport::hasErrors).collect(Collectors.toSet()));
        }
        assertThat(reportSet, is(emptySet()));
    }

    private SimpleRequest buildHelloRequest() {
        SecureRandom random = new SecureRandom();
        int num = random.nextInt(10);
        return SimpleRequest.Builder.get(String.format("hello/HELL%02d", num)).build();
    }
}

Unfortunately this issue is too deep in core so we don’t feel confident to fix it on our own.

Another unpleasant aspect of this issue is that when it happens, server responds with client error “400 Bad request” which prevents requesting service to distinguish between 5xx and 4xx and possibly repeat request for 5xx. Please consider to handle better exceptions coming from validations in SchemaValidator.java - saw that there was probably attempt to improve it weakening catch from Exception → RuntimeException but “ConcurrentModificationException extends RuntimeException”:

        } catch (final RuntimeException e) {
            log.debug("Error during schema validation", e);
            return ValidationReport.singleton(
                    messages.create(
                            "validation." + keyPrefix + ".schema.unknownError",
                            messages.get(UNKNOWN_ERROR_KEY, e.getMessage()).getMessage()
                    )
            );
        }

Comments (4)

  1. Sven Döring

    I would recommend to upgrade to 2.12.0.

    With the resolved #292 the error became far more unlikely. In fact I wasn’t able to reproduce this error at all. Even with 1.000.000 repetitions and validating 5.000 requests in parallel the error didn’t show. The test took 16 minutes.
    But with v2.8.1 I could very easily within the first seconds.

    With #292 the used JsonSchema is cached. Every JsonSchema creates their own list of ValidationProcessor within a cache. This is now done only once on the first validation.

    However there is still a theoretical chance of getting this error on huge API definitions or 100 simultaneous requests directly after a startup.

  2. Sven Döring

    I found the problem.

    In this specific case the NullableDigester prepares the schemaObject for validation. All nodes of the type "type" will be prepared. On parallel validation perhaps simultaneously.
    To bad the schemaObject mostly contains references to the same objects in SchemaValidator#definitions. Thus leading to a ConcurrentModificationException as same objects may be modified at the same time.

    I’ll prepare a fix.

  3. Sven Döring

    I did a test run with this fix for version 2.8.1.

    After 1 million repetitions and 5 hours test run the error didn’t show.

  4. Log in to comment