[Federation] Paginated Listings

Issue #2109 closed
Michael Fraser created an issue

Section 8.3 (https://openid.net/specs/openid-federation-1_0.html#name-subordinate-listings) defines the list endpoint, and the spec itself flags an issue with the size of the response:

the list contained in the result MAY be a very large list.

I would argue this is something best solved through pagination inside the spec instead of simply just acknowledging it as an issue.

I would propose adding two optional pagination keys to the request (size and page) and adjusting the response to the following:

200 OK
Content-Type: application/json

{
  "page": 0,
  "size": 5,
  "content": [
    "https://ntnu.andreas.labs.uninett.no/",
    "https://blackboard.ntnu.no/openid/callback",
    "https://serviceprovider.andreas.labs.uninett.no/application17"
  ]
}
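
As a rough illustration of the proposal, the sketch below slices a subordinate list into zero-indexed pages and wraps each page in the response shape shown above. The function name paginate and the example.org identifiers are hypothetical and not part of the spec; the request keys page and size are taken from the proposal.

```python
from typing import Any


def paginate(entity_ids: list[str], page: int = 0, size: int = 5) -> dict[str, Any]:
    """Return one zero-indexed page in the response shape proposed above."""
    if page < 0 or size < 1:
        raise ValueError("page must be >= 0 and size must be >= 1")
    start = page * size
    return {"page": page, "size": size, "content": entity_ids[start:start + size]}


# Hypothetical subordinates, for illustration only.
subordinates = [f"https://rp{i}.example.org/" for i in range(12)]
first = paginate(subordinates, page=0, size=5)
last = paginate(subordinates, page=2, size=5)   # final, partially filled page
```

A page past the end of the list simply comes back with an empty content array, which is one way a client could detect that it has read everything.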

I would like to know people’s thoughts on this, thanks

Comments (23)

  1. Michael Fraser reporter

    I think there are some clear benefits to keeping responses a bit smaller and controlled but also there are some (theoretical) limits to the current design where some technologies (such as cloud services) limit the response size. This limitation is, of course, mostly conceptual as you’d have to be serving an ungodly number of entity IDs to hit them…

    I do like the format proposed in the issue you link

  2. Michael Jones
    • changed status to open

    The SCIM experience is that pagination is a source of complexity. Handling the case where the data set changed between paginated calls is non-trivial.

  3. Michael Fraser reporter

    Yes, we’ve had this issue come up in one implementation, with an endpoint that has a similar style of functionality to the federation list. In that particular scenario, the ecosystem operator simply accepted it as a known problem, especially as the risk of missing a change in data is also present with a one-time grab. A potential counter to that could be the use of cursor-based pagination. Given that each entity identifier must be globally unique anyway, this would mitigate the problem of duplicate records.

    The example below is based on https://jsonapi.org/profiles/ethanresnick/cursor-pagination/

    200 OK
    Content-Type: application/json
    
    {
      "links": {
        "prev": "/list?page[before]=https%3A%2F%2Fntnu.andreas.labs.uninett.no%2F&page[size]=2",
        "next": "/list?page[after]=https%3A%2F%2Fblackboard.ntnu.no%2Fopenid%2Fcallback&page[size]=2"
      },
      "content": [
        "https://ntnu.andreas.labs.uninett.no/",
        "https://blackboard.ntnu.no/openid/callback"
      ]
    }
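
    A minimal sketch of how a server might produce such a cursor response, assuming the listing is kept in lexicographic order of entity identifier (an assumption; the example above does not state an ordering). The function name cursor_page is hypothetical, and only the next link is generated here.

```python
from typing import Any, Optional
from urllib.parse import quote


def cursor_page(sorted_ids: list[str], after: Optional[str] = None, size: int = 2) -> dict[str, Any]:
    """Return the entries that sort strictly after the cursor, plus a next link."""
    remaining = [e for e in sorted_ids if after is None or e > after]
    content = remaining[:size]
    links = {}
    if len(remaining) > size:
        # The last identifier of this page becomes the cursor for the next one.
        cursor = quote(content[-1], safe="")
        links["next"] = f"/list?page[after]={cursor}&page[size]={size}"
    return {"links": links, "content": content}


ids = sorted([
    "https://ntnu.andreas.labs.uninett.no/",
    "https://blackboard.ntnu.no/openid/callback",
    "https://serviceprovider.andreas.labs.uninett.no/application17",
])
first = cursor_page(ids, size=2)
second = cursor_page(ids, after=first["content"][-1], size=2)
```

    Because the cursor is an identifier rather than an offset, an entry inserted earlier in the ordering cannot shift later pages and cause duplicates.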
    

    I do believe this is a problem worth attempting to solve (or at least allowing for implementations to optionally solve) given the sheer size this list can grow to

  4. Giuseppe De Marco

    Hey Michael,
    we have two options:

    1. create an optional endpoint in the current federation specs
    2. create a separate draft to extend the current federation specs with this additional endpoint

    let’s see what other authors and you think about this.

  5. Michael Jones

    Or we have a third option:

    3. Wait for feedback from actual deployments about whether pagination is needed in practice.

  6. Michael Jones

    This was discussed by the editors on Feb 1, 2024. We agreed to wait for feedback from actual deployments about whether pagination is needed in practice.

    If it's needed, this capability can be added later in a non-breaking way.

  7. Michael Fraser reporter

    Following some discussion we had at IETF 119, and to add more context: this issue was raised because of challenges that the Federations we are implementing in Australia and Brazil will face. In their respective models, the Federation is a very flat structure, with a single trust anchor / intermediate issuing statements for a few thousand entities. The large number of entries this produces in the list endpoint has been the driver behind this issue.

    We also discussed the challenges that pagination poses with regard to data updates mid-retrieval. It was suggested that the fetch endpoint could be used to filter out inconsistencies here

  8. Giuseppe De Marco

    Following our discussion in Rome at the OSW, I'd like to present some insights regarding the proposal to introduce an optional advanced listing endpoint featuring pagination among other enhancements.

    Problem Addressed by Pagination

    In federations adopting a star topology without intermediaries, the subordinate entity listing endpoint may need to accommodate more than 16K entities. This scenario necessitates pagination.

    Pagination Challenges

    The primary challenge with pagination is the potential for inconsistent results in non-transactional datasets. For instance, if Giuseppe requests Page 2 while an entity from Page 1 is removed, the results may not align. A proposed workaround involves tracking the total_entries count to detect changes in the number of entries. However, this approach falls short in scenarios where one entity is removed as another is added, keeping the total_entries count unchanged despite a variation in the dataset.
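
    The shortcoming of a count-based check can be seen in a few lines. This is only an illustration with made-up example.org identifiers: one entity is removed while another is added, so the count stays the same although the membership differs.

```python
# Hypothetical dataset before and after a simultaneous removal and addition.
before = {"https://a.example.org/", "https://b.example.org/"}
after = (before - {"https://a.example.org/"}) | {"https://c.example.org/"}

same_count = len(before) == len(after)   # what a total_entries check would compare
same_members = before == after           # what the requester actually cares about
```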

    Proposed Solution for Pagination

    To address this, a top-level claim indicating changes in the dataset is necessary. If this claim changes while the requester is navigating through pages, it signals a change in the dataset, prompting the requester to restart fetching pages from the beginning.
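
    A sketch of the requester side of this idea, under stated assumptions: the marker is the dataset_iat claim from this proposal, pages are numbered from 1, and the page shape (entries, total_pages) plus the FakeServer class are hypothetical stand-ins rather than anything defined by the spec.

```python
from typing import Callable

Page = dict  # assumed shape: {"dataset_iat": int, "entries": list, "total_pages": int}


def fetch_all(get_page: Callable[[int], Page], max_restarts: int = 3) -> list:
    """Collect every page, starting over whenever the dataset marker changes."""
    for _ in range(max_restarts + 1):
        first = get_page(1)
        marker = first["dataset_iat"]
        entries = list(first["entries"])
        consistent = True
        for number in range(2, first["total_pages"] + 1):
            page = get_page(number)
            if page["dataset_iat"] != marker:
                consistent = False   # dataset changed mid-walk: discard and restart
                break
            entries.extend(page["entries"])
        if consistent:
            return entries
    raise RuntimeError("dataset kept changing; giving up")


class FakeServer:
    """In-memory stand-in whose dataset changes once, on the second request."""

    def __init__(self) -> None:
        self.ids = ["https://a.example.org/", "https://b.example.org/", "https://c.example.org/"]
        self.iat = 1
        self.calls = 0

    def get_page(self, number: int) -> Page:
        self.calls += 1
        if self.calls == 2:   # b is removed and d is added between page 1 and page 2
            self.ids = ["https://a.example.org/", "https://c.example.org/", "https://d.example.org/"]
            self.iat = 2
        size = 2
        start = (number - 1) * size
        total_pages = -(-len(self.ids) // size)   # ceiling division
        return {"dataset_iat": self.iat, "entries": self.ids[start:start + size],
                "total_pages": total_pages}


server = FakeServer()
result = fetch_all(server.get_page)
```

    The restart cap is a design choice of this sketch: on a dataset that changes very frequently, an uncapped loop might never finish.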

    Advanced Listing Endpoint Proposal

    This endpoint is designed to complement, not replace, the existing subordinate listing endpoint, which remains mandatory. Features of the advanced listing endpoint include:

    • Pagination as per the specification.
    • The ability to append additional data about an entity in a JSON object, covering aspects currently unaddressed, such as accreditation dates, update dates, reasons for subordinate revocation, and the dates those revocations took effect.

    With the inclusion of optional claims per entity, implementers seeking to provide comprehensive data can do so efficiently. For example, they could offer multiple subordinate statements in one go, maintaining consistency in the dataset_iat unless there's a change in the dataset's composition.

    {
      "iss": "https://trust-anchor.star-federation.example.org",
      "dataset_iat": 1713456341,
      "immediate_subordinates": [
        {
          "https://rp1.example.com/oidc/rp": {
            "registered": 1704217688,
            "updated": 1704217688,
            "subordinate_statement": "eyJ0eXAiOiJlbnRpdHktc3RhdGVtZW50K2p3dCIsImFsZyI6IlJTMjU2Iiwia2lkIjoiQlh2ZnJsbmhBTXVIUjA3YWpVbUFjQlJRY1N6bXcwY19SQWdKbnBTLTlXUSJ9.eyJleHAiOjE3MTM2MjkxNDEsImlhdCI6MTcxMzQ1NjM0MSwiaXNzIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwIiwic3ViIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL29pZGMvcnAiLCJqd2tzIjp7ImtleXMiOlt7Imt0eSI6IlJTQSIsImUiOiJBUUFCIiwibiI6InBuX0ljaEM2NlNGUU1oYlRITHRiRDU4aktpWVl2WW83UzR3alBqekVDTXUyN2M2RkpWRk5YdGx1YnRiN3NDNi1XVFExSHY0clNRZFBoYWZKYkl4YTMyUjUxc1JRcGtUcjNKRk1ZUDd4MjJEUlFEX2l4dFFKUmFpSHctbnBuWjhxZ1ZISl90NGdSVGM0SEprZWhCTEd2NC1ySFZBS3pGaVFOVTF1MkFGdzFmV01uTUg0b2JfcHlpc1hWZ2NrdTNkeTE0bDdzWVNBTmxwWHVmWV9xbmtRRlR2MHdNSC1DNkl6bC1ha0VOUzJVSHB2VExoZkNCVktQckZYSnh1bDRYRGJVd1Vidk5aVXhUZXJuRXg4bFY1Z3hDU2dLU0JFZ29IOU1ncEQxWVdGUGJBbndpN3A3ZTdNTkd6NWxIN2VERktrUFFoWExXQUJVOFV2RUlJV3lBOTVTUSIsImtpZCI6Ims1NEhRdERpYnlHY3M5WldWTWZ2aUhmLTJxTGNGVXRwd1kycmd4Qms4OE0ifV19LCJtZXRhZGF0YV9wb2xpY3kiOnsib3BlbmlkX3JlbHlpbmdfcGFydHkiOnsic2NvcGUiOnsic3VwZXJzZXRfb2YiOlsib3BlbmlkIl0sInN1YnNldF9vZiI6WyJvcGVuaWQiLCJvZmZsaW5lX2FjY2VzcyIsInByb2ZpbGUiLCJlbWFpbCJdfSwiY29udGFjdHMiOnsiYWRkIjpbImNpYW9AZW1haWwuaXQiXX19fSwic291cmNlX2VuZHBvaW50IjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL2ZldGNoIiwidHJ1c3RfbWFya3MiOlt7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXB1YmxpYyIsInRydXN0X21hcmsiOiJleUowZVhBaU9pSjBjblZ6ZEMxdFlYSnJLMnAzZENJc0ltRnNaeUk2SWxKVE1qVTJJaXdpYTJsa0lqb2lRbGgyWm5Kc2JtaEJUWFZJVWpBM1lXcFZiVUZqUWxKUlkxTjZiWGN3WTE5U1FXZEtibkJUTFRsWFVTSjkuZXlKcGMzTWlPaUpvZEhSd09pOHZNVEkzTGpBdU1DNHhPamd3TURBaUxDSnpkV0lpT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQXZiMmxrWXk5eWNDSXNJbWxoZENJNk1UY3hNelExTmpNME1Td2lhV1FpT2lKb2RIUndjem92TDNkM2R5NXpjR2xrTG1kdmRpNXBkQzlqWlhKMGFXWnBZMkYwYVc5dUwzSndJaXdpYldGeWF5STZJbWgwZEhCek9pOHZkM2QzTG1GbmFXUXVaMjkyTG1sMEwzUm9aVzFsY3k5amRYTjBiMjB2WVdkcFpDOXNiMmR2TG5OMlp5SXNJbkpsWmlJNkltaDBkSEJ6T2k4dlpHOWpjeTVwZEdGc2FXRXVhWFF2YVhSaGJHbGhMM053YVdRdmMzQnBaQzF5WldkdmJHVXRkR1ZqYm1samFHVXRiMmxrWXk5cGRDOXpkR0ZpYVd4bEwybHVaR1Y0TG1oMGJXd2lmUS5Dd
WVNTm53TG9SNWlqZ1hpUnRWWlkwU1ZCMWFhNGh6Yk5HRWxvR0ZDa1JBaE1zcTZXNVVxMXFidHFRcHVzczBLWE1EX254WEthandIT3BfT2x6a0ctWWNMdjRSeTUwbTROYW1GVUpRckQzYWlxVHFCR09BNXkyUVhJUFhwa2lzNUN3OVhyTko2ZUcyUXN5MFFhc1FfazZ1N05rTGFUUTgwYUJqcHdVX0YtaUdzV3dpLS1Yc1g5Q1Z0VC1yRHJuWUdYbnFwUnNWRlQzUHU1blNJZzhzVEU2bWRTS3lZN0F2MjBUNU5SVlRKcnBrVzZ5UDhBMktpR1JCeUFiYVVickZtQ0c1NGlpUlNPQVRFMmxMbTV1RW16bUJyVzcwTVlhTWpQUmRGemJlNGhPbzV2UTJSZHlwUXNWLUFtNzI0bWNHaHl1R0N6MWk4emMxMFVrLXVpbkkyOFEifSx7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXByaXZhdGUiLCJ0cnVzdF9tYXJrIjoiZXlKMGVYQWlPaUowY25WemRDMXRZWEpySzJwM2RDSXNJbUZzWnlJNklsSlRNalUySWl3aWEybGtJam9pUWxoMlpuSnNibWhCVFhWSVVqQTNZV3BWYlVGalFsSlJZMU42Ylhjd1kxOVNRV2RLYm5CVExUbFhVU0o5LmV5SnBjM01pT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQWlMQ0p6ZFdJaU9pSm9kSFJ3T2k4dk1USTNMakF1TUM0eE9qZ3dNREF2YjJsa1l5OXljQ0lzSW1saGRDSTZNVGN4TXpRMU5qTTBNU3dpYVdRaU9pSm9kSFJ3Y3pvdkwzZDNkeTV6Y0dsa0xtZHZkaTVwZEM5alpYSjBhV1pwWTJGMGFXOXVMM0p3TDNCeWFYWmhkR1VpTENKc2IyZHZYM1Z5YVNJNkltaDBkSEJ6T2k4dmQzZDNMbUZuYVdRdVoyOTJMbWwwTDNSb1pXMWxjeTlqZFhOMGIyMHZZV2RwWkM5c2IyZHZMbk4yWnlJc0luSmxaaUk2SW1oMGRIQnpPaTh2Wkc5amN5NXBkR0ZzYVdFdWFYUXZhWFJoYkdsaEwzTndhV1F2YzNCcFpDMXlaV2R2YkdVdGRHVmpibWxqYUdVdGIybGtZeTlwZEM5emRHRmlhV3hsTDJsdVpHVjRMbWgwYld3aWZRLkxNbnBhcTRubWJVbkpQYllhNHNrU25OUk5DV0VHSi1xbUhpUDR6cVoxcW4tWmNtaXVjb0ZIR1VVMU44RDQyd3RiRXN0TEttMTJPY0xaMk43N1NRMHRMMnQ3NFF0ZF8xV3Y2VzFaaEVoUlZ3dWVLMVZCS0F0SXR1YXM1a2RwR1oxcHRHRUJDQklBSWVGaGQwS3BlOXRIMGpZRnFBbEQ5b0k5cFdrR2xIcEp1SFoweEI5LU03dHRuRl9HSGUwSFZNcmZoOUNZTkxhRHFXdDRsQko2bDBMOWU2eDl6T3YzRllMSUJTdTdTWmE5VTJReDBtdEtWQ3A4VnhKSEMyN3Rfa0dZX1FMaGcxRFFMUTB3SGpON2o1MDZHeEJ3TEVlTVlDVERwYlZkWWN1ZG5ZVzBNRkViaVNRdnFPMGZiX2RVVTNZM2tWOGJQdnNCSnhfQ2xCalphenY3QSJ9XX0.ZB2ClwQ9zbGzwoXebHyzpd9yVGjTV_mk-183q31SY6sI47iHNMNApgz_a2TvfR2U6qzvfysP412reBUDYp1P5c4KG4eVAH-LBlE9tDq9iZc4kNi2AT_GX83APGHh10IF2_HVF6kr7c0scwcObn7rCmv4dF_ca49UCtRhqjDnxltDfcMSOx-M5zriKJycqpURJ28pVX0ZX1Jzu_MM3iwen4xzPfkJG_U2Tk-JjqQnpsAtIYiaqdAsIldvz3AX77GRVIVX1UuAMu_mW607F
ELOzRn_-rH4XLWdCL2gl9dXfda4yMpweOpKbiIto30xLhH0oyCXkqlfYlkfDuoYFqo5TQ"
          }
        },
        {
          "https://rp2.example.edu/rp": {
            "registered": 1704215688,
            "updated": 1704216688,
            "revoked": 1704217688,
            "revocation_reason": "..."
          }
        }
      ],
      "trust_marked_entities": { ... },
      "page": 1,
      "total_pages": 1,
      "total_entries": 2,
      "next_page_path": "",
      "prev_page_path": ""
    }
    

    WDYT?

  9. Michael Fraser reporter

    I very much like this solution - it would both enable the solving of this issue here as well as https://bitbucket.org/openid/connect/issues/2145/additional-filtering-options-in-the

    As discussed at OSW I think we’d have to bring some guidance as to what flavour of pagination to bring for interop purposes (thinking cursor vs page). What's your vision for the trust_marked_entities in the above? A minor issue but I’d be hesitant to have additional non-metadata style data outside of the paged response itself

    When it comes to the fields available to be added under each entity, my gut is we should define a list of allowed values… whether that forms the list of metadata options already defined in the spec anyway plus some additional such as “registered”, “updated”, etc

    I’ll have a crack at implementing this as a test in our own implementation and see how it goes

  10. Giuseppe De Marco

    yes, let’s try to address https://bitbucket.org/openid/connect/issues/2145/additional-filtering-options-in-the in this adv listing endpoint.

    Regarding the trust_marked_entities optional member: it can be considered completely out of scope, or not. It’s up to our discussion. Something like this would probably make more sense:

    {
          "https://rp1.example.com/oidc/rp": {
            "registered": 1704217688,
            "updated": 1704217688,
            "trust_marks": [{...},{...}],
            "subordinate_statement": "eyJ0eXAiOiJlbnRpdHktc3RhdGVtZW50K2p3dCIsImFsZyI6IlJTMjU2Iiwia2lkIjoiQlh2ZnJsbmhBTXVIUjA3YWpVbUFjQlJRY1N6bXcwY19SQWdKbnBTLTlXUSJ9.eyJleHAiOjE3MTM2MjkxNDEsImlhdCI6MTcxMzQ1NjM0MSwiaXNzIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwIiwic3ViIjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL29pZGMvcnAiLCJqd2tzIjp7ImtleXMiOlt7Imt0eSI6IlJTQSIsImUiOiJBUUFCIiwibiI6InBuX0ljaEM2NlNGUU1oYlRITHRiRDU4aktpWVl2WW83UzR3alBqekVDTXUyN2M2RkpWRk5YdGx1YnRiN3NDNi1XVFExSHY0clNRZFBoYWZKYkl4YTMyUjUxc1JRcGtUcjNKRk1ZUDd4MjJEUlFEX2l4dFFKUmFpSHctbnBuWjhxZ1ZISl90NGdSVGM0SEprZWhCTEd2NC1ySFZBS3pGaVFOVTF1MkFGdzFmV01uTUg0b2JfcHlpc1hWZ2NrdTNkeTE0bDdzWVNBTmxwWHVmWV9xbmtRRlR2MHdNSC1DNkl6bC1ha0VOUzJVSHB2VExoZkNCVktQckZYSnh1bDRYRGJVd1Vidk5aVXhUZXJuRXg4bFY1Z3hDU2dLU0JFZ29IOU1ncEQxWVdGUGJBbndpN3A3ZTdNTkd6NWxIN2VERktrUFFoWExXQUJVOFV2RUlJV3lBOTVTUSIsImtpZCI6Ims1NEhRdERpYnlHY3M5WldWTWZ2aUhmLTJxTGNGVXRwd1kycmd4Qms4OE0ifV19LCJtZXRhZGF0YV9wb2xpY3kiOnsib3BlbmlkX3JlbHlpbmdfcGFydHkiOnsic2NvcGUiOnsic3VwZXJzZXRfb2YiOlsib3BlbmlkIl0sInN1YnNldF9vZiI6WyJvcGVuaWQiLCJvZmZsaW5lX2FjY2VzcyIsInByb2ZpbGUiLCJlbWFpbCJdfSwiY29udGFjdHMiOnsiYWRkIjpbImNpYW9AZW1haWwuaXQiXX19fSwic291cmNlX2VuZHBvaW50IjoiaHR0cDovLzEyNy4wLjAuMTo4MDAwL2ZldGNoIiwidHJ1c3RfbWFya3MiOlt7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXB1YmxpYyIsInRydXN0X21hcmsiOiJleUowZVhBaU9pSjBjblZ6ZEMxdFlYSnJLMnAzZENJc0ltRnNaeUk2SWxKVE1qVTJJaXdpYTJsa0lqb2lRbGgyWm5Kc2JtaEJUWFZJVWpBM1lXcFZiVUZqUWxKUlkxTjZiWGN3WTE5U1FXZEtibkJUTFRsWFVTSjkuZXlKcGMzTWlPaUpvZEhSd09pOHZNVEkzTGpBdU1DNHhPamd3TURBaUxDSnpkV0lpT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQXZiMmxrWXk5eWNDSXNJbWxoZENJNk1UY3hNelExTmpNME1Td2lhV1FpT2lKb2RIUndjem92TDNkM2R5NXpjR2xrTG1kdmRpNXBkQzlqWlhKMGFXWnBZMkYwYVc5dUwzSndJaXdpYldGeWF5STZJbWgwZEhCek9pOHZkM2QzTG1GbmFXUXVaMjkyTG1sMEwzUm9aVzFsY3k5amRYTjBiMjB2WVdkcFpDOXNiMmR2TG5OMlp5SXNJbkpsWmlJNkltaDBkSEJ6T2k4dlpHOWpjeTVwZEdGc2FXRXVhWFF2YVhSaGJHbGhMM053YVdRdmMzQnBaQzF5WldkdmJHVXRkR1ZqYm1samFHVXRiMmxrWXk5cGRDOXpkR0ZpYVd4bEwybHVaR1Y0TG1oMGJXd2lmUS5Dd
WVNTm53TG9SNWlqZ1hpUnRWWlkwU1ZCMWFhNGh6Yk5HRWxvR0ZDa1JBaE1zcTZXNVVxMXFidHFRcHVzczBLWE1EX254WEthandIT3BfT2x6a0ctWWNMdjRSeTUwbTROYW1GVUpRckQzYWlxVHFCR09BNXkyUVhJUFhwa2lzNUN3OVhyTko2ZUcyUXN5MFFhc1FfazZ1N05rTGFUUTgwYUJqcHdVX0YtaUdzV3dpLS1Yc1g5Q1Z0VC1yRHJuWUdYbnFwUnNWRlQzUHU1blNJZzhzVEU2bWRTS3lZN0F2MjBUNU5SVlRKcnBrVzZ5UDhBMktpR1JCeUFiYVVickZtQ0c1NGlpUlNPQVRFMmxMbTV1RW16bUJyVzcwTVlhTWpQUmRGemJlNGhPbzV2UTJSZHlwUXNWLUFtNzI0bWNHaHl1R0N6MWk4emMxMFVrLXVpbkkyOFEifSx7ImlkIjoiaHR0cHM6Ly93d3cuc3BpZC5nb3YuaXQvb3BlbmlkLWZlZGVyYXRpb24vYWdyZWVtZW50L3NwLXByaXZhdGUiLCJ0cnVzdF9tYXJrIjoiZXlKMGVYQWlPaUowY25WemRDMXRZWEpySzJwM2RDSXNJbUZzWnlJNklsSlRNalUySWl3aWEybGtJam9pUWxoMlpuSnNibWhCVFhWSVVqQTNZV3BWYlVGalFsSlJZMU42Ylhjd1kxOVNRV2RLYm5CVExUbFhVU0o5LmV5SnBjM01pT2lKb2RIUndPaTh2TVRJM0xqQXVNQzR4T2pnd01EQWlMQ0p6ZFdJaU9pSm9kSFJ3T2k4dk1USTNMakF1TUM0eE9qZ3dNREF2YjJsa1l5OXljQ0lzSW1saGRDSTZNVGN4TXpRMU5qTTBNU3dpYVdRaU9pSm9kSFJ3Y3pvdkwzZDNkeTV6Y0dsa0xtZHZkaTVwZEM5alpYSjBhV1pwWTJGMGFXOXVMM0p3TDNCeWFYWmhkR1VpTENKc2IyZHZYM1Z5YVNJNkltaDBkSEJ6T2k4dmQzZDNMbUZuYVdRdVoyOTJMbWwwTDNSb1pXMWxjeTlqZFhOMGIyMHZZV2RwWkM5c2IyZHZMbk4yWnlJc0luSmxaaUk2SW1oMGRIQnpPaTh2Wkc5amN5NXBkR0ZzYVdFdWFYUXZhWFJoYkdsaEwzTndhV1F2YzNCcFpDMXlaV2R2YkdVdGRHVmpibWxqYUdVdGIybGtZeTlwZEM5emRHRmlhV3hsTDJsdVpHVjRMbWgwYld3aWZRLkxNbnBhcTRubWJVbkpQYllhNHNrU25OUk5DV0VHSi1xbUhpUDR6cVoxcW4tWmNtaXVjb0ZIR1VVMU44RDQyd3RiRXN0TEttMTJPY0xaMk43N1NRMHRMMnQ3NFF0ZF8xV3Y2VzFaaEVoUlZ3dWVLMVZCS0F0SXR1YXM1a2RwR1oxcHRHRUJDQklBSWVGaGQwS3BlOXRIMGpZRnFBbEQ5b0k5cFdrR2xIcEp1SFoweEI5LU03dHRuRl9HSGUwSFZNcmZoOUNZTkxhRHFXdDRsQko2bDBMOWU2eDl6T3YzRllMSUJTdTdTWmE5VTJReDBtdEtWQ3A4VnhKSEMyN3Rfa0dZX1FMaGcxRFFMUTB3SGpON2o1MDZHeEJ3TEVlTVlDVERwYlZkWWN1ZG5ZVzBNRkViaVNRdnFPMGZiX2RVVTNZM2tWOGJQdnNCSnhfQ2xCalphenY3QSJ9XX0.ZB2ClwQ9zbGzwoXebHyzpd9yVGjTV_mk-183q31SY6sI47iHNMNApgz_a2TvfR2U6qzvfysP412reBUDYp1P5c4KG4eVAH-LBlE9tDq9iZc4kNi2AT_GX83APGHh10IF2_HVF6kr7c0scwcObn7rCmv4dF_ca49UCtRhqjDnxltDfcMSOx-M5zriKJycqpURJ28pVX0ZX1Jzu_MM3iwen4xzPfkJG_U2Tk-JjqQnpsAtIYiaqdAsIldvz3AX77GRVIVX1UuAMu_mW607F
ELOzRn_-rH4XLWdCL2gl9dXfda4yMpweOpKbiIto30xLhH0oyCXkqlfYlkfDuoYFqo5TQ"
          }
        }
    

    When it comes to the fields available to be added under each entity, my gut is we should define a list of allowed values… whether that forms the list of metadata options already defined in the spec anyway plus some additional such as “registered”, “updated”, etc

    I agree with you, we should define a known set of members and leave it up to the implementers to add any other claim, as we have already done with the trust mark schema.

  11. Giuseppe De Marco
    • changed status to new

    since it is actively discussed, I realized that "on hold" doesn't bring it in the issues list, actually hiding it

  12. Michael Fraser reporter

    I’ve implemented a PoC of the above on our Federation implementation internally and so far it's been good to work with. I just did a basic set of query parameters:

    • page_number
    • page_size
    • updated_after

    I also included all of the parameters from the original list endpoint (entity_type, trust_marked, trust_mark_id). The only thing that jumped out at me is that entity_type should really become mandatory when using the format above. Because it is not mandatory, and because the additional “keys” one can expect in the response maps change depending on the subject entity type, it added complexity. This would be solved by entity_type becoming mandatory, so that you know what sort of format to expect.
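
    A sketch of how these query parameters could combine on the server side. This is not the PoC itself: the Subordinate record, the advanced_list function, the 1-based page_number, and the strict "later than" reading of updated_after are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subordinate:
    entity_id: str
    entity_types: frozenset
    updated: int   # seconds since the epoch


def advanced_list(records: list, entity_type: str, updated_after: Optional[int] = None,
                  page_number: int = 1, page_size: int = 100) -> list:
    """Filter by the mandatory entity_type, then by updated_after, then paginate."""
    matches = [
        r for r in records
        if entity_type in r.entity_types
        and (updated_after is None or r.updated > updated_after)
    ]
    matches.sort(key=lambda r: r.entity_id)   # stable order so that pages do not overlap
    start = (page_number - 1) * page_size
    return [r.entity_id for r in matches[start:start + page_size]]


records = [
    Subordinate("https://rp1.example.com/oidc/rp", frozenset({"openid_relying_party"}), 1704217688),
    Subordinate("https://rp2.example.edu/rp", frozenset({"openid_relying_party"}), 1704216688),
    Subordinate("https://op.example.org/", frozenset({"openid_provider"}), 1704217000),
]
recent_rps = advanced_list(records, "openid_relying_party", updated_after=1704217000)
all_rps = advanced_list(records, "openid_relying_party")
```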

  13. Ralph Bragg

    Hi, registering my support for, and need of, paginated responses due to the number of entities within our federations (1000+). I’d also like to flag that in some ecosystems where the list response contains very large datasets, we would want to deny the use of the ‘list’ endpoint and require appropriate use of the advanced filtering endpoint / API. I’m not going to want federation participants downloading MBs of information when a more advanced, smaller payload is possible, perhaps one with a default filter such as ‘last 24 hours’ worth of changes’ added to it.

  14. Michael Jones

    To be clear, the listing endpoint returns Entity Identifiers (URLs) - not Entity Configurations (JWTs). 1000 Entity Identifiers is likely to be about 20-30K of data - not megabytes. It’s only once people start retrieving the corresponding Entity Configurations that you’ll reach megabytes or possibly tens of megabytes.
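
    The order of magnitude can be checked with a quick calculation. The identifiers below are invented and 27 characters long; real Entity Identifiers vary in length, so this is only an estimate of the size of a compact JSON array holding 1000 of them.

```python
import json

# 1000 hypothetical Entity Identifiers, 27 characters each.
entity_ids = [f"https://rp{i:04d}.example.org/" for i in range(1000)]

# Compact JSON array, as a listing endpoint might serialise it.
body = json.dumps(entity_ids, separators=(",", ":"))
size_kb = len(body.encode("utf-8")) / 1000
```

    With identifiers of this length the payload lands in the tens of kilobytes, well short of megabytes.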

    Don’t get me wrong - I’m in favor of the ability to list useful subsets of immediate subordinate entities. But I’m trying to have us be precise about what the operations being discussed do.

    My ask is for those who have implemented prototypes and/or who have these ecosystem needs to say exactly what query parameters they want to use on the listing endpoint and what their meanings are.

    For instance, if queries select based on “changed since” information, what kinds of changes qualify? Key changes? Joining the federation as an immediate subordinate? What else? Do you want responses for former subordinates that are no longer part of the federation? What kinds of state are you asking that immediate superiors track about their immediate subordinates, and how stale is that information allowed to be in query responses?

    Ralph, I appreciate your participation. The more specific you can be about what you need when and why, the more actionable the information will be. Thanks.

  15. Michael Fraser reporter

    We’ve been in discussion recently with a party whose ecosystem would have upwards of 50k participants. In such a scenario the list endpoint response would be between 5 and 10 MB.

    To your point above, Mike: in the Australian context, for the current proprietary API that we’re hoping to ditch in favour of the advanced listing endpoint for interop purposes, “changed since” has been taken to mean anything that would produce a change in the information that an Intermediate or Trust Anchor issues in its entity statements for its immediate subordinates.

  16. Lukasz Jaromin

    I can only echo what Ralph and Michael wrote. I’ll add that even if there’s an optional advanced endpoint that supports limits and filtering, publicly exposing a listAll endpoint with anonymous access that may return a couple of megabytes of data may be asking for trouble.

    I think we should consider seek pagination for this endpoint. Not only because of the performance advantage when it comes to large data sets, but also because it has no impact on the response format.

    It would require adding two request parameters:

    after_entity_id
      OPTIONAL. The value of this parameter is an Entity Identifier.
      If this parameter is present, the result list MUST be filtered
      to include only those Entity Identifiers that come after the
      given Entity Identifier in the result list. If no Entity
      Identifier equal to the value of this parameter exists, the
      response MUST use the HTTP status code 400 and the content type
      application/json, with the error code entity_id_not_found (TBD).
    limit
      OPTIONAL. Positive integer that specifies the maximum number of
      Entity Identifiers included in the result list contained in the
      response. If this parameter is not present, the result list will
      include at most 1000 (one thousand) Entity Identifiers. If the
      limit parameter is present, the result list MUST be filtered to
      include no more than the specified number of Entity Identifiers.

      The endpoint MUST support values less than 1000. It MAY support
      values higher than 1000. If it does not support the limit value
      provided in the parameter, it MUST use the HTTP status code 400
      and the content type application/json, with the error code
      unsupported_limit_value (TBD).
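
    The two parameters could be handled along these lines. This is a sketch, not a reference implementation: ListError and list_subordinates are hypothetical names, the listing is assumed to be held in a stable server-defined order, and this particular server is assumed not to support limits above the default.

```python
from typing import Optional

DEFAULT_LIMIT = 1000   # default from the parameter description above
MAX_LIMIT = 1000       # assumption: this server supports nothing above the default


class ListError(Exception):
    """Maps to an HTTP 400 response carrying the given error code."""

    def __init__(self, error: str) -> None:
        super().__init__(error)
        self.error = error


def list_subordinates(ordered_ids: list,
                      after_entity_id: Optional[str] = None,
                      limit: Optional[int] = None) -> list:
    """Return at most `limit` identifiers that follow `after_entity_id`."""
    if limit is None:
        limit = DEFAULT_LIMIT
    elif limit < 1 or limit > MAX_LIMIT:
        raise ListError("unsupported_limit_value")
    start = 0
    if after_entity_id is not None:
        try:
            start = ordered_ids.index(after_entity_id) + 1
        except ValueError:
            raise ListError("entity_id_not_found") from None
    return ordered_ids[start:start + limit]


ids = [f"https://{i}.openid.net/" for i in range(1021)]
first = list_subordinates(ids)                              # default limit applies
rest = list_subordinates(ids, after_entity_id=first[-1])    # the remaining entries
```

    The linear index lookup keeps the sketch short; a real deployment would more likely seek through an indexed column of its data store.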
    

    The original response would remain unchanged

    GET /list HTTP/1.1
    
    200 OK
    Content-Type: application/json
    
    [
       "https://0.openid.net/",
       "https://1.openid.net/",
       "https://2.openid.net/"
    ]
    

    If there are more than 1000 (the default limit) elements to be returned, the response contains only the first 1000.

    GET /list HTTP/1.1
    
    200 OK
    Content-Type: application/json
    
    [
       "https://0.openid.net/",
       "https://1.openid.net/",
       "https://2.openid.net/",
       ...
       "https://999.openid.net/"
    ]
    

    Since there are 1000 results, the client sends another request with an additional query parameter to check whether there are more results and to fetch them. This approach is a tradeoff and has its pros and cons.

    GET /list?after_entity_id=https://999.openid.net/
    
    200 OK
    Content-Type: application/json
    
    [
       "https://1000.openid.net/",
       "https://1001.openid.net/",
       ...
       "https://1020.openid.net/"
    ]
    

    A client may need smaller page sizes, which shall always be supported:

    GET /list?limit=20
    GET /list?after_entity_id=https://1020.openid.net/&limit=20
    

    Or it may need larger page sizes, which may be supported; if they are not, such a request may end in an error. It would be good for a client to be able to learn the maximum supported size.

    GET /list?limit=10000
    

    If the entity identifier provided in the after_entity_id parameter does not exist (e.g. because it was deleted in the meantime), an error is returned. The client needs to start over in this case.

    GET /list?after_entity_id=https://23xyz.openid.net/
    
    400 Bad Request
    Content-Type: application/json
    
    {
      "error": "entity_id_not_found",
      "error_description":
        "TBD"
    }
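
    On the client side, the flow described in this comment might look like the following sketch. The names fetch_all_seek, EntityIdNotFound and fake_fetch are hypothetical; fake_fetch stands in for the HTTP call to /list, and the restart cap is an assumption of this sketch.

```python
from typing import Callable, Optional


class EntityIdNotFound(Exception):
    """Stands in for the 400 entity_id_not_found response."""


Fetch = Callable[[Optional[str], int], list]


def fetch_all_seek(fetch: Fetch, limit: int = 1000, max_restarts: int = 3) -> list:
    """Walk the listing page by page, restarting if the cursor entity disappears."""
    for _ in range(max_restarts + 1):
        collected: list = []
        cursor: Optional[str] = None
        try:
            while True:
                page = fetch(cursor, limit)
                collected.extend(page)
                if len(page) < limit:   # a short page means nothing follows
                    return collected
                cursor = page[-1]       # the last identifier seen becomes the cursor
        except EntityIdNotFound:
            continue                    # start over, as described above
    raise RuntimeError("listing kept changing; giving up")


ids = [f"https://{i}.openid.net/" for i in range(45)]


def fake_fetch(after: Optional[str], limit: int) -> list:
    """In-memory replacement for GET /list?after_entity_id=...&limit=..."""
    if after is None:
        start = 0
    elif after in ids:
        start = ids.index(after) + 1
    else:
        raise EntityIdNotFound(after)
    return ids[start:start + limit]


everything = fetch_all_seek(fake_fetch, limit=20)
```

    When the total is an exact multiple of the limit, the client makes one extra request and receives an empty list, which is the "check whether there are more results" step mentioned above.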
    

  17. Giuseppe De Marco

    How do we resolve the issue caused by entities being added or removed between successive requests to the list endpoint?

    At time T, entities A, C, D, E and F are found in the listing endpoint response.
    At time T+1, verifier Z sends a list request with after_entity_id set to C and limit set to 2, obtaining D and E.
    At time T+2, entity B is registered as an immediate subordinate and therefore becomes available in the list response.
    At time T+3, verifier Z sends a list request with after_entity_id set to E and limit set to 2, obtaining F.

    the chunked responses missed the entity B.
    We therefore need a way to hint to the requester that the population of subordinates has changed. For this reason I have proposed the parameter dataset_uid here: https://bitbucket.org/openid/connect/pull-requests/732/diff#Lopenid-federation-1_0.xmlT4356
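
    The timeline above can be replayed in a few lines. The helper page_after is hypothetical and assumes the listing is ordered lexicographically by identifier, with single letters standing in for Entity Identifiers.

```python
def page_after(entities: list, after: str, limit: int) -> list:
    """Seek pagination over a lexicographically ordered listing."""
    return [e for e in sorted(entities) if e > after][:limit]


listing = ["A", "C", "D", "E", "F"]      # time T
seen = page_after(listing, "C", 2)       # time T+1: returns D and E
listing.append("B")                      # time T+2: B is registered
seen += page_after(listing, "E", 2)      # time T+3: returns F
```

    B sorts before the cursor by the time it is added, so none of the later pages can contain it.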

    The list endpoint result doesn’t provide this additional level of detail (unless we put it in the HTTP headers of the response …). For this reason I have proposed another endpoint to handle these detailed requests and responses, which also enables the provisioning of multiple subordinate statements within the objects pertaining to each subordinate entity.

  18. Lukasz Jaromin

    Hi Giuseppe. Thank you for looking at this issue.

    the chunked responses missed the entity B.

    In the example referred to, the result will be identical to the result we would obtain at time T using the endpoint without pagination. There should be no significant impact on a client, as it gets the same list as it would get with the original endpoint.

    Case 1
    T:        List:A,C,D,E,F    /list?limit=3 result:A,C,D     /list result: A,C,D,E,F
    T+1 (+B): List:A,B,C,D,E,F  /list?limit=3&after_entity_id=D result:E,F 
    

    However, there’s also a variation of this case where the modification happens to pages that haven’t been requested yet, such as an add operation (+E1):

    Case 2
    T:         List:A,C,D,E,F     /list?limit=3 result:A,C,D     /list result: A,C,D,E,F
    T+1 (+E1): List:A,C,D,E,E1,F  /list?limit=3&after_entity_id=D result:E,E1,F 
    

    or a delete operation (-E):

    Case 3
    T:        List:A,C,D,E,F    /list?limit=3 result:A,C,D     /list result: A,C,D,E,F
    T+1 (-E): List:A,C,D,F      /list?limit=3&after_entity_id=D result:F 
    

    In cases 2 and 3 above, the client gets up-to-date information for the pages it has not yet requested.

    Case 4
    T:        List:A,C,D,E,F    /list?limit=3 result:A,C,D     /list result: A,C,D,E,F
    T+1 (-D): List:A,C,E,F      /list?limit=3&after_entity_id=D result:E,F OR ERROR
    

    Case 4 above is a corner case where D gets removed. In this case, either the following page of results is returned as usual, or an error is returned to signal that D doesn’t exist.
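
    The "returned as usual" option for case 4 could be sketched as below, assuming the listing is kept sorted lexicographically by Entity Identifier (an assumption; the proposal does not fix an ordering). The function name page_after_tolerant is hypothetical. The cursor is treated as a position in the sort order rather than as an entry that must still exist.

```python
from bisect import bisect_right


def page_after_tolerant(sorted_ids: list, after: str, limit: int) -> list:
    """Return up to `limit` identifiers that sort strictly after the cursor.

    The cursor itself need not still be present in the listing.
    """
    start = bisect_right(sorted_ids, after)
    return sorted_ids[start:start + limit]


after_removal = ["A", "C", "E", "F"]     # case 4: D was removed after the first page
tail = page_after_tolerant(after_removal, "D", 3)
```

    The trade-off is that a client can no longer tell that its cursor entity was removed, which is exactly the information the error variant would convey.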

    I see where you are going with the dataset_uid, however do we really need to aim for such strong consistency?

    I imagine that data returned by this endpoint even now without pagination is eventually consistent, so issues that you are describing can occur even without pagination especially with use of CQRS or a distributed data store.
