How to avoid stitching of enhancer/super enhancer over genes?

Issue #7 resolved
Roy Blum created an issue

We recently ran ROSE with a stitching distance of 15000 bp and obtained a list of enhancers and super-enhancers (the run appeared to complete without any errors).

When we cross-checked the resulting enhancers in IGV (and with an intersectBed command against the track of RefSeq gene coordinates), we found that some of the enhancers and super-enhancers created by stitching were stitched over a gene, and in some cases over two genes. Is this behavior expected, or is it some sort of glitch? Could you suggest how to run ROSE to avoid it, so that enhancers are stitched only when no gene lies between them? We assumed that defining a distance of 3000 bp from the TSS (-t 3000) and providing HG19 as the genome would rule out this scenario. Please advise,

Thanks in advance! Roy

Comments (2)

  1. Brian Abraham

    Roy,

    As explained in the manual, the exclusion zone only excludes enhancers that are completely contained within it. Enhancers will still be stitched across promoters if, for instance, an intronic enhancer downstream of the promoter is within the stitching distance of an enhancer upstream of the transcription start site.

    TSS_EXCLUSION_ZONE_SIZE: exclude regions contained within +/- this distance from TSS in order to account for promoter biases (Default: 0; recommended if used: 2500). If this value is 0, will not look for a gene file.

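The behavior described in the reply can be illustrated with a small toy model. This is only a sketch under stated assumptions, not ROSE's actual code: the function names `drop_promoter_regions` and `stitch`, the coordinates, and the choice to apply the exclusion filter before stitching are all hypothetical and chosen for illustration. It shows that a region fully contained in the TSS exclusion zone is dropped, while the regions on either side of the promoter are still merged because the gap between them is within the stitching distance.

```python
from typing import List, Tuple

Region = Tuple[int, int]  # (start, end) on a single chromosome


def drop_promoter_regions(regions: List[Region], tss_positions: List[int],
                          exclusion: int) -> List[Region]:
    """Drop only regions lying entirely within +/- `exclusion` bp of a TSS."""
    kept = []
    for start, end in regions:
        contained = any(
            tss - exclusion <= start and end <= tss + exclusion
            for tss in tss_positions
        )
        if not contained:
            kept.append((start, end))
    return kept


def stitch(regions: List[Region], distance: int) -> List[Region]:
    """Merge regions whose gap to the previous region is at most `distance` bp."""
    stitched: List[Region] = []
    for start, end in sorted(regions):
        if stitched and start - stitched[-1][1] <= distance:
            prev_start, prev_end = stitched[-1]
            stitched[-1] = (prev_start, max(prev_end, end))
        else:
            stitched.append((start, end))
    return stitched


# Hypothetical example: one TSS at 100,000 with three nearby peaks.
tss = [100_000]
peaks = [(92_000, 94_000), (99_500, 100_500), (104_000, 106_000)]

# The middle peak sits entirely inside TSS +/- 3000 bp and is removed.
kept = drop_promoter_regions(peaks, tss, exclusion=3_000)

# The two remaining peaks are 10,000 bp apart, so a 15,000 bp
# stitching distance merges them into one region spanning the TSS.
result = stitch(kept, distance=15_000)

print(kept)
print(result)
```

In this toy case the stitched region runs from 92,000 to 106,000 and therefore covers the promoter at 100,000, even though the promoter-contained peak itself was excluded. The exclusion zone removes regions; it does not act as a barrier to stitching.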