HTML report generation fails on too long path

Issue #580 closed
István Bede
created an issue

Hi!

I'm trying to generate HTML report on my coverage. Unfortunately I have very deep folder structure with some long directory and file names. The HTML generation tries to create an .html filename containing all this path, but on Windows 7 it fails if the file name is longer than 260 (NTFS). It reports an error like:

---------- coverage: platform win32, python 2.7.13-final-0 -----------
Internal Error: runTests aborted: [Errno 2] No such file or directory: u'logs/coverage/_jenkins_work_XXXX_nightly_main_build_workspace_YYYYYYYYYYYYYYYYY-master_AAAAAAAAAAAAAAAAAAAAAAA_src_BBBBBBBBBBBBBBBBBBBB_CCCCCCCCCCCCCCCCCCCCCCC_py.html'

I made a small workaround/patch to make it work: Added directory_depth config value in coverage.cfg:

[html]
directory = logs/coverage/
directory_depth = 4

and patched html generation the following way in html.py:

    def html_file(self, fr, analysis):
        """Generate an HTML file for one source file."""
        relname = fr.relative_filename()
        if self.config.dir_depth > 0: # initialized with -1
            import re
            relname = relname[[m.start() for m in re.finditer('\\\\', relname)][relname.count('\\') - self.config.dir_depth - 1] + 1:]
        rootname = flat_rootname(relname)
        html_filename = rootname + ".html"
        html_path = os.path.join(self.directory, html_filename)

and of course I added dir_depth to CoverageConfig.

As a result, only the needed directory depth is included in the filename. I don't know if other OS or file system is problematic or not.

Note: maybe the annotate module is also affected as it uses the same mechanism for file name generation.

Thanks for the good work!

Comments (3)

  1. Ned Batchelder repo owner

    I wonder if we can't do this in a more automatic way: if the filename is longer than some limit (250), then automatically truncate it, with a hash of the full name as part of the short name, to prevent collisions.

  2. István Bede reporter

    If the only reason to include the full path in the generated file names is to distinguish between input files with same name, then maybe some hash of the path + the file name would be sufficient. Something like this:

    The original generated filename was

    _jenkins_work_XXXX_nightly_main_build_workspace_YYYYYYYYYYYYYYYYY-master_AAAAAAAAAAAAAAAAAAAAAAA_src_BBBBBBBBBBBBBBBBBBBB_CCCCCCCCCCCCCCCCCCCCCCC_py.html
    

    instead of this, you may use something like:

    c980842b72071eb170d4601471450dad_CCCCCCCCCCCCCCCCCCCCCCC_py.html
    

    where the hash in the beginning is the md5 of the path-string or so.

    With this, the files residing in the same directory would have the same hash prefix, and the original input file name is also visible. I think that wouldn't bother the annotate module's mechanisms either. Myself, I'd prefer to use this independent from the length. :)

  3. Log in to comment