Force generation of subtitles

Issue #69 closed
Former user created an issue

Hi, Love webdl! Thanks so much for all your hard work. If I wanted subtitles available for all downloads, how would I do that? I've looked at common.py and the man page for ffmpeg and I'm wondering if all I need to do is add "-scodec", "copy" to the ffmpeg command, but I'm guessing there's more to it than that. Thanks again, Ned

Comments (16)

  1. delx repo owner

    That will only work if the original streaming videos contain a subtitle track.

    Try it and see :)
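
To make the suggestion above concrete, here is a minimal sketch of an ffmpeg argument list with the subtitle option added. The function name `build_remux_cmd` and the file names are hypothetical, and this is not the command that webdl's common.py actually builds. `-scodec copy` is ffmpeg's alias for `-c:s copy`; as noted above, it only helps when the input already carries a subtitle stream.

```python
def build_remux_cmd(src, dst, copy_subtitles=True):
    """Return an ffmpeg argument list that remuxes src into dst without re-encoding.

    Hypothetical helper for illustration; not taken from webdl.
    """
    cmd = ["ffmpeg", "-i", src, "-vcodec", "copy", "-acodec", "copy"]
    if copy_subtitles:
        # Alias for "-c:s copy"; has no effect if src has no subtitle stream.
        cmd += ["-scodec", "copy"]
    cmd.append(dst)
    return cmd


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(" ".join(build_remux_cmd("input.ts", "output.mkv")))
```

Whether the chosen output container accepts the copied subtitle format is a separate question that this sketch does not check.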

  2. Former user Account Deleted

    Hi delx, It's been ages but I now have a better understanding of subtitles and I'm totally frustrated. Before I discovered you I was using SBS and iView Rippers, but I needed to fire up VirtualBox to use them on a Mac and they were a bit flakey. However, each of those gave me the option to download .srt files with the mp4s and those worked so well I thought it'd be a simple matter to download .srt files for all the downloads I've done with webDL. I can't get any to sync with the mp4s so they're currently pretty useless. Lots of sync solutions for Windows but only a few for Mac and those I've tried are very poor cousins to the Windows versions. So I'm wondering if you could explain the process the rippers would use and if it'd be possible for me to download those versions, or even better, for webDL to download them? It's probably not even on your horizon but as the baby boomers are turning grey they're finding more and more need for captions - I know I am. Cheers, Ned

  3. pawD

    @nedena Here is an idea, if you're interested. It assumes that the video offers captions and that the provider keeps the same structure.

    - In iview.py, the playlist has many items/entries. For each item/entry, look up "program" and then "caption"; "src-vtt" is the URL for the captions, which is a text file rather than a video stream.
    - In sbs.py, use the xpath "//smil:textstream" instead of "//smil:video"; the "src" attribute is the URL.
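
The two lookups described above might be sketched as follows. This uses only the standard library and inline sample data, not webdl's own `grab_xml` helpers. The SMIL namespace URI, the sample documents, and the iview key names `"type"` and `"captions"` are assumptions based on one reading of the comment; check them against what the servers really return.

```python
import xml.etree.ElementTree as ET

# Assumed SMIL namespace; verify it against the real release document.
NS = {"smil": "http://www.w3.org/2005/SMIL21/Language"}

# Hypothetical sample data, shaped the way the comment describes.
SAMPLE_SMIL = """
<smil xmlns="http://www.w3.org/2005/SMIL21/Language"><body><seq><par>
  <video src="https://example.com/v.m3u8"/>
  <textstream src="https://example.com/c.dfxp"/>
</par></seq></body></smil>
"""
SAMPLE_PLAYLIST = [
    {"type": "preroll"},
    {"type": "program", "captions": {"src-vtt": "https://example.com/c.vtt"}},
]


def sbs_captions_urls(smil_text):
    """Return the src of every <textstream> element in a SMIL document."""
    root = ET.fromstring(smil_text.strip())
    return [
        node.attrib["src"]
        for node in root.iterfind(".//smil:textstream", NS)
        if "src" in node.attrib
    ]


def iview_captions_url(playlist):
    """Return the first "src-vtt" URL found on a program entry, or None."""
    for entry in playlist:
        if entry.get("type") != "program":
            continue
        url = (entry.get("captions") or {}).get("src-vtt")
        if url:
            return url
    return None


if __name__ == "__main__":
    print(sbs_captions_urls(SAMPLE_SMIL))
    print(iview_captions_url(SAMPLE_PLAYLIST))
```

Both functions return nothing useful when the video has no captions, so callers would need to handle an empty list or `None`.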

  4. Former user Account Deleted

    Thanks, great idea! I don't know Python (or OOP for that matter) and sbs.py looked a little more straightforward so I started on that. To keep changes separate I used a lot of cut'n'paste but I'm a bit confused about the "src" and the urls. How much of the video_url string do I need? I'm obviously not generating the correct captions url (error "Exception: Unsupported captions 1119327299531: Monster S1 Ep3"). I added a call to node.download_captions() at line 47 of autograbber.py and I've currently got the following routines in sbs.py class SbsVideoNode. Can you tell me where I've gone wrong? Thanks again, Ned.

    def download_captions(self):
        with requests_cache.disabled():
            doc = grab_html(VIDEO_URL % self.video_id)
        player_params = self.get_player_params(doc)
        release_url = player_params["releaseUrls"]["html"]
    
        captions_url = self.get_captions_url(release_url)
        if not captions_url:
            raise Exception("Unsupported captions %s: %s" % (self.video_id, self.title))
        filename_srt = self.title + ".srt"
        return download_srt(filename_srt, captions_url)
    

    def get_captions_url(self, release_url):
        with requests_cache.disabled():
            doc = grab_xml("http:" + release_url.replace("http:", "").replace("https:", ""))
        captions = doc.xpath("//smil:textstream", namespaces=NS)
        if not captions:
            return
        captions_url = captions[0].attrib["src"]
        return captions_url

    def download_srt(filename_srt, captions_url):
        filename = sanify_filename(filename_srt)
        video_url = "hlsvariant://" + captions_url
        logging.info("\nDownloading: %s", filename)
    
        cmd = [
            "livestreamer",
            "-f",
            "-o", filename,
            video_url,
            "best",
        ]
        return exec_subprocess(cmd)
    
  5. pawD

    Sorry, I'm not good at either Python or debugging other people's code. You may want to print out your captions_url and confirm that it is related to the captions (it should have an srt file extension). Besides, it is just a text file and only needs an HTTP GET; I'm not sure whether livestreamer can handle it.
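
A plain HTTP GET of the kind suggested above could look like this minimal sketch. It uses the standard library's `urllib.request` rather than whatever HTTP helper webdl itself uses, and the function name `download_text` and the example URL are hypothetical.

```python
import urllib.request


def download_text(url, filename, timeout=30):
    """Fetch url with a single GET and write the response body to filename.

    Returns the number of bytes written. No streaming tool is involved,
    because a captions file is just a small text document.
    """
    with urllib.request.urlopen(url, timeout=timeout) as response:
        data = response.read()
    with open(filename, "wb") as f:
        f.write(data)
    return len(data)


if __name__ == "__main__":
    # Hypothetical URL; replace it with the printed captions_url.
    download_text("https://example.com/captions.srt", "captions.srt")
```

Printing the URL first, as suggested, makes it easy to confirm that it points at a captions file before trying to download it.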

  6. Former user Account Deleted

    Thanks for your help pawD. I scrapped most of it and followed your lead re. the text file. The sbs captions are actually in .dfxp format and I've been able to download them. Learning some Python along the way too so it's all good. Now for iview.py. Cheers, Ned

  7. Former user Account Deleted

    Delx, thank you for writing such great code. You've made learning Python really enjoyable and my Rewind buttons are breathing a sigh of relief. I now have captions for SBS (srt, dfxp, xml) and ABC (vtt), recorded in a seen list and converted to srt. Job done, even though there are bound to be better ways of doing it. I'll keep tweaking.
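
The thread does not show how the vtt-to-srt conversion was done, so the following is only a sketch of one possible approach, not the code described above. It numbers the cues, pads `mm:ss` timestamps to `hh:mm:ss`, and swaps the millisecond separator from `.` to `,`. The names `vtt_to_srt` and `_srt_time` are hypothetical; inline markup such as `<c>` tags is left untouched.

```python
import re

# Matches "mm:ss.ttt --> mm:ss.ttt", with an optional leading "hh:" on each side.
TIMING = re.compile(
    r"^((?:\d+:)?\d{2}:\d{2})\.(\d{3})\s+-->\s+((?:\d+:)?\d{2}:\d{2})\.(\d{3})"
)


def _srt_time(clock, millis):
    """Pad a WebVTT clock value to hh:mm:ss and use a comma before the millis."""
    if clock.count(":") == 1:
        clock = "00:" + clock
    return clock + "," + millis


def vtt_to_srt(vtt_text):
    """Convert WebVTT text to SRT text.

    Blocks without a timing line (the WEBVTT header, NOTE and STYLE blocks)
    are skipped; cue identifiers and cue settings are dropped.
    """
    cues = []
    blocks = re.split(r"\n\s*\n", vtt_text.replace("\r\n", "\n").strip())
    for block in blocks:
        lines = block.split("\n")
        for i, line in enumerate(lines):
            m = TIMING.match(line)
            if m:
                start = _srt_time(m.group(1), m.group(2))
                end = _srt_time(m.group(3), m.group(4))
                cues.append((start, end, "\n".join(lines[i + 1:])))
                break
    return "\n".join(
        "%d\n%s --> %s\n%s\n" % (n, start, end, text)
        for n, (start, end, text) in enumerate(cues, 1)
    )


if __name__ == "__main__":
    print(vtt_to_srt("WEBVTT\n\n00:01.000 --> 00:02.500\nHello\n"))
```

The dfxp and xml formats mentioned for SBS are XML-based and would need a different parser.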

  8. delx repo owner

    I'm glad you got it to work.

    If you have any code that you want to push back into webdl please raise a pull request :)

  9. Former user Account Deleted

    Absolutely, if you think it'd be useful. I've never done a pull request but I'll give it a go. Need to start using git again and read the docs so it won't be quick!

  10. FordUte

    I have made changes to grabber.py, autograbber.py, iview.py, and sbs.py to grab subtitles if available. I attempted to create a pull request but do not have access.

  11. Former user Account Deleted

    I had the same problem with the pull request but I'm not familiar with the process and I thought I'd probably missed a step or two. Maybe we should combine forces? I'd attach a zip file of the files I modified but I'm not sure how to do that either!

  12. delx repo owner

    You need to fork this repository into your own account. Then you can create a pull request from your fork.

  13. FordUte

    Great, thanks. If you had not said to fork to my account I would have been lost. I have uploaded the files and created a pull request.
