Issue #22 resolved

Backup completed but no actual files downloaded - any help appreciated

Rich Reeves
created an issue

Hi Christian,

I found your utility and think it is perfect for some extra backups. I ran a backup and it seemed to run ok and created folders for each of our repos... however, this ran suspiciously quickly!

Checking the folders, they contain all of the head, refs etc., but nothing more. In fact, the whole backup folder is only a few KB. What am I missing? Is it the way we've set up our branches or repos? I'm assuming your app brings down the entire repo and all current remote branches.

Thanks for your help with this.

Rich

Comments (17)

  1. Christian Specht repo owner

    Hi Rich,

    "head" and "refs" sounds like you are talking about Git (and not Mercurial) repositories, right?

    I don't use Git at all (read this to understand where I'm coming from), so I can't tell you if your branches and repositories are set up correctly.

    Under the covers, Bitbucket Backup gets a list of your repositories from Bitbucket, and just uses the Git installation on your machine to do the following for each repository on the list:

    Step 1: Create a bare repository in the local backup folder

    git init --bare
    

    Step 2: Fetch everything from the repository on Bitbucket into the local bare repository

    git fetch url-to-your-repo refs/heads/*:refs/heads/* refs/tags/*:refs/tags/*
    

    From what you're describing, I suppose that only Step 1 works on your machine, but Step 2 does not.

    What happens when you run the command in step 2 directly in the command line?
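One way to check the two steps without touching Bitbucket is to run them against a throwaway local repository. The following is a minimal sketch, not Bitbucket Backup's own code: the directory names, the `main` branch name and the `hello.txt` file are assumptions made for the demo, and it needs a Git version that supports `git -C`.

```shell
#!/bin/sh
set -eu

# Throwaway directories; all names here are hypothetical.
work=$(mktemp -d)
src="$work/source"
backup="$work/backup"

# Stand-in for the repository on Bitbucket: one commit on a branch called "main".
git init -q "$src"
git -C "$src" symbolic-ref HEAD refs/heads/main
echo "hello" > "$src/hello.txt"
git -C "$src" add hello.txt
git -C "$src" -c user.name=demo -c user.email=demo@example.com commit -q -m "first commit"

# Step 1: create a bare repository in the backup folder.
git init -q --bare "$backup"

# Step 2: fetch all branches and tags into it.
git -C "$backup" fetch -q "$src" 'refs/heads/*:refs/heads/*' 'refs/tags/*:refs/tags/*'

# Read the file content straight out of the bare repository's object store.
fetched=$(git -C "$backup" show main:hello.txt)
echo "$fetched"   # prints "hello"
```

If the last command prints the file content, the data was fetched, even though the backup folder contains no `hello.txt` on disk.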

  2. Andy Brudtkuhl

    I'm having the same issue running this on a Windows Server 2012 machine. Running the command in step 2 does nothing.

    All that's pulled down is what is in the .git/* folder - not the actual files in source control.

  3. Andy Brudtkuhl

    Note on Git after looking at source. A fetch does not actually pull down files. It only pulls down the latest refs from Git. So it appears this does not actually ever download (pull) the actual files in the repository - it just fetches the git references.

  4. Rich Reeves reporter

    Andy,

    I could never get it to work with Git. In the end I settled for just writing a git bash script which I called 'clonerepos.sh' and I just call from within the git bash command tool. Not the most elegant solution but it works and I only use it once a week for DR backups, which I zip and dump elsewhere off our network.

    I used a credentials helper so I don't have to type my password for each repo line and then for each repo just add another line to the script.

    git config --global credential.helper wincred
    git clone https://yourusername@bitbucket.org/pathtorepo/reponame.git --template "C:\Program Files (x86)\Git\share\git-core\templates"
    

    The path to the templates had me stumped for a bit but it works just fine. I also have another DOS batch file that clears down each folder so I can do the whole thing again next time. Something like:

    cd /whereveryourbackupsare
    rd /s /q foldername
    

    One rd for each repo folder. Jobs a good un.

    Good luck and if you get the above thing working, give me a shout! Would much prefer an automated backup.

    Rich
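The weekly routine described above (clear the old folder, then clone again) could also be folded into one small shell function. This is only a sketch of that idea, not Rich's actual `clonerepos.sh`: the function name `backup_repos` and its arguments are hypothetical, it leaves out the `--template` option, and it assumes credentials are already handled, for example by the credential helper mentioned above.

```shell
#!/bin/sh
set -eu

# backup_repos DEST URL...
# Re-clones every given repository into DEST, replacing any previous copy.
backup_repos() {
    dest=$1
    shift
    mkdir -p "$dest"
    for url in "$@"; do
        # Derive the folder name from the URL, e.g. ".../reponame.git" -> "reponame".
        name=$(basename "$url" .git)
        # Remove last week's copy so the clone starts from an empty folder.
        rm -rf "${dest:?}/$name"
        git clone -q "$url" "$dest/$name"
    done
}
```

It would be called with the backup folder followed by one URL per repository, in the same `https://yourusername@bitbucket.org/pathtorepo/reponame.git` form used in the script above.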

  5. Christian Specht repo owner

    Guys,

    as I said before: I don't really know Git, because I don't use it.

    What Bitbucket Backup does with Git repositories is what I understand to be the Git equivalent of how I would do it in Mercurial, which I'm way more familiar with:

    // create new empty repo:
    md foo
    cd foo
    hg init
    
    // pull from Bitbucket:
    hg pull https://bitbucket.org/yourname/foo
    

    Now the foo folder contains a repository without working copy, but the complete history including the data is inside the .hg subfolder.

    To get a working copy with the files from the newest revision, I just have to execute this:

    hg update tip
    

    Now concerning Git - as far as I understand Git, this:

    git init --bare
    git fetch url-to-your-repo refs/heads/*:refs/heads/* refs/tags/*:refs/tags/*
    

    ...is the Git equivalent to this:

    hg init
    hg pull url-to-your-repo
    

    As I already said in the comment that I linked at the top of this answer, I didn't come up with the refs/heads/*: syntax myself...no, I got it from Stack Overflow, from a guy who has nearly 600 reputation in the git tag, so I suppose he knows what he's doing.

    When I create a bare Git repository and execute this on the command line:

    git fetch url-to-your-repo refs/heads/*:refs/heads/* refs/tags/*:refs/tags/*
    

    ...Git outputs something like the following, so I guess that it really pulls down the actual files in source control:

    remote: Reusing existing pack: 850, done.
    remote: Total 850 (delta 0), reused 0 (delta 0)
    Receiving objects:  82% (697/850), 1.17 MiB | 469.00 KiB/s
    Receiving objects: 100% (850/850), 1.36 MiB | 469.00 KiB/s, done.
    Resolving deltas: 100% (625/625), done.
    

    After that, I can view the history and the files and folders with TortoiseGit, although I admit that I'm not able to actually check out the working directory, because after 20 minutes of Googling, I still have no clue if there's a Git equivalent to hg update tip.

    (I don't want to start a religious war here, but stuff like that is the exact reason why I'm using Mercurial, and not Git)

    However, I can tell you that this guy is using Bitbucket Backup successfully to back up 160+ Git repositories from Bitbucket, so apparently it's working for him.

    By the way, what version of Git are you using? I believe that you need at least 1.7.x for the refs/heads/*: stuff to work.

  6. Andy Brudtkuhl

    A git fetch only fetches the remote info (ie branches, refs, etc)

    A git pull runs a git fetch and then actually downloads and merges the files in the repo. I think one line of code will do it.

    "Incorporates changes from a remote repository into the current branch. In its default mode, git pull is shorthand for git fetch followed by git merge FETCH_HEAD." - docs

    git init --bare
    git pull url-to-your-repo refs/heads/*:refs/heads/* refs/tags/*:refs/tags/*
    

    I tried to fork it to add a line, but alas I don't use Mercurial, so I can't really do that.

    https://bitbucket.org/christianspecht/bitbucket-backup/src/0dcc201e48ceab2e6dc6902884787f53e8dda07a/src/BitbucketBackup/GitRepository.cs?at=default#cl-30

    public override void Pull()
    {
        // Proposed change: run "git pull" instead of "git fetch", with the same refspecs.
        this.git.Execute(String.Format("pull {0} refs/heads/*:refs/heads/* refs/tags/*:refs/tags/*", this.remoteuri));
    }
    

    I think a git pull is equivalent to a hg update tip, in that it pulls down changes from the specified remote host and merges them with your local copy.

    I am running the latest version of Git. It backs up the git files from Bitbucket but not the actual code files.

  7. Drew Peterson

    Hi everyone!

    Christian reached out to me to ask me about this so I thought I'd jump in.

    So when I added this feature I decided to clone the repositories bare so they were fit for backup purposes. Since git is a distributed system and keeps a local copy of the repository in your .git folder, I chose to go with the bare repositories to avoid having the second copy of all the source code in the working directory. This also means that if you were to back up your git repositories to a file server, you could easily clone from/push/pull to those repositories in the case of a Bitbucket outage or loss of data.

    It turns out I made one mistake though: it seems the git tools do not like cloning from a bare repository whose folder name does not end in the .git extension. So assuming Bitbucket Backup backed up MyGitRepository to the MyGitRepository folder, if I renamed the folder to MyGitRepository.git I could then issue a git clone ssh://server/MyGitRepository.git LocalRepoFolder and have a local copy of the repository with a working directory.

    So we do need to fix the folder name that the bare repository is stored in to end with the .git extension, but otherwise I feel this feature works as it should. Does that help clear things up?

  8. Andy Brudtkuhl

    Okay - I was under the impression that it actually backed up the source code on the machine that runs it. Backing up the git repo information is only about 5% of what we need. I really wanted this to be something I can run as a scheduled task daily so I know our code is backed up on our server and can act as a backup remote if something ever happened to our Bitbucket repos.

  9. Drew Peterson

    I think you're misunderstanding - the source code is backed up. This isn't backing up metadata about a repository, it's the entire repository itself. With that repository you can clone/push/pull, whatever you need. There is no difference between the repository that is cloned by bitbucket backup and the repository sitting out on github/bitbucket/etc.

    As Christian mentioned, I run this as a scheduled task every day and back up what is approaching 200 repositories from Bitbucket. Each is a full and complete clone of the repository on Bitbucket.

    I think your confusion is in the difference between a git repository, and your working directory. You check out and edit code in your working directory and commit to your local repository. When you do a git push or pull, you're not pushing or pulling directly to/from your working directory, you're pushing/pulling between the remote repository and your local repository. A bare repository is a repository with no working directory.
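The difference between the two kinds of repository can be seen on disk. A small sketch, using throwaway directories whose names are made up for the demo:

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# A normal repository: a working directory with the repository in its .git subfolder.
git init -q "$work/normal"

# A bare repository: the repository data sits directly in the folder, with no working directory.
git init -q --bare "$work/bare"

normal_is_bare=$(git -C "$work/normal" rev-parse --is-bare-repository)
bare_is_bare=$(git -C "$work/bare" rev-parse --is-bare-repository)
echo "normal: $normal_is_bare, bare: $bare_is_bare"   # prints "normal: false, bare: true"
```

In the bare repository, files such as `HEAD` and folders such as `refs` and `objects` appear at the top level, which matches what the original report describes seeing in the backup folders.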

  10. Christian Specht repo owner

    Drew:
    Thank you very much for the clarification!

    Rich and Andy:
    In addition to what Drew said - you don't even need a different machine (ssh://server/...), you can just do this on the machine where you're running Bitbucket Backup:

    • start cmd or Powershell
    • go to the directory where Bitbucket Backup saved the bare repositories
    • rename one of them like Drew said, e.g. from foo to foo.git
    • execute git clone foo.git foo2

    Now Git will clone the repo to foo2 (in the same directory), and this repo will have a working folder.

    Can you confirm that this works for you as well?
    If yes, I think I'll add two things to Bitbucket Backup:

    1. Append ".git" to the folder names of the local bare repositories
    2. A section on the website that explains what a bare repository is and how to get the actual files out of it.
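The rename-and-clone steps above can be tried end to end with a throwaway repository. This is a sketch under assumptions, not part of Bitbucket Backup: the `source` repository stands in for Bitbucket, and the `main` branch name and `hello.txt` file are invented for the demo. The bare repository's `HEAD` is pointed at `main` explicitly so that the clone has a branch to check out.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)
cd "$work"

# Stand-in for the repository on Bitbucket, with one commit on "main".
git init -q source
git -C source symbolic-ref HEAD refs/heads/main
echo "hello" > source/hello.txt
git -C source add hello.txt
git -C source -c user.name=demo -c user.email=demo@example.com commit -q -m "first commit"

# Stand-in for the bare backup folder "foo".
git init -q --bare foo
git -C foo symbolic-ref HEAD refs/heads/main
git -C foo fetch -q "$work/source" 'refs/heads/*:refs/heads/*' 'refs/tags/*:refs/tags/*'

# Rename the backup to foo.git, then clone it to get a working folder.
mv foo foo.git
git clone -q foo.git foo2

restored=$(cat foo2/hello.txt)
echo "$restored"   # prints "hello"
```

After the clone, `foo2` contains the checked-out files plus its own `.git` folder, while `foo.git` stays untouched as the backup.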
  11. Drew Peterson

    FYI I've submitted a pull request so that the bare repositories fetched with bitbucket backup can be cloned from using the git command line (by default without the .git extension the tool does not recognize it as a repository).

  12. Andy Brudtkuhl

    Christian – yea I think if you updated the website to clarify what is actually being backed up as well as some instructions for restoring from that backup – that would be great.
