The symlink change detection solution

#5 Declined
Repository
chg-symlinkhash
Branch
preview1
Repository
chg
Branch
stable
Author
  1. quark-zju
Reviewers
Description

This is a version that from our discussion at #4. It basically works.

It's even simpler than two layer symlink abstraction. Only one symlink (server) is used. The second layer is not necessary. In case of mtime hash change, the server just unlink its socket file and send its hash (not changed) back to client as a "redirect" instruction.

There are some ugly code such as creating a useless file just for --daemon-pipe-fd, I will change core code to make these simpler.

Tests run much faster under chg:

% chgt test-update-*.t
real    0m2.954s
% hgt test-update-*.t
real    0m15.222s

I'm still testing hg testcases one by one (33 of 404 non serve/web tests are failing, some minor fixes may be outside this PR). Put it here for preview and discussion purpose. Will send to mail list when chg client is in core.

Currently there are no options to disable config/mtime check. I will add option to restore old behavior later.

Comments (7)

  1. Yuya Nishihara repo owner

    I've read these changes briefly, and they look good to me. Thanks.

    A couple of random comments follow.

    AutoExitMixIn:

    • SocketServer does not inherit object, which means super() cannot be used. Perhaps process_request() is never called.
    • We could use handle_timeout() and socket timeout instead of threading.
    • Is it reliable to compare st_ino by issocketowner()?
      I'm not sure when unlinked inode is reused.

    checkconfig:

    • Can we move unlink() to client?
      I'm not a fan of managing parent's resource by children.

    --daemon-pipefds:

    • I think it's still usable to notice that the daemon starts listening, though it is less important since we have wider lock.

    pidfile:

    • Can we have it?
    • I'm working on the test runner, and I want to kill all daemons spawned in tests cleanly, without relying on GC.

    assume-tty:

    • Yep, it would be useless now since the progress extension is in core.
  2. quark-zju author

    Thanks for your quick review!

    I noticed some debugging code and low quality commit messages. Sorry for those. I will make them look right before sending to maillist.

    AutoExitMixIn

    • About object, I added object to the place the mixin is used. Maybe I should just use old style.
    • handle_timeout seems to be used for single request timeout, which is not called periodically.
    • About st_ino: Nice catch! Reusing inode is possible. I will add st_mtime check.
    • Ideally I'd like to put these logic inside the serve_forever loop instead of a thread. However it seems I serve_forever looks hard to hook without copy paste a lot of code. Besides, the comment there suggests use a thread for periodic tasks (although does not make much sense to me).

    checkconfig

    If you consider the server and its forked request handler as a whole, it is managing its own resource and does not look that strange. If we do unlink in client, we need to pass additional information to make the client distinguish "redirect, use whatever exists" and "redirect, but kill what exists". I am neutral about where to do it. But I prefer shorter and simpler code so I'd like to remain current way.

    --daemon-pipefds

    I'd like to make unlink server side non fatal and then we can get rid of creating a dummy file in client.

    pidfile

    pidfile requires extra work to maintain correctly (especially if we put multiple pid into one file), and is fragile (the user can edit it). I'd like to make the server have strong consistency with its socket file, which can even handle rare cases simply correctly, for example:

    1. server-hash1 accepts a request.
    2. It gets hash2 for confighash and mtimehash remains same. So it tells the client to redirect to hash2 but do not exit.
    3. Client does not find server-hash2 alive and starts a new server.
    4. The new server calculates its hash to be hash1 and overrides the server-hash1.
    5. The old server exits automatically.

    Currently rm -f server* will kill all servers.

    I think pidfile makes it easy to send signals to the process if the process does not expose other interfaces, like nginx etc. Since we have socket and are getting rid of restart, pidfile does not make much sense.

  3. Yuya Nishihara repo owner

    About object, I added object to the place the mixin is used. Maybe I should just use old style.

    Perhaps. SocketServer follows OOP design of 90's. It's subtle which class will win the race of multiple inheritance. I don't understand all cases.

    https://docs.python.org/2/tutorial/classes.html#multiple-inheritance

    handle_timeout seems to be used for single request timeout, which is not called periodically.

    Right. They do select() with 0.5sec timeout, but there seems no sane way to detect a timeout. I wanna get rid of SocketServer at some point. ;)

    If we do unlink in client, we need to pass additional information to make the client distinguish ...

    Yes, but it would be just a few line of codes?

    I won't insist this, but please rename the command if it can kill the server. "check" shouldn't do more than checking.

    I'd like to make unlink server side non fatal and then we can get rid of creating a dummy file in client.

    No, we shouldn't. --daemon-pipefds is the internal option paired with --daemon. I abuse it for chg's own locking mechanism.

    pidfile requires extra work to maintain correctly (especially if we put multiple pid into one file), and is fragile (the user can edit it).

    I agree pidfile isn't reliable, but it's a plan old way to manage daemon. It's also simple if we relax the requirement for consistency, for example, pid file per hash, and it may be stale.

    Since we have socket and are getting rid of restart, pidfile does not make much sense.

    Don't we have an option to disable automatic restart and hash comparison?

  4. quark-zju author

    Perhaps. SocketServer follows OOP design of 90's. It's subtle which class will win the race of multiple inheritance. I don't understand all cases.

    I had no Python < 2.7 experience prior to working on hg. My impression is the old style is okay as long as the mixin is only used internally (so the parent class of the mixin can be hardcoded).

    Right. They do select() with 0.5sec timeout, but there seems no sane way to detect a timeout. I wanna get rid of SocketServer at some point. ;)

    I'd like to see that happen to get rid of thread as well. But SocketServer is non-trivial. Are you okay with copy-pasting tens or even 100+ lines code from SocketServer.BaseServer to override some of its internal methods (namely serve_forever and shutdown related code)?

    I won't insist this, but please rename the command if it can kill the server. "check" shouldn't do more than checking.

    Will spend some time looking for a new name :)

    --daemon-pipefds is the internal option paired with --daemon

    First, this is an issue of hg not chg.

    I believe the name is inappropriate because it's not fd. It was fd in the beginning until at some time we wanted to support Windows and changed it to a dummy file path. If you read the code, this file is neither read nor written but just deleted by the server, which does not make much sense to me. I suggest renaming it by splitting it into two options like --daemon-<somename> and --daemon-unlink.

    for example, pid file per hash, and it may be stale.

    per hash will require some refactoring or duplication, since at the time we handle --pidfile, we don't know config hash. And you have to list the directory to get all pids, which is not simpler than current approach.

    Don't we have an option to disable automatic restart and hash comparison?

    I plan to add a CHG environment variable to opt in the old behavior (was thinking about doing that in hgrc but it is not possible since the client needs the information before figuring out argv to start a server).

  5. Yuya Nishihara repo owner

    Are you okay with copy-pasting tens or even 100+ lines code from SocketServer.BaseServe

    No, I gave up. Threading is sensible for now.

    It was fd in the beginning until at some time we wanted to support Windows

    Yep, Windows provides no fork().

    this file is neither read nor written but just deleted by the server, which does not make much sense to me.

    Really? It should work as follows:

    main process:

    1. serve --daemon creates a lock
    2. forks daemon process with serve --daemon --daemon-pipefds=...
    3. and waits until lock file is deleted
    4. now we're ready to connect to the server

    daemon process:

    1. bind and listen on socket
    2. deletes lockfile to tell that the daemon starts listening

    And you have to list the directory to get all pids, which is not simpler than current approach.

    Ok, I don't agree in that it isn't simpler, but let's go without pidfile. Pidfile isn't that important.

    I plan to add a CHG environment variable to opt in the old behavior (was thinking about doing that in hgrc but it is not possible since the client needs the information before figuring out argv to start a server).

    Can't we start the server by the same argv? It won't be 100% compatible with the old behavior because we would pass --config and drop --cwd, but that seems okay to me.

  6. quark-zju author
    1. and waits until lock file is deleted

    Why not just try to connect to the server directly?

    I think lock by testing file existence is fragile when process crashes.

    Can't we start the server by the same argv?

    Then we probably want to keep --pidfile to support SIGHUP reload. It is not useful in multiple server case though.


    About renaming checkconfig:

    • I still think it's better done server side since it can avoid some race conditions neatly.
    • Since the server can gc itself, the socket file can disappear at any time so the possible side effect of unlink looks okay. The client does not care about whether the file is unlinked or not since the forked process will continue serve the client.
    • I had a discussion with my colleague about renaming. From the server's perspective, it's about loading config, hashing, checking mtime, unlink, etc., which is very hard to name short. From the client's perspective, it's all about "redirect". so the new name will be something like *redirect, for example, calcredirect, generateredirect etc.
  7. Yuya Nishihara repo owner

    Why not just try to connect to the server directly?

    Probably because it isn't reliable.

    $ hg serve --daemon -p 8000
    $ hg serve --daemon -p 8000
    

    Also, I can't say dummy connect is cleaner than crash-unsafe lock file.

    Then we probably want to keep --pidfile to support SIGHUP reload.

    Let's drop SIGHUP reload if it is a blocker. I don't think it is useful.

    *redirect, for example, calcredirect, generateredirect etc.

    Sounds better.