Slowdown due to gethostbyname

Issue #1162 closed
Ian Hinder created an issue

When running the ET testsuite on my laptop, I found it was taking 7 minutes to run just the tests from the McLachlan arrangement. The CPU usage was negligible for much of the time. I noticed that when running an empty parameter file, there was a delay before the host name was printed. Additionally, when I attached a debugger to find out why Cactus was taking so long to run the tests, the backtrace showed:

0x00007fff90f0dd16 in kevent ()
(gdb) bt
#0  0x00007fff90f0dd16 in kevent ()
#1  0x00007fff954c390a in _mdns_search ()
#2  0x00007fff954c3345 in mdns_hostbyname ()
#3  0x00007fff954c31c1 in search_host_byname ()
#4  0x00007fff954c30c1 in gethostbyname ()
#5  0x00000001096becf8 in Util_GetHostName (name=0x7fff56551550 "macbook", length=255) at Network.c:86

Util_GetHostName first calls gethostname, and if that function returns something with no "." in it, it calls gethostbyname. Indeed, my local hostname does not have a "." in it.
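
For illustration, the logic described above can be sketched roughly as follows. This is a minimal sketch, not the actual code in Network.c: the names needs_resolver_lookup and sketch_get_host_name are hypothetical, and the error handling is an assumption.

```c
#include <assert.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Returns 1 if `name` contains no "." and would therefore trigger the
   (potentially slow) resolver lookup, 0 otherwise. */
int needs_resolver_lookup(const char *name)
{
    assert(name != NULL);
    return strchr(name, '.') == NULL;
}

/* Hypothetical stand-in for Util_GetHostName; not the Cactus source. */
void sketch_get_host_name(char *name, int length)
{
    assert(name != NULL && length > 0);

    /* Step 1: ask the kernel for the local host name (fast, no network). */
    if (gethostname(name, (size_t)length) != 0) {
        name[0] = '\0';
        return;
    }
    name[length - 1] = '\0'; /* gethostname may not terminate on truncation */

    /* Step 2: only if the name has no ".", ask the resolver for a fuller
       name. This is the call that can block for seconds (mDNS, DNS). */
    if (needs_resolver_lookup(name)) {
        struct hostent *he = gethostbyname(name);
        if (he != NULL && he->h_name != NULL) {
            snprintf(name, (size_t)length, "%s", he->h_name);
        }
    }
}
```

With this structure, a host name such as "macbook" takes the slow path on every call, while "Ians-MacBook-Pro.local" never reaches gethostbyname.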

This is on Mac OS 10.8.2. After I changed the hostname via

sudo scutil --set HostName Ians-MacBook-Pro.local

so that the hostname includes a ".", the tests now run in 1m30s. I don't know why gethostbyname is so slow on my system. Does Cactus, or maybe simfactory, or maybe the test system, call gethostbyname very frequently? Perhaps the output could be cached if this call can sometimes be slow.

Comments (6)

  1. Erik Schnetter

    Util_GetHostName is called by Formaline, and by Carpet's I/O routines. The latter call it for every file.

    Yes, caching its output (in Util_GetHostName) would make sense.

    An alternative would be to use MPI_Get_processor_name, if MPI is available. In fact, Carpet itself does not call Util_GetHostName any more because this may have led to crashes, possibly related to calling fork().

    Upon startup, Carpet determines the host names of all nodes to see which MPI processes share a node. This is stored in a mapping (see file Hosts.cc); we could add a routine to return the current MPI process's host name.

  2. Erik Schnetter
    • changed status to open

    The attached patch caches the result of Util_GetHostName. Does this help?
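
For readers without access to the attachment, the general idea of caching the host name can be sketched as below. This is not the attached patch: the names slow_lookup, slow_lookup_calls, cached_get_host_name and HOSTNAME_CACHE_SIZE are hypothetical, and gethostname merely stands in for the expensive lookup.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define HOSTNAME_CACHE_SIZE 256 /* assumed upper bound, not from the patch */

/* Counts how often the expensive path runs (for demonstration only). */
static int slow_lookup_calls = 0;

/* Stand-in for the expensive part (in the issue: gethostbyname). */
static void slow_lookup(char *name, int length)
{
    ++slow_lookup_calls;
    if (gethostname(name, (size_t)length) != 0) {
        snprintf(name, (size_t)length, "%s", "localhost");
    }
    name[length - 1] = '\0';
}

/* Performs the lookup once, then serves every later call from a
   fixed-size static buffer, independent of the caller's `length`. */
void cached_get_host_name(char *returned_name, int length)
{
    static int have_name = 0;
    static char name[HOSTNAME_CACHE_SIZE];

    assert(returned_name != NULL && length > 0);

    if (!have_name) {
        slow_lookup(name, (int)sizeof name);
        have_name = 1;
    }
    /* snprintf truncates safely if the caller's buffer is too small. */
    snprintf(returned_name, (size_t)length, "%s", name);
}
```

The cache is sized independently of the caller's buffer, so a first call with a small length does not limit what later callers receive. The sketch is not thread-safe.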

  3. Ian Hinder reporter
    • changed status to open

    Thanks. This reduces the time taken to run the tests from 7m8s to 1m56s. "ping macbook" takes 5 seconds to fail. I don't know why this is; it sounds like a system configuration or OS error.

    Please apply.

    Regarding the patch,

    • Can the passed in "length" variable be used to set the size of the name string, rather than setting it to 100? We now require C99, which requires this feature, but http://en.wikipedia.org/wiki/Variable-length_array says that variable-length automatic arrays are only an optional feature in C11, the newest C standard. Do we use VLAs in other thorns? If we are concerned about not having them available, we should provide an autoconf test for them.
    • The logic in the patch doesn't seem to warn if a truncated name is copied from name to returned_name due to returned_name being too small.
  4. Erik Schnetter

    No, we can't use the "length" variable passed in, because the first call may pass length=1, and the next may pass length=100. A typical operating system length limit may be 64.

    Also, static variables probably cannot be VLAs.

    Yes, there is no such logic, but there was no such logic there before either. The patch doesn't add error checking.
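
If truncation checking were wanted, one possible approach is to compare the return value of snprintf with the destination size. The following is only a sketch under that assumption; copy_host_name is a hypothetical helper, not part of the patch or of Cactus.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Copies `src` into `dst` (capacity `dst_len` bytes, including the
   terminator). Returns 1 and prints a warning if the name had to be
   truncated, 0 if it fit completely. */
int copy_host_name(char *dst, size_t dst_len, const char *src)
{
    int needed;

    assert(dst != NULL && src != NULL && dst_len > 0);

    /* snprintf returns the length the full string would have had. */
    needed = snprintf(dst, dst_len, "%s", src);
    if (needed < 0 || (size_t)needed >= dst_len) {
        fprintf(stderr, "warning: host name \"%s\" truncated to %lu bytes\n",
                src, (unsigned long)(dst_len - 1));
        return 1;
    }
    return 0;
}
```

A caller could then decide whether a truncated name is acceptable or should be treated as an error.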
