Clean environment in launched container.
Hi. It seems that the launched container has a clean environment: only the mesos and godocker vars are set.
MESOS_DIRECTORY=/var/lib/mesos/slaves/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S105/frameworks/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017/executors/248-0/runs/ab43c0f5-aebf-4c40-8c2f-bad4af5f51ce
MESOS_EXECUTOR_ID=248-0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs
PWD=/mnt/go-docker
MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos-1.3.0.so
MESOS_NATIVE_LIBRARY=/usr/lib/libmesos-1.3.0.so
MESOS_HTTP_COMMAND_EXECUTOR=0
MESOS_SLAVE_PID=slave(1)@10.0.0.10:5051
MESOS_FRAMEWORK_ID=1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017
MESOS_CHECKPOINT=0
SHLVL=2
LIBPROCESS_PORT=0
MESOS_SLAVE_ID=1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S105
GODOCKER_JID=248
GODOCKER_PWD=/mnt/go-docker
GODOCKER_HOME=/mnt/go-docker
GODOCKER_DATA=/mnt/god-data
MESOS_SANDBOX=/mnt/mesos/sandbox
However, we are using containers with a shell-form entrypoint, so in this case all vars that have been set inside the container should be available.
When I run this docker container by hand, it has the following vars:
root@0f6506516c96:/# env
CUDNN_VERSION=7.0.4.31
HOSTNAME=0f6506516c96
NVIDIA_REQUIRE_CUDA=cuda>=9.0
TERM=xterm
MKL_THREADING_LAYER=GNU
LIBRARY_PATH=/usr/local/cuda/lib64/stubs:
NVIDIA_VISIBLE_DEVICES=all
LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64
NVIDIA_DRIVER_CAPABILITIES=compute,utility
PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
PWD=/
CUDA_PKG_VERSION=9-0=9.0.176-1
CUDA_VERSION=9.0.176
SHLVL=1
HOME=/root
NCCL_VERSION=2.1.2
_=/usr/bin/env
root@0f6506516c96:/# whoami
root
Could you please help? Am I doing something wrong? We have not seen such cases with the aurora framework. The job was launched with the "Run as root" option.
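The comparison described above (the env seen inside the job versus the env seen when the image is run by hand) can be automated. The following is only a minimal sketch: the function names `var_names` and `missing_vars` are hypothetical, and it assumes both dumps are plain `env` output with one NAME=value entry per line.

```python
from typing import Iterable, List


def var_names(lines: Iterable[str]) -> List[str]:
    """Return the variable names found in NAME=value lines, in order."""
    names = []
    for line in lines:
        name, sep, _value = line.partition("=")
        # Skip blank lines and anything that is not a NAME=value entry.
        if sep and name:
            names.append(name)
    return names


def missing_vars(expected_dump: str, observed_dump: str) -> List[str]:
    """List variables present in expected_dump but absent from observed_dump."""
    observed = set(var_names(observed_dump.splitlines()))
    return [n for n in var_names(expected_dump.splitlines()) if n not in observed]


if __name__ == "__main__":
    # Shortened excerpts of the two dumps from this report.
    by_hand = "CUDNN_VERSION=7.0.4.31\nNVIDIA_REQUIRE_CUDA=cuda>=9.0\nPATH=/usr/local/cuda/bin:/bin\n"
    in_job = "MESOS_EXECUTOR_ID=248-0\nPATH=/usr/bin:/bin\nGODOCKER_JID=248\n"
    print(missing_vars(by_hand, in_job))
```

Only names are compared here, so a variable such as PATH that exists in both dumps with different values is not reported.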
Comments (28)
-
repo owner -
repo owner I just did a test, both as root and non-root, and I got my env variables as expected.
Sample image:
FROM debian
ENV TEST1=test11
ENV TEST2=test22
In my godocker script:
#!/bin/bash
echo HelloGODOCKER
env
I get in god.log:
HelloGODOCKER
...
GODOCKER_HOME=/mnt/go-docker
GODOCKER_JID=150550
PWD=/mnt/go-docker
TEST1=test11
TEST2=test22
....
So the env vars are present.
-
reporter OK, but what if you set an entrypoint?
Like this: ENTRYPOINT start.sh
I am asking because, as I understand it, with a shell-form entrypoint all env vars should be kept, but with the exec form they are not.
So when you override the entrypoint, maybe it ends up looking like this:
ENTRYPOINT ["/bin/bash", "/mnt/go-docker/wrapper.sh"]
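For reference, the two ENTRYPOINT forms mentioned above differ in how Docker records the command: the exec form is a JSON array used as the argument vector as-is, while the shell form is wrapped in /bin/sh -c. The sketch below only models that recording step under those assumptions; `entrypoint_argv` is a hypothetical helper, not part of Docker, Mesos or godocker.

```python
import json
from typing import List


def entrypoint_argv(value: str) -> List[str]:
    """Model how an ENTRYPOINT value ends up as an argument vector."""
    text = value.strip()
    if text.startswith("["):
        try:
            parsed = json.loads(text)
        except ValueError:
            parsed = None
        # Exec form: a JSON array of strings, used unchanged.
        if isinstance(parsed, list) and all(isinstance(p, str) for p in parsed):
            return parsed
    # Shell form: the whole string is handed to /bin/sh -c.
    return ["/bin/sh", "-c", text]


if __name__ == "__main__":
    print(entrypoint_argv("start.sh"))
    print(entrypoint_argv('["/bin/bash", "/mnt/go-docker/wrapper.sh"]'))
```

Neither form by itself removes the ENV values declared in the image; the forms differ in whether a shell is started to interpret the command string.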
-
reporter Could you, please also check this: http://mesos.apache.org/documentation/latest/isolators/docker-runtime/
-
repo owner When executing as non-root, godocker gets the original env, saves it to a god.env file (in the job dir) and loads god.env when starting the command file. When executing as root, the command file is executed directly, so it gets its original env directly.
If you want to see what is in the env at container startup, as a test, you can modify godocker/godscheduler.py, line 722:
wrapper = "#!/bin/sh\n"
=>
wrapper += "env > /mnt/go-docker/test.env\n"
wrapper += "if hash bash 2>/dev/null; then\n"
This will create a test.env file in the job directory with the image environment (before any godocker-specific setup etc.). Your env should be correct there. As it works for me, it could be a difference with my config/test.
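The save-and-reload step described above for non-root jobs can be illustrated roughly as follows. This is a sketch only: the actual format of god.env is not shown in this thread, so the "export NAME=value" layout and the names `dump_env` and `load_env` are assumptions made for illustration.

```python
import shlex
from typing import Dict


def dump_env(env: Dict[str, str]) -> str:
    """Serialize an environment as shell-quoted 'export NAME=value' lines."""
    return "".join(
        "export {}={}\n".format(name, shlex.quote(value))
        for name, value in sorted(env.items())
    )


def load_env(text: str) -> Dict[str, str]:
    """Parse text produced by dump_env back into a dictionary."""
    tokens = shlex.split(text)
    env = {}
    # Tokens alternate between the keyword 'export' and 'NAME=value'.
    for keyword, assignment in zip(tokens[0::2], tokens[1::2]):
        if keyword == "export" and "=" in assignment:
            name, value = assignment.split("=", 1)
            env[name] = value
    return env


if __name__ == "__main__":
    saved = dump_env({"NVIDIA_REQUIRE_CUDA": "cuda>=9.0", "TEST1": "test11"})
    print(saved, end="")
    print(load_env(saved))
```

Quoting with shlex.quote keeps values such as cuda>=9.0 intact if the file is later sourced by a shell.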
-
repo owner we see task command value and command shell = true
-
repo owner If I understand correctly, the problem is getting env variables for images that have an original entrypoint, which godocker overrides? (But it is OK if there is no entrypoint.)
-
reporter (quoting) "can modify godocker/godscheduler.py, line 722"
I have done it, but the result is still the same.
cat god.log
test env
LIBPROCESS_IP=10.0.0.14
MESOS_AGENT_ENDPOINT=10.0.32.184:5051
MESOS_DIRECTORY=/var/lib/mesos/slaves/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S93/frameworks/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017/executors/268-0/runs/1bd3db62-83d9-4a8f-b92e-01013c96e47e
MESOS_EXECUTOR_ID=268-0
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs
PWD=/mnt/go-docker
MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos-1.3.0.so
MESOS_NATIVE_LIBRARY=/usr/lib/libmesos-1.3.0.so
MESOS_HTTP_COMMAND_EXECUTOR=0
MESOS_SLAVE_PID=slave(1)@10.0.0.14:5051
MESOS_FRAMEWORK_ID=1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017
MESOS_CHECKPOINT=0
SHLVL=2
LIBPROCESS_PORT=0
MESOS_SLAVE_ID=1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S93
GODOCKER_JID=268
GODOCKER_PWD=/mnt/go-docker
GODOCKER_HOME=/mnt/go-docker
GODOCKER_DATA=/mnt/god-data
MESOS_SANDBOX=/mnt/mesos/sandbox
_=/usr/bin/env
However, here is part of my container info from the docker inspect command. As you can see, all my env vars are there:
"Env": [
    "PATH=/usr/local/nvidia/bin:/usr/local/cuda/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin",
    "CUDA_VERSION=9.0.176",
    "CUDA_PKG_VERSION=9-0=9.0.176-1",
    "LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64",
    "NVIDIA_VISIBLE_DEVICES=all",
    "NVIDIA_DRIVER_CAPABILITIES=compute,utility",
    "NVIDIA_REQUIRE_CUDA=cuda>=9.0",
    "NCCL_VERSION=2.1.2",
    "LIBRARY_PATH=/usr/local/cuda/lib64/stubs:",
    "CUDNN_VERSION=7.0.4.31",
    "MKL_THREADING_LAYER=GNU"
],
"Cmd": [ "bash" ],
"Image": "1c394d3be123",
"Volumes": { "/models": {} },
"WorkingDir": "",
"Entrypoint": [ "/bin/sh", "-c", "startup.sh && screen-startup.sh" ],
-
reporter I will check again.
It looks like wrapper.sh starts with /bin/sh.
What if I change it to /bin/bash?
-
repo owner The wrapper needs to start with sh, not bash, because not all images have bash. Anyway, in the test I made above with TEST1 and TEST2, you can see I get my env from the original image, so sh/bash is not the issue.
-
repo owner I tried with an image with an entrypoint, and still get my env....
FROM debian
ENV TEST1=test11
ENV TEST2=test22
ENV TEST3=test33
RUN echo "#!/bin/bash\necho hello" > /root/start.sh
RUN chmod +x /root/start.sh
ENTRYPOINT /root/start.sh
In godocker cmd:
echo HelloGODOCKER
env
In god.log:
....
TEST1=test11
TEST2=test22
TEST3=test33
...
-
repo owner What is your docker image? Is it publicly available?
-
repo owner can you try with my test image (osallou/test1) to see if you get the TESTxx env vars (to see if it is image or setup related)
-
reporter (quoting) "what is your docker image (is it public available?)"
No. I will try to find a small image on docker hub with env vars, for testing.
-
repo owner I tried with nvidia/cuda:9.1-base and get in god.log:
...
NVIDIA_REQUIRE_CUDA=cuda>=9.1
LD_LIBRARY_PATH=/usr/local/nvidia/lib:/usr/local/nvidia/lib64
NVIDIA_VISIBLE_DEVICES=all
NVIDIA_DRIVER_CAPABILITIES=compute,utility
CUDA_PKG_VERSION=9-1=9.1.85-1
-
repo owner Do you have docker/runtime set in /etc/mesos-slave/isolation (or in your slave startup flags)? I suppose yes if it works with aurora, and it should not work with docker images if it is not set.
-
reporter (quoting) "do you have docker/runtime set in /etc/mesos-slave/isolation ? (or your slave startup flags). I suppose yes if it works with aurora."
Yes.
I took nginx from docker hub: https://github.com/nginxinc/docker-nginx/blob/dbd053d52727bc8db0fec704caa22b8e0d5f6c84/mainline/stretch/Dockerfile
It has:
ENV NGINX_VERSION 1.13.9-1~stretch
ENV NJS_VERSION 1.13.9.0.1.15-1~stretch
This is the log from mesos:
Executing pre-exec command '{"arguments":["mesos-containerizer","mount","--help=false","--operation=make-rslave","--path=\/"],"shell":false,"value":"\/usr\/libexec\/mesos\/mesos-containerizer"}'
Executing pre-exec command '{"arguments":["mount","-n","--rbind","\/var\/lib\/mesos\/slaves\/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S111\/frameworks\/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017\/executors\/277-0\/runs\/9032a063-a19f-41ea-8434-99c52205ce86","\/var\/lib\/mesos\/provisioner\/containers\/9032a063-a19f-41ea-8434-99c52205ce86\/backends\/aufs\/rootfses\/d243db2c-a3a9-45bb-b129-19586ba2bdb7\/mnt\/mesos\/sandbox"],"shell":false,"value":"mount"}'
Executing pre-exec command '{"arguments":["mount","-n","--rbind","\/mesos-storage\/godshared\/tasks\/pairtree_root\/27\/7\/task","\/var\/lib\/mesos\/provisioner\/containers\/9032a063-a19f-41ea-8434-99c52205ce86\/backends\/aufs\/rootfses\/d243db2c-a3a9-45bb-b129-19586ba2bdb7\/mnt\/go-docker"],"shell":false,"value":"mount"}'
Received SUBSCRIBED event
Subscribed executor on host1
Received LAUNCH event
Starting task 277-0
Running '/usr/libexec/mesos/mesos-containerizer launch <POSSIBLY-SENSITIVE-DATA>'
Forked command at 81789
Changing root to /var/lib/mesos/provisioner/containers/9032a063-a19f-41ea-8434-99c52205ce86/backends/aufs/rootfses/d243db2c-a3a9-45bb-b129-19586ba2bdb7
MESOS_EXECUTOR_ID=277-0
MESOS_CHECKPOINT=0
MESOS_HTTP_COMMAND_EXECUTOR=0
MESOS_EXECUTOR_SHUTDOWN_GRACE_PERIOD=5secs
LIBPROCESS_PORT=0
MESOS_AGENT_ENDPOINT=10.0.0.1:5051
PATH=/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin
MESOS_SANDBOX=/mnt/mesos/sandbox
MESOS_NATIVE_JAVA_LIBRARY=/usr/lib/libmesos-1.3.0.so
MESOS_FRAMEWORK_ID=1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017
MESOS_DIRECTORY=/var/lib/mesos/slaves/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S111/frameworks/1d2d8f83-e7ff-40ff-bf38-d21248192ca6-0017/executors/277-0/runs/9032a063-a19f-41ea-8434-99c52205ce86
MESOS_NATIVE_LIBRARY=/usr/lib/libmesos-1.3.0.so
MESOS_SLAVE_ID=1d2d8f83-e7ff-40ff-bf38-d21248192ca6-S111
PWD=/mnt/mesos/sandbox
MESOS_SLAVE_PID=slave(1)@10.0.0.1:5051
LIBPROCESS_IP=10.0.0.1
OK
Command exited with status 0 (pid: 81789)
I see "shell":false.
Should shell be true?
I will check tomorrow with aurora (I did not check it myself; a colleague said that it all works with aurora but not with godocker).
-
repo owner If I look at my mesos logs, I also see "shell": false, so this is not the problem (it relates to the mounts anyway, not to startup). I just did the test with the dockerhub nginx image and can see the env vars in god.log (still run as root):
NJS_VERSION=1.13.9.0.1.15-1~stretch
NGINX_VERSION=1.13.9-1~stretch
-
repo owner However, I do not see the same "mesos" logs:
Received LAUNCH event
Starting task 150558-0
/usr/libexec/mesos/mesos-containerizer launch .....
Forked command at 14848
Changing root to /var/lib/mesos/provisioner/containers/d178f96b-aad8-439b-b384-c8d08ab64a55/backends/aufs/rootfses/74916d37-15ed-4f00-b2ba-6e53e869c6a2
Overwriting environment variable 'PATH', original: '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin', new: '/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin'
OK
Command exited with status 0 (pid: 14848)
It does not display env variables for me, and I have a message about env overwriting. The mesos-containerizer message is not the same either.
Which mesos version do you use? (I am currently using 1.2.0, but had tested env variables before.)
-
repo owner With nginx, you do not see them in god.log either?
-
reporter I have tested on mesos v. 1.3.0.
The env output in the mesos log comes from wrapper.sh, where I added "env" after #!/bin/sh as you advised earlier. Another env command is in the job's command window.
(quoting) "with nginx, you do not see them neither in god.log?"
No, neither in the mesos logs nor in god.log.
I also have mesos v. 1.4.1 in a dev env, so I will try to check this issue on it tomorrow.
-
repo owner I will try to upgrade mesos locally to see whether it is mesos version related.
-
repo owner I have upgraded locally to mesos 1.5.0 and tested against the nginx image, and I still have the nginx_version env vars in god.log. So either there is an issue with your specific mesos version (I tested successfully on 1.0.0, 1.2.0 and 1.5.0), or it is related to your installation/setup (there are no godocker config options for this).
Your testing in another dev setup should give us some hints.
-
repo owner Hmm... I tested with mesos 1.3.0 too and no longer find the nginx env vars. It worked with 1.5.0, so this seems to be an issue with your mesos version.
-
repo owner Found a bug in mesos 1.3.0, fixed in 1.3.1, where env vars are not available in the container:
[MESOS-7692] - Default environment variables defined in Docker image are not available in Mesos containerizer.
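A quick way to tell whether an agent runs the version reported as affected here is to read it from the MESOS_NATIVE_LIBRARY value that appears in the env dumps in this thread. This is a sketch under assumptions: it relies on the library file name embedding the version (as in /usr/lib/libmesos-1.3.0.so), the helper names are hypothetical, and only 1.3.0 is treated as affected because that is the only failing version identified in this thread.

```python
import re
from typing import Optional, Tuple

# Matches file names such as libmesos-1.3.0.so (layout taken from the dumps above).
_LIB_VERSION = re.compile(r"libmesos-(\d+)\.(\d+)\.(\d+)\.so")


def mesos_version_from_library(path: str) -> Optional[Tuple[int, int, int]]:
    """Extract (major, minor, patch) from a libmesos path, or None if absent."""
    match = _LIB_VERSION.search(path)
    if match is None:
        return None
    major, minor, patch = (int(group) for group in match.groups())
    return (major, minor, patch)


def is_known_affected(version: Tuple[int, int, int]) -> bool:
    """True only for 1.3.0, the version reported as failing in this thread."""
    return version == (1, 3, 0)


if __name__ == "__main__":
    version = mesos_version_from_library("/usr/lib/libmesos-1.3.0.so")
    print(version, is_known_affected(version) if version else "unknown")
```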
-
reporter (quoting) "Found a bug in mesos 1.3.0, solved in 1.3.1 where env vars are not available in container"
Oh no... I have just checked with aurora and got the same result: the env was cleared. So it looks like it really is a mesos 1.3.0 issue. Very sorry for taking your time. Thank you.
-
repo owner - changed status to resolved
Bug in mesos 1.3.0, solved in 1.3.1
-
reporter (quoting) "Bug in mesos 1.3.0, solved in 1.3.1"
Confirmed. I have checked on mesos 1.4.1 and everything works fine.
Hi, the entrypoint is overridden; this is necessary because of the setup required to run the script with user rights and to chown the job files. Regarding env vars set in the container, godocker does nothing with them, so they should be present in the container. But I do not know how the mesos unified containerizer behaves in this respect. Godocker only adds the GOD xxx env vars, and mesos sets its own.
Mesos supports docker container images, but the mesos unified containerizer creates containers on its own, maybe handling env vars differently (or needing options to use them?). I will try to look at the mesos docs.