Note: This article is reproduced from serverfault.com.

Elastic Beanstalk failed to deploy worker environment: /var/pids/web.pid: no such file or directory

Posted on 2020-09-04 22:32:39

I am struggling with Elastic Beanstalk on what I figured was a relatively simple architecture. I have a Django app that contains the code for both my worker environment and my web server. Deploying to the web server environment works fine; the issue only appears in the worker environment as I try to incorporate it. This poor chap had the same issue, which the community doesn't seem to have a good answer for.

Combining the two in one codebase was, I learned, mistake number one, because there is no great way to keep the two configurations separate.

It began with me getting this error:

2020/09/03 20:17:59.065285 [ERROR] update processes [web sqsd nginx healthd cfn-hup] pid symlinks failed with error Read pid source file /var/pids/web.pid failed with error:open /var/pids/web.pid: no such file or directory

From this link, I thought the cause was my .platform directory, which held an nginx override (its only change was turning gzip on). I removed it and redeployed, and the error went away, so that seemed logical. Unfortunately the error is rather temperamental: it has come back, after I had already deployed successfully a few times.

My first attempt to fix this was using saved_configs/

├── .elasticbeanstalk
│   ├── config.yml
│   └── saved_configs
│       ├── web.cfg.yml
│       └── worker.cfg.yml

and calling it like this:

eb create testweb --cfg web
eb create testwrkr -t worker --cfg worker
eb deploy testweb
eb deploy testwrkr

This seemed to work, but I still couldn't deploy my worker environment. Next I wrote a Makefile, since removing .platform had fixed the error for me yesterday:

ENVS = 'production development'
ifndef env
$(error "env" is not specified. Please use one of: $(ENVS))
endif

create_web: clean copy_web_configuration
    @echo "Creating Environment: $(env)"
    -eb create $(env)

create_worker: clean copy_worker_configuration
    @echo "Creating Worker Environment: $(env)"
    -eb create $(env) -t worker

deploy_web: clean copy_web_configuration
    @echo "Deploying to: $(env)"
    -eb deploy $(env) $(args)
    @$(MAKE) clean

deploy_worker: clean copy_worker_configuration
    @echo "Deploying worker to: $(env)"
    -eb deploy $(env) $(args)
    @$(MAKE) clean

copy_web_configuration:
    @cp -r config/ebs/extensions/shared/ .ebextensions/
    @cp -r config/ebs/extensions/web/ .ebextensions/
    @cp -r config/ebs/platform/web/ .platform/

copy_worker_configuration:
    @cp -r config/ebs/extensions/shared/ .ebextensions/
    @cp -r config/ebs/extensions/worker/ .ebextensions/

clean:
    @find .ebextensions/ -maxdepth 1 -type f -exec rm -f {} \;
    @rm -rf .platform/nginx
    @find .platform/ -maxdepth 1 -type f -exec rm -f {} \;

The goal was to be 100% sure that whatever was in .ebextensions/ or .platform/ was meant for the environment being deployed.

My new file tree looks like this:
.ebextensions/
.platform/
config/
├── __init__.py
├── ebs
│   ├── extensions
│   │   ├── shared
│   │   │   ├── 01_packages.config
│   │   │   ├── appslog.config
│   │   │   └── django.config
│   │   ├── web
│   │   │   ├── db-migrate.config
│   │   │   ├── securelistener-clb.config
│   │   │   └── static.config
│   │   └── worker
│   │       └── worker.config
│   └── platform
│       └── web
│           └── nginx
│               └── nginx.conf
├── settings
│   ├── __init__.py
│   ├── base.py
│   ├── local.py
│   └── production.py
├── urls.py
└── wsgi.py

And now, when I deploy (after adding cron.yaml), my old friend is back:

2020/09/04 22:12:21.485660 [INFO] Executing instruction: Track pids in healthd
2020/09/04 22:12:21.485677 [INFO] This is an enhanced health env...
2020/09/04 22:12:21.485697 [INFO] Running command /bin/sh -c systemctl show -p ConsistsOf aws-eb.target | cut -d= -f2
2020/09/04 22:12:21.491871 [INFO] nginx.service healthd.service cfn-hup.service sqsd.service

2020/09/04 22:12:21.491894 [INFO] Running command /bin/sh -c systemctl show -p ConsistsOf eb-app.target | cut -d= -f2
2020/09/04 22:12:21.496690 [INFO] web.service

2020/09/04 22:12:21.496761 [ERROR] update processes [web nginx healthd cfn-hup sqsd] pid symlinks failed with error Read pid source file /var/pids/web.pid failed with error:open /var/pids/web.pid: no such file or directory
2020/09/04 22:12:21.496772 [ERROR] An error occurred during execution of command [app-deploy] - [Track pids in healthd]. Stop running the command. Error: update processes [web nginx healthd cfn-hup sqsd] pid symlinks failed with error Read pid source file /var/pids/web.pid failed with error:open /var/pids/web.pid: no such file or directory

2020/09/04 22:12:21.496776 [INFO] Executing cleanup logic
2020/09/04 22:12:21.496861 [INFO] CommandService Response: {"status":"FAILURE","api_version":"1.0","results":[{"status":"FAILURE","msg":"Engine execution has encountered an error.","returncode":1,"events":[{"msg":"Instance deployment successfully generated a 'Procfile'.","timestamp":1599257531,"severity":"INFO"},{"msg":"Instance deployment failed. For details, see 'eb-engine.log'.","timestamp":1599257541,"severity":"ERROR"}]}]}

My worker configuration doesn't have much in it, so I'm at a loss as to why this won't deploy. Has anyone seen this issue before? The only resources I have found online are:


I monitored both logs at the same time to see the order of events. First I get this error:

Sep  4 22:42:18 ip-172-31-7-235 web: File "/usr/lib64/python3.7/importlib/__init__.py", line 127, in import_module
Sep  4 22:42:18 ip-172-31-7-235 web: return _bootstrap._gcd_import(name[level:], package, level)
Sep  4 22:42:18 ip-172-31-7-235 web: File "<frozen importlib._bootstrap>", line 1006, in _gcd_import
Sep  4 22:42:18 ip-172-31-7-235 web: File "<frozen importlib._bootstrap>", line 983, in _find_and_load
Sep  4 22:42:18 ip-172-31-7-235 web: File "<frozen importlib._bootstrap>", line 965, in _find_and_load_unlocked
Sep  4 22:42:18 ip-172-31-7-235 web: ModuleNotFoundError: No module named 'application'
Sep  4 22:42:18 ip-172-31-7-235 web: [2020-09-04 22:42:18 +0000] [9303] [INFO] Worker exiting (pid: 9303)
Sep  4 22:42:18 ip-172-31-7-235 web: [2020-09-04 22:42:18 +0000] [9296] [INFO] Shutting down: Master
Sep  4 22:42:18 ip-172-31-7-235 web: [2020-09-04 22:42:18 +0000] [9296] [INFO] Reason: Worker failed to boot.

and then

2020/09/04 22:42:21.717799 [INFO] Running command /bin/sh -c systemctl show -p ConsistsOf eb-app.target | cut -d= -f2
2020/09/04 22:42:21.722604 [INFO] web.service

2020/09/04 22:42:21.722678 [ERROR] update processes [web healthd nginx sqsd cfn-hup] pid symlinks failed with error Read pid source file /var/pids/web.pid failed with error:open /var/pids/web.pid: no such file or directory
2020/09/04 22:42:21.722689 [ERROR] An error occurred during execution of command [app-deploy] - [Track pids in healthd]. Stop running the command. Error: update processes [web healthd nginx sqsd cfn-hup] pid symlinks failed with error Read pid source file /var/pids/web.pid failed with error:open /var/pids/web.pid: no such file or directory 

2020/09/04 22:42:21.722694 [INFO] Executing cleanup logic
2020/09/04 22:42:21.722778 [INFO] CommandService Response: {"status":"FAILURE","api_version":"1.0","results":[{"status":"FAILURE","msg":"Engine execution has encountered an error.","returncode":1,"events":[{"msg":"Instance deployment successfully generated a 'Procfile'.","timestamp":1599259331,"severity":"INFO"},{"msg":"Instance deployment failed. For details, see 'eb-engine.log'.","timestamp":1599259341,"severity":"ERROR"}]}]}

So I'm guessing web.service fails to start because of the ModuleNotFoundError, which is why web.pid is never found. What I don't understand is that I am using the exact same code repo as for the web environment, so how could the worker environment fail? The configuration looks okay to me. My tree is above, and the relevant Elastic Beanstalk configuration is this:

  aws:elasticbeanstalk:container:python:
    WSGIPath: config.wsgi:application

Questioner: Nick Brady

Nick Brady 2020-09-08 23:58:33

Since the original question was about the web.service failure, I'll add this note from the AWS support staff.

Next, 'web.service' is the linux service for Beanstalk which is responsible of activating the virtual environment in staging and running the WSGI server using 'gunicorn' python plugin[2]. So, from the error '/var/pids/web.pid failed with error:open /var/pids/web.pid' it looks like, when 'web.service' was trying to run the application using 'ExecStart=/bin/sh -c "gunicorn --bind 127.0.0.1:8000 --workers=1 --threads=15 application"' and it failed as module was missing.

This points to where I thought the issue was: the ModuleNotFoundError. When I ran tail -f on two different log files, I saw that the ModuleNotFoundError happened directly before the other error, which supported that theory.
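For anyone who would rather compare the two logs after the fact than watch them live, here is a minimal Python sketch that merges lines from both into one timeline. It assumes the two timestamp layouts seen in the excerpts above (eb-engine.log lines start with "2020/09/04 22:42:21.717799", the web process lines start with "Sep  4 22:42:18") and that every line passed in starts with a timestamp. The function names and the year parameter are made up for this example; the year is needed because the syslog-style lines do not carry one.

```python
from datetime import datetime


def parse_engine_ts(line):
    """Parse the leading timestamp of an eb-engine.log line."""
    # Example prefix: "2020/09/04 22:42:21.717799" (26 characters)
    return datetime.strptime(line[:26], "%Y/%m/%d %H:%M:%S.%f")


def parse_syslog_ts(line, year):
    """Parse the leading timestamp of a syslog-style line, which has no year."""
    # Example prefix: "Sep  4 22:42:18" (15 characters)
    return datetime.strptime(f"{year} {line[:15]}", "%Y %b %d %H:%M:%S")


def merge_logs(engine_lines, web_lines, year):
    """Return (timestamp, source, line) tuples from both logs, oldest first."""
    tagged = [(parse_engine_ts(l), "eb-engine", l) for l in engine_lines if l.strip()]
    tagged += [(parse_syslog_ts(l, year), "web", l) for l in web_lines if l.strip()]
    # sorted() is stable, so lines with equal timestamps keep their input order.
    return sorted(tagged, key=lambda item: item[0])
```

Fed the lines from this question with year=2020, the ModuleNotFoundError entries (22:42:18) sort ahead of the pid symlink error (22:42:21), which is the ordering I saw with tail -f.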

This ended up happening because, after I switched to using a Makefile, I was no longer committing my code before running eb deploy, or at least staging it and running the deploy with the --staged flag. Unfortunately, AWS's recommended approach was to use two different branches to manage worker code and web code if I wanted to use the same repository. I found this to be terrible advice and poor design on Elastic Beanstalk's part, so I decided to keep my Makefile. I found a workaround: if I specify an .ebignore file, the EB CLI bypasses git, so whatever files are copied into place right before the deploy (and cleaned up afterwards) get deployed.


Here was my debugging process if anyone finds it helpful:

The WSGI application is not launching. This means that the entry point to the python application is not being found for whatever reason.

At the root of the application, I have:

config/
├── settings
│   ├── __init__.py
│   ├── base.py
│   ├── local.py
│   └── production.py
├── urls.py
└── wsgi.py

where wsgi.py defines the variable:

application = get_wsgi_application()

For some reason, the configuration is failing to find this variable, which is what starts the Python application.

The relevant Elastic Beanstalk configuration that starts the application, in /.ebextensions/django.config:

  aws:elasticbeanstalk:container:python:
    WSGIPath: config.wsgi:application

This seems right to me, so I'm a bit confused. Also, I've deployed many times with this setup and it's worked; however now it is failing. Why?

Checking the console to see whether the WSGI path is being set as I think it is:

It just says application. What does it say on the environments that are actually working? As I thought, the working environment says config.wsgi:application. So the issue is that the configuration is not actually making it to Elastic Beanstalk for whatever reason.
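To see why a WSGI path of plain application leads to the ModuleNotFoundError in the log, here is a minimal Python sketch of how a module:variable target can be resolved. It is an illustration only, not gunicorn's actual implementation. The function name resolve_wsgi_target is made up for this example, and falling back to a variable named application when no colon is given is my assumption about the convention.

```python
import importlib


def resolve_wsgi_target(spec, default_attr="application"):
    """Resolve a 'module:variable' string to the object it names.

    Illustrative sketch only; default_attr is an assumed convention.
    """
    module_name, _, attr = spec.partition(":")
    # import_module raises ModuleNotFoundError when no such module is importable.
    module = importlib.import_module(module_name)
    # With no ':variable' part, fall back to the assumed default name.
    return getattr(module, attr or default_attr)
```

Run from the application root, resolve_wsgi_target("config.wsgi:application") should import config/wsgi.py and return its application object. resolve_wsgi_target("application") instead looks for a top-level module named application, and with no such module in my tree it raises ModuleNotFoundError, the same error as in the log.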

I realized that the files I wanted to deploy were not actually being committed. The files in the .ebextensions directory need to be committed just like the rest of the application. This makes the Makefile a little tricky, since the files won't be committed as I've just copied them over.
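A guard like the following could have caught this before the deploy. It is a hedged Python sketch, not part of the EB CLI: it asks git which files under .ebextensions and .platform differ from the last commit, using git status --porcelain. The names CONFIG_DIRS, parse_porcelain and uncommitted_config_files are made up for this example, and the parser handles only the simple "XY path" lines of the version 1 porcelain format (paths that git quotes are returned as-is).

```python
import subprocess

# Directories whose contents Elastic Beanstalk reads as configuration.
CONFIG_DIRS = (".ebextensions", ".platform")


def parse_porcelain(output):
    """Return the paths listed in `git status --porcelain` (v1) output."""
    paths = []
    for line in output.splitlines():
        if len(line) > 3:
            # Each line is "XY <path>"; renames look like "old -> new".
            paths.append(line[3:].split(" -> ")[-1])
    return paths


def uncommitted_config_files(repo_root="."):
    """List config files that are modified, staged, or untracked in git."""
    result = subprocess.run(
        ["git", "status", "--porcelain", "--", *CONFIG_DIRS],
        cwd=repo_root,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_porcelain(result.stdout)
```

If uncommitted_config_files() returns anything right before eb deploy, those files are probably not going to reach the environment unless they are committed first. Note that the list also includes staged changes, which eb deploy --staged would pick up.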

I tried adding the newly copied files to git, running my deploy with --staged, and then removing the changes with git restore --source=HEAD --staged --worktree -- .ebextensions/. This works, but I don't like interfering with the git workflow, so I went looking for a way to deploy outside of git. From this AWS thread, it looks like this can be accomplished with a .ebignore file.

The documentation says so specifically (Configure the EB CLI - AWS Elastic Beanstalk):

When .ebignore is present, the EB CLI doesn't use git commands to create your source bundle. This means that EB CLI ignores files specified in .ebignore, and includes all other files. In particular, it includes uncommitted source files.

This appears to have fixed the problem.