
[Xenial] Execution stuck in 'requested' state on a fresh RabbitMQ #3290

@arm4b

Description

When any st2 action is executed for the first time against a fresh, clean RabbitMQ immediately after st2 startup, it runs forever and stays stuck in the 'requested' state:

root@ubuntu16:~# st2 run core.local echo 123
....................................................................................................................................................
root@ubuntu16:~# st2 execution list
+--------------------------+------------+--------------+------------------------+------------------------+----------------------+
| id                       | action.ref | context.user | status                 | start_timestamp        | end_timestamp        |
+--------------------------+------------+--------------+------------------------+------------------------+----------------------+
| 58c96cadc8980518d7cf8ada | core.local | st2admin     | requested              | Wed, 15 Mar 2017       |                      |
|                          |            |              |                        | 16:32:45 UTC           |                      |
+--------------------------+------------+--------------+------------------------+------------------------+----------------------+


root@ubuntu16:~# st2 execution get 58c96cadc8980518d7cf8ada
id: 58c96cadc8980518d7cf8ada
status: requested
parameters: 
  cmd: echo 123
result: None

My only guess is that early on st2 is still busy bootstrapping RabbitMQ and for some reason cannot trigger the action when it runs for the first time (perhaps the topic/queue has not been created yet, or the message is lost).
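
One way to check this guess is to look at the broker right after `st2ctl start` and see whether queues exist and whether anything is consuming from them. The sketch below is only a diagnostic aid, not a confirmed root cause: `queue_summary` is a hypothetical helper name, and it assumes `rabbitmqctl list_queues -q name consumers` prints tab-separated rows (newer RabbitMQ releases may add a header row, which the filter skips).

```shell
#!/bin/bash

# Hypothetical helper: summarize how many queues exist on the broker and
# how many of them currently have no consumer attached.
# Assumes tab-separated "name<TAB>consumers" rows from rabbitmqctl.
queue_summary() {
    rabbitmqctl list_queues -q name consumers |
        awk -F'\t' '
            # Only count rows whose second column is a number,
            # which skips any header or informational line.
            NF == 2 && $2 ~ /^[0-9]+$/ {
                total++
                if ($2 == 0) idle++
            }
            END { printf "queues=%d without_consumers=%d\n", total, idle }
        '
}
```

Running `queue_summary` once before and once after `st2 run` in the reproduction script would show whether the first action is requested while no queues, or no consumers, are present yet.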

Reproduce

Requirements to reproduce:

  • OS is Ubuntu Xenial
  • RabbitMQ is clean
  • st2 was just started
  • action runs immediately after st2 start

Script to reproduce

I could reproduce it every time with this script:

#!/bin/bash

# output executed commands
set -o xtrace

sudo st2ctl stop
# emulate fresh & clean RabbitMQ (rabbitmqctl needs root privileges)
sudo rabbitmqctl stop_app
sudo rabbitmqctl reset
sudo rabbitmqctl start_app

# isolate & make sure the problem is not with RabbitMQ startup
sleep 30
# but with StackStorm startup itself
sudo st2ctl start

# The command is stuck and runs forever
# See: st2 execution list
st2 run core.local echo 123

This is similar to StackStorm/st2-packages#445 (comment).
The problem is more serious than it looks: it is a blocker for automation when deploying StackStorm in production. An execution that gets stuck immediately after startup is a serious problem.
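
For automated runs, a bounded wait makes the failure visible instead of hanging the pipeline. The following is a minimal sketch, not a fix for the underlying bug: `parse_status` and `wait_for_execution` are hypothetical helper names, and the parsing assumes `st2 execution get` prints a top-level `status:` line as in the output above.

```shell
#!/bin/bash

# Hypothetical helper: read `st2 execution get` output on stdin and
# print the value of the first top-level "status:" line.
parse_status() {
    awk '/^status:/ { print $2; exit }'
}

# Hypothetical helper: poll an execution until it leaves 'requested'.
# Usage: wait_for_execution <execution-id> [timeout-seconds] [poll-interval]
# Returns 0 once the status is anything other than 'requested',
# and 1 if the timeout expires first.
wait_for_execution() {
    local id="$1" timeout="${2:-60}" interval="${3:-2}"
    local waited=0 status=""
    while [ "$waited" -lt "$timeout" ]; do
        status="$(st2 execution get "$id" | parse_status)"
        if [ -n "$status" ] && [ "$status" != "requested" ]; then
            echo "execution $id left 'requested': $status"
            return 0
        fi
        sleep "$interval"
        waited=$((waited + interval))
    done
    echo "execution $id still '${status:-unknown}' after ${timeout}s" >&2
    return 1
}
```

With an execution id taken from `st2 execution list`, a deployment script could call `wait_for_execution 58c96cadc8980518d7cf8ada 60` and fail fast when the execution never leaves 'requested'.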

I originally thought this could be solved in packaging, but after reproducing it, it looks more like a StackStorm core issue.
cc @m4dcoder @Kami @lakshmi-kannan.
