Describe the bug
Ironic master nodes go into maintenance mode due to 'Validation of image href http://172.22.0.1/images/ironic-python-agent.initramfs failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request'
The logs of the ipa-downloader container show that the image was downloaded successfully (in ~9 minutes):
[root@rhhi-node-worker-0 dev-scripts]# podman logs ipa-downloader
+ SNAP=current-tripleo-rdo
+ FILENAME=ironic-python-agent
+ FILENAME_EXT=.tar
+ FFILENAME=ironic-python-agent.tar
++ mktemp -d
+ TMPDIR=/tmp/tmp.aXPrgeLvMo
+ mkdir -p /shared/html/images
+ cd /shared/html/images
+ ls -l
total 2556996
-rw-r--r--. 1 1000 1000 1872166912 Jul 31 11:27 rhcos-420.8.20190708.2-openstack.qcow2
-rw-r--r--. 1 1000 1000 746192896 Jul 31 11:27 rhcos-ootpa-latest.qcow2
-rw-r--r--. 1 1000 1000 33 Jul 31 11:27 rhcos-ootpa-latest.qcow2.md5sum
+ '[' -n '' -a '!' -e ironic-python-agent.tar.headers ']'
+ '[' -e ironic-python-agent.tar.headers ']'
+ cd /tmp/tmp.aXPrgeLvMo
+ curl --dump-header ironic-python-agent.tar.headers -O https://images.rdoproject.org/stein/rdo_trunk/current-tripleo-rdo/ironic-python-agent.tar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  383M  100  383M    0     0   716k      0  0:09:08  0:09:08 --:--:--  565k
+ '[' -s ironic-python-agent.tar ']'
+ tar -xf ironic-python-agent.tar
++ awk '/ETag:/ {print $2}' ironic-python-agent.tar.headers
++ tr -d '"\r'
+ ETAG=17ff4800-58eee02bf37c0
+ cd -
/shared/html/images
+ chmod 755 /tmp/tmp.aXPrgeLvMo
+ mv /tmp/tmp.aXPrgeLvMo ironic-python-agent-17ff4800-58eee02bf37c0
+ ln -sf ironic-python-agent-17ff4800-58eee02bf37c0/ironic-python-agent.tar.headers ironic-python-agent.tar.headers
+ ln -sf ironic-python-agent-17ff4800-58eee02bf37c0/ironic-python-agent.initramfs ironic-python-agent.initramfs
+ ln -sf ironic-python-agent-17ff4800-58eee02bf37c0/ironic-python-agent.kernel ironic-python-agent.kernel
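The last steps above create the symlinks that the httpd container serves, and Ironic validates those URLs with a HEAD request. A minimal sketch of the same check, runnable on the provisioning host (`check_image` is a hypothetical helper, not part of dev-scripts):

```shell
# check_image URL: issue the same kind of HEAD request Ironic's image
# validation performs. Returns 0 on success, non-zero otherwise (e.g. the
# 404 seen in this bug, or a connection failure).
check_image() {
  curl -sf --max-time 10 -o /dev/null -I "$1"
}

# Example:
#   check_image http://172.22.0.1/images/ironic-python-agent.initramfs
```

Running this by hand right after deployment would show whether the 404 is transient (symlinks not created yet) or persistent (httpd misconfiguration).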
Expected/observed behavior
Ironic nodes go into maintenance mode because they cannot reach http://172.22.0.1/images/ironic-python-agent.initramfs. However, the image does exist on the host, so this looks like a race condition that appears when the download from https://images.rdoproject.org/stein/rdo_trunk/current-tripleo-rdo/ironic-python-agent.tar takes longer than usual: the symlinks are created only after the ~9-minute download and extraction finish, so Ironic's validation HEAD request can run before the images are actually served.
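If that is the cause, a simple mitigation would be to have whatever registers the nodes wait until the deploy images answer the HEAD request before proceeding. A rough sketch, with illustrative retry/delay values (`wait_for_image` is not an existing dev-scripts helper):

```shell
# wait_for_image URL [RETRIES] [DELAY]: poll URL with HEAD requests until it
# succeeds, retrying up to RETRIES times with DELAY seconds between attempts.
# Returns 0 once the image is served, 1 if all attempts fail.
wait_for_image() {
  local url=$1 retries=${2:-60} delay=${3:-10}
  for _ in $(seq "$retries"); do
    if curl -sf --max-time 10 -o /dev/null -I "$url"; then
      return 0
    fi
    sleep "$delay"
  done
  return 1
}

# Example: block until the ramdisk the masters need is actually available.
#   wait_for_image http://172.22.0.1/images/ironic-python-agent.initramfs 90 10
```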
[cloud-user@rhhi-node-worker-0 dev-scripts]$ openstack baremetal node list
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name               | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| 299125f7-830f-4600-a130-9be50cfb9719 | openshift-master-1 | None          | power on    | clean wait         | True        |
| 4d1e150a-e77b-4396-b2d6-5aac6fb5df8c | openshift-master-0 | None          | power on    | clean wait         | True        |
| f6c9a83f-cf07-4740-ab29-b823558343d3 | openshift-master-2 | None          | power on    | clean wait         | True        |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
[cloud-user@rhhi-node-worker-0 dev-scripts]$ openstack baremetal node show 299125f7-830f-4600-a130-9be50cfb9719 -f yaml
allocation_uuid: null
automated_clean: null
bios_interface: no-bios
boot_interface: ipxe
chassis_uuid: null
clean_step: {}
conductor: rhhi-node-worker-0.example.com
conductor_group: ''
console_enabled: false
console_interface: no-console
created_at: '2019-07-31T11:45:07+00:00'
deploy_interface: direct
deploy_step: {}
description: null
driver: ipmi
driver_info:
  deploy_kernel: http://172.22.0.1/images/ironic-python-agent.kernel
  deploy_ramdisk: http://172.22.0.1/images/ironic-python-agent.initramfs
  ipmi_address: 192.168.123.1
  ipmi_password: '******'
  ipmi_port: '6231'
  ipmi_username: admin
driver_internal_info:
  agent_continue_if_ata_erase_failed: false
  agent_enable_ata_secure_erase: true
  agent_erase_devices_iterations: 1
  agent_erase_devices_zeroize: true
  agent_last_heartbeat: '2019-07-31T12:18:37.783851'
  agent_url: http://172.22.0.95:9999
  agent_version: 3.6.2.dev3
  disk_erasure_concurrency: 1
extra: {}
fault: clean failure
inspect_interface: inspector
inspection_finished_at: null
inspection_started_at: null
instance_info: {}
instance_uuid: null
last_error: null
maintenance: true
maintenance_reason: 'Failed to prepare node 299125f7-830f-4600-a130-9be50cfb9719 for
  cleaning: Validation of image href http://172.22.0.1/images/ironic-python-agent.initramfs
  failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request.'
management_interface: ipmitool
name: openshift-master-1
network_interface: noop
owner: null
power_interface: ipmitool
power_state: power on
properties:
  cpu_arch: x86_64
  local_gb: '50'
  root_device:
    name: /dev/sda
protected: false
protected_reason: null
provision_state: clean wait
provision_updated_at: '2019-07-31T11:47:29+00:00'
raid_config: {}
raid_interface: no-raid
rescue_interface: no-rescue
reservation: null
resource_class: baremetal
storage_interface: noop
target_power_state: null
target_provision_state: available
target_raid_config: {}
traits: []
updated_at: '2019-07-31T12:18:37+00:00'
uuid: 299125f7-830f-4600-a130-9be50cfb9719
vendor_interface: ipmitool