
Ironic master nodes go into maintenance mode during deployment due to 'Validation of image href http://172.22.0.1/images/ironic-python-agent.initramfs failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request' #711

@mcornea


Describe the bug

Ironic master nodes go into maintenance mode due to 'Validation of image href http://172.22.0.1/images/ironic-python-agent.initramfs failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request'

The logs of the ipa-downloader container show that the image was downloaded successfully (in roughly 9 minutes):

[root@rhhi-node-worker-0 dev-scripts]# podman logs ipa-downloader
+ SNAP=current-tripleo-rdo
+ FILENAME=ironic-python-agent
+ FILENAME_EXT=.tar
+ FFILENAME=ironic-python-agent.tar
++ mktemp -d
+ TMPDIR=/tmp/tmp.aXPrgeLvMo
+ mkdir -p /shared/html/images
+ cd /shared/html/images
+ ls -l
total 2556996
-rw-r--r--. 1 1000 1000 1872166912 Jul 31 11:27 rhcos-420.8.20190708.2-openstack.qcow2
-rw-r--r--. 1 1000 1000  746192896 Jul 31 11:27 rhcos-ootpa-latest.qcow2
-rw-r--r--. 1 1000 1000         33 Jul 31 11:27 rhcos-ootpa-latest.qcow2.md5sum
+ '[' -n '' -a '!' -e ironic-python-agent.tar.headers ']'
+ '[' -e ironic-python-agent.tar.headers ']'
+ cd /tmp/tmp.aXPrgeLvMo
+ curl --dump-header ironic-python-agent.tar.headers -O https://images.rdoproject.org/stein/rdo_trunk/current-tripleo-rdo/ironic-python-agent.tar
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  383M  100  383M    0     0   716k      0  0:09:08  0:09:08 --:--:--  565k
+ '[' -s ironic-python-agent.tar ']'
+ tar -xf ironic-python-agent.tar
++ awk '/ETag:/ {print $2}' ironic-python-agent.tar.headers
++ tr -d '"\r'
+ ETAG=17ff4800-58eee02bf37c0
+ cd -
/shared/html/images
+ chmod 755 /tmp/tmp.aXPrgeLvMo
+ mv /tmp/tmp.aXPrgeLvMo ironic-python-agent-17ff4800-58eee02bf37c0
+ ln -sf ironic-python-agent-17ff4800-58eee02bf37c0/ironic-python-agent.tar.headers ironic-python-agent.tar.headers
+ ln -sf ironic-python-agent-17ff4800-58eee02bf37c0/ironic-python-agent.initramfs ironic-python-agent.initramfs
+ ln -sf ironic-python-agent-17ff4800-58eee02bf37c0/ironic-python-agent.kernel ironic-python-agent.kernel

Expected/observed behavior

The Ironic nodes are put into maintenance mode because validation cannot reach http://172.22.0.1/images/ironic-python-agent.initramfs. Nevertheless, the image is present afterwards, so this looks like a race condition that shows up when the download of https://images.rdoproject.org/stein/rdo_trunk/current-tripleo-rdo/ironic-python-agent.tar takes longer than usual: cleaning starts before the ipa-downloader container has finished extracting the tarball and creating the symlinks.
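If the race theory is right, a possible mitigation is to gate node registration on the deploy images actually being served. A minimal sketch (the function name `wait_for_image` and the timeout handling are mine, not from dev-scripts; it just repeats the same HEAD request Ironic's validation performs until it succeeds):

```shell
#!/usr/bin/env bash
# Hypothetical readiness check: poll a URL with HEAD requests until the
# web server answers successfully, instead of assuming the ipa-downloader
# container has already finished. Arguments: URL, timeout in seconds.
wait_for_image() {
  local url=$1 timeout=${2:-600} waited=0
  until curl -sfI --output /dev/null "$url"; do
    if [ "$waited" -ge "$timeout" ]; then
      echo "timed out waiting for $url" >&2
      return 1
    fi
    sleep 5
    waited=$((waited + 5))
  done
}

# Example, using the URL from the failing validation:
# wait_for_image http://172.22.0.1/images/ironic-python-agent.initramfs 900
```

Running this before `openstack baremetal node create`/cleaning would make slow downloads a delay rather than a clean failure.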

[cloud-user@rhhi-node-worker-0 dev-scripts]$ openstack baremetal node list
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| UUID                                 | Name               | Instance UUID | Power State | Provisioning State | Maintenance |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
| 299125f7-830f-4600-a130-9be50cfb9719 | openshift-master-1 | None          | power on    | clean wait         | True        |
| 4d1e150a-e77b-4396-b2d6-5aac6fb5df8c | openshift-master-0 | None          | power on    | clean wait         | True        |
| f6c9a83f-cf07-4740-ab29-b823558343d3 | openshift-master-2 | None          | power on    | clean wait         | True        |
+--------------------------------------+--------------------+---------------+-------------+--------------------+-------------+
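Once the images are in place, the stuck nodes can probably be recovered without redeploying. A hedged sketch of one recovery sequence (standard `openstack baremetal node` subcommands; the exact order may need adjusting, and the `$OSC` indirection is only there so the sequence can be dry-run without a live cloud):

```shell
#!/usr/bin/env bash
# Hypothetical recovery for a node stuck in 'clean wait' with maintenance set.
# OSC defaults to the real openstack client; override it for a dry run.
OSC=${OSC:-openstack}

recover_node() {
  local uuid=$1
  "$OSC" baremetal node maintenance unset "$uuid"  # clear the maintenance flag
  "$OSC" baremetal node abort "$uuid"              # clean wait -> clean failed
  "$OSC" baremetal node manage "$uuid"             # clean failed -> manageable
  "$OSC" baremetal node provide "$uuid"            # manageable -> available (recleans)
}

# for uuid in $("$OSC" baremetal node list -f value -c UUID); do
#   recover_node "$uuid"
# done
```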

[cloud-user@rhhi-node-worker-0 dev-scripts]$ openstack baremetal node show 299125f7-830f-4600-a130-9be50cfb9719 -f yaml
allocation_uuid: null
automated_clean: null
bios_interface: no-bios
boot_interface: ipxe
chassis_uuid: null
clean_step: {}
conductor: rhhi-node-worker-0.example.com
conductor_group: ''
console_enabled: false
console_interface: no-console
created_at: '2019-07-31T11:45:07+00:00'
deploy_interface: direct
deploy_step: {}
description: null
driver: ipmi
driver_info:
  deploy_kernel: http://172.22.0.1/images/ironic-python-agent.kernel
  deploy_ramdisk: http://172.22.0.1/images/ironic-python-agent.initramfs
  ipmi_address: 192.168.123.1
  ipmi_password: '******'
  ipmi_port: '6231'
  ipmi_username: admin
driver_internal_info:
  agent_continue_if_ata_erase_failed: false
  agent_enable_ata_secure_erase: true
  agent_erase_devices_iterations: 1
  agent_erase_devices_zeroize: true
  agent_last_heartbeat: '2019-07-31T12:18:37.783851'
  agent_url: http://172.22.0.95:9999
  agent_version: 3.6.2.dev3
  disk_erasure_concurrency: 1
extra: {}
fault: clean failure
inspect_interface: inspector
inspection_finished_at: null
inspection_started_at: null
instance_info: {}
instance_uuid: null
last_error: null
maintenance: true
maintenance_reason: 'Failed to prepare node 299125f7-830f-4600-a130-9be50cfb9719 for
  cleaning: Validation of image href http://172.22.0.1/images/ironic-python-agent.initramfs
  failed, reason: Got HTTP code 404 instead of 200 in response to HEAD request.'
management_interface: ipmitool
name: openshift-master-1
network_interface: noop
owner: null
power_interface: ipmitool
power_state: power on
properties:
  cpu_arch: x86_64
  local_gb: '50'
  root_device:
    name: /dev/sda
protected: false
protected_reason: null
provision_state: clean wait
provision_updated_at: '2019-07-31T11:47:29+00:00'
raid_config: {}
raid_interface: no-raid
rescue_interface: no-rescue
reservation: null
resource_class: baremetal
storage_interface: noop
target_power_state: null
target_provision_state: available
target_raid_config: {}
traits: []
updated_at: '2019-07-31T12:18:37+00:00'
uuid: 299125f7-830f-4600-a130-9be50cfb9719
vendor_interface: ipmitool
