Commit d4b9756
contrib/pkg/awstagdeprovision: Switch to DescribeInstancesPages
In case there are too many instances to fit on a single
DescribeInstances page. Docs for the new function are in [1]. This
might address an issue I saw yesterday, where the instance reaper
exited early:
$ oc logs --timestamps -f e2e-aws -c teardown | tee /tmp/teardown.log
$ cat /tmp/teardown.log
2018-10-24T01:45:49.406653647Z Gathering artifacts ...
...
2018-10-24T01:46:16.142364238Z Waiting for logs ...
...
2018-10-24T01:46:19.33849685Z Deprovisioning cluster ...
...
2018-10-24T01:46:19.359616557Z level=debug msg="Deleting instances"
...
2018-10-24T01:46:19.988936278Z level=debug msg="deleting instance: i-0bfc77b0fd7bbe707"
...
2018-10-24T01:46:20.173421738Z level=debug msg="deleting instance: i-0905586e42655a097"
...
2018-10-24T01:46:20.362874514Z level=debug msg="deleting instance: i-06ae20414f46aaccc"
...
2018-10-24T01:46:20.527601571Z level=debug msg="deleting instance: i-0bd8dc53eb954d0b8"
...
2018-10-24T01:46:20.713777056Z level=debug msg="deleting instance: i-01c91b49aba53d43b"
...
2018-10-24T01:46:20.891650892Z level=debug msg="deleting instance: i-0326b5e815732422e"
...
2018-10-24T01:46:21.0556686Z level=debug msg="deleting instance: i-05c9d0368d46be9b2"
...
2018-10-24T01:46:21.186803438Z level=debug msg="Exiting deleting instances"
...
2018-10-24T01:46:31.187047842Z level=debug msg="Deleting instances"
...
2018-10-24T01:46:31.533318629Z level=debug msg="Exiting deleting instances"
2018-10-24T01:46:31.533340968Z level=debug msg="goroutine deleteInstances complete"
...
2018-10-24T02:34:00.038768501Z level=debug msg="Deleting VPCs"
2018-10-24T02:34:00.463719417Z level=debug msg="deleting VPC: vpc-057311209bfc67050"
2018-10-24T02:34:00.528213402Z level=debug msg="error deleting VPC vpc-057311209bfc67050: DependencyViolation: The vpc 'vpc-057311209bfc67050' has dependencies and cannot be deleted.\n\tstatus code: 400, request id: 65f40f64-8e08-467f-8ae8-cd320d9630c7"
2018-10-24T02:34:00.528272636Z level=debug msg="Exiting deleting VPCs"
...
2018-10-24T02:48:25.570568032Z level=debug msg="Deleting VPCs"
2018-10-24T02:48:25.739046406Z level=debug msg="Exiting deleting VPCs"
2018-10-24T02:48:25.739139735Z level=debug msg="goroutine deleteVPCs complete"
...
That attempts deletion for seven instances, which sounds right (one
bootstrap, three masters, and three workers). But you can see that
VPC deletion hung for over an hour due to a blocking dependency. I
ended up deleting a leftover master via the AWS console, which allowed
me to delete the VPC (also from the console). It's possible that the
destroy logic would have cleaned up the VPC on its own, but with 14
minutes between attempts I didn't want to wait (can we cap the
exponential backoff? Or just poll every two minutes or something
without backoff).
Unfortunately I did not collect tag information from that master, so
I'm not entirely sure why the automated destroyer missed it. My
initial guess was that we had more than one page of instances in the
account and the leftover master missed the first page, causing the
instance goroutine to exit thinking its task was complete. But it
looks like the instance requests are filtered on the server side,
which makes "no instances in the first page that match but there are
instances in later pages" less likely ;). Still, solid pagination
seems like a useful thing to have even if it wasn't the cause of this
particular issue.
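
The callback-style pagination this commit switches to can be sketched roughly as follows. This is a stdlib-only illustration and not the code from the diff: page, describePages, and collectInstanceIDs are hypothetical stand-ins that only mimic the shape of DescribeInstancesPages, where the callback is invoked once per page and returns false to stop.

```go
package main

import "fmt"

// page is a hypothetical stand-in for one page of DescribeInstances output.
type page struct {
	InstanceIDs []string
}

// describePages is a hypothetical stand-in for a paginated API call.
// It invokes fn once per page and stops early if fn returns false.
func describePages(pages []page, fn func(p page, lastPage bool) bool) {
	for i, p := range pages {
		if !fn(p, i == len(pages)-1) {
			return
		}
	}
}

// collectInstanceIDs walks every page instead of only the first one.
func collectInstanceIDs(pages []page) []string {
	var ids []string
	describePages(pages, func(p page, lastPage bool) bool {
		ids = append(ids, p.InstanceIDs...)
		return !lastPage // keep going until the final page
	})
	return ids
}

func main() {
	pages := []page{
		{InstanceIDs: []string{"i-aaa", "i-bbb"}},
		{}, // an empty page does not mean the results are exhausted
		{InstanceIDs: []string{"i-ccc"}},
	}
	fmt.Println(collectInstanceIDs(pages))
}
```

The point of the empty middle page is that a caller looking only at the first response, or stopping at the first empty one, could conclude its work was done while matching instances remained on later pages.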
[1]: https://docs.aws.amazon.com/sdk-for-go/api/service/ec2/#EC2.DescribeInstancesPages
1 parent 35b7998 commit d4b9756
1 file changed, 9 additions and 14 deletions