@weizhouapache
Member

@weizhouapache weizhouapache commented Sep 8, 2021

Description

This PR fixes #5413

Types of changes

  • Breaking change (fix or feature that would cause existing functionality to change)
  • New feature (non-breaking change which adds functionality)
  • Bug fix (non-breaking change which fixes an issue)
  • Enhancement (improves an existing feature and functionality)
  • Cleanup (Code refactoring and cleanup, that may add test cases)

Feature/Enhancement Scale or Bug Severity

Feature/Enhancement Scale

  • Major
  • Minor

Bug Severity

  • BLOCKER
  • Critical
  • Major
  • Minor
  • Trivial

Screenshots (if appropriate):

How Has This Been Tested?

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@weizhouapache weizhouapache changed the base branch from main to 4.15 September 8, 2021 09:29
@blueorangutan

Packaging result: ✖️ el7 ✖️ el8 ✔️ debian ✔️ suse15. SL-JID 1175

@weizhouapache
Member Author

Hi @coreymr

Can you please test with the following file in the CPVM?
cloud-console-proxy-4.15.1.0.zip

cd /usr/local/cloud/systemvm/
mv /usr/local/cloud/systemvm/cloud-console-proxy-4.15.1.0.jar /root/
wget https://github.com/apache/cloudstack/files/7128023/cloud-console-proxy-4.15.1.0.zip
mv cloud-console-proxy-4.15.1.0.zip cloud-console-proxy-4.15.1.0.jar
systemctl restart cloud

@weizhouapache
Member Author

@blueorangutan package

@blueorangutan

@weizhouapache a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian. SL-JID 1177

@weizhouapache weizhouapache marked this pull request as ready for review September 8, 2021 12:40
@weizhouapache
Member Author

@rhtyd @nvazquez
We may need this PR in 4.15.2.0.

Mike has confirmed that it fixes his console issue on VMware 7.

Contributor

@nvazquez nvazquez left a comment

LGTM

@nvazquez
Contributor

nvazquez commented Sep 8, 2021

Thanks @weizhouapache. @rhtyd, can this be included in 4.15.2?

@nvazquez
Contributor

nvazquez commented Sep 8, 2021

@blueorangutan package

@blueorangutan

@nvazquez a Jenkins job has been kicked to build packages. I'll keep you posted as I make progress.

@blueorangutan

Packaging result: ✔️ el7 ✔️ el8 ✔️ debian. SL-JID 1184

@nvazquez
Contributor

nvazquez commented Sep 8, 2021

@blueorangutan test centos7 vmware-70u1

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + vmware-70u1) has been kicked to run smoke tests

@nvazquez
Contributor

nvazquez commented Sep 8, 2021

@rhtyd @weizhouapache I think this is not a blocker, so there is no need to include it in 4.15.2. Do you agree?

@blueorangutan

Trillian test result (tid-1992)
Environment: vmware-70u1 (x2), Advanced Networking with Mgmt server 7
Total time taken: 48769 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5419-t1992-vmware-70u1.zip
Smoke tests completed. 76 look OK, 11 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_create_lb_rule_src_nat | Failure | 44.70 | test_loadbalance.py |
| test_02_create_lb_rule_non_nat | Failure | 34.30 | test_loadbalance.py |
| test_assign_and_removal_lb | Failure | 34.26 | test_loadbalance.py |
| test_isolate_network_password_server | Failure | 7.56 | test_password_server.py |
| test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Failure | 259.56 | test_internal_lb.py |
| test_02_internallb_roundrobin_1RVPC_3VM_HTTP_port80 | Failure | 328.93 | test_internal_lb.py |
| test_02_vpc_privategw_static_routes | Failure | 239.72 | test_privategw_acl.py |
| test_03_vpc_privategw_restart_vpc_cleanup | Failure | 237.90 | test_privategw_acl.py |
| test_04_rvpc_privategw_static_routes | Failure | 390.03 | test_privategw_acl.py |
| test_router_dhcphosts | Failure | 7.56 | test_router_dhcphosts.py |
| ContextSuite context=TestRouterDHCPHosts>:teardown | Error | 18.97 | test_router_dhcphosts.py |
| test_router_dns_guestipquery | Failure | 3.37 | test_router_dnsservice.py |
| test_01_isolate_network_FW_PF_default_routes_egress_true | Failure | 94.90 | test_routers_network_ops.py |
| test_02_isolate_network_FW_PF_default_routes_egress_false | Failure | 97.78 | test_routers_network_ops.py |
| test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true | Failure | 143.43 | test_routers_network_ops.py |
| test_02_RVR_Network_FW_PF_SSH_default_routes_egress_false | Failure | 147.34 | test_routers_network_ops.py |
| test_04_change_offering_small | Error | 147.80 | test_service_offerings.py |
| test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Failure | 563.56 | test_vpc_redundant.py |
| test_02_redundant_VPC_default_routes | Failure | 421.74 | test_vpc_redundant.py |
| test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Failure | 520.69 | test_vpc_redundant.py |
| test_02_VPC_default_routes | Failure | 189.92 | test_vpc_router_nics.py |
| test_01_redundant_vpc_site2site_vpn | Error | 398.48 | test_vpc_vpn.py |
| test_01_vpc_site2site_vpn_multiple_options | Error | 263.16 | test_vpc_vpn.py |
| test_01_vpc_site2site_vpn | Error | 276.26 | test_vpc_vpn.py |

@weizhouapache weizhouapache changed the title from "CPVM: use X509ExtendedTrustManager" to "CPVM: use X509ExtendedTrustManager to skip hostname verification" Sep 10, 2021
@yadvr yadvr added this to the 4.16.0.0 milestone Sep 10, 2021
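
The PR title names the technique: a trust manager that extends `X509ExtendedTrustManager` so that hostname verification is skipped. A minimal Java sketch of that idea follows. It is not the code from this PR; the class name `PermissiveTrustManager` and the helper `newContext()` are hypothetical. The sketch assumes, based on how JSSE is generally understood to behave, that a plain `X509TrustManager` is wrapped by the TLS stack, which may then apply endpoint identification itself, whereas an `X509ExtendedTrustManager` is called directly and makes the decision on its own.

```java
import java.net.Socket;
import java.security.GeneralSecurityException;
import java.security.cert.X509Certificate;
import javax.net.ssl.SSLContext;
import javax.net.ssl.SSLEngine;
import javax.net.ssl.TrustManager;
import javax.net.ssl.X509ExtendedTrustManager;

// Hypothetical illustration, not the class used in the PR.
public final class PermissiveTrustManager extends X509ExtendedTrustManager {

    // Every check is a no-op: no chain validation and no hostname check.
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType) { }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType) { }

    // Socket-aware variants, used for SSLSocket connections.
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType, Socket socket) { }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType, Socket socket) { }

    // Engine-aware variants, used for SSLEngine connections.
    @Override
    public void checkClientTrusted(X509Certificate[] chain, String authType, SSLEngine engine) { }

    @Override
    public void checkServerTrusted(X509Certificate[] chain, String authType, SSLEngine engine) { }

    @Override
    public X509Certificate[] getAcceptedIssuers() {
        return new X509Certificate[0];
    }

    /** Builds a TLS context whose only trust manager is this permissive one. */
    public static SSLContext newContext() throws GeneralSecurityException {
        SSLContext context = SSLContext.getInstance("TLS");
        context.init(null, new TrustManager[] { new PermissiveTrustManager() }, null);
        return context;
    }
}
```

A client would pass `PermissiveTrustManager.newContext().getSocketFactory()` to whatever opens the TLS connection. Because every check is a no-op, this accepts any certificate, so it only fits cases where the endpoint is trusted by other means.
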
@nvazquez
Contributor

@blueorangutan test centos7 vmware-70u1

@blueorangutan

@nvazquez a Trillian-Jenkins test job (centos7 mgmt + vmware-70u1) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2016)
Environment: xenserver-71 (x2), Advanced Networking with Mgmt server 7
Total time taken: 47889 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5419-t2016-xenserver-71.zip
Smoke tests completed. 88 look OK, 1 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_07_deploy_kubernetes_ha_cluster | Failure | 85.99 | test_kubernetes_clusters.py |
| ContextSuite context=TestKubernetesCluster>:teardown | Error | 111.13 | test_kubernetes_clusters.py |

@yadvr
Member

yadvr commented Sep 14, 2021

cc @DaanHoogland @nvazquez @davidjumani - do we know what causes so many smoke test failures with VMware 7? Can we fix them?

Member

@yadvr yadvr left a comment

LGTM - did not test it, this would require some manual testing + 100% smoketests pass before merging.

@weizhouapache
Member Author

@blueorangutan test centos7 vmware-65u2

@blueorangutan

@weizhouapache a Trillian-Jenkins test job (centos7 mgmt + vmware-65u2) has been kicked to run smoke tests

@nvazquez
Contributor

@rhtyd @weizhouapache I've started a VMware 7 round of tests on the health check PR so we can compare and see whether the failures are related to this PR or need fixing on their own.

@nvazquez
Contributor

@rhtyd @weizhouapache this fix has been tested by @coreymr, do we need additional tests?

@yadvr
Member

yadvr commented Sep 15, 2021

@nvazquez yes, my concern is not specifically VMware 7 but regression testing on other hypervisor/version combinations (with/without SSL enabled?), in case it breaks anything. Or do you or @weizhouapache think that's not necessary?

@weizhouapache
Member Author

weizhouapache commented Sep 15, 2021

> @nvazquez yes, my concern is not specifically VMware 7 but regression testing on other hypervisor/version combinations (with/without SSL enabled?), in case it breaks anything. Or do you or @weizhouapache think that's not necessary?

@rhtyd
I have tested with centos7 (ssl), ubuntu20 (non-ssl) and xenserver65 (non-ssl). no issues.

@yadvr
Member

yadvr commented Sep 15, 2021

Thanks for confirming @weizhouapache

@yadvr
Member

yadvr commented Sep 15, 2021

(just curious why XenServer65? and what about VMware 6.5/6.7?)

@weizhouapache
Member Author

> (just curious why XenServer65? and what about VMware 6.5/6.7?)

@rhtyd sorry for the typo. It should be:

centos7 (ssl), ubuntu20 (non-ssl), vmware65 (non-ssl) and xenserver71 (non-ssl).

@weizhouapache
Member Author

@blueorangutan test centos7 vmware-70u1 keepEnv

@blueorangutan

@weizhouapache a Trillian-Jenkins test job (centos7 mgmt + vmware-70u1) has been kicked to run smoke tests

@blueorangutan

Trillian test result (tid-2068)
Environment: vmware-70u1 (x2), Advanced Networking with Mgmt server 7
Total time taken: 50210 seconds
Marvin logs: https://github.com/blueorangutan/acs-prs/releases/download/trillian/pr5419-t2068-vmware-70u1.zip
Smoke tests completed. 77 look OK, 12 have errors
Only failed tests results shown below:

| Test | Result | Time (s) | Test File |
| --- | --- | --- | --- |
| test_01_internallb_roundrobin_1VPC_3VM_HTTP_port80 | Failure | 317.27 | test_internal_lb.py |
| test_02_internallb_roundrobin_1RVPC_3VM_HTTP_port80 | Failure | 430.32 | test_internal_lb.py |
| test_01_create_lb_rule_src_nat | Failure | 47.70 | test_loadbalance.py |
| test_02_create_lb_rule_non_nat | Failure | 37.43 | test_loadbalance.py |
| test_assign_and_removal_lb | Failure | 45.32 | test_loadbalance.py |
| test_isolate_network_password_server | Failure | 7.56 | test_password_server.py |
| test_02_vpc_privategw_static_routes | Failure | 364.22 | test_privategw_acl.py |
| test_03_vpc_privategw_restart_vpc_cleanup | Failure | 349.69 | test_privategw_acl.py |
| test_04_rvpc_privategw_static_routes | Failure | 524.53 | test_privategw_acl.py |
| test_router_dhcphosts | Failure | 7.54 | test_router_dhcphosts.py |
| ContextSuite context=TestRouterDHCPHosts>:teardown | Error | 17.82 | test_router_dhcphosts.py |
| test_router_dns_guestipquery | Failure | 3.41 | test_router_dnsservice.py |
| test_01_isolate_network_FW_PF_default_routes_egress_true | Failure | 125.98 | test_routers_network_ops.py |
| test_02_isolate_network_FW_PF_default_routes_egress_false | Failure | 129.80 | test_routers_network_ops.py |
| test_01_RVR_Network_FW_PF_SSH_default_routes_egress_true | Failure | 200.59 | test_routers_network_ops.py |
| test_02_RVR_Network_FW_PF_SSH_default_routes_egress_false | Failure | 209.72 | test_routers_network_ops.py |
| test_04_change_offering_small | Error | 146.45 | test_service_offerings.py |
| test_02_list_snapshots_with_removed_data_store | Error | 55.10 | test_snapshots.py |
| test_01_create_redundant_VPC_2tiers_4VMs_4IPs_4PF_ACL | Failure | 1079.67 | test_vpc_redundant.py |
| test_02_redundant_VPC_default_routes | Failure | 406.46 | test_vpc_redundant.py |
| test_03_create_redundant_VPC_1tier_2VMs_2IPs_2PF_ACL_reboot_routers | Failure | 504.67 | test_vpc_redundant.py |
| test_02_VPC_default_routes | Failure | 214.68 | test_vpc_router_nics.py |
| test_01_redundant_vpc_site2site_vpn | Error | 492.92 | test_vpc_vpn.py |
| test_01_vpc_site2site_vpn_multiple_options | Error | 329.31 | test_vpc_vpn.py |
| test_01_vpc_site2site_vpn | Error | 331.14 | test_vpc_vpn.py |

@weizhouapache
Member Author

@rhtyd @nvazquez
The test failures are caused by the macchinina template we use for testing.

Only the first ssh connection is OK; subsequent ssh attempts are refused (it seems to work in vmware65 environments):

root@r-463-VM:~# date;ssh 10.1.1.159
Thu 16 Sep 2021 09:55:47 AM UTC
The authenticity of host '10.1.1.159 (10.1.1.159)' can't be established.
ECDSA key fingerprint is SHA256:WQQZ6Sm2CxR5eDkbOw4Xwl+mTrE37fMmnR1tZigCQ8E.
Are you sure you want to continue connecting (yes/no/[fingerprint])? ^C

root@r-463-VM:~# date;ssh 10.1.1.159
Thu 16 Sep 2021 09:55:49 AM UTC
ssh: connect to host 10.1.1.159 port 22: Connection refused
root@r-463-VM:~# date;ssh 10.1.1.159
Thu 16 Sep 2021 09:55:52 AM UTC
ssh: connect to host 10.1.1.159 port 22: Connection refused

With the default centos55 template, it works:

root@r-463-VM:~# date;ssh -o KexAlgorithms=diffie-hellman-group1-sha1 10.1.1.120
Thu 16 Sep 2021 09:58:53 AM UTC
The authenticity of host '10.1.1.120 (10.1.1.120)' can't be established.
RSA key fingerprint is SHA256:gdLIogU+y021XvZfEGio/58rU5NrHOqHHLmA8T6z5G8.
Are you sure you want to continue connecting (yes/no/[fingerprint])? ^C
root@r-463-VM:~# 

root@r-463-VM:~# date;ssh -o KexAlgorithms=diffie-hellman-group1-sha1 10.1.1.120
Thu 16 Sep 2021 09:58:56 AM UTC
The authenticity of host '10.1.1.120 (10.1.1.120)' can't be established.
RSA key fingerprint is SHA256:gdLIogU+y021XvZfEGio/58rU5NrHOqHHLmA8T6z5G8.
Are you sure you want to continue connecting (yes/no/[fingerprint])? 
Host key verification failed.

When I use centos55 for testing, tests passed.

diff --git a/test/integration/smoke/test_loadbalance.py b/test/integration/smoke/test_loadbalance.py
index 53047f91f2..0a1b2ee5f4 100644
--- a/test/integration/smoke/test_loadbalance.py
+++ b/test/integration/smoke/test_loadbalance.py
@@ -41,7 +41,7 @@ class TestLoadBalance(cloudstackTestCase):
         cls.domain = get_domain(cls.apiclient)
         cls.zone = get_zone(cls.apiclient, testClient.getZoneForTests())
         cls.hypervisor = testClient.getHypervisorInfo()
-        template = get_test_template(
+        template = get_template(
                             cls.apiclient,
                             cls.zone.id,
                             cls.hypervisor)
diff --git a/test/integration/smoke/test_privategw_acl.py b/test/integration/smoke/test_privategw_acl.py
index 90596c6984..28cc0a3999 100644
--- a/test/integration/smoke/test_privategw_acl.py
+++ b/test/integration/smoke/test_privategw_acl.py
@@ -169,7 +169,7 @@ class TestPrivateGwACL(cloudstackTestCase):
         cls.zone = get_zone(cls.api_client, cls.testClient.getZoneForTests())
         cls.hypervisor = cls.testClient.getHypervisorInfo()
         cls.services['mode'] = cls.zone.networktype
-        cls.template = get_test_template(
+        cls.template = get_template(
             cls.api_client,
             cls.zone.id,
             cls.hypervisor)

@nvazquez
Contributor

@NuxRo any input on this strange behaviour with macchinina and VMware 7?

Contributor

@sureshanaparti sureshanaparti left a comment

code LGTM

@weizhouapache
Member Author

weizhouapache commented Sep 17, 2021

> @NuxRo any input on this strange behaviour with macchinina and VMware 7?

@NuxRo
If you have time, can you have a look at the macchinina template?
On VMware 7, ssh is unreachable after the first ssh attempt.
After executing "/usr/sbin/dropbear" in the VM, ssh works, but only once; it stops working again after one ssh attempt.
There is no issue on vmware65, which is strange.
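
The symptom described above (first connection accepted, later ones refused) can be checked from another machine on the same network with a repeated TCP probe of port 22. The sketch below is an illustration only: `PortProbe`, `canConnect` and `probe` are hypothetical names, and it assumes that a bare TCP connect is enough to trigger the behaviour, which the thread does not confirm (the reported attempts used the `ssh` client).

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper for reproducing the "refused after first attempt" symptom.
public final class PortProbe {

    /** Returns true if a TCP connection to host:port opens within timeoutMs. */
    public static boolean canConnect(String host, int port, int timeoutMs) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMs);
            return true;
        } catch (IOException e) {
            // Covers "Connection refused" as well as timeouts.
            return false;
        }
    }

    /** Probes the port the given number of times, pausing pauseMs after each attempt. */
    public static boolean[] probe(String host, int port, int attempts,
                                  int timeoutMs, long pauseMs) throws InterruptedException {
        boolean[] results = new boolean[attempts];
        for (int i = 0; i < attempts; i++) {
            results[i] = canConnect(host, port, timeoutMs);
            Thread.sleep(pauseMs);
        }
        return results;
    }

    public static void main(String[] args) throws InterruptedException {
        if (args.length < 1) {
            System.err.println("usage: PortProbe <host>");
            return;
        }
        // Three attempts on port 22, two seconds apart.
        boolean[] results = probe(args[0], 22, 3, 2000, 2000L);
        for (int i = 0; i < results.length; i++) {
            System.out.println("attempt " + (i + 1) + ": "
                    + (results[i] ? "connected" : "refused or timed out"));
        }
    }
}
```

Run against a guest VM's address, a result of "connected" on the first attempt followed by "refused or timed out" on later ones would match what was reported for VMware 7.
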

@weizhouapache
Member Author

@nvazquez @sureshanaparti
Can we merge this into 4.15 or master? (4.15 might be better, I think.)

@yadvr
Member

yadvr commented Sep 21, 2021

@weizhouapache not before we have some agreement on the vmware failures (4.15 is fine, but I wonder if somebody will do a 4.15.3 as we'll soon have 4.16.0, 4.16.1 and 4.17.0 in the pipeline)

@yadvr
Member

yadvr commented Sep 21, 2021

ping @NuxRo @nvazquez @weizhouapache - kindly discuss and agree on whether this can be merged or whether some fixes in this PR or in the trillian/lab env are needed.

@weizhouapache
Member Author

> ping @NuxRo @nvazquez @weizhouapache - kindly discuss and agree on whether this can be merged or whether some fixes in this PR or in the trillian/lab env are needed.

@rhtyd
The test failures are not related to this PR, in my opinion.
But we can wait a few days to see if the macchinina template can be fixed on VMware 7 @NuxRo
Anyway, we should merge it before the 4.16.0.0 RC. @nvazquez

@nvazquez
Contributor

Agree with @weizhouapache: the failures are not related to this PR, as the same ones were seen in the health checks PR. Let's merge it and fix the macchinina and VMware 7 issues in a separate PR.

@nvazquez nvazquez merged commit 50a0e80 into apache:main Sep 22, 2021
@weizhouapache
Member Author

For those who have the same issue with 4.15.2.0, please use the jar below in SSVM:
cloud-console-proxy-4.15.2.0.zip

cd /usr/local/cloud/systemvm/
mv /usr/local/cloud/systemvm/cloud-console-proxy-4.15.2.0.jar /root/
wget https://github.com/apache/cloudstack/files/7215623/cloud-console-proxy-4.15.2.0.zip
mv cloud-console-proxy-4.15.2.0.zip cloud-console-proxy-4.15.2.0.jar
systemctl restart cloud

DaanHoogland pushed a commit to shapeblue/cloudstack that referenced this pull request May 20, 2022
@weizhouapache weizhouapache deleted the 4.15-cpvm-X509ExtendedTrustManager branch December 9, 2022 08:45
shwstppr pushed a commit to shapeblue/cloudstack that referenced this pull request Jan 17, 2023

Development

Successfully merging this pull request may close these issues.

Console Proxy & VMware 7 websocket issue
