strict round robin parent select doesn't work as expected #6321

@jvgutierrez

We've identified two issues with the strict round-robin parent selection policy (round_robin=strict).
Our parent.config looks like this:

dest_domain=. parent="10.132.0.112:3120,10.132.0.112:3121,10.132.0.112:3122,10.132.0.112:3123,10.132.0.112:3124,10.132.0.112:3125,10.132.0.112:3126,10.132.0.112:3127" parent_is_proxy=false round_robin=strict

On long-lived ATS instances, ss shows that all the traffic is handled by the first parent. This seems to be related to the fact that rr_next (declared as uint32_t in https://github.com/apache/trafficserver/blob/master/proxy/ParentSelection.h#L143) is cast to int32_t in https://github.com/apache/trafficserver/blob/master/proxy/ParentRoundRobin.cc#L102. As soon as the value exceeds INT32_MAX, cur_index always starts at a negative value, and the parent with index 0 is always returned:

Jan 14 17:40:41 cp4027 traffic_manager[20323]: [Jan 14 17:40:41.847] {0x2ade5d766700} DEBUG: <ParentRoundRobin.cc:61 (selectParent)> (parent_select) In ParentRoundRobin::selectParent(): Using a round robin parent selection strategy.
Jan 14 17:40:41 cp4027 traffic_manager[20323]: [Jan 14 17:40:41.847] {0x2ade5d766700} DEBUG: <ParentRoundRobin.cc:140 (selectParent)> (parent_select) cur_index: -4, result->start_parent: -4
Jan 14 17:40:41 cp4027 traffic_manager[20323]: [Jan 14 17:40:41.847] {0x2ade5d766700} DEBUG: <ParentRoundRobin.cc:140 (selectParent)> (parent_select) cur_index: -3, result->start_parent: -4
Jan 14 17:40:41 cp4027 traffic_manager[20323]: [Jan 14 17:40:41.847] {0x2ade5d766700} DEBUG: <ParentRoundRobin.cc:140 (selectParent)> (parent_select) cur_index: -2, result->start_parent: -4
Jan 14 17:40:41 cp4027 traffic_manager[20323]: [Jan 14 17:40:41.847] {0x2ade5d766700} DEBUG: <ParentRoundRobin.cc:140 (selectParent)> (parent_select) cur_index: -1, result->start_parent: -4
Jan 14 17:40:41 cp4027 traffic_manager[20323]: [Jan 14 17:40:41.847] {0x2ade5d766700} DEBUG: <ParentRoundRobin.cc:140 (selectParent)> (parent_select) cur_index: 0, result->start_parent: -4
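
The sign flip can be reproduced in isolation. The following is a minimal sketch, not the ATS code: index_signed and index_unsigned are hypothetical helpers, and the counter value 0x80000004 is an illustrative choice just past INT32_MAX. It assumes a two's-complement target, where converting an out-of-range uint32_t to int32_t wraps to a negative number (guaranteed since C++20, implementation-defined before).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: computes the start index the way the report describes,
// converting the unsigned counter to int32_t before taking the remainder.
int32_t index_signed(uint32_t counter, int32_t num_parents) {
  assert(num_parents > 0);
  // Values above INT32_MAX become negative after this conversion.
  int32_t as_signed = static_cast<int32_t>(counter);
  // In C++11 and later the remainder keeps the sign of the dividend,
  // so the result is negative whenever as_signed is negative.
  return as_signed % num_parents;
}

// Hypothetical alternative: keep the arithmetic unsigned so the index
// always falls in [0, num_parents).
uint32_t index_unsigned(uint32_t counter, uint32_t num_parents) {
  assert(num_parents > 0);
  return counter % num_parents;
}
```

With 8 parents, index_signed(0x80000004u, 8) yields -4, which matches the first cur_index in the debug output above, while index_unsigned(0x80000004u, 8) yields 4.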

On short-lived ATS instances, the debug output looks as expected:

Jan 14 18:10:41 cp5012 traffic_manager[48369]: [Jan 14 18:10:41.335] {0x2b4690202700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3120
Jan 14 18:10:41 cp5012 traffic_manager[48369]: [Jan 14 18:10:41.335] {0x2b4690202700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3121
Jan 14 18:10:43 cp5012 traffic_manager[48369]: [Jan 14 18:10:43.890] {0x2b46905e9700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3122
Jan 14 18:10:43 cp5012 traffic_manager[48369]: [Jan 14 18:10:43.890] {0x2b46905e9700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3123
Jan 14 18:10:44 cp5012 traffic_manager[48369]: [Jan 14 18:10:44.970] {0x2b46908e9700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3124
Jan 14 18:10:44 cp5012 traffic_manager[48369]: [Jan 14 18:10:44.970] {0x2b46908e9700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3125
Jan 14 18:10:45 cp5012 traffic_manager[48369]: [Jan 14 18:10:45.405] {0x2b4690a8c700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3126
Jan 14 18:10:45 cp5012 traffic_manager[48369]: [Jan 14 18:10:45.405] {0x2b4690a8c700} DEBUG: <ParentRoundRobin.cc:172 (selectParent)> (parent_select) Chosen parent = 10.132.0.112.3127

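but ss only shows traffic hitting the odd-numbered parents (ports 3121, 3123, 3125 and 3127). Each count below includes the ss header line, so a value of 1 means no connections: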

vgutierrez@cp5012:~$ for port in {3120..3127}; do ss  "( dport = $port or sport = $port )" |wc -l; done
1
28
1
40
2
40
1
35
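
One hypothesis that would fit these numbers, and that this report does not confirm, is that the selector runs twice per request (the "Chosen parent" lines above come in pairs with identical timestamps) and only the second pick is used. The sketch below models that assumption; hits_per_parent and its parameters are hypothetical and are not taken from the ATS sources.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical model: each request advances a shared round-robin counter
// calls_per_request times and connects to the last parent picked.
std::vector<int> hits_per_parent(uint32_t num_parents,
                                 uint32_t calls_per_request,
                                 uint32_t requests) {
  assert(num_parents > 0 && calls_per_request > 0);
  std::vector<int> hits(num_parents, 0);
  uint32_t rr_next = 0; // stands in for the shared counter
  for (uint32_t r = 0; r < requests; ++r) {
    uint32_t chosen = 0;
    for (uint32_t c = 0; c < calls_per_request; ++c) {
      chosen = rr_next++ % num_parents; // every call consumes one slot
    }
    ++hits[chosen]; // only the final pick receives the connection
  }
  return hits;
}
```

Under this model, hits_per_parent(8, 2, 80) gives 20 hits to each of the parents at indices 1, 3, 5 and 7 and none to the others, whereas hits_per_parent(8, 1, 80) spreads 10 hits to every parent. With an even number of parents, consuming two slots per request preserves the parity of the chosen index.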
