Skip to content

Improve channel lookup and logging#2048

Merged
softins merged 2 commits intojamulussoftware:masterfrom
softins:channel-lookup
Sep 29, 2021
Merged

Improve channel lookup and logging#2048
softins merged 2 commits intojamulussoftware:masterfrom
softins:channel-lookup

Conversation

@softins
Copy link
Copy Markdown
Member

@softins softins commented Sep 27, 2021

Short description of changes

In CHostAddress::Compare(), does an explicit comparison of IPv4 addresses instead of just subtracting them. This avoids a potential signed overflow inconsistency.

Also improves the logging of connections by fetching the channel count at the time of connection and passing it through to the logging, instead of fetching it later in the logging thread.

Context: Fixes an issue?

May fix #2046

Does this change need documentation? What needs to be documented and how?

No, code improvement only

Status of this Pull Request

Working, but needs testing.

What is missing until this pull request can be merged?

Checklist

  • I've verified that this Pull Request follows the general code principles
  • I tested my code and it does what I want
  • My code follows the style guide
  • I waited some time after this Pull Request was opened and all GitHub checks completed without errors.
  • I've filled all the content above

Fetch the channel count at the time it changes, and propagate
the value to the logging, instead of having the log process
fetch the value. This improves the log accuracy when multiple
channels connect at the same time.
Comment thread src/util.cpp
quint32 otherAddr = other.InetAddr.toIPv4Address();

return (int) thisAddr - (int) otherAddr;
return thisAddr < otherAddr ? -1 : thisAddr > otherAddr ? 1 : 0;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I saw that in IPv6 branch uses memcmp, I suppose that can be used as well for consistency?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would probably make more logical sense.

Copy link
Copy Markdown
Member Author

@softins softins Sep 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IPv6 uses memcmp() because the addresses are 16 bytes long. IPv4 addresses can be compared whole as they fit a 32-bit integer, so that seems more efficient than memcmp() which would be comparing a byte at a time (I assume).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect memcmp to be optimised for the platform its running on and the data sizes being compared.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here’s some quick and dirty benchmark. I am using Google Benchmark library (apt install libbenchmark-dev):

#include <benchmark/benchmark.h>
#include <string.h>
#include <stdlib.h>

static void BM_ManualComparison(benchmark::State &state)
{
    srand(1);
    for (auto _ : state)
    {
        state.PauseTiming();
        unsigned int a = rand();
        unsigned int b = rand();
        state.ResumeTiming();
        benchmark::DoNotOptimize(a < b ? -1 : a > b ? 1 : 0);
    }
}
BENCHMARK(BM_ManualComparison);

static void BM_Memcmp(benchmark::State &state)
{
    srand(1);
    for (auto _ : state)
    {
        state.PauseTiming();
        unsigned int a = rand();
        unsigned int b = rand();
        state.ResumeTiming();
        benchmark::DoNotOptimize(memcmp(&a, &b, sizeof(unsigned int)));
    }
}
BENCHMARK(BM_Memcmp);

BENCHMARK_MAIN();

Running benchmark with -O3 resulted in roughly the same amount of CPU time per iteration. Your results may vary:

$ g++ -lbenchmark bench.cpp -O3 && ./a.out

Running ./a.out
Run on (1 X 2900 MHz CPU )
CPU Caches:
  L1 Data 64K (x1)
  L1 Instruction 64K (x1)
  L2 Unified 512K (x1)
  L3 Unified 16384K (x1)
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------------
Benchmark                    Time           CPU Iterations
-----------------------------------------------------------
BM_ManualComparison        329 ns        325 ns    2138302
BM_Memcmp                  326 ns        325 ns    2130022

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's certainly an interesting result! :)

On a little-endian architecture such as Intel, the comparison by memcmp() must necessarily give different results than a direct comparison of a pair of int or unsigned int variables. In this particular application that doesn't actually matter, as the order is immaterial, provided it is consistent.

Personally, in this instance, I prefer the comparison of unsigned int directly (as in the current PR) rather than memcmp(), but not strongly enough to argue about it. If the consensus is to prefer memcmp(), then fair enough.

I repeated the test on my RPi4:

tony@pi:~/benchmark $ g++ -lbenchmark bench.cpp -O3 && ./a.out
2021-09-28 17:54:48
Running ./a.out
Run on (4 X 1500 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------------
Benchmark                    Time           CPU Iterations
-----------------------------------------------------------
BM_ManualComparison       2458 ns       2432 ns     285846
BM_Memcmp                 2443 ns       2425 ns     289761
tony@pi:~/benchmark $ ./a.out
2021-09-28 17:55:26
Running ./a.out
Run on (4 X 1500 MHz CPU s)
***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead.
***WARNING*** Library was built as DEBUG. Timings may be affected.
-----------------------------------------------------------
Benchmark                    Time           CPU Iterations
-----------------------------------------------------------
BM_ManualComparison       2464 ns       2434 ns     288050
BM_Memcmp                 2487 ns       2463 ns     286765

So again there is not much to choose, and the two runs were different in the opposite way!

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Personally, I am fine with either way; I don't have a strong opinion — just pointing out that using memcmp will make both IPv4 and IPv6 code more consistent with negligible performance penalty, although consistency in this specific case may not be so important to warrant a change.

(By the way I am already deploying the code with patches from this PR, as it is right now, to my servers for testing tomorrow.)

@ann0see ann0see added this to the Release 3.8.1 milestone Sep 28, 2021
@ann0see ann0see added the bug Something isn't working label Sep 28, 2021
@softins softins merged commit d540df9 into jamulussoftware:master Sep 29, 2021
@softins softins deleted the channel-lookup branch January 11, 2022 12:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[3.8.1beta1] Existing connection gets allocated to a new channel

4 participants