Fix GetValue() to correctly extract values from a report when compiled in LP64 mode (usbhid-ups) by nbriggs · Pull Request #1040 · networkupstools/nut

nbriggs · 2021-06-03T17:45:02Z

Changes the strategy for removing potential garbage bits from values extracted from a report.
Use the LogMin and LogMax fields to determine which bits are meaningful, but avoid confusion when,
for example, given a range of -1..2147483647.

Closes #1023

jimklimov

That's quite a binary-maths exercise! :)

Thanks for the proposed change and simplified code, noted one concern in the comments but you may convince me that it does not matter :)

jimklimov · 2021-06-12T18:55:54Z

-	}
-	b = hibit(range-1);
+	/* calculate where the sign bit will be if needed */
+	signbit = 1L << hibit(magMax > magMin ? magMax : magMin);


I guess 1L would undermine my question, but wouldn't there be a better case for using an explicit (u)int32_t instead of long? In other words, I am wondering if NUT might run on platforms where a long might be less than 4 bytes. Or wastefully more than that, for that matter, and in amounts that would matter... (love the puns)

Alternately, could we "just assume that" 1ULL should certainly be at least 32 bits wide on any platform and shift that for the (u)int32_t types here?

Or just use a boring but presumably safe and portable uint32_t signbit = 1; (assume this zeroes out the other meaningful 31 bits) and signbit <<= ... ?

There's a lot packed up in there to respond to, but perhaps the place to start is:

platforms where a long might be less than 4 bytes

The C standard (e.g., the ISO C99 draft) precludes that possibility in section 5.2.4.2.1, "Sizes of integer types", where it says that the LONG_MAX value must be no less than +2147483647.

The two common size configurations we currently see are ILP32 (ints, longs, pointers are all 32-bit) and LP64 (int is 32-bit, but long and pointer are 64-bit). I haven't personally encountered an ILP64 system, but I gather some may exist.

We know that 1L is by definition the same size as long -- I think introducing 1ULL (that is, unsigned long long) into the mix would add confusion and possible errors rather than reduce them.

I think that if the rest of the NUT code were converted to use explicit sizes rather than the existing short, int, long, long long, and even some "unsigned short int" and "unsigned long long int", then it would be time to do the same here. I also think that lacking test cases for both conforming and non-conforming (to the HID spec) UPSs there's not much chance of doing that and still producing correct results.

[edit to add missing "not"]

BTW, I just tried running test cases for GetValue, with signbit and mask being uint32_t, and then with magMin and magMax either being unsigned long or uint32_t. In none of these cases did it produce the correct result when the program was compiled in 64-bit (LP64) mode. I haven't done the analysis to determine where things go wrong in these cases, but I'd argue that it's more important to get the correct answer...

Thanks for the clarifications! I wonder if a version of that makes sense as a comment in the code, and a link to the issues and PRs you made on this point, so "future-we" (generally in the community) do not try to blindly "optimize" this and break stuff unwittingly.

Regarding the latter comment, I take it that the current solution works on all architectures you could get your hands on, while "similar" implementations, as well as the original one, did not?

Also, do you have some test code to share (maybe into same comment, so future "wannabe optimizers" can test their fixes and/or put that into unit tests eventually)? It might help confirm non-regression on other platforms to be sure :)

nbriggs · 2021-06-16T15:42:21Z

I'll add an external test driver (in C) for the GetValue code in a separate PR if you don't mind.

jimklimov · 2021-06-18T19:30:11Z

Thanks for the note, and thanks for thinking about the other PR for tests!

Currently Travis went down, so I'll be finalizing the new CI that was brewing (too slowly) hopefully this weekend to take over the marathon stick.

nbriggs · 2021-06-18T23:50:08Z

@jimklimov -- hold off on merging this for a bit. I got a test result I didn't expect which I want to investigate.

I'll turn this back into a draft until I'm happy with the results.

nbriggs · 2021-06-19T00:10:42Z

Problem resolved. It was my mistake in the data I was passing in the test driver because I had forgotten that the report items are presented to GetValue in little-endian order, so 00.00.08.00 is NOT 2048 (what I expected to see) -- it should be 00.08.00.00.

nbriggs · 2021-06-20T16:15:30Z

@jimklimov -- would you happen to have access to any debug output from any UPSs from which I can extract data like

50.628467 [D3] Report[buf]: (5 bytes) => 16 0c 00 00 00
50.628522 [D2] Path: UPS.PowerSummary.PresentStatus.Charging, Type: Input, ReportID: 0x16, Offset: 0, Size: 1, Value: 0
50.628566 [D3] NUT doesn't use this HID object
50.628626 [D3] Report[buf]: (5 bytes) => 16 0c 00 00 00
50.628678 [D2] Path: UPS.PowerSummary.PresentStatus.Discharging, Type: Input, ReportID: 0x16, Offset: 1, Size: 1, Value: 0

and

58.659692 [D3] Report[buf]: (4 bytes) => 0c 64 11 0d
58.659745 [D2] Path: UPS.PowerSummary.RemainingCapacity, Type: Input, ReportID: 0x0c, Offset: 0, Size: 8, Value: 100
58.659785 [D3] NUT doesn't use this HID object
58.659835 [D3] Report[buf]: (4 bytes) => 0c 64 11 0d
58.659885 [D2] Path: UPS.PowerSummary.RunTimeToEmpty, Type: Input, ReportID: 0x0c, Offset: 8, Size: 16, Value: 3345

and so on -- it doesn't matter whether they're entries that NUT uses. I'm building up a collection of predefined tests, as well as making it easy to check values from the command line as in:

./getvaluetest "0c 64 11 0d" 8 16 0 65535 3345
Test #0 buf "0c 64 11 0d" offset 8 size 16 logmin 0 (0x0) logmax 65535 (0xffff) value 3345 PASS

One of the predefined tests shows the problem with the original GetValue() code between 32- and 64-bit compiles:

% file getvaluetest32
getvaluetest32: ELF 32-bit MSB executable SPARC32PLUS Version 1, V8+ Required, UltraSPARC3 Extensions Required, dynamically linked, not stripped
% ./getvaluetest32
Test #1 buf "00 ff ff ff ff" offset 0 size 32 logmin -1 (0xffffffff) logmax 2147483647 (0x7fffffff) value -1 PASS
[...]
% file getvaluetest 
getvaluetest:     ELF 64-bit MSB executable SPARCV9 Version 1, dynamically linked, not stripped
% ./getvaluetest
Test #1 buf "00 ff ff ff ff" offset 0 size 32 logmin -1 (0xffffffffffffffff) logmax 2147483647 (0x7fffffff) value 0 FAIL expected -1
[...]

With the updated code, the test passes for both compilation modes.

jimklimov · 2021-06-21T21:17:09Z

Got an innotech (nutdrv_qx) UPS on USB of one openindiana (amd64) machine, reporting these sorts of data points at startup:

   6.454593     [D3] send: Q1
   6.514069     [D5] read [  0]: (8 bytes) => 28 32 32 38 2e 34 20 32
   6.546635     [D5] read [  8]: (8 bytes) => 32 38 2e 34 20 32 32 38
   6.577904     [D5] read [ 16]: (8 bytes) => 2e 34 20 30 31 33 20 35
   6.611767     [D5] read [ 24]: (8 bytes) => 30 2e 33 20 31 33 2e 36
   6.674604     [D5] read [ 32]: (8 bytes) => 20 32 35 2e 30 20 30 30
   6.715857     [D5] read [ 40]: (8 bytes) => 30 30 31 30 30 31 0d 00
   6.715933     [D3] read: (228.4 228.4 228.4 013 50.3 13.6 25.0 00001001
   6.715985     [D5] send_to_all: SETINFO input.voltage "228.4"

  45.050434     [D1] upsdrv_updateinfo...
  45.050519     [D1] Quick update...
  45.052299     [D3] send: Q1
  45.104409     [D5] read [  0]: (8 bytes) => 28 32 32 39 2e 34 20 32
  45.168411     [D5] read [  8]: (8 bytes) => 32 39 2e 34 20 32 32 39
  45.200408     [D5] read [ 16]: (8 bytes) => 2e 34 20 30 31 33 20 35
  45.232386     [D5] read [ 24]: (8 bytes) => 30 2e 33 20 31 33 2e 36
  45.264416     [D5] read [ 32]: (8 bytes) => 20 32 35 2e 30 20 30 30
  45.296381     [D5] read [ 40]: (8 bytes) => 30 30 31 30 30 31 0d 00
  45.296442     [D3] read: (229.4 229.4 229.4 013 50.3 13.6 25.0 00001001
  45.296482     [D5] update_status: OL
  45.296522     [D5] update_status: !LB
  45.296563     [D5] update_status: !CAL
  45.296599     [D5] update_status: !FSD

 556.327876     [D1] Quick update...
 556.330645     [D3] send: Q1
 556.408327     [D5] read [  0]: (8 bytes) => 28 32 32 38 2e 30 20 32
 556.427190     [D5] read [  8]: (8 bytes) => 32 38 2e 30 20 32 32 38
 556.449794     [D5] read [ 16]: (8 bytes) => 2e 30 20 30 31 30 20 35
 556.537239     [D5] read [ 24]: (8 bytes) => 30 2e 33 20 31 33 2e 36
 556.545810     [D5] read [ 32]: (8 bytes) => 20 32 35 2e 30 20 30 30
 556.597602     [D5] read [ 40]: (8 bytes) => 30 30 31 30 30 31 0d 00
 556.597668     [D3] read: (228.0 228.0 228.0 010 50.3 13.6 25.0 00001001
 556.597707     [D5] update_status: OL
 556.597743     [D5] update_status: !LB
 556.597780     [D5] update_status: !CAL
 556.597814     [D5] update_status: !FSD

 589.313029     [D2] send_to_one: sending PONG
 589.313060     [D5] send_to_one: PONG
 589.313100     [D1] upsdrv_updateinfo...
 589.313136     [D1] Full update...
 589.315422     [D3] send: Q1
 589.430053     [D5] read [  0]: (8 bytes) => 28 32 32 39 2e 34 20 32
 589.469210     [D5] read [  8]: (8 bytes) => 32 39 2e 34 20 32 32 39
 589.472486     [D5] read [ 16]: (8 bytes) => 2e 34 20 30 31 30 20 35
 589.557763     [D5] read [ 24]: (8 bytes) => 30 2e 33 20 31 33 2e 36
 589.638961     [D5] read [ 32]: (8 bytes) => 20 32 35 2e 30 20 30 30
 589.696486     [D5] read [ 40]: (8 bytes) => 30 30 31 30 30 31 0d 00
 589.696551     [D3] read: (229.4 229.4 229.4 010 50.3 13.6 25.0 00001001
 589.696601     [D5] send_to_all: SETINFO input.voltage "229.4"
 589.696655     [D5] send_to_all: SETINFO input.voltage.fault "229.4"
 589.696698     [D5] send_to_all: SETINFO output.voltage "229.4"
 589.696738     [D5] send_to_all: SETINFO ups.load "10"
 589.696790     [D5] update_status: OL
 589.696824     [D5] update_status: !LB
 589.700088     [D5] update_status: !CAL
 589.700136     [D5] update_status: !FSD

Do these help your test cases? :)

nbriggs · 2021-06-21T21:23:40Z

Thanks... it's not quite enough info -- it's not reporting the offset, size, and value that it extracted from the bytes that it read. I'll go take a look at the nutdrv_qx driver later today to see if there's a way to get the full info out of it.

nbriggs · 2021-06-21T21:48:46Z

Oops... answered without doing the necessary investigation! My change only affects UPS units that use the usbhid-ups driver, or, if I am reading it correctly, the mge-shut driver. From the Makefile, that would seem to include units like these which are subdrivers of the usbhid-ups driver:

apc-hid.c belkin-hid.c cps-hid.c liebert-hid.c mge-hid.c powercom-hid.c
tripplite-hid.c idowell-hid.c openups-hid.c powervar-hid.c delta_ups-hid.c

The nutdrv_qx doesn't use the hidparser code.

clepple · 2021-07-01T01:56:08Z

@nbriggs Thanks for digging into this.

would you happen to have access to any debug output from any UPSs from which I can extract data like ...

#733 has several smaller fields, and one 24-bit field (UPS.PowerSummary.RuntimeToEmpty). We added the [D3]-style debug level prefix to the output fairly recently (compared to when usbhid-ups was first written), so unfortunately it's not a single search string across all history. I got a fair number of hits in Gmail when searching for "Report[get]" (and most of those should be archived online e.g.: https://alioth-lists.debian.net/pipermail/nut-upsuser/2015-August/009790.html )

clepple · 2021-07-01T02:03:43Z

I will say that a place where you probably shouldn't trust the HID data is a CyberPower UPS. They seem to be relying on very strange interpretations of the HID stack description (which affects either Physical or Logical min/max), and I haven't had the time to figure out a decent structure for a HID descriptor patching system that doesn't involve including verbatim copies of all known buggy descriptors in NUT to match against.

clepple

All good points in the discussion on this PR, but I would want to try either unit tests or build and run on actual hardware to give it an explicit thumbs-up. Not sure when I will have time for testing.

nbriggs · 2021-07-01T04:05:02Z

@clepple -- Hi, I've pushed a branch (and created PR #1055) with the test harness that you can use to compare the old and new implementations of GetValue if you so choose. It's independent of this updated GetValue() code.

jimklimov · 2021-09-20T16:19:16Z

I've updated the CI farm to include containers with various OSes and platforms (as QEMU emulated on Linux), but so far they seem too slow and complicated to add into the main build iterations. So I hope to at least run some tests for codebase of #1055 with and without this PR added, to see if it behaves well everywhere, thanks :)

I will also see if the different-endianness containers work; some reported errors during setup (not all instructions are implemented in the vCPUs).

jimklimov · 2021-10-04T02:25:43Z

FYI: With some recent development on CI side, I made a branch that should combine QEMU testing and proposed LP64 fix and test from #1040 and #1055 ... "so here goes nothing" : https://ci.networkupstools.org/job/nut/job/nut/job/issue_1023_GetValue_qemu_test/

jimklimov · 2021-10-15T19:52:34Z

QEMU on the already-VM CI farm is sloooow... but as of https://ci.networkupstools.org/blue/organizations/jenkins/nut%2Fnut/detail/issue_1023_GetValue_qemu_test/7/pipeline/ the tests went okay for Big-Endian s390x (64-bit) and mips (32-bit) as much as was possible to emulate.

And I also checked that the getvaluetest (without the LP64 fix) did fail for s390x in that first item, same as x86 envs originally:

jim@jenkins-debian11-s390x:~/nut$ (cd tests && ./getvaluetest)
Test #1 buf "00 ff ff ff ff" offset 0 size 32 logmin -1 (0xffffffffffffffff) logmax 2147483647 (0x7fffffff) value 0 FAIL expected -1
Test #2 buf "00 ff" offset 0 size 8 logmin -1 (0xffffffffffffffff) logmax 127 (0x7f) value -1 PASS
Test #3 buf "00 ff" offset 0 size 8 logmin 0 (0x0) logmax 127 (0x7f) value 127 PASS
Test #4 buf "00 ff" offset 0 size 8 logmin 0 (0x0) logmax 255 (0xff) value 255 PASS
Test #5 buf "33 00 0a 08 80" offset 0 size 32 logmin 0 (0x0) logmax 65535 (0xffff) value 2560 PASS
Test #6 buf "00 00 08 00 00" offset 0 size 32 logmin 0 (0x0) logmax 65535 (0xffff) value 2048 PASS
Test #7 buf "06 00 00 08" offset 0 size 8 logmin 0 (0x0) logmax 255 (0xff) value 0 PASS
Test #8 buf "06 00 00 08" offset 8 size 8 logmin 0 (0x0) logmax 255 (0xff) value 0 PASS
Test #9 buf "06 00 00 08" offset 16 size 8 logmin 0 (0x0) logmax 255 (0xff) value 8 PASS
Test #10 buf "16 0c 00 00 00" offset 0 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #11 buf "16 0c 00 00 00" offset 1 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #12 buf "16 0c 00 00 00" offset 2 size 1 logmin 0 (0x0) logmax 1 (0x1) value 1 PASS
Test #13 buf "16 0c 00 00 00" offset 3 size 1 logmin 0 (0x0) logmax 1 (0x1) value 1 PASS
Test #14 buf "16 0c 00 00 00" offset 4 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #15 buf "16 0c 00 00 00" offset 5 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #16 buf "16 0c 00 00 00" offset 6 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #17 buf "16 0c 00 00 00" offset 7 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #18 buf "16 0c 00 00 00" offset 8 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #19 buf "16 0c 00 00 00" offset 9 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS
Test #20 buf "16 0c 00 00 00" offset 10 size 1 logmin 0 (0x0) logmax 1 (0x1) value 0 PASS

Still waiting for mips result with that branch.

nbriggs · 2021-10-15T20:37:16Z

Happy to see it's still making progress. When it's merged I'll be rebuilding NUT for the guy whose (SPARC 64-bit) system originally prompted this exercise.

jimklimov · 2021-10-16T10:20:04Z

UPDATE: mips(32-bit) build did not expose the 64-bit bit-maths issues, passed the test for PR 1055 codebase (alone, without PR 1040 added).

Now both PRs are merged, adding the test fault on master history and fixing it :)

Thanks for the find, fix, explanations and patience!

jimklimov reviewed Jun 12, 2021

nbriggs added 2 commits June 16, 2021 08:07

Merge branch 'networkupstools:master' into issue_1023_GetValue-LP64

b1245e0

Add note regarding sensitivity to 32- vs 64- bit compilation and over…

058eecc

…flow

nbriggs marked this pull request as draft June 18, 2021 23:50

nbriggs marked this pull request as ready for review June 19, 2021 00:07

jimklimov added the "ready / code review", "ready / gonna merge", and "USB" labels Jun 30, 2021

jimklimov requested review from aquette and clepple June 30, 2021 16:31

jimklimov changed the title ~~Fix GetValue() to correctly extract values from a report when compiled in LP64 mode~~ Fix GetValue() to correctly extract values from a report when compiled in LP64 mode (usbhid-ups) Jun 30, 2021

clepple reviewed Jul 1, 2021

nbriggs mentioned this pull request Jul 1, 2021

Create test harness for GetValue() report value extraction (usbhid-ups) #1055

Merged

nbriggs and others added 3 commits August 18, 2021 11:48

Merge branch 'networkupstools:master' into issue_1023_GetValue-LP64

16dc6db

Merge branch 'master' into issue_1023_GetValue-LP64

bb16006

Merge branch 'master' into issue_1023_GetValue-LP64

25b67c4

Merge branch 'master' into issue_1023_GetValue-LP64

e9608be

jimklimov added 2 commits October 1, 2021 23:33

Merge branch 'master' into issue_1023_GetValue-LP64

c84b321

Merge branch 'master' into issue_1023_GetValue-LP64

27100c4

jimklimov merged commit e67d7aa into networkupstools:master Oct 15, 2021

This was referenced Oct 20, 2021

Research fallout from usbhid-ups fix for LP64 bit maths #1138

Closed

Hotfix for drivers/hidparser.c losing the link with incorrect HID data #1146

Merged

Fightwarn: fix warnings in codebase of recently added getvaluetest #1149

Merged

jimklimov mentioned this pull request Nov 29, 2021

Numerous "ERROR in GetValue: LogMin is greater than LogMax, possibly vendor HID is incorrect on device side" from APC Back-UPS BX1600MI #1208

Closed

wmigas mentioned this pull request Dec 21, 2021

CPS: patch HID Report Descriptor to fix "output.voltage" #439

Closed

jimklimov added the "USB-HID encoding/LogMin/LogMax" label Jan 12, 2022

jimklimov mentioned this pull request Mar 24, 2022

CPS wrong output voltage #1338

Closed

jimklimov mentioned this pull request Apr 25, 2022

PowerWalker Basic VI 1000 SB supported by usbhid-ups #818

Closed

jimklimov mentioned this pull request Dec 16, 2024

usbhid-ups generally: apply disable_fix_report_desc…; cps-hid: fix mismatched LogMax between input/output voltages (bad encoding) #2718

Merged
