Skip to content

generic_modbus driver - handle connection errors (broken pipe), configurable modbus timeouts#1239

Merged
jimklimov merged 24 commits intonetworkupstools:masterfrom
dtsecon:master
Jan 18, 2022
Merged

generic_modbus driver - handle connection errors (broken pipe), configurable modbus timeouts#1239
jimklimov merged 24 commits intonetworkupstools:masterfrom
dtsecon:master

Conversation

@dtsecon
Copy link
Copy Markdown
Contributor

@dtsecon dtsecon commented Dec 23, 2021

generic_modbus driver now handles connection errors (broken pipe) with modbus tcp server devices. Response time out between requests and between frame bytes are now configurable via ups.conf to handle "invalid data" errors. The new code have been tested.

configuration options for modbus response and byte timeouts in ups.conf
# Modbus timeouts
mod_resp_to_s = 2
mod_resp_to_us = 0
mod_byte_to_s = 0
mode_byte_to_us = 20000

…us response timeouts between requests and frame bytes
@lgtm-com
Copy link
Copy Markdown

lgtm-com Bot commented Dec 23, 2021

This pull request introduces 2 alerts when merging bbf2d51 into 5454b34 - view on LGTM.com

new alerts:

  • 2 for Comparison result is always the same

Copy link
Copy Markdown
Member

@jimklimov jimklimov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice :)

@jimklimov
Copy link
Copy Markdown
Member

LGTM errors seem right this time: you define mod_resp_to_us and mod_byte_to_us as uint32_t, so comparisons if mod...<0 really do not make sense. Is zero a valid value? Maybe a future-proof version (if you anticipate changes to a signed type) could be <=0 ?..

@jimklimov
Copy link
Copy Markdown
Member

jimklimov commented Dec 23, 2021

While some CI workers complained about those comparisons, at least one other seems to have a different modbus API so there should be configure-time checks (m4) for acceptable usage respectively ifdef-ed in C code:

[2021-12-23T16:58:14.416Z] generic_modbus.c:131:59: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]  rval = modbus_set_response_timeout(mbctx, mod_resp_to_s, mod_resp_to_us);
[2021-12-23T16:58:14.416Z]         ~~~~~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:151:1: note: 'modbus_set_response_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_response_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] generic_modbus.c:138:55: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]  rval = modbus_set_byte_timeout(mbctx, mod_byte_to_s, mod_byte_to_us);
[2021-12-23T16:58:14.416Z]         ~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:154:1: note: 'modbus_set_byte_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_byte_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] generic_modbus.c:728:51: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
[2021-12-23T16:58:14.416Z]     if (mod_resp_to_us > 999999 || mod_resp_to_us < 0) {
[2021-12-23T16:58:14.416Z]                                    ~~~~~~~~~~~~~~ ^ ~
[2021-12-23T16:58:14.416Z] generic_modbus.c:743:51: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
[2021-12-23T16:58:14.416Z]     if (mod_byte_to_us > 999999 || mod_byte_to_us < 0) {
[2021-12-23T16:58:14.416Z]                                    ~~~~~~~~~~~~~~ ^ ~
[2021-12-23T16:58:14.416Z] generic_modbus.c:1011:62: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]     rval = modbus_set_response_timeout(mbctx, mod_resp_to_s, mod_resp_to_us);
[2021-12-23T16:58:14.416Z]            ~~~~~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:151:1: note: 'modbus_set_response_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_response_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] generic_modbus.c:1018:58: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]     rval = modbus_set_byte_timeout(mbctx, mod_byte_to_s, mod_byte_to_us);
[2021-12-23T16:58:14.416Z]            ~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:154:1: note: 'modbus_set_byte_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_byte_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] 2 warnings and 4 errors generated.

(that's with libmodbus-3.0.5 from Ubuntu 14.04)

@jimklimov
Copy link
Copy Markdown
Member

And gotta investigate how to better handle the two different APIs, it seems ...

@dtsecon
Copy link
Copy Markdown
Contributor Author

dtsecon commented Jan 4, 2022

What if users enter a negative? Or if valid "long" input exceeds UINT32_MAX?..

"Leave them shot in the foot" is an option, this setting is quite specific... But elsewhere code makes range-checks of an honest long and rejects or casts then.

The values for configuration variables mod_*_to_s, mod_*_to_us correspond to seconds and micro seconds engineering units. They can't be negative values. The user that set those values is responsible to honor the appropriate variable type and range. On the other hand the code must survive of non conforming values and in order to do so, it casts the long converted string user input to uint32_t value and checks against range upper limit in order to exit with failure without crashing. If user input value is within range but still a typo, it's his responsibility to figure out and fix it. So negative values or values that cross UINT32_MAX are handled by code (value out of range > 999999) by exiting with a failure.

@dtsecon
Copy link
Copy Markdown
Contributor Author

dtsecon commented Jan 4, 2022

While some CI workers complained about that comparisons, at least one other seems to have a different modbus API so there should be configure-time checks (m4) for acceptable usage respectively ifdef-ed in C code:

[2021-12-23T16:58:14.416Z] generic_modbus.c:131:59: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]  rval = modbus_set_response_timeout(mbctx, mod_resp_to_s, mod_resp_to_us);
[2021-12-23T16:58:14.416Z]         ~~~~~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:151:1: note: 'modbus_set_response_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_response_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] generic_modbus.c:138:55: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]  rval = modbus_set_byte_timeout(mbctx, mod_byte_to_s, mod_byte_to_us);
[2021-12-23T16:58:14.416Z]         ~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:154:1: note: 'modbus_set_byte_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_byte_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] generic_modbus.c:728:51: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
[2021-12-23T16:58:14.416Z]     if (mod_resp_to_us > 999999 || mod_resp_to_us < 0) {
[2021-12-23T16:58:14.416Z]                                    ~~~~~~~~~~~~~~ ^ ~
[2021-12-23T16:58:14.416Z] generic_modbus.c:743:51: warning: comparison of unsigned expression < 0 is always false [-Wtautological-compare]
[2021-12-23T16:58:14.416Z]     if (mod_byte_to_us > 999999 || mod_byte_to_us < 0) {
[2021-12-23T16:58:14.416Z]                                    ~~~~~~~~~~~~~~ ^ ~
[2021-12-23T16:58:14.416Z] generic_modbus.c:1011:62: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]     rval = modbus_set_response_timeout(mbctx, mod_resp_to_s, mod_resp_to_us);
[2021-12-23T16:58:14.416Z]            ~~~~~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:151:1: note: 'modbus_set_response_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_response_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] generic_modbus.c:1018:58: error: too many arguments to function call, expected 2, have 3
[2021-12-23T16:58:14.416Z]     rval = modbus_set_byte_timeout(mbctx, mod_byte_to_s, mod_byte_to_us);
[2021-12-23T16:58:14.416Z]            ~~~~~~~~~~~~~~~~~~~~~~~                       ^~~~~~~~~~~~~~
[2021-12-23T16:58:14.416Z] /usr/include/modbus/modbus.h:154:1: note: 'modbus_set_byte_timeout' declared here
[2021-12-23T16:58:14.416Z] void modbus_set_byte_timeout(modbus_t *ctx, const struct timeval *timeout);
[2021-12-23T16:58:14.416Z] ^
[2021-12-23T16:58:14.416Z] 2 warnings and 4 errors generated.

(that's with libmodbus-3.0.5 from Ubuntu 14.04)

the set_response_timeout function has been modified since 2013. This is an older version that used to accept timeout as a struct timeval instead of seconds and microseconds.

@jimklimov
Copy link
Copy Markdown
Member

Well, NUT aims to be usable on a wide spread of systems, some embedded and not easily upgradeable from whatever older baseline OS they boast :\

There are already precedents of multiple-API support (e.g. libusb-0.1 vs libusb-1.0) as well as "lockpicking" suitable method argument types like in https://github.com/networkupstools/nut/blob/master/m4/nut_func_getnameinfo_argtypes.m4 (loosely modeled after cURL config code; note that AC_COMPILE_IFELSE is preferable for build-time differences, but AC_RUN_IFELSE can be used similarly if there are some run-time nuances to detect - e.g. platforms with known bugs in one method so an alternative would be used).

@jimklimov
Copy link
Copy Markdown
Member

jimklimov commented Jan 11, 2022

Hi @dtsecon : I've posted the fix to adjust to two available libmodbus APIs, as a PR to your "nut_fork:master" (note, future development and PRing of features should better be handled in dedicated branches): dtsecon#1

If you can, please test if that still works for your builds; if all is OK, after you merge that PR this one should be automatically updated.

Otherwise, I intend to merge that state of the codebase (including your and my fixes) in a few days, if NUT CI farm is okay with that - as would be seen at https://ci.networkupstools.org/job/nut/job/nut/job/dtsecon-master-PR-1239


To keep the comment in NUT for posterity:

As discussed in PR #1239 there are two libmodbus API variants to consider on the build farm reflecting current and "recent" distros for NUT:

In both cases, error-check should happen at usage time and was not specifically investigated for handled in this PR. Per docs,

When a byte timeout is set, if elapsed time for the first byte of response is longer than the given timeout, an ETIMEDOUT error will be raised by the function waiting for a response

@jimklimov
Copy link
Copy Markdown
Member

Some merge-conflicts cropped up between your "master" branch and the target (upstream NUT "master" branch which had some clean-up applied since original driver merge). Hopefully resolved now with minimal noise.

@dtsecon
Copy link
Copy Markdown
Contributor Author

dtsecon commented Jan 12, 2022

Hi @dtsecon : I've posted the fix to adjust to two available libmodbus APIs, as a PR to your "nut_fork:master" (note, future development and PRing of features should better be handled in dedicated branches): dtsecon#1

If you can, please test if that still works for your builds; if all is OK, after you merge that PR this one should be automatically updated.

Otherwise, I intend to merge that state of the codebase (including your and my fixes) in a few days, if NUT CI farm is okay with that - as would be seen at https://ci.networkupstools.org/job/nut/job/nut/job/dtsecon-master-PR-1239

To keep the comment in NUT for posterity:

As discussed in PR #1239 there are two libmodbus API variants to consider on the build farm reflecting current and "recent" distros for NUT:

In both cases, error-check should happen at usage time and was not specifically investigated for handled in this PR. Per docs,

When a byte timeout is set, if elapsed time for the first byte of response is longer than the given timeout, an ETIMEDOUT error will be raised by the function waiting for a response

tested and seems to work fine against my setup

@dtsecon
Copy link
Copy Markdown
Contributor Author

dtsecon commented Jan 17, 2022

Hi @jimklimov. I noticed that mr jenkins struggles to pass this PR during the last days without success. Since my last commits against generic_modbus have been applied to my master forked nut repo (current PR), this pending merge holds me from moving on with a couple of more PR branches (a new driver for adelesystems UPS, more updates on generic_modbus, and a fix for upssched which is currently broken). Regards!

@jimklimov
Copy link
Copy Markdown
Member

jimklimov commented Jan 18, 2022

Hi, sorry about that delay - it also had a few complaints about the code (though did not expose them in the build verdict... recipe fixed elsewhere - so should now complain about quality regression for future NUT CI farm builds). Those complaints were fixed in the later 3 commits, and the build passed without warnings last night - so merging, finally! Thanks for the solid improvements!

And for those further PRs, please remember to base them off of dedicated branches, not "master", in your forked repo :)

@jimklimov jimklimov merged commit b14fdf3 into networkupstools:master Jan 18, 2022
dtsecon added a commit to dtsecon/nut_fork that referenced this pull request Feb 4, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants