Skip to content

ESI: remove custom parser#128

Merged
yadij merged 8 commits intosquid-cache:masterfrom
yadij:v5-cleanup_esi_parser
Jan 18, 2018
Merged

ESI: remove custom parser#128
yadij merged 8 commits intosquid-cache:masterfrom
yadij:v5-cleanup_esi_parser

Conversation

@yadij
Copy link
Contributor

@yadij yadij commented Jan 17, 2018

Alex Rousskov:
let's consider removing the custom ESI parser from Squid. It is of
terrible quality and "nobody" is testing ESI code when things change. Is
the CVE risk worth supporting few platforms that do not have the right
parser libraries?

Alex Rousskov:
  let's consider removing the custom ESI parser from Squid. It is of
terrible quality and "nobody" is testing ESI code when things change. Is
the CVE risk worth supporting few platforms that do not have the right
parser libraries?
@yadij yadij force-pushed the v5-cleanup_esi_parser branch from 5b9e5d2 to 46391e0 Compare January 17, 2018 05:01
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please make ./configure --enable-esi fail when none of the now-required libraries is available.

Please warn squid-users about this change to increase our chances of making an informed decision regarding the custom parser "value" to Squid admins. I, personally, certainly cannot represent ESI users!

assert(prCustom); // we should be called once, and only after Init()

#if HAVE_LIBEXPAT
assert(prExpat); // we should be called once, and only after Init()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree with the old comment. Or, to be more precise, I disagree that "should" is assert().

There is no reason to restrict cleanup that much IMHO. Let's not add these two asserts in Clean().

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

yadij added 2 commits January 18, 2018 02:58
Also, remove the 05-nodeps-esi QA tests. We cannot test expected-to-fail
build error cases at present.
Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for addressing my earlier concerns. Unfortunately, the new changes brought in one new problem (see inlined comments for details). I also realized that we forgot to fix esi_parser metadata and description in src/cf.data.pre.

AC_ARG_ENABLE(esi,
AS_HELP_STRING([--enable-esi],
[Enable ESI for accelerators. Benefits from expat or libxml2.
AS_HELP_STRING([--disable-esi],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt we should enable ESI where possible because I suspect that most admins do not want ESI processing of responses (and other ESI side effects). At the very least, this change in default should be done/debated separately from this PR.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is catching the documentation up to the change in v3.1 "ESI support default enabled", although there was one case incorrectly still disabled which changes with the library checks being more central to the auto-disable criteria.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ouch! I consider this change request addressed, thank you.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

However, if the official code was not actually enabling ESI by default in many cases (due to the bug you have mentioned), and we have now "fixed" that, effectively enabling ESI for many unsuspected admins, then we have a bigger mess to deal with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nod, one of the reasons this is not going near v3.5. If you believe the change is going to be too much of a surprise, I'm fine with leaving it in v5. Though IMO it is still smaller than the other changes already targeted at v4 (eg PR #45 and #81) and the testing these will all need can overlap without obstructing any later fixes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: People using distro packages, or basing their builds on the distro options should be unaffected. I believe the majority of self-builds are just rebuilds of distro packages for SSL-Bump these days, so not many left to surprise.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you believe the change is going to be too much of a surprise

I do not have enough information to believe either way. This is one of the reasons you should post to squid-users about this change IMHO.

configure.ac Outdated
squid_opt_use_esi=yes
],[
AS_IF(test "x$with_expat" = "xyes",[
AC_MSG_ERROR([Required library expat is not able to be found.])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please fix phrasing by doing s/is not able to be/not/ (or something like that) in both copies.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, done

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New ESI configuration concerns in the new code. See inlined comment for details.

src/cf.data.pre Outdated
TYPE: string
LOC: ESIParser::Type
DEFAULT: custom
DEFAULT: libxml2
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would happen when Squid is built with libexpat and without libxml2? Only one of those two libraries is required to get USE_SQUID_ESI enabled, right? It feels like the default should be influenced by the library availability.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Squid can use either or both. The code seems to prefer libxml2 and a quick check online shows that library is a bit more active than libexpat. Though its documentation is not as good. They both have packages or binaries available for all the OS we formally support, and then some.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, missed your meaning for a bit there. Yes expat-only would be a startup problem. Fixing that now by moving the default selection into esi/Parser.cc based on the HAVE_* instead of static value in cf.data.pre

Copy link
Contributor

@rousskov rousskov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I found a few more bugs. Inlined comments have specifics and a polishing recommendation for the new m4 code.

Also, we should make libesi.la build conditional on ESI being enabled, right?

AC_MSG_ERROR([Required library libxml2 is not able to be found.])
fi
AS_IF(test "x$HAVE_LIBXML2" = "x1",[
squid_opt_use_esi=yes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please do not (re)set squid_opt_use_esi here. The original value we used to enter this block is fine and, for diagnostic/reporting purposes, should be preserved. We should not overwrite user input unless really necessary. Preserving the original value also simplifies the logic behind the (currently buggy) code that tries to put all the tests together later.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The user input is in $enable_esi. squid_opt_use_esi is our local variable to indicate whether we define ENABLE_ESI and USE_SQUID_ESI later. It starts as "auto" and ends as "no" if ESI is forbidden, or "yes" if permitted. Staying auto at the end means broken dependencies.

configure.ac Outdated
AS_IF(test "x$with_libxml2" = "xyes",[
AC_MSG_ERROR([Required library libxml2 not found.])
],[
AC_MSG_NOTICE([Required library libxml2 not found.])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do not think we know that libxml2 is required in this particular case. AFAICT, this else clause should be removed. Later, we check that at least one of the libraries is available (as we should).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nod, remove "Required" from the NOTICE messages. Absence there is covered by other libraries existing or the final error message.

configure.ac Outdated
fi

AS_IF([test "x$squid_opt_use_esi" = "xyes"],[
AS_IF(test $HAVE_LIBXML2 = 1 -a $HAVE_EXPAT = 1,[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need the opposite of the current condition AFAICT. After fixing this condition, please double check the else clause.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, that was the broken part. checking for at least one of them existing now

])
fi

AS_IF([test "x$squid_opt_use_esi" = "xyes"],[
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you preserve the original squid_opt_use_esi, then this condition would make a lot more sense (and the ESI enabling code would be moved into a dedicated block, with its own condition that would also be easy to grok).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If ESI is to be enabled (either required or libraries auto-detected) then ...

src/cf.data.pre Outdated
ESI markup is not strictly XML compatible. The custom ESI parser
will give higher performance, but cannot handle non ASCII character
encodings.
ESI markup is not strictly XML compatible.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The purpose or implication of that remaining statement is not really clear. Do those XML-parsing libraries handle ESI markup correctly?

  • If XML-parsing libraries handle ESI markup correctly, then why mention this?
  • Otherwise, this PR not just removes "faster but poor quality and unmaintained" code, but actually breaks some ESI setups (the ones that XML-parsing libraries cannot handle)!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what that text should be saying now. My reading of the original was that the custom parser could only handle ASCII strict XML syntax properly, while the other libraries should be better.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the libraries can handle ESI correctly, then the new text could say something like

Selects the XML parsing library to use when interpreting responses with Edge Side Includes. To disable ESI handling completely, ./configure Squid with --disable-esi.

and

DEFAULT_DOC: Selects XXX if available at ./confgure time or YYY otherwise.

Please note that I have no idea whether those libraries can handle ESI correctly. My interpretation of the original documentation was that they may not be able to handle some cases, but I did not research that, and other reasonable interpretations (like yours) are possible.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good. done

while (Parser != NULL && strcasecmp(Parser->name, Type) != 0)
Parser = Parser->next;
// if type name matters, use it
if (strcasecmp(Type, "auto") == 0) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mnemonic: == means the strings are equal. You want != here AFAICT.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks. fixed

configure.ac Outdated
AS_IF(test "x$with_expat" = "xyes",[
AC_MSG_ERROR([Required library expat not found.])
],[
AC_MSG_NOTICE([Required library expat not found.])
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See libxml2 for the comment that applies to this code as well.

AC_MSG_ERROR([Required library expat is not able to be found.])
fi
AS_IF(test "x$HAVE_LIBEXPAT" = "x1",[
squid_opt_use_esi=yes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See libxml2 for the comment that applies to this code as well.

@kinkie
Copy link
Contributor

kinkie commented Jan 18, 2018 via email

@yadij
Copy link
Contributor Author

yadij commented Jan 18, 2018

nod, got those in this last update

@yadij yadij added the S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box label Jan 18, 2018
@rousskov rousskov added S-waiting-for-committer privileged action is expected (and usually required) and removed S-waiting-for-reviewer ready for review: Set this when requesting a (re)review using GitHub PR Reviewers box labels Jan 18, 2018
@yadij yadij merged commit 799b66d into squid-cache:master Jan 18, 2018
@yadij yadij removed the S-waiting-for-committer privileged action is expected (and usually required) label Jan 18, 2018
@yadij yadij deleted the v5-cleanup_esi_parser branch January 18, 2018 20:55
squidadm pushed a commit to squidadm/squid that referenced this pull request Jan 19, 2018
Alex Rousskov:
  let's consider removing the custom ESI parser from Squid. It is of
terrible quality and "nobody" is testing ESI code when things change. Is
the CVE risk worth supporting few platforms that do not have the right
parser libraries?

* Fixed configure.ac tests for ESI libraries
squidadm pushed a commit to squidadm/squid that referenced this pull request Jan 19, 2018
Alex Rousskov:
  let's consider removing the custom ESI parser from Squid. It is of
terrible quality and "nobody" is testing ESI code when things change. Is
the CVE risk worth supporting few platforms that do not have the right
parser libraries?

* Fixed configure.ac tests for ESI libraries
yadij added a commit that referenced this pull request Jan 19, 2018
Alex Rousskov:
  let's consider removing the custom ESI parser from Squid. It is of
terrible quality and "nobody" is testing ESI code when things change. Is
the CVE risk worth supporting few platforms that do not have the right
parser libraries?

* Fixed configure.ac tests for ESI libraries
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants