Merged
29 changes: 23 additions & 6 deletions doc/admin-guide/plugins/cachekey.en.rst
@@ -110,14 +110,16 @@ Cache key structure and related plugin parameters

::

Optional components | ┌───────────────────┐
| │ --include-headers │
| ├───────────────────┤
Default values if no | │ (empty) |
optional components | └───────────────────┘
Optional components | ┌───────────────────┬────────────────────
| │ --include-headers │ --capture-header  │
| ├────────────────────────────────────────
Default values if no | │ (empty) | (empty) |
optional components | └───────────────────┴────────────────────
configured |

* ``--include-headers`` (default: empty list) - comma separated list of headers to be added to the cache key. The list of headers defined by ``--include-headers`` are always sorted before adding them to the cache key.
* ``--include-headers`` (default: empty list) - comma separated list of headers to be added to the cache key. The list of headers defined by ``--include-headers`` are always sorted before adding them to the cache key.

* ``--capture-header=<headername>:<capture_definition>`` (default: empty) - captures elements from header <headername> using <capture_definition> and adds them to the cache key.
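
The ``<headername>:<capture_definition>`` form described above can be split at the first ``:``. The sketch below is only an illustration of that split: ``splitCaptureArg`` and ``demoSplitCaptureArg`` are hypothetical names, not plugin functions, and the unnamed capture group in the sample argument is an assumption made for the example.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <string_view>
#include <utility>

// Hypothetical helper (not part of the plugin): split "name:definition" at the first ':'.
// Returns std::nullopt when the separator, the name, or the definition is missing.
std::optional<std::pair<std::string, std::string>>
splitCaptureArg(std::string_view arg)
{
  const auto pos = arg.find(':');
  if (pos == std::string_view::npos) {
    return std::nullopt; // no ':' separator at all
  }
  const std::string_view name       = arg.substr(0, pos);
  const std::string_view definition = arg.substr(pos + 1);
  if (name.empty() || definition.empty()) {
    return std::nullopt;
  }
  return std::make_pair(std::string(name), std::string(definition));
}

// Usage check: the definition may itself contain ':' because only the first one splits.
inline void
demoSplitCaptureArg()
{
  const auto parsed = splitCaptureArg("Authorization:/AWS\\s([^:]+).*/clientID:$1/");
  assert(parsed.has_value());
  assert(parsed->first == "Authorization");
  assert(parsed->second == "/AWS\\s([^:]+).*/clientID:$1/");
  assert(!splitCaptureArg("Authorization").has_value()); // separator missing
  assert(!splitCaptureArg(":pattern").has_value());      // name missing
}
```

Splitting at the first separator only is what lets a capture definition contain further colons, as the replacement part ``clientID:$1`` does.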

"Cookies" section
^^^^^^^^^^^^^^^^^
@@ -400,6 +402,21 @@ The following headers ``HeaderA`` and ``HeaderB`` will be used when constructing

@plugin=cachekey.so @pparam=--include-headers=HeaderA,HeaderB

The following captures the client ID from the ``Authorization`` header and adds the captured element to the cache key::

@plugin=cachekey.so \
@pparam=--capture-header=Authorization:/AWS\s(?<clientID>[^:]+).*/clientID:$1/

If the request looks like the following::

http://example-cdn.com/path/file
Authorization: AWS MKIARYMOG51PT0DLD:DLiWQ2lyS49H4Zyx34kW0URtg6s=

The cache key would be set to::

/example-cdn.com/80/clientID:MKIARYMOG51PT0DLD/path/file
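
As a rough illustration of how such a capture turns a header value into a key element, here is a minimal sketch. It assumes ``std::regex`` (ECMAScript grammar) as a stand-in for the plugin's own pattern handling and uses an unnamed group instead of the named one; ``captureClientId`` and ``buildKey`` are hypothetical helpers, not plugin code.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Sketch only: std::regex stands in for the plugin's pattern handling, and an
// unnamed group replaces the named group used in the documented example.
std::string
captureClientId(const std::string &authorization)
{
  static const std::regex pattern(R"(AWS\s([^:]+).*)");
  std::smatch match;
  if (!std::regex_match(authorization, match, pattern)) {
    return ""; // nothing captured, nothing to add to the key
  }
  return "clientID:" + match[1].str();
}

// Hypothetical key assembly using '/' as the separator between elements.
std::string
buildKey(const std::string &host, int port, const std::string &capture, const std::string &path)
{
  std::string key = "/" + host + "/" + std::to_string(port);
  if (!capture.empty()) {
    key += "/" + capture;
  }
  return key + path;
}

// Usage check with the header value from the request shown above.
inline void
demoCaptureClientId()
{
  const std::string element = captureClientId("AWS MKIARYMOG51PT0DLD:DLiWQ2lyS49H4Zyx34kW0URtg6s=");
  assert(element == "clientID:MKIARYMOG51PT0DLD");
  assert(buildKey("example-cdn.com", 80, element, "/path/file") ==
         "/example-cdn.com/80/clientID:MKIARYMOG51PT0DLD/path/file");
}
```

The captured element is whatever the first group matched, so it always equals the part of the header value between ``AWS`` and the first colon.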


HTTP Cookies
^^^^^^^^^^^^

119 changes: 80 additions & 39 deletions plugins/cachekey/cachekey.cc
@@ -437,6 +437,61 @@ CacheKey::appendPath(Pattern &pathCapture, Pattern &pathCaptureUri)
}
}

template <class T>
void
CacheKey::processHeader(const String &name, const ConfigHeaders &config, T &dst,
void (*fun)(const ConfigHeaders &config, const String &name_s, const String &value_s, T &captures))
{
TSMLoc field;

for (field = TSMimeHdrFieldFind(_buf, _hdrs, name.c_str(), name.size()); field != TS_NULL_MLOC;
field = ::nextDuplicate(_buf, _hdrs, field)) {
const char *value;
int vlen;
int count = TSMimeHdrFieldValuesCount(_buf, _hdrs, field);

for (int i = 0; i < count; ++i) {
value = TSMimeHdrFieldValueStringGet(_buf, _hdrs, field, i, &vlen);
if (value == nullptr || vlen == 0) {
CacheKeyDebug("missing value %d for header %s", i, name.c_str());
continue;
}

String value_s(value, vlen);
fun(config, name, value_s, dst);
}
}
}

template <class T>
Collaborator

I think T is any std container of std::string. Can you clarify that in a comment or in the parameter name?

Contributor Author

With this code that is true, and I can see the benefit of the clarification, but I am not sure we should limit things. Also, this is a local template that let me process the headers and take 2 different actions without duplicating code; it is not meant to be reused much (it is not part of the "public" interface of the class, etc.). If you look throughout the plugin code you can see I like lots of comments, but it feels like overkill in this case.
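
To make the point about T concrete, here is a minimal sketch of the same shape: one loop, a destination of type T, and a callback that knows how to store into that T. All names (``storeUnique``, ``storeOrdered``, ``forEachValue``) are hypothetical and simplified; this is not the plugin's code.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical callbacks: each one stores a value into its own destination type.
void
storeUnique(const std::string &value, std::set<std::string> &dst)
{
  dst.insert(value); // sorted, duplicates dropped
}

void
storeOrdered(const std::string &value, std::vector<std::string> &dst)
{
  dst.push_back(value); // insertion order, duplicates kept
}

// One loop serves both destinations: T only has to be what the callback accepts.
template <class T>
void
forEachValue(const std::vector<std::string> &values, T &dst, void (*fun)(const std::string &, T &))
{
  for (const auto &v : values) {
    fun(v, dst);
  }
}

// Usage check: the same input lands sorted and unique in a set, as-is in a vector.
inline void
demoForEachValue()
{
  const std::vector<std::string> values = {"b", "a", "b"};
  std::set<std::string> unique;
  std::vector<std::string> ordered;
  forEachValue(values, unique, storeUnique);
  forEachValue(values, ordered, storeOrdered);
  assert((unique == std::set<std::string>{"a", "b"}));
  assert(ordered == values);
}
```

In this sketch the template never touches T directly, so the only real constraint on T is the one imposed by the callback passed alongside it.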

void
captureWholeHeaders(const ConfigHeaders &config, const String &name, const String &value, T &captures)
{
CacheKeyDebug("processing header %s", name.c_str());
if (config.toBeAdded(name)) {
String header;
header.append(name).append(":").append(value);
captures.insert(header);
CacheKeyDebug("adding header '%s: %s'", name.c_str(), value.c_str());
} else {
CacheKeyDebug("failed to find header '%s'", name.c_str());
Contributor

How noisy will this be? Will this print every time there is a header that is not captured? Maybe it helps enough in debugging that it is worth it. Just don't want to roll logs too frequently if we turn on debugging here.

Contributor Author

I think it should be OK, let's leave it (we can always trim). Here is why I think it should be OK...

When one wants a header to be included or captured from the request into the cache key, one would most likely expect the header or the capture to produce a non-empty string on all requests; otherwise the cache key would become somewhat unstructured (missing elements depending on the traffic). So headers specified in --include-headers and --capture-header are usually expected in the requests.

For efficiency reasons the plugin does not iterate over all headers in the request (at least when the plugin was developed it was not efficient to iterate over headers in ATS, and as far as I know that is still the case; see the comments in the code for more details).

The plugin iterates over the --include-headers and --capture-header lists instead, which in the usual use case will not contain a large number of headers (compared to the number of header fields in the request): usually fewer than 5-10 parameters, 1 or 2 most of the time.

So the noise is not expected to be huge, and the thinking is that while debugging the operator should know that an expected header or header capture was not found.

Contributor

OK sounds good. Agree we can trim it later if it is an issue.
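
The argument in this thread (work and debug output scale with the configured list, not with the request) can be sketched roughly as follows. This is a simplified model: a ``std::multimap`` stands in for the request header storage, lookups are case-sensitive, and ``collectConfigured`` and its ``missing`` counter are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Stand-in for the request header storage (assumption: case-sensitive names).
using Headers = std::multimap<std::string, std::string>;

// Collect "name:value" for each configured name; the request headers are never scanned in full.
// 'missing' counts configured names absent from the request, the case a debug line would report.
std::set<std::string>
collectConfigured(const Headers &request, const std::set<std::string> &configured, int &missing)
{
  std::set<std::string> out;
  missing = 0;
  for (const auto &name : configured) {
    const auto range = request.equal_range(name);
    if (range.first == range.second) {
      ++missing;
      continue;
    }
    for (auto it = range.first; it != range.second; ++it) {
      out.insert(name + ":" + it->second);
    }
  }
  return out;
}

// Usage check: three configured names, one of them absent from the request.
inline void
demoCollectConfigured()
{
  const Headers request = {{"HeaderB", "2"}, {"HeaderA", "1"}, {"Accept", "*/*"}};
  const std::set<std::string> configured = {"HeaderA", "HeaderB", "HeaderC"};
  int missing = 0;
  const auto got = collectConfigured(request, configured, missing);
  assert((got == std::set<std::string>{"HeaderA:1", "HeaderB:2"}));
  assert(missing == 1); // HeaderC is configured but not present
}
```

With one or two configured names, at most one or two such messages per request would be produced in this model, however many headers the request carries.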

}
}

template <class T>
void
captureFromHeaders(const ConfigHeaders &config, const String &name, const String &value, T &captures)
{
CacheKeyDebug("processing capture from header %s", name.c_str());
auto itMp = config.getCaptures().find(name);
if (config.getCaptures().end() != itMp) {
itMp->second->process(value, captures);
CacheKeyDebug("found capture pattern for header '%s'", name.c_str());
} else {
CacheKeyDebug("failed to find header '%s'", name.c_str());
}
}

/**
* @brief Append headers by following the rules specified in the header configuration object.
* @param config header-related configuration containing information about which headers need to be appended to the key.
@@ -445,49 +500,35 @@ CacheKey::appendPath(Pattern &pathCapture, Pattern &pathCaptureUri)
void
CacheKey::appendHeaders(const ConfigHeaders &config)
{
if (config.toBeRemoved() || config.toBeSkipped()) {
// Don't add any headers to the cache key.
return;
}

TSMLoc field;
StringSet hset; /* Sort and uniquify the header list in the cache key. */

/* Iterating header by header is not efficient according to comments inside traffic server API,
* Iterate over an 'include'-kind of list to avoid header by header iteration.
* @todo: revisit this when (if?) adding regex matching for headers. */
for (StringSet::iterator it = config.getInclude().begin(); it != config.getInclude().end(); ++it) {
String name_s = *it;

for (field = TSMimeHdrFieldFind(_buf, _hdrs, name_s.c_str(), name_s.size()); field != TS_NULL_MLOC;
field = ::nextDuplicate(_buf, _hdrs, field)) {
const char *value;
int vlen;
int count = TSMimeHdrFieldValuesCount(_buf, _hdrs, field);

for (int i = 0; i < count; ++i) {
value = TSMimeHdrFieldValueStringGet(_buf, _hdrs, field, i, &vlen);
if (value == nullptr || vlen == 0) {
CacheKeyDebug("missing value %d for header %s", i, name_s.c_str());
continue;
}

String value_s(value, vlen);
if (!config.toBeRemoved() && !config.toBeSkipped()) {
Contributor

Why was the original logic reversed and the early return removed here?

Contributor Author

Before the --capture-header list was added, this function only handled the --include-headers list. toBeRemoved() and toBeSkipped() apply to the --include-headers processing, so returning early would prevent the --capture-header code from running.

I considered various ways to add the new --capture-header functionality (parameter- and design-wise). At some point I may redesign or refactor the plugin to generalize how its various lists are handled, but I think this implementation leaves us in good shape (unless we find something problematic or slow).

Contributor

This makes sense to me. Thanks.
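
The difference between the two control flows discussed in this thread can be shown with a small model. ``HeaderConfig``, ``appendWithEarlyReturn`` and ``appendGated`` are hypothetical stand-ins that only mirror the shape of the logic; they are not the plugin's types or functions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the header configuration.
struct HeaderConfig {
  bool remove = false;
  bool skip   = false;
  std::vector<std::string> includes; // already formatted "name:value" elements
  std::vector<std::string> captures; // already extracted capture elements
};

// Early-return variant: when remove/skip is set, capture handling is never reached.
std::vector<std::string>
appendWithEarlyReturn(const HeaderConfig &c)
{
  std::vector<std::string> key;
  if (c.remove || c.skip) {
    return key;
  }
  key.insert(key.end(), c.includes.begin(), c.includes.end());
  key.insert(key.end(), c.captures.begin(), c.captures.end());
  return key;
}

// Gated variant: remove/skip only guards the include list; captures run independently.
std::vector<std::string>
appendGated(const HeaderConfig &c)
{
  std::vector<std::string> key;
  if (!c.remove && !c.skip) {
    key.insert(key.end(), c.includes.begin(), c.includes.end());
  }
  if (!c.captures.empty()) {
    key.insert(key.end(), c.captures.begin(), c.captures.end());
  }
  return key;
}

// Usage check: with 'remove' set, only the gated variant still emits the capture.
inline void
demoGatedAppend()
{
  HeaderConfig c;
  c.remove = true;
  c.includes.push_back("HeaderA:1");
  c.captures.push_back("clientID:x");
  assert(appendWithEarlyReturn(c).empty());
  assert(appendGated(c) == std::vector<std::string>{"clientID:x"});
}
```

When neither flag is set, both variants in this model produce the same result, so the gated form changes behavior only for configurations that combine remove or skip with captures.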

/* Iterating header by header is not efficient according to comments inside traffic server API,
* Iterate over an 'include'-kind of list or the capture definitions to avoid header by header iteration.
* @todo: revisit this when (if?) adding regex matching for headers. */

/* Adding whole headers, iterate over "--include-header" list */
StringSet hdrSet; /* Sort and uniquify the header list in the cache key. */
for (auto it = config.getInclude().begin(); it != config.getInclude().end(); ++it) {
processHeader(*it, config, hdrSet, captureWholeHeaders);
}

if (config.toBeAdded(name_s)) {
String header;
header.append(name_s).append(":").append(value_s);
hset.insert(header);
CacheKeyDebug("adding header => '%s: %s'", name_s.c_str(), value_s.c_str());
}
}
/* Append to the cache key. It doesn't make sense to have the headers unordered in the cache key. */
String headers_key = containerToString<StringSet, StringSet::const_iterator>(hdrSet, "", _separator);
if (!headers_key.empty()) {
append(headers_key);
}
}

/* It doesn't make sense to have the headers unordered in the cache key. */
String headers_key = containerToString<StringSet, StringSet::const_iterator>(hset, "", _separator);
if (!headers_key.empty()) {
append(headers_key);
if (!config.getCaptures().empty()) {
/* Adding captures from headers, iterate over "--capture-header" definitions */
StringVector hdrCaptures;
for (auto it = config.getCaptures().begin(); it != config.getCaptures().end(); ++it) {
processHeader(it->first, config, hdrCaptures, captureFromHeaders);
}

/* Append to the cache key. Captures are added in the iteration order of the capture definitions (a std::map keyed by header name). */
for (auto &capture : hdrCaptures) {
append(capture);
}
}
}

4 changes: 4 additions & 0 deletions plugins/cachekey/cachekey.h
@@ -73,6 +73,10 @@ class CacheKey
private:
CacheKey(); // disallow

template <class T>
void processHeader(const String &name_s, const ConfigHeaders &config, T &dst,
void (*fun)(const ConfigHeaders &config, const String &name_s, const String &value_s, T &captures));

Contributor

If this is only a helper for the implementation, maybe we don't need to declare it here?

Contributor Author

IMHO there is nothing wrong in having helper functions as private members of a class. I usually try to have them localized to the unit where they are used but sometimes there are lots of class members used by the function and in this case it feels that the usage would become cumbersome (lots of parameter passing, more chances for mistakes too).

Contributor

Yeah, I agree. I was just wondering why we need to add its declaration to a header if no one else is using it. We can leave it if we want.

/* Information from the request */
TSHttpTxn _txn; /**< @brief transaction handle */
TSMBuffer _buf; /**< @brief marshal buffer */
2 changes: 2 additions & 0 deletions plugins/cachekey/common.h
@@ -26,11 +26,13 @@
#define PLUGIN_NAME "cachekey"

#include <string>
#include <string_view>
#include <set>
#include <list>
#include <vector>

typedef std::string String;
typedef std::string_view StringView;
typedef std::set<std::string> StringSet;
typedef std::list<std::string> StringList;
typedef std::vector<std::string> StringVector;
52 changes: 52 additions & 0 deletions plugins/cachekey/configs.cc
@@ -70,6 +70,47 @@ setPattern(MultiPattern &multiPattern, const char *arg)
}
}

bool
ConfigElements::setCapture(const String &name, const String &pattern)
{
auto it = _captures.find(name);
if (_captures.end() == it) {
auto mp = new MultiPattern(name);
if (nullptr != mp) {
_captures[name] = mp;
} else {
return false;
}
}
setPattern(*_captures[name], pattern.c_str());
CacheKeyDebug("added capture pattern '%s' for element '%s'", pattern.c_str(), name.c_str());
return true;
}

void
ConfigElements::addCapture(const char *arg)
{
StringView args(arg);
StringView::size_type pos = args.find_first_of(':');
if (StringView::npos != pos) {
String name(args.substr(0, pos));
if (!name.empty()) {
String pattern(args.substr(pos + 1));
if (!pattern.empty()) {
if (!setCapture(name, pattern)) {
CacheKeyError("failed to add capture: '%s'", arg);
}
} else {
CacheKeyError("missing pattern in capture: '%s'", arg);
}
} else {
CacheKeyError("missing element name in capture: %s", arg);
}
} else {
CacheKeyError("invalid capture: %s, should be 'name:<capture_definition>'", arg);
}
}

void
ConfigElements::setExcludePatterns(const char *arg)
{
@@ -140,6 +181,13 @@ ConfigElements::noIncludeExcludeRules() const
return _exclude.empty() && _excludePatterns.empty() && _include.empty() && _includePatterns.empty();
}

ConfigElements::~ConfigElements()
{
for (auto it = _captures.begin(); it != _captures.end(); it++) {
delete it->second;
}
}

/**
* @brief finalizes the query parameters related configuration.
*
@@ -348,6 +396,7 @@ Configs::init(int argc, const char *argv[], bool perRemapConfig)
{const_cast<char *>("remove-path"), optional_argument, nullptr, 'r'},
{const_cast<char *>("separator"), optional_argument, nullptr, 's'},
{const_cast<char *>("uri-type"), optional_argument, nullptr, 't'},
{const_cast<char *>("capture-header"), optional_argument, nullptr, 'u'},
{nullptr, 0, nullptr, 0},
};

@@ -452,6 +501,9 @@ Configs::init(int argc, const char *argv[], bool perRemapConfig)
case 't': /* uri-type */
setUriType(optarg);
break;
case 'u': /* capture-header */
_headers.addCapture(optarg);
break;
}
}

14 changes: 12 additions & 2 deletions plugins/cachekey/configs.h
@@ -26,6 +26,8 @@
#include "pattern.h"
#include "common.h"

#include <map>

enum CacheKeyUriType {
REMAP,
PRISTINE,
@@ -40,14 +42,19 @@ class ConfigElements
{
public:
ConfigElements() : _sort(false), _remove(false), _skip(false) {}
virtual ~ConfigElements() {}
virtual ~ConfigElements();
void setExclude(const char *arg);
void setInclude(const char *arg);
void setExcludePatterns(const char *arg);
void setIncludePatterns(const char *arg);
void setRemove(const char *arg);
void setSort(const char *arg);

void addCapture(const char *arg);
const auto &
getCaptures() const
{
return _captures;
}
/** @brief shows if the elements are to be sorted in the result */
bool toBeSorted() const;
/** @brief shows if the elements are to be removed from the result */
@@ -67,6 +74,7 @@ class ConfigElements

protected:
bool noIncludeExcludeRules() const;
bool setCapture(const String &name, const String &pattern);

StringSet _exclude;
StringSet _include;
@@ -77,6 +85,8 @@ class ConfigElements
bool _sort;
bool _remove;
bool _skip;

std::map<String, MultiPattern *> _captures;
};

/**
12 changes: 12 additions & 0 deletions plugins/cachekey/pattern.cc
@@ -452,6 +452,18 @@ MultiPattern::name() const
return _name;
}

bool
MultiPattern::process(const String &subject, StringVector &result) const
{
bool res = false;
for (auto p : this->_list) {
if (nullptr != p && p->process(subject, result)) {
res = true;
}
}
return res;
}

/**
* @brief Destructor, deletes all multi-patterns.
*/
2 changes: 2 additions & 0 deletions plugins/cachekey/pattern.h
@@ -85,6 +85,8 @@ class MultiPattern
virtual bool match(const String &subject) const;
const String &name() const;

bool process(const String &subject, StringVector &result) const;

protected:
std::vector<Pattern *> _list; /**< @brief vector which dictates the order of the pattern evaluation. */
String _name; /**< @brief multi-pattern name */