Skip to content

add maxmind country db support#31147

Closed
hwyuan wants to merge 2 commits intoenvoyproxy:mainfrom
hwyuan:add-maxmind-country-db
Closed

add maxmind country db support#31147
hwyuan wants to merge 2 commits intoenvoyproxy:mainfrom
hwyuan:add-maxmind-country-db

Conversation

@hwyuan
Copy link
Copy Markdown
Contributor

@hwyuan hwyuan commented Dec 2, 2023

Commit Message: add maxmind country db support
Additional Description:
Support maxmind country db in geoip maxmind provider.
Since both country db and city db can populate country info, we allow at most one of country db and city db can be configured to avoid ambiguity.
Risk Level: Low
Testing: unit test
Docs Changes: TODO
Release Notes: TODO
Platform Specific Features:
[Optional Fixes #Issue] #30956

@repokitteh-read-only
Copy link
Copy Markdown

Hi @hwyuan, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #31147 was opened by hwyuan.

see: more, trace.

@repokitteh-read-only
Copy link
Copy Markdown

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #31147 was opened by hwyuan.

see: more, trace.

@repokitteh-read-only
Copy link
Copy Markdown

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to (api/envoy/|docs/root/api-docs/).
envoyproxy/api-shepherds assignee is @lizan
CC @envoyproxy/api-watchers: FYI only for changes made to (api/envoy/|docs/root/api-docs/).

🐱

Caused by: #31147 was opened by hwyuan.

see: more, trace.

@hwyuan hwyuan force-pushed the add-maxmind-country-db branch 2 times, most recently from ab7875c to f0505e0 Compare December 5, 2023 06:54
@hwyuan
Copy link
Copy Markdown
Contributor Author

hwyuan commented Dec 5, 2023

/retest

@hwyuan
Copy link
Copy Markdown
Contributor Author

hwyuan commented Dec 5, 2023

cc @JuniorHsu

Copy link
Copy Markdown
Contributor

@JuniorHsu JuniorHsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also needs changelogs/current.yaml change

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by seeing this comment, should we do oneof in proto so we only allow either country or city map?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

makes sense, let me update the PR to try oneof

Copy link
Copy Markdown
Contributor

@zhxie zhxie Dec 7, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An oneof change of the existing non-oneof field may break the backward compatibility in ProtoBuf, you should not change it and add configuration time exception to notice the user setting both country and city databases is not valid.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, let's throw Exception in the constructor of GeoipProviderConfig

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we are adding a new field country_db_path_ in this PR, and users will only be able to pick country_db_path_ or city_db_path after this change, i.e., currently they would either configure city_db_path_ or not using it, would we still have backward compatibility issues of we can adopt oneof directly in proto definition in this PR?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It does not depend on how you use the field. The ProtoBuf does not support oneof backward compatibility and it may cause certain issues. Please refer to the ProtoBuf doc and Envoy API style (Prefer multiple fields with defined precedence over boolean overloads of fields or oneof).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#30851 This is a pretty recent policy change for discouraging oneof. Thanks for the pointer.

As my understanding, some config generation tools have different setup with oneof and non-oneof so moving in/out oneof might causing compatibility issue as well.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

got it, thank you both for the context. will update the PR to add the config time exception

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this function has lots of similarity with lookupInCityDb. could we refactor to achieve DRY if it's not too hard?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are indeed similarities in these lookup function and ideally be better abstracted, wonder if we could do the refacor in a seperate PR so that this PR focuses on adding country db support

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll defer to code owner @nezdolik @ravenblackx

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there is already similar duplicated code prior to this PR and PR adds more duplication, is good opportunity to tackle refactor here.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let me give it a try, was worried that refactoring here might make this PR complicated to review, maybe could discuss more once there is a draft commit

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

seems one possible approach is to have a common lookupInMaxmindDb function, and then we pass the db_type and maxmind_db database pointer to this function, where lookupInMaxmindDb will call related isLookupEnableForDbType and populateGeoLookupResultForDbType functions, where db-specific logics are handled.

What do you think about the approach? also still feels like might be better to do in a separate PR to it's cleaner and easier to review (and to revert if ever needed) as other db lookup functions will be touched, thoughts? @nezdolik

pseudocode

void GeoipProvider::lookupInMaxmindDb(
    const Network::Address::InstanceConstSharedPtr& remote_address,
    absl::flat_hash_map<std::string, std::string>& lookup_result,
    const std::string& db_type /* we can make db_type a enum */,
    const std::string& stat_prefix /* e.g. city for city_db */,
    const MaxmindDbPtr& maxmind_db /* passing db pointer */,
    ) const {
  if (isLookupEnableForDbType(db_type))
    ASSERT(maxmind_db, fmt.format("Maxmind {} database is not initialised for performing lookups", stat_prefix));
    int mmdb_error;
    const uint32_t n_prev_hits = lookup_result.size();
    MMDB_lookup_result_s mmdb_lookup_result = MMDB_lookup_sockaddr(
        maxmind_db.get(), reinterpret_cast<const sockaddr*>(remote_address->sockAddr()), &mmdb_error);
    if (!mmdb_error) {
      MMDB_entry_data_list_s* entry_data_list;
      int status = MMDB_get_entry_data_list(&mmdb_lookup_result.entry, &entry_data_list);
      if (status == MMDB_SUCCESS) {
        // populate headers
        populateGeoLookupResultForDbType(db_type)
        if (lookup_result.size() > n_prev_hits) {
          config_->incHit(fmt.format("{}_db", stat_prefix));
        }
        MMDB_free_entry_data_list(entry_data_list);
      }
    } else {
      config_->incLookupError(fmt.format("{}_db", stat_prefix));
    }
    config_->incTotal(fmt.format("{}_db", stat_prefix));
  }
}

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

drafted an example PR in #31490, please take a look when you have a chance

@hwyuan hwyuan force-pushed the add-maxmind-country-db branch 2 times, most recently from 8722b63 to 68d7960 Compare December 8, 2023 09:12
@hwyuan hwyuan changed the title WIP: add maxmind country db support add maxmind country db support Dec 8, 2023
@hwyuan hwyuan marked this pull request as ready for review December 8, 2023 17:30
Copy link
Copy Markdown
Contributor

@JuniorHsu JuniorHsu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done the first pass modulo nits and questions. Over to owner/maintainer's review @nezdolik @ravenblackx

Comment thread source/extensions/geoip_providers/maxmind/geoip_provider.cc Outdated
Comment thread source/extensions/geoip_providers/maxmind/geoip_provider.cc Outdated
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i'll defer to code owner @nezdolik @ravenblackx

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we want to dup this warning to city_db_path as well. @nezdolik @ravenblackx thought?

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this field in the end of the proto message, as it is the most recently added.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is fine to have warning only here, given that there will validation error when user sets both fields (so they should read docs for both fields).

@hwyuan hwyuan force-pushed the add-maxmind-country-db branch 3 times, most recently from 5399e8d to 042ef2b Compare December 9, 2023 07:59
@hwyuan
Copy link
Copy Markdown
Contributor Author

hwyuan commented Dec 9, 2023

/retest

@hwyuan hwyuan force-pushed the add-maxmind-country-db branch from 042ef2b to 006b66d Compare December 9, 2023 19:25
Signed-off-by: Haowei Yuan <hyuan@wustl.edu>
@hwyuan hwyuan force-pushed the add-maxmind-country-db branch from 006b66d to 8e12495 Compare December 12, 2023 06:41
Copy link
Copy Markdown
Member

@nezdolik nezdolik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at Maxmind comparison of city and country databases (in the bottom of the page), country one contains subset of fields from city, so there will be no case when someone needs both databases to be configured.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: move this field in the end of the proto message, as it is the most recently added.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is fine to have warning only here, given that there will validation error when user sets both fields (so they should read docs for both fields).

@@ -23,7 +23,13 @@ option (xds.annotations.v3.file_status).work_in_progress = true;
// :ref:`anon_db_path <envoy_v3_api_field_extensions.geoip_providers.maxmind.v3.MaxMindConfig.anon_db_path>` must be configured.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also update here?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, does it make sense to move the warning of "at most one of country_db_path and city_db_path can be configured" here so that these instructions are together? and remove it from line 30 below

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that there is already similar duplicated code prior to this PR and PR adds more duplication, is good opportunity to tackle refactor here.

EXPECT_DEATH(initializeProvider(config_yaml), ".*Unable to open Maxmind database file.*");
}

TEST_F(GeoipProviderTest, ValidConfigCountryDbSuccessfulLookup) {
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add tests to geoip_filter_integration_test.cc as well?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure, will do

message MaxMindConfig {
// Full file path to the Maxmind country database, e.g. /etc/GeoLite2-Country.mmdb.
// Database file is expected to have .mmdb extension.
// Note that only one of country_db_path and city_db_path can be set.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add reference to proto fields, e.g. :ref:city_db_path <envoy_v3_api_field_extensions.geoip_providers.maxmind.v3.MaxMindConfig.city_db_path>

Signed-off-by: Haowei Yuan <hyuan@wustl.edu>
Copy link
Copy Markdown
Member

@lizan lizan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm api

@github-actions
Copy link
Copy Markdown

This pull request has been automatically marked as stale because it has not had activity in the last 30 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot added the stale stalebot believes this issue/PR has not been touched recently label Jan 21, 2024
@github-actions
Copy link
Copy Markdown

This pull request has been automatically closed because it has not had activity in the last 37 days. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@github-actions github-actions Bot closed this Jan 28, 2024
@zetaab
Copy link
Copy Markdown

zetaab commented Oct 8, 2024

@nezdolik @lizan @hwyuan could this PR be reopened? At least I see that this is useful feature and actually it was already lgtm by @lizan. So I do not see why it was closed?

@nezdolik
Copy link
Copy Markdown
Member

nezdolik commented Oct 9, 2024

it did require more work on pr to be done

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

stale stalebot believes this issue/PR has not been touched recently waiting

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants