-
Notifications
You must be signed in to change notification settings - Fork 14
Description
As @artemp mentioned today over chat, it is important to ensure code is correct before optimizing it. Because, of course, Faster code that gets the wrong answer is not superior to correct code. per https://howardhinnant.github.io/coding_guidelines.html.
#28 indicates a potentially critical bug involving the closest point results, so clearly we should be wary to optimize around the closest point code path until #28 is resolved.
However there are other code paths that:
- Look and behave correctly
- Exhibit potential for optimization
Below are some recommendations to consider for optimization.
Use more appropriate container types
Currently the code is using std::map<std::string, variant_type> to hold feature properties. The most efficient container is the one that is fastest for your purpose. In our case creation and iteration speed are what matters (e.g. we don't search so that does not matter). In my experience std::vector<std::pair<std::string, variant_type>> can be created faster than std::map and therefore is a better container when fast search is not needed. So, I would recommend changing to std::vector<std::pair<std::string, variant_type>>.
Don't risk blocking the main loop
Only the code inside Execute() runs in the threadpool. The code before entering the threadpool and after (HandleOKCallback()) all run on the main event loop and therefore is synchronous. If that synchronous code is slow (does CPU intensive work or triggers many allocations) then that is going to block the main loop and potentially reduce the ability to dispatch work to the threadpool. This problem is also going to be more severe when the code is running in a larger application where other JS is running on the main event loop.
So, I would recommend reverting a175212#diff-d293222b917c4d9edf75c1288538554fL281 and moving the results_to_json_string(result_string_, results_); out of the HandleOKCallback() function and back into the Execute().
@@ -283,6 +283,7 @@ struct Worker : Nan::AsyncWorker {
std::sort(results_.begin(), results_.end(), [](const ResultObject& a, const ResultObject& b) { return a.distance < b.distance; });
// TODO(sam) create new results vector (from results_) of length specific to num_results option
+ results_to_json_string(result_string_, results_);
} catch (const std::exception& e) {
SetErrorMessage(e.what());
@@ -299,7 +300,6 @@ struct Worker : Nan::AsyncWorker {
void HandleOKCallback() override {
Nan::HandleScope scope;
- results_to_json_string(result_string_, results_);
auto const argc = 2u;
v8::Local<v8::Value> argv[argc] = {Reducing unnecessary copies/reallocation
This is especially important when custom objects are being copied because:
- Custom objects may be large and poorly packed (and therefore larger than they would be if packed in a more optimized way).
- Contain other object members that are large and require allocation and deallocation to be copied themselves (like
std::vectororstd::map).
Accidental copies are easy to have happen and a common cause of slow code. They are one of the reasons, in my experience, why it is easy to write C++ that is slower than Javascript (v8 goes to extreme lengths to avoid allocations for you). To write C++ faster than Javascript we need to aim for zero copy code. To achieve that we need to either pass objects by reference or leverage C++11 move semantics.
Okay, enough context, let's dive in...
We need to avoid copying the results passed to the json generator. Passing by reference here makes sense:
- void results_to_json_string(std::string & s, std::vector<ResultObject> results) {
+ void results_to_json_string(std::string & s, std::vector<ResultObject> const& results) {Also the properties arg being passed to the ResultObject constructor is currently being copied. And the ResultObject itself is being allocated, then copied, when it could be constructed in place. To avoid these extra allocations do:
@@ -32,7 +32,7 @@ struct ResultObject {
ResultObject(
mapbox::geometry::point<double> p,
double distance0,
- std::map<std::string, ResultObject::variant_type> props_map,
+ std::map<std::string, ResultObject::variant_type> && props_map,
std::string name,
GeomType geom_type)
: coordinates(p),
@@ -272,8 +272,7 @@ struct Worker : Nan::AsyncWorker {
properties_map.insert(std::pair<std::string, variant_type>(key, value));
}
- ResultObject r(feature_lnglat, meters, properties_map, layer_name, original_geometry_type);
- results_.emplace_back(r);
+ results_.emplace_back(feature_lnglat, meters, std::move(properties_map), layer_name, original_geometry_type);
}
} // end tile.layer.feature loop
} // end tile.layer loopAlso, the individual properties are suffering from an extra copy currently due to the use of insert. Instead we want to use C++11 emplace:
@@ -267,9 +267,9 @@ struct Worker : Nan::AsyncWorker {
using variant_type = ResultObject::variant_type;
ResultObject::properties_type properties_map;
while (auto prop = feature.next_property()) {
- std::string key = std::string{prop.key()};
- variant_type value = vtzero::convert_property_value<variant_type>(prop.value());
- properties_map.insert(std::pair<std::string, variant_type>(key, value));
+ properties_map.emplace(
+ std::string{prop.key()},
+ vtzero::convert_property_value<variant_type>(prop.value()));
}
ResultObject r(feature_lnglat, meters, properties_map, layer_name, original_geometry_type);Also, all this is probably of minor cost compared to the re-allocation cost that is more hidden:
- Reallocation of the
ResultObjects inside thestd::vector<ResultObject>as it grows - Reallocation of the
ResultObjects inside thestd::vector<ResultObject>when it is sorted
We want to avoid both:
- the amount of reallocation
- the cost of reallocation.
The best way to avoid the amount of allocation is to call vector.reserve(<max possible size>). However in our case it is tricky to know <max possible size> because we don't know how many ResultObject we will end up with. This needs some experimentation and may or may not result in faster code. This post has some details on this.
The real win possible in vtquery currently is in reducing the cost of reallocation. The cost is large currently because the ResultObject does not ban copying or implement a move assignment or move constructor. If we were to dig deep in https://www.slideshare.net/ripplelabs/howard-hinnant-accu2014 we'd learn that if we delete the copy constructor and copy assignment operator and define the move assignment or move constructor, then the reallocation happening during std::sort will magically start using move semantics and will avoid the reallocation from copying data around during sort.
This can be done by adding these operators to the ResultObject class:
+ // non-copyable
+ ResultObject(ResultObject const&) = delete;
+ ResultObject& operator=(ResultObject const&) = delete;
+
+ // movable
+ ResultObject(ResultObject&&) = default;
+ ResultObject& operator=(ResultObject&&) = default;Be more lazy
Currently the feature properties are:
- Being pushed into a
std::map - Being decoded into a
variant
These structures require allocations.
Given that:
- All we do with the feature properties is ultimately turn them into JSON objects, and
- When limiting results is implemented (I noticed
num_resultsis currently not respected), we will only return a subset of features and therefore the creation of properties will be wasted.
I think:
- We should be more lazy and not decode into a
variantorstd::mapduring thelayer.next_feature()loop. - Instead we should take a reference to the
data_viewrepresenting thevtzero::feature - After sorting, we can decode only the top N features directly from their
data_viewinto a JSON object withoutstd::maporvariantintermediate objects.
Note: Unless a better solution were devised, this would probably require bringing back the It should be possible to store a std::deque<vtzero::layer> layers; that was removed in 84606a3 (to be able to keep the vtzero::layer referenced by the vtzero::feature alive).data_view representing the property values to avoid needing to keep the layer around (the layer is only needed to get the right property key/value, but once gotten it should be safe to use them without the layer alive?).
/cc @mapbox/core-tech