Memory optimization algorithm#1523
Conversation
lockshaw
left a comment
There was a problem hiding this comment.
Reviewed 11 of 17 files at r1, all commit messages.
Reviewable status: 11 of 17 files reviewed, 10 unresolved discussions (waiting on @wmdi)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h line 16 at r1 (raw file):
struct ICostEstimator { virtual CostMetric estimate_cost(OpCostEstimateKey const &) const = 0; virtual CostMetric estimate_cost(TensorSetMovement const &) const = 0;
Why does communication cost return a memory usage?
lib/compiler/include/compiler/machine_mapping/feasible_machine_mapping_result.struct.toml line 16 at r1 (raw file):
[[fields]] name = "cost" type = "::FlexFlow::CostMetric"
Separate out into different result type
lib/compiler/include/compiler/machine_mapping/get_optimal_machine_mapping.h line 23 at r1 (raw file):
MachineSpecification const &resources, MachineMappingConstraints const &constraints, MachineMemoryConstraints const &memory_constraints,
I think for now let's separate the memory optimization algorithm from the existing algorithm. There may be a bit of code duplication, but I don't think it should be too much as they can both use a lot of existing infrastructure, and it is likely to make things a lot more maintainable in the long run.
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml line 2 at r1 (raw file):
namespace = "FlexFlow" name = "MachineMappingConfig"
Remove in favor of separate algorithm implementation
lib/compiler/include/compiler/machine_mapping/machine_mapping_result.h line 27 at r1 (raw file):
¶llel_split_transformation); [[nodiscard]] MachineMappingResult parallel_combine(MachineMappingConfig const &config,
Add operations for a new result type rather than overriding these
lib/compiler/include/compiler/machine_mapping/machine_memory_constraints/machine_memory_constraints.struct.toml line 2 at r1 (raw file):
namespace = "FlexFlow" name = "MachineMemoryConstraints"
Probably better to just make this a field on MachineSpecification
lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 10 at r1 (raw file):
/*memory=*/0, }; }
I'm not really convinced this is all that useful of a function?
lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 36 at r1 (raw file):
for (CostMetric const &cost : costs) { result = combine_cost_metrics_intra_device_sequential(result, cost); }
foldl1 from utils/containers?
Code quote:
CostMetric result = zero_cost_metric();
for (CostMetric const &cost : costs) {
result = combine_cost_metrics_intra_device_sequential(result, cost);
}lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 40 at r1 (raw file):
} CostMetric combine_cost_metrics_intra_device_parallel(CostMetric const &c1,
What does "intra_device" mean here?
Code quote:
intra_devicelib/compiler/src/compiler/cost_estimator/cost_metric.cc line 51 at r1 (raw file):
for (CostMetric const &cost : costs) { result = combine_cost_metrics_intra_device_parallel(result, cost); }
foldl1 from utils/containers?
Code quote:
CostMetric result = zero_cost_metric();
for (CostMetric const &cost : costs) {
result = combine_cost_metrics_intra_device_parallel(result, cost);
}
wmdi
left a comment
There was a problem hiding this comment.
Reviewable status: 2 of 26 files reviewed, 10 unresolved discussions (waiting on @lockshaw)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h line 16 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Why does communication cost return a memory usage?
Done.
lib/compiler/include/compiler/machine_mapping/get_optimal_machine_mapping.h line 23 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I think for now let's separate the memory optimization algorithm from the existing algorithm. There may be a bit of code duplication, but I don't think it should be too much as they can both use a lot of existing infrastructure, and it is likely to make things a lot more maintainable in the long run.
Done.
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml line 2 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove in favor of separate algorithm implementation
Done.
lib/compiler/include/compiler/machine_mapping/machine_memory_constraints/machine_memory_constraints.struct.toml line 2 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Probably better to just make this a field on
MachineSpecification
Done.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## repo-refactor #1523 +/- ##
=================================================
+ Coverage 78.16% 78.41% +0.24%
=================================================
Files 860 866 +6
Lines 27994 28917 +923
Branches 770 784 +14
=================================================
+ Hits 21881 22674 +793
- Misses 6113 6243 +130
Flags with carried forward coverage won't be shown. Click here to find out more.
|
lockshaw
left a comment
There was a problem hiding this comment.
The main thing is some unit testing should be added--currently the test coverage for this PR is 3.5%.
Reviewed 2 of 17 files at r1, 22 of 22 files at r2, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @wmdi)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h line 18 at r2 (raw file):
virtual float estimate_cost(TensorSetMovement const &) const = 0; virtual CostMetric estimate_cost_with_memory(OpCostEstimateKey const &) const = 0;
I don't think we need another function here--communication shouldn't return a memory usage, but returning memory usage for an operator is fine.
it would be nice to get a more descriptive name than CostMetric, maybe something like OperatorCostMetrics to specify it's just for operators and it contains more than one "metric"?
Suggestion:
virtual CostMetric estimate_cost(OpCostEstimateKey const &) const = 0;
virtual float estimate_cost(TensorSetMovement const &) const = 0;lib/compiler/include/compiler/cost_estimator/cost_estimator.h line 31 at r2 (raw file):
float estimate_cost(OpCostEstimateKey const &k) const; float estimate_cost(TensorSetMovement const &m) const; CostMetric estimate_cost_with_memory(OpCostEstimateKey const &k) const;
Suggestion:
CostMetric estimate_cost(OpCostEstimateKey const &k) const;
float estimate_cost(TensorSetMovement const &m) const;lib/compiler/include/compiler/cost_estimator/cost_metric.h line 7 at r2 (raw file):
#include <vector> namespace FlexFlow {
These operations feel like something algorithm-specific rather than relevant to cost_estimator--I have no problem with them existing, but maybe the file should be somewhere else? Would the new machine_mapping_with_memory directory be appropriate?
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml line 2 at r1 (raw file):
Previously, wmdi (Mengdi Wu) wrote…
Done.
This file still seems to be here? Is it still necessary?
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.struct.toml line 0 at r2 (raw file):
Rename to machine_mapping_with_memory_cache.struct.toml
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.struct.toml line 2 at r2 (raw file):
namespace = "FlexFlow" name = "MachineMappingCacheWithMemory"
Suggestion:
name = "MachineMappingWithMemoryCache"lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.struct.toml line 2 at r2 (raw file):
namespace = "FlexFlow" name = "MachineMappingResultWithMemory"
Suggestion:
ame = "MachineMappingWithMemoryResult"lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.struct.toml line 20 at r2 (raw file):
[[fields]] name = "machine_mappings" type = "std::unordered_set<::FlexFlow::SingleMachineMapping>"
What does this set represent? What is the meaning behind a SingleMachineMapping--is that a machine mapping for a single layer, or a single pareto-optimal solution, or something else entirely? SingleMachineMapping likely needs a clearer name.
Code quote:
type = "std::unordered_set<::FlexFlow::SingleMachineMapping>"lib/compiler/src/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.cc line 12 at r2 (raw file):
std::optional<MachineMappingResultWithMemory> machine_mapping_cache_with_memory_load(
For all the functions here
Suggestion:
machine_mapping_with_memory_cache_load(lib/compiler/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 26 at r2 (raw file):
} MachineMappingResultWithMemory remove_non_dominating_machine_mapping_result(
Suggestion:
remove_non_pareto_optimal_machine_mapping_resultslib/compiler/test/src/compiler/machine_mapping/cost_estimator_for_test.h line 32 at r2 (raw file):
CostMetric estimate_cost_with_memory(OpCostEstimateKey const &) const override; };
Suggestion:
struct TestCostEstimator : public ICostEstimator {
std::function<CostMetric(OpCostEstimateKey const &)> get_operator_cost;
std::function<float(TensorSetMovement const &)> get_communication_cost;
TestCostEstimator() = delete;
TestCostEstimator(decltype(get_operator_cost) const &get_operator_cost,
decltype(get_communication_cost)
const &get_communication_cost,
decltype(get_operator_cost_with_memory)
const &get_operator_cost_with_memory);
CostMetric estimate_cost(OpCostEstimateKey const &) const override;
float estimate_cost(TensorSetMovement const &) const override;
};lib/compiler/test/src/compiler/machine_mapping/cost_estimator_for_test.h line 54 at r2 (raw file):
std::unordered_map<TensorSetMovement, float> const &comm_cost_map, std::unordered_map<OpCostEstimateKey, CostMetric> const &op_cost_with_memory_map);
Suggestion:
CostEstimator make_fake_cost_estimator(
std::function<CostMetric(OpCostEstimateKey const &)> const &get_operator_cost,
std::function<float(TensorSetMovement const &)> const
&get_communication_cost);
CostEstimator make_fake_cost_estimator(
std::unordered_map<OpCostEstimateKey, CostMetric> const &op_cost_map,
std::unordered_map<TensorSetMovement, float> const &comm_cost_map);
wmdi
left a comment
There was a problem hiding this comment.
Reviewable status: 26 of 28 files reviewed, 16 unresolved discussions (waiting on @lockshaw)
lib/compiler/include/compiler/cost_estimator/cost_estimator.h line 18 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I don't think we need another function here--communication shouldn't return a memory usage, but returning memory usage for an operator is fine.
it would be nice to get a more descriptive name than
CostMetric, maybe something likeOperatorCostMetricsto specify it's just for operators and it contains more than one "metric"?
Done.
lib/compiler/include/compiler/cost_estimator/cost_metric.h line 7 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
These operations feel like something algorithm-specific rather than relevant to cost_estimator--I have no problem with them existing, but maybe the file should be somewhere else? Would the new
machine_mapping_with_memorydirectory be appropriate?
Done.
lib/compiler/include/compiler/machine_mapping/machine_mapping_config.struct.toml line 2 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
This file still seems to be here? Is it still necessary?
Done.
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.struct.toml line at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Rename to
machine_mapping_with_memory_cache.struct.toml
Done.
lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 10 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
I'm not really convinced this is all that useful of a function?
Done.
lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 36 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
foldl1fromutils/containers?
Done.
lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 40 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What does "intra_device" mean here?
Done.
lib/compiler/src/compiler/cost_estimator/cost_metric.cc line 51 at r1 (raw file):
Previously, lockshaw (Colin Unger) wrote…
foldl1fromutils/containers?
Done.
lib/compiler/src/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.cc line 12 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
For all the functions here
Done.
lib/compiler/include/compiler/cost_estimator/cost_estimator.h line 31 at r2 (raw file):
float estimate_cost(OpCostEstimateKey const &k) const; float estimate_cost(TensorSetMovement const &m) const; CostMetric estimate_cost_with_memory(OpCostEstimateKey const &k) const;
Done.
lib/compiler/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 26 at r2 (raw file):
} MachineMappingResultWithMemory remove_non_dominating_machine_mapping_result(
Done.
lib/compiler/test/src/compiler/machine_mapping/cost_estimator_for_test.h line 32 at r2 (raw file):
CostMetric estimate_cost_with_memory(OpCostEstimateKey const &) const override; };
Done.
lib/compiler/test/src/compiler/machine_mapping/cost_estimator_for_test.h line 54 at r2 (raw file):
std::unordered_map<TensorSetMovement, float> const &comm_cost_map, std::unordered_map<OpCostEstimateKey, CostMetric> const &op_cost_with_memory_map);
Done.
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_cache_with_memory.struct.toml line 2 at r2 (raw file):
namespace = "FlexFlow" name = "MachineMappingCacheWithMemory"
Done.
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.struct.toml line 2 at r2 (raw file):
namespace = "FlexFlow" name = "MachineMappingResultWithMemory"
Done.
lockshaw
left a comment
There was a problem hiding this comment.
Reviewed 23 of 23 files at r4, all commit messages.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @wmdi)
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/get_optimal_machine_mapping_with_memory.cc line 215 at r4 (raw file):
MachineMappingWithMemoryResult correct = MachineMappingWithMemoryResult{{ SingleMachineMapping{ OpCostMetrics{1.0 + 2.0 + 0.1, 2 + 3},
Argument name comments would be nice here
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 11 at r3 (raw file):
MachineView machine_view_0 = MachineView{ /*start=*/MachineSpaceCoordinate{ /*node_idx=*/0,
@Marsella8 What do you think of making these node_idx and device_idx wrapper types to avoid construction errors? Would there be anything wrong with that?
Code quote:
/*node_idx=*/0,lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 57 at r3 (raw file):
2.0, 2, };
Argument name comments please, it's not immediately obvious what these parameters are
Code quote:
CostMetric cost1 = CostMetric{
2.0,
2,
};lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 115 at r3 (raw file):
SUBCASE("no non-dominating") { MachineMappingResultWithMemory to_remove = MachineMappingResultWithMemory{
With the name to_remove it makes it sounds like all of these should be removed, but instead I think they're the container of elements to be filtered?
Suggestion:
MachineMappingResultWithMemory input = MachineMappingResultWithMemory{lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 147 at r4 (raw file):
}}; printf("Before constructing cost_estimator\n");
Remove print
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 150 at r4 (raw file):
auto map1 = std::unordered_map<OpCostEstimateKey, OpCostMetrics>{{ {map_unmapped_op_cost_estimate_key(k1, mv1), OpCostMetrics(1.0, 0)},
Argument name comments here would be nice
Code quote:
OpCostMetrics(1.0, 0)},lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 150 at r4 (raw file):
auto map1 = std::unordered_map<OpCostEstimateKey, OpCostMetrics>{{ {map_unmapped_op_cost_estimate_key(k1, mv1), OpCostMetrics(1.0, 0)},
Prefer bracket initialization
Suggestion:
{map_unmapped_op_cost_estimate_key(k1, mv1), OpCostMetrics{1.0, 0}},lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 156 at r4 (raw file):
}}; printf("After constructing map1\n");
Remove prints 🙂
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 172 at r4 (raw file):
}}); // CostEstimator cost_estimator = make_fake_cost_estimator(
Delete?
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 193 at r4 (raw file):
// }}); printf("After constructing cost_estimator\n");
Remove print
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 200 at r4 (raw file):
}; printf("After constructing context\n");
Remove print
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 114 at r4 (raw file):
} SUBCASE("no non-pareto_optimal") {
Suggestion:
SUBCASE("all solutions are pareto-optimal") {lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 128 at r4 (raw file):
} SUBCASE("non-pareto_optimal") {
Suggestion:
SUBCASE("a solution is non-pareto optimal") {lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 149 at r4 (raw file):
} TEST_CASE("series_combine(memory)") {
Would be nice to have the actual function signature here instead of just a (memory) suffix to make it easier to match up with the corresponding function
Code quote:
TEST_CASE("series_combine(memory)") {lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 340 at r4 (raw file):
} TEST_CASE("parallel_combine(memory)") {
Would be nicer to have the actual function signature to make it easier to match up with the corresponding function
Marsella8
left a comment
There was a problem hiding this comment.
Reviewable status: all files reviewed, 16 unresolved discussions (waiting on @lockshaw and @wmdi)
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 11 at r3 (raw file):
Previously, lockshaw (Colin Unger) wrote…
@Marsella8 What do you think of making these
node_idxanddevice_idxwrapper types to avoid construction errors? Would there be anything wrong with that?
Agree that they should be wrapped. Perhaps we should also do the same for MachineSpecification? (e.g. see the current tests for machine_view, where we have do annotate all the arguments to make it readable).
wmdi
left a comment
There was a problem hiding this comment.
Reviewable status: 26 of 29 files reviewed, 16 unresolved discussions (waiting on @lockshaw)
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 147 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove print
Done.
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 150 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Argument name comments here would be nice
Done.
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 150 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Prefer bracket initialization
Done.
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 156 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove prints 🙂
Done.
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 172 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Delete?
Done.
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 193 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove print
Done.
lib/compiler/test/src/compiler/machine_mapping/get_optimal_machine_mapping.cc line 200 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Remove print
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/get_optimal_machine_mapping_with_memory.cc line 215 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Argument name comments would be nice here
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 57 at r3 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Argument name comments please, it's not immediately obvious what these parameters are
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 115 at r3 (raw file):
Previously, lockshaw (Colin Unger) wrote…
With the name
to_removeit makes it sounds like all of these should be removed, but instead I think they're the container of elements to be filtered?
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 149 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Would be nice to have the actual function signature here instead of just a
(memory)suffix to make it easier to match up with the corresponding function
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 340 at r4 (raw file):
Previously, lockshaw (Colin Unger) wrote…
Would be nicer to have the actual function signature to make it easier to match up with the corresponding function
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 114 at r4 (raw file):
} SUBCASE("no non-pareto_optimal") {
Done.
lib/compiler/test/src/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.cc line 128 at r4 (raw file):
} SUBCASE("non-pareto_optimal") {
Done.
lockshaw
left a comment
There was a problem hiding this comment.
Reviewed 1 of 3 files at r5, 5 of 5 files at r6, all commit messages.
Reviewable status:complete! all files reviewed, all discussions resolved (waiting on @wmdi)
wmdi
left a comment
There was a problem hiding this comment.
Reviewable status: 28 of 29 files reviewed, all discussions resolved
lib/compiler/include/compiler/machine_mapping/memory_optimization/machine_mapping_result_with_memory.struct.toml line 20 at r2 (raw file):
Previously, lockshaw (Colin Unger) wrote…
What does this set represent? What is the meaning behind a
SingleMachineMapping--is that a machine mapping for a single layer, or a single pareto-optimal solution, or something else entirely?SingleMachineMappinglikely needs a clearer name.
Done.
Description of changes:
Related Issues:
Linked Issues:
Issues closed by this PR:
This change is