Refactor the autoschedulers to their own directory. by alexreinking · Pull Request #5228 · halide/Halide

alexreinking · 2020-08-31T11:37:27Z

This PR adjusts all three existing autoschedulers, including Mullapudi2016, to be standalone. It reorganizes them into a src/autoschedulers folder and also installs them into the release packages.

The autoschedulers are also built with hidden symbols by default, so the HALIDE_EXPORT macro is now exposed and GCC compatible.
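For readers unfamiliar with the pattern, a symbol-visibility macro of this kind commonly looks something like the sketch below. The names DEMO_EXPORT, DEMO_BUILDING_LIBRARY, demo_answer, and demo_internal_helper are hypothetical stand-ins; this is not the actual definition of HALIDE_EXPORT, only the usual idiom for making one annotation work with both MSVC and GCC/Clang when symbols are hidden by default (for example with -fvisibility=hidden).

```cpp
// Hypothetical export macro; NOT Halide's actual HALIDE_EXPORT definition.
#define DEMO_BUILDING_LIBRARY  // pretend this translation unit is part of the library

#if defined(_MSC_VER)
// MSVC: export when building the DLL, import when consuming it.
#ifdef DEMO_BUILDING_LIBRARY
#define DEMO_EXPORT __declspec(dllexport)
#else
#define DEMO_EXPORT __declspec(dllimport)
#endif
#elif defined(__GNUC__)
// GCC/Clang: override a hidden default visibility for this one symbol.
#define DEMO_EXPORT __attribute__((visibility("default")))
#else
#define DEMO_EXPORT
#endif

// No annotation: stays hidden when the library is built with hidden visibility.
int demo_internal_helper() {
    return 41;
}

// Annotated: remains visible to users of the shared library.
DEMO_EXPORT int demo_answer() {
    return demo_internal_helper() + 1;
}
```

Only the annotated function would appear in the shared library's exported symbol table under a hidden-by-default build; the helper would remain an implementation detail.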

The add_halide_library function now adds the "auto_schedule=true" parameter to the start of the list when the AUTOSCHEDULER argument is passed. This improves ergonomics somewhat now that there is no default autoscheduler baked into the library.

All the autoscheduler tests now take which autoscheduler to use as their first argument. This will enable us (later) to test every autoscheduler against the existing tests.
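A minimal sketch of what "autoscheduler as first argument" could look like in a test is shown below. The helper name autoscheduler_plugin_from_args is hypothetical, and the commented-out main() assumes Halide's load_plugin is the call used to load the library; the actual tests in this PR may be structured differently.

```cpp
#include <string>

// Returns the plugin path passed as the first command-line argument,
// or an empty string if no argument was supplied.
std::string autoscheduler_plugin_from_args(int argc, char **argv) {
    if (argc < 2 || argv[1] == nullptr) {
        return "";
    }
    return argv[1];
}

// A test's main() might then begin roughly like this (assumption, not
// copied from the PR):
//
//   int main(int argc, char **argv) {
//       const std::string plugin = autoscheduler_plugin_from_args(argc, argv);
//       if (plugin.empty()) {
//           fprintf(stderr, "Usage: %s <autoscheduler-lib>\n", argv[0]);
//           return 1;
//       }
//       Halide::load_plugin(plugin);  // requires Halide.h
//       // ... build and autoschedule the pipeline under test ...
//   }
```

Because the plugin is chosen on the command line, the build system could register the same test binary once per autoscheduler.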

Fixes #4341
Fixes #4349
Fixes #4053

abadams · 2020-08-31T16:07:10Z

Can you elaborate on the weak linkage thing? I didn't think that was the case. My understanding was that the plugins rely on having their static constructors called at load time, and need to use symbols from the parent library, and that both of these things were normal on Windows.

steven-johnson

General direction definitely LGTM; obviously we'll need to get this working on non-Windows too. (I guess we're gonna need to add Makefile support too, alas.)

compiling Li2018 against shared Halide results in a ~130KB binary and compiling it against static Halide results in a 20MB binary

I wonder if there might be hidden issues with compiling against a non-shared libHalide; Halide uses globals in various ways and it doesn't seem impossible that having non-shared sets of these could cause amusing issues (e.g. when creating unique names).

src/Pipeline.cpp

steven-johnson · 2020-08-31T17:38:22Z

src/Util.cpp

                   << err_msg << "\n";
    }
+
+    FARPROC as_name = GetProcAddress(library, "AutoschedulerName");


I don't know why I should be surprised that the legacy of NEAR/FAR ptrs on x86 lingers on to this day, but somehow I am

src/plugins/CMakeLists.txt

steven-johnson · 2020-08-31T17:42:56Z

src/plugins/autoschedulers/mullapudi2016/AutoSchedule.cpp

@@ -1,23 +1,14 @@
+#include "halide_mullapudi2016_export.h"


Where/how do these _export.h files get created? What do they contain?

steven-johnson · 2020-08-31T17:43:20Z

src/plugins/autoschedulers/mullapudi2016/AutoSchedule.cpp

-
-namespace Halide {
-namespace Internal {
+#include <Halide.h>


Pretty sure that it should always be "Halide.h" and never <Halide.h>

alexreinking · 2020-08-31T19:13:59Z

Can you elaborate on the weak linkage thing? I didn't think that was the case. My understanding was that the plugins rely on having their static constructors called at load time, and need to use symbols from the parent library, and that both of these things were normal on Windows.

If a library is shared, all platforms behave similarly at runtime. At build time, the plugin and the executable both link directly to the same shared library. The library gets loaded when the application loads. Then when the plugin gets loaded, its dependency on the library gets resolved to the previously loaded copy. This is the only no-compromises, cross-platform way for loadable plugins to work.

Referencing symbols from a parent executable is totally different. Because the Windows loader has absolutely no form of dynamic lookup, a plugin referencing symbols in an executable must still link to a .lib import library at build time. CMake makes this convenient using its ENABLE_EXPORTS feature, which generates the import library for you. Then you still have to link your plugin to the executable. It papers over differences on other platforms, too. On macOS, it's equivalent to linking directly to the executable using -bundle_loader. On Linux, it does nothing special at all because Linux's loader is like a honey badger...

But now we can see why trying to reference symbols from a parent static library is unportable. On Windows, the static library just gets completely absorbed into the parent application. Every application has its symbols at different positions, so there's no universal .lib to which we could link a universally compatible plugin (remember, no dynamic lookup). Linux and macOS each have their own special dynamic lookup implementations, so you build the plugin with platform-dependent flags (resp. -rdynamic and -undefined dynamic_lookup) and then build the executable such that it exports its own symbols.

I wonder if there might be hidden issues with compiling against a non-shared libHalide; Halide uses globals in various ways and it doesn't seem impossible that having non-shared sets of these could cause amusing issues (e.g. when creating unique names).

There probably are. The other consequence of no dynamic symbol lookup is that we can't rely on static initialization to call Pipeline::add_autoscheduler because the static pipeline globals on Windows are indeed duplicated between the executable and the plugin. The executable therefore has to look for known entry points, here these are AutoschedulerName and AutoschedulerRun. Even on macOS/Linux, relying on static initialization is awkward because we require applications to link with -whole-archive.
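The named-entry-point approach described here can be sketched roughly as follows. Only the symbol names AutoschedulerName and AutoschedulerRun come from this thread; the function signatures, the LoadedAutoscheduler struct, and the SymbolLookup abstraction are assumptions made for illustration. In a real host, the lookup callback would wrap GetProcAddress on Windows or dlsym elsewhere.

```cpp
#include <functional>
#include <string>

// Hypothetical signatures; the real entry points' types are not shown here.
using NameFn = const char *(*)();
using RunFn = int (*)(const char *pipeline_name);

struct LoadedAutoscheduler {
    std::string name;
    RunFn run = nullptr;
};

// Platform-neutral symbol lookup. A real implementation would call
// GetProcAddress(handle, sym) or dlsym(handle, sym) on an opened library.
using SymbolLookup = std::function<void *(const char *)>;

// Returns true and fills `out` only if both entry points are present.
// The host records the result in ITS OWN registry, so nothing depends on
// globals that might be duplicated inside the plugin.
bool bind_entry_points(const SymbolLookup &lookup, LoadedAutoscheduler *out) {
    void *name_sym = lookup("AutoschedulerName");
    void *run_sym = lookup("AutoschedulerRun");
    if (name_sym == nullptr || run_sym == nullptr) {
        return false;
    }
    NameFn name_fn = reinterpret_cast<NameFn>(name_sym);
    out->name = name_fn();
    out->run = reinterpret_cast<RunFn>(run_sym);
    return true;
}
```

The design point is that the executable pulls information out of the plugin through known symbols, rather than the plugin pushing itself into a registry during static initialization.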

abadams · 2020-08-31T20:41:17Z

Can you elaborate on the weak linkage thing? I didn't think that was the case. My understanding was that the plugins rely on having their static constructors called at load time, and need to use symbols from the parent library, and that both of these things were normal on Windows.

If a library is shared, all platforms behave similarly at runtime. At build time, the plugin and the executable both link directly to the same shared library. The library gets loaded when the application loads. Then when the plugin gets loaded, its dependency on the library gets resolved to the previously loaded copy. This is the only no-compromises, cross-platform way for loadable plugins to work.

I thought this is what we were already doing.

Referencing symbols from a parent executable is totally different. Because the Windows loader has absolutely no form of dynamic lookup, a plugin referencing symbols in an executable must still link to a .lib import library at build time. CMake makes this convenient using its ENABLE_EXPORTS feature, which generates the import library for you. Then you still have to link your plugin to the executable. It papers over differences on other platforms, too. On macOS, it's equivalent to linking directly to the executable using -bundle_loader. On Linux, it does nothing special at all because Linux's loader is like a honey badger...

But now we can see why trying to reference symbols from a parent static library is unportable. On Windows, the static library just gets completely absorbed into the parent application. Every application has its symbols at different positions, so there's no universal .lib to which we could link a universally compatible plugin (remember, no dynamic lookup). Linux and macOS each have their own special dynamic lookup implementations, so you build the plugin with platform-dependent flags (resp. -rdynamic and -undefined dynamic_lookup) and then build the executable such that it exports its own symbols.

If this works, it's a side-effect of the current design, not intentional. You should not be statically linking libHalide into a generator binary - that makes every generator binary huge.

I wonder if there might be hidden issues with compiling against a non-shared libHalide; Halide uses globals in various ways and it doesn't seem impossible that having non-shared sets of these could cause amusing issues (e.g. when creating unique names).

There probably are. The other consequence of no dynamic symbol lookup is that we can't rely on static initialization to call Pipeline::add_autoscheduler because the static pipeline globals on Windows are indeed duplicated between the executable and the plugin. The executable therefore has to look for known entry points, here these are AutoschedulerName and AutoschedulerRun. Even on macOS/Linux, relying on static initialization is awkward because we require applications to link with -whole-archive.

I don't understand how the static initialization could be duplicated. The only translation unit it exists in is one that went into the plugin. That code is hidden inside a .cpp file inside the autoscheduler.

abadams · 2020-08-31T20:43:27Z

Oh you're saying the static in Pipeline::get_autoscheduler_map will be duplicated, not the static initializer itself.

So there are multiple ways in which things can't work if we statically link libHalide, but the current design seems like it works fine with a shared libHalide. So why redesign?

EDIT: Also that's not the only static hidden inside libHalide that the plugin needs, so just changing things to avoid that one usage isn't going to make a static libHalide work with a plugin.
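To make the distinction concrete, here is a simplified sketch of the static-initializer registration pattern under discussion. The free functions get_autoscheduler_map and add_autoscheduler mirror the names mentioned in this thread but are stand-ins, not Halide's actual Pipeline members, and the AutoschedulerFn type and "Demo" entry are hypothetical.

```cpp
#include <functional>
#include <map>
#include <string>
#include <utility>

using AutoschedulerFn = std::function<void()>;

// Function-local static: constructed on first use, so it is safe to call
// from other static initializers. Every binary that statically links this
// code gets its OWN copy of `registry`, which is the duplication at issue.
std::map<std::string, AutoschedulerFn> &get_autoscheduler_map() {
    static std::map<std::string, AutoschedulerFn> registry;
    return registry;
}

void add_autoscheduler(const std::string &name, AutoschedulerFn fn) {
    get_autoscheduler_map()[name] = std::move(fn);
}

// What a plugin's .cpp file would contain: the constructor of this
// file-local object runs when the library is loaded.
namespace {
struct RegisterDemo {
    RegisterDemo() {
        add_autoscheduler("Demo", [] {});
    }
} register_demo;
}  // namespace
```

With a shared library, the plugin and the application both resolve get_autoscheduler_map to the same function and therefore the same map. If each had its own statically linked copy, the registration would land in a map the application never reads.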

alexreinking · 2020-08-31T21:24:51Z

I thought this is what we were already doing.

It is, for shared libs.

If this works, it's a side-effect of the current design, not intentional. You should not be statically linking libHalide into a generator binary - that makes every generator binary huge.

Then why are we using -rdynamic and -undefined dynamic_lookup? That's the whole point of those flags. If you have app -(shared)-> libHalide <-(shared)- plugin, there's no need for the plugin to look at symbols in app, is there?

Oh you're saying the static in Pipeline::get_autoscheduler_map will be duplicated, not the static initializer itself.

Correct.

So why redesign?

To allow plugins to work with statically linked generator binaries. Or statically linked JIT applications. There's no reason it shouldn't work or we shouldn't allow it.

abadams · 2020-08-31T21:28:21Z

I don't believe it's possible to get statically linked generator binaries to work with plugins. They can use arbitrary statics inside libHalide. (E.g. the ones in IROperator.cpp)

abadams · 2020-08-31T21:34:27Z

As far as I can tell, the makefile doesn't use rdynamic for libHalide. It uses it for the tests to make define_extern work when JITting (because shared libHalide needs to resolve a symbol in the test binary).

abadams · 2020-08-31T21:39:11Z

Maybe the static variant we should support is a static libHalide with static autoschedulers. If you want things to be static, you can link the autoschedulers into the generator binary and still use -s, if not -p.

alexreinking · 2020-08-31T21:40:38Z

Then there are three paths forward:

1. Clarify that the autoschedulers cannot be used or built against static Halide and enforce this in CMake.
2. Fix our use of global state. As far as I can tell from a quick git grep of src/, it's just the unique name counters that conflict. Everything else is either constant (or effectively so).
3. What you said.
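The unique-name conflict mentioned in the second option can be illustrated with a small sketch. The template tag trick below merely simulates two independently linked copies of the same function, each with its own static counter; the unique_name signature and the "$" naming scheme are hypothetical and not taken from Halide's source.

```cpp
#include <string>

// Each template instantiation has its own `counter`, standing in for the
// separate copy of a static that each statically linked binary would get.
template<typename CopyTag>
std::string unique_name(const std::string &prefix) {
    static int counter = 0;
    return prefix + "$" + std::to_string(counter++);
}

struct AppCopy {};     // the copy linked into the application
struct PluginCopy {};  // the copy linked into the plugin
```

Calling unique_name<AppCopy>("f") and unique_name<PluginCopy>("f") once each yields the same string from both, so names that are supposed to be unique within a process can collide when two copies of the library are loaded.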

alexreinking · 2020-08-31T21:41:47Z

Maybe the static variant we should support is a static libHalide with static autoschedulers. If you want things to be static, you can link the autoschedulers into the generator binary and still use -s, if not -p.

This is what I meant by

We could also look into baking plugins in to our library when building statically, like Qt does. But this would be a significantly more invasive change.

above.

abadams · 2020-08-31T21:44:01Z

Ah, I missed that part. I think loading two copies of libHalide is probably a bad idea, no matter what we do with our globals. We should just go with option 1.

abadams · 2020-08-31T21:44:32Z

People are welcome to also statically link the plugin into the generator binary, but we don't have to support that in the cmake build. They can write their own build config if they want to do that.

alexreinking · 2020-09-02T06:00:31Z

This is what I meant by

We could also look into baking plugins in to our library when building statically, like Qt does. But this would be a significantly more invasive change.

above.

Actually, I misunderstood you. See (2) below.

Maybe the static variant we should support is a static libHalide with static autoschedulers. If you want things to be static, you can link the autoschedulers into the generator binary and still use -s, if not -p.

I played around with doing this, and there are two approaches.

1. Do what I think you're suggesting and compile the autoscheduler totally separately, then link it in with the platform-specific whole-archive flag.
2. Do what I was originally suggesting (and now realize is different), which is to roll the autoschedulers directly into libHalide.a. This would require staging the construction of that static library into first building the core of Halide, using those objects to compile the autoscheduler, then merging everything together into the final static lib. Don't create intermediate libraries; just use collections of object files to avoid symbols getting dropped. This is possible with CMake, but not easy.

Assuming we want to support a static scenario, I think (2) is probably worth the effort, since you don't want to be compiling with whole-archive if you can avoid it anyway, and it would certainly be more convenient, linking-wise.

abadams · 2020-09-02T06:03:35Z

I was indeed suggesting 1, because of the tricky issues you raise with 2.

edit to add: but I also think we should just not support static linkage for now and just let people figure it out themselves if they have some weird need for it.

alexreinking · 2020-09-02T06:34:48Z

edit to add: but I also think we should just not support static linkage for now and just let people figure it out themselves if they have some weird need for it.

That's fine by me and I'm currently re-working this PR to use the static initializers and not build the autoschedulers when building Halide as a static library.

alexreinking · 2020-09-09T22:34:46Z

Rebased on master... removing skip_buildbots to verify that CMake is still working everywhere. Still need support updating the Makefiles.

steven-johnson · 2020-09-13T19:14:01Z

space

I deleted some dead stuff to reclaim space. (The last buildbot update renamed some things but didn't delete old stuff.)

alexreinking · 2020-09-15T08:28:35Z

Single failure is due to Hexagon issues upstream. Woot! Please review & we can merge and cut releases

steven-johnson · 2020-09-15T16:03:09Z

We can merge and cut releases

We can merge, but any 'releases' we cut will need to be labeled as 'we know this is broken, do not use long term' until the upstream issue is fixed.

alexreinking · 2020-09-15T19:26:13Z

We can merge and cut releases

We can merge, but any 'releases' we cut will need to be labeled as 'we know this is broken, do not use long term' until the upstream issue is fixed.

Releasing versions 9 and 10 shouldn't be affected

steven-johnson · 2020-09-15T19:28:32Z

Ah, gotcha. (FYI, the buildbots aren't building LLVM9 at all now, and haven't for a few months. We could add that back, but might be simpler to just build those releases locally somewhere.)

alexreinking · 2020-09-15T19:31:42Z

Ah, gotcha. (FYI, the buildbots aren't building LLVM9 at all now, and haven't for a few months. We could add that back, but might be simpler to just build those releases locally somewhere.)

I have everything at my desk except for ARM hardware.

abadams · 2020-09-15T19:44:03Z

I don't think there's a pressing need to release a version 9 if it's a pain.

alexreinking added the code_cleanup No functional changes. Reformatting, reorganizing, or refactoring existing code. label Aug 31, 2020

alexreinking marked this pull request as draft August 31, 2020 11:37

alexreinking added the skip_buildbots Do not run buildbots on this PR. Must add before opening PR as we scan labels immediately. label Aug 31, 2020

alexreinking mentioned this pull request Aug 31, 2020

FR: reorganize the autoscheduler(s) in the codebase #4053

Closed

steven-johnson reviewed Aug 31, 2020

alexreinking force-pushed the refactor/autoschedulers branch from 2f672c8 to dac3d93 Compare September 6, 2020 19:39

alexreinking mentioned this pull request Sep 8, 2020

CMake changes to land ahead of releases #5253

Merged

alexreinking force-pushed the refactor/autoschedulers branch 3 times, most recently from a58deae to 7fe187c Compare September 9, 2020 22:33

alexreinking removed the skip_buildbots Do not run buildbots on this PR. Must add before opening PR as we scan labels immediately. label Sep 9, 2020

alexreinking added 3 commits September 10, 2020 14:15

Move autoscheduler sources into dedicated folders

a8f7acf

Convert plugins to named-entry-point

5101b9d

Setting an autoscheduler now implies you want to use it.

de65a60

abadams added 3 commits September 12, 2020 19:19

bin -> lib in tutorials

26bf23e

Merge branch 'refactor/autoschedulers' of https://github.com/halide/H…

3b45381

…alide into refactor/autoschedulers

Make distrib folder moveable on OS X

a6a560d

alexreinking marked this pull request as ready for review September 13, 2020 03:04

abadams and others added 4 commits September 12, 2020 20:09

Make autoscheduler distro libs moveable

284b845

fix perms

482ab08

Correct permissions on installed autotune_loop.sh

29e3a58

Merge remote-tracking branch 'origin/refactor/autoschedulers' into re…

75be398

…factor/autoschedulers

alexreinking and others added 9 commits September 13, 2020 13:20

Disable autograd test on wasm, like before.

3b3a218

Set correct rpath to begin with instead of patching it

0b4773d

Add autoschedulers to the tests that need them

f97c423

Merge branch 'refactor/autoschedulers' of https://github.com/halide/H…

b871071

…alide into refactor/autoschedulers

package share/Halide path. fix wasm by marking certain libraries to n…

a3c0f0b

…ot cross compile

Fix comments in tutorials.

b90afd0

Makefile rpath fixes for os x

5ceeffb

Merge

40c9dc6

More expedient hackery to handle rpath on OS X issues

2ed3e14

abadams mentioned this pull request Sep 15, 2020

Fix Makefile to not mix dependencies between BIN_DIR and DISTRIB_DIR/lib #5270

Open

steven-johnson approved these changes Sep 15, 2020

steven-johnson merged commit aef3f1b into master Sep 15, 2020

This was referenced Sep 15, 2020

Release Halide 9.0 and Halide 10.0 #5271

Closed

Ensure that inline_all_trivial_functions() isn't dead-stripped #5280

Closed

alexreinking deleted the refactor/autoschedulers branch November 17, 2020 04:37
