Skip to content

Don't include sources for disabled targets#5503

Closed
JanHett wants to merge 2 commits intohalide:masterfrom
JanHett:dont-include-sources-for-disabled-targets
Closed

Don't include sources for disabled targets#5503
JanHett wants to merge 2 commits intohalide:masterfrom
JanHett:dont-include-sources-for-disabled-targets

Conversation

@JanHett
Copy link

@JanHett JanHett commented Dec 1, 2020

When Halide is linked with a version of LLVM that was not built with support for certain targets, compilation fails because of undefined symbols. See: https://gitter.im/halide/Halide?at=5fc54a9b3cd97915c1ac5340.

This MR removes target-specific sources from compilation based on TARGET_XYZ options.

Notes:

  • I could not identify source files for all optional targets. I still left the set() calls in the code since I personally find that clearer than sparse coverage.
  • The target-specific source files I did identify might be incomplete. Please double-check this list.

@abadams
Copy link
Member

abadams commented Dec 1, 2020

I don't think this is the right design, because it moves the complexity into the build system (e.g. this breaks the makefile build). The intent of the current design is that the codegen modules themselves gracefully do nothing if the relevant LLVM support isn't built in. (e.g. see use of WITH_ARM in CodeGen_ARM.cpp). However this isn't tested so it's broken.

A more targeted fix would be to change the INTRINSIC_128B macro in CodeGen_Hexagon.cpp when hexagon is disabled.

@abadams
Copy link
Member

abadams commented Dec 1, 2020

Actually my immediate suggestion is mediocre, because it leaves the door open for similar bugs to creep back in (references to backend-specific LLVM symbols).

Another approach that doesn't require build system complexity would be to wrap everything except for the constructor in CodeGen_Hexagon.h with WITH_HEXAGON, and then do the same for the implementations in CodeGen_Hexagon.cpp (and then do similar things in the other backends).

I don't think it's worth conditionally disabling things like HexagonOptimize.cpp. We only need to worry about files that include LLVM_Headers.h. Conditionally disabling compilation of other parts of libHalide risks introducing undefined internal symbols instead (e.g. if a new Hexagon module gets added that includes HexagonOptimize.h)

@abadams
Copy link
Member

abadams commented Dec 1, 2020

I'm wracking my brains trying to figure out the design that is least likely to permit similar bugs in future (undefined references to llvm), but also least likely to introduce undefined reference from one part of libHalide to another. The real solution is probably to add a buildbot config that just builds an llvm with ~no targets enabled and then builds Halide against that. Then the design matters much less.

Copy link
Member

@alexreinking alexreinking left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I requested some CMake style changes. I also still need to validate the interface library approach for packaging. Andrew's comments are important, too.

# Link optional targets to the main library
##

foreach(OPTIONAL_TARGET ${OPTIONAL_TARGETS})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use modern cmake... foreach (optional_target IN LISTS optional_targets).

// OpenCL, CUDA, OpenGLCompute, and OpenGL last.
// The code is in reverse order to allow later tests to override
// earlier ones.
#ifdef WITH_OPENGL
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably not be skipping the runtime check for the target feature, but instead throwing an error that Halide was not compiled with support for the requested backend.

#include <sstream>

#include "CodeGen_D3D12Compute_Dev.h"
#ifdef WITH_D3D12
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be better to move these into the header guards so that they may be included unconditionally.

#endif
}

user_error << "Unknown target architecture: "
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we should provide a more informative error message. It's not unknown, it was deselected in the build configuration.

Comment on lines +102 to +107
set(AARCH64_SOURCE_FILES
)

set(AMDGPU_SOURCE_FILES
)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need to have this span multiple lines. Better to show that it's meant to be empty via set(AARCH64_SOURCE_FILES "")

# Create optional targets
##

foreach(OPTIONAL_TARGET ${OPTIONAL_TARGETS})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use modern cmake... foreach (optional_target IN LISTS optional_targets).

# List of optional targets
##

set(OPTIONAL_TARGETS
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Where possible, I would like to enforce the convention that ALL_CAPS variables are reserved for cache variables and function arguments. Normal variables like optional_targets here should be lower_snake_case.

Comment on lines +481 to +482
${${OPTIONAL_TARGET}_SOURCE_FILES}
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add ${${OPTIONAL_TARGET}_HEADER_FILES}

target_compile_definitions(Halide_${OPTIONAL_TARGET} INTERFACE
WITH_${OPTIONAL_TARGET})

list(APPEND HEADER_FILES ${${OPTIONAL_TARGET}_HEADER_FILES})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No need for this line (see above)

@JanHett
Copy link
Author

JanHett commented Dec 1, 2020

Thanks for the feedback.

I'll take a crack at conditionally disabling the offending CodeGen's implementations and test how that is for a solution.

Of course I'm happy to make all the requested changes if you think changing the build system is reasonable.

In either case, I think extending the build system to test building against a minimal LLVM is a very good idea and will also incorporate this into the next revision of this MR. I have a version that adds an additional matrix parameter, but I'm not sure yet if this is a good idea.

@abadams
Copy link
Member

abadams commented Dec 1, 2020

Also, thanks for doing this work. We always appreciate bug fix PRs. I should have said that at the outset.

@JanHett
Copy link
Author

JanHett commented Dec 1, 2020

It's a pleasure! Thanks for the appreciation.

After a first attempt, it seems that wrapping the entire header and implementation in an #ifdef isn't quite straightforward since there are a couple of member functions that remain purely virtual in that case. So we could wrap everything except virtual functions and their dependencies (which I personally find to be an uncomfortable halfway point between surgical precision and a sledgehammer) or default back to just wrapping what absolutely needs to be wrapped and potentially providing appropriate error handling. In the latter case I might be out of my depth.

@JanHett
Copy link
Author

JanHett commented Dec 6, 2020

I finally found some time to get back to this. As I mentioned above, simply excluding all of CodeGen_Hexagon except the contructor doesn't compile because some pure virtual functions remain un-overridden.
If I do include those functions, everything works (including my build that was broken before).

I tried this out on another branch, here's the diff: master...JanHett:test-hobbled-llvm
If you prefer this approach, I'll create a PR for it, otherwise I'll fix this one.

The branch above also includes a change to the CI that would build versions of LLVM without any LLVM_TARGETS_TO_BUILD and test against those. Maybe building all permutations of this is not strictly necessary and I also didn't yet have time to test this.

@alexreinking alexreinking added this to the v12.0.0 milestone Dec 11, 2020
@JanHett
Copy link
Author

JanHett commented Dec 13, 2020

Hello @alexreinking and @abadams, did you have a chance to have a look at the diff and decide if that approach is in line with the rest of Halide's build strategy yet?

I started thinking about how to try out the changes to the CI today, but realised that I'd have to set up my own artefact repo to do that. Anyhow, in the process I fixed a few silly bugs I left in the first commit I made.

Would it be possible to test this with the Halide artefact repository (i.e. https://buildbot.halide-lang.org/llvm)?

@abadams
Copy link
Member

abadams commented Dec 16, 2020

Sorry, I've been slammed with other deadlines. I notice that @dsharletg proposed a fix in #5567 as part of his campaign to move some symbols into internal namespaces. Does that fix it for you?

@JanHett
Copy link
Author

JanHett commented Dec 16, 2020

No problem, I understand :).

#5567 fixes the problem for me, yes. Thanks a lot @dsharletg!

@steven-johnson
Copy link
Contributor

Post-holiday check-in: where does this PR stand?

@JanHett
Copy link
Author

JanHett commented Jan 11, 2021

Hello @steven-johnson! As I understand it, this proposed solution is not quite in line with Halide's strategy while #5567 - which also fixes the original issue - is.

If you still want to include the changes I made here, I will absolutely fix the code as @alexreinking requested above.

FWIW I agree that it makes sense to check for issues like this one in the CI. I took a stab at extending the pipelines to this end on another branch: master...JanHett:test-hobbled-llvm (ignore all the #ifdef WITH_HEXAGON stuff).

If Halide maintainers feel like neither of these changes are worth merging, I'm happy to close this PR.

@dsharletg
Copy link
Contributor

Here's what I think we should do on this issue. We can't do what #5567 did to the other targets because CodeGen_GPU_Host needs them as CRTP base classes.

We can fix that by refactoring CodeGen_GPU_Host to be a lowering pass like HexagonOffload.cpp instead. I think this would generally be a great cleanup to do anyways. CodeGen_GPU_Host wouldn't be needed any more. This will be much easier after OpenGL goes away (which is #5626), because OpenGL vertex buffer stuff is the hardest thing to move to a separate lowering pass.

After that is done, we can just do basically the same thing as #5567 to the rest of the CodeGen_* classes.

@steven-johnson
Copy link
Contributor

steven-johnson commented Jan 12, 2021

#5626 has landed -- should we create an issue to do the CodeGen_GPU_Host refactoring?

@steven-johnson
Copy link
Contributor

Is this PR still likely to be useful at all? Sounds like we now have a different approach to pursue.

@dsharletg
Copy link
Contributor

I think we should pursue the CodeGen_GPU_Host refactor instead.

@JanHett, thanks for the PR, sorry to pull the rug out from under it. If you are still interested in working on this, can you please let us know that you are doing so on #5650 ?

@dsharletg dsharletg closed this Jan 25, 2021
@alexreinking alexreinking removed this from the v12.0.0 milestone May 19, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants