Skip to content

COMP: removed hardcoded -march=corei7 compilation flag#2391

Closed
seanm wants to merge 1 commit intoInsightSoftwareConsortium:masterfrom
seanm:march-corei7
Closed

COMP: removed hardcoded -march=corei7 compilation flag#2391
seanm wants to merge 1 commit intoInsightSoftwareConsortium:masterfrom
seanm:march-corei7

Conversation

@seanm
Copy link
Copy Markdown
Contributor

@seanm seanm commented Mar 10, 2021

This fixes building for x86_64 from an arm64 Mac.

In my view, a library like ITK should not be assuming what kinds of CPU the resulting executable will be run on. It can't know if the result will only be run on the building machine, or deployed to customers running all kinds of models of computer. Only the programmer building ITK knows this.

PR Checklist

  • No API changes were made (or the changes have been approved)
  • No major design changes were made (or the changes have been approved)
  • Added test (or behavior not changed)
  • Updated API documentation (or API not changed)
  • Added license to new files (if any)
  • Added Python wrapping to new files (if any) as described in ITK Software Guide Section 9.5
  • Added ITK examples for all new major features (if any)

Refer to the ITK Software Guide for
further development details if necessary.

This fixes building for x86_64 from an arm64 Mac.

In my view, a library like ITK should not be assuming what kinds of CPU the resulting executable will be run on. It can't know if the result will only be run on the building machine, or deployed to customers running all kinds of models of computer. Only the programmer building ITK knows this.
@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Mar 10, 2021

@hjmjohnson @imikejackson @dzenanz @thewtex @SimonRit @rmukh this fixes building for Rosetta on an arm64 Mac. See also discussion in #2250

Philosophically, I think it's wrong for a library like ITK to hardcode such a flag anyway, but I suspect some may disagree...

@dzenanz dzenanz requested a review from hjmjohnson March 10, 2021 20:29
@dzenanz
Copy link
Copy Markdown
Member

dzenanz commented Mar 10, 2021

I am fine with this change. Hans?

Copy link
Copy Markdown
Member

@hjmjohnson hjmjohnson left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably need to provide options more clearly to end-users. The -mtune=generic will prohibit the compiler from using any modern CPU features added in the past 2 decades. The -march=corei7 flag allowed using some very beneficial hardware instructions, and only limited the code to run on hardware newer than 12 years old. I thought it would be the case 99% of the time, and thus made a good default.

I can't work on this for the next month, so move forward however you see fit.

@issakomi
Copy link
Copy Markdown
Member

FYI, -mtune=generic has different meaning for different GCC versions

Therefore, if you upgrade to a newer version of GCC, code generation controlled by this option will change to reflect the processors that are most common at the time that version of GCC is released.

From GCC 10.2 documentation

# NOTE the corei7 release date was 2008
#-mtune=native # Tune the code for the computer used compile ITK, but allow running on generic cpu archetectures
-mtune=generic # for reproducible results https://github.com/InsightSoftwareConsortium/ITK/issues/1939
-march=corei7 # Use ABI settings to support corei7 (circa 2008 ABI feature sets, core-avx circa 2013)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could these be exposed as advanced CMake configuration variables? Also, if sensible defaults could be chosen based on the host architecture that would be great.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Querying the host architecture is part of the problem. It just doesn't work when cross compiling. If I'm on an arm computer and building for intel, it does not help to examine the host.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cross-compiling is rare, and I think it is better to bother small number of people doing it with specifying a different option, than 99% of everyone else.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not rare on macOS. These days, every app is built as a universal binary (i.e. containing intel & arm executable code). Whether you build your app on intel or arm, the other half of it is essentially cross-compiled.

Introspection of the host causes all kinds of problems. For example, HDF5 was doing a try-compile to check sizeof(long double). It's 16 on intel, but 8 on arm. Assuming the answer is the same for both your intel half and arm half is incorrect.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

... and it's exactly to catch things like that long double example that I'm setting up a cdash submission that builds as x86_64-only but on arm64 hardware. Since the code can still be run (in emulation) it will reveal such problems.

This march=corei7 thing is the only remaining build issue.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is challenging, x86_64 should work with -march=corei7, IMHO, even if it should be better used conditionally or removed. There is a problem with Eigen, even if pass most important

-DCMAKE_CXX_COMPILER_TARGET:STRING="x86_64-apple-macos11"
-DCMAKE_C_COMPILER_TARGET:STRING="x86_64-apple-macos11"

and other params itkExternal_Eigen3.cmake fails, sometimes provided target is used, but at some point
/Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/clang -x c -target arm64-apple-macos11.1

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

itkExternal_Eigen3.cmake

Where is this? We may need to propagage CMAKE_CXX_COMPILER_TARGET for Eigen.

Copy link
Copy Markdown
Member

@issakomi issakomi Mar 13, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error is at line 58, the generation of Eigen3 failed with command above the line. I have tried to pass target, but finally i build own Eigen3. ITK configuration worked:

/Applications/CMake.app/Contents/bin/cmake -DBUILD_TESTING:BOOL=OFF -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_CONFIGURATION_TYPES:STRING=Release -DCMAKE_SYSTEM_PROCESSOR:STRING=x86_64 -DCMAKE_OSX_ARCHITECTURES:STRING=x86_64 -DCMAKE_C_COMPILER_TARGET:STRING=x86_64-apple-macos10.14 -DCMAKE_CXX_COMPILER_TARGET:STRING=x86_64-apple-macos10.14 -DITK_USE_SYSTEM_EIGEN:BOOL=ON -DEigen3_DIR:STRING="/Users/mi/eigen/build" -G Xcode ../ITK

Looks like related things were correct:

-- Performing Test CXX_HAS_WARNING-mtune_generic
-- Performing Test CXX_HAS_WARNING-mtune_generic - Success
-- Performing Test CXX_HAS_WARNING-march_corei7
-- Performing Test CXX_HAS_WARNING-march_corei7 - Success

-- Performing Test VXL_HAS_SSE2_HARDWARE_SUPPORT
-- Performing Test VXL_HAS_SSE2_HARDWARE_SUPPORT - Success

-- Looking for 128-bit float. [Checking long double...]
-- Looking for 128-bit float. Found long double.

But there are still very many questions, the post will be too long if i start to write about, if someone will play with it, look into cmake logs too, specially cmake's "try compile" is difficult. Cmake_system_processor too, tried also to set in tool-chain file, but still don't understand, but seems to be not crucial if target triple is set. Nonetheless i could build one of my applications, but it is a little bit too early to declare x86_64 build on m1 as "supported", IMHO. Good thing is that arm64 native build seems to be OK.

Edit:

/Applications/CMake.app/Contents/bin/cmake -DBUILD_TESTING:BOOL=OFF -DCMAKE_BUILD_TYPE:STRING=Release -DCMAKE_CONFIGURATION_TYPES:STRING=Release -DCMAKE_APPLE_SILICON_PROCESSOR:STRING=x86_64 -DCMAKE_OSX_ARCHITECTURES:STRING=x86_64 -DCMAKE_C_COMPILER_TARGET:STRING=x86_64-apple-macos10.14 -DCMAKE_CXX_COMPILER_TARGET:STRING=x86_64-apple-macos10.14 -DITK_USE_SYSTEM_EIGEN:BOOL=ON -DEigen3_DIR:STRING="/Users/mi/eigen/build" -G Xcode ../ITK

or simplified:

/Applications/CMake.app/Contents/bin/cmake -DCMAKE_APPLE_SILICON_PROCESSOR:STRING=x86_64 -DITK_USE_SYSTEM_EIGEN:BOOL=ON -DEigen3_DIR:STRING="/Users/mi/eigen/build" -G Xcode ../ITK

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@thewtex This change in itkExternal_Eigen3.cmake fixes Eigen3 failure, if CMAKE_APPLE_SILICON_PROCESSOR is used. More tests are required, but seems that other variables didn't help.

  execute_process(
    COMMAND
    ${CMAKE_COMMAND} ${_eigen3_source_dir}
+   "-DCMAKE_APPLE_SILICON_PROCESSOR=${CMAKE_APPLE_SILICON_PROCESSOR}"
    "-DCMAKE_INSTALL_PREFIX=${_eigen3_cmake_install_prefix}"
    "-DCMAKE_INSTALL_INCLUDEDIR=${_eigen3_cmake_install_includedir}"
    "-DCMAKE_INSTALL_DATADIR=${_eigen3_cmake_install_datadir}"
    "-DCMAKE_GENERATOR=${CMAKE_GENERATOR}"
    "-DCMAKE_SH=${CMAKE_SH}"
    "-DCMAKE_GENERATOR_TOOLSET=${CMAKE_GENERATOR_TOOLSET}"
    "-DCMAKE_SH=${CMAKE_SH}"
    "-DCMAKE_GENERATOR_PLATFORM=${CMAKE_GENERATOR_PLATFORM}"
    "-DCMAKE_GENERATOR_INSTANCE=${CMAKE_GENERATOR_INSTANCE}"
    "-DCMAKE_C_COMPILER=${CMAKE_C_COMPILER}"
    "-DCMAKE_C_FLAGS=${CMAKE_C_FLAGS}"
    "-DCMAKE_CXX_COMPILER=${CMAKE_CXX_COMPILER}"
    "-DCMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS}"
    ${_additional_external_project_args}
    WORKING_DIRECTORY ${_eigen3_build_dir}
    OUTPUT_VARIABLE ITKEigen3Config_STDOUT
    ERROR_VARIABLE ITKEigen3Config_STDERR
    )

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good analysis. Could you turn this into a patch @issakomi?

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Mar 11, 2021

The -mtune=generic will prohibit the compiler from using any modern CPU features added in the past 2 decades.

Are you sure about that? The gcc docs also say about -mtune=generic:

"Produce code optimized for the most common IA32/AMD64/EM64T processors. If you know the CPU on which your code will run, then you should use the corresponding -mtune or -march option instead of -mtune=generic. But, if you do not know exactly what CPU users of your application will have, then you should use this option." (emphasis mine)

I guess it depends what they mean by "most common IA32/AMD64/EM64T processors" :)

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Mar 11, 2021

@issakomi you should be able to repro the build error on an arm64 Mac if you set CMAKE_OSX_ARCHITECTURES=x86_64 (only).

@issakomi
Copy link
Copy Markdown
Member

issakomi commented Mar 11, 2021

you should be able to repro the build error on an arm64 Mac if you set CMAKE_OSX_ARCHITECTURES=x86_64 (only)

I could reproduce with command line:
mi@mis-Mac-mini tt % /Applications/CMake.app/Contents/bin/cmake -DBUILD_TESTING:BOOL=OFF -DCMAKE_OSX_ARCHITECTURES:STRING=x86_64 ../ITK

      clang: error: the clang compiler does not support '-march=corei7'
      make[1]: *** [CMakeFiles/cmTC_3f278.dir/testCCompiler.c.o] Error 1
      make: *** [cmTC_3f278/fast] Error 2

I didn't have the error with cmake-gui

@dzenanz Apple Silicon has changed a lot recently, now arm64 for Desktops and cross-compile are usual.

@issakomi
Copy link
Copy Markdown
Member

issakomi commented Mar 11, 2021

BTW (works directly from terminal as below), s. -march=corei7

#include <iostream>
int main(int, char**)
{
  const int x = sizeof(double);
  const int y = sizeof(long double);
  std::cout << x << " " << y << std::endl;
  return 0;
}
mi@mis-Mac-mini ~ % clang++ --target="x86_64-apple-macos10.12" -march=corei7 test0.cpp -o a1
mi@mis-Mac-mini ~ % ./a1
8 16
mi@mis-Mac-mini ~ % clang++ --target="arm64-apple-macos11" test0.cpp -o a2
mi@mis-Mac-mini ~ % ./a2                                         
8 8
mi@mis-Mac-mini ~ % uname -a
Darwin mis-Mac-mini.fritz.box 20.3.0 Darwin Kernel Version 20.3.0:
Thu Jan 21 00:06:51 PST 2021; root:xnu-7195.81.3~1/RELEASE_ARM64_T8101 arm64
mi@mis-Mac-mini ~ % lipo -create -output universal a1 a2 
mi@mis-Mac-mini ~ % ./universal 
8 8
mi@mis-Mac-mini ~ % file universal 
universal: Mach-O universal binary with 2 architectures: [x86_64:Mach-O 64-bit executable x86_64] 
[arm64:Mach-O 64-bit executable arm64]
universal (for architecture x86_64):	Mach-O 64-bit executable x86_64
universal (for architecture arm64):	Mach-O 64-bit executable arm64

@issakomi
Copy link
Copy Markdown
Member

FYI, s. new variable CMAKE_APPLE_SILICON_PROCESSOR

@thewtex
Copy link
Copy Markdown
Member

thewtex commented Mar 30, 2021

@seanm @issakomi @hjmjohnson please take a look at #2459. This white list flag addition when we are compiling for x86_64. This way, we can keep the optimizations when possible. Testing on an M1 system, we no longer run into the configure error with

CMAKE_OSX_ARCHITECTURES=x86_64 (only)

@seanm
Copy link
Copy Markdown
Contributor Author

seanm commented Mar 30, 2021

And with CMAKE_OSX_ARCHITECTURES=x86_64;arm64 ?

@issakomi
Copy link
Copy Markdown
Member

issakomi commented Mar 30, 2021

And with CMAKE_OSX_ARCHITECTURES=x86_64;arm64 ?

If Eigen3 configuration fails - may be skip passing compiler flags to external CMake generator? There is the comment, BTW

  # Nothing to link or compile, so no need to pass compiler flags.
  # However, generators and c,cxx compilers have to be explictily passed
  # for CMake configuration to work.

However they are passed, may be remove these lines?

    "-DCMAKE_C_FLAGS=${CMAKE_C_FLAGS}"

    "-DCMAKE_CXX_FLAGS=${CMAKE_CXX_FLAGS}"

There is also typo explictily (explicitly) in the comment above.

@thewtex
Copy link
Copy Markdown
Member

thewtex commented Mar 30, 2021

And with CMAKE_OSX_ARCHITECTURES=x86_64;arm64 ?

Configuration also succeeds.

@thewtex
Copy link
Copy Markdown
Member

thewtex commented Mar 30, 2021

However they are passed, may be remove these lines?

While it is possible to specify flags to use when building with interface target headers, this does not currently appear to be used. However, there does not appear to be harm to keeping them, and upstream Eigen may utilize them in the future.

CC @phcerdan

@thewtex
Copy link
Copy Markdown
Member

thewtex commented Mar 30, 2021

With #2409 and #2459 the M1 builds should now configure properly. @seanm thanks for your help with this.

@thewtex thewtex closed this Mar 30, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants