opencv: misc CUDA-related updates and fixes; add enableLto #218044
Closed
ConnorBaker wants to merge 185 commits into NixOS:master from ConnorBaker:feat/opencv-use-cudaPackages
This commit updates the `buildFlags`, which is a single string with one of four possibilities:
- ""
- "profiled"
- "bootstrap"
- "profiledbootstrap"

Previously only the last two were possible. Since 2ea3482 all four are possible.
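The four values can be read as the concatenation of two optional parts. A minimal sketch, assuming two independent booleans; the helper `mkBuildFlags` and its argument names `profiled` and `bootstrap` are hypothetical and not taken from the commit:

```nix
let
  # Hypothetical helper: concatenates the two optional parts of the
  # make target. The argument names are illustrative only.
  mkBuildFlags = { profiled, bootstrap }:
    (if profiled then "profiled" else "")
    + (if bootstrap then "bootstrap" else "");
in
{
  neither = mkBuildFlags { profiled = false; bootstrap = false; }; # ""
  onlyProfiled = mkBuildFlags { profiled = true; bootstrap = false; }; # "profiled"
  onlyBootstrap = mkBuildFlags { profiled = false; bootstrap = true; }; # "bootstrap"
  both = mkBuildFlags { profiled = true; bootstrap = true; }; # "profiledbootstrap"
}
```

Saved to a file, the attribute set can be inspected with `nix-instantiate --eval --strict`.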
e3b62ad to c70cc93
The primary motivating example is openssl:

Before the change the full package build took 1m54s. After the change it takes 59s, about a 2x speedup.

The difference is visible because openssl builds hundreds of manpages, spawning a perl process per manual in the `install` phase. Such a workload is very easy to parallelize.

Another example would be an `autotools`+`libtool` based build system where the install step requires relinking. The more binaries there are to relink, the greater the gain from doing it in parallel.

The change enables parallel installs by default only for builds that already have parallel builds enabled. There is a high chance those build systems already handle parallelism well, but some packages will fail.

Consistently propagated the enableParallelBuilding to:
- cmake (enabled by default, similar to builds)
- ninja (set parallelism explicitly, don't rely on default)
- bmake (enable when requested)
- scons (enable when requested)
- meson (set parallelism explicitly, don't rely on default)
- waf (set parallelism explicitly, don't rely on default)
- qmake-4/5/6 (enable by default, similar to builds)
- xorg (always enable, similar to builds)
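For a package whose install phase is not parallel-safe, the opt-out might look like the sketch below. It assumes the stdenv attribute is named `enableParallelInstalling`, which the commit message above does not spell out; the package name, version, and source are placeholders:

```nix
# Hypothetical package expression, written callPackage-style.
{ stdenv }:

stdenv.mkDerivation {
  pname = "example"; # placeholder
  version = "0.0.0"; # placeholder
  src = ./.; # placeholder

  # The build phase is parallel-safe, so keep parallel builds on.
  enableParallelBuilding = true;

  # Assumed attribute name: opt out of parallel installs because the
  # Makefile's install targets are missing dependencies on each other.
  enableParallelInstalling = false;
}
```

Keeping the build parallel while serializing only the install step preserves most of the speedup for packages like the ones failing below.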
Without the change install phase fails as:
installing
install flags: -j16 ...
...
./.libs/libnetsnmpagent.so: file not recognized: file format not recognized
collect2: error: ld returned 1 exit status
make[1]: *** [Makefile:1012: libnetsnmpmibs.la] Error 1
make[1]: *** Waiting for unfinished jobs....
Without the change install phase fails as:
Installing libxfs-install
../../install-sh -o nixbld -g nixbld -m 644 ioctl_xfs_ag_geometry.2 /nix/store/chymzkiiv6c2rgl2gqrn4bqv5azhx9vf-xfsprogs-6.1.1-bin/share/man/man2/ioctl_xfs_ag_geometry.2
make[1]: *** No rule to make target '\', needed by 'kmem.lo'. Stop.
make[1]: *** Waiting for unfinished jobs....
make: *** [Makefile:148: libxfs-install] Error 2
make: *** Waiting for unfinished jobs....
Without the change parallel install fails as:
install flags: -j16
...
libtool: error: error: relink '_py3sss.la' with the above command before installing it
libtool: warning: '/build/source/libsss_cert.la' has not been installed in '/nix/store/apyk9a6q7bc7d1fnn81vqrwil4waw9cd-sssd-2.8.2/lib/sssd'
make[3]: *** [Makefile:13362: install-py3execLTLIBRARIES] Error 1
now gcc isn't built
Without the change parallel install fails as:
$ install flags: -j16 ...
...
collect2: error: ld returned 1 exit status
libtool: error: error: relink 'libsvn_ra_serf-1.la' with the above command before installing it
make: *** [build-outputs.mk:1316: install-serf-lib] Error 1
make: *** Waiting for unfinished jobs....
/nix/store/1qasgqvab0xh2jcy00x9b1zh39dw7m8f-bin
Without the change parallel install fails as:
$ install flags: -j16 ...
...
install: target '...-ocaml-4.14.0/lib/ocaml/threads': No such file or directory
make[1]: *** [Makefile:140: installopt] Error 1
Without the change parallel installs fail as:
install flags: -j2
...
ln: failed to create symbolic link '...-eresi-0.83-a3-phoenix//bin/elfsh': No such file or directory
make: *** [Makefile:108: install64] Error 1
w3m: 0.5.3+git20220429 -> 0.5.3+git20230121
libpcap: 1.10.1 -> 1.10.3
- use cudaPackages instead of cudatoolkit (reduces download/closure size)
- set C/C++ compiler when building with CUDA to ensure NVCC has an appropriate backing compiler
- add flag to build with CUDNN (disabled by default due to increase in closure size)
- add flag to build with LTO (enabled by default)
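A sketch of how a user might toggle these flags from outside the package. `enableLto` is the name given in the PR title; `enableCuda` and `enableCudnn` are assumed argument names that this page does not confirm:

```nix
let
  # CUDA packages are unfree, so they have to be allowed explicitly.
  pkgs = import <nixpkgs> { config.allowUnfree = true; };
in
pkgs.opencv.override {
  enableCuda = true; # assumed argument name
  enableCudnn = true; # assumed argument name; off by default per the list above
  enableLto = false; # from the PR title; on by default per the list above
}
```

Turning CUDNN on trades a larger closure for the extra functionality, which is why the list above keeps it off by default.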
1bd932a to 13d80db
Member
Please reopen, we can't unping people.
Labels
6.topic: closure size
The final size of a derivation, including its dependencies
6.topic: cuda
Parallel computing platform and API
6.topic: nixos
Issues or PRs affecting NixOS modules, or package usability issues specific to NixOS
6.topic: ocaml
OCaml is a general-purpose, high-level, multi-paradigm programming language.
6.topic: python
Python is a high-level, general-purpose programming language.
6.topic: qt/kde
Object-oriented framework for GUI creation
6.topic: ruby
A dynamic, open source programming language with a focus on simplicity and productivity.
6.topic: rust
General-purpose programming language emphasizing performance, type safety, and concurrency.
6.topic: stdenv
Standard environment
6.topic: systemd
Software suite that provides an array of system components for Linux operating systems.
8.has: changelog
This PR adds or changes release notes
8.has: documentation
This PR adds or changes documentation
8.has: module (update)
This PR changes an existing module in `nixos/`
10.rebuild-darwin: 101-500
This PR causes between 101 and 500 packages to rebuild on Darwin.
10.rebuild-linux: 501-1000
This PR causes many rebuilds on Linux and should normally target the staging branches.
10.rebuild-linux: 501+
This PR causes many rebuilds on Linux and should normally target the staging branches.
Description of changes
Info on closure sizes: the figures below compare before (built from master) and after (this PR, with and without CUDNN support).
Before:
Full closure:
Details
After (without CUDNN, which is the default):
Full closure:
Details
After (with CUDNN):
Full closure:
Details
Things done
sandbox = true set in nix.conf? (See Nix manual)
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
./result/bin/)