Merged
6 changes: 6 additions & 0 deletions CHANGELOG.md
@@ -5,6 +5,12 @@ All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](http://keepachangelog.com/)
and this project adheres to [Semantic Versioning](http://semver.org/).

## [2.5.0 - 2024-04-xx]

### Added

- Different compute device configuration for Daemon (NVIDIA, AMD, CPU)

## [2.4.0 - 2024-04-04]

### Added
3 changes: 2 additions & 1 deletion appinfo/info.xml
@@ -43,7 +43,7 @@ to join us in shaping a more versatile, stable, and secure app landscape.
*Your insights, suggestions, and contributions are invaluable to us.*

]]></description>
<version>2.4.0</version>
<version>2.5.0</version>
<licence>agpl</licence>
<author mail="andrey18106x@gmail.com" homepage="https://github.com/andrey18106">Andrey Borysenko</author>
<author mail="bigcat88@icloud.com" homepage="https://github.com/bigcat88">Alexander Piskun</author>
@@ -72,6 +72,7 @@ to join us in shaping a more versatile, stable, and secure app landscape.
<install>
<step>OCA\AppAPI\Migration\DataInitializationStep</step>
<step>OCA\AppAPI\Migration\DaemonUpdateV2RepairStep</step>
<step>OCA\AppAPI\Migration\DaemonUpdateGPUSRepairStep</step>
</install>
</repair-steps>
<commands>
11 changes: 7 additions & 4 deletions docs/CreationOfDeployDaemon.rst
@@ -32,7 +32,7 @@ Register

Register Deploy Daemon (DaemonConfig).

Command: ``app_api:daemon:register [--net NET] [--gpu] [--] <name> <display-name> <accepts-deploy-id> <protocol> <host> <nextcloud_url>``
Command: ``app_api:daemon:register [--net NET] [--haproxy_password HAPROXY_PASSWORD] [--compute_device COMPUTE_DEVICE] [--set-default] [--] <name> <display-name> <accepts-deploy-id> <protocol> <host> <nextcloud_url>``

Arguments
*********
@@ -49,7 +49,7 @@ Options

* ``--net [network-name]`` - ``[required]`` network name to bind docker container to (default: ``host``)
* ``--haproxy_password HAPROXY_PASSWORD`` - ``[optional]`` password for AppAPI Docker Socket Proxy
* ``--gpu GPU`` - ``[optional]`` GPU device to expose to the daemon (e.g. ``/dev/dri``)
* ``--compute_device COMPUTE_DEVICE`` - ``[optional]`` compute device to expose to the daemon (one of ``cpu``, ``cuda``, ``rocm``; default: ``cpu``)
* ``--set-default`` - ``[optional]`` set created daemon as default for ExApps installation

DeployConfig
@@ -64,7 +64,10 @@ ExApp container.
"net": "host",
"nextcloud_url": "https://nextcloud.local",
"haproxy_password": "some_secure_password",
"gpus": true,
"computeDevice": {
"id": "cuda",
"label": "CUDA (NVIDIA)",
},
}

DeployConfig options
@@ -73,7 +76,7 @@ DeployConfig options
* ``net`` **[required]** - network name to bind docker container to (default: ``host``)
* ``nextcloud_url`` **[required]** - Nextcloud URL (e.g. ``https://nextcloud.local``)
* ``haproxy_password`` *[optional]* - password for AppAPI Docker Socket Proxy
* ``gpus`` *[optional]* - GPU device to attach to the daemon (e.g. ``/dev/dri``)
* ``computeDevice`` *[optional]* - Compute device to attach to the daemon (e.g. ``{ "id": "cuda", "label": "CUDA (NVIDIA)" }``)

Unregister
----------
27 changes: 25 additions & 2 deletions lib/Command/Daemon/RegisterDaemon.php
@@ -38,11 +38,12 @@ protected function configure(): void {
$this->addOption('net', null, InputOption::VALUE_REQUIRED, 'DeployConfig, the name of the docker network to attach App to');
$this->addOption('haproxy_password', null, InputOption::VALUE_REQUIRED, 'AppAPI Docker Socket Proxy password for HAProxy Basic auth');

$this->addOption('gpu', null, InputOption::VALUE_NONE, 'Enable support of GPUs for containers');
$this->addOption('compute_device', null, InputOption::VALUE_REQUIRED, 'Compute device for GPU support (cpu|cuda|rocm)');

$this->addOption('set-default', null, InputOption::VALUE_NONE, 'Set DaemonConfig as default');

$this->addUsage('local_docker "Docker local" "docker-install" "http" "/var/run/docker.sock" "http://nextcloud.local" --net=nextcloud');
$this->addUsage('local_docker "Docker local" "docker-install" "http" "/var/run/docker.sock" "http://nextcloud.local" --net=nextcloud --set-default --compute_device=cuda');
}

protected function execute(InputInterface $input, OutputInterface $output): int {
@@ -57,7 +58,7 @@ protected function execute(InputInterface $input, OutputInterface $output): int
'net' => $input->getOption('net') ?? 'host',
'nextcloud_url' => $nextcloudUrl,
'haproxy_password' => $input->getOption('haproxy_password') ?? '',
'gpu' => $input->getOption('gpu') ?? false,
'computeDevice' => $this->buildComputeDevice($input->getOption('compute_device') ?? 'cpu'),
];

if (($protocol !== 'http') && ($protocol !== 'https')) {
@@ -95,4 +96,26 @@ protected function execute(InputInterface $input, OutputInterface $output): int
$output->writeln('Daemon successfully registered.');
return 0;
}

private function buildComputeDevice(string $computeDevice): array {
switch ($computeDevice) {
case 'cpu':
return [
'id' => 'cpu',
'label' => 'CPU',
];
case 'cuda':
return [
'id' => 'cuda',
'label' => 'CUDA (NVIDIA)',
];
case 'rocm':
return [
'id' => 'rocm',
'label' => 'ROCm (AMD)',
];
default:
throw new \InvalidArgumentException('Invalid compute device value.');
}
}
}
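
The `switch` in `buildComputeDevice()` could also be expressed as a lookup table, which keeps the list of accepted values and the error message in one place. A minimal sketch, assuming a standalone function: `COMPUTE_DEVICE_LABELS` and `resolveComputeDevice()` are hypothetical names that are not part of AppAPI, and only the ids and labels are taken from the diff above.

```php
<?php

declare(strict_types=1);

// Hypothetical lookup table; ids and labels are copied from buildComputeDevice().
const COMPUTE_DEVICE_LABELS = [
    'cpu' => 'CPU',
    'cuda' => 'CUDA (NVIDIA)',
    'rocm' => 'ROCm (AMD)',
];

/**
 * Illustrative helper (not part of AppAPI): resolve a --compute_device value
 * to the ['id' => ..., 'label' => ...] array stored in the DeployConfig.
 */
function resolveComputeDevice(string $id): array {
    if (!array_key_exists($id, COMPUTE_DEVICE_LABELS)) {
        // Name the accepted values so the CLI user can correct the option.
        throw new \InvalidArgumentException(sprintf(
            'Invalid compute device value "%s", expected one of: %s.',
            $id,
            implode(', ', array_keys(COMPUTE_DEVICE_LABELS))
        ));
    }
    return ['id' => $id, 'label' => COMPUTE_DEVICE_LABELS[$id]];
}
```

With this shape, supporting another device would only need one more table entry, and the exception text would stay in sync with the accepted ids.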
50 changes: 4 additions & 46 deletions lib/DeployActions/AIODockerActions.php
@@ -14,7 +14,6 @@
*/
class AIODockerActions {
public const AIO_DAEMON_CONFIG_NAME = 'docker_aio';
public const AIO_DAEMON_CONFIG_NAME_GPU = 'docker_aio_gpu';
public const AIO_DOCKER_SOCKET_PROXY_HOST = 'nextcloud-aio-docker-socket-proxy:2375';

public function __construct(
@@ -46,13 +45,12 @@ public function registerAIODaemonConfig(): ?DaemonConfig {
'net' => 'nextcloud-aio', // using the same host as default network for Nextcloud AIO containers
'nextcloud_url' => 'https://' . getenv('NC_DOMAIN'),
'haproxy_password' => null,
'gpu' => false,
'computeDevice' => [
'id' => 'cpu',
'label' => 'CPU',
],
];

if ($this->isGPUsEnabled()) {
$this->registerAIODaemonConfigWithGPU();
}

$daemonConfigParams = [
'name' => self::AIO_DAEMON_CONFIG_NAME,
'display_name' => 'AIO Docker Socket Proxy',
@@ -68,44 +66,4 @@ public function registerAIODaemonConfig(): ?DaemonConfig {
}
return $daemonConfig;
}

/**
* Registers DaemonConfig with default params to use AIO Docker Socket Proxy with GPU
*/
private function registerAIODaemonConfigWithGPU(): ?DaemonConfig {
$daemonConfigWithGPU = $this->daemonConfigService->getDaemonConfigByName(self::AIO_DAEMON_CONFIG_NAME_GPU);
if ($daemonConfigWithGPU !== null) {
return $daemonConfigWithGPU;
}

$deployConfig = [
'net' => 'nextcloud-aio', // using the same host as default network for Nextcloud AIO containers
'nextcloud_url' => 'https://' . getenv('NC_DOMAIN'),
'haproxy_password' => null,
'gpu' => true,
];

$daemonConfigParams = [
'name' => self::AIO_DAEMON_CONFIG_NAME_GPU,
'display_name' => 'AIO Docker Socket Proxy with GPU',
'accepts_deploy_id' => 'docker-install',
'protocol' => 'http',
'host' => self::AIO_DOCKER_SOCKET_PROXY_HOST,
'deploy_config' => $deployConfig,
];

return $this->daemonConfigService->registerDaemonConfig($daemonConfigParams);
}

/**
* Check if /dev/dri folder mounted to the container.
* In AIO this means that NEXTCLOUD_ENABLE_DRI_DEVICE=true
*/
private function isGPUsEnabled(): bool {
$devDri = '/dev/dri';
if (is_dir($devDri)) {
return true;
}
return false;
}
}
43 changes: 30 additions & 13 deletions lib/DeployActions/DockerActions.php
@@ -125,11 +125,20 @@ public function createContainer(string $dockerUrl, array $imageParams, array $pa
$containerParams['NetworkingConfig'] = $networkingConfig;
}

if (isset($params['gpu']) && filter_var($params['gpu'], FILTER_VALIDATE_BOOLEAN)) {
if (isset($params['deviceRequests'])) {
$containerParams['HostConfig']['DeviceRequests'] = $params['deviceRequests'];
} else {
$containerParams['HostConfig']['DeviceRequests'] = $this->buildDefaultGPUDeviceRequests();
if (isset($params['computeDevice'])) {
if ($params['computeDevice']['id'] === 'cuda') {
if (isset($params['deviceRequests'])) {
$containerParams['HostConfig']['DeviceRequests'] = $params['deviceRequests'];
} else {
$containerParams['HostConfig']['DeviceRequests'] = $this->buildDefaultGPUDeviceRequests();
}
}
if ($params['computeDevice']['id'] === 'rocm') {
if (isset($params['devices'])) {
$containerParams['HostConfig']['Devices'] = $params['devices'];
} else {
$containerParams['HostConfig']['Devices'] = $this->buildDevicesParams(['/dev/kfd', '/dev/dri']);
}
}
}
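
To make the branching above easier to review, here is a minimal standalone sketch of how a compute device id might translate into a `HostConfig` fragment. It rests on assumptions: `hostConfigForComputeDevice()` is a hypothetical helper, the NVIDIA request body is a guess at what `buildDefaultGPUDeviceRequests()` returns (its body is not shown in this diff), and the `Devices` entries follow the Docker Engine API field names, which may differ from what `buildDevicesParams()` actually produces.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper (not part of AppAPI): build the HostConfig fragment
 * for a compute device id ('cpu', 'cuda' or 'rocm').
 */
function hostConfigForComputeDevice(string $id): array {
    switch ($id) {
        case 'cuda':
            // Assumed default: request all NVIDIA GPUs (Count = -1).
            return ['DeviceRequests' => [[
                'Driver' => 'nvidia',
                'Count' => -1,
                'Capabilities' => [['gpu']],
            ]]];
        case 'rocm':
            // ROCm containers need the kernel fusion driver and DRI nodes.
            $devices = [];
            foreach (['/dev/kfd', '/dev/dri'] as $path) {
                $devices[] = [
                    'PathOnHost' => $path,
                    'PathInContainer' => $path,
                    'CgroupPermissions' => 'rwm',
                ];
            }
            return ['Devices' => $devices];
        default:
            // 'cpu' (or anything unknown) adds nothing to HostConfig.
            return [];
    }
}
```

The point of the sketch is the asymmetry: CUDA goes through `DeviceRequests`, while ROCm maps device nodes through `Devices`.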

@@ -346,10 +355,15 @@ public function buildDeployParams(DaemonConfig $daemonConfig, array $appInfo): a
$externalApp = $appInfo['external-app'];
$deployConfig = $daemonConfig->getDeployConfig();

if (isset($deployConfig['gpu']) && filter_var($deployConfig['gpu'], FILTER_VALIDATE_BOOLEAN)) {
$deviceRequests = $this->buildDefaultGPUDeviceRequests();
// Initialize both lists up front so they are always defined, whichever compute device is configured
$deviceRequests = [];
$devices = [];
if (isset($deployConfig['computeDevice'])) {
if ($deployConfig['computeDevice']['id'] === 'cuda') {
$deviceRequests = $this->buildDefaultGPUDeviceRequests();
} elseif ($deployConfig['computeDevice']['id'] === 'rocm') {
$devices = $this->buildDevicesParams(['/dev/kfd', '/dev/dri']);
}
}
$storage = $this->buildDefaultExAppVolume($appId)[0]['Target'];

@@ -375,8 +389,9 @@ public function buildDeployParams(DaemonConfig $daemonConfig, array $appInfo): a
'port' => $appInfo['port'],
'net' => $deployConfig['net'] ?? 'host',
'env' => $envs,
'computeDevice' => $deployConfig['computeDevice'] ?? null,
'devices' => $devices,
'deviceRequests' => $deviceRequests,
'gpu' => count($deviceRequests) > 0,
];

return [
@@ -398,10 +413,14 @@ public function buildDeployEnvs(array $params, array $deployConfig): array {
sprintf('NEXTCLOUD_URL=%s', $deployConfig['nextcloud_url'] ?? str_replace('https', 'http', $this->urlGenerator->getAbsoluteURL(''))),
];

// Always set COMPUTE_DEVICE=cpu|cuda|rocm
$autoEnvs[] = sprintf('COMPUTE_DEVICE=%s', $deployConfig['computeDevice']['id'] ?? 'cpu');
// Add required GPU runtime envs if daemon configured to use GPU
if (isset($deployConfig['gpu']) && filter_var($deployConfig['gpu'], FILTER_VALIDATE_BOOLEAN)) {
$autoEnvs[] = sprintf('NVIDIA_VISIBLE_DEVICES=%s', 'all');
$autoEnvs[] = sprintf('NVIDIA_DRIVER_CAPABILITIES=%s', 'compute,utility');
if (isset($deployConfig['computeDevice'])) {
if ($deployConfig['computeDevice']['id'] === 'cuda') {
$autoEnvs[] = sprintf('NVIDIA_VISIBLE_DEVICES=%s', 'all');
$autoEnvs[] = sprintf('NVIDIA_DRIVER_CAPABILITIES=%s', 'compute,utility');
}
}
return $autoEnvs;
}
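
The environment handling above can be read as a small pure function. A minimal sketch, assuming a hypothetical `computeDeviceEnvs()` helper that is not part of AppAPI; the fallback to `cpu` for a DeployConfig without `computeDevice` is this sketch's own choice.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper (not part of AppAPI): derive the compute-device
 * related environment variables from a DeployConfig array.
 */
function computeDeviceEnvs(array $deployConfig): array {
    // Assumption: a config without computeDevice is treated as CPU-only.
    $id = $deployConfig['computeDevice']['id'] ?? 'cpu';

    // COMPUTE_DEVICE is always exported so the ExApp can pick its backend.
    $envs = [sprintf('COMPUTE_DEVICE=%s', $id)];

    // The NVIDIA container runtime variables are only needed for CUDA.
    if ($id === 'cuda') {
        $envs[] = 'NVIDIA_VISIBLE_DEVICES=all';
        $envs[] = 'NVIDIA_DRIVER_CAPABILITIES=compute,utility';
    }
    return $envs;
}
```

An ExApp would then only have to read `COMPUTE_DEVICE` to choose between its CPU, CUDA and ROCm code paths.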
@@ -518,8 +537,6 @@ private function isGPUAvailable(): bool {

/**
* Return default GPU device requests for container.
* For now only NVIDIA GPUs supported.
* TODO: Add support for other GPU vendors
*/
private function buildDefaultGPUDeviceRequests(): array {
return [
72 changes: 72 additions & 0 deletions lib/Migration/DaemonUpdateGPUSRepairStep.php
@@ -0,0 +1,72 @@
<?php

declare(strict_types=1);

namespace OCA\AppAPI\Migration;

use OCA\AppAPI\Db\DaemonConfig;
use OCA\AppAPI\Db\DaemonConfigMapper;
use OCP\DB\Exception;
use OCP\Migration\IOutput;
use OCP\Migration\IRepairStep;
use Psr\Log\LoggerInterface;

class DaemonUpdateGPUSRepairStep implements IRepairStep {
public function __construct(
private readonly DaemonConfigMapper $daemonConfigMapper,
private readonly LoggerInterface $logger,
) {
}

public function getName(): string {
return 'AppAPI Daemons configuration GPU params update';
}

public function run(IOutput $output): void {
$daemons = $this->daemonConfigMapper->findAll();
$daemonsUpdated = 0;
// Migrate the legacy 'gpu' flag of every registered daemon
/** @var DaemonConfig $daemon */
foreach ($daemons as $daemon) {
$daemonsUpdated += $this->updateDaemonConfiguration($daemon);
}
$output->info(sprintf('Daemons configuration GPU params updated: %s', $daemonsUpdated));
}

private function updateDaemonConfiguration(DaemonConfig $daemonConfig): int {
$updated = false;

$deployConfig = $daemonConfig->getDeployConfig();
if (isset($deployConfig['gpu'])) {
if (filter_var($deployConfig['gpu'], FILTER_VALIDATE_BOOLEAN)) {
$deployConfig['computeDevice'] = [
'id' => 'cuda',
'label' => 'CUDA (NVIDIA)',
];
} else {
$deployConfig['computeDevice'] = [
'id' => 'cpu',
'label' => 'CPU',
];
}
unset($deployConfig['gpu']);
$daemonConfig->setDeployConfig($deployConfig);
$updated = true;
}

if ($updated) {
try {
$this->daemonConfigMapper->update($daemonConfig);
return 1;
} catch (Exception $e) {
$this->logger->error(
sprintf('Failed to update Daemon config (%s: %s)',
$daemonConfig->getAcceptsDeployId(), $daemonConfig->getName()),
['exception' => $e]
);
return 0;
}
}
return 0;
}
}
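
The conversion performed by the repair step can be isolated from the mapper and logger, which makes it straightforward to unit test. A minimal sketch, assuming a hypothetical `migrateGpuFlag()` function that is not part of AppAPI and works on the plain DeployConfig array.

```php
<?php

declare(strict_types=1);

/**
 * Illustrative helper (not part of AppAPI): convert the legacy boolean
 * 'gpu' flag of a DeployConfig array into the 'computeDevice' structure.
 * Configs without a 'gpu' key are returned unchanged.
 */
function migrateGpuFlag(array $deployConfig): array {
    if (!isset($deployConfig['gpu'])) {
        return $deployConfig;
    }

    // The old flag may be stored as a bool or as a string such as "true".
    $useCuda = filter_var($deployConfig['gpu'], FILTER_VALIDATE_BOOLEAN);

    // Before this change GPU support meant NVIDIA only, so true maps to CUDA.
    $deployConfig['computeDevice'] = $useCuda
        ? ['id' => 'cuda', 'label' => 'CUDA (NVIDIA)']
        : ['id' => 'cpu', 'label' => 'CPU'];

    unset($deployConfig['gpu']);
    return $deployConfig;
}
```

Because the function returns its input untouched when there is no `gpu` key, running it a second time on an already migrated config changes nothing, which matters for a repair step that may be executed repeatedly.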
3 changes: 2 additions & 1 deletion package.json
@@ -29,7 +29,8 @@
"lint": "eslint --ext .js,.vue src",
"lint:fix": "eslint --ext .js,.vue src --fix",
"stylelint": "stylelint src/**/*.vue src/**/*.scss src/**/*.css",
"stylelint:fix": "stylelint src/**/*.vue src/**/*.scss src/**/*.css --fix"
"stylelint:fix": "stylelint src/**/*.vue src/**/*.scss src/**/*.css --fix",
"serve": "NODE_ENV=development webpack serve --allowed-hosts all --config webpack.js"
},
"browserslist": [
"extends @nextcloud/browserslist-config"
Binary file modified screenshots/app_api_1.png
Binary file modified screenshots/app_api_2.png
100755 → 100644
Binary file modified screenshots/app_api_3.png
100755 → 100644
Binary file modified screenshots/app_api_4.png
100755 → 100644
1 change: 1 addition & 0 deletions src/components/AdminSettings.vue
@@ -39,6 +39,7 @@
:options="['no', 'always', 'unless-stopped']"
:placeholder="t('app_api', 'ExApp container restart policy')"
:aria-label="t('app_api', 'ExApp container restart policy')"
:aria-label-combobox="t('app_api', 'ExApp container restart policy')"
@input="onInput" />
</NcSettingsSection>
</div>