feat: auto-download Chrome via fetcher, unify Docker image, fix singleton lock#268
feat: auto-download Chrome via fetcher, unify Docker image, fix singleton lock#268
Conversation
…eton lock
Replace the slim/full Docker image split with a single unified image.
Chrome is now auto-downloaded on first browser tool use via
chromiumoxide's fetcher and cached in {data_dir}/chrome_cache/,
matching the pattern used by fastembed for embedding models.
- Enable _fetcher-rustls-tokio feature on chromiumoxide
- Add 4-step Chrome detection: config > env vars > system PATH > fetcher
- Use unique temp dir per browser launch (prevents singleton lock crashes)
- Add Drop impl on BrowserState for cleanup on worker exit
- Collapse Dockerfile slim/full into single stage with runtime libs only
- Simplify release.yml to build one image per arch (no slim/full matrix)
- Remove ImageVariant enum and all variant branching from update.rs
- Read CHROME/CHROME_PATH env vars as fallback detection
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughConsolidates the Dockerfile into a single runtime stage; adds Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
Dockerfile (1)
57-57: Remove unusedopenssh-serverpackage from runtime stage.The runtime exposes only ports 19898 (API/UI) and 18789 (webhook). No SSH service is started or referenced anywhere in the codebase, making this package unnecessary attack surface and patch overhead.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@Dockerfile` at line 57, Remove the unused openssh-server package from the runtime image's package list: locate the installs in the Dockerfile runtime stage where "openssh-server" appears and delete that token from the apt install (or equivalent) line(s), then rebuild the image to verify no runtime references to openssh-server remain (the package name "openssh-server" identifies the exact item to remove); ensure exposed ports 19898 and 18789 and any startup scripts remain unchanged.src/config.rs (1)
4334-4358: Extract browser resolution into a dedicatedBrowserConfig::resolve(...)path.Lines 4334-4358 inline browser resolution logic inside
Config::from_toml. Moving this into a per-subsystem resolver would reduce drift with other load paths (likeConfig::load_from_env) and keep precedence behavior centralized.As per coding guidelines,
src/config.rs: "Use config resolution order:env > DB > defaultwith per-subsystemresolve()methods".🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config.rs` around lines 4334 - 4358, Extract the inline browser-merging logic into a new BrowserConfig::resolve(&self, base: &BrowserConfig, instance_dir: &Path) -> BrowserConfig method and replace the block in Config::from_toml with a call to that resolver; the resolve method should encapsulate precedence (env/DB/default), build chrome_cache_dir from instance_dir, merge fields using unwrap_or/or_else like the current logic (use base_defaults.browser as the base), and ensure other callers (e.g., Config::load_from_env or any DB-load path) also call BrowserConfig::resolve to centralize resolution behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/config.rs`:
- Line 740: BrowserConfig::default() sets chrome_cache_dir to a relative
PathBuf::from("chrome_cache"), causing env-only loading paths to differ from
Config::from_toml which resolves to instance_dir/chrome_cache; update
BrowserConfig::default() to construct chrome_cache_dir relative to
DefaultsConfig::instance_dir (or return a path that will be resolved by
Config::load_from_env) so both Config::from_toml and Config::load_from_env
produce the same instance_dir/chrome_cache location; specifically change the
default for chrome_cache_dir in BrowserConfig::default() (the
PathBuf::from("chrome_cache") entry) so it is either a joined path with the
instance dir or a sentinel relative path that Config::load_from_env resolves the
same way Config::from_toml does.
---
Nitpick comments:
In `@Dockerfile`:
- Line 57: Remove the unused openssh-server package from the runtime image's
package list: locate the installs in the Dockerfile runtime stage where
"openssh-server" appears and delete that token from the apt install (or
equivalent) line(s), then rebuild the image to verify no runtime references to
openssh-server remain (the package name "openssh-server" identifies the exact
item to remove); ensure exposed ports 19898 and 18789 and any startup scripts
remain unchanged.
In `@src/config.rs`:
- Around line 4334-4358: Extract the inline browser-merging logic into a new
BrowserConfig::resolve(&self, base: &BrowserConfig, instance_dir: &Path) ->
BrowserConfig method and replace the block in Config::from_toml with a call to
that resolver; the resolve method should encapsulate precedence
(env/DB/default), build chrome_cache_dir from instance_dir, merge fields using
unwrap_or/or_else like the current logic (use base_defaults.browser as the
base), and ensure other callers (e.g., Config::load_from_env or any DB-load
path) also call BrowserConfig::resolve to centralize resolution behavior.
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (5)
.github/workflows/release.ymlis excluded by!**/*.ymlCargo.lockis excluded by!**/*.lock,!**/*.lockCargo.tomlis excluded by!**/*.tomlfly.staging.tomlis excluded by!**/*.tomlfly.tomlis excluded by!**/*.toml
📒 Files selected for processing (4)
Dockerfilesrc/config.rssrc/tools/browser.rssrc/update.rs
There was a problem hiding this comment.
♻️ Duplicate comments (1)
src/config.rs (1)
740-740:⚠️ Potential issue | 🟠 Major
chrome_cache_dirremains relative in env-only loads.Line 740 still defaults to a relative path, and
Config::load_from_envkeepsDefaultsConfig::default(), so env-only runs can cache under the process CWD instead of{instance_dir}/chrome_cache. This diverges from TOML resolution and can cause cache misses/permission surprises.As per coding guidelines, "Use config resolution order: env > DB > default with per-subsystem `resolve()` methods".🔧 Proposed fix
@@ pub fn load_from_env(instance_dir: &Path) -> Result<Self> { - Ok(Self { + let mut defaults = DefaultsConfig::default(); + defaults.browser.chrome_cache_dir = instance_dir.join("chrome_cache"); + + Ok(Self { instance_dir: instance_dir.to_path_buf(), llm, - defaults: DefaultsConfig::default(), + defaults, agents, links: Vec::new(), groups: Vec::new(),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config.rs` at line 740, The chrome_cache_dir default is a relative PathBuf ("chrome_cache") which causes env-only loads (Config::load_from_env) to cache under the process CWD; update resolution so defaults are anchored to the instance directory by calling the defaults' resolve method when building env-only configs — e.g. ensure DefaultsConfig::default() is passed through DefaultsConfig::resolve(instance_dir) (or call DefaultsConfig::resolve in Config::load_from_env) so chrome_cache_dir is converted to {instance_dir}/chrome_cache; touch the chrome_cache_dir field and any related resolution logic in DefaultsConfig/Config to perform this conversion.
🧹 Nitpick comments (1)
src/config.rs (1)
4397-4409: Use a descriptive closure variable instead ofb.Rename
bto something explicit likebrowser_configfor readability and consistency.As per coding guidelines, "Don't abbreviate variable names. Use `queue` not `q`, `message` not `msg`, `channel` not `ch`. Common abbreviations like `config` are fine."♻️ Suggested rename
- .map(|b| { + .map(|browser_config| { let base = &base_defaults.browser; BrowserConfig { - enabled: b.enabled.unwrap_or(base.enabled), - headless: b.headless.unwrap_or(base.headless), - evaluate_enabled: b.evaluate_enabled.unwrap_or(base.evaluate_enabled), - executable_path: b + enabled: browser_config.enabled.unwrap_or(base.enabled), + headless: browser_config.headless.unwrap_or(base.headless), + evaluate_enabled: browser_config + .evaluate_enabled + .unwrap_or(base.evaluate_enabled), + executable_path: browser_config .executable_path .or_else(|| base.executable_path.clone()), - screenshot_dir: b + screenshot_dir: browser_config .screenshot_dir .map(PathBuf::from) .or_else(|| base.screenshot_dir.clone()),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config.rs` around lines 4397 - 4409, The closure variable `b` in the map that builds BrowserConfig is non-descriptive; rename it to `browser_config` (or similar) in the closure argument and update all references inside the closure (e.g., uses in enabled: b.enabled, headless: b.headless, evaluate_enabled: b.evaluate_enabled, executable_path: b.executable_path, screenshot_dir: b.screenshot_dir) so the code reads enabled: browser_config.enabled, etc., leaving the rest of the mapping and references to base_defaults.browser and the BrowserConfig struct unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Duplicate comments:
In `@src/config.rs`:
- Line 740: The chrome_cache_dir default is a relative PathBuf ("chrome_cache")
which causes env-only loads (Config::load_from_env) to cache under the process
CWD; update resolution so defaults are anchored to the instance directory by
calling the defaults' resolve method when building env-only configs — e.g.
ensure DefaultsConfig::default() is passed through
DefaultsConfig::resolve(instance_dir) (or call DefaultsConfig::resolve in
Config::load_from_env) so chrome_cache_dir is converted to
{instance_dir}/chrome_cache; touch the chrome_cache_dir field and any related
resolution logic in DefaultsConfig/Config to perform this conversion.
---
Nitpick comments:
In `@src/config.rs`:
- Around line 4397-4409: The closure variable `b` in the map that builds
BrowserConfig is non-descriptive; rename it to `browser_config` (or similar) in
the closure argument and update all references inside the closure (e.g., uses in
enabled: b.enabled, headless: b.headless, evaluate_enabled: b.evaluate_enabled,
executable_path: b.executable_path, screenshot_dir: b.screenshot_dir) so the
code reads enabled: browser_config.enabled, etc., leaving the rest of the
mapping and references to base_defaults.browser and the BrowserConfig struct
unchanged.
…to spawn_blocking
There was a problem hiding this comment.
Actionable comments posted: 2
🧹 Nitpick comments (1)
src/tools/browser.rs (1)
1221-1232: Consider logging when env var is set but path doesn't exist.When
CHROMEorCHROME_PATHis set but points to a non-existent path, the function silently falls through. A debug log would help troubleshoot misconfigured environments:Suggested improvement
fn detect_chrome_from_env() -> Option<PathBuf> { for variable in ["CHROME", "CHROME_PATH"] { if let Ok(value) = std::env::var(variable) { let path = PathBuf::from(&value); if path.exists() { return Some(path); + } else { + tracing::debug!( + variable, + path = %path.display(), + "environment variable set but path does not exist" + ); } } } None }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/tools/browser.rs` around lines 1221 - 1232, The detect_chrome_from_env function currently ignores cases where CHROME or CHROME_PATH is set but points to a non-existent path; update detect_chrome_from_env to emit a debug log when an environment variable is present but the PathBuf does not exist (include the env variable name and its value in the message) so misconfigurations are visible; use the project’s logging facility (e.g., debug! or tracing::debug!) consistent with other modules, and keep the existing return behavior (return Some(path) only when path.exists()) while logging the failed cases.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/tools/browser.rs`:
- Around line 550-556: The cleanup currently silently discards errors when
removing the user_data_dir; replace the blocking std::fs::remove_dir_all call
with the async tokio::fs::remove_dir_all and handle errors instead of using `let
_ =`. In the branch that checks state.browser.is_some() (after drop(browser) and
referencing user_data_dir), call tokio::fs::remove_dir_all(&user_data_dir).await
and log any Err(e) (e.g., with tracing::warn! or the project's logger) including
the path and error details so failures are not silently ignored.
- Around line 154-176: In the BrowserState Drop implementation (fn drop), the
fallback branch that runs outside a tokio runtime currently discards the Result
with `let _ = std::fs::remove_dir_all(&dir);`; change this to handle and report
errors instead: call std::fs::remove_dir_all(&dir) and match on the Result,
printing any Err via a fallback logger such as eprintln! (including the
directory path and error) or using tracing::error if available, so the failure
isn't silently ignored when cleaning up the user_data_dir.
---
Nitpick comments:
In `@src/tools/browser.rs`:
- Around line 1221-1232: The detect_chrome_from_env function currently ignores
cases where CHROME or CHROME_PATH is set but points to a non-existent path;
update detect_chrome_from_env to emit a debug log when an environment variable
is present but the PathBuf does not exist (include the env variable name and its
value in the message) so misconfigurations are visible; use the project’s
logging facility (e.g., debug! or tracing::debug!) consistent with other
modules, and keep the existing return behavior (return Some(path) only when
path.exists()) while logging the failed cases.
There was a problem hiding this comment.
🧹 Nitpick comments (1)
src/config.rs (1)
4396-4420: Rename abbreviated closure variable for guideline compliance.Line 4400 uses
b; please use a descriptive name (for example,browser_override) in this block.♻️ Proposed refactor
- toml.defaults - .browser - .map(|b| { + toml.defaults + .browser + .map(|browser_override| { let base = &base_defaults.browser; BrowserConfig { - enabled: b.enabled.unwrap_or(base.enabled), - headless: b.headless.unwrap_or(base.headless), - evaluate_enabled: b.evaluate_enabled.unwrap_or(base.evaluate_enabled), - executable_path: b + enabled: browser_override.enabled.unwrap_or(base.enabled), + headless: browser_override.headless.unwrap_or(base.headless), + evaluate_enabled: browser_override + .evaluate_enabled + .unwrap_or(base.evaluate_enabled), + executable_path: browser_override .executable_path .or_else(|| base.executable_path.clone()), - screenshot_dir: b + screenshot_dir: browser_override .screenshot_dir .map(PathBuf::from) .or_else(|| base.screenshot_dir.clone()), chrome_cache_dir: chrome_cache_dir.clone(), } })As per coding guidelines, "Don't abbreviate variable names. Use
queuenotq,messagenotmsg,channelnotch. Common abbreviations likeconfigare fine."🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/config.rs` around lines 4396 - 4420, Rename the abbreviated closure variable `b` to a descriptive name such as `browser_override` in the browser mapping closure so it complies with naming guidelines; update the mapping expression that constructs `BrowserConfig` (the closure passed to toml.defaults.browser.map) to use `browser_override` wherever `b` was used (e.g., `browser_override.enabled`, `browser_override.headless`, `browser_override.executable_path`, `browser_override.screenshot_dir`) while leaving surrounding symbols (`BrowserConfig`, `base_defaults.browser`, `chrome_cache_dir`, and the unwrap_or_else branch) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In `@src/config.rs`:
- Around line 4396-4420: Rename the abbreviated closure variable `b` to a
descriptive name such as `browser_override` in the browser mapping closure so it
complies with naming guidelines; update the mapping expression that constructs
`BrowserConfig` (the closure passed to toml.defaults.browser.map) to use
`browser_override` wherever `b` was used (e.g., `browser_override.enabled`,
`browser_override.headless`, `browser_override.executable_path`,
`browser_override.screenshot_dir`) while leaving surrounding symbols
(`BrowserConfig`, `base_defaults.browser`, `chrome_cache_dir`, and the
unwrap_or_else branch) unchanged.
Summary
_fetcher-rustls-tokiofeature so Chrome is downloaded on first browser tool use and cached in{data_dir}/chrome_cache/(same pattern as fastembed'sembedding_cache/). Works on bare binary (macOS/Linux), Docker, and hosted instances.slim/fullinto a single stage. Dropapt-get install chromium(~200MB), keep only the runtime shared libraries (~30MB). Image shrinks from 369MB to ~167MB compressed./tmp/chromiumoxide-runner/. AddsDropimpl onBrowserStateto clean up on worker exit. Fixes the"Browser process exited with status ExitStatus(unix_wait_status(5376))"error users were hitting.Chrome Detection Chain
executable_pathfrom config (user override)CHROME/CHROME_PATHenvironment variablesBrowserFetcher(cached, ~150MB first download)Changes
Cargo.toml_fetcher-rustls-tokiofeaturesrc/tools/browser.rsDropimplsrc/config.rschrome_cache_dirtoBrowserConfigsrc/update.rsImageVariant, simplifyresolve_target_imageDockerfilechromiumpkg, keep runtime libs.github/workflows/release.ymlfly.toml,fly.staging.tomltarget = "full"Migration
:fullor:slimtags should switch to:latestor versioned tags. Legacy variant suffixes are stripped byresolve_target_image.Verification
cargo check— cleancargo fmt --check— cleancargo clippy --all-targets— cleancargo test --lib— 274 passed