docs: outputs: add ZeroBus output plugin documentation #2537

mats16 wants to merge 1 commit into fluent:master
Conversation
📝 Walkthrough

Adds documentation for a new Fluent Bit ZeroBus output plugin and a SUMMARY entry linking to the new page. The docs describe prerequisites, configuration parameters, examples, and the record transformation order.
Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant FluentBit as Fluent Bit<br/>(Collector)
    participant Auth as OAuth2<br/>(Service Principal)
    participant ZeroBus as ZeroBus<br/>Endpoint
    participant Databricks as Databricks<br/>(ZeroBus Ingest)
    FluentBit->>Auth: Request access token (client_id, client_secret)
    Auth-->>FluentBit: Return access token
    FluentBit->>ZeroBus: POST records + Authorization: Bearer token
    ZeroBus-->>Databricks: Deliver ingested records to Unity Catalog table
    Databricks-->>ZeroBus: Ack/ingest result
    ZeroBus-->>FluentBit: HTTP response (success/failure)
```
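The token-then-ingest handshake in the diagram can be sketched in Python. This is a rough illustration of the OAuth2 client-credentials flow only; the URLs, payload shape, and helper names below are invented for the example and are not the plugin's actual implementation.

```python
import json
import urllib.parse
import urllib.request

# Hypothetical endpoints -- the real token and ingest URLs come from the
# workspace and ZeroBus configuration, not from this sketch.
TOKEN_URL = "https://example-workspace.cloud.databricks.com/oidc/v1/token"
INGEST_URL = "https://example-zerobus-endpoint/ingest"


def build_token_request(client_id: str, client_secret: str) -> urllib.request.Request:
    """Step 1 in the diagram: OAuth2 client-credentials token request."""
    body = urllib.parse.urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode()
    return urllib.request.Request(
        TOKEN_URL,
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )


def build_ingest_request(access_token: str, records: list) -> urllib.request.Request:
    """Step 2 in the diagram: POST records with a bearer token."""
    return urllib.request.Request(
        INGEST_URL,
        data=json.dumps(records).encode(),
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )
```

Sending the requests (with `urllib.request.urlopen`) and handling the success/failure response would follow the last arrow in the diagram; that part is omitted here since the real wire format is defined by the plugin, not this sketch.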
- Introduced new ZeroBus output plugin for sending logs to Databricks via the ZeroBus streaming ingestion interface.
- Updated SUMMARY.md to include ZeroBus in the list of output plugins.
- Provided detailed configuration parameters, usage examples, and record format transformations for the ZeroBus plugin.

Signed-off-by: mats <mats.kazuki@gmail.com>
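Based on the parameter names listed in this PR, a classic-format configuration for the plugin might look roughly like the following sketch (all values are hypothetical placeholders, not defaults from the docs):

```text
[OUTPUT]
    Name           zerobus
    Match          *
    workspace_url  https://my-workspace.cloud.databricks.com
    table_name     main.default.fluentbit_logs
    client_id      my-service-principal-id
    client_secret  ${ZEROBUS_CLIENT_SECRET}
    time_key       _time
    add_tag        true
```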
e387cd1 to 5dac612
🧹 Nitpick comments (3)
pipeline/outputs/zerobus.md (3)
81-85: Consider rewording the ordered steps to reduce repeated sentence starts.

The repeated "If …" pattern across consecutive steps is readable but slightly mechanical; a light rewrite would improve flow.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` around lines 81 - 85, The bullet list is repetitive because each line starts with "If ..."; rephrase to vary sentence starts while preserving meaning by grouping related actions and using active phrasing: mention raw_log_key behavior (capture full original record as JSON and inject under configured key unless it exists), describe log_key behavior (include only specified keys), then mention time_key (inject RFC3339 timestamp with nanosecond precision unless key exists) and add_tag (inject Fluent Bit tag as _tag unless key exists); keep the examples (timestamp format) and the "unless a key with that name already exists" clause attached to each relevant item and reference the unique config names raw_log_key, log_key, time_key, and add_tag to locate the lines to edit.
23-23: Add a short secret-handling note for `client_secret`.

Please add guidance to avoid storing `client_secret` in plaintext config files (for example, prefer environment variable substitution or secret-management integration).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` at line 23, Add a short secret-handling note for the `client_secret` field advising not to store `client_secret` in plaintext configuration files; recommend using environment variable substitution (e.g., reading from ENV) or integrating with a secrets manager (Vault, AWS Secrets Manager, etc.), and mention limiting access and rotation as best practices so consumers of `client_secret` know secure handling expectations.
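The suggested note could point readers at Fluent Bit's environment-variable substitution in classic config files, for example (the variable name here is made up for illustration):

```text
# Export the secret outside the config file, e.g. in the service unit:
#   export ZEROBUS_CLIENT_SECRET="..."
[OUTPUT]
    Name           zerobus
    Match          *
    client_secret  ${ZEROBUS_CLIENT_SECRET}
```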
11-11: Use a technical docs link instead of the vendor homepage.

Linking to the general Databricks homepage is less neutral/technical than linking to the relevant product documentation page for ZeroBus/ingestion setup.
Based on learnings: ensure Markdown stays technical and neutral, and avoid promotional links.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` at line 11, The README sentence starting "The _ZeroBus_ output plugin lets you ingest log records into a Databricks table..." links to the Databricks homepage; replace that href with the official Databricks technical documentation for ZeroBus/streaming ingestion (the product docs page for ZeroBus or ingestion setup) so the link is neutral and technical. Edit pipeline/outputs/zerobus.md and update the Markdown link target in the sentence mentioning "_ZeroBus_" to point to the ZeroBus/streaming ingestion docs URL, keeping the visible text unchanged.
📒 Files selected for processing (2)

- SUMMARY.md
- pipeline/outputs/zerobus.md
🧹 Nitpick comments (1)
pipeline/outputs/zerobus.md (1)
79-85: Consider varying sentence structure for better readability.

The transformation steps are clear and well-ordered. However, all five steps begin with "If", which creates repetitive sentence structure. Consider rephrasing for variety while maintaining clarity.
📝 Example rewrite with varied sentence structure
```diff
 Each log record is converted to a JSON object before ingestion. The plugin applies the following transformations in order:
-1. If `raw_log_key` is set, the full original record is captured as a JSON string before any filtering.
-2. If `log_key` is set, only the specified keys are included in the output record.
-3. If `raw_log_key` is set, the captured JSON string is injected under the configured key (unless a key with that name already exists).
-4. If `time_key` is set, a timestamp in RFC 3339 format with nanosecond precision (for example, `2024-01-15T10:30:00.123456789Z`) is injected (unless a key with that name already exists).
-5. If `add_tag` is enabled, the Fluent Bit tag is injected as `_tag` (unless a key with that name already exists).
+1. When `raw_log_key` is set, the full original record is captured as a JSON string before any filtering.
+2. If `log_key` is set, only the specified keys are included in the output record.
+3. The captured JSON string (if enabled) is injected under the configured `raw_log_key` (unless a key with that name already exists).
+4. A timestamp in RFC 3339 format with nanosecond precision (for example, `2024-01-15T10:30:00.123456789Z`) is injected under `time_key` (unless disabled or a key with that name already exists).
+5. When `add_tag` is enabled, the Fluent Bit tag is injected as `_tag` (unless a key with that name already exists).
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pipeline/outputs/zerobus.md` around lines 79 - 85, The steps all start with "If" creating repetitive phrasing; rewrite the five bullet sentences to vary sentence openings while keeping their order and meaning—e.g., start some with conditionals ("When `raw_log_key` is set..."), others with actions ("Capture the full original record as a JSON string..."), and some with clauses ("When `log_key` is set, include only the specified keys..."); preserve references to `raw_log_key`, `log_key`, `time_key`, `add_tag`, and the injected `_tag`, and retain the semantics about not overwriting existing keys and RFC 3339 nanosecond precision for `time_key`.
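For reviewers checking the semantics rather than the wording, the five-step transformation order described in the docs can be sketched as a small Python helper (a hypothetical model of the documented behavior, not the plugin's C implementation):

```python
import json
from datetime import datetime, timezone


def transform_record(record, tag, raw_log_key=None, log_key=None,
                     time_key=None, add_tag=False):
    """Model of the documented transformation order (illustrative only)."""
    # 1. Capture the full original record as a JSON string before any filtering.
    raw = json.dumps(record) if raw_log_key else None
    # 2. If log_key is set, keep only the specified keys.
    if log_key:
        record = {k: v for k, v in record.items() if k in log_key}
    else:
        record = dict(record)
    # 3. Inject the captured JSON string unless the key already exists.
    if raw_log_key and raw_log_key not in record:
        record[raw_log_key] = raw
    # 4. Inject an RFC 3339 UTC timestamp unless the key already exists.
    #    (strftime only gives microseconds; pad to nine fractional digits.)
    if time_key and time_key not in record:
        record[time_key] = datetime.now(timezone.utc).strftime(
            "%Y-%m-%dT%H:%M:%S.%f000Z")
    # 5. Inject the Fluent Bit tag as _tag unless the key already exists.
    if add_tag and "_tag" not in record:
        record["_tag"] = tag
    return record
```

Note how step 1 runs before the `log_key` filtering in step 2, which is exactly the ordering subtlety the review comment asks the rewrite to preserve.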
✅ Files skipped from review due to trivial changes (1)
- SUMMARY.md
@mats16 can you link the code PR this output plugin doc is dependent on? Also, the linting/Vale errors need to be cleaned up. Note that you can fix the spelling issue with ZeroBus in the docs by using backticks around it. I'll review once these are addressed.
Summary

- `out_zerobus` output plugin that sends logs to Databricks tables via the ZeroBus streaming ingestion interface
- Configuration parameters: `endpoint`, `workspace_url`, `table_name`, `client_id`, `client_secret`, `add_tag`, `time_key`, `log_key`, `raw_log_key`

Test plan
This pull request was AI-assisted by Claude.
Summary by CodeRabbit