@falconcr
Contributor

Description

This PR implements a natural language interface for observability by connecting the Gemini CLI to Grafana/Loki using the Docker MCP Toolkit.

The implementation allows DevOps engineers to query infrastructure logs and metrics using plain English. The workflow follows a three-step automated sequence:

Datasource Discovery: The system automatically identifies available telemetry backends (e.g., Loki) using the list_datasources tool.

Contextual Translation: Gemini translates natural language prompts (e.g., filtering by device_name) into technical LogQL queries autonomously.

Automated Diagnostics: The system summarizes raw logs and proactively identifies critical anomalies, such as node_filesystem_device_error, without explicit user instruction.
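To make the Contextual Translation step concrete, here is a minimal sketch of the kind of LogQL a prompt such as "filter by device_name and look for nginx" could map to. The helper `build_logql` is hypothetical and is not part of the PR or the MCP server; it only illustrates the shape of a stream selector followed by a line filter.

```python
from typing import Dict, Optional


def build_logql(labels: Dict[str, str], keyword: Optional[str] = None) -> str:
    """Build a LogQL log query: a stream selector plus an optional line filter."""

    def esc(value: str) -> str:
        # Escape backslashes and double quotes so the value stays a valid LogQL string.
        return value.replace("\\", "\\\\").replace('"', '\\"')

    # Stream selector, for example {device_name="edge-device-01"}
    selector = ",".join(
        f'{name}="{esc(value)}"' for name, value in sorted(labels.items())
    )
    query = "{" + selector + "}"

    # Line filter: keep only lines that contain the keyword.
    if keyword:
        query += f' |= "{esc(keyword)}"'
    return query


query = build_logql({"device_name": "edge-device-01"}, "nginx")
print(query)  # {device_name="edge-device-01"} |= "nginx"
```

The `|=` filter is case-sensitive, so a prompt that mentions "Nginx" and logs that contain "nginx" would need the model to normalize the keyword.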

Reviews

Checklist

[x] Technical review

[ ] Editorial review

[ ] Product review

@netlify

netlify bot commented Feb 11, 2026

Deploy Preview for docsdocker ready!

| Name | Link |
| --- | --- |
| 🔨 Latest commit | 51e0640 |
| 🔍 Latest deploy log | https://app.netlify.com/projects/docsdocker/deploys/698ce814d44ded00087706f4 |
| 😎 Deploy Preview | https://deploy-preview-24134--docsdocker.netlify.app |

@craig-osterhout
Contributor

/review

Contributor

@github-actions github-actions bot left a comment

Documentation Review Summary

I've reviewed the new Grafana MCP integration guide. The content is well-structured and provides a useful tutorial for DevOps engineers. However, I found several issues that should be addressed to improve clarity and completeness:

HIGH Severity Issues (3):

  • Missing explanation for security recommendations
  • Unclear description of AI behavior in logs inspection
  • Incomplete help/contact information

MEDIUM Severity Issues (5):

  • Wordy phrasing that could be more concise
  • Passive voice construction
  • Marketing metaphors instead of direct language
  • Unclear technical assumptions about Kubernetes
  • Promotional tone in closing section

All issues are detailed in inline comments below.



## 1. Provisioning Grafana Access

Contributor

Missing explanation for security recommendation

The statement "Using a personal API key is discouraged for security best practices" doesn't explain WHY this matters to users.

Suggestion: Add a brief explanation:

The MCP server requires a **Service Account Token** to interact with the Grafana API. Service Account Tokens are preferred over personal API keys because they can be revoked independently without affecting user access, and permissions can be scoped more narrowly.

This helps users understand the security benefit and makes the recommendation actionable.


![Gemini gets the Grafana's logs from MCP docker](images/mcp-docker-grafana-loki-2.webp)

In the final step, Gemini performs reasoning over the raw data. It filters through hundreds of lines of telemetry to confirm that Nginx logs exist, but it goes beyond the original prompt by identifying a critical anomaly. It flags a `node_filesystem_device_error`, alerting the DevOps engineer to a potential hardware or volume mounting issue on the edge node. The process concludes by turning a simple question into an actionable incident report.
Contributor

Unclear AI behavior description

The statement that Gemini "goes beyond the original prompt by identifying a critical anomaly" creates ambiguity. Was the node_filesystem_device_error actually present in the Loki logs, or did Gemini infer it from limited data?

Suggestion: Clarify what actually happened:

  • If the error was in the logs: "Gemini analyzes the logs and identifies a node_filesystem_device_error in the data..."
  • If Gemini inferred it: "Based on the log patterns, Gemini identifies a potential node_filesystem_device_error..."

This factual clarity helps users understand what MCP/Gemini actually does versus what it infers.
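One way to settle the question above is to check whether the identifier appears verbatim in the lines Loki returned. The sketch below is only an illustration: the function name `classify_finding` and the sample lines are hypothetical and are not taken from the guide or from real logs.

```python
def classify_finding(identifier, log_lines):
    """Return ('observed', matching_lines) if the identifier appears verbatim
    in the log lines, otherwise ('inferred', [])."""
    matches = [line for line in log_lines if identifier in line]
    return ("observed" if matches else "inferred", matches)


# Hypothetical sample of lines returned by a Loki query.
sample_lines = [
    "GET /healthz 200 nginx/1.25",
    'node_filesystem_device_error{device="/dev/sda1"} 1',
]

status, evidence = classify_finding("node_filesystem_device_error", sample_lines)
print(status, len(evidence))
```

If the status is `observed`, the guide can quote the matching line; if it is `inferred`, the text should say the model suggested the error rather than found it.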


- Optimize your queries: Experiment with advanced LogQL prompts to create automated health reports.

Need help setting up your Docker MCP environment or customizing your Gemini prompts? Let's know.
Contributor

Incomplete help information

"Let's know" is grammatically incomplete and provides no actionable way for users to get help.

Suggestion: Replace with concrete resources:

Need help setting up your Docker MCP environment or customizing your Gemini prompts? Visit the [Docker Community Forums](https://forums.docker.com) or see the [MCP Troubleshooting Guide](/path/to/guide).

Provide actual links or contact methods so users know where to get support.

---

# Integrating Gemini CLI with Grafana via Docker MCP Toolkit

Contributor

Wordy phrasing

"outlines the process of connecting" is unnecessarily verbose.

Suggestion: Simplify to:

This guide shows how to connect Gemini CLI to a Grafana instance using the **Docker MCP Toolkit**.

Per the style guide's "Quick transformations" guidance, prefer direct, concise language.

1. Navigate to **Administration > Users and access > Service accounts** in your Grafana dashboard.
2. Create a new Service Account (e.g., `gemini-mcp-connector`).
3. Assign the **Viewer** role (or **Editor** if you require alert management capabilities).
4. Generate a new token. **Copy the token value immediately**; it will not be displayed again.
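The steps above use the Grafana UI. For readers who script their setup, the same two actions (create the account, then generate a token) can be sketched as HTTP requests. This is a sketch under assumptions and is not part of the PR: the endpoint paths and payload fields are based on Grafana's service account HTTP API and should be verified against your Grafana version, and `base_url`, `admin_token`, and the token name are hypothetical placeholders. The code only builds the requests; it does not send them.

```python
import json
import urllib.request


def _post(url, admin_token, payload):
    """Build (but do not send) an authenticated JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {admin_token}",
            "Content-Type": "application/json",
        },
    )


def service_account_request(base_url, admin_token, name, role="Viewer"):
    # Assumed endpoint: POST /api/serviceaccounts
    return _post(
        f"{base_url}/api/serviceaccounts", admin_token, {"name": name, "role": role}
    )


def token_request(base_url, admin_token, account_id, token_name):
    # Assumed endpoint: POST /api/serviceaccounts/<id>/tokens
    url = f"{base_url}/api/serviceaccounts/{account_id}/tokens"
    return _post(url, admin_token, {"name": token_name})


# Placeholder values for illustration only.
create_req = service_account_request(
    "http://localhost:3000", "ADMIN_TOKEN", "gemini-mcp-connector"
)
print(create_req.get_method(), create_req.full_url)
```

Passing either request to `urllib.request.urlopen` would send it. The token value in the response should be stored right away, for the same reason step 4 gives.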
Contributor

Passive voice

"it will not be displayed again" uses passive voice unnecessarily.

Suggestion:

Copy the token immediately—you won't be able to view it again.

Active voice is more direct and personal per the style guide.

## 2. MCP Server Configuration

The Docker MCP Toolkit provides a pre-configured Grafana catalog item. This acts as the bridge between the LLM and the Grafana API.

Contributor

Marketing metaphor

"This acts as the bridge between" uses metaphorical language instead of being direct.

Suggestion:

This connects the LLM to the Grafana API.

Per the style guide on avoiding marketing language: "Be concise. Don't bulk up communication with fluffy words or complex metaphors."

### Logs Inspection

The sequence initiates with the User Prompt: "I would like to filter logs based on the device_name=edge-device-01 label. Are there logs about nginx in the last 5 minutes?". At this stage, the Gemini model performs intent parsing. It identifies the specific metadata required—a label (device_name) and a keyword (nginx)—and realizes it needs external data to fulfill the request. This triggers the list_datasources tool through the MCP Server to locate the telemetry backend.
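The discovery step described above can be sketched as follows. This is an illustration under assumptions: the shape of the `list_datasources` response (a list of records with `uid`, `name`, and `type` fields) is hypothetical, as are the helper names. It shows picking the Loki backend and turning "the last 5 minutes" into an explicit time range.

```python
from datetime import datetime, timedelta, timezone


def pick_datasource(datasources, wanted_type="loki"):
    """Return the first datasource whose type matches, or None if there is none."""
    for ds in datasources:
        if ds.get("type") == wanted_type:
            return ds
    return None


def last_minutes(minutes, now=None):
    """Return (start, end) covering the last `minutes` minutes, in UTC."""
    end = now or datetime.now(timezone.utc)
    return end - timedelta(minutes=minutes), end


# Hypothetical response from a list_datasources call.
datasources = [
    {"uid": "prom-1", "name": "Prometheus", "type": "prometheus"},
    {"uid": "loki-1", "name": "Loki", "type": "loki"},
]

loki = pick_datasource(datasources)
start, end = last_minutes(5, now=datetime(2026, 2, 11, 12, 0, tzinfo=timezone.utc))
print(loki["uid"], start.isoformat(), end.isoformat())
```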

Contributor

Unclear technical assumption

The statement "This query targets the specific Kubernetes pod logs, extracting raw OpenTelemetry (OTel) data that includes pod UIDs and container metadata" assumes Kubernetes context that wasn't established earlier.

Suggestion: Either:

  1. Show a Kubernetes-specific LogQL query example, OR
  2. Remove the Kubernetes reference and keep it generic: "This query targets logs matching the device label, extracting raw log data..."

The current phrasing makes assumptions about the query's target that aren't evident from the LogQL shown.

5. If you find the root cause, use create_incident to start the formal response and add_activity_to_incident to log your findings.

## Next steps?

Contributor

Promotional tone shift

The "Next steps?" section shifts from instructional to promotional with phrases like "Don't let critical logs get buried under layers of infrastructure noise."

Suggestion: Provide concrete next steps instead:

## Next steps

- Learn about [Advanced LogQL queries](/link/to/guide)
- Set up [Team-wide MCP configurations](/link/to/guide)
- Explore [Grafana alerting with MCP](/link/to/guide)
- Get help in the [Docker Community Forums](https://forums.docker.com)

Focus on actionable resources rather than motivational statements.

Contributor

@dvdksn dvdksn left a comment

Thank you for this @falconcr

For now, just a heads-up that we're about to release some changes to MCP Toolkit that affect the UX and would impact some of the steps in this guide. We're introducing an abstraction layer for managing server<>client configurations, called profiles.

The guide as such will still be useful. But I think we should hold off on publishing it until profiles have been released, to avoid the double work of publishing it now and then updating it in a week or two.
