diff --git a/.claude/skills/ml-logging-linter/SKILL.md b/.claude/skills/ml-logging-linter/SKILL.md new file mode 100644 index 0000000..5049c37 --- /dev/null +++ b/.claude/skills/ml-logging-linter/SKILL.md @@ -0,0 +1,212 @@ +--- +name: ml-logging-linter +description: >- + Scan JS/TS files for Mojaloop logging rule violations and produce a report with + fix suggestions. Checks rules covering error handling, + trace context, log levels, sensitive data, OTel attributes, and semantic patterns. + Use this skill whenever: auditing JS/TS code for logging quality, reviewing logging + practices across a Mojaloop service, checking if log statements follow standards, + preparing a PR that touches logging code, or assessing logging hygiene in any + Mojaloop/PM4ML codebase. Also activate when the user says "logging lint", "log review", + "logging audit", "check my logging", "logging violations", "scan for bad logging", + or asks to find logging anti-patterns in JS/TS files. Works on single files or folders. +argument-hint: ' [instructions]' +--- + +# Mojaloop Logging Linter + +Scan JS/TS source files against Mojaloop logging rules and report violations with +fix suggestions. This is an LLM-powered analysis — it uses semantic understanding of the +code rather than AST parsing, so it can catch patterns that mechanical linters miss (like +context-dependent level mismatches or implicit sensitive data exposure). It complements +the ESLint plugin, which handles CI enforcement. + +## Rules Source + +Read the full rule definitions (descriptions, severity, traceability, good/bad examples): + +``` +./references/RULES.md +``` + +If the user provides an alternative RULES.md path in `[instructions]` (e.g., "use rules +from /path/to/custom-rules.md"), use that file instead. If a user-specified path is +unreadable, report the error — do not fall back silently. + +If the default RULES.md path cannot be read (and no user override was given), **warn the +user** that analysis will use a reduced rule set, then fall back to +`references/rules-summary.md` in this skill directory. The summary has rule names, +severities, and one-line descriptions — enough for useful analysis, though without the +detailed examples. + + +## Input Parsing + +**Invocation:** `/ml-logging-linter [instructions]` + +1. **``** — resolve to absolute path. Determine if it's a file or directory. +2. **`[instructions]`** — parse for: + - **Rule filters**: "only errors", "skip no-console", "focus on sensitive-data" + - **Context hints**: "Kafka consumer handler", "HTTP middleware", "uses custom Logger class" + - **Scope overrides**: "include test files", "only src/handlers/" + +If no `[instructions]` are given, apply all rules defined in RULES.md. + + +## Workflow + +### Single File + +Read the rules, read the file, analyze inline (no agents needed). Produce the per-file +report below. + +### Folder Mode + +1. **Discover files**: Glob `**/*.{js,ts,mjs,mts}` in the target directory. + Auto-exclude: `node_modules/`, `dist/`, `build/`, `coverage/`, `.git/`. + Respect `.gitignore` if present. The user can override via instructions. + +2. **Read RULES.md** once, before spawning agents. + +3. **Adaptive batching** — choose strategy based on file count: + + | Files found | Strategy | + |-------------|----------| + | 1 | Analyze inline (same as single file mode) | + | 2-10 | 1 agent per file — maximum parallelism | + | 11+ | ~5 files per agent, grouped by directory. Cap at 10 agents max — increase batch size for larger sets | + | 0 | Report "no matching files found" and exit | + +4. **Spawn agents** using the Agent tool. Pass each agent: + - The RULES.md content (so agents don't each read the file independently) + - Its assigned file paths + - The user's `[instructions]` + - The per-file report format template + - Instruction to follow ✅ Good patterns from RULES.md for fix suggestions + +5. **Collect results** from all agents and produce the aggregate report. + +**Agent prompt template:** + +``` +Analyze these JS/TS files for Mojaloop logging rule violations. + +## Rules +{RULES_CONTENT} + +## Files +{FILE_PATHS} + +## User instructions +{INSTRUCTIONS_OR_NONE} + +## Logger Detection +Identify ALL logging statements. Cast a wide net: +- `logger.*`, `log.*`, `Logger.*`, `this.logger.*`, `this.log.*` +- `console.*` (console.log, console.error, console.warn, etc.) +- Any variable imported from a logger module (scan `require`/`import` statements) +- Note: `Logger.*` (uppercase) is itself a `deprecated-logger` violation — always flag it. +If unsure whether something is a logger call, include it in detection. If analysis +reveals it is not a logger, exclude it from the report. + +For each file: +1. Read the file content +2. Identify ALL logging statements using the patterns above +3. Check each statement against the applicable rules +4. Group related violations on the same line +5. Generate fix suggestions following the ✅ Good examples in the rules + +Return results in the per-file report format. +``` + + +## Analysis Approach + +### Logger Detection + +Cast a wide net — Mojaloop codebases use various logger wrappers. Look for: +- `logger.*`, `log.*`, `Logger.*`, `this.logger.*`, `this.log.*` +- `console.*` (console.log, console.error, console.warn, etc.) +- Any variable imported from a logger module (look at the file's `require`/`import` statements) +- Common patterns: `const Logger = require('...logger')`, `const { loggerFactory } = require(...)` + +Auto-detect the logger variable by scanning imports at the top of the file. If unsure whether +something is a logger call, include it — false positives in detection are better than missed violations. + +### Applying Rules + +RULES.md is the single source of truth for all rule definitions, severities, and examples. +Check every logging statement against every rule defined there. Do not skip rules or invent +new ones — the skill's purpose is to systematically audit code against that exact rule set. + + +## Output Format + +Keep reports concise. Focus on violations — do NOT list rules with no findings or lines +that passed. The reader wants to see what's wrong and how to fix it, not a catalog of +everything that was checked. + +### Per-File Report + +```markdown +### `path/to/file.ts` + +| Line | Severity | Rule | Issue | +|------|----------|------|-------| +| 42 | Error | losing-error-stack, no-error-context | `logger.error(err.message)` — loses stack, no context message | +| 58 | Warning | catch-and-log-bubble | Catches, logs, and rethrows — will duplicate in caller | + +#### Fix: Lines 42, 58 + +```diff +- logger.error(err.message); ++ logger.error('Bulk transfer lookup failed: ', err); +``` + +**Summary:** 2 errors, 1 warning across 8 logging statements +``` + +Key formatting rules: +- Group related violations on the same line into a single table row. +- Combine fix suggestions for violations that share the same pattern — don't repeat + the same diff block for every instance of the same anti-pattern. +- **Never include `err.message` in fix suggestions** — ContextLogger (Mojaloop's + standard logger wrapper) auto-formats errors, extracting message, stack, and cause + chain from the error object. The correct pattern is: + `logger.('static context: ', err)` — preserve the original log level. +- Fix suggestions in diff blocks should always use lowercase `logger` (ContextLogger), + never uppercase `Logger`. When quoting existing code in tables or recommendations, + reproduce the code as-is. +- Fix suggestions must follow the ✅ Good patterns shown in RULES.md. +- **Minimal fixes** — fix only the violated rule, don't restructure surrounding code. + When multiple violations exist on the same line, produce one diff fixing all of them + (this is fixing one statement, not "combining"). "Don't combine" means: don't fix + lines that weren't flagged as violations. + Exception: when the violation is the *absence* of a log statement (`no-silent-catch`, + `no-silent-function`), the fix adds the minimum necessary logging. + +### Cross-File Aggregate (folder mode) + +After all per-file reports, add: + +```markdown +## Aggregate Summary + +**Files scanned:** 24 | **With violations:** 18 +**Violations:** 12 errors, 35 warnings + +### Top Violated Rules +| Rule | Count | Severity | Files affected | +|------|-------|----------|----------------| +| no-stringified-json | 12 | Warning | api.ts, db.ts, kafka.ts, ... | +| generic-log-message | 9 | Warning | handler.ts, service.ts, ... | +| losing-error-stack | 7 | Error | transfers.ts, position.ts, ... | + +### Recommendations +[Top 3 highest-impact fixes to prioritize, based on frequency and severity] +``` + +Focus the recommendations on patterns that can be fixed systematically (e.g., +"All 5 catch-and-log-bubble violations follow the same `Logger.isErrorEnabled && Logger.error(err); throw err` pattern — consider a bulk refactor"). + + diff --git a/.claude/skills/ml-logging-linter/references/RULES.md b/.claude/skills/ml-logging-linter/references/RULES.md new file mode 100644 index 0000000..91220cd --- /dev/null +++ b/.claude/skills/ml-logging-linter/references/RULES.md @@ -0,0 +1,844 @@ +# ESLint Rules Documentation + +This documentation describes the rules included in `eslint-plugin-mojaloop-logging`, their traceability to the Mojaloop Standard, and examples of compliant vs. non-compliant code. + +## Table of Contents + +- [generic-log-message](#generic-log-message) +- [no-manual-trace-id](#no-manual-trace-id) +- [catch-and-log-bubble](#catch-and-log-bubble) +- [http-semantics](#http-semantics) +- [losing-error-stack](#losing-error-stack) +- [no-console](#no-console) +- [no-error-context](#no-error-context) +- [no-manual-level-check](#no-manual-level-check) +- [no-stringified-json](#no-stringified-json) +- [non-standard-attributes](#non-standard-attributes) +- [semantic-log-levels](#semantic-log-levels) +- [sensitive-data](#sensitive-data) +- [sql-semantics](#sql-semantics) +- [unnecessary-debug-guard](#unnecessary-debug-guard) +- [valid-log-levels](#valid-log-levels) +- [deprecated-logger](#deprecated-logger) +- [constant-log-prefix](#constant-log-prefix) +- [no-string-interpolation-context](#no-string-interpolation-context) +- [kafka-semantics](#kafka-semantics) +- [no-silent-catch](#no-silent-catch) +- [no-loop-logging](#no-loop-logging) +- [expected-error-level](#expected-error-level) +- [fspiop-header-handling](#fspiop-header-handling) +- [sql-no-raw-values](#sql-no-raw-values) +- [exception-attributes](#exception-attributes) +- [no-silent-function](#no-silent-function) + +--- + +### generic-log-message + +**Description:** Warns when log messages are too generic (e.g., "failed", "processing", "success") without inline context. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/data_model.md`: "The message (Body) should be human-readable and self-explanatory... Include specific values inline... Examples of poor messages: 'Validation failed', 'Processing transfer'." + +**Examples:** + +❌ **Bad** +```javascript +logger.info("Processing started", { transferId }); +logger.error("Validation failed", { error }); +``` + +✅ **Good** +```javascript +logger.info('processing transfer', { transferId }); +logger.error(`validation failed for transfer ${transferId}: `, error); +``` + +--- + +### no-manual-trace-id + +**Description:** Disallows manual injection of `traceId`, `spanId`, or `traceFlags` into log objects, as these are handled automatically by OpenTelemetry instrumentation. + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/trace_context.md`: "You should **NOT** manually add traceId/spanId to log calls. These are automatically injected." + +**Examples:** + +❌ **Bad** +```javascript +logger.info("Message", { traceId: req.headers['fspiop-trace-id'] }); +``` + +✅ **Good** +```javascript +// Trace context is injected automatically by the logger infrastructure +logger.info("Message"); +``` + +--- + +### catch-and-log-bubble + +**Description:** Detects the "Catch and Log" anti-pattern where an error is caught, logged at `error` level, and re-thrown (directly or wrapped). This causes duplicate error-level logs up the stack. Two variants: +- **Pure rethrow** (`throw err`): remove the log entirely — boundary handler will log it +- **Wrap and rethrow** (`throw reformatFSPIOPError(err)` or `throw new Error('...', { cause: err })`): either remove the log (cause chain is preserved) or downgrade to `warn` for intermediate-layer visibility +- **Exception**: Background jobs and async event handlers without a global error handler may log internally — this is not catch-and-log-bubble + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Do **NOT** catch an error just to log it and throw it again... 'Errors captured in more than one place (often three times)'." + +**Examples:** + +❌ **Bad** +```javascript +// Pure rethrow — log is redundant, boundary handler will log it +try { + await doSomething(); +} catch (err) { + logger.error(err); + throw err; +} + +// Wrap and rethrow — error level at intermediate layer pollutes error aggregations +try { + await doSomething(); +} catch (err) { + logger.error('operation failed: ', err); + throw ErrorHandler.Factory.reformatFSPIOPError(err); +} +``` + +✅ **Good** +```javascript +// Pure rethrow — wrap with context, don't log +try { + await doSomething(); +} catch (err) { + throw new AppError("Context", { cause: err }); +} +// Log only at the edge/global handler + +// Wrap and rethrow — no intermediate log (cause chain preserved) +try { + await doSomething(); +} catch (err) { + throw ErrorHandler.Factory.reformatFSPIOPError(err); +} + +// Wrap and rethrow — warn for intermediate visibility (not error) +try { + await doSomething(); +} catch (err) { + logger.warn('operation failed, wrapping: ', err); + throw ErrorHandler.Factory.reformatFSPIOPError(err); +} +``` + +--- + +### http-semantics + +**Description:** Enforces that HTTP-related logs (mentioning "request" or "response") include OTel HTTP attributes. Required attributes differ by direction: +- **Incoming**: `http.request.method`, `url.path`, `url.scheme`, `request.id`, `http.server.request.duration` +- **Outgoing**: `http.request.method`, `url.full`, `server.address`, `server.port` +- **Conditionally required** (both): `error.type` (if error occurred), `http.response.status_code` (if response sent) +- **Recommended** (outgoing): `service.peer.name` (logical remote service name) + +Duration must be in **seconds** (not milliseconds). `url.full` and `url.query` must redact sensitive query parameters. `error.type` resolution order: `err.code` → `err.name` → `"UnknownError"`. + +Client code should use HTTP wrappers (Hapi plugins, HTTP client wrappers) that handle request/response logging with proper OTel attributes — don't add HTTP attributes manually in handler code. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md` & `standard/data_model.md`: "Use OTel's standard attribute names... HTTP: `http.request.method`, `url.path`..." +- `standard/scenarios/http_requests.md`: direction-specific Required attributes + +**Examples:** + +❌ **Bad** +```javascript +// Missing OTel attributes entirely +logger.info("Incoming HTTP Request"); +// Incomplete — missing direction-specific attributes +logger.info("Incoming HTTP Request", { "http.request.method": "POST" }); +``` + +✅ **Good** +```javascript +// Incoming request — wrapper/plugin handles these attributes +logger.info("Incoming HTTP Request", { + "http.request.method": "POST", + "url.path": "/transfers", + "url.scheme": "https", + "request.id": reqId, + "http.server.request.duration": duration +}); +// Outgoing request — wrapper handles these attributes +logger.info("Outgoing HTTP Request", { + "http.request.method": "GET", + "url.full": "https://peer-dfsp:3000/parties/MSISDN/123", + "server.address": "peer-dfsp", + "server.port": 3000 +}); +``` + +--- + +### losing-error-stack + +**Description:** Prevents logging only `error.message`. The full error object must be passed to ensure the stack trace is captured. + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/best_practices.md`: "**Requirement:** 'Verify that Errors are logged with Error Code, Error Stack defined'." + +**Examples:** + +❌ **Bad** +```javascript +logger.error(`Failed: ${error.message}`); +``` + +✅ **Good** +```javascript +logger.error('operation failed: ', error); +``` + +--- + +### no-console + +**Description:** Disallows the use of `console.log`, `console.error`, etc. in favor of the standard logger. + +**Severity:** `Error` (except in tests/scripts) + +**Standard Traceability:** +- `standard/best_practices.md`: "Do **not** use `console.log`... Delegate collection... to external tools." + +**Examples:** + +❌ **Bad** +```javascript +console.log("Server started"); +``` + +✅ **Good** +```javascript +logger.info("Server started"); +``` + +--- + +### no-error-context + +**Description:** Ensures error logs include context. Raw errors should either be passed alongside a message or wrapped in a context object. + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/best_practices.md`: "Missing Context in Message... Message doesn't explain what happened." + +**Examples:** + +❌ **Bad** +```javascript +logger.error(error); // Just the error, no message/context +``` + +✅ **Good** +```javascript +logger.error('transfer failed: ', error); +``` + +--- + +### no-manual-level-check + +**Description:** Finds usages of `if (logger.level === 'debug')` or similar manual checks. `loggerFactory` handles this efficiently. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Use `loggerFactory` from contextLogger - It checks `isEnabled` internally." + +**Examples:** + +❌ **Bad** +```javascript +if (config.logLevel === 'debug') { + logger.debug('...'); +} +``` + +✅ **Good** +```javascript +// Just log it. The factory handles the check. +logger.debug('...'); +``` + +--- + +### no-stringified-json + +**Description:** Detects `JSON.stringify()` calls inside log routines. Logging libraries handle serialization; stringifying manually double-encodes and hurts performance. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Avoid JSON.stringify - Let the logging library handle serialization." + +**Examples:** + +❌ **Bad** +```javascript +logger.info(`Data: ${JSON.stringify(data)}`); +``` + +✅ **Good** +```javascript +logger.info('Data received', { data }); +``` + +--- + +### non-standard-attributes + +**Description:** Enforces OpenTelemetry naming conventions (dot notation) for attributes like `exception.message`, `exception.stacktrace`, `duration.ms`. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/data_model.md`: "Use OTel's standard attribute names... `exception.message`, `exception.stacktrace`, `duration.ms`." + +**Examples:** + +❌ **Bad** +```javascript +logger.info("Done", { duration: 500, errorMsg: err.message }); +``` + +✅ **Good** +```javascript +logger.info("Done", { "http.server.request.duration": 0.5, "url.path": "/transfers" }); +``` + +--- + +### semantic-log-levels + +**Description:** Uses heuristics to check if the wrong log level is used based on keywords (e.g., "failed" in an INFO log, or "retrying" in an ERROR log). Mojaloop-specific heuristics: +- **Expected DB errors** (ER_DUP_ENTRY, constraint violations) should be `warn`, not `error` +- **verbose vs debug**: verbose = flow events (entry/exit), debug = data contents +- **info = business events only**, not internal implementation details + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/log_levels.md`: Defines when to use FATAL, ERROR, WARN, INFO, DEBUG. + +**Examples:** + +❌ **Bad** +```javascript +logger.info("Transaction failed"); // Failure should be Error +logger.error("Retrying connection..."); // Recoverable retry should be Warn +logger.error("Duplicate entry: ", err); // ER_DUP_ENTRY is expected, should be Warn +logger.debug("entering validateTransfer"); // Flow event should be Verbose +logger.info("cache miss, fetching from DB"); // Internal detail, not a business event +``` + +✅ **Good** +```javascript +logger.error("Transaction failed"); +logger.warn("Retrying connection..."); +logger.warn("Duplicate entry (idempotent write): ", err); +logger.verbose("entering validateTransfer"); +logger.debug("cache miss, fetching from DB"); +logger.info("transfer committed successfully: ", { transferId }); +``` + +--- + +### sensitive-data + +**Description:** Partial matches for sensitive terms in log arguments: password, token, secret, ssn, msisdn, key, apiKey, privateKey, bankAccount, accountNumber, bearer, authorization, pin, otp, creditCard, cardNumber. Two handling levels: +- **Never log**: passwords, tokens, secrets, keys, PINs, OTPs, authorization headers +- **Mask** (if needed): bank accounts → `****1234`, MSISDNs → `****5678`, card numbers → `****1234` + +Redaction should be handled at the transport/config level, not per call site. + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/security.md`: "Strictly avoid logging sensitive information." + +**Examples:** + +❌ **Bad** +```javascript +logger.info("User login", { password: user.password }); +logger.info("Transfer", { accountNumber: "1234567890" }); +``` + +✅ **Good** +```javascript +logger.info("User login", { userId: user.id }); +// If account reference is needed, mask it +logger.info("Transfer", { accountNumber: "****7890" }); +``` + +--- + +### sql-semantics + +**Description:** Enforces that SQL-related logs (mentioning "sql" or "query") include OTel database attributes. Required: `db.query.text` and `db.system.name`. Recommended: `db.operation.name`, `db.namespace`, `db.collection.name`. Duration must be in **seconds** (not milliseconds). Client code should use a centralized DB wrapper (e.g., `createMysqlQueryBuilder`) that handles query logging with proper OTel attributes — don't add SQL attributes manually in application code. Never log `db.query.parameter.*` in production — mask PII even at TRACE level. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Database: `db.system.name`, `db.query.text`..." +- `standard/scenarios/sql_queries.md`: Required and recommended attribute sets + +**Examples:** + +❌ **Bad** +```javascript +logger.debug("Executing SQL Query"); +// Missing db.system.name +logger.debug("Executing SQL Query", { "db.query.text": "SELECT * FROM users" }); +// Duration in milliseconds (should be seconds) +logger.debug("SQL query completed", { "db.client.operation.duration": 150 }); +``` + +✅ **Good** +```javascript +// DB wrapper handles logging with proper OTel attributes +logger.debug("Executing SQL Query", { + "db.query.text": "SELECT * FROM users WHERE id = ?", + "db.system.name": "mysql", + "db.operation.name": "SELECT", + "db.namespace": "central_ledger", + "db.collection.name": "users", + "db.client.operation.duration": 0.15 +}); +``` + +--- + +### unnecessary-debug-guard + +**Description:** Flags `isDebugEnabled()` checks protecting simple log statements. This check should only be used for expensive computations. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Use `isDebugEnabled()` only for expensive computation - Guard... only when arguments ... require expensive work." + +**Examples:** + +❌ **Bad** +```javascript +if (logger.isDebugEnabled()) { + logger.debug("Hello world"); +} +``` + +✅ **Good** +```javascript +logger.debug("Hello world"); + +// OR (Valid use case) +if (logger.isDebugEnabled()) { + logger.debug("Report", { report: buildExpensiveReport() }); +} +``` + +--- + +### valid-log-levels + +**Description:** Enforces standard Mojaloop log levels (fatal, error, warn, info, debug, trace, verbose, perf, silly, audit). + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/log_levels.md`: "Mojaloop log levels map to OpenTelemetry SeverityNumber ranges..." + +**Examples:** + +❌ **Bad** +```javascript +logger.critical("Crash!"); // 'critical' is not a standard level +logger.warning("Warning"); // 'warning' should be 'warn' +``` + +✅ **Good** +```javascript +logger.fatal("Crash!"); +logger.warn("Warning"); +``` + +--- + +### deprecated-logger + +**Description:** Flags usage of the legacy uppercase `Logger` variable instead of `logger` (ContextLogger instance). The uppercase `Logger` is the old naming convention that often comes with unnecessary `isErrorEnabled` guards and lacks `.child()` usage. All code should use a lowercase `logger` — the ContextLogger instance that supports `.child()`, async context propagation, and structured logging. There should be only one `loggerFactory` call per project (in a shared logger module); all other files import and reuse that instance. If a file imports both `Logger` and `logger`, the dual-import is the root issue. + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/best_practices.md`: "Use `loggerFactory` from contextLogger — It checks `isEnabled` internally before calling Winston, so log arguments are only serialized when the level is active." + +**Examples:** + +❌ **Bad** +```javascript +const Logger = require('../../shared/logger').logger +Logger.isErrorEnabled && Logger.error(err) +``` + +✅ **Good** (when the project already has a shared logger module) +```javascript +const { logger } = require('../../shared/logger') +logger.error('operation failed: ', err) +``` + +✅ **Good** (shared logger module — one per project, calls loggerFactory once) +```javascript +// e.g. src/shared/logger.js or src/lib/logger.js or src/logger.js +const { loggerFactory } = require('@mojaloop/central-services-logger/src/contextLogger') + +const logger = loggerFactory('MY-SERVICE') + +module.exports = { + logger +} +``` + +--- + +### constant-log-prefix + +**Description:** Log messages must start with a searchable prefix — either a string literal or a `const` variable with a known, bounded value. The goal is that operators can find the log line in aggregators (Kibana, Grafana Loki) by searching for a predictable string. Function calls with unpredictable output, mutable variables, and high-cardinality values (IDs, timestamps) must not be the prefix — put them after it. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/data_model.md`: "The message (Body) should be human-readable and self-explanatory..." + +**What counts as searchable:** +- Literal string: `'transfer processing failed: '` +- `const` variable: `const logPrefix = 'domain::prepare'` — known value, reusable +- `const` with bounded conditional: `` const logPrefix = `domain::${isTransfer ? 'transfer' : 'fxTransfer'}::prepare` `` — finite set of values + +**What is NOT searchable (flag these):** +- Function calls: `Util.breadcrumb(location)` — unpredictable, stateful +- Mutable variables: `let prefix = ...` — could change between log statements +- High-cardinality values as prefix: `` `${transferId} — processing` `` — millions of values + +**Examples:** + +❌ **Bad** +```javascript +// Function call as prefix — unpredictable output +logger.error(`${Util.breadcrumb(location)}::${err.message}`) +logger.error(Util.breadcrumb(location) + ' prepare handler failed: ', err) +// High-cardinality ID as prefix +logger.info(`${transferId} — processing transfer`) +``` + +✅ **Good** +```javascript +// Literal prefix +logger.error('transfer processing failed: ', err) + +// const logPrefix reused across multiple log statements +const logPrefix = `domain::${payload.transferId ? 'transfer' : 'fxTransfer'}::prepare` +logger.debug(`${logPrefix}::start: `, { headers }) +logger.info(`${logPrefix}::done: `, { result }) +logger.error(`${logPrefix}::failed: `, err) + +// Preferred: child logger scopes context structurally +const log = logger.child({ component: 'domain::prepare', transferId }) +log.debug('start: ', { headers }) +log.info('done: ', { result }) +log.error('failed: ', err) +``` + +--- + +### no-string-interpolation-context + +**Description:** Dynamic values embedded in message strings instead of structured attributes. Structured attributes are searchable in log aggregators and enable filtering/alerting. Note: string interpolation with a constant prefix is acceptable per `constant-log-prefix` — this rule targets cases where structured attributes would be more appropriate. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/data_model.md`: "The message (Body) should be human-readable and self-explanatory... Use structured attributes for dynamic values." + +**Examples:** + +❌ **Bad** +```javascript +logger.info(`dfsp ${dfspId} with type ${proxyType} has proxy created`) +``` + +✅ **Good** +```javascript +logger.info('proxy created: ', { dfspId, proxyType, endpoint }) +// String interpolation with constant prefix is also acceptable +logger.info(`proxy created for ${dfspId} type=${proxyType}`) +``` + +--- + +### kafka-semantics + +**Description:** Kafka-related logs missing OTel messaging attributes. Client code should use Kafka wrappers (e.g., `central-services-stream` consumer/producer) that handle messaging logging with proper OTel attributes — don't add messaging attributes manually in handler code. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/scenarios/kafka_messaging.md`: OTel messaging semantic conventions for Kafka + +**Examples:** + +❌ **Bad** +```javascript +logger.verbose('producing message to topic') +``` + +✅ **Good** +```javascript +logger.verbose('producing message to topic', { + 'messaging.system': 'kafka', + 'messaging.destination.name': 'topic-transfer-prepare', + 'messaging.operation.name': 'send' +}) +``` + +--- + +### no-silent-catch + +**Description:** Catch blocks that neither log nor rethrow — silently swallowed errors. Complements `catch-and-log-bubble` (which catches the opposite: catch-log-rethrow). + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Do **NOT** silently swallow errors." +- `standard/scenarios/error_handling.md`: error visibility requirements + +**Examples:** + +❌ **Bad** +```javascript +try { await doSomething() } catch (err) { return null } +``` + +✅ **Good** +```javascript +try { + await doSomething() +} catch (err) { + logger.warn('operation failed, returning null: ', err) + return null +} +``` + +--- + +### no-loop-logging + +**Description:** Logging inside tight loops (for, while, forEach). Should batch or log aggregates to avoid log flooding. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Avoid logging inside tight loops — batch or log aggregates." + +**Examples:** + +❌ **Bad** +```javascript +for (const t of transfers) { + logger.verbose('processing transfer: ', { transferId: t.id }) +} +``` + +✅ **Good** +```javascript +logger.verbose(`processing batch of transfers [count: ${transfers.length}]...`) +``` + +--- + +### expected-error-level + +**Description:** Known expected/recoverable errors logged at `error` instead of `warn`. Specific patterns: ER_DUP_ENTRY, constraint violations, validation failures, retries. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/log_levels.md`: "Expected errors (e.g., duplicate key on idempotent write) are WARN, not ERROR." + +**Examples:** + +❌ **Bad** +```javascript +logger.error('Duplicate entry: ', err) // ER_DUP_ENTRY from idempotent write +``` + +✅ **Good** +```javascript +logger.warn('Duplicate entry (idempotent write): ', err) +``` + +--- + +### fspiop-header-handling + +**Description:** FSPIOP-Signature must be hashed before logging, not logged raw. FSPIOP-Encryption: log metadata/algorithm only. + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/scenarios/http_requests.md`: "FSPIOP-Signature must be hashed; FSPIOP-Encryption metadata only." + +**Examples:** + +❌ **Bad** +```javascript +logger.info('request headers: ', { 'fspiop.signature': req.headers['fspiop-signature'] }) +``` + +✅ **Good** +```javascript +logger.info('request signature.hash: ', { 'fspiop.signature.hash': hash(sig) }) +``` + +--- + +### sql-no-raw-values + +**Description:** SQL query text in log attributes must use parameterized placeholders, not interpolated values. Client code should not log DB queries directly — use a centralized DB wrapper that handles query logging with proper OTel attributes (e.g., `createMysqlQueryBuilder` pattern in quoting-service). + +**Severity:** `Error` + +**Standard Traceability:** +- `standard/scenarios/sql_queries.md`: "Query text must use parameterized placeholders." + +**Examples:** + +❌ **Bad** +```javascript +// Raw query with interpolated values +logger.debug('Query', { 'db.query.text': `SELECT * FROM users WHERE email = '${email}'` }) +// Client code logging query at all +logger.debug('executing query: ', { sql: knexQuery.toString() }) +``` + +✅ **Good** +```javascript +// DB wrapper handles logging with parameterized query and OTel attributes +// See: quoting-service/src/data/createMysqlQueryBuilder.js for reference implementation +``` + +--- + +### exception-attributes + +**Description:** Manually specified error attributes must use OTel names (`exception.type`, `exception.message`, `exception.stacktrace`), not camelCase variants. Additionally, `error.type` (the low-cardinality classifier) must follow the resolution order: `err.code` → `err.name` → `"UnknownError"`. Do not set both `exception.type` and `error.type` unless they carry genuinely different values. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/scenarios/error_handling.md`: "Use OTel exception semantic conventions for error attributes." +- `standard/scenarios/error_handling.md`: "`error.type` resolution order: err.code → err.name → 'UnknownError'" + +**Examples:** + +❌ **Bad** +```javascript +logger.error('error in transferHandler: ', { errorType: err.name, stackTrace: err.stack }) +``` + +✅ **Good** +```javascript +// Pass error object directly — contextLogger handles OTel attributes +logger.error('error in transferHandler: ', err) +``` + +--- + +### no-silent-function + +**Description:** Functions in handler, domain, or service directories must have at least one log statement. A function with no logging is invisible to operators — failures, slow paths, and unexpected branches go undetected. The log level should reflect the function's importance: +- **`info`** — significant business events (transfer committed, participant created, settlement completed) +- **`verbose`** — routine operations (handler entry/exit, operation completions, flow tracing) +- **`debug`** — internal details (cache lookups, intermediate state, config resolution) + +Does not apply to: pure utility/helper functions (math, formatting, validation predicates), functions under 3 lines, re-exports, or test files. + +**Severity:** `Warning` + +**Standard Traceability:** +- `standard/best_practices.md`: "Observability requires logging at business logic boundaries." +- `standard/log_levels.md`: level-to-audience mapping (info = operators, verbose/debug = developers) + +**Examples:** + +❌ **Bad** +```javascript +// Handler function with zero logging — completely invisible to operators +const processFulfilMessage = async (message) => { + const transfer = await getTransferById(message.transferId) + const result = await validateAndCommit(transfer, message) + await publishNotification(result) + return result +} +``` + +✅ **Good** +```javascript +const processFulfilMessage = async (message) => { + const { transferId } = message + const log = logger.child({ transferId }) + log.verbose('processFulfilMessage...') + const transfer = await getTransferById(transferId) + const result = await validateAndCommit(transfer, message) + await publishNotification(result) + log.info('transfer fulfilled: ', { transferId, state: result.state }) + return result +} +``` + +❌ **Bad** +```javascript +// Domain function — silently queries and returns +const getParticipantByName = async (name) => { + const participant = await ParticipantModel.getByName(name) + if (!participant) throw new Error(`Participant not found: ${name}`) + return participant +} +``` + +✅ **Good** +```javascript +const getParticipantByName = async (name) => { + logger.verbose('getParticipantByName...') + const participant = await ParticipantModel.getByName(name) + if (!participant) throw new Error(`Participant not found: ${name}`) + logger.verbose('getParticipantByName - done: ', { name, participantId: participant.participantId }) + return participant +} +``` diff --git a/.claude/skills/ml-logging-linter/references/rules-summary.md b/.claude/skills/ml-logging-linter/references/rules-summary.md new file mode 100644 index 0000000..9d465c1 --- /dev/null +++ b/.claude/skills/ml-logging-linter/references/rules-summary.md @@ -0,0 +1,33 @@ +# Mojaloop Logging Rules — Quick Reference + +Use this table when the full RULES.md cannot be read at runtime. +For detailed examples and traceability, see the full file at the path specified in the skill. + +| Rule | Severity | What it catches | +|------|----------|-----------------| +| `generic-log-message` | Warning | Vague messages ("failed", "processing") without inline context | +| `no-manual-trace-id` | Error | Manual traceId/spanId/traceFlags in log objects (OTel handles this) | +| `catch-and-log-bubble` | Warning | Catch-log-rethrow (pure or wrapped) at `error` level — remove log or downgrade to `warn`. Exception: background jobs without global handler | +| `http-semantics` | Warning | HTTP logs missing OTel attributes per direction; duration in seconds; redact url.query; error.type resolution order | +| `losing-error-stack` | Error | Logging `error.message` instead of the full error object | +| `no-console` | Error | `console.log/error/warn` instead of the standard logger | +| `no-error-context` | Error | `logger.error(err)` without a descriptive context message | +| `no-manual-level-check` | Warning | `if (logger.level === 'debug')` instead of built-in level gating | +| `no-stringified-json` | Warning | `JSON.stringify()` inside log calls (library handles serialization) | +| `non-standard-attributes` | Warning | camelCase attributes instead of OTel dot notation (e.g., `errorMessage`, `durationMs`; prefer `exception.message`, `duration.ms`) | +| `semantic-log-levels` | Warning | Level/keyword mismatch (e.g., "failed" at `info`, "retrying" at `error`) | +| `sensitive-data` | Error | Never-log (passwords, tokens, keys) or mask (accounts → `****1234`); redact at transport level | +| `sql-semantics` | Warning | SQL logs missing `db.query.text`/`db.system.name`; duration in seconds; never log `db.query.parameter.*` in prod | +| `unnecessary-debug-guard` | Warning | `isDebugEnabled` guarding simple log statements (only needed for expensive computation) | +| `valid-log-levels` | Error | Non-standard levels like `critical`, `warning` (use `fatal`, `warn`) | +| `deprecated-logger` | Error | Uppercase `Logger` usage instead of `logger` (ContextLogger instance) | +| `constant-log-prefix` | Warning | Log prefix not searchable — flag function calls, mutable vars, high-cardinality IDs; accept literals and `const` variables | +| `no-string-interpolation-context` | Warning | Dynamic values in message string instead of structured attributes | +| `kafka-semantics` | Warning | Kafka logs missing OTel messaging attributes (use Kafka wrappers) | +| `no-silent-catch` | Warning | Catch block neither logs nor rethrows — silently swallowed error | +| `no-loop-logging` | Warning | Logging inside tight loops — batch or log aggregates instead | +| `expected-error-level` | Warning | Expected/recoverable errors (ER_DUP_ENTRY, retries) at `error` instead of `warn` | +| `fspiop-header-handling` | Error | FSPIOP-Signature logged raw instead of hashed | +| `sql-no-raw-values` | Error | SQL query text with interpolated values instead of parameterized placeholders | +| `exception-attributes` | Warning | Manual error attrs must use OTel names; `error.type` resolution: err.code → err.name → "UnknownError" | +| `no-silent-function` | Warning | Handler/domain/service function with zero log statements — invisible to operators | diff --git a/standard/best_practices.md b/standard/best_practices.md index 929725e..bd59f5b 100644 --- a/standard/best_practices.md +++ b/standard/best_practices.md @@ -62,8 +62,8 @@ Use OTel's standard attribute names so logs, traces, and metrics correlate witho | Domain | Attributes | |--------|-----------| | HTTP | `http.request.method`, `url.path`, `http.response.status_code` | -| Database | `db.system`, `db.statement`, `db.operation.name` | -| Errors | `error.type`, `error.message`, `error.stack_trace` | +| Database | `db.system.name`, `db.query.text`, `db.operation.name` | +| Errors | `error.type`, `exception.type`, `exception.message`, `exception.stacktrace` | | Messaging | `messaging.system`, `messaging.destination.name` | When OTel renames or deprecates an attribute, update logging code to match. diff --git a/standard/data_model.md b/standard/data_model.md index f007e5e..c6b1e0f 100644 --- a/standard/data_model.md +++ b/standard/data_model.md @@ -78,9 +78,10 @@ Following OpenTelemetry conventions, logs should include: - `operation`: Function or operation name - `duration.ms`: For completed operations (milliseconds) - `transferId`, `userId`, `accountId`: Business entity IDs -- `error.type`: Error class name (for errors) -- `error.message`: Error message (for errors) -- `error.stack`: Stack trace (for errors) +- `error.type`: Error classification (for errors) — resolution order: `err.code` → `err.name` → `"UnknownError"` +- `exception.type`: Error class name (for errors) +- `exception.message`: Error message (for errors) +- `exception.stacktrace`: Stack trace (for errors) - `eventName`: For business events (e.g., 'TransferCompleted', 'PaymentFailed') ### Distinction: Resource vs Attributes diff --git a/standard/log_levels.md b/standard/log_levels.md index 170c850..1b8a661 100644 --- a/standard/log_levels.md +++ b/standard/log_levels.md @@ -80,9 +80,10 @@ logger.error(`Transfer ${transfer.id} processing failed in processTransfer opera logger.error(`Database connection to ${dbConfig.host}:${dbConfig.port} failed after ${retryCount} retry attempts: ${error.message}`, { operation: 'connectDatabase', eventName: 'DatabaseConnectionFailed', - 'db.host': dbConfig.host, - 'db.port': dbConfig.port, - 'db.name': dbConfig.database, + 'db.system.name': 'mysql', + 'server.address': dbConfig.host, + 'server.port': dbConfig.port, + 'db.namespace': dbConfig.database, retryCount: retryCount, 'exception.type': error.name, 'exception.message': error.message @@ -180,23 +181,35 @@ logger.info(`Service ${serviceName} v${version} started successfully on port ${p ``` **Do NOT use for:** -- High frequency operational noise (use VERBOSE) -- Internal function calls (use DEBUG) +- Routine operational events (use VERBOSE) +- Variable values or payload data (use DEBUG) -### VERBOSE - Operational High-Volume Events +### VERBOSE - Flow Tracing **When to use:** +- Handler entry/exit points ("processTransfer...", "processTransfer - done") +- Operation completions and routine lifecycle events - Health checks and keep-alives (often sampled) -- Minor configuration updates - Periodic background tasks that are routine -- High-frequency API calls that are not "Significant Business Events" **What to include:** +- Operation name and flow direction (entering/exiting) - Minimal context to verify activity - Status of routine check **Examples:** ```javascript +logger.verbose('processTransfer...', { + operation: 'processTransfer', + transferId: transfer.id +}); + +logger.verbose('processTransfer - done', { + operation: 'processTransfer', + transferId: transfer.id, + 'duration.ms': duration +}); + logger.verbose('Health check passed', { operation: 'healthCheck', uptime: process.uptime(), @@ -204,32 +217,30 @@ logger.verbose('Health check passed', { }); ``` -### DEBUG - Detailed Operational Information +### DEBUG - Data Inspection **When to use (in non-production or when debugging):** -- Function entry/exit with parameters -- Intermediate calculation results -- State changes during operation -- Conditional branch taken -- Loop iterations (sparingly) +- Variable values and intermediate calculation results +- Request/response payloads and body content +- Configuration values being applied - Cache hits/misses -- Validation steps +- State dumps for troubleshooting **What to include:** -- Detailed context -- Variable values -- Flow indicators -- Internal state +- Variable values and data content +- Payload details +- Configuration state +- Cache lookup results **Examples:** ```javascript -logger.debug(`Processing transfer ${transfer.id} validation at step ${step}. Account balance is ${balance} and transfer amount is ${amount}`, { - operation: 'validateTransfer', +logger.debug('transfer payload: ', { + operation: 'processTransfer', transferId: transfer.id, - step: 'checkBalance', - 'account.balance': balance, 'transfer.amount': amount, - accountId: account.id + 'transfer.currency': currency, + 'payer.fspId': payerFsp, + 'payee.fspId': payeeFsp }); logger.debug(`Cache miss for key ${cacheKey}. Fetching participant ${participantId} from database`, { @@ -242,7 +253,7 @@ logger.debug(`Cache miss for key ${cacheKey}. Fetching participant ${participant **Do NOT use for:** - Sensitive data (passwords, tokens, full card numbers) -- Excessive logging in tight loops +- Flow tracing without data (use VERBOSE) - Information available elsewhere ### TRACE - Very Detailed Diagnostic Information @@ -287,10 +298,10 @@ Can the application continue with degraded functionality? Is this a significant business event (Transfer Completed)? YES → INFO -Is this a routine operational event (Health Check)? +Is this a flow trace (handler entry/exit, operation completion, health check)? YES → VERBOSE -Is this needed for troubleshooting but not in production? +Is this data inspection (variable values, payloads, config, cache)? YES → DEBUG Is this only needed for deep diagnostic analysis? @@ -309,10 +320,10 @@ The complexity and structure of logs depend on their level. * Should describe *what* happened (business events). * Should *not* describe *how* (internal implementation details). * **VERBOSE**: Optional in Production. - * Captures high-frequency noise like health checks. + * Captures flow tracing: handler entry/exit, operation completions, health checks. * Often sampled or disabled to save storage. * **DEBUG**: Disabled by default in Production. - * Contains detailed state changes, full payload (body content), and logic flow. + * Contains data inspection: variable values, full payloads, config values, cache hits/misses. * Intended for developers debugging non-production environments. * **TRACE**: Disabled by default. * Loop iterations, and variable values.