diff --git a/hadoop-hdds/docs/content/concept/OzoneManager.md b/hadoop-hdds/docs/content/concept/OzoneManager.md index 0930ec95e380..5cf520ca2195 100644 --- a/hadoop-hdds/docs/content/concept/OzoneManager.md +++ b/hadoop-hdds/docs/content/concept/OzoneManager.md @@ -97,7 +97,7 @@ the data from the data node. For a detailed view of Ozone Manager this section gives a quick overview about the provided network services and the stored persisted data. -**Network services provided by Ozone Manager:** +### Network services provided by Ozone Manager: Ozone provides a network service for the client and for administration commands. The main service calls @@ -115,7 +115,7 @@ Ozone provides a network service for the client and for administration commands. * ServiceList (used for service discovery) * DBUpdates (used by [Recon]({{< ref "feature/Recon.md" >}}) downloads snapshots) -**Persisted state** +### Persisted state The following data is persisted in Ozone Manager side in a specific RocksDB directory: diff --git a/hadoop-hdds/docs/content/concept/Recon.md b/hadoop-hdds/docs/content/concept/Recon.md new file mode 100644 index 000000000000..902c865be8fa --- /dev/null +++ b/hadoop-hdds/docs/content/concept/Recon.md @@ -0,0 +1,163 @@ +--- +title: "Recon" +date: "2020-10-27" +weight: 8 +menu: + main: + parent: Architecture +summary: Recon serves as a management and monitoring console for Ozone. +--- + + +Recon serves as a management and monitoring console for Ozone. It gives a +bird's-eye view of Ozone and helps users troubleshoot any issues by presenting +the current state of the cluster through REST based APIs and rich web UI. + + +## High Level Design + +{{
}} + +
+ +On a high level, Recon collects and aggregates metadata from Ozone Manager (OM), +Storage Container Manager (SCM) and Datanodes (DN) and acts as a central +management and monitoring console. Ozone administrators can use Recon to query +the current state of the system without overloading OM or SCM. + +Recon maintains multiple databases to enable batch processing, faster querying +and to persist aggregate information. It maintains a local copy of OM db and +SCM db along with a SQL database for persisting aggregate information. + +Recon also integrates with Prometheus to provide a HTTP endpoint to query Prometheus +for Ozone metrics and also to display a few crucial point in time metrics in +the web UI. + +## Recon and Ozone Manager + +{{
}} + +
+ +Recon gets a full snapshot of OM rocks db initially from the leader OM's HTTP +endpoint, untars the file and initializes RocksDB for querying locally. The +database is kept in sync by periodically requesting delta updates from the leader +OM via RPC calls from the last applied sequence id. If for any reason, the delta +updates could not be retrieved or applied to the local db, a full snapshot is +requested again to keep the local db in sync with OM db. Due to this, Recon can +show stale information since the local db will not always be in sync. + +The db updates retrieved from OM is then converted into a batch of events for +further processing by OM db tasks via [Recon Task Framework](#task-framework). + + +## Recon and Storage Container Manager + +{{
}} + +
+ +Recon also acts as a passive SCM for datanodes. When Recon is configured in the +cluster, all the datanodes register with Recon and send heartbeats, container +reports, incremental container reports etc. to Recon similar to SCM. Recon uses +all the information it gets from datanodes to construct its own copy of SCM rocks db +locally. Recon never sends any command to datanodes in response and just acts as +a passive SCM for faster lookup of SCM metadata. + +## Task Framework + +Recon has its own Task framework to enable batch processing of data obtained +from OM and SCM. A task can listen to and act upon db events such as `PUT`, `DELETE`, +`UPDATE`, etc. on either OM db or SCM db. Based on this, a task either implements +`org.apache.hadoop.ozone.recon.tasks.ReconOmTask` or extends +`org.apache.hadoop.ozone.recon.scm.ReconScmTask`. + +An example `ReconOmTask` is `ContainerKeyMapperTask` that persists the container -> key +mapping in RocksDB. This is useful to understand which keys were part of the container +when the container is reported missing or is in a bad health state. Another example is +`FileSizeCountTask` which keeps track of count of files within a given file size range in +a SQL database. These tasks have implementations for two scenarios: + + - Full snapshot (reprocess()) + - Delta updates (process()) + +When a full snapshot of OM db is obtained from the leader OM, the reprocess() +is called on all the registered OM tasks. On subsequent delta updates, process() +is called on these OM tasks. + +An example `ReconScmTask` is `ContainerHealthTask` that runs in configurable +intervals to scan the list of all the containers and to persist the state of +unhealthy containers (`MISSING`, `MIS_REPLICATED`, `UNDER_REPLICATED`, `OVER_REPLICATED`) +in a SQL table. This information is used to determine if there are any missing +containers in the cluster. + +## Recon and Prometheus + +Recon can integrate with any Prometheus instance configured to collected metrics +and can display useful information in Recon UI in Datanodes and Pipelines pages. +Recon also exposes a proxy endpoint ([/metrics]({{< ref "interface/ReconApi.md#metrics" >}})) +to query Prometheus. This integration can be enabled by setting this configuration `ozone.recon.prometheus.http.endpoint` +to the Prometheus endpoint like `ozone.recon.prometheus.http.endpoint=localhost:9090`. + +## API Reference + +[Link to complete API Reference]({{< ref "interface/ReconApi.md" >}}) + +## Persisted state + + * A local copy of [OM database]({{< ref "concept/OzoneManager.md#persisted-state" >}}) + * A local copy of [SCM database]({{< ref "concept/StorageContainerManager.md#persisted-state" >}}) + * The following data is persisted in Recon in the specified RocksDB directory: + * ContainerKey table + * Stores the mapping (container, key) -> count + * ContainerKeyCount table + * Stores containerID -> no. of keys count within the container + + * The following data is stored in the configured SQL database (default is Derby): + * GlobalStats table + * A Key -> Value table to store aggregate information like total + number of volumes / buckets / keys present in the cluster + * FileCountBySize table + * Keeps track of the number of files present within a file size range in the cluster + * ReconTaskStatus table + * Keeps track of the status and last run timestamp of the registered OM and SCM + db tasks in the [Recon Task Framework](#task-framework) + * ContainerHistory table + * Stores ContainerReplica -> Datanode mapping with last known timestamp. This + is used to determine the last known datanodes when a container is reported missing + * UnhealthyContainers table + * Keeps track of all the Unhealthy Containers (MISSING, UNDER_REPLICATED, + OVER_REPLICATED, MIS_REPLICATED) in the cluster at any given time + + +## Notable configurations + +key | default |
description
+----|---------|------------ +ozone.recon.http-address | 0.0.0.0:9888 | The address and the base port where the Recon web UI will listen on. +ozone.recon.address | 0.0.0.0:9891 | RPC address of the Recon. +ozone.recon.db.dir | none | Directory where the Recon Server stores its metadata. +ozone.recon.om.db.dir | none | Directory where the Recon Server stores its OM snapshot DB. +ozone.recon.om.snapshot
.task.interval.delay | 10m | Interval in MINUTES by Recon to request OM DB Snapshot / delta updates. +ozone.recon.task
.missingcontainer.interval | 300s | Time interval of the periodic check for Unhealthy Containers in the cluster. +ozone.recon.sql.db.jooq.dialect | DERBY | Please refer to [SQL Dialect](https://www.jooq.org/javadoc/latest/org.jooq/org/jooq/SQLDialect.html) to specify a different dialect. +ozone.recon.sql.db.jdbc.url | jdbc:derby:${ozone.recon.db.dir}
/ozone_recon_derby.db | Recon SQL database jdbc url. +ozone.recon.sql.db.username | none | Recon SQL database username. +ozone.recon.sql.db.password | none | Recon SQL database password. +ozone.recon.sql.db.driver | org.apache.derby.jdbc
.EmbeddedDriver | Recon SQL database jdbc driver. + diff --git a/hadoop-hdds/docs/content/concept/ReconHighLevelDesign.png b/hadoop-hdds/docs/content/concept/ReconHighLevelDesign.png new file mode 100644 index 000000000000..3bd6443d84c2 Binary files /dev/null and b/hadoop-hdds/docs/content/concept/ReconHighLevelDesign.png differ diff --git a/hadoop-hdds/docs/content/concept/ReconOmDesign.png b/hadoop-hdds/docs/content/concept/ReconOmDesign.png new file mode 100644 index 000000000000..20ea6a3360ed Binary files /dev/null and b/hadoop-hdds/docs/content/concept/ReconOmDesign.png differ diff --git a/hadoop-hdds/docs/content/concept/ReconScmDesign.png b/hadoop-hdds/docs/content/concept/ReconScmDesign.png new file mode 100644 index 000000000000..32d07e02d2c4 Binary files /dev/null and b/hadoop-hdds/docs/content/concept/ReconScmDesign.png differ diff --git a/hadoop-hdds/docs/content/concept/StorageContainerManager.md b/hadoop-hdds/docs/content/concept/StorageContainerManager.md index 9636af5ec7cb..8922f89bc5d9 100644 --- a/hadoop-hdds/docs/content/concept/StorageContainerManager.md +++ b/hadoop-hdds/docs/content/concept/StorageContainerManager.md @@ -56,7 +56,7 @@ token infrastructure depends on this certificate infrastructure. For a detailed view of Storage Container Manager this section gives a quick overview about the provided network services and the stored persisted data. -**Network services provided by Storage Container Manager:** +### Network services provided by Storage Container Manager: * Pipelines: List/Delete/Activate/Deactivate * pipelines are set of datanodes to form replication groups @@ -74,8 +74,7 @@ For a detailed view of Storage Container Manager this section gives a quick over Note: client doesn't connect directly to the SCM -**Persisted state** - +### Persisted state The following data is persisted in Storage Container Manager side in a specific RocksDB directory diff --git a/hadoop-hdds/docs/content/feature/Recon.md b/hadoop-hdds/docs/content/feature/Recon.md index 9fa3f8c7cdec..be434a7e517d 100644 --- a/hadoop-hdds/docs/content/feature/Recon.md +++ b/hadoop-hdds/docs/content/feature/Recon.md @@ -1,5 +1,5 @@ --- -title: "Recon" +title: "Recon Server" weight: 7 menu: main: @@ -23,25 +23,19 @@ summary: Recon is the Web UI and analysis service for Ozone limitations under the License. --> -Recon is the Web UI and analytics service for Ozone. It's an optional component, but strongly recommended as it can add additional visibility. +Recon serves as a management and monitoring console for Ozone. +It's an optional component, but it is strongly recommended to add it to the cluster +since Recon can help with troubleshooting the cluster at critical times. +Refer to [Recon Architecture]({{< ref "concept/Recon.md" >}}) for detailed architecture overview and +[Recon API]({{< ref "interface/ReconApi.md" >}}) documentation +for HTTP API reference. -Recon collects all the data from an Ozone cluster and **store** them in a SQL database for further analyses. - - 1. Ozone Manager data is downloaded in the background by an async process. A RocksDB snapshots are created on OM side periodically, and the incremental data is copied to Recon and processed. - 2. Datanodes can send Heartbeats not just to SCM but Recon. Recon can be a read-only listener of the Heartbeats and updates the local database based on the received information. - -Once Recon is configured, we are ready to start the service. +Recon is a service that brings its own HTTP web server and can be started by +the following command. {{< highlight bash >}} ozone --daemon start recon {{< /highlight >}} -## Notable configurations -key | default | description -----|---------|------------ -ozone.recon.http-address | 0.0.0.0:9888 | The address and the base port where the Recon web UI will listen on. -ozone.recon.address | 0.0.0.0:9891 | RPC address of the Recon. -ozone.recon.db.dir | none | Directory where the Recon Server stores its metadata. -ozone.recon.om.db.dir | none | Directory where the Recon Server stores its OM snapshot DB. -ozone.recon.om.snapshot.task.interval.delay | 10m | Interval in MINUTES by Recon to request OM DB Snapshot. + diff --git a/hadoop-hdds/docs/content/interface/ReconApi.md b/hadoop-hdds/docs/content/interface/ReconApi.md new file mode 100644 index 000000000000..dd033f39f0ca --- /dev/null +++ b/hadoop-hdds/docs/content/interface/ReconApi.md @@ -0,0 +1,511 @@ +--- +title: Recon API +weight: 4 +menu: + main: + parent: "Client Interfaces" +summary: Recon server supports HTTP endpoints to help troubleshoot and monitor Ozone cluster. +--- + + + +The Recon API v1 is a set of HTTP endpoints that help you understand the current +state of an Ozone cluster and to troubleshoot if needed. + +### HTTP Endpoints + +#### Containers + +* **/containers** + + **URL Structure** + ``` + GET /api/v1/containers + ``` + + **Parameters** + + * prevKey (optional) + + Only returns the containers with ID greater than the given prevKey. + Example: prevKey=1 + + * limit (optional) + + Only returns the limited number of results. The default limit is 1000. + + **Returns** + + Returns all the ContainerMetadata objects. + + ```json + { + "data": { + "totalCount": 3, + "containers": [ + { + "ContainerID": 1, + "NumberOfKeys": 834 + }, + { + "ContainerID": 2, + "NumberOfKeys": 833 + }, + { + "ContainerID": 3, + "NumberOfKeys": 833 + } + ] + } + } + ``` + +* **/containers/:id/keys** + + **URL Structure** + ``` + GET /api/v1/containers/:id/keys + ``` + + **Parameters** + + * prevKey (optional) + + Only returns the keys that are present after the given prevKey key prefix. + Example: prevKey=/vol1/bucket1/key1 + + * limit (optional) + + Only returns the limited number of results. The default limit is 1000. + + **Returns** + + Returns all the KeyMetadata objects for the given ContainerID. + + ```json + { + "totalCount":7, + "keys": [ + { + "Volume":"vol-1-73141", + "Bucket":"bucket-3-35816", + "Key":"key-0-43637", + "DataSize":1000, + "Versions":[0], + "Blocks": { + "0": [ + { + "containerID":1, + "localID":105232659753992201 + } + ] + }, + "CreationTime":"2020-11-18T18:09:17.722Z", + "ModificationTime":"2020-11-18T18:09:30.405Z" + }, + ... + ] + } + ``` + +* **/containers/missing** + + **URL Structure** + ``` + GET /api/v1/containers/missing + ``` + + **Parameters** + + No parameters. + + **Returns** + + Returns the MissingContainerMetadata objects for all the missing containers. + + ```json + { + "totalCount": 26, + "containers": [{ + "containerID": 1, + "missingSince": 1605731029145, + "keys": 7, + "pipelineID": "88646d32-a1aa-4e1a", + "replicas": [{ + "containerId": 1, + "datanodeHost": "localhost-1", + "firstReportTimestamp": 1605724047057, + "lastReportTimestamp": 1605731201301 + }, + ... + ] + }, + ... + ] + } + ``` + +* **/containers/:id/replicaHistory** + + **URL Structure** + ``` + GET /api/v1/containers/:id/replicaHistory + ``` + + **Parameters** + + No parameters. + + **Returns** + + Returns all the ContainerHistory objects for the given ContainerID. + + ```json + [ + { + "containerId": 1, + "datanodeHost": "localhost-1", + "firstReportTimestamp": 1605724047057, + "lastReportTimestamp": 1605730421294 + }, + ... + ] + ``` + +* **/containers/unhealthy** + + **URL Structure** + ``` + GET /api/v1/containers/unhealthy + ``` + + **Parameters** + + * batchNum (optional) + + The batch number (like "page number") of results to return. + Passing 1, will return records 1 to limit. 2 will return + limit + 1 to 2 * limit, etc. + + * limit (optional) + + Only returns the limited number of results. The default limit is 1000. + + **Returns** + + Returns the UnhealthyContainerMetadata objects for all the unhealthy containers. + + ```json + { + "missingCount": 2, + "underReplicatedCount": 0, + "overReplicatedCount": 0, + "misReplicatedCount": 0, + "containers": [{ + "containerID": 1, + "containerState": "MISSING", + "unhealthySince": 1605731029145, + "expectedReplicaCount": 3, + "actualReplicaCount": 0, + "replicaDeltaCount": 3, + "reason": null, + "keys": 7, + "pipelineID": "88646d32-a1aa-4e1a", + "replicas": [{ + "containerId": 1, + "datanodeHost": "localhost-1", + "firstReportTimestamp": 1605722960125, + "lastReportTimestamp": 1605731230509 + }, + ... + ] + }, + ... + ] + } + ``` + +* **/containers/unhealthy/:state** + + **URL Structure** + ``` + GET /api/v1/containers/unhealthy/:state + ``` + + **Parameters** + + * batchNum (optional) + + The batch number (like "page number") of results to return. + Passing 1, will return records 1 to limit. 2 will return + limit + 1 to 2 * limit, etc. + + * limit (optional) + + Only returns the limited number of results. The default limit is 1000. + + **Returns** + + Returns the UnhealthyContainerMetadata objects for the containers in the given state. + Possible unhealthy container states are `MISSING`, `MIS_REPLICATED`, `UNDER_REPLICATED`, `OVER_REPLICATED`. + The response structure is same as `/containers/unhealthy`. + +#### ClusterState + +* **/clusterState** + + **URL Structure** + ``` + GET /api/v1/clusterState + ``` + + **Parameters** + + No parameters. + + **Returns** + + Returns a summary of the current state of the Ozone cluster. + + ```json + { + "pipelines": 5, + "totalDatanodes": 4, + "healthyDatanodes": 4, + "storageReport": { + "capacity": 1081719668736, + "used": 1309212672, + "remaining": 597361258496 + }, + "containers": 26, + "volumes": 6, + "buckets": 26, + "keys": 25 + } + ``` + +#### Datanodes + +* **/datanodes** + + **URL Structure** + ``` + GET /api/v1/datanodes + ``` + + **Parameters** + + No parameters. + + **Returns** + + Returns all the datanodes in the cluster. + + ```json + { + "totalCount": 4, + "datanodes": [{ + "uuid": "f8f8cb45-3ab2-4123", + "hostname": "localhost-1", + "state": "HEALTHY", + "lastHeartbeat": 1605738400544, + "storageReport": { + "capacity": 270429917184, + "used": 358805504, + "remaining": 119648149504 + }, + "pipelines": [{ + "pipelineID": "b9415b20-b9bd-4225", + "replicationType": "RATIS", + "replicationFactor": 3, + "leaderNode": "localhost-2" + }, { + "pipelineID": "3bf4a9e9-69cc-4d20", + "replicationType": "RATIS", + "replicationFactor": 1, + "leaderNode": "localhost-1" + }], + "containers": 17, + "leaderCount": 1 + }, + ... + ] + } + ``` + +#### Pipelines + +* **/pipelines** + + **URL Structure** + ``` + GET /api/v1/pipelines + ``` + + **Parameters** + + No parameters. + + **Returns** + + Returns all the pipelines in the cluster. + + ```json + { + "totalCount": 5, + "pipelines": [{ + "pipelineId": "b9415b20-b9bd-4225", + "status": "OPEN", + "leaderNode": "localhost-1", + "datanodes": ["localhost-1", "localhost-2", "localhost-3"], + "lastLeaderElection": 0, + "duration": 23166128, + "leaderElections": 0, + "replicationType": "RATIS", + "replicationFactor": 3, + "containers": 0 + }, + ... + ] + } + ``` + +#### Tasks + +* **/task/status** + + **URL Structure** + ``` + GET /api/v1/task/status + ``` + + **Parameters** + + No parameters. + + **Returns** + + Returns the status of all the Recon tasks. + + ```json + [ + { + "taskName": "OmDeltaRequest", + "lastUpdatedTimestamp": 1605724099147, + "lastUpdatedSeqNumber": 186 + }, + ... + ] + ``` + +#### Utilization + +* **/utilization/fileCount** + + **URL Structure** + ``` + GET /api/v1/utilization/fileCount + ``` + + **Parameters** + + * volume (optional) + + Filters the results based on the given volume name. + + * bucket (optional) + + Filters the results based on the given bucket name. + + * fileSize (optional) + + Filters the results based on the given fileSize. + + **Returns** + + Returns the file counts within different file ranges with `fileSize` in the + response object being the upper cap for file size range. + + ```json + [{ + "volume": "vol-2-04168", + "bucket": "bucket-0-11685", + "fileSize": 1024, + "count": 1 + }, { + "volume": "vol-2-04168", + "bucket": "bucket-1-41795", + "fileSize": 1024, + "count": 1 + }, { + "volume": "vol-2-04168", + "bucket": "bucket-2-93377", + "fileSize": 1024, + "count": 1 + }, { + "volume": "vol-2-04168", + "bucket": "bucket-3-50336", + "fileSize": 1024, + "count": 2 + }] + ``` + +#### Metrics + +* **/metrics/:api** + + **URL Structure** + ``` + GET /api/v1/metrics/:api + ``` + + **Parameters** + + Refer to [Prometheus HTTP API Reference](https://prometheus.io/docs/prometheus/latest/querying/api/) + for complete documentation on querying. + + **Returns** + + This is a proxy endpoint for Prometheus and returns the same response as + the prometheus endpoint. + Example: /api/v1/metrics/query?query=ratis_leader_election_electionCount + + ```json + { + "status": "success", + "data": { + "resultType": "vector", + "result": [ + { + "metric": { + "__name__": "ratis_leader_election_electionCount", + "exported_instance": "33a5ac1d-8c65-4c74-a0b8-9314dfcccb42", + "group": "group-03CA9397D54B", + "instance": "ozone_datanode_1:9882", + "job": "ozone" + }, + "value": [ + 1599159384.455, + "5" + ] + } + ] + } + } + ``` + \ No newline at end of file diff --git a/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ReconTaskConfig.java b/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ReconTaskConfig.java index c05143eb3c3b..813baf55071a 100644 --- a/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ReconTaskConfig.java +++ b/hadoop-ozone/recon/src/main/java/org/apache/hadoop/ozone/recon/tasks/ReconTaskConfig.java @@ -53,8 +53,8 @@ public void setPipelineSyncTaskInterval(Duration interval) { defaultValue = "300s", tags = { ConfigTag.RECON, ConfigTag.OZONE }, description = "The time interval of the periodic check for " + - "containers with zero replicas in the cluster as reported by " + - "Datanodes." + "unhealthy containers in the cluster as reported " + + "by Datanodes." ) private long missingContainerTaskInterval = Duration.ofMinutes(5).toMillis();