Skip to content

Conversation

@smengcl
Copy link
Contributor

@smengcl smengcl commented Jan 4, 2024

What changes were proposed in this pull request?

Add Ozone Admin CLI listopenfiles as a debug command for admins and devs. e.g.

# First, auth as Ozone admin
$ ozone admin om lof --service-id=om-service-test1 --length=3 --prefix=/volumelof/buck1
5 total open files (est.). Showing 3 open files (limit 3) under path prefix:
  /volume-lof/buck1

Client ID		Creation time	Hsync'ed	Open File Path
111726338148007937	1704808626523	No		/volume-lof/buck1/-9223372036854774527/key0
111726338151415810	1704808626578	No		/volume-lof/buck1/-9223372036854774527/key1
111726338152071171	1704808626588	No		/volume-lof/buck1/-9223372036854774527/key2

To get the next batch of open keys, run:
  ozone admin om lof -id=om-service-test1 --length=3 --prefix=/volume-lof/buck1 --start=/-9223372036854775552/-9223372036854775040/-9223372036854774527/key2/111726338152071171

JSON output:

$ ozone admin om lof --service-id=om-service-test1 --length=3 --prefix=/volumelof/buck1 --json
{
  "openKeys" : [ {
    "keyInfo" : {
      "metadata" : { },
      "objectID" : -9223372036854774015,
      "updateID" : 7,
      "parentObjectID" : -9223372036854774527,
      "volumeName" : "volume-lof",
      "bucketName" : "buck1",
      "keyName" : "key0",
      "dataSize" : 4194304,
      "keyLocationVersions" : [ ... ],
      "creationTime" : 1704808722487,
      "modificationTime" : 1704808722487,
      "replicationConfig" : {
        "replicationFactor" : "THREE",
        "requiredNodes" : 3,
        "replicationType" : "RATIS"
      },
      "fileName" : "key0",
      "acls" : [ ... ],
      "path" : "-9223372036854774527/key0",
      "file" : true,
      "replicatedSize" : 12582912,
      "objectInfo" : "OMKeyInfo{volume='volume-lof', bucket='buck1', key='key0', dataSize='4194304', creationTime='1704808722487', objectID='-9223372036854774015', parentID='-9223372036854774527', replication='RATIS/THREE', fileChecksum='null}",
      "hsync" : false,
      "latestVersionLocations" : { ... },
      "updateIDset" : true
    },
    "openVersion" : 0,
    "clientId" : 111726344437039105
  }, {
    "keyInfo" : { ... },
    "openVersion" : 0,
    "clientId" : 111726344440578050
  }, {
    "keyInfo" : { ... },
    "openVersion" : 0,
    "clientId" : 111726344441233411
  } ],
  "totalOpenKeyCount" : 5,
  "hasMore" : true,
  "contToken" : "/-9223372036854775552/-9223372036854775040/-9223372036854774527/key2/111726344441233411"
}
  • Optional filter by bucket or key prefix
  • Pagination support
  • JSON output
  • Documentation

What is the link to the Apache JIRA

https://issues.apache.org/jira/browse/HDDS-8830

How was this patch tested?

  • Integration test addition.
  • Unit test addition.

@smengcl smengcl added enhancement New feature or request tools Tools that helps with debugging labels Jan 4, 2024
@smengcl smengcl requested a review from jojochuang January 4, 2024 15:48
@smengcl smengcl self-assigned this Jan 4, 2024
Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this would list both visible and invisible (keys that are written but no hsync() is called at it) keys. If that's the intend it would be good to distinguish them in the output.

@smengcl
Copy link
Contributor Author

smengcl commented Jan 4, 2024

Looks like this would list both visible and invisible (keys that are written but no hsync() is called at it) keys. If that's the intend it would be good to distinguish them in the output.

That makes sense. Thanks for taking a look at the draft.

Client already has the full KeyInfo. The filtering is done on the client side (by checking HSYNC_CLIENT_ID metadata tag). Added in the latest revision.

Uh-oh now I remembered that OpenKeyTable entries are not updated during hsync. That means in order to determine whether a key has been hsync'ed for listOpenFiles we currently HAVE TO walk KeyTable as well for each key in OpenKeyTable, which could be expensive (unless we update OpenKeyTable as well or add new auxiliary structures). cc @jojochuang

As of commit 094af14 I have added the KeyTable seek logic to correctly determine hsync'ed keys. But one optimization I would do is to update OpenKeyTable as well, which only need to done once per key upon the first hsync (to append the HSYNC_CLIENT_ID metadata). That would make listOpenFiles a lot more efficient. That can be done later. (I also recall mentioning this change offline for some other reason back when we were fixing the hsync bug.)

ozone admin om listopenfiles -id om-service-id -p /volumelof/buck1
1 global open files (estimated). Showing 1 open files (limit 100) under path prefix:
  /volumelof/buck1

Client ID		Hsync'ed	Path
111700912364912641	Yes		/volumelof/buck1/-9223372036854774527/key1

Reached the end of the list.

@jojochuang
Copy link
Contributor

You can check out OmMetadataManagerImpl.getExpiredOpenKeys()

final boolean isHsync = java.util.Optional.ofNullable(info)

to see how we tell a key has hsync or not. You shouldn't need to walk through the keyTable.

@errose28
Copy link
Contributor

errose28 commented Jan 5, 2024

Just a few comments on the CLI portion of this change:

  • I think we should follow standard kebab-case conventions for commands going forwards.
  • We should avoid combining multiple words in one command since it makes it less flexible for future use. Instead of ozone admin om listopenfiles we could do something like ozone admin om open-files ls in case we need to provide more open file operations in the future.
  • I'm not sure whether we want to use open-files or open-keys as the term here. Even though this is being implemented for hsync which is a filesystem operation, the command is under ozone admin and therefore outside of ofs. In this area we usually refer to objects as keys like ozone admin key list.
  • Related to the above, can you clarify the scope of this command? Is it operating on only open files, open keys as well? Does it work for all bucket layouts? Do you pass it a bucket name to limit the scope?
  • Can you provide an example of expected CLI parameters and output in the PR description? This makes it easier for others to quickly give feedback on the UI portion without reviewing implementation details.
  • Since the output will likely be verbose, I think it makes sense to only have this command print json, similar to ozone sh key list. In the current draft there is a --json flag that defaults to false.

@smengcl
Copy link
Contributor Author

smengcl commented Jan 6, 2024

Thanks @errose28 for the comment.

  • I think we should follow standard kebab-case conventions for commands going forwards.
  • We should avoid combining multiple words in one command since it makes it less flexible for future use. Instead of ozone admin om listopenfiles we could do something like ozone admin om open-files ls in case we need to provide more open file operations in the future.

+1 this. But in this way we'd also need to rename all other commands (e.g. getserviceroles, cancelprepare). I'd like @jojochuang and @adoroszlai 's opinions as well.

  • I'm not sure whether we want to use open-files or open-keys as the term here. Even though this is being implemented for hsync which is a filesystem operation, the command is under ozone admin and therefore outside of ofs. In this area we usually refer to objects as keys like ozone admin key list.

This is basically a command that enumerates OpenKeyTable (or OpenFileTable). It is not just limited to hsync'ed keys but all open keys. I should remove [hsync] from the jira title.

I do like the idea of ozone admin key list --open.

  • Related to the above, can you clarify the scope of this command? Is it operating on only open files, open keys as well? Does it work for all bucket layouts? Do you pass it a bucket name to limit the scope?

Yup it is supposed to work with FSO/OBS/LEGACY. Path to a bucket (or just a key prefix) can be optionally passed to limit the scope. But without it (or when passing root as the path) it just iterates over two OpenKey(File)Tables.

  • Can you provide an example of expected CLI parameters and output in the PR description? This makes it easier for others to quickly give feedback on the UI portion without reviewing implementation details.

I planned to. But it is not finalized yet so I will do that later. Plus I will add it to the markdown doc as well.

  • Since the output will likely be verbose, I think it makes sense to only have this command print json, similar to ozone sh key list. In the current draft there is a --json flag that defaults to false.

I think we need both.

@smengcl smengcl changed the title HDDS-8830. [hsync] Add CLI to list open files HDDS-8830. Add admin CLI to list open files Jan 6, 2024
@smengcl
Copy link
Contributor Author

smengcl commented Jan 6, 2024

You can check out OmMetadataManagerImpl.getExpiredOpenKeys()

final boolean isHsync = java.util.Optional.ofNullable(info)

to see how we tell a key has hsync or not. You shouldn't need to walk through the keyTable.

As I can see it is using the same approach as mine to check whether a key is hsync'ed or not. AFAIK the metadata tag is the only way to do it at the moment. Plus getExpiredOpenKeys looks to be iterating over the entire KeyTable/FileTable.

Again, the problem is that OpenKeyTable doen't have this metadata tag at all. If we can add that it will be much more efficient (both for getExpiredOpenKeys and this PR).

@smengcl
Copy link
Contributor Author

smengcl commented Jan 6, 2024

Filed HDDS-10077 for hsync metadata tag improvement.

@smengcl smengcl marked this pull request as ready for review January 9, 2024 14:02
@smengcl smengcl changed the base branch from master to HDDS-7593 January 10, 2024 02:46
@smengcl smengcl changed the base branch from HDDS-7593 to master January 10, 2024 02:47
@smengcl
Copy link
Contributor Author

smengcl commented Jan 10, 2024

Looks like branch HDDS-7593 needs a master branch merge before I can cleanly base this PR on to that one.

Copy link
Contributor

@adoroszlai adoroszlai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @smengcl for the patch.

@errose28
Copy link
Contributor

Right now this is returning object IDs in the path of the keys, which were previously only internal to OM. What is the expected use of this CLI? If it is to do some sort of file close/reclamation based on key name the current paths returned will not help. Alternatively, providing the actual full paths may be problematic because keys can remain open after their parent directories or buckets are deleted, and only the final commit will fail.

@smengcl
Copy link
Contributor Author

smengcl commented Jan 11, 2024

Right now this is returning object IDs in the path of the keys, which were previously only internal to OM. What is the expected use of this CLI? If it is to do some sort of file close/reclamation based on key name the current paths returned will not help. Alternatively, providing the actual full paths may be problematic because keys can remain open after their parent directories or buckets are deleted, and only the final commit will fail.

At the beginning I figure this command is just supposed to return what is there in the OpenKey(File)Table. Then it gets complicated.

I think just like hdfs's, this command is mostly for admins to get a rough idea of how many files are open, and which ones. Thus human-readable output is important. JSON output might be used for scripting. The OmKeyInfo is there tho I think most of the fields may not be very useful, but its what OpenKey(File)Table has.

We could also return the dbKey of OpenKey(File)Table if that becomes useful. @jojochuang

@smengcl smengcl changed the base branch from master to HDDS-7593 January 17, 2024 19:05
Copy link
Contributor

@jojochuang jojochuang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Merging this one so we can begin testing in actual environments.

@jojochuang jojochuang merged commit 082d759 into apache:HDDS-7593 Jan 19, 2024
@smengcl
Copy link
Contributor Author

smengcl commented Jan 19, 2024

Thanks @jojochuang for merging this to the HDDS-7593 feature branch. And thanks @adoroszlai and @errose28 for the reviews and comments.

Will post a PR for HDDS-10077 that depends on this one.

@jojochuang jojochuang added the hbase HBase on Ozone support label Jan 23, 2024
chungen0126 pushed a commit to chungen0126/ozone that referenced this pull request May 3, 2024
chungen0126 pushed a commit to chungen0126/ozone that referenced this pull request May 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

enhancement New feature or request hbase HBase on Ozone support tools Tools that helps with debugging

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants