-
Notifications
You must be signed in to change notification settings - Fork 4k
ARROW-13832: [Doc] Improve compute documentation #11830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
pitrou
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice improvement! Here are some comments.
docs/source/conf.py
Outdated
| # This will also rebuild appropriately when the value changes. | ||
| app.add_config_value('cuda_enabled', cuda_enabled, 'env') | ||
| app.add_config_value('flight_enabled', flight_enabled, 'env') | ||
| app.add_directive('computefuncs', ComputeFunctionsTableDirective) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For disambiguation and clarity, can we prefix our own directives with "arrow-"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
docs/source/python/compute.rst
Outdated
| keys: [["a","b","c"]] | ||
|
|
||
| The ``"sum"`` aggregation passed to the ``aggregate`` method in the previous | ||
| example is the :func:`hash_sum` compute function. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is "hash_sum" actually exposed in the docs? Otherwise, I suppose the :func: tag will not link to anything?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It currently is, but there is #11803 which removes it, so remove the :func: from here
jorisvandenbossche
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Indeed a nice improvement!
| :toctree: ../generated/ | ||
|
|
||
| ArraySortOptions | ||
| AssumeTimezoneOptions |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure it has much value to list them here, as long as they have no docstring.. (this will create a lot new doc pages, which will basically be empty)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I list them here because it provides the signature and thus which arguments they support. As you said there won't be any docstring but given that in many cases you can guess what the arguments do from their name it's better than nothing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it's true that the signatures show something. It only looks kind of "bad" to have an empty docstring page ..
Actually, what happens if you leave out the :toctree: ../generated/ (to only have the table) ? Although that will make just sphinx complain about nonexisting references ..
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing it also removes the reference page where you can see the signature
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removing it also removes the reference page where you can see the signature
The table doesn't show the signature?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope
| You can use them with or without the ``"hash_"`` prefix. | ||
|
|
||
| .. arrow-computefuncs:: | ||
| :kind: hash_aggregate |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice!
| result = ViewList() | ||
| function_kind = self.options.get('kind', None) | ||
|
|
||
| result.append(".. csv-table::", "<computefuncs>") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of curiosity, what's the "" for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure I got the question, which "" are you referring to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, annoying github rendering that hides things between angle brackets :) I meant "<computefuncs>"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In practice is the name of the source file where the rst code was written. In this case I use <computefuncs> so that if there is a syntax error in the generated rst code it will tell you "line blahblah in <computefuncs>" and we know it's in this directive. I mimic a bit python style which uses things like File "<stdin>", line 1, in <module>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see!
Co-authored-by: Joris Van den Bossche <jorisvandenbossche@gmail.com>
|
Benchmark runs are scheduled for baseline = 3f6773a and contender = 5b805d7. 5b805d7 is a master commit associated with this PR. Results will be available as each benchmark for each run completes. |
Document a bit better the compute functions and add a section about grouped aggregations. Also list the available aggregation functions automatically.