[WIP] Add Quantile Calculation#11

Closed
JackStouffer wants to merge 2 commits into DlangScience:master from JackStouffer:quantile

@JackStouffer

This is a work in progress PR to add quantile calculations to dstats that is born out of dlang/phobos#3592. This is very simplistic ATM and will go through many revisions.

Still to be implemented:

  • R method 1
  • R method 2
  • R method 3
  • R method 4
  • R method 5
  • R method 6
  • R method 7
  • R method 8
  • R method 9
  • Add support for user defined types for all methods
  • Write better docs

@John-Colvin
Member

Awesome, thanks for doing this

@JackStouffer
Author

I'm having a little trouble parsing the R docs you linked me to. When they say

j = floor(np + m) 

and

g = np + m - j

is np supposed to be n * p, or some other constant? Because np is never defined.

@John-Colvin
Member

I'm not sure whether what I sent you was the most up-to-date version. This should be the right one:
http://www.rdocumentation.org/packages/stats/functions/quantile

It seems that it's all written out in mathematical notation instead of in pseudo-code (or R), so I think np is just n*p.

On 24 Sep 2015 17:53, "Jack Stouffer" notifications@github.com wrote:

I'm having a little trouble parsing the R docs you linked me to, when they
say

j = floor(np + m)

and

g = np + m - j

is np supposed to be n * p, or some other constant? Because np is never
defined.
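
Reading np as n * p, the two formulas and the discontinuous types 1 to 3 from the R documentation could be sketched in Python roughly as follows. The function name `discontinuous_quantile` and the `kind` parameter are made up for illustration, and clamping j to the sample range is an assumption about edge handling. The exact `g == 0` test also skips any floating-point tolerance a real implementation would want, so this is a sketch under those assumptions, not a port of R's code.

```python
import math

def discontinuous_quantile(data, p, kind=1):
    """Hypothetical helper: R-style quantile types 1-3, reading np as n * p."""
    x = sorted(data)
    n = len(x)
    m = -0.5 if kind == 3 else 0.0   # m is 0 for types 1 and 2, -1/2 for type 3
    j = math.floor(n * p + m)        # j = floor(np + m)
    g = n * p + m - j                # g = np + m - j, the fractional part

    # gamma decides whether x[j], x[j+1], or their average is returned
    if kind == 1:
        gamma = 0.0 if g == 0 else 1.0
    elif kind == 2:
        gamma = 0.5 if g == 0 else 1.0
    elif kind == 3:
        gamma = 0.0 if (g == 0 and j % 2 == 0) else 1.0
    else:
        raise ValueError("only types 1-3 are sketched here")

    def order_stat(k):
        # 1-based order statistic, clamped to the sample range (assumption)
        return x[min(max(k, 1), n) - 1]

    return (1 - gamma) * order_stat(j) + gamma * order_stat(j + 1)
```

For the sample 5, 3, 10, 0, 4 and p = 0.5, type 1 picks the third smallest value, while type 3 rounds to the nearest even order statistic and picks the second.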



@JackStouffer
Author

Ok, so I got the first three, which were simple enough. But, I could use some more help with the other methods, as neither the R docs nor the linked scholarly paper define what k is when referring to p[k] and X(k).

I also tried to look at the R source for quantile, but it's practically unreadable.

@John-Colvin
Member

The original paper those methods come from: https://www.amherst.edu/media/view/129116/original/Sample+Quantiles.pdf

@JackStouffer
Author

Yeah, I looked at that too, but they don't define it either. It sort of just shows up in paragraph five with no explanation.

@John-Colvin
Member

To be honest, I'm finding it confusing too...

@John-Colvin
Member

I would recommend testing some simple quantile examples in R; it might give you clues as to how to interpret the docs/paper.

@jmh530

jmh530 commented Oct 1, 2015

x[k] is the k-th order statistic. That means if you have some data, then x[5] is the 5th smallest number. Say you have the data 5, 3, 10, 0, 4. Then x[1] = 0, x[2] = 3, x[3] = 4, and for type 7 (the default) p[1] = 0 / 4, p[2] = 1 / 4, etc., i.e. p[k] = (k - 1) / (n - 1). So the 25% quantile matches up with k=2, which is 3. To get something like the 20% quantile, you interpolate between x[1] and x[2]: 20% * 0 + (1-20%) * 3 = 2.4. To get the 40% quantile, you interpolate between x[2] and x[3]: 40% * 3 + (1-40%) * 4 = 3.6.
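
The numbers in this example can be checked with a small Python sketch of the continuous types 4 to 9, using the m values from the R documentation (type 7 uses m = 1 - p). The name `continuous_quantile` and the `kind` parameter are hypothetical, and clamping to the sample range is an assumption about how the edges are handled.

```python
import math

def continuous_quantile(data, p, kind=7):
    """Hypothetical helper: R-style quantile types 4-9 via linear interpolation."""
    x = sorted(data)
    n = len(x)
    # m values per type, as listed in the R documentation for quantile
    m_by_type = {
        4: 0.0,
        5: 0.5,
        6: p,
        7: 1 - p,
        8: (p + 1) / 3,
        9: p / 4 + 3 / 8,
    }
    m = m_by_type[kind]
    h = n * p + m
    j = math.floor(h)   # index of the lower order statistic (1-based)
    g = h - j           # interpolation weight on the upper order statistic

    def order_stat(k):
        # 1-based order statistic, clamped to the sample range (assumption)
        return x[min(max(k, 1), n) - 1]

    return (1 - g) * order_stat(j) + g * order_stat(j + 1)

data = [5, 3, 10, 0, 4]
for p in (0.2, 0.25, 0.4):
    print(p, continuous_quantile(data, p))
```

In general the weight on the upper value is g, the fractional part of n * p + m, not 1 - p. For this five-element sample the two happen to agree at p = 0.2 and p = 0.4 (g is 0.8 and 0.6), which is why the arithmetic above gives the right answers.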

@JackStouffer
Author

Sorry, I no longer have the time to do this. This is a good start for someone else to pick up, so I hope it eventually gets in.
