@rodricios
Contributor

Added WrongVariableTypeError, MultipleMostCommonValuesError.

WrongVariableTypeError addresses the fact that one cannot compute the 'expectation' (mean) of a distribution over a categorical (nominal) random variable. A distribution of words, for example, is a distribution over a categorical variable.

In other words, it makes no sense to find the average word.
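A minimal sketch of the check described above, assuming a distribution is a plain dict mapping outcomes to probabilities. The function name `expectation` and the dict layout are hypothetical and not taken from this pull request, and a built-in TypeError stands in for WrongVariableTypeError.

```python
from numbers import Real


def expectation(dist):
    """Return the mean of a discrete distribution given as {outcome: probability}.

    Hypothetical helper: the name and the dict layout are assumptions.
    """
    for outcome in dist:
        # Categorical outcomes (e.g. words) have no meaningful average.
        if not isinstance(outcome, Real):
            raise TypeError(
                "cannot take the expectation of categorical outcome %r" % (outcome,)
            )
    return sum(outcome * prob for outcome, prob in dist.items())


die = {1: 0.5, 2: 0.5}          # numeric outcomes: the mean is defined
words = {"the": 0.6, "cat": 0.4}  # categorical outcomes: the mean is not

print(expectation(die))  # 1.5

try:
    expectation(words)
except TypeError as err:
    print("rejected:", err)
```

Under these assumptions the numeric distribution yields a mean, while the word distribution is rejected before any arithmetic is attempted.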

From Foundations of Statistical Natural Language Processing - Manning and Schutze:

The expectation is the mean or average of a random variable...

Where they define a random variable as being a...

... function X: Ω → ℝⁿ (commonly with n = 1), where ℝ is the set of real numbers

The above quotes are taken from section 2.1.4 on Random Variables.

Unfortunately, the motivation behind MultipleMostCommonValuesError is not based off textbook definitions. Instead, it is based off the fact that we named our function best_pair in the singular.
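To make the singular-name argument concrete, here is a rough sketch of a tie check. Only the name `best_pair` comes from this pull request; its signature, the use of `collections.Counter`, and the ValueError standing in for MultipleMostCommonValuesError are all assumptions.

```python
from collections import Counter


def best_pair(counts):
    """Return the single most common (value, count) pair.

    Hypothetical sketch: `counts` is assumed to be a collections.Counter,
    and ValueError stands in for MultipleMostCommonValuesError.
    """
    top_two = counts.most_common(2)
    if not top_two:
        raise ValueError("no values to choose from")
    # A tie for first place means there is no single "best" pair.
    if len(top_two) == 2 and top_two[0][1] == top_two[1][1]:
        raise ValueError("more than one value shares the highest count")
    return top_two[0]


print(best_pair(Counter("aab")))  # ('a', 2)

try:
    best_pair(Counter("ab"))  # 'a' and 'b' both occur once
except ValueError as err:
    print("rejected:", err)
```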

Oh, and test objects were simplified a bit.

@eugene-eeo
Contributor

WrongVariableTypeError should be a subclass of TypeError. In fact, from a philosophical point of view, TypeError is a child of ValueError. With that being said, it is not necessary to have our own "wrapper" around TypeError, unless you want to provide more semantic information in the form of class names. However, for most purposes, a simple TypeError is enough.
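For illustration, a sketch of what subclassing TypeError would look like. Note that in Python's built-in hierarchy TypeError and ValueError both derive directly from Exception and neither is a subclass of the other, so a TypeError subclass is caught by `except TypeError` but not by `except ValueError`. The docstring and message below are assumptions, not code from this pull request.

```python
class WrongVariableTypeError(TypeError):
    """Raised when an operation needs a numeric random variable (sketch only)."""


def caught_as_type_error():
    # Callers that only know about TypeError still catch the subclass.
    try:
        raise WrongVariableTypeError("expected numeric outcomes, got words")
    except TypeError as err:
        return type(err).__name__, str(err)


print(caught_as_type_error())
print(issubclass(TypeError, ValueError))  # False
```

The trade-off is the one discussed in this thread: the subclass adds a more descriptive name, at the cost of one more public class in the library.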

@rodricios
Contributor Author

Would the statistics info I provided count as a good motivating factor for subclassing TypeError?

I personally see it as similar to how stats.py subclasses ValueError on line 17:

class StatisticsError(ValueError):
    pass

@eugene-eeo
Contributor

But the name of WrongVariableTypeError already implies that it should be an error regarding the type of the values, which is more specific than the value itself. But then again, having your own "wrapper exceptions" is more of an issue of how 'heavy' you want the library to be.

@rodricios
Contributor Author

Ah, I think I get what you're suggesting. So drop the extra subclass, but override the exception message?

@eugene-eeo
Contributor

Yup. See this SO question. Basically,

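try:
    num()  # placeholder for the call that may raise TypeError
except TypeError as err:
    # Assigning err.message does not change what str(err) shows (and the
    # attribute is gone in Python 3); replace err.args instead.
    err.args = ('custom_message',)
    raise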
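An alternative sketch, assuming Python 3: raise a fresh TypeError carrying the custom message and chain the original exception with `raise ... from`, so the traceback still shows the root cause. Both `num` and `call_with_custom_message` are hypothetical stand-ins and not part of this pull request.

```python
def num():
    # Stand-in for the failing call in the snippet above (hypothetical).
    return "word" + 1  # str + int raises TypeError


def call_with_custom_message():
    try:
        return num()
    except TypeError as err:
        # Chain the original error so it is kept as __cause__.
        raise TypeError("custom_message") from err


try:
    call_with_custom_message()
except TypeError as err:
    print(err)  # custom_message
    print(type(err.__cause__).__name__)  # TypeError
```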

… the elements of the referencing class; used for checking if dist is 'discrete random variable'