sum_of_multiples: add tests for logic and input validation#139
sum_of_multiples: add tests for logic and input validation#139mmakaay wants to merge 1 commit intoexercism:masterfrom mmakaay:sum_of_multiples
Conversation
|
Oh, I added some input validation cases to the tests as well, but the example must be upgraded for that too. |
Right. I'm not sure whether the additional test cases really improve the exercises or rather distract. I also find it debatable whether the test cases that you declare invalid really need to be invalid. The lines in your commit message still are way too long BTW. |
|
Way to long as "too verbose" or "longer than 72 characters"? And is this about the subject line or the body text? I have wrapped my lines at 72 chars and looking at the commit message, that seems to have worked. In the message at the top, the body lines have been merged. I wonder in what way test cases would distract. They all describe distinct inputs that should be handled by the code and should result in correct output. A coder that produces correct code right away gets green output. A coder that misses a detail, is notified by a red test. Being a TDD addict might trouble my view on this of course ;-) Maybe feeding the factors in reverse order is not needed, but I do remember that during development I had to adapt my code for handling it, therefore I copied it from my test set. The error cases seem pretty vital to me. I do not think that these outputs are what a user of the produced module would expect: Of course, for the second one you could argue that 3, 6 and 9 are multiples of -3, but the readme states "If we list all the natural numbers below 10 that are multiples of 3 or 5, we get 3, 5, 6 and 9". To me, this communicates the intent that the module is about natural numbers and not about integers. |
|
Here's a good example of what I am mainly afraid of when accepting negative factors (this is a recently submitted solution for sum-of-multiples): The while a1 < (number) check will loop indefinitely, since the code will do -3, -6, -9, etc. never reaching 10. |
|
I gave the changes some thought and applied a few changes.
I squashed the changes into a single commit (thanks to your excellent guide explaining exactly how to do this). |
There was a problem hiding this comment.
I don't recall ever seeing 0 or 1 as factors in a number.
|
They aren't factors when you're talking about extracting prime factors for a number. But this exercise is not working with prime factors, is it? In this case, the factor means "half of a multiplication". The other factor in the multiplication would be n in range(1, ...), as long as n * factor < the provided number. The number 1 could be input for the SumOfMultiples class. If I would do: Then I assume that I should get: 1 not being a prime factor of 10 (or any number) is not related to the question that has to be answered. |
Good point. I was thinking prime factors, but there's no reason why it would have to be. I'm unsure about the zero case. @pminten do you have any thoughts about the mathematical underpinnings of this problem? |
|
@kytrinyx Uh, this isn't really my area of expertise, but I can add some info. When we talk about factoring numbers we usually talk about the principle/theory/whatever that any whole number over 1 can be expressed as the multiplication of primes. Numbers can often also be expressed as multiplication of other-than-prime numbers but the problem with that is that you don't really get a unique solution. For example 12 = 2 * 6 = 3 * 4 = 3 * 2 * 2. But there's no other way to factor 12 into prime numbers since prime numbers can't be divided into whole numbers themselves. However this exercise, judging by the description in x-common prime numbers are not relevant. The question that's asked is merely to implement an algorithm like: sum(i for i in range(1, n) if i % 3 == 0 or i % 5 == 0)Or put imperatively: sum = 0
for i in range(1, n):
if i % 3 == 0 or i % 5 == 0:
sum += i |
|
Btw, the API of this exercise is quite over complicated. It would be much more natural to require a single function instead of an object. Teaching people to use objects for such trivial calculations is I fear teaching them bad design. |
|
If you only consider the slow short version of the code, you are right. I implemented a computational solution, which computes the outcome for any input factors in a few steps. For that implementation I was quite happy with the class. One small learning experience about the class design as used here, would be that it shows that you can configure a class on construction and keep reusing it afterwards. |
sum(i for i in range(1, n) if any(i % j == 0 for j in divisors))I agree that configuring an object on construction and reusing it afterwards is useful, in fact it's the most useful part of having objects. However a design should fit the exercise, if an experienced programmer in a language would not use objects for a particular task the tests shouldn't require objects. |
|
I'll show you what I came up with below, so you can see why a class worked well for me. You are still very right though. A class structure is not the most Pythonic approach for this case. And if a simple function interface were defined for the exercise, I could of course still have added a class for encapsulating the logic when that seemed fit to me. I therefore have no real preference for function or class here. I was just focusing on some unit test additions, to narrow down some ambiguities in the problem field. from itertools import combinations
from fractions import gcd
class SumOfMultiples:
def __init__(self, *factors):
factors = factors or (3, 5)
self._validate_factors(factors)
self._factors = self._filter_redundant_factors(factors)
def _validate_factors(self, factors):
for factor in factors:
if factor <= 0:
raise ValueError(
"Invalid factor: %s (factors must be > 0)" % factor)
def _filter_redundant_factors(self, factors):
"""Deletes duplicate factors and factors that can be divided by a
lower factor. These factors will always produce duplicate multiples
that all must be subtracted from the total. So better not use them
in the first place.
"""
return [
factor for factor in set(factors)
if not any(
other_factor for other_factor in factors
if other_factor < factor and not factor % other_factor
)
]
def to(self, to):
"""Computes the sum of multiples for all configured factors up to the
value of the 'to' parameter. Duplicate multiples are only counted
once in the sum.
"""
return sum(
plus_or_minus * sum_of_multiples
for plus_or_minus, sum_of_multiples
in self._steps(to)
)
def _steps(self, to):
plus_or_minus = +1
sum_of_multiples = self._sum_of_multiples(self._factors, to)
yield (plus_or_minus, sum_of_multiples)
for combi_length in range(2, len(self._factors) + 1):
plus_or_minus = -plus_or_minus
factors = self._lcm_for_factor_combinations(combi_length, to)
sum_of_multiples = self._sum_of_multiples(factors, to)
yield (plus_or_minus, sum_of_multiples)
def _lcm_for_factor_combinations(self, combi_len, to):
return [
factor for factor in [
self._lcm(*factors)
for factors in combinations(self._factors, combi_len)
] if factor < to
]
def _lcm(self, *factors):
if len(factors) == 1:
return factors[0]
elif len(factors) == 2:
return (factors[0] * factors[1]) // gcd(*factors)
else:
lcm = factors[0]
for factor in factors[1::]:
lcm = self._lcm(lcm, factor)
return lcm
def _sum_of_multiples(self, factors, to):
total = 0
for factor in factors:
c = (to - 1) // factor
total += factor * (c * c + c) // 2
return total |
I agree. I have simplified some of the exercises in the Ruby track to not require objects, and intend to keep going in that direction where it makes sense. Overall I think that in most language tracks a lot could be done to improve the design of the exercises / test suites, as the first pass really was more about quantity than quality. |
|
I can do that update as well if you like in this pull branch. About the correct way to handle zero and negative numbers:
|
|
I find that thinking about the exercise as a mathematical function helps clarifying the problem:
I think it's more important to ask whether the space of possible inputs should be narrowed down to some extent although the problem description doesn't explicitely require it. These are some ideas to limit the set of possible inputs for this exercise:
I believe that guaranteeing these properties for the input may allow the users to focus on the core of problem and may motivate them to come up with smart solutions for it. What do you think about these limitations? |
|
I think that given the above observations, "the input numbers are non-negative" works well. The problem description has natural numbers, and there's no universal agreement on whether or not zero is included in that. I also like the constraints given that the input numbers will be uniq and sorted in ascending order. That will remove some pretty arbitrary argument munging that I think is less than interesting. |
|
Explicit is better than implicit. Depending on how performance critical
However what I'd definitely do is add a note to the function documentation
|
|
Exceptions are best for writing tests, especially when creating custom exception classes. Especially since I discovered (through exercises on exercism) that they are not so heavy as in most other languages. So for my code, I prefer raising exception. But am I right that the general idea here is to only provide green input to the function from the exercise tests, to let the coder decide how far he/she wants to go for input validation? The negative factor input is still interesting for cases where the implementation ends in an endless loop (like the one I posted above). But for those, I can still nitpick in the comment thread for code that gets this wrong :) It is not really possible to add a test for it, without fixating the direction of the solution (since documenting how code should behave is documented in a unit test). |
Yeah, this is one place where the discussions are invaluable. It's often during conversation that I have insights. If a test just "made me" do something, I often don't think very hard about it. |
|
Perfect, then I think I've got a good feel for what you want to archieve with the tests now. I will update the changes to this solution and push it into this pull (that sounds weird :). I just finished the Zebra puzzle in the Python tree, so I've got some time to spend now for making improvements. |
Extend the test set with the following test cases:
that I did and which passed the already existing tests. It captures
a possible thinking error when implementing this type of solution.