-
Notifications
You must be signed in to change notification settings - Fork 4
Open
Labels
Description
The reason is simple. When counting the occurrence of some event, where the event can be a collection of things (think, "what's the probability of "a" and "b", or "b" and "a" occurring?), we want to be able to record either ("a", "b") and ("b", "a") as having occurred once.
So instead of:
{
("a", "b"): 1,
("b", "a"): 1
}We have:
{
("a", "b"): 2
}Here's one possible implementation:
class TransformedDict(StatsCounter):
"""A dictionary that applies an arbitrary key-altering
function before accessing the keys
http://stackoverflow.com/questions/3387691/python-how-to-perfectly-override-a-dict
http://stackoverflow.com/questions/2060972/subclassing-python-dictionary-to-override-setitem
"""
def __init__(self, *args, **kwargs):
#super().__init__(*args, **kwargs)
self.update(*args, **kwargs)
def __getitem__(self, key):
return super(TransformedDict, self).__getitem__(self.__keytransform__(key))
def __setitem__(self, key, value):
super(TransformedDict, self).__setitem__(self.__keytransform__(key), value)
def __delitem__(self, key):
super(TransformedDict, self).__delitem__(self.__keytransform__(key))
def __keytransform__(self, key):
return key
def update(self, iterable=None, **kwds):
'''
Overiding update (taken from http://code.activestate.com/recipes/576611/)
One issue is that we sort the 4 times, when it should only be sorted twice
See "BUG HERE" below
'''
if iterable is not None:
if hasattr(iterable, 'iteritems'):
if self:
self_get = self.get
for elem, count in iterable.iteritems():
self[elem] = self_get(elem, 0) + count
else:
dict.update(self, iterable) # fast path when counter is empty
else:
self_get = self.get
for elem in iterable:
# BUG HERE
# Sort once
key = self.__keytransform__(elem)
# self_get causes sort again
self[key] = self_get(key, 0) + 1
if kwds:
self.update(kwds)
class OKDict(TransformedDict):
"""Ordered Key Dict"""
def __keytransform__(self, key):
if isinstance(key, tuple):
print(type(key), key)
return tuple(sorted(key))
else:
return key
>>> OKDict([('a','b'), ('b', 'a')])
OKDict({('a', 'b'): 2})