Skip to content

Overriding dict to sort tuples (when using tuples as keys) #5

@rodricios

Description

@rodricios

The reason is simple. When counting the occurrence of some event, where the event can be a collection of things (think, "what's the probability of "a" and "b", or "b" and "a" occurring?), we want to be able to record either ("a", "b") and ("b", "a") as having occurred once.

So instead of:

{ 
    ("a", "b"): 1, 
    ("b", "a"): 1
}

We have:

{
    ("a", "b"): 2
}

Here's one possible implementation:

class TransformedDict(StatsCounter):
    """A dictionary that applies an arbitrary key-altering
       function before accessing the keys
       http://stackoverflow.com/questions/3387691/python-how-to-perfectly-override-a-dict
       http://stackoverflow.com/questions/2060972/subclassing-python-dictionary-to-override-setitem
       """

    def __init__(self, *args, **kwargs):
        #super().__init__(*args, **kwargs)
        self.update(*args, **kwargs)

    def __getitem__(self, key):
        return super(TransformedDict, self).__getitem__(self.__keytransform__(key))

    def __setitem__(self, key, value):
        super(TransformedDict, self).__setitem__(self.__keytransform__(key), value)

    def __delitem__(self, key):
        super(TransformedDict, self).__delitem__(self.__keytransform__(key))

    def __keytransform__(self, key):
        return key

    def update(self, iterable=None, **kwds):
        '''
        Overiding update (taken from http://code.activestate.com/recipes/576611/)
        One issue is that we sort the 4 times, when it should only be sorted twice
        See "BUG HERE" below
        '''        
        if iterable is not None:
            if hasattr(iterable, 'iteritems'):
                if self:
                    self_get = self.get
                    for elem, count in iterable.iteritems():
                        self[elem] = self_get(elem, 0) + count
                else:
                    dict.update(self, iterable) # fast path when counter is empty
            else:
                self_get = self.get
                for elem in iterable:
                    # BUG HERE
                    # Sort once 
                    key = self.__keytransform__(elem)
                    # self_get causes sort again 
                    self[key] = self_get(key, 0) + 1
        if kwds:
            self.update(kwds)


class OKDict(TransformedDict):
    """Ordered Key Dict"""
    def __keytransform__(self, key):

        if isinstance(key, tuple):
            print(type(key), key)
            return tuple(sorted(key))
        else:
            return key

>>> OKDict([('a','b'), ('b', 'a')])
OKDict({('a', 'b'): 2})

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions