Benjamin Wheeler edited this page Sep 3, 2021 · 9 revisions

Awesome Python 3rd-party Libraries

  • Typer - Library for building simple CLIs from function type hints.
    • Click - Dependency of Typer. Also a CLI library.
  • Rich - Pretty-printing to the max: Colors, 1-line progress bars, syntax highlighting, tables, custom traceback formatting, emoji...
  • Textual - TUI (terminal UI) toolkit by the same author as Rich

Common Data Structures

This is a writeup I made about various Python containers and their tradeoffs. A link with more good information is here: https://realpython.com/python-data-structures/

Table of Contents

Built-in

  • Variable

  • Tuple

    • Immutable
    • Slightly less overhead than list
    • Can be used as a dict key
    • Tuple unpacking is good, but can be difficult to refactor (e.g. adding another element to a tuple). Dicts and namedtuples can alleviate this.
  • List

    • Mutable
    • Also works as a queue and a stack (list.pop, list.insert, list.append)
    • Can be sorted in-place
    • Lists have amortized O(1) append and pop() from the end. However, insert(i, value) and pop(0) are O(n).
    • https://stackoverflow.com/questions/47493446/should-i-use-a-python-deque-or-list-as-a-stack
      • List performance is over-optimistic in simple code, much worse in "real code" whenever the data has to be moved via realloc
      • Much less consistent than deque
    • Since lists are array-based (dynamic arrays in CPython), append is amortized O(1), but a single append that triggers a reallocation is O(n)
  • Dict

    • For storing key-value relationships
    • Keys must be hashable; keys and values can be of any type and don't have to share one
    • Some say better than classes if all you have are attributes (see dataclass)
  • Set

    • Mutable
    • Unordered collection
    • Use the 'in' keyword to test membership (elements can't be accessed by index)
    • Union/intersection of sets via | and & respectively
    • Superset, subset comparison
    • Representation of mathematical sets
  • Frozenset

    • Like a set, but immutable (can't use set.add)
    • frozensets are hashable, so they can be used as dictionary keys!
    • See realpython.com article for more info.
  • Classes/objects

    • Represent real-world objects
    • Associated functions
    • Flexibility (custom getters and setters, etc, I guess that's what I meant by this)
    • Hierarchies and inheritance reduce repetition
    • Better than dicts if you need functions
    • dot syntax is so much nicer than bracket syntax
    • slightly more verbose, more overhead
    • OOP can be very weird to learn
      • No encapsulation?? This is blasphemy!!!
      • Property and setter decorators are weird but helpful
    • Fun fact: Instance attributes are normally stored in a dict (you can use object.__dict__['field'] to access data instead of object.field)
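
The notes on the built-in containers above can be tried out with a short sketch like the one below. All names in it (grid, stack, queue, Point, and so on) are made up for illustration, and it only shows behaviour, not timing.

```python
# Tuples are hashable (as long as their elements are), so they work as dict keys.
grid = {(0, 0): "origin", (1, 2): "treasure"}

# A list as a stack: append and pop at the end are cheap.
stack = [1, 2, 3]
stack.append(4)
top = stack.pop()            # removes and returns the last element

# A list as a queue works, but pop(0) has to shift every remaining element.
queue = ["a", "b", "c"]
first = queue.pop(0)

# Dict keys and values can be of mixed types; keys only need to be hashable.
mixed = {"name": "Ada", 42: [1, 2, 3], (1, 2): None}

# Sets: membership test, union, intersection, subset comparison.
evens = {2, 4, 6}
small = {1, 2, 3, 4}
has_four = 4 in evens
union = evens | small
common = evens & small
is_subset = {2, 4} <= evens

# Frozensets are hashable, so they can be dict keys (a plain set cannot).
labels = {frozenset({"a", "b"}): "pair"}


class Point:  # hypothetical class, only for illustration
    def __init__(self, x, y):
        self.x = x
        self.y = y


p = Point(1, 2)
# Instance attributes normally live in the instance's __dict__.
same_value = p.__dict__["x"] == p.x
```

For queue-like use with many items, collections.deque avoids the shifting cost of pop(0).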

Persistent storage

  • File I/O
  • Pickle

Collections

  • Namedtuple

    • Gives tuples more meaning
    • colors = (255, 100, 40) # this doesn't give much information about what it does
    • colors[0] doesn't make sense
    • Dictionaries solve this but are mutable (which may be a problem)
    • DOT SYNTAX FOR ACCESS!!!
    • Makes tuple unpacking less headachy (https://www.tutorialdocs.com/article/9-worst-python-practices.html)
    • ordered
    • Potential drawback: field names must be valid Python identifiers (no keywords, no leading underscore)
    • The typing module has its own NamedTuple, which adds type annotations; dataclasses are a common alternative.
  • Deque

    • O(1) for appendleft() and popleft()
      • Lists need O(n) for the equivalent operations, insert(0, value) and pop(0).
  • Chainmap

  • OrderedDict

    • Remembers the order in which keys were inserted.
    • Can be used as a stack via popitem()
    • Interesting: Before Python 3.6, **kwargs was collected into an unordered dict, so passing pairs to the constructor as keyword arguments lost their order. Since 3.6, keyword argument order is preserved. Passing a list of pairs preserves order in every version.
  • Defaultdict

    • Looking up a missing key in a normal dict with d[key] raises a KeyError.
      • A defaultdict instead calls its default_factory and inserts the result as a new entry.
      • Helpful in reducing code (no need to check whether a key exists first)
  • Counter

    • A dictionary that counts stuff
    • Counter('aaabcccc') # returns Counter({'c': 4, 'a': 3, 'b': 1})
    • Has some helper functions too (most_common, etc)
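
The collections types described above can be sketched as follows. The names (Color, orange, line, groups, and so on) are hypothetical and chosen only to mirror the notes.

```python
from collections import Counter, OrderedDict, defaultdict, deque, namedtuple

# namedtuple: a tuple whose positions have names.
Color = namedtuple("Color", ["red", "green", "blue"])
orange = Color(255, 100, 40)
red, green, blue = orange        # still unpacks like a normal tuple

# deque: cheap appends and pops at both ends.
line = deque(["a", "b", "c"])
line.appendleft("z")
first = line.popleft()

# OrderedDict as a stack: popitem() removes the most recently added pair.
od = OrderedDict([("x", 1), ("y", 2), ("z", 3)])
last = od.popitem()

# defaultdict: a missing key gets a fresh default instead of a KeyError.
groups = defaultdict(list)
for word in ["apple", "avocado", "banana"]:
    groups[word[0]].append(word)

# Counter: counts hashable items.
counts = Counter("aaabcccc")
most_common = counts.most_common(1)
```

Here orange.red reads better than orange[0], while the value still behaves like the plain tuple (255, 100, 40).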

Enumerations

  • Enum (TODO)

Dataclasses (3.7+)

  • Frozen instances are immutable

  • Takes a class and "supercharges" it with a constructor and useful methods

  • Type annotations on the fields let static checkers such as mypy catch type errors before the code runs rather than at runtime

  • asdict converts an instance to a plain dict, so instances of two different classes with the same attributes can be compared by their data (reinforcing the duck typing thing)

  • TypedDict (typing module, 3.8+) brings this duck typing to dicts: class syntax declares the keys and value types a dict is expected to have.

    • A TypedDict instance is a plain dict at runtime; the declared keys and types are checked only by static type checkers such as mypy.
  • TypedDict is nice, but classes do matching and validation of data better.

    • Matching means determining an object's class when there's a union of several classes. Validation means making sure that unknown data structures, like JSON, will map to a class
  • When using dicts, TypedDicts add type safety. Dataclasses make for more resilient code though.

  • Are like namedtuples, but allow for default values and better comparison (compares class, not just data)

  • "Mutable namedtuples with defaults", from the PEP 557 abstract

  • BAD: type hints are required but not enforced at runtime.

    • mypy checks for these issues
  • Don't use mutable defaults for fields: a default is stored as a class attribute, so it would be shared by all instances (dataclass raises ValueError for list, dict, and set defaults)

    • default_factory prevents this
  • With the default eq=True, instances of the same class are compared field by field, as if they were tuples
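
A minimal sketch of the dataclass points above: frozen instances, default_factory, asdict, and class-aware comparison. The class names (Inventory, Point, Vector, Movie, Bad) are hypothetical, and the TypedDict part assumes Python 3.8 or newer.

```python
from dataclasses import FrozenInstanceError, asdict, dataclass, field
from typing import List, TypedDict


@dataclass
class Inventory:
    owner: str
    # default_factory builds a fresh list for every instance.
    items: List[str] = field(default_factory=list)


@dataclass(frozen=True)
class Point:
    x: int
    y: int = 0


@dataclass(frozen=True)
class Vector:
    x: int
    y: int = 0


a = Inventory("ann")
b = Inventory("bob")
a.items.append("sword")          # does not touch b.items

p = Point(1, 2)
v = Vector(1, 2)

# Frozen instances reject attribute assignment.
try:
    p.x = 10
    frozen_error = False
except FrozenInstanceError:
    frozen_error = True

# A bare mutable default is rejected when the class is created.
try:
    @dataclass
    class Bad:
        items: list = []
    mutable_default_rejected = False
except ValueError:
    mutable_default_rejected = True


class Movie(TypedDict):
    title: str
    year: int


# At runtime this is an ordinary dict; the types matter only to a checker.
movie: Movie = {"title": "Alien", "year": 1979}
```

Point(1, 2) and Vector(1, 2) hold the same data but are not equal, because dataclass equality also checks the class; comparing asdict(p) with asdict(v) compares the data alone.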

NumPy

  • Array (TODO)
