Skip to content
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
165 changes: 165 additions & 0 deletions _posts/2018-04-27-day-25.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,165 @@
---
categories: [organizing projects, style]
---

{% include toc %}

## For Next Time
* Work on next round of website edits incorporating feedback
* In-class work on creating video/animated content for website and presentation

-----
On Day 22 we talked about "good code" from a quantitative perspective, including correctness and performance.
Today, we'll discuss the more qualitative but equally important aspects of "good code": organization, documentation, and style.

## Organizing large projects

As projects grow in scope and complexity, organization becomes more important.
Fortunately, Python makes it fairly easy to add structure as you go.

#### Interactive

In Python it's quick and easy to get started and test your ideas interactively in the shell. Unfortunately these explorations are ephemeral, and it's difficult to make changes and run the program again, so we quickly move beyond this stage.

#### Script

Your first Python script may be simply code from an interactive session pasted into a ```.py``` file so that you can edit and re-run it (as well as share it with team mates).

#### Program

We consider a Python program to be a more evolved version of a script, with defined functions, classes, documentation, etc.

#### Modules

As your program gets bigger, grows beyond what can comfortably fit in one file.
To create a module in Python, you simply write a ```.py``` file as always.
Code in the module can then be imported from other modules using the filename (minus ```.py```).

#### Packages

For really big projects, you may even grow beyond a single directory. Python allows you to organize your modules into directories and refer to them as packages.
All you need to do is place a (possibly empty) file called ```__init__.py``` in the folder along with your Python module files.

Read about [Python modules](https://docs.python.org/3/tutorial/modules.html) for all the details.


We've created an [example repository](https://github.com/sd18spring/python-modules) showing the evolution of one project through these steps. Browse through the git history to see the changes over time.

**Exercise:**

Reorganize your project code into modules that group related code.

As a caveat, this type of major refactoring can create lots of merge conflicts if you are not careful.
It is best to make sure everything is checked in first,
do the exercise together as a team,
and have test cases before and after the restructuring to make sure nothing breaks.

## Style and Lint

As you move to more polished code that other people will read and contribute to, style and consistency count.
For a poetic take on "Pythonic" code, try typing ```import this``` in Python.

The most definitive Python style guide is [PEP8](https://www.python.org/dev/peps/pep-0008/) (Python Enhancement Proposals are the mechanisms by which the language grows). It has a list of guidelines along with concrete examples - definitely worth a read.

Checking style rules is a good job for a computer, and one tool you can use to do this for you is [pylint](https://www.pylint.org/). "Linting" is the process of checking code without actually running it (think about picking pieces of lint off your sweater).
You can even set this up to run continuously in your text editor as you write code.

Linting can even help you catch bugs. Check out this output for "Bad Kangaroo" from Think Python chapter 17.

```python
...
C: 1, 0: Missing module docstring (missing-docstring)
W: 4, 4: Dangerous default value [] as argument (dangerous-default-value)
C: 12, 8: Variable name "t" doesn't conform to snake_case naming style (invalid-name)
...
```

Using mutable types as a default argument? Dangerous indeed - watch out!

**Exercise:**

Run ```pylint``` on your project code and analyze the results.
You don't need to fix everything, but it's best to know what rules you're breaking and why.


## Documentation

In case we haven't said it enough, proper documentation is vital to your final project.
As a general guideline, you should have:
- Docstring/header comment at the top of each module (file) explaining what it does
- Docstring for each class, method, and function
- Comments for sections of code that are complex enough to need explanation

One cool feature of docstrings is that Python can use them to automatically generate documentation.
The tool that does this is called
[pydoc](https://docs.python.org/3/library/pydoc.html),
and it is what is used when you call ```help()``` in Python.

**Exercise:**

Autogenerate documentation for your project using pydoc:

```bash
pydoc path/to/my_project.py
```

This will open a help file based on your docstrings (use q to quit).
Make sure this help file would be adequate for someone using your code - try showing it to someone not on your team. Add to your documentation based on this feedback.

pydoc can also generate HTML documentation, which you can add to your project website if you like (not required).
If you want to generate truly beautiful online documentation, check out
[Sphinx](http://www.sphinx-doc.org/)
(the tool used to generate the Python documentation).


## Capturing dependencies

Your project may require 3rd party Python packages that are not part of the [standard library](https://docs.python.org/3/library/).
In your README file (e.g. "Getting Started" or Install section), you must at minimum list these extra packages necessary to run your code with links and versions.

You should know what modules are needed since you wrote an import statement for each, but in case you forget or you're looking at someone else's code: command line tools to the rescue! In your project directory, type:

```bash
grep -hr "^import" . | sort | uniq
```

This means 1) search recursively in the current directory for lines that start with "import" (without printing which file you found them in),
2) sort those results alphabetically, and
3) don't show me duplicates.

(Technically ```"^\(from .\+ \)\?import"``` would be a better search expression, since it also catches ```from foo import bar``` style imports.)


In order to make things easier for your users, in addition to just listing the required packages you can give them a listing that they can automatically install using their package manager.
To generate this list, you can run:

```bash
pip freeze > requirements.txt
```

or

```bash
conda list --export > requirements.txt
```

depending on which package manager you are using. This will create a file called ```requirements.txt``` that you can include in your project repository.
Users can then install the needed dependencies by simply running:

```bash
pip install -r requirements.txt
```

or

```bash
conda install --file requirements.txt
```

as appropriate.
**Important caveat**: this process will dump every package you've installed, not just the ones needed for this project. It is best practice to edit down the list to just those necessary for your project.


**Exercise:**

Make sure the README file for your project has up-to-date installation instructions including required packages. Generate a requirements.txt file for your project.