Welcome to this project! We're excited you're here and want to contribute.
The goal of this project is to develop a fully Pythonic companion to the core statistical text, in parallel with the in-progress R companion.
Any contributions to this project are welcome, from flagging minor typos to developing entirely new chapters.
If you are experienced with the use of Git/GitHub for collaborative projects, you can jump to the Technical Guidelines section below.
These guidelines are designed to make it as easy as possible to get involved. If you have any questions that aren't discussed below, please let us know by opening an issue!
Before you start, you'll need to set up a free GitHub account and sign in. Here are some instructions.
Already know what you're looking for in this guide? Jump to the following sections:
- Joining the conversation
- Contributing through GitHub
- Understanding issues
- Making a change
- Structuring contributions
- Licensing
- Recognizing contributors
Discussions regarding the content and structure of the book take place via GitHub issues. We actively monitor this space and look forward to hearing from you with any questions or suggestions.
git is a really useful tool for version control. GitHub sits on top of git and supports collaborative and distributed working.
If you're not yet familiar with git, there are lots of great resources to help you git started!
Some of our favorites include the git Handbook and
the Software Carpentry introduction to git.
On GitHub, you'll use Markdown to chat in issues and pull requests.
You can think of Markdown as a few little symbols around your text that will allow GitHub
to render the text with a little bit of formatting.
For example, you could write words as bold (**bold**), or in italics (*italics*),
or as a link ([link](https://youtu.be/dQw4w9WgXcQ)) to another webpage.
GitHub has a really helpful page for getting started with writing and formatting Markdown on GitHub.
Every project on GitHub uses issues slightly differently. The following outlines how the statsthinking21 developers think about these tools.
Issues are individual pieces of work that need to be completed to move the project forward. A general guideline: if you find yourself tempted to write a great big issue that is difficult to describe as one unit of work, please consider splitting it into two or more issues.
Issues are assigned labels which explain how they relate to the overall project's goals and immediate next steps.
The current list of issue labels is here and includes:
-
These issues contain a task that is amenable to new contributors because it doesn't entail a steep learning curve.
If you feel that you can contribute to one of these issues, we especially encourage you to do so!
-
These issues point to problems in the code.
If you find a new bug, please give as much detail as possible in your issue, including steps to recreate the error. If you experience the same bug as one already listed, please add any additional information that you have as a comment.
-
These issues point to conceptual or statistical errors in the text.
If you find a conceptual or statistical problem with the text, please note its line number, describe the rationale for your report, and suggest a fix if possible.
-
These issues point to typographic errors in the text.
If you find a new typo, please note its line number, and also note the recommended correction.
-
These issues propose a new chapter.
If you wish to propose a new chapter, please describe your rationale, and how it would fit with the existing chapters. If possible, provide an outline of chapter subtopics.
-
These issues propose a new section for an existing chapter.
If you wish to propose a new section for an existing chapter, please describe your rationale, the topics that you think it should address, and how it would fit into the existing chapter.
We appreciate all contributions to this book, but those accepted fastest will follow a workflow similar to the following:
-
Comment on an existing issue or open a new issue referencing your addition.
This allows other members of the development team to confirm that you aren't overlapping with work that's currently underway and that everyone is on the same page with the goal of the work you're going to carry out.
This blog is a nice explanation of why putting this work in up front is so useful to everyone involved.
-
Fork the book repository to your profile.
This is now your own unique copy of the book source. Changes here won't affect anyone else's work, so it's a safe space to explore edits to the code!
-
Clone your forked book repository to your machine/computer.
While you can edit files directly on GitHub, sometimes the changes you want to make will be complex and you will want to use a text editor that you have installed on your local machine/computer. (One great text editor is VS Code.)
In order to work on the code locally, you must clone your forked repository.
To keep up with changes in the main book repository, add the "upstream" book repository as a remote to your locally cloned repository:
    git remote add upstream https://github.com/statsthinking21/statsthinking21-python.git
Make sure to keep your fork up to date with the upstream repository.
For example, to update your master branch on your local cloned repository:
    git fetch upstream
    git checkout master
    git merge upstream/master
-
Create a new branch to develop and maintain the proposed code changes.
For example:
    git fetch upstream  # Always start with an updated upstream
    git checkout -b fix/bug-1222 upstream/master
Please consider using appropriate branch names like those listed below, and mind that some of them are special (e.g., doc/ and docs/):
  - fix/<some-identifier>: for bugfixes
  - enh/<feature-name>: for new features
-
Make the changes you've discussed, following the style guide for Python code.
Try to keep your changes focused: it is generally easier to review changes that address one new section or bug fix at a time. Once you are satisfied with your local changes, add/commit/push them to the branch on your forked repository.
-
Submit a pull request.
A member of the development team will review your changes to confirm that they can be merged into the main code base.
Pull request titles should begin with a descriptive prefix (for example, FIX: Correct error in computation of standard deviation):
  - ENH: enhancements, such as new text or code
  - FIX: bug or typo fixes
  - TST: new or updated tests
  - STY: style changes
  - REF: refactoring existing code
  - CI: updates to continuous integration infrastructure
  - MAINT: general maintenance
For works-in-progress, add the WIP tag in addition to the descriptive prefix. Pull requests tagged with WIP: will not be merged until the tag is removed.
-
Have your PR reviewed by the development team, and update your changes accordingly in your branch.
The reviewers will take special care to help you address their comments, as well as to deal with conflicts and other tricky situations that can arise in distributed development.
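If you'd like to check your branch name and pull request title before submitting, the conventions above can be sketched in a few lines of Python. This is only an illustrative sketch, not a tool shipped with the project: the function names (`branch_prefix_ok`, `parse_pr_title`, `pr_title_ok`) are hypothetical, and the exact position of the WIP tag relative to the descriptive prefix is an assumption, so the sketch accepts either order.

```python
import re

# Prefixes copied from the guidelines above.
BRANCH_PREFIXES = ("fix/", "enh/")
PR_PREFIXES = ("ENH", "FIX", "TST", "STY", "REF", "CI", "MAINT")


def branch_prefix_ok(branch):
    """Return True if the branch starts with a listed prefix followed by an identifier."""
    return any(
        branch.startswith(prefix) and len(branch) > len(prefix)
        for prefix in BRANCH_PREFIXES
    )


def parse_pr_title(title):
    """Split leading 'TAG:' tokens off a PR title and return (tags, remaining_text).

    Assumption: tags appear at the start of the title, each followed by a colon,
    and WIP may come before or after the descriptive prefix.
    """
    known_tags = PR_PREFIXES + ("WIP",)
    tags = []
    rest = title
    while True:
        match = re.match(r"([A-Z]+):\s*", rest)
        if match is None or match.group(1) not in known_tags:
            break
        tags.append(match.group(1))
        rest = rest[match.end():]
    return tags, rest


def pr_title_ok(title):
    """Return True if the title has a descriptive prefix and some text after it."""
    tags, rest = parse_pr_title(title)
    return any(tag in PR_PREFIXES for tag in tags) and bool(rest.strip())


def is_work_in_progress(title):
    """Return True if the title carries the WIP tag."""
    tags, _ = parse_pr_title(title)
    return "WIP" in tags
```

For instance, `pr_title_ok("FIX: Correct error in computation of standard deviation")` passes, while a title without any prefix does not.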
The (currently proposed) technical plan for the book is as follows.
-
The code should be written in pure Python, targeting version 3.7 or greater. All code should follow the Python code style guide [PEP8], and should pass the flake8 style checker before submission.
-
The Python file for each chapter will be named chapter-<topic>.py. In principle, the chapters should coordinate with those in the core text; if you wish to break this rule, please raise an issue for discussion. Any additional files (e.g. those defining utility functions) should be placed in the utils directory, and preferably include the chapter topic in the file name (e.g. "chapter-topic-utils.py").
-
The chapter files should be written using Jupytext, which allows one to generate a Jupyter notebook from a pure Python file, using the percent format in which cells are delimited with a commented %%. This decision was made in order to simplify the use of version control on the code; when using plain Jupyter notebooks, the metadata is saved in the file such that the file contents change every time the notebook is executed, making it very difficult to determine the relevant changes.
-
The chapter files will be automatically converted to standard Jupyter notebooks by Jupytext as part of continuous integration.
-
The book will be generated using jupyter-book, which renders the Jupyter notebooks to HTML.
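To give a feel for the percent format mentioned above, here is a minimal sketch of what a chapter file might look like. The chapter title and the data are made up for illustration; only the cell markers (`# %%` for code cells and `# %% [markdown]` for text cells) come from Jupytext's percent format. Because the file is plain Python, it also runs as an ordinary script.

```python
# %% [markdown]
# # Summarizing data
# (Hypothetical chapter title and text, for illustration only.)

# %%
import statistics

# Made-up example data, not taken from the book.
scores = [2, 4, 4, 4, 5, 5, 7, 9]

# %% [markdown]
# Compute the mean and the population standard deviation of the scores.

# %%
mean_score = statistics.mean(scores)
sd_score = statistics.pstdev(scores)
print(mean_score, sd_score)
```

Each `# %%` line starts a new notebook cell, so a diff of this file shows only real changes to code and text rather than notebook metadata.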
TBD: Identify additional style issues regarding the structure of the notebooks.
We welcome and recognize all contributions regardless of their size, content or scope:
from documentation to testing and code development.
You can see a list of current developers and contributors in our Zenodo file.
Before every release, a new Zenodo file will be generated.
After the first draft is complete, we will create an update script that will also sort creators and contributors by
the relative size of their contributions, as provided by the git-line-summary utility
distributed with the git-extras package.
The last positions in both the creators and contributors lists will be reserved for
the project leaders.
Names can be added to these special positions upon request, and removals and changes
to the ordering will be reviewed on a schedule, every two years.
All the authors listed as creators participate in the review of modifications.
Creators are members of the team who have been responsible for establishing and/or driving the project.
Names and contacts of all creators are included in the
.maint/creators.json file.
Examples of steering activities that drive the project are: actively participating in the development of new content, helping with the design of the project, and providing resources (in the broad sense, including funding).
Contributors listed in the
.maint/contributors.json file
actively help or have previously helped the project in a broad sense: writing code or text,
proposing new features, and finding bugs.
If you are new to the project, don't forget to add your name and affiliation to the list
of contributors there!
Contributors who have contributed at some point to the project but wish to drop out of the project are listed in the .maint/former.json file.
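As a rough illustration of adding yourself to the contributors list, the sketch below appends a name and affiliation to a list of entries. Please treat it as a sketch under assumptions: the schema shown (a JSON list of objects with `name` and `affiliation` keys) is a guess, the example people are invented, and `add_contributor` is a hypothetical helper. Check the actual .maint/contributors.json file and follow the structure you find there.

```python
import json

# Hypothetical contents standing in for .maint/contributors.json.
# The real file's schema may differ; check it before editing.
existing = json.loads(
    '[{"name": "Doe, Jane", "affiliation": "Example University"}]'
)


def add_contributor(entries, name, affiliation):
    """Return a new list with the contributor appended, skipping duplicate names."""
    if any(entry.get("name") == name for entry in entries):
        return list(entries)
    return list(entries) + [{"name": name, "affiliation": affiliation}]


# Invented contributor details, for illustration only.
updated = add_contributor(existing, "Roe, Richard", "Sample Institute")
print(json.dumps(updated, indent=2))
```

In practice, editing the JSON file by hand in your text editor works just as well; the point is simply that each contributor gets one entry with a name and an affiliation.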
This companion is licensed under the Creative Commons Attribution-NonCommercial 4.0 International (CC-BY-NC) license. While we are generally opposed to the use of non-commercial licenses, in this case it is necessary. The core statistical text will be published by a commercial publisher as a low-cost paperback book, and this publisher reasonably requires that the open-source version be licensed to prevent other commercial reuse. Because the companion texts may end up incorporating text from the core text, we must also license the companions under CC-BY-NC. A benefit of this is that contributors do not need to worry about "contaminating" the companions with text from the core; in fact, it's perfectly ok to do so, as long as the license terms are upheld.
By contributing to this book, you acknowledge that any contributions will be licensed under the same terms.
You're awesome. 👋😃
— Based on contributing guidelines from the STEMMRoleModels and fMRIprep projects.