` in the expression.
+>
> > ## Solution
-> >Using a separate function:
+> > Using a separate function:
> > ~~~
-> >def daily_above_threshold(patient_num, data, threshold):
-> > """Count how many days a given patient's inflammation exceeds a given threshold.
+> > def daily_above_threshold(patient_num, data, threshold):
+> >     """Count how many days a given patient's inflammation exceeds a given threshold.
+> >
+> >     :param patient_num: The patient row number
+> >     :param data: A 2D data array with inflammation data
+> >     :param threshold: An inflammation threshold to check each daily value against
+> >     :returns: An integer representing the number of days a patient's inflammation is over a given threshold
+> >     """
+> >     def count_above_threshold(a, b):
+> >         if b:
+> >             return a + 1
+> >         else:
+> >             return a
> >
-> > :param patient_num: The patient row number
-> > :param data: A 2D data array with inflammation data
-> > :param threshold: An inflammation threshold to check each daily value against
-> > :returns: An integer representing the number of days a patient's inflammation is over a given threshold
-> > """
-> > def count_above_threshold(a, b):
-> > if b:
-> > return a + 1
-> > else:
-> > return a
-> >
> >     # Use map to determine if each daily inflammation value exceeds a given threshold for a patient
-> > above_threshold = map(lambda x: x > threshold, data[patient_num])
+> >     above_threshold = map(lambda x: x > threshold, data[patient_num])
> >     # Use reduce to count on how many days inflammation was above the threshold for a patient
> >     return reduce(count_above_threshold, above_threshold, 0)
> > ~~~
> > {: .language-python}
> >
-> >Note that the `count_above_threshold` function used by `reduce()` was defined within the `daily_above_threshold()` function to limit its scope and clarify its purpose (i.e. it may only be useful as part of `daily_above_threshold()` hence being defined as an inner function).
+> > Note that the `count_above_threshold` function used by `reduce()`
+> > was defined within the `daily_above_threshold()` function
+> > to limit its scope and clarify its purpose
+> > (i.e. it may only be useful as part of `daily_above_threshold()`
+> > hence being defined as an inner function).
> >
-> >The equivalent code using a lambda expression may look like:
+> > The equivalent code using a lambda expression may look like:
> >
-> >~~~
-> >from functools import reduce
+> > ~~~
+> > from functools import reduce
> >
-> >...
+> > ...
> >
-> >def daily_above_threshold(patient_num, data, threshold):
-> > """Count how many days a given patient's inflammation exceeds a given threshold.
+> > def daily_above_threshold(patient_num, data, threshold):
+> > """Count how many days a given patient's inflammation exceeds a given threshold.
> >
-> > :param patient_num: The patient row number
-> > :param data: A 2D data array with inflammation data
-> > :param threshold: An inflammation threshold to check each daily value against
-> > :returns: An integer representing the number of days a patient's inflammation is over a given threshold
-> > """
+> >     :param patient_num: The patient row number
+> >     :param data: A 2D data array with inflammation data
+> >     :param threshold: An inflammation threshold to check each daily value against
+> >     :returns: An integer representing the number of days a patient's inflammation is over a given threshold
+> >     """
> >
-> > above_threshold = map(lambda x: x > threshold, data[patient_num])
-> > return reduce(lambda a, b: a + 1 if b else a, above_threshold, 0)
-> >~~~
-> >{: .language-python}
-> Where could this be useful? For example, you may want to define the success criteria for a trial if, say, 80% of
-> patients do not exhibit inflammation in any of the trial days, or some similar metrics.
+> >     above_threshold = map(lambda x: x > threshold, data[patient_num])
+> >     return reduce(lambda a, b: a + 1 if b else a, above_threshold, 0)
+> > ~~~
+> > {: .language-python}
+> Where could this be useful?
+> For example, you may want to define the success criteria for a trial as, say,
+> 80% of patients not exhibiting inflammation on any of the trial days, or some similar metric.
>{: .solution}
{: .challenge}
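
To make the map/reduce approach from the solution above concrete, here is a small self-contained sketch. The `data` values and the threshold are invented for illustration (plain nested lists are used instead of a NumPy array so the example stands alone):

```python
from functools import reduce

def daily_above_threshold(patient_num, data, threshold):
    """Count how many days a given patient's inflammation exceeds a given threshold."""
    # map produces a boolean per daily value; reduce counts the True entries
    above_threshold = map(lambda x: x > threshold, data[patient_num])
    return reduce(lambda a, b: a + 1 if b else a, above_threshold, 0)

# Two patients, three daily inflammation readings each (made-up values)
data = [[1., 5., 3.],
        [4., 5., 6.]]

print(daily_above_threshold(0, data, 3))  # patient 0 exceeds 3 on 1 day
print(daily_above_threshold(1, data, 3))  # patient 1 exceeds 3 on 3 days
```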
## Decorators
-Finally, we will look at one last aspect of Python where functional programming is coming handy.
-As we have seen in the [episode on parametrising our unit tests](../22-scaling-up-unit-testing/index.html#parameterising-our-unit-tests), a decorator can take a function, modify/decorate it, then return the resulting function. This is possible because Python treats functions as first-class objects that can be passed around
-as normal data. Here, we discuss decorators in more detail and learn how to write our own. Let's look at the following code for ways on how to "decorate" functions.
+Finally, we will look at one last aspect of Python where functional programming comes in handy.
+As we have seen in the
+[episode on parametrising our unit tests](../22-scaling-up-unit-testing/index.html#parameterising-our-unit-tests),
+a decorator can take a function, modify/decorate it, then return the resulting function.
+This is possible because Python treats functions as first-class objects
+that can be passed around as normal data.
+Here, we discuss decorators in more detail and learn how to write our own.
+Let's look at the following code to see some ways to "decorate" functions.
~~~
def with_logging(func):
@@ -555,56 +738,88 @@ After function call
~~~
{: .output}
-In this example, we see a decorator (`with_logging`) and two different syntaxes for applying the decorator to a function. The decorator is implemented here as a function which encloses another function. Because the inner function (`inner()`) calls the function being decorated (`func()`) and returns its result, it still behaves like this original function. Part of this is the use of `*args` and `**kwargs` - these allow our decorated function to accept any arguments or keyword arguments and pass them directly to the function being decorated. Our decorator in this case does not need to modify any of the arguments, so we do not need to know what they are. Any additional behaviour we want to add as part of our decorated function, we can put before or after the call to the original function. Here we print some text both before and after the decorated function, to show the order in which events happen.
-
-We also see in this example the two different ways in which a decorator can be applied. The first of these is to use a normal function call (`with_logging(add_one)`), where we then assign the resulting function back to a variable - often using the original name of the function, so replacing it with the decorated version. The second syntax is the one we have seen previously (`@with_logging`). This syntax is equivalent to the previous one - the result is that we have a decorated version of the function, here with the name `add_two`. Both of these syntaxes can be useful in different situations: the `@` syntax is more concise if we never need to use the un-decorated version, while the function-call syntax gives us more flexibility - we can continue to use the un-decorated function if we make sure to give the decorated one a different name, and can even make multiple decorated versions using different decorators.
+In this example, we see a decorator (`with_logging`)
+and two different syntaxes for applying the decorator to a function.
+The decorator is implemented here as a function which encloses another function.
+Because the inner function (`inner()`) calls the function being decorated (`func()`)
+and returns its result,
+it still behaves like this original function.
+Part of this is the use of `*args` and `**kwargs` -
+these allow our decorated function to accept any arguments or keyword arguments
+and pass them directly to the function being decorated.
+Our decorator in this case does not need to modify any of the arguments,
+so we do not need to know what they are.
+Any additional behaviour we want to add as part of our decorated function,
+we can put before or after the call to the original function.
+Here we print some text both before and after the decorated function,
+to show the order in which events happen.
+
+We also see in this example the two different ways in which a decorator can be applied.
+The first of these is to use a normal function call (`with_logging(add_one)`),
+where we then assign the resulting function back to a variable -
+often using the original name of the function, so replacing it with the decorated version.
+The second syntax is the one we have seen previously (`@with_logging`).
+This syntax is equivalent to the previous one -
+the result is that we have a decorated version of the function,
+here with the name `add_two`.
+Both of these syntaxes can be useful in different situations:
+the `@` syntax is more concise if we never need to use the un-decorated version,
+while the function-call syntax gives us more flexibility -
+we can continue to use the un-decorated function
+if we make sure to give the decorated one a different name,
+and can even make multiple decorated versions using different decorators.
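
The diff above elides the body of `with_logging`, so the following is a reconstruction based on the surrounding text and the "After function call" output visible in the hunk header; treat the exact messages as an assumption. It shows both application syntaxes discussed above:

```python
def with_logging(func):
    """Decorator that prints a message before and after calling func."""
    def inner(*args, **kwargs):
        # *args / **kwargs let us pass any arguments straight through to func
        print("Before function call")
        result = func(*args, **kwargs)
        print("After function call")
        return result
    return inner

def add_one(n):
    return n + 1

# Function-call syntax: the un-decorated add_one remains available
logged_add_one = with_logging(add_one)

# Equivalent @ syntax: add_two only ever exists in its decorated form
@with_logging
def add_two(n):
    return n + 2

print(logged_add_one(1))  # logs, then prints 2
print(add_two(1))         # logs, then prints 3
```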
> ## Exercise: Measuring Performance Using Decorators
->One small task you might find a useful case for a decorator is measuring the time taken to execute a particular function. This is an important part of performance profiling.
+> One small task for which you might find a decorator useful is
+> measuring the time taken to execute a particular function.
+> This is an important part of performance profiling.
>
->Write a decorator which you can use to measure the execution time of the decorated function using the [time.process_time_ns()](https://docs.python.org/3/library/time.html#time.process_time_ns) function. There are several different timing functions each with slightly different use-cases, but we won’t worry about that here.
+> Write a decorator which you can use to measure the execution time of the decorated function
+> using the [time.process_time_ns()](https://docs.python.org/3/library/time.html#time.process_time_ns) function.
+> There are several different timing functions each with slightly different use-cases,
+> but we won’t worry about that here.
>
->For the function to measure, you may wish to use this as an example:
->~~~
->def measure_me(n):
-> total = 0
-> for i in range(n):
-> total += i * i
+> For the function to measure, you may wish to use this as an example:
+> ~~~
+> def measure_me(n):
+>     total = 0
+>     for i in range(n):
+>         total += i * i
>
-> return total
->~~~
->{: .language-python}
-> >## Solution
+>     return total
+> ~~~
+> {: .language-python}
+> > ## Solution
> >
-> >~~~
-> >import time
+> > ~~~
+> > import time
> >
-> >def profile(func):
-> > def inner(*args, **kwargs):
-> > start = time.process_time_ns()
-> > result = func(*args, **kwargs)
-> > stop = time.process_time_ns()
+> > def profile(func):
+> >     def inner(*args, **kwargs):
+> >         start = time.process_time_ns()
+> >         result = func(*args, **kwargs)
+> >         stop = time.process_time_ns()
> >
-> > print("Took {0} seconds".format((stop - start) / 1e9))
-> > return result
+> > print("Took {0} seconds".format((stop - start) / 1e9))
+> > return result
> >
-> > return inner
+> >     return inner
> >
-> >@profile
-> >def measure_me(n):
-> > total = 0
-> > for i in range(n):
-> > total += i * i
+> > @profile
+> > def measure_me(n):
+> >     total = 0
+> >     for i in range(n):
+> >         total += i * i
> >
-> > return total
+> >     return total
> >
-> >print(measure_me(1000000))
-> >~~~
-> >{: .language-python}
-> >~~~
-> >Took 0.124199753 seconds
-> >333332833333500000
-> >~~~
-> >{: .output}
->{: .solution}
+> > print(measure_me(1000000))
+> > ~~~
+> > {: .language-python}
+> > ~~~
+> > Took 0.124199753 seconds
+> > 333332833333500000
+> > ~~~
+> > {: .output}
+> {: .solution}
{: .challenge}
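
Tying this back to the two application syntaxes discussed earlier, the timing decorator from the solution can also be applied with an ordinary function call, keeping the un-timed function available under its original name. A sketch:

```python
import time

def profile(func):
    """Decorator that reports the CPU time taken by each call to func."""
    def inner(*args, **kwargs):
        start = time.process_time_ns()
        result = func(*args, **kwargs)
        stop = time.process_time_ns()
        print("Took {0} seconds".format((stop - start) / 1e9))
        return result
    return inner

def measure_me(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

# Function-call syntax: measure_me itself stays un-decorated,
# so we can still call it without the timing overhead
measure_me_timed = profile(measure_me)

print(measure_me_timed(1000000))  # prints the timing line, then the total
```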
diff --git a/_episodes/35-object-oriented-programming.md b/_episodes/35-object-oriented-programming.md
index efc45d9fe..f07ad5777 100644
--- a/_episodes/35-object-oriented-programming.md
+++ b/_episodes/35-object-oriented-programming.md
@@ -17,33 +17,46 @@ keypoints:
- "By breaking down our data into classes, we can reason about the behaviour of parts of our data."
- "Relationships between concepts can be described using inheritance (*is a*) and composition (*has a*)."
---
-
+
## Introduction
-Object oriented programming is a programming paradigm based on the concept of objects, which are data structures that
-contain (encapsulate) data and code. Data is encapsulated in the form of fields (attributes) of objects,
-while code is encapsulated in the form of procedures (methods) that manipulate objects' attributes and define "behaviour"
-of objects. So, in object oriented programming, we first think about the data and the things that we’re modelling - and represent these by objects - rather than define the logic of the program, and code becomes a series of interactions
-between objects.
+Object oriented programming is a programming paradigm based on the concept of objects,
+which are data structures that contain (encapsulate) data and code.
+Data is encapsulated in the form of fields (attributes) of objects,
+while code is encapsulated in the form of procedures (methods)
+that manipulate objects' attributes and define "behaviour" of objects.
+So, in object oriented programming,
+we first think about the data and the things that we’re modelling -
+and represent these by objects -
+rather than define the logic of the program,
+and code becomes a series of interactions between objects.
## Structuring Data
-One of the main difficulties we encounter when building more complex software is how to structure our data.
-So far, we've been processing data from a single source and with a simple tabular structure, but it would be useful to be able to combine data from a range of different sources and with more data than just an array of numbers.
+One of the main difficulties we encounter when building more complex software is
+how to structure our data.
+So far, we've been processing data from a single source and with a simple tabular structure,
+but it would be useful to be able to combine data from a range of different sources
+and with more data than just an array of numbers.
-~~~ python
+~~~
data = np.array([[1., 2., 3.],
                 [4., 5., 6.]])
~~~
{: .language-python}
-Using this data structure has the advantage of being able to use NumPy operations to process the data and Matplotlib to plot it, but often we need to have more structure than this.
-For example, we may need to attach more information about the patients and store this alongside our measurements of inflammation.
+Using this data structure has the advantage of
+being able to use NumPy operations to process the data
+and Matplotlib to plot it,
+but often we need to have more structure than this.
+For example, we may need to attach more information about the patients
+and store this alongside our measurements of inflammation.
-We can do this using the Python data structures we're already familiar with, dictionaries and lists.
+We can do this using the Python data structures we're already familiar with,
+dictionaries and lists.
For instance, we could attach a name to each of our patients:
-~~~ python
+~~~
patients = [
    {
        'name': 'Alice',
@@ -59,13 +72,16 @@ patients = [
> ## Exercise: Structuring Data
>
-> Write a function, called `attach_names`, which can be used to attach names to our patient dataset.
+> Write a function, called `attach_names`,
+> which can be used to attach names to our patient dataset.
> When used as below, it should produce the expected output.
>
-> If you're not sure where to begin, think about ways you might be able to effectively loop over two collections at once.
-> Also, don't worry too much about the data type of the `data` value, it can be a Python list, or a NumPy array - either is fine.
+> If you're not sure where to begin,
+> think about ways you might be able to effectively loop over two collections at once.
+> Also, don't worry too much about the data type of the `data` value,
+> it can be a Python list, or a NumPy array - either is fine.
>
-> ~~~ python
+> ~~~
> data = np.array([[1., 2., 3.],
>                  [4., 5., 6.]])
>
@@ -90,9 +106,10 @@ patients = [
>
> > ## Solution
> >
-> > One possible solution, perhaps the most obvious, is to use the `range` function to index into both lists at the same location:
+> > One possible solution, perhaps the most obvious,
+> > is to use the `range` function to index into both lists at the same location:
> >
-> > ~~~ python
+> > ~~~
> > def attach_names(data, names):
> > """Create datastructure containing patient records."""
> > output = []
@@ -105,23 +122,35 @@ patients = [
> > ~~~
> > {: .language-python}
> >
-> > However, this solution has a potential problem that can occur sometimes, depending on the input.
-> > What might go wrong with this solution? How could we fix it?
+> > However, this solution has a potential problem that can occur sometimes,
+> > depending on the input.
+> > What might go wrong with this solution?
+> > How could we fix it?
> >
> > > ## A Better Solution
> > >
> > > What would happen if the `data` and `names` inputs were different lengths?
> > >
-> > > If `names` is longer, we'll loop through, until we run out of rows in the `data` input, at which point we'll stop processing the last few names.
-> > > If `data` is longer, we'll loop through, but at some point we'll run out of names - but this time we try to access part of the list that doesn't exist, so we'll get an exception.
+> > > If `names` is longer, we'll loop through, until we run out of rows in the `data` input,
+> > > at which point we'll stop processing the last few names.
+> > > If `data` is longer, we'll loop through, but at some point we'll run out of names -
+> > > but this time we try to access part of the list that doesn't exist,
+> > > so we'll get an exception.
> > >
-> > > A better solution would be to use the `zip` function, which allows us to iterate over multiple iterables without needing an index variable.
-> > > The `zip` function also limits the iteration to whichever of the iterables is smaller, so we won't raise an exception here, but this might not quite be the behaviour we want, so we'll also explicitly `assert` that the inputs should be the same length.
-> > > Checking that our inputs are valid in this way is an example of a precondition, which we introduced conceptually in an earlier episode.
+> > > A better solution would be to use the `zip` function,
+> > > which allows us to iterate over multiple iterables without needing an index variable.
+> > > The `zip` function also limits the iteration to whichever of the iterables is smaller,
+> > > so we won't raise an exception here,
+> > > but this might not quite be the behaviour we want,
+> > > so we'll also explicitly `assert` that the inputs should be the same length.
+> > > Checking that our inputs are valid in this way is an example of a precondition,
+> > > which we introduced conceptually in an earlier episode.
> > >
-> > > If you've not previously come across the `zip` function, read [this section](https://docs.python.org/3/library/functions.html#zip) of the Python documentation.
+> > > If you've not previously come across the `zip` function,
+> > > read [this section](https://docs.python.org/3/library/functions.html#zip)
+> > > of the Python documentation.
> > >
-> > > ~~~ python
+> > > ~~~
> > > def attach_names(data, names):
> > > """Create datastructure containing patient records."""
> > > assert len(data) == len(names)
@@ -140,16 +169,31 @@ patients = [
## Classes in Python
-Using nested dictionaries and lists should work for some of the simpler cases where we need to handle structured data, but they get quite difficult to manage once the structure becomes a bit more complex.
-For this reason, in the object oriented paradigm, we use **classes** to help with managing this data and the operations we would want to perform on it.
-A class is a **template** (blueprint) for a structured piece of data, so when we create some data using a class, we can be certain that it has the same structure each time.
-
-With our list of dictionaries we had in the example above, we have no real guarantee that each dictionary has the same structure, e.g. the same keys (`name` and `data`) unless we check it manually.
-With a class, if an object is an **instance** of that class (i.e. it was made using that template), we know it will have the structure defined by that class. Different programming languages make slightly different guarantees about how strictly the structure will match, but in object oriented programming this is one of the core ideas - all objects derived from the same class must follow the same behaviour.
-
-You may not have realised, but you should already be familiar with some of the classes that come bundled as part of Python, for example:
+Using nested dictionaries and lists should work for some of the simpler cases
+where we need to handle structured data,
+but they get quite difficult to manage once the structure becomes a bit more complex.
+For this reason, in the object oriented paradigm,
+we use **classes** to help with managing this data
+and the operations we would want to perform on it.
+A class is a **template** (blueprint) for a structured piece of data,
+so when we create some data using a class,
+we can be certain that it has the same structure each time.
+
+With our list of dictionaries we had in the example above,
+we have no real guarantee that each dictionary has the same structure,
+e.g. the same keys (`name` and `data`) unless we check it manually.
+With a class, if an object is an **instance** of that class
+(i.e. it was made using that template),
+we know it will have the structure defined by that class.
+Different programming languages make slightly different guarantees
+about how strictly the structure will match,
+but in object oriented programming this is one of the core ideas -
+all objects derived from the same class must follow the same behaviour.
+
+You may not have realised, but you should already be familiar with
+some of the classes that come bundled as part of Python, for example:
-~~~ python
+~~~
my_list = [1, 2, 3]
my_dict = {1: '1', 2: '2', 3: '3'}
my_set = {1, 2, 3}
@@ -167,16 +211,18 @@ print(type(my_set))
~~~
{: .output}
-Lists, dictionaries and sets are a slightly special type of class, but they behave in much the same way as a class we might define ourselves:
+Lists, dictionaries and sets are a slightly special type of class,
+but they behave in much the same way as a class we might define ourselves:
- They each hold some data (**attributes** or **state**).
-- They also provide some methods describing the behaviours of the data - what can the data do and what can we do to the data?
+- They also provide some methods describing the behaviours of the data -
+ what can the data do and what can we do to the data?
The behaviours we may have seen previously include:
- Lists can be appended to
-- Lists can be indexed
-- Lists can be sliced
+- Lists can be indexed
+- Lists can be sliced
- Key-value pairs can be added to dictionaries
- The value at a key can be looked up in a dictionary
- The union of two sets can be found (the set of values present in any of the sets)
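
The behaviours listed above can each be demonstrated in a few lines (the values here are throwaway examples):

```python
my_list = [1, 2, 3]
my_list.append(4)       # lists can be appended to
print(my_list[0])       # lists can be indexed -> 1
print(my_list[1:3])     # lists can be sliced -> [2, 3]

my_dict = {1: '1', 2: '2'}
my_dict[3] = '3'        # key-value pairs can be added to dictionaries
print(my_dict[3])       # the value at a key can be looked up -> 3

my_set = {1, 2, 3}
print(my_set | {3, 4})  # the union of two sets -> {1, 2, 3, 4}
```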
@@ -186,7 +232,7 @@ The behaviours we may have seen previously include:
Let's start with a minimal example of a class representing our patients.
-~~~ python
+~~~
# file: inflammation/models.py
class Patient:
@@ -205,30 +251,52 @@ Alice
{: .output}
Here we've defined a class with one method: `__init__`.
-This method is the **initialiser** method, which is responsible for setting up the initial values and structure of the data inside a new instance of the class - this is very similar to **constructors** in other languages, so the term is often used in Python too.
-The `__init__` method is called every time we create a new instance of the class, as in `Patient('Alice')`.
-The argument `self` refers to the instance on which we are calling the method and gets filled in automatically by Python - we do not need to provide a value for this when we call the method.
-
-Data encapsulated within our Patient class includes the patient's name and a list of inflammation observations. In
-the initialiser method, we set a patient's name to the value provided, and create a list of inflammation observations for
-the patient (initially empty). Such data is also referred to
-as the attributes of a class and holds the current state of an instance of the class. Attributes are typically
-hidden (encapsulated) internal object details ensuring that access to data is protected from unintended changes. They
-are manipulated internally by the class, which, in addition, can expose certain functionality as public behavior of the class to
-allow other objects to interact with this class' instances.
+This method is the **initialiser** method,
+which is responsible for setting up the initial values and structure of the data
+inside a new instance of the class -
+this is very similar to **constructors** in other languages,
+so the term is often used in Python too.
+The `__init__` method is called every time we create a new instance of the class,
+as in `Patient('Alice')`.
+The argument `self` refers to the instance on which we are calling the method
+and gets filled in automatically by Python -
+we do not need to provide a value for this when we call the method.
+
+Data encapsulated within our Patient class includes
+the patient's name and a list of inflammation observations.
+In the initialiser method,
+we set a patient's name to the value provided,
+and create a list of inflammation observations for the patient (initially empty).
+Such data is also referred to as the attributes of a class
+and holds the current state of an instance of the class.
+Attributes are typically hidden (encapsulated) internal object details
+ensuring that access to data is protected from unintended changes.
+They are manipulated internally by the class,
+which, in addition, can expose certain functionality as public behaviour of the class
+to allow other objects to interact with this class' instances.
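
The initialiser behaviour described above can be sketched as a minimal reconstruction of the class (the full `Patient` class appears in the episode's code, which this diff elides):

```python
class Patient:
    """A patient in an inflammation study."""
    def __init__(self, name):
        self.name = name        # set the patient's name to the value provided
        self.observations = []  # list of inflammation observations, initially empty

# __init__ is called automatically when we create a new instance
alice = Patient('Alice')
print(alice.name)  # -> Alice
```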
## Encapsulating Behaviour
-In addition to representing a piece of structured data (e.g. a patient who has a name and a list of inflammation observations), a class can also provide a set of functions, or **methods**, which describe the **behaviours** of the data encapsulated in the instances of that class. To define the behaviour of a class we add functions which operate on the data the class contains. These functions are the member functions or methods.
-
-Methods on classes are the same as normal functions, except that they live inside a class and have an extra first parameter `self`.
-Using the name `self` is not strictly necessary, but is a very strong convention - it is extremely rare to see any other name chosen.
-When we call a method on an object, the value of `self` is automatically set to this object - hence the name.
-As we saw with the `__init__` method previously, we do not need to explicitly provide a value for the `self` argument, this is done for us by Python.
+In addition to representing a piece of structured data
+(e.g. a patient who has a name and a list of inflammation observations),
+a class can also provide a set of functions, or **methods**,
+which describe the **behaviours** of the data encapsulated in the instances of that class.
+To define the behaviour of a class we add functions which operate on the data the class contains.
+These functions are the member functions or methods.
+
+Methods on classes are the same as normal functions,
+except that they live inside a class and have an extra first parameter `self`.
+Using the name `self` is not strictly necessary, but is a very strong convention -
+it is extremely rare to see any other name chosen.
+When we call a method on an object,
+the value of `self` is automatically set to this object - hence the name.
+As we saw with the `__init__` method previously,
+we do not need to explicitly provide a value for the `self` argument -
+this is done for us by Python.
Let's add another method on our Patient class that adds a new observation to a Patient instance.
-~~~ python
+~~~
# file: inflammation/models.py
class Patient:
@@ -269,31 +337,53 @@ print(alice.observations)
~~~
{: .output}
-Note also how we used `day=None` in the parameter list of the `add_observation` method, then initialise it if the value is indeed `None`.
-This is one of the common ways to handle an optional argument in Python, so we'll see this pattern quite a lot in real projects.
+Note also how we used `day=None` in the parameter list of the `add_observation` method,
+then initialise it if the value is indeed `None`.
+This is one of the common ways to handle an optional argument in Python,
+so we'll see this pattern quite a lot in real projects.
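
The `day=None` pattern described above can be sketched like this; the exact method body is elided by the diff, so this is a reconstruction consistent with the surrounding text (a reasonable default is assumed: the day after the last observation, or day 0 for the first):

```python
class Patient:
    def __init__(self, name):
        self.name = name
        self.observations = []

    def add_observation(self, value, day=None):
        # Optional argument: if no day is given, infer one
        if day is None:
            if self.observations:
                day = self.observations[-1]['day'] + 1
            else:
                day = 0
        observation = {'day': day, 'value': value}
        self.observations.append(observation)
        return observation

alice = Patient('Alice')
alice.add_observation(3)
alice.add_observation(4)
print(alice.observations)  # -> [{'day': 0, 'value': 3}, {'day': 1, 'value': 4}]
```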
> ## Class and Static Methods
>
-> Sometimes, the function we're writing doesn't need access to any data belonging to a particular object.
+> Sometimes, the function we're writing doesn't need access to
+> any data belonging to a particular object.
> For these situations, we can instead use a **class method** or a **static method**.
-> Class methods have access to the class that they're a part of, and can access data on that class - but do not belong to a specific instance of that class, whereas static methods have access to neither the class nor its instances.
->
-> By convention, class methods use `cls` as their first argument instead of `self` - this is how we access the class and its data, just like `self` allows us to access the instance and its data.
-> Static methods have neither `self` nor `cls` so the arguments look like a typical free function.
+> Class methods have access to the class that they're a part of,
+> and can access data on that class -
+> but do not belong to a specific instance of that class,
+> whereas static methods have access to neither the class nor its instances.
+>
+> By convention, class methods use `cls` as their first argument instead of `self` -
+> this is how we access the class and its data,
+> just like `self` allows us to access the instance and its data.
+> Static methods have neither `self` nor `cls`
+> so the arguments look like a typical free function.
> These are the only common exceptions to using `self` for a method's first argument.
>
-> Both of these method types are created using **decorators** - for more information see the [classmethod](https://docs.python.org/3/library/functions.html#classmethod) and [staticmethod](https://docs.python.org/3/library/functions.html#staticmethod) decorator sections of the Python documentation.
+> Both of these method types are created using **decorators** -
+> for more information see
+> the [classmethod](https://docs.python.org/3/library/functions.html#classmethod)
+> and [staticmethod](https://docs.python.org/3/library/functions.html#staticmethod)
+> decorator sections of the Python documentation.
{: .callout}
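
As an illustrative sketch of the difference between the two method types (the `Circle` class and its method names are invented for this example):

```python
class Circle:
    count = 0  # class-level data, shared by all instances

    def __init__(self, radius):
        self.radius = radius
        Circle.count += 1

    @classmethod
    def how_many(cls):
        # cls is the class itself, so we can read class-level data
        return cls.count

    @staticmethod
    def area_of(radius):
        # Neither self nor cls: behaves like a free function grouped with the class
        return 3.14159 * radius ** 2

Circle(1)
Circle(2)
print(Circle.how_many())    # -> 2
print(Circle.area_of(1.0))  # -> 3.14159
```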
### Dunder Methods
Why is the `__init__` method not called `init`?
-There are a few special method names that we can use which Python will use to provide a few common behaviours, each of which begins and ends with a **d**ouble-**under**score, hence the name **dunder method**.
+There are a few special method names
+which Python will use to provide a few common behaviours,
+each of which begins and ends with a **d**ouble-**under**score,
+hence the name **dunder method**.
+
+When writing your own Python classes,
+you'll almost always want to write an `__init__` method,
+but there are a few other common ones you might need sometimes.
+You may have noticed in the code above that `print(alice)`
+printed `<__main__.Patient object at 0x7fd7e61b73d0>`,
+which is the default string representation of the `alice` object.
+We may want the print statement to display the object's name instead.
+We can achieve this by overriding the `__str__` method of our class.
-When writing your own Python classes, you'll almost always want to write an `__init__` method, but there are a few other common ones you might need sometimes. You may have noticed in the code above that the method `print(alice)` returned `<__main__.Patient object at 0x7fd7e61b73d0>`, which is the string represenation of the `alice` object. We
-may want the print statement to display the object's name instead. We can achieve this by overriding the `__str__` method of our class.
-
-~~~ python
+~~~
# file: inflammation/models.py
class Patient:
@@ -333,15 +423,22 @@ Alice
~~~
{: .output}
-These dunder methods are not usually called directly, but rather provide the implementation of some functionality we can use - we didn't call `alice.__str__()`, but it was called for us when we did `print(alice)`.
+These dunder methods are not usually called directly,
+but rather provide the implementation of some functionality we can use -
+we didn't call `alice.__str__()`,
+but it was called for us when we did `print(alice)`.
Some we see quite commonly are:
- `__str__` - converts an object into its string representation, used when you call `str(object)` or `print(object)`
- `__getitem__` - accesses an object by key; this is how `list[x]` and `dict[x]` are implemented
- `__len__` - gets the length of an object when we use `len(object)` - usually the number of items it contains
-There are many more described in the Python documentation, but it’s also worth experimenting with built in Python objects to see which methods provide which behaviour.
-For a more complete list of these special methods, see the [Special Method Names](https://docs.python.org/3/reference/datamodel.html#special-method-names) section of the Python documentation.
+There are many more described in the Python documentation,
+but it’s also worth experimenting with built-in Python objects
+to see which methods provide which behaviour.
+For a more complete list of these special methods,
+see the [Special Method Names](https://docs.python.org/3/reference/datamodel.html#special-method-names)
+section of the Python documentation.
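To see these in action, here is a hypothetical container class (not part of the lesson's model code) implementing all three of the dunder methods listed above:

```python
class ObservationSeries:
    def __init__(self, values):
        self.values = list(values)

    def __str__(self):
        # used by str(series) and print(series)
        return ', '.join(str(v) for v in self.values)

    def __getitem__(self, index):
        # used by series[index]
        return self.values[index]

    def __len__(self):
        # used by len(series)
        return len(self.values)


series = ObservationSeries([3, 5, 8])
print(series)       # 3, 5, 8
print(series[1])    # 5
print(len(series))  # 3
```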
> ## Exercise: A Basic Class
>
@@ -352,7 +449,7 @@ For a more complete list of these special methods, see the [Special Method Names
> - Have an author
> - When printed using `print(book)`, show text in the format "title by author"
>
-> ~~~ python
+> ~~~
> book = Book('A Book', 'Me')
>
> print(book)
@@ -366,7 +463,7 @@ For a more complete list of these special methods, see the [Special Method Names
>
> > ## Solution
> >
-> > ~~~ python
+> > ~~~
> > class Book:
> > def __init__(self, title, author):
> > self.title = title
@@ -382,9 +479,10 @@ For a more complete list of these special methods, see the [Special Method Names
### Properties
The final special type of method we will introduce is a **property**.
-Properties are methods which behave like data - when we want to access them, we do not need to use brackets to call the method manually.
+Properties are methods which behave like data -
+when we want to access them, we do not need to use brackets to call the method manually.
-~~~ python
+~~~
# file: inflammation/models.py
class Patient:
@@ -409,33 +507,46 @@ print(obs)
~~~
{: .output}
-You may recognise the `@` syntax from episodes on parameterising unit tests and functional programming - `property` is another example of a **decorator**.
-In this case the `property` decorator is taking the `last_observation` function and modifying its behaviour, so it can be accessed as if it were a normal attribute.
+You may recognise the `@` syntax from episodes on
+parameterising unit tests and functional programming -
+`property` is another example of a **decorator**.
+In this case the `property` decorator is taking the `last_observation` function
+and modifying its behaviour,
+so it can be accessed as if it were a normal attribute.
It is also possible to make your own decorators, but we won't cover it here.
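As a condensed sketch of the idea, using a hypothetical `Square` class rather than the lesson's `Patient`:

```python
class Square:
    def __init__(self, side):
        self.side = side

    @property
    def area(self):
        # computed on each access, but read like plain data - no brackets needed
        return self.side ** 2


sq = Square(4)
print(sq.area)  # 16 - accessed like an attribute, not called like sq.area()
```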
## Relationships Between Classes
-We now have a language construct for grouping data and behaviour related to a single conceptual object.
+We now have a language construct for grouping data and behaviour
+related to a single conceptual object.
The next step we need to take is to describe the relationships between the concepts in our code.
-There are two fundamental types of relationship between objects which we need to be able to describe:
+There are two fundamental types of relationship between objects
+which we need to be able to describe:
1. Ownership - x **has a** y - this is **composition**
2. Identity - x **is a** y - this is **inheritance**
### Composition
-You should hopefully have come across the term **composition** already - in the novice Software Carpentry, we use composition of functions to reduce code duplication.
-That time, we used a function which converted temperatures in Celsius to Kelvin as a **component** of another function which converted temperatures in Fahrenheit to Kelvin.
+You should hopefully have come across the term **composition** already -
+in the novice Software Carpentry, we used composition of functions to reduce code duplication.
+That time, we used a function which converted temperatures in Celsius to Kelvin
+as a **component** of another function which converted temperatures in Fahrenheit to Kelvin.
In the same way, in object oriented programming, we can make things components of other things.
-We often use composition where we can say 'x *has a* y' - for example in our inflammation project, we might want to say that a doctor *has* patients or that a patient *has* observations.
+We often use composition where we can say 'x *has a* y' -
+for example in our inflammation project,
+we might want to say that a doctor *has* patients
+or that a patient *has* observations.
-In the case of our example, we're already saying that patients have observations, so we're already using composition here.
-We're currently implementing an observation as a dictionary with a known set of keys though, so maybe we should make an `Observation` class as well.
+In the case of our example, we're already saying that patients have observations,
+so we're already using composition here.
+We're currently implementing an observation as a dictionary with a known set of keys though,
+so maybe we should make an `Observation` class as well.
-~~~ python
+~~~
# file: inflammation/models.py
class Observation:
@@ -481,22 +592,32 @@ print(obs)
~~~
{: .output}
-Now we're using a composition of two custom classes to describe the relationship between two types of entity in the system that we're modelling.
+Now we're using a composition of two custom classes to
+describe the relationship between two types of entity in the system that we're modelling.
### Inheritance
The other type of relationship used in object oriented programming is **inheritance**.
-Inheritance is about data and behaviour shared by classes, because they have some shared identity - 'x *is a* y'.
-If class `X` inherits from (*is a*) class `Y`, we say that `Y` is the **superclass** or **parent class** of `X`, or `X` is a **subclass** of `Y`.
-
-If we want to extend the previous example to also manage people who aren't patients we can add another class `Person`.
-But `Person` will share some data and behaviour with `Patient` - in this case both have a name and show that name when you print them.
-Since we expect all patients to be people (hopefully!), it makes sense to implement the behaviour in `Person` and then reuse it in `Patient`.
+Inheritance is about data and behaviour shared by classes,
+because they have some shared identity - 'x *is a* y'.
+If class `X` inherits from (*is a*) class `Y`,
+we say that `Y` is the **superclass** or **parent class** of `X`,
+or `X` is a **subclass** of `Y`.
+
+If we want to extend the previous example to also manage people who aren't patients
+we can add another class `Person`.
+But `Person` will share some data and behaviour with `Patient` -
+in this case both have a name and show that name when you print them.
+Since we expect all patients to be people (hopefully!),
+it makes sense to implement the behaviour in `Person` and then reuse it in `Patient`.
+
+To write our class in Python,
+we used the `class` keyword, the name of the class,
+and then a block of the functions that belong to it.
+If the class **inherits** from another class,
+we include the parent class name in brackets.
-To write our class in Python, we used the `class` keyword, the name of the class, and then a block of the functions that belong to it.
-If the class **inherits** from another class, we include the parent class name in brackets.
-
-~~~ python
+~~~
# file: inflammation/models.py
class Observation:
@@ -555,26 +676,45 @@ AttributeError: 'Person' object has no attribute 'add_observation'
~~~
{: .output}
-As expected, an error is thrown because we cannot add an observation to `bob`, who is a Person but not a Patient.
-
-We see in the example above that to say that a class inherits from another, we put the **parent class** (or **superclass**) in brackets after the name of the **subclass**.
-
-There's something else we need to add as well - Python doesn't automatically call the `__init__` method on the parent class if we provide a new `__init__` for our subclass, so we'll need to call it ourselves.
-This makes sure that everything that needs to be initialised on the parent class has been, before we need to use it.
-If we don't define a new `__init__` method for our subclass, Python will look for one on the parent class and use it automatically.
-This is true of all methods - if we call a method which doesn't exist directly on our class, Python will search for it among the parent classes.
-The order in which it does this search is known as the **method resolution order** - a little more on this in the Multiple Inheritance callout below.
-
-The line `super().__init__(name)` gets the parent class, then calls the `__init__` method, providing the `name` variable that `Person.__init__` requires.
-This is quite a common pattern, particularly for `__init__` methods, where we need to make sure an object is initialised as a valid `X`, before we can initialise it as a valid `Y` - e.g. a valid `Person` must have a name, before we can properly initialise a `Patient` model with their inflammation data.
+As expected, an error is thrown because we cannot add an observation to `bob`,
+who is a Person but not a Patient.
+
+We see in the example above that to say that a class inherits from another,
+we put the **parent class** (or **superclass**) in brackets after the name of the **subclass**.
+
+There's something else we need to add as well -
+Python doesn't automatically call the `__init__` method on the parent class
+if we provide a new `__init__` for our subclass,
+so we'll need to call it ourselves.
+This makes sure that everything that needs to be initialised on the parent class has been,
+before we need to use it.
+If we don't define a new `__init__` method for our subclass,
+Python will look for one on the parent class and use it automatically.
+This is true of all methods -
+if we call a method which doesn't exist directly on our class,
+Python will search for it among the parent classes.
+The order in which it does this search is known as the **method resolution order** -
+a little more on this in the Multiple Inheritance callout below.
+
+The line `super().__init__(name)` gets the parent class,
+then calls the `__init__` method,
+providing the `name` variable that `Person.__init__` requires.
+This is quite a common pattern, particularly for `__init__` methods,
+where we need to make sure an object is initialised as a valid `X`,
+before we can initialise it as a valid `Y` -
+e.g. a valid `Person` must have a name,
+before we can properly initialise a `Patient` model with their inflammation data.
> ## Composition vs Inheritance
>
-> When deciding how to implement a model of a particular system, you often have a choice of either composition or inheritance, where there is no obviously correct choice.
-> For example, it's not obvious whether a photocopier *is a* printer and *is a* scanner, or *has a* printer and *has a* scanner.
+> When deciding how to implement a model of a particular system,
+> you often have a choice of either composition or inheritance,
+> where there is no obviously correct choice.
+> For example, it's not obvious whether a photocopier *is a* printer and *is a* scanner,
+> or *has a* printer and *has a* scanner.
>
-> ~~~ python
+> ~~~
> class Machine:
> pass
>
@@ -590,7 +730,7 @@ This is quite a common pattern, particularly for `__init__` methods, where we ne
> ~~~
> {: .language-python}
>
-> ~~~ python
+> ~~~
> class Machine:
> pass
>
@@ -609,47 +749,65 @@ This is quite a common pattern, particularly for `__init__` methods, where we ne
> {: .language-python}
>
> Both of these would be perfectly valid models and would work for most purposes.
-> However, unless there's something about how you need to use the model which would benefit from using a model based on inheritance, it's usually recommended to opt for **composition over inheritance**.
-> This is a common design principle in the object oriented paradigm and is worth remembering, as it's very common for people to overuse inheritance once they've been introduced to it.
->
-> For much more detail on this see the [Python Design Patterns guide](https://python-patterns.guide/gang-of-four/composition-over-inheritance/).
+> However, unless there's something about how you need to use the model
+> which would benefit from using a model based on inheritance,
+> it's usually recommended to opt for **composition over inheritance**.
+> This is a common design principle in the object oriented paradigm and is worth remembering,
+> as it's very common for people to overuse inheritance once they've been introduced to it.
+>
+> For much more detail on this see the
+> [Python Design Patterns guide](https://python-patterns.guide/gang-of-four/composition-over-inheritance/).
{: .callout}
> ## Multiple Inheritance
>
> **Multiple Inheritance** is when a class inherits from more than one direct parent class.
> It exists in Python, but is often not present in other Object Oriented languages.
-> Although this might seem useful, like in our inheritance-based model of the photocopier above, it's best to avoid it unless you're sure it's the right thing to do, due to the complexity of the inheritance heirarchy.
-> Often using multiple inheritance is a sign you should instead be using composition - again like the photocopier model above.
+> Although this might seem useful, like in our inheritance-based model of the photocopier above,
+> it's best to avoid it unless you're sure it's the right thing to do,
+> due to the complexity of the inheritance hierarchy.
+> Often using multiple inheritance is a sign you should instead be using composition -
+> again like the photocopier model above.
{: .callout}
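You can inspect the method resolution order directly via the `__mro__` attribute. Here is a sketch using an inheritance-based photocopier model like the one above:

```python
class Machine:
    pass


class Printer(Machine):
    pass


class Scanner(Machine):
    pass


class Copier(Printer, Scanner):
    pass


# The method resolution order: Copier first, then its parents left to right,
# then their shared parent, then object.
print([cls.__name__ for cls in Copier.__mro__])
```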
> ## Exercise: A Model Patient
>
-> Let's use what we have learnt in this episode and combine it with what we have learnt on
-> [software requirements](../31-software-requirements/index.html) to formulate and implement a
+> Let's use what we have learnt in this episode and combine it with what we have learnt on
+> [software requirements](../31-software-requirements/index.html)
+> to formulate and implement a
> [few new solution requirements](../31-software-requirements/index.html#exercise-new-solution-requirements)
> to extend the model layer of our clinical trial system.
>
-> Let's can start with extending the system such that there must be a `Doctor` class to hold the data representing a single doctor, which:
+> Let's start with extending the system such that there must be
+> a `Doctor` class to hold the data representing a single doctor, which:
+>
> - must have a `name` attribute
> - must have a list of patients that this doctor is responsible for.
>
-> In addition to these, try to think of an extra feature you could add to the models which would be useful for managing a dataset like this - imagine we're running a clinical trial, what else might we want to know?
-> Try using Test Driven Development for any features you add: write the tests first, then add the feature.
-> The tests have been started for you in `tests/test_patient.py`, but you will probably want to add some more.
+> In addition to these, try to think of an extra feature you could add to the models
+> which would be useful for managing a dataset like this -
+> imagine we're running a clinical trial, what else might we want to know?
+> Try using Test Driven Development for any features you add:
+> write the tests first, then add the feature.
+> The tests have been started for you in `tests/test_patient.py`,
+> but you will probably want to add some more.
>
> Once you've finished the initial implementation, do you have much duplicated code?
-> Is there anywhere you could make better use of composition or inheritance to improve your implementation?
+> Is there anywhere you could make better use of composition or inheritance
+> to improve your implementation?
>
-> For any extra features you've added, explain them and how you implemented them to your neighbour.
+> For any extra features you've added,
+> explain them and how you implemented them to your neighbour.
> Would they have implemented that feature in the same way?
+>
> > ## Solution
-> > One example solution is shown below. You may start by writing some tests (that will initially fail), and then
-> > develop the code to satisfy the new requirements and pass the tests.
-> > ~~~ python
-> > # file: tests/test_patient.py
-> > """Tests for the Patient model."""
+> > One example solution is shown below.
+> > You may start by writing some tests (that will initially fail),
+> > and then develop the code to satisfy the new requirements and pass the tests.
+> > ~~~
+> > # file: tests/test_patient.py
+> > """Tests for the Patient model."""
> >
> > def test_create_patient():
> > """Check a patient is created correctly given a name."""
@@ -664,7 +822,7 @@ This is quite a common pattern, particularly for `__init__` methods, where we ne
> > name = 'Sheila Wheels'
> > doc = Doctor(name=name)
> > assert doc.name == name
-> >
+> >
> > def test_doctor_is_person():
> > """Check if a doctor is a person."""
> > from inflammation.models import Doctor, Person
@@ -693,12 +851,12 @@ This is quite a common pattern, particularly for `__init__` methods, where we ne
> > alice = Patient("Alice")
> > doc.add_patient(alice)
> > doc.add_patient(alice)
-> > assert len(doc.patients) == 1
+> > assert len(doc.patients) == 1
> > ...
-> > ~~~
> > {: .language-python}
-> >
-> > ~~~ python
+> > ~~~
+> > {: .language-python}
+> >
+> > ~~~
> > # file: inflammation/models.py
> > ...
> > class Person:
@@ -739,8 +897,8 @@ This is quite a common pattern, particularly for `__init__` methods, where we ne
> > return
> > self.patients.append(new_patient)
> > ...
-> > ~~~
-> {: .language-python}
+> > ~~~
+> > {: .language-python}
> {: .solution}
{: .challenge}
diff --git a/_episodes/36-architecture-revisited.md b/_episodes/36-architecture-revisited.md
index 60cd3f197..0b460211a 100644
--- a/_episodes/36-architecture-revisited.md
+++ b/_episodes/36-architecture-revisited.md
@@ -11,25 +11,38 @@ keypoints:
Such components can be as small as a single function, or be a software package in their own right."
---
-As we have seen, we have different programming paradigms that are suitable for different problems and affect the
-structure of our code. In programming languages that support multiple paradigms, such as Python, we have the luxury of
-using elements of different paradigms paradigms and we, as software designers and programmers, can
-decide how to use those elements in different architectural components of our software. Let's now circle back to the
-architecture of our software for one final look.
+As we have seen, we have different programming paradigms that are suitable for different problems
+and affect the structure of our code.
+In programming languages that support multiple paradigms, such as Python,
+we have the luxury of using elements of different paradigms and we,
+as software designers and programmers,
+can decide how to use those elements in different architectural components of our software.
+Let's now circle back to the architecture of our software for one final look.
## MVC Revisited
-We've been developing our software using the **Model-View-Controller** (MVC) architecture so far, but, as we have seen, MVC is just one of the common architectural patterns and is not the only choice we could have made.
+We've been developing our software using the **Model-View-Controller** (MVC) architecture so far,
+but, as we have seen, MVC is just one of the common architectural patterns
+and is not the only choice we could have made.
-There are many variants of an MVC-like pattern (such as [Model-View-Presenter](https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93presenter) (MVP), [Model-View-Viewmodel](https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93viewmodel) (MVVM), etc.), but in most cases, the distinction between these patterns isn't particularly important.
-What really matters is that we are making decisions about the architecture of our software that suit the way in which we expect to use it.
+There are many variants of an MVC-like pattern (such as
+[Model-View-Presenter](https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93presenter) (MVP),
+[Model-View-Viewmodel](https://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93viewmodel) (MVVM), etc.),
+but in most cases, the distinction between these patterns isn't particularly important.
+What really matters is that we are making decisions about the architecture of our software
+that suit the way in which we expect to use it.
We should reuse these established ideas where we can, but we don't need to stick to them exactly.
-In this episode we'll be taking our Object Oriented code from the previous episode and integrating it into our existing MVC pattern. But first we will explain some features of the Controller (`inflammation-analysis.py`) component of our architecture.
+In this episode we'll be taking our Object Oriented code from the previous episode
+and integrating it into our existing MVC pattern.
+But first we will explain some features of
+the Controller (`inflammation-analysis.py`) component of our architecture.
### Controller Structure
-You will have noticed already that structure of the `inflammation-analysis.py` file follows this pattern:
+You will have noticed already that the structure of the `inflammation-analysis.py` file
+follows this pattern:
+
~~~
# import modules
@@ -42,15 +55,27 @@ if __name__ == "__main__":
~~~
{: .language-python}
-In this pattern the actions performed by the script are contained within the `main` function (which does not need to be called `main`, but using this convention helps others in understanding your code). The `main` function is then called within the `if` statement `__name__ == "__main__"`, after some other actions have been performed (usually the parsing of command-line arguments, which will be explained below). `__name__` is a special dunder variable which is set, along with a number of other special dunder variables, by the python interpreter before the execution of any code in the source file. What value is given by the interpreter to `__name__` is determined by the manner in which it is loaded.
+In this pattern the actions performed by the script are contained within the `main` function
+(which does not need to be called `main`,
+but using this convention helps others in understanding your code).
+The `main` function is then called within the `if __name__ == "__main__":` statement,
+after some other actions have been performed
+(usually the parsing of command-line arguments, which will be explained below).
+`__name__` is a special dunder variable which is set,
+along with a number of other special dunder variables,
+by the Python interpreter before the execution of any code in the source file.
+What value is given by the interpreter to `__name__` is determined by
+the manner in which it is loaded.
If we run the source file directly using the Python interpreter, e.g.:
+
~~~
$ python3 inflammation-analysis.py
~~~
{: .language-bash}
then the interpreter will assign the hard-coded string `"__main__"` to the `__name__` variable:
+
~~~
__name__ = "__main__"
...
@@ -59,12 +84,15 @@ __name__ = "__main__"
{: .language-python}
However, if your source file is imported by another Python script, e.g:
+
~~~
import inflammation-analysis
~~~
{: .language-python}
-then the interpreter will assign the name `"inflammation-analysis"` from the import statement to the `__name__` variable:
+then the interpreter will assign the name `"inflammation-analysis"`
+from the import statement to the `__name__` variable:
+
~~~
__name__ = "inflammation-analysis"
...
@@ -72,30 +100,49 @@ __name__ = "inflammation-analysis"
~~~
{: .language-python}
-Because of this behaviour of the interpreter, we can put any code that should only be executed when running the script directly within the `if __name__ == "__main__":` structure, allowing the rest of the code within the script to be safely imported by another script if we so wish.
+Because of this behaviour of the interpreter,
+we can put any code that should only be executed when running the script directly
+within the `if __name__ == "__main__":` structure,
+allowing the rest of the code within the script to be
+safely imported by another script if we so wish.
-While it may not seem very useful to have your controller script importable by another script, there are a number of situations in which you would want to do this:
-- for testing of your code, you can have your testing framework import the main script, and run special test functions which then call the `main` function directly;
-- where you want to not only be able to run your script from the command-line, but also provide a programmer-friendly application programming interface (API) for advanced users.
+While it may not seem very useful to have your controller script importable by another script,
+there are a number of situations in which you would want to do this:
+
+- for testing of your code, you can have your testing framework import the main script,
+ and run special test functions which then call the `main` function directly;
+- where you want to not only be able to run your script from the command-line,
+ but also provide a programmer-friendly application programming interface (API) for advanced users.
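A minimal sketch of the pattern, using a hypothetical script whose `main` just sums its inputs:

```python
def main(numbers):
    # all the real work happens here, so a test framework can import this
    # module and call main() directly with its own arguments
    return sum(numbers)


if __name__ == '__main__':
    # only runs when executed directly (e.g. 'python3 script.py'), not on import
    print(main([1, 2, 3]))
```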
### Passing Command-line Options to Controller
-The standard Python library for reading command line arguments passed to a script is [`argparse`](https://docs.python.org/3/library/argparse.html). This module reads arguments passed by the system, and enables the automatic generation of help and usage messages. These include, as we saw at the start of this course, the generation of helpful error messages when users give the program invalid arguments.
+The standard Python library for reading command line arguments passed to a script is
+[`argparse`](https://docs.python.org/3/library/argparse.html).
+This module reads arguments passed by the system,
+and enables the automatic generation of help and usage messages.
+These include, as we saw at the start of this course,
+the generation of helpful error messages when users give the program invalid arguments.
+
+The basic usage of `argparse` can be seen in the `inflammation-analysis.py` script.
+First we import the library:
-The basic usage of `argparse` can be seen in the `inflammation-analysis.py` script. First we import the library:
~~~
import argparse
~~~
{: .language-python}
We then initialise the argument parser class, passing an (optional) description of the program:
+
~~~
parser = argparse.ArgumentParser(
description='A basic patient inflammation data management system')
~~~
{: .language-python}
-Once the parser has been initialised we can add the arguments that we want argparse to look out for. In our basic case, we want only the names of the file(s) to process:
+Once the parser has been initialised we can add
+the arguments that we want argparse to look out for.
+In our basic case, we want only the names of the file(s) to process:
+
~~~
parser.add_argument(
'infiles',
@@ -104,19 +151,31 @@ parser.add_argument(
~~~
{: .language-python}
-Here we have defined what the argument will be called (`'infiles'`) when it is read in; the number of arguments to be expected (`nargs='+'`, where `'+'` indicates that there should be 1 or more arguments passed); and a help string for the user (`help='Input CSV(s) containing inflammation series for each patient'`).
+Here we have defined what the argument will be called (`'infiles'`) when it is read in;
+the number of arguments to be expected
+(`nargs='+'`, where `'+'` indicates that there should be 1 or more arguments passed);
+and a help string for the user
+(`help='Input CSV(s) containing inflammation series for each patient'`).
-You can add as many arguments as you wish, and these can be either mandatory (as the one above) or optional. Most of the complexity in using `argparse` is in adding the correct argument options, and we will explain how to do this in more detail below.
+You can add as many arguments as you wish,
+and these can be either mandatory (as the one above) or optional.
+Most of the complexity in using `argparse` is in adding the correct argument options,
+and we will explain how to do this in more detail below.
Finally we parse the arguments passed to the script using:
+
~~~
args = parser.parse_args()
~~~
{: .language-python}
-This returns an object (that we've called `arg`) containing all the arguments requested. These can be accessed using the names that we have defined for each argument, e.g. `args.infiles` would return the filenames that have been input.
+This returns an object (that we've called `args`) containing all the arguments requested.
+These can be accessed using the names that we have defined for each argument,
+e.g. `args.infiles` would return the filenames that have been input.
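You can experiment with this by passing an explicit list of (hypothetical) filenames to `parse_args()`, rather than relying on `sys.argv`:

```python
import argparse

parser = argparse.ArgumentParser(
    description='A basic patient inflammation data management system')
parser.add_argument(
    'infiles',
    nargs='+',
    help='Input CSV(s) containing inflammation series for each patient')

# parse_args() accepts an explicit list, which is handy for experimenting;
# called with no argument it reads the command line (sys.argv) as usual
args = parser.parse_args(['data01.csv', 'data02.csv'])
print(args.infiles)  # ['data01.csv', 'data02.csv']
```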
+
+The help for the script can be accessed using the `-h` or `--help` optional argument
+(which `argparse` includes by default):
-The help for the script can be accessed using the `-h` or `--help` optional argument (which `argparse` includes by default):
~~~
$ python3 inflammation-analysis.py --help
~~~
@@ -135,28 +194,50 @@ optional arguments:
~~~
{: .output}
-The help page starts with the command line usage, illustrating what inputs can be given (any within `[]` brackets are optional). It then lists the **positional** and **optional** arguments, giving as detailed a description of each as you have added to the `add_argument()` command.
-Positional arguments are arguments that need to be included in the proper position or order when calling the script.
-
-Note that optional arguments are indicated by `-` or `--`, followed by the argument name. Positional arguments are simply inferred by their position. It is possible to have multiple positional arguments, but usually this is only practical where all (or all but one) positional arguments contains a clearly defined number of elements. If more than one option can have an indeterminate number of entries, then it is better to create them as 'optional' arguments. These can be made a required input though, by setting `required = True` within the `add_argument()` command.
+The help page starts with the command line usage,
+illustrating what inputs can be given (any within `[]` brackets are optional).
+It then lists the **positional** and **optional** arguments,
+giving as detailed a description of each as you have added to the `add_argument()` command.
+Positional arguments are arguments that need to be included
+in the proper position or order when calling the script.
+
+Note that optional arguments are indicated by `-` or `--`, followed by the argument name.
+Positional arguments are simply inferred by their position.
+It is possible to have multiple positional arguments,
+but usually this is only practical where all (or all but one) positional arguments
+contain a clearly defined number of elements.
+If more than one option can have an indeterminate number of entries,
+then it is better to create them as 'optional' arguments.
+These can still be made required, though,
+by setting `required=True` within the `add_argument()` command.
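As a standalone sketch (not the course's `inflammation-analysis.py`; the argument names here are made up for illustration), `required=True` turns an optional-style argument into a mandatory one:

```python
import argparse

# Hypothetical example, separate from the course script:
# 'infiles' is a positional argument taking one or more values,
# while '--threshold' uses optional-argument syntax but is mandatory.
parser = argparse.ArgumentParser(description='Required optional argument sketch')
parser.add_argument('infiles', nargs='+', help='Input data file(s)')
parser.add_argument('--threshold', type=int, required=True,
                    help='Threshold to check daily values against')

# Passing an explicit list lets us test parsing without the command line
args = parser.parse_args(['data1.csv', 'data2.csv', '--threshold', '5'])
print(args.infiles)    # ['data1.csv', 'data2.csv']
print(args.threshold)  # 5
```

Omitting `--threshold` here would make `parse_args()` exit with an error, which is exactly the behaviour a required input needs.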
> ## Positional and Optional Argument Order
->
-> The usage section of the help page above shows the optional arguments going before the
-> positional arguments. This is the customary way to present options, but is not mandatory. Instead there are two rules which must be followed for these arguments:
-> 1. Positional and optional arguments must each be given all together, and not inter-mixed. For example, the order can be either `optional - positional` or `positional - optional`, but not `optional - positional - optional`.
-> 2. Positional arguments must be given in the order that they are shown in the usage section of the help page.
+>
+> The usage section of the help page above shows
+> the optional arguments going before the positional arguments.
+> This is the customary way to present options, but is not mandatory.
+> Instead there are two rules which must be followed for these arguments:
+>
+> 1. Positional and optional arguments must each be given all together, and not inter-mixed.
+> For example, the order can be either `optional - positional` or `positional - optional`,
+> but not `optional - positional - optional`.
+> 2. Positional arguments must be given in the order that they are shown
+> in the usage section of the help page.
{: .callout}
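The two rules above can be checked directly by handing `parse_args()` explicit argument lists (a throwaway sketch, not part of the course script):

```python
import argparse

# Throwaway sketch demonstrating argument ordering.
parser = argparse.ArgumentParser()
parser.add_argument('first')   # positional arguments must keep
parser.add_argument('second')  # this relative order (rule 2)
parser.add_argument('--flag', action='store_true')

# Rule 1: 'optional - positional' and 'positional - optional' both parse
a = parser.parse_args(['--flag', 'one', 'two'])
b = parser.parse_args(['one', 'two', '--flag'])
print(a == b)  # True: first='one', second='two', flag=True either way
```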
-Now that you have some familiarity with `argparse`, we will demonstrate below how you can use this to add extra functionality to your controller.
+Now that you have some familiarity with `argparse`,
+we will demonstrate below how you can use this to add extra functionality to your controller.
### Adding a New View
Let's start with adding a view that allows us to see the data for a single patient.
-First, we need to add the code for the view itself and make sure our `Patient` class has the necessary data - including the ability to pass a list of measurements to the `__init__` method.
-Note that your Patient class may look very different now, so adapt this example to fit what you have.
+First, we need to add the code for the view itself
+and make sure our `Patient` class has the necessary data -
+including the ability to pass a list of measurements to the `__init__` method.
+Note that your `Patient` class may look very different now,
+so adapt this example to fit what you have.
-~~~ python
+~~~
# file: inflammation/views.py
...
@@ -169,7 +250,7 @@ def display_patient_record(patient):
~~~
{: .language-python}
-~~~ python
+~~~
# file: inflammation/models.py
...
@@ -213,8 +294,13 @@ class Patient(Person):
~~~
{: .language-python}
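The listings above are abbreviated in this diff, so the following is only a minimal sketch of what the view and model might look like; the reduced `Person` base class and the `observations` attribute name are assumptions, and your own code will differ:

```python
# Hypothetical sketch only - adapt it to your actual Patient class.

class Person:
    """A person with a name (assumed minimal base class)."""
    def __init__(self, name):
        self.name = name

class Patient(Person):
    """A patient in an inflammation study."""
    def __init__(self, name, observations=None):
        super().__init__(name)
        # The list of measurements can now be passed to __init__
        self.observations = observations if observations is not None else []

def display_patient_record(patient):
    """Display the data for a single patient."""
    print(patient.name)
    for observation in patient.observations:
        print(observation)

patient = Patient('UNKNOWN', [3, 4, 5])
display_patient_record(patient)
```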
-Now we need to make sure people can call this view - that means connecting it to the controller and ensuring that there's a way to request this view when running the program.
-The changes we need to make here are that the `main` function needs to be able to direct us to the view we've requested - and we need to add to the command line interface - the controller - the necessary data to drive the new view.
+Now we need to make sure people can call this view -
+that means connecting it to the controller
+and ensuring that there's a way to request this view when running the program.
+The changes we need to make here are that the `main` function
+needs to be able to direct us to the view we've requested -
+and we need to add to the command line interface - the controller -
+the necessary data to drive the new view.
~~~
# file: inflammation-analysis.py
@@ -285,12 +371,18 @@ if __name__ == "__main__":
~~~
{: .language-python}
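Again the listing is abbreviated, so here is a self-contained sketch of the dispatch idea, in which `main` picks the view based on a `--view` option; the option names and view functions are illustrative assumptions, not the course's exact code:

```python
import argparse

# Illustrative stand-ins for the real view functions
def display_all_patients(data):
    return 'all patients'

def display_patient_record(data, patient_id):
    return f'patient {patient_id}'

def main(args):
    # Direct execution to whichever view was requested
    if args.view == 'visualize':
        return display_all_patients(None)
    elif args.view == 'record':
        return display_patient_record(None, args.patient)

parser = argparse.ArgumentParser(description='Controller sketch')
parser.add_argument('--view', default='visualize',
                    choices=['visualize', 'record'],
                    help='Which view should be displayed?')
parser.add_argument('--patient', type=int, default=0,
                    help='Which patient should be displayed?')

result = main(parser.parse_args(['--view', 'record', '--patient', '2']))
print(result)  # patient 2
```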
-We've added two options to our command line interface here: one to request a specific view and one for the patient ID that we want to lookup.
-For the full range of features that we have access to with `argparse` see the [Python module documentation](https://docs.python.org/3/library/argparse.html?highlight=argparse#module-argparse).
-Allowing the user to request a specific view like this is a similar model to that used by the popular Python library Click - if you find yourself needing to build more complex interfaces than this, Click would be a good choice.
+We've added two options to our command line interface here:
+one to request a specific view and one for the patient ID that we want to look up.
+For the full range of features that we have access to with `argparse` see the
+[Python module documentation](https://docs.python.org/3/library/argparse.html?highlight=argparse#module-argparse).
+Allowing the user to request a specific view like this is
+a similar model to that used by the popular Python library Click -
+if you find yourself needing to build more complex interfaces than this,
+Click would be a good choice.
You can find more information in [Click's documentation](https://click.palletsprojects.com/).
-For now, we also don't know the names of any of our patients, so we've made it `'UNKNOWN'` until we get more data.
+For now, we also don't know the names of any of our patients,
+so we've set each name to `'UNKNOWN'` until we get more data.
We can now call our program with these extra arguments to see the record for a single patient:
@@ -315,23 +407,38 @@ UNKNOWN
> ## Additional Material
>
-> Now that we've covered the basics of different programming paradigms and how we can integrate them into our
-> multi-layer architecture, there are two optional extra episodes which you may find interesting.
+> Now that we've covered the basics of different programming paradigms
+> and how we can integrate them into our multi-layer architecture,
+> there are two optional extra episodes which you may find interesting.
>
-> Both episodes cover the persistence layer of software architectures and methods of persistently storing data, but take different approaches.
-> The episode on [persistence with JSON](/persistence) covers some more advanced concepts in Object Oriented Programming, while the episode on [databases](/databases) starts to build towards a true multilayer architecture, which would allow our software to handle much larger quantities of data.
+> Both episodes cover the persistence layer of software architectures
+> and methods of persistently storing data, but take different approaches.
+> The episode on [persistence with JSON](/persistence) covers
+> some more advanced concepts in Object Oriented Programming, while
+> the episode on [databases](/databases) starts to build towards a true multilayer architecture,
+> which would allow our software to handle much larger quantities of data.
{: .callout}
## Towards Collaborative Software Development
-Having looked at some theoretical aspects of software design, we are now circling back to
-implementing our software design and developing our software to satisfy the requirements collaboratively
-in a team. At an intermediate level of software development, there is a wealth of practices that could be used, and applying suitable design and coding practices is what separates an intermediate developer from someone who has just started coding. The key for an intermediate developer is to balance these concerns for each software project appropriately, and employ design and development practices enough so that progress can be made.
-
-One practice that should always be considered, and has been shown to be very effective in team-based
-software development, is that of *code review*. Code reviews help to ensure the 'good' coding standards are achieved
-and maintained within a team by having multiple people have a look and comment on key code changes to see how they fit
-within the codebase. Such reviews check the correctness of the new code, test coverage, functionality changes,
-and confirm that they follow the coding guides and best practices. Let's have look at some code review techniques
-available to us.
+Having looked at some theoretical aspects of software design,
+we are now circling back to implementing our software design
+and developing our software to satisfy the requirements collaboratively in a team.
+At an intermediate level of software development,
+there is a wealth of practices that could be used,
+and applying suitable design and coding practices is what separates
+an intermediate developer from someone who has just started coding.
+The key for an intermediate developer is to balance these concerns
+appropriately for each software project,
+and to employ just enough design and development practice that progress can still be made.
+
+One practice that should always be considered,
+and has been shown to be very effective in team-based software development,
+is that of *code review*.
+Code reviews help to ensure that 'good' coding standards are achieved
+and maintained within a team by having multiple people
+look at and comment on key code changes to see how they fit within the codebase.
+Such reviews check the correctness of the new code, its test coverage and functionality changes,
+and confirm that the changes follow coding guidelines and best practices.
+Let's have a look at some code review techniques available to us.
diff --git a/_episodes/40-section4-intro.md b/_episodes/40-section4-intro.md
index c75d4def4..82860ffb4 100644
--- a/_episodes/40-section4-intro.md
+++ b/_episodes/40-section4-intro.md
@@ -13,24 +13,34 @@ keypoints:
- "Agreeing on a set of best practices within a software development team will help to improve your software's understandability, extensibility, testability, reusability and overall sustainability."
---
-When changes - particularly big changes - are made to a codebase, how can we as a team ensure that these changes are well considered and represent good solutions?
+When changes - particularly big changes - are made to a codebase,
+how can we as a team ensure that these changes are well considered and represent good solutions?
And how can we increase the overall knowledge of a codebase across a team?
-Sometimes project goals and time pressures take precedence and producing maintainable, reusable code is not given the
-time it deserves. So, when a change or a new feature is needed - often the shortest route to making it work is taken
-as opposed to a more well thought-out solution. For this reason, it is important not to write the code alone and in
-isolation and use other team members verify each other's code and measure our coding standards against.
+Sometimes project goals and time pressures take precedence
+and producing maintainable, reusable code is not given the time it deserves.
+So, when a change or a new feature is needed -
+often the shortest route to making it work is taken as opposed to a more well thought-out solution.
+For this reason, it is important not to write code alone and in isolation,
+but to have other team members verify each other's code against shared coding standards.
This process of having multiple team members comment on key code changes is called *code review* -
-this is one of the most important practices of collaborative software development that helps ensure
-the ‘good’ coding standards are achieved and maintained within a team, as well as increasing knowledge about the codebase across the team.
+this is one of the most important practices of collaborative software development
+that helps ensure the ‘good’ coding standards are achieved and maintained within a team,
+as well as increasing knowledge about the codebase across the team.
We'll thus look at the benefits of reviewing code,
in particular, the value of this type of activity within a team,
and how this can fit within various ways of team working.
We'll see how GitHub can support code review activities via pull requests,
and how we can do these ourselves making use of best practices.
-After that, we'll look at some general principles of software maintainability and the benefits that writing maintainable
-code can give you. There will also be some practice at identifying problems with existing code, and some general, established practices you can apply when writing new code or to the code you've already written.
-We'll also look at how we can package software for release and distribution, using **Poetry** to manage our Python dependencies and produce a code package we can use with a Python package indexing service to illustrate these principles.
+After that, we'll look at some general principles of software maintainability
+and the benefits that writing maintainable code can give you.
+There will also be some practice at identifying problems with existing code,
+and some general, established practices you can apply
+when writing new code or to the code you've already written.
+We'll also look at how we can package software for release and distribution,
+using **Poetry** to manage our Python dependencies
+and produce a code package we can use with a Python package indexing service
+to illustrate these principles.
{: .image-with-shadow width="800px" }
@@ -44,4 +54,4 @@ Designing and Developing "Good" Software in Teams
- **Writing "good" software** that is understandable, modular, extensible and tested
- **Publishing and releasing software** for reuse by others
- **Collaborative code development and review** to improve software sustainability and avoid the accumulation of ‘technical debt’.
- {% endcomment %}
+{% endcomment %}
diff --git a/_episodes/41-code-review.md b/_episodes/41-code-review.md
index 6abc54810..15b8c7fd5 100644
--- a/_episodes/41-code-review.md
+++ b/_episodes/41-code-review.md
@@ -2,7 +2,7 @@
title: "Developing Software In a Team: Code Review"
teaching: 15
exercises: 30
-questions:
+questions:
- "How do we develop software in a team?"
- "What is code review and how it can improve the quality of code?"
objectives:
@@ -10,171 +10,235 @@ objectives:
- "Understand how to do a pull request via GitHub to engage in code review with a team and contribute to a shared code repository."
keypoints:
- "Code review is a team software quality assurance practice where team members look at parts of the codebase in order to improve their code's readability, understandability, quality and maintainability."
-- "It is important to agree on a set of best practices and establish a code review process in a team to help to
+- "It is important to agree on a set of best practices and establish a code review process in a team to help to
sustain a good, stable and maintainable code for many years."
---
-
+
## Introduction
-So far in this course we’ve focused on learning software design and (some) technical practices, tools
-and infrastructure that help the development of software in a team environment, but in an individual setting.
-Despite developing tests to check our code - no one else from the team had a look at our code
-before we merged it into the main development stream. Software is often designed and built as part of a team,
-so in this episode we'll be looking at how to manage the process of team software development and improve our
-code by engaging in code review process with other team members.
+So far in this course we’ve focused on learning software design
+and (some) technical practices, tools and infrastructure that
+help the development of software in a team environment, but we have applied them in an individual setting.
+Despite developing tests to check our code - no one else from the team had a look at our code
+before we merged it into the main development stream.
+Software is often designed and built as part of a team,
+so in this episode we'll be looking at how to manage the process of team software development
+and improve our code by engaging in a code review process with other team members.
> ## Collaborative Code Development Models
-> The way your team provides contributions to the shared codebase depends on the type of development model you use in your project.
-Two commonly used models are:
-- **fork and pull model** - where anyone can **fork** an existing repository (to create their copy of the project linked to
-the source) and push changes to their personal fork.
-A contributor can work independently on their own fork as they
-do not need permissions on the source repository to push modifications to a fork they own.
-The changes from contributors can then be **pulled** into the source repository by the project maintainer on request
-and after a code review process. This model is popular with open
-source projects as it reduces the start up costs for new contributors and allows them to work
-independently without upfront coordination with source project maintainers. So, for example,
-you may use this model when you are an
-external collaborator on a project rather than a core team member.
-- **shared repository model** - where collaborators are granted push access to a single shared code repository.
-Even though collaborators have write access to the main
-development and production branches, the best practice of creating feature branches for new developments and
-when changes need to be made is still followed. This is to enable easier testing of the new code and
-initiate code review and general discussion about a set of changes before they are merged
-into the development branch. This model is more prevalent with teams and organisations
-collaborating on private projects.
+> The way your team provides contributions to the shared codebase depends on
+> the type of development model you use in your project.
+> Two commonly used models are:
+>
+> - **fork and pull model** -
+> where anyone can **fork** an existing repository
+> (to create their copy of the project linked to the source)
+> and push changes to their personal fork.
+> A contributor can work independently on their own fork as they do not need
+> permissions on the source repository to push modifications to a fork they own.
+> The changes from contributors can then be **pulled** into the source repository
+> by the project maintainer on request and after a code review process.
+> This model is popular with open source projects as it
+> reduces the start-up costs for new contributors
+> and allows them to work independently without upfront coordination
+> with source project maintainers.
+> So, for example, you may use this model when you are an external collaborator on a project
+> rather than a core team member.
+> - **shared repository model** -
+> where collaborators are granted push access to a single shared code repository.
+> Even though collaborators have write access to the main development and production branches,
+> the best practice of creating feature branches for new developments
+> and when changes need to be made is still followed.
+> This is to enable easier testing of the new code
+> and initiate code review and general discussion about a set of changes
+> before they are merged into the development branch.
+> This model is more prevalent with teams and organisations collaborating on private projects.
{: .callout}
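As a rough illustration, the fork and pull model can be simulated locally with plain `git` commands (the repository and branch names are placeholders; on GitHub the fork itself is created through the web interface):

```shell
# 'upstream.git' stands in for the source repository,
# 'fork' for the contributor's personal copy of it.
git init --bare upstream.git
git clone upstream.git fork              # on GitHub: fork, then clone your fork
git -C fork config user.email "you@example.com"   # placeholder identity
git -C fork config user.name "Example Contributor"
git -C fork checkout -b feature-branch   # work happens on a feature branch
echo "print('hello')" > fork/script.py
git -C fork add script.py
git -C fork commit -m "Add example script"
git -C fork push origin feature-branch   # push to the fork
```

The final step on GitHub would be opening a pull request so that the maintainer can review the changes and pull `feature-branch` into the source repository.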
-
-Regardless of the collaborative code development model you and your collaborators use - code reviews are one of the
-widely accepted best practices for software development in teams and something you should adopt in your development
-process too.
+
+Regardless of the collaborative code development model you and your collaborators use -
+code reviews are one of the widely accepted best practices for software development in teams
+and something you should adopt in your development process too.
## Code Review
-[Code review][code-review] is a software quality
-assurance practice where one or several people from the team (different from the code's author) check the software by
-viewing parts of its source code.
+[Code review][code-review] is a software quality assurance practice
+where one or several people from the team (different from the code's author)
+check the software by viewing parts of its source code.
> ## Group Exercise: Advantages of Code Review
> Discuss as a group: what do you think are the reasons behind, and advantages of, code review?
->> ## Solution
->> The purposes of code review include:
->> - improving internal code readability, understandability, quality and maintainability
->> - checking for coding standards compliance, code uniformity and consistency
->> - checking for test coverage and detecting bugs and code defects early
->> - detecting performance problems and identifying code optimisation points
->> - finding alternative/better solutions.
->>
->> An effective code review prevents errors from creeping into your software by improving code quality at an early
-stage of the software development process. It helps with learning, i.e. sharing knowledge about the codebase,
-solution approaches, expectations regarding quality, coding standards, etc. Developers use code review feedback
-from more senior developers to improve their own coding practices and expertise. Finally, it helps increase the sense of
-collective code ownership and responsibility, which in turn helps increase the "bus factor" and reduce the risk resulting from
-information and capabilities being held by a single person "responsible" for a certain part of the codebase and
-not being shared among team members.
+> > ## Solution
+> > The purposes of code review include:
+> > - improving internal code readability, understandability, quality and maintainability
+> > - checking for coding standards compliance, code uniformity and consistency
+> > - checking for test coverage and detecting bugs and code defects early
+> > - detecting performance problems and identifying code optimisation points
+> > - finding alternative/better solutions.
+> >
+> > An effective code review prevents errors from creeping into your software
+> > by improving code quality at an early stage of the software development process.
+> > It helps with learning, i.e. sharing knowledge about the codebase,
+> > solution approaches,
+> > expectations regarding quality,
+> > coding standards, etc.
+> > Developers use code review feedback from more senior developers
+> > to improve their own coding practices and expertise.
+> > Finally, it helps increase the sense of collective code ownership and responsibility,
+> > which in turn helps increase the "bus factor"
+> > and reduce the risk resulting from information and capabilities
+> > being held by a single person "responsible" for a certain part of the codebase
+> > and not being shared among team members.
> {: .solution}
{: .challenge}
-Code review is one of the most useful team code development practices - someone checks your design or code for errors,
-they get to learn from your solution, having to
-explain code to someone else clarifies your rationale and design decisions in your mind too, and collaboration
-helps to improve the overall team software development process. It is universally applicable throughout
-the software development cycle - from design to development to maintenance. According to Michael Fagan, the
-author of the [code inspection technique](https://en.wikipedia.org/wiki/Fagan_inspection), rigorous inspections can
-remove 60-90% of errors from the code even before the
-first tests are run ([Fagan, 1976](https://doi.org/10.1147%2Fsj.153.0182)).
-Furthermore, according to Fagan, the cost to remedy a defect in the early (design) stage is 10 to 100 times less
-compared to fixing the same defect in the development and maintenance
-stages, respectively. Since the cost of bug fixes grows in orders of magnitude throughout the software
-lifecycle, it is far more efficient to find and fix defects as close as possible to the point where they were introduced.
-
-There are several **code review techniques** with various degree of formality and the use of
-a technical infrastructure, including:
-
-- **Over-the-shoulder code review** is the most common and informal of code review techniques and involves one or more team
-members standing over the code author's shoulder while the author walks the reviewers through a set of code changes.
-- **Email pass-around code review** is another form of lightweight code review where the code author packages up a set
-of changes and files and sends them over to reviewers via email. Reviewers examine the files and differences against the
-code base, ask questions and discuss with the author and other developers, and suggest changes over email.
-The difficult part of this process is the manual collection the files under review and noting differences.
-- **Pair programming** is a code development process that incorporates continuous code review - two developers sit together
-at a computer, but only one of them actively codes whereas the other provides real-time feedback. It is a
-great way to inspect new code and train developers, especially if an experienced team member walks a younger
-developer through the new code, providing explanations and suggestions through a conversation. It is conducted
-in-person and synchronously but it can be time-consuming as the reviewer cannot do any other work during the
-pair programming period.
-- **Fagan code inspection** is a formal and heavyweight process of
-finding defects in specifications or designs during various phases of the software development process. There are
-several roles taken by different team members in a Fagan inspection and each inspection is a formal 7-step process
-with a predefined entry and exit criteria. See [Fagan inspection](https://en.wikipedia.org/wiki/Fagan_inspection) for
-full details on this method.
-- **Tool-assisted code review** process uses a specialised tool to facilitate the process of code review, which typically
-helps with the following tasks: (1) collecting and displaying the updated files and highlighting what has changed, (2)
-facilitating a conversation between team members (reviewers and developers), and (3) allowing code administrators and
-product managers a certain control and overview of the code development workflow. Modern tools may provide a handful
-of other functionalities too, such as metrics (e.g. inspection rate, defect rate, defect density).
-
-Each of the above techniques have their pros and cons and varying degrees practicality -
+Code review is one of the most useful team code development practices -
+someone checks your design or code for errors, they get to learn from your solution,
+having to explain code to someone else clarifies
+your rationale and design decisions in your mind too,
+and collaboration helps to improve the overall team software development process.
+It is universally applicable throughout the software development cycle -
+from design to development to maintenance.
+According to Michael Fagan, the author of the
+[code inspection technique](https://en.wikipedia.org/wiki/Fagan_inspection),
+rigorous inspections can remove 60-90% of errors from the code
+even before the first tests are run ([Fagan, 1976](https://doi.org/10.1147%2Fsj.153.0182)).
+Furthermore, according to Fagan,
+the cost to remedy a defect in the early (design) stage is 10 to 100 times less compared to
+fixing the same defect in the development and maintenance stages, respectively.
+Since the cost of bug fixes grows by orders of magnitude throughout the software lifecycle,
+it is far more efficient to find and fix defects
+as close as possible to the point where they were introduced.
+
+There are several **code review techniques** with various degrees of formality
+and reliance on technical infrastructure, including:
+
+- **Over-the-shoulder code review**
+ is the most common and informal of code review techniques and involves
+ one or more team members standing over the code author's shoulder
+ while the author walks the reviewers through a set of code changes.
+- **Email pass-around code review**
+ is another form of lightweight code review where the code author
+ packages up a set of changes and files and sends them over to reviewers via email.
+ Reviewers examine the files and differences against the code base,
+ ask questions and discuss with the author and other developers,
+ and suggest changes over email.
+  The difficult part of this process is the manual collection of the files under review
+ and noting differences.
+- **Pair programming**
+ is a code development process that incorporates continuous code review -
+ two developers sit together at a computer,
+ but only one of them actively codes whereas the other provides real-time feedback.
+ It is a great way to inspect new code and train developers,
+ especially if an experienced team member walks a younger developer through the new code,
+ providing explanations and suggestions through a conversation.
+  It is conducted in person and synchronously, but it can be time-consuming
+ as the reviewer cannot do any other work during the pair programming period.
+- **Fagan code inspection**
+ is a formal and heavyweight process of finding defects in specifications or designs
+ during various phases of the software development process.
+ There are several roles taken by different team members in a Fagan inspection
+ and each inspection is a formal 7-step process with a predefined entry and exit criteria.
+ See [Fagan inspection](https://en.wikipedia.org/wiki/Fagan_inspection)
+ for full details on this method.
+- **Tool-assisted code review**
+  uses a specialised tool to facilitate the code review process,
+ which typically helps with the following tasks:
+ (1) collecting and displaying the updated files and highlighting what has changed,
+ (2) facilitating a conversation between team members (reviewers and developers), and
+ (3) allowing code administrators and product managers
+ a certain control and overview of the code development workflow.
+ Modern tools may provide a handful of other functionalities too, such as metrics
+ (e.g. inspection rate, defect rate, defect density).
+
+Each of the above techniques has its pros and cons and varying degrees of practicality -
it is up to the team to decide which ones are most suitable for the project and when to use them.
-We will have a look at the **tool-assisted code review process** using GitHub's built-in code review tool - **pull requests**. It is a lightweight tool, included with GitHub's core service for free and has gained
-popularity within the software development community in recent years.
+We will have a look at the **tool-assisted code review process**
+using GitHub's built-in code review tool - **pull requests**.
+It is a lightweight tool, included for free with GitHub's core service,
+that has gained popularity within the software development community in recent years.
## Code Reviews via GitHub's Pull Requests
-Pull requests are fundamental to how teams review and improve code on GitHub (and similar code sharing platforms) -
-they let you tell others about changes you've pushed to a branch in a repository on GitHub and that your
-code is ready for review. Once a pull request is opened, you can discuss and review the potential changes with others
-on the team and add follow-up commits based on the feedback before your changes are merged from your feature branch
-into the `develop` branch. The name 'pull request' suggests you are **requesting** the codebase
-moderators to **pull** your changes into the codebase.
-
-Such changes are normally done on a feature branch, to ensure that they are separate and self-contained and
-that the main branch only contains "production-ready" work and that the `develop` branch contains code that
-has already been extensively tested. You create a branch for your work
-based on one of the existing branches (typically the `develop` branch but can be any other branch),
-do some commits on that branch, and, once you are ready to merge your changes, create a pull request to bring
-the changes back to the branch that you started from. In this
-context, the branch from which you branched off to do your work and where the changes should be applied
-back to is called the **base branch**, while the feature branch that contains changes you would like to be applied
-is the **head branch**.
-
-How you create your feature branches and open pull requests in GitHub will depend on your collaborative code
-development model:
-
-- In the shared repository model, in order to create a feature branch and open a
-pull request based on it you must have write access to the source repository or, for organisation-owned repositories,
-you must be a member of the organisation that owns the repository. Once you have access to the repository, you proceed
-to create a feature branch on that repository directly.
-- In the fork and pull model, where you do not have write permissions to the source repository, you need to fork the
-repository first before you create a feature branch (in your fork) to base your pull request on.
-
-In both development models, it is recommended to create a feature branch for your work and
-the subsequent pull request, even though you can submit pull requests from any branch or commit. This is because,
-with a feature branch, you can push follow-up commits as a response to feedback and update your proposed changes within
-a self-contained bundle.
-The only difference in creating a pull request between the two models is how you create the feature branch.
-In either model, once you are ready to merge your changes in - you will need to specify the base branch and the head
-branch.
-
+Pull requests are fundamental to how teams review and improve code
+on GitHub (and similar code sharing platforms) -
+they let you tell others about changes you've pushed to a branch in a repository on GitHub
+and that your code is ready for review.
+Once a pull request is opened,
+you can discuss and review the potential changes with others on the team
+and add follow-up commits based on the feedback
+before your changes are merged from your feature branch into the `develop` branch.
+The name 'pull request' suggests you are **requesting** the codebase moderators
+to **pull** your changes into the codebase.
+
+Such changes are normally done on a feature branch,
+to ensure that they are separate and self-contained,
+that the main branch only contains "production-ready" work,
+and that the `develop` branch contains code that has already been extensively tested.
+You create a branch for your work based on one of the existing branches
+(typically the `develop` branch but can be any other branch),
+do some commits on that branch,
+and, once you are ready to merge your changes,
+create a pull request to bring the changes back to the branch that you started from.
+In this context, the branch from which you branched off to do your work
+and where the changes should be applied back to
+is called the **base branch**,
+while the feature branch that contains changes you would like to be applied is the **head branch**.
+
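The base/head relationship can be sketched with Git alone. Below is a minimal illustration run in a throwaway local repository - the `develop` and `feature-x` branch names mirror this episode's examples and nothing here touches a real remote:

```shell
# Illustrative sketch only: create a scratch repository, a 'develop' base
# branch, and a 'feature-x' head branch carrying one commit.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop                 # base branch
git checkout -q -b feature-x               # head branch, branched off develop
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Implement feature-x"
# A pull request would ask to merge feature-x (head) into develop (base);
# these are the commits a reviewer would be shown:
git log --oneline develop..feature-x
```

Opening a pull request from `feature-x` into `develop` would present exactly the commits listed by `git log develop..feature-x` to the reviewers.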
+How you create your feature branches and open pull requests in GitHub will depend on
+your collaborative code development model:
+
+- In the shared repository model,
+ in order to create a feature branch and open a pull request based on it
+ you must have write access to the source repository or,
+ for organisation-owned repositories,
+ you must be a member of the organisation that owns the repository.
+ Once you have access to the repository,
+ you proceed to create a feature branch on that repository directly.
+- In the fork and pull model,
+ where you do not have write permissions to the source repository,
+ you need to fork the repository first
+ before you create a feature branch (in your fork) to base your pull request on.
+
+In both development models,
+it is recommended to create a feature branch for your work and the subsequent pull request,
+even though you can submit pull requests from any branch or commit.
+This is because, with a feature branch,
+you can push follow-up commits as a response to feedback
+and update your proposed changes within a self-contained bundle.
+The only difference in creating a pull request between the two models is
+how you create the feature branch.
+In either model, once you are ready to merge your changes in -
+you will need to specify the base branch and the head branch.
+
## Code Review and Pull Requests In Action
-Let's see this in action - you and your fellow learners are going to be organised in small teams and assume to be
-collaborating in the shared repository model. You will be added as a collaborator to another team member's repository
-(which becomes the shared repository in this context) and, likewise, you will add other team members as collaborators
-on your repository. You can form teams of two and work on each other's repositories. If there are 3 members in
-your group you can go in a round robin fashion (the first team member does a pull request on the second member's
-repository and receives a pull request on their repository from the third team member). If you are going through the
-material on your own and do not have a collaborator, you can do pull requests on your own repository from one to
-another branch.
-
-Recall [solution requirements SR1.1.1 and SR1.2.1](../31-software-requirements/index.html#solution-requirements) from an
-earlier episode. Your team member has implemented one of them according to the specification (let's call it `feature-x`)
-but tests are still missing. You are now tasked with implementing tests on top of
-that existing implementation to make sure the new feature indeed satisfies the requirements. You will propose
-changes to their repository (the shared repository in this context) via pull request
-(acting as the code author) and engage in code review with your team member (acting as a code reviewer).
-Similarly, you will receive a pull request on your repository from another team member,
-in which case the roles will be reversed. The following diagram depicts the branches that you should have in the repository.
+Let's see this in action -
+you and your fellow learners are going to be organised into small teams
+and will assume you are collaborating in the shared repository model.
+You will be added as a collaborator to another team member's repository
+(which becomes the shared repository in this context)
+and, likewise, you will add other team members as collaborators on your repository.
+You can form teams of two and work on each other's repositories.
+If there are three members in your group, you can go in a round-robin fashion
+(the first team member does a pull request on the second member's repository
+and receives a pull request on their repository from the third team member).
+If you are going through the material on your own and do not have a collaborator,
+you can do pull requests on your own repository from one branch to another.
+
+Recall [solution requirements SR1.1.1 and SR1.2.1](../31-software-requirements/index.html#solution-requirements)
+from an earlier episode.
+Your team member has implemented one of them according to the specification
+(let's call it `feature-x`)
+but tests are still missing.
+You are now tasked with implementing tests on top of that existing implementation
+to make sure the new feature indeed satisfies the requirements.
+You will propose changes to their repository
+(the shared repository in this context)
+via pull request (acting as the code author)
+and engage in code review with your team member (acting as a code reviewer).
+Similarly, you will receive a pull request on your repository from another team member,
+in which case the roles will be reversed.
+The following diagram depicts the branches that you should have in the repository.
{: .image-with-shadow width="800px"}
@@ -185,8 +249,9 @@ To achieve this, the following steps are needed.
#### Step 1: Adding Collaborators to a Shared Repository
-You need to add the other team member(s) as collaborator(s) on your repository
-to enable them to create branches and pull requests. To do so, each repository owner needs to:
+You need to add the other team member(s) as collaborator(s) on your repository
+to enable them to create branches and pull requests.
+To do so, each repository owner needs to:
1. Head over to Settings section of your software project's repository in GitHub.
{: .image-with-shadow width="900px"}
@@ -194,60 +259,81 @@ to enable them to create branches and pull requests. To do so, each repository o
{: .image-with-shadow width="900px"}
3. Add your collaborator(s) by their GitHub username(s), full name(s) or email address(es).
{: .image-with-shadow width="900px"}
-4. Collaborator(s) will be notified of your invitation to join your repository based on their notification preferences.
-5. Once they accept the invitation, they will have the collaborator-level access to your repository and will show up
-in the list of your collaborators.
+4. Collaborator(s) will be notified of your invitation to join your repository
+ based on their notification preferences.
+5. Once they accept the invitation, they will have collaborator-level access to your repository
+ and will show up in the list of your collaborators.
-See the full details on [collaborator permissions for personal repositories](https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-github-user-account/managing-user-account-settings/permission-levels-for-a-user-account-repository)
+See the full details on
+[collaborator permissions for personal repositories](https://docs.github.com/en/account-and-profile/setting-up-and-managing-your-github-user-account/managing-user-account-settings/permission-levels-for-a-user-account-repository)
to understand what collaborators will be able to do within your repository.
-Note that repositories owned by an organisation have a [more granular access control](https://docs.github.com/en/get-started/learning-about-github/access-permissions-on-github) compared to that of personal
-repositories.
+Note that repositories owned by an organisation have a
+[more granular access control](https://docs.github.com/en/get-started/learning-about-github/access-permissions-on-github)
+compared to that of personal repositories.
#### Step 2: Preparing Your Local Environment for a Pull Request
-1. Obtain the GitHub URL of the shared repository you will be working on and clone it locally (make sure
-you do it outside your software repository's folder you have been working on so far).
-This will create a copy of the repository locally on your machine along with all of
-its (remote) branches.
- ~~~
- $ git clone
- $ cd
- ~~~
- {: .language-bash}
-2. Check with the repository owner (your team member) which feature (SR1.1.1 or SR1.2.1) they implemented in
-the [previous exercise](/32-software-design/index.html#implement-requirements) and what is the name of the branch they worked on.
-Let's assume the name of the branch was `feature-x` (you should amend the branch name for your case accordingly).
-3. Your task is to add tests for the code on `feature-x` branch. You should do so on a separate branch called `feature-x-tests`, which
-will branch off `feature-x`. This is to enable you later on to create a pull request from your `feature-x-tests` branch with your changes
-that can then easily be reviewed and compared with `feature-x` by the team member who created it.
-
- To do so, branch off a new local branch `feature-x-tests` from the remote `feature-x` branch (making sure you use the
- branch names that match your case). Also note that, while we cay "remote" branch `feature-x` - you have actually
- obtained it locally on your machine when you cloned the remote repository.
- ~~~
- $ git checkout -b feature-x-tests origin/feature-x
- ~~~
- {: .language-bash}
-
- You are now located in the new (local) `feature-x-tests` branch and are ready to start adding your code.
-
-#### Step 3: Adding New Code
+1. Obtain the GitHub URL of the shared repository you will be working on and clone it locally
+   (make sure you do it outside the folder of the software repository you have been working on so far).
+ This will create a copy of the repository locally on your machine
+ along with all of its (remote) branches.
+ ~~~
+   $ git clone <remote-repository-URL>
+   $ cd <remote-repository-name>
+ ~~~
+ {: .language-bash}
+2. Check with the repository owner (your team member)
+ which feature (SR1.1.1 or SR1.2.1) they implemented in the
+ [previous exercise](/32-software-design/index.html#implement-requirements)
+ and what is the name of the branch they worked on.
+ Let's assume the name of the branch was `feature-x`
+ (you should amend the branch name for your case accordingly).
+3. Your task is to add tests for the code on `feature-x` branch.
+ You should do so on a separate branch called `feature-x-tests`,
+ which will branch off `feature-x`.
+ This is to enable you later on to create a pull request
+ from your `feature-x-tests` branch with your changes
+ that can then easily be reviewed and compared with `feature-x`
+ by the team member who created it.
+
+ To do so, branch off a new local branch `feature-x-tests` from the remote `feature-x` branch
+ (making sure you use the branch names that match your case).
+ Also note that, while we say "remote" branch `feature-x` -
+ you have actually obtained it locally on your machine when you cloned the remote repository.
+ ~~~
+ $ git checkout -b feature-x-tests origin/feature-x
+ ~~~
+ {: .language-bash}
+
+ You are now located in the new (local) `feature-x-tests` branch
+ and are ready to start adding your code.
+
+#### Step 3: Adding New Code
> ## Exercise: Implement Tests for the New Feature
-> Look back at the [solution requirements](/31-software-requirements/index.html#solution-requirements) (SR1.1.1 or SR1.2.1) for
-> the feature that was implemented in your shared repository. Implement tests against the appropriate
-> specification in your local feature branch.
->
-> *Note: Try not to not fall into the trap of writing the tests to test the existing code/implementation - you should
-> write the tests to make sure the code satisfies the requirements regardless of the actual implementation. You can
-> treat the implementation as a [black box](https://en.wikipedia.org/wiki/Black-box_testing) - a typical approach
-> to software testing - as a way to make sure it is properly tested against its requirements without introducing
-> assumptions into the tests about its implementation.*
+> Look back at the
+> [solution requirements](/31-software-requirements/index.html#solution-requirements)
+> (SR1.1.1 or SR1.2.1)
+> for the feature that was implemented in your shared repository.
+> Implement tests against the appropriate specification in your local feature branch.
+>
+> *Note: Try not to fall into the trap of
+> writing the tests to test the existing code/implementation -
+> you should write the tests to make sure the code satisfies the requirements
+> regardless of the actual implementation.
+> You can treat the implementation as a
+> [black box](https://en.wikipedia.org/wiki/Black-box_testing) -
+> a typical approach to software testing -
+> as a way to make sure it is properly tested against its requirements
+> without introducing assumptions into the tests about its implementation.*
{: .challenge}
> ## Testing Based on Requirements
-Tests should test functionality, which stem from the software requirements, rather than an implementation. Tests can
-be seen as a reflection of those requirements - checking if the requirements are satisfied.
+> Tests should test functionality,
+> which stems from the software requirements,
+> rather than an implementation.
+> Tests can be seen as a reflection of those requirements -
+> checking if the requirements are satisfied.
{: .callout}
Remember to commit your new code to your branch `feature-x-tests`.
@@ -260,100 +346,143 @@ $ git commit -m "Added tests for feature-x."
#### Step 4: Submitting a Pull Request
-When you have finished adding your tests and have committed the changes to your local `feature-x-tests`,
-and are ready for the others in the team to review them, you have to do the following:
+When you have finished adding your tests
+and committed the changes to your local `feature-x-tests`,
+and are ready for the others in the team to review them,
+you have to do the following:
1. Push your local feature branch `feature-x-tests` remotely to the shared repository.
- ~~~
- $ git push -u origin feature-x-tests
- ~~~
- {: .language-bash}
-2. Head over to the remote repository in GitHub and locate your new (`feature-x-tests`) branch from the dropdown box on
-the Code tab (you can search for your branch or use the "View all branches" option).
+ ~~~
+ $ git push -u origin feature-x-tests
+ ~~~
+ {: .language-bash}
+2. Head over to the remote repository in GitHub
+ and locate your new (`feature-x-tests`) branch from the dropdown box on the Code tab
+ (you can search for your branch or use the "View all branches" option).
{: .image-with-shadow width="600px"}
3. Open a pull request by clicking "Compare & pull request" button.
{: .image-with-shadow width="900px"}
-4. Select the base and the head branch, e.g. `feature-x` and `feature-x-tests`, respectively. Recall that the base branch is
-where you want your changes to be merged and the head branch contains your changes.
-5. Add a comment describing the nature of the changes, and then submit the pull request.
-6. Repository moderator and other collaborators on the repository (code reviewers) will be notified of your pull request by GitHub.
+4. Select the base and the head branch, e.g. `feature-x` and `feature-x-tests`, respectively.
+ Recall that the base branch is where you want your changes to be merged
+ and the head branch contains your changes.
+5. Add a comment describing the nature of the changes,
+ and then submit the pull request.
+6. The repository moderator and other collaborators on the repository (code reviewers)
+ will be notified of your pull request by GitHub.
7. At this point, the code review process is initiated.
You should receive a similar pull request from other team members on your repository.
#### Step 5: Code Review
-1. The repository moderator/code reviewers reviews your changes and provides feedback to you
-in the form of comments.
-2. Respond to their comments and do any subsequent commits, as requested by reviewers.
-3. It may take a few rounds of exchanging comments and discussions until the team is ready to accept your changes.
+1. The repository moderator/code reviewers review your changes
+ and provides feedback to you in the form of comments.
+2. Respond to their comments and do any subsequent commits,
+ as requested by reviewers.
+3. It may take a few rounds of exchanging comments and discussions until
+ the team is ready to accept your changes.
-Perform the above actions on the pull request you received, this time acting as the moderator/code reviewer.
+Perform the above actions on the pull request you received,
+this time acting as the moderator/code reviewer.
#### Step 6: Closing a Pull Request
-1. Once the moderator approves your changes, either one of you can merge onto the base branch. Typically, it is
-the responsibility of the code's author to do the merge but this may differ from team to team.
+1. Once the moderator approves your changes, either one of you can merge onto the base branch.
+ Typically, it is the responsibility of the code's author to do the merge
+ but this may differ from team to team.
{: .image-with-shadow width="900px"}
2. Delete the merged branch to reduce the clutter in the repository.
Repeat the above actions for the pull request you received.
-If the work on the feature branch is completed and it is sufficiently tested, the feature branch can now be merged
-into the `develop` branch.
+If the work on the feature branch is completed and it is sufficiently tested,
+the feature branch can now be merged into the `develop` branch.
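As an illustrative sketch of this final merge-and-clean-up step (again run in a throwaway local repository, with branch names following this episode's examples; in the exercise you would additionally delete the remote copy with `git push origin --delete feature-x-tests`):

```shell
# Illustrative only: set up a scratch repository with 'develop' and a finished
# 'feature-x-tests' branch, then merge the feature branch and delete it.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
git checkout -q -b develop
git checkout -q -b feature-x-tests
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Added tests for feature-x."
git checkout -q develop
git merge -q feature-x-tests     # bring the reviewed changes into develop
git branch -d feature-x-tests    # safe delete: refuses if the branch is unmerged
```

Note the lowercase `-d` flag: it only deletes branches whose work has been merged, which guards against losing unreviewed commits.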
## Best Practice for Code Review
-
-There are multiple perspectives to a code review process - from general practices to technical details
-relating to different roles involved in the process. It is critical for the code's quality, stability and maintainability
-that the team decides on this process and sticks to it. Here are some examples of best practices for you to consider
-(also check these useful code review blogs from [Swarmia](https://www.swarmia.com/blog/a-complete-guide-to-code-reviews/?utm_term=code%20review&utm_campaign=Code+review+best+practices&utm_source=adwords&utm_medium=ppc&hsa_acc=6644081770&hsa_cam=14940336179&hsa_grp=131344939434&hsa_ad=552679672005&hsa_src=g&hsa_tgt=kwd-17740433&hsa_kw=code%20review&hsa_mt=b&hsa_net=adwords&hsa_ver=3&gclid=Cj0KCQiAw9qOBhC-ARIsAG-rdn7_nhMMyE7aeSzosRRqZ52vafBOyMrpL4Ypru0PHWK4Rl8QLIhkeA0aAsxqEALw_wcB) and [Smartbear](https://smartbear.com/learn/code-review/best-practices-for-peer-code-review/)):
-
+
+There are multiple perspectives to a code review process -
+from general practices to technical details relating to different roles involved in the process.
+It is critical for the code's quality, stability and maintainability
+that the team decides on this process and sticks to it.
+Here are some examples of best practices for you to consider
+(also check these useful code review blogs from
+[Swarmia](https://www.swarmia.com/blog/a-complete-guide-to-code-reviews/?utm_term=code%20review&utm_campaign=Code+review+best+practices&utm_source=adwords&utm_medium=ppc&hsa_acc=6644081770&hsa_cam=14940336179&hsa_grp=131344939434&hsa_ad=552679672005&hsa_src=g&hsa_tgt=kwd-17740433&hsa_kw=code%20review&hsa_mt=b&hsa_net=adwords&hsa_ver=3&gclid=Cj0KCQiAw9qOBhC-ARIsAG-rdn7_nhMMyE7aeSzosRRqZ52vafBOyMrpL4Ypru0PHWK4Rl8QLIhkeA0aAsxqEALw_wcB)
+and [Smartbear](https://smartbear.com/learn/code-review/best-practices-for-peer-code-review/)):
+
1. Decide the focus of your code review process, e.g., consider some of the following:
- - code design and functionality - does the code fit in the overall design and does it do what was intended?
- - code understandability and complexity - is the code readable and would another developer be able to understand it?
+ - code design and functionality -
+ does the code fit in the overall design and does it do what was intended?
+ - code understandability and complexity -
+ is the code readable and would another developer be able to understand it?
- tests - does the code have automated tests?
- - naming - are names used for variables and functions descriptive, do they follow naming conventions?
- - comments and documentation - are there clear and useful comments that explain complex designs well and focus
-on the "why/because" rather than the "what/how"?
-2. Do not review code too quickly and do not review for too long in one sitting. According to
-[“Best Kept Secrets of Peer Code Review” (Cohen, 2006)](https://www.amazon.co.uk/Best-Kept-Secrets-Peer-Review/dp/1599160676) - the first hour of review
-matters the most as detection of defects significantly drops after this period. [Studies into code review](https://smartbear.com/resources/ebooks/the-state-of-code-review-2020-report/)
-also show that you should not review more than 400 lines of code at a time. Conducting more frequent shorter reviews
-seems to be more effective.
-3. Decide on the level of depth for code reviews to maintain the balance between the creation time
-and time spent reviewing code - e.g. reserve them for critical portions of code and avoid nit-picking on small
-details. Try using automated checks and linters when possible, e.g. for consistent usage of certain terminology across the code and code styles.
-4. Communicate clearly and effectively - when reviewing code, be explicit about the action you request from the author.
+ - naming - are names used for variables and functions descriptive,
+ do they follow naming conventions?
+ - comments and documentation -
+ are there clear and useful comments that explain complex designs well
+ and focus on the "why/because" rather than the "what/how"?
+2. Do not review code too quickly and do not review for too long in one sitting.
+ According to
+ [“Best Kept Secrets of Peer Code Review” (Cohen, 2006)](https://www.amazon.co.uk/Best-Kept-Secrets-Peer-Review/dp/1599160676) -
+ the first hour of review matters the most as
+ detection of defects significantly drops after this period.
+ [Studies into code review](https://smartbear.com/resources/ebooks/the-state-of-code-review-2020-report/)
+ also show that you should not review more than 400 lines of code at a time.
+ Conducting more frequent shorter reviews seems to be more effective.
+3. Decide on the level of depth for code reviews
+ to maintain the balance between the creation time and time spent reviewing code -
+ e.g. reserve them for critical portions of code and avoid nit-picking on small details.
+ Try using automated checks and linters when possible,
+ e.g. for consistent usage of certain terminology across the code and code styles.
+4. Communicate clearly and effectively -
+ when reviewing code, be explicit about the action you request from the author.
5. Foster a positive feedback culture:
- - give feedback about the code, not about the author
- - accept that there are multiple correct solutions to a problem
- - sandwich criticism with positive comments and praise
-7. Utilise multiple code review techniques - use email, pair programming, over-the-shoulder, team discussions and
-tool-assisted or any combination that works for your team. However, for the most effective and efficient code reviews,
-tool-assisted process is recommended.
-9. From a more technical perspective:
- - use a feature branch for pull requests as you can push follow-up commits if you need to update
- your proposed changes
- - avoid large pull requests as they are more difficult to review. You can refer to some [studies](https://jserd.springeropen.com/articles/10.1186/s40411-018-0058-0) and [Google recommendations](https://google.github.io/eng-practices/review/developer/small-cls.html)
- as to what a "large pull request" is but be aware that it is not exact science.
+ - give feedback about the code, not about the author
+ - accept that there are multiple correct solutions to a problem
+ - sandwich criticism with positive comments and praise
+6. Utilise multiple code review techniques -
+   use email,
+   pair programming,
+   over-the-shoulder,
+   team discussions,
+   tool-assisted,
+   or any combination that works for your team.
+   However, for the most effective and efficient code reviews,
+   a tool-assisted process is recommended.
+7. From a more technical perspective:
+ - use a feature branch for pull requests as you can push follow-up commits
+ if you need to update your proposed changes
+ - avoid large pull requests as they are more difficult to review.
+ You can refer to some [studies](https://jserd.springeropen.com/articles/10.1186/s40411-018-0058-0)
+ and [Google recommendations](https://google.github.io/eng-practices/review/developer/small-cls.html)
+     as to what a "large pull request" is, but be aware that it is not an exact science.
- don't force push to a pull request as it changes the repository history
- and can corrupt your pull request for other collaborators
- - use pull request states in GitHub effectively (based on your team's code review process) - e.g. in GitHub
- you can open a
- pull request in a `DRAFT` state to show progress or request early feedback; `READY FOR REVIEW` when you are ready
- for feedback; `CHANGES REQUESTED` to let the author know they need to fix the requested changes or discuss more;
- `APPROVED` to let the author they can merge their pull request.
+ and can corrupt your pull request for other collaborators
+ - use pull request states in GitHub effectively (based on your team's code review process) -
+ e.g. in GitHub you can open a pull request in a `DRAFT` state
+ to show progress or request early feedback;
+ `READY FOR REVIEW` when you are ready for feedback;
+ `CHANGES REQUESTED` to let the author know
+ they need to fix the requested changes or discuss more;
+     `APPROVED` to let the author know they can merge their pull request.
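To gauge whether a pull request is drifting past the ~400-line guideline before opening it, you can ask Git for the size of the head branch's changes relative to the base branch. A minimal sketch in a throwaway repository (the file and branch names are illustrative):

```shell
# Illustrative check of pull request size: count the lines a reviewer would
# have to read if 'feature-x-tests' were merged into the base branch.
set -e
demo_dir=$(mktemp -d)
cd "$demo_dir"
git init -q
git -c user.name=demo -c user.email=demo@example.com \
    commit -q --allow-empty -m "Initial commit"
base=$(git rev-parse --abbrev-ref HEAD)    # whatever the default branch is called
git checkout -q -b feature-x-tests
seq 1 10 > test_feature_x.txt              # stand-in for ten lines of new tests
git add test_feature_x.txt
git -c user.name=demo -c user.email=demo@example.com \
    commit -q -m "Added tests for feature-x."
# Lines changed relative to the common ancestor with the base branch
# (the triple-dot range is what GitHub itself compares in a pull request):
git diff --shortstat "$base"...feature-x-tests
```

If the insertion count reported by `--shortstat` runs into many hundreds of lines, that is a hint to split the work into smaller pull requests.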
> ## Exercise: Code Review in Your Own Working Environment
->
-> At the start of this episode we briefly looked at a number of techniques for doing code review, and as an example, went on to see how we can use GitHub Pull Requests to review team member code changes. Finally, we also looked at some best practices for doing code reviews in general.
->
-> Now think about how you typically develop code, and how you might institute code review practices within your own working environment. Write down briefly for your own reference (perhaps using bullet points) some answers to the following questions:
->
+>
+> At the start of this episode we briefly looked at a number of techniques for doing code review,
+> and as an example,
+> went on to see how we can use GitHub Pull Requests to review team member code changes.
+> Finally, we also looked at some best practices for doing code reviews in general.
+>
+> Now think about how you typically develop code,
+> and how you might institute code review practices within your own working environment.
+> Write down briefly for your own reference (perhaps using bullet points)
+> some answers to the following questions:
+>
> - Which 2 or 3 key circumstances would code review be most useful for you and your colleagues?
-> - Referring to the first section of this episode above, which type of code review would be most useful for each circumstance (and would work best within your own working environment)?
-> - Taking one of these circumstances where code review would be most beneficial, how would you organise such a code review, e.g.:
+> - Referring to the first section of this episode above,
+> which type of code review would be most useful for each circumstance
+> (and would work best within your own working environment)?
+> - Taking one of these circumstances where code review would be most beneficial,
+> how would you organise such a code review, e.g.:
> - Which aspects of the codebase would be the most useful to cover?
> - How often would you do them?
> - How long would the activity take?
diff --git a/_episodes/42-software-reuse.md b/_episodes/42-software-reuse.md
index 4b92d848b..04a37fca4 100644
--- a/_episodes/42-software-reuse.md
+++ b/_episodes/42-software-reuse.md
@@ -20,27 +20,61 @@ keypoints:
---
## Introduction
-In previous episodes we've looked at skills, practices, and tools to help us design and develop software in a collaborative environment. In this lesson we'll be looking at a critical piece of the development puzzle that builds on what we've learnt so far - sharing our software with others.
+In previous episodes we've looked at skills, practices, and tools to help us
+design and develop software in a collaborative environment.
+In this lesson we'll be looking at
+a critical piece of the development puzzle that builds on what we've learnt so far -
+sharing our software with others.
## The Levels of Software Reusability - Good Practice Revisited
Let's begin by taking a closer look at software reusability and what we want from it.
-Firstly, whilst we want to ensure our software is reusable by others, as well as ourselves, we should be clear what we mean by 'reusable'. There are a number of definitions out there, but a helpful one written by [Benureau and Rougler in 2017](https://dx.doi.org/10.3389/fninf.2017.00069) offers the following levels by which software can be characterised:
+Firstly, whilst we want to ensure our software is reusable by others, as well as ourselves,
+we should be clear what we mean by 'reusable'.
+There are a number of definitions out there,
+but a helpful one written by [Benureau and Rougier in 2017](https://dx.doi.org/10.3389/fninf.2017.00069)
+offers the following levels by which software can be characterised:
-1. Re-runnable: the code is simply executable and can be run again (but there are no guarantees beyond that)
+1. Re-runnable: the code is simply executable
+ and can be run again (but there are no guarantees beyond that)
2. Repeatable: the software will produce the same result more than once
-3. Reproducible: published research results generated from the same version of the software can be generated again from the same input data
+3. Reproducible: published research results generated from the same version of the software
+ can be generated again from the same input data
4. Reusable: easy to use, understand, and modify
-5. Replicable: the software can act as an available reference for any ambiguity in the algorithmic descriptions made in the published article. That is, a new implementation can be created from the descriptions in the article that provide the same results as the original implementation, and that the original - or reference - implementation, can be used to clarify any ambiguity in those descriptions for the purposes of reimplementation
-
-Later levels imply the earlier ones. So what should we aim for? As researchers who develop software - or developers who write research software - we should be aiming for at least the fourth one: reusability. Reproducibility is required if we are to successfully claim that what we are doing when we write software fits within acceptable scientific practice, but it is also crucial that we can write software that can be *understood* and ideally *modified* by others. If others are unable to verify that a piece of software follows published algorithms, how can they be certain it is producing correct results? Where 'others', of course, can include a future version of ourselves.
+5. Replicable: the software can act as an available reference
+ for any ambiguity in the algorithmic descriptions made in the published article.
+ That is, a new implementation can be created from the descriptions in the article
+   that provides the same results as the original implementation,
+   and the original - or reference - implementation
+   can be used to clarify any ambiguity in those descriptions for the purposes of reimplementation
+
+Later levels imply the earlier ones.
+So what should we aim for?
+As researchers who develop software - or developers who write research software -
+we should be aiming for at least the fourth one: reusability.
+Reproducibility is required if we are to successfully claim that
+what we are doing when we write software fits within acceptable scientific practice,
+but it is also crucial that we can write software that can be *understood*
+and ideally *modified* by others.
+If others are unable to verify that a piece of software follows published algorithms,
+how can they be certain it is producing correct results?
+Where 'others', of course, can include a future version of ourselves.
## Documenting Code to Improve Reusability
-Reproducibility is a cornerstone of science, and scientists who work in many disciplines are expected to document the processes by which they've conducted their research so it can be reproduced by others. In medicinal, pharmacological, and similar research fields for example, researchers use logbooks which are then used to write up protocols and methods for publication.
+Reproducibility is a cornerstone of science,
+and scientists who work in many disciplines are expected to document
+the processes by which they've conducted their research so it can be reproduced by others.
+In medicinal, pharmacological, and similar research fields for example,
+researchers use logbooks which are then used to write up protocols and methods for publication.
-Many things we've covered so far contribute directly to making our software reproducible - and indeed reusable - by others. A key part of this we'll cover now is software documentation, which is ironically very often given short shrift in academia. This is often the case even in fields where the documentation and publication of research method is otherwise taken very seriously.
+Many things we've covered so far contribute directly to making our software
+reproducible - and indeed reusable - by others.
+A key part of this we'll cover now is software documentation,
+which is ironically very often given short shrift in academia.
+This is often the case even in fields where
+the documentation and publication of research method is otherwise taken very seriously.
A few reasons for this are that writing documentation is often considered:
@@ -48,30 +82,60 @@ A few reasons for this are that writing documentation is often considered:
- Expensive in terms of effort, with little reward
- Writing documentation is boring!
-A very useful form of documentation for understanding our code is code commenting, and is most effective when used to explain complex interfaces or behaviour, or the reasoning behind why something is coded a certain way. But code comments only go so far.
+A very useful form of documentation for understanding our code is code commenting,
+which is most effective when used to explain complex interfaces or behaviour,
+or the reasoning behind why something is coded a certain way.
+But code comments only go so far.
-Whilst it's certainly arguable that writing documentation isn't as exciting as writing code, it doesn't have to be expensive and brings many benefits. In addition to enabling general reproducibility by others, documentation...
+Whilst it's certainly arguable that writing documentation isn't as exciting as writing code,
+it doesn't have to be expensive and brings many benefits.
+In addition to enabling general reproducibility by others, documentation...
- Helps bring new staff researchers and developers up to speed quickly with using the software
-- Functions as a great aid to research collaborations involving software, where those from other teams need to use it
-- When well written, can act as a basis for detailing algorithms and other mechanisms in research papers, such that the software's functionality can be *replicated* and re-implemented elsewhere
-- Provides a descriptive link back to the science that underlies it. As a reference, it makes it far easier to know how to update the software as the scientific theory changes (and potentially vice versa)
-- Importantly, it can enable others to understand the software sufficiently to *modify and reuse* it to do different things
-
-In the next section we'll see that writing a sensible minimum set of documentation in a single document doesn't have to be expensive, and can greatly aid reproducibility.
+- Functions as a great aid to research collaborations involving software,
+ where those from other teams need to use it
+- When well written, can act as a basis for detailing
+ algorithms and other mechanisms in research papers,
+ such that the software's functionality can be *replicated* and re-implemented elsewhere
+- Provides a descriptive link back to the science that underlies it.
+ As a reference, it makes it far easier to know how to
+ update the software as the scientific theory changes (and potentially vice versa)
+- Importantly, it can enable others to understand the software sufficiently to
+ *modify and reuse* it to do different things
+
+In the next section we'll see that writing
+a sensible minimum set of documentation in a single document doesn't have to be expensive,
+and can greatly aid reproducibility.
### Writing a README
-A README file is the first piece of documentation (perhaps other than publications that refer to it) that people should read to acquaint themselves with the software. It concisely explains what the software is about and what it's for, and covers the steps necessary to obtain and install the software and use it to accomplish basic tasks. Think of it not as a comprehensive reference of all functionality, but more a short tutorial with links to further information - hence it should contain brief explanations and be focused on instructional steps.
+A README file is the first piece of documentation
+(perhaps other than publications that refer to it)
+that people should read to acquaint themselves with the software.
+It concisely explains what the software is about and what it's for,
+and covers the steps necessary to obtain and install the software
+and use it to accomplish basic tasks.
+Think of it not as a comprehensive reference of all functionality,
+but more a short tutorial with links to further information -
+hence it should contain brief explanations and be focused on instructional steps.
-Our repository already has a README that describes the purpose of the repository for this workshop, but let's replace it with a new one that describes the software itself. First let's delete the old one:
+Our repository already has a README that describes the purpose of the repository for this workshop,
+but let's replace it with a new one that describes the software itself.
+First let's delete the old one:
~~~
$ rm README.md
~~~
{: .language-bash}
-In the root of your repository create a replacement `README.md` file. The `.md` indicates this is a **Markdown** file, a lightweight markup language which is basically a text file with some extra syntax to provide ways of formatting them. A big advantage of them is that they can be read as plain-text files or as source files for rendering them with formatting structures, and are very quick to write. GitHub provides a very useful [guide to writing Markdown][github-markdown] for its repositories.
+In the root of your repository create a replacement `README.md` file.
+The `.md` indicates this is a **Markdown** file,
+a lightweight markup language that adds simple formatting syntax to plain text.
+A big advantage of Markdown files is that they can be read as plain text
+or rendered with their formatting applied,
+and they are very quick to write.
+GitHub provides a very useful [guide to writing Markdown][github-markdown] for its repositories.
Let's start writing `README.md` using a text editor of your choice and add the following line.
@@ -80,7 +144,13 @@ Let's start writing `README.md` using a text editor of your choice and add the f
~~~
{: .language-markdown}
-So here, we're giving our software a name. Ideally something unique, short, snappy, and perhaps to some degree an indicator of what it does. We would ideally rename the repository to reflect the new name, but let's leave that for now. In Markdown, the `#` designates a heading, two `##` are used for a subheading, and so on. The Software Sustainability Institute's [guide on naming projects][ssi-choosing-name] and products provides some helpful pointers.
+So here, we're giving our software a name.
+Ideally something unique, short, snappy, and perhaps to some degree an indicator of what it does.
+We would ideally rename the repository to reflect the new name, but let's leave that for now.
+In Markdown, the `#` designates a heading, two `##` are used for a subheading, and so on.
+The Software Sustainability Institute's
+[guide on naming projects][ssi-choosing-name]
+and products provides some helpful pointers.
We should also add a short description underneath the title.
@@ -106,7 +176,9 @@ Here are some key features of Inflam:
~~~
{: .language-markdown}
-As well as knowing what the software aims to do and its key features, it's very important to specify what other software and related dependencies are needed to use the software (typically called `dependencies` or `prerequisites`):
+As well as knowing what the software aims to do and its key features,
+it's very important to specify what other software and related dependencies
+are needed to use the software (typically called `dependencies` or `prerequisites`):
~~~
# Inflam
@@ -133,9 +205,13 @@ The following optional packages are required to run Inflam's unit tests:
~~~
{: .language-markdown}
-Here we're making use of Markdown links, with some text describing the link within `[]` followed by the link itself within `()`.
+Here we're making use of Markdown links,
+with some text describing the link within `[]` followed by the link itself within `()`.
-One really neat feature - and a common practice - of using many CI infrastructures is that we can include the status of running recent tests within our README file. Just below the `# Inflam` title on our README.md file, add the following (replacing `` with your own:
+One really neat feature - and a common practice - of using many CI infrastructures is that
+we can include the status of running recent tests within our README file.
+Just below the `# Inflam` title in our `README.md` file,
+add the following (replacing `` with your own):
~~~
# Inflam
@@ -144,46 +220,73 @@ One really neat feature - and a common practice - of using many CI infrastructur
~~~
{: .language-markdown}
-This will embed a *badge* (icon) at the top of our page that reflects the most recent GitHub Actions build status of
-your software repository, essentially showing whether the tests that were run when
-the last change was made to the `main` branch succeeded or failed.
+This will embed a *badge* (icon) at the top of our page that
+reflects the most recent GitHub Actions build status of your software repository,
+essentially showing whether the tests that were run
+when the last change was made to the `main` branch succeeded or failed.
-That's got us started with documenting our code,
+That's got us started with documenting our code,
but there are other aspects we should also cover:
- *Installation/deployment:* step-by-step instructions for setting up the software so it can be used
- *Basic usage:* step-by-step instructions that cover using the software to accomplish basic tasks
-- *Contributing:* for those wishing to contribute to the software's development, this is an opportunity to detail what kinds of contribution are sought and how to get involved
-- *Contact information/getting help:* which may include things like key author email addresses, and links to mailing lists and other resources
-- *Credits/acknowledgements:* where appropriate, be sure to credit those who have helped in the software's development or inspired it
-- *Citation:* particularly for academic software, it's a very good idea to specify a reference to an appropriate academic publication so other academics can cite use of the software in their own publications and media. You can do this within a separate [CITATION text file](https://github.com/citation-file-format/citation-file-format) within the repository's root directory and link to it from the Markdown
+- *Contributing:* for those wishing to contribute to the software's development,
+ this is an opportunity to detail what kinds of contribution are sought and how to get involved
+- *Contact information/getting help:* which may include things like key author email addresses,
+ and links to mailing lists and other resources
+- *Credits/acknowledgements:* where appropriate, be sure to credit those who
+ have helped in the software's development or inspired it
+- *Citation:* particularly for academic software,
+ it's a very good idea to specify a reference to an appropriate academic publication
+ so other academics can cite use of the software in their own publications and media.
+ You can do this within a separate
+ [CITATION text file](https://github.com/citation-file-format/citation-file-format)
+ within the repository's root directory and link to it from the Markdown
- *Licence:* a short description of and link to the software's licence
-For more verbose sections, there are usually just highlights in the README with links to further information, which may be held within other Markdown files within the repository or elsewhere.
+For more verbose sections,
+there are usually just highlights in the README with links to further information,
+which may be held within other Markdown files within the repository or elsewhere.
-We'll finish these off later. See [Matias Singer's curated list of awesome READMEs](https://github.com/matiassingers/awesome-readme) for inspiration.
+We'll finish these off later.
+See [Matias Singer's curated list of awesome READMEs](https://github.com/matiassingers/awesome-readme) for inspiration.
### Other Documentation
-There are many different types of other documentation you should also consider writing and making available that's beyond the scope of this course. The key is to consider which audiences you need to write for, e.g. end users, developers, maintainers, etc., and what they need from the documentation. There's a Software Sustainability Institute [blog post on best practices for research software documentation](https://www.software.ac.uk/blog/2019-06-21-what-are-best-practices-research-software-documentation) that helpfully covers the kinds of documentation to consider and other effective ways to convey the same information.
-
-One that you should always consider is **technical documentation**. This typically aims to help other developers
-understand your code sufficiently well to make their own changes to it, including external developers,
-other members in your team and a future version of yourself too. This may include documentation that covers
-the software's architecture,
-including its different components and how they fit together, API (Application Programmer Interface) documentation
-that describes the interface points designed into your software for other developers to use, e.g. for a software
-library, or technical tutorials/'HOW TOs' to accomplish developer-oriented tasks.
+There are many other types of documentation you should also consider
+writing and making available, which are beyond the scope of this course.
+The key is to consider which audiences you need to write for,
+e.g. end users, developers, maintainers, etc.,
+and what they need from the documentation.
+There's a Software Sustainability Institute
+[blog post on best practices for research software documentation](https://www.software.ac.uk/blog/2019-06-21-what-are-best-practices-research-software-documentation)
+that helpfully covers the kinds of documentation to consider
+and other effective ways to convey the same information.
+
+One that you should always consider is **technical documentation**.
+This typically aims to help other developers understand your code
+sufficiently well to make their own changes to it,
+including external developers, other members in your team and a future version of yourself too.
+This may include documentation that covers the software's architecture,
+including its different components and how they fit together,
+API (Application Programmer Interface) documentation
+that describes the interface points designed into your software for other developers to use,
+e.g. for a software library,
+or technical tutorials/'HOW TOs' to accomplish developer-oriented tasks.
## Choosing an Open Source Licence
-Software licensing is a whole topic in itself, so we’ll just summarise here. Your institution’s Intellectual Property
-(IP) team will be able to offer specific guidance that fits the way your institution thinks about software.
+Software licensing is a whole topic in itself, so we’ll just summarise here.
+Your institution’s Intellectual Property (IP) team will be able to offer specific guidance that
+fits the way your institution thinks about software.
-In IP law, software is considered a creative work of literature, so any code you write automatically has copyright
-protection applied. This copyright will usually belong to the institution that employs you, but this may be different
-for PhD students. If you need to check, this should be included in your employment/studentship contract or talk to your
-university’s IP team.
+In IP law, software is considered a creative work of literature,
+so any code you write automatically has copyright protection applied.
+This copyright will usually belong to the institution that employs you,
+but this may be different for PhD students.
+If you need to check,
+look at your employment/studentship contract
+or talk to your university’s IP team.
Since software is automatically under copyright, without a licence no one may:
@@ -193,26 +296,56 @@ Since software is automatically under copyright, without a licence no one may:
- Extend it
- Use it (actually unclear at present - this has not been properly tested in court yet)
-Fundamentally there are two kinds of licence, **Open Source licences** and **Proprietary licences**, which serve slightly different purposes:
+Fundamentally there are two kinds of licence,
+**Open Source licences** and **Proprietary licences**,
+which serve slightly different purposes:
-- *Proprietary licences* are designed to pass on limited rights to end users, and are most suitable if you want to commercialise your software. They tend to be customised to suit the requirements of the software and the institution to which it belongs - again your institutions IP team will be able to help here.
-- *Open Source licences* are designed more to protect the rights of end users - they specifically grant permission to make modifications and redistribute the software to others. The [website Choose A License](https://choosealicense.com/) provides recommendations and a simple summary of some of the most common open source licences.
+- *Proprietary licences* are designed to pass on limited rights to end users,
+ and are most suitable if you want to commercialise your software.
+ They tend to be customised to suit the requirements of the software
+ and the institution to which it belongs -
+  again your institution's IP team will be able to help here.
+- *Open Source licences* are designed more to protect the rights of end users -
+ they specifically grant permission to make modifications and redistribute the software to others.
+ The [website Choose A License](https://choosealicense.com/) provides recommendations
+ and a simple summary of some of the most common open source licences.
Within the open source licences, there are two categories, **copyleft** and **permissive**:
-- The permissive licences such as MIT and the multiple variants of the BSD licence are designed to give maximum freedom to the end users of software. These licences allow the end user to do almost anything with the source code.
-- The copyleft licences in the GPL still give a lot of freedom to the end users, but any code that they write based on GPLed code must also be licensed under the same licence. This gives the developer assurance that anyone building on their code is also contributing back to the community. It’s actually a little more complicated than this, and the variants all have slightly different conditions and applicability, but this is the core of the licence.
-
-Which of these types of licence you prefer is up to you and those you develop code with. If you want more information, or help choosing a licence, the [Choose An Open-Source Licence](https://choosealicense.com/) or [tl;dr Legal](https://tldrlegal.com/) sites can help.
+- The permissive licences such as MIT and the multiple variants of the BSD licence
+ are designed to give maximum freedom to the end users of software.
+ These licences allow the end user to do almost anything with the source code.
+- The copyleft licences, such as the GPL, still give a lot of freedom to the end users,
+  but any code that they write based on GPL-licensed code
+  must also be licensed under the same licence.
+ This gives the developer assurance that anyone building on their code is also
+ contributing back to the community.
+ It’s actually a little more complicated than this,
+ and the variants all have slightly different conditions and applicability,
+ but this is the core of the licence.
+
+Which of these types of licence you prefer is up to you and those you develop code with.
+If you want more information, or help choosing a licence,
+the [Choose An Open-Source Licence](https://choosealicense.com/)
+or [tl;dr Legal](https://tldrlegal.com/) sites can help.
> ## Exercise: Preparing for Release
>
-> In a (hopefully) highly unlikely and thoroughly unrecommended scenario, your project leader has informed you of the need to release your software within the next half hour, so it can be assessed for use by another team. You'll need to consider finishing the README, choosing a licence, and fixing any remaining problems you are aware of in your codebase. Ensure you prioritise and work on the most pressing issues first!
+> In a (hopefully) highly unlikely and thoroughly unrecommended scenario,
+> your project leader has informed you of the need to release your software
+> within the next half hour,
+> so it can be assessed for use by another team.
+> You'll need to consider finishing the README,
+> choosing a licence,
+> and fixing any remaining problems you are aware of in your codebase.
+> Ensure you prioritise and work on the most pressing issues first!
{: .challenge}
## Merging into `main`
-Once you've done these updates, commit your changes, and if you're doing this work on a feature branch also ensure you merge it into `develop`, e.g.:
+Once you've done these updates,
+commit your changes,
+and if you're doing this work on a feature branch also ensure you merge it into `develop`,
+e.g.:
~~~
$ git checkout develop
@@ -220,7 +353,9 @@ $ git merge my-feature-branch
~~~
{: .language-bash}
-Finally, once we've fully tested our software and are confident it works as expected on `develop`, we can merge our `develop` branch into `main`:
+Finally, once we've fully tested our software
+and are confident it works as expected on `develop`,
+we can merge our `develop` branch into `main`:
~~~
$ git checkout main
@@ -232,14 +367,18 @@ $ git push
## Tagging a Release in GitHub
-There are many ways in which Git and GitHub can help us make a software release from our code. One of these is via **tagging**, where we attach a human-readable label to a specific commit. Let's see what tags we currently have in our repository:
+There are many ways in which Git and GitHub can help us make a software release from our code.
+One of these is via **tagging**,
+where we attach a human-readable label to a specific commit.
+Let's see what tags we currently have in our repository:
~~~
$ git tag
~~~
{: .language-bash}
-Since we haven't tagged any commits yet, there's unsurprisingly no output. We can create a new tag on the last commit we did by doing:
+Since we haven't tagged any commits yet, there's unsurprisingly no output.
+We can create a new tag on the last commit we did by doing:
~~~
$ git tag -a v1.0.0 -m "Version 1.0.0"
@@ -286,7 +425,7 @@ index 4818abb..5b8e7fd 100644
+++ b/README.md
@@ -22,4 +22,33 @@ Flimflam requires the following Python packages:
The following optional packages are required to run Flimflam's unit tests:
-
+
- [pytest](https://docs.pytest.org/en/stable/) - Flimflam's unit tests are written using pytest
-- [pytest-cov](https://pypi.org/project/pytest-cov/) - Adds test coverage stats to unit testing
\ No newline at end of file
@@ -310,7 +449,7 @@ index 4818abb..5b8e7fd 100644
+- Directed by Michael Bay
+
+## Citation
-+Please cite [J. F. W. Herschel, 1829, MmRAS, 3, 177](https://ui.adsabs.harvard.edu/abs/1829MmRAS...3..177H/abstract) if you used this work in your day-to-day life.
++Please cite [J. F. W. Herschel, 1829, MmRAS, 3, 177](https://ui.adsabs.harvard.edu/abs/1829MmRAS...3..177H/abstract) if you used this work in your day-to-day life.
+Please cite [C. Herschel, 1787, RSPT, 77, 1](https://ui.adsabs.harvard.edu/abs/1787RSPT...77....1H/abstract) if you actually use this for scientific work.
+
+## License
@@ -333,23 +472,49 @@ $ git push origin v1.0.0
> ## What is a Version Number Anyway?
>
-> Software version numbers are everywhere, and there are many different ways to do it. A popular one to consider is [**Semantic Versioning**](https://semver.org/), where a given version number uses the format MAJOR.MINOR.PATCH. You increment the:
+> Software version numbers are everywhere,
+> and there are many different ways to do it.
+> A popular one to consider is [**Semantic Versioning**](https://semver.org/),
+> where a given version number uses the format MAJOR.MINOR.PATCH.
+> You increment the:
>
> - MAJOR version when you make incompatible API changes
> - MINOR version when you add functionality in a backwards compatible manner
> - PATCH version when you make backwards compatible bug fixes
>
-> You can also add a hyphen followed by characters to denote a pre-release version, e.g. 1.0.0-alpha1 (first alpha release) or 1.2.3-beta4 (fourth beta release)
+> You can also add a hyphen followed by characters to denote a pre-release version,
+> e.g. 1.0.0-alpha1 (first alpha release) or 1.2.3-beta4 (fourth beta release)
{: .callout}
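To make the MAJOR.MINOR.PATCH increment rules above concrete, here is a minimal sketch (the `bump` helper is purely illustrative, not part of the lesson's codebase) showing that bumping a part resets the parts to its right:

```python
def bump(version, part):
    """Return version (a 'MAJOR.MINOR.PATCH' string) with the given part
    incremented; parts to the right of the bumped one reset to zero."""
    major, minor, patch = (int(p) for p in version.split("."))
    if part == "major":
        return f"{major + 1}.0.0"
    if part == "minor":
        return f"{major}.{minor + 1}.0"
    if part == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown part: {part!r}")

# A backwards-compatible bug fix, then an incompatible API change
print(bump("1.4.2", "patch"))  # 1.4.3
print(bump("1.4.3", "major"))  # 2.0.0
```

A pre-release suffix such as `-alpha1` would simply be appended to the string produced here.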
-We can now use the more memorable tag to refer to this specific commit. Plus, once we've pushed this back up to GitHub, it appears as a specific release within our code repository which can be downloaded in compressed `.zip` or `.tar.gz` formats. Note that these downloads just contain the state of the repository at that commit, and not its entire history.
+We can now use the more memorable tag to refer to this specific commit.
+Plus, once we've pushed this back up to GitHub,
+it appears as a specific release within our code repository
+which can be downloaded in compressed `.zip` or `.tar.gz` formats.
+Note that these downloads just contain the state of the repository at that commit,
+and not its entire history.
-Using features like tagging allows us to highlight commits that are particularly important, which is very useful for *reproducibility* purposes. We can (and should) refer to specific commits for software in academic papers that make use of results from software, but tagging with a specific version number makes that just a little bit easier for humans.
+Using features like tagging allows us to highlight commits that are particularly important,
+which is very useful for *reproducibility* purposes.
+We can (and should) refer to specific commits in
+academic papers that make use of results from the software,
+but tagging with a specific version number makes that just a little bit easier for humans.
## Conforming to Data Policy and Regulation
-We may also wish to make data available to either be used with the software or as generated results. This may be via GitHub or some other means. An important aspect to remember with sharing data on such systems is that they may reside in other countries, and we must be careful depending on the nature of the data.
-
-We need to ensure that we are still conforming to the relevant policies and guidelines regarding how we manage research data, which may include funding council, institutional, national, and even international policies and laws. Within Europe, for example, there's the need to conform to things like [GDPR][gdpr], for example. It's a very good idea to make yourself aware of these aspects.
+We may also wish to make data available,
+either to be used with the software or as generated results.
+This may be via GitHub or some other means.
+An important aspect to remember with sharing data on such systems is that
+they may reside in other countries,
+and we must be careful depending on the nature of the data.
+
+We need to ensure that we are still conforming to
+the relevant policies and guidelines regarding how we manage research data,
+which may include funding council,
+institutional,
+national,
+and even international policies and laws.
+Within Europe, for example, there's the need to conform to things like [GDPR][gdpr].
+It's a very good idea to make yourself aware of these aspects.
{% include links.md %}
diff --git a/_episodes/43-software-release.md b/_episodes/43-software-release.md
index 6fa8b1d83..d85275c47 100644
--- a/_episodes/43-software-release.md
+++ b/_episodes/43-software-release.md
@@ -17,26 +17,41 @@ keypoints:
## Why Package our Software?
-We've now got our software ready to release - the last step is to package it up so that it can be distributed.
-
-For very small pieces of software, for example a single source file, it may be appropriate to distribute to non-technical end-users as source code, but in most cases we want to bundle our application or library into a package.
-A package is typically a single file which contains within it our software and some metadata which allows it to be installed and used more simply - e.g. a list of dependencies.
-By distributing our code as a package, we reduce the complexity of fetching, installing and integrating it for the end-users.
-
-In this session we'll introduce one widely used method for building an installable package from our code.
-There are range of methods in common use, so it's likely you'll also encounter projects which take different approaches.
+We've now got our software ready to release -
+the last step is to package it up so that it can be distributed.
+
+For very small pieces of software,
+for example a single source file,
+it may be appropriate to distribute to non-technical end-users as source code,
+but in most cases we want to bundle our application or library into a package.
+A package is typically a single file which contains within it our software
+and some metadata which allows it to be installed and used more simply -
+e.g. a list of dependencies.
+By distributing our code as a package,
+we reduce the complexity of fetching, installing and integrating it for the end-users.
+
+In this session we'll introduce
+one widely used method for building an installable package from our code.
+There is a range of methods in common use,
+so it's likely you'll also encounter projects which take different approaches.
There's some confusing terminology in this episode around the use of the term "package".
This term is used to refer to both:
- A directory containing Python files / modules and an `__init__.py` - a "module package"
-- A way of structuring / bundling a project for easier distribution and installation - a "distributable package"
+- A way of structuring / bundling a project for easier distribution and installation -
+ a "distributable package"
## Packaging our Software with Poetry
### Installing Poetry
-Because we've recommended GitBash if you're using Windows, we're going to install Poetry using a different method to the officially recommended one.
-If you're on MacOS or Linux, are comfortable with installing software at the command line and want to use Poetry to manage multiple projects, you may instead prefer to follow the official [Poetry installation instructions](https://python-poetry.org/docs/#installation).
+Because we've recommended GitBash if you're using Windows,
+we're going to install Poetry using a different method to the officially recommended one.
+If you're on MacOS or Linux,
+are comfortable with installing software at the command line
+and want to use Poetry to manage multiple projects,
+you may instead prefer to follow the official
+[Poetry installation instructions](https://python-poetry.org/docs/#installation).
We can install Poetry much like any other Python distributable package, using `pip`:
@@ -58,9 +73,12 @@ $ which poetry
~~~
{: .output}
-If you don't get similar output, make sure you've got the correct virtual environment activated.
+If you don't get similar output,
+make sure you've got the correct virtual environment activated.
-Poetry can also handle virtual environments for us, so in order to behave similarly to how we used them previously, let's change the Poetry config to put them in the same directory as our project:
+Poetry can also handle virtual environments for us,
+so in order to behave similarly to how we used them previously,
+let's change the Poetry config to put them in the same directory as our project:
~~~ bash
$ poetry config virtualenvs.in-project true
@@ -69,18 +87,28 @@ $ poetry config virtualenvs.in-project true
### Setting up our Poetry Config
-Poetry uses a **pyproject.toml** file to describe the build system and requirements of the distributable package.
-This file format was introduced to solve problems with bootstrapping packages (the processing we do to prepare to process something) using the older convention with **setup.py** files and to support a wider range of build tools.
-It is described in [PEP 518 (Specifying Minimum Build System Requirements for Python Projects)](https://www.python.org/dev/peps/pep-0518/).
+Poetry uses a **pyproject.toml** file to describe
+the build system and requirements of the distributable package.
+This file format was introduced to solve problems with bootstrapping packages
+(the setup work a tool must do before it can even build a package)
+using the older convention with **setup.py** files and to support a wider range of build tools.
+It is described in
+[PEP 518 (Specifying Minimum Build System Requirements for Python Projects)](https://www.python.org/dev/peps/pep-0518/).
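For orientation, a minimal Poetry `pyproject.toml` looks roughly like the following (the name, version, author and dependency values here are illustrative placeholders, not what `poetry init` will generate for your project):

```toml
[tool.poetry]
name = "inflammation"
version = "0.1.0"
description = "Analyse patient inflammation data"
authors = ["Jane Doe <jane@example.com>"]

[tool.poetry.dependencies]
python = "^3.9"

# Tells build tools (like pip) how to build this package, per PEP 518
[build-system]
requires = ["poetry-core>=1.0.0"]
build-backend = "poetry.core.masonry.api"
```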
-Make sure you are in the root directory of your software project and have activated your virtual environment, then we're ready to begin.
+Make sure you are in the root directory of your software project
+and have activated your virtual environment,
+then we're ready to begin.
To create a `pyproject.toml` file for our code, we can use `poetry init`.
-This will guide us through the most important settings - for each prompt, we either enter our data or accept the default.
+This will guide us through the most important settings -
+for each prompt, we either enter our data or accept the default.
-*Displayed below are the questions you should see with the recommended responses to each question so try to follow these, although use your own contact details!*
+*Displayed below are the questions you should see
+with the recommended responses to each question so try to follow these,
+although use your own contact details!*
-**NB: When you get to the questions about defining our dependencies, answer no, so we can do this separately later.**
+**NB: When you get to the questions about defining our dependencies,
+answer no, so we can do this separately later.**
~~~
$ poetry init
@@ -122,24 +150,43 @@ Do you confirm generation? (yes/no) [yes] yes
~~~
{: .output}
-We've called our package "inflammation" in the setup above, instead of "inflammation-analysis" like we did in our previous `setup.py`.
-This is because Poetry will automatically find our code if the name of the distributable package matches the name of our module package.
-If we wanted our distributable package to have a different name, for example "inflammation-analysis", we could do this by explicitly listing the module packages to bundle - see [the Poetry docs on packages](https://python-poetry.org/docs/pyproject/#packages) for how to do this.
+We've called our package "inflammation" in the setup above,
+instead of "inflammation-analysis" like we did in our previous `setup.py`.
+This is because Poetry will automatically find our code
+if the name of the distributable package matches the name of our module package.
+If we wanted our distributable package to have a different name,
+for example "inflammation-analysis",
+we could do this by explicitly listing the module packages to bundle -
+see [the Poetry docs on packages](https://python-poetry.org/docs/pyproject/#packages)
+for how to do this.
### Project Dependencies
Previously, we looked at using a `requirements.txt` file to define the dependencies of our software.
-Here, Poetry takes inspiration from package managers in other languages, particularly NPM (Node Package Manager), often used for JavaScript development.
-
-Tools like Poetry and NPM understand that there are two different types of dependency: runtime dependencies and development dependencies.
-Runtime dependencies are those dependencies that need to be installed for our code to run, like NumPy.
-Development dependencies are dependencies which are an essential part of your development process for a project, but are not required to run it.
-Common examples of developments dependencies are linters and test frameworks, like `pylint` or `pytest`.
-
-When we add a dependency using Poetry, Poetry will add it to the list of dependencies in the `pyproject.toml` file, add a reference to it in a new `poetry.lock` file, and automatically install the package into our virtual environment.
-If we don't yet have a virtual environment activated, Poetry will create it for us - using the name `.venv`, so it appears hidden unless we do `ls -a`.
+Here, Poetry takes inspiration from package managers in other languages,
+particularly NPM (Node Package Manager),
+often used for JavaScript development.
+
+Tools like Poetry and NPM understand that there are two different types of dependency:
+runtime dependencies and development dependencies.
+Runtime dependencies are those dependencies that
+need to be installed for our code to run, like NumPy.
+Development dependencies are dependencies which
+are an essential part of your development process for a project,
+but are not required to run it.
+Common examples of development dependencies are linters and test frameworks,
+like `pylint` or `pytest`.
+
+When we add a dependency using Poetry,
+Poetry will add it to the list of dependencies in the `pyproject.toml` file,
+add a reference to it in a new `poetry.lock` file,
+and automatically install the package into our virtual environment.
+If we don't yet have a virtual environment activated,
+Poetry will create it for us - using the name `.venv`,
+so it appears hidden unless we do `ls -a`.
Because we've already activated a virtual environment, Poetry will use ours instead.
-The `pyproject.toml` file has two separate lists, allowing us to distinguish between runtime and development dependencies.
+The `pyproject.toml` file has two separate lists,
+allowing us to distinguish between runtime and development dependencies.
~~~
$ poetry add matplotlib numpy
@@ -149,25 +196,41 @@ $ poetry install
{: .language-bash}
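Once dependencies have been added, the two lists in `pyproject.toml` look something like this (the version constraints below are illustrative, and on newer Poetry versions the development list is instead named `[tool.poetry.group.dev.dependencies]`):

```toml
# Runtime dependencies: needed by anyone installing and using the package
[tool.poetry.dependencies]
python = "^3.9"
matplotlib = "^3.5.0"
numpy = "^1.22.0"

# Development dependencies: only needed by people working on the code
[tool.poetry.dev-dependencies]
pylint = "^2.12.0"
```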
These two sets of dependencies will be used in different circumstances.
-When we build our package and upload it to a package repository, Poetry will only include references to our runtime dependencies.
-This is because someone installing our software through a tool like `pip` is only using it, but probably doesn't intend
-to contribute to the development of our software and does not require development dependencies.
-
-In contrast, if someone downloads our code from GitHub, together with our `pyproject.toml`, and installs the project that way, they will get both our runtime and development dependencies.
-If someone is downloading our source code, that suggests that they intend to contribute to the development, so they'll need all of our development tools.
+When we build our package and upload it to a package repository,
+Poetry will only include references to our runtime dependencies.
+This is because someone installing our software through a tool like `pip` is only using it,
+but probably doesn't intend to contribute to the development of our software
+and does not require development dependencies.
+
+In contrast, if someone downloads our code from GitHub,
+together with our `pyproject.toml`,
+and installs the project that way,
+they will get both our runtime and development dependencies.
+If someone is downloading our source code,
+that suggests that they intend to contribute to the development,
+so they'll need all of our development tools.
Have a look at the `pyproject.toml` file again to see what's changed.
### Packaging Our Code
-The final preparation we need to do is to make sure that our code is organised in the recommended structure.
-This is the Python module structure - a directory containing an `__init__.py` and our Python source code files.
-Make sure that the name of this Python package (`inflammation` - unless you've renamed it) matches the name of your distributable package in `pyproject.toml` unless you've chosen to explicitly list the module packages.
+The final preparation we need to do is to
+make sure that our code is organised in the recommended structure.
+This is the Python module structure -
+a directory containing an `__init__.py` and our Python source code files.
+Make sure that the name of this Python package
+(`inflammation` - unless you've renamed it)
+matches the name of your distributable package in `pyproject.toml`
+unless you've chosen to explicitly list the module packages.
+By convention, distributable package names use hyphens,
-While we could choose to use underscores in a distributable package name, we cannot use hyphens in a module package name, as Python will interpret them as a minus sign in our code when we try to import them.
+By convention distributable package names use hyphens,
+whereas module package names use underscores.
+While we could choose to use underscores in a distributable package name,
+we cannot use hyphens in a module package name,
+as Python will interpret them as a minus sign in our code when we try to import them.
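We can see why hyphens are off-limits in module package names with a quick check: Python parses `import inflammation-analysis` as `import inflammation` minus `analysis`, which is a syntax error (the module names here are just examples and need not exist to be parsed):

```python
# A hyphenated name fails to parse: Python reads "inflammation-analysis"
# as the name "inflammation" followed by a minus sign.
try:
    compile("import inflammation-analysis", "<example>", "exec")
except SyntaxError as err:
    print("SyntaxError:", err.msg)

# An underscored name parses fine (whether or not the module is installed)
code = compile("import inflammation_analysis", "<example>", "exec")
print(type(code).__name__)  # prints "code"
```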
-Once we've got our `pyproject.toml` configuration done and our project is in the right structure, we can go ahead and build a distributable version of our software:
+Once we've got our `pyproject.toml` configuration done and our project is in the right structure,
+we can go ahead and build a distributable version of our software:
~~~
$ poetry build
@@ -176,49 +239,80 @@ $ poetry build
This should produce two files for us in the `dist` directory.
The one we care most about is the `.whl` or **wheel** file.
-This is the file that `pip` uses to distribute and install Python packages, so this is the file we'd need to share with other people who want to install our software.
+This is the file that `pip` uses to distribute and install Python packages,
+so this is the file we'd need to share with other people who want to install our software.
-Now if we gave this wheel file to someone else, they could install it using `pip` - you don't need to run this command yourself, you've already installed it using `poetry install` above.
+Now if we gave this wheel file to someone else,
+they could install it using `pip` -
+you don't need to run this command yourself,
+you've already installed it using `poetry install` above.
~~~
$ pip3 install dist/inflammation*.whl
~~~
{: .language-bash}
-The star in the line above is a **wildcard**, that means Bash should use any filenames that match that pattern, with any number of characters in place for the star.
-We could also rely on Bash's autocomplete functionality and type `dist/inflammation`, then hit the Tab key if we've only got one version built.
-
-After we've been working on our code for a while and want to publish an update, we just need to update the version number in the `pyproject.toml` file (using [SemVer](https://semver.org/) perhaps), then use Poetry to build and publish the new version.
-If we don't increment the version number, people might end up using this version, even though they thought they were using the previous one.
-Any re-publishing of the package, no matter how small the changes, needs to come with a new version number.
-The advantage of [SemVer](https://semver.org/) is that the change in the version number indicates the degree of change in the code and thus the degree of risk of breakage when we update.
+The star in the line above is a **wildcard**,
+which means Bash should match any filenames that fit that pattern,
+with any number of characters in place of the star.
+We could also rely on Bash's autocomplete functionality and type `dist/inflammation`,
+then hit the Tab key if we've only got one version built.
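The same shell-style wildcard rules can be checked from Python with the standard-library `fnmatch` module (the filenames below are made up for illustration):

```python
from fnmatch import fnmatch

# '*' matches any number of characters, as in Bash globbing
print(fnmatch("inflammation-0.1.0-py3-none-any.whl", "inflammation*.whl"))  # True
print(fnmatch("inflammation-0.1.0.tar.gz", "inflammation*.whl"))            # False
```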
+
+After we've been working on our code for a while and want to publish an update,
+we just need to update the version number in the `pyproject.toml` file
+(using [SemVer](https://semver.org/) perhaps),
+then use Poetry to build and publish the new version.
+If we don't increment the version number,
+people might end up using this version,
+even though they thought they were using the previous one.
+Any re-publishing of the package, no matter how small the changes,
+needs to come with a new version number.
+The advantage of [SemVer](https://semver.org/) is that the change in the version number
+indicates the degree of change in the code and thus the degree of risk of breakage when we update.
~~~
$ poetry build
~~~
{: .language-bash}
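The SemVer bumping rules described above can be sketched in a few lines of Python (illustrative only, not a full implementation of the spec, which also covers pre-release and build metadata):

```python
def bump(version, part):
    """Increment the major, minor or patch part of a 'MAJOR.MINOR.PATCH' string."""
    major, minor, patch = (int(p) for p in version.split("."))
    if part == "major":  # breaking changes: reset minor and patch
        return f"{major + 1}.0.0"
    if part == "minor":  # new, backwards-compatible features: reset patch
        return f"{major}.{minor + 1}.0"
    return f"{major}.{minor}.{patch + 1}"  # bug fixes only

print(bump("1.4.2", "patch"))  # 1.4.3
print(bump("1.4.2", "minor"))  # 1.5.0
print(bump("1.4.2", "major"))  # 2.0.0
```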
-In addition to the commands we've already seen, Poetry contains a few more that can be useful for our development process.
+In addition to the commands we've already seen,
+Poetry contains a few more that can be useful for our development process.
For the full list see the [Poetry CLI documentation](https://python-poetry.org/docs/cli/).
The final step is to publish our package to a package repository.
-A package repository could be either public or private - while you may at times be working on public projects, it's likely the majority of your work will be published internally using a private repository such as JFrog Artifactory.
-Every repository may be configured slightly differently, so we'll leave that to you to investigate.
+A package repository could be either public or private -
+while you may at times be working on public projects,
+it's likely the majority of your work will be published internally
+using a private repository such as JFrog Artifactory.
+Every repository may be configured slightly differently,
+so we'll leave that to you to investigate.
## What if We Need More Control?
-Sometimes we need more control over the process of building our distributable package than Poetry allows.
-There many ways to distribute Python code in packages, with some degree of flux in terms of which methods are most
-popular. For a more comprehensive overview of Python packaging you can see the
-[Python docs on packaging](https://packaging.python.org/en/latest/), which contains a helpful guide to the overall
-[packaging process, or 'flow'](https://packaging.python.org/en/latest/flow/), using the
-[Twine](https://pypi.org/project/twine/) tool to upload created packages to PyPI for distribution as an alternative.
+Sometimes we need more control over the process of
+building our distributable package than Poetry allows.
+There are many ways to distribute Python code in packages,
+with some degree of flux in terms of which methods are most popular.
+For a more comprehensive overview of Python packaging you can see the
+[Python docs on packaging](https://packaging.python.org/en/latest/),
+which contains a helpful guide to the overall
+[packaging process, or 'flow'](https://packaging.python.org/en/latest/flow/),
+using the [Twine](https://pypi.org/project/twine/) tool to
+upload created packages to PyPI for distribution as an alternative.
> ## Optional Exercise: Enhancing our Package Metadata
>
-> The [Python Packaging User Guide](https://packaging.python.org/) provides documentation on [how to package a project](https://packaging.python.org/en/latest/tutorials/packaging-projects/) using a manual approach to building a `pyproject.toml` file, and using Twine to upload the distribution packages to PyPI.
->
-> Referring to the [section on metadata](https://packaging.python.org/en/latest/tutorials/packaging-projects/#configuring-metadata) in the documentation, enhance your `pyproject.toml` with some additional metadata fields to improve the information your package.
+> The [Python Packaging User Guide](https://packaging.python.org/)
+> provides documentation on
+> [how to package a project](https://packaging.python.org/en/latest/tutorials/packaging-projects/)
+> using a manual approach to building a `pyproject.toml` file,
+> and using Twine to upload the distribution packages to PyPI.
+>
+> Referring to the
+> [section on metadata](https://packaging.python.org/en/latest/tutorials/packaging-projects/#configuring-metadata)
+> in the documentation,
+> enhance your `pyproject.toml` with some additional metadata fields
+> to improve the information about your package.
{: .challenge}
{% include links.md %}
diff --git a/_episodes/50-section5-intro.md b/_episodes/50-section5-intro.md
index 651468608..cb668eddd 100644
--- a/_episodes/50-section5-intro.md
+++ b/_episodes/50-section5-intro.md
@@ -17,26 +17,32 @@ keypoints:
---
In this section of the course we look at managing the **development and evolution** of software -
-how to keep track of the tasks the team has to do,
-how to improve the quality and reusability of our software for others as well as ourselves,
+how to keep track of the tasks the team has to do,
+how to improve the quality and reusability of our software for others as well as ourselves,
and how to assess other people's software for reuse within our project.
The focus in this section will move beyond just software development to **software management**:
-internal planning and prioritising tasks for future development,
-management of internal communication as well as how the outside world interacts with and makes use of our software,
-how others can interact with ourselves to report issues, and the ways we can successfully manage software
-improvement in response to feedback.
+internal planning and prioritising tasks for future development,
+management of internal communication as well as
+how the outside world interacts with and makes use of our software,
+how others can interact with us to report issues,
+and the ways we can successfully manage software improvement in response to feedback.
{: .image-with-shadow width="800px" }
In this section we will:
- Use GitHub to **track issues with our software** registered by ourselves and external users.
-- Use GitHub's **Mentions** and notifications system to effectively **communicate within the team** on software development tasks.
+- Use GitHub's **Mentions** and notifications system to
+ effectively **communicate within the team** on software development tasks.
- Use GitHub's **Project Boards** and **Milestones** for project planning and management.
-- Learn to manage the **improvement of our software through feedback** using **agile** management techniques.
-- Employ **effort estimation** of development tasks as a foundational tool for prioritising future team work,
-and use the **MoSCoW approach** and software development **sprints** to manage improvement. As we will see, it is very
-difficult to prioritise work effectively without knowing both its relative importance to others as well as the effort required to deliver those work items.
+- Learn to manage the **improvement of our software through feedback**
+ using **agile** management techniques.
+- Employ **effort estimation** of development tasks
+ as a foundational tool for prioritising future team work,
+ and use the **MoSCoW approach** and software development **sprints** to manage improvement.
+ As we will see, it is very difficult to prioritise work effectively
+ without knowing both its relative importance to others
+ as well as the effort required to deliver those work items.
- Learn how to employ a critical mindset when reviewing software for reuse.
{% include links.md %}
diff --git a/_episodes/51-managing-software.md b/_episodes/51-managing-software.md
index 7b3b07d19..63bd244fa 100644
--- a/_episodes/51-managing-software.md
+++ b/_episodes/51-managing-software.md
@@ -13,210 +13,330 @@ objectives:
- "Use GitHub's **Project Boards** and **Milestones** for software project management, planning sprints and releases"
keypoints:
- "We should use GitHub's **Issues** to keep track of software problems and other requests for change - even if we are the only developer and user."
-- "GitHub’s **Mentions** play an important part in communicating between collaborators and is
+- "GitHub’s **Mentions** play an important part in communicating between collaborators and are
used as a way of alerting team members of activities and referencing one issue/pull requests/comment/commit from another."
-- "Without a good project and issue management framework, it can be hard to keep track of what’s done, or what needs doing, and
+- "Without a good project and issue management framework, it can be hard to keep track of what’s done, or what needs doing, and
particularly difficult to convey that to others in the team or sharing the responsibilities."
---
## Introduction
-Developing software is a project and, like most projects, it consists of multiple tasks. Keeping track of identified issues
-with the software, the list of tasks the team has to do, progress on each, prioritising tasks for future development,
-planning sprints and releases, etc., can quickly become a non-trivial task in itself.
-Without a good team project management process and framework,
-it can be hard to keep track of what’s done, or what needs doing, and particularly difficult to convey that to others
+Developing software is a project and, like most projects, it consists of multiple tasks.
+Keeping track of identified issues with the software,
+the list of tasks the team has to do, progress on each,
+prioritising tasks for future development,
+planning sprints and releases, etc.,
+can quickly become a non-trivial task in itself.
+Without a good team project management process and framework,
+it can be hard to keep track of what’s done, or what needs doing,
+and particularly difficult to convey that to others
in the team or share the responsibilities.
## Using GitHub to Manage Issues With Software
-As a piece of software is used, bugs and other issues will inevitably come to light - nothing is perfect!
-If you work on your code with collaborators, or have non-developer users, it can be helpful to have a single shared
-record of all the problems people have found with the code, not only to keep track of them for you to work on later,
+As a piece of software is used,
+bugs and other issues will inevitably come to light - nothing is perfect!
+If you work on your code with collaborators,
+or have non-developer users,
+it can be helpful to have a single shared record of
+all the problems people have found with the code,
+not only to keep track of them for you to work on later,
but to avoid people emailing you to report a bug that you already know about!
-GitHub provides **Issues** - a framework for managing bug reports, feature requests, and lists of future work.
+GitHub provides **Issues** -
+a framework for managing bug reports, feature requests, and lists of future work.
-Go back to the home page for your `python-intermediate-inflammation` repository in GitHub, and click on the **Issue** tab.
-You should see a page listing the open issues on your repository - currently there should be none.
+Go back to the home page for your `python-intermediate-inflammation` repository in GitHub,
+and click on the **Issue** tab.
+You should see a page listing the open issues on your repository -
+currently there should be none.
{: .image-with-shadow width="1000px"}
-Let's go through the process of creating a new issue. Start by clicking the `New issue` button.
+Let's go through the process of creating a new issue.
+Start by clicking the `New issue` button.
{: .image-with-shadow width="1000px"}
-When you create an issue, you can add a range of details to them. They can be *assigned to a specific developer* for example - this can be a helpful way to know who, if anyone, is currently working to fix the issue, or a way to assign
-responsibility to someone to deal with it.
+When you create an issue, you can add a range of details to it.
+They can be *assigned to a specific developer* for example -
+this can be a helpful way to know who, if anyone, is currently working to fix the issue,
+or a way to assign responsibility to someone to deal with it.
-They can also be assigned a *label*. The labels available for issues can be customised, and given a colour, allowing you to see at a glance the state of your code's issues. The [default labels available in GitHub](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) include:
+They can also be assigned a *label*.
+The labels available for issues can be customised,
+and given a colour,
+allowing you to see at a glance the state of your code's issues.
+The [default labels available in GitHub](https://docs.github.com/en/issues/using-labels-and-milestones-to-track-work/managing-labels) include:
- `bug` - indicates an unexpected problem or unintended behavior
- `documentation` - indicates a need for improvements or additions to documentation
- `duplicate` - indicates similar or already reported issues, pull requests, or discussions
-- `enhancement` - indicates new feature requests, or if they are created by a developer, indicate planned new features
+- `enhancement` - indicates new feature requests,
+  or, when created by a developer, planned new features
- `good first issue` - indicates a good issue for first-time contributors
- `help wanted` - indicates that a maintainer wants help on an issue or pull request
- `invalid` - indicates that an issue, pull request, or discussion is no longer relevant
- `question` - indicates that an issue, pull request, or discussion needs more information
- `wontfix` - indicates that work won't continue on an issue, pull request, or discussion
-You can also create your own custom labels to help with classifying issues. There are no rules really about naming the labels - use whatever makes sense for your project. Some conventional custom labels include: `status:in progress` (to indicate that someone started working on the issue), `status:blocked` (to indicate that the progress on addressing issue is blocked by another issue or activity), etc.
-
-As well as highlighting problems, the `bug` label can make code much more usable by allowing users to find out if anyone has had the same problem before, and also how to fix (or work around) it on their end. Enabling users to solve their own problems can save you a lot of time. In general, a good bug report should contain only one bug, specific details of the environment in which the issue appeared (e.g. operating system or browser, version of the software and its dependencies), and sufficiently clear and concise steps that allow a developer to reproduce the bug themselves. They should also be clear on what the bug reporter considers factual ("I did this and this happened") and speculation ("I think it was caused by this"). If an error report was generated from the software itself, it's a very good idea to include that in the issue.
-
-The `enhancement` label is a great way to communicate your future priorities to your collaborators but also to yourself - it’s far too easy to leave a software project for a few months to work on something else, only to come back and forget the improvements you were going to make. If you have other users for your code, they can use the label to request new features, or changes to the way the code operates. It’s generally worth paying attention to these suggestions, especially if you spend more time developing than running the code. It can be very easy to end up with quirky behaviour because of off-the-cuff choices during development. Extra pairs of eyes can point out ways the code can be made more accessible - the easier the code is to use, the more widely it will be adopted and the greater impact it will have.
-
-One interesting label is `wontfix`, which indicates that an issue simply won't be worked on for whatever reason. Maybe the bug it reports is outside of the use case of the software, or the feature it requests simply isn't a priority. This can make it clear
-you've thought about an issue and dismissed it.
+You can also create your own custom labels to help with classifying issues.
+There are no rules really about naming the labels -
+use whatever makes sense for your project.
+Some conventional custom labels include:
+`status:in progress` (to indicate that someone started working on the issue),
+`status:blocked` (to indicate that progress on addressing the issue is
+blocked by another issue or activity), etc.
+
+As well as highlighting problems,
+the `bug` label can make code much more usable by
+allowing users to find out if anyone has had the same problem before,
+and also how to fix (or work around) it on their end.
+Enabling users to solve their own problems can save you a lot of time.
+In general, a good bug report should contain only one bug,
+specific details of the environment in which the issue appeared
+(e.g. operating system or browser, version of the software and its dependencies),
+and sufficiently clear and concise steps that allow a developer to reproduce the bug themselves.
+They should also be clear on what the bug reporter considers factual
+("I did this and this happened")
+and speculation
+("I think it was caused by this").
+If an error report was generated from the software itself,
+it's a very good idea to include that in the issue.
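Since environment details are needed in almost every bug report, it can help to gather them programmatically and paste the result into the issue. A minimal sketch in Python - the function name is illustrative, not part of any library:

```python
# A minimal sketch of collecting environment details for a bug report.
import platform
import sys

def environment_summary():
    """Return a short description of the current environment."""
    return "\n".join([
        f"Operating system: {platform.platform()}",
        f"Python version: {sys.version.split()[0]}",
    ])

print(environment_summary())
```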
+
+The `enhancement` label is a great way to communicate your future priorities
+to your collaborators but also to yourself -
+it’s far too easy to leave a software project for a few months to work on something else,
+only to come back and forget the improvements you were going to make.
+If you have other users for your code,
+they can use the label to request new features,
+or changes to the way the code operates.
+It’s generally worth paying attention to these suggestions,
+especially if you spend more time developing than running the code.
+It can be very easy to end up with quirky behaviour
+because of off-the-cuff choices during development.
+Extra pairs of eyes can point out ways the code can be made more accessible -
+the easier the code is to use, the more widely it will be adopted
+and the greater impact it will have.
+
+One interesting label is `wontfix`,
+which indicates that an issue simply won't be worked on for whatever reason.
+Maybe the bug it reports is outside of the use case of the software,
+or the feature it requests simply isn't a priority.
+This can make it clear you've thought about an issue and dismissed it.
> ## Locking and Pinning Issues
-> The **Lock conversation** and **Pin issue** buttons are both available from individual issue pages.
-> Locking conversations allows you to block future comments on the issue, e.g. if the conversation around the issue
-> is not constructive or violates your team's code of conduct. Pinning issues allows you to pin up to three
-> issues to the top of the issues page, e.g. to emphasise their importance.
+> The **Lock conversation** and **Pin issue** buttons are both available
+> from individual issue pages.
+> Locking conversations allows you to block future comments on the issue,
+> e.g. if the conversation around the issue is not constructive
+> or violates your team's code of conduct.
+> Pinning issues allows you to pin up to three issues to the top of the issues page,
+> e.g. to emphasise their importance.
{: .callout}
> ## Manage Issues With Your Code Openly
-> Having open, publicly-visible lists of the the limitations and problems with your code is incredibly helpful. Even if some issues end up languishing unfixed for years, letting users know about them can save them a huge amount of work attempting to fix what turns out to be an unfixable problem on their end. It can also help you see at a glance what state your code is in, making it easier to prioritise future work!
+> Having open, publicly-visible lists of the limitations and problems with your code
+> is incredibly helpful.
+> Even if some issues end up languishing unfixed for years,
+> letting users know about them can save them a huge amount of work
+> attempting to fix what turns out to be an unfixable problem on their end.
+> It can also help you see at a glance what state your code is in,
+> making it easier to prioritise future work!
{: .testimonial}
> ## Exercise: Our First Issue!
-> Individually, with a critical eye, think of an aspect of the code you have developed so far that needs improvement.
-> It could be a bug, for example, or a documentation issue with your README, a missing LICENSE file, or an enhancement.
-> In GitHub, enter the details of the issue and select `Submit new issue`. Add a label to your issue, if appropriate.
->
+> Individually, with a critical eye,
+> think of an aspect of the code you have developed so far that needs improvement.
+> It could be a bug, for example,
+> or a documentation issue with your README,
+> a missing LICENSE file,
+> or an enhancement.
+> In GitHub, enter the details of the issue and select `Submit new issue`.
+> Add a label to your issue, if appropriate.
+>
> Time: 5 mins
>> ## Solution
->> For example, "Add a licence file" could be a good first issue, with a label `documentation`.
+>> For example, "Add a licence file" could be a good first issue, with a label `documentation`.
> {: .solution}
{: .challenge}
### Issue (and Pull Request) Templates
-GitHub also allows you to set up issue and pull request templates for your software project.
-Such templates provide a structure for the issue/pull request descriptions, and/or prompt issue reporters and collaborators
-to fill in answers to pre-set questions. They can help contributors raise issues or submit pull requests in a way
-that is clear, helpful and provides enough information for maintainers to act upon
-(without going back and forth to extract it). GitHub provides a range of default templates,
+
+GitHub also allows you to set up issue and pull request templates for your software project.
+Such templates provide a structure for the issue/pull request descriptions,
+and/or prompt issue reporters and collaborators to fill in answers to pre-set questions.
+They can help contributors raise issues or submit pull requests
+in a way that is clear, helpful and provides enough information for maintainers to act upon
+(without going back and forth to extract it).
+GitHub provides a range of default templates,
but you can also [write your own](https://docs.github.com/en/communities/using-templates-to-encourage-useful-issues-and-pull-requests/configuring-issue-templates-for-your-repository).
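As an illustration, a bug-report issue form is a YAML file placed under `.github/ISSUE_TEMPLATE/` in the repository. The sketch below follows GitHub's issue forms schema; the labels and wording are examples only:

```yaml
# .github/ISSUE_TEMPLATE/bug_report.yml (illustrative sketch)
name: Bug report
description: Report something that is not working as expected
labels: ["bug"]
body:
  - type: textarea
    attributes:
      label: What happened?
      description: Steps to reproduce, plus what you expected to happen instead.
    validations:
      required: true
  - type: input
    attributes:
      label: Environment
      placeholder: e.g. Ubuntu 22.04, Python 3.11, numpy 1.26
```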
## Using GitHub's Notifications & Referencing System to Communicate
-GitHub implements a comprehensive [notifications system](https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/setting-up-notifications/configuring-notifications)
-to keep the team up-to-date with activities in your code repository and notify you when something happens or changes
-in your software project. You can choose whether to watch or unwatch an individual repository,
-or can choose to only be notified of certain event types such as updates to issues, pull requests, direct mentions,
-etc. GitHub also provides an additional useful notification feature for collaborative work - **Mentions**.
-In addition to referencing team members (which will result in an appropriate notification), GitHub allows us
-to reference issues, pull requests and comments from one another - providing a useful way of connecting things
-and conversations in your project.
+GitHub implements a comprehensive
+[notifications system](https://docs.github.com/en/account-and-profile/managing-subscriptions-and-notifications-on-github/setting-up-notifications/configuring-notifications)
+to keep the team up-to-date with activities in your code repository
+and notify you when something happens or changes in your software project.
+You can choose whether to watch or unwatch an individual repository,
+or choose to be notified only of certain event types
+such as updates to issues, pull requests, direct mentions, etc.
+GitHub provides an additional useful notification feature for collaborative work - **Mentions**.
+In addition to referencing team members
+(which will result in an appropriate notification),
+GitHub allows us to reference issues, pull requests and comments from one another -
+providing a useful way of connecting things and conversations in your project.
### Referencing Team Members Using Mentions
-The mention system notifies team members when somebody else references them in an issue,
-comment or pull request - you can use this to notify people when you want to check a detail with them,
-or let them know something has been fixed or changed (much easier than writing out all the same information
-again in an email).
-
-You can use the mention system to link to/notify an individual GitHub account or a whole team
-for notifying multiple people. Typing @ in GitHub will bring up a list of all accounts and teams linked
-to the repository that can be "mentioned". People will then receive notifications based on their preferred notification
-methods - e.g. via email or GitHub's User Interface.
-
-### Referencing Issues, Pull Requests and Comments
-
-GitHub also lets you mention/reference one issue or pull request from another (and people "watching" these will be notified
-of any such updates). Whilst writing the description of an issue, or commenting on one,
-if you type # you should see a list of the issues and pull requests on the repository.
-They are coloured green if they're open, or white if they're closed. Continue typing the issue number, and
-the list will narrow down, then you can hit Return to select the entry and link the two.
-For example, if you realise that several of your bugs
-have common roots, or that one enhancement can't be implemented before you've finished another, you can use the
-mention system to indicate the depending issue(s). This is a simple way to add much more information to your issues.
-
-While not strictly notifying anyone, GitHub lets you also reference individual comments and commits. If you click the
-`...` button on a comment, from the drop down list you can select to `Copy link` (which is a URL that points to that
-comment that can be pasted elsewhere) or to `Reference [a comment] in a new issue` (which opens a new issue and references
-the comment by its URL). Within a text box for comments, issue and pull request descriptions, you can reference
-a commit by pasting its long, unique identifier (or its first few digits which uniquely identify it)
-and GitHub will render it nicely using the identifier's short form and link to the commit in question.
+The mention system notifies team members when somebody else references them
+in an issue, comment or pull request -
+you can use this to notify people when you want to check a detail with them,
+or let them know something has been fixed or changed
+(much easier than writing out all the same information again in an email).
+
+You can use the mention system to link to/notify an individual GitHub account
+or a whole team for notifying multiple people.
+Typing `@` in GitHub will bring up a list of
+all accounts and teams linked to the repository that can be "mentioned".
+People will then receive notifications based on their preferred notification methods -
+e.g. via email or GitHub's User Interface.
+
+### Referencing Issues, Pull Requests and Comments
+
+GitHub also lets you mention/reference one issue or pull request from another
+(and people "watching" these will be notified of any such updates).
+Whilst writing the description of an issue, or commenting on one,
+if you type `#` you should see
+a list of the issues and pull requests on the repository.
+They are coloured green if they're open, or white if they're closed.
+Continue typing the issue number, and the list will narrow down,
+then you can hit Return to select the entry and link the two.
+For example, if you realise that several of your bugs have common roots,
+or that one enhancement can't be implemented before you've finished another,
+you can use the mention system to link the issues that depend on each other.
+This is a simple way to add much more information to your issues.
+
+While not strictly notifying anyone,
+GitHub lets you also reference individual comments and commits.
+If you click the `...` button on a comment,
+from the drop down list you can select to `Copy link`
+(which is a URL that points to that comment that can be pasted elsewhere)
+or to `Reference [a comment] in a new issue`
+(which opens a new issue and references the comment by its URL).
+Within a text box for comments, issue and pull request descriptions,
+you can reference a commit by pasting its long, unique identifier
+(or its first few digits which uniquely identify it)
+and GitHub will render it nicely using the identifier's short form
+and link to the commit in question.
{: .image-with-shadow width="700px"}
> ## Exercise: Our First Mention/Reference!
-> Add a mention to one of your team members using the `@` notation
-> in a comment within an issue or a pull request in your repository - e.g. to
-> ask them a question or a clarification on something or to do some additional work.
->
-> Alternatively, add another issue to your repository and reference the issue you created in the previous exercise using the
-> `#` notation.
->
+> Add a mention to one of your team members using the `@` notation
+> in a comment within an issue or a pull request in your repository -
+> e.g. to ask them a question, request a clarification, or ask them to do some additional work.
+>
+> Alternatively, add another issue to your repository
+> and reference the issue you created in the previous exercise
+> using the `#` notation.
+>
> Time: 5 mins
{: .challenge}
> ## You Are Also a User of Your Code
>
-> This section focuses a lot on how issues and mentions can help communicate the current state of the code to others and
-> document what conversations were held around particular issues. As a sole developer, and possibly also the only user of the code, you might be tempted to not bother with recording issues, comments and new features as you don't need to communicate the information to anyone else.
+> This section focuses a lot on how issues and mentions can help
+> communicate the current state of the code to others
+> and document what conversations were held around particular issues.
+> As a sole developer, and possibly also the only user of the code,
+> you might be tempted to not bother with recording issues, comments and new features
+> as you don't need to communicate the information to anyone else.
>
-> Unfortunately, human memory isn't infallible! After spending six months on a different topic, it's inevitable you'll forget some of the plans you had and problems you faced. Not documenting these things can lead to you having to re-learn things you already put the effort into discovering before. Also, if others are brought on to the project at a later date, the software's existing issues and potential new features are already in place to build upon.
+> Unfortunately, human memory isn't infallible!
+> After spending six months on a different topic,
+> it's inevitable you'll forget some of the plans you had and problems you faced.
+> Not documenting these things can lead to you having to
+> re-learn things you already put the effort into discovering before.
+> Also, if others are brought on to the project at a later date,
+> the software's existing issues and potential new features are already in place to build upon.
{: .callout}
## Software Project Management in GitHub
-Managing issues within your software project is one aspect of project management but it gives a relative flat
-representation of tasks and may not be as suitable for higher-level project management such as
-prioritising tasks for future development, planning sprints and releases. Luckily,
+
+Managing issues within your software project is one aspect of project management but it gives a relatively flat
+representation of tasks and may not be as suitable for higher-level project management such as
+prioritising tasks for future development, planning sprints and releases. Luckily,
GitHub provides two project management tools for this purpose - **Projects** and **Milestones**.
-Both Projects and Milestones provide [agile development and project management systems](https://www.atlassian.com/agile)
-and ways of organising issues into smaller "sub-projects" (i.e.
+Both Projects and Milestones provide [agile development and project management systems](https://www.atlassian.com/agile)
+and ways of organising issues into smaller "sub-projects" (i.e.
smaller than the "project" represented by the whole repository).
Projects provide a way of visualising and organising work which is not time-bound and is on a higher level (e.g. more suitable for
-project management tasks). Milestones are typically used to
-organise lower-level tasks that have deadlines and progress of which needs to be closely tracked
+project management tasks). Milestones are typically used to
+organise lower-level tasks that have deadlines and whose progress needs to be closely tracked
(e.g. release and version management). The main difference is that Milestones are a repository-level feature
(i.e. they belong to and are managed from a single repository), whereas Projects are account-level and can manage tasks
across many repositories under the same user or organisational account.
-How you organise and partition your project work and which tool you want to use
-to track progress (if at all) is up to you and the size of your project. For example, you could create a project per
-milestone or have several milestones in a single project, or split milestones into shorter sprints.
-We will use Milestones soon to organise work on a mini sprint within our team -
+How you organise and partition your project work and which tool you want to use
+to track progress (if at all) is up to you and the size of your project. For example, you could create a project per
+milestone or have several milestones in a single project, or split milestones into shorter sprints.
+We will use Milestones soon to organise work on a mini sprint within our team -
for now, we will have a brief look at Projects.
### Projects
-A Project uses a "project board" consisted of
-columns and cards to keep track of tasks (although GitHub now also provides a table view over a project's tasks).
-You break down your project into smaller sub-projects, which in turn are split
-into tasks which you write on cards, then move the cards between columns that describe the status of each task.
-Cards are usually small, descriptive and self-contained tasks that build on each other. Breaking a project
-down into clearly-defined tasks makes it a lot easier to manage. GitHub project boards interact and integrate with
-the other features of the site such as issues and pull requests - cards can be added to track the progress of such
-tasks and automatically moved between columns based on their progress or status.
+
+A Project uses a "project board" consisting of columns and cards to keep track of tasks
+(although GitHub now also provides a table view over a project's tasks).
+You break down your project into smaller sub-projects,
+which in turn are split into tasks which you write on cards,
+then move the cards between columns that describe the status of each task.
+Cards are usually small, descriptive and self-contained tasks that build on each other.
+Breaking a project down into clearly-defined tasks makes it a lot easier to manage.
+GitHub project boards interact and integrate with the other features of the site
+such as issues and pull requests -
+cards can be added to track the progress of such tasks
+and automatically moved between columns based on their progress or status.
> ## Projects Are a Cross-Repository Management Tool
-> [Project in GitHub](https://docs.github.com/en/issues/planning-and-tracking-with-projects/learning-about-projects/about-projects) are created on a user or organisation level, i.e. they can span all repositories owned by a user or organisation in GitHub and are not a repository-level feature any more. A project can integrate your issues and pull requests on GitHub from multiple repositories to help you plan and track your team's work effectively.
+> [Projects in GitHub](https://docs.github.com/en/issues/planning-and-tracking-with-projects/learning-about-projects/about-projects)
+> are created on a user or organisation level,
+> i.e. they can span all repositories owned by a user or organisation in GitHub
+> and are not a repository-level feature any more.
+> A project can integrate your issues and pull requests on GitHub from multiple repositories
+> to help you plan and track your team's work effectively.
{: .callout}
Let's create a Project in GitHub to plan the first release of our code.
-1. From your GitHub account's home page (not your repository's home page!), select the "Projects" tab, then click the `New project` button on the right.
+1. From your GitHub account's home page (not your repository's home page!),
+ select the "Projects" tab, then click the `New project` button on the right.
{: .image-with-shadow width="1000px"}
-2. In the "Select a template" pop-up window, select "Board" - this will give you a classic "cards on a board" view of the project. An alternative is the "Table" view, which presents a spreadsheet-like and slightly more condensed view of a project.
+2. In the "Select a template" pop-up window, select "Board" -
+ this will give you a classic "cards on a board" view of the project.
+ An alternative is the "Table" view,
+ which presents a spreadsheet-like and slightly more condensed view of a project.
{: .image-with-shadow width="600px"}
-3. GitHub will create an unnamed project board for you. You should populate the name and the description of the project from the project's Settings, which can be found by clicking the `...` button in the top right corner of the board.
+3. GitHub will create an unnamed project board for you.
+ You should populate the name and the description of the project from the project's Settings,
+ which can be found by clicking the `...` button in the top right corner of the board.
{: .image-with-shadow width="1000px"}
-4. We can, for example, use "Inflammation project - release v0.1" and "Tasks for the v0.1 release of the inflammation project" for the name and description of our project, respectively. Or you can use anything that suits your project.
+4. We can, for example, use "Inflammation project - release v0.1"
+ and "Tasks for the v0.1 release of the inflammation project"
+ for the name and description of our project, respectively.
+ Or you can use anything that suits your project.
{: .image-with-shadow width="1000px"}
-5. GitHub's default card board template contains the following three columns with pretty self-explanatory names:
+5. GitHub's default card board template contains
+ the following three columns with pretty self-explanatory names:
- `To Do`
- `In Progress`
@@ -224,56 +344,89 @@ Let's create a Project in GitHub to plan the first release of our code.
{: .image-with-shadow width="1000px"}
- You can add or remove columns from your project board to suit your use case. One commonly seen extra
- column is `On hold` or `Waiting` - if you have tasks that get held up by waiting on other people (e.g. to respond to your questions) then moving them to a separate column makes their current state clearer.
-
- To add a new column, press the
- `+` button on the right; to remove a column select the `...` button in the top right corner
- of the column itself and then the `Delete column` option.
-
-6. You can now add new items (cards) to columns by pressing the `+ Add item` button at the bottom of each column - a text box to add a card will appear. Cards can be simple textual notes which you type into the text box and pres `Enter` when finished. Cards can also be (links to) existing issues and pull requests, which can be filtered out from the text box by pressing `#` (to activate GitHub's referencing mechanism) and selecting the repository and an issue or pull request from that repository that you want to add.
+ You can add or remove columns from your project board to suit your use case.
+ One commonly seen extra column is `On hold` or `Waiting` -
+ if you have tasks that get held up by waiting on other people
+ (e.g. to respond to your questions)
+ then moving them to a separate column makes their current state clearer.
+
+ To add a new column,
+ press the `+` button on the right;
+ to remove a column select the `...` button in the top right corner of the column itself
+ and then the `Delete column` option.
+
+6. You can now add new items (cards) to columns by pressing
+ the `+ Add item` button at the bottom of each column -
+ a text box to add a card will appear.
+ Cards can be simple textual notes
+ which you type into the text box and press `Enter` when finished.
+ Cards can also be (links to) existing issues and pull requests,
+ which can be filtered out from the text box by pressing `#`
+ (to activate GitHub's referencing mechanism)
+ and selecting the repository
+ and an issue or pull request from that repository that you want to add.
{: .image-with-shadow width="1000px"}
- Notes contain task descriptions and can have detailed content like checklists. In some cases, e.g. if a note becomes
- too complex, you may want to convert it into an issue so you can add labels, assign them to team members or
- write more detailed comments (for that, use the `Convert to issue`
- option from the `...` menu on the card itself).
-
+ Notes contain task descriptions and can have detailed content like checklists.
+ In some cases, e.g. if a note becomes too complex,
+ you may want to convert it into an issue so you can add labels,
+ assign them to team members
+ or write more detailed comments
+ (for that, use the `Convert to issue` option from the `...` menu on the card itself).
+
{: .image-with-shadow width="1000px"}
-7. You can now drag a card to `In Progress` column to indicate that you are working on it or to the `Done` column to indicate
-that it has been completed. Issues and pull requests on cards will automatically be moved to the `Done` column for you when you close the issue or merge the pull request - which is very convenient and can save you some project management time.
+7. You can now drag a card to the `In Progress` column to indicate that you are working on it
+ or to the `Done` column to indicate that it has been completed.
+ Issues and pull requests on cards will automatically be moved to the `Done` column for you
+ when you close the issue or merge the pull request -
+ which is very convenient and can save you some project management time.
> ## Exercise: Working With Projects
-> Spend a few minutes planning what you want to do with your project as a bigger chunk of work (you can continue working on the
-> first release of your software if you like)
+> Spend a few minutes planning what you want to do with your project as a bigger chunk of work
+> (you can continue working on the first release of your software if you like)
> and play around with your project board to manage tasks around the project:
+>
> - practice adding and removing columns,
-> - practice adding different types of cards (notes and from already existing open issues and/or unmerged pull requests),
+> - practice adding different types of cards
+> (notes and from already existing open issues and/or unmerged pull requests),
> - practice turning cards into issues and closing issues, etc.
->
-> Make sure to add a certain number of issues to your repository to be able to use in you project board.
->
+>
+> Make sure you have added a few issues to your repository
+> so that you can use them in your project board.
+>
> Time: 10 mins
{: .challenge}
> ## Prioritisation With Project Boards
-> Once your project board has a large number of cards on it, you might want to begin priorisiting them.
-Not all tasks are going to be equally important, and some will require others to be completed before they
-can even be begun. Common methods of prioritisation include:
-- **Vertical position**: the vertical arrangement of cards in a column implicitly represents their importance.
-High-priority issues go to the top of `To Do`, whilst tasks that depend on others go beneath them.
-This is the easiest one to implement, though you have to remember to correctly place cards when you add them.
-- **Priority columns**: instead of a single `To Do` column, you can have two or more, for example -
-`To Do: Low Priority` and `To Do: High Priority`. When adding a card, you pick which is the appropriate column for it.
-You can even add a `Triage` column for newly-added issues that you’ve not yet had time to classify.
-This format works well for project boards devoted to bugs.
-- **Labels**: if you convert each card into an issue, then you can label them with their priority - remember GitHub
-lets you create custom labels and set their colours. Label colours can provide a very visually clear indication of
-issue priority but require more administrative work on the project, as each card has to be an issue to be assigned a
-label. If you choose this route for issue prioritisation - be aware of accessibility issues for colour-blind people
-when picking colours.
+> Once your project board has a large number of cards on it,
+> you might want to begin prioritising them.
+> Not all tasks are going to be equally important,
+> and some will require others to be completed before they can even be begun.
+> Common methods of prioritisation include:
+>
+> - **Vertical position**:
+> the vertical arrangement of cards in a column implicitly represents their importance.
+> High-priority issues go to the top of `To Do`,
+> whilst tasks that depend on others go beneath them.
+> This is the easiest one to implement,
+> though you have to remember to correctly place cards when you add them.
+> - **Priority columns**: instead of a single `To Do` column,
+> you can have two or more, for example -
+> `To Do: Low Priority` and `To Do: High Priority`.
+> When adding a card, you pick which is the appropriate column for it.
+> You can even add a `Triage` column for newly-added issues
+> that you’ve not yet had time to classify.
+> This format works well for project boards devoted to bugs.
+> - **Labels**: if you convert each card into an issue,
+> then you can label them with their priority -
+> remember GitHub lets you create custom labels and set their colours.
+> Label colours can provide a very visually clear indication of issue priority
+> but require more administrative work on the project,
+> as each card has to be an issue to be assigned a label.
+> If you choose this route for issue prioritisation -
+> be aware of accessibility issues for colour-blind people when picking colours.
>
{: .callout}
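If you do encode priority in labels, that information is also straightforward to work with programmatically. A minimal sketch with invented label names and issue data (not any real GitHub API):

```python
# Sort issues by a custom priority-label order; unlabelled issues sort last,
# acting like a "triage" bucket. The label names and issues are illustrative.
PRIORITY_ORDER = {"priority:high": 0, "priority:medium": 1, "priority:low": 2}

def by_priority(issues):
    """Return issues sorted from highest to lowest priority label."""
    def rank(issue):
        return min(
            (PRIORITY_ORDER[label] for label in issue["labels"]
             if label in PRIORITY_ORDER),
            default=len(PRIORITY_ORDER),  # no priority label: sort last
        )
    return sorted(issues, key=rank)

issues = [
    {"title": "Add a licence file", "labels": ["documentation", "priority:low"]},
    {"title": "Crash on empty input", "labels": ["bug", "priority:high"]},
    {"title": "New plot type", "labels": ["enhancement"]},
]
print([i["title"] for i in by_priority(issues)])
# → ['Crash on empty input', 'Add a licence file', 'New plot type']
```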
diff --git a/_episodes/52-assessing-software-suitability-improvement.md b/_episodes/52-assessing-software-suitability-improvement.md
index a275b6d22..6d2079c0f 100644
--- a/_episodes/52-assessing-software-suitability-improvement.md
+++ b/_episodes/52-assessing-software-suitability-improvement.md
@@ -16,47 +16,84 @@ keypoints:
## Introduction
-What we've been looking at so far enables us to adopt a more proactive and managed attitude and approach to the software we develop. But we should also adopt this attitude when selecting and making use of third-party software we wish to use. With pressing deadlines it's very easy to reach for a piece of software that appears to do what you want without considering properly whether it's a good fit for your project first. A chain is only as strong as its weakest link, and our software may inherit weaknesses in any dependent software or create other problems.
+What we've been looking at so far enables us to adopt
+a more proactive and managed attitude and approach to the software we develop.
+But we should also adopt this attitude
+when selecting and making use of third-party software.
+With pressing deadlines it's very easy to reach for
+a piece of software that appears to do what you want
+without considering properly whether it's a good fit for your project first.
+A chain is only as strong as its weakest link,
+and our software may inherit weaknesses in any dependent software or create other problems.
-Overall, when adopting software to use it's important to consider not only whether it has the functionality you want,
-but a broader range of qualities that are important for your project. Adopting a critical mindset when assessing other
-software against suitability criteria will help you adopt the same attitude when assessing your own software for future improvements.
+Overall, when adopting software to use it's important to consider
+not only whether it has the functionality you want,
+but a broader range of qualities that are important for your project.
+Adopting a critical mindset when assessing other software against suitability criteria
+will help you adopt the same attitude when assessing your own software for future improvements.
## Assessing Software for Suitability
> ## Exercise: Decide on Your Group's Repository!
>
-> You all have your code repositories you have been working on throughout the course so far. For the upcoming exercise, groups will exchange repositories and review the code of the repository they inherit, and provide feedback.
+> You all have your code repositories you have been working on throughout the course so far.
+> For the upcoming exercise,
+> groups will exchange repositories and review the code of the repository they inherit,
+> and provide feedback.
>
> Time: 5 mins
>
-> 1. Decide as a team on one of your repositories that will represent your group. You can do this any way you wish.
-> 2. Add the URL of the repository to the section of the shared notes labelled 'Decide on your Group's Repository',
-> next to your team's name.
+> 1. Decide as a team on one of your repositories that will represent your group.
+> You can do this any way you wish.
+> 2. Add the URL of the repository to
+> the section of the shared notes labelled 'Decide on your Group's Repository',
+> next to your team's name.
{: .challenge}
> ## Exercise: Conduct Assessment on Third-Party Software
>
-> *The scenario:* It is envisaged that a piece of software developed by another team will be adopted and used for the long term in a number of future projects. You have been tasked with conducting an assessment of this software to identify any issues that need resolving prior to working with it, and will provide feedback to the developing team to fix these issues.
+> *The scenario:* It is envisaged that a piece of software developed by another team will be
+> adopted and used for the long term in a number of future projects.
+> You have been tasked with conducting an assessment of this software
+> to identify any issues that need resolving prior to working with it,
+> and will provide feedback to the developing team to fix these issues.
>
> Time: 20 mins
>
-> 1. As a team, briefly decide who will assess which aspect of the repository, e.g. its documentation, tests, codebase, etc.
-> 2. Obtain the URL for the repository you will assess from the shared notes document, in the section labelled 'Decide on your Group's Repository' - see the last column which indicates which team's repository you are assessing.
-> 3. Conduct the assessment and register any issues you find on the other team's software repository on GitHub.
+> 1. As a team, briefly decide who will assess which aspect of the repository,
+> e.g. its documentation, tests, codebase, etc.
+> 2. Obtain the URL for the repository you will assess from the shared notes document,
+> in the section labelled 'Decide on your Group's Repository' -
+> see the last column which indicates which team's repository you are assessing.
+> 3. Conduct the assessment
+> and register any issues you find on the other team's software repository on GitHub.
> 4. Be meticulous in your assessment and register as many issues as you can!
{: .challenge}
> ## Supporting Your Software - How and How Much?
>
-> Within your collaborations and projects, what should you do to support other users? Here are some key aspects to consider:
+> Within your collaborations and projects, what should you do to support other users?
+> Here are some key aspects to consider:
>
-> - Provide contact information: so users know what to do and how to get in contact if they run into problems
-> - Manage your support: an issue tracker - like the one in GitHub - is essential to track and manage issues
-> - Manage expectations: let users know the level of support you offer, in terms of when they can expect responses to queries, the scope of support (e.g. which platforms, types of releases, etc.), the types of support (e.g. bug resolution, helping develop tailored solutions), and expectations for support in the future (e.g. when project funding runs out)
+> - Provide contact information:
+> so users know what to do and how to get in contact if they run into problems
+> - Manage your support:
+> an issue tracker - like the one in GitHub - is essential to track and manage issues
+> - Manage expectations:
+> let users know the level of support you offer,
+> in terms of when they can expect responses to queries,
+> the scope of support (e.g. which platforms, types of releases, etc.),
+> the types of support (e.g. bug resolution, helping develop tailored solutions),
+> and expectations for support in the future (e.g. when project funding runs out)
>
-> All of this requires effort, and you can't do everything. It's therefore important to agree and be clear on how the software will be supported from the outset, whether it's within the context of a single laboratory, project, or other collaboration, or across an entire community.
+> All of this requires effort, and you can't do everything.
+> It's therefore important to agree and be clear on
+> how the software will be supported from the outset,
+> whether it's within the context of a single laboratory,
+> project,
+> or other collaboration,
+> or across an entire community.
{: .callout}
{% include links.md %}
diff --git a/_episodes/53-improvement-through-feedback.md b/_episodes/53-improvement-through-feedback.md
index d00ed9956..2576824c1 100644
--- a/_episodes/53-improvement-through-feedback.md
+++ b/_episodes/53-improvement-through-feedback.md
@@ -21,50 +21,109 @@ keypoints:
## Introduction
-When a software project has been around for even just a short amount of time, you'll likely discover many aspects that can be improved. These can come from issues that have been registered via collaborators or users, but also those you're aware of internally, which should also be registered as issues. When starting a new software project, you'll also have to determine how you'll handle all the requirements. But which ones should you work on first, which are the most important and why, and how should you organise all this work?
-
-Software has a fundamental role to play in doing science, but unfortunately software development is often given short shrift in academia when it comes to prioritising effort. There are also many other draws on our time in addition to the research, development, and writing of publications that we do, which makes it all the more important to prioritise our time for development effectively.
-
-In this lesson we'll be looking at prioritising work we need to do and what we can use from the agile perspective of project management to help us do this in our software projects.
+When a software project has been around for even just a short amount of time,
+you'll likely discover many aspects that can be improved.
+These can come from issues that have been registered via collaborators or users,
+but also those you're aware of internally,
+which should also be registered as issues.
+When starting a new software project,
+you'll also have to determine how you'll handle all the requirements.
+But which ones should you work on first,
+which are the most important and why,
+and how should you organise all this work?
+
+Software has a fundamental role to play in doing science,
+but unfortunately software development is often
+given short shrift in academia when it comes to prioritising effort.
+There are also many other draws on our time
+in addition to the research, development, and writing of publications that we do,
+which makes it all the more important to prioritise our time for development effectively.
+
+In this lesson we'll be looking at prioritising work we need to do
+and what we can use from the agile perspective of project management
+to help us do this in our software projects.
## Estimation as a Foundation for Prioritisation
-For simplicity, we'll refer to our issues as *requirements*, since that's essentially what they are - new requirements for our software to fulfil.
+For simplicity, we'll refer to our issues as *requirements*,
+since that's essentially what they are -
+new requirements for our software to fulfil.
-But before we can prioritise our requirements, there are some things we need to find out.
+But before we can prioritise our requirements,
+there are some things we need to find out.
Firstly, we need to know:
-- *The period of time we have to resolve these requirements* - e.g. before the next software release, pivotal demonstration, or other deadlines requiring their completion. This is known as a **timebox**. This might be a week or two, but for agile, this should not be longer than a month. Longer deadlines with more complex requirements may be split into a number of timeboxes.
-- *How much overall effort we have available* - i.e. who will be involved and how much of their time we will have during this period
-
-We also need estimates for how long each requirement will take to resolve, since we cannot meaningfully prioritise requirements without knowing what the effort tradeoffs will be. Even if we know how important each requirement is, how would we even know if completing the project is possible? Or if we don't know how long it will take to deliver those requirements we deem to be critical to the success of a project, how can we know if we can include other less important ones?
-
-It is often not the reality, but estimation should ideally be done by the people likely to do the actual work (i.e. the Research Software Engineers, researchers, or developers). It shouldn't be done by project managers or PIs simply because they are not best placed to estimate, and those doing the work are the ones who are effectively committing to these figures.
+- *The period of time we have to resolve these requirements* -
+ e.g. before the next software release, pivotal demonstration,
+ or other deadlines requiring their completion.
+ This is known as a **timebox**.
+ This might be a week or two, but for agile, this should not be longer than a month.
+ Longer deadlines with more complex requirements may be split into a number of timeboxes.
+- *How much overall effort we have available* -
+  i.e. who will be involved and how much of their time we will have during this period.
+
+We also need estimates for how long each requirement will take to resolve,
+since we cannot meaningfully prioritise requirements without
+knowing what the effort tradeoffs will be.
+Even if we know how important each requirement is,
+how would we even know if completing the project is possible?
+Or if we don't know how long it will take
+to deliver those requirements we deem to be critical to the success of a project,
+how can we know if we can include other less important ones?
+
+It is often not the reality,
+but estimation should ideally be done by the people likely to do the actual work
+(i.e. the Research Software Engineers, researchers, or developers).
+It shouldn't be done by project managers or PIs
+simply because they are not best placed to estimate,
+and those doing the work are the ones who are effectively committing to these figures.
> ## Why is it so Difficult to Estimate?
>
-> Estimation is a very valuable skill to learn, and one that is often difficult. Lack of experience in estimation can play a part, but a number of psychological causes can also contribute. One of these is [Dunning-Kruger](https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect), a type of cognitive bias in which people tend to overestimate their abilities, whilst in opposition to this is [imposter syndrome](https://en.wikipedia.org/wiki/Impostor_syndrome), where due to a lack of confidence people underestimate their abilities. The key message here is to be honest about what you can do, and find out as much information that is reasonably appropriate before arriving at an estimate.
+> Estimation is a very valuable skill to learn, and one that is often difficult.
+> Lack of experience in estimation can play a part,
+> but a number of psychological causes can also contribute.
+> One of these is [Dunning-Kruger](https://en.wikipedia.org/wiki/Dunning%E2%80%93Kruger_effect),
+> a type of cognitive bias in which people tend to overestimate their abilities,
+> whilst in opposition to this is [imposter syndrome](https://en.wikipedia.org/wiki/Impostor_syndrome),
+> where due to a lack of confidence people underestimate their abilities.
+> The key message here is to be honest about what you can do,
+> and find out as much information as is reasonably appropriate before arriving at an estimate.
>
-> More experience in estimation will also help to reduce these effects. So keep estimating!
+> More experience in estimation will also help to reduce these effects.
+> So keep estimating!
{: .callout}
-An effective way of helping to make your estimates more accurate is to do it as a team. Other members can ask prudent questions that may not have been considered, and bring in other sanity checks and their own development experience. Just talking things through can help uncover other complexities and pitfalls, and raise crucial questions to clarify ambiguities.
+An effective way of helping to make your estimates more accurate is to do it as a team.
+Other members can ask prudent questions that may not have been considered,
+and bring in other sanity checks and their own development experience.
+Just talking things through can help uncover other complexities and pitfalls,
+and raise crucial questions to clarify ambiguities.
> ## Where to Record Effort Estimates?
> There is no dedicated place to record the effort estimate on an issue in current GitHub's interface.
> Therefore, you can agree on a convention within your team on how to record this information -
> e.g. you can add the effort in person/days in the issue title.
-> Recording estimates within comments on an issue may not be the best idea as it may get lost among other comments.
-> Another place where you can record estimates for your issues is on project boards - there is no default field for this
-> but you can create a custom numeric field and use it to assign effort estimates (note that
-> you cannot sum them yet in the current GitHub's interface).
+> Recording estimates within comments on an issue may not be the best idea
+> as they may get lost among other comments.
+> Another place where you can record estimates for your issues is on project boards -
+> there is no default field for this but you can create a custom numeric field
+> and use it to assign effort estimates
+> (note that you cannot sum them yet in the current GitHub interface).
{: .callout}
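
If your team records estimates in issue titles as suggested above, summing them is straightforward.
As a minimal sketch (the titles and the `[Nd]` person-days prefix convention here are invented for
illustration, not a GitHub feature):

```python
import re

# Hypothetical convention: effort estimate in person-days recorded in
# square brackets at the start of each issue title, e.g. "[2d] Fix parser".
issue_titles = [
    "[2d] Fix inflammation data parser",
    "[1d] Add docstrings to models module",
    "[3d] Set up continuous integration",
]

def estimate_days(title):
    """Extract the estimate in days from a '[Nd]' title prefix, or None if absent."""
    match = re.match(r"\[(\d+)d\]", title)
    return int(match.group(1)) if match else None

# Sum only the issues that carry an estimate
total = sum(e for e in map(estimate_days, issue_titles) if e is not None)
print(f"Total estimated effort: {total} person-days")
```

A script like this also makes it easy to spot issues that are missing an estimate entirely,
since `estimate_days()` returns `None` for them.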
> ## Exercise: Estimate!
>
-> As a team go through the issues that your partner team has registered with your software repository, and quickly estimate how long each issue will take to resolve in minutes. Do this by blind consensus first, each anonymously submitting an estimate, and then briefly discuss your rationale and decide on a final estimate. Make sure these are honest estimates, and you are able to complete them in the allotted time!
+> As a team
+> go through the issues that your partner team has registered with your software repository,
+> and quickly estimate how long each issue will take to resolve in minutes.
+> Do this by blind consensus first,
+> each anonymously submitting an estimate,
+> and then briefly discuss your rationale and decide on a final estimate.
+> Make sure these are honest estimates,
+> and you are able to complete them in the allotted time!
>
> Time: 15 mins
{: .challenge}
@@ -72,54 +131,104 @@ An effective way of helping to make your estimates more accurate is to do it as
# Using MoSCoW to Prioritise Work
-Now we have our estimates we can decide how important each requirement is to the success of the project. This should be decided by the project stakeholders; those - or their representatives - who have a stake in the success of the project and are either directly affected or affected by the project, e.g. Principle Investigators, researchers, Research Software Engineers, collaborators, etc.
-
-To prioritise these requirements we can use a method called **MoSCoW**, a way to reach a common understanding with stakeholders on the importance of successfully delivering each requirement for a timebox. MoSCoW is an acronym that stands for **Must have**, **Should have**, **Could have**, and **Won't have**. Each requirement is discussed by the stakeholder group and falls into one of these categories:
-
-- *Must Have* (MH) - these requirements are critical to the current timebox for it to succeed. Even the inability to deliver just one of these would cause the project to be considered a failure.
-- *Should Have* (SH) - these are important requirements but not *necessary* for delivery in the timebox. They may be as *important* as Must Haves, but there may be other ways to achieve them or perhaps they can be held back for a future development timebox.
-- *Could Have* (CH) - these are desirable but not necessary, and each of these will be included in this timebox if it can be achieved.
-- *Won't Have* (WH) - these are agreed to be out of scope for this timebox, perhaps because they are the least important or not critical for this phase of development.
-
-In typical use, the ratio to aim for of requirements to the MH/SH/CH categories is 60%/20%/20% for a particular timebox. Importantly, the division is by the requirement *estimates*, not by number of requirements, so 60% means 60% of the overall estimated effort for requirements are Must Haves.
-
-Why is this important? Because it gives you a unique degree of control of your project for each time period. It awards you 40% of flexibility with allocating your effort depending on what's critical and how things progress. This effectively forces a tradeoff between the effort available and critical objectives, maintaining a significant safety margin. The idea is that as a project progresses, even if it becomes clear that you are only able to deliver the Must Haves for a particular time period, you have still delivered it *successfully*.
+Now we have our estimates, we can decide
+how important each requirement is to the success of the project.
+This should be decided by the project stakeholders;
+those - or their representatives -
+who have a stake in the success of the project
+and are either directly or indirectly affected by the project,
+e.g. Principal Investigators,
+researchers,
+Research Software Engineers,
+collaborators, etc.
+
+To prioritise these requirements we can use a method called **MoSCoW**,
+a way to reach a common understanding with stakeholders
+on the importance of successfully delivering each requirement for a timebox.
+MoSCoW is an acronym that stands for
+**Must have**,
+**Should have**,
+**Could have**,
+and **Won't have**.
+Each requirement is discussed by the stakeholder group and falls into one of these categories:
+
+- *Must Have* (MH) -
+ these requirements are critical to the current timebox for it to succeed.
+ Even the inability to deliver just one of these would
+ cause the project to be considered a failure.
+- *Should Have* (SH) -
+ these are important requirements but not *necessary* for delivery in the timebox.
+ They may be as *important* as Must Haves,
+ but there may be other ways to achieve them
+ or perhaps they can be held back for a future development timebox.
+- *Could Have* (CH) -
+ these are desirable but not necessary,
+ and each of these will be included in this timebox if it can be achieved.
+- *Won't Have* (WH) -
+ these are agreed to be out of scope for this timebox,
+ perhaps because they are the least important or not critical for this phase of development.
+
+In typical use, the ratio of requirements to aim for across the MH/SH/CH categories
+is 60%/20%/20% for a particular timebox.
+Importantly, the division is by the requirement *estimates*,
+not by number of requirements,
+so 60% means 60% of the overall estimated effort for requirements are Must Haves.
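
As a rough sketch of that arithmetic (the issues, categories, and estimates below are
invented for illustration), the split is computed over the estimated effort, not over
the number of issues:

```python
# Hypothetical prioritised issues: title -> (MoSCoW category, estimate in hours)
issues = {
    "Fix data loading bug": ("Must Have", 4),
    "Add regression tests": ("Must Have", 8),
    "Improve README": ("Should Have", 3),
    "Refactor plotting code": ("Could Have", 4),
}

# Total the estimated effort per category
totals = {"Must Have": 0, "Should Have": 0, "Could Have": 0}
for category, estimate in issues.values():
    totals[category] += estimate

overall = sum(totals.values())

# Compare each category's share of effort against the 60/20/20 guideline
for category, target in [("Must Have", 60), ("Should Have", 20), ("Could Have", 20)]:
    share = 100 * totals[category] / overall
    print(f"{category}: {share:.0f}% of effort (guideline ~{target}%)")
```

Here the Must Haves account for 12 of the 19 estimated hours (roughly 63%), close to the
60% guideline even though they are only two of the four issues.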
+
+Why is this important?
+Because it gives you a unique degree of control of your project for each time period.
+It awards you 40% of flexibility with allocating your effort
+depending on what's critical and how things progress.
+This effectively forces a tradeoff between the effort available and critical objectives,
+maintaining a significant safety margin.
+The idea is that as a project progresses,
+even if it becomes clear that you are only able to
+deliver the Must Haves for a particular time period,
+you have still delivered it *successfully*.
### GitHub's Milestones
-Once we've decided on those we'll work on (i.e. not Won't Haves), we can (optionally) use a GitHub's
-**Milestone** to organise them for a particular timebox. Remember, a milestone is a collection of issues to be worked on in a given period
-(or timebox). We can create a new one by selecting `Issues` on our repository, then `Milestones` to display any
-existing milestones, then clicking the "New milestone" button to the right.
+Once we've decided on those we'll work on (i.e. not Won't Haves),
+we can (optionally) use a GitHub **Milestone** to organise them for a particular timebox.
+Remember, a milestone is a collection of issues to be worked on in a given period (or timebox).
+We can create a new one by selecting `Issues` on our repository,
+then `Milestones` to display any existing milestones,
+then clicking the "New milestone" button to the right.
+
{: .image-with-shadow width="1000px"}
{: .image-with-shadow width="1000px"}
-We add in a title, a completion date (i.e. the end of this timebox),
+We add in a title,
+a completion date (i.e. the end of this timebox),
and any description for the milestone.
{: .image-with-shadow width="800px"}
-Once created, we can view our issues and assign them to our milestone from the `Issues` page or from an individual
-issue page.
+Once created, we can view our issues
+and assign them to our milestone from the `Issues` page or from an individual issue page.
{: .image-with-shadow width="1000px"}
-
-Let's now use Milestones to plan and prioritise our team's next sprint.
+
+Let's now use Milestones to plan and prioritise our team's next sprint.
> ## Exercise: Prioritise!
>
-> Put your stakeholder hats on, and as a team apply MoSCoW to the repository issues to determine how you will prioritise effort to resolve them in the allotted time. Try to stick to the 60/20/20 rule, and assign all issues you'll be working on (i.e. not `Won't Haves`) to a new milestone, e.g. "Tidy up documentation" or "version 0.1".
->
+> Put your stakeholder hats on, and as a team apply MoSCoW to the repository issues
+> to determine how you will prioritise effort to resolve them in the allotted time.
+> Try to stick to the 60/20/20 rule,
+> and assign all issues you'll be working on (i.e. not `Won't Haves`) to a new milestone,
+> e.g. "Tidy up documentation" or "version 0.1".
+>
> Time: 10 mins
{: .challenge}
{% comment %}
> ## Milestones in Project Boards
-> Milestones are also visible on project boards. If an issue or pull request belongs to a milestone,
-> the name of the milestone will be displayed on the project card.
-> You can add or remove an issue or pull request from milestones using the details sidebar, and filter your project
-> cards by milestones using the search bar.
+> Milestones are also visible on project boards.
+> If an issue or pull request belongs to a milestone,
+> the name of the milestone will be displayed on the project card.
+> You can add or remove an issue or pull request from milestones using the details sidebar,
+> and filter your project cards by milestones using the search bar.
>
> {: .image-with-shadow width="900px"}
{: .callout}
@@ -127,19 +236,42 @@ Let's now use Milestones to plan and prioritise our team's next sprint.
## Using Sprints to Organise and Work on Issues
-A sprint is an activity applied to a timebox, where development is undertaken on the agreed prioritised work for the period. In a typical sprint, there are daily meetings called **scrum meetings** which check on how work is progressing, and serves to highlight any blockers and challenges to meeting the sprint goal.
+A sprint is an activity applied to a timebox,
+where development is undertaken on the agreed prioritised work for the period.
+In a typical sprint, there are daily meetings called **scrum meetings**
+which check on how work is progressing,
+and serve to highlight any blockers and challenges to meeting the sprint goal.
> ## Exercise: Conduct a Mini Mini-Sprint
>
-> For the remaining time in this course, assign repository issues to team members and work on resolving them as per your MoSCoW breakdown. Once an issue has been resolved, notable progress made, or an impasse has been reached, provide concise feedback on the repository issue. Be sure to add the other team members to the chosen repository so they have access to it. You can grant `Write` access to others on a GitHub repository via the `Settings` tab for a repository, then selecting `Collaborators`, where you can invite other GitHub users to your repository with specific permissions.
+> For the remaining time in this course,
+> assign repository issues to team members and work on resolving them as per your MoSCoW breakdown.
+> Once an issue has been resolved, notable progress made, or an impasse has been reached,
+> provide concise feedback on the repository issue.
+> Be sure to add the other team members to the chosen repository so they have access to it.
+> You can grant `Write` access to others on a GitHub repository
+> via the `Settings` tab for a repository, then selecting `Collaborators`,
+> where you can invite other GitHub users to your repository with specific permissions.
>
> Time: however long is left
{: .challenge}
-Depending on how many issues were registered on your repository, it's likely you won't have resolved all the issues in this first milestone. Of course, in reality, a sprint would be over a much longer period of time. In any event, as the development progresses into future sprints any unresolved issues can be reconsidered and prioritised for another milestone, which are then taken forward, and so on. This process of receiving new requirements, prioritisation, and working on them is naturally continuous - with the benefit that at key stages you are repeatedly **re-evaluating what is important and needs to be worked on** which helps to ensure real concrete progress against project goals and requirements which may change over time.
+Depending on how many issues were registered on your repository,
+it's likely you won't have resolved all the issues in this first milestone.
+Of course, in reality, a sprint would be over a much longer period of time.
+In any event, as the development progresses into future sprints
+any unresolved issues can be reconsidered and prioritised for another milestone,
+which are then taken forward, and so on.
+This process of receiving new requirements, prioritisation,
+and working on them is naturally continuous -
+with the benefit that at key stages
+you are repeatedly **re-evaluating what is important and needs to be worked on**,
+which helps to ensure real concrete progress against project goals and requirements
+which may change over time.
> ## Project Boards For Planning Sprints
-> Remember, you can use project boards for higher-level project management - e.g. planning several sprints in advance
+> Remember, you can use project boards for higher-level project management -
+> e.g. planning several sprints in advance
> (and use milestones for tracking progress on individual sprints).
{: .callout}
diff --git a/_episodes/60-wrap-up.md b/_episodes/60-wrap-up.md
index 544d03a23..8dcf1b27f 100644
--- a/_episodes/60-wrap-up.md
+++ b/_episodes/60-wrap-up.md
@@ -27,47 +27,105 @@ Efficient, performant + scalable - trade off
Secure
Discoverable – others can understand quickly + easily
Simple – modular
-Pick the properties that are relevant to your project - e.g. trade off between time, efficiency and performance,
-the levels of software reusability - this will dictate practices and the level of development. This can lead to a discussion.
+Pick the properties that are relevant to your project -
+e.g. trade off between time, efficiency and performance,
+the levels of software reusability - this will dictate practices and the level of development.
+This can lead to a discussion.
Reiterate some of the key messages.
{% endcomment %}
## Summary
-As part of this course we have looked at a core set of established, intermediate-level software development tools and
-best practices for working as part of a team. The course teaches a selected subset of skills
-that have been tried and tested in collaborative research software development environments, although not an
-all-encompassing set of every skill you might need (check some [further reading](./#further-resources)). It will
-provide you with a solid basis for writing industry-grade code, which relies on the same best practices taught in this course:
+As part of this course we have looked at a core set of
+established, intermediate-level software development tools and best practices
+for working as part of a team.
+The course teaches a selected subset of skills that have been tried and tested
+in collaborative research software development environments,
+although not an all-encompassing set of every skill you might need
+(check some [further reading](./#further-resources)).
+It will provide you with a solid basis for writing industry-grade code,
+which relies on the same best practices taught in this course:
-- Collaborative techniques and tools play an important part of research software development in teams, but also have benefits in solo development. We've looked at the benefits of a well-considered development environment, using practices, tools and infrastructure to help us write code more effectively in collaboration with others.
-- We've looked at the importance of being able to verify the correctness of software and automation, and how we can leverage techniques and infrastructure to automate and scale tasks such as testing to save us time - but automation has a role beyond simply testing: what else can you automate that would save you even more time? Once found, we've also examined how to locate faults in our software.
-- We've gone beyond procedural programming and explored different software design paradigms, such as object-oriented and functional styles of programming. We've contrasted their pros, cons, and the situations in which they work best, and how separation of concerns through modularity and architectural design can help shape good software.
-- As an intermediate developer, aspects other than technical skills become important, particularly in development teams. We've looked at the importance of good, consistent practices for team working, and the importance of having a self-critical mindset when developing software, and ways to manage feedback effectively and efficiently.
+- Collaborative techniques and tools play an important part
+  in research software development in teams,
+ but also have benefits in solo development.
+ We've looked at the benefits of a well-considered development environment,
+ using practices, tools and infrastructure
+ to help us write code more effectively in collaboration with others.
+- We've looked at the importance of being able to
+ verify the correctness of software and automation,
+ and how we can leverage techniques and infrastructure
+ to automate and scale tasks such as testing to save us time -
+ but automation has a role beyond simply testing:
+ what else can you automate that would save you even more time?
+  We've also examined how to locate faults in our software once they are found.
+- We've gone beyond procedural programming and explored different software design paradigms,
+ such as object-oriented and functional styles of programming.
+ We've contrasted their pros, cons, and the situations in which they work best,
+ and how separation of concerns through modularity and architectural design
+ can help shape good software.
+- As an intermediate developer,
+ aspects other than technical skills become important,
+ particularly in development teams.
+ We've looked at the importance of good,
+ consistent practices for team working,
+ and the importance of having a self-critical mindset when developing software,
+ and ways to manage feedback effectively and efficiently.
> ## Reflection Exercise: Putting the Pieces Together
-> As a group, reflect on the concepts (e.g. tools, techniques and practices) covered throughout the course, how they relate to one another, how they fit together in a bigger picture or skill learning pathways and in which order you need to learn them.
->> ## Solution
->> One way to think about these concepts is to make a list and try to organise them along two axes - 'perceived usefulness of a concept' versus 'perceived difficulty or time needed to master a concept', as shown in the table below (for the exercise, you can make your own copy of the [template table](https://docs.google.com/document/d/1NdE6PjqxjSsf1K4ofkCoWc2GA3sY2RIsjRg8BghTXas/edit?usp=sharing) for the purpose of this exercise). You then may
->> think in which order you want to learn the skills and how much effort they require - e.g. start with those that are more useful but, for the time being, hold off those that are not too useful to you and take loads of time to master. You will likely want to focus on the concepts in the top right corner of the table first, but
->> investing time to master more difficult concepts may pay off in the long run by saving you time and effort
->> and helping reduce technical debt.
->> {: .image-with-shadow width="800px"}
->>
->> Another way you can organise the concepts is using a [concept map](https://en.wikipedia.org/wiki/Concept_map) (a directed graph depicting suggested relationships between concepts) or any other diagram/visual aid of your choice.
->> Below are some example views of tools and techniques covered in the course using concept maps. Your views
->> may differ but that is not to say that either view is right or wrong. This exercise is meant to get you to reflect on what was covered in the course and hopefully to reinforce the ideas and concepts you learned.
->> {: .image-with-shadow width="800px"}
->> A different concept map tries to organise concepts/skills based on their level of difficulty (novice, intermediate and advanced, and in-between!) and tries to show which skills are prerequisite for others and in which order you should consider learning skills.
->> {: .image-with-shadow width="800px"}
+> As a group, reflect on the concepts
+> (e.g. tools, techniques and practices)
+> covered throughout the course,
+> how they relate to one another,
+> how they fit together in a bigger picture or skill learning pathways
+> and in which order you need to learn them.
+> > ## Solution
+> > One way to think about these concepts is to
+> > make a list and try to organise them along two axes -
+> > 'perceived usefulness of a concept' versus
+> > 'perceived difficulty or time needed to master a concept',
+> > as shown in the table below
+> > (for the exercise, you can make your own copy of the
+> > [template table](https://docs.google.com/document/d/1NdE6PjqxjSsf1K4ofkCoWc2GA3sY2RIsjRg8BghTXas/edit?usp=sharing)
+> > for the purpose of this exercise).
+> > You may then think about the order in which you want to learn the skills
+> > and how much effort they require -
+> > e.g. start with those that are more useful and, for the time being,
+> > hold off on those that are less useful to you and take a long time to master.
+> > You will likely want to focus on the concepts in the top right corner of the table first,
+> > but investing time to master more difficult concepts may pay off in the long run
+> > by saving you time and effort and helping reduce technical debt.
+> >
+> > {: .image-with-shadow width="800px"}
+> >
+> > Another way you can organise the concepts is using a
+> > [concept map](https://en.wikipedia.org/wiki/Concept_map)
+> > (a directed graph depicting suggested relationships between concepts)
+> > or any other diagram/visual aid of your choice.
+> > Below are some example views of tools and techniques covered in the course using concept maps.
+> > Your views may differ but that is not to say that either view is right or wrong.
+> > This exercise is meant to get you to reflect on what was covered in the course
+> > and hopefully to reinforce the ideas and concepts you learned.
+> >
+> > {: .image-with-shadow width="800px"}
+> >
+> > A different concept map tries to organise concepts/skills based on their level of difficulty
+> > (novice, intermediate and advanced, and in-between!)
+> > and tries to show which skills are prerequisite for others
+> > and in which order you should consider learning skills.
+> >
+> > {: .image-with-shadow width="800px"}
> {: .solution}
{: .challenge}
## Further Resources
+
Below are some additional resources to help you continue learning:
- [Additional episode on persisting data](../persistence)
- [Additional episode on databases](../databases)
-- [CodeRefinery courses on FAIR (Findable, Accessible, Interoperable, and Reusable) software practices][coderefinery-lessons]
+- [CodeRefinery courses on FAIR
+ (Findable, Accessible, Interoperable, and Reusable)
+ software practices][coderefinery-lessons]
- [Python documentation][python-documentation]
- [GitHub Actions documentation][github-actions]
diff --git a/_extras/common-issues.md b/_extras/common-issues.md
index ec741bc38..82ebe7817 100644
--- a/_extras/common-issues.md
+++ b/_extras/common-issues.md
@@ -2,36 +2,54 @@
title: "Common Issues, Fixes & Tips"
---
-Here is a list of issues previous participants of the course encountered and some tips to help you with troubleshooting.
+Here is a list of issues previous participants of the course encountered
+and some tips to help you with troubleshooting.
## Command Line/Git Bash Issues
-
+
### Python Hangs in Git Bash
-Hanging issues with trying to run Python 3 in Git Bash on Windows (i.e. typing `python` in the shell, which causes
-it to just hang with no error message or output). The solution appears to be to use `winpty` - a Windows software
-package providing an interface similar to a Unix pty-master for communicating with Windows command line tools.
-Inside the shell type `alias python='winpty python.exe'`. This alias will be valid for the duration of the shell
-session. For a more permanent solution, from the shell do: `echo "alias python='winpty python.exe'" >> ~/.bashrc`
-(and from there on remember to invoke Python as `python` or whatever command you aliased it to).
-Read more details on the issue at [Stack Overflow](https://stackoverflow.com/questions/32597209/python-not-working-in-the-command-line-of-git-bash) or [Superuser](https://superuser.com/questions/1403345/git-bash-not-running-python3-as-expected-hanging-issues).
+Some participants have experienced hanging issues
+when trying to run Python 3 in Git Bash on Windows
+(i.e. typing `python` in the shell, which causes it to just hang with no error message or output).
+The solution appears to be to use `winpty` -
+a Windows software package providing an interface similar to a Unix pty-master
+for communicating with Windows command line tools.
+Inside the shell type `alias python='winpty python.exe'`.
+This alias will be valid for the duration of the shell session.
+For a more permanent solution, from the shell do:
+`echo "alias python='winpty python.exe'" >> ~/.bashrc`
+(and from there on remember to invoke Python as `python`
+or whatever command you aliased it to).
+Read more details on the issue at
+[Stack Overflow](https://stackoverflow.com/questions/32597209/python-not-working-in-the-command-line-of-git-bash)
+or [Superuser](https://superuser.com/questions/1403345/git-bash-not-running-python3-as-expected-hanging-issues).
### Customising Command Line Prompt
-Minor annoyance with the ultra long prompt command line sometimes gives you - if you do not want a reminder of the
-current working directory, you can just set it to `$` by typing the following in your command line: `export PS1="$ "`.
-More details on command line prompt customisation can be found in this [guide](https://www.cyberciti.biz/tips/howto-linux-unix-bash-shell-setup-prompt.html).
+A minor annoyance is the ultra-long prompt the command line sometimes gives you -
+if you do not want a reminder of the current working directory,
+you can set it to just `$` by typing the following in your command line: `export PS1="$ "`.
+More details on command line prompt customisation can be found in this
+[guide](https://www.cyberciti.biz/tips/howto-linux-unix-bash-shell-setup-prompt.html).
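As a minimal sketch (assuming a bash shell; zsh users would use `~/.zshrc` instead of `~/.bashrc`), the change can be applied for the current session and then persisted:

```shell
# Set a minimal prompt for the current session only
export PS1='$ '
echo "Prompt is now: $PS1"

# To persist it, append the same line to your shell startup file, e.g.:
#   echo "export PS1='\$ '" >> ~/.bashrc
```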
## Git/GitHub Issues
### Connection Issues When Accessing GitHub Using Git Over VPN or Protected Networks - Proxy Needed
-When accessing external services and websites (such as GitHub using `git` or to [install Python packages with `pip`](../common-issues/index.html#connection-issues-when-installing-packages-with-pip-over-vpn-or-protected-networks---proxy-needed)), you may experience connection errors
-(e.g. similar to `fatal: unable to access '....': Failed connect to github.com`) or a connection that hangs. This may
-indicate that they need to configure a proxy server user by your organisation to tunnel SSH traffic through a HTTP proxy.
-
-To get `git` to work through a proxy server in Windows, you'll need `connect.exe` program that comes with GitBash (which
-you should have installed as part of setup, so no additional installation is needed).
-If installed in the default location, this file should be found at
-`C:\Program Files\Git\mingw64\bin\connect.exe`. Next, you'll need to modify your ssh config file (typically in `~/.ssh/config`)
+When accessing external services and websites
+(such as GitHub using `git` or to
+[install Python packages with `pip`](../common-issues/index.html#connection-issues-when-installing-packages-with-pip-over-vpn-or-protected-networks---proxy-needed)),
+you may experience connection errors
+(e.g. similar to `fatal: unable to access '....': Failed connect to github.com`)
+or a connection that hangs.
+This may indicate that you need to configure a proxy server used by your organisation
+to tunnel SSH traffic through an HTTP proxy.
+
+To get `git` to work through a proxy server in Windows,
+you'll need the `connect.exe` program that comes with Git Bash
+(which you should have installed as part of setup, so no additional installation is needed).
+If installed in the default location,
+this file should be found at `C:\Program Files\Git\mingw64\bin\connect.exe`.
+Next, you'll need to modify your ssh config file (typically in `~/.ssh/config`)
and add the following:
+
~~~
Host github.com
ProxyCommand "C:/Program Files/Git/mingw64/bin/connect.exe" -H : %h %p
@@ -42,9 +60,12 @@ Host github.com
Hostname github.com
~~~
-Mac and Linux users can use the [Corkscrew tool](https://github.com/bryanpkc/corkscrew) for tunneling SSH through HTTP proxies,
-which would have to be installed separately. Next, you'll need to modify your SSH config file (typically in `~/.ssh/config`)
+Mac and Linux users can use the [Corkscrew tool](https://github.com/bryanpkc/corkscrew)
+for tunneling SSH through HTTP proxies,
+which would have to be installed separately.
+Next, you'll need to modify your SSH config file (typically in `~/.ssh/config`)
and add the following:
+
~~~
Host github.com
ProxyCommand corkscrew %h %p
@@ -56,21 +77,24 @@ Host github.com
~~~
### Creating a GitHub Key Without 'Workflow' Authorisation Scope
-If learner creates a GitHub authentication token but forgets to check 'workflow' scope (to allow the token to be used to update GitHub Action workflows) they will get the following error when trying to
-push a new workflow (when adding the `pytest` action in Section 2) to GitHub:
+If a learner creates a GitHub authentication token
+but forgets to check 'workflow' scope
+(to allow the token to be used to update GitHub Action workflows)
+they will get the following error when trying to push a new workflow
+(when adding the `pytest` action in Section 2) to GitHub:
~~~
! [remote rejected] test-suite -> test-suite (refusing to allow an OAuth App to create or update workflow `.github/workflows/main.yml` without `workflow` scope`
~~~
{: .error}
-The solution is to generate a new token with the correct scope/usage permissions and clear the local
-credential cache (if that's where the token has been saved). In same cases, simply clearing
-credential cache was not enough and updating to Git 2.29 was needed.
+The solution is to generate a new token with the correct scope/usage permissions
+and clear the local credential cache (if that's where the token has been saved).
+In some cases, simply clearing the credential cache was not enough and updating to Git 2.29 was needed.
### `Please tell me who you are` Git Error
-If you experience the following error the first time you do a Git commit, you may not have configured your identity with
-Git on your machine:
+If you experience the following error the first time you do a Git commit,
+you may not have configured your identity with Git on your machine:
~~~
fatal: unable to auto-detect email address
@@ -79,19 +103,24 @@ fatal: unable to auto-detect email address
{: .error}
This can be configured from the command line as follows:
+
~~~
$ git config --global user.name "Your Name"
$ git config --global user.email "name@example.com"
~~~
{: .language-bash}
-The option `--global` tells Git to use these settings "globally" (i.e. for every project that uses Git for version control
-on your machine). If you use different identifies for different projects, then you should not use the
-`--global` option. Make sure to use the same email address you used to open an account on GitHub that you
-are using for this course.
+The option `--global` tells Git to use these settings "globally"
+(i.e. for every project that uses Git for version control on your machine).
+If you use different identities for different projects,
+then you should not use the `--global` option.
+Make sure to use the same email address you used to open an account on GitHub
+that you are using for this course.
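As a quick check, sketched here in a throwaway repository so your global settings stay untouched, you can confirm the identity Git will record in commits:

```shell
# Create a throwaway repository so we don't modify global settings
tmp=$(mktemp -d)
cd "$tmp"
git init --quiet

# Same commands as above, but without --global they apply to this repo only
git config user.name "Your Name"
git config user.email "name@example.com"

# Confirm the identity Git will record in commits
git config user.name
git config user.email
```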
-At this point it may also be a good time to configure your favourite text editor with Git, if you have not already done so.
+At this point it may also be a good time to configure your favourite text editor with Git,
+if you have not already done so.
For example, to use the editor `nano` with Git:
+
~~~
$ git config --global core.editor "nano -w"
~~~
@@ -100,8 +129,8 @@ $ git config --global core.editor "nano -w"
## Python, `pip`, `venv` & Installing Packages Issues
### Issues With Numpy (and Potentially Other Packages) on New M1 Macs
-
-When using `numpy` package installed via `pip` on a command line on a new Apple M1 Mac, you get a failed installation with the error:
+When using the `numpy` package installed via `pip` on the command line on a new Apple M1 Mac,
+you get a failed installation with the error:
~~~
...
@@ -109,24 +138,39 @@ mach-o file, but is an incompatible architecture (have 'x86_64', need 'arm64e').
...
~~~
{: .error}
-
-Numpy is a package heavily optimised for performance, and many parts of it are written in C and compiled for specific architectures, such as Intel (x86_64, x86_32, etc.) or Apple's M1 (arm64e). In this instance, `pip` is obtaining a version of `numpy` with the incorrect compiled binaries, instead of the ones needed for Apple's M1 Mac. One way that was found to work was to install numpy via PyCharm into your environment instead, which seems able to determine the correct packages to download and install.
-### Python 3 Installed but not Found When Using `python3` Command
-Python 3 installed on some Windows machines may not be accessible using the `python3` command from the command line, but
-works fine when invoked via the command `python`.
+Numpy is a package heavily optimised for performance,
+and many parts of it are written in C and compiled for specific architectures,
+such as Intel (x86_64, x86_32, etc.)
+or Apple's M1 (arm64e).
+In this instance, `pip` is obtaining a version of `numpy` with the incorrect compiled binaries,
+instead of the ones needed for Apple's M1 Mac.
+One way that was found to work was to install numpy via PyCharm into your environment instead,
+which seems able to determine the correct packages to download and install.
+
+### Python 3 Installed but not Found When Using `python3` Command
+Python 3 installed on some Windows machines
+may not be accessible using the `python3` command from the command line,
+but works fine when invoked via the command `python`.
### Connection Issues When Installing Packages With `pip` Over VPN or Protected Networks - Proxy Needed
-If you encounter issues when trying to install packages with `pip` over your organisational network -
-it may be because your may need to [use a proxy](https://stackoverflow.com/questions/30992717/proxy-awareness-with-pip) provided by your organisation. In order to get `pip` to use the proxy, you need to add an additional parameter when installing packages with `pip`:
+If you encounter issues when trying to
+install packages with `pip` over your organisational network -
+it may be because you need to
+[use a proxy](https://stackoverflow.com/questions/30992717/proxy-awareness-with-pip)
+provided by your organisation.
+In order to get `pip` to use the proxy,
+you need to add an additional parameter when installing packages with `pip`:
~~~
$ pip3 install --proxy `
~~~
{: .language-bash}
-To keep these settings permanently, you may want to add the following to your `.zshrc`/`.bashrc` file to avoid
-having to specify the proxy for each session, and restart your command line terminal:
+To keep these settings permanently,
+you may want to add the following to your `.zshrc`/`.bashrc` file
+to avoid having to specify the proxy for each session,
+and restart your command line terminal:
~~~
# call set_proxies to set proxies and unset_proxies to remove them
@@ -144,31 +188,39 @@ export NO_PROXY=
~~~
## PyCharm Issues
-
+
### Using GitBash from PyCharm
-To embed Git Bash in PyCharm as external tool and work with it in PyCharm window, from Settings select
-"Tools->Terminal->Shell path" and enter `“C:\Program Files\Git\bin\sh.exe” --login`. See [more details](https://stackoverflow.com/questions/20573213/embed-git-bash-in-pycharm-as-external-tool-and-work-with-it-in-pycharm-window-w) on Stack Overflow.
-
+To embed Git Bash in PyCharm as an external tool and work with it in the PyCharm window,
+from Settings
+select "Tools->Terminal->Shell path"
+and enter `"C:\Program Files\Git\bin\sh.exe" --login`.
+See [more details](https://stackoverflow.com/questions/20573213/embed-git-bash-in-pycharm-as-external-tool-and-work-with-it-in-pycharm-window-w)
+on Stack Overflow.
+
### Virtual Environments Issue `"no such option: –build-dir"`
-Using PyCharm to add a package to a virtual environment created from the command line using `venv`
-can fail with error `"no such option: –build-dir"`, which appears to be caused by the latest version of `pip` (20.3)
-where the flag `-build-dir` was removed but is required by PyCharm to install packages. A workaround is to:
+Using PyCharm to add a package to a virtual environment created from the command line using `venv`
+can fail with error `"no such option: –build-dir"`,
+which appears to be caused by the latest version of `pip` (20.3)
+where the flag `--build-dir` was removed but is required by PyCharm to install packages.
+A workaround is to:
+
- Close PyCharm
-- Downgrade the version of `pip` used by `venv`, e.g. in a command line terminal type:
+- Downgrade the version of `pip` used by `venv`, e.g. in a command line terminal type:
~~~
$ pip3 install pip==20.2.4
~~~
{: .language-bash}
- Restart PyCharm
-See [the issue](https://youtrack.jetbrains.com/issue/PY-45712) for more details.
+See [the issue](https://youtrack.jetbrains.com/issue/PY-45712) for more details.
This issue seems to only occur with older versions of PyCharm - recent versions should be fine.
-
+
### Invalid YAML Issue
-If YAML is copy+pasted from the course material, it might not get pasted correctly in PyCharm and some
-extra indentation may occur. Annoyingly, PyCharm won't flag this up as invalid YAML and learners may get
-all sort of different issues and errors with these files - e.g. ‘actions must start with run or uses’ with
-GitHub Actions workflows.
+If YAML is copy+pasted from the course material,
+it might not get pasted correctly in PyCharm and some extra indentation may occur.
+Annoyingly, PyCharm won't flag this up as invalid YAML
+and learners may get all sorts of different issues and errors with these files -
+e.g. 'actions must start with run or uses' with GitHub Actions workflows.
An example of incorrect extra indentation:
@@ -186,4 +238,3 @@ steps:
~~~
{% include links.md %}
-
diff --git a/_extras/databases.md b/_extras/databases.md
index 3fd94042b..b4bc67a65 100644
--- a/_extras/databases.md
+++ b/_extras/databases.md
@@ -15,37 +15,56 @@ keypoints:
## Databases
> ## Follow up from Section 3
-> This episode could be read as a follow up from the end of [Section 3 on software design and development](../36-architecture-revisited/index.html#additional-material).
-{: .callout}
+> This episode could be read as a follow up from the end of
+> [Section 3 on software design and development](../36-architecture-revisited/index.html#additional-material).
+{: .callout}
-A **database** is an organised collection of data, usually organised in some way to mimic the structure of the entities it represents.
-There are several major families of database model, but the dominant form is the **relational database**.
+A **database** is an organised collection of data,
+usually organised in some way to mimic the structure of the entities it represents.
+There are several major families of database model,
+but the dominant form is the **relational database**.
-Relational databases focus on describing the relationships between entities in the data, similar to the object oriented paradigm.
+Relational databases focus on describing the relationships between entities in the data,
+similar to the object oriented paradigm.
The key concepts in a relational database are:
Tables
-- Within a database we can have multiple tables - each table usually represents all entities of a single type.
+
+- Within a database we can have multiple tables -
+ each table usually represents all entities of a single type.
- E.g., we might have a `patients` table to represent all of our patients.
Columns / Fields
+
- Each table has columns - each column has a name and holds data of a specific type
-- E.g., we might have a `name` column in our `patients` table which holds text data representing the names of our patients.
+- E.g., we might have a `name` column in our `patients` table
+ which holds text data representing the names of our patients.
Rows
+
- Each table has rows - each row represents a single entity and has a value for each field.
-- E.g., each row in our `patients` table represents a single patient - the value of the `name` field in this row is our patient's name.
+- E.g., each row in our `patients` table represents a single patient -
+ the value of the `name` field in this row is our patient's name.
Primary Keys
-- Each row has a primary key - this is a unique ID that can be used to select this from from the data.
-- E.g., each patient might have a `patient_id` which can be used to distinguish two patients with the same name.
+
+- Each row has a primary key -
+  this is a unique ID that can be used to select this row from the data.
+- E.g., each patient might have a `patient_id`
+ which can be used to distinguish two patients with the same name.
Foreign Keys
-- A relationship between two entities is described using a foreign key - this is a field which points to the primary key of another row / table.
-- E.g., Each patient might have a foreign key field called `doctor` pointing to a row in a `doctors` table representing the doctor responsible for them - i.e. this doctor *has a* patient.
-While relational databases are typically accessed using **SQL queries**, we're going to use a library to help us translate between Python and the database.
-[SQLAlchemy](https://www.sqlalchemy.org/) is a popular Python library which contains an **Object Relational Mapping** (ORM) framework.
+- A relationship between two entities is described using a foreign key -
+ this is a field which points to the primary key of another row / table.
+- E.g., each patient might have a foreign key field called `doctor`
+ pointing to a row in a `doctors` table representing the doctor responsible for them -
+ i.e. this doctor *has a* patient.
+
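The concepts above can be sketched directly in SQL. This illustration uses Python's built-in `sqlite3` module rather than an ORM, and the table and column names follow the hypothetical `patients`/`doctors` example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Each table represents one entity type; each row has a primary key
cur.execute("CREATE TABLE doctors (doctor_id INTEGER PRIMARY KEY, name TEXT)")
cur.execute(
    "CREATE TABLE patients ("
    "  patient_id INTEGER PRIMARY KEY,"
    "  name TEXT,"
    "  doctor INTEGER REFERENCES doctors(doctor_id))"  # foreign key
)

cur.execute("INSERT INTO doctors VALUES (1, 'Dr Smith')")
# Two patients with the same name are distinguished by their primary keys
cur.execute("INSERT INTO patients VALUES (1, 'Alice', 1)")
cur.execute("INSERT INTO patients VALUES (2, 'Alice', 1)")

# Follow the foreign key to find Dr Smith's patients
rows = cur.execute(
    "SELECT p.patient_id, p.name FROM patients p "
    "JOIN doctors d ON p.doctor = d.doctor_id "
    "WHERE d.name = 'Dr Smith'").fetchall()
print(rows)  # [(1, 'Alice'), (2, 'Alice')]
```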
+While relational databases are typically accessed using **SQL queries**,
+we're going to use a library to help us translate between Python and the database.
+[SQLAlchemy](https://www.sqlalchemy.org/) is a popular Python library
+which contains an **Object Relational Mapping** (ORM) framework.
> ## SQLAlchemy
>
@@ -60,8 +79,11 @@ $ pip3 install sqlalchemy
```
{: .language-bash}
-A mapping is the core component of an ORM - it describes how to convert between our Python classes and the contents of our database tables.
-Typically, we can take our existing classes and convert them into mappings with a little modification, so we don't have to start from scratch.
+A mapping is the core component of an ORM -
+it describes how to convert between our Python classes and the contents of our database tables.
+Typically, we can take our existing classes
+and convert them into mappings with a little modification,
+so we don't have to start from scratch.
~~~
# file: inflammation/models.py
@@ -87,34 +109,56 @@ class Patient(Base):
~~~
{: .language-python}
-Now that we've defined how to translate between our Python class and a database table, we need to hook our code up to an actual database.
+Now that we've defined how to translate between our Python class and a database table,
+we need to hook our code up to an actual database.
The library we're using, SQLAlchemy, does everything through a database **engine**.
-This is essentially a wrapper around the real database, so we don't have to worry about which particular database software is being used - we just need to write code for a generic relational database.
+This is essentially a wrapper around the real database,
+so we don't have to worry about which particular database software is being used -
+we just need to write code for a generic relational database.
-For these lessions we're going to use the SQLite engine as this requires almost no configuration and no external software.
+For these lessons we're going to use the SQLite engine
+as this requires almost no configuration and no external software.
Most relational database software runs as a separate service which we can connect to from our code.
-This means that in a large scale environment, we could have the database and our software running on different computers - we could even have the database spread across several servers if we have particularly high demands for performance or reliability.
+This means that in a large scale environment,
+we could have the database and our software running on different computers -
+we could even have the database spread across several servers
+if we have particularly high demands for performance or reliability.
Some examples of databases which are used like this are PostgreSQL, MySQL and MSSQL.
-On the other hand, SQLite runs entirely within our software and uses only a single file to hold its data.
-It won't give us the extremely high performance or reliability of a properly configured PostgreSQL database, but it's good enough in many cases and much less work to get running.
+On the other hand, SQLite runs entirely within our software
+and uses only a single file to hold its data.
+It won't give us
+the extremely high performance or reliability of a properly configured PostgreSQL database,
+but it's good enough in many cases and much less work to get running.
-Lets write some test code to setup and connect to an SQLite database.
-For now we'll store the database in memory rather than an actual file - it won't actually allow us to store data after the program finishes, but it allows us not to worry about **migrations**.
+Let's write some test code to set up and connect to an SQLite database.
+For now we'll store the database in memory rather than an actual file -
+it won't actually allow us to store data after the program finishes,
+but it allows us not to worry about **migrations**.
> ## Migrations
>
-> When we make changes to our mapping (e.g. adding / removing columns), we need to get the database to update its tables to make sure they match the new format.
-> This is what the `Base.metadata.create_all` method does - creates all of these tables from scratch because we're using an in-memory database which we know will be removed between runs.
+> When we make changes to our mapping (e.g. adding / removing columns),
+> we need to get the database to update its tables to make sure they match the new format.
+> This is what the `Base.metadata.create_all` method does -
+> creates all of these tables from scratch
+> because we're using an in-memory database which we know will be removed between runs.
>
-> If we're actually storing data persistently, we need to make sure that when we change the mapping, we update the database tables without damaging any of the data they currently contain.
-> We could do this manually, by running SQL queries against the tables to get them into the right format, but this is error-prone and can be a lot of work.
+> If we're actually storing data persistently,
+> we need to make sure that when we change the mapping,
+> we update the database tables without damaging any of the data they currently contain.
+> We could do this manually,
+> by running SQL queries against the tables to get them into the right format,
+> but this is error-prone and can be a lot of work.
>
> In practice, we generate a migration for each change.
-> Tools such as [Alembic](https://alembic.sqlalchemy.org/en/latest/) will compare our mappings to the known state of the database and generate a Python file which updates the database to the necessary state.
+> Tools such as [Alembic](https://alembic.sqlalchemy.org/en/latest/)
+> will compare our mappings to the known state of the database
+> and generate a Python file which updates the database to the necessary state.
>
-> Migrations can be quite complex, so we won't be using them here - but you may find it useful to read about them later.
+> Migrations can be quite complex, so we won't be using them here -
+> but you may find it useful to read about them later.
{: .callout}
~~~
@@ -151,19 +195,32 @@ def test_sqlalchemy_patient_search():
~~~
{: .language-python}
-For this test, we've imported our models inside the test function, rather than at the top of the file like we normally would.
-This is not recommended in normal code, as it means we're paying the performance cost of importing every time we run the function, but can be useful in test code.
-Since each test function only runs once per test session, this performance cost isn't as important as a function we were going to call many times.
-Additionally, if we try to import something which doesn't exist, it will fail - by imporing inside the test function, we limit this to that specific test failing, rather than the whole file failing to run.
+For this test, we've imported our models inside the test function,
+rather than at the top of the file like we normally would.
+This is not recommended in normal code,
+as it means we're paying the performance cost of importing every time we run the function,
+but can be useful in test code.
+Since each test function only runs once per test session,
+this performance cost isn't as important as a function we were going to call many times.
+Additionally, if we try to import something which doesn't exist, it will fail -
+by importing inside the test function,
+we limit this to that specific test failing,
+rather than the whole file failing to run.
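To make the trade-off concrete, here is a minimal, self-contained sketch of the local-import pattern, using the standard library's `json` module as a stand-in for our `inflammation` models (purely for illustration):

```python
def test_uses_local_import():
    # The import runs only when this test does, so if the module
    # were missing or broken, only this one test would fail -
    # the rest of the test file would still be collected and run.
    import json
    assert json.loads("[1, 2, 3]") == [1, 2, 3]

test_uses_local_import()
```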
### Relationships
-Relational databases don't typically have an 'array of numbers' column type, so how are we going to represent our observations of our patients' inflammation?
+Relational databases don't typically have an 'array of numbers' column type,
+so how are we going to represent our observations of our patients' inflammation?
Well, our first step is to create a table of observations.
-We can then use a **foreign key** to point from the observation to a patient, so we know which patient the data belongs to.
-The table also needs a column for the actual measurement - we'll call this `value` - and a column for the day the measurement was taken on.
+We can then use a **foreign key** to point from the observation to a patient,
+so we know which patient the data belongs to.
+The table also needs a column for the actual measurement -
+we'll call this `value` -
+and a column for the day the measurement was taken on.
-We can also use the ORM's `relationship` helper function to allow us to go between the observations and patients without having to do any of the complicated table joins manually.
+We can also use the ORM's `relationship` helper function
+to allow us to go between the observations and patients
+without having to do any of the complicated table joins manually.
~~~
from sqlalchemy import Column, ForeignKey, Integer, String
@@ -199,13 +256,17 @@ class Patient(Base):
> ## Time is Hard
>
> We're using an integer field to store the day on which a measurement was taken.
-> This keeps us consistent with what we had previously as it's essentialy the position of the measurement in the Numpy array.
+> This keeps us consistent with what we had previously
+> as it's essentially the position of the measurement in the Numpy array.
> It also avoids us having to worry about managing actual date / times.
>
-> The Python `datetime` module we've used previously in the Academics example would be useful here, and most databases have support for 'date' and 'time' columns, but to reduce the complexity, we'll just use integers here.
+> The Python `datetime` module we've used previously in the Academics example would be useful here,
+> and most databases have support for 'date' and 'time' columns,
+> but to reduce the complexity, we'll just use integers here.
{: .callout}
-Our test code for this is going to look very similar to our previous test code, so we can copy-paste it and make a few changes.
+Our test code for this is going to look very similar to our previous test code,
+so we can copy-paste it and make a few changes.
This time, after setting up the database, we need to add a patient and an observation.
We then test that we can get the observations from a patient we've searched for.
@@ -242,8 +303,10 @@ def test_sqlalchemy_observations():
~~~
{: .language-python}
-Finally, let's put in a way to convert all of our observations into a Numpy array, so we can use our previous analysis code.
-We'll use the `property` decorator here again, to create a method that we can use as if it was a normal data attribute.
+Finally, let's put in a way to convert all of our observations into a Numpy array,
+so we can use our previous analysis code.
+We'll use the `property` decorator here again,
+to create a method that we can use as if it was a normal data attribute.
~~~
# file: inflammation/models.py
@@ -274,7 +337,8 @@ class Patient(Base):
{: .language-python}
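Away from the database for a moment, the pattern itself can be sketched in plain Python. The dictionary-backed `Patient` below is a simplified stand-in for the SQLAlchemy model, using a list instead of a Numpy array:

```python
class Patient:
    def __init__(self, observations):
        # observations: a dict mapping day number -> measured value
        self._observations = observations

    @property
    def values(self):
        # Accessed as `patient.values`, like a plain data attribute.
        # Days with no recorded measurement are filled with zero.
        last_day = max(self._observations, default=-1)
        return [self._observations.get(day, 0) for day in range(last_day + 1)]

patient = Patient({0: 3, 2: 5})
print(patient.values)  # [3, 0, 5]
```

Note that this sketch also shows the zero-filling behaviour discussed in the exercise below.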
Once again we'll copy-paste the test code and make some changes.
-This time we want to create a few observations for our patient and test that we can turn them into a Numpy array.
+This time we want to create a few observations for our patient
+and test that we can turn them into a Numpy array.
~~~
# file: tests/test_models.py
@@ -307,16 +371,21 @@ def test_sqlalchemy_observations_to_array():
> ## Further Array Testing
>
-> There's an important feature of the behaviour of our `Patient.values` property that's not currently being tested.
+> There's an important feature of the behaviour of our `Patient.values` property
+> that's not currently being tested.
> What is this feature?
> Write one or more extra tests to cover this feature.
>
> > ## Hint
> >
-> > The `Patient.values` property creates an array of zeroes, then fills it with data from the table.
-> > If a measurement was not taken on a particular day, that day's value will be left as zero.
+> > The `Patient.values` property creates an array of zeroes,
+> > then fills it with data from the table.
+> > If a measurement was not taken on a particular day,
+> > that day's value will be left as zero.
> >
-> > If this is intended behaviour, it would be useful to write a test for it, to ensure that we don't break it in future.
+> > If this is intended behaviour,
+> > it would be useful to write a test for it,
+> > to ensure that we don't break it in future.
> > Using tests in this way is known as **regression testing**.
> >
> {: .solution}
@@ -325,30 +394,51 @@ def test_sqlalchemy_observations_to_array():
> ## Refactoring for Reduced Redundancy
>
> You've probably noticed that there's a lot of replicated code in our database tests.
-> It's fine if some code is replicated a bit, but if you keep needing to copy the same code, that's a sign it should be refactored.
+> It's fine if some code is replicated a bit,
+> but if you keep needing to copy the same code,
+> that's a sign it should be refactored.
>
-> Refactoring is the process of changing the structure of our code, without changing its behaviour, and one of the main benefits of good test coverage is that it makes refactoring easier.
-> If we've got a good set of tests, it's much more likely that we'll detect any changes to behaviour - even when these changes might be in the tests themselves.
+> Refactoring is the process of changing the structure of our code,
+> without changing its behaviour,
+> and one of the main benefits of good test coverage is that it makes refactoring easier.
+> If we've got a good set of tests,
+> it's much more likely that we'll detect any changes to behaviour -
+> even when these changes might be in the tests themselves.
>
-> Try refactoring the database tests to see if you can reduce the amount of replicated code by moving it into one or more functions at the top of the test file.
+> Try refactoring the database tests to see if you can
+> reduce the amount of replicated code
+> by moving it into one or more functions at the top of the test file.
>
{: .challenge}
> ## Advanced Challenge: Connecting More Views
>
-> We've added the ability to store patient records in the database, but not actually connected it to any useful views.
-> There's a common pattern in data management software which is often refered to as **CRUD** - Create, Read, Update, Delete.
-> These are the four fundamental views that we need to provide to allow people to manage their data effectively.
+> We've added the ability to store patient records in the database,
+> but not actually connected it to any useful views.
+> There's a common pattern in data management software
+> which is often referred to as **CRUD** - Create, Read, Update, Delete.
+> These are the four fundamental views that we need to provide
+> to allow people to manage their data effectively.
>
-> Each of these applies at the level of a single record, so for both patients and observations we should have a view to: create a new record, show an existing record, update an existing record and delete an existing record.
-> It's also sometimes useful to provide a view which lists all existing records for each type - for example, a list of all patients would probably be useful, but a list of all observations might not be.
+> Each of these applies at the level of a single record,
+> so for both patients and observations we should have a view to:
+> create a new record,
+> show an existing record,
+> update an existing record
+> and delete an existing record.
+> It's also sometimes useful to provide a view which lists all existing records for each type -
+> for example, a list of all patients would probably be useful,
+> but a list of all observations might not be.
>
-> Pick one (or several) of these views to implement - you may want to refer back to the section where we added our initial patient read view.
+> Pick one (or several) of these views to implement -
+> you may want to refer back to the section where we added our initial patient read view.
{: .challenge}
> ## Advanced Challenge: Managing Dates Properly
>
> Try converting our existing models to use actual dates instead of just a day number.
-> The Python [datetime module documentation](https://docs.python.org/3/library/datetime.html) and SQLAlchemy [Column and Data Types page](https://docs.sqlalchemy.org/en/13/core/type_basics.html) will be useful to you here.
+> The Python [datetime module documentation](https://docs.python.org/3/library/datetime.html)
+> and SQLAlchemy [Column and Data Types page](https://docs.sqlalchemy.org/en/13/core/type_basics.html)
+> will be useful to you here.
>
{: .challenge}
diff --git a/_extras/guide.md b/_extras/guide.md
index 8388791c5..54168c7d6 100644
--- a/_extras/guide.md
+++ b/_extras/guide.md
@@ -3,27 +3,34 @@ title: "Instructor Notes"
---
> ## Common Issues & Tips
-> Check out a [list of issues](../common-issues) previous
-> participants of the course encountered and some tips to help you with troubleshooting at the workshop.
-{: .callout}
+> Check out a [list of issues](../common-issues) previous participants of the course encountered
+> and some tips to help you with troubleshooting at the workshop.
+{: .callout}
## Course Design
-The course follows a narrative around a software development team working on an existing software project that is
-analysing patients’ inflammation data (from the [novice Software Carpentry Python course](https://software-carpentry.org/lessons).
-The course is meant to be delivered as a single unit as the course's code examples and exercises built on top of
-previously covered topics and code - so skipping or missing bits of the course would cause students to get out of
-sync and cause them difficulties in following subsequent sections.
-
-A typical learner for the course is someone who has gained foundational software development skills in using Git,
-command line shell and Python (e.g. by attending prior courses or by self-learning), and have used these skills
-for individual code development and scripting. They are now joining the
-development team where they will require a number of software development tools and intermediate software
-development skills to engineer their code more properly taking into consideration the lifecycle of software,
-team ethic, writing software for stakeholders, and applying a process to understanding, designing, building,
-releasing, and maintaining software.
-
-The course has been separated into 5 sections:
+The course follows a narrative around
+a software development team working on an existing software project
+that is analysing patients’ inflammation data
+(from the [novice Software Carpentry Python course](https://software-carpentry.org/lessons)).
+The course is meant to be delivered as a single unit
+as the course's code examples and exercises build on top of previously covered topics and code -
+so skipping or missing bits of the course would cause students to
+get out of sync and cause them difficulties in following subsequent sections.
+
+A typical learner for the course is
+someone who has gained foundational software development skills in using Git,
+command line shell and Python
+(e.g. by attending prior courses or by self-learning),
+and has used these skills for individual code development and scripting.
+They are now joining the development team where they will require
+a number of software development tools and intermediate software development skills
+to engineer their code more properly
+taking into consideration the lifecycle of software,
+team ethic, writing software for stakeholders,
+and applying a process to understanding, designing, building, releasing, and maintaining software.
+
+The course has been separated into 5 sections:
- Section 1: Setting Up Environment For Collaborative Code Development
- Section 2: Ensuring Correctness of Software at Scale
@@ -34,66 +41,101 @@ The course has been separated into 5 sections:
Each section can be approximately delivered in a half-day (e.g. try to allow 4 hours per section).
## Course Delivery
-The course is intended primarily for self-learning but other modes of delivery are possible (e.g. mixing in elements
-of instructor-led coding-along). The way the course has been delivered so far is that
-students are organised in small groups from the outset and initially work individually through the
-material. In later sections, exercises involve more group work and people from the same group form small development
-teams and collaborate on a mini software project (to provide more in-depth practice for software development in teams).
+The course is intended primarily for self-learning
+but other modes of delivery are possible
+(e.g. mixing in elements of instructor-led coding-along).
+The way the course has been delivered so far is that
+students are organised in small groups from the outset
+and initially work individually through the material.
+In later sections,
+exercises involve more group work
+and people from the same group form small development teams
+and collaborate on a mini software project
+(to provide more in-depth practice for software development in teams).
There is a bunch of helpers on hand who sit with learners in groups.
-This provides a more comfortable and less intimidating learning environment with learners more willing to engage and
-chat with their group colleagues about what they are doing and ask for help.
-
-The course can be delivered online or in-person. A good ratio is 4-6 learners to 1 helper.
-If you have a smaller number of helpers than groups - helpers can roam around to make sure groups are making progress.
-While this course can be live-coded by an instructor as well (in the earlier stages), we felt
-that intermediate-level learners are capable of going through the material on their own
-at a reasonable speed and would not require to code-along to the same extent as novice learners. In later stages, exercises require
-participants to develop code more individually so they can review and comment on each other's code, so the
-codes need to be sufficiently different for these exercises to be effective. Having an instructor live-code would
-make everyone have exactly the same code on their machines and would not have the same effect.
-
-A workshop kicks off with everyone together at the start of each day. One of course leads/helpers provides workshop
-introduction and motivation to paint the bigger picture and set the scene for the whole workshop. In addition, a short
-intro to the section topics is provided on each day, to explain what the students will be learning and doing on that
-particular day. After that, participants are split into groups and go through the materials for that day on their own with
-helpers on hand. At the end of each section, all reconvene for a joint Q&A session,
-feedback and wrap-up. If participants have not finished all exercises for a section, they are asked to finish them off
-before the next section starts to make sure everyone is in sync as much as possible and are working on similar things
-(though students will inevitably cover the material at different speeds). This synchronisation becomes particularly
-important for later workshop stages when students start with group exercises.
+This provides a more comfortable and less intimidating learning environment
+with learners more willing to engage and chat with their group colleagues about what they are doing
+and ask for help.
+
+The course can be delivered online or in-person.
+A good ratio is 4-6 learners to 1 helper.
+If you have fewer helpers than groups -
+helpers can roam around to make sure groups are making progress.
+While this course can be live-coded by an instructor as well (in the earlier stages),
+we felt that intermediate-level learners are capable of
+going through the material on their own at a reasonable speed
+and would not need to code along to the same extent as novice learners.
+In later stages, exercises require participants to develop code more individually
+so they can review and comment on each other's code,
+so the codes need to be sufficiently different for these exercises to be effective.
+Having an instructor live-code would make everyone have exactly the same code on their machines
+and would not have the same effect.
+
+A workshop kicks off with everyone together at the start of each day.
+One of the course leads/helpers provides the workshop introduction
+and motivation to paint the bigger picture and set the scene for the whole workshop.
+In addition, a short intro to the section topics is provided on each day,
+to explain what the students will be learning and doing on that particular day.
+After that, participants are split into groups
+and go through the materials for that day on their own with helpers on hand.
+At the end of each section, all reconvene for a joint Q&A session, feedback and wrap-up.
+If participants have not finished all exercises for a section,
+they are asked to finish them off before the next section starts
+to make sure everyone is in sync as much as possible and working on similar things
+(though students will inevitably cover the material at different speeds).
+This synchronisation becomes particularly important for later workshop stages
+when students start with group exercises.
### Helpers Roles and Responsibilities
-At the workshop, everyone in the training team is a helper. You may have more experienced helpers delivering introductions to the workshop and sections. Contact the course authors for intro slides you can reuse.
+At the workshop, everyone in the training team is a helper.
+You may have more experienced helpers delivering introductions to the workshop and sections.
+Contact the course authors for intro slides you can reuse.
Roles and responsibilities of helpers include:
+
- Being familiar with the material
- Facilitating groups/breakout rooms and helping people going through the material
-- Try to prepare a few questions/discussion points to take to groups/breakout rooms to make sure the groups are engaged
-(by note some learners may find discussions distracting so try and find a balance)
-- Taking notes on what works well and what not - throughout the workshop - from their individual perspective and perspectives of students:
+- Trying to prepare a few questions/discussion points
+ to take to groups/breakout rooms to make sure the groups are engaged
+ (but note some learners may find discussions distracting so try to find a balance)
+- Taking notes on what works well and what does not - throughout the workshop -
+ from their individual perspective and perspectives of students:
- Collecting general feelings and comments
- Their thoughts as a potential student and instructor
- Noting mistakes, inconsistencies and learning obstacles in the materials
- Recording issues or doing PRs in the lesson repository during or after of the workshop
-- Helping students get through the material but also being ready answer questions on applying the material in learners’ domains, if possible
+- Helping students get through the material
+ but also being ready to answer questions on applying the material in learners’ domains,
+ if possible
### Group Exercises
Here is some advice on how best to sync and organise group exercises in later stages of the course.
-- For earlier workshop stages, where learners go through the material individually (though placed in groups), maintaining the
-same group composition is not all that important. However, it would be good to maintain the same teams once group exercises
-start, as group will chose one software project to be the "team project" to work on.
-- Take a note of who was in which group between different days (e.g. in a share document where people can sign up),
-as people tend to forget (especially for online workshop).
-- Some group exercises start in the middle (rather than at the beginning) of a section. This means that synchronisation
-is needed to make sure everyone starts at the same time during that particular session. As some students will naturally
-be ready faster, perhaps have a shared document for people to put their names down as they are ready to start with
-the group exercises, and organise them in teams based on the speed they are covering the material. Even if these
-groups change from previous days, it will ensure people's idle time is minimised.
-- People may lose motivation in the later stages involving teamwork if some team members are missing - while this may
-be inevitable due to other commitments, make it clear during workshop advertising that people should try
-to commit workshop days/times.
-- Make it obvious to the learners that they should catch up with any unfinished material or exercises from the previous
-session before joining the next one - this is even more important for group exercises so the teams are not stalled.
+- For earlier workshop stages,
+ where learners go through the material individually (though placed in groups),
+ maintaining the same group composition is not all that important.
+ However, it would be good to maintain the same teams once group exercises start,
+ as each group will choose one software project to be the "team project" to work on.
+- Take a note of who was in which group between different days
+ (e.g. in a shared document where people can sign up),
+ as people tend to forget (especially for online workshops).
+- Some group exercises start in the middle (rather than at the beginning) of a section.
+ This means that synchronisation is needed to make sure
+ everyone starts at the same time during that particular session.
+ As some students will naturally be ready faster,
+ perhaps have a shared document for people to put their names down
+ as they are ready to start with the group exercises,
+ and organise them in teams based on the speed at which they are covering the material.
+ Even if these groups change from previous days,
+ it will ensure people's idle time is minimised.
+- People may lose motivation in the later stages involving teamwork
+ if some team members are missing -
+ while this may be inevitable due to other commitments,
+ make it clear during workshop advertising
+ that people should try to commit workshop days/times.
+- Make it obvious to the learners that they should
+ catch up with any unfinished material or exercises from the previous session
+ before joining the next one -
+ this is even more important for group exercises so the teams are not stalled.
{% include links.md %}
diff --git a/_extras/persistence.md b/_extras/persistence.md
index 0ab6f4b9a..ab0379062 100644
--- a/_extras/persistence.md
+++ b/_extras/persistence.md
@@ -24,17 +24,27 @@ keypoints:
## Introduction
> ## Follow up from Section 3
-> This episode could be read as a follow up from the end of [Section 3 on software design and development](../36-architecture-revisited/index.html#additional-material).
+> This episode could be read as a follow up from the end of
+> [Section 3 on software design and development](../36-architecture-revisited/index.html#additional-material).
{: .callout}
Our patient data system so far can read in some data, process it, and display it to people.
What's missing?
-Well, at the moment, if we wanted to add a new patient or perform a new observation, we would have to edit the input CSV file by hand.
-We might not want our staff to have to manage their patients by making changes to the data by hand, but rather provide the ability to do this through the software.
-That way we can perform any necessary validation (e.g. inflammation measurements must be a number) or transformation before the data gets accepted.
-
-If we want to bring in this data, modify it somehow, and save it back to a file, all using our existing MVC architecture pattern, we'll need to:
+Well, at the moment, if we wanted to add a new patient or perform a new observation,
+we would have to edit the input CSV file by hand.
+We might not want our staff to have to manage their patients
+by making changes to the data by hand,
+but rather provide the ability to do this through the software.
+That way we can perform any necessary validation
+(e.g. inflammation measurements must be a number)
+or transformation before the data gets accepted.
+
+If we want to bring in this data,
+modify it somehow,
+and save it back to a file,
+all using our existing MVC architecture pattern,
+we'll need to:
- Write some code to perform data import / export (**persistence**)
- Add some views we can use to modify the data
@@ -42,18 +52,28 @@ If we want to bring in this data, modify it somehow, and save it back to a file,
## Serialisation and Serialisers
-The process of converting data from an object to and from storable formats is often called **serialisation** and **deserialisation** and is handled by a **serialiser**.
-Serialisation is the process of exporting our structured data to a usually text-based format for easy storage or transfer, while deserialisation is the opposite.
-We're going to be making a serialiser for our patient data, but since there are many different formats we might eventually want to use to store the data, we'll also make sure it's possible to add alternative serialisers later and swap between them.
-So let's start by creating a base class to represent the concept of a serialiser for our patient data - then we can specialise this to make serialisers for different formats by inheriting from this base class.
+The process of converting data from an object to and from storable formats
+is often called **serialisation** and **deserialisation**
+and is handled by a **serialiser**.
+Serialisation is the process of
+exporting our structured data to a usually text-based format for easy storage or transfer,
+while deserialisation is the opposite.
+We're going to be making a serialiser for our patient data,
+but since there are many different formats we might eventually want to use to store the data,
+we'll also make sure it's possible to add alternative serialisers later and swap between them.
+So let's start by creating a base class
+to represent the concept of a serialiser for our patient data -
+then we can specialise this to make serialisers for different formats
+by inheriting from this base class.
By creating a base class we provide a contract that any kind of patient serialiser must satisfy.
-If we create some alternative serialisers for different data formats, we know that we will be able to use them all in exactly the same way.
+If we create some alternative serialisers for different data formats,
+we know that we will be able to use them all in exactly the same way.
This technique is part of an approach called **design by contract**.
We'll call our base class `PatientSerializer` and put it in file `inflammation/serializers.py`.
-~~~ python
+~~~
# file: inflammation/serializers.py
from inflammation import models
@@ -80,17 +100,30 @@ class PatientSerializer:
~~~
{: .language-python}
-Our serialiser base class has two pairs of class methods (denoted by the `@classmethod` decorators), one to serialise (save) the data and one to deserialise (load) it.
-We're not actually going to implement any of them quite yet as this is just a template for how our real serialisers should look, so we'll raise `NotImplementedError` to make this clear if anyone tries to use this class directly.
-The reason we've used class methods is that we don't need to be able to pass any data in using the `__init__` method, as we'll be passing the data to be serialised directly to the `save` function.
-
-There are many different formats we could use to store our data, but a good one is [**JSON** (JavaScript Object Notation)](https://en.wikipedia.org/wiki/JSON).
-This format comes originally from JavaScript, but is now one of the most widely used serialisation formats for exchange or storage of structured data, used across most common programming languages.
-
-Data in JSON format is structured using nested **arrays** (very similar to Python lists) and **objects** (very similar to Python dictionaries).
+Our serialiser base class has two pairs of class methods
+(denoted by the `@classmethod` decorators),
+one to serialise (save) the data and one to deserialise (load) it.
+We're not actually going to implement any of them quite yet
+as this is just a template for how our real serialisers should look,
+so we'll raise `NotImplementedError` to make this clear
+if anyone tries to use this class directly.
+The reason we've used class methods is that
+we don't need to be able to pass any data in using the `__init__` method,
+as we'll be passing the data to be serialised directly to the `save` function.
+
+There are many different formats we could use to store our data,
+but a good one is [**JSON** (JavaScript Object Notation)](https://en.wikipedia.org/wiki/JSON).
+This format comes originally from JavaScript,
+but is now one of the most widely used serialisation formats
+for exchange or storage of structured data,
+used across most common programming languages.
+
+Data in JSON format is structured using nested
+**arrays** (very similar to Python lists)
+and **objects** (very similar to Python dictionaries).
For example, we're going to try to use this format to store data about our patients:
-~~~ json
+~~~
[
{
"name": "Alice",
@@ -118,16 +151,26 @@ For example, we're going to try to use this format to store data about our patie
~~~
{: .language-json}
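Before writing the serialiser itself, it may help to see how little the standard library's `json` module needs in order to round-trip this kind of nested structure (the data below is a shortened stand-in for the example above):

```python
import json

patients = [
    {"name": "Alice", "observations": [{"day": 0, "value": 1}]},
]

# Serialise to a JSON-formatted string, then deserialise it back
text = json.dumps(patients, indent=2)
assert json.loads(text) == patients
```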
-Compared to the CSV format, this gives us much more flexibility to describe complex structured data.
-If we wanted to represent this data in CSV format, the most natural way would be to have two separate files: one with each row representing a patient, the other with each row representing an observation.
+Compared to the CSV format,
+this gives us much more flexibility to describe complex structured data.
+If we wanted to represent this data in CSV format,
+the most natural way would be to have two separate files:
+one with each row representing a patient,
+the other with each row representing an observation.
We'd then need to use a unique identifier to link each observation record to the relevant patient.
-This is how relational databases work, but it would be quite complicated to manage this ourselves with CSVs.
+This is how relational databases work,
+but it would be quite complicated to manage this ourselves with CSVs.
+
+Now, if we are going to follow
+[TDD (Test Driven Development)](../35-object-oriented-programming/index.html#test-driven-development),
+we should write some test code.
+Our JSON serialiser should be able to save and load our patient data to and from a JSON file,
+so for our test we could try these save-load steps
+and check that the result is the same as the data we started with.
+Again you might need to change these examples slightly
+to get them to fit with how you chose to implement your `Patient` class.
-Now, if we are going to follow [TDD (Test Driven Development)](../35-object-oriented-programming/index.html#test-driven-development), we should write some test code.
-Our JSON serialiser should be able to save and load our patient data to and from a JSON file, so for our test we could try these save-load steps and check that the result is the same as the data we started with.
-Again you might need to change these examples slightly to get them to fit with how you chose to implement your `Patient` class.
-
-~~~ python
+~~~
# file: tests/test_serializers.py
from inflammation import models, serializers
@@ -159,10 +202,14 @@ We then load the data from this file and check that the results match the input.
With our test, we know what the correct behaviour looks like - now it's time to implement it.
For this, we'll use one of Python's built-in libraries.
-Among other more complex features, the `json` library provides functions for converting between Python data structures and JSON formatted text files.
-Our test also didn't specify what the structure of our output data should be, so we need to make that decision here - we'll use the format we used as JSON example earlier.
+Among other more complex features,
+the `json` library provides functions for
+converting between Python data structures and JSON formatted text files.
+Our test also didn't specify what the structure of our output data should be,
+so we need to make that decision here -
+we'll use the format from the JSON example earlier.
-~~~ python
+~~~
# file: inflammation/serializers.py
import json
@@ -197,27 +244,40 @@ class PatientJSONSerializer(PatientSerializer):
~~~
{: .language-python}
-For our `save` / `serialize` methods, since the JSON format is similar to nested Python lists and dictionaries, it makes sense as a first step to convert the data from our `Patient` class into a dictionary - we do this for each patient using a list comprehension.
+For our `save` / `serialize` methods,
+since the JSON format is similar to nested Python lists and dictionaries,
+it makes sense as a first step to convert the data from our `Patient` class into a dictionary -
+we do this for each patient using a list comprehension.
Then we can pass this to the `json.dump` function to save it to a file.
As we might expect, the `load` / `deserialize` methods are the opposite of this.
-Here we need to first read the data from our input file, then convert it to instances of our `Patient` class.
-The `**` syntax here may be unfamiliar to you - this is the **dictionary unpacking operator**.
-The dictionary unpacking operator can be used when calling a function (like a class `__init__` method) and passes the items in the dictionary as named arguments to the function.
-The name of each argument passed is the dictionary key, the value of the argument is the dictionary value.
+Here we need to first read the data from our input file,
+then convert it to instances of our `Patient` class.
+The `**` syntax here may be unfamiliar to you -
+this is the **dictionary unpacking operator**.
+The dictionary unpacking operator can be used when calling a function
+(like a class `__init__` method)
+and passes the items in the dictionary as named arguments to the function.
+The name of each argument passed is the dictionary key,
+the value of the argument is the dictionary value.
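The round trip and the unpacking operator can be sketched together. This is a simplified stand-in `Patient` whose observations are plain numbers rather than `Observation` objects, so your own class may differ:

```python
import json

# A minimal stand-in for the Patient model; your own class may differ.
class Patient:
    def __init__(self, name, observations=None):
        self.name = name
        self.observations = observations or []

patients = [Patient("Alice", [3, 4]), Patient("Bob", [2])]

# Serialize: convert each Patient to a dictionary, then dump to JSON text
data = json.dumps([{"name": p.name, "observations": p.observations}
                   for p in patients])

# Deserialize: parse the JSON, then rebuild each Patient with **,
# so Patient(**d) is equivalent to
# Patient(name=d["name"], observations=d["observations"])
patients_new = [Patient(**d) for d in json.loads(data)]

print(patients_new[0].name, patients_new[0].observations)  # Alice [3, 4]
```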
When we run the tests however, we should get an error:
~~~
FAILED tests/test_serializers.py::test_patients_json_serializer - TypeError: Object of type Observation is not JSON serializable
~~~
+{: .error}
-This means that our patient serializer almost works, but we need to write a serializer for our observation model as well!
+This means that our patient serializer almost works,
+but we need to write a serializer for our observation model as well!
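A minimal reproduction of this error (the class here is a bare-bones illustrative stand-in for the model):

```python
import json

# A bare-bones stand-in for the Observation model
class Observation:
    def __init__(self, day, value):
        self.day = day
        self.value = value

record = {"name": "Alice", "observations": [Observation(0, 3)]}

# json.dumps only understands built-in types (dict, list, str, numbers, ...),
# so a custom object triggers the TypeError seen above
message = ""
try:
    json.dumps(record)
except TypeError as error:
    message = str(error)

print(message)  # Object of type Observation is not JSON serializable
```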
-Since this new serializer is not a type of `PatientSerializer`, we need to inherit from a new base class which holds the design that is shared between `PatientSerializer` and `ObservationSerializer`.
-Since we don't actually need to save the observation data to a file independently, we won't worry about implementing the `save` and `load` methods for the `Observation` model.
+Since this new serializer is not a type of `PatientSerializer`,
+we need to inherit from a new base class
+which holds the design that is shared between `PatientSerializer` and `ObservationSerializer`.
+Since we don't actually need to save the observation data to a file independently,
+we won't worry about implementing the `save` and `load` methods for the `Observation` model.
-~~~ python
+~~~
# file: inflammation/serializers.py
from inflammation import models
@@ -261,7 +321,7 @@ class ObservationSerializer(Serializer):
Now we can link this up to the `PatientSerializer` and our test should finally pass.
-~~~ python
+~~~
# file: inflammation/serializers.py
...
@@ -290,43 +350,69 @@ class PatientSerializer(Serializer):
{: .language-python}
> ## Linking it All Together
-> We've now got some code which we can use to save and load our patient data, but we've not yet linked it up so people can use it.
+> We've now got some code which we can use to save and load our patient data,
+> but we've not yet linked it up so people can use it.
>
-> Just like we did with the `display_patient` view in [Section 3](../36-architecture-revisited/index.html#mvc-revisited), try adding some views to work with our patient data using the JSON serialiser.
-> When you do this, think about the design of the command line interface - what arguments will you need to get from the user, what output should they receive back?
+> Just like we did with the `display_patient` view in
+> [Section 3](../36-architecture-revisited/index.html#mvc-revisited),
+> try adding some views to work with our patient data using the JSON serialiser.
+> When you do this, think about the design of the command line interface -
+> what arguments will you need to get from the user,
+> what output should they receive back?
{: .challenge}
> ## Equality Testing
>
-> When we wrote our serialiser test, we said we wanted to check that the data coming out was the same as our input data, but we actually compared just parts of the data, rather than just using `assert patients_new == patients`.
+> When we wrote our serialiser test,
+> we said we wanted to check that the data coming out was the same as our input data,
+> but we actually compared just parts of the data,
+> rather than just using `assert patients_new == patients`.
>
-> The reason for this is that, by default, `==` comparing two instances of a class tests whether they're stored at the same location in memory, rather than just whether they contain the same data.
+> The reason for this is that,
+> by default, comparing two instances of a class with `==`
+> tests whether they're stored at the same location in memory,
+> rather than whether they contain the same data.
>
-> Add some code to the `Patient` and `Observation` classes, so that we get the expected result when we do `assert patients_new == patients`.
-> When you have this comparison working, update the serialiser test to use this instead.
+> Add some code to the `Patient` and `Observation` classes,
+> so that we get the expected result when we do `assert patients_new == patients`.
+> When you have this comparison working,
+> update the serialiser test to use this instead.
>
-> **Hint:** The method Python uses to check for equality of two instances of a class is called `__eq__` and takes the arguments `self` (as all normal methods do) and `other`.
+> **Hint:** The method Python uses to check for equality of two instances of a class
+> is called `__eq__` and takes the arguments `self` (as all normal methods do) and `other`.
{: .challenge}
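One possible shape for the `__eq__` hint above, using minimal stand-ins for the two classes (the attribute names are assumptions based on the earlier examples):

```python
class Observation:
    def __init__(self, day, value):
        self.day = day
        self.value = value

    def __eq__(self, other):
        # Two observations are equal if they hold the same data
        return self.day == other.day and self.value == other.value

class Patient:
    def __init__(self, name, observations=None):
        self.name = name
        self.observations = observations or []

    def __eq__(self, other):
        # Comparing the observation lists compares element by element,
        # using Observation.__eq__ for each pair
        return (self.name == other.name
                and self.observations == other.observations)

print(Patient("Alice", [Observation(0, 3)])
      == Patient("Alice", [Observation(0, 3)]))  # True
```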
> ## Advanced Challenge: Abstract Base Classes
>
-> Since our `Serializer` class is designed not to be directly usable and its methods raise `NotImplementedError`, it ideally should be an abstract base class.
-> An abstract base class is one which is intended to be used only by creating subclasses of it and can mark some or all of its methods as requiring implementation in the new subclass.
+> Since our `Serializer` class is designed not to be directly usable
+> and its methods raise `NotImplementedError`,
+> it ideally should be an abstract base class.
+> An abstract base class is one which is intended to be used only by creating subclasses of it
+> and can mark some or all of its methods as requiring implementation in the new subclass.
>
-> Using Python's documentation on the [abc module](https://docs.python.org/3/library/abc.html), convert the `Serializer` class into an ABC.
+> Using Python's documentation on
+> the [abc module](https://docs.python.org/3/library/abc.html),
+> convert the `Serializer` class into an ABC.
>
-> **Hint:** The only component that needs to be changed is `Serializer` - this should not require any changes to the other classes.
+> **Hint:** The only component that needs to be changed is `Serializer` -
+> this should not require any changes to the other classes.
>
> **Hint:** The abc module documentation refers to metaclasses - don't worry about these.
-> A metaclass is a template for creating a class (classes are instances of a metaclass), just like a class is a template for creating objects (objects are instances of a class), but this isn't necessary to understand if you're just using them to create your own abstract base classes.
+> A metaclass is a template for creating a class (classes are instances of a metaclass),
+> just like a class is a template for creating objects (objects are instances of a class),
+> but this isn't necessary to understand
+> if you're just using them to create your own abstract base classes.
{: .challenge}
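One way this conversion could look, sketched with illustrative method names matching the earlier examples (your `Serializer` may have a different interface):

```python
from abc import ABC, abstractmethod

# Sketch: Serializer as an abstract base class; subclasses must
# implement both abstract methods before they can be instantiated.
class Serializer(ABC):
    @classmethod
    @abstractmethod
    def serialize(cls, instances):
        """Convert model instances into plain data structures."""

    @classmethod
    @abstractmethod
    def deserialize(cls, data):
        """Rebuild model instances from plain data structures."""

class ObservationSerializer(Serializer):
    @classmethod
    def serialize(cls, instances):
        return [vars(instance) for instance in instances]

    @classmethod
    def deserialize(cls, data):
        return data  # placeholder for this illustrative sketch

# Serializer() now fails immediately with a TypeError,
# instead of deferring to a NotImplementedError at call time.
```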
> ## Advanced Challenge: CSV Serialization
>
> Try implementing an alternative serialiser, using the CSV format instead of JSON.
>
-> **Hint:** Python also has a module for handling CSVs - see the documentation for the [csv module](https://docs.python.org/3/library/csv.html).
-> This module provides a CSV reader and writer which are a bit more flexible, but slower for purely numeric data, than the ones we've seen previously as part of NumPy.
+> **Hint:** Python also has a module for handling CSVs -
+> see the documentation for the [csv module](https://docs.python.org/3/library/csv.html).
+> This module provides a CSV reader and writer which are a bit more flexible,
+> but slower for purely numeric data,
+> than the ones we've seen previously as part of NumPy.
>
> Can you think of any cases when a CSV might not be a suitable format to hold our patient data?
{: .challenge}
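As a starting point for this challenge, here is a sketch using plain dictionaries in place of the model classes; flattening each patient's observations into a single delimited column is just one of several possible layouts:

```python
import csv
import io

def serialize_csv(patients):
    # One row per patient; observations joined into a single column
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["name", "observations"])
    for patient in patients:
        writer.writerow([patient["name"],
                         ";".join(str(v) for v in patient["observations"])])
    return buffer.getvalue()

def deserialize_csv(text):
    # Split the observations column back into a list of numbers
    reader = csv.DictReader(io.StringIO(text))
    return [{"name": row["name"],
             "observations": [int(v) for v in row["observations"].split(";") if v]}
            for row in reader]

data = serialize_csv([{"name": "Alice", "observations": [3, 4]}])
print(deserialize_csv(data))  # [{'name': 'Alice', 'observations': [3, 4]}]
```

Notice how much less naturally the nested structure fits into a flat table - a useful prompt for the question at the end of the challenge.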
diff --git a/_extras/quiz.md b/_extras/quiz.md
index 45fc0a395..e8c1b1d9a 100644
--- a/_extras/quiz.md
+++ b/_extras/quiz.md
@@ -1,145 +1,163 @@
---
title: Quiz
---
-This is an intermediate-level software development course so it is expected for you to have some prerequisite knowledge
-on the topics covered, as outlined at the [beginning of the lesson](../index.html#prerequisites).
-Here is a little quiz that you can do to test your prior knowledge to determine
-where you fit on the skills spectrum and if this course is for you.
-
-## Git
-1. Which command should you use to initialise a new Git repository?
+
+This is an intermediate-level software development course
+so you are expected to have some prerequisite knowledge of the topics covered,
+as outlined at the [beginning of the lesson](../index.html#prerequisites).
+Here is a little quiz that you can do to test your prior knowledge
+to determine where you fit on the skills spectrum and if this course is for you.
+
+## Git
+
+1. Which command should you use to initialise a new Git repository?
~~~
a. git bash
b. git install
c. git init
d. git start
- ~~~
-
- >> ## Solution
- >> `git init` is the command to initialise a Git repository and tell Git to start tracking files in it.
- >> `git bash`, `git start` and `git install` are not Git commands and will return an error.
+ ~~~
+
+ > > ## Solution
+ > > `git init` is the command to initialise a Git repository
+ > > and tell Git to start tracking files in it.
+ > > `git bash`, `git start` and `git install` are not Git commands and will return an error.
> {: .solution}
-2. After you initialise a new Git repository and create a file named `LICENCE.md` in the root of the repository,
-which of the following commands will not work?
+2. After you initialise a new Git repository
+ and create a file named `LICENCE.md` in the root of the repository,
+ which of the following commands will not work?
~~~
- a. git add LICENCE.md
- b. git status
- c. git add .
+ a. git add LICENCE.md
+ b. git status
+ c. git add .
d. git commit -m "Licence file added"
~~~
- > > ## Solution
- > > `git commit -m "Licence file added"` won't work because you need to add the file to Git's staging area first
+ > > ## Solution
+ > > `git commit -m "Licence file added"` won't work
+ > > because you need to add the file to Git's staging area first
> > before you can commit.
> {: .solution}
-3. `git clone` command downloads and creates a local repository from a remote repository. Which command can then be
-used to upload your local changes back to the remote repository?
+3. `git clone` command downloads and creates a local repository from a remote repository.
+ Which command can then be used to upload your local changes back to the remote repository?
~~~
- a. git push
- b. git add
- c. git upload
- d. git commit
+ a. git push
+ b. git add
+ c. git upload
+ d. git commit
~~~
-
- > > ## Solution
- > > `git push` is the correct command. `git add` adds a file to the local staging area, `git commit` commits the
- > > staged changes to the local repository and `git push` will push those committed changes to the remote repository.
- > > `git upload` is not a Git command and will return an error.
- > {: .solution}
-## Shell
-1. In the command line shell, which command can you use to see the directory you are currently in?
+ > > ## Solution
+ > > `git push` is the correct command.
+ > > `git add` adds a file to the local staging area,
+ > > `git commit` commits the staged changes to the local repository
+ > > and `git push` will push those committed changes to the remote repository.
+ > > `git upload` is not a Git command and will return an error.
+ > {: .solution}
+
+## Shell
+
+1. In the command line shell,
+ which command can you use to see the directory you are currently in?
~~~
- a. whereami
- b. locate
- c. map
- d. pwd
+ a. whereami
+ b. locate
+ c. map
+ d. pwd
~~~
-
- > > ## Solution
- > > `pwd` (which stands for 'print working directory') is the correct command.
- > {: .solution}
-2. Which command do you use to go to the parent directory of the directory you are currently in?
-
+ > > ## Solution
+ > > `pwd` (which stands for 'print working directory') is the correct command.
+ > {: .solution}
+
+2. Which command do you use to go to the parent directory of the directory you are currently in?
+
~~~
- a. cd -
- b. cd ~
- c. cd /up
- d. cd ..
+ a. cd -
+ b. cd ~
+ c. cd /up
+ d. cd ..
~~~
-
- > > ## Solution
- > > `cd ..` is the correct command. `cd -` goes to the previous location in history (not parent). `cd ~` goes to the home folder. `cd /up` goes to a folder `up` in the root (`/`) of the file system.
- > {: .solution}
-3. How can you append the output of a command to a file?
-
+ > > ## Solution
+ > > `cd ..` is the correct command.
+ > > `cd -` goes to the previous location in history (not parent).
+ > > `cd ~` goes to the home folder.
+ > > `cd /up` goes to a folder `up` in the root (`/`) of the file system.
+ > {: .solution}
+
+3. How can you append the output of a command to a file?
+
~~~
- a. command > file
- b. command >> file
- c. command file
- d. command < file
+ a. command > file
+ b. command >> file
+ c. command file
+ d. command < file
~~~
-
- > > ## Solution
- > > `command >> file` is the correct command. `command > file` will redirect the output of a command to a file and
- > >overwrite its content, `command file` will pass the file as an argument to the command and `command < file` redirects
- > > input rather than output.
- > {: .solution}
+
+ > > ## Solution
+ > > `command >> file` is the correct command.
+ > > `command > file` will redirect the output of a command to a file
+ > > and overwrite its content,
+ > > `command file` will pass the file as an argument to the command
+ > > and `command < file` redirects input rather than output.
+ > {: .solution}
## Python
+
1. Which of these collections defines a list in Python?
-
- ~~~
- a. {"apple", "banana", "cherry"}
- b. {"name": "apple", "type": "fruit"}
- c. ["apple", "banana", "cherry"]
- d. ("apple", "banana", "cherry")
- ~~~
-
- > > ## Solution
- > > While all of the answers define a collection in Python, `["apple", "banana", "cherry"]` defines a list and
- > > is the correct answer. `{"apple", "banana", "cherry"}` defines a set; `{"name": "apple", "type": "fruit"}` defines a dictionary
- > > (a hash map),`("apple", "banana", "cherry")` defines a tuple (an ordered and unchangeable collection).
+
+ ~~~
+ a. {"apple", "banana", "cherry"}
+ b. {"name": "apple", "type": "fruit"}
+ c. ["apple", "banana", "cherry"]
+ d. ("apple", "banana", "cherry")
+ ~~~
+
+ > > ## Solution
+ > > While all of the answers define a collection in Python,
+ > > `["apple", "banana", "cherry"]` defines a list and is the correct answer.
+ > > `{"apple", "banana", "cherry"}` defines a set;
+ > > `{"name": "apple", "type": "fruit"}` defines a dictionary (a hash map);
+ > > `("apple", "banana", "cherry")` defines a tuple (an ordered and unchangeable collection).
> {: .solution}
2. What is the correct syntax for *if* statement in Python?
-
+
~~~
- a. if (x > 3):
- b. if (x > 3) then:
- c. if (x > 3)
- d. if (x > 3);
+ a. if (x > 3):
+ b. if (x > 3) then:
+ c. if (x > 3)
+ d. if (x > 3);
~~~
- > > ## Solution
- > > `if (x > 3):` is the correct answer.
- > {: .solution}
+ > > ## Solution
+ > > `if (x > 3):` is the correct answer.
+ > {: .solution}
-3. Look at the following 3 assignment statements in Python.
+3. Look at the following 3 assignment statements in Python.
~~~
- n = 300
- m = n
- n = -100
+ n = 300
+ m = n
+ n = -100
~~~
-
+
What is the result at the end of the above assignments?
~~~
- a. n = 300 and m = 300
- b. n = -100 and m = 300
- c. n = -100 and m = -100
- d. n = 300 and m = -10
+ a. n = 300 and m = 300
+ b. n = -100 and m = 300
+ c. n = -100 and m = -100
+   d. n = 300 and m = -100
~~~
-
- > > ## Solution
- > > `n = -100 and m = 300` is the correct answer.
- > {: .solution}
+
+ > > ## Solution
+ > > `n = -100 and m = 300` is the correct answer.
+ > {: .solution}
diff --git a/_extras/vscode.md b/_extras/vscode.md
index b53fa817b..6de0e1a39 100644
--- a/_extras/vscode.md
+++ b/_extras/vscode.md
@@ -4,7 +4,8 @@ title: "Additional Material: Using Microsoft Visual Studio Code"
## Installation
-VSCode is available from the project website [here](https://code.visualstudio.com/download). Users on Ubuntu can install the program via the package manager:
+VSCode is available from the project website [here](https://code.visualstudio.com/download).
+Users on Ubuntu can install the program via the package manager:
~~~
sudo apt install code
@@ -13,11 +14,15 @@ sudo apt install code
### Extensions
-As an IDE VSCode can be used for many programming languages provided the appropriate extensions have been installed. For this workshop we will require the Python extensions, to install extensions click the icon below in the sidebar:
+As an IDE, VSCode can be used for many programming languages,
+provided the appropriate extensions have been installed.
+For this workshop we will require the Python extensions.
+To install extensions, click the icon below in the sidebar:

-Search "python" and select the result for the Intellisense extension created by Microsoft. Click "Install" to install the extension, you may be asked to also reload the window.
+Search "python" and select the result for the Intellisense extension created by Microsoft.
+Click "Install" to install the extension; you may also be asked to reload the window.

@@ -26,42 +31,62 @@ You are now ready to code!
## Using the VSCode IDE
-Let's open our project in VSCode now and familiarise ourselves with some commonly used features.
+Let's open our project in VSCode now
+and familiarise ourselves with some commonly used features.
### Opening a Software Project
Create a directory in a location of your choice which will be your main project folder.
-If you don't have VSCode running yet, start it up now. Select `File` > `Open Folder` and navigate to the directory you created.
+If you don't have VSCode running yet, start it up now.
+Select `File` > `Open Folder` and navigate to the directory you created.
### Configuring a Virtual Environment in VSCode
-As in the previous chapter, we now want to create a virtual environment we can work in. Go to `Terminal` > `New Terminal` to open a new terminal session within the project directory, and run the command to create a new environment:
+As in the episode
+[_Virtual Environments For Software Development_]({{ page.root }}{% link _episodes/12-virtual-environments.md %}),
+we now want to create a virtual environment we can work in.
+Go to `Terminal` > `New Terminal` to open a new terminal session within the project directory,
+and run the command to create a new environment:
~~~
python3 -m venv venv
~~~
{: .language-bash}
-this will create a new folder `venv`. VSCode will notice the new environment and ask if you want to use it as the default Python interpreter for this project, click "Yes":
+This will create a new folder `venv`.
+VSCode will notice the new environment
+and ask if you want to use it as the default Python interpreter for this project;
+click "Yes":

---
**Troubleshooting**
-If the prompt did not appear, you can manually set the interpreter. Firstly navigate to the location of the `python` binary within the virtual environment using the file browser side bar (see below), this will be located at `/bin/python`. Right click on the binary and select `Copy Path`.
+If the prompt did not appear, you can manually set the interpreter.
+First, navigate to the location of the `python` binary within the virtual environment
+using the file browser side bar (see below);
+this will be located at `/bin/python`.
+Right click on the binary and select `Copy Path`.
-Then using the keyboard shortcut `CTRL-SHIFT-P` to bring up the action menu, and searching for `Python: Select Interpreter`, clicking `Enter interpreter path...` and pasting the address followed by Enter.
+Then use the keyboard shortcut `CTRL-SHIFT-P` to bring up the action menu,
+search for `Python: Select Interpreter`,
+click `Enter interpreter path...`
+and paste the address, followed by Enter.
---
-You can verify the setup has worked correctly by creating an empty Python script in the project folder. Right click on the file explorer side bar and select `New File`, create a name for the file ensuring it ends in `.py`.
+You can verify the setup has worked correctly by
+creating an empty Python script in the project folder.
+Right click on the file explorer side bar and select `New File`,
+then create a name for the file, ensuring it ends in `.py`.

-If everything is setup correctly you should see the interpreter stated in the blue information bar at the bottom of your VSCode window:
+If everything is set up correctly you should see
+the interpreter stated in the blue information bar at the bottom of your VSCode window:

@@ -72,7 +97,10 @@ Any terminal you now open will start with the virtual environment already activa
### Adding Dependencies
-For this workshop you will need to install `pytest`, `numpy` and `matplotlib`, start a new terminal to activate the environment and run:
+For this workshop you will need to
+install `pytest`, `numpy` and `matplotlib`.
+Start a new terminal to activate the environment
+and run:
~~~
pip install numpy matplotlib pytest
@@ -83,7 +111,9 @@ pip install numpy matplotlib pytest
**Troubleshooting**
-If you are having issues with `pip` it may be your version is too old. Pip will usually inform you via a warning if a newer version is available, upgrade pip by running:
+If you are having issues with `pip`, it may be that your version is too old.
+Pip will usually inform you via a warning if a newer version is available;
+upgrade pip by running:
~~~
pip install --upgrade pip
@@ -97,23 +127,33 @@ before installing packages.
## Running Scripts in VSCode
-To run a script in VSCode, open the script by clicking on it and then either click the Play icon in the top right corner, or use the keyboard shortcut `CTRL-ALT-N`.
+To run a script in VSCode,
+open the script by clicking on it
+and then either click the Play icon in the top right corner,
+or use the keyboard shortcut `CTRL-ALT-N`.

## Running Tests
-In addition VSCode also allows you to run tests from a dedicated test viewer. Clicking the laboratory flask icon in the sidebar allows you to set up test exploration:
+VSCode also allows you to run tests from a dedicated test viewer.
+Clicking the laboratory flask icon in the sidebar allows you to set up test exploration:

-Click `Configure Python Tests`, select `pytest` as the test framework, and the `tests` directory as the directory for searching.
+Click `Configure Python Tests`,
+select `pytest` as the test framework,
+and the `tests` directory as the directory for searching.
-You should now be able to run tests individually using the test browser and selecting the test of interest.
+You should now be able to run tests individually
+using the test browser and selecting the test of interest.

-### Running in Debug
+### Running in Debug
-When clicking on a test you will see two icons, the ordinary Play icon, and an icon with a bug. The latter allows you to run the tests in debug mode useful for obtaining further information as to why a failure has occurred.
\ No newline at end of file
+When clicking on a test you will see two icons:
+the ordinary Play icon, and an icon with a bug.
+The latter allows you to run the tests in debug mode,
+which is useful for obtaining further information as to why a failure has occurred.
diff --git a/index.md b/index.md
index caaa29093..30472b7bc 100644
--- a/index.md
+++ b/index.md
@@ -4,85 +4,120 @@ root: . # Is the only page that doesn't follow the pattern /:path/index.html
permalink: index.html # Is the only page that doesn't follow the pattern /:path/index.html
---
-This course aims to teach a **core set** of established, intermediate-level software development skills and
-best practices for working as part of a team in a research environment using Python as an
-example programming language (see detailed [learning objectives](/index.html#learning-objectives-for-the-workshop) below).
+This course aims to teach a **core set** of established,
+intermediate-level software development skills
+and best practices for working as part of a team in a research environment
+using Python as an example programming language
+(see detailed [learning objectives](/index.html#learning-objectives-for-the-workshop) below).
The core set of skills we teach is not a comprehensive set of all-encompassing skills,
-but a selective set of tried-and-tested collaborative development skills that forms a firm foundation for continuing
-on your learning journey.
+but a selective set of tried-and-tested collaborative development skills
+that forms a firm foundation for continuing on your learning journey.
-A **typical learner** for this course may be someone who is working in
-a research environment, needing to write some code, has **gained basic software development skills** either
-by self-learning or attending, e.g., a novice [Software Carpentry Python course](https://software-carpentry.org/lessons).
-They have been **applying those skills in their domain of work by writing code for some time**, e.g. half a year or more.
-However, their software development-related projects
-are now becoming larger and are involving more researchers and other stakeholders (e.g. users), for example:
-- Software is becoming more complex and more collaborative development effort is needed to keep the software running
-- Software is going further that just the small group developing and/or using the code - there are more users and
-an increasing need to add new features
-- ['Technical debt'](https://en.wikipedia.org/wiki/Technical_debt) is increasing with demands to add new functionality while ensuring previous development efforts remain functional and maintainable
+A **typical learner** for this course may be someone who
+is working in a research environment,
+needs to write some code,
+and has **gained basic software development skills**,
+either by self-learning or by attending,
+e.g., a novice [Software Carpentry Python course](https://software-carpentry.org/lessons).
+They have been **applying those skills in their domain of work by writing code for some time**,
+e.g. half a year or more.
+However, their software development-related projects are now becoming larger
+and are involving more researchers and other stakeholders (e.g. users), for example:
-They now need intermediate software engineering skills to help them design more robust software code that goes
-beyond a few thrown-together proof-of-concept scripts, taking into consideration the lifecycle of software,
-writing software for stakeholders, team ethic and applying a process to understanding, designing, building, releasing, and maintaining software.
+- Software is becoming more complex
+ and more collaborative development effort is needed to keep the software running
+- Software is going further than just the small group developing and/or using the code -
+ there are more users and an increasing need to add new features
+- ['Technical debt'](https://en.wikipedia.org/wiki/Technical_debt) is increasing
+ with demands to add new functionality
+ while ensuring previous development efforts remain functional and maintainable
+
+They now need intermediate software engineering skills
+to help them design more robust software code that goes
+beyond a few thrown-together proof-of-concept scripts,
+taking into consideration the lifecycle of software,
+writing software for stakeholders,
+team ethic
+and applying a process to understanding, designing, building, releasing, and maintaining software.
## Target Audience
This course is for you if:
-- You have been writing software for a while, which may be used by people other than yourself, but it is
-currently undocumented or unstructured
+
+- You have been writing software for a while,
+ which may be used by people other than yourself,
+ but it is currently undocumented or unstructured
- You want to learn:
- more intermediate software engineering techniques and tools
- how to collaborate with others to develop software
- how to prepare software for others to use
- You are currently comfortable with:
- - basic Python programming (though this may not be the main language you use) and applying it to your work on a regular basis
+ - basic Python programming
+ (though this may not be the main language you use)
+ and applying it to your work on a regular basis
- basic version control using Git
- command line interface (shell)
This course is not for you if:
- - You have not yet started writing software (in which case have a look at the [Software Carpentry course](https://software-carpentry.org/lessons) or some other
- Python course for novices first)
+
+ - You have not yet started writing software
+ (in which case have a look at the
+ [Software Carpentry course](https://software-carpentry.org/lessons)
+ or some other Python course for novices first)
- You have learned the basics of writing software but have not
- applied that knowledge yet (or are unsure how to apply it) to your work. In this case, we suggest you revisit the course
- after you have been programming for at least 6 months
- - You are well familiar with the [learning objectives of the course](/index.html#learning-objectives-for-the-workshop) and those of individual episodes
+ applied that knowledge yet (or are unsure how to apply it) to your work.
+ In this case, we suggest you revisit the course
+ after you have been programming for at least 6 months
+ - You are well familiar with the
+ [learning objectives of the course](/index.html#learning-objectives-for-the-workshop)
+ and those of individual episodes
- The software you write is fully documented and well architected
> ## Prerequisites
-> To attend this course you should meet the following criteria. You can also test your prerequisite knowledge by taking
+> To attend this course you should meet the following criteria.
+> You can also test your prerequisite knowledge by taking
> [this short quiz](quiz/index.html).
>
> #### Git
> - **You are familiar with the concept of version control**
> - **You have experience configuring Git for the first time and creating a local repository**
-> - **You have experience using Git to create and clone a repository and add/commit changes to it and to push to/pull from a remote repository**
-> - Optionally, you have experience comparing various versions of tracked files or ignoring specific files
+> - **You have experience using Git to create and clone a repository
+> and add/commit changes to it and to push to/pull from a remote repository**
+> - Optionally, you have experience comparing various versions of tracked files
+> or ignoring specific files
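To gauge this prerequisite, a workflow like the following should feel familiar (a minimal sketch; the repository and file names are invented, and working with a remote additionally involves commands such as `git clone`, `git push` and `git pull`):

```shell
# Work in a throwaway directory so nothing here touches real projects
cd "$(mktemp -d)"

# Create a repository and do the first-time configuration
# (set per-repository here; --global is more typical)
git init demo-repo
cd demo-repo
git config user.name "Your Name"
git config user.email "you@example.com"

# Record a change: create a file, stage it, commit it
echo "print('hello')" > analysis.py
git add analysis.py
git commit -m "Add analysis script"
git log --oneline   # lists the single commit just made
```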
>
> #### Python
-> - **You have a basic knowledge of programming in Python (using variables, lists,
-> conditional statements, functions and importing external libraries)**
-> - **You have previously written Python scripts or iPython/Jupyter notebooks to accomplish tasks in your domain of work**
+> - **You have a basic knowledge of programming in Python
+> (using variables, lists, conditional statements,
+> functions and importing external libraries)**
+> - **You have previously written Python scripts or IPython/Jupyter notebooks
+> to accomplish tasks in your domain of work**
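For the Python prerequisite, a snippet like this one, which uses variables, a list, a conditional, a function and an imported library, should hold no surprises (an invented example, not taken from the course material):

```python
from statistics import mean  # importing from a library

def classify_readings(readings, threshold=5.0):
    """Label each reading as 'high' or 'normal' relative to a threshold."""
    labels = []
    for value in readings:
        if value > threshold:
            labels.append("high")
        else:
            labels.append("normal")
    return labels

daily_readings = [2.0, 7.5, 4.0, 10.5]    # a list stored in a variable
print(classify_readings(daily_readings))  # ['normal', 'high', 'normal', 'high']
print(mean(daily_readings))               # 6.0
```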
>
> #### Shell
-> - **You have experience using a command line interface, such as Bash, to navigate a UNIX-style file system and run
-> commands with arguments**
+> - **You have experience using a command line interface, such as Bash,
+> to navigate a UNIX-style file system and run commands with arguments**
> - Optionally, you have experience redirecting inputs and outputs from a command
{: .prereq}
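As a rough gauge of the shell prerequisite above, navigating the file system and redirecting input and output with commands like these should feel routine (the file and directory names are invented):

```shell
cd "$(mktemp -d)"                           # work in a throwaway directory
mkdir -p results                            # create a directory
cd results
printf 'first run\nsecond run\n' > log.txt  # redirect a command's output to a file
wc -l < log.txt                             # redirect a file to a command's input: counts 2 lines
cd ..
```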
> ## Learning Objectives for the Workshop
-> - Set up and use a suitable development environment together with popular source code management infrastructure to develop software collaboratively
-> - Use a test framework to automate the verification of correct behaviour of code, and employ parameterisation and continuous integration to scale and further automate your testing
-> - Design robust, extensible software through the application of suitable programming paradigms and design techniques
-> - Understand the code review process and employ it to improve the quality of code
+> - Set up and use a suitable development environment
+> together with popular source code management infrastructure to develop software collaboratively
+> - Use a test framework to automate the verification of correct behaviour of code,
+> and employ parameterisation and continuous integration
+> to scale and further automate your testing
+> - Design robust, extensible software
+> through the application of suitable programming paradigms and design techniques
+> - Understand the code review process
+> and employ it to improve the quality of code
> - Prepare and release your software for reuse by others
> - Manage software improvement from feedback through agile techniques
{: .objectives }
> ## Setup
-> Please make sure that you have all the necessary software and accounts setup ahead of the workshop
+> Please make sure that you have all the necessary software and accounts set up ahead of the workshop
> as described in the [Setup](./setup.html) section.
-> Also check the list of [common issues, fixes & tips](./common-issues/index.html) if you experience any problems
-running any of the tools you installed - your issue may be solved there.
+> Also check the list of [common issues, fixes & tips](./common-issues/index.html)
+> if you experience any problems running any of the tools you installed -
+> your issue may be solved there.
{: .callout}
{% include links.md %}