Skip to content

Feedback from Nav on Project + Model Building sections #130

@amyheather

Description

@amyheather

Feedback from Nav Mustafee (@NavonilNM), on the Model Building section - thank you Nav! This continues from prior feedback on introduction and set-up (#44) and model inputs (#62).

DistributionRegistry class

  • Nav had environment.yaml from earlier version of book, so when tried to import DistributionRegistry from sim-tools, it could not find the class. He changed the changed the sim-tools version number in .yaml file to the new version (sim-tools==1.0.0) but it only installed 0.8.0 - despite latest version in conda and pypi being 1.0.0. He tried directly using pip and the code worked fine (pip install –upgrade sim-tools) - he wondered maybe it was because .yaml file did not save before recreating the environment. --> Environment page has 1.0.0, so shouldn't be an issue for future users - was just as had older environment file.

Project: STARS

  • Great content! One question I have is the difference between repeatable and reproducible. Is the code Repeatable when the code is run by the same person? And it is Reproducible when the code is run by a different person? --> Great question! Made me realise I wasn't clear on this either - I have read up on it and clarified on the page - explanation below too:

Repeatable: Running the same code with the same inputs in the same environment reliably produces identical outputs each time. For example, work could be repeatable if have set a random seed (even if not documented environment used - as is just about yourself re-running it at the present moment in time).

Reproducible: It is about re-creating the documented conditions and regenerating the published results - perhaps by a different person on a different system, or also by the same person who wrote the code just in the future when they no longer have their old environment etc. etc. they're going back to the files they had before and trying to reproduce it all. So it's about making sure all the code, data, environment details and parameters are recorded (and ideally published).

Model Building: Randomness

  • (1) “By default, most software PRNGs initialise the seed based on something that changes rapidly and unpredictably, such as the current system time” à not sure of the word “unpredictably” in relation to system time. --> Thanks for spotting this, removed "unpredictably"

  • (2) Will the following example be better if size is parameter is changed to for the second example. --> Have kept as is as want to show it's the same code getting the same results - but have modified comments and print statements to make this a bit clearer

Image Image

This shows that that although six values are generated, the first three remain the same.

Image
  • (3) Sim-tools – it worked OK for me. I looked in environment.yml file for des-example env., and sim-tools showed under pip. I cannot now recollect, if I made this change (i.e. adding sim-tools to the yaml file)? Its been 3-4 months since I worked on the examples 😊 But anyway, it did work.
Image
  • (4) Shared generators – good example. I think to complement the exisiting material and the explanatory text box (with red background), a table can be offered (but please check with Tom). --> Good idea, this helps clarify. I have made a table like you suggest, but represented underlying RNG state with letters ("A", "B", "C") to help keep it simple.
Image

So, why this is happening becomes clear as both the RND number stream is shown (same numbers are generated here) but as the processing samples have changed, values will be different. Perhaps a python program can be written rather than a table. Anyway, this may be overexplaining and the text box may be sufficient.

Image
  • (5) Test yourself – excellent addition to the book! --> Thanks!

  • (6) Feedback box – good idea. For the feedback to appear in the webpage, will it have to be approved? For example, is there a moderation element? --> Users must login with GitHub, so completely anonymous comments are not possible. I've just double-checked, but it seems it isn't possible to require moderation approval before comments become visible. However, once posted, we do have lots of moderation actions available to us. The comments go to GitHub discussions, from where we can edit, hide or delete comments, we can report content to GitHub, and we can block users.

Model Building: Entity Generation

  • (1) Excellent explanation of the code. The note on random number (see below), could perhaps also be linked to the earlier section on “Randomness” – a hyperlink! --> Thanks! Have add an intro sentence to help clarify how this builds on Randomness page
Image
  • (2) I think DistributionRegistry has to be imported
    from sim_tools.distributions import Exponential, Normal, DistributionRegistry --> This is included in the blue box at the start of the page, though I am wondering now if perhaps the imports (and changes in imports between pages) could be clearer. I have now put them in a separate paler blue box, to help them stand out more.

  • (3) The following (“distributions are seeded in alphabetical order..”) may need further explanation. If not critical to the tutorial it could be omitted. --> I think it's important as it's something that I got confused around why my results were changing with DistributionRegistry, so have kept it in but add further explanation as suggested

Image

Model Building: Entity Processing

  • (1) Perhaps some other distribution can be used; good for learning and more exposure to sim-tools distribution classes! -->Nice idea! I think will leave as exponential though, as super handy for (1) consistency with the nurse visit simulation example, which uses same distribution, and (2) on the mathematical proof of correctness page, we validate the model against theoretical results, this requires M/M/s so needs exponential, so for simplicity makes sense to set up with that, and then that validation is the same as used for the nurse visit simulation repositories
Image
  • (2) One line can be added after the code is presented (in the area marked red) that the following sections will explain code.
Image
  • (3) “SimPy will run this in parallel with other events like arrivals, so multiple patients and activities can progress at the same time”.  I think the simulation executive is following the event list; although we can use the word “in parallel” in a general sense, I think the advancement of time, execution of bound events and execution of conditional events (ABC) is following a sequential logic. I guess this tutorial is for those who already understand DES; however, if this the book is also aimed for teaching the basics of DES, then a flowchart of what is happening in terms of event scheduling can be included (some figures and texts, some tables etc.). This example of patient and doctor (entity and resource) would be am excellent example for ABC! --> Python's SimPy and R's simmer both use a process-based approach rather than a three-phase approach. I'm adding a section to the DES page (in introduction chapter) that explains the different DES world views (event, activity, process, three-phase). On this current page, I'll clarify the wording here, and make reference to the process-based approach.

  • (4) The example had three consultants; In the output, like patient 1, perhaps we can also mention consultant 1, 2, 3? Also, If we wanted to include the ABC example, for “Bound event” (consultation), we could mention in parentheses when the consultation will end. This can be a further source of validation (e.g., patient 2 leaves at time = consultation start time (starts consultation) + consultation time (from distribution)). OR perhaps this could be a task for the reader (test yourself). --> Process-based so won't do in reference to ABC, but have followed the great suggestion of making this a test yourself activity, where they try modifying the logging messages, and can validate the end time is as expected.

Image
  • (5) Expected one additional section in relation to DistributionRegistry since it was already introduced in the earlier section (EntityGeneration) and now we have two distributions, so this example would help with learning! --> Previous page shows two options for setting up - manually or with DistributionRegistry - moving forwards from there, the examples use the first option, as the registry is a bit excessive for our simple examples in the book, as it's still only two distributions. (This is especially true in R version) (Though the stroke repository is an example they can refer to with distributionregistry used - whilst the book and m/m/s progress using a manual approach).

Model Building: Logging

  • (1) I am executing through Jupyter notebook; when classes in .py change, they are not loaded automatically. The jupyter kernal needs to be restarted. Did some searching and the addition of the first two statements (see below) worked for me. --> Good point! I also do this. However, I forgot to mention it in the book. I have add this on the package page, as that is where I describe how to set-up a Jupyter notebook and import the local package.
Image
  • (2) As the code is being developed from entity gen -> entity processing -> logging, additions are shown but not deletions. For example:
Image

However, in the code (progressing from entity processing  logging) the verbose element had to be deleted/commented (see below), else there are errors reported (e.g. since verbose is no longer in the parameter list for logging into file examples).
Thus, we can have a line (as comment) to alert that some lines may need to be commented/deleted if the workbook is being followed in a progressive style.

Image

Metadata

Metadata

Assignees

Labels

peer reviewFeedback from peer review of the repo

Type

No type

Projects

Status

Done

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions