
Add Methods Section #91

Merged
samm82 merged 39 commits into main from methods
Sep 19, 2024

@samm82
Owner

@samm82 samm82 commented Sep 14, 2024

Section 4 of my thesis (with some parts reused in Section III.D of my paper) discusses how analysis is performed, both in terms of which categorizations are used (so far, just explaining "rigidity") and how analysis is automated or augmented with automated tools, which I based on #82. I think I did a pretty good job of explaining the underlying ideas without getting bogged down in too many details, but a review by people who haven't written the code will put this to the test. Some notes on what is still left to do in this section as part of larger changes:

  1. In addition to rigidity, I also plan to outline what the different "classes" of discrepancies mean in more depth here, which will likely introduce more categories that are split off from the "other discrepancies" category (e.g., "definitions" is likely a useful category of discrepancy to track; inspired by Thesis Meeting | Aug 27, 2024 - ~4pm - Teams #86)
  2. What is meant by "source categorizations" will be addressed as part of Investigating Source Categorizations #89.
    • A note that the "rigid" graphs also exclude inferred relations should be included
  3. User instructions for running code will be addressed as part of Ensure that code can be run on others' systems #78.

This also addresses the second to-do item (plus its child items) of #69, although maybe not entirely.

Some content from Methods will be present in the Methodology section of the paper; this is probably OK?
I thought this might affect the blind paper, but since the URL doesn't actually appear in the paper, it didn't
It won't hurt to have an updated version of it in the repo though
Recompiled some graphs, but probably OK since there was no change to their tex files
Ensure that 'usual' glossary output isn't overwritten by 'example' glossaries
Improve how graphs/legends are 'found' by Makefile
*Actually* fix Makefile code for finding graphs/legends
Improve how graphs without legends are made
Elaborate on how discrepancies were found
Organize example glossaries
Compile paper PDF, since that hadn't been done in a while for these changes
Minor improvements to formatting, including discrep tab ref func name
Update commit in link to source code; will need to be updated from this as well
Now that I'm more sure it's implemented consistently
Use this example to explain what is meant by 'double counting' and how it is avoided
Confirm that this works as expected by improving existing manual discreps
Bug found by accident when testing how 'implied by' affects counts
@samm82 samm82 added the documentation (Improvements or additions to documentation) and enhancement (New feature or request) labels Sep 14, 2024
@smiths
Collaborator

smiths commented Sep 14, 2024

@samm82 to make it completely unambiguous, can you please include a link to the pdf file that includes the section you want reviewed? Also, please provide the section number.

@samm82
Owner Author

samm82 commented Sep 14, 2024

Yup! Section 4 of my thesis, with some parts being reused in Section III.D of my paper.

@smiths
Collaborator

smiths commented Sep 17, 2024

Feedback:

  • I don't like the title of Chapter 4 (Methods). The title doesn't tell the reader enough about the content of the chapter. Maybe something like Graph Generation and Analysis Tool?
  • The first paragraph, what I call the "intro blurb", is too short. It is good to have a roadmap of the chapter contents, but you should say something about the purpose of the chapter. The chapter presents the tools that were developed to facilitate the visualization and analysis of the large quantities of data collected. The tools are critical because you will need to regenerate the graphs and update the analysis whenever there is a change to the data, either because of a correction or because of a future change to the source documents.
  • In Section 4.1 you should give an example from your source of one or two implicit discrepancies. You point to Table 6.1, but the table has the counts of discrepancies, not examples. It will help the reader to see what you were looking for when you read the source data and decided on an implicit discrepancy. We want to describe your methods to the point where we could get someone else to reproduce your efforts and they would come up with essentially the same data you came up with.
  • In Section 4.1 you mention the keywords you are looking for. You should clarify where these words are. I believe the words are in the table you created. I believe the words were written by you when you were doing your data collection. Another interpretation would be that the keywords are words that you searched for in the original source text. You should aim to be unambiguous.
  • In Section 4.2 the link to "our approach glossary" didn't work for me
  • You should provide more detail on the entries in Table 4.1.
    • What does the information in brackets mean? Sometimes it says Author and then a number, but sometimes there is just a number?
    • What is Name? If it is the name of a test approach, you have room to label the column Test Approach Name.
  • "All child-parent relations are graphed, as well as synonym relations where either: 1. both terms are present in the glossary". I feel like information is missing for criterion 1. For both terms, are they synonyms?
  • Do you define earlier in the thesis what you mean by synonym and how they are detected? Do you give examples? Examples will help.
  • Do you define earlier in the thesis what you mean by the child parent relation? Do you give examples? Examples will help.
  • On page 13 you have a comment that asks "should I add an example?" Yes. :-)
  • There is a lot going on in the first full paragraph on page 13. You might want to split this into separate paragraphs and add detail.
  • In Section 4.3 you talk about a "stricter" type. Can you add an explanation of what you mean by this?
  • Section 4.3.2, the sentence that starts with "Then, the relevant sources ..." is too long.
  • Section 4.3.2, "how times" should be "how many times"
  • I feel like there is a lot going on here to keep track of. Can you create a more formal model of your graph? In particular, can you summarize all the data you are tracking and all the possible relations between the data, and the modifiers for those relations. I feel like all this information should be together in one spot.

I like what you are doing @samm82, but I don't think I entirely understand Chapter 4. I think part of the problem is reading it out of context. I don't know what you are planning on already telling the reader by the time they get to this chapter. I also think a formal model of your graph would really help the reader, and likely the author. Maybe @JacquesCarette will have some good ideas on how to be more formal? We can discuss in our meeting later today.

@samm82
Owner Author

samm82 commented Sep 19, 2024

Based on the workflow discussion in #92, I'm merging this PR, with the relevant content to be reviewed later.

@samm82 samm82 merged commit d3b3de2 into main Sep 19, 2024
@samm82 samm82 deleted the methods branch September 19, 2024 16:02
@samm82 samm82 mentioned this pull request Dec 13, 2024