Skip to content

Fixes to reprun before Stata Conference#47

Merged
ankritisingh merged 39 commits into
devfrom
reprun-46
Jul 29, 2024
Merged

Fixes to reprun before Stata Conference#47
ankritisingh merged 39 commits into
devfrom
reprun-46

Conversation

@bbdaniels
Copy link
Copy Markdown
Contributor

Fix #46

@bbdaniels bbdaniels linked an issue Jun 4, 2024 that may be closed by this pull request
@bbdaniels bbdaniels self-assigned this Jun 4, 2024
@bbdaniels bbdaniels requested a review from mariarrt94 June 4, 2024 19:31
@bbdaniels bbdaniels changed the base branch from main to dev June 4, 2024 19:31
Comment thread src/ado/reprun.ado Outdated
@mariarrt94
Copy link
Copy Markdown

@bbdaniels now it works, and stops at a different point. I'm not sure if it comes from the same issue.
Screenshot 2024-06-04 165501

The line that produces the error is this (figure 4a runs perfectly, stops when adding the comment to introduce figure 4b, we can review this tomorrow).

`
*Figure 4a below corresponds to "Mean age of private paid employees, by industry: Health"

graph hbar (mean) latest_data if indicatorname_num==94, over(region_num)  blabel(bar, format(%2.1g))  ///
	aspectratio(0.75) ///  
	xsize(2) ysize(1) ///
	scheme(plotplain)   ///
	title("Mean age of private paid employees, by industry: Health", size(small)) ///
	ytitle("Percentage", size(small)) ///
	graphregion(color(white)) ///
	legend( position(6) row(2) size(small) symxsize(6)) 
	graph save  "Graph" "${f_output_fig}/fig4a.gph", replace
	graph export "${f_output_fig_img}/fig4a.png", replace

*Figure 4b below corresponds to Mean age of public paid employees, by industry: Health"`

@bbdaniels
Copy link
Copy Markdown
Contributor Author

bbdaniels commented Jun 4, 2024

No, now it comes from Stata characters that are always problematic, namely:

` ' ( ) * //

We just need to sanitize the command that's passed in somehow... @kbjarkefur ?

I tried:

foreach w in `=subinstr(subinstr(subinstr(subinstr(subinstr(subinstr(`"`macval(line)’”’,”/“,””,.),char(42),””,.),”)”,””,.),char(34),””,.),char(96),””,.),char(39),””,.)’ {

but got:

. foreach w in `=subinstr(subinstr(subinstr(subinstr(subinstr(subinstr(`"*Figure 4b below corresponds to Mean age of public paid employees, by industry: Health""',"/","",.),char(42),"",.),")","",.),char(34),"",.)
> ,char(96),"",.),char(39),"",.)' {
  2. di `"`w'"'
  3. }
- foreach w in `=subinstr(subinstr(subinstr(subinstr(subinstr(subinstr(`"*Figure 4b below corresponds to Mean age of public paid employees, by industry: Health""',"/","",.),char(42),"",.),")","",.),char(34),"",.)
> ,char(96),"",.),char(39),"",.)' {
expression too long

Comment thread src/ado/reprun.ado Outdated
Copy link
Copy Markdown

@mariarrt94 mariarrt94 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Works now! :)

@bbdaniels
Copy link
Copy Markdown
Contributor Author

bbdaniels commented Jun 5, 2024

Solves #48

@bbdaniels bbdaniels linked an issue Jun 5, 2024 that may be closed by this pull request
Comment thread src/ado/reprun.ado Outdated
Comment on lines +283 to +299
local theline = regexr(`"`macval(line)'"',"[\[//]\\\^\%\|\?\*\+\(\)]","")
forv i = 1/10 {
local theline = regexr(`"`theline'"',"[\[//]]","")
local theline = regexr(`"`theline'"',"[\\]","")
local theline = regexr(`"`theline'"',"[\^]","")
local theline = regexr(`"`theline'"',"[\%]","")
local theline = regexr(`"`theline'"',"[\|]","")
local theline = regexr(`"`theline'"',"[\?]","")
local theline = regexr(`"`theline'"',"[\*]","")
local theline = regexr(`"`theline'"',"[\+]","")
local theline = regexr(`"`theline'"',"[\(]","")
local theline = regexr(`"`theline'"',"[\)]","")
// local theline = regexr(`"`theline'"',`"[\"]"',"")
}

foreach w in `macval(theline)' {
cap get_command, word(`"`w'"')
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section is attempting to sanitize strings to pass into get_command. It works decently should rather be a while loop most likely, and should probably put in spaces instead of blanks.

However, I can't figure out an appropriate handling for ". We cannot have unmatched " or the foreach line will break. At the same time, we cannot remove " because they are also needed to delineate words.

One solution @luisesanmartin proposed is to have a better handling for catching comments so they can be skipped here entirely. Another solution is to simply note failures here and fix them manually -- they are somewhat rare but @ankritisingh has encountered a couple (especially in comments, which is particularly painful).

Another solution is to take the handling of dofiles out of this loop, since we are only trying to catch do/run here. We could handle that separately in the full line if desired, rather than trying to catch the next word at this point.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was the latest one but I don't see an unmatched "

Error:

Screenshot 2024-06-06 at 4 26 48 PM

Code:

Screenshot 2024-06-06 at 4 26 43 PM

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the unhandled *. Hence needing a while loop to catch them all. @luisesanmartin also interesting that catching comments won't help here!

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, it's possible that it's taking cmd"pdslasso alone as a word, which will still produce a lone " in the next step.

Comment thread src/ado/reprun.ado
Comment thread src/ado/reprun.ado
Comment thread src/ado/reprun.ado
Comment thread src/ado/reprun.ado
@bbdaniels bbdaniels changed the title Regex structure for sanitizing command strings before parsing Various fixes to issues. Jun 27, 2024
Comment thread src/ado/reprun.ado Outdated
Comment on lines +273 to +284
// Sanitize that string! -- see d17586d873a978987f34ba2fe536a311107ea58b for more regex
local theline = `"`macval(line)'"'
while regexm(`"`macval(theline)'"',"[\*]") {
local theline = regexr(`"`macval(theline)'"',"[\*]","")
}
while regexm(`"`macval(theline)'"',"\[//]"){
local theline = regexr(`"`macval(theline)'"',"\[//]","")
}

// Identify all commands in line
foreach w in `macval(theline)' {
cap get_command, word(`"`w'"')
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@kbjarkefur This is the best I could do here -- I cannot fix a single " yet.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm still not 100% sure what an code example would be that is valid Stata code that have a single ". is there an example in the test files?

Copy link
Copy Markdown
Contributor

@ankritisingh ankritisingh Jun 27, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is one of the examples:

/* the do-file tests the "single quote
error"	
*/	
	sysuse auto, clear
		
	bys foreign: egen assigned = total(mpg)
	
	

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly Ankriti had a file that included a line like:

* A very badly behaved comment "

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here’s another example that produces the same error. However, it ran without issues in the previous version.

*** Load data (
	sysuse auto, clear

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that’s the ( I suppose. Last time I wiped many more characters but I tried to minimize it this time. Can you provide a list of all characters which cause failure in this kind of example? You should be able to implement the fix since the code is templated right?

Comment thread src/ado/reprun.ado
@bbdaniels bbdaniels changed the title Various fixes to issues. Fixes to reprun before Stata Conference Jul 9, 2024
@ankritisingh ankritisingh merged commit 3113c74 into dev Jul 29, 2024
@ankritisingh ankritisingh deleted the reprun-46 branch July 30, 2024 18:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

reprun: flag m:m merges reprun: Specific strings in command checking causes failures

4 participants