PriceDelayTstat do file winsorizing bug

Long story short, we should remove the winsorizing in this do file. But I spent way too much time tracking this down, because of OCD, and thought we should document how OCD this repo is.

So there's a bug in 

https://github.com/OpenSourceAP/CrossSection/blob/d81c696d283d62b61260f223eeac0e90511a4e77/Signals/Code/Predictors/ZZ2_PriceDelaySlope_PriceDelayRsq_PriceDelayTstat.do

line 84 has
```
gstats winsor PriceDelayTstat, by(time_avail_m) trim cuts(10 90) replace  // Trim very aggressively because coefficient/se not very well-behaved
```
which I expected to force all extreme values to the same cutoff value within a given time_avail_m. Instead, we get these weird missing values if I run
```
gstats winsor PriceDelayTstat, by(time_avail_m) trim cuts(10 90) gen(TstatWin)
list permno time_avail_m PriceDelayTstat TstatWin if time_avail_m == tm(1954m7) & permno >= 20677 & permno <= 20800
```

<img width="430" height="295" alt="Image" src="https://github.com/user-attachments/assets/41d14c9e-6c54-4fe4-a28a-aa3b74a11e31" />

The value of 14.03342 should be winsorized to a smaller value of around 6, based on the summary stats:
<img width="556" height="301" alt="Image" src="https://github.com/user-attachments/assets/c7a9bad0-6528-4c07-87b5-8fd118730a63" />

but instead it's made missing. Then the missing value is filled later on in the code, in the "Fill to Monthly" step. This weirdness might happen because the underlying data is actually daily, and we don't sort by daily date, but I'm honestly not sure.
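
One other possible reading, and this is an assumption rather than something checked against the gtools source: the `trim` option may simply mean "trim" in the usual sense, i.e. set values outside the cutoffs to missing instead of clamping them. The Python sketch below shows the difference between the two behaviors on made-up numbers. The helper names `winsorize` and `trim` are hypothetical, and NumPy's percentile interpolation need not match Stata's, so the cutoffs are only illustrative.

```python
import numpy as np
import pandas as pd

def winsorize(x, lo=10, hi=90):
    """Clamp values outside the [lo, hi] percentiles to the cutoffs."""
    p_lo, p_hi = np.nanpercentile(x, [lo, hi])
    return x.clip(lower=p_lo, upper=p_hi)

def trim(x, lo=10, hi=90):
    """Set values outside the [lo, hi] percentiles to missing."""
    p_lo, p_hi = np.nanpercentile(x, [lo, hi])
    return x.where((x >= p_lo) & (x <= p_hi))

# Made-up cross-section for one month; not the actual PriceDelayTstat data
df = pd.DataFrame({
    "time_avail_m": ["1954m7"] * 10,
    "tstat": [-3.0, -1.0, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 6.0, 14.0],
})

# Apply each rule within month, mirroring by(time_avail_m)
by_month = df.groupby("time_avail_m")["tstat"]
df["tstat_win"] = by_month.transform(winsorize)
df["tstat_trim"] = by_month.transform(trim)
print(df)
```

In this toy data, the clamped column has no missing values, while the trimmed column is missing for the smallest and largest observations. That is the same pattern as in the screenshot above, where the extreme value ends up missing instead of pulled in.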

It doesn't matter, because the OP (Hou and Moskowitz) doesn't mention any winsorizing. It's also not clear to me why we would winsorize the t-stats but not the slopes (if anything, the slopes would be noisier). Last, the winsorizing should not affect any portfolio sorts anyway.
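
To make the last point concrete, here is a minimal sketch assuming quintile sorts with percentile breakpoints and true winsorizing (clamping) at the 10th and 90th percentiles. Both are assumptions for illustration, not the repo's actual portfolio code, and the signal is simulated. Clamping the tails leaves the 20/40/60/80 breakpoints and every stock's bin unchanged.

```python
import numpy as np

rng = np.random.default_rng(0)
signal = rng.standard_t(df=3, size=1000)  # heavy-tailed toy signal, simulated

# Winsorize (clamp) at the 10th and 90th percentiles
p10, p90 = np.percentile(signal, [10, 90])
winsorized = np.clip(signal, p10, p90)

def quintile(x):
    """Assign each value to a quintile bin 0..4 using percentile breakpoints."""
    breaks = np.percentile(x, [20, 40, 60, 80])
    return np.searchsorted(breaks, x, side="right")

# Compare portfolio assignments before and after winsorizing
same = np.array_equal(quintile(signal), quintile(winsorized))
print(same)
```

This only holds for clamping. If the extreme values are set to missing instead, those stocks drop out of the sort for that month unless a later step fills them back in.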

I found this bug while comparing the Python and Stata outputs. For the Stata liberation.

