Orgize validation fails when parsing certain unicode values #22

@calmofthestorm

In general I expect unusual Unicode values to produce "interesting" results, but I'm reporting this one because it causes a panic when debug_assertions are enabled.

Each of the characters below, given alone as input, causes a panic in debug builds. I recommend running the example with --release, as otherwise the call to parse_string will panic.

It's up to you whether this is worth fixing. I saw you have a fuzz test in the source tree, so I assume crashes like this might be of interest, but I can also understand not wanting to go down the Unicode rabbit hole, and it's unclear to me how often these characters actually come up in real use.

The one or two I tested with org-element work correctly -- a headline containing them in the title is parsed correctly.

fn main() {
    // Each character below panics in a debug build when parsed on its own.
    let s = "\u{000b}\u{0085}\u{00a0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200a}\u{2028}\u{2029}\u{202f}\u{205f}\u{3000}";

    // Parse every character as a standalone document and report
    // whether validation found no errors.
    for (i, c) in s.chars().enumerate() {
        let org = orgize::Org::parse_string(c.to_string());
        println!("Validation ok for {}: {}", i, org.validate().is_empty());
    }
}
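
One observation that might help narrow this down: every character in the list above is one that Rust's standard library treats as whitespace under char::is_whitespace but not under char::is_ascii_whitespace. Whether orgize actually mixes the two definitions is only a guess on my part; I have not traced the panic to a specific line. The sketch below uses only the standard library (no orgize), and the helper name unicode_only_whitespace is made up for illustration.

```rust
// The 20 characters from the reproduction above.
const INPUT: &str = "\u{000b}\u{0085}\u{00a0}\u{1680}\u{2000}\u{2001}\u{2002}\u{2003}\u{2004}\u{2005}\u{2006}\u{2007}\u{2008}\u{2009}\u{200a}\u{2028}\u{2029}\u{202f}\u{205f}\u{3000}";

/// Hypothetical helper: returns the characters that `char::is_whitespace`
/// accepts but `char::is_ascii_whitespace` rejects.
fn unicode_only_whitespace(s: &str) -> Vec<char> {
    s.chars()
        .filter(|c| c.is_whitespace() && !c.is_ascii_whitespace())
        .collect()
}

fn main() {
    // Print how the standard library classifies each character.
    for c in INPUT.chars() {
        println!(
            "U+{:04X}: is_whitespace={} is_ascii_whitespace={}",
            c as u32,
            c.is_whitespace(),
            c.is_ascii_whitespace()
        );
    }

    let hits = unicode_only_whitespace(INPUT);
    println!(
        "{} of {} characters are Unicode-only whitespace",
        hits.len(),
        INPUT.chars().count()
    );
}
```

If the parser trims or splits with one definition and asserts with the other, that mismatch would be a plausible place to look, but again that is an assumption rather than something I have confirmed in the source.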
