Punctuation behind number caused number not processed #21

@hongbo-miao

Summary

I am using text-processing-rs 0.2.1.
I found that punctuation directly after a number causes the number to be only partially processed.

Case 1

use text_processing_rs::normalize_sentence_aviation;

fn main() {
    let normalized_text = normalize_sentence_aviation("United seven eighty eight, please come up on frequency one three five point six two five, thanks.");
    println!("{normalized_text}");
}
United seven eighty eight, please come up on frequency one three five point six two five, thanks.

gives

United 780 eight, please come up on frequency 135.62 five, thanks.

I expect

United 788, please come up on frequency 135.625, thanks.

Case 2 - Changing punctuation to .

United seven eighty eight. please come up on frequency one three five point six two five. thanks.

gives

United 780 eight. please come up on frequency 135.62 five. thanks.

I expect

United 788. please come up on frequency 135.625. thanks.

Case 3 - Adding space before ,

United seven eighty eight , please come up on frequency one three five point six two five , thanks.

gives

United 788 , please come up on frequency 135.625 , thanks.

So I think the problem is related to punctuation immediately following a number.
This also applies to normalize_sentence, not just normalize_sentence_aviation.

Workaround

One workaround is:

  1. Add a space before every punctuation mark.
  2. Run normalize_sentence or normalize_sentence_aviation.
  3. Remove the spaces before the punctuation marks again.
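
The three workaround steps could be sketched roughly as follows. This is only an illustration, not part of text-processing-rs: the helper names (`pad_punctuation`, `unpad_punctuation`, `normalize_with_workaround`) and the punctuation set are my own assumptions, and the normalizer is passed in as a closure so the sketch does not depend on the crate's exact signatures.

```rust
// Hypothetical helpers illustrating the workaround; not part of text-processing-rs.

// Assumed set of punctuation marks to pad; adjust as needed.
const PUNCTUATION: [char; 6] = [',', '.', '!', '?', ';', ':'];

// True if chars[i] is punctuation followed by whitespace or the end of the text.
// This avoids touching a '.' inside an already normalized number such as 135.625.
fn is_trailing_punct(chars: &[char], i: usize) -> bool {
    PUNCTUATION.contains(&chars[i])
        && chars.get(i + 1).map_or(true, |c| c.is_whitespace())
}

// Step 1: insert a space before each trailing punctuation mark.
fn pad_punctuation(text: &str) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::with_capacity(text.len() + 8);
    for (i, &ch) in chars.iter().enumerate() {
        if is_trailing_punct(&chars, i) && !out.is_empty() && !out.ends_with(' ') {
            out.push(' ');
        }
        out.push(ch);
    }
    out
}

// Step 3: remove a single space that precedes a trailing punctuation mark.
// Note: this also removes such a space if it was present in the original input.
fn unpad_punctuation(text: &str) -> String {
    let chars: Vec<char> = text.chars().collect();
    let mut out = String::with_capacity(text.len());
    for (i, &ch) in chars.iter().enumerate() {
        if is_trailing_punct(&chars, i) && out.ends_with(' ') {
            out.pop();
        }
        out.push(ch);
    }
    out
}

// Steps 1 to 3 combined; `normalize` is whichever normalizer is being wrapped (step 2).
fn normalize_with_workaround<F>(text: &str, normalize: F) -> String
where
    F: Fn(&str) -> String,
{
    unpad_punctuation(&normalize(&pad_punctuation(text)))
}
```

Assuming normalize_sentence_aviation takes a &str and returns a String, it could be wrapped as `normalize_with_workaround(text, |s| normalize_sentence_aviation(s))`.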
