@kojiishi
Collaborator

This patch supports non-breaking content in Java.

In the Java and Python implementations, the "skip" operation still passes the skipped content to the BudouX parser, so the text given to the parser does not need to change.

This patch changes the following items:

  1. Add NOBR to the "skip" elements.
  2. Fix "skip" so that it applies only to the element's descendants. Before this patch, all content following a "skip" element was skipped.
  3. When there is a phrase boundary right before a "skip" element, insert a break before the element.
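
For illustration, the scoping described in item 2 can be sketched in Python with the standard `html.parser` module. This is not the code from this patch: `SKIP_TAGS`, `SkipScopeParser`, and `skip_flags` are hypothetical names, and the tag set is an assumption (the real list lives in `skip_nodes.json`). The sketch only shows that text is flagged as skipped while a "skip" element is open, and that the flag clears once that element closes.

```python
from html.parser import HTMLParser

# Assumption: a small stand-in for the real skip list in skip_nodes.json.
SKIP_TAGS = {"nobr", "code"}


class SkipScopeParser(HTMLParser):
    """Records each text chunk with a flag: is it inside a skip element?"""

    def __init__(self):
        super().__init__()
        self.open_skips = []  # stack of currently open skip elements
        self.chunks = []      # (text, skipped) pairs in document order

    def handle_starttag(self, tag, attrs):
        if tag in SKIP_TAGS:
            self.open_skips.append(tag)

    def handle_endtag(self, tag):
        # Leave skip mode only when the matching skip element closes.
        if self.open_skips and self.open_skips[-1] == tag:
            self.open_skips.pop()

    def handle_data(self, data):
        self.chunks.append((data, bool(self.open_skips)))


def skip_flags(html):
    """Return (text, skipped) pairs for every text node in the input."""
    parser = SkipScopeParser()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

With a single boolean that is reset at every end tag, the `d` in `a<nobr>b<b>c</b>d</nobr>e` would lose its skipped state as soon as `</b>` is seen. The stack keeps it skipped until `</nobr>`, and `e` is processed normally again.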

@kojiishi kojiishi force-pushed the nobr-java branch 2 times, most recently from 2d58953 to eb356f6 Compare August 15, 2023 09:30
@kojiishi kojiishi marked this pull request as ready for review August 15, 2023 09:33
@kojiishi kojiishi force-pushed the nobr-java branch 2 times, most recently from 8ea8817 to a4e567b Compare August 15, 2023 11:05
kojiishi added a commit to kojiishi/budoux that referenced this pull request Aug 15, 2023
This patch supports non-breaking content in Python.

In the Java and Python implementations, the "skip" operation still
passes the skipped content to the BudouX parser, so the text given to
the parser does not need to change.

This patch changes the following items:
1. Change `to_skip` to a stack of elements, rather than always resetting
   it to `False` at the end of an element.
2. When there is a phrase boundary right before a "skip" element,
   insert a break before the element.

Note `<NOBR>` is added to `skip_nodes.json` at:
google#248.
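
A minimal sketch of item 2 above, assuming phrase boundaries are available as character offsets into the concatenated text and that the document has already been split into `(text, skipped)` segments. `insert_breaks` and the `|` marker are hypothetical and stand in for whatever break the real implementation emits. Boundaries that fall inside skipped content are dropped, while a boundary that falls exactly where skipped content begins is still emitted, just before it.

```python
def insert_breaks(segments, boundaries, marker="|"):
    """Insert `marker` at phrase boundaries, honoring skipped segments.

    segments:   list of (text, skipped) pairs in document order
    boundaries: set of character offsets into the concatenated text
    """
    out = []
    offset = 0
    prev_skipped = False
    for text, skipped in segments:
        for i, ch in enumerate(text):
            pos = offset + i
            # True only at the first character of a run of skipped text.
            # Simplification: adjacent skipped segments count as one run.
            entering_skip = skipped and i == 0 and not prev_skipped
            if pos in boundaries and pos > 0 and (not skipped or entering_skip):
                out.append(marker)
            out.append(ch)
        offset += len(text)
        if text:
            prev_skipped = skipped
    return "".join(out)
```

For example, with segments `abc`, `de` (skipped), `f` (skipped), `gh` and boundaries at offsets 1, 3, 5, and 6, the break at 3 lands before the skipped run, the one at 5 is suppressed because it is inside the run, and the result is `a|bc|def|gh`.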
@kojiishi kojiishi requested a review from tushuhei August 15, 2023 12:36
@tushuhei tushuhei merged commit a448046 into google:main Aug 17, 2023
tushuhei pushed a commit that referenced this pull request Aug 17, 2023
@kojiishi kojiishi deleted the nobr-java branch August 17, 2023 07:28