Fix app hang when opening post list and loading more posts#24728
Conversation
7f0761a to
aff8b88
Compare
|
| App Name | WordPress | |
| Configuration | Release-Alpha | |
| Build Number | 28417 | |
| Version | PR #24728 | |
| Bundle ID | org.wordpress.alpha | |
| Commit | 2547c92 | |
| Installation URL | 4na9spuibh4o0 |
|
| App Name | Jetpack | |
| Configuration | Release-Alpha | |
| Build Number | 28417 | |
| Version | PR #24728 | |
| Bundle ID | com.jetpack.alpha | |
| Commit | 2547c92 | |
| Installation URL | 5vb9j19foktfg |
|
The app already depends on SwiftSoup. Have you tried using that? I presume it'd be faster than Regex matching. |
| let summary = content.strippingGutenbergContentForExcerpt() | ||
|
|
||
| XCTAssertEqual(summary, expectedSummary) | ||
| } |
There was a problem hiding this comment.
The two test functions that test gallery blocks seem to be already covered by GutenbergExcerptGeneratorTests.swift. But should we keep the video press ones below?
There was a problem hiding this comment.
Sure, let's add it because the original "remove VideoPress blocks with regex" code was added a fix for an issue: wordpress-mobile/WordPress-iOS-Shared#339. Let's verify it still works:
@Test
func testVideoPressBlock() {
let content = "<p>Before</p>\n<!-- wp:videopress/video {\"title\":\"demo\",\"description\":\"\",\"id\":5297,\"guid\":\"AbCDe\",\"videoRatio\":56.333333333333336,\"privacySetting\":2,\"allowDownload\":false,\"rating\":\"G\",\"isPrivate\":true,\"duration\":1673} -->\n<figure class=\"wp-block-videopress-video wp-block-jetpack-videopress jetpack-videopress-player\"><div class=\"jetpack-videopress-player__wrapper\">\nhttps://videopress.com/v/AbCDe?resizeToParent=true&cover=true&preloadContent=metadata&useAverageColor=true\n</div></figure>\n<!-- /wp:videopress/video -->\n<p>After</p>"
let summary = GutenbergExcerptGenerator.firstParagraph(from: content, maxLength: 150)
#expect(summary == "Before")
}Here as expected, it picks the first paragraph regardless of what types of blocks the post has. It should now cover even more scenarios than before.
| let range = NSRange(rawText.startIndex..., in: rawText) | ||
| let text = (regex?.stringByReplacingMatches(in: rawText, options: [], range: range, withTemplate: "") ?? rawText) | ||
| .stringByDecodingXMLCharacters() | ||
| .trimmingCharacters(in: .whitespacesAndNewlines) |
There was a problem hiding this comment.
You may be able to use SwiftSoup to simplify the parsing as:
let doc = SwiftSoup.parse(...)
let text = doc.select("p:first-of-type").text()There was a problem hiding this comment.
I didn't know we had it. It's used only in two places in GBM that will be removed soon. I'd suggest keeping the simpler ad-hoc version - it's probably going to be faster than SwiftSoup. SwiftSoup is pretty large and it would be nice to remove it.
There was a problem hiding this comment.
It's also used in GBK (see wordpress-mobile/GutenbergKit#148).
I did a quick search. It's added in 25.0 (see diff). The app file size was increased from 211 MB to 212 MB. The binary size impact seems to be okay.
There was a problem hiding this comment.
Yeah, but it is a transient dependency and a pretty large one. It would be uncalled for if the app ends up with SwiftSoup as a dependency and it's used to find a single <p> tag for this small use-case.
There was a problem hiding this comment.
There are a few other places that use regex to parse HTML. I reckon we should replace them using a proper HTML parser instead.
There was a problem hiding this comment.
Yeah, it would be a good idea to replace these. The code I wrote also should use a full blown HTML parser, but it would be ideal to move it to the background first and do it when parsing posts. I'm pretty sure parsing also currently happens on the main thread if I'm not mistaken. So there is plenty to optimize.
| // Find first <p> tag content | ||
| guard let pStart = content.range(of: "<p", options: .caseInsensitive), | ||
| let pEnd = content.range(of: "</p>", options: .caseInsensitive, range: pStart.upperBound..<content.endIndex), | ||
| let tagEnd = content.range(of: ">", range: pStart.upperBound..<pEnd.lowerBound) else { |
There was a problem hiding this comment.
Why is searching for tagEnd needed? Can you search <p> instead of <p?
There was a problem hiding this comment.
It covers the scenario like <p class="test">text</p>. It first finds the beginning <p and the looks for the closing tag. It's a slightly faster way to do it than with regex (I haven't measured it, so it may not be true). There is also probably a way to break it, but I think it's pretty unlikely and it is not mission critical.
There was a problem hiding this comment.
It covers the scenario like
<p class="test">text</p>.
Right, of course. 👍
|





Fixes a (massive) app hang when loading the Posts screen and loading more Posts. For this purposes (short details label on the list) all we gotta do is show the first paragraph. I tested it on P2 and it produced perfectly acceptable result. The new implementation is orders of magnitude faster.