Conversation
There was a problem hiding this comment.
NodeLists are iterable<node>.
This means they can be iterated over with for..of loops.
It could result in the following code:
function convertRelativeURIs(tagName, propName) {
for(var elem of articleContent.getElementsByTagName(tagName)){
var relativeURI = elem.getAttribute(propName);
if (relativeURI != null)
elem.setAttribute(propName, toAbsoluteURI(relativeURI));
}
}This would work in Firefox, but wouldn't in Node 0.10. It can be made to work in Node 0.12 and io.js with some experimental flags on I believe. The future (hopefully coming soon) is that ES6 will be deployed consistently everywhere, so it will eventually work everywhere.
I don't know what's best, but felt it's worth sharing options :-)
There was a problem hiding this comment.
Readability's code needs to run on iOS, so I suspect what you suggest is not an option (at least for now, sadly).
There was a problem hiding this comment.
The downside of forEach used to be that the extra function calls produced overhead, so the performance is worse. I don't know if that's still (meaningfully) the case.
There was a problem hiding this comment.
I'm always prepending test suite run calls with the time exec, I didn't notice any significant impact on performance. I think js engines are now optimizing this properly, and even if it doesn't perform as good I think more readable code is worth the (cheap) price to pay here.
There was a problem hiding this comment.
Functions calls are removed by the JIT compiler, especially if the collection type is stable.
If you notice a significant difference (which is unlikely), file a bug, that's the sort of things JS engine folks want to know about.
More on the topic http://rfrn.org/~shu/2013/03/20/two-reasons-functional-style-is-slow-in-spidermonkey.html
There was a problem hiding this comment.
Functions calls are removed by the JIT compiler, especially if the collection type is stable.
Which isn't the case here for real DOMs, right? :-)
I'm also confused because your link seems to imply that there really is still a perf penalty unless you heavily optimize the function(s).
There was a problem hiding this comment.
Guys, please note we're speaking milliseconds here. I think the argument of the impossibility to deal with DOM mutations within function based loops is much more worth spending our time.
There was a problem hiding this comment.
Certainly here though, the issue I'd be worried about is that Array.forEach can't deal with removing items from the array while iterating. The reason this loop iterated backwards is to deal with NodeLists that are dynamic, ie when you call removeChild, the item will disappear from the doc.getElementsByTagName result, and therefore from the nodelist.
Your change does not handle that, so on compliant DOM implementations it would break this code. It doesn't with JSDOMParser, but I believe at least iOS will want to use Readability with a "real" DOM.
There was a problem hiding this comment.
var a = [1, 2, 3];
a.forEach(function(x, i) {
if (x === 2)
a.splice(i, 1)
});
console.log(a) // [1,3]I agree this isn't the most easy code to deal with, but in my world still better than using traditional for loops (I've probably been too much into FP lately, trying to avoid mutable state changes as much as I can, or when I can't trying to limit side effects as much as possible — which is nearly impossible with the DOM :/)
There was a problem hiding this comment.
Hmm, you're probably right that dealing with a mutable DOM that way is probably not a good idea. I should just cancel this PR then.
There was a problem hiding this comment.
Hm, no, actually the docs say that I am just wrong:
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/Array/forEach
The range of elements processed by forEach() is set before the first invocation of callback. Elements that are appended to the array after the call to forEach() begins will not be visited by callback. If the values of existing elements of the array are changed, the value passed to callback will be the value at the time forEach() visits them; elements that are deleted before being visited are not visited.
I don't think any of these loops care about added items?
There was a problem hiding this comment.
I don't think any of these loops care about added items?
It doesn't seem so, though that means we won't ever be able to introduce such a behavior; on the other hand, appending items to a currently iterated collection is definitely asking for trouble imoe :D
There was a problem hiding this comment.
Hrmpf. I thought this was all going to go OK, and then I realized we'll still need to fix the _setNodeTag stuff for non-JSDOMParser, which will involve reparenting stuff (ie, below, move everything from a font tag into a different span tag and then replace the font with the span). This is unfortunate because if there are nested tags, we will be adding + removing things, which will mess up the loop. For instance:
<font><font>Hello</font></font>
We'll hit the first font first with the forEach loop, which then means we'll reparent the second one, which might mean we won't iterate over the second node? Not sure... :-\
Tested in Firefox's devtools console:
>>> f = document.documentElement.appendChild(document.createElement('font'));
<font>
>>> f.appendChild(document.createElement('font'))
<font>
>>> document.getElementsByTagName("font")
HTMLCollection [ <font>, <font> ]
>>> document.getElementsByTagName("font").forEach
undefined
>>> Array.prototype.forEach.call(document.getElementsByTagName("font"), function(f) {
>>> var x = document.createElement("span");
>>> Array.prototype.forEach.call(f.childNodes, function(fc) {
>>> x.appendChild(fc);
>>> });
>>> f.parentNode.replaceChild(x, f);
>>> });
undefined
>>> document.getElementsByTagName("font")
HTMLCollection [ <font> ]
>>> document.getElementsByTagName("font")[0].parentNode
<span>
There was a problem hiding this comment.
Damn. Sounds so much complicated. Indirectly related, just out of curiosity: why do we want to replace font tags with spans? couldn't we just remove these font tags? Note that we currently keep the font tag attributes, which makes generated HTML looking a little silly.
There was a problem hiding this comment.
It's to avoid the styles of the readable page being messed up by the page's (font tag's) styles, I think. Like I said, we can leave it for a followup, it's not currently broken with this change.
|
@n1k0 I need to head out and likely won't be back until the evening here in SF. I think this is basically OK but we need to be careful with the forEach stuff in cases where we manipulate the things we're looping. (see #52 (comment) ) Can you just file a followup that as well? I think the clarity this change adds is something we should take while we're still working on the compat with live DOMs. |
|
(by which I mean, we can postpone the hard work of figuring out those changes later, and potentially change them back to manual loops if that fixes bugs at that point in time, but merge this PR now and iterate afterwards) |
Filed #59 |
|
Great, this is ready to be merged now, right, @n1k0 ? |
I think so! :) |
Use forEach when it makes sense.
|
\o/ |
I'm finding all these for loops hard to read and error prone since we use
varinstead ofletstatements.I'm suggesting we loop using
forEachwhenever it makes sense to address both issues.@leibovic @gijsk thoughts?