Skip to content

ImportHandler: instead of failing when a non Ascii code is uploaded do a roundtrip conversion - Is this mergeable on 03/04/2020? Seems buggy#3796

Closed
muxator wants to merge 1 commit intoether:developfrom
muxator:lenient-import
Closed

ImportHandler: instead of failing when a non Ascii code is uploaded do a roundtrip conversion - Is this mergeable on 03/04/2020? Seems buggy#3796
muxator wants to merge 1 commit intoether:developfrom
muxator:lenient-import

Conversation

@muxator
Copy link
Copy Markdown
Contributor

@muxator muxator commented Mar 31, 2020

This PR includes a change by @JohnMcLear that was originally part of #3737 (about utf8 tests).

I merged all the other changes there, and left out this one because I would like more testing.
If testing is successful (and import/export with LibreOffice works well) this change can be integrated.

…loaded just convert it then convert it back

This should mitigate the risk.
@JohnMcLear
Copy link
Copy Markdown
Member

Do weird .etherpad export and .import test IE dump all emoticons in a pad and try export and import. Should work but \o/

@JohnMcLear JohnMcLear changed the title ImportHandler: instead of failing when a non Ascii code is uploaded do a roundtrip conversion ImportHandler: instead of failing when a non Ascii code is uploaded do a roundtrip conversion - Scheduled for merge 03/04/2020 Apr 2, 2020
@muxator muxator added this to the 1.8.1 milestone Apr 3, 2020
@muxator
Copy link
Copy Markdown
Contributor Author

muxator commented Apr 3, 2020

This change is needed for avoid a crash (and then we can take it), or for having a working Libreoffice export?

In that case, it did not work for me:

  1. apply this change
  2. configure etherpad to use LibreOffice (in my case "soffice": "/usr/local/bin/libreoffice6.4")
  3. run etherpad
  4. open /tests/backend/specs/api/emojis.html in the browser & select all & CTRL+C
  5. CTRL + V in etherpad
  6. (there is another bug here: if I hit F5 without having typed at least a charachter, a blank pad is shown, but it's not relevant here)
  7. export in PDF, nothing crashes, BUT:
  8. a PDF full of strange symbols is shown

Is it ok to take this?

If so, feel free to rebase & merge.

@muxator muxator changed the title ImportHandler: instead of failing when a non Ascii code is uploaded do a roundtrip conversion - Scheduled for merge 03/04/2020 ImportHandler: instead of failing when a non Ascii code is uploaded do a roundtrip conversion - Is this mergeable on 03/04/2020? Seems buggy Apr 3, 2020
@JohnMcLear JohnMcLear self-assigned this Apr 3, 2020
@JohnMcLear
Copy link
Copy Markdown
Member

/usr/local/bin/libreoffice6.4 <-- thi sis weird.

Will look into thjis.

@JohnMcLear
Copy link
Copy Markdown
Member

PDF export does include lots of weird chars.

HTML is right.

So I wonder if the HTML > PDF soffice conversion needs a UTF8 flag too

@JohnMcLear
Copy link
Copy Markdown
Member

@muxator I'm being blocked by https://github.com/ether/etherpad-lite/pull/3783/files#diff-347748fef7777c7f23554905e9ead2c4R40

Where I'm asking for your help because I can't get the spawned process to get the right params.

See #3783

My theory is that the Etherpad > HTML is fine but HTML > PDF is not reading the HTML file as UTF8 so the PDF is getting "mashed" by the characters :)

@muxator
Copy link
Copy Markdown
Contributor Author

muxator commented Apr 4, 2020

/usr/local/bin/libreoffice6.4 <-- thi sis weird.

Comes from the latest LibreOffice 6.4 packages from upstream: https://www.libreoffice.org/download/download/

@muxator I'm being blocked by https://github.com/ether/etherpad-lite/pull/3783/files#diff-347748fef7777c7f23554905e9ead2c4R40

Where I'm asking for your help because I can't get the spawned process to get the right params.

Next thing I'll look at 👌

@JohnMcLear
Copy link
Copy Markdown
Member

What's blocking on this?

@muxator
Copy link
Copy Markdown
Contributor Author

muxator commented Apr 19, 2020

What's blocking on this?

Basically, that this change avoids a failure when exporting a document, but there is at least a case (shown in my previous comment) where the resulting PDF export is crippled anyway.

If there is a case in which this change allows a successful & correct PDF export, I'll be happy to merge it. Do we have an example document that we can test?

@JohnMcLear
Copy link
Copy Markdown
Member

Your crippled PDF export will have been due to soffice or abiword not being setup in settings.json probably? I can't replicate a failure case.

@muxator
Copy link
Copy Markdown
Contributor Author

muxator commented Apr 19, 2020

Your crippled PDF export will have been due to soffice or abiword not being setup in settings.json probably? I can't replicate a failure case.

I'll try again, but I hardly think there was a setup problem on my side 😄

@JohnMcLear
Copy link
Copy Markdown
Member

This surprised me too. You can upload non supported file types w. out abiword/soffice set in settings.json... It caught me out, I ended up with mess in my pad.

@muxator
Copy link
Copy Markdown
Contributor Author

muxator commented Apr 19, 2020

Do weird .etherpad export and .import test IE dump all emoticons in a pad and try export and import. Should work but \o/

Tried it:

  • open tests/backend/specs/api/emojis.html in Firefox, CTRL+A (select all), CTRL+V (copy)
  • open a pad, CTLR-V
  • refresh the pad, the text is there
  • export the pad in .etherpad format
  • open another pad
  • type at least a character in the new pad otherwise we do not allow import
  • import the .etherpad file

This worked for me even with the current develop (51e40dd), without needing to apply this PR

I am trying to find a case that is failing without this PR, and succeeds afterwards.

@muxator muxator modified the milestones: 1.8.3, 1.8.4 Apr 19, 2020
@JohnMcLear JohnMcLear closed this May 12, 2020
@muxator muxator removed this from the 1.8.5 milestone May 15, 2020
@muxator muxator deleted the lenient-import branch May 15, 2020 23:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants