Added check for obsolete keys (no assertions though) and removed thos…#869
Conversation
|
Good idea to clean this up. What you still need to do to make this cleanup complete:
|
|
Assertions are added. I'm not sure my Python skills are good enough to sort out the removal. I assume one will need to rewrite the complete translation file when removing the entries. For the other languages it could work the same way as adding: English is edited manually, and a Python script removes the keys from the other languages. |
31419fe to
8080501
Compare
|
Nice. To help with the Python side, I have implemented the script - you can use this code: Have a look at the allFilesMustHaveSameKeys test as well, as it can be extended to ensure that all language files contain exactly the keys in the English file and no others. |
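For readers following along, here is a minimal sketch of what "remove obsolete keys" could look like. It is not the script referred to above. The function names (`key_of`, `obsolete_keys`, `remove_obsolete`) are hypothetical, and the parsing is deliberately simplified: it assumes `=` as the only separator and ignores line continuations.

```python
import re

# First '=' that is not preceded by a backslash (simplified: "\\=" is not handled).
_SEPARATOR = re.compile(r"(?<!\\)=")


def key_of(line):
    """Return the key of a .properties-style line, or None for blanks/comments."""
    stripped = line.strip()
    if not stripped or stripped[0] in "#!":
        return None
    parts = _SEPARATOR.split(stripped, maxsplit=1)
    if len(parts) != 2:
        return None
    return parts[0].strip()


def _keys(lines):
    """Set of all keys found in the given lines."""
    return {k for k in map(key_of, lines) if k is not None}


def obsolete_keys(english_lines, translation_lines):
    """Keys present in the translation but missing from the English file."""
    english = _keys(english_lines)
    return [
        k for k in map(key_of, translation_lines)
        if k is not None and k not in english
    ]


def remove_obsolete(english_lines, translation_lines):
    """Rewrite the translation, keeping comments, blanks and still-valid keys."""
    english = _keys(english_lines)
    kept = []
    for line in translation_lines:
        key = key_of(line)
        if key is None or key in english:
            kept.append(line)
    return kept
```

As assumed in the earlier comment, removal here means rewriting the whole translation file from the lines that survive the filter.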
|
@oscargus do you need any help with this PR? |
|
I think it should be OK. I had missed that you provided the missing Python
functions though. Thanks!
|
b988726 to
6bddafb
Compare
|
I can't find the problem in the Russian translation file... The Python script doesn't find the problematic line, but apparently the Java parser finds it... |
|
in |
|
in |
|
bouncycastle is no longer in the file, and I have already tried the space. However, it seems that #, :, and ! (and =) should be escaped, so I'll try those (I just found some information...). |
|
I am not sure if # has to be escaped. I think we only escape colons, equals signs and backslashes. |
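To make the escaping question concrete, here is a small sketch that backslash-escapes every character mentioned in this thread (space, `#`, `!`, `=`, `:` and the backslash itself). The helper names are hypothetical, and `unescape_key` is simplified: it only drops the backslash and does not interpret `\n`, `\t` or `\uXXXX`.

```python
# Characters discussed in this thread, plus the backslash itself.
SPECIAL = " #!=:\\"


def escape_key(key):
    """Prefix every special character in a key with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL else ch for ch in key)


def unescape_key(escaped):
    """Inverse of escape_key: drop each escaping backslash (simplified)."""
    out = []
    i = 0
    while i < len(escaped):
        if escaped[i] == "\\" and i + 1 < len(escaped):
            out.append(escaped[i + 1])
            i += 2
        else:
            out.append(escaped[i])
            i += 1
    return "".join(out)
```

Escaping more characters than strictly required should be harmless for the Java parser, since a backslash in front of an ordinary character is simply dropped when the file is loaded.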
|
Tests are OK on my machine locally. Very strange. |
|
According to Wikipedia: Doesn't work on my local machine, but now I have at least escaped all characters. Some translations were not correctly escaped (including the Russian). Still no success though... |
|
If one removes the ! in the three first comments, the extra string doesn't contain a !. Removing all comments also removes the #. But really out of ideas at the moment... |
|
I have read up a bit more. As I understand it:
Still, these characters can always be escaped. (Doesn't help though...) |
|
Hm, you could try to debug the test and see why it fails. Or create a small main class which does this only for the Russian language. As it works on my machine, I am unable to help here. :-( |
|
I've tried this (by print-out-debugging), but since the whole file is loaded through But I just made some progress! Using a Reader and setting the encoding to "UTF-8" led to the obsolete key being |
|
"The encoding of a .properties file is ISO-8859-1, also known as Latin-1."... Bad idea to encode it in UTF-8 then... |
|
Bah! Almost three hours because someone saved a file in an invalid encoding... Anyway, now I think that it is working and that the translations are slightly easier to maintain. |
|
Does ISO-8859-1 really cover Russian characters? |
|
Somehow, yes. By using Unicode escaping it is claimed that it works anyway.
What that means in practice I'm not really sure about.
I also noted that the Japanese translation used UTF-8 without any problems,
so I cannot say that I fully understand it...
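A sketch of what "Unicode escaping" means in practice: every non-ASCII character is written as a `\uXXXX` sequence, so the file content stays plain ASCII and is therefore also valid ISO-8859-1. The function name `to_properties_ascii` is hypothetical.

```python
def to_properties_ascii(text):
    """Replace every non-ASCII character with a \\uXXXX escape sequence."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 128:
            out.append(ch)
        elif code <= 0xFFFF:
            out.append("\\u%04X" % code)
        else:
            # Characters outside the BMP become a UTF-16 surrogate pair.
            code -= 0x10000
            high = 0xD800 + (code >> 10)
            low = 0xDC00 + (code & 0x3FF)
            out.append("\\u%04X\\u%04X" % (high, low))
    return "".join(out)


# The escaped text can be encoded as ISO-8859-1 without errors.
escaped = to_properties_ascii("Файл")
escaped.encode("iso-8859-1")
```

With this form, Russian text survives a Latin-1 load because the parser turns the escapes back into characters after reading the bytes.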
|
|
My Notepad++ says that all those files are saved in UTF8 (regardless of what the comment says) - but Russian was the only one not saved with "UTF8 without BOM". |
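A minimal sketch for checking a file's bytes for a UTF-8 byte order mark, with hypothetical helper names. It also illustrates one plausible way a BOM could produce an unexpected key: decoded as ISO-8859-1, the three BOM bytes turn into visible characters in front of the first line.

```python
import codecs


def has_utf8_bom(data):
    """True if the byte string starts with the UTF-8 byte order mark."""
    return data.startswith(codecs.BOM_UTF8)


def strip_utf8_bom(data):
    """Return the bytes without a leading UTF-8 BOM, if there is one."""
    if has_utf8_bom(data):
        return data[len(codecs.BOM_UTF8):]
    return data


# Illustration with made-up content: a BOM followed by a comment line.
raw = codecs.BOM_UTF8 + b"#! comment\nkey=value\n"

# Read as Latin-1, the first line no longer starts with '#'.
first_line_latin1 = raw.decode("iso-8859-1").splitlines()[0]

# Read as UTF-8, the BOM survives as the invisible character U+FEFF.
first_line_utf8 = raw.decode("utf-8").splitlines()[0]
```

In both decodings the first line no longer begins with a comment character, so a parser may treat it as a key.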
|
OK! I changed the encoding properties in Eclipse and then it worked, but
clearly the diff was quite small... Maybe my edit in the Wiki was a bit
quick...
|
|
Ok, then can this be merged? Btw. we use a custom-written class which enables loading properties files encoded in UTF8 instead of the default ISO.... |
|
I added escaping for # and ! as well (a bit annoying if we happen to use the translation string I can also confirm that with the current format of the ru-files it works fine on my Windows 7 laptop. |
|
I suppose it has something to do with the Python script not reading the files in UTF8: |
|
No, nothing to do with Python. I think @matthiasgeiger's comment about "UTF8 without BOM" is the key thing here. (And no, the comment has nothing to do with it, not sure why it is there...) I am quite sure that the Russian files are indeed saved as UTF-8 now as well (based on the small final diff). #994 is a bit more doubtful though... Either way, good editors will handle it transparently, but we should probably wait before merging #994. |
|
|
||
| public String getPropertiesKeyUnescaped() { | ||
| // space, = and : are not allowed in properties file keys | ||
| // space, #, !, = and : are not allowed in properties file keys |
Why aren't they replaced here? I don't get how the comment matches the code.
0068ef5 to
95d49b8
Compare
…e from the English translation files
5c5063e to
d78e814
Compare
d78e814 to
5c11f96
Compare
|
I've found a good way to actually store the files in UTF-8, even those that are now mixed (like the French translation). See the latest commit. Should I go ahead and convert all files? |
|
Maybe @JabRef/translators should state their opinion here. Since popeye works perfectly, I would see no reason for keeping outdated encodings. |
|
For me, files can be converted. |
|
+1 |
|
Please also add a Gradle task to call the new functionality. |
|
It is already done. The (easiest) solution was to both add new keys and
remove obsolete ones in the same command. I also tried to make the test print
similar instructions, but it seems like that doesn't work...
|
|
@oscargus Hm the test instructions are coming from the |
|
Maybe I didn't finish it...
|
|
FYI the problem was that the Just merged this in... |
There's not a way to automatically remove keys, right?