Skip to content

Added check for obsolete keys (no assertions though) and removed thos…#869

Merged
matthiasgeiger merged 8 commits intoJabRef:masterfrom
oscargus:obsoletetranslationkeys
Mar 30, 2016
Merged

Added check for obsolete keys (no assertions though) and removed thos…#869
matthiasgeiger merged 8 commits intoJabRef:masterfrom
oscargus:obsoletetranslationkeys

Conversation

@oscargus
Copy link
Copy Markdown
Contributor

There's not a way to automatically remove keys, right?

@oscargus oscargus added [outdated] type: enhancement component: internationalization i18n status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers labels Feb 25, 2016
@simonharrer
Copy link
Copy Markdown
Contributor

Good idea to clean this up. What you still need to do to make this cleanup complete:

  • change the python script which is called via the localization.gradle so that it not only adds the keys but also deletes unused keys, kind of a sync functionality; maybe we can rename the long method to something short which says syncPropertyFiles
  • add assertions to your two test methods which make the build fail if there are unused keys in any properties file
  • take into account that we not only have english but also other languages

@oscargus
Copy link
Copy Markdown
Contributor Author

oscargus commented Mar 5, 2016

Assertions are added. Not sure if my Python skills are good enough to sort out the removal. I assume that one will need to rewrite the complete translation file when removing the entries. Regarding other languages it could work as when adding: English is manual, python-script to remove from other languages.

@oscargus oscargus force-pushed the obsoletetranslationkeys branch from 31419fe to 8080501 Compare March 5, 2016 14:29
@simonharrer
Copy link
Copy Markdown
Contributor

Nice. To help you in your python skills, I have implemented the script - you can use this code:

$ git diff
diff --git a/scripts/syncLang.py b/scripts/syncLang.py
index 40fb077..02ce513 100644
--- a/scripts/syncLang.py
+++ b/scripts/syncLang.py
@@ -60,6 +60,19 @@ def append_keys_to_file(filename, keys):
     f.close()


+def remove_keys_from_file(filename, keys):
+    lines = open(filename).readlines()
+    lines_to_write = []
+    for line in lines:
+        add = True
+        for key in keys:
+            if(line.startswith(key+"=")):
+                add = False
+        if add:
+            lines_to_write.append(line)
+    open(filename, 'w').writelines(lines_to_write)
+
+
 def compare_property_files_to_main_property_file(main_properties_file, other_properties_files, append_missing_keys_to_other_properties_files):
     keys_in_properties_file = get_keys_from_lines(read_all_lines(main_properties_file))

@@ -86,6 +99,9 @@ def compare_property_files_to_main_property_file(main_properties_file, other_pro
             print "----> Possible obsolete keys (not in English language file):"
             for key in keys_obsolete:
                 print key
+
+            if append_missing_keys_to_other_properties_files:
+                remove_keys_from_file(other_properties_file, keys_obsolete)
             print ""

Have a look at the allFilesMustHaveSameKeys test as well, as it can be extended to ensure that all language files have only the keys in the english file and only those.

@simonharrer
Copy link
Copy Markdown
Contributor

@oscargus do you need any help with this PR?

@oscargus
Copy link
Copy Markdown
Contributor Author

oscargus commented Mar 17, 2016 via email

@oscargus oscargus force-pushed the obsoletetranslationkeys branch 2 times, most recently from b988726 to 6bddafb Compare March 17, 2016 21:09
@oscargus
Copy link
Copy Markdown
Contributor Author

I can't find the problem in the Russian translation file... The Python script doesn't find the problematic line, but apparently the Java parser finds it...

@simonharrer
Copy link
Copy Markdown
Contributor

in http://www.bouncycastle.org/ the : is not escaped.

@simonharrer
Copy link
Copy Markdown
Contributor

in File_rename_failed_for_%0_entries.=\there is an escaped space. Maybe that is the issue?

@oscargus
Copy link
Copy Markdown
Contributor Author

bouncycastle is not in the file anymore and the space I've tried. However, it seems like #, :, and ! (and =) should be escaped, so I'll try those (just found some information...).

@simonharrer
Copy link
Copy Markdown
Contributor

I am not sure if # has to be escaped. I think, we only escape colon, equals and backslashes.

@simonharrer
Copy link
Copy Markdown
Contributor

Tests are OK on my machine locally. Very strange.

@oscargus
Copy link
Copy Markdown
Contributor Author

According to Wikipedia:

# You are reading the ".properties" entry.
! The exclamation mark can also mark text as comments.
# The key and element characters #, !, =, and : are written with
# a preceding backslash to ensure that they are properly loaded.

Doesn't work on my local machine, but now I have at least escaped all characters. Some translations were not correctly escaped (including the Russian). Still no success though...

@oscargus
Copy link
Copy Markdown
Contributor Author

If one removes the ! in the three first comments, the extra string doesn't contain a !. Removing all comments also removes the #. But really out of ideas at the moment...

@oscargus
Copy link
Copy Markdown
Contributor Author

I have read up a bit more. As I understand it:

  • = and : should be escaped if they are in the key
  • and ! should be escaped if they are the first character in the key

Still, these characters can always be escaped.

(Doesn't help though...)

@simonharrer
Copy link
Copy Markdown
Contributor

hm, you could try to debug the test and see why it fails. Or create a small main class which does this only for the russian language. As it works on my machine, I am unable to help here. :-(

@oscargus
Copy link
Copy Markdown
Contributor Author

I've tried this (by print-out-debugging), but since the whole file is loaded through properties.load(is); I cannot even figure out when the extra entry is inserted (and the Map/keySet is not in any order I can figure out).

But I just made some progress! Using a Reader and setting the encoding to "UTF-8" lead to that the obsolete key is #!...

@oscargus
Copy link
Copy Markdown
Contributor Author

"The encoding of a .properties file is ISO-8859-1, also known as Latin-1."... Bad idea to encode it in UTF-8 then...

@oscargus
Copy link
Copy Markdown
Contributor Author

Bah! Almost three hours because someone saved a file in an invalid encoding... Anyway, now I think that it is working and that the translations are slightly easier to maintain.

@koppor
Copy link
Copy Markdown
Member

koppor commented Mar 18, 2016

Does ISO-8859-1 really cover Russian characters?

@oscargus
Copy link
Copy Markdown
Contributor Author

oscargus commented Mar 18, 2016 via email

@matthiasgeiger
Copy link
Copy Markdown
Member

My Notepad++ says that all those files are saved in UTF8 (regardless what the comment says) - but russian was the only one not saved with "UTF8 without BOM".

@oscargus
Copy link
Copy Markdown
Contributor Author

oscargus commented Mar 18, 2016 via email

@simonharrer
Copy link
Copy Markdown
Contributor

Ok, then can this be merged?

Btw. we use a custom written class which enables loading properties files encoded in UTF8 instead of the default ISO....

@oscargus
Copy link
Copy Markdown
Contributor Author

I added escaping for # and ! as well (a bit annoying is we happens to use the translation string "#mon# undefined" and it ends up to be a comment...).

I can also confirm that with the current format of the ru-files it works fine on my Windows 7 laptop.

@Siedlerchr
Copy link
Copy Markdown
Member

I suppose it has sth do to with the Python script not reading the files in UTF8:
http://stackoverflow.com/questions/10971033/backporting-python-3-openencoding-utf-8-to-python-2
And I strongly would advise to let the properties files in UTF8, makes work for the tranlators easiert.

@oscargus
Copy link
Copy Markdown
Contributor Author

No, nothing to do with Python. I think @matthiasgeiger s comment about "UTF8 without BOM" is the key thing here. (And no, the comment has nothing to do with it, not sure why it is there...)

I quite sure that the Russian files are indeed saved as UTF-8 now as well (based on the small final diff). #994 is a bit more doubtful though... Either way, good editors will handle it transparently, but we should probably wait before merging #994.


public String getPropertiesKeyUnescaped() {
// space, = and : are not allowed in properties file keys
// space, #, !, = and : are not allowed in properties file keys
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why are'nt they repleaced here? - I don't get how the comment matches with the code.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@oscargus oscargus force-pushed the obsoletetranslationkeys branch from 0068ef5 to 95d49b8 Compare March 24, 2016 13:50
@oscargus oscargus force-pushed the obsoletetranslationkeys branch 2 times, most recently from 5c5063e to d78e814 Compare March 24, 2016 14:24
@oscargus oscargus force-pushed the obsoletetranslationkeys branch from d78e814 to 5c11f96 Compare March 24, 2016 14:27
@oscargus
Copy link
Copy Markdown
Contributor Author

I've found a good way to actually store the files in UTF-8, even those that are now mixed (like the French translation). See the latest commit. Should I go ahead and convert all files?

@koppor
Copy link
Copy Markdown
Member

koppor commented Mar 24, 2016

Maybe @JabRef/translators should state their opinion here. Since popeye works perfectly, I would see no reason for keeping outdated encodings.

@mlep
Copy link
Copy Markdown
Contributor

mlep commented Mar 25, 2016

For me, files can be converted.

@domwass
Copy link
Copy Markdown

domwass commented Mar 25, 2016

+1

@Siedlerchr
Copy link
Copy Markdown
Member

Please also add a gradle task to call the new functionality

@oscargus
Copy link
Copy Markdown
Contributor Author

oscargus commented Mar 28, 2016 via email

@Siedlerchr
Copy link
Copy Markdown
Member

@oscargus Hm the test instructions are coming from the convertPropertiesFile in LocalizationConsistencyTest
There the Sentence with "Execute...." is printed

@oscargus
Copy link
Copy Markdown
Contributor Author

oscargus commented Mar 28, 2016 via email

@matthiasgeiger matthiasgeiger added this to the v3.3 milestone Mar 30, 2016
@matthiasgeiger matthiasgeiger merged commit 0febc16 into JabRef:master Mar 30, 2016
@matthiasgeiger
Copy link
Copy Markdown
Member

FYI the problem was that the fail(...) terminates the test before the information could be printed.

Just merged this in...

@oscargus oscargus deleted the obsoletetranslationkeys branch March 30, 2016 12:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

component: internationalization i18n [outdated] type: enhancement status: ready-for-review Pull Requests that are ready to be reviewed by the maintainers

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants