Adding Polish wordlist to BIP39 #1037
KarolTrzeszczkowski wants to merge 4 commits into bitcoin:master from
|
Hello, your list is very good. The Levenshtein distance is greater than 1 in every word comparison, and I found no violations of the other rules. The only word I would change is mama: it duplicates mamá from the Spanish list. As most software won't be able to tell mama and mamá apart, I would change this one. Great work! |
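The collision described above can be checked mechanically. The following is a minimal sketch, assuming that "unable to tell apart" means the words become equal once accents are stripped; the helper names and the sample words are illustrative and are not taken from either PR or from any merged wordlist file.

```python
import unicodedata


def strip_accents(word: str) -> str:
    """Decompose to NFKD and drop combining marks, e.g. 'mamá' -> 'mama'.

    Note: letters without a Unicode decomposition (such as Polish 'ł')
    are left unchanged by this approach.
    """
    decomposed = unicodedata.normalize("NFKD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def cross_list_collisions(candidate, existing):
    """Return candidate words that match an existing word once accents are removed."""
    existing_keys = {strip_accents(w) for w in existing}
    return sorted(w for w in candidate if strip_accents(w) in existing_keys)


# Hypothetical samples for illustration only.
polish_sample = ["mama", "lampa", "żaba"]
spanish_sample = ["mamá", "lámpara", "ábaco"]

print(cross_list_collisions(polish_sample, spanish_sample))  # ['mama']
```

Running the same comparison against every merged language list, rather than Spanish alone, would cover the cross-language rule more fully.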
Words like mama conflict with Spanish mamá. This commit removes all such words.
|
Thank you for the nice words and catching the collision! I was able to identify more such word collisions and I removed them: |
|
Being called to the blackboard by seeing my PR referenced I feel obliged to share some of my thoughts. Here is how I see it:
|
|
@luke-jr Could you please take a look at my pull request? |
|
As I understand it there are two competing PRs to add a Polish wordlist currently open. This one and #753. I don't speak Polish and afaik Luke and the BIP 39 authors don't either. Before we ask one of the BIP authors to ACK this (which is needed to merge it) we are going to need Polish speaker(s) who ideally understand BIP 39 to look over this and judge which PR should be merged (if any). This PR looks high quality to me but I am neither a Polish speaker nor a BIP author. |
|
This PR from one of the BIP 39 authors is also potentially relevant: #1047 |
|
Please consider this PR. It looks promising and it will definitely be valuable for the community. The code in this PR is not complicated, so I believe it will not have a bad impact on the project or on its efficiency and security. |
|
Great idea! It will be very valuable for the community! |
|
Looks really good to me. It may have a positive impact on the Polish community, especially the newcomers. |
|
A Polish wordlist will be amazing. It will help every Polish native speaker like me. |
|
NACK from my side. One does not have to spend more than one minute to find words that are considered offensive. I was also struck by the incorrect order of words at the end of the list. The chosen set of words looks strange to me. I am under the impression the list was generated automatically, without really trying to polish it. Not to mention that the proper approach should be to manually select all the words. And what I really cannot understand is the list of comments above - are they just quick comments (like doing a favor?), made without putting in the effort to at least read the list? I am not impressed, it does not look good, hence the NACK. |
|
@p2w34 I explained in the description that I included offensive words because they are loaded with emotion and easy to remember. Seed words are private, so there is no reason to avoid them. If it is required, I will remove them. Could you point me directly to the incorrect order? Thank you. The reason I created this list is that you refused to include feedback from other people and I didn't like your choice of words at all. They are mostly weird and not memorable at all. Judging from your attitude and how proud you are of your work, I expect you'd refuse my feedback just as you refused to include other people's feedback. |
The only reason I am spending my time on various discussions here is that I am worried about the quality of the word lists. And I cannot say that I am having a good time - on the contrary. I may make comments which sound harsh, but I do this only when absolutely necessary. All the comments made in the other PR, with the Polish list that I created, were addressed. As my final comment, I repeat myself - I am of the opinion that BIP0039 should not be continued in its current form. The problem of word lists should be approached separately, in a more holistic manner. This is, however, to be decided by the BIP0039 maintainers. Or one may simply try to write a new proposal. |
|
@p2w34 could you point me to the ordering error you mentioned? |
|
I don't think the author of the competing PR should leave a NACK here and lie about an ordering error that would have been found in the initial algorithmic check performed by @bitmover-studio. It's clear that it's nothing but an ego battle that has nothing to do with the quality of the proposed wordlist. |
There are words starting with
Again, it is not. |
You are right. I am sorry for this accusation. It's super weird that my algorithm left it out and that bitmover's check didn't catch it. I'm sorry once again. I will fix it. |
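An ordering error of this kind is easy to miss by eye but simple to detect in code. The sketch below assumes the intended order is Polish alphabetical order; the thread does not state which collation the reviewers had in mind, and the names `POLISH_ALPHABET`, `polish_key` and `first_out_of_order` are hypothetical. Note that Python's plain `sorted()` compares code points, which places "ą" after "z", so the two orders can disagree.

```python
# Polish alphabet, with q, v and x included so loanword letters still get a rank.
POLISH_ALPHABET = "aąbcćdeęfghijklłmnńoópqrsśtuvwxyzźż"
RANK = {ch: i for i, ch in enumerate(POLISH_ALPHABET)}


def polish_key(word):
    """Map a lowercase word to a list of letter ranks (KeyError on unknown letters)."""
    return [RANK[ch] for ch in word]


def first_out_of_order(words):
    """Return the first adjacent pair that breaks the order, or None if sorted."""
    for prev, cur in zip(words, words[1:]):
        if polish_key(prev) > polish_key(cur):
            return prev, cur
    return None


print(first_out_of_order(["zebra", "źrebak", "żaba"]))  # None
print(first_out_of_order(["żaba", "zebra"]))            # ('żaba', 'zebra')
```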
I checked the latest revision of your wordlist (which should be this one, correct me if I'm wrong) using my tool bip39validator. My log output (https://paste.ubuntu.com/p/Jwc83KJ8ZB/) says all words are <= 8 chars, have no accents, are unique within the first 4 letters, and have a Levenshtein distance of at least 2 from every other word. Those are the default parameters it runs with. Those are three of the four major checks that a BIP39 wordlist should be tested against, but you currently have to make the fourth check by hand: ensuring there are no words in this list that are similar to words in other (merged) languages' lists. I should mention that I am not a Polish speaker either. |
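For readers who want to see what such checks involve, here is a minimal sketch of the length, prefix and distance checks using the thresholds quoted above. It is not bip39validator's implementation or API; `check_wordlist` and its parameters are hypothetical names, the accent check is omitted, and the sample words are illustrative only.

```python
from itertools import combinations


def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def check_wordlist(words, max_len=8, prefix_len=4, min_distance=2):
    """Return a list of human-readable problems; an empty list means all checks pass."""
    problems = [f"too long: {w}" for w in words if len(w) > max_len]

    seen_prefixes = {}
    for w in words:
        prefix = w[:prefix_len]
        if prefix in seen_prefixes:
            problems.append(f"shared prefix: {seen_prefixes[prefix]} / {w}")
        else:
            seen_prefixes[prefix] = w

    # Pairwise comparison: about 2.1 million pairs for a 2048-word list.
    for a, b in combinations(words, 2):
        if levenshtein(a, b) < min_distance:
            problems.append(f"too similar: {a} / {b}")
    return problems


sample = ["kot", "kos", "domek", "domena", "telewizor"]
for problem in check_wordlist(sample):
    print(problem)
# too long: telewizor
# shared prefix: domek / domena
# too similar: kot / kos
```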
Words chosen using the following rules:
Unlike #753, this wordlist is based on a dictionary of popular words and does not include any words used in other languages' mnemonic sets. It also differs by using Polish symbols. Please consider merging it.