Letter number #1298

rocky · 2021-04-22T08:36:13Z

@mmatera Again, I'm not sure what error messages/tests we should have here. Please advise.

Also, handling Alphabets is probably a bigger project that we will need to come back to.

More later...

mmatera · 2021-04-22T10:57:01Z

Does this help?

In[1]:= LetterNumber["ss2!"] 
Out[1]= {19, 19, 0, 0}

In[2]:= LetterNumber[4]

LetterNumber::nas: The argument 4 is not a string.
Out[2]= LetterNumber[4]

In[3]:= LetterNumber[Graphics[{}]]

LetterNumber::nas: The argument -Graphics- is not a string.
Out[3]= LetterNumber[Graphics[{}]]

In[4]:= LetterNumber["dd", "Mediano"]

Alphabet::noalpha: The alphabet Mediano is not known or not available.

LetterNumber::nalph: The alphabet Mediano is not known or not available.
 
Out[4]= Missing[NotAvailable]

mmatera · 2021-04-22T11:03:15Z

mathics/builtin/strings.py

+      <dt>'LetterNumber'[$c$]
+      <dd>returns the position of the character $c$ in the English alphabet.
+
+      <dt>'LetterNumber["string"]'


I would implement Alphabet[] at least for English (Latin?), and handle the second parameter so that anything other than the default raises a message. In the end, the basic implementation of Alphabet is just a dictionary of the form "alphabetname": "abcd...", isn't it?
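As a rough illustration of the dictionary idea above, here is a minimal Python sketch. The names `ALPHABETS` and `letter_number` are hypothetical and are not taken from the PR; raising `KeyError` is only a stand-in for issuing a message such as LetterNumber::nalph.

```python
# Hypothetical table: alphabet name -> ordered lowercase letters.
ALPHABETS = {
    "English": "abcdefghijklmnopqrstuvwxyz",
}


def letter_number(text, alphabet="English"):
    """Return the 1-based position of each character, or 0 if it is not a letter."""
    letters = ALPHABETS.get(alphabet)
    if letters is None:
        # Stand-in for a message like LetterNumber::nalph (assumption).
        raise KeyError(f"The alphabet {alphabet} is not known or not available.")
    # str.find returns -1 for a missing character, so adding 1 yields 0.
    return [letters.find(ch.lower()) + 1 for ch in text]


print(letter_number("ss2!"))  # [19, 19, 0, 0]
```

This sketch always returns a list, even for a single character, which keeps it short but may not match what the real builtin should do.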

You have a clearer idea of what to do at this point than I do. So if you'd just do it, I'd appreciate it.

Otherwise I am happy to have this hang out as a draft for a while.

For larger context, I have been going over https://www.wolfram.com/language/elementary-introduction/2nd-ed/11-strings-and-text.html and more generally https://www.wolfram.com/language/elementary-introduction/2nd-ed/ which has the attribution Copyright 2021 (and nothing more). The 1st Edition, which I have in book form, has a "non-commercial share-alike" copyright and that is similar.

And the bigger context even here is that these give examples that can be used in worksheets. (See the dockerhub image or the sqlite file in mathics-omnibus and please also see Mathics3/mathics-django#32). But in addition this is a much gentler way to guide us in filling out the code in a way where users can see immediate results.

The problem I have with FeynCalc, Rubi, KnotTheory and similar packages is that they are hundreds if not thousands of lines long and sometimes use sophisticated constructs in intertwined ways. I think if we have a more solid base to start out with, things will go easier there.

I tried pulling a 2006 version of KnotTheory and tried building it. In terms of the number of things we need to fix, there are far fewer. I counted maybe 3 or 4 things. Still, without those I am not able to get this to run. And at least for me this is kind of disappointing.

In the end, the basic implementation of Alphabet is just a dictionary of the form "alphabetname": "abcd...", isn't it?

I think there exist Python bindings for the ICU project to create alphabets:

http://site.icu-project.org/

Unfortunately, the alphabets don't seem to be exactly the same as in MMA.

I looked at the ICU project and it looks both awesome and bureaucratic.

Our needs here are extremely simple and basic: for each language, give me an ordered list of the alphabet with a way to convert from one case to the other. And we don't even need bidirectional conversion, so you can choose which case to start out with.

However, although I can see how to do case conversion using Unicode properties (and you'd think then that a library would use that to provide a function equivalent to "lower" or "upper"), there isn't anything that says give me the first letter in the alphabet and iterate to the last letter, as far as I can tell.

But I could be wrong here. The documentation, while extensive, isn't all that useful. Sigh.
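To make the one-directional case conversion described above concrete, here is a small sketch using only Python's standard library. The helper names `uppercase_alphabet` and `is_letter` are hypothetical. It assumes the lowercase alphabet is stored as a string and derives the uppercase form from it; the length check guards against characters whose uppercase form is more than one character, such as "ß", which Python uppercases to "SS".

```python
import unicodedata

# Lowercase Spanish alphabet as it appears in the diff under review.
SPANISH_LOWER = "abcdefghijklmn\u00f1opqrstuvwxyz"


def uppercase_alphabet(lowercase):
    """Derive the uppercase alphabet, refusing mappings that are not one-to-one."""
    upper = lowercase.upper()
    if len(upper) != len(lowercase):
        raise ValueError("case mapping is not one-to-one for this alphabet")
    return upper


def is_letter(ch):
    """True if Unicode classifies ch as a letter (general category L*)."""
    return unicodedata.category(ch).startswith("L")


print(uppercase_alphabet(SPANISH_LOWER))
print(is_letter("\u00f1"), is_letter("1"))
```

Note that the Unicode category says whether a character is a letter, but not which language's alphabet it belongs to or where it sorts, which is why the ordered string still has to come from somewhere else.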

I think that an ICU-based Alphabet is something to implement in an external module. Here, I would limit ourselves to defining a basic set of alphabets (say "English", "Spanish", "German", and "Greek", which is what I could handle). A Pymathics module can then overload this basic implementation.

With ICU, more things like IntegerName and Transliterate can be created:
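One way to picture a small core table that an external module can later overload is a layered lookup. The sketch below uses `collections.ChainMap` from the standard library; `CORE_ALPHABETS`, `EXTENSION_ALPHABETS`, `register_alphabet` and `alphabet` are hypothetical names, and this is not how Pymathics modules actually hook into Mathics.

```python
from collections import ChainMap

# Small built-in table (assumption: only lowercase forms are stored).
CORE_ALPHABETS = {
    "English": "abcdefghijklmnopqrstuvwxyz",
}

# Filled in by an external module; looked up before the core table.
EXTENSION_ALPHABETS = {}

ALPHABETS = ChainMap(EXTENSION_ALPHABETS, CORE_ALPHABETS)


def register_alphabet(name, letters):
    """Add or override an alphabet without touching the core table."""
    EXTENSION_ALPHABETS[name] = letters


def alphabet(name):
    """Return the ordered letters for name, or None if it is unknown."""
    return ALPHABETS.get(name)


# An extension supplies an alphabet the core does not have.
register_alphabet("Greek", "αβγδεζηθικλμνξοπρστυφχψω")
print(alphabet("Greek"))
print(alphabet("English"))
```

Because the extension layer is consulted first, an external module could also replace the core definition of "English" without the core code changing.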

https://github.com/axkr/symja_android_library/blob/master/symja_android_library/doc/functions/IntegerName.md

https://github.com/axkr/symja_android_library/blob/master/symja_android_library/doc/functions/Transliterate.md

Yes, I was thinking along the lines of @mmatera where this would be an external Pymathics library.

There seems to be a lot of functionality somewhere in ICU, which extends beyond just Alphabet.

At some point I will post a query on StackOverflow. But if you look at past queries on this topic they generally are met with derision. I just upvoted https://stackoverflow.com/questions/32375797/what-unicode-ranges-are-considered-letters and it appears I am the only one to have done so just now after 5 1/2 years with no great answers.

Thanks @axkr for the link and suggestion. Do you mind if we port that code to Python when we get around to writing the PyMathics module?

No problem if you want to port it.

mmatera · 2021-04-22T11:51:05Z

In that case, OK, but if you want to merge this, I think it does not hurt. I can improve this later.

rocky · 2021-04-22T12:54:00Z

I'll make a pass at some point to sync with the error messages.

During the week, basically I have only small chunks of time to do things. Whatever can be done in this time, I do, but things that don't fit have to wait.

rocky · 2021-04-23T10:51:55Z

@mmatera I think this is ready for this level of detail. However one code path that we don't test is found in this example:

>> LetterNumber[{"P", "Pe", "P1", "eck"}]
 = {16, 16, 5, 16, 0, 5, 3, 11}

and that's because I don't know if the above is what is expected.

mmatera · 2021-04-23T11:35:43Z

In WMA, the output of that expression is
{16, {16, 5}, {16, 0}, {5, 3, 11}}
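The nested shape reported above can be reproduced with a short recursive sketch. This is only an illustration in plain Python with a hypothetical `letter_number` function and the English alphabet hard-wired; it is not the code from commit 71b755d.

```python
import string


def letter_number(expr):
    """Map a string, or a (possibly nested) list of strings, to letter positions."""
    if isinstance(expr, str):
        # find returns -1 for non-letters, so adding 1 gives 0.
        numbers = [string.ascii_lowercase.find(ch.lower()) + 1 for ch in expr]
        # A single character yields a bare integer, longer strings a list.
        return numbers[0] if len(numbers) == 1 else numbers
    if isinstance(expr, list):
        # Recurse so that each string keeps its own sublist.
        return [letter_number(item) for item in expr]
    raise TypeError(f"The argument {expr!r} is not a string.")


print(letter_number(["P", "Pe", "P1", "eck"]))
# [16, [16, 5], [16, 0], [5, 3, 11]]
```

The earlier flat result corresponds to concatenating the per-string lists instead of keeping one sublist per string.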

rocky · 2021-04-24T13:53:14Z

Should be addressed in 71b755d

support for Alphabets

rocky · 2021-04-24T21:02:23Z

@mmatera I was thinking about this a little more. For a small subset of cases not requiring a LoadModule["pymathicsICU"] I suppose the small set we have here is fine.

But please, let us not extend this in core this way. Instead, let us delegate this out to a Pymathics module which is based on something that purports to handle language support more generally.

rocky · 2021-04-24T21:07:51Z

mathics/builtin/strings.py

+        "Uppercase": "ABCDEFGHIJKLMNOPQRSTUVWXYZ",
+    },
+    "Spanish": {
+        "Lowercase": "abcdefghijklmnñopqrstuvwxyz",


WL Alphabets omit accented characters, such as the accented "e" in Spanish, but keep the tilde on "n"? Similarly for the umlaut in German?

If this is the case this is irregular and weird.

No! This is the "Eñe" and is a letter very important for Spanish speakers! https://en.m.wikipedia.org/wiki/%C3%91
:)
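The distinction in this exchange cannot be read off Unicode alone: under NFD normalization both "ñ" and "é" decompose into a base letter plus a combining mark, so it is the alphabet table that decides "ñ" is a letter in its own right. The sketch below illustrates this with the Spanish string from the diff. Folding other accented characters onto their base letter is just one possible policy chosen for illustration, not a claim about what WL does, and the function names are hypothetical.

```python
import unicodedata

# Spanish lowercase alphabet from the diff; \u00f1 is the precomposed "ñ".
SPANISH = "abcdefghijklmn\u00f1opqrstuvwxyz"


def strip_marks(ch):
    """Remove combining marks, e.g. turn an accented "e" into plain "e"."""
    decomposed = unicodedata.normalize("NFD", ch)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def spanish_letter_number(ch):
    """Position of a single character in SPANISH, or 0 if it is not a letter."""
    ch = unicodedata.normalize("NFC", ch.lower())
    position = SPANISH.find(ch) + 1
    if position == 0:
        # Illustrative policy (assumption): fold accents onto the base letter.
        base = strip_marks(ch)
        if len(base) == 1:
            position = SPANISH.find(base) + 1
    return position


print(len(unicodedata.normalize("NFD", "\u00f1")))  # 2: "n" plus combining tilde
print(spanish_letter_number("\u00f1"))  # 15
print(spanish_letter_number("\u00e9"))  # 5
```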

Letter number

mmatera · 2021-04-24T21:45:58Z

Actually, my initial idea was to use Alphabet as something that allows hooking custom definitions into LetterNumber and other similar builtins. Then, the actual definition of Alphabet could be implemented as a Pymathics module, or as a .m WL module. The problem is how to implement the "lowercase" for generic alphabets.

Basic LetterNumber functionality.

e86ed43

More later...

rocky requested a review from mmatera April 22, 2021 08:36

rocky marked this pull request as draft April 22, 2021 08:36

Expand LetterNumber to handle a list of Characters

e5d8fb0

rocky force-pushed the LetterNumber branch from fe67478 to e5d8fb0 Compare April 22, 2021 08:37

mmatera reviewed Apr 22, 2021

rocky mentioned this pull request Apr 22, 2021

Create Pymathics module using ICU for Alphabets and Transliteration #1303

Closed

Add error cases reported by mmatera

d36d994

rocky marked this pull request as ready for review April 23, 2021 10:49

Update CHANGES.rst

700df37

Handle LetterNumber with a list of strings

71b755d

mmatera and others added 2 commits April 24, 2021 16:37

support for Alphabets

94293a6

Merge pull request #1310 from mathics/LetterNumber-mmatera

8376718

support for Alphabets

rocky merged commit e941cf7 into master Apr 24, 2021

rocky commented Apr 24, 2021

rocky added a commit that referenced this pull request Apr 24, 2021

Merge pull request #1298 from mathics/LetterNumber

21c4fa7

Letter number

rocky deleted the LetterNumber branch June 7, 2021 23:10

4 participants