@alejandro-colomar
Collaborator

@alejandro-colomar alejandro-colomar commented Aug 19, 2025

Suggested-by: @ikerexxe
Suggested-by: @hallyn
Suggested-by: @Karlson2k
Suggested-by: @lslebodn

Here, I've also covered some of the functions that I haven't added yet.

v2
  • Address some issues reported by @ikerexxe .
$ git range-diff master gh/strreadme strreadme 
1:  d6d127b6 ! 1:  cc66977f lib/string/README: Add guidelines for using strings
    @@ lib/string/README (new)
     +   (almost) exclusively.
     +
     +
    -+Forbidden libc functions:
    -+=========================
    ++Don't use some libc functions without Really Good Reasons:
    ++==========================================================
     +
     +<stdio.h>
     +    asprintf(3)
    @@ lib/string/README (new)
     +    strcmp(3)
     +  Use streq() instead.
     +  The return value of strcmp(3) is confusing.
    -+  strcmp(3) would be legitimate as a callback to a search
    -+  function, but we don't need that here.
    ++  strcmp(3) would be legitimate for sorting strings.
     +
     +    strcasecmp(3)
     +  Use strcaseeq() instead.
    @@ lib/string/README (new)
     +    strncat(3)
     +  Use strndup(3) or strndupa(3) instead.
     +  strncat(3) is legitimate for catenating a prefix of a string
    -+  after an existing string, but we don't need that here.
    ++  after an existing string.
     +
     +    strlcpy(3)
     +  Use strtcpy() instead.
    @@ lib/string/README (new)
     +Specific guidelines:
     +====================
     +
    -+ctype/
    ++ctype/ - Character classification functions
    ++
     +    strchrisascii/
     +  These functions return true
     +  if the string has any characters that belong to a category.
    @@ lib/string/README (new)
     +    strtoascii/
     +  These functions translate all characters in a string.
     +
    -+memset/
    ++memset/ - Memory zeroing utilities
    ++
     +    memzero()
     +  Synonym of explicit_bzero(3).
     +    MEMZERO()
    @@ lib/string/README (new)
     +    strzero()
     +  Like memzero(), but takes a string.
     +
    -+strchr/
    ++strchr/ - Character search and counting
    ++
     +    strchrcnt()
     +  Count the number of occurrences of a given character in a string.
     +
    @@ lib/string/README (new)
     +    strnul()
     +  Return a pointer to the terminating null byte.  (s + strlen(s))
     +
    -+strcmp/
    ++strstr/ - String search
    ++
    ++  s/
    ++    stppfx()  // Current name: strprefix()
    ++  Return a pointer to the end of the prefix, or NULL if not found.
    ++
    ++    stpsfx()  // Unimplemented
    ++  Return a pointer to the suffix, or NULL if not found.
    ++
    ++strcmp/ - String comparison
    ++
     +  s/
     +    streq()
     +  Return true if the strings are equal.
    @@ lib/string/README (new)
     +    strpfx()  // Unimplemented
     +  Return true if the string starts with a prefix.
     +
    -+    stppfx()  // Current name: strprefix()
    -+  Like strpfx(), but return a pointer to the end of the prefix.
    -+
     +    strsfx()  // Unimplemented
     +  Return true if the string ends with a suffix.
     +
    -+    stpsfx()  // Unimplemented
    -+  Like strsfx(), but return a pointer to the suffix.
    -+
     +    strcaseeq()
     +  Like streq(), but ignore upper-/lower-case.
     +
    @@ lib/string/README (new)
     +
     +  n/
     +    strneq()  // Unimplemented
    -+  Return true if a nonstring is equal to a string.
    ++  Return true if a [[gnu::nonstring]] is equal to a string.
     +    STRNEQ()  // Unimplemented
     +  Like strneq(), but takes an array.
     +
     +    strnpfx()  // Unimplemented
    -+  Return true if a nonstring starts with a prefix.
    ++  Return true if a [[gnu::nonstring]] starts with a prefix.
     +    STRNPFX()  // Unimplemented
     +  Like strnpfx(), but takes an array.
     +
    -+strdup/
    ++strdup/ - Memory duplication
    ++
     +  s/
     +    strndupa(3)
    -+  Create a new string (in stack storage) from a nonstring.
    ++  Create a new string (in stack) from a [[gnu::nonstring]].
     +    STRNDUPA()
     +  Like strndupa(3), but takes an array.
     +
    @@ lib/string/README (new)
     +    MEMDUP()
     +  Like memdup(), but with type-safety checks.
     +
    -+strcpy/
    ++strcpy/ - String copying
    ++
     +  n/
     +    STRNCPY()
     +  Like strncpy(3), but takes an array.
    @@ lib/string/README (new)
     +    MEMCPY()
     +  Like memcpy(3), but takes two arrays.
     +
    -+sprintf/
    ++sprintf/ - Formatted string creation
    ++
     +    aprintf()
     +  sprintf(3) variant that allocates.  It has better calling
     +  conventions than asprintf(3).
    @@ lib/string/README (new)
     +  Similar to stprintf(), but takes a pointer to the end instead of
     +  a size.  This makes it safer for chaining several calls.
     +
    -+strspn/
    ++strspn/ - String span searching
    ++
     +  Naming conventions:
     +  -  'r': reverse (search from the end).
     +  -  'c': complement (negate the second argument).
    @@ lib/string/README (new)
     +  This is rarely useful.  It was useful for implementing
     +  basename().
     +
    -+strtok/
    ++strsep/ - String separation
    ++
     +    stpsep()
     +  Similar to strsep(3), but swap the input pointer with the return
     +  value.  It writes a null byte at the first delimiter found, and
v3
$ git range-diff master gh/strreadme strreadme 
1:  cc66977f ! 1:  374c8fc5 lib/string/README: Add guidelines for using strings
    @@ lib/string/README (new)
     +Specific guidelines:
     +====================
     +
    -+ctype/ - Character classification functions
    ++ctype/ - Character classification and conversion functions
     +
     +    strchrisascii/
     +  These functions return true
    -+  if the string has any characters that belong to a category.
    ++  if the string has any characters that
    ++  belong to the category specified in the function name.
     +
     +    strisascii/
     +  These functions return true
    -+  if all of the characters of the string belong to a category
    ++  if all of the characters of the string
    ++  belong to the category specified in the function name
     +  and the string is not an empty string.
     +
     +    strtoascii/
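
The difference between the two ctype/ families described in this range-diff ("any character" versus "all characters, and not empty") can be sketched in C as follows. This is only an illustration of the documented semantics for the digit category; the names `str_any_digit_sketch()` and `str_all_digits_sketch()` are hypothetical and are not the project's functions.

```c
#include <ctype.h>
#include <stdbool.h>

/* Hypothetical stand-in for a strchrisascii/-style function:
 * true if the string has ANY character in the category. */
static inline bool
str_any_digit_sketch(const char *s)
{
    for (; *s != '\0'; s++) {
        if (isdigit((unsigned char) *s))
            return true;
    }
    return false;
}

/* Hypothetical stand-in for a strisascii/-style function:
 * true if ALL characters are in the category
 * and the string is not an empty string. */
static inline bool
str_all_digits_sketch(const char *s)
{
    if (*s == '\0')
        return false;

    for (; *s != '\0'; s++) {
        if (!isdigit((unsigned char) *s))
            return false;
    }
    return true;
}
```

With these definitions, "uid1000" satisfies the "any" variant but not the "all" variant, and the empty string satisfies neither.
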
v3b
  • wfix
$ git rd
1:  374c8fc5 ! 1:  5616a0b0 lib/string/README: Add guidelines for using strings
    @@ lib/string/README (new)
     +    strtoascii/
     +  These functions translate all characters in a string.
     +
    -+memset/ - Memory zeroing utilities
    ++memset/ - Memory zeroing
     +
     +    memzero()
     +  Synonym of explicit_bzero(3).
v4
$ git rd 
1:  5616a0b0 ! 1:  cd11ed92 lib/string/README: Add guidelines for using strings
    @@ lib/string/README (new)
     +<stdio.h>
     +    asprintf(3)
     +  Use aprintf() instead.
    -+  It is difficult to handle errors after asprintf(3).  Also, it
    -+  makes it more difficult for static analyzers to check that calls
    -+  to it free(3) memory correctly.
    ++  It is difficult to handle errors after asprintf(3).
    ++  Also, it makes it more difficult for static analyzers to check
    ++  that memory is later free(3)d appropriately.
     +
     +<string.h>
     +    snprintf(3)
    @@ lib/string/README (new)
     +
     +  s/
     +    stppfx()  // Current name: strprefix()
    -+  Return a pointer to the end of the prefix, or NULL if not found.
    ++  Return a pointer to the end of the prefix,
    ++  or NULL if not found.
     +
     +    stpsfx()  // Unimplemented
    -+  Return a pointer to the suffix, or NULL if not found.
    ++  Return a pointer to the beginning of the suffix,
    ++  or NULL if not found.
     +
     +strcmp/ - String comparison
     +
    @@ lib/string/README (new)
     +sprintf/ - Formatted string creation
     +
     +    aprintf()
    -+  sprintf(3) variant that allocates.  It has better calling
    -+  conventions than asprintf(3).
    ++  sprintf(3) variant that allocates.
    ++  It has better interface than asprintf(3).
     +
     +    stprintf()  // Current name: snprintf_()
     +  snprintf(3) wrapper that reports truncation with -1.
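
As a rough sketch of the two sprintf/ descriptions above, the following C code shows one way such wrappers could behave. It is written under assumptions: the `_sketch` names are hypothetical, the signatures are guesses based only on the one-line descriptions, and the allocating variant is assumed to return the new string (or NULL on error), which is what makes error handling simpler than with asprintf(3).

```c
#include <stdarg.h>
#include <stddef.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical stand-in for stprintf(): format into buf[size].
 * Return the string length, or -1 on error or truncation. */
static inline int
stprintf_sketch(char *buf, size_t size, const char *fmt, ...)
{
    int      len;
    va_list  ap;

    va_start(ap, fmt);
    len = vsnprintf(buf, size, fmt, ap);
    va_end(ap);

    if (len < 0 || (size_t) len >= size)
        return -1;  /* Error, or the output did not fit. */
    return len;
}

/* Hypothetical stand-in for aprintf(): allocate and format.
 * Return the new string, or NULL on error.  (Assumed interface.) */
static inline char *
aprintf_sketch(const char *fmt, ...)
{
    int      len;
    char     *p;
    va_list  ap;

    /* First pass: measure the required length. */
    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);
    if (len < 0)
        return NULL;

    p = malloc((size_t) len + 1);
    if (p == NULL)
        return NULL;

    /* Second pass: write into the allocated buffer. */
    va_start(ap, fmt);
    vsnprintf(p, (size_t) len + 1, fmt, ap);
    va_end(ap);

    return p;
}
```

A caller would check `stprintf_sketch(buf, sizeof(buf), ...) == -1` to detect truncation, and would check the pointer returned by `aprintf_sketch()` against NULL before using it and passing it to free(3).
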
v5
$ git rd 
1:  cd11ed92 ! 1:  09c2912f lib/string/README: Add guidelines for using strings
    @@ lib/string/README (new)
     +
     +Specific guidelines:
     +====================
    ++  Under lib/string/ we provide a set of functions to manipulate
    ++  strings, separated in subdirectories by utility type.  In this
    ++  section, we provide a broad overview.
     +
     +ctype/ - Character classification and conversion functions
     +
     +    strchrisascii/
    -+  These functions return true
    ++  The functions defined under this directory
    ++  return true
     +  if the string has any characters that
     +  belong to the category specified in the function name.
     +
     +    strisascii/
    -+  These functions return true
    ++  The functions defined under this directory
    ++  return true
     +  if all of the characters of the string
     +  belong to the category specified in the function name
     +  and the string is not an empty string.
     +
     +    strtoascii/
    -+  These functions translate all characters in a string.
    ++  The functions defined under this directory
    ++  translate all characters in a string.
     +
     +memset/ - Memory zeroing
     +

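To make some of the one-line descriptions in the range-diffs above more concrete, here is a minimal C sketch of streq(), strnul(), strchrcnt(), stppfx() and stpsfx(), written only from those descriptions. The `_sketch` names and the exact signatures are assumptions; the project's real implementations may differ (for example, in const handling).

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* streq(): true if the strings are equal.
 * Avoids the easy-to-misread "strcmp(a, b) == 0" idiom. */
static inline bool
streq_sketch(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* strnul(): pointer to the terminating null byte, s + strlen(s). */
static inline const char *
strnul_sketch(const char *s)
{
    return s + strlen(s);
}

/* strchrcnt(): number of occurrences of c in s. */
static inline size_t
strchrcnt_sketch(const char *s, char c)
{
    size_t  n = 0;

    for (; *s != '\0'; s++) {
        if (*s == c)
            n++;
    }
    return n;
}

/* stppfx(): pointer to the end of the prefix within s,
 * or NULL if s does not start with the prefix. */
static inline const char *
stppfx_sketch(const char *s, const char *prefix)
{
    size_t  len = strlen(prefix);

    if (strncmp(s, prefix, len) != 0)
        return NULL;
    return s + len;
}

/* stpsfx(): pointer to the beginning of the suffix within s,
 * or NULL if s does not end with the suffix. */
static inline const char *
stpsfx_sketch(const char *s, const char *suffix)
{
    size_t  slen = strlen(s);
    size_t  xlen = strlen(suffix);

    if (xlen > slen)
        return NULL;
    if (strcmp(s + slen - xlen, suffix) != 0)
        return NULL;
    return s + slen - xlen;
}
```

For example, `stppfx_sketch("user=alx", "user=")` returns a pointer to "alx", so a caller can test for the prefix and skip past it in one step.
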
@alejandro-colomar alejandro-colomar marked this pull request as ready for review August 19, 2025 21:23
@alejandro-colomar alejandro-colomar force-pushed the strreadme branch 8 times, most recently from f6c228a to d6d127b Compare August 24, 2025 07:38
Collaborator

@ikerexxe ikerexxe left a comment

Thank you for putting this together, it is a great step forward in understanding all existing functions and the roadmap you are proposing.

I added several comments inline, but I have some more general comments.

I miss a table of contents or some explanation for what each category does. My proposal would be to use the following:

- ctype/: Character classification functions
- memset/: Memory zeroing utilities
- strchr/: Character search and counting
- strcmp/: String comparison with various modes
- strdup/: Memory duplication utilities
- strcpy/: Safe string copying variants
- sprintf/: Formatted string creation
- strspn/: String span operations
- strtok/: String tokenization utilities

Furthermore, I find it somewhat difficult to understand the functions that begin with Like *(3), since in most cases I need to first check man *(3) and then return to this document to understand the difference. Perhaps it is just me, or perhaps there are other people who have difficulty retaining this type of information and changing the context. I really do not know.

@alejandro-colomar
Collaborator Author

> Thank you for putting this together, it is a great step forward in understanding all existing functions and the roadmap you are proposing.
>
> I added several comments inline, but I have some more general comments.
>
> I miss a table of contents or some explanation for what each category does. My proposal would be to use the following:
>
> - ctype/: Character classification functions
> - memset/: Memory zeroing utilities
> - strchr/: Character search and counting
> - strcmp/: String comparison with various modes
> - strdup/: Memory duplication utilities
> - strcpy/: Safe string copying variants
> - sprintf/: Formatted string creation
> - strspn/: String span operations
> - strtok/: String tokenization utilities

Those titles sound good (I'll amend some a little bit).

But I'd prefer to not write a table of contents explicitly,
as it will likely get out of sync with the contents below.
However, you can create the table of contents with grep(1):

$ cat lib/string/README | grep '^\S'
General guidelines:
===================
-  If there's an upper-case macro that wraps a function, use the macro
-  x*() functions wrap a function of the same name without the 'x'.
-  strn*() functions are forbidden for use with strings.  Consider strn
Forbidden libc functions:
=========================
<stdio.h>
<string.h>
Specific guidelines:
====================
ctype/
memset/
strchr/
strcmp/
strdup/
strcpy/
sprintf/
strspn/
strtok/
strftime.h

Maybe I could add the information after the directory name.

> Furthermore, I find it somewhat difficult to understand the functions that begin with Like *(3), since in most cases I need to first check man *(3) and then return to this document to understand the difference. Perhaps it is just me, or perhaps there are other people who have difficulty retaining this type of information and changing the context. I really do not know.

With that, I essentially mean that those are libc functions. You don't need to read the manual page, as you'll usually remember what the libc functions do, I expect.

@alejandro-colomar alejandro-colomar force-pushed the strreadme branch 2 times, most recently from cc66977 to 374c8fc Compare September 3, 2025 09:21
@alejandro-colomar
Collaborator Author

I've significantly reworded the document. Please re-check.

Here's the "summary" that you can get now with grep(1):

$ cat lib/string/README | grep '^\S'
General guidelines:
===================
-  If there's an upper-case macro that wraps a function, use the macro
-  x*() functions wrap a function of the same name without the 'x'.
-  strn*() functions are forbidden for use with strings.  Consider strn
Don't use some libc functions without Really Good Reasons:
==========================================================
<stdio.h>
<string.h>
Specific guidelines:
====================
ctype/ - Character classification and conversion functions
memset/ - Memory zeroing utilities
strchr/ - Character search and counting
strstr/ - String search
strcmp/ - String comparison
strdup/ - Memory duplication
strcpy/ - String copying
sprintf/ - Formatted string creation
strspn/ - String span searching
strsep/ - String separation
strftime.h

@ikerexxe
Collaborator

ikerexxe commented Sep 4, 2025

> Maybe I could add the information after the directory name.

That sounds good! I believe this information will be useful to other developers.

> With that, I essentially mean that those are libc functions. You don't need to read the manual page, as you'll usually remember what the libc functions do, I expect.

For some I do, for others I wish.

@ikerexxe
Collaborator

ikerexxe commented Sep 8, 2025

We have corrected a few errors, and I believe this has improved the comprehensibility of the documentation.

Since @hallyn and I requested this documentation, I would like to wait for an initial review from him before continuing with my own.

@alejandro-colomar
Collaborator Author

> We have corrected a few errors, and I believe this has improved the comprehensibility of the documentation.
>
> Since @hallyn and I requested this documentation, I would like to wait for an initial review from him before continuing with my own.

Feel free to continue with your review while we wait for @hallyn .

@hallyn
Member

hallyn commented Sep 15, 2025 via email

@alejandro-colomar
Collaborator Author

gnulib has adopted some of these APIs:

  • streq(), memeq()

We're discussing the adoption of some more.
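
For readers who have not seen memeq(), here is a hedged C sketch of what a function with that name conventionally does: the memory counterpart of streq(), returning true when two buffers hold the same bytes. The name `memeq_sketch()` and its signature are assumptions for illustration; this is neither gnulib's nor this project's implementation.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for memeq(): true if the first n bytes
 * of a and b are equal.  Wraps memcmp(3), which returns 0 on equality. */
static inline bool
memeq_sketch(const void *a, const void *b, size_t n)
{
    return memcmp(a, b, n) == 0;
}
```

As with streq(), the point is that the call site reads as a yes/no question instead of a comparison against zero.
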

@hallyn
Member

hallyn commented Sep 21, 2025

Thanks, @alejandro-colomar , this looks great.

Suggested-by: Iker Pedrosa <ipedrosa@redhat.com>
Suggested-by: Serge Hallyn <serge@hallyn.com>
Suggested-by: Evgeny Grin (Karlson2k) <k2k@drgrin.dev>
Suggested-by: Lukas Slebodnik <lslebodn@fedoraproject.org>
Signed-off-by: Alejandro Colomar <alx@kernel.org>
@alejandro-colomar
Collaborator Author

@hallyn Thanks!

@ikerexxe , I'll wait for the rest of your review.

Collaborator

@ikerexxe ikerexxe left a comment

I think we did a good job with the initial revisions because I can see that everything is fine in this latest one. Thank you for your excellent work!

@ikerexxe ikerexxe merged commit 0c9c46a into shadow-maint:master Sep 22, 2025
11 checks passed
@alejandro-colomar alejandro-colomar deleted the strreadme branch September 26, 2025 13:37
@alejandro-colomar alejandro-colomar self-assigned this Dec 27, 2025