Make parsing of gecos field more robust and strict #1307

alejandro-colomar · 2025-07-18T22:12:23Z

I finally understood what this function was doing. It was over-engineered, for supporting gecos fields that nobody should be writing in the first place. It was also bogus: if the gecos field was missing some subfield, it left the buffers with garbage. Even more buggy: if any of the first 4 CSV fields contained any '=' in them, those would be appended to the slop, regardless of oflg, which is inconsistent with the last lines, which parse the remaining slop depending on oflg. This code was very broken.

Revisions:

v2

wfix
Fix more bugs, and improve readability.

$ git range-diff shadow/master gh/gecos gecos 
1:  91431a96 ! 1:  8cfc1c19 src/chfn.c: Do not allow the 'slop' fields to appear before any non-slop gecos fields
    @@ Commit message
         2)  Building and room number or contact person
         3)  Office telephone number
         4)  Home telephone number
    -    5)  Any other contact information (pager number, fax, external e-mail address, etc.)
    +    5+) Any other contact information (pager number, fax, external e-mail address, etc.)
     
         But our code supported the "other contact information", which we call
         slop, and which is composed of an arbitrary number of key=value fields,
    @@ Commit message
     
         After this patch, the GECOS field is treated as a CSV, blindly copying
         the fields as they appear, where the first 4 fields are as specified
    -    above, and anything after them is the slop (5th field, any other contact
    +    above, and anything after them is the slop (5+ fields, any other contact
         information).
     
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
2:  d523246f = 2:  286739a2 src/chfn.c: Use strsep(3) and strcpy(3) instead of its pattern
3:  d624fbf0 = 3:  aa6755fe src/chfn.c: Write an empty string if there's nothing in the GECOS field
-:  -------- > 4:  f4b1950b src/chfn.c: slop: Reduce buffer size
-:  -------- > 5:  81e28291 src/chfn.c: Simplify checking for a long GECOS field
-:  -------- > 6:  4ac43002 src/chfn.c: Use stpeprintf() to improve readability

v2b

Shorten comment.

$ git range-diff shadow/master gh/gecos gecos 
1:  8cfc1c19 = 1:  8cfc1c19 src/chfn.c: Do not allow the 'slop' fields to appear before any non-slop gecos fields
2:  286739a2 = 2:  286739a2 src/chfn.c: Use strsep(3) and strcpy(3) instead of its pattern
3:  aa6755fe = 3:  aa6755fe src/chfn.c: Write an empty string if there's nothing in the GECOS field
4:  f4b1950b = 4:  f4b1950b src/chfn.c: slop: Reduce buffer size
5:  81e28291 = 5:  81e28291 src/chfn.c: Simplify checking for a long GECOS field
6:  4ac43002 ! 6:  f01f131b src/chfn.c: Use stpeprintf() to improve readability
    @@ Commit message
         This allows us to split the formation of the string into several
         s*printf() calls.
     
    +    Shorten comment, to make it fit in one line.
    +
         Signed-off-by: Alejandro Colomar <alx@kernel.org>
     
      ## src/chfn.c ##
    @@ src/chfn.c: static void check_fields (void)
      
        sanitize_env ();
     @@ src/chfn.c: int main (int argc, char **argv)
    -    * Build the new GECOS field by plastering all the pieces together,
    -    * if they will fit ...
         */
    +   check_fields ();
    + 
    +-  /*
    +-   * Build the new GECOS field by plastering all the pieces together,
    +-   * if they will fit ...
    +-   */
     -  ret = SNPRINTF(new_gecos, "%s,%s,%s,%s%s%s",
     -                 fullnm, roomno, workph, homeph,
     -                 (!streq(slop, "")) ? "," : "", slop);
     -  if (ret == -1) {
    ++  /* Build the new GECOS field by plastering all the pieces together.  */
     +  p = new_gecos;
     +  e = new_gecos + countof(new_gecos);
     +  p = stpeprintf(p, e, "%s", fullnm);

v2c

Add missing include.

$ git range-diff shadow/master gh/gecos gecos 
1:  8cfc1c19 = 1:  8cfc1c19 src/chfn.c: Do not allow the 'slop' fields to appear before any non-slop gecos fields
2:  286739a2 = 2:  286739a2 src/chfn.c: Use strsep(3) and strcpy(3) instead of its pattern
3:  aa6755fe = 3:  aa6755fe src/chfn.c: Write an empty string if there's nothing in the GECOS field
4:  f4b1950b = 4:  f4b1950b src/chfn.c: slop: Reduce buffer size
5:  81e28291 = 5:  81e28291 src/chfn.c: Simplify checking for a long GECOS field
6:  f01f131b ! 6:  bbdcc006 src/chfn.c: Use stpeprintf() to improve readability
    @@ Commit message
     
      ## src/chfn.c ##
     @@
    + #include "pwauth.h"
      #include "pwio.h"
      #include "shadowlog.h"
    ++#include "sizeof.h"
      #include "sssd.h"
     -#include "string/sprintf/snprintf.h"
     +#include "string/sprintf/stpeprintf.h"

…lop gecos fields According to the Wikipedia page for the 'Gecos field', the "typical" format for the GECOS field is a comma-delimited list with this order: 1) User's full name (or application name, if the account is for a program) 2) Building and room number or contact person 3) Office telephone number 4) Home telephone number 5+) Any other contact information (pager number, fax, external e-mail address, etc.) But our code supported the "other contact information", which we call slop, and which is composed of an arbitrary number of key=value fields, to appear before any of the other 4 fields. This seems to be undocumented, and none of the documentation I've found for the GECOS field in any systems I checked claims to support this. By removing support for those, we can significantly simplify the copy_field() function, which was quite unreadable. After this patch, the GECOS field is treated as a CSV, blindly copying the fields as they appear, where the first 4 fields are as specified above, and anything after them is the slop (5+ fields, any other contact information). Signed-off-by: Alejandro Colomar <alx@kernel.org>

This wrapper was very weird, and it's simpler to open-code the calls to strsep(3) and strcpy(3) instead. Signed-off-by: Alejandro Colomar <alx@kernel.org>

Otherwise, the buffer would contain garbage. Signed-off-by: Alejandro Colomar <alx@kernel.org>

We never use more than BUFSIZ. (And we could use way less than that.) Signed-off-by: Alejandro Colomar <alx@kernel.org>

Use a buffer of the exact size we want, and let SNPRINTF() decide if it fits or not. BTW, the old check seemed to be wrong: it wasn't accounting for the commas in the 80-character limit, but that didn't make much sense. Signed-off-by: Alejandro Colomar <alx@kernel.org>

This allows us to split the formation of the string into several s*printf() calls. Shorten comment, to make it fit in one line. Signed-off-by: Alejandro Colomar <alx@kernel.org>

hallyn · 2025-08-09T22:59:55Z

Hm, interesting. Yes, based on the code, I just assumed there were programs that wrote 'other' fields anywhere in the string. But I can't find anything that does. More interesting - the util-linux version of chfn does not allow changing the 'extra' field (as it calls it) at all!

So I do think this is safe.

alejandro-colomar changed the title ~~Tighten parsing of gecos field~~ Make parsing of gecos field more robust Jul 18, 2025

alejandro-colomar changed the title ~~Make parsing of gecos field more robust~~ Make parsing of gecos field more robust and strict Jul 18, 2025

alejandro-colomar force-pushed the gecos branch from 707364c to e0d8ff8 Compare July 18, 2025 22:13

alejandro-colomar requested review from hallyn and ikerexxe July 18, 2025 22:14

alejandro-colomar force-pushed the gecos branch from 1f1e7cd to e956fa7 Compare July 18, 2025 22:21

alejandro-colomar marked this pull request as ready for review July 19, 2025 01:15

alejandro-colomar force-pushed the gecos branch 2 times, most recently from cf7ed0a to d624fbf Compare July 19, 2025 01:20

alejandro-colomar added the bug label Jul 19, 2025

alejandro-colomar added 5 commits July 19, 2025 10:31

src/chfn.c: Use strsep(3) and strcpy(3) instead of its pattern

286739a

This wrapper was very weird, and it's simpler to open-code the calls to strsep(3) and strcpy(3) instead. Signed-off-by: Alejandro Colomar <alx@kernel.org>

src/chfn.c: Write an empty string if there's nothing in the GECOS field

aa6755f

Otherwise, the buffer would contain garbage. Signed-off-by: Alejandro Colomar <alx@kernel.org>

src/chfn.c: slop: Reduce buffer size

f4b1950

We never use more than BUFSIZ. (And we could use way less than that.) Signed-off-by: Alejandro Colomar <alx@kernel.org>

alejandro-colomar force-pushed the gecos branch 2 times, most recently from 4ac4300 to f01f131 Compare July 19, 2025 09:11

src/chfn.c: Use stpeprintf() to improve readability

bbdcc00

This allows us to split the formation of the string into several s*printf() calls. Shorten comment, to make it fit in one line. Signed-off-by: Alejandro Colomar <alx@kernel.org>

alejandro-colomar force-pushed the gecos branch from f01f131 to bbdcc00 Compare July 19, 2025 09:45

alejandro-colomar mentioned this pull request Jul 20, 2025

CodeQL fixes #1281

Closed

hallyn approved these changes Aug 9, 2025

View reviewed changes

hallyn merged commit bff2961 into shadow-maint:master Aug 9, 2025
10 checks passed

alejandro-colomar deleted the gecos branch August 10, 2025 08:11

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Make parsing of gecos field more robust and strict #1307

Make parsing of gecos field more robust and strict #1307

Uh oh!

alejandro-colomar commented Jul 18, 2025 •

edited

Loading

Uh oh!

hallyn commented Aug 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Make parsing of gecos field more robust and strict #1307

Make parsing of gecos field more robust and strict #1307

Uh oh!

Conversation

alejandro-colomar commented Jul 18, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

hallyn commented Aug 9, 2025

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

alejandro-colomar commented Jul 18, 2025 •

edited

Loading