metaskinn/get_next_line

This project has been created as part of the 42 curriculum by metaskin.

get_next_line

Description

get_next_line is a C function that reads one line at a time from a given file descriptor (fd). It uses a static variable to preserve leftover data between calls.

Key behavior:

  • Each call to get_next_line(fd) returns the next line, including \n if present.
  • The last line of a file is returned without \n if the file doesn't end with one.
  • Returns NULL when there is nothing left to read or on error.

File Structure

Mandatory

| File | Purpose |
|---|---|
| get_next_line.h | Header file: includes, macros, prototypes. Explains header guards and BUFFER_SIZE. |
| get_next_line.c | Core logic: reading loop, line extraction, stash management. Contains the main algorithm walkthrough. |
| get_next_line_utils.c | Helper functions: strlen, strchr, strdup, substr, strjoin with gnl_ prefix. |

Bonus

| File | Purpose |
|---|---|
| get_next_line_bonus.h | Same as mandatory header + OPEN_MAX macro for multi-fd support. |
| get_next_line_bonus.c | Same logic, but stash is now an array (stash[OPEN_MAX]): each fd gets its own stash. |
| get_next_line_utils_bonus.c | Identical to mandatory utils (only the header include changes). |

Instructions (Build)

Mandatory:

cc -Wall -Wextra -Werror -D BUFFER_SIZE=42 get_next_line.c get_next_line_utils.c

Bonus:

cc -Wall -Wextra -Werror -D BUFFER_SIZE=42 get_next_line_bonus.c get_next_line_utils_bonus.c

BUFFER_SIZE can be changed at compile time with the -D flag. If the flag is omitted, it defaults to 42 (defined in the header).
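A minimal sketch of how a header can provide that default. The guard name GET_NEXT_LINE_H comes from the table in the bonus section; the exact includes and the prototype are assumptions, not a copy of the real header.

```c
#ifndef GET_NEXT_LINE_H
# define GET_NEXT_LINE_H

# include <stdlib.h>
# include <unistd.h>

/* Used only when the compiler was NOT given -D BUFFER_SIZE=<n>. */
# ifndef BUFFER_SIZE
#  define BUFFER_SIZE 42
# endif

/* Assumed prototype: returns a heap-allocated line, or NULL. */
char	*get_next_line(int fd);

#endif
```

Because the inner #ifndef only fires when the macro is still undefined, a value passed with -D always wins over the default.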

Algorithm Overview

[FILE] --read()--> [buffer] --strjoin--> [stash]
                                           |
                                    '\n' found?
                                     /        \
                                   YES         NO
                                    |           |
                              Stop reading   Keep reading
                                    |
                         Split stash in two:
                         +--------+-----------+
                         |  LINE  | REMAINDER |
                         +--------+-----------+
                         "up to \n" "after \n"
                              |          |
                         return line  stash = remainder
                        (to caller)  (saved for next call)

Step-by-Step

  1. get_next_line(fd) — Entry point. Validates fd and BUFFER_SIZE, then delegates to the internal functions.
  2. gnl_reading(fd, stash) — Reads from the file descriptor in a loop, appending each chunk to stash via gnl_strjoin, until a \n is found or EOF is reached.
  3. gnl_subline(stash) — Extracts everything from the start of stash up to and including \n (or end of string). This is the line returned to the caller.
  4. update_stash(stash) — Removes the extracted line from stash and keeps the remainder for the next call. Frees the old stash.
  5. free_all(buf, stash) — Utility to safely free both pointers on error paths. Returns NULL for clean one-liner returns.
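The steps above can be sketched as follows. This is an illustration under assumptions, not the repository's code: the helper signatures are guessed from the step names, libc string functions replace the gnl_ utilities to keep it short, and append is a hypothetical stand-in for gnl_strjoin.

```c
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#ifndef BUFFER_SIZE
# define BUFFER_SIZE 42
#endif

/* Step 5: free both pointers and return NULL (free(NULL) is a no-op). */
static char *free_all(char *buf, char *stash)
{
    free(buf);
    free(stash);
    return (NULL);
}

/* Stand-in for gnl_strjoin: returns stash + buf and frees the old stash. */
static char *append(char *stash, const char *buf)
{
    size_t  old = stash ? strlen(stash) : 0;
    size_t  add = strlen(buf);
    char    *joined = malloc(old + add + 1);

    if (joined != NULL)
    {
        if (stash != NULL)
            memcpy(joined, stash, old);
        memcpy(joined + old, buf, add + 1);
    }
    free(stash);
    return (joined);
}

/* Step 2: read until stash contains '\n' or read() returns 0 (EOF). */
static char *gnl_reading(int fd, char *stash)
{
    char    *buf = malloc((size_t)BUFFER_SIZE + 1);
    ssize_t bytes = 1;

    if (buf == NULL)
        return (free_all(NULL, stash));
    while (bytes > 0 && (stash == NULL || strchr(stash, '\n') == NULL))
    {
        bytes = read(fd, buf, BUFFER_SIZE);
        if (bytes < 0)
            return (free_all(buf, stash));
        buf[bytes] = '\0';
        stash = append(stash, buf);
        if (stash == NULL)
            return (free_all(buf, NULL));
    }
    free(buf);
    return (stash);
}

/* Step 3: copy everything up to and including '\n' (or to the end). */
static char *gnl_subline(const char *stash)
{
    const char  *nl = strchr(stash, '\n');
    size_t      len = nl ? (size_t)(nl - stash) + 1 : strlen(stash);
    char        *line = malloc(len + 1);

    if (line == NULL)
        return (NULL);
    memcpy(line, stash, len);
    line[len] = '\0';
    return (line);
}

/* Step 4: keep what follows '\n' (NULL if nothing is left), free the old stash. */
static char *update_stash(char *stash)
{
    char    *nl = strchr(stash, '\n');
    char    *rest = NULL;

    if (nl != NULL && nl[1] != '\0')
        rest = append(NULL, nl + 1);
    free(stash);
    return (rest);
}

/* Step 1: validate, fill the stash, hand out one line, keep the remainder. */
char *get_next_line(int fd)
{
    static char *stash;
    char        *line;

    if (fd < 0 || BUFFER_SIZE <= 0)
        return (NULL);
    stash = gnl_reading(fd, stash);
    if (stash == NULL || stash[0] == '\0')
    {
        stash = free_all(NULL, stash);
        return (NULL);
    }
    line = gnl_subline(stash);
    if (line == NULL)
        stash = free_all(NULL, stash);
    else
        stash = update_stash(stash);
    return (line);
}
```

A caller would typically loop until NULL and free every returned line, since each one is a separate heap allocation.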

Why a Static Variable?

read() reads up to BUFFER_SIZE bytes at a time, so a chunk rarely ends exactly on a line boundary. Any bytes that follow a \n belong to the next line. A static char *stash persists across function calls and holds this leftover data, so the next call can pick up exactly where the last one left off.
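To see the persistence in isolation, here is a tiny sketch unrelated to file reading. The function previous_arg is hypothetical and exists only for this illustration: it returns whatever was passed on the previous call, which only works because the static pointer keeps its value after the function returns.

```c
#include <stddef.h>

/*
 * Hypothetical demo: returns the argument passed on the PREVIOUS call.
 * `saved` starts as NULL (static storage is zero-initialised) and keeps
 * its value between calls, just like the stash does.
 */
const char *previous_arg(const char *current)
{
    static const char   *saved;
    const char          *old;

    old = saved;
    saved = current;
    return (old);
}
```

With an ordinary local variable instead of a static one, the value would be lost as soon as the function returned.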

What Would Go Wrong Without the Safety Checks?

The comments in the source code explain each check in detail. Here's a summary:

| Check | What happens if removed |
|---|---|
| if (s == NULL) in gnl_strlen | Segfault: dereferencing a NULL pointer |
| if (fd < 0) in get_next_line | No early exit: read() is called with an invalid fd and fails with -1 (errno set to EBADF) |
| if (dup == NULL) after malloc | Segfault: writing through a NULL pointer when malloc fails |
| buffer[bytes] = '\0' after read | Out-of-bounds read: read() does NOT null-terminate, so string functions would run past the data into garbage |
| free(s1) in gnl_strjoin | Memory leak: the old stash becomes unreachable on every loop iteration |
| stash = NULL after free(stash) | Use-after-free: the static pointer would still reference freed memory on the next call |
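As an illustration of the read-related checks, the sketch below wraps read() so that the buffer is always a valid C string afterwards. The name read_chunk and its signature are assumptions made for this example; the real code may perform these checks inline.

```c
#include <unistd.h>

/*
 * Hypothetical helper: read up to `size` bytes into `buf`, which must hold
 * at least size + 1 bytes, then terminate the data so it is a C string.
 * Returns the byte count, 0 at EOF, or -1 on error.
 */
ssize_t read_chunk(int fd, char *buf, size_t size)
{
    ssize_t bytes;

    if (fd < 0 || buf == NULL)
        return (-1);
    bytes = read(fd, buf, size);
    if (bytes < 0)
        return (-1);    /* read() failed: buf contents are unspecified */
    buf[bytes] = '\0';  /* read() never writes a terminator itself */
    return (bytes);
}
```

The extra byte in the buffer is what makes buf[bytes] a legal write even when read() fills all `size` bytes.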

Helper Functions (get_next_line_utils.c)

| Function | Role | Why it exists |
|---|---|---|
| gnl_strlen | Returns string length | Used everywhere for malloc size calculation |
| gnl_strchr | Searches for a character | Checks if \n exists in stash (loop condition) |
| gnl_strdup | Duplicates a string on the heap | Initializes stash as "" safely (can't free a string literal) |
| gnl_substr | Extracts a substring | Cuts the line from stash, and cuts the remainder |
| gnl_strjoin | Joins two strings, frees s1 | Appends read buffer to stash; frees old stash to prevent leaks |

Note: gnl_strjoin differs from libft's ft_strjoin: it frees s1, because stash must be replaced on every iteration without leaking the old value.
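A possible shape for the two helpers that matter most here, written as a sketch. The signatures and the choice to treat a NULL s1 as an empty string are assumptions; only the NULL check in gnl_strlen and the free(s1) in gnl_strjoin are taken from the tables above.

```c
#include <stdlib.h>

/* NULL-safe length: a NULL pointer counts as an empty string. */
size_t gnl_strlen(const char *s)
{
    size_t  len;

    len = 0;
    if (s == NULL)
        return (0);
    while (s[len] != '\0')
        len++;
    return (len);
}

/* Returns a new heap string s1 + s2 and frees s1 (the old stash). */
char *gnl_strjoin(char *s1, const char *s2)
{
    size_t  i;
    size_t  j;
    char    *joined;

    joined = malloc(gnl_strlen(s1) + gnl_strlen(s2) + 1);
    if (joined == NULL)
    {
        free(s1);
        return (NULL);
    }
    i = 0;
    while (s1 != NULL && s1[i] != '\0')
    {
        joined[i] = s1[i];
        i++;
    }
    j = 0;
    while (s2 != NULL && s2[j] != '\0')
        joined[i++] = s2[j++];
    joined[i] = '\0';
    free(s1);
    return (joined);
}
```

Because s1 is freed inside the function, the caller can write stash = gnl_strjoin(stash, buffer) in a loop without keeping a temporary pointer. The flip side is that s1 must be heap-allocated (or NULL), never a string literal.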

Walkthrough Example

File content: "AB\nCDEF\n"    BUFFER_SIZE: 3

| Call | Stash before | Read chunks | Stash after read | Line returned | Stash after |
|---|---|---|---|---|---|
| 1st | NULL → "" | "AB\n" | "AB\n" | "AB\n" | NULL |
| 2nd | NULL → "" | "CDE" then "F\n" | "CDEF\n" | "CDEF\n" | NULL |
| 3rd | NULL → "" | 0 bytes (EOF) | "" | NULL | NULL |

Bonus Part — Multiple File Descriptor Support

The bonus version adds the ability to manage multiple file descriptors simultaneously without mixing up their reading states.

What changes?

Only four things change from mandatory to bonus:

| Aspect | Mandatory | Bonus |
|---|---|---|
| Header guard | GET_NEXT_LINE_H | GET_NEXT_LINE_BONUS_H |
| Stash declaration | static char *stash | static char *stash[OPEN_MAX] |
| Stash access | stash | stash[fd] |
| Extra check | (none) | fd >= OPEN_MAX |

All helper functions (gnl_reading, gnl_subline, update_stash, free_all, and all utils) remain completely unchanged.

Why is this needed?

In the mandatory version, there is only one stash. If you alternate between two file descriptors:

get_next_line(fd1);  // stash now holds fd1's leftover
get_next_line(fd2);  // stash is shared: fd1's leftover is returned or mixed with fd2's data
get_next_line(fd1);  // WRONG result: fd1's leftover was already consumed by the fd2 call

In the bonus version, each fd has its own slot in the array:

get_next_line(fd1);  // stash[fd1] holds fd1's leftover
get_next_line(fd2);  // stash[fd2] holds fd2's leftover — fd1 is untouched
get_next_line(fd1);  // stash[fd1] still intact — correct result

Why OPEN_MAX?

OPEN_MAX defines the maximum number of file descriptors a process can have open at once. The stash array uses fd as a direct index, so the array needs to be large enough to cover all valid fd values. Accessing stash[fd] where fd >= OPEN_MAX would be an out-of-bounds array access — which is why the bonus adds an fd >= OPEN_MAX guard.
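The bounds check and the per-fd slot can be sketched together as below. The helper stash_slot is hypothetical and not part of the project; the fallback value 1024 is also an assumption, used only when the system headers do not define OPEN_MAX.

```c
#include <limits.h>
#include <stddef.h>

#ifndef OPEN_MAX
# define OPEN_MAX 1024  /* assumed fallback; the real limit is system-specific */
#endif

/* Hypothetical helper: the stash slot for fd, or NULL if fd is out of range. */
char **stash_slot(int fd)
{
    static char *stash[OPEN_MAX];

    if (fd < 0 || fd >= OPEN_MAX)
        return (NULL);
    return (&stash[fd]);
}
```

Each valid fd maps to its own pointer in the static array, so leftovers from one descriptor never touch another's, and both a negative fd and one at or above OPEN_MAX are rejected before the array is indexed.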

Resources

Project & Guides

Videos

Official Documentation (C / POSIX / Linux)

AI Usage

I only used an AI tool to help me write this README and provide a clear, general overview of the get_next_line project. It helped me improve the grammar of my explanation, as well as the formatting and structure of the page. I also used it to brainstorm additional edge cases to test (e.g., very small/large BUFFER_SIZE, files without a trailing newline, empty lines, and read() error scenarios).
