This repository was archived by the owner on May 11, 2022. It is now read-only.

<style>.backtick, .tilde {overflow-x: auto;} .longTOC {overflow-x: hidden;}</style>

# String library

C string utils library (STB style, header-only).

Features:

- String expression evaluator (`streval`)
- String formatters (to heap buffers) (`strcpyf*`, `strcatf*`)
- String formatters (to temporary buffers) (`strf*`)
- String fuzzy search (`strscore`, `strfuzzy`)
- String regular expression (`strregex [c?^$*]`)
- String 64-bit hashing (both compile-time and runtime) (`strhash`)
- String interning (quarks) (`strput`, `strget`)
- String matching (`strsub`, `strfindl`, `strfindr`, `strbegin`, `strend`, `strmatch`, `streq`, `streqi`)
- String splitting (with and without allocations) (`strsplit`, `strchop`, `strjoin`)
- String options parsing (`stropt`, `stropti`, `stroptf`)
- String trim utils (`strdel`, `strtrimws`, `strtrimblf/bff`, `strtrimrlfe/ffe`)
- String transform utils (`strrepl`, `strremap`, `strlower`, `strupper`, `strrev`)
- String normalization utils (`strnorm`)
- String conversion utils (`strint`, `strhuman`, `strrobot`)
- String unicode utils (`strutf8`, `strutf32`, `strwiden`, `strshorten`)
- [Documentation](https://rawgit.com/r-lyeh/stdstring.h/master/stdstring.h.html).

Homepage

Changelog

  • 2018.1 (v1.0.4): Fix API() macro; Add new transform utils; Rename a few functions.
  • 2018.1 (v1.0.3): Add strnorm(), strjoin(). Fix strcpyf().
  • 2018.1 (v1.0.2): Add stropt*() options parser.
  • 2018.1 (v1.0.1): Fix wrong version of strcatf() in first commit. Cosmetics.
  • 2018.1 (v1.0.0): Initial release.

Credits

  • Using Rob Pike's regular expression (apparently public domain).
  • Using Sam Hocevar's preprocessor trick (apparently public domain).
  • Using Bob Stout's transform utils (public domain).
  • Using Sean Barrett and Jeff Roberts' string formatters (unlicensed).
  • Using Werner Stoop's expression evaluator (unlicensed).
  • Using Dimitri Diakopoulos' & Sean Barrett's unicode stuff (unlicensed).

License

  • rlyeh, unlicensed (~public domain).

API

String expression evaluator

  • Evaluates a mathematical expression. Returns the numeric result, or NaN on error.
  • Note: To check for NaN use isnan(ret) or bool error = (ret != ret);
<script type='preformatted'>
~~~C
ABI double streval(const char *expression);
~~~
</script>
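
The two NaN checks mentioned in the note can be sketched as follows. This is a minimal illustration that does not call the library: `demo_failed_eval` is a hypothetical stand-in for an evaluation that failed, and `demo_is_error` is a hypothetical helper name.

<script type='preformatted'>
```c
#include <math.h>
#include <stdbool.h>

// Hypothetical stand-in for a failed evaluation: reports the error as NaN.
static double demo_failed_eval(void) {
    return NAN;
}

// Self-comparison test: NaN is the only floating-point value
// that compares unequal to itself.
static bool demo_is_error(double ret) {
    return ret != ret;
}
```
</script>

Both `isnan(ret)` and `ret != ret` give the same answer under standard IEEE 754 semantics. Aggressive floating-point flags such as `-ffast-math` may optimize either check away, so treat that as a build-configuration caveat.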

String fuzzy completion

  • Compares two strings. Returns string matching score (higher is better).
  • Fuzzy-searches a word in a list of given words. Returns the best match, or an empty string if none.
<script type='preformatted'>
~~~C
ABI int          strscore(const char *string1, const char *string2);
ABI const char * strfuzzy(const char *string, int num, const char *words[]);
~~~
</script>

String regular expression

  • Regular expression matching. Returns non-zero if found.
    • c matches any literal character c.
    • ? matches any single character.
    • ^ matches the beginning of the input string.
    • $ matches the end of the input string.
    • * matches zero or more occurrences of the previous character.
  • Return true if string matches wildcard pattern expression (?*).
<script type='preformatted'>
~~~C
ABI int  strregex(const char *string, const char *regex);
ABI bool strmatch(const char *string, const char *substring);
~~~
</script>
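
To make the five regex rules above concrete, here is a minimal sketch of a matcher with the same rule set. It follows the well-known recursive structure of Rob Pike's matcher (which the credits mention), using `?` as the single-character wildcard. The `demo_*` names are hypothetical and this is not the code shipped in the header.

<script type='preformatted'>
```c
#include <stddef.h>

static int demo_match_here(const char *regex, const char *text);

// Match zero or more occurrences of c, followed by the rest of the pattern.
static int demo_match_star(int c, const char *regex, const char *text) {
    do {
        if (demo_match_here(regex, text)) return 1;
    } while (*text != '\0' && (*text++ == c || c == '?'));
    return 0;
}

// Match the pattern at the current position of text.
static int demo_match_here(const char *regex, const char *text) {
    if (regex[0] == '\0') return 1;                        // pattern exhausted
    if (regex[1] == '*') return demo_match_star(regex[0], regex + 2, text);
    if (regex[0] == '$' && regex[1] == '\0') return *text == '\0';
    if (*text != '\0' && (regex[0] == '?' || regex[0] == *text))
        return demo_match_here(regex + 1, text + 1);
    return 0;
}

// Search for the pattern anywhere in text; '^' anchors it to the start.
static int demo_regex(const char *text, const char *regex) {
    if (regex[0] == '^') return demo_match_here(regex + 1, text);
    do {
        if (demo_match_here(regex, text)) return 1;
    } while (*text++ != '\0');
    return 0;
}
```
</script>

With this sketch, `demo_regex("hello", "^hel")` and `demo_regex("hello", "llo$")` both report a match, while `demo_regex("hello", "^ell")` does not.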

String hashing

  • Runtime string hash. Returns 64-bit hash of given string.
  • Compile-time string hash macro. Returns 64-bit hash of given string.
  • Note: Macro requires code optimizations enabled (-O3 for gcc, /O2 for MSVC).
<script type='preformatted'>
~~~C
ABI uint64_t strhash(const char *string);
ABI uint64_t STRHASH(const char *string);
~~~
</script>
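
This document does not state which hash function `strhash` uses. As an illustration of what a runtime 64-bit string hash looks like, the sketch below uses FNV-1a, a swapped-in technique chosen for brevity. `demo_hash64` is a hypothetical name and its values should not be expected to match `strhash`.

<script type='preformatted'>
```c
#include <stdint.h>

// FNV-1a, 64-bit variant: XOR each byte into the state, then multiply
// by the FNV prime. Not necessarily the algorithm used by strhash().
static uint64_t demo_hash64(const char *string) {
    uint64_t hash = 14695981039346656037ULL;   // FNV offset basis
    while (*string) {
        hash ^= (unsigned char)*string++;
        hash *= 1099511628211ULL;              // FNV prime
    }
    return hash;
}
```
</script>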

String interning dictionary (quarks)

  • Insert string into dictionary (if not exists). Returns quark ID, or 0 if empty string.
  • Retrieve previously interned string. ID#0 returns empty string always.
<script type='preformatted'>
~~~C
ABI int          strput(const char *string);
ABI const char * strget(int quark);
~~~
</script>
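
The contract described above (insert once, get a stable ID back, ID 0 reserved for the empty string) can be sketched with a tiny fixed-size table. The linear search, the 256-entry limit and the `demo_*` names are assumptions made for illustration. The library's real storage strategy is not described here.

<script type='preformatted'>
```c
#include <stdlib.h>
#include <string.h>

enum { DEMO_MAX_QUARKS = 256 };
static char *demo_quarks[DEMO_MAX_QUARKS]; // slot 0 stays unused: ID 0 means ""
static int demo_quark_count = 1;

// Return the ID of string, inserting a private copy on first sight.
static int demo_put(const char *string) {
    if (!string || !*string) return 0;                   // empty string -> ID 0
    for (int i = 1; i < demo_quark_count; ++i)
        if (strcmp(demo_quarks[i], string) == 0) return i;
    if (demo_quark_count >= DEMO_MAX_QUARKS) return 0;   // table full (sketch only)
    size_t size = strlen(string) + 1;
    char *copy = malloc(size);
    if (!copy) return 0;
    memcpy(copy, string, size);
    demo_quarks[demo_quark_count] = copy;
    return demo_quark_count++;
}

// Return the interned string for an ID; unknown IDs and ID 0 yield "".
static const char *demo_get(int quark) {
    if (quark <= 0 || quark >= demo_quark_count) return "";
    return demo_quarks[quark];
}
```
</script>

Because equal strings map to the same ID, two interned strings can be compared with a single integer comparison.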

String matching

  • Extract substring from position. Negative positions are relative to end of string.
  • Search substring from beginning (left). Returns end-of-string if not found.
  • Search substring from end (right). Returns end-of-string if not found.
  • Return true if string starts with substring.
  • Return true if string ends with substring.
  • Return true if strings are equal.
  • Return true if strings are equal (case insensitive).
<script type='preformatted'>
~~~C
ABI const char* strsub  (const char *string, int position);
ABI const char* strfindl(const char *string, const char *substring);
ABI const char* strfindr(const char *string, const char *substring);
ABI bool        strbegin(const char *string, const char *substring);
ABI bool        strend  (const char *string, const char *substring);
ABI bool        streq   (const char *string, const char *substring);
ABI bool        streqi  (const char *string, const char *substring);
~~~
</script>
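
The negative-position rule and the prefix/suffix tests can be sketched as below. The `demo_*` helpers are hypothetical. Clamping out-of-range positions to the string bounds is an assumption, since the text above does not say how `strsub` treats them.

<script type='preformatted'>
```c
#include <stdbool.h>
#include <string.h>

// Return a pointer into string at position; negative positions count
// from the end. Out-of-range positions are clamped (an assumption).
static const char *demo_sub(const char *string, int position) {
    int length = (int)strlen(string);
    if (position < 0) position += length;
    if (position < 0) position = 0;
    if (position > length) position = length;
    return string + position;
}

// True if string starts with prefix.
static bool demo_begin(const char *string, const char *prefix) {
    return strncmp(string, prefix, strlen(prefix)) == 0;
}

// True if string ends with suffix.
static bool demo_end(const char *string, const char *suffix) {
    size_t ls = strlen(string), lx = strlen(suffix);
    return lx <= ls && memcmp(string + ls - lx, suffix, lx) == 0;
}
```
</script>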

String normalization

  • Normalize resource identifier. Normalized URIs have deterministic string hashes.
<script type='preformatted'>
~~~C
ABI TEMP char * strnorm (const char *uri);
~~~
</script>

String conversion utils

  • Convert a string to an integer.
  • Convert a number to a human-readable string (12000000 -> 12M).
  • Convert a human-readable string to a number (12M -> 12000000).
<script type='preformatted'>
~~~C
ABI int64_t    strint  (const char *string);
ABI TEMP char* strhuman(uint64_t number);
ABI uint64_t   strrobot(const char *string);
~~~
</script>
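
The round trip between 12000000 and 12M can be sketched as follows. The decimal (base-1000) units, the truncation of the remainder, the `KMGTPE` suffix set and the `demo_*` names are all assumptions for illustration. The exact rounding and suffix rules of `strhuman` and `strrobot` are not specified above.

<script type='preformatted'>
```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Write number into out using a base-1000 suffix, truncating the remainder.
static void demo_human(uint64_t number, char *out, size_t size) {
    static const char suffixes[] = { 0, 'K', 'M', 'G', 'T', 'P', 'E' };
    int index = 0;
    while (number >= 1000 && index < 6) { number /= 1000; ++index; }
    if (index) snprintf(out, size, "%" PRIu64 "%c", number, suffixes[index]);
    else       snprintf(out, size, "%" PRIu64, number);
}

// Parse digits followed by an optional suffix and scale accordingly.
static uint64_t demo_robot(const char *string) {
    char *end = NULL;
    uint64_t value = strtoull(string, &end, 10);
    const char *units = "KMGTPE";
    const char *unit = *end ? strchr(units, *end) : NULL;
    if (unit)
        for (int i = 0; i <= (int)(unit - units); ++i) value *= 1000;
    return value;
}
```
</script>

Unlike `strhuman`, which returns a temporary buffer, this sketch writes into a caller-supplied buffer to stay self-contained.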

String options parsing

  • Parse argc/argv looking for comma-separated values. Returns the matching string, or the default.
  • Parse argc/argv looking for comma-separated values. Returns the matching integer, or the default.
  • Parse argc/argv looking for comma-separated values. Returns the matching floating-point value, or the default.
<script type='preformatted'>
~~~C
ABI const char * stropt (const char *defaults, const char *options_csv);
ABI int64_t      stropti(int64_t defaults, const char *options_csv);
ABI double       stroptf(double defaults, const char *options_csv);
~~~
</script>

String format (temporary buffers)

  • Format a C-style formatted string. Returns temporary buffer (do not free()).
  • Format a C-style formatted valist. Returns temporary buffer (do not free()).
  • Note: 16 buffers are handled internally so that nested calls are safe within reasonable limits.
<script type='preformatted'>
~~~C
ABI TEMP char * strf (const char *format, ...);
ABI TEMP char * strfv(const char *format, va_list list);
~~~
</script>
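
The note about 16 internal buffers suggests a rotating pool, which is why nested calls stay valid for a while. A minimal sketch of that idea is shown below. The fixed 256-byte slot size, the lack of thread safety and the name `demo_tempf` are assumptions. The header's actual buffer management may differ.

<script type='preformatted'>
```c
#include <stdarg.h>
#include <stdio.h>

enum { DEMO_SLOTS = 16, DEMO_SLOT_SIZE = 256 };

// Format into the next slot of a static ring of buffers. The result stays
// valid until DEMO_SLOTS further calls have been made. Not thread-safe.
static char *demo_tempf(const char *format, ...) {
    static char slots[DEMO_SLOTS][DEMO_SLOT_SIZE];
    static int next = 0;
    char *buffer = slots[next];
    next = (next + 1) % DEMO_SLOTS;

    va_list args;
    va_start(args, format);
    vsnprintf(buffer, DEMO_SLOT_SIZE, format, args); // truncates long output
    va_end(args);
    return buffer;
}
```
</script>

Because each call takes a fresh slot, an expression like `demo_tempf("[%s|%s]", demo_tempf("%d", 1), demo_tempf("%d", 2))` uses three distinct buffers. The returned pointers must not be passed to `free()`.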

String format (heap buffers)

  • Assign a C-style formatted string. Reallocates input buffer (will create buffer if str is NULL).
  • Assign a C-style formatted valist. Reallocates input buffer (will create buffer if str is NULL).
  • Concat a C-style formatted string. Reallocates input buffer (will create buffer if str is NULL).
  • Concat a C-style formatted valist. Reallocates input buffer (will create buffer if str is NULL).
<script type='preformatted'>
~~~C
ABI HEAP char * strcpyf (INOUT char **string, const char *format, ...);
ABI HEAP char * strcpyfv(INOUT char **string, const char *format, va_list list);
ABI HEAP char * strcatf (INOUT char **string, const char *format, ...);
ABI HEAP char * strcatfv(INOUT char **string, const char *format, va_list list);
~~~
</script>
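
The "reallocates input buffer, creates it if NULL" behaviour can be sketched with a measure-then-write pattern built on `vsnprintf` and `realloc`. `demo_catf` is a hypothetical name. The error handling shown (return `NULL` and leave the original buffer untouched) is an assumption, not documented library behaviour.

<script type='preformatted'>
```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

// Append formatted text to *string, growing (or creating) the heap buffer.
static char *demo_catf(char **string, const char *format, ...) {
    va_list args;

    va_start(args, format);
    int extra = vsnprintf(NULL, 0, format, args);   // first pass: measure
    va_end(args);
    if (extra < 0) return NULL;

    size_t used = *string ? strlen(*string) : 0;
    char *grown = realloc(*string, used + (size_t)extra + 1);
    if (!grown) return NULL;                        // *string is still valid

    va_start(args, format);
    vsnprintf(grown + used, (size_t)extra + 1, format, args); // second pass: write
    va_end(args);

    *string = grown;
    return grown;
}
```
</script>

Starting from `char *s = NULL;`, two calls `demo_catf(&s, "%s", "ab")` and `demo_catf(&s, "%d", 12)` leave `s` pointing at `"ab12"`, which the caller eventually releases with `free(s)`.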

String splitting

  • Check delimiters and split string into tokens. Function never allocates.
  • Check delimiters and split string into tokens. Returns null-terminated list.
  • Join tokens into a string.
<script type='preformatted'>
~~~C
ABI int         strchop (const char *string, const char *delimiters, int avail, const char *tokens[]);
ABI HEAP char** strsplit(const char *string, const char *delimiters);
ABI HEAP char*  strjoin (INOUT char **out, const char *tokens[], const char *separator);
~~~
</script>

String trimming

  • Delete substring from a string.
  • Trim characters from beginning of string to first-find (b-str-str-e to x-xxx-str-e).
  • Trim characters from beginning of string to last-find (b-str-str-e to x-xxx-xxx-e).
  • Trim characters from first-find to end of string (b-str-str-e to b-xxx-xxx-x).
  • Trim characters from last-find to end of string (b-str-str-e to b-str-xxx-x).
  • Trim leading, trailing and excess embedded whitespace.
<script type='preformatted'>
~~~C
ABI char* strdel    (char *string, const char *substring);
ABI char* strtrimbff(char *string, const char *substring);
ABI char* strtrimblf(char *string, const char *substring);
ABI char* strtrimffe(char *string, const char *substring);
ABI char* strtrimlfe(char *string, const char *substring);
ABI char* strtrimws (char *string);
~~~
</script>
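
The whitespace rule (drop leading and trailing whitespace, collapse embedded runs) can be sketched as a single in-place pass. `demo_trimws` is a hypothetical name. Replacing every embedded run with exactly one space character is an assumption about what "excess" means.

<script type='preformatted'>
```c
#include <ctype.h>

// Trim leading/trailing whitespace and collapse embedded runs, in place.
static char *demo_trimws(char *string) {
    char *read = string, *write = string;
    int pending = 0;   // a whitespace run was seen after some output
    for (; *read; ++read) {
        if (isspace((unsigned char)*read)) {
            pending = (write != string);   // ignore leading whitespace
        } else {
            if (pending) *write++ = ' ';   // emit one space per embedded run
            pending = 0;
            *write++ = *read;
        }
    }
    *write = '\0';     // a trailing run is never emitted
    return string;
}
```
</script>

The function only ever shortens the text, so it needs a writable buffer but no extra memory.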

String transforms

  • Replace a substring in a string.
  • Remap specific characters in a string from a given set to another one. Length of both sets must be identical.
  • Convert a string to lowercase (ANSI only, not utf8/locale aware!).
  • Convert a string to uppercase (ANSI only, not utf8/locale aware!).
  • Reverse a string in-place.
<script type='preformatted'>
~~~C
ABI HEAP char* strrepl (INOUT char **string, const char *target, const char *replace);
ABI char*      strremap(INOUT char *string, const char srcs[], const char dsts[]);
ABI char*      strlower(char *string);
ABI char*      strupper(char *string);
ABI char*      strrev  (char *string);
~~~
</script>
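
Two of the in-place transforms, character remapping and reversal, can be sketched as below. The `demo_*` names are hypothetical, and treating `srcs` and `dsts` as NUL-terminated strings of equal length is an assumption based on the description above.

<script type='preformatted'>
```c
#include <string.h>

// Replace each character found in srcs with the character at the same
// index in dsts. Both sets are assumed NUL-terminated and equally long.
static char *demo_remap(char *string, const char srcs[], const char dsts[]) {
    for (char *p = string; *p; ++p) {
        const char *hit = strchr(srcs, *p);
        if (hit) *p = dsts[hit - srcs];
    }
    return string;
}

// Reverse a string in place by swapping from both ends (bytes, not codepoints).
static char *demo_rev(char *string) {
    size_t length = strlen(string);
    if (length < 2) return string;
    for (size_t i = 0, j = length - 1; i < j; ++i, --j) {
        char tmp = string[i];
        string[i] = string[j];
        string[j] = tmp;
    }
    return string;
}
```
</script>

Like the case conversions above, byte-wise reversal is not UTF-8 aware: multi-byte sequences would be scrambled.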

String unicode utils

  • Extract 32-bit codepoint from string. Also advances input string to next codepoint.
  • Convert 32-bit codepoint to UTF-8 string.
  • Convert UTF-8 to UTF-16 string.
  • Convert UTF-16 to UTF-8 string.
<script type='preformatted'>
~~~C
ABI uint32_t      strutf32(INOUT const char **utf8);
ABI TEMP char *   strutf8(uint32_t codepoint);
ABI TEMP wchar_t* strwiden(const char *utf8);
ABI TEMP char*    strshorten(const wchar_t *utf16);
~~~
</script>
<script>markdeepOptions={tocStyle:'long'};</script>
<script src='https://casual-effects.com/markdeep/latest/markdeep.min.js?'></script>