
Tutorial: strings

Paul Alexander Bilokon edited this page Dec 15, 2024 · 1 revision

String Utilities Tutorial

This tutorial provides an overview of the string manipulation utilities available in the thalesians.adiutor.strings module, with examples of how to sanitize strings and generate unique strings.


1. Overview

Functionality

The thalesians.adiutor.strings module provides tools for cleaning, formatting, and generating unique strings. These utilities are particularly useful in data processing and for keeping string formatting consistent across applications.


2. sanitize_str Function

Description

The sanitize_str function takes a raw string and transforms it into a sanitized, lowercase string suitable for use as an identifier or file name.

Parameters

  • raw_str: The raw string to sanitize.

Returns

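  • sanitized_str: The sanitized string, in lowercase, with special characters removed or replaced by underscores. In the usage example, the accented "é" becomes "e", and each run of punctuation, whitespace, or underscores collapses to a single underscore.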

Usage Example

import thalesians.adiutor.strings as our_strings

raw_string = "Hello, Hélyette?! This  is a__raw_str"
sanitary = our_strings.sanitize_str(raw_string)
print(sanitary)

Output

hello_helyette_this_is_a_raw_str
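
The module's implementation is not shown in this tutorial. As a rough illustration only, the behaviour in the example above could be reproduced with the standard library along the following lines. The name sanitize_str_sketch is hypothetical, and the real sanitize_str may treat edge cases differently.

```python
import re
import unicodedata


def sanitize_str_sketch(raw_str: str) -> str:
    """Hypothetical stand-in, not the library's actual implementation."""
    # Decompose accented characters (e.g. "é" becomes "e" plus a combining
    # accent), then drop everything that is not plain ASCII.
    ascii_str = (
        unicodedata.normalize("NFKD", raw_str)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, then collapse every run of non-alphanumeric characters
    # (punctuation, whitespace, underscores) into a single underscore.
    collapsed = re.sub(r"[^a-z0-9]+", "_", ascii_str.lower())
    # Remove underscores left at either end.
    return collapsed.strip("_")


print(sanitize_str_sketch("Hello, Hélyette?! This  is a__raw_str"))
```

Collapsing runs rather than replacing characters one by one is what keeps "a__raw_str" from producing doubled underscores in this sketch.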

3. make_unique_str Function

Description

The make_unique_str function ensures that a given string is unique within a list of existing strings by appending an underscore and a numeric suffix if necessary. If the base string is not already in the list, it is returned unchanged.

Parameters

  • base_str: The base string to make unique.
  • existing_strs: A list of strings against which uniqueness is checked.

Returns

  • unique_str: A unique string derived from the base string.

Usage Example

import thalesians.adiutor.strings as our_strings

existing_strings = ["foo", "foo_1", "foo_3"]
unique_str = our_strings.make_unique_str("foo", existing_strings)
print(unique_str)

Output

foo_2
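
The result is foo_2 rather than foo_4, which suggests that the first unused suffix is chosen. A minimal sketch of that logic, assuming suffixes start at 1 and are joined with an underscore, might look as follows. The name make_unique_str_sketch is hypothetical, and the library's own implementation may differ.

```python
from typing import Iterable


def make_unique_str_sketch(base_str: str, existing_strs: Iterable[str]) -> str:
    """Hypothetical stand-in, not the library's actual implementation."""
    # A set gives constant-time membership checks.
    taken = set(existing_strs)
    # No clash: the base string is already unique.
    if base_str not in taken:
        return base_str
    # Otherwise try base_1, base_2, ... and return the first free candidate.
    suffix = 1
    while f"{base_str}_{suffix}" in taken:
        suffix += 1
    return f"{base_str}_{suffix}"


print(make_unique_str_sketch("foo", ["foo", "foo_1", "foo_3"]))
```

Scanning upward from 1 fills gaps in the existing names, so "foo_3" being present does not push the result past "foo_2".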

4. Unit Test Examples

The following minimal unit tests check sanitize_str and make_unique_str against the examples above, plus the case where the base string is already unique:

import unittest
import thalesians.adiutor.strings as our_strings

class TestStringUtils(unittest.TestCase):
    def test_sanitize_str(self):
        self.assertEqual(
            our_strings.sanitize_str("Hello, Hélyette?! This  is a__raw_str"),
            "hello_helyette_this_is_a_raw_str")

    def test_make_unique_str(self):
        self.assertEqual(our_strings.make_unique_str("foo", ["bar"]), "foo")
        self.assertEqual(
            our_strings.make_unique_str("foo", ["foo", "foo_1", "foo_3"]),
            "foo_2")

if __name__ == "__main__":
    unittest.main()

5. Conclusion

The thalesians.adiutor.strings module offers simple yet effective tools for string manipulation. The sanitize_str function ensures strings are formatted in a consistent, machine-friendly manner, while make_unique_str helps avoid naming conflicts in datasets or file systems.
