This repository was archived by the owner on Dec 5, 2025. It is now read-only.

pycti library has problem with non-ASCII characters #723

@Lhorus6

Description

In a stream connector, we use the "self.helper.listen_stream()" function to listen to a stream. The problem is that the data received by the connector is truncated when it contains a non-ASCII character.

The use case is as follows:

  • I send File Observables in a stream.
  • Some of my Files have non-ASCII characters in the "x_opencti_additional_names" field because the file names themselves contain non-ASCII characters.
  • The pycti function truncates the response at the position of the non-ASCII character.
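
This report does not identify the root cause inside pycti. As a minimal sketch of one well-known way a text-based stream parser can cut a payload at a non-ASCII character, the example below compares `str.splitlines()`, which treats several non-ASCII code points such as U+0085 and U+2028 as line boundaries, with splitting on "\n" only. The function names and the event text are hypothetical and are not taken from pycti.

```python
def parse_data_with_splitlines(event_text):
    """Return the 'data' field, splitting lines with str.splitlines()."""
    # splitlines() also breaks on U+0085, U+2028, U+2029 and a few others.
    for line in event_text.splitlines():
        if line.startswith("data:"):
            return line[len("data:"):].lstrip(" ")
    return None


def parse_data_with_newline_split(event_text):
    """Return the 'data' field, splitting on the newline character only."""
    for line in event_text.split("\n"):
        if line.startswith("data:"):
            return line[len("data:"):].lstrip(" ")
    return None


# Hypothetical event: the file name contains U+0085 (NEXT LINE).
event_text = 'event: update\ndata: {"name": "abc\u0085def.hwp"}'

truncated = parse_data_with_splitlines(event_text)
complete = parse_data_with_newline_split(event_text)

print(repr(truncated))  # the payload stops right before U+0085
print(repr(complete))   # the payload is intact
```

If the connector receives data that stops exactly where such a character sits, this kind of line splitting is one candidate to check, but it remains an assumption until confirmed in the library code.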

Seen in the stream

image

Retrieved by my connector

Screenshot 2024-08-29 220415

Environment

OCTI 6.2.16

Reproducible Steps

Steps to create the smallest reproducible scenario:

  1. Create a live stream with the filters "Entity type: File AND label: test-bug".
  2. Create a File with only an MD5 hash (no author, no marking, etc., to avoid noise in the stream).
  3. Run a stream connector in debug mode, listening to your stream, with a breakpoint at the place where it processes the retrieved data.
  4. Add in the "name" field of the File: 2020ë�� ì�°êµ¬ ì �문ì�� ë°� ì��ì��ì��ë¶�ì�¼ ê²½ë ¥ì�¬ì�� ì� ë°� 모ì§�ì��ê°�.hwp
  5. Add the label "test-bug" on the File to send it in the stream.
  6. Look at the data retrieved on the connector side. -> The data is truncated.
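
To make the check in step 6 less manual, a small helper like the one below could be called at the breakpoint. It is only a sketch: `describe_payload` is a hypothetical name, and it assumes the connector callback receives the event payload as a JSON string (for example in a `data` attribute of the message), which this report does not show.

```python
import json


def describe_payload(raw_data, expected_fragment):
    """Report whether raw_data is complete JSON and contains expected_fragment."""
    try:
        parsed = json.loads(raw_data)
    except json.JSONDecodeError as exc:
        # A truncated payload usually fails to parse; report where it broke.
        return {
            "complete": False,
            "length": len(raw_data),
            "error_position": exc.pos,
            "contains_expected": expected_fragment in raw_data,
        }
    # Re-serialize without ASCII escaping so non-ASCII text can be searched.
    normalized = json.dumps(parsed, ensure_ascii=False)
    return {
        "complete": True,
        "length": len(raw_data),
        "error_position": None,
        "contains_expected": expected_fragment in normalized,
    }


# Hypothetical payloads: one intact, one cut in the middle of the file name.
full_payload = '{"data": {"name": "abc\u0085def.hwp"}}'
cut_payload = '{"data": {"name": "abc'

print(describe_payload(full_payload, "def.hwp"))
print(describe_payload(cut_payload, "def.hwp"))
```

Passing the end of the file name as `expected_fragment` shows quickly whether the part after the non-ASCII character reached the connector.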

Expected Output

The connector should receive the whole data, exactly as I see it in my stream.

Metadata

Assignees

No one assigned

    Labels

    bug: use for describing something not working as expected
    solved: use to identify issue that has been solved (must be linked to the solving PR)
