Skip to content

Allow reading/writing step files to string#1299

Closed
anuraaga wants to merge 1 commit intotpaviot:masterfrom
anuraaga:step-string
Closed

Allow reading/writing step files to string#1299
anuraaga wants to merge 1 commit intotpaviot:masterfrom
anuraaga:step-string

Conversation

@anuraaga
Copy link

@anuraaga anuraaga commented Feb 8, 2024

In #1096 there is great interest in being able to marshal step files without temporary files on disk. While *Stream methods are provided, they are awkward to use with Python files, and I think in some places in the library this is acknowledged and methods that read and write strings are provided. Strings do require buffering so may not be ideal compared to a stream in all cases, but in many cases it actually already is buffered, and it is still a significant improvement over temporary files for many of the cases listed in the issue, it would be great to be able to use this without files.

I mostly copied the pattern of breptools

https://github.com/tpaviot/pythonocc-core/blob/master/src/SWIG_files/wrapper/BRepTools.i#L855

If there's a reason to avoid this, no worries in closing this as it's out of the blue. But since the change wasn't big I figured I'd give it a shot.

Copy link

@sourcery-ai sourcery-ai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Type: Enhancement

PR Summary: This pull request introduces functionality to read and write STEP files as strings, eliminating the need for temporary files on disk. This enhancement aligns with the interest expressed in issue #1096 and follows a pattern similar to that used in breptools. The implementation includes a comprehensive test case that demonstrates the process of writing a STEP file to a string, including setting up a document, creating a shape, assigning colors, and then reading the STEP data back from the string to verify the integrity of the shape and its properties.

Decision: Comment

📝 Type: 'Enhancement' - not supported yet.
  • Sourcery currently only approves 'Typo fix' PRs.
✅ Issue addressed: this change correctly addresses the issue or implements the desired feature.
No details provided.
📝 Complexity: the changes are too large or complex for Sourcery to approve.
  • Unsupported files: the diff contains files that Sourcery does not currently support during reviews.

General suggestions:

  • Consider adding more detailed comments within the test case to explain each step of the process for future maintainability.
  • Evaluate the performance implications of handling large STEP files as strings, especially in memory-constrained environments, and document any findings or recommendations.
  • Explore the possibility of extending this functionality to other file formats supported by the library, providing a consistent interface for string-based file handling.

Thanks for using Sourcery. We offer it for free for open source projects and would be very grateful if you could help us grow. If you like it, would you consider sharing Sourcery on your favourite social media? ✨

Share Sourcery

Help me be more useful! Please click 👍 or 👎 on each comment to tell me if it was helpful.

self.assertFalse(a_shape.IsNull())

def test_read_write_step_string(self) -> None:
"""Writes a step file as a string and reads it back"""
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought that doing a round trip would be the simplest way to ensure both writing and reading have assertions applied to them. I noticed that test_write_step_file may not have any assertions being applied

@tpaviot
Copy link
Owner

tpaviot commented Feb 21, 2024

Thank you for this interesting contribution.

However, the thing is you still need to have the step file stored on your local filesystem, and I can't see the improvement compared to the standard Read method. It would be much better that the ReadFromString method only takes a string, let's say:

with open("test.stp", "r", encoding="uft8") as step_file:
    step_file_content = step_file.read()
step_reader.ReadFromString(step_file_content)

This way, the ReadFromString method could take any string, coming from instance from an API call.

@anuraaga
Copy link
Author

Hi @tpaviot - I included the filename in ReadString to be consistent with ReadStream. I'm actually not sure what the filename is used for, I assumed it's something to help the STEP parser understand. But in neither case is the file actually present on the filesystem, which we confirm in the unit test.

Note ReadStream upstream doc is here

https://dev.opencascade.org/doc/refman/html/class_x_s_control___reader.html#a49a080413455e31484199c4bdf25e08e

It doesn't quite say what the purpose of the parameter is but seems to be required. One idea is to implement ReadString with a hard-coded filename instead of taking a parameter, fromstring.step or something but wasn't sure the implications of it. We would need some sort of file name since in the end we need to call ReadStream.

@tpaviot
Copy link
Owner

tpaviot commented Feb 21, 2024

Good point ! If the filename is not necessary, let's just remove it (pass an empty string as filename), we don't need it at the python level (I have to check however why the filename is a mandatory parameter).

FYI I dont modify swig files directly in the wrapper directory, I use to work with the pythonocc-generator:

  • I commit/push changes to the pythonocc-generator

  • I generate all the swig files into the pythonocc-core /wrapper directory

  • I commit changes to pythonocc-core and monitor the ci process

@tpaviot
Copy link
Owner

tpaviot commented Feb 21, 2024

Using an empty string seems to work, but there maybe some side effects I can not see so far.

@tpaviot
Copy link
Owner

tpaviot commented Feb 21, 2024

I modified the example https://github.com/tpaviot/pythonocc-demos/blob/master/examples/core_load_step_ap203_ocaf.py so that it can taka a string, that seems to work the same way as using a file.

Maybe there would be a more explicit solution than just passing an empty string as the stream name. Or leave this parameter with a default value, for instance:

def ReadFromString(the_string, the_stream_name="pythonocc_stream_name"):

@anuraaga your opinion?

@tpaviot
Copy link
Owner

tpaviot commented Feb 21, 2024

Reverse engineering of the ReadStream method of STEPCAFControl_Reader:

It simply calls the ReadStream method of the myReader instance with type STEPControl_Reader

The stream name parameter is used at line https://github.com/Open-Cascade-SAS/OCCT/blob/master/src/STEPControl/STEPControl_Reader.cxx#L201 :

WS()->SetLoadedFile(theName);

The WS instance is a XSControl_WorkSession

The stream is read by calling the method ReadStream of the aLibrary instance (line 187) with type IFSelect_WorkLibrary defined here: https://github.com/Open-Cascade-SAS/OCCT/blob/master/src/IFSelect/IFSelect_WorkLibrary.cxx#L91

It just returns 1. Damned, dead end.

 void SetLoadedFile (const Standard_CString theFileName)
  { theloaded = theFileName; }
  
  //! Returns the filename used to load current model
  //! empty if unknown
  Standard_CString LoadedFile() const
  { return theloaded.ToCString(); }
  • there are only 10 occurrences of the LoadedFile method all over the occt codebase. As far as I understand, it is mostly used in the DRAW framework, whici is not covered by pythonocc.

Conclusion I guess we can safely use an hardcoded string for the stream name parameter. We can wrap it as an optional parameter in python, if ever someone needs to set it to a different value.

@anuraaga
Copy link
Author

Thanks for confirming! In that case I lean towards not exposing a parameter, and if for some reason someone asks for it it could be added with a default value to be backwards compatible (I think). I somehow expect for no one to ever ask for it.

pythonocc-generator

Sorry I hadn't noticed this repo. Should I move this change to it? And also

  • Remove name parameter
  • ReadFromString / WriteToString? I went with matching the grammar of ReadStream for this PR but noticed that pattern in the codebase too, happy to use either.

@tpaviot
Copy link
Owner

tpaviot commented Feb 21, 2024

I've been working on a more robust solution : wrap all std::istream and std::ostream parameters in occt to python strings. This leads to a simpler generator, and a python wrapper more aligned with occt method names and signature. It maybe less pythonic, but easier to go from the official documentation to python. PR is on the way

@tpaviot
Copy link
Owner

tpaviot commented Feb 22, 2024

@anuraaga see #1301

@anuraaga
Copy link
Author

Thanks! That looks very useful

@anuraaga anuraaga closed this Feb 22, 2024
@hunosdof
Copy link

hunosdof commented Nov 11, 2024

I am trying to load a step file from a stream instead directly from a file.

However, the following function fails:

status = step_reader.ReadStream(file_path, file_content)

I am passing this file_content:
if isinstance(file_content, bytes):
file_content = file_content.decode('utf-8')

It is in utf-8 format.
However, I do not get an error but my code fails in this line.

I assume that the file_content argument is incorrect (but str is expected).
Is there a specific format this string has to be in?

Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants