Skip to content

Add ipynb conversion#86

Merged
vanyle merged 36 commits intomasterfrom
Add-ipynb-conversion-draft
Dec 12, 2021
Merged

Add ipynb conversion#86
vanyle merged 36 commits intomasterfrom
Add-ipynb-conversion-draft

Conversation

@FabienRoger
Copy link
Contributor

This is a draft, many required features are missing.
What is there

  1. A basic converter which uses the default code block and fills it with the pieces of codes, and align them left to right
  2. The load button now supports loading a ipynb file
    What is missing
  3. Many crucial things (such as markdown and outputs) are ignored by the converter (see the TODOs in the python file)
  4. There is not dedicated "load ipynb button" for now. It works with the load button when a .ipynb file is selected.
  5. Unit tests might be required for this one, but are not yet present

@FabienRoger FabienRoger requested a review from vanyle December 2, 2021 22:08
@FabienRoger FabienRoger linked an issue Dec 2, 2021 that may be closed by this pull request
@FabienRoger FabienRoger marked this pull request as draft December 2, 2021 22:30
@FabienRoger FabienRoger changed the title Add ipynb conversion draft Add ipynb conversion Dec 2, 2021

data["source"] = ''.join(cell["source"])

data["sockets"] = {} # TODO : add sockets
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Less TODO comments. That's why there are github comments and issues. One is fine but there are too many.

data = json.loads(file.read())

data["id"] = 0 # TODO : give a proper id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set the height / width of data to match the number of line of the source to avoid scrolling when reading the imported file.

# Load the default empty block
# TODO : add something in case the user renames / removes the empty block / changes it too much ?
data: OrderedDict = {}
with open("blocks/empty.ocbb", 'r', encoding='utf-8') as file:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The empty block might change places in the future. The data fields should be set from scratch.

data = OrderedDict([
  ("id",0),
  # ...
])

You can checkout the serialize in block.py as an example

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

27022fb
You can do something similar to what you did in this commit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the future, the serialization process will be more robust and we'll be able to not set some fields like splitter and enjoy sane defaults.

data = json.loads(file.read())
return data

def load_from_ipynb(self, filepath: str) -> OrderedDict:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is the same as load_from_ipyg. Consider renaming load_from_ipyg to something like load_from_json.

@FabienRoger
Copy link
Contributor Author

Update on the state of the pull request

What is there

  1. A basic converter which add pieces of codes, and align them left to right, and align markdown top to bottom
  2. The load button now supports loading a ipynb file
  3. Automatic resizing to see the hole code

Here is what it looks like
image

What is missing

  1. Many crucial things (such outputs) are ignored by the converter
  2. There is not dedicated "load ipynb button" for now. It works with the load button when a .ipynb file is selected.
  3. Unit tests might be required for this one, but are not yet present
  4. Titles ?
  5. Automatic text size detection (here the expected text size is fixed)
  6. Make it look better ?

@vanyle
Copy link
Contributor

vanyle commented Dec 3, 2021

You're right about titles. I guess it depends on your use of markdown inside jupyter notebooks.
One solution would be to merge 2 jupyter notebook cells into one when the markdown cell is short and only contains one line,
but to keep the 2 cell separate if there are multiple lines or if the cell uses markdown specific syntax like #, *italics* or `quotes`

@FabienRoger
Copy link
Contributor Author

Sockets are added and I have implemented your suggestion ("One solution would be to merge 2 jupyter notebook cells into one when the markdown cell is short")

@FabienRoger FabienRoger marked this pull request as ready for review December 3, 2021 22:12
@FabienRoger
Copy link
Contributor Author

What is missing and will be added in future branches :

  1. Conversion from ipyg to ipynb
  2. Display outputs saved in the ipynb file
  3. Unit tests
  4. Automatic text size detection (here the expected text size is fixed)
  5. A dedicated "load ipynb button" for now. It works with the load button when a .ipynb file is selected. Should it really be added ?

@FabienRoger
Copy link
Contributor Author

I need help with the following:

  1. How to generate proper ids ? I don't find the function that does it.
  2. How to get the font size (for the moment, it just uses "12" when converting from ipynb to ipyg)

Copy link
Member

@MathisFederico MathisFederico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A first review !
Thanks for the contribution !

def load_from_ipyg(self, filepath: str):
""" Load an interactive python graph (.ipyg) into the scene.
def load_from_json(self, filepath: str) -> OrderedDict:
"""Load the ipynb json data into an ordered dict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not only .ipynb json data, it can be .ipyg json data too (see lines 148-149)

edges: List[OrderedDict] = add_sockets(blocks)

return {
"id": 0,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"id": 0,

id should not be present to be generated

"""Convert ipynb data (ipynb file, as ordered dict) into ipyg data (ipyg, as ordered dict)"""

blocks: List[OrderedDict] = get_blocks(data)
edges: List[OrderedDict] = add_sockets(blocks)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
edges: List[OrderedDict] = add_sockets(blocks)
edges_data: List[OrderedDict] = get_edges_data(data, blocks)

add_sockets should be named get_edges_data to match get_blocks_data
Also original data should be passed to get_edges_data in case edges data is present in .ipynb metadata

def ipynb_to_ipyg(data: OrderedDict) -> OrderedDict:
"""Convert ipynb data (ipynb file, as ordered dict) into ipyg data (ipyg, as ordered dict)"""

blocks: List[OrderedDict] = get_blocks(data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
blocks: List[OrderedDict] = get_blocks(data)
blocks_data: List[OrderedDict] = get_blocks_data(data)

blocks_data should be used instead of blocks. blocks should referer to actual OCBBlock instances

Comment on lines +237 to +241
def get_integers_generator() -> Generator[int, None, None]:
n = 0
while True:
yield n
n += 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused function

Suggested change
def get_integers_generator() -> Generator[int, None, None]:
n = 0
while True:
yield n
n += 1

Comment on lines +189 to +201
def get_default_input_socket(socket_id: int) -> OrderedDict:
"""Returns the default input socket with the corresponding id"""
return {
"id": socket_id,
"type": "input",
"position": [0.0, SOCKET_HEIGHT],
"metadata": {
"color": "#FF55FFF0",
"linecolor": "#FF000000",
"linewidth": 1.0,
"radius": 10.0,
},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not add random metadata, deserialization should handle missing arguments with default ones. If that's not the case, raise an issue and start a bugfix branch from master.

Suggested change
def get_default_input_socket(socket_id: int) -> OrderedDict:
"""Returns the default input socket with the corresponding id"""
return {
"id": socket_id,
"type": "input",
"position": [0.0, SOCKET_HEIGHT],
"metadata": {
"color": "#FF55FFF0",
"linecolor": "#FF000000",
"linewidth": 1.0,
"radius": 10.0,
},
}
def get_default_input_socket(socket_id: int) -> OrderedDict:
"""Returns the default input socket with the corresponding id"""
return {"id": socket_id, "type": "input"}

Comment on lines +204 to +219
def get_default_output_socket(socket_id: int, block_width: int) -> OrderedDict:
"""
Returns the default input socket with the corresponding id
and at the correct relative position with respect to the block
"""
return {
"id": socket_id,
"type": "output",
"position": [block_width, SOCKET_HEIGHT],
"metadata": {
"color": "#FF55FFF0",
"linecolor": "#FF000000",
"linewidth": 1.0,
"radius": 10.0,
},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should not add random metadata, deserialization should handle missing arguments with default ones. If that's not the case, raise an issue and start a bugfix branch from master.

Suggested change
def get_default_output_socket(socket_id: int, block_width: int) -> OrderedDict:
"""
Returns the default input socket with the corresponding id
and at the correct relative position with respect to the block
"""
return {
"id": socket_id,
"type": "output",
"position": [block_width, SOCKET_HEIGHT],
"metadata": {
"color": "#FF55FFF0",
"linecolor": "#FF000000",
"linewidth": 1.0,
"radius": 10.0,
},
}
def get_default_output_socket(socket_id: int, block_width: int) -> OrderedDict:
"""Returns the default output socket with the corresponding id"""
return {"id": socket_id, "type": "output"}

Comment on lines +222 to +234
def get_default_edge(
edge_id: int,
edge_start_block_id: int,
edge_start_socket_id: int,
edge_end_block_id: int,
edge_end_socket_id: int,
) -> OrderedDict:
return {
"id": edge_id,
"path_type": "bezier",
"source": {"block": edge_start_block_id, "socket": edge_start_socket_id},
"destination": {"block": edge_end_block_id, "socket": edge_end_socket_id},
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

path_type should be handled by deserialization

Suggested change
def get_default_edge(
edge_id: int,
edge_start_block_id: int,
edge_start_socket_id: int,
edge_end_block_id: int,
edge_end_socket_id: int,
) -> OrderedDict:
return {
"id": edge_id,
"path_type": "bezier",
"source": {"block": edge_start_block_id, "socket": edge_start_socket_id},
"destination": {"block": edge_end_block_id, "socket": edge_end_socket_id},
}
def get_default_edge(
edge_id: int,
edge_start_block_id: int,
edge_start_socket_id: int,
edge_end_block_id: int,
edge_end_socket_id: int,
) -> OrderedDict:
return {
"id": edge_id,
"source": {"block": edge_start_block_id, "socket": {"id": socket_id, "type": "output"},
"destination": {"block": edge_end_block_id, "socket": {"id": socket_id, "type": "output"}},
}

@MathisFederico
Copy link
Member

1. How to generate proper ids ? I don't find the function that does it.

You can use a generator or just select a number of unique random high enough integers.
For example, for edge, blocks and sockets id, you just need to gather 5*len(blocks) unique integers and you will be good to go !
We will probably refactor id-managment later on anyway.

2. How to get the font size (for the moment, it just uses "12" when converting from ipynb to ipyg)

Do you mean how to set the font size ? For source text or/and titles ? If yes you can add it in metadata.
Not so sure why you care about font size for this .ipynb conversion.

@FabienRoger
Copy link
Contributor Author

FabienRoger commented Dec 6, 2021

Nice suggestions ! I added them.

Do you mean how to set the font size ? For source text or/and titles ? If yes you can add it in metadata.
Not so sure why you care about font size for this .ipynb conversion.

Blocks width and heights are chosen such that the code fits. I need the text size (or a better solution) to find how much space the code takes.

@MathisFederico
Copy link
Member

Blocks width and heights are chosen such that the code fits. I need the text size (or a better solution) to find how much space the code takes.

I see ! It's not that important for now, we will improve this in a future branch, just choose a 'good enough value' for now.

@MathisFederico MathisFederico requested a review from vanyle December 8, 2021 23:37
Copy link
Member

@MathisFederico MathisFederico left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would not have thought of it like this, but I like this Serializable refactor !
Let's wait for @vanyle approval before merging anything, but it looks good to me !

"""Convert ipynb data (ipynb file, as ordered dict) into ipyg data (ipyg, as ordered dict)"""

blocks_data: List[OrderedDict] = get_blocks_data(data)
edges_data: List[OrderedDict] = get_sockets_data(blocks_data)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still think this should be edges_data though

Suggested change
edges_data: List[OrderedDict] = get_sockets_data(blocks_data)
edges_data: List[OrderedDict] = get_edges_data(blocks_data)

@vanyle
Copy link
Contributor

vanyle commented Dec 9, 2021

image
The font size seems to vary from computer to computer. For example, on my machine, the height of the block needs to be about doubled.

We cannot hardcode TEXT_SIZE_TO_HEIGHT_RATIO into a constant, it needs to be computed.

Look into the QFontMetrics(font) object. Given a string, this object can return the bounding box of said string.
The font can be retreived from the PythonEditor class (inside pyeditor.py).

@vanyle
Copy link
Contributor

vanyle commented Dec 9, 2021

Apart from that, I would also like to say that putting all the default values for cells serialization in their own file was a great idea !

vanyle
vanyle previously approved these changes Dec 9, 2021
Copy link
Contributor

@vanyle vanyle left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code is very clean !
Just look at the comment for the remaining issue. You can merge it anyway if you want. Because the issue is specific to my machine, it might be better if I focus on fixing it as ratios between real height and height inside a QGraphicsView might be involved.

@FabienRoger
Copy link
Contributor Author

There is still an issue determining block size with the newlines that isn't present in the ipynb files I tested it with. I will try to fix that latter.

MathisFederico
MathisFederico previously approved these changes Dec 9, 2021
vanyle
vanyle previously approved these changes Dec 12, 2021
@vanyle vanyle merged commit 7bd0d4a into master Dec 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Graph to notebook conversion

3 participants