
Inconsistencies in disjoint_union after addressing issue #761 #1587

@bockthom


What happens, and what did you expect instead?

In issue #761, I reported unintended type conversions in disjoint_union.
These were fixed in PR #1375 by using vctrs::vec_c() instead of c().
The corresponding fixes were released with version 2.1.1 of rigraph.
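
For readers unfamiliar with the difference, here is a minimal sketch of how c() and vctrs::vec_c() treat mixed types in general. It is not taken from PR #1375 and does not reproduce the exact conversions reported in #761; the variable names are only illustrative.

```r
# Base c() silently coerces mixed types to a common type (here: character).
combined.base = c(1L, "a")

# vctrs::vec_c() refuses to combine incompatible types and signals an error instead.
combined.vctrs = tryCatch(
    vctrs::vec_c(1L, "a"),
    error = function(e) "incompatible types"
)

print(combined.base)
print(combined.vctrs)
```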

Since the release of version 2.1.1, we have tried to adjust our usage of disjoint_union to the new behavior. This forced us to change more of our code than expected: after simplification with edge attribute concatenation, all values of the affected attributes now need to be lists instead of, for instance, characters. While we managed to adjust our code to the new behavior of disjoint_union regarding lists and other types of attributes, we spotted two inconsistencies in the fixes carried out in PR #1375:


(1) Missing attributes are now handled inconsistently:

According to igraph's documentation, missing attributes are to be replaced by NA values.
Quoting the igraph docs on disjoint_union:

For graphs that lack some vertex/edge attribute, the corresponding values in the new graph are set to NA.

However, this is not always the case in the new implementation, as the following minimum working example shows:

# Create a graph with duplicate edges and 'attr.one' edge attribute
graph.one = igraph::make_empty_graph() +
     igraph::vertices("A", "B") +
     igraph::edges(c("A", "B", "A", "B"), attr.one = "test")

# Simplify both edges into one. Use the "concat" strategy for the 'attr.one' edge-attribute.
graph.one = igraph::simplify(graph.one, edge.attr.comb = list(attr.one = "concat"))

# Create a second graph without the 'attr.one' edge attribute, but with 'attr.two' edge attribute
graph.two = igraph::make_empty_graph() +
     igraph::vertices("C", "D") +
     igraph::edges(c("C", "D"), attr.two = "test")

# Join both graphs
union = igraph::disjoint_union(graph.one, graph.two)

After running this code, you can observe that the absence of attr.one in the edges of graph.two is expressed as NULL instead of NA, while the absence of attr.two in the edges of graph.one is expressed as NA:

> str(igraph::as_data_frame(union,"edges"))
'data.frame':   2 obs. of  4 variables:
 $ from    : chr  "A" "C"
 $ to      : chr  "B" "D"
 $ attr.one:List of 2
  ..$ : chr  "test" "test"
  ..$ : NULL
 $ attr.two: chr  NA "test"

So, there is an inconsistency: missing values of the attribute that is not a list are replaced by NA (consistent with previous implementations and with the documentation), whereas missing values of the attribute that is a list are replaced by NULL. As we (and potentially other users) rely on NA as a consistently used indicator for non-existing attribute values, we consider this a bug resulting from the fixes made in PR #1375.
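
To make the expected result concrete, here is a minimal sketch of the normalization we would otherwise have to apply ourselves after every call. The helper null.to.na is hypothetical (it is not part of igraph), and the input list only mimics the attr.one column from the output above.

```r
# Hypothetical helper (not part of igraph): replace NULL entries of a
# list-valued attribute with NA and leave all other entries untouched.
null.to.na = function(values) {
    lapply(values, function(value) if (is.null(value)) NA else value)
}

# Mimics the 'attr.one' values shown above: one concatenated entry, one NULL.
attr.one = list(c("test", "test"), NULL)
attr.one.fixed = null.to.na(attr.one)
str(attr.one.fixed)

# Applied to the example graph, this might look like the following
# (assumes the 'union' graph from the example above exists):
# union = igraph::set_edge_attr(union, "attr.one",
#                               value = null.to.na(igraph::edge_attr(union, "attr.one")))
```

This is only a workaround sketch. It does not change the behavior of disjoint_union itself, which is what we are asking for.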

(2) Inconsistency between edge and vertex attributes:

PR #1375 only addressed edge attributes, but vertex attributes potentially suffer from similar problems as those reported in #761. The changes of PR #1375 were made in lines 252-258 of R/operators.R for edge attributes, while the very similar code for vertex attributes a few lines above in the same file (lines 227-233) was not changed in PR #1375.


Could you please take care of the two issues described above? Thanks in advance!

CC: @hechtlC @maxloeffler
