networkLite assumes atomic vertex and edge attributes #2

@chad-klumb

The current implementation of networkLite assumes that each vertex or edge attribute takes a single atomic value (possibly NA) for each vertex or edge. This is less general than network, which allows essentially arbitrary object-valued attributes. For example, a vertex attribute in network can assign a different matrix (of arbitrary size) to each vertex.

This generality does not seem to be widely used in EpiModel (outside of networkDynamics, which we currently have no reason to convert to networkLites), or even in e.g. ergm. There, ergm_get_vattr extracts vertex attributes with the %v% operator, which uses the default unlist = TRUE argument when reading them from the network object, here

https://github.com/statnet/ergm/blob/eebdbfa51e55d0dbff3867a2efb7bd136f018b46/R/get.node.attr.R#L486

and here

https://github.com/statnet/ergm/blob/eebdbfa51e55d0dbff3867a2efb7bd136f018b46/R/get.node.attr.R#L522

which basically assumes a single atomic value for each vertex; otherwise the extracted values will not generally align with the vertices.
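
To make the alignment concern concrete, here is a minimal, language-neutral sketch in Python (not R, and not the actual network or ergm code). The helper `flatten_attr` is hypothetical and only loosely mimics what flattening per-vertex values with unlist = TRUE does: when every vertex holds one atomic value the result has one entry per vertex, but with object-valued attributes the flattened result no longer lines up with the vertices.

```python
from itertools import chain


def flatten_attr(values):
    """Flatten one level of per-vertex values (hypothetical stand-in for unlist)."""
    return list(
        chain.from_iterable(
            v if isinstance(v, (list, tuple)) else [v] for v in values
        )
    )


# One atomic value per vertex: flattening keeps one entry per vertex.
atomic = [10, 20, 30]
flat_atomic = flatten_attr(atomic)

# Object-valued attribute (a different-length list per vertex):
# the flattened result has more entries than vertices, so position i
# no longer corresponds to vertex i.
general = [[1, 2], [3], [4, 5, 6]]
flat_general = flatten_attr(general)

print(len(atomic), len(flat_atomic))    # same length
print(len(general), len(flat_general))  # lengths differ
```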

In principle, the networkLite data structure can support more general object-valued vertex and edge attributes, because tibbles can have list columns (not just atomic vector columns). If/when someone implements that, the work goes beyond reworking the code (for example, introducing and supporting the unlist argument, as used in network): it also needs to respect EpiModel's efficiency requirements. These center on the need to repeatedly initialize ergm_models (and other similar objects, such as ergm_proposals) from the dat object in netsim. networkLites serve as intermediaries in this process: they support essential network operations (allowing ergm_model calls to work), structurally resemble the dat object as closely as possible (keeping dat -> networkLite construction costs low), and store vertex attributes the way ergm_get_vattr wants them, as atomic vectors (keeping networkLite -> ergm_model runtimes relatively low).

If attributes are stored as lists for generality, both the dat -> networkLite and networkLite -> ergm_model steps will slow down at least somewhat, although some basic benchmarking suggests this could still be appreciably faster than using networks (which store attributes in a different nested list structure) in the same context. Another option might be to have a "fast" networkLite mode that requires atomic attributes and a "general" networkLite mode that uses lists. Yet another option might be to store attributes as they are passed to the networkLite constructor and attribute-setting methods, converting to lists only as needed. Some attributes would then be stored as atomic vectors and some as lists, with the attribute accessors and the unlist argument providing predictable return values.
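
The last option (store attributes as passed, convert only on demand) could look roughly like the following. This is a Python sketch of the idea only, not networkLite code: the class `AttrStore`, its methods `set_attr` and `get_attr`, and the `atomic` flag are all hypothetical names introduced for illustration. Each attribute is kept as given and flagged as atomic or general, so atomic attributes never pay a conversion cost, while the accessor's `unlist` argument gives a predictable return shape either way.

```python
from dataclasses import dataclass, field

# Types treated as "atomic" per-vertex values in this sketch (an assumption).
SCALARS = (int, float, str, bool, type(None))


@dataclass
class AttrStore:
    """Hypothetical per-vertex attribute store mixing atomic and list columns."""

    n: int                                       # number of vertices
    columns: dict = field(default_factory=dict)  # name -> per-vertex values
    atomic: dict = field(default_factory=dict)   # name -> True if all scalar

    def set_attr(self, name, values):
        values = list(values)
        if len(values) != self.n:
            raise ValueError("need exactly one value per vertex")
        # Store as passed in; only record whether the column is atomic.
        self.columns[name] = values
        self.atomic[name] = all(isinstance(v, SCALARS) for v in values)

    def get_attr(self, name, unlist=True):
        values = self.columns[name]
        # Fast path: atomic columns (or unlist=False) need no conversion
        # and always return one entry per vertex.
        if not unlist or self.atomic[name]:
            return list(values)
        # General column with unlist=True: flatten one level.
        out = []
        for v in values:
            if isinstance(v, (list, tuple)):
                out.extend(v)
            else:
                out.append(v)
        return out


store = AttrStore(n=3)
store.set_attr("age", [31, 45, 27])                 # atomic column
store.set_attr("contacts", [[2, 3], [1], []])       # general (list) column
print(store.get_attr("age"))                        # one value per vertex
print(store.get_attr("contacts", unlist=False))     # one list per vertex
```

A consumer that needs atomic vectors, as ergm_get_vattr does, could check the flag and avoid the general path entirely; whether that bookkeeping is cheap enough for the netsim use case would still need benchmarking.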
