Skip to content

Eliminating blockers for Minimal Fragmenting Relationships#79

Closed
james-j-obrien wants to merge 19 commits intobevyengine:mainfrom
james-j-obrien:minimal-fragmenting-relationships
Closed

Eliminating blockers for Minimal Fragmenting Relationships#79
james-j-obrien wants to merge 19 commits intobevyengine:mainfrom
james-j-obrien:minimal-fragmenting-relationships

Conversation

@james-j-obrien
Copy link

RFC with a focus on the technical blockers for a hypothetical minimal fragmenting relationships implementation. Less time has been spent on the summary and motivation aspects as those have been well explored by Sander's articles (example) as well as prior RFCs, PRs, discord discussions etc.

RENDERED

@Trashtalk217
Copy link

This looks great! And I think it's also a more reasonable size than #78 (which I should probably close for now). One thing I would like slightly more clarification on is the 'fragmenting' part of 'fragmenting' relationships. Even in #78, I don't think it's adequately explained. My current guess, based on context clues, is that adding relationships means that every (component, entity) acts like a new component for the archetype system, so (component, entity1), and (component, entity2) would be in two different tables, hence splitting the tables, hence fragmenting the tables, which makes iteration speed for queries slower (less predictable access patterns).

I think that's it, but wouldn't that also mean that when you don't use relations, you don't encounter any extra fragmentation, so you shouldn't see performance regressions?

@james-j-obrien
Copy link
Author

james-j-obrien commented Apr 1, 2024

Your interpretation of the "fragmenting" part is correct, I took a quick edit to make it more explicit but it could use more detail. Iteration for queries could hypothetically be slower for a set of entities that previously would have occupied the same table however if the query is only iterating dense components we can still use dense iteration and shouldn't see any regressions. One advantage we have over flecs here is we already have split storage. Since this RFC doesn't cover data on the relationships iterating targets (which are stored in the archetype not component storage) will always be dense and when we do support them we can opt for sparse storage when we don't want to impact iteration speed of dense queries that don't access the relationship.

Your also correct in that if you don't make use of relationships you won't have any increase to the fragmentation, however once you do it can increase quickly, especially if/when bevy_hierarchy uses them.

In the minimal implementation all relationships will take the form of `(ZST Component, entity)` pairs so the API surface will look quite familiar. All naming is subject to bikeshedding:
```rust
#[derive(Component)]
struct Relationship;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to explicitly designate a component as a relationship component, or should we allow it to be used as a non-relationship context too?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having a derive specific to relationship components is fine, there are potential cases where someone might want to use the same type for both, but they can always newtype.

No strong feelings here.

@iiYese
Copy link

iiYese commented Apr 3, 2024

"Fragmenting"

I'd like to add some comments about "fragmenting" since that's bound to receive the most questions. We should stop calling it "fragmenting". It places too much emphasis on the potential downsides of the feature. A better description for this is "archetype level". This is a tool which can have downsides but has many desirable upsides akin to the archetype ECS model we already have.

Other implementations

I do not think this RFC or any efforts towards this feature should deviate away from an archetype level implementation and generalizing to just "an implementation of relations" would be a mistake. The other two big contenders for a relations implementation in the context of an archetypal ECS have been explored at this stage:

  • Indexed
  • "Non-fragmenting" (aery)

Indexed

To my knowledge there's no specific crate like aery that offers this as a standalone feature but this already exists in every other crate. Many crates have an "index" of some sort in a resource that creates relationships between entities or data stored outside the ECS. The problem with all of these is indexing is a very different data model to archetypes. This presents syncing problems and an even bigger ergonomic problems. Different data models create boundaries in your app that you have to cross every time you access data. Think getting stuff in & out of egui contexts for widgets.

Non-fragmenting

Aery was my attempt at this for my final year university project. It has less of the problems of indexed relations because it's closer to the archetype model (it just stores Entitys in components). It's serviceable for many usecases but it's very unoptimal for many more & has nowhere near the amount of query features I'd like. It originally started off as a fork of bevy but half way through I decided I didn't want it in bevy for two reasons.

TLDR

In short it makes the most sense for an ECS relationships implementation to use the same data model as the ECS. Archetype level relationships are what we should aim for. The worst case scenario is you would have as many archetypes/tables as you have entities which is already how the vast majority of commercial games written in OO paradigms work.

@SanderMertens
Copy link

The other two big contenders for a relations implementation in the context of an archetypal ECS have been explored at this stage:

There's also Flecs union relationships, which are a non-fragmenting relationships implementation that are widely used by flecs users in combination with fragmenting relationships.

A fourth non-fragmenting implementation has just finished and will land soon, which'll let you use component members as targets of relationships.

Copy link

@Trashtalk217 Trashtalk217 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A couple of language things that can be found with any appropriately smart spell checker.

Anyway, I can say that I understand the document now, which is good news. One thing I do like to clear up on thing about the order of operations. All of the caching improvements (component index, access, the ArchetypeComponentId cache) technically don't need to wait on #12144, right?

james-j-obrien and others added 4 commits May 10, 2024 14:40
Co-authored-by: Trashtalk217 <trashtalk217@gmail.com>
Co-authored-by: Trashtalk217 <trashtalk217@gmail.com>
Copy link

@Trashtalk217 Trashtalk217 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The language is still quite dense, some things could've easily been expanded upon to make it read clearer, and there probably are still a couple of spelling errors in there.

BUT with a topic as deep and complex as an ecs, we could keep adding to it forever. And for the good of starting development, I think this rfc is in a good state.

Relationships have been a long desired bevy feature. The ability to express and traverse hierarchies and graphs within the ECS unlocks a huge amount of potential. Features such as breadth-first query traversal, immediate hierarchy clean-up, component inheritance, multi-target queries etc. all rely on or are simplified by the existence of relationships as a first class ECS primitive.
## User-facing explanation

Relationship pairs can be inserted and removed like any other component. There are 3 ids associated with any relationship pair instance:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a bit more introduction. What is a "relationship pair"?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes this could be more clear, it's only implicitly defined in the summary.

struct Relationship;

world.entity_mut(source_entity_a)
.insert_pair::<Relationship>(target_entity_b)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not keen on the naming here. A "pair" is a new concept / terminology that I don't think actually adds any expressive value over insert_relation.

Copy link
Author

@james-j-obrien james-j-obrien May 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very fair, this is me bringing in baggage from other languages that defined two item tuples as pairs. I think the main reason I went that way over insert_relation is just how long that is for a very common operation that already needs a type parameter. (admittedly bikesheddy)

Copy link

@SanderMertens SanderMertens May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fwiw the reason I picked pair is because it's less confusing/opinionated:

  • Does relation(ship) refer to the first element of the pair or both the relationship(which should then be called?) and the target?
  • The feature is called relationships, but it really implements the ability to have composite component ids which can be used for relationships. I can use it to add a (Dog, Food) pair, where Dog isn't really a relationship and Food isn't really a relationship target.

Copy link
Member

@alice-i-cecile alice-i-cecile May 14, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can get behind that. The key thing here is clearly laying out (and motivating) the terminology EDIT: I've changed my mind here.

- Clean-up policies to allow specifying recursive despawning, this is important to allow porting `bevy_hierarchy` to use relationships
- More advanced traversal methods: up, BFS down, etc.
- More expressive query types: multi-target, grouping, sorted etc.
- With component as entities we unlock full `(entity, entity)` relationships
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs more explanation.

Copy link
Member

@alice-i-cecile alice-i-cecile left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I'm broadly on board with the technical implementation strategy, but the writing here needs work.

There are two key problems in this document (and the surrounding discussion):

  1. Jargon from flecs is used freely without definition. "pairs" and (entity, entity) relationships are the key issues here.
  2. We jump straight to a solution, without clearly laying out the problems to be solved, or how the problems are related to the solution. This is most evident in the section on Archetype Cleanup, but needs work throughout.

We can reduce the amount of space spent talking about "relations are great" (just link to my issue), but a clear "here are the problems, here's how we can solve them" structure will go a long way to making this easy for others to understand.

In terms of API design, I have two points of feedback:

  1. Can we avoid introducing the "pairs" terminology completely?
  2. I would like to cover the basic API by which ZST relations can be queried (or explicitly mark it as out of scope).

james-j-obrien and others added 2 commits May 13, 2024 08:19
Co-authored-by: Alice Cecile <alice.i.cecile@gmail.com>
@zesterer
Copy link

zesterer commented Jul 8, 2024

If it's useful in any way, we have some prior art on this over at Veloren that's been in use for a number of years and has proven itself under a number of different situations.

We call it the 'link system', and it's a generic framework for defining and maintaining relationships between ECS entities. It supports both 1:1 and 1:many relationships. It's the product of iterative development, and replaces several previous failed attempts at managing entity relationships.

Currently, it is used to define:

  • Rider/mount relationships (like a player riding a horse)
  • Volume entity riding relationships (many players sat on seats in an airship, moving through the air)
  • Tethering (physics-based ropes that attach entities to one-another, or to the terrain)

I'm happy to answer any questions about it.

Users often run into difficulty while trying to model entity hierarchies in Bevy. The tools we currently have are not expressive enough to meet their needs. See this issue for more discussion [#3742](https://github.com/bevyengine/bevy/issues/3742).

This RFC proposes a minimal design that would unblock a large amount of users while also leaving avenues for the engine to further expand these features. In short:
- Define a new variant of `Identifier` that combines a `ComponentId` (the relationship type) and an `Entity` (the target) that can be added to an entity (the source) to model a relationship of that type between the source and the target.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm leary of this design in general: I've found discussions about flecs features that lean on this feature (and components-as-entities) rather confusing to follow. While I've gotten more comfortable with it over time, I worry that new users will still struggle to understand what's going on.

Copy link

@iiYese iiYese Jan 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That makes an unrealistic assumption users will be introduced to all of the ECS at once. Instead of say developing an understanding of components + entities & later learning that "oh components are just entities too". If anything I don't think it's a "new user" issue I think it takes more effort for you to grasp precisely because you have far more activity with bevy. I like many others, with far less development time on bevy have no issues grasping the feature.

- Replacing internal `SparseSet` usage where `Identifier` would be used as keys as they are no longer dense.
- Allowing archetypes to be deleted when an entity that was used as part of a pair in that archetype is despawned. This involves updating several internal caches that are used for correctness and/or safety.

## Motivation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This section needs to be rewritten to take into account the newly merged non-fragmenting relations.

- **Generation**: The upper 32 bits of an `Entity`, represents how many times the id has been re-used, important to prevent interpreting a recycled `Entity` as the original.
- **Relationship**: A `Component` type used to model a relationship between two entities i.e. `Eats`, takes up the upper 32 bits of a `Pair`
- **Target**: An entity used as a target of a relationship, it's id takes up the lower 32 bits of a `Pair`
- **Pair**: A variant of `Identifier` representing a relationship to a target entity i.e. `(Eats, apple)` where `apple` is the target entity
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will veto this name. It is extremely unclear and too generic.

- **Source**: The entity a `Pair` is added to as a component i.e. if `(Eats, apple)` is added to entity `alice`, `alice` would be the source of that relationship.
- **ZST**: Zero-sized type, these are types with `std::mem::size_of::<T>() == 0` and hence carry no data.
- **Wildcard**: A special id used to match all any id, similar to a glob i.e. `(Eats, *)` would match all ids where the relationship type was `Eats`
## User-facing explanation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
## User-facing explanation
## User-facing explanation

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's a bunch of missing newlines around section headers actually :)

- **Source**: The entity a `Pair` is added to as a component i.e. if `(Eats, apple)` is added to entity `alice`, `alice` would be the source of that relationship.
- **ZST**: Zero-sized type, these are types with `std::mem::size_of::<T>() == 0` and hence carry no data.
- **Wildcard**: A special id used to match all any id, similar to a glob i.e. `(Eats, *)` would match all ids where the relationship type was `Eats`
## User-facing explanation
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to be rewritten in the wake of merging non-fragmenting relations.

##### Query and System Caches
Query's caches of matching tables and archetypes are updated each time the query is accessed by iterating through the archetypes that have been created since it was last accessed. As the number of archetypes in bevy is expected to stabilize over the runtime of a program this isn't currently an issue. However with the increased archetype fragmentation caused by fragmenting relationships and the new wrinkle of archetype deletion the updating of the query caches is poised to become a performance bottleneck.

An appropriate solution would allow us to expose new/deleted archetypes to only the queries that may be interested in iterating them. Luckily observers [#10839](https://github.com/bevyengine/bevy/pull/10839) could cleanly model this pattern. By sending ECS triggers each time an archetype is created and creating an observer for each query we can target the updates so that we no longer need to iterate every new archetype. Similarly we can fire archetype deletion events to cause queries to remove the archetype from their cache.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No objections to this: supporting archetype cleanup is a valuable optimization no matter what we do.


An appropriate solution would allow us to expose new/deleted archetypes to only the queries that may be interested in iterating them. Luckily observers [#10839](https://github.com/bevyengine/bevy/pull/10839) could cleanly model this pattern. By sending ECS triggers each time an archetype is created and creating an observer for each query we can target the updates so that we no longer need to iterate every new archetype. Similarly we can fire archetype deletion events to cause queries to remove the archetype from their cache.

In order to implement such a mechanism we need some way for the observer to have access to the query state. Currently `QueryState` is stored in the system state, so in order to update it we would need to expose several additional APIs, even then it's quite difficult to reach into the system state tuple to access the correct query without some macros to handle tuple indexing. An alternative would be to move the query state into a component on an associated entity (WIP branch here: [queries-as-entities](https://github.com/james-j-obrien/bevy/tree/queries-as-entities)) this provides additional benefits in making it easier manage dynamic queries, but they are largely tangential to this RFC.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that moving queries and systems out onto their own entities is likely to be a useful long-term route. The ECS is very powerful and expressive, and improving flexibility and introspection is valuable.


Regardless this will also require further refactors to the query cache implementation. For example, currently the collection of tables and archetypes is a vector but that doesn't provide an efficient method for removal without an O(n) scan, so this data structure will have to be replaced with one that performs reasonably for both iteration and insertion/removal. Similarly systems cache their `ArchetypeComponentId` accesses for the purposes of multithreading, we also need to update theses caches to remove deleted ids presenting many of the same problems.

Using observers globally or moving queries to entities are both blocked by the deletion of entities in the render world tracked here: [#12144](https://github.com/bevyengine/bevy/issues/12144).
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No longer true!

##### Access Bitsets and Component SparseSets
The `FixedBitSet` implementation used by `Access` inside of systems, system params, queries etc. relies on dense ids to avoid allocating a large amount of memory. Since pairs have component ids in the upper bits the `u64` they are represented as is extremely large, each `FixedBitSet` would allocate non-trivial amounts of memory to have flags for all the intervening ids. In order to minimize the affects of this we could consider all `(R, *)` pairs as one id in accesses, reducing the amount we need to track at the expense of some theoretical ability to parallelise.

Similarly the `SparseSet` implementation used in both table and sparse component storage also operate under the assumption that ids will remain low and would allocate an even larger amount of memory storing pairs on an entity. In order to circumvent this we can implement a "component index", this index would track for each `Identifier` the tables and archetypes it belongs to as well as the column in that table the id inhabits (if any). This removes the need for the sparse lookup as well as providing opportunities for other optimizations related to creation of observers and queries.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is worth pursuing.


### Minimal Implementation
The component index could be implemented at any time, however the other pieces require some ordering to satisfy dependencies or prevent half implemented features.
Implementing the remainder in an appropriate order of dependance would work as follows:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be updated.

- Merge observers [#10839](https://github.com/bevyengine/bevy/pull/10839)
- Resolve [#12144](https://github.com/bevyengine/bevy/issues/12144) so entities can be persisted in the render world.
- Refactor queries to become entities with caches updated via observers
- Implement archetype deletion and the associated clean-up for systems and queries, this is the only change that pays a complexity cost for no user benefit until relationships are realised
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should expose a manual "cleanup empty archetypes method" for power users even at this stage.

- Clean-up policies to allow specifying recursive despawning, this is important to allow porting `bevy_hierarchy` to use relationships
- More advanced traversal methods: up, BFS down, etc.
- More expressive query types: multi-target, grouping, sorted etc.
- With component as entities we unlock the ability `(entity, entity)` pairs, where either entity could also be a component. This makes it much easier to express dynamic relationships where you don't necessarily want to define your relationship type statically or go through the trouble of creating a dynamic component, as well as cases where you want a component as the target.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am broadly opposed to this feature in the proposed form. I don't think that the clarity/complexity to utility tradeoff is worth it over more narrow forms.

@alice-i-cecile
Copy link
Member

Closing this out: I don't think that editing this is a particularly good way to get to an implementable design doc. Too much has changed, there's too much history here, and there's too many implicit assumptions about the path we should take. I'm not at all opposed to archetypal relations (and actively want to support things like large numbers of archetypes and clearing empty tables and archetypes), but I want to see a user-focused path / design for how and why Bevy should get there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants