Merged
Commits
25 commits
019f2e2
Migrate storage containers to noserde::Buffer
josephbirkner Feb 18, 2026
fecb1ba
Point noserde CPM dependency to josephbirkner fork
josephbirkner Feb 18, 2026
8989316
Remove stale StringRange bitsery serializer
josephbirkner Feb 18, 2026
d667eba
Use vector<uint8_t> instead of stringstream.
josephbirkner Feb 19, 2026
5ed51b5
Enable fast serialization for ArrayArena.
josephbirkner Feb 19, 2026
3922313
Introduce compactHeads_ for arrays.
josephbirkner Feb 20, 2026
ae9f4ea
model: add ModelColumn and tagged type validation
josephbirkner Feb 24, 2026
d337c2e
model: Finish code orga for ModelColumn infrastructure.
josephbirkner Feb 24, 2026
d2f3936
test: migrate complex serialization reads to vector input
josephbirkner Feb 24, 2026
1337f58
Remove struct layout validator.
josephbirkner Mar 4, 2026
3b9c1b0
Simplify ModelColumn serialization wire format
josephbirkner Mar 4, 2026
9e3157f
Move singleton array storage to dedicated feature branch
josephbirkner Mar 4, 2026
d5313cc
Add fixedSize array flag.
josephbirkner Mar 4, 2026
c324b29
Add split TwoPart storage for object fields and array arenas
josephbirkner Mar 5, 2026
e9c17b1
Merge remote-tracking branch 'origin/v0.6.3' into sync/noserde
josephbirkner Mar 9, 2026
e4b4ed2
Merge remote-tracking branch 'origin/noserde' into sync/split
josephbirkner Mar 9, 2026
9a3911b
model: address split storage review comments
josephbirkner Mar 9, 2026
f7cdab7
expr: Add Unique Identifier to Expressions
johannes-wolf Mar 10, 2026
75c11e6
diagnostics: Rework Diagnostics
johannes-wolf Mar 10, 2026
a52ceee
expr: Make eval Const Again
johannes-wolf Mar 10, 2026
c6e04e3
diagnostics: Cursor Fixes
johannes-wolf Mar 11, 2026
33c42ac
diagnostics: Remove Environment & AST Dependencies
johannes-wolf Mar 13, 2026
c80b031
Merge pull request #137 from Klebert-Engineering/feature/split-field-…
josephbirkner Mar 16, 2026
1223db0
Merge pull request #140 from Klebert-Engineering/rework-diagnostics-f…
josephbirkner Mar 16, 2026
202d867
model: document split storage and address review issues
josephbirkner Mar 16, 2026
3 changes: 3 additions & 0 deletions CMakeLists.txt
@@ -71,6 +71,7 @@ add_library(simfil ${LIBRARY_TYPE}
src/value.cpp
src/overlay.cpp
src/exception-handler.cpp
src/expression-visitor.cpp
src/model/model.cpp
src/model/nodes.cpp
src/model/string-pool.cpp)
@@ -94,8 +95,10 @@ target_sources(simfil PUBLIC
include/simfil/transient.h
include/simfil/simfil.h
include/simfil/exception-handler.h
include/simfil/expression-visitor.h

include/simfil/model/arena.h
include/simfil/model/column.h
include/simfil/model/string-pool.h
include/simfil/model/model.h
include/simfil/model/nodes.h
39 changes: 32 additions & 7 deletions docs/simfil-dev-guide.md
@@ -62,6 +62,24 @@ Objects and arrays do not embed child nodes directly. Instead, they maintain `Mo

`StringPool` maintains the mapping between strings and the `StringId` integers stored in object fields. The base `Model` interface exposes `lookupStringId` so that serialization code such as `ModelNode::toJson` can recover human-readable field names. `ModelPool::setStrings` allows a pool to adopt a different `StringPool`, populating any missing field names along the way. This operation is used by higher-level components that need to merge data from several pools into a unified string namespace.
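The interning contract described above can be sketched as a standalone model: strings map to dense integer ids, and ids resolve back to names for serialization. The class and method names here are illustrative, not the real simfil `StringPool` API.

```cpp
#include <cstdint>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Simplified string-interning pool in the spirit of StringPool: emplace()
// deduplicates, resolve() mirrors Model::lookupStringId for toJson-style
// output. Names are illustrative stand-ins for the real interface.
using StringId = std::uint16_t;

class MiniStringPool {
public:
    StringId emplace(const std::string& s) {
        auto it = idForString_.find(s);
        if (it != idForString_.end())
            return it->second;
        auto id = static_cast<StringId>(strings_.size());
        strings_.push_back(s);
        idForString_.emplace(s, id);
        return id;
    }

    // Recover the human-readable name for a stored id, if any.
    std::optional<std::string> resolve(StringId id) const {
        if (id >= strings_.size())
            return std::nullopt;
        return strings_[id];
    }

private:
    std::vector<std::string> strings_;
    std::unordered_map<std::string, StringId> idForString_;
};
```

Adopting a different pool, as `ModelPool::setStrings` does, amounts to re-interning every referenced name into the new pool and rewriting the stored ids.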

### ModelColumn

The primitive storage building block below `ModelPool` and `ArrayArena` is `ModelColumn<T, RecordsPerPage, StoragePolicy>`. A model column stores a single fixed-width record stream and exposes bulk byte operations for serialization and deserialization. The generic implementation accepts three families of types:

- fixed-width scalar types (`bool`, fixed-width integers, fixed-width enums, `float`, `double`)
- explicitly tagged external record types via `MODEL_COLUMN_TYPE(expected_size)`
- other approved native POD records that are trivially copyable and standard-layout

The column implementation assumes little-endian hosts and treats the in-memory representation as the wire representation. `bytes()` returns the canonical payload bytes for the current record stream; `assign_bytes()` and `read_payload_from_bitsery()` perform the inverse operation. For vector-backed columns this is one contiguous bulk copy; for segmented storage the same payload is copied chunk-by-chunk while preserving the same wire layout.
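For a vector-backed column, the "memory is the wire format" rule reduces to one bulk copy in each direction. The sketch below models that contract with free functions; the real `ModelColumn` member functions have different signatures, and the little-endian host assumption carries over.

```cpp
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Sketch of bytes()/assign_bytes() for a contiguous record stream: the
// payload is the records' in-memory bytes, valid on little-endian hosts.
// Function names mirror the prose loosely; this is not simfil code.
template <class T>
std::vector<std::uint8_t> bytes(const std::vector<T>& records) {
    static_assert(std::is_trivially_copyable_v<T>);
    std::vector<std::uint8_t> out(records.size() * sizeof(T));
    if (!out.empty())
        std::memcpy(out.data(), records.data(), out.size());
    return out;
}

template <class T>
std::vector<T> assign_bytes(const std::vector<std::uint8_t>& payload) {
    static_assert(std::is_trivially_copyable_v<T>);
    std::vector<T> records(payload.size() / sizeof(T));
    if (!payload.empty())
        std::memcpy(records.data(), payload.data(), payload.size());
    return records;
}
```

Segmented storage writes the same payload, just copied chunk-by-chunk instead of in one `memcpy`.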

`RecordsPerPage` defines the number of records stored per page, not the page size in bytes. The effective page size is `RecordsPerPage * sizeof(T)`, and segmented storage requires that value to be a multiple of the record size. This keeps page boundaries aligned with record boundaries and lets callers reason about capacity in record counts instead of byte counts.
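The record-count sizing rule can be made concrete with a little arithmetic helper. This is a standalone model of the layout contract, not the simfil implementation.

```cpp
#include <cstddef>
#include <cstdint>

// Pages are sized in records; the byte size is derived, so a record index
// always maps to a (page, slot) pair without straddling a page boundary.
template <class T, std::size_t RecordsPerPage>
struct PageLayout {
    static constexpr std::size_t pageBytes = RecordsPerPage * sizeof(T);
    static constexpr std::size_t pageOf(std::size_t record) { return record / RecordsPerPage; }
    static constexpr std::size_t slotOf(std::size_t record) { return record % RecordsPerPage; }
};
```

With `T = uint32_t` and `RecordsPerPage = 1024`, each page holds 4096 bytes, and record 1024 is the first slot of the second page.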

### Split pair columns with `TwoPart`

`TwoPart<A, B>` is a logical pair type used when a compound record should behave like `{A, B}` in C++ but should not pay struct-padding costs on the wire. `ModelColumn<TwoPart<A, B>>` specializes the generic column by storing the `first()` and `second()` members in two synchronized child columns. Reads and writes still happen through a pair-like ref proxy, but serialization concatenates the dense payload of the first column and the dense payload of the second column.

The main current use is object member storage. `detail::ObjectField` is defined as `TwoPart<StringId, ModelNodeAddress>`, so object fields still behave like `(name, value)` pairs while the wire payload remains dense and deterministic regardless of host padding rules.
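The padding argument is easy to demonstrate with the shapes `detail::ObjectField` uses: a `uint16_t` name next to a `uint32_t` address. A plain struct typically pays per-record alignment padding, while two parallel columns store exactly six bytes per record. The types below are illustrative stand-ins, not the real simfil types.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Array-of-structs: the compiler usually pads this to 8 bytes per record.
struct PaddedField { std::uint16_t name; std::uint32_t value; };

// Split columns in the spirit of TwoPart: names and values live in two
// synchronized vectors, so the wire payload is dense regardless of padding.
struct SplitFields {
    std::vector<std::uint16_t> names;
    std::vector<std::uint32_t> values;
    void push(std::uint16_t n, std::uint32_t v) { names.push_back(n); values.push_back(v); }
    std::size_t wireBytes() const {
        return names.size() * sizeof(std::uint16_t) + values.size() * sizeof(std::uint32_t);
    }
};
```

Four records cost 24 bytes on the wire in the split form, versus `4 * sizeof(PaddedField)` for the struct layout on most platforms.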

### Value representation

`Value` is the runtime carrier for scalar and structured results:
@@ -127,25 +145,32 @@ classDiagram

`BaseArray<ModelType, ModelNodeType>` provides the generic implementation of array behaviour for model pools. It owns a pointer to an `ArrayArena<ModelNodeAddress, …>` and an `ArrayIndex` into that arena. The base class implements `type()` (always `Array`), `at()`, `size()`, and `iterate()` in terms of the arena. `Array` itself is a thin wrapper over `BaseArray<ModelPool, ModelNode>` that adds convenience overloads for appending scalars, which internally delegate to `ModelPool::newSmallValue` or `ModelPool::newValue` and then record the resulting address in the arena.

`BaseObject<ModelType, ModelNodeType>` plays the same role for object nodes. It stores key–value pairs as `{StringId, ModelNodeAddress}` elements inside an `ArrayArena`. The base class implements `type()` (always `Object`), `get(StringId)`, `keyAt()`, `at()` (interpreting the array as an ordered sequence of fields), and `iterate()`. The concrete `Object` subclass adds convenience `addField` overloads for common scalar types and an `extend` method that copies all fields from another `Object`.
`BaseObject<ModelType, ModelNodeType>` plays the same role for object nodes. It stores key–value pairs as `detail::ObjectField` elements inside an `ArrayArena`; that type is currently `TwoPart<StringId, ModelNodeAddress>`, so names and child addresses are physically stored in split columns while the API still behaves like a logical pair sequence. The base class implements `type()` (always `Object`), `get(StringId)`, `keyAt()`, `at()` (interpreting the array as an ordered sequence of fields), and `iterate()`. The concrete `Object` subclass adds convenience `addField` overloads for common scalar types and an `extend` method that copies all fields from another `Object`.

`ProceduralObject` extends `Object` with a bounded number of synthetic fields. These fields are represented as `std::function<ModelNode::Ptr(LambdaThisType const&)>` callbacks in a `small_vector`. Accessors such as `get`, `at`, `keyAt`, and `iterate` first consult the procedural fields and then fall back to the underlying `Object` storage. This pattern makes it possible to expose computed members alongside stored ones without materialising them permanently in the arena.
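The consult-synthetic-first, fall-back-to-stored lookup order can be modelled in a few lines. The real class uses `StringId` keys, `ModelNode::Ptr` values, and a `small_vector`; this sketch substitutes `std::string`/`int` for clarity.

```cpp
#include <functional>
#include <map>
#include <optional>
#include <string>
#include <utility>
#include <vector>

// Sketch of the ProceduralObject lookup order: computed fields are checked
// first, then stored fields. Names and value types are illustrative.
class ComputedObject {
public:
    using Getter = std::function<int()>;

    void addStored(std::string key, int value) { stored_[std::move(key)] = value; }
    void addComputed(std::string key, Getter fn) { computed_.emplace_back(std::move(key), std::move(fn)); }

    std::optional<int> get(const std::string& key) const {
        for (const auto& [k, fn] : computed_)  // procedural fields win
            if (k == key)
                return fn();
        auto it = stored_.find(key);           // fall back to stored fields
        if (it != stored_.end())
            return it->second;
        return std::nullopt;
    }

private:
    std::vector<std::pair<std::string, Getter>> computed_;
    std::map<std::string, int> stored_;
};
```

A computed field with the same key as a stored one shadows it, which matches the accessor order described above.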

`OverlayNode` is an orthogonal mechanism that wraps an arbitrary underlying node and maintains a separate map `<StringId, Value>` of overlay children. Calls to `get` and `iterate` first visit the injected children and then delegate to the wrapped node. The overlay itself derives from `MandatoryDerivedModelNodeBase` and uses an `OverlayNodeStorage` `Model` implementation to resolve access.

### Array arena details

The `ArrayArena` template implements the append-only sequences used by arrays and objects. Conceptually, it manages a collection of logical arrays, each of which may consist of one or more “chunks” backed by a single `segmented_vector<ElementType, PageSize>`. A logical array is identified by an `ArrayIndex`. For each index, the arena stores a head `Chunk` in `heads_` and, if the array grows beyond the head’s capacity, additional continuation chunks in `continuations_`.
The `ArrayArena` template implements the append-only sequences used by arrays and objects. Conceptually, it manages a collection of logical arrays, each of which may use one of two physical representations:

- a regular growable chunk chain backed by `heads_`, `continuations_`, and `data_`
- a singleton handle backed by `singletonValues_` and `singletonOccupied_`

Regular arrays behave like the historical arena implementation. Each logical array is identified by an `ArrayIndex` and starts with a head `Chunk` in `heads_`. If the array grows beyond the head’s capacity, the arena allocates continuation chunks in `continuations_`. Each chunk records an `offset` into `data_`, a `capacity`, and a `size`. For a head chunk, `size` also tracks the total logical length of the array; for continuation chunks, `size` is local to that chunk. The `next` and `last` indices form a singly-linked list from the head to the tail chunk.

`new_array(initialCapacity, fixedSize)` controls which representation is chosen. If `fixedSize` is `false`, even `initialCapacity == 1` creates a regular growable array. If `fixedSize` is `true` and `initialCapacity == 1`, the arena instead returns a singleton handle. That handle represents a 0-or-1 element logical array with no head chunk allocation. This is useful for storage patterns where one-element arrays are common and known not to grow later.

Each `Chunk` records an `offset` into the `data_` vector, a `capacity`, and a `size`. For a head chunk, `size` also tracks the total logical length of the array; for continuation chunks, `size` expresses the number of valid elements in that chunk only. The `next` and `last` indices form a singly-linked list from the head to the tail chunk. `new_array(initialCapacity)` reserves a contiguous region in `data_`, initialises the head chunk with the offset and capacity, and returns a fresh `ArrayIndex`.
When a caller appends an element to a regular array via `push_back` or `emplace_back`, the arena calls `ensure_capacity_and_get_last_chunk_unlocked`. This function locates the current tail chunk (either the head or a continuation). If the tail still has spare capacity, it is returned directly; otherwise, the function allocates a new continuation chunk with capacity doubled relative to the previous tail, extends `data_`, links the new chunk into `continuations_`, and updates the head’s `last` pointer. Singleton handles do not use this growth path; they allow at most one element and reject further appends.

When a caller appends an element via `push_back` or `emplace_back`, the arena calls `ensure_capacity_and_get_last_chunk`. This function locates the current tail chunk (either the head or a continuation). If the tail still has spare capacity, it is returned directly; otherwise, the function allocates a new continuation chunk with capacity doubled relative to the previous tail, extends `data_` accordingly, links the new chunk into `continuations_`, and updates the head’s `last` pointer. This growth strategy guarantees amortised constant time for appends while avoiding large reallocations.
Element access via `at(ArrayIndex, i)` dispatches by representation. Singleton handles resolve directly against `singletonValues_`. Compact arenas resolve against the compact head metadata. Regular arrays walk the chunk list, subtracting full chunk capacities from the requested index until the index falls within the current chunk’s capacity and size. This keeps the public API uniform while allowing denser storage for the common singleton case.

Element access via `at(ArrayIndex, i)` walks the chunk list for the target array. It subtracts full chunk capacities from the requested index until the index falls within the current chunk’s capacity and size, and then returns a reference to `data_[offset + localIndex]`. This guarantees O(number_of_chunks) access in the worst case, but in practice the number of chunks per array remains small because capacities grow geometrically.
The arena also supports a compact serialization mode. In that mode, `compactHeads_` stores only `{offset, size}` metadata for each regular array, while `data_` already contains a dense payload without chunk gaps. Runtime head chunks are materialized lazily from `compactHeads_` when a later mutation requires growable chunk state again. This allows serialized arenas to stay compact without forcing the mutable runtime representation onto the wire.
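The regular-array mechanics — one shared backing vector, chunks recording `{offset, capacity, size}`, doubling growth, and a chunk-walking `at()` — can be reduced to a single-array model. The real `ArrayArena` adds arena-level indexing, singleton handles, compact heads, and locking; this is a sketch of the chunk chain only.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Standalone model of one logical array's chunk chain. Chunk capacities
// grow geometrically, and at() subtracts full capacities until the index
// falls inside the current chunk, as described in the prose above.
struct MiniChunkedArray {
    struct Chunk { std::size_t offset, capacity, size; };

    explicit MiniChunkedArray(std::size_t initialCapacity) {
        chunks_.push_back({allocate(initialCapacity), initialCapacity, 0});
    }

    void push_back(int v) {
        if (chunks_.back().size == chunks_.back().capacity) {
            // Tail is full: allocate a continuation with doubled capacity.
            std::size_t cap = chunks_.back().capacity * 2;
            chunks_.push_back({allocate(cap), cap, 0});
        }
        Chunk& tail = chunks_.back();
        data_[tail.offset + tail.size++] = v;
        ++totalSize_;
    }

    int at(std::size_t i) const {
        for (const auto& c : chunks_) {
            if (i < c.size)
                return data_[c.offset + i];
            i -= c.capacity;  // skip a full earlier chunk
        }
        assert(false && "index out of range");
        return 0;
    }

    std::size_t size() const { return totalSize_; }
    std::size_t chunkCount() const { return chunks_.size(); }

private:
    std::size_t allocate(std::size_t n) {
        std::size_t off = data_.size();
        data_.resize(off + n);
        return off;
    }

    std::vector<int> data_;
    std::vector<Chunk> chunks_;
    std::size_t totalSize_ = 0;
};
```

Because capacities double, an array of n elements needs only O(log n) chunks, which keeps the chunk walk in `at()` cheap in practice.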

The arena also provides higher-level iteration facilities. The `begin(array)`/`end(array)` pair yields an iterator over the elements of a specific logical array. The `iterate(ArrayIndex, lambda)` helper executes a callback on every element and supports two signatures: a unary callback receiving a reference to the element, and a binary callback receiving both the element and its global index. This is used by `BaseArray::iterate` to implement `ModelNode::iterate` efficiently without allocating intermediate containers.
The higher-level iteration facilities follow the same dispatch rules. `begin(array)`/`end(array)` iterate one logical array, while the top-level arena iterator skips the sentinel head entry and also yields singleton handles. `iterate(ArrayIndex, lambda)` supports unary callbacks receiving a value and binary callbacks receiving both a value and its logical index. This is used by `BaseArray::iterate` and `BaseObject::iterate` to expose child traversal without materializing temporary containers.
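The dual-signature `iterate` helper can be expressed with a compile-time arity check. The sketch iterates a flat vector instead of chunked storage; the dispatch idea is the same.

```cpp
#include <cstddef>
#include <type_traits>
#include <vector>

// Sketch of iterate() supporting both callback shapes: a unary callback
// receives each element, a binary one additionally receives the logical
// index. The branch is resolved at compile time via std::is_invocable.
template <class Lambda>
void iterate(const std::vector<int>& elems, Lambda&& fn) {
    for (std::size_t i = 0; i < elems.size(); ++i) {
        if constexpr (std::is_invocable_v<Lambda, int, std::size_t>)
            fn(elems[i], i);  // binary: element + logical index
        else
            fn(elems[i]);     // unary: element only
    }
}
```

Callers pick whichever shape they need without any runtime flag, which is how a single helper can serve both `BaseArray::iterate` and index-aware consumers.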

Thread-safety is conditional. If `ARRAY_ARENA_THREAD_SAFE` is defined, the arena uses a shared mutex to protect growth and element access. Appends and `new_array` take an exclusive lock only when allocating new chunks; reads can proceed with shared locks. Simfil itself does not require the arena to be thread-safe as long as model construction happens before concurrent evaluation, but the hooks are present for embedders that need concurrent writers.
Thread-safety is conditional. If `ARRAY_ARENA_THREAD_SAFE` is defined, the arena uses a shared mutex to protect growth and element access. Reads use shared locks, while mutations and compact-to-runtime materialization take an exclusive lock. Simfil itself does not require the arena to be thread-safe as long as model construction happens before concurrent evaluation, but the hooks are present for embedders that need concurrent writers.

## Parser, tokens, and AST

115 changes: 100 additions & 15 deletions include/simfil/diagnostics.h
@@ -2,29 +2,48 @@

#pragma once

#include "simfil/sourcelocation.h"
#include "simfil/value.h"
#include "simfil/token.h"
#include "simfil/error.h"
#include "simfil/expression.h"

#include <limits>
#include <tl/expected.hpp>
#include <optional>
#include <vector>
#include <string>
#include <memory>
#include <cstdlib>

namespace simfil
{

class AST;
class Expr;
struct Environment;
struct ModelNode;

/** Query Diagnostics. */
struct Diagnostics
class Diagnostics
{
static constexpr std::uint32_t InvalidIndex = std::numeric_limits<std::uint32_t>::max();
public:
using ExprId = std::uint32_t;
struct FieldExprData
{
SourceLocation location;
std::uint32_t hits = 0;
std::uint32_t evaluations = 0;
std::string name;
};


struct ComparisonExprData
{
SourceLocation location;
TypeFlags leftTypes;
TypeFlags rightTypes;
std::uint32_t evaluations = 0u;
std::uint32_t falseResults = 0u;
std::uint32_t trueResults = 0u;
};

struct Message
{
@@ -42,6 +61,12 @@ struct Diagnostics
Diagnostics(Diagnostics&&) noexcept;
~Diagnostics();

/**
* Get diagnostics data for a single Expr.
*/
template <class DiagnosticsDataType>
auto get(const Expr& expr) -> DiagnosticsDataType&;

/**
* Append/merge another diagnostics object into this one.
*/
@@ -53,22 +78,82 @@
auto write(std::ostream& stream) const -> tl::expected<void, Error>;
auto read(std::istream& stream) -> tl::expected<void, Error>;

struct Data;
private:
friend auto eval(Environment&, const AST&, const ModelNode&, Diagnostics*) -> tl::expected<std::vector<Value>, Error>;
friend auto diagnostics(Environment& env, const AST& ast, const Diagnostics& diag) -> tl::expected<std::vector<Message>, Error>;

std::unique_ptr<Data> data;

/**
* Collect diagnostics data from an AST.
* Build the exprIndex_ map for the AST.
*/
auto collect(Expr& ast) -> void;
auto prepareIndices(const Expr& ast) -> void;

/** ExprId to diagnostics data index mapping. */
std::vector<std::uint32_t> exprIndex_;

/** FieldExpr diagnostics data. */
std::vector<FieldExprData> fieldData_;

/** ComparisonExpr diagnostics data. */
std::vector<ComparisonExprData> comparisonData_;

private:
friend auto diagnostics(const Diagnostics& diag) -> tl::expected<std::vector<Message>, Error>;

/**
* Build messages from this object's diagnostics data.
*/
auto buildMessages(Environment& env, const AST& ast) const -> std::vector<Message>;
auto buildMessages() const -> std::vector<Message>;

mutable std::mutex mtx_;
};

namespace detail
{

template <class T>
struct DiagnosticsStorage;

template <>
struct DiagnosticsStorage<Diagnostics::FieldExprData>
{
static auto get(Diagnostics& diag)
{
return &diag.fieldData_;
}
};

template <>
struct DiagnosticsStorage<Diagnostics::ComparisonExprData>
{
static auto get(Diagnostics& diag)
{
return &diag.comparisonData_;
}
};

}

/**
* Get typed diagnostics data for a single Expr.
*/
template <class DiagnosticsDataType>
auto Diagnostics::get(const Expr& expr) -> DiagnosticsDataType&
{
auto* data = detail::DiagnosticsStorage<DiagnosticsDataType>::get(*this);

const auto id = expr.id();
if (exprIndex_.size() <= id) [[unlikely]] {
exprIndex_.resize(id + 1u, Diagnostics::InvalidIndex);
exprIndex_[id] = data->size();
}

auto index = exprIndex_[id];
if (index == Diagnostics::InvalidIndex) {
exprIndex_[id] = data->size();
index = exprIndex_[id];
}

if (data->size() <= index) {
data->resize(index + 1u);
}

return (*data)[index];
}

}
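The `exprIndex_` scheme above — a sparse id-to-slot map grown lazily, allocating a data record only for ids actually queried — can be modelled in isolation. Names below loosely mirror the header; the record type is a placeholder.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Standalone model of the Diagnostics::get() indexing: ids index a sparse
// vector of slot numbers; unseen ids get a fresh dense slot on first access.
constexpr std::uint32_t InvalidIndex = std::numeric_limits<std::uint32_t>::max();

struct Counter { std::uint32_t evaluations = 0; };

class SparseSlots {
public:
    Counter& get(std::uint32_t id) {
        if (index_.size() <= id)
            index_.resize(id + 1u, InvalidIndex);
        if (index_[id] == InvalidIndex) {
            index_[id] = static_cast<std::uint32_t>(data_.size());
            data_.emplace_back();  // allocate a dense slot lazily
        }
        return data_[index_[id]];
    }
    std::size_t slots() const { return data_.size(); }

private:
    std::vector<std::uint32_t> index_;
    std::vector<Counter> data_;
};
```

The payoff is that a query touching only a handful of expression ids stores only that many data records, even if ids are drawn from a large range.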
5 changes: 4 additions & 1 deletion include/simfil/environment.h
@@ -21,6 +21,7 @@ namespace simfil

class Expr;
class Function;
class Diagnostics;
struct ResultFn;
struct Debug;

@@ -138,6 +139,7 @@ struct Environment
struct Context
{
Environment* const env;
Diagnostics* const diag;

/* Current phase under which the evaluation
* takes place. */
@@ -151,7 +153,8 @@ struct Context
/* Timeout after which the evaluation should be canceled. */
std::optional<std::chrono::time_point<std::chrono::steady_clock>> timeout;

Context(Environment* env, Phase = Phase::Evaluation);
Context() = delete;
Context(Environment* env, Diagnostics* diag, Phase = Phase::Evaluation);

auto canceled() const -> bool
{
77 changes: 77 additions & 0 deletions include/simfil/expression-visitor.h
@@ -0,0 +1,77 @@
// Copyright (c) Navigation Data Standard e.V. - See "LICENSE" file.

#pragma once

#include <cstdlib>

namespace simfil
{

class Expr;
class WildcardExpr;
class AnyChildExpr;
class MultiConstExpr;
class ConstExpr;
class SubscriptExpr;
class SubExpr;
class AnyExpr;
class EachExpr;
class CallExpression;
class UnpackExpr;
class UnaryWordOpExpr;
class BinaryWordOpExpr;
class FieldExpr;
class PathExpr;
class AndExpr;
class OrExpr;
struct OperatorEq;
struct OperatorNeq;
struct OperatorLt;
struct OperatorLtEq;
struct OperatorGt;
struct OperatorGtEq;
template <class> class UnaryExpr;
template <class> class BinaryExpr;

/**
* Visitor base for visiting expressions recursively.
*/
class ExprVisitor
{
public:
ExprVisitor();
virtual ~ExprVisitor();

virtual void visit(const Expr& expr);
virtual void visit(const WildcardExpr& expr);
virtual void visit(const AnyChildExpr& expr);
virtual void visit(const MultiConstExpr& expr);
virtual void visit(const ConstExpr& expr);
virtual void visit(const SubscriptExpr& expr);
virtual void visit(const SubExpr& expr);
virtual void visit(const AnyExpr& expr);
virtual void visit(const EachExpr& expr);
virtual void visit(const CallExpression& expr);
virtual void visit(const PathExpr& expr);
virtual void visit(const FieldExpr& expr);
virtual void visit(const UnpackExpr& expr);
virtual void visit(const UnaryWordOpExpr& expr);
virtual void visit(const BinaryWordOpExpr& expr);
virtual void visit(const AndExpr& expr);
virtual void visit(const OrExpr& expr);
virtual void visit(const BinaryExpr<OperatorEq>& expr);
virtual void visit(const BinaryExpr<OperatorNeq>& expr);
virtual void visit(const BinaryExpr<OperatorLt>& expr);
virtual void visit(const BinaryExpr<OperatorLtEq>& expr);
virtual void visit(const BinaryExpr<OperatorGt>& expr);
virtual void visit(const BinaryExpr<OperatorGtEq>& expr);

protected:
/* Returns the index of the current expression */
std::size_t index() const;

private:
std::size_t index_ = 0;
};

}
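The visitor interface above follows the classic double-dispatch pattern: each expression type accepts a visitor and calls the `visit` overload for its own type, and unhandled overloads fall through to a default that recurses into children. A minimal model with a two-type hierarchy — the real simfil hierarchy is much larger, and these type names are illustrative:

```cpp
#include <memory>
#include <vector>

struct Visitor;

// Tiny expression stand-in: a node with children and an accept() hook.
struct Node {
    virtual ~Node() = default;
    virtual void accept(Visitor& v) const = 0;
    std::vector<std::unique_ptr<Node>> children;
};

struct FieldNode;
struct PathNode;

// Visitor base: default overloads just recurse into children, so a subclass
// only overrides the node types it cares about.
struct Visitor {
    virtual ~Visitor() = default;
    virtual void visit(const FieldNode& n);
    virtual void visit(const PathNode& n);
    void visitChildren(const Node& n) {
        for (const auto& c : n.children)
            c->accept(*this);
    }
};

struct FieldNode : Node { void accept(Visitor& v) const override { v.visit(*this); } };
struct PathNode : Node { void accept(Visitor& v) const override { v.visit(*this); } };

void Visitor::visit(const FieldNode& n) { visitChildren(n); }
void Visitor::visit(const PathNode& n) { visitChildren(n); }

// Example subclass: count field accesses anywhere in the tree.
struct FieldCounter : Visitor {
    int fields = 0;
    void visit(const FieldNode& n) override { ++fields; visitChildren(n); }
};
```

A subclass like `FieldCounter` is the shape a diagnostics pass would take: override the expression types of interest, let the base class handle traversal of everything else.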