Skip to content

Hublabel#239

Open
electricEpilith wants to merge 60 commits intomasterfrom
hublabel
Open

Hublabel#239
electricEpilith wants to merge 60 commits intomasterfrom
hublabel

Conversation

@electricEpilith
Copy link
Copy Markdown

libbdsg changes to go with merging the hublabel branch of vg

electricEpilith and others added 30 commits December 9, 2025 12:54
…it returns, and which is a different size on mac
…rtices from a net graph child. Add a bunch of comments exlaining why I am confused by the distance index orientation bookkeeping.
adamnovak and others added 23 commits February 2, 2026 14:24
…ances

Add O(1) get_snarl_child_count() that reads the stored count directly.
Replace Dijkstra fallback for oversized snarl internal distances with
hub label (HHL) queries.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
# Conflicts:
#	bdsg/include/bdsg/snarl_distance_index.hpp
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The merge of origin/hublabel-debug introduced duplicate declarations in
the header (lines 439 and 550) and duplicate definitions in the .cpp
(lines 631 and 648). Removed the older versions, keeping the ones from
hublabel-debug which have better comments and slightly cleaner impl.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@adamnovak adamnovak left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like this brings along more of the other oracle approaches than we need to to implement just the contraction hierarchies and hierarchical hub labeling that actually gets used.

There's also some commented-out code that should be removed, and some documentation comments that ought to be written, some of which are for things I added.

Also, I didn't add a file comment for it, but if it's easy I might port away from Boost to our own data structure for what's now called the "Boost graph", to avoid the dependency. Right now we have Boost in vg only because of the old Vowpal-Wabbit MAPQ recalibration stuff in vgteam/vg#1621 which I don't know actually has any users. If it's not easy it probably wouldn't be worth doing, or at least might not be worth doing now. (Since it's only needed for the construction, we can change later if we want to without invalidating indexes.)

/// Indexing into iterator. Even though we type this as reference, remember
/// that we don't actually implement writing to our "references" and just
/// use the value type.
/// Result is undefined if itrators are to different collecitons.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Result is undefined if itrators are to different collecitons.
/// Result is undefined if iterators are to different collections.

// Comaprable iterator methods (TODO: Is there an STL concept name for this?)

/// Determine if this iterator is strictly before another.
/// Result is undefined if iterators are to different collecitons.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Result is undefined if iterators are to different collecitons.
/// Result is undefined if iterators are to different collections.

bool operator<(const IndexingIterator& other) const;

/// Determine if this iterator is before or at another.
/// Result is undefined if iterators are to different collecitons.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Result is undefined if iterators are to different collecitons.
/// Result is undefined if iterators are to different collections.

bool operator<=(const IndexingIterator& other) const;

/// Determine if this iterator is strictly after another.
/// Result is undefined if iterators are to different collecitons.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Result is undefined if iterators are to different collecitons.
/// Result is undefined if iterators are to different collections.

bool operator>(const IndexingIterator& other) const;

/// Determine if this iterator is at or after another.
/// Result is undefined if itrators are to different collecitons.
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Result is undefined if itrators are to different collecitons.
/// Result is undefined if iterators are to different collections.

Comment thread bdsg/src/test_libbdsg.cpp
Comment on lines +5123 to +5126
// need to debug
for (size_t a = 0; a < handles.size() * 2; a++) {
cerr << hhl_query(packed_labels.begin(), rank(1, false), a) << endl;
}
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This has been debugged now, right? We should probably not print stuff from the tests unless they're built with a define.

Comment thread bdsg/src/test_libbdsg.cpp
Comment on lines 5046 to +5398
@@ -5089,6 +5393,7 @@ int main(void) {
test_packed_subgraph_overlay();
test_multithreaded_overlay_construction();
test_mapped_packed_graph();
test_hash_graph();
test_snarl_distance_index();
test_hash_graph(); */
test_hub_labeling();
//test_snarl_distance_index();
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We have to turn the other tests back on.

Comment thread bdsg/src/vectorizable_overlays.cpp Outdated

// We limit threading on small inputs.
auto limited_threads = [&](size_t batch) {
return std::max<size_t>(1, std::min<size_t>(batch / 1024, get_thread_count()));
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I added this, but this could use a constant instead of a magic number.

Comment thread CMakeLists.txt
# Find other system dependencies
pkg_check_modules(Jansson REQUIRED IMPORTED_TARGET jansson)

find_package(Boost REQUIRED)
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're adding a dependency on Boost, we need to explain that in the README.

Comment thread Makefile
OBJS += $(OBJ_DIR)/subgraph_overlay.o
OBJS += $(OBJ_DIR)/vectorizable_overlays.o
OBJS += $(OBJ_DIR)/packed_subgraph_overlay.o
OBJS += $(OBJ_DIR)/ch.o
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we don't actually need hublabel.o and landmark.o and we can drop their CPP files?

adamnovak and others added 6 commits April 22, 2026 17:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants