Implement a monitor system for the GC by mathias-lang-sociomantic · Pull Request #2052 · dlang/druntime

mathias-lang-sociomantic · 2018-01-22T15:43:56Z

Allow one to hook the start and end of a fullcollect, potentially gathering statistics / logging the event as one see fit.
This was present in the D1 runtime but got stripped for unknown reason (git history has no mention of it).
Sociomantic relies on it heavily for some real-time apps.

Allow one to hook the start and end of a fullcollect, potentially gathering statistics / logging the event as one see fit. This was present in the D1 runtime but got stripped for unknown reason (git history has no mention of it). Sociomantic relies on it heavily for some real-time apps.

dlang-bot · 2018-01-22T15:43:57Z

Thanks for your pull request, @mathias-lang-sociomantic!

Bugzilla references

Your PR doesn't reference any Bugzilla issue.

If your PR contains non-trivial changes, please reference a Bugzilla issue or create a manual changelog.

mathias-lang-sociomantic · 2018-01-22T15:44:10Z

CC @mihails-strasuns-sociomantic @leandro-lucarella-sociomantic

mathias-lang-sociomantic · 2018-01-22T15:45:58Z

src/core/memory.d


+    // Alias used because DMD thinks the delegate are extern(C) as well otherwise
+    alias CollectionStartHook = void delegate() nothrow @nogc;
+    alias CollectionEndHook = void delegate(size_t, size_t) nothrow @nogc;


@mihails-strasuns-sociomantic mentioned that this could be two instances of the GC.Stats struct.
However, those are not so cheap to get, since we iterate over the pools (unless I misunderstood something).
Given the use cases for those hooks, I think having a no-cost approach is quite mandatory.

I think having a no-cost approach is quite mandatory

It still has to hold GC lock which immediately makes it pretty far from no-cost.

I think acquiring a lock (specially in single threaded environments, where is basically an atomic op usually) is considered no-cost compared to traverse the whole pool set. Also, there seems to be no reason to do it when the user can get the stats anyway with just one extra call, if needed.

I have two unrelated concerns about it:

It is not cheap in multi-threaded environments and while single-threaded optimization is indeed most suitable to our needs, druntime version should be more uniformly useful.

To avoid calculating exact stats this PR right now exposes freedLargePages and freedSmallPages as monitored values and I would argue those alone have rather limited usefulness.

What would be really cool is to accept different types of callback delegates and provide required stats based on what is desired by callback. Not sure if it can fit into API elegantly though.

mathias-lang-sociomantic · 2018-01-22T15:47:12Z

src/gc/impl/conservative/gc.d


+    /// Hook used for debugging application-side
+    __gshared CollectionStartHook on_collect_start;
+    __gshared CollectionEndHook on_collect_end;


One point raised by @mihails-strasuns-sociomantic is that having delegate + __gshared is dangerous. Since we have the lock it should be okay, but I'm interested to know if there's something addressing this issue anywhere else.

Well, from pure theoretical PoV it should require shared delegate but I still have no idea if that is even implemented to mean something in D frontend.

leandro-lucarella-sociomantic

I guess this also needs a test. At least the most simple form of it should be easy to do (just register the handlers that write to some global, call GC.collect() and check if the handlers were ran), at least if there is any infrastructure for this.

leandro-lucarella-sociomantic · 2018-01-22T16:22:50Z

src/core/memory.d


+    // Alias used because DMD thinks the delegate are extern(C) as well otherwise
+    alias CollectionStartHook = void delegate() nothrow @nogc;
+    alias CollectionEndHook = void delegate(size_t, size_t) nothrow @nogc;


I think acquiring a lock (specially in single threaded environments, where is basically an atomic op usually) is considered no-cost compared to traverse the whole pool set. Also, there seems to be no reason to do it when the user can get the stats anyway with just one extra call, if needed.

leandro-lucarella-sociomantic · 2018-01-22T16:27:24Z

src/gc/gcinterface.d

+     * Useful for debugging and tuning.
+     */
+    void monitor (CollectionStartHook on_start, CollectionEndHook on_end)
+        nothrow @nogc;


@__future?

That makes sense.
I would be extremely surprised if someone was implementing this interface, but I suppose that's a good test anyway.

Yeah, I know it would be extremely unlikely, but this is a good example for what future was created, to avoid any kind of surprises as much as possible, and OTOH we know people is creative and we always get surprised about how assertions like "probably nobody is doing X" prove themselves wrong with time :)

I don't think @future makes any sense here - gc package is not available to import, it is strictly contained within druntime. So there is no downstream to care about.

leandro-lucarella-sociomantic · 2018-01-22T16:27:45Z

src/gc/gcinterface.d

+    /// Provided for implementation's convenience, not intended for public usage
+    protected alias CollectionStartHook = void delegate() nothrow @nogc;
+    /// Ditto
+    protected alias CollectionEndHook = void delegate(size_t, size_t) nothrow @nogc;


@__future? Although I don't think it's working for aliases.

I don't think that's necessary for alias actually. If someone defines something with the same name, it will just shadow this.

leandro-lucarella-sociomantic · 2018-01-22T16:31:08Z

BTW, all tests are failing, looks like a linking problem, at lease in jenkins:

src/core/memory.d:(.text._D4core6memory2GC7monitorFNbNiDFNbNiZvDFNbNimmZvZv[_D4core6memory2GC7monitorFNbNiDFNbNiZvDFNbNimmZvZv]+0x37): undefined reference to `gc_monitor'

andralex · 2018-01-22T16:35:09Z

Nice, I have a few uses already. One thing needed would be composability - a call to monitor should not overwrite the previous call. So we're looking at a singly-linked list of hooks.

There's some nice related work in the area:

mathias-lang-sociomantic · 2018-01-22T16:44:50Z

Nice, I have a few uses already. One thing needed would be composability - a call to monitor should not overwrite the previous call. So we're looking at a singly-linked list of hooks.

Is there any advantage of doing it within the GC ?
If we allow hooks to be composable, we also need ways to remove some of those hooks (or all of them), as some could fail on application shutdown.

To implement this approach, I would suggest using this simple hooking mechanism and simply providing a delegate that forwards:

struct HookList (Dg)
{
    private LinkedList!(Dg) hooks;

    public void callback (size_t a, size_t b)
    {
        foreach (h; this.hooks)
            h.call(a, b);
    }
}

/// ...
HookList!(void delegate(size_t, size_t) hook;
GC.monitor(null, &hook.callback);

This way the GC implementation stays simple, and we leave open an unlimited realm of behavior.
Of course for anything implementing GC assertions, we can provide an already-available implementation.

leandro-lucarella-sociomantic · 2018-01-22T16:49:25Z

I agree is good to keep the complexity out of the GC/runtime if possible. Is true that if some library want to hook to the GC monitor, then things get hairy, as user code implementing this kind of hooking mechanism won't be enough. But then this is a pretty low level construct and one might argue that probably a library shouldn't be hooking to this, but providing the user a callback in case the user want to hook it.

andralex · 2018-01-22T17:21:38Z

This could get modularized in a number of ways. At any rate there should be a public API that allows code to add hooks without clobbering existing hooks.

One more possibility that comes to mind is what some Windows APIs do - pass the "previous" delegate as an argument to the hook. Then the hook may decide cooperatively to continue down the chain by calling the hook, or stop the chain.

leandro-lucarella-sociomantic · 2018-01-22T17:47:55Z

Yeah, a way to get the current hooks might be the easiest way to enable users to do the chain manually if appropriate.

andralex · 2018-01-22T19:29:26Z

Threads are likely to make matters difficult for a get/set interface backed by globals (currently there's only means for setting).

A thread would find it impossible to insert a new hook reliably - by the time is has loaded the current hook, another thread could install its own hook. A CAS-based approach in the implementation would solve the matter reliably and efficiently.
In the code sketched by @mathias-lang-sociomantic, there's the matter of allocating memory for the hook itself. That's tricky given that we're talking about hooks into the GC itself; this is the kind of difficulty that shouldn't be left to user code.
There are a few subtler issues such as a hook attempting to insert a new hook during execution (probably should be disallowed).

mihails-strasuns-sociomantic · 2018-01-23T09:12:15Z

There is also another multi-threaded concern (referenced here #2052 (comment)) - hook list stores delegates that capture some thread context and thus executing such delegate may have a race with a thread that registered it originally over the very same context. Basically shared(this) problem.

andralex · 2018-01-23T13:18:59Z

@mihails-strasuns-sociomantic wouldn't all hooks be invoked from the collection thread with all other threads frozen?

mihails-strasuns-sociomantic · 2018-01-23T20:46:19Z

@andralex you are correct of course - and unregistering thread from runtime is @system right now AFAICS. False alarm.

rainers · 2018-01-27T10:01:53Z

Useful addition. I did something similar in my early GC adventures.

if the callback registration gets a bit more involved, maybe it would be better if that would be in another object so that different GC implementations can call into that to invoke the hooks. That way the GC interface doesn't have to be changed for every added callback.
callbacks could also be united by letting them pass an ID (e.g. startCollection, stopCollection) and custom data (basically a tagged union).
would it be possible to allow calling some query functions during the callback, like GC.stats?

leandro-lucarella-sociomantic · 2018-01-29T17:33:24Z

Just to recap, what would be the minimum requirements for this PR to be accepted?

andralex · 2018-02-01T19:33:27Z

Once the feature is in, there will be a healthy demand for its modular use. Thinking of things like caches, memoization, weak pointers, files to close, etc.

The list would include:

A careful specification.
- Are the hooks called with threads frozen/melted?
- What limitations are imposed against the code in hooks - e.g. access to TLS, shared, __gshared data?
- What happens when a hook calls GC query functions such as GC.getAttr?
- Can hooks allocate memory?
- Can hooks free memory?
- When is the context pointer of the delegates considered for deallocation?
- Is the hook persistent for the whole application, or does it go away after one invocation?
- What if a hook tries to register another hook? (This may be needed for certain idioms and is influenced by the point above.)
An API for atomically adding hooks; they should be chained together.
The order of serving (FCFS vs LCFS) should be defined. In the spirit of destructors, perhaps LCFS is the order of choice.
I think we can get away without an API for removing hooks. (One simple solution is to give the hook the chance to remove itself upon execution, i.e. by returning false.) If a need for a more sophisticated API comes forward, we can add to it later.

An eye for applications would be great. I'm thinking assertions ("this object is no longer referenced") and caches.

mathias-lang-sociomantic · 2018-04-20T09:33:21Z

Thanks for your feedback @andralex . This will be taken over by someone else in the future.

leandro-lucarella-sociomantic · 2018-05-04T10:18:12Z

We kind of did a survey on our applications to see how this facility was used, and it was almost exclusively used to get how much time was spent doing collections. The list of requirements seem to be super reasonable and I think we should go that way if this gets ever implemented, but actually we are going to switch the focus to improve GC stats instead (we'll go more on the path of #1846), which is what we really need.

mathias-lang-sociomantic requested review from DmitryOlshansky and andralex as code owners January 22, 2018 15:43

mathias-lang-sociomantic requested review from MartinNowak, rainers and wilzbach as code owners January 22, 2018 15:43

mathias-lang-sociomantic commented Jan 22, 2018

View reviewed changes

leandro-lucarella-sociomantic reviewed Jan 22, 2018

View reviewed changes

mihails-strasuns-sociomantic mentioned this pull request Feb 7, 2018

Add GC.monitor functionality sociomantic-tsunami/dmd-transitional#14

Closed

rainers mentioned this pull request Apr 2, 2018

Add access to GC runtime profile stats #2155

Closed

mathias-lang-sociomantic closed this Apr 20, 2018

mathias-lang-sociomantic deleted the gc_monitor branch April 20, 2018 09:33

mathias-lang-sociomantic restored the gc_monitor branch April 20, 2018 09:33

leandro-lucarella-sociomantic mentioned this pull request May 4, 2018

Rework of -profile=gc calculations #1846

Closed

Uh oh!

Comments

Conversation

mathias-lang-sociomantic commented Jan 22, 2018

Uh oh!

dlang-bot commented Jan 22, 2018

Bugzilla references

Uh oh!

mathias-lang-sociomantic commented Jan 22, 2018

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

leandro-lucarella-sociomantic left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

leandro-lucarella-sociomantic commented Jan 22, 2018

Uh oh!

andralex commented Jan 22, 2018

Uh oh!

mathias-lang-sociomantic commented Jan 22, 2018

Uh oh!

leandro-lucarella-sociomantic commented Jan 22, 2018

Uh oh!

andralex commented Jan 22, 2018

Uh oh!

leandro-lucarella-sociomantic commented Jan 22, 2018

Uh oh!

andralex commented Jan 22, 2018

Uh oh!

mihails-strasuns-sociomantic commented Jan 23, 2018 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

andralex commented Jan 23, 2018

Uh oh!

mihails-strasuns-sociomantic commented Jan 23, 2018

Uh oh!

rainers commented Jan 27, 2018

Uh oh!

leandro-lucarella-sociomantic commented Jan 29, 2018

Uh oh!

andralex commented Feb 1, 2018

Uh oh!

mathias-lang-sociomantic commented Apr 20, 2018

Uh oh!

leandro-lucarella-sociomantic commented May 4, 2018

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

6 participants

mihails-strasuns-sociomantic commented Jan 23, 2018 •

edited

Loading