Skip to content

add hibernate option for long lived process#101

Closed
redink wants to merge 1 commit into
elixir-ecto:masterfrom
redink:add_hibernate_for_long_lived_process
Closed

add hibernate option for long lived process#101
redink wants to merge 1 commit into
elixir-ecto:masterfrom
redink:add_hibernate_for_long_lived_process

Conversation

@redink
Copy link
Copy Markdown

@redink redink commented Oct 24, 2017

could save memory as far as possible

@redink
Copy link
Copy Markdown
Author

redink commented Oct 24, 2017

When I using hibernate for the connection process and query mongo continuously the memory of process just like:

------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,6360}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}
------- {memory,3408}

but, after disabled the hibernate option, the memory of the long lived process:

------- {memory,16936}
------- {memory,16936}
------- {memory,16936}
------- {memory,17096}
------- {memory,17096}
------- {memory,17440}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}
------- {memory,22328}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,29888}
------- {memory,14080}
------- {memory,14080}
------- {memory,14080}
------- {memory,14080}
------- {memory,14080}
------- {memory,17096}
------- {memory,17096}
------- {memory,17096}
------- {memory,17096}
------- {memory,17440}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}
------- {memory,21984}

Maybe hibernate is not friendly to CPU, but user could choose if enable the hibernate. So, I think it is necessary to add hibernate option. Please review !

{_, :after_connect} ->
{timeout, backoff} = Backoff.backoff(backoff)
{:backoff, timeout, %{s | backoff: backoff}}
maybe_hibernate(hibernate?, {:backoff, timeout, %{s | backoff: backoff}})
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we wrap this one in hibernate?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep, this point is disconnect, so it not necessary to hibernate. Thanks

@josevalim
Copy link
Copy Markdown
Member

Thanks @redink! @fishcakez would be the best person to decide if we should merge this but I believe we would need to at least have tests. I am not sure if we can assert externally that a process is in hibernation but we need to assert that at least the code doesn't break if the hibernate flag is used.

@redink
Copy link
Copy Markdown
Author

redink commented Oct 24, 2017

Thanks @josevalim ! As just we know, db_connection depends on https://hex.pm/packages/connection, and connection is one wrapper of gen_server. hibernation is one property for Erlang process and user can utilize the property via hibernate in gen_server. Fortunately, connection keeps hibernation in it and db_connection can use hibernation conveniently.

There are some example which using hibernate:

  • rabbitmq-server
  • ejabberd

rabbitmq-server using gen_server2 in its source code and using hibernate in many modules, just like:
https://github.com/rabbitmq/rabbitmq-server/blob/master/src/rabbit_channel.erl#L419

In ejabberd source code, its using gen_server and hibernate timeout:
https://github.com/processone/ejabberd/blob/master/src/ejabberd_receiver.erl#L193

And in one early article https://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-2. The tester using hibernate.

Judicious use of hibernate means the mochiweb application memory levels out at 78MB Resident with 10k connections, much better than the 450MB we saw in Part 1. There was no significant increase in CPU usage.

@fishcakez
Copy link
Copy Markdown
Member

I think that if we want to add a hibernate option it should follower a slightly smarter strategy rather than always hibernating. For example if we have a checkout or a checkin we know that its very likely that we will get another message almost immediately so hibernating is likely to impact performance there and not be beneficial.

My intuition is that we should see the same benefits if we only hibernate in the following cases:

  • On backoff because we are going idle, majority of data that remains is suitable for old heap, if we get a checkout call it will fail fast after the hibernate no hurting performance
  • After idle_timeout because we are idle its reasonable to assume the next call isn't about to happen (if not the idle_timeout value is poorly set)
  • After DBConnection.SojournError disconnect because we are going idle and not going to receive a call soon.

@redink could you try this approach and report back? Also could you describe the methodology you used both this time and previously? For example if just start the processes and do nothing, or you are running a small number of queries. DBConnection.Connection processes have a different gc pattern to the rabbitmq, ejabberd and mochiweb examples because the vast majority of our garbage is generated in another process's heap. Its very possible we are being too clever here. Also a well written driver won't hold onto refc binaries because they should never be passed from another process so we shouldn't hit that.

@redink
Copy link
Copy Markdown
Author

redink commented Oct 25, 2017

@fishcakez hmmmm, can't agree with you more. Before, I usually use hibernate timeout, just like the new feature hibernate_after in Erlang 20.x
But, there is idle_timeout in db_connection, so it not easy to use hibernate timeout, these two timeout is conflicted.
However, I push one new branch redink:smarter_hibernate, and create a new repo to show how I test them.

https://github.com/redink/db_connection_hibernate

The data shows that smarter hibernate also could save memory.

@redink
Copy link
Copy Markdown
Author

redink commented Oct 25, 2017

The process that I trace is:

image

@redink
Copy link
Copy Markdown
Author

redink commented Oct 26, 2017

@fishcakez ping ...

@fishcakez
Copy link
Copy Markdown
Member

@redink sorry it'll be a couple of days until I can take a better look at this.

@michalmuskala
Copy link
Copy Markdown
Member

If the problem is memory, maybe a better solution would be to manually trigger GC with :erlang.garbage_collect() rather than constantly hibernating the process. Wouldn't this be a bit less invasive while providing similar benefits?

@redink
Copy link
Copy Markdown
Author

redink commented Nov 6, 2017

Maybe :erlang.garbage_collect() will reduce the memory leak and fragment, but it still has some problems:

  • we can't seize the opportunity to trigger garbage collect accurately
  • this method is not thorough

In more technical terms, erlang:hibernate/3 discards the call stack for the process, and then garbage collects the process. After this, all live data is in one continuous heap. The heap is then shrunken to the exact same size as the live data that it holds (even if that size is less than the minimum heap size for the process). (from http://erlang.org/doc/man/erlang.html#hibernate-3)

And hibernate is a feature supported by Erlang official, I don't think it not better than :erlang.garbage_collect(). I also don't think it invasive, just because Erlang support hibernate_after option when start one gen_server process on 20.x (http://erlang.org/doc/man/gen_server.html#start_link-3).

@redink
Copy link
Copy Markdown
Author

redink commented Nov 17, 2017

any update ?

@fishcakez
Copy link
Copy Markdown
Member

@redink can you send a PR with your smarter branch with 2 changes:

  • rename hibernate? to idle_hibernate for consistent naming
  • remove the Application.get_env call as we don't have that for other options - if there was to be an application env it would be in the driver for consistency. DBConnection should only depend on the pure options passed to it.

@fishcakez fishcakez added this to the v1.2 milestone Nov 20, 2017
@redink redink mentioned this pull request Nov 20, 2017
@fishcakez
Copy link
Copy Markdown
Member

Closing in favor of #103.

@fishcakez fishcakez closed this Nov 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants