Smarter hibernate#103

Closed
redink wants to merge 9 commits into elixir-ecto:master from redink:smarter_hibernate

@redink

@redink redink commented Nov 20, 2017

from #101

alias TestPool, as: P
alias TestAgent, as: A

@tag :idle_hibernate
Member

This test should have the :idle_timeout tag because it relies on idle timeout being active. The tags are used for test filtering, so if a pool doesn't support idle timeout it can't hibernate after an idle timeout. If there is a custom pool that needs this tag then we can leave it on. I think we should also move this test to the idle_test.exs file.
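A minimal sketch of the tag-based filtering mentioned above, assuming a hypothetical test helper for a pool that supports neither idle timeout nor hibernation. The module name `TagFilterSketch` and the test bodies are illustrative only; `ExUnit.start/1` with `:exclude` and `@tag` are standard ExUnit.

```elixir
# Hypothetical test helper for a pool without idle-timeout support.
ExUnit.start(exclude: [:idle_timeout, :idle_hibernate])

defmodule TagFilterSketch do
  use ExUnit.Case, async: true

  # Skipped whenever :idle_timeout or :idle_hibernate is excluded.
  @tag :idle_timeout
  @tag :idle_hibernate
  test "ping after idle timeout using hibernate" do
    assert true
  end

  # Carries no tag, so it runs for every pool.
  test "untagged test always runs" do
    assert 1 + 1 == 2
  end
end
```

Carrying both tags means a pool that excludes either one skips the test, which matches the point that hibernation after an idle timeout depends on idle timeout being supported.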

opts = [agent: agent, parent: self(), idle_timeout: 50, idle_hibernate: true]
{:ok, pool} = P.start_link(opts)
assert_receive {:hi, conn}
assert_receive {:pong, ^conn}
Member

I wonder if we can test that the process hibernates in some way.

Comment thread integration_test/tests.exs Outdated
Code.require_file "cases/stream_test.exs", __DIR__
Code.require_file "cases/transaction_execute_test.exs", __DIR__
Code.require_file "cases/transaction_test.exs", __DIR__
Code.require_file "cases/hibernate_test.exs", __DIR__
Member

Alphabetically please: we once had a test file that wasn't being run, and keeping the list sorted makes that easier to spot ;)

done_lock(regulator, lock)
{timeout, backoff} = Backoff.backoff(backoff)
{:backoff, timeout, %{s | lock: nil, backoff: backoff}}
maybe_hibernate(idle_hibernate, {:backoff, timeout, %{s | lock: nil, backoff: backoff}})
Member

We should test this part with the other backoff handling

@redink
Author

redink commented Nov 21, 2017

I added a way to test the hibernation. It's the only approach I could think of.

If you have a better one, please help me out. Thanks.

@fishcakez
Member

@redink we could read the memory in the callback and send it in the message to the test process, and then wait in a recursive loop for it to go down. Alternatively, in a similar recursive loop, we could wait for Process.info(self(), :current_function) to change to the appropriate function.

We should consider limiting the loop by a time limit using System.monotonic_time. Also we could sleep between each check to allow other processes to run.
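A minimal sketch of the bounded polling loop suggested above, assuming the test holds the pid of the connection process. `HibernateWait`, `await_hibernate/3`, and the default timings are hypothetical names and values, not part of db_connection.

```elixir
defmodule HibernateWait do
  # Hypothetical helper: polls `pid` until it is parked in
  # :erlang.hibernate/3, or gives up once `timeout_ms` has elapsed.
  def await_hibernate(pid, timeout_ms \\ 500, interval_ms \\ 10) do
    deadline = System.monotonic_time(:millisecond) + timeout_ms
    loop(pid, deadline, interval_ms)
  end

  defp loop(pid, deadline, interval_ms) do
    case Process.info(pid, :current_function) do
      {:current_function, {:erlang, :hibernate, 3}} ->
        :ok

      # Process.info/2 returns nil when the process is no longer alive.
      nil ->
        {:error, :noproc}

      _other ->
        if System.monotonic_time(:millisecond) >= deadline do
          {:error, :timeout}
        else
          # Sleep between checks so other processes get to run.
          Process.sleep(interval_ms)
          loop(pid, deadline, interval_ms)
        end
    end
  end
end
```

A test could then call `assert :ok == HibernateWait.await_hibernate(conn)` rather than checking `Process.info/2` once.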

Comment thread integration_test/cases/backoff_test.exs Outdated
assert_receive {:error, conn}
assert_receive {:hi, ^conn}

if test_hibernate? and Mix.env() != :sojourn do
Member

If the :sojourn check is required then it means we are missing some code for the sojourn pool. We shouldn't have a Mix.env call, this context should be handled by tags or config.

We are missing the hibernate at this location:

continue_ask(%{s | idle_time: 0, state: state})
(i.e. after ping succeeds and the :sbroker async request has been sent).

Comment thread lib/db_connection/connection.ex Outdated
{:noreply, s, idle_timeout}
defp handle_timeout(%{client: nil, idle_timeout: idle_timeout,
idle_hibernate: idle_hibernate} = s) do
Process.send_after(self(), :timeout, idle_timeout)
Member

If we use a timer then we need to be careful to cancel the timer and/or have the timeout message uniquely identifiable (include reference) because the :timeout could arrive late and trigger early timeout and out of sync timeouts.

This module includes a few examples of using :erlang.start_timer to do this in a similar way. Given that timers behave better here (a timer gives a stricter timeout than a timeout return value), it's a good change. It might complicate the code, though, which I think is why it wasn't implemented like that originally.

Seems like we really need gen_statem here but can't until OTP 20+.
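A minimal sketch of a uniquely identifiable idle timeout, assuming a standalone GenServer rather than the actual connection process. `IdleTimerSketch`, the `:idle` payload, and the `pings` counter are hypothetical; only `:erlang.start_timer/3` and its `{:timeout, ref, msg}` message shape are standard.

```elixir
defmodule IdleTimerSketch do
  use GenServer

  def start_link(idle_timeout), do: GenServer.start_link(__MODULE__, idle_timeout)

  def init(idle_timeout) do
    {:ok, %{idle_timeout: idle_timeout, idle_timer: start_idle(idle_timeout), pings: 0}}
  end

  def handle_call(:pings, _from, s), do: {:reply, s.pings, s}

  # Only the timer reference currently stored in the state is honoured.
  def handle_info({:timeout, timer, :idle}, %{idle_timer: timer} = s) do
    {:noreply, %{s | pings: s.pings + 1, idle_timer: start_idle(s.idle_timeout)}}
  end

  # A late message from a replaced or cancelled timer is dropped,
  # so it cannot trigger an early or out-of-sync timeout.
  def handle_info({:timeout, _stale, :idle}, s), do: {:noreply, s}

  # :erlang.start_timer/3 delivers {:timeout, ref, :idle} to this process.
  defp start_idle(timeout), do: :erlang.start_timer(timeout, self(), :idle)
end
```

A bare `:timeout` message from `Process.send_after/3` carries no reference, so a stale one is indistinguishable from a current one.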


if test_hibernate? and Mix.env() != :sojourn do
assert {:current_function, {:erlang, :hibernate, 3}} ==
Process.info(conn, :current_function)
Member

This may have a race condition, as we can't guarantee the process has hibernated because of the async nature of processes. However, we can leave this until it's observed.

end

opts = [agent: agent, parent: self(), backoff_min: 10]
defp execute_test_backoff_after_failed(agent, opts, test_hibernate? \\ false) do
Member

If there's enough logic to share a function, we could try to simplify the hibernating test and not share the function. We might have overlapping tests.

Comment thread lib/db_connection/connection.ex Outdated
hibernate_timer: hibernate_timer} = s) do
cancel_timer(hibernate_timer)
hibernate_timer = Process.send_after(self(), :timeout, idle_timeout)
# hibernate_timer = start_timer(self(), idle_timeout, :trigger_hibernate)
Author

Need help!

The test fails if I use start_timer, but it works if I use Process.send_after/3.

The error output:


  1) test ping after idle timeout using hibernate (TestIdle)
     integration_test/cases/idle_test.exs:15
     No message matching {:pong, ^conn} after 500ms.
     The following variables were pinned:
       conn = #PID<0.561.0>
     The process mailbox is empty.
     code: execute_test_case(agent, opts, true)
     stacktrace:
       integration_test/cases/idle_test.exs:53: TestIdle.execute_test_case/3
       integration_test/cases/idle_test.exs:18: (test)

Member

When the timer arrives we need to call the logic that is currently done by handle_info(:timeout, ...) because the timer is replacing that timeout. The reason it fails is because we don't call apply(mod, :ping, [state]) anymore.
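A minimal sketch of that ordering, assuming a standalone GenServer: the timer message first runs the idle handling (the ping) and only then may hibernate. `IdlePingSketch` and `ping_fun` are hypothetical stand-ins for the connection process and the `ping` callback, and the body of `maybe_hibernate/2` is a guess, since the PR excerpts show only its call sites.

```elixir
defmodule IdlePingSketch do
  use GenServer

  def start_link(opts), do: GenServer.start_link(__MODULE__, opts)

  def init(opts) do
    s = %{
      # `ping_fun` stands in for the connection module's ping callback.
      ping_fun: Keyword.fetch!(opts, :ping_fun),
      idle_timeout: Keyword.fetch!(opts, :idle_timeout),
      idle_hibernate: Keyword.get(opts, :idle_hibernate, false),
      idle_timer: nil
    }

    {:ok, start_idle(s)}
  end

  # The timer replaces the old :timeout, so it must do the same work:
  # ping first, re-arm the idle timer, then optionally hibernate.
  def handle_info({:timeout, timer, :idle}, %{idle_timer: timer} = s) do
    s.ping_fun.()
    maybe_hibernate(s.idle_hibernate, {:noreply, start_idle(s)})
  end

  def handle_info(_other, s), do: {:noreply, s}

  defp start_idle(s) do
    %{s | idle_timer: :erlang.start_timer(s.idle_timeout, self(), :idle)}
  end

  # Assumed shape: add :hibernate to the reply only when the option is set.
  defp maybe_hibernate(true, {:noreply, s}), do: {:noreply, s, :hibernate}
  defp maybe_hibernate(_, reply), do: reply
end
```

With `idle_hibernate: true` the process still wakes on the next timer message, pings again, and hibernates again.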

Comment thread lib/db_connection/connection.ex Outdated
defp cancel_timer(timer) do
case :erlang.cancel_timer(timer) do
false -> flush_timer(timer)
false -> :erlang.read_timer(timer) and flush_timer(timer)
Member

read_timer will always return false here, so it won't flush; we don't need this part.
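A minimal sketch of cancel-then-flush, assuming timers created with `:erlang.start_timer/3`. `TimerCancelSketch` is a hypothetical module name. Once `:erlang.cancel_timer/1` returns `false` the timer no longer exists, so `:erlang.read_timer/1` on the same reference also returns `false`, which is why the extra check never reaches the flush.

```elixir
defmodule TimerCancelSketch do
  # Nothing to cancel when no timer is running.
  def cancel_timer(nil), do: :ok

  def cancel_timer(timer) do
    case :erlang.cancel_timer(timer) do
      # false: the timer already fired (or never existed), so its
      # message may be sitting in the mailbox and must be removed.
      false -> flush_timer(timer)
      # An integer is the time that was left; no message was sent.
      _remaining -> :ok
    end
  end

  defp flush_timer(timer) do
    receive do
      {:timeout, ^timer, _msg} -> :ok
    after
      0 -> :ok
    end
  end
end
```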

Comment thread lib/db_connection/connection.ex Outdated
defp start_timer(pid, timeout) do
:erlang.start_timer(timeout, self(), {__MODULE__, pid, timeout})
end
defp start_timer(_, :infinity, _), do: nil
Member

Let's reuse the existing timer function by calling start_timer(:idle, timeout) so the message carries some information!

Comment thread lib/db_connection/connection.ex Outdated
receive do
{:timeout, ^timer, {__MODULE__, _, _}} ->
:ok
{:timeout, ^timer, :trigger_hibernate} ->
Member

We won't need this clause; see above!

Comment thread lib/db_connection/connection.ex Outdated

s = %{mod: mod, opts: opts, state: nil, client: :closed, broker: broker,
regulator: regulator, lock: nil, queue: queue, timer: nil,
hibernate_timer: nil,
Member

Please use idle_timer as the timer is for idle and not hibernating, but hibernate is a possible side effect of going idle.

Comment thread integration_test/cases/backoff_test.exs Outdated
execute_test_backoff_after_failed(agent, opts)
end

@tag :idle_hibernate_backoff
Member

We should not require this tag, all pools should support this.

Member

Actually, let's keep this for now; ignore the above comment.

end

@tag :idle_timeout
@tag :idle_hibernate
Member

We should support this for all pools that can do idle_timeout, I left a comment on how to achieve this previously.

Member

@fishcakez fishcakez Nov 21, 2017

Actually, let's keep this for now; ignore the above comment.

Comment thread lib/db_connection/connection.ex Outdated
end
def handle_info(:timeout, %{idle_hibernate: idle_hibernate} = s) do
maybe_hibernate(idle_hibernate, {:noreply, s})
maybe_hibernate(idle_hibernate, {:noreply, %{s | hibernate_timer: nil}})
Member

We should never reach this handle_info/2 clause. I think we can delete this clause.

Comment thread lib/db_connection/connection.ex Outdated

def handle_info({:timeout, timer, :trigger_hibernate},
%{idle_hibernate: idle_hibernate} = s) when is_reference(timer) do
maybe_hibernate(idle_hibernate, {:noreply, %{s | hibernate_timer: nil}})

@fishcakez
Member

@redink I left some comments: the main takeaway is that the timer is for idling and for hibernating. The hibernation occurs after the idle handling (calling the ping callback), and it is possible we won't hibernate. The new option is to go idle after we ping or after we fail to connect.

@redink
Author

redink commented Nov 21, 2017

The sojourn tests still have a problem; I will work on it later.

@fishcakez
Member

@redink the last changes look good. It might be that Sojourn doesn't play well with hibernate because of how the pool works. I suggest we don't implement hibernate for it yet and add the @idle_hibernate tag for sojourn tests.

@redink
Author

redink commented Nov 27, 2017

Hello, any updates? Or any more comments?

@fishcakez
Member

@redink there is a performance regression, so this needs looking into.

@fishcakez
Member

Closing in favor of #108.

@fishcakez fishcakez closed this Jan 7, 2018