Skip to content

Tasks signaling Conditions leave suspended Fibers behind #295

@rmosolgo

Description

@rmosolgo

I've learned that left-over, suspended fibers can retain ActiveRecord connections, causing the connection pool to run out of available connections (example: rmosolgo/graphql-ruby#4739 (comment)).

So, I added a test to my project that makes sure that all Fibers created during execution end up not alive?. It currently doesn't pass when I use async as the "backend" for GraphQL::Dataloader. Here's the test: https://github.com/rmosolgo/graphql-ruby/blob/e60d22b138d91c0193e2b24eeb7b20b60648698f/spec/graphql/dataloader_spec.rb#L1020-L1026

In that test, I have to pause GC in order to catch the suspended fiber, because otherwise it will get GC'ed, since the task assigns @fiber = nil.

But, in Rails, that suspended Fiber can be the "owner" of an ActiveRecord connection, and since it's suspended, ActiveRecord thinks that it might still use the connection -- and therefore doesn't reuse the connection 😖 .

Here's a small replication (the condition.signal seems to be the special sauce):

require "bundler/inline"

gemfile do
  gem "async", "~>2.6"
  gem "sqlite3", "~>1.7"
  gem "rails", "~>7.1"
end

require "active_record"

ActiveSupport::IsolatedExecutionState.isolation_level = :fiber
ActiveRecord::Base.establish_connection(adapter: "sqlite3", database: "__example.db")
ActiveRecord::Schema.define do
  self.verbose = false
  create_table :things, force: true do |t|
    t.column :name, :string
  end
end

class Thing < ActiveRecord::Base
end

Thing.create!(name: "Pogo Stick")

pp ActiveRecord::Base.connection_pool.stat

def run_process(number)
  puts "#{number} - begin process"
  Sync do |root_task|
    condition = Async::Condition.new
    t1 = root_task.async do |inner_task|
      condition.wait
    end

    root_task.async do |inner_task|
      Thing.all.to_a
      condition.signal
    end.wait

    t1.wait
  end
  puts "#{number} - end process"
end


run_process(1)
run_process(2)
run_process(3)

# For clarity's sake, remove connections which ActiveRecord knows aren't in use:
ActiveRecord::Base.connection_pool.reap

puts "\n\nSome AR connections are in use, even though the process exited:"
pp ActiveRecord::Base.connection_pool.connections.map { |c|
  { in_use: !!c.in_use?, owner_alive?: c.owner&.alive?, owner: c.owner,  }
}
# For exmaple:
# [{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000107a1f1f0 (resumed)>},
#  {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x000000010b845060 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
#  {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x000000010b8e4020 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
#  {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x000000010b80ded0 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>}]

run_process(4)
# This process will fail to acquire a connection:
run_process(5)
run_process(6)

It prints like this:

$ ruby async_issue.rb
/Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/io-event-1.3.3/lib/io/event/support.rb:24: warning: IO::Buffer is experimental and both the Ruby and C interface may change in the future!
{:size=>5, :connections=>1, :busy=>1, :dead=>0, :idle=>0, :waiting=>0, :checkout_timeout=>5.0}
1 - begin process
1 - end process
2 - begin process
2 - end process
3 - begin process
3 - end process


Some AR connections are in use, even though the process exited:
[{:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000101b2f2a8 (resumed)>},
 {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000105d11ce8 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
 {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000105f18d98 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>},
 {:in_use=>true, :owner_alive?=>true, :owner=>#<Fiber:0x0000000105f13b40 /Users/rmosolgo/.rbenv/versions/3.2.2/lib/ruby/gems/3.2.0/gems/async-2.6.5/lib/async/task.rb:326 (suspended)>}]
4 - begin process
4 - end process
5 - begin process

(It never exits and for some reason it also doesn't raise with a timeout error 🤷 )

I think it would be better if this Fiber was terminated, so that ActiveRecord could reap the connection it was using. What do you think?

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions