Hey, we ran into a problem when combining que, active_record and PG advisory_lock in a highly concurrent environment.
The active database connection is released back to the pool as soon as a que job is enqueued from within another que job.
Consider the following sample jobs:
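class TestQueJob < Que::Job
  def run(id)
    # Remember which connection this thread holds before enqueueing.
    conn = ::ActiveRecord::Base.connection
    TestQueJob2.enqueue
    # After the enqueue, the thread may have been handed a different connection.
    if conn != ::ActiveRecord::Base.connection
      raise "Connection changed"
    end
  end
end

class TestQueJob2 < Que::Job
  def run; end
end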
When these jobs run in a highly concurrent environment, the raise can be triggered.
If that happens while the job holds a non-transactional (session-level) advisory lock, the unlock call may be given a different connection than the one used to obtain the lock, e.g.
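class TestQueJob < Que::Job
  def run(id)
    # The lock is taken on whichever connection the thread holds right now.
    ActiveRecord::Base.connection.execute("SELECT pg_advisory_lock(#{id})")
    TestQueJob2.enqueue
    # This may run on a different connection, which does not hold the lock.
    ActiveRecord::Base.connection.execute("SELECT pg_advisory_unlock(#{id})")
  end
end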
In that case, you can end up with a lock that is never released explicitly. More precisely, it is released by the PG server only when the que process exits and the connection that holds it is disconnected.
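To make the failure mode concrete, here is a toy model in plain Ruby. It is only a sketch: ToyLockServer, :conn_a and :conn_b are made-up names, and nothing here is que, ActiveRecord or PostgreSQL code. The one thing it tries to mirror is that a session-level advisory lock belongs to the session that took it, so an unlock sent over another session does nothing (in real PG, pg_advisory_unlock returns false in that case).

```ruby
# Toy model of session-level advisory locks (hypothetical, for illustration only).
# It ignores blocking and re-entrant lock counts, which real PostgreSQL has.
class ToyLockServer
  def initialize
    @owners = {} # lock key => session that currently holds it
  end

  # Grants the lock unless another session already holds it.
  def lock(session, key)
    return false if @owners.key?(key) && @owners[key] != session
    @owners[key] = session
    true
  end

  # Only the owning session can release; anyone else gets false.
  def unlock(session, key)
    return false unless @owners[key] == session
    @owners.delete(key)
    true
  end

  def held?(key)
    @owners.key?(key)
  end
end

server = ToyLockServer.new

# The job takes the lock on the connection it currently holds.
server.lock(:conn_a, 42)

# Enqueueing hands :conn_a back to the pool; the next checkout yields :conn_b,
# so the unlock goes out on a session that does not own the lock.
released = server.unlock(:conn_b, 42)
still_held = server.held?(42)

puts "unlock on other connection succeeded: #{released}"
puts "lock still held: #{still_held}"
```

In this model the lock stays held until :conn_a itself unlocks or goes away, which matches what we see: the lock only disappears once the process exits.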
This is a problem for us because we use advisory locks to synchronize access to specific DB records, so a leaked lock leaves some jobs stuck until the que process is restarted.
I've done some digging and I'm planning to submit a PR with a fix proposal shortly 🙂