-
Notifications
You must be signed in to change notification settings - Fork 18
Why resque brain can't kill a worker
Because Resque myopically hard-codes a single Ruby VM-wide instance of Redis for its operations, and because killing a worker will run failure hooks, which could depend on that instance, resque-brain cannot safely kill a worker.
Here's the code for when you unregister a worker:
def unregister_worker(exception = nil)
# If we're still processing a job, make sure it gets logged as a
# failure.
if (hash = processing) && !hash.empty?
job = Job.new(hash['queue'], hash['payload'])
job.data_store = data_store
# Ensure the proper worker is attached to this job, even if
# it's not the precise instance that died.
job.worker = self
job.fail(exception || DirtyExit.new)
end
data_store.unregister_worker(self) do
Stat.clear("processed:#{self}")
Stat.clear("failed:#{self}")
end
end
This part is OK for resque-brain, because Resque::DataStore
is set up to allow this to work. Even setting job.data_store = data_store
seems like it will work.
Here's the code in Job
for fail
:
def fail(exception)
run_failure_hooks(exception)
Failure.create \
:payload => payload,
:exception => exception,
:worker => worker,
:queue => queue
end
This is already a problem, because Failure
uses the hard-coded VM-wide redis instance, and not the one that resque-brain operates. But, it gets worse when we look at run_failure_hooks
:
def run_failure_hooks(exception)
begin
job_args = args || []
if has_payload_class?
failure_hooks.each { |hook| payload_class.send(hook, exception, *job_args) } unless @failure_hooks_ran
end
ensure
@failure_hooks_ran = true
end
end
Here, we fire hooks on the job class, and those hooks could be anything. Since Idiomatic Ruby™ is to use global symbols like Resque.redis
with reckless abandon, there's no way to be sure what will happen in those hooks.
This is probably the only viable option. Doing so, of course, is not thread-safe even a little bit, so we'd have to surround it with mutexes and all that. Maybe we can do that, but for now, we'll just count it as a win that we can even see stale workers.