Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feature: Add a generic object wrapper Retriable.with_retries() #15

Closed
wants to merge 3 commits into from

Conversation

julik
Copy link

@julik julik commented Jul 23, 2015

that returns an object which will have all it's public method calls wrapped with
a retriable() block. Useful for creating wrappers for network services, remote fetchers and the like.

We found this usage pattern more useful when threads are involved (we can then pass such
wrapped objects to execution threads without having the thread worry about configuring the retries).

julik added 2 commits July 23, 2015 10:30
that returns an object which will have all it's public method calls wrapped with
a retriable() block. Useful for creating wrappers for network services, remote fetchers and the like.
@kamui
Copy link
Owner

kamui commented Jul 24, 2015

Hey @julik, first off, thanks for the PR!

Could you explain the use case for this feature in more detail for me? Maybe a more concrete scenario for why you'd want to wrap another class? In your example, is API some 3rd party library in which you can't or don't want to add retriable blocks in?

The issues I have with this PR is:

  1. The wrapper looks like it wraps every method in a retriable block as long as it responds to that method call. This makes me uncomfortable, because is it really the case that if you have say a twitter api class, that you'd want to retry ALL methods in the class? Wouldn't you more normally want to just wrap a few methods? I could imagine there's probably helper methods or non api call methods that you might not want to wrap. This implementation doesn't allow that.
  2. Once you wrap an object, there's no way to get the original object back from the wrapped object. If you don't care about the original object, why not just modify the existing object? Or Module#prepend on the class itself?

Here's 2 alternatives I would use instead of this:

Given this API class:

class Twitter
  def get_favorites(number)
    puts "get #{number} favorites"
  end
end

We can create the follow module to add retriable blocks:

module RetriableTwitter
  def get_favorites(*args)
    Retriable.retriable(**retriable_options) do
      super
    end
  end

  private
  def retriable_options
    { on: Timeout::Error }
  end
end

Extending the instance:

twitter_api = Twitter.new
twitter_api.extend(RetriableTwitter)
twitter_api.get_favorites(10)

This extends only the instance with retriable blocks around only the methods you want to wrap. I think this is more explicit. You do need to know the method names, but it seems safer to wrap real api calls or network requests, rather than wrapping ALL method calls.

Module#prepend the class:

# Reopen the class and prepend
class Twitter
  prepend RetriableTwitter
end

twitter_api = Twitter.new
twitter_api.get_favorites(10)

If you don't mind modifying the Twitter class, I'd just prepend the methods you want to retry and then every instance of the twitter api will retry when that specific method is called.

If I'm missing something or don't understand the use case, please let me know.

@kamui
Copy link
Owner

kamui commented Jul 24, 2015

Here's a path I would consider going down, this is just a helper method to generate an anonymous module.

module Retriable
  def module_methods(methods, retriable_options = {})
    Module.new do
      [*methods].each do |method|
        define_method method do |*args|
          Retriable.retriable(**retriable_options) do
            super(*args)
          end
        end
      end
    end
  end
end

And here's how you use it:

twitter_api = Twitter.new
twitter_api.extend Retriable.module_methods([:get_favorites], { on: StandardError })
twitter.get_favorites(10)

or

class Twitter
  prepend Retriable.module_methods([:get_favorites], { on: StandardError })
end
twitter_api = Twitter.new
twitter.get_favorites(10)

I could see something like that going into the gem.

@julik
Copy link
Author

julik commented Jul 24, 2015

Thanks for your comments. The reason why I am not using dynamic method extend is because in Ruby it is generally low-performance, since it busts method caches - and creates extra class metadata on a per-object basis, see various DCI rebuttals like this one http://tonyarcieri.com/dci-in-ruby-is-completely-broken. Extending/prepending separate objects is therefore something I tend to avoid (especially when using small objects by the thousand).

The prepend approach is nice, however it would also cause the internal calls within the object (when the object messages itself) to go through retries, and I like the delegate approach better exactly because it only concerns itself with the external consumer talking to the object, but not with the object talking to itself.

To explain the use case a bit more - I am using it a bit like the delay() proxy with Sidekiq. I use it in a distributed context (with queues and threads) so that the executing code does not have to manage retries on it's own. A bit like this (shown here with my own retries wrapper that I already have, I just thought integrating it into Retriable would make more sense):

download = PartialDownload.new("http://filestore.com/bigfile.bin", 10, 409885)

# I know that every download will thus be retried N times with those errors,
# and I can configure that in the caller
retriable_download = NanoRetriable.with_retries(download, errors: Net::OpenTimeout, Faraday::ConnectionError, times: 5)
queue << retriable_download

# ...way below
ThreadPool.execute do
  job = queue.pop(non_block=true)
  job.download! # Will use or not use retries, the thread should know nothing about it
rescue ThreadError
  # queue emptied
end

So I use this to store the retry settings in the object initiating the threads, and from there any operation
can be made "retriable" because operations work like sync Actors in a way - you send them messages and they just do stuff.

It also makes perfect sense to use a wrapper like this for those objects, since the number of methods they support is very small (like perform!, execute!, download! etc.) and the retries are only applied at "the surface" of the API, but not when the object is working with it's own methods.

Fetching the object being wrapped is a very valid concern and I think a method could be easily provided to retrieve it. My initial impl. did not have it because I never had the consumer of the wrapped object extract the wrapped object to do anything with it.

To be clear - I am absolutely not insisting that this should be the only way to consume Retriable, but I think it is a nice usage pattern - and it would be nice if I could replace my homegrown thingy with Retriable while doing so.

@kamui
Copy link
Owner

kamui commented Jul 25, 2015

Thanks for the clarification. So, the 2 main points I'm seeing are these:

  1. Performance issues due to busting method cache with dynamic methods.
  2. Externality of the retry code to the class being wrapped.

I totally see your point in regards to both these. The 2nd one especially, I also think that the retry code should only be invoked by the thing consuming the wrapper, and not the class being wrapped. My suggestions could break that if the class calls it's own methods. I can also see the first one being an issue too when you have a lot of these object instances being created, then wrapped.

The delegator approach is better for that scenario. I'd even suggest going one step further and using DelegateClass so you can skip the method_missing and respond_to? calls, but that would mean you'd have to define a wrapper class for each api/network class you want to wrap and wrap each method manually. I do see a 2nd benefit to that though, you can specify exactly what methods you want to retry making it more explicit.

So, I do see the usefulness of this PR, since it'll allow you to wrap any class, but I still have a few issues:

1 - It wraps all public methods with retriable blocks. This is fine for your use case because you know that the class api is small and you actually want to retry all the methods on those classes, but I'm trying to think of this as a general wrapper for wrapping any network/io/api class. I don't think we can assume anyone using this wrapper will want to retry all public methods. So we need to provide a way to specify which methods you want to wrap.

I understand that since the user is wrapping the object, that they should know that when they call any method on the wrapped object, it will be retried, but I think it would be a better experience if you could specify which methods you want to retry, and maybe even only respond to the methods you specify, otherwise you need to unwrap the wrapper and call the method on that object.

This shouldn't be hard to add right? initialize could take another argument, an array of method symbols, and then you check against that list in the respond_to_missing method.

2 - This is somewhat less of a concern, but what if you wanted to wrap something with a crazy class ancestor hierarchy, like an ActiveRecord class? Wouldn't the respond_to? checks and method_missing lookup also incur a big performance penalty as well? Maybe to the point that extend or prepend might be faster? I haven't benchmarked this, so I don't know.

So currently, I think this implementation might be too specific to your needs. It's great if you know you want to wrap all the methods in the class or you're careful to only call methods you know you want to retry, and if the class has a simple/shallow method lookup ancestry. I'm trying to think of ways we can make it more useful for more use cases.

@kamui
Copy link
Owner

kamui commented Jul 25, 2015

Out of curiosity, in your use case, how many classes like PartialDownload do you wrap? If it's a lot, I could see that creating a class uses DelegateClass could be annoying, but I figure most people don't need to wrap a lot of these types of classes, and that solution might work best.

Another thing I also thought of is, what if you want to retry different exceptions for some subset of method calls? Or maybe different backoff settings for some methods?

@julik
Copy link
Author

julik commented Jul 26, 2015

On my case I very often wrap Faraday for shortcut methods like get, I wrap separate Actor objects and so on. I think the number of classes I wrap like this in my recent projects is about 4-5 per library, but may be more. Obviously most of that is concerned with network rertries.

I think we can arrive at a compromise if we let the with_retries accept an extra kwarg, say :methods. If given, only messages to those methods will be wrapped with retries, and the others will be passthroughs. If not given, all methods will be wrapped (to make it concise, because I would also not necessarily want the configuring code to know about the methods the object being wrapped supports).

Then I will also add __getobj__ to get at the real object. Using SimpleDelegator is a good idea since it's way less malicious than in 1.8.

Re.overhead - there will be an overhead of one method_missing (once per call from consumer) when calling a wrapped method, one wrapper block in a cached method _retrying (once per call from consumer) and there will be a public_send when doing the dispatch, depending on number of retries. I think those are negligible.

Another thing I also thought of is, what if you want to retry different exceptions for
some subset of method calls? Or maybe different backoff settings for some methods?

In that case I would not use such a wrapper, but make something mor elaborate (or even inherit from such a delegator and redefine the methods needing special settings).

@julik
Copy link
Author

julik commented Jul 26, 2015

However, having had a second look at SimpleDelegator - we are already implementing most of it, and it's more suitable for adding methods to the proxy than for wrapping the existing ones, so I'm tempted to leave it for what it is.

@kamui
Copy link
Owner

kamui commented Jul 28, 2015

Yeah, SimpleDelegator doesn't seem like a fit here, but DelegateClass is slightly different. DelegateClass would allow you to skip the method_missing and respond_to? calls, but you couldn't really abstract it since it has to delete to a specific class.

For Faraday, you should be able to do something like this:

class FaradayRetriable < DelegateClass(Faraday)
  def get(url)
    Retriable.retriable do
      super
    end
  end
end

I'd imagine it'd be easy to wrap an Actor as well.

@julik
Copy link
Author

julik commented Jul 28, 2015

Well creating all those extra classes is exactly something I want to avoid by using a delegator.
DelegateClass() also creates anonymous classes, so if there will be many objects this will also balloon the method/class tables.

Look, if you do not like the approach - we can also close the PR and I will just make a separate gem on top of the one you got.

@kamui
Copy link
Owner

kamui commented Jul 29, 2015

I do see some value in a wrapper, but I think 1) retrying ALL methods and 2) with the same retry options for each method seems like it has a narrow and specific use case. At this point, I think you should make a separate gem on top of this one with that wrapper functionality. If it turns out that a lot of people have the same use case as you and find it useful I'd be happy to revisit adding this functionality into this gem in the future.

Sorry this discussion took so long, but I wanted to understand your use case, discuss alternatives, and get try to get a sense of how useful this might be to others.

@kamui kamui closed this Jul 29, 2015
@julik
Copy link
Author

julik commented Jul 29, 2015

No problem, let's do the extra gem thing. Thanks for the discussion.

@julik
Copy link
Author

julik commented Jul 29, 2015

Added in https://github.com/julik/retriable_proxy - maybe you could stick it in the README for exposure?

@kamui
Copy link
Owner

kamui commented Jul 29, 2015

Sure, do you want to right up a blurb or few sentences I should say about it?

@julik
Copy link
Author

julik commented Jul 30, 2015

What about something like this:

If you would like to use Retriable via a proxy object wrapper around any object, like this:

RetriableProxy.for_object(api_endpoint, on: Net::TimeoutError)

use the retriable_proxy gem which allows for this functionality on top of Retriable.

@kamui
Copy link
Owner

kamui commented Aug 4, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants