
I have a Sinatra app with a long-running process (a web scraper). I'd like the app to flush the crawler's progress to the browser while the crawler is running instead of at the end.

I've considered forking the request and doing something fancy with Ajax, but this is a really basic one-page app that just needs to output a log to the browser as it happens. Any suggestions?

1 Answer


Update (2012-03-21)

As of Sinatra 1.3.0, you can use the new streaming API:

get '/' do
  stream do |out|
    out << "foo\n"
    sleep 10
    out << "bar\n"
  end
end

Old Answer

Unfortunately, you don't have a stream you can simply flush to (that would not work with Rack middleware). Instead, the result returned from a route block only has to respond to each. The Rack handler calls each with a block, and inside that block it flushes the given part of the body to the client.

Every Rack response body has to respond to each and must hand only strings to the given block. Sinatra takes care of this for you if you just return a string.
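
To make that contract concrete, here is a minimal sketch in plain Ruby, with neither Sinatra nor Rack loaded. LogBody and write_body are hypothetical names invented for this illustration; write_body only imitates what a Rack handler does with a body and is not Rack's actual code.

```ruby
require 'stringio'

# A minimal body object: anything that responds to #each and yields Strings.
class LogBody
  def initialize(lines)
    @lines = lines
  end

  def each
    @lines.each { |line| yield "#{line}\n" }
  end
end

# Hypothetical stand-in for a Rack handler: it calls #each with a block
# and writes (then flushes) every chunk as soon as it is yielded.
def write_body(body, io)
  body.each do |chunk|
    io.write(chunk)
    io.flush
  end
end

out = StringIO.new
write_body(LogBody.new(%w[start working done]), out)
```

The body never sees the socket. It only yields strings, and whoever calls each decides when to flush.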

A simple streaming example would be:

require 'sinatra'

get '/' do
  result = ["this", " takes", " some", " time"]
  class << result
    def each
      super do |str|
        yield str
        sleep 0.3
      end
    end
  end
  result
end

Now you could simply place all your crawling in the each method:

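require 'sinatra'
require 'open-uri' # provides URI.open (Ruby 2.5+) for fetching URLs

class Crawler
  def initialize(url)
    @url = url
  end

  # Rack calls each and sends every yielded string to the client
  def each
    yield "opening url\n"
    result = URI.open(@url).read # read the response body into a String
    yield "searching for foo\n"
    if result.include? "foo"
      yield "found it\n"
    else
      yield "not there, sorry\n"
    end
  end
end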

get '/' do
  Crawler.new 'http://mysite'
end
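
On Sinatra 1.3.0 or later, an object like this can probably be combined with the streaming API from the update above by forwarding each chunk to the stream. The sketch below keeps the crawler offline so it runs anywhere: FakeCrawler and forward are hypothetical names, and the route in the trailing comment is an assumption about how you might wire it up, not tested code.

```ruby
# Hypothetical stand-in for Crawler: same #each contract, no network access.
class FakeCrawler
  def initialize(url)
    @url = url
  end

  def each
    yield "opening #{@url}\n"
    yield "searching for foo\n"
    yield "done\n"
  end
end

# Push every chunk into anything that accepts << (an Array, a StringIO,
# or the out object that Sinatra's stream block receives).
def forward(body, out)
  body.each { |chunk| out << chunk }
  out
end

chunks = forward(FakeCrawler.new('http://mysite'), [])

# In a Sinatra 1.3.0+ route this would look roughly like:
#   get '/' do
#     stream { |out| forward(Crawler.new('http://mysite'), out) }
#   end
```

Keeping the crawler as a plain each-style object means the same class can be returned directly as a body or fed into stream.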
  • Hi, I'm a little confused here. How many times does Rack call each on the Crawler class, and how do we control that?
    – Raja
    Mar 11, 2012 at 16:04
  • Usually it's called once (when streaming out), but middleware might call it, too, to get access to the body content, so you cannot be 100% sure it's only being called once. Also keep in mind that this does not flush on Thin. Mar 21, 2012 at 14:37
  • Updated my answer to include the new streaming API. Mar 21, 2012 at 14:45
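
Since middleware may call each more than once, one way to avoid crawling twice is to record the chunks during the first pass and replay them on later calls. This is only a sketch: ReplayableBody and CountingBody are hypothetical classes made up for the example, and the wrapper trades memory for safety by keeping the whole output around.

```ruby
# Hypothetical wrapper: runs the wrapped body's #each once, then replays.
class ReplayableBody
  def initialize(body)
    @body = body
    @chunks = nil
  end

  def each(&block)
    if @chunks
      # Later calls: replay what was recorded instead of redoing the work.
      @chunks.each(&block)
    else
      recorded = []
      @body.each do |chunk|
        recorded << chunk
        block.call(chunk) # still delivered as soon as it is produced
      end
      @chunks = recorded
    end
  end
end

# Demo body that counts how often its work actually runs.
class CountingBody
  attr_reader :runs

  def initialize
    @runs = 0
  end

  def each
    @runs += 1
    yield "step 1\n"
    yield "step 2\n"
  end
end

inner = CountingBody.new
body = ReplayableBody.new(inner)
first = []
second = []
body.each { |chunk| first << chunk }
body.each { |chunk| second << chunk }
```

Here the wrapped body does its work once even though each was called twice, and both callers receive the same chunks.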
