Google Groups no longer supports new Usenet posts or subscriptions. Historical content remains viewable.

Dismiss

Lazy lists and optimizing for responsiveness

0 views

Skip to first unread message

Yuval Kogman

unread,

Sep 19, 2005, 8:17:42 AM9/19/05

to perl6-l...@perl.org

One thing that is extraordinarily hard to do with the facilities we
have today is finding the responsive optimum between laziness and
eagerness.

Let's use an example.

WWW::Mechanize comes with a nice example script for mailing list
moderation.

This script can be rather easily hacked to work on several mailing
lists at once.

Let's look at several approaches to how responsiveness of the
program could be improved:

1. lazy loading

my @messages = gather {
for @lists -> $mailing_list {
take $mailing_list.messages; # mech logs into the mailman
# interface and collects the subjects. This list is possibly
# lazy too
}
}

for @messages -> $message {
# prompt with $message.subject and what not
}

This will pause between each mailing list, but present the first
message as early as possible.

Responsiveness is pretty good overall, but occasionally the user
must wait.

2. strict loading

my @messages = ** gather { ... }

This will pause once in the begining, presenting the user with a all
the messages at once.

Responsiveness is high between messages (never any wait between two
messages) but there is a long startup time.

3. multithreaded approach

my @messages;
my $done;
async {
for @lists -> $mailing_list {
push @messages, $mailing_list.messages
}
$done = 1;
};

while (!$done) {
while (@messages) {
my $message = shift @messages;
...
}

wait_for_write(@messages); # assume there is no race condition
# between the depletion and the write
}

This solution lacks the elegance of the lazy loading approach, but
has the best responsiveness. These implementations tend to be overly
complex for what they do, and hence not worth the maintenance costs.

The gain is that the user only has to wait for the first message,
and if the throughput of the system is higher than the user's input,
there is no responsiveness loss because something is always ready
for the user to get at.

Ideally we could have the lazy list approach have some kind of way
to modify the lazyness so that it's something in between -
background eagerness - the lazy list is resolved in the background,
accumilating to @messages. When @messages is filling up quickly the
background resolution thread might get a lower priority. when
@messages is empty, the receiver side has to block.

This behaves much like UNIX pipes - when the buffer window is filled
the writer process blocks.

The problem with UNIX pipes is that there is no orthogonality -
there is one input and one output. But they do make sense for
character streams.

The lazy pulling model lets us model this with higher level data,
and more flexibility.

Note: I have an idea of how easy I want this to be, but not how I
want to do it.

I think that a nice solution is to allow an optional named adverb to
gather which defaults to lazy:

gather : async {

}

gather : lazy {

}

gather : eager { # or strict

}

These behaviors can then cascade nicely. For example to get the full
result set from many streams simultaneously you do something like

my @full = gather : eager { # doesn't return until the inner gathers are depleted
take gather : async { ... }
take gather : async { ... }
}

Another interesting mechanism is the push side of the deal. I think
a give synonym for 'take' is in order.

In the listmod script the prompt loop will asynchronously take from
the message list, and give to the change submission list.

This allows lazifying that is not only in callee context.

async {
for @delete_list -> $message {
$message.delete; # WWW::Mechanize again
}
}

for @messages -> $message {
# prompt
@delete_list.give($message) if $result_of_prompt ~~ Delete;
}

This gets even more interesting when we want to optimize the
deletion code so that messages are deleted in clusters, to improve
throughput. Something like

for :async :noblock @delete_list -> *@messages {
# @messages is filled with as many messages as there were
# available right now
$mech.delete(@messages); # bah
}

We could wrap such that:

my @delete_list = gather {
take $message if prompt() ~~ Delete;
}

but to improve readability the point of balance should be the prompt
loop, not the deletion loop. I back this point with the comparison
of two descriptions of the program:

listmod is a program that prompts about each message in a number
of mialing list interfaces, and deletes the ones you asked it
to.

listmod is a program that deletes all the messages that the user
asked to delete from a mailing list interface.

I find that the second description requires more mental stack space
and IMHO the code should be written that way too.

Also, 'giving' from the prompt loop is also more scalable since you
can have more than the delete queue - messages are also approved.

With a pull only model you need to part the list and iterate the two
result sets asynchronously. With a push model you give your handler
results as they come in, and it gobbles them up in the background as
it sees fit (chunked, as fast as possible, with a delay loop...
whatever).

--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &
/\ kung foo master: /me spreads pj3Ar using 0wnage: neeyah!!!!!!!!!!!

TSa

unread,

Sep 19, 2005, 9:44:04 AM9/19/05

to perl6-l...@perl.org

HaloO,

Yuval Kogman wrote:
> One thing that is extraordinarily hard to do with the facilities we
> have today is finding the responsive optimum between laziness and
> eagerness.

Good, that you remind me to this subject! I wanted to ask the "same"
question starting from more theoretical grounds. I know that eager
and lazy evaluation are just the two extreme cases of the bounded
buffer pattern: lazy = zero sized buffer, eager = infinite buffer.
Sometimes the buffer is also called queue or pipe.

IIRC, $Larry has mentioned a Pipe type which to me seems to be
just the generic type where you configure the buffer/queue size.
In multi-threaded (or connected processes) applications the buffer
size needs tuning to balance responsiveness with throughput. Thus
your gather proposal could just be a :buffer($size) adverb/trait
in the slurpy list declaration and some auto-threading behaviour
of the pipe operators ==> and <==.

To me this is yet another reason why lists---and junctions---to me
are more code like than data like.
--
$TSa.greeting := "HaloO"; # mind the echo!

Austin Frank

unread,

Sep 20, 2005, 2:47:33 PM9/20/05

to TSa, perl6-l...@perl.org

TSa wrote:

> IIRC, $Larry has mentioned a Pipe type which to me seems to be
> just the generic type where you configure the buffer/queue size.
> In multi-threaded (or connected processes) applications the buffer
> size needs tuning to balance responsiveness with throughput. Thus
> your gather proposal could just be a :buffer($size) adverb/trait
> in the slurpy list declaration and some auto-threading behaviour
> of the pipe operators ==> and <==.

Would the named adverbs for gather work in other contexts as well?
Would you suggest this mechanism for specifying the buffering behavior
for IO operations?

Tim Bray recently wrote about wanting a language that would be smarter
about IO buffering by default. Will perl6 be such a language?

From "Radical New Language Idea"
(http://www.tbray.org/ongoing/When/200x/2005/09/13/Buffering)

"I had this program that was running slow and fixed the problem by
fixing the I/O buffering. If I had a quarter for every time I’ve done
this over my career, I’d have, well, two or three bucks. I think the
language-design community ought to take notice. Currently, they cook up
new languages so object-oriented that each individual bit has a method
repertoire, languages so concurrent that threads execute in
incommensurable parallel universes, languages so functional that their
effects are completely unobservable... How about a radical new language
where the runtime system is hyperaggressive about ensuring that all of
its I/O primitives are by default buffered unless the programmer
specifically requests otherwise? I’ve heard that there’s something
called “COBOL” that has this capability, I’ll have to check it out."

Apologies if I've misunderstood things so badly as to make my questions
senseless, or worse yet, irrelevant.

/au

Stuart Cook

unread,

Sep 20, 2005, 6:55:32 PM9/20/05

to perl6-l...@perl.org

On 19/09/05, Yuval Kogman <nothi...@woobling.org> wrote:
> This solution lacks the elegance of the lazy loading approach, but
> has the best responsiveness. These implementations tend to be overly
> complex for what they do, and hence not worth the maintenance costs.
>
> The gain is that the user only has to wait for the first message,
> and if the throughput of the system is higher than the user's input,
> there is no responsiveness loss because something is always ready
> for the user to get at.
>
> Ideally we could have the lazy list approach have some kind of way
> to modify the lazyness so that it's something in between -
> background eagerness - the lazy list is resolved in the background,
> accumilating to @messages. When @messages is filling up quickly the
> background resolution thread might get a lower priority. when
> @messages is empty, the receiver side has to block.

Initial reaction: I like it!

It's one of those "I always subconsciously wanted to do this, but
never knew it" ideas.

Essentially, if lazy means "don't force items until I ask for them",
and strict means "everything must be forced up-front", then async
relaxes both restrictions, saying "I don't need this list to be forced
up-front, but I also don't care if they end up forced before I use
them--just make each access as quick as you can".

Obviously, applying this to an /infinite/ list might be a bad idea
(unless you could specify a finite buffering limit)...

> Note: I have an idea of how easy I want this to be, but not how I
> want to do it.
>
> I think that a nice solution is to allow an optional named adverb to
> gather which defaults to lazy:

More generally, it would be nice to have a sub &async_force (or
whatever) that takes a lazy list and produces an asynchronously
pre-forced one. Shortcuts for common cases (like gather/take) would be
nice too, though.

Stuart

Yuval Kogman

unread,

Sep 21, 2005, 3:24:40 AM9/21/05

to Austin Frank, TSa, perl6-l...@perl.org

On Tue, Sep 20, 2005 at 14:47:33 -0400, Austin Frank wrote:
> Would the named adverbs for gather work in other contexts as well?
> Would you suggest this mechanism for specifying the buffering
> behavior for IO operations?

See scook's email below... I think that yes. Here is a reference
implementation making heavy use of coros:

sub buffer_lazy (*@lazy, +$threshold = 100) { # renamed scook's force_async
my @buffer;

state Sem $collecting;

if (!@buffer) { # blocking
push @buffer, shift @lazy;
}

if (@buffer.elems < $threshold and $collecting.lock(:nonblocking)) {
async {
LEAVE { $colecting.release };

while (@buffer.elems < $threshold) {
push @buffer , shift @lazy; # or splice it splice returns a lazy list
}
}
}

yield shift @buffer;
}

my @lazy = buffer_lazy @other_lazy; # using @lazy should be more responsive

and where gather is

sub gather (&body, +$async, +$threshold){
my @result_list = {
temp sub *take (*@stuff) {
&OUTER::yield(@stuff); # how do we do this? Maybe $?CALLER_CONTINUATION.yield?
}
body();
};

return $async
?? buffer_lazy @result_list :threshold($threshold)
!! @result_list;
}

> Tim Bray recently wrote about wanting a language that would be
> smarter about IO buffering by default. Will perl6 be such a
> language?

I think it already is about making life easy for the programmer. For
example, the 'will prompt' trait gives autoprompting to a read
handle, by making each read go to another handle, print, flush it
smartly for interactivity, and then actually read the data. This
makes the programmer's life less painful in many ways by
encapsulating a common problem in a standard library definition.

As for Tim's problem, I don't know what exactly Tim means, since I
don't know what he fixed in his program, but "I/O primitives are by
default bufferred unless the programmer specifically requested
otherwise" sounds a lot like perl 5 and $| ;-)

I think that Perl 6 needs more than this to be "smarter about IO
buffering", but I'm not sure I want it to be.

--
() Yuval Kogman <nothi...@woobling.org> 0xEBD27418 perl hacker &

/\ kung foo master: /me beats up some cheese: neeyah!!!!!!!!!!!!!!!!!

0 new messages