
split on empty string


Gaal Yahas

Jan 17, 2006, 12:24:14 PM
to perl6-l...@perl.org
While cleaning up tests for release:

"".split(':') =>

() # Perl 5
("",) # pugs

Which is correct? It doesn't seem to be specced yet.

--
Gaal Yahas <ga...@forum2.org>
http://gaal.livejournal.com/

Mark Reed

Jan 17, 2006, 12:35:57 PM
to Gaal Yahas, perl6-l...@perl.org
On 2006-01-17 12:24 PM, "Gaal Yahas" <ga...@forum2.org> wrote:

> While cleaning up tests for release:
>
> "".split(':') =>
>
> () # Perl 5
> ("",) # pugs
>
> Which is correct? It doesn't seem to be specced yet.

I would prefer the current pugs behavior; it's consistent with the general
case, in which a string which does not match the splitting regex results in
a single-item list containing the original string. This is the Python
behavior.

I find the Perl5 (and, surprisingly, Ruby) behavior rather counterintuitive.
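
For reference, the Python behavior mentioned above can be checked with a short sketch. The helper name `split_fields` is hypothetical and exists only for this illustration; it simply wraps Python's built-in `str.split` with an explicit delimiter.

```python
def split_fields(target, delimiter):
    """Return the list of fields str.split yields for target on delimiter."""
    return target.split(delimiter)


# Empty string as the target: Python returns one empty field.
empty_target = split_fields("", ":")

# Non-empty target that does not contain the delimiter:
# the original string comes back as the only field.
no_match = split_fields("bar", "foo")

print(empty_target)  # ['']
print(no_match)      # ['bar']
```

Both calls yield a single-item list, which is the consistency argument being made for the pugs result.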


Jonathan Scott Duff

Jan 18, 2006, 1:18:03 AM
to Mark Reed, Gaal Yahas, perl6-l...@perl.org

FWIW, I agree with Mark.

-Scott
--
Jonathan Scott Duff
du...@pobox.com

David K Storrs

Jan 18, 2006, 10:04:18 AM
to perl6-l...@perl.org

On Jan 18, 2006, at 1:18 AM, Jonathan Scott Duff wrote:

> On Tue, Jan 17, 2006 at 12:35:57PM -0500, Mark Reed wrote:
>> On 2006-01-17 12:24 PM, "Gaal Yahas" <ga...@forum2.org> wrote:
>>> [split on empty string] doesn't seem to be specced yet.
>>
>> I would prefer the current pugs behavior; it's consistent with the
>> general case, in which a string which does not match the splitting
>> regex results in a single-item list containing the original string.
>> This is the Python behavior.
>>
>> I find the Perl5 (and, surprisingly, Ruby) behavior rather
>> counterintuitive.
>
> FWIW, I agree with Mark.
>
> -Scott

Just to show the opposite view, I've always found that behavior (i.e.
returning the original string unchanged) confusing. C<split> works
based on sequential examination of the target string to locate
matching substrings on which to split. There is a matching "empty
string" substring between every character. Naturally, what you get
back is an array of characters.

Plus, it's a useful idiom.

--Dks

Mark Reed

Jan 18, 2006, 10:21:37 AM
to David K Storrs, perl6-l...@perl.org
On 2006-01-18 10:04 AM, "David K Storrs" <dst...@dstorrs.com> wrote:
> Just to show opposite, I've always found that behavior (i.e.
> returning the original string unchanged) confusing. C<split> works
> based on sequential examination of the target string to locate
> matching substrings on which to split. There is a matching "empty
> string" substring between every character. Naturally, what you get
> back is an array of characters.
>
> Plus, it's a useful idiom.

You misunderstand. We're not talking about splitting a string with the
empty string as the delimiter. We're talking about splitting the empty
string (as the target), with any delimiter whatsoever. If you do that in
Perl 5, what you get back is an empty array.

Perl6 "".split(/whatever/) is equivalent to split(/whatever/,"") in Perl5.
It's not like the silly Python idiom where the invocant is the delimiter
string. :)

So this is the Perl5 to which I was referring:

my @result = split(/whatever/, "");
print ~~@result,"\n"; # --> 0

The general case I was comparing it to is when the target string is *not*
empty but doesn't match, in which case you get the original string back:

my @result = split(/foo/, "bar");
print ~~@result,"\n"; # --> 1
print $result[0],"\n"; # --> "bar"

Jonathan Lang

Jan 18, 2006, 10:26:12 AM
to Mark Reed, David K Storrs, perl6-l...@perl.org
Mark Reed wrote:
> Perl6 "".split(/whatever/) is equivalent to split(/whatever/,"") in Perl5.

I'm hoping that the perl 5 syntax will still be valid in perl 6.

--
Jonathan "Dataweaver" Lang

Larry Wall

Jan 18, 2006, 2:10:46 PM
to perl6-l...@perl.org
On Tue, Jan 17, 2006 at 07:24:14PM +0200, Gaal Yahas wrote:
: While cleaning up tests for release:
:
: "".split(':') =>
:
: () # Perl 5
: ("",) # pugs
:
: Which is correct? It doesn't seem to be specced yet.

This has nothing to do with splitting on the empty string, per se, but
with Perl 5 stripping trailing null fields by default:

@results = split(/:/,'::::');
print @results + 0; # prints '0'

I think we could make that behavior optional now with :trim or some
such. The original motivation has largely gone away with the advent
of autochomping filehandles, so split /\s+/ won't produce a spurious
null field after the \n.

Larry
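
The trailing-field stripping described above can be sketched in Python, whose `str.split` keeps trailing empty fields. This is only an emulation under stated assumptions: the function name `split_trim_trailing` is hypothetical, and it is not the proposed `:trim` option or Perl's actual implementation.

```python
def split_trim_trailing(target, delimiter):
    """Split target on delimiter, then drop trailing empty fields.

    Emulates the Perl 5 default described above; leading and interior
    empty fields are kept.
    """
    fields = target.split(delimiter)
    while fields and fields[-1] == "":
        fields.pop()
    return fields


all_delims = split_trim_trailing("::::", ":")   # every field is empty
empty_target = split_trim_trailing("", ":")     # the case from the thread
mixed = split_trim_trailing(":a:b::", ":")      # leading empty field survives

print(len(all_delims))  # 0
print(empty_target)     # []
print(mixed)            # ['', 'a', 'b']
```

Under this rule the empty target needs no special case: its single empty field is a trailing empty field, so it is stripped like any other.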

Juerd

Jan 18, 2006, 3:08:07 PM
to Jonathan Lang, perl6-l...@perl.org
Jonathan Lang skribis 2006-01-18 7:26 (-0800):

> Mark Reed wrote:
> > Perl6 "".split(/whatever/) is equivalent to split(/whatever/,"") in Perl5.
> I'm hoping that the perl 5 syntax will still be valid in perl 6.

Don't worry, it is.


Juerd
--
http://convolution.nl/maak_juerd_blij.html
http://convolution.nl/make_juerd_happy.html
http://convolution.nl/gajigu_juerd_n.html

Larry Wall

Jan 18, 2006, 3:13:32 PM
to perl6-l...@perl.org
On Wed, Jan 18, 2006 at 09:08:07PM +0100, Juerd wrote:
: Jonathan Lang skribis 2006-01-18 7:26 (-0800):

: > Mark Reed wrote:
: > > Perl6 "".split(/whatever/) is equivalent to split(/whatever/,"") in Perl5.
: > I'm hoping that the perl 5 syntax will still be valid in perl 6.
:
: Don't worry, it is.

Yep, it's mostly just the semantics we're screwing around with. :-)

Larry
