Church Of The
Swimming Elephant


split function

Split up a string using a regexp delimiter

    split /PATTERN/,EXPR,LIMIT
    split /PATTERN/,EXPR
    split /PATTERN/
    split

Splits a string into an array of strings, and returns it. By default, empty leading fields are preserved, and empty trailing ones are deleted.
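To make the handling of empty fields concrete, here is a minimal sketch. The sample string and the variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical record: one empty leading field and two empty trailing fields.
my $record = ',a,b,,';
my @fields = split /,/, $record;

# The empty leading field is kept; the empty trailing fields are dropped.
print scalar(@fields), " fields: ", join('|', map { "[$_]" } @fields), "\n";
# prints: 3 fields: []|[a]|[b]
```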

If not in list context, returns the number of fields found and splits into the @_ array. (In list context, you can force the split into @_ by using ?? as the pattern delimiters, but it still returns the list value.) The use of implicit split to @_ is deprecated, however, because it clobbers your subroutine arguments.

If EXPR is omitted, splits the $_ string. If PATTERN is also omitted, splits on whitespace (after skipping any leading whitespace). Anything matching PATTERN is taken to be a delimiter separating the fields. (Note that the delimiter may be longer than one character.)

If LIMIT is specified and positive, splits into no more than that many fields (though it may split into fewer). If LIMIT is unspecified or zero, trailing null fields are stripped (which potential users of pop() would do well to remember). If LIMIT is negative, it is treated as if an arbitrarily large LIMIT had been specified.
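The three LIMIT cases can be compared side by side. This is only a sketch; the input string and variable names are invented for the example.

```perl
use strict;
use warnings;

# Hypothetical input: three real fields followed by three empty ones.
my $line = 'a:b:c:::';

my @default  = split /:/, $line;       # no LIMIT: trailing empty fields stripped
my @two      = split /:/, $line, 2;    # positive LIMIT: at most two fields
my @negative = split /:/, $line, -1;   # negative LIMIT: trailing empty fields kept

print scalar(@default),  "\n";   # prints 3
print scalar(@two),      "\n";   # prints 2
print "$two[1]\n";               # prints b:c:::
print scalar(@negative), "\n";   # prints 6
```

A negative LIMIT is the usual choice when the number of columns matters, for example when reading delimited records whose last columns may be empty.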

A pattern matching the null string (not to be confused with a null pattern //, which is just one member of the set of patterns matching a null string) will split the value of EXPR into separate characters at each point it matches that way. For example:

    print join(':', split(/ */, 'hi there'));

produces the output 'h:i:t:h:e:r:e'.
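The empty pattern // is the simplest pattern that matches the null string, so it breaks a string into single characters. A short sketch, with a made-up input string:

```perl
use strict;
use warnings;

# Splitting on the empty pattern yields one field per character.
my @chars = split //, 'abc';

print join('-', @chars), "\n";   # prints a-b-c
print scalar(@chars), "\n";      # prints 3
```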

The LIMIT parameter can be used to split a line partially:

    ($login, $passwd, $remainder) = split(/:/, $_, 3);

When assigning to a list, if LIMIT is omitted, Perl supplies a LIMIT one larger than the number of variables in the list, to avoid unnecessary work. For the list above LIMIT would have been 4 by default. In time-critical applications it behooves you not to split into more fields than you really need.

If the PATTERN contains parentheses, additional array elements are created from each matching substring in the delimiter.

    split(/([,-])/, "1-10,20", 3);

produces the list value

    (1, '-', 10, ',', 20)
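For comparison, the same string can be split with and without the capturing parentheses. This sketch reuses the string from the example above; the variable names are illustrative.

```perl
use strict;
use warnings;

my $range = '1-10,20';

# Without parentheses, the delimiters are discarded.
my @plain = split /[,-]/, $range;

# With parentheses, each matched delimiter becomes an extra element.
my @with_delims = split /([,-])/, $range;

print join(' ', @plain), "\n";         # prints 1 10 20
print join(' ', @with_delims), "\n";   # prints 1 - 10 , 20
```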

If you had the entire header of a normal Unix email message in $header, you could split it up into fields and their values this way:

    $header =~ s/\n\s+/ /g;  # fix continuation lines
    %hdrs   =  (UNIX_FROM => split /^(\S*?):\s*/m, $header);
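To see what the two statements do, here is a sketch that runs them on an invented header. The addresses, the date, and the folded Subject line are sample data, not taken from any real message. Each value keeps its trailing newline after the split, so the sketch adds a chomp step that is not part of the original snippet.

```perl
use strict;
use warnings;

# Hypothetical header: a Unix "From " line, then three fields,
# one of which is folded onto a continuation line.
my $header = <<'END';
From alice@example.com Mon Jan  1 00:00:00 2001
From: alice@example.com
Subject: a long
    subject line
To: bob@example.com
END

$header =~ s/\n\s+/ /g;  # fix continuation lines
my %hdrs = (UNIX_FROM => split /^(\S*?):\s*/m, $header);

# Added for this sketch: strip the newline left on each value.
for my $value (values %hdrs) {
    chomp $value;
}

print "$_ => $hdrs{$_}\n" for sort keys %hdrs;
```

The first element returned by split is everything before the first "Name:" line, which is why it is paired with the UNIX_FROM key.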

The pattern /PATTERN/ may be replaced with an expression to specify patterns that vary at runtime. (To do runtime compilation only once, use /$variable/o.)
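A sketch of a pattern chosen at run time follows. It uses a precompiled qr// object rather than the /o modifier mentioned above, which is a substitution made for this example; the delimiter and the variable names are also assumptions. The \Q...\E escape matters because the delimiter is interpreted as a regular expression, and an unescaped | would match the null string.

```perl
use strict;
use warnings;

# Hypothetical delimiter that is only known at run time.
my $delimiter = '|';

# Compile the pattern once, quoting any regexp metacharacters.
my $pattern = qr/\Q$delimiter\E/;

my @columns = split $pattern, 'x|y|z';

print join(' ', @columns), "\n";   # prints x y z
```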

As a special case, specifying a PATTERN of space (' ') will split on white space just as split() with no arguments does. Thus, split(' ') can be used to emulate awk's default behavior, whereas split(/ /) will give you as many null initial fields as there are leading spaces. A split() on /\s+/ is like a split(' ') except that any leading whitespace produces a null first field. A split() with no arguments really does a split(' ', $_) internally.
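The three variants described above can be compared on one string. This is a sketch with an invented input that has two leading spaces and a doubled space in the middle.

```perl
use strict;
use warnings;

my $text = '  foo bar  baz';

my @awk_style    = split ' ',    $text;   # leading whitespace skipped
my @single_space = split / /,    $text;   # every single space is a delimiter
my @whitespace   = split /\s+/,  $text;   # leading whitespace gives an empty first field

print scalar(@awk_style),    "\n";   # prints 3
print scalar(@single_space), "\n";   # prints 6
print scalar(@whitespace),   "\n";   # prints 4
```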


Example:

    open(PASSWD, '/etc/passwd') or die "Can't open /etc/passwd: $!";
    while (<PASSWD>) {
        ($login, $passwd, $uid, $gid,
         $gcos, $home, $shell) = split(/:/);
        # ... process one record ...
    }
    close(PASSWD);

(Note that $shell above will still have a newline on it. See chop, chomp, and join.)
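The effect of the trailing newline, and one way to remove it with chomp before splitting, can be sketched as follows. The passwd-style line is sample data and the variable names are illustrative.

```perl
use strict;
use warnings;

# Hypothetical line as it would arrive from a filehandle, newline included.
my $line = "root:x:0:0:root:/root:/bin/sh\n";

# Splitting the raw line leaves the newline on the last field.
my @raw = split /:/, $line;

# Removing the newline from a copy first gives a clean last field.
chomp(my $clean = $line);
my @fields = split /:/, $clean;

print length($raw[-1]), "\n";      # prints 8
print length($fields[-1]), "\n";   # prints 7
```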


All pages ©1999, 2000, 2001, 2002, 2003 Church of the Swimming Elephant unless otherwise stated
Church of the Swimming Elephant©1999, 2000, 2001, 2002, 2003 is a wholly owned subsidiary of Packetderm, LLC.
