PERL

Manual

Introduction to Perl

Simple Examples

Options

How to Find Script?

Data Types and Objects

Syntax

Perl Regular Expression

Pattern Operations

Substitution

Procedure

Predefined Variables

CHMOD

FOR LOOP

SWITCH

CHOP

Pass array by reference

Maximum of an array

Join Two Lines

Local Array

Use of Regular expression

FAQ

Thu May 31 08:50:39 CST 2012 Untitled Document

Introduction to Perl

A Text Manipulation Language

Combines the power of shell, awk, sed, and (arguably) C in one more complete language

Multi-dimensional regular and associative arrays

String manipulation and pattern match capabilities

Powerful regular expressions

Rich set of built-in variables

Strength

Awk/Sed syntax flavor

Easy to learn for shell/awk/sed programmers

Much less I/O

Freely available (under the Artistic License or the GPL, not strictly public domain), very popular

Shortcomings

Not so clean

Mixture of different paradigms

Suggestions

use shell/awk/sed whenever possible

use perl when shell/awk/sed get too complicated

use perl when the data must be held in memory (in core)

Data Types

$days	a scalar variable
$days[28]	29th element of array @days
$days{'Feb'}	one value from an associative array
@days	($days[0], $days[1],... $days[n])
@days[3,4,5]	same as @days[3..5]
%days	entire associative arrays

String Manipulation Operators

.	Concatenation of two strings
.=	The concatenation assignment operator
eq	String equality
ne	String inequality (!= is numeric inequality)
lt	String less than
gt	String greater than
le	String less than or equal
ge	String greater than or equal
cmp	String comparison, returning -1, 0, or 1
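The difference between the string operators above and their numeric counterparts is easiest to see on values that are valid both as strings and as numbers. The following is a minimal sketch; the sample values and variable names are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical sample data: strings that are also valid numbers.
my @items = ('10', '9', '100');

# cmp compares as strings, <=> compares as numbers.
my @by_string = sort { $a cmp $b } @items;   # '10', '100', '9'
my @by_number = sort { $a <=> $b } @items;   # '9', '10', '100'

# eq and == can disagree on the same operands.
my $same_string = ('1.0' eq '1') ? 1 : 0;    # 0: the strings differ
my $same_number = ('1.0' == 1)   ? 1 : 0;    # 1: both are numerically 1

print "string order: @by_string\n";
print "number order: @by_number\n";
print "eq: $same_string, ==: $same_number\n";
```

Choosing the wrong family of operators is a common source of silent bugs, since neither form raises an error on these inputs.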

Regular Expressions

all	unix regular expressions
\w	matches an alphanumeric character (including "_")
\W	matches a nonalphanumeric
\b	matches a word boundary
\B	matches a non-boundary
\s	matches a whitespace character
\S	matches a non-whitespace character
\d	matches a numeric character
\D	matches a non-numeric character

{n,m}	occur at least n times but no more than m times
{n,}	occur at least n times
{n}	occur exactly n times
*	0 or more times
+	1 or more times
?	0 or 1 time

$+	returns whatever the last bracket-match matched
$&	returns the entire matched string
$`	returns everything before the matched string
$'	returns everything after the matched string


Simple Examples

grep

$pattern = shift(@ARGV);    # $pattern = first argument
while (<>) {                # rest of @ARGV are filenames
    if (/$pattern/) {
        print;
    }
}

Print non-null lines

while (<>) { chop; if ($_ ne "" ){ print $_, "\n"; } }

while (<>) { print unless $_ eq "\n" ; }

while (<>) { print unless /^$/; }

while (<>) { s/^\n$//; print; }


Options

Option Combination

Single-character options may be combined.

#!/usr/bin/perl -spi.bak    # same as -s -p -i.bak

Opt	Args	Explanation
-0	digits	Record Separator
-D	number	Sets debugging flags
-e	"commandline"	One line of script on command line
-i	extension	Files processed by <> are edited in place
-I	directory	Path name for include files.
-l	octnum	Line-ending processing mode

Opt	Explanation
-a	Autosplit mode with -n or -p
-c	Check the syntax only
-d	Debugging mode
-n	Automatic loop like "sed -n" or awk (lines are not printed)
-p	Automatic loop like sed (each line is printed)
-P	Invoke cpp before compilation
-s	Rudimentary switch mode
-S	use the PATH environment variable to search for the script
-u	Core dump after compiling
-U	allows perl to do unsafe operations.
-v	prints the version and patchlevel
-w	Print warning message
-x	script is embedded in a message

-0 digits

Record separator: specifies the record separator ($/) as an octal number

the null character, if no digits are given

00: slurp files in paragraph mode

0777: slurp files whole (0777 is not a valid character)

-a

Autosplit mode with -n or -p

the result of the split is saved in @F

perl -ane 'print pop(@F), "\n";'

the above command is equivalent to

while (<>) { @F = split(' '); print pop(@F), "\n"; }
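To see what the autosplit loop does without running perl from the command line, the same split can be applied to lines held in an array. This is only a sketch: the @lines data is invented and stands in for what <> would read.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for what <> would read.
my @lines = ("alpha beta gamma\n", "  one two\n", "solo\n");

my @last_fields;
for my $line (@lines) {
    # split(' ') skips leading whitespace and splits on runs of
    # whitespace, which is how -a fills @F.
    my @F = split(' ', $line);
    push @last_fields, pop(@F);
}

print "$_\n" for @last_fields;
```

Because the trailing newline counts as whitespace, the last field comes back without it, which is why the one-liner adds "\n" when printing.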

-D number

Sets debugging flags

-D14	to watch how it executes your script
-D1024	lists your compiled syntax tree
-D512	displays compiled regular expressions

-e "commandline"

One line of script on command line

Multiple -e commands may be given for a multi-line script.

-i extension

files processed by <> are edited in place:

rename the input file

open the output file by the same name

select that output file as the default for print statements

the extension is added to file name to make a backup copy

no backup if no extension

e.g. perl -p -i.bak -e "s/foo/bar/;"

-I directory

Path name for include files

default directories: /usr/include and /usr/lib/perl

-l octnum

Line-ending processing mode

automatically chops the line terminator when used with -n or -p

assigns $\ to octnum (default is $/)

add line terminator back automatically in each print statement.

e.g. to trim lines to 80 columns:

perl -lpe 'substr($_, 80) = ""'

-n

Automatic loop like "sed -n" or awk

lines are not printed by default

Equivalent to

while (<>) {
    ...    # your script goes here
}

e.g. to delete all files older than a week:

find . -mtime +7 -print | perl -nle 'unlink;'

-p

Automatic loop like sed

lines are printed by default

Equivalent to

while (<>) {
    ...    # your script goes here
} continue {
    print;
}

-P

Invoke cpp before compilation

Because both comments and cpp directives begin with #, avoid starting comments with any word recognized by the C preprocessor, such as "if", "else" or "define".

-s

Rudimentary switch mode

each switch on the command line after the script name sets a variable of the same name

the switch is then removed from @ARGV

e.g. the following prints "true" if the script is invoked with a -xyz switch

#!/usr/bin/perl -s
if ($xyz) { print "true\n"; }

-u

Core dump after compiling

You can then take this core dump and turn it into an executable file by using the undump program (not supplied).

-U

allows perl to do unsafe operations.

e.g. unlink directories while running as superuser

-w

Print warning message

identifiers that are mentioned only once

scalar variables that are used before being set

redefined subroutines

references to undefined filehandles

write to read-only filehandles

use == on values that don't look like numbers

subroutines recurse more than 100 deep

-x

script is embedded in a message


How to Find Script?

Upon startup, perl looks for your script in one of the following places:

Command Line

-e switches on the command line

The first filename on the command line

STDIN

read implicitly from standard input

this only works if there are no filename arguments

to pass arguments to a script read from STDIN, specify - explicitly as the script name


Data Types and Objects

Data Types

scalars

arrays of scalars (indexed by number from 0)

associative arrays of scalars (indexed by string)

Context Dependent type determination

String, Numeric, Array.

Some operations return array or scalar values depending on contexts

Operations that return scalars don't care whether the context is looking for a string or a number,

but scalar variables and values are interpreted as strings or numbers as the context requires.

In boolean context:

FALSE: the null string, 0, or "0".

TRUE: Otherwise (boolean operations return 1)
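The boolean rules above have a few edge cases that are worth checking directly. The sketch below tests a handful of values; the labels and the %truth hash are illustrative names only.

```perl
use strict;
use warnings;

# Candidate values and a display label for each.
my @cases = (
    [ 'empty string', ''    ],
    [ 'number 0',     0     ],
    [ 'string "0"',   '0'   ],
    [ 'string "0.0"', '0.0' ],
    [ 'string "00"',  '00'  ],
    [ 'space',        ' '   ],
);

my %truth;
for my $case (@cases) {
    my ($label, $value) = @$case;
    $truth{$label} = $value ? 'TRUE' : 'FALSE';
    print "$label => $truth{$label}\n";
}

# Comparison operators themselves yield 1 for true and '' for false.
my $yes = (1 == 1);
my $no  = (1 == 2);
```

Note that "0.0" and "00" are true as strings even though they are numerically zero; only the exact string "0" is false.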

Variable Names

Names which start with a letter may also contain digits and underscores.

Names which do not start with a letter are limited to one character,

e.g. "$%" or "$$".

most of the one character names have been predefined.

Variables

Single Variables (denoted by '$')

$days	# a simple scalar variable
$days[28]	# 29th element of array @days
$days{'Feb'}	# one value from an associative array
$#days	# last index of array @days
${days}	# same as $days

Entire arrays or array slices (denoted by '@')

@days	# ($days[0], $days[1],... $days[n])
@days[3,4,5]	# same as @days[3..5]
@days{'a','c'}	# same as ($days{'a'},$days{'c'})

Entire associative arrays (denoted by '%')

%days	# (key1, val1, key2, val2 ...)

Context dependent Assignments

Assignment to a scalar evaluates the righthand side in a scalar context

assignment to an array or array slice evaluates the righthand side in an array context.

Assigning to $#days changes the length of the array.

e.g. Nullify an array (either form works)

@whatever = ();
$#whatever = $[ - 1;

When an array is evaluated in a scalar context, it returns the length of the array.

scalar(@whatever) == $#whatever - $[ + 1;

When an associative array is evaluated in a scalar context, it returns a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash.
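The relationship between an array's length and its last index can be checked with a short script. This sketch assumes the default array base of 0 (that is, $[ has not been changed); the array contents are arbitrary.

```perl
use strict;
use warnings;

my @whatever = ('a', 'b', 'c', 'd');

my $length     = scalar(@whatever);   # array in scalar context: 4
my $last_index = $#whatever;          # last index: 3

# Assigning to $#whatever truncates (or extends) the array.
$#whatever = 1;
my $after_truncate = scalar(@whatever);   # 2

# Assigning the empty list nullifies it.
@whatever = ();
my $after_clear = scalar(@whatever);      # 0

print "$length $last_index $after_truncate $after_clear\n";
```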

Multi-dimensional arrays

not directly supported, but can be emulated

String literals

'string', "string", `string`

same as shell's string literals

"string" is subject to \ and variable substitution;

but 'string' is not (except for \' and \\).

Special Characters

\t	tab
\n	newline
\r	return
\f	form feed
\b	backspace
\a	alarm (bell)
\e	escape
\033	octal char
\x1b	hex char
\c[	control char
\l	lowercase next char
\u	uppercase next char
\L	lowercase till \E
\U	uppercase till \E
\E	end case modification

Strings can contain \n

Variable substitution inside strings is limited to:

scalar variables

normal array values

array slices

$Price = '$100';                    # not interpreted
print "The price is $Price.\n";     # interpreted

put {} around the identifier to delimit it from any following alphanumeric characters

a single quoted string must be separated from a preceding word by a space, since single quote is a valid character in an identifier

Numeric literals

12345
12345.67
.23E-10
0xffff           # hex
0377             # octal
4_294_967_296

Special Literals

__LINE__ (current line number)

__FILE__ (current filename)

They may only be used as separate tokens;

they will not be interpolated into strings.

__END__ or ^D or ^Z (logical end of the script)

Any following text is ignored, but may be read via the DATA filehandle.

Unquoted string literals

a word that doesn't have any other interpretation in the grammar will be treated as if it had single quotes around it.

a word consists only of alphanumeric characters and underscores, and must start with an alphabetic character.

Array value expansion

array values are interpolated into double-quoted strings by joining all the elements of the array with the delimiter specified in the $" variable, a space by default.

$temp = join($",@ARGV); system "echo $temp"; system "echo @ARGV";
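The equivalence between interpolation and an explicit join can be verified without calling system. A minimal sketch, with a made-up @args list:

```perl
use strict;
use warnings;

my @args = ('cc', '-E', 'main.c');   # hypothetical argument list

my $default = "@args";               # joined with a single space
my $joined  = join($", @args);       # the explicit equivalent

my $custom;
{
    local $" = ', ';                 # change the separator in this block only
    $custom = "@args";
}

print "$default\n$custom\n";
```

Using local keeps the change to $" from leaking into the rest of the program.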

Here-Is (same as shell)

begins with <<ANY-STRING

the quoted text is all lines following the current line, down to the terminating line

the terminating line consists of ANY-STRING alone

print <<EOF;    # same as above
The price is $Price.
EOF

print <<"EOF";  # same as above
The price is $Price.
EOF

print << x 10;  # null identifier is delimiter
Merry Christmas!

print <<`EOC`;  # execute commands
echo hi there
echo lo there
EOC

print <<foo, <<bar;    # you can stack them
I said foo.
foo
I said bar.
bar
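How the terminator is quoted decides whether the here-document is interpolated. The sketch below stores two here-documents in variables so the difference can be inspected; the variable names are illustrative.

```perl
use strict;
use warnings;

my $Price = '$100';

# Double-quoted (or bare) terminator: variables are interpolated.
my $interpolated = <<"EOF";
The price is $Price.
EOF

# Single-quoted terminator: the text is taken verbatim.
my $verbatim = <<'EOF';
The price is $Price.
EOF

print $interpolated, $verbatim;
```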

Ambiguity within search patterns

/$foo[bar]/ is /${foo}[bar]/ or /${foo[bar]}/ ?

Is [bar] a character class or the subscript to array @foo ?

If @foo doesn't exist, then it's obviously a character class.

If @foo exists, perl will take a good guess.

Array literals:

(comma separated list)

In a non-array context,

the final element of the array literal is used.

@foo = ('cc', '-E', $bar);	# assign entire list to @foo
$foo = ('cc', '-E', $bar);	# assign $bar to $foo
$foo = @foo;	# $foo gets 3 (the length of @foo)

When a LIST is evaluated,

each element of the list is evaluated in an array context,

(@foo,@bar,&SomeSub)	contains all the elements of @foo and @bar and all the elements returned by the subroutine named SomeSub.

A list value may also be subscripted

$time = (stat($file))[8];    # stat returns array value
$digit = ('a','b','c','d','e','f')[$digit-10];
return (pop(@foo),pop(@foo))[0];

Array lists may be assigned to

each element of the list is an lvalue:

final element may be an array or an associative array

($a, $b, $c) = (1, 2, 3);

($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00);

($a, $b, @rest) = split;

local($a, $b, %rest) = @_;

Associative array literal

sequence of key, value pairs

%map = ('red',0x00f,'blue',0x0f0,'green',0xf00);

Array assignment in a scalar context

returns the number of elements produced by the expression on the right side of the assignment:

$x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2
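The count returned by a list assignment is also the basis of a common counting idiom. The following sketch shows both; the $text sample is invented.

```perl
use strict;
use warnings;

my ($foo, $bar);

# The assignment itself, in scalar context, yields the number of
# right-hand elements (3), not the number of variables filled (2).
my $x = (($foo, $bar) = (3, 2, 1));

# A common use: count matches by assigning them to an empty list.
my $text  = 'a1b22c333';
my $count = () = $text =~ /\d+/g;    # 3 runs of digits

print "$x $count\n";
```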

`shell command` (same as shell)

expands the string like a double-quoted string.

interprets the result as a shell command,

output a string in a scalar context,

output an array in an array context

one line for each element

File Handles

Use <FILEHANDLE> to get the next line of that file

normally the line is assigned to a variable, except that:

in a bare while loop it is assigned to $_ automatically;

while ($_ = <STDIN>) { print; }
while (<STDIN>) { print; }
for (;<STDIN>;) { print; }
print while $_ = <STDIN>;
print while <STDIN>;

Predefined: STDIN, STDOUT, STDERR

Use open to create filehandles

When <FILEHANDLE> is used in an array context,

an array consisting of all the input lines is returned, one line per array element.

Null filehandle <>

used to emulate the behavior of sed and awk.

input from standard input, or

each file listed on the command line.

returns FALSE only once, at the end of input

calling it again after that reads from STDIN

Scalar variable filehandle

(e.g. <$foo>)

the name of the filehandle to input from

File Glob:

(e.g. <non-filehandle string>)

filename pattern to be globbed,

One level of $ interpretation is done first,

<$foo> is treated as a filehandle, so write <${foo}> to force interpretation as a glob.

while (<*.c>) { chmod 0644, $_; }

chmod 0644, <*.c>;


Syntax

Program

Sequence of declarations and commands.

No need to declare except report formats and subroutines

Default value for uninitialized user objects:

null or 0

Program is executed only once

like C, not sed or awk

exception: -n and -p switches

Declaration can be put anywhere

A Free-form Language

Use '#' for comment (like shell)

/* */ is not for comments

Compound Statements

BLOCK {....}

a sequence of commands works as one command

Control Flow

if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
LABEL while (EXPR) BLOCK
LABEL while (EXPR) BLOCK continue BLOCK
LABEL for (EXPR; EXPR; EXPR) BLOCK
LABEL foreach VAR (ARRAY) BLOCK
LABEL BLOCK continue BLOCK

{ } are required, not like C

if (!open(foo)) { die "Can't open $foo: $!"; }
die "Can't open $foo: $!" unless open(foo);
open(foo) || die "Can't open $foo: $!"; # foo or bust!
open(foo) ? 'hi mom' : die "Can't open $foo: $!";

If and Unless

if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

unless (EXPR) BLOCK
unless (EXPR) BLOCK else BLOCK
unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

Block Condition

(EXPR) can be a BLOCK,

return true if the value of the last command is true

While and Until

LABEL [while|until] (EXPR) BLOCK
LABEL [while|until] (EXPR) BLOCK continue BLOCK

(EXPR) can be a BLOCK,

LABEL (name:) is optional

(for loop control: next, last, and redo)

continue BLOCK

(executed before condition is to be evaluated again)

similar to the third part of a "for" loop in C

Until block

the condition is still tested before the first iteration

For, Foreach

FOR

LABEL for (EXPR; EXPR; EXPR) BLOCK

for ($i = 1; $i < 10; $i++) { ... }

is the same as

$i = 1; while ($i < 10) { ... } continue { $i++; }

FOREACH

LABEL foreach VAR (ARRAY) BLOCK

iterates over a normal array value

sets VAR to be each element of the array in turn

VAR is local to loop

the keyword "for" may be used in place of "foreach" for brevity.

VAR can be omitted, $_ is set to each value

If ARRAY is an actual array (not an array expression) you can modify each element of the array by modifying VAR

for (@ary) { s/foo/bar/; }
foreach $elem (@elements) { $elem *= 2; }
for ((10,9,8,7,6,5,4,3,2,1,'BOOM')) { print $_, "\n"; sleep(1); }
for (1..15) { print "Merry Christmas\n"; }
foreach $item (split(/:[\\\n:]*/, $ENV{'TERMCAP'})) { print "Item: $item\n"; }

Single Block

LABEL BLOCK continue BLOCK

loop that executes once

can use loop control to leave or restart

continue block is optional.

Good to emulate a CASE structure

foo: { if (/^abc/) { $abc = 1; last foo; } if (/^def/) { $def = 1; last foo; } if (/^xyz/) { $xyz = 1; last foo; } $nothing = 1; }

foo: { $abc = 1, last foo if /^abc/; $def = 1, last foo if /^def/; $xyz = 1, last foo if /^xyz/; $nothing = 1; }

foo: { /^abc/ && do { $abc = 1; last foo; }; /^def/ && do { $def = 1; last foo; }; /^xyz/ && do { $xyz = 1; last foo; }; $nothing = 1; }

foo: { /^abc/ && ($abc = 1, last foo); /^def/ && ($def = 1, last foo); /^xyz/ && ($xyz = 1, last foo); $nothing = 1; }

if (/^abc/) { $abc = 1; } elsif (/^def/) { $def = 1; } elsif (/^xyz/) { $xyz = 1; } else {$nothing = 1;}
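The bare-block idiom can be wrapped in a subroutine so that it is easy to try on several inputs. This is a sketch only: classify() and the SWITCH label are hypothetical names, and it uses my variables rather than the globals in the examples above.

```perl
use strict;
use warnings;

# classify() is a hypothetical helper that wraps the bare-block idiom.
sub classify {
    my ($line) = @_;
    my $kind;
    SWITCH: {
        if ($line =~ /^abc/) { $kind = 'abc'; last SWITCH; }
        if ($line =~ /^def/) { $kind = 'def'; last SWITCH; }
        if ($line =~ /^xyz/) { $kind = 'xyz'; last SWITCH; }
        $kind = 'nothing';              # default case
    }
    return $kind;
}

print classify($_), "\n" for ('abcdef', 'definition', 'xyzzy', 'other');
```

The block runs once; "last SWITCH" leaves it early, so the default assignment is reached only when no case matched.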

Simple statements

Every simple statement except the last one in a block must end with a ";"

Optional Post modifier

if EXPR while EXPR
unless EXPR until EXPR

do {} while/until EXPR

will execute once before the conditional is evaluated.

do { $_ = <STDIN>; } until $_ eq ".\n";

do { } cannot use loop control statements,

since modifiers don't take loop labels

Expressions

Like C expressions

Perl Special

**	exponentiation operator.
**=	exponentiation assignment operator.
()	The null list, used to initialize an array to null.
.	Concatenation of two strings.
.=	The concatenation assignment operator.
eq	String equality (== is numeric equality).
ne	String inequality (!= is numeric inequality).
lt	String less than.
gt	String greater than.
le	String less than or equal.
ge	String greater than or equal.
cmp	String comparison, returning -1, 0, or 1.
<=>	Numeric comparison, returning -1, 0, or 1.

Perl Special operators (=~, !~, x, x=, ..)

=~

(performs a pattern-oriented operation on the left operand)

the right argument is a search pattern, substitution, or translation.

the left argument is the target object.

$xxx =~ s/TEST/test/g

!~

(same as =~ but negates the return value)

x

(repetition operator)

if the left operand is a list in parentheses, it repeats the list.

print '-' x 80;	# print row of dashes
print '-' x80;	# illegal, x80 is identifier
print "\t" x ($tab/8), ' ' x ($tab%8);	# tab over
@ones = (1) x 80;	# an array of 80 1's
@ones = (5) x @ones;	# set all elements to 5 (the 2nd @ones gives the length of @ones)

x=

(repetition assignment, scalar only)

..

(Range Operator)

In a scalar context ==> returns a boolean value

It works like line addressing exp. in sed or awk.

if (101 .. 200) { print; }	# print 2nd hundred lines
next line if (1 .. /^$/);	# skip header lines
s/^/> / if (/^$/ .. eof());	# quote body

In an array context ==> returns an array

for (101 .. 200) { print; }	# print $_ 100 times
@foo = @foo[$[ .. $#foo];	# an expensive no-op
@foo = @foo[$#foo-4 .. $#foo];	# slice last 5 items
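Both behaviours of the range operator can be exercised on in-memory data. In this sketch the BEGIN/END markers and the @lines array are invented for illustration; the scalar form uses two patterns as its operands instead of line numbers.

```perl
use strict;
use warnings;

# List context: .. builds a list, here used for a slice.
my @items      = ('a' .. 'h');
my @last_three = @items[$#items - 2 .. $#items];   # f, g, h

# Scalar context: .. acts as a flip-flop. It turns on at the line
# matching BEGIN and off again after the line matching END.
my @lines = ("skip\n", "BEGIN\n", "keep 1\n", "keep 2\n", "END\n", "skip\n");
my @inside;
for my $line (@lines) {
    push @inside, $line if $line =~ /^BEGIN/ .. $line =~ /^END/;
}

print "@last_three\n", @inside;
```

The flip-flop includes both the starting and the ending line, which matches how sed and awk address ranges behave.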

File Test

-r -w -x -o	Readable/Writable/Executable/Owned by effective uid/gid.
-R -W -X -O	Readable/Writable/Executable/Owned by real uid/gid.
-e	File exists.
-z -s	File has zero/non-zero size (returns size).
-f -d -l -p -S	plain_file/directory/symbolic_link/named_pipe_(FIFO)/socket
-b -c	is a block/character special file.
-u -g -k	File has setuid/setgid/sticky bit set.
-t	Filehandle is opened to a tty.
-T -B	is a text/binary file.
-M	Age of file in days when script started.
-A	Same for access time.
-C	Same for inode change time.

while (<>) {
    chop;
    next unless -f $_;    # ignore specials
    ...
}

"_" ==> previously tested file

print "Can do.\n" if -r $a || -w _ || -x _;
stat($filename);
print "Readable\n" if -r _;
print "Writable\n" if -w _;
print "Executable\n" if -x _;
print "Text\n" if -T _;
print "Binary\n" if -B _;

Perl Expression

C's Expressions not in Perl

Address_of &	Pointer *	Type casting (TYPE)

++ Auto-increment for alphanumeric strings

if a string matches the pattern /^[a-zA-Z]*[0-9]*$/,
++ will increment it as a string

print ++($foo = '99');	# prints '100'
print ++($foo = 'a0');	# prints 'a1'
print ++($foo = 'Az');	# prints 'Ba'
print ++($foo = 'zz');	# prints 'aaa'

range operator (in an array context) makes use of the magical autoincrement algorithm if the minimum and maximum are strings.

@alphabet = ('A' .. 'Z');	# A to Z
$hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15];	#Hex number
@z2 = ('01' .. '31');
print $z2[$mday];

-- is not magical.
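The magical increment is easy to explore with a small helper. next_label() below is a hypothetical name, not a built-in; it simply applies ++ to a copy of its argument.

```perl
use strict;
use warnings;

# next_label() is a hypothetical helper: it returns the string that
# follows $label under the magical string increment.
sub next_label {
    my ($label) = @_;
    $label++;
    return $label;
}

my @results = map { next_label($_) } ('99', 'a0', 'Az', 'zz', 'a9');
print "@results\n";

# The same algorithm drives string ranges.
my @columns = ('x' .. 'ab');   # x, y, z, aa, ab
print "@columns\n";
```

Each character class carries into the next position, so 'zz' grows to 'aaa' in the same way that '99' grows to '100'.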

Perl Expression

|| and &&

In C: returns 0 or 1

In Perl: returns the last value evaluated.

e.g. a portable way to find out the home directory

$home = $ENV{'HOME'} || $ENV{'LOGDIR'} || (getpwuid($<))[7] || die "You're homeless!\n";
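The same fallback pattern works for any chain of candidate values. The sketch below uses invented configuration hashes (%from_cli, %from_env) in place of the environment so that the result is predictable.

```perl
use strict;
use warnings;

# Hypothetical configuration sources, checked in order of preference.
my %from_cli = (editor => '');      # empty: treated as not set
my %from_env = (editor => 'vi');
my $built_in = 'ed';

# || yields the first operand that is true, not just 0 or 1.
my $editor = $from_cli{editor} || $from_env{editor} || $built_in;

# && yields the last operand evaluated.
my ($left, $right, $zero) = ('left', 'right', 0);
my $both  = $left && $right;    # 'right'
my $short = $zero && $right;    # 0: evaluation stops at the false value

print "$editor $both $short\n";
```

One caveat of this idiom is that a legitimate but false value, such as 0 or the empty string, is skipped as if it were unset.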


Perl Regular Expression

Based on Henry Spencer's Version 8 regular expression routines

.	any character
*	repeat 0 or more times
^	beginning of the line
$	end of line
[ ]	any character in []
[x-y]	any character between x and y
(..)	marked (grouped) pattern

Perl Special

\w	any alphanumeric character (including "_")
\W	non-alphanumeric character (not \w)
\b	word boundary
\B	not \b
\s	white space
\S	not \s
\d	numeric character
\D	not \d

Back Reference

Refer to the substring matched in a pattern search

\<digit>	digit'th substring (within pattern search only)
$1, $2, ..	The nth substring matched in a pattern search
$`	Everything before the matched string
$&	The entire matched string
$'	Everything after the matched string
$+	Last bracket match matched

s/^([^ ]*) *([^ ]*)/$2 $1/; # swap first two words

if (/Time: (..):(..):(..)/) { $hours = $1; $minutes = $2; $seconds = $3; }
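Captured substrings can also be collected directly from the match, and \1 can refer back to a group inside the same pattern. A minimal sketch, with made-up input strings:

```perl
use strict;
use warnings;

my $line = 'Time: 08:50:39';   # hypothetical input line

# In list context a successful match returns the captured groups,
# so they can be named directly instead of copying $1, $2, $3.
my ($hours, $minutes, $seconds) = $line =~ /Time: (..):(..):(..)/;

# \1 refers back to a group inside the same pattern: find a doubled word.
my $text = 'this is is a test';
my ($doubled) = $text =~ /\b(\w+) \1\b/;

print "$hours $minutes $seconds $doubled\n";
```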

Multiline pattern search

set $* to 1 (Perl 5 replaces this with the /m modifier)

^	matches after any newline within the string,
$	matches before any newline.
.	character never matches a newline

e.g. the following leaves a newline on the $_ string:

$_ = <STDIN>;
s/.*(some_string).*/$1/;

If the newline is unwanted, try one of

s/.*(some_string).*\n/$1/;
s/.*(some_string)[^\000]*/$1/;
s/.*(some_string)(.|\n)*/$1/;
chop; s/.*(some_string).*/$1/;
/(some_string)/ && ($_ = $1);

{n,m} Modifier

{n,m}	must occur at least n times but no more than m times
{n}	is {n,n} , must occur exactly n times
{n,}	n or more times
{0,}	*, 0 or more times
{1,}	+ , 1 or more times
{0,1}	? , 0 or 1 time

Other Tips

all backslashed metacharacters are alphanumeric, such as \b, \w, \n.

So anything that looks like \\, \(, \), \<, \>, \{, or \} is always interpreted as a literal character, not a metacharacter. This makes it simple to quote a string that you want to use for a pattern but that you are afraid might contain metacharacters. Simply quote all the non-alphanumeric characters:

$pattern =~ s/(\W)/\\$1/g;
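The effect of quoting can be checked by matching the same text with and without it. The sketch below also calls quotemeta, the Perl 5 built-in that performs the same job; the sample strings are invented.

```perl
use strict;
use warnings;

my $literal = 'a.b*c';          # hypothetical text to match literally

# Hand-rolled quoting: put a backslash before every non-word character.
(my $quoted = $literal) =~ s/(\W)/\\$1/g;   # a\.b\*c

# Perl 5 also provides quotemeta (and the \Q...\E escape) for this.
my $builtin = quotemeta($literal);

my $hit  = ('xa.b*cx' =~ /$quoted/)  ? 1 : 0;   # literal text is present
my $miss = ('aXbbbc'  =~ /$quoted/)  ? 1 : 0;   # no literal "a.b*c" here
my $raw  = ('aXbbbc'  =~ /$literal/) ? 1 : 0;   # metacharacters still active

print "$quoted $hit $miss $raw\n";
```

Without quoting, "." and "*" keep their special meaning, so the unquoted pattern matches text that does not contain the literal string at all.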


Pattern Operations

Format

/PATTERN/gio

m/PATTERN/gio

?PATTERN?    # like /PATTERN/, but matches only once between calls to reset

returns true (1) or false ('').

Default target: $_

Other Target: use =~ or !~ operator

Target can be the result of an expression evaluation,

with the m form, any non-alphanumeric character may be used as the delimiter

Post Modifiers

"g" for global match

"i" for case insensitive

"o" disables run-time variable recompilation

PATTERN may contain references to scalar variables except $) and $|

use an "o" to avoid run time recompilation of variables

In an array context, a match returns:

the subexpressions matched by () in the pattern,

i.e. ($1, $2, $3...).

a null array if the match fails

(1) if the match succeeds but the pattern has no ()

open(tty, '/dev/tty');
<tty> =~ /^y/i && &foo();    # call foo if desired

if (/Version: *([0-9.]*)/) { $version = $1; }

next if m#^/usr/spool/uucp#;

# poor man's grep
$arg = shift;
while (<>) {
    print if /$arg/o;    # compile only once
}

if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))

Post Modifier "g" for Pattern Match

matching as many times as possible within the string.

In Array Context, returns a list of:

all the substrings matched by all the () in the pattern, or

if there are no (), all the matched strings, as if the whole pattern were in ()

in Scalar Context

iterates through the string,
returning TRUE each time it matches, and
FALSE when it eventually runs out of matches.

Don't modify string between matches

# array context
($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);
# scalar context
$/ = "";    # record separator is one or more blank lines
$* = 1;     # multi-line pattern match
while ($paragraph = <>) {
    while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
	 $sentences++;
    }
}
print "$sentences\n";
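The two contexts can be tried on fixed strings instead of live command output. In this sketch $uptime is an invented line in the style of the uptime command, and $paragraph is sample text.

```perl
use strict;
use warnings;

# Hypothetical uptime-style line; a real one would come from `uptime`.
my $uptime = ' 08:50:39 up 3 days, load average: 0.15, 0.31, 1.02';

# List context: every match of the group is returned at once.
my ($one, $five, $fifteen) = ($uptime =~ /(\d+\.\d+)/g);

# Scalar context: each evaluation resumes where the last match ended.
my $paragraph = "Is it so? It is. Very well then!\n";
my $sentences = 0;
$sentences++ while $paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g;

print "$one $five $fifteen\n$sentences\n";
```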


Substitution

Format

s/PATTERN/REPLACEMENT/gieo

Searches a string for a pattern,

if found, replaces that pattern with the replacement text and returns the number of substitutions made.

Otherwise returns false (0).

Default target string is $_

Post modifier

"i"	case insensitive
"o"	avoid variable recompilation in patterns
"g"	replace all matched occurrences
"e"	the replacement string is an expression

Delimiters

Any non-alphanumeric delimiter may replace the slashes;

if '' are used, no interpretation is done on the replacement string
(the e modifier overrides this, however);

if `` are used, the replacement string is a command to execute whose output will be used as the actual replacement text.

If the PATTERN is delimited by bracketing quotes,
the REPLACEMENT has its own pair of quotes.

e.g. s(foo)(bar) or s<foo>/bar/.

the pattern can contain scalar variables

If the PATTERN is a null string,
the most recent successful regular expression is used instead.

s/\bgreen\b/mauve/g;	# don't change wintergreen
$path =~ s|/usr/bin|/usr/local/bin|;
s/Login: $foo/Login: $bar/;	# run-time pattern
($foo = $bar) =~ s/bar/foo/;
s/([^ ]*) *([^ ]*)/$2 $1/;	# reverse 1st two fields
$_ = 'abc123xyz';
s/\d+/$&*2/e;	# yields 'abc246xyz'
s/\d+/sprintf("%5d",$&)/e;	# yields 'abc  246xyz'
s/\w/$& x 2/eg;	# yields 'aabbcc  224466xxyyzz'
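The effect of the "e" modifier and the return value of a substitution can be compared side by side. This sketch works on copies of one sample string; the variable names are illustrative.

```perl
use strict;
use warnings;

my $text = 'abc123xyz';

# /e evaluates the replacement as an expression.
(my $doubled = $text) =~ s/(\d+)/$1 * 2/e;     # abc246xyz

# Without /e the replacement is just interpolated text.
(my $literal = $text) =~ s/(\d+)/$1 * 2/;      # abc123 * 2xyz

# s/// returns the number of substitutions made.
my $copy  = $text;
my $count = ($copy =~ s/([a-z])/\u$1/g);       # 6 letters upper-cased

print "$doubled\n$literal\n$copy ($count)\n";
```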


Procedure

Define

sub subname { Block }

Invoke

&subname ( para1, para2, ...)

Passing Arguments

Call By Reference

the caller's arguments are passed in the array @_

any change to an element of @_ affects the corresponding variable in the calling context

example: define

sub under { foreach $value ( @_ ) { $value += 5; print "$value "; } }

example: calling

@nums = (1,2,3,4,5); &under(@nums);    # pass variables: literal constants cannot be modified

example: execution result

6 7 8 9 10

Call By Value

use local to declare local variables

sub greeting {
    local($fname, $lname) = @_;
    ...
    return $fname;
}
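The two calling styles can be contrasted with a pair of small subroutines. This is a sketch under the assumption of Perl 5, so it uses my for the copies where the text above uses local; both subroutine names are hypothetical.

```perl
use strict;
use warnings;

# add_five_in_place() changes its arguments, because the elements
# of @_ are aliases for the caller's variables.
sub add_five_in_place {
    $_ += 5 for @_;
    return;
}

# add_five_copy() copies @_ first, so the caller's data is untouched.
sub add_five_copy {
    my @values = @_;
    $_ += 5 for @values;
    return @values;
}

my @nums   = (1, 2, 3);
my @copy   = add_five_copy(@nums);   # @nums is still (1, 2, 3)
my @before = @nums;
add_five_in_place(@nums);            # @nums is now (6, 7, 8)

print "@before | @copy | @nums\n";
```

Copying @_ into named variables at the top of a subroutine is the usual way to get call-by-value behaviour.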


Predefined Variables

$_	default input and pattern-searching space (Mnemonic: underline is understood in certain operations.)
$.	current input line number of the last filehandle that was read. Readonly. (Mnemonic: Many editors use . for current line #.)
$/	The input record separator, newline by default. "" treats one or more blank lines as the separator (consecutive blank lines count as one); "\n\n" treats each blank line as a separator. A multi-character delimiter may be used. (Mnemonic: / is used to delimit line boundaries when quoting poetry.)
$,	The output field separator for the print operator (Mnemonic: , separates the fields in a print statement)
$"	The output field separator for the array interpolation
$\	The output record separator for the print operator.
$#	The output format for printed numbers. initial value is %.20g rather than %.6g (Mnemonic: # is the number sign.)
$%	The current page number of the currently selected output channel. (Mnemonic: % is page number in nroff.)
$=	The current page length (printable lines) of the currently selected output channel. Default is 60. (Mnemonic: = has horizontal lines.)
$-	The number of lines left on the page of the currently selected output channel. (Mnemonic: lines_on_page - lines_printed.)
$~	The name of the current report format for the currently selected output channel. Default is name of the filehandle.


CHMOD

while (<*.c>) { chmod 0644, $_; }

open(foo, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
while (<foo>) { chop; chmod 0644, $_; }

chmod 0644, <*.c>;


FOR LOOP

for ($i = 1; $i < 10; $i++) { ... }

$i = 1; while ($i < 10) { ... } continue { $i++; }

for (@ary) { s/foo/bar/; }

foreach $elem (@elements) { $elem *= 2; }

for ((10,9,8,7,6,5,4,3,2,1,'BOOM')) { print $_, "\n"; sleep(1); }

for (1..15) { print "Merry Christmas\n"; }

foreach $item (split(/:[\\\n:]*/, $ENV{'TERMCAP'})) { print "Item: $item\n"; }


SWITCH

foo: { if (/^abc/) { $abc = 1; last foo; } if (/^def/) { $def = 1; last foo; } if (/^xyz/) { $xyz = 1; last foo; } $nothing = 1; }

foo: { $abc = 1, last foo if /^abc/; $def = 1, last foo if /^def/; $xyz = 1, last foo if /^xyz/; $nothing = 1; }

foo: { /^abc/ && do { $abc = 1; last foo; }; /^def/ && do { $def = 1; last foo; }; /^xyz/ && do { $xyz = 1; last foo; }; $nothing = 1; }

foo: { /^abc/ && ($abc = 1, last foo); /^def/ && ($def = 1, last foo); /^xyz/ && ($xyz = 1, last foo); $nothing = 1; }

if (/^abc/) { $abc = 1; } elsif (/^def/) { $def = 1; } elsif (/^xyz/) { $xyz = 1; } else {$nothing = 1;}


CHOP

while (<>) {
    chop;
    next unless -f $_;    # ignore specials
    ...
}


Pass array by reference

sub doubleary {
    local(*someary) = @_;    # *someary aliases the caller's array
    foreach $elem (@someary) {
        $elem *= 2;
    }
}
&doubleary(*foo);
&doubleary(*bar);


Maximum of an array

sub MAX {
    local($max) = pop(@_);
    foreach $foo (@_) {
        $max = $foo if $max < $foo;
    }
    $max;
}
...
$bestday = &MAX($mon,$tue,$wed,$thu,$fri);


Join Two Lines

Join a line with next line with certain pattern

# get a line, combining continuation lines
# that start with whitespace
sub get_line {
    $thisline = $lookahead;
    line: while ($lookahead = <STDIN>) {
        if ($lookahead =~ /^[ \t]/) {
            $thisline .= $lookahead;
        }
        else {
            last line;
        }
    }
    $thisline;
}

$lookahead = <STDIN>;    # get first line
while ($_ = &get_line()) {
    ...
}
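The same continuation-line logic can be written over a list of lines, which makes it easy to test without reading STDIN. This is a sketch, not the original routine: join_continuations() is a hypothetical helper and the sample header lines are invented.

```perl
use strict;
use warnings;

# join_continuations() is a hypothetical helper: it takes a list of
# lines and folds each line that starts with whitespace into the
# line before it.
sub join_continuations {
    my @lines = @_;
    my @joined;
    for my $line (@lines) {
        if (@joined && $line =~ /^[ \t]/) {
            $joined[-1] .= $line;       # continuation: append to previous
        }
        else {
            push @joined, $line;        # start of a new logical line
        }
    }
    return @joined;
}

my @logical = join_continuations(
    "To: someone\n",
    "Subject: a long\n",
    "\tsubject line\n",
    "Body\n",
);

print scalar(@logical), " logical lines\n";
```

Working on a list avoids the lookahead variable, at the cost of holding all the input in memory at once.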


Local Array

Use array assignment to a local list to name your formal arguments:

sub maybeset { local($key, $value) = @_; $foo{$key} = $value unless $foo{$key}; }


Use of Regular expression

s/^([^ ]*) *([^ ]*)/$2 $1/;    # swap first two words

if (/Time: (..):(..):(..)/) { $hours = $1; $minutes = $2; $seconds = $3; }

the following leaves a newline on the $_ string:

$_ = <STDIN>;
s/.*(some_string).*/$1/;

If the newline is unwanted, try one of

s/.*(some_string).*\n/$1/;
s/.*(some_string)[^\000]*/$1/;
s/.*(some_string)(.|\n)*/$1/;
chop; s/.*(some_string).*/$1/;
/(some_string)/ && ($_ = $1);


FAQ

FAQ0

FAQ1

FAQ2

Other FAQ