Introduction to Perl |
---|
A Text Manipulation Language
|
Power of shell, awk, sed, and (arguably) C combined; more complete than any one of them
|
Multi-dimensional regular and associative arrays
|
String manipulation and pattern match capabilities
|
Powerful regular expressions
|
Rich set of built-in variables
|
Strength
|
Awk/Sed syntax flavor
|
Easy to learn for shell/awk/sed programmers
|
Much less I/O
|
Freely available, very popular
|
Shortcomings
|
Not so clean
|
Mixture of different paradigms
|
Suggestions
|
use shell/awk/sed whenever possible
|
use perl when shell/awk/sed get too complicated
|
use perl when in-core memory is needed
|
Data Types
|
$days | a scalar variable
$days[28] | 29th element of array @days
$days{'Feb'} | one value from an associative array
@days | ($days[0], $days[1],... $days[n])
@days[3,4,5] | same as @days[3..5]
%days | the entire associative array |
---|
String Manipulation Operators
|
. | Concatenation of two strings
.= | The concatenation assignment operator
eq | String equality (== is numeric equality)
ne | String inequality (!= is numeric inequality)
lt | String less than
gt | String greater than
le | String less than or equal
ge | String greater than or equal
cmp | String comparison, returning -1, 0, or 1 |
---|
Regular Expressions
|
all | Unix regular expressions
\w | matches an alphanumeric character (including "_")
\W | matches a nonalphanumeric character
\b | matches a word boundary
\B | matches a non-boundary
\s | matches a whitespace character
\S | matches a non-whitespace character
\d | matches a numeric character
\D | matches a non-numeric character |
---|
{n,m} | occurs at least n times but no more than m times
{n,} | occurs at least n times
{n} | occurs exactly n times
* | 0 or more times
+ | 1 or more times
? | 0 or 1 time |
---|
$+ | returns whatever the last bracket match matched
$& | returns the entire matched string
$` | returns everything before the matched string
$' | returns everything after the matched string |
---|
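As a brief illustration of the four match variables in the table above, the sketch below runs one pattern with two bracketed groups. The sample sentence and the variable names are made up for the example, and the code is written for Perl 5 (`my`, `use strict`).

```perl
use strict;
use warnings;

# Sample text and pattern are illustrative only.
my $text = "The quick brown fox";
my ($before, $matched, $after, $last_group);

if ($text =~ /(qu\w+) (br\w+)/) {
    $before     = $`;   # everything before the matched string
    $matched    = $&;   # the entire matched string
    $after      = $';   # everything after the matched string
    $last_group = $+;   # whatever the last bracket matched
}

print "before=[$before] matched=[$matched] after=[$after] last=[$last_group]\n";
```

Here `$+` holds "brown" because the second pair of brackets is the last one that matched.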
Simple Examples |
---|
grep
|
$pattern = shift(@ARGV);    # $pattern = first argument
while (<>) {                # rest of @ARGV are filenames
    if (/$pattern/) { print; }
}
|
---|
Print non-null lines
|
while (<>) {
    chop;
    if ($_ ne "") { print $_, "\n"; }
}
|
---|
while (<>) { print unless $_ eq "\n" ; } |
---|
while (<>) { print unless /^$/; } |
---|
while (<>) { s/^\n$//; print; } |
---|
Options |
---|
Option Combination
|
Single-character options may be combined.
|
#!/usr/bin/perl -spi.bak    # same as -s -p -i.bak |
---|
Opt | Args | Explanation
-0 | digits | Record separator
-D | number | Sets debugging flags
-e | "commandline" | One line of script on the command line
-i | extension | Files processed by <> are edited in place
-I | directory | Path name for include files
-l | octnum | Line-ending processing mode |
---|
Opt | Explanation
-a | Autosplit mode with -n or -p
-c | Check the syntax only
-d | Debugging mode
-n | Automatic loop like "sed -n" or awk
-p | Automatic loop like sed (lines are printed)
-P | Invoke cpp before compilation
-s | Rudimentary switch mode
-S | Use the PATH environment variable to search for the script
-u | Core dump after compiling
-U | Allows perl to do unsafe operations
-v | Prints the version and patchlevel
-w | Print warning messages
-x | Script is embedded in a message |
---|
-0 digits
|
Specifies the record separator ($/) as an octal number
|
the null character, if no digits are given
|
00: slurp files in paragraph mode
|
0777: slurp files whole (0777 is not a valid character)
|
-a
|
Autosplit mode with -n or -p
|
the result of the split is saved in @F
|
perl -ane 'print pop(@F), "\n";' |
---|
the above command is equivalent to
|
while (<>) {
    @F = split(' ');
    print pop(@F), "\n";
}
|
---|
-D number
|
Sets debugging flags
|
-D14 | to watch how it executes your script
-D1024 | lists your compiled syntax tree
-D512 | displays compiled regular expressions |
---|
-e "commandline"
|
One line of script on command line
|
Multiple -e commands may be given for a multi-line script.
|
-i extension
|
Files processed by <> are edited in place. Perl does this as follows:
|
it renames the input file
|
it opens the output file under the original name
|
it selects that output file as the default for print statements
|
the extension is appended to the original file name to make a backup copy
|
no backup is kept if no extension is given
|
e.g. perl -p -i.bak -e "s/foo/bar/;" |
---|
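The steps listed for -i can be spelled out by hand. The sketch below is only an approximation of what the switch does, written for Perl 5; the scratch directory, the file name sample.txt and the s/foo/bar/ edit are assumptions made for the example.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Work in a scratch directory so no real file is touched.
my $dir  = tempdir(CLEANUP => 1);
my $file = "$dir/sample.txt";              # hypothetical file name

open(my $out, '>', $file) or die "Can't create $file: $!";
print $out "foo one\nfoo two\n";
close($out);

# 1. Rename the input file, adding the backup extension.
my $backup = "$file.bak";
rename($file, $backup) or die "Can't rename $file: $!";

# 2. Open the output file under the original name.
open(my $in,  '<', $backup) or die "Can't read $backup: $!";
open(my $new, '>', $file)   or die "Can't write $file: $!";

# 3. Select it as the default for print, then run the "script".
my $old_handle = select($new);
while (<$in>) {
    s/foo/bar/;
    print;                                 # goes to $new, not STDOUT
}
select($old_handle);
close($in);
close($new);

# Read both files back for inspection.
open(my $check, '<', $file) or die "Can't read $file: $!";
my $edited = join('', <$check>);
close($check);
open($check, '<', $backup) or die "Can't read $backup: $!";
my $original = join('', <$check>);
close($check);
```

After this runs, $edited holds the modified text and $original the untouched backup, which is the same end state as `perl -p -i.bak -e "s/foo/bar/;" sample.txt`.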
-I directory
|
Path name for include files
|
default directories: /usr/include and /usr/lib/perl
|
-l octnum
|
Line-ending processing mode
|
automatically chops the line terminator when used with -n or -p
|
sets $\ to octnum (if octnum is omitted, $\ is set to the current value of $/)
|
add line terminator back automatically in each print statement.
|
e.g. to trim lines to 80 columns:
|
perl -lpe 'substr($_, 80) = ""' |
---|
-n
|
Automatic loop like "sed -n" or awk:
|
lines are not printed by default
|
Equivalent to
|
while (<>) {
    ...        # your script goes here
}
|
---|
e.g. to delete all files older than a week:
|
find . -mtime +7 -print | perl -nle 'unlink;' |
---|
-p
|
Automatic loop like sed (without -n)
|
lines are printed by default
|
Equivalent to
|
while (<>) {
    ...        # your script goes here
} continue {
    print;
}
|
---|
-P
|
Invoke cpp before compilation
|
Because both comments and cpp directives begin with the # character,
avoid starting comments with any word recognized by the C preprocessor,
such as "if", "else" or "define".
|
-s
|
Rudimentary switch mode
|
any switch on the command line after the script name sets the variable of the same name
|
each such switch is removed from @ARGV
|
e.g. the following prints "true" if the script is invoked with a -xyz switch
|
#!/usr/bin/perl -s
if ($xyz) { print "true\n"; }
|
---|
-u
|
Core dump after compiling
|
You can then take this core dump and turn it into an executable
file by using the undump program (not supplied).
|
-U
|
allows perl to do unsafe operations.
|
e.g. unlink directories while running as superuser
|
-w
|
Print warning message
|
identifiers that are mentioned only once
|
scalar variables that are used before being set
|
redefined subroutines
|
references to undefined filehandles
|
write to read-only filehandles
|
use == on values that don't look like numbers
|
subroutines recurse more than 100 deep
|
-x
|
script is embedded in a message
|
How to Find Script? |
---|
Upon startup, perl looks for your script in one of the following places:
|
Command Line
|
-e switches on the command line
|
First filename on the command line
|
STDIN
|
the script is read implicitly from standard input
|
this only works if there are no filename arguments
|
to pass arguments to a script read from STDIN, specify - explicitly as the script name
|
Data Types and Objects |
---|
Data Types
|
scalars
|
arrays of scalars (indexed by number from 0)
|
associative arrays of scalars (indexed by string)
|
Context Dependent type determination
|
String, Numeric, Array.
|
Some operations return array or scalar values depending on contexts
|
Operations that return scalars don't care whether the context is looking
for a string or a number;
|
scalar variables and values are interpreted as strings or numbers as the context requires.
|
In boolean context:
|
FALSE: the null string, 0, or the string "0".
|
TRUE: Otherwise (boolean operations return 1)
|
Variable Names
|
Names which start with a letter may also contain digits and underscores.
|
Names which do not start with a letter are limited to one character,
|
e.g. "$%" or "$$".
|
most of the one character names have been predefined.
|
Variables
|
Single Variables (denoted by '$')
|
$days | # a simple scalar variable
$days[28] | # 29th element of array @days
$days{'Feb'} | # one value from an associative array
$#days | # last index of array @days
${days} | # same as $days |
---|
@days | # ($days[0], $days[1],... $days[n])
@days[3,4,5] | # same as @days[3..5]
@days{'a','c'} | # same as ($days{'a'},$days{'c'}) |
---|
Entire associative arrays (denoted by '%')
|
%days | # (key1, val1, key2, val2 ...) |
---|
Context dependent Assignments
|
Assignment to a scalar evaluates the righthand side in a scalar context
|
assignment to an array or array slice evaluates the righthand
side in an array context.
|
Assigning to $#days changes the length of the array.
|
e.g. two ways to nullify an array
|
@whatever = ();
$#whatever = $[ - 1;
|
---|
An array evaluated in a scalar context returns the length of the array, so the following is always true:
|
scalar(@whatever) == $#whatever - $[ + 1; |
---|
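A short sketch of the array-length rules above, written for Perl 5 and assuming the default array base of 0 (so `$[` drops out of the arithmetic). The array contents are made up for the example.

```perl
use strict;
use warnings;

my @days = ('Mon', 'Tue', 'Wed', 'Thu', 'Fri');

my $count      = @days;      # scalar context: the number of elements
my $last_index = $#days;     # index of the last element

$#days = 1;                  # assigning to $#days shrinks the array
my @kept = @days;

@days = ();                  # assigning the null list empties it
my $empty_count = scalar(@days);

print "count=$count last=$last_index kept=@kept empty=$empty_count\n";
```

With a base of 0 the length is always one more than the last index, which is the identity stated above.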
An associative array evaluated in a scalar context returns
a string consisting of the number of used buckets and the number of allocated buckets, separated by a slash. |
Multi-dimensional arrays
|
not directly supported, but can be emulated
|
String literals
|
'string', "string", string
|
same as shell's string literals
|
"string" is subject to \ and variable substitution;
|
but 'string' is not (except for \' and \\).
|
Special Characters
|
Strings can contain \n
|
Variable substitution inside strings is limited to:
|
scalar variables
|
normal array values
|
array slices
|
$Price = '$100';                    # not interpreted
print "The price is $Price.\n";     # interpreted
|
---|
put {} around the identifier to delimit it from any alphanumeric characters that follow
|
a single quoted string must be separated from a preceding
word by a space, since single quote is a valid character in an identifier
|
Numeric literals
|
12345 12345.67 .23E-10 0xffff # hex 0377 # octal 4_294_967_296 |
---|
Special Literals
|
__LINE__ (current line number)
|
__FILE__ (current filename)
|
They may only be used as separate tokens;
|
they will not be interpolated into strings.
|
__END__ or ^D or ^Z (logical end of the script)
|
Any following text is ignored, but may be read via the DATA filehandle.
|
Unquoted string literals
|
a word that doesn't have any other interpretation in the
grammar will be treated as if it had single quotes around
it.
|
a word consists only of alphanumeric characters and underscores,
and must start with an alphabetic character.
|
Array value expansion
|
Arrays are interpolated into double-quoted strings by
joining all the elements of the array with the delimiter
specified in the $" variable, space by default.
|
$temp = join($",@ARGV);
system "echo $temp";
system "echo @ARGV";    # same as the two lines above
|
---|
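To make the role of `$"` concrete, here is a small Perl 5 sketch. The argument list is invented for the example; `local` restores the default separator when the block ends.

```perl
use strict;
use warnings;

my @args = ('-l', '-a', '/tmp');      # illustrative values only

my $default = "ls @args";             # elements joined with a space

my $custom;
{
    local $" = ',';                   # change the separator temporarily
    $custom = "ls @args";
}

my $joined = 'ls ' . join(' ', @args);   # what the interpolation amounts to

print "$default\n$custom\n";
```

The interpolated form and the explicit join give the same string as long as `$"` still holds its default value.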
Here-Is (same as shell)
|
begins with <<ANY-STRING
|
all lines following the current line, down to a line consisting solely of ANY-STRING, form the string
|
print <<EOF;        # same as above
The price is $Price.
EOF
|
---|
print <<"EOF";      # same as above
The price is $Price.
EOF
|
---|
print << x 10;      # null identifier is delimiter
Merry Christmas!

print <<`EOC`;      # execute commands
echo hi there
echo lo there
EOC
|
---|
print <<foo, <<bar; # you can stack them
I said foo.
foo
I said bar.
bar
|
---|
Ambiguity within search patterns
|
/$foo[bar]/ is /${foo}[bar]/ or /${foo[bar]}/ ?
|
Is [bar] a character class or the subscript to array @foo ?
|
If @foo doesn't exist, then it's obviously a character class.
|
If @foo exists, perl takes a good guess.
|
Array literals:
|
(comma separated list)
|
In a non-array context,
|
the final element of the array literal is used.
|
@foo = ('cc', '-E', $bar); | # assign entire list to @foo
$foo = ('cc', '-E', $bar); | # assign $bar to $foo
$foo = @foo; | # $foo gets 3 (the length of @foo) |
---|
When a LIST is evaluated,
|
each element of the list is evaluated in an array context,
|
(@foo,@bar,&SomeSub) | contains all the elements of @foo and @bar and all the elements returned by the subroutine named SomeSub. |
---|
A list value may also be subscripted
|
$time = (stat($file))[8];    # stat returns array value
$digit = ('a','b','c','d','e','f')[$digit-10];
return (pop(@foo),pop(@foo))[0];
|
---|
Lists may be assigned to
|
provided each element of the list is an lvalue
|
the final element may be an array or an associative array
|
($a, $b, $c) = (1, 2, 3); |
---|
($map{'red'}, $map{'blue'}, $map{'green'}) = (0x00f, 0x0f0, 0xf00); |
---|
($a, $b, @rest) = split; |
---|
local($a, $b, %rest) = @_; |
---|
Associative array literal
|
sequence of key, value pairs
|
%map = ('red',0x00f,'blue',0x0f0,'green',0xf00); |
---|
Array assignment in a scalar context
|
returns the number of elements produced by the expression
on the right side of the assignment:
|
$x = (($foo,$bar) = (3,2,1)); # set $x to 3, not 2 |
---|
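One practical use of the rule above is checking how many values the right-hand side produced. The following is a hedged Perl 5 sketch; the sub name parse_pair and the key=value input format are assumptions made for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: split "key=value" and report malformed input.
sub parse_pair {
    my ($line) = @_;
    my ($key, $value);

    # In a scalar context the list assignment yields the number of
    # elements produced by split, not the number of variables assigned.
    my $fields = (($key, $value) = split(/=/, $line, 2));

    return $fields == 2 ? "$key -> $value" : "malformed";
}

print parse_pair("color=red"), "\n";
print parse_pair("novalue"), "\n";
```

The explicit limit of 2 keeps any further "=" characters inside the value.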
`shell command` (same as shell)
|
expands the string like a double-quoted string,
|
interprets the result as a shell command,
|
returns the output as a single string in a scalar context,
|
returns an array in an array context,
|
one element for each line of output
|
File Handles
|
Use <Filehandle> to get the next line of that file
|
you must assign the line to a variable yourself, except that:
|
it is assigned to $_ automatically when <Filehandle> is the only thing in the condition of a while loop; the following are all equivalent:
|
while ($_ = <STDIN>) { print; }
while (<STDIN>) { print; }
for (;<STDIN>;) { print; }
print while $_ = <STDIN>;
print while <STDIN>;
|
---|
Predefined: STDIN, STDOUT, STDERR
|
Use open to create filehandles
|
When <FILEHANDLE> is used in an array context,
|
an array consisting of all the input lines is
returned, one line per array element.
|
Null filehandle <>
|
used to emulate the behavior of sed and awk.
|
input from standard input, or
|
each file listed on the command line.
|
returns FALSE only once, at the end of all input
|
calling it again after that reads from STDIN
|
Scalar variable filehandle
|
(e.g. <$foo>)
|
the variable contains the name of the filehandle to input from
|
File Glob:
|
(e.g. <non-filehandle string>)
|
filename pattern to be globbed,
|
One level of $ interpretation is done first,
|
but <$foo> is an indirect filehandle (see above), so write <${foo}> to glob the contents of $foo.
|
while (<*.c>) { chmod 0644, $_; } |
---|
or
|
chmod 0644, <*.c>; |
---|
Syntax |
---|
Program
|
Sequence of declarations and commands.
|
No need to declare except report formats and subroutines
|
Default value for uninitialized user objects:
|
null or 0
|
Program is executed only once
|
like C, not sed or awk
|
exception: -n and -p switches
|
Declaration can be put anywhere
|
A Free-form Language
|
Use '#' for comment (like shell)
|
C-style /* */ comments are not recognized
|
Compound Statements
|
BLOCK {....}
|
a sequence of commands works as one command
|
Control Flow
|
if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
LABEL while (EXPR) BLOCK
LABEL while (EXPR) BLOCK continue BLOCK
LABEL for (EXPR; EXPR; EXPR) BLOCK
LABEL foreach VAR (ARRAY) BLOCK
LABEL BLOCK continue BLOCK
|
---|
Braces { } are required around a BLOCK, unlike C; the following all do the same thing:
|
if (!open(foo)) { die "Can't open $foo: $!"; }
die "Can't open $foo: $!" unless open(foo);
open(foo) || die "Can't open $foo: $!";    # foo or bust!
open(foo) ? 'hi mom' : die "Can't open $foo: $!";
|
---|
If and Unless
|
if (EXPR) BLOCK
if (EXPR) BLOCK else BLOCK
if (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK

unless (EXPR) BLOCK
unless (EXPR) BLOCK else BLOCK
unless (EXPR) BLOCK elsif (EXPR) BLOCK ... else BLOCK
|
---|
Block Condition
|
(EXPR) can be a BLOCK,
|
return true if the value of the last command is true
|
While and Until
|
LABEL [while|until] (EXPR) BLOCK
LABEL [while|until] (EXPR) BLOCK continue BLOCK
|
---|
(EXPR) can be a BLOCK,
|
LABEL (name:) is optional
|
(for loop control: next, last, and redo)
|
continue BLOCK
|
(executed before condition is to be evaluated again)
|
similar to the third part of a "for" loop in C
|
Until block
|
the condition is still tested before the first iteration
|
For, Foreach
|
FOR
|
LABEL for (EXPR; EXPR; EXPR) BLOCK
|
for ($i = 1; $i < 10; $i++) { ... } |
---|
is the same as
|
$i = 1;
while ($i < 10) {
    ...
} continue {
    $i++;
}
|
---|
FOREACH
|
LABEL foreach VAR (ARRAY) BLOCK
|
iterates over a normal array value
|
sets VAR to be each element of the array in turn
|
VAR is local to loop
|
foreach may be spelled for, for brevity.
|
VAR can be omitted, $_ is set to each value
|
If ARRAY is an actual array (not an array expression)
you can modify each element of the array by modifying VAR
|
for (@ary) { s/foo/bar/; }

foreach $elem (@elements) {
    $elem *= 2;
}

for ((10,9,8,7,6,5,4,3,2,1,'BOOM')) {
    print $_, "\n"; sleep(1);
}

for (1..15) { print "Merry Christmas\n"; }

foreach $item (split(/:[\\\n:]*/, $ENV{'TERMCAP'})) {
    print "Item: $item\n";
}
|
---|
Single Block
|
LABEL BLOCK continue BLOCK
|
loop that executes once
|
can use loop control to leave or restart
|
continue block is optional.
|
Good to emulate a CASE structure
|
foo: {
    if (/^abc/) { $abc = 1; last foo; }
    if (/^def/) { $def = 1; last foo; }
    if (/^xyz/) { $xyz = 1; last foo; }
    $nothing = 1;
}
|
---|
foo: {
    $abc = 1, last foo if /^abc/;
    $def = 1, last foo if /^def/;
    $xyz = 1, last foo if /^xyz/;
    $nothing = 1;
}
|
---|
foo: {
    /^abc/ && do { $abc = 1; last foo; };
    /^def/ && do { $def = 1; last foo; };
    /^xyz/ && do { $xyz = 1; last foo; };
    $nothing = 1;
}
|
---|
foo: {
    /^abc/ && ($abc = 1, last foo);
    /^def/ && ($def = 1, last foo);
    /^xyz/ && ($xyz = 1, last foo);
    $nothing = 1;
}
|
---|
if (/^abc/) { $abc = 1; }
elsif (/^def/) { $def = 1; }
elsif (/^xyz/) { $xyz = 1; }
else { $nothing = 1; }
|
---|
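The same labeled-block idiom can be wrapped in a subroutine so that it returns a value instead of setting flags. This is a Perl 5 sketch; the sub name classify and the label SWITCH are choices made for the example, not part of the original slides.

```perl
use strict;
use warnings;

# Hypothetical helper: report which prefix a line starts with.
sub classify {
    my ($line) = @_;
    my $kind;

    SWITCH: {
        if ($line =~ /^abc/) { $kind = 'abc'; last SWITCH; }
        if ($line =~ /^def/) { $kind = 'def'; last SWITCH; }
        if ($line =~ /^xyz/) { $kind = 'xyz'; last SWITCH; }
        $kind = 'nothing';              # default case
    }
    return $kind;
}

print classify("define"), "\n";
```

Because the bare block is a loop that executes once, `last SWITCH` leaves it immediately and the default assignment is skipped.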
Simple statements
|
Every simple statement except the last one in a block must end with a ";"
|
Optional Post modifier
|
if EXPR | while EXPR
unless EXPR | until EXPR |
---|
do {} while/until EXPR
|
will execute once before the conditional is evaluated.
|
do { $_ = <STDIN>; } until $_ eq ".\n"; |
---|
do { } blocks can't use loop control (next, last, redo),
|
since modifiers don't take loop labels
|
Expressions
|
Like C expressions
|
Perl Special
|
** | exponentiation operator.
**= | exponentiation assignment operator.
() | The null list, used to initialize an array to null.
. | Concatenation of two strings.
.= | The concatenation assignment operator.
eq | String equality (== is numeric equality).
ne | String inequality (!= is numeric inequality).
lt | String less than.
gt | String greater than.
le | String less than or equal.
ge | String greater than or equal.
cmp | String comparison, returning -1, 0, or 1.
<=> | Numeric comparison, returning -1, 0, or 1. |
---|
Perl Special operator (=~, !~, x, x= )
|
=~
|
(Do pattern oriented operations over left operand)
|
right arg. is a search pattern, substitution, or translation.
|
left argument is the target object.
|
$xxx =~ s/TEST/test/g |
---|
!~
|
(same as =~ but negate the return value)
|
x
|
(repetition operator)
|
if left operand is (), repeat the list.
|
print '-' x 80; | # print row of dashes
print '-' x80; | # illegal, x80 is identifier
print "\t" x ($tab/8), ' ' x ($tab%8); | # tab over
@ones = (1) x 80; | # an array of 80 1's
@ones = (5) x @ones; | # set all elements to 5 (the 2nd @ones is evaluated as the length of @ones) |
---|
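A runnable Perl 5 version of the repetition examples, with small counts so the results are easy to inspect. The variable names and the value of $tab are illustrative.

```perl
use strict;
use warnings;

my $rule = '-' x 10;                 # string repetition

my @ones = (1) x 4;                  # list repetition: four 1's
@ones = (5) x @ones;                 # right operand is the length of @ones

my $tab    = 19;                     # hypothetical column to indent to
my $indent = "\t" x int($tab / 8) . ' ' x ($tab % 8);

print "$rule\n@ones\n";
```

Note that x binds more tightly than the concatenation operator, so the two repetitions are built first and then joined.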
x=
|
(Repetition assignment, scalar only)
|
..
|
(Range Operator)
|
In a scalar context ==> returns a boolean value
|
It works like line addressing exp. in sed or awk.
|
if (101 .. 200) { print; } | # print 2nd hundred lines
next line if (1 .. /^$/); | # skip header lines
s/^/> / if (/^$/ .. eof()); | # quote body |
---|
In an array context ==> returns an array
|
for (101 .. 200) { print; } | # print $_ 100 times
@foo = @foo[$[ .. $#foo]; | # an expensive no-op
@foo = @foo[$#foo-4 .. $#foo]; | # slice last 5 items |
---|
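The two behaviors of the range operator can be seen side by side in the Perl 5 sketch below. The BEGIN/END marker lines and the sample data are made up for the example; the scalar-context form uses two patterns rather than line numbers so that it does not depend on reading from a filehandle.

```perl
use strict;
use warnings;

# Scalar context: true from the line matching the left pattern
# through the line matching the right pattern, like sed addresses.
my @lines = ("header\n", "BEGIN\n", "a\n", "b\n", "END\n", "trailer\n");
my @picked;
for (@lines) {
    push @picked, $_ if /^BEGIN/ .. /^END/;
}

# Array context: a list of values, here used to slice the last 5 items.
my @nums = (1 .. 10);
my @tail = @nums[$#nums - 4 .. $#nums];

print @picked;
print "@tail\n";
```

Both marker lines are included in @picked, because the operator stays true through the line that matches the right-hand pattern.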
File Test
|
-r -w -x -o | Readable/Writable/Executable/Owned by effective uid/gid.
-R -W -X -O | Readable/Writable/Executable/Owned by real uid/gid.
-e | File exists.
-z -s | File has zero/non-zero size (-s returns the size).
-f -d -l -p -S | plain file/directory/symbolic link/named pipe (FIFO)/socket
-b -c | is a block/character special file.
-u -g -k | File has setuid/setgid/sticky bit set.
-t | Filehandle is opened to a tty.
-T -B | is a text/binary file.
-M | Age of file in days when script started.
-A | Same for access time.
-C | Same for inode change time. |
---|
|
"_" ==> previously tested file
|
print "Can do.\n" if -r $a || -w _ || -x _;

stat($filename);
print "Readable\n" if -r _;
print "Writable\n" if -w _;
print "Executable\n" if -x _;
print "Text\n" if -T _;
print "Binary\n" if -B _;
|
---|
Perl Expression
|
C's Expressions not in Perl
|
Address_of & | Pointer * | Type casting (TYPE) |
---|
++ Auto-increment for alphanumeric strings
|
if a string matches the pattern /^[a-zA-Z]*[0-9]*$/,
++ increments it as a string, keeping each character within its range, with carry |
print ++($foo = '99'); | # prints '100'
print ++($foo = 'a0'); | # prints 'a1'
print ++($foo = 'Az'); | # prints 'Ba'
print ++($foo = 'zz'); | # prints 'aaa' |
---|
range operator (in an array context) makes use of the
magical autoincrement algorithm if the minimum and maximum
are strings.
|
@alphabet = ('A' .. 'Z'); | # A to Z
$hexdigit = (0 .. 9, 'a' .. 'f')[$num & 15]; | # hex digit
@z2 = ('01' .. '31'); print $z2[$mday]; | # two-digit day of month |
---|
-- is not magical.
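A compact Perl 5 check of the magical increment, using the four strings from the table plus one carry case ('a9') added for the example.

```perl
use strict;
use warnings;

my @results;
for my $start ('99', 'a0', 'Az', 'zz', 'a9') {
    my $s = $start;      # work on a copy; $s has only been used as a string
    $s++;                # magical string increment
    push @results, $s;
}

# The range operator uses the same algorithm for string endpoints.
my @codes = ('aa' .. 'ad');

print "@results\n@codes\n";
```

Each character stays within its own range (lower case, upper case, or digit), and a carry out of the leftmost character lengthens the string.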
|
Perl Expression
|
|| and &&
|
In C: returns 0 or 1
|
In Perl: return the last value evaluated.
|
e.g. #a portable way to find out the home directory
|
$home = $ENV{'HOME'} || $ENV{'LOGDIR'} || (getpwuid($<))[7] || die "You're homeless!\n"; |
---|
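Because || returns the last value evaluated, it is a convenient way to supply defaults. The Perl 5 sketch below uses a private hash in place of %ENV so the result does not depend on the real environment; the keys and paths are invented.

```perl
use strict;
use warnings;

# Stand-in for %ENV; values are illustrative only.
my %env = (HOME => '', LOGDIR => '/home/guest');

# The first true value wins; '' is false, so LOGDIR is used.
my $home = $env{'HOME'} || $env{'LOGDIR'} || '/';

# A missing key yields an undefined (false) value, so the default applies.
my $shell = $env{'SHELL'} || '/bin/sh';

print "$home $shell\n";
```

One caveat of this idiom: a legitimate value of 0 or "0" is also false and would be replaced by the default.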
Perl Regular Expression |
---|
Henry Spencer's Version 8 regular expression routines
|
. | any character
* | repeat 0 or more times
^ | beginning of the line
$ | end of line
[ ] | any character in []
[x-y] | any character between x and y
(..) | marked pattern (plain parentheses, not \(..\) as in sed) |
---|
Perl Special
|
\w | any alphanumeric character (including "_")
\W | non-alphanumeric character (not \w)
\b | word boundary
\B | not \b
\s | white space
\S | not \s
\d | numeric character
\D | not \d |
---|
Back Reference
|
Refer to the substring matched in a pattern search
|
\<digit> | digit'th substring (within pattern search only)
$1, $2, .. | the nth substring matched in a pattern search
$` | Everything before the matched string
$& | The entire matched string
$' | Everything after the matched string
$+ | Whatever the last bracket match matched |
---|
s/^([^ ]*) *([^ ]*)/$2 $1/; # swap first two words |
---|
if (/Time: (..):(..):(..)/) { $hours = $1; $minutes = $2; $seconds = $3; } |
---|
Multiline pattern search
|
set $* to 1
|
^ | matches after any newline within the string
$ | matches before any newline
. | never matches a newline |
---|
e.g. the following leaves a newline on the $_ string:
|
$_ = <STDIN>; s/.*(some_string).*/$1/; |
---|
If the newline is unwanted, try one of
|
s/.*(some_string).*\n/$1/;
s/.*(some_string)[^\000]*/$1/;
s/.*(some_string)(.|\n)*/$1/;
chop; s/.*(some_string).*/$1/;
/(some_string)/ && ($_ = $1);
|
---|
{n,m} Modifier
|
{n,m} | must occur at least n times but no more than m times
{n} | is {n,n}, must occur exactly n times
{n,} | n or more times
{0,} | *, 0 or more times
{1,} | +, 1 or more times
{0,1} | ?, 0 or 1 time |
---|
Other Tips
|
all backslashed metacharacters are alphanumeric, such as \b, \w, \n.
|
So anything that looks like \\, \(, \), \<, \>, \{, or \}
is always interpreted as a literal character, not a metacharacter.
This makes it simple to quote a string that you want to use for a pattern
but that you are afraid might contain metacharacters.
Simply quote all the non-alphanumeric characters:
|
$pattern =~ s/(\W)/\\$1/g; |
---|
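The quoting trick above can be tried out directly. In this Perl 5 sketch the literal text and the test strings are invented; the comparison with the built-in quotemeta function is included only as a cross-check for plain ASCII input.

```perl
use strict;
use warnings;

my $literal = 'cost: $5.00 (approx.)';      # text containing metacharacters

# Backslash every non-alphanumeric character.
(my $pattern = $literal) =~ s/(\W)/\\$1/g;

# The quoted pattern now matches the text literally.
my $hit  = ('total cost: $5.00 (approx.) today' =~ /$pattern/) ? 1 : 0;
my $miss = ('cost: $5x00 (approx.)' =~ /$pattern/) ? 1 : 0;

my $same_as_quotemeta = ($pattern eq quotemeta($literal)) ? 1 : 0;

print "$pattern\n";
```

Without the quoting, the "." would match any character and the second string would match as well.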
Pattern Operations |
---|
Format
|
/PATTERN/gio
|
?PATTERN?    # matches only once between calls to the reset operator
|
m/PATTERN/gio
|
/PATTERN/gio
|
returns true (1) or false ('').
|
Default target: $_
|
Other Target: use =~ or !~ operator
|
Target can be the result of an expression evaluation,
|
with the m form, any pair of non-alphanumeric characters can be used as delimiters
|
Post Modifiers
|
"g" for global match
|
"i" for case insensitive
|
"o" disables run-time recompilation of variables in the pattern
|
PATTERN may contain references to scalar variables except $) and $|
|
these are interpolated (and the pattern recompiled) every time the pattern search is evaluated; use "o" to compile only once
|
In an array context, a pattern match returns:
|
the subexpressions matched by () in the pattern,
|
i.e. ($1, $2, $3...). |
---|
a null array if the match fails
|
the array (1) if the match succeeds but the pattern contains no ()
|
open(tty, '/dev/tty');
<tty> =~ /^y/i && &foo();    # call foo if desired

if (/Version: *([0-9.]*)/) { $version = $1; }

next if m#^/usr/spool/uucp#;
|
---|
# poor man's grep
$arg = shift;
while (<>) {
    print if /$arg/o;    # compile only once
}

if (($F1, $F2, $Etc) = ($foo =~ /^(\S+)\s+(\S+)\s*(.*)/))
|
---|
Post Modifier "g" for Pattern Match
|
matching as many times as possible within the string.
|
In Array Context, returns a list of:
|
all the substrings matched by all ()
|
or, if the pattern has no (), all the matched strings, as if the entire pattern were enclosed in ()
|
in Scalar Context
|
iterates through the string,
returning TRUE each time it matches, and FALSE when it eventually runs out of matches. |
Don't modify string between matches
|
# array context
($one,$five,$fifteen) = (`uptime` =~ /(\d+\.\d+)/g);

# scalar context
$/ = "";     # record separator is one or more blank lines
$* = 1;      # multi-line pattern match
while ($paragraph = <>) {
    while ($paragraph =~ /[a-z]['")]*[.!?]+['")]*\s/g) {
        $sentences++;
    }
}
print "$sentences\n";
|
---|
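A self-contained Perl 5 variation on the example above. A fixed string stands in for the output of uptime so the result is predictable; the sample text is an assumption.

```perl
use strict;
use warnings;

# Stand-in for `uptime` output (made up for the example).
my $uptime = "up 3 days, load average: 0.15, 0.20, 1.05";

# Array context: every match of the bracketed group is returned at once.
my ($one, $five, $fifteen) = ($uptime =~ /(\d+\.\d+)/g);

# Scalar context: each evaluation resumes where the previous match ended.
my $count = 0;
$count++ while $uptime =~ /\d+\.\d+/g;

print "$one $five $fifteen ($count matches)\n";
```

The scalar-context loop relies on the string staying unchanged between matches, as noted above.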
Substitution |
---|
Format
|
s/PATTERN/REPLACEMENT/gieo |
---|
Searches a string for a pattern,
|
if found, replaces that pattern with the replacement text and
returns the number of substitutions made.
|
Otherwise returns false (0).
|
Default target string is $_
|
Post modifier
|
"i" | case insensitive
"o" | avoid variable recompilation in patterns
"g" | replace all matched occurrences
"e" | the replacement string is an expression |
---|
Delimiters
|
Any non-alphanumeric delimiter may replace the slashes;
|
if '' are used, no interpretation is done on the replacement string
(the e modifier overrides this, however); |
if `` are used, the replacement string is a command to execute
whose output will be used as the actual replacement text.
|
If the PATTERN is delimited by bracketing quotes,
the REPLACEMENT has its own pair of quotes, which may or may not be bracketing quotes, |
e.g. s(foo)(bar) or s<foo>/bar/
---|
The pattern can contain scalar variables
|
If the PATTERN is a null string,
the most recent successful regular expression is used instead. |
s/\bgreen\b/mauve/g; | # don't change wintergreen
$path =~ s|/usr/bin|/usr/local/bin|; |
s/Login: $foo/Login: $bar/; | # run-time pattern
($foo = $bar) =~ s/bar/foo/; |
s/([^ ]*) *([^ ]*)/$2 $1/; | # reverse 1st two fields
$_ = 'abc123xyz'; |
s/\d+/$&*2/e; | # yields 'abc246xyz'
s/\d+/sprintf("%5d",$&)/e; | # yields 'abc  246xyz'
s/\w/$& x 2/eg; | # yields 'aabbcc  224466xxyyzz' |
---|
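The return value and the "e" modifier can be checked with a few lines of Perl 5. The sample strings are taken from or modeled on the table above; capture groups are used in place of $& only as a matter of style.

```perl
use strict;
use warnings;

# "e": the replacement is evaluated as an expression.
my $text = 'abc123xyz';
(my $doubled = $text)    =~ s/(\d+)/$1 * 2/e;
(my $padded  = $doubled) =~ s/(\d+)/sprintf("%5d", $1)/e;

# The substitution returns the number of replacements made.
my $colors  = 'green wintergreen green';
my $changed = ($colors =~ s/\bgreen\b/mauve/g);

print "$doubled\n$padded\n$colors ($changed changes)\n";
```

The \b anchors keep "wintergreen" intact, so only two substitutions are counted.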
Procedure |
---|
Define
|
sub subname { Block } |
---|
Invoke
|
&subname ( para1, para2, ...) |
---|
Passing Arguments
|
Call By Reference
|
calling variables are put in array @_
|
any change to @_ will affect the variables in calling context
|
example: define
|
sub under {
    foreach $value (@_) {
        $value += 5;
        print "$value ";
    }
}
|
---|
example: calling
|
@nums = (1,2,3,4,5);
&under(@nums);    # @nums is now (6,7,8,9,10)
|
---|
example: execution result
|
6 7 8 9 10 |
---|
Call By Value
|
use local to declare local variables
|
sub greeting {
    local($fname, $lname) = @_;
    ...
    return $fname;
}
|
---|
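The difference between the two passing styles is easiest to see when both are applied to the same data. This is a Perl 5 sketch: it uses `my` for private variables where the slides use `local`, and the sub names are invented for the example.

```perl
use strict;
use warnings;

# By reference: the elements of @_ are aliases for the caller's variables.
sub add_five_in_place {
    foreach my $value (@_) {
        $value += 5;
    }
}

# By value: copy @_ into private variables before changing anything.
sub add_five_copy {
    my @copy = @_;
    foreach my $value (@copy) {
        $value += 5;
    }
    return @copy;
}

my @first = (1, 2, 3);
add_five_in_place(@first);               # @first itself is modified

my @second = (1, 2, 3);
my @result = add_five_copy(@second);     # @second is left unchanged

print "@first | @second | @result\n";
```

Passing literal constants to the in-place version would fail in Perl 5, since a constant cannot be modified through its alias.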
Predefined Variables |
---|
$_ | default input and pattern-searching space. (Mnemonic: underline is understood in certain operations.)
$. | current input line number of the last filehandle that was read. Read-only. (Mnemonic: many editors use . for the current line number.)
$/ | The input record separator, newline by default. "" treats blank lines as the delimiter (multiple blank lines count as one); "\n\n" treats the first blank line as the delimiter; a multi-character delimiter can be used. (Mnemonic: / marks line boundaries when quoting poetry.)
$, | The output field separator for the print operator. (Mnemonic: , is used in the print statement for the OFS.)
$" | The output field separator for array interpolation.
$\ | The output record separator for the print operator.
$# | The output format for printed numbers. The initial value is %.20g rather than %.6g. (Mnemonic: # is the number sign.)
$% | The current page number of the currently selected output channel. (Mnemonic: % is page number in nroff.)
$= | The current page length (printable lines) of the currently selected output channel. Default is 60. (Mnemonic: = has horizontal lines.)
$- | The number of lines left on the page of the currently selected output channel. (Mnemonic: lines_on_page - lines_printed.)
$~ | The name of the current report format for the currently selected output channel. Default is the name of the filehandle. |
---|
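The two print separators from the table can be observed without touching the terminal by printing to an in-memory filehandle, a Perl 5 feature. The field values below are made up for the example.

```perl
use strict;
use warnings;

my $buffer = '';
open(my $out, '>', \$buffer) or die "Can't open in-memory handle: $!";

{
    local $, = ':';       # output field separator, placed between arguments
    local $\ = "\n";      # output record separator, appended after each print
    print $out 'root', 'x', '0';
    print $out 'guest', 'x', '500';
}
close($out);

print $buffer;
```

Using `local` confines the changed separators to the block, so later print statements behave normally.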
CHMOD |
---|
while (<*.c>) { chmod 0644, $_; } |
---|
open(foo, "echo *.c | tr -s ' \t\r\f' '\\012\\012\\012\\012'|");
while (<foo>) {
    chop;
    chmod 0644, $_;
}
|
---|
chmod 0644, <*.c>; |
---|
FOR LOOP |
---|
for ($i = 1; $i < 10; $i++) { ... } |
---|
$i = 1;
while ($i < 10) {
    ...
} continue {
    $i++;
}
|
---|
for (@ary) { s/foo/bar/; } |
---|
foreach $elem (@elements) { $elem *= 2; } |
---|
for ((10,9,8,7,6,5,4,3,2,1,'BOOM')) { print $_, "\n"; sleep(1); } |
---|
for (1..15) { print "Merry Christmas\n"; } |
---|
foreach $item (split(/:[\\\n:]*/, $ENV{'TERMCAP'})) { print "Item: $item\n"; } |
---|
SWITCH |
---|
foo: { if (/^abc/) { $abc = 1; last foo; } if (/^def/) { $def = 1; last foo; } if (/^xyz/) { $xyz = 1; last foo; } $nothing = 1; } |
---|
foo: { $abc = 1, last foo if /^abc/; $def = 1, last foo if /^def/; $xyz = 1, last foo if /^xyz/; $nothing = 1; } |
---|
foo: { /^abc/ && do { $abc = 1; last foo; }; /^def/ && do { $def = 1; last foo; }; /^xyz/ && do { $xyz = 1; last foo; }; $nothing = 1; } |
---|
foo: { /^abc/ && ($abc = 1, last foo); /^def/ && ($def = 1, last foo); /^xyz/ && ($xyz = 1, last foo); $nothing = 1; } |
---|
if (/^abc/) { $abc = 1; } elsif (/^def/) { $def = 1; } elsif (/^xyz/) { $xyz = 1; } else {$nothing = 1;} |
---|
CHOP |
---|
while (<>) {
    chop;
    next unless -f $_;    # ignore specials
    ...
}
|
---|
Pass array by reference |
---|
sub doubleary {
    local(*someary) = @_;
    foreach $elem (@someary) {
        $elem *= 2;
    }
}
&doubleary(*foo);
&doubleary(*bar);
|
---|
Maximum of an array |
---|
sub MAX {
    local($max) = pop(@_);
    foreach $foo (@_) {
        $max = $foo if $max < $foo;
    }
    $max;
}
...
$bestday = &MAX($mon,$tue,$wed,$thu,$fri);
|
---|
Join Two Lines |
---|
Join a line with next line with certain pattern
|
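# get a line, combining continuation lines
# that start with whitespace
sub get_line {
    $thisline = $lookahead;
    line: while ($lookahead = <STDIN>) {
        if ($lookahead =~ /^[ \t]/) {
            $thisline .= $lookahead;
        }
        else {
            last line;
        }
    }
    $thisline;
}
|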
---|
Local Array |
---|
Use array assignment to a local list to name your formal arguments:
|
sub maybeset { local($key, $value) = @_; $foo{$key} = $value unless $foo{$key}; } |
---|
Use of Regular expression |
---|
s/^([^ ]*) *([^ ]*)/$2 $1/;    # swap first two words

if (/Time: (..):(..):(..)/) {
    $hours = $1;
    $minutes = $2;
    $seconds = $3;
}
|
---|
the following leaves a newline on the $_ string:
|
$_ = <STDIN>; s/.*(some_string).*/$1/; |
---|
If the newline is unwanted, try one of
|
s/.*(some_string).*\n/$1/;
s/.*(some_string)[^\000]*/$1/;
s/.*(some_string)(.|\n)*/$1/;
chop; s/.*(some_string).*/$1/;
/(some_string)/ && ($_ = $1);
|
---|