
Perl Short-cut for Variable_Scalar Would Be Scalar Short-Cut Names Have the Least


8/9/2019 Perl Short-cut for Variable_Scalar Would Be Scalar Short-Cut Names Have the Least

http://slidepdf.com/reader/full/perl-short-cut-for-variablescalar-would-be-scalar-short-cut-names-have-the 1/23

 

Basic Script

##############################################################################
# name: main
# purpose: entry point of the script
##############################################################################

# @ARGV holds all script command line arguments (pos 0 is not prog-name)
# $0 holds script filename
print "hello world\n";

Data Types

##############################################################################
# name: main
# purpose: show the basic datatypes
##############################################################################

### scalars (ints, floats, strings)
$float = 3.14;          # can hold real / whole numbers
$false = 0;             # 0 counts as FALSE, non-zero is TRUE
$str = "hello";         # can hold strings
$false2 = "";           # an empty string is FALSE of course
$false3 = '0';          # "0" also counts as FALSE, all other strings are TRUE
undef $num;             # similar to NULL, counts as FALSE
$line = "hello\n";      # $line holds 'hello' and then a new-line (LF) - 6 chars
$firm = 'hello\n';      # $firm holds the text 'hello\n' - 7 chars
$ten_a = 'a' x 10;      # $ten_a holds 'aaaaaaaaaa'
$long1 = <<"END1";      # long text with a few lines - as a "" string
This is long text,
With $float lines.
END1
$long2 = <<'END2';      # long text with a few lines - as a '' string
This can hold \n.
END2

# scalar operations
$num = $num*2 + 3 - $float;     # $num is -0.14 (the undef $num counts as 0 here)
$num = 2**4 % 5;                # $num is 1 - exp then modulus
$num++;                         # $num is 2 - inc after eval
$ms = (1<< 3)&0xff|0x03^0x01;   # $ms is 0x0a
print ++($foo = '99');          # prints '100' - inc before eval
$new = $str." world";           # $new is "hello world"

### arrays (lists of scalars)


@nums = (1,2,3);
@strings = ("one",$str);        # @strings is ("one","hello")
@mixed = ("three",3.13);        # any scalar can be placed in a list
@empty = ();                    # empty list counts as FALSE, non-empty is TRUE
($one,$two) = (1,2);            # $one is 1, $two is 2
@to_ten = (1,2,3..10);          # 3..10 is a list of all nums from 3 to 10

# array operations
$first = $nums[0];              # $first is 1
$strings[1] = "neo";            # @strings is ("one","neo")
$mixed[2] = 37;                 # @mixed is ("three",3.13,37) - grows automatically
@joined = (@mixed,8);           # @joined is ("three",3.13,37,8)
@sl = @nums[0,-1,1];            # @sl is (1,3,2) - array slice (specific indices)
@sl = @nums[0..2];              # @sl is (1,2,3) - array slice (span)
$len = scalar(@nums);           # an array in scalar context is the list length (3)
$last_index = $#nums;           # $last_index is 2 (the last index in the list)
$#nums = -1;                    # @nums is () - empty

### hashes (maps of keys and values)
%ages = ("jim"=>18,"ted"=>21);          # the key "jim" has a value of 18
%same = ("jim",18,"ted",21);            # => is exactly like ,
%mix_hash = (1=>"bla","hi"=>22.1);      # any scalar can be a key or value
%empty_hash = ();                       # empty hash counts as FALSE, non-empty is TRUE

# hash operations
$jims_age = $ages{"jim"};               # $jims_age is 18
$ages{"jim"}++;                         # key "jim" has value of 19 in %ages
$ages{"ron"} = 24;                      # key "ron" with value 24 added to %ages
@sl = @ages{"ted","ron"};               # @sl is (21,24) - hash slice
$stats = scalar(%ages);                 # older perls: string eg. "1/16" (used/allocated buckets);
                                        # since perl 5.26: the number of keys

### references (scalar that holds a pointer to another type)
$scalarref = \$num;
$arrayref = \@mixed;
$hashref = \%ages;

# reference operations
$num_copy = $$scalarref;        # dereference using {type}$reference
@mixed_copy = @$arrayref;
$value = $$hashref{"jim"};
$value = $arrayref->[0];        # or dereference using $reference->
$value = $hashref->{"jim"};

Conditionals

##############################################################################
# name: main
# purpose: show the basic conditionals
##############################################################################


# regular c style if statement, must use blocks
if (defined($value) && ($value == 1))   # defined() tests for undef
{
    print "value equals 1\n";
}

# if-else, must use blocks
if (($job eq "millionaire") || ($state ne "dead"))  # eq,ne are used for strings
{
    print "a suitable husband found\n";
}
else
{
    print "not suitable\n";
}

# unless is the opposite of if, must use blocks
unless (($age >= 18) and ($age < 80))   # and,or,not are also ok
{
    print "age out of range\n";
}

# short forms (no blocks needed if a single statement comes before)
print "ok" if $ok;
print "ok" unless not $ok;

# and the true perl way
open(FILE) or die "cant open file";

Loops and Iterations

##############################################################################
# name: main
# purpose: show the flow blocks
##############################################################################

# for (regular c style), must use a block
for ($i=0; $i<10; $i++)
{
    print "iteration number $i\n";
}

# foreach (iterate on lists), must use blocks
foreach $num (@numbers)         # $num holds a member in each iteration
{
    print "$num";
}
foreach (@numbers)              # if excluded, the member is stored in $_
{
    print;                      # by default, $_ is printed


}
for (1..10) { print "a"; }      # 'foreach' is actually a synonym for 'for'

# while / until statements, must use blocks
$i = 0;
while ($i < 10)                 # enter block if condition is TRUE
{
    print "iteration number $i\n";
    $i++;
}
until ($i == 0)                 # enter block if condition is FALSE
{
    $i--;
    print "back to $i\n";
}

# do while / until just like in c, must use blocks
$i = 0;
do
{
    print "this will print\n";  # enter block once before evaluating
} while ($i != 0);
do
{
    print "this too\n";
} until ($i == 0);

# short forms (no blocks needed if a single statement comes before)
print "a" for (1..10);
read_next_line() while not end_of_file();
read_next_line() until end_of_file();

# next and last statements are similar to c continue and break
for ($i=0; $i<10; $i++)
{
    next if ($i == 3);          # skip printing 3 (go to next iteration)
    last if ($i == 5);          # exit the loop before printing 5
    print $i;                   # will print 0124
}

Functions

##############################################################################
# name: main
# purpose: show function and subroutine syntax
##############################################################################

# return values
sub seventeen1                  # return keyword indicates the return value
{
    return 17;


}
sub seventeen2                  # if no return exists, retval is the last expression
{
    17;
}
$num = seventeen1() + seventeen2() + 53;
sub retlist                     # all datatypes can be returned
{
    return (1,2,3);
}
($one,$two,$thr) = retlist;     # () are optional (even when we have args)

# arguments
sub has_args
{
    @func_arguments = @_;       # all arguments are members of the list @_
    $first_arg = $_[0];         # returns undef if no arg given
    ($arg1,$arg2,$arg3) = @_;   # the common perl way to handle function arguments
}
has_args($num,@l1,22,@l2);      # all arguments are flattened into one list
sub takes_two_lists             # to pass several lists / hashes, use references
{
    ($l1ref,$l2ref) = @_;
    @list1 = @$l1ref;
}
takes_two_lists(\@a,\@b);

# prototypes (limited compile-time argument checking)
sub two_scalars($$) { };        # two_scalars(12,"hello");
sub scalar_n_list($@) { };      # scalar_n_list("scalar",1,2,3);
sub array_ref(\@) { };          # array_ref(@array);

Regular Expressions

##############################################################################
# name: main
# purpose: show regular expression usage
##############################################################################

# matching
$call911 = 'Someone, call 911.';        # the string we want to match upon
$found = ($call911 =~ /call/);          # $found is TRUE, matched 'call'
@res = ($call911 =~ /Some(...)/);       # @res is ('one'), matched 'Someone'
$entire_res = $&;                       # $entire_res is 'Someone'
$brack1_res = $1;                       # $brack1_res is 'one', $+ for last brackets
($entire_pos,$brack1_pos) = @-;         # $entire_pos is 0, $brack1_pos is 4
($entire_end,$brack1_end) = @+;         # $entire_end is 7, $brack1_end is 7

# global matching (get all found)


$call911 =~ /(.o.)/g;           # g is global-match; in scalar context each call
                                # finds the next match, so $1 is 'Som' here
@res = ($call911 =~ /(.o.)/g);  # in list context all matches are returned:
                                # ('Som','eon') when the search starts at the
                                # beginning of the string (a prior scalar //g
                                # match moves the start, see pos()), $& is 'eon'

# substituting
$greeting = "hello world";      # the string we want to replace in
$greeting =~ s/hello/goodbye/;  # $greeting is 'goodbye world'

# splitting
@l = split(/\W+/,$call911);     # @l is ('Someone','call','911')
@l = split(/(\W+)/,$call911);   # @l is ('Someone',', ','call',' ','911','.')

# pattern syntax
$call911 =~ /c.ll/;             # . is anything but \n, $& is 'call'
$call911 =~ /c.ll/s;            # s is single-line, . will include \n, $& is 'call'
$call911 =~ /911\./;            # \ escapes metachars {}[]()^$.|*+?\, $& is '911.'
$call911 =~ /o../;              # matches earliest, $& is 'ome'
$call911 =~ /g?one/;            # ? is 0 or 1 times, $& is 'one'
$call911 =~ /cal+/;             # + is 1 or more times, $& is 'call', * for 0 or more
$call911 =~ /cal{2}/;           # {2} is exactly 2 times, $& is 'call'
$call911 =~ /cal{0,3}/;         # {0,3} is 0 to 3 times, $& is 'call', {2,} for >= 2
$call911 =~ /S.*o/;             # matches are greedy, $& is 'Someo'
$call911 =~ /S.*?o/;            # ? makes match non-greedy, $& is 'So'
$call911 =~ /^.o/;              # ^ must match beginning of line, $& is 'So'
$call911 =~ /....$/;            # $ must match end of line, $& is '911.'
$call911 =~ /9[012-9a-z]/;      # one of the letters in [...], $& is '91'
$call911 =~ /.o[^m]/;           # none of the letters in [^...], $& is 'eon'
$call911 =~ /\d+/;              # \d is digit, $& is '911' (\d* would match the
                                # empty string at position 0 instead)
$call911 =~ /S\w*/;             # \w is word [a-zA-Z0-9_], $& is 'Someone'
$call911 =~ /..e\b/;            # \b is word boundary, $& is 'one', \B for non-boundary
$call911 =~ / \D.../;           # \D is non-digit, $& is ' call', \W for non-word
$call911 =~ /\s.*\s/;           # \s is whitespace char [ \t\n\r\f], $& is ' call '
$call911 =~ /\x39\x31+/;        # \x is hex byte, $& is '911'
$call911 =~ /Some(.*),/;        # (...) extracts, $1 is 'one', $& is 'Someone,'
$call911 =~ /e(one|two)/;       # | means or, $& is 'eone'
$call911 =~ /e(?:one|tw)/;      # (?:...) does not extract, $& is 'eone', $1 is undef
$call911 =~ /(.)..\1/;          # \1 is memory of first brackets, $& is 'omeo'
$call911 =~ /some/i;            # i is case-insensitive, $& is 'Some'
$call911 =~ /^Some/m;           # m is multi-line, ^ matches at the start of every
                                # line, not only at the start of the entire text
$call911 =~ m!call!;            # use ! instead of /, no need for \/, $& is 'call'

Special Variables

##############################################################################
# name: main
# purpose: show some special internal variables


##############################################################################

# $_ - default input
print for (1..10);              # in many places, no var will cause work on $_
print $_ for (1..10);           # same as above

# $. - current line in last file handle
while (defined($line = <IN>)) { last if $line =~ /error/i; }
print "first error on line $.\n";

# $/ - input record separator (default is "\n")
undef $/;
$entire = <IN>;                 # read entire file all at once
$/ = \512;                      # a reference to an integer sets a fixed record size
$chunk = <IN>;                  # read a chunk of 512 bytes

# $\ - output record separator (default is undef)
$\ = "\n";                      # auto \n after print
print 'no need for LF';

# $! - errno / a string description of error
open(FILE) or die "error: $!";

# $@ - errors from last eval
eval $cmd;
print "eval successful" if not $@;

Standard IO

##############################################################################
# name: main
# purpose: show some basic IO and file handling
##############################################################################

# open a file a la shell
open(IN, "< input.txt") or die "cant open input file: $!";
open(OUT, ">> output.txt") or die "cant open output file: $!";
# binmode(IN) to change IN from txt mode to binary mode

# read records from a file (according to $/)
while ($line = <IN>)            # <IN> returns next line, or FALSE if none left
{
    # write data to a file
    print OUT $line;
}

# cleanup
close(IN);
close(OUT);

# check if file exists


print "$filename exists" if (-e $filename);

# check the file size
print "$filename file size is ".(stat $filename)[7];

# get all the txt files in current directory
@txtfiles = <*.txt>;            # perl globbing
@txtfiles = `dir /b *.txt`;     # or use the shell (slower), needs chomping

Useful Functions and Keywords

##############################################################################
# name: main
# purpose: show some basic functions and keywords of perl
##############################################################################

# scalar / string functions
foreach (`dir /b`) { chomp; print; }    # chomp removes \n tail (according to $/)
$ext = chop($file).$ext for (1..3);     # chop removes last char and returns it
print 'a is '.chr(ord('a'));            # ord converts chr to num, chr is opposite
print lc("Hello"), uc(" World");        # prints 'hello WORLD'
print length("hello");                  # prints '5'
$three_a = sprintf("%08x",58);          # just like regular c sprintf, gives '0000003a'
print($type) if ($type = ref $ref);     # prints 'SCALAR'/'ARRAY'/'HASH'/'REF'

# regexps and pattern matching functions
print quotemeta('[.]');                 # prints '\[\.\]'
@words = split(/\W+/,$sentence);        # splits a string according to a regexp

# array / list functions
@three_two_one = reverse(1,2,3);        # returns a list in reverse
push(@arr,'at end'); print pop(@arr);   # prints 'at end', @arr ends up unchanged
unshift(@arr,'at start'); print shift(@arr);    # prints 'at start', @arr ends up unchanged
@after = grep(!/^\s*#/, @before);       # weed out full comment lines
$sentence = join(' ',@words);           # turns lists into strings with a delim
print sort <*.*>;                       # sort string lists in alphabetical order
splice(@arr,3,3);                       # removes the elements at indices 3,4,5 of @arr
print "length is ".scalar @arr;         # scalar evaluates expressions as scalars

# hash related functions
delete @hash{"key1","key2"};            # deletes these keys from the hash
print $hash{$_} foreach (keys %hash);   # prints all hash values by checking keys
print values(%hash);                    # same values, without going through the keys

# misc functions and keywords


sleep(10);                              # causes the script to sleep for 10 secs
exit(0) if $should_quit;                # exits the script with a return value
use warnings; use strict;               # enables pragmas (use also imports external modules)
no warnings; no strict;                 # disables them again
my $var;                                # declare a local variable (required under strict)
undef($null) if defined($null);         # check if a variable is defined
eval '$pn = $0;'; print $pn;            # interpret new perl code in runtime
system("del $filename");                # run commands in the shell (blocking)
system("start calc.exe");               # run commands in the shell (nonblocking)
@files = `dir /b`;                      # run & get output of shell commands (backticks)

This module provides syntax highlighting for Perl code. The design bias is roughly line-oriented and streamed (ie, processing a file line-by-line in a single pass). Provisions may be made in the future for tasks related to "back-tracking" (ie, re-doing a single line in the middle of a stream) such as speeding up state copying.

Constructors 

The only constructor provided is new(). When called on an existing object, new() will create a new copy of that object. Otherwise, new() creates a new copy of the (internal) Default Object. Note that the use of the procedural syntax modifies the Default Object and that those changes will be reflected in any subsequent new() calls.

Formatting 

Formatting is done using the format_string() method. Call format_string() with one or more strings to format, or it will default to using $_.

Setting and Getting Formats 

You can set the text used for formatting a syntax element using set_format() (or set the start and end format individually using set_start_format() and set_end_format(), respectively).

You can also retrieve the text used for formatting for an element via get_start_format() or get_end_format(). Bulk retrieval of the names or values of defined formats is possible via get_format_names_list() (names), get_start_format_values_list() and get_end_format_values_list().

See "FORMAT TYPES" later in this document for information on what format elements can be used.

Checking and Setting the State 


You can check certain aspects of the state of the formatter via the methods: in_heredoc(),

in_string(), in_pod(), was_pod(), in_data(), and line_count().

You can reset all of the above states (and a few other internal ones) using reset().

Stable and Unstable Formatting Modes 

You can set or check the stability of formatting via unstable().

In unstable (TRUE) mode, formatting is not considered to be persistent with nested formats. Or, put another way, when unstable, the formatter can only "remember" one format at a time and must reinstate formatting for each token. An example of unstable formatting is using ANSI color escape sequences in a terminal.

In stable (FALSE) mode (the default), formatting is considered persistent within arbitrarily nested formats. Even in stable mode, however, formatting is never allowed to span multiple lines; it is always fully closed at the end of the line and reinstated at the beginning of a new line, if necessary. This is to ensure properly balanced tags when only formatting a partial code snippet. An example of stable formatting is HTML.

Substitutions 

Using define_substitution(), you can have the formatter substitute certain strings with others, after the original string has been parsed (but before formatting is applied). This is useful for escaping characters special to the output mode (eg, > and < in HTML) without them affecting the way the code is parsed.

You can retrieve the current substitutions (as a hash-ref) via substitutions().
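
The idea of substituting after parsing but before formatting can be sketched in plain Perl. This is only an illustration under stated assumptions: the %substitutions table and the wrap_token() helper are hypothetical names, and the code does not reproduce the module's internals.

```perl
use strict;
use warnings;

# Hypothetical substitution table: keys are the strings to replace,
# values are their replacements (here, HTML entities).
my %substitutions = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;');

# Apply the table to one already-parsed token, then wrap the result in
# start and end format strings. wrap_token() is an illustrative helper,
# not part of Syntax::Highlight::Perl.
sub wrap_token {
    my ($token, $start, $end) = @_;
    my $pattern = join '|', map { quotemeta } sort keys %substitutions;
    $token =~ s/($pattern)/$substitutions{$1}/g;    # single pass, no re-substitution
    return $start . $token . $end;
}

print wrap_token('<=>', '<b>', '</b>'), "\n";   # prints <b>&lt;=&gt;</b>
```

Because the token is classified before the table is applied, the operator is still recognised as `<=>` even though the output contains only entities.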

FORMAT TYPES

The Syntax::Highlight::Perl formatter recognizes and differentiates between many Perl syntactical elements. Each type of syntactical element has a Format Type associated with it. There is also a 'DEFAULT' type that is applied to any element whose Format Type does not have a value.

Several of the Format Types have underscores in their name. This underscore is special, and indicates that the Format Type can be "generalized." This means that you can assign a value to just the first part of the Format Type name (the part before the underscore) and that value will be applied to all Format Types with the same first part. For example, the Format Types for all types of variables begin with "Variable_". Thus, if you assign a value to the Format Type "Variable", it will be applied to any type of variable. Generalized Format Types take precedence over non-generalized Format Types. So the value assigned to "Variable" would be applied to "Variable_Scalar", even if "Variable_Scalar" had a value explicitly assigned to it.


You can also define a "short-cut" name for each Format Type that can be generalized. The short-cut name would be the part of the Format Type name after the underscore. For example, the short-cut for "Variable_Scalar" would be "Scalar". Short-cut names have the least precedence and are only assigned if neither the generalized Type name, nor the full Type name have values.

Following is a list of all the syntactical elements that Syntax::Highlight::Perl currently recognizes, along with a short description of what each would be applied to.

Comment_Normal

A normal Perl comment. Starts with '#' and goes until the end of the line.

Comment_POD

Inline documentation. Starts with a line beginning with an equal sign ('=') followed by a word (eg: '=pod') and continuing until a line beginning with '=cut'.

Directive

Either the "she-bang" line at the beginning of the file, or a line directive altering what the compiler thinks the current line and file is.

Label

A loop or statement label (to be the target of a goto, next, last or redo).

Quote

Any string or character that begins or ends a String. Including, but not necessarily limited to: quote-like regular expression operators (m//, s///, tr///, etc), a Here-Document terminating line, the lone period terminating a format, and, of course, normal quotes (', ", `, q{}, qq{}, qr{}, qx{}).

String

Any text within quotes, formats, Here-Documents, Regular Expressions, and the like.

Subroutine

The identifier used to define, identify, or call a subroutine (or method). Note that Syntax::Highlight::Perl cannot recognize a subroutine if it is called without using parentheses or an ampersand, or methods called using the indirect object syntax. It formats those as barewords.

Variable_Scalar


A scalar variable.

Note that (theoretically) this format is not applied to non-scalar variables that are being used as scalars (ie: array or hash lookups, nor references to anything other than scalars). Syntax::Highlight::Perl figures out (or at least tries to) the actual type of the variable being used (by looking at how you're subscripting it) and formats it accordingly. The first character of the variable (ie, the $, @, %, or *) tells you the type of value being used, and the color (hopefully) tells you the type of variable being used to get that value.

(See "KNOWN ISSUES" for information about when this doesn't work quite right.)

Variable_Array

An array variable (but not usually a slice; see above).

Variable_Hash

A hash variable.

Variable_Typeglob

A typeglob. Note that typeglobs not beginning with an asterisk (*) (eg: filehandles) are

formatted as barewords. This is because, well, they are.

Whitespace


Whitespace. Not usually formatted but it can be.

Character

A special, or backslash-escaped, character. For example: \n (newline), or \d (digits).

Only occurs within strings or regular expressions.

Keyword

A Perl keyword. Some examples include: my, local, sub, next.

Note that Perl does not make any distinction between keywords and built-in functions (at least not in the documentation). Thus I had to make a subjective call as to what would be considered keywords and what would be built-in functions.

The list of keywords can be found (and overloaded) in the variable $Syntax::Highlight::Perl::keyword_list_re as a pre-compiled regular expression.

Builtin_Function

A Perl built-in function, called as a function (ie, using parentheses).

The list of built-in functions can be found (and overloaded) in the variable

$Syntax::Highlight::Perl::builtin_list_re as a pre-compiled regular expression.

Builtin_Operator

A Perl built-in function, called as a list or unary operator (ie, without using parentheses).

The list of built-in functions can be found (and overloaded) in the variable

$Syntax::Highlight::Perl::builtin_list_re as a pre-compiled regular expression.

Operator

A Perl operator.

The list of operators can be found (and overloaded) in the variable

$Syntax::Highlight::Perl::operator_list_re as a pre-compiled regular expression.

Bareword

A bareword. This can be a user-defined subroutine called without parentheses, a typeglob used without an asterisk (*), or just a plain old bareword.


Package

The name of a package or pragmatic module.

 Note that this does not apply to the package portion of a fully qualified variable name.

Number

A numeric literal.

Symbol

A symbol (ie, non-operator punctuation).

CodeTerm

The special tokens that signal the end of executable code and the beginning of the DATA section. Specifically, '__END__' and '__DATA__'.

DATA

Anything in the DATA section (see CodeTerm).

PROCEDURAL vs. OBJECT ORIENTED

Syntax::Highlight::Perl uses OO method-calls internally (and actually defines a Default Object that is used when the functions are invoked procedurally) so you will not gain anything (efficiency-wise) by using the procedural interface. It is just a matter of style.

It is actually recommended that you use the OO interface, as this allows you to instantiate multiple, concurrent-yet-separate formatters. Though I cannot think of why you would need multiple formatters instantiated. :-)

One point to note: the new() method uses the Default Object to initialize new objects. This means that any changes to the state of the Default Object (including Format definitions) made by using the procedural interface will be reflected in any subsequently created objects. This can be useful in some cases (eg, call set_format() procedurally just before creating a batch of new objects to define default Formats for them all) but will most likely lead to trouble.
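
The interaction between procedural calls and new() can be illustrated with a toy class. This is a minimal sketch, assuming a simplified design: ToyFormatter, its $Default object and its data layout are hypothetical stand-ins and are not taken from the Syntax::Highlight::Perl source.

```perl
use strict;
use warnings;

package ToyFormatter;   # hypothetical class, not Syntax::Highlight::Perl itself

# Shared default object, used whenever no object is supplied.
my $Default = { formats => { DEFAULT => ['', ''] } };

sub new {
    my ($invocant) = @_;
    my $class  = ref($invocant) || $invocant;
    my $source = ref($invocant) ? $invocant : $Default;   # copy an object, or the default
    my %formats = %{ $source->{formats} };                # one-level copy of the formats
    return bless { formats => \%formats }, $class;
}

# Works as an object method or, without an object, on the default object.
sub set_format {
    my $self = ref($_[0]) ? shift : $Default;
    my ($name, $pair) = @_;
    $self->{formats}{$name} = $pair;
}

package main;

ToyFormatter::set_format(Keyword => ['<b>', '</b>']);   # procedural: changes the default
my $a_fmt = ToyFormatter->new;                          # picks up the Keyword format
my $b_fmt = $a_fmt->new;                                # separate copy of $a_fmt
$b_fmt->set_format(Keyword => ['<i>', '</i>']);

print $a_fmt->{formats}{Keyword}[0], "\n";   # prints <b>
print $b_fmt->{formats}{Keyword}[0], "\n";   # prints <i>
```

Every object created after the procedural call inherits the changed default, which is the behaviour the paragraph above warns about.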

METHODS

new PACKAGE

new OBJECT


Creates a new object. If called on an existing object, creates a new copy of that object (which is thenceforth totally separate from the original).

reset

Resets the object's internal state. This breaks out of strings and here-docs, ends PODs, resets the line-count, and otherwise gets the object back into a "normal" state to begin processing a new stream.

Note that this does not reset any user options (including formats and format stability).

unstable EXPR

unstable

Returns true if the formatter is in unstable mode.

If called with a non-zero number, puts the formatter into unstable formatting mode.

In unstable mode, it is assumed that formatting is not persistent from one token to the next and that each token must be explicitly formatted.

in_heredoc

Returns true if the next string to be formatted will be inside a Here-Document.

in_string

Returns true if the next string to be formatted will be inside a multi-line string.

in_pod

Returns true if the formatter would consider the next string passed to it as being within a POD structure. This is false immediately before any POD instigators (=pod, =head1, =item, etc), true immediately after an instigator, throughout the POD and immediately before the POD terminator (=cut), and false immediately after the POD terminator.

was_pod

Returns true if the last line of the string just formatted was part of a POD structure. This

includes the /^=\w+/ POD instigators and terminators.

in_data


Returns true if the next string to be formatted will be inside the DATA section (ie,

follows a __DATA__ or __END__ tag).

line_count

Returns the number of lines processed by the formatter.

substitutions

Returns a reference to the substitution table used. The substitution table is a hash whose keys are the strings to be replaced, and whose values are what to replace them with.

define_substitution HASH_REF

define_substitution LIST

Allows the user to define certain characters that will be substituted before formatting is done (but after they have been processed for meaning).

If the first parameter is a reference to a hash, the formatter will replace its own hash with the given one, and subsequent changes to the hash outside the formatter will be reflected.

Otherwise, it will copy the arguments passed into its own hash, and any substitutions already defined (but not in the parameter list) will be preserved. (ie, the new substitutions will be added, without destroying what was there already.)
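
The difference between the two calling conventions comes down to adopting a reference versus merging a copy. The sketch below mimics that behaviour with a hypothetical define_substitution_sketch() helper and a stand-in $table; it is an assumption-laden illustration, not the module's implementation.

```perl
use strict;
use warnings;

# Toy stand-in for a formatter's substitution table.
my $table = { '<' => '&lt;' };

# Hypothetical helper showing the two conventions described above.
sub define_substitution_sketch {
    if (ref($_[0]) eq 'HASH') {
        $table = $_[0];                        # adopt the caller's hash as-is
    } else {
        my %new = @_;
        @{$table}{keys %new} = values %new;    # merge, keeping existing entries
    }
    return $table;
}

define_substitution_sketch('>' => '&gt;');     # LIST form: added to what was there
print join(',', sort keys %$table), "\n";      # prints <,>

my %mine = ('&' => '&amp;');
define_substitution_sketch(\%mine);            # HASH_REF form: table is replaced
$mine{'"'} = '&quot;';                         # later outside change shows through
print join(',', sort keys %$table), "\n";      # prints ",&
```

After the hash-ref call the earlier '<' and '>' entries are gone, because the whole table was swapped for the caller's hash.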

set_start_format HASH_REF

set_start_format LIST

Given either a list of keys/values, or a reference to a hash of keys/values, copy them into the object's Formats list.

set_end_format HASH_REF

set_end_format LIST

Given either a list of keys/values, or a reference to a hash of keys/values, copy them into

the object's Formats list.

set_format LIST

Sets the formatting string for one or more formats.


You should pass a list of keys/values where the keys are the format names and the values are references to arrays containing the starting and ending formatting strings (in that order) for that format.

get_start_format LIST

Retrieve the string that is inserted to begin a given format type (starting format string).

The names are looked for in the following order:

First: Prefer the names joined by underscore, from most general to least. For example, given ("Variable", "Scalar"): "Variable" then "Variable_Scalar".

Second: Then try each name singly, in reverse order. For example, "Scalar" then "Variable".

See "FORMAT TYPES" for more information.
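The search order above can be sketched as a helper that lists the candidate names for a given LIST. candidate_names() is a hypothetical function written for this illustration; how the module chooses among the candidates is not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: list candidate format names in the documented order.
sub candidate_names {
    my @names = @_;
    my @candidates;

    # First: names joined by underscore, from most general to least.
    push @candidates, join('_', @names[0 .. $_]) for 0 .. $#names;

    # Second: each name singly, in reverse order.
    push @candidates, reverse @names;

    return @candidates;
}

print join(', ', candidate_names('Variable', 'Scalar')), "\n";
```

For ("Variable", "Scalar") this yields "Variable", "Variable_Scalar", "Scalar", "Variable". The first name appears twice because it is both the most general joined name and the last single name.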

get_end_format LIST

Retrieve the string that is inserted to end a given format type (ending format string).

get_format_names_list

Returns a list of the names of all the Formats defined.

get_start_format_values_list

Returns a list of the values of all the start Formats defined (in the same order as the names returned by get_format_names_list()).

get_end_format_values_list

Returns a list of the values of all the end Formats defined (in the same order as the names returned by get_format_names_list()).

format_string LIST

Formats one or more strings of Perl code. If no strings are specified, defaults to $_.

Returns the list of formatted strings (or the first string formatted if called in scalar context).

Note: The end of the string is considered to be the end of a line, regardless of whether or not there is a trailing line-break (but trailing line-breaks will not cause an extra, empty line).

Another Note: The function actually uses $/ to determine line-breaks, unless $/ is set to \n (newline). If $/ is \n, then it looks for the first match of m/\r?\n|\n?\r/ in the string and uses that to determine line-breaks. This is to make it easy to handle non-unix text.

Whatever characters it ends up using as line-breaks are preserved.
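A minimal sketch of the line-break rule described above, assuming a hypothetical helper named detect_line_break(). The fallback to "\n" when no line-break is found, and the handling of an undefined $/, are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the rule described above.
sub detect_line_break {
    my ($text) = @_;

    # A $/ other than newline is used as-is (an undefined $/ falls
    # through to the pattern below; that choice is an assumption).
    return $/ if defined $/ && $/ ne "\n";

    # Otherwise take the first line-break sequence found in the text.
    return $1 if $text =~ m/(\r?\n|\n?\r)/;

    return "\n";    # assumption: fall back to newline when none is found
}

my %seen = (
    unix => detect_line_break("a\nb\n"),
    dos  => detect_line_break("a\r\nb\r\n"),
    mac  => detect_line_break("a\rb\r"),
);

# Print the detected sequences with the control characters made visible.
for my $name (sort keys %seen) {
    (my $shown = $seen{$name}) =~ s/\r/\\r/g;
    $shown =~ s/\n/\\n/g;
    print "$name: $shown\n";
}
```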

format_token TOKEN, LIST

Returns TOKEN wrapped in the start and end Formats corresponding to LIST (as would be returned by get_start_format( LIST ) and get_end_format( LIST ), respectively).

No syntax checking is done on TOKEN, but substitutions defined with define_substitution() are performed.
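The wrapping behaviour can be approximated as follows. The tables, the single format name, and the wrap_token() helper are all hypothetical stand-ins for this sketch; the real method takes a LIST of names and resolves the start and end strings through the lookup order given under get_start_format.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the formatter's tables.
my %substitutions = ('<' => '&lt;', '>' => '&gt;', '&' => '&amp;');
my %start = (String => '<span class="str">');
my %end   = (String => '</span>');

sub wrap_token {
    my ($token, $type) = @_;

    # Build one alternation (longest keys first) and substitute in a
    # single pass, so replacement text is never scanned again.
    my $pattern = join '|',
        map  { quotemeta($_) }
        sort { length($b) <=> length($a) } keys %substitutions;
    $token =~ s/($pattern)/$substitutions{$1}/g;

    # No syntax checking: the token is simply wrapped.
    return $start{$type} . $token . $end{$type};
}

print wrap_token('"a < b"', 'String'), "\n";
```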

KNOWN ISSUES or LIMITATIONS

•  Barewords used as keys to a hash are formatted as strings. This is Good. They should not be, however, if they are not the only thing within the curly braces. That can be fixed.

•  This version does not handle formats (see perlform(1)) very well. It treats them as Here-Documents and ignores the rules for comment lines, as well as the fact that picture lines are not supposed to be interpolated. Thus, your picture lines will look strange, with the '@'s being formatted as array variables (albeit invalid ones). Ideally, it would also treat value lines as normal Perl code and format them accordingly. I think I'll get to the comment lines and non-interpolating picture lines first. If/when I do get this fixed, I will most likely add a format type of 'Format' or something, so that they can be formatted differently, if so desired.

•  This version does not handle Regular Expression significant characters. It simply treats Regular Expressions as interpolated strings.

•  User-defined subroutines, called without parentheses, are formatted as barewords. This is because there is no way to tell them apart from barewords without parsing the code, which would require going as far as perl does when doing the -c check (i.e., executing BEGIN and END blocks and the like). That's not going to happen.

•  If you are indexing (subscripting) an array or hash, the formatter tries to figure out the "real" variable class by looking at how you index the variable. However, if you do something funky (but legal in Perl) and put line-breaks or comments between the variable class character ($) and your identifier, the formatter will get confused and treat your variable as a scalar until it finds the index character. Then it will format the scalar class character ($) as a scalar and your identifier as the "correct" class.

•  If you put a line-break between your variable identifier and its indexing character (see above), which is also legal in Perl, the formatter will never find it and will treat your variable as a scalar.

•  If you put a line-break between a bareword hash-subscript and the hash variable, or between a bareword and its associated => operator, the bareword will not be formatted correctly (as a string). (Noticing a pattern here?)
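To illustrate the subroutine limitation listed above, the short script below uses the same word once as a subroutine call without parentheses and once as a bareword hash key. The names are made up for this example. Telling the two uses apart requires knowing that the subroutine has been declared, which is the parsing work the formatter does not attempt.

```perl
use strict;
use warnings;

sub shout { return uc "@_" }    # user-defined subroutine

# Called without parentheses: lexically, "shout" looks just like a bareword.
my $loud = shout 'hello';

# The same word as a genuine bareword, quoted by the => operator.
my %opts = (shout => 1);

print "$loud $opts{shout}\n";
```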

AUTHOR

Cory Johns dark [email protected] 

Copyright (c) 2001 Cory Johns. This library is free software; you can redistribute and/or modify it under the same conditions as Perl itself.

TO DO

1.  Improve handling of regular expressions. Add support for regexp-special characters. Recognize the /e option to the substitution operator (maybe).

2.  Improve handling of formats. Don't treat format definitions as interpolating. Handle format-comments. Possibly format value lines as normal Perl code.

3.  Create in-memory deep-copy routine to replace eval(Data::Dumper) deep-copy.

4.  Generalize state transitions (reset() and, in the future, copy_state()) to use non-hard-coded keys and values for state variables. Probably will extrapolate them into an overloadable hash, and use the aforementioned deep-copy to assign them.

5.  Create a method to save or copy states between objects (copy_state()). Would be useful for using this module in an editor.

6.  Add support for greater-than-one length special characters. Specifically, octal, hexadecimal, and control character codes. For example, \644, \x1a4 or \c[.

REVISIONS

04-04-2001 Cory Johns 

•  Fixed problem with special characters not formatting inside of Here-Documents.

•  Fixed bug causing hash variables to format inside of Here-Documents.

03-30-2001 Cory Johns 

•  Fixed bug where quote-terminators were checked for inside of Here-Documents.
