author | 2013-03-25 20:06:16 +0000 | |
---|---|---|
committer | 2013-03-25 20:06:16 +0000 | |
commit | 898184e3e61f9129feb5978fad5a8c6865f00b92 (patch) | |
tree | 56f32aefc1eed60b534611007c7856f82697a205 /gnu/usr.bin/perl/cpan/perlfaq/lib | |
parent | PGSHIFT -> PAGE_SHIFT | |
import perl 5.16.3 from CPAN - worked on by Andrew Fresh and myself
Diffstat (limited to 'gnu/usr.bin/perl/cpan/perlfaq/lib')
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pm | 6 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pod | 1373 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq1.pod | 332 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq2.pod | 246 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq3.pod | 1160 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq4.pod | 2679 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq5.pod | 1574 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq6.pod | 1124 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq7.pod | 1061 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq8.pod | 1422 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq9.pod | 426 |
-rw-r--r-- | gnu/usr.bin/perl/cpan/perlfaq/lib/perlglossary.pod | 3442 |
12 files changed, 14845 insertions, 0 deletions
diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pm b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pm new file mode 100644 index 00000000000..1d5b4e4233f --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pm @@ -0,0 +1,6 @@ +package perlfaq; +{ + $perlfaq::VERSION = '5.0150039'; +} + +0; # not is it supposed to be loaded diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pod new file mode 100644 index 00000000000..449c0a2de84 --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq.pod @@ -0,0 +1,1373 @@ +=head1 NAME + +perlfaq - frequently asked questions about Perl + +=head1 DESCRIPTION + +The perlfaq comprises several documents that answer the most commonly +asked questions about Perl and Perl programming. It's divided by topic +into nine major sections outlined in this document. + +=head2 Where to find the perlfaq + +The perlfaq is an evolving document. Read the latest version at +L<http://learn.perl.org/faq/>. It is also included in the standard Perl +distribution. + +=head2 How to use the perlfaq + +The C<perldoc> command line tool is part of the standard Perl distribution. To +read the perlfaq: + + $ perldoc perlfaq + +To search the perlfaq question headings: + + $ perldoc -q open + +=head2 How to contribute to the perlfaq + +Review L<https://github.com/perl-doc-cats/perlfaq/wiki>. If you don't find +your suggestion create an issue or pull request against +L<https://github.com/perl-doc-cats/perlfaq>. + +Once approved, changes are merged into L<https://github.com/tpf/perlfaq>, the +repository which drives L<http://learn.perl.org/faq/>, and they are +distributed with the next Perl 5 release. + +=head2 What if my question isn't answered in the FAQ? + +Try the resources in L<perlfaq2>. 
+ +=head1 TABLE OF CONTENTS + +=over 4 + +=item perlfaq1 - General Questions About Perl + +=item perlfaq2 - Obtaining and Learning about Perl + +=item perlfaq3 - Programming Tools + +=item perlfaq4 - Data Manipulation + +=item perlfaq5 - Files and Formats + +=item perlfaq6 - Regular Expressions + +=item perlfaq7 - General Perl Language Issues + +=item perlfaq8 - System Interaction + +=item perlfaq9 - Web, Email and Networking + +=back + +=head1 THE QUESTIONS + +=head2 L<perlfaq1>: General Questions About Perl + +This section of the FAQ answers very general, high-level questions about Perl. + +=over 4 + +=item * + +What is Perl? + +=item * + +Who supports Perl? Who develops it? Why is it free? + +=item * + +Which version of Perl should I use? + +=item * + +What are Perl 4, Perl 5, or Perl 6? + +=item * + +What is Perl 6? + +=item * + +How stable is Perl? + +=item * + +Is Perl difficult to learn? + +=item * + +How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? + +=item * + +Can I do [task] in Perl? + +=item * + +When shouldn't I program in Perl? + +=item * + +What's the difference between "perl" and "Perl"? + +=item * + +What is a JAPH? + +=item * + +How can I convince others to use Perl? + +=back + + +=head2 L<perlfaq2>: Obtaining and Learning about Perl + +This section of the FAQ answers questions about where to find source and documentation for Perl, support, and related matters. + +=over 4 + +=item * + +What machines support Perl? Where do I get it? + +=item * + +How can I get a binary version of Perl? + +=item * + +I don't have a C compiler. How can I build my own Perl interpreter? + +=item * + +I copied the Perl binary from one machine to another, but scripts don't work. + +=item * + +I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? + +=item * + +What modules and extensions are available for Perl? What is CPAN? + +=item * + +Where can I get information on Perl? 
+ +=item * + +What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org? + +=item * + +Where can I post questions? + +=item * + +Perl Books + +=item * + +Which magazines have Perl content? + +=item * + +Which Perl blogs should I read? + +=item * + +What mailing lists are there for Perl? + +=item * + +Where can I buy a commercial version of Perl? + +=item * + +Where do I send bug reports? + +=back + + +=head2 L<perlfaq3>: Programming Tools + +This section of the FAQ answers questions related to programmer tools and programming support. + +=over 4 + +=item * + +How do I do (anything)? + +=item * + +How can I use Perl interactively? + +=item * + +How do I find which modules are installed on my system? + +=item * + +How do I debug my Perl programs? + +=item * + +How do I profile my Perl programs? + +=item * + +How do I cross-reference my Perl programs? + +=item * + +Is there a pretty-printer (formatter) for Perl? + +=item * + +Is there an IDE or Windows Perl Editor? + +=item * + +Where can I get Perl macros for vi? + +=item * + +Where can I get perl-mode or cperl-mode for emacs? + +=item * + +How can I use curses with Perl? + +=item * + +How can I write a GUI (X, Tk, Gtk, etc.) in Perl? + +=item * + +How can I make my Perl program run faster? + +=item * + +How can I make my Perl program take less memory? + +=item * + +Is it safe to return a reference to local or lexical data? + +=item * + +How can I free an array or hash so my program shrinks? + +=item * + +How can I make my CGI script more efficient? + +=item * + +How can I hide the source for my Perl program? + +=item * + +How can I compile my Perl program into byte code or C? + +=item * + +How can I get C<#!perl> to work on [MS-DOS,NT,...]? + +=item * + +Can I write useful Perl programs on the command line? + +=item * + +Why don't Perl one-liners work on my DOS/Mac/VMS system? + +=item * + +Where can I learn about CGI or Web programming in Perl? 
+ +=item * + +Where can I learn about object-oriented Perl programming? + +=item * + +Where can I learn about linking C with Perl? + +=item * + +I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong? + +=item * + +When I tried to run my script, I got this message. What does it mean? + +=item * + +What's MakeMaker? + +=back + + +=head2 L<perlfaq4>: Data Manipulation + +This section of the FAQ answers questions related to manipulating numbers, dates, strings, arrays, hashes, and miscellaneous data issues. + +=over 4 + +=item * + +Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)? + +=item * + +Why is int() broken? + +=item * + +Why isn't my octal data interpreted correctly? + +=item * + +Does Perl have a round() function? What about ceil() and floor()? Trig functions? + +=item * + +How do I convert between numeric representations/bases/radixes? + +=item * + +Why doesn't & work the way I want it to? + +=item * + +How do I multiply matrices? + +=item * + +How do I perform an operation on a series of integers? + +=item * + +How can I output Roman numerals? + +=item * + +Why aren't my random numbers random? + +=item * + +How do I get a random number between X and Y? + +=item * + +How do I find the day or week of the year? + +=item * + +How do I find the current century or millennium? + +=item * + +How can I compare two dates and find the difference? + +=item * + +How can I take a string and turn it into epoch seconds? + +=item * + +How can I find the Julian Day? + +=item * + +How do I find yesterday's date? + +=item * + +Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant? + +=item * + +How do I validate input? + +=item * + +How do I unescape a string? + +=item * + +How do I remove consecutive pairs of characters? + +=item * + +How do I expand function calls in a string? + +=item * + +How do I find matching/nesting anything? 
+ +=item * + +How do I reverse a string? + +=item * + +How do I expand tabs in a string? + +=item * + +How do I reformat a paragraph? + +=item * + +How can I access or change N characters of a string? + +=item * + +How do I change the Nth occurrence of something? + +=item * + +How can I count the number of occurrences of a substring within a string? + +=item * + +How do I capitalize all the words on one line? + +=item * + +How can I split a [character]-delimited string except when inside [character]? + +=item * + +How do I strip blank space from the beginning/end of a string? + +=item * + +How do I pad a string with blanks or pad a number with zeroes? + +=item * + +How do I extract selected columns from a string? + +=item * + +How do I find the soundex value of a string? + +=item * + +How can I expand variables in text strings? + +=item * + +What's wrong with always quoting "$vars"? + +=item * + +Why don't my E<lt>E<lt>HERE documents work? + +=item * + +What is the difference between a list and an array? + +=item * + +What is the difference between $array[1] and @array[1]? + +=item * + +How can I remove duplicate elements from a list or array? + +=item * + +How can I tell whether a certain element is contained in a list or array? + +=item * + +How do I compute the difference of two arrays? How do I compute the intersection of two arrays? + +=item * + +How do I test whether two arrays or hashes are equal? + +=item * + +How do I find the first array element for which a condition is true? + +=item * + +How do I handle linked lists? + +=item * + +How do I handle circular lists? + +=item * + +How do I shuffle an array randomly? + +=item * + +How do I process/modify each element of an array? + +=item * + +How do I select a random element from an array? + +=item * + +How do I permute N elements of a list? + +=item * + +How do I sort an array by (anything)? + +=item * + +How do I manipulate arrays of bits? 
+ +=item * + +Why does defined() return true on empty arrays and hashes? + +=item * + +How do I process an entire hash? + +=item * + +How do I merge two hashes? + +=item * + +What happens if I add or remove keys from a hash while iterating over it? + +=item * + +How do I look up a hash element by value? + +=item * + +How can I know how many entries are in a hash? + +=item * + +How do I sort a hash (optionally by value instead of key)? + +=item * + +How can I always keep my hash sorted? + +=item * + +What's the difference between "delete" and "undef" with hashes? + +=item * + +Why don't my tied hashes make the defined/exists distinction? + +=item * + +How do I reset an each() operation part-way through? + +=item * + +How can I get the unique keys from two hashes? + +=item * + +How can I store a multidimensional array in a DBM file? + +=item * + +How can I make my hash remember the order I put elements into it? + +=item * + +Why does passing a subroutine an undefined element in a hash create it? + +=item * + +How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? + +=item * + +How can I use a reference as a hash key? + +=item * + +How can I check if a key exists in a multilevel hash? + +=item * + +How can I prevent addition of unwanted keys into a hash? + +=item * + +How do I handle binary data correctly? + +=item * + +How do I determine whether a scalar is a number/whole/integer/float? + +=item * + +How do I keep persistent data across program calls? + +=item * + +How do I print out or copy a recursive data structure? + +=item * + +How do I define methods for every class/object? + +=item * + +How do I verify a credit card checksum? + +=item * + +How do I pack arrays of doubles or floats for XS code? + +=back + + +=head2 L<perlfaq5>: Files and Formats + +This section deals with I/O and the "f" issues: filehandles, flushing, formats, and footers. + +=over 4 + +=item * + +How do I flush/unbuffer an output filehandle? 
Why must I do this? + +=item * + +How do I change, delete, or insert a line in a file, or append to the beginning of a file? + +=item * + +How do I count the number of lines in a file? + +=item * + +How do I delete the last N lines from a file? + +=item * + +How can I use Perl's C<-i> option from within a program? + +=item * + +How can I copy a file? + +=item * + +How do I make a temporary file name? + +=item * + +How can I manipulate fixed-record-length files? + +=item * + +How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles? + +=item * + +How can I use a filehandle indirectly? + +=item * + +How can I set up a footer format to be used with write()? + +=item * + +How can I write() into a string? + +=item * + +How can I open a filehandle to a string? + +=item * + +How can I output my numbers with commas added? + +=item * + +How can I translate tildes (~) in a filename? + +=item * + +How come when I open a file read-write it wipes it out? + +=item * + +Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>? + +=item * + +How can I open a file with a leading "E<gt>" or trailing blanks? + +=item * + +How can I reliably rename a file? + +=item * + +How can I lock a file? + +=item * + +Why can't I just open(FH, "E<gt>file.lock")? + +=item * + +I still don't get locking. I just want to increment the number in the file. How can I do this? + +=item * + +All I want to do is append a small amount of text to the end of a file. Do I still have to use locking? + +=item * + +How do I randomly update a binary file? + +=item * + +How do I get a file's timestamp in perl? + +=item * + +How do I set a file's timestamp in perl? + +=item * + +How do I print to more than one file at once? + +=item * + +How can I read in an entire file all at once? + +=item * + +How can I read in a file by paragraphs? + +=item * + +How can I read a single character from a file? From the keyboard? 
+ +=item * + +How can I tell whether there's a character waiting on a filehandle? + +=item * + +How do I do a C<tail -f> in perl? + +=item * + +How do I dup() a filehandle in Perl? + +=item * + +How do I close a file descriptor by number? + +=item * + +Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work? + +=item * + +Why doesn't glob("*.*") get all the files? + +=item * + +Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl? + +=item * + +How do I select a random line from a file? + +=item * + +Why do I get weird spaces when I print an array of lines? + +=item * + +How do I traverse a directory tree? + +=item * + +How do I delete a directory tree? + +=item * + +How do I copy an entire directory? + +=back + + +=head2 L<perlfaq6>: Regular Expressions + +This section is surprisingly small because the rest of the FAQ is littered with answers involving regular expressions. For example, decoding a URL and checking whether something is a number can be handled with regular expressions, but those answers are found elsewhere in this document (in perlfaq9 : "How do I decode or create those %-encodings on the web" and perlfaq4 : "How do I determine whether a scalar is a number/whole/integer/float", to be precise). + +=over 4 + +=item * + +How can I hope to use regular expressions without creating illegible and unmaintainable code? + +=item * + +I'm having trouble matching over more than one line. What's wrong? + +=item * + +How can I pull out lines between two patterns that are themselves on different lines? + +=item * + +How do I match XML, HTML, or other nasty, ugly things with a regex? + +=item * + +I put a regular expression into $/ but it didn't work. What's wrong? + +=item * + +How do I substitute case-insensitively on the LHS while preserving case on the RHS? + +=item * + +How can I make C<\w> match national character sets? 
+ +=item * + +How can I match a locale-smart version of C</[a-zA-Z]/> ? + +=item * + +How can I quote a variable to use in a regex? + +=item * + +What is C</o> really for? + +=item * + +How do I use a regular expression to strip C-style comments from a file? + +=item * + +Can I use Perl regular expressions to match balanced text? + +=item * + +What does it mean that regexes are greedy? How can I get around it? + +=item * + +How do I process each word on each line? + +=item * + +How can I print out a word-frequency or line-frequency summary? + +=item * + +How can I do approximate matching? + +=item * + +How do I efficiently match many regular expressions at once? + +=item * + +Why don't word-boundary searches with C<\b> work for me? + +=item * + +Why does using $&, $`, or $' slow my program down? + +=item * + +What good is C<\G> in a regular expression? + +=item * + +Are Perl regexes DFAs or NFAs? Are they POSIX compliant? + +=item * + +What's wrong with using grep in a void context? + +=item * + +How can I match strings with multibyte characters? + +=item * + +How do I match a regular expression that's in a variable? + +=back + + +=head2 L<perlfaq7>: General Perl Language Issues + +This section deals with general Perl language issues that don't clearly fit into any of the other sections. + +=over 4 + +=item * + +Can I get a BNF/yacc/RE for the Perl language? + +=item * + +What are all these $@%&* punctuation signs, and how do I know when to use them? + +=item * + +Do I always/never have to quote my strings or use semicolons and commas? + +=item * + +How do I skip some return values? + +=item * + +How do I temporarily block warnings? + +=item * + +What's an extension? + +=item * + +Why do Perl operators have different precedence than C operators? + +=item * + +How do I declare/create a structure? + +=item * + +How do I create a module? + +=item * + +How do I adopt or take over a module already on CPAN? + +=item * + +How do I create a class? 
+ +=item * + +How can I tell if a variable is tainted? + +=item * + +What's a closure? + +=item * + +What is variable suicide and how can I prevent it? + +=item * + +How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}? + +=item * + +How do I create a static variable? + +=item * + +What's the difference between dynamic and lexical (static) scoping? Between local() and my()? + +=item * + +How can I access a dynamic variable while a similarly named lexical is in scope? + +=item * + +What's the difference between deep and shallow binding? + +=item * + +Why doesn't "my($foo) = E<lt>$fhE<gt>;" work right? + +=item * + +How do I redefine a builtin function, operator, or method? + +=item * + +What's the difference between calling a function as &foo and foo()? + +=item * + +How do I create a switch or case statement? + +=item * + +How can I catch accesses to undefined variables, functions, or methods? + +=item * + +Why can't a method included in this same file be found? + +=item * + +How can I find out my current or calling package? + +=item * + +How can I comment out a large block of Perl code? + +=item * + +How do I clear a package? + +=item * + +How can I use a variable as a variable name? + +=item * + +What does "bad interpreter" mean? + +=back + + +=head2 L<perlfaq8>: System Interaction + +This section of the Perl FAQ covers questions involving operating system interaction. Topics include interprocess communication (IPC), control over the user-interface (keyboard, screen and pointing devices), and most anything else not related to data manipulation. + +=over 4 + +=item * + +How do I find out which operating system I'm running under? + +=item * + +How come exec() doesn't return? + +=item * + +How do I do fancy stuff with the keyboard/screen/mouse? + +=item * + +How do I print something out in color? + +=item * + +How do I read just one key without waiting for a return key? + +=item * + +How do I check whether input is ready on the keyboard? 
+ +=item * + +How do I clear the screen? + +=item * + +How do I get the screen size? + +=item * + +How do I ask the user for a password? + +=item * + +How do I read and write the serial port? + +=item * + +How do I decode encrypted password files? + +=item * + +How do I start a process in the background? + +=item * + +How do I trap control characters/signals? + +=item * + +How do I modify the shadow password file on a Unix system? + +=item * + +How do I set the time and date? + +=item * + +How can I sleep() or alarm() for under a second? + +=item * + +How can I measure time under a second? + +=item * + +How can I do an atexit() or setjmp()/longjmp()? (Exception handling) + +=item * + +Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean? + +=item * + +How can I call my system's unique C functions from Perl? + +=item * + +Where do I get the include files to do ioctl() or syscall()? + +=item * + +Why do setuid perl scripts complain about kernel problems? + +=item * + +How can I open a pipe both to and from a command? + +=item * + +Why can't I get the output of a command with system()? + +=item * + +How can I capture STDERR from an external command? + +=item * + +Why doesn't open() return an error when a pipe open fails? + +=item * + +What's wrong with using backticks in a void context? + +=item * + +How can I call backticks without shell processing? + +=item * + +Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)? + +=item * + +How can I convert my shell script to perl? + +=item * + +Can I use perl to run a telnet or ftp session? + +=item * + +How can I write expect in Perl? + +=item * + +Is there a way to hide perl's command line from programs such as "ps"? + +=item * + +I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? 
+ +=item * + +How do I close a process's filehandle without waiting for it to complete? + +=item * + +How do I fork a daemon process? + +=item * + +How do I find out if I'm running interactively or not? + +=item * + +How do I timeout a slow event? + +=item * + +How do I set CPU limits? + +=item * + +How do I avoid zombies on a Unix system? + +=item * + +How do I use an SQL database? + +=item * + +How do I make a system() exit on control-C? + +=item * + +How do I open a file without blocking? + +=item * + +How do I tell the difference between errors from the shell and perl? + +=item * + +How do I install a module from CPAN? + +=item * + +What's the difference between require and use? + +=item * + +How do I keep my own module/library directory? + +=item * + +How do I add the directory my program lives in to the module/library search path? + +=item * + +How do I add a directory to my include path (@INC) at runtime? + +=item * + +What is socket.ph and where do I get it? + +=back + + +=head2 L<perlfaq9>: Web, Email and Networking + +This section deals with questions related to running web sites, sending and receiving email as well as general networking. + +=over 4 + +=item * + +Should I use a web framework? + +=item * + +Which web framework should I use? + +=item * + +What is Plack and PSGI? + +=item * + +How do I remove HTML from a string? + +=item * + +How do I extract URLs? + +=item * + +How do I fetch an HTML file? + +=item * + +How do I automate an HTML form submission? + +=item * + +How do I decode or create those %-encodings on the web? + +=item * + +How do I redirect to another page? + +=item * + +How do I put a password on my web pages? + +=item * + +How do I make sure users can't enter values into a form that causes my CGI script to do bad things? + +=item * + +How do I parse a mail header? + +=item * + +How do I check a valid mail address? + +=item * + +How do I decode a MIME/BASE64 string? + +=item * + +How do I find the user's mail address? 
+ +=item * + +How do I send email? + +=item * + +How do I use MIME to make an attachment to a mail message? + +=item * + +How do I read email? + +=item * + +How do I find out my hostname, domainname, or IP address? + +=item * + +How do I fetch/put an (S)FTP file? + +=item * + +How can I do RPC in Perl? + +=back + + + +=head1 CREDITS + +Tom Christiansen wrote the original perlfaq then expanded it with the +help of Nat Torkington. brian d foy substantialy edited and expanded +the perlfaq. perlfaq-workers and others have also supplied feedback, +patches and corrections over the years. + +=head1 AUTHOR AND COPYRIGHT + +Tom Christiansen wrote the original version of this document. +brian d foy C<< <bdfoy@cpan.org> >> wrote this version. See the +individual perlfaq documents for additional copyright information. + +This document is available under the same terms as Perl itself. Code +examples in all the perlfaq documents are in the public domain. Use +them as you see fit (and at your own risk with no warranty from anyone). diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq1.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq1.pod new file mode 100644 index 00000000000..a02fae6a707 --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq1.pod @@ -0,0 +1,332 @@ +=head1 NAME + +perlfaq1 - General Questions About Perl + +=head1 DESCRIPTION + +This section of the FAQ answers very general, high-level questions +about Perl. + +=head2 What is Perl? + +Perl is a high-level programming language with an eclectic heritage +written by Larry Wall and a cast of thousands. + +Perl's process, file, and text manipulation facilities make it +particularly well-suited for tasks involving quick prototyping, system +utilities, software tools, system management tasks, database access, +graphical programming, networking, and web programming. + +Perl derives from the ubiquitous C programming language and to a +lesser extent from sed, awk, the Unix shell, and many other tools +and languages. 
+ +These strengths make it especially popular with web developers +and system administrators. Mathematicians, geneticists, journalists, +managers and many other people also use Perl. + +=head2 Who supports Perl? Who develops it? Why is it free? + +The original culture of the pre-populist Internet and the deeply-held +beliefs of Perl's author, Larry Wall, gave rise to the free and open +distribution policy of Perl. Perl is supported by its users. The +core, the standard Perl library, the optional modules, and the +documentation you're reading now were all written by volunteers. + +The core development team (known as the Perl Porters) +are a group of highly altruistic individuals committed to +producing better software for free than you could hope to purchase for +money. You may snoop on pending developments via the +L<archives|http://www.nntp.perl.org/group/perl.perl5.porters/> +or read the L<faq|http://dev.perl.org/perl5/docs/p5p-faq.html>, +or you can subscribe to the mailing list by sending +perl5-porters-subscribe@perl.org a subscription request +(an empty message with no subject is fine). + +While the GNU project includes Perl in its distributions, there's no +such thing as "GNU Perl". Perl is not produced nor maintained by the +Free Software Foundation. Perl's licensing terms are also more open +than GNU software's tend to be. + +You can get commercial support of Perl if you wish, although for most +users the informal support will more than suffice. See the answer to +"Where can I buy a commercial version of Perl?" for more information. + +=head2 Which version of Perl should I use? + +(contributed by brian d foy) + +There is often a matter of opinion and taste, and there isn't any one +answer that fits everyone. In general, you want to use either the current +stable release, or the stable release immediately prior to that one. +Currently, those are perl5.14.x and perl5.12.x, respectively. 
+ +Beyond that, you have to consider several things and decide which is best +for you. + +=over 4 + +=item * + +If things aren't broken, upgrading perl may break them (or at least issue +new warnings). + +=item * + +The latest versions of perl have more bug fixes. + +=item * + +The Perl community is geared toward supporting the most recent releases, +so you'll have an easier time finding help for those. + +=item * + +Versions prior to perl5.004 had serious security problems with buffer +overflows, and in some cases have CERT advisories (for instance, +L<http://www.cert.org/advisories/CA-1997-17.html> ). + +=item * + +The latest versions are probably the least deployed and widely tested, so +you may want to wait a few months after their release and see what +problems others have if you are risk averse. + +=item * + +The immediate, previous releases (i.e. perl5.8.x ) are usually maintained +for a while, although not at the same level as the current releases. + +=item * + +No one is actively supporting Perl 4. Ten years ago it was a dead +camel carcass (according to this document). Now it's barely a skeleton +as its whitewashed bones have fractured or eroded. + +=item * + +The current leading implementation of Perl 6, Rakudo, released a "useful, +usable, 'early adopter'" distribution of Perl 6 (called Rakudo Star) in July of +2010. Please see L<http://rakudo.org/> for more information. + +=item * + +There are really two tracks of perl development: a maintenance version +and an experimental version. The maintenance versions are stable, and +have an even number as the minor release (i.e. perl5.10.x, where 10 is the +minor release). The experimental versions may include features that +don't make it into the stable versions, and have an odd number as the +minor release (i.e. perl5.9.x, where 9 is the minor release). + +=back + +=head2 What are Perl 4, Perl 5, or Perl 6? + +In short, Perl 4 is the parent to both Perl 5 and Perl 6. 
Perl 5 is the older +sibling, and though they are different languages, someone who knows one will +spot many similarities in the other. + +The number after Perl (i.e. the 5 after Perl 5) is the major release +of the perl interpreter as well as the version of the language. Each +major version has significant differences that earlier versions cannot +support. + +The current major release of Perl is Perl 5, first released in +1994. It can run scripts from the previous major release, Perl 4 +(March 1991), but has significant differences. + +Perl 6 is a reinvention of Perl, it is a language in the same lineage but +not compatible. The two are complementary, not mutually exclusive. Perl 6 is +not meant to replace Perl 5, and vice versa. See L</"What is Perl 6?"> below +to find out more. + +See L<perlhist> for a history of Perl revisions. + +=head2 What is Perl 6? + +Perl 6 was I<originally> described as the community's rewrite of Perl 5. +Development started in 2002; syntax and design work continue to this day. +As the language has evolved, it has become clear that it is a separate +language, incompatible with Perl 5 but in the same language family. + +Contrary to popular belief, Perl 6 and Perl 5 peacefully coexist with one +another. Perl 6 has proven to be a fascinating source of ideas for those +using Perl 5 (the L<Moose> object system is a well-known example). There is +overlap in the communities, and this overlap fosters the tradition of sharing +and borrowing that have been instrumental to Perl's success. The current +leading implementation of Perl 6 is Rakudo, and you can learn more about +it at L<http://rakudo.org>. + +If you want to learn more about Perl 6, or have a desire to help in +the crusade to make Perl a better place then read the Perl 6 developers +page at L<http://www.perl6.org/> and get involved. + +"We're really serious about reinventing everything that needs reinventing." +--Larry Wall + +=head2 How stable is Perl? 
+ +Production releases, which incorporate bug fixes and new functionality, +are widely tested before release. Since the 5.000 release, we have +averaged about one production release per year. + +The Perl development team occasionally make changes to the +internal core of the language, but all possible efforts are made toward +backward compatibility. + +=head2 Is Perl difficult to learn? + +No, Perl is easy to start L<learning|http://learn.perl.org/> --and easy to keep learning. It looks +like most programming languages you're likely to have experience +with, so if you've ever written a C program, an awk script, a shell +script, or even a BASIC program, you're already partway there. + +Most tasks only require a small subset of the Perl language. One of +the guiding mottos for Perl development is "there's more than one way +to do it" (TMTOWTDI, sometimes pronounced "tim toady"). Perl's +learning curve is therefore shallow (easy to learn) and long (there's +a whole lot you can do if you really want). + +Finally, because Perl is frequently (but not always, and certainly not by +definition) an interpreted language, you can write your programs and test +them without an intermediate compilation step, allowing you to experiment +and test/debug quickly and easily. This ease of experimentation flattens +the learning curve even more. + +Things that make Perl easier to learn: Unix experience, almost any kind +of programming experience, an understanding of regular expressions, and +the ability to understand other people's code. If there's something you +need to do, then it's probably already been done, and a working example is +usually available for free. Don't forget Perl modules, either. +They're discussed in Part 3 of this FAQ, along with L<CPAN|http://www.cpan.org/>, which is +discussed in Part 2. + +=head2 How does Perl compare with other languages like Java, Python, REXX, Scheme, or Tcl? 
+ +Perl can be used for almost any coding problem, even ones which require +integrating specialist C code for extra speed. As with any tool it can +be used well or badly. Perl has many strengths and a few weaknesses; +precisely which areas are good and bad is often a personal choice. + +When choosing a language you should also be influenced by the +L<resources|http://www.cpan.org/>, L<testing culture|http://www.cpantesters.org/> +and L<community|http://www.perl.org/community.html> which surround it. + +For comparisons to a specific language it is often best to create +a small project in both languages and compare the results. Make sure +to use all the L<resources|http://www.cpan.org/> of each language, +as a language is far more than just its syntax. + +=head2 Can I do [task] in Perl? + +Perl is flexible and extensible enough for you to use on virtually any +task, from one-line file-processing tasks to large, elaborate systems. + +For many people, Perl serves as a great replacement for shell scripting. +For others, it serves as a convenient, high-level replacement for most of +what they'd program in low-level languages like C or C++. It's ultimately +up to you (and possibly your management) which tasks you'll use Perl +for and which you won't. + +If you have a library that provides an API, you can make any component +of it available as just another Perl function or variable using a Perl +extension written in C or C++ and dynamically linked into your main +perl interpreter. You can also go the other direction, and write your +main program in C or C++, and then link in some Perl code on the fly, +to create a powerful application. See L<perlembed>. + +That said, there will always be small, focused, special-purpose +languages dedicated to a specific problem domain that are simply more +convenient for certain kinds of problems. Perl tries to be all things +to all people, but nothing special to anyone. 
Examples of specialized +languages that come to mind include prolog and matlab. + +=head2 When shouldn't I program in Perl? + +One good reason is when you already have an existing +application written in another language that's all done (and done +well), or you have an application language specifically designed for a +certain task (e.g. prolog, make). + +If you find that you need to speed up a specific part of a Perl +application (not something you often need) you may want to use C, +but you can access this from your Perl code with L<perlxs>. + +=head2 What's the difference between "perl" and "Perl"? + +"Perl" is the name of the language. Only the "P" is capitalized. +The name of the interpreter (the program which runs the Perl script) +is "perl" with a lowercase "p". + +You may or may not choose to follow this usage. But never write "PERL", +because perl is not an acronym. + +=head2 What is a JAPH? + +(contributed by brian d foy) + +JAPH stands for "Just another Perl hacker,", which Randal Schwartz used +to sign email and usenet messages starting in the late 1980s. He +previously used the phrase with many subjects ("Just another x hacker,"), +so to distinguish his JAPH, he started to write them as Perl programs: + + print "Just another Perl hacker,"; + +Other people picked up on this and started to write clever or obfuscated +programs to produce the same output, spinning things quickly out of +control while still providing hours of amusement for their creators and +readers. + +CPAN has several JAPH programs at L<http://www.cpan.org/misc/japh>. + +=head2 How can I convince others to use Perl? + +(contributed by brian d foy) + +Appeal to their self interest! If Perl is new (and thus scary) to them, +find something that Perl can do to solve one of their problems. That +might mean that Perl either saves them something (time, headaches, money) +or gives them something (flexibility, power, testability). 
+ +In general, the benefit of a language is closely related to the skill of +the people using that language. If you or your team can be faster, +better, and stronger through Perl, you'll deliver more value. Remember, +people often respond better to what they get out of it. If you run +into resistance, figure out what those people get out of the other +choice and how Perl might satisfy that requirement. + +You don't have to worry about finding or paying for Perl; it's freely +available and several popular operating systems come with Perl. Community +support in places such as Perlmonks ( L<http://www.perlmonks.com> ) +and the various Perl mailing lists ( L<http://lists.perl.org> ) means that +you can usually get quick answers to your problems. + +Finally, keep in mind that Perl might not be the right tool for every +job. You're a much better advocate if your claims are reasonable and +grounded in reality. Dogmatically advocating anything tends to make +people discount your message. Be honest about possible disadvantages +to your choice of Perl since any choice has trade-offs. + +You might find these links useful: + +=over 4 + +=item * L<http://www.perl.org/about.html> + +=item * L<http://perltraining.com.au/whyperl.html> + +=back + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples here are in the public +domain. You are permitted and encouraged to use this code and any +derivatives thereof in your own programs for fun or for profit as you +see fit. A simple comment in the code giving credit to the FAQ would +be courteous but is not required. 
diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq2.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq2.pod new file mode 100644 index 00000000000..e890cc34a1a --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq2.pod @@ -0,0 +1,246 @@ +=head1 NAME + +perlfaq2 - Obtaining and Learning about Perl + +=head1 DESCRIPTION + +This section of the FAQ answers questions about where to find +source and documentation for Perl, support, and +related matters. + +=head2 What machines support Perl? Where do I get it? + +The standard release of Perl (the one maintained by the Perl +development team) is distributed only in source code form. You +can find the latest releases at L<http://www.cpan.org/src/>. + +Perl builds and runs on a bewildering number of platforms. Virtually +all known and current Unix derivatives are supported (perl's native +platform), as are other systems like VMS, DOS, OS/2, Windows, +QNX, BeOS, OS X, MPE/iX and the Amiga. + +Binary distributions for some proprietary platforms can be found in the +L<http://www.cpan.org/ports/> directory. Because these are not part of +the standard distribution, they may and in fact do differ from the +base perl port in a variety of ways. You'll have to check their +respective release notes to see just what the differences are. These +differences can be either positive (e.g. extensions for the features +of the particular platform that are not supported in the source +release of perl) or negative (e.g. might be based upon a less current +source release of perl). + +=head2 How can I get a binary version of Perl? + +See L<CPAN Ports|http://www.cpan.org/ports/>. + +=head2 I don't have a C compiler. How can I build my own Perl interpreter? + +For Windows, use a binary version of Perl; +L<Strawberry Perl|http://strawberryperl.com/> and +L<ActivePerl|http://www.activestate.com/activeperl> come with a +bundled C compiler. 
+ +Otherwise if you really do want to build Perl, you need to get a +binary version of C<gcc> for your system first. Use a search +engine to find out how to do this for your operating system. + +=head2 I copied the Perl binary from one machine to another, but scripts don't work. + +That's probably because you forgot libraries, or library paths differ. +You really should build the whole distribution on the machine it will +eventually live on, and then type C<make install>. Most other +approaches are doomed to failure. + +One simple way to check that things are in the right place is to print out +the hard-coded C<@INC> that perl looks through for libraries: + + % perl -le 'print for @INC' + +If this command lists any paths that don't exist on your system, then you +may need to move the appropriate libraries to these locations, or create +symbolic links, aliases, or shortcuts appropriately. C<@INC> is also printed as +part of the output of + + % perl -V + +You might also want to check out +L<perlfaq8/"How do I keep my own module/library directory?">. + +=head2 I grabbed the sources and tried to compile but gdbm/dynamic loading/malloc/linking/... failed. How do I make it work? + +Read the F<INSTALL> file, which is part of the source distribution. +It describes in detail how to cope with most idiosyncrasies that the +C<Configure> script can't work around for any given system or +architecture. + +=head2 What modules and extensions are available for Perl? What is CPAN? + +CPAN stands for Comprehensive Perl Archive Network, a multi-gigabyte +archive replicated on hundreds of machines all over the world. CPAN +contains tens of thousands of modules and extensions, source code +and documentation, designed for I<everything> from commercial +database interfaces to keyboard/screen control and running large web sites. + +You can search CPAN on L<http://metacpan.org> or +L<http://search.cpan.org/>. 
+ +The master web site for CPAN is L<http://www.cpan.org/>, +L<http://www.cpan.org/SITES.html> lists all mirrors. + +See the CPAN FAQ at L<http://www.cpan.org/misc/cpan-faq.html> for answers +to the most frequently asked questions about CPAN. + +The L<Task::Kensho> module has a list of recommended modules which +you should review as a good starting point. + +=head2 Where can I get information on Perl? + +=over 4 + +=item * L<http://www.perl.org/> + +=item * L<http://perldoc.perl.org/> + +=item * L<http://learn.perl.org/> + +=back + +The complete Perl documentation is available with the Perl distribution. +If you have Perl installed locally, you probably have the documentation +installed as well: type C<perldoc perl> in a terminal or +L<view online|http://perldoc.perl.org/perl.html>. + +(Some operating system distributions may ship the documentation in a different +package; for instance, on Debian, you need to install the C<perl-doc> package.) + +Many good books have been written about Perl--see the section later in +L<perlfaq2> for more details. + +=head2 What is perl.com? Perl Mongers? pm.org? perl.org? cpan.org? + +L<Perl.com|http://www.perl.com/> used to be part of the O'Reilly +Network, a subsidiary of O'Reilly Media. Although it retains most of +the original content from its O'Reilly Network, it is now hosted by +L<The Perl Foundation|http://www.perlfoundation.org/>. + +The Perl Foundation is an advocacy organization for the Perl language +which maintains the web site L<http://www.perl.org/> as a general +advocacy site for the Perl language. It uses the domain to provide +general support services to the Perl community, including the hosting +of mailing lists, web sites, and other services. 
There are also many +other sub-domains for special topics like learning Perl and jobs in Perl, +such as: + +=over 4 + +=item * L<http://www.perl.org/> + +=item * L<http://learn.perl.org/> + +=item * L<http://jobs.perl.org/> + +=item * L<http://lists.perl.org/> + +=back + +L<Perl Mongers|http://www.pm.org/> uses the pm.org domain for services +related to local Perl user groups, including the hosting of mailing lists +and web sites. See the L<Perl Mongers web site|http://www.pm.org/> for more +information about joining, starting, or requesting services for a +Perl user group. + +CPAN, or the Comprehensive Perl Archive Network L<http://www.cpan.org/>, +is a replicated, worldwide repository of Perl software. +See L<What is CPAN?|/"What modules and extensions are available for Perl? What is CPAN?">. + +=head2 Where can I post questions? + +There are many Perl L<mailing lists|http://lists.perl.org/> for various +topics; the L<beginners list|http://lists.perl.org/list/beginners.html> +in particular may be of use. + +Other places to ask questions are the +L<PerlMonks site|http://www.perlmonks.org/> or +L<stackoverflow|http://stackoverflow.com/questions/tagged/perl>. + +=head2 Perl Books + +There are many good L<books on Perl|http://www.perl.org/books/library.html>. + +=head2 Which magazines have Perl content? + +There's I<$foo Magazin>, a German magazine dedicated to Perl, at +( L<http://www.foo-magazin.de> ). The I<Perl-Zeitung> is another +German-language magazine for Perl beginners (see +L<http://perl-zeitung.at.tf> ). + +Several Unix/Linux-related magazines frequently include articles on Perl. + +=head2 Which Perl blogs should I read? + +L<Perl News|http://perlnews.org/> covers some of the major events in the Perl +world; L<Perl Weekly|http://perlweekly.com/> is a weekly e-mail +(and RSS feed) of hand-picked Perl articles. 
+ +L<http://blogs.perl.org/> hosts many Perl blogs, there are also +several blog aggregators: L<Perlsphere|http://perlsphere.net/> and +L<IronMan|http://ironman.enlightenedperl.org/> are two of them. + +=head2 What mailing lists are there for Perl? + +A comprehensive list of Perl-related mailing lists can be found at +L<http://lists.perl.org/> + +=head2 Where can I buy a commercial version of Perl? + +Perl already I<is> commercial software: it has a license +that you can grab and carefully read to your manager. It is distributed +in releases and comes in well-defined packages. There is a very large +and supportive user community and an extensive literature. + +If you still need commercial support +L<ActiveState|http://www.activestate.com/activeperl> offers +this. + +=head2 Where do I send bug reports? + +(contributed by brian d foy) + +First, ensure that you've found an actual bug. Second, ensure you've +found an actual bug. + +If you've found a bug with the perl interpreter or one of the modules +in the standard library (those that come with Perl), you can use the +L<perlbug> utility that comes with Perl (>= 5.004). It collects +information about your installation to include with your message, then +sends the message to the right place. + +To determine if a module came with your version of Perl, you can +install and use the L<Module::CoreList> module. It has the information +about the modules (with their versions) included with each release +of Perl. + +Every CPAN module has a bug tracker set up in RT, L<http://rt.cpan.org>. +You can submit bugs to RT either through its web interface or by +email. To email a bug report, send it to +bug-E<lt>distribution-nameE<gt>@rt.cpan.org . For example, if you +wanted to report a bug in L<Business::ISBN>, you could send a message to +bug-Business-ISBN@rt.cpan.org . + +Some modules might have special reporting requirements, such as a +Github or Google Code tracking system, so you should check the +module documentation too. 
+ +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples here are in the public +domain. You are permitted and encouraged to use this code and any +derivatives thereof in your own programs for fun or for profit as you +see fit. A simple comment in the code giving credit to the FAQ would +be courteous but is not required. diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq3.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq3.pod new file mode 100644 index 00000000000..9e9ae8d906f --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq3.pod @@ -0,0 +1,1160 @@ +=head1 NAME + +perlfaq3 - Programming Tools + +=head1 DESCRIPTION + +This section of the FAQ answers questions related to programmer tools +and programming support. + +=head2 How do I do (anything)? + +Have you looked at CPAN (see L<perlfaq2>)? The chances are that +someone has already written a module that can solve your problem. +Have you read the appropriate manpages? 
Here's a brief index: + +=over 4 + +=item Basics + +=over 4 + +=item L<perldata> - Perl data types + +=item L<perlvar> - Perl pre-defined variables + +=item L<perlsyn> - Perl syntax + +=item L<perlop> - Perl operators and precedence + +=item L<perlsub> - Perl subroutines + +=back + + +=item Execution + +=over 4 + +=item L<perlrun> - how to execute the Perl interpreter + +=item L<perldebug> - Perl debugging + +=back + + +=item Functions + +=over 4 + +=item L<perlfunc> - Perl builtin functions + +=back + +=item Objects + +=over 4 + +=item L<perlref> - Perl references and nested data structures + +=item L<perlmod> - Perl modules (packages and symbol tables) + +=item L<perlobj> - Perl objects + +=item L<perltie> - how to hide an object class in a simple variable + +=back + + +=item Data Structures + +=over 4 + +=item L<perlref> - Perl references and nested data structures + +=item L<perllol> - Manipulating arrays of arrays in Perl + +=item L<perldsc> - Perl Data Structures Cookbook + +=back + +=item Modules + +=over 4 + +=item L<perlmod> - Perl modules (packages and symbol tables) + +=item L<perlmodlib> - constructing new Perl modules and finding existing ones + +=back + + +=item Regexes + +=over 4 + +=item L<perlre> - Perl regular expressions + +=item L<perlfunc> - Perl builtin functions + +=item L<perlop> - Perl operators and precedence + +=item L<perllocale> - Perl locale handling (internationalization and localization) + +=back + + +=item Moving to perl5 + +=over 4 + +=item L<perltrap> - Perl traps for the unwary + +=item L<perl> + +=back + + +=item Linking with C + +=over 4 + +=item L<perlxstut> - Tutorial for writing XSUBs + +=item L<perlxs> - XS language reference manual + +=item L<perlcall> - Perl calling conventions from C + +=item L<perlguts> - Introduction to the Perl API + +=item L<perlembed> - how to embed perl in your C program + +=back + +=item Various + +L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> +(not a man-page but still useful, a collection of 
various essays on +Perl techniques) + +=back + +A crude table of contents for the Perl manpage set is found in L<perltoc>. + +=head2 How can I use Perl interactively? + +The typical approach uses the Perl debugger, described in the +L<perldebug(1)> manpage, on an "empty" program, like this: + + perl -de 42 + +Now just type in any legal Perl code, and it will be immediately +evaluated. You can also examine the symbol table, get stack +backtraces, check variable values, set breakpoints, and other +operations typically found in symbolic debuggers. + +You can also use L<Devel::REPL> which is an interactive shell for Perl, +commonly known as a REPL - Read, Evaluate, Print, Loop. It provides +various handy features. + +=head2 How do I find which modules are installed on my system? + +From the command line, you can use the C<cpan> command's C<-l> switch: + + $ cpan -l + +You can also use C<cpan>'s C<-a> switch to create an autobundle file +that C<CPAN.pm> understands and can use to re-install every module: + + $ cpan -a + +Inside a Perl program, you can use the L<ExtUtils::Installed> module to +show all installed distributions, although it can take awhile to do +its magic. The standard library which comes with Perl just shows up +as "Perl" (although you can get those with L<Module::CoreList>). 
+ + use ExtUtils::Installed; + + my $inst = ExtUtils::Installed->new(); + my @modules = $inst->modules(); + +If you want a list of all of the Perl module filenames, you +can use L<File::Find::Rule>: + + use File::Find::Rule; + + my @files = File::Find::Rule-> + extras({follow => 1})-> + file()-> + name( '*.pm' )-> + in( @INC ) + ; + +If you do not have that module, you can do the same thing +with L<File::Find> which is part of the standard library: + + use File::Find; + my @files; + + find( + { + wanted => sub { + push @files, $File::Find::fullname + if -f $File::Find::fullname && /\.pm$/ + }, + follow => 1, + follow_skip => 2, + }, + @INC + ); + + print join "\n", @files; + +If you simply need to check quickly to see if a module is +available, you can check for its documentation. If you can +read the documentation the module is most likely installed. +If you cannot read the documentation, the module might not +have any (in rare cases): + + $ perldoc Module::Name + +You can also try to include the module in a one-liner to see if +perl finds it: + + $ perl -MModule::Name -e1 + +(If you don't receive a "Can't locate ... in @INC" error message, then Perl +found the module name you asked for.) + +=head2 How do I debug my Perl programs? + +(contributed by brian d foy) + +Before you do anything else, you can help yourself by ensuring that +you let Perl tell you about problem areas in your code. By turning +on warnings and strictures, you can head off many problems before +they get too big. You can find out more about these in L<strict> +and L<warnings>. + + #!/usr/bin/perl + use strict; + use warnings; + +Beyond that, the simplest debugger is the C<print> function. Use it +to look at values as you run your program: + + print STDERR "The value is [$value]\n"; + +The L<Data::Dumper> module can pretty-print Perl data structures: + + use Data::Dumper qw( Dumper ); + print STDERR "The hash is " . Dumper( \%hash ) . 
"\n"; + +Perl comes with an interactive debugger, which you can start with the +C<-d> switch. It's fully explained in L<perldebug>. + +If you'd like a graphical user interface and you have L<Tk>, you can use +C<ptkdb>. It's on CPAN and available for free. + +If you need something much more sophisticated and controllable, Leon +Brocard's L<Devel::ebug> (which you can call with the C<-D> switch as C<-Debug>) +gives you the programmatic hooks into everything you need to write your +own (without too much pain and suffering). + +You can also use a commercial debugger such as Affrus (Mac OS X), Komodo +from Activestate (Windows and Mac OS X), or EPIC (most platforms). + +=head2 How do I profile my Perl programs? + +(contributed by brian d foy, updated Fri Jul 25 12:22:26 PDT 2008) + +The C<Devel> namespace has several modules which you can use to +profile your Perl programs. + +The L<Devel::NYTProf> (New York Times Profiler) does both statement +and subroutine profiling. It's available from CPAN and you also invoke +it with the C<-d> switch: + + perl -d:NYTProf some_perl.pl + +It creates a database of the profile information that you can turn into +reports. The C<nytprofhtml> command turns the data into an HTML report +similar to the L<Devel::Cover> report: + + nytprofhtml + +You might also be interested in using the L<Benchmark> to +measure and compare code snippets. + +You can read more about profiling in I<Programming Perl>, chapter 20, +or I<Mastering Perl>, chapter 5. + +L<perldebguts> documents creating a custom debugger if you need to +create a special sort of profiler. brian d foy describes the process +in I<The Perl Journal>, "Creating a Perl Debugger", +L<http://www.ddj.com/184404522> , and "Profiling in Perl" +L<http://www.ddj.com/184404580> . 
+ +Perl.com has two interesting articles on profiling: "Profiling Perl", +by Simon Cozens, L<http://www.perl.com/lpt/a/850> and "Debugging and +Profiling mod_perl Applications", by Frank Wiles, +L<http://www.perl.com/pub/a/2006/02/09/debug_mod_perl.html> . + +Randal L. Schwartz writes about profiling in "Speeding up Your Perl +Programs" for I<Unix Review>, +L<http://www.stonehenge.com/merlyn/UnixReview/col49.html> , and "Profiling +in Template Toolkit via Overriding" for I<Linux Magazine>, +L<http://www.stonehenge.com/merlyn/LinuxMag/col75.html> . + +=head2 How do I cross-reference my Perl programs? + +The L<B::Xref> module can be used to generate cross-reference reports +for Perl programs. + + perl -MO=Xref[,OPTIONS] scriptname.plx + +=head2 Is there a pretty-printer (formatter) for Perl? + +L<Perl::Tidy> comes with a perl script L<perltidy> which indents and +reformats Perl scripts to make them easier to read by trying to follow +the rules of the L<perlstyle>. If you write Perl, or spend much time reading +Perl, you will probably find it useful. + +Of course, if you simply follow the guidelines in L<perlstyle>, +you shouldn't need to reformat. The habit of formatting your code +as you write it will help prevent bugs. Your editor can and should +help you with this. The perl-mode or newer cperl-mode for emacs +can provide remarkable amounts of help with most (but not all) +code, and even less programmable editors can provide significant +assistance. Tom Christiansen and many other VI users swear by +the following settings in vi and its clones: + + set ai sw=4 + map! ^O {^M}^[O^T + +Put that in your F<.exrc> file (replacing the caret characters +with control characters) and away you go. In insert mode, ^T is +for indenting, ^D is for undenting, and ^O is for blockdenting--as +it were. A more complete example, with comments, can be found at +L<http://www.cpan.org/authors/id/TOMC/scripts/toms.exrc.gz> + +=head2 Is there an IDE or Windows Perl Editor? 
+ +Perl programs are just plain text, so any editor will do. + +If you're on Unix, you already have an IDE--Unix itself. The Unix +philosophy is the philosophy of several small tools that each do one +thing and do it well. It's like a carpenter's toolbox. + +If you want an IDE, check the following (in alphabetical order, not +order of preference): + +=over 4 + +=item Eclipse + +L<http://e-p-i-c.sf.net/> + +The Eclipse Perl Integration Project integrates Perl +editing/debugging with Eclipse. + +=item Enginsite + +L<http://www.enginsite.com/> + +Perl Editor by EngInSite is a complete integrated development +environment (IDE) for creating, testing, and debugging Perl scripts; +the tool runs on Windows 9x/NT/2000/XP or later. + +=item Komodo + +L<http://www.ActiveState.com/Products/Komodo/> + +ActiveState's cross-platform (as of October 2004, that's Windows, Linux, +and Solaris), multi-language IDE has Perl support, including a regular expression +debugger and remote debugging. + +=item Notepad++ + +L<http://notepad-plus.sourceforge.net/> + +=item Open Perl IDE + +L<http://open-perl-ide.sourceforge.net/> + +Open Perl IDE is an integrated development environment for writing +and debugging Perl scripts with ActiveState's ActivePerl distribution +under Windows 95/98/NT/2000. + +=item OptiPerl + +L<http://www.optiperl.com/> + +OptiPerl is a Windows IDE with simulated CGI environment, including +debugger and syntax-highlighting editor. + +=item Padre + +L<http://padre.perlide.org/> + +Padre is cross-platform IDE for Perl written in Perl using wxWidgets to provide +a native look and feel. It's open source under the Artistic License. It +is one of the newer Perl IDEs. + +=item PerlBuilder + +L<http://www.solutionsoft.com/perl.htm> + +PerlBuilder is an integrated development environment for Windows that +supports Perl development. + +=item visiPerl+ + +L<http://helpconsulting.net/visiperl/index.html> + +From Help Consulting, for Windows. 
+ +=item Visual Perl + +L<http://www.activestate.com/Products/Visual_Perl/> + +Visual Perl is a Visual Studio.NET plug-in from ActiveState. + +=item Zeus + +L<http://www.zeusedit.com/lookmain.html> + +Zeus for Windows is another Win32 multi-language editor/IDE +that comes with support for Perl. + +=back + +For editors: if you're on Unix you probably have vi or a vi clone +already, and possibly an emacs too, so you may not need to download +anything. In any emacs the cperl-mode (M-x cperl-mode) gives you +perhaps the best available Perl editing mode in any editor. + +If you are using Windows, you can use any editor that lets you work +with plain text, such as Notepad or WordPad. Word processors, such as +Microsoft Word or WordPerfect, typically do not work since they insert +all sorts of behind-the-scenes information, although some allow you to +save files as "Text Only". You can also download text editors designed +specifically for programming, such as Textpad ( +L<http://www.textpad.com/> ) and UltraEdit ( L<http://www.ultraedit.com/> ), +among others. + +If you are using MacOS, the same concerns apply. MacPerl (for Classic +environments) comes with a simple editor. Popular external editors are +BBEdit ( L<http://www.bbedit.com/> ) or Alpha ( +L<http://www.his.com/~jguyer/Alpha/Alpha8.html> ). MacOS X users can use +Unix editors as well. 
+ +=over 4 + +=item GNU Emacs + +L<http://www.gnu.org/software/emacs/windows/ntemacs.html> + +=item MicroEMACS + +L<http://www.microemacs.de/> + +=item XEmacs + +L<http://www.xemacs.org/Download/index.html> + +=item Jed + +L<http://space.mit.edu/~davis/jed/> + +=back + +or a vi clone such as + +=over 4 + +=item Vim + +L<http://www.vim.org/> + +=item Vile + +L<http://dickey.his.com/vile/vile.html> + +=back + +The following are Win32 multilanguage editor/IDEs that support Perl: + +=over 4 + +=item Codewright + +L<http://www.borland.com/codewright/> + +=item MultiEdit + +L<http://www.MultiEdit.com/> + +=item SlickEdit + +L<http://www.slickedit.com/> + +=item ConTEXT + +L<http://www.contexteditor.org/> + +=back + +There is also a toyedit Text widget based editor written in Perl +that is distributed with the Tk module on CPAN. The ptkdb +( L<http://ptkdb.sourceforge.net/> ) is a Perl/Tk-based debugger that +acts as a development environment of sorts. Perl Composer +( L<http://perlcomposer.sourceforge.net/> ) is an IDE for Perl/Tk +GUI creation. + +In addition to an editor/IDE you might be interested in a more +powerful shell environment for Win32. Your options include + +=over 4 + +=item Bash + +from the Cygwin package ( L<http://sources.redhat.com/cygwin/> ) + +=item Ksh + +from the MKS Toolkit ( L<http://www.mkssoftware.com/> ), or the Bourne shell of +the U/WIN environment ( L<http://www.research.att.com/sw/tools/uwin/> ) + +=item Tcsh + +L<ftp://ftp.astron.com/pub/tcsh/> , see also +L<http://www.primate.wisc.edu/software/csh-tcsh-book/> + +=item Zsh + +L<http://www.zsh.org/> + +=back + +MKS and U/WIN are commercial (U/WIN is free for educational and +research purposes), Cygwin is covered by the GNU General Public +License (but that shouldn't matter for Perl use). The Cygwin, MKS, +and U/WIN all contain (in addition to the shells) a comprehensive set +of standard Unix toolkit utilities. 
+ +If you're transferring text files between Unix and Windows using FTP +be sure to transfer them in ASCII mode so the ends of lines are +appropriately converted. + +On Mac OS the MacPerl Application comes with a simple 32k text editor +that behaves like a rudimentary IDE. In contrast to the MacPerl Application +the MPW Perl tool can make use of the MPW Shell itself as an editor (with +no 32k limit). + +=over 4 + +=item Affrus + +is a full Perl development environment with full debugger support +( L<http://www.latenightsw.com> ). + +=item Alpha + +is an editor, written and extensible in Tcl, that nonetheless has +built-in support for several popular markup and programming languages, +including Perl and HTML ( L<http://www.his.com/~jguyer/Alpha/Alpha8.html> ). + +=item BBEdit and BBEdit Lite + +are text editors for Mac OS that have a Perl sensitivity mode +( L<http://web.barebones.com/> ). + +=back + +=head2 Where can I get Perl macros for vi? + +For a complete version of Tom Christiansen's vi configuration file, +see L<http://www.cpan.org/authors/Tom_Christiansen/scripts/toms.exrc.gz> , +the standard benchmark file for vi emulators. The file runs best with nvi, +the current version of vi out of Berkeley, which incidentally can be built +with an embedded Perl interpreter--see L<http://www.cpan.org/src/misc/> . + +=head2 Where can I get perl-mode or cperl-mode for emacs? +X<emacs> + +Since Emacs version 19 patchlevel 22 or so, there have been both a +perl-mode.el and support for the Perl debugger built in. These should +come with the standard Emacs 19 distribution. + +Note that the perl-mode of emacs will have fits with C<"main'foo"> +(single quote), and mess up the indentation and highlighting. You +are probably using C<"main::foo"> in new Perl code anyway, so this +shouldn't be an issue. + +For CPerlMode, see L<http://www.emacswiki.org/cgi-bin/wiki/CPerlMode> + +=head2 How can I use curses with Perl? 
+ +The Curses module from CPAN provides a dynamically loadable object +module interface to a curses library. A small demo can be found at the +directory L<http://www.cpan.org/authors/Tom_Christiansen/scripts/rep.gz> ; +this program repeats a command and updates the screen as needed, rendering +B<rep ps axu> similar to B<top>. + +=head2 How can I write a GUI (X, Tk, Gtk, etc.) in Perl? +X<GUI> X<Tk> X<Wx> X<WxWidgets> X<Gtk> X<Gtk2> X<CamelBones> X<Qt> + +(contributed by Ben Morrow) + +There are a number of modules which let you write GUIs in Perl. Most +GUI toolkits have a perl interface: an incomplete list follows. + +=over 4 + +=item Tk + +This works under Unix and Windows, and the current version doesn't +look half as bad under Windows as it used to. Some of the gui elements +still don't 'feel' quite right, though. The interface is very natural +and 'perlish', making it easy to use in small scripts that just need a +simple gui. It hasn't been updated in a while. + +=item Wx + +This is a Perl binding for the cross-platform wxWidgets toolkit +( L<http://www.wxwidgets.org> ). It works under Unix, Win32 and Mac OS X, +using native widgets (Gtk under Unix). The interface follows the C++ +interface closely, but the documentation is a little sparse for someone +who doesn't know the library, mostly just referring you to the C++ +documentation. + +=item Gtk and Gtk2 + +These are Perl bindings for the Gtk toolkit ( L<http://www.gtk.org> ). The +interface changed significantly between versions 1 and 2 so they have +separate Perl modules. It runs under Unix, Win32 and Mac OS X (currently +it requires an X server on Mac OS, but a 'native' port is underway), and +the widgets look the same on every platform: i.e., they don't match the +native widgets. As with Wx, the Perl bindings follow the C API closely, +and the documentation requires you to read the C documentation to +understand it. + +=item Win32::GUI + +This provides access to most of the Win32 GUI widgets from Perl. 
+Obviously, it only runs under Win32, and uses native widgets. The Perl +interface doesn't really follow the C interface: it's been made more +Perlish, and the documentation is pretty good. More advanced stuff may +require familiarity with the C Win32 APIs, or reference to MSDN. + +=item CamelBones + +CamelBones ( L<http://camelbones.sourceforge.net> ) is a Perl interface to +Mac OS X's Cocoa GUI toolkit, and as such can be used to produce native +GUIs on Mac OS X. It's not on CPAN, as it requires frameworks that +CPAN.pm doesn't know how to install, but installation is via the +standard OSX package installer. The Perl API is, again, very close to +the ObjC API it's wrapping, and the documentation just tells you how to +translate from one to the other. + +=item Qt + +There is a Perl interface to TrollTech's Qt toolkit, but it does not +appear to be maintained. + +=item Athena + +Sx is an interface to the Athena widget set which comes with X, but +again it appears not to be much used nowadays. + +=back + +=head2 How can I make my Perl program run faster? + +The best way to do this is to come up with a better algorithm. This +can often make a dramatic difference. Jon Bentley's book +I<Programming Pearls> (that's not a misspelling!) has some good tips +on optimization, too. Advice on benchmarking boils down to: benchmark +and profile to make sure you're optimizing the right part, look for +better algorithms instead of microtuning your code, and when all else +fails consider just buying faster hardware. You will probably want to +read the answer to the earlier question "How do I profile my Perl +programs?" if you haven't done so already. + +A different approach is to autoload seldom-used Perl code. See the +AutoSplit and AutoLoader modules in the standard distribution for +that. Or you could locate the bottleneck and think about writing just +that part in C, the way we used to take bottlenecks in C code and +write them in assembler. 
Similar to rewriting in C, modules that have +critical sections can be written in C (for instance, the PDL module +from CPAN). + +If you're currently linking your perl executable to a shared +I<libc.so>, you can often gain a 10-25% performance benefit by +rebuilding it to link with a static libc.a instead. This will make a +bigger perl executable, but your Perl programs (and programmers) may +thank you for it. See the F<INSTALL> file in the source distribution +for more information. + +The undump program was an ancient attempt to speed up Perl programs by +storing the already-compiled form to disk. This is no longer a viable +option, as it only worked on a few architectures, and wasn't a good +solution anyway. + +=head2 How can I make my Perl program take less memory? + +When it comes to time-space tradeoffs, Perl nearly always prefers to +throw memory at a problem. Scalars in Perl use more memory than +strings in C, arrays take more than that, and hashes use even more. While +there's still a lot to be done, recent releases have been addressing +these issues. For example, as of 5.004, duplicate hash keys are +shared amongst all hashes using them, so require no reallocation. + +In some cases, using substr() or vec() to simulate arrays can be +highly beneficial. For example, an array of a thousand booleans will +take at least 20,000 bytes of space, but it can be turned into one +125-byte bit vector--a considerable memory savings. The standard +Tie::SubstrHash module can also help for certain types of data +structure. If you're working with specialist data structures +(matrices, for instance) modules that implement these in C may use +less memory than equivalent Perl modules. + +Another thing to try is learning whether your Perl was compiled with +the system malloc or with Perl's builtin malloc. Whichever one it +is, try using the other one and see whether this makes a difference. +Information about malloc is in the F<INSTALL> file in the source +distribution.
You can find out whether you are using perl's malloc by +typing C<perl -V:usemymalloc>. + +Of course, the best way to save memory is to not do anything to waste +it in the first place. Good programming practices can go a long way +toward this: + +=over 4 + +=item Don't slurp! + +Don't read an entire file into memory if you can process it line +by line. Or more concretely, use a loop like this: + + # + # Good Idea + # + while (my $line = <$file_handle>) { + # ... + } + +instead of this: + + # + # Bad Idea + # + my @data = <$file_handle>; + foreach (@data) { + # ... + } + +When the files you're processing are small, it doesn't much matter which +way you do it, but it makes a huge difference when they start getting +larger. + +=item Use map and grep selectively + +Remember that both map and grep expect a LIST argument, so doing this: + + @wanted = grep {/pattern/} <$file_handle>; + +will cause the entire file to be slurped. For large files, it's better +to loop: + + while (<$file_handle>) { + push(@wanted, $_) if /pattern/; + } + +=item Avoid unnecessary quotes and stringification + +Don't quote large strings unless absolutely necessary: + + my $copy = "$large_string"; + +makes 2 copies of $large_string (one for $copy and another for the +quotes), whereas + + my $copy = $large_string; + +only makes one copy. + +Ditto for stringifying large arrays: + + { + local $, = "\n"; + print @big_array; + } + +is much more memory-efficient than either + + print join "\n", @big_array; + +or + + { + local $" = "\n"; + print "@big_array"; + } + + +=item Pass by reference + +Pass arrays and hashes by reference, not by value. For one thing, it's +the only way to pass multiple lists or hashes (or both) in a single +call/return. It also avoids creating a copy of all the contents. This +requires some judgement, however, because any changes will be propagated +back to the original data. 
If you really want to mangle (er, modify) a +copy, you'll have to sacrifice the memory needed to make one. + +=item Tie large variables to disk + +For "big" data stores (i.e. ones that exceed available memory) consider +using one of the DB modules to store it on disk instead of in RAM. This +will incur a penalty in access time, but that's probably better than +causing your hard disk to thrash due to massive swapping. + +=back + +=head2 Is it safe to return a reference to local or lexical data? + +Yes. Perl's garbage collection system takes care of this so +everything works out right. + + sub makeone { + my @a = ( 1 .. 10 ); + return \@a; + } + + for ( 1 .. 10 ) { + push @many, makeone(); + } + + print $many[4][5], "\n"; + + print "@many\n"; + +=head2 How can I free an array or hash so my program shrinks? + +(contributed by Michael Carman) + +You usually can't. Memory allocated to lexicals (i.e. my() variables) +cannot be reclaimed or reused even if they go out of scope. It is +reserved in case the variables come back into scope. Memory allocated +to global variables can be reused (within your program) by using +undef() and/or delete(). + +On most operating systems, memory allocated to a program can never be +returned to the system. That's why long-running programs sometimes re- +exec themselves. Some operating systems (notably, systems that use +mmap(2) for allocating large chunks of memory) can reclaim memory that +is no longer used, but on such systems, perl must be configured and +compiled to use the OS's malloc, not perl's. + +In general, memory allocation and de-allocation isn't something you can +or should be worrying about much in Perl. + +See also "How can I make my Perl program take less memory?" + +=head2 How can I make my CGI script more efficient? + +Beyond the normal measures described to make general Perl programs +faster or smaller, a CGI program has additional issues. It may be run +several times per second. 
Given that each time it runs it will need +to be re-compiled and will often allocate a megabyte or more of system +memory, this can be a killer. Compiling into C B<isn't going to help +you> because the process start-up overhead is where the bottleneck is. + +There are three popular ways to avoid this overhead. One solution +involves running the Apache HTTP server (available from +L<http://www.apache.org/> ) with either of the mod_perl or mod_fastcgi +plugin modules. + +With mod_perl and the Apache::Registry module (distributed with +mod_perl), httpd will run with an embedded Perl interpreter which +pre-compiles your script and then executes it within the same address +space without forking. The Apache extension also gives Perl access to +the internal server API, so modules written in Perl can do just about +anything a module written in C can. For more on mod_perl, see +L<http://perl.apache.org/> + +With the FCGI module (from CPAN) and the mod_fastcgi +module (available from L<http://www.fastcgi.com/> ) each of your Perl +programs becomes a permanent CGI daemon process. + +Finally, L<Plack> is a Perl module and toolkit that contains PSGI middleware, +helpers and adapters to web servers, allowing you to easily deploy scripts which +can continue running, and provides flexibility with regards to which web server +you use. It can allow existing CGI scripts to enjoy this flexibility and +performance with minimal changes, or can be used along with modern Perl web +frameworks to make writing and deploying web services with Perl a breeze. + +These solutions can have far-reaching effects on your system and on the way you +write your CGI programs, so investigate them with care. + +See also +L<http://www.cpan.org/modules/by-category/15_World_Wide_Web_HTML_HTTP_CGI/> . + +=head2 How can I hide the source for my Perl program? + +Delete it. :-) Seriously, there are a number of (mostly +unsatisfactory) solutions with varying levels of "security". 
+ +First of all, however, you I<can't> take away read permission, because +the source code has to be readable in order to be compiled and +interpreted. (That doesn't mean that a CGI script's source is +readable by people on the web, though--only by people with access to +the filesystem.) So you have to leave the permissions at the socially +friendly 0755 level. + +Some people regard this as a security problem. If your program does +insecure things and relies on people not knowing how to exploit those +insecurities, it is not secure. It is often possible for someone to +determine the insecure things and exploit them without viewing the +source. Security through obscurity, the name for hiding your bugs +instead of fixing them, is little security indeed. + +You can try using encryption via source filters (Starting from Perl +5.8 the Filter::Simple and Filter::Util::Call modules are included in +the standard distribution), but any decent programmer will be able to +decrypt it. You can try using the byte code compiler and interpreter +described later in L<perlfaq3>, but the curious might still be able to +de-compile it. You can try using the native-code compiler described +later, but crackers might be able to disassemble it. These pose +varying degrees of difficulty to people wanting to get at your code, +but none can definitively conceal it (true of every language, not just +Perl). + +It is very easy to recover the source of Perl programs. You simply +feed the program to the perl interpreter and use the modules in +the B:: hierarchy. The B::Deparse module should be able to +defeat most attempts to hide source. Again, this is not +unique to Perl. + +If you're concerned about people profiting from your code, then the +bottom line is that nothing but a restrictive license will give you +legal security. License your software and pepper it with threatening +statements like "This is unpublished proprietary software of XYZ Corp. 
+Your access to it does not give you permission to use it blah blah +blah." We are not lawyers, of course, so you should see a lawyer if +you want to be sure your license's wording will stand up in court. + +=head2 How can I compile my Perl program into byte code or C? + +(contributed by brian d foy) + +In general, you can't do this. There are some things that may work +for your situation though. People usually ask this question +because they want to distribute their works without giving away +the source code, and most solutions trade disk space for convenience. +You probably won't see much of a speed increase either, since most +solutions simply bundle a Perl interpreter in the final product +(but see L<How can I make my Perl program run faster?>). + +The Perl Archive Toolkit ( L<http://par.perl.org/> ) is Perl's +analog to Java's JAR. It's freely available and on CPAN ( +L<http://search.cpan.org/dist/PAR/> ). + +There are also some commercial products that may work for you, although +you have to buy a license for them. + +The Perl Dev Kit ( L<http://www.activestate.com/Products/Perl_Dev_Kit/> ) +from ActiveState can "Turn your Perl programs into ready-to-run +executables for HP-UX, Linux, Solaris and Windows." + +Perl2Exe ( L<http://www.indigostar.com/perl2exe.htm> ) is a command line +program for converting perl scripts to executable files. It targets both +Windows and Unix platforms. + +=head2 How can I get C<#!perl> to work on [MS-DOS,NT,...]? + +For OS/2 just use + + extproc perl -S -your_switches + +as the first line in C<*.cmd> file (C<-S> due to a bug in cmd.exe's +"extproc" handling). For DOS one should first invent a corresponding +batch file and codify it in C<ALTERNATE_SHEBANG> (see the +F<dosish.h> file in the source distribution for more information). + +The Win95/NT installation, when using the ActiveState port of Perl, +will modify the Registry to associate the C<.pl> extension with the +perl interpreter. 
If you install another port, perhaps even building +your own Win95/NT Perl from the standard sources by using a Windows port +of gcc (e.g., with cygwin or mingw32), then you'll have to modify +the Registry yourself. In addition to associating C<.pl> with the +interpreter, NT people can use: C<SET PATHEXT=%PATHEXT%;.PL> to let them +run the program C<install-linux.pl> merely by typing C<install-linux>. + +Under "Classic" MacOS, a perl program will have the appropriate Creator and +Type, so that double-clicking them will invoke the MacPerl application. +Under Mac OS X, clickable apps can be made from any C<#!> script using Wil +Sanchez' DropScript utility: L<http://www.wsanchez.net/software/> . + +I<IMPORTANT!>: Whatever you do, PLEASE don't get frustrated, and just +throw the perl interpreter into your cgi-bin directory, in order to +get your programs working for a web server. This is an EXTREMELY big +security risk. Take the time to figure out how to do it correctly. + +=head2 Can I write useful Perl programs on the command line? + +Yes. Read L<perlrun> for more information. Some examples follow. +(These assume standard Unix shell quoting rules.) + + # sum first and last fields + perl -lane 'print $F[0] + $F[-1]' * + + # identify text files + perl -le 'for(@ARGV) {print if -f && -T _}' * + + # remove (most) comments from C program + perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c + + # make file a month younger than today, defeating reaper daemons + perl -e '$X=24*60*60; utime(time(),time() + 30 * $X,@ARGV)' * + + # find first unused uid + perl -le '$i++ while getpwuid($i); print $i' + + # display reasonable manpath + echo $PATH | perl -nl -072 -e ' + s![^/+]*$!man!&&-d&&!$s{$_}++&&push@m,$_;END{print"@m"}' + +OK, the last one was actually an Obfuscated Perl Contest entry. :-) + +=head2 Why don't Perl one-liners work on my DOS/Mac/VMS system? 
+ +The problem is usually that the command interpreters on those systems +have rather different ideas about quoting than the Unix shells under +which the one-liners were created. On some systems, you may have to +change single-quotes to double ones, which you must I<NOT> do on Unix +or Plan9 systems. You might also have to change a single % to a %%. + +For example: + + # Unix (including Mac OS X) + perl -e 'print "Hello world\n"' + + # DOS, etc. + perl -e "print \"Hello world\n\"" + + # Mac Classic + print "Hello world\n" + (then Run "Myscript" or Shift-Command-R) + + # MPW + perl -e 'print "Hello world\n"' + + # VMS + perl -e "print ""Hello world\n""" + +The problem is that none of these examples are reliable: they depend on the +command interpreter. Under Unix, the first two often work. Under DOS, +it's entirely possible that neither works. If 4DOS was the command shell, +you'd probably have better luck like this: + + perl -e "print <Ctrl-x>"Hello world\n<Ctrl-x>"" + +Under the Mac, it depends which environment you are using. The MacPerl +shell, or MPW, is much like Unix shells in its support for several +quoting variants, except that it makes free use of the Mac's non-ASCII +characters as control characters. + +Using qq(), q(), and qx(), instead of "double quotes", 'single +quotes', and `backticks`, may make one-liners easier to write. + +There is no general solution to all of this. It is a mess. + +[Some of this answer was contributed by Kenneth Albanowski.] + +=head2 Where can I learn about CGI or Web programming in Perl? + +For modules, get the CGI or LWP modules from CPAN. For textbooks, +see the two especially dedicated to web stuff in the question on +books. 
For problems and questions related to the web, like "Why +do I get 500 Errors" or "Why doesn't it run from the browser right +when it runs fine on the command line", see the troubleshooting +guides and references in L<perlfaq9> or in the CGI MetaFAQ: + + L<http://www.perl.org/CGI_MetaFAQ.html> + +Looking into L<Plack> and modern Perl web frameworks is highly recommended, +though; web programming in Perl has evolved a long way from the old days of +simple CGI scripts. + +=head2 Where can I learn about object-oriented Perl programming? + +A good place to start is L<perltoot>, and you can use L<perlobj>, +L<perlboot>, L<perltoot>, L<perltooc>, and L<perlbot> for reference. + +Good books on OO in Perl are "Object-Oriented Perl" +by Damian Conway from Manning Publications and "Intermediate Perl" +by Randal Schwartz, brian d foy, and Tom Phoenix from O'Reilly Media. + +=head2 Where can I learn about linking C with Perl? + +If you want to call C from Perl, start with L<perlxstut>, +moving on to L<perlxs>, L<xsubpp>, and L<perlguts>. If you want to +call Perl from C, then read L<perlembed>, L<perlcall>, and +L<perlguts>. Don't forget that you can learn a lot from looking at +how the authors of existing extension modules wrote their code and +solved their problems. + +You might not need all the power of XS. The Inline::C module lets +you put C code directly in your Perl source. It handles all the +magic to make it work. You still have to learn at least some of +the perl API but you won't have to deal with the complexity of the +XS support files. + +=head2 I've read perlembed, perlguts, etc., but I can't embed perl in my C program; what am I doing wrong? + +Download the ExtUtils::Embed kit from CPAN and run `make test'. If +the tests pass, read the pods again and again and again. If they +fail, see L<perlbug> and send a bug report with the output of +C<make test TEST_VERBOSE=1> along with C<perl -V>. + +=head2 When I tried to run my script, I got this message.
What does it mean? + +A complete list of Perl's error messages and warnings with explanatory +text can be found in L<perldiag>. You can also use the splain program +(distributed with Perl) to explain the error messages: + + perl program 2>diag.out + splain [-v] [-p] diag.out + +or change your program to explain the messages for you: + + use diagnostics; + +or + + use diagnostics -verbose; + +=head2 What's MakeMaker? + +(contributed by brian d foy) + +The L<ExtUtils::MakeMaker> module, better known simply as "MakeMaker", +turns a Perl script, typically called C<Makefile.PL>, into a Makefile. +The Unix tool C<make> uses this file to manage dependencies and actions +to process and install a Perl distribution. + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples here are in the public +domain. You are permitted and encouraged to use this code and any +derivatives thereof in your own programs for fun or for profit as you +see fit. A simple comment in the code giving credit to the FAQ would +be courteous but is not required. diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq4.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq4.pod new file mode 100644 index 00000000000..e5de15385a5 --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq4.pod @@ -0,0 +1,2679 @@ +=head1 NAME + +perlfaq4 - Data Manipulation + +=head1 DESCRIPTION + +This section of the FAQ answers questions related to manipulating +numbers, dates, strings, arrays, hashes, and miscellaneous data issues. + +=head1 Data: Numbers + +=head2 Why am I getting long decimals (eg, 19.9499999999999) instead of the numbers I should be getting (eg, 19.95)? 
+ +For the long explanation, see David Goldberg's "What Every Computer +Scientist Should Know About Floating-Point Arithmetic" +(L<http://web.cse.msu.edu/~cse320/Documents/FloatingPoint.pdf>). + +Internally, your computer represents floating-point numbers in binary. +Digital (as in powers of two) computers cannot store all numbers +exactly. Some real numbers lose precision in the process. This is a +problem with how computers store numbers and affects all computer +languages, not just Perl. + +L<perlnumber> shows the gory details of number representations and +conversions. + +To limit the number of decimal places in your numbers, you can use the +C<printf> or C<sprintf> function. See +L<perlop/"Floating-point Arithmetic"> for more details. + + printf "%.2f", 10/3; + + my $number = sprintf "%.2f", 10/3; + +=head2 Why is int() broken? + +Your C<int()> is most probably working just fine. It's the numbers that +aren't quite what you think. + +First, see the answer to "Why am I getting long decimals +(eg, 19.9499999999999) instead of the numbers I should be getting +(eg, 19.95)?". + +For example, this + + print int(0.6/0.2-2), "\n"; + +will on most computers print 0, not 1, because even such simple +numbers as 0.6 and 0.2 cannot be represented exactly by floating-point +numbers. What you think of above as 'three' is really more like +2.9999999999999995559. + +=head2 Why isn't my octal data interpreted correctly? + +(contributed by brian d foy) + +You're probably trying to convert a string to a number, which Perl only +converts as a decimal number. When Perl converts a string to a number, it +ignores leading spaces and zeroes, then assumes the rest of the digits +are in base 10: + + my $string = '0644'; + + print $string + 0; # prints 644 + + print $string + 44; # prints 688, certainly not octal! + +This problem usually involves one of the Perl built-ins that has the +same name as a Unix command that uses octal numbers as arguments on the +command line.
In this example, C<chmod> on the command line knows that +its first argument is octal because that's what it does: + + %prompt> chmod 644 file + +If you want to use the same literal digits (644) in Perl, you have to tell +Perl to treat them as octal numbers either by prefixing the digits with +a C<0> or using C<oct>: + + chmod( 0644, $filename ); # right, has leading zero + chmod( oct(644), $filename ); # also correct + +The problem comes in when you take your numbers from something that Perl +thinks is a string, such as a command line argument in C<@ARGV>: + + chmod( $ARGV[0], $filename ); # wrong, even if "0644" + + chmod( oct($ARGV[0]), $filename ); # correct, treat string as octal + +You can always check the value you're using by printing it in octal +notation to ensure it matches what you think it should be. Print it +in octal and decimal format: + + printf "0%o %d", $number, $number; + +=head2 Does Perl have a round() function? What about ceil() and floor()? Trig functions? + +Remember that C<int()> merely truncates toward 0. For rounding to a +certain number of digits, C<sprintf()> or C<printf()> is usually the +easiest route. + + printf("%.3f", 3.1415926535); # prints 3.142 + +The L<POSIX> module (part of the standard Perl distribution) +implements C<ceil()>, C<floor()>, and a number of other mathematical +and trigonometric functions. + + use POSIX; + my $ceil = ceil(3.5); # 4 + my $floor = floor(3.5); # 3 + +In 5.000 to 5.003 perls, trigonometry was done in the L<Math::Complex> +module. With 5.004, the L<Math::Trig> module (part of the standard Perl +distribution) implements the trigonometric functions. Internally it +uses the L<Math::Complex> module and some functions can break out from +the real axis into the complex plane, for example the inverse sine of +2. + +Rounding in financial applications can have serious implications, and +the rounding method used should be specified precisely. 
In these +cases, it probably pays not to trust whichever system of rounding is +being used by Perl, but instead to implement the rounding function you +need yourself. + +To see why, notice how you'll still have an issue on half-way-point +alternation: + + for (my $i = 0; $i < 1.01; $i += 0.05) { printf "%.1f ",$i} + + 0.0 0.1 0.1 0.2 0.2 0.2 0.3 0.3 0.4 0.4 0.5 0.5 0.6 0.7 0.7 + 0.8 0.8 0.9 0.9 1.0 1.0 + +Don't blame Perl. It's the same as in C. IEEE says we have to do +this. Perl numbers whose absolute values are integers under 2**31 (on +32-bit machines) will work pretty much like mathematical integers. +Other numbers are not guaranteed. + +=head2 How do I convert between numeric representations/bases/radixes? + +As always with Perl there is more than one way to do it. Below are a +few examples of approaches to making common conversions between number +representations. This is intended to be representational rather than +exhaustive. + +Some of the examples later in L<perlfaq4> use the L<Bit::Vector> +module from CPAN. The reason you might choose L<Bit::Vector> over the +perl built-in functions is that it works with numbers of ANY size, +that it is optimized for speed on some operations, and for at least +some programmers the notation might be familiar. + +=over 4 + +=item How do I convert hexadecimal into decimal + +Using perl's built in conversion of C<0x> notation: + + my $dec = 0xDEADBEEF; + +Using the C<hex> function: + + my $dec = hex("DEADBEEF"); + +Using C<pack>: + + my $dec = unpack("N", pack("H8", substr("0" x 8 . 
"DEADBEEF", -8))); + +Using the CPAN module C<Bit::Vector>: + + use Bit::Vector; + my $vec = Bit::Vector->new_Hex(32, "DEADBEEF"); + my $dec = $vec->to_Dec(); + +=item How do I convert from decimal to hexadecimal + +Using C<sprintf>: + + my $hex = sprintf("%X", 3735928559); # upper case A-F + my $hex = sprintf("%x", 3735928559); # lower case a-f + +Using C<unpack>: + + my $hex = unpack("H*", pack("N", 3735928559)); + +Using L<Bit::Vector>: + + use Bit::Vector; + my $vec = Bit::Vector->new_Dec(32, -559038737); + my $hex = $vec->to_Hex(); + +And L<Bit::Vector> supports odd bit counts: + + use Bit::Vector; + my $vec = Bit::Vector->new_Dec(33, 3735928559); + $vec->Resize(32); # suppress leading 0 if unwanted + my $hex = $vec->to_Hex(); + +=item How do I convert from octal to decimal + +Using Perl's built in conversion of numbers with leading zeros: + + my $dec = 033653337357; # note the leading 0! + +Using the C<oct> function: + + my $dec = oct("33653337357"); + +Using L<Bit::Vector>: + + use Bit::Vector; + my $vec = Bit::Vector->new(32); + $vec->Chunk_List_Store(3, split(//, reverse "33653337357")); + my $dec = $vec->to_Dec(); + +=item How do I convert from decimal to octal + +Using C<sprintf>: + + my $oct = sprintf("%o", 3735928559); + +Using L<Bit::Vector>: + + use Bit::Vector; + my $vec = Bit::Vector->new_Dec(32, -559038737); + my $oct = reverse join('', $vec->Chunk_List_Read(3)); + +=item How do I convert from binary to decimal + +Perl 5.6 lets you write binary numbers directly with +the C<0b> notation: + + my $number = 0b10110110; + +Using C<oct>: + + my $input = "10110110"; + my $decimal = oct( "0b$input" ); + +Using C<pack> and C<ord>: + + my $decimal = ord(pack('B8', '10110110')); + +Using C<pack> and C<unpack> for larger strings: + + my $int = unpack("N", pack("B32", + substr("0" x 32 . "11110101011011011111011101111", -32))); + my $dec = sprintf("%d", $int); + + # substr() is used to left-pad a 32-character string with zeros. 
+ +Using L<Bit::Vector>: + + my $vec = Bit::Vector->new_Bin(32, "11011110101011011011111011101111"); + my $dec = $vec->to_Dec(); + +=item How do I convert from decimal to binary + +Using C<sprintf> (perl 5.6+): + + my $bin = sprintf("%b", 3735928559); + +Using C<unpack>: + + my $bin = unpack("B*", pack("N", 3735928559)); + +Using L<Bit::Vector>: + + use Bit::Vector; + my $vec = Bit::Vector->new_Dec(32, -559038737); + my $bin = $vec->to_Bin(); + +The remaining transformations (e.g. hex -> oct, bin -> hex, etc.) +are left as an exercise to the inclined reader. + +=back + +=head2 Why doesn't & work the way I want it to? + +The behavior of binary arithmetic operators depends on whether they're +used on numbers or strings. The operators treat a string as a series +of bits and work with that (the string C<"3"> is the bit pattern +C<00110011>). The operators work with the binary form of a number +(the number C<3> is treated as the bit pattern C<00000011>). + +So, saying C<11 & 3> performs the "and" operation on numbers (yielding +C<3>). Saying C<"11" & "3"> performs the "and" operation on strings +(yielding C<"1">). + +Most problems with C<&> and C<|> arise because the programmer thinks +they have a number but really it's a string or vice versa. To avoid this, +stringify the arguments explicitly (using C<""> or C<qq()>) or convert them +to numbers explicitly (using C<0+$arg>). The rest arise because +the programmer says: + + if ("\020\020" & "\101\101") { + # ... + } + +but a string consisting of two null bytes (the result of C<"\020\020" +& "\101\101">) is not a false value in Perl. You need: + + if ( ("\020\020" & "\101\101") !~ /[^\000]/) { + # ... + } + +=head2 How do I multiply matrices? + +Use the L<Math::Matrix> or L<Math::MatrixReal> modules (available from CPAN) +or the L<PDL> extension (also available from CPAN). + +=head2 How do I perform an operation on a series of integers? 
+ +To call a function on each element in an array, and collect the +results, use: + + my @results = map { my_func($_) } @array; + +For example: + + my @triple = map { 3 * $_ } @single; + +To call a function on each element of an array, but ignore the +results: + + foreach my $iterator (@array) { + some_func($iterator); + } + +To call a function on each integer in a (small) range, you B<can> use: + + my @results = map { some_func($_) } (5 .. 25); + +but you should be aware that in this form, the C<..> operator +creates a list of all integers in the range, which can take a lot of +memory for large ranges. However, the problem does not occur when +using C<..> within a C<for> loop, because in that case the range +operator is optimized to I<iterate> over the range, without creating +the entire list. So + + my @results = (); + for my $i (5 .. 500_005) { + push(@results, some_func($i)); + } + +or even + + push(@results, some_func($_)) for 5 .. 500_005; + +will not create an intermediate list of 500,000 integers. + +=head2 How can I output Roman numerals? + +Get the L<http://www.cpan.org/modules/by-module/Roman> module. + +=head2 Why aren't my random numbers random? + +If you're using a version of Perl before 5.004, you must call C<srand> +once at the start of your program to seed the random number generator. + + BEGIN { srand() if $] < 5.004 } + +5.004 and later automatically call C<srand> at the beginning. Don't +call C<srand> more than once--you make your numbers less random, +rather than more. + +Computers are good at being predictable and bad at being random +(despite appearances caused by bugs in your programs :-). The +F<random> article in the "Far More Than You Ever Wanted To Know" +collection in L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz>, courtesy +of Tom Phoenix, talks more about this. John von Neumann said, "Anyone +who attempts to generate random numbers by deterministic means is, of +course, living in a state of sin." 
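As a small illustration of that predictability (a sketch only; the seed value C<42> is arbitrary, and C<srand> is deliberately called twice here just to show the effect), seeding the generator with the same value replays exactly the same sequence:

```perl
use strict;
use warnings;

# Seed explicitly, draw three numbers, then reseed with the same value.
srand(42);
my @first = map { rand() } 1 .. 3;

srand(42);
my @second = map { rand() } 1 .. 3;

# Same seed, same sequence: the generator is deterministic.
print "@first" eq "@second" ? "identical\n" : "different\n";
```

This is why reseeding inside a program tends to make the output less random rather than more.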
+ +Perl relies on the underlying system for the implementation of +C<rand> and C<srand>; on some systems, the generated numbers are +not random enough (especially on Windows : see +L<http://www.perlmonks.org/?node_id=803632>). +Several CPAN modules in the C<Math> namespace implement better +pseudorandom generators; see for example +L<Math::Random::MT> ("Mersenne Twister", fast), or +L<Math::TrulyRandom> (uses the imperfections in the system's +timer to generate random numbers, which is rather slow). +More algorithms for random numbers are described in +"Numerical Recipes in C" at L<http://www.nr.com/> + +=head2 How do I get a random number between X and Y? + +To get a random number between two values, you can use the C<rand()> +built-in to get a random number between 0 and 1. From there, you shift +that into the range that you want. + +C<rand($x)> returns a number such that C<< 0 <= rand($x) < $x >>. Thus +what you want to have perl figure out is a random number in the range +from 0 to the difference between your I<X> and I<Y>. + +That is, to get a number between 10 and 15, inclusive, you want a +random number between 0 and 5 that you can then add to 10. + + my $number = 10 + int rand( 15-10+1 ); # ( 10,11,12,13,14, or 15 ) + +Hence you derive the following simple function to abstract +that. It selects a random integer between the two given +integers (inclusive), For example: C<random_int_between(50,120)>. + + sub random_int_between { + my($min, $max) = @_; + # Assumes that the two arguments are integers themselves! + return $min if $min == $max; + ($min, $max) = ($max, $min) if $min > $max; + return $min + int rand(1 + $max - $min); + } + +=head1 Data: Dates + +=head2 How do I find the day or week of the year? + +The day of the year is in the list returned +by the C<localtime> function. Without an +argument C<localtime> uses the current time. 
+ + my $day_of_year = (localtime)[7]; + +The L<POSIX> module can also format a date as the day of the year or +week of the year. + + use POSIX qw/strftime/; + my $day_of_year = strftime "%j", localtime; + my $week_of_year = strftime "%W", localtime; + +To get the day or week of the year for any date, use L<POSIX>'s C<mktime> to get +a time in epoch seconds for the argument to C<localtime>. + + use POSIX qw/mktime strftime/; + my $week_of_year = strftime "%W", + localtime( mktime( 0, 0, 0, 18, 11, 87 ) ); + +You can also use L<Time::Piece>, which comes with Perl and provides a +C<localtime> that returns an object: + + use Time::Piece; + my $day_of_year = localtime->yday; + my $week_of_year = localtime->week; + +The L<Date::Calc> module provides two functions to calculate these, too. +It exports nothing by default, so import them by name: + + use Date::Calc qw( Day_of_Year Week_of_Year ); + my $day_of_year = Day_of_Year( 1987, 12, 18 ); + my $week_of_year = Week_of_Year( 1987, 12, 18 ); + +=head2 How do I find the current century or millennium? + +Use the following simple functions: + + sub get_century { + return int((((localtime(shift || time))[5] + 1999))/100); + } + + sub get_millennium { + return 1+int((((localtime(shift || time))[5] + 1899))/1000); + } + +On some systems, the L<POSIX> module's C<strftime()> function has been +extended in a non-standard way to use a C<%C> format, which they +sometimes claim is the "century". It isn't, because on most such +systems, this is only the first two digits of the four-digit year, and +thus cannot be used to determine reliably the current century or +millennium. + +=head2 How can I compare two dates and find the difference? + +(contributed by brian d foy) + +You could just store all your dates as a number and then subtract. +Life isn't always that simple though. + +The L<Time::Piece> module, which comes with Perl, replaces C<localtime> +with a version that returns an object.
It also overloads the comparison +operators so you can compare them directly: + + use Time::Piece; + my $date1 = localtime( $some_time ); + my $date2 = localtime( $some_other_time ); + + if( $date1 < $date2 ) { + print "The date was in the past\n"; + } + +You can also get differences with a subtraction, which returns a +L<Time::Seconds> object: + + my $diff = $date1 - $date2; + print "The difference is ", $diff->days, " days\n"; + +If you want to work with formatted dates, the L<Date::Manip>, +L<Date::Calc>, or L<DateTime> modules can help you. + +=head2 How can I take a string and turn it into epoch seconds? + +If it's a regular enough string that it always has the same format, +you can split it up and pass the parts to C<timelocal> in the standard +L<Time::Local> module. Otherwise, you should look into the L<Date::Calc>, +L<Date::Parse>, and L<Date::Manip> modules from CPAN. + +=head2 How can I find the Julian Day? + +(contributed by brian d foy and Dave Cross) + +You can use the L<Time::Piece> module, part of the Standard Library, +which can convert a date/time to a Julian Day: + + $ perl -MTime::Piece -le 'print localtime->julian_day' + 2455607.7959375 + +Or the modified Julian Day: + + $ perl -MTime::Piece -le 'print localtime->mjd' + 55607.2961226851 + +Or even the day of the year (which is what some people think of as a +Julian day): + + $ perl -MTime::Piece -le 'print localtime->yday' + 45 + +You can also do the same things with the L<DateTime> module: + + $ perl -MDateTime -le'print DateTime->today->jd' + 2453401.5 + $ perl -MDateTime -le'print DateTime->today->mjd' + 53401 + $ perl -MDateTime -le'print DateTime->today->doy' + 31 + +You can use the L<Time::JulianDay> module available on CPAN.
Ensure +that you really want to find a Julian day, though, as many people have +different ideas about Julian days (see L<http://www.hermetic.ch/cal_stud/jdn.htm> +for instance): + + $ perl -MTime::JulianDay -le 'print local_julian_day( time )' + 55608 + +=head2 How do I find yesterday's date? +X<date> X<yesterday> X<DateTime> X<Date::Calc> X<Time::Local> +X<daylight saving time> X<day> X<Today_and_Now> X<localtime> +X<timelocal> + +(contributed by brian d foy) + +To do it correctly, you can use one of the C<Date> modules since they +work with calendars instead of times. The L<DateTime> module makes it +simple, and gives you the same time of day, only the day before, +despite daylight saving time changes: + + use DateTime; + + my $yesterday = DateTime->now->subtract( days => 1 ); + + print "Yesterday was $yesterday\n"; + +You can also use the L<Date::Calc> module using its C<Today_and_Now> +function. + + use Date::Calc qw( Today_and_Now Add_Delta_DHMS ); + + my @date_time = Add_Delta_DHMS( Today_and_Now(), -1, 0, 0, 0 ); + + print "@date_time\n"; + +Most people try to use the time rather than the calendar to figure out +dates, but that assumes that days are twenty-four hours each. For +most people, there are two days a year when they aren't: the switch to +and from summer time throws this off. The remaining suggestions, for +example, will sometimes be wrong: + +Starting with Perl 5.10, L<Time::Piece> and L<Time::Seconds> are part +of the standard distribution, so you might think that you could do +something like this: + + use Time::Piece; + use Time::Seconds; + + my $yesterday = localtime() - ONE_DAY; # WRONG + print "Yesterday was $yesterday\n"; + +The L<Time::Piece> module exports a new C<localtime> that returns an +object, and L<Time::Seconds> exports the C<ONE_DAY> constant that is a +set number of seconds. This means that it always gives the time 24 +hours ago, which is not always yesterday.
This can cause problems +around the end of daylight saving time when there's one day that is 25 +hours long. + +You have the same problem with L<Time::Local>, which will give the wrong +answer for those same special cases: + + # contributed by Gunnar Hjalmarsson + use Time::Local; + my $today = timelocal 0, 0, 12, ( localtime )[3..5]; + my ($d, $m, $y) = ( localtime $today-86400 )[3..5]; # WRONG + printf "Yesterday: %d-%02d-%02d\n", $y+1900, $m+1, $d; + +=head2 Does Perl have a Year 2000 or 2038 problem? Is Perl Y2K compliant? + +(contributed by brian d foy) + +Perl itself never had a Y2K problem, although that never stopped people +from creating Y2K problems on their own. See the documentation for +C<localtime> for its proper use. + +Starting with Perl 5.12, C<localtime> and C<gmtime> can handle dates past +03:14:08 January 19, 2038, when a 32-bit based time would overflow. You +still might get a warning on a 32-bit C<perl>: + + % perl5.12 -E 'say scalar localtime( 0x9FFF_FFFFFFFF )' + Integer overflow in hexadecimal number at -e line 1. + Wed Nov 1 19:42:39 5576711 + +On a 64-bit C<perl>, you can get even larger dates for those really long +running projects: + + % perl5.12 -E 'say scalar gmtime( 0x9FFF_FFFFFFFF )' + Thu Nov 2 00:42:39 5576711 + +You're still out of luck if you need to keep track of decaying protons +though. + +=head1 Data: Strings + +=head2 How do I validate input? + +(contributed by brian d foy) + +There are many ways to ensure that values are what you expect or +want to accept. Besides the specific examples that we cover in the +perlfaq, you can also look at the modules with "Assert" and "Validate" +in their names, along with other modules such as L<Regexp::Common>. + +Some modules have validation for particular types of input, such +as L<Business::ISBN>, L<Business::CreditCard>, L<Email::Valid>, +and L<Data::Validate::IP>. + +=head2 How do I unescape a string? + +It depends just what you mean by "escape". 
URL escapes are dealt +with in L<perlfaq9>. Shell escapes with the backslash (C<\>) +character are removed with + + s/\\(.)/$1/g; + +This won't expand C<"\n"> or C<"\t"> or any other special escapes. + +=head2 How do I remove consecutive pairs of characters? + +(contributed by brian d foy) + +You can use the substitution operator to find pairs of characters (or +runs of characters) and replace them with a single instance. In this +substitution, we find a character in C<(.)>. The memory parentheses +store the matched character in the back-reference C<\g1> and we use +that to require that the same thing immediately follow it. We replace +that part of the string with the character in C<$1>. + + s/(.)\g1/$1/g; + +We can also use the transliteration operator, C<tr///>. In this +example, the search list side of our C<tr///> contains nothing, but +the C<c> option complements that so it contains everything. The +replacement list also contains nothing, so the transliteration is +almost a no-op since it won't do any replacements (or more exactly, +replace the character with itself). However, the C<s> option squashes +duplicated and consecutive characters in the string so a character +does not show up next to itself + + my $str = 'Haarlem'; # in the Netherlands + $str =~ tr///cs; # Now Harlem, like in New York + +=head2 How do I expand function calls in a string? + +(contributed by brian d foy) + +This is documented in L<perlref>, and although it's not the easiest +thing to read, it does work. In each of these examples, we call the +function inside the braces used to dereference a reference. If we +have more than one return value, we can construct and dereference an +anonymous array. In this case, we call the function in list context. + + print "The time values are @{ [localtime] }.\n"; + +If we want to call the function in scalar context, we have to do a bit +more work. 
We can really have any code we like inside the braces, so +we simply have to end with the scalar reference, although how you do +that is up to you, and you can use code inside the braces. Note that +the use of parens creates a list context, so we need C<scalar> to +force the scalar context on the function: + + print "The time is ${\(scalar localtime)}.\n"; + + print "The time is ${ my $x = localtime; \$x }.\n"; + +If your function already returns a reference, you don't need to create +the reference yourself. + + sub timestamp { my $t = localtime; \$t } + + print "The time is ${ timestamp() }.\n"; + +The C<Interpolation> module can also do a lot of magic for you. You can +specify a variable name, in this case C<E>, to set up a tied hash that +does the interpolation for you. It has several other methods to do this +as well. + + use Interpolation E => 'eval'; + print "The time values are $E{localtime()}.\n"; + +In most cases, it is probably easier to simply use string concatenation, +which also forces scalar context. + + print "The time is " . localtime() . ".\n"; + +=head2 How do I find matching/nesting anything? + +To find something between two single +characters, a pattern like C</x([^x]*)x/> will get the intervening +bits in C<$1>. For multi-character delimiters, something more like +C</alpha(.*?)omega/> would be needed. For nested patterns +and/or balanced expressions, see the so-called +L<< (?PARNO)|perlre/C<(?PARNO)> C<(?-PARNO)> C<(?+PARNO)> C<(?R)> C<(?0)> >> +construct (available since perl 5.10). +The CPAN module L<Regexp::Common> can help to build such +regular expressions (see in particular +L<Regexp::Common::balanced> and L<Regexp::Common::delimited>). + +More complex cases will require writing a parser, probably +using a parsing module from CPAN, like +L<Regexp::Grammars>, L<Parse::RecDescent>, L<Parse::Yapp>, +L<Text::Balanced>, or L<Marpa::XS>. + +=head2 How do I reverse a string? + +Use C<reverse()> in scalar context, as documented in +L<perlfunc/reverse>.
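Context matters here. A minimal sketch (the sample string and variable names are illustrative only) of how C<reverse> behaves in list context versus scalar context:

```perl
use strict;
use warnings;

my $string = "Perl";

# List context: reverse() reverses the list of its arguments, and a
# one-element list reversed is unchanged.
my @list = reverse $string;

# Scalar context: reverse() reverses the characters of the string.
my $scalar = reverse $string;

print "list context:   $list[0]\n";   # Perl
print "scalar context: $scalar\n";    # lreP
```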
+ + my $reversed = reverse $string; + +=head2 How do I expand tabs in a string? + +You can do it yourself: + + 1 while $string =~ s/\t+/' ' x (length($&) * 8 - length($`) % 8)/e; + +Or you can just use the L<Text::Tabs> module (part of the standard Perl +distribution). + + use Text::Tabs; + my @expanded_lines = expand(@lines_with_tabs); + +=head2 How do I reformat a paragraph? + +Use L<Text::Wrap> (part of the standard Perl distribution): + + use Text::Wrap; + print wrap("\t", ' ', @paragraphs); + +The paragraphs you give to L<Text::Wrap> should not contain embedded +newlines. L<Text::Wrap> doesn't justify the lines (flush-right). + +Or use the CPAN module L<Text::Autoformat>. Formatting files can be +easily done by making a shell alias, like so: + + alias fmt="perl -i -MText::Autoformat -n0777 \ + -e 'print autoformat $_, {all=>1}' $*" + +See the documentation for L<Text::Autoformat> to appreciate its many +capabilities. + +=head2 How can I access or change N characters of a string? + +You can access the first characters of a string with substr(). +To get the first character, for example, start at position 0 +and grab the string of length 1. + + + my $string = "Just another Perl Hacker"; + my $first_char = substr( $string, 0, 1 ); # 'J' + +To change part of a string, you can use the optional fourth +argument which is the replacement string. + + substr( $string, 13, 4, "Perl 5.8.0" ); + +You can also use substr() as an lvalue. + + substr( $string, 13, 4 ) = "Perl 5.8.0"; + +=head2 How do I change the Nth occurrence of something? + +You have to keep track of N yourself. For example, let's say you want +to change the fifth occurrence of C<"whoever"> or C<"whomever"> into +C<"whosoever"> or C<"whomsoever">, case insensitively. These +all assume that $_ contains the string to be altered. + + $count = 0; + s{((whom?)ever)}{ + ++$count == 5 # is it the 5th? + ? 
"${2}soever" # yes, swap + : $1 # renege and leave it there + }ige; + +In the more general case, you can use the C</g> modifier in a C<while> +loop, keeping count of matches. + + $WANT = 3; + $count = 0; + $_ = "One fish two fish red fish blue fish"; + while (/(\w+)\s+fish\b/gi) { + if (++$count == $WANT) { + print "The third fish is a $1 one.\n"; + } + } + +That prints out: C<"The third fish is a red one."> You can also use a +repetition count and repeated pattern like this: + + /(?:\w+\s+fish\s+){2}(\w+)\s+fish/i; + +=head2 How can I count the number of occurrences of a substring within a string? + +There are a number of ways, with varying efficiency. If you want a +count of a certain single character (X) within a string, you can use the +C<tr///> function like so: + + my $string = "ThisXlineXhasXsomeXx'sXinXit"; + my $count = ($string =~ tr/X//); + print "There are $count X characters in the string"; + +This is fine if you are just looking for a single character. However, +if you are trying to count multiple character substrings within a +larger string, C<tr///> won't work. What you can do is wrap a while() +loop around a global pattern match. For example, let's count negative +integers: + + my $string = "-9 55 48 -2 23 -76 4 14 -44"; + my $count = 0; + while ($string =~ /-\d+/g) { $count++ } + print "There are $count negative numbers in the string"; + +Another version uses a global match in list context, then assigns the +result to a scalar, producing a count of the number of matches. + + my $count = () = $string =~ /-\d+/g; + +=head2 How do I capitalize all the words on one line? +X<Text::Autoformat> X<capitalize> X<case, title> X<case, sentence> + +(contributed by brian d foy) + +Damian Conway's L<Text::Autoformat> handles all of the thinking +for you. + + use Text::Autoformat; + my $x = "Dr. Strangelove or: How I Learned to Stop ". 
+ "Worrying and Love the Bomb"; + + print $x, "\n"; + for my $style (qw( sentence title highlight )) { + print autoformat($x, { case => $style }), "\n"; + } + +How do you want to capitalize those words? + + FRED AND BARNEY'S LODGE # all uppercase + Fred And Barney's Lodge # title case + Fred and Barney's Lodge # highlight case + +It's not as easy a problem as it looks. How many words do you think +are in there? Wait for it... wait for it.... If you answered 5 +you're right. Perl words are groups of C<\w+>, but that's not what +you want to capitalize. How is Perl supposed to know not to capitalize +that C<s> after the apostrophe? You could try a regular expression: + + $string =~ s/ ( + (^\w) #at the beginning of the line + | # or + (\s\w) #preceded by whitespace + ) + /\U$1/xg; + + $string =~ s/([\w']+)/\u\L$1/g; + +Now, what if you don't want to capitalize that "and"? Just use +L<Text::Autoformat> and get on with the next problem. :) + +=head2 How can I split a [character]-delimited string except when inside [character]? + +Several modules can handle this sort of parsing--L<Text::Balanced>, +L<Text::CSV>, L<Text::CSV_XS>, and L<Text::ParseWords>, among others. + +Take the example case of trying to split a string that is +comma-separated into its different fields. You can't use C<split(/,/)> +because you shouldn't split if the comma is inside quotes. For +example, take a data line like this: + + SAR001,"","Cimetrix, Inc","Bob Smith","CAM",N,8,1,0,7,"Error, Core Dumped" + +Due to the restriction of the quotes, this is a fairly complex +problem. Thankfully, we have Jeffrey Friedl, author of +I<Mastering Regular Expressions>, to handle these for us. He +suggests (assuming your string is contained in C<$text>): + + my @new = (); + push(@new, $+) while $text =~ m{ + "([^\"\\]*(?:\\.[^\"\\]*)*)",? # groups the phrase inside the quotes + | ([^,]+),? 
+ | , + }gx; + push(@new, undef) if substr($text,-1,1) eq ','; + +If you want to represent quotation marks inside a +quotation-mark-delimited field, escape them with backslashes (e.g., +C<"like \"this\"">). + +Alternatively, the L<Text::ParseWords> module (part of the standard +Perl distribution) lets you say: + + use Text::ParseWords; + @new = quotewords(",", 0, $text); + +For parsing or generating CSV, though, using L<Text::CSV> rather than +implementing it yourself is highly recommended; you'll save yourself odd bugs +popping up later by just using code which has already been tried and tested in +production for years. + +=head2 How do I strip blank space from the beginning/end of a string? + +(contributed by brian d foy) + +A substitution can do this for you. For a single line, you want to +replace all the leading or trailing whitespace with nothing. You +can do that with a pair of substitutions: + + s/^\s+//; + s/\s+$//; + +You can also write that as a single substitution, although it turns +out the combined statement is slower than the separate ones. That +might not matter to you, though: + + s/^\s+|\s+$//g; + +In this regular expression, the alternation matches either at the +beginning or the end of the string since the alternation has a lower +precedence than the anchors. With the C</g> flag, the substitution +makes all possible matches, so it gets both. Remember, the trailing +newline matches the C<\s+>, and the C<$> anchor can match to the +absolute end of the string, so the newline disappears too. Just add +the newline to the output, which has the added benefit of preserving +"blank" (consisting entirely of whitespace) lines which the C<^\s+> +would remove all by itself: + + while( <> ) { + s/^\s+|\s+$//g; + print "$_\n"; + } + +For a multi-line string, you can apply the regular expression to each +logical line in the string by adding the C</m> flag (for +"multi-line").
With the C</m> flag, the C<$> matches I<before> an +embedded newline, so it doesn't remove it. This pattern still removes +the newline at the end of the string: + + $string =~ s/^\s+|\s+$//gm; + +Remember that lines consisting entirely of whitespace will disappear, +since the first part of the alternation can match the entire string +and replace it with nothing. If you need to keep embedded blank lines, +you have to do a little more work. Instead of matching any whitespace +(since that includes a newline), just match the other whitespace: + + $string =~ s/^[\t\f ]+|[\t\f ]+$//mg; + +=head2 How do I pad a string with blanks or pad a number with zeroes? + +In the following examples, C<$pad_len> is the length to which you wish +to pad the string, C<$text> or C<$num> contains the string to be padded, +and C<$pad_char> contains the padding character. You can use a single +character string constant instead of the C<$pad_char> variable if you +know what it is in advance. And in the same way you can use an integer in +place of C<$pad_len> if you know the pad length in advance. + +The simplest method uses the C<sprintf> function. It can pad on the left +or right with blanks and on the left with zeroes and it will not +truncate the result. The C<pack> function can only pad strings on the +right with blanks and it will truncate the result to a maximum length of +C<$pad_len>. 
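To make the truncation difference concrete, here is a short sketch (the sample values are made up for illustration) that pads a string longer than the requested width with both functions:

```perl
use strict;
use warnings;

my $text    = "abcdefgh";   # 8 characters, longer than the pad length
my $pad_len = 5;

# sprintf never truncates: the field width is only a minimum.
my $with_sprintf = sprintf("%-${pad_len}s", $text);

# pack with the A format pads with spaces but also cuts to $pad_len.
my $with_pack = pack("A$pad_len", $text);

print "[$with_sprintf]\n";   # [abcdefgh]
print "[$with_pack]\n";      # [abcde]
```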
+ + # Left padding a string with blanks (no truncation): + my $padded = sprintf("%${pad_len}s", $text); + my $padded = sprintf("%*s", $pad_len, $text); # same thing + + # Right padding a string with blanks (no truncation): + my $padded = sprintf("%-${pad_len}s", $text); + my $padded = sprintf("%-*s", $pad_len, $text); # same thing + + # Left padding a number with 0 (no truncation): + my $padded = sprintf("%0${pad_len}d", $num); + my $padded = sprintf("%0*d", $pad_len, $num); # same thing + + # Right padding a string with blanks using pack (will truncate): + my $padded = pack("A$pad_len",$text); + +If you need to pad with a character other than blank or zero you can use +one of the following methods. They all generate a pad string with the +C<x> operator and combine that with C<$text>. These methods do +not truncate C<$text>. + +Left and right padding with any character, creating a new string: + + my $padded = $pad_char x ( $pad_len - length( $text ) ) . $text; + my $padded = $text . $pad_char x ( $pad_len - length( $text ) ); + +Left and right padding with any character, modifying C<$text> directly: + + substr( $text, 0, 0 ) = $pad_char x ( $pad_len - length( $text ) ); + $text .= $pad_char x ( $pad_len - length( $text ) ); + +=head2 How do I extract selected columns from a string? + +(contributed by brian d foy) + +If you know the columns that contain the data, you can +use C<substr> to extract a single column. + + my $column = substr( $line, $start_column, $length ); + +You can use C<split> if the columns are separated by whitespace or +some other delimiter, as long as whitespace or the delimiter cannot +appear as part of the data. 
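One detail worth knowing when splitting on whitespace: the special pattern C<' '> (a string holding a single space) skips leading whitespace, whereas C</\s+/> produces a leading empty field. A small sketch, with a made-up input line:

```perl
use strict;
use warnings;

my $line = '  fred barney  betty ';

# A regex split keeps the empty field before the leading whitespace.
my @with_regex = split /\s+/, $line;   # ( '', 'fred', 'barney', 'betty' )

# The special ' ' pattern discards leading whitespace first.
my @with_space = split ' ', $line;     # ( 'fred', 'barney', 'betty' )

print scalar(@with_regex), " fields vs ", scalar(@with_space), " fields\n";
```

In both cases trailing empty fields are dropped.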
+ + my $line = ' fred barney betty '; + my @columns = split /\s+/, $line; + # ( '', 'fred', 'barney', 'betty' ); + + my $line = 'fred||barney||betty'; + my @columns = split /\|/, $line; + # ( 'fred', '', 'barney', '', 'betty' ); + +If you want to work with comma-separated values, don't do this since +that format is a bit more complicated. Use one of the modules that +handle that format, such as L<Text::CSV>, L<Text::CSV_XS>, or +L<Text::CSV_PP>. + +If you want to break apart an entire line of fixed columns, you can use +C<unpack> with the A (ASCII) format. By using a number after the format +specifier, you can denote the column width. See the C<pack> and C<unpack> +entries in L<perlfunc> for more details. + + my @fields = unpack( "A8 A8 A8 A16 A4", $line ); + +Note that spaces in the format argument to C<unpack> do not denote literal +spaces. If you have space separated data, you may want C<split> instead. + +=head2 How do I find the soundex value of a string? + +(contributed by brian d foy) + +You can use the C<Text::Soundex> module. If you want to do fuzzy or close +matching, you might also try the L<String::Approx>, +L<Text::Metaphone>, and L<Text::DoubleMetaphone> modules. + +=head2 How can I expand variables in text strings? + +(contributed by brian d foy) + +If you can avoid it, don't, or if you can use a templating system, +such as L<Text::Template> or L<Template> Toolkit, do that instead. You +might even be able to get the job done with C<sprintf> or C<printf>: + + my $string = sprintf 'Say hello to %s and %s', $foo, $bar; + +However, for the one-off simple case where I don't want to pull out a +full templating system, I'll use a string that has two Perl scalar +variables in it. In this example, I want to expand C<$foo> and C<$bar> +to their variable's values: + + my $foo = 'Fred'; + my $bar = 'Barney'; + $string = 'Say hello to $foo and $bar'; + +One way I can do this involves the substitution operator and a double +C</e> flag.
The first C</e> evaluates C<$1> on the replacement side and +turns it into C<$foo>. The second /e starts with C<$foo> and replaces +it with its value. C<$foo>, then, turns into 'Fred', and that's finally +what's left in the string: + + $string =~ s/(\$\w+)/$1/eeg; # 'Say hello to Fred and Barney' + +The C</e> will also silently ignore violations of strict, replacing +undefined variable names with the empty string. Since I'm using the +C</e> flag (twice even!), I have all of the same security problems I +have with C<eval> in its string form. If there's something odd in +C<$foo>, perhaps something like C<@{[ system "rm -rf /" ]}>, then +I could get myself in trouble. + +To get around the security problem, I could also pull the values from +a hash instead of evaluating variable names. Using a single C</e>, I +can check the hash to ensure the value exists, and if it doesn't, I +can replace the missing value with a marker, in this case C<???> to +signal that I missed something: + + my $string = 'This has $foo and $bar'; + + my %Replacements = ( + foo => 'Fred', + ); + + # $string =~ s/\$(\w+)/$Replacements{$1}/g; + $string =~ s/\$(\w+)/ + exists $Replacements{$1} ? $Replacements{$1} : '???' + /eg; + + print $string; + +=head2 What's wrong with always quoting "$vars"? + +The problem is that those double-quotes force +stringification--coercing numbers and references into strings--even +when you don't want them to be strings. Think of it this way: +double-quote expansion is used to produce new strings. If you already +have a string, why do you need more? + +If you get used to writing odd things like these: + + print "$var"; # BAD + my $new = "$old"; # BAD + somefunc("$var"); # BAD + +You'll be in trouble. 
Those should (in 99.8% of the cases) be +the simpler and more direct: + + print $var; + my $new = $old; + somefunc($var); + +Otherwise, besides slowing you down, you're going to break code when +the thing in the scalar is actually neither a string nor a number, but +a reference: + + func(\@array); + sub func { + my $aref = shift; + my $oref = "$aref"; # WRONG + } + +You can also get into subtle problems on those few operations in Perl +that actually do care about the difference between a string and a +number, such as the magical C<++> autoincrement operator or the +syscall() function. + +Stringification also destroys arrays. + + my @lines = `command`; + print "@lines"; # WRONG - extra blanks + print @lines; # right + +=head2 Why don't my E<lt>E<lt>HERE documents work? + +Here documents are documented in L<perlop>. Check for these four things: + +=over 4 + +=item There must be no space after the E<lt>E<lt> part. + +=item There (probably) should be a semicolon at the end of the opening token + +=item You can't (easily) have any space in front of the tag. + +=item There needs to be at least a line separator after the end token. + +=back + +If you want to indent the text in the here document, you +can do this: + + # all in one + (my $VAR = <<HERE_TARGET) =~ s/^\s+//gm; + your text + goes here + HERE_TARGET + +But the HERE_TARGET must still be flush against the margin. +If you want that indented also, you'll have to quote +in the indentation. + + (my $quote = <<' FINIS') =~ s/^\s+//gm; + ...we will have peace, when you and all your works have + perished--and the works of your dark master to whom you + would deliver us. You are a liar, Saruman, and a corrupter + of men's hearts. --Theoden in /usr/src/perl/taint.c + FINIS + $quote =~ s/\s+--/\n--/; + +A nice general-purpose fixer-upper function for indented here documents +follows. It expects to be called with a here document as its argument.
+It looks to see whether each line begins with a common substring, and +if so, strips that substring off. Otherwise, it takes the amount of leading +whitespace found on the first line and removes that much off each +subsequent line. + + sub fix { + local $_ = shift; + my ($white, $leader); # common whitespace and common leading string + if (/^\s*(?:([^\w\s]+)(\s*).*\n)(?:\s*\g1\g2?.*\n)+$/) { + ($white, $leader) = ($2, quotemeta($1)); + } else { + ($white, $leader) = (/^(\s+)/, ''); + } + s/^\s*?$leader(?:$white)?//gm; + return $_; + } + +This works with leading special strings, dynamically determined: + + my $remember_the_main = fix<<' MAIN_INTERPRETER_LOOP'; + @@@ int + @@@ runops() { + @@@ SAVEI32(runlevel); + @@@ runlevel++; + @@@ while ( op = (*op->op_ppaddr)() ); + @@@ TAINT_NOT; + @@@ return 0; + @@@ } + MAIN_INTERPRETER_LOOP + +Or with a fixed amount of leading whitespace, with remaining +indentation correctly preserved: + + my $poem = fix<<EVER_ON_AND_ON; + Now far ahead the Road has gone, + And I must follow, if I can, + Pursuing it with eager feet, + Until it joins some larger way + Where many paths and errands meet. + And whither then? I cannot say. + --Bilbo in /usr/src/perl/pp_ctl.c + EVER_ON_AND_ON + +=head1 Data: Arrays + +=head2 What is the difference between a list and an array? + +(contributed by brian d foy) + +A list is a fixed collection of scalars. An array is a variable that +holds a variable collection of scalars. An array can supply its collection +for list operations, so list operations also work on arrays: + + # slices + ( 'dog', 'cat', 'bird' )[2,3]; + @animals[2,3]; + + # iteration + foreach ( qw( dog cat bird ) ) { ... } + foreach ( @animals ) { ... 
} + + my @three = grep { length == 3 } qw( dog cat bird ); + my @three = grep { length == 3 } @animals; + + # supply an argument list + wash_animals( qw( dog cat bird ) ); + wash_animals( @animals ); + +Array operations, which change the scalars, rearrange them, or add +or remove scalars, only work on arrays. These can't work on a +list, which is fixed. Array operations include C<shift>, C<unshift>, +C<push>, C<pop>, and C<splice>. + +An array can also change its length: + + $#animals = 1; # truncate to two elements + $#animals = 10000; # pre-extend to 10,001 elements + +You can change an array element, but you can't change a list element: + + $animals[0] = 'Rottweiler'; + qw( dog cat bird )[0] = 'Rottweiler'; # syntax error! + + foreach ( @animals ) { + s/^d/fr/; # works fine + } + + foreach ( qw( dog cat bird ) ) { + s/^d/fr/; # Error! Modification of read only value! + } + +If the list element is itself a variable, it appears that you +can change a list element. However, the list element is the variable, not +the data. You're not changing the list element, but something the list +element refers to. The list element itself doesn't change: it's still +the same variable. + +You also have to be careful about context. You can assign an array to +a scalar to get the number of elements in the array. This only works +for arrays, though: + + my $count = @animals; # only works with arrays + +If you try to do the same thing with what you think is a list, you +get a quite different result. Although it looks like you have a list +on the righthand side, Perl actually sees a bunch of scalars separated +by a comma: + + my $scalar = ( 'dog', 'cat', 'bird' ); # $scalar gets bird + +Since you're assigning to a scalar, the righthand side is in scalar +context. The comma operator (yes, it's an operator!) in scalar +context evaluates its lefthand side, throws away the result, and +evaluates its righthand side and returns the result.
In effect, +that list-lookalike assigns to C<$scalar> its rightmost value. Many +people mess this up because they choose a list-lookalike whose +last element is also the count they expect: + + my $scalar = ( 1, 2, 3 ); # $scalar gets 3, accidentally + +=head2 What is the difference between $array[1] and @array[1]? + +(contributed by brian d foy) + +The difference is the sigil, that special character in front of the +array name. The C<$> sigil means "exactly one item", while the C<@> +sigil means "zero or more items". The C<$> gets you a single scalar, +while the C<@> gets you a list. + +The confusion arises because people incorrectly assume that the sigil +denotes the variable type. + +The C<$array[1]> is a single-element access to the array. It's going +to return the item in index 1 (or undef if there is no item there). +If you intend to get exactly one element from the array, this is the +form you should use. + +The C<@array[1]> is an array slice, although it has only one index. +You can pull out multiple elements simultaneously by specifying +additional indices as a list, like C<@array[1,4,3,0]>. + +Using a slice on the lefthand side of the assignment supplies list +context to the righthand side. This can lead to unexpected results. +For instance, if you want to read a single line from a filehandle, +assigning to a scalar value is fine: + + $array[1] = <STDIN>; + +However, in list context, the line input operator returns all of the +lines as a list. The first line goes into C<@array[1]> and the rest +of the lines mysteriously disappear: + + @array[1] = <STDIN>; # most likely not what you want + +Either the C<use warnings> pragma or the B<-w> flag will warn you when +you use an array slice with a single index. + +=head2 How can I remove duplicate elements from a list or array? + +(contributed by brian d foy) + +Use a hash. When you think the words "unique" or "duplicated", think +"hash keys".
+ +If you don't care about the order of the elements, you could just +create the hash then extract the keys. It's not important how you +create that hash: just that you use C<keys> to get the unique +elements. + + my %hash = map { $_, 1 } @array; + # or a hash slice: @hash{ @array } = (); + # or a foreach: $hash{$_} = 1 foreach ( @array ); + + my @unique = keys %hash; + +If you want to use a module, try the C<uniq> function from +L<List::MoreUtils>. In list context it returns the unique elements, +preserving their order in the list. In scalar context, it returns the +number of unique elements. + + use List::MoreUtils qw(uniq); + + my @unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 1,2,3,4,5,6,7 + my $unique = uniq( 1, 2, 3, 4, 4, 5, 6, 5, 7 ); # 7 + +You can also go through each element and skip the ones you've seen +before. Use a hash to keep track. The first time the loop sees an +element, that element has no key in C<%Seen>. The C<next> statement +creates the key and immediately uses its value, which is C<undef>, so +the loop continues to the C<push> and increments the value for that +key. The next time the loop sees that same element, its key exists in +the hash I<and> the value for that key is true (since it's not 0 or +C<undef>), so the next skips that iteration and the loop goes to the +next element. + + my @unique = (); + my %seen = (); + + foreach my $elem ( @array ) { + next if $seen{ $elem }++; + push @unique, $elem; + } + +You can write this more briefly using a grep, which does the +same thing. + + my %seen = (); + my @unique = grep { ! $seen{ $_ }++ } @array; + +=head2 How can I tell whether a certain element is contained in a list or array? + +(portions of this answer contributed by Anno Siegel and brian d foy) + +Hearing the word "in" is an I<in>dication that you probably should have +used a hash, not a list or array, to store your data. Hashes are +designed to answer this question quickly and efficiently. Arrays aren't. 
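The point that hashes answer "is it in there?" quickly can be sketched as follows. This is only a minimal illustration: the C<@fruits> data and the C<%is_fruit> name are made up for the example and do not come from the FAQ.

```perl
use strict;
use warnings;

# Hypothetical sample data, not from the FAQ.
my @fruits = qw(apple banana cherry);

# Build the lookup hash once: each array value becomes a key.
my %is_fruit = map { $_ => 1 } @fruits;

# Each membership test is now a single hash lookup
# rather than a scan of the whole array.
for my $candidate (qw(banana durian)) {
    printf "%s: %s\n", $candidate,
        exists $is_fruit{$candidate} ? "found" : "not found";
}
```

Building the hash costs one pass over the array, so this approach probably only pays off when you test more than one value.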
+ +That being said, there are several ways to approach this. In Perl 5.10 +and later, you can use the smart match operator to check that an item is +contained in an array or a hash: + + use 5.010; + + if( $item ~~ @array ) { + say "The array contains $item" + } + + if( $item ~~ %hash ) { + say "The hash contains $item" + } + +With earlier versions of Perl, you have to do a bit more work. If you +are going to make this query many times over arbitrary string values, +the fastest way is probably to invert the original array and maintain a +hash whose keys are the first array's values: + + my @blues = qw/azure cerulean teal turquoise lapis-lazuli/; + my %is_blue = (); + for (@blues) { $is_blue{$_} = 1 } + +Now you can check whether C<$is_blue{$some_color}>. It might have +been a good idea to keep the blues all in a hash in the first place. + +If the values are all small integers, you could use a simple indexed +array. This kind of an array will take up less space: + + my @primes = (2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31); + my @is_tiny_prime = (); + for (@primes) { $is_tiny_prime[$_] = 1 } + # or simply @is_tiny_prime[@primes] = (1) x @primes; + +Now you can check whether C<$is_tiny_prime[$some_number]>. + +If the values in question are integers instead of strings, you can save +quite a lot of space by using bit strings instead: + + my @articles = ( 1..10, 150..2000, 2017 ); + my $read = ''; + for (@articles) { vec($read,$_,1) = 1 } + +Now check whether C<vec($read,$n,1)> is true for some C<$n>. + +These methods guarantee fast individual tests but require a re-organization +of the original list or array. They only pay off if you have to test +multiple values against the same array. + +If you are testing only once, the standard module L<List::Util> exports +the function C<first> for this purpose. It works by stopping once it +finds the element.
It's written in C for speed, and its Perl equivalent +looks like this subroutine: + + sub first (&@) { + my $code = shift; + foreach (@_) { + return $_ if &{$code}(); + } + undef; + } + +If speed is of little concern, the common idiom uses grep in scalar context +(which returns the number of items that passed its condition) to traverse the +entire list. This does have the benefit of telling you how many matches it +found, though. + + my $is_there = grep $_ eq $whatever, @array; + +If you want to actually extract the matching elements, simply use grep in +list context. + + my @matches = grep $_ eq $whatever, @array; + +=head2 How do I compute the difference of two arrays? How do I compute the intersection of two arrays? + +Use a hash. Here's code to do both and more. It assumes that each +element is unique in a given array: + + my (@union, @intersection, @difference); + my %count = (); + foreach my $element (@array1, @array2) { $count{$element}++ } + foreach my $element (keys %count) { + push @union, $element; + push @{ $count{$element} > 1 ? \@intersection : \@difference }, $element; + } + +Note that this is the I<symmetric difference>, that is, all elements +in either A or in B but not in both. Think of it as an xor operation. + +=head2 How do I test whether two arrays or hashes are equal? + +With Perl 5.10 and later, the smart match operator can give you the answer +with the least amount of work: + + use 5.010; + + if( @array1 ~~ @array2 ) { + say "The arrays are the same"; + } + + if( %hash1 ~~ %hash2 ) # doesn't check values! { + say "The hash keys are the same"; + } + +The following code works for single-level arrays. It uses a +stringwise comparison, and does not distinguish defined versus +undefined empty strings. Modify if you have other needs. 
+ + $are_equal = compare_arrays(\@frogs, \@toads); + + sub compare_arrays { + my ($first, $second) = @_; + no warnings; # silence spurious -w undef complaints + return 0 unless @$first == @$second; + for (my $i = 0; $i < @$first; $i++) { + return 0 if $first->[$i] ne $second->[$i]; + } + return 1; + } + +For multilevel structures, you may wish to use an approach more +like this one. It uses the CPAN module L<FreezeThaw>: + + use FreezeThaw qw(cmpStr); + my @a = my @b = ( "this", "that", [ "more", "stuff" ] ); + + printf "a and b contain %s arrays\n", + cmpStr(\@a, \@b) == 0 + ? "the same" + : "different"; + +This approach also works for comparing hashes. Here we'll demonstrate +two different answers: + + use FreezeThaw qw(cmpStr cmpStrHard); + + my %a = my %b = ( "this" => "that", "extra" => [ "more", "stuff" ] ); + $a{EXTRA} = \%b; + $b{EXTRA} = \%a; + + printf "a and b contain %s hashes\n", + cmpStr(\%a, \%b) == 0 ? "the same" : "different"; + + printf "a and b contain %s hashes\n", + cmpStrHard(\%a, \%b) == 0 ? "the same" : "different"; + + +The first reports that both of the hashes contain the same data, +while the second reports that they do not. Which you prefer is left as +an exercise to the reader. + +=head2 How do I find the first array element for which a condition is true? + +To find the first array element which satisfies a condition, you can +use the C<first()> function in the L<List::Util> module, which comes +with Perl 5.8. This example finds the first element that contains +"Perl". + + use List::Util qw(first); + + my $element = first { /Perl/ } @array; + +If you cannot use L<List::Util>, you can make your own loop to do the +same thing. Once you find the element, you stop the loop with last.
+ + my $found; + foreach ( @array ) { + if( /Perl/ ) { $found = $_; last } + } + +If you want the array index, use the C<firstidx()> function from +C<List::MoreUtils>: + + use List::MoreUtils qw(firstidx); + my $index = firstidx { /Perl/ } @array; + +Or write it yourself, iterating through the indices +and checking the array element at each index until you find one +that satisfies the condition: + + my( $found, $index ) = ( undef, -1 ); + for( my $i = 0; $i < @array; $i++ ) { + if( $array[$i] =~ /Perl/ ) { + $found = $array[$i]; + $index = $i; + last; + } + } + +=head2 How do I handle linked lists? + +(contributed by brian d foy) + +Perl's arrays do not have a fixed size, so you don't need linked lists +if you just want to add or remove items. You can use array operations +such as C<push>, C<pop>, C<shift>, C<unshift>, or C<splice> to do +that. + +Sometimes, however, linked lists can be useful in situations where you +want to "shard" an array so you have many small arrays instead of +a single big array. You can keep arrays longer than Perl's largest +array index, lock smaller arrays separately in threaded programs, +reallocate less memory, or quickly insert elements in the middle of +the chain. + +Steve Lembark goes through the details in his YAPC::NA 2009 talk "Perly +Linked Lists" ( L<http://www.slideshare.net/lembark/perly-linked-lists> ), +although you can just use his L<LinkedList::Single> module. + +=head2 How do I handle circular lists?
+X<circular> X<array> X<Tie::Cycle> X<Array::Iterator::Circular> +X<cycle> X<modulus> + +(contributed by brian d foy) + +If you want to cycle through an array endlessly, you can increment the +index modulo the number of elements in the array: + + my @array = qw( a b c ); + my $i = 0; + + while( 1 ) { + print $array[ $i++ % @array ], "\n"; + last if $i > 20; + } + +You can also use L<Tie::Cycle> to use a scalar that always has the +next element of the circular array: + + use Tie::Cycle; + + tie my $cycle, 'Tie::Cycle', [ qw( FFFFFF 000000 FFFF00 ) ]; + + print $cycle; # FFFFFF + print $cycle; # 000000 + print $cycle; # FFFF00 + +The L<Array::Iterator::Circular> creates an iterator object for +circular arrays: + + use Array::Iterator::Circular; + + my $color_iterator = Array::Iterator::Circular->new( + qw(red green blue orange) + ); + + foreach ( 1 .. 20 ) { + print $color_iterator->next, "\n"; + } + +=head2 How do I shuffle an array randomly? + +If you either have Perl 5.8.0 or later installed, or if you have +Scalar-List-Utils 1.03 or later installed, you can say: + + use List::Util 'shuffle'; + + @shuffled = shuffle(@list); + +If not, you can use a Fisher-Yates shuffle. + + sub fisher_yates_shuffle { + my $deck = shift; # $deck is a reference to an array + return unless @$deck; # must not be empty! + + my $i = @$deck; + while (--$i) { + my $j = int rand ($i+1); + @$deck[$i,$j] = @$deck[$j,$i]; + } + } + + # shuffle my mpeg collection + # + my @mpeg = <audio/*/*.mp3>; + fisher_yates_shuffle( \@mpeg ); # randomize @mpeg in place + print @mpeg; + +Note that the above implementation shuffles an array in place, +unlike the C<List::Util::shuffle()> which takes a list and returns +a new shuffled list. + +You've probably seen shuffling algorithms that work using splice, +randomly picking another element to swap the current element with + + srand; + @new = (); + @old = 1 .. 
10; # just a demo + while (@old) { + push(@new, splice(@old, rand @old, 1)); + } + +This is bad because splice is already O(N), and since you do it N +times, you just invented a quadratic algorithm; that is, O(N**2). +This does not scale, although Perl is so efficient that you probably +won't notice this until you have rather largish arrays. + +=head2 How do I process/modify each element of an array? + +Use C<for>/C<foreach>: + + for (@lines) { + s/foo/bar/; # change that word + tr/XZ/ZX/; # swap those letters + } + +Here's another; let's compute spherical volumes: + + my @volumes = @radii; + for (@volumes) { # @volumes has changed parts + $_ **= 3; + $_ *= (4/3) * 3.14159; # this will be constant folded + } + +which can also be done with C<map()> which is made to transform +one list into another: + + my @volumes = map {$_ ** 3 * (4/3) * 3.14159} @radii; + +If you want to do the same thing to modify the values of the +hash, you can use the C<values> function. As of Perl 5.6 +the values are not copied, so if you modify $orbit (in this +case), you modify the value. + + for my $orbit ( values %orbits ) { + ($orbit **= 3) *= (4/3) * 3.14159; + } + +Prior to perl 5.6 C<values> returned copies of the values, +so older perl code often contains constructions such as +C<@orbits{keys %orbits}> instead of C<values %orbits> where +the hash is to be modified. + +=head2 How do I select a random element from an array? + +Use the C<rand()> function (see L<perlfunc/rand>): + + my $index = rand @array; + my $element = $array[$index]; + +Or, simply: + + my $element = $array[ rand @array ]; + +=head2 How do I permute N elements of a list? +X<List::Permutor> X<permute> X<Algorithm::Loops> X<Knuth> +X<The Art of Computer Programming> X<Fischer-Krause> + +Use the L<List::Permutor> module on CPAN. If the list is actually an +array, try the L<Algorithm::Permute> module (also on CPAN). 
It's +written in XS code and is very efficient: + + use Algorithm::Permute; + + my @array = 'a'..'d'; + my $p_iterator = Algorithm::Permute->new ( \@array ); + + while (my @perm = $p_iterator->next) { + print "next permutation: (@perm)\n"; + } + +For even faster execution, you could do: + + use Algorithm::Permute; + + my @array = 'a'..'d'; + + Algorithm::Permute::permute { + print "next permutation: (@array)\n"; + } @array; + +Here's a little program that generates all permutations of all the +words on each line of input. The algorithm embodied in the +C<permute()> function is discussed in Volume 4 (still unpublished) of +Knuth's I<The Art of Computer Programming> and will work on any list: + + #!/usr/bin/perl -n + # Fischer-Krause ordered permutation generator + + sub permute (&@) { + my $code = shift; + my @idx = 0..$#_; + while ( $code->(@_[@idx]) ) { + my $p = $#idx; + --$p while $idx[$p-1] > $idx[$p]; + my $q = $p or return; + push @idx, reverse splice @idx, $p; + ++$q while $idx[$p-1] > $idx[$q]; + @idx[$p-1,$q]=@idx[$q,$p-1]; + } + } + + permute { print "@_\n" } split; + +The L<Algorithm::Loops> module also provides the C<NextPermute> and +C<NextPermuteNum> functions which efficiently find all unique permutations +of an array, even if it contains duplicate values, modifying it in-place: +if its elements are in reverse-sorted order then the array is reversed, +making it sorted, and it returns false; otherwise the next +permutation is returned. + +C<NextPermute> uses string order and C<NextPermuteNum> numeric order, so +you can enumerate all the permutations of C<0..9> like this: + + use Algorithm::Loops qw(NextPermuteNum); + + my @list= 0..9; + do { print "@list\n" } while NextPermuteNum @list; + +=head2 How do I sort an array by (anything)? + +Supply a comparison function to sort() (described in L<perlfunc/sort>): + + @list = sort { $a <=> $b } @list; + +The default sort function is cmp, string comparison, which would +sort C<(1, 2, 10)> into C<(1, 10, 2)>. 
C<< <=> >>, used above, is +the numerical comparison operator. + +If you have a complicated function needed to pull out the part you +want to sort on, then don't do it inside the sort function. Pull it +out first, because the sort BLOCK can be called many times for the +same element. Here's an example of how to pull out the first word +after the first number on each item, and then sort those words +case-insensitively. + + my @idx; + for (@data) { + my $item; + ($item) = /\d+\s*(\S+)/; + push @idx, uc($item); + } + my @sorted = @data[ sort { $idx[$a] cmp $idx[$b] } 0 .. $#idx ]; + +which could also be written this way, using a trick +that's come to be known as the Schwartzian Transform: + + my @sorted = map { $_->[0] } + sort { $a->[1] cmp $b->[1] } + map { [ $_, uc( (/\d+\s*(\S+)/)[0]) ] } @data; + +If you need to sort on several fields, the following paradigm is useful. + + my @sorted = sort { + field1($a) <=> field1($b) || + field2($a) cmp field2($b) || + field3($a) cmp field3($b) + } @data; + +This can be conveniently combined with precalculation of keys as given +above. + +See the F<sort> article in the "Far More Than You Ever Wanted +To Know" collection in L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> for +more about this approach. + +See also the question later in L<perlfaq4> on sorting hashes. + +=head2 How do I manipulate arrays of bits? + +Use C<pack()> and C<unpack()>, or else C<vec()> and the bitwise +operations. + +For example, you don't have to store individual bits in an array +(which would mean that you're wasting a lot of space). To convert an +array of bits to a string, use C<vec()> to set the right bits. This +sets C<$vec> to have bit N set only if C<$ints[N]> was set: + + my @ints = (...); # array of bits, e.g. ( 1, 0, 0, 1, 1, 0 ... ) + my $vec = ''; + foreach( 0 .. $#ints ) { + vec($vec,$_,1) = 1 if $ints[$_]; + } + +The string C<$vec> only takes up as many bits as it needs. 
For +instance, if you had 16 entries in C<@ints>, C<$vec> only needs two +bytes to store them (not counting the scalar variable overhead). + +Here's how, given a vector in C<$vec>, you can get those bits into +your C<@ints> array: + + sub bitvec_to_list { + my $vec = shift; + my @ints; + # Find null-byte density then select best algorithm + if ($vec =~ tr/\0// / length $vec > 0.95) { + use integer; + my $i; + + # This method is faster with mostly null-bytes + while($vec =~ /[^\0]/g ) { + $i = -9 + 8 * pos $vec; + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + push @ints, $i if vec($vec, ++$i, 1); + } + } + else { + # This method is a fast general algorithm + use integer; + my $bits = unpack "b*", $vec; + push @ints, 0 if $bits =~ s/^(\d)// && $1; + push @ints, pos $bits while($bits =~ /1/g); + } + + return \@ints; + } + +This method gets faster the more sparse the bit vector is. +(Courtesy of Tim Bunce and Winfried Koenig.) + +You can make the while loop a lot shorter with this suggestion +from Benjamin Goldberg: + + while($vec =~ /[^\0]+/g ) { + push @ints, grep vec($vec, $_, 1), $-[0] * 8 .. $+[0] * 8; + } + +Or use the CPAN module L<Bit::Vector>: + + my $vector = Bit::Vector->new($num_of_bits); + $vector->Index_List_Store(@ints); + my @ints = $vector->Index_List_Read(); + +L<Bit::Vector> provides efficient methods for bit vectors, sets of +small integers and "big int" math. + +Here's a more extensive illustration using vec(): + + # vec demo + my $vector = "\xff\x0f\xef\xfe"; + print "Ilya's string \\xff\\x0f\\xef\\xfe represents the number ", + unpack("N", $vector), "\n"; + my $is_set = vec($vector, 23, 1); + print "Its 23rd bit is ", $is_set ?
"set" : "clear", ".\n"; + pvec($vector); + + set_vec(1,1,1); + set_vec(3,1,1); + set_vec(23,1,1); + + set_vec(3,1,3); + set_vec(3,2,3); + set_vec(3,4,3); + set_vec(3,4,7); + set_vec(3,8,3); + set_vec(3,8,7); + + set_vec(0,32,17); + set_vec(1,32,17); + + sub set_vec { + my ($offset, $width, $value) = @_; + my $vector = ''; + vec($vector, $offset, $width) = $value; + print "offset=$offset width=$width value=$value\n"; + pvec($vector); + } + + sub pvec { + my $vector = shift; + my $bits = unpack("b*", $vector); + my $i = 0; + my $BASE = 8; + + print "vector length in bytes: ", length($vector), "\n"; + my @bytes = unpack("A8" x length($vector), $bits); + print "bits are: @bytes\n\n"; + } + +=head2 Why does defined() return true on empty arrays and hashes? + +The short story is that you should probably only use defined on scalars or +functions, not on aggregates (arrays and hashes). See L<perlfunc/defined> +in the 5.004 release or later of Perl for more detail. + +=head1 Data: Hashes (Associative Arrays) + +=head2 How do I process an entire hash? + +(contributed by brian d foy) + +There are a couple of ways that you can process an entire hash. You +can get a list of keys, then go through each key, or grab one +key-value pair at a time. + +To go through all of the keys, use the C<keys> function. This extracts +all of the keys of the hash and gives them back to you as a list. You +can then get the value through the particular key you're processing: + + foreach my $key ( keys %hash ) { + my $value = $hash{$key}; + ... + } + +Once you have the list of keys, you can process that list before you +process the hash elements. For instance, you can sort the keys so you +can process them in lexical order: + + foreach my $key ( sort keys %hash ) { + my $value = $hash{$key}; + ... + } + +Or, you might want to only process some of the items.
If you only want +to deal with the keys that start with C<text:>, you can select just +those using C<grep>: + + foreach my $key ( grep /^text:/, keys %hash ) { + my $value = $hash{$key}; + ... + } + +If the hash is very large, you might not want to create a long list of +keys. To save some memory, you can grab one key-value pair at a time using +C<each()>, which returns a pair you haven't seen yet: + + while( my( $key, $value ) = each( %hash ) ) { + ... + } + +The C<each> operator returns the pairs in apparently random order, so if +ordering matters to you, you'll have to stick with the C<keys> method. + +The C<each()> operator can be a bit tricky though. You can't add or +delete keys of the hash while you're using it without possibly +skipping or re-processing some pairs after Perl internally rehashes +all of the elements. Additionally, a hash has only one iterator, so if +you mix C<keys>, C<values>, or C<each> on the same hash, you risk resetting +the iterator and messing up your processing. See the C<each> entry in +L<perlfunc> for more details. + +=head2 How do I merge two hashes? +X<hash> X<merge> X<slice, hash> + +(contributed by brian d foy) + +Before you decide to merge two hashes, you have to decide what to do +if both hashes contain keys that are the same and if you want to leave +the original hashes as they were. + +If you want to preserve the original hashes, copy one hash (C<%hash1>) +to a new hash (C<%new_hash>), then add the keys from the other hash +(C<%hash2>) to the new hash. Checking that the key already exists in +C<%new_hash> gives you a chance to decide what to do with the +duplicates: + + my %new_hash = %hash1; # make a copy; leave %hash1 alone + + foreach my $key2 ( keys %hash2 ) { + if( exists $new_hash{$key2} ) { + warn "Key [$key2] is in both hashes!"; + # handle the duplicate (perhaps only warning) + ...
+ next; + } + else { + $new_hash{$key2} = $hash2{$key2}; + } + } + +If you don't want to create a new hash, you can still use this looping +technique; just change the C<%new_hash> to C<%hash1>. + + foreach my $key2 ( keys %hash2 ) { + if( exists $hash1{$key2} ) { + warn "Key [$key2] is in both hashes!"; + # handle the duplicate (perhaps only warning) + ... + next; + } + else { + $hash1{$key2} = $hash2{$key2}; + } + } + +If you don't care that one hash overwrites keys and values from the other, you +could just use a hash slice to add one hash to another. In this case, values +from C<%hash2> replace values from C<%hash1> when they have keys in common: + + @hash1{ keys %hash2 } = values %hash2; + +=head2 What happens if I add or remove keys from a hash while iterating over it? + +(contributed by brian d foy) + +The easy answer is "Don't do that!" + +If you iterate through the hash with each(), you can delete the key +most recently returned without worrying about it. If you delete or add +other keys, the iterator may skip or double up on them since perl +may rearrange the hash table. See the +entry for C<each()> in L<perlfunc>. + +=head2 How do I look up a hash element by value? + +Create a reverse hash: + + my %by_value = reverse %by_key; + my $key = $by_value{$value}; + +That's not particularly efficient. It would be more space-efficient +to use: + + while (my ($key, $value) = each %by_key) { + $by_value{$value} = $key; + } + +If your hash could have repeated values, the methods above will only find +one of the associated keys. This may or may not worry you. If it does +worry you, you can always reverse the hash into a hash of arrays instead: + + while (my ($key, $value) = each %by_key) { + push @{$key_list_by_value{$value}}, $key; + } + +=head2 How can I know how many entries are in a hash? + +(contributed by brian d foy) + +This is very similar to "How do I process an entire hash?", also in +L<perlfaq4>, but a bit simpler in the common cases. 
+ +You can use the C<keys()> built-in function in scalar context to find out +how many entries you have in a hash: + + my $key_count = keys %hash; # must be scalar context! + +If you want to find out how many entries have a defined value, that's +a bit different. You have to check each value. A C<grep> is handy: + + my $defined_value_count = grep { defined } values %hash; + +You can use that same structure to count the entries any way that +you like. If you want the count of the keys with vowels in them, +you just test for that instead: + + my $vowel_count = grep { /[aeiou]/ } keys %hash; + +The C<grep> in scalar context returns the count. If you want the list +of matching items, just use it in list context instead: + + my @defined_values = grep { defined } values %hash; + +The C<keys()> function also resets the iterator, which means that you may +see strange results if you use this between uses of other hash operators +such as C<each()>. + +=head2 How do I sort a hash (optionally by value instead of key)? + +(contributed by brian d foy) + +To sort a hash, start with the keys. In this example, we give the list of +keys to the sort function which then compares them ASCIIbetically (which +might be affected by your locale settings). The output list has the keys +in ASCIIbetical order. Once we have the keys, we can go through them to +create a report which lists the keys in ASCIIbetical order. + + my @keys = sort { $a cmp $b } keys %hash; + + foreach my $key ( @keys ) { + printf "%-20s %6d\n", $key, $hash{$key}; + } + +We could get more fancy in the C<sort()> block though. Instead of +comparing the keys, we can compute a value with them and use that +value as the comparison.
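As a rough sketch of comparing a computed value rather than the key itself, the following orders keys by their length. The sample hash is invented for illustration, and the C<or $a cmp $b> tie-breaker is an added assumption so that keys of equal length come out in a predictable order.

```perl
use strict;
use warnings;

# Hypothetical data for illustration only.
my %hash = ( fig => 3, banana => 1, kiwi => 2 );

# Compare a value computed from each key (its length) instead of the
# key itself; fall back to the key so that ties sort deterministically.
my @keys = sort { length($a) <=> length($b) or $a cmp $b } keys %hash;

print "@keys\n";    # fig kiwi banana
```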
+ +For instance, to make our report order case-insensitive, we use +C<lc> to lowercase the keys before comparing them: + + my @keys = sort { lc $a cmp lc $b } keys %hash; + +Note: if the computation is expensive or the hash has many elements, +you may want to look at the Schwartzian Transform to cache the +computation results. + +If we want to sort by the hash value instead, we use the hash key +to look it up. We still get out a list of keys, but this time they +are ordered by their value. + + my @keys = sort { $hash{$a} <=> $hash{$b} } keys %hash; + +From there we can get more complex. If the hash values are the same, +we can provide a secondary sort on the hash key. + + my @keys = sort { + $hash{$a} <=> $hash{$b} + or + "\L$a" cmp "\L$b" + } keys %hash; + +=head2 How can I always keep my hash sorted? +X<hash tie sort DB_File Tie::IxHash> + +You can look into using the C<DB_File> module and C<tie()> using the +C<$DB_BTREE> hash bindings as documented in L<DB_File/"In Memory +Databases">. The L<Tie::IxHash> module from CPAN might also be +instructive. Although this does keep your hash sorted, you might not +like the slowdown you suffer from the tie interface. Are you sure you +need to do this? :) + +=head2 What's the difference between "delete" and "undef" with hashes? + +Hashes contain pairs of scalars: the first is the key, the +second is the value. The key will be coerced to a string, +although the value can be any kind of scalar: string, +number, or reference. If a key C<$key> is present in +%hash, C<exists($hash{$key})> will return true. The value +for a given key can be C<undef>, in which case +C<$hash{$key}> will be C<undef> while C<exists $hash{$key}> +will return true. This corresponds to (C<$key>, C<undef>) +being in the hash. + +Pictures help... 
Here's the C<%hash> table: + + keys values + +------+------+ + | a | 3 | + | x | 7 | + | d | 0 | + | e | 2 | + +------+------+ + +And these conditions hold + + $hash{'a'} is true + $hash{'d'} is false + defined $hash{'d'} is true + defined $hash{'a'} is true + exists $hash{'a'} is true (Perl 5 only) + grep ($_ eq 'a', keys %hash) is true + +If you now say + + undef $hash{'a'} + +your table now reads: + + + keys values + +------+------+ + | a | undef| + | x | 7 | + | d | 0 | + | e | 2 | + +------+------+ + +and these conditions now hold; changes in caps: + + $hash{'a'} is FALSE + $hash{'d'} is false + defined $hash{'d'} is true + defined $hash{'a'} is FALSE + exists $hash{'a'} is true (Perl 5 only) + grep ($_ eq 'a', keys %hash) is true + +Notice the last two: you have an undef value, but a defined key! + +Now, consider this: + + delete $hash{'a'} + +your table now reads: + + keys values + +------+------+ + | x | 7 | + | d | 0 | + | e | 2 | + +------+------+ + +and these conditions now hold; changes in caps: + + $hash{'a'} is false + $hash{'d'} is false + defined $hash{'d'} is true + defined $hash{'a'} is false + exists $hash{'a'} is FALSE (Perl 5 only) + grep ($_ eq 'a', keys %hash) is FALSE + +See, the whole entry is gone! + +=head2 Why don't my tied hashes make the defined/exists distinction? + +This depends on the tied hash's implementation of EXISTS(). +For example, there isn't the concept of undef with hashes +that are tied to DBM* files. It also means that exists() and +defined() do the same thing with a DBM* file, and what they +end up doing is not what they do with ordinary hashes. + +=head2 How do I reset an each() operation part-way through? + +(contributed by brian d foy) + +You can use the C<keys> or C<values> functions to reset C<each>. To +simply reset the iterator used by C<each> without doing anything else, +use one of them in void context: + + keys %hash; # resets iterator, nothing else. + values %hash; # resets iterator, nothing else. 
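As a minimal sketch of the reset described above (the hash contents and variable names are made up for illustration), the following abandons an C<each> traversal after one pair, resets the iterator with C<keys> in void context, and then walks the whole hash again:

```perl
use strict;
use warnings;

my %hash = ( a => 1, b => 2, c => 3 );

# Take one pair, leaving the iterator part-way through the hash.
my ( $first_key ) = each %hash;

# Reset the iterator; keys in void context does nothing else.
keys %hash;

# A fresh each() loop now visits every pair, including the first one.
my %seen;
while ( my ( $key, $value ) = each %hash ) {
    $seen{$key} = $value;
}

print scalar( keys %seen ), " pairs seen\n";    # prints "3 pairs seen"
```

Without the reset, the loop would pick up after the pair that was already consumed and C<%seen> would end up one entry short.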
+ +See the documentation for C<each> in L<perlfunc>. + +=head2 How can I get the unique keys from two hashes? + +First you extract the keys from the hashes into lists, then solve +the "removing duplicates" problem described above. For example: + + my %seen = (); + for my $element (keys(%foo), keys(%bar)) { + $seen{$element}++; + } + my @uniq = keys %seen; + +Or more succinctly: + + my @uniq = keys %{{%foo,%bar}}; + +Or if you really want to save space: + + my %seen = (); + while (defined ($key = each %foo)) { + $seen{$key}++; + } + while (defined ($key = each %bar)) { + $seen{$key}++; + } + my @uniq = keys %seen; + +=head2 How can I store a multidimensional array in a DBM file? + +Either stringify the structure yourself (no fun), or else +get the MLDBM (which uses Data::Dumper) module from CPAN and layer +it on top of either DB_File or GDBM_File. You might also try DBM::Deep, but +it can be a bit slow. + +=head2 How can I make my hash remember the order I put elements into it? + +Use the L<Tie::IxHash> from CPAN. + + use Tie::IxHash; + + tie my %myhash, 'Tie::IxHash'; + + for (my $i=0; $i<20; $i++) { + $myhash{$i} = 2*$i; + } + + my @keys = keys %myhash; + # @keys = (0,1,2,3,...) + +=head2 Why does passing a subroutine an undefined element in a hash create it? + +(contributed by brian d foy) + +Are you using a really old version of Perl? + +Normally, accessing a hash key's value for a nonexistent key will +I<not> create the key. + + my %hash = (); + my $value = $hash{ 'foo' }; + print "This won't print\n" if exists $hash{ 'foo' }; + +Passing C<$hash{ 'foo' }> to a subroutine used to be a special case, though. 
+Since you could assign directly to C<$_[0]>, Perl had to be ready to +make that assignment so it created the hash key ahead of time: + + my_sub( $hash{ 'foo' } ); + print "This will print before 5.004\n" if exists $hash{ 'foo' }; + + sub my_sub { + # $_[0] = 'bar'; # create hash key in case you do this + 1; + } + +Since Perl 5.004, however, this situation is a special case and Perl +creates the hash key only when you make the assignment: + + my_sub( $hash{ 'foo' } ); + print "This will print, even after 5.004\n" if exists $hash{ 'foo' }; + + sub my_sub { + $_[0] = 'bar'; + } + +However, if you want the old behavior (and think carefully about that +because it's a weird side effect), you can pass a hash slice instead. +Perl 5.004 didn't make this a special case: + + my_sub( @hash{ qw/foo/ } ); + +=head2 How can I make the Perl equivalent of a C structure/C++ class/hash or array of hashes or arrays? + +Usually a hash ref, perhaps like this: + + $record = { + NAME => "Jason", + EMPNO => 132, + TITLE => "deputy peon", + AGE => 23, + SALARY => 37_000, + PALS => [ "Norbert", "Rhys", "Phineas"], + }; + +References are documented in L<perlref> and L<perlreftut>. +Examples of complex data structures are given in L<perldsc> and +L<perllol>. Examples of structures and object-oriented classes are +in L<perltoot>. + +=head2 How can I use a reference as a hash key? + +(contributed by brian d foy and Ben Morrow) + +Hash keys are strings, so you can't really use a reference as the key. +When you try to do that, perl turns the reference into its stringified +form (for instance, C<HASH(0xDEADBEEF)>). From there you can't get +back the reference from the stringified form, at least without doing +some extra work on your own. + +Remember that the entry in the hash will still be there even if +the referenced variable goes out of scope, and that it is entirely +possible for Perl to subsequently allocate a different variable at +the same address. 
This will mean a new variable might accidentally +be associated with the value for an old. + +If you have Perl 5.10 or later, and you just want to store a value +against the reference for lookup later, you can use the core +Hash::Util::Fieldhash module. This will also handle renaming the +keys if you use multiple threads (which causes all variables to be +reallocated at new addresses, changing their stringification), and +garbage-collecting the entries when the referenced variable goes out +of scope. + +If you actually need to be able to get a real reference back from +each hash entry, you can use the Tie::RefHash module, which does the +required work for you. + +=head2 How can I check if a key exists in a multilevel hash? + +(contributed by brian d foy) + +The trick to this problem is avoiding accidental autovivification. If +you want to check three keys deep, you might naE<0xEF>vely try this: + + my %hash; + if( exists $hash{key1}{key2}{key3} ) { + ...; + } + +Even though you started with a completely empty hash, after that call to +C<exists> you've created the structure you needed to check for C<key3>: + + %hash = ( + 'key1' => { + 'key2' => {} + } + ); + +That's autovivification. You can get around this in a few ways. The +easiest way is to just turn it off. The lexical C<autovivification> +pragma is available on CPAN. Now you don't add to the hash: + + { + no autovivification; + my %hash; + if( exists $hash{key1}{key2}{key3} ) { + ...; + } + } + +The L<Data::Diver> module on CPAN can do it for you too. Its C<Dive> +subroutine can tell you not only if the keys exist but also get the +value: + + use Data::Diver qw(Dive); + + my @exists = Dive( \%hash, qw(key1 key2 key3) ); + if( ! @exists ) { + ...; # keys do not exist + } + elsif( ! defined $exists[0] ) { + ...; # keys exist but value is undef + } + +You can easily do this yourself too by checking each level of the hash +before you move onto the next level. 
This is essentially what +L<Data::Diver> does for you: + + if( check_hash( \%hash, qw(key1 key2 key3) ) ) { + ...; + } + + sub check_hash { + my( $hash, @keys ) = @_; + + return unless @keys; + + foreach my $key ( @keys ) { + return unless eval { exists $hash->{$key} }; + $hash = $hash->{$key}; + } + + return 1; + } + +=head2 How can I prevent addition of unwanted keys into a hash? + +Since version 5.8.0, hashes can be I<restricted> to a fixed number +of given keys. Methods for creating and dealing with restricted hashes +are exported by the L<Hash::Util> module. + +=head1 Data: Misc + +=head2 How do I handle binary data correctly? + +Perl is binary-clean, so it can handle binary data just fine. +On Windows or DOS, however, you have to use C<binmode> for binary +files to avoid conversions for line endings. In general, you should +use C<binmode> any time you want to work with binary data. + +Also see L<perlfunc/"binmode"> or L<perlopentut>. + +If you're concerned about 8-bit textual data then see L<perllocale>. +If you want to deal with multibyte characters, however, there are +some gotchas. See the section on Regular Expressions. + +=head2 How do I determine whether a scalar is a number/whole/integer/float? + +Assuming that you don't care about IEEE notations like "NaN" or +"Infinity", you probably just want to use a regular expression: + + use 5.010; + + given( $number ) { + when( /\D/ ) + { say "\thas nondigits"; continue } + when( /^\d+\z/ ) + { say "\tis a whole number"; continue } + when( /^-?\d+\z/ ) + { say "\tis an integer"; continue } + when( /^[+-]?\d+\z/ ) + { say "\tis a +/- integer"; continue } + when( /^-?(?:\d+\.?|\.\d)\d*\z/ ) + { say "\tis a real number"; continue } + when( /^[+-]?(?=\.?\d)\d*\.?\d*(?:e[+-]?\d+)?\z/i) + { say "\tis a C float" } + } + +There are also some commonly used modules for the task. 
+L<Scalar::Util> (distributed with 5.8) provides access to perl's +internal function C<looks_like_number> for determining whether a +variable looks like a number. L<Data::Types> exports functions that +validate data types using both the above and other regular +expressions. Thirdly, there is L<Regexp::Common> which has regular +expressions to match various types of numbers. Those three modules are +available from the CPAN. + +If you're on a POSIX system, Perl supports the C<POSIX::strtod> +function for converting strings to doubles (and also C<POSIX::strtol> +for longs). Its semantics are somewhat cumbersome, so here's a +C<getnum> wrapper function for more convenient access. This function +takes a string and returns the number it found, or C<undef> for input +that isn't a C float. The C<is_numeric> function is a front end to +C<getnum> if you just want to say, "Is this a float?" + + sub getnum { + use POSIX qw(strtod); + my $str = shift; + $str =~ s/^\s+//; + $str =~ s/\s+$//; + $! = 0; + my($num, $unparsed) = strtod($str); + if (($str eq '') || ($unparsed != 0) || $!) { + return undef; + } + else { + return $num; + } + } + + sub is_numeric { defined getnum($_[0]) } + +Or you could check out the L<String::Scanf> module on the CPAN +instead. + +=head2 How do I keep persistent data across program calls? + +For some specific applications, you can use one of the DBM modules. +See L<AnyDBM_File>. More generically, you should consult the L<FreezeThaw> +or L<Storable> modules from CPAN. Starting from Perl 5.8, L<Storable> is part +of the standard distribution. Here's one example using L<Storable>'s C<store> +and C<retrieve> functions: + + use Storable; + store(\%hash, "filename"); + + # later on... + $href = retrieve("filename"); # by ref + %hash = %{ retrieve("filename") }; # direct to hash + +=head2 How do I print out or copy a recursive data structure? + +The L<Data::Dumper> module on CPAN (or the 5.005 release of Perl) is great +for printing out data structures. 
The L<Storable> module on CPAN (or the +5.8 release of Perl) provides a function called C<dclone> that recursively +copies its argument. + + use Storable qw(dclone); + $r2 = dclone($r1); + +Here C<$r1> can be a reference to any kind of data structure you'd like. +It will be deeply copied. Because C<dclone> takes and returns references, +you'd have to add extra punctuation if you had a hash of arrays that +you wanted to copy. + + %newhash = %{ dclone(\%oldhash) }; + +=head2 How do I define methods for every class/object? + +(contributed by Ben Morrow) + +You can use the C<UNIVERSAL> class (see L<UNIVERSAL>). However, please +be very careful to consider the consequences of doing this: adding +methods to every object is very likely to have unintended +consequences. If possible, it would be better to have all your objects +inherit from some common base class, or to use an object system like +Moose that supports roles. + +=head2 How do I verify a credit card checksum? + +Get the L<Business::CreditCard> module from CPAN. + +=head2 How do I pack arrays of doubles or floats for XS code? + +The arrays.h/arrays.c code in the L<PGPLOT> module on CPAN does just this. +If you're doing a lot of float or double processing, consider using +the L<PDL> module from CPAN instead--it makes number-crunching easy. + +See L<http://search.cpan.org/dist/PGPLOT> for the code. + + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples in this file +are hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun +or for profit as you see fit. A simple comment in the code giving +credit would be courteous but is not required.
diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq5.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq5.pod new file mode 100644 index 00000000000..60bd08306d5 --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq5.pod @@ -0,0 +1,1574 @@ +=head1 NAME + +perlfaq5 - Files and Formats + +=head1 DESCRIPTION + +This section deals with I/O and the "f" issues: filehandles, flushing, +formats, and footers. + +=head2 How do I flush/unbuffer an output filehandle? Why must I do this? +X<flush> X<buffer> X<unbuffer> X<autoflush> + +(contributed by brian d foy) + +You might like to read Mark Jason Dominus's "Suffering From Buffering" +at L<http://perl.plover.com/FAQs/Buffering.html> . + +Perl normally buffers output so it doesn't make a system call for every +bit of output. By saving up output, it makes fewer expensive system calls. +For instance, in this little bit of code, you want to print a dot to the +screen for every line you process to watch the progress of your program. +Instead of seeing a dot for every line, Perl buffers the output and you +have a long wait before you see a row of 50 dots all at once: + + # long wait, then row of dots all at once + while( <> ) { + print "."; + print "\n" unless ++$count % 50; + + #... expensive line processing operations + } + +To get around this, you have to unbuffer the output filehandle, in this +case, C<STDOUT>. You can set the special variable C<$|> to a true value +(mnemonic: making your filehandles "piping hot"): + + $|++; + + # dot shown immediately + while( <> ) { + print "."; + print "\n" unless ++$count % 50; + + #... expensive line processing operations + } + +The C<$|> is one of the per-filehandle special variables, so each +filehandle has its own copy of its value. 
If you want to merge +standard output and standard error for instance, you have to unbuffer +each (although STDERR might be unbuffered by default): + + { + my $previous_default = select(STDOUT); # save previous default + $|++; # autoflush STDOUT + select(STDERR); + $|++; # autoflush STDERR, to be sure + select($previous_default); # restore previous default + } + + # now should alternate . and + + while( 1 ) { + sleep 1; + print STDOUT "."; + print STDERR "+"; + print STDOUT "\n" unless ++$count % 25; + } + +Besides the C<$|> special variable, you can use C<binmode> to give +your filehandle a C<:unix> layer, which is unbuffered: + + binmode( STDOUT, ":unix" ); + + while( 1 ) { + sleep 1; + print "."; + print "\n" unless ++$count % 50; + } + +For more information on output layers, see the entries for C<binmode> +and C<open> in L<perlfunc>, and the L<PerlIO> module documentation. + +If you are using L<IO::Handle> or one of its subclasses, you can +call the C<autoflush> method to change the settings of the +filehandle: + + use IO::Handle; + open my( $io_fh ), ">", "output.txt"; + $io_fh->autoflush(1); + +The L<IO::Handle> objects also have a C<flush> method. You can flush +the buffer any time you want without turning on autoflush: + + $io_fh->flush; + +=head2 How do I change, delete, or insert a line in a file, or append to the beginning of a file? +X<file, editing> + +(contributed by brian d foy) + +The basic idea of inserting, changing, or deleting a line from a text +file involves reading and printing the file to the point you want to +make the change, making the change, then reading and printing the rest +of the file. Perl doesn't provide random access to lines (especially +since the input record separator, C<$/>, is mutable), although modules +such as L<Tie::File> can fake it.
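To illustrate how L<Tie::File> can fake random access to lines, here is a rough sketch. The temporary file and its three lines are invented for the example; with a real file you would tie its path directly:

```perl
use strict;
use warnings;
use Tie::File;
use File::Temp qw(tempfile);

# A throwaway file so the sketch is self-contained.
my ( $fh, $file ) = tempfile( UNLINK => 1 );
print $fh "first\nsecond\nthird\n";
close $fh or die "Can't close $file: $!";

# Each array element is one line of the file, without its line ending.
tie my @lines, 'Tie::File', $file or die "Can't tie $file: $!";

$lines[1] = 'SECOND';               # change line 2
splice @lines, 2, 0, 'inserted';    # insert before the old line 3
shift @lines;                       # delete line 1

untie @lines;                       # done with the file
```

Array operations on C<@lines> are written through to the file, which is convenient for small edits but goes through the tie interface for every access.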
+ +A Perl program to do these tasks takes the basic form of opening a +file, printing its lines, then closing the file: + + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; + + while( <$in> ) { + print $out $_; + } + + close $out; + +Within that basic form, add the parts that you need to insert, change, +or delete lines. + +To prepend lines to the beginning, print those lines before you enter +the loop that prints the existing lines. + + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; + + print $out "# Add this line to the top\n"; # <--- HERE'S THE MAGIC + + while( <$in> ) { + print $out $_; + } + + close $out; + +To change existing lines, insert the code to modify the lines inside +the C<while> loop. In this case, the code finds all lowercased +versions of "perl" and capitalizes them. This happens for every line, so +be sure that you're supposed to do that on every line! + + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; + + print $out "# Add this line to the top\n"; + + while( <$in> ) { + s/\b(perl)\b/Perl/g; + print $out $_; + } + + close $out; + +To change only a particular line, the input line number, C<$.>, is +useful. First read and print the lines up to the one you want to +change. Next, read the single line you want to change, change it, and +print it. After that, read the rest of the lines and print those: + + while( <$in> ) { # print the lines before the change + print $out $_; + last if $. == 4; # line number before change + } + + my $line = <$in>; + $line =~ s/\b(perl)\b/Perl/g; + print $out $line; + + while( <$in> ) { # print the rest of the lines + print $out $_; + } + +To skip lines, use the looping controls.
The C<next> in this example +skips comment lines, and the C<last> stops all processing once it +encounters either C<__END__> or C<__DATA__>. + + while( <$in> ) { + next if /^\s*#/; # skip comment lines + last if /^__(END|DATA)__$/; # stop at end of code marker + print $out $_; + } + +Do the same sort of thing to delete a particular line by using C<next> +to skip the lines you don't want to show up in the output. This +example skips every fifth line: + + while( <$in> ) { + next unless $. % 5; + print $out $_; + } + +If, for some odd reason, you really want to see the whole file at once +rather than processing line-by-line, you can slurp it in (as long as +you can fit the whole thing in memory!): + + open my $in, '<', $file or die "Can't read old file: $!"; + open my $out, '>', "$file.new" or die "Can't write new file: $!"; + + my @lines = do { local $/; <$in> }; # slurp! + + # do your magic here + + print $out @lines; + +Modules such as L<File::Slurp> and L<Tie::File> can help with that +too. If you can, however, avoid reading the entire file at once. Perl +won't give that memory back to the operating system until the process +finishes. + +You can also use Perl one-liners to modify a file in-place. The +following changes the first 'Fred' on each line to 'Barney' in F<inFile.txt>, overwriting +the file with the new contents. With the C<-p> switch, Perl wraps a +C<while> loop around the code you specify with C<-e>, and C<-i> turns +on in-place editing. The current line is in C<$_>. With C<-p>, Perl +automatically prints the value of C<$_> at the end of the loop. See +L<perlrun> for more details. + + perl -pi -e 's/Fred/Barney/' inFile.txt + +To make a backup of C<inFile.txt>, give C<-i> a file extension to add: + + perl -pi.bak -e 's/Fred/Barney/' inFile.txt + +To change only the fifth line, you can add a test checking C<$.>, the +input line number, then only perform the operation when the test +passes: + + perl -pi -e 's/Fred/Barney/ if $.
== 5' inFile.txt + +To add lines before a certain line, you can add a line (or lines!) +before Perl prints C<$_>: + + perl -pi -e 'print "Put before third line\n" if $. == 3' inFile.txt + +You can even add a line to the beginning of a file, since the current +line prints at the end of the loop: + + perl -pi -e 'print "Put before first line\n" if $. == 1' inFile.txt + +To insert a line after one already in the file, use the C<-n> switch. +It's just like C<-p> except that it doesn't print C<$_> at the end of +the loop, so you have to do that yourself. In this case, print C<$_> +first, then print the line that you want to add. + + perl -ni -e 'print; print "Put after fifth line\n" if $. == 5' inFile.txt + +To delete lines, only print the ones that you want. + + perl -ni -e 'print if /d/' inFile.txt + +=head2 How do I count the number of lines in a file? +X<file, counting lines> X<lines> X<line> + +(contributed by brian d foy) + +Conceptually, the easiest way to count the lines in a file is to +simply read them and count them: + + my $count = 0; + while( <$fh> ) { $count++; } + +You don't really have to count them yourself, though, since Perl +already does that with the C<$.> variable, which is the current line +number from the last filehandle read: + + 1 while( <$fh> ); + my $count = $.; + +If you want to use C<$.>, you can reduce it to a simple one-liner, +like one of these: + + % perl -lne '} print $.; {' file + + % perl -lne 'END { print $. }' file + +Those can be rather inefficient though. If they aren't fast enough for +you, you might just read chunks of data and count the number of +newlines: + + my $lines = 0; + open my($fh), '<:raw', $filename or die "Can't open $filename: $!"; + while( sysread $fh, $buffer, 4096 ) { + $lines += ( $buffer =~ tr/\n// ); + } + close $fh; + +However, that doesn't work if the line ending isn't a newline.
You +might change that C<tr///> to a C<s///> so you can count the number of +times the input record separator, C<$/>, shows up: + + my $lines = 0; + open my($fh), '<:raw', $filename or die "Can't open $filename: $!"; + while( sysread $fh, $buffer, 4096 ) { + $lines += ( $buffer =~ s|$/||g ); + } + close $fh; + +If you don't mind shelling out, the C<wc> command is usually the +fastest, even with the extra interprocess overhead. Ensure that you +have an untainted filename though: + + #!perl -T + + $ENV{PATH} = undef; + + my $lines; + if( $filename =~ /^([0-9a-z_.]+)\z/ ) { + $lines = `/usr/bin/wc -l $1`; + chomp $lines; + } + +=head2 How do I delete the last N lines from a file? +X<lines> X<file> + +(contributed by brian d foy) + +The easiest conceptual solution is to count the lines in the +file then start at the beginning and print the number of lines +(minus the last N) to a new file. + +Most often, the real question is how you can delete the last N lines +without making more than one pass over the file, or how to do it +without a lot of copying. The easy concept is the hard reality when +you might have millions of lines in your file. + +One trick is to use L<File::ReadBackwards>, which starts at the end of +the file. That module provides an object that wraps the real filehandle +to make it easy for you to move around the file. Once you get to the +spot you need, you can get the actual filehandle and work with it as +normal.
In this case, you get the file position at the end of the last +line you want to keep and truncate the file to that point: + + use File::ReadBackwards; + + my $filename = 'test.txt'; + my $Lines_to_truncate = 2; + + my $bw = File::ReadBackwards->new( $filename ) + or die "Could not read backwards in [$filename]: $!"; + + my $lines_from_end = 0; + until( $bw->eof or $lines_from_end == $Lines_to_truncate ) { + print "Got: ", $bw->readline; + $lines_from_end++; + } + + truncate( $filename, $bw->tell ); + +The L<File::ReadBackwards> module also has the advantage of setting +the input record separator to a regular expression. + +You can also use the L<Tie::File> module which lets you access +the lines through a tied array. You can use normal array operations +to modify your file, including setting the last index and using +C<splice>. + +=head2 How can I use Perl's C<-i> option from within a program? +X<-i> X<in-place> + +C<-i> sets the value of Perl's C<$^I> variable, which in turn affects +the behavior of C<< <> >>; see L<perlrun> for more details. By +modifying the appropriate variables directly, you can get the same +behavior within a larger program. For example: + + # ... + { + local($^I, @ARGV) = ('.orig', glob("*.c")); + while (<>) { + if ($. == 1) { + print "This line should appear at the top of each file\n"; + } + s/\b(p)earl\b/${1}erl/i; # Correct typos, preserving case + print; + close ARGV if eof; # Reset $. + } + } + # $^I and @ARGV return to their old values here + +This block modifies all the C<.c> files in the current directory, +leaving a backup of the original data from each file in a new +C<.c.orig> file. + +=head2 How can I copy a file? +X<copy> X<file, copy> X<File::Copy> + +(contributed by brian d foy) + +Use the L<File::Copy> module. It comes with Perl and can do a +true copy across file systems, and it does its magic in +a portable fashion. 
+ + use File::Copy; + + copy( $original, $new_copy ) or die "Copy failed: $!"; + +If you can't use L<File::Copy>, you'll have to do the work yourself: +open the original file, open the destination file, then print +to the destination file as you read the original. You also have to +remember to copy the permissions, owner, and group to the new file. + +=head2 How do I make a temporary file name? +X<file, temporary> + +If you don't need to know the name of the file, you can use C<open()> +with C<undef> in place of the file name. In Perl 5.8 or later, the +C<open()> function creates an anonymous temporary file: + + open my $tmp, '+>', undef or die $!; + +Otherwise, you can use the File::Temp module. + + use File::Temp qw/ tempfile tempdir /; + + my $dir = tempdir( CLEANUP => 1 ); + ($fh, $filename) = tempfile( DIR => $dir ); + + # or if you don't need to know the filename + + my $fh = tempfile( DIR => $dir ); + +The File::Temp has been a standard module since Perl 5.6.1. If you +don't have a modern enough Perl installed, use the C<new_tmpfile> +class method from the IO::File module to get a filehandle opened for +reading and writing. Use it if you don't need to know the file's name: + + use IO::File; + my $fh = IO::File->new_tmpfile() + or die "Unable to make new temporary file: $!"; + +If you're committed to creating a temporary file by hand, use the +process ID and/or the current time-value. If you need to have many +temporary files in one process, use a counter: + + BEGIN { + use Fcntl; + my $temp_dir = -d '/tmp' ? '/tmp' : $ENV{TMPDIR} || $ENV{TEMP}; + my $base_name = sprintf "%s/%d-%d-0000", $temp_dir, $$, time; + + sub temp_file { + my $fh; + my $count = 0; + until( defined(fileno($fh)) || $count++ > 100 ) { + $base_name =~ s/-(\d+)$/"-" . (1 + $1)/e; + # O_EXCL is required for security reasons. 
+ sysopen $fh, $base_name, O_WRONLY|O_EXCL|O_CREAT; + } + + if( defined fileno($fh) ) { + return ($fh, $base_name); + } + else { + return (); + } + } + } + +=head2 How can I manipulate fixed-record-length files? +X<fixed-length> X<file, fixed-length records> + +The most efficient way is using L<pack()|perlfunc/"pack"> and +L<unpack()|perlfunc/"unpack">. This is faster than using +L<substr()|perlfunc/"substr"> when taking many, many strings. It is +slower for just a few. + +Here is a sample chunk of code to break up and put back together again +some fixed-format input lines, in this case from the output of a normal, +Berkeley-style ps: + + # sample input line: + # 15158 p5 T 0:00 perl /home/tchrist/scripts/now-what + my $PS_T = 'A6 A4 A7 A5 A*'; + open my $ps, '-|', 'ps'; + print scalar <$ps>; + my @fields = qw( pid tt stat time command ); + while (<$ps>) { + my %process; + @process{@fields} = unpack($PS_T, $_); + for my $field ( @fields ) { + print "$field: <$process{$field}>\n"; + } + print 'line=', pack($PS_T, @process{@fields} ), "\n"; + } + +We've used a hash slice in order to easily handle the fields of each row. +Storing the keys in an array makes it easy to operate on them as a +group or loop over them with C<for>. It also avoids polluting the program +with global variables and using symbolic references. + +=head2 How can I make a filehandle local to a subroutine? How do I pass filehandles between subroutines? How do I make an array of filehandles? +X<filehandle, local> X<filehandle, passing> X<filehandle, reference> + +As of perl5.6, open() autovivifies file and directory handles +as references if you pass it an uninitialized scalar variable. +You can then pass these references just like any other scalar, +and use them in the place of named handles. + + open my $fh, $file_name; + + open local $fh, $file_name; + + print $fh "Hello World!\n"; + + process_file( $fh ); + +If you like, you can store these filehandles in an array or a hash. 
+If you access them directly, they aren't simple scalars and you +need to give C<print> a little help by placing the filehandle +reference in braces. Perl can only figure it out on its own when +the filehandle reference is a simple scalar. + + my @fhs = ( $fh1, $fh2, $fh3 ); + + for( $i = 0; $i <= $#fhs; $i++ ) { + print {$fhs[$i]} "just another Perl answer, \n"; + } + +Before perl5.6, you had to deal with various typeglob idioms +which you may see in older code. + + open FILE, "> $filename"; + process_typeglob( *FILE ); + process_reference( \*FILE ); + + sub process_typeglob { local *FH = shift; print FH "Typeglob!" } + sub process_reference { local $fh = shift; print $fh "Reference!" } + +If you want to create many anonymous handles, you should +check out the Symbol or IO::Handle modules. + +=head2 How can I use a filehandle indirectly? +X<filehandle, indirect> + +An indirect filehandle is the use of something other than a symbol +in a place that a filehandle is expected. Here are ways +to get indirect filehandles: + + $fh = SOME_FH; # bareword is strict-subs hostile + $fh = "SOME_FH"; # strict-refs hostile; same package only + $fh = *SOME_FH; # typeglob + $fh = \*SOME_FH; # ref to typeglob (bless-able) + $fh = *SOME_FH{IO}; # blessed IO::Handle from *SOME_FH typeglob + +Or, you can use the C<new> method from one of the IO::* modules to +create an anonymous filehandle and store that in a scalar variable. + + use IO::Handle; # 5.004 or higher + my $fh = IO::Handle->new(); + +Then use any of those as you would a normal filehandle. Anywhere that +Perl is expecting a filehandle, an indirect filehandle may be used +instead. An indirect filehandle is just a scalar variable that contains +a filehandle. 
Functions like C<print>, C<open>, C<seek>, or +the C<< <FH> >> diamond operator will accept either a named filehandle +or a scalar variable containing one: + + ($ifh, $ofh, $efh) = (*STDIN, *STDOUT, *STDERR); + print $ofh "Type it: "; + my $got = <$ifh>; + print $efh "What was that: $got"; + +If you're passing a filehandle to a function, you can write +the function in two ways: + + sub accept_fh { + my $fh = shift; + print $fh "Sending to indirect filehandle\n"; + } + +Or it can localize a typeglob and use the filehandle directly: + + sub accept_fh { + local *FH = shift; + print FH "Sending to localized filehandle\n"; + } + +Both styles work with either objects or typeglobs of real filehandles. +(They might also work with strings under some circumstances, but this +is risky.) + + accept_fh(*STDOUT); + accept_fh($handle); + +In the examples above, we assigned the filehandle to a scalar variable +before using it. That is because only simple scalar variables, not +expressions or subscripts of hashes or arrays, can be used with +built-ins like C<print>, C<printf>, or the diamond operator. Using +something other than a simple scalar variable as a filehandle is +illegal and won't even compile: + + my @fd = (*STDIN, *STDOUT, *STDERR); + print $fd[1] "Type it: "; # WRONG + my $got = <$fd[0]>; # WRONG + print $fd[2] "What was that: $got"; # WRONG + +With C<print> and C<printf>, you get around this by using a block and +an expression where you would place the filehandle: + + print { $fd[1] } "funny stuff\n"; + printf { $fd[1] } "Pity the poor %x.\n", 3_735_928_559; + # Pity the poor deadbeef. + +That block is a proper block like any other, so you can put more +complicated code there. This sends the message out to one of two places: + + my $ok = -x "/bin/cat"; + print { $ok ? $fd[1] : $fd[2] } "cat stat $ok\n"; + print { $fd[ 1+ ($ok || 0) ] } "cat stat $ok\n"; + +This approach of treating C<print> and C<printf> like object method +calls doesn't work for the diamond operator.
That's because it's a +real operator, not just a function with a comma-less argument. Assuming +you've been storing typeglobs in your structure as we did above, you +can use the built-in function named C<readline> to read a record just +as C<< <> >> does. Given the initialization shown above for @fd, this +would work, but only because readline() requires a typeglob. It doesn't +work with objects or strings, which might be a bug we haven't fixed yet. + + $got = readline($fd[0]); + +Let it be noted that the flakiness of indirect filehandles is not +related to whether they're strings, typeglobs, objects, or anything else. +It's the syntax of the fundamental operators. Playing the object +game doesn't help you at all here. + +=head2 How can I set up a footer format to be used with write()? +X<footer> + +There's no builtin way to do this, but L<perlform> has a couple of +techniques to make it possible for the intrepid hacker. + +=head2 How can I write() into a string? +X<write, into a string> + +(contributed by brian d foy) + +If you want to C<write> into a string, you just have to C<open> a +filehandle to a string, which Perl has been able to do since Perl 5.8: + + open FH, '>', \my $string; + write( FH ); + +Since you want to be a good programmer, you probably want to use a lexical +filehandle, even though formats are designed to work with bareword filehandles +since the default format names take the filehandle name. However, you can +control this with some Perl special per-filehandle variables: C<$^>, which +names the top-of-page format, and C<$~>, which names the line format. You have +to change the default filehandle to set these variables: + + open my($fh), '>', \my $string; + + { # set per-filehandle variables + my $old_fh = select( $fh ); + $~ = 'ANIMAL'; + $^ = 'ANIMAL_TOP'; + select( $old_fh ); + } + + format ANIMAL_TOP = + ID Type Name + . + + format ANIMAL = + @## @<<< @<<<<<<<<<<<<<< + $id, $type, $name + . 
+ +Although write can work with lexical or package variables, whatever variables +you use have to scope in the format. That most likely means you'll want to +localize some package variables: + + { + local( $id, $type, $name ) = qw( 12 cat Buster ); + write( $fh ); + } + + print $string; + +There are also some tricks that you can play with C<formline> and the +accumulator variable C<$^A>, but you lose a lot of the value of formats +since C<formline> won't handle paging and so on. You end up reimplementing +formats when you use them. + +=head2 How can I open a filehandle to a string? +X<string> X<open> X<IO::String> X<filehandle> + +(contributed by Peter J. Holzer, hjp-usenet2@hjp.at) + +Since Perl 5.8.0 a file handle referring to a string can be created by +calling open with a reference to that string instead of the filename. +This file handle can then be used to read from or write to the string: + + open(my $fh, '>', \$string) or die "Could not open string for writing"; + print $fh "foo\n"; + print $fh "bar\n"; # $string now contains "foo\nbar\n" + + open(my $fh, '<', \$string) or die "Could not open string for reading"; + my $x = <$fh>; # $x now contains "foo\n" + +With older versions of Perl, the L<IO::String> module provides similar +functionality. + +=head2 How can I output my numbers with commas added? +X<number, commify> + +(contributed by brian d foy and Benjamin Goldberg) + +You can use L<Number::Format> to separate places in a number. +It handles locale information for those of you who want to insert +full stops instead (or anything else that they want to use, +really). + +This subroutine will add commas to your number: + + sub commify { + local $_ = shift; + 1 while s/^([-+]?\d+)(\d{3})/$1,$2/; + return $_; + } + +This regex from Benjamin Goldberg will add commas to numbers: + + s/(^[-+]?\d+?(?=(?>(?:\d{3})+)(?!\d))|\G\d{3}(?=\d))/$1,/g; + +It is easier to see with comments: + + s/( + ^[-+]? # beginning of number. + \d+? 
# first digits before first comma + (?= # followed by, (but not included in the match) : + (?>(?:\d{3})+) # some positive multiple of three digits. + (?!\d) # an *exact* multiple, not x * 3 + 1 or whatever. + ) + | # or: + \G\d{3} # after the last group, get three digits + (?=\d) # but they have to have more digits after them. + )/$1,/xg; + +=head2 How can I translate tildes (~) in a filename? +X<tilde> X<tilde expansion> + +Use the E<lt>E<gt> (C<glob()>) operator, documented in L<perlfunc>. +Versions of Perl older than 5.6 require that you have a shell +installed that groks tildes. Later versions of Perl have this feature +built in. The L<File::KGlob> module (available from CPAN) gives more +portable glob functionality. + +Within Perl, you may use this directly: + + $filename =~ s{ + ^ ~ # find a leading tilde + ( # save this in $1 + [^/] # a non-slash character + * # repeated 0 or more times (0 means me) + ) + }{ + $1 + ? (getpwnam($1))[7] + : ( $ENV{HOME} || $ENV{LOGDIR} ) + }ex; + +=head2 How come when I open a file read-write it wipes it out? +X<clobber> X<read-write> X<clobbering> X<truncate> X<truncating> + +Because you're using something like this, which truncates the file +I<then> gives you read-write access: + + open my $fh, '+>', '/path/name'; # WRONG (almost always) + +Whoops. You should instead use this, which will fail if the file +doesn't exist: + + open my $fh, '+<', '/path/name'; # open for update + +Using ">" always clobbers or creates. Using "<" never does +either. The "+" doesn't change this. + +Here are examples of many kinds of file opens. 
Those using C<sysopen> +all assume that you've pulled in the constants from L<Fcntl>: + + use Fcntl; + +To open file for reading: + + open my $fh, '<', $path or die $!; + sysopen my $fh, $path, O_RDONLY or die $!; + +To open file for writing, create new file if needed or else truncate old file: + + open my $fh, '>', $path or die $!; + sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT or die $!; + sysopen my $fh, $path, O_WRONLY|O_TRUNC|O_CREAT, 0666 or die $!; + +To open file for writing, create new file, file must not exist: + + sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT or die $!; + sysopen my $fh, $path, O_WRONLY|O_EXCL|O_CREAT, 0666 or die $!; + +To open file for appending, create if necessary: + + open my $fh, '>>', $path or die $!; + sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT or die $!; + sysopen my $fh, $path, O_WRONLY|O_APPEND|O_CREAT, 0666 or die $!; + +To open file for appending, file must exist: + + sysopen my $fh, $path, O_WRONLY|O_APPEND or die $!; + +To open file for update, file must exist: + + open my $fh, '+<', $path or die $!; + sysopen my $fh, $path, O_RDWR or die $!; + +To open file for update, create file if necessary: + + sysopen my $fh, $path, O_RDWR|O_CREAT or die $!; + sysopen my $fh, $path, O_RDWR|O_CREAT, 0666 or die $!; + +To open file for update, file must not exist: + + sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT or die $!; + sysopen my $fh, $path, O_RDWR|O_EXCL|O_CREAT, 0666 or die $!; + +To open a file without blocking, creating if necessary: + + sysopen my $fh, '/foo/somefile', O_WRONLY|O_NDELAY|O_CREAT + or die "can't open /foo/somefile: $!"; + +Be warned that neither creation nor deletion of files is guaranteed to +be an atomic operation over NFS. That is, two processes might both +successfully create or unlink the same file! Therefore O_EXCL +isn't as exclusive as you might wish. + +See also L<perlopentut>. + +=head2 Why do I sometimes get an "Argument list too long" when I use E<lt>*E<gt>? 
+X<argument list too long> + +The C<< <> >> operator performs a globbing operation (see above). +In Perl versions earlier than v5.6.0, the internal glob() operator forks +csh(1) to do the actual glob expansion, but +csh can't handle more than 127 items and so gives the error message +C<Argument list too long>. People who installed tcsh as csh won't +have this problem, but their users may be surprised by it. + +To get around this, either upgrade to Perl v5.6.0 or later, do the glob +yourself with readdir() and patterns, or use a module like L<File::Glob>, +one that doesn't use the shell to do globbing. + +=head2 How can I open a file with a leading ">" or trailing blanks? +X<filename, special characters> + +(contributed by Brian McCauley) + +The special two-argument form of Perl's open() function ignores +trailing blanks in filenames and infers the mode from certain leading +characters (or a trailing "|"). In older versions of Perl this was the +only version of open() and so it is prevalent in old code and books. + +Unless you have a particular reason to use the two-argument form you +should use the three-argument form of open() which does not treat any +characters in the filename as special. + + open my $fh, "<", " file "; # filename is " file " + open my $fh, ">", ">file"; # filename is ">file" + +=head2 How can I reliably rename a file? +X<rename> X<mv> X<move> X<file, rename> + +If your operating system supports a proper mv(1) utility or its +functional equivalent, this works: + + rename($old, $new) or system("mv", $old, $new); + +It may be more portable to use the L<File::Copy> module instead. +You just copy the file to the new name (checking return +values), then delete the old one. This isn't really the same +semantically as a C<rename()>, which preserves meta-information like +permissions, timestamps, inode info, etc. + +=head2 How can I lock a file? 
+X<lock> X<file, lock> X<flock> + +Perl's builtin flock() function (see L<perlfunc> for details) will call +flock(2) if that exists, fcntl(2) if it doesn't (on perl version 5.004 and +later), and lockf(3) if neither of the two previous system calls exists. +On some systems, it may even use a different form of native locking. +Here are some gotchas with Perl's flock(): + +=over 4 + +=item 1 + +Produces a fatal error if none of the three system calls (or their +close equivalent) exists. + +=item 2 + +lockf(3) does not provide shared locking, and requires that the +filehandle be open for writing (or appending, or read/writing). + +=item 3 + +Some versions of flock() can't lock files over a network (e.g. on NFS file +systems), so you'd need to force the use of fcntl(2) when you build Perl. +But even this is dubious at best. See the flock entry of L<perlfunc> +and the F<INSTALL> file in the source distribution for information on +building Perl to do this. + +Two potentially non-obvious but traditional flock semantics are that +it waits indefinitely until the lock is granted, and that its locks are +I<merely advisory>. Such discretionary locks are more flexible, but +offer fewer guarantees. This means that files locked with flock() may +be modified by programs that do not also use flock(). Cars that stop +for red lights get on well with each other, but not with cars that don't +stop for red lights. See the perlport manpage, your port's specific +documentation, or your system-specific local manpages for details. It's +best to assume traditional behavior if you're writing portable programs. +(If you're not, you should as always feel perfectly free to write +for your own system's idiosyncrasies (sometimes called "features"). +Slavish adherence to portability concerns shouldn't get in the way of +your getting your job done.) + +For more information on file locking, see also +L<perlopentut/"File Locking"> if you have it (new for 5.6). 
+ +=back + +=head2 Why can't I just open(FH, "E<gt>file.lock")? +X<lock, lockfile race condition> + +A common bit of code B<NOT TO USE> is this: + + sleep(3) while -e 'file.lock'; # PLEASE DO NOT USE + open my $lock, '>', 'file.lock'; # THIS BROKEN CODE + +This is a classic race condition: you take two steps to do something +which must be done in one. That's why computer hardware provides an +atomic test-and-set instruction. In theory, this "ought" to work: + + sysopen my $fh, "file.lock", O_WRONLY|O_EXCL|O_CREAT + or die "can't open file.lock: $!"; + +except that lamentably, file creation (and deletion) is not atomic +over NFS, so this won't work (at least, not every time) over the net. +Various schemes involving link() have been suggested, but +these tend to involve busy-wait, which is also less than desirable. + +=head2 I still don't get locking. I just want to increment the number in the file. How can I do this? +X<counter> X<file, counter> + +Didn't anyone ever tell you web-page hit counters were useless? +They don't count number of hits, they're a waste of time, and they serve +only to stroke the writer's vanity. It's better to pick a random number; +they're more realistic. + +Anyway, this is what you can do if you can't help yourself. + + use Fcntl qw(:DEFAULT :flock); + sysopen my $fh, "numfile", O_RDWR|O_CREAT or die "can't open numfile: $!"; + flock $fh, LOCK_EX or die "can't flock numfile: $!"; + my $num = <$fh> || 0; + seek $fh, 0, 0 or die "can't rewind numfile: $!"; + truncate $fh, 0 or die "can't truncate numfile: $!"; + (print $fh $num+1, "\n") or die "can't write numfile: $!"; + close $fh or die "can't close numfile: $!"; + +Here's a much better web-page hit counter: + + $hits = int( (time() - 850_000_000) / rand(1_000) ); + +If the count doesn't impress your friends, then the code might. :-) + +=head2 All I want to do is append a small amount of text to the end of a file. Do I still have to use locking? 
+X<append> X<file, append> + +If you are on a system that correctly implements C<flock> and you use +the example appending code from "perldoc -f flock" everything will be +OK even if the OS you are on doesn't implement append mode correctly +(if such a system exists). So if you are happy to restrict yourself to +OSs that implement C<flock> (and that's not really much of a +restriction) then that is what you should do. + +If you know you are only going to use a system that does correctly +implement appending (i.e. not Win32) then you can omit the C<seek> +from the code in the previous answer. + +If you know you are only writing code to run on an OS and filesystem +that does implement append mode correctly (a local filesystem on a +modern Unix for example), and you keep the file in block-buffered mode +and you write less than one buffer-full of output between each manual +flushing of the buffer then each bufferload is almost guaranteed to be +written to the end of the file in one chunk without getting +intermingled with anyone else's output. You can also use the +C<syswrite> function which is simply a wrapper around your system's +C<write(2)> system call. + +There is still a small theoretical chance that a signal will interrupt +the system-level C<write()> operation before completion. There is also +a possibility that some STDIO implementations may call multiple system +level C<write()>s even if the buffer was empty to start. There may be +some systems where this probability is reduced to zero, and this is +not a concern when using C<:perlio> instead of your system's STDIO. + +=head2 How do I randomly update a binary file? 
+X<file, binary patch> + +If you're just trying to patch a binary, in many cases something as +simple as this works: + + perl -i -pe 's{window manager}{window mangler}g' /usr/bin/emacs + +However, if you have fixed-size records, then you might do something more +like this: + + my $RECSIZE = 220; # size of record, in bytes + my $recno = 37; # which record to update + open my $fh, '+<', 'somewhere' or die "can't update somewhere: $!"; + seek $fh, $recno * $RECSIZE, 0; + # parenthesize the call so that == compares read's return value + read($fh, my $record, $RECSIZE) == $RECSIZE or die "can't read record $recno: $!"; + # munge the record + seek $fh, -$RECSIZE, 1; + print $fh $record; + close $fh; + +Locking and error checking are left as an exercise for the reader. +Don't forget them or you'll be quite sorry. + +=head2 How do I get a file's timestamp in perl? +X<timestamp> X<file, timestamp> + +If you want to retrieve the time at which the file was last read, +written, or had its meta-data (owner, etc) changed, you use the B<-A>, +B<-M>, or B<-C> file test operations as documented in L<perlfunc>. +These retrieve the age of the file (measured against the start-time of +your program) in days as a floating point number. Some platforms may +not have all of these times. See L<perlport> for details. To retrieve +the "raw" time in seconds since the epoch, you would call the stat +function, then use C<localtime()>, C<gmtime()>, or +C<POSIX::strftime()> to convert this into human-readable form. + +Here's an example: + + my $write_secs = (stat($file))[9]; + printf "file %s updated at %s\n", $file, + scalar localtime($write_secs); + +If you prefer something more legible, use the File::stat module +(part of the standard distribution in version 5.004 and later): + + # error checking left as an exercise for reader. 
+ use File::stat; + use Time::localtime; + my $date_string = ctime(stat($file)->mtime); + print "file $file updated at $date_string\n"; + +The POSIX::strftime() approach has the benefit of being, +in theory, independent of the current locale. See L<perllocale> +for details. + +=head2 How do I set a file's timestamp in perl? +X<timestamp> X<file, timestamp> + +You use the utime() function documented in L<perlfunc/utime>. +By way of example, here's a little program that copies the +read and write times from its first argument to all the rest +of them. + + if (@ARGV < 2) { + die "usage: cptimes timestamp_file other_files ...\n"; + } + my $timestamp = shift; + my($atime, $mtime) = (stat($timestamp))[8,9]; + utime $atime, $mtime, @ARGV; + +Error checking is, as usual, left as an exercise for the reader. + +The perldoc for utime also has an example that has the same +effect as touch(1) on files that I<already exist>. + +Certain file systems have a limited ability to store the times +on a file at the expected level of precision. For example, the +FAT and HPFS filesystems are unable to store file times with +a finer granularity than two seconds. This is a limitation of +the filesystems, not of utime(). + +=head2 How do I print to more than one file at once? +X<print, to multiple files> + +To connect one filehandle to several output filehandles, +you can use the L<IO::Tee> or L<Tie::FileHandle::Multiplex> modules. + +If you only have to do this once, you can print individually +to each filehandle. + + for my $fh ($fh1, $fh2, $fh3) { print $fh "whatever\n" } + +=head2 How can I read in an entire file all at once? 
+X<slurp> X<file, slurping> + +The customary Perl approach for processing all the lines in a file is to +do so one line at a time: + + open my $input, '<', $file or die "can't open $file: $!"; + while (<$input>) { + chomp; + # do something with $_ + } + close $input or die "can't close $file: $!"; + +This is tremendously more efficient than reading the entire file into +memory as an array of lines and then processing it one element at a time, +which is often--if not almost always--the wrong approach. Whenever +you see someone do this: + + my @lines = <INPUT>; + +you should think long and hard about why you need everything loaded at +once. It's just not a scalable solution. + +If you "mmap" the file with the File::Map module from +CPAN, you can virtually load the entire file into a +string without actually storing it in memory: + + use File::Map qw(map_file); + + map_file my $string, $filename; + +Once mapped, you can treat C<$string> as you would any other string. +Since you don't necessarily have to load the data, mmap-ing can be +very fast and may not increase your memory footprint. + +You might also find it more +fun to use the standard L<Tie::File> module, or the L<DB_File> module's +C<$DB_RECNO> bindings, which allow you to tie an array to a file so that +accessing an element of the array actually accesses the corresponding +line in the file. + +If you want to load the entire file, you can use the L<File::Slurp> +module to do it in one simple and efficient step: + + use File::Slurp; + + my $all_of_it = read_file($filename); # entire file in scalar + my @all_lines = read_file($filename); # one line per element + +Or you can read the entire file contents into a scalar like this: + + my $var; + { + local $/; + open my $fh, '<', $file or die "can't open $file: $!"; + $var = <$fh>; + } + +That temporarily undefs your record separator, and will automatically +close the file at block exit. 
If the file is already open, just use this: + + my $var = do { local $/; <$fh> }; + +You can also use a localized C<@ARGV> to eliminate the C<open>: + + my $var = do { local( @ARGV, $/ ) = $file; <> }; + +For ordinary files you can also use the C<read> function. + + read( $fh, $var, -s $fh ); + +That third argument tests the byte size of the data on the C<$fh> filehandle +and reads that many bytes into the buffer C<$var>. + +=head2 How can I read in a file by paragraphs? +X<file, reading by paragraphs> + +Use the C<$/> variable (see L<perlvar> for details). You can either +set it to C<""> to eliminate empty paragraphs (C<"abc\n\n\n\ndef">, +for instance, gets treated as two paragraphs and not three), or +C<"\n\n"> to accept empty paragraphs. + +Note that a blank line must have no blanks in it. Thus +S<C<"fred\n \nstuff\n\n">> is one paragraph, but C<"fred\n\nstuff\n\n"> is two. + +=head2 How can I read a single character from a file? From the keyboard? +X<getc> X<file, reading one character at a time> + +You can use the builtin C<getc()> function for most filehandles, but +it won't (easily) work on a terminal device. For STDIN, either use +the Term::ReadKey module from CPAN or use the sample code in +L<perlfunc/getc>. + +If your system supports the portable operating system programming +interface (POSIX), you can use the following code, which you'll note +turns off echo processing as well. 
+ + #!/usr/bin/perl -w + use strict; + $| = 1; + for (1..4) { + print "gimme: "; + my $got = getone(); + print "--> $got\n"; + } + exit; + + BEGIN { + use POSIX qw(:termios_h); + + my ($term, $oterm, $echo, $noecho, $fd_stdin); + + $fd_stdin = fileno(STDIN); + + $term = POSIX::Termios->new(); + $term->getattr($fd_stdin); + $oterm = $term->getlflag(); + + $echo = ECHO | ECHOK | ICANON; + $noecho = $oterm & ~$echo; + + sub cbreak { + $term->setlflag($noecho); + $term->setcc(VTIME, 1); + $term->setattr($fd_stdin, TCSANOW); + } + + sub cooked { + $term->setlflag($oterm); + $term->setcc(VTIME, 0); + $term->setattr($fd_stdin, TCSANOW); + } + + sub getone { + my $key = ''; + cbreak(); + sysread(STDIN, $key, 1); + cooked(); + return $key; + } + } + + END { cooked() } + +The Term::ReadKey module from CPAN may be easier to use. Recent versions +also include support for non-portable systems. + + use Term::ReadKey; + open my $tty, '<', '/dev/tty'; + print "Gimme a char: "; + ReadMode "raw"; + my $key = ReadKey 0, $tty; + ReadMode "normal"; + printf "\nYou said %s, char number %03d\n", + $key, ord $key; + +=head2 How can I tell whether there's a character waiting on a filehandle? + +The very first thing you should do is look into getting the Term::ReadKey +extension from CPAN. As we mentioned earlier, it now even has limited +support for non-portable (read: not open systems, closed, proprietary, +not POSIX, not Unix, etc.) systems. + +You should also check out the Frequently Asked Questions list in +comp.unix.* for things like this: the answer is essentially the same. +It's very system-dependent. Here's one solution that works on BSD +systems: + + sub key_ready { + my($rin, $nfd); + vec($rin, fileno(STDIN), 1) = 1; + return $nfd = select($rin,undef,undef,0); + } + +If you want to find out how many characters are waiting, there's +also the FIONREAD ioctl call to be looked at. 
The I<h2ph> tool that +comes with Perl tries to convert C include files to Perl code, which +can be C<require>d. FIONREAD ends up defined as a function in the +I<sys/ioctl.ph> file: + + require 'sys/ioctl.ph'; + + $size = pack("L", 0); + ioctl(FH, FIONREAD(), $size) or die "Couldn't call ioctl: $!\n"; + $size = unpack("L", $size); + +If I<h2ph> wasn't installed or doesn't work for you, you can +I<grep> the include files by hand: + + % grep FIONREAD /usr/include/*/* + /usr/include/asm/ioctls.h:#define FIONREAD 0x541B + +Or write a small C program using the editor of champions: + + % cat > fionread.c + #include <stdio.h> + #include <sys/ioctl.h> + int main(void) { + printf("%#08x\n", (unsigned int) FIONREAD); + return 0; + } + ^D + % cc -o fionread fionread.c + % ./fionread + 0x4004667f + +And then hard-code it, leaving porting as an exercise to your successor. + + $FIONREAD = 0x4004667f; # XXX: opsys dependent + + $size = pack("L", 0); + ioctl(FH, $FIONREAD, $size) or die "Couldn't call ioctl: $!\n"; + $size = unpack("L", $size); + +FIONREAD requires a filehandle connected to a stream, meaning that sockets, +pipes, and tty devices work, but I<not> files. + +=head2 How do I do a C<tail -f> in perl? +X<tail> X<IO::Handle> X<File::Tail> X<clearerr> + +First try + + seek($gw_fh, 0, 1); + +The statement C<seek($gw_fh, 0, 1)> doesn't change the current position, +but it does clear the end-of-file condition on the handle, so that the +next C<< <$gw_fh> >> makes Perl try again to read something. + +If that doesn't work (it relies on features of your stdio implementation), +then you need something more like this: + + for (;;) { + for ($curpos = tell($gw_fh); <$gw_fh>; $curpos = tell($gw_fh)) { + # search for some stuff and put it into files + } + # sleep for a while + seek($gw_fh, $curpos, 0); # seek to where we had been + } + +If this still doesn't work, look into the C<clearerr> method +from L<IO::Handle>, which resets the error and end-of-file states +on the handle. + +There's also a L<File::Tail> module from CPAN. 
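As a rough illustration of the C<seek($gw_fh, 0, 1)> idea, here is a minimal, self-contained sketch. The helper name C<read_new_lines> and the temporary file are assumptions made up for this demonstration; they are not part of the FAQ. It also assumes that the writer appends whole lines and that your I/O layer honors the end-of-file reset.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Hypothetical helper: return the lines currently available on $fh,
# then clear the end-of-file condition so a later call can see
# whatever has been appended in the meantime.
sub read_new_lines {
    my ($fh) = @_;
    my @lines;
    while ( defined( my $line = <$fh> ) ) {
        push @lines, $line;
    }
    seek( $fh, 0, 1 );    # no movement, but resets the EOF state
    return @lines;
}

# Demonstration with a scratch file standing in for a growing log.
my ( $out, $path ) = tempfile();
$out->autoflush(1);
print $out "first\n";

open my $in, '<', $path or die "can't open $path: $!";
my @before = read_new_lines($in);    # what is there right now

print $out "second\n";               # the "log" grows
my @after = read_new_lines($in);     # only the appended line

print "before: @before";
print "after:  @after";

close $in;
close $out;
unlink $path;
```

A real follower would call the helper in a loop with a C<sleep> between polls, much like the C<for (;;)> loop shown earlier.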
+ +=head2 How do I dup() a filehandle in Perl? +X<dup> + +If you check L<perlfunc/open>, you'll see that several of the ways +to call open() should do the trick. For example: + + open my $log, '>>', '/foo/logfile'; + open STDERR, '>&', $log; + +Or even with a literal numeric descriptor: + + my $fd = $ENV{MHCONTEXTFD}; + open my $mhcontext, "<&=$fd"; # like fdopen(3S) + +Note that "<&STDIN" makes a copy, but "<&=STDIN" makes +an alias. That means if you close an aliased handle, all +aliases become inaccessible. This is not true with +a copied one. + +Error checking, as always, has been left as an exercise for the reader. + +=head2 How do I close a file descriptor by number? +X<file, closing file descriptors> X<POSIX> X<close> + +If, for some reason, you have a file descriptor instead of a +filehandle (perhaps you used C<POSIX::open>), you can use the +C<close()> function from the L<POSIX> module: + + use POSIX (); + + POSIX::close( $fd ); + +This should rarely be necessary, as the Perl C<close()> function is to be +used for things that Perl opened itself, even if it was a dup of a +numeric descriptor as with C<MHCONTEXT> above. But if you really have +to, you may be able to do this: + + require 'sys/syscall.ph'; + my $rc = syscall(&SYS_close, $fd + 0); # must force numeric + die "can't sysclose $fd: $!" if $rc == -1; # syscall returns -1 on failure + +Or, just use the fdopen(3S) feature of C<open()>: + + { + open my $fh, "<&=$fd" or die "Cannot reopen fd=$fd: $!"; + close $fh; + } + +=head2 Why can't I use "C:\temp\foo" in DOS paths? Why doesn't `C:\temp\foo.exe` work? +X<filename, DOS issues> + +Whoops! You just put a tab and a formfeed into that filename! +Remember that within double quoted strings ("like\this"), the +backslash is an escape character. The full list of these is in +L<perlop/Quote and Quote-like Operators>. Unsurprisingly, you don't +have a file called "c:(tab)emp(formfeed)oo" or +"c:(tab)emp(formfeed)oo.exe" on your legacy DOS filesystem. 
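To see the escape processing concretely, here is a small sketch that compares three spellings of the same path. The variable names are made up for the illustration, and nothing here touches the filesystem; it only inspects the strings Perl actually builds.

```perl
use strict;
use warnings;

# Double quotes: \t is turned into a tab and \f into a formfeed.
my $broken = "C:\temp\foo";

# Single quotes: the backslashes survive as literal characters.
my $single = 'C:\temp\foo';

# Forward slashes need no escaping in either quoting style.
my $forward = "C:/temp/foo";

printf "double-quoted: %2d chars, tab=%s, formfeed=%s\n",
    length $broken,
    ( $broken =~ /\t/ ? "yes" : "no" ),
    ( $broken =~ /\f/ ? "yes" : "no" );
printf "single-quoted: %2d chars\n", length $single;
printf "forward slash: %2d chars\n", length $forward;
```

The double-quoted string comes out two characters shorter than the other two, because each backslash pair collapsed into a single control character.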
+ +Either single-quote your strings, or (preferably) use forward slashes. +Since all DOS and Windows versions since something like MS-DOS 2.0 or so +have treated C</> and C<\> the same in a path, you might as well use the +one that doesn't clash with Perl--or the POSIX shell, ANSI C and C++, +awk, Tcl, Java, or Python, just to mention a few. POSIX paths +are more portable, too. + +=head2 Why doesn't glob("*.*") get all the files? +X<glob> + +Because even on non-Unix ports, Perl's glob function follows standard +Unix globbing semantics. You'll need C<glob("*")> to get all (non-hidden) +files. This makes glob() portable even to legacy systems. Your +port may include proprietary globbing functions as well. Check its +documentation for details. + +=head2 Why does Perl let me delete read-only files? Why does C<-i> clobber protected files? Isn't this a bug in Perl? + +This is elaborately and painstakingly described in the +F<file-dir-perms> article in the "Far More Than You Ever Wanted To +Know" collection in L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> . + +The executive summary: learn how your filesystem works. The +permissions on a file say what can happen to the data in that file. +The permissions on a directory say what can happen to the list of +files in that directory. If you delete a file, you're removing its +name from the directory (so the operation depends on the permissions +of the directory, not of the file). If you try to write to the file, +the permissions of the file govern whether you're allowed to. + +=head2 How do I select a random line from a file? +X<file, selecting a random line> + +Short of loading the file into a database or pre-indexing the lines in +the file, there are a couple of things that you can do. + +Here's a reservoir-sampling algorithm from the Camel Book: + + srand; + rand($.) < 1 && ($line = $_) while <>; + +This has a significant advantage in space over reading the whole file +in. 
You can find a proof of this method in I<The Art of Computer +Programming>, Volume 2, Section 3.4.2, by Donald E. Knuth. + +You can use the L<File::Random> module which provides a function +for that algorithm: + + use File::Random qw/random_line/; + my $line = random_line($filename); + +Another way is to use the L<Tie::File> module, which treats the entire +file as an array. Simply access a random array element. + +=head2 Why do I get weird spaces when I print an array of lines? + +(contributed by brian d foy) + +If you are seeing spaces between the elements of your array when +you print the array, you are probably interpolating the array in +double quotes: + + my @animals = qw(camel llama alpaca vicuna); + print "animals are: @animals\n"; + +It's the double quotes, not the C<print>, doing this. Whenever you +interpolate an array in a double quote context, Perl joins the +elements with spaces (or whatever is in C<$">, which is a space by +default): + + animals are: camel llama alpaca vicuna + +This is different than printing the array without the interpolation: + + my @animals = qw(camel llama alpaca vicuna); + print "animals are: ", @animals, "\n"; + +Now the output doesn't have the spaces between the elements because +the elements of C<@animals> simply become part of the list to +C<print>: + + animals are: camelllamaalpacavicuna + +You might notice this when each of the elements of C<@array> end with +a newline. You expect to print one element per line, but notice that +every line after the first is indented: + + this is a line + this is another line + this is the third line + +That extra space comes from the interpolation of the array. If you +don't want to put anything between your array elements, don't use the +array in double quotes. You can send it to print without them: + + print @lines; + +=head2 How do I traverse a directory tree? 
+ +(contributed by brian d foy) + +The L<File::Find> module, which comes with Perl, does all of the hard +work to traverse a directory structure. You simply +call the C<find> subroutine with a callback subroutine and the +directories you want to traverse: + + use File::Find; + + find( \&wanted, @directories ); + + sub wanted { + # full path in $File::Find::name + # just filename in $_ + # ... do whatever you want to do ... + } + +The L<File::Find::Closures> module, which you can download from CPAN, provides +many ready-to-use subroutines that you can use with L<File::Find>. + +The L<File::Finder> module, which you can download from CPAN, can help you +create the callback subroutine using something closer to the syntax of +the C<find> command-line utility: + + use File::Find; + use File::Finder; + + my $deep_dirs = File::Finder->depth->type('d')->ls->exec('rmdir','{}'); + + find( $deep_dirs->as_options, @places ); + +The L<File::Find::Rule> module, which you can download from CPAN, has +a similar interface, but does the traversal for you too: + + use File::Find::Rule; + + my @files = File::Find::Rule->file() + ->name( '*.pm' ) + ->in( @INC ); + +=head2 How do I delete a directory tree? + +(contributed by brian d foy) + +If you have an empty directory, you can use Perl's built-in C<rmdir>. +If the directory is not empty (that is, it still contains files or subdirectories), you +either have to empty it yourself (a lot of work) or use a module to +help you. + +The L<File::Path> module, which comes with Perl, has a C<remove_tree> +which can take care of all of the hard work for you: + + use File::Path qw(remove_tree); + + remove_tree( @directories ); + +The L<File::Path> module also has a legacy interface to the older +C<rmtree> subroutine. + +=head2 How do I copy an entire directory? + +(contributed by Shlomi Fish) + +To do the equivalent of C<cp -R> (i.e. 
copy an entire directory tree +recursively) in portable Perl, you'll either need to write something yourself +or find a good CPAN module such as L<File::Copy::Recursive>. + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples here are in the public +domain. You are permitted and encouraged to use this code and any +derivatives thereof in your own programs for fun or for profit as you +see fit. A simple comment in the code giving credit to the FAQ would +be courteous but is not required. diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq6.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq6.pod new file mode 100644 index 00000000000..40c2b07c3dc --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq6.pod @@ -0,0 +1,1124 @@ +=head1 NAME + +perlfaq6 - Regular Expressions + +=head1 DESCRIPTION + +This section is surprisingly small because the rest of the FAQ is +littered with answers involving regular expressions. For example, +decoding a URL and checking whether something is a number can be handled +with regular expressions, but those answers are found elsewhere in +this document (in L<perlfaq9>: "How do I decode or create those %-encodings +on the web" and L<perlfaq4>: "How do I determine whether a scalar is +a number/whole/integer/float", to be precise). + +=head2 How can I hope to use regular expressions without creating illegible and unmaintainable code? +X<regex, legibility> X<regexp, legibility> +X<regular expression, legibility> X</x> + +Three techniques can make regular expressions maintainable and +understandable. + +=over 4 + +=item Comments Outside the Regex + +Describe what you're doing and how you're doing it, using normal Perl +comments. 
+ + # turn the line into the first word, a colon, and the + # number of characters on the rest of the line + s/^(\w+)(.*)/ lc($1) . ":" . length($2) /meg; + +=item Comments Inside the Regex + +The C</x> modifier causes whitespace to be ignored in a regex pattern +(except in a character class and a few other places), and also allows you to +use normal comments there, too. As you can imagine, whitespace and comments +help a lot. + +C</x> lets you turn this: + + s{<(?:[^>'"]*|".*?"|'.*?')+>}{}gs; + +into this: + + s{ < # opening angle bracket + (?: # Non-backreffing grouping paren + [^>'"] * # 0 or more things that are neither > nor ' nor " + | # or else + ".*?" # a section between double quotes (stingy match) + | # or else + '.*?' # a section between single quotes (stingy match) + ) + # all occurring one or more times + > # closing angle bracket + }{}gsx; # replace with nothing, i.e. delete + +It's still not quite so clear as prose, but it is very useful for +describing the meaning of each part of the pattern. + +=item Different Delimiters + +While we normally think of patterns as being delimited with C</> +characters, they can be delimited by almost any character. L<perlre> +describes this. For example, the C<s///> above uses braces as +delimiters. Selecting another delimiter can avoid quoting the +delimiter within the pattern: + + s/\/usr\/local/\/usr\/share/g; # bad delimiter choice + s#/usr/local#/usr/share#g; # better + +Using logically paired delimiters can be even more readable: + + s{/usr/local/}{/usr/share}g; # better still + +=back + +=head2 I'm having trouble matching over more than one line. What's wrong? +X<regex, multiline> X<regexp, multiline> X<regular expression, multiline> + +Either you don't have more than one line in the string you're looking +at (probably), or else you aren't using the correct modifier(s) on +your pattern (possibly). + +There are many ways to get multiline data into a string. 
If you want +it to happen automatically while reading input, you'll want to set $/ +(probably to '' for paragraphs or C<undef> for the whole file) to +allow you to read more than one line at a time. + +Read L<perlre> to help you decide which of C</s> and C</m> (or both) +you might want to use: C</s> allows dot to include newline, and C</m> +allows caret and dollar to match next to a newline, not just at the +end of the string. You do need to make sure that you've actually +got a multiline string in there. + +For example, this program detects duplicate words, even when they span +line breaks (but not paragraph ones). For this example, we don't need +C</s> because we aren't using dot in a regular expression that we want +to cross line boundaries. Neither do we need C</m> because we don't +want caret or dollar to match at any point inside the record next +to newlines. But it's imperative that $/ be set to something other +than the default, or else we won't actually ever have a multiline +record read in. + + $/ = ''; # read in whole paragraph, not just one line + while ( <> ) { + while ( /\b([\w'-]+)(\s+\g1)+\b/gi ) { # word starts alpha + print "Duplicate $1 at paragraph $.\n"; + } + } + +Here's some code that finds sentences that begin with "From " (which would +be mangled by many mailers): + + $/ = ''; # read in whole paragraph, not just one line + while ( <> ) { + while ( /^From /gm ) { # /m makes ^ match next to \n + print "leading from in paragraph $.\n"; + } + } + +Here's code that finds everything between START and END in a paragraph: + + undef $/; # read in whole file, not just one line or paragraph + while ( <> ) { + while ( /START(.*?)END/sgm ) { # /s makes . cross line boundaries + print "$1\n"; + } + } + +=head2 How can I pull out lines between two patterns that are themselves on different lines? +X<..> + +You can use Perl's somewhat exotic C<..> operator (documented in +L<perlop>): + + perl -ne 'print if /START/ .. /END/' file1 file2 ... 
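The same range operator also works inside a script. The following is a minimal sketch, not part of the FAQ itself: the subroutine name C<lines_between> and the sample data are illustrative assumptions.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the FAQ): return every line from one that
# matches $start_re through the next one that matches $end_re, inclusive.
sub lines_between {
    my ($start_re, $end_re, @lines) = @_;
    my @kept;
    for my $line (@lines) {
        # In scalar context, .. is the flip-flop operator: it is false
        # until the left operand is true, then stays true until the
        # right operand is true.
        push @kept, $line if $line =~ $start_re .. $line =~ $end_re;
    }
    return @kept;
}

# Sample data made up for this sketch.
my @input = ( "skip\n", "START\n", "keep me\n", "END\n", "skip again\n" );
print lines_between( qr/START/, qr/END/, @input );
```

The flip-flop remembers its state between calls to the subroutine, so this sketch assumes every C<START> in the input is eventually followed by an C<END>.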
+ +If you wanted text and not lines, you would use + + perl -0777 -ne 'print "$1\n" while /START(.*?)END/gs' file1 file2 ... + +But if you want nested occurrences of C<START> through C<END>, you'll +run up against the problem described in the question in this section +on matching balanced text. + +Here's another example of using C<..>: + + while (<>) { + my $in_header = 1 .. /^$/; + my $in_body = /^$/ .. eof; + # now choose between them + } continue { + $. = 0 if eof; # fix $. + } + +=head2 How do I match XML, HTML, or other nasty, ugly things with a regex? +X<regex, XML> X<regex, HTML> X<XML> X<HTML> X<pain> X<frustration> +X<sucking out, will to live> + +Do not use regexes. Use a module and forget about the +regular expressions. The L<XML::LibXML>, L<HTML::TokeParser> and +L<HTML::TreeBuilder> modules are good starts, although each namespace +has other parsing modules specialized for certain tasks and different +ways of doing it. Start at CPAN Search ( L<http://metacpan.org/> ) +and wonder at all the work people have done for you already! :) + +=head2 I put a regular expression into $/ but it didn't work. What's wrong? +X<$/, regexes in> X<$INPUT_RECORD_SEPARATOR, regexes in> +X<$RS, regexes in> + +$/ has to be a string. You can use these examples if you really need to +do this. + +If you have L<File::Stream>, this is easy. + + use File::Stream; + + my $stream = File::Stream->new( + $filehandle, + separator => qr/\s*,\s*/, + ); + + print "$_\n" while <$stream>; + +If you don't have File::Stream, you have to do a little more work. + +You can use the four-argument form of sysread to continually add to +a buffer. After you add to the buffer, you check if you have a +complete line (using your regular expression). + + local $_ = ""; + while( sysread FH, $_, 8192, length ) { + while( s/^((?s).*?)your_pattern// ) { + my $record = $1; + # do stuff here. 
+ } + } + +You can do the same thing with foreach and a match using the +c flag and the \G anchor, if you do not mind your entire file +being in memory at the end. + + local $_ = ""; + while( sysread FH, $_, 8192, length ) { + foreach my $record ( m/\G((?s).*?)your_pattern/gc ) { + # do stuff here. + } + substr( $_, 0, pos ) = "" if pos; + } + + +=head2 How do I substitute case-insensitively on the LHS while preserving case on the RHS? +X<replace, case preserving> X<substitute, case preserving> +X<substitution, case preserving> X<s, case preserving> + +Here's a lovely Perlish solution by Larry Rosler. It exploits +properties of bitwise xor on ASCII strings. + + $_= "this is a TEsT case"; + + $old = 'test'; + $new = 'success'; + + s{(\Q$old\E)} + { uc $new | (uc $1 ^ $1) . + (uc(substr $1, -1) ^ substr $1, -1) x + (length($new) - length $1) + }egi; + + print; + +And here it is as a subroutine, modeled after the above: + + sub preserve_case($$) { + my ($old, $new) = @_; + my $mask = uc $old ^ $old; + + uc $new | $mask . + substr($mask, -1) x (length($new) - length($old)) + } + + $string = "this is a TEsT case"; + $string =~ s/(test)/preserve_case($1, "success")/egi; + print "$string\n"; + +This prints: + + this is a SUcCESS case + +As an alternative, to keep the case of the replacement word if it is +longer than the original, you can use this code, by Jeff Pinyan: + + sub preserve_case { + my ($from, $to) = @_; + my ($lf, $lt) = map length, @_; + + if ($lt < $lf) { $from = substr $from, 0, $lt } + else { $from .= substr $to, $lf } + + return uc $to | ($from ^ uc $from); + } + +This changes the sentence to "this is a SUcCess case." + +Just to show that C programmers can write C in any programming language, +if you prefer a more C-like solution, the following script makes the +substitution have the same case, letter by letter, as the original. +(It also happens to run about 240% slower than the Perlish solution runs.) 
+If the substitution has more characters than the string being substituted, +the case of the last character is used for the rest of the substitution. + + # Original by Nathan Torkington, massaged by Jeffrey Friedl + # + sub preserve_case($$) + { + my ($old, $new) = @_; + my $state = 0; # 0 = no change; 1 = lc; 2 = uc + my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new)); + my $len = $oldlen < $newlen ? $oldlen : $newlen; + + for ($i = 0; $i < $len; $i++) { + if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) { + $state = 0; + } elsif (lc $c eq $c) { + substr($new, $i, 1) = lc(substr($new, $i, 1)); + $state = 1; + } else { + substr($new, $i, 1) = uc(substr($new, $i, 1)); + $state = 2; + } + } + # finish up with any remaining new (for when new is longer than old) + if ($newlen > $oldlen) { + if ($state == 1) { + substr($new, $oldlen) = lc(substr($new, $oldlen)); + } elsif ($state == 2) { + substr($new, $oldlen) = uc(substr($new, $oldlen)); + } + } + return $new; + } + +=head2 How can I make C<\w> match national character sets? +X<\w> + +Put C<use locale;> in your script. The \w character class is taken +from the current locale. + +See L<perllocale> for details. + +=head2 How can I match a locale-smart version of C</[a-zA-Z]/>? +X<alpha> + +You can use the POSIX character class syntax C</[[:alpha:]]/> +documented in L<perlre>. + +No matter which locale you are in, the alphabetic characters are +the characters in \w without the digits and the underscore. +As a regex, that looks like C</[^\W\d_]/>. Its complement, +the non-alphabetics, is then everything in \W along with +the digits and the underscore, or C</[\W\d_]/>. + +=head2 How can I quote a variable to use in a regex? +X<regex, escaping> X<regexp, escaping> X<regular expression, escaping> + +The Perl parser will expand $variable and @variable references in +regular expressions unless the delimiter is a single quote. 
Remember, +too, that the right-hand side of a C<s///> substitution is considered +a double-quoted string (see L<perlop> for more details). Remember +also that any regex special characters will be acted on unless you +precede the variable with C<\Q>. Here's an example: + + $string = "Placido P. Octopus"; + $regex = "P."; + + $string =~ s/$regex/Polyp/; + # $string is now "Polypacido P. Octopus" + +Because C<.> is special in regular expressions, and can match any +single character, the regex C<P.> here has matched the C<Pl> in the +original string. + +To escape the special meaning of C<.>, we use C<\Q>: + + $string = "Placido P. Octopus"; + $regex = "P."; + + $string =~ s/\Q$regex/Polyp/; + # $string is now "Placido Polyp Octopus" + +The use of C<\Q> causes the C<.> in the regex to be treated as a +regular character, so that C<P.> matches a C<P> followed by a dot. + +=head2 What is C</o> really for? +X</o, regular expressions> X<compile, regular expressions> + +(contributed by brian d foy) + +The C</o> option for regular expressions (documented in L<perlop> and +L<perlreref>) tells Perl to compile the regular expression only once. +This is only useful when the pattern contains a variable. Perl 5.6 +and later handles this automatically if the pattern does not change. + +Since the match operator C<m//>, the substitution operator C<s///>, +and the regular expression quoting operator C<qr//> are double-quotish +constructs, you can interpolate variables into the pattern. See the +answer to "How can I quote a variable to use in a regex?" for more +details. + +This example takes a regular expression from the argument list and +prints the lines of input that match it: + + my $pattern = shift @ARGV; + + while( <> ) { + print if m/$pattern/; + } + +Versions of Perl prior to 5.6 would recompile the regular expression +for each iteration, even if C<$pattern> had not changed. 
The C</o> +would prevent this by telling Perl to compile the pattern the first +time, then reuse that for subsequent iterations: + + my $pattern = shift @ARGV; + + while( <> ) { + print if m/$pattern/o; # useful for Perl < 5.6 + } + +In versions 5.6 and later, Perl won't recompile the regular expression +if the variable hasn't changed, so you probably don't need the C</o> +option. It doesn't hurt, but it doesn't help either. If you want any +version of Perl to compile the regular expression only once even if +the variable changes (thus, only using its initial value), you still +need the C</o>. + +You can watch Perl's regular expression engine at work to verify for +yourself whether Perl is recompiling a regular expression. The C<use re +'debug'> pragma (which comes with Perl 5.005 and later) shows the details. +With Perls before 5.6, you should see C<re> reporting that it's +compiling the regular expression on each iteration. With Perl 5.6 or +later, you should only see C<re> report that for the first iteration. + + use re 'debug'; + + my $regex = 'Perl'; + foreach ( qw(Perl Java Ruby Python) ) { + print STDERR "-" x 73, "\n"; + print STDERR "Trying $_...\n"; + print STDERR "\t$_ is good!\n" if m/$regex/; + } + +=head2 How do I use a regular expression to strip C-style comments from a file? + +While this actually can be done, it's much harder than you'd think. +For example, this one-liner + + perl -0777 -pe 's{/\*.*?\*/}{}gs' foo.c + +will work in many but not all cases. You see, it's too simple-minded for +certain kinds of C programs, in particular, those with what appear to be +comments in quoted strings. For that, you'd need something like this, +created by Jeffrey Friedl and later modified by Fred Curtis. + + $/ = undef; + $_ = <>; + s#/\*[^*]*\*+([^/*][^*]*\*+)*/|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $2 ? $2 : ""#gse; + print; + +This could, of course, be more legibly written with the C</x> modifier, adding +whitespace and comments. 
Here it is expanded, courtesy of Fred Curtis. + + s{ + /\* ## Start of /* ... */ comment + [^*]*\*+ ## Non-* followed by 1-or-more *'s + ( + [^/*][^*]*\*+ + )* ## 0-or-more things which don't start with / + ## but do end with '*' + / ## End of /* ... */ comment + + | ## OR various things which aren't comments: + + ( + " ## Start of " ... " string + ( + \\. ## Escaped char + | ## OR + [^"\\] ## Non "\ + )* + " ## End of " ... " string + + | ## OR + + ' ## Start of ' ... ' string + ( + \\. ## Escaped char + | ## OR + [^'\\] ## Non '\ + )* + ' ## End of ' ... ' string + + | ## OR + + . ## Any other char + [^/"'\\]* ## Chars which don't start a comment, string or escape + ) + }{defined $2 ? $2 : ""}gxse; + +A slight modification also removes C++ comments, possibly spanning multiple lines +using a continuation character: + + s#/\*[^*]*\*+([^/*][^*]*\*+)*/|//([^\\]|[^\n][\n]?)*?\n|("(\\.|[^"\\])*"|'(\\.|[^'\\])*'|.[^/"'\\]*)#defined $3 ? $3 : ""#gse; + +=head2 Can I use Perl regular expressions to match balanced text? +X<regex, matching balanced text> X<regexp, matching balanced text> +X<regular expression, matching balanced text> X<possessive> X<PARNO> +X<Text::Balanced> X<Regexp::Common> X<backtracking> X<recursion> + +(contributed by brian d foy) + +Your first try should probably be the L<Text::Balanced> module, which +has been in the Perl standard library since Perl 5.8. It has a variety of +functions to deal with tricky text. The L<Regexp::Common> module can +also help by providing canned patterns you can use. + +As of Perl 5.10, you can match balanced text with regular expressions +using recursive patterns. Before Perl 5.10, you had to resort to +various tricks such as using Perl code in C<(??{})> sequences. + +Here's an example using a recursive regular expression. The goal is to +capture all of the text within angle brackets, including the text in +nested angle brackets. 
This sample text has two "major" groups: a +group with one level of nesting and a group with two levels of +nesting. There are five total groups in angle brackets: + + I have some <brackets in <nested brackets> > and + <another group <nested once <nested twice> > > + and that's it. + +The regular expression to match the balanced text uses two new (to +Perl 5.10) regular expression features. These are covered in L<perlre> +and this example is a modified version of one in that documentation. + +First, adding the new possessive C<+> to any quantifier finds the +longest match and does not backtrack. That's important since you want +to handle any angle brackets through the recursion, not backtracking. +The group C<< [^<>]++ >> finds one or more non-angle brackets without +backtracking. + +Second, the new C<(?PARNO)> refers to the sub-pattern in the +particular capture group given by C<PARNO>. In the following regex, +the first capture group finds (and remembers) the balanced text, and +you need that same pattern within the first buffer to get past the +nested text. That's the recursive part. The C<(?1)> uses the pattern +in the outer capture group as an independent part of the regex. + +Putting it all together, you have: + + #!/usr/local/bin/perl5.10.0 + + my $string =<<"HERE"; + I have some <brackets in <nested brackets> > and + <another group <nested once <nested twice> > > + and that's it. 
+ HERE + + my @groups = $string =~ m/ + ( # start of capture group 1 + < # match an opening angle bracket + (?: + [^<>]++ # one or more non angle brackets, non backtracking + | + (?1) # found < or >, so recurse to capture group 1 + )* + > # match a closing angle bracket + ) # end of capture group 1 + /xg; + + $" = "\n\t"; + print "Found:\n\t@groups\n"; + +The output shows that Perl found the two major groups: + + Found: + <brackets in <nested brackets> > + <another group <nested once <nested twice> > > + +With a little extra work, you can get all of the groups in angle +brackets even if they are in other angle brackets too. Each time you +get a balanced match, remove its outer delimiters (those are the ones you +just matched so don't match them again) and add it to a queue of strings +to process. Keep doing that until you get no matches: + + #!/usr/local/bin/perl5.10.0 + + my @queue =<<"HERE"; + I have some <brackets in <nested brackets> > and + <another group <nested once <nested twice> > > + and that's it. + HERE + + my $regex = qr/ + ( # start of bracket 1 + < # match an opening angle bracket + (?: + [^<>]++ # one or more non angle brackets, non backtracking + | + (?1) # recurse to bracket 1 + )* + > # match a closing angle bracket + ) # end of bracket 1 + /x; + + $" = "\n\t"; + + while( @queue ) { + my $string = shift @queue; + + my @groups = $string =~ m/$regex/g; + print "Found:\n\t@groups\n\n" if @groups; + + unshift @queue, map { s/^<//; s/>$//; $_ } @groups; + } + +The output shows all of the groups. The outermost matches show up +first and the nested matches show up later: + + Found: + <brackets in <nested brackets> > + <another group <nested once <nested twice> > > + + Found: + <nested brackets> + + Found: + <nested once <nested twice> > + + Found: + <nested twice> + +=head2 What does it mean that regexes are greedy? How can I get around it? +X<greedy> X<greediness> + +Most people mean that greedy regexes match as much as they can. 
+Technically speaking, it's actually the quantifiers (C<?>, C<*>, C<+>, +C<{}>) that are greedy rather than the whole pattern; Perl prefers local +greed and immediate gratification to overall greed. To get non-greedy +versions of the same quantifiers, use (C<??>, C<*?>, C<+?>, C<{}?>). + +An example: + + my $s1 = my $s2 = "I am very very cold"; + $s1 =~ s/ve.*y //; # I am cold + $s2 =~ s/ve.*?y //; # I am very cold + +Notice how the second substitution stopped matching as soon as it +encountered "y ". The C<*?> quantifier effectively tells the regular +expression engine to find a match as quickly as possible and pass +control on to whatever is next in line, as you would if you were +playing hot potato. + +=head2 How do I process each word on each line? +X<word> + +Use the split function: + + while (<>) { + foreach my $word ( split ) { + # do something with $word here + } + } + +Note that this isn't really a word in the English sense; it's just +chunks of consecutive non-whitespace characters. + +To work with only alphanumeric sequences (including underscores), you +might consider + + while (<>) { + foreach $word (m/(\w+)/g) { + # do something with $word here + } + } + +=head2 How can I print out a word-frequency or line-frequency summary? + +To do this, you have to parse out each word in the input stream. 
We'll +pretend that by word you mean chunk of alphabetics, hyphens, or +apostrophes, rather than the non-whitespace chunk idea of a word given +in the previous question: + + my (%seen); + while (<>) { + while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'" + $seen{$1}++; + } + } + + while ( my ($word, $count) = each %seen ) { + print "$count $word\n"; + } + +If you wanted to do the same thing for lines, you wouldn't need a +regular expression: + + my (%seen); + + while (<>) { + $seen{$_}++; + } + + while ( my ($line, $count) = each %seen ) { + print "$count $line"; + } + +If you want these output in a sorted order, see L<perlfaq4>: "How do I +sort a hash (optionally by value instead of key)?". + +=head2 How can I do approximate matching? +X<match, approximate> X<matching, approximate> + +See the module L<String::Approx> available from CPAN. + +=head2 How do I efficiently match many regular expressions at once? +X<regex, efficiency> X<regexp, efficiency> +X<regular expression, efficiency> + +(contributed by brian d foy) + +If you have Perl 5.10 or later, this is almost trivial. You just smart +match against an array of regular expression objects: + + my @patterns = ( qr/Fr.d/, qr/B.rn.y/, qr/W.lm./ ); + + if( $string ~~ @patterns ) { + ... + }; + +The smart match stops when it finds a match, so it doesn't have to try +every expression. + +Earlier than Perl 5.10, you have a bit of work to do. You want to +avoid compiling a regular expression every time you want to match it. +In this example, perl must recompile the regular expression for every +iteration of the C<foreach> loop since it has no way to know what +C<$pattern> will be: + + my @patterns = qw( foo bar baz ); + + LINE: while( <DATA> ) { + foreach $pattern ( @patterns ) { + if( /\b$pattern\b/i ) { + print; + next LINE; + } + } + } + +The C<qr//> operator showed up in perl 5.005. It compiles a regular +expression, but doesn't apply it. When you use the pre-compiled +version of the regex, perl does less work. 
In this example, I inserted +a C<map> to turn each pattern into its pre-compiled form. The rest of +the script is the same, but faster: + + my @patterns = map { qr/\b$_\b/i } qw( foo bar baz ); + + LINE: while( <> ) { + foreach $pattern ( @patterns ) { + if( /$pattern/ ) { + print; + next LINE; + } + } + } + +In some cases, you may be able to make several patterns into a single +regular expression. Beware of situations that require backtracking +though. + + my $regex = join '|', qw( foo bar baz ); + + LINE: while( <> ) { + print if /\b(?:$regex)\b/i; + } + +For more details on regular expression efficiency, see I<Mastering +Regular Expressions> by Jeffrey Friedl. He explains how the regular +expressions engine works and why some patterns are surprisingly +inefficient. Once you understand how perl applies regular expressions, +you can tune them for individual situations. + +=head2 Why don't word-boundary searches with C<\b> work for me? +X<\b> + +(contributed by brian d foy) + +Ensure that you know what \b really does: it's the boundary between a +word character, \w, and something that isn't a word character. That +thing that isn't a word character might be \W, but it can also be the +start or end of the string. + +It's not (not!) the boundary between whitespace and non-whitespace, +and it's not the stuff between words we use to create sentences. + +In regex speak, a word boundary (\b) is a "zero width assertion", +meaning that it doesn't represent a character in the string, but a +condition at a certain position. + +For the regular expression, /\bPerl\b/, there has to be a word +boundary before the "P" and after the "l". As long as something other +than a word character precedes the "P" and succeeds the "l", the +pattern will match. These strings match /\bPerl\b/. 
+ + "Perl" # no word char before P or after l + "Perl " # same as previous (space is not a word char) + "'Perl'" # the ' char is not a word char + "Perl's" # no word char before P, non-word char after "l" + +These strings do not match /\bPerl\b/. + + "Perl_" # _ is a word char! + "Perler" # no word char before P, but one after l + +You don't have to use \b to match words though. You can look for +non-word characters surrounded by word characters. These strings +match the pattern /\b'\b/. + + "don't" # the ' char is surrounded by "n" and "t" + "qep'a'" # the ' char is surrounded by "p" and "a" + +These strings do not match /\b'\b/. + + "foo'" # there is no word char after non-word ' + +You can also use the complement of \b, \B, to specify that there +should not be a word boundary. + +In the pattern /\Bam\B/, there must be a word character before the "a" +and after the "m". These strings match /\Bam\B/: + + "llama" # "am" surrounded by word chars + "Samuel" # same + +These strings do not match /\Bam\B/: + + "Sam" # no word boundary before "a", but one after "m" + "I am Sam" # "am" surrounded by non-word chars + + +=head2 Why does using $&, $`, or $' slow my program down? +X<$MATCH> X<$&> X<$POSTMATCH> X<$'> X<$PREMATCH> X<$`> + +(contributed by Anno Siegel) + +Once Perl sees that you need one of these variables anywhere in the +program, it provides them on each and every pattern match. That means +that on every pattern match the entire string will be copied, part of it +to $`, part to $&, and part to $'. Thus the penalty is most severe with +long strings and patterns that match often. Avoid $&, $', and $` if you +can, but if you can't, once you've used them at all, use them at will +because you've already paid the price. Remember that some algorithms +really appreciate them. As of the 5.005 release, the $& variable is no +longer "expensive" the way the other two are. + +Since Perl 5.6.1 the special variables @- and @+ can functionally replace +$`, $& and $'. 
These arrays contain the offsets of the beginning and end +of each match (see L<perlvar> for the full story), so they give you +essentially the same information, but without the risk of excessive +string copying. + +Perl 5.10 added three specials, C<${^MATCH}>, C<${^PREMATCH}>, and +C<${^POSTMATCH}> to do the same job but without the global performance +penalty. Perl 5.10 only sets these variables if you compile or execute the +regular expression with the C</p> modifier. + +=head2 What good is C<\G> in a regular expression? +X<\G> + +You use the C<\G> anchor to start the next match on the same +string where the last match left off. The regular +expression engine cannot skip over any characters to find +the next match with this anchor, so C<\G> is similar to the +beginning of string anchor, C<^>. The C<\G> anchor is typically +used with the C<g> flag. It uses the value of C<pos()> +as the position to start the next match. As the match +operator makes successive matches, it updates C<pos()> with the +position of the next character past the last match (or the +first character of the next match, depending on how you like +to look at it). Each string has its own C<pos()> value. + +Suppose you want to match all of the consecutive pairs of digits +in a string like "1122a44" and stop matching when you +encounter non-digits. You want to match C<11> and C<22> but +the letter C<a> shows up between C<22> and C<44> and you want +to stop at C<a>. Simply matching pairs of digits skips over +the C<a> and still matches C<44>. + + $_ = "1122a44"; + my @pairs = m/(\d\d)/g; # qw( 11 22 44 ) + +If you use the C<\G> anchor, you force the match after C<22> to +start with the C<a>. The regular expression cannot match +there since it does not find a digit, so the next match +fails and the match operator returns the pairs it already +found. + + $_ = "1122a44"; + my @pairs = m/\G(\d\d)/g; # qw( 11 22 ) + +You can also use the C<\G> anchor in scalar context. You +still need the C<g> flag. 
+ + $_ = "1122a44"; + while( m/\G(\d\d)/g ) { + print "Found $1\n"; + } + +After the match fails at the letter C<a>, perl resets C<pos()> +and the next match on the same string starts at the beginning. + + $_ = "1122a44"; + while( m/\G(\d\d)/g ) { + print "Found $1\n"; + } + + print "Found $1 after while" if m/(\d\d)/g; # finds "11" + +You can disable C<pos()> resets on fail with the C<c> flag, documented +in L<perlop> and L<perlreref>. Subsequent matches start where the last +successful match ended (the value of C<pos()>) even if a match on the +same string has failed in the meantime. In this case, the match after +the C<while()> loop starts at the C<a> (where the last match stopped), +and since it does not use any anchor it can skip over the C<a> to find +C<44>. + + $_ = "1122a44"; + while( m/\G(\d\d)/gc ) { + print "Found $1\n"; + } + + print "Found $1 after while" if m/(\d\d)/g; # finds "44" + +Typically you use the C<\G> anchor with the C<c> flag +when you want to try a different match if one fails, +such as in a tokenizer. Jeffrey Friedl offers this example +which works in 5.004 or later. + + while (<>) { + chomp; + PARSER: { + m/ \G( \d+\b )/gcx && do { print "number: $1\n"; redo; }; + m/ \G( \w+ )/gcx && do { print "word: $1\n"; redo; }; + m/ \G( \s+ )/gcx && do { print "space: $1\n"; redo; }; + m/ \G( [^\w\d]+ )/gcx && do { print "other: $1\n"; redo; }; + } + } + +For each line, the C<PARSER> loop first tries to match a series +of digits followed by a word boundary. This match has to +start at the place the last match left off (or the beginning +of the string on the first match). Since C<m/ \G( \d+\b +)/gcx> uses the C<c> flag, if the string does not match that +regular expression, perl does not reset pos() and the next +match starts at the same position to try a different +pattern. + +=head2 Are Perl regexes DFAs or NFAs? Are they POSIX compliant? 
+X<DFA> X<NFA> X<POSIX> + +While it's true that Perl's regular expressions resemble the DFAs +(deterministic finite automata) of the egrep(1) program, they are in +fact implemented as NFAs (non-deterministic finite automata) to allow +backtracking and backreferencing. And they aren't POSIX-style either, +because those guarantee worst-case behavior for all cases. (It seems +that some people prefer guarantees of consistency, even when what's +guaranteed is slowness.) See the book "Mastering Regular Expressions" +(from O'Reilly) by Jeffrey Friedl for all the details you could ever +hope to know on these matters (a full citation appears in +L<perlfaq2>). + +=head2 What's wrong with using grep in a void context? +X<grep> + +The problem is that grep builds a return list, regardless of the context. +This means you're making Perl go to the trouble of building a list that +you then just throw away. If the list is large, you waste both time and space. +If your intent is to iterate over the list, then use a for loop for this +purpose. + +In perls older than 5.8.1, map suffers from this problem as well. +But since 5.8.1, this has been fixed, and map is context aware - in void +context, no lists are constructed. + +=head2 How can I match strings with multibyte characters? +X<regex, and multibyte characters> X<regexp, and multibyte characters> +X<regular expression, and multibyte characters> X<martian> X<encoding, Martian> + +Starting from Perl 5.6 Perl has had some level of multibyte character +support. Perl 5.8 or later is recommended. Supported multibyte +character repertoires include Unicode, and legacy encodings +through the Encode module. See L<perluniintro>, L<perlunicode>, +and L<Encode>. + +If you are stuck with older Perls, you can do Unicode with the +L<Unicode::String> module, and character conversions using the +L<Unicode::Map8> and L<Unicode::Map> modules. If you are using +Japanese encodings, you might try using the jperl 5.005_03. 
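For the modern route through L<Encode> mentioned above, a minimal sketch might look like the following. The sample string and variable names are assumptions chosen for illustration; the only library call used is C<Encode::decode>.

```perl
use strict;
use warnings;
use Encode qw(decode);

# UTF-8 encoded bytes, written as escapes so this source stays ASCII.
# The text is "na", i-diaeresis, "ve caf", e-acute (sample data for this sketch).
my $bytes = "na\xC3\xAFve caf\xC3\xA9";

# decode() turns the byte string into a character string.
my $chars = decode( 'UTF-8', $bytes );

# After decoding, length() and the regex dot work on characters, not bytes.
print length($bytes), " bytes, ", length($chars), " characters\n";
print "dot matched one accented character\n" if $chars =~ /^na.ve caf.$/;
```

The same pattern does not match C<$bytes>, because each accented letter is two bytes there and the dot consumes only one of them.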
+ +Finally, the following set of approaches was offered by Jeffrey +Friedl, whose article in issue #5 of The Perl Journal talks about +this very matter. + +Let's suppose you have some weird Martian encoding where pairs of +ASCII uppercase letters encode single Martian letters (e.g. the two +bytes "CV" make a single Martian letter, as do the two bytes "SG", +"VS", "XX", etc.). Other bytes represent single characters, just like +ASCII. + +So, the string of Martian "I am CVSGXX!" uses 12 bytes to encode the +nine characters 'I', ' ', 'a', 'm', ' ', 'CV', 'SG', 'XX', '!'. + +Now, say you want to search for the single character C</GX/>. Perl +doesn't know about Martian, so it'll find the two bytes "GX" in the "I +am CVSGXX!" string, even though that character isn't there: it just +looks like it is because "SG" is next to "XX", but there's no real +"GX". This is a big problem. + +Here are a few ways, all painful, to deal with it: + + # Make sure adjacent "martian" bytes are no longer adjacent. + $martian =~ s/([A-Z][A-Z])/ $1 /g; + + print "found GX!\n" if $martian =~ /GX/; + +Or like this: + + my @chars = $martian =~ m/([A-Z][A-Z]|[^A-Z])/g; + # above is conceptually similar to: my @chars = $text =~ m/(.)/g; + # + foreach my $char (@chars) { + if ($char eq 'GX') { + print "found GX!\n"; + last; + } + } + +Or like this: + + while ($martian =~ m/\G([A-Z][A-Z]|.)/gs) { # \G probably unneeded + if ($1 eq 'GX') { + print "found GX!\n"; + last; + } + } + +Here's another, slightly less painful, way to do it from Benjamin +Goldberg, who uses a zero-width negative look-behind assertion. + + print "found GX!\n" if $martian =~ m/ + (?<![A-Z]) + (?:[A-Z][A-Z])*? + GX + /x; + +This succeeds if the "martian" character GX is in the string, and fails +otherwise. If you don't like using (?<!), a zero-width negative +look-behind assertion, you can replace (?<![A-Z]) with (?:^|[^A-Z]). + +It does have the drawback of putting the wrong thing in $-[0] and $+[0], +but this usually can be worked around.
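One possible workaround for the C<$-[0]> and C<$+[0]> drawback is to capture the GX itself and read the offsets of that group instead. The following is a sketch, not part of the original answer; the helper name C<find_gx> and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (start, end) offsets of the Martian
# character GX, or an empty list if it does not occur.
sub find_gx {
    my ($martian) = @_;
    return unless $martian =~ m/
        (?<![A-Z])          # start at a character boundary
        (?:[A-Z][A-Z])*?    # skip whole two-byte Martian letters
        (GX)                # capture the letter we are looking for
    /x;
    # $-[1] and $+[1] describe group 1 only, so they point at GX itself
    # rather than at the start of the skipped run of letters.
    return ($-[1], $+[1]);
}

my @hit  = find_gx("I am CVGXSG!");   # CV, GX, SG: GX really is present
my @miss = find_gx("I am CVSGXX!");   # CV, SG, XX: only looks like GX

print "GX occupies offsets $hit[0] to ", $hit[1] - 1, "\n" if @hit;
print "no GX in the second string\n" unless @miss;
```

The whole-match offsets still begin at the first letter of the run, which is why the capture group is used.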
+ +=head2 How do I match a regular expression that's in a variable? +X<regex, in variable> X<eval> X<regex> X<quotemeta> X<\Q, regex> +X<\E, regex> X<qr//> + +(contributed by brian d foy) + +We don't have to hard-code patterns into the match operator (or +anything else that works with regular expressions). We can put the +pattern in a variable for later use. + +The match operator is a double quote context, so you can interpolate +your variable just like a double quoted string. In this case, you +read the regular expression as user input and store it in C<$regex>. +Once you have the pattern in C<$regex>, you use that variable in the +match operator. + + chomp( my $regex = <STDIN> ); + + if( $string =~ m/$regex/ ) { ... } + +Any regular expression special characters in C<$regex> are still +special, and the pattern still has to be valid or Perl will complain. +For instance, in this pattern there is an unpaired parenthesis. + + my $regex = "Unmatched ( paren"; + + "Two parens to bind them all" =~ m/$regex/; + +When Perl compiles the regular expression, it treats the parenthesis +as the start of a capture group. When it doesn't find the closing +parenthesis, it complains: + + Unmatched ( in regex; marked by <-- HERE in m/Unmatched ( <-- HERE paren/ at script line 3. + +You can get around this in several ways depending on your situation. +First, if you don't want any of the characters in the string to be +special, you can escape them with C<quotemeta> before you use the string. + + chomp( my $regex = <STDIN> ); + $regex = quotemeta( $regex ); + + if( $string =~ m/$regex/ ) { ... } + +You can also do this directly in the match operator using the C<\Q> +and C<\E> sequences. The C<\Q> tells Perl where to start escaping +special characters, and the C<\E> tells it where to stop (see L<perlop> +for more details). + + chomp( my $regex = <STDIN> ); + + if( $string =~ m/\Q$regex\E/ ) { ... 
} + +Alternately, you can use C<qr//>, the regular expression quote operator (see +L<perlop> for more details). It quotes and perhaps compiles the pattern, +and you can apply regular expression flags to the pattern. + + chomp( my $input = <STDIN> ); + + my $regex = qr/$input/is; + + $string =~ m/$regex/ # same as m/$input/is; + +You might also want to trap any errors by wrapping an C<eval> block +around the whole thing. + + chomp( my $input = <STDIN> ); + + eval { + if( $string =~ m/\Q$input\E/ ) { ... } + }; + warn $@ if $@; + +Or... + + my $regex = eval { qr/$input/is }; + if( defined $regex ) { + $string =~ m/$regex/; + } + else { + warn $@; + } + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples in this file +are hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun +or for profit as you see fit. A simple comment in the code giving +credit would be courteous but is not required. diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq7.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq7.pod new file mode 100644 index 00000000000..35c9330f2dc --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq7.pod @@ -0,0 +1,1061 @@ +=head1 NAME + +perlfaq7 - General Perl Language Issues + +=head1 DESCRIPTION + +This section deals with general Perl language issues that don't +clearly fit into any of the other sections. + +=head2 Can I get a BNF/yacc/RE for the Perl language? + +There is no BNF, but you can paw your way through the yacc grammar in +perly.y in the source distribution if you're particularly brave. The +grammar relies on very smart tokenizing code, so be prepared to +venture into toke.c as well. 
+ +In the words of Chaim Frenkel: "Perl's grammar can not be reduced to BNF. +The work of parsing perl is distributed between yacc, the lexer, smoke +and mirrors." + +=head2 What are all these $@%&* punctuation signs, and how do I know when to use them? + +They are type specifiers, as detailed in L<perldata>: + + $ for scalar values (number, string or reference) + @ for arrays + % for hashes (associative arrays) + & for subroutines (aka functions, procedures, methods) + * for all types of that symbol name. In version 4 you used them like + pointers, but in modern perls you can just use references. + +There are a couple of other symbols that +you're likely to encounter that aren't +really type specifiers: + + <> are used for inputting a record from a filehandle. + \ takes a reference to something. + +Note that <FILE> is I<neither> the type specifier for files +nor the name of the handle. It is the C<< <> >> operator applied +to the handle FILE. It reads one line (well, record--see +L<perlvar/$E<sol>>) from the handle FILE in scalar context, or I<all> lines +in list context. When performing open, close, or any other operation +besides C<< <> >> on files, or even when talking about the handle, do +I<not> use the brackets. These are correct: C<eof(FH)>, C<seek(FH, 0, +2)> and "copying from STDIN to FILE". + +=head2 Do I always/never have to quote my strings or use semicolons and commas? + +Normally, a bareword doesn't need to be quoted, but in most cases +probably should be (and must be under C<use strict>). But a hash key +consisting of a simple word and the left-hand +operand to the C<< => >> operator both +count as though they were quoted: + + This is like this + ------------ --------------- + $foo{line} $foo{'line'} + bar => stuff 'bar' => stuff + +The final semicolon in a block is optional, as is the final comma in a +list. 
Good style (see L<perlstyle>) says to put them in except for +one-liners: + + if ($whoops) { exit 1 } + my @nums = (1, 2, 3); + + if ($whoops) { + exit 1; + } + + my @lines = ( + "There Beren came from mountains cold", + "And lost he wandered under leaves", + ); + +=head2 How do I skip some return values? + +One way is to treat the return values as a list and index into it: + + $dir = (getpwnam($user))[7]; + +Another way is to use undef as an element on the left-hand-side: + + ($dev, $ino, undef, undef, $uid, $gid) = stat($file); + +You can also use a list slice to select only the elements that +you need: + + ($dev, $ino, $uid, $gid) = ( stat($file) )[0,1,4,5]; + +=head2 How do I temporarily block warnings? + +If you are running Perl 5.6.0 or better, the C<use warnings> pragma +allows fine control of what warnings are produced. +See L<perllexwarn> for more details. + + { + no warnings; # temporarily turn off warnings + $x = $y + $z; # I know these might be undef + } + +Additionally, you can enable and disable categories of warnings. +You turn off the categories you want to ignore and you can still +get other categories of warnings. See L<perllexwarn> for the +complete details, including the category names and hierarchy. + + { + no warnings 'uninitialized'; + $x = $y + $z; + } + +If you have an older version of Perl, the C<$^W> variable (documented +in L<perlvar>) controls runtime warnings for a block: + + { + local $^W = 0; # temporarily turn off warnings + $x = $y + $z; # I know these might be undef + } + +Note that like all the punctuation variables, you cannot currently +use my() on C<$^W>, only local(). + +=head2 What's an extension? + +An extension is a way of calling compiled C code from Perl. Reading +L<perlxstut> is a good place to learn more about extensions. + +=head2 Why do Perl operators have different precedence than C operators? + +Actually, they don't. All C operators that Perl copies have the same +precedence in Perl as they do in C. 
The problem is with operators that C +doesn't have, especially functions that give a list context to everything +on their right, eg. print, chmod, exec, and so on. Such functions are +called "list operators" and appear as such in the precedence table in +L<perlop>. + +A common mistake is to write: + + unlink $file || die "snafu"; + +This gets interpreted as: + + unlink ($file || die "snafu"); + +To avoid this problem, either put in extra parentheses or use the +super low precedence C<or> operator: + + (unlink $file) || die "snafu"; + unlink $file or die "snafu"; + +The "English" operators (C<and>, C<or>, C<xor>, and C<not>) +deliberately have precedence lower than that of list operators for +just such situations as the one above. + +Another operator with surprising precedence is exponentiation. It +binds more tightly even than unary minus, making C<-2**2> produce a +negative four and not a positive one. It is also right-associating, meaning +that C<2**3**2> is two raised to the ninth power, not eight squared. + +Although it has the same precedence as in C, Perl's C<?:> operator +produces an lvalue. This assigns $x to either $if_true or $if_false, depending +on the trueness of $maybe: + + ($maybe ? $if_true : $if_false) = $x; + +=head2 How do I declare/create a structure? + +In general, you don't "declare" a structure. Just use a (probably +anonymous) hash reference. See L<perlref> and L<perldsc> for details. +Here's an example: + + $person = {}; # new anonymous hash + $person->{AGE} = 24; # set field AGE to 24 + $person->{NAME} = "Nat"; # set field NAME to "Nat" + +If you're looking for something a bit more rigorous, try L<perltoot>. + +=head2 How do I create a module? + +L<perlnewmod> is a good place to start, ignore the bits +about uploading to CPAN if you don't want to make your +module publicly available. + +L<ExtUtils::ModuleMaker> and L<Module::Starter> are also +good places to start. Many CPAN authors now use L<Dist::Zilla> +to automate as much as possible. 
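As a rough idea of what those tools generate, here is a minimal sketch of a module. The package name C<My::Greeter> and the function C<greet> are hypothetical, and the package is defined inline so the example runs as a single file; a real module would live in its own F<My/Greeter.pm>.

```perl
use strict;
use warnings;

package My::Greeter;            # hypothetical module name
use Exporter 'import';          # borrow Exporter's import method

our $VERSION   = '0.01';
our @EXPORT_OK = qw(greet);     # exported only on request

sub greet {
    my ($name) = @_;
    return "Hello, $name!";
}

1;  # a module file must end with a true value

package main;

# For a module in its own file you would write: use My::Greeter qw(greet);
# Calling import by hand does the same job for this inline package.
My::Greeter->import('greet');

print greet("world"), "\n";
```

Listing names in C<@EXPORT_OK> rather than C<@EXPORT> keeps the caller's namespace clean unless a function is asked for explicitly.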
+ +Detailed documentation about modules can be found at: +L<perlmod>, L<perlmodlib>, L<perlmodstyle>. + +If you need to include C code or C library interfaces, +use h2xs. h2xs will create the module distribution structure +and the initial interface files. +L<perlxs> and L<perlxstut> explain the details. + +=head2 How do I adopt or take over a module already on CPAN? + +Ask the current maintainer to make you a co-maintainer or +transfer the module to you. + +If you cannot reach the author for some reason, contact +the PAUSE admins at modules@perl.org, who may be able to help, +but each case is treated separately. + +=over 4 + +=item * + +Get a login for the Perl Authors Upload Server (PAUSE) if you don't +already have one: L<http://pause.perl.org> + +=item * + +Write to modules@perl.org explaining what you did to contact the +current maintainer. The PAUSE admins will also try to reach the +maintainer. + +=item * + +Post a public message on a heavily trafficked site announcing your +intention to take over the module. + +=item * + +Wait a bit. The PAUSE admins don't want to act too quickly in case +the current maintainer is on holiday. If there's no response to +private communication or the public post, a PAUSE admin can transfer +it to you. + +=back + +=head2 How do I create a class? +X<class, creation> X<package> + +(contributed by brian d foy) + +In Perl, a class is just a package, and methods are just subroutines. +Perl doesn't get more formal than that and lets you set up the package +just the way that you like it (that is, it doesn't set up anything for +you). + +The Perl documentation has several tutorials that cover class +creation, including L<perlboot> (Barnyard Object Oriented Tutorial), +L<perltoot> (Tom's Object Oriented Tutorial), L<perlbot> (Bag o' +Object Tricks), and L<perlobj>. + +=head2 How can I tell if a variable is tainted? 
+ +You can use the tainted() function of the Scalar::Util module, available +from CPAN (or included with Perl since release 5.8.0). +See also L<perlsec/"Laundering and Detecting Tainted Data">. + +=head2 What's a closure? + +Closures are documented in L<perlref>. + +I<Closure> is a computer science term with a precise but +hard-to-explain meaning. Usually, closures are implemented in Perl as +anonymous subroutines with lasting references to lexical variables +outside their own scopes. These lexicals magically refer to the +variables that were around when the subroutine was defined (deep +binding). + +Closures are most often used in programming languages where you can +have the return value of a function be itself a function, as you can +in Perl. Note that some languages provide anonymous functions but are +not capable of providing proper closures: the Python language, for +example. For more information on closures, check out any textbook on +functional programming. Scheme is a language that not only supports +but encourages closures. + +Here's a classic non-closure function-generating function: + + sub add_function_generator { + return sub { shift() + shift() }; + } + + my $add_sub = add_function_generator(); + my $sum = $add_sub->(4,5); # $sum is 9 now. + +The anonymous subroutine returned by add_function_generator() isn't +technically a closure because it refers to no lexicals outside its own +scope. Using a closure gives you a I<function template> with some +customization slots left out to be filled later. + +Contrast this with the following make_adder() function, in which the +returned anonymous function contains a reference to a lexical variable +outside the scope of that function itself. Such a reference requires +that Perl return a proper closure, thus locking in for all time the +value that the lexical had when the function was created. 
+ + sub make_adder { + my $addpiece = shift; + return sub { shift() + $addpiece }; + } + + my $f1 = make_adder(20); + my $f2 = make_adder(555); + +Now C<< $f1->($n) >> is always 20 plus whatever $n you pass in, whereas +C<< $f2->($n) >> is always 555 plus whatever $n you pass in. The $addpiece +in the closure sticks around. + +Closures are often used for less esoteric purposes. For example, when +you want to pass in a bit of code into a function: + + my $line; + timeout( 30, sub { $line = <STDIN> } ); + +If the code to execute had been passed in as a string, +C<< '$line = <STDIN>' >>, there would have been no way for the +hypothetical timeout() function to access the lexical variable +$line back in its caller's scope. + +Another use for a closure is to make a variable I<private> to a +named subroutine, e.g. a counter that gets initialized at creation +time of the sub and can only be modified from within the sub. +This is sometimes used with a BEGIN block in package files to make +sure a variable doesn't get meddled with during the lifetime of the +package: + + BEGIN { + my $id = 0; + sub next_id { ++$id } + } + +This is discussed in more detail in L<perlsub>; see the entry on +I<Persistent Private Variables>. + +=head2 What is variable suicide and how can I prevent it? + +This problem was fixed in perl 5.004_05, so preventing it means upgrading +your version of perl. ;) + +Variable suicide is when you (temporarily or permanently) lose the value +of a variable. It is caused by scoping through my() and local() +interacting with either closures or aliased foreach() iterator variables +and subroutine arguments. It used to be easy to inadvertently lose a +variable's value this way, but now it's much harder. 
Take this code: + + my $f = 'foo'; + sub T { + while ($i++ < 3) { my $f = $f; $f .= "bar"; print $f, "\n" } + } + + T; + print "Finally $f\n"; + +If you are experiencing variable suicide, that C<my $f> in the subroutine +doesn't pick up a fresh copy of the C<$f> whose value is C<'foo'>. The +output shows that inside the subroutine the value of C<$f> leaks through +when it shouldn't, as in this output: + + foobar + foobarbar + foobarbarbar + Finally foo + +The $f that has "bar" added to it three times should be a new C<$f>; +C<my $f> should create a new lexical variable each time through the loop. +The expected output is: + + foobar + foobar + foobar + Finally foo + +=head2 How can I pass/return a {Function, FileHandle, Array, Hash, Method, Regex}? + +You need to pass references to these objects. See L<perlsub/"Pass by +Reference"> for this particular question, and L<perlref> for +information on references. + +=over 4 + +=item Passing Variables and Functions + +Regular variables and functions are quite easy to pass: just pass in a +reference to an existing or anonymous variable or function: + + func( \$some_scalar ); + + func( \@some_array ); + func( [ 1 .. 10 ] ); + + func( \%some_hash ); + func( { this => 10, that => 20 } ); + + func( \&some_func ); + func( sub { $_[0] ** $_[1] } ); + +=item Passing Filehandles + +As of Perl 5.6, you can represent filehandles with scalar variables +which you treat as any other scalar. + + open my $fh, '<', $filename or die "Cannot open $filename! $!"; + func( $fh ); + + sub func { + my $passed_fh = shift; + + my $line = <$passed_fh>; + } + +Before Perl 5.6, you had to use the C<*FH> or C<\*FH> notations. +These are "typeglobs"--see L<perldata/"Typeglobs and Filehandles"> +and especially L<perlsub/"Pass by Reference"> for more information. + +=item Passing Regexes + +Here's an example of how to pass in a string and a regular expression +for it to match against. 
You construct the pattern with the C<qr//> +operator: + + sub compare($$) { + my ($val1, $regex) = @_; + my $retval = $val1 =~ /$regex/; + return $retval; + } + $match = compare("old McDonald", qr/d.*D/i); + +=item Passing Methods + +To pass an object method into a subroutine, you can do this: + + call_a_lot(10, $some_obj, "methname"); + sub call_a_lot { + my ($count, $widget, $trick) = @_; + for (my $i = 0; $i < $count; $i++) { + $widget->$trick(); + } + } + +Or, you can use a closure to bundle up the object, its +method call, and arguments: + + my $whatnot = sub { $some_obj->obfuscate(@args) }; + func($whatnot); + sub func { + my $code = shift; + &$code(); + } + +You could also investigate the can() method in the UNIVERSAL class +(part of the standard perl distribution). + +=back + +=head2 How do I create a static variable? + +(contributed by brian d foy) + +In Perl 5.10 and later, declare the variable with C<state> (enabled by +C<use feature 'state'> or C<use 5.010>). The C<state> +declaration creates a lexical variable that persists between calls +to the subroutine: + + sub counter { state $count = 1; $count++ } + +You can fake a static variable by using a lexical variable which goes +out of scope. In this example, you define the subroutine C<counter>, and +it uses the lexical variable C<$count>. Since you wrap this in a BEGIN +block, C<$count> is defined at compile-time, but also goes out of +scope at the end of the BEGIN block. The BEGIN block also ensures that +the subroutine and the value it uses are defined at compile-time so the +subroutine is ready to use just like any other subroutine, and you can +put this code in the same place as other subroutines in the program +text (i.e. at the end of the code, typically). The subroutine +C<counter> still has a reference to the data, and is the only way you +can access the value (and each time you do, you increment the value). +The data in the chunk of memory defined by C<$count> is private to +C<counter>. 
+ + BEGIN { + my $count = 1; + sub counter { $count++ } + } + + my $start = counter(); + + .... # code that calls counter(); + + my $end = counter(); + +In the previous example, you created a function-private variable +because only one function remembered its reference. You could define +multiple functions while the variable is in scope, and each function +can share the "private" variable. It's not really "static" because you +can access it outside the function while the lexical variable is in +scope, and even create references to it. In this example, +C<increment_count> and C<return_count> share the variable. One +function adds to the value and the other simply returns the value. +They can both access C<$count>, and since it has gone out of scope, +there is no other way to access it. + + BEGIN { + my $count = 1; + sub increment_count { $count++ } + sub return_count { $count } + } + +To declare a file-private variable, you still use a lexical variable. +A file is also a scope, so a lexical variable defined in the file +cannot be seen from any other file. + +See L<perlsub/"Persistent Private Variables"> for more information. +The discussion of closures in L<perlref> may help you even though we +did not use anonymous subroutines in this answer. See +L<perlsub/"Persistent Private Variables"> for details. + +=head2 What's the difference between dynamic and lexical (static) scoping? Between local() and my()? + +C<local($x)> saves away the old value of the global variable C<$x> +and assigns a new value for the duration of the subroutine I<which is +visible in other functions called from that subroutine>. This is done +at run-time, so is called dynamic scoping. local() always affects global +variables, also called package variables or dynamic variables. + +C<my($x)> creates a new variable that is only visible in the current +subroutine. This is done at compile-time, so it is called lexical or +static scoping. 
my() always affects private variables, also called +lexical variables or (improperly) static(ly scoped) variables. + +For instance: + + sub visible { + print "var has value $var\n"; + } + + sub dynamic { + local $var = 'local'; # new temporary value for the still-global + visible(); # variable called $var + } + + sub lexical { + my $var = 'private'; # new private variable, $var + visible(); # (invisible outside of sub scope) + } + + $var = 'global'; + + visible(); # prints global + dynamic(); # prints local + lexical(); # prints global + +Notice how at no point does the value "private" get printed. That's +because $var only has that value within the block of the lexical() +function, and it is hidden from the called subroutine. + +In summary, local() doesn't make what you think of as private, local +variables. It gives a global variable a temporary value. my() is +what you're looking for if you want private variables. + +See L<perlsub/"Private Variables via my()"> and +L<perlsub/"Temporary Values via local()"> for excruciating details. + +=head2 How can I access a dynamic variable while a similarly named lexical is in scope? + +If you know your package, you can just mention it explicitly, as in +$Some_Pack::var. Note that the notation $::var is B<not> the dynamic $var +in the current package, but rather the one in the "main" package, as +though you had written $main::var. + + use vars '$var'; + local $var = "global"; + my $var = "lexical"; + + print "lexical is $var\n"; + print "global is $main::var\n"; + +Alternatively you can use the compiler directive our() to bring a +dynamic variable into the current lexical scope. + + require 5.006; # our() did not exist before 5.6 + use vars '$var'; + + local $var = "global"; + my $var = "lexical"; + + print "lexical is $var\n"; + + { + our $var; + print "global is $var\n"; + } + +=head2 What's the difference between deep and shallow binding? 
+ +In deep binding, lexical variables mentioned in anonymous subroutines +are the same ones that were in scope when the subroutine was created. +In shallow binding, they are whichever variables with the same names +happen to be in scope when the subroutine is called. Perl always uses +deep binding of lexical variables (i.e., those created with my()). +However, dynamic variables (aka global, local, or package variables) +are effectively shallowly bound. Consider this just one more reason +not to use them. See the answer to L<"What's a closure?">. + +=head2 Why doesn't "my($foo) = E<lt>$fhE<gt>;" work right? + +C<my()> and C<local()> give list context to the right hand side +of C<=>. The <$fh> read operation, like so many of Perl's +functions and operators, can tell which context it was called in and +behaves appropriately. In general, the scalar() function can help. +This function does nothing to the data itself (contrary to popular myth) +but rather tells its argument to behave in whatever its scalar fashion is. +If that function doesn't have a defined scalar behavior, this of course +doesn't help you (such as with sort()). + +To enforce scalar context in this particular case, however, you need +merely omit the parentheses: + + local($foo) = <$fh>; # WRONG + local($foo) = scalar(<$fh>); # ok + local $foo = <$fh>; # right + +You should probably be using lexical variables anyway, although the +issue is the same here: + + my($foo) = <$fh>; # WRONG + my $foo = <$fh>; # right + +=head2 How do I redefine a builtin function, operator, or method? + +Why do you want to do that? :-) + +If you want to override a predefined function, such as open(), +then you'll have to import the new definition from a different +module. See L<perlsub/"Overriding Built-in Functions">. + +If you want to overload a Perl operator, such as C<+> or C<**>, +then you'll want to use the C<use overload> pragma, documented +in L<overload>. 
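To make the C<use overload> case concrete, here is a small sketch. The class C<Vec2> and its methods are invented for illustration and do not come from the FAQ; only the C<overload> pragma itself is standard.

```perl
use strict;
use warnings;

package Vec2;                    # hypothetical two-dimensional vector class

use overload
    '+'  => \&add,               # called for $v1 + $v2
    '""' => \&to_string;         # called when the object is interpolated

sub new {
    my ($class, $x, $y) = @_;
    return bless { x => $x, y => $y }, $class;
}

sub add {
    my ($left, $right) = @_;     # overload also passes a "swapped" flag, unused here
    return Vec2->new($left->{x} + $right->{x}, $left->{y} + $right->{y});
}

sub to_string {
    my ($self) = @_;
    return "($self->{x}, $self->{y})";
}

package main;

my $sum = Vec2->new(1, 2) + Vec2->new(3, 4);
print "sum is $sum\n";
```

Only objects of the class are affected; the built-in meaning of C<+> for plain numbers is unchanged.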
+ +If you're talking about obscuring method calls in parent classes, +see L<perltoot/"Overridden Methods">. + +=head2 What's the difference between calling a function as &foo and foo()? + +(contributed by brian d foy) + +Calling a subroutine as C<&foo> with no trailing parentheses ignores +the prototype of C<foo> and passes it the current value of the argument +list, C<@_>. Here's an example; the C<bar> subroutine calls C<&foo>, +which prints its arguments list: + + sub bar { &foo } + + sub foo { print "Args in foo are: @_\n" } + + bar( qw( a b c ) ); + +When you call C<bar> with arguments, you see that C<foo> got the same C<@_>: + + Args in foo are: a b c + +Calling the subroutine with trailing parentheses, with or without arguments, +does not use the current C<@_> and respects the subroutine prototype. Changing +the example to put parentheses after the call to C<foo> changes the program: + + sub bar { &foo() } + + sub foo { print "Args in foo are: @_\n" } + + bar( qw( a b c ) ); + +Now the output shows that C<foo> doesn't get the C<@_> from its caller. + + Args in foo are: + +The main use of the C<@_> pass-through feature is to write subroutines +whose main job it is to call other subroutines for you. For further +details, see L<perlsub>. + +=head2 How do I create a switch or case statement? + +In Perl 5.10, use the C<given-when> construct described in L<perlsyn>: + + use 5.010; + + given ( $string ) { + when( 'Fred' ) { say "I found Fred!" } + when( 'Barney' ) { say "I found Barney!" } + when( /Bamm-?Bamm/ ) { say "I found Bamm-Bamm!" } + default { say "I don't recognize the name!" 
} + }; + +If one wants to use pure Perl and to be compatible with Perl versions +prior to 5.10, the general answer is to use C<if-elsif-else>: + + for ($variable_to_test) { + if (/pat1/) { } # do something + elsif (/pat2/) { } # do something else + elsif (/pat3/) { } # do something else + else { } # default + } + +Here's a simple example of a switch based on pattern matching, +lined up in a way to make it look more like a switch statement. +We'll do a multiway conditional based on the type of reference stored +in $whatchamacallit: + + SWITCH: for (ref $whatchamacallit) { + + /^$/ && die "not a reference"; + + /SCALAR/ && do { + print_scalar($$ref); + last SWITCH; + }; + + /ARRAY/ && do { + print_array(@$ref); + last SWITCH; + }; + + /HASH/ && do { + print_hash(%$ref); + last SWITCH; + }; + + /CODE/ && do { + warn "can't print function ref"; + last SWITCH; + }; + + # DEFAULT + + warn "User defined type skipped"; + + } + +See L<perlsyn> for other examples in this style. + +Sometimes you should change the positions of the constant and the variable. +For example, let's say you wanted to test which of many answers you were +given, but in a case-insensitive way that also allows abbreviations. +You can use the following technique if the strings all start with +different characters or if you want to arrange the matches so that +one takes precedence over another, as C<"SEND"> has precedence over +C<"STOP"> here: + + chomp($answer = <>); + if ("SEND" =~ /^\Q$answer/i) { print "Action is send\n" } + elsif ("STOP" =~ /^\Q$answer/i) { print "Action is stop\n" } + elsif ("ABORT" =~ /^\Q$answer/i) { print "Action is abort\n" } + elsif ("LIST" =~ /^\Q$answer/i) { print "Action is list\n" } + elsif ("EDIT" =~ /^\Q$answer/i) { print "Action is edit\n" } + +A totally different approach is to create a hash of function references. + + my %commands = ( + "happy" => \&joy, + "sad", => \&sullen, + "done" => sub { die "See ya!" }, + "mad" => \&angry, + ); + + print "How are you? 
"; + chomp($string = <STDIN>); + if ($commands{$string}) { + $commands{$string}->(); + } else { + print "No such command: $string\n"; + } + +Starting from Perl 5.8, a source filter module, C<Switch>, can also be +used to get switch and case. Its use is now discouraged, because it's +not fully compatible with the native switch of Perl 5.10, and because, +as it's implemented as a source filter, it doesn't always work as intended +when complex syntax is involved. + +=head2 How can I catch accesses to undefined variables, functions, or methods? + +The AUTOLOAD method, discussed in L<perlsub/"Autoloading"> and +L<perltoot/"AUTOLOAD: Proxy Methods">, lets you capture calls to +undefined functions and methods. + +When it comes to undefined variables that would trigger a warning +under C<use warnings>, you can promote the warning to an error. + + use warnings FATAL => qw(uninitialized); + +=head2 Why can't a method included in this same file be found? + +Some possible reasons: your inheritance is getting confused, you've +misspelled the method name, or the object is of the wrong type. Check +out L<perltoot> for details about any of the above cases. You may +also use C<print ref($object)> to find out the class C<$object> was +blessed into. + +Another possible reason for problems is that you've used the +indirect object syntax (eg, C<find Guru "Samy">) on a class name +before Perl has seen that such a package exists. It's wisest to make +sure your packages are all defined before you start using them, which +will be taken care of if you use the C<use> statement instead of +C<require>. If not, make sure to use arrow notation (eg., +C<< Guru->find("Samy") >>) instead. Object notation is explained in +L<perlobj>. + +Make sure to read about creating modules in L<perlmod> and +the perils of indirect objects in L<perlobj/"Method Invocation">. + +=head2 How can I find out my current or calling package? 
+ +(contributed by brian d foy) + +To find the package you are currently in, use the special literal +C<__PACKAGE__>, as documented in L<perldata>. You can only use the +special literals as separate tokens, so you can't interpolate them +into strings like you can with variables: + + my $current_package = __PACKAGE__; + print "I am in package $current_package\n"; + +If you want to find the package calling your code, perhaps to give better +diagnostics as L<Carp> does, use the C<caller> built-in: + + sub foo { + my @args = ...; + my( $package, $filename, $line ) = caller; + + print "I was called from package $package\n"; + } + +By default, your program starts in package C<main>, so you will +always be in some package. + +This is different from finding out the package an object is blessed +into, which might not be the current package. For that, use C<blessed> +from L<Scalar::Util>, part of the Standard Library since Perl 5.8: + + use Scalar::Util qw(blessed); + my $object_package = blessed( $object ); + +Most of the time, you shouldn't care what package an object is blessed +into, however, as long as it claims to inherit from that class: + + my $is_right_class = eval { $object->isa( $package ) }; # true or false + +And, with Perl 5.10 and later, you don't have to check for an +inheritance to see if the object can handle a role. For that, you can +use C<DOES>, which comes from C<UNIVERSAL>: + + my $class_does_it = eval { $object->DOES( $role ) }; # true or false + +You can safely replace C<isa> with C<DOES> (although the converse is not true). + +=head2 How can I comment out a large block of Perl code? + +(contributed by brian d foy) + +The quick-and-dirty way to comment out more than one line of Perl is +to surround those lines with Pod directives. You have to put these +directives at the beginning of the line and somewhere where Perl +expects a new statement (so not in the middle of statements like the C<#> +comments). 
You end the comment with C<=cut>, ending the Pod section: + + =pod + + my $object = NotGonnaHappen->new(); + + ignored_sub(); + + $wont_be_assigned = 37; + + =cut + +The quick-and-dirty method only works well when you don't plan to +leave the commented code in the source. If a Pod parser comes along, +your multiline comment is going to show up in the Pod translation. +A better way hides it from Pod parsers as well. + +The C<=begin> directive can mark a section for a particular purpose. +If the Pod parser doesn't want to handle it, it just ignores it. Label +the comments with C<comment>. End the comment using C<=end> with the +same label. You still need the C<=cut> to go back to Perl code from +the Pod comment: + + =begin comment + + my $object = NotGonnaHappen->new(); + + ignored_sub(); + + $wont_be_assigned = 37; + + =end comment + + =cut + +For more information on Pod, check out L<perlpod> and L<perlpodspec>. + +=head2 How do I clear a package? + +Use this code, provided by Mark-Jason Dominus: + + sub scrub_package { + no strict 'refs'; + my $pack = shift; + die "Shouldn't delete main package" + if $pack eq "" || $pack eq "main"; + my $stash = *{$pack . '::'}{HASH}; + my $name; + foreach $name (keys %$stash) { + my $fullname = $pack . '::' . $name; + # Get rid of everything with that name. + undef $$fullname; + undef @$fullname; + undef %$fullname; + undef &$fullname; + undef *$fullname; + } + } + +Or, if you're using a recent release of Perl, you can +just use the Symbol::delete_package() function instead. + +=head2 How can I use a variable as a variable name? + +Beginners often think they want to have a variable contain the name +of a variable. + + $fred = 23; + $varname = "fred"; + ++$$varname; # $fred now 24 + +This works I<sometimes>, but it is a very bad idea for two reasons. + +The first reason is that this technique I<only works on global +variables>. 
That means that if $fred is a lexical variable created +with my() in the above example, the code wouldn't work at all: you'd +accidentally access the global and skip right over the private lexical +altogether. Global variables are bad because they can easily collide +accidentally and in general make for non-scalable and confusing code. + +Symbolic references are forbidden under the C<use strict> pragma. +They are not true references and consequently are not reference-counted +or garbage-collected. + +The other reason why using a variable to hold the name of another +variable is a bad idea is that the question often stems from a lack of +understanding of Perl data structures, particularly hashes. By using +symbolic references, you are just using the package's symbol-table hash +(like C<%main::>) instead of a user-defined hash. The solution is to +use your own hash or a real reference instead. + + $USER_VARS{"fred"} = 23; + my $varname = "fred"; + $USER_VARS{$varname}++; # not $$varname++ + +There we're using the %USER_VARS hash instead of symbolic references. +Sometimes this comes up in reading strings from the user with variable +references and wanting to expand them to the values of your perl +program's variables. This is also a bad idea because it conflates the +program-addressable namespace and the user-addressable one. Instead of +reading a string and expanding it to the actual contents of your program's +own variables: + + $str = 'this has a $fred and $barney in it'; + $str =~ s/(\$\w+)/$1/eeg; # need double eval + +it would be better to keep a hash around like %USER_VARS and have +variable references actually refer to entries in that hash: + + $str =~ s/\$(\w+)/$USER_VARS{$1}/g; # no /e here at all + +That's faster, cleaner, and safer than the previous approach. Of course, +you don't need to use a dollar sign. You could use your own scheme to +make it less confusing, like bracketed percent symbols, etc. 
+ + $str = 'this has a %fred% and %barney% in it'; + $str =~ s/%(\w+)%/$USER_VARS{$1}/g; # no /e here at all + +Another reason that folks sometimes think they want a variable to +contain the name of a variable is that they don't know how to build +proper data structures using hashes. For example, let's say they +wanted two hashes in their program: %fred and %barney, and that they +wanted to use another scalar variable to refer to those by name. + + $name = "fred"; + $$name{WIFE} = "wilma"; # set %fred + + $name = "barney"; + $$name{WIFE} = "betty"; # set %barney + +This is still a symbolic reference, and is still saddled with the +problems enumerated above. It would be far better to write: + + $folks{"fred"}{WIFE} = "wilma"; + $folks{"barney"}{WIFE} = "betty"; + +And just use a multilevel hash to start with. + +The only times that you absolutely I<must> use symbolic references are +when you really must refer to the symbol table. This may be because it's +something that one can't take a real reference to, such as a format name. +Doing so may also be important for method calls, since these always go +through the symbol table for resolution. + +In those cases, you would turn off C<strict 'refs'> temporarily so you +can play around with the symbol table. For example: + + @colors = qw(red blue green yellow orange purple violet); + for my $name (@colors) { + no strict 'refs'; # renege for the block + *$name = sub { "<FONT COLOR='$name'>@_</FONT>" }; + } + +All those functions (red(), blue(), green(), etc.) appear to be separate, +but the real code in the closure actually was compiled only once. + +So, sometimes you might want to use symbolic references to manipulate +the symbol table directly. This doesn't matter for formats, handles, and +subroutines, because they are always global--you can't use my() on them. +For scalars, arrays, and hashes, though--and usually for subroutines-- +you probably only want to use hard references. 
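The advice above can be sketched as follows. This is a minimal illustration rather than code from the FAQ: the C<wife_of> helper and the C<%colorize> dispatch table are hypothetical names, and the data reuses the C<%folks> example. It runs under C<use strict> because every lookup goes through an ordinary hash or a code reference, never the symbol table.

```perl
use strict;
use warnings;

# Hypothetical data: one inner hash per person, as in the %folks example.
my %folks = (
    fred   => { WIFE => "wilma" },
    barney => { WIFE => "betty" },
);

# Look a person up by name: a plain hash lookup that yields a hard
# reference, not a symbolic reference into the package symbol table.
sub wife_of {
    my ($name) = @_;
    my $person = $folks{$name}
        or die "No such person: $name\n";
    return $person->{WIFE};
}

# "Named" subroutines without touching the symbol table: a dispatch
# table of closures keyed by color name (hypothetical helper table).
my %colorize;
for my $color (qw(red blue green)) {
    $colorize{$color} = sub { "<FONT COLOR='$color'>@_</FONT>" };
}

print wife_of("barney"), "\n";
print $colorize{red}->("stop"), "\n";
```

The trade-off is that callers write C<< $colorize{red}->("stop") >> instead of C<red("stop")>, but nothing needs C<no strict 'refs'> and a mistyped name fails at the hash lookup rather than silently creating a global.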
+ +=head2 What does "bad interpreter" mean? + +(contributed by brian d foy) + +The "bad interpreter" message comes from the shell, not perl. The +actual message may vary depending on your platform, shell, and locale +settings. + +If you see "bad interpreter - no such file or directory", the first +line in your perl script (the "shebang" line) does not contain the +right path to perl (or any other program capable of running scripts). +Sometimes this happens when you move the script from one machine to +another and each machine has a different path to perl--/usr/bin/perl +versus /usr/local/bin/perl for instance. It may also indicate +that the source machine has CRLF line terminators and the +destination machine has LF only: the shell tries to find +/usr/bin/perl<CR>, but can't. + +If you see "bad interpreter: Permission denied", you need to make your +script executable. + +In either case, you should still be able to run the scripts with perl +explicitly: + + % perl script.pl + +If you get a message like "perl: command not found", perl is not in +your PATH, which might also mean that the location of perl is not +where you expect it so you need to adjust your shebang line. + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples in this file +are hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun +or for profit as you see fit. A simple comment in the code giving +credit would be courteous but is not required. 
diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq8.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq8.pod new file mode 100644 index 00000000000..1c7793e3558 --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq8.pod @@ -0,0 +1,1422 @@ +=head1 NAME + +perlfaq8 - System Interaction + +=head1 DESCRIPTION + +This section of the Perl FAQ covers questions involving operating +system interaction. Topics include interprocess communication (IPC), +control over the user-interface (keyboard, screen and pointing +devices), and most anything else not related to data manipulation. + +Read the FAQs and documentation specific to the port of perl to your +operating system (eg, L<perlvms>, L<perlplan9>, ...). These should +contain more detailed information on the vagaries of your perl. + +=head2 How do I find out which operating system I'm running under? + +The C<$^O> variable (C<$OSNAME> if you use C<English>) contains an +indication of the name of the operating system (not its release +number) that your perl binary was built for. + +=head2 How come exec() doesn't return? +X<exec> X<system> X<fork> X<open> X<pipe> + +(contributed by brian d foy) + +The C<exec> function's job is to turn your process into another +command and never to return. If that's not what you want to do, don't +use C<exec>. :) + +If you want to run an external command and still keep your Perl process +going, look at a piped C<open>, C<fork>, or C<system>. + +=head2 How do I do fancy stuff with the keyboard/screen/mouse? + +How you access/control keyboards, screens, and pointing devices +("mice") is system-dependent. 
Try the following modules: + +=over 4 + +=item Keyboard + + Term::Cap Standard perl distribution + Term::ReadKey CPAN + Term::ReadLine::Gnu CPAN + Term::ReadLine::Perl CPAN + Term::Screen CPAN + +=item Screen + + Term::Cap Standard perl distribution + Curses CPAN + Term::ANSIColor CPAN + +=item Mouse + + Tk CPAN + Wx CPAN + Gtk2 CPAN + Qt4 kdebindings4 package + +=back + +Some of these specific cases are shown as examples in other answers +in this section of the perlfaq. + +=head2 How do I print something out in color? + +In general, you don't, because you don't know whether +the recipient has a color-aware display device. If you +know that they have an ANSI terminal that understands +color, you can use the L<Term::ANSIColor> module from CPAN: + + use Term::ANSIColor; + print color("red"), "Stop!\n", color("reset"); + print color("green"), "Go!\n", color("reset"); + +Or like this: + + use Term::ANSIColor qw(:constants); + print RED, "Stop!\n", RESET; + print GREEN, "Go!\n", RESET; + +=head2 How do I read just one key without waiting for a return key? + +Controlling input buffering is a remarkably system-dependent matter. +On many systems, you can just use the B<stty> command as shown in +L<perlfunc/getc>, but as you see, that's already getting you into +portability snags. + + open(TTY, "+</dev/tty") or die "no tty: $!"; + system "stty cbreak </dev/tty >/dev/tty 2>&1"; + $key = getc(TTY); # perhaps this works + # OR ELSE + sysread(TTY, $key, 1); # probably this does + system "stty -cbreak </dev/tty >/dev/tty 2>&1"; + +The L<Term::ReadKey> module from CPAN offers an easy-to-use interface that +should be more efficient than shelling out to B<stty> for each key. +It even includes limited support for Windows. + + use Term::ReadKey; + ReadMode('cbreak'); + $key = ReadKey(0); + ReadMode('normal'); + +However, using the code requires that you have a working C compiler +and can use it to build and install a CPAN module. 
Here's a solution +using the standard L<POSIX> module, which is already on your system +(assuming your system supports POSIX). + + use HotKey; + $key = readkey(); + +And here's the C<HotKey> module, which hides the somewhat mystifying calls +to manipulate the POSIX termios structures. + + # HotKey.pm + package HotKey; + + use strict; + use warnings; + + use parent 'Exporter'; + our @EXPORT = qw(cbreak cooked readkey); + + use POSIX qw(:termios_h); + my ($term, $oterm, $echo, $noecho, $fd_stdin); + + $fd_stdin = fileno(STDIN); + $term = POSIX::Termios->new(); + $term->getattr($fd_stdin); + $oterm = $term->getlflag(); + + $echo = ECHO | ECHOK | ICANON; + $noecho = $oterm & ~$echo; + + sub cbreak { + $term->setlflag($noecho); # ok, so i don't want echo either + $term->setcc(VTIME, 1); + $term->setattr($fd_stdin, TCSANOW); + } + + sub cooked { + $term->setlflag($oterm); + $term->setcc(VTIME, 0); + $term->setattr($fd_stdin, TCSANOW); + } + + sub readkey { + my $key = ''; + cbreak(); + sysread(STDIN, $key, 1); + cooked(); + return $key; + } + + END { cooked() } + + 1; + +=head2 How do I check whether input is ready on the keyboard? + +The easiest way to do this is to read a key in nonblocking mode with the +L<Term::ReadKey> module from CPAN, passing it an argument of -1 to indicate +not to block: + + use Term::ReadKey; + + ReadMode('cbreak'); + + if (defined (my $char = ReadKey(-1)) ) { + # input was waiting and it was $char + } else { + # no input was waiting + } + + ReadMode('normal'); # restore normal tty settings + +=head2 How do I clear the screen? + +(contributed by brian d foy) + +To clear the screen, you just have to print the special sequence +that tells the terminal to clear the screen. Once you have that +sequence, output it when you want to clear the screen. + +You can use the L<Term::ANSIScreen> module to get the special +sequence. 
Import the C<cls> function (or the C<:screen> tag): + + use Term::ANSIScreen qw(cls); + my $clear_screen = cls(); + + print $clear_screen; + +The L<Term::Cap> module can also get the special sequence if you want +to deal with the low-level details of terminal control. The C<Tputs> +method returns the string for the given capability: + + use Term::Cap; + + my $terminal = Term::Cap->Tgetent( { OSPEED => 9600 } ); + my $clear_string = $terminal->Tputs('cl'); + + print $clear_string; + +On Windows, you can use the L<Win32::Console> module. After creating +an object for the output filehandle you want to affect, call the +C<Cls> method, which clears the console directly: + + use Win32::Console; + + my $OUT = Win32::Console->new(STD_OUTPUT_HANDLE); + $OUT->Cls; + +If you have a command-line program that does the job, you can call +it in backticks to capture whatever it outputs so you can use it +later: + + my $clear_string = `clear`; + + print $clear_string; + +=head2 How do I get the screen size? + +If you have the L<Term::ReadKey> module installed from CPAN, +you can use it to fetch the width and height in characters +and in pixels: + + use Term::ReadKey; + my ($wchar, $hchar, $wpixels, $hpixels) = GetTerminalSize(); + +This is more portable than the raw C<ioctl>, but not as +illustrative: + + require 'sys/ioctl.ph'; + die "no TIOCGWINSZ " unless defined &TIOCGWINSZ; + open(my $tty_fh, "+</dev/tty") or die "No tty: $!"; + unless (ioctl($tty_fh, &TIOCGWINSZ, $winsize='')) { + die sprintf "$0: ioctl TIOCGWINSZ (%08x: $!)\n", &TIOCGWINSZ; + } + my ($row, $col, $xpixel, $ypixel) = unpack('S4', $winsize); + print "(row,col) = ($row,$col)"; + print " (xpixel,ypixel) = ($xpixel,$ypixel)" if $xpixel || $ypixel; + print "\n"; + +=head2 How do I ask the user for a password? + +(This question has nothing to do with the web. See a different +FAQ for that.) + +There's an example of this in L<perlfunc/crypt>. 
First, you put the +terminal into "no echo" mode, then just read the password normally. +You may do this with an old-style C<ioctl()> function, POSIX terminal +control (see L<POSIX> or its documentation in the Camel Book), or a call +to the B<stty> program, with varying degrees of portability. + +You can also do this for most systems using the L<Term::ReadKey> module +from CPAN, which is easier to use and in theory more portable. + + use Term::ReadKey; + + ReadMode('noecho'); + my $password = ReadLine(0); + +=head2 How do I read and write the serial port? + +This depends on which operating system your program is running on. In +the case of Unix, the serial ports will be accessible through files in +C</dev>; on other systems, device names will doubtless differ. +Several problem areas common to all device interaction are the +following: + +=over 4 + +=item lockfiles + +Your system may use lockfiles to control multiple access. Make sure +you follow the correct protocol. Unpredictable behavior can result +from multiple processes reading from one device. + +=item open mode + +If you expect to use both read and write operations on the device, +you'll have to open it for update (see L<perlfunc/"open"> for +details). You may wish to open it without running the risk of +blocking by using C<sysopen()> and C<O_RDWR|O_NDELAY|O_NOCTTY> from the +L<Fcntl> module (part of the standard perl distribution). See +L<perlfunc/"sysopen"> for more on this approach. + +=item end of line + +Some devices will be expecting a "\r" at the end of each line rather +than a "\n". In some ports of perl, "\r" and "\n" are different from +their usual (Unix) ASCII values of "\015" and "\012". You may have to +give the numeric values you want directly, using octal ("\015"), hex +("\x0D"), or as a control-character specification ("\cM"). 
+ + print DEV "atv1\012"; # wrong, for some devices + print DEV "atv1\015"; # right, for some devices + +Even though with normal text files a "\n" will do the trick, there is +still no unified scheme for terminating a line that is portable +between Unix, DOS/Win, and Macintosh, except to terminate I<ALL> line +ends with "\015\012", and strip what you don't need from the output. +This applies especially to socket I/O and autoflushing, discussed +next. + +=item flushing output + +If you expect characters to get to your device when you C<print()> them, +you'll want to autoflush that filehandle. You can use C<select()> +and the C<$|> variable to control autoflushing (see L<perlvar/$E<verbar>> +and L<perlfunc/select>, or L<perlfaq5>, "How do I flush/unbuffer an +output filehandle? Why must I do this?"): + + my $old_handle = select($dev_fh); + $| = 1; + select($old_handle); + +You'll also see code that does this without a temporary variable, as in + + select((select($dev_fh), $| = 1)[0]); + +Or if you don't mind pulling in a few thousand lines +of code just because you're afraid of a little C<$|> variable: + + use IO::Handle; + $dev_fh->autoflush(1); + +As mentioned in the previous item, this still doesn't work when using +socket I/O between Unix and Macintosh. You'll need to hard code your +line terminators, in that case. + +=item non-blocking input + +If you are doing a blocking C<read()> or C<sysread()>, you'll have to +arrange for an alarm handler to provide a timeout (see +L<perlfunc/alarm>). If you have a non-blocking open, you'll likely +have a non-blocking read, which means you may have to use a 4-arg +C<select()> to determine whether I/O is ready on that device (see +L<perlfunc/"select">). 
+ +=back + +While trying to read from his caller-id box, the notorious Jamie +Zawinski C<< <jwz@netscape.com> >>, after much gnashing of teeth and +fighting with C<sysread>, C<sysopen>, POSIX's C<tcgetattr> business, +and various other functions that go bump in the night, finally came up +with this: + + sub open_modem { + use IPC::Open2; + my $stty = `/bin/stty -g`; + open2( \*MODEM_IN, \*MODEM_OUT, "cu -l$modem_device -s2400 2>&1"); + # starting cu hoses /dev/tty's stty settings, even when it has + # been opened on a pipe... + system("/bin/stty $stty"); + $_ = <MODEM_IN>; + chomp; + if ( !m/^Connected/ ) { + print STDERR "$0: cu printed `$_' instead of `Connected'\n"; + } + } + +=head2 How do I decode encrypted password files? + +You spend lots and lots of money on dedicated hardware, but this is +bound to get you talked about. + +Seriously, you can't if they are Unix password files--the Unix +password system employs one-way encryption. It's more like hashing +than encryption. The best you can do is check whether something else +hashes to the same string. You can't turn a hash back into the +original string. Programs like Crack can forcibly (and intelligently) +try to guess passwords, but don't (can't) guarantee quick success. + +If you're worried about users selecting bad passwords, you should +proactively check when they try to change their password (by modifying +L<passwd(1)>, for example). + +=head2 How do I start a process in the background? + +(contributed by brian d foy) + +There's not a single way to run code in the background so you don't +have to wait for it to finish before your program moves on to other +tasks. Process management depends on your particular operating system, +and many of the techniques are covered in L<perlipc>. + +Several CPAN modules may be able to help, including L<IPC::Open2> or +L<IPC::Open3>, L<IPC::Run>, L<Parallel::Jobs>, +L<Parallel::ForkManager>, L<POE>, L<Proc::Background>, and +L<Win32::Process>. 
There are many other modules you might use, so +check those namespaces for other options too. + +If you are on a Unix-like system, you might be able to get away with a +system call where you put an C<&> on the end of the command: + + system("cmd &") + +You can also try using C<fork>, as described in L<perlfunc> (although +this is the same thing that many of the modules will do for you). + +=over 4 + +=item STDIN, STDOUT, and STDERR are shared + +Both the main process and the backgrounded one (the "child" process) +share the same STDIN, STDOUT and STDERR filehandles. If both try to +access them at once, strange things can happen. You may want to close +or reopen these for the child. You can get around this with +C<open>ing a pipe (see L<perlfunc/"open">) but on some systems this +means that the child process cannot outlive the parent. + +=item Signals + +You'll have to catch the SIGCHLD signal, and possibly SIGPIPE too. +SIGCHLD is sent when the backgrounded process finishes. SIGPIPE is +sent when you write to a filehandle whose child process has closed (an +untrapped SIGPIPE can cause your program to silently die). This is +not an issue with C<system("cmd&")>. + +=item Zombies + +You have to be prepared to "reap" the child process when it finishes. + + $SIG{CHLD} = sub { wait }; + + $SIG{CHLD} = 'IGNORE'; + +You can also use a double fork. You immediately C<wait()> for your +first child, and the init daemon will C<wait()> for your grandchild once +it exits. + + unless ($pid = fork) { + unless (fork) { + exec "what you really wanna do"; + die "exec failed!"; + } + exit 0; + } + waitpid($pid, 0); + +See L<perlipc/"Signals"> for other examples of code to do this. +Zombies are not an issue with C<system("prog &")>. + +=back + +=head2 How do I trap control characters/signals? + +You don't actually "trap" a control character. 
Instead, that character +generates a signal which is sent to your terminal's currently +foregrounded process group, which you then trap in your process. +Signals are documented in L<perlipc/"Signals"> and the +section on "Signals" in the Camel. + +You can set the values of the C<%SIG> hash to be the functions you want +to handle the signal. After perl catches the signal, it looks in C<%SIG> +for a key with the same name as the signal, then calls the subroutine +value for that key. + + # as an anonymous subroutine + + $SIG{INT} = sub { syswrite(STDERR, "ouch\n", 5 ) }; + + # or a reference to a function + + $SIG{INT} = \&ouch; + + # or the name of the function as a string + + $SIG{INT} = "ouch"; + +Perl versions before 5.8 had in its C source code signal handlers which +would catch the signal and possibly run a Perl function that you had set +in C<%SIG>. This violated the rules of signal handling at that level +causing perl to dump core. Since version 5.8.0, perl looks at C<%SIG> +B<after> the signal has been caught, rather than while it is being caught. +Previous versions of this answer were incorrect. + +=head2 How do I modify the shadow password file on a Unix system? + +If perl was installed correctly and your shadow library was written +properly, the C<getpw*()> functions described in L<perlfunc> should in +theory provide (read-only) access to entries in the shadow password +file. To change the file, make a new shadow password file (the format +varies from system to system--see L<passwd(1)> for specifics) and use +C<pwd_mkdb(8)> to install it (see L<pwd_mkdb(8)> for more details). + +=head2 How do I set the time and date? + +Assuming you're running under sufficient permissions, you should be +able to set the system-wide date and time by running the C<date(1)> +program. (There is no way to set the time and date on a per-process +basis.) This mechanism will work for Unix, MS-DOS, Windows, and NT; +the VMS equivalent is C<set time>. 
+ +However, if all you want to do is change your time zone, you can +probably get away with setting an environment variable: + + $ENV{TZ} = "MST7MDT"; # Unixish + $ENV{'SYS$TIMEZONE_DIFFERENTIAL'} = "-5"; # vms + system('trn', 'comp.lang.perl.misc'); + +=head2 How can I sleep() or alarm() for under a second? +X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select> + +If you want finer granularity than the 1 second that the C<sleep()> +function provides, the easiest way is to use the C<select()> function as +documented in L<perlfunc/"select">. Try the L<Time::HiRes> and +the L<BSD::Itimer> modules (available from CPAN, and starting from +Perl 5.8 L<Time::HiRes> is part of the standard distribution). + +=head2 How can I measure time under a second? +X<Time::HiRes> X<BSD::Itimer> X<sleep> X<select> + +(contributed by brian d foy) + +The L<Time::HiRes> module (part of the standard distribution as of +Perl 5.8) measures time with the C<gettimeofday()> system call, which +returns the time since the epoch in seconds and microseconds. If you +can't install L<Time::HiRes> for older Perls and you are on a Unixish +system, you may be able to call C<gettimeofday(2)> directly. See +L<perlfunc/syscall>. + +=head2 How can I do an atexit() or setjmp()/longjmp()? (Exception handling) + +You can use the C<END> block to simulate C<atexit()>. Each package's +C<END> block is called when the program or thread ends. See the L<perlmod> +manpage for more details about C<END> blocks. + +For example, you can use this to make sure your filter program managed +to finish its output without filling up the disk: + + END { + close(STDOUT) || die "stdout close failed: $!"; + } + +The C<END> block isn't called when untrapped signals kill the program, +though, so if you use C<END> blocks you should also use + + use sigtrap qw(die normal-signals); + +Perl's exception-handling mechanism is its C<eval()> operator. You +can use C<eval()> as C<setjmp> and C<die()> as C<longjmp>. 
For +details of this, see the section on signals, especially the time-out +handler for a blocking C<flock()> in L<perlipc/"Signals"> or the +section on "Signals" in I<Programming Perl>. + +If exception handling is all you're interested in, use one of the +many CPAN modules that handle exceptions, such as L<Try::Tiny>. + +If you want the C<atexit()> syntax (and an C<rmexit()> as well), try the +C<AtExit> module available from CPAN. + +=head2 Why doesn't my sockets program work under System V (Solaris)? What does the error message "Protocol not supported" mean? + +Some Sys-V based systems, notably Solaris 2.X, redefined some of the +standard socket constants. Since these were constant across all +architectures, they were often hardwired into perl code. The proper +way to deal with this is to "use Socket" to get the correct values. + +Note that even though SunOS and Solaris are binary compatible, these +values are different. Go figure. + +=head2 How can I call my system's unique C functions from Perl? + +In most cases, you write an external module to do it--see the answer +to "Where can I learn about linking C with Perl? [h2xs, xsubpp]". +However, if the function is a system call, and your system supports +C<syscall()>, you can use the C<syscall> function (documented in +L<perlfunc>). + +Remember to check the modules that came with your distribution, and +CPAN as well--someone may already have written a module to do it. On +Windows, try L<Win32::API>. On Macs, try L<Mac::Carbon>. If no module +has an interface to the C function, you can inline a bit of C in your +Perl source with L<Inline::C>. + +=head2 Where do I get the include files to do ioctl() or syscall()? + +Historically, these would be generated by the L<h2ph> tool, part of the +standard perl distribution. This program converts C<cpp(1)> directives +in C header files to files containing subroutine definitions, like +C<&SYS_getitimer>, which you can use as arguments to your functions. 
+It doesn't work perfectly, but it usually gets most of the job done. +Simple files like F<errno.h>, F<syscall.h>, and F<socket.h> were fine, +but the hard ones like F<ioctl.h> nearly always need to be hand-edited. +Here's how to install the *.ph files: + + 1. Become the super-user + 2. cd /usr/include + 3. h2ph *.h */*.h + +If your system supports dynamic loading, for reasons of portability and +sanity you probably ought to use L<h2xs> (also part of the standard perl +distribution). This tool converts C header files to Perl extensions. +See L<perlxstut> for how to get started with L<h2xs>. + +If your system doesn't support dynamic loading, you still probably +ought to use L<h2xs>. See L<perlxstut> and L<ExtUtils::MakeMaker> for +more information (in brief, just use B<make perl> instead of a plain +B<make> to rebuild perl with a new static extension). + +=head2 Why do setuid perl scripts complain about kernel problems? + +Some operating systems have bugs in the kernel that make setuid +scripts inherently insecure. Perl gives you a number of options +(described in L<perlsec>) to work around such systems. + +=head2 How can I open a pipe both to and from a command? + +The L<IPC::Open2> module (part of the standard perl distribution) is +an easy-to-use approach that internally uses C<pipe()>, C<fork()>, and +C<exec()> to do the job. Make sure you read the deadlock warnings in +its documentation, though (see L<IPC::Open2>). See also +L<perlipc/"Bidirectional Communication with Another Process"> and +L<perlipc/"Bidirectional Communication with Yourself">. + +You may also use the L<IPC::Open3> module (part of the standard perl +distribution), but be warned that it has a different order of +arguments from L<IPC::Open2> (see L<IPC::Open3>). + +=head2 Why can't I get the output of a command with system()? + +You're confusing the purpose of C<system()> and backticks (``). 
C<system()> +runs a command and returns exit status information (as a 16 bit value: +the low 7 bits are the signal the process died from, if any, and +the high 8 bits are the actual exit value). Backticks (``) run a +command and return what it sent to STDOUT. + + my $exit_status = system("mail-users"); + my $output_string = `ls`; + +=head2 How can I capture STDERR from an external command? + +There are three basic ways of running external commands: + + system $cmd; # using system() + my $output = `$cmd`; # using backticks (``) + open (my $pipe_fh, "$cmd |"); # using open() + +With C<system()>, both STDOUT and STDERR will go the same place as the +script's STDOUT and STDERR, unless the C<system()> command redirects them. +Backticks and C<open()> read B<only> the STDOUT of your command. + +You can also use the C<open3()> function from L<IPC::Open3>. Benjamin +Goldberg provides some sample code: + +To capture a program's STDOUT, but discard its STDERR: + + use IPC::Open3; + use File::Spec; + use Symbol qw(gensym); + open(NULL, ">", File::Spec->devnull); + my $pid = open3(gensym, \*PH, ">&NULL", "cmd"); + while( <PH> ) { } + waitpid($pid, 0); + +To capture a program's STDERR, but discard its STDOUT: + + use IPC::Open3; + use File::Spec; + use Symbol qw(gensym); + open(NULL, ">", File::Spec->devnull); + my $pid = open3(gensym, ">&NULL", \*PH, "cmd"); + while( <PH> ) { } + waitpid($pid, 0); + +To capture a program's STDERR, and let its STDOUT go to our own STDERR: + + use IPC::Open3; + use Symbol qw(gensym); + my $pid = open3(gensym, ">&STDERR", \*PH, "cmd"); + while( <PH> ) { } + waitpid($pid, 0); + +To read both a command's STDOUT and its STDERR separately, you can +redirect them to temp files, let the command run, then read the temp +files: + + use IPC::Open3; + use Symbol qw(gensym); + use IO::File; + local *CATCHOUT = IO::File->new_tmpfile; + local *CATCHERR = IO::File->new_tmpfile; + my $pid = open3(gensym, ">&CATCHOUT", ">&CATCHERR", "cmd"); + waitpid($pid, 0); + 
seek $_, 0, 0 for \*CATCHOUT, \*CATCHERR; + while( <CATCHOUT> ) {} + while( <CATCHERR> ) {} + +But there's no real need for B<both> to be tempfiles... the following +should work just as well, without deadlocking: + + use IPC::Open3; + use Symbol qw(gensym); + use IO::File; + local *CATCHERR = IO::File->new_tmpfile; + my $pid = open3(gensym, \*CATCHOUT, ">&CATCHERR", "cmd"); + while( <CATCHOUT> ) {} + waitpid($pid, 0); + seek CATCHERR, 0, 0; + while( <CATCHERR> ) {} + +And it'll be faster, too, since we can begin processing the program's +stdout immediately, rather than waiting for the program to finish. + +With any of these, you can change file descriptors before the call: + + open(STDOUT, ">logfile"); + system("ls"); + +or you can use Bourne shell file-descriptor redirection: + + $output = `$cmd 2>some_file`; + open (PIPE, "cmd 2>some_file |"); + +You can also use file-descriptor redirection to make STDERR a +duplicate of STDOUT: + + $output = `$cmd 2>&1`; + open (PIPE, "cmd 2>&1 |"); + +Note that you I<cannot> simply open STDERR to be a dup of STDOUT +in your Perl program and avoid calling the shell to do the redirection. +This doesn't work: + + open(STDERR, ">&STDOUT"); + $alloutput = `cmd args`; # stderr still escapes + +This fails because the C<open()> makes STDERR go to where STDOUT was +going at the time of the C<open()>. The backticks then make STDOUT go to +a string, but don't change STDERR (which still goes to the old +STDOUT). + +Note that you I<must> use Bourne shell (C<sh(1)>) redirection syntax in +backticks, not C<csh(1)>! Details on why Perl's C<system()> and backtick +and pipe opens all use the Bourne shell are in the +F<versus/csh.whynot> article in the "Far More Than You Ever Wanted To +Know" collection in L<http://www.cpan.org/misc/olddoc/FMTEYEWTK.tgz> . 
To +capture a command's STDERR and STDOUT together: + + $output = `cmd 2>&1`; # either with backticks + $pid = open(PH, "cmd 2>&1 |"); # or with an open pipe + while (<PH>) { } # plus a read + +To capture a command's STDOUT but discard its STDERR: + + $output = `cmd 2>/dev/null`; # either with backticks + $pid = open(PH, "cmd 2>/dev/null |"); # or with an open pipe + while (<PH>) { } # plus a read + +To capture a command's STDERR but discard its STDOUT: + + $output = `cmd 2>&1 1>/dev/null`; # either with backticks + $pid = open(PH, "cmd 2>&1 1>/dev/null |"); # or with an open pipe + while (<PH>) { } # plus a read + +To exchange a command's STDOUT and STDERR in order to capture the STDERR +but leave its STDOUT to come out our old STDERR: + + $output = `cmd 3>&1 1>&2 2>&3 3>&-`; # either with backticks + $pid = open(PH, "cmd 3>&1 1>&2 2>&3 3>&-|");# or with an open pipe + while (<PH>) { } # plus a read + +To read both a command's STDOUT and its STDERR separately, it's easiest +to redirect them separately to files, and then read from those files +when the program is done: + + system("program args 1>program.stdout 2>program.stderr"); + +Ordering is important in all these examples. That's because the shell +processes file descriptor redirections in strictly left to right order. + + system("prog args 1>tmpfile 2>&1"); + system("prog args 2>&1 1>tmpfile"); + +The first command sends both standard out and standard error to the +temporary file. The second command sends only the old standard output +there, and the old standard error shows up on the old standard out. + +=head2 Why doesn't open() return an error when a pipe open fails? + +If the second argument to a piped C<open()> contains shell +metacharacters, perl C<fork()>s, then C<exec()>s a shell to decode the +metacharacters and eventually run the desired program. If the program +couldn't be run, it's the shell that gets the message, not Perl. 
All +your Perl program can find out is whether the shell itself could be +successfully started. You can still capture the shell's STDERR and +check it for error messages. See L<"How can I capture STDERR from an +external command?"> elsewhere in this document, or use the +L<IPC::Open3> module. + +If there are no shell metacharacters in the argument of C<open()>, Perl +runs the command directly, without using the shell, and can correctly +report whether the command started. + +=head2 What's wrong with using backticks in a void context? + +Strictly speaking, nothing. Stylistically speaking, it's not a good +way to write maintainable code. Perl has several operators for +running external commands. Backticks are one; they collect the output +from the command for use in your program. The C<system> function is +another; it doesn't do this. + +Writing backticks in your program sends a clear message to the readers +of your code that you wanted to collect the output of the command. +Why send a clear message that isn't true? + +Consider this line: + + `cat /etc/termcap`; + +You forgot to check C<$?> to see whether the program even ran +correctly. Even if you wrote + + print `cat /etc/termcap`; + +this code could and probably should be written as + + system("cat /etc/termcap") == 0 + or die "cat program failed!"; + +which will echo the cat command's output as it is generated, instead +of waiting until the program has completed to print it out. It also +checks the return value. + +C<system> also provides direct control over whether shell wildcard +processing may take place, whereas backticks do not. + +=head2 How can I call backticks without shell processing? + +This is a bit tricky. You can't simply write the command +like this: + + @ok = `grep @opts '$search_string' @filenames`; + +As of Perl 5.8.0, you can use C<open()> with multiple arguments. +Just like the list forms of C<system()> and C<exec()>, no shell +escapes happen. 
+ + open( GREP, "-|", 'grep', @opts, $search_string, @filenames ); + chomp(@ok = <GREP>); + close GREP; + +You can also: + + my @ok = (); + if (open(GREP, "-|")) { + while (<GREP>) { + chomp; + push(@ok, $_); + } + close GREP; + } else { + exec 'grep', @opts, $search_string, @filenames; + } + +Just as with C<system()>, no shell escapes happen when you C<exec()> a +list. Further examples of this can be found in L<perlipc/"Safe Pipe +Opens">. + +Note that if you're using Windows, no solution to this vexing issue is +even possible. Even though Perl emulates C<fork()>, you'll still be +stuck, because Windows does not have an argc/argv-style API. + +=head2 Why can't my script read from STDIN after I gave it EOF (^D on Unix, ^Z on MS-DOS)? + +This happens only if your perl is compiled to use stdio instead of +perlio, which is the default. Some (maybe all?) stdios set error and +eof flags that you may need to clear. The L<POSIX> module defines +C<clearerr()> that you can use. That is the technically correct way to +do it. Here are some less reliable workarounds: + +=over 4 + +=item 1 + +Try keeping around the seekpointer and go there, like this: + + my $where = tell($log_fh); + seek($log_fh, $where, 0); + +=item 2 + +If that doesn't work, try seeking to a different part of the file and +then back. + +=item 3 + +If that doesn't work, try seeking to a different part of +the file, reading something, and then seeking back. + +=item 4 + +If that doesn't work, give up on your stdio package and use sysread. + +=back + +=head2 How can I convert my shell script to perl? + +Learn Perl and rewrite it. Seriously, there's no simple converter. +Things that are awkward to do in the shell are easy to do in Perl, and +this very awkwardness is what would make a shell->perl converter +nigh-on impossible to write. 
By rewriting it, you'll think about what +you're really trying to do, and hopefully will escape the shell's +pipeline datastream paradigm, which while convenient for some matters, +causes many inefficiencies. + +=head2 Can I use perl to run a telnet or ftp session? + +Try the L<Net::FTP>, L<TCP::Client>, and L<Net::Telnet> modules +(available from CPAN). +L<http://www.cpan.org/scripts/netstuff/telnet.emul.shar> will also help +for emulating the telnet protocol, but L<Net::Telnet> is quite +probably easier to use. + +If all you want to do is pretend to be telnet but don't need +the initial telnet handshaking, then the standard dual-process +approach will suffice: + + use IO::Socket; # new in 5.004 + my $handle = IO::Socket::INET->new('www.perl.com:80') + or die "can't connect to port 80 on www.perl.com $!"; + $handle->autoflush(1); + if (fork()) { # XXX: undef means failure + select($handle); + print while <STDIN>; # everything from stdin to socket + } else { + print while <$handle>; # everything from socket to stdout + } + close $handle; + exit; + +=head2 How can I write expect in Perl? + +Once upon a time, there was a library called F<chat2.pl> (part of the +standard perl distribution), which never really got finished. If you +find it somewhere, I<don't use it>. These days, your best bet is to +look at the L<Expect> module available from CPAN, which also requires two +other modules from CPAN, L<IO::Pty> and L<IO::Stty>. + +=head2 Is there a way to hide perl's command line from programs such as "ps"? + +First of all note that if you're doing this for security reasons (to +avoid people seeing passwords, for example) then you should rewrite +your program so that critical information is never given as an +argument. Hiding the arguments won't make your program completely +secure. + +To actually alter the visible command line, you can assign to the +variable $0 as documented in L<perlvar>. This won't work on all +operating systems, though. 
Daemon programs like sendmail place their +state there, as in: + + $0 = "orcus [accepting connections]"; + +=head2 I {changed directory, modified my environment} in a perl script. How come the change disappeared when I exited the script? How do I get my changes to be visible? + +=over 4 + +=item Unix + +In the strictest sense, it can't be done--the script executes as a +different process from the shell it was started from. Changes to a +process are not reflected in its parent--only in any children +created after the change. There is shell magic that may allow you to +fake it by C<eval()>ing the script's output in your shell; check out the +comp.unix.questions FAQ for details. + +=back + +=head2 How do I close a process's filehandle without waiting for it to complete? + +Assuming your system supports such things, just send an appropriate signal +to the process (see L<perlfunc/"kill">). It's common to first send a TERM +signal, wait a little bit, and then send a KILL signal to finish it off. + +=head2 How do I fork a daemon process? + +If by daemon process you mean one that's detached (disassociated from +its tty), then the following process is reported to work on most +Unixish systems. Non-Unix users should check their Your_OS::Process +module for other solutions. + +=over 4 + +=item * + +Open /dev/tty and use the TIOCNOTTY ioctl on it. See L<tty(1)> +for details. Or better yet, you can just use the C<POSIX::setsid()> +function, so you don't have to worry about process groups. + +=item * + +Change directory to / + +=item * + +Reopen STDIN, STDOUT, and STDERR so they're not connected to the old +tty. + +=item * + +Background yourself like this: + + fork && exit; + +=back + +The L<Proc::Daemon> module, available from CPAN, provides a function to +perform these actions for you. + +=head2 How do I find out if I'm running interactively or not? + +(contributed by brian d foy) + +This is a difficult question to answer, and the best answer is +only a guess. 
+ +What do you really want to know? If you merely want to know if one of +your filehandles is connected to a terminal, you can try the C<-t> +file test: + + if( -t STDOUT ) { + print "I'm connected to a terminal!\n"; + } + +However, you might be out of luck if you expect that to mean there is a +real person on the other side. With the L<Expect> module, another +program can pretend to be a person. The program might even come close +to passing the Turing test. + +The L<IO::Interactive> module does the best it can to give you an +answer. Its C<interactive> function returns an output filehandle; +that filehandle points to standard output if the module thinks the +session is interactive. Otherwise, the filehandle is a null handle +that simply discards the output: + + use IO::Interactive qw(interactive); + + print { interactive } "I might go to standard output!\n"; + +This still doesn't guarantee that a real person is answering your +prompts or reading your output. + +If you want to know how to handle automated testing for your +distribution, you can check the environment. The CPAN +Testers, for instance, set the value of C<AUTOMATED_TESTING>: + + unless( $ENV{AUTOMATED_TESTING} ) { + print "Hello interactive tester!\n"; + } + +=head2 How do I timeout a slow event? + +Use the C<alarm()> function, probably in conjunction with a signal +handler, as documented in L<perlipc/"Signals"> and the section on +"Signals" in the Camel. You may instead use the more flexible +L<Sys::AlarmCall> module available from CPAN. + +The C<alarm()> function is not implemented on all versions of Windows. +Check the documentation for your specific version of Perl. + +=head2 How do I set CPU limits? +X<BSD::Resource> X<limit> X<CPU> + +(contributed by Xho) + +Use the L<BSD::Resource> module from CPAN. As an example: + + use BSD::Resource; + setrlimit(RLIMIT_CPU,10,20) or die $!; + +This sets the soft and hard limits to 10 and 20 seconds, respectively. 
+After 10 seconds of time spent running on the CPU (not "wall" time), +the process will be sent a signal (XCPU on some systems) which, if not +trapped, will cause the process to terminate. If that signal is +trapped, then after 10 more seconds (20 seconds in total) the process +will be killed with a non-trappable signal. + +See the L<BSD::Resource> documentation and your system's documentation +for the gory details. + +=head2 How do I avoid zombies on a Unix system? + +Use the reaper code from L<perlipc/"Signals"> to call C<wait()> when a +SIGCHLD is received, or else use the double-fork technique described +in L<perlfaq8/"How do I start a process in the background?">. + +=head2 How do I use an SQL database? + +The L<DBI> module provides an abstract interface to most database +servers and types, including Oracle, DB2, Sybase, MySQL, PostgreSQL, +ODBC, and flat files. The DBI module accesses each database type +through a database driver, or DBD. You can see a complete list of +available drivers on CPAN: L<http://www.cpan.org/modules/by-module/DBD/> . +You can read more about DBI on L<http://dbi.perl.org/> . + +Other modules provide more specific access: L<Win32::ODBC>, L<Alzabo>, +C<iodbc>, and others found on CPAN Search: L<http://search.cpan.org/> . + +=head2 How do I make a system() exit on control-C? + +You can't. You need to imitate the C<system()> call (see L<perlipc> for +sample code) and then have a signal handler for the INT signal that +passes the signal on to the subprocess. Or you can check for it: + + my $rc = system($cmd); + if ($rc & 127) { die "signal death" } + +=head2 How do I open a file without blocking? 
+ +If you're lucky enough to be using a system that supports +non-blocking reads (most Unixish systems do), you need only to use the +C<O_NDELAY> or C<O_NONBLOCK> flag from the C<Fcntl> module in conjunction with +C<sysopen()>: + + use Fcntl; + sysopen(my $fh, "/foo/somefile", O_WRONLY|O_NDELAY|O_CREAT, 0644) + or die "can't open /foo/somefile: $!"; + +=head2 How do I tell the difference between errors from the shell and perl? + +(answer contributed by brian d foy) + +When you run a Perl script, something else is running the script for you, +and that something else may output error messages. The script might +emit its own warnings and error messages. Most of the time you cannot +tell who said what. + +You probably cannot fix the thing that runs perl, but you can change how +perl outputs its warnings by defining custom warning and die functions. + +Consider this script, which has an error you may not notice immediately. + + #!/usr/locl/bin/perl + + print "Hello World\n"; + +I get an error when I run this from my shell (which happens to be +bash). That may look like perl forgot it has a C<print()> function, +but my shebang line is not the path to perl, so the shell runs the +script, and I get the error. + + $ ./test + ./test: line 3: print: command not found + +A quick and dirty fix involves a little bit of code, but this may be all +you need to figure out the problem. + + #!/usr/bin/perl -w + + BEGIN { + $SIG{__WARN__} = sub{ print STDERR "Perl: ", @_; }; + $SIG{__DIE__} = sub{ print STDERR "Perl: ", @_; exit 1}; + } + + $a = 1 + undef; + $x / 0; + __END__ + +The perl message comes out with "Perl:" in front. The C<BEGIN> block +works at compile time so all of the compilation errors and warnings +get the "Perl:" prefix too. + + Perl: Useless use of division (/) in void context at ./test line 9. + Perl: Name "main::a" used only once: possible typo at ./test line 8. + Perl: Name "main::x" used only once: possible typo at ./test line 9. 
+ Perl: Use of uninitialized value in addition (+) at ./test line 8. + Perl: Use of uninitialized value in division (/) at ./test line 9. + Perl: Illegal division by zero at ./test line 9. + Perl: Illegal division by zero at -e line 3. + +If I don't see that "Perl:", it's not from perl. + +You could also just know all the perl errors, and although there are +some people who may know all of them, you probably don't. However, they +all should be in the L<perldiag> manpage. If you don't find the error in +there, it probably isn't a perl error. + +Looking up every message is not the easiest way, so let perl do it +for you. Use the diagnostics pragma, which turns perl's normal messages +into longer discussions on the topic. + + use diagnostics; + +If you don't get a paragraph or two of expanded discussion, it +might not be perl's message. + +=head2 How do I install a module from CPAN? + +(contributed by brian d foy) + +The easiest way is to have a module also named CPAN do it for you by using +the C<cpan> command that comes with Perl. You can give it a list of modules +to install: + + $ cpan IO::Interactive Getopt::Whatever + +If you prefer C<CPANPLUS>, it's just as easy: + + $ cpanp i IO::Interactive Getopt::Whatever + +If you want to install a distribution from the current directory, you can +tell C<CPAN.pm> to install C<.> (the full stop): + + $ cpan . + +See the documentation for either of those commands to see what else +you can do. + +If you want to try to install a distribution by yourself, resolving +all dependencies on your own, you follow one of two possible build +paths. + +For distributions that use I<Makefile.PL>: + + $ perl Makefile.PL + $ make test install + +For distributions that use I<Build.PL>: + + $ perl Build.PL + $ ./Build test + $ ./Build install + +Some distributions may need to link to libraries or other third-party +code and their build and installation sequences may be more complicated. 
+Check any I<README> or I<INSTALL> files that you may find. + +=head2 What's the difference between require and use? + +(contributed by brian d foy) + +Perl runs the C<require> statement at run-time. Once Perl loads, compiles, +and runs the file, it doesn't do anything else. The C<use> statement +is the same as a C<require> run at compile-time, but Perl also calls the +C<import> method for the loaded package. These two are the same: + + use MODULE qw(import list); + + BEGIN { + require MODULE; + MODULE->import(qw(import list)); + } + +However, you can suppress the C<import> by using an explicit, empty +import list. Both of these still happen at compile-time: + + use MODULE (); + + BEGIN { + require MODULE; + } + +Since C<use> will also call the C<import> method, the actual value +for C<MODULE> must be a bareword. That is, C<use> cannot load files +by name, although C<require> can: + + require "$ENV{HOME}/lib/Foo.pm"; # no @INC searching! + +See the entry for C<use> in L<perlfunc> for more details. + +=head2 How do I keep my own module/library directory? + +When you build modules, tell Perl where to install the modules. + +If you want to install modules for your own use, the easiest way might +be L<local::lib>, which you can download from CPAN. It sets various +installation settings for you, and uses those same settings within +your programs. + +If you want more flexibility, you need to configure your CPAN client +for your particular situation. 
+ +For C<Makefile.PL>-based distributions, use the INSTALL_BASE option +when generating Makefiles: + + perl Makefile.PL INSTALL_BASE=/mydir/perl + +You can set this in your C<CPAN.pm> configuration so modules +automatically install in your private library directory when you use +the CPAN.pm shell: + + % cpan + cpan> o conf makepl_arg INSTALL_BASE=/mydir/perl + cpan> o conf commit + +For C<Build.PL>-based distributions, use the --install_base option: + + perl Build.PL --install_base /mydir/perl + +You can configure C<CPAN.pm> to automatically use this option too: + + % cpan + cpan> o conf mbuild_arg "--install_base /mydir/perl" + cpan> o conf commit + +INSTALL_BASE tells these tools to put your modules into +F</mydir/perl/lib/perl5>. See L<How do I add a directory to my +include path (@INC) at runtime?> for details on how to run your newly +installed modules. + +There is one caveat with INSTALL_BASE, though, since it acts +differently from the PREFIX and LIB settings that older versions of +L<ExtUtils::MakeMaker> advocated. INSTALL_BASE does not support +installing modules for multiple versions of Perl or different +architectures under the same directory. You should consider whether you +really want that and, if you do, use the older PREFIX and LIB +settings. See the L<ExtUtils::MakeMaker> documentation for more details. + +=head2 How do I add the directory my program lives in to the module/library search path? + +(contributed by brian d foy) + +If you know the directory already, you can add it to C<@INC> as you would +for any other directory. You might C<use lib> if you know the directory +at compile time: + + use lib $directory; + +The trick in this task is to find the directory. 
Before your script does +anything else (such as a C<chdir>), you can get the current working +directory with the C<Cwd> module, which comes with Perl: + + BEGIN { + use Cwd; + our $directory = cwd; + } + + use lib $directory; + +You can do a similar thing with the value of C<$0>, which holds the +script name. That might hold a relative path, but C<rel2abs> can turn +it into an absolute path. Once you have the absolute path, C<dirname> +gives you its directory: + + BEGIN { + use File::Spec::Functions qw(rel2abs); + use File::Basename qw(dirname); + + my $path = rel2abs( $0 ); + our $directory = dirname( $path ); + } + + use lib $directory; + +The L<FindBin> module, which comes with Perl, might work. It finds the +directory of the currently running script and puts it in C<$Bin>, which +you can then use to construct the right library path: + + use FindBin qw($Bin); + +You can also use L<local::lib> to do much the same thing. Install +modules using L<local::lib>'s settings, then use the module in your +program: + + use local::lib; # sets up a local lib at ~/perl5 + +See the L<local::lib> documentation for more details. + +=head2 How do I add a directory to my include path (@INC) at runtime? + +Here are the suggested ways of modifying your include path, including +environment variables, run-time switches, and in-code statements: + +=over 4 + +=item the C<PERLLIB> environment variable + + $ export PERLLIB=/path/to/my/dir + $ perl program.pl + +=item the C<PERL5LIB> environment variable + + $ export PERL5LIB=/path/to/my/dir + $ perl program.pl + +=item the C<perl -Idir> command line flag + + $ perl -I/path/to/my/dir program.pl + +=item the C<lib> pragma: + + use lib "$ENV{HOME}/myown_perllib"; + +=item the L<local::lib> module: + + use local::lib; + + use local::lib "~/myown_perllib"; + +=back + +The last is particularly useful because it knows about machine-dependent +architectures. The C<lib.pm> pragmatic module was first +included with the 5.002 release of Perl. + +=head2 What is socket.ph and where do I get it? 
+ +It's a Perl 4 style file defining values for system networking +constants. Sometimes it is built using L<h2ph> when Perl is installed, +but other times it is not. Modern programs should use C<use Socket;> +instead. + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples in this file +are hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun +or for profit as you see fit. A simple comment in the code giving +credit would be courteous but is not required. diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq9.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq9.pod new file mode 100644 index 00000000000..b42755efe0a --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlfaq9.pod @@ -0,0 +1,426 @@ +=head1 NAME + +perlfaq9 - Web, Email and Networking + +=head1 DESCRIPTION + +This section deals with questions related to running web sites, +sending and receiving email as well as general networking. + +=head2 Should I use a web framework? + +Yes. If you are building a web site with any level of interactivity +(forms / users / databases), you +will want to use a framework to make handling requests +and responses easier. + +If there is no interactivity then you may still want +to look at using something like L<Template Toolkit|https://metacpan.org/module/Template> +or L<Plack::Middleware::TemplateToolkit> +so maintenance of your HTML files (and other assets) is easier. + +=head2 Which web framework should I use? +X<framework> X<CGI.pm> X<CGI> X<Catalyst> X<Dancer> + +There is no simple answer to this question. 
Perl frameworks can run everything +from basic file servers and small scale intranets to massive multinational +multilingual websites that are the core of international businesses. + +Below is a list of a few frameworks with comments which might help you in +making a decision, depending on your specific requirements. Start by reading +the docs, then ask questions on the relevant mailing list or IRC channel. + +=over 4 + +=item L<Catalyst> + +Strongly object-oriented and fully-featured with a long development history and +a large community and addon ecosystem. It is excellent for large and complex +applications, where you have full control over the server. + +=item L<Dancer> + +Young and free of legacy weight, providing a lightweight and easy to learn API. +Has a growing addon ecosystem. It is best used for smaller projects and +is very easy for beginners to learn. + +=item L<Mojolicious> + +Fairly young with a focus on HTML5 and real-time web technologies such as +WebSockets. + +=item L<Web::Simple> + +Currently experimental, strongly object-oriented, built for speed and intended +as a toolkit for building micro web apps, custom frameworks or for tying +together existing Plack-compatible web applications with one central dispatcher. + +=back + +All of these interact with or use L<Plack>, whose basics are worth +understanding when building a website in Perl (there is a lot of useful +L<Plack::Middleware|https://metacpan.org/search?q=plack%3A%3Amiddleware>). + +=head2 What is Plack and PSGI? + +L<PSGI> is the Perl Web Server Gateway Interface Specification, a +standard that many Perl web frameworks use. You should not need to +understand it to build a web site; the part you might want to use is L<Plack>. + +L<Plack> is a set of tools for using the PSGI stack. It contains +L<middleware|https://metacpan.org/search?q=plack%3A%3Amiddleware> +components, a reference server and utilities for Web application frameworks. 
+Plack is like Ruby's Rack or Python's Paste for WSGI. + +You could build a web site using L<Plack> and your own code, +but for anything other than a very basic web site, using a web framework +(that uses L<Plack>) is a better option. + +=head2 How do I remove HTML from a string? + +Use L<HTML::Strip>, or L<HTML::FormatText> which not only removes HTML +but also attempts to do a little simple formatting of the resulting +plain text. + +=head2 How do I extract URLs? + +L<HTML::SimpleLinkExtor> will extract URLs from HTML; it handles anchors, +images, objects, frames, and many other tags that can contain a URL. +If you need anything more complex, you can create your own subclass of +L<HTML::LinkExtor> or L<HTML::Parser>. You might even use +L<HTML::SimpleLinkExtor> as an example for something specifically +suited to your needs. + +You can use L<URI::Find> to extract URLs from an arbitrary text document. + +=head2 How do I fetch an HTML file? + +(contributed by brian d foy) + +Use the libwww-perl distribution. The L<LWP::Simple> module can fetch web +resources and give their content back to you as a string: + + use LWP::Simple qw(get); + + my $html = get( "http://www.example.com/index.html" ); + +It can also store the resource directly in a file: + + use LWP::Simple qw(getstore); + + getstore( "http://www.example.com/index.html", "foo.html" ); + +If you need to do something more complicated, you can use the +L<LWP::UserAgent> module to create your own user-agent (e.g. browser) +to get the job done. If you want to simulate an interactive web +browser, you can use the L<WWW::Mechanize> module. + +=head2 How do I automate an HTML form submission? + +If you are doing something complex, such as moving through many pages +and forms of a web site, you can use L<WWW::Mechanize>. See its +documentation for all the details. 
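As a rough illustration of that multi-page case, here is a minimal sketch using L<WWW::Mechanize>. The URL, the form position, the field names, and the link text are all hypothetical and would need to match the real site; treat it as a starting point rather than a definitive recipe.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# Hypothetical starting page, for illustration only.
my $start_url = 'http://www.example.com/login';

# autocheck => 1 makes failed requests die instead of failing silently.
my $mech = WWW::Mechanize->new( autocheck => 1 );

# Fetch the page that contains the form.
$mech->get($start_url);

# Fill in and submit the first form on the page.
# The field names here are assumptions about the page's HTML.
$mech->submit_form(
    form_number => 1,
    fields      => {
        username => 'alice',
        password => 'secret',
    },
);

# Follow a link on the resulting page (the link text is assumed),
# then print whatever came back.
$mech->follow_link( text_regex => qr/reports/i );
print $mech->content;
```

Because the same C<$mech> object carries cookies from one request to the next, a login followed by further page visits behaves much as it would in a browser.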
+ +If you're submitting values using the GET method, create a URL and encode +the form using the C<query_form> method: + + use LWP::Simple; + use URI::URL; + + my $url = url('http://www.perl.com/cgi-bin/cpan_mod'); + $url->query_form(module => 'DB_File', readme => 1); + my $content = get($url); + +If you're using the POST method, create your own user agent and encode +the content appropriately. + + use HTTP::Request::Common qw(POST); + use LWP::UserAgent; + + my $ua = LWP::UserAgent->new(); + my $req = POST 'http://www.perl.com/cgi-bin/cpan_mod', + [ module => 'DB_File', readme => 1 ]; + my $content = $ua->request($req)->as_string; + +=head2 How do I decode or create those %-encodings on the web? +X<URI> X<URI::Escape> X<RFC 2396> + +Most of the time you should not need to do this, as +your web framework (or, if you are making a request, +L<LWP> or a similar module) handles it for you. + +To encode a string yourself, use the L<URI::Escape> module. The C<uri_escape> +function returns the escaped string: + + use URI::Escape; + + my $original = "Colon : Hash # Percent %"; + + my $escaped = uri_escape( $original ); + + print "$escaped\n"; # 'Colon%20%3A%20Hash%20%23%20Percent%20%25' + +To decode the string, use the C<uri_unescape> function: + + my $unescaped = uri_unescape( $escaped ); + + print $unescaped; # back to original + +Remember not to encode a full URI; you need to escape each +component separately and then join them together. + +=head2 How do I redirect to another page? + +Most Perl web frameworks have a mechanism for doing this. +Using the L<Catalyst> framework, it would be: + + $c->res->redirect($url); + $c->detach(); + +If you are using Plack (which most frameworks do), then +L<Plack::Middleware::Rewrite> is worth looking at if you +are migrating from Apache or have URLs you want to always +redirect. + +=head2 How do I put a password on my web pages? + +See if the web framework you are using has an +authentication system and if that fits your needs. 
+ +Alternatively, look at L<Plack::Middleware::Auth::Basic>, +or one of the other L<Plack authentication|https://metacpan.org/search?q=plack+auth> +options. + +=head2 How do I make sure users can't enter values into a form that causes my CGI script to do bad things? + +(contributed by brian d foy) + +You can't prevent people from sending your script bad data. Even if +you add some client-side checks, people may disable them or bypass +them completely. For instance, someone might use a module such as +L<LWP> to submit to your web site. If you want to prevent data that +try to use SQL injection or other sorts of attacks (and you should +want to), you have to not trust any data that enter your program. + +The L<perlsec> documentation has general advice about data security. +If you are using the L<DBI> module, use placeholders to fill in data. +If you are running external programs with C<system> or C<exec>, use +the list forms. There are many other precautions that you should take, +too many to list here, and most of them fall under the category of not +using any data that you don't intend to use. Trust no one. + +=head2 How do I parse a mail header? + +Use the L<Email::MIME> module. It's well-tested and supports all the +craziness that you'll see in the real world (comment-folding whitespace, +encodings, comments, etc.). + + use Email::MIME; + + my $message = Email::MIME->new($rfc2822); + my $subject = $message->header('Subject'); + my $from = $message->header('From'); + +If you've already got some other kind of email object, consider passing +it to L<Email::Abstract> and then using its cast method to get an +L<Email::MIME> object: + + my $mail_message_object = read_message(); + my $abstract = Email::Abstract->new($mail_message_object); + my $email_mime_object = $abstract->cast('Email::MIME'); + +=head2 How do I check a valid mail address? + +(partly contributed by Aaron Sherman) + +This isn't as simple a question as it sounds. 
There are two parts: + +a) How do I verify that an email address is correctly formatted? + +b) How do I verify that an email address targets a valid recipient? + +Without sending mail to the address and seeing whether there's a human +on the other end to answer you, you cannot fully answer part I<b>, but +the L<Email::Valid> module will do both part I<a> and part I<b> as far +as you can in real-time. + +Our best advice for verifying a person's mail address is to have them +enter their address twice, just as you normally do to change a +password. This usually weeds out typos. If both versions match, send +mail to that address with a personal message. If you get the message +back and they've followed your directions, you can be reasonably +assured that it's real. + +A related strategy that's less open to forgery is to give them a PIN +(personal ID number). Record the address and PIN (best that it be a +random one) for later processing. In the mail you send, include a link to +your site with the PIN included. If the mail bounces, you know it's not +valid. If they don't click on the link, either they forged the address or +(assuming they got the message) following through wasn't important so you +don't need to worry about it. + +=head2 How do I decode a MIME/BASE64 string? + +The L<MIME::Base64> package handles this as well as the MIME/QP encoding. +Decoding base 64 becomes as simple as: + + use MIME::Base64; + my $decoded = decode_base64($encoded); + +The L<Email::MIME> module can decode base 64-encoded email message parts +transparently so the developer doesn't need to worry about it. + +=head2 How do I find the user's mail address? + +Ask them for it. There are so many email providers available that it's +unlikely the local system has any idea how to determine a user's email address. + +The exception is for organization-specific email (e.g. foo@yourcompany.com) +where policy can be codified in your program. 
In that case, you could look at +$ENV{USER}, $ENV{LOGNAME}, and getpwuid($<) in scalar context, like so: + + my $user_name = getpwuid($<); + +But you still cannot make assumptions about whether this is correct, unless +your policy says it is. You really are best off asking the user. + +=head2 How do I send email? + +Use the L<Email::MIME> and L<Email::Sender::Simple> modules, like so: + + use Email::MIME; + + # first, create your message + my $message = Email::MIME->create( + header_str => [ + From => 'you@example.com', + To => 'friend@example.com', + Subject => 'Happy birthday!', + ], + attributes => { + encoding => 'quoted-printable', + charset => 'ISO-8859-1', + }, + body_str => "Happy birthday to you!\n", + ); + + use Email::Sender::Simple qw(sendmail); + sendmail($message); + +By default, L<Email::Sender::Simple> will try C<sendmail> first, if it exists +in your $PATH. This generally isn't the case. If there's a remote mail +server you use to send mail, consider investigating one of the Transport +classes. At time of writing, the available transports include: + +=over 4 + +=item L<Email::Sender::Transport::Sendmail> + +This is the default. If you can use the L<mail(1)> or L<mailx(1)> +program to send mail from the machine where your code runs, you should +be able to use this. + +=item L<Email::Sender::Transport::SMTP> + +This transport contacts a remote SMTP server over TCP. It optionally +uses SSL and can authenticate to the server via SASL. + +=item L<Email::Sender::Transport::SMTP::TLS> + +This is like the SMTP transport, but uses TLS security. You can +authenticate with this module as well, using any mechanisms your server +supports after STARTTLS. + +=back + +Telling L<Email::Sender::Simple> to use your transport is straightforward. + + sendmail( + $message, + { + transport => $email_sender_transport_object, + } + ); + +=head2 How do I use MIME to make an attachment to a mail message? + +L<Email::MIME> directly supports multipart messages. 
L<Email::MIME> +objects themselves are parts and can be attached to other L<Email::MIME> +objects. Consult the L<Email::MIME> documentation for more information, +including all of the supported methods and examples of their use. + +=head2 How do I read email? + +Use the L<Email::Folder> module, like so: + + use Email::Folder; + + my $folder = Email::Folder->new('/path/to/email/folder'); + while(my $message = $folder->next_message) { + # next_message returns Email::Simple objects, but we want + # Email::MIME objects as they're more robust + my $mime = Email::MIME->new($message->as_string); + } + +There are different classes in the L<Email::Folder> namespace for +supporting various mailbox types. Note that these modules are generally +rather limited and only support B<reading> rather than writing. + +=head2 How do I find out my hostname, domainname, or IP address? +X<hostname, domainname, IP address, host, domain, hostfqdn, inet_ntoa, +gethostbyname, Socket, Net::Domain, Sys::Hostname> + +(contributed by brian d foy) + +The L<Net::Domain> module, which is part of the Standard Library starting +in Perl 5.7.3, can get you the fully qualified domain name (FQDN), the host +name, or the domain name. + + use Net::Domain qw(hostname hostfqdn hostdomain); + + my $host = hostfqdn(); + +The L<Sys::Hostname> module, part of the Standard Library, can also get the +hostname: + + use Sys::Hostname; + + $host = hostname(); + + +The L<Sys::Hostname::Long> module takes a different approach and tries +harder to return the fully qualified hostname: + + use Sys::Hostname::Long 'hostname_long'; + + my $hostname = hostname_long(); + +To get the IP address, you can use the C<gethostbyname> built-in function +to turn the name into a number. To turn that number into the dotted octet +form (a.b.c.d) that most people expect, use the C<inet_ntoa> function +from the L<Socket> module, which also comes with perl. 
+ + use Socket; + + my $address = inet_ntoa( + scalar gethostbyname( $host || 'localhost' ) + ); + +=head2 How do I fetch/put an (S)FTP file? + +L<Net::FTP> and L<Net::SFTP> allow you to interact with FTP and SFTP (Secure +FTP) servers. + +=head2 How can I do RPC in Perl? + +Use one of the RPC modules (L<https://metacpan.org/search?q=RPC>). + +=head1 AUTHOR AND COPYRIGHT + +Copyright (c) 1997-2010 Tom Christiansen, Nathan Torkington, and +other authors as noted. All rights reserved. + +This documentation is free; you can redistribute it and/or modify it +under the same terms as Perl itself. + +Irrespective of its distribution, all code examples in this file +are hereby placed into the public domain. You are permitted and +encouraged to use this code in your own programs for fun +or for profit as you see fit. A simple comment in the code giving +credit would be courteous but is not required. diff --git a/gnu/usr.bin/perl/cpan/perlfaq/lib/perlglossary.pod b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlglossary.pod new file mode 100644 index 00000000000..5adef04b397 --- /dev/null +++ b/gnu/usr.bin/perl/cpan/perlfaq/lib/perlglossary.pod @@ -0,0 +1,3442 @@ +=head1 NAME + +perlglossary - Perl Glossary + +=head1 DESCRIPTION + +A glossary of terms (technical and otherwise) used in the Perl documentation. +Other useful sources include the Free On-Line Dictionary of Computing +L<http://foldoc.org/>, the Jargon File +L<http://catb.org/~esr/jargon/>, and Wikipedia L<http://www.wikipedia.org/>. + +=head2 A + +=over 4 + +=item accessor methods + +A L</method> used to indirectly inspect or update an L</object>'s +state (its L<instance variables|/instance variable>). + +=item actual arguments + +The L<scalar values|/scalar value> that you supply to a L</function> +or L</subroutine> when you call it. For instance, when you call +C<power("puff")>, the string C<"puff"> is the actual argument. See +also L</argument> and L</formal arguments>. 
+ +=item address operator + +Some languages work directly with the memory addresses of values, but +this can be like playing with fire. Perl provides a set of asbestos +gloves for handling all memory management. The closest to an address +operator in Perl is the backslash operator, but it gives you a L</hard +reference>, which is much safer than a memory address. + +=item algorithm + +A well-defined sequence of steps, clearly enough explained that even a +computer could do them. + +=item alias + +A nickname for something, which behaves in all ways as though you'd +used the original name instead of the nickname. Temporary aliases are +implicitly created in the loop variable for C<foreach> loops, in the +C<$_> variable for L<map|perlfunc/map> or L<grep|perlfunc/grep> +operators, in C<$a> and C<$b> during L<sort|perlfunc/sort>'s +comparison function, and in each element of C<@_> for the L</actual +arguments> of a subroutine call. Permanent aliases are explicitly +created in L<packages|/package> by L<importing|/import> symbols or by +assignment to L<typeglobs|/typeglob>. Lexically scoped aliases for +package variables are explicitly created by the L<our|perlfunc/our> +declaration. + +=item alternatives + +A list of possible choices from which you may select only one, as in +"Would you like door A, B, or C?" Alternatives in regular expressions +are separated with a single vertical bar: C<|>. Alternatives in +normal Perl expressions are separated with a double vertical bar: +C<||>. Logical alternatives in L</Boolean> expressions are separated +with either C<||> or C<or>. + +=item anonymous + +Used to describe a L</referent> that is not directly accessible +through a named L</variable>. Such a referent must be indirectly +accessible through at least one L</hard reference>. When the last +hard reference goes away, the anonymous referent is destroyed without +pity. 
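As a rough illustration of the "anonymous" entry, the sketch below shows an anonymous referent being destroyed once its last hard reference goes away. The C<Noisy> class and its C<@log> array are invented for this example and are not part of Perl or of the glossary.

```perl
use strict;
use warnings;

# Hypothetical helper class, made up for this sketch: it records
# the moment one of its objects is destroyed.
package Noisy;
our @log;
sub new     { my ($class, $name) = @_; return bless { name => $name }, $class }
sub DESTROY { my ($self) = @_; push @log, "destroyed $self->{name}" }

package main;

# The hash created by the braces in new() has no variable name of its
# own; it can only be reached through hard references.
my $first  = Noisy->new('anon');
my $second = $first;        # a second hard reference to the same referent

undef $first;               # one reference is still alive
my $after_first = scalar @Noisy::log;

undef $second;              # the last reference is gone
my $after_second = scalar @Noisy::log;

print "log entries after first undef:  $after_first\n";
print "log entries after second undef: $after_second\n";
```

The log stays empty while any hard reference survives, and gains its single entry only when the second C<undef> removes the last one.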
+ +=item architecture + +The kind of computer you're working on, where one "kind" of computer +means all those computers sharing a compatible machine language. +Since Perl programs are (typically) simple text files, not executable +images, a Perl program is much less sensitive to the architecture it's +running on than programs in other languages, such as C, that are +compiled into machine code. See also L</platform> and L</operating +system>. + +=item argument + +A piece of data supplied to a L<program|/executable file>, +L</subroutine>, L</function>, or L</method> to tell it what it's +supposed to do. Also called a "parameter". + +=item ARGV + +The name of the array containing the L</argument> L</vector> from the +command line. If you use the empty C<< E<lt>E<gt> >> operator, L</ARGV> is +the name of both the L</filehandle> used to traverse the arguments and +the L</scalar> containing the name of the current input file. + +=item arithmetical operator + +A L</symbol> such as C<+> or C</> that tells Perl to do the arithmetic +you were supposed to learn in grade school. + +=item array + +An ordered sequence of L<values|/value>, stored such that you can +easily access any of the values using an integer L</subscript> +that specifies the value's L</offset> in the sequence. + +=item array context + +An archaic expression for what is more correctly referred to as +L</list context>. + +=item ASCII + +The American Standard Code for Information Interchange (a 7-bit +character set adequate only for poorly representing English text). +Often used loosely to describe the lowest 128 values of the various +ISO-8859-X character sets, a bunch of mutually incompatible 8-bit +codes sometimes described as half ASCII. See also L</Unicode>. + +=item assertion + +A component of a L</regular expression> that must be true for the +pattern to match but does not necessarily match any characters itself. +Often used specifically to mean a L</zero width> assertion. 
+ +=item assignment + +An L</operator> whose assigned mission in life is to change the value +of a L</variable>. + +=item assignment operator + +Either a regular L</assignment>, or a compound L</operator> composed +of an ordinary assignment and some other operator, that changes the +value of a variable in place, that is, relative to its old value. For +example, C<$a += 2> adds C<2> to C<$a>. + +=item associative array + +See L</hash>. Please. + +=item associativity + +Determines whether you do the left L</operator> first or the right +L</operator> first when you have "A L</operator> B L</operator> C" and +the two operators are of the same precedence. Operators like C<+> are +left associative, while operators like C<**> are right associative. +See L<perlop> for a list of operators and their associativity. + +=item asynchronous + +Said of events or activities whose relative temporal ordering is +indeterminate because too many things are going on at once. Hence, an +asynchronous event is one you didn't know when to expect. + +=item atom + +A L</regular expression> component potentially matching a +L</substring> containing one or more characters and treated as an +indivisible syntactic unit by any following L</quantifier>. (Contrast +with an L</assertion> that matches something of L</zero width> and may +not be quantified.) + +=item atomic operation + +When Democritus gave the word "atom" to the indivisible bits of +matter, he meant literally something that could not be cut: I<a-> +(not) + I<tomos> (cuttable). An atomic operation is an action that +can't be interrupted, not one forbidden in a nuclear-free zone. + +=item attribute + +A new feature that allows the declaration of L<variables|/variable> +and L<subroutines|/subroutine> with modifiers as in C<sub foo : locked +method>. Also, another name for an L</instance variable> of an +L</object>. 
+ +=item autogeneration + +A feature of L</operator overloading> of L<objects|/object>, whereby +the behavior of certain L<operators|/operator> can be reasonably +deduced using more fundamental operators. This assumes that the +overloaded operators will often have the same relationships as the +regular operators. See L<perlop>. + +=item autoincrement + +To add one to something automatically, hence the name of the C<++> +operator. To instead subtract one from something automatically is +known as an "autodecrement". + +=item autoload + +To load on demand. (Also called "lazy" loading.) Specifically, to +call an L<AUTOLOAD|perlsub/Autoloading> subroutine on behalf of an +undefined subroutine. + +=item autosplit + +To split a string automatically, as the B<-a> L</switch> does when +running under B<-p> or B<-n> in order to emulate L</awk>. (See also +the L<AutoSplit> module, which has nothing to do with the B<-a> +switch, but a lot to do with autoloading.) + +=item autovivification + +A Greco-Roman word meaning "to bring oneself to life". In Perl, +storage locations (L<lvalues|/lvalue>) spontaneously generate +themselves as needed, including the creation of any L</hard reference> +values to point to the next level of storage. The assignment +C<$a[5][5][5][5][5] = "quintet"> potentially creates five scalar +storage locations, plus four references (in the first four scalar +locations) pointing to four new anonymous arrays (to hold the last +four scalar locations). But the point of autovivification is that you +don't have to worry about it. + +=item AV + +Short for "array value", which refers to one of Perl's internal data +types that holds an L</array>. The L</AV> type is a subclass of +L</SV>. + +=item awk + +Descriptive editing term--short for "awkward". Also coincidentally +refers to a venerable text-processing language from which Perl derived +some of its high-level ideas. 
+ +=back + +=head2 B + +=over 4 + +=item backreference + +A substring L<captured|/capturing> by a subpattern within +unadorned parentheses in a L</regex>, also referred to as a capture group. The +sequences (C<\g1>, C<\g2>, etc.) later in the same pattern refer back to +the corresponding subpattern in the current match. Outside the pattern, +the numbered variables (C<$1>, C<$2>, etc.) continue to refer to these +same values, as long as the pattern was the last successful match of +the current dynamic scope. C<\g{-1}> can be used to refer to a group by +relative rather than absolute position; and groups can also be named, and +referred to later by name rather than number. See L<perlre/"Capture groups">. + +=item backtracking + +The practice of saying, "If I had to do it all over, I'd do it +differently," and then actually going back and doing it all over +differently. Mathematically speaking, it's returning from an +unsuccessful recursion on a tree of possibilities. Perl backtracks +when it attempts to match patterns with a L</regular expression>, and +its earlier attempts don't pan out. See L<perlre/Backtracking>. + +=item backward compatibility + +Means you can still run your old program because we didn't break any +of the features or bugs it was relying on. + +=item bareword + +A word sufficiently ambiguous to be deemed illegal under L<use strict +'subs'|strict/strict subs>. In the absence of that stricture, a +bareword is treated as if quotes were around it. + +=item base class + +A generic L</object> type; that is, a L</class> from which other, more +specific classes are derived genetically by L</inheritance>. Also +called a "superclass" by people who respect their ancestors. + +=item big-endian + +From Swift: someone who eats eggs big end first. Also used of +computers that store the most significant L</byte> of a word at a +lower byte address than the least significant byte. Often considered +superior to little-endian machines. See also L</little-endian>. 
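The byte ordering described under "big-endian" can be seen from Perl itself, independently of the machine's native order, by packing the same integer with the C<N> (big-endian) and C<V> (little-endian) templates of C<pack>. This is only a small sketch; the variable names are arbitrary.

```perl
use strict;
use warnings;

my $word = 0x01020304;

# "N" packs a 32-bit unsigned integer most significant byte first
# (big-endian); "V" packs it least significant byte first (little-endian).
my $big    = unpack( 'H*', pack( 'N', $word ) );
my $little = unpack( 'H*', pack( 'V', $word ) );

print "big-endian bytes:    $big\n";       # 01020304
print "little-endian bytes: $little\n";    # 04030201
```

Because the templates name the byte order explicitly, the output is the same on any architecture.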
+ +=item binary + +Having to do with numbers represented in base 2. That means there's +basically two numbers, 0 and 1. Also used to describe a "non-text +file", presumably because such a file makes full use of all the binary +bits in its bytes. With the advent of L</Unicode>, this distinction, +already suspect, loses even more of its meaning. + +=item binary operator + +An L</operator> that takes two L<operands|/operand>. + +=item bind + +To assign a specific L</network address> to a L</socket>. + +=item bit + +An integer in the range from 0 to 1, inclusive. The smallest possible +unit of information storage. An eighth of a L</byte> or of a dollar. +(The term "Pieces of Eight" comes from being able to split the old +Spanish dollar into 8 bits, each of which still counted for money. +That's why a 25-cent piece today is still "two bits".) + +=item bit shift + +The movement of bits left or right in a computer word, which has the +effect of multiplying or dividing by a power of 2. + +=item bit string + +A sequence of L<bits|/bit> that is actually being thought of as a +sequence of bits, for once. + +=item bless + +In corporate life, to grant official approval to a thing, as in, "The +VP of Engineering has blessed our WebCruncher project." Similarly in +Perl, to grant official approval to a L</referent> so that it can +function as an L</object>, such as a WebCruncher object. See +L<perlfunc/"bless">. + +=item block + +What a L</process> does when it has to wait for something: "My process +blocked waiting for the disk." As an unrelated noun, it refers to a +large chunk of data, of a size that the L</operating system> likes to +deal with (normally a power of two such as 512 or 8192). Typically +refers to a chunk of data that's coming from or going to a disk file. + +=item BLOCK + +A syntactic construct consisting of a sequence of Perl +L<statements|/statement> that is delimited by braces. 
The C<if> and +C<while> statements are defined in terms of L<BLOCKs|/BLOCK>, for instance. +Sometimes we also say "block" to mean a lexical scope; that is, a +sequence of statements that act like a L</BLOCK>, such as within an +L<eval|perlfunc/eval> or a file, even though the statements aren't +delimited by braces. + +=item block buffering + +A method of making input and output efficient by passing one L</block> +at a time. By default, Perl does block buffering to disk files. See +L</buffer> and L</command buffering>. + +=item Boolean + +A value that is either L</true> or L</false>. + +=item Boolean context + +A special kind of L</scalar context> used in conditionals to decide +whether the L</scalar value> returned by an expression is L</true> or +L</false>. Does not evaluate as either a string or a number. See +L</context>. + +=item breakpoint + +A spot in your program where you've told the debugger to stop +L<execution|/execute> so you can poke around and see whether anything +is wrong yet. + +=item broadcast + +To send a L</datagram> to multiple destinations simultaneously. + +=item BSD + +A psychoactive drug, popular in the 80s, probably developed at +U. C. Berkeley or thereabouts. Similar in many ways to the +prescription-only medication called "System V", but infinitely more +useful. (Or, at least, more fun.) The full chemical name is +"Berkeley Standard Distribution". + +=item bucket + +A location in a L</hash table> containing (potentially) multiple +entries whose keys "hash" to the same hash value according to its hash +function. (As internal policy, you don't have to worry about it, +unless you're into internals, or policy.) + +=item buffer + +A temporary holding location for data. L<Block buffering|/block +buffering> means that the data is passed on to its destination +whenever the buffer is full. L<Line buffering|/line buffering> means +that it's passed on whenever a complete line is received. 
L<Command +buffering|/command buffering> means that it's passed every time you do +a L<print|perlfunc/print> command (or equivalent). If your output is +unbuffered, the system processes it one byte at a time without the use +of a holding area. This can be rather inefficient. + +=item built-in + +A L</function> that is predefined in the language. Even when hidden +by L</overriding>, you can always get at a built-in function by +L<qualifying|/qualified> its name with the C<CORE::> pseudo-package. + +=item bundle + +A group of related modules on L</CPAN>. (Also, sometimes refers to a +group of command-line switches grouped into one L</switch cluster>.) + +=item byte + +A piece of data worth eight L<bits|/bit> in most places. + +=item bytecode + +A pidgin-like language spoken among 'droids when they don't wish to +reveal their orientation (see L</endian>). Named after some similar +languages spoken (for similar reasons) between compilers and +interpreters in the late 20th century. These languages are +characterized by representing everything as a +non-architecture-dependent sequence of bytes. + +=back + +=head2 C + +=over 4 + +=item C + +A language beloved by many for its inside-out L</type> definitions, +inscrutable L</precedence> rules, and heavy L</overloading> of the +function-call mechanism. (Well, actually, people first switched to C +because they found lowercase identifiers easier to read than upper.) +Perl is written in C, so it's not surprising that Perl borrowed a few +ideas from it. + +=item C preprocessor + +The typical C compiler's first pass, which processes lines beginning +with C<#> for conditional compilation and macro definition and does +various manipulations of the program text based on the current +definitions. Also known as I<cpp>(1). 
+ +=item call by reference + +An L</argument>-passing mechanism in which the L</formal arguments> +refer directly to the L</actual arguments>, and the L</subroutine> can +change the actual arguments by changing the formal arguments. That +is, the formal argument is an L</alias> for the actual argument. See +also L</call by value>. + +=item call by value + +An L</argument>-passing mechanism in which the L</formal arguments> +refer to a copy of the L</actual arguments>, and the L</subroutine> +cannot change the actual arguments by changing the formal arguments. +See also L</call by reference>. + +=item callback + +A L</handler> that you register with some other part of your program +in the hope that the other part of your program will L</trigger> your +handler when some event of interest transpires. + +=item canonical + +Reduced to a standard form to facilitate comparison. + +=item capture buffer, capture group + +These two terms are synonymous: +a L<captured substring|/capturing> by a regex subpattern. + +=item capturing + +The use of parentheses around a L</subpattern> in a L</regular +expression> to store the matched L</substring> as a L</backreference> +or L<capture group|/capture buffer, capture group>. +(Captured strings are also returned as a list in L</list context>.) + +=item character + +A small integer representative of a unit of orthography. +Historically, characters were usually stored as fixed-width integers +(typically in a byte, or maybe two, depending on the character set), +but with the advent of UTF-8, characters are often stored in a +variable number of bytes depending on the size of the integer that +represents the character. Perl manages this transparently for you, +for the most part. + +=item character class + +A square-bracketed list of characters used in a L</regular expression> +to indicate that any character of the set may occur at a given point. +Loosely, any predefined set of characters so used. 
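A short sketch of the "character class" entry: a bracketed set matches any one of its characters at that point in the pattern. The word lists here are made up purely for illustration.

```perl
use strict;
use warnings;

# [ae] accepts either "a" or "e" in the third position.
my @spellings = grep { /\Agr[ae]y\z/ } qw(gray grey groy);

# [0-9] is a range inside a class; + repeats the whole class.
my @numbers = grep { /\A[0-9]+\z/ } qw(42 4x2 007);

print "spellings: @spellings\n";
print "numbers:   @numbers\n";
```

The anchors C<\A> and C<\z> are there only so that the whole string has to match, which keeps the effect of the class easy to see.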
+ +=item character property + +A predefined L</character class> matchable by the C<\p> +L</metasymbol>. Many standard properties are defined for L</Unicode>. + +=item circumfix operator + +An L</operator> that surrounds its L</operand>, like the angle +operator, or parentheses, or a hug. + +=item class + +A user-defined L</type>, implemented in Perl via a L</package> that +provides (either directly or by inheritance) L<methods|/method> (that +is, L<subroutines|/subroutine>) to handle L<instances|/instance> of +the class (its L<objects|/object>). See also L</inheritance>. + +=item class method + +A L</method> whose L</invocant> is a L</package> name, not an +L</object> reference. A method associated with the class as a whole. + +=item client + +In networking, a L</process> that initiates contact with a L</server> +process in order to exchange data and perhaps receive a service. + +=item cloister + +A L</cluster> used to restrict the scope of a L</regular expression +modifier>. + +=item closure + +An L</anonymous> subroutine that, when a reference to it is generated +at run time, keeps track of the identities of externally visible +L<lexical variables|/lexical variable> even after those lexical +variables have supposedly gone out of L</scope>. They're called +"closures" because this sort of behavior gives mathematicians a sense +of closure. + +=item cluster + +A parenthesized L</subpattern> used to group parts of a L</regular +expression> into a single L</atom>. + +=item CODE + +The word returned by the L<ref|perlfunc/ref> function when you apply +it to a reference to a subroutine. See also L</CV>. + +=item code generator + +A system that writes code for you in a low-level language, such as +code to implement the backend of a compiler. See L</program +generator>. + +=item code point + +The position of a character in a character set encoding. The character +C<NULL> is almost certainly at the zeroth position in all character +sets, so its code point is 0. 
The code point for the C<SPACE> +character in the ASCII character set is 0x20, or 32 decimal; in EBCDIC +it is 0x40, or 64 decimal. The L<ord|perlfunc/ord> function returns +the code point of a character. + +"code position" and "ordinal" mean the same thing as "code point". + +=item code subpattern + +A L</regular expression> subpattern whose real purpose is to execute +some Perl code, for example, the C<(?{...})> and C<(??{...})> +subpatterns. + +=item collating sequence + +The order into which L<characters|/character> sort. This is used by +L</string> comparison routines to decide, for example, where in this +glossary to put "collating sequence". + +=item command + +In L</shell> programming, the syntactic combination of a program name +and its arguments. More loosely, anything you type to a shell (a +command interpreter) that starts it doing something. Even more +loosely, a Perl L</statement>, which might start with a L</label> and +typically ends with a semicolon. + +=item command buffering + +A mechanism in Perl that lets you store up the output of each Perl +L</command> and then flush it out as a single request to the +L</operating system>. It's enabled by setting the C<$|> +(C<$AUTOFLUSH>) variable to a true value. It's used when you don't +want data sitting around not going where it's supposed to, which may +happen because the default on a L</file> or L</pipe> is to use +L</block buffering>. + +=item command name + +The name of the program currently executing, as typed on the command +line. In C, the L</command> name is passed to the program as the +first command-line argument. In Perl, it comes in separately as +C<$0>. + +=item command-line arguments + +The L<values|/value> you supply along with a program name when you +tell a L</shell> to execute a L</command>. These values are passed to +a Perl program through C<@ARGV>. + +=item comment + +A remark that doesn't affect the meaning of the program. 
In Perl, a +comment is introduced by a C<#> character and continues to the end of +the line. + +=item compilation unit + +The L</file> (or L</string>, in the case of L<eval|perlfunc/eval>) +that is currently being compiled. + +=item compile phase + +Any time before Perl starts running your main program. See also +L</run phase>. Compile phase is mostly spent in L</compile time>, but +may also be spent in L</run time> when C<BEGIN> blocks, +L<use|perlfunc/use> declarations, or constant subexpressions are being +evaluated. The startup and import code of any L<use|perlfunc/use> +declaration is also run during compile phase. + +=item compile time + +The time when Perl is trying to make sense of your code, as opposed to +when it thinks it knows what your code means and is merely trying to +do what it thinks your code says to do, which is L</run time>. + +=item compiler + +Strictly speaking, a program that munches up another program and spits +out yet another file containing the program in a "more executable" +form, typically containing native machine instructions. The I<perl> +program is not a compiler by this definition, but it does contain a +kind of compiler that takes a program and turns it into a more +executable form (L<syntax trees|/syntax tree>) within the I<perl> +process itself, which the L</interpreter> then interprets. There are, +however, extension L<modules|/module> to get Perl to act more like a +"real" compiler. See L<O>. + +=item composer + +A "constructor" for a L</referent> that isn't really an L</object>, +like an anonymous array or a hash (or a sonata, for that matter). For +example, a pair of braces acts as a composer for a hash, and a pair of +brackets acts as a composer for an array. See L<perlref/Making +References>. + +=item concatenation + +The process of gluing one cat's nose to another cat's tail. Also, a +similar operation on two L<strings|/string>. + +=item conditional + +Something "iffy". See L</Boolean context>. 
+ +=item connection + +In telephony, the temporary electrical circuit between the caller's +and the callee's phone. In networking, the same kind of temporary +circuit between a L</client> and a L</server>. + +=item construct + +As a noun, a piece of syntax made up of smaller pieces. As a +transitive verb, to create an L</object> using a L</constructor>. + +=item constructor + +Any L</class method>, instance L</method>, or L</subroutine> +that composes, initializes, blesses, and returns an L</object>. +Sometimes we use the term loosely to mean a L</composer>. + +=item context + +The surroundings, or environment. The context given by the +surrounding code determines what kind of data a particular +L</expression> is expected to return. The three primary contexts are +L</list context>, L</scalar context>, and L</void context>. Scalar +context is sometimes subdivided into L</Boolean context>, L</numeric +context>, L</string context>, and L</void context>. There's also a +"don't care" scalar context (which is dealt with in Programming Perl, +Third Edition, Chapter 2, "Bits and Pieces" if you care). + +=item continuation + +The treatment of more than one physical L</line> as a single logical +line. L</Makefile> lines are continued by putting a backslash before +the L</newline>. Mail headers as defined by RFC 822 are continued by +putting a space or tab I<after> the newline. In general, lines in +Perl do not need any form of continuation mark, because L</whitespace> +(including newlines) is gleefully ignored. Usually. + +=item core dump + +The corpse of a L</process>, in the form of a file left in the +L</working directory> of the process, usually as a result of certain +kinds of fatal error. + +=item CPAN + +The Comprehensive Perl Archive Network. (See L<perlfaq2/What modules and extensions are available for Perl? What is CPAN?>). + +=item cracker + +Someone who breaks security on computer systems. A cracker may be a +true L</hacker> or only a L</script kiddie>. 
+ +=item current package + +The L</package> in which the current statement is compiled. Scan +backwards in the text of your program through the current L<lexical +scope|/lexical scoping> or any enclosing lexical scopes till you find +a package declaration. That's your current package name. + +=item current working directory + +See L</working directory>. + +=item currently selected output channel + +The last L</filehandle> that was designated with +L<select|perlfunc/select>(C<FILEHANDLE>); L</STDOUT>, if no filehandle +has been selected. + +=item CV + +An internal "code value" typedef, holding a L</subroutine>. The L</CV> +type is a subclass of L</SV>. + +=back + +=head2 D + +=over 4 + +=item dangling statement + +A bare, single L</statement>, without any braces, hanging off an C<if> +or C<while> conditional. C allows them. Perl doesn't. + +=item data structure + +How your various pieces of data relate to each other and what shape +they make when you put them all together, as in a rectangular table or +a triangular-shaped tree. + +=item data type + +A set of possible values, together with all the operations that know +how to deal with those values. For example, a numeric data type has a +certain set of numbers that you can work with and various mathematical +operations that you can do on the numbers but would make little sense +on, say, a string such as C<"Kilroy">. Strings have their own +operations, such as L</concatenation>. Compound types made of a +number of smaller pieces generally have operations to compose and +decompose them, and perhaps to rearrange them. L<Objects|/object> +that model things in the real world often have operations that +correspond to real activities. For instance, if you model an +elevator, your elevator object might have an C<open_door()> +L</method>. + +=item datagram + +A packet of data, such as a L</UDP> message, that (from the viewpoint +of the programs involved) can be sent independently over the network. 
+(In fact, all packets are sent independently at the L</IP> level, but +L</stream> protocols such as L</TCP> hide this from your program.) + +=item DBM + +Stands for "Data Base Management" routines, a set of routines that +emulate an L</associative array> using disk files. The routines use a +dynamic hashing scheme to locate any entry with only two disk +accesses. DBM files allow a Perl program to keep a persistent +L</hash> across multiple invocations. You can L<tie|perlfunc/tie> +your hash variables to various DBM implementations--see L<AnyDBM_File> +and L<DB_File>. + +=item declaration + +An L</assertion> that states something exists and perhaps describes +what it's like, without giving any commitment as to how or where +you'll use it. A declaration is like the part of your recipe that +says, "two cups flour, one large egg, four or five tadpoles..." See +L</statement> for its opposite. Note that some declarations also +function as statements. Subroutine declarations also act as +definitions if a body is supplied. + +=item decrement + +To subtract a value from a variable, as in "decrement C<$x>" (meaning +to remove 1 from its value) or "decrement C<$x> by 3". + +=item default + +A L</value> chosen for you if you don't supply a value of your own. + +=item defined + +Having a meaning. Perl thinks that some of the things people try to +do are devoid of meaning, in particular, making use of variables that +have never been given a L</value> and performing certain operations on +data that isn't there. For example, if you try to read data past the +end of a file, Perl will hand you back an undefined value. See also +L</false> and L<perlfunc/defined>. + +=item delimiter + +A L</character> or L</string> that sets bounds to an arbitrarily-sized +textual object, not to be confused with a L</separator> or +L</terminator>. "To delimit" really just means "to surround" or "to +enclose" (like these parentheses are doing). 
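As a rough illustration of delimiters in Perl's own quoting and matching operators, the sketch below writes the same list with three different delimiter pairs and picks braces for a pattern so that the slashes in the text need no backslashes. The variable names and the sample path are made up for this example.

```perl
use strict;
use warnings;

# The same two-word list, enclosed by three different delimiter pairs.
my @parens   = qw(alpha beta);
my @brackets = qw[alpha beta];
my @slashes  = qw/alpha beta/;

# Picking a delimiter that does not occur in the pattern avoids backslashes.
my $path     = '/usr/local/bin/perl';   # sample value for this sketch
my $is_local = $path =~ m{^/usr/local/} ? 1 : 0;

print "@parens | @brackets | @slashes | $is_local\n";
```

The delimiters only mark where the text starts and stops; they are not part of the resulting values.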
+ +=item deprecated modules and features + +Deprecated modules and features are those which were part of a stable +release, but later found to be subtly flawed, and which should be avoided. +They are subject to removal and/or bug-incompatible reimplementation in +the next major release (but they will be preserved through maintenance +releases). Deprecation warnings are issued under B<-w> or C<use +diagnostics>, and notices are found in L<perldelta>s, as well as various +other PODs. Coding practices that misuse features, such as C<my $foo if +0>, can also be deprecated. + +=item dereference + +A fancy computer science term meaning "to follow a L</reference> to +what it points to". The "de" part of it refers to the fact that +you're taking away one level of L</indirection>. + +=item derived class + +A L</class> that defines some of its L<methods|/method> in terms of a +more generic class, called a L</base class>. Note that classes aren't +classified exclusively into base classes or derived classes: a class +can function as both a derived class and a base class simultaneously, +which is kind of classy. + +=item descriptor + +See L</file descriptor>. + +=item destroy + +To deallocate the memory of a L</referent> (first triggering its +C<DESTROY> method, if it has one). + +=item destructor + +A special L</method> that is called when an L</object> is thinking +about L<destroying|/destroy> itself. A Perl program's C<DESTROY> +method doesn't do the actual destruction; Perl just +L<triggers|/trigger> the method in case the L</class> wants to do any +associated cleanup. + +=item device + +A whiz-bang hardware gizmo (like a disk or tape drive or a modem or a +joystick or a mouse) attached to your computer, that the L</operating +system> tries to make look like a L</file> (or a bunch of files). +Under Unix, these fake files tend to live in the I</dev> directory. + +=item directive + +A L</pod> directive. See L<perlpod>. 
+ +=item directory + +A special file that contains other files. Some L<operating +systems|/operating system> call these "folders", "drawers", or +"catalogs". + +=item directory handle + +A name that represents a particular instance of opening a directory to +read it, until you close it. See the L<opendir|perlfunc/opendir> +function. + +=item dispatch + +To send something to its correct destination. Often used +metaphorically to indicate a transfer of programmatic control to a +destination selected algorithmically, often by lookup in a table of +function L<references|/reference> or, in the case of object +L<methods|/method>, by traversing the inheritance tree looking for the +most specific definition for the method. + +=item distribution + +A standard, bundled release of a system of software. The default +usage implies source code is included. If that is not the case, it +will be called a "binary-only" distribution. + +=item (to be) dropped modules + +When Perl 5 was first released (see L<perlhist>), several modules were +included, which have now fallen out of common use. It has been suggested +that these modules should be removed, since the distribution became rather +large, and the common criterion for new module additions is now limited to +modules that help to build, test, and extend perl itself. Furthermore, +the CPAN (which didn't exist at the time of Perl 5.0) can become the new +home of dropped modules. Dropping modules is currently not an option, but +further developments may clear the last barriers. + +=item dweomer + +An enchantment, illusion, phantasm, or jugglery. Said when Perl's +magical L</dwimmer> effects don't do what you expect, but rather seem +to be the product of arcane dweomercraft, sorcery, or wonder working. +[From Old English] + +=item dwimmer + +DWIM is an acronym for "Do What I Mean", the principle that something +should just do what you want it to do without an undue amount of fuss. +A bit of code that does "dwimming" is a "dwimmer". 
Dwimming can +require a great deal of behind-the-scenes magic, which (if it doesn't +stay properly behind the scenes) is called a L</dweomer> instead. + +=item dynamic scoping + +Dynamic scoping works over a dynamic scope, making variables visible +throughout the rest of the L</block> in which they are first used and +in any L<subroutines|/subroutine> that are called by the rest of the +block. Dynamically scoped variables can have their values temporarily +changed (and implicitly restored later) by a L<local|perlfunc/local> +operator. (Compare L</lexical scoping>.) Used more loosely to mean +how a subroutine that is in the middle of calling another subroutine +"contains" that subroutine at L</run time>. + +=back + +=head2 E + +=over 4 + +=item eclectic + +Derived from many sources. Some would say I<too> many. + +=item element + +A basic building block. When you're talking about an L</array>, it's +one of the items that make up the array. + +=item embedding + +When something is contained in something else, particularly when that +might be considered surprising: "I've embedded a complete Perl +interpreter in my editor!" + +=item empty list + +See L</null list>. + +=item empty subclass test + +The notion that an empty L</derived class> should behave exactly like +its L</base class>. + +=item en passant + +When you change a L</value> as it is being copied. [From French, "in +passing", as in the exotic pawn-capturing maneuver in chess.] + +=item encapsulation + +The veil of abstraction separating the L</interface> from the +L</implementation> (whether enforced or not), which mandates that all +access to an L</object>'s state be through L<methods|/method> alone. + +=item endian + +See L</little-endian> and L</big-endian>. + +=item environment + +The collective set of L<environment variables|/environment variable> +your L</process> inherits from its parent. Accessed via C<%ENV>.
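A minimal sketch of reading and setting the environment through C<%ENV>, assuming a Unix-like system where a second I<perl> can be started with a list-form pipe open. C<GLOSSARY_DEMO> is a hypothetical variable name chosen only for this example.

```perl
use strict;
use warnings;

# Read a variable inherited from the parent process (it may be unset).
my $path = defined $ENV{PATH} ? $ENV{PATH} : '(unset)';

# Set a variable; processes started from here on inherit it.
$ENV{GLOSSARY_DEMO} = 'camel';

# Start a child perl (no shell involved) and let it report what it sees.
open(my $child, '-|', $^X, '-e', 'print $ENV{GLOSSARY_DEMO}')
    or die "cannot start child perl: $!";
my $seen_by_child = <$child>;
close $child;

print "PATH is $path\n";
print "child saw: $seen_by_child\n";
```

Changes to C<%ENV> flow only downward to children; the parent of the current process never sees them.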
+ +=item environment variable + +A mechanism by which some high-level agent such as a user can pass its +preferences down to its future offspring (child L<processes|/process>, +grandchild processes, great-grandchild processes, and so on). Each +environment variable is a L</key>/L</value> pair, like one entry in a +L</hash>. + +=item EOF + +End of File. Sometimes used metaphorically as the terminating string +of a L</here document>. + +=item errno + +The error number returned by a L</syscall> when it fails. Perl refers +to the error by the name C<$!> (or C<$OS_ERROR> if you use the English +module). + +=item error + +See L</exception> or L</fatal error>. + +=item escape sequence + +See L</metasymbol>. + +=item exception + +A fancy term for an error. See L</fatal error>. + +=item exception handling + +The way a program responds to an error. The exception handling +mechanism in Perl is the L<eval|perlfunc/eval> operator. + +=item exec + +To throw away the current L</process>'s program and replace it with +another without exiting the process or relinquishing any resources +held (apart from the old memory image). + +=item executable file + +A L</file> that is specially marked to tell the L</operating system> +that it's okay to run this file as a program. Usually shortened to +"executable". + +=item execute + +To run a L<program|/executable file> or L</subroutine>. (Has nothing +to do with the L<kill|perlfunc/kill> built-in, unless you're trying to +run a L</signal handler>.) + +=item execute bit + +The special mark that tells the operating system it can run this +program. There are actually three execute bits under Unix, and which +bit gets used depends on whether you own the file singularly, +collectively, or not at all. + +=item exit status + +See L</status>. + +=item export + +To make symbols from a L</module> available for L</import> by other modules. + +=item expression + +Anything you can legally say in a spot where a L</value> is required. 
+Typically composed of L<literals|/literal>, L<variables|/variable>, +L<operators|/operator>, L<functions|/function>, and L</subroutine> +calls, not necessarily in that order. + +=item extension + +A Perl module that also pulls in compiled C or C++ code. More +generally, any experimental option that can be compiled into Perl, +such as multithreading. + +=back + +=head2 F + +=over 4 + +=item false + +In Perl, any value that would look like C<""> or C<"0"> if evaluated +in a string context. Since undefined values evaluate to C<"">, all +undefined values are false (including the L</null list>), but not all +false values are undefined. + +=item FAQ + +Frequently Asked Question (although not necessarily frequently +answered, especially if the answer appears in the Perl FAQ shipped +standard with Perl). + +=item fatal error + +An uncaught L</exception>, which causes termination of the L</process> +after printing a message on your L</standard error> stream. Errors +that happen inside an L<eval|perlfunc/eval> are not fatal. Instead, +the L<eval|perlfunc/eval> terminates after placing the exception +message in the C<$@> (C<$EVAL_ERROR>) variable. You can try to +provoke a fatal error with the L<die|perlfunc/die> operator (known as +throwing or raising an exception), but this may be caught by a +dynamically enclosing L<eval|perlfunc/eval>. If not caught, the +L<die|perlfunc/die> becomes a fatal error. + +=item field + +A single piece of numeric or string data that is part of a longer +L</string>, L</record>, or L</line>. Variable-width fields are usually +split up by L<separators|/separator> (so use L<split|perlfunc/split> to +extract the fields), while fixed-width fields are usually at fixed +positions (so use L<unpack|perlfunc/unpack>). L<Instance +variables|/instance variable> are also known as fields. + +=item FIFO + +First In, First Out. See also L</LIFO>. Also, a nickname for a +L</named pipe>. 
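In the first sense (a queue rather than a named pipe), FIFO behavior can be sketched with an ordinary Perl array: add at one end with C<push>, remove from the other with C<shift>. The array names here are illustrative only.

```perl
use strict;
use warnings;

# Items enter at the back of the queue...
my @queue;
push @queue, 'first', 'second', 'third';

# ...and leave from the front, so they come out in arrival order.
my @served;
push @served, shift @queue while @queue;

print "@served\n";
```

Replacing C<shift> with C<pop> would take items from the same end they were added to, which gives LIFO order instead.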
+ +=item file + +A named collection of data, usually stored on disk in a L</directory> +in a L</filesystem>. Roughly like a document, if you're into office +metaphors. In modern filesystems, you can actually give a file more +than one name. Some files have special properties, like directories +and devices. + +=item file descriptor + +The little number the L</operating system> uses to keep track of which +opened L</file> you're talking about. Perl hides the file descriptor +inside a L</standard IE<sol>O> stream and then attaches the stream to +a L</filehandle>. + +=item file test operator + +A built-in unary operator that you use to determine whether something +is L</true> about a file, such as C<-o $filename> to test whether +you're the owner of the file. + +=item fileglob + +A "wildcard" match on L<filenames|/filename>. See the +L<glob|perlfunc/glob> function. + +=item filehandle + +An identifier (not necessarily related to the real name of a file) +that represents a particular instance of opening a file until you +close it. If you're going to open and close several different files +in succession, it's fine to open each of them with the same +filehandle, so you don't have to write out separate code to process +each file. + +=item filename + +One name for a file. This name is listed in a L</directory>, and you +can use it in an L<open|perlfunc/open> to tell the L</operating +system> exactly which file you want to open, and associate the file +with a L</filehandle> which will carry the subsequent identity of that +file in your program, until you close it. + +=item filesystem + +A set of L<directories|/directory> and L<files|/file> residing on a +partition of the disk. Sometimes known as a "partition". You can +change the file's name or even move a file around from directory to +directory within a filesystem without actually moving the file itself, +at least under Unix. 
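The remark about renaming without moving the data can be tried out with a small sketch like the one below. It assumes a Unix-like filesystem where C<stat> reports inode numbers; the file names are invented for the example and everything lives in a temporary directory.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use File::Spec;

# Work in a scratch directory that is removed when the program exits.
my $dir = tempdir(CLEANUP => 1);
my $old = File::Spec->catfile($dir, 'old.txt');
my $new = File::Spec->catfile($dir, 'new.txt');

open(my $fh, '>', $old) or die "cannot create $old: $!";
print {$fh} "data\n";
close $fh;

# Field 1 of stat is the inode number on systems that have one.
my $inode_before = (stat $old)[1];
rename $old, $new or die "rename failed: $!";
my $inode_after = (stat $new)[1];

print "inode $inode_before -> $inode_after\n";
```

On such systems the two inode numbers are normally identical, since only the directory entry changed.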
+ +=item filter + +A program designed to take a L</stream> of input and transform it into +a stream of output. + +=item flag + +We tend to avoid this term because it means so many things. It may +mean a command-line L</switch> that takes no argument +itself (such as Perl's B<-n> and B<-p> +flags) or, less frequently, a single-bit indicator (such as the +C<O_CREAT> and C<O_EXCL> flags used in +L<sysopen|perlfunc/sysopen>). + +=item floating point + +A method of storing numbers in "scientific notation", such that the +precision of the number is independent of its magnitude (the decimal +point "floats"). Perl does its numeric work with floating-point +numbers (sometimes called "floats"), when it can't get away with +using L<integers|/integer>. Floating-point numbers are mere +approximations of real numbers. + +=item flush + +The act of emptying a L</buffer>, often before it's full. + +=item FMTEYEWTK + +Far More Than Everything You Ever Wanted To Know. An exhaustive +treatise on one narrow topic, something of a super-L</FAQ>. See Tom +for far more. + +=item fork + +To create a child L</process> identical to the parent process at its +moment of conception, at least until it gets ideas of its own. A +thread with protected memory. + +=item formal arguments + +The generic names by which a L</subroutine> knows its +L<arguments|/argument>. In many languages, formal arguments are +always given individual names, but in Perl, the formal arguments are +just the elements of an array. The formal arguments to a Perl program +are C<$ARGV[0]>, C<$ARGV[1]>, and so on. Similarly, the formal +arguments to a Perl subroutine are C<$_[0]>, C<$_[1]>, and so on. You +may give the arguments individual names by assigning the values to a +L<my|perlfunc/my> list. See also L</actual arguments>. + +=item format + +A specification of how many spaces and digits and things to put +somewhere so that whatever you're printing comes out nice and pretty. 
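One way to see a format specification at work is C<sprintf>, used here instead of Perl's C<format>/C<write> built-ins to keep the sketch short. The values are arbitrary sample data.

```perl
use strict;
use warnings;

# %-8s  : string, left-justified in 8 columns
# %6.2f : number, 6 columns wide, 2 digits after the decimal point
# %03d  : integer, zero-padded to 3 digits
my $line = sprintf "%-8s|%6.2f|%03d", 'camel', 3.14159, 7;

print "$line\n";   # prints "camel   |  3.14|007"
```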
+ +=item freely available + +Means you don't have to pay money to get it, but the copyright on it +may still belong to someone else (like Larry). + +=item freely redistributable + +Means you're not in legal trouble if you give a bootleg copy of it to +your friends and we find out about it. In fact, we'd rather you gave +a copy to all your friends. + +=item freeware + +Historically, any software that you give away, particularly if you +make the source code available as well. Now often called C<open +source software>. Recently there has been a trend to use the term in +contradistinction to L</open source software>, to refer only to free +software released under the Free Software Foundation's GPL (General +Public License), but this is difficult to justify etymologically. + +=item function + +Mathematically, a mapping of each of a set of input values to a +particular output value. In computers, refers to a L</subroutine> or +L</operator> that returns a L</value>. It may or may not have input +values (called L<arguments|/argument>). + +=item funny character + +Someone like Larry, or one of his peculiar friends. Also refers to +the strange prefixes that Perl requires as noun markers on its +variables. + +=back + +=head2 G + +=over 4 + +=item garbage collection + +A misnamed feature--it should be called, "expecting your mother to +pick up after you". Strictly speaking, Perl doesn't do this, but it +relies on a reference-counting mechanism to keep things tidy. +However, we rarely speak strictly and will often refer to the +reference-counting scheme as a form of garbage collection. (If it's +any comfort, when your interpreter exits, a "real" garbage collector +runs to make sure everything is cleaned up if you've been messy with +circular references and such.) + +=item GID + +Group ID--in Unix, the numeric group ID that the L</operating system> +uses to identify you and members of your L</group>. 
+ +=item glob + +Strictly, the shell's C<*> character, which will match a "glob" of +characters when you're trying to generate a list of filenames. +Loosely, the act of using globs and similar symbols to do pattern +matching. See also L</fileglob> and L</typeglob>. + +=item global + +Something you can see from anywhere, usually used of +L<variables|/variable> and L<subroutines|/subroutine> that are visible +everywhere in your program. In Perl, only certain special variables +are truly global--most variables (and all subroutines) exist only in +the current L</package>. Global variables can be declared with +L<our|perlfunc/our>. See L<perlfunc/our>. + +=item global destruction + +The L</garbage collection> of globals (and the running of any +associated object destructors) that takes place when a Perl +L</interpreter> is being shut down. Global destruction should not be +confused with the Apocalypse, except perhaps when it should. + +=item glue language + +A language such as Perl that is good at hooking things together that +weren't intended to be hooked together. + +=item granularity + +The size of the pieces you're dealing with, mentally speaking. + +=item greedy + +A L</subpattern> whose L</quantifier> wants to match as many things as +possible. + +=item grep + +Originally from the old Unix editor command for "Globally search for a +Regular Expression and Print it", now used in the general sense of any +kind of search, especially text searches. Perl has a built-in +L<grep|perlfunc/grep> function that searches a list for elements +matching any given criterion, whereas the I<grep>(1) program searches +for lines matching a L</regular expression> in one or more files. + +=item group + +A set of users of which you are a member. In some operating systems +(like Unix), you can give certain file access permissions to other +members of your group. + +=item GV + +An internal "glob value" typedef, holding a L</typeglob>. The L</GV> +type is a subclass of L</SV>. 
+ +=back + +=head2 H + +=over 4 + +=item hacker + +Someone who is brilliantly persistent in solving technical problems, +whether these involve golfing, fighting orcs, or programming. Hacker +is a neutral term, morally speaking. Good hackers are not to be +confused with evil L<crackers|/cracker> or clueless L<script +kiddies|/script kiddie>. If you confuse them, we will presume that +you are either evil or clueless. + +=item handler + +A L</subroutine> or L</method> that is called by Perl when your +program needs to respond to some internal event, such as a L</signal>, +or an encounter with an operator subject to L</operator overloading>. +See also L</callback>. + +=item hard reference + +A L</scalar> L</value> containing the actual address of a +L</referent>, such that the referent's L</reference> count accounts +for it. (Some hard references are held internally, such as the +implicit reference from one of a L</typeglob>'s variable slots to its +corresponding referent.) A hard reference is different from a +L</symbolic reference>. + +=item hash + +An unordered association of L</key>/L</value> pairs, stored such that +you can easily use a string L</key> to look up its associated data +L</value>. This glossary is like a hash, where the word to be defined +is the key, and the definition is the value. A hash is also sometimes +septisyllabically called an "associative array", which is a pretty +good reason for simply calling it a "hash" instead. A hash can optionally +be L<restricted|/restricted hash> to a fixed set of keys. + +=item hash table + +A data structure used internally by Perl for implementing associative +arrays (hashes) efficiently. See also L</bucket>. + +=item header file + +A file containing certain required definitions that you must include +"ahead" of the rest of your program to do certain obscure operations. +A C header file has a I<.h> extension. 
Perl doesn't really have +header files, though historically Perl has sometimes used translated +I<.h> files with a I<.ph> extension. See L<perlfunc/require>. +(Header files have been superseded by the L</module> mechanism.) + +=item here document + +So called because of a similar construct in L<shells|/shell> that +pretends that the L<lines|/line> following the L</command> are a +separate L</file> to be fed to the command, up to some terminating +string. In Perl, however, it's just a fancy form of quoting. + +=item hexadecimal + +A number in base 16, "hex" for short. The digits for 10 through 15 +are customarily represented by the letters C<a> through C<f>. +Hexadecimal constants in Perl start with C<0x>. See also +L<perlfunc/hex>. + +=item home directory + +The directory you are put into when you log in. On a Unix system, the +name is often placed into C<$ENV{HOME}> or C<$ENV{LOGDIR}> by +I<login>, but you can also find it with C<(getpwuid($E<lt>))[7]>. +(Some platforms do not have a concept of a home directory.) + +=item host + +The computer on which a program or other data resides. + +=item hubris + +Excessive pride, the sort of thing Zeus zaps you for. Also the +quality that makes you write (and maintain) programs that other people +won't want to say bad things about. Hence, the third great virtue of +a programmer. See also L</laziness> and L</impatience>. + +=item HV + +Short for a "hash value" typedef, which holds Perl's internal +representation of a hash. The L</HV> type is a subclass of L</SV>. + +=back + +=head2 I + +=over 4 + +=item identifier + +A legally formed name for most anything in which a computer program +might be interested. Many languages (including Perl) allow +identifiers that start with a letter and contain letters and digits. +Perl also counts the underscore character as a valid letter. (Perl +also has more complicated names, such as L</qualified> names.) + +=item impatience + +The anger you feel when the computer is being lazy.
This makes you +write programs that don't just react to your needs, but actually +anticipate them. Or at least that pretend to. Hence, the second +great virtue of a programmer. See also L</laziness> and L</hubris>. + +=item implementation + +How a piece of code actually goes about doing its job. Users of the +code should not count on implementation details staying the same +unless they are part of the published L</interface>. + +=item import + +To gain access to symbols that are exported from another module. See +L<perlfunc/use>. + +=item increment + +To increase the value of something by 1 (or by some other number, if +so specified). + +=item indexing + +In olden days, the act of looking up a L</key> in an actual index +(such as a phone book), but now merely the act of using any kind of +key or position to find the corresponding L</value>, even if no index +is involved. Things have degenerated to the point that Perl's +L<index|perlfunc/index> function merely locates the position (index) +of one string in another. + +=item indirect filehandle + +An L</expression> that evaluates to something that can be used as a +L</filehandle>: a L</string> (filehandle name), a L</typeglob>, a +typeglob L</reference>, or a low-level L</IO> object. + +=item indirect object + +In English grammar, a short noun phrase between a verb and its direct +object indicating the beneficiary or recipient of the action. In +Perl, C<print STDOUT "$foo\n";> can be understood as "verb +indirect-object object" where L</STDOUT> is the recipient of the +L<print|perlfunc/print> action, and C<"$foo"> is the object being +printed. Similarly, when invoking a L</method>, you might place the +invocant between the method and its arguments: + + $gollum = new Pathetic::Creature "Smeagol"; + give $gollum "Fisssssh!"; + give $gollum "Precious!"; + +In modern Perl, calling methods this way is often considered bad practice and +to be avoided. 
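For comparison, the same calls can be written with the arrow syntax that is generally preferred over indirect object syntax. The C<Pathetic::Creature> class below is a hypothetical stand-in written for this sketch; the glossary does not define it.

```perl
use strict;
use warnings;

package Pathetic::Creature;

# Constructor: remember the name and start with no gifts.
sub new {
    my ($class, $name) = @_;
    return bless { name => $name, gifts => [] }, $class;
}

# Record a gift and return the object so calls can be chained.
sub give {
    my ($self, $gift) = @_;
    push @{ $self->{gifts} }, $gift;
    return $self;
}

package main;

# Arrow syntax: the invocant comes first, then the method name.
my $gollum = Pathetic::Creature->new("Smeagol");
$gollum->give("Fisssssh!");
$gollum->give("Precious!");

print scalar @{ $gollum->{gifts} }, " gifts for $gollum->{name}\n";
```

With the arrow form, the parser never has to guess whether the first word is a method or a function, which is the main source of surprises with the indirect form.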
+ +=item indirect object slot + +The syntactic position falling between a method call and its arguments +when using the indirect object invocation syntax. (The slot is +distinguished by the absence of a comma between it and the next +argument.) L</STDERR> is in the indirect object slot here: + + print STDERR "Awake! Awake! Fear, Fire, + Foes! Awake!\n"; + +=item indirection + +If something in a program isn't the value you're looking for but +indicates where the value is, that's indirection. This can be done +with either L<symbolic references|/symbolic reference> or L<hard +references|/hard reference>. + +=item infix + +An L</operator> that comes in between its L<operands|/operand>, such +as multiplication in C<24 * 7>. + +=item inheritance + +What you get from your ancestors, genetically or otherwise. If you +happen to be a L</class>, your ancestors are called L<base +classes|/base class> and your descendants are called L<derived +classes|/derived class>. See L</single inheritance> and L</multiple +inheritance>. + +=item instance + +Short for "an instance of a class", meaning an L</object> of that L</class>. + +=item instance variable + +An L</attribute> of an L</object>; data stored with the particular +object rather than with the class as a whole. + +=item integer + +A number with no fractional (decimal) part. A counting number, like +1, 2, 3, and so on, but including 0 and the negatives. + +=item interface + +The services a piece of code promises to provide forever, in contrast to +its L</implementation>, which it should feel free to change whenever it +likes. + +=item interpolation + +The insertion of a scalar or list value somewhere in the middle of +another value, such that it appears to have been there all along. In +Perl, variable interpolation happens in double-quoted strings and +patterns, and list interpolation occurs when constructing the list of +values to pass to a list operator or other such construct that takes a +L</LIST>. 
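Both kinds of interpolation can be seen in a short sketch; the variable names are illustrative.

```perl
use strict;
use warnings;

my $name  = 'Camel';
my @humps = (1, 2);

# Variable interpolation: scalars and arrays are expanded in double quotes.
my $double = "$name has humps @humps";

# Single quotes do not interpolate; the text stays exactly as written.
my $single = '$name has humps @humps';

# List interpolation: @humps is flattened into the surrounding list.
my @list = (0, @humps, 3);

print "$double\n$single\n", scalar(@list), " elements\n";
```

An interpolated array is joined with the value of C<$"> (a single space by default), which is why the humps appear as C<1 2>.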
+ +=item interpreter + +Strictly speaking, a program that reads a second program and does what +the second program says directly without turning the program into a +different form first, which is what L<compilers|/compiler> do. Perl +is not an interpreter by this definition, because it contains a kind +of compiler that takes a program and turns it into a more executable +form (L<syntax trees|/syntax tree>) within the I<perl> process itself, +which the Perl L</run time> system then interprets. + +=item invocant + +The agent on whose behalf a L</method> is invoked. In a L</class> +method, the invocant is a package name. In an L</instance> method, +the invocant is an object reference. + +=item invocation + +The act of calling up a deity, daemon, program, method, subroutine, or +function to get it to do what you think it's supposed to do. We usually +"call" subroutines but "invoke" methods, since it sounds cooler. + +=item I/O + +Input from, or output to, a L</file> or L</device>. + +=item IO + +An internal I/O object. Can also mean L</indirect object>. + +=item IP + +Internet Protocol, or Intellectual Property. + +=item IPC + +Interprocess Communication. + +=item is-a + +A relationship between two L<objects|/object> in which one object is +considered to be a more specific version of the other, generic object: +"A camel is a mammal." Since the generic object really only exists in +a Platonic sense, we usually add a little abstraction to the notion of +objects and think of the relationship as being between a generic +L</base class> and a specific L</derived class>. Oddly enough, +Platonic classes don't always have Platonic relationships--see +L</inheritance>. + +=item iteration + +Doing something repeatedly. + +=item iterator + +A special programming gizmo that keeps track of where you are in +something that you're trying to iterate over. The C<foreach> loop in +Perl contains an iterator; so does a hash, allowing you to +L<each|perlfunc/each> through it.
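A small sketch of the hash iterator in use, with made-up data. Because a hash is unordered, the pairs are collected first and printed in sorted order so that the output is predictable.

```perl
use strict;
use warnings;

my %legs = (camel => 4, llama => 4, emu => 2);

# each() hands back one key/value pair per call and remembers its place;
# it returns an empty list once every pair has been visited.
my %seen;
while (my ($animal, $count) = each %legs) {
    $seen{$animal} = $count;
}

print "$_ has $seen{$_} legs\n" for sort keys %seen;
```

Each hash has only one such iterator, so adding or deleting other keys in the middle of an C<each> loop can lead to skipped or repeated pairs.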
+ +=item IV + +The integer four, not to be confused with six, Tom's favorite editor. +IV also means an internal Integer Value of the type a L</scalar> can +hold, not to be confused with an L</NV>. + +=back + +=head2 J + +=over 4 + +=item JAPH + +"Just Another Perl Hacker," a clever but cryptic bit of Perl code that, +when executed, evaluates to that string. Often used to illustrate a +particular Perl feature, and something of an ongoing Obfuscated Perl +Contest seen in Usenet signatures. + +=back + +=head2 K + +=over 4 + +=item key + +The string index to a L</hash>, used to look up the L</value> +associated with that key. + +=item keyword + +See L</reserved words>. + +=back + +=head2 L + +=over 4 + +=item label + +A name you give to a L</statement> so that you can talk about that +statement elsewhere in the program. + +=item laziness + +The quality that makes you go to great effort to reduce overall energy +expenditure. It makes you write labor-saving programs that other +people will find useful, and document what you wrote so you don't have +to answer so many questions about it. Hence, the first great virtue +of a programmer. Also hence, this book. See also L</impatience> and +L</hubris>. + +=item left shift + +A L</bit shift> that multiplies the number by some power of 2. + +=item leftmost longest + +The preference of the L</regular expression> engine to match the +leftmost occurrence of a L</pattern>, then given a position at which a +match will occur, the preference for the longest match (presuming the +use of a L</greedy> quantifier). See L<perlre> for I<much> more on +this subject. + +=item lexeme + +Fancy term for a L</token>. + +=item lexer + +Fancy term for a L</tokener>. + +=item lexical analysis + +Fancy term for L</tokenizing>. + +=item lexical scoping + +Looking at your I<Oxford English Dictionary> through a microscope. +(Also known as L</static scoping>, because dictionaries don't change +very fast.) 
Similarly, looking at variables stored in a private +dictionary (namespace) for each scope, which are visible only from +their point of declaration down to the end of the lexical scope in +which they are declared. --Syn. L</static scoping>. +--Ant. L</dynamic scoping>. + +=item lexical variable + +A L</variable> subject to L</lexical scoping>, declared by +L<my|perlfunc/my>. Often just called a "lexical". (The +L<our|perlfunc/our> declaration declares a lexically scoped name for a +global variable, which is not itself a lexical variable.) + +=item library + +Generally, a collection of procedures. In ancient days, referred to a +collection of subroutines in a I<.pl> file. In modern times, refers +more often to the entire collection of Perl L<modules|/module> on your +system. + +=item LIFO + +Last In, First Out. See also L</FIFO>. A LIFO is usually called a +L</stack>. + +=item line + +In Unix, a sequence of zero or more non-newline characters terminated +with a L</newline> character. On non-Unix machines, this is emulated +by the C library even if the underlying L</operating system> has +different ideas. + +=item line buffering + +Used by a L</standard IE<sol>O> output stream that flushes its +L</buffer> after every L</newline>. Many standard I/O libraries +automatically set up line buffering on output that is going to the +terminal. + +=item line number + +The number of lines read previous to this one, plus 1. Perl keeps a +separate line number for each source or input file it opens. The +current source file's line number is represented by C<__LINE__>. The +current input line number (for the file that was most recently read +via C<< E<lt>FHE<gt> >>) is represented by the C<$.> +(C<$INPUT_LINE_NUMBER>) variable. Many error messages report both +values, if available. + +=item link + +Used as a noun, a name in a L</directory>, representing a L</file>. A +given file can have multiple links to it. 
It's like having the same +phone number listed in the phone directory under different names. As +a verb, to resolve a partially compiled file's unresolved symbols into +a (nearly) executable image. Linking can generally be static or +dynamic, which has nothing to do with static or dynamic scoping. + +=item LIST + +A syntactic construct representing a comma-separated list of +expressions, evaluated to produce a L</list value>. Each +L</expression> in a L</LIST> is evaluated in L</list context> and +interpolated into the list value. + +=item list + +An ordered set of scalar values. + +=item list context + +The situation in which an L</expression> is expected by its +surroundings (the code calling it) to return a list of values rather +than a single value. Functions that want a L</LIST> of arguments tell +those arguments that they should produce a list value. See also +L</context>. + +=item list operator + +An L</operator> that does something with a list of values, such as +L<join|perlfunc/join> or L<grep|perlfunc/grep>. Usually used for +named built-in operators (such as L<print|perlfunc/print>, +L<unlink|perlfunc/unlink>, and L<system|perlfunc/system>) that do not +require parentheses around their L</argument> list. + +=item list value + +An unnamed list of temporary scalar values that may be passed around +within a program from any list-generating function to any function or +construct that provides a L</list context>. + +=item literal + +A token in a programming language such as a number or L</string> that +gives you an actual L</value> instead of merely representing possible +values as a L</variable> does. + +=item little-endian + +From Swift: someone who eats eggs little end first. Also used of +computers that store the least significant L</byte> of a word at a +lower byte address than the most significant byte. Often considered +superior to big-endian machines. See also L</big-endian>. + +=item local + +Not meaning the same thing everywhere. 
A global variable in Perl can +be localized inside a L<dynamic scope|/dynamic scoping> via the +L<local|perlfunc/local> operator. + +=item logical operator + +Symbols representing the concepts "and", "or", "xor", and "not". + +=item lookahead + +An L</assertion> that peeks at the string to the right of the current +match location. + +=item lookbehind + +An L</assertion> that peeks at the string to the left of the current +match location. + +=item loop + +A construct that performs something repeatedly, like a roller coaster. + +=item loop control statement + +Any statement within the body of a loop that can make a loop +prematurely stop looping or skip an L</iteration>. Generally you +shouldn't try this on roller coasters. + +=item loop label + +A kind of key or name attached to a loop (or roller coaster) so that +loop control statements can talk about which loop they want to +control. + +=item lvaluable + +Able to serve as an L</lvalue>. + +=item lvalue + +Term used by language lawyers for a storage location you can assign a +new L</value> to, such as a L</variable> or an element of an +L</array>. The "l" is short for "left", as in the left side of an +assignment, a typical place for lvalues. An L</lvaluable> function or +expression is one to which a value may be assigned, as in C<pos($x) = +10>. + +=item lvalue modifier + +An adjectival pseudofunction that warps the meaning of an L</lvalue> +in some declarative fashion. Currently there are three lvalue +modifiers: L<my|perlfunc/my>, L<our|perlfunc/our>, and +L<local|perlfunc/local>. + +=back + +=head2 M + +=over 4 + +=item magic + +Technically speaking, any extra semantics attached to a variable such +as C<$!>, C<$0>, C<%ENV>, or C<%SIG>, or to any tied variable. +Magical things happen when you diddle those variables. + +=item magical increment + +An L</increment> operator that knows how to bump up alphabetics as +well as numbers. 
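A short sketch of the magical increment at work. The sample strings are chosen for illustration, and the magic applies only to strings of letters optionally followed by digits that have never been used as numbers:

```perl
use strict;
use warnings;

my @before = ("a9", "Az", "zz", "99");
my @after;

for my $s (@before) {
    my $copy = $s;    # copy so the original string is left untouched
    $copy++;          # string increment, carrying within each character class
    push @after, $copy;
}

print "$before[$_] -> $after[$_]\n" for 0 .. $#before;
```

The decrement operator has no such magic.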
+ +=item magical variables + +Special variables that have side effects when you access them or +assign to them. For example, in Perl, changing elements of the +C<%ENV> hash also changes the corresponding environment variables +that subprocesses will use. Reading the C<$!> variable gives you the +current system error number or message. + +=item Makefile + +A file that controls the compilation of a program. Perl programs +don't usually need a L</Makefile> because the Perl compiler has plenty +of self-control. + +=item man + +The Unix program that displays online documentation (manual pages) for +you. + +=item manpage + +A "page" from the manuals, typically accessed via the I<man>(1) +command. A manpage contains a SYNOPSIS, a DESCRIPTION, a list of +BUGS, and so on, and is typically longer than a page. There are +manpages documenting L<commands|/command>, L<syscalls|/syscall>, +L</library> L<functions|/function>, L<devices|/device>, +L<protocols|/protocol>, L<files|/file>, and such. In this book, we +call any piece of standard Perl documentation (like I<perlop> or +I<perldelta>) a manpage, no matter what format it's installed in on +your system. + +=item matching + +See L</pattern matching>. + +=item member data + +See L</instance variable>. + +=item memory + +This always means your main memory, not your disk. Clouding the issue +is the fact that your machine may implement L</virtual> memory; that +is, it will pretend that it has more memory than it really does, and +it'll use disk space to hold inactive bits. This can make it seem +like you have a little more memory than you really do, but it's not a +substitute for real memory. The best thing that can be said about +virtual memory is that it lets your performance degrade gradually +rather than suddenly when you run out of real memory. But your +program can die when you run out of virtual memory too, if you haven't +thrashed your disk to death first. 
+ +=item metacharacter + +A L</character> that is I<not> supposed to be treated normally. Which +characters are to be treated specially as metacharacters varies +greatly from context to context. Your L</shell> will have certain +metacharacters, double-quoted Perl L<strings|/string> have other +metacharacters, and L</regular expression> patterns have all the +double-quote metacharacters plus some extra ones of their own. + +=item metasymbol + +Something we'd call a L</metacharacter> except that it's a sequence of +more than one character. Generally, the first character in the +sequence must be a true metacharacter to get the other characters in +the metasymbol to misbehave along with it. + +=item method + +A kind of action that an L</object> can take if you tell it to. See +L<perlobj>. + +=item minimalism + +The belief that "small is beautiful." Paradoxically, if you say +something in a small language, it turns out big, and if you say it in +a big language, it turns out small. Go figure. + +=item mode + +In the context of the L<stat(2)> syscall, refers to the field holding +the L</permission bits> and the type of the L</file>. + +=item modifier + +See L</statement modifier>, L</regular expression modifier>, and +L</lvalue modifier>, not necessarily in that order. + +=item module + +A L</file> that defines a L</package> of (almost) the same name, which +can either L</export> symbols or function as an L</object> class. (A +module's main I<.pm> file may also load in other files in support of +the module.) See the L<use|perlfunc/use> built-in. + +=item modulus + +An integer divisor when you're interested in the remainder instead of +the quotient. + +=item monger + +Short for Perl Monger, a purveyor of Perl. + +=item mortal + +A temporary value scheduled to die when the current statement +finishes. + +=item multidimensional array + +An array with multiple subscripts for finding a single element. 
Perl +implements these using L<references|/reference>--see L<perllol> and +L<perldsc>. + +=item multiple inheritance + +The features you got from your mother and father, mixed together +unpredictably. (See also L</inheritance> and L</single +inheritance>.) In computer languages (including Perl), the notion +that a given class may have multiple direct ancestors or L<base +classes|/base class>. + +=back + +=head2 N + +=over 4 + +=item named pipe + +A L</pipe> with a name embedded in the L</filesystem> so that it can +be accessed by two unrelated L<processes|/process>. + +=item namespace + +A domain of names. You needn't worry about whether the names in one +such domain have been used in another. See L</package>. + +=item network address + +The most important attribute of a socket, like your telephone's +telephone number. Typically an IP address. See also L</port>. + +=item newline + +A single character that represents the end of a line, with the ASCII +value of 012 octal under Unix (but 015 on classic Mac OS), and represented by +C<\n> in Perl strings. For Windows machines writing text files, and +for certain physical devices like terminals, the single newline gets +automatically translated by your C library into a line feed and a +carriage return, but normally, no translation is done. + +=item NFS + +Network File System, which allows you to mount a remote filesystem as +if it were local. + +=item null character + +A character with the ASCII value of zero. It's used by C to terminate +strings, but Perl allows strings to contain a null. + +=item null list + +A valueless value represented in Perl by C<()>. It is not really a +L</LIST>, but an expression that yields C<undef> in L</scalar context> and +a L</list value> with zero elements in L</list context>. + +=item null string + +A L</string> containing no characters, not to be confused with a +string containing a L</null character>, which has a positive length +and is L</true>. 
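The distinctions drawn in the last three entries can be checked directly. This is only a sketch, and the variable names are invented for the example:

```perl
use strict;
use warnings;

my @none     = ();      # null list in list context: zero elements
my $nothing  = ();      # null list in scalar context: undef
my $empty    = "";      # null string: no characters, false
my $nul_char = "\0";    # one null character: length 1, true

my $element_count = scalar @none;
my $is_defined    = defined $nothing ? 1 : 0;
my $empty_is_true = $empty    ? 1 : 0;
my $nul_is_true   = $nul_char ? 1 : 0;

print "elements: $element_count, defined: $is_defined\n";
print "empty string true? $empty_is_true (length ", length($empty), ")\n";
print "null char true?    $nul_is_true (length ", length($nul_char), ")\n";
```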
+ +=item numeric context + +The situation in which an expression is expected by its surroundings +(the code calling it) to return a number. See also L</context> and +L</string context>. + +=item NV + +Short for Nevada, no part of which will ever be confused with +civilization. NV also means an internal floating-point Numeric Value +of the type a L</scalar> can hold, not to be confused with an L</IV>. + +=item nybble + +Half a L</byte>, equivalent to one L</hexadecimal> digit, and worth +four L<bits|/bit>. + +=back + +=head2 O + +=over 4 + +=item object + +An L</instance> of a L</class>. Something that "knows" what +user-defined type (class) it is, and what it can do because of what +class it is. Your program can request an object to do things, but the +object gets to decide whether it wants to do them or not. Some +objects are more accommodating than others. + +=item octal + +A number in base 8. Only the digits 0 through 7 are allowed. Octal +constants in Perl start with 0, as in 013. See also the +L<oct|perlfunc/oct> function. + +=item offset + +How many things you have to skip over when moving from the beginning +of a string or array to a specific position within it. Thus, the +minimum offset is zero, not one, because you don't skip anything to +get to the first item. + +=item one-liner + +An entire computer program crammed into one line of text. + +=item open source software + +Programs for which the source code is freely available and freely +redistributable, with no commercial strings attached. For a more +detailed definition, see L<http://www.opensource.org/osd.html>. + +=item operand + +An L</expression> that yields a L</value> that an L</operator> +operates on. See also L</precedence>. + +=item operating system + +A special program that runs on the bare machine and hides the gory +details of managing L<processes|/process> and L<devices|/device>. +Usually used in a looser sense to indicate a particular culture of +programming. 
The loose sense can be used at varying levels of +specificity. At one extreme, you might say that all versions of Unix +and Unix-lookalikes are the same operating system (upsetting many +people, especially lawyers and other advocates). At the other +extreme, you could say this particular version of this particular +vendor's operating system is different from any other version of this +or any other vendor's operating system. Perl is much more portable +across operating systems than many other languages. See also +L</architecture> and L</platform>. + +=item operator + +A gizmo that transforms some number of input values to some number of +output values, often built into a language with a special syntax or +symbol. A given operator may have specific expectations about what +L<types|/type> of data you give as its arguments +(L<operands|/operand>) and what type of data you want back from it. + +=item operator overloading + +A kind of L</overloading> that you can do on built-in +L<operators|/operator> to make them work on L<objects|/object> as if +the objects were ordinary scalar values, but with the actual semantics +supplied by the object class. This is set up with the L<overload> +L</pragma>. + +=item options + +See either L<switches|/switch> or L</regular expression modifier>. + +=item ordinal + +Another name for L</code point>. + +=item overloading + +Giving additional meanings to a symbol or construct. Actually, all +languages do overloading to one extent or another, since people are +good at figuring out things from L</context>. + +=item overriding + +Hiding or invalidating some other definition of the same name. (Not +to be confused with L</overloading>, which adds definitions that must +be disambiguated some other way.) 
To confuse the issue further, we use +the word with two overloaded definitions: to describe how you can +define your own L</subroutine> to hide a built-in L</function> of the +same name (see L<perlsub/Overriding Built-in Functions>) and to +describe how you can define a replacement L</method> in a L</derived +class> to hide a L</base class>'s method of the same name (see +L<perlobj>). + +=item owner + +The one user (apart from the superuser) who has absolute control over +a L</file>. A file may also have a L</group> of users who may +exercise joint ownership if the real owner permits it. See +L</permission bits>. + +=back + +=head2 P + +=over 4 + +=item package + +A L</namespace> for global L<variables|/variable>, +L<subroutines|/subroutine>, and the like, such that they can be kept +separate from like-named L<symbols|/symbol> in other namespaces. In a +sense, only the package is global, since the symbols in the package's +symbol table are only accessible from code compiled outside the +package by naming the package. But in another sense, all package +symbols are also globals--they're just well-organized globals. + +=item pad + +Short for L</scratchpad>. + +=item parameter + +See L</argument>. + +=item parent class + +See L</base class>. + +=item parse tree + +See L</syntax tree>. + +=item parsing + +The subtle but sometimes brutal art of attempting to turn your +possibly malformed program into a valid L</syntax tree>. + +=item patch + +To fix by applying one, as it were. In the realm of hackerdom, a +listing of the differences between two versions of a program as might +be applied by the I<patch>(1) program when you want to fix a bug or +upgrade your old version. + +=item PATH + +The list of L<directories|/directory> the system searches to find a +program you want to L</execute>. The list is stored as one of your +L<environment variables|/environment variable>, accessible in Perl as +C<$ENV{PATH}>. 
+ +=item pathname + +A fully qualified filename such as I</usr/bin/perl>. Sometimes +confused with L</PATH>. + +=item pattern + +A template used in L</pattern matching>. + +=item pattern matching + +Taking a pattern, usually a L</regular expression>, and trying the +pattern various ways on a string to see whether there's any way to +make it fit. Often used to pick interesting tidbits out of a file. + +=item permission bits + +Bits that the L</owner> of a file sets or unsets to allow or disallow +access to other people. These flag bits are part of the L</mode> word +returned by the L<stat|perlfunc/stat> built-in when you ask about a +file. On Unix systems, you can check the I<ls>(1) manpage for more +information. + +=item Pern + +What you get when you do C<Perl++> twice. Doing it only once will +curl your hair. You have to increment it eight times to shampoo your +hair. Lather, rinse, iterate. + +=item pipe + +A direct L</connection> that carries the output of one L</process> to +the input of another without an intermediate temporary file. Once the +pipe is set up, the two processes in question can read and write as if +they were talking to a normal file, with some caveats. + +=item pipeline + +A series of L<processes|/process> all in a row, linked by +L<pipes|/pipe>, where each passes its output stream to the next. + +=item platform + +The entire hardware and software context in which a program runs. A +program written in a platform-dependent language might break if you +change any of: machine, operating system, libraries, compiler, or +system configuration. The I<perl> interpreter has to be compiled +differently for each platform because it is implemented in C, but +programs written in the Perl language are largely +platform-independent. + +=item pod + +The markup used to embed documentation into your Perl code. See +L<perlpod>. + +=item pointer + +A L</variable> in a language like C that contains the exact memory +location of some other item. 
Perl handles pointers internally so you +don't have to worry about them. Instead, you just use symbolic +pointers in the form of L<keys|/key> and L</variable> names, or L<hard +references|/hard reference>, which aren't pointers (but act like +pointers and do in fact contain pointers). + +=item polymorphism + +The notion that you can tell an L</object> to do something generic, +and the object will interpret the command in different ways depending +on its type. [E<lt>Gk many shapes] + +=item port + +The part of the address of a TCP or UDP socket that directs packets to +the correct process after finding the right machine, something like +the phone extension you give when you reach the company operator. +Also, the result of converting code to run on a different platform +than originally intended, or the verb denoting this conversion. + +=item portable + +Once upon a time, C code compilable under both BSD and SysV. In +general, code that can be easily converted to run on another +L</platform>, where "easily" can be defined however you like, and +usually is. Anything may be considered portable if you try hard +enough. See I<mobile home> or I<London Bridge>. + +=item porter + +Someone who "carries" software from one L</platform> to another. +Porting programs written in platform-dependent languages such as C can +be difficult work, but porting programs like Perl is very much worth +the agony. + +=item POSIX + +The Portable Operating System Interface specification. + +=item postfix + +An L</operator> that follows its L</operand>, as in C<$x++>. + +=item pp + +An internal shorthand for a "push-pop" code, that is, C code +implementing Perl's stack machine. + +=item pragma + +A standard module whose practical hints and suggestions are received +(and possibly ignored) at compile time. Pragmas are named in all +lowercase. + +=item precedence + +The rules of conduct that, in the absence of other guidance, determine +what should happen first. 
For example, in the absence of parentheses, +you always do multiplication before addition. + +=item prefix + +An L</operator> that precedes its L</operand>, as in C<++$x>. + +=item preprocessing + +What some helper L</process> did to transform the incoming data into a +form more suitable for the current process. Often done with an +incoming L</pipe>. See also L</C preprocessor>. + +=item procedure + +A L</subroutine>. + +=item process + +An instance of a running program. Under multitasking systems like +Unix, two or more separate processes could be running the same program +independently at the same time--in fact, the L<fork|perlfunc/fork> +function is designed to bring about this happy state of affairs. +Under other operating systems, processes are sometimes called +"threads", "tasks", or "jobs", often with slight nuances in meaning. + +=item program generator + +A system that algorithmically writes code for you in a high-level +language. See also L</code generator>. + +=item progressive matching + +L<Pattern matching|/pattern matching> that picks up where it left off before. + +=item property + +See either L</instance variable> or L</character property>. + +=item protocol + +In networking, an agreed-upon way of sending messages back and forth +so that neither correspondent will get too confused. + +=item prototype + +An optional part of a L</subroutine> declaration telling the Perl +compiler how many and what flavor of arguments may be passed as +L</actual arguments>, so that you can write subroutine calls that +parse much like built-in functions. (Or don't parse, as the case may +be.) + +=item pseudofunction + +A construct that sometimes looks like a function but really isn't. +Usually reserved for L</lvalue> modifiers like L<my|perlfunc/my>, for +L</context> modifiers like L<scalar|perlfunc/scalar>, and for the +pick-your-own-quotes constructs, C<q//>, C<qq//>, C<qx//>, C<qw//>, +C<qr//>, C<m//>, C<s///>, C<y///>, and C<tr///>. 
+ +=item pseudohash + +A reference to an array whose initial element happens to hold a +reference to a hash. You can treat a pseudohash reference as either +an array reference or a hash reference. + +=item pseudoliteral + +An L</operator> that looks something like a L</literal>, such as the +output-grabbing operator, C<`>I<C<command>>C<`>. + +=item public domain + +Something not owned by anybody. Perl is copyrighted and is thus +I<not> in the public domain--it's just L</freely available> and +L</freely redistributable>. + +=item pumpkin + +A notional "baton" handed around the Perl community indicating who is +the lead integrator in some arena of development. + +=item pumpking + +A L</pumpkin> holder, the person in charge of pumping the pump, or at +least priming it. Must be willing to play the part of the Great +Pumpkin now and then. + +=item PV + +A "pointer value", which is Perl Internals Talk for a C<char*>. + +=back + +=head2 Q + +=over 4 + +=item qualified + +Possessing a complete name. The symbol C<$Ent::moot> is qualified; +C<$moot> is unqualified. A fully qualified filename is specified from +the top-level directory. + +=item quantifier + +A component of a L</regular expression> specifying how many times the +foregoing L</atom> may occur. + +=back + +=head2 R + +=over 4 + +=item readable + +With respect to files, one that has the proper permission bit set to +let you access the file. With respect to computer programs, one +that's written well enough that someone has a chance of figuring out +what it's trying to do. + +=item reaping + +The last rites performed by a parent L</process> on behalf of a +deceased child process so that it doesn't remain a L</zombie>. See +the L<wait|perlfunc/wait> and L<waitpid|perlfunc/waitpid> function +calls. + +=item record + +A set of related data values in a L</file> or L</stream>, often +associated with a unique L</key> field. 
In Unix, often commensurate +with a L</line>, or a blank-line-terminated set of lines (a +"paragraph"). Each line of the I</etc/passwd> file is a record, keyed +on login name, containing information about that user. + +=item recursion + +The art of defining something (at least partly) in terms of itself, +which is a naughty no-no in dictionaries but often works out okay in +computer programs if you're careful not to recurse forever, which is +like an infinite loop with more spectacular failure modes. + +=item reference + +Where you look to find a pointer to information somewhere else. (See +L</indirection>.) References come in two flavors, L<symbolic +references|/symbolic reference> and L<hard references|/hard +reference>. + +=item referent + +Whatever a reference refers to, which may or may not have a name. +Common types of referents include scalars, arrays, hashes, and +subroutines. + +=item regex + +See L</regular expression>. + +=item regular expression + +A single entity with various interpretations, like an elephant. To a +computer scientist, it's a grammar for a little language in which some +strings are legal and others aren't. To normal people, it's a pattern +you can use to find what you're looking for when it varies from case +to case. Perl's regular expressions are far from regular in the +theoretical sense, but in regular use they work quite well. Here's a +regular expression: C</Oh s.*t./>. This will match strings like "C<Oh +say can you see by the dawn's early light>" and "C<Oh sit!>". See +L<perlre>. + +=item regular expression modifier + +An option on a pattern or substitution, such as C</i> to render the +pattern case insensitive. See also L</cloister>. + +=item regular file + +A L</file> that's not a L</directory>, a L</device>, a named L</pipe> +or L</socket>, or a L</symbolic link>. Perl uses the C<-f> file test +operator to identify regular files. Sometimes called a "plain" file. 
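A small sketch of the C<-f> test described above, using the core C<File::Temp> module to create a throwaway file and directory. The variable names are illustrative:

```perl
use strict;
use warnings;
use File::Temp qw(tempfile tempdir);

# Create a scratch directory (removed at exit) with one plain file in it.
my $dir = tempdir(CLEANUP => 1);
my ($fh, $file) = tempfile(DIR => $dir);
close $fh;

# -f is true only for regular ("plain") files, so the directory fails it.
my $file_is_regular = -f $file ? 1 : 0;
my $dir_is_regular  = -f $dir  ? 1 : 0;

print "file is regular: $file_is_regular\n";
print "dir is regular:  $dir_is_regular\n";
```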
+ +=item relational operator + +An L</operator> that says whether a particular ordering relationship +is L</true> about a pair of L<operands|/operand>. Perl has both +numeric and string relational operators. See L</collating sequence>. + +=item reserved words + +A word with a specific, built-in meaning to a L</compiler>, such as +C<if> or L<delete|perlfunc/delete>. In many languages (not Perl), +it's illegal to use reserved words to name anything else. (Which is +why they're reserved, after all.) In Perl, you just can't use them to +name L<labels|/label> or L<filehandles|/filehandle>. Also called +"keywords". + +=item restricted hash + +A L</hash> with a closed set of allowed keys. See L<Hash::Util>. + +=item return value + +The L</value> produced by a L</subroutine> or L</expression> when +evaluated. In Perl, a return value may be either a L</list> or a +L</scalar>. + +=item RFC + +Request For Comment, which despite the timid connotations is the name +of a series of important standards documents. + +=item right shift + +A L</bit shift> that divides a number by some power of 2. + +=item root + +The superuser (UID == 0). Also, the top-level directory of the +filesystem. + +=item RTFM + +What you are told when someone thinks you should Read The Fine Manual. + +=item run phase + +Any time after Perl starts running your main program. See also +L</compile phase>. Run phase is mostly spent in L</run time> but may +also be spent in L</compile time> when L<require|perlfunc/require>, +L<do|perlfunc/do> C<FILE>, or L<eval|perlfunc/eval> C<STRING> +operators are executed or when a substitution uses the C</ee> +modifier. + +=item run time + +The time when Perl is actually doing what your code says to do, as +opposed to the earlier period of time when it was trying to figure out +whether what you said made any sense whatsoever, which is L</compile +time>. 
+ +=item run-time pattern + +A pattern that contains one or more variables to be interpolated +before parsing the pattern as a L</regular expression>, and that +therefore cannot be analyzed at compile time, but must be re-analyzed +each time the pattern match operator is evaluated. Run-time patterns +are useful but expensive. + +=item RV + +A recreational vehicle, not to be confused with vehicular recreation. +RV also means an internal Reference Value of the type a L</scalar> can +hold. See also L</IV> and L</NV> if you're not confused yet. + +=item rvalue + +A L</value> that you might find on the right side of an +L</assignment>. See also L</lvalue>. + +=back + +=head2 S + +=over 4 + +=item scalar + +A simple, singular value; a number, L</string>, or L</reference>. + +=item scalar context + +The situation in which an L</expression> is expected by its +surroundings (the code calling it) to return a single L</value> rather +than a L</list> of values. See also L</context> and L</list context>. +A scalar context sometimes imposes additional constraints on the +return value--see L</string context> and L</numeric context>. +Sometimes we talk about a L</Boolean context> inside conditionals, but +this imposes no additional constraints, since any scalar value, +whether numeric or L</string>, is already true or false. + +=item scalar literal + +A number or quoted L</string>--an actual L</value> in the text of your +program, as opposed to a L</variable>. + +=item scalar value + +A value that happens to be a L</scalar> as opposed to a L</list>. + +=item scalar variable + +A L</variable> prefixed with C<$> that holds a single value. + +=item scope + +How far away you can see a variable from, looking through one. 
Perl +has two visibility mechanisms: it does L</dynamic scoping> of +L<local|perlfunc/local> L<variables|/variable>, meaning that the rest +of the L</block>, and any L<subroutines|/subroutine> that are called +by the rest of the block, can see the variables that are local to the +block. Perl does L</lexical scoping> of L<my|perlfunc/my> variables, +meaning that the rest of the block can see the variable, but other +subroutines called by the block I<cannot> see the variable. + +=item scratchpad + +The area in which a particular invocation of a particular file or +subroutine keeps some of its temporary values, including any lexically +scoped variables. + +=item script + +A text L</file> that is a program intended to be L<executed|/execute> +directly rather than L<compiled|/compiler> to another form of file +before execution. Also, in the context of L</Unicode>, a writing +system for a particular language or group of languages, such as Greek, +Bengali, or Klingon. + +=item script kiddie + +A L</cracker> who is not a L</hacker>, but knows just enough to run +canned scripts. A cargo-cult programmer. + +=item sed + +A venerable Stream EDitor from which Perl derives some of its ideas. + +=item semaphore + +A fancy kind of interlock that prevents multiple L<threads|/thread> or +L<processes|/process> from using up the same resources simultaneously. + +=item separator + +A L</character> or L</string> that keeps two surrounding strings from +being confused with each other. The L<split|perlfunc/split> function +works on separators. Not to be confused with L<delimiters|/delimiter> +or L<terminators|/terminator>. The "or" in the previous sentence +separated the two alternatives. + +=item serialization + +Putting a fancy L</data structure> into linear order so that it can be +stored as a L</string> in a disk file or database or sent through a +L</pipe>. Also called marshalling. 
+ +=item server + +In networking, a L</process> that either advertises a L</service> or +just hangs around at a known location and waits for L<clients|/client> +who need service to get in touch with it. + +=item service + +Something you do for someone else to make them happy, like giving them +the time of day (or of their life). On some machines, well-known +services are listed by the L<getservent|perlfunc/getservent> function. + +=item setgid + +Same as L</setuid>, only having to do with giving away L</group> +privileges. + +=item setuid + +Said of a program that runs with the privileges of its L</owner> +rather than (as is usually the case) the privileges of whoever is +running it. Also describes the bit in the mode word (L</permission +bits>) that controls the feature. This bit must be explicitly set by +the owner to enable this feature, and the program must be carefully +written not to give away more privileges than it ought to. + +=item shared memory + +A piece of L</memory> accessible by two different +L<processes|/process> who otherwise would not see each other's memory. + +=item shebang + +Irish for the whole McGillicuddy. In Perl culture, a portmanteau of +"sharp" and "bang", meaning the C<#!> sequence that tells the system +where to find the interpreter. + +=item shell + +A L</command>-line L</interpreter>. The program that interactively +gives you a prompt, accepts one or more L<lines|/line> of input, and +executes the programs you mentioned, feeding each of them their proper +L<arguments|/argument> and input data. Shells can also execute +scripts containing such commands. Under Unix, typical shells include +the Bourne shell (I</bin/sh>), the C shell (I</bin/csh>), and the Korn +shell (I</bin/ksh>). Perl is not strictly a shell because it's not +interactive (although Perl programs can be interactive). + +=item side effects + +Something extra that happens when you evaluate an L</expression>. +Nowadays it can refer to almost anything. 
For example, evaluating a +simple assignment statement typically has the "side effect" of +assigning a value to a variable. (And you thought assigning the value +was your primary intent in the first place!) Likewise, assigning a +value to the special variable C<$|> (C<$AUTOFLUSH>) has the side +effect of forcing a flush after every L<write|perlfunc/write> or +L<print|perlfunc/print> on the currently selected filehandle. + +=item signal + +A bolt out of the blue; that is, an event triggered by the +L</operating system>, probably when you're least expecting it. + +=item signal handler + +A L</subroutine> that, instead of being content to be called in the +normal fashion, sits around waiting for a bolt out of the blue before +it will deign to L</execute>. Under Perl, bolts out of the blue are +called signals, and you send them with the L<kill|perlfunc/kill> +built-in. See L<perlvar/%SIG> and L<perlipc/Signals>. + +=item single inheritance + +The features you got from your mother, if she told you that you don't +have a father. (See also L</inheritance> and L</multiple +inheritance>.) In computer languages, the notion that +L<classes|/class> reproduce asexually so that a given class can only +have one direct ancestor or L</base class>. Perl supplies no such +restriction, though you may certainly program Perl that way if you +like. + +=item slice + +A selection of any number of L<elements|/element> from a L</list>, +L</array>, or L</hash>. + +=item slurp + +To read an entire L</file> into a L</string> in one operation. + +=item socket + +An endpoint for network communication among multiple +L<processes|/process> that works much like a telephone or a post +office box. The most important thing about a socket is its L</network +address> (like a phone number). Different kinds of sockets have +different kinds of addresses--some look like filenames, and some +don't. + +=item soft reference + +See L</symbolic reference>. 
+ +=item source filter + +A special kind of L</module> that does L</preprocessing> on your +script just before it gets to the L</tokener>. + +=item stack + +A device you can put things on the top of, and later take them back +off in the opposite order in which you put them on. See L</LIFO>. + +=item standard + +Included in the official Perl distribution, as in a standard module, a +standard tool, or a standard Perl L</manpage>. + +=item standard error + +The default output L</stream> for nasty remarks that don't belong in +L</standard output>. Represented within a Perl program by the +L</filehandle> L</STDERR>. You can use this stream explicitly, but the +L<die|perlfunc/die> and L<warn|perlfunc/warn> built-ins write to your +standard error stream automatically. + +=item standard I/O + +A standard C library for doing L<buffered|/buffer> input and output to +the L</operating system>. (The "standard" of standard I/O is only +marginally related to the "standard" of standard input and output.) +In general, Perl relies on whatever implementation of standard I/O a +given operating system supplies, so the buffering characteristics of a +Perl program on one machine may not exactly match those on another +machine. Normally this only influences efficiency, not semantics. If +your standard I/O package is doing block buffering and you want it to +L</flush> the buffer more often, just set the C<$|> variable to a true +value. + +=item standard input + +The default input L</stream> for your program, which if possible +shouldn't care where its data is coming from. Represented within a +Perl program by the L</filehandle> L</STDIN>. + +=item standard output + +The default output L</stream> for your program, which if possible +shouldn't care where its data is going. Represented within a Perl +program by the L</filehandle> L</STDOUT>. + +=item stat structure + +A special internal spot in which Perl keeps the information about the +last L</file> on which you requested information. 
+ +=item statement + +A L</command> to the computer about what to do next, like a step in a +recipe: "Add marmalade to batter and mix until mixed." A statement is +distinguished from a L</declaration>, which doesn't tell the computer +to do anything, but just to learn something. + +=item statement modifier + +A L</conditional> or L</loop> that you put after the L</statement> +instead of before, if you know what we mean. + +=item static + +Varying slowly compared to something else. (Unfortunately, everything +is relatively stable compared to something else, except for certain +elementary particles, and we're not so sure about them.) In +computers, where things are supposed to vary rapidly, "static" has a +derogatory connotation, indicating a slightly dysfunctional +L</variable>, L</subroutine>, or L</method>. In Perl culture, the +word is politely avoided. + +=item static method + +No such thing. See L</class method>. + +=item static scoping + +No such thing. See L</lexical scoping>. + +=item static variable + +No such thing. Just use a L</lexical variable> in a scope larger than +your L</subroutine>. + +=item status + +The L</value> returned to the parent L</process> when one of its child +processes dies. This value is placed in the special variable C<$?>. +Its upper eight L<bits|/bit> are the exit status of the defunct +process, and its lower eight bits identify the signal (if any) that +the process died from. On Unix systems, this status value is the same +as the status word returned by I<wait>(2). See L<perlfunc/system>. + +=item STDERR + +See L</standard error>. + +=item STDIN + +See L</standard input>. + +=item STDIO + +See L</standard IE<sol>O>. + +=item STDOUT + +See L</standard output>. + +=item stream + +A flow of data into or out of a process as a steady sequence of bytes +or characters, without the appearance of being broken up into packets. 
+This is a kind of L</interface>--the underlying L</implementation> may +well break your data up into separate packets for delivery, but this +is hidden from you. + +=item string + +A sequence of characters such as "He said !@#*&%@#*?!". A string does +not have to be entirely printable. + +=item string context + +The situation in which an expression is expected by its surroundings +(the code calling it) to return a L</string>. See also L</context> +and L</numeric context>. + +=item stringification + +The process of producing a L</string> representation of an abstract +object. + +=item struct + +C keyword introducing a structure definition or name. + +=item structure + +See L</data structure>. + +=item subclass + +See L</derived class>. + +=item subpattern + +A component of a L</regular expression> pattern. + +=item subroutine + +A named or otherwise accessible piece of program that can be invoked +from elsewhere in the program in order to accomplish some sub-goal of +the program. A subroutine is often parameterized to accomplish +different but related things depending on its input +L<arguments|/argument>. If the subroutine returns a meaningful +L</value>, it is also called a L</function>. + +=item subscript + +A L</value> that indicates the position of a particular L</array> +L</element> in an array. + +=item substitution + +Changing parts of a string via the C<s///> operator. (We avoid use of +this term to mean L</variable interpolation>.) + +=item substring + +A portion of a L</string>, starting at a certain L</character> +position (L</offset>) and proceeding for a certain number of +characters. + +=item superclass + +See L</base class>. + +=item superuser + +The person whom the L</operating system> will let do almost anything. +Typically your system administrator or someone pretending to be your +system administrator. On Unix systems, the L</root> user. On Windows +systems, usually the Administrator user. + +=item SV + +Short for "scalar value". 
But within the Perl interpreter every +L</referent> is treated as a member of a class derived from SV, in an +object-oriented sort of way. Every L</value> inside Perl is passed +around as a C language C<SV*> pointer. The SV L</struct> knows its +own "referent type", and the code is smart enough (we hope) not to try +to call a L</hash> function on a L</subroutine>. + +=item switch + +An option you give on a command line to influence the way your program +works, usually introduced with a minus sign. The word is also used as +a nickname for a L</switch statement>. + +=item switch cluster + +The combination of multiple command-line switches (e.g., B<-a -b -c>) +into one switch (e.g., B<-abc>). Any switch with an additional +L</argument> must be the last switch in a cluster. + +=item switch statement + +A program technique that lets you evaluate an L</expression> and then, +based on the value of the expression, do a multiway branch to the +appropriate piece of code for that value. Also called a "case +structure", named after the similar Pascal construct. See +L<perlsyn/Basic BLOCKs>. + +=item symbol + +Generally, any L</token> or L</metasymbol>. Often used more +specifically to mean the sort of name you might find in a L</symbol +table>. + +=item symbol table + +Where a L</compiler> remembers symbols. A program like Perl must +somehow remember all the names of all the L<variables|/variable>, +L<filehandles|/filehandle>, and L<subroutines|/subroutine> you've +used. It does this by placing the names in a symbol table, which is +implemented in Perl using a L</hash table>. There is a separate +symbol table for each L</package> to give each package its own +L</namespace>. + +=item symbolic debugger + +A program that lets you step through the L<execution|/execute> of your +program, stopping or printing things out here and there to see whether +anything has gone wrong, and if so, what.
The "symbolic" part just +means that you can talk to the debugger using the same symbols with +which your program is written. + +=item symbolic link + +An alternate filename that points to the real L</filename>, which in +turn points to the real L</file>. Whenever the L</operating system> +is trying to parse a L</pathname> containing a symbolic link, it +merely substitutes the new name and continues parsing. + +=item symbolic reference + +A variable whose value is the name of another variable or subroutine. +By L<dereferencing|/dereference> the first variable, you can get at +the second one. Symbolic references are illegal under L<use strict +'refs'|strict/strict refs>. + +=item synchronous + +Programming in which the orderly sequence of events can be determined; +that is, when things happen one after the other, not at the same time. + +=item syntactic sugar + +An alternative way of writing something more easily; a shortcut. + +=item syntax + +From Greek, "with-arrangement". How things (particularly symbols) are +put together with each other. + +=item syntax tree + +An internal representation of your program wherein lower-level +L<constructs|/construct> dangle off the higher-level constructs +enclosing them. + +=item syscall + +A L</function> call directly to the L</operating system>. Many of the +important subroutines and functions you use aren't direct system +calls, but are built up in one or more layers above the system call +level. In general, Perl programmers don't need to worry about the +distinction. However, if you do happen to know which Perl functions +are really syscalls, you can predict which of these will set the C<$!> +(C<$ERRNO>) variable on failure. Unfortunately, beginning programmers +often confusingly employ the term "system call" to mean what happens +when you call the Perl L<system|perlfunc/system> function, which +actually involves many syscalls. 
To avoid any confusion, we nearly +always say "syscall" for something you could call indirectly via +Perl's L<syscall|perlfunc/syscall> function, and never for something +you would call with Perl's L<system|perlfunc/system> function. + +=back + +=head2 T + +=over 4 + +=item tainted + +Said of data derived from the grubby hands of a user and thus unsafe +for a secure program to rely on. Perl does taint checks if you run a +L</setuid> (or L</setgid>) program, or if you use the B<-T> switch. + +=item TCP + +Short for Transmission Control Protocol. A protocol wrapped around +the Internet Protocol to make an unreliable packet transmission +mechanism appear to the application program to be a reliable +L</stream> of bytes. (Usually.) + +=item term + +Short for a "terminal", that is, a leaf node of a L</syntax tree>. A +thing that functions grammatically as an L</operand> for the operators +in an expression. + +=item terminator + +A L</character> or L</string> that marks the end of another string. +The C<$/> variable contains the string that terminates a +L<readline|perlfunc/readline> operation, which L<chomp|perlfunc/chomp> +deletes from the end. Not to be confused with +L<delimiters|/delimiter> or L<separators|/separator>. The period at +the end of this sentence is a terminator. + +=item ternary + +An L</operator> taking three L<operands|/operand>. Sometimes +pronounced L</trinary>. + +=item text + +A L</string> or L</file> containing primarily printable characters. + +=item thread + +Like a forked process, but without L</fork>'s inherent memory +protection. A thread is lighter weight than a full process, in that a +process could have multiple threads running around in it, all fighting +over the same process's memory space unless steps are taken to protect +threads from each other. See L<threads>. + +=item tie + +The bond between a magical variable and its implementation class. See +L<perlfunc/tie> and L<perltie>.
+ +=item TMTOWTDI + +There's More Than One Way To Do It, the Perl Motto. The notion that +there can be more than one valid path to solving a programming problem +in context. (This doesn't mean that more ways are always better or +that all possible paths are equally desirable--just that there need +not be One True Way.) Pronounced TimToady. + +=item token + +A morpheme in a programming language, the smallest unit of text with +semantic significance. + +=item tokener + +A module that breaks a program text into a sequence of +L<tokens|/token> for later analysis by a parser. + +=item tokenizing + +Splitting up a program text into L<tokens|/token>. Also known as +"lexing", in which case you get "lexemes" instead of tokens. + +=item toolbox approach + +The notion that, with a complete set of simple tools that work well +together, you can build almost anything you want. Which is fine if +you're assembling a tricycle, but if you're building a defranishizing +comboflux regurgalator, you really want your own machine shop in which +to build special tools. Perl is sort of a machine shop. + +=item transliterate + +To turn one string representation into another by mapping each +character of the source string to its corresponding character in the +result string. See +L<perlop/trE<sol>SEARCHLISTE<sol>REPLACEMENTLISTE<sol>cdsr>. + +=item trigger + +An event that causes a L</handler> to be run. + +=item trinary + +Not a stellar system with three stars, but an L</operator> taking +three L<operands|/operand>. Sometimes pronounced L</ternary>. + +=item troff + +A venerable typesetting language from which Perl derives the name of +its C<$%> variable and which is secretly used in the production of +Camel books. + +=item true + +Any scalar value that doesn't evaluate to 0 or C<"">. + +=item truncating + +Emptying a file of existing contents, either automatically when +opening a file for writing or explicitly via the +L<truncate|perlfunc/truncate> function. 
+ +=item type + +See L</data type> and L</class>. + +=item type casting + +Converting data from one type to another. C permits this. Perl does +not need it. Nor want it. + +=item typed lexical + +A L</lexical variable> that is declared with a L</class> type: C<my +Pony $bill>. + +=item typedef + +A type definition in the C language. + +=item typeglob + +Use of a single identifier, prefixed with C<*>. For example, C<*name> +stands for any or all of C<$name>, C<@name>, C<%name>, C<&name>, or +just C<name>. How you use it determines whether it is interpreted as +all or only one of them. See L<perldata/Typeglobs and Filehandles>. + +=item typemap + +A description of how C types may be transformed to and from Perl types +within an L</extension> module written in L</XS>. + +=back + +=head2 U + +=over 4 + +=item UDP + +User Datagram Protocol, the typical way to send L<datagrams|/datagram> +over the Internet. + +=item UID + +A user ID. Often used in the context of L</file> or L</process> +ownership. + +=item umask + +A mask of those L</permission bits> that should be forced off when +creating files or directories, in order to establish a policy of whom +you'll ordinarily deny access to. See the L<umask|perlfunc/umask> +function. + +=item unary operator + +An operator with only one L</operand>, like C<!> or +L<chdir|perlfunc/chdir>. Unary operators are usually prefix +operators; that is, they precede their operand. The C<++> and C<--> +operators can be either prefix or postfix. (Their position I<does> +change their meanings.) + +=item Unicode + +A character set comprising all the major character sets of the world, +more or less. See L<perlunicode> and L<http://www.unicode.org>. + +=item Unix + +A very large and constantly evolving language with several alternative +and largely incompatible syntaxes, in which anyone can define anything +any way they choose, and usually do. 
Speakers of this language think +it's easy to learn because it's so easily twisted to one's own ends, +but dialectical differences make tribal intercommunication nearly +impossible, and travelers are often reduced to a pidgin-like subset of +the language. To be universally understood, a Unix shell programmer +must spend years of study in the art. Many have abandoned this +discipline and now communicate via an Esperanto-like language called +Perl. + +In ancient times, Unix was also used to refer to some code that a +couple of people at Bell Labs wrote to make use of a PDP-7 computer +that wasn't doing much of anything else at the time. + +=back + +=head2 V + +=over 4 + +=item value + +An actual piece of data, in contrast to all the variables, references, +keys, indexes, operators, and whatnot that you need to access the +value. + +=item variable + +A named storage location that can hold any of various kinds of +L</value>, as your program sees fit. + +=item variable interpolation + +The L</interpolation> of a scalar or array variable into a string. + +=item variadic + +Said of a L</function> that happily receives an indeterminate number +of L</actual arguments>. + +=item vector + +Mathematical jargon for a list of L<scalar values|/scalar value>. + +=item virtual + +Providing the appearance of something without the reality, as in: +virtual memory is not real memory. (See also L</memory>.) The +opposite of "virtual" is "transparent", which means providing the +reality of something without the appearance, as in: Perl handles the +variable-length UTF-8 character encoding transparently. + +=item void context + +A form of L</scalar context> in which an L</expression> is not +expected to return any L</value> at all and is evaluated for its +L</side effects> alone. + +=item v-string + +A "version" or "vector" L</string> specified with a C<v> followed by a +series of decimal integers in dot notation, for instance, +C<v1.20.300.4000>. 
Each number turns into a L</character> with the +specified ordinal value. (The C<v> is optional when there are at +least three integers.) + +=back + +=head2 W + +=over 4 + +=item warning + +A message printed to the L</STDERR> stream to the effect that something +might be wrong but isn't worth blowing up over. See L<perlfunc/warn> +and the L<warnings> pragma. + +=item watch expression + +An expression which, when its value changes, causes a breakpoint in +the Perl debugger. + +=item whitespace + +A L</character> that moves your cursor but doesn't otherwise put +anything on your screen. Typically refers to any of: space, tab, line +feed, carriage return, or form feed. + +=item word + +In normal "computerese", the piece of data of the size most +efficiently handled by your computer, typically 32 bits or so, give or +take a few powers of 2. In Perl culture, it more often refers to an +alphanumeric L</identifier> (including underscores), or to a string of +nonwhitespace L<characters|/character> bounded by whitespace or string +boundaries. + +=item working directory + +Your current L</directory>, from which relative pathnames are +interpreted by the L</operating system>. The operating system knows +your current directory because you told it with a +L<chdir|perlfunc/chdir> or because you started out in the place where +your parent L</process> was when you were born. + +=item wrapper + +A program or subroutine that runs some other program or subroutine for +you, modifying some of its input or output to better suit your +purposes. + +=item WYSIWYG + +What You See Is What You Get. Usually used when something that +appears on the screen matches how it will eventually look, like Perl's +L<format|perlfunc/format> declarations. Also used to mean the +opposite of magic because everything works exactly as it appears, as +in the three-argument form of L<open|perlfunc/open>. + +=back + +=head2 X + +=over 4 + +=item XS + +A language to extend Perl with L<C> and C++. 
XS is an interface description +file format used to create an extension interface between +Perl and C code (or a C library) which one wishes to use with Perl. +See L<perlxs> for the exact explanation or read the L<perlxstut> +tutorial. + +=item XSUB + +An external L</subroutine> defined in L</XS>. + +=back + +=head2 Y + +=over 4 + +=item yacc + +Yet Another Compiler Compiler. A parser generator without which Perl +probably would not have existed. See the file I<perly.y> in the Perl +source distribution. + +=back + +=head2 Z + +=over 4 + +=item zero width + +A subpattern L</assertion> matching the L</null string> between +L<characters|/character>. + +=item zombie + +A process that has died (exited) but whose parent has not yet received +proper notification of its demise by virtue of having called +L<wait|perlfunc/wait> or L<waitpid|perlfunc/waitpid>. If you +L<fork|perlfunc/fork>, you must clean up after your child processes +when they exit, or else the process table will fill up and your system +administrator will Not Be Happy with you. + +=back + +=head1 AUTHOR AND COPYRIGHT + +Based on the Glossary of Programming Perl, Third Edition, +by Larry Wall, Tom Christiansen & Jon Orwant. +Copyright (c) 2000, 1996, 1991 O'Reilly Media, Inc. +This document may be distributed under the same terms as Perl itself. |