Table of Contents

1.0 Document id
2.0 Procmail pointers
3.0 Dry run testing
4.0 Things to remember
5.0 Procmail flags
6.0 Matching and regexps (regular expressions)
7.0 Variables
8.0 Suggestions and miscellaneous
10.0 Scoring
11.0 Formail usage
12.0 Saving mailing list messages
13.0 Procmail, MIME and HTML
14.0 Simple recipe examples
15.0 Miscellaneous recipes
16.0 Procmail and PGP
17.0 Includerc usage
18.0 Mailing list server
19.0 Common troubles
20.0 Technical matters
21.0 Procmail software for Emacs
22.0 RFC, Request for comments
23.0 Introduction to E-mail Headers
24.0 Message headers

1.0 Document id

1.1 General

$Id: pm-tips.txt,v 2.42 2008/03/09 13:16:49 jaalto Exp $
Homepage http://pm-doc.sourceforge.net/
URLs last checked: 2007-10-02

Procmail is a powerful mail handling tool, and a lot of space here has been devoted to discussing UBE (aka spam) and its essence.

This is a Procmail Tips page: a collection of procmail recipes, instructions and howtos. You will also find many other interesting subjects that concern Internet mail in general: mail headers, MIME and RFCs. Another part of this document is dedicated to Emacs and its mail handling capabilities. Emacs is a powerful tool that can be used for both mail and news reading; it is available on the Windows platform as well. The tips have been compiled from the procmail discussion list, from comp.mail.misc and from the author's own experiences with procmail.

This document does not intend to teach the basics of procmail; you should already be familiar with the procmail manual pages. If you're using the Windows operating system, procmail is available in the Cygwin <http://www.cygwin.com/> distribution.

You may want to read Nancy's and Era's procmail FAQ pages before this page.

If you find errors or things to improve in this document, please send mail to this document's maintainer (see project page). If some URL is not alive any more, you may still be able to find it by using a WWW search such as Google.

There is never too much to learn about procmail, and the best source is the rc files that people have written. If you have some time, please publish your .procmailrc, with good comments, on your home page.

1.2 What is Procmail?

[FAQ] Procmail is a mail processing utility, which can help you filter your mail, sort incoming mail according to sender, Subject line, length of message, keywords in the message, etc, implement an ftp-by-mail server, and much more. Procmail is also a complete drop-in replacement for your MDA. (If this doesn't mean anything to you, you may not want to know.) Procmail runs under Unix. See Infinite Ink's Mail Filtering and Robots page for information about related utilities for various other platforms, and competing Unix programs, too (there aren't that many of either).

1.3 Abbreviations and thanks

[stephen] Stephen R. van den Berg, author of Procmail. Last heard from on the procmail mailing list in 1997-08, using the address <srb A T cuci nl>. Later, in 1998, due to his regular work activities and lack of time, he nominated Philip Guenther to head Procmail development.

[aaron] Aaron Schrab aaron+procmail A T schrab com
[alan] Alan K. Stebbens alan.stebbens A T openwave com
[dan] Daniel Smith J.Daniel.Smith A T WriteMe dt com
[david] David W. Tamkin dattier A T panix com
Note: quoted text may have been rephrased or modified and may not match the original source word for word.
[ed] Edward J. Sabol sabol A T alderaan gsfc nasa gov
[elijah] Eli the Bearded process A T qz little-neck ny us
[hal] Hal Wine hal A T dtor com
[jari] Jari Aalto jari aalto A T poboxes dt com
[philip] Philip Guenther guenther A T gac edu
[richard] Richard Kabel rkabel A T sequent com
[sean] Sean B. Straw PSE-L A T mail professional org
[timothy] Timothy J Luoma luomat+procmail A T luomat peak org
[walter] Walter Dnes waltdnes A T interlog com

[FAQ] Procmail FAQ era A T iki.fi
[manual] Quote from some procmail manual page
[maintainer] As of 2000-09 the maintainer is [jari]

A big thanks goes to all these people.

1.4 Version information

Here is the version and file size log of the text file, which gives you some idea of how the document has evolved. The third column is the file size in kilobytes.

      v2.42   2008-03-10  510  Add Gmane URL. Links checked.
      v2.36   2007-10-02  519  New HTML/CSS layout. Links checked.
      v2.30   2006-02-15  519  Sanitized all email addresses.
      v2.27   2004-10-10  516  Spam related things removed.
      v2.16   2002-08-31  596  Removed old UBE pointers.
      v2.13   2002-08-13  596  Removed old UBE pointers.
      v2.5    2002-02-01  608  Spelling checked with Emacs ispell
      v2.2    2002-01-28  608  URL links checked and updated
      v2.0    2001-08-09  608  http://pm-doc.sourceforge.net opened.
      v1.77   1999-12-27  603  Netscape spam filters added
      v1.76   1999-10-01  602  Mark Seiden's patch applied. Now under CVS.
      v1.74   1999-04-26  599  document moved to www.procmail.org
      v1.72   1999-04-21  597  Links corrected
      v1.71   1999-03-29  597  Ricochet -- Perl script to fight UBE
      v1.70   1999-02-26  592  procmail's Y2K compliance
      v1.69   1999-02-23  590  RFC and using MIME in Usenet postings
      v1.68   1999-01-29  587  Added "Lua" language pointer
      v1.67   1999-01-07  579  Eli's procmail recipes in module section
      v1.66   1998-12-14  578  Philip took care of bugs/patches listing
      v1.64   1998-11-26  602  More Richard's comments integrated
      v1.63   1998-10-30  595  Richard's english correction patch
      v1.60   1998-10-21  591  UMASK, .forward if procmail already is LDA
      v1.58   1998-10-12  583  SmartList and other MLM software discussed
      v1.57   1998-10-06  575  PLUS addr. Convert HTML body to text
      v1.55   1998-08-29  565  Fetching fields with formail -x
      v1.53   1998-08-24  554  Procmail doesn't pass 8bit characters
      v1.52   1998-08-24  553  Flag c forking study, procmail wish list
      v1.51   1998-08-18  541  Small changes. MIME notes
      v1.49   1998-08-10  529  Guido.Van.Hoeck's 55k patch applied
      v1.46   1998-06-24  526  Added live urls to procmail archive
      v1.45   1998-06-23  521  All recipes checked by eye. Many fixes.
      v1.44   1998-06-19  516  Detecting mailing lists with pm-jalist.rc
      v1.41   1998-06-17  510  How to disable recipe quickly with
      v1.36   1998-04-03  493  Includerc rewritten, plus addressing
      v1.34   1998-04-02  488  ORing and supreme scoring added
      v1.32   1998-03-23  471  All recipes checked (by eye)
      v1.31   1998-03-10  469  Better ordering: ORing rules discussed
      v1.29   1998-01-30  429  "regexp" section rewrite.
      v1.24   1997-12-30  415  up till 1996-12 is now included
      v1.17   1997-12-09  343  up till archive 1996-07 now included
      v1.14   1997-11-25  260
      v1.13   1997-11-08  218  Era's correction suggestions.
      v1.10   1997-10-13  181  archive file 1995-10's tips included
      v1.9    1997-10-11  142
      v1.8    1997-10-01  127
      v1.6    1997-09-18  94
      v1.5    1997-09-16  76
      v1.05   1997-09-14  53
      v1.01   1997-09-13  46 (k)    

1.5 Document layout and maintenance

The base version of this document is kept in plain text format, which requires no special editor and no markup language to learn. The tools that help maintain this document include:

Text version of this file is converted to HTML with:

      perl -S <conversion program> --Auto-detect --Out pm-tips.txt    

SENDING IMPROVEMENTS

If you have a spare moment and spot spelling mistakes or misused verbs, please go ahead and send a patch to the maintainer of this page. The preferred way to send corrections to this document is as diff(1) output, if possible in unified format (the -u option). The source is available at http://pm-doc.cvs.sourceforge.net/pm-doc/pm-doc/doc/tips/ and the steps for making and sending corrections are:

      cp pm-tips.txt pm-tips.txt.orig

      ... load the pm-tips.txt to a text editor / edit / save
      ... Generate the difference

      diff -bwu pm-tips.txt.orig pm-tips.txt > pm-tips.txt.patch

      ... Send the content of pm-tips.txt.patch by mail to the maintainer

If you do not know what a diff format is, then simply send your comments in email. Use "Linux: pm-doc" as subject to bypass spam filtering.

1.6 About presented recipes

The recipes have been kept as close to the originals as possible, but the ideas have been generalized when necessary. If some recipe doesn't work as announced, please a) send a note to [maintainer], b) send mail to the procmail mailing list and ask how to correct it. Sometimes a simple dot (.) has been used in regular expressions where the right, pedantic way would have been to use an escaped dot (\.). If you want to be very strict, you should use the escaped dot where applicable.

      # free hand version     # pedantic version
      :0                      :0
      * match.this.site       * match\.this\.site    
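
The difference between the two styles can be checked from the shell. The sketch below uses grep -E as a stand-in for procmail's own regular expression matcher; that substitution is an assumption made for illustration (the two engines agree on this simple case but are not identical), and the second sample string is invented.

```shell
#!/bin/sh
# An unescaped dot matches any character; an escaped dot matches only ".".
# grep -E stands in for procmail's matcher here (illustration only).
loose='match.this.site'
strict='match\.this\.site'

for sample in 'match.this.site' 'matchXthisYsite'; do
    for pattern in "$loose" "$strict"; do
        if printf '%s\n' "$sample" | grep -Eq "$pattern"; then
            result="matches"
        else
            result="does not match"
        fi
        printf '%-18s %-20s %s\n' "$sample" "$pattern" "$result"
    done
done
```

Only the combination of the invented string and the pedantic pattern fails to match, which is exactly the false positive that the free hand version lets through.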

Procmail also accepts assignments without quotes, like this:

      var = value
      num = 1
      dir = /var/mail    

But in this document a strict style has been adopted, where literal strings are assigned with double quotes:

      var = "value"    

That's because the procmail code checker (Emacs package tinyprocmail.el) then won't warn about a missing dollar sign, which might very well have been forgotten. The Emacs package font-lock.el, a syntax highlighting assistant, also displays double-quoted strings in color.

      #   If you do this...

      var = value

      #   then you might have made a typo. It is in fact not clear
      #   what was intended:

      var = "value"   # Did you mean:  literal assignment?
      var = $value    # Did you mean: variable assignment?    

Recipe flags are also not stuck together, because the visual distinction between :0 and the flags is a valuable one. The reasoning for which flags are kept together and in which order is explained later in detail.

      # Erm, all stuck       # This may be visually more clear
      :0ABDc:                :0 A BD c:    

1.7 Variables used in recipes

These are part of the procmail module pm-javar.rc and are used in recipes.

      #   Pure newline; typical usage if you want to write
      #   Something directly to procmail's active logfile:
      #
      #       LOG = "$NL message $NL"

      NL = "
      "    

Refer to the "improving Space-Tab syndrome" section for more details.

      WSPC    = "     "               # whitespace: space + tab

      SPC     = "[$WSPC]"             # Regexp: space + tab
      SPCL    = "($SPC|$)"            # whitespace + linefeed: spc/tab/nl
      NSPC    = "[^$WSPC]"            # negation

      s       = $SPC                  # shortname: like perl -- \s
      d       = "[0-9]"               # A digit -- Perl \d
      w       = "[0-9a-z_A-Z]"        # A word character     -- Perl \w
      W       = "[^0-9a-z_A-Z]"       # A non-word character -- Perl \W
      a       = "[a-zA-Z]"            # An alphabetic character

Writing recipes is now a little easier, and they may look clearer, at least to people who are accustomed to reading Perl regular expression short names:

      #   Matches e.g. "Header-Name: 11 12"
      #   (comment kept off the condition line, see "Beware putting
      #   comment in wrong places")
      :0
      *$ Header-Name:$s+$d+$s+$d
      {
          # Matched "whitespace" + "digits" + "whitespace" + "digit"
          # Do  something
      }

SUPREME = 9876543210, is the highest score value that causes procmail to bail out. [david] Actually the maximum is 2147483647, but 9876543210 is easier to remember/type and will function just as well.

PMSRC = Procmail module source code directory: the location where the *.rc files reside. It can be anywhere you want; usually it is $HOME/pm or $HOME/procmail/lib. Here you can keep the procmail files, log files and includerc scripts. Another commonly used synonym is PMDIR.

SPOOL = Directory where your procmail delivers the categorized messages. Like mailing lists:

      list.procmail, list.lynx-users, list.emacs, list.elm    

and work mail:

      work.announcements, work.lab, work.doc, work.customer    

and your private messages:

      mail.usenet, mail.private, mail.default, mail.perl    

and unimportant messages:

      junk.daemon, junk.cron, junk.ube    

If you read the procmail-delivered files directly, this directory is usually $HOME/Mail or $HOME/mail. If you use some other software that reads these files as mail spool files (like Emacs Gnus), then this directory is typically $HOME/Mail/spool or similar.

MYXLOOP = Used to prevent re-sending messages that have already been handled. Typically built from $LOGNAME@$HOST, but this can be any string you choose. Make it unique to your address. In this document the definition is:

      MYXLOOP = "X-Loop: $LOGNAME@$HOST"    

SENDMAIL = Program to deliver composed mail. Usually the standard Unix sendmail(1), but it needs some switches. See the man page for more. We use the following definition in scripts:

      SENDMAIL = "sendmail -oi -t"    

NICE = In a Unix environment you can lower the scheduling priority with nice(1). If you are conscious of how many external processes you launch for each piece of mail it would be polite to lower the priority of such processes. You may see in this document that external processes are called with NICE enabled:

      :0 w                # Same as "nice -10 script.pl"
      | $NICE script.pl    

IS functions = Functions to test file or directory attributes. E.g. IS_EXIST is defined as "test -e" and so on. The definitions of the IS functions are system-dependent: on Irix, for example, the "-e" option is not recognized and the nearest equivalent is "test -r". All IS functions are defined in the pm-javar.rc module.
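
The system dependency can be probed from a shell before a value is chosen. The following is only a sketch of the idea, not the code from pm-javar.rc: the name IS_EXIST comes from the text above, while the probing logic is an assumption made for illustration.

```shell
#!/bin/sh
# Choose an existence test that the local test(1) accepts.
# IS_EXIST is the name used in the text; the fallback logic is illustrative.
if test -e / 2>/dev/null; then
    IS_EXIST="test -e"
else
    IS_EXIST="test -r"      # nearest equivalent where -e is missing
fi

if $IS_EXIST /; then
    echo "root directory found using: $IS_EXIST"
fi
```

The chosen string could then be passed to procmail on the command line or written into a header file module by hand.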

1.8 About "useless use of cat award"

Randal Schwartz, a well-known Perl programmer and Perl book writer, started handing out the "useless use of cat award" whenever someone wrote an example that pipes a file through cat instead of using the "<" redirection token. Like this:

      $ cat file.name.this | wc -l    

Instead, he writes, the call should have been written like this, which saves a process and the pipe (never mind that wc can read the file directly; this is an example).

      $ wc -l < file.name.this    

[Paul David Fardy <pdf A T morgan.ucs.mun.ca>] There is weight in the pipeline, but the true cost is in process startup. Try running wc 100 times on /etc/motd or on this message. My tests show that the useless use of cat clearly increases the real and processing time; in the sample run below, real, user and system time each grow by roughly 35 to 70 percent:

      $ cat > /tmp/randal << 'EOF'
      COUNT=100
      i=1
      while :
      do
              wc < /etc/motd > /dev/null
              i=$(expr $i + 1)
              [ "$i" = "$COUNT" ] && break
      done
      EOF

      $ cat > /tmp/useless << 'EOF'
      COUNT=100
      i=1
      while :
      do
              cat /etc/motd | wc > /dev/null
              i=$(expr $i + 1)
              [ "$i" = "$COUNT" ] && break
      done
      EOF

      # NOTE: The timing values should not be read as absolute;
      # examine the relative differences.

      $ time sh /tmp/randal
      real    0m0.568s
      user    0m0.208s
      sys     0m0.348s

      $ time sh /tmp/useless
      real    0m0.825s
      user    0m0.348s
      sys     0m0.476s    

This becomes important when you decide, for example, to filter all your mail with procmail, looking for virus signatures. I might well decide to look only at the first 3 or 4 kilobytes. It's not the size of the messages (most are small anyway) but the number of messages that causes a problem. Do you want to double the processing cost of all our mail? I'm looking at a system-wide filter for all my users' mail. I'm considering Sendmail's mail filter versus procmail filtering. I'll likely be using a bit of both. And given that all of the filtering is really just getting in the way of legitimate traffic, it'd really piss me off if I naively doubled the cost.


2.0 Procmail pointers

2.1 Where is procmail developed

Philip Guenther <guenther A T gac.edu> is currently taking care of and coordinating procmail bug fixes. Please send any procmail bugs to the mailing list or to bug@procmail.org. The development mailing list runs on SmartList at procmail-dev@procmail.org. The newest Procmail code is at <http://www.procmail.org/>, and ready-made packages are available in Linux distributions' repositories.

2.2 Procmail resources

Procmail is discussed in the Usenet newsgroup comp.mail.misc and on the procmail mailing list, which is also accessible through the Gmane NNTP server at <http://news.gmane.org/gmane.mail.procmail>.

2.3 Procmail mode for Emacs

If you use Emacs, see the Procmail mode tinypm.el at <http://freshmeat.net/projects/emacs-tiny-tools>. It can also be used to statically syntax check recipes. Here is an example of its output:

      *** 1997-11-24 22:13 (pm.lint) 3.11pre7 tinypm.el 1.80
      cd /users/jaalto/junk/
      pm.lint:010: Warning, no right hand variable found. ([$`']
      pm.lint:055: Pedantic, flag orer style is not standard `hW:'
      pm.lint:060: Warning, message dropped to folder, you need lock.
      pm.lint:062: Warning, recipe with "|" may need `w' flag.
      pm.lint:073: Warning, Formail used but no `f' flag found.    

2.4 Procmail module library project

2.4.1 Where to get various modules

2.4.2 Terminology

subroutine/module = A piece of code that gets something in INPUT and responds with OUTPUT. Subroutine is not message specific.

recipe = A piece of code that is somewhat self contained: It reads something from the message or does something according to matches in message. Recipe may be message-specific. Recipe is more free-form and does not follow strict INPUT/OUTPUT methodology.

2.4.3 Foreword to using modules

In the module listing, some of the modules are recipes and some can be considered subroutines. Let's take the address exploder as an example. First, visualise the following familiar programming language pseudo code:

      (ret-val1, ret-val2 ...) = Function(arg1, arg2, arg3 ...)    

A function may return multiple values, and multiple arguments can be passed to it. Clear so far. The concept applies to procmail modules like this:

      RC_FUNCTION  = $PMSRC/pm-xxx.rc # name the subroutine/module
      RC_FUNCTION2 = ...

      INPUT       = "value"           # Set the arg1 for module
      INCLUDERC   = $RC_FUNCTION      # Call Function( $arg1 )

      :0                              # Examine function's return value
      * ERROR ?? yes
      ...    

This should be pretty clear too. You just have to look into the subroutine/module which you intend to use, to find out what arguments it wants which you need to set (INPUT) before calling it. The documentation also tells you what values are returned, e.g. one of them was ERROR.

If it were a recipe, the call would be almost the same, but instead of returning values, the recipe/module most likely does something to your message or writes something to data files etc. A recipe is much higher level, because it may call multiple subroutines/modules. The distinction between the subroutine and recipe module types is not crystal clear, but I hope the above clarifies the Procmail module/subroutine/recipe concept a bit.

2.4.4 Header file modules

These are like #include .h files in C: they define common variables but do not contain actual code.

2.4.5 General modules

2.4.6 Spam modules

Read "Thoughts about increasing spam annoyance" at <http://pm-lib.sourceforge.net/README.html> which explains these modules better in context "2.0 A lightweight UBE block system with pure procmail".

2.4.7 Mime modules

2.4.8 Filtering message body or headers

2.4.9 Mailing list modules

2.4.10 Miscellaneous modules

2.4.11 Low-level Date and time handling

For these, you get the date string from somewhere, then feed it to some of these subroutines:

2.4.12 Higher-level Date and time handling

You use these recipes to get the date directly from the message:

2.4.13 Forwarding and account modules

2.4.14 Vacation modules

2.4.15 Message-id based modules

2.4.16 Cron modules

2.4.17 Backup modules

2.4.18 Confirmation modules

2.5 Procmail code to filter UBE

Sysadmins, remember: spam filtering is much more efficiently done in the MTA, especially if you are just looking at From and To lines. For example, you can set up a rule in Exim that blocks ^\d.*@aol\.com (that is, any aol.com local part that begins with a digit). AOL guarantees that none of their addresses begin with a digit. Exim rejects such bogus addresses at the SMTP level before the message is received.


3.0 Dry run testing

3.1 What is dry run testing?

It means that you call your procmail test script directly with a sample test mail:

      % procmail $HOME/pm/pm-test.rc < $HOME/tmp/test-mail.txt    

The script pm-test.rc has the procmail recipe you're testing or improving. The test-mail.txt is any valid mail message containing the headers and body. You can make one with any text editor, e.g. vi, pico, nano, emacs or xemacs. Here's a simple test mail skeleton. Copy verbatim:

      From: me@example.com
      To: me@example.com (self test)
      X-info: I'm just testing

      BODY OF MESSAGE SEPARATED BY EMPTY LINE
      txt txt txt txt txt txt txt txt txt txt    

Remember that you can define environment variables as well in the dry run call. Here's an example where procmail just executes the script and does nothing fancy.

      % procmail VERBOSE=on DEFAULT=/dev/null \
          ~/pm/pm-test.rc < ~/txt/test-mail.txt    

Suppose the script prints something to log files, but you'd instead like to get it all dumped to the screen. No problem: first find out your terminal device by calling tty at the shell prompt, and pass that on the command line as LOGFILE. Everything written with "LOG=" assignments, together with procmail's own log output, then goes to the terminal:

      #  `tty' tells what to fill in /dev/..

      % procmail VERBOSE=on DEFAULT=/dev/null   \
          LOGFILE=/dev/pts/0                    \
          ~/pm/pm-test.rc < ~/txt/test-mail.txt    
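
The whole dry run can be wrapped in a throwaway script. The sketch below is one possible arrangement, not a fixed procedure: the scratch directory, the file names and the single recipe in the rc file are assumptions made for illustration, and mktemp -d is assumed to be available. It reuses the test mail skeleton shown earlier and skips the procmail call when procmail is not installed.

```shell
#!/bin/sh
# Build a throwaway test mail and rc file, then dry-run procmail on them.
workdir=$(mktemp -d)
mail="$workdir/test-mail.txt"
rc="$workdir/pm-test.rc"

cat > "$mail" << 'EOF'
From: me@example.com
To: me@example.com (self test)
X-info: I'm just testing

BODY OF MESSAGE SEPARATED BY EMPTY LINE
txt txt txt txt txt txt txt txt txt txt
EOF

cat > "$rc" << 'EOF'
# Recipe under test: discard mail that carries the X-info header.
:0
* ^X-info:
/dev/null
EOF

if command -v procmail > /dev/null 2>&1; then
    # Log to a scratch file; pass the tty device as LOGFILE to see it live.
    procmail VERBOSE=on DEFAULT=/dev/null LOGFILE="$workdir/pm.log" \
        "$rc" < "$mail" || echo "procmail exited with status $?"
    if [ -f "$workdir/pm.log" ]; then
        cat "$workdir/pm.log"
    fi
else
    echo "procmail not installed; test files are in $workdir"
fi
```

Because DEFAULT points at /dev/null, nothing is delivered to a real mailbox even if the recipe under test fails to match.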

3.2 Why the From field is not okay after dry run?

Why does it now say "From foo@bar Mon Sep 8 14:38:06 1997"?

Don't worry about this. It's a side-effect of running the message through formail after having generated an auto-reply – the auto-reply generated by "formail -rt" doesn't have a "From " header (it's pointless for outgoing messages), so the second formail adds one, not knowing that it'll just be ignored by sendmail later (well, sendmail will extract the date from it, but that's ignorable). You only see it because you're saving to a folder instead of mailing it.

3.3 Getting default value of a procmail variable

There's always this way to learn a variable's initial value (note the single quotes, which keep the shell from expanding the variable). Stephen uses it to get procmail's value for $SENDMAIL in the scripts that build SmartList; the example here prints $PATH instead:

      procmail LOG='$PATH' DEFAULT=/dev/null /dev/null < /dev/null    

Since LOGFILE hasn't been defined, $PATH will be printed to the screen. One caution: if there are any variables in the definition of $PATH (such as $HOME), they'll be expanded in the output.


4.0 Things to remember

4.1 Get the newest procmail

A lot of trouble surfaces only because you have an old procmail version. Be sure to have the latest. Keep knocking on your sysadmin's or ISP's door until the latest version is installed, and don't give up if you're serious about using procmail. Here is a command to check your procmail version number:

      % procmail -v    

4.2 Csh's tilde is not supported

Many shell users have grown accustomed to using the tilde (~) everywhere. Unfortunately procmail doesn't expand it to the home directory; just use $HOME. When you write procmail recipes, think sh, not bash. This mindset will automatically tune your brain to the right programming habits.

4.3 Be sure to write the recipe starting right

A recipe starts with :0 or with just : but the latter form is somewhat dangerous and easy to miss. Beware of writing it as 0: which happens easily. Always put a zero after the colon that begins the recipe. In the first versions of procmail, you would put the number of conditions there, with a default of 1. That was annoying, and the computer can do the counting more easily, so Stephen made it so that a count of 0 indicates that the conditions are all the lines beginning with a *. If the number is left out, the default is one, unless one of the a, A, e or E flags is given, in which case the default is zero. ALWAYS START A RECIPE WITH :0.
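
A quick way to catch the two slips mentioned above is to grep the rc file for suspicious recipe starts. This is a rough sketch, not a replacement for a real lint tool: check_recipe_starts is a hypothetical helper name, the sample rc file is invented, and the pattern also flags old-style starts that carry a condition count other than zero.

```shell
#!/bin/sh
# Flag lines that look like a mistyped recipe start: "0:" or ":" without "0".
# check_recipe_starts is a hypothetical helper, not part of procmail.
check_recipe_starts() {
    # Prints "lineno:text" for each suspicious line, nothing if all is well.
    grep -nE '^[[:space:]]*(0:|:([^0]|$))' "$1" || true
}

# Invented sample: recipe 1 is fine, recipes 2 and 3 start wrongly.
rcfile=$(mktemp)
cat > "$rcfile" << 'EOF'
:0
* ^Subject:.*test
/dev/null
0:
* ^From:.*someone
/dev/null
:
* ^To:.*list
/dev/null
EOF

check_recipe_starts "$rcfile"
```

For the sample file this reports lines 4 and 7, the two recipes that do not begin with :0.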

4.4 Always set SHELL

If your login shell is a C shell (csh or tcsh), avoid havoc: as a precaution, always put the following at the top of your $HOME/.procmailrc.

      SHELL = /bin/sh    

4.4.1 If system has no /bin/sh and you're forced to use csh/tcsh

[<kuhlmav A T elec.canterbury.ac.nz>] Csh and tcsh execute the .cshrc first, THEN if, and only if it is the login shell (not a sub shell) it executes the .login, which should contain basic important system settings like stty commands. Likewise, bash and ksh users are taught to define and export PATH in .profile, so our per-shell startup files would not have clobbered the PATH set in .procmailrc the way your .cshrc did.

[philip] ...I have been told by other sysadmins that there are systems on which csh was hacked to source the .login before the .cshrc. For various reasons I suspect these to be systems based on older versions of BSD (say, 2.3 BSD).

As for tcsh, the order in which the .login and .cshrc is sourced is a compile-time option which defaults to the .cshrc (or .tcshrc) before the .login. There may be some wackos out there who change the default in memory of the system(s) that they were raised on. I suggest electroshock as the proper treatment.

...done sys admin on Crays, Convexes, Suns, SGIs, Decs, PC running BSDI, Linux and Free BSD, and I have never run into a system where the .cshrc is sourced AFTER the .login. If someone goes to the trouble to change the order, I would love to know a valid reason for it.

4.4.2 Procmail won't work well with SHELL set to csh derivative

[1998-08-17 PM-L <kuhlmav A T elec.canterbury.ac.nz> Volker Kuhlmann] ...The blame lies with procmail and its documentation. Obviously, procmail is programmed with the assumption that the login shell is a sh derivative. This assumption is a) not very nice, and b) not stated in the otherwise very good documentation. Of course a user can set SHELL to tcsh. If then procmail is too stupid to hack it, it ought to say so clearly, and the above-mentioned questions of people using tcsh will disappear from this list. One could also be nice and point out pitfall (3) mentioned above in the procmail docs. It is customary to have terminal configuration in .login. If it is shifted to .cshrc it should be properly surrounded by if .. endif. Perhaps it is not customary to configure the terminal in bashrc (where else then? - only a rhetorical question), but that is no reason to blame it on tcsh.

My .cshrc only setenvs the environment when it is a login shell (shell level 1). Obviously procmail runs a login shell. As I said earlier, there are good reasons for setting a full PATH independently whether the shell is interactive or not. So, when procmail executes programs with SHELL=tcsh, PATH is set to the tcsh defaults. That may or may not be desirable, depending on the individual case. No problem with that and avoidable (run tcsh with -f). Nice if it was in the procmail docs.

But then, the PATH getting clobbered is not the point here (just a side-effect I didn't realize until 2 people pointed it out).

4.5 Check and set PATH

It is very likely that the default PATH environment variable that your $HOME/.procmailrc sees is not enough. To play safe, so that all the needed binaries can be found when escaping to the shell in .procmailrc, set the PATH variable as the very first statement. Listing directories that exist on one system but not on another is harmless, and it makes it possible to use the same $HOME/.procmailrc on multiple servers (like HP, SUN, IBM, Linux):

      PATH = \
      $HOME/bin:\
      /usr/local/gnu/bin:\
      /usr/contrib/bin:\
      /usr/local/bin:\
      /opt/local/bin:\
      /bin:\
      /usr/bin:\
      /usr/lib:\
      /usr/ucb:\
      /usr/sbin:\
      /vol/bin:\
      /vol/lib:\
      /vol/local/bin:\
      ${PATH}    
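
To see which of the listed directories are actually present on a given host, a small shell loop is enough. This is only a sketch: the candidate list is shortened and illustrative, and it assumes that none of the directory names contain whitespace.

```shell
#!/bin/sh
# Report which candidate directories exist on this host, joined with ":".
# The candidate list is illustrative; substitute your own.
candidates="$HOME/bin /usr/local/bin /opt/local/bin /bin /usr/bin /usr/ucb /vol/bin"

existing=""
for dir in $candidates; do
    if [ -d "$dir" ]; then
        # Append with a ":" separator, but not before the first entry.
        existing="${existing:+$existing:}$dir"
    fi
done

echo "Found here: $existing"
```

Running it on each server shows at a glance which entries of the shared PATH do any work there.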

4.6 Keep the log on all the time

It's best to put these variables at the very start of your .procmailrc. When you start using procmail, you want to know all the time what's happening and why your recipes didn't work as expected. The answer to almost all your questions can be found in the log file. As the log file will grow quite big, remember to set up a cron job to keep it at a moderate size.

      LOGFILE     = $PMSRC/pm.log
      LOGABSTRACT = "all"
      VERBOSE     = "on"    
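
One way to keep the log at a moderate size is to let cron run a small script that keeps only the tail of the file. A minimal sketch, assuming a POSIX shell: trim_log is a hypothetical helper name, and the demonstration works on a scratch file rather than on the real log. Lines that procmail appends while the script is running can be lost, so treat it as a starting point only.

```shell
#!/bin/sh
# Keep only the last N lines of a log file.
# trim_log is a hypothetical helper; pass your own log path and line count.
trim_log() {
    logfile=$1
    keep=$2
    [ -f "$logfile" ] || return 0
    tmp="$logfile.trim.$$"
    # Copy the tail aside, then write it back into the same file so that
    # the file itself (and its permissions) stays in place.
    tail -n "$keep" "$logfile" > "$tmp" && cat "$tmp" > "$logfile"
    rm -f "$tmp"
}

# Demonstration on a scratch file with 100 numbered lines.
demo=$(mktemp)
awk 'BEGIN { for (i = 1; i <= 100; i++) print "line " i }' > "$demo"

trim_log "$demo" 10
wc -l < "$demo"
head -n 1 "$demo"
```

After the call the scratch file holds the last ten lines only, starting with "line 91".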

4.7 Never add a trailing slash for directories

Drop the trailing slash: it'll choke if you ever end up on Apollo's DomainOS, where double slashes are network references. Other than that, a directory with a trailing slash works on most OSes (they treat it like "/."), but it is safer not to rely on that.

      DIR         = /full/path/to/www/directory/    # Wait...
      FILE        = $DIR/file                       # Ouch !
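
When the directory name comes from somewhere you don't control (an environment variable, say), the trailing slash can be removed in a shell wrapper before the value is handed to procmail. The sketch below uses POSIX sh parameter expansion; this is shell syntax, not procmailrc syntax, and the path is the placeholder from the example above.

```shell
#!/bin/sh
# Strip one trailing slash (if any) before building file names.
DIR="/full/path/to/www/directory/"
DIR=${DIR%/}              # remove a single trailing "/" when present
FILE="$DIR/file"
echo "$FILE"
```

The result contains exactly one slash between the directory and the file name, whether or not the input ended in one.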

4.8 Remember what term DELIVERED means

When procmail delivers a piece of mail, whether to a file or a pipe-command, if the write succeeds, then the mail is considered to have been delivered, and processing stops with that recipe file. Here is the relevant text from the man page:

...There are two kinds of recipes: delivering and non-delivering recipes. If a delivering recipe is found to match, procmail considers the mail (you guessed it) delivered and will cease processing the rcfile after having successfully executed the action line of the recipe. If a non-delivering recipe is found to match, processing of the rcfile will continue after the action line of this recipe has been executed.

4.9 Beware putting comment in wrong places

You like commenting a lot, sticking comments everywhere possible? Yes, I do that too, and got into trouble because one is not that free to comment code in procmail. Pay attention to the following example:

      :0              #  comment ok
      * condition     #  OUCH, ouch. This comment must not be here.
      #  Hm, Old procmail versions don't understand this
      #  Are you sure you want to put comments inside
      #  condition line?
      * condition
      {               #  comment ok
                      #  comment ok
          :0          #  comment ok
          /dev/null   #  comment ok
      }               #  comment ok    

So, the place to watch is the condition line. Later procmail versions may understand those, but if you intend to share your recipe, play it safe and think about backward portability.

4.10 Brace placement

Be careful with your braces and remember that old procmail versions aren't as forgiving as newer versions. Below you see classical "Test OK condition first, and if that fails then do something else". See the side comments.

      :0
      * condition
                          # No space allowed here!
      {}                  # Wrong, needs at least _one_ space: { }
      :0 E
      {do_something }     # Again a mistake, must have surrounding spaces

4.11 Local lockfile usage

Lock files are only needed when procmail is doing something that should be serialized, i.e., when only one process at a time should be doing it.

This generally means that any time you write to a file, you should have a local lock, preferably based on the name of the file being written to. Forwarding actions ('!'), and 99% of all filters don't need lock files. However, if a filter action writes to a file while filtering, then you may need a lock. Procmail always does kernel locking when it writes mail to files via simple file actions. So even if you forgot the lock colon, procmail tries to play safe if kernel locking has been compiled in.

Beware of misplacing the lock colon (:):

       :0: a      # Ouch! Wrong unless you want a lock file named a
       :0 a:      # Okay.    

Note that in delivering recipes where you write the content yourself with the > token, you must name the local lock file explicitly, because procmail can't determine the lock by itself. It can only derive the lock file name from the >> token.

      #   Save last body of message to file mail.body

      :0 b:  mail.body$LOCKEXT
      | cat > mail.body    
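
By contrast, when the action appends with >>, a bare trailing colon should be enough, because procmail derives the lock file name from the file that follows the first >>. A minimal sketch, assuming the same mail.body file:

      #   Append the body of every message to file mail.body

      :0 b:
      | cat >> mail.body

Here the lock file is expected to be mail.body$LOCKEXT, so it does not have to be spelled out on the flag line.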

Watch this too. A nesting block that does not launch a clone cannot take a local lock file on the recipe that starts the braces. A nesting block that does launch a clone can. (See the error message quoted in the comments below.)

      :0: file$LOCKEXT
      {
          #  error: "procmail: Extraneous local lock file ignored"
          #  - This lock file will be ignored
          #  - If the recipes inside the braces try to use file.lck
          #    as  a lock file, then you'll have a deadlock situation.

          :0 :
          /tmp/tmp.mbx
      }    
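
A sketch of one way to avoid that message, assuming the block does not need to launch a clone: drop the lock from the recipe that opens the braces and let the delivering recipe inside take its own lock.

      :0
      {
          #  The lock file is derived from the mailbox name here
          :0 :
          /tmp/tmp.mbx
      }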

Let me also explain why the w is so important. Notice that the two recipes below are equivalent: the W is implicit. NOTE: this is only true on the recipe that opens a nested block. On a recipe with a program, forward, or delivery action, W is different from w, which in turn is different from omitting both.

      :0 c: file$LOCKEXT      :0 Wc: file$LOCKEXT
      { ... }                 { ... }    

To quote the comment in the source code, "try and protect the user from his blissful ignorance". The parent will always wait for the cloned child to exit when a lock file is involved. The only question is whether or not it should be logged. If you want failure of the cloned child to be logged, then you should use the w flag, like this:

      :0 wc: file$LOCKEXT
      { ... }    

A local lockfile can be used to lock a clone; the parent procmail will remove it when the clone exits (thus it serves as a global lock file for the clone). If the braced block does not launch a clone, asking for a local lock file generates an error.

4.12 Global lockfile

If you want to block everything while the recipe runs, even during the conditions, use a global lock. For example, in this construct the formail call that updates the message-id cache file must be protected with a global lock file.

      MID_CACHE_LEN   = 8192
      MID_CACHE_FILE  = $PMSRC/msgid.cache
      MID_CACHE_LOCK  = $PMSRC/msgid.cache$LOCKEXT

      LOCKFILE        = $MID_CACHE_LOCK

      :0
      * ^Message-ID:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE
      {
              LOG = "dupecheck: discarded $MESSAGEID from $FROM $NL"

              :0                  # no lockfile !
              $DUPLICATE_MBOX
      }

      LOCKFILE                    # kill variable    

You cannot use a local lockfile as below:

      :0 : $MID_CACHE_FILE$LOCKEXT
      *   ^Message-ID:
      * ? $FORMAIL -D $MID_CACHE_LEN $MID_CACHE_FILE    

because the local lock file named on the flag line will be created only if the conditions have matched and the action is attempted.

One more note: observe that there is no : lock when delivering to DUPLICATE_MBOX, because the outer global lock file already prevents all other procmail instances from executing this part of the recipe.

4.13 Gee, where do I put all those ! * $ ??

Ahem. I can't tell you exactly what to do or how to write your own procmail recipes, but I can show you an example. Here is one possible style for condition line token order:

      *$ ! ? BH VAR ?? test    

That won't say much unless you see something to compare it with. Here is one perfectly valid rule that does not follow the above style.

      :0
      *$ ^Subject:.*$VAR
      *! ^From:.*some
      *B ! ?? match-the-string-in-body
      *$? $IS_EXIST $FILE
      *VARIABLE ?? set    

It might be better to line things up in condition lines. The first column is reserved for the dollar sign, the second for the not operator, and so on. The key here is that you can see at a glance whether a line has the variable expansion dollar (leftmost).

      :0
      *$       ^Subject:.*$VAR
      *  !      ^From:.*some
      *  ! B ?? match-the-string-in-body
      *$ ?      $IS_EXIST $FILE
      *         VARIABLE ?? set
       | | |
       | | |
       | | What is matched: (H)eader portion, (B)ody or (HB) both.
       | | The (??) associative operator is required.
       | |
       | Not operator (!) or shell call (?)
       |
       Variable expansion (important)    

4.14 If you Send an automatic reply, use X-loop header

Do not send an automatic reply without checking the "! ^FROM_DAEMON" condition, and always include an X-Loop header and check for its existence to prevent mail loops.

      :0
      *    conditions-for-auto-reply
      *$ ! ^$MYXLOOP
      *  ! ^FROM_DAEMON
      | $FORMAIL -A "$MYXLOOP" ...other-headers...    
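
A fuller sketch of the recipe above, modelled on the auto-reply example in the procmailex manual page. The address in MYXLOOP and the reply text are placeholders, FORMAIL is assumed to point to formail, and SENDMAIL is procmail's built-in variable. Your own conditions for when to reply would go before the two guard conditions. Note that MYXLOOP is used as a regexp in the condition, so a dot in it matches any character.

      MYXLOOP = "X-Loop: your@own.mail.address"

      :0 hc
      *$ ! ^$MYXLOOP
      *  ! ^FROM_DAEMON
      | ($FORMAIL -r -A "$MYXLOOP" ; \
         echo "Mail received.") | $SENDMAIL -t

The h flag feeds only the header to formail, and the c flag lets the original message continue to normal delivery.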

4.15 Avoid extra shell layer and check command for SHELLMETAS

[dan] It is very important to study your shell command calls and try to avoid the overhead of the extra shell layer. It may be extra work once, when you write your rcfile, but it saves effort on each piece of arriving mail. When procmail sees a character from SHELLMETAS, it runs

      # Default SHELLMETAS: &|<>~;?*[
      # Default $SHELLFLAGS: -c

      % $SHELL $SHELLFLAGS "command -opts args"    

instead of

      % command -opts args    

That is because procmail's ability to invoke other programs does not include filename globbing ([, *, ?), backgrounding (&), piping (|), succession (;), nor conditional succession (&&, ||). If it sees any of those characters (before expanding variables), it hands the job over to a shell.

Sometimes those characters appear in arguments to a command without having their shell meta meaning and procmail really could invoke the command directly without the shell. You can see the distinction in a verbose log file: if procmail runs the command itself, it logs

      Executing "command,-opts,args"    

with a comma between each positional parameter, but if it calls a shell, the original spacing from the rcfile appears unchanged in the logfile:

      Executing "command -opts args"    

So, if you know you won't be needing shell expansion, wrap your shell calls with this:

      savedMetas  = $SHELLMETAS
      SHELLMETAS    # Kill variable

      ..command that does not need shell expansion features..

      SHELLMETAS  = $savedMetas    
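
As a concrete sketch, the filter below adds a header whose text contains ? and [, both of which are in the default SHELLMETAS and would otherwise send the command through a shell. The header name and text are invented for illustration, and FORMAIL is assumed to point to formail.

      savedMetas  = $SHELLMETAS
      SHELLMETAS    # Kill variable

      :0 fhw
      | $FORMAIL -I "X-Note: seen? [yes]"

      SHELLMETAS  = $savedMetas

With VERBOSE turned on, the log should then show the comma-separated form of the command, which tells you that no shell was started.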

4.16 Think what shell commands you use

For every message, procmail launches the processes you have put into your $HOME/.procmailrc. If you haven't paid attention to optimization before, now is the time to take a magnifying glass and check every recipe and the processes in them. When you write your private shell scripts, the performance hit is not so important, but for mail delivery the matter is totally different. First, let's see some programs and their sizes. The following is from one Unix system, where the binaries include debug and symbol table code.

      131072  /usr/bin/awk
      196608  /usr/bin/sort
      245760  /usr/bin/grep
      262144  /usr/bin/sed
      303552  /usr/local/bin/gawk
      544768  /usr/contrib/bin/perl       [perl 4.36]
      822232  /opt/local/bin/perl

              text    data     bss
      awk:    72727 + 51316 +  15317   = 139360
      sort:  173225 + 18496 + 183076   = 374797
      sed:   237248 + 16992 +  56252   = 310492
      grep:  221591 + 16176 +  53816   = 291583
      perl4: 502220 + 36044 +  65632   = 603896
      perl5: 633812 + 69612 +   2385   = 705809
      gawk:  160018 +  5264 +   7168   = 172450    

The binary sizes above are not typical. These are from another system:

            4 Sep 28  /usr/local/bin/awk -> gawk
        32768 Nov 16  /usr/bin/grep
        49152 Nov 16  /usr/bin/sed
       114688 Oct 20  /usr/local/contrib/gnu/bin/grep
       155648 Nov 16  /usr/bin/awk
       155648 Nov 16  /usr/bin/nawk
       221184 Nov 16  /usr/bin/gawk
       311296 Jan 27  /usr/local/bin/gawk
       958464 Nov  2  /usr/local/contrib/bin/perl
      1196032 Sep 14  /usr/local/bin/perl

Stan Ryckman <stanr A T sunspot.tiac.net> wants you to know that:

Comparing byte sizes on disk means nothing here... these things may or may not have been stripped. Any symbol tables included in the byte counts you see above won't affect process start-up time. The size command will give a better handle on what will be needed in starting a process. The three segments may each have their own overhead, though, and the relative contributions of those segments to startup time may well be system-dependent.

Hm. Can we draw some conclusion? Not anything definitive, but at least something:

Here are some more programs. Don't even think of extracting fields with grep or awk, like "grep Subject", because formail is much smaller and better optimized for tasks like that. Better yet, you can often do everything with procmail's own regexp matching.

      37007 Sep  5 15:53 /usr/local/bin/formail   # 3.11pre7
      28672 Jun 10  1996 /usr/bin/tr
      20480 Jun 10  1996 /usr/bin/tail
      20480 Jun 10  1996 /usr/bin/cat
      20480 Sep 26  1996 /usr/bin/expr
      16384 Jun 10  1996 /usr/bin/head
      16384 Jun 10  1996 /usr/bin/cut
      16384 Jun 10  1996 /usr/bin/date
      16384 Jun 10  1996 /usr/bin/uniq
      16384 Jun 10  1996 /usr/bin/wc
      12288 Jun 10  1996 /usr/bin/echo    
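
To illustrate the difference, here is a sketch of two ways to pick up the Subject field. SUBJECT is just an example variable name, and FORMAIL is assumed to point to formail. The second form starts no external process at all: the \/ token makes procmail store whatever matches to its right in the MATCH variable.

      #   One small process per message
      SUBJECT = `$FORMAIL -xSubject:`

      #   No process at all
      :0
      * ^Subject: *\/[^ ].*
      { SUBJECT = $MATCH }

The [^ ] after the \/ token is there so that MATCH starts at the first non-space character. As written, the second recipe does not match a message whose Subject is empty.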

4.17 Using absolute paths when calling a shell program

Shell programmers know that if an absolute path is used to call an executable, the shell doesn't have to search through the long list of directories in $PATH. This may speed up shell scripts remarkably. The best way to use such an optimization is to define variables that point to those programs.
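
A sketch of such definitions near the top of an rc file. The paths are taken from the listings earlier in this section and will differ on your system; the X-Filtered header is invented for illustration.

      FORMAIL = /usr/local/bin/formail
      SED     = /usr/bin/sed
      GREP    = /usr/bin/grep

      :0 fhw
      | $FORMAIL -I "X-Filtered: yes"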

Should you use such optimization in your procmail code? That is a