Large number of procmail processes and failing POP3 connections

So I logged into the mail server, which is a CentOS 7 Linux server running Sendmail and Dovecot, and tried connecting to the localhost address, 127.0.0.1, but Dovecot never responded with a banner, nor did I receive any response when I issued a user command to provide login credentials. I had to hit Ctrl-] to exit from the Telnet program, since I wasn't getting any response from Dovecot.

I checked the system's mail log, but saw no references to dovecot nor pop3 in the day's log.

When I checked on the Dovecot processes that were running, I saw a very large number of such processes.

# ps -aux | grep dovecot
dovenull   346  0.0  0.1  45272  3248 ?        S    19:40   0:00 dovecot/pop3-login
dovenull   421  0.0  0.1  45140  3252 ?        S    19:41   0:00 dovecot/pop3-login
dovenull   837  0.0  0.1  45140  3252 ?        S    19:50   0:00 dovecot/pop3-login
root       893  0.0  0.0  12524   904 ?        S    13:02   0:00 dovecot/pop3
root       901  0.0  0.0  12524   908 ?        S    13:02   0:00 dovecot/pop3
root      1189  0.0  0.0  12524   908 ?        S    06:00   0:00 dovecot/pop3

When I counted the number of "dovecot/pop3" and "dovecot/pop3-login" processes, I saw 32 instances of the latter process, 76 instances of the former process and 112 total for all processes containing "dovecot".

Since, while reviewing the maillog file, I also saw references to "timeout waiting for input from local during Draining Input", I counted the number of such instances and found 127 of them.

While checking on the possible cause of those messages, I found [CentOS] Re: Sendmail: timeout waiting for input from local during Draining Input. The person posting noted he could resolve the problem he was experiencing by stopping and restarting Dovecot "every so often", but that wasn't a good solution for him nor would it be for me. Someone who responded stated "You need to make sure that all three processes are using the same lock type, with sendmail V8.13 is fcntl. Sometimes I see dovecot defaults with flock enabled, and it could be causing the lock contentions. Also make sure that you didn't somehow enable dovecot's deliver while procmail is still enabled."

I found the Sendmail version was 8.14. I checked to see what locking mechanism Dovecot was using and found it to be fnctl.

# doveconf -d | grep -i lock
dotlock_use_excl = yes
lock_method = fcntl
mail_max_lock_timeout = 0
mbox_dotlock_change_timeout = 2 mins
mbox_lock_timeout = 5 mins
mbox_read_locks = fcntl
mbox_write_locks = dotlock fcntl
pop3_lock_session = no

When I checked on procmail processes running with

ps aux | grep 
procmail

, I found many (well over one hundred) like the following:

son.Thelma@geotermic.org -t -Y -a  -d ann
ann      31687  0.0  0.0  15896   896 ?        Ss   Dec01   0:00 procmail -f RebeccaMorgan@pick.facedxh.top -t -Y -a  -d ann
ann      31826  0.0  0.0  15896  1068 ?        Ss   12:51   0:00 procmail -f PermanentlyEliminateNeuropathy@breathless.ohysend.top -t -Y -a  -d ann
ann      32418  0.0  0.0  15896   900 ?        Ss   Dec01   0:00 procmail -f PANSY.AYRTON@euro-tara.su -t -Y -a  -d ann
ann      32686  0.0  0.0  15896  1060 ?        Ss   19:37   0:00 procmail -f StarShower@8aggryn.club -t -Y -a  -d ann

All but one were related to one user's account, though that wasn't the user who reported the email issue.

I restarted Dovecot with the command service dovecot restart, I was then able to connect to check the users mailbox by establishing a connection to the POP3 port, port 110. I was then able to see that she had 387 messages in her mailbox by issuing the stat command after issuing the user and pass commands to provide her login credentials.

# telnet 127.0.0.1 110
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
+OK Dovecot ready.
user lila
+OK
pass thePassword
+OK Logged in.
stat
+OK 387 18824992
quit
+OK Logging out.
Connection closed by foreign host.
#

When I ran the mailq command to check on queued email, I saw 72 queued messages. A lot of messages had an asterick after the queue ID, which indicates a lock exists for the message. There were a lot of what appeared to be spam messages queued for delivery. The spam messages had a "to" address that corresponded to an email alias address on the system that results in email being delivered to two local accounts, but also some external email addresses on other domains. The queued messages were bound for those external email addresses.

# mailq
                /var/spool/mqueue (72 requests)
-----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient-----------
uB1NZe2Q012982*    1651 Thu Dec  1 18:43 <santas-shop-emersons=example02.com@cro
          7BIT   (host map: lookup (crosbystillsnashyoung.com): deferred)
                                         <emersons@example02.com>
uB32fTer015234     4002 Fri Dec  2 21:41 MAILER-DAEMON
                                         <leise.sonya-emersons=example02.com@ji
uB32adBk024633*    3848 Fri Dec  2 21:36 MAILER-DAEMON
                 (Deferred: Name server: smtp.atlanticbb.net: host name lookup)
                                         <sandy-mcglathery-emersons=example02.c
uB32fLFZ030925     4039 Fri Dec  2 21:41 MAILER-DAEMON
                                         <elaine.pavao-emersons=example02.com@j
uB32fWVb029492     4042 Fri Dec  2 21:41 MAILER-DAEMON
                                         <charlotte-corbi-emersons=example02.co
uB32euDV021491     4083 Fri Dec  2 21:40 MAILER-DAEMON
                                         <christa-patera-emersons=example02.com
uB32fGhW014824     5415 Fri Dec  2 21:41 MAILER-DAEMON
                                         <military_flashlight-emersons=example0
uB32fLKt003888     5590 Fri Dec  2 21:41 MAILER-DAEMON
                                         <russian-dating-emersons=example02.com
uB32a3LK017017     5459 Fri Dec  2 21:36 MAILER-DAEMON
                 (Deferred: Name server: smtp.atlanticbb.net: host name lookup)
                                         <debt-restructure-team-emersons=exampl
uB32eqoa025429     5830 Fri Dec  2 21:40 MAILER-DAEMON
                                         <business_phone_systems-emersons=examp
<text snipped>
                Total requests: 72
#

The files associated with queued messages are stored in the /var/spool/mqueue directory. E.g., for the second queued message above there is a "df" and a "qf" file:

You can delete a queue entry by removing both of them - see Deleting the files associated with a mailq entry. The "df" file contains the contents of the email. You can examine the "qf" file to see the "from", "to", and "subject" lines, though you may also see them in the "df" file.

I removed several of the queued spam messages, but then had to go to bed because it was late at night by the time I started cleaning up the queue. When I checked the mail queue again the next day, there were no messages remaining in the mail queue. Nor were there any procmail processes running. And there were no "timeout waiting for input from local during Draining Input" entries in this day's maillog file. There were only 4 Dovecot processes running.

# mailq
/var/spool/mqueue is empty
                Total requests: 0
# 
# ps aux | grep procmail | grep -v grep
# grep "Draining" /var/log/maillog | wc -l
0
# ps -aux | grep dovecot | grep -v grep | wc -l
4
# ps -aux | grep dovecot | grep -v grep
root     10793  0.0  0.0  40284  1224 ?        Ss   Dec02   0:00 /usr/sbin/dovecot -F
dovecot  10876  0.0  0.0   9312   580 ?        S    Dec02   0:00 dovecot/anvil
root     14755  0.0  0.1  12328  2020 ?        S    20:39   0:00 dovecot/config
dovecot  14756  0.0  0.0  29916  1844 ?        S    20:39   0:00 dovecot/auth
#

So, at this point, I suspect the queued spam is related to the issue with Dovecot not responding to user's email client queries, but I still don't know what was the original cause of the problem. I've had to restart Dovecot several times over the last few months, e.g., I noted a similar problem in Dovecot - client connections are being dropped on July 4 when I found 101 "dovecot/pop3-login" processes running on the mail server, which led me to check on the configuration of Dovecot. But yesterday's problem was dissimilar from the incident on July 4, since when I checked the maillog file for yesterday for "dovecot" entries again today, the only ones I found were for ones where I restarted Dovecot.

# grep dovecot /var/log/maillog.1 | wc -l
6
# grep dovecot /var/log/maillog.1
Dec  2 21:34:29 example dovecot: master: Warning: Killed with signal 15 (by pid=
1 uid=0 code=kill)
Dec  2 21:34:33 example dovecot: master: Dovecot v2.2.10 starting up for imap, p
op3, lmtp (core dumps disabled)
Dec  2 21:35:14 example dovecot: master: Error: service(anvil): Initial status n
otification not received in 30 seconds, killing the process
Dec  2 21:35:14 example dovecot: master: Error: service(log): Initial status not
ification not received in 30 seconds, killing the process
Dec  2 21:35:14 example dovecot: master: Error: service(ssl-params): Initial sta
tus notification not received in 30 seconds, killing the process
Dec  2 21:35:14 example dovecot: master: Error: service(log): child 10872 killed
 with signal 9
#

On July 4, I found lots of dovecot "process _limit (100) reached" entries, but no such entries in yesterday's log files.