Microsoft Telnet> open mail.example.com 110 Connecting To mail.example.com...
So I logged into the mail server, which is a
CentOS 7 Linux server running
Dovecot, and tried connecting to the
address, 127.0.0.1, but Dovecot never responded with a banner, nor did I receive
any response when I issued a
user command to provide login
credentials. I had to hit
Ctrl-] to exit from the Telnet
program, since I wasn't getting any response from Dovecot.
# telnet 127.0.0.1 110 Trying 127.0.0.1... Connected to 127.0.0.1. Escape character is '^]'. user lila ^] telnet> quit Connection closed. #
I checked the system's mail log, but saw no references to dovecot nor pop3 in the day's log.
# grep dovecot /var/log/maillog # grep pop3 /var/log/maillog #
When I checked on the Dovecot processes that were running, I saw a very large number of such processes.
# ps -aux | grep dovecot dovenull 346 0.0 0.1 45272 3248 ? S 19:40 0:00 dovecot/pop3-login dovenull 421 0.0 0.1 45140 3252 ? S 19:41 0:00 dovecot/pop3-login dovenull 837 0.0 0.1 45140 3252 ? S 19:50 0:00 dovecot/pop3-login root 893 0.0 0.0 12524 904 ? S 13:02 0:00 dovecot/pop3 root 901 0.0 0.0 12524 908 ? S 13:02 0:00 dovecot/pop3 root 1189 0.0 0.0 12524 908 ? S 06:00 0:00 dovecot/pop3
When I counted the number of "dovecot/pop3" and "dovecot/pop3-login" processes, I saw 32 instances of the latter process, 76 instances of the former process and 112 total for all processes containing "dovecot".
# ps -aux | grep "dovecot/pop3-login" | wc -l 32 # ps -aux | grep "dovecot/pop3" | grep -v "login" | wc -l 76 # ps -aux | grep dovecot | wc -l 112
Since, while reviewing the maillog file, I also saw references to "timeout waiting for input from local during Draining Input", I counted the number of such instances and found 127 of them.
# grep "Draining" /var/log/maillog | wc -l 127 #
While checking on the possible cause of those messages, I found [CentOS] Re: Sendmail: timeout waiting for input from local during Draining Input. The person posting noted he could resolve the problem he was experiencing by stopping and restarting Dovecot "every so often", but that wasn't a good solution for him nor would it be for me. Someone who responded stated "You need to make sure that all three processes are using the same lock type, with sendmail V8.13 is fcntl. Sometimes I see dovecot defaults with flock enabled, and it could be causing the lock contentions. Also make sure that you didn't somehow enable dovecot's deliver while procmail is still enabled."
I found the Sendmail version was 8.14. I checked to see what locking mechanism Dovecot was using and found it to be fnctl.
# doveconf -d | grep -i lock dotlock_use_excl = yes lock_method = fcntl mail_max_lock_timeout = 0 mbox_dotlock_change_timeout = 2 mins mbox_lock_timeout = 5 mins mbox_read_locks = fcntl mbox_write_locks = dotlock fcntl pop3_lock_session = no
When I checked on procmail processes running with
ps aux | grep
procmail, I found many (well over one hundred) like the following:
son.Thelma@geotermic.org -t -Y -a -d ann ann 31687 0.0 0.0 15896 896 ? Ss Dec01 0:00 procmail -f RebeccaMorgan@pick.facedxh.top -t -Y -a -d ann ann 31826 0.0 0.0 15896 1068 ? Ss 12:51 0:00 procmail -f PermanentlyEliminateNeuropathy@breathless.ohysend.top -t -Y -a -d ann ann 32418 0.0 0.0 15896 900 ? Ss Dec01 0:00 procmail -f PANSY.AYRTON@euro-tara.su -t -Y -a -d ann ann 32686 0.0 0.0 15896 1060 ? Ss 19:37 0:00 procmail -f StarShower@8aggryn.club -t -Y -a -d ann
All but one were related to one user's account, though that wasn't the user who reported the email issue.
I restarted Dovecot with the command
service dovecot restart,
I was then able to connect to check the users mailbox by establishing a
connection to the POP3 port, port 110. I was then able to see that she had
387 messages in her mailbox by issuing the
stat command after
pass commands to provide her
# telnet 127.0.0.1 110 Trying 127.0.0.1... Connected to 127.0.0.1. Escape character is '^]'. +OK Dovecot ready. user lila +OK pass thePassword +OK Logged in. stat +OK 387 18824992 quit +OK Logging out. Connection closed by foreign host. #
When I ran the mailq command to check on queued email, I saw 72 queued messages. A lot of messages had an asterick after the queue ID, which indicates a lock exists for the message. There were a lot of what appeared to be spam messages queued for delivery. The spam messages had a "to" address that corresponded to an email alias address on the system that results in email being delivered to two local accounts, but also some external email addresses on other domains. The queued messages were bound for those external email addresses.
# mailq /var/spool/mqueue (72 requests) -----Q-ID----- --Size-- -----Q-Time----- ------------Sender/Recipient----------- uB1NZe2Q012982* 1651 Thu Dec 1 18:43 <santas-shop-emersons=example02.com@cro 7BIT (host map: lookup (crosbystillsnashyoung.com): deferred) <email@example.com> uB32fTer015234 4002 Fri Dec 2 21:41 MAILER-DAEMON <leise.sonya-emersons=example02.com@ji uB32adBk024633* 3848 Fri Dec 2 21:36 MAILER-DAEMON (Deferred: Name server: smtp.atlanticbb.net: host name lookup) <sandy-mcglathery-emersons=example02.c uB32fLFZ030925 4039 Fri Dec 2 21:41 MAILER-DAEMON <elaine.pavao-emersons=example02.com@j uB32fWVb029492 4042 Fri Dec 2 21:41 MAILER-DAEMON <charlotte-corbi-emersons=example02.co uB32euDV021491 4083 Fri Dec 2 21:40 MAILER-DAEMON <christa-patera-emersons=example02.com uB32fGhW014824 5415 Fri Dec 2 21:41 MAILER-DAEMON <military_flashlight-emersons=example0 uB32fLKt003888 5590 Fri Dec 2 21:41 MAILER-DAEMON <russian-dating-emersons=example02.com uB32a3LK017017 5459 Fri Dec 2 21:36 MAILER-DAEMON (Deferred: Name server: smtp.atlanticbb.net: host name lookup) <debt-restructure-team-emersons=exampl uB32eqoa025429 5830 Fri Dec 2 21:40 MAILER-DAEMON <business_phone_systems-emersons=examp <text snipped> Total requests: 72 #
The files associated with queued messages are stored in the
/var/spool/mqueue directory. E.g., for the second queued message
above there is a "df" and a "qf" file:
# ls /var/spool/mqueue/*uB32fTer015234 /var/spool/mqueue/dfuB32fTer015234 /var/spool/mqueue/qfuB32fTer015234
You can delete a queue entry by removing both of them - see Deleting the files associated with a mailq entry. The "df" file contains the contents of the email. You can examine the "qf" file to see the "from", "to", and "subject" lines, though you may also see them in the "df" file.
# rm /var/spool/mqueue/*uB32fTer015234 rm: remove regular file /var/spool/mqueue/dfuB32fTer015234? y rm: remove regular file /var/spool/mqueue/qfuB32fTer015234? y #
I removed several of the queued spam messages, but then had to go to bed because it was late at night by the time I started cleaning up the queue. When I checked the mail queue again the next day, there were no messages remaining in the mail queue. Nor were there any procmail processes running. And there were no "timeout waiting for input from local during Draining Input" entries in this day's maillog file. There were only 4 Dovecot processes running.
# mailq /var/spool/mqueue is empty Total requests: 0 # # ps aux | grep procmail | grep -v grep # grep "Draining" /var/log/maillog | wc -l 0 # ps -aux | grep dovecot | grep -v grep | wc -l 4 # ps -aux | grep dovecot | grep -v grep root 10793 0.0 0.0 40284 1224 ? Ss Dec02 0:00 /usr/sbin/dovecot -F dovecot 10876 0.0 0.0 9312 580 ? S Dec02 0:00 dovecot/anvil root 14755 0.0 0.1 12328 2020 ? S 20:39 0:00 dovecot/config dovecot 14756 0.0 0.0 29916 1844 ? S 20:39 0:00 dovecot/auth #
So, at this point, I suspect the queued spam is related to the issue with Dovecot not responding to user's email client queries, but I still don't know what was the original cause of the problem. I've had to restart Dovecot several times over the last few months, e.g., I noted a similar problem in Dovecot - client connections are being dropped on July 4 when I found 101 "dovecot/pop3-login" processes running on the mail server, which led me to check on the configuration of Dovecot. But yesterday's problem was dissimilar from the incident on July 4, since when I checked the maillog file for yesterday for "dovecot" entries again today, the only ones I found were for ones where I restarted Dovecot.
# grep dovecot /var/log/maillog.1 | wc -l 6 # grep dovecot /var/log/maillog.1 Dec 2 21:34:29 example dovecot: master: Warning: Killed with signal 15 (by pid= 1 uid=0 code=kill) Dec 2 21:34:33 example dovecot: master: Dovecot v2.2.10 starting up for imap, p op3, lmtp (core dumps disabled) Dec 2 21:35:14 example dovecot: master: Error: service(anvil): Initial status n otification not received in 30 seconds, killing the process Dec 2 21:35:14 example dovecot: master: Error: service(log): Initial status not ification not received in 30 seconds, killing the process Dec 2 21:35:14 example dovecot: master: Error: service(ssl-params): Initial sta tus notification not received in 30 seconds, killing the process Dec 2 21:35:14 example dovecot: master: Error: service(log): child 10872 killed with signal 9 #
On July 4, I found lots of dovecot "process _limit (100) reached" entries, but no such entries in yesterday's log files.
# grep limit /var/log/maillog.1 #