Why We Need to Handle Bounced Emails

October 5, 2014

Bounced emails are the bane of marketing campaigns and mailing lists. In this article, the author explains the nature of bounce messages and describes how to handle them.

Wikipedia defines a bounce email as a system-generated failed delivery status notification (DSN) or a non-delivery report (NDR), which informs the original sender about a delivery problem. When that happens, the original email is said to have bounced.

Broadly, bounces are categorised into two types:

A hard/permanent bounce: This indicates that there exists a permanent reason for the email not to get delivered. These are valid bounces, and can be due to the non-existence of the email address, an invalid domain name (DNS lookup failure), or the email provider blacklisting the sender/recipient email address.
A soft/temporary bounce: This can occur due to various reasons at the sender or recipient level. It can arise from a network failure, the recipient mailbox being full (quota exceeded), the recipient having turned on a vacation reply, the local Message Transfer Agent (MTA) not responding or being badly configured, and a whole lot of other reasons. Such bounces cannot be used to determine the status of a failing recipient, and therefore need to be filtered out of our bounce processing.

To understand this better, consider a sender, alice@example.com, sending an email to bob@somewhere.com. She mistypes the recipient’s address as bub@somewhere.com. The email message will have a default envelope sender, set either by the local MTA running there (mta.example.com) or by the PHP script, to alice@example.com. Now, mta.example.com looks up the DNS MX records for somewhere.com, chooses a host from that list, gets its IP address and tries to connect to the MTA running on somewhere.com on port 25 via an SMTP connection. The MTA of somewhere.com is now in trouble, as it can’t find the user bub in its local user table. So mta.somewhere.com responds to mta.example.com with an SMTP failure code stating that the user lookup failed (Code: 550). It’s time for mta.example.com to generate a bounce email to the address in the Return-Path email header (the envelope sender), with a message that the email to bub@somewhere.com failed. That’s a bounce email. Properly maintained mailing lists will have every email passing through them branded with a generic email ID, say mails@example.com, as the envelope sender, and bounces to that address will be wasted if left unhandled.
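
The hard/soft distinction maps onto the first digit of the SMTP reply code: 5xx replies, such as the 550 in the example, are permanent failures, while 4xx replies are temporary ones. A minimal sketch of that rule follows. The function name classifySmtpReply is hypothetical, and the sketch assumes the numeric code has already been pulled out of the SMTP conversation or the bounce text.

```php
<?php
// Hypothetical helper: classify an SMTP reply code as a hard or soft bounce.
// Assumes the numeric code has already been extracted from the reply.
function classifySmtpReply( $code ) {
    $code = (int)$code;
    if ( $code >= 500 && $code <= 599 ) {
        return 'hard'; // permanent failure, e.g. 550 (user lookup failed)
    }
    if ( $code >= 400 && $code <= 499 ) {
        return 'soft'; // temporary failure, e.g. 421 (service not available)
    }
    return 'unknown'; // 2xx and 3xx replies are not failures
}

echo classifySmtpReply( 550 ), "\n";
echo classifySmtpReply( 421 ), "\n";
```

Real bounce messages often carry more detail than the three-digit code, so this is only a first-pass filter.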

VERP (Variable Envelope Return-Path)
In the above example, you will have noticed that the delivery failure message was sent back to the address of the Return-Path header in the original email. If there is a key to handle the bounced emails, it comes from the Return-Path header.
The idea of VERP is to safely encode the recipient details in the return-path as well, so that we can parse the received bounce effectively and extract the failing recipient from it. We specifically use the Return-Path header, as that’s the one header we can rely on not to be tampered with as the message passes through a number of MTAs.
Typically, an email from Alice to Bob in the above example will have headers like the following:

From: alice@example.com
To: bob@somewhere.com
Return-Path: mails@example.com

Now, we create a custom Return-Path header by encoding the To address as a combination of prefix-delim-hash. The hash can be generated by the PHP HMAC functions, so that the new email headers become something like what follows:

From: alice@example.com
To: bob@somewhere.com
Return-Path: bounces-bob=somewhere.com-{ encode( bob@somewhere.com ) }@example.com

Now, the bounces will get directed to our new return-path and can be handled to extract the failing recipient.
Generating a VERP address
The task now is to generate a secure return-path, which is not bulky, and cannot be mimicked by an attacker. A very simple VERP address for a mail to bob@somewhere.com will be:

bounces-bob=somewhere.com@example.com

Since an attacker can easily forge such an address, we also need to include a hash generated with a secret key, along with the address. Please note that the secret key is known only to the sender, and never to the receiver or an attacker.
Therefore, a standard VERP address will be of the form:

bounces-{ prefix }-{hash(prefix,secretkey) }@sender_domain

PHP has its own hash-generating functions that can make things easier. Since PHP’s HMACs cannot be decoded, but only compared, the idea is to place the recipient email ID in the prefix part of the VERP address alongside its hash. On receipt, the hash can be recomputed from the prefix and compared with the received hash to validate the integrity of the bounce.
We will replace the @ in the recipient email ID with another character, so that the address can be embedded in the local part of the return-path along with the hash.
You need to edit your email headers to generate the custom return-path, and make sure you pass it, prefixed by -f, as the fifth argument to PHP’s mail() function to tell your exim MTA to set it as the default envelope sender.

$to = 'bob@somewhere.com';
$from = 'alice@example.com';
$subject = 'This is the message subject';
$body = 'This is the message body';

/** Altering the return path */
$alteredReturnPath = self::generateVERPAddress( $to );
// Passing the headers as an array needs PHP 7.2 or later
$headers = array( 'From' => $from, 'Return-Path' => $alteredReturnPath );
$envelopeSender = '-f' . $alteredReturnPath;

mail( $to, $subject, $body, $headers, $envelopeSender );

/**
 * We need to produce a return address of the form
 * bounces-{ prefix }-{ hash( prefix ) }@sender_domain, where the prefix is
 * the To address with '@' replaced by '='
 */
public static function generateVERPAddress( $to ) {
    // Placeholder values: load the real algorithm and key from your configuration
    $hashAlgorithm = 'md5';
    $hashSecretKey = 'myKey';
    $emailDomain = 'example.com';
    // '=' is used instead of '.' so that the substitution can be reversed unambiguously
    $addressPrefix = str_replace( '@', '=', $to );
    // Hash the prefix, so that the bounce processor can recompute it from the address alone
    $verpHash = hash_hmac( $hashAlgorithm, $addressPrefix, $hashSecretKey );
    $returnPath = 'bounces' . '-' . $addressPrefix . '-' . $verpHash . '@' . $emailDomain;
    return $returnPath;
}

Security can be strengthened further by adding the current timestamp value (in UNIX time) to the VERP prefix. This makes it easy for the bounce processor to decode the email delivery time, and adds protection against brute-forcing the hash, since addresses that are too old can simply be rejected. Decoding and comparing the value of the timestamp with the current timestamp will also help to understand how old the bounce is.

Therefore, a more secure VERP address will look like what follows:

bounces-{ to_address }-{ delivery_timestamp }-{ encode( to_address-delivery_timestamp, secretKey ) }@example.com

The current timestamp can be generated in PHP by:

$current_timestamp = time();
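
Putting the pieces together, a timestamped VERP address could be generated along these lines. This is a sketch under stated assumptions: the function name generateTimestampedVERPAddress is hypothetical, '=' is assumed as the stand-in for '@', and md5 and myKey are just the placeholder values used elsewhere in this article.

```php
<?php
// Sketch of a timestamped VERP generator (hypothetical function name).
// Assumptions: '=' replaces '@', and the HMAC uses md5 as in the article's listing.
function generateTimestampedVERPAddress( $to, $timestamp, $secretKey, $emailDomain ) {
    // '=' stands in for '@' so that the substitution can be reversed later
    $addressPrefix = str_replace( '@', '=', $to );
    // The hash covers both the address and the timestamp,
    // so neither can be altered without invalidating it
    $signedPart = $addressPrefix . '-' . $timestamp;
    $hash = hash_hmac( 'md5', $signedPart, $secretKey );
    return 'bounces-' . $signedPart . '-' . $hash . '@' . $emailDomain;
}

echo generateTimestampedVERPAddress( 'bob@somewhere.com', time(), 'myKey', 'example.com' ), "\n";
```

Because the address is split on '-' when the bounce is decoded, this layout assumes the recipient address itself contains no hyphen; a production version would need a delimiter that cannot occur in the address.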

There’s still work to do before the email is sent, as the local MTA at example.com may try to set its own return-path for messages it transmits. In the example below, we adjust the exim configuration on the MTA to override this behaviour.

$ sudo nano /etc/exim4/exim4.conf

# Do not remove Return Path header
return_path_remove = false

# Remove the field errors_to from the current router configuration.
# This will enable exim to use the fifth param of PHP's mail(), prefixed by -f,
# as the default envelope sender

Every email ID will correspond to a user_id field in a standard user database, and this can be used instead of an email ID to generate a tidy and easy to look up VERP hash.

Redirect your bounces to a PHP bounce-handling script
We now have a VERP address being generated on every sent email, and it will have all the necessary information we need securely embedded in it. The remaining part of our task is to capture and validate the bounces, which would require redirecting the bounces to a processing PHP script.

By default, every bounce message will travel all the way back to the MTA that sent the original email, say mta.example.com, as its return-path points to an address at example.com, with or without VERP. The advantage of using VERP is that the bounce will also carry the encoded failing address. To get that out of the bounce, we can HTTP POST the email via curl to the bounce-processing script, say localhost/handleBounce.php, using an exim pipe transport, as follows:

$ sudo nano /etc/exim4/exim4.conf

# suppose you have a receive_all router that will accept all the emails to your domain.
# this can be the system_alias router too
receive_all:
  driver = accept
  transport = pipe_transport

# Edit the pipe_transport
# (exim only treats # as a comment at the start of a line, so each comment gets its own line)
pipe_transport:
  driver = pipe
  command = /usr/bin/curl http://localhost/handleBounce.php --data-urlencode "email@-"
  group = nogroup
  # adds a Return-Path header for incoming mail
  return_path_add
  # adds a Delivery-date header holding the time the bounce was delivered
  delivery_date_add
  # adds an Envelope-to header holding the envelope recipient of the bounce (the VERP address)
  envelope_to_add

The handleBounce.php script can then read the email from the POST data:

$email = $_POST['email'];

Decoding the failing recipient from the bounce email
Now that the mail has successfully reached the PHP script, our task will be to extract the failing recipient from the encoded email headers. Since a bounce is addressed to the envelope sender of the original email, the VERP address appears in the To header of the bounce (and, with envelope_to_add set in the pipe transport above, in an Envelope-to header as well), and that’s the place to look for the failing recipient.
A simple regex-based function to extract the headers is:

function extractHeaders( $email ) {
    $bounceHeaders = array();
    $lineBreaks = explode( "\n", $email );
    foreach ( $lineBreaks as $lineBreak ) {
        if ( preg_match( "/^To: (.*)/", $lineBreak, $toMatch ) ) {
            $bounceHeaders['to'] = $toMatch[1];
        }
        if ( preg_match( "/^Subject: (.*)/", $lineBreak, $subjectMatch ) ) {
            $bounceHeaders['subject'] = $subjectMatch[1];
        }
        if ( preg_match( "/^Date: (.*)/", $lineBreak, $dateMatch ) ) {
            $bounceHeaders['date'] = $dateMatch[1];
        }
        if ( trim( $lineBreak ) == "" ) {
            // Empty line denotes that the header part is finished
            break;
        }
    }
    return $bounceHeaders;
}
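
One thing a line-by-line regex does not cope with is header folding, where a long header such as Subject continues on the next line after leading whitespace. A possible pre-processing step, sketched below, joins such continuation lines before the headers are matched. The helper name unfoldHeaders and the sample message are hypothetical.

```php
<?php
// Hypothetical helper: return the header block with folded lines joined.
function unfoldHeaders( $email ) {
    // Normalise line endings so that the rest can work with "\n" only
    $email = str_replace( "\r\n", "\n", $email );
    // Keep only the header block: everything before the first empty line
    $parts = explode( "\n\n", $email, 2 );
    // A line break followed by spaces or tabs continues the previous header
    return preg_replace( "/\n[ \t]+/", ' ', $parts[0] );
}

// Made-up sample bounce with a folded Subject header
$raw = "To: bounces-bob=somewhere.com-1412500000-abc@example.com\r\n"
    . "Subject: Mail delivery failed:\r\n returning message to sender\r\n"
    . "\r\n"
    . "Body text\r\n";
echo unfoldHeaders( $raw ), "\n";
```

The returned string contains one header per line, so it can be split on "\n" and matched with the same patterns as before.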

After extracting the headers, we need to decode the original failed recipient’s email ID from the VERP address in $bounceHeaders['to'], which involves more or less the reverse of what we did earlier. This helps us validate the bounced email too.

/**
 * Considering the received $headers['to'] is of the form
 * bounces-{ to_address }-{ delivery_timestamp }-{ encode( to_address-delivery_timestamp, secretKey ) }@example.com
 */
$hashedTo = $headers['to'];

$to = self::extractToAddress( $hashedTo );

function extractToAddress( $hashedTo ) {
    // These must match the values used when the VERP address was generated
    $hashAlgorithm = 'md5';
    $hashSecretKey = 'myKey';
    $timeNow = time();

    // This will help us get the address part of address@domain
    if ( !preg_match( '~(.*?)@~', $hashedTo, $hashedSlice ) ) {
        return null;
    }

    // This will help us cut the address part with the symbol -
    // (assumes the recipient address itself contains no '-')
    $hashedAddressPart = explode( '-', $hashedSlice[1] );
    if ( count( $hashedAddressPart ) !== 4 ) {
        return null;
    }

    // [0] is the literal "bounces", [1] the encoded address, [2] the timestamp and [3] the hash.
    // The hash covers { to_address }-{ delivery_timestamp }.
    $verpPrefix = $hashedAddressPart[1] . '-' . $hashedAddressPart[2];

    // Extracting the delivery time embedded in the address.
    $bounceTime = (int)$hashedAddressPart[2];

    // Valid time for a bounce to happen. The values can be subtracted to find out the time in between and even used to set an accept time, say 3 days.
    if ( $bounceTime < $timeNow ) {
        if ( hash_equals( hash_hmac( $hashAlgorithm, $verpPrefix, $hashSecretKey ), $hashedAddressPart[3] ) ) {
            // Bounce is valid, as the comparisons return true.
            // Assumes '@' was encoded as '=' when the address was generated.
            $to = str_replace( '=', '@', $hashedAddressPart[1] );
            return $to;
        }
    }
    return null;
}
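
The accept time mentioned in the comment can be made explicit with a small check on the age of the bounce. The sketch below is only an illustration: the function name isBounceWithinWindow is hypothetical, and the three-day default is simply the example figure from the comment.

```php
<?php
// Hypothetical helper: accept a bounce only if the delivery timestamp embedded
// in the VERP address lies in the past and is at most $maxAge seconds old.
function isBounceWithinWindow( $deliveryTimestamp, $timeNow, $maxAge = 259200 ) {
    // 259200 seconds = 3 days
    $age = $timeNow - (int)$deliveryTimestamp;
    return $age >= 0 && $age <= $maxAge;
}

$sentAt = 1412500000;
var_dump( isBounceWithinWindow( $sentAt, $sentAt + 3600 ) );   // bounce arrives one hour later
var_dump( isBounceWithinWindow( $sentAt, $sentAt + 345600 ) ); // bounce arrives four days later
```

A timestamp in the future is rejected as well, since a genuine address can never have been generated after the bounce arrived.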

Taking action on the failing recipient
Now that you have the failing recipient, the task is to record their bounce history and take relevant action. A recommended approach is to maintain a bounce records table in the database, which stores the failed recipient, the bounce timestamp and the failure reason. A row can be inserted into the database for every bounce processed, and this can be as simple as:

/** extractHeaders is defined above */
$bounceHeaders = self::extractHeaders( $email );

$failureReason = $bounceHeaders['subject'];
$bounceTimestamp = $bounceHeaders['date'];
$hashedTo = $bounceHeaders['to']; // This will hold the VERP address
$failedRecipient = self::extractToAddress( $hashedTo );

$con = mysqli_connect( "database_server", "dbuser", "dbpass", "databaseName" );
// A prepared statement keeps the header values, which come from untrusted mail, out of the SQL text
$stmt = mysqli_prepare( $con, "INSERT INTO bounceRecords ( failedRecipient, bounceTimestamp, failureReason ) VALUES ( ?, ?, ? )" );
mysqli_stmt_bind_param( $stmt, 'sss', $failedRecipient, $bounceTimestamp, $failureReason );
mysqli_stmt_execute( $stmt );
mysqli_stmt_close( $stmt );

mysqli_close( $con );
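
What counts as relevant action depends on your list, but one common pattern is to act only after several bounces have been recorded within a given period. The sketch below illustrates that idea. The function name shouldUnsubscribe, the threshold of three bounces and the 30-day window are all assumptions, and the bounce times are assumed to be available as UNIX timestamps.

```php
<?php
// Hypothetical policy: flag a recipient once $threshold bounces have been
// recorded within the last $window seconds (2592000 seconds = 30 days).
function shouldUnsubscribe( array $bounceTimestamps, $timeNow, $threshold = 3, $window = 2592000 ) {
    $recent = 0;
    foreach ( $bounceTimestamps as $timestamp ) {
        // Count only bounces that fall inside the window
        if ( $timeNow - $timestamp <= $window ) {
            $recent++;
        }
    }
    return $recent >= $threshold;
}

$now = time();
// Three bounces in the last three days
var_dump( shouldUnsubscribe( array( $now - 86400, $now - 172800, $now - 259200 ), $now ) );
```

The timestamps would come from the bounce records table for the recipient in question.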

Simple tests to differentiate between a permanent and temporary bounce
One of the greatest challenges while writing a bounce processor is to make sure it acts only on the right bounces, that is, the permanent ones. A bounce-processing script that reacts to every single bounce can lead to mass unsubscription of users from the mailing list and a lot of havoc. Exim helps us here in a great way by adding an X-Failed-Recipients: header to a permanent bounce email. This header can be checked for with a regex like the ones we wrote earlier, and action can be taken only if it exists.

/**
 * Check if the bounce corresponds to a permanent failure.
 * The same check can be added to the extractHeaders() function above.
 */
function isPermanentFailure( $email ) {
    $lineBreaks = explode( "\n", $email );
    foreach ( $lineBreaks as $lineBreak ) {
        if ( preg_match( "/^X-Failed-Recipients: (.*)/", $lineBreak ) ) {
            return true;
        }
        if ( trim( $lineBreak ) == "" ) {
            // Empty line denotes that the header part is finished
            break;
        }
    }
    return false;
}
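
The same header can also be used to read the failing addresses directly, as a cross-check against the address decoded from the VERP return-path. The sketch below assumes the header holds a comma-separated list of addresses. The function name extractFailedRecipients is hypothetical.

```php
<?php
// Hypothetical helper: return the addresses listed in X-Failed-Recipients.
// Assumes the header value is a comma-separated list of addresses.
function extractFailedRecipients( $email ) {
    $lines = explode( "\n", str_replace( "\r\n", "\n", $email ) );
    foreach ( $lines as $line ) {
        if ( trim( $line ) === '' ) {
            break; // end of the header block
        }
        if ( preg_match( '/^X-Failed-Recipients:\s*(.*)$/i', $line, $match ) ) {
            // Split the list, trim each entry and drop empty ones
            $addresses = array_map( 'trim', explode( ',', $match[1] ) );
            return array_values( array_filter( $addresses, 'strlen' ) );
        }
    }
    return array();
}

print_r( extractFailedRecipients( "X-Failed-Recipients: bub@somewhere.com\r\nSubject: Mail delivery failed\r\n\r\nBody" ) );
```

An empty array means the header was not found, which by the test above suggests the bounce is not a permanent one.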

Even today, we have a number of large organisations that send more than 100 emails every minute and still have all bounces directed to /dev/null. This results in far too many emails being sent to undeliverable addresses, and eventually leads to frequent blacklisting of the organisation’s mail server by popular providers like Gmail, Hotmail, etc.

If bounces are directed to an IMAP maildir, the regex functions won’t be necessary, as the PHP IMAP library can parse the headers readily for you.