I have been looking at this recently since, as a bulk mail sender, we get regular complaints that our customers' emails are not delivered or are thrown into the SPAM folder. Most people have no idea how complicated SMTP is, or how convoluted the constant battle between senders, receivers and spammers has become.

What is the problem to fix?

SMTP as a basic protocol is simply text. Anyone can send anything and pretend to be anyone else. I can send an email with a "from" address of billgates@microsoft.com and the receiver is none the wiser. This is a problem because spammers or phishers can pretend to be anyone, leading the victim to click something they should not click.
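To make that concrete, here is a minimal sketch using Python's standard email library. Nothing is sent; it only builds the message text. The addresses and the forge_message helper are made up for illustration.

```python
from email.message import EmailMessage


def forge_message(claimed_from: str, to: str) -> EmailMessage:
    """Build a message whose "From" header is whatever the caller claims."""
    msg = EmailMessage()
    msg["From"] = claimed_from  # nothing in basic SMTP checks this value
    msg["To"] = to
    msg["Subject"] = "Urgent: verify your account"
    msg.set_content("Please click the link below...")
    return msg


# Hypothetical addresses, for illustration only.
msg = forge_message("billgates@microsoft.com", "victim@example.com")
print(msg.as_string())
```

The library happily produces a message "from" an address I do not own, and a plain SMTP server would relay it without asking for any proof.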

Note that the TLS-secured versions of SMTP (STARTTLS or SMTPS) do not fix this problem, since they only provide encryption in transit and not proof of origin.

What is SPF?

Sender Policy Framework was a simple but misguided attempt to answer a simple question: which IP addresses are allowed to send emails as "me"?

It is easy enough to understand. You create a TXT record in the DNS for sender.com that lists IP addresses, or references to other lists of IP addresses, that are allowed to send email "from" someone@sender.com. (Strictly, SPF is checked against the envelope sender given in the SMTP MAIL FROM command, not the "From" header the reader sees.) The receiving mail server queries the DNS and tries to match the sending IP to one in the list. If it matches, great; if not, you have an SPF fail.
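As a rough sketch of that matching step, the function below evaluates only the ip4, ip6 and all terms of a record. It deliberately skips include, a, mx and the other terms that need real DNS queries, so it is nowhere near a full SPF evaluator. The record and IP addresses are made-up examples.

```python
import ipaddress

# SPF qualifiers and the result each one produces when its term matches.
RESULTS = {"+": "pass", "-": "fail", "~": "softfail", "?": "neutral"}


def spf_check_ip(record: str, sender_ip: str) -> str:
    """Match sender_ip against the ip4/ip6/all terms of an SPF record."""
    ip = ipaddress.ip_address(sender_ip)
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return "none"
    for term in terms[1:]:
        qualifier = "+"  # the default qualifier is "pass"
        if term[0] in RESULTS:
            qualifier, term = term[0], term[1:]
        name, _, value = term.partition(":")
        name = name.lower()
        if name in ("ip4", "ip6"):
            network = ipaddress.ip_network(value, strict=False)
            if ip.version == network.version and ip in network:
                return RESULTS[qualifier]
        elif name == "all":
            return RESULTS[qualifier]
        # include:, a, mx, etc. need DNS lookups and are ignored here.
    return "neutral"


record = "v=spf1 ip4:192.0.2.0/24 ip4:198.51.100.7 -all"
print(spf_check_ip(record, "192.0.2.55"))
print(spf_check_ip(record, "203.0.113.9"))
```

The first address falls inside a listed range and the second does not, so the second hits the final -all and fails.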

Why doesn't SPF work properly? There are two problems really.

The first is simply that when you use one or more mail sending services like Mailjet or Amazon SES, your domain has to list all of them as "permitted senders", which can cause a bit of a headache even with external references. SPF allows at most 10 DNS lookups per check, and because each referenced record can reference others, you can quickly run out of room. We had one customer who (for reasons I don't understand) used some tool which expanded the SPF lists into specific IPs and created linked records in their DNS. They had already reached the 10-lookup limit when they needed to add 15 of our mail servers to their SPF lists.
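To show how quickly nested references eat into that budget, here is a small sketch that counts the lookup-causing terms in a set of records. The records dictionary stands in for real DNS, the domain names are invented, and the code assumes the records contain no include loops.

```python
import re

# SPF terms that each cost one DNS lookup against the limit of 10.
LOOKUP_TERMS = {"include", "a", "mx", "ptr", "exists", "redirect"}


def count_lookups(domain: str, records: dict[str, str]) -> int:
    """Count lookup-causing terms for domain, following include/redirect."""
    total = 0
    for term in records[domain].split()[1:]:
        term = term.lstrip("+-~?")
        name, target = (re.split(r"[:=/]", term, maxsplit=1) + [""])[:2]
        name = name.lower()
        if name in LOOKUP_TERMS:
            total += 1
        if name in ("include", "redirect"):
            # The referenced record's own lookups count as well.
            total += count_lookups(target, records)
    return total


# Hypothetical records standing in for real DNS answers.
records = {
    "sender.example": "v=spf1 include:_spf.mailer.example include:_spf.other.example mx -all",
    "_spf.mailer.example": "v=spf1 include:_a.mailer.example include:_b.mailer.example ~all",
    "_a.mailer.example": "v=spf1 ip4:192.0.2.0/24 ~all",
    "_b.mailer.example": "v=spf1 ip4:198.51.100.0/24 ~all",
    "_spf.other.example": "v=spf1 a mx ~all",
}
print(count_lookups("sender.example", records))
```

Only three terms in the top-level record cause lookups, but the nested references bring the total to seven, so one more provider could easily tip the domain over the limit.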

The second problem is that SPF ties the sending IP address to the DNS record, so what happens if you, say, send an email to Gmail and it forwards it to Yahoo? Yahoo sees Gmail's IP address as the sender and this would fail SPF.

There is a workaround. In the good old way of the internet, someone invented another protocol to fix the broken one instead of reinventing it! It's called Sender Rewriting Scheme (SRS) and it basically tells the forwarder to replace the envelope sender address with a temporary address in its own domain, which will then pass SPF. Of course, this assumes that the forwarder validated SPF against the origin server. If that failed, should they forward the email or not?
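The rewriting idea can be sketched roughly as below. This follows the general SRS0=hash=timestamp=domain=local shape, but the hash and timestamp encoding here are simplified assumptions of mine, so the output will not match what a real SRS implementation produces. The secret, the day number and the domain names are all made up.

```python
import hashlib
import hmac


def _tag(secret: bytes, timestamp: str, domain: str, local: str) -> str:
    """Short keyed hash so only the forwarder can mint valid addresses."""
    payload = f"{timestamp}={domain}={local}".lower().encode()
    return hmac.new(secret, payload, hashlib.sha1).hexdigest()[:4]


def srs_rewrite(sender: str, forwarder_domain: str, secret: bytes, day: int) -> str:
    """Rewrite an envelope sender into an address at the forwarder's domain."""
    local, _, domain = sender.rpartition("@")
    timestamp = format(day % 1024, "03x")  # simplified encoding (assumption)
    tag = _tag(secret, timestamp, domain, local)
    return f"SRS0={tag}={timestamp}={domain}={local}@{forwarder_domain}"


def srs_reverse(srs_address: str, secret: bytes) -> str:
    """Recover the original sender so bounces can be routed back."""
    local_part = srs_address.rpartition("@")[0]
    prefix, tag, timestamp, domain, local = local_part.split("=", 4)
    if prefix != "SRS0":
        raise ValueError("not an SRS0 address")
    if not hmac.compare_digest(tag, _tag(secret, timestamp, domain, local)):
        raise ValueError("SRS hash mismatch")
    return f"{local}@{domain}"


rewritten = srs_rewrite("someone@sender.com", "forwarder.example", b"secret", 100)
print(rewritten)
print(srs_reverse(rewritten, b"secret"))
```

The next hop now checks SPF against forwarder.example, which the forwarder controls, and the original address is still recoverable if the message bounces.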

Messy business! Also, there are still a number of mail relays that do not support SRS and which basically break SPF.

SPF only really protects the sender's domain. It doesn't protect people from receiving SPAM, and in many ways it has very limited value, since we know that SPAM and phishing don't have to come from real domains to be effective.

What is DKIM?

DKIM solves a slightly different problem. How do I know my legitimate email wasn't tampered with en route? How do I know that someone hasn't hacked my ISP and started adding phishing links into emails via a "virus scanner" or some low-level network tool?

I don't know what motivated this, whether a real or a perceived threat, but DKIM uses cryptography to sign a message. The nice thing about signing is that it doesn't require encryption in transit. Anyone could theoretically see the contents of the message, but it is very hard, if not impossible, to modify the email content without the signature validation failing.

The sender usually signs the message body and selected headers, including the "from" field, with a private key and embeds the resulting signature in a header on the message. The receiving server fetches the public key for the signing domain from another DNS entry and uses it to verify that the signed body and headers are untouched.
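The body half of that check is easy to sketch with the standard library. The code below computes a DKIM-style body hash using what the spec calls "simple" canonicalization and SHA-256. It does not do the public-key signature over the headers, which needs a crypto library, so treat it as an illustration of why body changes are detected rather than a DKIM implementation. The message bodies are invented.

```python
import base64
import hashlib


def simple_body_canon(body: bytes) -> bytes:
    """"Simple" body canonicalization: drop trailing empty lines, end with one CRLF."""
    while body.endswith(b"\r\n"):
        body = body[:-2]
    return body + b"\r\n"


def body_hash(body: bytes) -> str:
    """Base64 SHA-256 of the canonical body, as carried in the signature's bh= tag."""
    digest = hashlib.sha256(simple_body_canon(body)).digest()
    return base64.b64encode(digest).decode()


original = b"Hello,\r\nYour invoice is attached.\r\n"
tampered = b"Hello,\r\nYour invoice is at http://dodgy.example/pay\r\n"

print(body_hash(original) == body_hash(original + b"\r\n\r\n"))  # trailing blank lines are ignored
print(body_hash(original) == body_hash(tampered))  # any real change alters the hash
```

Because the body hash is itself covered by the signature, an attacker who changes the body cannot simply recompute the hash without also holding the private key.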

So far, so good. If we forward email to another server, we can handle this because the forwarder can add another DKIM signature for their own domain (which they need if they have used SRS) but can otherwise pass the original message on untouched.

So does this fix the original problem? In a word, no. DKIM only ties the domain of a "from" address and the message content to the signature. It provides no other protection or non-repudiation. If I am a spammer, although I would not be able to sign an email from microsoft.com and pass DKIM, I could sign my own email from dodgy.com and it would pass the DKIM check. A common trick is for spammers and phishers to use domains with lookalike characters (including Unicode ones) that visually resemble a real domain, like paypa1.com (notice the last character of "paypal"!), or otherwise something like paypal-security.com, which might be owned by an attacker and not PayPal. Neither of these tricks is prevented by SPF or DKIM.

Enter DMARC

What do we do with our two largely broken protocols? We invent another, of course, to fix all the problems (without actually fixing anything!).

Domain-based Message Authentication, Reporting and Conformance is not much more than a bit of glue and some reporting functionality. It doesn't really offer anything useful above DKIM and SPF; it is just another thing to try to use and get wrong.

DMARC allows you to publish a policy via DNS that tells receivers how seriously to treat messages that fail SPF and/or DKIM. You can be strict, you can allow one or the other to succeed, or you can just have reports sent to a published email address so you can analyse how people are trying to abuse your domain.
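A policy record is just a semicolon-separated list of tags, so a rough sketch of reading one and acting on it could look like this. The record, the reporting address and the disposition wording are my own illustrative choices, and real receivers apply plenty of local policy on top.

```python
def parse_dmarc(record: str) -> dict[str, str]:
    """Split a DMARC TXT record into its tag=value pairs."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, _, value = part.partition("=")
            tags[key.strip().lower()] = value.strip()
    if tags.get("v") != "DMARC1":
        raise ValueError("not a DMARC record")
    return tags


def disposition(tags: dict[str, str], spf_ok: bool, dkim_ok: bool) -> str:
    """Decide what to do with a message; one aligned pass is enough."""
    if spf_ok or dkim_ok:
        return "accept"
    actions = {
        "none": "accept (report only)",
        "quarantine": "spam folder",
        "reject": "reject",
    }
    # Treating a missing p= tag as "none" is an assumption of this sketch.
    return actions[tags.get("p", "none")]


# Hypothetical record, as it might be published at _dmarc.sender.example.
tags = parse_dmarc("v=DMARC1; p=quarantine; rua=mailto:dmarc@sender.example; pct=100")
print(disposition(tags, spf_ok=False, dkim_ok=True))
print(disposition(tags, spf_ok=False, dkim_ok=False))
```

With p=quarantine, a forwarded message that breaks SPF but keeps its DKIM signature is still accepted, while one that fails both lands in the SPAM folder.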

We already know that SPF is easily broken by forwarders, and that even the mighty DKIM doesn't actually protect very much and can itself be broken by mail relays that "have to" change the signed content of the message, for example by adding a "scanned by Avast" line to the bottom.

The problem with DMARC is that you can't be strict, because there are too many known cases that just don't work and you will have email rejected. Most senders do not want this. Setting it to relaxed or report-only just generates noise. We know that there are loads of cases that will fail, so enabling the reporting just sends you information that you already know about and can't fix, like "forwarding sometimes breaks SPF".

Even if we got a report saying that an IP belonging to dodgy.com was trying to send emails as smartsurvey.co.uk, we can't do anything about that in most cases. The IP address might not be traceable, we might not have any legal route to approach someone registered overseas, and so on.

In the end, DMARC seems like a big waste of time.

What if everyone supported this stuff?

Well, this is the big problem with the internet. Even if you brought in a protocol that solved all of these issues (and none of these do!), not only would you have to wait for people to upgrade, there are likely to be thousands of abandoned servers on the internet which will never be upgraded. It would also require all custom builds of things like Postfix and Sendmail to be updated to use any new functionality, and you simply can't do that.

You end up with a strange problem: you have to be suspicious of mail from servers you already trust, like Gmail and Yahoo, because they support the protocols and so their failures count against them, while you wave through mail from untrusted servers precisely because they do not use the protocols that could establish trust!

There could potentially be a new version of SMTP or something similar (it would have to be similar to be adopted) that retains the original message body and "from" address signed with DKIM, where we wouldn't care which mail server sent it to us because they couldn't change it anyway. Intermediate servers should not need to add different "from" addresses; they should just forward the message. If they need to add body content, it should go in a separate boundary of the message so it is clear which part of the message is safe and which isn't. Mail readers could prohibit interaction with the untrusted section. Otherwise the intermediate server would have to sign the new message with its own DKIM key, although then the end reader has to be more complicated in order to highlight that "this was from the original sender" and "this was from Gmail, who forwarded it", etc.

Anyway, as I said in the title - ouch!