Why SMTP will be with us for a long time

Simple Mail Transfer Protocol (SMTP) is the internet protocol for sending email. For most people, the email service provider is responsible for it, either directly or by using a third-party library or mail system such as Postfix or Exchange Server.

It also comes up on the forums daily, bemoaned as something between a failure and an antique, with people asking why it hasn't been replaced. Every now and then, somebody proposes a new SMTP killer, which usually gets cut down pretty quickly. That is mainly because the conversation has become so convoluted that people don't always understand what SMTP does and doesn't do, which extensions already exist to resolve some of its problems, and what other options exist to make SMTP suitable for the next generation of email users.

There is a good reason why SMTP won't go anywhere quickly

Other than HTTP, it is probably the only protocol in massively widespread use across the internet that is not wholly controlled or maintained by the internet backbone providers. Other protocols like BGP are used across the internet but 99% of the internet's users never deal with them directly, since they are relevant only to low-level routers. If these protocols have issues, the backbone providers can more easily upgrade equipment over a period of time.

User-space protocols like SMTP are another beast. SMTP is designed to allow ad hoc distribution of email. In most cases, when you send an email you neither know nor care how many servers it might hit, how many relays will forward it or how many paths might fail along the way, as long as it eventually reaches its destination.

Imagine you email a Gmail address from your company address at example.com. Your mail client will probably talk to your own company mail server. This server might well relay the email to a service provider that specialises in delivering high volumes of email efficiently. That provider's server will query DNS for the MX records of gmail.com, which are likely to point to a geographically relevant destination, and forward the mail there. This edge server might relay via another 2 or 3 servers to get the mail to a location near where the user collects their email. This complexity is one reason why emails sometimes disappear. If there is a problem at any point in the journey, an email could easily be lost without an error being reported to the sender. In many cases, once the mail has been relayed, the only way to learn of an error is for one of the downstream servers to send a bounce message to the return-path address. If for any reason that fails, the error is lost.
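
To make the relay chain a little more concrete, here is a minimal Python sketch that reconstructs the path a message took from its Received headers. Each relay normally adds one of these at the top of the message, so reading them bottom-up gives the hops in order. The hostnames and the sample message are made up for illustration, and real Received headers are far messier than this regular expression assumes.

```python
import re
from email import message_from_string

# Hypothetical message: the newest Received header is at the top.
RAW = """\
Received: from edge.mail.example.net by store.mail.example.net; Mon, 1 Jan 2024 10:00:03 +0000
Received: from relay.bulk.example.org by edge.mail.example.net; Mon, 1 Jan 2024 10:00:02 +0000
Received: from mail.example.com by relay.bulk.example.org; Mon, 1 Jan 2024 10:00:01 +0000
From: alice@example.com
To: bob@example.net
Subject: hello

Hi Bob
"""


def trace_hops(raw):
    """Return (from_host, by_host) pairs in the order the mail travelled."""
    msg = message_from_string(raw)
    hops = []
    # Reverse so that the oldest hop (closest to the sender) comes first.
    for value in reversed(msg.get_all("Received", [])):
        match = re.search(r"from\s+(\S+)\s+by\s+([^\s;]+)", value)
        if match:
            hops.append((match.group(1), match.group(2)))
    return hops


for sender, receiver in trace_hops(RAW):
    print(f"{sender} -> {receiver}")
```

Note that the sender never sees any of this. The headers only exist on the copy that arrives, which is exactly why a failure part-way along the path can go unreported.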

As well as the path complexity, there is also a very large number of private mail servers, perhaps run by individuals or small companies, and each of these supports SMTP.

Now imagine you wanted to replace SMTP with "PTMS"! You would not be able to use that protocol until everyone involved in the email relay supported it. Ah, but you say, we already do something similar with port and protocol handshaking, so why not do the same thing here? Because you have a catch-22. Without knowing the full path to the destination (which could change between you checking it and you sending an email!), you would either need to somehow know that every server supported the new protocol, or you would negotiate PTMS with your immediate upstream server, only to have it downgrade the email to SMTP for the next hop, which might not support PTMS.

If your new protocol is markedly different, it might not be possible to downgrade a message to SMTP, or to upgrade it again after a downgrade.
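
The hop-by-hop problem can be sketched in a few lines of Python. This is only a toy model: "PTMS" is the imaginary protocol from above, and each server is reduced to the set of protocols it speaks. Every link negotiates on its own, so a single SMTP-only relay anywhere in the path is enough to break end-to-end PTMS.

```python
def negotiate_path(hops):
    """Return the protocol used on each link of a relay path.

    `hops` is a list of sets, one per server, starting with the sender.
    A link uses PTMS only when both ends speak it, otherwise SMTP.
    """
    links = []
    for here, there in zip(hops, hops[1:]):
        both_speak_ptms = "PTMS" in here and "PTMS" in there
        links.append("PTMS" if both_speak_ptms else "SMTP")
    return links


def end_to_end_ptms(hops):
    """True only if every single link managed to negotiate PTMS."""
    return all(link == "PTMS" for link in negotiate_path(hops))


# Sender, company server, one old SMTP-only relay, destination.
path = [{"SMTP", "PTMS"}, {"SMTP", "PTMS"}, {"SMTP"}, {"SMTP", "PTMS"}]
print(negotiate_path(path))
print(end_to_end_ptms(path))
```

The sender only ever sees the first link succeed with PTMS, and has no way of knowing what happened further down the path.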

It is like adding encryption to a message. If the destination doesn't support encryption, it would also need the plain-text message which defeats the point of having encryption in the first place.

The problems with SMTP

There are two often-touted problems relating to SMTP: message secrecy and message assurance.

Message secrecy

SMTP by itself is plain-text. However clever the original designers of internet email were, they did not foresee a large public internet with all of the security risks of a free-for-all, so there was no provision for encryption. At the time, processing capacity was a serious limitation, so even if they had considered it, it is unlikely it would have been implemented, although the protocol could have pre-empted the future need.

Encryption is a more realistic problem to solve in SMTP. Since encrypted data can be encoded as plain text, we can send an encrypted message body via SMTP knowing that its contents stay private, and as long as we have a mechanism to know that the destination supports encryption, we know they can decrypt it. PGP is the most famous of these mechanisms. Although it works perfectly well mechanically at the message level, it creates a new problem: the secure distribution of keys, along with some kind of web of trust.
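
As a small illustration of the "encrypted data as plain text" point, the Python sketch below wraps arbitrary binary data in a mail message using the standard library. No real encryption happens here: the `ciphertext` value is just a stand-in for whatever an encryption tool would produce, and the addresses are made up. The point is only that the bytes survive the trip through a text-only message unchanged.

```python
from email import message_from_string, policy
from email.message import EmailMessage

# Stand-in for real ciphertext: every possible byte value, most of which
# are not printable text. No actual encryption happens in this sketch.
ciphertext = bytes(range(256))

msg = EmailMessage()
msg["From"] = "alice@example.com"
msg["To"] = "bob@example.net"
msg["Subject"] = "encrypted payload"
# Binary content is base64-encoded by default, so the message stays ASCII.
msg.set_content(ciphertext, maintype="application", subtype="octet-stream")

wire = msg.as_string()  # the text that would travel over SMTP

# The receiving side parses the text and recovers the exact bytes.
received = message_from_string(wire, policy=policy.default)
recovered = received.get_content()

print(wire.isascii(), recovered == ciphertext)
```

Notice that the From, To and Subject headers are still readable by every relay. Message-level encryption of this kind protects the body, not the addressing.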

The idea of lots of people each knowing and trusting a small number of other people is very academically satisfying, but the truth is that a single bad actor could trivially break the mechanism, e.g. a college professor who is trusted but is then compelled by a government to "trust" a number of people in the secret services, giving them access to materials they should not have.

The truth is, if the foundations of secure key exchange could be created, then encryption could be solved over SMTP. It doesn't need a new protocol.

Message assurance

Message assurance is about two things here: message integrity and verification of the sender.

Message Integrity

Although some of us might not think we are at risk of having our messages tampered with, the important point is that some people are. It is an issue that is relatively easy to understand and has largely been solved with DKIM, which is built on public-key cryptography.

Public-key cryptography provides a really useful and powerful pattern. I can sign something digital with my private key and send it to a recipient. The recipient can then use my public key to verify that the signature is correct and therefore that the content has not been tampered with. This is not the same as encryption. The message does not have to be private but it can be assured with a signature. Like all digital systems, however, the mathematics are largely perfect but the implementation has risks. Firstly, the recipient needs reliable access to the sender's public key to verify the signature, and secondly the recipient has to assume that an attacker does not have access to the sender's private key, otherwise the attacker could change a message and re-sign it.
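
The sign-then-verify pattern can be shown with a deliberately toy version of RSA in Python. Please treat this as a sketch of the idea only: the key is built from two small, well-known Mersenne primes, there is no padding, and nothing here is secure or resembles how a real mail server signs messages. It needs Python 3.8 or later for the modular inverse via `pow`. All names are made up for the example.

```python
import hashlib

# Toy key pair. Real keys are far larger and generated randomly.
P = 2**127 - 1          # a Mersenne prime
Q = 2**89 - 1           # another Mersenne prime
N = P * Q               # public modulus
E = 65537               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))  # private exponent, kept secret


def digest(message: bytes) -> int:
    """Hash the message and squeeze the result into the range of N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message: bytes) -> int:
    """Only the holder of the private exponent D can produce this."""
    return pow(digest(message), D, N)


def verify(message: bytes, signature: int) -> bool:
    """Anyone with the public values (N, E) can check the signature."""
    return pow(signature, E, N) == digest(message)


signature = sign(b"Pay Bob 10 pounds")
print(verify(b"Pay Bob 10 pounds", signature))    # untouched message
print(verify(b"Pay Bob 1000 pounds", signature))  # tampered message
```

The message itself is never hidden, which is the difference from encryption. It also shows both risks from the paragraph above: the check is only meaningful if `N` and `E` really belong to the sender, and anyone who learns `D` can sign whatever they like.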

A recipient can never reliably tell whether the sender's private key is still secret. Even if the sender thinks their keys are secure, and even in the most high-security environments, it can be very difficult to know whether the keys are safe. Physical keys have to be physically stolen or copied, but digital data is trivially easy to copy without a trace unless you have very expensive hardware to manage the keys for you.

Even verifying the source of a public key is not as easy as we might think. With DKIM on SMTP, this is done via DNS, so if I receive a message purporting to be from example.com, I can look up the DKIM record in the DNS for example.com and use it as the public key. But again, this assumes that DNS is reliable and has not been diverted or modified, and also that I am directly aware of the source domain name for the message. An attacker can sign their own messages from a lookalike domain, e.g. yourexample.com, and this would pass verification even though the message might not be from who you think.
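
Here is a rough Python sketch of where that lookup comes from. A DKIM-Signature header carries the signing domain in its `d=` tag and a selector in its `s=` tag, and the public key is published as a DNS TXT record at `<selector>._domainkey.<domain>`. The header value, selector and addresses below are invented, the signature fields are placeholders, and no DNS query or signature check is actually performed. The domain comparison is a strict one for simplicity, whereas DMARC's relaxed mode also accepts subdomains of the same organisational domain.

```python
def parse_tags(header_value):
    """Split a 'name=value; name=value' tag list into a dict."""
    tags = {}
    for part in header_value.split(";"):
        if "=" in part:
            name, _, value = part.partition("=")
            # Whitespace inside values is not significant for this sketch.
            tags[name.strip()] = "".join(value.split())
    return tags


def dkim_query_name(header_value):
    """DNS name where the signer's public key would be published."""
    tags = parse_tags(header_value)
    return f"{tags['s']}._domainkey.{tags['d']}"


def signer_matches_from(header_value, from_address):
    """Strict check that the signing domain is the visible From domain."""
    signing_domain = parse_tags(header_value)["d"].lower()
    from_domain = from_address.rpartition("@")[2].lower()
    return signing_domain == from_domain


# Hypothetical header from a lookalike domain.
header = ("v=1; a=rsa-sha256; d=yourexample.com; s=mail2024; "
          "h=from:to:subject; bh=PLACEHOLDER; b=PLACEHOLDER")

print(dkim_query_name(header))
print(signer_matches_from(header, "ceo@example.com"))
```

The signature from yourexample.com can be perfectly valid. The lookup simply goes to the attacker's own DNS, which is why the signing domain has to be compared with the domain the reader actually sees.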

Message Sender Verification

In most cases, a big cause of privacy concerns, and a big factor in how useful email is, is knowing which email is genuine and which is fake, dangerous or junk. This is usually why people complain about SMTP. It is an area that is definitely not solved by mechanisms like SPF, DKIM and DMARC, but it is also a problem that is much more complex than most people assume.

Email messages come from a number of sources: direct, indirect and unsolicited. A direct email might be from a friend or colleague or from someone I have a business relationship with. Indirect emails arise when someone passes on your email address, legitimately or otherwise, to a third party. You might or might not welcome contact from these senders. Unsolicited emails might be legal: you might get them from a site you signed up to, and might or might not have realised that you were agreeing to be sold to. They might also be from people who have unlawfully found or guessed your email address, or from a completely illegitimate company trying to sell you blue pills or illegal goods. They might also be attempts at phishing or other cyber-attacks and might contain malware. Some emails might be relayed by bulk mail services that could be used for good or evil.

Now ask yourself without thinking of the technical protocol details, how you could possibly tell all of these apart?

Email clients use a number of tricks to give a score to an email but that score will never be perfect. Although you might not want an email from a recruiter who is not in your address book, you might be expecting it. You might have given them your email address. The fact that the same recruiter sent an unsolicited email to someone else (or at least the recipient marked it as SPAM) should do what exactly? Block all future email from that sender? Ruin their business just because some people don't know what "SPAM" means and sometimes use the button to mean "I don't really want this"?

You could reduce their sending IP's reputation, but that wouldn't be fair on a shared system that is also used by people who are not sending unsolicited emails. This is a perpetual pain for bulk email services and SaaS businesses that have many customers using their system to send large numbers of emails.

The truth is, it is a big mess. Most organisations have their own magic sauce for working out what is and isn't likely to be SPAM and, as we all know well, we often see legitimate emails going into SPAM and SPAM emails in our inbox (I find the former more often than the latter). This is the point at which most people start with "SMTP is rubbish and needs replacing".

Sure, SMTP doesn't solve this problem with SPF and DKIM, and I think people assume that these are supposed to solve it and that SMTP is therefore broken. The truth is that, at the moment, even as a human, I cannot always tell the difference between wanted and unsolicited email. I know that I have given my email address to people and that it is often passed on by others, sometimes legitimately, and I still see emails that I think are SPAM but aren't. If I can't do it, how on earth is a protocol supposed to?

How do we solve it?

I think there are some really important big-picture considerations to discuss and perhaps to give us a new perspective on things. It is common for people to try and fix things incrementally but sometimes you need a rethink.

Firstly, email is a terrible tool for a lot of the things we use it for. Many people get log-type emails when errors occur or events happen, and even Amazon totally pollutes my inbox with order confirmations, despatch notifications etc. These could easily be switched to a different, relationship-based protocol so that, instead of anyone being allowed to send to my inbox as they can currently, a sender would need to set up some kind of username/password/proof first. Then when Amazon wants to send me notifications, which I want and expect, it can. This removes a lot of noise already.
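
One possible shape for that "proof" is a shared secret handed out when the relationship is set up, which the sender then uses to tag every notification. The Python sketch below is only an illustration of the idea. The class, the function and the sender names are all invented, and a real design would also need key storage, replay protection and so on.

```python
import hashlib
import hmac
import secrets


class NotificationInbox:
    """Accepts notifications only from senders I have a relationship with."""

    def __init__(self):
        self._keys = {}  # sender name -> shared secret

    def subscribe(self, sender):
        """Set up the relationship and return the secret for the sender."""
        key = secrets.token_bytes(32)
        self._keys[sender] = key
        return key

    def unsubscribe(self, sender):
        """End the relationship; later notifications are refused."""
        self._keys.pop(sender, None)

    def accept(self, sender, body, tag):
        key = self._keys.get(sender)
        if key is None:
            return False  # no relationship, no delivery
        expected = hmac.new(key, body, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, tag)


def sign_notification(key, body):
    """What the sender does before sending each notification."""
    return hmac.new(key, body, hashlib.sha256).hexdigest()


inbox = NotificationInbox()
shop_key = inbox.subscribe("shop.example")
tag = sign_notification(shop_key, b"Your order has shipped")

print(inbox.accept("shop.example", b"Your order has shipped", tag))
print(inbox.accept("pills.example", b"Buy now", "00"))
```

The default here is the reverse of email: nothing gets in unless I have set up the relationship first, and I can end it at any time.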

Secondly, we could consider a different way of providing contact details to someone, let's say a recruiter at a conference. You want to provide a limited contact address, not an open one that can be shared etc. This could be via some kind of web app(s) like a contact form with no email address or it could be an address that can only be used once. If you want to continue the conversation, you can reply either via your public email if you are sure or ideally a unique address that is only for that relationship. If you ever receive anything on that channel from someone other than the person you were talking to, you know that they have shared it or been breached. You can also easily disconnect a communication channel.

What this also does is remove many of the SPAM traps and the many false positives. If I get an email on a channel where I do not expect it, rather than having to punish the server that might have relayed it, I can simply block the channel, probably knowing who it is that has misused the access, and can take more direct action. It is much easier to report the one person who has that address than to guess who it was that illegally shared it or used it.
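
A per-relationship address scheme like this can be sketched very simply in Python. Everything here is hypothetical: the class name, the domain and the addresses are made up, and matching on the sender's address is a simplification because a From address on its own can be forged. The sketch only shows the bookkeeping: one random address per contact, a way to tell who leaked it, and a way to cut the channel.

```python
import secrets


class ChannelRegistry:
    """Issues one disposable address per relationship."""

    def __init__(self, domain):
        self.domain = domain
        self._owner = {}  # alias address -> contact it was issued to

    def issue(self, contact):
        """Create a fresh random address for exactly one contact."""
        alias = f"{secrets.token_hex(8)}@{self.domain}"
        self._owner[alias] = contact
        return alias

    def revoke(self, alias):
        """Disconnect the channel; mail to this alias is refused."""
        self._owner.pop(alias, None)

    def classify(self, alias, sender):
        contact = self._owner.get(alias)
        if contact is None:
            return "rejected"
        if sender == contact:
            return "expected"
        # Only one person was ever given this alias.
        return f"leaked by {contact}"


registry = ChannelRegistry("inbox.example")
alias = registry.issue("recruiter@agency.example")

print(registry.classify(alias, "recruiter@agency.example"))
print(registry.classify(alias, "offers@bulk.example"))
registry.revoke(alias)
print(registry.classify(alias, "recruiter@agency.example"))
```

With 16 random hex characters per alias there are 2^64 possible addresses, so guessing a live one is not practical, and none of this requires any change to SMTP itself.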

All of these could be done without doing anything with SMTP.

The underlying fix is to stop thinking of an email address as the digital equivalent of a postal address and instead to use the digital world for what it is good at: largely free, disposable URIs covering an almost infinite keyspace that can provide breakable channels.