Internet-Draft Identifying Email Forwarding February 2024
Chuang Expires 22 August 2024
Workgroup:
Independent Stream
Internet-Draft:
draft-chuang-identifying-email-forwarding-00
Published:
Intended Status:
Experimental
Expires:
Author:
W. Chuang
Google, Inc.

Identifying Email Forwarding

Abstract

Forwarded email often becomes unauthenticated because forwarding breaks SPF (RFC7208) and DKIM (RFC6376) authentication. For example, mailing lists distribute email to multiple recipients through a server other than the original sending server, which breaks IP-based SPF authentication, and they may modify the message, which breaks the DKIM signature. This document calls for using ARC (RFC8617) to identify and authenticate forwarded emails by further specifying the naming of the two digital signatures present in ARC headers: the message signature and the seal. Because this uses ARC digital signatures, the receiver has confidence that a valid signature corresponding to some forwarder could only have been generated by the named domain. This document also specifies that all forwarded mail flows have associated ARC headers, and it specifies the means to characterize the mail flows.

Status of This Memo

This Internet-Draft is submitted in full conformance with the provisions of BCP 78 and BCP 79.

Internet-Drafts are working documents of the Internet Engineering Task Force (IETF). Note that other groups may also distribute working documents as Internet-Drafts. The list of current Internet-Drafts is at https://datatracker.ietf.org/drafts/current/.

Internet-Drafts are draft documents valid for a maximum of six months and may be updated, replaced, or obsoleted by other documents at any time. It is inappropriate to use Internet-Drafts as reference material or to cite them other than as "work in progress."

This Internet-Draft will expire on 22 August 2024.

1. Introduction

1.1. Purpose

Forwarding has long complicated email authentication: forwarding email through a third-party server breaks SPF [RFC7208] IP-based authentication, and breaks DKIM [RFC6376] authentication when the message is modified. The IETF DMARC WG published ARC [RFC8617] as a solution to the limitations of SPF and DKIM based authentication caused by forwarding. ARC headers propagate the DKIM and SPF authentication results, as seen by the forwarder, to the receiver. Using those results, the receiver may choose to override its own authentication result to overcome the SPF and DKIM limitations. The key problem with using ARC this way is trusting that the forwarder didn't forge those results, and this trust problem has prevented further adoption of ARC.

This document proposes a different, potentially complementary approach to using ARC. This document calls for identifying the forwarder and placing the responsibility for any forwarded spam on the forwarder. This specification calls for using the ARC-Message-Signature d= domain to name the forwarder, with the same identity as the DKIM-Signature d= domain. This specification permits both DKIM and ARC header identities, and provides rules and additional tools to help interpret their results. For example, to assist the human reader in understanding the mail flow, this specification provides a new ARC header tag-value to name all the stages of the mail flow.

Notably, this approach does not require the original DKIM signature or all forwarded ARC message signatures to pass. If the message is forwarded, only the most recent ARC message signature must pass. When no ARC headers are present, the DKIM signature must pass. Forwarders may still modify the message in ways that break the prior signatures, but the message must then be ARC message signed afterwards. This document does not attempt to justify why those changes occur, and only broadly attempts to determine who made changes to the message. It does require, as before, that all ARC seals verify. If they do not, this document calls for following the ARC [RFC8617] specification in not trusting the ARC chain. However, unlike that document, this specification calls for the invalid ARC chain results to be preserved for forensics in a clearly distinguishable way.
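
The acceptance rule in the paragraph above can be sketched as a small decision function. This is only an illustration under stated assumptions: the ArcSet structure, the authenticate function, and the returned labels are hypothetical names that this document does not define, and the cryptographic verification results are assumed to have been computed elsewhere.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class ArcSet:
    """Verification outcome for one ARC set (hypothetical structure)."""
    instance: int
    seal_ok: bool   # the ARC-Seal verified
    ams_ok: bool    # the ARC-Message-Signature verified


def authenticate(dkim_ok: bool, arc_sets: list[ArcSet]) -> str:
    """Return which mechanism, if any, authenticates the message."""
    if not arc_sets:
        # No ARC headers: the DKIM signature must pass.
        return "dkim" if dkim_ok else "unauthenticated"
    if not all(s.seal_ok for s in arc_sets):
        # Any failing seal means the chain is not trusted (RFC 8617).
        return "chain_invalid"
    latest = max(arc_sets, key=lambda s: s.instance)
    # Only the most recent ARC message signature has to pass.
    return "arc" if latest.ams_ok else "unauthenticated"


# Forwarded twice; the first forwarder's message signature no longer verifies.
result = authenticate(
    dkim_ok=False,
    arc_sets=[ArcSet(1, seal_ok=True, ams_ok=False),
              ArcSet(2, seal_ok=True, ams_ok=True)],
)
print(result)  # arc
```

The "chain_invalid" outcome is kept separate from "unauthenticated" so that a caller could still retain the failed chain for forensics.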

Delivery Status Notifications (DSN [RFC3461]) and auto-reply messages may be used in backscatter attacks. Part of the difficulty in remedying this problem has been the lack of observability of mail flow at scale. This specification calls for DSN or auto-reply messages that are triggered by some other message delivery to be tracked as one mail flow, by propagating the ARC headers to the DSN or auto-reply messages and continuing to build the chain. By treating the initial message and the subsequent new message as part of the same mail delivery flow, a mail receiver of the DSN or auto-reply will be able to observe the initial forwarding flow, potentially back to the originator. Incorporating DKIM and ARC headers should make it easier for automated systems to discover spam that attempts to hide behind backscatter attacks.

This document uses mail flow descriptions from the Internet-Draft draft-ietf-dkim-replay-problem, which in turn is based on [RFC5598]. The usage in this document is defined in Section 1.2.3.

1.1.1. Existing Authentication Practices

This section attempts to generalize the authentication practices of originators and forwarders at the time of writing. Most email originators sign messages with DKIM following [RFC6376] on behalf of some responsible party's domain to permit authentication to that domain. That domain is called the DKIM Signing Domain Identifier (SDID). Many originators further specify the DKIM SDID to be the same as the domain of the sender in the payload From header, thus identifying the sender in the visible part of the email that is displayed by the MUA. DMARC [RFC7489] calls this process alignment. Occasionally some forwarders re-sign with DKIM, meaning that a DKIM header is added and possibly the old header is removed, which complicates attribution. In some of those cases, the payload From header may be rewritten as well. Forwarders may also generate ARC headers that follow [RFC8617] to report the DKIM and SPF authentication results in the ARC-Authentication-Results header. In many cases the ARC-Seal d= domain and ARC-Message-Signature d= domain are the same.

When there is a prior ARC header seal validation failure, forwarders in many cases do not follow ARC [RFC8617], and generate a single ARC set indicating the failure. In some cases, the ARC headers are removed. In other cases, the forwarder stops generating new headers. However, these actions permit a spammer to introduce an ARC header error to prevent valuable forensics information from being sent to the receiver.

1.1.2. Terminology and Definitions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

1.2. Characterizing Mail Flow

1.2.1. ARC-Message-Signature and ARC-Seal Domains

This specification calls for the ARC-Message-Signature d= domain to represent the responsible party that controls the forwarding of the message. DKIM [RFC6376] calls this domain the SDID, and this document proposes using the DKIM SDID of the controlling forwarder as the ARC-Message-Signature SDID. The existing ARC specification [RFC8617] leaves that domain unspecified, which permits this refinement. Moreover, this specification calls for the ARC-Seal d= domain to represent the party that is responsible for generating the ARC headers and that vouches for the truthfulness of the ARC-Authentication-Results. This too would be an SDID, corresponding to the party that generates the ARC headers. In a cloud environment, the party that generates the ARC headers is often a different entity from the one that controls message forwarding.

Example:

ARC-Seal: i=1; d=cloudprovider.example.com ...
ARC-Message-Signature: i=1; d=mailinglist.example.com ...
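
As a rough sketch of how a receiver might read the two roles out of these headers, the following splits each header value into tag=value pairs and picks out the d= domains. The function names parse_tags and arc_roles and the role labels are hypothetical, the header values are illustrative stand-ins with the elided tags simply omitted, and no signature verification is performed here.

```python
from __future__ import annotations


def parse_tags(value: str) -> dict[str, str]:
    """Split a DKIM/ARC style tag=value list into a dictionary."""
    tags = {}
    for part in value.split(";"):
        if "=" not in part:
            continue  # skip empty or malformed segments
        name, _, val = part.partition("=")
        tags[name.strip()] = "".join(val.split())  # drop folding whitespace
    return tags


def arc_roles(arc_seal: str, arc_message_signature: str) -> dict[str, str]:
    """Map the two d= domains to the roles this document assigns them."""
    return {
        # ARC-Seal d=: the party that generated the ARC headers.
        "header_generator": parse_tags(arc_seal)["d"],
        # ARC-Message-Signature d=: the party that controls forwarding.
        "forwarder": parse_tags(arc_message_signature)["d"],
    }


roles = arc_roles(
    "i=1; d=cloudprovider.example.com; cv=none",
    "i=1; d=mailinglist.example.com",
)
print(roles)
```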

1.2.2. Interaction Between DKIM and ARC

As noted earlier, the more common arrangement between DKIM and ARC today is for the originator to generate a DKIM header as specified in [RFC6376] and for forwarders to generate ARC headers as specified in [RFC8617]. This specification calls for the originator to always align the payload From header domain with the DKIM SDID, in the sense of DMARC [RFC7489]. This alignment permits a subsequent receiver to identify the originator. Some originators may also generate ARC headers, and if so, this specification calls for the ARC-Message-Signature SDID to be the same as the From header domain, to help clarify that the originator generated the ARC headers. This specification also calls for the originator to leave the ARC-Authentication-Results header body empty, as there are no authentication results at origination. The advantage of generating ARC headers at origination is that a receiver may find it convenient to work primarily with one type of header rather than both ARC and DKIM, at the cost of extra header size overhead.

A forwarder may re-sign the message with DKIM and rewrite the payload From header. In that case, this specification calls for the ARC-Message-Signature SDID to be the same as the new DKIM SDID, which is aligned with the rewritten From header. This makes it clear that the forwarder claims ownership of the message.

DKIM originator and ARC forwarder example:

ARC-Seal: i=1; d=forwarder.example.com ...
ARC-Message-Signature: i=1; d=forwarder.example.com ...
ARC-Authentication-Results: i=1; dkim=pass header.i=@orig.example.com ...
DKIM-Signature: d=orig.example.com ...
From: user@orig.example.com

DKIM and ARC originator example:

ARC-Seal: i=1; d=orig.example.com ...
ARC-Message-Signature: i=1; d=orig.example.com ...
ARC-Authentication-Results: i=1;
DKIM-Signature: d=orig.example.com ...
From: user@orig.example.com

Forwarder claiming ownership example:

ARC-Seal: i=1; d=forwarder.example.com ...
ARC-Message-Signature: i=1; d=forwarder.example.com ...
ARC-Authentication-Results: i=1; dkim=pass header.i=@orig.example.com ...
DKIM-Signature: d=forwarder.example.com ...
From: user@forwarder.example.com
DKIM-Signature: d=orig.example.com ...
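
The three arrangements above differ only in how the From domain lines up with the DKIM and ARC-Message-Signature SDIDs, so a receiver could tell them apart by name comparison alone. The sketch below is one possible heuristic, not a rule stated by this document: classify_flow and its labels are hypothetical, exact string equality stands in for DMARC alignment (relaxed alignment is not modeled), and signature verification is assumed to happen separately.

```python
from __future__ import annotations


def classify_flow(from_domain: str, dkim_sdids: list[str],
                  ams_sdid: str | None) -> str:
    """Guess the arrangement from SDID and From-domain comparisons."""
    aligned_dkim = from_domain in dkim_sdids
    if not aligned_dkim:
        return "unknown"  # no DKIM SDID matches the From domain
    if ams_sdid is None:
        return "dkim_originator"  # DKIM only, no ARC headers
    if ams_sdid != from_domain:
        return "arc_forwarder"  # originator DKIM, forwarder ARC
    # The ARC-Message-Signature SDID matches the From domain. A leftover
    # DKIM signature from another domain suggests a rewritten From header.
    others = [d for d in dkim_sdids if d != from_domain]
    return "forwarder_claiming_ownership" if others else "arc_originator"


print(classify_flow("orig.example.com", ["orig.example.com"],
                    "forwarder.example.com"))
print(classify_flow("orig.example.com", ["orig.example.com"],
                    "orig.example.com"))
print(classify_flow("forwarder.example.com",
                    ["forwarder.example.com", "orig.example.com"],
                    "forwarder.example.com"))
```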

1.2.3. Edge and Forwarder Characterization

This document specifies that a description of the functional purpose of the ARC-Message-Signature SDID is added to the ARC-Message-Signature as a tag-value, using the mail flow description tag "m". This serves two purposes: 1) to help a human reading the ARC headers understand the mail flow, and 2) to let receivers know that this forwarder or originator participates in this specification. Notably, these values are not the source of truth about the mail flow characterization; rather, a receiver SHOULD infer the characterization from the DKIM and ARC header SDIDs and payload From alignment.

The values of the "m" tag are characterized as a subset of the Mail Handling Services defined in [RFC5598] and clarified in the Internet-Draft draft-ietf-dkim-replay-problem. This document reuses the list and descriptions from the latter document. In a few cases, this document extends the description, and a later section on discontiguous mail flow further extends this list. The list of Mail Handling Services is:

Edge services:

originator:

Defined in Section 2.2.1 of [RFC5598]. This is the first component of the MHS and works on behalf of the author to ensure the message is valid for transport; it then posts it to the relay (MTA) that provides SMTP store-and-forward transfer. The Originator can DKIM sign the message on behalf of the author, although it is also possible that the author's system, or even the first MTA, does DKIM signing. If this value is specified, the ARC-Authentication-Results header SHOULD be empty.

receiver:

Defined in Section 2.2.4 of [RFC5598]. This is the last stop in the MHS, and works on behalf of the recipient to deliver the message to their inbox; it also might perform filtering. If this value is specified, the message SHOULD NOT subsequently be used in forwarded traffic; when such a message is found in forwarded traffic, it may be processed by spam systems according to local policy.

Forwarder services:

alias:

Defined in Section 5.1 of [RFC5598]. A type of Mediator user, operating in between a delivery and a following posting. The Alias replaces the original RCPT TO envelope recipient address but does not alter the address header fields in the message content. Often used for Auto-Forwarding.

resender:

Defined in Section 5.2 of [RFC5598]. ReSender is a type of Mediator user, like an Alias; however, the ReSender updates the recipient address, and "splices" the destination header field and possibly other address fields as well.

mailing_list:

Defined in Section 5.3 of [RFC5598]. Mailing list is another Mediator; it receives a message and reposts it to the list's members; it might add list-specific header fields [RFC4021] (e.g. List-XYZ:), and might modify other contents, such as revising the Subject: field or adding content to the body.

esp:

Email Service Provider, often called a Bulk Sender: An originating third-party service, acting as an agent of the author and sending to a list of recipients. They may DKIM sign as themselves and/or sign with the author's domain name.

ofs:

Outbound Filtering Service: Rather than sending directly to recipients' servers, the Originator can route mail through a Filtering Service, to provide spam or data loss protection services. This service may modify the message and can have a different ADMD from the Originator.

ifs:

Inbound Filtering Service: The Receiver can also route mail through a Filtering Service, to provide spam, malware and other anti-abuse protection services. Typically, this is done by listing the service in a DNS MX record. This service may modify the message and can have a different ADMD from the Receiver.

Example:

ARC-Message-Signature: i=1; m=mailing_list; ...
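
A minimal sketch of how a receiver might read the "m" tag and sort it into the edge and forwarder groups listed above. The names mail_flow_tag and describe, and the returned labels, are hypothetical. Only the values from this section are included, and the result is a hint for human readers rather than a source of truth.

```python
from __future__ import annotations

EDGE = {"originator", "receiver"}
FORWARDER = {"alias", "resender", "mailing_list", "esp", "ofs", "ifs"}


def mail_flow_tag(ams_value: str) -> str | None:
    """Return the value of the m= tag, or None when the tag is absent."""
    for part in ams_value.split(";"):
        name, sep, val = part.partition("=")
        if sep and name.strip() == "m":
            return val.strip()
    return None


def describe(ams_value: str) -> str:
    """Label an ARC-Message-Signature value by its mail flow tag."""
    m = mail_flow_tag(ams_value)
    if m is None:
        return "absent"  # signer may not participate in this specification
    if m in EDGE:
        return "edge:" + m
    if m in FORWARDER:
        return "forwarder:" + m
    return "unrecognized:" + m


print(describe("i=1; m=mailing_list; d=mailinglist.example.com"))
print(describe("i=1; d=forwarder.example.com"))
```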

1.3. Invalid ARC Headers

The ARC specification [RFC8617] does not acknowledge the forensics value of invalid ARC headers. Furthermore, spammers will intentionally invalidate ARC headers, knowing that receivers will typically stop processing and publishing ARC headers. This specification calls for the receiver to continue generating the ARC headers, but to obfuscate the invalid ARC headers so that the result remains conformant with the ARC specification.

When ARC verification finds prior ARC headers to be invalid, this specification calls for the names of the prior ARC headers to be prefixed with X-Invalid-. The ARC signer continues numbering at the next instance number, as if the prior ARC headers were valid. It also computes the seal as if the prior ARC headers lacked the prefix. The ARC signer generates a new set of normal ARC headers (without any prefix), with the tag cv=fail in the ARC-Seal and the result arc=fail in the ARC-Authentication-Results. ARC validators unaware of this specification will ignore the prior ARC headers carrying the X-Invalid- prefix, and will see that the seal of the new ARC headers intentionally does not validate. Aware validators will process the current ARC headers and the prior invalid ARC headers with the X-Invalid- prefix removed. This specification also calls for supporting invalid headers that lack the X-Invalid- prefix, to keep the very same forensics information when ARC signers do not follow this specification. Humans and automated systems may be able to use the invalid headers to determine which participants in the mail flow contributed to observed, potentially malicious messages.

Example:

ARC-Seal: i=2; d=forwarder2.example.com; cv=fail; ...
ARC-Message-Signature: i=2; d=forwarder2.example.com ...
ARC-Authentication-Results: i=2; dkim=pass; arc=fail ...
X-Invalid-ARC-Seal: i=1; d=forwarder1.example.com; cv=none; ...
X-Invalid-ARC-Message-Signature: i=1; d=forwarder1.example.com ...
X-Invalid-ARC-Authentication-Results: i=1; dkim=pass ...
DKIM-Signature: d=orig.example.com ...
From: user@orig.example.com
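
The renaming step can be sketched as follows, assuming the message headers are available as a list of (name, value) pairs. The function names and the list representation are hypothetical, any field whose name starts with ARC- is treated as part of the chain, and generating and signing the new cv=fail set is left out.

```python
from __future__ import annotations

import re

PREFIX = "X-Invalid-"


def instance_of(value: str) -> int:
    """Return the i= instance number of an ARC header value, or 0."""
    m = re.search(r"(?:^|;)\s*i\s*=\s*(\d+)", value)
    return int(m.group(1)) if m else 0


def mark_invalid_chain(headers: list[tuple[str, str]]):
    """Prefix prior ARC headers and return (headers, next instance)."""
    renamed = []
    highest = 0
    for name, value in headers:
        # Strip an existing prefix first so the step is idempotent.
        bare = name[len(PREFIX):] if name.lower().startswith(PREFIX.lower()) else name
        if bare.lower().startswith("arc-"):
            highest = max(highest, instance_of(value))
            name = PREFIX + bare
        renamed.append((name, value))
    # Numbering continues as if the prior headers had been valid.
    return renamed, highest + 1


prior = [
    ("ARC-Seal", "i=1; d=forwarder1.example.com; cv=none"),
    ("ARC-Message-Signature", "i=1; d=forwarder1.example.com"),
    ("ARC-Authentication-Results", "i=1; dkim=pass"),
    ("DKIM-Signature", "d=orig.example.com"),
    ("From", "user@orig.example.com"),
]
marked, next_instance = mark_invalid_chain(prior)
print(next_instance)  # 2
```

An aware validator would apply the reverse mapping, stripping the prefix before examining the prior sets.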

1.4. Reporting Discontiguous Mail Flow

Mail handling systems may automatically respond to messages by sending new messages with status notifications or by sending auto-replies. In some cases, the original message content is copied over, as with a Delivery Status Notification (DSN) [RFC3461], or the subject is copied over, as with an auto-reply. As a consequence, spammers have used these auto-reply and status notification messages to propagate spam. These attacks are particularly difficult to respond to at scale because of the discontinuity between the original message that triggered the auto-reply and the subsequent message containing spam that is sent to the victim. The discontinuity prevents appropriate attribution by automated systems. This specification proposes that the ARC headers be preserved from the original message and propagated to the new auto-reply message. The receiver that sends the response SHOULD continue the ARC chain as if the message were simply forwarded between the original and the new response, and the ARC instance number should simply increment after the last ARC set in the original message.

To help with human review of such messages, this specification proposes additional mail flow tag values:

ndr:

Non-Delivery Reports: This describes a mail flow for a special kind of Delivery Status Notification when the message is not delivered, and the originator should be notified. This is commonly called a bounce.

dsn:

Delivery Status Notifications: This describes a mail flow for all other delivery notifications besides Non-Delivery Reports, such as when the originator is informed of delivery of the message.

auto_reply:

Auto-replies: This describes a mail flow for messages generated by a message handler working on behalf of a recipient. Some examples of these are vacation responders and calendar schedulers.

Example after a bounce, as seen by the original sender that is now handling the bounce:

ARC-Seal: i=1; d=receiver.example.com; ...
ARC-Message-Signature: i=1; d=receiver.example.com; m=ndr; ...
ARC-Authentication-Results: i=1; dkim=pass header.i=@orig.example.com ...
DKIM-Signature: d=receiver.example.com ...
From: no-reply@receiver.example.com
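
One way to picture the chain being continued onto a response message is the sketch below, which carries the ARC headers of the triggering message over and numbers the new set after them. Everything here is an assumption made for illustration: the response_skeleton name, the (name, value) list representation, and the choice of cv=pass whenever prior sets exist (which presumes the prior chain validated). The ARC-Authentication-Results header and all signature data are omitted, so the output is an unsigned skeleton only.

```python
from __future__ import annotations

import re

RESPONSE_FLOWS = {"ndr", "dsn", "auto_reply"}


def _instance(value: str) -> int:
    """Return the i= instance number of an ARC header value, or 0."""
    m = re.search(r"(?:^|;)\s*i\s*=\s*(\d+)", value)
    return int(m.group(1)) if m else 0


def response_skeleton(original: list[tuple[str, str]], flow: str,
                      domain: str, sender: str) -> list[tuple[str, str]]:
    """Build unsigned headers for a DSN or auto-reply that extends the chain."""
    if flow not in RESPONSE_FLOWS:
        raise ValueError("unknown response flow: " + flow)
    # Keep only the ARC headers of the message that triggered the response.
    carried = [(n, v) for n, v in original if n.lower().startswith("arc-")]
    i = max((_instance(v) for _, v in carried), default=0) + 1
    cv = "none" if i == 1 else "pass"  # assumes the prior chain validated
    new_set = [
        ("ARC-Seal", f"i={i}; d={domain}; cv={cv}"),
        ("ARC-Message-Signature", f"i={i}; d={domain}; m={flow}"),
    ]
    return new_set + carried + [("From", sender)]


forwarded = [
    ("ARC-Seal", "i=1; d=forwarder.example.com; cv=none"),
    ("ARC-Message-Signature", "i=1; d=mailinglist.example.com"),
    ("DKIM-Signature", "d=orig.example.com"),
    ("From", "john.doe@orig.example.com"),
]
bounce = response_skeleton(forwarded, "ndr", "receiver.example.com",
                           "no-reply@receiver.example.com")
print(bounce[0])
```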

1.5. Additional Examples

These examples are informational. These are more complex examples, to show how these various techniques may compose across the different participants in the mail flow.

1.5.1. Originator ⇒ Mailing-List ⇒ Receiver: Bounce

This message is sent through a mailing-list to some receiver where the message bounces. The initial sent message from the originator:

DKIM-Signature: d=orig.example.com ...
From: john.doe@orig.example.com
Subject: A really big announcement

It's Jane Doe's birthday tomorrow!

The inbound DKIM signature passes verification. Next, the mailing-list modifies the message by adding a Subject header prefix and a message-body footer, then adds an ARC set with the DKIM authentication result. As the mailing-list is hosted on forwarder.example.com, the ARC-Seal SDID is forwarder.example.com, while the ARC-Message-Signature SDID is mailinglist.example.com. It also adds a new DKIM-Signature with an SDID of mailinglist.example.com.

ARC-Seal: i=1; d=forwarder.example.com; cv=none ...
ARC-Message-Signature: i=1; d=mailinglist.example.com;
    m=mailing_list ...
ARC-Authentication-Results: i=1;
    dkim=pass header.i=@orig.example.com ...
DKIM-Signature: d=mailinglist.example.com ...
DKIM-Signature: d=orig.example.com ...
From: john.doe@orig.example.com
Subject: [school list] A really big announcement

It's Jane Doe's birthday tomorrow!

============
This is the school mailing-list.

When the receiver verifies the message, it finds that the original message DKIM signature fails verification, but the ARC-Message-Signature passes, along with the ARC seal and the new DKIM signature. As the ARC-Message-Signature contains a mail flow tag (m=mailing_list), the receiver knows the forwarder is participating in this specification, and can extract the SDID from the ARC-Message-Signature d= domain. From this, the receiver understands that the message was forwarded by mailinglist.example.com and that mailinglist.example.com is responsible for this message. At some point during delivery, the receiver sees that the message is being sent to a non-existent user, and bounces the message to the original sender. It removes the original headers, except the ARC headers, and adds new ones indicating the bounce in an NDR.

ARC-Seal: i=2; d=receiver.example.com; cv=pass ...
ARC-Message-Signature: i=2; d=receiver.example.com;
    m=ndr ...
ARC-Authentication-Results: i=2;
    dkim=fail header.i=@orig.example.com;
    dkim=pass header.i=@mailinglist.example.com;
    arc=pass ...
ARC-Seal: i=1; d=forwarder.example.com; cv=none ...
ARC-Message-Signature: i=1; d=mailinglist.example.com;
    m=mailing_list ...
ARC-Authentication-Results: i=1;
    dkim=pass header.i=@orig.example.com ...
DKIM-Signature: d=receiver.example.com ...
From: no-reply@receiver.example.com
Subject: Delivery Status Notification

** Delivery incomplete **

The originator receives the bounce, and verifies the content. The DKIM signature signed by the receiver passes, as do the ARC seals. The originator can see from the ARC headers that this message originally came from it, as the ARC-Authentication-Results at i=1 indicates the original DKIM SDID was orig.example.com, and that it is likely a bounce. This is confirmed by the mail flow tag of m=ndr found in the ARC-Message-Signature at i=2.
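
The originator's check described above could be sketched as follows. The is_own_bounce function and the dictionary layout (instance number mapped to the "ams" and "aar" header values) are hypothetical, the header values are illustrative stand-ins, and the function assumes the ARC seals were already verified, since unverified headers prove nothing.

```python
from __future__ import annotations

import re

# Domains reported as dkim=pass in an ARC-Authentication-Results value.
PASS_RE = re.compile(r"dkim=pass\b[^;]*?header\.i=[^;\s]*@([A-Za-z0-9.-]+)")
M_RE = re.compile(r"(?:^|;)\s*m\s*=\s*([A-Za-z_]+)")


def is_own_bounce(sets: dict[int, dict[str, str]], own_domain: str) -> bool:
    """True when the chain starts at own_domain and ends in an NDR."""
    if not sets:
        return False
    first, last = sets[min(sets)], sets[max(sets)]
    # The earliest set should record a passing DKIM result for our domain.
    from_us = own_domain in PASS_RE.findall(first["aar"])
    # The latest set should be tagged as a non-delivery report.
    m = M_RE.search(last["ams"])
    return from_us and m is not None and m.group(1) == "ndr"


chain = {
    1: {"ams": "i=1; d=mailinglist.example.com; m=mailing_list",
        "aar": "i=1; dkim=pass header.i=@orig.example.com"},
    2: {"ams": "i=2; d=receiver.example.com; m=ndr",
        "aar": "i=2; dkim=fail header.i=@orig.example.com; "
               "dkim=pass header.i=@mailinglist.example.com; arc=pass"},
}
print(is_own_bounce(chain, "orig.example.com"))  # True
```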

1.6. Security Considerations

A spammer may remove all ARC and DKIM headers to confuse an ARC/DKIM aware receiver. However, this specification calls for ARC headers to always be present on forwarded messages and for the message to always have a DKIM signature. The lack of ARC and DKIM headers indicates that the message cannot be authenticated with the method in this specification. A valid DKIM signature indicates that the message has not been tampered with since origination and can be authenticated. If the DKIM signature is invalid, then to be considered authenticated the message must have valid ARC headers with a forwarder that can be identified.

A spammer may manipulate ARC headers to confuse an ARC aware receiver. Modification of existing ARC headers will result in the seal not verifying, which the receiver will detect and handle using ARC [RFC8617]. Manipulated ARC headers will not be used in authenticating the message.

A spammer may choose to participate in ARC header generation with valid ARC seals but misleading results, e.g. results for a good message attached to a spammy one. This message is then forwarded to the receiver. This may confuse the receiver, but it also identifies the spammy forwarder.

A spammer may replay messages ARC signed by another party. This document does not attempt to solve that problem and leaves that to subsequent work such as draft-chuang-replay-resistant-arc.

1.7. IANA Considerations

There are no requests at this time.

2. Normative References

[RFC2119]
Bradner, S., "Key words for use in RFCs to Indicate Requirement Levels", BCP 14, RFC 2119, DOI 10.17487/RFC2119, <https://www.rfc-editor.org/rfc/rfc2119>.
[RFC3461]
Moore, K., "Simple Mail Transfer Protocol (SMTP) Service Extension for Delivery Status Notifications (DSNs)", RFC 3461, DOI 10.17487/RFC3461, <https://www.rfc-editor.org/rfc/rfc3461>.
[RFC4021]
Klyne, G. and J. Palme, "Registration of Mail and MIME Header Fields", RFC 4021, DOI 10.17487/RFC4021, <https://www.rfc-editor.org/rfc/rfc4021>.
[RFC5598]
Crocker, D., "Internet Mail Architecture", RFC 5598, DOI 10.17487/RFC5598, <https://www.rfc-editor.org/rfc/rfc5598>.
[RFC6376]
Crocker, D., Ed., Hansen, T., Ed., and M. Kucherawy, Ed., "DomainKeys Identified Mail (DKIM) Signatures", STD 76, RFC 6376, DOI 10.17487/RFC6376, <https://www.rfc-editor.org/rfc/rfc6376>.
[RFC7208]
Kitterman, S., "Sender Policy Framework (SPF) for Authorizing Use of Domains in Email, Version 1", RFC 7208, DOI 10.17487/RFC7208, <https://www.rfc-editor.org/rfc/rfc7208>.
[RFC7489]
Kucherawy, M., Ed. and E. Zwicky, Ed., "Domain-based Message Authentication, Reporting, and Conformance (DMARC)", RFC 7489, DOI 10.17487/RFC7489, <https://www.rfc-editor.org/rfc/rfc7489>.
[RFC8617]
Andersen, K., Long, B., Ed., Blank, S., Ed., and M. Kucherawy, Ed., "The Authenticated Received Chain (ARC) Protocol", RFC 8617, DOI 10.17487/RFC8617, <https://www.rfc-editor.org/rfc/rfc8617>.

Appendix A. Acknowledgments

Thanks go to Emanuel Schorsch. The mail handler server list and content in Section 1.2.3 were written by David Crocker.

Author's Address

Weihaw Chuang
Google, Inc.