
Authenticated Encryption




In the labyrinthine world of cryptography, where every shadow might conceal a threat, authenticated encryption (AE) emerges as a rather necessary construct. It’s an encryption scheme that, with a commendable lack of fuss, manages to simultaneously guarantee two fundamental properties for data in transit or at rest: its confidentiality and its authenticity. One might imagine these as two distinct, yet equally vital, layers of protection, ensuring not only that a message remains private but also that it hasn’t been tampered with by any unwelcome hands.

Firstly, AE ensures confidentiality, often referred to as privacy. This means the encrypted message, or ciphertext, is rendered utterly incomprehensible to anyone who does not possess the designated secret key. Without this shared secret, the data remains an opaque, meaningless jumble, a silent testament to its guarded nature.

Secondly, and just as critically, AE guarantees authenticity. This property means the encrypted message is, for all practical purposes, unforgeable. It includes an integral component known as an authentication tag. This tag serves as a cryptographic fingerprint, meticulously calculated by the sender using the same secret key. Upon receipt, the legitimate recipient can verify this tag. If the tag doesn’t match the received ciphertext or has been altered, it immediately signals that the message has either been modified in transit or originated from an unauthorized source. This prevents adversaries from injecting malicious or altered data disguised as legitimate communications.

Illustrative examples of encryption modes that elegantly provide this dual assurance of authenticated encryption include Galois/Counter Mode (GCM) and Counter with CBC-MAC (CCM). These modes are not merely academic curiosities but practical workhorses in securing digital communications.

A significant number of authenticated encryption schemes, though notably not all, also possess the foresight to allow for the inclusion of “associated data” (AD). This associated data is an intriguing concept: it is deliberately not made confidential, meaning it remains readable in its original form. However, and this is where the utility lies, it is rigorously protected for integrity, making it tamper-evident. Consider, for instance, the header of a network packet. For the packet to be correctly routed across various network nodes, these intermediate points must be able to read the destination address contained within the header. Yet, for security reasons, these intermediate nodes should not, and often cannot, possess the secret key required to decrypt the actual payload. In such scenarios, associated data protection ensures that while the header is readable, any attempt to alter it will be immediately detected. Schemes that incorporate this feature are known as authenticated encryption with associated data (AEAD) schemes, a testament to the real-world complexities that cryptography must navigate.

Programming interface

For those tasked with implementing or interacting with authenticated encryption within software, a typical programming interface presents a rather straightforward, almost deceptively simple, set of functions. One might think such critical operations would be shrouded in more complexity, but the elegance lies in their clear delineation.

At its core, an AE implementation generally offers two primary functions:

  • Encryption: This function takes the original, unencrypted message, known as the plaintext, as its primary input. It also requires the secret key that will be used for both confidentiality and authenticity. Crucially, and reflecting the real-world demands of AEAD, it often accepts an optional “header.” This header, also referred to as additional authenticated data (AAD) or simply associated data (AD), is provided in plaintext. Its purpose is explicit: it will not be encrypted, remaining fully readable, but it will be meticulously covered by the authenticity protection. The output of this function is a formidable pair: the ciphertext, which is the encrypted form of the original plaintext, and an authentication tag. This tag is essentially a message authentication code (MAC), a compact cryptographic checksum that vouches for the integrity and authenticity of both the ciphertext and the optional header.

  • Decryption: This function is the mirror image of encryption, designed to reverse the process and verify the message’s integrity. Its inputs include the received ciphertext, the same secret key used during encryption, and the authentication tag that accompanied the message. If a header (AAD/AD) was used during encryption, it must also be provided here, in its original plaintext form, for the authenticity check. The output of this function is either the original plaintext, faithfully restored, or, perhaps more frequently in the chaotic digital landscape, an error. This error signals a critical failure: the provided authentication tag simply does not match the supplied ciphertext or header, indicating that the message has been compromised or is illegitimate.
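To make the interface above concrete, here is a minimal, illustrative sketch in Python. It is emphatically not a real cipher: the keystream is improvised from HMAC-SHA256 for the sake of a runnable example, and a single key is used for both encryption and authentication, whereas real designs derive separate keys. All function names here are hypothetical.

```python
import hmac
import hashlib

def _keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Toy keystream: HMAC-SHA256 over nonce||counter. Illustration only,
    # not a vetted cipher.
    out = b""
    counter = 0
    while len(out) < length:
        out += hmac.new(key, nonce + counter.to_bytes(8, "big"),
                        hashlib.sha256).digest()
        counter += 1
    return out[:length]

def encrypt(key: bytes, nonce: bytes, plaintext: bytes,
            header: bytes = b"") -> tuple:
    # Encrypt the plaintext, then authenticate header + nonce + ciphertext.
    # The header stays readable but is covered by the tag (AEAD shape).
    ciphertext = bytes(p ^ k for p, k in
                       zip(plaintext, _keystream(key, nonce, len(plaintext))))
    tag = hmac.new(key, header + nonce + ciphertext, hashlib.sha256).digest()
    return ciphertext, tag

def decrypt(key: bytes, nonce: bytes, ciphertext: bytes, tag: bytes,
            header: bytes = b"") -> bytes:
    # Verify the tag first; only release plaintext on success.
    expected = hmac.new(key, header + nonce + ciphertext,
                        hashlib.sha256).digest()
    if not hmac.compare_digest(expected, tag):
        raise ValueError("authentication failed")
    return bytes(c ^ k for c, k in
                 zip(ciphertext, _keystream(key, nonce, len(ciphertext))))
```

Flipping a single byte of the ciphertext, the header, or the tag causes `decrypt` to raise rather than return garbled plaintext, which is precisely the behavior the interface description demands.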

The aforementioned header part is not an afterthought; it’s a deliberate design choice. Its primary intent is to extend both authenticity and integrity protection to networking or storage metadata. This metadata, while vital for the correct functioning of systems (like a packet’s destination address or a file’s creation timestamp), does not require confidentiality. Yet, its authenticity is paramount to prevent malicious manipulation that could lead to misrouting, data corruption, or other system failures. It’s a pragmatic concession to the realities of system architecture, ensuring that even non-secret components are trustworthy.

History

The genesis of authenticated encryption as a formalized concept was, predictably, born out of necessity—or, more accurately, out of the widespread failure to adequately combine existing cryptographic primitives. It became rather painfully clear that attempting to securely glue together separate confidentiality (encryption) and authentication (message authentication code or MAC) block cipher operation modes was a task far more error-prone and difficult than many initially assumed. One might have thought that “secure encryption” plus “secure authentication” would automatically yield “secure authenticated encryption.” Such naive optimism, it turned out, was largely unfounded.

This harsh reality was underscored by a disconcerting parade of practical attacks. These vulnerabilities were not confined to theoretical papers but manifested directly in production protocols and applications, stemming from incorrect implementations or, perhaps more egregiously, a complete lack of proper authentication. The consequences were, as one might expect, rather dire, highlighting a critical gap in cryptographic design and application.

Around the turn of the millennium, a collective realization sparked numerous efforts focused on the standardization of modes that could inherently ensure correct implementation of both privacy and integrity. The burgeoning interest in demonstrably secure modes was significantly catalyzed by the publication of Charanjit Jutla’s seminal work in 2000, introducing his integrity-aware CBC (IACBC) and integrity-aware parallelizable (IAPM) modes. These early contributions, which would later influence modes like OCB, marked a crucial step towards robust authenticated encryption.

The international community, recognizing the urgent need for reliable solutions, eventually codified several different authenticated encryption modes. Six distinct approaches—specifically Offset Codebook Mode 2.0 (OCB 2.0), Key Wrap, Counter with CBC-MAC (CCM), Encrypt then Authenticate then Translate (EAX), Encrypt-then-MAC (EtM), and Galois/Counter Mode (GCM)—were formally standardized within ISO/IEC 19772:2009. This standardization represented a significant milestone, providing a framework for developers to implement these complex functions correctly and consistently.

The momentum continued, with even more authenticated encryption methods being developed and proposed in response to solicitations from the National Institute of Standards and Technology (NIST), demonstrating an ongoing commitment to refining and expanding the cryptographic toolkit. Furthermore, the versatile concept of sponge functions, known for their ability to absorb input and squeeze out output, was found to be adaptable for providing authenticated encryption when employed in duplex mode, showcasing the flexibility of modern cryptographic primitives.

Earlier, in 2000, Mihir Bellare and Chanathip Namprempre conducted a rigorous analysis of three common compositions of encryption and MAC primitives. Their work was instrumental in demonstrating that the approach of encrypting a message first and then applying a MAC to the resulting ciphertext—the so-called Encrypt-then-MAC (EtM) paradigm—could indeed imply security against an adaptive chosen ciphertext attack, provided that both the encryption and MAC functions met minimum required security properties. Concurrently, Jonathan Katz and Moti Yung explored a similar notion, terming it “unforgeable encryption,” and further solidified its theoretical underpinnings by proving its implication of security against chosen-ciphertext attacks. These foundational analyses provided much-needed theoretical rigor to the practical challenges of combining primitives.

The pursuit of better authenticated encryption continued, leading to the announcement of the CAESAR competition in 2013. This competition aimed to stimulate the design and evaluation of new, more robust, and efficient authenticated encryption modes, pushing the boundaries of what was considered state-of-the-art.

In a more recent development from 2015, ChaCha20-Poly1305 was introduced as a potent alternative authenticated encryption construction to GCM within various Internet Engineering Task Force (IETF) protocols. This addition reflected an ongoing evolution in cryptographic preferences, driven by factors such as performance, resistance to certain types of attacks, and ease of implementation.

Variants

The core concept of authenticated encryption, while robust, has seen several refinements and specialized variants emerge to address specific challenges and use cases. These adaptations acknowledge that the digital landscape is rarely a one-size-fits-all scenario.

Authenticated encryption with associated data

As previously hinted, authenticated encryption with associated data (AEAD) stands as a prominent variant of AE. It’s designed for scenarios where certain portions of a message, while not requiring confidentiality, absolutely demand integrity and authenticity. The “associated data” (AD), also known as “additional non-confidential information” or “additional authenticated data” (AAD), allows for this nuanced protection. A recipient using an AEAD scheme can, therefore, rigorously check the integrity of both the associated data and the confidential information within a message.

This feature proves immensely useful in contexts such as network packets. Here, the header of a packet, containing essential metadata like source and destination addresses, must remain visible for efficient routing across the network. However, the integrity of this header is paramount; any malicious alteration could lead to misdirection or denial of service. Concurrently, the actual payload of the packet often requires full confidentiality. AEAD expertly handles this duality, ensuring that the header is readable but tamper-evident, while the payload is both confidential and authenticated. The formalization of this critical notion was provided by Phillip Rogaway in 2002, solidifying its place in modern cryptography.

Key-committing AEAD

One might assume that if an authenticated encryption scheme successfully validates an authentication tag using a specific symmetric key, say K_A, held by Alice, it inherently proves that the message originated from a party possessing K_A and was not tampered with by an adversary, such as Mallory, who lacks K_A. This is the essence of ciphertext integrity. However, most AE schemes, even the widely popular GCM, traditionally do not provide what is known as key commitment. Key commitment is a stronger guarantee: it ensures that decryption would unequivocally fail if any other key were used.

As of 2021, a rather unsettling reality persists: many existing AE schemes permit certain specially crafted messages to be decrypted without error using keys other than the single, correct K_A. While the plaintext recovered using a second (incorrect) key, K_M, would undoubtedly be garbled and incorrect, the authentication tag might, disturbingly, still match this new, incorrect plaintext. This vulnerability, though seemingly esoteric and requiring Mallory to possess both K_A and K_M to craft such a message, is far from a purely academic concern.

Under specific, unfortunate circumstances, practical attacks can be mounted against implementations vulnerable to this lack of key commitment. Imagine an identity authentication protocol where a user’s identity is verified solely by the successful decryption of a message using a password-based key. If Mallory can craft a single message that successfully decrypts under, say, a thousand different weak (and thus known to her) potential passwords, she can accelerate her dictionary attack by a factor of nearly 1000. For such an attack to succeed, Mallory also needs a mechanism to distinguish a successful decryption by Alice from an unsuccessful one, effectively turning Alice’s side into an oracle—a flaw often stemming from poor protocol design or implementation. Naturally, this attack vector is entirely moot when keys are generated randomly, as is best practice.

The concept of key commitment was initially explored in the 2010s by researchers such as Abdalla et al. and Farshim et al., often under the moniker “robust encryption.” Their work laid the groundwork for understanding and mitigating this subtle but significant vulnerability.

To counteract the described attack without requiring the removal of the potentially problematic “oracle” (which might be difficult in existing systems), a key-committing AEAD scheme can be employed. These schemes are designed to prevent the existence of such crafted messages. AEGIS stands as an example of a fast and efficient key-committing AEAD, particularly when modern processor instruction sets like AES-NI are available. Furthermore, it is demonstrably possible to augment existing AEAD schemes with key-commitment properties, offering a pathway to enhance security without a complete overhaul.
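One simple way to picture the “augment an existing AEAD with key commitment” idea is to transmit, alongside the ciphertext and tag, a hash bound to the key but not to the message. The sketch below is a hypothetical illustration under that assumption; the function names and the domain-separation label are invented here, and published constructions derive the commitment with proper key-derivation functions rather than a bare hash.

```python
import hashlib
import hmac

def key_commitment(key: bytes) -> bytes:
    # A value any holder of `key` can recompute, but which differs for any
    # other key (assuming the hash is collision resistant). It depends only
    # on the key, never on the message.
    return hashlib.sha256(b"key-commit-v1" + key).digest()

def committed_seal(key: bytes, ciphertext: bytes, tag: bytes) -> tuple:
    # Ship the commitment together with the underlying AEAD output.
    return ciphertext, tag, key_commitment(key)

def commitment_matches(key: bytes, received_commitment: bytes) -> bool:
    # The receiver checks the commitment before even attempting AEAD
    # decryption; a single message can no longer validate under two
    # different keys, which defeats the multi-key crafted-message trick.
    return hmac.compare_digest(key_commitment(key), received_commitment)
```

The check adds one hash per message and a fixed-size field on the wire, which is why retrofit schemes of roughly this shape are attractive for existing protocols.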

Misuse-resistant authenticated encryption

In a world where human error is not just a possibility but a near certainty, misuse-resistant authenticated encryption (MRAE) offers a pragmatic concession to reality. MRAE schemes possess an invaluable additional property: even if the same cryptographic nonce (a number used once) is inadvertently reused for multiple messages—a common and catastrophic mistake in many cryptographic contexts—an attacker will not be able to recover the plaintext. This robustness against common implementation errors makes MRAE particularly appealing for systems where developers might lack deep cryptographic expertise or where operational constraints could lead to nonce reuse.
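To see why nonce reuse is so catastrophic for ordinary (non-MRAE) stream-style modes, consider a toy keystream cipher: encrypting two messages under the same key and nonce lets a passive observer XOR the two ciphertexts and recover the XOR of the two plaintexts, no key required. The keystream function below is an illustrative stand-in, but counter-based modes such as CTR and GCM degrade analogously under nonce reuse.

```python
import hashlib

def keystream(key: bytes, nonce: bytes, length: int) -> bytes:
    # Illustrative hash-based keystream; stands in for CTR-style modes.
    out = b""
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + nonce + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:length]

def xor(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

key, nonce = b"k" * 32, b"n" * 12          # nonce reused: the mistake
p1, p2 = b"attack at dawn", b"retreat by day"
c1 = xor(p1, keystream(key, nonce, len(p1)))
c2 = xor(p2, keystream(key, nonce, len(p2)))

# With ciphertexts alone, the eavesdropper learns p1 XOR p2, from which
# plaintext is often recoverable via crib-dragging:
leak = xor(c1, c2)
assert leak == xor(p1, p2)
```

An MRAE scheme, by contrast, is designed so that nonce reuse leaks at most whether two messages are identical, never their contents.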

The formalization of MRAE was provided in 2006 by the esteemed cryptographers Phillip Rogaway and Thomas Shrimpton, recognizing the critical need for schemes that fail gracefully, rather than catastrophically, in the face of operational misuse. A prominent example of an MRAE algorithm, designed with this resilience in mind, is AES-GCM-SIV.

Approaches to authenticated encryption

The journey towards robust authenticated encryption has seen various architectural approaches, each with its own set of strengths, weaknesses, and historical implications. These paradigms dictate the order in which encryption and authentication operations are applied, a sequence that, as history has repeatedly shown, is far from arbitrary.

Encrypt-then-MAC (EtM)

The Encrypt-then-MAC (EtM) approach is, for many cryptographers, the conceptually most sound method for combining confidentiality and authenticity. In this paradigm, the plaintext message is first subjected to the encryption process, yielding a ciphertext. Subsequently, a message authentication code (MAC) is computed solely based on this resulting ciphertext. The ciphertext and its accompanying MAC are then transmitted together.

This sequence is not merely a preference; it is the standardized method according to ISO/IEC 19772:2009. The reason for its prominence lies in its security properties: EtM is uniquely positioned as the only method among the generic composition paradigms that can achieve the highest definition of security in AE, specifically “strong unforgeability.” This pinnacle of security is attainable, however, only when the MAC employed is itself “strongly unforgeable,” underscoring the importance of robust underlying primitives.

The practical adoption of EtM reflects its proven security. IPsec, the suite of protocols used to secure Internet Protocol (IP) communications, formally adopted EtM in 2005. More recently, in 2014, TLS and DTLS (Datagram Transport Layer Security) received extensions specifically for EtM via RFC 7366, further cementing its role in securing web and real-time communications. Even SSHv2 (Secure Shell version 2) offers various EtM ciphersuites, such as hmac-sha1-etm@openssh.com, demonstrating its widespread acceptance across different secure communication protocols.

Encrypt-and-MAC (E&M)

The Encrypt-and-MAC (E&M) approach takes a somewhat different path, one that has been the subject of considerable scrutiny. In this method, a message authentication code (MAC) is generated based on the original plaintext. Simultaneously, or independently, the plaintext is encrypted, but without incorporating the MAC into the encryption process itself. Both the plaintext’s MAC and the resulting ciphertext are then transmitted together.

A notable example of E&M’s deployment can be found in SSH. While this approach has not, in its raw form, been proven to achieve the coveted “strong unforgeability” that EtM can, it’s not entirely without hope. Researchers have shown that it is possible to apply certain minor modifications to SSH’s implementation to bolster its security, enabling it to achieve strong unforgeability despite its foundational E&M paradigm. This highlights a recurring theme in cryptography: theoretical weaknesses can sometimes be mitigated through careful, protocol-specific engineering, though it often requires a deeper understanding of the entire system rather than just the individual cryptographic primitives.

MAC-then-Encrypt (MtE)

The MAC-then-Encrypt (MtE) approach is, by many accounts, the least advisable of the generic composition methods, and its history is replete with cautionary tales. Here, a message authentication code (MAC) is first computed over the plaintext. Then, both the original plaintext and the calculated MAC are bundled together and encrypted to produce a single ciphertext. This ciphertext, now containing an encrypted MAC, is what gets transmitted.
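The three generic compositions differ only in what the MAC covers and what gets encrypted, which a side-by-side sketch makes plain. The `enc` and `mac` placeholders below are improvised from hashlib/hmac purely to keep the example runnable; they are not real primitives. Separate encryption and MAC keys are used, as the Bellare–Namprempre analysis assumes.

```python
import hmac
import hashlib

def enc(key: bytes, data: bytes) -> bytes:
    # Placeholder encryption: XOR with a hash-derived pad (illustration
    # only; XOR makes enc its own inverse, convenient for demonstration).
    pad = hashlib.sha256(key).digest() * (len(data) // 32 + 1)
    return bytes(d ^ p for d, p in zip(data, pad))

def mac(key: bytes, data: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest()

def encrypt_then_mac(ke, km, pt):
    # EtM: the MAC covers the ciphertext; verify before decrypting.
    ct = enc(ke, pt)
    return ct, mac(km, ct)

def encrypt_and_mac(ke, km, pt):
    # E&M: the MAC covers the plaintext; ciphertext and tag are independent.
    return enc(ke, pt), mac(km, pt)

def mac_then_encrypt(ke, km, pt):
    # MtE: plaintext||MAC is encrypted as one blob; the tag is hidden and
    # can only be checked after decryption.
    return enc(ke, pt + mac(km, pt))
```

The structural point is visible in the return values: only EtM lets a recipient reject a forgery without touching the decryption key, which is one intuition behind its stronger generic guarantees.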

Historically, this approach was prevalent. Indeed, until TLS 1.2, virtually all available SSL/TLS cipher suites relied on the MtE paradigm. This widespread adoption, however, did not equate to inherent security.

The MtE approach, much like E&M, has not been proven to be strongly unforgeable in its generic form. While Hugo Krawczyk famously provided a proof that SSL/TLS was, in fact, secure despite using MtE, his proof relied on specific encoding mechanisms employed alongside the MtE structure. However, even Krawczyk’s proof contained assumptions that later proved flawed, particularly concerning the randomness of the initialization vector (IV).

These flawed assumptions had tangible consequences. The infamous 2011 BEAST attack (Browser Exploit Against SSL/TLS) directly exploited the non-random chained IVs prevalent in TLS 1.0 and earlier, effectively breaking all CBC algorithms when used with MtE. This attack served as a stark reminder that even seemingly minor deviations from cryptographic best practices, or assumptions about randomness that don’t hold in practice, can lead to devastating vulnerabilities.

Further, deeper analysis of SSL/TLS modeled its protection not merely as MAC-then-Encrypt, but more precisely as MAC-then-pad-then-encrypt. This involves the plaintext being first padded to align with the block size of the encryption function. This padding, while necessary for block ciphers, introduced another critical attack surface. Errors in padding frequently result in detectable errors on the recipient’s side—errors that can be leveraged by attackers. This led directly to the development of sophisticated padding oracle attacks, such as the notorious Lucky Thirteen attack, which could reveal plaintext information byte by byte by observing server responses to malformed ciphertexts. The history of MtE in SSL/TLS is a compelling, if somewhat depressing, case study in the perils of cryptographic design choices and the persistent challenge of securing complex protocols against ingenious adversaries.
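The padding check at the heart of these attacks is simple to state. A minimal PKCS#7-style unpadding routine, sketched below (TLS’s CBC padding differs in detail, so treat this as an illustration of the pattern, not of TLS itself), shows exactly the observable failure an attacker probes for:

```python
def pkcs7_unpad(padded: bytes, block_size: int = 16) -> bytes:
    # PKCS#7: the final byte n (1..block_size) says the data ends in n
    # copies of the byte value n.
    if not padded or len(padded) % block_size != 0:
        raise ValueError("bad length")
    n = padded[-1]
    if n < 1 or n > block_size or padded[-n:] != bytes([n]) * n:
        # In a padding-oracle setting, this distinguishable error (via an
        # alert, a timing difference, or a log line) is what the attacker
        # observes and exploits, one byte of plaintext at a time.
        raise ValueError("bad padding")
    return padded[:-n]
```

The standard mitigations are to make the padding and MAC checks indistinguishable and constant-time, or better, to use an AEAD mode that removes padding from the attack surface entirely.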
