ProTRR: Principled yet Optimal In-DRAM Target Row Refresh

ProTRR is the first principled in-DRAM mitigation against Rowhammer. We designed ProTRR to provide formal security guarantees with optimal and flexible trade-offs between the number of required counters and the number of refreshes. Our results show negligible power, performance, and area impact on DDR4 and DDR5 devices. ProTRR requires no changes to the CPU and is the first mitigation compatible with the latest DDR5 standard, which introduces the Refresh Management (RFM) extension.

What is Rowhammer?

Rowhammer is a DRAM vulnerability caused by interference between different rows of data stored in DRAM. This interference enables attackers to change values in memory, which they can use to escalate privileges and escape sandboxes. To stop Rowhammer, DRAM vendors implemented mitigations known as Target Row Refresh (TRR). However, our recent work showed that these mitigations are broken. To address this problem, we designed ProTRR, which we have proven to be secure against the best possible Rowhammer attack.

How did you design ProTRR?

We started by formalizing Feinting, the best Rowhammer attack against an ideal in-DRAM mitigation. We then designed and implemented ProTRR based on a new online frequent-item counting scheme that we call ProMG (Proactive Misra-Gries). ProMG extends the optimal Misra-Gries summaries to in-DRAM settings, using the bounds given by Feinting. This makes ProTRR optimal in the number of counters needed to protect against Feinting.
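For readers unfamiliar with Misra-Gries summaries, the classic textbook algorithm can be sketched as follows. This is only the standard scheme that ProMG builds on, not ProMG itself; the function name `misra_gries` and the parameter `num_counters` are illustrative choices of ours.

```python
def misra_gries(stream, num_counters):
    """Classic Misra-Gries summary using at most `num_counters` counters.

    Each stored count underestimates the true frequency of its item by at
    most len(stream) / (num_counters + 1).
    """
    counters = {}
    for item in stream:
        if item in counters:
            # Item is already tracked: count this occurrence.
            counters[item] += 1
        elif len(counters) < num_counters:
            # A free counter is available: start tracking the item.
            counters[item] = 1
        else:
            # Table is full: decrement every counter and drop those at zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    return counters


# Row 7 is activated 6 times out of 9; two counters are enough to keep it.
summary = misra_gries([7, 7, 7, 3, 7, 5, 7, 9, 7], num_counters=2)
```

In this example the summary retains row 7 with a count of 5, one below its true count of 6, which stays within the error bound of 9 / 3 = 3.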

How does Feinting work?

In-DRAM mitigations cannot stop normal DRAM execution to refresh victim rows. Instead, they can refresh only a small number of rows (e.g., 2 or 4) periodically (e.g., every few hundred activations). This means that a mitigation has to make decisions based on the information available at that moment. A clever attacker can exploit this by using decoy rows to fool the mitigation into selecting them instead of the victim every time it selects rows to refresh.

We derive the structure of Feinting, which is the best attack in this setting, by proving a series of theorems that describe the optimal distribution of memory accesses, the intensity of these accesses, and how long Feinting should last. Additionally, we present different variations of Feinting to account for optimizations made by some devices, for example, REF/RFM postponing (on DDR5) and subarray-level parallelism.
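The decoy idea can be illustrated with a small toy simulation. This is a simplified model under our own assumptions, not the exact construction from the paper: the mitigation has perfect per-row counters and refreshes exactly one row (the most-hammered one) per TRR event, and the attacker spreads a fixed activation budget evenly over the rows that have not been refreshed yet. The name `simulate_feinting` and its parameters are hypothetical.

```python
def simulate_feinting(num_rows, acts_per_interval):
    """Return the highest hammer count any row reaches before its refresh.

    Toy model: one row is refreshed per TRR event, and the attacker may
    issue `acts_per_interval` activations between two TRR events.
    """
    counts = {row: 0 for row in range(num_rows)}
    live = list(counts)  # rows the mitigation has not refreshed yet
    peak = 0
    while live:
        # Attacker: spread this interval's budget round-robin over live rows.
        for i in range(acts_per_interval):
            counts[live[i % len(live)]] += 1
        # Ideal mitigation: refresh the row with the highest count.
        target = max(live, key=lambda row: counts[row])
        peak = max(peak, counts[target])
        counts[target] = 0
        live.remove(target)
    return peak


single_row = simulate_feinting(num_rows=1, acts_per_interval=12)   # 12
with_decoys = simulate_feinting(num_rows=4, acts_per_interval=12)  # 25
```

In this toy model, hammering a single row reaches only 12 activations before the refresh, whereas three decoys let the last surviving row collect 3 + 4 + 6 + 12 = 25 activations, because every TRR event is spent on a decoy.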

Fig. 1: Different variations of Feinting.

How does ProTRR work?

ProTRR uses ProMG to keep track of victim rows on each activation. On every TRR event, ProTRR refreshes the “V” most hammered rows. ProTRR is flexible because its size depends directly on the bounds given by Feinting: the smaller ProMG is, the stronger Feinting becomes. Thus, a vendor can right-size ProMG depending on how vulnerable their devices are to Rowhammer.
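The interplay of victim counting and TRR events can be sketched as follows. This is a minimal illustration under our own assumptions, not the ProTRR implementation: it uses an unbounded exact counter table in place of ProMG, and it assumes that an aggressor hammers the rows within `blast_radius` on either side. The class `ToyVictimTracker` and all of its parameter names are hypothetical.

```python
from collections import Counter


class ToyVictimTracker:
    """Counts hammering per victim row and refreshes the top V on TRR."""

    def __init__(self, num_rows, blast_radius=1, refreshes_per_trr=2):
        self.num_rows = num_rows
        self.blast_radius = blast_radius            # assumed symmetric reach
        self.refreshes_per_trr = refreshes_per_trr  # "V" in the text
        self.counts = Counter()

    def activate(self, aggressor):
        # Every activation hammers the neighbors of the aggressor row.
        for distance in range(1, self.blast_radius + 1):
            for victim in (aggressor - distance, aggressor + distance):
                if 0 <= victim < self.num_rows:
                    self.counts[victim] += 1

    def trr_event(self):
        # Refresh the V most hammered victims and clear their counters.
        targets = [row for row, _ in
                   self.counts.most_common(self.refreshes_per_trr)]
        for row in targets:
            del self.counts[row]
        return targets


tracker = ToyVictimTracker(num_rows=16)
for _ in range(3):
    tracker.activate(5)   # hammers rows 4 and 6
for _ in range(2):
    tracker.activate(7)   # hammers rows 6 and 8
refreshed = tracker.trr_event()  # rows 6 (count 5) and 4 (count 3)
```

Row 8 is left with a count of 2 after this TRR event and remains a candidate for the next one.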

Fig. 2: Victim counting in ProTRR.

What is the storage required for ProTRR?

Figure 3 shows the link between storage size and the degree of Rowhammer vulnerability that ProTRR can support. This link depends on several factors: the blast diameter (B), i.e., how many rows are hammered by a single aggressor; the number of rows refreshed at every TRR event (V, which is 2 in these figures); and how frequently the refresh command allows for TRR (the TRR distance for DDR4 and DDR5).

Fig. 3: Storage size of ProTRR, per-chip values.

How is ProTRR different from existing mitigations?

We have conducted a thorough security evaluation of existing mitigation schemes, which we summarize in Table 1. We found that most hardware-based mitigations are incompatible with an in-DRAM implementation and often require changes to the memory controller or the operating system. None of the existing mitigations is designed for compatibility with DDR5 (i.e., RFM-compatible). Most severely, there are vulnerabilities in the majority of proposals that attackers could exploit.

Motivated by these shortcomings, we designed ProTRR as a provably secure and flexible in-DRAM mitigation that can easily be deployed in today’s DDR4/5 DRAM devices.

Table 1: Rowhammer mitigations in hardware and software.

For more details about existing work, we refer to §IX of our paper.

How low is the overhead really?

To give a rough idea: we can protect a DDR5 device (which is expected to be even weaker than today’s DDR4 devices), where bits start flipping after only 3,200 accesses, with less than 0.2% performance overhead while increasing the area by 1.78% and the device’s energy consumption by 2.35%. A DRAM manufacturer confirmed that these requirements are practical for real-world deployments.

More information

Our paper ProTRR is available now and will officially be presented at IEEE S&P 2022. You can find a recording of the talk on YouTube.

FAQs

Below, we answer the most frequently asked questions about our work.

  • What does it mean that Feinting is the “best attack”?
    By “best attack” we mean the pattern of row activations that maximizes the number of times a victim can be hammered, assuming an ideal in-DRAM TRR mitigation. Note that this count is reset whenever the victim row is refreshed by either TRR or REF.
  • What do you mean by “formal security guarantees”?
    We prove in our paper that ProTRR can defend against the formally proven best attack, called Feinting. This allows us to mathematically determine the security guarantees provided by ProTRR.
  • What does it mean for a mitigation to be “proactive”?
    It means that we refresh victim rows without using a fixed threshold. This happens by periodically triggering the TRR mechanism, which refreshes a limited number of victim rows each time.
  • Why does it matter that ProTRR is an In-DRAM mitigation?
    There are several reasons: (i) DRAM devices can internally remap rows in a way that is not visible to external mitigations, rendering them ineffective; (ii) indirectly affected parties (e.g., CPU/OS vendors) may not be willing to fix a problem that is not in their products; (iii) there is no command to perform TRR from the memory controller; and (iv) the DDR5 standard introduces the RFM command, which can only be used by an in-DRAM mitigation.
  • What does flexibility mean in the context of ProTRR?
    Flexibility means that DRAM vendors can tune the mitigation based on their available resources (e.g., area, power) and device constraints (e.g., Rowhammer threshold).
  • Why are there so many different types of Feinting?
    We present different types of Feinting to cover several optional features defined by the respective DDRx standard, which may be selectively enabled by the memory controller or DRAM device (e.g., REF postponing/pulling-in, RFM). This also demonstrates that Feinting can easily be adapted to future changes in the DDRx standard.
  • The evaluation shows quite low area requirements but is it still practical for integration inside a DRAM chip?
    Yes. A DRAM vendor confirmed to us that mitigations with more counters than ProTRR needs have already been deployed. See §VII-D of our paper for more details.
  • Is Feinting just a theoretical attack, or does it indeed work on real DRAM devices?
    We have successfully tested Feinting on three real DDR4 devices. For details, see §VII-E in our paper.
  • I guess you could not implement ProTRR on a real device; how did you test your mitigation against existing Rowhammer attacks?
    We implemented ProTRR in the cycle-accurate DRAM simulator DRAMsim3 and used attack pattern traces generated by state-of-the-art Rowhammer fuzzers. We also tested traces of Feinting attacks. None of the traces could trigger any bit flips on a simulated device protected by ProTRR.
  • Will ProTRR be able to protect against possible future Rowhammer effects?
    Given ProTRR’s flexibility, it is easy to extend it to new effects. We have already done so for the half-double pattern (concurrent work), and we discuss how to protect against other potential effects (see §VIII for details).

Acknowledgements

This research was supported by the Swiss National Science Foundation under NCCR Automation, grant agreement 51NF40_180545, and in part by the Netherlands Organisation for Scientific Research through grant NWO 016.Veni.192.262.