Input–output memory management unit

In computing, an input–output memory management unit (IOMMU) is a memory management unit (MMU) connecting a direct-memory-access–capable (DMA-capable) I/O bus to the main memory. Like a traditional MMU, which translates CPU-visible virtual addresses to physical addresses, the IOMMU maps device-visible virtual addresses (also called device addresses or memory mapped I/O addresses in this context) to physical addresses. Some units also provide memory protection from faulty or malicious devices.
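As a rough illustration of the translation and protection described above, the following minimal sketch models an IOMMU as a per-device lookup table. The class name `ToyIOMMU`, the device identifier `"nic0"`, the 4096-byte page size, and all addresses are assumptions made for this example; real IOMMUs use multi-level page tables walked in hardware.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class IOMMUFault(Exception):
    """Raised when a device access has no valid mapping."""


class ToyIOMMU:
    """Hypothetical model: maps device-visible pages to physical pages."""

    def __init__(self):
        # (device_id, device page number) -> (physical page number, writable)
        self.table = {}

    def map_page(self, device_id, device_addr, phys_addr, writable=False):
        if device_addr % PAGE_SIZE or phys_addr % PAGE_SIZE:
            raise ValueError("mappings must be page aligned")
        key = (device_id, device_addr // PAGE_SIZE)
        self.table[key] = (phys_addr // PAGE_SIZE, writable)

    def translate(self, device_id, device_addr, write=False):
        entry = self.table.get((device_id, device_addr // PAGE_SIZE))
        if entry is None:
            # Protection: an unmapped access is rejected, not passed through.
            raise IOMMUFault(f"{device_id}: no mapping for {device_addr:#x}")
        phys_page, writable = entry
        if write and not writable:
            raise IOMMUFault(f"{device_id}: write to read-only {device_addr:#x}")
        # Keep the offset within the page, swap the page number.
        return phys_page * PAGE_SIZE + device_addr % PAGE_SIZE


iommu = ToyIOMMU()
iommu.map_page("nic0", 0x1000, 0x7F3000, writable=True)
print(hex(iommu.translate("nic0", 0x1234)))  # 0x7f3234
```

An access by `"nic0"` to any address outside its single mapped page raises `IOMMUFault` in this model, which corresponds to the protection from faulty or malicious devices mentioned above.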

An example IOMMU is the graphics address remapping table (GART) used by AGP and PCI Express graphics cards on Intel Architecture and AMD computers.


On the x86 architecture, prior to splitting the functionality of northbridge and southbridge between the CPU and Platform Controller Hub (PCH), I/O virtualization was not performed by the CPU but instead by the chipset.[1][2]

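The advantages of having an IOMMU, compared to direct physical addressing of the memory (DMA), include:

Large regions of memory can be allocated without needing to be contiguous in physical memory, because the IOMMU maps contiguous virtual addresses to the underlying fragmented physical addresses. Thus, the use of vectored I/O (scatter-gather lists) can sometimes be avoided.

Devices that do not support memory addresses long enough to address the entire physical memory can still address the entire memory through the IOMMU. For example, an x86 system using Physical Address Extension (PAE) can have more than 4 GB of memory, which a device limited to 32-bit addresses cannot reach directly.

In virtualization, guest operating systems can be given direct access to hardware, because the IOMMU re-maps the device's memory accesses.

In some architectures, the IOMMU also performs hardware interrupt re-mapping, in a manner similar to standard memory address re-mapping.

Peripheral memory paging can be supported by an IOMMU. A peripheral using the PCI-SIG PCIe Address Translation Services (ATS) Page Request Interface (PRI) extension can detect and signal the need for memory manager services.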
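The first advantage noted in this section, a contiguous device-visible range backed by fragmented physical pages, can be sketched as follows. The function names, the 4096-byte page size, and the addresses are hypothetical and chosen only for illustration; this is not any particular operating system's DMA API.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


def map_contiguous(io_page_table, iova_base, phys_pages):
    """Map scattered physical pages to one contiguous device-visible range.

    io_page_table: dict of device page number -> physical page number
    iova_base:     page-aligned start of the device-visible range
    phys_pages:    page-aligned physical addresses, in buffer order
    """
    first_page = iova_base // PAGE_SIZE
    for i, phys in enumerate(phys_pages):
        io_page_table[first_page + i] = phys // PAGE_SIZE
    return iova_base, len(phys_pages) * PAGE_SIZE


def translate(io_page_table, iova):
    """Translate one device-visible address to a physical address."""
    return io_page_table[iova // PAGE_SIZE] * PAGE_SIZE + iova % PAGE_SIZE


# Three physical pages that are not adjacent in memory.
scattered = [0x9A000, 0x13000, 0x542000]
table = {}
base, length = map_contiguous(table, 0x100000, scattered)

# The device is given a single (base, length) pair instead of a
# three-entry scatter-gather list.
print(hex(base), length)                 # 0x100000 12288
print(hex(translate(table, 0x101010)))   # 0x13010
```

In this model the device walks linearly from `base` to `base + length`, while consecutive pages of that range land in unrelated physical locations.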


For system architectures in which port I/O is a distinct address space from the memory address space, an IOMMU is not used when the CPU communicates with devices via I/O ports. In system architectures in which port I/O and memory are mapped into a suitable address space, an IOMMU can translate port I/O accesses.

The disadvantages of having an IOMMU, compared to direct physical addressing of the memory, include:[4]

Some degradation of performance from translation and management overhead (e.g., page table walks).

Consumption of physical memory for the added I/O page (translation) tables. This can be mitigated if the tables can be shared with the processor.

To keep the page tables small, many IOMMUs use the same granularity as memory paging (often 4096 bytes). Each small buffer that needs protection against a DMA attack therefore has to be page-aligned and zeroed before it is made visible to the device. Because of the complexity of OS memory allocation, device drivers need to use bounce buffers for sensitive data structures, which decreases overall performance.
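The page-granularity drawback described in this section can be made concrete with a little arithmetic. The sketch below assumes a 4096-byte page and uses hypothetical helper names; it computes how much memory becomes visible to a device when a small buffer is mapped, and whether a driver would need a separate page-aligned bounce buffer to avoid exposing neighbouring data.

```python
PAGE_SIZE = 4096  # assumed IOMMU mapping granularity


def exposed_range(buf_addr, buf_len):
    """Return the page-aligned (start, end) region exposed by mapping a buffer."""
    start = buf_addr - buf_addr % PAGE_SIZE
    end = (buf_addr + buf_len + PAGE_SIZE - 1) // PAGE_SIZE * PAGE_SIZE
    return start, end


def needs_bounce_buffer(buf_addr, buf_len):
    """True if mapping the buffer would also expose bytes outside it."""
    return exposed_range(buf_addr, buf_len) != (buf_addr, buf_addr + buf_len)


# A 64-byte buffer in the middle of a page shared with other data.
start, end = exposed_range(0x20F40, 64)
print(hex(start), hex(end))               # 0x20000 0x21000
print(end - start - 64)                   # 4032 extra bytes visible
print(needs_bounce_buffer(0x20F40, 64))   # True
print(needs_bounce_buffer(0x30000, 4096)) # False
```

Here the device can reach 4032 bytes that do not belong to the buffer, which is why sensitive data is typically copied into a dedicated, zeroed page before mapping.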

Virtualization

When an operating system is running inside a virtual machine, including systems that use paravirtualization, such as Xen and KVM, it does not usually know the host-physical addresses of memory that it accesses. This makes providing direct access to the computer hardware difficult, because if the guest OS tried to instruct the hardware to perform a direct memory access (DMA) using guest-physical addresses, it would likely corrupt the memory, as the hardware does not know about the mapping between the guest-physical and host-physical addresses for the given virtual machine. The corruption can be avoided if the hypervisor or host OS intervenes in the I/O operation to apply the translations. However, this approach incurs a delay in the I/O operation.


An IOMMU solves this problem by re-mapping the addresses accessed by the hardware according to the same (or a compatible) translation table that is used to map guest-physical addresses to host-physical addresses.[5]
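The following minimal sketch illustrates the idea under simplifying assumptions: a single guest, a 4096-byte page size, and a hypothetical `VmIommuDomain` class holding the guest-physical to host-physical page mapping that the hypervisor would install. All addresses are invented for the example.

```python
PAGE_SIZE = 4096  # assumed page size for this sketch


class VmIommuDomain:
    """Hypothetical per-VM domain: guest-physical page -> host-physical page."""

    def __init__(self, gpa_to_hpa_pages):
        self.pages = dict(gpa_to_hpa_pages)

    def dma_translate(self, guest_phys_addr):
        page = guest_phys_addr // PAGE_SIZE
        if page not in self.pages:
            # The device may only reach memory that belongs to this guest.
            raise PermissionError(f"DMA to unmapped guest address {guest_phys_addr:#x}")
        return self.pages[page] * PAGE_SIZE + guest_phys_addr % PAGE_SIZE


# Mapping installed by the hypervisor for one guest (page numbers).
domain = VmIommuDomain({0x0: 0x400, 0x1: 0x9C0})

# The guest driver programs the device with a guest-physical address.
guest_dma_addr = 0x1080

# Without an IOMMU the device would access host-physical 0x1080,
# which is not the guest's memory. With it, the access is redirected.
print(hex(domain.dma_translate(guest_dma_addr)))  # 0x9c0080
```

Because the lookup happens on every device access, the hypervisor does not need to intercept and rewrite each I/O request in this model.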

AMD has published a specification for IOMMU technology, called AMD-Vi.[6][7]

IBM offered Extended Control Program Support: Virtual Storage Extended (ECPS:VSE) mode[8] on its 43xx line; channel programs used virtual addresses.

Intel has published a specification for IOMMU technology as Virtualization Technology for Directed I/O, abbreviated VT-d.[9]

Information about the Sun IOMMU has been published in the Device Virtual Memory Access (DVMA) section of the Solaris Developer Connection.[10]

The IBM Translation Control Entry (TCE) has been described in a document entitled Logical Partition Security in the IBM eServer pSeries 690.[11]

The PCI-SIG has relevant work under the terms Single Root I/O Virtualization (SR-IOV) and Address Translation Services (ATS). These were formerly covered in distinct specifications, but as of PCI Express 5.0 have been moved to the PCI Express Base Specification.[12]

ARM defines its version of IOMMU as System Memory Management Unit (SMMU)[13] to complement its Virtualization architecture.[14]

Heterogeneous System Architecture (HSA)

List of IOMMU-supporting hardware

Memory-mapped I/O

Memory protection

Bottomley, James (2004-05-01). "Using DMA". Linux Journal (121). Specialized System Consultants. Archived from the original on 2006-07-15. Retrieved 2006-08-09.

Mastering the DMA and IOMMU APIs, Embedded Linux Conference 2014, San Jose, by Laurent Pinchart