
c - Memory Alignment - Stack Overflow


7/27/2019 c - Memory Alignment - Stack Overflow

http://slidepdf.com/reader/full/c-memory-alignment-stack-overflow 1/2

Shishir 

3 Answers

I understand why memory should be aligned to 4 or 8 bytes based on the data width of the bus, but the following statement confuses me:

"IoDrive requires that all I/O performed on a device using O_DIRECT must be 512-byte aligned and a multiple of 512 bytes in size."

Why does the address need to be aligned to 512 bytes?

c   operating-system   memory-alignment

asked Aug 12 '10 at 17:00


Blanket statements blaming DMA for large buffer alignment restrictions are wrong.

Hardware DMA transfers are usually aligned on 4 or 8 byte boundaries, since the PCI bus can physically transfer 32 or 64 bits at a time. Beyond this basic alignment, hardware DMA transfers are designed to work with any address provided.

However, the hardware deals with physical addresses, while the OS deals with virtual memory addresses (which is a protected mode construct in the x86 CPU). This means that a contiguous buffer in process space may not be contiguous in physical RAM. Unless care is taken to create physically contiguous buffers, the DMA transfer needs to be broken up at VM page boundaries (typically 4K, possibly 2M).

As for buffers needing to be aligned to disk sector size, this is completely untrue; the DMA hardware is completely oblivious to the physical sector size on a hard drive.

Under Linux 2.4 O_DIRECT required 4K alignment; under 2.6 it's been relaxed to 512B. In either case, it was probably a design decision to prevent single sector updates from crossing VM page boundaries and therefore requiring split DMA transfers. (An arbitrary 512B buffer has roughly a 1/8 chance of crossing a 4K page boundary: 511 of the 4096 possible starting offsets within a page.)

So, while the OS is to blame rather than the hardware, we can see why page aligned buffers are more efficient.

Edit: Of course, if we're writing large buffers anyway (100KB), then the number of VM page boundaries crossed will be practically the same whether we've aligned to 512B or not. So the main case being optimized by 512B alignment is single sector transfers.

edited Dec 16 '10 at 20:21 answered Dec 16 '10 at 20:08
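
The page-crossing arithmetic in this answer can be checked with a short sketch. The helper names `pages_spanned` and `count_crossing_offsets` are hypothetical, and the 4096-byte page size is an assumption taken from the "typically 4K" figure in the answer; real systems may use other page sizes. Counting every possible byte offset within a page, a 512B buffer straddles a page boundary for 511 of 4096 starting offsets (about 1/8), never straddles one when it starts on a 512B boundary, and a 100KB buffer touches 25 or 26 pages whether or not it is 512B aligned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of pages touched by a buffer of `len` bytes starting at `addr`.
 * Every page after the first is a point where a DMA transfer may have to
 * be split, because adjacent virtual pages need not be adjacent in
 * physical RAM. */
static size_t pages_spanned(uintptr_t addr, size_t len, size_t page_size)
{
    assert(page_size != 0);
    if (len == 0)
        return 0;
    uintptr_t first = addr / page_size;
    uintptr_t last = (addr + len - 1) / page_size;
    return (size_t)(last - first) + 1;
}

/* Count the starting offsets within one page, taken every `step` bytes,
 * at which a buffer of `len` bytes runs over into the next page. */
static size_t count_crossing_offsets(size_t len, size_t page_size, size_t step)
{
    assert(step != 0);
    size_t crossing = 0;
    for (size_t off = 0; off < page_size; off += step) {
        if (pages_spanned(off, len, page_size) > 1)
            crossing++;
    }
    return crossing;
}
```

For example, `count_crossing_offsets(512, 4096, 1)` walks all byte offsets, while passing a `step` of 512 restricts the count to sector-aligned starting points.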


Carl Norum

Usually large alignment requirements like that are due to underlying DMA hardware. Large block transfers can sometimes be made much faster by requiring much stronger alignment restrictions than what you have here.

On several ARM processors, the first level translation table has to be aligned on a 16 KB boundary!

answered Aug 12 '10 at 17:02

How is it made faster by aligning to 512 bytes if data is transferred 4 bytes per cycle? – Shishir Aug 12 '10 at 17:09

@siri, that's the point - it might not be. It might be transferred 8, 16, 32, or even more, like all 512 bytes in a single cycle. DMA hardware can do basically anything - it's all very implementation dependent. – Carl Norum Aug 12 '10 at 17:10

@siri: It is made faster by not having the processor involved in the transmission at all (that is what DMA is all about), but DMA hardware sometimes imposes limits above and beyond those implicit in the architecture itself. – dmckee Aug 12 '10 at 17:10

+1 @dmckee, that's a good explanation. – Carl Norum Aug 12 '10 at 17:12

A much nicer explanation than mine, and you used the magic word "DMA" – Matt Joiner Aug 12 '10 at 17:29

tc.

If you don't know what you're doing, don't use O_DIRECT.

O_DIRECT means "direct device access". This means it bypasses all OS caches, hitting the disk (or possibly RAID controller, etc.) directly. Disk accesses are on a per-sector basis.

EDIT: The alignment requirement is for the IO offset/size; it's not usually a memory-alignment requirement.

EDIT: If you're looking at this page (it appears to be the only hit), it also says that the memory must be page-aligned.

answered Aug 12 '10 at 17:11
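
The answers distinguish between alignment of the I/O offset and size and alignment of the memory buffer itself. As a minimal sketch, assuming a POSIX system and the 512-byte figure from the quoted ioDrive statement, the three conditions could be checked like this. `is_aligned`, `direct_io_args_ok`, and `alloc_io_buffer` are hypothetical helper names; `posix_memalign` is the standard POSIX call. The alignment a particular device or filesystem actually needs may differ, so 512 is only a placeholder.

```c
#define _POSIX_C_SOURCE 200112L
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/types.h>

/* True if `value` is a multiple of `align`. */
static int is_aligned(uint64_t value, uint64_t align)
{
    assert(align != 0);
    return value % align == 0;
}

/* Check the three things an O_DIRECT transfer may be required to align:
 * the buffer address, the file offset, and the transfer size. */
static int direct_io_args_ok(const void *buf, off_t offset, size_t len,
                             size_t align)
{
    return offset >= 0 &&
           is_aligned((uint64_t)(uintptr_t)buf, align) &&
           is_aligned((uint64_t)offset, align) &&
           is_aligned((uint64_t)len, align);
}

/* Allocate `len` bytes whose address is a multiple of `align`.
 * posix_memalign requires `align` to be a power of two and a multiple
 * of sizeof(void *). Returns NULL on failure; release with free(). */
static void *alloc_io_buffer(size_t align, size_t len)
{
    void *buf = NULL;
    if (posix_memalign(&buf, align, len) != 0)
        return NULL;
    return buf;
}
```

A buffer from `alloc_io_buffer(512, 4096)` could then be handed to `pread` or `pwrite` on a descriptor opened with `O_DIRECT`, once `direct_io_args_ok` has confirmed the offset and length as well.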
