Public Report
Report From: Delphi-BCB/Compiler/Delphi/BASM
Report #: 70238
Status: Open
Provide data alignment options
Project: Delphi
Build #: 17555
Version: 15.1
Submitted By: Erik van Bilsen
Report Type: Suggestion / Enhancement Request
Date Reported: 1/3/2009 10:10:07 AM
Severity: Commonly encountered problem
Last Updated: 3/20/2012 2:24:39 AM
Platform: All platforms
Internal Tracking #: 267520
Resolution: None
Resolved in Build: None
Duplicate of: None
Voting and Rating
Overall Rating: (2 Total Ratings)
4.00 out of 5
Total Votes: 23
Description
This is a feature request.

Delphi has supported SIMD instructions since Delphi 6, but still provides no way to align global variables and local (stack) variables.
SIMD instructions require that data is properly aligned for optimum performance. This means that data should be aligned on 8-, 16-, 32- or 64-byte boundaries.
Some SIMD instructions, like MOVDQA, even throw exceptions when data is not properly aligned.

In our projects, we especially miss the following features (in order of importance):

1. Aligning global variables or initialized constants. For speed reasons, we declare some global initialized constants like this:

var
  SomeValue: UInt64 = $1234567812345678;

And then use SIMD to read those values like this:

asm
  movq mm0,[SomeValue]
end;

However, global variables are aligned on 4-byte boundaries, so there is a 50% chance that SomeValue isn't aligned properly for MMX instructions.
Newer SSE instructions may even prefer 16-, 32- or 64-byte alignment, which makes the chance of misalignment even greater.
It would be nice if we could force alignment for specific variables, for example by using a local compiler directive like this:

{$DATAALIGN 32}
var
  SomeValue: UInt64 = $1234567812345678;
{$DATAALIGN DEFAULT}

Using this method, we should also have a way to revert data alignment policy to the default value.
Note that we cannot use {$ALIGN}, since that compiler directive is already reserved for aligning record fields.

Another way to do this may be by introducing a set of new directives like align8, align16, align32 and align64:

var
  SomeValue: UInt64 = $1234567812345678; align32;
  
With the current Delphi versions, I can see a couple of workarounds to make aligning global data work:

a. Allocate all data that needs alignment dynamically. You can write some routines to dynamically allocate memory on specific boundaries.
However, this requires an additional level of indirection in assembly code to access the variable. The example above would look like this when using this approach:

var
  SomeValue: PUInt64;

begin
  GetMemAligned(SomeValue, 8, 32); // Allocate 8 bytes on 32-byte boundary
  SomeValue^ := $1234567812345678; // write through the pointer, not to the pointer itself
  asm
    mov  eax,[SomeValue]
    movq mm0,[eax]
  end;
end;

This method requires extra code (which can reduce speed when used in inner loops) and requires sacrificing a general purpose register (like eax in this example).
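For reference, an allocation routine like the one assumed in workaround (a) could be sketched as follows. This is only a minimal sketch: GetMemAligned and FreeMemAligned are hypothetical helpers, not RTL routines, and their signatures are assumptions chosen to match the call above. The idea is to over-allocate, round the address up to the requested boundary, and remember the raw pointer just in front of the aligned block so it can be freed later.

```pascal
program AlignedAllocDemo;

{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

// Hypothetical helpers (not part of the RTL). Alignment must be a power of two.

procedure GetMemAligned(out P; Size, Alignment: NativeUInt);
var
  Raw: Pointer;
  PtrSize, Mask, Addr, Slot: NativeUInt;
begin
  PtrSize := SizeOf(Pointer);
  Mask := Alignment - 1;
  // Over-allocate: worst-case alignment slack plus room for the raw pointer.
  GetMem(Raw, Size + Mask + PtrSize);
  // Round up to the next boundary, leaving space for the raw pointer in front.
  Addr := (NativeUInt(Raw) + PtrSize + Mask) and not Mask;
  // Store the raw pointer directly before the aligned block.
  Slot := Addr - PtrSize;
  PPointer(Slot)^ := Raw;
  Pointer(P) := Pointer(Addr);
end;

procedure FreeMemAligned(P: Pointer);
var
  Slot: NativeUInt;
  Raw: Pointer;
begin
  if P = nil then
    Exit;
  // Recover the raw pointer that GetMemAligned stored in front of the block.
  Slot := NativeUInt(P) - SizeOf(Pointer);
  Raw := PPointer(Slot)^;
  FreeMem(Raw);
end;

var
  SomeValue: PUInt64;
begin
  GetMemAligned(SomeValue, SizeOf(UInt64), 32); // 8 bytes on a 32-byte boundary
  SomeValue^ := $1234567812345678;
  Assert((NativeUInt(SomeValue) and 31) = 0, 'block is not 32-byte aligned');
  FreeMemAligned(SomeValue);
end.
```

Blocks obtained this way must be released with FreeMemAligned, never with FreeMem directly, because the pointer handed out is not the one the memory manager returned.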

b. Put all global aligned variables into a separate unit, and put this unit as the very first unit in the uses clause of the project file.
This unit should not use any other units. Also, you may need to add some extra dummy variables to make other variables align properly.

This is the method we use, but it also has drawbacks.
First, it decouples the data from the place where it should be.
Second, some third-party tools like EurekaLog or madExcept insert themselves as the very first unit in the project's uses clause, so we have to "fight" for first place.
Finally, the data segments of units are aligned on 4-byte boundaries, so you may need to add some extra "padding" variables for proper data alignment.
Perhaps a separate feature request would be to align the data segments of units at 64-byte boundaries. This would perhaps be the "easiest" way to add some data alignment functionality to Delphi.
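To illustrate workaround (b), such a unit might look roughly like the sketch below. The unit name, the dummy variable and the startup check are all assumptions made for illustration. Whether a dummy variable is needed at all, and how many, depends on the layout the compiler and linker actually produce, so the sketch verifies the address at startup instead of trusting the layout.

```pascal
unit AlignedGlobals; // hypothetical unit name; list it first in the project's uses clause

{$ASSERTIONS ON}

interface

var
  // Hypothetical dummy variable. Add or remove such variables until the
  // check in the initialization section passes for your build.
  Padding0: Cardinal = 0;
  SomeValue: UInt64 = $1234567812345678;

implementation

initialization
  // Fail early at startup if the layout assumption does not hold.
  Assert((NativeUInt(@SomeValue) and 7) = 0, 'SomeValue is not 8-byte aligned');

end.
```

The check matters because any change to the unit, or to the units linked before it, can silently shift the addresses.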

c. Use an external assembler like NASM to create object files with aligned data. Unfortunately, Delphi doesn't respect the data alignment settings in the generated OBJ files, so this method doesn't work (see report 70237).

2. It would also be preferable if we could force alignment of local variables in the same way as global variables. For example:

procedure TFoo.Bar;
var
  Vector: array [0..255] of Byte; align64;
begin
  DoSomethingWith(Vector);
end;

This is especially useful if the local data is passed to a routine that uses SIMD and requires aligned data. For example, a lot of APIs in the Intel Performance Primitives library have this requirement.
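Until something like this exists, a local buffer can be aligned by hand. The following is a minimal sketch under stated assumptions: TVector, the constants and the body of DoSomethingWith are stand-ins invented for illustration. The local array is over-allocated by Alignment - 1 bytes and an aligned pointer is computed into it, which again costs an extra pointer and some wasted stack space.

```pascal
program LocalAlignDemo;

{$APPTYPE CONSOLE}
{$ASSERTIONS ON}

const
  VectorSize = 256;
  Alignment = 64; // must be a power of two

type
  TVector = array [0..VectorSize - 1] of Byte;
  PVector = ^TVector;

// Stand-in for the report's DoSomethingWith; a real routine would use SIMD here.
procedure DoSomethingWith(var Vector: TVector);
begin
  FillChar(Vector, SizeOf(Vector), 0);
end;

procedure Bar;
var
  // VectorSize + Alignment - 1 bytes, so an aligned TVector always fits inside.
  Storage: array [0..VectorSize + Alignment - 2] of Byte;
  Mask, Addr: NativeUInt;
  Vector: PVector;
begin
  Mask := Alignment - 1;
  // Round the start of the raw storage up to the next 64-byte boundary.
  Addr := (NativeUInt(@Storage[0]) + Mask) and not Mask;
  Assert((Addr and Mask) = 0, 'Vector is not 64-byte aligned');
  Vector := PVector(Addr);
  DoSomethingWith(Vector^);
end;

begin
  Bar;
end.
```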

3. It would be nice (although not as important as examples 1 and 2) to align data of class fields too:

type
  TFoo = class
    FUnalignedData: Integer;
    FAlignedData: UInt64; align16;
  end;

Of course, this type of alignment would put some extra requirements on the way objects of this type are instantiated.
A TFoo object must be instantiated in such a way that its FAlignedData field lies on a 16-byte boundary.

4. Finally, sometimes assembly code itself runs faster if it is aligned:

procedure TFoo.Bar;
begin
  SomeCode;

  {$CODEALIGN 64}
  for I := 0 to X do ...
end;

In this example, the for-loop starts on a 64-byte boundary. Any space between SomeCode and the for-loop should be filled with NOPs.

Also see report 2755 for another outstanding request regarding data alignment.
Steps to Reproduce:
None
Workarounds
None
Attachment
None
Comments

Dennis Christensen at 2/27/2009 1:28:12 AM -
FastMM4 has an Align16Bytes option. Use it to force all dynamically allocated blocks to be 16-byte aligned.

From FastMM4Options.inc

{Enable this define to align all blocks on 16 byte boundaries so aligned SSE
instructions can be used safely. If this option is disabled then some of the
smallest block sizes will be 8-byte aligned instead which may result in a
reduction in memory usage. Medium and large blocks are always 16-byte aligned
irrespective of this setting.}
{.$define Align16Bytes}

This option is off by default.

Regards
Dennis Christensen

Erik van Bilsen at 3/2/2009 1:18:16 PM -
Aligning dynamically allocated memory is not the issue here. That can easily be done with FastMM or some custom memory allocation routines. And we use that a lot already.

My request is for data alignment of global variables and local (stack) variables, so SIMD instructions can access them without an additional level of indirection.

Dennis Christensen at 2/27/2009 1:50:24 AM -
Also see reports

http://qc.embarcadero.com/wc/qcmain.aspx?d=14769
http://qc.embarcadero.com/wc/qcmain.aspx?d=70204
http://qc.embarcadero.com/wc/qcmain.aspx?d=47636
http://qc.embarcadero.com/wc/qcmain.aspx?d=7560
http://qc.embarcadero.com/wc/qcmain.aspx?d=7518
http://qc.embarcadero.com/wc/qcmain.aspx?d=6610
http://qc.embarcadero.com/wc/qcmain.aspx?d=4530
http://qc.embarcadero.com/wc/qcmain.aspx?d=3071
http://qc.embarcadero.com/wc/qcmain.aspx?d=2755
http://qc.embarcadero.com/wc/qcmain.aspx?d=1116

Regards
Dennis

Erik van Bilsen at 4/21/2011 2:21:54 PM -
This report has been open for a while now, but there is still no solution.
PLEASE fix this. Even FreePascal provides advanced code and (stack) data alignment options.
Performance on Pentium 4 and later CPUs really suffers without this, and it is a necessity to make some SSE2/SSE3 code work at all.

Tomohiro Takahashi at 4/21/2011 6:05:49 PM -
Thanks for the notification. I updated the internal status of this report.

Jack Tam at 9/6/2011 12:17:19 PM -
Still not supported in Delphi/BCB XE2.
EMB, I'm just wondering what you are doing.
You cannot say Delphi/BCB supports MMX/SSE/SSE2/etc. if you do NOT support all their requirements.

Tomohiro Takahashi at 9/6/2011 6:16:13 PM -
Unfortunately, your request is not implemented yet in XE2.
A compiler engineer says that it is possible but difficult to implement it using a {$DATAALIGN} directive. It requires some linker improvements, too.

Erik van Bilsen at 10/28/2011 9:56:21 AM -
I saw in QC#100525 that there is a new "align" directive for records, as in:

var
  R: record
    X: Integer;
  end align 16;

Is this new in XE2? Does it only work on records or also on other data types? For example, this does not work:

var A: Integer align 16;

Does it work on OSX and Win64 too?

I cannot find any documentation on this directive. I would like to learn more about its uses and limitations.

Tomohiro Takahashi at 10/30/2011 10:04:18 PM -
The align directive is still undocumented and experimental.
We advise you not to use it.

Jack Tam at 9/7/2011 1:12:38 PM -
We could just handle alignment the way MS VC does. We only want this support because of SIMD. Without alignment support, we cannot make full use of SIMD without bugs or crashes.

MS VC does this with __declspec(align(16)).

So for BCB, we can just do it like MS VC. For Delphi, we can use new keywords or something else.

If Delphi/BCB doesn't support data alignment for static variables, you cannot say BCB/Delphi fully supports SIMD such as MMX, SSE, SSE2, etc. (Actually, you did: EMB has said that Delphi/BCB has full support for SIMD.)

Nikolaj Henrichsen at 1/26/2012 3:12:23 PM -
Please - please - please, implement data and code alignment for asm ASAP.

Delphi is supposed to produce blazingly fast applications, right?

Wrong - without code/data alignment we are missing out big on the speeds possible with SIMD instructions.

Erik van Bilsen at 8/16/2012 11:59:07 AM -
This report was accepted and opened more than 3 1/2 years ago. Any progress?

Erik van Bilsen at 7/8/2013 12:56:31 PM -
Another year has passed. PLEASE look into this.
Most other professional native code compilers have options for this.
