QualityCentral
Public Report
Report From: Delphi-BCB/RTL/Delphi/ConvUtils
Report #: 100685
Status: Open
Ord and Chr got out of sync due to Unicode switch
Project: Delphi
Build #: XE2
Version: 16.0
Submitted By: Dmitry Burov
Report Type: Feature Specification issue
Date Reported: 11/3/2011 6:05:58 AM
Severity: Infrequently encountered problem
Last Updated: 3/20/2012 2:24:39 AM
Platform: All versions
Internal Tracking #: 288670
Resolution: None
Resolved in Build: None
Duplicate of: None
Voting and Rating
Overall Rating: No Ratings Yet (0.00 out of 5)
Total Votes: 10
Description
Actually XE2 Update 1. The build list does not offer a single update, even though Update 2 has been released.
In standard Pascal, Chr and Ord were considered mutually inverse functions.
Chr(Ord(x)) = x and Ord(Chr(i)) = i: either both hold, or they break only because of out-of-range values.
http://writerguy.users.btopenworld.com/Pascal/pascal3.html
With Chr now returning WideChar and having no overload that returns AnsiChar, this is broken!
This obviously means that char constants and char variables are of DIFFERENT types !!!
Maybe that is also true for string constants and string variables.
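The round-trip property described above can be checked mechanically when only variables of one type are involved. The following is a minimal sketch, not part of the original report; the program and function names are made up for illustration. It assumes a Unicode Delphi (2009 or later), where Char is WideChar and Chr applied to an Integer variable yields a WideChar.

```pascal
program RoundTripSketch;
{$APPTYPE CONSOLE}

// Hypothetical helper: returns True when Ord(Chr(I)) = I
// for every 16-bit code unit value.
function RoundTripHolds: Boolean;
var
  I: Integer;
  C: Char; // Char is an alias for WideChar in Unicode Delphi
begin
  Result := True;
  for I := 0 to $FFFF do
  begin
    C := Chr(I);
    if Ord(C) <> I then
    begin
      Result := False;
      Exit;
    end;
  end;
end;

begin
  // Prints the Boolean result of the check.
  Writeln(RoundTripHolds);
end.
```

If this check passes, the inverse relationship still holds between Char variables, which would point at the typing of character literals rather than at Chr and Ord themselves.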
Related bugs:
http://qc.embarcadero.com/wc/qcmain.aspx?d=100687
http://qc.embarcadero.com/wc/qcmain.aspx?d=100688
Below, some letters have been changed to question marks.
To see the Cyrillic glyphs and their order, please open http://en.wikipedia.org/wiki/Cyrillics
The "Letters" section has a table, "The early Cyrillic alphabet".
Its first 5 letters have not changed to this day, and they are exactly the ones used in the example code.
Steps to Reproduce:
'?' here stands for a Cyrillic letter in the Windows-1251 codepage. The IDE used is XE2 Upd1 on Windows 7 with the Russian language; the .pas source files are in "ANSI" file format.
А and АБВГД: let me specify them directly in HTML, since the QC web front end fails at Unicode :-(
Put 8 buttons on a form.
They should list the first 5 letters of the Russian alphabet: ?????
Instead they list central-European special modifications of Latin A: A with diacritics.
This proves that Chr(Ord(Cyrillic "A")) == Latin "A" with diacritics <> original Cyrillic "A"
--------------
procedure TForm2.btn1Click(Sender: TObject);
var i: Byte;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + Chr(Ord('?') + i));
end;

procedure TForm2.btn2Click(Sender: TObject);
var i: integer;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + Chr(Ord('?') + i));
end;

procedure TForm2.btn4Click(Sender: TObject);
var i: Byte;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + Chr(Ord(AnsiChar('?')) + i));
end;

procedure TForm2.btn5Click(Sender: TObject);
var i: byte;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + Chr(Byte((Ord('?')) + i)));
end;

procedure TForm2.btn6Click(Sender: TObject);
var i: byte;
begin
  for i := ord('?') to Ord('?') do
    showmessage( IntToStr(i) + ' => ' + Chr(i));
end;
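The behaviour reported in these steps is consistent with Ord yielding the Windows-1251 byte of the letter while Chr interprets that number as a UTF-16 code unit. The sketch below illustrates that reading; it is not taken from the report. It assumes that U+0410 is the first Cyrillic capital letter, that code page 1251 is installed on the machine, and it uses TEncoding from SysUtils. The program and procedure names are hypothetical.

```pascal
program AnsiOrdSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

procedure Demo;
var
  Enc: TEncoding;
  S: string;
  Bytes: TBytes;
  B: Byte;
  W: Char;
begin
  S := #$0410; // Cyrillic capital A as a Unicode string (assumed code point)

  // Encode the letter to Windows-1251, the "ANSI" code page from the report.
  Enc := TEncoding.GetEncoding(1251);
  try
    Bytes := Enc.GetBytes(S);
  finally
    Enc.Free;
  end;

  B := Bytes[0]; // the single-byte code of the letter in code page 1251
  W := Chr(B);   // Chr treats the number as a UTF-16 code unit

  Writeln(Format('cp1251 byte: $%.2x', [B]));
  Writeln(Format('Chr(byte):   U+%.4x', [Ord(W)]));
  Writeln(Format('original:    U+%.4x', [Ord(S[1])]));
end;

begin
  Demo;
end.
```

When the second and third lines differ, the Chr call has landed in the Latin-1 range instead of the Cyrillic block, which matches the "A with diacritics" output described above.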
Workarounds
See above about QC deleting the important Cyrillic letters here.
(*** We are hackers. Let us ignore the Pascal standard and force a typecast of blittable types ***)
procedure TForm2.btn7Click(Sender: TObject);
var i: byte;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + AnsiChar(Byte((Ord('?')) + i)) ) ;
end;

(**** forcing the compiler to acknowledge that char constants and char variables should belong to the same type ***)
procedure TForm2.btn8Click(Sender: TObject);
var i: byte;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + Chr(Ord(WideChar('?')) + i));
end;

(*** same as above, less obvious but also less boilerplate ****)
procedure TForm2.btn3Click(Sender: TObject);
var i: Byte;
begin
  for i := 0 to 5 {33} do
    showmessage( IntToStr(i) + ' => ' + Chr($410 + i));
end;
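The last workaround does its arithmetic directly on Unicode code points. A minimal sketch of the same idea with a typed constant follows, so that no character literal from the source file is involved. The names CyrillicCapitalA and NthCyrillicCapital are hypothetical, and the sketch assumes that the letters of interest occupy consecutive code points starting at U+0410.

```pascal
program CodePointSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

const
  // Typed constant, so its type is Char (WideChar) regardless of how
  // the compiler types untyped character literals.
  CyrillicCapitalA: Char = #$0410;

// Hypothetical helper: the letter Index positions after U+0410.
function NthCyrillicCapital(Index: Integer): Char;
begin
  Result := Chr(Ord(CyrillicCapitalA) + Index);
end;

procedure Demo;
var
  I: Integer;
begin
  for I := 0 to 4 do
    Writeln(Format('%d => U+%.4x', [I, Ord(NthCyrillicCapital(I))]));
end;

begin
  Demo;
end.
```

Printing code points instead of glyphs keeps the output independent of the console code page.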
Attachment
QC100685 - Unicode_Chr.zip
Comments
Dmitry Burov at 11/3/2011 6:06:58 AM
-
This problem with Apple cross-compiling may be related.
https://forums.embarcadero.com/thread.jspa?messageID=406229&tstart=0#406229
Dmitry Burov at 11/3/2011 6:09:57 AM
-
Changing the source file format to UCS-2 or UTF-8 does not change a thing.
The compiler still treats constants and variables as different types!
Dmitry Burov at 11/3/2011 6:22:52 AM
-
I wanted to disable Unicode, try plain ANSI, and see whether the bug goes away.
Alas, I could not do it.
ms-help://embarcadero.rs_xe2/rad/Unicode_in_RAD_Studio.html
This says Unicode can be disabled:
"by default, the type string is now a Unicode string " - which implies there is a non-default mode.
https://forums.embarcadero.com/message.jspa?messageID=15778
This says Unicode cannot be disabled.
Dmitry Burov at 11/3/2011 6:32:10 AM
-
Related bugs:
http://qc.embarcadero.com/wc/qcmain.aspx?d=100687
http://qc.embarcadero.com/wc/qcmain.aspx?d=100688
Dmitry Burov at 11/3/2011 7:50:30 AM
-
(**** for Char in set - failure ***)
procedure TForm2.btn10Click(Sender: TObject);
var c: Char;
begin
  for c in ['?'..'?'] do showmessage(c);
end;

(**** for AnsiChar in 'WideSet': no Unicode - no problem ***)
procedure TForm2.btn13Click(Sender: TObject);
var c: AnsiChar;
begin
  for c in [WideChar('?')..WideChar('?')] do showmessage(c);
end;

(* for Char in 'WideSet' - unexpected failure.
   Maybe the WideChars were narrowed down to AnsiChar before
   the set was created? But then there SHOULD be a compiler warning !!!
   I can see no other explanation *)
procedure TForm2.btn12Click(Sender: TObject);
var c: Char;
begin
  for c in [WideChar('?')..WideChar('?')] do showmessage(c);
end;

(******** for Char in string - sudden success *****)
(**** maybe due to string.CodePage, which Char lacks,
      or maybe because string == UnicodeString ***)
procedure TForm2.btn9Click(Sender: TObject);
var c: Char;
begin
  for c in '??????' do showmessage(c);
end;

(************ for AnsiChar in AnsiString: same here ******)
procedure TForm2.btn11Click(Sender: TObject);
var c: AnsiChar;
begin
  for c in AnsiString('??????') do showmessage(c);
end;
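Delphi sets are limited to 256 elements, so a set cannot hold WideChar values above #$00FF; that limit may be what the set-based snippets above run into. A plain ordinal for loop over Char does not go through a set at all. The sketch below shows that alternative; the procedure name is hypothetical and the bounds U+0410..U+0414 are assumed to be the five letters from the example.

```pascal
program WideRangeSketch;
{$APPTYPE CONSOLE}

uses
  SysUtils;

// Hypothetical helper: walks a WideChar range with an ordinal for loop
// instead of a set constructor.
procedure ListRange;
var
  C: Char;
begin
  for C := #$0410 to #$0414 do
    Writeln(Format('U+%.4x', [Ord(C)]));
end;

begin
  ListRange;
end.
```

This trades the for-in syntax for a range that the compiler does not need to narrow to single bytes.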
Tomohiro Takahashi at 11/6/2011 8:21:26 PM
-
> [Workaround]
> ...
> showmessage( IntToStr(i) + ' => ' + AnsiChar(Byte((Ord('?')) + i)) ) ;
Is your issue related to this article?
[Delphi in a Unicode World Part III: Unicodifying Your Code - Use of the Chr Function]
http://edn.embarcadero.com/article/38693
Dmitry Burov at 11/6/2011 11:58:35 PM
-
In some distant sense, it is.
> Certain uses of the Chr function may result in the following error:
> [DCC Error] PasParser.pas(169): E2010 Incompatible types: 'AnsiChar' and 'Char'
I never met it. And in the snippets above, I would much rather see this error than get incorrect code generated.
> Can be changed to MyChar := AnsiChar(i);
Obvious to me, though maybe not to others.
But I have a background in assembler and in C (in C, Byte and Char are one and the same type, not 2 distinct types as in Pascal/Delphi).
Chr and Ord were required in original Pascal for typecasting, which the language itself lacked.
Since Turbo Pascal introduced manual type overriding, many built-in functions became alternative ways to achieve the same typecasts.
Also, standard library functions should be more portable (platform-independent) than typecasts (which are an intentional break of compiler-enforced type safety). They are expected to work even when a direct typecast fails and has to be split into separate, platform-specific $IfDef'ed code paths.
I think, overall, this article would be a good find in the offline help, linked to from the char-related types and functions.
This, however, does not mean that Chr(Ord(char const)) should stop working.
From what I observe, I believe the main reason for the failure is that the compiler treats char literal constants like "?" as AnsiChar rather than WideChar. Hence, it has no direct relation to the quotes above.
This is in sharp contrast with types like PChar, Char and string, and with functions like Chr: all of those are WideChar-based rather than AnsiChar-based. This split is very counterintuitive and leads to unexpected results.
The article also has a "Using Character Literals" section.
> if Edit1.Text[1] = #128 then
> will evaluate to False in Delphi 2009 because ... #128 is ... a control character in Unicode
> if Edit1.Text[1] = '€' then
> will ... also work (i.e., recognize the Euro) in Delphi 2009 (where '€' is #$20AC)
This implies that #128 and '€' are treated as WideChar in if-conditions:
not as AnsiChar implicitly cast to WideChar (if so, the 1st if would still work),
nor with '€' being AnsiChar (because the article says not to depend on the codepage, and because of the above).
If so, then the contrast with Ord('?') is even more mind-breaking.
Treating the same literal so differently in different contexts just seems to make no sense.
PS: I see Unicode characters are broken in this comment again.
QC is incapable of handling Unicode-related tickets, two years after Unicode support became a selling point for Delphi. Pity. It makes communication much harder than it should be :-(
Dmitry Burov at 11/7/2011 12:22:40 AM
-
Added a few more tests to the sample app.
Direct string/char comparison works.
But as soon as Ord gets in the way, it breaks.
[Window Title] Project1
[Content] Ord values: 1041 == 193
[OK]
This math is admirable.
procedure TForm2.btn14Click(Sender: TObject);
begin
  if lbl1.Caption[Length(lbl1.Caption) - 3] = '?'
    then ShowMessage('Correct') (** <--- here **)
    else ShowMessage ('Broken');

  if '?' = lbl1.Caption[Length(lbl1.Caption) - 3]
    then ShowMessage('Correct') (** <--- here **)
    else ShowMessage ('Broken');

  if Ord(lbl1.Caption[Length(lbl1.Caption) - 3]) = Ord('?')
    then ShowMessage('Correct')
    else ShowMessage ('Broken'); (** <--- there **)

  if Chr(Ord(lbl1.Caption[Length(lbl1.Caption) - 3])) = '?'
    then ShowMessage('Correct') (** <--- here **)
    else ShowMessage ('Broken');

  if lbl1.Caption[Length(lbl1.Caption) - 3] = Chr(Ord('?'))
    then ShowMessage('Correct')
    else ShowMessage ('Broken'); (** <--- there **)

  if lbl1.Caption[Length(lbl1.Caption) - 3] = Char(Ord('?'))
    then ShowMessage('Correct')
    else ShowMessage ('Broken'); (** <--- there **)

  {expected} ShowMessage ( lbl1.Caption[Length(lbl1.Caption) - 3] + #13#10 + '?' );
  {expected} ShowMessage ( '?' );
  {unexpected} ShowMessageFmt(' Ord values: %d == %d', [Ord(lbl1.Caption[Length(lbl1.Caption) - 3]) , Ord('?')]);
end;
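The two numbers in the "Ord values: 1041 == 193" message are 1041 = $0411 and 193 = $C1, which look like the Unicode code point and the Windows-1251 byte of one and the same letter. The sketch below decodes the byte through an AnsiString tagged with code page 1251 to test that reading. It is an illustration under assumptions, not the reporter's code: Win1251String and DecodeCp1251 are hypothetical names, and it relies on the AnsiString(CodePage) type syntax of Delphi 2009 and later.

```pascal
program CodePageDecodeSketch;
{$APPTYPE CONSOLE}

type
  // Hypothetical helper type: an AnsiString tagged with code page 1251.
  Win1251String = type AnsiString(1251);

// Hypothetical helper: decodes one Windows-1251 byte and returns
// the UTF-16 code unit it maps to.
function DecodeCp1251(B: Byte): Integer;
var
  A: Win1251String;
  U: UnicodeString;
begin
  SetLength(A, 1);
  A[1] := AnsiChar(B);   // raw byte, no conversion
  U := UnicodeString(A); // conversion uses the string's code page
  Result := Ord(U[1]);
end;

begin
  // Per the report's message box, 193 is expected to decode to 1041.
  Writeln(DecodeCp1251(193));
end.
```

Comparing Ord values only makes sense after both sides have been brought into the same encoding, as this helper does for the ANSI side.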