Public Report
Report From: Delphi for PHP/IDE/Code Editor
Report #: 87831
Status: Reported
Working with encodings
Project: HTML5 Builder
Build #: 3.0.0.1319
Version: 3.0
Submitted By: Alexander Pravdin
Report Type: Issue
Date Reported: 9/6/2010 4:07:10 PM
Severity: Extreme corner case
Last Updated: 1/22/2011 4:42:46 AM
Platform: All platforms
Internal Tracking #:
Resolution: None
Resolved in Build: None
Duplicate of: None
Voting and Rating
Overall Rating: No Ratings Yet
0.00 out of 5
Total Votes: None
Description
It would be very handy if RadPHP had this feature: a "Default encoding" option (distinct from "New file encoding") that applies a specified encoding to files whose encoding cannot be detected automatically. For example, I have UTF-8 files, and if they have no UTF-8 signature (BOM) at the beginning, I see invalid symbols instead of national characters in the Editor. I cannot add the signature to these files, because it would then appear in the PHP output, which is not acceptable in my case. Dreamweaver works fine in this case. Information about which encoding to use for which files could be stored in the project file.
Steps to Reproduce:
None
Workarounds
None
Attachment
None
Comments

Tomohiro Takahashi at 9/7/2010 5:53:12 PM -
I changed the Version field from 2.1 to 3.0 as Sysop, since 3.0 is now available.

Oto BREZINA at 9/23/2010 2:27:08 AM -
related to http://qc.embarcadero.com/wc/qcmain.aspx?d=45624

Alexander Pravdin at 10/5/2010 4:23:22 PM -
This issue is the only thing keeping me from porting my projects to RadPHP: I cannot work with files that are encoded in UTF-8 but have no UTF-8 BOM. All my other editors work fine: Eclipse, Dreamweaver and even Bred.

Tomohiro Takahashi at 10/6/2010 6:01:48 PM -
RadPHP uses a zero-byte unit1.bom file to indicate that unit1.php is encoded in UTF-8 (without BOM).
So, if you just have an xxx.php file encoded in UTF-8, please create a zero-byte xxx.bom file in the same folder.
When the IDE creates a new unit encoded in UTF-8, it generates the xxx.php and xxx.bom files automatically.
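
The workaround above can be scripted for an existing file. The following is only a minimal sketch: `createBomSidecar` is a hypothetical helper name, it assumes the mbstring extension is available, and it relies on nothing about RadPHP beyond the zero-byte .bom convention described in this comment.

```php
<?php
// Hypothetical helper (not part of RadPHP): create an empty "<name>.bom"
// marker next to a .php file that is valid UTF-8 and has no UTF-8 BOM.
function createBomSidecar(string $phpFile): bool
{
    $bytes = file_get_contents($phpFile);
    if ($bytes === false) {
        return false; // file could not be read
    }
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return false; // file already starts with a BOM, no marker created
    }
    if (!mb_check_encoding($bytes, 'UTF-8')) {
        return false; // not valid UTF-8, so do not mark it as UTF-8
    }
    // unit1.php -> unit1.bom, in the same folder
    $sidecar = preg_replace('/\.php$/i', '.bom', $phpFile);
    if ($sidecar === null || $sidecar === $phpFile) {
        return false; // not a .php file name
    }
    return touch($sidecar); // creates a zero-byte file if none exists
}
```

Calling `createBomSidecar('unit1.php')` on a BOM-less UTF-8 file should leave an empty unit1.bom beside it. Pure ASCII files also pass the UTF-8 check, since ASCII is a subset of UTF-8.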

Alexander Pravdin at 12/11/2010 7:51:32 PM -
Also, when working with XML files (with xml, xsl, xslt extensions) whose encoding is declared in the XML itself (<?xml version="1.0" encoding="utf-8"?>), I think no additional .bom files should be needed to work with UTF-8 successfully.

Tomohiro Takahashi at 12/12/2010 4:37:52 PM -
No. As you know, it is impossible to reliably detect the format (and character set) of a file without a BOM, except for an XML file that contains the encoding declaration.

Alexander Pravdin at 1/22/2011 6:00:07 PM -
And! :) RadPHP has a bug: it inserts a BOM at the beginning of a file even when that file has a .bom file!
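
Until that is fixed, an unwanted BOM can be removed outside the IDE. This is a small sketch, not a RadPHP feature; `stripUtf8Bom` is a hypothetical name.

```php
<?php
// Return the given bytes without a leading UTF-8 BOM (EF BB BF), if present.
function stripUtf8Bom(string $bytes): string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return substr($bytes, 3);
    }
    return $bytes; // no BOM, leave the content untouched
}
```

A file could be cleaned with `file_put_contents($f, stripUtf8Bom(file_get_contents($f)))`, assuming the file is readable and writable.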

Alexander Pravdin at 1/21/2011 10:14:40 PM -
You are right, but even when an XML file contains a declaration with utf-8 encoding, RadPHP does not interpret the file as UTF-8 without BOM.

In practice, files are often *expected* to be in one preferred charset. So an option in the program settings to configure how to interpret files whose encoding cannot be detected automatically would be very useful.
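
The requested behaviour could look roughly like the sketch below: use a BOM if there is one, then an XML encoding declaration, and otherwise fall back to a configured default. Everything here is an assumption for illustration; `resolveEncoding` and its `$defaultEncoding` parameter are hypothetical and do not describe how RadPHP works.

```php
<?php
// Hypothetical resolver: BOM first, then an XML encoding declaration,
// then the user-configured default encoding.
function resolveEncoding(string $bytes, string $defaultEncoding): string
{
    if (strncmp($bytes, "\xEF\xBB\xBF", 3) === 0) {
        return 'UTF-8'; // explicit UTF-8 signature
    }
    // Look for encoding="..." inside an XML declaration at the very start.
    $pattern = '/^<\?xml\s[^>]*?\bencoding\s*=\s*(["\'])([A-Za-z][A-Za-z0-9._-]*)\1/';
    if (preg_match($pattern, $bytes, $m) === 1) {
        return strtoupper($m[2]); // declared encoding name, upper-cased
    }
    return $defaultEncoding; // nothing detected: use the configured default
}
```

With a default of 'UTF-8', a BOM-less PHP file would be opened as UTF-8, which is the case described in this report.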

Alexander Pravdin at 12/11/2010 7:00:21 PM -
It's strange, but the first time I tried RadPHP the IDE did not generate .bom files, and placed the BOM signature at the beginning of the unit file... Anyway, it was not intuitive, but it should be.

Tomohiro Takahashi at 12/12/2010 4:31:54 PM -
> It's strange, but the first time I tried RadPHP the IDE did not generate .bom files, ...
Please change the IDE options as below and create a new Application.
[Tools]|[Options]|[Editor Options]|[Default settings for new files]
- Text Encoding: UTF-8
- Text Format: Default
- Character Set: Default

Alexander Pravdin at 12/11/2010 6:54:20 PM -
Yes, it works.
But other editors can successfully determine UTF-8 encoding without a BOM or additional files. I hope you add similar functionality.

Tomohiro Takahashi at 12/12/2010 4:37:02 PM -
> But other editors can successfully determine UTF-8 encoding without a BOM or additional files.
I guess those editors try to detect the encoding automatically. But, as you know, generally speaking, it is impossible to reliably detect the format of a file without a BOM. It is also impossible to determine with certainty the character set (encoding) of a file that has no BOM.
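
One common heuristic, which may or may not be what those editors use, is to check whether the bytes form valid UTF-8. The sketch below assumes the mbstring extension; `looksLikeUtf8` is a hypothetical name. Consistent with the point above, it is a guess and not a proof.

```php
<?php
// Heuristic only: bytes that contain at least one non-ASCII byte and still
// form a valid UTF-8 sequence are very likely UTF-8, but this is not certain.
function looksLikeUtf8(string $bytes): bool
{
    // Pure ASCII gives no evidence either way, so report false for it.
    $hasNonAscii = preg_match('/[\x80-\xFF]/', $bytes) === 1;
    return $hasNonAscii && mb_check_encoding($bytes, 'UTF-8');
}
```

Text in a single-byte national encoding such as Windows-1251 usually fails this check, because its byte sequences rarely happen to form valid UTF-8.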

mhmd tarboosh at 8/30/2012 1:55:17 AM -
Hi all, here is the solution for Unicode. REMEMBER to OPEN the table or the query after this code.

In OnCreate or OnShow:

// Switch the database connection to UTF-8 before any dataset is opened.
$this->dbadn_db1->setCharset("utf8");
$this->dbadn_db1->execute("SET character_set_connection = utf8");
$this->dbadn_db1->execute("SET character_set_client = utf8");
$this->dbadn_db1->execute("SET character_set_results = utf8");
$this->dbadn_db1->execute("SET collation_connection = utf8_general_ci");
$this->dbadn_db1->execute("SET collation_database = utf8_general_ci");

// Open the table only after the settings above have been applied.
$this->tbnews1->open();
