
| Subject: | Migrating crunchy string code to D2009 ?? |
| Posted by: | Andrew Fiddian-Green (…@bb.cc) |
| Date: | Fri, 28 Nov 2008 |
Hi,
The following is a challenge to chanters of the mantra "just recompile
and it runs", marketing people, and other assorted optimists...
Below is a function written in D2007. It is part of a parser for
extracting tag text information from MPEG-4 music files. It does in fact
compile OK in D2009 as it is, and if I change the function result type
from widestring to string, it still compiles. However, I am pretty sure
that the result is not what one would expect compared to D2007.
How would one correctly migrate this function to D2009 Unicode?
Note the input argument aData: TMp4ByteArray is an array of bytes.
Regards
AndrewFG
+++
function TMp4DataBox.ExtractUnicodeStr(const aData: TMp4ByteArray):
  widestring;
var
  utf: Utf8String;
  i: integer;
  bom: word;
const
  bomSize = SizeOf(bom);
  wcharSize = SizeOf(widechar);
begin
  // check for a byte order mark
  if Length(aData) > SizeOf(bom) then
  begin
    // get the byte order mark
    System.Move(aData[0], bom, bomSize);
    // check for big endian UTF-16 string
    if bom = $fffe then
    begin
      // get the string
      SetString(Result, PWideChar(@aData[bomSize]),
        (Length(aData) - bomSize) div wcharSize);
      for i := 1 to Length(Result) do
      begin
        // it is big endian so reverse the byte order
        Result[i] := WideChar(SwapBytes16(word(Result[i])));
      end;
      exit;
    end;
    // check for little endian UTF-16 string
    if bom = $feff then
    begin
      // return the string
      SetString(Result, PWideChar(@aData[bomSize]),
        (Length(aData) - bomSize) div wcharSize);
      exit;
    end;
  end;
  // if we got to this point, it is probably a plain UTF8 string
  SetString(utf, PChar(@aData[0]), Length(aData));
  // so decode it
  Result := UTF8Decode(utf);
end;