Migrating crunchy string code to D2009 ??

Subject:Migrating crunchy string code to D2009 ??
Posted by: Andrew Fiddian-Green (…@bb.cc)
Date:Fri, 28 Nov 2008

Hi,

The following is a challenge to chanters of the mantra "just recompile
and it runs", marketing people, and other assorted optimists...

Below is a function written in D2007. It is part of a parser that extracts
tag text from MPEG-4 music files. It does in fact compile OK in D2009 as it
is, and it still compiles if I change the function result type from
widestring to string, but I am pretty sure that the result is not what one
would have expected compared to D2007.

How would one correctly migrate this function to D2009 Unicode?
Note that the input argument aData: TMp4ByteArray is an array of bytes.

Regards
AndrewFG

+++

function TMp4DataBox.ExtractUnicodeStr(const aData: TMp4ByteArray):
widestring;
var
  utf: Utf8String;
  i: integer;
  bom: word;
const
  bomSize = SizeOf(bom);
  wcharSize = SizeOf(widechar);
begin

  // check for a byte order mark
  if Length(aData) > SizeOf(bom) then
  begin

    // get the byte order mark
    System.Move(aData[0], bom, bomSize);

    // check for big endian UTF-16 string
    if bom = $fffe then
    begin

      // get the string
      SetString(Result, PWideChar(@aData[bomSize]),
        (Length(aData) - bomSize) div wcharSize);

      for i := 1 to Length(Result) do
      begin
        // it is big endian so reverse the byte order
        Result[i] := WideChar(SwapBytes16(word(Result[i])));
      end;

      exit;
    end;

    // check for little endian UTF-16 string
    if bom = $feff then
    begin

      // return the string
      SetString(Result, PWideChar(@aData[bomSize]),
        (Length(aData) - bomSize) div wcharSize);

      exit;
    end;
  end;

  // if we got to this point, it is probably a plain UTF8 string
  SetString(utf, PChar(@aData[0]), Length(aData));

  // so decode it
  Result := UTF8Decode(utf);
end;
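
For comparison, here is a minimal sketch of one shape the function might take in D2009. It is not a definitive migration and rests on several assumptions: TMp4ByteArray is taken to be a dynamic array of Byte (its declaration is not shown above), the method is written as a standalone function, and the RTL intrinsic Swap stands in for SwapBytes16, whose source is also not shown. The main change is in the UTF-8 branch: PChar is PWideChar in D2009, so the byte-wise copy is done through PAnsiChar instead, and the decode goes through UTF8ToString. A guard for an empty array is added so that aData[0] is never touched when there is no data.

```pascal
program Mp4StrSketch;

{$APPTYPE CONSOLE}

type
  // Assumed declaration; the original post does not show TMp4ByteArray.
  TMp4ByteArray = array of Byte;

// Standalone version of the method, written for D2009 or later.
function ExtractUnicodeStr(const aData: TMp4ByteArray): UnicodeString;
var
  utf: UTF8String;
  bom: Word;
  i: Integer;
const
  bomSize = SizeOf(bom);
  wcharSize = SizeOf(WideChar);
begin
  Result := '';

  // guard against an empty array so that aData[0] is never touched
  if Length(aData) = 0 then
    Exit;

  if Length(aData) > bomSize then
  begin
    // read the first two bytes as a little-endian word
    System.Move(aData[0], bom, bomSize);

    if (bom = $FFFE) or (bom = $FEFF) then
    begin
      // UTF-16 payload: copy it as 16-bit code units
      SetString(Result, PWideChar(@aData[bomSize]),
        (Length(aData) - bomSize) div wcharSize);

      // bytes FE FF read as $FFFE mean big endian, so swap each code unit
      // (Swap is used here in place of the SwapBytes16 helper)
      if bom = $FFFE then
        for i := 1 to Length(Result) do
          Result[i] := WideChar(Swap(Word(Result[i])));

      Exit;
    end;
  end;

  // no BOM: treat the bytes as UTF-8; PAnsiChar keeps this a byte-wise
  // copy, whereas PChar is PWideChar in D2009
  SetString(utf, PAnsiChar(@aData[0]), Length(aData));
  Result := UTF8ToString(utf);
end;

var
  data: TMp4ByteArray;
begin
  // 'Hi' as big-endian UTF-16 with a BOM: FE FF 00 48 00 69
  SetLength(data, 6);
  data[0] := $FE; data[1] := $FF;
  data[2] := $00; data[3] := $48;
  data[4] := $00; data[5] := $69;
  Writeln(ExtractUnicodeStr(data));
end.
```

Returning UnicodeString rather than widestring avoids a conversion at every call site where the result is assigned to a D2009 string, although keeping widestring should also compile if callers depend on it.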

Replies