Preparing the transition to WASTE 3.0
Last updated on June 21, 2006
WASTE 3.0 is a long-overdue revision of the popular WASTE text engine that is fully compatible with Mac OS X. It is a complete rewrite: all existing code was scrapped. The new engine is written in C++, using modern idioms and the Standard Template Library extensively. Text is stored internally in UTF-16 format, text rendering and measuring are performed using ATSUI, and all imaging is done with Core Graphics. Text input is accepted through Carbon event handlers, so all available keyboards and input methods are supported.
This document is written for developers using WASTE 2.1 in their applications who are migrating to WASTE 3.0, or are planning to do so.
As of this writing, WASTE 3.0 requires Mac OS X 10.3 (Panther). However, as development proceeds, the minimum system requirements may rise. The source code is meant to be compiled into a Universal (PowerPC/x86) Mach-O binary; CFM is not supported.
In previous versions of WASTE, text was stored internally using a variety of legacy WorldScript encodings such as MacRoman and MacJapanese. Characters could take up one or two bytes, and their identity could not be determined solely by inspecting the raw text buffer: WASTE would use the Quickdraw font family number associated with a text run to determine the encoding of the associated text.
WASTE 3.0 stores text internally in UTF-16 (Unicode) format. This has several advantages. First of all, characters have an intrinsic identity which is no longer dependent on any out-of-band attributes that might be imposed on them. Secondly, a much wider repertoire becomes accessible to WASTE documents, including a vast number of extended Latin characters that were previously unrepresentable, or representable only with non-Roman fonts.
An important consequence of this change, which should be mostly transparent to client applications, is that text offsets used in many APIs are no longer byte offsets, but offsets to UniChar (16-bit) units.
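To make the change of unit concrete, here is a minimal sketch in plain C++ that does not call any WASTE API. It uses `char16_t` as a stand-in for `UniChar`, and the helper names (`isHighSurrogate`, `isLowSurrogate`, `byteLength`, `countCodePoints`) are hypothetical, introduced only for illustration. It shows that an offset or length expressed in 16-bit units differs both from a byte count and, when surrogate pairs are present, from a count of Unicode characters.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local stand-ins (not WASTE APIs): bit tests for UTF-16 surrogate halves.
inline bool isHighSurrogate(char16_t u) { return (u & 0xFC00) == 0xD800; }
inline bool isLowSurrogate(char16_t u)  { return (u & 0xFC00) == 0xDC00; }

// Size of the text in bytes: always twice the number of 16-bit units.
inline std::size_t byteLength(const std::u16string& text) {
    return text.size() * sizeof(char16_t);
}

// Number of Unicode code points in the text. A well-formed surrogate pair
// counts once; an unpaired surrogate is counted as one code point.
std::size_t countCodePoints(const std::u16string& text) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++i) {
        if (isHighSurrogate(text[i]) && i + 1 < text.size() &&
            isLowSurrogate(text[i + 1])) {
            ++i;  // skip the low half of the pair
        }
        ++count;
    }
    assert(count <= text.size());  // never more code points than units
    return count;
}
```

For text in the Basic Multilingual Plane, the unit count and the character count coincide, which is why the change is mostly transparent. They diverge only for characters encoded as surrogate pairs.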
In WASTE 2.1, text input was achieved by feeding one-byte character codes (obtained from raw keyboard events) to WEKey, or through TSM Apple events captured by suitable Apple event handlers set up with WEInstallTSMHandlers. Both methods are obsolete and discouraged in Mac OS X, superseded by text input Carbon event handlers, installed using the new WESetEventTarget API. This allows all keyboards and input methods to be used, even Unicode-only keyboards such as Arabic, Turkish and Icelandic that were previously unavailable to WASTE-using applications. WEKey is still supported, but has been deprecated.
WASTE 3.0 fully supports Tiger's dictionary lookup feature (hover the mouse over any word, press control-command-D, and a definition of that word will pop up), and the Character Palette, including entering unencoded glyphs.
While previous versions of WASTE used Quickdraw to render the text, WASTE 3.0 uses Core Graphics exclusively, and avoids the use of Quickdraw types and data structures altogether, except where needed to guarantee backwards compatibility with WASTE 2.1 APIs. All geometric calculations are performed internally using floating point coordinates.
A few WASTE 2.1 APIs make use of Quickdraw regions, notably WEAdjustCursor. While WASTE 3.0 will keep supporting these APIs, modern replacements will be provided that don't rely on Quickdraw regions, but use Core Graphics paths or HIShapes instead. WASTE 3.0 does not use regions internally: when it needs to manipulate complex shapes such as areas to highlight, it uses Core Graphics paths.
Embedded objects present a special difficulty, since 'draw' handlers assume a Quickdraw port has been set up for them to draw into. This assumption no longer holds in WASTE 3.0. Rather than jumping through hoops in an attempt to set up a Quickdraw environment just for embedded objects, I decided to replace 'draw' handlers with a backwards-incompatible version that gets passed a Core Graphics context.
Macintosh applications have been able to exchange text with one another since the dawn of time, first using the clipboard, and later using the now familiar drag-and-drop mechanism. The preferred underlying data format, however, has changed with time.
The traditional Mac OS format for exchanging plain text was the 'TEXT' scrap format, which was just a string of MacRoman characters (later, characters in more WorldScript encodings were allowed). Formatted text was represented by adding an ancillary 'styl' scrap to the main 'TEXT' scrap, representing a simple combination of out-of-band character-level attributes. The WASTE 1.x series fully supported this convention, and added a parallel 'SOUP' scrap for representing embedded objects. WASTE 2.x, with its support for paragraph-level formatting and extended character styles such as strike-through, introduced a number of additional scrap formats, all meant to be used in conjunction with the main 'TEXT' scrap.
The 'TEXT' format, however, is unsuitable for a modern Unicode-based operating system like Mac OS X. As a consequence, the preferred format for plain text in WASTE 3.0 is 'utxt' (a string of UTF-16 code units), and the preferred format for formatted text is RTF.
As of version 3.0d7, WASTE 3.0 is delivered as a universal framework, capable of running natively on Intel-based Macintoshes. A number of proprietary, binary scrap formats introduced in previous versions of WASTE for the purpose of accurate round-tripping of character- and paragraph-level attributes need to be properly endian-flipped when parsed, or generated, on a little-endian processor such as the CPU that powers Intel-based Macintoshes. WASTE 3.0 takes care of all the endian-flipping automatically and transparently.
WEStreamRange, when called with the 'utxt' selector, will return UTF-16 text in big-endian byte order unless the weGetLittleEndian option is specified, regardless of the underlying platform. To avoid oversights and ambiguities, I recommend you always use Unicode BOMs (byte order marks) when extracting or inserting UTF-16 text.
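The BOM recommendation can be illustrated with a small sketch in plain C++, independent of WASTE. The function names `encodeUtf16WithBom` and `decodeUtf16` are hypothetical. The encoder writes U+FEFF ahead of the text in the chosen byte order; the decoder inspects the first two bytes and, as an assumption of this sketch, falls back to big-endian when no BOM is present.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Serializes UTF-16 units to bytes in the requested byte order,
// preceded by a byte order mark (U+FEFF).
std::vector<std::uint8_t> encodeUtf16WithBom(const std::u16string& text,
                                             bool littleEndian) {
    std::vector<std::uint8_t> bytes;
    bytes.reserve((text.size() + 1) * 2);
    auto put = [&](char16_t unit) {
        const std::uint8_t hi = static_cast<std::uint8_t>(unit >> 8);
        const std::uint8_t lo = static_cast<std::uint8_t>(unit & 0xFF);
        if (littleEndian) { bytes.push_back(lo); bytes.push_back(hi); }
        else              { bytes.push_back(hi); bytes.push_back(lo); }
    };
    put(0xFEFF);                        // the BOM itself
    for (char16_t unit : text) put(unit);
    assert(bytes.size() == (text.size() + 1) * 2);  // BOM + 2 bytes per unit
    return bytes;
}

// Decodes bytes back to UTF-16 units. A leading FE FF selects big-endian,
// FF FE selects little-endian; without a BOM this sketch assumes big-endian.
// A trailing odd byte, if any, is ignored.
std::u16string decodeUtf16(const std::vector<std::uint8_t>& bytes) {
    bool littleEndian = false;
    std::size_t pos = 0;
    if (bytes.size() >= 2) {
        if (bytes[0] == 0xFE && bytes[1] == 0xFF) {
            pos = 2;
        } else if (bytes[0] == 0xFF && bytes[1] == 0xFE) {
            littleEndian = true;
            pos = 2;
        }
    }
    std::u16string text;
    for (; pos + 1 < bytes.size(); pos += 2) {
        const unsigned first = bytes[pos];
        const unsigned second = bytes[pos + 1];
        const unsigned unit = littleEndian ? (second << 8) | first
                                           : (first << 8) | second;
        text.push_back(static_cast<char16_t>(unit));
    }
    return text;
}
```

With a BOM in place, the reader no longer needs to know which byte order the writer chose, so the same bytes decode correctly on PowerPC and Intel alike.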
WASTE 3.0 offers limited support for one-level undo/redo. Current clients should consider adding support for multiple undo/redo, if they haven't already.
APIs bracketed in WASTE_DEPRECATED guards in the WASTE.h header file that comes with WASTE 2.1 are not available in WASTE 3.0, so the first step in transitioning to WASTE 3.0 is to make sure you're not using any of those. The 2.1 header file suggests possible replacements.
In addition, several more APIs are deprecated in WASTE 3.0. Here's a table, with reasons for deprecation and suggested replacements:
| Deprecated API | Reason | Possible replacement |
| WECharByte | Don't assume text is stored in legacy WorldScript encodings | UCIsSurrogateHighCharacter, UCIsSurrogateLowCharacter |
| WECharType | Don't assume text is stored in legacy WorldScript encodings | WEGetCharacterProperty |
| WEGetUndoInfo | Support multiple undo/redo | WEGetIndUndoInfo |
| WEHandleTSMEvent | Use Carbon event handlers for text input | WESetEventTarget |
| WEInstallTSMHandlers | Use Carbon event handlers for text input | WESetEventTarget |
| WEKey | Prefer text input events to raw keyboard events | WESetEventTarget |
| WEIdle | No-op; CPU time is obtained by installing Carbon event timers when appropriate | not needed |
| WERemoveTSMHandlers | Use Carbon event handlers for text input | WESetEventTarget |
| WEUseSoup | Meant to be used in conjunction with WEUseText, which is deprecated | WEPut |
| WEUseStyleScrap | Meant to be used in conjunction with WEUseText, which is deprecated | WEPut |
| WEUseText | Don't assume text is stored in legacy WorldScript encodings | WEPut |
| WEPrintPage | Use Core Graphics instead of Quickdraw | WEPrintPageWithCGContext |
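Code that called WECharByte typically did so to avoid splitting a two-byte character. In UTF-16 the analogous concern is splitting a surrogate pair. The following is a minimal sketch of that check in plain C++, under stated assumptions: `isHighSurrogate` and `isLowSurrogate` are local stand-ins that are assumed to perform the same kind of test as UCIsSurrogateHighCharacter and UCIsSurrogateLowCharacter, and `snapToCharacterBoundary` is a hypothetical helper, not a WASTE API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Local stand-ins for the surrogate tests named in the table above.
inline bool isHighSurrogate(char16_t u) { return (u & 0xFC00) == 0xD800; }
inline bool isLowSurrogate(char16_t u)  { return (u & 0xFC00) == 0xDC00; }

// Returns `offset` unchanged if it falls on a character boundary. If it falls
// between the two halves of a surrogate pair, returns the offset of the high
// half instead, so the pair is never split.
std::size_t snapToCharacterBoundary(const std::u16string& text,
                                    std::size_t offset) {
    assert(offset <= text.size());  // offsets are in 16-bit units
    if (offset > 0 && offset < text.size() &&
        isLowSurrogate(text[offset]) && isHighSurrogate(text[offset - 1])) {
        return offset - 1;
    }
    return offset;
}
```

A client might apply such a check to any offset it computes arithmetically before using it as a selection boundary.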
Also, many selectors for WEGetInfo are unsupported in WASTE 3.0, including all selectors for low-level measuring and drawing hooks. Selectors that are supported at the time of this writing are:
WASTE 3.0 isn't just about maintaining compatibility with old codebases. It also introduces a few new features that address some of the most common real-world requirements of WASTE clients.
WASTE 3.0 allows client applications to define any number of custom attributes, and apply those attributes to any text range, just like built-in attributes such as font and color. The new API that enables this is WERegisterCustomAttribute. Custom attributes can be either character-level (part of a style) or paragraph-level (part of a ruler). They can be marked as persistent, which will cause WASTE to preserve them when formatted text is converted to and from RTF or other scrap formats.
WASTE 3.0 introduces a feature called grouping, which allows an arbitrary text range to be "grouped" and treated as a unit for all selection, deletion and keyboard navigation purposes. This is similar to the concept of protected text found in other applications. With previous versions of WASTE, some clients faked grouped/protected text using embedded objects, but this solution was not fully satisfactory for several reasons, including the fact that embedded objects cannot span multiple lines, while grouped text can.
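To illustrate what "treated as a unit" can mean for selection, here is a conceptual sketch in plain C++. It is not WASTE's implementation or API: `TextRange` and `expandToGroups` are hypothetical names, ranges are half-open and measured in UTF-16 units, and the groups are assumed not to overlap one another. Any selection that cuts into a group is widened to cover the whole group.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Half-open range [start, end) of UTF-16 unit offsets (hypothetical type).
struct TextRange {
    std::size_t start;
    std::size_t end;
};

// Widens `selection` so that every group it intersects is fully included.
// An insertion point strictly inside a group selects the whole group; one
// sitting exactly on a group edge is left alone.
// Assumes the ranges in `groups` do not overlap each other.
TextRange expandToGroups(TextRange selection,
                         const std::vector<TextRange>& groups) {
    assert(selection.start <= selection.end);
    TextRange result = selection;
    for (const TextRange& group : groups) {
        const bool intersects =
            group.start < selection.end && selection.start < group.end;
        if (intersects) {
            result.start = std::min(result.start, group.start);
            result.end = std::max(result.end, group.end);
        }
    }
    return result;
}
```

For example, with groups at [5, 10) and [20, 25), a selection of [8, 22) would become [5, 25), while a selection of [10, 20) would be left unchanged.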
The current evaluation kit includes a demo of WASTE 3.0, delivered as a universal binary: it was obtained from the old demo application for WASTE 2.1, recompiled with Xcode. The application code itself is mostly unchanged, but it's linked against the new engine. As you can see, WASTE 3.0 is not 100% functional yet, and it could use some optimization, but it's getting close to usable. I will post new builds as development progresses. Your feedback is appreciated.
Copyright © 2005-2006 Merzwaren