Thursday, July 8, 2010

Hebrew and BiDi text in Emacs24 - Part 1 (Introduction)

I use Emacs a lot - not just for programming, but for many other tasks as well. It has always irritated me that Emacs didn't support Hebrew text input and that I had to switch to another editor whenever I needed to write something in Hebrew. Well, with Emacs version 24, that will change! I've been playing around with the development version of Emacs24. (Although I also tested on Emacs24 itself, I've mostly been testing on the development version of Aquamacs, which is Emacs packaged with a number of frequently used extensions for use on the Mac.) I have been impressed by how well Emacs is starting to support Hebrew input with proper BiDi display, due in large part to the work and perseverance over the past few years of Emacs developer Eli Zaretskii. Here is a screenshot illustrating both a mixture of English and Hebrew lines of text and Hebrew embedded in an English XML code snippet; in both cases, the proper Left-To-Right (LTR) and Right-To-Left (RTL) display of the text is maintained:


At the moment, this is only available in the development version of Emacs/Aquamacs. Therefore, unless you're technically savvy, you probably won't be able to use it until the next official release of Emacs (the above picture is just to whet your appetite!). However, any developers who want to give it a try can do so by downloading and building the Emacs development source. Here are the necessary steps for building the Aquamacs version:

From a terminal, download and build Aquamacs24:
git clone git://
cd aquamacs-emacs/
git checkout -t origin/aquamacs24
(note: depending on your git version, you might need to run: git checkout origin/aquamacs24)
Then, drag the application from the nextstep directory to your Applications directory. Currently, bidi support is turned off by default in Emacs24 (this will change before the official release of Emacs24), so you will want to turn it on in your .emacs file:
(setq-default bidi-display-reordering t)
That's all you have to do to enable bidi support! Emacs will "automatically" recognize whether you're typing a LTR or a RTL language and display the text appropriately, according to the rules of the Unicode Bidirectional Algorithm (described in Unicode Standard Annex #9). You can "force" RTL or LTR text either by setting the "bidi-paragraph-direction" variable in Emacs or by using one of the standard mechanisms described in the Unicode Bidirectional Algorithm page; however, for the most part, you will just let Emacs do what you mean (DWIM).

You can enter Hebrew either by enabling system-wide Hebrew input (e.g. - as described here for the Mac) or by enabling a Hebrew input method locally in Emacs (e.g. - using "C-x RET C-\ hebrew"). Most Emacs users will probably prefer the Hebrew input method approach, as it lets you keep using the standard Emacs command keys, whereas switching the system-wide input also affects command key usage. Once you have enabled an Emacs Hebrew input method, you can toggle between Hebrew and your default language by pressing "C-\".

Anyone wanting to read about the main design decisions that went into the development of bidi support in Emacs can read Eli Zaretskii's "Bidirectional editing in Emacs -- main design decisions" post on the emacs-bidi list archive.
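For anyone who wants to try the settings mentioned in this post, here is a rough sketch of how they might be collected in a .emacs file. It is only an illustration, not a tested recipe for the development builds: the two commands prefixed with "my-" are hypothetical names I made up for this example, and the sketch assumes that the "hebrew" input method is available in your build and that "bidi-paragraph-direction" accepts the values nil, 'left-to-right and 'right-to-left.

```elisp
;; Sketch of a .emacs fragment; the my-* command names are hypothetical.

;; Turn on bidirectional display reordering (off by default in the
;; development version, as noted in this post).
(setq-default bidi-display-reordering t)

;; nil lets Emacs work out each paragraph's base direction from its text.
(setq-default bidi-paragraph-direction nil)

;; Assumption: with this set, "C-\" switches to the Hebrew input method
;; without first prompting for a method name.
(setq default-input-method "hebrew")

(defun my-force-rtl-paragraphs ()
  "Force right-to-left paragraph direction in the current buffer."
  (interactive)
  ;; Assumption: setting this variable affects only the current buffer.
  (setq bidi-paragraph-direction 'right-to-left))

(defun my-auto-paragraph-direction ()
  "Let Emacs choose the paragraph direction in the current buffer again."
  (interactive)
  (setq bidi-paragraph-direction nil))
```

After reloading .emacs, "M-x my-force-rtl-paragraphs" would force RTL paragraphs in the buffer you are editing, and "M-x my-auto-paragraph-direction" would hand the decision back to Emacs, which is what you want most of the time.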