[ Jocelyn Ireson-Paine's Home Page | Publications | Dobbs Code Talk Index | Dobbs Blog Version ]

# Excel's Missing Ha'p'orth

I have been writing a spreadsheet to help those teaching the International Baccalaureate. In coding the part of its user interface that teachers use to describe the exercises they set and mark, my problems with menus recalled the concept of orthogonal programming languages, and showed me that Excel isn't.

A programming language is "orthogonal" if every feature works with every other feature. One of the first truly orthogonal languages was Algol68. (Others were Lisp and, I suspect, CPL.) For example, Algol68 had arrays. It had structures, functions, and unions. And it allowed you to make arrays of structures, arrays of functions, arrays of unions, structures with arrays in, unions which were sometimes arrays, functions that took and yielded arrays, and even arrays of arrays of functions that yielded structures composed of arrays of structures containing unions of arrays, arrays, and functions that took and yielded arrays.

Excel is not orthogonal. One thing I needed for the user interface was a menu of exercises. Teachers identify these by number, so I needed a menu of numbers. Now, Excel has a feature called "data validation". This enables you to restrict the values that a user can type, and to check what they do type. One way you can restrict them is to allow only values from a specified list. So I needed the data validator to put a menu of exercise numbers into my input cell.

When making the menu, you specify its list by typing it into the "Source" field on Excel's data-validation-setup form. The list can be either a list of string constants, or a formula that returns a range of cell addresses. Thus, if I want my menu to have as options the contents of cells A2:A8, I type `=A2:A8`.

But that's only if they're on the same sheet as the menu. When a formula refers to cells on a different sheet, you normally qualify them with the sheet name, like this: `'Task Metadata'!A2:A8`. But, I scream at my uncaring screen, Excel will not let me use this as a data-validation source. No matter how many times I type `'Task Metadata'!A2:A8` into the "Source" field, Excel retaliates with a dialogue box saying You may not use references to other worksheets or workbooks for Data Validation criteria. The only good thing about this is that its author at least realised that "criteria" is plural and avoided writing "a criteria".

You have to wonder about the person who implemented data validation. If you've ever written a compiler, you'll know that its code generator defines two vital functions. One generates code that yields the value of an expression. The other generates code that yields its address. You would use the latter, for example, to calculate the base address of an array-valued expression before subscripting it. Whoever specifies the code generator would ensure that it exports these two functions. Everybody would then be able to call them from anywhere in the compiler.

Excel's equivalents to value and address are value and cell address. So why doesn't Excel have an "evaluate to cell address" function? And if it does, why didn't the data-validation implementor call it?

It isn't that cells on different sheets from the menu are impossible to use. I say this because Excel can be fooled into using them. You define an identifier such as `MyOptions` that refers to `'Task Metadata'!A2:A8`. (This is what Excel knows as a "named range".) You can then type it into the "Source" field and everything will just work. Though because users can delete and add cells at any time, care is needed to make sure the named range always points where it should. Mistargeted named ranges are a common cause of errors.

I don't spend my time flaming Microsoft. I run Windows; and I am happy to believe a friend who once kept a Linux-resource Web site but who has abandoned Linux for Windows, when he tells me Windows is simpler to use, has better graphics, has more software written for it, and is easier to install because you don't have to fight through a junkyard of incompatible and complicated distributions, nor fritter away time hunting for drivers. But I don't see any excuse for the non-generality of Excel's data validation, or that the way round it makes your spreadsheet less safe. Nor do I see an excuse for other non-generalities, such as that when putting an array formula into a cell from a Visual Basic for Applications routine, the formula can be at most 255 characters. On the 22nd of September, the BBC announced that Bill Gates was the richest person in the US. More pertinently, because my Excel is Excel 2003, the BBC told us he has been the richest person in the US for the 17th year in a row. To design orthogonal languages takes time, because you have to consider lots of interactions. But Bill's programmers don't have to scrimp and sweat like the rest of us cash-flow-haunted slaves. They can afford the luxury of taking their time, coding calmly, and turning out an elegant and orthogonal design. Why spoil the ship for a ha'p'orth of tar?