Dynamic Web pages in Kawa

[ Jocelyn Ireson-Paine's Home Page | Free software | Publications ]

Some readers might be interested in a servlet I've written for implementing dynamic Web pages in Kawa, when the application needs session-tracking. The idea is that it's useful to view such an application as a state machine. There's a state stored on the server; submitting a form sends an action plus inputs to the server, which then makes a transition to a new state, storing that in place of the old. The new state determines an output page, which the server then displays.

Having programmed several applications of dynamic Web pages, I find that it's useful to have the transition network - the details of how actions determine transitions from state to state, and which pages are emitted as a result - all in one place. (See Footnote.) This is not how JSP and ASP encourage one to code. It's also useful to be able to test the individual Web pages without a server, by calling them directly from the Kawa top-level interpreter. And since the details of associating state with a session and updating it when input arrives are independent of the particular transition network, these should be taken care of by the servlet.

To keep the semantics simple, and make testing from the top-level interpreter (TLI) easy, I compile Web pages into string-valued functions. (Page files get the extension .pfn, for Page FunctioN; the servlet must be configured so that the .pfn extension in a URL invokes it.) Each page is headed by a list of its arguments. Passing information to the pages via arguments makes explicit what they depend on, and so makes them easier to maintain.

I've pinched the syntax from Bruce Lewis's BRL, so I use square brackets for inserting the results of Kawa expressions into a page. So here's a little .pfn file:

  (args title state)
  <html>
  <title> [ title ] </title>
  <h1> [ title ] </h1>
  [ (state->view state) ]
  </html>

I'm not sure that I'm as keen on square brackets as Bruce is (see his advocacy at http://brl.sourceforge.net/brl_4.html ), but they do have the advantage that one can preview such pages in a browser and check the HTML content that way, without the browser being confused by unrecognised HTML elements.
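Because a page is just a string-valued function, the little page above corresponds roughly to a function one could write by hand. The following is only a sketch of that idea, not the output of my page compiler: the names to-text and example-page are made up for illustration, and state->view here is a stand-in for whatever the application really defines.

```scheme
;; Hypothetical helper: turn any value into its displayed text.
(define (to-text x)
  (if (string? x)
      x
      (let ((port (open-output-string)))
        (display x port)
        (get-output-string port))))

;; Stand-in for the application's own state->view function (assumed).
(define (state->view state)
  (string-append "<p>Current state: " (to-text state) "</p>"))

;; Hand-written equivalent of the example page: one parameter per name
;; in the (args ...) header, and the page text as the returned string.
(define (example-page title state)
  (string-append
   "<html>\n<title> " (to-text title) " </title>\n"
   "<h1> " (to-text title) " </h1>\n"
   (to-text (state->view state))
   "\n</html>\n"))

;; Testing from the top-level interpreter, with no server involved:
(display (example-page "Search" 'start))
```

Calling the function directly like this is what makes it possible to check a page's output without deploying it.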

To read BRL pages, Bruce hacks the Kawa lexical analyser so that characters between ] and [ are treated as strings. I didn't do that, partly because I wanted my system to be independent of changes to Kawa's internals. Instead, my servlet first passes a .pfn page through a pre-processor which replaces ]...[-quoted text by strings, making the file legal S-expressions. These are then fed to the page compiler, which converts them to a lambda expression, which is then eval'ed to get a function. When it comes time to use the page, the servlet supplies this function with its arguments, calls it, and sends the resulting string back to the Web server's output stream.
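The pre-processing step can be sketched in a few lines of Scheme. This is a simplified illustration rather than my actual pre-processor: the names quote-segment and preprocess-body are invented here, the sketch is applied only to the part of the page after the (args ...) header, and it does not cope with a ] that appears inside a string or character literal in the embedded code.

```scheme
;; Write a text segment as a Scheme string literal, escaping \ and ".
(define (quote-segment text)
  (let loop ((i 0) (acc "\""))
    (if (= i (string-length text))
        (string-append acc "\"")
        (let ((c (string-ref text i)))
          (loop (+ i 1)
                (string-append acc
                               (if (or (char=? c #\") (char=? c #\\))
                                   (string #\\ c)
                                   (string c))))))))

;; Scan the page body. Outside brackets we are in text mode and the
;; text becomes a string literal; between [ and ] we are in code mode
;; and the characters are copied through unchanged.
(define (preprocess-body body)
  (let loop ((i 0) (start 0) (in-code? #f) (out ""))
    (cond
     ((= i (string-length body))
      ;; End of input: flush whatever segment is still open.
      (string-append out
                     (if in-code?
                         (substring body start i)
                         (quote-segment (substring body start i)))))
     ((and (not in-code?) (char=? (string-ref body i) #\[))
      (loop (+ i 1) (+ i 1) #t
            (string-append out (quote-segment (substring body start i)) " ")))
     ((and in-code? (char=? (string-ref body i) #\]))
      (loop (+ i 1) (+ i 1) #f
            (string-append out (substring body start i) " ")))
     (else (loop (+ i 1) start in-code? out)))))

;; (preprocess-body "<h1>[ title ]</h1>")
;; gives the text:  "<h1>"  title  "</h1>"
```

The result is a sequence of string literals and Kawa expressions, which a page compiler could then wrap in a lambda expression that concatenates their values.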

My .pfn pages can include other pages. Since pages are functions, to include a page one writes a function call, except that the function position holds the name of the page to include, written as a string:

  (args title state)
  [ ("PageHeader" title) ]
  ... content ...
  [ ("PageFooter") ]

The page compiler replaces these by expressions which look up the included file, compile it, then call it. For efficiency, I could cache the computed page functions, but don't yet. Of course, these included pages can also be tested directly from the Kawa TLI.

What about state handling? The servlet assumes that the application's state is stored in a special slot in the Session instance. (This and following paragraphs will make more sense if you know how Java implements session-tracking.) To define the state transition network, the developer must supply a next-page function, which takes a state, action and inputs as arguments, and returns a next state and the name of a page to be output.

For example, I might write a search-engine application where the user submits some keywords on a form and asks for them to be searched for. My form must be coded to send the action 'search to the servlet, which will pass it as a symbol to the next-page function. The servlet will also pass the current state to this function, together with the data submitted, as a map from HTTP names to values. My next-page function might return either an error page and error state, if the keywords aren't found, or a success page listing the answers to the search, and a state which indicates where in this listing the reader is.
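As a rough sketch, a next-page function for this search example might look like the following. Several details are assumptions made for illustration only: the inputs are represented as an association list rather than whatever map the servlet really passes, the result is returned as a two-element list of next state and page name, and find-matches, input-ref, the "keywords" field, the state shapes and the page names SearchForm, SearchError and SearchResults are all invented.

```scheme
;; Hypothetical stand-in for a real search back end.
(define (find-matches keywords)
  (if (string=? keywords "kawa")
      '("Kawa home page" "Kawa manual")
      '()))

;; Look up a form field in the inputs, assumed here to be an
;; association list from HTTP names to values.
(define (input-ref inputs name)
  (let ((entry (assoc name inputs)))
    (if entry (cdr entry) "")))

;; The transition network in one place: given the current state, the
;; action and the inputs, return (next-state page-name).
(define (next-page state action inputs)
  (case action
    ((search)
     (let ((hits (find-matches (input-ref inputs "keywords"))))
       (if (null? hits)
           ;; Nothing found: error state and error page.
           (list '(error "No matches found") "SearchError")
           ;; Success: remember the hits and the position in the listing.
           (list (list 'results hits 0) "SearchResults"))))
    ;; Unknown action: keep the state and show the form again.
    (else (list state "SearchForm"))))

;; Trying a transition from the top-level interpreter:
(display (next-page '(start) 'search '(("keywords" . "kawa"))))
```

Keeping every transition inside one function like this is what puts the whole network in one place, and it can be exercised from the TLI in the same way as the pages.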

To my mind, one of the biggest design errors in HTML is the action URL in forms. In the days of CGI, this usually named a script to process the form's inputs; the browser would treat the name of this script as the name of the resulting page. This meant that if submitting a form could lead to several different pages depending on the inputs (e.g. a success page or an error page), the script would have to redirect to one or other of these, to overcome the fact that the page name is hard-coded in the action URL. I remember fighting with this about five years ago; judging by postings in various fora about location headers, redirection headers, and browser compatibility, so were a lot of other people!

In contrast, my action URLs are literally actions. To tell the servlet which action a form wishes to invoke on the stored state, the form's action URL must literally name an action - e.g.

  <FORM ACTION=search.action METHOD=GET>

The servlet must be configured so that the .action extension also invokes it. When it receives such a URL, it strips off the extension and any directory part, and treats the rest as the name of the action, passing it to the next-page function as above. Having computed the name of the next page, the servlet will then do a redirect to it, so that the browser sees a sensible name.
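Extracting the action name from such a URL path is simple string handling. The following is a minimal sketch, not the servlet's actual code; the names last-index-of and path->action, and the example path, are invented for illustration.

```scheme
;; Index of the last occurrence of ch in str, or #f if there is none.
(define (last-index-of str ch)
  (let loop ((i (- (string-length str) 1)))
    (cond ((< i 0) #f)
          ((char=? (string-ref str i) ch) i)
          (else (loop (- i 1))))))

;; Strip any directory part and the extension from a URL path, and
;; return what is left as a symbol naming the action.
(define (path->action path)
  (let* ((slash (last-index-of path #\/))
         (file (if slash
                   (substring path (+ slash 1) (string-length path))
                   path))
         (dot (last-index-of file #\.))
         (stem (if dot (substring file 0 dot) file)))
    (string->symbol stem)))

;; (path->action "/app/search.action") gives the symbol search
```

The resulting symbol is what would be handed to the next-page function as its action argument.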

(Actually, if the session-tracking is implemented by URL rewriting, the action URL must include the session id. The person coding the form will have to call one of my standard functions, which encapsulates Java's encodeUrl, to insert this.)

The code is available, but probably not terribly efficient. It is in /kb7/kb7.zip. This is a zip file which defines classes kb7.wsm.* (and other things, but you don't need those). Unzip it into a directory kb7/ and go to the wsm subdirectory. It defines one servlet, WSMServlet, which needs to be configured so that both .pfn and .action extensions cause it to be called. The servlet needs one initialisation parameter, document-root, which must be the top-level directory for .pfn pages. It will also expect to read a sitedefs.scm file from there, defining a next-page function.

Footnote

(*) Note: this is one way to achieve a concise specification of how a Web application behaves and how its pages are interrelated, rather than splitting the information up between the pages. Another approach I've been experimenting with is described here with an example here. It assumes knowledge of algebraic specification - there are some examples in one of the specification languages, CafeOBJ.

4th November 2001.
