[ Jocelyn Ireson-Paine's Home Page | Free software | Publications | Algebraic Web specification | Report on Visit to University of Minho, May 2000 ]

An example of algebraic Web specification

-- web2.mod 

-- We start with basic definitions for Web pages, URLs
-- and so forth.


module* PAGES
{
  -- This module introduces the sort Page, denoting
  -- the space of complete Web pages.
  -- Other modules will define functions mapping
  -- between Page and URLs, filenames, and content.
  --
  -- We shall probably only use terms of sort Page to
  -- represent pages resident on our own server. We
  -- can certainly use CafeOBJ to specify the behaviour
  -- of pages on any server, but we can only have
  -- control over our own (except perhaps via the 
  -- use of PUT or other methods for writing to remote
  -- servers). So when using these modules
  -- as components of the algebraic web site generator,
  -- it's probably not useful to name pages elsewhere.

  [ Page ]
}


module* CONTENT
{
  -- This module introduces the notion of a mapping from 
  -- pages to their contents. We assume that every page's
  -- content can be represented as a string. In
  -- most cases, this will be HTML.

  protecting( STRING )
  protecting( PAGES ) 

  op content : Page -> String
}


module* CONTENT'
{
  -- This is an alternative version of CONTENT. 
  -- Though most pages contain HTML, not all do.
  -- Each page has a so-called content type or
  -- MIME type. A browser will either infer this
  -- from the extension in the page's URL (e.g.
  -- ".htm" and ".html" indicate HTML; ".gif" indicates
  -- the GIF graphical format), or from a 
  -- 'Content-type' header sent back by the server
  -- (e.g. the header "Content-type: text/html"
  -- indicates HTML; the header "Content-type: image/gif"
  -- indicates GIF). The browser uses the content
  -- type to decide how to interpret and display
  -- the page's content. E.g. a browser will
  -- render the content "<H1>Header</H1>" as a bold
  -- heading in some big font if the content type
  -- is HTML; but if the content type is "text/plain",
  -- the browser will display the HTML H1 tags verbatim
  -- rather than interpreting them, and will probably use a 
  -- monospaced font such as Courier throughout.
  --
  -- Below, I've defined sort ContentType, and some
  -- constants. Other modules might want to add other
  -- constants to the sort. Does the web site generator
  -- need ContentType? Probably, it's too early to say.
  -- Instead of saying that the function 'content' maps
  -- all pages to strings, should we define subsorts
  -- of String for different content, as I've done below?
  -- If so, how do we specify that, e.g. 'content'
  -- will only map a page to an HTML if the page's
  -- conteot-type is text/html? 

  protecting( STRING )
  protecting( PAGES )

  [ ContentType ]

  [ Content < String ]
  [ GIF < Content ]
  [ HTML < Content ]
  [ Text < Content ]

  ops image/gif text/html text/plain : -> ContentType

  op content-type : Page -> ContentType

  op content : Page -> String
}


module* URLS
{
  -- This module introduces the space of URLs.
  -- For the moment, we ignore the internal structure
  -- of these (e.g. the way they're divided by "/"
  -- separators) and just represent them as strings.
  -- In future, we might want not to ignore the
  -- structure: perhaps represent each URL as a list of
  -- elements, and have a function that converts
  -- the list 
  --   [ e1 e2 ... e_n ] 
  -- to the string
  --   e1 ++ "/" ++ e2 ++ "/" ++ ... ++ e_n.
  --
  -- We might also want to introduce functions
  -- that normalise URLs (e.g. converting 
  -- "http://site/dir1//dir2" to "http://site/dir1/dir2")
  -- and that test whether two URLs name the same page
  -- (do this by comparing normalised forms, perhaps).
  -- Some transformations (such as the above) can be
  -- done without knowledge of the site whose pages the
  -- URLs name, I think. Others will be site-dependent
  -- (e.g. knowing that "/dir1/" denotes the default
  -- page "/dir1/index.htm"). 
  --
  -- We do not assume that every element of sort URL
  -- actually names a page.
  --
  -- The subsort SiteAbsoluteUrl is for URLs of pages
  -- living on the current server. These are special 
  -- because the web site generator can create them and
  -- know their location in its filestore. They
  -- will not contain an "http://" prefix or a site
  -- address, and will start with a "/" (e.g. 
  -- "/dir1/index.htm"). 

  protecting( STRING )

  [ URL < String ]
  [ SiteAbsoluteURL < URL ]
}


module* PAGE-VS-URL
{
  -- This module introduces mappings between the
  -- space of pages and of URLs. The function
  -- 'page' will be partial, since not every URL
  -- actually names a page. 'url' can be total.
  --
  -- Should we use sort constraints to distinguish
  -- URLs that do name pages from those that don't?
  -- If so, how, given that the set of pages changes
  -- with time?

  protecting( PAGES + URLS )

  op page : URL -> Page
  op url : Page -> URL
}


module* FILENAMES
{
  -- This module introduces the spaces of filenames
  -- and directory names. We shall need these when
  -- mapping URLs to locations in our servers filestore.
  -- As with URLs, we might later want to represent the
  -- internal structure of filenames rather than just
  -- representing them as strings.
  --
  -- We define two sorts, Filename and DirectoryName.
  -- Some operating systems treat directories as
  -- special kinds of file, but it's probably cleaner
  -- to think of the two as different.
  --
  -- Question: are these relative or absolute filenames,
  -- and do we need to distinguish the two by subsorting?
  --
  -- I assume that directory names do not end in the 
  -- directory-separator character. This assumption is
  -- important when prepending them to URLs to make
  -- filenames.

  protecting( STRING )

  [ Filename < String ]
  [ DirectoryName < String ]
}


module* URL-VS-FILENAME
{
  -- This module introduces the idea of a mapping between
  -- a page's URL and its location in the server's
  -- filestore. Though we can talk about this for
  -- any server, we only have control over it on
  -- our own, so I've restricted it to SiteAbsoluteURL.
  --
  -- The mapping from URLs to filenames is total.
  -- That from filenames to URLs will be partial,
  -- since some files will be outside the region
  -- of filestore which the server is allowed to
  -- send pages from.

  protecting( URLS + FILENAMES )

  op filename : SiteAbsoluteURL -> Filename
  op url : Filename -> SiteAbsoluteURL
}


module* DOCUMENT-ROOT-MAPPING
{
  -- This module describes the most common way
  -- of mapping between URLs and filenames. The
  -- server's filestore has a 'document root'
  -- directory. The server is allowed to make
  -- any files in this directory or its subdirectories
  -- visible as Web pages. To convert a URL
  -- to a filename, it prepends the document
  -- root, and then converts all "/" characters in the
  -- URL to whatever character its operating
  -- system uses as a directory separator (e.g.
  -- "/" on Unix, "\" on Windows).
  --
  -- To describe your own server's document-root mapping,
  -- protect this module and give equations defining
  -- document-root and directory-separator. document-root
  -- should be a directory name, without a trailing
  -- directory separator. It should use "/" for directory
  -- separators, even if your own system uses "\" or 
  -- some other character. This is merely because there seems
  -- to be a bug in CafeOBJ which makes it pass 
  -- "\" characters in strings to Lisp incorrectly. 
  -- directory-separator should be the directory separator:
  -- if this is "\", write it as "\\", which despite the
  -- bug, does seem to get passed to Lisp OK.

  protecting( URL-VS-FILENAME )

  op document-root : -> DirectoryName
  op directory-separator : -> String

  op $substitute : String String -> String
  -- Private operator which replaces all "/"'s in its
  -- first argument by the second argument. The second
  -- argument will be directory-separator, but we need
  -- to pass it as a variable: I can't find a way to make
  -- Lisp refer to terms other than via CafeOBJ
  -- variables.

  var U : SiteAbsoluteURL
  eq filename( U ) = $substitute( document-root ++ U, directory-separator ) .

  eq $substitute( URL:String, Sep:String ) = #! (substitute (char Sep 0) #\/ URL) .
}


-- Next, we consider how Web pages can be grouped into
-- collections. We need some utilities first, and then
-- define a particular kind of collection, the slideshow.


module* HTML-UTILITIES
{
  -- This module defines some utilities for generating HTML.
  -- We'll need them when linking pages within collections.

  -- The <A HREF=_>_</A> operation places a link, as its name
  -- suggests.

  protecting( PAGE-VS-URL )

  op <A HREF=_>_</A> : Page String -> String
  op <A HREF=_>_</A> : URL String -> String 

  eq <A HREF= P:Page > S:String </A> = <A HREF= url(P) > S </A> . 
  eq <A HREF= U:URL > S:String </A> = "<A HREF=" ++ U ++ ">" ++ S ++ "</A>" . }


module* SLIDESHOW
{
  -- This module describes a specific kind of collection of
  -- pages, namely a 'slideshow'. This is examplified by the 
  -- MM presentation on my site at 
  -- http://www.j-paine.org/mm_presentation/dangerous.html . 
  -- It's a sequence of pages intended for linear reading. 
  -- Each page except the last has a successor, and each page 
  -- except the first has a predecessor. 
  -- 
  -- To use this module, define some extra pages p2 ... p_n . 
  -- Define the ordering with equations for 'succ' such that   
  --   last = succ(p_n), p_n = succ(p_n-1), ... p2 = succ(first). 
  -- and for 'pred' such that 
  --   first = pred(p2), p3 = pred(p2), ... p_n = pred(last). 
  -- Define the pages' titles with equations for 'title' such that
  --   title(p_i) = ...The string between <TITLE> and </TITLE>
  --                   tags for page p_i. The generator will also
  --                   use this for the page's heading, placing
  --                   it betweem <H1> and </H1> tags...
  -- Define the pages' bodies with equations for 'body' such that
  --   body(p_i) = ...The content for page p_i that comes after
  --                  the heading...
  -- Define the pages' URLs with equations for 'url' such that
  --   url(p_i) = ...The site-absolute URL for page p_i...
  -- Define the URL for the contents page (see below) with an equation
  -- for 'url' such that
  --   url(contents-page) = ...The site-absolute URL for the contents page...
  --
  -- Module CONTENT introduces the function 'content' from
  -- a page to its complete contents. SLIDESHOWS defines 'content'
  -- in terms of the page's title, body and links. After placing
  -- the title in the page's <HEAD> section, it generates a
  -- navigation bar linking to the page's successor and precessor
  -- (or to one of these if the page is at the start or end). It
  -- then reuses the title to generate the heading, as described
  -- above. It then places the page's body, following it with
  -- another copy of the navigation bar and the closing
  -- </BODY> and </HTML> tags. SLIDESHOW also generates a 
  -- contents page, linking to each page in order. It places
  -- a link to this in each navigation bar.
  --
  -- In real-life use, the user would want more control over
  -- location of the navigation bar and whether, e.g. the 'Next'
  -- link should appear before or after the 'Last' one, and
  -- the words to be used. We could define these by providing
  -- operations to construct templates for pages and their parts,
  -- terms constructed from which this module interprets. For
  -- example, 
  --   eq nav-bar = "Up" ++ UP-LINK ++ 
  --               " Next " ++ NEXT-LINK ++ " Last " ++ LAST-LINK
  -- or
  --   eq nav-bar = "Inhoud" ++ UP-LINK ++ 
  --                " Vorige pagina " ++ LAST-LINK ++ " Volgende pagina " ++ NEXT-LINK 
  -- where the capitalised operations get replaced by appropriate
  -- links for each page.
  --
  -- Question: I haven't declared a sort to represent slideshows.
  -- Would that be better?
  --
  -- Operations whose names begin with "$" are to be treated as
  -- private to the module.

  protecting( PAGES + CONTENT + PAGE-VS-URL + HTML-UTILITIES )

  [ SequencePage < Page ]
  [ StartPage < SequencePage ]
  [ InteriorPage < SequencePage ]
  [ EndPage < SequencePage ]
  [ ContentsPage < Page ]

  op succ : StartPage -> Page
  op succ : InteriorPage -> Page
  op pred : InteriorPage -> Page
  op pred : EndPage -> Page
  op first : -> StartPage
  op last : -> EndPage 
  op body : Page -> String
  op title : Page -> String
  op contents-page : -> ContentsPage

  op url : Page -> SiteAbsoluteURL
  -- Specialise 'url' so that 'filename' can be reduced.

  op $nav-bar : Page -> String 
  -- The navigation bar at the top and bottom of each sequence page.
 
  op $contents : -> String
  op $contents-from : Page -> String 
  -- $contents is the list of page titles to be used in the contents.
  -- $contents-from(P) is the list from and including P. 

  eq content( contents-page ) = "<HTML><HEAD>" ++
                                "<TITLE>Contents</TITLE>" ++
                                "</HEAD><BODY>" ++ 
                                "<H1>Contents</H1>" ++ 
                                "<UL>" ++
                                $contents ++
                                "</UL>" ++                
                                "</BODY></HTML>" .

  eq content( P:SequencePage ) = "<HTML><HEAD>" ++
                                 "<TITLE>" ++ title(P) ++ "</TITLE>" ++
                                 "</HEAD><BODY>" ++ 
                                 $nav-bar(P) ++
                                 "<HR>" ++ 
                                 "<H1>" ++ title(P) ++ "</H1>" ++ 
                                 body(P) ++ 
                                 "<HR>" ++
                                 $nav-bar(P) ++
                                 "</BODY></HTML>" .

  eq $nav-bar( P:StartPage ) = <A HREF= contents-page > "Contents" </A> ++
                               <A HREF= succ(P) > "Next" </A> .
  eq $nav-bar( P:EndPage ) = <A HREF= contents-page > "Contents" </A> ++
                             <A HREF= pred(P) > "Last" </A> .
  eq $nav-bar( P:InteriorPage ) = <A HREF= contents-page > "Contents" </A> ++
                                  <A HREF= succ(P) > "Next" </A> ++ 
                                  <A HREF= pred(P) > "Last" </A> .

  eq $contents = $contents-from( first ) .

  eq $contents-from( P:EndPage ) = "<LI>" ++ <A HREF= P > title(P) </A> .
  eq $contents-from( P:StartPage ) = "<LI>" ++ <A HREF= P > title(P) </A> ++ $contents-from( succ(P) ) .
  eq $contents-from( P:InteriorPage ) = "<LI>" ++ <A HREF= P > title(P) </A> ++ $contents-from( succ(P) ) . }


-- Now we consider how to generate the pages in a collection
-- to file. We'll need utilities such as lists and methods
-- for writing to files first.


mod! LISTS  ( ELT :: TRIV ) principal-sort List
{
  -- This module defines the space of lists. It was originally 
  -- written by Razvan Diaconescu as part of the CafeOBJ examples.
  -- We'll need it when constructing lists of all the pages
  -- defined by a module.

  [ NonEmptyList < List ]

  op nil : -> List

  op cons : Elt List -> NonEmptyList

  op cons : Elt -> NonEmptyList

  op car : NonEmptyList -> Elt
  op cdr : NonEmptyList -> List 

  op _+_ : List List -> List

  vars L L' : List
  var E : Elt

  eq car(cons(E,L)) = E .

  eq cdr(cons(E,L)) = L .

  eq nil + L = L .
  eq cons(E,L) +  L' = cons(E,(L + L')) . 
}


make PAGE-LISTS ( LISTS(PAGES) * { sort List -> PageList } ) .
  -- This defines the space of lists of pages.


module* WRITE-FILE
{
  -- This module calls Lisp to define a 'function'
  -- write-string--to-file(S,F) which writes the 
  -- string S to file F. It does return a value, 
  -- but we use it only for its side-effect, in WRITE-PAGE.
  --
  -- I use the subsort Void for the coarity of functions 
  -- derived from 'write-string-to-file', where it's only 
  -- the side-effect that's of interest. If I need to write 
  -- an explicit return value, I use the operation 'void' (or, 
  -- if in Lisp, an equivalent symbol).
  --
  -- I don't know exactly what #! does, since there's
  -- no documentation about it. Looking at the libraries,
  -- there are two ways to invoke Lisp, #! and #!!. 
  -- The second doesn't work here: I think #! discards
  -- the sort tags from variables when passing them 
  -- to Lisp, whereas #!! keeps them. For 'write-string-to-file',
  -- we don't need the tags, only the variables' values
  -- as strings.

  protecting( STRING )

  [ Void < String ]

  op write-string-to-file : String String -> Void

  op void : -> Void

  eq write-string-to-file( Content:String, Filename:String ) = 
    #! (let ( ( 
                stream (open Filename :direction :output)
              ) 
            ) 
         (write-string Content stream) 
         (close stream) 
         ""
        ) .

  eq void = "" .
}


module* WRITE-PAGE
{
  -- This module defines a 'function' whose side-effect is
  -- to write a page's content to its file. This only makes
  -- sense when applied to pages whose filestore location
  -- we know, namely those on our own server. We use it
  -- in the Web-page generator below.

  protecting( WRITE-FILE + PAGE-VS-URL + URL-VS-FILENAME + CONTENT )

  op write-page : Page -> Void

  eq write-page( P:Page ) = write-string-to-file( content(P), filename(url(P)) ) .
}


module* PAGE-COLLECTION
{
  -- See GENERATOR, below.

  protecting( PAGE-LISTS )

  op all-pages : -> PageList
}


module* GENERATOR( PAGE-COLLECTION :: PAGE-COLLECTION )
{
  -- This module defines the operation 'generate'. This
  -- iterates over all pages in a PAGE-COLLECTION, obtaining
  -- the content and filename for each, and calling 
  -- write.WRITE-FILE to write the content to the file.
  -- Since we use it for its side-effects only, we 
  -- give it the coarity Void, as in WRITE-FILE.
  --
  -- The interface between 'generate' and the PAGE-COLLECTION
  -- is via all-pages. Any PAGE-COLLECTION must define
  -- all-pages to return a list of all the pages in
  -- the collection.
  --
  -- The code for $generate is less efficient than it
  -- need be. Since we don't want its result, there's
  -- no need to use ++ to concatenate results together.
  -- But I can't think of any other way to force
  -- evaluation of both write-page and the recursive
  -- call to $generate in the second equation. 

  protecting( WRITE-PAGE + PAGE-LISTS )

  op generate : -> Void

  op $generate : PageList -> Void
  -- Private operation which writes each page
  -- in its argument.

  eq generate = $generate( all-pages ) .

  eq $generate( nil ) = void .
  eq $generate( cons( P:Page, Tail:PageList ) ) =
       write-page( P ) ++ $generate( Tail ) .
}


module* SLIDESHOW-PAGE-COLLECTION( SLIDESHOW :: SLIDESHOW )
{
  -- This module extends SLIDESHOW to define all-pages.
  -- Conceptually, this is not part of the slideshow itself,
  -- which is why I define it separately.

  protecting( PAGE-COLLECTION )

  op $the-sequence-from : Page -> PageList
  -- Private operation which returns a list of all pages in
  -- the slideshow, starting from its first argument.

  eq all-pages = cons( contents-page, $the-sequence-from(first) ) .

  eq $the-sequence-from( P:EndPage ) = cons( P, nil ) .
  eq $the-sequence-from( P:StartPage ) = cons( P, $the-sequence-from( succ(P) ) ) .
  eq $the-sequence-from( P:InteriorPage ) = cons( P, $the-sequence-from( succ(P) ) ) .
}


module! MM-SLIDESHOW
{
  -- This module instantiates SLIDESHOW to define
  -- a presentation about MM.

  protecting( SLIDESHOW )

  ops p2 p3 : -> InteriorPage

  eq body( first ) = "This is an introduction to MM." .
  eq body( p2 ) = "This is more about MM." .
  eq body( p3 ) = "This is the third page in the presentation." .
  eq body( last ) = "This is the final page." . 

  eq title( first ) = "Introduction" .
  eq title( p2 ) = "Background" .
  eq title( p3 ) = "Details" .
  eq title( last ) = "Prospects" .

  eq url(contents-page) = "/contents.html" .
  eq url( first ) = "/first.html" .
  eq url( p2 ) = "/p2.html" .
  eq url( p3 ) = "/p3.html" .
  eq url( last ) = "/last.html" .
  
  eq succ( first ) = p2 .
  eq succ( p2 ) = p3 .
  eq succ( p3 ) = last .

  eq pred( last ) = p3 .
  eq pred( p3 ) = p2 .
  eq pred( p2 ) = first .
}


module! MY-SERVER
{
  -- This module extends DOCUMENT-ROOT-MAPPING to
  -- define the directory separator and URL-to-filename
  -- mapping used by my server. GENERATOR must have
  -- the latter before it can evaluate 'filename'.

  protecting( DOCUMENT-ROOT-MAPPING )

  eq document-root = "/home/popx" .
  eq directory-separator = "/" .
}


make MM-SLIDESHOW-GENERATOR ( GENERATOR(SLIDESHOW-PAGE-COLLECTION(MM-SLIDESHOW)) +
                              MY-SERVER
                            )
  -- This defines a generator for the MM slideshow, using the 
  -- URL-to-filename mapping of MY-SERVER. Open the module
  -- and reduce 'generate' (assuming that your directory
  -- separator is "\" and the directory kb7\mm\cafeobj\ exists),
  -- and you will obtain files containing all the pages.

22nd May 2000.

[ Jocelyn Ireson-Paine's Home Page | Report on Visit to University of Minho, May 2000 ]

[ Jocelyn Ireson-Paine's Home Page | Free software | Publications | Algebraic Web specification | Report on Visit to University of Minho, May 2000 ]