[ Jocelyn Ireson-Paine's Home Page | Free software | Publications | Algebraic Web specification | Report on Visit to University of Minho, May 2000 ]
-- web2.mod -- We start with basic definitions for Web pages, URLs -- and so forth. module* PAGES { -- This module introduces the sort Page, denoting -- the space of complete Web pages. -- Other modules will define functions mapping -- between Page and URLs, filenames, and content. -- -- We shall probably only use terms of sort Page to -- represent pages resident on our own server. We -- can certainly use CafeOBJ to specify the behaviour -- of pages on any server, but we can only have -- control over our own (except perhaps via the -- use of PUT or other methods for writing to remote -- servers). So when using these modules -- as components of the algebraic web site generator, -- it's probably not useful to name pages elsewhere. [ Page ] } module* CONTENT { -- This module introduces the notion of a mapping from -- pages to their contents. We assume that every page's -- content can be represented as a string. In -- most cases, this will be HTML. protecting( STRING ) protecting( PAGES ) op content : Page -> String } module* CONTENT' { -- This is an alternative version of CONTENT. -- Though most pages contain HTML, not all do. -- Each page has a so-called content type or -- MIME type. A browser will either infer this -- from the extension in the page's URL (e.g. -- ".htm" and ".html" indicate HTML; ".gif" indicates -- the GIF graphical format), or from a -- 'Content-type' header sent back by the server -- (e.g. the header "Content-type: text/html" -- indicates HTML; the header "Content-type: image/gif" -- indicates GIF). The browser uses the content -- type to decide how to interpret and display -- the page's content. E.g. a browser will -- render the content "<H1>Header</H1>" as a bold -- heading in some big font if the content type -- is HTML; but if the content type is "text/plain", -- the browser will display the HTML H1 tags verbatim -- rather than interpreting them, and will probably use a -- monospaced font such as Courier throughout. -- -- Below, I've defined sort ContentType, and some -- constants. Other modules might want to add other -- constants to the sort. Does the web site generator -- need ContentType? Probably, it's too early to say. -- Instead of saying that the function 'content' maps -- all pages to strings, should we define subsorts -- of String for different content, as I've done below? -- If so, how do we specify that, e.g. 'content' -- will only map a page to an HTML if the page's -- conteot-type is text/html? protecting( STRING ) protecting( PAGES ) [ ContentType ] [ Content < String ] [ GIF < Content ] [ HTML < Content ] [ Text < Content ] ops image/gif text/html text/plain : -> ContentType op content-type : Page -> ContentType op content : Page -> String } module* URLS { -- This module introduces the space of URLs. -- For the moment, we ignore the internal structure -- of these (e.g. the way they're divided by "/" -- separators) and just represent them as strings. -- In future, we might want not to ignore the -- structure: perhaps represent each URL as a list of -- elements, and have a function that converts -- the list -- [ e1 e2 ... e_n ] -- to the string -- e1 ++ "/" ++ e2 ++ "/" ++ ... ++ e_n. -- -- We might also want to introduce functions -- that normalise URLs (e.g. converting -- "http://site/dir1//dir2" to "http://site/dir1/dir2") -- and that test whether two URLs name the same page -- (do this by comparing normalised forms, perhaps). -- Some transformations (such as the above) can be -- done without knowledge of the site whose pages the -- URLs name, I think. Others will be site-dependent -- (e.g. knowing that "/dir1/" denotes the default -- page "/dir1/index.htm"). -- -- We do not assume that every element of sort URL -- actually names a page. -- -- The subsort SiteAbsoluteUrl is for URLs of pages -- living on the current server. These are special -- because the web site generator can create them and -- know their location in its filestore. They -- will not contain an "http://" prefix or a site -- address, and will start with a "/" (e.g. -- "/dir1/index.htm"). protecting( STRING ) [ URL < String ] [ SiteAbsoluteURL < URL ] } module* PAGE-VS-URL { -- This module introduces mappings between the -- space of pages and of URLs. The function -- 'page' will be partial, since not every URL -- actually names a page. 'url' can be total. -- -- Should we use sort constraints to distinguish -- URLs that do name pages from those that don't? -- If so, how, given that the set of pages changes -- with time? protecting( PAGES + URLS ) op page : URL -> Page op url : Page -> URL } module* FILENAMES { -- This module introduces the spaces of filenames -- and directory names. We shall need these when -- mapping URLs to locations in our servers filestore. -- As with URLs, we might later want to represent the -- internal structure of filenames rather than just -- representing them as strings. -- -- We define two sorts, Filename and DirectoryName. -- Some operating systems treat directories as -- special kinds of file, but it's probably cleaner -- to think of the two as different. -- -- Question: are these relative or absolute filenames, -- and do we need to distinguish the two by subsorting? -- -- I assume that directory names do not end in the -- directory-separator character. This assumption is -- important when prepending them to URLs to make -- filenames. protecting( STRING ) [ Filename < String ] [ DirectoryName < String ] } module* URL-VS-FILENAME { -- This module introduces the idea of a mapping between -- a page's URL and its location in the server's -- filestore. Though we can talk about this for -- any server, we only have control over it on -- our own, so I've restricted it to SiteAbsoluteURL. -- -- The mapping from URLs to filenames is total. -- That from filenames to URLs will be partial, -- since some files will be outside the region -- of filestore which the server is allowed to -- send pages from. protecting( URLS + FILENAMES ) op filename : SiteAbsoluteURL -> Filename op url : Filename -> SiteAbsoluteURL } module* DOCUMENT-ROOT-MAPPING { -- This module describes the most common way -- of mapping between URLs and filenames. The -- server's filestore has a 'document root' -- directory. The server is allowed to make -- any files in this directory or its subdirectories -- visible as Web pages. To convert a URL -- to a filename, it prepends the document -- root, and then converts all "/" characters in the -- URL to whatever character its operating -- system uses as a directory separator (e.g. -- "/" on Unix, "\" on Windows). -- -- To describe your own server's document-root mapping, -- protect this module and give equations defining -- document-root and directory-separator. document-root -- should be a directory name, without a trailing -- directory separator. It should use "/" for directory -- separators, even if your own system uses "\" or -- some other character. This is merely because there seems -- to be a bug in CafeOBJ which makes it pass -- "\" characters in strings to Lisp incorrectly. -- directory-separator should be the directory separator: -- if this is "\", write it as "\\", which despite the -- bug, does seem to get passed to Lisp OK. protecting( URL-VS-FILENAME ) op document-root : -> DirectoryName op directory-separator : -> String op $substitute : String String -> String -- Private operator which replaces all "/"'s in its -- first argument by the second argument. The second -- argument will be directory-separator, but we need -- to pass it as a variable: I can't find a way to make -- Lisp refer to terms other than via CafeOBJ -- variables. var U : SiteAbsoluteURL eq filename( U ) = $substitute( document-root ++ U, directory-separator ) . eq $substitute( URL:String, Sep:String ) = #! (substitute (char Sep 0) #\/ URL) . } -- Next, we consider how Web pages can be grouped into -- collections. We need some utilities first, and then -- define a particular kind of collection, the slideshow. module* HTML-UTILITIES { -- This module defines some utilities for generating HTML. -- We'll need them when linking pages within collections. -- The <A HREF=_>_</A> operation places a link, as its name -- suggests. protecting( PAGE-VS-URL ) op <A HREF=_>_</A> : Page String -> String op <A HREF=_>_</A> : URL String -> String eq <A HREF= P:Page > S:String </A> = <A HREF= url(P) > S </A> . eq <A HREF= U:URL > S:String </A> = "<A HREF=" ++ U ++ ">" ++ S ++ "</A>" . } module* SLIDESHOW { -- This module describes a specific kind of collection of -- pages, namely a 'slideshow'. This is examplified by the -- MM presentation on my site at -- http://www.j-paine.org/mm_presentation/dangerous.html . -- It's a sequence of pages intended for linear reading. -- Each page except the last has a successor, and each page -- except the first has a predecessor. -- -- To use this module, define some extra pages p2 ... p_n . -- Define the ordering with equations for 'succ' such that -- last = succ(p_n), p_n = succ(p_n-1), ... p2 = succ(first). -- and for 'pred' such that -- first = pred(p2), p3 = pred(p2), ... p_n = pred(last). -- Define the pages' titles with equations for 'title' such that -- title(p_i) = ...The string between <TITLE> and </TITLE> -- tags for page p_i. The generator will also -- use this for the page's heading, placing -- it betweem <H1> and </H1> tags... -- Define the pages' bodies with equations for 'body' such that -- body(p_i) = ...The content for page p_i that comes after -- the heading... -- Define the pages' URLs with equations for 'url' such that -- url(p_i) = ...The site-absolute URL for page p_i... -- Define the URL for the contents page (see below) with an equation -- for 'url' such that -- url(contents-page) = ...The site-absolute URL for the contents page... -- -- Module CONTENT introduces the function 'content' from -- a page to its complete contents. SLIDESHOWS defines 'content' -- in terms of the page's title, body and links. After placing -- the title in the page's <HEAD> section, it generates a -- navigation bar linking to the page's successor and precessor -- (or to one of these if the page is at the start or end). It -- then reuses the title to generate the heading, as described -- above. It then places the page's body, following it with -- another copy of the navigation bar and the closing -- </BODY> and </HTML> tags. SLIDESHOW also generates a -- contents page, linking to each page in order. It places -- a link to this in each navigation bar. -- -- In real-life use, the user would want more control over -- location of the navigation bar and whether, e.g. the 'Next' -- link should appear before or after the 'Last' one, and -- the words to be used. We could define these by providing -- operations to construct templates for pages and their parts, -- terms constructed from which this module interprets. For -- example, -- eq nav-bar = "Up" ++ UP-LINK ++ -- " Next " ++ NEXT-LINK ++ " Last " ++ LAST-LINK -- or -- eq nav-bar = "Inhoud" ++ UP-LINK ++ -- " Vorige pagina " ++ LAST-LINK ++ " Volgende pagina " ++ NEXT-LINK -- where the capitalised operations get replaced by appropriate -- links for each page. -- -- Question: I haven't declared a sort to represent slideshows. -- Would that be better? -- -- Operations whose names begin with "$" are to be treated as -- private to the module. protecting( PAGES + CONTENT + PAGE-VS-URL + HTML-UTILITIES ) [ SequencePage < Page ] [ StartPage < SequencePage ] [ InteriorPage < SequencePage ] [ EndPage < SequencePage ] [ ContentsPage < Page ] op succ : StartPage -> Page op succ : InteriorPage -> Page op pred : InteriorPage -> Page op pred : EndPage -> Page op first : -> StartPage op last : -> EndPage op body : Page -> String op title : Page -> String op contents-page : -> ContentsPage op url : Page -> SiteAbsoluteURL -- Specialise 'url' so that 'filename' can be reduced. op $nav-bar : Page -> String -- The navigation bar at the top and bottom of each sequence page. op $contents : -> String op $contents-from : Page -> String -- $contents is the list of page titles to be used in the contents. -- $contents-from(P) is the list from and including P. eq content( contents-page ) = "<HTML><HEAD>" ++ "<TITLE>Contents</TITLE>" ++ "</HEAD><BODY>" ++ "<H1>Contents</H1>" ++ "<UL>" ++ $contents ++ "</UL>" ++ "</BODY></HTML>" . eq content( P:SequencePage ) = "<HTML><HEAD>" ++ "<TITLE>" ++ title(P) ++ "</TITLE>" ++ "</HEAD><BODY>" ++ $nav-bar(P) ++ "<HR>" ++ "<H1>" ++ title(P) ++ "</H1>" ++ body(P) ++ "<HR>" ++ $nav-bar(P) ++ "</BODY></HTML>" . eq $nav-bar( P:StartPage ) = <A HREF= contents-page > "Contents" </A> ++ <A HREF= succ(P) > "Next" </A> . eq $nav-bar( P:EndPage ) = <A HREF= contents-page > "Contents" </A> ++ <A HREF= pred(P) > "Last" </A> . eq $nav-bar( P:InteriorPage ) = <A HREF= contents-page > "Contents" </A> ++ <A HREF= succ(P) > "Next" </A> ++ <A HREF= pred(P) > "Last" </A> . eq $contents = $contents-from( first ) . eq $contents-from( P:EndPage ) = "<LI>" ++ <A HREF= P > title(P) </A> . eq $contents-from( P:StartPage ) = "<LI>" ++ <A HREF= P > title(P) </A> ++ $contents-from( succ(P) ) . eq $contents-from( P:InteriorPage ) = "<LI>" ++ <A HREF= P > title(P) </A> ++ $contents-from( succ(P) ) . } -- Now we consider how to generate the pages in a collection -- to file. We'll need utilities such as lists and methods -- for writing to files first. mod! LISTS ( ELT :: TRIV ) principal-sort List { -- This module defines the space of lists. It was originally -- written by Razvan Diaconescu as part of the CafeOBJ examples. -- We'll need it when constructing lists of all the pages -- defined by a module. [ NonEmptyList < List ] op nil : -> List op cons : Elt List -> NonEmptyList op cons : Elt -> NonEmptyList op car : NonEmptyList -> Elt op cdr : NonEmptyList -> List op _+_ : List List -> List vars L L' : List var E : Elt eq car(cons(E,L)) = E . eq cdr(cons(E,L)) = L . eq nil + L = L . eq cons(E,L) + L' = cons(E,(L + L')) . } make PAGE-LISTS ( LISTS(PAGES) * { sort List -> PageList } ) . -- This defines the space of lists of pages. module* WRITE-FILE { -- This module calls Lisp to define a 'function' -- write-string--to-file(S,F) which writes the -- string S to file F. It does return a value, -- but we use it only for its side-effect, in WRITE-PAGE. -- -- I use the subsort Void for the coarity of functions -- derived from 'write-string-to-file', where it's only -- the side-effect that's of interest. If I need to write -- an explicit return value, I use the operation 'void' (or, -- if in Lisp, an equivalent symbol). -- -- I don't know exactly what #! does, since there's -- no documentation about it. Looking at the libraries, -- there are two ways to invoke Lisp, #! and #!!. -- The second doesn't work here: I think #! discards -- the sort tags from variables when passing them -- to Lisp, whereas #!! keeps them. For 'write-string-to-file', -- we don't need the tags, only the variables' values -- as strings. protecting( STRING ) [ Void < String ] op write-string-to-file : String String -> Void op void : -> Void eq write-string-to-file( Content:String, Filename:String ) = #! (let ( ( stream (open Filename :direction :output) ) ) (write-string Content stream) (close stream) "" ) . eq void = "" . } module* WRITE-PAGE { -- This module defines a 'function' whose side-effect is -- to write a page's content to its file. This only makes -- sense when applied to pages whose filestore location -- we know, namely those on our own server. We use it -- in the Web-page generator below. protecting( WRITE-FILE + PAGE-VS-URL + URL-VS-FILENAME + CONTENT ) op write-page : Page -> Void eq write-page( P:Page ) = write-string-to-file( content(P), filename(url(P)) ) . } module* PAGE-COLLECTION { -- See GENERATOR, below. protecting( PAGE-LISTS ) op all-pages : -> PageList } module* GENERATOR( PAGE-COLLECTION :: PAGE-COLLECTION ) { -- This module defines the operation 'generate'. This -- iterates over all pages in a PAGE-COLLECTION, obtaining -- the content and filename for each, and calling -- write.WRITE-FILE to write the content to the file. -- Since we use it for its side-effects only, we -- give it the coarity Void, as in WRITE-FILE. -- -- The interface between 'generate' and the PAGE-COLLECTION -- is via all-pages. Any PAGE-COLLECTION must define -- all-pages to return a list of all the pages in -- the collection. -- -- The code for $generate is less efficient than it -- need be. Since we don't want its result, there's -- no need to use ++ to concatenate results together. -- But I can't think of any other way to force -- evaluation of both write-page and the recursive -- call to $generate in the second equation. protecting( WRITE-PAGE + PAGE-LISTS ) op generate : -> Void op $generate : PageList -> Void -- Private operation which writes each page -- in its argument. eq generate = $generate( all-pages ) . eq $generate( nil ) = void . eq $generate( cons( P:Page, Tail:PageList ) ) = write-page( P ) ++ $generate( Tail ) . } module* SLIDESHOW-PAGE-COLLECTION( SLIDESHOW :: SLIDESHOW ) { -- This module extends SLIDESHOW to define all-pages. -- Conceptually, this is not part of the slideshow itself, -- which is why I define it separately. protecting( PAGE-COLLECTION ) op $the-sequence-from : Page -> PageList -- Private operation which returns a list of all pages in -- the slideshow, starting from its first argument. eq all-pages = cons( contents-page, $the-sequence-from(first) ) . eq $the-sequence-from( P:EndPage ) = cons( P, nil ) . eq $the-sequence-from( P:StartPage ) = cons( P, $the-sequence-from( succ(P) ) ) . eq $the-sequence-from( P:InteriorPage ) = cons( P, $the-sequence-from( succ(P) ) ) . } module! MM-SLIDESHOW { -- This module instantiates SLIDESHOW to define -- a presentation about MM. protecting( SLIDESHOW ) ops p2 p3 : -> InteriorPage eq body( first ) = "This is an introduction to MM." . eq body( p2 ) = "This is more about MM." . eq body( p3 ) = "This is the third page in the presentation." . eq body( last ) = "This is the final page." . eq title( first ) = "Introduction" . eq title( p2 ) = "Background" . eq title( p3 ) = "Details" . eq title( last ) = "Prospects" . eq url(contents-page) = "/contents.html" . eq url( first ) = "/first.html" . eq url( p2 ) = "/p2.html" . eq url( p3 ) = "/p3.html" . eq url( last ) = "/last.html" . eq succ( first ) = p2 . eq succ( p2 ) = p3 . eq succ( p3 ) = last . eq pred( last ) = p3 . eq pred( p3 ) = p2 . eq pred( p2 ) = first . } module! MY-SERVER { -- This module extends DOCUMENT-ROOT-MAPPING to -- define the directory separator and URL-to-filename -- mapping used by my server. GENERATOR must have -- the latter before it can evaluate 'filename'. protecting( DOCUMENT-ROOT-MAPPING ) eq document-root = "/home/popx" . eq directory-separator = "/" . } make MM-SLIDESHOW-GENERATOR ( GENERATOR(SLIDESHOW-PAGE-COLLECTION(MM-SLIDESHOW)) + MY-SERVER ) -- This defines a generator for the MM slideshow, using the -- URL-to-filename mapping of MY-SERVER. Open the module -- and reduce 'generate' (assuming that your directory -- separator is "\" and the directory kb7\mm\cafeobj\ exists), -- and you will obtain files containing all the pages.
22nd May 2000.
[ Jocelyn Ireson-Paine's Home Page | Report on Visit to University of Minho, May 2000 ]
[ Jocelyn Ireson-Paine's Home Page | Free software | Publications | Algebraic Web specification | Report on Visit to University of Minho, May 2000 ]