(gawkinet.info.gz) Interacting Service

Info Catalog (gawkinet.info.gz) Primitive Service (gawkinet.info.gz) Using Networking (gawkinet.info.gz) Simple Server
 
 2.9 A Web Service with Interaction
 ==================================
 
 This node shows how to set up a simple web server.  The subnode is a
 library file that we will use with all the examples in Some
 Applications and Techniques.
 

Menu

 
* CGI Lib                     A simple CGI library.
 
    Setting up a web service that allows user interaction is more
 difficult and shows us the limits of network access in `gawk'. In this
 node, we develop a main program (a `BEGIN' pattern and its action)
 that will become the core of event-driven execution controlled by a
 graphical user interface (GUI).  Each HTTP event that the user triggers
 by some action within the browser is received in this central
 procedure. Parameters and menu choices are extracted from this request,
 and an appropriate action is taken according to the user's choice.
 For example:
 
      BEGIN {
        if (MyHost == "") {
           "uname -n" | getline MyHost
           close("uname -n")
        }
        if (MyPort ==  0) MyPort = 8080
        HttpService = "/inet/tcp/" MyPort "/0/0"
        MyPrefix    = "http://" MyHost ":" MyPort
        SetUpServer()
        while ("awk" != "complex") {
          # header lines are terminated this way
          RS = ORS = "\r\n"
          Status   = 200          # this means OK
          Reason   = "OK"
          Header   = TopHeader
          Document = TopDoc
          Footer   = TopFooter
          if        (GETARG["Method"] == "GET") {
              HandleGET()
          } else if (GETARG["Method"] == "HEAD") {
              # not yet implemented
          } else if (GETARG["Method"] != "") {
              print "bad method", GETARG["Method"]
          }
          Prompt = Header Document Footer
          print "HTTP/1.0", Status, Reason       |& HttpService
          print "Connection: Close"              |& HttpService
          print "Pragma: no-cache"               |& HttpService
          len = length(Prompt) + length(ORS)
          print "Content-length:", len           |& HttpService
          print ORS Prompt                       |& HttpService
          # ignore all the header lines
          while ((HttpService |& getline) > 0)
              ;
          # stop talking to this client
          close(HttpService)
          # wait for new client request
          HttpService |& getline
          # do some logging
          print systime(), strftime(), $0
          # read request parameters
          CGI_setup($1, $2, $3)
        }
      }
 
    This web server presents menu choices in the form of HTML links.
 Therefore, it has to tell the browser the name of the host it is
 residing on. When starting the server, the user may supply the name of
 the host from the command line with `gawk -v MyHost="Rumpelstilzchen"'.
 If the user does not do this, the server looks up the name of the host
 it is running on for later use as a web address in HTML documents. The
 same applies to the port number. These values are inserted later into
 the HTML content of the web pages to refer to the home system.
 
    Each server that is built around this core has to initialize some
 application-dependent variables (such as the default home page) in a
 procedure `SetUpServer', which is called immediately before entering the
 infinite loop of the server. For now, we will write an instance that
 initiates a trivial interaction.  With this home page, the client user
 can click on two possible choices, and receive the current date either
 in human-readable format or in seconds since 1970:
 
      function SetUpServer() {
        TopHeader = "<HTML><HEAD>"
        TopHeader = TopHeader \
           "<title>My name is GAWK, GNU AWK</title></HEAD>"
        TopDoc    = "<BODY><h2>\
          Do you prefer your date <A HREF=" MyPrefix \
          "/human>human</A> or \
          <A HREF=" MyPrefix "/POSIX>POSIXed</A>?</h2>" ORS ORS
        TopFooter = "</BODY></HTML>"
      }
 
    On the first run through the main loop, the default line terminators
 are set and the default home page is copied to the actual home page.
 Since this is the first run, `GETARG["Method"]' is not initialized yet,
 hence the case selection over the method does nothing. Now that the
 home page is initialized, the server can start communicating with a
 client browser.
 
    It does so by printing the HTTP header into the network connection
 (`print ... |& HttpService'). This command blocks execution of the
 server script until a client connects. If you compare this server
 script with the primitive one we wrote before, you will notice two
 additional lines in the header. The first instructs the browser to
 close the connection after each request. The second tells the browser
 that it should never try to _remember_ earlier requests that had
 identical web addresses (no caching). Otherwise, it could happen that
 the browser retrieves the time of day in the previous example just once,
 and later it takes the web page from the cache, always displaying the
 same time of day although time advances each second.
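 
    To make the layout of the reply easier to inspect, the following
 sketch assembles the same header and body into a single string instead
 of printing them line by line.  The function name `BuildResponse' and
 its parameters are our own invention for illustration and are not part
 of the server above; the length is computed the same way as in the
 main loop.

```awk
# Hypothetical helper, not part of the original server: assemble the
# complete reply text that the main loop sends line by line.
function BuildResponse(status, reason, body,    eol, len) {
  eol = "\r\n"                          # HTTP line terminator
  len = length(body) + length(eol)      # same arithmetic as the main loop
  return "HTTP/1.0 " status " " reason eol \
         "Connection: Close"           eol \
         "Pragma: no-cache"            eol \
         "Content-length: " len        eol \
         eol body eol
}

BEGIN {
  # Print a sample reply to standard output for inspection.
  printf "%s", BuildResponse(200, "OK", "<HTML><BODY>hello</BODY></HTML>")
}
```

    The empty line between the header lines and the document is what
 tells the browser where the header ends.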
 
    Having supplied the initial home page to the browser with a valid
 document stored in the variable `Prompt', the server closes the
 connection and waits for the next request.  When the request comes, a
 log line is printed that allows us to see which request the server
 receives. The final step in the loop is to call the function
 `CGI_setup', which processes the request line (coming from the
 browser) and stores the transmitted parameters in the arrays `PARAM'
 and `GETARG'. The complete text of these application-independent
 functions can be found in A Simple CGI Library (node CGI Lib).  For
 now, we use a simplified version of `CGI_setup':
 
      function CGI_setup(method, uri, version,    i, j) {
        delete GETARG;         delete MENU;        delete PARAM
        GETARG["Method"] = method
        GETARG["URI"] = uri
        GETARG["Version"] = version
        i = index(uri, "?")
        # is there a "?" indicating a CGI request?
        if (i > 0) {
          split(substr(uri, 1, i-1), MENU, "[/:]")
          split(substr(uri, i+1), PARAM, "&")
          for (i in PARAM) {
            j = index(PARAM[i], "=")
            GETARG[substr(PARAM[i], 1, j-1)] = \
                                        substr(PARAM[i], j+1)
          }
        } else {    # there is no "?", no need for splitting PARAMs
          split(uri, MENU, "[/:]")
        }
      }
 
    First, the function clears all variables used for global storage
 of request parameters. The rest of the function fills these global
 variables with the newly extracted values.  To
 accomplish this, the name of the requested resource is split into parts
 and stored for later evaluation. If the request contains a `?', then
 the request has CGI variables seamlessly appended to the web address.
 Everything in front of the `?' is split up into menu items, and
 everything behind the `?' is a list of `VARIABLE=VALUE' pairs
 (separated by `&') that also need splitting. This way, CGI variables are
 isolated and stored. This procedure lacks recognition of special
 characters that are transmitted in coded form(1). Here, any optional
 request header and body parts are ignored. We do not need header
 parameters or the request body. However, when refining our approach or
 working with the `POST' and `PUT' methods, reading the header and body
 becomes necessary. Both the header parameters and the body should then
 be stored in global arrays.
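 
    As a hedged illustration of what recognizing coded characters could
 look like, the following sketch decodes `%XX' escapes and treats `+'
 as a space, the way browsers usually code form data.  The name
 `UrlDecode' is hypothetical and not taken from the CGI library; the
 sketch relies on the `gawk' extension `strtonum' and is meant for
 ASCII characters only, since the result of `%c' for codes above 127
 depends on the locale.

```awk
# Hypothetical helper: undo the "%XX" and "+" coding of a CGI value.
# Requires gawk (strtonum is a gawk extension).
function UrlDecode(s,    out, hex) {
  gsub(/\+/, " ", s)                       # "+" stands for a space
  out = ""
  while (match(s, /%[0-9A-Fa-f][0-9A-Fa-f]/)) {
    hex = substr(s, RSTART + 1, 2)         # the two hex digits
    out = out substr(s, 1, RSTART - 1) sprintf("%c", strtonum("0x" hex))
    s   = substr(s, RSTART + RLENGTH)      # continue after the escape
  }
  return out s                             # append the undecoded rest
}
```

    In `CGI_setup', such a function could be applied to each name and
 value before they are stored in `GETARG'.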
 
    On each subsequent run through the main loop, one request from a
 browser is received, evaluated, and answered according to the user's
 choice. This can be done by letting the value of the HTTP method guide
 the main loop into execution of the procedure `HandleGET', which
 evaluates the user's choice. In this case, we have only one
 hierarchical level of menus, but in the general case, menus are nested.
 The menu choices at each level are separated by `/', just as in file
 names. Notice how simple it is to construct menus of arbitrary depth:
 
      function HandleGET() {
        if (       MENU[2] == "human") {
          Footer = strftime() TopFooter
        } else if (MENU[2] == "POSIX") {
          Footer = systime()  TopFooter
        }
      }
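 
    To illustrate deeper menus, here is a minimal sketch of a two-level
 dispatch.  The menu paths `/date/human', `/date/POSIX', and `/host',
 as well as the function name `MenuAnswer', are invented for this
 example; the sketch splits the address itself and returns the text
 instead of modifying `Footer', so that it can be tried without a
 running server.

```awk
# Hypothetical two-level variant of HandleGET; the menu paths are
# invented for illustration.
function MenuAnswer(uri,    MENU) {
  split(uri, MENU, "[/:]")          # "/date/human" -> "", "date", "human"
  if (MENU[2] == "date") {          # first menu level
    if (MENU[3] == "human") return strftime()
    if (MENU[3] == "POSIX") return systime()
    return "date: unknown format"   # second level not recognized
  } else if (MENU[2] == "host") {
    return "host menu"
  }
  return ""                         # top level: nothing to add
}

BEGIN {
  print MenuAnswer("/date/human")
  print MenuAnswer("/date/POSIX")
}
```

    Each additional `/' in the address simply becomes one more element
 of `MENU' to test.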
 
    The disadvantage of this approach is that our server is slow and can
 handle only one request at a time. Its main advantage, however, is that
 the server consists of just one `gawk' program. No need for installing
 an `httpd', and no need for static separate HTML files, CGI scripts, or
 `root' privileges. This is rapid prototyping.  This program can be
 started on the same host that runs your browser.  Then point your
 browser to `http://localhost:8080'.
 
    It is also possible to include images in the HTML pages.  Most
 browsers support the little-known `.xbm' format, which may
 contain only monochrome pictures but is an ASCII format. Binary images
 are possible but not so easy to handle. Another way of including images
 is to generate them with a tool such as GNUPlot, by calling the tool
 with the `system' function or through a pipe.
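 
    Because `.xbm' is plain text, such an image can be produced with
 ordinary string operations.  The following is only a sketch under our
 own assumptions: the function name `MakeXBM' and the convention of
 drawing rows with `#' and `.' are invented for this example, and how
 the result would be delivered to the browser is left open.

```awk
# Hypothetical helper: turn rows of "#" (black) and "." (white) into
# XBM text.  In XBM, each byte holds 8 pixels and the lowest bit is
# the leftmost pixel; rows are padded to a whole number of bytes.
function MakeXBM(name, rows, height,    width, r, c, byte, bit, out, sep) {
  width = length(rows[1])
  out = "#define " name "_width "  width  "\n" \
        "#define " name "_height " height "\n" \
        "static char " name "_bits[] = {"
  sep = ""
  for (r = 1; r <= height; r++) {
    for (c = 1; c <= width; c += 8) {      # one byte per 8 pixels
      byte = 0
      for (bit = 0; bit < 8; bit++)
        if (substr(rows[r], c + bit, 1) == "#")
          byte += 2 ^ bit
      out = out sep sprintf("0x%02x", byte)
      sep = ", "
    }
  }
  return out "};\n"
}

BEGIN {
  # A small 8x6 cross, written to standard output.
  n = split("#......#,.#....#.,..####..,..####..,.#....#.,#......#", img, ",")
  printf "%s", MakeXBM("demo", img, n)
}
```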
 
    ---------- Footnotes ----------
 
    (1) As defined in RFC 2068.
 