Web Programming

John "Scooter" Morris

May 21, 2008

Portions Copyright © 2005-06 Python Software Foundation.

Introduction

Web Programming

The Hypertext Transfer Protocol

HTTP Request Line

Headers

Body

HTTP Response

HTTP Response Codes

Questions on HTTP?

Client Programming

HTML Forms

Creating Forms

A Simple Form

[A Simple Form]

Figure 4: A Simple Form

<html>
  <body>
    <form>
      <p>Sequence: <input type="text"/>
      Search type:
      <select>
        <option>Exact match</option>
        <option>Similarity match</option>
        <option>Sub-match</option>
      </select>
      </p>
      <p>Programs: 
      <input type="checkbox"/>
        FROG (version 1.1)
      <input type="checkbox"/>
        FROG (2.0 beta)
      <input type="checkbox"/>
        Bayes-Hart
      </p>
      <p>
        <input type="submit" value="Submit Query"/>
        <input type="reset" value="Reset"/>
      </p>
    </form>
  </body>
</html> 

Javascript

Why Javascript?

Basic syntax

Getting your script to execute

Getting your script to execute (events)

  • As an action (event) handler:
    • HTML has a number of defined "events" that are expressed as attributes to HTML tags. The target of the event is a script:
      <p>This is a 
      <span style="color:green;border:thin blue solid;" onclick="alert('Hi!');">clickable</span> 
      word</p> 
      		

    • Can also call your own function -- this is how form validation can be handled:
      <html>
      <body>
        <form>
          <script type="text/javascript">
            function validate()
            {
              var input = document.getElementById("number");
              var number = Number(input.value);
              if (number < 1 || number > 10 || isNaN(number)) {
                alert("Number must be between 1 and 10");
                input.value = "";
                return false;
              }
              return true;
            }
          </script>
          Enter a number between 1 and 10: 
             <input type="text" id="number" name="number" onchange="validate();"/>
        </form>
      </body>
      </html>

HTML Events

Tags               Attribute      Description
Window Events
body               onload         Script to be run when a document loads
body               onunload       Script to be run when a document unloads
Form Element Events
form elements      onchange       Script to be run when the element changes
form elements      onsubmit       Script to be run when the form is submitted
form elements      onreset        Script to be run when the form is reset
form elements      onselect       Script to be run when the element is selected
form elements      onblur         Script to be run when the element loses focus
form elements      onfocus        Script to be run when the element gets focus
Keyboard Events
content elements   onkeydown      What to do when key is pressed
content elements   onkeypress     What to do when key is pressed and released
content elements   onkeyup        What to do when key is released
Mouse Events
content elements   onclick        What to do on a mouse click
content elements   ondblclick     What to do on a mouse double-click
content elements   onmousedown    What to do when mouse button is pressed
content elements   onmousemove    What to do when mouse pointer moves
content elements   onmouseout     What to do when mouse pointer moves out of an element
content elements   onmouseover    What to do when mouse pointer moves over an element
content elements   onmouseup      What to do when mouse button is released
Table 2: HTML Events

Javascript and the DOM

  • The HTML DOM is exposed through the document object
  • DOM methods provide access to the HTML elements:
    • document.getElementById(), document.getElementsByName(), document.getElementsByTagNameNS(), document.getElementsByClassName()
    • Note that only getElementById() is singular. All the others return an array-like collection of elements
    • Once you've found the element, you can look at its properties:
      var x = element.innerHTML // the HTML contained within this element
      var x = element.style // the style information for this element
      var x = element.className // the element's class
      var x = element.attributes // the element's attributes
    • Elements also support the getElementsBy... methods (e.g., getElementsByTagName()). This allows you to walk the HTML tree, if you need to

Javascript and the DOM

  • DOM methods also provide a means to add to the HTML or to modify HTML elements:
    • document.createElementNS(), document.createAttribute(), and element.appendChild() provide the tools needed to add to an HTML document
    • Many of the element properties are mutable. So, to change the class of an element, you could just change the className property.
  • Playing with classes (and the Gettysburg Address):
    classTest.html
    <html>
    <head>
            <title>Class change test</title>
            <link type="text/css" rel="stylesheet" href="classTest.css"/>
            <script type="text/javascript" src="classTest.js" ></script>
    </head>
    <body>
    <h1 class="header" id="header" onclick="changeClass(this);">Playing with classes</h1>
    
    <p class="plain" onclick="changeClass(this);">Four score and seven years ago ...</p>
    <p class="plain" onclick="changeClass(this);">Now we are engaged ...</p>
    <p class="plain" id="final" onclick="changeClass(this);">But, in a larger sense, we can not dedicate ...</p>
    <p class="hidden" id="citation">Gettysburg Address<br/>
    Abraham Lincoln<br/>
    November 19, 1863</p>
    
    </body>
    </html>

Javascript and the DOM (Example CSS)

classTest.css
h1.header {
        text-align: left;
        font-family: serif;
        color: black;
}

h1.title {
        text-align: center;
        font-family: sans-serif;
        color: green;
}

p.plain {
        margin-left: 0em;
        margin-right: 0em;
        font-style: normal;
}
p.quote {
        margin-left: 5em;
        margin-right: 5em;
        font-style: italic;
}
p.hidden {
        visibility: hidden;
}
p.visible {
        visibility: visible;
        float: right;
        color: blue;
        font-style: italic;
        margin-right: 5em;
}

Javascript and the DOM (Example JS)

classTest.js
// This function will change the class of the passed element
// Note that we deal with the header specially, and also
// handle the citation depending on the class of the final paragraph
function changeClass(element) {
        // Get the current class name of the element that called us
        // ("class" is a reserved word in JavaScript, so we call the variable "cls")
        var cls = element.className;
        // Get the ID of the element that called us
        var id = element.id;

        // Did the header call us?
        if (id == "header") {
                // Yes, switch our class
                if (cls == "header") {
                        element.className = "title";
                } else if (cls == "title") {
                        element.className = "header";
                }
                return;
        }
        // Make sure it's a paragraph
        if (element.tagName == "P" || element.tagName == "p") {
                // Yes, switch it
                if (cls == "plain") {
                        element.className = "quote";
                        // Is it the final paragraph?
                        if (id == "final")
                                // Yes, switch the citation
                                document.getElementById("citation").className = "visible";
                } else if (cls == "quote") {
                        element.className = "plain";
                        // Is it the final paragraph?
                        if (id == "final")
                                // Yes, switch the citation
                                document.getElementById("citation").className = "hidden";
                }
        }
}

Other Javascript Objects

  • JavaScript supports the following built-in objects:
    • Types: Array, Boolean, Date, Number, String
    • Math
    • RegExp
    • Top-level functions: decodeURI(), decodeURIComponent(), encodeURI(), encodeURIComponent(), escape(), eval(), isFinite(), isNaN(), Number(), parseFloat(), parseInt(), String(), and unescape()
    • Browser interface objects: Window, Navigator, Screen, History, and Location
    • DOM objects, some of which are: Document, Form, Option, Select, and Style
    • Various HTML elements are exposed as objects, but I think it's better to work through the DOM methods directly

Debugging client code

  • Tools:
    • Firefox
      • Most standards-compliant browser
      • Firefox "Tools→Error Console" will really help!
    • Firefox add-ons: Web Developer, JavaScript Debugger
      • Firebug is also excellent, but not (yet) available for Firefox 3.0
  • Technique:
    • Separate CSS and JS from HTML file, unless you have < 10-15 lines
    • Start with HTML & CSS - no JS
    • Before coding, search the web for examples
    • Incrementally add and debug JS
    • alert() is a useful, quick debugging aid

Client-side Programming

  • Questions?
  • Overwhelmed?

The Server as a Client

  • The ability to fetch and parse content from the web is an essential part of modern bioinformatics
    • For example, using NCBI eutils to pull data from Entrez or PubMed.
  • In this context, your server program becomes a client to someone else's server

Fetching Pages

  • Opening sockets, constructing HTTP requests, and parsing responses are tedious
    • So most languages provide libraries to do the work for you
    • In Python, that library is called urllib
  • urllib.urlopen(URL) does what your browser would do if you gave it the URL
    • Parse it to figure out what server to connect to
    • Connect to that server
    • Send an HTTP request
    • Returns an object that looks like a file, from which to read response data
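The first three steps above can be sketched without touching the network. This is only an illustration, written for Python 3 (where the URL helpers live in urllib.parse rather than in the Python 2 urllib used on these slides). plan_request is a hypothetical helper, not a library function, and the request it builds is a bare-bones HTTP/1.0 GET.

```python
from urllib.parse import urlsplit

def plan_request(url):
    """Work out where to connect and what to send for a simple GET."""
    parts = urlsplit(url)
    host = parts.hostname
    # Fall back to the usual port for the scheme when none is given
    port = parts.port or (443 if parts.scheme == "https" else 80)
    # The request line needs a path; an empty one means the root
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = "GET %s HTTP/1.0\r\nHost: %s\r\n\r\n" % (path, host)
    return host, port, request

host, port, request = plan_request("http://www.third-bit.com/greeting.html")
print(host, port)
print(request)
```

A real client would then open a socket to (host, port), send the request text, and read the reply, which is the part urlopen wraps in a file-like object.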

urllib Example

  • Read a page the easy way
      import urllib
      
      instream = urllib.urlopen("http://www.third-bit.com/greeting.html")
      lines = instream.readlines()
      instream.close()
      for line in lines:
          print line,
      
  • Note: readlines is meant for text, and wouldn't do the right thing if the data being read was an image
    • Might try to convert “line endings”
    • Use read to grab the bytes in that case

Building A Spider

  • A web spider is a program that can explore the web on its own
    • Fetch a page, extract all the external links, visit those pages…
    • That, a search engine, and a few billion dollars, and you're Google
$ python spider.py http://www.google.ca
http://groups.google.ca/grphp?hl=en&tab=wg&ie=UTF-8
http://news.google.ca/nwshp?hl=en&tab=wn&ie=UTF-8
http://scholar.google.com/schhp?hl=en&tab=ws&ie=UTF-8
http://www.google.ca/fr
			
import sys, urllib, re

url = sys.argv[1]
instream = urllib.urlopen(url)
page = instream.read()
instream.close()

links = re.findall(r'href=\"[^\"]+\"', page)
temp = set()
for x in links:
    x = x[6:-1]    # strip off 'href="' and '"'
    if x.startswith('http://'):
        temp.add(x)
links = list(temp)
links.sort()
for x in links:
    print x

Passing Parameters

  • Sometimes want to provide extra information as part of a URL
    • Example: when searching on Google, have to specify what the search terms are
  • Could do this as part of the URL's path
    • Amazon puts ISBNs in URLs
  • More flexible to add parameters after the path
    • http://www.google.ca/search?q=Python searches for pages related to Python
    • "?" separates the parameters from the rest of the URL
    • If there are multiple parameters, they are separated from each other by "&"
      • E.g., http://www.google.ca/search?q=Python&client=firefox
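Going the other way, a program can take such a URL apart again. The sketch below assumes Python 3, where the relevant functions are urlsplit and parse_qs in urllib.parse; the URL is the one from the bullet above.

```python
from urllib.parse import urlsplit, parse_qs

url = "http://www.google.ca/search?q=Python&client=firefox"
parts = urlsplit(url)

# Everything after "?" ends up in the query attribute
print(parts.path)
print(parts.query)

# parse_qs splits on "&" and "=", collecting each name's values in a list
params = parse_qs(parts.query)
print(params)
```

Each value is a list because the same parameter name may legally appear more than once in a query string.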

Special Characters

  • What if you want to include "?" or "&" in a parameter?
    • Same problem (and solution) as including a quote in a string, or <> in XML
  • URL encode special characters using "%" followed by a 2-digit hexadecimal code
    • And replace spaces with "+"
    • Character Encoding
      "#" %23
      "$" %24
      "%" %25
      "&" %26
      "+" %2B
      "," %2C
      "/" %2F
      ":" %3A
      ";" %3B
      "=" %3D
      "?" %3F
      "@" %40
      Table 3: URL Encoding

Encoding Example

  • To search Google for “grade = A+”, use http://www.google.ca/search?q=grade+%3D+A%2B
  • urllib has functions to make this easy
    • urllib.quote(str) replaces special characters in str with escape sequences
    • urllib.unquote(str) replaces escape sequences with characters
    • urllib.urlencode(params) takes a dictionary and constructs the entire query parameter string
    • import urllib
      print urllib.urlencode({'surname' : 'Von Neumann', 'forename' : 'John'})
      
      surname=Von+Neumann&forename=John
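For readers on Python 3, the same functions exist under urllib.parse. The sketch below assumes that environment and uses quote_plus and unquote_plus, the variants that also turn spaces into "+". It rebuilds the search URL from the first bullet and spot-checks the characters listed in Table 3.

```python
from urllib.parse import quote_plus, unquote_plus, urlencode

query = "grade = A+"
encoded = quote_plus(query)
print(encoded)
print(unquote_plus(encoded))

# urlencode applies the same escaping to every key and value
search_url = "http://www.google.ca/search?" + urlencode({"q": query})
print(search_url)

# Each special character becomes "%" plus two hexadecimal digits
for ch in "#$%&+,/:;=?@":
    print(ch, quote_plus(ch))
```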
      

Screen Scraping (And Why Not)

  • Suppose you want to write a script that actually does search Google
    • Construct a URL: easy
    • Send it and read the response: no problem
    • Parse the response: there's a lot of junk on the page…
  • Many first-generation web applications relied on screen scraping
    • “Parse” the HTML with regular expressions
  • Hard to get right if the page layout is complex
    • And whenever the layout changes, the application breaks
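A small experiment shows how easily a regular expression misses things. The sketch below (Python 3, with a made-up three-link page) compares the double-quote-only pattern used by the spider with the standard library's html.parser. LinkCollector is a hypothetical helper name; the parser is more forgiving about quoting and case, but it is still scraping and still breaks when the page layout changes.

```python
import re
from html.parser import HTMLParser

class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag the parser encounters."""
    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # Tag and attribute names arrive lowercased
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)

page = """<p><a href="http://example.org/a">A</a>
<a class="x" href='http://example.org/b'>B</a>
<A HREF='http://example.org/c'>C</A></p>"""

# The pattern only matches lowercase, double-quoted href attributes
by_regex = [m[6:-1] for m in re.findall(r'href="[^"]+"', page)]

collector = LinkCollector()
collector.feed(page)
collector.close()

print(by_regex)
print(collector.links)
```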

Web Services

[Web Services]

Figure 5: Web Services

  • Modern web services separate data from presentation
    • When a client sends a request, it indicates that it wants machine-readable XML, rather than human-readable HTML
      • Much easier to parse
      • Much less likely to change over time
  • Many web services use the Simple Object Access Protocol (SOAP) standard
    • Despite its name, it's anything but simple
    • Luckily, there are libraries to hide the details for most widely-used web services

Example: Amazon

  • Amazon has defined an API for web services
    • You need to get a license key in order to use it
      • They're free
      • But they allow Amazon to throttle requests to one per second per client
  • PyAmazon turns parameters into URL, and converts the XML reply into Python objects

Example: Amazon (continued)

import sys, amazon

# Format multiple authors' names nicely.
def prettyName(arg):
    if type(arg) in (list, tuple):
        if len(arg) == 1:
            # A single author needs no separator
            arg = arg[0]
        else:
            arg = ', '.join(arg[:-1]) + ' and ' + arg[-1]
    return arg

if __name__ == '__main__':

    # Get information.
    key, asin = sys.argv[1], sys.argv[2]
    amazon.setLicense(key)
    items = amazon.searchByASIN(asin)

    # Handle errors.
    if not items:
        print 'Nothing found for', asin
        sys.exit(1)    # nothing to display, so stop here
    if len(items) > 1:
        print len(items), 'items found for', asin

    # Display information.
    item = items[0]
    productName = item.ProductName
    ourPrice = item.OurPrice
    authors = prettyName(item.Authors.Author)
    print '%s: %s (%s)' % (authors, productName, ourPrice)

$ python findbook.py 123ABCDEFGHIJKL4MN56 0974514071
Greg Wilson: Data Crunching : Solve Everyday Problems Using Java, Python, and more. ($18.87)
  • Note: much more code devoted to creating human-readable output than to getting the information

Server Programming

  • Users want to make the web do different things
    • How to let them write programs that handle HTTP requests?
  • Option #1: Require them to write socket-level code
    • Complicated and error-prone
    • Can only have one program listening to a socket at a time
  • Option #2: have the web server accept the HTTP request, and then run the user's code
    • Recompiling the web server every time someone wants to add functionality would be a pain
    • So define a protocol that lets web servers run other programs

The CGI Protocol

  • The Common Gateway Interface (CGI) protocol specifies:
    • How a web server passes information to a program
    • How that program passes information back to the web server
  • CGI does not specify:
    • A particular language
      • You can use Fortran, the shell, C, Java, Perl, Python…
    • How the web server figures out what program to run
      • Each web server has its own rules
      • We'll (briefly) talk about Apache's

From Server To CGI

    • Web server runs the CGI by creating a new process
    • [CGI Data Processing Cycle]

      Figure 5: CGI Data Processing Cycle

    • Web server passes some information to the CGI process through environment variables
      Name             Purpose                                                          Example
      REQUEST_METHOD   What kind of HTTP request is being handled                       GET or POST
      SCRIPT_NAME      The path to the script that's executing                          /cgi-bin/post_photo.py
      QUERY_STRING     The query parameters following "?" in the URL                    name=mydog.jpg&expires=never
      CONTENT_TYPE     The type of any extra data being sent with the request           image/jpeg
      CONTENT_LENGTH   How much extra data is being sent with the request (in bytes)    17290
      Table 4: Important CGI Environment Variables
    • The web server may also send CONTENT_LENGTH bytes to the CGI on standard input
      • E.g., when a file is being uploaded
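What the CGI program sees can be imitated with a plain dictionary and an in-memory stream. The sketch below assumes Python 3. read_request is a hypothetical helper, and the fake environment simply reuses the example values from Table 4 with a four-byte body.

```python
import io
from urllib.parse import parse_qs

def read_request(environ, stdin):
    """Collect the pieces of a request the way a CGI program receives them."""
    method = environ.get("REQUEST_METHOD", "GET")
    params = parse_qs(environ.get("QUERY_STRING", ""))
    # CONTENT_LENGTH may be missing or empty when there is no body
    length = int(environ.get("CONTENT_LENGTH") or 0)
    # Read exactly that many bytes; anything beyond them is not ours
    body = stdin.read(length) if length else b""
    return method, params, body

# A fake environment and input stream stand in for the web server
fake_environ = {
    "REQUEST_METHOD": "POST",
    "SCRIPT_NAME": "/cgi-bin/post_photo.py",
    "QUERY_STRING": "name=mydog.jpg&expires=never",
    "CONTENT_TYPE": "image/jpeg",
    "CONTENT_LENGTH": "4",
}
fake_stdin = io.BytesIO(b"\xff\xd8\xff\xe0 not part of the request")
method, params, body = read_request(fake_environ, fake_stdin)
print(method, params, body)
```

In a real CGI program the same function would be called with os.environ and sys.stdin.buffer.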

From CGI To Server

  • The CGI program sends data back to the web server by printing it to standard output
  • The web server then forwards this directly to the client
    • Which means that the CGI program is responsible for creating headers
  • Note: none of this works unless the web server has been configured to run the CGI
    • By default, modern servers won't do this unless they're told they can

MIME Types

  • Clients and servers need a way to specify data types to each other
    • Remember, bytes are just bytes: the browser doesn't magically know how to interpret them
  • The Multipurpose Internet Mail Extensions (MIME) standard specifies how to do this
    • Organizes data types into families, and provides a two-part name for each type
    • Use the "Content-Type" header to specify the MIME type of the data being sent
  • Family                      Specific Type      Describes
    Text                        text/html          Web pages
    Image                       image/jpeg         JPEG-format image
    Audio                       audio/x-mp3        MP3 audio file
    Video                       video/quicktime    Apple Quicktime video format
    Application-specific data   application/pdf    Adobe PDF document
    Table 5: Example MIME Types
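Python's standard mimetypes module can guess the type from a file name, which is handy when a CGI program has to fill in the Content-Type header itself. A minimal sketch, assuming Python 3 and three made-up file names:

```python
import mimetypes

guesses = {}
for filename in ["index.html", "mydog.jpg", "paper.pdf"]:
    # guess_type returns (type, encoding); only the type matters here
    mime_type, encoding = mimetypes.guess_type(filename)
    family, specific = mime_type.split("/")
    guesses[filename] = mime_type
    print("%-12s %-18s family=%s" % (filename, mime_type, family))
```

The guess is based on the file extension only; the module never looks at the bytes in the file.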

Hello, CGI

  • Simplest possible CGI pays no attention to query parameters or extra data
    • Just prints HTML to standard output, to be relayed to the client
    • Along with a Content-Type header to tell the client to expect HTML…
    • …and a blank line to separate the headers from the data
  • #!/usr/bin/env python
    
    # Headers and an extra blank line
    print 'Content-type: text/html'
    print
    
    # Body
    print '<html><body><p>Hello, CGI!</p></body></html>'
    

Invoking a CGI

  • Invoke it by going to http://www.yourserver.com/cgi-bin/hello_cgi.py
    • By convention, CGI programs are put in a cgi-bin directory
  • Browser displays the simple HTML page generated by the program
[Basic CGI Output]

Figure 6: Basic CGI Output

Generating Dynamic Content

  • But the whole point of CGI is to generate content dynamically
    • E.g., show a list of environment variables and their values
  • You'll use this frequently when debugging…
#!/usr/bin/env python

import os, cgi

# Headers and an extra blank line
print 'Content-type: text/html'
print

# Body
print '<html><body>'
keys = os.environ.keys()
keys.sort()
for k in keys:
    print '<p>%s: %s</p>' % (cgi.escape(k), cgi.escape(os.environ[k]))
print '</body></html>'
[Environment Variable Output]

Figure 7: Environment Variable Output

A Simple Form (reprise)

[A Simple Form]

Figure 4: A Simple Form

<html>
  <body>
    <form>
      <p>Sequence: <input type="text"/>
      Search type:
      <select>
        <option>Exact match</option>
        <option>Similarity match</option>
        <option>Sub-match</option>
      </select>
      </p>
      <p>Programs: 
      <input type="checkbox"/>
        FROG (version 1.1)
      <input type="checkbox"/>
        FROG (2.0 beta)
      <input type="checkbox"/>
        Bayes-Hart
      </p>
      <p>
        <input type="submit" value="Submit Query"/>
        <input type="reset" value="Reset"/>
      </p>
    </form>
  </body>
</html> 

Parameter Names

  • Each <input/> and <select> element that sends data needs a name attribute (the listing above leaves them out)
    • These become the names of the parameters that the client sends to the server
    • The input elements' values are the parameters' values
  • Submitting such a form by POST, with its fields named sequence, search_type, and program, produces something like:
    • os.environ['REQUEST_METHOD']: "POST"
    • os.environ['SCRIPT_NAME']: "/cgi-bin/simple_form.py"
    • os.environ['CONTENT_TYPE']: "application/x-www-form-urlencoded"
    • os.environ['CONTENT_LENGTH']: "80"
    • Standard input: sequence=GATTACA&search_type=Similarity+match&program=FROG-11&program=Bayes-Hart

Handling Forms

  • Could handle form data directly
    • Read and parse environment variables
    • Read extra data from standard input
  • But the mechanics are the same each time, so use Python's cgi module instead
    • Defines a dictionary-like object called FieldStorage
      • Keys are parameter names
      • Values are either strings (if there's a single value associated with the parameter) or lists (if there are many)
  • When a FieldStorage object is created, it reads and stores information contained in the URL and environment
    • Which means that a CGI program should only ever create one
  • Program can read extra data from sys.stdin

Form Handling Example

  • Example: show the parameters sent to a script
      #!/usr/bin/env python
      import cgi
      
      print 'Content-type: text/html'
      print
      print '<html><body>'
      form = cgi.FieldStorage()
      for key in form.keys():
          value = form.getvalue(key)
          if isinstance(value, list):
              value = '[' + ', '.join(value) + ']'
          print '<p>%s: %s</p>' % (cgi.escape(key), cgi.escape(value))
      print '</body></html>'
      
      URL                                                            Value of a     Value of b
      http://www.third-bit.com/swc/show_params.py?a=0                "0"            None
      http://www.third-bit.com/swc/show_params.py?a=0&b=hello        "0"            "hello"
      http://www.third-bit.com/swc/show_params.py?a=0&b=hello&a=22   ["0", "22"]    "hello"
      Table 6: Example Parameter Values

Development Tips

  • During development, add import cgitb; cgitb.enable() to the top of the program
    • cgitb is the CGI traceback module
    • When enabled, it will create a web page showing a stack trace when something goes wrong in your script
  • Testing whether a FieldStorage value is a string or a list is tedious
    • In almost all cases, you'll know whether to expect one value or many
    • Use FieldStorage.getfirst(name) to get the unique value
      • Returns the first, if there are many
    • FieldStorage.getlist(name) always returns a list of values
      • Empty list if there's no data associated with name
      • If there's only one value, get a single-item list
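The difference between the three access styles is easiest to see side by side. The sketch below is a stand-in, not the cgi module itself: it assumes Python 3 and builds a hypothetical Params class on urllib.parse.parse_qs, borrowing the method names getvalue, getfirst, and getlist purely for illustration.

```python
from urllib.parse import parse_qs

class Params:
    """A tiny stand-in that mimics three FieldStorage access methods."""
    def __init__(self, query_string):
        self._values = parse_qs(query_string)

    def getlist(self, name):
        # Always a list: empty if the name is absent
        return self._values.get(name, [])

    def getfirst(self, name, default=None):
        # The first value, however many were sent
        values = self.getlist(name)
        return values[0] if values else default

    def getvalue(self, name, default=None):
        # A string for one value, a list for several
        values = self.getlist(name)
        if not values:
            return default
        return values[0] if len(values) == 1 else values

form = Params("a=0&b=hello&a=22")
print(form.getvalue("a"))
print(form.getvalue("b"))
print(form.getfirst("a"))
print(form.getlist("b"))
print(form.getlist("c"))
```

Only getvalue forces the caller to check the type of what comes back, which is why the other two are usually the better choice.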

Maintaining State

  • Often want to change the data a server is managing, as well as read it
    • Update a description of an experiment, change your preferred email address, etc.
  • The industrial-strength solution is to use a three-tier architecture
    • [Three Tier Architecture]

      Figure 9: Three Tier Architecture

    • CGI program stuffs parameters from HTTP requests into SQL queries
    • Runs the queries
    • Translates results into HTML to send back to the client
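The three steps fit in a few lines when an in-memory SQLite database plays the data tier. This is a sketch under stated assumptions: it targets Python 3, the experiment table and its rows are invented for illustration, and experiments_as_html is a hypothetical helper. The request parameter is passed through a "?" placeholder rather than pasted into the SQL text, so the database driver handles the quoting.

```python
import sqlite3
from html import escape

# An in-memory database stands in for the data tier (table and rows are made up)
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE experiment (name TEXT, description TEXT)")
db.executemany("INSERT INTO experiment VALUES (?, ?)",
               [("frog-1", "Baseline run"),
                ("frog-2", "Repeat with <new> buffer")])

def experiments_as_html(db, name_prefix):
    """Run a query built from a request parameter and render the rows as HTML."""
    rows = db.execute(
        "SELECT name, description FROM experiment WHERE name LIKE ? ORDER BY name",
        (name_prefix + "%",)).fetchall()
    # Escape every value so that data cannot be mistaken for markup
    cells = ["<tr><td>%s</td><td>%s</td></tr>" % (escape(n), escape(d))
             for n, d in rows]
    return "<table>%s</table>" % "".join(cells)

html_table = experiments_as_html(db, "frog")
print(html_table)
```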

Maintaining State in Files

  • Simple programs can often get away with using files
    • The CGI program re-reads the file each time it processes a request
    • And re-writes it if there have been any updates
  • Example: append messages to a web page
    • Old messages are saved in a file, one per line
    • Hi, is anyone reading this site?
      I was wondering the same thing.
      I wasn't sure if we were supposed to post here.
      Good point.  Is there a way to delete messages?
      
  • Script checks the incoming parameters to decide what to do
    • If newmessage is there, append it, and display results
    • If newmessage isn't there, someone's visiting the page, rather than submitting the form
    • import cgi
      
      # Get existing messages.
      infile = open('messages.txt', 'r')
      lines = [x.rstrip() for x in infile.readlines()]
      infile.close()
      
      # Add more data?
      form = cgi.FieldStorage()
      if form.has_key('newmessage'):
          lines.append(form.getfirst('newmessage'))
          outfile = open('messages.txt', 'w')
          for line in lines:
              print >> outfile, line
          outfile.close()

AJAX

  • AJAX (Asynchronous Javascript And XML) provides:
    • Increased usability (on-page server interaction)
    • Client-side state
    • Enhanced user experience (possibly)
  • Basic AJAX idea:
    1. Javascript in browser sends a request to server using XMLHttpRequest()
    2. Server CGI processes request and sends back response, usually as an XML document
    3. Javascript in browser receives response, parses the XML and (using DOM) extracts the information
    4. Results are presented to the user, or used to modify the interface in some way
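Steps 2 and 3 are two halves of the same contract: one side writes XML, the other reads it back. In a real AJAX application the reading half runs as JavaScript in the browser. The sketch below plays both roles in Python 3 to make the round trip visible, and its element names (results, row) and sample data are invented for illustration.

```python
import xml.etree.ElementTree as ET

def rows_to_xml(rows):
    """Step 2: the server packages its results as an XML document."""
    root = ET.Element("results")
    for name, sequence in rows:
        row = ET.SubElement(root, "row", {"name": name})
        row.text = sequence
    return ET.tostring(root, encoding="unicode")

def xml_to_rows(document):
    """Step 3: the receiver parses the XML and pulls out the data."""
    root = ET.fromstring(document)
    return [(row.get("name"), row.text) for row in root.findall("row")]

document = rows_to_xml([("frag1", "GATTACA"), ("frag2", "CCGT")])
print(document)
print(xml_to_rows(document))
```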

XMLHttpRequest Example

// Handle the XMLHttpRequest
function sendRequest(sql)
{ 
  xmlhttp = new XMLHttpRequest();
  if (xmlhttp != null) {
    xmlhttp.onreadystatechange = getData; // getData is our callback method
    xmlhttp.open("GET", "/cgi-bin/getBmi219Table.py?sql="+sql, true);
    xmlhttp.send(null);
  }
}

// This method gets called whenever the object state changes.
function getData()
{ 
  // Are we complete? 
  if (xmlhttp.readyState == 4) {
    // Yes, do we have a good http status?
    if (xmlhttp.status == 200) {
      // yes, responseXML will hold the XML document, which we can address using the DOM
      // if we only wanted the raw text, we could get xmlhttp.responseText
      var response = xmlhttp.responseXML;

      // Use the DOM to get the results table from the server
      var newChild = response.getElementById("results_table");

      // Get a handle on the results div
      var tableDiv = document.getElementById("results_div");

      // Add in our results table
      tableDiv.appendChild(newChild);
    } else {
      alert("Unable to contact AJAX server: "+xmlhttp.status);
    }
  }
} 

XMLHttpRequest Methods

Method                                             Description
abort()                                            Cancels the current request
getAllResponseHeaders()                            Returns the complete set of HTTP headers as a string
getResponseHeader("headername")                    Returns the value of the specified HTTP header
open("method","URL",async,"username","password")   Specifies the method, URL, and other optional attributes of a request

    The method parameter can have a value of "GET", "POST", or "PUT". Use "GET" when requesting data and "POST" when sending data (especially if the length of the data is greater than 512 bytes).

    The URL parameter may be either a relative or complete URL.

    The async parameter specifies whether the request should be handled asynchronously or not. true means that script processing carries on after the send() method, without waiting for a response. false means that the script waits for a response before continuing script processing.

send(content)                                      Sends the request
setRequestHeader("label", "value")                 Adds a label/value pair to the HTTP headers to be sent
Table 7: XMLHttpRequest Methods

XMLHttpRequest Properties

Property             Description
onreadystatechange   An event handler for an event that fires at every state change
readyState           Returns the state of the object:
                       0 = uninitialized
                       1 = loading
                       2 = loaded
                       3 = interactive
                       4 = complete
responseText         Returns the response as a string
responseXML          Returns the response as XML. This property returns an XML document object, which can be examined and parsed using W3C DOM node tree methods and properties
status               Returns the HTTP status as a number (e.g. 404 for "Not Found" or 200 for "OK")
statusText           Returns the HTTP status as a string (e.g. "Not Found" or "OK")
Table 8: XMLHttpRequest Properties

AJAX Server Side

  • The server implementation is just a CGI
  • Arguments can be handled using the Python cgi module
  • Be sure to return a proper XML document if you want to use XMLHttpRequest.responseXML:
#! /usr/bin/python
import cgi
import sys

print "Content-type: text/xml"
print ""
# We want this to be interpreted as HTML by the client
print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'

print '<html xmlns="http://www.w3.org/1999/xhtml">'
  • Note that like any CGI, we need to send the Content-type, followed by a blank line
  • In this example, we want the browser to parse the response as XHTML, but it could be any arbitrary XML

Putting it together

    • Consider the Example Application
    • It is using SVG for the graphics, HTML forms for the input, and AJAX to query the backend database and populate the tables
    • There is also a fair amount of JavaScript and CSS trickery going on
    • The application is made up of 4 files:
      • bmi219/bmi219.svg: the XHTML+SVG file that makes up the front-end
      • bmi219/css/bmi219.css: the stylesheet for both the XHTML and SVG
      • bmi219/js/bmi219.js: the JavaScript that drives the application
      • cgi-bin/getBmi219Table.py: the server-side component

bmi219.svg

<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:svg="http://www.w3.org/2000/svg" xml:lang="en" lang="en">
<head>
<link rel="stylesheet" type="text/css" href="css/bmi219.css"/>
<script type="text/javascript" src="js/bmi219.js"></script>
</head>
<body>
<h3>BMI219 - AJAX Example</h3>
<svg:svg id="svg-root" width="100%" viewBox="0 0 800 100" version="1.1" >
  <!-- Surrounding Rectangle -->
  <svg:rect x="0" y="0" width="800" height="100" style="stroke: blue; fill: none;"/>
  <!-- Recipe Entity -->
  <svg:rect x="40" y="30" width="60" height="40" class="entity" onclick="showInput('recipe_input', this);"/>
  <svg:text x="50" y="52" class="label1">Recipe</svg:text>
  <svg:line x1="100" y1="50" x2="330" y2="50" stroke="yellow" stroke-width="2"/>
  <!-- Fragment Entity -->
  <svg:rect x="330" y="30" width="60" height="40" class="entity" onclick="showInput('fragment_input', this);"/>
  <svg:text x="334" y="52" class="label1">Fragment</svg:text>
  <svg:line x1="390" y1="50" x2="630" y2="50" stroke="yellow" stroke-width="2"/>
  <!-- Gene Entity -->
  <svg:rect x="630" y="30" width="60" height="40" class="entity" onclick="showInput('gene_input', this);"/>
  <svg:text x="647" y="52" class="label1">Gene</svg:text>
  <!-- Produces relationship -->
  <svg:rect x="200" y="30" width="40" height="40" class="relationship" transform="rotate(-45,220,50)" onclick="showInput('recipe_input_join', this);"/>
  <svg:text x="201" y="52" class="label2">Produces</svg:text>
  <!-- Contains relationship -->
  <svg:rect x="500" y="30" width="40" height="40" class="relationship" transform="rotate(-45,520,50)" onclick="showInput('gene_input_join', this);"/>
  <svg:text x="501" y="52" class="label2">Contains</svg:text>
  <!-- Links and orders -->
</svg:svg>

<!-- This is the form: Note that each <span> has an ID and a class that we will use to 
     control whether we show the containing input field or not.  Also note specifically
     the way we call getTable with the arguments we want. -->
<form>
  <span id="recipe_input" class="hidden">
    Recipe Name: <input type="text" onchange="getTable('RECIPE','RECIPE.NAME', this, 'Name,File,Owner',null);"/>
  </span>
  <span id="recipe_input_join" class="hidden">
    Recipe Name: <input type="text" onchange="getTable('RECIPE,PRODUCES,FRAG','RECIPE.NAME', this, 'RECIPE.Name,RECIPE.Owner,PRODUCES.Date,FRAG.Name,FRAG.Sequence','RECIPE.RCP=PRODUCES.RCP and PRODUCES.FRAG=FRAG.FRAG');"/>
  </span>
  <span id="fragment_input" class="hidden" style="position: absolute; left: 35%;">
    Fragment Name: <input type="text" onchange="getTable('FRAG','FRAG.NAME', this, 'Name,Sequence,Circular',null);"/>
  </span>
  <span id="gene_input_join" class="hidden">
    Gene Name: <input type="text" onchange="getTable('FRAG,CONTAINS,GENE','GENE.NAME', this, 'FRAG.Name,FRAG.Sequence,GENE.Name,CONTAINS.Start,CONTAINS.End','FRAG.FRAG=CONTAINS.FRAG and GENE.ID=CONTAINS.GENE');"/>
  </span>
  <span id="gene_input" class="hidden" style="position: absolute; left: 70%;">
    Gene Name: <input type="text" onchange="getTable('GENE','GENE.NAME', this, 'Name,Protein,StartNum',null);"/>
  </span>
</form>

<!-- We'll write a header into this <h3> when we get the data -->
<h3 id="table_header" class="table_header"> </h3>
<!-- We'll write the results table into this when we get the data -->
<div id="results_div">
</div>
</body>
</html>

bmi219.css

rect.entity { fill: purple; stroke-width: 2px;}
rect.relationship { fill: lightgreen; stroke-width: 2px;}
text.label1 {fill:white; font-size:8pt; font-family: arial; font-weight: bold;}
text.label2 {fill:blue; font-size:6pt; font-family: arial; font-weight: bold;}
span.hidden {visibility: hidden; }
span.shown {visibility: visible; }
tr.table-header {font-weight: bold; text-align: center; color: green; font-family: arial;}
h3.table_header {font-family: arial; text-align: center;}
table {font-family: arial; font-size: 80%;}

bmi219.js

// Handle the XMLHttpRequest
function sendRequest(sql)
{
  xmlhttp = new XMLHttpRequest();
  if (xmlhttp != null) {
    xmlhttp.onreadystatechange = getData; // getData is our callback method
    // The SQL contains spaces, quotes and '=' signs, so it must be URL-encoded
    xmlhttp.open("GET", "/cgi-bin/getBmi219Table.py?sql="+encodeURIComponent(sql), true);
    xmlhttp.send(null);
  }
}

// This method gets called whenever the object state changes
function getData()
{
  // Are we complete?
  if (xmlhttp.readyState == 4) {
    // Yes, do we have a good http status?
    if (xmlhttp.status == 200) {
      // yes, responseXML will hold the XML document, which we can address using the DOM
      // if we only wanted the raw text, we could get xmlhttp.responseText
      var response = xmlhttp.responseXML;

      // Use the DOM to get the results table from the server
      var newChild = response.getElementById("results_table");

      // Get a handle on the results div
      var tableDiv = document.getElementById("results_div");

      // Add in our results table
      tableDiv.appendChild(newChild);
    } else {
      alert("Unable to contact AJAX server: "+xmlhttp.status);
    }
  }
}
var elementShown = null;
var xmlhttp = null;
var selectedRect = null;

// showInput just controls the presentation of the input field
// for the name of the row we are looking for
function showInput(elementID, rect) {
  // Get the <span> that wraps the text input we want to show
  var element = document.getElementById(elementID);

  // Do we already have a text input element showing?
  if (elementShown != null)
    elementShown.className = "hidden"; // Yes, hide it

  // Do we already have a rectangle highlighted?
  if (selectedRect != null)
    selectedRect.setAttributeNS(null, "stroke", ""); // Yes, remove its outline

  // Show the text input
  element.className = "shown";
  elementShown = element;

  // Outline the element the user clicked on
  // Note that we need to use setAttributeNS for SVG attributes
  rect.setAttributeNS(null, "stroke", "black");
  selectedRect = rect;
}


// This is the method that gets called when a text field is changed
function getTable(tableName, column, textField, fields, where) {
    var text = textField.value; // This contains the value the user entered

    // Now, create the SELECT statement
    var sql = 'SELECT '+fields+' from '+tableName;
    if (text.length >= 2 || where != null) {
      sql += ' where ';
      if (text.length >= 2) {
        sql += column+' = "'+text+'"';
        if (where != null) {
          sql += ' AND '+where;
        }
      } else {
        sql += where;
      }
    } 
    sql += ';';
  
    // Uncomment the next line to see what we pulled together
    // alert(sql);
  
    // Issue the request.  Because our XMLHttpRequest call is
    // asynchronous, this will return immediately
    sendRequest(sql);
  
    // Clear the text field
    textField.value = "";
  
    // Add a header
    var header = document.getElementById("table_header");
    header.innerHTML = tableName;
  
    // Clear the old table
    var tableDiv = document.getElementById("results_div");
    while (tableDiv.firstChild) {
      tableDiv.removeChild(tableDiv.firstChild);
    }
} 

getBmi219Table.py

#! /usr/bin/python

import cgi
import cgitb
import sys
import MySQLdb

def returnError(errorString):
  # Escape the message so that characters such as < or & cannot break the XML
  print """<html xmlns="http://www.w3.org/1999/xhtml">
    <body> <h3 id="results_table" style="color:red;">%s</h3> </body>
  </html>"""%cgi.escape(errorString)

cgitb.enable()

print "Content-type: text/xml"
print ""
print '<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">'

# Get the form data
form = cgi.FieldStorage()
if not (form.has_key("sql")):
  returnError("No SQL string?")
  sys.exit(0)

sqlStatement = form["sql"].value
rows = None

try:
  conn = MySQLdb.connect (host="127.0.0.1", db="bmi219")
  cursor = conn.cursor()
  cursor.execute(sqlStatement)
  rows = cursor.fetchall()
  cursor.close()
  conn.commit()
  conn.close()

except MySQLdb.Error, e:
  returnError(e.args[1])
  sys.exit(0)

print '<html xmlns="http://www.w3.org/1999/xhtml">'
print '<body>'
print   '<table id="results_table" border="1" width="80%" align="center">'
print     '<tr class="table-header">',
for column in cursor.description:
  print '<td>'+column[0]+'</td>',
print     '</tr>'
  
for row in rows:
  print     '<tr>',
  for cell in row:
    # Escape each value so that < or & in the data cannot break the XML
    print '<td>'+cgi.escape(str(cell))+'</td>',
  print     '</tr>'
print   '</table>'
print '</body>'
print '</html>'

AJAX - Questions?

  • Questions about CGI or AJAX?

HTML Templating

  • A lot of this program is devoted to copying values into an HTML template
    • There are lots of good systems out there, in many languages, for doing this
    • Kid in Python
    • Java Server Pages (JSPs) in Java
    • Please do not write one of your own
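
The idea of keeping the HTML in a template and filling in the values can be sketched with the standard library's string.Template. This is only a stand-in for a real templating system such as Kid; the PAGE and ROW templates and the render_page and escape helpers are hypothetical names made up for this illustration.

```python
from string import Template

# Hypothetical page template: the HTML lives in one place, with named slots.
PAGE = Template("""<html><body>
<h3>$title</h3>
<table>
$rows
</table>
</body></html>""")

ROW = Template("<tr><td>$name</td><td>$owner</td></tr>")


def escape(text):
    """Escape the characters that are special in HTML text."""
    return (str(text).replace("&", "&amp;")
                     .replace("<", "&lt;")
                     .replace(">", "&gt;"))


def render_page(title, records):
    """Fill the templates from a list of (name, owner) tuples."""
    rows = "\n".join(ROW.substitute(name=escape(n), owner=escape(o))
                     for n, o in records)
    return PAGE.substitute(title=escape(title), rows=rows)


page = render_page("RECIPE", [("PCR <fast>", "scooter")])
```

The program logic only builds the list of records; the markup stays in the templates, which is the separation a full templating system gives you with far more features.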

What About Concurrency?

  • What happens if two users try to save messages at the same time?
    • I/O is typically slower than processing
    • So most web servers try to overlap operations
  • Race condition:
    • First instance of message_form.py opens messages.txt, reads lines, closes file
    • Second instance opens messages.txt, reads the same lines, closes file
    • First instance re-opens file, writes out original data plus one new line
    • Second instance re-opens file, writes out original plus a different new line
    • First instance's message has been lost!
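
The sequence above can be replayed in a single process to see the lost update. This is a sketch only: the two "instances" are simulated by interleaving reads and writes by hand, a temporary file stands in for messages.txt, and read_messages and write_messages are hypothetical helpers.

```python
import os
import tempfile


def read_messages(path):
    """Open the file, read all lines, close it."""
    with open(path) as f:
        return [line.rstrip("\n") for line in f]


def write_messages(path, lines):
    """Re-open the file and write every line back."""
    with open(path, "w") as f:
        for line in lines:
            f.write(line + "\n")


# A scratch file standing in for messages.txt.
fd, path = tempfile.mkstemp()
os.close(fd)
write_messages(path, ["hello"])

# Both instances read before either one writes.
first = read_messages(path)
second = read_messages(path)

# Each writes back what it read plus its own new line.
write_messages(path, first + ["from first"])
write_messages(path, second + ["from second"])

final = read_messages(path)
os.remove(path)
```

After the last write, final holds the original line and the second instance's line only; the first instance's message has been overwritten.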

File Locking

  • Solution is to lock the file
    • As the name implies, gives one process exclusive rights to the file
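    • After the first process acquires the lock, any other process that tries to acquire it is suspended until the first releases it
    • On Unix these locks are advisory: they only protect the file if every program that touches it asks for the lock first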
  • Mechanics are different on different operating systems
    • But the Python Cookbook includes a generic file locking function that works on both Unix and Windows

Implementing Locking

    import cgi
    import fcntl

    # Get existing messages (the lock is held from here until we unlock).
    msgfile = open('messages.txt', 'r+')
    fcntl.flock(msgfile.fileno(), fcntl.LOCK_EX)
    lines = [x.rstrip() for x in msgfile.readlines()]
    
    # Add more data?
    form = cgi.FieldStorage()
    if form.has_key('newmessage'):
        lines.append(form.getfirst('newmessage'))
        msgfile.seek(0)
        for line in lines:
            print >> msgfile, line
    
    # Flush buffered writes while we still hold the lock, then unlock and close.
    msgfile.flush()
    fcntl.flock(msgfile.fileno(), fcntl.LOCK_UN)
    msgfile.close()
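
One possible way to package these steps is a small function that releases the lock even if something goes wrong part-way through. This is a sketch under assumptions: append_message is a hypothetical helper, a temporary file stands in for messages.txt, and fcntl is available only on Unix.

```python
import fcntl  # Unix only
import os
import tempfile


def append_message(path, message):
    """Append one message to the file while holding an exclusive lock."""
    msgfile = open(path, "r+")
    try:
        # Blocks until no other process holds the lock.
        fcntl.flock(msgfile.fileno(), fcntl.LOCK_EX)
        lines = [x.rstrip() for x in msgfile.readlines()]
        lines.append(message)

        # Rewrite the file from the start and drop any leftover bytes.
        msgfile.seek(0)
        msgfile.truncate()
        for line in lines:
            msgfile.write(line + "\n")

        # Push the data to the OS before anyone else can take the lock.
        msgfile.flush()
    finally:
        fcntl.flock(msgfile.fileno(), fcntl.LOCK_UN)
        msgfile.close()
    return lines


# Try it on a scratch file standing in for messages.txt.
fd, path = tempfile.mkstemp()
os.close(fd)
append_message(path, "first")
result = append_message(path, "second")
os.remove(path)
```

Because the read, the write, and the flush all happen between acquiring and releasing the lock, a second process that also calls append_message cannot read a stale copy of the file.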

Who Are You?

  • How to maintain state on the client?
    • Need to know which shopping cart to display for a particular user
  • HTTP is a stateless protocol
    • If a client makes a second (or third, or fourth…) request, server has no reliable way of connecting it to the first one
  • Can guess based on client address, elapsed time, etc.
    • But it's just a guess

Cookies

  • Solution is for the server to create a cookie
    • A string that is sent to the client in an HTTP response header
  • Client saves it (either in memory or on disk)
    • [Cookies]

      Figure 10: Cookies

  • The next time the client sends a request to the site, it sends the cookie back to the server
    • Like giving someone a claim check for their luggage

Creating Cookies

  • Represent cookies in Python using Cookie.SimpleCookie
    • Do not use SmartCookie: it is potentially insecure
  • When creating, add values to a cookie as if it were a dictionary
    • Convert it to a string (e.g., by printing it) to create the required HTTP header
  • When the cookie comes back:
    • Get the value of the environment variable "HTTP_COOKIE"
    • Create a SimpleCookie
    • Pass the "HTTP_COOKIE" value to the cookie's load method
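
The round trip described above might look like the following sketch. The cookie name "user" and its value are made up for illustration, and the HTTP_COOKIE string is faked here rather than read from os.environ. The import is written to work whether the module is called Cookie (Python 2, as in these slides) or http.cookies (Python 3).

```python
try:
    from http.cookies import SimpleCookie  # Python 3 name
except ImportError:
    from Cookie import SimpleCookie        # Python 2 name used in these slides

# First response: fill the cookie as if it were a dictionary.
outgoing = SimpleCookie()
outgoing["user"] = "scooter"

# Converting it to a string gives the HTTP header line to send.
header = outgoing.output()

# Next request: the browser sends back "name=value", which a CGI
# script would find in os.environ["HTTP_COOKIE"]. We fake it here.
http_cookie = "user=scooter"
incoming = SimpleCookie()
incoming.load(http_cookie)
user = incoming["user"].value
```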

Cookie Example

  • Example: count the number of times a user has visited a web site
    • If there's no cookie, create one with a count of 1
    • Otherwise, increment the count
    • Create a new cookie to send back to the user
    • Display the count
  • import Cookie
    import os
    
    # Get old count.
    count = 0
    if os.environ.has_key('HTTP_COOKIE'):
        cookie = Cookie.SimpleCookie()
        cookie.load(os.environ['HTTP_COOKIE'])
        if cookie.has_key('count'):
            count = int(cookie['count'].value)
    
    # Create new count.
    count += 1
    cookie = Cookie.SimpleCookie()
    cookie['count'] = count
    
    # Display.
    print 'Content-Type: text/html'
    print cookie
    print
    print '<html><body>'
    print '<p>Visits: %d</p>' % count
    print '</body></html>'

Cookie Tips

  • Can control how long a cookie is valid by setting an expiry value
    • Either the number of seconds it should live (max-age)
    • Or the time it should expire (expires, in UTC)
      • Use time.asctime(time.gmtime(t)) to create the value, where t is the expiry time in seconds since the epoch
  • Do not put sensitive information in cookies
    • Browsers store them in files on disk
    • Villains can watch network traffic, and steal data
  • Cookies should instead be random values that act as keys into server-side information
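
A minimal sketch of that last tip, assuming an in-memory dictionary as the server-side store (a real site would use a database or files). The names sessions, new_session, and lookup are hypothetical, and the one-hour lifetime is just an example value.

```python
import binascii
import os

try:
    from http.cookies import SimpleCookie  # Python 3 name
except ImportError:
    from Cookie import SimpleCookie        # Python 2 name used in these slides

# Hypothetical server-side store: session ID -> private data.
sessions = {}


def new_session(data, lifetime=3600):
    """Store data on the server; return a cookie holding only a random key."""
    session_id = binascii.hexlify(os.urandom(16)).decode("ascii")
    sessions[session_id] = data

    cookie = SimpleCookie()
    cookie["session"] = session_id
    cookie["session"]["max-age"] = lifetime  # lifetime in seconds
    return cookie


def lookup(http_cookie):
    """Return the stored data for the cookie string the client sent back."""
    cookie = SimpleCookie()
    cookie.load(http_cookie)
    if "session" not in cookie:
        return None
    return sessions.get(cookie["session"].value)


cookie = new_session({"cart": ["FROG 2.0"]})
session_id = cookie["session"].value
found = lookup("session=" + session_id)
missing = lookup("session=notarealid")
```

The client only ever sees the random key; the shopping cart itself never leaves the server, so a stolen or edited cookie reveals nothing beyond the key.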