Friday, November 28, 2008

Passing Keys


The current version of this article and sequence can be found at Core Memory.


An entity is an instance of a data model, analogous to a data record in a database table.

When you store such a chunk of data, sometimes you need to locate it quickly, using a unique identifier. This is the entity's key.

Keys for data entities are typically unique relative to the data model. They are commonly hidden in a web page so that, when the user clicks something, the appropriate key is passed back to the server. This "passing of keys" (from the server, to the user, then back to the server) happens in any web application that empowers the user to create many-to-one relationships on a web page -- for example, to create a list with items.
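
To make the round trip concrete before any GAE code appears, here is a minimal sketch in plain Python 3, using only the standard library. Everything in it is an assumption made for illustration: the in-memory LISTS dictionary stands in for a datastore, and the keys "k1" and "k2" are invented. It only shows a key going out in a hidden form field and coming back in a POST body.

```python
import html
from urllib.parse import parse_qs

# Hypothetical in-memory "datastore": key string -> list name.
LISTS = {"k1": "Groceries", "k2": "Errands"}


def render_item_form(list_key):
    """Server -> browser: hide the list's key inside the form."""
    return (
        '<form action="/create_item/" method="post">'
        '<input type="text" name="name">'
        '<input type="hidden" name="list_key" value="%s">'
        "</form>" % html.escape(list_key, quote=True)
    )


def handle_post(body):
    """Browser -> server: recover the key from the submitted form body."""
    fields = parse_qs(body)
    list_key = fields["list_key"][0]
    name = fields["name"][0]
    # Look up the parent list with the key that came back from the user.
    return LISTS[list_key], name


form_html = render_item_form("k2")
# A browser would submit a body like this when the user clicks the button.
result = handle_post("name=Post+office&list_key=k2")
```

The user never types or sees the key; it simply rides along with the form.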

The Google App Engine (GAE) hosting environment provides a database with a difference: when created, each entity is assigned a key universal to Google's "Big Table", guaranteed never to change, and guaranteed to be accessible only from your GAE application.

This implies an unusual approach to passing keys to the browser. When the webapp generates a web page, the data choices can be hidden in the page, not with IDs generated on-the-fly, but with permanent, universal keys. There are clues on how to do this, mostly implied or embedded, scattered across the Google App Engine Group's discussion forum, the GAE cookbook, etc. But there are no simple, deployed examples. So I thought I would provide one.

Unfolding of an Example

To present this, I'm explicitly introducing an ancient illustrative technique to the world of software development, one that I think is particularly pleasant, unusually effective and full of possibility: the unfolding of a program. It's a polished sequence of stepwise software development. This is rather different than a walk-through of finished code. You may have seen it before in public programming performances ... in fact, Google introduced GAE with just such a performance, which I'll analyze in a separate paper.

For the moment, let's think of a program as a living organism.


During an organism's development, its structure and shape change, moving from very simple to very complex. We marvel at nature's ability to maintain a coherent morphology amidst this massive engineering project. Amazingly, the elegant developmental sequences we see in nature, which generate such robust and complex mechanisms, are very flexible and adaptive. Living things tap a superb library of transformative solutions and methods. Since humans have only about 30,000 genes, life somehow creates robust, unique people from something analogous to 30,000 small functions. The Linux kernel has about 10 million lines of code. We need to pay much closer attention to nature.



If you watch this gradual, incremental growth, you'll see difficult engineering problems overcome fluidly, simply, at each stage, at all scales. A great deal of ad hoc adaptation is going on, adaptations synthesized from available information, used in a persistent, continuously balanced approach to tackling the emergence of the entire organism, not just tiny parts of it.



This holistic, stepwise approach to growth can be seen as following, or borrowing from, many sequences of steps. Each step creates a morphological differentiation, building upon the previous step, providing the context for the next. Surprisingly, although the number of sequences borrowed by a developing organism is vast, the number of differentiations is not. A human being develops in approximately 50 cell divisions, for example.



Assuming that programmers are called developers because they engage in something strongly akin to this kind of development, it's unfortunate that one of the least explored areas in programming is the study of the steps and sequences we use. When we decide to change code, at that moment, what are we doing? What are the problems, and what is our resolution? And where are we going? Clearly it would be useful to pass these steps and sequences around, among ourselves, for the sake of improving our discipline. The study of developmental programming could provide a quicker, more efficient and effective absorption of new contexts, in the fast-changing world of engineering. Sequences could provide another tool for Open Source computer science.

If all this talk seems redolent of Design Patterns, there's a reason. The notion of identifying sequences of steps that humans can take, to build complex coherent systems, comes from Christopher Alexander, whose practical architectural philosophy inspired the computing movement to mine patterns, and pattern languages. Research by Alexander and myself in the 1990s indicated, among many other things, that people take from a sequence whatever they need, absorbing information about dependencies, and borrowing solutions, and sensibilities, very like a developing organism making use of its genome.

I believe that polished sequences are a complement to patterns and refactoring ... after you've refactored towards various patterns, you tend to think: "if only I'd known about A! At this point I would have done A instead of B" or "Y turned into a huge time-wasting problem later on. I'm sure someone else has run into this."

Sequences give you the opportunity to save vast numbers of people from the frustration of poor solutions, and the difficult path to rediscovery of an already known solution. Sequences can make us more effective, and help us to produce quality work more easily and more often.

So, to our example. I hope this new approach is useful, and that the sequence itself provides solutions large and small, from which you can borrow.

Preliminary Step: Scope and Environment

To get started, note that I'm not supplanting the steps found within the Google App Engine Getting Started tutorial, which is still the only coherent tutorial for Google App Engine development. If you're making use of my sequence, below, I assume that you're already running the development environment, based on instructions in the GAE tutorial.

Again I'm focussed on simply passing keys. This sequence will not develop code for editing, ownership, user authentication, sessions, sorting, timestamps, AJAX, exception handling, CSS, backups or any number of otherwise perfectly reasonable and necessary features. But, clearly, those tasks too can be broken into small sequences, "snippets" if you will, which could be applied in an additive fashion to any application derived from this article.

In this example I'm also using a very simple subset of Python, free of 'dunder' methods, and avoiding the Django helper. I want to limit the number of technical features I'm describing here, while allowing as broad a swath of programmers as possible to read it. Similar limits were placed on the tutorials produced by Google.

We'll call this List-Item sample application "Passing Keys", or "passingkeys" to GAE. It's hosted at passingkeyshg.appspot.com, with links to the source. The primary Python module is passingkeys.py .

Step 1: Using purpose to shape the persistent data models

My first step directly reflects my main goal.

What do I want to do with this app? I want to create Lists and Items. Since I know something about the GAE application space, there's no harm in using a little code to start. In fact, it immediately gives the reader some understanding of GAE. Concrete implementation is not taboo in the world of generative sequences: after all, information critical to a developing organism is conveyed through real molecules.

So, in passingkeys.py, let's define List and Item, two different Datastore models derived from db.Model ... db is a Google App Engine module that wraps the Datastore API, which reads from and writes to Google's Big Table, or its simulation in the development environment on your own machine.

Note that the relationship between List and Item is made explicit by Item's "ReferenceProperty (List)" declaration. We will assign to Item's list_key the associated List instance, when the time comes.

from google.appengine.ext import db

class List(db.Model):
  name = db.StringProperty(multiline=True)

class Item(db.Model):
  list_key = db.ReferenceProperty(List)
  name = db.StringProperty(multiline=True)



Note: if you are using these steps to guide the development of a different application, it's good to keep data models that are understood separate from your exploratory models.

Step 2: Functional Centers and the First Bridge

In passingkeys.py, we'll define the three clumps of code that I'll call functional centers. They reflect the three primary centers of activity that I foresee in this app. These aren't classes yet. But this outline bridges the gap between, on the one hand, the user's sense of real activity, and on the other hand, web technology (GETs, POSTs, forms rendering HTML) and the supporting classes and methods.


The Home Page:
* get: displays the current lists and items
* render: called by all other functions to render a version of the home page

Creating Lists:
* get: renders the home page with a List form
* post: handles the CGI POST to create a List

Creating Items:
* get: renders the home page with an Item form
* post: handles the CGI POST to create an Item



Step 3: The Class and Method Bridge

I associated render above with the home page. But, when I step over the next bridge, to a class and method outline, Python's object idiosyncrasies kick in. 'Class methods' are not normal in Python, so render needs to be an isolated function definition. Hence, my class and method outline is:

def render
   # renders home page

# handlers:
class HomePage
  def get
      # returns render's output

class CreateList
  def get
     # calls render with List form
  def post
     # creates a List

class CreateItem
  def get
     # calls render with Item form
  def post
     # creates an Item



Step 4: Encode the Handler Shells; Outline Rendering

Let's make the class and method outline more rigid, 'hardening' the shells of both.

Some might complain that this approach is too top-down, that I'm not following an end-to-end incremental methodology here. But, this is runnable code. Over the decades, programming languages have shifted towards pseudocode, so that we can ease our way through these early differentiating stages more naturally. The previous steps could be end-to-end, and runnable, if the language and environment shifted a little further in this direction. And if it included higher-level descriptions for web-page pathways. I find the conflicting environmental support for people who've done the same thing many times, and people who are exploring the environment for the first time, to be unnecessary. With open-source sequences, these incongruities would resolve themselves, with better development environments.

There's a hidden assumption in what follows -- it's in my mind, so I might as well spell it out. For simplicity, I'll use one Django template, which I'll name index.html, to create the HTML presented to the user. The template will be populated with Item and List data by my render function, which in turn is passed a Python dictionary (a collection of name-value pairs). This dictionary also includes switches to turn the forms on and off.

Here's an implementation outline of the five handler methods + 1 render function:

from google.appengine.ext import webapp

def render (template_values):
   # put data in template_values
   # call Django templating
   # return results to calling handler
   pass

class HomePage (webapp.RequestHandler):
  def get (self):
     # create vanilla home page
     # by calling render
     pass

class CreateList (webapp.RequestHandler):
  def get (self):
     # create list form
     # by calling render
     pass

  def post (self):
     # insert data from list form
     # redirect to home
     pass

class CreateItem (webapp.RequestHandler):
  def get (self, list_key):
     # create item form with hidden list_key
     # by calling render
     pass

  def post (self):
     # insert data from item form
     # redirect to home
     pass



I call my render function for GETs, and I redirect to home after POSTs. I could consistently call the render function, or with a session layer I could consistently redirect. I'm just trying to reduce overhead, for this example.

Step 5: Directing URL Regexes to Handlers

As per GAE's default webapp WSGI framework, there's a mapping, at the end of passingkeys.py, between our defined URL regexes and our request handlers. So the end of every main web application module in GAE-land looks like this:

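# these two imports belong at the top of passingkeys.py
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

application = webapp.WSGIApplication(
                                     [('/', HomePage),
                                      ('/list_form/', CreateList),
                                      ('/create_list/', CreateList),
                                      ('/item_form/(.*)/', CreateItem),
                                      ('/create_item/', CreateItem)],
                                     debug=True)

def main():
  run_wsgi_app(application)

if __name__ == "__main__":
  main()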



The critically important detail here, relative to passing keys in a Python/GAE implementation, is the "(.*)" in the regex, which passes the contents of that string (a key coming back from the user) as an argument to CreateItem's CGI GET handler, which we call "list_key" inside CreateItem.get.
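
The mechanism can be sketched outside GAE with Python's re module. This is not webapp's own dispatch code, only an illustration of the idea under stated assumptions: the ROUTES table, the dispatch function, the two handler functions and the key "abc123" are all invented for the example, and the anchoring with "^" and "$" is my assumption about how such a mapping is typically matched.

```python
import re


def home():
    return "home page"


def item_form(list_key):
    # list_key arrives as a plain string, captured from the URL.
    return "item form for list %s" % list_key


# Hypothetical routing table: (URL regex, handler function).
ROUTES = [
    (r"/", home),
    (r"/item_form/(.*)/", item_form),
]


def dispatch(path):
    for pattern, handler in ROUTES:
        match = re.match("^" + pattern + "$", path)
        if match:
            # Each parenthesised group becomes a positional argument.
            return handler(*match.groups())
    return "404"


page = dispatch("/item_form/abc123/")
```

Whatever sits between "/item_form/" and the final "/" is handed to the handler, which is all that "passing the key back" amounts to on the server side.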

Step 6: Encode Template Rendering

The Django template index.html will be rendered by my render function, which initially just looks like this:

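import os
from google.appengine.ext.webapp import template

def render (template_values):
   # to do: add data to template_values
   path = os.path.join(os.path.dirname(__file__), 'index.html')
   # return the HTML, so the calling handler can write it to the response
   return template.render(path, template_values)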



Step 7: Encode Template Values for Create Get Methods

CreateList and CreateItem's GET handlers will pass a Python dictionary structure, with switches and a key, to my render :

class CreateList (webapp.RequestHandler):
  def get (self):
     template_values = {
        'list_form': 1
     }
     self.response.out.write (render(template_values))

class CreateItem (webapp.RequestHandler):
  def get (self, list_key):
     template_values = {
        'item_form': 1,
        'list_key': list_key
     }
     self.response.out.write (render(template_values))



Step 8: Template Logic Outline

For developmental balance, I'm compelled now to complete our end-to-end infrastructure here, with a quick cut at our Django template index.html, based on the assumptions so far:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>Passing Keys</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >
</head>
<body>
<a href="/"><font size="+3"><b>Passing Keys</b></font></a><br>
 <br><br>
{% if list_form %}
   {# list_form #}
{% else %}
  <a href="/list_form/">Create a List</a><br><br>
{% endif %}
{% for list in lists %}
   {{ list.name }}<br>
   {% for item in list.items %}
      {{ item.name }} 
   {% endfor %}
   {% if item_form %}
      {% ifequal list.key list_key %}
         {# item form #}
      {% else %}
        <a href="/item_form/{{ list.key }}/#form">Create an item</a>
        <br>
      {% endifequal %}
   {% else %}
         <a href="/item_form/{{ list.key }}/#form">Create an item</a>
   {% endif %}
{% endfor %}

</body>
</html>



[Note that the end-to-end implementation cycle has started. Here, I start with the "Create" links. People tend to follow this cycle, this mini sequence, as a reality check during live webapp development: create links -> create form -> entity storage -> render data. It's a very natural cycle, and so common that, if you watch development in time-lapse, it'll make you dizzy. You can see it in public programming events like the Google App Engine Introduction. I did it too, and it's reflected here. It's useful both in exploratory development and recapitulation.]

What did I know at this "template logical outline" stage? With an evangelical passion for the Model-View-Controller pattern, Django templates attempt to force logical constraints on the developer, literally, by forcing as much logic as possible out of the template. The result is a few extra conditional tags here and there -- for example, a test for item_form and a separate "equal" test for the desired list_key, resulting in a duplicate "Create an item" link in the template.
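
For comparison only, and not what the deployed application does: the two template tests could be collapsed into one by deciding in Python which list shows the form. The sketch below assumes each list reaches the template as a dictionary with a 'key' string; the function name mark_form_target and the 'show_form' flag are hypothetical.

```python
def mark_form_target(lists, item_form, list_key):
    """Return copies of the list dicts with a 'show_form' flag added."""
    marked = []
    for entry in lists:
        flagged = dict(entry)
        # True only for the one list whose key came back from the user.
        flagged["show_form"] = bool(item_form) and entry["key"] == list_key
        marked.append(flagged)
    return marked


lists = [{"name": "Groceries", "key": "k1"}, {"name": "Errands", "key": "k2"}]
marked = mark_form_target(lists, item_form=1, list_key="k2")
flags = [entry["show_form"] for entry in marked]
```

A template could then test list.show_form once and keep a single "Create an item" link in its else branch. The trade-off is that a little view logic moves into the handler, which is the direction Django's philosophy pushes anyway.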

Step 9: Major "Create" Pathway

Next I need to generate the List form, when the user clicks the "Create a List" link. Here's CreateList.get again, unchanged:

class CreateList(webapp.RequestHandler):
  def get(self):
     template_values = {
        'list_form': 1
     }
     self.response.out.write(render(template_values))



... and the matching form inside the template, turned on by the list_form switch:

{% if list_form %}
  <form action="/create_list/" method="post">
    <textarea  style="background:#eeee00" name="name" rows=1 cols=33>
    </textarea><br>
    <span align="left">
       <input type="Submit" name="button" value="Create list">
    </span>
    <span style="padding-left:138px">
       <a href="/">cancel</a></span>
    <br>
  </form>
{% else %}
  <br> <a href="/list_form/">Create a List</a><br><br>
{% endif %}



Then, I need CreateList.post to receive this POST, and create a list:

  def post(self):
    list = List()

    list.name = self.request.get('name')
    list.put()

    self.redirect("/")



Step 10: Bundling for a Template

And then I need to display the list itself. I already have an idea for displaying items, so I'll write the code for rendering in one try.

Which brings us to the biggest consequence of the Django template philosophy, at this stage -- our function render needs to pre-package a new structure, to pass to the Django template rendering system. This is the structure used by the "{% for list in lists %}" and "{% for item in list.items %}" template tags in my index.html.

To do this, render needs to fetch Lists and Items from the datastore and create a kind of populating payload, which I build as "new_lists" below, and rename as "lists" to pass it to the template, within the "template_values" dict structure:

def render (template_values):
  lists = db.GqlQuery("SELECT * FROM List")

  new_lists = []
  for list in lists:
     # fetch the items that refer back to this list
     items = db.GqlQuery("SELECT * FROM Item WHERE list_key = :1",
                         list.key())

     new_list = {
        'name': list.name,
        'key': str(list.key()),
        'items': items
     }
     new_lists.append(new_list)

  template_values['lists'] = new_lists

  path = os.path.join(os.path.dirname(__file__), 'index.html')
  return template.render(path, template_values)



... which is unraveled by the logic in the index.html Django template.

(Note the need to cast the list.key() to a string for the payload).
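
Why the cast matters can be shown without GAE. The key that comes back through the URL is always a string, while the datastore hands out key objects, so the template's "ifequal" test only succeeds when both sides are strings. In the sketch below, FakeKey is a hypothetical stand-in and not the GAE key class; I am assuming, as the stand-in does, that a key object does not compare equal to its own string form.

```python
class FakeKey(object):
    """Hypothetical stand-in for a datastore key object."""

    def __init__(self, encoded):
        self._encoded = encoded

    def __str__(self):
        return self._encoded


key_from_datastore = FakeKey("k2")
key_from_url = "k2"  # what the "(.*)" group delivers: always a string

# Object against string: never equal, so the form would never appear.
matches_as_object = (key_from_datastore == key_from_url)

# String against string: equal, so the right list gets the form.
matches_as_string = (str(key_from_datastore) == key_from_url)
```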

Step 11: Minor "Create" Pathway

To demonstrate this, I'll need to GET an Item form, making sure it has the list key. Again, CreateItem.get, unchanged:

class CreateItem(webapp.RequestHandler):
  def get(self, list_key):
     template_values = {
        'item_form': 1,
        'list_key': list_key
     }
     self.response.out.write(render(template_values))



... and its matching template section:

{% if item_form %}
      {% ifequal list.key list_key %}
        <a name="form"></a>
        <form action="/create_item/" method="post">
          <span style="padding-left:15px">
             <textarea style="background:#eeee00" name="name" rows=1 cols=33>
             </textarea><br>
          </span>
          <span style="padding-left:15px"> 
             <input type="Submit" name="button" value="Create item">
          </span>
          <span style="padding-left:130px">
             <a href="/">cancel</a>
          </span>
          <br>
          <input type="hidden" name="list_key" value="{{ list_key }}">
        </form>
      {% else %}
        <br>
        <span style="padding-left:15px">
           <a href="/item_form/{{ list.key }}/#form">Create an item</a>
        </span>
        <br><br>
      {% endifequal %}
   {% else %}



We'll instantiate Items in the CreateItem.post handler:

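  def post(self):
    item = Item()

    # the hidden field returns the key as a string; a ReferenceProperty
    # needs a Key (or a List instance), so rebuild the Key from the string
    item.list_key = db.Key(self.request.get('list_key'))
    item.name = self.request.get('name')
    item.put()

    self.redirect("/")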



Conclusion

That's all. That's one sequence, of medium length.

The application runs on GAE here, along with the final versions of passingkeys.py, index.html, and the app.yaml control file (which includes a mapping for these static source files).

This sequence cannot represent the intentions of the builders of Python, Django or the GAE implementation. It is only the story of my resolutions to my problems. It will only let you understand what I've done. But, you may find that some part is useful for your application.

Of course, there are many problems with this small sequence. It's not perfect. However, I believe that many such sequences, ironed-out, polished and improved, can serve as inspiring programming resources. This practical, introspective approach may guide us to a future of smoother software development, for more complex and coherent systems. It might also shape the future of tools and languages to support good engineering and design.

We only need to take nature's achievements seriously.

3 comments:

j_l_larson said...

Hello, your application looks intriguing but fails with a quota exception: http://passingkeys.appspot.com/

j_l_larson said...

Nevermind, I copied your code over to my own instance and am working through it. Thank you so much for this introduction!

j_l_larson said...

How would you implement delete item using this methodology?