Friday, November 28, 2008

Passing Keys


The current version of this article and sequence can be found at Core Memory.


An entity is an instance of a data model, analogous to a data record in a database table.

When you store such a chunk of data, sometimes you need to locate it quickly, using a unique identifier. This is the entity's key.

Keys for data entities are typically unique relative to the data model. They are commonly hidden in a web page so that, when the user clicks something, the appropriate key is passed back to the server. This "passing of keys" (from the server, to the user, then back to the server) happens in any web application that empowers the user to create many-to-one relationships on a web page -- for example, to create a list with items.

The Google App Engine (GAE) hosting environment provides a database with a difference: when created, each entity is assigned a key universal to Google's "Big Table", guaranteed never to change, and guaranteed to be accessible only from your GAE application.

This implies an unusual approach to passing keys to the browser. When the webapp generates a web page the data choices can be hidden in the page, not with ID's generated on-the-fly, but with permanent, universal keys. There are clues on how to do this, mostly implied or embedded, scattered across the Google App Engine Group's discussion forum, the GAE cookbook, etc. But there are no simple, deployed examples. So I thought I would provide one.

Unfolding of an Example

To present this, I'm explicitly introducing an ancient illustrative technique to the world of software development, one that I think is particularly pleasant, unusually effective and full of possibility: the unfolding of a program. It's a polished sequence of stepwise software development. This is rather different than a walk-through of finished code. You may have seen it before in public programming performances ... in fact, Google introduced GAE with just such a performance, which I'll analyze in a separate paper.

For the moment, let's think of a program as a living organism.


During an organism's development, its structure and shape change, moving from very simple to very complex. We marvel at nature's ability to maintain a coherent morphology amidst this massive engineering project. Amazingly, the elegant developmental sequences we see in nature, which generate such robust and complex mechanisms, are very flexible and adaptive. Living things tap a superb library of transformative solutions and methods. Since humans have only about 30,000 genes, life somehow creates robust, unique people from something analogous to 30,000 small functions. The Linux kernel has about 10 million lines of code. We need to pay much closer attention to nature.



If you watch this gradual, incremental growth, you'll see difficult engineering problems overcome fluidly, simply, at each stage, at all scales. A great deal of ad hoc adaptation is going on, adaptations synthesized from available information, used in a persistent, continuously balanced approach to tackling the emergence of the entire organism, not just tiny parts of it.



This holistic, stepwise approach to growth can be seen as following, or borrowing from, many sequences of steps. Each step creates a morphological differentiation, building upon the previous step, providing the context for the next. Surprisingly, although the number of sequences borrowed by a developing organism is vast, the number of differentiations is not. A human being develops in approximately 50 cell divisions, for example.



Assuming that programmers are called developers because they engage in something strongly akin to this kind of development, it's unfortunate that one of the least explored areas in programming is the study of the steps and sequences we use. When we decide to change code, at that moment, what are we doing? What are the problems, and what is our resolution? And where are we going? Clearly it would be useful to pass these steps and sequences around, among ourselves, for the sake of improving our discipline. The study of developmental programming could provide a quicker, more efficient and effective absorption of new contexts, in the fast-changing world of engineering. Sequences could provide another tool for Open Source computer science.

If all this talk seems redolent of Design Patterns, there's a reason. The notion of identifying sequences of steps that humans can take, to build complex coherent systems, comes from Christopher Alexander, whose practical architectural philosophy inspired the computing movement to mine patterns, and pattern languages. Research by Alexander and myself in the 1990's indicated, among many other things, that people take from a sequence whatever they need, absorbing information about dependencies, and borrowing solutions, and sensibilities, very like a developing organism making use of its genome.

I believe that polished sequences are a complement to patterns and refactoring ... after you've refactored towards various patterns, you tend to think: "if only I'd known about A! At this point I would have done A instead of B" or "Y turned into a huge time-wasting problem later on. I'm sure someone else has run into this."

Sequences give you the opportunity to save vast numbers of people from the frustration of poor solutions, and the difficult path to rediscovery of an already known solution. Sequences can make us more effective, and help us to produce quality work more easily and more often.

So, to our example. I hope this new approach is useful, and that the sequence itself provides solutions large and small, from which you can borrow.

Preliminary Step: Scope and Environment

To get started, note that I'm not supplanting the steps found within the Google App Engine Getting Started tutorial, which is still the only coherent tutorial for Google App Engine development. If you're making use of my sequence, below, I assume that you're already running the development environment, based on instructions in the GAE tutorial.

Again I'm focussed on simply passing keys. This sequence will not develop code for editing, ownership, user authentication, sessions, sorting, timestamps, AJAX, exception handling, CSS, backups or any number of otherwise perfectly reasonable and necessary features. But, clearly, those tasks too can be broken into small sequences, "snippets" if you will, which could be applied in an additive fashion to any application derived from this article.

In this example I'm also using a very simple subset of Python, free of 'dunder' methods, and avoiding the Django helper. I want to limit the number of technical features I'm describing here, while allowing as broad a swath of programmers as possible to read it. Similar limits were placed on the tutorials produced by Google.

We'll call this List-Item sample application "Passing Keys", or "passingkeys" to GAE. It's hosted at passingkeyshg.appspot.com, with links to the source. The primary Python module is passingkeys.py .

Step 1: Using purpose to shape the persistent data models

My first step directly reflects my main goal.

What do I want to do with this app? I want to create Lists and Items. Since I know something about the GAE application space, there's no harm in using a little code to start. In fact, it immediately gives the reader some understanding of GAE. Concrete implementation is not taboo in the world of generative sequences: after all, information critical to a developing organism is conveyed through real molecules.

So, in passingkeys.py, let's define List and Item, two different Datastore models derived from db.Model ... db is the Google App Engine module that wraps the Datastore API, which reads and writes from Google's Big Table, or its simulation in the development environment on your own machine.

Note that the relationship between List and Item is made explicit by Item's "ReferenceProperty (List)" declaration. We will assign to Item's list_key the associated List instance, when the time comes.

class List(db.Model):
  name = db.StringProperty(multiline=True)

class Item(db.Model):
  list_key = db.ReferenceProperty(List)
  name = db.StringProperty(multiline=True)



Note: if you are using these steps to guide the development of a different application, it's good to keep data models that are understood separate from your exploratory models.

Step 2: Functional Centers and the First Bridge

In passingkeys.py, we'll define the three clumps of code that I'll call functional centers. They reflect the three primary centers of activity that I foresee in this app. These aren't classes yet. But this outline bridges the gap between, on the one hand, the user's sense of real activity, and on the other hand, web technology (GETs, POSTs, forms rendering HTML) and the supporting classes and methods.


The Home Page:
* get: displays the current lists and items
* render: called by all other functions to render a version of the home page

Creating Lists:
* get: renders the home page with a List form
* post: handles the CGI POST to create a List

Creating Items:
* get: renders the home page with an Item form
* post: handles the CGI POST to create an Item



Step 3: The Class and Method Bridge

I associated render above with the home page. But, when I step over the next bridge, to a class and method outline, Python's object idiosyncrasies kick in. 'Class methods' are not normal in Python, so render needs to be an isolated function definition. Hence, my class and method outline is:

def render
   # renders home page

# handlers:
class HomePage
  def get
      # returns render's output

class CreateList
  def get
     # calls render with List form
  def post
     # creates a List

class CreateItem
  def get
     # calls render with Item form
  def post
     # creates an Item



Step 4: Encode the Handler Shells; Outline Rendering

Let's make the class and method outline more rigid, 'hardening' the shells of both.

Some might complain that this approach is too top-down, that I'm not following an end-to-end incremental methodology here. But, this is runnable code. Over the decades, programming languages have shifted towards pseudocode, so that we can ease our way through these early differentiating stages more naturally. The previous steps could be end-to-end, and runnable, if the language and environment shifted a little further in this direction. And if it included higher-level descriptions for web-page pathways. I find the conflicting environmental support for people who've done the same thing many times, and people who are exploring the environment for the first time, to be unnecessary. With open-source sequences, these incongruities would resolve themselves, with better development environments.

There's a hidden assumption in what follows -- it's in my mind, so I might as well spell it out. For simplicity, I'll use one Django template, which I'll name index.html, to create the HTML presented to the user. The template will be populated with Item and List data by my render function, which in turn is passed a Python dictionary (a collection of name-value pairs). This dictionary also includes switches to turn the forms on and off.

Here's an implementation outline of the five handler methods + 1 render function:

def render (template_values):
   # put data in template_values
   # call Django templating
   # return results to calling handler

class HomePage (webapp.RequestHandler):
  def get (self):
     # create vanilla home page
     # by calling render

class CreateList (webapp.RequestHandler):
  def get (self):
     # create list form
     # by calling render

  def post (self):
     # insert data from list form
     # redirect to home

class CreateItem (webapp.RequestHandler):
  def get (self, list_key):
     # create item form with hidden list_key
     # by calling render

  def post (self):
     # insert data from item form
     # redirect to home



I call my render function for GETs, and I redirect to home after POSTs. I could consistently call the render function, or with a session layer I could consistently redirect. I'm just trying to reduce overhead, for this example.

Step 5: Directing URL Regexes to Handlers

As per GAE's default webapp WSGI framework, there's a mapping, at the end of passingkeys.py, between our defined URL regex's and our request handlers. So the end of every main web application module in GAE-land looks like this:

application = webapp.WSGIApplication(
                                     [('/', HomePage),
                                      ('/list_form/', CreateList),
                                      ('/create_list/', CreateList),
                                      ('/item_form/(.*)/', CreateItem),
                                      ('/create_item/', CreateItem)],
                                     debug=True)

def main():
  run_wsgi_app(application)

if __name__ == "__main__":
  main()



The critically important detail here, relative to passing keys in a Python/GAE implementation, is the "(.*)" in the regex, which passes the contents of that string (a key coming back from the user) as an argument to CreateItem's CGI GET handler, which we call "list_key" inside CreateItem.get.
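
To make that capture group concrete, here is a minimal sketch of the dispatch step using only Python's standard re module. It is not webapp's own code: ROUTES and dispatch are hypothetical stand-ins, handlers are represented by name only, and the assumption that each pattern must match the whole path is mine.

```python
import re

# Hypothetical route table mirroring the URL mapping above. Handlers are
# plain strings here so the sketch runs without App Engine installed.
ROUTES = [
    (r'/', 'HomePage'),
    (r'/list_form/', 'CreateList'),
    (r'/item_form/(.*)/', 'CreateItem'),
]

def dispatch(path):
    """Return (handler_name, captured_groups) for the first matching route."""
    for pattern, handler_name in ROUTES:
        # Assumption: a pattern has to match the whole path, not a prefix.
        match = re.match('^' + pattern + '$', path)
        if match:
            # Each "(...)" group becomes one positional argument.
            return handler_name, match.groups()
    return None, ()

handler, args = dispatch('/item_form/some-list-key/')
print(handler, args)
```

Whatever sits between "/item_form/" and the trailing slash ends up in args, which is the value CreateItem.get receives as list_key.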

Step 6: Encode Template Rendering

The Django template index.html will be rendered by my render function, which initially just looks like this:

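def render (template_values):
   # to do: add data to template_values
   path = os.path.join(os.path.dirname(__file__), 'index.html')
   # render is a plain function with no self, so it returns the page
   # and lets the calling handler write it to the response
   return template.render(path, template_values)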



Step 7: Encode Template Values for Create Get Methods

CreateList and CreateItem's GET handlers will pass a Python dictionary structure, with switches and a key, to my render :

class CreateList (webapp.RequestHandler):
  def get (self):
     template_values = {
        'list_form': 1
     }
     self.response.out.write (render(template_values))

class CreateItem (webapp.RequestHandler):
  def get (self, list_key):
     template_values = {
        'item_form': 1,
        'list_key': list_key
     }
     self.response.out.write (render(template_values))



Step 8: Template Logic Outline

For developmental balance, I'm compelled now to complete our end-to-end infrastructure here, with a quick cut at our Django template index.html, based on the assumptions so far:


<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <title>Passing Keys</title>
  <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" >
</head>
<body>
<a href="/"><font size="+3"><b>Passing Keys</b></font></a><br>
 <br><br>
{% if list_form %}
   {# list_form #}
{% else %}
  <a href="/list_form/">Create a List</a><br><br>
{% endif %}
{% for list in lists %}
   {{ list.name }}<br>
   {% for item in list.items %}
      {{ item.name }} 
   {% endfor %}
   {% if item_form %}
      {% ifequal list.key list_key %}
         {# item form #}
      {% else %}
        <a href="/item_form/{{ list.key }}/#form">Create an item</a>
        <br>
      {% endifequal %}
   {% else %}
         <a href="/item_form/{{ list.key }}/#form">Create an item</a>
   {% endif %}
{% endfor %}

</body>
</html>



[Note that the end-to-end implementation cycle has started. Here, I start with the "Create" links. People tend to follow this cycle, this mini sequence, as a reality check during live webapp development: create links -> create form -> entity storage -> render data. It's a very natural cycle, and so common that, if you watch development in time-lapse, it'll make you dizzy. You can see it in public programming events like the Google App Engine Introduction. I did it too, and it's reflected here. It's useful both in exploratory development and recapitulation.]

What did I know at this "template logical outline" stage? With an evangelical passion for the Model-View-Controller pattern, Django templates attempt to force logical constraints on the developer, literally, by forcing as much logic as possible out of the template. The result is a few extra conditional tags here and there -- for example, a test for item_form and a separate "equal" test for the desired list_key, resulting in a duplicate "Create an item" link in the template.
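
The branching that the template performs under each list can be written out as an ordinary function, which may make the duplicated link easier to see. This is only a sketch in plain Python: item_area is a hypothetical name, and it mirrors the template tags rather than anything Django runs.

```python
def item_area(template_values, this_list_key):
    """Decide what appears under one list: the item form or the link."""
    if template_values.get('item_form'):
        # Mirrors {% ifequal list.key list_key %}
        if this_list_key == template_values.get('list_key'):
            return 'form'
        return 'link'  # first copy of the "Create an item" link
    return 'link'      # second copy, used when no form was requested

# Home page: no switches set, so every list shows the link.
print(item_area({}, 'key-1'))
# Item form requested for key-1: only that list shows the form.
print(item_area({'item_form': 1, 'list_key': 'key-1'}, 'key-1'))
print(item_area({'item_form': 1, 'list_key': 'key-1'}, 'key-2'))
```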

Step 9: Major "Create" Pathway

Next I need to generate the List form, when the user clicks the "Create a List" link. Here's CreateList.get again, unchanged:

class CreateList(webapp.RequestHandler):
  def get(self):
     template_values = {
        'list_form': 1
     }
     self.response.out.write(render(template_values))



... and the matching form inside the template, turned on by the list_form switch:

{% if list_form %}
  <form action="/create_list/" method="post">
    <textarea  style="background:#eeee00" name="name" rows=1 cols=33>
    </textarea><br>
    <span align="left">
       <input type="Submit" name="button" value="Create list">
    </span>
    <span style="padding-left:138px">
       <a href="/">cancel</a></span>
    <br>
  </form>
{% else %}
  <br> <a href="/list_form/">Create a List</a><br><br>
{% endif %}



Then, I need CreateList.post to receive this POST, and create a list:

  def post(self):
    list = List()

    list.name = self.request.get('name')
    list.put()

    self.redirect("/")



Step 10: Bundling for a Template

And then I need to display the list itself. I already have an idea for displaying items, so I'll write the code for rendering in one try.

Which brings us to the biggest consequence of the Django template philosophy, at this stage -- our function render needs to pre-package a new structure, to pass to the Django template rendering system. This is used by the "{% for list in lists %}" and "{% for item in list.items %}" template tags in my index.html.

To do this, render needs to fetch Lists and Items from the datastore and create a kind of populating payload, which I build as "new_lists" below, and rename as "lists" to pass it to the template, within the "template_values" dict structure:

def render (template_values):
  lists = db.GqlQuery("SELECT * FROM List")

  # build one dict per List, bundling its Items alongside it
  new_lists = []
  for list in lists:
     items = db.GqlQuery("SELECT * FROM Item WHERE list_key = :1",
                         list.key())

     new_list = {
        'name': list.name,
        'key': str(list.key()),
        'items': items
     }
     new_lists.append(new_list)

  # the template sees the payload under the name "lists"
  template_values['lists'] = new_lists

  path = os.path.join(os.path.dirname(__file__), 'index.html')
  return template.render(path, template_values)



... which is unraveled by the logic in the index.html Django template.

(Note the need to cast the list.key() to a string for the payload).
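
The shape of that payload can be tried out without a datastore. The sketch below uses plain tuples as hypothetical stand-ins for query results; bundle and the sample data are illustrative names, not part of passingkeys.py.

```python
def bundle(lists, items):
    """Build the nested payload that the template loops over.

    lists is a sequence of (key, name) pairs and items a sequence of
    (list_key, name) pairs; both stand in for datastore query results.
    """
    new_lists = []
    for key, name in lists:
        new_lists.append({
            'name': name,
            'key': str(key),  # the template compares keys as strings
            'items': [{'name': item_name}
                      for list_key, item_name in items
                      if list_key == key],
        })
    return new_lists

payload = bundle([(1, 'Groceries'), (2, 'Chores')],
                 [(1, 'milk'), (2, 'sweep'), (1, 'eggs')])
for entry in payload:
    print(entry['key'], entry['name'], [i['name'] for i in entry['items']])
```

Each entry carries exactly what the two nested for tags in index.html need: a name, a string key, and the items belonging to that list.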

Step 11: Minor "Create" Pathway

To demonstrate this, I'll need to GET an Item form, making sure it has the list key. Again, CreateItem.get, unchanged:

class CreateItem(webapp.RequestHandler):
  def get(self, list_key):
     template_values = {
        'item_form': 1,
        'list_key': list_key
     }
     self.response.out.write(render(template_values))



... and its matching template section:

{% if item_form %}
      {% ifequal list.key list_key %}
        <a name="form"></a>
        <form action="/create_item/" method="post">
          <span style="padding-left:15px">
             <textarea style="background:#eeee00" name="name" rows=1 cols=33>
             </textarea><br>
          </span>
          <span style="padding-left:15px"> 
             <input type="Submit" name="button" value="Create item">
          </span>
          <span style="padding-left:130px">
             <a href="/">cancel</a>
          </span>
          <br>
          <input type="hidden" name="list_key" value="{{ list_key }}">
        </form>
      {% else %}
        <br>
        <span style="padding-left:15px">
           <a href="/item_form/{{ list.key }}/#form">Create an item</a>
        </span>
        <br><br>
      {% endifequal %}
   {% else %}



We'll instantiate Items in the CreateItem.post handler:

  def post(self):
    item = Item()

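    # the key comes back from the form as a string; turn it back into
    # a datastore Key before assigning it to the ReferenceProperty
    item.list_key = db.Key(self.request.get('list_key'))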
    item.name = self.request.get('name')
    item.put()

    self.redirect("/")
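
The whole round trip (server to page, page back to server) can be sketched with the standard library alone. This is an illustration under stated assumptions, written for a current Python 3 rather than the Python 2.5 that GAE requires: LISTS is a hypothetical in-memory stand-in for the datastore, and hidden_field and handle_post are made-up helper names.

```python
from html import escape
from urllib.parse import parse_qs, urlencode

# Hypothetical stand-in for the datastore, indexed by the string key.
LISTS = {'key-abc': {'name': 'Groceries', 'items': []}}

def hidden_field(list_key):
    """Server -> browser: embed the key in the item form."""
    return ('<input type="hidden" name="list_key" value="%s">'
            % escape(list_key, quote=True))

def handle_post(body):
    """Browser -> server: read the key back and attach the new item."""
    fields = parse_qs(body)
    list_key = fields['list_key'][0]
    LISTS[list_key]['items'].append(fields['name'][0])
    return list_key

form_html = hidden_field('key-abc')
# What a browser would send when the form is submitted.
posted = urlencode({'list_key': 'key-abc', 'name': 'milk'})
handle_post(posted)
print(form_html)
print(LISTS['key-abc'])
```

The key never changes along the way, which is why a permanent datastore key can be used here in place of an ID generated on the fly.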



Conclusion

That's all. That's one sequence, of medium length.

The application runs on GAE here, along with the final versions of passingkeys.py, index.html, and the app.yaml control file (which includes a mapping for these static source files).

This sequence cannot represent the intentions of the builders of Python, Django or the GAE implementation. It is only the story of my resolutions to my problems. It will only let you understand what I've done. But, you may find that some part is useful for your application.

Of course, there are many problems with this small sequence. It's not perfect. However, I believe that many such sequences, ironed-out, polished and improved, can serve as inspiring programming resources. This practical, introspective approach may guide us to a future of smoother software development, for more complex and coherent systems. It might also shape the future of tools and languages to support good engineering and design.

We only need to take nature's achievements seriously.

Saturday, November 1, 2008

Google App Engine: local debugging and logging

Google provides instructions for using the Python logging module for debugging, and logging, in Google App Engine's Python environment.

The document describes fetching log lines from the actual cloud itself. Unfortunately, I can't find references on doing this locally, in my development environment: the local administration console is not as full-featured as the cloud one.

Luckily, I found this flag (--debug) which makes the logging module messages appear in my terminal output:

clear; dev_appserver.py --clear_datastore --debug .

(I clear the terminal screen so I can freshly search on keywords to find a particular log line.)

This works pretty well. And it's essential for debugging a server/CGI application.
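
For reference, the calls in question are the ordinary ones from Python's logging module. The sketch below is standalone: create_item is a hypothetical handler body, and the basicConfig line is only there so the example prints by itself. I'm assuming the --debug flag does the equivalent job under dev_appserver.

```python
import logging

# Assumption: set the root logger to DEBUG so this runs on its own.
logging.basicConfig(level=logging.DEBUG,
                    format='%(levelname)s %(message)s')

def create_item(list_key, name):
    """Hypothetical handler body that logs what it received."""
    logging.debug('create_item list_key=%s name=%r', list_key, name)
    if not name.strip():
        logging.warning('empty item name for list %s', list_key)
        return False
    return True

create_item('key-abc', 'milk')
create_item('key-abc', '   ')
```

Putting a distinctive keyword in each message makes the log line easy to search for in the terminal.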

Friday, October 31, 2008

Google App Engine without the App


Say you have a legacy static website -- just HTML and images -- and you'd like to host it on Google App Engine while you ponder the direction you'd like to take it in the future.

Well, in a way, you're lucky: it's much easier to port a static site to GAE, than it is to port a web app (we used to call these "dynamic" sites ... remember?) Unless you've been a Python web developer, you'll be starting from scratch if you're developing web apps for GAE.

But how do you port an entirely static site? Luckily, my wife, Olga Volchkova asked me to do just this. She wanted to see her old website, live. She built this very lovely hand-painted Fireworks/Dreamweaver site in late 2000, which we first released in early 2001, and which she hasn't touched since 2002. It looks very 2001 -- well, very much like her work in 2001, heavily influenced by the Golden Age of Russian animation from 1965 to 1985, best represented by Yuri Norstein's wonderfully gentle short films, like Hedgehog in the Fog -- and happily there wasn't a bit of code involved!

The only problem was the file quota: 1000 files is the limit for GAE. Macromedia Fireworks was profligate with files, dividing, in one case, a single page of hers into over 400! I culled that page for now ... I'll upload it as another project (Google gives you five GAE projects per cell phone number) some other time.

So, here's what an app.yaml file for a bunch of static html, htm, gif and jpg files looks like:


application: olgallerycom
version: 1
runtime: python
api_version: 1

handlers:

- url: /
  static_files: index.html
  upload: index.html

- url: /(.*)
  static_files: \1
  upload: (.*)


No Python in sight.

I use the static_files feature of app.yaml ... it is more flexible, in a sense, than static_dir. I eliminated the Python, because the collection of files already had a working structure, which I decided to preserve. It was easier than wrapping the file fetches in code and rearranging the directories.
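
To show how the two handlers divide the work, here is a small Python sketch of the lookup, using only the standard re module. It is not how GAE itself is implemented: HANDLERS and resolve are hypothetical names, and I'm assuming that patterns are tried in order and must match the whole path.

```python
import re

# The two url/static_files pairs from the app.yaml above.
HANDLERS = [
    (r'/', r'index.html'),
    (r'/(.*)', r'\1'),
]

def resolve(url_path):
    """Return the file a URL maps to, or None if no handler matches."""
    for url_pattern, static_files in HANDLERS:
        # Assumption: first match wins, and the pattern covers the whole path.
        match = re.match('^' + url_pattern + '$', url_path)
        if match:
            # \1 in static_files is replaced by the first captured group.
            return match.expand(static_files)
    return None

print(resolve('/'))
print(resolve('/gallery/cat.jpg'))
```

The second handler simply hands back whatever path was requested, which is what lets the existing directory layout stay as it is.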

This is fast and easy, and recommended if you need to get a legacy site up quickly. You can see the resulting sweet, gentle legacy site of Olga Volchkova, running on GAE.

Tuesday, September 30, 2008

Little secret: dev_appserver is_required



If you do skip around and rearrange the steps of the tutorials for Google App Engine, one thing you may try is to upload directly to Google, continually and incrementally, using appcfg.py update ...

Unfortunately, this won't work. If you use GQL and the Datastore API, you'll find that you need to run the local development server dev_appserver.py before deploying ... the dev server creates an index.yaml file, essentially mapping your db model and its use to Big Table. This gets deployed with your application. So, skipping this step (at least for most applications, which use the db) is not an option.

If you don't run the local dev_appserver, you'll get error messages from Google's servers when you run your webapp, like this:

Traceback (most recent call last):
  File "/base/python_lib/versions/1/google/appengine/ext/webapp/__init__.py", line 499, in __call__
    handler.get(*groups)
  File "/base/data/home/apps/2splice/1.31/main.py", line 20, in get
    for greeting in greetings:
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1257, in __iter__
    return self.run()
  File "/base/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1589, in run
    query_run = self._proto_query.Run(*self._args, **self._kwds)
  File "/base/python_lib/versions/1/google/appengine/ext/gql/__init__.py", line 581, in Run
    res = bind_results.Get(self.__limit, offset)
  File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 938, in Get
    return self._Run(limit, offset)._Next(limit)
  File "/base/python_lib/versions/1/google/appengine/api/datastore.py", line 887, in _Run
    str(exc) + '\nThis query needs this index:\n' + yaml)
NeedIndexError: no matching index found.
This query needs this index:
- kind: Greeting
  properties:
  - name: date
    direction: desc

Sunday, September 21, 2008

Google App Engine: quick-end-to-end



Here's a breathless, running narration of a one-hour attempt to get App Engine serving a web application.

A small complaint: the tutorials for Google App Engine are not geared towards getting a first real webapp deployed quickly. They are, instead, intended to give a web developer a pretty complete survey of the infrastructure for web development. Unfortunately, this means you really need to dig around to understand how to fully deploy, say, "helloworld" to "helloworld.com".

So, to address this deficit, here's an end-to-end, holistic version of the Google App Engine tutorial. I'm using a Macbook, running 10.4.11, and my domain name registrar is Network Solutions:

1) Gain registration-level control over the domain name at your registrar: Network Solutions, GoDaddy, Register.com, etc.

2) Sign up for free Standard Google Apps, using that domain, here. (Update: here's the current link directly to the free Standard Google Apps edition)

3) Log into http://appengine.google.com/a/[your-domain-name].

4) Follow the instructions to name your app (after your-domain-name is easiest) ... you get an access code, for verification, on your cellphone.

5) Install Python 2.5 on your local machine, from here. No earlier version will work, but installers are available for all platforms. Note that after installation, on a Mac, you'll need to use Get Info on any python file, to make all python scripts open with the 2.5 Python launcher (which was just installed in your Applications directory).

6) Download the Google App Engine Launcher for your local machine, here. Copy it to your Applications directory before launching it. Launch and install. On the Mac you'll need to restart your shell (or terminal) to pick up the new python.

7) Then, from the first half of the third page of the tutorial:
a) create a your-app-name directory
b) within that directory, make an app.yaml file, as described above, changing 'helloworld' to your-app-name within the file.
c) borrow the "helloworld" app itself above, and copy it into a your-app-name.py file.
d) in the shell, cd to the parent of this directory
e) upload / deploy your app with appcfg.py update your-app-name/
f) appcfg will ask you for the email and password of your google app account.
g) back on your appengine overview page (http://appengine.google.com/), you'll get a notice that an app was successfully deployed.

8) On this page (http://appengine.google.com/) click the name of the app. This will bring you to your app engine app's dashboard.

9) Click "versions". Follow the instructions for "add domain" ... you will need to prove the domain by signing up for Google Apps from here (The instructions on that page read: Changing a CNAME Record), and then come back to versions->add domain, type the domain name and click "add domain ..." so that ghs.google.com. will handle CNAME aliases from your registrar.

10) Back at my registrar, I delete the A Record for www, and do an "advanced DNS" move: a CNAME alias for www, to ghs.google.com. This works, and my app is now hosted by google at the domain I want.

Steps 8, 9 and 10 can be stumbled upon here.