Wednesday, 31 October 2007

RESTful Web Services Example 3.4 - Obtaining a list of buckets

Example 3.4 of RESTful Web Services deals with obtaining a list of buckets (think of them as top-level directories). This first version of the code returns a collection of strings, each one being the name of a single bucket. This is done by sending #getBuckets to an S3BucketList instance.

If you load the latest code (RWS-gc.25) from the RWS repository you’ll see many changes. The ones we’re interested in are the S3Authorized class and the S3BucketList class.

The former is a class extracted via refactorings from the S3Object class, while the latter allows you to get the list of your buckets.

Take a look at S3BucketList>>getBuckets. The method is defined as:
S3BucketList>>getBuckets
    | doc |
    "Fetch all the buckets this user has defined.

    Code taken from RESTful Web Services. Copyright © 2007 O'Reilly Media, Inc.
    All rights reserved. Used with permission."

    "GET the bucket list URI and build an XML DOM tree from it."
    doc := XMLDOMParser
        parseDocumentFrom: (self get: Host) contents readStream.
    "For every bucket, collect its name"
    ^ doc // 'Bucket' / 'Name' collect: [:each | each contentString]

As you can see, the Smalltalk code is shorter than the corresponding Ruby code. That’s because we used Pastell, an XPath-like DSL for navigating XML DOM trees, instead of using a separate XPath library.

The first line of code creates a DOM tree from the XML document that the receiver obtains by sending #get: to itself. #get: is one of the new messages added in the previous post.

In the second line, the DOM tree is sent a #// message. This message selects all the descendant nodes of an XML node whose name matches the argument of the message. In this example, the result of this message send is a collection of nodes whose tag is 'Bucket'. This collection is an instance of PastellCollection, a special subclass of OrderedCollection that understands the Pastell messages.

The list of nodes is sent another Pastell message, #/. This message selects all the child nodes whose tag matches the argument of the message. In this example, when sent to the PastellCollection instance returned by #//, it returns another PastellCollection whose elements have a 'Name' tag. We can retrieve the actual names of these elements by collect:ing their contentStrings.
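To see the two operators at work outside of S3, the following workspace sketch runs the same expression on a hand-written document. The XML below is a made-up sample loosely modelled on a bucket listing (the bucket names are invented), and the snippet assumes that both the XML parser and Pastell are loaded in the image.

```smalltalk
| xml doc names |
"Made-up sample document; the bucket names are illustrative only."
xml := '<ListAllMyBucketsResult><Buckets>',
    '<Bucket><Name>photos</Name></Bucket>',
    '<Bucket><Name>backups</Name></Bucket>',
    '</Buckets></ListAllMyBucketsResult>'.
doc := XMLDOMParser parseDocumentFrom: xml readStream.

"Binary messages are evaluated left to right and bind tighter than
keyword messages, so the parentheses below only make explicit what
the unparenthesized expression in #getBuckets already does."
names := ((doc // 'Bucket') / 'Name') collect: [:each | each contentString].
names
```

Inspecting names should show one string for each Name element of the sample document.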

In the next two examples, we’ll see how to model a single bucket as a Squeak object, and how to modify S3BucketList>>getBuckets in order to make it return a collection of objects instead of strings.

Monday, 29 October 2007

RESTful Web Services - A Better HTTP interface

Before moving to the remaining Amazon S3 examples from Chapter 3 of the RESTful Web Services book, let’s take another look at the code. The previous examples were strongly inspired by the corresponding Ruby examples. This, paired with the use of the libCurl bindings, has led to code which is brittle and hard to modify.

As an example, let’s consider the code for S3Object>>open:method:headers:. This method requires the HTTP method in order to compute the correct signature for the request, but then ignores it by sending request download, which is the message used to send a GET request. One possible fix is a case-like test that picks the right message to send; the code would then become something like this:
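"Strings must be compared with #=, not #==: the latter tests identity
and would never match the copy answered by #asLowercase."
(method asLowercase = 'put')
    ifTrue: [request upload]
    ifFalse: [request download]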

Now, this isn’t just ugly, it’s plain bad OO code. This means we have to look for another solution.

The solution that I’ve chosen is based on the pattern used for Seaside’s and HV2’s Canvas systems. It involves having an S3Request object that wraps a Curl instance and provides a simple interface. S3Request has different subclasses for the various HTTP methods: S3GetRequest, S3PutRequest etc.

The code for S3Request and its subclasses may be found in package RWS-gc.20 from the RWS repository. S3Request is defined as:
Object subclass: #S3Request
    instanceVariableNames: 'headers url request contents'
    classVariableNames: ''
    poolDictionaries: ''
    category: 'RWS-Examples'

I won’t provide a detailed explanation of this class, save for the #send message:
S3Request>>send
    "Build a Curl request from the url and headers, perform it, and answer it."
    request := Curl new.
    request url: url.
    headers keysAndValuesDo:
        [:key :value |
        request addHeader: key, ': ', value].
    request onHttpHeaders.
    self perform.
    ^ request

#send creates a new Curl object, sets the headers and url for the new request, then sends a #perform message to itself before returning the Curl object. #perform is an abstract message implemented by the various subclasses of S3Request. For example, S3PutRequest>>perform looks like this:
S3PutRequest>>perform
    "Attach the body, if any, then upload it."
    contents ifNotNil: [request contents: contents].
    request upload
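As a sketch of how the pieces fit together, the abstract method and a GET counterpart could look like the following. The body of S3GetRequest>>perform is an assumption, based on the fact that #download is the message Curl uses for GET requests; the actual code in the package may differ.

```smalltalk
S3Request>>perform
    "Abstract: each subclass sends the Curl message matching its HTTP method."
    self subclassResponsibility

S3GetRequest>>perform
    "Assumed implementation: a GET request has no body to set."
    request download
```

With this arrangement the choice of HTTP method is made once, when the request object is created, and #send never needs to test it.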

This is a simple and elegant solution to our problem that, while adding 5 new classes to the system, keeps the overall complexity of the code to a much lower level.

In order to use this new API, I’ve added some more messages to the protocols of S3Object. If you browse the latest versions of this class, you’ll find many methods such as #get:headers:, #put:contents:headers: etc. These methods replace the #open:method:headers: message and its variants.

In the next post, we’ll see how to interact with the Amazon S3 service. Meanwhile, I suggest you go over Chapter 3 of Richardson and Ruby’s book in order to get yourself acquainted with S3’s lingo.

Friday, 26 October 2007

KomHttpServer 7.0.5

In the past few days I’ve become one of the maintainers of KomHttpServer, Squeak’s native HTTP server. I’ve just released a new version (available on SqueakMap) that includes various bug fixes and HTTP 1.1-compliant status codes (plus some WebDAV ones, too).

I’d like to thank Goran, Ron, Philippe and all the others for the work they’ve done on Kom’s source code.

Wednesday, 24 October 2007

Vito Di Modugno Organ Quartet plays Haitian Fight Song

Charles Mingus’ Haitian Fight Song (and its later variant II B.S.) is one of my favourite jazz pieces. And the organ, especially the Hammond organ, is one of my favourite musical instruments. So I was very pleased to find on YouTube a version of Haitian Fight Song played by the Vito Di Modugno Organ Quartet. Here it is, for your listening pleasure:

Tuesday, 23 October 2007

Miles Davis - So What 1958 & 1964

Looks like I’m not the only one with a blog who’s interested in jazz and programming. Patrick Logan just published a wonderful post about Miles Davis’ So What. So What is a track from the Kind of Blue album, and Patrick compares two different versions of it: the first one from 1958, the second one from 1964. It’s difficult to say which one I prefer.

Incidentally, just ten days ago in the #verbamanent IRC channel we briefly discussed and compared Davis’ Kind of Blue, Coltrane’s A Love Supreme and Mingus’ The Black Saint & The Sinner Lady. The comparison is a little inappropriate, since Kind of Blue precedes the other two albums by roughly five years, but it’s interesting nonetheless. I’m going to write more about these three albums in another post.

Monday, 22 October 2007

RESTful Web Services - Bugfixing CurlPlugin

Before moving to today’s example, in which we’re going to start interacting with the Amazon S3 server, I’ll take a detour to explain the delay. As I wrote in the previous post, this example is so late because of a bug in CurlPlugin that I introduced myself.

The original CurlPlugin didn’t expose at the Smalltalk level any way of setting custom HTTP headers in the request, so I had to add this feature to the plugin. When I did that, I overlooked the fact that a Smalltalk string and a C string aren’t implemented in the same way: a Smalltalk string, being a Collection subclass, has an arrayed collection of character elements (either 8-bit or 32-bit wide) and an explicit size attribute, while a C string is a null-terminated array of char.

Since nothing guarantees that, in the internal representation of a Smalltalk string, the value after the last character is a null character, you have to add one explicitly. Since I forgot to do so (by sending a simple #asCString message, provided by CurlPlugin), I introduced a tricky bug in the plugin. To make things worse, the bug showed up only when the string was longer than usual: short headers such as Date: or Content-Type: caused no problems, while the Authorization: header always ended up with some spurious trailing characters.
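The difference can be illustrated from the image side with a workspace sketch. This is not the plugin code: it only shows, with a made-up header value, what an explicitly terminated copy of a string looks like.

```smalltalk
| header terminated |
"Made-up header value, for illustration only."
header := 'Authorization: AWS someKey:someSignature'.

"A Smalltalk string carries its own size, so nothing marks its end in memory.
Appending a character with value 0 gives C code the terminator it scans for."
terminated := header , (String with: (Character value: 0)).

"The copy is exactly one element longer, and that element has value 0."
terminated size = (header size + 1).
terminated last asInteger = 0
```

Without such a terminator, C code keeps reading past the end of the string until it happens to meet a zero byte.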

In order to track down this bug, I had to install a packet sniffer, learn how to use it, and look at what libCurl was actually sending to the server. As often happens, once the problem was identified, fixing it was a matter of minutes.

Now that the bug is fixed, you’ll have to download the new version of the CurlPlugin. Extract the archive, overwrite the CurlPlugin file in your image directory, file in the new Curl.st file, and you’ll be ready for the next example.

Sunday, 21 October 2007

RESTful Web Services Example 3.16 - Signing the request

After a little less than a month, let’s resume the Squeak Smalltalk versions of the examples from RESTful Web Services. This latest installment is so late because of a bug that I introduced in the CurlPlugin code when adding a new primitive, which took me a long while to debug. The new primitive is needed for the next example in this series.

We already know how to build a canonical string, and how to actually sign it. In today’s example, we’ll see how the request signature is created from the URI, the HTTP method, and the headers.
This is the task of the #signatureFor:method:headers:expires: method and its ilk.

The source for S3Object>>signatureFor:method:headers:expires: is the following:
S3Object>>#signatureFor: aStringOrUri method: method headers: headers expires: expires
    | uri path |
    "Builds the cryptographic signature for an HTTP request. This is
    the signature (signed with your private key) of a 'canonical
    string' containing all interesting information about the request.

    Code taken from RESTful Web Services. Copyright © 2007 O'Reilly Media, Inc.
    All rights reserved. Used with permission."

    "Accept the URI either as a Squeak URI object, or as a string"
    uri := aStringOrUri asURI.
    path := uri query ifNil: [uri path] ifNotNil: [uri path, uri query].

    "Build the canonical string, then sign it"
    ^ self signString: (self canonicalStringFor: method path: path headers: headers expires: expires)

The method is very simple, but a couple of observations are worth making.

The first observation is about the uri temporary variable: since the aStringOrUri parameter, as the name says, may be either a string or a URI, we send the #asURI message in order to be sure that uri holds a URI object (this way of ensuring that an object is of the appropriate class is a common pattern in Smalltalk).
Once we are sure we have a URI object, we extract the path from the URI, plus the optional query string (the part of the URI after the ? character).
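The conversion pattern usually rests on two tiny methods, one per class, so that the caller never has to test the class of its argument. The sketch below shows the general shape; it is written from memory of how such converters are normally defined, and it may not match the actual source in the image.

```smalltalk
String>>asURI
    "A string parses itself into a URI object."
    ^ URI fromString: self

URI>>asURI
    "A URI is already what the caller wants."
    ^ self
```

Whichever kind of object is passed in, the sender simply writes aStringOrUri asURI and lets polymorphism pick the right method.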

Finally, we build the canonical string, sign it and return the signature.

The source code for this method is included in the RWS-gc.14 package in the RWS repository. In the same package you will find other variations of the same method:

`S3Object>>#signatureFor:method:headers:`
`S3Object>>#signatureFor:method:`
`S3Object>>#signatureFor:headers:`
`S3Object>>#signatureFor:`


If you browse the source code for these methods, you’ll see that they all send the #signatureFor:method:headers:expires: message with the appropriate default parameters. This is a very common pattern in Smalltalk, but it has one serious flaw: the sheer number of possible combinations makes it impractical on a large scale. For example, a message with four parameters has 14 other possible variants (2^4 − 2, that is, one for every non-empty proper subset of the parameters), while one with five parameters has 30 (2^5 − 2).
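As an illustration of the delegation, the shortest variant could be written as follows. The default values used here (a GET method, an empty header dictionary, no expiration) are assumptions made for the sake of the example, not necessarily the ones used in the package.

```smalltalk
S3Object>>signatureFor: aStringOrUri
    "Hypothetical defaults: GET, no headers, no expiration."
    ^ self
        signatureFor: aStringOrUri
        method: 'GET'
        headers: Dictionary new
        expires: nil
```

Each of the other variants would follow the same shape, filling in only the parameters its selector leaves out.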

This combinatorial explosion is somewhat mitigated by the fact that usually there’s only a restricted subset of sensible and useful variants of a message, but in many cases this is not enough. This is one of the reasons why Seaside moved from a builder system to a canvas one for (X)HTML generation.

In the next post in this series, we’ll see how to send a signed request to the Amazon S3 service.