Failed Experiments: 2008

Thursday, December 4, 2008

OLPC and the dying belief

I quoted OLPC earlier:

Our commitment to software freedom gives children the opportunity to use their laptops on their own terms. While we do not expect every child to become a programmer, we do not want any ceiling imposed on those children who choose to modify their machines. We are using open-document formats for much the same reason: transparency is empowering. The children—and their teachers—will have the freedom to reshape, reinvent, and reapply their software, hardware, and content.

Short, straightforward, and powerful language.

Compare that to the current page contents. Here's a sample:

Thus OLPC puts an emphasis on software tools for exploring and expressing, rather than instruction. Love is a better master than duty. Using the laptop as the agency for engaging children in constructing knowledge based upon their personal interests and providing them tools for sharing and critiquing these constructions will lead them to become learners and teachers.

As a matter of practicality and given the necessity to enhance performance and reliability while containing costs, XO is not burdened by the bloat of excess code, the “featureitis” that is responsible for much of the clumsiness, unreliability, and expense of many modern laptops.

A truly inspiring stand for constructivist teaching. Unfortunately, as far as I can tell, little else is being said.

Sunday, November 30, 2008

Understanding cl-cont semantics without thinking about CPS

Understanding the way that cl-cont transforms forms is one way to understand the sometimes counterintuitive behavior of that code. However, the difficulty of dissecting the meaning of code written in continuation-passing style is one of the major reasons that we use a CPS transformer at all instead of writing it out manually.

As a cl-cont user, you may find it easier to understand a behavior model that removes consideration of CPS entirely. After all, CPS is only the implementation method; the goal is to think of the code as "continuable".

Understanding first

To follow along with the behavior model of cl-cont, you should be familiar with dynamic context (such as that established by unwind-protect), lexical context, and the behavior of full continuations such as you would find in Scheme.

Understanding "ordinary" continuations is especially important, because cl-cont's behavior is more complex than that of full continuations, and you will be lost without preexisting knowledge of what they are.

Defining some terms

First, a continuation, unless otherwise qualified, is always a first-class continuation—a function that, when called, returns its arguments as values to the place where the call/cc call that created the continuation was called. This is just to be clear that I hardly ever mean something abstract when I say "continuation".

The macro with-call/cc introduces a lexical continuable context for the forms lexically contained within it. We will refer to this so often that I will call it LCC henceforth to avoid confusion. For all current purposes, it only matters whether code exists within any LCC, so you can think of "LCC-ness" as a flag on all code. We also say that any code not within an LCC is in an LNCC (lexical non-continuable context). We will define more rules for determining whether code is in an LCC later.

Entering an LCC at the beginning creates a dynamic continuable context or DCC. Understanding DCC behavior is the key to understanding cl-cont. Unlike with LCCs, you can have multiple distinct DCCs active at any time, possibly even sharing code, just as one function's call to mapcar doesn't preclude another function from calling it too. All code not executing within a DCC is executing within a DNCC (dynamic non-continuable context).

Lexical continuable contexts

As I have said, with-call/cc introduces an LCC. This is an implicit progn. A few convenient macros, such as lambda/cc and defun/cc, wrap their non-cc counterparts with with-call/cc. Within this form, an LNCC can be inserted with the macro without-call/cc. The pseudo-function call/cc also implies a without-call/cc around its sole argument. Within any LNCC, including the implicit one around every program, further LCCs may be introduced with with-call/cc.

There are some cases where you might think that an LCC is there or doesn't matter, where it actually does. In this code:

(defun x ()
 (with-call/cc
   1 2))

The LCC only contains the 1 and 2 forms, not the function entry and function exit. Therefore, calling X from any code will result in first executing code in an LNCC, then some LCC code, then some LNCC code. The importance of this distinction will become clear once we start discussing DCCs. To avoid this, move the with-call/cc outside the defun, or use the equivalent defun/cc convenience macro.

A similar situation holds for this code:

(lambda () (with-call/cc 3 4))

Calling this function is the same as calling X above, and the same solutions apply.

One final area of confusion may be:

(with-call/cc
 (without-call/cc
   (with-call/cc 5 6)))

This is not one smooth contour of an LCC; the without-call/cc always creates an LNCC, even if a new LCC is created immediately within it.

Basic DCC execution

Entering any LCC from a DNCC, by either calling a function created or defined in an LCC or simply evaluating a with-call/cc form, creates a DCC. Upon creation, beyond the usual frame info, one property is captured and saved permanently for that DCC: the exit point. Consider this code:

(defun/cc bob ()
 (cons 4 2)
 :bob)
(defun/cc slack ()
 (bob)
 :slack)
(defun moo ()
 (bob)
 (slack))

Calling bob in moo creates a DCC whose exit point is the code that returns the keyword :bob. Calling slack in moo creates another one whose exit point is returning :slack. The call to bob within the call to slack does not create a new DCC, or even alter the existing one! Whether calling bob creates a new DCC depends on the nature of the calling code.

While in a DCC, calling a function defined in an LNCC suspends that DCC. That happens in bob above when calling cons, which, like all CL standard functions, is defined in some LNCC. Whether the compiler optimizes away the cons is irrelevant to our semantic model. When the function returns, the same DCC is resumed; as such, the cons above destroys neither DCC's exit point information.

The function rule sounds like a special case, but it is really just a subcase of the continuation case. Invoking a continuation enters the DCC in which it was created. In cl-cont, calling LNCC functions is handled by grabbing the continuation and invoking it with the function's result after it finishes.

After exiting a DCC, the only way to reenter it is to invoke one of its continuations. Calling a function defined therein only creates a new DCC delimited by that function's body.

Finally, a point about call/cc that will surprise you if you are used to Scheme: Invoking call/cc exits the DCC, returning the values returned by the function you gave it. You are, of course, free to simulate Scheme behavior by invoking the continuation from right within that function.
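A small sketch of that exit behavior follows. It assumes cl-cont is already loaded (for example with (asdf:load-system :cl-cont)); the names *k* and capture-demo are made up for illustration.

```lisp
;; Assumption: the cl-cont system is already loaded in this image.
(defvar *k* nil "Holds the continuation captured in CAPTURE-DEMO.")

(defun capture-demo ()
  "Return the value of a WITH-CALL/CC form whose body captures a continuation."
  (cl-cont:with-call/cc
    (+ 1 (cl-cont:call/cc (lambda (k)
                            (setf *k* k) ; save the continuation for later
                            10)))))      ; this value leaves the DCC directly

(capture-demo)   ; the (+ 1 ...) is skipped: the DCC exits with 10
(funcall *k* 5)  ; reenters the DCC: (+ 1 5), then the exit point
```

The equivalent Scheme expression would deliver 10 to the addition and yield 11 on the first evaluation. Here the addition only runs once the saved continuation is invoked.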

Strange exit point behavior

This may seem a little surprising:

(defvar *cc*)
(defun/cc foo ()
 (let/cc cc (setf *cc* cc) 'saved)
 'foo)

(defun/cc bar ()
 (foo)
 'bar)

(progn
 (foo) ; ⇒SAVED
 (funcall *cc*) ; ⇒FOO
 (bar) ; ⇒SAVED
 (funcall *cc*) ; ⇒BAR
)

The important thing to remember is that it doesn't matter at all what context you invoke a continuation in from a DNCC; you always get the exit point of the continuation's DCC.

Nested DCCs

Certain code looks like it should really be resuming something that it actually can't, and it has to do with nesting DCCs. Here's what I mean:

(let (keep-going list)
 (progn (with-call/cc
          (setf list (mapcar (lambda (n)
                               (let/cc k (setf keep-going k) 42))
                             (list 1 2 3 4 5))))
   (dotimes (n 5)
     (funcall keep-going (+ n 6)))
   list))

If mapcar were defined in an LCC, you would get the result you expect, (6 7 8 9 10). What you expect is that for each element in (1 2 3 4 5), let/cc saves and suspends the continuation. The first time, the dotimes would be entered, and the next four times it would do another iteration, each time invoking the continuation saved above. In the first case, the 42 would be delivered to the first progn location (and discarded), and the further four times to the implicit progn in dotimes (and discarded).

Instead, you get the result (42 42 42 42 42). Why?

The mapcar throws a kink in. Consider that when mapcar is running, the enclosing DCC is suspended, because mapcar is defined in an LNCC and therefore creates a DNCC by being called. Now, when it calls the LCC-defined function passed to it, a new DCC is created each time. In each of these a continuation is saved and 42 is returned, creating the contents of the list, but only the last saved continuation is ever seen by the dotimes. When mapcar finishes, it resumes the enclosing continuation and it finishes execution before the dotimes can even start.

Now that dotimes has started, it is resuming the DCC solely delimited by the function passed to mapcar. As you can see, the exit point of that function is to return whatever was passed to the continuation. Accordingly, if you wrap the funcall above with a print, you'll see it print 6, 7, 8, 9, and 10, in order.

Saturday, October 11, 2008

A practical use for change-class!

Previously only thought useful for metaprogramming, the amazing change-class finds a use in the hallowed realm of application programming.

(defwidget maybe-pagination (weblocks:pagination)
  ((show-pred :initarg :show-predicate :accessor show-paginator-predicate
              :initform (constantly t)
              :documentation "One-arg proc accepting self answering
              whether to show the widget contents."))
  (:documentation "Hide the paginator sometimes."))

(defmethod weblocks:render-widget-body
    ((self maybe-pagination) &key &allow-other-keys)
  "Iff `show-paginator-predicate''s value answers NIL on myself, don't
send to super."
  (when (funcall (show-paginator-predicate self) self)
    (call-next-method)))

(defmethod initialize-instance :after
    ((self my-listedit-subclass) &key &allow-other-keys)
  (change-class (weblocks:dataseq-pagination-widget self) 'maybe-pagination
    :show-predicate (f_% (typep (weblocks:authenticatedp) 'admin-account))))

Wow, huh?

(The above makes it so that only admins can see the subwidget that
shows the current page with links for paging through the results.)

Tuesday, September 16, 2008

What is cascading rebase?

The problem of cascading rebase is a serious one, and should not be taken lightly. Though the idea of cleaning history may be appealing, it is almost always inappropriate.

This problem is what makes all documenters of rebase functionality implore you to avoid rebasing public repositories. However, the nature of the problem is difficult to explain to those without a good understanding of the DAG-based branching and merging employed by most modern distributed VCSes, including Bazaar, Mercurial, and Git.

What starts simple…

Imagine a larger version of this very abbreviated history. Bob has started a project and commits new revisions to the mainline, and collects contributions by merging from others' branches.

Here we see Bob's mainline in red.

Topic branching

Alice has branched mainline-3 to create a feature, and made a few commits. The idea is that Bob will merge the feature branch, in green, back into mainline. Work on mainline can continue, as seen by additional mainline revisions.

Uh oh…

Bob realizes that mainline-2 was done wrong. He decides the problem is serious enough that it has to be redone.

This is not the only kind of rebasing; rebasing is a subsequence replacement operation, so it can be used to insert revs in history, delete revs from history, or insert and replace any number of revs at any point. In this case, we're replacing mainline-2 with mainline-2'.

Here is the completed rebase operation. The dead revisions are shown in light red.

Rebasing is really branching

Every revision in a proper DAG history contains an immutable reference to its parents. So mainline-3 has a reference to mainline-2. But we rewrote mainline-2, so we must rebuild mainline-3 so that it refers to the new mainline-2 instead, then rebuild mainline-4 to refer to the new mainline-3, mainline-5 to refer to the new mainline-4, and so on.

As such, rebasing doesn't really "rewrite history"; it finds the latest revision that doesn't have to be rewritten, branches off it, builds the new history onto that branch, and finally replaces the current branch with the new branch. You can see this clearly in an alternate, but equivalent visualization of the previous graph.

Why is the feature branch still connected to the dead revisions?

The rules don't just apply to mainline; feature-4 also has a reference to mainline-3. This is an important integrity feature—as a brancher working on a private feature, Alice wouldn't want Bob to be able to spontaneously rewrite her branch's source by altering revisions made before she branched. Therefore, those "dead" revisions still live as part of the feature branch's history.

I show them above the feature branch here to show how they have become part of Alice's history. They will also be shared with any other branches that were made off mainline after mainline-2 and before the rebase. Please note that the history of Alice's branch remains exactly the same as it was before the rebase.
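A toy model makes this concrete. The sketch below is not how any real VCS stores history; rev, commit, history, replay, and rebase-demo are hypothetical names, and the revision names simply follow the ones used in this post.

```lisp
(defstruct rev
  (name "" :type string :read-only t)
  (parent nil :read-only t))   ; immutable link to the parent revision

(defun commit (name &optional parent)
  "Create a new revision NAME on top of PARENT."
  (make-rev :name name :parent parent))

(defun history (rev)
  "Return the names of REV and its ancestors, newest first."
  (loop for r = rev then (rev-parent r)
        while r
        collect (rev-name r)))

(defun replay (revs new-base)
  "Rebuild REVS (oldest first) as fresh revisions on NEW-BASE; return the new tip."
  (reduce (lambda (parent old)
            (commit (concatenate 'string (rev-name old) "'") parent))
          revs
          :initial-value new-base))

(defun rebase-demo ()
  "Return the mainline history after the rebase, and Alice's history."
  (let* ((m1  (commit "mainline-1"))
         (m2  (commit "mainline-2" m1))
         (m3  (commit "mainline-3" m2))
         (f4  (commit "feature-4" m3))    ; Alice branches from mainline-3
         (m4  (commit "mainline-4" m3))
         (m2* (commit "mainline-2'" m1))  ; Bob's corrected revision
         (tip (replay (list m3 m4) m2*)))
    (values (history tip) (history f4))))
```

Because parent links cannot be changed, replay has to create new revisions, and the history of feature-4 still runs through the old mainline-3 and mainline-2.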

What happens with a merge?

Feature branches live to die; they ought to be merged back into the mainline eventually, when the feature is ready to be part of it. So Alice would like Bob to merge her feature branch into mainline.

Except the idea of mainline has changed due to the rebase. Here is a symmetric merge diagram to illustrate.

One step in the cascade occurs

Alice's branch by necessity includes the broken changes from the old mainline-2 and the mostly duplicate mainline-3. The DAG sees these as separate from mainline-2' and mainline-3', as they are. So the merge is wrong.
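To see which revisions such a merge drags along, here is a rough sketch over an abbreviated, hypothetical graph. The table *parents* and the functions ancestors and incoming are illustrative names, and real tools compute this differently; the point is only that the incoming set is "ancestors of the source that the target lacks".

```lisp
(defparameter *parents*
  '(("mainline-1")
    ("mainline-2"  "mainline-1")
    ("mainline-3"  "mainline-2")
    ("feature-4"   "mainline-3")
    ("feature-5"   "feature-4")
    ("mainline-2'" "mainline-1")
    ("mainline-3'" "mainline-2'")
    ("mainline-4'" "mainline-3'"))
  "Each entry is a revision name followed by the names of its parents.")

(defun ancestors (name)
  "Return NAME and every revision reachable from it through parent links."
  (let ((seen '()))
    (labels ((walk (n)
               (unless (member n seen :test #'string=)
                 (push n seen)
                 (mapc #'walk (cdr (assoc n *parents* :test #'string=))))))
      (walk name))
    seen))

(defun incoming (source target)
  "Revisions that merging SOURCE into TARGET would bring along, sorted by name."
  (sort (copy-list (set-difference (ancestors source) (ancestors target)
                                   :test #'string=))
        #'string<))

;; Merging Alice's branch into the rebased mainline:
(incoming "feature-5" "mainline-4'")
;; => ("feature-4" "feature-5" "mainline-2" "mainline-3")
```

The dead mainline-2 and mainline-3 show up in the result because, as far as the graph is concerned, they are unrelated to mainline-2' and mainline-3'.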

To fix this, Alice must produce a new branch and rewrite her changes onto it. We can do this with rebase, but it requires Alice to know which revisions are duplicates. Here, Alice must know that mainline-3' is the new basis. This seems simple, but imagine if mainline-2 had been simply deleted, or some new revisions had been inserted. Then the revisions would be numbered differently; in that case she would have to rebase from mainline-3 to mainline-2'.

Here is the result of her rebase, and the correct merge.

Now the cascade happens: if any topic branches were made from the "dead" feature-4, feature-5, and feature-6, they must also be rebased onto the new feature-4', 5', or 6', as appropriate. And so on. And so on. Hence my name for this issue, which I don't believe has been adopted by anyone but appropriately illustrates its seriousness: cascading rebase.

Cascading rebase cannot be solved automatically; all rebase tools recognize this and have some interactive conflict resolution features. Furthermore, there is no guarantee that when a rebase goes smoothly on the original branch, it will also go so smoothly on the cascade. It also gets wildly complicated to calculate the cascade when you include behavior like synchronization by merging and especially cross-merging.

Before you consider rebasing, consider other tools for fixing problems. If you made a mistake in an old revision, and it is serious enough that all branches since should receive the fix as well, a good alternative is to branch off the broken revision, write the fix, commit to the new branch, merge it into mainline, publish the revision, and ask others to merge it in. Even if they don't, the common history will mean that merging won't see a parallel "fixed" revision, making the merge cleaner and less likely to conflict.

You don't even have to create a new URL in many cases. In both Bazaar and Mercurial, the commit to the side branch will receive a revision number in your mainline. By merging the mainline at this revision number, branches will receive only that revision, without being forced to merge the head of mainline.

Monday, September 8, 2008

Blame release tarballs for the installation problem

…plus love for distributed version control.

Many have noticed the failure of ASDF-Install to make installing Lisp packages universally easy. The situation is such that most serious Lispers don't bother with it, and many casual Lispers encounter a blocking problem such as the uninstallability of some dependency. The latter generally do one of these:

Ask the mailing list for the failing package for help. This generally elicits either a new package post or an exhortation to use the VCS, as releases are worthless for this particular package.
Ask on IRC. This also generally leads to a response of “use the VCS”.
Just use the VCS themselves, or explode the tarball and configure it themselves.
Give up. Well, that's that.

I'll spoil the ending and say that I think reliance on release tarballs is the main failing of ASDF-Install. Furthermore, I think it's a major mistake to assume that the tarball even qualifies as an appropriate modern distribution medium for software as easily rebuildable as most Lisp packages.

First, there is the symptomatic drawback. Everyone familiar with source releases has noticed that you have to download the entire package again to get the changes, which could otherwise be represented in a smaller compressed diff. Some GNU packages even distribute an xdelta, a kind of binary diff, along with release tarballs. The problem with this is that the number of diffs or xdeltas needed for maximum download efficiency grows quadratically: one for every pair of an older and a newer release, or n(n−1)/2, where n is the number of releases over the course of the project. Setting that aside, now that broadband is broadly available, many believe the tarball-upgrade problem has been solved.
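One way to get a feel for that growth is to count the deltas directly. This is only a back-of-the-envelope sketch under a stated assumption: one forward delta is published for each pair of an older and a newer release. The function names are made up.

```lisp
(defun pairwise-delta-count (n)
  "Deltas needed so any of N releases can jump directly to any newer one."
  (/ (* n (1- n)) 2))

(defun chained-delta-count (n)
  "Deltas needed if each of N releases only links to its successor."
  (max 0 (1- n)))

(pairwise-delta-count 10) ; => 45
(pairwise-delta-count 20) ; => 190
(chained-delta-count 10)  ; => 9
```

Publishing only the chained deltas keeps the count linear, but then upgrading across many releases means fetching and applying every delta in between.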

Some tarballs have real problems

However, I believe we've only treated the symptom, not the actual problem. We should have taken the inefficiency of making users download whole source trees over and over just to get little changes as a sign that there was a more serious problem with tarballs, demonstrated by further symptoms.

With tarballs, there's no automatic way to be sure you have the latest version. So you report a bug, and get the reply “it's fixed in the new release; upgrade.”

Then, there's no automatic way to upgrade to said version. Even when managed by a binary distribution system like APT, you'll still encounter cases where some feature you want or need has been implemented upstream, possibly even released, but you just have to sit on your hands until it trickles down.

I've encountered this over and over with binary distributions: had to install my own git to get the pull --rebase option; cdrkit to get a bugfix for my DVD burner; hg to get the new rebase plugin; Darcs to see whether Darcs2 is better at managing trees; Mono to get .NET 2.0 features; etc. Now I'm trying to build my own Evolution to be able to connect to MS Exchange 2007, without much luck so far.

Such as it is, there's no sense in binary-installing software I'm actually serious about using, such as Bazaar-NG or Lisp implementations; it's easier to just track and build new releases myself.

Most importantly, it places a serious barrier between using code and modifying code. It's one thing to make a change, build, and install. But this isn't automatically persistent. If all you have is the binary, then you have to download the source again. If you didn't plan to make the change when you exploded the tarball, you have to rename the tree and explode the tarball again, to make a diff and store it somewhere; planners ahead may use cp -al, and take care to break hardlinks when working. Then, every time there's a new release, you have to reapply the diff; if there are conflicts, you have to fix them then replace the diff (possibly separately, in case you have to reinstall an older release). Then, if you're serious about getting your change into the upstream, you have to get the source once more, via a VCS, and apply it there as well.

I have a patch for Hunchentoot 0.15.7 (downloaded from tarball) locally, that lets you start a server in SBCL while in a compilation unit (otherwise deadlocking). After spending a while on this, I was asked to reapply it to the SVN version mysteriously hosted at BKNR. Of course, the one at BKNR had already rewritten the function in question, so my patch was inapplicable. The disconnect between the idea of developing for Hunchentoot (on an unadvertised SVN) and using it (see e.g. Weblocks, which will not build against SVN Hunchentoot, because it targets 0.15.7, the latest advertised version of Hunchentoot) has become so large that you might not even call them the same software anymore. I can't drop in the SVN Hunchentoot because it would break Weblocks, and I can't fix Weblocks (well, dev, I'll probably do a branch sometime) because it would break it for everyone who hasn't found the SVN at BKNR and therefore assumes 0.15.7 is the latest and greatest.

If I sound frustrated with this, imagine if it was the first time I had ever tried to contribute to a Lisp project.

DVCS solves all these problems

My answer to all of the above is one that Lisp users should be familiar with: “use the VCS”. Let's go over the problems again, and see how DVCS solves them:

Downloading the same thing over and over. With a DVCS, you get the entire history in compact format, so you can fast-forward and rewind to any release very quickly, and the tool will reconstruct history for you. Deltas are combined on-the-fly, so there is no quadratic explosion of deltas. To get new changes, you only download the parts of history you don't have.

When history gets too big, systems like Bazaar feature horizons and overlays, so you need only download history back to a certain point.

Be sure you have the latest version, possibly upgrading to it. All DVCSes have convenient commands to upgrade to the latest version. Most also have graphical tools to browse and wind around in history, if you don't like the new version.

Transitioning from user to developer. You may not agree this is an important goal for software just being used by someone, but I will not delve into this, allowing the OLPC project to speak for me:

Our commitment to software freedom gives children the opportunity to use their laptops on their own terms. While we do not expect every child to become a programmer, we do not want any ceiling imposed on those children who choose to modify their machines. We are using open-document formats for much the same reason: transparency is empowering. The children—and their teachers—will have the freedom to reshape, reinvent, and reapply their software, hardware, and content.

(On a side note, reliance on release .xo files containing activities makes figuring out which activities will work with your XO a nightmare. To reach the full potential of Develop and related activities, I think that OLPC will be forced to adopt a VCS-driven, per-XO branching distribution framework.)

With a local DVCS checkout, all you need to do is make your changes and commit them. For sending upstream, all tools include, or have plugins, to send one or a series of changes to an upstream mailing list with a single command. Even private or rejected changes are safe: you can merge new upstream changes onto your branch (for which new conflict resolutions are automatically managed and fully rewindable), or use rebase to move your changes up to the tip of the upstream changes. If you make many changes and are given a branch on the upstream's server, it's yet another single command to push them to the new remote location.

But DVCS…

Let's start with the obvious: DVCS gives you an unstable developer's version rather than a stable version. This is a straw man, considering that modern DVCSes support powerful branching and merging. If your mainline frequently destabilizes, you can point everyone to a “stable” DVCS branch URL that receives regular merges from the unstable branch when it stabilizes. Pushing revisions across the network is so easy, as opposed to making a new release tarball, that this is likely to get far more frequent updates.

I can imagine a report for a bug needing only minor changes in this environment:

User: $ bzr merge --pull
User: test test test
User: hey, there's a bug in the HEAD of the stable branch
Maint: what with
User: xxx yyy zzz
Maint: okay, just a sec
Maint: work work work on stable (or cherry fix from unstable)
Maint: merge onto unstable
Maint: okay, fixed in r4242
User: $ bzr merge --pull
User: thanks, you're the awesomest Maint

Super-cheap topic branching means that this process can expand as needed, depending on the size of the changes required. Furthermore, this easy incremental release process means that the maintainers need no longer weigh the cost of rolling a release against the cost of duplicate bug reports for unreleased fixes; the release is “prerolled”, as it were.

In a culture of small, incremental changes and widespread tracking of DVCS content, such as that in the Common Lisp community, the “stable” branch might even be the same as the “development” branch, where destabilizing changes are done in separate “topic” branches before being merged into the mainline. In addition, the effort to make sure that the heads of all the DVCS mainlines are compatible keeps these from driving users into version incompatibility hell.

Even if that's not enough, branching and merging allows us infinite granularity in the stability of the release process. If “stable” changes too often for some users, you can have a “reallystable” branch that merges revisions from “stable” that have reached really-stability, and so on. This could be used to simulate any kind of release cadence from the “stable” branch that maintainers might like to effect, in only a few shell commands.

History is too big. Well, first of all, not really. We've already used the bandwidth argument to dismiss the symptom of redownloading for tarballs; it applies equally here, and you're also getting the benefit of having every version ever. Even so, for large histories, lightweight checkouts and history horizons let you keep only a subset of history locally. Bazaar is really a pioneer in this area, but I expect the other DVCSes to catch up in the usual manner of competition among the available tools.

Building and rebuilding all the time takes time. While I can't speak for other environments, the Common Lisp community has admirably solved this problem. For all of Kenny's reported faults with ASDF, it's still better than anything any other environment has. It is such that I can run a single find command when downloading a new package by DVCS, thereby installing the package, and Lisp will rebuild changed files and packages on-demand when loading them. Even without on-demand recompilation, this isn't much of an issue for Lisp: I use a command to wipe out compiled code and rebuild from scratch on a shared Lisp environment I manage, where even given SBCL's relatively slow compiler and FASL loader, it only takes about 90 seconds to rebuild all 35 or so packages from scratch and write a new image for everyone's immediate use.

To be honest, this is a real blocker for systems with slow rebuilds and early binding semantics. It wouldn't work well for C or GHC Haskell, for example. However, I'm sure that Lisp, systems with on-the-fly interpretation and compilation like CPython and Ruby, and systems with simple, standardized and fast full-compilation processes like Mono would be served well by DVCS-based distribution.

Probably the most serious objection is really about dependencies, that you have to have all the DVCS tools used by the systems you use. First, I think existing practice shows that we already have no objection to this; we aren't bothered about requiring everyone to use APT rather than a website with downloads, because we know by comparison that the installation process with APT is orders of magnitude better for users and developers than the traditional Windows “download an installer and run it” method. (The fact that a Debian APT source does a Fedora yum user no good is all about the system-specific hacks typically packaged with these packages, and has no effect on pristine upstream distribution, which is after all our topic of discussion.)

Great, so who's going to implement all this?

The most comprehensive effort I've seen at trying to integrate VCS with automated installation is CL-Librarian, which knows how to use a few of the available VCS tools to download and update libraries. Librarian is mostly a personal effort, and isn't widely ported or adaptive yet, but it's a step in the right direction. While the above may sound like a very long advertisement for Librarian, I would surely embrace any Common Lisp library manager that takes the new DVCS order to heart and helps us to banish the release tarball.

I'm currently using a few scripts collectively called lispdir that drop into the myriad branches in my Lisp tree and update them using the appropriate VCSes. When I add a new branch, I simply run an asdfify shell function to add the .asd files therein to my central registry. It also serves as a list of canonical VCS locations for many projects. You can get that as a Bazaar branch; acquire.sh downloads all systems, and update.sh updates them.
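The registration step can also be done from the Lisp side. The following is a minimal sketch, not the asdfify function mentioned above: asd-directory and register-tree are hypothetical names, and the "**/*.asd" wildcard relies on implementation-dependent behavior of directory (it works this way on SBCL, for instance).

```lisp
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require :asdf))

(defun asd-directory (asd-file)
  "Return the directory containing ASD-FILE as a pathname."
  (make-pathname :name nil :type nil :version nil :defaults asd-file))

(defun register-tree (root)
  "Add every directory under ROOT that holds an .asd file to the ASDF registry."
  (dolist (asd (directory (merge-pathnames "**/*.asd" root)))
    (pushnew (asd-directory asd) asdf:*central-registry* :test #'equal))
  asdf:*central-registry*)
```

Calling (register-tree #p"/home/me/lisp/"), where the path is a placeholder for wherever your branches live, would make every system found under that tree loadable by name.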

Tuesday, September 2, 2008

Fixing weblocks test failures, try 1

Here we commemorate the demise of the first test fixing branch of Weblocks, c6dc18-test-fixes. At its peak, it got the number of failures in the ~750-test suite down to a whopping 9. Shortly after its merge back into dev, the count was back up to over 50.

As I write this, the count on my new fixes branch is 84. On the now 4-patch divergent dev, a trial merge into the fixes branch raises that to 89. (To be entirely fair, I was the committer of the revisions that caused the jump.)

There is a lesson here about test discipline. Once you get out of the mindset of making every mainline rev pass all tests, it's hard to get back in. For quite a while my fix strategy has been a little lax.

If it's small, and I don't think it does much, don't bother testing.
For large changes, make sure the failure count doesn't jump above 100.
Whatever, just make sure my development site (which doesn't even use continuations yet) works.

Symmetric merging is little help here, as all the revs in a topic branch end up as first-class revs in the mainline. Not that I blame Mercurial at all; still, I hope to build future features in topic branches posted on Bitbucket, preferably with optimized archival storage (backporting hardlinks and such) so dormant branches, such as c6dc18-test-fixes, can stay around forever cheaply.

I think the failure with c6dc18-test-fixes was not providing public feedback about how the failure counts were diverging between branches. Now I have scripts that merge, test, and compare test results on two branches, so I can generate all the useless statistics I like.

Saturday, August 23, 2008

defclass options slay me, again

I previously wrote about missing canonicalize-defclass-options. Now I have a nice inconsistency in AMOP to add to the complaints.

I have believed for some time, for some reason, that this is the standard slot option canonicalizer, excepting the special cases:

(defun canonicalize-defclass-option (opt)
  `(',(car opt)
    ',(if (typep (cdr opt) '(cons t null))
          (cadr opt)
          (cdr opt))))

In other words, if you gave a single argument, like (:opt val), it wouldn't be listified. A little weird, with a nasty special case, but a good attempt at dealing with both listy and atomy class options.

Thankfully, AMOP has two contradictory interpretations, neither of which is the above. First, the example on page 287, which would have it:

  `(',(car opt)
   ',(cadr opt))

Finally, on page 148, hiding from the prying eyes of back-of-the-book indexed content (nowhere near the entries on defclass), the true behavior:

Any other class options become the value of keyword arguments with the same name. The value of the keyword argument is the tail of the class option.

  `(',(car opt)
   ',(cdr opt))
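
To make the difference concrete, here is a minimal sketch of the three readings as ordinary functions. The function names are made up for illustration, and each returns the unquoted key and value rather than the quoted forms a real defclass expansion would produce.

```lisp
;; Hypothetical names; none of these exist in AMOP or any implementation.
(defun canonicalize-assumed (opt)
  "The reading first assumed in this post: unwrap single-argument options."
  (list (car opt)
        (if (typep (cdr opt) '(cons t null))
            (cadr opt)
            (cdr opt))))

(defun canonicalize-p287 (opt)
  "The page 287 example: always take the second element."
  (list (car opt) (cadr opt)))

(defun canonicalize-p148 (opt)
  "The page 148 text: the value is the tail of the class option."
  (list (car opt) (cdr opt)))
```

The three agree on nothing: for (:opt val), only the page 148 reading keeps the list wrapper, and for (:opt a b), the page 287 reading silently drops b.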

I also previously thought the defclass options to be evaluated, but never mind that.

Monday, August 18, 2008

Dependencies versus effort

I do not need to rehash the benefits of relying on other libraries when developing a library here. If you care about dependencies, you ought to be familiar with those benefits already.

I wish to instead address the common complaint of "too many dependencies" from those who feel that getting a Common Lisp library installed is too difficult.

Here is how I feel about such requests:

Graph 1

In short, the effort of avoiding a dependency, even in the case of synchronizing with an external source, far exceeds that of simply fetching the dependency yourself and adding it to your ASDF registry. What's good for the maintainer is good for the library.
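
For scale, the "adding it to your ASDF registry" step can be as small as the following sketch. The directory is a hypothetical placeholder for wherever the dependency was unpacked, and the sketch assumes ASDF can be loaded with REQUIRE.

```lisp
;; Assumes the implementation ships ASDF as a REQUIRE-able module.
(eval-when (:compile-toplevel :load-toplevel :execute)
  (require "asdf"))

;; Hypothetical directory containing the fetched dependency's .asd file.
;; The trailing slash marks the pathname as a directory.
(defvar *fetched-dependency-dir* #p"/home/me/lisp/some-dependency/")

;; Register the directory once; :test #'equal avoids duplicate entries.
(pushnew *fetched-dependency-dir* asdf:*central-registry* :test #'equal)
```

After that, loading the system by name through ASDF should find the .asd file in that directory.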

To clarify further:

Graph 2

Friday, August 15, 2008

A metaclass for weblocks-webapps, try 3

This one includes mostly the same shared-initialize and defwebapp contents, but updated for some new changes to dev.

It introduces the :reset slot option, which follows a leftmost-bound inheritance rule and specifies that shared-initialize should ignore its second argument for that slot, always assuming the slot should be initialized.

For weblocks-webapp in particular, it has the problem that for slots provided with :reset t, it would be necessary to export the slot names, so that subclasses could cancel the behavior. It also has the problem that values initialized or normalized by initializer methods are not recognized as reproducible initial values.
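
The leftmost-bound rule can be sketched without any MOP machinery. In this illustration, each direct slot is represented by a plain property list of its options, ordered from most specific class to least specific; the function name and the representation are assumptions for the example only.

```lisp
(defun leftmost-bound-reset (direct-slot-options)
  "DIRECT-SLOT-OPTIONS is a list of option plists, most specific class
first.  Answer the :reset value of the first plist that mentions :reset
at all, or NIL when none does."
  (dolist (options direct-slot-options nil)
    (multiple-value-bind (indicator value tail)
        (get-properties options '(:reset))
      (declare (ignore indicator))
      ;; TAIL is non-NIL exactly when :reset is present, even if its
      ;; value is NIL, which is how a subclass cancels the behavior.
      (when tail
        (return value)))))
```

A subclass that says :reset nil wins over a superclass that says :reset t, while a subclass that says nothing inherits the superclass's choice.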


(defclass weblocks-webapp-class (standard-class)
  ()
  (:documentation "Class of all classes created by `defwebapp'.  The
  really interesting behavior is in
  `shared-initialize' (weblocks-webapp t &rest)."))

(defmethod validate-superclass ((class weblocks-webapp-class) superclass)
  (typep (class-name (class-of superclass))
         '(member standard-class weblocks-webapp-class)))

(defgeneric slot-definition-reset-p (slot-defn)
  (:documentation "Answer whether to change the value of this slot
   when reinitializing or updating an instance after class
   redefinition.")
  (:method ((slot-defn slot-definition))
    "Regardless of this method, only direct-slot-definitions where
    `resetp' is a bound slot may participate in the inheritance rule."
    nil))

(defgeneric (setf slot-definition-reset-p) (value slot-defn)
  (:documentation "See `slot-definition-reset-p'."))

(defclass resetting-slot-definition (slot-definition)
  ((resetp :initarg :reset :accessor slot-definition-reset-p))
  (:documentation "I provide the extension that when `resetp' is
  non-nil, and an initarg is not present in the `shared-initialize'
  call, I will use a relevant stored initarg or initfunction to reset
  my value.

  :reset's inheritance rule is leftmost-bound."))

(defclass resetting-eslot-definition
    (resetting-slot-definition standard-effective-slot-definition)
  ())

(defclass resetting-dslot-definition
    (resetting-slot-definition standard-direct-slot-definition)
  ())

(defmethod direct-slot-definition-class
    ((self weblocks-webapp-class) &rest initargs)
  (declare (ignore initargs))
  (find-class 'resetting-dslot-definition))

(defmethod effective-slot-definition-class
    ((self weblocks-webapp-class) &rest initargs)
  (declare (ignore initargs))
  (find-class 'resetting-eslot-definition))

(defun compute-resetting-eslot-definition (eslot class name dslotds)
  "Implement leftmost-bound rule for resetp."
  (declare (ignore class name))
  (setf (slot-definition-reset-p eslot)
        (and-let* ((leftmost-bound (find-if (lambda (dslot)
                                              (and (slot-exists-p dslot 'resetp)
                                                   (slot-boundp dslot 'resetp)))
                                            dslotds)))
          (slot-definition-reset-p leftmost-bound)))
  eslot)

(defmethod compute-effective-slot-definition ((self weblocks-webapp-class) name dslotds)
  (compute-resetting-eslot-definition (call-next-method) self name dslotds))

(defun reset-slots (instance initargs)
  "Given an instance to be initialized and the initargs passed to the
initializer method, reset the appropriate slots to their original
values."
  (dolist (eslot (class-slots (class-of instance)))
    (when (slot-definition-reset-p eslot)
      (let ((initkeys (slot-definition-initargs eslot)))
        (when (loop for (key) on initargs by #'cddr
                    never (member key initkeys))
          (flet ((set-it (val)
                   (setf (slot-value instance (slot-definition-name eslot)) val)))
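            (or (and-let* (initkeys
                           ;; `find-if' answers the matching default
                           ;; initarg entry itself; `some' would answer
                           ;; the tail of INITKEYS found by `member'.
                           (definit
                            (find-if (lambda (definit) (member (car definit) initkeys))
                                     (class-default-initargs (class-of instance)))))
                  (set-it (funcall (third definit)))
                  t)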
                (and-let* ((initfunc (slot-definition-initfunction eslot)))
                  (set-it (funcall initfunc))
                  t))))))))

Here shared-initialize calls reset-slots whenever its second argument is not t. Since initialize-instance is required to pass t, the reset applies only when reinitializing or updating an existing instance.
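
A minimal sketch of that dispatch follows. It uses a stand-in class and a stand-in reset function that only records its calls, and it assumes the hook is an :after method; all names here are hypothetical.

```lisp
(defclass demo-app ()
  ((name :initarg :name :initform "demo" :accessor demo-app-name)))

(defvar *reset-log* '()
  "Records each call to the stand-in reset function.")

(defun demo-reset-slots (instance initargs)
  "Stand-in for `reset-slots': only records that it was called."
  (push (list instance initargs) *reset-log*))

(defmethod shared-initialize :after ((self demo-app) slot-names
                                     &rest initargs &key &allow-other-keys)
  ;; `initialize-instance' passes T for SLOT-NAMES, while
  ;; `reinitialize-instance' passes NIL, so only the latter resets.
  (unless (eq slot-names t)
    (demo-reset-slots self initargs)))
```

With this in place, make-instance leaves the log empty, and a later reinitialize-instance on the same object adds one entry carrying the initargs it was given.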

Notwithstanding the above issues, the real difficulty forcing me to abandon this particular iteration for now is that the test failures in dev have jumped from 9 (at last pull from c6dc18-test-fixes) to over 90. I'm sure this has to do with the current defwebapp implementation having dependency issues eerily similar to those mentioned in my last post.

By the time I reached this conclusion, I had been too frustrated by fighting with SBCL and SLIME on OS X to do anything about it. Later today, I'll just boot up in GNU/Linux, wifi be damned.

Next up, doing as much as I can without the metaclass, just so other dev writers will put more new instance logic in the initialize method instead of defwebapp. Then I can revisit the metaclass issue with a fresh look, something like this:


(defclass a-class (s-class)
  ()
  (:persistent-initargs :a :b :c)
  (:transient-initargs :d :e :f))

…where some initializer method will capture all initargs (not initforms this time, I think), including those given to make-instance, storing in an instvar, and initarg semantics are determined by a leftmost-persistent-or-transient rule. This has the benefit of not killing initializer method settings with the common :initform nil.

Or, I'll be back here later with an explanation of why it won't work.

Wednesday, August 13, 2008

A metaclass for weblocks-webapps, try 2

My next attempt was more interesting, succeeding in point #1, and getting around point #2 with a new slot definition feature.

Unfortunately, at this point, I ran into the default canonicalization behavior for slots and defclass options. While AMOP discusses a possible extension via a generic function canonicalize-defclass-options, it is unfortunately not included in the final MOP. So:

Except for :default-initargs, class options are always quoted.

Except for :initform, slot options are always quoted.
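
The slot option rule can be sketched as a small function. This is a simplification in the spirit of the canonicalization AMOP describes, ignoring how options such as :initarg and :reader are collected into lists; the function name is an assumption for illustration.

```lisp
(defun canonicalize-slot-option (key value)
  "Answer the plist fragment a defclass expansion would emit for one
slot option under the rule above."
  (if (eq key :initform)
      ;; The initform is kept as data and also wrapped in a closure,
      ;; so it is the one option whose expression gets evaluated.
      `(:initform ',value :initfunction #'(lambda () ,value))
      ;; Everything else is quoted and reaches the MOP unevaluated.
      `(,key ',value)))
```

Under this rule, a lambda expression given to a custom option such as :instance-initfunction arrives as a quoted list, not as a function object.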

That and a dependency issue forced me to abandon this try, which introduces the instance initfunction to address the common complaint about initforms that the expression cannot refer to the new instance.


(defmacro and-let* ((&rest bindings) &body body)
  "Like `let*', but stop when encountering a binding that evaluates to
NIL.  Also allows (EXPR), stopping when EXPR is false, and EXPR as a
shortcut for it only if EXPR is a symbol."
  (reduce (lambda (binding body)
            (etypecase binding
              (symbol `(and ,binding ,body))
              ((cons symbol (cons t null))
               `(let (,binding)
                  (and ,(car binding)
                       ,body)))
              ((cons t null)
               `(and ,(car binding) ,body))))
          bindings :from-end t :initial-value (cons 'progn body)))

(defclass weblocks-webapp-class (standard-class)
  ()
  (:documentation "Class of all classes created by `defwebapp'.  The
  really interesting behavior is in
  `shared-initialize' (weblocks-webapp t &rest)."))

(defmethod validate-superclass ((class weblocks-webapp-class) superclass)
  (typep (class-name (class-of superclass))
         '(member standard-class weblocks-webapp-class)))

(defclass instance-initializing-slot-definition (slot-definition)
  ((instance-initfunction :initarg :instance-initfunction :initform nil
                          :accessor slot-definition-instance-initfunction))
  (:documentation "I provide an alternative to `:initform' for slots,
  the `:instance-initfunction', a function taking the instance to be
  initialized."))

(defclass weblocks-webapp-direct-slot-definition
    (instance-initializing-slot-definition standard-direct-slot-definition)
  ()
  (:documentation "Direct slot definition for `weblocks-webapp-class'es."))

(defclass weblocks-webapp-effective-slot-definition
    (instance-initializing-slot-definition standard-effective-slot-definition)
  ()
  (:documentation "Effective slot definition for `weblocks-webapp-class'es."))

(defmethod compute-effective-slot-definition
    ((self weblocks-webapp-class) name direct-slot-defns)
  "Transfer `instance-initfunction' to the effective slot definition,
making sure to override the regular `initfunction' if mine appears
first in the precedence list, so `shared-initialize' will not call it
to fill the slot."
  (let ((eslot (call-next-method)))
    ;; Walk the direct slots in precedence order.  A regular initfunction
    ;; found first wins; an `instance-initfunction' found first replaces it.
    (loop for dslot in direct-slot-defns
          when (slot-definition-initfunction dslot)
            do (return)
          when (and (slot-exists-p dslot 'instance-initfunction)
                    (slot-definition-instance-initfunction dslot))
            do (setf (slot-definition-instance-initfunction eslot)
                     (slot-definition-instance-initfunction dslot)
                     (slot-definition-initfunction eslot) nil
                     (slot-definition-initform eslot) nil)
               (return))
    eslot))

(defmethod direct-slot-definition-class
    ((self weblocks-webapp-class) &rest initargs)
  "Use my special version when `:instance-initfunction' is present."
  (if (getf initargs :instance-initfunction)
      (find-class 'weblocks-webapp-direct-slot-definition)
      (call-next-method)))

(defmethod effective-slot-definition-class
    ((self weblocks-webapp-class) &rest initargs)
  "As `instance-initfunction' is transferred over later, it isn't
present in INITARGS, so assume any slot might need it."
  (declare (ignore initargs))
  (find-class 'weblocks-webapp-effective-slot-definition))

(defun instance-initialize-unbound-slots (self slot-names)
  "Do the magic promised by `weblocks-webapp-class'."
  (let ((slots (class-slots (class-of self))))
    (dolist (slot slot-names)
      (and-let* (((not (slot-boundp self slot)))
                 (slotd (find slot slots :key #'slot-definition-name))
                 (initfunc (slot-definition-instance-initfunction slotd)))
        (setf (slot-value self slot)
              (funcall initfunc self))))))

(defmethod shared-initialize :after ((self weblocks-webapp-class) slot-names
                                     &key &allow-other-keys)
  (declare (ignore slot-names))
  (pushnew (class-name self) *registered-webapps*))

(defclass weblocks-webapp ()
  ((name :accessor weblocks-webapp-name :initarg :name :type string
         :instance-initfunction (lambda (self)
                                  (attributize-name (class-name (class-of self)))))
   ;;snip uninteresting slots
   (prefix :accessor weblocks-webapp-prefix :initarg :prefix :initform ""
           :instance-initfunction
           (lambda (self)
             (concatenate 'string "/" (weblocks-webapp-name self)))
           :type string
           :documentation "The default dispatch will allow a webapp to be invoked 
              as a subtree of the URI space at this site.  This does not support 
              webapp dispatch on virtual hosts, browser types, etc.")
   ;;snip more
   (init-user-session :accessor weblocks-webapp-init-user-session :initarg :init-user-session
                      :instance-initfunction
                      (lambda (self)
                        (find-symbol (symbol-name '#:init-user-session)
                                     (symbol-package (class-name (class-of self)))))
                      :type symbol
                      :documentation "'init-user-session' must be defined by weblocks client in the
                         same package as 'name'. This function will accept a single parameter - a 
                         composite widget at the root of the application. 'init-user-session' is 
                         responsible for adding initial widgets to this composite.")
   (ignore-default-dependencies
    :initform nil :initarg :ignore-default-dependencies
    :documentation "Inhibit appending the default dependencies to
the dependencies list.  By default 'defwebapp' adds the following resources:

  Stylesheets: layout.css, main.css
  Scripts: prototype.js, weblocks.js, scriptaculous.js")
   (debug :accessor weblocks-webapp-debug :initarg :debug :initform nil))
  (:metaclass weblocks-webapp-class)
  (:documentation "snip"))

(defmacro defwebapp (name &rest initargs &key subclasses slots (autostart t)
                     &allow-other-keys)
  "snip"
  (remf initargs :subclasses)
  (remf initargs :slots)
  (remf initargs :autostart)
  `(prog1
     (defclass ,name ,(append subclasses (list 'weblocks-webapp))
       ,slots
       (:default-initargs . ,initargs)
       (:metaclass weblocks-webapp-class))
    (when ,autostart
      (pushnew ',name *autostarting-webapps*))))

(defmethod shared-initialize :after ((self weblocks-webapp) slot-names
                                     &key &allow-other-keys)
  "Use my `instance-initfunction' to initialize any slots that aren't
yet bound."
  (instance-initialize-unbound-slots self slot-names)
  (macrolet ((do-slot ((bind-var &optional (slot-name bind-var)) &body forms)
               `(when (member ',slot-name slot-names)
                  (let ((,bind-var (slot-value self ',slot-name)))
                    ,@forms))))
    (do-slot (init-user-session)
      (or init-user-session
          (error "Cannot initialize application ~A because no ~
                  init-user-session function is found."
                 (webapp-name self))))
    (when (member 'application-dependencies slot-names)
      (setf (weblocks-webapp-application-dependencies self)
            (build-local-dependencies
             (append (and (not (slot-value self 'ignore-default-dependencies))
                          '((:stylesheet "layout")
                            (:stylesheet "main")
                            (:stylesheet "dialog")
                            (:script "prototype")
                            (:script "scriptaculous")
                            (:script "shortcut")
                            (:script "weblocks")
                            (:script "dialog")))
                     dependencies))))
    (do-slot (path public-app-path)
      (setf (weblocks-webapp-public-app-path self)
            (if (or (null path) (eq path :system-default))
                nil
                (compute-public-files-path 
                 (intern (package-name (class-name (class-of self))) :keyword)))))))

Even without the quoting problem, the fact that prefix's initfunction requires name's initfunction to be called first means that I would need to introduce dependency analysis and expression. But Kenny has already solved this problem.
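
Short of real dependency analysis, the ordering can be spelled out by hand in an initializer method. The following is a sketch using only standard CLOS, with a stand-in class and accessor names that are assumptions for the example.

```lisp
(defclass demo-webapp ()
  ((name :initarg :name :accessor demo-name)
   (prefix :initarg :prefix :accessor demo-prefix)))

(defmethod initialize-instance :after ((self demo-webapp) &key)
  ;; Fill NAME first, because the default PREFIX is computed from it.
  (unless (slot-boundp self 'name)
    (setf (demo-name self)
          (string-downcase (class-name (class-of self)))))
  (unless (slot-boundp self 'prefix)
    (setf (demo-prefix self)
          (concatenate 'string "/" (demo-name self)))))
```

This works, but the order lives in the method body rather than in the slot definitions, which is exactly the bookkeeping a dependency-aware system would take over.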

A metaclass for weblocks-webapps

defwebapp in weblocks-dev currently does many mysterious things with its keyword arguments. But these are the sorts of things that should be given directly to make-instance, so that you can use your own defclass forms and your own make-instance calls to construct weblocks-webapp instances in creative ways.

I have a few goals for a rewrite of defwebapp:

1. Trivialize the mapping of the defwebapp form to a defclass form, putting most of the logic in initializing methods and possibly a metaclass (ahem, class metaobject class).
2. Don't hide the default slot values in class-default-initargs.
3. On reevaluation of defwebapp, replace the slot values, at least in the trivial case where the hacker allowed the normal autostarter to instantiate the resulting weblocks-webapp subclass.

My first thought, abandoned before I could even start changing defwebapp proper to use it, would have defwebapp fill class-instance slots (not to be confused with instance slots with :allocation :class) instead of default-initargs, then alter the initform and initfunction in compute-effective-slot-definition to copy over those values to each new instance.


(defclass weblocks-webapp-class (standard-class)
  ())

(defmethod validate-superclass ((class weblocks-webapp-class) superclass)
  (typep (class-name (class-of superclass))
         '(member standard-class weblocks-webapp-class)))

(defgeneric class-defaulting-slots (class)
  (:method ((self weblocks-webapp-class))
    (loop for class in (class-precedence-list self)
       while (typep class 'weblocks-webapp-class)
       append (class-direct-slots class)))
  (:method ((self standard-class)) '()))

(defmethod compute-effective-slot-definition
    ((self weblocks-webapp-class) name direct-slot-defns)
  "Provide a default initform (read the class's version of the slot)
for those without initforms."
  (let ((eslot (call-next-method))
        (defaulting-slots (class-defaulting-slots self)))
    (macrolet ((initfunc () '(slot-definition-initfunction eslot)))
      (when (and (find name defaulting-slots :key #'slot-definition-name)
                 (not (initfunc)))
        (setf (slot-definition-initform eslot)
              `(slot-value (find-class ',(class-name self)) ',name)
              (initfunc) (lambda () (slot-value self name)))))
    eslot))