29 December 2006

Deprogramming the LALR(1) Bias

Look around for LR(1) parser generators, and you will primarily find LALR(1) parser generators. There seems to be an unspoken assumption that LALR(1) is somehow better than LR(1), but look at the following pertinent facts:
  • In terms of grammars that can be handled by various parser generation techniques, SLR(1) ⊂ LALR(1) ⊂ LR(1).
  • SLR(1) and LALR(1) tables are always the same size, but LR(1) tables are potentially larger.
The important thing to notice here is that LALR(1) is not the most powerful of the three parser generation techniques listed above. LALR(1) may introduce reduce/reduce conflicts that do not exist when using LR(1).
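
To make this concrete, here is the classic textbook example of a grammar that is LR(1) but not LALR(1):

    S → a E a | b E b | a F b | b F a
    E → e
    F → e

In the canonical LR(1) automaton, the state reached after "a e" contains the completed items E → e· (lookahead a) and F → e· (lookahead b), while the state reached after "b e" contains the same items with the lookaheads swapped; neither state has a conflict. LALR(1) merges the two states because they share the same core, which unions the lookahead sets and produces reduce/reduce conflicts on both a and b.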

So then, why is there a bias toward LALR(1)? I suspect that it has to do with the well known and widely cited dragon book, which treats LALR(1) as the culmination of the parser generation algorithms it presents. One little detail goes unmentioned, though: it is possible to compress LR(1) tables to essentially the same size as LALR(1) tables, simply by skipping any state merge that would introduce one of LALR(1)'s reduce/reduce conflicts. Granted, the book is over 20 years old now, but I would not be surprised if the new edition preserves this omission.

I recently finished implementing a parser generator (as the basis for a GLR parser, as described by the Elkhound technical report). I was initially unable to wrap my head around the "efficient" method that the dragon book provides for LALR(1) parser generation, so I took the incremental approach of implementing SLR(1), converting that to LR(1), and finally converting that to LALR(1) -- the "easy, inefficient" method, according to the dragon book. When I got to the point of compressing LR(1) tables to LALR(1) tables, I questioned the necessity of merging states when reduce/reduce conflicts would result. As near as I could tell, there is no fundamental requirement to do so, which means that compressed LR(1) tables that avoid such conflicts should be the gold standard, rather than LALR(1).
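
To illustrate what I mean, here is a minimal sketch of the conflict-avoiding compression step -- a toy in Python, not the code from my parser generator. It assumes an LR(1) state is represented as a frozenset of (production, dot, lookahead) triples, where a production is an (lhs, rhs) pair with rhs a tuple of symbols; rebuilding the action/goto transitions after merging is elided.

    from collections import defaultdict

    def merged_has_rr_conflict(group):
        # A merged state has a reduce/reduce conflict if two distinct
        # completed items (dot at the end of the right-hand side) share
        # a lookahead symbol.
        seen = {}  # lookahead -> production to reduce by
        for state in group:
            for (prod, dot, la) in state:
                if dot == len(prod[1]) and seen.setdefault(la, prod) != prod:
                    return True
        return False

    def compress_lr1(states):
        # Group canonical LR(1) states by LR(0) core (items with the
        # lookaheads stripped), then merge each group -- which is exactly
        # what LALR(1) construction does -- unless merging would introduce
        # a reduce/reduce conflict, in which case the group's states are
        # kept distinct.
        by_core = defaultdict(list)
        for state in states:
            core = frozenset((prod, dot) for (prod, dot, la) in state)
            by_core[core].append(state)
        result = []
        for group in by_core.values():
            if merged_has_rr_conflict(group):
                result.extend(group)  # keep full LR(1) resolution here
            else:
                result.append(frozenset().union(*group))
        return result

A finer-grained version would merge the conflict-free subsets of a group rather than keeping the whole group distinct, but even this coarse version yields LALR(1)-sized tables for any grammar that is LALR(1).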

This seemed too obvious to have been overlooked by the compiler community, so I looked high and low for prior art. The best I found was a comp.compilers post by Chris F. Clark on 1 July 2003, which says, in part:
    That brings us back to "full" or canonical LR(k), where k is >= 1.
    LR(k) parsing preserves the needed left context to disambiguate
    certain situations where LALR(k) finds the rules to conflict. The
    canonical way of doing this is to not merge states with different
    lookaheads in the first place. That causes an explosion in the
    table size as many states are kept distinct simply because they
    have different lookaheads, when in reality the lookahead for those
    states will never be consulted. A more modern method for solving
    the problem involves splitting the states only when a conflict is
    detected. Then, if the grammar is LALR, no splitting will occur
    and one has the same size machine as the LR(0), but as conflicts
    arise that require left context to resolve, the tables slowly grow.

There is likely some discussion of this optimization somewhere in the primary literature, but several hours of searching failed to turn it up. The closest I found is Sönke Kannapinn's PhD thesis (in German, but the abstract is also in English), which comes to the same conclusion via a very different route.

In summary, if you need to write a parser generator, make it LR(1) and use careful table compression, rather than using the less general LALR(1) approach.

Transactional Memory: Panacea or Confounder?

There is a very nice review article on the subject of transactional memory (TM) as applied to programming languages in ACM Queue, Vol. 4 No. 10. The article does a reasonable job of covering the basics of transaction processing, and it describes how these techniques could be used to expose transaction semantics directly as a built-in programming language feature.

Transaction processing has been actively researched by the databases community since well before my time, probably for 30+ years. This is important, because it means that the really new ideas embodied by TM are limited to approximately the following:
  • Transactions can be exposed as a general purpose language construct (not just in database languages).
  • Hardware can be modified to reduce TM overhead.
These ideas are interesting enough, and certainly warrant research. However, there are some basic compromises that transactions require, and nothing about TM changes that.

There is a fundamental difference in approach between writing deadlock/livelock-free code with synchronization primitives (mutexes, condition variables, etc.) and writing code with transactions. In the former case, we specifically write code that will always make forward progress, while trying to limit the synchronization overhead. With transactions, we work at a much higher level, and we leave it to the transaction manager to detect conflicts as they arise, then (hopefully) make progress even when transactions conflict.
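
To make the contrast concrete, here is a toy sketch of the transactional style in Python. TVar, atomically, and the single global commit lock are inventions of the sketch, loosely modeled on software TM designs, not any real library's API; among other simplifications, it ignores the possibility of a transaction observing an inconsistent snapshot mid-run, which a real TM implementation must handle.

    import threading

    _commit_lock = threading.Lock()

    class TVar:
        # A transactional variable: a value plus a version number that
        # commit-time validation uses to detect conflicting updates.
        def __init__(self, value):
            self.value = value
            self.version = 0

    def atomically(txn):
        # Optimistically run txn against private read/write logs, then
        # commit only if nothing we read has been modified in the
        # meantime; otherwise throw the logs away and re-run txn.
        while True:
            reads = {}   # TVar -> version observed
            writes = {}  # TVar -> tentative new value
            def read(tv):
                if tv in writes:
                    return writes[tv]
                reads.setdefault(tv, tv.version)
                return tv.value
            def write(tv, value):
                writes[tv] = value
            txn(read, write)
            with _commit_lock:
                if all(tv.version == v for tv, v in reads.items()):
                    for tv, value in writes.items():
                        tv.value = value
                        tv.version += 1
                    return

    # Usage: transfer 10 units between two accounts. The transaction
    # body names no locks, so there is no lock ordering to get wrong.
    a, b = TVar(100), TVar(0)
    def transfer(read, write):
        write(a, read(a) - 10)
        write(b, read(b) + 10)
    atomically(transfer)

Note what is gained and lost: transfer cannot deadlock, but under contention the runtime silently re-executes it, and the programmer has no direct handle on that cost.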

So, I think TM is both panacea and confounder:
  • Panacea: Programmers can work at a higher level that doesn't require thinking as hard about parallel execution.
  • Confounder: When working with transactions instead of low-level synchronization, the programmer gives up the detailed control that many applications need in order to achieve maximum performance.
TM may be the right tool for some programming jobs, but it is almost certainly not the right tool for all programming jobs.

03 December 2006

Impacts of multi-core processing on programming language design

Within the next couple of years, all modern desktop computers will have multiple CPUs, thanks to multi-core packaging. In the context of programming languages, though, this doesn't matter much until the number of CPUs crosses a certain threshold, say, 4-8 CPUs. There are a few reasons for this, including:
  • 2 CPUs aren't a very compelling motivation for the significant extra development effort required for multi-threaded programming.
  • There are numerous hackish programming solutions that work reasonably well when targeting sub-4-CPU systems, which reduces the pressure to develop general solutions.
  • General solutions have some inherent overhead that substantially eats into the performance gains. Depending on the programming model, general solutions need on the order of 4-8 CPUs before the overhead is an acceptable tradeoff for the improved scalability.
I have been waiting for multi-threaded programming to provide truly scalable performance gains for general computing ever since I started out developing software on IBM's OS/2 operating system in the early 1990s. The crossover point has been "a few years" away for 15 years, but now it's so close I can smell it.

Researchers have long since developed the raw technology necessary to see us through this transition, but current tools are woefully lacking. Let me briefly describe the shortcomings of some of the tools we currently have available.
  • C/C++ pthreads: C and C++ are dangerous, difficult languages when it comes to writing reliable software, even before multi-threading enters the picture. The single-image programming model that pthreads provides adds insult to injury. Even experts have a very difficult time writing reliable multi-threaded C/C++ programs. On top of this, development aids such as debuggers are of limited use here, because of the Heisenbug Principle: observing the program perturbs its timing, often making the failure disappear.
  • Java, C#: These languages improve on the C/C++ situation by providing language-level threading support, and their runtime environments improve analysis and debugging prospects. For the next few years, this is probably the best we're going to do, but the single-image programming model is rather difficult to deal with, even under the best of circumstances. I think there will always be a place for languages like these, but as the number of CPUs in computers increases, they will increasingly be seen as low-level languages.
  • Erlang: Erlang relies entirely on message passing for communication among threads (let's ignore for now that Erlang's terminology differs). Threads do not explicitly share memory. There are two problems with this:
    • Erlang actually runs all threads inside a single process that uses only one CPU. This is an implementation detail, but in practice it limits flexibility, and the workarounds are less than ideal.
    • The overhead of passing all data between threads as messages can be very high, depending on the application. This is a general concern, but at some threshold number of CPUs, I expect it to become an acceptable cost of developing highly scalable multi-threaded software.
  • Perl, Python, Ruby, etc.: These languages vary in their approaches to multi-threading, but I think it fair to say that none of them provide scalable, useful multi-threaded development support. I find this noteworthy because this class of languages is of increasing importance both for scripting and for larger-scale systems programming. These languages will have to adapt if they are to maintain their value as systems programming languages.
I alluded above to a division between two approaches to multi-threaded programming: 1) single-image and 2) message-passing. Right now, (1) is of primary importance (and the time to worry about it really is now). (2) is creeping up on us quickly though; I predict that it will really start mattering when we start dealing with anything more than about 8 CPUs. Why? Because in my experience, the only reliable approach to writing software that scales beyond 8 CPUs is to rely mostly on message passing.
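
For flavor, here is roughly what the message-passing style looks like in Python, using the standard multiprocessing module (worker processes, with queues as mailboxes); the squaring is a placeholder for real work. Nothing is shared, so nothing needs locking, and the design scales by adding workers -- at the cost of serializing every message, which is exactly the Erlang-style overhead mentioned above.

    from multiprocessing import Process, Queue

    def worker(inbox, outbox):
        # Each worker owns its state outright; all communication happens
        # through explicit messages, never through shared memory.
        while True:
            msg = inbox.get()
            if msg is None:           # sentinel: shut down cleanly
                break
            outbox.put(msg * msg)     # placeholder for real work

    if __name__ == '__main__':
        inbox, outbox = Queue(), Queue()
        workers = [Process(target=worker, args=(inbox, outbox))
                   for _ in range(4)]
        for w in workers:
            w.start()
        for item in range(100):
            inbox.put(item)
        results = [outbox.get() for _ in range(100)]
        for w in workers:
            inbox.put(None)
        for w in workers:
            w.join()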

Okay, here's where I pull out my crystal ball. I predict that five years from now, the languages that provide the highest productivity with regard to multi-threaded programming will make message-passing easy (i.e. it will be the primary mode of multi-threaded development), and shared-image-based threading possible. Right now, none of the available languages I'm aware of provide this focus. What really concerns me though is that of the primary "scripting" languages, none are even close to providing the necessary programming infrastructure, let alone the appropriate focus on methodology. This is where my attentions are currently focused, and I expect many of my future ramblings will relate to the topic.

02 December 2006

Things to come

For some time, I've been writing notes on what I would blog about, were I to have a blog. Well, I have a blog now. Given the blog name, it should come as no surprise that forthcoming posts will bore family members within an inch of death. Maybe others will derive some value though.

Without further delay, here's an incomplete list of topics to come:
  • Garbage collection: Fast suspend/resume with critical section support
  • Effects of multi-core computing on programming language design
  • Why virtual machines matter
  • Unicode's impact on programming language design
  • Phylogenetic inference: Star decomposition is biased!
This is the sort of stuff that keeps me up at night. If nothing else, it can put others to sleep at night.