Brian to go.

I ran into this question while sketching alternatives for timeline labels. The right answer seems to be that left- and right-rotated text is equally difficult to read (short and sweet study here). If using a tab/folder metaphor, follow the tab orientation. If using a book spine metaphor, it’s more complex because USA and European publishers rotate differently.

Some neat tips on layering here!

And now something lighter… brson’s grandmother!

[11:13am] tjc: rustbot: build snap-stage3 qkzw
[11:14am] auREAX: qkzw?
[11:15am] tjc: auREAX: the FreeBSD bot
[11:15am] tjc: only brson knows why it’s called that
[11:15am] auREAX: ah
[11:15am] auREAX:
[11:16am] brson: qkzw was my grandmother’s name
[11:16am] tjc: oh, that’s touching
[11:17am] graydon: she was a ham radio operator?
[11:17am] brson: one of the best
[11:17am] graydon: (or maybe fighter pilot? hm. call signs…)

This is an interesting paper on the spectrum between “dead” and “live” coding, in both programming and music creation. With this view, I guess I could say that I want to move more programming technology from low levels of liveness to high levels.

This fall, I’m working on Servo, a parallel, task-based layout engine and framework being developed at Mozilla. So, I’ve been reading browser architecture research papers in search of inspiration. This is a complex domain, which makes it quite easy to hide devils in the details. I hope to point out a few of these details, so that others can read these papers more carefully.

(I realize many of these flaws are simply artifacts of a work in progress; that said, there’s no reason to believe they will be addressed if nobody raises issues.)

Lies, damn lies, and statistics

The most obvious devil in the aforementioned paper’s numbers is the breakdown of CPU time in the browser. Every paper has a different composition, workload, measurement technique, etc. So, some papers (such as this one) claim that CSS selector matching and cascading is not important, while others (such as Leo Meyerovich’s work) claim it is more important than layout tree constraint solving.

To me, it’s not clear how these numbers are derived. What functions constitute parsing? Layout? CSS? Rendering? In fact, WebKit merges and interleaves many of these stages to minimize tree traversals. Some parts of the browser precompute results for others. Most notably, the parser prunes a lot of work, and CSS and layout are interleaved. At the least, I would have liked a summary of what decomposition they used, and a process for measuring it consistently (i.e., with no human involved).

The other problem is workload. Replicating their experimental results is impossible, even if their sources were made available (they are not). Live web pages change daily, and how they are loaded depends greatly on network characteristics. Richards et al. are making progress towards better JavaScript and webpage benchmarks, but their work focuses mainly on JavaScript.

Correctness and Feasibility

Recent experience in designing Servo’s architecture has made me skeptical about hand-waving of details. In this paper, the obvious threat to correctness is splitting up web pages into independent mini-pages, executing in parallel, and combining them. There are not nearly enough details given for this to be reproduced, let alone judged, by browser vendors.

How does DOM event dispatch work across independent mini-pages? It is stated that the event target is found across mini-pages, and then the DOM event is created and dispatched in the main page. How can the main page dispatch the DOM event if the mini-page’s DOM contains some of the targets? This seems incompatible with having all JavaScript run in one page, unless the JS mini-page has DOM bindings for all mini-pages, in which case it seems unlikely that mini-pages can safely execute in parallel.

(N.B. they merge mini-pages if script needs to access DOM state. This almost seems like the common case to me: DOM event dispatch will need to access every single DOM element between the event target and the page root.)

  • Can mini-pages really be positioned correctly without parent context? This seems unlikely, since the CSS 3 Values module makes it easy for any element to specify widths relative to the viewport.
  • What happens when a new stylesheet or rule is added?
  • Building up DOM trees in parallel violates the HTML5 parser specification when error cases are encountered (as far as I can tell).
  • Using an external proxy to instrument/split content is not feasible for HTTPS pages and dynamically created pages (such as those that build the DOM from AJAX responses using JavaScript).
  • CSS rules cannot be pruned and specialized for mini-pages, as this breaks the CSS Object Model.
  • Details of synchronizing JavaScript and the DOM are elided. This is worrisome. 
  • Synchronizing JavaScript and layout results (i.e., .getClientRects, HTMLImageElement.width, and others specified in the CSSOM View module) is not even discussed, but is a major constraint for parallel browsers.

I’ve been on both sides of the “browser vendor” abstraction: first as a researcher pitching architectural changes to WebKit for Timelapse, and lately as a research browser designer looking at many proposed architectures. Ultimately two things can assuage concerns of correctness: compliance tests (i.e., running the browser’s or W3C’s test suites) and freely-available sources.

This article brings to words many of my opinions on where software engineering is today.

I would have enjoyed a discussion of the dichotomy of problems between industry and researchers. If Software Engineering research is mostly engineering, then how does one balance engineering by researchers with the engineering by professional engineers?

In most companies, software developers are part of the “engineering organization”; so, from their perspective, many developers see Software Engineering researchers in a bad light because they are telling practitioners how to do their own jobs. In most cases, professional developers are better engineers than SE researchers.

What’s the proper dichotomy? Do SE researchers focus on high-risk, high-reward projects? How do other engineering disciplines divide the effort and problem space?

Contained within this rant are some good points:
Code reuse is good
Removing unneeded dependencies is good
Fixing up libtool to not check for fortran would be brilliant

However, surrounding these good points are a lot of non-sequiturs and nostalgic, romanticised thinking harking back to the good ol’ days when all you needed was a terminal and a |. I’m sure it was lovely back then, but these days some of us like being able to plug in our USB devices and have them automatically detected, and being able to just click on the wifi network we want to connect to.

So libtiff is an implicit dependency for Firefox even though Firefox cannot display TIFF images, and this is an example of how badly designed the whole system is. Is it not more likely that someone somewhere (possibly the author, in his haste to show us how hardcore he was because he compiles everything from source) had forgotten to pass --disable-tiff to one of the other 121 packages?

Surely even the complaint that Firefox uses 122 other packages shows that code reuse, rather than being “totally foreign in the bazaar”, is actually quite a common occurrence?

The bazaar (I’m assuming this means the free software world) is anarchistic? Of all the software projects I’ve worked on over the past 15 years in Free Software, the one thing they have in common is a very strong sense of hierarchy, so they clearly can’t be anarchies, can they? Large distributions have smart people thinking about the best way to put them together; people can’t just fling their code into them, so these aren’t anarchies either.

There is also a far greater sense of reverence for Eric Raymond and his ramblings (the man has written articles about how he is the embodiment of the god Pan, for goodness sake) in this article than I have seen in the Free Software world. The man is a hack; anyone who has ever had to look at his code could tell you he isn’t very good at it. He stopped being relevant to the Free Software community circa 1999.

This seems to be the rant of an out-of-touch old hacker who’s a bit annoyed that technology has got too complicated for him to understand, but rather than seeing that as the inevitable progress of technology (a modern car is considerably more complex than Henry Ford’s, the modern aeroplane is more complicated than the one the Wright Bros used, and the modern computer OS is more complex than the operating systems of the early 80s), he concludes that it was better in the old days, and everyone today is a moron.

iain, in the comments section, shares my thoughts. Also, consider the alternatives of OS X, Android, and the Win32 APIs, which are no better than the “bazaar”, and are arguably worse considering you can rarely fix your own problems without blowing a wish to some vendor.

This describes a repurposing of Csmith to test Frama-C. Their basic approach was to turn the value analysis of Frama-C into an interpreter for C by cranking up precision and delaying joins indefinitely. There was also some work to invent test oracles for the static analyzer.

There are several different kinds of oracles. If running an analysis on a generated program doesn’t terminate, this is a precision bug. Crash bugs are similarly simple. To test the invariants deduced by the analyzer, special code was added to insert the invariants as C assertions at the end of the analyzed program. Then, the program was run; a failed assert indicated that something was incorrectly deduced.