creating a dynamic soundtrack for switchbreak’s “civilian”

Over the last week I’ve put a bunch of time into my new game project, a Switchbreak game called Civilian. I’ve been working on music for it, but this blog post isn’t about music; it’s about the crazy stunts you can pull in modern interpreted languages.

Dynamic music in Flash?

Most Flash games use a looping MP3 for background music — it takes just a couple of lines of code to implement, and while the looping isn’t perfectly seamless (there are brief pauses at the start and end, added by the MP3 encoder) it’s usually close enough. For Civilian, though, I wasn’t satisfied with a simple looped track. It’s a game about the passage of time and about player progression, and I wanted the music to reflect those things.

What I really wanted was a dynamic music system, something that would let me alter the music’s sequence or instrumentation on-the-fly in response to the player’s actions. There was no way that was going to work with simple, non-seamless looping MP3s, though — I needed to start working with audio data on a much lower level.

Writing a low-level mixer in AS3

Thankfully, the Flash 10 APIs do give you low-level audio functionality. You can get raw audio data out of an MP3 file, and then send that to audio buffers for playback; I’d already done just that in fact, to implement a seamless MP3 looper, and that gave me a crazy idea: if I could get audio data from one MP3 and play it back, could I also get data from two or more MP3s, mix them, and play them back all at once?

Once I’d confirmed with a simple proof-of-concept that the answer was an emphatic “yes”, I set about adding more tracks, and then implementing features like panning and volume control. By this point, the amount of CPU power required to run this mixing was significant (about 40% of one core on my 1.7GHz i5 MacBook Air), but Flash had no trouble keeping up while running some simple gameplay at 60FPS.
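
As a rough illustration of what such a mixer does, here is a minimal sketch. It is written in TypeScript rather than AS3, and it is not the game’s actual code: the `Track` shape, the `mixTracks` name, the use of mono sources, and the equal-power pan law are all assumptions made for the example. It sums several looping tracks into one interleaved stereo buffer, applying each track’s volume and pan.

```typescript
// Hypothetical track shape: mono samples plus per-track mix settings.
interface Track {
  samples: Float32Array; // mono PCM in the range [-1, 1]
  volume: number;        // 0 = silent, 1 = full volume
  pan: number;           // -1 = hard left, 0 = centre, 1 = hard right
}

// Mix `frames` frames, starting at sample position `start`, into an
// interleaved stereo buffer (L, R, L, R, ...). Tracks loop at their end.
function mixTracks(tracks: Track[], start: number, frames: number): Float32Array {
  const out = new Float32Array(frames * 2);
  for (const t of tracks) {
    if (t.samples.length === 0) continue;
    // Equal-power pan law (an assumption; a linear law would also work).
    const angle = ((t.pan + 1) * Math.PI) / 4;
    const gainL = Math.cos(angle) * t.volume;
    const gainR = Math.sin(angle) * t.volume;
    for (let i = 0; i < frames; i++) {
      const s = t.samples[(start + i) % t.samples.length];
      out[i * 2] += s * gainL;
      out[i * 2 + 1] += s * gainR;
    }
  }
  // Clamp so that summed tracks cannot leave the valid sample range.
  for (let i = 0; i < out.length; i++) {
    out[i] = Math.max(-1, Math.min(1, out[i]));
  }
  return out;
}
```

The inner loop runs once per track per output frame, which is why the CPU cost grows with the number of channels.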

A screenshot from my test app, with five channels of audio running

From mixer to sequencer

A few days later I had more than just a mixer: I had a simple pattern-based sequencer. Instead of looping MP3s from start to finish, it splits the MP3 tracks into bars, and then plays those bars in accordance with a sequence stored in an array in the AS3 code.
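
The idea of slicing a track into bars and replaying them from a sequence array can be sketched as follows. This is an illustrative TypeScript sketch, not the AS3 from the game; the function names, the fixed tempo, and the convention that a negative index means a silent bar are assumptions.

```typescript
// Number of samples in one bar, assuming a fixed tempo and time signature.
function samplesPerBar(sampleRate: number, bpm: number, beatsPerBar: number): number {
  return Math.round((sampleRate * 60 * beatsPerBar) / bpm);
}

// Render a sequence of bar indices from one decoded track into a new buffer.
// A negative index leaves that bar silent (a convention assumed here).
function renderSequence(track: Float32Array, sequence: number[], barLength: number): Float32Array {
  const out = new Float32Array(sequence.length * barLength);
  sequence.forEach((bar, slot) => {
    if (bar < 0) return; // silent bar
    const start = bar * barLength;
    // subarray() is a view, so no samples are copied until set() runs.
    out.set(track.subarray(start, start + barLength), slot * barLength);
  });
  return out;
}
```

For example, at 44,100 samples per second, 120 BPM and four beats to the bar, each bar is 88,200 samples long, so a sequence such as `[0, 0, 1, 2]` plays the first bar twice before moving on.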

This actually fits quite well with how I tend to write my music. I can arrange the track basically how I want it in Ardour, then record each unique section of each track to audio, and string those sections together to produce a single MP3 track for each instrument. Then, I can create a sequence within the AS3 code that reassembles those sections into my original arrangement.

Each bar can have its own settings, too, somewhat like the effects on each note in a tracker. So far, these just let me set the panning or volume for each track, or set up a volume slew (i.e. a fade in or fade out) to run over the course of the bar.
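
A volume slew of this kind amounts to a gain that ramps across the bar. The sketch below (TypeScript, with a hypothetical `BarSettings` shape and a linear ramp chosen purely for illustration) shows one way to apply it to a bar of mono samples.

```typescript
// Hypothetical per-bar settings: the volume at the start of the bar and the
// volume it should reach by the start of the next bar.
interface BarSettings {
  startVolume: number;
  endVolume: number;
}

// Apply a linear volume ramp across one bar of samples.
function applyVolumeSlew(bar: Float32Array, settings: BarSettings): Float32Array {
  const out = new Float32Array(bar.length);
  const delta = settings.endVolume - settings.startVolume;
  for (let i = 0; i < bar.length; i++) {
    // Dividing by bar.length (not length - 1) means the target volume is
    // reached exactly where the next bar begins, so consecutive bars join
    // without a jump in gain.
    const gain = settings.startVolume + delta * (i / bar.length);
    out[i] = bar[i] * gain;
  }
  return out;
}
```

A fade in is then `{ startVolume: 0, endVolume: 1 }`, a fade out is the reverse, and equal start and end values give a constant volume.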

Making the music dynamic was just a matter of replacing the static sequence array with code that generates the sequence on-the-fly. I have pattern templates for each track which I combine to create the sequence one bar at a time, adding or removing tracks or replacing one part with another (perhaps with a nice fade in/fade out) based on what’s happening within the game world.
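
To make the idea concrete, here is a small sketch of generating one bar of the sequence at a time from per-track templates. Everything in it is hypothetical: the `GameState` with a single `intensity` value, the `minIntensity` threshold, and the rule that a track fades in on the first bar after it was silent are invented for the example and are not taken from the game.

```typescript
// Hypothetical game state driving the music.
interface GameState {
  intensity: number; // 0 = calm, 1 = hectic
}

// Hypothetical template: a looping list of pattern indices for one track,
// plus the minimum intensity at which the track is heard.
// `patterns` is assumed to be non-empty.
interface TrackTemplate {
  name: string;
  patterns: number[];
  minIntensity: number;
}

// What one track should do for one bar; null means the track is silent.
interface BarStep {
  pattern: number;
  fadeIn: boolean;
}

// Build the next bar for every track from the templates and the game state.
// `previous` is the result for the preceding bar (empty for the first bar).
function nextBar(
  templates: TrackTemplate[],
  barNumber: number,
  state: GameState,
  previous: (BarStep | null)[]
): (BarStep | null)[] {
  return templates.map((t, i) => {
    if (state.intensity < t.minIntensity) return null;
    const pattern = t.patterns[barNumber % t.patterns.length];
    // Fade in only when the track was not playing in the previous bar.
    const fadeIn = previous[i] == null;
    return { pattern, fadeIn };
  });
}
```

Because each bar is derived from the current state, the arrangement can change at the next bar boundary without interrupting what is already playing.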

Pushing interpreted languages

As if all the above wasn’t enough, I decided to add an optional audio filter on the output. For certain scenes in the game I want to be able to make the music sound like it’s coming from a radio, so I added a simple bandpass filter, based on a biquad filter implementation from Dr. Dobb’s. If the filter is having any impact on my sequencer’s CPU usage, it’s far too small to notice.
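
For readers unfamiliar with biquads, the following is a minimal bandpass sketch in TypeScript. It is not the Dr. Dobb’s implementation or the game’s AS3 code: the coefficients are the widely used band-pass form (constant 0 dB peak gain) from Robert Bristow-Johnson’s Audio EQ Cookbook, and the class name and parameters are assumptions for the example.

```typescript
// Biquad bandpass filter, direct form I, for a mono stream of samples.
class BandpassFilter {
  // Normalised coefficients (b1 is always zero for this filter type).
  private b0: number;
  private b2: number;
  private a1: number;
  private a2: number;
  // Previous two inputs and outputs, kept between calls to process().
  private x1 = 0;
  private x2 = 0;
  private y1 = 0;
  private y2 = 0;

  constructor(sampleRate: number, centreHz: number, q: number) {
    const w0 = (2 * Math.PI * centreHz) / sampleRate;
    const alpha = Math.sin(w0) / (2 * q);
    const a0 = 1 + alpha;
    this.b0 = alpha / a0;
    this.b2 = -alpha / a0;
    this.a1 = (-2 * Math.cos(w0)) / a0;
    this.a2 = (1 - alpha) / a0;
  }

  // Filter a block of samples; state carries over to the next block.
  process(input: Float32Array): Float32Array {
    const out = new Float32Array(input.length);
    for (let i = 0; i < input.length; i++) {
      const x = input[i];
      const y = this.b0 * x + this.b2 * this.x2 - this.a1 * this.y1 - this.a2 * this.y2;
      this.x2 = this.x1;
      this.x1 = x;
      this.y2 = this.y1;
      this.y1 = y;
      out[i] = y;
    }
    return out;
  }
}
```

Each output sample costs only a handful of multiplications and additions, which is consistent with the filter having no noticeable effect on CPU usage. For stereo output, one filter instance per channel would be needed, since each instance holds its own state.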

Eventually, I gave up trying to think of efficient ways of doing things, and just started doing them in the simplest way possible. I’ve since done some optimisation work, to help retain a steady frame rate on slower systems (using my old Latitude E6400, clocked down to 800MHz, as my test machine), but those optimisations are totally unnecessary on more typical systems.

Ten years ago, I wrote audio mixing code for the GBA, and it looked something like this

The last time I wrote audio mixing code, it was for the ARM7 CPU inside the Game Boy Advance. On that system, compiled C code wasn’t fast enough, so I had to rewrite the critical loops in hand-optimised ARM assembler code to get the necessary performance. To see an interpreted language do the same things so easily is still somewhat mind-boggling, but it’s a testament to the advances made in modern interpreters, and to just how fast modern PCs are.

It’s somewhat fitting that this was the week that the GNOME developers announced that JavaScript would become the preferred language for GNOME app development. That announcement caused a surprising amount of backlash, but I think it makes perfect sense: not only is JavaScript a capable and incredibly flexible language with a huge developer community, but it performs incredibly well, too. In fact, I doubt that any other interpreted language has ever had as much developer time invested in improving its performance.

The writing’s on the wall for Flash, of course, but HTML5 and JavaScript are improving rapidly, and frameworks are being written that should make it just as easy to write games for them as it is to write for Flash today. When that happens, it should be a simple matter to port my dynamic music system to JavaScript, and I’ll be very excited to see that happen.

a week-and-a-half with GNOME 3

I’m as surprised as anyone to admit it, but I’ve spent the last week and a half using GNOME 3, and it hasn’t been too painful; in fact, I’ve had no trouble remaining productive in it. I’ve missed some of GNOME 2’s features, but it’s definitely been a more pleasant and productive experience than my time with Ubuntu’s Unity desktop after the 11.04 release.

A lot of people have reacted poorly to GNOME 3, and I can understand their frustrations. I’m not sure why I haven’t had the same experience, but perhaps my time with Mac OS X has something to do with it: I’m already used to using the Exposé-style overview in the GNOME Shell, and to having Alt-Tab work on an application level. There’s a new key combo for switching between the windows of an individual application; it defaults to Alt and whatever key sits above the Tab key in your locale (Alt-` in my case). It still took a bit of adjustment, but I was soon zipping between windows and launching applications without any dramas.

GNOME Shell's overview provides quick access to your applications and windows

The GNOME Shell cheat sheet covers a lot of the less obvious functionality built into the Shell. I do find some of the hidden functionality a bit silly (having to hold Alt to reveal the “Power Off” menu item, for instance), but it still doesn’t take long to get up to speed.

I will add one caveat to my comments: I’ve been using GNOME 3 on my laptop, where (as I remarked a couple of posts back) I spend most of my time using Firefox, Chrome, Thunderbird, terminal windows, and a text editor. I haven’t used it with JACK and my regular assortment of music tools yet, so I’m still not sure how it’ll handle that workflow, or if its greater use of video hardware is going to cause any latency issues.

A quick reality check, nine years in the making

One thing I can’t help but feel in the release of GNOME 3.0 is a sense of history repeating; after all, it’s not the first major release of GNOME to slash away at the desktop’s feature set and remodel the remains based on design principles put together by a core team of developers.

Red Hat Linux 8, with the then-new GNOME 2.0. I'd forgotten how much like a browser Nautilus looked

GNOME 2.0 had substantially less functionality and configurability than the 1.4 release that preceded it, and it imposed a set of Human Interface Guidelines that described how user interfaces should be designed. I think you’d have a hard time finding someone today who’d claim that those changes weren’t for the best in the long run, but at the time, the streamlining was considered too extreme, and the HIG was controversial.

I think we forget just how much was missing in GNOME 2.0, partly because it’s been so long, but mostly because all of the really important features have found their way back in. To remind myself, I took a look back in time: I installed Red Hat Linux 8 in a VM and fired up its default GNOME 2.0 desktop.

The configuration dialogs in GNOME 2.0 did actually cover some options that are currently missing in GNOME 3, such as font and theme settings, and its panels had greater flexibility than the GNOME Shell’s single top panel, thanks to the bundled selection of applets. However, there were surprisingly few applets that provided functionality that hasn’t been incorporated into GNOME 3 in some way.

Even this minimal window settings dialog from Red Hat 8 wasn't an official part of GNOME 2.0. A complete window settings dialog was added in the next release, GNOME 2.2

Leafing through the release notes for the subsequent GNOME 2 releases showed how quickly some of its missing functionality came back, and just how much the desktop has been polished over the years. While GNOME 3 throws away the visible desktop components, there’s a lot of GNOME 2 still in there, from the power, disk, sound, and networking management infrastructure through to its many tools and utilities.

GNOME 3.0 is a little different from GNOME 2.0 in that it changes the basics of navigating your desktop, and the developers have so far resisted requests to relax those changes. I’m still sure that it’s going to improve rapidly, though, and I do think that its developers will take the various criticisms on board. I don’t expect any dramatic design reversals, but I do expect improvements and refinements that will make GNOME 3 a viable option for many of the users that find it frustrating today.