Archive for August, 2010

Last week: CSS WG meeting at Opera, Chrome Labs and the year 275759

Published in CSS, Google Chrome, HTML, Last Week, Standards, tech, WebKit. Version: Chrome 7

Last week the CSS Working Group met at Opera’s office in Oslo, Norway, for a face-to-face meeting. Following tight planning, the members met for three days in a row, discussing topics ranging from the open CSS 2.1 issues and various CSS 3 modules to other subjects such as hit testing. Some of the results are clear: all open CSS 2.1 issues have been resolved and a range of specifications will have their priority increased (such as CSS Transitions and Transforms).

Furthermore, CSS 2.1 is expected to become a Proposed Recommendation by the end of the year. This would mean that the specification could be a W3C Recommendation early next year, allowing the working group to focus its attention on CSS 3 and beyond. During the meeting Mozilla’s David Baron also mentioned that Firefox will be implementing 3D Transforms, already available in Safari and Google Chrome.

As for Chromium and WebKit, a combined total of 1282 commits landed in their repositories. While this means there were fewer commits than the week before, there’s a lot more news to share about the projects. I’ll highlight some interesting items which occurred last week, and briefly list other interesting changes.

Firstly, it’s becoming more and more obvious to the Chrome team that their browser is lacking important features for the enterprise market. An area Google can tackle is policies. Policies are a way of defining the settings of the browser through the registry, Microsoft’s Administrative Template files or the, so far unannounced, ChromeOS Enterprise Daemon. Other policy and preference stores may be added in the future.

Policies allow companies to easily define common settings such as the proxy server to use, account synchronization and whether JavaScript should be enabled for websites. Unfortunately this also enables companies to block Chrome updates, but I’m sure the Chrome team will be looking at options to prevent people from doing this. Last week support for three new policies was added.

Another large update is the initial inclusion of the Google Chrome Labs page. Most other Google products, as well as Google itself, include a page with experimental features. Considering Chrome supports about 320 command line flags it won’t surprise you that adding such a page makes certain tests a lot more accessible. Google’s Nico Weber committed the initial version just over four days ago. You can try it out yourself by downloading a recent nightly and visiting about:labs.

The WebKit team has invested a lot of time in improving their support for various standards. Adam Barth and Eric Seidel enabled the last part of the new HTML5 Tree Builder: fragment parsing. Furthermore, support for HTML5-compliant doctype switching was added, symbolic CSS3 list-style-types are now supported and file inputs now respect HTML5’s fake path. Finally, due to this addition, you can now use HTML5’s date input types to start making plans for your birthday in the year 275759.
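That year is probably not arbitrary. A plausible explanation, assuming WebKit bounds its date inputs by the range of an ECMAScript Date (100,000,000 days on either side of 1 January 1970 UTC), is that 275759 is the last year which fits in that range in its entirety. A minimal sketch of the arithmetic, not taken from the WebKit source:

```javascript
// ECMAScript limits Date values to 100,000,000 days on either side of the epoch.
const MS_PER_DAY = 24 * 60 * 60 * 1000;
const maxTime = 100000000 * MS_PER_DAY; // 8.64e15 milliseconds

const lastInstant = new Date(maxTime);
const beyond = new Date(maxTime + 1); // one millisecond past the limit

// The range ends partway through the year 275760,
// so 275759 is the last year that is covered completely.
const lastYear = lastInstant.getUTCFullYear();
const lastCompleteYear = lastYear - 1;

console.log(lastInstant.toISOString());      // +275760-09-13T00:00:00.000Z
console.log(lastCompleteYear);               // 275759
console.log(Number.isNaN(beyond.getTime())); // true: the Date is invalid
```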

Now that the new Tree Builder has been completed, except for a lot of fine-tuning of course, thousands of lines of code were up for deletion. The old Tree Builder itself was removed on the 24th of August. Further cleanups were done with the removal of their current implementation of Mozilla’s XML Binding Language (XBL). It hadn’t been maintained in years, so the decision was made to remove it entirely.

Further updates last week

Starting next Thursday I will be in Brighton, United Kingdom. Together with Krijn, Anne and Matijs I’ll be attending dConstruct 2010. Perhaps I’ll be seeing you there? 🙂


Last week in.. WebKit and Chromium!

Published in Browser Vendors, Google Chrome, Last Week, tech, WebKit. Version: Chrome 7

It’s hard to keep track of huge open source projects which receive hundreds of updates per week. In the case of WebKit and Chromium, a total of 1113 changes landed in the past seven days alone, including lots of new features, enhancements and of course tons of bugfixes. Inspired by Paul Irish and Divya Manian, I’m going to experiment to see whether it’s doable to regularly write (smaller) updates like these.

In the past seven days WebKit has seen 396 commits done by about 80 authors. A decent number of them were done by Google engineers working on storage related systems. Firstly there is the File API specification; Chromium has been supporting various asynchronous File Reader functions for a few months now.

Last Thursday Eric Uhrhane committed the first part of the File Writer spec. Even though it’s only a placeholder, it shows that Google’s actively working on implementing the features. Official word on synchronous methods is still pending.

The other storage system they’re working on is a specification I wasn’t aware of myself, a Directories and System extension to the File API. The initial bits of the implementation were committed by Kinuko Yasuda on Monday. Being built entirely on top of the File API, it’s likely that the main use-case for the implementation will be Chromium OS. Regardless, most of the use-cases would be useful in current browsers as well.

Folks at Apple have been busy improving the quality of the WebKit2 interface. Windowless plugins can now paint and receive mouse events, which means that the Vimeo Flash Player can be used again on Windows builds. A number of improvements to media playback have been added as well, such as improved detection of the “application/octet-stream” content-type and restoring the intrinsic size of a video after loading its poster. Simon Fraser solved a number of random crashes which became more obvious now that accelerated compositing has been implemented.

Also exciting, even though it has been working for a while already: support for inline MathML has been announced for Safari nightlies. MathML is a way of rendering complex math straight in your browser, pretty much like SVG is for graphics. MathML can be, just like SVG, included in any HTML5 page. Henri Sivonen has created a nice example demonstrating both technologies.

Within the Chromium team, work is well under way to fully integrate ANGLE into the browser. DirectX libraries will be distributed with the Windows versions and a public experiment has started to gather statistics about GPU capabilities. The browser also received per-plugin content settings, although they are still hidden behind a runtime flag.

The version of Chromium’s trunk has been updated to 500, which certainly is a milestone. An early implementation of the remote WebDriver API has landed, allowing basic remote control of your browser. Finally, the V8 JavaScript engine has been updated to version 2.3.9 (changelog).

Other recent changes

Of course, with a total of 1113 commits in both repositories, there’s a lot which hasn’t been mentioned yet!

  • Eric and Tony have solved some more issues around the HTML5 Tree Builder.
  • The Qt port now supports touch events in WebKit2, courtesy of Juha Savolainen.
  • Chromium’s accelerated compositing rendering logic has been refactored.
  • <style> elements within <noscript> are now ignored if JavaScript is enabled.
  • Kenneth Russell now is a WebKit reviewer (congratulations!).
  • Some SVG Pattern fixes were landed by Nikolas Zimmermann.
  • Pushed SPDY streams in Chromium now get closed automatically as well.
  • Accelerated Compositing for <canvas> will be compiled in by default.
  • Chromium can now use the Windows 7 Location Provider for Geolocation.

Even though it’s just a week, an incredible amount of work happens within these two huge open source projects. In order to include other browsers (Firefox, Opera and Internet Explorer) and specifications, I’ll have to cut back on the details quite a bit. This week the CSS Working Group is meeting face-to-face in Oslo; I’m sure that’ll be interesting to include next week ;)


Synthesizing and processing audio through JavaScript: the Audio API

Published in Gecko, Standards, tech, WebKit.

In the past few years a whole range of visual effects have been standardized. Future websites can render pretty much anything using bitmap canvases, display 3D content using CSS 3D Transforms or WebGL and even implement entire key-frame based animations using nothing but CSS. Combined with specifications like the Application Cache and Local Storage, “HTML5” enables a whole new range of web-based applications.

Unfortunately, now that almost everything can be visualized on your monitor, the inability to synthesize, process, and analyse audio streams is becoming more and more obvious. While Flash provides fairly extensive APIs for working with sound, having a native (and preferably more extensive) API available to synthesize, process, and analyse any audio source is much more convenient. That’s why the W3C Audio Incubator Group was founded!

Don’t get too excited just yet: while an initial draft has been published by Google’s Chris Rogers, you shouldn’t expect the API to be finished within the year. The initial version received lots of input from six Apple engineers: Maciej Stachowiak, Eric Carlson, Chris Marrin, Jer Noble, Sam Weinig and Simon Fraser, and now frequently gets updated based on feedback received via the mailing list. The draft specifies various features for the API: spatialized audio, a convolution engine, real-time frequency analysis, biquad filters and sample-accurate scheduled sound playback. Wait, spatialized what?

The reason why it doesn’t exist already

The complexity involved with synthesizing, processing, and analysing audio is one of the key reasons why it doesn’t exist already. Most audio today has a sampling rate of just over 44 thousand samples per second; tracks on DVDs and Blu-ray discs can be as high as 192 thousand samples per second. When multiplied by the number of sound channels and considering the decoding required to make sure the file makes sense, you can imagine the amount of work that goes into translating that MP3 file to waves our ears can interpret.
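To put rough numbers on that multiplication, here is a small sketch. The 44,100 Hz figure is the usual CD sampling rate behind "just over 44 thousand", and the channel counts (2 for stereo, 6 for a 5.1 layout) are illustrative assumptions:

```javascript
// Samples that have to be produced every second: sample rate times channel count.
function samplesPerSecond(sampleRate, channels) {
  return sampleRate * channels;
}

const cdStereo = samplesPerSecond(44100, 2);       // CD-quality stereo
const hiResSurround = samplesPerSecond(192000, 6); // 192 kHz with 5.1 channels

console.log(cdStereo);      // 88200
console.log(hiResSurround); // 1152000
```

Even plain stereo means tens of thousands of values per second, every second, before a single effect has been applied.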

Of course, part of this process is handled by hardware, like converting the digital stream to an analog signal. However, applying effects to an audio stream happens entirely in software where each sample gets processed. In situations where effects are applied and the processed sound is played back almost simultaneously, you can imagine how critical things like buffering and timing are.
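To make "each sample gets processed" concrete, here is a minimal sketch of about the simplest effect there is, a volume change, written as a plain loop over a buffer of samples. The function name and the 440 Hz test tone are made up for illustration:

```javascript
// Scale every sample in a buffer by a constant factor.
function applyGain(samples, gain) {
  const out = new Float32Array(samples.length);
  for (let i = 0; i < samples.length; i++) {
    out[i] = samples[i] * gain;
  }
  return out;
}

// One second of a 440 Hz sine wave at 44,100 samples per second (mono).
const sampleRate = 44100;
const tone = new Float32Array(sampleRate);
for (let i = 0; i < tone.length; i++) {
  tone[i] = Math.sin(2 * Math.PI * 440 * i / sampleRate);
}

// Halving the volume already takes 44,100 multiplications per second of audio.
const quieter = applyGain(tone, 0.5);
console.log(quieter.length); // 44100
```

A real effect such as a filter or a reverb does far more work per sample than a single multiplication, which is where timing becomes critical.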

Another problem is JavaScript performance. While the scripting engines have become way more powerful in the last few years, they can be in the order of twenty times slower than well-optimized native code. Once that native code also uses one of the SSE instruction sets, which enhance your processor with highly optimized abilities to do audio-related math, today’s scripting engines still have a long way to go.

Native processing to the rescue: just create an API

Performance can be improved by moving most of the processing away from JavaScript. With the Application Programming Interface (API) the Audio Incubator Group will likely be proposing, your script gains the ability to “describe” what you want to do, rather than doing it itself. Right now, however, work is being done to implement an interface allowing direct JavaScript processing in the API. Such an interface could be used to prototype audio processing algorithms and create educational demos, something which already was a possibility using Adobe Flash and Mozilla’s Audio Data API.

The idea is simple: the “base” is an AudioContext interface which manages connections between the different Audio Nodes. The context contains a Destination Node by default, which represents the output device on your computer. This could be your speakers, your headphones or, perhaps in the future, even a file on your hard drive.

Of course, there have to be audio sources as well. There are various kinds of sources: MediaElementAudioSourceNode for <audio> and <video> tags and AudioBufferSourceNode for other kinds of input, like MP3 files requested via XHR. Other types are yet to be defined, but source nodes like DeviceElementSourceNode aren’t unthinkable, which could be used to process microphone input via the <device> element.
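A minimal sketch of the simplest possible graph, a source connected straight to the destination, could look as follows. The method names follow the Web Audio API as browsers later shipped it and are an assumption here, since the draft may still name things differently. The helper name routeToOutput is made up for illustration:

```javascript
// Route the audio of an <audio> or <video> element to the output device.
// `context` is expected to be an AudioContext, `mediaElement` a media element.
function routeToOutput(context, mediaElement) {
  const source = context.createMediaElementSource(mediaElement);
  source.connect(context.destination); // the default Destination Node
  return source;
}

// Only runs in a browser that exposes the API; harmless anywhere else.
if (typeof AudioContext !== "undefined" && typeof document !== "undefined") {
  const video = document.querySelector("video");
  if (video) {
    routeToOutput(new AudioContext(), video);
  }
}
```

The script only describes the routing; moving the samples from the element to the speakers is left to the browser.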

Between audio sources and destinations, there can be other types of nodes to perform various kinds of manipulations. The specification currently defines the following interfaces:

  • AudioGainNode: changes the volume of the audio.
  • AudioPannerNode: positions and spatializes audio in a 3D space.
  • BiquadFilterNode: adds lowpass, highpass, and other types of common filters to the audio.
  • ChorusNode: adds a chorus effect to the audio.
  • ConvolverNode: adds effects to audio, such as imitating the sound of a concert hall.
  • DelayNode: applies dynamically adjustable delays to an AudioNode.
  • DynamicsProcessorNode: adds shaping (compressing/expanding) effects.
  • WaveShaperNode: adds non-linear waveshaping effects, like distortion.

These nodes form the foundation of many of the features currently available in audio systems, but the specification is still far from finished and more types of nodes may be added. For analysis you could use a RealtimeAnalyserNode, which allows you to analyse the audio node in real time. This could be used, for example, to display the tones output by a stream.
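Chaining a few of these nodes between a source and the destination might look like the sketch below. Again, the factory methods (createGain, createBiquadFilter, createAnalyser) are the names from the Web Audio API as it later shipped, not necessarily those of the draft, and the gain and cut-off values are arbitrary:

```javascript
// Insert a volume control, a lowpass filter and an analyser
// between an existing source node and the output device.
function buildChain(context, source) {
  const gain = context.createGain();           // AudioGainNode in the draft
  gain.gain.value = 0.8;                       // arbitrary volume

  const filter = context.createBiquadFilter(); // BiquadFilterNode
  filter.type = "lowpass";
  filter.frequency.value = 1000;               // arbitrary cut-off in Hz

  const analyser = context.createAnalyser();   // RealtimeAnalyserNode in the draft

  source.connect(gain);
  gain.connect(filter);
  filter.connect(analyser);
  analyser.connect(context.destination);

  return { gain, filter, analyser };
}
```

In a browser, `source` could be any source node created from the same context, and the returned analyser could then be polled to draw the frequencies of the stream.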

An example: dynamically changing the language of a video

Currently there is no clean way to switch between alternative audio streams for an HTML5 <video> element. The Audio API is ideal for such a purpose. When you keep a number of things in mind, like fragmenting the audio in smaller files to speed up the (initial) loading, it won’t be hard to create a language switcher:

  1. Create an AudioContext,
  2. Get the audio sources from the <video> element using a MediaElementAudioSourceNode,
  3. Decrease the volume of the video using an AudioGainNode,
  4. Get the new audio stream by requesting the MP3 via XHR and putting it in an AudioBufferSourceNode,
  5. Combine the two using the Dynamics Compressor (DynamicsProcessorNode),
  6. Play the audio stream.
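The steps above can be sketched roughly as follows. This is an illustration under assumptions rather than code from the draft: the method names are those of the Web Audio API as it later shipped, switchLanguage is a made-up helper, the AudioContext from step 1 is passed in, and the MP3 is assumed to have been fetched and decoded into an audio buffer already:

```javascript
// `context`: an AudioContext (step 1). `video`: the <video> element.
// `dubbedBuffer`: the decoded audio of the alternative language track.
function switchLanguage(context, video, dubbedBuffer) {
  // Step 2: tap the audio of the video itself.
  const original = context.createMediaElementSource(video);

  // Step 3: turn the original track down (0.1 is an arbitrary level).
  const originalGain = context.createGain();
  originalGain.gain.value = 0.1;

  // Step 4: wrap the new track in a buffer source.
  const dubbed = context.createBufferSource();
  dubbed.buffer = dubbedBuffer;

  // Step 5: combine both tracks in a dynamics compressor.
  const mix = context.createDynamicsCompressor();
  original.connect(originalGain);
  originalGain.connect(mix);
  dubbed.connect(mix);
  mix.connect(context.destination);

  // Step 6: play, starting at the current position of the video.
  dubbed.start(0, video.currentTime);

  return { originalGain, dubbed, mix };
}
```

Keeping the two tracks in sync while the user seeks or pauses is left out of this sketch.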

This can be demonstrated using the following diagram:

These same techniques could be used to dynamically control background sounds for clips and create timed effects for games using an arbitrary number of output channels (which could be 2 for stereo, 5.1 for surround or even more!). Of course, more normal use-cases can be thought of as well: a beep when you click on a button, messages when interactive validation in a form fails or a music player featuring cross-over effects.

A number of examples demonstrating the capabilities of the Web Audio API are available as well, but keep in mind that you have to build WebKit yourself. They do show the involved JavaScript code however!

I’m really interested in the progress of the Audio Incubator Group and can see quite some benefits in being able to synthesize, process, and analyse audio through JavaScript. I’ve signed up to their mailing list and follow prototypes in Gecko and WebKit. Are you interested too? Consider following @AudioXG on Twitter or subscribing to the public-xg-audio mailing list at the W3C. Lots of cool things are yet to be invented!

Thanks and credits to Chris Rogers and Koen ten Berg for their technical input and feedback!
