There have been some interesting movements in the browser landscape lately: Opera moving away from Presto, and Chromium for Android and Firefox Mobile making a stronger stand on mobile platforms. The Web Platform has less fragmentation right now than it ever has had before. Today, the Chromium project introduced Blink: a new open source rendering engine based on WebKit.
I’ve been a contributor to the WebKit project for a few years now, have written hundreds of updates about the project and have even published dozens of those on the official weblog, Surfin’ Safari. It’s a great community, one that I’m very proud to be a member of. Working on WebKit introduced me to a great number of people and taught me an incredible amount.
To improve the open web through technical innovation and good citizenship.
A few weeks ago, Max Heinritz introduced the new Chromium Feature Dashboard, listing not just the implementation status, but also the maturity, interoperability and testability of Web Platform features in Chromium. With Blink, Chrome is taking this one step further by introducing strict guidelines that new features have to adhere to.
To fulfill our good citizenship mission, we need to be careful to add new features to the web platform in a transparent, responsible, and compatible manner. We measure success as moving the open web platform forward as a whole, not just as moving one implementation forward.
I’m both positively surprised by and very confident in the way that Google’s Web superstars (Alex, Dimitri, Paul, Eric and many others) are making this happen. Google projects, including Dart, NaCl and others, are subject to the same guidelines when introducing their changes to the Web Platform as any other participant, including a very strong preference for standardization and compatibility with other browser vendors.
Compatibility risk will be one of the most important decision criteria for enabling new web platform features in the new engine. A launch process has been introduced for new features that touch the Web Platform, one that includes several public announcements and review steps. This is not just limited to Googlers: while the process for getting commit rights to Blink is similar to that of the Chromium project, there is a fast track available for WebKit contributors. I’m also very hopeful that we’ll soon be welcoming the first non-Google OWNER to the project as well.
I’m a Web Developer. Thank you for making my life even more complicated!
Paul Irish asked a question about a month ago: what is WebKit? WebKit implementations are not homogeneous: they differ in everything from the code used for downloading resources to the mechanism used to display pages, as well as in the set of features they support. Chromium, Safari and other implementations of WebKit should already have been considered separate testing targets.
All browsers’ latest versions are absolutely excellent from a technical perspective. With Ian Hickson’s HTML parsing algorithm now widely implemented and all the rendering engines coming increasingly close to passing the entire CSS 2.1 test suite, we’re almost at a point where the true foundations of layout on the Web are both standardized and interoperable between implementations. The Chromium project now intends to take this further by working with the W3C to make sure both conformance tests and the testing infrastructure can be shared between different browsers.
Another area which I’m very excited about is that Blink will be following Mozilla’s ideas with regard to vendor prefixes. While support for legacy WebKit vendor prefixes will be maintained in the short term, Blink will strive to avoid shipping vendor-prefixed features to the Stable channel altogether.
The end of the Last Week in WebKit articles?
I’ve published 114 articles describing changes in the WebKit and Chromium repositories that occurred in the week before. I do still intend to continue doing so; however, they will cover changes in Blink instead of WebKit. I won’t be posting updates to Surfin’ Safari anymore, although I am definitely grateful I had the chance to do so for a number of months.
Personally, I’m really excited about this change. The scale of the Web Platform calls for an ecosystem which doesn’t just welcome participation and collaboration, but also has a fair and healthy amount of competition. Jake made a great analogy about Blink’s potential in the near future: Blink can do for layout and rendering what V8 did for JavaScript, although the improvements will be more gradual.
I’m confident that Chromium can use Blink to increase diversity, while driving innovation, as it has shown to be capable of in the past. For more background on the why, I encourage you to read Alex Russell’s great post on the announcement of Blink. Justin also sheds some light on the security implications of this change, and Paul Lewis also shared some nice insights.
In the past few years a whole range of visual effects have been standardized. Future websites can render pretty much anything using bitmap canvases, display 3D content using CSS 3D Transforms or WebGL and even implement entire key-frame based animations using nothing but CSS. Combined with specifications like the Application Cache and Local Storage, “HTML5” enables a whole new range of web-based applications.
Unfortunately, now that almost everything can be visualized on your monitor, the inability to synthesize, process, and analyse audio streams is becoming more and more obvious. While Flash provides fairly extensive APIs for working with sound, having a native (and preferably more extensive) API available to synthesize, process, and analyse any audio source is much more convenient. That’s why the W3C Audio Incubator Group was founded!
Don’t get too excited just yet: while an initial draft has been published by Google’s Chris Rogers, you shouldn’t expect the API to be finished within the year. The initial version received lots of input from six Apple engineers: Maciej Stachowiak, Eric Carlson, Chris Marrin, Jer Noble, Sam Weinig and Simon Fraser, and now frequently gets updated based on feedback received via the mailing list. The draft specifies various features for the API: spatialized audio, a convolution engine, real-time frequency analysis, biquad filters and sample-accurate scheduled sound playback. Wait, spatialized what?
The reason why it doesn’t exist already
The complexity involved with synthesizing, processing, and analysing audio is one of the key reasons why it doesn’t exist already. Most audio today has a sampling rate of just over 44 thousand samples per second; tracks on DVDs and Blu-ray discs can go as high as 192 thousand samples per second. Multiply that by the number of sound channels, add the decoding required to turn a compressed file into raw samples, and you can imagine the amount of work that goes into translating that MP3 file into waves our ears can interpret.
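To put rough numbers on that, consider the raw sample throughput alone, before any decoding or effects are applied (a back-of-the-envelope calculation, nothing more):

```js
// Samples that have to be produced every second, per format.
var cdQuality = 44100 * 2;   // 44.1 kHz stereo        ->    88,200 samples/s
var bluRay    = 192000 * 6;  // 192 kHz, 5.1 surround  -> 1,152,000 samples/s
```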
Of course, part of this process is handled by hardware, like converting the digital stream to an analog signal. However, applying effects to an audio stream happens entirely in software where each sample gets processed. In situations where effects are applied and the processed sound is played back almost simultaneously, you can imagine how critical things like buffering and timing are.
Another problem is JavaScript performance. While scripting engines have become far more powerful in the last few years, they can still be in the order of twenty times slower than well optimized native code. Compared with native code that uses one of the SSE instruction sets, which provide highly optimized instructions for audio-related math, today’s scripting engines still have a long way to go.
Native processing to the rescue: just create an API
Performance can be improved by moving most of the processing away from JavaScript. With the Application Programming Interface (API) the Audio Incubator Group will likely be proposing, your script gains the ability to “describe” what you want to happen, rather than doing the work itself. Right now, however, work is being done to add an interface to the API that allows direct JavaScript processing. Such an interface could be used to prototype audio processing algorithms and to create educational demos, something which was already possible using Adobe Flash and Mozilla’s Audio Data API.
The idea is simple: the “base” is an AudioContext interface which manages connections between the different Audio Nodes. The context contains a Destination Node by default, which represents the output device on your computer. This could be your speakers, your headphones or, perhaps in the future, even a file on your hard drive.
Of course, there have to be audio sources as well. There are various kinds of sources: MediaElementAudioSourceNode for <audio> and <video> tags and AudioBufferSourceNode for other kinds of input, like MP3 files requested via XHR. Other types are yet to be defined, but a source node like DeviceElementSourceNode, which could be used to process microphone input via the <device> element, isn’t unthinkable.
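To make this concrete, a minimal sketch of a context with a single source is shown below. It uses the interface names from the draft; prototype builds may expose a prefixed constructor (such as webkitAudioContext) and method names may still change:

```js
// Create the context; it owns the destination node (the default output device).
var context = new AudioContext();

// Route the audio of an existing <video> element straight to the output.
var video = document.querySelector('video');
var source = context.createMediaElementSource(video);
source.connect(context.destination);
```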
Between audio sources and destinations, there can be other types of nodes to perform various kinds of manipulations. The specification currently defines the following interfaces:
- AudioGainNode Allows you to change the volume of the audio.
- AudioPannerNode Positions and spatializes audio in a 3D space.
- BiquadFilterNode Adds lowpass, highpass, and other types of common filters to the audio.
- ChorusNode Adds a chorus effect to the audio.
- ConvolverNode Adds effects to the audio, such as imitating the sound of a concert hall.
- DelayNode Applies dynamically adjustable delays to an AudioNode.
- DynamicsProcessorNode Adds dynamics shaping (compressing/expanding) effects.
- WaveShaperNode Adds non-linear waveshaping effects, like distortion.
These nodes form the foundation of many of the features currently available in audio systems, but the specification is still far from finished and more types of nodes may be added. For analysis you could use a RealtimeAnalyserNode, which allows you to analyse the audio passing through it in real time. This could be used, for example, to display the tones output by a stream.
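Continuing the sketch above, inserting one of these nodes into the graph could look roughly like this; again, the method and property names follow the draft and may change:

```js
// Insert a filter between the source and the destination, and tap a real-time
// analyser so the frequency content can be read back from script.
var filter = context.createBiquadFilter();  // defaults to a lowpass filter
filter.frequency.value = 1000;              // attenuate everything above roughly 1 kHz

var analyser = context.createAnalyser();

source.connect(filter);
filter.connect(analyser);
analyser.connect(context.destination);

// Later, for example inside a requestAnimationFrame loop:
var bins = new Uint8Array(analyser.frequencyBinCount);
analyser.getByteFrequencyData(bins);  // per-bin magnitudes between 0 and 255
```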
An example: dynamically changing the language of a video
Currently there is no clean way to switch between alternative audio streams for an HTML5 <video> element. The Audio API is ideal for such a purpose. When you keep a number of things in mind, like fragmenting the audio into smaller files to speed up the (initial) loading, it won’t be hard to create a language switcher:
- Create an AudioContext,
- Get the audio sources from the <video> element using a MediaElementAudioSourceNode,
- Decrease the volume of the video using an AudioGainNode,
- Get the new audio stream by requesting the MP3 via XHR and putting it in an AudioBufferSourceNode,
- Combine the two using the Dynamics Compressor (DynamicsProcessorNode),
- Play the audio stream.
This can be demonstrated using the following diagram:
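In code, a rough sketch of these steps could look as follows. It assumes the draft’s method names (createGainNode, createBuffer, createDynamicsCompressor, noteOn) and a hypothetical URL for the translated track; the eventual API may well rename some of these:

```js
var context = new AudioContext();

// 1. The original video audio, with its own language track silenced.
var video = document.querySelector('video');
var videoSource = context.createMediaElementSource(video);
var gain = context.createGainNode();
gain.gain.value = 0;  // silence (or merely lower) the original audio track
videoSource.connect(gain);

// 2. The replacement audio track, fetched via XHR.
var request = new XMLHttpRequest();
request.open('GET', 'audio/translated-track.mp3', true);  // hypothetical URL
request.responseType = 'arraybuffer';
request.onload = function () {
  var translatedSource = context.createBufferSource();
  // Draft-style synchronous decoding of the downloaded ArrayBuffer.
  translatedSource.buffer = context.createBuffer(request.response, false);

  // 3. Mix both signals through a dynamics compressor and play them.
  var compressor = context.createDynamicsCompressor();
  gain.connect(compressor);
  translatedSource.connect(compressor);
  compressor.connect(context.destination);

  translatedSource.noteOn(0);  // starts playback (draft name)
  video.play();
};
request.send();
```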
These same techniques could be used to dynamically control background sounds for clips and create timed effects for games using an arbitrary number of output channels (which could be 2 for stereo, 5.1 for surround or even more!). Of course, more normal use-cases can be thought of as well: a beep when you click on a button, messages when interactive validation in a form fails or a music player featuring cross-over effects.
A number of examples demonstrating the capabilities of the Web Audio API are available as well, but keep in mind that you have to build WebKit yourself to run them. They do show the JavaScript code involved, however!
I’m really interested in the progress of the Audio Incubator Group and can see quite a few benefits in being able to synthesize, process, and analyse audio through JavaScript. I’ve signed up to their mailing list and follow the prototypes in Gecko and WebKit. Are you interested too? Consider following @AudioXG on Twitter or subscribing to the public-xg-audio mailing list at the W3C — lots of cool things are yet to be invented!
Thanks and credits to Chris Rogers and Koen ten Berg for their technical input and feedback!