Eye tracking testing

Recently, in one of our work pitches at Firstborn, we were asked whether we could create a web-based application that could use your webcam to track the position of your eyes. We didn’t know for sure whether that would be possible (or to what level of accuracy) so we decided to spend a few days creating a test prototype. This is the result, running on Flash 10.

The red circle represents the perceived position of my eyes, and the green line is just a line that moves with a speed that is based on the circle position (when the circle is on top, the line moves up; when the circle is in the middle, the line stays; when the circle is at the bottom, the line moves down).

It uses a face tracking algorithm for initial face detection, then some other color-separating code to find where the eye is looking in real time. An initial calibration is necessary (the code must know when the eye is looking at the top and at the bottom of the screen; then it can find where you’re looking in between those two).
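The calibration step and the moving line can be sketched as a simple linear interpolation. This is only an illustration, not the prototype’s actual code: the function names, the pixel values, and the clamping are all assumptions.

```javascript
// Hypothetical calibration: the eye's vertical pixel position (inside the
// webcam image) when the user looks at the top and at the bottom of the screen.
function createGazeMapper(topY, bottomY) {
  if (topY === bottomY) throw new Error("Calibration points must differ");
  // Returns 0 when looking at the top, 1 at the bottom, clamped in between
  return function (eyeY) {
    const t = (eyeY - topY) / (bottomY - topY);
    return Math.min(1, Math.max(0, t));
  };
}

// Converts the gaze position into a speed for the line:
// negative moves up, zero stays put, positive moves down.
function lineSpeed(gaze, maxSpeed) {
  return (gaze - 0.5) * 2 * maxSpeed;
}

const mapGaze = createGazeMapper(120, 140); // assumed calibration values
const gaze = mapGaze(125);                  // 0.25, a quarter of the way down
const speed = lineSpeed(gaze, 10);          // -5, so the line moves up
console.log(gaze, speed);
```

Clamping keeps the result usable when the detected eye position drifts slightly outside the calibrated range.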

The face detection used an AS3 port of OpenCV, based on the work of Ohtsuka Masakazu and Mario Klingemann with a bunch of modifications for speed and varying levels of accuracy (nothing too crazy though, just better for our specific use case).

The result was SoBe’s Staring Contest, where you have to keep your eyes in a specific part of the screen to “win” the game (try the “experimental version” for the eye-tracking webcam version; otherwise it’s a mouse-based game). It doesn’t work in all situations – it relies heavily on the amount of light available in the visitor’s environment, and on the visitor’s ability to not rotate their head too much during the experience – but I still consider it a success.

As a developer, it’s very important to take some time out to try not only new techniques but also new platforms. The tools we have available for us change constantly and to keep up you need to make time for some private investigation of new methods and capabilities. This time for research & development with something that may or may not be fruitful is hard to come by, but it’s certainly of utmost importance in this career. It’s also fun, of course.

This recent presentation by Joon Park, our Chief Creative Officer, touches on this, covering not only the above example but also some other great “personal” projects by some of our developers that started as an exercise but turned out to be so much more.

P.S. As always, we’re hiring.

Download the Flash Player ActionScript 3 reference files as a single zip

For some reason, I always have trouble finding and downloading the Flash ActionScript 3 language documentation. The LiveDocs are easy to find, but the zipped documentation – which is super useful to have locally for faster access and for quick help at the press of F1 if you install it on a tool like FDT – is somehow difficult to find if you just use Google to look for it (too much noise). Plus, the official reference page contains a Zip package that is out-of-date (from 2008!). Grrrr (Edited on August 11th, 2011: apparently this page has been removed and now redirects to the Flash Help online index).

Anyway, the direct link to the archived AS3 reference files seems to be this. Thanks to @sfdesigner for coming up with this mysterious link. The source page for this link seems to be the ActionScript reference archive.

Hopefully this will be helpful to some other people out there (and I’m naively hoping posting this here will help Google and other search engines take people to the right link).

Bonus: the Flex 4.5 API beta reference (for Flash 11) can be found here, although there’s no download option yet.

Update for FDT users (August 11th, 2011): while the package linked above contains the correct documentation, there’s a big problem with the new “standalone zip” file provided: much of the content is built dynamically, by JavaScript present on the page (since you can select which packages and frameworks you want to see on the lists, online). Because of this, the documentation sort of works locally, although it tries to contact remote servers (it doesn’t work on MSIE at all; navigation is impossible since you get endless warnings and redirects to the front page). The real issue with all this is that because FDT parses the HTML to find references to the classes and their members, it can’t properly index the dynamic documentation anymore; when using the above zip as a reference, much of the actual information is gone.

The real solution to this – and to the fact that the online LiveDocs version has too many packages and frameworks that may be useless to some developers – would be to allow a download of the LiveDocs without any kind of dynamic content creation, and with frameworks properly pre-selected (for example, I want the latest version of the AIR and Flash Player API to be listed, but no Flash Lite, and no other framework whatsoever). But this would probably take a lot of development time and I’m not sure whether either Adobe or PowerFlasher (makers of FDT) have that as a high priority. Maybe a Chrome/Firefox/Browser extension could do the trick (something that allows you to download a site with pages pre-rendered after JavaScript execution), but I couldn’t find any such extension that acted on subpages.

Anyway, this is all to say that if you want the documentation to work properly in FDT, you have no option other than getting a copy of the Flash documentation and using it instead. This page contains such a copy that is moderately up-to-date (August 2010), and it’s what I’m using on FDT now.

Flash Video frame time woes

In the past few months, it’s surprising the number of times I’ve had to play a video in Flash (in either FLV or F4V format) and know its current time (or current frame) accurately – especially because we’ve been using a lot of shape tracking, to project a Flash content-based plane on top of a video being played (with proper perspective distortion).

Examples of this are in our work for SoBe’s Staring Contest (try the “experimental version”), SoBe’s Drop of Flavor, and 5Gum’s 5React, the last one created more than a year ago (select “United States”, click “Find out”, and approve the Facebook requests – don’t worry, it’s not gonna post anything to your wall, the campaign is over; and you may need to reload the page a few times after approval as recent changes in Facebook login have broken the website in some browsers).

It’s developing websites like these that made me realize how hard it is to know a video’s current position in time accurately (and no, saving the video inside a SWF container and then reading the MovieClip’s currentFrame is not a real solution). One would expect using the NetStream’s time property to be the way to go, but this is extremely inaccurate; I have the feeling this is a property that is only updated after video keyframes are reached. And, for the kind of tracking we need to do, being one frame off in our calculations was already too much.

For 5React, I had to know the video’s current frame, so I could write the content to the fake TV sets that show during the introductory video playback. After much testing and researching, I ended up using the number of NetStream’s decodedFrames (surprisingly absent from the documentation) added to the number of droppedFrames to properly determine the video’s current frame.

This approach has two big caveats, however. First, they are properties that accumulate as the video is played – they don’t take time into consideration. This means that if you stop the video after one second and start playing again from the beginning, the number of decodedFrames will never reset – it’ll just continue to accumulate. So it means it’s only useful on a linear, continuous playback – if the user can’t pause video execution and seek to a different time.
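To make the first caveat concrete, here is a minimal sketch of the calculation, using plain objects in place of the real NetStream counters. The helper name and the sample numbers are made up for illustration.

```javascript
// `counters` stands in for the two accumulating values described above;
// in Flash they would be read from the playing stream.
function estimateCurrentFrame(counters) {
  return counters.decodedFrames + counters.droppedFrames;
}

// Continuous playback: the sum matches the frame on screen.
const firstPass = { decodedFrames: 28, droppedFrames: 2 };
console.log(estimateCurrentFrame(firstPass)); // 30

// The user seeks back to the start and 10 more frames play.
// The counters never reset, so they simply keep growing.
const afterRestart = { decodedFrames: 38, droppedFrames: 2 };
console.log(estimateCurrentFrame(afterRestart)); // 40, although the video is at frame 10
```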

The second caveat is that the properties are not so reliable. In theory, droppedFrames sounds amazing, but apparently it’s only increased when Flash decides to drop a frame itself – if the Flash Player hangs for a fraction of a second (say, when you right-click the Flash movie), it has no impact on that number, meaning the sync is lost. The same happens if you switch away from the window playing the SWF and then back – neither decodedFrames nor droppedFrames is counted. This behavior is easy to notice with the 5React animation – if you just watch the whole thing, it’ll perform flawlessly. If you’re on a slow computer, or if you right-click the SWF prior to execution, or if you switch to another browser tab or window and then switch back, tracking is already out of sync.

Because of these issues, when implementing the tracking solution for SoBe’s Staring Contest, I knew I couldn’t rely on that technique – I needed to allow the user to scrub the video and see the tracking matching the video time – so I tried something different: encoding cuepoints in the video with the number of the current frame.

The problem with this approach is, however, threefold. For one thing, encoding the cuepoints with the necessary information into the video is a daunting task – one that can be helped by scripts, but still, a hard task.

The second problem is that using real cuepoints forces you to use the “old” On2 VP6 video codec – instead of the more modern H.264 codec-based F4V format. The videos for the above website had to be HD-quality videos, and being forced to use FLV for those meant bigger files, less quality, and even worse rendering performance.

The third and final problem is that even though FLV cuepoints are extremely accurate, they’re still not accurate enough. Apparently, video frames can be decoded and rendered on a separate thread from the one where the cuepoints for that specific frame are triggered, so we’ve run into situations where the tracked plane is not rendered properly (and, like I said, even a 1 frame offset is already too much unless the video being tracked is very slow). This is a problem that is hard to put a finger on, and it depends largely on Flash Player version, system version, and number of processors available on a machine – but it suffices to say that, while the cuepoint decoding and rendering works pretty well on most machines, it does fail pretty badly on a top-of-the-line Mac machine with the latest version of the Flash Player for no apparent reason. This is noticeable on the Drop of Flavor website I mentioned above, or on the Staring Contest, when playing the “victory” video for the experimental version and scrubbing through the video during the part where the model shows you the smartphone with your photo: scrubbing too much, too quickly, will cause the tracked shape to be rendered somewhere it doesn’t belong.

I’ve been discussing this with Eric Decker – who developed the Drop of Flavor website – and he believes he has come up with a somewhat better solution, product of a crazy idea we’ve discussed in the past. Can you guess what it is?

He’s encoding the current frame’s number as binary, at the bottom of the video, in the shape of rectangles that go on and off. He then reads the color information (via getPixel) and converts that to the proper number. It works well, and it’s very accurate – but it requires encoding the video with this information (created with a Photoshop action that creates a layer for every number) that not only takes space and file size, but forces you to take the useless space into account when rendering the video.
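The decoding side of that idea can be sketched as follows. This is not Eric’s code: it assumes one sampled pixel per rectangle (the kind of 0xRRGGBB value getPixel returns), ordered from the most significant bit to the least, with a bright rectangle meaning 1. The brightness threshold is also an assumption, there to tolerate compression noise.

```javascript
// Turns a row of sampled colors into the frame number they encode.
// pixels: array of 0xRRGGBB values, most significant bit first (assumed order).
function decodeFrameNumber(pixels, threshold = 128) {
  let frame = 0;
  for (const color of pixels) {
    const r = (color >> 16) & 0xff;
    const g = (color >> 8) & 0xff;
    const b = color & 0xff;
    const brightness = (r + g + b) / 3;
    // Shift the bits read so far and append the new one
    frame = frame * 2 + (brightness >= threshold ? 1 : 0);
  }
  return frame;
}

// Slightly off-white and off-black samples still decode to binary 1010
console.log(decodeFrameNumber([0xffffff, 0x000000, 0xfafafa, 0x0a0a0a])); // 10
```

With 16 rectangles this scheme can address 65,536 frames, which is more than 36 minutes of video at 30fps.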

He has more information about that here. That we even need to resort to this kind of hack is something that has been driving me crazy, so I’d just like to put something out there:

We need to know the current frame – or accurate time position – of a video being played in Flash, regardless of the format, without any crazy hack.

I don’t know the particularities of how the video decoding and renderers are programmed in the Flash Player. However, it does seem to me to be a little bit odd that we can’t have a property like “currentFrame” that indicates the actual number of the frame rendered there on the screen, or a “time” that is updated on every frame.

I like to believe I’m not the kind of developer that likes to bitch and moan about features. And I’ve already seen Adobe constantly delivering features I wanted in the Flash Player, like better sound control and, more recently, native JSON parsing. But, seriously… think about this: we just need to know the video’s current frame or time.

Have you ever run into this issue? Am I the only one who wants this functionality to be that accurate? Or is it something a lot of people have run into? Make your voice, and opinion, be heard – I’ve created a feature request entry on Adobe’s bug tracker for Flash Player (requires login).

Firstborn is hiring an HTML/JS developer

I may have forgotten to mention, but in addition to Flash Developers, Producers and others, we’re also hiring a serious JS/HTML/CSS developer for our New York offices. Do you wanna work with some of the best people in their field, in a place that is challenging and fun? Then Firstborn is the place you’re looking for.

I love it here.

Interactive perspective-distorted sprites

In a previous post, I published a method to distort an image by its corners with proper perspective distortion, using the new drawTriangles() method from Flash 10. That was used for a website I had to develop about a year ago at Firstborn, where we had (mostly) static content projected to 3d screens.

It just happened that, recently, I had to re-apply the technique to another website I was developing. This time around, however, the distorted image had to be interactive so using drawTriangles() wouldn’t work.

Luckily, wonder.fl user wh0 had already derived a method, quite different from my implementation, that instead took the corner coordinates and created a proper Matrix3D instance. This 3D transformation matrix could then be applied to an InteractiveObject, which could then work as if it had been transformed by using the standard transformation properties (rotationX, rotationY, etc).

I had to adapt the feature a bit to fit my needs, so I’ve created a wrapper for that technique that allows you to create a distorted sprite – called a PerspectiveSprite – and then add children to it directly. It works like so:

// Creates a container of assumed width 200 and assumed height 100
container = new PerspectiveSprite(200, 100);
addChild(container);

// Moves the top right corner a bit
container.topRight = new Point(250, 10);

// Adds an object to the container
myBox = new Box(200, 100);
container.addChild(myBox);

The container needs assumed width and assumed height parameters just so it’ll calculate the graphics correctly. In theory, you can use anything in there, but then you will need to assume the instance created is a rectangle of that same width and height when adding content. And, of course, the higher the assumed width and height, the higher the resolution when transforming the plane (since Flash pre-renders the content before projecting it).

You can find PerspectiveSprite here on my account on GitHub.

Here’s an example of that in action, using our Meet Firstborn video in a loop as the distorted plane and some simple elements as examples of seamless user interaction (since it’s just that, a standard DisplayObject container):

Works well and is a better, more dynamic solution than redrawing a bitmap. Props to wh0 again for first finding the equations and coming up with the calculation for AS3’s Matrix3D.

Update (June 13th, 2011): the above, as it turns out, isn’t as precise as I thought. On extreme perspective transformations, the transformed Sprite can be a bit skewed vertically or horizontally. The problem seems to be in the projectionCenter of the perspectiveProjection of the Sprite. In such cases, using the (very manual but perfectly accurate) drawPlane() is a better option, at least until a correct Matrix3D calculation is found.

Update II (July 6th, 2011): apparently distortions will only happen in some very specific cases. I think it’s a combination of being loaded inside another SWF, and stage alignments different than top left (any of these, isolated, won’t be a problem). I can’t even replicate it consistently. So my recommendation is instead to use the above code anywhere necessary, but resort to drawPlane() in case it’s not precise enough.

Getting the SWF’s HTML object/embed id from within the Flash movie itself

Update: if you’re using swfobject to embed the SWF, this post may still be useful, but there are some better, simpler alternatives for finding the SWF element id. Read the comments to learn about them. The original article follows, and is suitable for all situations in which you have a SWF embedded in a page.

In ActionScript, it’s quite common that you need to talk to the JavaScript/HTML side and then have JS/HTML talk back to Flash. That is easily accomplished by ExternalInterface, but one of the caveats is that when talking from the JS/HTML side to the SWF side you must know the element id of the SWF you’re trying to communicate to. You then do,

document.getElementById("myFlashMovie").myCallbackFunction();

I ran into a problem today, however, when I wanted to call a JavaScript function from Flash and have the function call my movie back at a later time – without having the SWF id hard-coded anywhere else (e.g., without embedding it in the AS3 code or using a FlashVars parameter). The reason is, I wanted to have a self-contained function that would open a popup window and to have the HTML page tell me when that window was closed. Therefore, the JS side needed to know the SWF id in advance.

At a first glance, the ExternalInterface offers the objectID property, which is supposed to do exactly that. Unfortunately, however, this property isn’t as reliable as the documentation makes it sound – on a website I was testing, while embedding the SWF with SWFObject, the property’s value wasn’t being set on Google Chrome (and, I assume, all other plugin-based browsers). This is probably due to the way the embedding was done (even though I used the same name as the SWF object’s name and the attribute parameter’s id):

// Embeds SWF
var flashvars = {};

var params = {};
params.allowFullScreen = "true";
params.allowScriptAccess = "always";
params.allowNetworking = "all";
params.base = ".";

var attributes = {};
attributes.id = "mainMovie";

swfobject.embedSWF('index.swf', attributes.id, '100%', '100%', '10.0.0', 'expressinstall.swf', flashvars, params, attributes);

Miller Medeiros warned me of that issue, and suggested an alternative – looping through all object and embed elements on the page, checking whether an arbitrary test variable (created previously) existed. This is the solution implemented on his code here, albeit for a different purpose.

The most flexible solution to find the SWF name from inside Flash, then, is as such (as a self-contained JavaScript function injected from AS3):

public function getSWFObjectName(): String {
	// Returns the SWF's object name for getElementById

	// Based on https://github.com/millermedeiros/Hasher_AS3_helper/blob/master/dev/src/org/osflash/hasher/Hasher.as

	var js:XML;
	js = <script><![CDATA[
		function(__randomFunction) {
			var check = function(objects) {
				for (var i = 0; i < objects.length; i++) {
					if (objects[i][__randomFunction]) return objects[i].id;
				}
				return undefined;
			};

			return check(document.getElementsByTagName("object")) || check(document.getElementsByTagName("embed"));
		}
	]]></script>;

	var __randomFunction:String = "checkFunction_" + Math.floor(Math.random() * 99999); // Something random just so it's safer
	ExternalInterface.addCallback(__randomFunction, getSWFObjectName); // The second parameter can be anything, just passing a function that exists

	return ExternalInterface.call(js, __randomFunction);
}

When called, this function creates a callback to a random function (that is never called), and then calls some JavaScript code that checks all object and embed elements in the page to see if they have any reference to that same random function. When one of them does, the code returns that element’s id – that’s the SWF movie.

This behavior is also similar to this solution suggested by Flavio Caccamo, although at first I overlooked the article thinking it was too specific to ActionScript 2.

The popup-with-callback function, therefore, looks like this:

public function openPopup(__url:String, __width:int = 600, __height:int = 400, __name:String = "_blank", __onClosed:Function = null): void {
	// Open a popup window, with optional callback when closed

	var js:XML;
	js = <script><![CDATA[
		function(__url, __width, __height, __name, __SWFContext, __onClosed) {

			if (__onClosed != "") {
				// If 'onClosed' is supplied, call a function when the popup window is closed

				var checkForWindow = function() {
					if (!newWindow || newWindow.closed) { // A blocked popup (null window) is treated as closed
						clearInterval(windowCheckInterval);
						document.getElementById(__SWFContext)[__onClosed]();
					}
				};

				var windowCheckInterval = setInterval(checkForWindow, 250);
			}

			var wx = (screen.width - __width)/2;
			var wy = (screen.height - __height)/2;

			var newWindow = window.open(__url, __name, "top="+wy+",left="+wx+",width="+__width+",height="+__height);
			if (newWindow && newWindow.focus) newWindow.focus(); // window.open returns null when the popup is blocked

		}
	]]></script>;

	var __onClosedString:String = "";

	if (Boolean(__onClosed)) {
		__onClosedString = "checkFunction_" + Math.floor(Math.random() * 99999); // Something random just so it's safer
		ExternalInterface.addCallback(__onClosedString, __onClosed);
	} 

	ExternalInterface.call(js, __url, __width, __height, __name, getSWFObjectName(), __onClosedString);
}

I haven’t tested on absolutely every version of every browser yet, but it has been enough to get it to work on my two main test environments (Google Chrome, and Internet Explorer).

Benchmarking video playback performance in Flash

In my day-to-day doing development at Firstborn, it’s pretty common that we run into questions that don’t have a very clear answer – for example, what would be the best technique to do something (in terms of performance).

When something like that happens, depending on the time constraints, we may either do some quick test, or just go with what we assume is best – creating a theory and sticking with it. At times, however, a certain issue becomes so important it requires more extensive research – having the scientific method applied to it, so to speak.

I’ve run into one of those recently with a website we’ve just created – Expedition Titanic. In that website, we have a few videos running in quasi-fullscreen (taking the whole website space, that is) and we were having trouble with performance on some computers – dropping frames, not playing smoothly, etc.

In previous projects, we had some assumptions of how full-site videos should be encoded – resolution, bitrate, codec, etc. Most of those were based on actual experience, but it was scattered information without any real data to back it up.

Video playback performance was becoming more of an issue in every website I’ve been working on – this is also true of the previous website I’ve developed here, for the 5 REACT campaign – and I decided to stop relying on myths, personal theories or word-of-mouth and do some real tests – rather, benchmarks – to actually come up with the best solutions for video playback in Flash in terms of performance.

My question was not about the video quality itself, but rather its playback quality – how smooth the video frames were being played, how many frames were being dropped; in sum, what was the impact of different encoding methods on playback performance. We generally assume F4V (H.264) videos have better quality, but which one requires more from the CPU to decode? What’s the impact of actual resolution and bitrate?

I have to confess I had some personal beliefs I wanted to put to the test with these benchmarks. My theories were as such:

  • On2 VP6 (FLV) videos are faster to decode than H.264 (F4V) videos.
  • H.264 decoding is faster on multi-core machines; On2 VP6, not as much.
  • Overall, OS X has even worse decoding performance for H.264 videos.
  • Videos with higher bitrates are slower to decode and this impacts the rendering speed of high-quality videos.
  • Video playback performance in Linux is horrible.

Still, I approached the problem with a neutral stance and would be as happy proving as I would be with disproving any of those theories. The main reason why I wanted to do the test in the first place is because I wasn’t really sure any of them were really true, but I still used them as guidelines when creating content.

Measuring video playback quality

My initial plan to measure the playback quality was using the (Flash 10 only) droppedFrames property of the NetStreamInfo object and the (undocumented) decodedFrames property of the NetStream object. That way, I hoped to be able to measure the number of skipped frames and measure the time spent decoding in a more rendering-agnostic fashion.

After some initial tests, however, I decided against it because the values I got were just not reliable enough: while the behavior of both properties was predictable for FLVs (the sum of decodedFrames and droppedFrames would be the number of total frames in the video), their values while playing an F4V video were quite undecipherable; there seemed to be a correlation between playback quality and the value of both properties, but their actual relationship is a mystery – for example, does droppedFrames contain the number of frames dropped during decoding, or during decoding and playback? Does decodedFrames contain the number of frames decoded, or decoded and rendered? The numbers state different things at different times, given that either property can have a higher value than the other and that they seem to behave differently depending on the choice of video format.

Due to this, I decided to have a 30fps video running inside a 60fps SWF, and use the final SWF speed (measured by onEnterFrame events) as an indication of how much time was being spent on video decoding and rendering. Nevertheless, the numbers of frames decoded and dropped are still included in the final results.
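The measurement itself boils down to counting frame events over elapsed time. A minimal sketch, assuming one timestamp (in milliseconds) is recorded on every frame event; the function name and the sample data are illustrative, not taken from the benchmark:

```javascript
// Average framerate over a list of frame-event timestamps (milliseconds).
function averageFrameRate(timestampsMs) {
  if (timestampsMs.length < 2) return 0;
  const elapsed = timestampsMs[timestampsMs.length - 1] - timestampsMs[0];
  if (elapsed <= 0) return 0;
  // N timestamps delimit N - 1 frame intervals
  return ((timestampsMs.length - 1) * 1000) / elapsed;
}

// Five frame events 25 ms apart: 4 intervals over 100 ms
console.log(averageFrameRate([0, 25, 50, 75, 100])); // 40
```

On a SWF set to 60fps, a result well below 60 would suggest that video decoding and rendering are eating into the frame budget.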

Test rationale

The idea of the test was finding out the best combination of encoding variables for proper video playback, so I decided to use these parameters:

  • Format: FLV (On2 VP6) and F4V (H.264)
  • Bitrate: 500Kbps, 1000Kbps, 1500Kbps, 2000Kbps
  • Resolution: 1920×1080, 960×540, 480×270

This gives a total of 24 different encoded files for the same video. They would all be rendered at the same size (taking the whole browser space); the idea is that it’d allow me to have a better understanding of how resolution, bitrate and format changes impacted the playback quality, with the same area being drawn.
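The 24 combinations come straight from multiplying the three parameter lists (2 formats × 4 bitrates × 3 resolutions). A quick sketch that enumerates them, with field names chosen only for illustration:

```javascript
const formats = ["FLV (On2 VP6)", "F4V (H.264)"];
const bitratesKbps = [500, 1000, 1500, 2000];
const resolutions = [[1920, 1080], [960, 540], [480, 270]];

// Build one entry per format/bitrate/resolution combination
const testFiles = [];
for (const format of formats) {
  for (const bitrateKbps of bitratesKbps) {
    for (const [width, height] of resolutions) {
      testFiles.push({ format, bitrateKbps, width, height });
    }
  }
}

console.log(testFiles.length); // 24
```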

The file used was a rendering of the intro to the Expedition Titanic’s “Explore” area, originally rendered at 1920×1080 and set at 30FPS.

The testing project was a small SWF file that would load a file, play it, unload it, and skip to the next file. The SWF was set at 60fps, using maximum quality; nothing else was present on the staging area (the idea was to test video decode speed without compositing tradeoffs), and the video was scaled to fit inside the full browser area (similar to the Stage scaleMode of StageScaleMode.NO_BORDER). The executing SWF was kept focused during the whole test.
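For reference, a NO_BORDER-style fit keeps the aspect ratio and picks the larger of the two scale factors, so the video covers the whole area and any excess is cropped. The sketch below assumes that behavior; the function name and the sample sizes are not from the benchmark.

```javascript
// Scale factor that makes a video cover the whole area while keeping
// its aspect ratio (the overflowing dimension gets cropped).
function noBorderScale(videoWidth, videoHeight, areaWidth, areaHeight) {
  return Math.max(areaWidth / videoWidth, areaHeight / videoHeight);
}

// Same aspect ratio: both factors agree
console.log(noBorderScale(960, 540, 1920, 1080)); // 2

// Taller area: the height factor wins and the sides are cropped
console.log(noBorderScale(480, 270, 1920, 1200));
```

This is why every encoded file ends up drawn over the same number of screen pixels, whatever its native resolution.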

I used On2 VP6 as the sole codec for the FLV videos. Sorenson Spark, while considered to have better decoding performance, is usually not a parameter on the video encoding decision equation anymore due to its sub-standard encoding quality.

For the FLV encoding, I used Adobe Flash Video Encoder (CS3) using a normal encoding (1-pass). For the F4V encoding, I used Adobe Media Encoder (CS4) with a “High” profile, Level 4.1, and encoded with 2-pass VBR where both target and maximum bitrates were the bitrate being tested. And since the idea was testing raw playback quality, as opposed to accurate seeking ability or any other parameter, keyframe placement distance was kept as automatic on both.

The videos were encoded without an audio track.

Results

Here are the test results, plotted in line charts. The question these numbers try to answer is: when playing a 30FPS video on a 60FPS SWF, what’s the actual rendering framerate?

The test was run on a number of different computers, from low-end to high-end, Windows PCs and Macs, multi-core and single-core machines. The idea was not measuring performance of each setup against each other – they had different specs, and were run at different resolutions – but rather to see the differences between different videos being played on the same machine.

“Slow Windows” was an old Dell laptop running Windows XP SP2 at 2GHz (Intel single core) with 1GB of memory. Tests used Firefox 3.6.8 and Flash Player 10.1.82.76.

“Medium Windows” was a desktop running Windows XP SP3 at 3GHz (Intel single core) with 4GB of memory. Tests used Firefox 3.6.8 and Flash Player 10.1.82.0.

“Fast Windows” was a desktop machine running Windows 7 64-bit at 3.33GHz (Intel Core 2 Duo, 2 cores) with 4GB of memory. Tests used Chrome 6.0.472.51 (beta) and Flash Player 10.1.53.0.

“Fast Mac” was a 27″ iMac running OS X 10.6.4 at 3GHz (Intel Core 2 Duo, 2 cores) with 4GB of memory. Tests used Safari 5.0.1 and Flash Player 10.1.53.64.

“Slow Linux” was a laptop running Ubuntu 10.04 64-bit at 2GHz (AMD 3500+ single core) with 461MB of memory. Tests used Firefox 3.6.8 and Flash Player 10.1.82.76.

Result analysis

Some points can be drawn from the results:

F4Vs have better playback quality

Simply put, H.264 decoding is faster than On2 VP6 decoding, so F4V videos easily give the best results in terms of overall playback performance of a video on a Flash website. This holds true for both single-core and multi-core machines, and for every platform tested (Windows, Macintosh, Linux). On2 VP6 actually uses a new thread on multi-core systems for better performance, but apparently this is not enough to be faster than the (usually) hardware-accelerated H.264 decoding.

Adding to the best performance the (visibly) better quality of H.264 encoding, F4V becomes an obvious choice for video encoding in Flash. My recommendation is that F4V videos should always be used instead of FLV videos, except when a feature supported by FLVs is needed (transparency channel; legacy Flash versions support; cuepoints; odd video dimensions).

H.264 doesn’t have any special impact on OS X machines

F4V video performs just as well on a Macintosh, not being particularly taxed in comparison with any other system. Especially with the newly added hardware acceleration, there doesn’t seem to be any valid concern with how the decoder performs on this platform.

Actual bitrate doesn’t have that much of an impact on performance

While technically not true – a higher bitrate/quality does mean a lot more data to process – the actual impact of video bitrate in overall performance was near negligible in my tests. In the case where one needs better performance for a video, it makes more sense to lower the resolution than to lower the bitrate.

Linux performance is not that bad

Simply put, my tests on a very low-end machine produced results that were easily watchable. They would drop considerably below the target framerate of 30fps, but considering the target machine, and a comparison to my low-end Windows machine, it does feel pretty good overall. Not fast, but not too terrible either.

Conclusion, or TL;DR

H.264 (F4V) videos are better performance-wise. Use them when possible.

Additional notes

You’ll find the full results (with data for every video tested) here. The actual benchmark is here if you want to test it yourself (not recommended at all, as it takes at least 31 minutes to complete, and downloads 240MB of videos – and you can only see the results once it’s done).

Scaleform GFx now bringing Flash to the Unreal engine

Scaleform has announced that Scaleform GFx, their Flash-based solution for graphical user interfaces for games, is set to be included free of charge with the Unreal engine.

They’ve made a nice video of the thing in action, going all the way from the Flash IDE to a new menu and interface for Unreal Tournament III. The video brought a smile to my face — try to find out why:

The funny thing? I don’t remember whether I’ve said this publicly before, but Tweener was initially planned as an UnrealScript extension – I needed some tweening extension for a project I was working on at the time (the UI for the Defence Alliance 2 mod for Unreal Tournament 2004), and a MC Tween-like syntax wouldn’t work. In the end I dropped the idea since I would not have enough time to complete it for the mod, and instead took it over for AS2/AS3 a little bit later. It’s super nice seeing it going all the way around and finally being used in the game by way of Scaleform GFx – I don’t think game UI systems have anything nearly as practical as the tweening concepts ActionScript has had for a few years.

Of course, any other tweening extension will probably work within Scaleform GFx, so it’s not like Tweener is bundled with it in any way; it was merely what they were using in their example.

The best drawPlane/distortImage method, ever

Pretty bold title, uh?

Before I even begin, let me say two things. First, this is a pretty obvious use for the new Flash 10 drawing API, so I’m quite sure somebody has created something similar already, especially because it’s something I can see a lot of people in need of. Second, please read the entire post before commenting, because it may not be exactly what you think it is.

So getting straight to the point, I’ve created a drawPlane function that uses the new drawTriangles() method in Flash 10’s ActionScript 3 native APIs to draw a correctly distorted plane in 3d, based on four two-dimensional points. Other people would call it distortImage or something similar. Click below to check an example:

By now, you’re probably thinking, “hey, I’ve seen this before”. There’s this, this, this, this, and probably many others. So why another one?

Well, there’s one special (and blunt) reason why: because they’re all wrong.

Here’s a shitty image I created a few years ago to illustrate my point:

Perspective is not distortion

Subdividing a plane into a bunch of triangles using their bi-dimensional distances doesn’t give you a projected plane. It gives you a good approximation, and I guess the calculation is easier, but try distorting a plane like that by any amount and you’ll see what kind of deformity it generates.

Secondary to that, I wanted to use the new Flash 10 capabilities to project planes without a lot of hassle (meaning, without having to rely on other libraries or actual 3d position) and, most importantly, by using two triangles only for the entire plane – with no subdivisions of any kind.

Now, don’t get me wrong. There were reasons for (correct) triangulation before Flash 10: Bitmap fills had to rely on a non-distorted fill, so adding more triangles created more accurate results. However, Flash 10 allows AS3 developers to set the distortion factor of a Bitmap fill by using the new drawTriangles() method and its UVT parameters, achieving just that – a perfectly projected plane using two triangles only.

But here lies another problem. While finding the projection is pretty trivial if you’re working with three dimensions (the T portion of the UVT combination is just a function of the Z of each point), things get a bit more confusing when you’re dealing with points set in two dimensions – points defined by their X and Y only. And that’s exactly what I was going to need, since this is all supposed to be used in a project I’m creating at Firstborn where we’ll be combining video renders with Flash-drawn content over the video at specific positions, and I don’t have any kind of 3d information about the points, just anchor points matched to the video. To put it another way, I wanted to have the same ability to move corners as you have in Photoshop’s Free Transform tool.
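
For reference, here’s roughly what the “trivial” three-dimensional case looks like. This is a minimal sketch and not part of drawPlane(): projectWithT and focalLength are hypothetical names, and it assumes a simple perspective projection with the viewer looking down the Z axis. The T formula is the one Adobe’s documentation gives for the UVT data of drawTriangles().

```actionscript
package {
	import flash.geom.Point;
	import flash.geom.Vector3D;

	/**
	 * Hypothetical helper (illustration only): projects a 3d point into 2d
	 * and returns the T value to use for that vertex in the UVT data.
	 * Assumes focalLength + point.z is greater than zero.
	 */
	public function projectWithT(point:Vector3D, focalLength:Number, result:Point):Number {
		// Perspective scale factor; the same number doubles as the vertex's T
		var t:Number = focalLength / (focalLength + point.z);

		// Points farther away (bigger z) get pulled closer to the origin
		result.x = point.x * t;
		result.y = point.y * t;

		return t;
	}
}
```

With four corners that have a known Z, each returned T goes straight into the third slot of that corner’s UVT triplet. The problem described above is that, with 2d anchor points only, there’s no Z to feed into this.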

I honestly could not find an answer to the problem online. I’m pretty sure there’s some complex way that makes a lot of sense to project 2d points into a 3d field, and therefore find the T of each point, but instead, I was lucky enough to get what I wanted working after a good amount of crazy trial and error (and beer). It actually made the solution very small and much simpler than I expected.

So without further ado, here’s the actual implementation. It’s a pretty simple method that draws a BitmapData to a Graphics instance and sets the perspective accordingly. The semi-secret is the weird diagonals-based ratio that’s calculated for each corner, giving the final plane render uncanny precision.

package com.zehfernando.display {
	import flash.display.BitmapData;
	import flash.display.Graphics;
	import flash.geom.Point;
	/**
	 * @author zeh
	 */
	public function drawPlane(graphics:Graphics, bitmap:BitmapData, p1:Point, p2:Point, p3:Point, p4:Point) : void {
		var pc:Point = getIntersection(p1, p4, p2, p3); // Central point

		// If no intersection between two diagonals, doesn't draw anything
		if (!Boolean(pc)) return;

		// Lengths of first diagonal
		var ll1:Number = Point.distance(p1, pc);
		var ll2:Number = Point.distance(pc, p4);

		// Lengths of second diagonal
		var lr1:Number = Point.distance(p2, pc);
		var lr2:Number = Point.distance(pc, p3);

		// Ratio between diagonals
		var f:Number = (ll1 + ll2) / (lr1 + lr2);

		// Draws the two triangles that make up the plane
		graphics.clear();
		graphics.beginBitmapFill(bitmap, null, false, true);

		graphics.drawTriangles(
			Vector.<Number>([p1.x, p1.y, p2.x, p2.y, p3.x, p3.y, p4.x, p4.y]),
			Vector.<int>([0,1,2, 1,3,2]),
			Vector.<Number>([0,0,(1/ll2)*f, 1,0,(1/lr2), 0,1,(1/lr1), 1,1,(1/ll1)*f]) // Magic
		);
		graphics.endFill();
	}
}

import flash.geom.Point;
function getIntersection(p1:Point, p2:Point, p3:Point, p4:Point): Point {
	// Returns a point containing the intersection between two lines
	// http://keith-hair.net/blog/2008/08/04/find-intersection-point-of-two-lines-in-as3/
	// http://www.gamedev.pastebin.com/f49a054c1

	var a1:Number = p2.y - p1.y;
	var b1:Number = p1.x - p2.x;
	var a2:Number = p4.y - p3.y;
	var b2:Number = p3.x - p4.x;

	var denom:Number = a1 * b2 - a2 * b1;
	if (denom == 0) return null;

	var c1:Number = p2.x * p1.y - p1.x * p2.y;
	var c2:Number = p4.x * p3.y - p3.x * p4.y;

	var p:Point = new Point((b1 * c2 - b2 * c1)/denom, (a2 * c1 - a1 * c2)/denom);

	if (Point.distance(p, p2) > Point.distance(p1, p2)) return null;
	if (Point.distance(p, p1) > Point.distance(p1, p2)) return null;
	if (Point.distance(p, p4) > Point.distance(p3, p4)) return null;
	if (Point.distance(p, p3) > Point.distance(p3, p4)) return null;

	return p;
}

Inside a DisplayObject, use it like so to redraw a BitmapData instance using 4 given 2d points:

drawPlane(this.graphics, myBitmapData, topLeft, topRight, bottomLeft, bottomRight);
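
A slightly fuller version of that call, as a sketch: the class below is a hypothetical document class (PlaneExample, the corner coordinates and the checker texture are all made up for illustration) that lets one corner follow the mouse and redraws the plane on every frame.

```actionscript
package {
	import com.zehfernando.display.drawPlane;

	import flash.display.BitmapData;
	import flash.display.Sprite;
	import flash.events.Event;
	import flash.geom.Point;
	import flash.geom.Rectangle;

	// Hypothetical document class; names and coordinates are illustrative only
	public class PlaneExample extends Sprite {
		private var texture:BitmapData;
		private var topLeft:Point = new Point(50, 40);
		private var topRight:Point = new Point(350, 80);
		private var bottomLeft:Point = new Point(30, 300);
		private var bottomRight:Point = new Point(380, 260);

		public function PlaneExample() {
			// Placeholder texture: a 2x2 checker, so the distortion is visible
			texture = new BitmapData(256, 256, false, 0xff9900);
			texture.fillRect(new Rectangle(0, 0, 128, 128), 0x333333);
			texture.fillRect(new Rectangle(128, 128, 128, 128), 0x333333);

			addEventListener(Event.ENTER_FRAME, onEnterFrame);
		}

		private function onEnterFrame(e:Event):void {
			// One corner follows the mouse; the other three stay put
			bottomRight.x = mouseX;
			bottomRight.y = mouseY;

			// drawPlane() clears the Graphics instance itself before drawing
			drawPlane(graphics, texture, topLeft, topRight, bottomLeft, bottomRight);
		}
	}
}
```

If the corner is dragged so far that the diagonals no longer cross, drawPlane() returns early and the previous drawing simply stays on screen.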

You can download the source for the example editor above from here: TriangleTest.zip. Warning: it’s a Flash 10, AS3 classes-only example. There’s no FLA file and no project settings file. The structure should be fairly simple to understand, however; the class roots are at /src/src (actual source) and /src/libs (a few additional files used). Just compile /src/src/documents/TriangleTest.as as your document class and you should be good to go.

Anyway, I like it. It can probably be optimized a bit (especially the getIntersection function, or the many Point.distance() calls) but it works well so far – don’t forget it’s only dealing with 4 points per plane, so there’s no excruciating loop that begs for optimization.
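
If the optimization ever becomes necessary, one possible direction is sketched below. It’s a hypothetical alternative (getSegmentIntersection is a made-up name, not something from the download) that swaps in the usual parametric segment test, so the four Point.distance() range checks become two simple comparisons. It should return null in the same cases as getIntersection: parallel lines, or a crossing that falls outside either segment.

```actionscript
package {
	import flash.geom.Point;

	/**
	 * Hypothetical alternative to getIntersection(): returns the point where
	 * segment p1-p2 crosses segment p3-p4, or null if they don't cross.
	 */
	public function getSegmentIntersection(p1:Point, p2:Point, p3:Point, p4:Point):Point {
		// Direction vectors of both segments
		var d1x:Number = p2.x - p1.x;
		var d1y:Number = p2.y - p1.y;
		var d2x:Number = p4.x - p3.x;
		var d2y:Number = p4.y - p3.y;

		// Cross product of the directions; zero means the lines are parallel
		var denom:Number = d1x * d2y - d1y * d2x;
		if (denom == 0) return null;

		// Position of the crossing along each segment (0 = start, 1 = end)
		var rx:Number = p3.x - p1.x;
		var ry:Number = p3.y - p1.y;
		var t:Number = (rx * d2y - ry * d2x) / denom;
		var u:Number = (rx * d1y - ry * d1x) / denom;

		// Outside the 0-1 range, the lines cross beyond the segment ends
		if (t < 0 || t > 1 || u < 0 || u > 1) return null;

		return new Point(p1.x + d1x * t, p1.y + d1y * t);
	}
}
```

As a side effect, t and u are the fractions of each diagonal that lie before the central point, so the four lengths used by drawPlane() could be derived from two diagonal lengths instead of four separate distance calls.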

If you still think this was done before – Flash 10 drawTriangles() based image drawing, with two triangles, and correct perspective distortion – please post a link in the comments and I’ll correct the article. I still think the title and the premise of the post are bold, but so far, I’m sticking to it.

Finally, do notice that this is not a 3d engine. It’s just a quick way to draw a plane when you only know the 2d points of the corners. There are no 3d points, vertices, or angles of any kind involved here. If you want to rotate a plane you already have, it makes a lot more sense to just rotate it using the native rotation properties, or use any 3d engine of your preference.

PS. No, the cat is not mine. It’s Don Citarella’s Floyd.

Update: makc used the same approach to do the inverse: get a projected plane out of a 3d graphic and transform it into a de-projected rectangle, something that seems to have the amazing name of inverse homography. Read about it here, and see an example of the technique here. Apparently this is a faster alternative to the same technique first employed by Japanese rock-star Flash developer Saqoosha.

Something I just remembered

Roughly 17 years ago, I was in love with QuickBasic. After a while using GWBasic (not to mention years of using BASIC on other systems, such as the Brazilian Sinclair ZX81 clone TK-85, the MSX 1.0, and the Apple II), QuickBasic was a real milestone in my life. It was when I changed from using line numbers and a pretty mangled structure in my programs (using subroutines with GOSUBs and GOTOs) to a more organized one with functions and procedures. I also loved how easy it was to debug inside the QuickBasic 4.5 IDE – you could stop your program at any time, make changes to your code, and then continue running. Programs were interpreted, and while speed wasn’t the best compared to other platforms available for similar DOS machines, immediate execution, easy debugging and the ability to compile executables later made it the platform of choice for all the small programs I liked to create – database merging systems, small graphical or text-based games, and general tools I used in my daily work.

QuickBasic 4.5 (click for source)

But even though I used QuickBasic for most of my stuff, I also used other languages like C and Pascal at high school. For the latter, my IDE of choice was Turbo Pascal. And it’s within that development environment that I had my first run-in with Turbo Vision.

Turbo Vision was, in a nutshell, a framework to create user interface elements for Turbo Pascal programs. It allowed you to easily create menus, buttons, dialog windows and the like – something that was still pretty rare at the time (considering DOS programs were never really about the user interface). Being able to quickly construct a program with such a high-level interface without doing everything from scratch was quite a shock to me.

Screenshot of "Quadra", showing what a Turbo Vision interface looked like (Click for source)

An even bigger shock was understanding that you could actually create common libraries or frameworks and reuse them on different projects. As obvious as it is today, my self-taught understanding of program structures at the time hadn’t grasped this concept yet. Turbo Vision itself was beautiful, but the idea of Turbo Vision was, for my little coder mind, revolutionary.

So after having my first experience with Turbo Vision, I set out to create a similar framework in QuickBasic that I could use for my own programs (while I enjoyed the speed and correctness of languages like C and Pascal, I enjoyed the agility I had with QuickBasic even more).

The result was a bunch of subroutines that could be called to draw and control everything. You could create screen elements like buttons, editable text fields, windows, check boxes, radio buttons, menus and the like, with full tabbing/focusing/clipboard support, and control them using a higher-level API. That allowed me to add great interface elements to my simple programs quite easily, without having to worry about rewriting it every time I needed to create something new. The implementation was somewhat ugly, since QuickBasic would run in a single thread and I had to move execution from one element to another and deal with the return result to know what to do, but it was clear enough in a way that allowed me to reuse elements with certain ease. I still have that code somewhere around my computer.

Fast forward to 2003, around a decade later – after I had abandoned QuickBasic and used a few other platforms and languages, before finally settling on development for the web. While I was working with ActionScript and developing some visual components for a website, I had an epiphany: I was doing the same thing I had done 10 years before.

The operating system was different: Windows, instead of DOS; the language was not QuickBasic anymore, but ActionScript; I wasn’t building a text-based application, but rather one that used graphical user interface elements and the mouse; I wasn’t reading data from the disk, but from some asynchronous network resource. But still, I was doing the same thing: creating graphical elements that were meant to be reused.

A decade before, I had no idea I was going to be working with Flash (since that didn’t exist), or even be building those kinds of interfaces. Of course I had no idea I would be building something meant to be used over the Internet either. And I had even taken some pretty odd routes early in my career, focusing on graphic design and animation instead of programming, but still, there I was, full circle, back doing what could be said to be basically the same thing I had been doing at high school, only in an updated environment.

And while I’m not really that old, I like to see this as a great example that no technology lasts forever, and that programmers can’t have any idea of what they’ll be working with 10 years later. And, perhaps more strongly now, that while the environment they’re using might be completely different, the concepts they’ve learned before will carry on. In a nutshell, your experience is not the language you use, but what you do with that language.

I guess it doesn’t take much to see how that applies to the issue I’m seeing people debating today – “Rich Internet” platforms like Flash, Silverlight, HTML 5, and a few others. I’ll be the douchebag who quotes himself and just paste what I had to say a short while ago in my FWA interview:

Q. Do you think Flash is here to stay?

A. In a way, most definitely. Flash was the first software platform to allow developers to easily create rich interactive animations for the everyday user, so whatever Flash has brought and is still bringing is something other platforms are playing catch up to adopt.

But on the other hand, you can never predict what the future will be like. When I started working with interface development, Flash didn’t exist and I couldn’t even predict I’d be working with something like it 15 years later.

I love Flash, but after having worked with it for more than a decade and seeing it radically transformed so many times, it’s as if the name doesn’t matter that much anymore. I think what we call RIA development is the thing that’s here to stay and that’ll just become bigger with time, but the name under the hood becomes almost irrelevant. Even if it’s Flash and ActionScript, it’s going to be a different Flash and a different language in a few years. With similarities, but then again there are similarities between ActionScript, Java, and C#. Being attached to that kind of brand isn’t very healthy I guess.

Here’s something a lot of people forget – we’re not in this business because we’re using language XYZ. While a lot of people like to draw that kind of conclusion – because, after all, their experience is usually focused on a single language, not to mention the need to identify with a certain group – the market itself changes over time and it’s natural for one language to replace another as the language of choice for a specific task. Remember when Perl was the de facto server-side language for the web? If you considered yourself a “Perl specialist”, instead of a server-side programmer, you’d be stuck when people started to migrate their systems to PHP and others.

I see the current state of rich interfaces development the same way. We’re doing Flash today – because it’s widespread, the tools available (official or not) are stable enough, it has an awesome community, and it’s generally easy to develop for. What are we going to use tomorrow? I don’t know, but the knowledge I’ve gathered over the years – developing the interface elements I’ve been using, learning about asynchronous data loading and user interface issues – will carry over, regardless of the new platform I’m using.

Now, there’s a lot that could be said against HTML5. Performance issues, compatibility, much slower user penetration, lack of features, how it’s only adding the same features and problems people criticized in Flash before. All of it has been said and, I’m sure, will be said again as soon as HTML5 starts to become a real contender and its flaws become more obvious in a real world scenario.

But you know what? It doesn’t matter. Because HTML5 will either work out or it won’t. And, as a developer, it’s not my job to root for one specific side. It’s my job to absorb whatever there’s to be absorbed and move on. Be it HTML and JavaScript, be it ActionScript, be it Silverlight, I’m pretty sure the higher-level knowledge I’ve gathered through the years that is outside the syntax realm will carry on.

I can’t be sure of what I’ll be working with in 10 years. And neither can you.