In the past few months, it’s surprising the number of times I’ve had to play a video in Flash (in either FLV or F4V format) and know its current time (or current frame) accurately – specially because we’ve been using a lot of shape tracking, to project a Flash content-based plane on top of a video being played (with proper perspective distortion).
Examples of this are in out work for SoBe’s Staring Contest (try the “experimental version”), also SoBe’s Drop of Flavor, and in 5Gum’s 5React this one created more than a year ago (select “United States”, click “Find out”, and approve the Facebook requests – don’t worry, it’s not gonna post aything to your wall, the campaign is over; and you may need to reload the page a few times after approval as recent changes in Facebook login have broken the website in some browsers).
It’s developing websites like these that made me realize how hard it is to know well what’s the video’s current position in time (and no, saving the video inside a SWF container and then reading the MovieClip’s currentFrame is not a real solution). One would expect using the NetStream’s time property to be the way to go, but this is extremely inaccurate; I have the feeling this is a property that is only updated after video keyframes are reached. And, for the kind of tracking we need to do, being one frame off in our calculations was already too much.
For 5React, I had to know the video’s current frame, so I could write the content to the fake TV sets that show during the introductory video playback. After much testing and researching, I ended up using the number of NetStream’s decodedFrames (surprisingly absent from the documentation) added to the number of droppedFrames to properly determine the video’s current frame.
This approach has two big caveats, however. First, they are properties that accumulate as the video is played – they don’t take time into consideration. This means that if you stop the video after one second and start playing again from the beginning, the number of decodedFrames will never reset – it’ll just continue to accumulate. So it means it’s only useful on a linear, continuous playback – if the user can’t pause video execution and seek to a different time.
The second caveat is that the properties are not so reliable. In theory, droppedFrames sound amazing, but apparently it’s only increased when Flash decides top drop a frame itself – if the Flash Player hangs for a fraction of a second (say, when you right-click the Flash movie), it has no impact in that number, meaning the sync is lost. The same happens if you switch back from the window playing the SWF – neither decodedFrames or droppedFrames are counted. This behavior is easy to notice with the 5React animation – if you just watch the whole thing, it’ll perform flawlessly. If you’re on a slow computer, or if you right-click the SWF prior to execution, or if you switch to another browser tab or window and then switch back, tracking is already out of sync.
Because of these issues, when implementing the tracking solution for SoBe’s Staring Contest, I knew I couldn’t rely on that technique – I needed to allow the user to scrub the video and see the tracking matching the video time – so I tried something different: encoding cuepoints in the video with the number of the current frame.
The problem with this approach is, however, threefold. For one thing, encoding the cuepoints with the necessary information into the video is a daunting task – one that can be helped by scripts, but still, a hard task.
The second problem is that using real cuepoints force you to use the “old” On2 VP6 video codec – instead of the new, more modern, H.264 codec-based F4V format. The videos for the above website had to be HD-quality videos, and being forced to use FLV for those meant bigger files, less quality, and even worse rendering performance.
The third and final problem is that even thought FLV cuepoints are extremely accurate, they’re still not accurate enough. Apparently, video frames can be decoded and rendered at a separate thread than the cuepoints for that specific frame are triggered, so we’ve ran into situations where the tracked plane is not rendered properly (and, like I said, even a 1 frame offset is already too much unless the video being tracked is very slow). This is a problem that is hard to put a finger on, and it depends largely on Flash Player version, system version, and number of processors available on a machine – but it suffices to say that, while the cuepoint decoding and rendering works pretty well on most machines, it does fail pretty badly on a top-of-the-line Mac machine with the latest version of the Flash Player for no apparent reason. This is noticeable on the Drop of Flavor website I mentioned above, or on the Staring Contest, when playing the “victory” video for the experimental version and scrubbing through the video during the part where the model shows you the smartphone with your photo: scrubbing too much, too quickly, will cause the tracked shape to be rendered somewhere it doesn’t belong.
I’ve been discussing this with Eric Decker – who developed the Drop of Flavor website – and he believes he has come up with a somewhat better solution, product of a crazy idea we’ve discussed in the past. Can you guess what it is?
He’s encoding the current frame’s number as binary, at the bottom of the video, in the shape of rectangles that go on and off. He then reads the color information (via getPixel) and converts that to the proper number. It works well, and it’s very accurate – but it requires encoding the video with this information (created with a Photoshop action that creates a layer for every number) that not only takes space and file size, but forces you to take the useless space into account when rendering the video.
He has more information about that here. That we even need to resort to this kind of hack is something that has been driving me crazy, so I’d just like to put something out there:
We need to know the current frame – or accurate time position – of a video being played in Flash, regardless of the format, without any crazy hack.
I don’t know the particularities of how the video decoding and renderers are programmed in the Flash Player. However, it does seems to be to be a little bit odd that we can’t have a property like “currentFrame” that indicates the actual number of the frame rendered there in the screen, or a “time” that is updated on every frame.
I like to believe I’m not the kind of developer that likes to bitch and moan about features. And I’ve already seen Adobe constantly delivering features I wanted in the Flash Player, like better sound control and, more recently, native JSON parsing. But, seriously… think about this: we just need to know the video’s current frame or time.
Have you ever ran into this issue? Am I the only one who wants this functionality to be that accurate? Or is it something a lot of people have ran into? Make your voice, and opinion, be heard – I’ve created a feature request entry on Adobe’s bug tracker for Flash Player (requires login).