A few years back, I was working on StoryTouch, a desktop application aimed at users in an industry niche. As part of the release strategy, our product manager, Paulo Morelli, decided it would be great to offer video tutorials explaining the basic usage of the software.

Once those tutorials were recorded, and as we kept making fast-paced changes to the software, we started bumping into discussions about whether we should re-record the videos, avoid major UI changes to the application, or just hope users could pick up on the changes made since the tutorials were recorded.

We decided, back then, that the best thing we could do was to automate as much of the tutorials as we could into end-to-end automated tests named after each tutorial. As a result, each time we changed the application's layout or interactions, those tests would fail and prompt us to have a conversation with Paulo about whether we should keep the change, update the tutorial, or let it be, knowing that there was a variation.

I was recently explaining what we were doing to a colleague who was very excited about the idea and suggested I blog about it, so here we are. The way those tests were written was also very close to what the narrator of the tutorial was explaining at each step. In a way, if we could run those tests while reading out the high-level description of each step and recording what it caused in the UI, we could potentially have generated both the video and the audio for those tutorials upon each build of the software. This would ensure that your tutorials always work and that users can easily find out how to accomplish a given task in their version of the software.
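To make that concrete, here is a minimal sketch of the shape those tests had: each step carries both the narration the tutorial voice-over would speak and the UI action the test performs. All names here are hypothetical and written in Python for illustration; this is not StoryTouch's actual test code:

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Step:
    narration: str               # what the tutorial narrator says
    action: Callable[[], None]   # what the test does in the UI


def run_tutorial(name: str, steps: list[Step]) -> None:
    print(f"Tutorial: {name}")
    for step in steps:
        print(f"  Narrator: {step.narration}")
        step.action()  # driving the UI; a failure flags a tutorial/UI mismatch


# Stubbed actions stand in for real UI interactions:
steps = [
    Step("Click 'New Story' on the welcome screen.",
         lambda: print("    [test clicks the New Story button]")),
    Step("Type a title for your story and press Enter.",
         lambda: print("    [test types the story title]")),
]
run_tutorial("Creating your first story", steps)
```

Running the steps in order is the test; reading the narrations in order is the tutorial script.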

We never quite got there, and those tests were a little brittle, as most high-level end-to-end tests tend to be. Nevertheless, I sense that the idea could be a good guideline for deciding what needs to be in your high-level end-to-end automated test suite and what probably belongs somewhere else. Critical features of your application that customers need to interact with are likely worth a tutorial, and having an automated way to warn your development team that user-facing behavior in a very important part of your application will change in a given release can avoid many nasty surprises.

Now, in the era of mostly web applications, something like a decorated Selenium driver that records interactions and highlights clicks, buttons and text field inputs might be capable of generating the video for tutorials. Text-to-speech software, combined with a good abstraction level in your tests' code, may be enough to generate the associated audio.
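Such a decorator is not hard to imagine. Here is a rough sketch using Selenium's Python bindings: the element highlighting uses real Selenium calls, but the wrapper itself is an assumption of mine, not an existing library, and the element locators in the usage comment are made up:

```python
import time

from selenium import webdriver
from selenium.webdriver.common.by import By


class HighlightingDriver:
    """Wraps a Selenium WebDriver so every interaction is visually flagged,
    making a screen recording of the test run tutorial-ready."""

    def __init__(self, driver, pause=0.8):
        self.driver = driver
        self.pause = pause  # linger so a viewer can follow along

    def _highlight(self, element):
        # Inject a temporary outline via JavaScript, wait, then remove it.
        self.driver.execute_script(
            "arguments[0].style.outline = '3px solid red'", element)
        time.sleep(self.pause)
        self.driver.execute_script(
            "arguments[0].style.outline = ''", element)

    def click(self, locator):
        element = self.driver.find_element(*locator)
        self._highlight(element)
        element.click()

    def type(self, locator, text):
        element = self.driver.find_element(*locator)
        self._highlight(element)
        element.send_keys(text)


# Usage (hypothetical locators):
# driver = HighlightingDriver(webdriver.Chrome())
# driver.click((By.ID, "new-story"))
# driver.type((By.NAME, "title"), "My first story")
```

The actual video capture (ffmpeg, a CI recording plugin, etc.) is left out; the point is that the driver can be decorated without touching the tests themselves.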

Unfortunately, the decorated Selenium driver does not exist as far as I know, and while text-to-speech software is common, projects whose automated tests have a good enough abstraction level are very rare. However, as we evolve our projects and platforms, we might get closer and closer to generating this critical piece of documentation from code alone (a step further along the BDD line).
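The text-to-speech half really is the easy part today. A minimal sketch using the gTTS library (one option among many; pyttsx3 works offline), reusing the hypothetical narration strings from the earlier sketch:

```python
from gtts import gTTS

narration = [
    "Click 'New Story' on the welcome screen.",
    "Type a title for your story and press Enter.",
]
# Generate one audio clip per tutorial step, ready to be stitched
# onto the recorded video.
for i, line in enumerate(narration):
    gTTS(line).save(f"tutorial-step-{i}.mp3")
```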

Please feel free to comment, agree or disagree, and point out any technology you might know that goes along these lines.