Voice Driven Type Design: Useful for #Captions?

Fascinating article by Woelfel, Schlippe, and Stitz. If I understood it properly, different qualities of an audio file or spoken voice could be interpreted and used to shape how the spoken words are represented typographically. In other words, volume might affect the size of the type, the font used, the leading, whether the type is ragged, or what have you.
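To make that idea concrete for myself, here is a minimal sketch of one such mapping: louder speech gets bigger type. This is my own illustration, not the authors' method. The function name, the decibel range, and the point sizes are all assumptions I picked for the example.

```python
def caption_font_size(loudness_db, min_db=-40.0, max_db=0.0,
                      min_pt=14.0, max_pt=36.0):
    """Map a loudness reading onto a font size in points.

    Assumes loudness is given in decibels on a scale where max_db is the
    loudest expected value. All default values are hypothetical.
    """
    # Clamp so whispers and shouts stay within a readable size range.
    clamped = max(min_db, min(max_db, loudness_db))
    # Position of the reading within the range, from 0.0 to 1.0.
    fraction = (clamped - min_db) / (max_db - min_db)
    # Linear interpolation between the smallest and largest type size.
    return min_pt + fraction * (max_pt - min_pt)
```

A reading halfway through the range, such as `caption_font_size(-20.0)`, lands halfway between the two sizes. A real system would presumably map other voice qualities (pitch, pace) to other type properties in a similar way.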

For captioning, I thought this might be interesting as a potential algorithmic approach to caption generation. Might be a good way to show anger, calm, seduction, etc. Then again, this might also require viewers to learn an entirely new vocabulary or develop an additional literacy to interpret what specific things, like size, font, or color, might mean.

Then there are a few potential problems: would each company have its own algorithm, leaving no consistent standard? Would the standard be free and given away? Would there be a charge for it? Could this only appear in videos or films that could afford it?

One approach that I think might work is, of course, inspired by the epic and experimental captioning done in Night Watch and the new Sherlock Holmes. Rather than trying to do these kinds of effects on all text, perhaps these could be used for such things as representing NSI (non-speech information) on screen and single-word utterances, such as profanities and exclamations. Thus they would be visually different from the normal presentation of utterances and sounds, and yet they could also embody certain components of the utterances in their type design.

Plenty to think about from this piece. Hope that we see more of this type of work--and maybe some of it will enter the realm of captions. Yes!

Optimal Caption Placement: Ouzts, Snell, Maini, Duchowski

Excited I was to find this conference paper! "Yes!" I thought. "This will be fascinating." And then, after I finished reading all two pages, I felt disappointed. Is this the authors' fault? A bit. Is it my fault? A bit. You see, I sadly lack the necessary statistical literacy, or my statistical chops are seriously gummed up, to fully make sense of the results section. So, there's that.

Their conclusion, however, reads thus:

"An eye tracking study was presented in which several different captioning styles were examined. Significant differences were found between eye movement metrics depending on the captioning style used, suggesting that captioning styles play an important role in viewing strategies. Participants underwent large amounts of saccadic crossovers and spent much less time reading the captions when captions changed position frequently. Future work is needed to fully examine the implications of these differences" (emphasis added, p. 190).

This makes quite a bit of sense, especially when you consider that they tried four approaches to presenting the captions. (Read the article, heh!) Most notably, they tried the traditional captioning positions as well as placing captions above the speakers when present on screen. If the speaker was not on screen, the captions would be at the bottom. This left me wondering about a couple of things.

While it may be useful for comprehension to avoid lots of extra or overlapping eye movement, might it not be possible to have the caption placement near speakers occur during intense dialogue and conversation and then shift to traditional (at the bottom) placement when conversation off-screen alternates with significant NSIs (non-speech information)? That might be an interesting approach to captions to test out--especially in dialogue-heavy video or film.
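Here is a rough sketch of how such a placement rule might look, decided one caption at a time. It is only my simplification of the idea, not anything from the paper. The `Cue` type, the `place_caption` function, the normalized screen coordinates, and the offset value are all assumptions made up for illustration.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Cue:
    """One caption (hypothetical structure for this sketch)."""
    text: str
    is_nsi: bool  # True for non-speech information, e.g. [door slams]
    # Speaker position as (x, y) in 0..1 screen coordinates, with y = 0
    # at the top of the frame; None when the speaker is off screen.
    speaker_xy: Optional[Tuple[float, float]]


# Traditional placement: centered near the bottom of the frame (assumed values).
BOTTOM = (0.5, 0.9)


def place_caption(cue: Cue, offset: float = 0.125) -> Tuple[float, float]:
    """Return where to draw the caption, in 0..1 screen coordinates."""
    # NSIs and off-screen speech fall back to the traditional position.
    if cue.is_nsi or cue.speaker_xy is None:
        return BOTTOM
    x, y = cue.speaker_xy
    # On-screen speech sits just above the speaker, kept inside the frame.
    return (x, max(0.0, y - offset))
```

A fuller version would need some way to decide when a scene counts as "intense dialogue," which is exactly the part that would need research to back it up.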

As to the article, I am grateful that the authors conducted and shared the research. I just wish the findings had been more explicitly stated. Then again, there might not have been enough information, or data, to support broader generalizations or suggestions for practice. I respect that. However, in the interest of testing out other approaches to captioning, it would be nice to have some research-driven data from which to launch.

Bringing in the Connotative in #Captions: Nicole Snell's Dissertation on Captions

That title can't sum up Nicole Snell's dissertation, but it does emphasize one of the key points she makes in her work. I'm about twenty pages in and quite enjoying it. It's a dissertation, not an academic article, so that makes the reading flow differently; however, the content and focus are quite refreshing.

Particularly love this part:

"This goal [of the dissertation] is undergirded by the hypothesis: since closed captioning changes the passive viewing experience into an active reading one, it can be predicted that users of closed captioning construct connotative and emotional meaning through viewing and meaning making strategies that are different than the strategies individuals who have access to the audio soundtrack and scene action do" (p. 12).

Over the past couple of days, I have been trying to think about and better understand why I like captions. I seem to have some kind of fixation or attachment to them--something akin to my former obsession with punk rock or specific bands when I was sixteen or eighteen. It's like an itch or something. Snell's quote, though, helps me better understand my focus on captions--if only a little bit. Watching the telly, well, just does not do it for me. Boring. I want the captions, and the captions serve and work as a kind of validator of what I'm seeing and hearing.

Captions also provide additional information. Sometimes it's song lyrics--not always easy to discern if you are just listening; other times, it's background muttering by another character--not always clear in the actual spoken dialogue. If memory serves me right, this happens a fair amount in Orphan Black and similar conspiracy-esque series. Back to Snell's point, though: I'm not just watching, I am also reading.

When I read captions, though, it's not a passive activity. I read and see if all the dialogue is there. I look for specific sounds. I wonder about the presentation of accents--or not. I wonder what non-speech information (NSI) [using a descriptor developed, as far as I know, by Sean Zdenek] is presented and why. I am entertained by the narrative but engage with one form of its representation.

I know that this post's focus is on what I like about captions and why--at least part of the reason. The affective is another emphasis Snell covers in her diss--not just the connotative. If you want more details, please check out her work. Really. This is some exciting thinking!

And yes, this is a processing and working through post.