Aurature / Aurality

The following is an extract from my forthcoming essay 'Aurature at the End(s) of Electronic Literature' (the draft under consideration is linked here), which was presented as a paper at the 2015 conference of the Electronic Literature Organization, University of Bergen, Norway.

Programmatology page with a number of recorded examples.

"Is there something about the contemporary culture of reading which has not so far be mentioned and that has emerged with new significance? I believe that there is. At last we come to aurature at the end(s) of this essay. I deliberately left off one of the interesting affordances of contemporary ebooks from my previous list. Many ebooks now have companion audio versions, some of them with the ability to sync across reading platforms. Without being able, here and now, to quote hard literary sociological evidence to support this (although I am confident that my impression would be borne out), I would say, anecdotally, that there has been a significant increase in the reading of audio books over the past decade. They are ever cheaper to buy, much more numerous and, because of digitization and network delivery, an order of magnitude easier to acquire and manage. In the world of both popular and high literary culture, there has, therefore, been a significant increase in the appreciation of literary artifacts — in their reading, I would say — by way of aurality as opposed to visuality.

"At the same time, with the advent and persistent presence of Siri, Cortana, and Google Now, we are beginning to realize that computational terminals linked to the ‘cloud’ — and thence to the research and service infrastructures of Big Software — are now listening to us, and responding with much improved synthesized voices, beginning to approach an acceptable coherence of significance and affect in construable utterance. These voices can also be configured to read out loud from arbitrary texts of our choice on computers and other devices housing the aforementioned software agents that these same voices ventriloquize. People who nowadays encounter these vocal transactors may begin to understand some new part of what has become of all the data that they have filled in and posted, that they have willingly and much too freely given over not only to market profiling but to the solutionist research institutions of Big Software. Whereas computer voices and ‘text’ generation had remained, until quite recently, feeble, if charming, geekish jokes from the ‘AI winter,’ now many of us — I mean many non-specialists — have heard of what ‘n-grams’ may do for us and for our culture at large and that this is also an aspect of a widespread, ramified, and very pragmatic, commercially-invested engagement with ‘natural language processing.’

"With the prospect, in part, of being able to balance out what can only be understood as an invidious commercial overdetermination, a whole new field of technically and algorithmically implicated aesthetic language practice is opening up for just the kind of author-makers who may have been speculating about the ends of electronic literature. Perhaps we will not be able to think of this new field as, strictly, literary practice since its medium is language without the letter. As an applied grammatologist, the writer of this essay proposes that we eschew any unwarranted qualitative linguistic-philosophical distinction between writing and speech. Language is medium agnostic, although the human animal, as language co-creator, is not. Regardless, to ‘read,’ in our philosophy, is, precisely, to transmute perceptible forms — consisting of any material substance — into language. While — the serpent eats its tail — it is the bringing into being of language that proves to us that ‘reading’ has taken place.

"How and why might the practice of a computationally implicated aurature be important, apart, that is, from helping to stave off or delay the end of electronic literature? There is too much to say here. However, to conclude this essay, I will simply illustrate a few points by way of example, not attempting to draw out the full implications of what is touched on in the following brief narrative. Nonetheless, the arrival of speaking and, especially, listening networked programmable devices — as a part of the technological and cultural architecture of Big Software — has, I believe, important consequences for literature and for literary practices of all kinds.

"After Siri and at around the same time that we were introduced to Cortana and Google Now, it became possible to invite Alexa — Amazon’s Echo — into our homes, promoted by much-satirized advertising suggesting that she might even become a kind of family member. Alexa can speak and she also — most particularly — listens. If you set her up and leave her in some common room of your home she will listen to everything that she can hear within that space using seven excellent microphones particularly attuned to vocal human language by ‘Far-field voice recognition.’ Triggered by her ‘wake word,’ the eponymous “Alexa,” she sends everything she subsequently hears — including “a faction of a second of audio before the wake word” [my emphasis] — to the ‘cloud’ for processing by Amazon’s “Alexa Voice Services.” The latter is the name for a web-based infrastructure that, in addition to interpreting and responding to human invocations of Alexa herself, will provide an inexpensive service for any hardware manufacturer wanting to add voice recognition, control and vocal feedback to their devices, without having to build these technologies and services themselves. Our mobile digital familiars — especially phones and tablets — already surveil us extensively given our more or less silent, passive consent, but they are ours, intimate with us — they seem to be our individual business or problem. I believe that Alexa is the first device that we have invited to enter into our homes and attend to whatever occurs — that its algorithms can linguistically interpret — in these spaces that we may also share with other ostensibly private visitors and without any existing protocol for obtaining their consent to this surveillance, always assuming that this now occurs to us as any kind of a problem. And when ever more devices are enhanced and empowered by the Voice Services of Big Software? Then what? Will everything in the the world of human aurality be perfectly surveilled? 
Interventions will be necessary, if only to help us understand this radical transformation of the social and ideological spaces within which we must live.

"Alexa can, with the Alex Skills Kit (ASK), be given new linguistic abilities in the burgeoning world of computational aurality. These are called ‘skills,’ and she exercises them in order to respond to what she — also in the terminology of the kit — can interpret as vocally expressed ‘intents.’ Now, today, any of us can program Alexa to recognize and attend to arbitrary, even aesthetic, events of language that she believes to be intended for her. And we can make her respond appropriately with utterances that humans may understand, that we can read. For my part — and while carefully considering certain implications of this decision — my next work of digital language art will involve the making of a relatively simple skill for Alexa."

John Cayley - August 2015

