Meeting Topic: Building a Software Audio Product: Two Case Studies
Moderator Name: Greg Riggs
Speaker Name: Armen Karamian, AppOnboard; Ivana Andjelkovic, Schiit Audio
Meeting Location: Sportsman's Lodge, Studio City, California
On Tuesday, November 28, 2017, the Audio Engineering Society Los Angeles Section heard from two software engineers, Armen Karamian and Ivana Andjelkovic. Armen works at AppOnboard, Inc., but discussed Nodal, an iPad application he developed on his own, while Ivana works at Schiit Audio and discussed her Ph.D. work on audio recommendation software. The panel was moderated by Greg Riggs.
Armen described his path to becoming a software engineer and application developer: "I'm here to talk about my baby Nodal. It started off as an experimental music application, just for myself. It uses algorithmic synthesis to create audio patterns. But to say where I started, I actually went to art school, and I was working in film post-production for a while. I got interested in digital signal processing (DSP) and programming and so I took some online classes or MOOCs, massively open online classes as I like to call them. I love those. I took some classes, and failed. So I took more classes, then I passed. Don't give up on your dreams."
He described first experimenting with Android for app development, but found that Apple products offered better software and hardware support for multimedia, with much more responsive user interaction. Armen showed a graph comparing Android and Apple iOS devices: even a five-year-old Apple phone responded to input with 7 milliseconds of latency, whereas a brand-new Google Pixel 2 XL took between 14 and 53 ms.
Nodal was originally just a personal looper, where Armen would take his synthesizer and feed sounds into software to see how they could be manipulated. He realized that he could use Markov chains to create new, random combinations of synthesized sounds in interesting ways, and use the graphical interface of the iPad to allow users to connect nodes (hence "Nodal") in different ways and weightings to create chains of forward causation and feedback.
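The idea of weighted node connections driving a Markov chain can be illustrated with a minimal Python sketch. This is not Nodal's actual code: the node names, the weights, and the functions `next_node` and `generate_pattern` are all hypothetical, chosen only to show how connection weights become transition probabilities and how a self-connection produces feedback.

```python
import random

# Hypothetical node graph (not taken from Nodal): each node names a sound,
# and each outgoing connection carries a user-chosen weight.
TRANSITIONS = {
    "kick":  {"snare": 3.0, "hat": 1.0},
    "snare": {"kick": 2.0, "hat": 2.0},
    "hat":   {"kick": 1.0, "hat": 1.0},  # self-connection acts as feedback
}

def next_node(current, transitions, rng):
    """Pick the next node with probability proportional to its edge weight."""
    targets = list(transitions[current])
    weights = [transitions[current][t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]

def generate_pattern(start, length, transitions, seed=None):
    """Walk the chain from `start` and return a list of `length` node names."""
    rng = random.Random(seed)
    pattern = [start]
    while len(pattern) < length:
        pattern.append(next_node(pattern[-1], transitions, rng))
    return pattern

if __name__ == "__main__":
    # A fixed seed makes the random walk repeatable for experimentation.
    print(generate_pattern("kick", 8, TRANSITIONS, seed=1))
```

Raising the weight on one connection makes that transition more likely without ruling out the others, which is roughly what reconnecting and reweighting nodes on a touch interface would do.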
Armen described the work of building an app, from learning the programming language, to learning the concepts of object-oriented programming, to learning the specifics of a development framework, which lets a developer put built-in software tools to greater effect. He noted that the Apple iOS environment has UIKit for building a consistent Apple-like experience, and that underneath it Apple supports both Objective-C, a stalwart of Apple programming, and now Swift, a newer programming language developed by Apple. The Apple frameworks also include audio sessions, a set of tools that make programming audio applications easier and more powerful. For instance, an audio session lets the audio output automatically adapt to the native sampling rate of the device the application is running on, whether that is 44.1 kHz or 96 kHz. Apple also provides AVFoundation, which includes AVPlayer, AVAudioEngine, a set of synthesis tools, and Inter-App Audio for exchanging audio information between applications. These tools are basically the same as those on the Mac.
Armen said that, according to the time recorded on GitHub, developing the application took around 380 hours. The software tools have become much more powerful, though: something that would have taken 300 lines of code in Objective-C took only 50 lines in Swift, which made Swift the obvious choice as the main coding language.
Ivana Andjelkovic described her Ph.D. research that she did at the University of California, Santa Barbara: "Moodplay is an interactive music recommendation system based on artists' mood similarity, in order to predict user preferences. It includes an interactive recommender system to enable users to make adjustments to the recommendations they receive." Continuing, she explained why a music recommendation system had become important: "The way we're listening to music has been changing drastically in the last couple of decades. Twenty to thirty years ago we used CDs. A typical person had 50 or 100 CD albums whereas today we have access to streaming services, Spotify being probably one of the most popular ones that have over a million albums. That's all available to us and we have to have a way to navigate through all of those collections."
Ivana's Moodplay recommendation system is based on associating a "high dimensional space of moods" with a collection of artists. Each artist's position in this space is based on associated mood categories. The user supplies a set of profile items which are used to create an avatar, which is placed in a space surrounded by the music samples. The user can move their avatar through the space representing the moods of the music samples, and broaden or narrow the circle of interest as a means of exploring what they might like.
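The avatar and its adjustable circle of interest can be pictured as a radius query in the mood space. The Python sketch below is only an illustration under stated assumptions, not Moodplay's implementation: the artist names, the three-dimensional coordinates, the use of Euclidean distance, and the `recommend` function are all invented for the example.

```python
import math

# Hypothetical artists placed in a 3-dimensional mood space. The axes and
# coordinates are invented for illustration; Moodplay's data is not shown.
ARTISTS = {
    "Artist A": (0.8, 0.2, 0.1),
    "Artist B": (0.7, 0.3, 0.2),
    "Artist C": (0.1, 0.9, 0.3),
    "Artist D": (0.2, 0.1, 0.9),
}

def recommend(avatar, radius, artists):
    """Return (name, distance) pairs for artists within `radius`, nearest first."""
    hits = []
    for name, position in artists.items():
        distance = math.dist(avatar, position)  # Euclidean distance (assumed)
        if distance <= radius:
            hits.append((name, distance))
    return sorted(hits, key=lambda item: item[1])

if __name__ == "__main__":
    avatar = (0.8, 0.2, 0.1)
    # A narrow circle of interest, then a broader one.
    print([name for name, _ in recommend(avatar, 0.1, ARTISTS)])
    print([name for name, _ in recommend(avatar, 0.2, ARTISTS)])
```

Moving the avatar changes the center of the query and widening the radius admits artists whose moods lie farther away, which mirrors the exploration described in the talk.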
Ivana said that one of her reasons for exploring this topic is that typical music recommendation systems offer little interaction and little explanation of why they serve the music they do. She also wanted to explore what effect interactive visualization would have on the user experience: would users enjoy interacting with the system more, and find more of the music they might enjoy?
Typically a recommendation system bases its recommendations on music that you've already listened to or on an artist you've said you like, often drawing on the preferences of other people that it judges to be similar to you. Recommendations based on such cues may limit the new music that can be discovered if the social group is simply unaware that the music exists. Content-based recommendation, on the other hand, might compute similarity between songs by analyzing timbre, tempo and other characteristics of each song, in order to find similarities among a broad group of songs, whether popular or not. Spotify has playlists based on moods, curated either professionally or by a user, but the relationships between one mood and another, their proximity, aren't obvious. A mood space in which one can travel and move between moods is easier to explore than a list, which is what Spotify offers.
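As a rough illustration of content-based similarity, the Python sketch below compares songs by hypothetical feature vectors. The talk did not name a specific similarity measure; cosine similarity is an assumption made here, and the song titles, the feature values (imagined as normalized tempo and two timbre descriptors), and the function names are invented for the example.

```python
import math

# Hypothetical per-song features scaled to the range 0..1, for example
# tempo and two timbre descriptors. All values are invented.
SONGS = {
    "Song A": (0.60, 0.30, 0.80),
    "Song B": (0.62, 0.28, 0.75),
    "Song C": (0.10, 0.90, 0.20),
}

def cosine_similarity(u, v):
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)

def most_similar(title, songs):
    """Rank every other song by similarity to `title`, most similar first."""
    target = songs[title]
    scored = [
        (other, cosine_similarity(target, features))
        for other, features in songs.items()
        if other != title
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)

if __name__ == "__main__":
    for other, score in most_similar("Song A", SONGS):
        print(other, round(score, 3))
```

Because the ranking depends only on the audio features, an obscure song scores as well as a popular one when the features match, which is the advantage over socially driven recommendations noted above.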
Studies of user satisfaction with the Moodplay system indicated that users were more satisfied and felt that the content provided was more relevant to their interests. Finding a satisfactory mood database, however, is somewhat challenging, and in Ivana's opinion the most useful is Rovi Music's Moodlogic database, with over 300 different moods. Each artist is tagged with five to twenty of these 300 moods. The number of moods available to describe music makes the use of an n-dimensional space essential, as some pieces of music may be related in certain respects, say valence or arousal, but unrelated in others. To make the problem more manageable, she drew on a 2008 psychological study by Zentner that broke music into three broad categories of sublimity, vitality, and unease, with subcategories under each of these. As an example, she compared the bands "Husky Rescue" and "Flunk", and found that while their music was similar, Flunk is sadder. The total number of artists used to create Ivana's Moodplay mood space was five thousand. She also found, however, that there were tradeoffs in providing information to the users. Being able to interact with the system increased the users' trust in it, but keeping a history of previous marks in the mood space seemed to overwhelm the users a bit and was negatively correlated with satisfaction.
https://www.youtube.com/watch?v=eEdo32oOmcE&t=4s
The Audio Engineering Society, Los Angeles Section wishes to thank Ivana Andjelkovic and Armen Karamian for their time and their presentations, and for increasing awareness of the applications of software in the world of audio.
Written By: John Svetlik