{"id":2955,"date":"2013-11-09T22:21:59","date_gmt":"2013-11-09T22:21:59","guid":{"rendered":"http:\/\/blog.soton.ac.uk\/digitalhumanities\/?p=2955"},"modified":"2013-10-15T22:22:22","modified_gmt":"2013-10-15T22:22:22","slug":"sotondh-small-grants-investigation-into-synthesizer-parameter-mapping-and-interaction-for-sound-design-purposes-post-2","status":"publish","type":"post","link":"https:\/\/digitalhumanities.soton.ac.uk\/small-grants\/2955","title":{"rendered":"sotonDH Small Grants: Investigation into Synthesizer Parameter Mapping and Interaction for Sound Design Purposes \u2013 Post 2"},"content":{"rendered":"
In the previous blog post, three research questions were presented in relation to how synthesizers are used for sound design. First, is there a way that sound design can be performed without an in-depth knowledge of the underlying synthesis technique? Second, can a large number of synthesizer parameters be controlled intuitively with a set of interface controls that relate to the sounds themselves? Finally, can multiple sets of complex synthesizer parameters be controlled and explored simultaneously?
Over the years there has been significant research in the area of synthesizer programming, which can be separated into two foci: first, improving the programming interface so that the details of the underlying techniques are hidden but can still be controlled; second, automatically programming a synthesizer to replicate target sounds.
As previously mentioned, the programming interface that a synthesizer presents to the user is often a direct mapping of the synthesis parameters rather than being related to the output sound, and follows directly from original hardware synthesizers such as the Moog Modular [1]. Various proposed solutions examine the mapping of the synthesizer parameters between the synthesis engine and the programming interface, to see if the relationship can be made more intuitive and less technical.
Several researchers have developed systems that interpolate between parameters, or sets of parameters, via a user interface. Work in this area was first completed at GRM in the 1970s and 80s, where the SYTER system was developed [2], [3]. This system used an X-Y graphical interface to control the relationship between different parameters of the synthesis engine. The X-Y positions of points on the graphical interface were mapped to the parameters using a gravitational model, and the user could explore different interpolations between the parameters. The number of parameters controlled on the X-Y plane can be expanded by defining different origins for the position calculations of each parameter [4]. As noted above, these systems use a gravity model to define a circle of influence for the interpolation function. A later system called Interpolator used a light model, where an angle could be specified to define an interpolation zone [5]. This gives extra flexibility, and when an angle of 360° is used the traditional circular model can also be achieved. Interpolation techniques were expanded in the implementation of Metasurface, which allows the definition of multiple origins in an X-Y plane and then uses a spatial interpolation technique called Natural Neighbourhood Interpolation [6]. This creates a polygon for each parameter and then gives each a weighting value that corresponds to the area taken from adjacent polygons, resulting in smooth control of the assigned parameters. Other geometric manipulations of the parameter space have been suggested [7], resulting in the implementation of a multi-layer mapping system that has been used to map both geometric position and gestures to the sound, via a drawing tablet [8]. The principle of using an X-Y plane has also been advanced with the use of multi-point touch-screen interfaces, which allow the relationships between multiple points to be used to map advanced multi-touch gestures [9].
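To make the gravity-model idea concrete, the following sketch shows inverse-distance weighting of complete parameter sets placed on an X-Y plane. It is only a minimal illustration of this family of interpolators, not a reconstruction of SYTER, Interpolator or Metasurface; the preset names, positions and parameter values are invented for the example.

```python
import numpy as np

# Each preset is a complete set of synthesizer parameters (hypothetical values)
# anchored at a position on the X-Y control plane.
presets = {
    "bright_pad": {"pos": (0.2, 0.8), "params": np.array([0.9, 0.3, 0.7])},
    "warm_bass":  {"pos": (0.8, 0.2), "params": np.array([0.2, 0.8, 0.4])},
    "metallic":   {"pos": (0.5, 0.5), "params": np.array([0.6, 0.1, 0.9])},
}

def interpolate(x, y, presets, power=2.0, eps=1e-6):
    """Inverse-distance ('gravity') weighting of every preset's parameters.

    Presets closer to the cursor position (x, y) pull harder on the result,
    loosely mimicking the circle-of-influence behaviour of the interpolators
    described above.
    """
    weights, vectors = [], []
    for preset in presets.values():
        px, py = preset["pos"]
        d2 = (x - px) ** 2 + (y - py) ** 2
        weights.append(1.0 / (d2 ** (power / 2.0) + eps))
        vectors.append(preset["params"])
    # np.average normalises the weights, so the result stays inside the
    # convex hull of the preset parameter sets.
    return np.average(vectors, axis=0, weights=weights)

# Dragging a cursor across the plane morphs smoothly between the presets.
print(interpolate(0.3, 0.7, presets))
```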
Timbre Space and Perceptual Mappings

Although the interpolation systems examined in the previous section do give a more intuitive way of managing complex synthesizer programming control structures, they do not necessarily relate to the perception of the sound produced. In 1975 Grey defined a "Timbre Space" based on a 3D space, using a three-way multidimensional scaling algorithm called INDSCAL to position 16 timbres in the space. The first axis is interpreted as the spectral energy of the sound, the second dimension is the temporal behavior of the upper harmonics in the attack stage, and the third is spectral fluctuation, which relates to the articulatory nature of the instrument [10]. These principles were expanded on in 1979 by Wessel, from IRCAM, who showed that a 2D timbre space could be used to control the mapping of synthesizer parameters [11]. Later, in the mid 1990s, the Intuitive Sound Editing Environment (ISEE), developed by Vertegaal at the University of Bradford, used a hierarchical structure for timbre space based on a taxonomy of musical instruments. This allowed changes in timbre that require numerous parameter changes to be generated by relocating the sound within the timbre space hierarchy [12], [13].
Although not directly related to timbre space, in 1996 Rolland, at the University of Paris, developed a system for capturing the expertise of sound designers programming a synthesizer, using a model of knowledge representation. This was not based on the attributes of the sound structures themselves, but on the manipulations or variations that can be applied to them. These transformation procedures were then defined using adjectival terms such as "brighter" or "warmer". This means a sound is classified according to the transformations that can be applied to it, rather than the properties of the sound itself. The result is a hierarchical network of sounds and the connections between them, which define the transformations required to move from one to another [14]. Seawave, developed at Michigan State University in 1994 by Ethington, was a similar system that allowed an initial synthesizer patch to be modified using controls specified with timbral adjectives [15]. More recently, in 2006, Gounaropoulos at the University of Kent produced a system that used a list of adjectives as input, which was mapped via a trained neural network [16]. The user could then adjust the sound using controls allocated to the timbral adjectives. Aramaki in 2007 then showed that a similar mapping process can be applied to percussive sounds, based on different materials and types of impact [17].
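As a rough illustration of this adjective-driven style of control, the sketch below applies labelled parameter transformations ("brighter", "warmer") to a patch. The parameter names and the direction and size of each adjustment are assumptions made for the example; they are not taken from any of the cited systems.

```python
# Hypothetical subtractive-synthesis patch: parameter names and normalised
# ranges are assumptions made for this illustration only.
patch = {"cutoff": 0.5, "resonance": 0.3, "attack": 0.2, "harmonics": 0.4}

# Each adjective is defined by the transformation it applies to a patch,
# echoing the idea of classifying sounds by the manipulations that can be
# applied to them rather than by their static attributes.
transformations = {
    "brighter": {"cutoff": +0.15, "harmonics": +0.10},
    "warmer":   {"cutoff": -0.10, "resonance": +0.05, "attack": +0.05},
}

def apply_adjective(patch, adjective, amount=1.0):
    """Return a new patch nudged in the direction named by the adjective."""
    deltas = transformations[adjective]
    return {name: min(1.0, max(0.0, value + amount * deltas.get(name, 0.0)))
            for name, value in patch.items()}

print(apply_adjective(patch, "brighter"))
print(apply_adjective(patch, "warmer", amount=0.5))
```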
Nicol in 2005 was the first to propose the use of multiple timbre spaces, with one generated from listening tests and another drawn from acoustic parameters [18]. In a comprehensive body of work, Seago expanded this idea and has recently presented a synthesizer interface that allows the design and exploration of a timbre space, using a system of weighted centroid localization [19], [20].
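Weighted centroid localization itself is straightforward to sketch: given a set of reference timbres with known coordinates and a weight for each (for instance from listener similarity judgements against a target), the target is placed at the weight-normalised average of the reference positions. The coordinates and weights below are invented for the illustration, and this is a generic sketch of the technique rather than Seago's implementation.

```python
import numpy as np

# Reference timbres with known coordinates in a 2-D timbre space
# (coordinates and weights are invented for the illustration).
reference_points = np.array([[0.1, 0.9],
                             [0.6, 0.4],
                             [0.8, 0.8]])

# Weights might come from listener similarity judgements against a target.
weights = np.array([0.2, 0.7, 0.1])

# Weighted centroid localization: the estimated position of the target sound
# is the weight-normalised average of the reference coordinates.
estimate = (weights[:, None] * reference_points).sum(axis=0) / weights.sum()
print(estimate)
```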
Work is continuing on generating more accurate representations of perceptual adjectives, and hence definitions of timbre space; recent examples are [21], [22], [23]. Potentially this will result in a more controllable mapping between a synthesis engine and timbre space.
One of the unique features of synthesizer technology, compared with traditional instruments, is that it presents two interfaces to the user: one for programming the sound generator and the other for the actual musical input. During a performance the user can potentially interact with either, or both, interfaces. Therefore, with both interpolated parameter mapping and timbre space mapping systems, the quantities mapped to the performance interface will ultimately affect the expressiveness of the synthesizer as an instrument [24]. As a result, the expressive control of both systems has been considered extensively.
Winkler in the mid 1990s considered the mapping of different body movements as expressive gesture control of interactive computer music [25]. Although the mapping to a synthesis engine was not considered, it demonstrated the notion of capturing movements for the control of performer expression. Along similar lines, in 2001 Camurri presented a framework for capturing and interpreting movement gestures [26]. This framework is built around the notion that a "multi-layer" system is required to take physical input signals captured from movement sensors and map them to interpreted gestures. The framework allows different formats for the input signals, such as time-variant sampled audio signals, sampled signals from tactile or infra-red sensors, signals from haptic devices, or events such as MIDI messages or low-level data frames in video. Around the same time, Arfib highlighted not only the need for gestural control, but also for a visual feedback mechanism so that the performer can learn to use the expressiveness available [27]. This work was then expanded with a multi-layer mapping strategy based on the definition of a "perception space" that allowed multi-modal feedback [28], and in a subsequent paper specific examples are given [29].
In 2003 Hunt defined a "many-to-one" mapping that uses fewer layers, but claims to offer more expressiveness [30]. Then in 2004, Wanderley reviewed gestural control of sound synthesis and presented simulated results for the various constituent parts of a Digital Musical Instrument (DMI) that are mapped to digital audio effects and computer-synthesized sounds [31]. Subsequently, adaptive control was added [32] and trajectories were used as the input stimulus [33].
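A many-to-one mapping layer of this kind can be illustrated very simply: several gesture signals are combined into a single synthesis parameter. The gesture names, weights and the linear combination rule below are assumptions for the example, not Hunt's actual mapping.

```python
def many_to_one(inputs, weights):
    """Combine several normalised gesture signals into one control value."""
    total = sum(weights.values())
    value = sum(inputs[name] * w for name, w in weights.items()) / total
    return min(1.0, max(0.0, value))

# e.g. pressure, speed and tilt of a gesture all contribute to a single
# hypothetical "brightness" parameter of the synthesis engine.
gesture_frame = {"pressure": 0.8, "speed": 0.4, "tilt": 0.1}
brightness = many_to_one(gesture_frame,
                         {"pressure": 0.5, "speed": 0.3, "tilt": 0.2})
print(brightness)
```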
Work currently being undertaken by Caramiaux is looking at synthesizing sounds that have a direct similarity to the gesture used to generate them [34]. In this way, specific sounds can be accessed with specific gestures in an intuitive way [35].
Being able to morph a synthesizer between multiple sounds in real time is not a new concept, but in most cases it is implemented as a simple cross-fade between two or more patches. Recently, more complex ways of morphing a synthesizer between different sounds have been proposed, in which points in the parameter space representing desirable sounds can be controlled. In this way a path, or trajectory, can be defined in the parameter space so that it is possible to morph between multiple sets of parameters in a specific order.
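A minimal sketch of trajectory-based morphing, assuming a small set of hand-picked parameter sets as waypoints, is a piecewise-linear interpolation along the ordered list: sweeping a single control from 0 to 1 passes through each desirable sound in turn. The waypoint values below are invented for the example.

```python
import numpy as np

# Ordered waypoints in the parameter space: each row is a complete set of
# (hypothetical) synthesizer parameters that produces a desirable sound.
waypoints = np.array([[0.1, 0.9, 0.3],
                      [0.6, 0.4, 0.8],
                      [0.9, 0.2, 0.5]])

def morph(t, waypoints):
    """Piecewise-linear morph along the trajectory.

    t = 0.0 gives the first parameter set, t = 1.0 the last, and values in
    between cross-fade through the intermediate sets in order.
    """
    t = min(max(t, 0.0), 1.0)
    scaled = t * (len(waypoints) - 1)
    i = int(np.floor(scaled))
    if i >= len(waypoints) - 1:
        return waypoints[-1]
    frac = scaled - i
    return (1.0 - frac) * waypoints[i] + frac * waypoints[i + 1]

for t in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(t, morph(t, waypoints))
```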
Ssynth, developed by Verfaille in 2006 at McGill University, is a real-time additive synthesizer that allows "additive frames" to be arranged as a 3-D mesh. Trajectories can then be used to morph between different sounds [36]. Also in 2006, Pendharkar suggested another form of parameterized morphing, where desired parameters can be selected from the parameter space and, using a control signal, interpolation can then be performed between multiple sets of parameters [37]. This allows points in the parameter space representing desirable sounds to be parameterized with high-level controls, and the choice of the end points of the morph, and its extent, to be mapped to the synthesis parameters. Aramaki also used a similar process in 2007 to morph between different sounds (materials) in a percussive synthesizer [17].
In 2010 Wyse proposed a system called Instrumentalizer that allows synthesis algorithms to be controlled with traditional instrument controls for things such as pitch and expression. The system then maps these controls to the synthesis parameters and allows morphing, to permit typical instrumental expression [38]. Another example, presented by Brandtsegg in 2011, is a modulation matrix that enables interpolation between different tables of matrix coefficients [39]. This permits the morphing of the modulator mappings, allowing the sound produced to be morphed.
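A coefficient-table interpolation of this general kind can be sketched as a simple linear blend between two modulation matrices; the source and destination names, and the matrix values, are invented for the example and do not reproduce the cited implementation.

```python
import numpy as np

# Two hypothetical modulation-matrix coefficient tables: rows are modulation
# sources (e.g. LFO, envelope), columns are destinations (e.g. pitch, cutoff).
matrix_a = np.array([[0.8, 0.0],
                     [0.1, 0.6]])
matrix_b = np.array([[0.0, 0.9],
                     [0.7, 0.2]])

def interpolate_matrices(t, a, b):
    """Linear interpolation between two coefficient tables.

    Sweeping t from 0 to 1 morphs the modulator routings themselves, and
    therefore the character of the resulting sound.
    """
    return (1.0 - t) * a + t * b

# Applying the blended matrix to the current modulation source values gives
# the modulation amounts delivered to each destination.
sources = np.array([0.5, 1.0])                    # e.g. LFO and envelope
destinations = sources @ interpolate_matrices(0.5, matrix_a, matrix_b)
print(destinations)
```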
An alternative mechanism that can be used to program synthesizers exploits resynthesis techniques. The basic premise is that a "target" sound is supplied and the system attempts to replicate it with a synthesis engine. These techniques are used either for recreating sounds without having to understand the synthesis engine, or to populate a search space for sound design. Resynthesis approaches can be separated into two categories: in the first, the target sound is analysed and the synthesis engine is programmed directly from the results of the analysis; the second uses Artificial Intelligence (AI) methods to program the synthesizer based on analysis of the supplied target.
The idea of analysis and resynthesis is not new and has been implemented many times, using frequency analysis of the target sound and additive synthesis to build a representation of the target spectrum [40]. A popular technique for the implementation of the analysis stage has been the Phase Vocoder [41], although other techniques do exist. Over the years this basic premise has been refined many times. A recent example was presented by Kreutzer in 2008, who proposed an efficient additive-based resynthesis engine that claims to provide greater flexibility for the user and reduces the number of synthesis parameters compared with traditional methods [42]. In addition to the work being done to refine the synthesis process, others have examined how the process is driven. An example from 2008 is PerceptSynth, which is controlled with perceptually relevant high-level features such as pitch and loudness [43]. In 2008 Sethares also presented tools for manipulating the spectral representations of sounds between analysis and resynthesis [44]. This gives a mechanism to dynamically change the tonality of the sound and create morphing effects.
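The basic analysis/resynthesis loop can be sketched in a few lines: take the magnitude spectrum of a frame of the target, pick the prominent spectral peaks, and drive a bank of sine oscillators with the measured frequencies and relative amplitudes. This is a deliberately crude stand-in for phase-vocoder or partial-tracking analysis, using a synthetic target for convenience; all values are invented for the example.

```python
import numpy as np

SR, N = 44100, 4096
t = np.arange(N) / SR

# Target sound: for this illustration a synthetic tone with three partials
# stands in for a recorded target.
target = (np.sin(2 * np.pi * 220 * t)
          + 0.5 * np.sin(2 * np.pi * 440 * t)
          + 0.25 * np.sin(2 * np.pi * 660 * t))

# Analysis: magnitude spectrum of a windowed frame, then crude peak picking
# (local maxima above a threshold relative to the strongest component).
spectrum = np.abs(np.fft.rfft(target * np.hanning(N)))
freqs = np.fft.rfftfreq(N, 1.0 / SR)
is_peak = (spectrum[1:-1] > spectrum[:-2]) & (spectrum[1:-1] > spectrum[2:])
peak_bins = np.where(is_peak & (spectrum[1:-1] > 0.1 * spectrum.max()))[0] + 1

# Resynthesis: drive a bank of sine oscillators with the measured partial
# frequencies and relative amplitudes to approximate the target spectrum.
amps = spectrum[peak_bins] / spectrum[peak_bins].max()
resynthesized = sum(a * np.sin(2 * np.pi * f * t)
                    for a, f in zip(amps, freqs[peak_bins]))
print(list(zip(np.round(freqs[peak_bins], 1), np.round(amps, 2))))
```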
Using additive resynthesis principles, TAPESTREA, created by Misra in 2009, is a complete sound design framework that facilitates the synthesis of new sounds from supplied audio recordings, through interactive analysis, transformation and resynthesis [45]. It then allows complex audio scenes to be constructed from the resynthesized results, using a graphical interface. Klingbeil also showed in 2009, with a resynthesis system called SPEAR, that these principles can be used for compositional purposes [46].
Over the last few years much work has been published on the analysis of acoustic audio features, of the sort used in music information retrieval and other sound analysis applications [47], [48]. These techniques are now being applied to a resynthesis paradigm. In this manner Hoffman, in 2006, presented a framework for synthesizing audio with sets of quantifiable acoustic features that have been extracted from supplied audio content [49]. Although not technically resynthesis, similar analysis has been applied to a corpus-based concatenative synthesis technique by Schwarz in 2008, called CataRT [50]. This allows user-driven parameter settings to be generated based on these forms of audio analysis.
Artificial Intelligence has become increasingly popular in the area of synthesizer programming. An early knowledge-based system by Miranda, from 1995, called ISSD (Intelligent System for Sound Design), represented sounds in terms of their attributes (brightness, openness, compactness, acuteness, etc.) and how these attributes map to subtractive synthesis parameters for formants [51]. In 1998 Miranda expanded this idea further and implemented a system called ARTIST [52], applying it to different synthesis algorithms. More recently there has been much work on the use of Evolutionary Computation (EC) techniques for the programming of synthesizers. In 2001 Garcia developed a system where Genetic Programming (GP) was used to design a population of synthesis topologies, consisting of oscillators, filters, etc. The sounds generated by individuals in the population were then evaluated to establish how closely they matched the target [53], [54]. Another AI technique that has been employed for synthesizer programming is the Genetic Algorithm (GA). GAs have been used to search large parameter spaces for target sounds, based on user interactions [55]. Then in 2003 Johnson refined this so that the new population was generated using fitness-proportionate selection, where the higher the fitness rating given to an individual, the more likely it is to be selected as a parent [56]. GAs have also been used with fuzzy logic to allow the user to make explicit associations between twelve visual metaphors and a particular sound [57]. McDermott, between 2005 and 2008, proposed a new interface design for interactive EC, which allows faster evaluation of large numbers of individuals from the population [58], [59], [60]. As well as these interactive systems, in 2008 Yee-King presented an unsupervised synthesizer programmer called SynthBot [61], which was able to automatically find the subtractive synthesis parameter settings necessary to produce a sound similar to a given target, using a GA. In addition, in a recent study by Dykiert in 2011, GAs were suggested as a mechanism to reduce the size of the parameter search space [62]. Finally, it should be noted that, as well as synthesizer programming, Miranda has also shown in 2004 how EC can be applied to the compositional process [63].
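To indicate how a GA-based programmer of this kind operates, the sketch below searches the parameter space of a toy two-parameter "synthesizer" for a hidden target, scoring candidates by spectral distance. The synthesis model, parameter ranges and GA settings are all invented for the example, and simple truncation selection is used in place of the fitness-proportionate selection described above.

```python
import numpy as np

rng = np.random.default_rng(0)
SR, N = 44100, 2048
t = np.arange(N) / SR

def render(params):
    """Toy 'synthesizer': params = (normalised frequency, brightness).

    A crude stand-in for a real synthesis engine, used only so the GA has
    something to evaluate.
    """
    freq = 100 + 900 * params[0]              # 100-1000 Hz
    brightness = params[1]
    sig = np.zeros(N)
    for k in range(1, 6):                     # a few harmonics
        sig += (brightness ** (k - 1)) * np.sin(2 * np.pi * k * freq * t)
    return sig

def fitness(params, target_spec):
    """Negative spectral distance between candidate and target."""
    spec = np.abs(np.fft.rfft(render(params) * np.hanning(N)))
    return -np.linalg.norm(spec - target_spec)

# Hidden target the GA is trying to recover.
target_spec = np.abs(np.fft.rfft(render([0.4, 0.7]) * np.hanning(N)))

# GA loop: truncation selection, uniform crossover and Gaussian mutation.
pop = rng.random((40, 2))
for generation in range(50):
    scores = np.array([fitness(ind, target_spec) for ind in pop])
    elite = pop[np.argsort(scores)[-10:]]                 # keep the best 10
    parents = elite[rng.integers(0, 10, size=(30, 2))]    # pairs of parents
    crossover = np.where(rng.random((30, 2)) < 0.5, parents[:, 0], parents[:, 1])
    children = np.clip(crossover + rng.normal(0, 0.05, (30, 2)), 0, 1)
    pop = np.vstack([elite, children])

best = pop[np.argmax([fitness(ind, target_spec) for ind in pop])]
print("best parameters:", best)
```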