(1) To simply download the dataset as a zipfile, please click the "Download Full Dataset" button below. It will download a zipped folder with all the sounds, norms and data, along with the accompanying scripts and a README file.
(2) To better understand the dataset, we recommend reading through all the sections below, which discuss the sounds, their acoustics, semantics and norms in more detail. Each section can be expanded by clicking the "Read more" button. Each section also contains a corresponding "Download" button that allows you to download the data discussed in that section.
Sound categories included in the database were selected based on the sounds' overall ecological frequency of occurrence (Ballas, 1993) and their membership in distinct subclasses of sounds (Schaefer, 1977) at different levels of abstraction, so as to represent common and ‘easy to distinguish’ sound classes (Gemmeke et al., 2017). The top-level classes comprise (1) human sounds, (2) manmade or mechanical sounds, and (3) sounds of nature or natural sounds, a split indicated in a number of the reviewed taxonomies as the most general division between sound classes. These main groups can be further divided into multiple subcategories at a higher level of categorical abstraction, derived from consulting specialised lists of sounds (e.g., ornithology, entomology, and bioacoustics) and lists of sound categories and their annotations generously provided by FreeSound.org and BBC Archive. Each subcategory was composed of more basic-level classes (e.g., cars, cats, drills) chosen to reflect the most frequently occurring and most familiar agents and objects in the environment. For example, when selecting animals, cats and dogs were chosen over donkeys or lions, as they are more likely to appear in our everyday environment and to expose listeners to natural acoustic variability within that class. Where possible, each basic class also consists of multiple types of sounds produced by the same source and uniquely associated with it.
The category selection process comprised the following four steps. (1) Identification included a thorough review of environmental sound taxonomies to identify a list of possible categories and their organisational criteria. (2) Synthesis focused on extracting common structural characteristics to select the categories that were most consistently indicated across the various taxonomies. (3) Specialization involved exploring representative subcategories and consulting specialist sound taxonomies and lists of classes from other databases to identify sounds produced by the same source. For example, although cats and dogs were included in most previous studies, both animals have more than one vocalization type in their repertoire. To correctly identify species- or source-specific sounds, the appropriate literature was consulted (e.g., sounds produced by animals – cats, Pandeya & Lee, 2018; dogs, Molnar et al., 2008; birds, Briggs et al., 2012; sounds produced by cars, Morel et al., 2012; Park et al., 2019). (4) Filtering refers to the process of selecting the most frequently occurring and the most characteristic sounds for each class. First, lists of the ecological frequency of everyday sounds (Ballas, 1993) and familiarity ratings (e.g., Marcell et al., 2000; Hocking et al., 2013; Burns & Rajan, 2019) were consulted. For example, familiarity ratings (with lower values representing more familiar sounds) indicated that the sounds produced by a dog (M=1.3), cat (M=1.22), horse (M=1.27) and cow (M=1.54) are more familiar than those of other animals such as a donkey (M=2.6), bear (M=3.31), goat (M=2.04) or lion (M=2.42; a full list of familiarity ratings can be found in Hocking et al., 2013). Then, for each of the selected objects or agents, we selected the most characteristic sounds. For example, sounds of dogs included barking or howling, but not snoring or walking, which are not distinctive for a dog (i.e., other animals might also be snoring or walking).
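The familiarity part of the filtering step can be sketched in a few lines of Python. This is only an illustration, not the procedure actually used to build the database: the ratings are the eight values quoted above from Hocking et al. (2013), while the cutoff `MAX_RATING` and the function name `select_familiar` are hypothetical and introduced here purely for the example.

```python
# Mean familiarity ratings quoted in the text (Hocking et al., 2013).
# Lower values mean more familiar sounds.
FAMILIARITY = {
    "dog": 1.3, "cat": 1.22, "horse": 1.27, "cow": 1.54,
    "donkey": 2.6, "bear": 3.31, "goat": 2.04, "lion": 2.42,
}

# Hypothetical cutoff, chosen only for illustration (not from the source).
MAX_RATING = 2.0


def select_familiar(ratings, max_rating=MAX_RATING):
    """Return the sources rated at or below max_rating, most familiar first."""
    kept = [(name, mean) for name, mean in ratings.items() if mean <= max_rating]
    return [name for name, _ in sorted(kept, key=lambda item: item[1])]


selected = select_familiar(FAMILIARITY)
print(selected)
```

With this assumed cutoff, the dog, cat, horse and cow are kept and the donkey, bear, goat and lion are dropped, which matches the contrast described in the text. Choosing the most characteristic sounds for each retained source would remain a separate, manual judgement.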
The labels were constructed to be grammatically consistent and as generically descriptive as possible. They are all linguistic phrases derived from the most common labels occurring in the reviewed databases. Given that neither nouns nor verbs alone are sufficient to distinguish between certain sound classes (e.g., a 'dog' label does not differentiate between barking and growling; at the same time, growling might not be indicative of a dog; Saygin, Dick, & Bates, 2005), we opted to include both noun and verb (or its derivative) constructions in our labels. Their grammatical complexity was kept uniform by transforming all phrases into ‘noun + gerund’ forms. A similar approach was used successfully in previous research (e.g., Saygin, Dick, Wilson, Dronkers, & Bates, 2003). The same syntactic frame was used to construct the labels for all the sounds.
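As a rough illustration of the ‘noun + gerund’ frame, the sketch below builds a label from a source noun and a base-form verb. It is not the tool used to create the actual labels: the helper names `to_gerund` and `make_label` are hypothetical, and the spelling rules are deliberately naive, so irregular verbs would still need manual checking.

```python
VOWELS = "aeiou"


def to_gerund(verb):
    """Naive English gerund formation (illustrative only)."""
    verb = verb.lower().strip()
    if verb.endswith("ie"):
        # e.g. "tie" -> "tying"
        return verb[:-2] + "ying"
    if verb.endswith("e") and not verb.endswith(("ee", "oe", "ye")):
        # e.g. "drive" -> "driving"
        return verb[:-1] + "ing"
    if (
        len(verb) == 3
        and verb[0] not in VOWELS
        and verb[1] in VOWELS
        and verb[2] not in VOWELS + "wxy"
    ):
        # Short consonant-vowel-consonant verbs, e.g. "hum" -> "humming"
        return verb + verb[-1] + "ing"
    return verb + "ing"


def make_label(noun, verb):
    """Combine a source noun and an action verb into a 'noun + gerund' label."""
    return f"{noun.lower().strip()} {to_gerund(verb)}"


print(make_label("dog", "bark"))   # dog barking
print(make_label("dog", "growl"))  # dog growling
```

Keeping both parts in the label means that 'dog barking' and 'dog growling' stay distinct, which a bare 'dog' label would not achieve.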
The stimuli are 530 natural sounds downloaded from the following databases and manually edited: Free Sound, BBC Sound Effects, Sound Bible, Zapsplat, Orange Free Sounds, Adobe Audition Sound Effects Library and YouTube.
The selection of acoustic features was inspired by previous research in audio content analysis (Lerch, 2012), with an emphasis on applications in environmental sound recognition and classification (e.g., Keller & Berger, 2001; Peltonen et al., 2002; Cai et al., 2006; Muhammad & Alghatabar, 2009; Leaver & Rauschecker, 2010; Velero & Alias, 2010; for reviews, see Alias, Socoro, & Sevillano, 2016; Serizel, Bisot, Essid, & Richard, 2017). We included features that have been shown to perform well in describing and parameterising environmental sounds. The list of selected features is not exhaustive, but given the usefulness of these features in previous research, it should provide a good starting point for considering variability in the acoustic structure of environmental sounds.
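To give a sense of what parameterising a sound can look like, the sketch below computes two simple time-domain descriptors, root-mean-square (RMS) level and zero-crossing rate, for a synthetic tone. These two descriptors are chosen here only as familiar examples from audio content analysis; this section does not state whether they are among the features provided with the dataset, and the function names and the test signal are assumptions made for the illustration.

```python
import math


def rms(samples):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))


def zero_crossing_rate(samples):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(samples, samples[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(samples) - 1)


# Synthetic test signal (an assumption for this example, not a dataset sound):
# one second of a 440 Hz sine wave sampled at 8 kHz.
SAMPLE_RATE = 8000
tone = [math.sin(2 * math.pi * 440 * n / SAMPLE_RATE) for n in range(SAMPLE_RATE)]

print(round(rms(tone), 3))
print(round(zero_crossing_rate(tone), 3))
```

For a pure tone the zero-crossing rate tracks the frequency (about two crossings per cycle), whereas noisy environmental sounds typically give higher and less stable values, which is one reason such descriptors are often reported alongside spectral features.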
TBC.
TBC.
Last modified 2024/04/03
Designed by @mkachlicka