software features & specs

Please Note: This page hasn't been updated for quite a while (sorry!). You may want to access Tyler's (2007) PWPL paper for a more recent introduction.

While still in active development, the SLAAP software has a number of features for interacting with and analyzing the sociolinguistic archive. This document seeks to highlight some of these features and explain the methods by which they function.

technical details

The NC SLAAP system runs on an Apache web server currently housed on a Macintosh G5 computer running Mac OS X 10.4. Data are stored in a MySQL database and application pages are written in PHP.
screen shot of line analysis example showing pitch track
Figure 1: Transcript Line Analysis, with Audio Player, Spectrogram, and Graph of Pitch
The web server communicates with third-party open source applications to do most of its "heavy" processing. Most importantly, the web server uses the phonetic software application Praat (http://www.praat.org) to conduct real-time phonetic analyses (such as generating the pitch data illustrated in the graph in Figure 1) and to generate some of the graphics (such as the spectrogram in Figure 1). Praat is also used to excerpt segments from the full audio files as they are needed.

Audio files are stored in high quality WAV format. The WAV files are used for analysis and are available (either in toto or as short excerpts) for users to download. All audio that is presented to users for listening over the web, however, is converted to mp3 for faster loading and listening. The server uses the LAME mp3 encoder (http://lame.sourceforge.net/) to dynamically convert the archive quality audio files to mp3 format. Thus, the ~1.75 second sound clip presented to the user in Figure 1 has been extracted from the complete audio with Praat and then converted to mp3 with LAME - all in the background - in response to the user's request for the page.
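
To make the excerpting step concrete, here is a minimal sketch of cutting a short clip out of a longer WAV file given a start and end time. SLAAP itself does this with Praat; the sketch below swaps in Python's standard wave module purely for illustration, and the function names (excerpt_wav, make_silence) are hypothetical. The mp3 conversion that LAME performs afterwards is not shown.

```python
import io
import wave


def excerpt_wav(source, start_s, end_s):
    """Return WAV bytes holding only the audio between start_s and end_s (seconds)."""
    with wave.open(source, "rb") as src:
        params = src.getparams()
        rate = src.getframerate()
        first = int(start_s * rate)
        count = max(0, int(end_s * rate) - first)
        # Clamp the start position so a request past the end yields an empty clip.
        src.setpos(min(first, src.getnframes()))
        frames = src.readframes(count)

    out = io.BytesIO()
    with wave.open(out, "wb") as dst:
        dst.setnchannels(params.nchannels)
        dst.setsampwidth(params.sampwidth)
        dst.setframerate(params.framerate)
        dst.writeframes(frames)
    return out.getvalue()


def make_silence(seconds, rate=8000):
    """Build a mono 16-bit WAV of silence, used here only as demo input."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(rate)
        w.writeframes(b"\x00\x00" * int(seconds * rate))
    buf.seek(0)
    return buf


# A 1.75 second clip taken from a 10 second recording.
clip = excerpt_wav(make_silence(10), 2.0, 3.75)
```

The excerpt keeps the source file's channel count, sample width, and sample rate, so the clip remains archive quality until it is converted for web listening.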

archive library

At its most basic, the SLAAP software provides a simple user interface to the digitized audio archive. Designed to mimic online library card catalogues in many ways, the software provides browsable and searchable access to the collection (as in Figure 2).

Each interview event is the source of a record in the "library".
screen shot of SLAAP library view
Figure 2: SLAAP Library Browse View
The record for each interview contains information about the interview event, such as the date of the interview, the speakers and interviewers occurring in the interview, and information about formats elicited during the interview (e.g., word pairs and/or reading passages). Each interview also contains references to the records for the media files for the interview. An interview can contain as many media files as necessary. Media files generally correspond to sides of analog tapes (since most of the digitized audio comes from tapes), and therefore are often approximately 45 minutes long. Each media file record, then, contains metadata about that file (like the digitization specifications) and the location of the actual media file in the filesystem. Links to any transcripts for the interviews are also presented to the user.

Importantly, speakers (including interviewers) are stored with their own records in a table in the database, with demographic information and references to the interviews they appear in. This allows for analyzing or searching for particular speakers in the archive (since some speakers will appear in multiple interviews). It also allows for search and analysis features that are based on demographics. Features along these lines haven't been fully developed yet, but one example would be a search feature that allows users to retrieve all interviews with, or transcripts of, say, Native American females between the ages of 14 and 25.
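
As a rough illustration of the kind of demographic search described above, the sketch below joins a speakers table to an interviews table through a link table. Everything about the schema is an assumption made for the example (table names, column names, and storing age per interview); the real SLAAP database is MySQL, and Python's built-in sqlite3 is used here only so the sketch runs on its own.

```python
import sqlite3

# Hypothetical schema; not the actual SLAAP table layout.
SCHEMA = """
CREATE TABLE speakers (id INTEGER PRIMARY KEY, pseudonym TEXT, ethnicity TEXT, sex TEXT);
CREATE TABLE interviews (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE interview_speakers (
    interview_id INTEGER REFERENCES interviews(id),
    speaker_id INTEGER REFERENCES speakers(id),
    age_at_interview INTEGER
);
"""


def find_interviews(conn, ethnicity, sex, min_age, max_age):
    """Return (id, title) for interviews with at least one matching speaker."""
    rows = conn.execute(
        """
        SELECT DISTINCT i.id, i.title
        FROM interviews AS i
        JOIN interview_speakers AS link ON link.interview_id = i.id
        JOIN speakers AS s ON s.id = link.speaker_id
        WHERE s.ethnicity = ? AND s.sex = ?
          AND link.age_at_interview BETWEEN ? AND ?
        ORDER BY i.id
        """,
        (ethnicity, sex, min_age, max_age),
    )
    return rows.fetchall()


def demo_connection():
    """Build an in-memory database with a few made-up records."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany("INSERT INTO speakers VALUES (?, ?, ?, ?)", [
        (1, "Speaker A", "Native American", "F"),
        (2, "Speaker B", "Native American", "M"),
        (3, "Speaker C", "Native American", "F"),
    ])
    conn.executemany("INSERT INTO interviews VALUES (?, ?)", [
        (10, "Interview 10"),
        (11, "Interview 11"),
    ])
    conn.executemany("INSERT INTO interview_speakers VALUES (?, ?, ?)", [
        (10, 1, 19),
        (10, 2, 40),
        (11, 3, 52),
    ])
    return conn


matches = find_interviews(demo_connection(), "Native American", "F", 14, 25)
```

Because speakers are rows of their own rather than free text on each interview record, the same speaker can be linked to several interviews without duplicating demographic data.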

Even though the development focus thus far hasn't centered on user interface, a number of user-customizable views are available. In short, the library aspects are designed to be streamlined, but flexible and powerful.

listen & annotate

While SLAAP uses the archive quality audio files (typically in WAV format) for analysis, as mentioned above, the software automatically generates an mp3 for every archived audio file for faster online listening. This mp3 is presented to the user through the QuickTime Plug-in Player (other players work, but much of the automation is designed around QuickTime). Users can not only listen to the audio through the interface, but can also enter annotations associated with particular timestamps in the audio file.
screen shot of SLAAP audio listen screen
Figure 3: SLAAP Audio Listen and Annotate View
These annotations can be searched and shared with (or hidden from) other researchers. Once a user has annotated a particular spot in the audio, all she or he has to do to return to that point is click on the annotation and the audio player will return to that point in time.

transcripts

The transcript features are some of the most developed in the SLAAP software. The basic premise behind these features is to supply users with a maximally simple, but powerful, orthographic transcription of the audio. In the SLAAP system, transcripts have become extensive versions of the annotations discussed above. That is, they are designed to be ways of finding, indexing, and treating the audio data while minimally abstracting away from it.
screen shot of multiple SLAAP transcript views
Figure 4: Multiple Transcript Views in the SLAAP System

Transcripts are stored in the database with each phonetic utterance comprising an entry (i.e., a line) in the database. For each utterance, the database stores an orthographic representation (the text) as well as the speaker and the start and end times of the utterance. Pauses, as well as speaker overlap, are recorded as a matter of course - since this information is derived from the start and end times of each utterance.
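
To show how pauses and overlap fall out of the timing data alone, here is a small sketch that derives both from a list of utterances. The Utterance record and the two function names are hypothetical, and the sample lines are made up; this is one way the derivation could work, not the SLAAP code itself.

```python
from typing import NamedTuple


class Utterance(NamedTuple):
    speaker: str
    start: float  # seconds from the start of the audio file
    end: float
    text: str


def find_pauses(utterances):
    """Return (start, end) gaps during which no one is speaking."""
    pauses = []
    latest_end = None
    for u in sorted(utterances, key=lambda u: u.start):
        if latest_end is not None and u.start > latest_end:
            pauses.append((latest_end, u.start))
        latest_end = u.end if latest_end is None else max(latest_end, u.end)
    return pauses


def find_overlaps(utterances):
    """Return (speaker_a, speaker_b, start, end) where two speakers talk at once."""
    ordered = sorted(utterances, key=lambda u: u.start)
    overlaps = []
    for i, a in enumerate(ordered):
        for b in ordered[i + 1:]:
            if b.start >= a.end:
                break  # sorted by start time, so nothing later can overlap a
            if a.speaker != b.speaker:
                overlaps.append((a.speaker, b.speaker, b.start, min(a.end, b.end)))
    return overlaps


lines = [
    Utterance("Yvonne", 0.0, 2.0, "made-up line one"),
    Utterance("Interviewer", 1.5, 3.0, "made-up line two"),
    Utterance("Yvonne", 4.0, 5.0, "made-up line three"),
]
print(find_pauses(lines))    # [(3.0, 4.0)]
print(find_overlaps(lines))  # [('Yvonne', 'Interviewer', 1.5, 2.0)]
```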

To get this level of temporal detail, transcripts are created in Praat using the TextGrid feature (an example of Praat with transcription TextGrids is shown in Figure 5).
screen shot of Praat edit window
Figure 5: Praat Editor, showing Waveform, Spectrogram, and TextTiers
These transcripts are very simple compared to other discourse transcripts. Some markup is often included in the transcribed utterances: brackets ([...]) for speaker overlap (even though this is redundant, since overlap is determined and stored based on utterance timing), angled brackets (<...>) for non-linguistic noises, and slashes (/.../) for unsure transcriptions. This markup is intentionally kept to a minimum. The overarching goal of the design is to enable the text transcript to act as a quick and easy index to the audio. Since the audio is available with the transcript, and can even quickly be analyzed, listened to, or represented in spectrogram form, questions that arise based on the transcript can be investigated at the level of the recorded speech, and not one step removed from that.
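
The three markup conventions are simple enough to pick out with regular expressions. The sketch below is an illustration only: the function names and the sample line are invented, and it assumes the delimiters are never nested.

```python
import re

# One pattern per convention described above.
MARKUP = [
    ("overlap", re.compile(r"\[([^\]]*)\]")),  # [...] speaker overlap
    ("noise", re.compile(r"<([^>]*)>")),       # <...> non-linguistic noise
    ("unsure", re.compile(r"/([^/]*)/")),      # /.../ unsure transcription
]


def find_markup(text):
    """Return (kind, content) pairs for each marked-up span, in order of appearance."""
    found = []
    for kind, pattern in MARKUP:
        for m in pattern.finditer(text):
            found.append((m.start(), kind, m.group(1)))
    return [(kind, content) for _, kind, content in sorted(found)]


def plain_text(text):
    """Drop noise spans and strip the other delimiters, leaving only the words."""
    text = re.sub(r"<[^>]*>", "", text)
    text = re.sub(r"[\[\]/]", "", text)
    return " ".join(text.split())


line = "well [I was] <laugh> going to /Robeson/"
print(find_markup(line))
# [('overlap', 'I was'), ('noise', 'laugh'), ('unsure', 'Robeson')]
print(plain_text(line))
# well I was going to Robeson
```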

analysis

A number of corpus-like analysis features are currently under development. These work in a number of different ways, but (at present) all are derived from the structure of the transcripts and their relationship to the audio (cf. the transcripts section above). They range from text-based analyses of the transcript information - like a speech rate measuring algorithm that measures speakers' syllables-per-second rate - to phonetic analyses driven by information contained in the transcripts - such as the pitch analysis of speaker "Yvonne" (a pseudonym) in Figure 6.
screen shot of SLAAP Pitch Analysis
Figure 6: SLAAP Pitch Analysis (top of page)
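
A speech rate measure of this kind can be sketched from the transcript lines alone: count syllables in the text and divide by the time spent speaking. This page does not say how SLAAP counts syllables, so the vowel-run heuristic below is a crude stand-in chosen for brevity (it overcounts words with a silent final "e", for example), and both function names are hypothetical.

```python
import re


def estimate_syllables(text):
    """Crude estimate: one syllable per run of vowel letters, at least one per word."""
    words = re.findall(r"[a-z']+", text.lower())
    return sum(max(1, len(re.findall(r"[aeiouy]+", w))) for w in words)


def syllables_per_second(lines):
    """lines: iterable of (text, start, end) tuples; blank lines (pauses) are skipped."""
    syllables = 0
    seconds = 0.0
    for text, start, end in lines:
        if not text.strip():
            continue
        syllables += estimate_syllables(text)
        seconds += end - start
    return syllables / seconds if seconds > 0 else 0.0


lines = [
    ("we went to town", 0.0, 1.0),
    ("", 1.0, 2.0),  # a pause: contributes neither syllables nor speaking time
    ("it was open", 2.0, 3.0),
]
rate = syllables_per_second(lines)
```

Leaving pauses out of the denominator gives a rate over speaking time only; including them would give a rate over total elapsed time instead, and which one is wanted depends on the research question.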

To illustrate just how these analyses are conducted, let's delve a little further into the inner workings of the pitch analysis shown in Figure 6. The server software (written in PHP) first communicates with the MySQL database to determine which transcripts the speaker appears in. It then retrieves non-blank lines (i.e., utterances; blank lines are pauses) by that speaker from the transcripts that match the criteria of the user - such as having durations of a certain length (between 0.5 seconds and 4 seconds in the example in Figure 6) - and which have not been marked as lines to ignore. Users can remove lines that seem problematic from the analysis. For the pitch analysis, the orthographic transcription of the utterance isn't important for the software's analysis, but it is displayed to the user.
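
The line selection just described boils down to a filter over transcript lines. Here is a minimal sketch under stated assumptions: the Line record and its ignore flag are hypothetical names, the duration bounds are treated as inclusive (the page does not say), and in SLAAP this selection happens in PHP against the MySQL database rather than in memory.

```python
from dataclasses import dataclass


@dataclass
class Line:
    speaker: str
    start: float  # seconds
    end: float
    text: str
    ignore: bool = False  # set when a user removes a problematic line


def select_lines(lines, speaker, min_dur=0.5, max_dur=4.0):
    """Keep one speaker's non-blank, non-ignored lines within the duration range."""
    selected = []
    for line in lines:
        duration = line.end - line.start
        if (line.speaker == speaker
                and line.text.strip()  # blank lines are pauses
                and not line.ignore
                and min_dur <= duration <= max_dur):
            selected.append(line)
    return selected


transcript = [
    Line("Yvonne", 0.0, 1.75, "made-up utterance"),
    Line("Yvonne", 1.75, 2.0, ""),                        # pause
    Line("Yvonne", 2.0, 2.25, "yeah"),                    # too short
    Line("Interviewer", 3.0, 4.0, "and then"),            # different speaker
    Line("Yvonne", 5.0, 7.0, "noisy line", ignore=True),  # removed by a user
    Line("Yvonne", 8.0, 14.0, "a very long utterance"),   # too long
]
kept = select_lines(transcript, "Yvonne")
```

Only the first line survives the filter here; its start and end times are what would be handed on to the Praat script.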
screen shot of Emacs showing Praat script for pitch analysis
Figure 7: Praat Script used for Pitch Analysis
The software then sends the start and end time for each line to be analyzed as arguments to a Praat script (shown in Figure 7), which is executed. The web server then "listens" as the Praat script outputs the analysis for each line. The SLAAP software displays each line, with its results, to the user (10 of the lines are displayed on the bottom third of Figure 6). It also determines overall summary statistics (mean, standard deviation, min, and max) and generates graphics for the results. Since the analysis in Figure 6 is done chronologically (by line, as opposed to randomly), one graph is the mean pitch by line in order to help highlight trends over time or, possibly, sections of speech that have pitch readings that stand out for further study. The second graph is a scatter plot, illustrating utterance duration vs. mean utterance pitch. There may not be much particularly noteworthy arising from the analysis shown in Figure 6, but it nonetheless, I hope, illustrates the possibilities of the analysis features.
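
The last step, turning per-line results into summary statistics, can be sketched as follows. The output format is an assumption: the script in Figure 7 is only shown as a screen shot, so the sketch imagines one "line id, tab, mean pitch in Hz" pair per output line. It also assumes the sample (n - 1) standard deviation, which the page does not specify. Lines where Praat reports "--undefined--" (no pitch found) are skipped.

```python
import statistics


def parse_pitch_output(output):
    """Parse hypothetical 'line_id<TAB>mean_pitch_hz' rows into (id, hz) pairs."""
    results = []
    for raw in output.splitlines():
        parts = raw.split("\t")
        if len(parts) != 2:
            continue
        line_id, value = parts
        try:
            results.append((int(line_id), float(value)))
        except ValueError:
            continue  # e.g. "--undefined--" when no pitch could be measured
    return results


def summarize(values):
    """Mean, sample standard deviation, min, and max of the per-line values."""
    return {
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values) if len(values) > 1 else 0.0,
        "min": min(values),
        "max": max(values),
    }


# Made-up output for four transcript lines; line 3 had no measurable pitch.
sample_output = "1\t200.0\n2\t210.0\n3\t--undefined--\n4\t220.0\n"
per_line = parse_pitch_output(sample_output)
summary = summarize([hz for _, hz in per_line])
```

Keeping the line id next to each value is what makes the chronological graph possible: the pairs can be plotted in transcript order as they are.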

Two aspects of the SLAAP system are particularly important for the analysis features. First, the fact that speakers are stored as discrete entities in the database allows for speaker-level analysis and comparison. Figure 6 showed results for "Yvonne". However, to retrieve an analysis for a different speaker, the user simply selects the speaker from the drop-down list. Second, analyses are conducted on the fly - as transcripts are added to the system for particular speakers the results of the analysis change automatically for those speakers, reflecting the new data. Presumably, the more data that gets added for a speaker, the more accurate the measurements become.

The analysis features are the most experimental aspect of the SLAAP software. They are being tested both methodologically (i.e., do they work as intended and are they accurate?) and theoretically (i.e., do they tell us anything important?) in order to determine their value. Nonetheless, they are an exciting aspect of the project.

Tyler Kendall
Raleigh, NC
January 9th, 2006

Cite: Kendall, Tyler (2006). Features of the North Carolina Sociolinguistic Archive and Analysis Project Software. Accessed 12/14/2024.


With thanks to the North Carolina State University Libraries, the NCSU College of Humanities and Social Sciences, the Language and Life Project, and the William C. Friday Endowment at NCSU for their support.  © Tyler Kendall
last mod: 10/30/2024