
Regionalism, Oppressed Voices, Sentiment Analysis

Pilot data presented at the Linguistic Society of America Winter Meeting, 2020, in New Orleans.
This work largely extends the dissertation work of my colleague, Andrea Sant, who also provided suggestions and insights about this work, along with directions for future improvement.

Abstract: Sentimental Importance of Place in Oppressed Voices

A major theme of 19th and early 20th century American writing is literary regionalism, in which features of local speech styles, customs, and landscape are prominent. During this time, major groups within the US, notably women and Black Americans, were struggling for social equality. Regionalist scholars (e.g., Fetterley & Pryse, 2003; Hartig, 2005) suggest that for female writers and Black writers of this period, landscape and location had additional importance, as a major aspiration of oppressed groups was to find a place where they could feel equal or empowered.

This analysis examines whether works by such writers use place-related language more frequently and with greater sentimental valence. This limited analysis compares a handful of notable works by such authors from this period (Roper, 1846; Craft, 1860; Jacobs, 1861; Jackson, 1884; Hopkins, 1902) against works from matched decades in the Corpus of Historical American English (COHA) (Davies, 2015). N-grams were pulled from COHA and calculated for each of the works (unigrams for frequency, 4-grams for sentiment), then filtered for n-grams containing locative terms (from Wiktionary’s Category:English locatives). The words on this list, such as ‘yonder’ and ‘home’, are likely to occur in place-based expressions. This analysis excludes specific places, as examining those would require close readings of the texts, which is not tenable for the corpus. Unigram frequencies were compared directly, and the remaining 4-grams were analyzed for sentimental valence using the R package sentimentr (Rinker, 2018). Absolute sentiment values were taken, and n-grams with a sentiment of 0 (“in the midst of”) were excluded, leaving a collection of valenced locative expressions (“perished in some unknown”).
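The filtering steps described above can be sketched in a few lines. This is a minimal illustration in Python, not the R workflow used for the analysis: the `LOCATIVES` set and `TOY_LEXICON` are tiny hypothetical stand-ins for the full locative list and for sentimentr's polarity scoring, and all function names are assumptions made for the example.

```python
import re

# Hypothetical stand-in for the locative word list (the real list is far longer).
LOCATIVES = {"yonder", "home", "here", "there"}

# Hypothetical toy polarity lexicon; the original analysis used sentimentr in R.
TOY_LEXICON = {"safe": 0.5, "perished": -0.75}


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def ngrams(tokens, n):
    """Return every contiguous n-gram as a tuple of tokens."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def locative_ngrams(text, n, locatives=LOCATIVES):
    """Keep only the n-grams that contain at least one locative term."""
    return [g for g in ngrams(tokenize(text), n)
            if any(tok in locatives for tok in g)]


def toy_score(phrase):
    """Toy sentiment: summed lexicon polarity divided by word count."""
    words = phrase.split()
    return sum(TOY_LEXICON.get(w, 0.0) for w in words) / len(words)


def valenced(grams, score=toy_score):
    """Take absolute sentiment and drop n-grams that score exactly 0."""
    scored = [(g, abs(score(" ".join(g)))) for g in grams]
    return [(g, s) for g, s in scored if s > 0]


grams = locative_ngrams("I long to be safe at home", 4)
print(valenced(grams))
```

Any scoring function that maps a phrase to a signed number can be passed as `score`, so the toy lexicon is easy to swap for a fuller scorer.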

Frequencies of locative use were compared using Wilcoxon signed-rank tests, excluding matches to blanks. Aside from Roper, these tests found no significant difference in use between the selected works and the works in the corpus (Roper vs. 1840s, V=548, p=.03; Craft vs. 1860s, V=447, p=.43; Jacobs vs. 1860s, V=337, p=.10; Jackson vs. 1880s, V=1211, p=.79; Hopkins vs. 1900s, V=530, p=.73). This suggests these works are largely typical of their eras. Non-parametric analysis would have been preferred for sentiment, which was bounded at 0. However, group comparison through the Wilcoxon-Mann-Whitney test becomes unfavorable at large sample sizes (e.g., Zimmerman, 2003), particularly when the sample sizes are uneven, as they are here. One-sided parametric tests were used instead, as distributions were near-normal and centered around approximately 0.4 on the range (0, 1.2). (Properly, the distributions are gamma distributions, bounded at 0. Student's t-distribution and gamma distributions are asymptotically related, and Student's t-distribution handles gamma cases well when n is greater than roughly 50. The thousands of samples here easily exceed that.) Each of the selected works scored significantly higher on location-related sentimentality than the matched writing from the period:
Note: For each of the following graphs, y = count data; individual works and corpus data are scaled to match each other for visual comparison.

Roper vs. 1840s
H1: Roper's utterances are more valenced than those of matched authors from the 1840s (one-sided t-test)
H0: No difference or less valenced; t(1103)=3.7, p<.001

Craft vs. 1860s
H0: No difference or less valenced; t(1714)=3.6, p<.001

Jacobs vs. 1860s
H0: No difference or less valenced; t(4943)=9.98, p<.001

Jackson vs. 1880s
H0: No difference or less valenced; t(9660)=6.5, p<.001

Hopkins vs. 1900s
H0: No difference or less valenced; t(4163)=5.7, p<.001
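For readers who want to see the shape of the one-sided comparisons above, here is a minimal Python sketch. It assumes Welch's unequal-variance form of the t-test, which the text does not specify, and it approximates the upper-tail p-value with the standard normal distribution, which is close to Student's t when the degrees of freedom run into the thousands, as they do here. The function name and sample values are hypothetical.

```python
import math
from statistics import NormalDist, mean, variance


def welch_one_sided(work, corpus):
    """One-sided test of H1: mean(work) > mean(corpus).

    Returns (t, df, approx_p). The p-value uses a normal approximation
    to the upper tail, so it is only suitable for large samples.
    """
    n1, n2 = len(work), len(corpus)
    v1 = variance(work) / n1      # squared standard error of the work's mean
    v2 = variance(corpus) / n2    # squared standard error of the corpus mean
    t = (mean(work) - mean(corpus)) / math.sqrt(v1 + v2)
    # Welch-Satterthwaite approximation to the degrees of freedom
    df = (v1 + v2) ** 2 / (v1 ** 2 / (n1 - 1) + v2 ** 2 / (n2 - 1))
    p = 1.0 - NormalDist().cdf(t)
    return t, df, p


# Hypothetical absolute-sentiment values, for illustration only
t, df, p = welch_one_sided([0.5, 0.6, 0.7, 0.8], [0.3, 0.4, 0.5, 0.6])
print(round(t, 3), round(df, 1), round(p, 4))
```

With samples as small as the illustrative ones, the normal approximation understates the p-value; an exact t-distribution tail would be needed there.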

Notably, although the selected writers by and large do not write about places more often than other writers of the period, writers from oppressed groups may place greater emotional stock in places, as other places could promise freedom. Though the current work is limited in scope, future analyses aim to examine lengthier n-grams and a wider swath of works, even extending the analysis to more recent feminist works and works by minority voices to evaluate whether place maintains special relevance.

Works Referenced

Category:English locatives. (n.d.). In Wiktionary. Retrieved May 28, 2019, from https://en.wiktionary.org/wiki/Category:English_locatives

Craft, W. (1860). Running a Thousand Miles for Freedom: The Escape of William and Ellen Craft from Slavery. E-text from: https://archive.org/details/runningthousandm00craf/page/n6

Davies, M. (2015). Corpus of Historical American English (COHA). Harvard Dataverse, V1. https://doi.org/10.7910/DVN/8SRSYK

Fetterley, J., & Pryse, M. (2003). Writing Out of Place: Regionalism, Women, and American Literary Culture. University of Illinois Press.

Hartig, A. S. (2005). Literary Landscaping: Re-reading the Politics of Places in Late Nineteenth-Century Regional and Utopian Literature (Doctoral dissertation, Miami University). http://rave.ohiolink.edu/etdc/view?acc_num=miami1133485531

Hopkins, P. E. (1902). Of One Blood, or The Hidden Self. The Colored Co-operative Publishing Company: Boston. Nov, 1902. E-text from: https://archive.org/details/HopkinsOfOneBlood

Jacobs, H. A. (1861). Incidents in the life of a slave girl: Written by herself. Boston Stenotype Foundry. E-text from: https://archive.org/details/incidentsinlifeo1861jaco

Jackson, H. H. (1886) Ramona: a story. Roberts Brothers: Boston. E-text from: https://archive.org/details/ramona02802gut

Rinker, T. (2018). Calculate Text Polarity Sentiment. R package: sentimentr, v.2.8.0. http://github.com/trinker/sentimentr

Roper, M. (1846). A Narrative of the Adventures and Escape of Moses Roper: From American Slavery. Berwick-upon-Tweed. E-text from: https://archive.org/details/narrativeofadven00roperich/

Zimmerman, D. W. (2003). A warning about the large-sample Wilcoxon-Mann-Whitney test. Understanding Statistics, 2(4), 267-280. https://doi.org/10.1207/S15328031US0204_03