Class 16 Notes

Page history last edited by Alan Liu 5 years, 4 months ago

Last Phase of Project Work

 


 

1. Preparing Public Presentation Pages for Project

 

 


2. Current State of our Corpora

 

  • 1880s Children's Fiction Corpus (134 works)
    • Subcategories:
      • All
      • European
      • American
      • Female
      • Male
      • British Female
      • British Male
    • Processed versions of works:
      • Full plain-text
      • "Scrubbed" (Jockers 2014 stoplist applied, punctuation removed except for internal apostrophes, numerals removed, converted to all lower-case)
      • "Scrubbed and chunked" (each work chunked into segments of 1,000 words)

 

  • 1880s British Adult Fiction Corpus (451 works)
    • Subcategories:
      • All female & male
      • Female
      • Male
    • Processed versions of works:
      • Full plain-text
      • "Scrubbed" (Jockers 2014 stoplist applied, punctuation removed except for internal apostrophes, numerals removed, converted to all lower-case)
      • "Scrubbed and chunked" (each work chunked into segments of 1,000 words)
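
The "scrubbed" and "scrubbed and chunked" versions described above could be produced roughly as in the sketch below. This is only an illustration, not the script the class actually used. The names `STOPLIST`, `scrub`, and `chunk` are hypothetical, the tiny stoplist is a stand-in for the much longer Jockers 2014 list, and the order of the steps (lower-casing before applying the stoplist) and the handling of a final segment shorter than 1,000 words are assumptions the notes do not settle.

```python
import re

# Stand-in stoplist for illustration only (assumption); the real
# Jockers 2014 stoplist is far longer and would be loaded from a file.
STOPLIST = {"the", "a", "and", "of", "to"}

# A token is a run of letters, optionally joined by internal straight
# apostrophes (e.g. "cat's"). Digits and all other punctuation are never
# matched, so they drop out of the result.
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)*")

def scrub(text, stoplist=STOPLIST):
    """Lower-case the text, keep only word tokens, and drop stoplist words."""
    tokens = TOKEN_RE.findall(text.lower())
    return [t for t in tokens if t not in stoplist]

def chunk(tokens, size=1000):
    """Split a token list into consecutive segments of `size` words.

    The last segment may be shorter than `size` (assumption: the notes
    do not say whether short remainders were kept or merged).
    """
    return [tokens[i:i + size] for i in range(0, len(tokens), size)]

if __name__ == "__main__":
    words = scrub("The Cat's 3 hats, and 'dogs'!")
    print(words)
    print([len(c) for c in chunk(["w"] * 2500)])
```

Note that this sketch treats only the straight apostrophe (') as an internal apostrophe; texts with typographic apostrophes would need those normalized first.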

 


3. Text Analysis Work

 

  • Team: Ginny, Sinead, Eve, Jennifer

 


4. Topic Modeling Work

 

 

 

 

 
