Word Essay Grading

Dear student,

I have just read your essay, and I must apologise – I have absolutely no idea what it said.

When you hold this essay in your hands in a few weeks’ time, I know that you will look immediately at the mark I’ve written at the top of the first page. You will make assumptions about yourself, your work – perhaps even your worth – based on this number. I want to tell you not to worry about it.

When I was a student, I assumed – as you probably do now – that my work was meticulously checked and appraised, with the due consideration it deserved, by erudite scholars who perhaps wore tweed.

I wonder now if it was actually marked by someone like me: a semi-employed thirtysomething on a zero-hours contract, sitting at home in pyjamas, staring at a hopeless pile of marking, as hopes of making it to the shops for a pint of milk today fade.

Your essay is one of 20 or so I’ve tackled in one sitting this afternoon. They are beginning to blur into one: a profusion of themes and things “to be noted” and endless variations on the phrase “It is interesting that...”.

I’m reading something you wrote on page two and I’m wondering if I just read an explanation of this concept on page one, or if that was in someone else’s essay. I have to go back a page, eyes swimming, and check.

Your essay does not stand alone, but becomes amalgamated with the others I’ve read so far today, all talking about the same things, with varying degrees of clarity. Your words are diluted by the ones that came before; they are lost on me even before I begin.

It should not be like this. In an ideal world, I would spend my morning carefully marking three essays at most, giving them the thought they deserve. I would spend the early afternoon wandering around a meadow picking flowers – something, anything, to clear my head so I can approach the next batch with a fresh outlook and enthusiasm.

But I do not have that kind of time. I have academic work of my own; I have a job interview to prepare for; at various points of the year, I have additional employment to help tide me over. (And I’m only a part-time lecturer; I’m aware that my colleagues in full-time jobs have a lot more of this to do.)

I have cleared this bit of space in my schedule to read your essays, and I have come at them genuinely excited to see what you have found out this term, and to tell you how you can improve. I try to be thorough and write actual comments on your essay, even though I’m aware that I could probably get away with a few ticks, question marks and a cryptic “needs improvement”.

I’ve been at it all day and it is 6.20 pm. There are 11 unmarked essays. I could carry on, but I can’t make sense of anything you say any more. I have to force myself to understand anything other than the clearest, nicest writing; the kind of writing that takes me by the hand and shows me round all your ideas. (Dear student, please note: I am not so exhausted that I can’t spot nice writing. Do us both a favour and spend time on your essay. Make it good. Edit, polish, relieve my boredom and let me award you a first.)

I know that I should go back and reread a few essays to compare the marks I’ve given, but there isn’t time. I would like to look up the references you cite, to tell you if there are other gems in those books you may have missed, or suggest other interpretations, but there’s no chance. I also have a life – washing to do, family to spend time with, that sort of thing.

In this letter (which I’ve written with an aching hand) I ask three things of you:

  • Work hard on your essays. Help people like me. It’ll open your mind, and it’ll make me happy. And I really, really want to give you a first.
  • Don’t think that if you just waffle on for three pages to bring your essay up to the required word count, I won’t notice. I will.
  • Do not get too upset – or complacent – because of whatever mark you’ve got. Don’t take it too personally. I’ve tried my best to be consistent and fair, and other lecturers will moderate my marking, but really, by a certain stage, I’m just pulling numbers out of the air. (55? 58? I don’t know.)

Your essay does not stand alone; it’s either going to impress me or sap my energy, and if it does the latter, it affects how I read the ones which come afterwards. Too many awful essays and I can’t concentrate any more.

The books on your reading list will tell you everything about the subject that you need to know; read them. There are also books in the library with titles like How to Write an Essay; make use of them. If you don’t understand something, come along to my office hour. I’ve gone on about it all term, and you know where that is.

All the best,

Your lecturer

The e-rater® automated writing evaluation engine is ETS's patented capability for automated evaluation of expository, persuasive and summary essays. The engine is used by multiple assessment programs and, in combination with human raters, scores the writing sections of the TOEFL iBT® and GRE® tests.
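
As an illustration of the combined human-plus-engine setup, here is a minimal sketch in Python of one common pattern: average the automated score with the human score when the two agree, and flag the response for a second human rater when they diverge. The discrepancy threshold and the averaging rule are assumptions made for this example; they are not ETS's published scoring rules for the TOEFL iBT or GRE tests.

```python
# Minimal sketch only: one way a machine score and a human score might be
# combined, with adjudication when they disagree. The threshold and the
# averaging rule are assumptions for illustration, not ETS's procedure.

def combine_scores(human: float, engine: float,
                   max_discrepancy: float = 1.0) -> tuple[float, bool]:
    """Return (reported_score, needs_adjudication).

    If the two scores are close, report their average; otherwise keep the
    human score and flag the response for a second human rater.
    """
    if abs(human - engine) > max_discrepancy:
        return human, True
    return (human + engine) / 2.0, False


if __name__ == "__main__":
    print(combine_scores(4.0, 4.5))  # (4.25, False) -> scores agree, average reported
    print(combine_scores(2.0, 4.5))  # (2.0, True)   -> flagged for adjudication
```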

The e-rater engine also provides the sole score in learning contexts, such as formative use in a classroom setting with ETS's Criterion® online essay evaluation system. In the Criterion application, the engine generates individualized feedback for students, addressing an increasingly important need for automated essay evaluation that is reliable, valid, fast and flexible.

The e-rater engine can also automatically detect responses that are off-topic or otherwise anomalous and, therefore, should not be scored.
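
As a rough illustration of what an off-topic check can look like, the sketch below flags a response whose vocabulary barely overlaps with its prompt, using a bag-of-words cosine similarity and an arbitrary threshold. This is a toy example: the engine's actual advisories rely on much richer models, and nothing here reflects its internal implementation.

```python
# Toy sketch of an off-topic advisory: flag essays whose vocabulary has very
# little overlap with the prompt. The threshold is arbitrary; real systems
# use far richer evidence than bag-of-words similarity.
import math
import re
from collections import Counter


def bag_of_words(text: str) -> Counter:
    """Lowercased word counts; punctuation is ignored."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a: Counter, b: Counter) -> float:
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def looks_off_topic(prompt: str, essay: str, threshold: float = 0.15) -> bool:
    """Flag an essay whose vocabulary barely overlaps with the prompt."""
    return cosine_similarity(bag_of_words(prompt), bag_of_words(essay)) < threshold


if __name__ == "__main__":
    prompt = "Do zoos help or harm wild animal populations?"
    print(looks_off_topic(prompt, "Zoos protect endangered animal populations."))  # False
    print(looks_off_topic(prompt, "My favourite recipe for banana bread."))        # True
```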

ETS has an active research agenda investigating automated scoring features for genres of writing beyond the traditional essay. This now includes source-based and argumentative writing tasks found on assessments, as well as lab reports and social science papers.

Below are some recent or significant publications by our researchers that highlight research in automated writing evaluation.

  • Supervised Word-Level Metaphor Detection: Experiments with Concreteness and Reweighting of Examples
    B. Beigman-Klebanov, C. W. Leong, & M. Flor
    Paper in Proceedings of the Third Workshop on Metaphor in NLP, pp. 11–20

    The authors discuss a supervised machine learning system that classifies all content words in a running text as either metaphorical or nonmetaphorical.

  • Automated Scoring of Picture-based Story Narration
    S. Somasundaran, C.-M. Lee, M. Chodorow, & X. Wang
    Paper in Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 42–48

    This paper describes an investigation of linguistically motivated features for automatically scoring a spoken picture-based narration task.

  • Scoring Persuasive Essays Using Opinions and their Targets
    N. Farra, S. Somasundaran, & J. Burstein
    Paper in Proceedings of the Tenth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 64–74

    In this work, researchers investigate whether the analysis of opinion expressions can help in scoring persuasive essays. Experiments on test taker essays show that essay scores produced using opinion features are indeed correlated with human scores.

  • Automated Writing Evaluation: A Growing Body of Knowledge
    M. Shermis, J. Burstein, N. Elliot, S. Miel, & P. Foltz. In C. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of Writing Research, 2nd Edition. Guilford Press

    The authors present automated writing evaluation in terms of the categories of evidence that are used to demonstrate that these systems are useful in teaching and assessing writing.

  • Automated Analysis of Text in Graduate School Recommendations
    M. Heilman, F. J. Breyer, F. Williams, D. Klieger, & M. Flor
    ETS Research Report No. RR-15-23

    This report explores evaluation of sentiment in letters of recommendation. Researchers developed and evaluated an approach to analyzing recommendations that involves (a) identifying which sentences are actually about the student; (b) measuring specificity; (c) measuring sentiment; and (d) predicting recommender ratings.

  • Patterns of Misspellings in L2 and L1 English: A View from the ETS Spelling Corpus
    M. Flor, Y. Futagi, M. Lopez, & M. Mulholland
    Bergen Language and Linguistics Studies, Vol. 6

    This paper presents a study of misspellings, based on annotated data from ETS’s spelling corpus. Researchers examined data from the TOEFL® and GRE tests and found that the rate of misspellings decreased as writing proficiency (essay score) increased for test takers in both testing programs.

  • Content Importance Models for Scoring Writing From Sources
    B. Beigman Klebanov, N. Madnani, J. Burstein, & S. Somasundaran
    Paper in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pp. 247–252

    This paper describes an integrative summarization task used in an assessment of English proficiency for nonnative speakers applying to higher education institutions in the United States. Researchers evaluate a variety of content importance models that help predict which parts of the source material the test taker would need to include in a successful response.

  • Using Writing Process and Product Features to Assess Writing Quality and Explore How Those Features Relate to Other Literacy Tasks
    P. Deane
    ETS Research Report No. RR-14-03

    This report explores automated methods for measuring features of student writing and determining their relationship to writing quality and other features of literacy, such as reading test scores. The e-rater automated essay-scoring system and keystroke logging are a central focus.

  • Predicting Grammaticality on an Ordinal Scale
    M. Heilman, A. Cahill, N. Madnani, M. Lopez, M. Mulholland, & J. Tetreault
    Paper in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pp. 174–180

    This paper describes a system for predicting the grammaticality of sentences on an ordinal scale. Such a system could be used in educational applications such as essay scoring.

  • An Explicit Feedback System for Preposition Errors based on Wikipedia Revisions
    N. Madnani & A. Cahill
    Paper in Proceedings of the Ninth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 79–88

    In this paper, the authors describe a novel tool they developed to provide automated explicit feedback to language learners based on data mined from Wikipedia revisions. They demonstrate how the tool works for the task of identifying preposition selection errors.

  • Difficult Cases: From Data to Learning and Back
    B. Beigman Klebanov & E. Beigman
    Paper in Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Short Papers), pp. 390–396

    This paper addresses cases in annotated datasets that are difficult to annotate reliably. Using a semantic annotation task, the authors provide empirical evidence that difficult cases can thwart supervised machine learning on the one hand and provide valuable insights into the characteristics of the data representation chosen for the task on the other.

  • Different Texts, Same Metaphors: Unigrams and Beyond
    B. Beigman Klebanov, C. Leong, M. Heilman, & M. Flor (2014)
    Paper in Proceedings of the Second Workshop on Metaphor in NLP, pp. 11–17

    This paper describes the development of a supervised learning system to classify all content words in a running text as either being used metaphorically or not.

  • Lexical Chaining for Measuring Discourse Coherence Quality in Test-taker Essays
    S. Somasundaran, J. Burstein, & M. Chodorow
    Paper in Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers, pp. 950–961, Dublin, Ireland, August 23–29, 2014

    Researchers investigated a technique known as lexical chaining for measuring discourse coherence quality in test-taker essays. In this paper, they describe the contexts in which they achieved the best system performance.

  • Applying Argumentation Schemes for Essay Scoring
    Y. Song, M. Heilman, B. Beigman Klebanov, & P. Deane
    Paper in Proceedings of the First Workshop on Argumentation Mining, pp. 69–78

    In this paper, the authors develop an annotation approach based on the theory of argumentation schemes to analyze the structure of arguments and implement an NLP system for automatically predicting where critical questions are raised in essays.

  • Handbook of Automated Essay Evaluation: Current Applications and New Directions
    M. D. Shermis & J. Burstein

    This comprehensive, interdisciplinary handbook, published by Routledge, reviews the latest methods and technologies used in automated essay evaluation (AEE).

  • Word Association Profiles and their Use for Automated Scoring of Essays
    B. Beigman Klebanov & M. Flor
    Paper in Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 1148–1158

    The authors describe a new representation of the content vocabulary in a text, which they refer to as "word association profile." The paper presents a study of the relationship between quality of writing and word association profiles.

  • Robust Systems for Preposition Error Correction Using Wikipedia Revisions
    A. Cahill, N. Madnani, J. Tetreault, & D. Napolitano
    In Proceedings of the Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pp. 507–517, Atlanta, Ga.

    This paper addresses the lack of generalizability in preposition error correction systems across different test sets. The authors then present a large new annotated corpus to be used in training such systems, and illustrate the use of the corpus in training systems across three separate test sets.

  • Detecting Missing Hyphens in Learner Text
    A. Cahill, M. Chodorow, S. Wolff, & N. Madnani
    In Proceedings of the Eighth Workshop on Innovative Use of NLP for Building Educational Applications, pp. 300–305, Atlanta, Ga.

    This paper presents a method for automatically detecting missing hyphens in English text.

  • The e-rater® Automated Essay Scoring System
    J. Burstein, J. Tetreault, & N. Madnani. In M. D. Shermis & J. Burstein (Eds.), Handbook of Automated Essay Evaluation: Current Applications and New Directions. New York: Routledge.

    This handbook chapter includes a description of the e-rater automated essay scoring system and its NLP-centered approach, and a discussion of the system's applications and development efforts for current and future educational settings.

  • A Fast and Flexible Architecture for Very Large Word n-gram Datasets
    M. Flor
    Natural Language Engineering, FirstView online publication, pp. 1–33

    This paper presents a versatile architecture for very large word n-gram datasets that features lossless compression and optimizes both speed and memory use.

  • Correcting Comma Errors in Learner Essays, and Restoring Commas in Newswire Text
    R. Israel, J. Tetreault, & M. Chodorow (2012)
    Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (NAACL-HLT), pp. 284–294
    Association for Computational Linguistics

    The authors present a system for detection and correction of the placement of commas in English-language sentences; it can likewise restore commas in well-edited newswire text.

  • On Using Context for Automatic Correction of Non-Word Misspellings in Student Essays
    M. Flor & Y. Futagi
    Proceedings of the 7th Workshop on Innovative Use of Natural Language Processing for Building Educational Applications (BEA), pp. 105–115

    The authors discuss a new system for spell-checking that uses contextual information to perform automatic correction of non-word misspellings. The article relates how the system has been evaluated against a large body of TOEFL® and GRE® essays, which were written by both native and nonnative English speakers.

  • Using Parse Features for Preposition Selection and Error Detection
    J. Tetreault, J. Foster, & M. Chodorow
    Proceedings of the 48th Annual Meeting of the Association for Computational Linguistics (ACL 2010)
    Association for Computational Linguistics

    This paper evaluates the effect of adding parse features aimed at improving the detection of preposition errors in writing by speakers of English as a second language.

  • Progress and New Directions in Technology for Automated Essay Evaluation
    J. Burstein & M. Chodorow
    The Oxford Handbook of Applied Linguistics, 2nd Edition, pp. 487–497
    Oxford University Press

    This ETS-authored work is part of a 39-chapter volume that covers topics in applied linguistics with the goal of providing a survey of the field, showing the many connections among its subdisciplines, and exploring likely directions of its future development.

  • Using Entity-Based Features to Model Coherence in Student Essays
    J. Burstein, J. Tetreault, & S. Andreyev
    Human Language Technologies: The 2010 Annual Conference of the North American Chapter of the ACL, pp. 681–684
    Association for Computational Linguistics

    This paper describes a study in which researchers combined an algorithm for tracking what computational linguists refer to as entities (nouns and pronouns) with natural language processing features related to grammar errors and word usage, with the aim of creating applications that can evaluate evidence of coherence in essays; a minimal illustrative sketch of this idea follows the list.
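
As promised above, here is a minimal illustrative sketch of the entity-based coherence idea: entities are crudely approximated by non-stopword tokens, and coherence is scored as the share of adjacent sentence pairs that mention at least one such token in common. The stopword list, the tokenizer and the scoring rule are simplifications invented for this example; the published work relies on parsers and far richer features.

```python
# Toy sketch of entity-based coherence: treat non-stopword tokens as stand-ins
# for entities and measure how often adjacent sentences share one. This is an
# illustration only, not the feature set used in the paper above.
import re

STOPWORDS = {"the", "a", "an", "of", "in", "on", "to", "and", "is", "are",
             "it", "this", "that", "was", "were", "be", "as", "for", "with"}


def entities(sentence: str) -> set[str]:
    tokens = re.findall(r"[a-z]+", sentence.lower())
    return {t for t in tokens if t not in STOPWORDS}


def coherence_score(text: str) -> float:
    """Fraction of adjacent sentence pairs that share at least one 'entity'."""
    sentences = [s for s in re.split(r"[.!?]+\s*", text) if s.strip()]
    if len(sentences) < 2:
        return 1.0
    linked = sum(1 for s1, s2 in zip(sentences, sentences[1:])
                 if entities(s1) & entities(s2))
    return linked / (len(sentences) - 1)


if __name__ == "__main__":
    coherent = "The essay discusses zoos. Zoos protect rare species. These species often recover."
    scattered = "The essay discusses zoos. My holiday was great. Bananas are yellow."
    print(coherence_score(coherent))   # 1.0 -> each sentence picks up an entity from the last
    print(coherence_score(scattered))  # 0.0 -> no entity carries over between sentences
```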
