John Blake Japan Advanced Institute of Science and Technology Personalised statistical writing analysis

Personalised statistical writing analysis


DESCRIPTION

Powerpoint slides from JAECS, 2013, Sendai, Japan.


Page 1: Personalised statistical writing analysis

John Blake Japan Advanced Institute of Science and Technology

Personalised statistical writing analysis

Page 2: Personalised statistical writing analysis

Overview

• Introduction
  – context, impetus
  – focus, process
• Five aspects
  – statistical analysis
• Personalised writing analysis
  – sample extracts
• Interview survey
• Future direction

Page 3: Personalised statistical writing analysis

Context

• Proofreading for faculty
• Writing assistance for PhD candidates

70% 50% science

Page 4: Personalised statistical writing analysis

Impetus

21 email exchanges on various points, including:
• I would like to standardise on "minor scary incident".
• I would like to standardise on "minor scary incident" rather than "near miss".
• I asked the place we are submitting to. "Near accident" seems to be the usual term, so I have revised the text accordingly.
• I changed it to "near-miss incident". … The professor suggested that I follow the instructions.
• I changed every "Near miss incident" to "Near miss incidents".

From one research article (RA): minor scary incident / near-miss incident / ヒヤリ・ハット (hiyari-hatto)

Page 5: Personalised statistical writing analysis

Focus

Enable research articles to meet generic expectations of:
• Accuracy by being factually correct
• Clarity by avoiding ambiguity
• Formality by adopting appropriate style

Rhetorical structure, logic, originality, flawed method, etc. = important, but…

Page 6: Personalised statistical writing analysis

Five aspects of generic integrity

1. Vocabulary fit
2. Readability
3. Word type balance
4. Style and usage
5. Lexicogrammatical errors

Summary statistics

Bhatia, V. K. (1993). Analysing genre: Language use in professional settings. London: Longman.

Page 7: Personalised statistical writing analysis

Process for each research article

• Create target corpus (TC)
• Analyse RA and TC
• Identify errors in RA
• Compile ratios where possible
• Create feedback document

Page 8: Personalised statistical writing analysis

Five aspects

• Vocabulary fit: keyness of RA & TC
• Readability: readability statistics of RA & TC
• Word type balance: ratio of GSL, AWL and off-list words for RA & TC
• Style and usage: markedness, modality, register
• Lexico-grammar: vocabulary & grammatical errors

Page 9: Personalised statistical writing analysis

1. Vocabulary fit

Scott & Tribble (2006, p. 56): "keyness [is what a text] boils down to"
Hyland (2011): paper-journal fit

Hyland, K. (2011). Welcome to the Machine: Thoughts on writing for scholarly publication. Journal of Second Language Teaching and Research, 1 (1), 58–68.

Scott, M., & Tribble, C. (2006). Textual Patterns: Key Words and Corpus Analysis in Language Education. Amsterdam, Philadelphia: John Benjamins.

TC: firm, knowledge, market, international, foreign, performance, research, variables, markets, countries, export, country, relationship, business, model

RA: organizational, TMSs, coordination, DOPPO, expertise, interactions, mechanisms, BLOCK, employee, leader, team, coordinate, informal, information, management

Prepared using AntConc 3.2.4w with the Brown Corpus as reference.
TC = 243 RAs, c. 2.1 million words; RA = 10k words
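Keyword lists of this kind rank words by how much more frequent they are in one corpus than in a reference corpus. The following is a minimal Python sketch, assuming the log-likelihood statistic as the keyness measure (the slides do not say which measure was selected in AntConc). The function names and the idea of passing plain token lists are illustrative assumptions, not the tool's interface.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness for one word.

    a: frequency in the study corpus, b: frequency in the reference corpus,
    c: size of the study corpus, d: size of the reference corpus (in tokens).
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    total = 0.0
    if a > 0:
        total += a * math.log(a / expected_study)
    if b > 0:
        total += b * math.log(b / expected_ref)
    return 2 * total


def keywords(study_tokens, reference_tokens, top_n=15):
    """Return the top_n words that are unusually frequent in the study corpus."""
    study = Counter(study_tokens)
    ref = Counter(reference_tokens)
    n_study = sum(study.values())
    n_ref = sum(ref.values())
    scored = []
    for word, a in study.items():
        b = ref.get(word, 0)
        # Keep only positive keywords: relatively more frequent in the study corpus.
        if a / n_study > b / n_ref:
            scored.append((word, log_likelihood(a, b, n_study, n_ref)))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_n]
```

Running `keywords` once on the RA and once on the TC, each against the same reference corpus, would give two lists that can be compared side by side for vocabulary fit.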

Page 10: Personalised statistical writing analysis


[Word cloud of the RA (10k words), prepared using Wordle, shown alongside the TC keyword list]

TC: firm, knowledge, market, international, foreign, performance, research, variables, markets, countries, export, country, relationship, business, model

Page 11: Personalised statistical writing analysis

2. Readability

[Bar chart comparing Draft and Target on Gunning fog index, Flesch-Kincaid grade level and mean sentence length; axis from 0 to 25]

Bogert, J. (1985). In Defense of the Fog Index. Business Communication Quarterly, 48 (2), 9-12.
Gilquin, G., & Paquot, M. (2008). Too chatty: Learner academic writing and register variation. English Text Construction, 1 (1), 41-61.
McClure, G. (1987). Readability Formulas: Useful or Useless. IEEE Transactions on Professional Communication, 30 (1), 12-15.

Bogert (1985) & McClure (1987): factors affecting readability
Gilquin & Paquot (2008): learner academic writing is rather "chatty"
Research articles tend to have a higher reading difficulty.
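The three readability measures named on this slide can be computed from sentence, word and syllable counts. The sketch below is a rough Python illustration using the standard published formulas for the Gunning fog index and the Flesch-Kincaid grade level. The syllable counter is a crude vowel-group heuristic and the sentence splitter is naive; both are assumptions made for this example, so scores will differ somewhat from those of dedicated tools.

```python
import re


def count_syllables(word):
    """Rough heuristic: count vowel groups and discount a trailing silent 'e'."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def readability(text):
    """Return mean sentence length, Gunning fog index and Flesch-Kincaid grade."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z]+(?:[-'][A-Za-z]+)*", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = [count_syllables(w) for w in words]
    # "Complex" words for the fog index: three or more syllables.
    complex_words = sum(1 for s in syllables if s >= 3)
    mean_sentence_length = len(words) / len(sentences)
    return {
        "mean_sentence_length": mean_sentence_length,
        "gunning_fog": 0.4 * (mean_sentence_length
                              + 100 * complex_words / len(words)),
        "flesch_kincaid_grade": (0.39 * mean_sentence_length
                                 + 11.8 * sum(syllables) / len(words)
                                 - 15.59),
    }
```

Calling `readability` on the draft and on a sample of the target corpus gives the paired Draft and Target values that the chart compares.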

Page 12: Personalised statistical writing analysis

3. Word type balance

Levels | Academic text
1st 1000 | 73.5%
2nd 1000 | 4.6%
AWL | 8.5%
Other | 13.3%

[Pie chart: first 2k words 69%; AWL 16%; off-list 15%]

Cobb, T. (2013). Web Vocabprofile. www.lextutor.ca/vp/
Nation, I.S.P. (2001). Learning vocabulary in another language. Cambridge: Cambridge University Press.

Used in EAP courses at PolyU and CityU in Hong Kong

Nation (2001, p. 17)

RA analysed by WebVP classic v4 (Cobb, 2013)
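A word type balance is the share of running words that fall into each frequency band. The Python sketch below shows the arithmetic only, under stated assumptions: the two word sets passed in are toy stand-ins supplied by the caller, not the real GSL or AWL, and tokens are matched by surface form, whereas profilers such as WebVP match whole word families. The function name is hypothetical.

```python
import re
from collections import Counter


def word_type_balance(text, first_2k, awl):
    """Percentage of tokens in the first-2k list, the AWL, and neither (off-list)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        raise ValueError("text contains no words")
    counts = Counter()
    for token in tokens:
        if token in first_2k:
            counts["first_2k"] += 1
        elif token in awl:
            counts["awl"] += 1
        else:
            counts["off_list"] += 1
    return {band: round(100 * counts[band] / len(tokens), 1)
            for band in ("first_2k", "awl", "off_list")}


# Toy lists for illustration only (not the real GSL/AWL).
toy_first_2k = {"the", "people", "work", "first"}
toy_awl = {"research", "analysis"}
print(word_type_balance("The people first analysed the research data.",
                        toy_first_2k, toy_awl))
```

In the toy run, "analysed" counts as off-list because only the surface form "analysis" is in the list, which is why family-based matching matters in practice.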

Page 13: Personalised statistical writing analysis

4. Style and usage errors

Marked usage | Ratio | Suggestion
People provide first | 0:9 COCA | People first provide

Hyland (1998): hedging
Robb (2003): "Google as a quick 'n' dirty corpus tool"

Hyland, K. (1998). Hedging in scientific research articles. Amsterdam: John Benjamins.
Robb, T. (2003). Google as a quick 'n' dirty corpus tool. TESL-EJ, 7(2).

Corpora: IS, KS, MS, BNC, COCA, WAC

Page 14: Personalised statistical writing analysis

5. Lexicogrammatical errors

Grammatical or vocabulary errors

No. | Incorrect form | Correct form | Comment
1 | Taking account differences | Taking account of differences | preposition
2 | this study answers to two questions | this study answers two questions | answer to s.b. / answer s.th.
3 | former employee | a former employee | employee [singular]
4 | to participate to this study | to participate in this study | collocation (participate in)
5 | emphasis is given on XX | emphasis is placed on XX | collocation (give to / place on)
6 | for being responsible | to be responsible | general vs. specific purpose

Page 15: Personalised statistical writing analysis

Summary statistics


Based on requests for simple to understand evaluation

Caveat: subjective evaluations disguised as statistics

Page 16: Personalised statistical writing analysis

Personalised writing analysis

Selected statistics for subject 1

Readability | Yours | Target
Gunning fog index | 13.2 | 13.2
Mean sentence length | 15.49 | 19.37
Mean number of clauses/sentence | 1.19 | 1.54
Lexical density | 0.63 | 0.57

Word type balance | Yours % | Target %
1k words | 68.58 | 74.39
2k words | 6.69 | 5.29
AWL | 16.36 | 7.67
Off-list words | 8.36 | 12.65
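Lexical density is commonly defined as the proportion of content (lexical) words among all running words. The slides do not say how the figure was calculated, so the Python sketch below is only one possible approach: it treats every token that is not in a small, hand-picked function-word list as a content word. The list and the function name are assumptions for illustration; a part-of-speech tagger would give a more reliable count.

```python
import re

# Small illustrative function-word list; a real analysis would use a fuller one.
FUNCTION_WORDS = {
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "to", "for",
    "with", "by", "is", "are", "was", "were", "be", "been", "this", "that",
    "these", "those", "it", "its", "as", "at", "from", "not", "we", "they",
}


def lexical_density(text, function_words=FUNCTION_WORDS):
    """Share of tokens that are not function words (0.0 to 1.0)."""
    tokens = re.findall(r"[a-z]+", text.lower())
    if not tokens:
        raise ValueError("text contains no words")
    content = [t for t in tokens if t not in function_words]
    return len(content) / len(tokens)
```

Under this definition, a value of 0.63 would mean that about 63 of every 100 running words are content words.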

Page 17: Personalised statistical writing analysis

Personalised writing analysis

Selected statistics for subject 4

Style and usage

No. | Sentence | Ratio | Comment or correction
1 | minor scary incidents | 1:58,700 WAC | near-miss incidents
2 | falling-accident | 0:19 COCA | slips, trips and falls OR falling objects
3 | a medical examination by interview | 1:525 WAC; 0:1 COCA | a medical consultation
4 | According to sex | 1:18 WAC | According to the gender
5 | 175 indoor workers | n/a | Use One hundred and ….
6 | Tomio,T. (1995) proposes | n/a | Omit initials in in-text citations unless …

Page 18: Personalised statistical writing analysis

Personalised writing analysis

Selected statistics for subject 7

Style and usage

No. | Sentence | Ratio | Comment or correction
1 | people provide first their expertise … | 0:9 COCA | people first provide their expertise …
2 | XX also engage into XX | 1:9000 COCA | XX also engage in XX
3 | The XX structure limits become | n/a | Use "limits" for boundaries and "limitations" for restrictions/inabilities
4 | future studies are able to | n/a | Use "may be" to show uncertainty
5 | employee simultaneous participation | 0:5 WAC | simultaneous participation of employees

Page 19: Personalised statistical writing analysis

Interview survey

Interviewer = me
Subjects = 4 faculty, 1 PhD candidate
Nationalities = 3 Japanese, 2 non-Japanese
Number = 5 participants
Interview time = 30 minutes
Location = private office on campus
Dates of interview = Jun-Jul 2013

Semi-structured interviews, e.g.
"What revisions did you make to your paper since…?"
"How can I make the feedback more useful?"

Page 20: Personalised statistical writing analysis

Survey results

• Explanatory notes – too long
• Key word lists – couldn't understand
• Three readability scores – too complex
• Raw ratios – too difficult, e.g. 47:211,120; 1:4500

• Lexico-grammatical errors
• Word type balance
• Ratios for style and usage

Page 21: Personalised statistical writing analysis

Incremental improvements (made)

1. Create summary statistic scorecard
2. Use word tag cloud for vocabulary fit
3. Shorten explanatory notes
4. Simplify and approximate ratios
5. Show word type balance graphically with percentages
6. Select "most useful" readability measure(s) – mean sentence and word length?
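Simplifying and approximating a ratio (improvement 4) can be done by scaling the first term to 1 and rounding the second to a couple of significant figures. The Python sketch below is one way to do this; the function names and the choice of two significant figures are assumptions, and it expects the marked usage to be no more frequent than the alternative.

```python
import math


def round_sig(x, sig_figs=2):
    """Round a positive number to the given number of significant figures."""
    magnitude = math.floor(math.log10(abs(x)))
    factor = 10 ** (magnitude - sig_figs + 1)
    return round(x / factor) * factor


def simplify_ratio(marked, alternative, sig_figs=2):
    """Turn raw corpus counts such as 47:211,120 into an approximate 1:N ratio.

    Assumes 0 <= marked <= alternative and alternative > 0.
    """
    if marked == 0:
        # No hits for the marked usage: report the raw counts unchanged.
        return f"0:{alternative}"
    approx = round_sig(alternative / marked, sig_figs)
    return f"1:{approx:,.0f}"


print(simplify_ratio(47, 211120))  # 1:4,500
print(simplify_ratio(0, 9))        # 0:9
```

A raw ratio of 47:211,120 is about 1:4,492, which reads more easily as roughly 1:4,500.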


Page 22: Personalised statistical writing analysis

Future developments

• Integration of metrics into a one-stop online portal (thanks to reviewer for idea) for researchers to submit drafts
• Statistical comparison of draft and published versions to evaluate success of feedback

Page 23: Personalised statistical writing analysis

Any questions, suggestions or comments?

John Blake [email protected]