XWindows apps: emacs, xkwic

DESCRIPTION

XWindows apps: emacs, xkwic. LING 5200 Computational Corpus Linguistics. Martha Palmer. February 9, 2006.

Transcript

  • XWindows apps: emacs, xkwic
    LING 5200 Computational Corpus Linguistics
    Martha Palmer
    February 9, 2006

  • Emacs
    emacs -nw

    Control x, Control c: exit (C-x C-c)
    Control x, Control s: save (C-x C-s)
    Control x, Control v: visit (C-x C-v)
    Apropos

  • Emacs: Hour 12 in book
    emacs -nw

    Control x, b: switch to another buffer
    Control x, Control b: show all buffers
    Control x, 1: show just one window
    Control g: cancel the current command
    Control h: help (works on verbs?)

  • Preparing to run xkwic: modifying your .cshrc
    Don't forget to make a back-up copy of your .cshrc file before editing it.

  • Preparing to run xkwic: modifying your .cshrc
    ls -a
    Create an alias for cp to make it prompt you before blowing away a file:
    alias cp 'cp -i'

  • Echo command
    Don't forget to make a back-up copy of your .cshrc file before editing it.

    You can check the value of an environment variable by using the echo command. Try it now: Enter echo $TGREP_CORPUS. What do you see? You shouldn't see anything, because you haven't defined the TGREP_CORPUS variable. If you do see something, ask for help.

  • Add to .cshrc (see Lab 4)
    # xkwic stuff
    setenv CWBHOME /corpora2/imscorpus
    setenv CORPUS_REGISTRY $CWBHOME/registry
    setenv MANPATH $CWBHOME/man:$MANPATH
    setenv UIDPATH "/usr/local/ims-cwb/lib/X11/uid/%N/%U"
    # tgrep stuff
    #setenv TGREP_CORPUS /corpora/treebank2/tbl_075/tgrepabl/brwn_cmb.crp
    setenv TGREP_CORPUS /corpora/treebank2/tgrepabl/wsj_mrg.crp

  • The PATH variable
    One very important environment variable is the PATH variable. You can view the current value of your path variable by typing echo $PATH. As you can see, you already have a value defined. We're going to change it. Open your .cshrc file with a text editor (emacs .cshrc or pico -w .cshrc). Find a line that looks something like this:

  • The PATH variable (cont.)
    set path=($HOME/bin /usr/local/bin /usr/local/etc /usr/local/lang/bin /usr/ucb /bin /usr/bin /usr/sbin /usr/local/ssh/bin /usr/local/TeX/bin /usr/local/mh/bin /usr/local/elm/bin /usr/local/metamail/bin /usr/local/gnu/bin /usr/ucb /usr/openwin/bin /usr/local/X11/bin /usr/ccs/bin /etc . )

  • Adding PATHs
    Now you'll define some new environment variables in your .cshrc file. There are two ways to do it. One is to copy the following lines into your .cshrc file, either by hand or by copying and pasting from this web page. The other is to tail my .cshrc (/home/mpalmer/.cshrc) and append the output to your .cshrc (hint: >>). Don't forget to make a back-up copy of it first, and don't forget to source .cshrc afterwards!

  • A PATH for xkwic
    Now enter the string /usr/local/ims-cwb/bin before the period that precedes the closing parenthesis, so that it looks something like this:
    set path=($HOME/bin /usr/local/bin /usr/local/etc /usr/local/lang/bin /usr/ucb /bin /usr/bin /usr/sbin /usr/local/ssh/bin /usr/local/TeX/bin /usr/local/mh/bin /usr/local/elm/bin /usr/local/metamail/bin /usr/local/gnu/bin /usr/ucb /usr/openwin/bin /usr/local/X11/bin /usr/ccs/bin /etc /usr/local/ims-cwb/bin . )
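
The edit described on this slide (put the new directory just before the trailing "." entry) can also be expressed as a small list operation. The following is only an illustrative Python sketch, not part of the lab: the helper name add_before_dot is hypothetical, and the path list is shortened for readability.

```python
def add_before_dot(path_dirs, new_dir):
    """Return a copy of path_dirs with new_dir placed before a trailing '.'.

    Hypothetical helper for illustration; it does not touch any real .cshrc.
    """
    result = list(path_dirs)
    if new_dir in result:
        return result  # already present, nothing to add
    if result and result[-1] == ".":
        result.insert(len(result) - 1, new_dir)
    else:
        result.append(new_dir)
    return result


# Shortened stand-in for the long path list shown on the slide.
path = ["$HOME/bin", "/usr/local/bin", "/usr/bin", "/etc", "."]
new_path = add_before_dot(path, "/usr/local/ims-cwb/bin")

# Print the line in the same shape as the csh "set path" line.
print("set path=(" + " ".join(new_path) + " )")
```

Keeping "." last means the current directory is still searched only after all the named directories.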

  • Running xkwic
    Save your file, source it, and check the value of your path variable again. You should see /usr/local/ims-cwb/bin in it now (in addition to the rest of the stuff that was there before). You're now ready to run xkwic! Start it by entering xkwic at the command line.

  • Fire it up
    To start xkwic:
    $babel> xkwic &
    First step: select a corpus

  • Select a corpus

  • Select a corpus
    BNC is lemmatized
    Brown and WSJ aren't

  • Select a corpus and a search pattern
    Select the BNC corpus by clicking on the question mark next to the Search Space text field. Search for the word research with the query [word = "research"]. How many results do you get?

  • Word attribute

  • Output of a search: KWIC

  • Select a corpus and a search pattern
    Select the BNC corpus by clicking on the question mark next to the Search Space text field. Search for the word research with the query [word = "research"]. How many results do you get? Search for the lemma research with the query [lemma = "research"]. How many results do you get? Why the difference?
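
To see why the two counts can differ, here is a toy Python sketch. It does not run xkwic or touch the BNC: the six-token "corpus" and its lemma values are made up by hand for illustration, and count_hits is a hypothetical helper.

```python
# Each position carries several attributes, as in a lemmatized corpus.
# These tokens and lemmas are hand-made for illustration only.
toy_corpus = [
    {"word": "research",    "lemma": "research"},
    {"word": "Research",    "lemma": "research"},
    {"word": "researched",  "lemma": "research"},
    {"word": "researching", "lemma": "research"},
    {"word": "research",    "lemma": "research"},
    {"word": "funding",     "lemma": "funding"},
]


def count_hits(corpus, attribute, value):
    """Count positions whose given attribute equals value exactly."""
    return sum(1 for position in corpus if position[attribute] == value)


word_hits = count_hits(toy_corpus, "word", "research")
lemma_hits = count_hits(toy_corpus, "lemma", "research")

print(word_hits)   # 2
print(lemma_hits)  # 5
```

In this toy data the lemma search also picks up the capitalized form and the inflected forms, which the word search misses.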

  • Lemma attribute output
    Inflected forms
    Case differences

  • Regular expressions in attributes of a position

  • Searching with POS tags
    Search for tokens of research that are not verbs with the query [lemma = "research" & pos != "V.*"]. How many results do you get? Modify the display so that you can see the POS of all words: File -> Display Attributes -> Concordance -> Positional Attributes; highlight "word" and "pos", click "Update" and "Dismiss". What are two non-verb POS tags that research occurs with?

  • POS attribute

  • I am SOOO frustrated

  • Basic unit of xkwic: the position
    Attributes of a position:
      Word
      POS
      Lemma (BNC)
    Searching for "positions" by attribute

  • Multiple attributes of a position
    [word = "research" & pos = "NN1"]

  • Multiple attributes of a position
    [word = "research" & pos = "NN1"]
    Ampersand to connect the two attributes

  • Multiple attributes of a position
    [word = "research" & pos = "NN1"]
    Single pair of square brackets around all attributes of the single position

  • Negation
    [word = "research" & pos != "NN1"]
    = means "is" or "does match"
    != means "isn't" or "doesn't match"
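
The way & and != combine within a single position can be mimicked in a few lines of Python. This is a rough sketch rather than how xkwic is implemented: the tokens and tags are hand-made, satisfies is a hypothetical helper, and it assumes a pattern has to match the whole attribute value (hence re.fullmatch).

```python
import re


def satisfies(position, conditions):
    """True if the position meets every (attribute, operator, pattern) condition.

    "=" requires the pattern to match the whole value; "!=" requires that it not match.
    """
    for attribute, operator, pattern in conditions:
        hit = re.fullmatch(pattern, position[attribute]) is not None
        if (operator == "=") != hit:
            return False
    return True


# Hand-made tokens and tags, for illustration only.
toy_corpus = [
    {"word": "research",   "lemma": "research", "pos": "NN1"},
    {"word": "research",   "lemma": "research", "pos": "VVI"},
    {"word": "researched", "lemma": "research", "pos": "VVD"},
    {"word": "Research",   "lemma": "research", "pos": "NN1"},
]

# [word = "research" & pos = "NN1"]
noun_query = [("word", "=", "research"), ("pos", "=", "NN1")]
# [lemma = "research" & pos != "V.*"]
non_verb_query = [("lemma", "=", "research"), ("pos", "!=", "V.*")]

noun_hits = [p["word"] for p in toy_corpus if satisfies(p, noun_query)]
non_verb_hits = [p["word"] for p in toy_corpus if satisfies(p, non_verb_query)]

print(noun_hits)      # ['research']
print(non_verb_hits)  # ['research', 'Research']
```

All conditions inside one pair of brackets are checked against the same position, which is why they live in a single list here.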

  • Regular expressions in attributes of a position
    Wildcard: .
    Character classes: [word = "[Tt]he"]
    Grouping
    Alternation: |
    Quantifiers: Kleene star, Kleene plus
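
The operators listed on this slide can be tried out on a small word list with Python's re module. This is only a sketch: Python's regular expression dialect is not identical to the one xkwic uses, the word list is invented, and select is a hypothetical helper that assumes a pattern must match the whole word.

```python
import re

# Invented word list for illustration.
words = ["the", "The", "then", "research", "researches", "researched"]


def select(pattern):
    """Return the words that the pattern matches in full."""
    return [w for w in words if re.fullmatch(pattern, w)]


print(select("[Tt]he"))           # character class: ['the', 'The']
print(select("the."))             # wildcard, exactly one more character: ['then']
print(select("research(es|ed)"))  # grouping and alternation: ['researches', 'researched']
print(select("research.*"))       # Kleene star, zero or more: ['research', 'researches', 'researched']
print(select("research.+"))       # Kleene plus, one or more: ['researches', 'researched']
```

The star and plus lines differ only in whether the bare form research is allowed.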

  • Sequences of positions
    [lemma = "research"] [word = "the"]
    Each position gets its own set of square brackets

  • Sequences of positions
    [lemma = "research"] [word = "the"]
    A space between the positions

  • Regular expressions over positions
    Wildcard: []
    Any single position
    Quantifier: *
    [lemma = "research"] []* [word = "funding"]
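
A query over a sequence of positions can be thought of as a pattern walked across the token list. The Python sketch below is a simplification under stated assumptions: the sentence is invented, the function names are hypothetical, an empty condition set stands in for [], and the matcher returns the shortest match from each starting token without trying to reproduce xkwic's own matching strategy.

```python
import re


def position_matches(position, conditions):
    """An empty conditions dict plays the role of [], matching any position."""
    return all(re.fullmatch(pattern, position[attribute]) is not None
               for attribute, pattern in conditions.items())


def match_from(tokens, i, query):
    """Return the end index of a match starting at token i, or None."""
    if not query:
        return i
    conditions, starred = query[0]
    rest = query[1:]
    if starred:
        # Zero repetitions first, then consume one token and try again.
        end = match_from(tokens, i, rest)
        if end is not None:
            return end
        if i < len(tokens) and position_matches(tokens[i], conditions):
            return match_from(tokens, i + 1, query)
        return None
    if i < len(tokens) and position_matches(tokens[i], conditions):
        return match_from(tokens, i + 1, rest)
    return None


def find_all(tokens, query):
    """Collect the matched word sequences, one per starting token that matches."""
    hits = []
    for start in range(len(tokens)):
        end = match_from(tokens, start, query)
        if end is not None:
            hits.append([t["word"] for t in tokens[start:end]])
    return hits


# Invented sentence with hand-assigned lemmas.
tokens = [
    {"word": "They",       "lemma": "they"},
    {"word": "researched", "lemma": "research"},
    {"word": "the",        "lemma": "the"},
    {"word": "sources",    "lemma": "source"},
    {"word": "of",         "lemma": "of"},
    {"word": "funding",    "lemma": "funding"},
]

# [lemma = "research"] [word = "the"]
adjacent = [({"lemma": "research"}, False), ({"word": "the"}, False)]
# [lemma = "research"] []* [word = "funding"]
with_gap = [({"lemma": "research"}, False), ({}, True), ({"word": "funding"}, False)]

print(find_all(tokens, adjacent))  # [['researched', 'the']]
print(find_all(tokens, with_gap))  # [['researched', 'the', 'sources', 'of', 'funding']]
```

Each element of the query list corresponds to one pair of square brackets, and the starred wildcard absorbs however many tokens lie between the two anchored positions.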

  • Resources
    Laura is bugging me to make a CU Corpora page
    Like this: http://www.stanford.edu/dept/linguistics/corpora/cas-home.html

    TGREP http://www.stanford.edu/dept/linguistics/corpora/cas-tut-tgrep.html

  • Xkwic resources
    CQP home page: http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/
    CQP User's Manual: http://www.ims.uni-stuttgart.de/projekte/CorpusWorkbench/CQPUserManual/HTML/ (html version)