38
UNIVERSITY OF CALICUT Abstract Faculty of Engineering – Scheme & Syllabus of M.Tech Course in Computational Linguistics – approved – Implemented – with effect from 2011 admission – Orders issued. = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = General and Academic Branch – IV ‘E’ Section No.GA.IV/E1/8226/2011(3) Dated Calicut University P.O. 08.05.2012. = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = Read:- 1. U.O.No.GA.IV/E1/8226/2011 dated 25.01.2012. 2. Minutes of the meeting of Board of Studies in Engineering (P.G) held on 30.03.2012 Item.No.1) 3. Orders of Vice-Chancellor in the file of even No. dated 17.04.2012. 4. Letter from the Dean, Faculty of Engineering dated 23.04.2012. 5. Orders of Vice-Chancellor in the file of even No. dated 03.05.2012. O R D E R As per paper read as (1) above, a technical expert committee consisting of the following members were constituted to scrutinize the syllabus of the M.Tech Course in Computational Linguistics as few mistakes and omissions were noticed by the Board in the Syllabus forwarded by the Principal, Govt.Engineering College, Sreekrishnapuram, Palakkad and to submit the modified version to the University. a) Smt.Helen.K.J (Co-ordinator), Member, Board of Studies in Engineering (UG) Assistant Professor, Dept.of. Computer Science & Engineering, Govt.En- gineering College, Thrissur. b) Dr.Reghuraj.P.C, Professor, Dept.of.Computer Science & Engineering, Govt.Engineering College, Sreekrishnapuram, Palakkad – 678 633. c) Dr.K.Najeeb, Head, Dept.of.Computer Science & Engineering Govt.Engineering College, Sreekrishnapuram, Palakkad – 678 633. Vide paper read as 2 nd above, the meeting of Board of Studies in Engineering (P.G) held on 30.03.2012, vide item.No. 1 unanimously resolved to approve the Scheme & Syllabus of the M.Tech course in Computational Linguistics submitted by the Expert Committee. Vide paper read as 3 rd above, the Vice-Chancellor had ordered to seek the opinion of the Dean, Faculty of Engineering regarding the approval of the minutes of the meeting of the Board of Studies in Engineering (PG) held on 30.03.2012 The Dean, Faculty of Engineering vide paper read as 4 th above, recommended for the approval of the minutes of the meeting of the Board of Studies in Engineering (PG) held on 30.03.2012. Contd..2

Scheme & Syllabus of M.Tech Course in Computational Linguistics a

Embed Size (px)

Citation preview

UNIVERSITY OF CALICUT Abstract

Faculty of Engineering – Scheme & Syllabus of M.Tech Course in Computational Linguistics – approved – Implemented – with effect from 2011 admission – Orders issued.= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = =

General and Academic Branch – IV ‘E’ Section

No.GA.IV/E1/8226/2011(3) Dated Calicut University P.O. 08.05.2012.= = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = = Read:- 1. U.O.No.GA.IV/E1/8226/2011 dated 25.01.2012. 2. Minutes of the meeting of Board of Studies in Engineering (P.G) held on 30.03.2012 Item.No.1) 3. Orders of Vice-Chancellor in the file of even No. dated 17.04.2012. 4. Letter from the Dean, Faculty of Engineering dated 23.04.2012. 5. Orders of Vice-Chancellor in the file of even No. dated 03.05.2012.

O R D E R

As per paper read as (1) above, a technical expert committee consisting of the following members were constituted to scrutinize the syllabus of the M.Tech Course in Computational Linguistics as few mistakes and omissions were noticed by the Board in the Syllabus forwarded by the Principal, Govt.Engineering College, Sreekrishnapuram, Palakkad and to submit the modified version to the University.

a) Smt.Helen.K.J (Co-ordinator), Member, Board of Studies in Engineering

(UG)

Assistant Professor, Dept.of. Computer Science & Engineering, Govt.En-

gineering College, Thrissur.

b) Dr.Reghuraj.P.C, Professor, Dept.of.Computer Science & Engineering,

Govt.Engineering College, Sreekrishnapuram, Palakkad – 678 633.

c) Dr.K.Najeeb, Head, Dept.of.Computer Science & Engineering

Govt.Engineering College, Sreekrishnapuram, Palakkad – 678 633.Vide paper read as 2nd above, the meeting of Board of Studies in Engineering (P.G) held on

30.03.2012, vide item.No. 1 unanimously resolved to approve the Scheme & Syllabus of the M.Tech course in Computational Linguistics submitted by the Expert Committee.

Vide paper read as 3rd above, the Vice-Chancellor had ordered to seek the opinion of the Dean, Faculty of Engineering regarding the approval of the minutes of the meeting of the Board of Studies in Engineering (PG) held on 30.03.2012

The Dean, Faculty of Engineering vide paper read as 4th above, recommended for the approval of the minutes of the meeting of the Board of Studies in Engineering (PG) held on 30.03.2012.

Contd..2

-2-

Considering the urgency of the matter, the Vice-Chancellor has accorded sanction to implement the Scheme & Syllabus of the M.Tech Course in Computational Linguistics, subject to ratification by the Academic Council, vide paper read as 5th above.

Sanction has therefore been accorded for implementing the Scheme & Syllabus of the M.Tech course in Computational Linguistics with effect from 2011 admission onwards.

Orders are issued accordingly.(The Syllabus is available in the University website)

Sd/- DEPUTY REGISTRAR (GA.IV) For Registrar.

To

The Principals of all affiliated Engineering Colleges offering M.Tech Course.Copy to :- P.S to V.C/PA. to PVC/ P.A. to Registrar/P.A to C.E/Enquiry/ Ex.Sn/EG Sn/DR,M.Tech /M.Tech.Tabulation Section/Dean, Faculty of Engineering/ Chairman, BOS in Engg (PG)&(UG) System Administrator (with a request to upload in the university website)/ SF/FC

Forwarded/By Order

Sd/- SECTION OFFICER

UNIVERSITY OF CALICUT

M.Tech. DEGREE COURSE

COMPUTATIONAL LINGUISTICS

Curricula, Scheme of Examinations and Syllabus

(With effect from 2011admissions)

M.Tech in Computational Linguistics

CURRICULUM

FIRST SEMESTER

Subject Code Name of the Subject

Hours/Week Marks

L T P/DIntern

alSem-end

Total Mark

s

Sem-end

exam duration - Hrs.

Credits

MCL10 101Mathematical Foundations of Computer Science

3 1 0 100 100 200 3 4

MCL10 102 Natural Language Processing

3 1 0 100 100 200 3 4

MCL10 103 Computational Linguistics-I 3 1 0 100 100 200 3 4

MCL10 104 Speech Processing 3 1 0 100 100 200 3 4

MCL10 105 Elective I 3 1 0 100 100 200 3 4

MCL10 106(P)

Seminar I0

02

100 0 100 - 2

MCL10 107(P)

Computational Linguistics Lab I 0

0 4 100 0 100 - 2

Total 15 5 700 500 1200 24Note: Remaining 6 hours / week is meant for departmental assistance by students

SECOND SEMESTER

Subject Code Name of the Subject

Hours/Week Marks

L T P/DInternal

Sem-end

Total Marks

Sem-end

exam duration – Hrs.

Credits

MCL10 201Statistical Natural Language Processing

3 1 0 100 100 200 3 4

MCL10 202 Algorithms and Complexity

3 1 0 100 100 200 3 4

MCL10 203 Data and Text Mining 3 1 0 100 100 200 3 4

MCL10 204 Elective II 3 1 0 100 100 200 3 4

MCL10 205 Elective III 3 1 0 100 100 200 3 4

MCL10 206(P) Seminar II0

02

100 0 100 - 2

MCL10 207(P)Computational Linguistics Lab II 0

0 2 100 0 100 - 2

Total 15 5 4 700 500 1200 24

Note: Remaining 6 hours / week is meant for departmental assistance by students

List of Electives

Subject Code Name of the SubjectEquivalent Subject in M.Tech(Computer Sc&Engg)

Elective IMCL10 105(A)

Computational Intelligence MCS10 105(A)

MCL10 105(B)

Information Retrieval and Extraction

MCL10 105(C)

Theories of Syntax and Semantics

MCL10 105(D)

Special Architectures

Elective IIMCL10 204(A)

Machine Learning

MCL10 204(B)

Machine Translation

MCL10 204(C)

Information Theory MCS10 204(C)

Elective IIIMCL10 205(A)

Pattern Recognition

MCL10 205(B)

Software Architecture and Project Management

MCL10 205(C)

Soft Computing MCS10 205(C)

III Semester

Subject Code Name of the Subject

Hours/Week Marks

L T P/D InternalSem-end

Total Marks

Sem-end

exam duration- Hrs.

Credits

MCL10 301 Elective IV 3 1 0 100 100 200 3 4

MCL10 302 Elective V 3 1 0 100 100 200 3 4MCL10 303(P) Industrial Training 0 0 0 50 - 50 - 1

MCL10 304(P)Master ResearchProject Phase I

0 0 22 Guide

EC*

150 150- 300 - 6

Total 6 2 22 550 200 750 15

NB: The student has to undertake the departmental work assigned by HoD

*EC – Evaluation Committee

IV Semester

Subject Code

Name of the Subject

Hours/Week Internal Marks

Sem-end exam

duration– Hrs.

L T P/D Guide

Evaluation

Committee

Extl. Guid

e

Viva Voce

Total Mark

s

Credits

MCL10 401(P)

Project Work - - 30 150 150 150 150 600 12

Total 30 150 150 150 150 600 12

NB: The student has to undertake the departmental work assigned by HoD

Elective IV

MCL10 301(A) MultimediaMCL10 301(B) Research Methodology (Common with MCS10 301(B))

MCL10 301(C) Knowledge based System Design

Elective V

MCL10 302(A) Advanced Computational LinguisticsMCL10 302(B) Semantic Web ArchitectureMCL10 302(C) Indian Theories of Meaning

SYLLABUS

SEMESTER 1

MCL10 101 Mathematical Foundations of Computer Science

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the mathematical theory relevant to Computational Linguistics.

Module I (12 hrs)Review of Formal Languages and Automata Theory: Mathematical notion of a language and language recognizers-Types of languages-Chomsky Hierarchy-Grammars, languages and their recognizers-notions of computabilityModule II (12hrs)Linear Algebra: Matrix operations, Eigen values and Eigen vectors, LU decomposition, Singular Value decomposition, Review of Vector AlgebraModule III (15 hrs)Probability: Markov Chains: Definition, Examples, Transition Probability Matrices of a Markov Chain, Classification of states and chains, Basic limit theorem, Limiting distribution of Markov chains. Continuous Time Markov Chains: General pure Birth processes and Poisson processes, Birth and death processes, Finite state continuous time Markov chainsBayes’ Theorem and applicationsInformation Theory Essentials: Entropy and Cross Entropy-relation to language-Mutual informationModule IV (13 hours)Essential statistics: Random variables and Standard distributions-Poisson distribution-Term distribution models-k mixture, Statistical Estimators.

References:1. Martin: Theory of Computation. McGraw Hill.2. Automata, Languages, and Formal Languages. Aho and Hopcroft. Narosa.3. E. Kreizig: Advanced Engineering Mathematics. Wiley.4. S. M. Ross, Introduction to Probability Models, Harcourt Asia Pvt. Ltd. and

Academic Press.5. Oliver C. Ibe: Fundamentals of Applied Probability and Random Processes.

Academic Press/Elsevier. 2007.6. John B Thomas, An Introduction to Applied Probability and Random Processes, John

Wiley & Sons.7. Cover, T. M. and J. A. Thomas: Elements of Information Theory. Wiley. 1991. 8. Charniak, E.: Statistical Language Learning. The MIT Press. 1996.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.

End semester Examination:100 marks

Question patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 102 Natural Language Processing

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the basics of Language processing from algorithmic viewpoint.

Module I (12 hrs) Introduction to Natural Language Understanding: Linguistic Background-An Outline of English Syntax-Grammars and Parsing-Features and Augmented Grammars.

Module II (14 hrs)Grammars for Natural Language: Toward Efficient Parsing, Ambiguity Resolution: Statistical Methods.

Module III (14 hrs)Semantics and Logical Form: Linking Syntax and Semantics-Ambiguity Resolution-other Strategies for Semantic Interpretation-Scoping and the Interpretation of Noun Phrases.

Module IV (12 hrs)Knowledge Representation and Reasoning-Local Discourse Context and Reference-Using World Knowledge-Discourse Structure-Defining a Conversational Agent.References

1. Allen, James. Natural Language Understanding. The Benjamin/Cummings Publishing Company, Inc., Redwood City, CA. 1995.

2. Charniak, Eugene: Introduction to Artificial intelligence, Addison-Wesley, 1984.3. Bates, M. (1995). Models of Natural language understanding. Proceedings of the

National Academy of Sciences of the United States of America, Vol. 92, No. 22 (Oct. 24, 1995), pp. 9977-9982.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 103 Computational Linguistics-I

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the fundamentals of Language processing from computational viewpoint.

Module I (14 hrs)Introduction: Words-Regular Expressions and Automata-Morphology and Finite-State Transducers-Computational Phonology and Pronunciation Modeling-Probabilistic Models of Pronunciation and SpellingSyntax: N-gram Models of Syntax - HMMs and Speech Recognition -Word Classes and Part-of-Speech Tagging - Context-Free Grammars for English - Parsing with Context-Free Grammars-Probabilistic Context-Free Grammars-Probabilistic CYK Parsing of PCFGs-

Learning PCFG Probabilities-Problems with PCFGs-Probabilistic Lexicalized CFGs-Dependency Grammars-Human ParsingModule II (14 hrs)Unification of Feature Structures: Feature Structures in the Grammar-Agreement-Head Features-Subcategorization-Long Distance Dependencies-Implementing Unification-Unification Data Structures-The Unification Algorithm-Unification Parsing-Integrating Unification into an Earley Parser-Types and Inheritance-Extensions to Typing Module III (12hrs)Semantics: Representing Meaning-Semantic Analysis-Lexical Semantics - Word Sense Disambiguation and Information RetrievalModule IV (12 hrs)Pragmatics: Discourse - Dialog and Conversational Agents -Natural Language Generation - Multilingual Processing - Machine Translation

References1. Jurafsky, D. and J. H. Martin, Speech and language processing: An Introduction to

Natural Language Processing, Computational Linguistics, and Speech Recognition, Prentice-Hall, 2000.

2. Charniak, E.: Statistical Language Learning. The MIT Press. 1996.3. J. Allen: Natural Language Understanding. Benjamin/Cummins.1995.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 104 Speech Processing

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the relevant theory and algorithms for processing human speech.

Module I (14 hrs)Introduction: The Role of Knowledge in Speech and Language Processing-Models and Algorithms-Language, Thought, and Understanding-The Turing Test-A Short History of Speech and Language Processing-Formal Languages and recognizersComputational Phonology and Pronunciation Modeling-Speech Sounds and Phonetic Transcription-The Vocal Organs-Consonants: Places of ArticulationConsonants: Manner of Articulation-Vowels-The Phoneme and Phonological Rules-Phonological Rules and TransducersModule II (14 hrs)Mapping Text to Phonemes for TTS-Pronunciation DictionariesBeyond Dictionary Lookup: Text Analysis-An FSA based Pronunciation LexiconEnglish Pronunciation Variation, Models of Pronunciation and Spelling--Introduction to Spell Checking-Spelling Error Patterns-Detecting Non-word Errors-Probabilistic Models-Applying the Bayesian Method to Spelling Correction-Minimum Edit DistanceModule III (12 hrs)The Bayesian Method for Pronunciation-Decision Tree Models of Pronunciation Variation-Weighted Automata-Computing Likelihoods: The Forward Algorithm-Decoding: The Viterbi Algorithm-Weighted Automata and SegmentationModule IV (12 hrs)Introduction to Speech Recognition-Overview of a Speech Recognition Architecture-Overview of Hidden Markov Models-The Viterbi Algorithms-Advanced Methods of Decoding-A* Decoding-Acoustic Processing of Speech-Sound Waves-Interpreting a Waveform-Spectra-Feature Extraction-Computing Acoustic Probabilities-Training a RecognizerReferences1. Jurafsky, D. and J. H. Martin, Speech and language processing: An Introduction to Natural Language Processing, Computational Linguistics, and Speech Recognition, Upper Saddle River, NJ: Prentice-Hall, 2000.2. Jelinek, F.: Statistical Methods for Speech Recognition. The MIT Press. 1998. 3. Cover, T. M. and J. A. Thomas: Elements of Information Theory. Wiley. 1991.4. Charniak, E.: Statistical Language Learning. The MIT Press. 1996. Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marks

Module IVQuestion 7 : 20 marks, Question 8: 20 marks

ELECTIVESMCL10 105(A) Computational Intelligence

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the fundamentals of intelligent systems, their architecture and programming.Module I (16 hours)Introduction- problems in AI - AI applications - perception and action - representing and implementing action functions - production systems - networks - problem solving methods - forward versus backward reasoning - search in state spaces - state space graphs -uninformed search - breadth first search - depth first search - heuristic search - using evaluation functions - general graph-searching algorithm - algorithm A* - admissibility of A* - the consistency condition - iterative deepening A* - algorithm AO* - heuristic functions and search efficiency - alternative search formulations and applications - assignment problems - constraint satisfaction - heuristic repair - two agent games - the mini-max search - alpha beta procedure - games of chanceModule II (14 hours)Knowledge representation - the propositional calculus - rules of inference - proof techniques - semantics - soundness and completeness - the PSAT problem- resolution in propositional calculus - soundness of resolution - converting arbitrary wffs to conjunctions of clauses - resolution refutations - horn clauses - the predicate calculus - quantification - semantics of quantifiers - resolution in predicate calculus - unification - converting arbitrary wffs to clause form - using resolution to prove theorems - answer extraction - knowledge representation by networks - taxonomic knowledge - semantic networks - frames - scriptsModule III (12 hours)Neural networks - introduction - motivation - notation - the back propagation method - generalization and accuracy - reasoning with uncertain information - review of probability theory - probabilistic inference – Bayes’ networks - genetic programming - program representation in GP - the GP process - communication and integration - interacting agents - a modal logic of knowledge - communication among agents - speech acts - understanding language strings - efficient communication - natural language processing - knowledge based systems - reasoning with horn clauses - rule based expert systemsModule IV (10 hours)Programming in Prolog: Basics- Predicates – Representing rules, and facts-recursion-unification-control-the cut-Input/Output-the Fail predicate-dynamic databases-Applications to Natural Language ProcessingReferences

1. Nilsson N.J., Artificial Intelligence - A New Synthesis, Harcourt Asia Pvt. Ltd.2. Luger G.F. and Stubblefield W.A., Artificial Intelligence, Addison Wesley3. Elain Rich & Kevin Knight, Artificial Intelligence, Tata McGraw Hill4. Tanimotto S.L., The Elements of Artificial Intelligence, Computer Science Press

5. Carl Townsend- Turbo Prolog-BPB Publications

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.

End semester Examination:100 marks

Question patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 105(B) Information Retrieval and Extraction

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the basics of modeling and design off Information Retrieval systems and methods of information extraction.

Module I (12 hours)Introduction: Information versus Data Retrieval, IR: Past, present, and future. Basic concepts: The retrieval process, logical view of documents. Modeling: A Taxonomy of IR models, ad-hoc retrieval and filtering. Classic IR models: Set theoretic, algebraic, probabilistic IR models, models for browsing.Retrieval evaluation: Performance evaluation of IR: Recall and Precision, other measuresModule II (12 hours)Reference Collections such as TREC, CACM, and ISI data sets. Query Languages: Keyword based queries, single word queries, context queries, Boolean Queries, Query protocols, query operations.Text and Multimedia Languages and properties, Metadata, Text formats, Markup languages, Multimedia data formats, Text Operations. Indexing and searching: Module III (14 hours)Information Extraction: Basic Assumptions-Various Tasks-Chunking- Developing and Evaluating Chunkers-Entity Recognition-Topic Spotting-Slot Filling-Layered Finite State

Methods-Query Expansion-Question Answering SystemsModule IV (14 hrs)Word Sense Disambiguation and Information Retrieval: Selectional Restriction Based Approaches-Limitations of Selectional Restrictions-Robust Word Sense DisambiguationMachine Learning Approaches, Dictionary-Based ApproachesApplications to Information Retrieval:Term Weighting,Term Selection and Creation-Homonymy, Polysemy and Synonymy -Improving User Queries-Other Information Retrieval Tasks References

1.R. Baeza-Yates and B. R. Neto: Modern Information Retrieval:, Pearson Education, 2004.2.C.J. van Rijsbergen: Information Retrieval, Butterworths, 1979.3.C.D. Manning and H. Schutze: Foundations of Statistical natural Language Processing (Chapters 13, 14, and 15 only), The MIT Press, Cambridge, London. 2001.4.Information Retrieval: Algorithms and Heuristics (The Information Retrieval Series:2nd Edition): David A. Grossman and Ophir Frieder 5. W. Bruce Croft and John Lafferty (editors): Language Modeling for Information Retrieval, Kluwer Academic Publishers, 2003,The Kluwer International Series on Information Retrieval, Volume 13.6.Introduction to Information Retrieval:Christopher D. Manning, Raghavan, and Schutze

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests persubject. The assessment details are to be announced right at the beginning of the semesterby the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 105(C) Theories of Syntax and Semantics

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce various frameworks of syntactic and semantic theories of linguistics from computational viewpoint.

Module I (12 hrs)Universal grammar, X-bar-theory, thematic roles, transformations, minimalism, lexical-functional grammar (LFG), Head-driven phrase structure grammar-(HPSG), syntax and spoken language, functional grammar, cognitive grammar, construction grammar, grammar theory in first and second language acquisitionModule II (12 hrs)Semantics-Representing Meaning: Introduction-Computational Requirements for Representations-Meaning Structure of Language-Predicate Argument Structure-First Order Predicate Calculus-Elements of FOPC-The Semantics of FOPC-Variables and Quantifiers-Inference-Representing Linguistically Relevant Concepts-Categories-Events-Representing Time-Representing Beliefs-Related Representational Approaches-Alternative Conceptions of Meaning-Meaning as Action-Meaning and TruthModule III (14 hrs)Semantic Analysis: Introduction- Syntax-Driven Semantic Analysis-Semantic Augmentations to CFG Rules-Simple Semantic Attachments-Lambda Notation-Complex Terms-Quantifier Scoping and Complex Terms-Semantic Attachments for a Fragment of English-The Design of a Syntax-Driven Analyzer-Integrating Semantics into the Earley Algorithm-Idioms and Compositionality-Robust Analysis.Module IV (14 hrs)Lexical Semantics: Introduction-Relations among Lexemes and Their Senses-Homonymy-Polysemy-Synonymy-HyponymySemiotics, semantic change, tense and aspect, semantic roles, formal semantics, and cognitive semantics-Specific language impairment and aphasia.References

1. Akshar Bharathi, Vineet Chaitanya, and Rajeev Sangal: Natural Language Processing:A Paninian Perspective. Prentice Hall of India.

2. L. T. F. Gamut: Logic, Language, and Meaning, Volume 1: Introduction to Logic 3. Ivan A. Sag and Thomas Wasow: Syntactic Theory: A Formal Introduction. CSLI

Pub.4. Mary Dalrymple: Lexical Functional Grammar, Center for Linguistics and Philology,

MIT Press. 5. Bresnan, Joan. 2001. Lexical-Functional Syntax. Oxford: Blackwell Publishers.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marks

Module IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 105(D) Special ArchitecturesTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the modern trends in computer architecture for processing huge volumes of information, such as information retrieval, Language processing, etc.

Module I (13 hours): Cluster-based Computing: Overview -The Role of Clusters-Definition and Taxonomy-Distributed Computing-Limitations.Cluster Planning-Architecture and Cluster Software-Design Decisions-Network Hardware Network Software-Protocols-Distributed File Systems-Virtualization technologies-BenchmarksModule II (13 hours)Grid Computing: Introduction-Infrastructure of hardware and software-Main Projects and Applications-The Open Grid Forum.Grid Architecture: Resource Managers-Application Management-Grid Application Description Languages-Application Partitioning-Metascheduling-Mapping-Monitoring-Web Services-Grid PortalsModule III(13 hours) Cloud Computing: Introduction:Introduction to cloud computing, cloud architecture and service models, the economics and benefits of cloud computing, horizontal/vertical scaling, thin client, multimedia content distribution, multiprocessor and virtualization, distributed storage, security and federation/presence/identity/privacy in cloud computing, disaster recovery, free cloud services and open source software, and example commercial cloud services. Module IV (13 hours)Evolving Paradigms and Software Platforms: Hadoop, Map-Reduce Paradigm.Ontology Engineering: The role of Ontologies in the semantic web-Theoretical Foundations of Ontologies-Methods for building Ontologies-Languages for Building Ontologies-Ontology Tools- Ontology-based Applications

(Open Source implementations/toolkits of the above concepts may be introduced wherever necessary, as case studies)

References1. R. Buyya, (ed). High Performance Cluster Computing, Volume 1: Architectures and

Systems. Prentice-Hall, 1999.

2. R. Buyya, (ed). High Performance Cluster Computing, Volume 2: Programming and Applications, Prentice-Hall, 1999.

3. K. Dowd and C. Severance. High Performance Computing, 2nd ed. O'Reilly and Associates, 1998.

4. W. Gropp, E. Lusk, and A. Skjellum.Using MPI: Portable Parallel Programming with the Message Passing Interface. MIT Press, 1994.

5. T.L. Sterling, J. Salmon, D.J. Becker, and D.F. Savarese.How to Build a Beowulf: A Guide to the Implementation and Application of PC Clusters. MIT Press, 1999.

6. John W. Rittinghouse and James F. RansomeCloud Computing, Implementation, Management, and Security, CRC Press, 2010.

7. George Reese. Cloud Application Architectures, O’Reilly, 2009 8. David S. Linthicum, Cloud Computing and SOA Convergence in Your Enterprise: A

Step-byStep Guide, Addison Wesley, 2009 9. Kenneth Hess, Amy Newman, Practical Virtualization Solutions: Virtualization from

the Trenches, Prentice Hall, 200910. Peter Saint-Andre, Kevin Smith, Remko TronCon, XMPP: The Definitive Guide:

Building Real-Time Applications with Jabber Technology, O’Reilly 2009.11. Jeanna N. Mathews, Eli M. Dow, Todd Deshane, Wenjin Hu, Jeremy Bongio, Patrick

F. Wilbur, Brendan Johnson, "Running Xen: A Hands-On Guide to the Art of Virtualization", Prentice Hall, 2008.

12. Andrew S. Tanenbaum, Maarten van Steen, "Distributed Systems: Principles and Paradigms, 2nd Edition", Prentice Hall, 2006.

13. Gruber, T. A translation Approach to portable ontology specifications. Knowledge Acquisition. Vol. 5. 1993. 199-220.

14. Uschold, M.; Grüninger, M. ONTOLOGIES: Principles, Methods and Applications. Knowledge Engineering Review. Vol. 11; N. 2; June 1996.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL 10 106(P) Seminar ITeaching scheme: 2 hours practical per week Credits: 2

Objective:

To assess the debating capability of the student to present a technical topic. Also toimpart training to a student to face audience and present his/her ideas and thus creating self esteem and courage that are essential for an engineer. Each student is expected to present a seminar on a topic of current relevance in Computational Linguistics for about 45 minutes. They are expected to refer current research and review papers from standard journals like ACM, IEEE, JPDC, IEE etc. - at least three cross references must be used - the seminar report must not be the reproduction of the original paper. A committee consisting of at least three faculty members shall assess the presentation of the seminar and award marks to the students based on merits of topic of presentation. Each student shall submit two copies of a write up of the seminar topic. One copy shall be returned to the student after duly certifying it by the chairman of the assessing committee and the other will be kept in the departmental library. Internal continuous assessment marks are awarded based on the relevance of the topic, presentation skill, quality of the report and participation.

Internal Continuous Assessment (Maximum Marks-100)Presentation +Discussion : 60Relevance + Literature : 10Report : 20Participation : 10Total marks : 100

MCL10 107(P) Computational Linguistics Lab-I

Teaching scheme: 2 hours Practical per week Credits: 2

Objective: To familiarize the students with practical aspects of natural language processing.

(A subset of the following experiments shall be conducted)

Introduction to Computing with Python: Texts as Lists of Words, Computing with Language: Simple Statistics, Making Decisions and Taking Control -Accessing Text Corpora and Lexical Resources-conditional Frequency Distributions- Reusing Code-Lexical Resources-WordNet

Processing Raw Text- Accessing Text from the Web and from Disk- Strings: Text Processing at the Lowest Level-Text Processing with Unicode Regular Expressions for Detecting Word Patterns -Useful Applications of Regular Expressions-Normalizing Text-Regular Expressions for tokenizing Text-Segmentation-

Formatting: From Lists to Strings

References1.Frederick Jelinek: Statistical Methods for Speech Recognition (Language, Speech, and Communication) 2.Steven Bird, Ewan Klein, and Edward Loper: Natural Language Processing with Python3.James Allen: Natural Language Understanding. Benjamin/Cummins.1995.

SEMESTER 2

MCL10 201 Statistical Natural Language ProcessingTeaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the basics of statistical techniques for processing human languages and large text corpora..

Module I (10 hrs)Introduction-Mathematical Foundations: Elementary Probability theory and Information Theory, Linguistic Essentials, Corpus-Based Work, CollocationsModule II(15 hrs)Statistical Inference: n-gram Models over Sparse Data, Word Sense Disambiguation: Supervised and Dictionary based approaches, Lexical AcquisitionModule III (15 hrs)Markov Models: HMM fundamentals, Implementations, properties and variants, applications to Part-of-Speech Tagging, Probabilistic Context Free Grammars, Probabilistic ParsingModule IV (12 hrs)Applications: Statistical Alignment and Machine Translation-Clustering-Topics in Information Retrieval-Text Categorization

References1. C.D. Manning and H. Schubert: Foundations of Statistical Natural Language

Processing: MIT Press, 2001.2. J. Allen: Natural Language Understanding: Benjamin Cummins3. Charniak, E.: Statistical Language Learning. The MIT Press. 1996.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module I

Question 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 202 Algorithms and ComplexityTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4 Objective: To introduce the advanced concepts of data structures and algorithms with an insight into the complexity-related aspects of computing.

Module I: (13 hours)Analysis: RAM model – Notations, Recurrence analysis - Master's theorem and its proof- Amortized analysis - Advanced Data Structures: B-Trees, Binomial Heaps, FibonacciHeaps, Disjoint Sets, Union by Rank and Path CompressionModule II: (13 hours)Graph Algorithms and complexity: Matroid Theory, All-Pairs Shortest Paths, MaximumFlow and Bipartite Matching.Module III: (14 hours)Randomized Algorithms : Finger Printing, Pattern Matching, Graph Problems, AlgebraicMethods, Probabilistic Primality Testing, De-RandomizationModule IV: (14 hours)Complexity classes:NP-Hard and NP-complete Problems-Cook's theorem NP completeness reductions. Approximation algorithms – Polynomial Time and FullyPolynomial time Approximation Schemes. Probabilistic Complexity Classes,Probabilistic Proof Theory and Certificates.References1. Dexter Kozen, The Design and Analysis of Algorithms, Springer, 1992.2. T. H. Cormen, C. E. Leiserson, R. L. Rivest, Introduction to Algorithms, PrenticeHall India, 1990.3. S. Basse, Computer Algorithms: Introduction to Design and Analysis, AddisonWesley, 1998.4. U. Manber, Introduction to Algorithms: A creative approach, Addison Wesley,1989.5. V. Aho, J. E. Hopcraft, J. D. Ullman, The design and Analysis of ComputerAlgorithms, Addison Wesley, 1974.6. R. Motwani andP. Raghavan, Randomized Algorithms, Cambrdige University Press,1995.7. C. H. Papadimitriou, Computational Complexity, Addison Wesley, 19948. S.S. Skeina. The Algorithm Design Manual.Springer Verlag. 1998. Internal continuous assessment: 100 marks

Internal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 203 Data and Text MiningTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the relevant theoretical and practical aspects of data and text mining, with an emphasis on algorithms.

Module I (12 hours) Basic data mining tasks: Classification, Regression, Time Series Analysis, Prediction, Clustering, and Summarization, Sequence discovery.Introduction to data ware housing: OLAP, OLTP, Knowledge discovery in databases.Module II: (12 Hours)Data Mining Techniques: Statistical Perspective on Data Mining: Point Estimation, Models based on summarization, Bayes Theorem, Hypothesis testing, similarity measures, Application of Decision trees, Neural Networks and Genetic algorithms in data mining.Module III (12 hours) Core Topics in Data Mining: Issues in classification, statistical algorithms, Distance-based algorithms, Decision tree based algorithms, Neural Network-based algorithms, rule-based algorithms.Clustering: Similarity and distance measures, outliers, partitional, and hierarchical algorithmsModule IV (16 hrs) Association Rule Mining-Large Item sets: Basic algorithms, Comparison of approaches.Advanced topics: Generalized Association rules, multiple-level association rules, Quantitative rules, web mining, spatial Mining and temporal mining (Introduction only)

References1. Data Mining: Introductory and Advanced Topics-Margaret H. Dunham. 2004

(Pearson Education)

2. Data Mining: Concepts and Techniques-Jiawei Han and Micheline Kamber 2002. (Morgan Kauffman Publishers)

3. Principles of Data Mining : D. Hand, H. Mannila, and P. Smyth. 2004. (PHI).4. Data warehousing in the real world : A practical Guide for building decision support

systems-Sam Anahory and Dennis Thurray , 2000. (Addision Wesley)5. The Text Mining Handbook: Advanced Approaches in Analyzing Unstructured Data:

Ronen Feldman and James Sanger 6. Soumen Chakrabarti: Mining the Web: Discovering Knowledge from Hypertext Data.

Morgan-Kaufmann Publishers, 2002.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 204(A) Machine LearningTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the prominent methods for machine learning, such as supervised and unsupervised learning, connectionist and other architectures for machine learning, etc.

Module I (10 hours)Preliminaries - Introduction - Learning Input-Output Functions - Learning and Bias - Sample applications - Boolean Functions - Representation - Classes of Boolean Functions - Introduction to Neural Networks Module II (14 hours)Using Version Spaces for Learning - Version Spaces and Mistake Bounds - Version Graphs - Learning as Search of a Version Space - The Candidate Elimination Method - Neural Networks - Threshold Logic Units - Linear Machines - Networks of TLUs - Training Feedforward Networks by Backpropogation - Synergies Between Neural Network and Knowledge-Based Methods - Statistical Learning - Using Statistical Decision Theory -

Learning Belief Networks - Nearest-Neighbor MethodsModule III (14 hours)Decision Trees - Definitions - Supervised Learning of Univariate Decision Trees - Networks Equivalent to Decision Trees - Overfitting and Evaluation - The Problem of Replicated Subtrees - The problem of Missing Attributes - Comparisions - Inductive Logic Programming - Notations and Definitions - A Generic ILP Algorithm - Inducing Recursive Programs - Choosing Literals to Add - Relationship Between ILP and Decision Tree Induction - Computational Learning Theory - Notation and Assumptions for PAC Learning Theory - PAC Learning - The Vapnik-Chervonenkis Dimension - VC Dimension and PAC LearningModule IV (14 hours)Unsupervised Learning - Clustering Methods - Hierarchical Clustering Methods - Temporal-Difference Learning - Temporal Patterns and Prediction Problems - Supervised and Temporal-Difference Methods - Incremental computation of the (delta w)i - An experiment with TD Methods - Theoretical Results - Intra-Sequence Weight Updating - Delayed-Reinforcement Learning - The General Problem - Temporal Discounting and Optimal Policies - Q-Learning - Discussion, Limitations, and Extensions of Q-Learning - Explanation-Based Learning - Deductive Learning - Domain Theories - Evaluable Predicates - More General Proofs - Utility of EBL - Applications

References1.Ethem Alpaydın, Introduction to Machine Learning (Adaptive Computation and Machine Learning), MIT Press, 2004.2.Mitchell. T, Machine Learning, McGraw Hill, 19973.Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.4. Ryszard S. Michalski, Jaime G. Carbonell, Tom M. Mitchell, Machine Learning : An Artificial Intelligence Approach, Tioga Publishing Company, 1983.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 204(B) Machine TranslationTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the important approaches to the automatic translation between natural languages.

Module I (12 hrs)History-Translation process-Approaches-Rule-based-Statistical-Example based-Hybrid MT-Major issues-Disambiguation-Named entities-Applications-EvaluationModule II (12 hrs)Language Similarities and Differences-The Transfer Metaphor-Syntactic Transformations-Lexical TransferModule III (12 hrs)The Interlingua Idea: Using Meaning-Direct Translation-Using Statistical Techniques-Quantifying Fluency-Quantifying FaithfulnessModule IV (16 hrs)Statistical MT-Basis-Benefits-Word based translation-Phrase based translation- Syntax based translation-Challenges with statistical machine translation-Compound words- Idioms-Morphology-Different word orders-Syntax-Out of vocabulary (OOV) words

References1. Hutchins, W. John ; and Harold L. Somers. An Introduction to Machine Translation.

London: Academic Press. 1992.2. Allen, James: Natural Language Understanding. Benjamin/Cummins. 1995.3. C.D. Manning and H. Schutze: Foundations of Statistical Natural Language

Processing, MIT Press. 2001.Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 204 (C) Information TheoryTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the mathematical aspects of information theory .

Module 1: Entropy and Loss less Source coding (12 hours)Entropy- Memory less sources- Markov sources- Entropy of a discrete Random variable-Joint, conditional and relative entropy- Mutual Information and conditional mutual information- Chain relation for entropy, relative entropy and mutual Information-Lossless source coding- Uniquely decodable codes- Instantaneous codes- Kraft's inequality - Optimal codes- Huffman code- Shannon's Source Coding Theorem.Module II: Channel Capacity and Coding Theorem (14 hours)Asymptotic Equipartition Property (AEP)- High probability sets and typical sets- Method of typical sequence as a combinatorial approach for bounding error probabilities. Channel Capacity- Capacity computation for some simple channels- Arimoto-Blahut algorithm-Fano's inequality- Shannon's Channel Coding Theorem and its converse- Channels with feed back- Joint source channel coding Theorem.Module III Continuous Sources and Channels (14 hours)Differential Entropy- Joint, relative and conditional differential entropy-Mutual information- Waveform channels- Gaussian channels- Mutual information and Capacity calculation for Band limited Gaussian channels- Shannon limit- Parallel Gaussian Channels-Capacity of channels with colored Gaussian noise-Water filling.Module IV: Rate Distortion Theory (12 hours)Introduction - Rate Distortion Function - Properties - Continuous Sources and Rate Distortion measure - Rate Distortion Theorem - Converse - Information Transmission Theorem - Rate Distortion Optimization.

References1. T. Cover and Thomas, Elements of Information Theory, John Wiley & Sons 1991.2. Robert Gallager, Information Theory and Reliable Communication, John Wiley &

Sons.3. R. J. McEliece, The theory of information & coding, Addison Wesley Publishing Co.,

1977.4. T. Bergu, Rate Distortion Theory a Mathematical Basis for DataCompression PH Inc.

1971.5. R. W. Hamming. Coding and Information Theory. Prentice Hall Inc1980.6. Special Issue on Rate Distortion Theory, IEEE Signal Processing Magazine,

November 1998.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.

End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 205 (A) Pattern RecognitionTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the theoretical and practical aspects of pattern recognition, including the syntactic approaches.

Module I (12 hours)Introduction - introduction to statistical - syntactic and descriptive approaches - features and feature extraction - learning - Bayes Decision theory - introduction - continuous case - 2-category classification - minimum error rate classification - classifiers - Discriminant functions - and decision surfaces - error probabilities and integrals - normal density - Discriminant functions for normal densityModule II (12 hours)Parameter estimation and supervised learning - maximum likelihood estimation - the Bayes classifier - learning the mean of a normal density - general Bayesian learning - nonparametric technique - density estimation - Parzen windows - k-nearest neighbor estimation - estimation of posterior probabilities - nearest-neighbor rule – k nearest neighbour ruleModule III (12 hours)Linear discriminant functions - linear Discriminant functions and decision surfaces - generalised linear discriminant functions - 2-category linearly separable case - non-separable behaviour - linear programming procedures - clustering - data description and clustering - similarity measures - criterion functions for clustering.Module IV (16 hours)Syntactic approach to PR - introduction to pattern grammars and languages - higher dimensional grammars - tree, graph, web, plex, and shape grammars - stochastic grammars - attribute grammars - parsing techniques - grammatical inference

References1. Duda & Hart P.E, Pattern Classification And Scene Analysis, John Wiley2. Gonzalez R.C. & Thomson M.G., Syntactic Pattern Recognition - An Introduction,

Addison Wesley.

3. Fu K.S., Syntactic Pattern Recognition And Applications, Prentice Hall, Eaglewood cliffs

4. Rajjan Shinghal, Pattern Recognition: Techniques and Applications, Oxford University Press, 2008.

5. Christopher M. Bishop: Pattern Recognition and Machine Learning (Information Science and Statistics). Springer Pub.

6. Ethem Alpaydın, Introduction to Machine Learning (Adaptive Computation and Machine Learning), MIT Press, 2004.

7. Mitchell. T, Machine Learning, McGraw Hill, 19978. Christopher M. Bishop, Pattern Recognition and Machine Learning, Springer, 2006.9. David J. C. MacKay: Information Theory, Inference & Learning Algorithms.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 205(B) Software Architecture and Project ManagementTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the theoretical and practical aspects of pattern recognition, including the syntactic approaches to pattern recognition.

Module I (13 hours)Software Architecture - Foundations - Software architecture in the context of the overall software life cycle - Architectural Styles - CASE study of Architectures Designing, Describing, and Using Software Architecture - IS2000: The Advanced Imaging Solution - Global Analysis - Conceptual Architecture View - Module Architecture View -Styles of the Module Viewtype - Execution Architecture View, Code Architecture- View. Component-and-Connector Viewtype - Styles of Component-and-Connector Viewtype - Allocation Viewtype and Styles - Documenting Software Interfaces, Documenting Behavior - Building the Documentation Package.

Module II (13 hours)Archetypes and Archetype Patterns, Model Driven Architecture with Archetype Patterns. Literate Modeling, Archetype Pattern. , Customer Relationship Management (CRM) Archetype Pattern, Product Archetype Pattern, Quantity Archetype Pattern, Rule Archetype Pattern. Design Patterns, Creational Patterns, Patterns for Organization of Work, Access Control Patterns, Service Variation Patterns, Service Extension PatternsModule III (13 hours)Object Management Patterns Adaptation Patterns, Communication Patterns, Architectural Patterns, Structural Patterns, Patterns for Distribution, Patterns for Interactive Systems Adaptable Systems, Frameworks and Patterns, Analysis Patterns Patterns for Concurrent and Networked Objects, Patterns for Resource Management, Pattern Languages, Patterns for Distributed Computing. Module IV (13 hours)Defining EAI, Data-Level EAI, Application Interface-Level EAI. ,Method- Level EAI. , User Interface-Level EAI, The EAI Process - An Introduction to EAI and Middleware, Transactional Middleware and EAI, RPCs, Messaging, and EAI, Distributed Objects and EAI, Database- Oriented Middleware and EAI, Java Middleware and EAI, Implementing and Integrating Packaged Applications—The General Idea, XML and EAI, Message Brokers—The Preferred EAI Engine, Process Automation and EAI. Layering, Organizing Domain Logic, Mapping to Relational Databases.Reference Books

1. Ian Gorton. Essential Software Architecture, 1st edition, Springer, 2006.

2. Bob Hughes, Mike Cotterell, Software Project Management, 4th edition, Tata McGraw Hill, 2006.

3. Christine Hofmeister, Robert Nord, Deli Soni , Applied Software Architecture, Addison-Wesley Professional; 1st edition, 1999.

4. Erich Gamma, Richard Helm, Ralph Johnson, John Vlissides, Design Patterns: Elements of Reusable Object-Oriented Software, Addison-Wesley Professional; 1st edition.

5. Martin Fowler, Patterns of Enterprise Application Architecture, Addison- Wesley Professional, 2003.

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars or a combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 205(C) Soft ComputingTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To familiarize the students with the salient approaches in soft computing, based on artificial neural networks, fuzzy logic, and genetic algorithms.

Module I (13 hours)Introduction to artificial neural networks - biological neurons - Mc Culloch and Pitts models of neuron - types of activation function - network architectures - learning process - error-correction learning - supervised learning - unsupervised learning - single unit mappings and the perceptron - perceptron convergence theorem - method of steepest descent - least mean square algorithms - adaline/medaline units - multilayer perceptrons-backpropagation algorithmModule II (13 hours)Radial basis and recurrent neural networks - RBF network structure - covers theorem and the separability of patterns - RBF learning strategies - K-means and LMS algorithms - comparison of RBF and MLP networks - recurrent networks - Hopfield networks - applications of neural network - the XOR problem - traveling salesman problem - image compression using MLPs - character retrieval using Hopfield networksModule III (13 hours)Fuzzy logic - fuzzy sets - properties - operations on fuzzy sets - fuzzy relations - operations on fuzzy relations - the extension principle - fuzzy measures - membership functions - fuzzification and defuzzification methods - fuzzy controllers - Mamdani and Sugeno types - design parameters - choice of membership functions - fuzzification and defuzzification methods - applicationsModule IV (13 hours)Introduction to genetic algorithm and hybrid systems - genetic algorithms - natural

Teaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4 Objective: To familiarize the students with the fundamentals of different media types and the issues in processing multimedia data.

Module I (13 hours)Multimedia system organization and architecture - QOS architecture - multimedia distributed processing models - multimedia conferencing model - storage organization.Module II (13 hours)Psychoacoustics - digital audio and computer - digital representation of sound - audio signal processing (editing and sampling) - audio production - digital music - musical instrument synthesizer - MIDI protocolModule III (13 hours)Raster scanning principle - color fundamental - color video performance measurement - analog audio - stereo effect - MPEG and DVI technology - multimedia applications - toolkit and hyper application.Module IV (13 hours)Multimedia information system - operating system support middleware system service architecture - presentation services - user interface - file system and information and information model - presentation and anchoring file - Multimedia standards - role of standards - standardization issues - distributed multimedia systems.

References1.W.I. Grosky, R. Jain and R. Mehrotra, The Handbook of Multimedia Information System, PrenticeHall India.2.P. K. Andleigh and K. Thakrar, Multimedia Systems Design, Prentice Hall India. 3.M. J. Bunzal and S. K. Morriec, Multimedia Application Development, Tata McGraw Hill4.Rao, Bojkovic and Milovanovic, Multimedia Communication Systems, 5.R. Steinmetz and K. Nahrstedt, Multimedia Computing Communication and Application, Pearson Education

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IV

Question 7 : 20 marks, Question 8: 20 marks

MCL10 301(B) Research MethodologyTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To give the students a sound introduction to structured methodologies recommended while carrying out high end research.

Module I - Research Methodologies (12 Hrs)Introduction, Research and Scientific methods, Objectives and Motivation of Research,Criteria of Good Research, research Approaches, Significance of research, Type ofResearches, Research methods VS Methodology, Research problems, Defining a research problem, Research Design, Sampling DesignModule II – Data Collection and Analysis (13 Hrs)Collection of Primary Data, Observation method, Interview Method, Collection of datathrough Questionnaires and Schedules, Secondary Data, Processing operations, Statistics in research, Measures of central Tendency, Other methods of data collection, Collection of secondary data, Processing operations, Types of analysis, statistics in research, Dispersion, Asymmetry, relationship, Simple regression analysis, Partial correlationModule-III –Testing (14 Hrs)Hypothesis-I - Introduction, Testing of Hypothesis, Procedure for hypothesis testing,Flow diagram for hypothesis testing, Measuring the power of hypothesis test, Tests ofHypothesis, Hypothesis testing of Means, Proportions, Correlation Coefficients, Chisquare test, Phi Coefficient, Hypothesis-II - Introduction, Nonparametric, Distributionfree Tests, Sign tests, Fisher-Irwin test, Spearman’s Rank Correlation, Kendall’s Coefficient of concordanceModule-4 – Report (14 Hrs)Report writing – Introduction and Significant, Interpretation – Meaning, Techniques, and Precautions, Layout of research reports, Types of report, Mechanics and precautions of writing a research report, Computer role in research, computers and computer technology, computer system, Characteristics

References1. CR Kothari, Research Methodologies – Methods and Techniques, Second Edition, New Age International2. John W Best and James V Kahn, Research in Education, Fifth Edition, PHI,New Delhi3. Pauline V Young, Scientific Social Surveys and Research, Third Editions, PHINew York.Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.

End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 301(C) Knowledge based System DesignTeaching scheme: 3 hours lecture & 1 hour tutorial per week Credits:

4

Objective: To familiarize the students with the architecture and design principles of knowledge based systems.

Module I (13 hours)Introduction To Knowledge Engineering : The Human Expert And An Artificial Expert – Knowledge Base and Inference Engine – Knowledge Acquisition And Knowledge Representation.Module II (13 hours)Problem Solving Process; Rule Based Systems – Heuristic Classifications – Constructive Problem Solving – Tools For Building Expert Systems. Module III (13 hours) Case Based Reasoning – Semantics of Expert Systems – Modeling Of Uncertain Reasoning – Applications of Semiotic Theory; Designing For Explanation – Expert System Architectures.Module I V(13 hours)High Level Programming Languages: Logic Programming for Expert Systems - Machine Learning – Rule Generation And Refinement – Learning Evaluation – Testing And Tuning. References

1. Peter Jackson. Introduction to Expert Systems. 3rd Edition, Pearson Education 2007.2. Robert I. Levine, Diane E. Drang, Barry Edelson. AI and Expert Systems: a

comprehensive guide, C language, 2nd edition, McGraw-Hill 1990. 3. Jean-Louis Ermine: Expert Systems: Theory and Practice. 4th printing, Prentice-Hall

of India , 2001 4. Stuart Russell, Peter Norvig: Artificial Intelligence: A Modern Approach. 2nd

Edition, Pearson Education, 2007.5. N.P.Padhy: Artificial Intelligence and Intelligent Systems. 4th impression , Oxford

University Press, 2007.

Internal continuous assessment: 100 marks

Internal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 302(A) Advanced Computational LinguisticsTeaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the advanced concepts in computational linguistics, modern grammar formalisms, NL generation, etc. Module I (12hrs)Tree Adjoining Grammars-Dependency Grammars-Statistical Parsing-Introduction to Semantic Processing-Semantic Knowledge Representation, Deep Structure and Logical Form-Compositional Semantic Interpretation-Semantic Grammars-Case Frames and Case Frame based ParsingModule II (12 hrs)Natural Language Generation-Problems in NL Generation-Basic Generation Techniques-Hard Problems in NLP-Speech Understanding and Translation-Discourse ProcessingModule III (14 hrs)Lexical Functional Grammar: Active-Passive and Dative Constructions-Wh-movement in Questions-Overview of LFG-LFG Formalism-Well-formedness Conditions-Handling Wh-movement in Questions-Computational AspectsModule IV (14 hrs)Morphology and Finite State Transducers-Inflectional Morphology-Derivational Morphology-Finite State Morphological Parsing-The Lexicon and Morphotactics-Morphological Parsing with Finite State Transducers-Orthographic Rules and Finite-State Transducers-Combining an FST Lexicon and Rules-Lexicon-Free FSTs

References1. Alexander Clark , Chris Fox, and Shalom Lappin (Editors):The Handbook of

Computational Linguistics and Natural Language Processing (Blackwell Handbooks in Linguistics)

2. Akshar Bharathi, Vineet Chaitanya, and Rajeev Sangal: Natural Language

Processing:A Paninian Perspective. Prentice Hall of India.3. James Allen: Natural Language Understanding. Benjamin/ Cummins

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL 302(B) Semantic Web ArchitectureTeaching scheme: 3 hours lecture & 1 hour tutorial per week Credits: 4

Objective: To introduce the advanced concepts in emerging trends in Web architecture.

Module I (13 hours): Introduction to semantic web technology : Traditional web to semantic web – meta data- search engines .Module II (13 hours)Resource Description Framework –elements -rules of RDF – tools- RDFS core elements- Taxonomy and ontology concepts . Web ontology language: OWL: define classes- set operators –enumerations- defining properties – Validating OWL ontology.Module III (13 hours)Semantic web services and applications :Web services – web services standards – web services to semantic web services- UDDI- Concept of OWL-S – building blocks of OWL-S- mapping OWL-S to UDDI- WSDL-S overview Module IV (13 hours)Real world examples and applications : Swoogle- architecture and usage of meta data; FOAF – vocabulary – creating documents – overview of semantic markup – semantic web search engines. References

1. Liyang Yu .Introduction to the Semantic Web and Semantic web services. Chapman

& Hall/CRC, Taylor & Francis group, 2007.2. Johan Hjelm. Creating the Semantic Web with RDF. Wiley, 2001 3. Grigoris Antoniou and Frank van Harmelen. A Semantic Web Primer. MIT Press

Internal continuous assessment: 100 marksInternal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL10 302(C) Indian Theories of MeaningTeaching scheme: 3 hours lecture & 1 hour tutorial per week

Credits: 4

Objective: To introduce the students to the rich traditional India theories of semantics and their relevance to the modern linguistics.

Module I (14hrs)Introduction-Indian Theories of Logic,Theories of Inference: Nyaya Module II (14 hrs)Sphota Theory: Act of Speech-Conceptualization-Performance and Comprehension Module III (14 hrs)Dhwani Theory: Nature of Dhwani- Different Views on Dhwani-Comparison with Sphota-Primary and Secondary soundsModule IV (10 hrs)Karaka Theory

References1.K. Kunjunni Raja: Indian Theories of Meaning, The Theosophical Publishing House.20002.Coward, Harold G. The Sphota Theory of Language: A Philosophical Analysis. Delhi: Motilal Banarsidass Publishers Private Limited, 1997.3.Bimal K. Matilal: The sphota doctrine of the Indian Grammarians, Walter de Gruyter) 19924.Shah, K.J.: Bhartrihari and Wittgenstein’, in Perspectives on the Philosophy of Meaning

(Vol. I, No. 1. New Delhi.)1/1 (1990): 80-95.5.Akshara Bharathi, et.al: Natural Language Processing: A Paninian Perspective. PHI.6.Saroja Bhate, Johannes Bronkhorst (eds.), Bhartuhari- philosopher and grammarian: proceedings of the First International Conference on Bhartuhari, University of Poona, January 6-8, 1992, Motilal Banarsidass Publishers, 1997.7.Patnaik, Tandra, Śabda : a study of Bhartrhari's philosophy of language, New Delhi : DK Printworld, 1994.8. SphoTa, Dhavani and Pratibhaa, (thesis) A. Hota, University of Pune.

Internal continuous assessment: 100 marks

Internal continuous assessment is in the form of periodical tests, assignments, seminars ora combination of all whichever suits best. There will be minimum of two tests per subject. The assessment details are to be announced right at the beginning of the semester by the teacher.End semester Examination:100 marksQuestion patternAnswer any 5 questions by choosing at least one question from each module.Module IQuestion 1 : 20 marks, Question 2 : 20 marksModule IIQuestion 3 : 20 marks, Question 4: 20 marksModule IIIQuestion 5 : 20 marks, Question 6: 20 marksModule IVQuestion 7 : 20 marks, Question 8: 20 marks

MCL 10 303(P): Industrial TrainingTeaching scheme: 1 hour per week Credits: 1

The students have to undergo an industrial training of minimum two weeks in an industry during the semester break after second semester and complete within 15 calendar days from the start of third semester. The students have to submit a report of the training undergone and present the contents of the report before the evaluation committee constituted by the department. An internal evaluation will be conducted for examining the quality and authenticity of contents of the report and award the marks at the end of the semester.

Internal continuous assessment: Marks 50

MCL 10 304(P): MASTERS RESEARCH PROJECT (PHASE – I)Teaching scheme: 22 hours per week Credits: 6

Objective:

To improve the professional competency and research aptitude by touching the areas which otherwise not covered by theory or laboratory classes. The project work aims to develop the work practice in students to apply theoretical and practical tools/techniques to solve real life problems related to industry and current research.

The project work should be a project in computer science & engineering stream. The project work is allotted individually on different topics. The students shall be encouraged to do their project work in the parent institute itself. If found essential, they may be permitted to do their project outside the parent institute subject to the conditions in clause 10 of M.Tech regulations. Department will constitute an Evaluation Committee to review the project work. The Evaluation committee consists of at least three faculty members of which internal guide and another expert in the specified area of the project shall be two essential members. The student is required to undertake the masters research project phase-I during the third semester and the same is continued in the 4th semester (Phase-II). Phase-I consists of preliminary thesis work, two reviews of the work and the submission of preliminary report. First review would highlight the topic, objectives, methodology and expected results. Second review evaluates the progress of the work, preliminary report and scope of the work which is to be completed in the 4th semester.

Internal Continuous assessment:

First Review:

Guide 50 marksEvaluation Committee 50 marks

Second review:

Guide 100 marksEvaluation Committee 100 marks

Total 300 marks

SEMESTER 4

MCL10 401(P) : MASTERS RESEARCH PROJECT PHASE 2

Teaching scheme: 30 hours per week Credits: 12

Objectives: To improve the professional competency and research aptitude by touching the areas which otherwise not covered by theory or laboratory classes. The project work aims to develop the work practice in students to apply theoretical and practical tools/techniques to solve real life problems related to industry and current research.

Masters Research project phase-II is a continuation of project phase-I started in the third semester. Before the end of the fourth semester, there will be two reviews, one at middle of the fourth semester and other towards the end. In the first review, progress of the project work done is to be assessed. In the second review, the complete assessment (quality, quantum and authenticity) of the Thesis is to be evaluated. Both the reviews should be conducted by guide and Evaluation committee. This would be a pre qualifying exercise for the students for getting approval for the submission of the thesis. At least one technical paper is to be prepared for possible publication in journal or conferences. The technical paper is to be submitted along with the thesis. The final evaluation of the project will be external evaluation.

Internal Continuous assessment:

First review:

Guide 50 marksEvaluation committee 50 marks

Second review:

Guide 100 marksEvaluation committee 100 marks