Text Mining

Basics of text mining and its methods

  • Text Mining

    Definition: Text mining and text analytics are broad umbrella terms describing a range of technologies for analyzing and processing semi-structured and unstructured text data. The need is to "turn text into numbers" so powerful algorithms can be applied to large document databases.
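
One common way to "turn text into numbers" is a document-term matrix of word counts. The sketch below is only an illustration under simple assumptions: the helper names `tokenize` and `document_term_matrix` and the two sample sentences are made up for this example, and real pipelines usually add stop-word removal, stemming, or TF-IDF weighting.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def document_term_matrix(docs):
    """Return (vocabulary, matrix) where matrix[i][j] counts term j in doc i."""
    counts = [Counter(tokenize(d)) for d in docs]
    vocabulary = sorted(set().union(*counts))
    matrix = [[c[term] for term in vocabulary] for c in counts]
    return vocabulary, matrix

# Hypothetical sample documents, used only for illustration.
docs = [
    "Text mining turns text into numbers.",
    "Algorithms need numbers, not raw text.",
]
vocabulary, matrix = document_term_matrix(docs)
```

Each row of `matrix` is one document and each column is one vocabulary term, so the first row records that "text" occurs twice in the first sentence. Numeric rows like these are what clustering and classification algorithms take as input.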

  • Text mining areas and related fields

    We can broadly classify text mining techniques by the field each one relates to.

  • TM Classification

    Search & IR (Information Retrieval)

    Concept Extraction

    Information Extraction

    Document Clustering

    Document Classification

    Web mining

    NLP (computational linguistics)

  • Topics

    Inverted index (IR)

    Feature selection (classification)

    Link analysis (web mining)

    Sentiment analysis (classification)

    Named entity extraction (information extraction)

    Part-of-speech tagging (NLP)

    Synonym identification (concept extraction)

    Topic modeling (concept extraction)

    Question answering (NLP)

    Document clustering and classification (clustering, classification)

    Spam filtering (classification)
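
The inverted index listed under IR can be sketched in a few lines: it maps every term to the documents that contain it, so a query is answered by intersecting posting lists instead of scanning every document. The function names `build_inverted_index` and `search` and the sample documents are assumptions made for this illustration, not part of any particular library.

```python
import re
from collections import defaultdict

def build_inverted_index(docs):
    """Map each term to the sorted list of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in enumerate(docs):
        for term in re.findall(r"[a-z]+", text.lower()):
            index[term].add(doc_id)
    return {term: sorted(ids) for term, ids in index.items()}

def search(index, query):
    """Return ids of documents containing every query term (boolean AND)."""
    terms = re.findall(r"[a-z]+", query.lower())
    if not terms:
        return []
    postings = [set(index.get(t, ())) for t in terms]
    return sorted(set.intersection(*postings))

# Hypothetical sample documents, used only for illustration.
docs = [
    "spam filtering is a classification task",
    "topic modeling groups related terms",
    "classification needs feature selection",
]
index = build_inverted_index(docs)
hits = search(index, "classification task")
```

Here `index["classification"]` lists documents 0 and 2, and the two-term query narrows the result to document 0 only. A production index would also store term positions and frequencies for ranking.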

  • Examples

    Use of text mining to reveal trends

    Sentiment analysis using customer reviews

    Defense against spam email

    Genre detection based on a training set from a music site (uses classification)

    Market basket analysis: associating terms that commonly occur together

    The classic example of IBM Watson (NLP)
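
Several of these examples (spam defense, sentiment analysis, genre detection) come down to learning a text classifier from labeled documents. As one possible illustration, the sketch below uses multinomial naive Bayes with add-one smoothing. The technique is chosen here for brevity and is not named in the slides; the tiny training set and the `train` and `classify` helpers are hypothetical.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def train(labeled_docs):
    """Collect per-class word counts, per-class document counts, and the vocabulary."""
    word_counts = {}
    doc_counts = Counter()
    for text, label in labeled_docs:
        doc_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocabulary = set().union(*word_counts.values())
    return word_counts, doc_counts, vocabulary

def classify(model, text):
    """Return the label with the highest log posterior (add-one smoothing)."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    best_label, best_score = None, -math.inf
    for label, counts in word_counts.items():
        score = math.log(doc_counts[label] / total_docs)
        denom = sum(counts.values()) + len(vocabulary)
        for word in tokenize(text):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((counts[word] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical labeled messages, used only for illustration.
training = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]
model = train(training)
```

With this toy data, `classify(model, "claim your free prize")` returns `"spam"` and `classify(model, "project meeting on monday")` returns `"ham"`. The same structure applies to sentiment or genre labels once the training set is swapped.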

  • Topic modeling in R to show probabilistic modeling of term frequencies in documents.

    Building an ontology from news articles (multi-faceted topic hierarchies for easier searching and browsing)
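
To make the topic modeling point concrete, here is a rough sketch of latent Dirichlet allocation (LDA) fitted by collapsed Gibbs sampling. Two assumptions should be stated clearly: the slides mention R and do not name a specific model, so LDA is one common choice picked here, and the example is written in Python to stay dependency-free. The function `lda_gibbs`, its default hyperparameters, and the sample documents are all hypothetical.

```python
import random

def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Collapsed Gibbs sampling for LDA on tokenized docs (lists of words)."""
    rng = random.Random(seed)
    vocab_size = len({w for doc in docs for w in doc})
    doc_topic = [[0] * num_topics for _ in docs]      # tokens of doc d in topic k
    topic_word = [dict() for _ in range(num_topics)]  # count of word w in topic k
    topic_total = [0] * num_topics                    # tokens assigned to topic k
    assignments = []
    # Give every token a random initial topic.
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            k = rng.randrange(num_topics)
            zs.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] = topic_word[k].get(w, 0) + 1
            topic_total[k] += 1
        assignments.append(zs)
    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                k = assignments[d][i]
                # Remove the token's current assignment from the counts.
                doc_topic[d][k] -= 1
                topic_word[k][w] -= 1
                topic_total[k] -= 1
                # Resample its topic from the conditional distribution.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t].get(w, 0) + beta)
                    / (topic_total[t] + vocab_size * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][w] = topic_word[k].get(w, 0) + 1
                topic_total[k] += 1
    # Smoothed per-document topic proportions; each row sums to 1.
    theta = [
        [(n + alpha) / (len(doc) + num_topics * alpha) for n in row]
        for row, doc in zip(doc_topic, docs)
    ]
    return theta, topic_word

# Hypothetical tokenized documents, used only for illustration.
docs = [
    "spam mail offer free prize".split(),
    "free prize offer spam".split(),
    "music genre rock jazz".split(),
    "jazz rock music site".split(),
]
theta, topic_word = lda_gibbs(docs, num_topics=2)
```

Each row of `theta` is a document's estimated mix of topics, and `topic_word` holds the word counts per topic. Results depend on the random seed and on the hyperparameters, so a corpus this small only shows the mechanics, not meaningful topics.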