
Metaphic or the art of looking another way


“I meant what I said and I said what I meant. An elephant's faithful one-hundred percent!”

Dr. Seuss, Horton Hears a Who!

Due thanks and apologies to Dr. Seuss and to the illustrator. Apart from the ones above, all the words are mine.

As for me, I’m neither a linguist nor a programmer. I’m a writer, interested in words, awed by their expressive and impressive powers.


The Conversation Company

presents

Programming: “Joos” Rajagopal

Logic: Suresh Manian

Inspired by the words and ideas of: Arthur Koestler, Noam Chomsky, Max Mueller, Aristotle, Tim Berners-Lee

Metaphic

A Grammatically Different Approach to Text Analysis.

<createdBy>

Dedicated to my children, family, friends and foafs.

Suresh Manian, Bangalore 2014.

The Act of Creation, The Science of Thought, Prior Analytics, Context Free Grammar, Semantic Web


One day, almost three years ago, I realized that for all intents and purposes, we are our words.

Our actions and attitudes, behaviours and beliefs, emotions and experiences are all captured in our words; in the verbs, nouns, adjectives, adverbs… even the punctuation of our sentences.

Grammar, not mathematics or statistics, makes the rules in the world of text communication.

The Premise

When it comes to text and sentiment analysis, grammar knows best.


Words can make elephants fly, or turn lamb into sheep.

Words, like people, are known by the company they keep.

Words are not numbers, with precise values ruled by ratios. But words are alive in a way numbers are not. A word has an inherent awareness of its significance and position in relation to its immediate neighbours. It has a dynamic and analogous relationship with other words, governed by a set of universal rules and conventions. In the final analysis, words are anything but ambiguous.


Supple. Stable. Scalable. - The natural properties of English grammar.

While nouns and a few new verbs get added from time to time, the “universals” of the language are predetermined and inviolable. No new preposition has been added to the English language in the last 300 years. Words are kept few, common and shared to make communication possible.

Unlike data or content, the vocabulary grows rather slowly, at about 25,000 words per year as per Wiki. The entirety of English is estimated at just over a million words, give or take a few thousand.

Basically, the world of words seemed large but finite, and based on predictable and observable patterns that could be easily retrieved, analysed and interpreted, given our syntax- and library-based approach, smileys, emoticons and wtf included.

So we set forth in our pursuit of meaning.

The Program Logic

Syntax lays the rails that semantics rides on to reach meaning.

People use the same words to mean the same things and different words to mean different things.

Sometimes they use different words to mean the same things and the same words to mean different things.

In general, people are quite precise with their words.

More or less.


Our Content-Context framework

Subtext is to Text what Metadata is to Data, and other assumptions.

What is text? What is content? What is context? What is meaning? There were many questions that we had to ask and answer ourselves during the development. We just “reasoned up”, as Elon Musk of Tesla says, based on first principles and a natural order of questioning, led by language, logic and StackOverflow. The program will return true or false before we could say good or bad, was the thinking.

The Process Attitude

Reason up. Follow the Word. Focus on output, not outcome.

What is Data? We hold the view that “All is data” or “Sarvam Datum”, as we say in Sanskrit-Latin. And that all data can be captured and categorized under four basic types of metadata: Time|Place|Topic|Person, responding to a When|Where|What|Who query framework. And grammar is the gopher.
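A minimal sketch of how the four metadata types might be held and queried. The class, field and function names are hypothetical, chosen for illustration; the deck does not describe how Metaphic actually stores its metadata.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Metadata:
    """The four basic metadata types; the field names are illustrative."""
    time: List[str] = field(default_factory=list)    # answers "When?"
    place: List[str] = field(default_factory=list)   # answers "Where?"
    topic: List[str] = field(default_factory=list)   # answers "What?"
    person: List[str] = field(default_factory=list)  # answers "Who?"


# Assumed mapping from question word to metadata type.
QUESTION_TO_FIELD = {"when": "time", "where": "place", "what": "topic", "who": "person"}


def answer(meta: Metadata, question_word: str) -> List[str]:
    """Return the metadata values that respond to a question word."""
    return getattr(meta, QUESTION_TO_FIELD[question_word.lower()])


meta = Metadata(time=["2014"], place=["Bangalore"],
                topic=["text analysis"], person=["Suresh Manian"])
print(answer(meta, "Where"))  # ['Bangalore']
```

Keeping the question words and the metadata types in one lookup table makes the correspondence explicit: every question has exactly one type that answers it.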

Content can be text/sound/image but for our purposes, it is restricted to text.


The Proofs

A grammar book and a calculator can take you a long way in making sense of Google and Twitter.

It works!


From Search to Find: In Trouve, we employ a simple and practical framework for relevant content discovery, with categorization, entity/concept extraction, and even publishing. You can use it like you would Google. In fact, you are using Google, with a little helper attached to it. There’s a lot of cleaning up to do still, but the hypothesis seemed to work. For instance, an input like “text mining” is likely to give you Text Mining 101 as a category result, among others. Makes research simpler. You can access the content in multiple ways. Helped us make a knowledge base out of our own history data. But that’s another localized demo.

The difference between “Search” and “Find” is one of expectation, so to speak.

Input term: ibm quarter. Feel free to try anything. And wait just a bit.


Topics, entities, people, places, questions, requests, demands, interests, opinions, comparisons, superlatives… The combination of grammar tags and regexes gives us full control over the text to mine all the information we need. And with colour- and sentiment-coded inline graphs like “Sentipede” and features like Summarizer, Querifier etc., there are many facets of the data we can show. This approach is ideal for creating customized tl;dr templates. Add regular metadata information into the mix and the possibilities increase substantially.
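To give a feel for the regex half of that combination, here is a rough sketch that labels a message as a question, request, comparison or superlative. The patterns and the function name are invented for illustration and are not the Metaphic rule set.

```python
import re

# Hypothetical patterns, for illustration only.
PATTERNS = {
    "question": re.compile(r"\?\s*$|^\s*(who|what|when|where|why|how)\b", re.I),
    "request": re.compile(r"\b(please|can you|could you|would you)\b", re.I),
    "comparison": re.compile(r"\b(\w+er|more|less)\s+than\b", re.I),
    "superlative": re.compile(r"\b(best|worst|most|least|\w{3,}est)\b", re.I),
}


def message_types(text):
    """Return the label of every pattern that matches the text."""
    return [name for name, pattern in PATTERNS.items() if pattern.search(text)]


print(message_types("Is Coke better than Pepsi?"))      # ['question', 'comparison']
print(message_types("This is the worst service ever"))  # ['superlative']
```

Suffix rules like these misfire on words such as “interest”, which only looks like a superlative. That is where grammar tags come in: a part-of-speech tag can confirm that a word ending in “-est” is really an adjective before the pattern is trusted.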

Demo url: http://metaphic.in/monitor.php?term=coke

What kind of messages are they?

Where are they talking from?

The advantage? The questions you can ask of the data increase significantly. And it is easy to customize it to brand or domain.

Who is talking? What are they talking about? Who else are they talking about? How many are talking? When are they talking mostly? Where are they from?

Are they saying good things or bad things? Are they being funny or sarcastic?...

The questions are but natural. Shouldn’t the answers be too?

Case in point:

With Parts of Speech tagging, there’s no such thing as unstructured data.

The “sentipede” and “message type” categorizations are at an early stage yet but hopefully prove the point. It is easy to build and train the libraries.


If you really want to know how your customers or employees are feeling, just follow the adjectives.

We analysed employee survey results using our algorithm to assess the needs and aspirations of the employees of a very large Business Unit of a very large IT major, and mapped them to their employee engagement programs. Extremely good results.
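“Follow the adjectives” can be pictured with a few lines of code. This is only a sketch: the lexicon, its scores and the sample responses are made up for illustration, and a real library would be far larger and would identify adjectives by tagging rather than by lookup.

```python
import re
from collections import Counter

# Tiny hand-made adjective lexicon (hypothetical scores: +1 positive, -1 negative).
ADJECTIVES = {
    "happy": 1, "great": 1, "supportive": 1,
    "slow": -1, "stressful": -1, "unfair": -1,
}


def adjective_profile(responses):
    """Count known adjectives across responses and sum their scores."""
    counts = Counter()
    for response in responses:
        for word in re.findall(r"[a-z']+", response.lower()):
            if word in ADJECTIVES:
                counts[word] += 1
    score = sum(ADJECTIVES[word] * n for word, n in counts.items())
    return counts, score


# Invented survey responses, not real data.
responses = [
    "My manager is supportive but appraisals are slow",
    "Great team, stressful deadlines",
    "The process is slow and unfair",
]
counts, score = adjective_profile(responses)
print(counts.most_common(1), score)  # [('slow', 2)] -2
```

The counts show which feelings come up most often, and the sign of the score gives a first, crude reading of the overall mood.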

This was when we analysed Nokia’s social data over three months. http://metaphic.in/nokia/

With this “whole-grained” approach to analysis, the data bends to yield insights.


The metadata is much richer now, providing greater flexibility to query and retrieve relevant information.

Linked text, with a syntactic grid that underpins it, is perhaps a way to go. It is perhaps the key to linked data and the dream of the semantic web.

Try a url/term/text at http://metaphic.in/home/index.php? The program reads at approx. 4,000 words per minute, so we request a little patience, especially if you want classification. The screenshots are from an analysis of a link from The Guardian. Nice article.

We generate a subject cloud which looks at the word in context and doesn’t just look at frequency. Apologies for the lack of UI/UX, and please note that the categorizations and the library work are not complete.
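One way to picture “context, not just frequency” is to count a word only when it sits in a likely subject position. The heuristic below is an assumption made for illustration (a word directly after a determiner or preposition is treated as a subject candidate); it is not the Metaphic algorithm, and the cue list and function name are hypothetical.

```python
import re
from collections import Counter

# Assumed cue words: determiners and prepositions that often precede a noun.
CUES = {"the", "a", "an", "of", "in", "on", "for", "about", "with"}


def subject_weights(text):
    """Weight each word by how often it directly follows a cue word."""
    words = re.findall(r"[a-z]+", text.lower())
    weights = Counter()
    for previous, word in zip(words, words[1:]):
        if previous in CUES and word not in CUES:
            weights[word] += 1
    return weights


text = "The program reads the text. The text is about grammar, and grammar rules the program."
weights = subject_weights(text)
print(weights["program"], weights["text"], weights["grammar"])  # 2 2 1
```

Here “grammar” appears twice but is weighted once, because only one of its occurrences follows a cue word. A plain frequency count would not make that distinction.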


If meaning is conveyed in the flow of words, then a preposition is where the flow bends.

Biographer employs grammar and phrase patterns to make two-minute dossiers of people. This is Arvind Kejriwal’s life story published by our algorithm. On a related note, we’ve also developed a “job description to resume match” candidate finder (RecoDex, we call it).
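A rough idea of what phrase patterns for a dossier could look like, with the prepositions doing the anchoring. The patterns, labels and sample sentence are hypothetical (the person is invented), and this is a sketch rather than Biographer's actual pattern library.

```python
import re

# Hypothetical phrase patterns: a verb, a preposition, then the fact that follows.
PHRASE_PATTERNS = {
    "born": re.compile(r"\bwas born (?:on|in) ([^.,;]+)"),
    "education": re.compile(r"\b(?:studied|graduated) (?:at|from) ([^.,;]+)"),
    "work": re.compile(r"\b(?:worked|works) (?:at|for|as) ([^.,;]+)"),
}


def dossier(text):
    """Collect the text captured by each phrase pattern, keyed by label."""
    facts = {}
    for label, pattern in PHRASE_PATTERNS.items():
        found = [match.strip() for match in pattern.findall(text)]
        if found:
            facts[label] = found
    return facts


sample = "Jane Doe was born in Springfield. She studied at Example University, then worked as an engineer."
print(dossier(sample))
# {'born': ['Springfield'], 'education': ['Example University'], 'work': ['an engineer']}
```

Each capture starts right after a preposition and stops at the next punctuation mark, which keeps the patterns short and easy to extend.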

http://metaphic.in/peop/e/


http://metaphic.in/home/whysguy.html?

Please keep the questions simple. For now.

What is the meaning of life?

Ask the Whysguy. Be patient.

Words are parts of speech. Literally, grammatically, programmatically and pragmatically speaking.


Answers, not analytics.

The Promise


Adjectives and verbs reveal attitudes and behaviour that statistics can never aspire to. Proper Nouns and simple capitalization rules capture topics and people like no hashtag can. And prepositions can access insights like metadata can't.
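The capitalization point can be sketched in a few lines. The rule here (runs of capitalized words, ignoring a lone capitalized word at the start of a sentence) is an assumed simplification for illustration, not Metaphic's rule set, and the function name is hypothetical.

```python
import re

# Runs of capitalized words, e.g. "Arvind Kejriwal" or "The Guardian".
CAPITALIZED_RUN = re.compile(r"\b[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*\b")


def proper_noun_candidates(text):
    """Return capitalized runs that are probably names of topics or people."""
    candidates = []
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        for match in CAPITALIZED_RUN.finditer(sentence):
            # A single capitalized word opening a sentence is usually just convention.
            if match.start() == 0 and " " not in match.group():
                continue
            candidates.append(match.group())
    return candidates


text = ("We analysed Nokia data for three months. "
        "The screenshots are from The Guardian. "
        "Biographer wrote about Arvind Kejriwal.")
print(proper_noun_candidates(text))  # ['Nokia', 'The Guardian', 'Arvind Kejriwal']
```

The sentence-start rule also drops “Biographer”, a genuine name, and all-caps names are not matched at all. Gaps like these are what a library of known names would have to cover.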

With so much emphasis on social media and unstructured text, with so much content being generated that needs to be analysed and interpreted, text analytics is one of the hottest focus areas of the technology investor. The text analytics market will be worth $4.90 billion by 2019, say the pundits.

Metaphic is one way of looking at the opportunity.

Mine the language to mind the business.

Data is the new oil, and text makes up nearly 90% of it.

The Potential


Lack of money, expertise and infrastructure to bring it to market readiness and revenue.

The Problem

These are all proofs of concept that now need to be tested, validated and applied to answering real business questions. We have more or less reached the end of the road on this owing to lack of resources and I’m looking for an angel that might see merit in taking the development further.

With a couple of projects, and a recently developed API, we have managed to generate some revenue, and will continue to, but it is not enough. In my estimation, there is at least another six months of work left before this can start generating steady revenue.

And in two years, working overtime, this can be of some serious value.


Grammatically Different

Tel: +919845057220 mail: [email protected]

Thank you for your time. I hope to hear back.

Metaphic