UNIVERSAL NETWORKING LANGUAGE UNDL FOUNDATION

UNIVERSAL NETWORKING LANGUAGE - UNESCO


UNIVERSAL NETWORKING LANGUAGE (UNL)

UNDL FOUNDATION

BAMAKO, 7/05/05

Content

1. Seeing is believing
2. Response to challenge of our time
3. Convergence of IT, Knowledge, Language
4. What we can do with it
5. How it works
6. What it takes to have it
7. A story

[Figure: the six official languages of the United Nations (Arabic, Chinese, English, French, Russian, Spanish), each linked directly to the other five]

UNITED NATIONS: 5 x 6 = 30 Pairs
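The arithmetic behind this slide can be sketched as follows. This is a minimal illustration, not part of the original deck: the function names are hypothetical, and the pivot count of two components per language is an assumption drawn from the later slides, where each language has one Enconverter and one Deconverter.

```python
def direct_pairs(n: int) -> int:
    """Directed translation pairs when each of n languages is translated
    straight into each of the other n - 1 languages."""
    return n * (n - 1)


def pivot_components(n: int) -> int:
    """Components needed when every language goes through a pivot such as UNL:
    one enconverter and one deconverter per language (assumed from the deck)."""
    return 2 * n


# Six UN official languages, then a larger hypothetical set of 15 languages.
for n in (6, 15):
    print(n, direct_pairs(n), pivot_components(n))
```

For the six UN languages this gives the 30 pairs named on the slide against 12 pivot components; the gap widens as languages are added.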

[Figure: UNL as a hub connected over the Internet to English, Spanish, Chinese, French, Japanese, Russian, Arabic, etc. An Enconverter converts each language into UNL and a Deconverter converts UNL into each language.]

UNL System Architecture

[Figure: French, Hindi, Arabic and Chinese speakers connected through the UNL System over the Web/Internet, each in their own language]

The property rights of the UNL belong to the United Nations © UNDL Foundation. All rights reserved

UNL Document creation

[Figure: a Spanish content developer writes Spanish text, which the Spanish UNL Language Server enconverts into UNL; the resulting UNL document is published as a web page on a web server.]

[Figure: Language Server. The Enconverter (EnCO) turns natural language (NL) into UNL using Analysis Rules, the UNL-Language Dictionary and the Knowledge Base. The Deconverter (DeCO) turns UNL into NL using Generation Rules, the UNL-Language Dictionary, the Co-occurrence Dictionary and the Knowledge Base.]

[Figure: UNL system diagram in three blocs. Bloc 1: the USER works with the UNL Editor and UNL Viewer on a Language Server (UNL <-> English) running EnCO and DeCO. Bloc 2: the UNL document reaches the Internet through the UNL Proxy. Bloc 3: Language Servers for UNL <-> Hindi, Japanese, Chinese, Arabic and Spanish, each with EnCO and DeCO.]

UNL LANGUAGE SERVER: Enconverter (EnCO) and Deconverter (DeCO)

LANGUAGE SERVER: UNL <-> Native Language
• Enconversion Program (EnCO)
• Deconversion Program (DeCO)
• UNL Dictionary
• UNL Grammar
• UNL Knowledge Base

The UNL over the Web

[Figure: French, Hindi, Spanish and Chinese speakers on the Web, each reached through a UNL Language Server (French, Chinese, Spanish, Hindi, English, Russian) that exchanges UNL with the others]

MERCI / Thank you

History of Great Discoveries and the Great Inventions

• To see the forest beyond the trees.
• The little stories of every day hide the great history and the great changes in history.
• Most of them are unexpected.
• Technologies are a response, a reaction, to great challenges:

– Technologies have the power to transform everything: they catalyze other movements and multiply effects and opportunities;

– Some technologies have a higher, others a lower, power to transform life;

– When they happen, there is a regional Risorgimento (revival).


What can the UNL do?

1) Machine Translation
2) Multilingual Information Service (e-Commerce, e-learning, e-government, e-TV)
3) Information Retrieval System (e-Commerce, e-learning, e-government)
4) Expert system
5) Encyclopaedia
6) And many others

WHAT IS THE UNL?

A set of resources comprising:

– Linguistic Resources
– A technical infrastructure
– Knowledge Assets

Inside the UNL System

Linguistic Resources

• Specifications of the UNL
– Universal Words (UWs)
– Master Definitions
– Attributes
– Relations
– Grammar

Technical Resources

• UNL Servers
– Enconverter
– Deconverter
• Proxy Server
• UNL Editor
• UNL Verifier
• UNL Explorer
• Manuals

Knowledge Assets

• Dictionaries: UWs, Master, Natural Language (NL)
• Grammatical rules for each NL

Competitors?

• Computer systems that can deal with knowledge and contents have already been developed.

• Representations of knowledge or contents are different from each other.

• Moreover, a representation depends on a language.

• Knowledge or contents of a computer system can not be used in

Who?

• In the case of machine translation, even if we combine all the results of research and development on machine translation, we cannot realize a multilingual machine translation system that can break language barriers.

Advantages of UNL

• The UNL, a common language for computers:
– enables sharing knowledge and contents among all systems
– overcomes language barriers
– reduces costs of developing knowledge or contents
– facilitates knowledge processing

How it works

• The UNL can express concepts like words do in natural languages. Ex: horse, cavalo, cheval, (Chinese ideogram).

• The UNL can express information like natural languages do. Ex: full description of a horse as in an encyclopaedia entry.

How does the UNL express information?

• The UNL expresses information by distinguishing objectivity from subjectivity.

• Objectivity is expressed using UWs and relations.

• Subjectivity is expressed using attributes.
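The split between objective content (UWs and relations) and subjective content (attributes) can be pictured with a small data structure. This is only a sketch under assumptions: the class names are hypothetical, and the relation label `agt` and the attributes `@entry`, `@past` and `@def` are used here as illustrative labels in the style of UNL notation rather than as a statement of the official specification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Node:
    """A concept (Universal Word) plus the attributes attached to it."""
    uw: str                            # objective part: the concept itself
    attributes: Tuple[str, ...] = ()   # subjective part: tense, definiteness, etc.


@dataclass(frozen=True)
class Relation:
    """A labelled, directed link between two concepts (objective part)."""
    label: str
    source: Node
    target: Node


def render(rel: Relation) -> str:
    """Print a relation in a compact 'label(source, target)' form."""
    def fmt(node: Node) -> str:
        return node.uw + "".join("." + a for a in node.attributes)
    return f"{rel.label}({fmt(rel.source)}, {fmt(rel.target)})"


# "The horse ran": the running event has an agent, the horse.
ran = Node("run", ("@entry", "@past"))
horse = Node("horse", ("@def",))
print(render(Relation("agt", ran, horse)))
```

Removing the attributes leaves the bare relation between the two concepts, which is the language-independent core that every deconverter would start from.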

Who are we

300 computer scientists and linguists from universities and research institutions around the world

Language Coverage

• Languages already engaged:
• 6 UN official languages: Arabic, Chinese, English, French, Spanish, Russian
• Other languages: Hindi, Indonesian, Italian, Japanese, Korean, Mongolian, Latvian, Portuguese, Thai

UNL R&D Network (1)

Arabic       The Royal Scientific Society, Jordan
Chinese      Ministry of Electronics Industry, China
English      UNL Centre
French       University Joseph Fourier, France
German       Univ. of Saarbrucken, Germany (inactive)
Hindi        Indian Institute of Technology, India
Indonesian   BPPT Technology, Indonesia
Italian      Pisa CNR, Italy
Japanese     UNL Centre

UNL R&D Network (2)

Mongolian    Mongol Pedagogical University, Mongolia
Latvian      University of Latvia, Latvia (inactive)
Portuguese   University of Sao Paulo, Brazil
Russian      Russian Academy of Science, Russia
Spanish      University Politecnica of Madrid, Spain
Swahili      Univ. of Dar es Salaam, Tanzania (inactive)
Thai         NECTEC, Thailand (inactive)
UNL Centre   UNDL Foundation

Where we are now

1) UNL, the language: Relations and Attributes (specification). Version approved by a committee of scholars; patent recognized by PCT countries (WIPO), 2002.

Dictionary of Universal Words, Knowledge Base: the volume of entries keeps increasing under continuous development.

2) Language Server: operational

a) Deconverter (Language Generation System)

Deco: operational

Generation Rules and Dictionary (each language): continuous development

b) Enconverter (UNL Generation System)

Enco: operational

Analysis Rules, Dictionaries (each language): continuous development

3) Tools & Applications:
• UNL Proxy Server: operational
• UNL News: 4 publications in 2002
• UW Gate: under tests
• UNL Verifier: under tests
• UNL Viewer: prototype
• UNL Editor: prototype
• UNL Encyclopedia: prototype
• UNL Explorer: prototype
• Org Explorer: prototype

Vision of the Future

• Applications in all fields of human activities
• Advantages for international organizations
• Bridging the Digital Gap
• Benefits for Multilingual Countries
• Content-driven technology, hence opportunities for employment and self-employment
• Low-cost, clean investment

Challenges

• Financial Resources
• Persistence: working towards cumulative results
• More language coverage
• Expanding the R&D network

Open Policy

UNL (system) should be developed by all peoples in the world.

• We will open:

UNL specifications

Universal Word Dictionary

Format of UNL-Language dictionary

Format of Deconversion rule

System interface

What we expect to be developed by people in the world

UNL (system) should be developed by all peoples in the world.

• Universal words necessary for each language

• Language Servers for new languages and new domains

What we expect to be developed by people in the world

• Application systems such as:

Information Retrieval System

Search Engines

Browsers

Editors/Word Processors

Machine translation Systems

The UNL System

1) UNL (Universal Networking Language)

Dictionary of Universal Words , Relation, Attribute,

Knowledge Base

2) Language Server

i) Deconverter (Language Generation System)

Deco, Generation Rules, Dictionary(each language)

ii) Enconverter (UNL Generation System)

Enco, Analysis Rules, Dictionaries(each language)

3) Tools: UNL Viewer UNL Editor UNL Proxy Server

UNL Proxy Server

• Searches for UNL in the web page accessed by the user.

• Sends the UNL document to the Language Server defined by the selected language.

• Updates the web page so that it is displayed in the user’s chosen language.
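The three proxy steps above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the actual UNL Proxy Server: the `{unl}...{/unl}` delimiters, the function names and the stand-in language servers are all hypothetical, chosen only to make the flow concrete.

```python
import re
from typing import Callable, Dict

# Assumed markup: UNL fragments are embedded between {unl} and {/unl}.
UNL_BLOCK = re.compile(r"\{unl\}(.*?)\{/unl\}", re.DOTALL)


def render_page(page: str, language: str,
                servers: Dict[str, Callable[[str], str]]) -> str:
    """Find UNL in the page, send it to the language server chosen by the
    user's language, and put the returned text back into the page."""
    deconvert = servers[language]  # step 2: pick the server for the language
    return UNL_BLOCK.sub(          # steps 1 and 3: search, then update
        lambda match: deconvert(match.group(1).strip()), page)


def fake_server(name: str) -> Callable[[str], str]:
    """Stand-in for a real language server; it only labels its input."""
    return lambda unl: f"[{name} text for: {unl}]"


servers = {"es": fake_server("Spanish"), "fr": fake_server("French")}
page = "<p>{unl}agt(run.@entry, horse){/unl}</p>"
print(render_page(page, "es", servers))
```

Because the page stores only UNL, the same stored page can be rendered for a French reader by passing `"fr"` instead of `"es"`.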

UNL Editor

[Screenshots: UNL Editor, including the "select sentence" step]

UNL Encyclopaedia

• “Infinite library” (Jorge Luis Borges)
• Human Knowledge
• Knowledge system
• Encyclopaedias
• How to build the UNL Encyclopaedia

UNL Patent

• Purpose: Gift to Humankind
• Submitted in 1999 to the Japanese Patent Office
• Recognized by PCT countries (WIPO), 2002
• Application for patent and commercial protection in major countries
• Protection by the United Nations

UNL Global Network of R&D

• 300 computer scientists and linguists from universities and research institutions around the world

Languages covered: Arabic, Chinese, English, French, Japanese, German, Hindi, Indonesian, Italian, Latvian, Mongolian, Portuguese, Russian, Spanish, Thai

UNL Society

• Purpose: Collaboration in R&D
• Membership: Individuals and institutions
• At present: over 300 members from 30 countries
• Future perspective: Collaboration, support for users

WE…

A global network of computer Scientists and Linguists + philosophers, mathematicians…

UNDL FOUNDATION, 8 JULY 2003

UNL: A LANGUAGE

• UNL is a “language” for computers (different from a “computer language”).
• It expresses information and knowledge in digits, the characters that all computers understand.
• It enconverts contents from any natural language into UNL and then deconverts them into any other natural language.
• The UNL enables peoples to build the “reservoir” of human knowledge from and to diverse natural languages.
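The enconvert-then-deconvert route can be illustrated with the deck's own "horse" example. This toy sketch works on single words only; the lexicon, the concept identifier `uw:horse` and the function names are assumptions made for illustration and do not reflect the real EnCO and DeCO programs.

```python
# Toy lexicon: each word maps to one concept identifier, a stand-in for a
# Universal Word. Language codes and the identifier format are hypothetical.
LEXICON = {
    "en": {"horse": "uw:horse"},
    "pt": {"cavalo": "uw:horse"},
    "fr": {"cheval": "uw:horse"},
    "zh": {"馬": "uw:horse"},
}


def enconvert(word: str, lang: str) -> str:
    """Natural language -> concept identifier."""
    return LEXICON[lang][word]


def deconvert(concept: str, lang: str) -> str:
    """Concept identifier -> natural language."""
    for word, uw in LEXICON[lang].items():
        if uw == concept:
            return word
    raise KeyError(f"no {lang} word for {concept}")


def translate(word: str, source: str, target: str) -> str:
    """Translate by going through the pivot, never directly between languages."""
    return deconvert(enconvert(word, source), target)


print(translate("cavalo", "pt", "fr"))
```

Adding a language means adding one lexicon entry here, not one table per language pair, which is the point the hub diagrams make.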


UNL: A SYSTEM

• UNL has been designed to represent contents in a language-independent way.

• UNL is a system to support multilingual information services (mainly for the Internet).

• It can also be used as a machine translation system.

European Heritage Network(HEREIN)

• HEREIN is a very large document repository (all documents are written in three different languages).

• A great amount of human translation resources is needed.

• Current contents written in UNL can be converted into more languages.

• Web page: www.european-heritage.net

• The Network is currently composed of administrations and/or mandated bodies from the following (27) countries:

– Andorra, Armenia, Belgium (Brussels-Capital, Flemish Region, Walloon Region), Bulgaria, Croatia, Cyprus, Denmark, Estonia, Finland, France, Georgia, Hungary, Ireland, Latvia, Lithuania, Luxembourg, Norway, Poland, Portugal, Romania, Slovakia, Slovenia, Spain, Sweden and the United Kingdom.

DEMO

1. See the Spanish report (.xml)

2. See the Spanish report in UNL

3. Load the Spanish report in UNL into the Spanish language generator.

4. See the generated Spanish (output.txt)

5. See generation available in other languages (Russian, English, Italian).

THE SIZE OF THE PROBLEM/CHALLENGE

The globalization of economic activities, of political relations among states, and of social lifestyles is generated, supported and reinforced by global information systems.

The global village emerging from the convergence of telecommunications carriers, global radio and television networks, and computers generates the conditions for a market, for sharing affluence, and for enjoying cultural goods.

The global village also excludes millions from sharing affluence, health services and education; from enjoying leisure and the comforts of technology; from cultural exchange and exposure; from participating in social activities; from benefiting from economic activities; and from access to the market.

Population Forecasts for Major Cities in 2010 (unit: millions)

(1) Tokyo (Japan) 28.93
(2) São Paulo (Brazil) 24.97
(3) Bombay (India) 24.37
(4) Shanghai (China) 21.67
(5) Lagos (Nigeria) 21.09
(6) Mexico City (Mexico) 18.02
(7) Beijing (China) 17.97
(8) Dhaka (Bangladesh) 17.55
(9) New York (USA) 17.23
(10) Jakarta (Indonesia) 17.20
(11) Karachi (Pakistan) 17.02
(12) Manila (Philippines) 16.06
(13) Tianjin (China) 15.70
(14) Calcutta (India) 15.70
(15) New Delhi (India) 15.58
(16) Los Angeles (USA) 13.91
(17) Seoul (South Korea) 13.91
(18) Buenos Aires (Argentina) 13.68
(19) Cairo (Egypt) 13.42
(20) Rio de Janeiro (Brazil) 13.32
(21) Bangkok (Thailand) 12.74
(22) Tehran (Iran) 11.88
(23) Istanbul (Turkey) 11.80
(24) Osaka (Japan) 10.60
(25) Moscow (Russia) 10.37
(26) Lima (Peru) 10.07

[Figure: world map locating the 26 cities by rank number]

Source: World Bank Data/Nishi

Top 10 Languages by Population

RANK  LANGUAGE                 POPULATION
1.    CHINESE (MANDARIN)       885,000,000
2.    SPANISH                  332,000,000
3.    ENGLISH                  322,000,000
4.    BENGALI                  189,000,000
5.    HINDI                    182,000,000
6.    ARABIC (ALL COUNTRIES)   177,000,000
7.    PORTUGUESE               170,000,000
8.    RUSSIAN                  170,000,000
9.    JAPANESE                 125,000,000
10.   GERMAN, STANDARD          98,000,000

Source: Ethnologue: Languages of the World

WHAT DOES UNL OFFER?

The UNL provides users with a multilingual platform and a set of software tools enabling them to communicate with other people in their respective languages.

With the multilingual platform in place, users can share information and knowledge across native languages.

Citizens, governments, international organizations, and enterprises will all benefit from the UNL, as it provides opportunities for information sharing, education, and e-business. The ultimate goal is to promote sustainable development, dialogue among civilizations, and economic prosperity for all nations, as well as peace among them.


Understanding How the UNL System Works

In Bloc 1, the USER writes a document in his/her native language using a PC equipped with the UNL Language Server. The UNL “Editor” tool, in connection with the UNL Language Server, enables him/her to write it in UNL. As the USER types word by word, sentence after sentence, a full paragraph, or the whole document, the UNL Enconverter software (EnCO) instantaneously “enconverts” all inputs into UNL representations. This can be done interactively between the writer and the computer. The UNL “Viewer” shows back to the writer the document as it is deconverted from the UNL into his/her language, which represents how the system understands the original document being produced by the writer. This allows him/her to check the correctness of the “enconversion”. In such an interactive process, the writer can produce UNL documents as accurate as he/she wishes. USERS do not need to know UNL, nor how the EnCO and DeCO programs operate; they just need to input correct sentences in their native languages with the help of the UNL Editor, which works like a word processor.

[Figure: UNL system diagram in three blocs. Bloc 1: the USER with the UNL Editor and UNL Viewer on a Language Server (UNL <-> English) running EnCO and DeCO. Bloc 2: the UNL document on the Internet behind the UNL Proxy. Bloc 3: Language Servers for UNL <-> Hindi, Japanese, Chinese, Arabic and Spanish, each with EnCO and DeCO.]

In Bloc 2, the UNL document is placed on the Internet through a “UNL Proxy server”. Information, text, documents and web pages written in UNL can be stored in archives, or downloaded and shared throughout the Internet with multiple users in all native languages equipped with the “UNL Language Server” set. UNL documents can be processed on standalone computers, exchanged over local networks (LAN), or distributed through WWW servers. They can also be forwarded by a file transfer program. UNL documents received at a network terminal can be deconverted into each native language and read by anyone on a browser equipped with the “UNL Language Server” set. This is one of the outstanding features of the UNL System: it allows for synchronous and asynchronous operation of multiple language servers simultaneously.

[Figure: web page contents travel over the Internet as UNL. Language Server A (UNL - Arabic) and Language Server B (UNL - Spanish) each contain an enconverter and a deconverter; an Arabic author uses the UNL Editor and a Spanish reader uses the UNL Viewer through the UNL Proxy. Further Language Servers are shown for UNL - English, UNL - Italian and UNL - Chinese.]

Once You Have It in UNL, You Have It in All Languages

Bloc 3 shows that each native language has its own UNL Language Server. This allows any user to interact with others in his/her own language, while the others use theirs, through the Internet. The number of languages that can be supported is unlimited.

The Language Servers are all equipped with the same set of software as in Bloc 1, i.e., the EnCO and DeCO programs, as well as the Editor and Viewer tools, which are connected to the Master Dictionary of UWs and the UNL Knowledge Base. Users, therefore, may write, read and exchange UNL documents from any language that has developed its UNL Language Server. They can also improve the existing UWs Dictionary, or create one where it does not exist, and expand the Knowledge Base indefinitely. For these tasks, the necessary tools, specifications, instructions and manuals are available on the web.


• 馬 (the Chinese ideogram for “horse”, from the “How it works” example)