Indian Journal of Traditional Knowledge Vol. 4(4), October 2005, pp. 358-366

Data Warehouse Techniques in Traditional Knowledge Systems Deepak K Pande

G B Pant Institute of Himalayan Environment and Development, Kosi Katarmal, Almora 263 643, Uttaranchal, E-mail: [email protected]

Received 6 July 2004; revised 9 July 2005

Traditional knowledge is knowledge that has been passed from one generation to the next through oral or written traditions. Elders, being the most knowledgeable persons, are very important in society: they have gained knowledge over their lifetimes and are needed to teach the younger generations. The relationship of indigenous people to the land and its resources is tantamount to their survival. No matter where they live and whatever beliefs they hold, they all view land as the basis of their survival. Attempts are being made to document and preserve Oral Traditional Knowledge and Traditional Cultural Expressions. Given the quantum of information, however, it is difficult to document and retrieve it. To address this, the urgent need to build a Data Warehouse (DW) on Traditional Knowledge Systems (TKS) is emphasised.

Keywords: Data Warehouse Techniques, Traditional Knowledge Systems

IPC Int. Cl.7: G06K9/00

India has a rich diversity of heritage and related Traditional Knowledge Systems (TKS), passed down by word of mouth from generation to generation. Due to modernization, this unwritten knowledge is diminishing very fast. Therefore, a large number of persons and organizations are documenting traditional knowledge and related issues. But their findings are scattered across different bulletins, journals, books, magazines, newspapers, etc., and it is very difficult to obtain the documented literature at one place. To make this possible, there is an urgent need to build a Data Warehouse (DW) on TKS. Data Warehouse, a term coined by Bill Inmon in 1990, has the capability to provide storage, functionality and responsiveness to analytical queries on transaction-oriented databases, and also sustains Decision Support Systems (DSS) and On-Line Analytical Processing (OLAP). Besides this, the data warehouse facilitates fast retrieval, analysis and maintenance of large data within a stipulated time. A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management’s decision-making process.1

There are two types of traditional knowledge: (i) documented or disclosed knowledge, which is defended through national and international law, and (ii) undocumented (oral or undisclosed) knowledge.2 Oral Traditional Knowledge is restricted either to locally known traditional healers (individual knowledgeable persons) or to a particular community or group of communities of a particular area or region (Fig. 1). Documentation may facilitate biopiracy.3,4 In spite of this, the documentation of Traditional Knowledge (TK) or folklore is still continuing without internationally recognized norms for its protection.4 Hence, legal and technical policies are required to protect Traditional Knowledge, wisdom, practices, folklore, genetic resources, benefit sharing, etc. Some related viewpoints are given below:

Legal Policy

• Various suggestions have been advanced in India to extend protection to knowledge, innovation and practices.3 These include: (i) documentation of TK; (ii) registration and innovation patent system; and (iii) development of a sui generis system.

• Documentation of the location of biological resources is required at local, regional and national level.5,6 Therefore, concrete efforts are required to digitise the inventories and characterize them at the molecular level.6 If someone needs to mine information from the data warehouse, it should be made available for a fee, subject to the conditions governing the use of this information.

• The digital libraries, along with state and national biodiversity or bioresource registers, if any, should be owned by the government in order to strengthen the claims of traditional people on issues such as ownership of biodiversity, bioresources and traditional techniques, and compensation. This would help in preventing biopiracy or theft of Traditional Knowledge. The Traditional Knowledge Digital Library (TKDL) should be utilized as proof of ‘prior art’ by the examiners of patent offices, nationally and internationally.2

• India’s biodiversity is shared by almost all South Asian Association for Regional Cooperation (SAARC) countries. Hence, these countries need to develop a legal framework to maintain or protect their biodiversity.

• People should be made aware of the important legislation concerning Intellectual Property Rights (IPRs). For this, the government should hold symposia, seminars, workshops, refresher courses, etc.

• There is an urgent need to popularize the significance and benefits of commercialization of Traditional Knowledge, practices and folklore. Therefore, the government should open numerous patent offices and training centres throughout the country to protect, preserve, promote and disseminate Traditional Knowledge and folklore.

• Scientific validation of Traditional Knowledge is also required.

• There is an urgent need to define ‘biopiracy’.

• The government should draw on the talents of traditional healers living in the remote areas of the country.

• A sui generis system for the protection of new plant varieties is a form of protection that is not covered by the present patent law. Promulgation of an appropriate sui generis system should be legitimized.3

• The burden of proof should shift to the alleged infringer.7

Technical Policy

The following steps can be taken to provide software-based technical protection:

• The undisclosed (undocumented/oral) knowledge should be digitised using an encryption algorithm, so that no one can excerpt anything from it. An attempt should also be made to link such knowledge to the newly created sub-class ‘Traditional Knowledge Resource Classification (TKRC)’ of the International Patent Classification (IPC).

• A front-end (data codification and de-codification software) should be developed to address the security of secret Traditional Knowledge by using data encryption algorithms. Encryption algorithms are a superior way to protect any undisclosed or secret knowledge, giving more functionality, security and web protection. The accessible format should also be compatible with other digital libraries for resource linking (Fig. 2).

• For designing a digital library of codified or informal knowledge, one has to consider the following points: (i) resource/knowledge classification, (ii) document classification, (iii) enhancing the subject-based International Patent Classification (IPC) to serve national needs, (iv) deciding on key attributes of Traditional Knowledge for the Traditional Knowledge Digital Library (TKDL), similar to the first page of a patent application, (v) finalizing essential features for search and examination and (vi) identifying primary attributes of the TKDL.8

• An information retrieval (IR) system should also be made available in different international languages by using natural language processing.

• Data mining facilitates retrieval of information (pattern identification) from the data warehouse or from large available data sets using statistical methods with minimal input by the users. Data mining techniques are widely used in knowledge segmentation, knowledge profiling, risk management, analysis of records, data visualization, etc. There are two basic forms of data mining, known as descriptive and predictive. Descriptive mining uncovers hidden patterns that are not defined previously, while predictive mining discovers knowledge using previously defined predictive models. Models such as clustering, association analysis, sequential pattern discovery, classification and regression are used with these methods to solve business problems.
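As an illustration of the association analysis mentioned above, the following minimal Python sketch computes support and confidence over a handful of invented traditional-knowledge records. The records, the item names (such as 'tulsi' and 'cough') and the function names are hypothetical and are not taken from any TKS data set; this is a sketch of the general technique, not the method used in this paper.

```python
from itertools import combinations

# Hypothetical records: each set lists items reported together in one
# documented folk-medicine practice (invented purely for illustration).
RECORDS = [
    {"tulsi", "honey", "cough"},
    {"tulsi", "ginger", "cough"},
    {"turmeric", "milk", "wound"},
    {"tulsi", "honey", "fever"},
    {"ginger", "honey", "cough"},
]


def support(itemset, records):
    """Fraction of records that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for record in records if itemset <= record)
    return hits / len(records)


def confidence(antecedent, consequent, records):
    """Support of antecedent and consequent together, divided by the
    support of the antecedent alone (0.0 if the antecedent never occurs)."""
    base = support(antecedent, records)
    if base == 0:
        return 0.0
    return support(set(antecedent) | set(consequent), records) / base


def frequent_pairs(records, min_support):
    """Return every item pair whose support is at least `min_support`."""
    items = sorted(set().union(*records))
    result = {}
    for pair in combinations(items, 2):
        s = support(pair, records)
        if s >= min_support:
            result[pair] = s
    return result


if __name__ == "__main__":
    print(frequent_pairs(RECORDS, min_support=0.4))
    print(confidence({"tulsi"}, {"cough"}, RECORDS))
```

On real data the minimum support threshold would have to be tuned, and an established algorithm such as Apriori would be preferable to this brute-force pair enumeration.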

Mining is itself a broad area for fetching data from the TKS data warehouse. Several algorithms and methods (K-Means, Association Rules, Self-Organizing Map, Sequential Pattern, Decision Trees, Neural Networks, Bayesian Network) are used in data mining.7,9 To build a data warehouse of TKS, builders should take a broad view of knowledge-based intelligent querying. In visualizing a data warehouse project, it is important to have some measurements, principles and criteria for inputs and outputs.

TKS Data Warehouse Plan

In order to make a data warehouse of Traditional Knowledge, ‘the data farmers’ first have to accumulate the scattered information on traditional wisdom from different communities. This information is gathered from rural people or knowledgeable persons and is coupled with published literature. The data should be catalogued into three categories: transaction data, derived data and de-normalized data (Fig. 3). Transaction data are retrieved from the ‘sources’. Certain rules, regulations and intelligence-based computations are used to filter or prioritize the scattered knowledge in order to acquire derived data. De-normalized data can be detached from derived data by means of data marts. Fundamentally, the data mart is the body of the decision support system (DSS).

Methodology

The following steps are to be taken to build up a data warehouse of Traditional Knowledge Systems (TKS):

(i) Data Warehouse Architecture

The ‘architecture’ of the data warehouse is an important facet of its development (Fig. 4). Besides, the objective of the data warehouse, the level of the sponsor, the nature of knowledge, data characteristics, query and process requirements, and the technological maturity of the organization are equally valuable. The TKS data warehouse architecture has two key components: ‘Technology’ and ‘Knowledge’ status. Knowledge status refers to the source systems. Sources of knowledge or data and meta data are ultimately stored in one place, known as the federal loom (centralized approach) of the data warehouse.10 The architecture covers planning, implementation and management of the data warehouse. After collection of transaction data or necessities, the data architectural support requirements must be clear. Network capacity, transfer rates, storage volume and data collection rates are some crucial points for increasing the speed of the data warehouse.11,12

(ii) TKS Data Warehouse Project Team

The project management committee will look after the overall internal development of the data warehouse project (Fig. 5). The committee will also be responsible for its successful completion, management, regularization and positioning. The project requires sufficient funds, which are also managed by the said committee. The traditional knowledge analyst delivers concrete, analytical traditional knowledge to the warehouse. The traditional knowledge expert maintains positive relationships with the management committee and the traditional knowledge analyst(s), and also evaluates and updates the knowledge or data delivered by the traditional knowledge analyst. The statistical expert helps in report generation through statistical measurements.

(iii) Building the Data Marts

The data mart is a subset of the data warehouse, or a group of DSS data for specific knowledge (Fig. 6). It provides a centralized approach to where and how data will be stored. This technology gives the greatest elasticity to an organization for quick and trouble-free implementation. A data mart explores a particular theme or sub-theme of a TKS, such as folk medicines, folk culture, folk songs and folk literature. Data marts can be classified into two sets,13 i.e., multidimensional on-line analytical processing (MOLAP) and relational on-line analytical processing (ROLAP). MOLAP supports numeric data, while ROLAP supports both numeric and text data. Data marts are driven by technology groups. A few of the important parameters for building data marts are as follows:14

• Location of the source data or knowledge.
• Needs of the users as per requirements.
• Digging out the vital data or knowledge from the source or transaction data.
• Altering the transaction data or knowledge to a suitable format.
• Designing the data mart for the TKS Database Management System (DBMS).
• Loading the group of decision support system (DSS) data or the data mart into the TKS DBMS in the warehouse.

(iv) Meta Data

Meta data is data about data; it describes the data in the data warehouse or in the data marts. Meta data describes the contents and source of the data flowing into the data warehouse, and also gives information about the data mart, database tables, attributes, relationships, etc. Managing the relation between the meta data of the data warehouse and the meta data of the data marts is a very important issue for the data manager.13

(v) Online Analytical Processing

Online analytical processing (OLAP) is the fast analysis of shared multidimensional information.15 OLAP technology is useful for mining the qualitative, quantitative and complex data of the traditional knowledge system from the warehouse. OLAP sustains a better level of analysis for the warehouse. Using this technique, one can promptly fetch a user-friendly report and a knowledge-based analysis of traditional knowledge.

(vi) Testing of Data Warehouse

Unit, integration and system (or acceptance) testing are essential for a quality data warehouse. Unit testing (of a single procedure) checks the Extract, Transform and Load (ETL) of events, records and tasks of the warehouse, and also supports development for report generation.16 Unit testing plays some significant roles, such as routine checks of font, format and colour, graphs, labels and legends, and data items in the data warehouse schema. After successful completion of unit testing, the data warehouse expert starts the integration testing. Using a shell script or batch facilities, the expert tests the procedures of programs or software.16 Integration testing also plays significant roles, such as creation of log files, restarting (automatically or manually) tasks on failure, reprocessing of rejected records and altering the data capture. System or acceptance testing is helpful with two constraints: time and budget. In system testing, an expert checks the whole procedure of the system or warehouse and finally advises acceptance or rejection of the data warehouse, and also tests whether the system or warehouse is ready for the end users.

(vii) Implementation of Data Warehouse

Implementation is the most significant, operational and final phase of the TKS data warehouse development life cycle. The help files for users are set up and the data warehouse software is installed on terminals. In some cases, a user training plan is delivered by the warehouse experts.
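The flow described in the methodology, from transaction data through derived data to a theme-specific data mart and an OLAP-style summary, can be sketched in a few lines. The following Python example uses only the standard-library sqlite3 module. All table names, column names and records are hypothetical, and the cleaning rule (dropping blank and duplicate records) is an assumption; it is one possible reading of the steps above, not the implementation described in this paper.

```python
import sqlite3

# Hypothetical transaction data: (community, theme, practice, source).
# Blank practices and exact duplicates stand in for noisy field records.
TRANSACTIONS = [
    ("Community A", "folk medicine", "Herbal paste for wounds", "interview"),
    ("Community A", "folk medicine", "Herbal paste for wounds", "interview"),
    ("Community B", "folk medicine", "Root decoction for fever", "journal"),
    ("Community B", "folk songs", "Harvest song", "interview"),
    ("Community C", "folk medicine", "", "newspaper"),
]


def build_warehouse(rows):
    """Extract, transform and load `rows` into an in-memory warehouse."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE transaction_data "
        "(community TEXT, theme TEXT, practice TEXT, source TEXT)"
    )
    conn.executemany("INSERT INTO transaction_data VALUES (?, ?, ?, ?)", rows)
    # Derived data: filter out blank practices and duplicate records.
    conn.execute(
        "CREATE TABLE derived_data AS "
        "SELECT DISTINCT community, theme, practice, source "
        "FROM transaction_data WHERE TRIM(practice) <> ''"
    )
    # Data mart: a theme-specific subset of the derived data.
    conn.execute(
        "CREATE TABLE mart_folk_medicine AS "
        "SELECT community, practice, source FROM derived_data "
        "WHERE theme = 'folk medicine'"
    )
    conn.commit()
    return conn


def rollup_by_community(conn):
    """OLAP-style roll-up: number of practices per community in the mart."""
    cursor = conn.execute(
        "SELECT community, COUNT(*) FROM mart_folk_medicine "
        "GROUP BY community ORDER BY community"
    )
    return cursor.fetchall()


if __name__ == "__main__":
    warehouse = build_warehouse(TRANSACTIONS)
    for community, count in rollup_by_community(warehouse):
        print(community, count)
```

A unit test in the sense of step (vi) could assert, for instance, that `derived_data` holds fewer rows than `transaction_data` once blank and duplicate records have been dropped.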

Management of Data Warehouse

The management of quality and consistency of data is the most vital issue because of the degree of size and complexity, i.e., data sources, data staging area and operational data sources.17 The best performance of a data warehouse can be achieved through quality control of data. Before starting the data warehouse project, one has to develop the related documents given below:18

• Define the goal of the warehouse.
• Define the queries.
• Define the knowledge selection criteria.
• Define the meta data of the data.

Apart from these, planners should also keep in mind other related factors that affect the management of the warehouse, i.e., the size and characteristics of the database, the number of nodes (or users), the network capacity or requirement, and the complexity and cleanliness of the data.
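To illustrate what defining the meta data and controlling data quality might involve in practice, here is a small Python sketch. The fields of `TableMetadata`, the completeness measure and the sample rows are all assumptions chosen for illustration; the paper does not prescribe a particular meta data schema or quality metric.

```python
from dataclasses import dataclass, field


@dataclass
class TableMetadata:
    """Hypothetical meta data record describing one warehouse table."""
    name: str
    source: str                                    # where the data came from
    columns: list = field(default_factory=list)    # documented column names
    required: list = field(default_factory=list)   # columns that must be filled


def is_filled(value):
    """True if `value` is neither None nor an empty/blank string."""
    return value is not None and str(value).strip() != ""


def completeness(rows, meta):
    """Share of rows in which every required column has a filled value."""
    if not rows:
        return 1.0
    ok = sum(
        1 for row in rows
        if all(is_filled(row.get(col)) for col in meta.required)
    )
    return ok / len(rows)


def unknown_columns(rows, meta):
    """Column names present in the data but missing from the meta data."""
    seen = set()
    for row in rows:
        seen.update(row)
    return sorted(seen - set(meta.columns))


# Invented example: a table of folk-medicine records.
META = TableMetadata(
    name="folk_medicine",
    source="field interviews",
    columns=["community", "practice", "informant"],
    required=["community", "practice"],
)
ROWS = [
    {"community": "Community A", "practice": "Herbal paste", "informant": "elder"},
    {"community": "Community B", "practice": "", "informant": "healer"},
    {"community": "Community C", "practice": "Root decoction", "region": "hills"},
    {"community": None, "practice": "Harvest song"},
]

if __name__ == "__main__":
    print(completeness(ROWS, META))
    print(unknown_columns(ROWS, META))
```

Checks of this kind could be run at load time, so that incomplete or undocumented records are flagged before they reach the data marts.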

Conclusion

Data warehouse techniques would make it possible to conserve folklore, folk culture, traditions related to natural heritage, etc. They will also help in avoiding or rejecting patents and patent claims,3,19 if any.

Acknowledgement

I am thankful to Dr U Dhar, Director; Dr N A Farooquee, IKS Core Head; Dr B S Majila and Dr C P Kala, G B Pant Institute of Himalayan Environment and Development, Kosi-Katarmal, Almora, for providing facilities, support and encouragement. I am also thankful to Dr P C Pande, Department of Botany, Kumaon University, S S J Campus, Almora, for providing literature and encouragement.

References

1 Michael Read, A Definition of Data Warehousing, Viewed at http://idm.internet.com/features/datawarehousing.html

2 Gupta V K, An approach for establishing a TKDL, Viewed at http://www.Patentmatics.com/pub2002/pub69.htm.

3 Tripathi S K, Traditional knowledge: Its Significance and implications, Indian J Traditional Knowledge, 2(2), 2003, 99-106.

4 Devinder, Corporate theft of ‘Traditional Knowledge’, Biopiracy by another name, Viewed at http://www.himalmag.com/2002/augest/opinion-1.htm.

5 Gadgil M, People’s Biodiversity register, Recording India’s Wealth, Amruth, Oct, 1996, 1-16.

6 Mehrotra Shanta, Role of Science & Technology in adding values to the crude herbal drugs and their products, National Seminar on New Millennium Strategies for Quality, safety and GMPs of Herbal Drugs/Products, NBRI, Lucknow, 2003, 43-48.

7 Gautam P L and Singh A K, Agro-biodiversity and Intellectual Property Rights (IPR) related issues, Indian J Plant Genet Resources, 11(2), 1998, 129-151.

8 Gupta V K, An approach for establishing Traditional Knowledge Digital Library, J Intellectual Property Rights, 5(6), 2000, 307-319.

9 Gao Junling and Xiaojin Dong, Data Mining in Customer Relation Management, Viewed at http://www.Pafis.shh.fi/

10 Tom Haughey, Data Warehouse Architecture in an Internet Age, Viewed at http://www.datawarehouse.com/article/

11 Data Warehousing Services, Viewed at http://www.donmeyer.com/art5.html

12 Defining Data Warehousing, What is it and who needs it? Viewed at http://www.dmreview.com/whitepaper/dwr.pdf

13 Prabhu C S R, Data Warehousing Concepts, Techniques, Products and Application, (PH of India Private Limited, New Delhi), 2001.

14 Silvon Software, Inc (White Paper), Defining Data Warehousing, what is it and who needs it? Viewed at http://www.dmreview.com/whitepaper/dwr.pdf

15 OLAP application, Viewed at http://www.olapreport.com/application.htm

16 Munshi Asim Kumar, Testing a Data Warehouse Application, Viewed at http://www.datawarehouse.com/article/

17 Vassiliadis Panos, Quix Christoph, Vassiliou Yannis, Jarke Matthias, Data Warehouse Process Management, Viewed at http://www-i5.informatik.rwth-aachen.del/~quix/papers/is 2001.pdf

18 Sid Adelman, Oates Joe, Data Warehouse Project Management, Viewed at http://www.dmreview.com/

19 Rajagopalan C R, Indigenous Knowledge/CFS experience, Indian J Traditional Knowledge, 2(4), 2003, 313-320.