Representing a Computer Science Research Organization on the ACM Computing Classification System

Boris Mirkin, School of Computer Science and Information Systems, Birkbeck College, University of London, United Kingdom

Susana Nascimento and Luís Moniz Pereira, Computer Science Department and Centre for Artificial Intelligence (CENTRIA), Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal


Page 1: Representing a Computer Science Research Organization on the ACM Computing Classification System

Representing a Computer Science Research Organization on the ACM Computing Classification System

Boris Mirkin
School of Computer Science and Information Systems, Birkbeck College, University of London, United Kingdom

Susana Nascimento and Luís Moniz Pereira
Computer Science Department and Centre for Artificial Intelligence (CENTRIA), Faculdade de Ciências e Tecnologia, Universidade Nova de Lisboa, Portugal

Page 2: Representing a Computer Science Research Organization on the ACM Computing Classification System


Motivation: an Objective Portrayal of Research Organisation as a Whole

Give an overview of the structure of the scientific subjects being developed in the organisation.

Position the organisation over the ACM-CCS ontology.

Assess scientific subjects that do not fit well into ACM-CCS: these are potential growth points or other breakthrough developments.

Plan research restructuring and investment.

Give an overview of a scientific field being developed in a country, with a quantitative assessment of controversial areas, e.g. those where the level of activity is not sufficient or where the level of activity exceeds the level of results.

Page 3: Representing a Computer Science Research Organization on the ACM Computing Classification System


ACM-CCS: Classification 1998 - level 1

A. General Literature

B. Hardware

C. Computer Systems Organization

D. Software

E. Data

F. Theory of Computation

G. Mathematics of Computing

H. Information Systems

I. Computing Methodologies

J. Computer Applications

K. Computing Milieux

[Figure: the top level of the ACM-CCS tree, with root CS and children A to K]

Page 4: Representing a Computer Science Research Organization on the ACM Computing Classification System


Cluster-Lift Method

Express the Research Activities of a CS Organization (RAO) as a set of CLUSTERS of ACM-CCS subjects:

Captures the RAO in a straightforward way.

Gives away no information about individual members or teams.

Can be implemented on different levels of the taxonomy.

Needs good clustering techniques.

MAP individual clusters to ACM-CCS and GENERALISE them:

A new approach.

Extendable to other ontologies and activities.

Page 5: Representing a Computer Science Research Organization on the ACM Computing Classification System


Electronic Survey Tool for Data Collection

Page 6: Representing a Computer Science Research Organization on the ACM Computing Classification System


Generic Survey Output: fuzzy memberships over all subjects in the 3rd layer of ACM-CCS

Page 7: Representing a Computer Science Research Organization on the ACM Computing Classification System


Fuzzy Similarity between ACM-CCS Subjects

Contribution by a respondent: [f(i)] is the membership vector over all subjects i in the 3rd layer of ACM-CCS, taken from the survey.

A(i,j) = f(i)f(j), the product, for all ACM-CCS 3rd layer subjects i and j.

The matrices A(i,j) are summed up over all individuals, weighted according to their span ranges.

The fuzzy similarity measure between two ACM-CCS subjects is proportional to the number and importance of research activities in both subjects (details can be presented).
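The summation above can be sketched in a few lines of Python. This is only an illustration: the function name `fuzzy_similarity` is hypothetical, and because the slide does not give the span-range weighting formula, the per-respondent weights are passed in as a parameter (uniform by default).

```python
import numpy as np

def fuzzy_similarity(memberships, weights=None):
    """Sum the weighted outer products f f^T over all respondents.

    memberships: one row per respondent, one column per ACM-CCS subject.
    weights: one weight per respondent (ASSUMPTION: the slide's span-range
             weighting is not specified, so uniform weights are the default).
    """
    F = np.asarray(memberships, dtype=float)
    if weights is None:
        weights = np.ones(F.shape[0])
    w = np.asarray(weights, dtype=float)
    # (F * w[:, None]).T @ F equals sum_r w_r * outer(f_r, f_r)
    return (F * w[:, None]).T @ F

# Two hypothetical respondents over three subjects.
F = [[0.5, 0.5, 0.0],
     [0.0, 1.0, 0.0]]
A = fuzzy_similarity(F)
```

Here `A[0, 1]` is 0.25, since only the first respondent works in both of the first two subjects, while `A[1, 1]` is 1.25 because both respondents contribute to the second subject.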

Page 8: Representing a Computer Science Research Organization on the ACM Computing Classification System


Building Overlapping Subject Clusters

Additive Clustering with Iterative Extraction (ADDI-S)

Given the similarity matrix, the additive clustering problem is to find, one by one, K clusters and their intensity weights that minimize the sum of squared errors.

Interpretable parameters: the cluster intensity and its contribution to the explanation of the data scatter.

Leads to tight clusters:

A subject i belongs to a cluster S if its average similarity to S is higher than half of the average similarity within S.

The cluster is also well separated from the rest, because for each entity j ∉ S, its average similarity with S is less than that threshold.

Computationally feasible.
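The one-by-one extraction can be illustrated with a rough Python sketch. It is not claimed to be the authors' ADDI-S implementation: it is a plain greedy local search on the least-squares one-cluster criterion, it keeps the diagonal of the similarity matrix, and the names `extract_cluster` and `additive_clusters` are hypothetical. For a fixed cluster S, the least-squares intensity is the average similarity within S, and the explained part of the data scatter is the squared sum of within-cluster similarities divided by |S| squared.

```python
import numpy as np

def extract_cluster(A, seed):
    """Greedy search for one cluster maximising the explained scatter."""
    n = A.shape[0]
    s = np.zeros(n, dtype=bool)
    s[seed] = True

    def explained(mask):
        m = mask.sum()
        if m == 0:
            return 0.0
        return A[np.ix_(mask, mask)].sum() ** 2 / m ** 2

    best = explained(s)
    improved = True
    while improved:
        improved = False
        for i in range(n):
            t = s.copy()
            t[i] = not t[i]          # try adding or removing entity i
            g = explained(t)
            if g > best + 1e-12:     # accept only strict improvements
                s, best, improved = t, g, True
    lam = A[np.ix_(s, s)].sum() / s.sum() ** 2   # least-squares intensity
    return s, lam, best

def additive_clusters(A, k):
    """Extract up to k clusters one by one from the residual similarities."""
    A = np.asarray(A, dtype=float)
    total = (A ** 2).sum()           # data scatter
    residual = A.copy()
    found = []
    for _ in range(k):
        candidates = [extract_cluster(residual, seed) for seed in range(len(A))]
        s, lam, g = max(candidates, key=lambda c: c[2])
        if lam <= 0:
            break
        found.append((np.flatnonzero(s).tolist(), lam, g / total))
        residual = residual - lam * np.outer(s, s)
    return found

# Hypothetical 4-subject similarity matrix with two obvious groups.
A = np.array([[1.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0],
              [0.0, 0.0, 0.5, 0.5],
              [0.0, 0.0, 0.5, 0.5]])
clusters = additive_clusters(A, 2)
```

Each entry of `clusters` holds the member indices, the intensity, and the share of the data scatter explained, which corresponds to the "contribution" percentages reported in the case study.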

Page 9: Representing a Computer Science Research Organization on the ACM Computing Classification System


Generalising Subject Clusters mapped onto ACM-CCS: good and bad cases

Blue cluster is tight, all topics are in one ACM-CCS subject.

Red cluster is dispersed over many ACM-CCS subjects.

[Figure: the blue and red clusters marked on the ACM-CCS tree rooted at CS]

Page 10: Representing a Computer Science Research Organization on the ACM Computing Classification System


Elementary Structures

The set of subject clusters, their ‘head subjects’, ‘gaps’ and ‘offshoots’ constitutes what can be called the profile of the organization under study.

The total count of ‘head subjects’, ‘gaps’, and ‘offshoots’, each type weighted accordingly, can be used for scoring the extent of the fit between a research grouping and the ontology.

Lifting a Subject Cluster onto the Ontology

Page 11: Representing a Computer Science Research Organization on the ACM Computing Classification System


Parsimonious Lifting of Subject Cluster onto ACM-CCS

Plural Solutions: which one is better?

Mapping (B) is better than (A) if ‘gaps’ are much cheaper than additional ‘head subjects’.
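The trade-off between gaps and head subjects can be made concrete with a small two-level sketch. Everything here is an assumption for illustration: the function `lift_cluster` is hypothetical, the penalty weights are invented, the lifting is restricted to one step (from 2nd level topics to their 1st level parents, ignoring the root), a "gap" is taken to be a child of a head subject that is absent from the cluster, and an "offshoot" is a cluster topic not covered by any head subject.

```python
def lift_cluster(cluster, children, head=1.0, gap=0.1, offshoot=0.5):
    """Decide, per top-level category, whether lifting to it is cheaper.

    cluster:  set of 2nd level topic codes such as "F1"
    children: dict mapping a 1st level letter to all its 2nd level topics
    head, gap, offshoot: HYPOTHETICAL penalty weights
    """
    by_parent = {}
    for topic in cluster:
        by_parent.setdefault(topic[0], set()).add(topic)

    heads, gaps, offshoots, penalty = [], [], [], 0.0
    for parent in sorted(by_parent):
        members = by_parent[parent]
        missing = sorted(set(children[parent]) - members)
        cost_lift = head + gap * len(missing)   # one head subject plus its gaps
        cost_stay = offshoot * len(members)     # leave the topics as offshoots
        if cost_lift < cost_stay:
            heads.append(parent)
            gaps.extend(missing)
            penalty += cost_lift
        else:
            offshoots.extend(sorted(members))
            penalty += cost_stay
    return {"heads": heads, "gaps": gaps, "offshoots": offshoots, "penalty": penalty}

# Assumed 2nd level topics of D and F (General and Miscellaneous entries left out).
children = {"D": ["D1", "D2", "D3", "D4"], "F": ["F1", "F2", "F3", "F4"]}
result = lift_cluster({"F1", "F3", "F4", "D3"}, children)
```

With these weights the cluster {F1, F3, F4, D3} is lifted to head subject F with gap F2, and D3 stays an offshoot. Making gaps more expensive relative to head subjects shifts the decision, which is the sense in which one mapping can be preferable to another.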

Page 12: Representing a Computer Science Research Organization on the ACM Computing Classification System


Real Case Study: 2006 Survey of CS of FCT-Universidade Nova de Lisboa

Survey conducted in our Department in 2006

Participation: 30 individuals

Each one supplied three ACM-CCS 2nd level topics

26 of 59 topics at ACM-CCS 2nd level are covered

Additive clustering algorithm: ADDI-S

Six subject clusters found:

cl1 = {F1, F3, F4, D3} (contribution 27.08%)

cl2 = {C2, D1, D2, D3, D4, F3, F4, H2, H3, H5, I2, I6} (contribution 17.34%)

cl3 = {C2, C3, C4} (contribution 5.13%)

cl4 = {F4, G1, H2, I2, I3, I4, I5, I6, I7} (contribution 4.42%)

cl5 = {E1, F2, H2, H3, H4} (contribution 4.03%)

cl6 = {C4, D1, D2, D4, K6} (contribution 4.00%)
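As a quick check on these figures, the clusters can be encoded directly and a few summary numbers derived from them. The snippet below is only a convenience sketch; the variable names are hypothetical and the data are copied from the slide.

```python
from collections import Counter

# Clusters and contributions (in percent) as listed on the slide.
clusters = {
    "cl1": ({"F1", "F3", "F4", "D3"}, 27.08),
    "cl2": ({"C2", "D1", "D2", "D3", "D4", "F3", "F4",
             "H2", "H3", "H5", "I2", "I6"}, 17.34),
    "cl3": ({"C2", "C3", "C4"}, 5.13),
    "cl4": ({"F4", "G1", "H2", "I2", "I3", "I4", "I5", "I6", "I7"}, 4.42),
    "cl5": ({"E1", "F2", "H2", "H3", "H4"}, 4.03),
    "cl6": ({"C4", "D1", "D2", "D4", "K6"}, 4.00),
}

# Total share of the data scatter explained by the six clusters.
total_contribution = round(sum(c for _, c in clusters.values()), 2)

# Distinct topics that occur in at least one cluster.
clustered_topics = set().union(*(topics for topics, _ in clusters.values()))

# Topics shared by three or more clusters (the clusters overlap).
counts = Counter(t for topics, _ in clusters.values() for t in topics)
widely_shared = sorted(t for t, n in counts.items() if n >= 3)
```

The six clusters together account for 62.00% of the data scatter and involve 24 distinct topics; F4 and H2 each appear in three clusters.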

Page 13: Representing a Computer Science Research Organization on the ACM Computing Classification System


Profile of DI-FCT-UNL (2006 Survey)

[Figure: profile of DI-FCT-UNL on the ACM-CCS tree rooted at CS, showing 1st level subjects A to K and the 2nd level topics E1 to E5, G1 to G4, I1 to I7 and K1 to K8. Legend: head subject, offshoot, gap. Cluster head subjects labelled in the figure: C. Computer Systems Organization; F. Theory of Computation; D. Software; I. Computing Methodologies; H. Information Systems; D. Software and H. Information Systems.]

Page 14: Representing a Computer Science Research Organization on the ACM Computing Classification System


Analysis

The most contributing cluster, with head subject F. Theory of Computation, comprises a very tight group.

The next contributing cluster has two head subjects, D. Software and H. Information Systems, and several offshoots among the other head subjects, indicating that this cluster should be the structure underlying a certain unity of the department.

There are only 3 offshoots outside the department's head subjects:

E1. Data Structures, from H. Information Systems;

G1. Numerical Analysis, from I. Computing Methodologies;

K6. Management of Computing and Information Systems, from D. Software.

As all of them seem natural, they could potentially be included in the list of collateral links of the ACM ontology.
