Exploring the "Betrothed Lovers" and other literary works


DH Workshop in memory of Emanuele Pianta, Trento, 10 December 2013: a one-day workshop on "Digital Humanities: Current state and Future challenges", part of the activities of the Digital Humanities group at FBK. Presentation: Exploring the “Betrothed Lovers” and other literary works, by Andrea Bolioli and Riccardo Tasso.


Exploring the “Betrothed Lovers” and other literary works

Andrea Bolioli, Riccardo Tasso

“If you enjoy it, you understand it”

Our company: Cross Library

Spin-off of FBK (Trento) and CELI (Torino)

Digital Humanities and School

Our claim: If you enjoy it, you understand it!

Our product: the "crunched" book

www.cross-library.com

A prototype for literature: I promessi sposi 2.0

«The Betrothed», by Alessandro Manzoni www.crunchedbook.com

Exploring literary works

NARRATIVE SEQUENCES

CHARACTERS SOCIAL NETWORKS

LOCATIONS

A research project: Sèduco

Sharing Educational Content

www.seduco.it

Partners: Cross Library, OpenContent, FBK, IPRASE and 4 high schools

«Exploring the Betrothed Lovers», A. Bolioli, M. Casu, M. Lana, R. Roda, Computational Models of Narrative workshop (CMN 2013), Hamburg, 4-6 August 2013

HLT tasks for literature processing

• Automatic text segmentation: narrative sequences, quoted speech, other text units

• Entity mention annotation: speakers, mentions of characters (agents) and locations (not only GPEs, e.g. "castello dell'Innominato" - castle of the Unnamed, "osteria della Luna piena" - tavern of the Full Moon)

• Quoted speech attribution

The Annotation Framework

Our Annotation Model

An annotation is a span of text characterized by a <begin, end> pair

An annotation may have attributes

An annotation may be classified

An annotation may be related
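The four properties of the model can be sketched as a small data structure. This is only an illustration, not the framework's actual code: the class name `Annotation`, the field names, the half-open `[begin, end)` character offsets and the sample values are all assumptions.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Annotation:
    """A span of text plus optional attributes, class and relations."""
    begin: int                 # offset of the first character (assumed inclusive)
    end: int                   # offset just past the last character (assumed exclusive)
    cls: str = "Annotation"    # classification, e.g. "Sequence" or "Fragment"
    attributes: Dict[str, str] = field(default_factory=dict)
    # Outgoing relations as (relation name, target id) pairs.
    relations: List[Tuple[str, str]] = field(default_factory=list)

    def text(self, document: str) -> str:
        """Return the characters of `document` covered by this span."""
        return document[self.begin:self.end]

    def contains(self, other: "Annotation") -> bool:
        """True if `other` lies entirely inside this annotation's span."""
        return self.begin <= other.begin and other.end <= self.end


# Illustrative values only; they are not taken from the annotated corpus.
document = "C'era una volta un pezzo di legno."
start = document.index("pezzo di legno")

sequence = Annotation(0, len(document), cls="Sequence",
                      relations=[("actor", "pinocchio")])
fragment = Annotation(start, start + len("pezzo di legno"), cls="Fragment",
                      attributes={"type": "narration"})

print(fragment.text(document))
print(sequence.contains(fragment))
```

Keeping only offsets in the annotation, and resolving them against the text on demand, matches the separation between annotations and text that the following slides describe.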

Object Store

An annotation is persisted:

“A graph database stores data in a graph, the most generic of data structures, capable of elegantly representing any kind of data in a highly accessible way”

Text Store

Annotations, annotations, annotations... But what about text?


The annotation query engine

And (finally) you can search and find annotations

Choose a MAIN annotation filter:

{ "main": { "@class": "Sequence" } }

Returns all the annotations whose class is Sequence.

Specify the annotation's attributes:

{ "main": { "@class": "Fragment", "type": "speech" } }

Returns all the annotations whose class is Fragment and whose (sub)type is "speech".

Specify the annotation's relations:

{ "main": { "@class": "Sequence", "out('actor')": "pinocchio", "out('place')": "paese_balocchi" } }

Returns all the annotations whose class is Sequence, with an actor relation to "pinocchio" and a place relation to "paese_balocchi".

Choose a second-level filter:

{ "main": { "@class": "Sequence" }, "filter": { "@class": "Fragment", "type": "speech" } }

Returns all the annotations whose class is Sequence and which CONTAIN a given annotation (here, a speech Fragment).

Full text search:

{ "main": { "@class": "Sequence", "out('actor')": "pinocchio" }, "@text": "storia" }

Returns all the annotations whose class is Sequence, with an actor relation to "pinocchio", and whose text contains the keyword "storia".
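To make the semantics of the query shapes above concrete, here is a minimal in-memory sketch that evaluates them. It is not the engine presented in the slides, which runs against the object and text stores; the record layout, the `matches` and `run_query` helpers, and the sample text are assumptions made for illustration.

```python
import re
from typing import Any, Dict, List

# Hypothetical sample text and annotation records (not from the real corpus).
PART1 = "Pinocchio racconta la sua storia. "
PART2 = "Geppetto lavora il legno."
TEXT = PART1 + PART2

SEQ1 = {"@class": "Sequence", "begin": 0, "end": len(PART1),
        "out": {"actor": ["pinocchio"], "place": ["paese_balocchi"]}}
SEQ2 = {"@class": "Sequence", "begin": len(PART1), "end": len(TEXT),
        "out": {"actor": ["geppetto"]}}
SPEECH = {"@class": "Fragment", "type": "speech", "begin": 10, "end": 32, "out": {}}
ANNOTATIONS: List[Dict[str, Any]] = [SEQ1, SEQ2, SPEECH]

# Keys such as out('actor') name an outgoing relation.
REL_KEY = re.compile(r"^out\('([^']+)'\)$")


def matches(ann: Dict[str, Any], spec: Dict[str, str]) -> bool:
    """Check one annotation against a filter: class, attributes, relations."""
    for key, expected in spec.items():
        rel = REL_KEY.match(key)
        if rel:
            if expected not in ann.get("out", {}).get(rel.group(1), []):
                return False
        elif ann.get(key) != expected:
            return False
    return True


def run_query(query: Dict[str, Any],
              annotations: List[Dict[str, Any]],
              text: str) -> List[Dict[str, Any]]:
    """Apply the main filter, then the optional containment and text filters."""
    inner = query.get("filter")
    keyword = query.get("@text")
    results = []
    for ann in annotations:
        if not matches(ann, query["main"]):
            continue
        if inner is not None and not any(
            other is not ann and matches(other, inner)
            and ann["begin"] <= other["begin"] and other["end"] <= ann["end"]
            for other in annotations
        ):
            continue
        if keyword is not None and keyword not in text[ann["begin"]:ann["end"]]:
            continue
        results.append(ann)
    return results


hits = run_query({"main": {"@class": "Sequence", "out('actor')": "pinocchio"},
                  "@text": "storia"}, ANNOTATIONS, TEXT)
print(len(hits))
```

In this sketch the "filter" clause is read as span containment and "@text" as a plain substring test on the covered text; a real engine would presumably delegate both to indexes in the stores.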

Crunched Book SNA

Actors Graph

Pinocchio Actors (1)

Pinocchio Actors (2)

Speakers Graph

Promessi Sposi Speakers

Pinocchio Speakers

Romeo and Juliet

Crunched Book SNA (speakers)

                         Promessi Sposi   Pinocchio   Romeo & Juliet
nodes                    86               62          35
edges                    182              104         236
diameter                 6                6           3
density                  0.061            0.055       0.397
connected components     1                1           1
communities              6                11          3
clustering coefficient   0.528            0.614       0.813
avg. path length         2.814            2.395       1.64
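As a reading aid for the density row, the sketch below applies the standard density formula for a simple undirected graph, 2E / (N(N - 1)). The slides do not say which tool or definition produced the figures, so treating the speaker graphs as simple and undirected is an assumption; the function name `density` is ours.

```python
def density(nodes: int, edges: int) -> float:
    """Density of a simple undirected graph: 2E / (N * (N - 1))."""
    if nodes < 2:
        return 0.0
    return 2 * edges / (nodes * (nodes - 1))


# Node and edge counts taken from the table above.
for name, n, e in [("Pinocchio", 62, 104), ("Romeo & Juliet", 35, 236)]:
    print(f"{name}: {density(n, e):.3f}")
# Pinocchio: 0.055
# Romeo & Juliet: 0.397
```

Under this assumption the Pinocchio and Romeo & Juliet values are reproduced. For Promessi Sposi, 86 nodes and 182 edges give about 0.050 rather than the reported 0.061, so that column may rest on a different count or definition.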

Future work

Other crunched books (in January):

«Le avventure di Pinocchio», «Romeo and Juliet»

Next DH projects:

• Annotating and visualizing ancient places in Latin literature

• A multilingual work (Latin, English, Italian and Chinese)

Thank You!

@CrossLib

http://www.cross-library.com
