Towards Digitizing Scholarly Communication
Sören Auer
University of Bonn & Fraunhofer IAIS


Publishing 6600 BCE
Jiahu symbols: 16 distinct markings on prehistoric artifacts found in Jiahu, a Neolithic Peiligang culture site in Henan, China

Researchers discuss whether the Jiahu symbols already represent some form of systematic writing.
Later these symbols evolved into oracle bone script, the oldest known member and ancestor of the Chinese family of scripts.

Publishing 2000 BCE
Cursive hieroglyphs from the Papyrus of Ani (1250 BCE)

The Papyrus of Ani is a papyrus manuscript with cursive hieroglyphs and color illustrations created c. 1250 BCE, in the 19th dynasty of the New Kingdom of ancient Egypt. Egyptians compiled an individualized book for certain people upon their death, called the Book of Going Forth by Day, more commonly known as the Book of the Dead, typically containing declarations and spells to help the deceased in their afterlife. The Papyrus of Ani is the manuscript compiled for the Theban scribe Ani.

The orientation of cursive hieroglyphs is not constant: they read right to left or left to right depending on the context.

Publishing 380 BCE
Papyrus Oxyrhynchus, with a fragment of Plato's Republic

The Republic (Greek: Πολιτεία, Politeia; Latin: De Re Publica) is a Socratic dialogue, written by Plato around 380 BC, concerning the definition of justice (δικαιοσύνη), the order and character of the just city-state and the just man; for this reason, ancient readers used the name On Justice as an alternative title.

Plato's best-known work, it has proven to be one of the world's most influential works of philosophy and political theory, both intellectually and historically. In it, Socrates, along with various Athenians and foreigners, discusses the meaning of justice and examines whether or not the just man is happier than the unjust man by considering a series of different cities coming into existence "in speech".

12th century publishing
Codex Gigas (the largest extant medieval manuscript) was created in the Benedictine monastery of Podlažice in Bohemia (now Czech Republic).

It is also known as the Devil's Bible because of a large illustration of the devil on the inside and the legend surrounding its creation.

It includes a unique picture of the devil, about 50 cm tall. Directly opposite the devil is a full-page depiction of the kingdom of heaven, thus juxtaposing contrasting images of Good and Evil.

Scientific publishing in the 17th century
One of the earliest research journals: Philosophical Transactions of the Royal Society

CC BY Henry Oldenburg

Scientific publishing today
Mainly based on PDF
Is only partially machine-readable
Does not preserve structure
Does not allow embedding of semantics
Does not facilitate interactivity/dynamicity/repurposing

Has it changed much?
In terms of distribution: YES
Almost zero cost of copying and distribution (the whole history of publishing is mainly a history of the reduction of marginal costs of publishing)

In terms of method/representation: NO
Articles are fixed successions of characters and words
Static in terms of presentation, content, granularity

Researchers spend (most of) their time on:
Encoding their findings in articles
Decoding other researchers' findings from articles
Finding related work
Getting an overview of the state of the art

We need to develop means to make scholarly communication more efficient and effective.

New possibilities in a Digital World
Machine-readability
Semantic representation
Dynamic content, interactive examples
Integration of multimedia content
Rich interlinking with context (related work, calls, reviews, comments/discussions)
Integration of rich metadata (provenance, licensing)
Interactive collaboration

Machine-readability

In PDFs the structure of the documents is lost
Headings, paragraphs, tables, references etc. are not recognizable anymore
Semantics can only be added as metadata on a per-document level

Semantic Representation
In addition to 5-star data (http://5stardata.info) we need 5-star documents:
Machine-readable
Semantics-aware
Interlinked

David Shotton: The Five Stars of Online Journal Articles, D-Lib Magazine (2012), http://www.dlib.org/dlib/january12/shotton/01shotton.html

Internal: Semantic Description of Scientific Content

limes-paper describes appr1 .
appr1 a approach .
appr1 for Link_Discovery .
appr1 hasProp lossless .
...

limes-paper describes impl1 .
impl1 a implementation .
impl1 implements appr1 .
impl1 language Java .
...

limes-paper describes eval1 .
eval1 a evaluation .
eval1 evaluates impl1 .
eval1 uses DBpedia .
...

Facilitates querying for all link discovery approaches having certain properties or implementations thereof in a certain language using a certain dataset.
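
The kind of query described here can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the statements are held as plain (subject, predicate, object) tuples instead of real RDF, the property value is spelled `lossless`, and the helper names `subjects` and `approaches_with` are hypothetical, not part of LIMES or of any library.

```python
# Statements from the slide as (subject, predicate, object) tuples.
# Plain tuples stand in for RDF triples; no RDF library is assumed.
TRIPLES = {
    ("limes-paper", "describes", "appr1"),
    ("appr1", "a", "approach"),
    ("appr1", "for", "Link_Discovery"),
    ("appr1", "hasProp", "lossless"),
    ("limes-paper", "describes", "impl1"),
    ("impl1", "a", "implementation"),
    ("impl1", "implements", "appr1"),
    ("impl1", "language", "Java"),
    ("limes-paper", "describes", "eval1"),
    ("eval1", "a", "evaluation"),
    ("eval1", "evaluates", "impl1"),
    ("eval1", "uses", "DBpedia"),
}


def subjects(predicate, obj, triples=TRIPLES):
    """Return every subject s for which (s, predicate, obj) is stated."""
    return {s for (s, p, o) in triples if p == predicate and o == obj}


def approaches_with(task, prop, language, dataset, triples=TRIPLES):
    """Hypothetical query: approaches for `task` that have `prop` and an
    implementation in `language` that was evaluated using `dataset`."""
    candidates = (
        subjects("a", "approach", triples)
        & subjects("for", task, triples)
        & subjects("hasProp", prop, triples)
    )
    in_language = subjects("language", language, triples)
    evaluations = subjects("uses", dataset, triples)
    evaluated = {
        o for (s, p, o) in triples if p == "evaluates" and s in evaluations
    }
    usable_impls = in_language & evaluated
    return {
        a for a in candidates
        if any((impl, "implements", a) in triples for impl in usable_impls)
    }


result = approaches_with("Link_Discovery", "lossless", "Java", "DBpedia")
print(sorted(result))
```

With a real triple store the same question would typically be phrased as a SPARQL query; the set intersections above only mimic that join logic on a toy scale.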

External: Rich Interlinking with Related Work, Calls, Reviews, Discussions, ...

Three approaches for digitizing scholarly communication

Linked Research: enabling semantic authoring, publishing, discovery

SlideWiki: courseware authoring and translation

OpenResearch: collaborative management of scholarly communication metadata

dokie.li: client-side editor for decentralised article publishing, annotations and social interactions

http://Dokie.li

Sarven Capadisli

A holistic view on scientific publishing

Sparklines: adding small interactive inline charts

Linked Research & dokie.li Features
Documents are human- and machine-friendly.
Using plain old semantic HTML markup, with further semantic annotations using microformats and RDF.
All kinds of interactive content can be embedded into the HTML5 documents, e.g. JavaScript apps, code, videos, audio, interactive visualizations.
Different views, e.g. ACM, LNCS, W3C-ED, Slideshow, Native.
Builds on Linked Data Platform, Solid and Linked Data Notifications to realize truly decentralized authoring & publishing workflows.
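
To illustrate what "human and machine-friendly" can mean in practice, the following sketch reads headings and annotated properties straight out of a small HTML fragment using only Python's standard library. The fragment, the use of RDFa-style `property` attributes with schema.org terms, and the class name `StructureExtractor` are all assumptions made for this example; it is not how dokie.li itself processes documents.

```python
from html.parser import HTMLParser

# Hypothetical article fragment; markup and vocabulary terms are
# illustrative and not taken from an actual dokie.li document.
ARTICLE = """
<article typeof="schema:ScholarlyArticle">
  <h1 property="schema:name">Towards Digitizing Scholarly Communication</h1>
  <section>
    <h2>Introduction</h2>
    <p property="schema:abstract">Articles should be machine-readable.</p>
  </section>
</article>
"""

HEADING_TAGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class StructureExtractor(HTMLParser):
    """Collect headings and `property` annotations from semantic HTML.

    Assumes well-formed markup without void elements such as <br>.
    """

    def __init__(self):
        super().__init__()
        self.headings = []    # (level, text) in document order
        self.properties = {}  # property name -> text content
        self._open = []       # stack of (tag, property or None, text parts)

    def handle_starttag(self, tag, attrs):
        self._open.append((tag, dict(attrs).get("property"), []))

    def handle_data(self, data):
        # Text belongs to every element that is currently open.
        for _tag, _prop, parts in self._open:
            parts.append(data)

    def handle_endtag(self, tag):
        open_tag, prop, parts = self._open.pop()
        text = " ".join("".join(parts).split())  # normalise whitespace
        if open_tag in HEADING_TAGS:
            self.headings.append((int(open_tag[1]), text))
        if prop:
            self.properties[prop] = text


extractor = StructureExtractor()
extractor.feed(ARTICLE)
extractor.close()
print(extractor.headings)
print(extractor.properties)
```

The point of the sketch is the contrast with PDF: because headings and annotations survive in the markup, a few lines of generic parsing code recover them without any layout heuristics.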

Interlinking a research article, call for contributions and workshops, and proceedings
@prefix sioc: .
@prefix schema: .
@prefix bibo: .

sioc:reply_of ; schema:hasPart .

sioc:reply_of ; bibo:citedBy .

schema:hasPart .

bibo:uri .

Comparison of scientific authoring and publishing approaches
ACaA: Access control and attribution
AtA: Adaptation to audiences
CaF: Commentary and feedback
DAaP: Decentralised authoring and publishing
DI: Data integration
DVaM: Different views and media
EI: Entity identifiers
FaI: Feedback and interactions
HaMR: Human and machine-readability
IAaPW: Integrated authoring and publication workflow
IC: Interactive content
IM: Impact metrics
IoS: Integration of semantics
M: Multimedia
PaA: Provenance and accountability
PaP: Persistence and preservation
SaSI: Sharing and social interactions

OpenResearch: a Semantic Wiki for Scientific Event Metadata (RIS for Events)
We need Research Information Systems not only for organizations but also for communities or specific types of content

Events are a crucial element of scholarly communication

Information about events is difficult to obtain:
Quality (e.g. acceptance rate, PC members)
Logistics (locations, fees)
Dates (submission, registration etc.)
Co-located events

CC BY 3.0 Wiki4des at English Wikipedia

Structured event metadata
Semantic (typed) links inside call text


Interactive Queries (can also be created by users)
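
As a rough illustration of what such queries over structured event metadata could look like, here is a minimal Python sketch. The `Event` record, its field names, and the sample events are invented placeholders that mirror the categories named earlier (quality, logistics, dates, co-located events); they do not reflect OpenResearch's actual schema or data.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class Event:
    """Hypothetical event record; not the OpenResearch data model."""
    acronym: str
    location: str
    submission_deadline: date
    acceptance_rate: float  # fraction of accepted submissions, 0..1
    colocated_with: Optional[str] = None


# Invented sample data, for illustration only.
EVENTS = [
    Event("CONF-A", "Bonn", date(2030, 1, 15), 0.25),
    Event("WORKSHOP-B", "Bonn", date(2030, 2, 1), 0.60, colocated_with="CONF-A"),
    Event("CONF-C", "Leipzig", date(2029, 11, 30), 0.20),
]


def open_calls(events, today, max_acceptance_rate=1.0):
    """Events still open for submission and at least this selective,
    ordered by submission deadline."""
    hits = [
        e for e in events
        if e.submission_deadline >= today
        and e.acceptance_rate <= max_acceptance_rate
    ]
    return sorted(hits, key=lambda e: e.submission_deadline)


def colocated(events, acronym):
    """Acronyms of events that are co-located with the given event."""
    return [e.acronym for e in events if e.colocated_with == acronym]


selective = open_calls(EVENTS, today=date(2030, 1, 1), max_acceptance_rate=0.3)
print([e.acronym for e in selective])
print(colocated(EVENTS, "CONF-A"))
```

Once dates, quality indicators and co-location are typed fields rather than free text in a call for papers, questions like these reduce to simple filters.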

OpenResearch.org Architecture

SlideWiki: A collaborative OpenCourseWare Authoring Platform
Collaborative creation and maintenance of high-quality, multilingual OpenCourseWare is still a major challenge

SlideWiki is a platform for OpenCourseWare creation employing crowdsourcing, full versioning, WYSIWYG

Facilitates translations to many languages
Helps to keep track of authors, translators and sources

SlideWiki: Self-assessment questions can be attached to every single slide

Learners can test their knowledge and be pointed exactly to the content they need to revisit
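
The idea of attaching questions to individual slides can be sketched as a small data structure. The example below is a hypothetical illustration, not SlideWiki's implementation: the `Question` record, the slide identifiers and the function `slides_to_revisit` are assumptions made for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Question:
    """Hypothetical self-assessment question attached to one slide."""
    qid: str
    slide_id: str
    prompt: str
    correct: str


# Invented sample questions, for illustration only.
QUESTIONS = [
    Question("q1", "slide-03", "Which format dominates scientific publishing today?", "PDF"),
    Question("q2", "slide-03", "Does PDF preserve document structure?", "no"),
    Question("q3", "slide-07", "Which markup language does dokie.li build on?", "HTML"),
]


def slides_to_revisit(questions, answers):
    """Return the slides with at least one wrong or missing answer,
    in the order their questions appear, without duplicates."""
    revisit = []
    for q in questions:
        if answers.get(q.qid) != q.correct and q.slide_id not in revisit:
            revisit.append(q.slide_id)
    return revisit


answers = {"q1": "PDF", "q2": "yes", "q3": "HTML"}
print(slides_to_revisit(QUESTIONS, answers))
```

Because every question carries the identifier of its slide, a wrong answer maps directly to the piece of content the learner should look at again.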

How is SlideWiki different?
SlideWiki differs from other online tools for presentations, such as Google Docs Presentations, Prezi and SlideShare, due to its focus on:
E-learning: you can add questions to slides and thus compose comprehensive self-assessment tests for learners
Collaboration: SlideWiki aims at empowering whole communities to create presentations collaboratively
Translation: with SlideWiki, content can be easily translated into more than 50 languages

Semantic Web Layer Cake 2001

http://www.w3.org/2001/10/03-sww-1/slide7-0.html

Monolithic, based on XML
Focus on heavyweight semantics (ontologies, logic, reasoning)

The Semantic Web Layer Cake 2015: Bridging between Big & Smart Data
Figure labels: Unicode, URIs, XML, JSON, CSV, RDB, HTML, RDF, RDF/XML, JSON-LD, CSV2RDF, R2RML, RDFa, RDF Data Shapes, RDF-Schema, Vocabularies, Ontologies, SKOS Thesauri, Logic, SWRL Rules, SPARQL, (Access control), Signature, Encryption (HTTPS/CERT/DANE), ...
Lingua franca of data integration with many technology interfaces (XML, HTML, JSON, CSV, RDB, ...)
Focus on lightweight vocabularies, rules, thesauri etc.
Less invasive


Towards an Ecosystem of Open Scholarly Communication Infrastructure
We need to invest more into techniques tailored for digital knowledge exchange instead of techniques mimicking work-arounds of the past.
From document-centricity to knowledge-centricity

Thanks for your attention
Sören Auer
http://eis.iai.uni-bonn.de/SoerenAuer.html

http://OpenResearch.org
http://SlideWiki.org
https://dokie.li

Sarven Capadisli, http://csarven.ca
Christoph Lange, https://langec.wordpress.com