Cold Start Knowledge Base Population at TAC 2017 Task Description1
Version 1.0 of May 23, 2017

What’s New
Introduction
Input
  Schema
  Document Collection
  Evaluation Queries
Cold Start KB Task Output
  Nodes
  Predicates
    type
    *mention predicates
    link
    SF, sentiment, and event predicates
  Event Realis
  Provenance
  Confidence Measure
  Comments
Slot Filling Task Output
Evaluation
  Component Evaluations
  Composite Evaluation Assessment
  Composite Evaluation Scoring
Submissions
Appendix
Change History

1 The TAC organizing committee welcomes comments on this Task Description, or on any aspect of the TAC evaluation. Please send comments to tac-kbp@nist.gov.


What’s New

The 2017 Cold Start KB and SF tasks differ from the 2016 tasks in the following significant ways:

1. The Cold Start KB includes events (analogous to entities) and event arguments (analogous to slot fillers).

2. The Cold Start KB includes sentiment from an entity towards another entity.

3. The Cold Start KB will be evaluated via a composite KB evaluation using queries (as in Cold Start 2016), and a set of component evaluations for Entity Discovery and Linking (EDL), Slot Filling (SF), Event Nugget Detection and Coreference (EN), Event Argument and Linking (EAL), and Sentiment.

4. Mean Average Precision (MAP) will be the main evaluation metric for the composite KB evaluation and the component SF evaluation.

5. Multiple justifications are allowed and encouraged for KB relations (involving SF, event, or sentiment predicates). Justification spans for any single justification in the KB and SF tasks must come from the same document, in order to make justifications easier to differentiate and count.

The 2017 Cold Start KB Construction task builds on the 2016 task by extending the KB schema to include not only entities and Slot Filling relations, but also events and relations involving event arguments and sentiment between entities.

In addition, Cold Start 2017 has the goal of encouraging systems to 1) provide meaningful confidence values for assertions that are made in the KB and 2) return as much evidence for each KB relation as can be found in the document collection. To encourage development of meaningful confidence values, the primary evaluation metric for Cold Start 2017 will be a variant of mean average precision (MAP). To support an evaluation that rewards systems for finding more than one justification for each KB relation, Cold Start 2017 requires that all text spans for a single justification come from a single document. This simplification allows TAC to define two justifications to be the same if and only if they come from the same document, and gives more credit for finding more documents that each justify a relation.2

The submitted Cold Start KBs are evaluated by both a composite query-based evaluation and a set of component evaluations. The composite KB evaluation applies a set of Cold Start evaluation queries to each KB and assesses the correctness of the events, sentiment sources and targets, and SF slot fillers found.

The component evaluations are implemented by projecting out the individual components from the submitted KB and evaluating each component output file as though it had been submitted directly to the standalone track for that component.

2 The TAC Cold Start KB and SF tasks in 2014-2016 allowed a single relation justification to contain provenance spans from multiple documents, in order to encourage inference across wider contexts. While restricting provenance spans to a single document per justification may seem like a step backwards for KBP 2017, the benefit of being able to count justifications outweighs the reduction in allowable inference. Furthermore, because of the types of relations and what is considered correct in Cold Start/SF, most successful cross-document inference in Cold Start has been limited to a handful of hard-coded inference rules (e.g., involving familial relationships and part-whole relationships between GPEs); such inferences could be done by some downstream process, using a separate inference engine and a possibly richer set of inference rules and world knowledge.

This document describes the 2017 composite (end-to-end) Cold Start KB Construction task and the component Slot Filling task, which are evaluated using post-submission assessment of responses to Cold Start evaluation queries.

The standalone EDL, EN, EAL, and sentiment tasks are evaluated using gold standard annotations on a common set of approximately 500 "core" documents. The detailed task descriptions for those component tasks are available on their respective track home pages:

• Entity Discovery and Linking (http://nlp.cs.rpi.edu/kbp/2017/index.html)
• Event Nugget Detection and Coreference (http://cairo.lti.cs.cmu.edu/kbp/2017/event/)
• Event Argument and Linking (https://tac.nist.gov/2017/KBP/Event/Argument/)
• Belief and Sentiment (http://www.cs.columbia.edu/~rambow/best-eval-2017/)

Introduction

Since 2009, TAC KBP has evaluated performance on several important aspects of knowledge base population: entity discovery and linking, slot filling, event nugget detection and coreference, event argument extraction and linking, and belief and sentiment. The goal of the Cold Start track is to exercise technology in each of these areas, and evaluate the ability of a system to use these technologies to actually construct a knowledge base (KB). Cold Start participants build a software system that processes a large text collection and creates a knowledge base that is consistent with and accurately represents the content of that collection. The knowledge base is then evaluated as a single connected resource, using queries that traverse nodes and predicate links in the KB to determine if the KB contains correct relations between entities, events, and strings. In Cold Start 2017, the predicates can be slot filling predicates, sentiment predicates, or event argument predicates.

We call the task Cold Start Knowledge Base Population to convey two features of the evaluation: it implies both that a knowledge base schema has been established at the start of the task, and that the knowledge base is initially unpopulated. Thus, we assume that a schema exists for the entities, events, and relations that will compose the knowledge base; it is not part of the task to automatically identify and name relationships present in the text collection. In 2017, Cold Start uses a schema that combines the entity types, event types, and relations from the TAC component tracks of EDL, SF, EN, EAL, and BeSt.

Cold Start also implies that the knowledge base is initially empty. To avoid solutions that rely on verifying content already present in Wikipedia or other large data sources about entities, the queries used in Cold Start will be dominated by entities that are not present in Wikipedia.

This document describes both the end-to-end Cold Start KB construction task and the component Slot Filling task.

1. In the Cold Start Knowledge Base task (CSKB), participants submit entire knowledge bases, without prior knowledge of the evaluation queries.

2. In the Slot Filling task (SF), the Cold Start evaluation queries that involve only SF predicates are split into Cold Start Slot Filling queries, with one entry point per query, and are distributed at the start of the task evaluation window. Participants do not have to submit entire knowledge bases. Rather, they apply their slot filling system twice, the first time on the entry point for each query, the second time on each of the results of the first round.

Participating systems in both the Cold Start KB and SF tasks will receive the following input:

1. a document collection;
2. a knowledge base schema.

From these, Cold Start KB systems will produce a knowledge base. This KB will be submitted to NIST as a set of augmented subject-predicate-object triples. The Cold Start KB will include various *mention, link, and type triples, as well as a range of triples involving SF, event argument, and sentiment predicates (all triples are described more fully below). Participating KB systems must tie each entity mention and event mention in the document collection to a particular KB node; in this way, the knowledge base can be queried without first aligning it to a reference knowledge base.

Systems participating in the Slot Filling task will also receive:

3. a set of Cold Start Slot Filling (CSSF) evaluation queries (each evaluation query is a sequence of one or two SF queries to be applied in series).

For both CSKB and SF tasks, the results will then be evaluated by NIST:

• Systems participating in the Slot Filling task return slot fillers directly in response to the given CSSF evaluation queries, and the fillers are then assessed and scored.

• Evaluation of the Knowledge Base variant will start by applying the Cold Start evaluation queries to the submitted knowledge base. Each query will start at a named entity mention in a document (identified by tags in the query), identify the knowledge base entity that corresponds to that mention, follow a sequence of one or more relations within the knowledge base, and end in a “slot fill” (which could be an entity, event, or string node, depending on the relation predicate). The resulting slot fills will be assessed and scored in the same way as in the Slot Filling variant.

For example, a CSSF evaluation query might ask ‘what are the ages of the siblings of the Bart Simpson4 mentioned in Document 42?’ A system that correctly identified descriptions of Bart’s siblings in the document collection, linked them to the appropriate node in the KB, and also found evidence for and correctly represented the ages of those siblings would receive full credit.

4 Many of the examples used to illustrate the Cold Start task are drawn from The Simpsons television show. Readers lacking a detailed working knowledge of genealogical relationships in the Bouvier/Simpson family need not agonize over what they have been doing with their lives for the past quarter century, but may simply visit http://simpsons.wikia.com/wiki/Simpson_Family.

Input

Schema

The KB schema for Cold Start 2017 consists of:

• Entities: Entities and entity mentions for five entity types (person, organization, geopolitical entity, facility, and location) as defined in the trilingual EDL task of the 2017 EDL track.

• SF Relations: Entity attributes ("slots") as defined in the SF track. These comprise forty-one relation types and their inverses.

• Events: Events and event mentions for 18 event subtypes. Cold Start defines an event as a Rich ERE cross-document event hopper, and an event mention as a Rich ERE event trigger.

• Event (Argument) Relations: Event roles and arguments for the 18 event subtypes, as defined in the EAL track. These comprise 85 event argument relation types having an event as the predicate subject, and 56 inverses having an entity as the predicate subject and an event as the predicate object.

• Sentiment Relations: Positive and negative sentiment from a source entity toward a target entity, as defined in the BeSt track.

The schema for Cold Start 2017 combines the entity and mention types from TAC KBP 2016 Entity Discovery and Linking, the SF relation types from TAC KBP 2016 Cold Start Knowledge Base Population, the event roles and argument types from TAC KBP 2016 Event Argument and Linking, the event mentions from TAC KBP 2016 Event Nugget Detection, and sentiment relations from TAC KBP 2016 Belief and Sentiment (BeSt). Annotation/assessment guidelines are available on the TAC website (http://www.nist.gov/tac/2017/KBP/ColdStart/guidelines.html), and are more fully documented in the data packages that can be requested from the LDC upon completion of TAC KBP track registration.

For relations whose objects (“slot fills”) are entities (such as per:siblings or per:likes) or events (such as per:conflict.attack_attacker), Cold Start KBs will be required to link that slot to the node in the submitted KB representing the correct entity5 or event. Slots whose fills are strings (such as per:title or org:website) must be linked to a specially created string node to represent the object for that relation.

Cold Start entities and entity mentions are defined by DEFT Rich ERE. Full annotation guidelines for DEFT Rich ERE entities are included in the DEFT Rich ERE annotation packages, available from the LDC, but a high-level summary of the five entity types and their mentions is available in Rich ERE Annotation Guidelines Overview. For Cold Start, all named and nominal mentions must be extracted, and the entities must be specific individual entities (as described in Annotation Guidelines for Individuality of Specific Entities). Pronominal entity mentions may be extracted (for use in the sentiment component evaluation in BeSt), but are not required for the composite evaluation of the Cold Start KB. A Cold Start named entity mention is the same as a named entity mention in Rich ERE; i.e., a Cold Start named entity mention is a mention that uniquely refers to an entity by its proper name, acronym, nickname, alias, abbreviation, or other alternate name, and includes post author names found in the metadata of discussion forum documents. The extent of the named entity mention is the entire string representing the name, excluding the preceding definite article and any other pre-posed or post-posed modifiers. A Cold Start nominal entity mention is the head of the nominal entity mention in Rich ERE; i.e., a Cold Start nominal entity mention is a mention not including the entity's proper name, referring to it by a common noun phrase (but for Cold Start, the nominal mention is only the head noun of the nominal phrase). Entity mentions are allowed to nest or overlap; for example, the string “Philadelphia Eagles” might be a mention of an ORG (the football team), while the first word might simultaneously be a mention of a GPE (the city of Philadelphia).

5 Because facility and location entities are not included in the slot definitions, only person, organization, and geopolitical entity nodes must be linked to the SF slots.

Cold Start defines an event as a Rich ERE cross-document event hopper, and an event mention as a Rich ERE event trigger (which is also the same as an event nugget in the TAC KBP Event Nugget track). The criteria for determining whether two or more event mentions belong in the same hopper are essentially the same regardless of whether the mentions are in the same document or different documents. The Cold Start inventory of events and event roles/arguments is a subset of the event types in Rich ERE, and is described in the EAL 2016 Task description.

The Cold Start inventory of SF slots is described thoroughly in TAC KBP 2015 Slot Descriptions and TAC KBP 2015 Assessment Guidelines, available on the TAC website. Forty-one slots and their inverses are used for the evaluation. Twenty-six of these have fills that are themselves entities, as shown in Table 1 of the Appendix. The remaining fifteen slots have string fills, as shown in Table 2 of the Appendix. Each SF predicate having an entity as object will have an inverse.6

Cold Start sentiment relations are also given in a table in the Appendix, and include an inverse for each sentiment predicate.

Cold Start event relations are given in the last two tables in the Appendix. The first shows event predicates having an event as subject, and an entity or string as object. The second table shows the inverse event predicates, which have an entity (but not a string) as subject, and an event as object.

All inverse relations must be explicitly identified in the submitted knowledge base. That is, if the KB asserts that relation R holds between entities A and B, then it must also assert that relation R⁻¹ holds between B and A. As a convenience, the Cold Start KB validation script can be used to introduce missing inverses into a KB, except that the validator will not infer any inverse relations for event predicates having an event as subject; all event relations having an event as subject must be explicitly included in the KB.
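The inverse requirement can be illustrated with a minimal sketch. It is not the NIST validation script; the function name, the triple representation, and the small inverse table are assumptions for illustration, and only the three predicate pairings shown are taken from this document (per:siblings is symmetric; per:parents and per:children are inverses of each other).

```python
# Hypothetical inverse table for illustration; a real system would cover
# every predicate listed in the Appendix tables.
INVERSE = {
    "per:siblings": "per:siblings",   # symmetric slot
    "per:parents": "per:children",
    "per:children": "per:parents",
}

def add_missing_inverses(triples):
    """Return the input triples followed by any inverse triples they lack."""
    seen = set(triples)
    result = list(triples)
    for subj, pred, obj in triples:
        inv = INVERSE.get(pred)
        if inv is None:
            continue  # predicate has no entry in this sketch's table
        candidate = (obj, inv, subj)
        if candidate not in seen:
            seen.add(candidate)
            result.append(candidate)
    return result

kb = [
    (":Entity4", "per:siblings", ":Entity7"),
    (":Entity4", "per:parents", ":Entity9"),
]
completed = add_missing_inverses(kb)
```

Predicates absent from the table are passed through untouched, which mirrors the rule that event relations with an event as subject must be supplied explicitly rather than inferred.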

Document Collection

The Cold Start 2017 evaluation document collection will be the TAC KBP 2017 Evaluation Source Corpus, which comprises approximately 90,000 documents, roughly equally distributed between English, Spanish, and Chinese, and balanced between newswire (NW) and multi-post discussion forum (MPDF) documents. These documents will be new (previously unreleased) documents that will be distributed by NIST via Web download at the beginning of the Cold Start evaluation window. There will be exactly one file per document, and all files will be parsable as XML. Each file will begin with the opening tag of the element (for MPDF);7 note that can be spelled with either uppercase or lowercase letters, depending on the genre, and may optionally include additional attributes (such as "type" for some newswire data).

6 Some SF slots, such as per:siblings, are symmetric. Others, such as per:parents, have inverses that were already in the 2014 English Slot Filling track (in this case, per:children). The remaining SF slots (e.g., org:founded_by) had no corresponding slot in the 2014 English Slot Filling track; Cold Start specifies new slot names for these inverses. All such slots are list-valued.

Newswire data will use the following markup framework:

...

...

...

...

where the HEADLINE and DATELINE tags are optional (not always present), and the TEXT content may or may not include "..." tags (depending on whether or not the "doc_type_label" is "story").

Multi-Post Discussion Forum files (MPDFs) are derived from Discussion Forum threads. They consist of a continuous run of posts from a thread but they are only approximately 800 words in length (excluding metadata and text within elements). When taken from a short thread, a MPDF may comprise the entire thread. However, when taken from longer threads, a MPDF is a truncated version of its source, though it will always start with the preliminary post. The MPDF files will use the following markup framework, in which there may also be arbitrarily deep nesting of quote elements, and other elements may be present (e.g. "..." anchor tags):

...

...

...

...

...

7 In contrast to some of the KBP source corpora prior to 2016, the TAC KBP 2017 Source Corpus will not contain any files that begin with xml declarations such as. This is to ensure that offsets align across the various KBP 2017 tracks that are using this same evaluation source corpus, regardless of whether offsets are counted from the beginning of the file, or the beginning of the tag.

All provenance/justifications for Cold Start KB/SF 2017 tasks must be drawn from the documents in the TAC KBP 2017 Evaluation Source Corpus. Each document is represented as a UTF-8 character array and begins with the tag, where the “

All CSSE evaluation queries start with an entry point into the knowledge base being evaluated. The entry point is defined by a named entity mention (name, docid, begin offset, and end offset), and is followed by the entity type and either one or two slots to be extracted for the entity. The query may request any of the SF, Sentiment, or Event relations that have the query entity as the predicate subject.

Evaluation queries could take one of two forms: single-hop or multiple-hop. For example, here is a sample single-hop CSSE evaluation query that asks “What is the age of the June McCarthy mentioned at offsets 16931-16943 in Document 42?”:

    June McCarthy 42 16931 16943 PER per:age per:age

This single-hop query looks very much like a query from the 2014 English Slot Filling task, except that each query in Cold Start asks for a specific slot, rather than all slots for which there is information in the document collection.8

A more complex “two-hop” query might ask, “What are the ages of the children of the June McCarthy mentioned at offsets 16931-16943 in Document 42?”:

    June McCarthy 42 16931 16943 PER per:children per:children per:age

The above queries are homogeneous SF queries in the sense that each query asks for only SF relations. An example of a mixed query, involving both a sentiment predicate (in slot 0) and an event predicate (in slot 1), is below (“What are the attack events in which the target is a person whom June McCarthy dislikes?”):

    June McCarthy 42 16931 16943 PER per:dislikes per:dislikes per:conflict.attack_target

8 Participants in the Slot Filling variant should treat all other slots as if they appear in the field of a Slot Filling query from 2013 or earlier.

In general, two-hop queries will start from an entry point (selecting the corresponding KB entity of a CSKB submission), follow a single entity-valued relation, then ask for a single slot value.9 Such queries will verify that the knowledge base is well-formed in a way that goes beyond basic entity linking and slot filling, without allowing combinations of errors to drive scores to zero.

Because two-hop queries do not look like any slot filling queries from KBP 2009-2014, participants in the Cold Start Slot Filling variant must process the CSSF queries in two “rounds” using the CS-GenerateCSQueries.pl script from NIST, which adds the entry to the NIST-distributed CSSF queries. Participants in the Slot Filling variant must treat as the slot to be filled. During the first round, will be identical to. The CS-GenerateCSQueries.pl script will then convert a first round output file to a second round query file. Second round queries generated by this script will bear entries equivalent to. Though some of the CSSF queries will differ only in having different mentions (possibly for the same entity) as their entry points, participating CSSF systems are prohibited from using information about one query to inform the processing of another query.

For the Knowledge Base variant, the following rules are applied to map from a CSSE evaluation query to a knowledge base entry: First, form a candidate set of all KB node mentions that have at least one character in common with the evaluation query mention and that have the same type. If this set is empty, the submission does not contain any answers for the evaluation query. Otherwise, for each mention K in the candidate set, calculate:

• COMMON, the number of characters in K that are also in the query mention Q.
• K_ONLY, the number of characters in K that are not in Q.

Execute each of the following eliminations in order until the candidate set is of size one, and select that candidate as the KB node that matches the query:

• Eliminate any candidate that does not have the maximal value of COMMON
• Eliminate any candidate that does not have the minimal value of K_ONLY
• Eliminate all but the candidate that appears first in the submission file
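The elimination rules above can be sketched as a single lexicographic comparison. This is an illustrative reading of the rules, not the NIST evaluation code; the Mention type, the function name, and the example offsets other than 16931-16943 are assumptions. Offsets are treated as inclusive, which is consistent with the 13-character name “June McCarthy” spanning offsets 16931-16943.

```python
from typing import NamedTuple, Optional, Sequence

class Mention(NamedTuple):
    node: str    # KB node that owns this mention, e.g. ":Entity1"
    docid: str
    start: int   # inclusive character offset
    end: int     # inclusive character offset
    etype: str   # entity type, e.g. "PER"

def match_entry_point(mentions: Sequence[Mention], docid: str,
                      beg: int, end: int, etype: str) -> Optional[str]:
    """Pick the KB node for a query mention; mentions are in submission-file order."""
    scored = []
    for order, m in enumerate(mentions):
        if m.docid != docid or m.etype != etype:
            continue
        common = min(m.end, end) - max(m.start, beg) + 1  # COMMON
        if common < 1:
            continue  # no character shared with the query mention
        k_only = (m.end - m.start + 1) - common           # K_ONLY
        # Maximal COMMON, then minimal K_ONLY, then earliest in the file.
        scored.append((-common, k_only, order, m.node))
    return min(scored)[3] if scored else None

mentions = [
    Mention(":Entity2", "42", 16926, 16943, "PER"),  # longer span, K_ONLY = 5
    Mention(":Entity1", "42", 16931, 16943, "PER"),  # exact span
    Mention(":Entity3", "42", 16936, 16943, "PER"),  # partial span, COMMON = 8
    Mention(":Entity4", "42", 16931, 16943, "ORG"),  # wrong type
]
best = match_entry_point(mentions, "42", 16931, 16943, "PER")
```

In this example the first two PER mentions tie on COMMON, and the exact-span mention wins because it has no characters outside the query span, even though it appears later in the file.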

The proper specification of entity mentions in a KB is therefore important for scoring well; CSKB participants should take care to ensure that every named entity mention in the evaluation collection serves as a mention for a node in the KB.

The NIST evaluation of a KB will proceed by finding all entries in the KB that fulfill an evaluation query. For example, if the evaluation query ‘schools attended by the siblings of Bart Simpson’ finds two siblings for the node specified by the entry point, and the KB indicates that those siblings attended two and one schools respectively, then three results would be assessed by NIST. These results will be converted to the same form as the output for the Slot Filling variant. Results will be pooled across all CSKB and CSSF submissions, and assessors will judge the validity of each result. Finally, a scoring script will report a variety of statistics for each submitted run.
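The counting in the Bart Simpson example can be reproduced with a small sketch of a two-hop traversal over subject-predicate-object triples. The node identifiers, the function names, and the predicate name per:schools_attended are assumptions made for illustration; provenance and confidence are omitted.

```python
from collections import defaultdict

def build_index(triples):
    """Index object nodes by (subject, predicate)."""
    index = defaultdict(list)
    for subj, pred, obj in triples:
        index[(subj, pred)].append(obj)
    return index

def two_hop(index, entry_node, slot0, slot1):
    """Follow slot0 from the entry node, then slot1 from each intermediate node."""
    results = []
    for middle in index.get((entry_node, slot0), []):
        for fill in index.get((middle, slot1), []):
            results.append((middle, fill))
    return results

triples = [
    (":Entity_Bart", "per:siblings", ":Entity_Lisa"),
    (":Entity_Bart", "per:siblings", ":Entity_Maggie"),
    (":Entity_Lisa", "per:schools_attended", ":Entity_School1"),
    (":Entity_Lisa", "per:schools_attended", ":Entity_School2"),
    (":Entity_Maggie", "per:schools_attended", ":Entity_School3"),
]
results = two_hop(build_index(triples), ":Entity_Bart",
                  "per:siblings", "per:schools_attended")
```

Two siblings with two and one schools respectively yield three (sibling, school) results, matching the three results that would be assessed in the example.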

9 In principle, multiple-hop queries could include more than two relations, but we currently limit ourselves to two.

In creating evaluation queries, LDC will exercise a range of SF, sentiment, and event predicates and strive to balance even distribution across predicate types with productivity of those slots. This means that the queries in the composite evaluation will not necessarily follow the distribution of mention-level occurrences of facts, so even if there are 10 times as many negative sentiments as positive sentiments, the number of queries asking for positive sentiment vs negative sentiment will be about the same. Systems that have been optimized for the component evaluations by assuming a particular distribution of mention-level phenomena in the component evaluation documents may need to recalibrate to take into account less frequent phenomena (event types, cognitive states, etc.).

Single-hop queries will in many cases ask for multiple slots for a given entity regardless of whether fillers for those slots are attested in the document collection. Multiple-hop queries will favor entities and slot sequences that are attested in the document collection (although here too, availability of answers is not guaranteed at any hop level).

Because coreference of strings and events is still difficult (for both humans and automatic systems), Cold Start 2017 will mitigate errors and inter-annotator disagreement about how to coreference events and strings (especially cross-document coreference) by avoiding queries that involve an event or string as the predicate subject. Cold Start 2017 queries will always have an entity as the predicate subject at each hop level. This means that all event queries will be of the form: Find all events that have entity X in role Y (e.g., “Find all conflict.attack events that have the entity ‘Homer Simpson’ in the role ‘attacker’”, but not “Find all targets of conflict.attack events where Homer is the attacker”).

Single-hop queries will request some event or string object in slot 0; two-hop queries will request an entity object in slot 0, and an entity, event, or string object in slot 1. The targeted distribution of predicates across queries that are assessed in the final evaluation is given in the following table.

Approximate distribution of predicate types across Cold Start evaluation queries

Query type    Slot 0                Slot 1       Share of queries
Single-hop    event                 --           3/12
Single-hop    SF (string object)    --           1/12
Two-hop       SF                    SF           3/12
Two-hop       SF                    sentiment    1/12
Two-hop       SF                    event        1/12
Two-hop       sentiment             SF           1/12
Two-hop       sentiment             sentiment    1/12
Two-hop       sentiment             event        1/12

    ColdStartKBTaskOutputTheColdStartknowledgebaseisrepresentedasadirectedlabeledmultigraph:entities,events,andotherpredicatearguments(i.e.,“string”arguments,thatareneitherentitiesnorevents)arerepresentedasnodesintheknowledgegraph,whilebinaryrelationsbetweentheentities/events/stringsarerepresentedasedgeslabeledwithSF,sentiment,andeventpredicatenames.Generally,thesubjectandtheobjectofthepredicatearebothnodesintheKB(exceptthatthetype,link,and*mentionpredicatestakeaquotedstringasobject).TheKBisgroundedtothedocumentcollectionviavarious*mentionpredicatesthatconnecteachentity,event,orstringnodetoitsmentionsinthedocumentcollection.TheColdStartKBisconnectedtoexternalKBsviaalinkpredicatethatindicatescoreferencebetweenanentityintheColdStartKBandanentitynodeintheexternalKB.

  • 12

    CSKBsystemsmustproduceaknowledgebaseasoutput.ThefirstlineoftheKBoutputfilemustcontainauniquerunID.TheremainderoftheKBoutputfileisasetofassertions,oraugmentedtriples(subject,predicate,object).Assertionswillappear,one-per-line,intab-separatedformat.TheKBoutputfilewillbeautomaticallyconvertedtoRDFstatementsduringevaluation.AllKBoutputmustbeencodedinUTF-8.

    Each triple appears in the KB output file in subject-predicate-object order. For example, to indicate that Entity4 has Entity7 as a sibling, the triple might be:

    :Entity4 per:siblings :Entity7

    If Entity4 has siblings in addition to Entity7, these relations should be entered as separate triples.

    Each triple in the CSKB submission will include a set of augmentations (again using tabs as separators). Except for the type predicate (which does not require explicit support from a document), the first augmentation will describe the provenance (a justification) of the triple, and the second augmentation will provide the confidence for the triple and justification. If there is more than one justification for a triple, each justification appears in a separate assertion (line), along with its confidence.

    At least one assertion for each unique subject-predicate-object triple will be evaluated. If more than one assertion of a given triple appears in the output (with each triple having different provenance), LDC will assess the assertion with the highest confidence value (see below), and will assess additional assertions if resources allow. If more than one such assertion shares the same confidence value, the assertion that appears earlier in the output will be considered to have higher confidence.
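    The assessment order described above can be sketched as follows. This is an illustrative sketch only, not part of the official evaluation software; the function name rank_justifications and the tuple layout (subject, predicate, object, provenance, confidence) are assumptions made for the example.

```python
from collections import defaultdict


def rank_justifications(assertions):
    """Order the justifications of each triple for assessment.

    `assertions` is a list of (subject, predicate, object, provenance,
    confidence) tuples in file order (a hypothetical in-memory layout).
    Within each triple, higher confidence comes first; ties are broken
    in favour of the assertion that appears earlier in the file.
    """
    groups = defaultdict(list)
    for line_no, (subj, pred, obj, prov, conf) in enumerate(assertions):
        groups[(subj, pred, obj)].append((conf, line_no, prov))
    return {
        triple: [prov for _, _, prov in sorted(items, key=lambda t: (-t[0], t[1]))]
        for triple, items in groups.items()
    }
```

    The first provenance in each list is the one that would be assessed first; later entries would be assessed only if resources allow.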

    Nodes

    The KB contains three different kinds of nodes: Entity, Event, and String. Each node specification begins with one of “:Entity”, “:Event”, or “:String”, followed by a sequence of letters, digits, and underscores. Examples of legal entity, event, and string node specifications include :Entity42, :Event_056, and :String74_R29, respectively. No meaning is ascribed to this sequence by the evaluation software; it is used only as a unique identifier. Any subsequent use of the same colon-preceded sequence will be taken as a reference to the same entity, event, or string node.
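    A node specification of this shape can be checked with a short regular expression. The sketch below is illustrative; the helper name node_kind is hypothetical, and it assumes ASCII letters and at least one character after the node-kind prefix, neither of which is stated explicitly in this description.

```python
import re

# Assumption: "letters" means ASCII letters, and the identifier suffix
# after ":Entity" / ":Event" / ":String" is non-empty.
NODE_RE = re.compile(r":(Entity|Event|String)[A-Za-z0-9_]+")


def node_kind(spec):
    """Return 'Entity', 'Event', or 'String' for a legal node
    specification, or None if the string is not a node specification."""
    match = NODE_RE.fullmatch(spec)
    return match.group(1) if match else None
```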

    Each specific individual entity or event that appears in the document collection must be represented by exactly one entity or event node. Two separate entity (or event) nodes in the KB will be interpreted as representing two different entities (or events).

    The string node is a catch-all structure to represent SF and Event predicate arguments that are not specific individual entities. The string node allows the KB to represent multiple justifications for the same subject-predicate-object triple when the object is string-valued (e.g., per:cause_of_death), even when the literal strings are different (e.g., “cardiac arrest” vs. “heart attack”); if a given subject and predicate has two separate string nodes as objects, they will be interpreted as representing two different slot fillers for that subject and predicate. However, in TAC 2017 there is no requirement that the same string node be used for triples having different subjects or predicates (i.e., for "John and Mary died of heart disease", the KB may contain two different string nodes for "heart disease", where one node is the cause of death of John, and the other node is the cause of death of Mary). [10]

    [10] The KB may elect to represent each real-world concept or value by exactly one node (e.g., for dates and numeric values), but such a global grounding of nodes to the real world is not a requirement in TAC 2017 (beyond the entities and events that are defined explicitly in the 2017 TAC KBP ontology). A richer ontology of concepts and values is left for future work.


    For the 2017 evaluation, the string object only needs to include one mention for each document provided as justification for that subject-predicate-object triple.

    In particular, each event argument that is not a specific individual entity (e.g., unnamed aggregates like “3 people”) must be represented in the Cold Start KB as a string node. When the event argument is not a valid Cold Start entity (i.e., a specific individual PER, ORG, GPE, LOC, or FAC), the KB may represent each argument string as a separate string node, even if it’s clear from context that the strings are coreferential; string-valued event arguments are ignored in the composite KB evaluation (Cold Start 2017 event queries will always have an entity node as subject, and an event node as object), and are used only to produce the EAL output files for the component EAL evaluation (which does not require explicit coreference of event argument strings).

    Predicates

    The legal predicates are the Slot Filling predicates, sentiment predicates, and event predicates shown in the Appendix, plus type, link, mention, nominal_mention, pronominal_mention, normalized_mention and canonical_mention.

    SF predicates found in Table 1 must have entity specifications in both the subject and object positions; predicates found in Table 2 must specify an entity node in the subject position, and a string node in the object position; the string node in the object position will exactly correspond with the slot fill for that relation in the Slot Filling task.

    type

    Each entity, event, and string node will be the subject of exactly one type triple. The object of that triple will be one of the allowable types listed in Table 1 below. It is up to submitting systems to correctly identify and report the type of each entity and event; all string nodes must have type STRING.

    Table 1 Allowable values for type predicate

    Node      Allowable type
    String    STRING
    Entity    PER
              ORG
              GPE
              LOC
              FAC
    Event     CONFLICT.ATTACK
              CONFLICT.DEMONSTRATE
              CONTACT.BROADCAST
              CONTACT.CONTACT
              CONTACT.CORRESPONDENCE
              CONTACT.MEET
              JUSTICE.ARREST-JAIL
              LIFE.DIE
              LIFE.INJURE
              MANUFACTURE.ARTIFACT
              MOVEMENT.TRANSPORT-ARTIFACT
              MOVEMENT.TRANSPORT-PERSON
              PERSONNEL.ELECT
              PERSONNEL.END-POSITION
              PERSONNEL.START-POSITION
              TRANSACTION.TRANSACTION
              TRANSACTION.TRANSFER-MONEY
              TRANSACTION.TRANSFER-OWNERSHIP

    *mention predicates

    Each entity, event, and string node will be the subject of one or more predicates from {mention, nominal_mention, pronominal_mention, canonical_mention, normalized_mention}; the term “*mention” is used to refer to these predicates. Together with the provenance information (see below), these *mention triples indicate how the knowledge base is tied to the document collection. The object of a *mention triple is the double-quoted mention string; document ID and offset appear under provenance information (see below).

    mention and nominal_mention and pronominal_mention

    Each entity will be the subject of one or more mention, nominal_mention or pronominal_mention triples. The definition of what constitutes a named, nominal, or pronominal entity mention for Cold Start is described in the Cold Start schema above. Each named entity mention in the collection must be submitted as the object of a mention triple, while each nominal entity mention in the collection must be submitted as the object of a nominal_mention triple. For example, if an entity is mentioned by name five times in a document, five mention triples should be generated. The pronominal_mentions are used only for the component BeSt evaluation, and are not referenced or evaluated in the composite KB evaluation or any of the other component evaluations besides BeSt; therefore, the KB only needs to include pronominal mentions that serve as provenance for sentiment assertions. An example is shown below to demonstrate the usage of the assertion:

    :Entity_0007 type PER

    :Entity_0007 canonical_mention "Dzhokhar Tsarnaev" NYT_ENG_20131113.0264:434-450 1.0

    :Entity_0007 mention "Dzhokhar Tsarnaev" NYT_ENG_20131113.0264:434-450 1.0

    :Entity_0007 pronominal_mention "he" NYT_ENG_20131113.0264:546-547 1.0

    Each event will be the subject of one or more mention triples. The definition of what constitutes an event mention is described in the Cold Start schema above. Each mention of the event (i.e., each event nugget or ERE event trigger for the event) must be submitted as the object of a mention triple. Event mentions need to be exhaustive in order to support evaluation of the component Event Nugget Detection and Coreference task.

    Each string node will be the subject of one or more mention triples. The string node only needs to include one mention for each document provided as justification for that subject-predicate-object triple.

    canonical_mention

    For each document that mentions an entity or event, one of the mentions (or nominal_mentions if it’s an entity) must be identified as the canonical mention for that entity/event in that document; it is the string that will be seen by the assessor if that entity/event appears as a slot fill, supported by that document (in Slot Filling task terms, it is the content of Column 5 of a CSSF 2017 submission, and its provenance will serve as Column 7 of the CSSF submission). [13] This implies that a document attesting to a relation must contain mentions or nominal_mentions of both the subject and the object of the relation. Canonical mentions are expressed using a canonical_mention triple. The arguments for canonical_mention are the same as for mention. Note that there is no requirement that submissions select a single, global canonical mention for an entity. While such a mention might be useful, here we require that a canonical mention be provided within each document for the assessor to use during assessment.

    Each canonical_mention is also a mention (or nominal_mention or pronominal_mention if the node is an entity). As a convenience, if a submitted KB does not contain a mention (or nominal_mention/pronominal_mention) triple for each canonical_mention triple, the missing relations will be inferred (perhaps incorrectly) as mentions (albeit with a warning). This shortcut is provided to make submitted KBs easier to view, and does not relieve submitters from the requirement to provide each of the required mentions, nominal_mentions, and canonical_mentions.

    [13] In the Slot Filling task of KBP 2009-2014 (and in the Slot Filling variant of Cold Start), all slot fills are strings. Assessors verify the validity of a slot fill by looking for that string in the specified document, using the provenance information provided in the system response. In a submitted KB, slots that are filled with entities or events hold not strings, but pointers to the KB structure for the appropriate entity/event. Thus, a canonical mention must be identified by the Cold Start KB for each entity in each document, so that the assessor can be presented with a string that represents the entity during assessment.


    normalized_mention

    In order to allow normalized dates (and other normalized strings in future) in the KB, a string node for a normalized string value must be the subject of a normalized_mention predicate. An example of the usage of a normalized_mention assertion is shown below:

    :String_0001 type STRING

    :String_0001 mention "April 15" NYT_ENG_20131113.0264:624-631 1.0

    :String_0001 normalized_mention "2013-04-15" NYT_ENG_20131113.0264:624-631 1.0

    The string provided as the object of normalized_mention would not be verified against text in the source document; however, it is a requirement that for a given string node the provenance of the normalized_mention assertion should be the same as the provenance of another (non-normalized) mention of that string.

    link

    Each entity may be the subject of up to one link predicate. The object of the predicate is a quoted string of the form “ExternalKBID:ExternalNodeID” and indicates that the Cold Start entity is the same as the entity with ID “ExternalNodeID” in an external reference KB. For TAC 2017, the external reference KB is the same as that used in 2015-2017 for the TAC Trilingual EDL track, namely LDC2015E42 (TAC KBP Knowledge Base II - BaseKB). The link predicate is ignored in the composite KB evaluation and is used only for the EDL component evaluation.

    The following example shows how to use link (assuming the external KB is the reference KB in LDC2015E42: TAC KBP Knowledge Base II - BaseKB):

    DEMO

    :Entity_0001 type GPE

    :Entity_0001 canonical_mention "Boston" NYT_ENG_20131113.0264:402-407 1.0

    :Entity_0001 mention "Boston" NYT_ENG_20131113.0264:402-407 1.0

    :Entity_0001 link "LDC2015E42:m.050v43"

    This will produce the following EDL output:

    DEMO :Entity_0001_M00001 Boston NYT_ENG_20131113.0264:402-407 m.050v43 GPE NAM 1.0

    SF, sentiment, and event predicates

    The KB must include all triples involving SF predicates, sentiment predicates, and event predicates in the Appendix.

    Event Realis

    The KB must specify realis for

    • event predicates, and
    • *mention predicates that have an event node as subject.

    Realis may take one of the following values:

    • actual
    • generic
    • other

    In order to support the Event Nugget and Event Argument and Linking component evaluations of the KB, the KB must specify the realis of event mentions and event argument assertions, and include events and argument assertions of all three realis values. The realis of an event mention in the Cold Start KB follows the definition of realis in Rich ERE, while the realis of an event argument assertion follows the definition in the Event Argument and Linking task (EAL).

    From the perspective of a Cold Start KB user, both completed and planned events are of interest, but not generic events; therefore, when a Cold Start query requesting an event is applied to the KB, it will consider only event nodes that have an “actual” or “other” mention, where the query entity is an “actual” or “other” argument for the event. This means, for example, that the query "Find all attack events that have Homer as an attacker" means "Find all ACTUAL or OTHER events that have Homer as an ACTUAL or OTHER attacker". GENERIC events (that have been mis-classified as ACTUAL or OTHER in the KB) will be assessed as Wrong.

    Events that have “generic” realis in the KB are ignored in the composite query-based evaluation, and are used only for the component event nugget and event argument evaluations.

    The realis value should be appended at the end of the predicate name, using "." to separate the two. For example,

    :Event_0001 type CONFLICT.ATTACK

    :Event_0001 mention.actual "bombing" NYT_ENG_20131113.0264:418-424 1.0

    :Event_0001 canonical_mention.actual "bombing" NYT_ENG_20131113.0264:418-424 1.0

    :Event_0001 conflict.attack:attacker.actual :Entity_0007 NYT_ENG_20131113.0264:492-681;NYT_ENG_20131113.0264:546-547;NIL 1.0
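    Because event predicate names themselves contain a "." (e.g., conflict.attack:attacker), a reader of the KB has to split off the realis suffix from the right. The following sketch shows one way to do this; the helper names split_realis and visible_to_queries are hypothetical and are not part of the Cold Start tooling.

```python
REALIS_VALUES = {"actual", "generic", "other"}


def split_realis(predicate):
    """Split 'name.realis' into (name, realis).

    Returns (predicate, None) when the text after the last '.' is not
    one of the three realis values (e.g., for 'type' or 'per:siblings').
    """
    base, dot, suffix = predicate.rpartition(".")
    if dot and suffix in REALIS_VALUES:
        return base, suffix
    return predicate, None


def visible_to_queries(realis):
    """Composite-evaluation queries consider only 'actual' and 'other'."""
    return realis in {"actual", "other"}
```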

    Provenance

    Each assertion (except for type assertions) must contain a single justification (provenance) immediately after the subject-predicate-object triple. Provenance is a set of justification spans; each span may comprise at most 200 UTF-8 characters. Each justification span will include a document ID, followed by a colon, followed by two dash-separated offsets (begin and end offsets). The offsets that show the provenance of an extracted relation are used to narrow the assessor’s focus within the documents when assessing the correctness of that relation.

    Provenance spans for a single justification must come from a single document. This is a new restriction in 2017, to allow justifications to be countable based on the justification documents.

    Provenance spans can be partitioned into four different groups:

    • FILLER_STRING (must have exactly 1 span)
    • PREDICATE_JUSTIFICATION (may have 1-3 spans; multiple spans are separated by a comma)
    • BASE_FILLER (must have exactly 1 span)
    • ADDITIONAL_JUSTIFICATION (may have any number of spans; multiple spans are separated by a comma)

    Provenance for the assertion consists of some subset of the four span groups above, depending on the predicate and object; multiple groups are separated by semicolon.


    a) If the predicate is type:
        - No provenance should be provided

    b) Otherwise, if the predicate is any of the *mention predicates:
        - Provenance consists of only PREDICATE_JUSTIFICATION, containing exactly one span, representing the exact location of the mention in the document collection

    c) Otherwise, if the predicate is a sentiment predicate:
        - Provenance consists of only PREDICATE_JUSTIFICATION, containing exactly one span
        - The provenance must be a mention of the entity that is the target of the sentiment, and must be the mention closest to where the sentiment is expressed
        - N.B.: The target of the sentiment could be either the subject or object of the predicate, depending on the predicate:
            - for likes and dislikes predicates, the target of the sentiment is the object
            - for is_liked_by and is_disliked_by predicates, the target of the sentiment is the subject

    d) Otherwise, if the predicate is an SF predicate with a non-string object:
        - Provenance consists of only PREDICATE_JUSTIFICATION, containing 1-3 spans

    e) Otherwise, if the predicate is an SF predicate with a string object:
        - Provenance consists of FILLER_STRING;PREDICATE_JUSTIFICATION
        - FILLER_STRING must be one of the mentions of the string object
        - PREDICATE_JUSTIFICATION contains 1-3 spans

    f) Otherwise, if the predicate is an event predicate with a non-string object:
        - Provenance consists of PREDICATE_JUSTIFICATION;BASE_FILLER;ADDITIONAL_JUSTIFICATION
        - PREDICATE_JUSTIFICATION contains 1-3 spans
        - ADDITIONAL_JUSTIFICATION may be "NIL" or any number of spans

    g) Otherwise, if the predicate is an event predicate with a string object:
        - Provenance consists of FILLER_STRING;PREDICATE_JUSTIFICATION;BASE_FILLER;ADDITIONAL_JUSTIFICATION
        - FILLER_STRING must be one of the mentions of the string object
        - PREDICATE_JUSTIFICATION contains 1-3 spans
        - ADDITIONAL_JUSTIFICATION may be "NIL" or any number of spans

    For predicates with a string object, the first justification span (FILLER_STRING) must represent the document ID and offsets of the string fill. (Slot Filling variant participants are already providing this information in Column 7 of their submissions.) This is the text that will be shown to assessors instead of a canonical_mention for the string node.
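    The provenance syntax above (span groups separated by semicolons, spans within a group separated by commas, each span written as docid:begin-end) can be parsed as in the following sketch. It is illustrative only: the function names are hypothetical, the inclusive-offset length check is an assumption inferred from the examples in this section (e.g., 434-450 for the 17-character string "Dzhokhar Tsarnaev"), and it does not check which groups are required for which predicate.

```python
import re

SPAN_RE = re.compile(r"(?P<doc>.+):(?P<start>\d+)-(?P<end>\d+)")
MAX_SPAN_CHARS = 200


def parse_span(text):
    """Parse 'docid:begin-end' into (docid, begin, end)."""
    match = SPAN_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"malformed span: {text!r}")
    start, end = int(match["start"]), int(match["end"])
    # Assumption: offsets are inclusive, so length is end - start + 1.
    if end < start or end - start + 1 > MAX_SPAN_CHARS:
        raise ValueError(f"bad span length: {text!r}")
    return match["doc"], start, end


def parse_provenance(field):
    """Split a provenance field into span groups (lists of spans).

    A group written as 'NIL' becomes an empty list. All spans of one
    justification must come from a single document.
    """
    groups = []
    for group in field.split(";"):
        if group == "NIL":
            groups.append([])
        else:
            groups.append([parse_span(s) for s in group.split(",")])
    docs = {doc for group in groups for doc, _, _ in group}
    if len(docs) > 1:
        raise ValueError("provenance spans must come from a single document")
    return groups
```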

    Confidence Measure

    To promote research into probabilistic knowledge bases and confidence estimation, each assertion in the KB (or slot fill in the CSSF submission) must have an associated confidence score. Confidence scores will be used to compute a variant of Mean Average Precision (MAP) in the composite KB evaluation. Confidence scores will be used to induce a total order over the relations being evaluated (ties are broken when two scores are equal by assuming that the assertion appearing earlier in the submission has a higher score). Any submitted confidence score must be a positive real number between 0.0 (exclusive, representing the lowest confidence) and 1.0 (inclusive, representing the highest confidence), and must include a decimal point (no commas, please) to clearly distinguish it from a document offset. Confidence scores, if present, will appear at the end of each output line, separated from the provenance information with a tab. If no confidence score is provided for an assertion, the confidence will be inferred to be 1.0 for the purposes of evaluation. Confidence scores may not be used to qualify two incompatible fills for a single slot; submitter systems must decide amongst such possibilities and submit only one. For example, if the system believes that Bart’s only sibling is Lisa with confidence 0.7 and Milhouse with confidence 0.3, it should submit only one of these possibilities. If both are submitted, it will be interpreted as Bart having two siblings.
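    The constraints on the confidence field can be checked as in the sketch below. The function name parse_confidence is hypothetical, and treating an absent field as None or an empty string is an assumption about how a reader would represent a missing column.

```python
def parse_confidence(field):
    """Validate a confidence field and return it as a float.

    A missing field (None or empty) is inferred to be 1.0. Otherwise the
    text must contain a decimal point, no comma, and a value in (0.0, 1.0].
    """
    if field is None or field == "":
        return 1.0
    if "." not in field or "," in field:
        raise ValueError(f"confidence must use a decimal point: {field!r}")
    value = float(field)
    if not (0.0 < value <= 1.0):
        raise ValueError(f"confidence out of range (0.0, 1.0]: {field!r}")
    return value
```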

    Comments

    Output files may contain comments, which begin at any occurrence of a pound sign (#) and continue through (but do not include) the end of the line. Comments and blank lines will be ignored. The first line of a KB variant output file must contain the unique run ID (i.e., it may not be blank). Submitters may like to add a comment to this line giving further details about the run.
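    A reader for this file layout might look like the following sketch. The function name read_kb_text is hypothetical. The sketch applies the comment rule literally, so a pound sign inside a quoted mention string would also be treated as the start of a comment; whether the official tools handle that case differently is not stated here.

```python
def read_kb_text(text):
    """Return (run_id, assertions) from the text of a KB output file.

    The first line holds the run ID. Comments start at '#' and run to
    the end of the line. Blank and comment-only lines are skipped, and
    each remaining line is split into its tab-separated fields.
    """
    lines = text.split("\n")
    run_id = lines[0].split("#", 1)[0].strip()
    if not run_id:
        raise ValueError("first line must contain the run ID")
    assertions = []
    for raw in lines[1:]:
        content = raw.split("#", 1)[0].rstrip()
        if content.strip():
            assertions.append(content.split("\t"))
    return run_id, assertions
```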

    Slot Filling Task Output

    Output for the Slot Filling variant will be in the form of a tab-separated file. The columns of the submitted file are as follows:

    Column 1    Query ID. For the first round, this is taken directly from the XML tag. For the second round, this is drawn from the tag of the query generated from the first round output.

    Column 2    The name of the slot being filled.

    Column 3    A unique run ID for the submission.

    Column 4    Provenance for the relation between the query entity and slot filler, consisting of up to 3 docid:startoffset-endoffset triples separated by commas. Individual spans may comprise at most 200 UTF-8 characters. Unlike the 2014 Slot Filling task, there is no requirement to generate NIL entries when no information about the target entity is available.

    Column 5    A slot filler (possibly normalized, e.g., for dates). This is used both to populate the entry of the next round query, and by the assessor to judge the slot fill. The string should be extracted from the filler provenance in Column 7, except that any embedded tabs or newline characters should be converted to a space character and dates must be normalized (therefore, slot fillers should not be translated across languages). If a nominal mention is returned as a slot filler, only the head word of the nominal phrase should be returned (consistent with the EDL definition of nominal mentions). For dates, systems must normalize document text strings to standardized month, day, and/or year values, following the TIMEX2 format of yyyy-mm-dd (e.g., document text “New Year’s Day 1985” would be normalized as “1985-01-01”); if a full date cannot be inferred using document text and metadata, partial date normalizations are allowed using “X” for the missing information.

    Column 6    A filler type, selected from {PER, ORG, GPE, STRING}. The STRING filler is used for string-valued slots shown in Table 2.

    Column 7    Provenance for the slot filler string. This is either a single span (docid:startoffset-endoffset) from the document where the canonical slot filler string was extracted, or (in the case when the slot filler string in Column 5 has been normalized) a set of up to two comma-separated docid:startoffset-endoffset spans for the base strings that were used to generate the normalized slot filler string. The documents used for the slot filler string provenance must be a subset of the documents provided in Column 4. This column serves two purposes. First, LDC will judge Correct vs. Inexact with respect to the document(s) provided in the slot filler string provenance. Second, this column is used to fill the provenance entries in second round queries. If more than one provenance triple is provided here, the first one will be used to fill the second round query.

    Column 8    Confidence score.

    The process for constructing a Slot Filling variant submission is as follows:

    • Download the following from the NIST Web site:
        o The Cold Start evaluation documents
        o CS-GenerateQueries.pl script
        o CS-PackageOutput.pl script
        o CS-ValidateSF.pl script
    • Obtain the following from [email protected]:
        o The CSSF evaluation queries
    • Configure your system to produce results only from the Cold Start evaluation documents.
    • Run the CS-GenerateQueries.pl script on the evaluation queries to produce the first round queries your system will run on. Note that the raw evaluation queries might differ from the format given above, so you should not assume that you can use them as input to your system without running this script.
    • Run your system, producing a slot-filling submission for the first round queries.
    • Run the CS-ValidateSF.pl script on your first round output to verify that it is formatted correctly.
    • Run the CS-GenerateQueries.pl script on the evaluation queries and your first round output to produce the second round queries.
    • Run your system on the second round queries to produce a second output file.
    • Run the CS-PackageOutput.pl script on the two output files to produce your submission.
    • Run the CS-ValidateSF.pl script on your submission to verify that it is formatted correctly.
    • Upload the submission to NIST.

    Slot filling systems that participated in the 2014 Slot Filling task will need to handle the following differences to successfully participate in the 2017 CSSF task:

    • Only the slot specified by the entry is to be filled; all other slots should be ignored. The entry is added to the queries received from NIST by running the CS-GenerateQueries.pl script.

    • Participants will need to do one round of slot filling, run the CS-GenerateQueries.pl script to create the second round queries, then run slot filling again on the new queries. The results of rounds one and two are to be concatenated before submission using the CS-PackageOutput.pl script.

    • CSSF requires that participants be able to fill all slots in both directions. For example, the 2014 Slot Filling task required detection of the per:cities_of_residence slot. CSSF also requires systems to be able to detect the inverse of that slot, gpe:residents_of_city.

    • Each slot filler must be assigned a type, selected from {PER, ORG, GPE, STRING}. This field represents an additional output column not found in the 2014 Slot Filling or CSSF tasks.

    • NIL entries, indicating that no information about a particular slot is available, are not required in CSSF.

    • Nominal mentions of slot fillers may be returned if no named entity mention is available in the document collection. (Returning nominal entity mentions is not required, but may improve system recall if done correctly.)

    • To conform with requirements in the Cold Start KB task, provenance for each SF relation is limited to only 3 spans (instead of 4), and each span may have up to 200 UTF-8 characters (instead of 150).

    Here are example lines from a Slot Filling submission:

    Q4 org:city_of_headquarters myrun1 Doc42:3-8,Doc8:3-11 Baltimore GPE Doc8:3-11 1.0

    Q5 per:siblings myrun1 Doc124:283-288,Doc885:173-179 Lisa PER Doc124:283-286 0.7

    Q6 per:age myrun1 Doc124:180-181,Doc885:173-179 10 STRING Doc124:180-181 0.9
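    The eight-column layout and the constraints stated for Columns 4, 6, and 7 can be checked with a sketch like the one below. It is not the CS-ValidateSF.pl script and makes no claim about what that script checks; the names SFResponse and parse_sf_line are hypothetical, and the input is assumed to be one tab-separated line.

```python
from typing import List, NamedTuple

FILLER_TYPES = {"PER", "ORG", "GPE", "STRING"}


class SFResponse(NamedTuple):
    query_id: str
    slot: str
    run_id: str
    relation_provenance: List[str]   # Column 4: up to 3 spans
    filler: str
    filler_type: str
    filler_provenance: List[str]     # Column 7: up to 2 spans
    confidence: float


def _doc_id(span):
    # 'docid:start-end' -> 'docid'
    return span.rsplit(":", 1)[0]


def parse_sf_line(line):
    """Parse one tab-separated Slot Filling response line."""
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 8:
        raise ValueError(f"expected 8 columns, found {len(cols)}")
    relation_spans = cols[3].split(",")
    filler_spans = cols[6].split(",")
    if len(relation_spans) > 3:
        raise ValueError("Column 4 allows at most 3 spans")
    if len(filler_spans) > 2:
        raise ValueError("Column 7 allows at most 2 spans")
    if cols[5] not in FILLER_TYPES:
        raise ValueError(f"unknown filler type: {cols[5]!r}")
    relation_docs = {_doc_id(s) for s in relation_spans}
    if not {_doc_id(s) for s in filler_spans} <= relation_docs:
        raise ValueError("Column 7 documents must be a subset of Column 4 documents")
    return SFResponse(cols[0], cols[1], cols[2], relation_spans,
                      cols[4], cols[5], filler_spans, float(cols[7]))
```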

    Evaluation

    The submitted Cold Start KBs are evaluated by both a composite query-based evaluation, and a set of component evaluations. The composite KB evaluation applies a set of Cold Start evaluation queries to each KB and assesses the correctness of the events, sentiment sources and targets, and SF slot fillers found.

    Because the composite evaluation may hide many factors contributing to the performance of the end-to-end KB system, each submitted KB also undergoes a set of component evaluations. The component evaluations are implemented by projecting out the individual components from the submitted KB, such that each component output file is formatted in the same way as a submission to the KBP track for that component, and evaluating each output file as though it had been submitted directly to the standalone track for that component.

    Except for the SF task, all component tasks are evaluated using gold standard annotations on a common set of approximately 500 "core" documents.

    Component Evaluations

    The following component files are projected from each Cold Start KB for the component evaluations:

    1. Entity discovery and linking (EDL): An EDL file consisting of name and nominal mentions and links for PER, ORG, GPE, FAC, and LOC entities from the "core" documents used to evaluate submissions to the EDL track. Links can be to either a node in the reference KB (TAC KBP Knowledge Base II - BaseKB) or (if the entity does not exist in the reference KB) a NIL node corresponding to an entity node in the submitted KB. Evaluation of the Entity Discovery and Linking component of submitted Cold Start KBs will be identical to scoring for the 2017 TAC Trilingual Entity Discovery and Linking task. Please see TAC KBP 2017 Entity Discovery and Linking Task Description for complete details on scoring.

    2. Slot Filling: An SF file consisting of slot fillers and justifications found in the KB by applying Cold Start evaluation queries that involve only SF predicates. The component SF evaluation is identical to the composite KB evaluation, except that the SF evaluation includes only CS queries that involve only SF slots. Because SF systems are allowed to submit only one justification per relation, only the top ranked justification per relation will be considered for the SF component evaluation of KBs.

    3. Event Nugget Detection and Coreference: An EN file consisting of event mentions and within-document coreference from the "core" documents. Evaluation of the Event Nugget component of submitted Cold Start KBs will be identical to scoring for the 2017 TAC Event Nugget Detection and Coreference task. Please see TAC KBP 2017 Event Nugget Detection and Coreference Task Description for complete details on scoring.

    4. Event Argument and Linking: A set of "arguments" files, each file consisting of event argument assertions (including justifications) from a "core" document; a set of "linking" files, each file consisting of coreference of assertions in the corresponding "arguments" file. Evaluation of the Event Argument component of submitted Cold Start KBs will be identical to scoring for the 2017 TAC Event Argument and Linking task. Please see TAC KBP 2017 Event Argument and Linking Task Description for complete details on scoring.

    5. Sentiment: A set of predicted ERE xml files, each file consisting of name, nominal, and pronominal mentions and coreference for PER, ORG, GPE, FAC, and LOC entities from a "core" document; a set of BeSt xml files, each file consisting of sentiment (including provenance) from a source towards a target entity in the corresponding predicted ERE file. Evaluation of the sentiment component of submitted Cold Start KBs will be identical to scoring for the 2017 TAC Belief and Sentiment task, except that only sentiment towards entities will be evaluated. Please see TAC KBP 2017 Belief and Sentiment Task Description for complete details on scoring.

    Composite Evaluation Assessment

    Cold Start 2017 assessment and scoring will proceed as follows: The responses for each evaluation query (from both CSKB and CSSF systems and from human-generated results) will be pooled, and each response will be assessed by a person. The result of following the first relation will be assessed as if it were a Slot Filling query. The second relation in the query will also be assessed as a Slot Filling query, but only if the fill for the first relation is correct. If the fill for the first relation is not correct, each fill for the second relation is automatically counted as Wrong. For example, if the query asks for the ages of the siblings of “Bart Simpson,” and the submitted knowledge base gives “Lisa age 8” and “Milhouse age 10” as siblings, then only the reported age of Lisa will be assessed (Milhouse is not Bart’s sibling), and the reported age of Milhouse will automatically be counted as Wrong.

    Cold Start uses pseudo-slot scoring to evaluate multiple-hop queries, in which each evaluation query is treated as if it selects a single indivisible slot. For example, an evaluation query that asks for the children of the siblings of an entity will be scored as if it were a query about a virtual per:nieces_and_nephews slot. [15] The guidelines in TAC KBP 2015 Slot Descriptions specify whether each of the component slots of a pseudo-slot is single-valued (e.g., per:date_of_birth) or list-valued (e.g., per:employee_of, per:children). A pseudo-slot is single-valued if each of its component slots is single-valued, and list-valued otherwise. In contrast to the Slot Filling task, Cold Start KB submissions may contain multiple fills for single-valued slots. If such are present in the submission, LDC will assess the slot fill with the highest confidence value, and will assess additional slot fills if resources allow. If more than one such slot fill shares the same confidence value, the slot fill that appears earlier in the output will be considered to have higher confidence.

    Each CSSF slot filler response (or CSKB object of each component relation that makes up a single evaluation query response) is assessed as Correct, ineXact, or Wrong. A response is inexact if it either includes only a part of the correct answer or includes the correct answer plus extraneous material. Inexact answers are counted as Wrong for the purposes of scoring. If the relation object is an event, the canonical_mention should be an ERE event trigger but (given the difficulty of determining exact extents of event mentions), assessors in 2017 will be lenient when assessing the extent of an otherwise correct event mention.

    For each query, all system responses in which the slot filler is assessed as Correct or ineXact will be partitioned into equivalence classes, where slot fillers in the same equivalence class represent the same entity, event, or value (as in the case of dates). Each Correct or ineXact response will receive an annotation for filler mention type (either NAM or NOM), and each equivalence class will receive an annotation for equivalence class mention type (NAM if the assessor can find a named mention for the filler anywhere in the provenances in any of the responses; otherwise, NOM if only nominal mentions appear in the provenances of all responses).

    Pseudo-slots will be scored just as slots in the Slot Filling task, with the additional constraint that both the slot fill and the path leading to that fill must be correct for the entirety to be judged correct. To receive credit for identifying Maggie Simpson as Patty Bouvier’s niece, the knowledge base must not only include Maggie as the slot fill, but must also represent Maggie as Marge’s child, and Marge as Patty’s sibling: [16]

    Evaluation query: Nieces and nephews of Patty Bouvier (per:siblings, per:children)
    Ground Truth:
        :PattyBouvier per:siblings :MargeSimpson
        :MargeSimpson per:children :MaggieSimpson
    Submission:
        :PattyBouvier per:siblings :MargeSimpson
        :MargeSimpson per:children :MaggieSimpson  ⇒ correct

    A KB that indicated that Maggie was Patty’s niece because she was Patty’s sister Selma’s child would be scored as incorrect:

    Evaluation query: Nieces and nephews of Patty Bouvier (per:siblings, per:children)
    Ground Truth:
        :PattyBouvier per:siblings :MargeSimpson
        :MargeSimpson per:children :MaggieSimpson
    Submission:
        :PattyBouvier per:siblings :SelmaBouvier
        :SelmaBouvier per:children :MaggieSimpson  ⇒ incorrect

    [15] A pseudo-slot is similar to the concept of a role chain, which is supported by some knowledge representation systems based on description logic, including OWL 2.

    [16] In each of these examples, only the subject, predicate and object are shown, and only a subset of the relevant knowledge base is presented. Each entity is named after the mention that gave rise to it.


    In addition, the object of the final relation in a pseudo-slot may be rated as redundant if it is equivalent to another fill for the pseudo-slot. Redundant answers are counted as Wrong for the purposes of scoring:

    Evaluation query: Nieces and nephews of Patty Bouvier (per:siblings, per:children)
    Ground Truth:
        :PattyBouvier per:siblings :MargeSimpson
        :MargeSimpson per:children :MaggieSimpson
        :MaggieSimpson per:alternate_names "Margaret Simpson"
    Submission:
        :PattyBouvier per:siblings :MargeSimpson
        :MargeSimpson per:children :MaggieSimpson  ⇒ correct
        :MargeSimpson per:children :MargaretSimpson  ⇒ redundant

    However, objects of relations other than the final relation will never be rated as redundant:

    Evaluation query: Nieces and nephews of Patty Bouvier (per:siblings, per:children)
    Ground Truth:
        :PattyBouvier per:siblings :MargeSimpson
        :MargeSimpson per:children :LisaSimpson
        :MargeSimpson per:children :BartSimpson
        :MargeSimpson per:alternate_names "Marjorie Simpson"
    Submission:
        :PattyBouvier per:siblings :MargeSimpson
        :PattyBouvier per:siblings :MarjorieSimpson
        :MargeSimpson per:children :LisaSimpson  ⇒ correct
        :MarjorieSimpson per:children :BartSimpson  ⇒ correct

    Here, MargeSimpson and MarjorieSimpson represent the same person in the ground truth, but two distinct entities in the KB. However, because the query is about Marge’s children and not about Marge herself, both responses to the evaluation query are assessed as correct.

    Since in Cold Start the facts being evaluated come from sequences of triples, confidence scores would need to be combined if we wanted to generate confidence scores for a derived pseudo-relation. Three general score combination functions are min, max and product. The proper way to combine scores of course depends on the meaning of those scores; for now, Cold Start will select the product function as a reasonable confidence combination function.
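    The three combination functions can be compared on a two-hop path with a few lines of code. This is an illustrative sketch; the function name combine_confidences and the example confidence values are made up for the demonstration.

```python
import math

COMBINERS = {
    "min": min,
    "max": max,
    "product": math.prod,  # the function selected for Cold Start
}


def combine_confidences(hop_confidences, how="product"):
    """Combine per-hop confidences into one pseudo-relation confidence."""
    return COMBINERS[how](hop_confidences)
```

    For a path whose hops have confidence 0.7 and 0.9, min gives 0.7, max gives 0.9, and product gives approximately 0.63, so the product penalizes every uncertain hop along the path.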

    Composite Evaluation Scoring

    Given the above approach to assessment, basic scoring for a given system proceeds as follows:

    • Each response assessed as Wrong or ineXact is counted as Spurious.

    • Each response for Round 2 whose Round 1 parent filler is assessed as Wrong or ineXact is counted as Spurious. This scoring policy assumes that the Cold Start system output is intended for fully automatic downstream analytics; hence, a hop 1 response is Wrong if the hop 0 parent response is Wrong.

    • Responses assessed as Correct or Inexact are grouped into equivalence classes. For queries with an entity as the predicate object, if the system has a NAM entity mention in the equivalence class, or if the system has only NOM entity mentions and the equivalence class is annotated as NOM, then the responses in the equivalence class are counted as Right; otherwise, if the system has only NOM entity mentions in the equivalence class and the equivalence class is annotated as NAM, then the responses in the equivalence class are counted as Ignore (i.e., treated as if they were never returned by the system) and removed from the equivalence class. Thus, named entity mentions are preferred and a named mention must be returned if one exists.


    • Reference = number of single-valued pseudo-slots with a correct response + number of equivalence classes[17] for all list-valued pseudo-slots
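The NAM/NOM rule in the list above can be sketched as a small decision function. This is only an illustration of the rule as worded there; the function name and the representation of mention types as a set of strings are assumptions, not part of the NIST scoring software.

```python
def rate_equivalence_class(system_mention_types, class_annotation):
    """Rate a system's responses in one equivalence class as Right or Ignore.

    system_mention_types: mention types ("NAM" or "NOM") that the system
        returned for this equivalence class (hypothetical representation).
    class_annotation: "NAM" or "NOM", as annotated for the class.
    """
    types = set(system_mention_types)
    if "NAM" in types:
        # A named mention was returned, so the responses count as Right.
        return "Right"
    if types == {"NOM"} and class_annotation == "NOM":
        # Only nominal mentions exist, and only nominal mentions were returned.
        return "Right"
    if types == {"NOM"} and class_annotation == "NAM":
        # A named mention exists but the system returned only nominals.
        return "Ignore"
    raise ValueError("case not covered by the rule as stated")


print(rate_equivalence_class({"NAM", "NOM"}, "NAM"))
print(rate_equivalence_class({"NOM"}, "NAM"))
```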

    After each pooled response has been assessed and (if assessed as Correct or Inexact) put into an equivalence class, NIST will score a submitted KB by computing a variant of mean average precision (MAP) for the KB. Average precision (AP) for a given query and submitted run is computed in the following way:

    0. Let k be the maximum number of justifications assessed per relation asserted in the KB. For TAC 2017, it is expected that k=3 for KB submissions. (SF submissions will have k=1.) We will also require that for a given filler, at most one justification is returned per document (ResolveQueries will consider only the highest confidence justification for that document).

    1. For each candidate filler (node) returned for the query, let the node confidence be the aggregate of the confidences of its justifications (up to k justifications j1, j2, ..., jk per node, having confidence c1, c2, ..., ck). If the node has a parent (i.e., the filler is a hop_1 filler), its node confidence is the product of the node confidence of its parent entity and the aggregate of the confidences of its justifications. There are many possible ways of aggregating justification-level confidence values to produce a node-level confidence value, and which aggregation function is appropriate will depend largely on how confidence values are used in each use case. For TAC 2017, the confidence of a node with justifications having confidence (c1, c2, …, ck), assuming the confidences are sorted in decreasing order, will be a normalized weighted sum of the confidence values, weighted based on the rank of the justification:

    (c1/1 + c2/2 + … + ck/k) / (1 + 1/2 + … + 1/k).

    2. Rank the candidate fillers by their node confidence.

    3. Go down the list of nodes, and assign a value v to each node, 0


    the filler is zero if the filler cannot be matched to an equivalence class, either because all justifications in the filler are Wrong, or because all equivalence classes in the filler have already been matched to some other higher ranked filler(s). The value v measures the number of correct justification documents that the KB provides for the matched equivalence class; v is a recall measure over justification documents.

    4. We then compute AP as usual, except that a "Correct" item in the ranked list does not always get counted as "1", but has some value 0
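Steps 1 and 2 above (node confidence and ranking) can be sketched in Python as follows. The function names are hypothetical. The text does not say how to normalize when a node has fewer than k justifications; this sketch assumes the weights are normalized over the justifications actually present.

```python
def aggregate_confidence(confidences, k=3):
    """Rank-weighted, normalized sum of the top-k justification confidences.

    Assumption: with fewer than k justifications, the normalizer uses only
    the ranks that are present. The task description does not settle this.
    """
    top = sorted(confidences, reverse=True)[:k]
    if not top:
        raise ValueError("a node needs at least one justification")
    weighted = sum(c / rank for rank, c in enumerate(top, start=1))
    normalizer = sum(1.0 / rank for rank in range(1, len(top) + 1))
    return weighted / normalizer


def node_confidence(confidences, parent_confidence=None, k=3):
    """Node confidence; a hop_1 node is multiplied by its parent's confidence."""
    own = aggregate_confidence(confidences, k)
    return own if parent_confidence is None else parent_confidence * own


# Hypothetical candidate fillers with their justification confidences.
candidates = {
    ":LisaSimpson": [0.9, 0.6, 0.3],
    ":BartSimpson": [0.8],
}
ranked = sorted(candidates, key=lambda n: node_confidence(candidates[n]), reverse=True)
print(ranked)
```

Because lower-ranked justifications receive smaller weights, adding a weak third justification lowers a node's confidence less than it would under a plain average.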


    3. Monolingual Chinese: entity mentions, slot fills and provenances are extracted only from Chinese documents. Evaluation queries will contain entry points only from Chinese documents.

    4. Cross-lingual: entity mentions, slot fills and provenances are extracted from any combination of English, Spanish, and Chinese documents. Evaluation queries will contain entry points from any of these languages, and the slot filler and justifications can come from a language different from the entry point. Because justification spans for a single justification must come from the same document, the cross-lingual nature of this evaluation condition is due to the exercise of cross-lingual EDL.

    If a team submits a run involving more than one language under the Cross-lingual condition, it must also submit at least one run under the monolingual condition for each language involved (with a description of which monolingual run configurations were used for each cross-lingual run).

    Submitted runs must be ranked (1-5). The run ID included in each team's submission file must be a concatenation of the team's TAC KBP 2017 team ID, the task (KB or SF), the language condition (ENG, CMN, SPA, or XLING), and a rank (1-5); thus "Acme_KB_XLING_1" would be the top-ranked run for the Acme team for the CSKB task under the cross-lingual condition.
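A run ID of the form described above could be built and checked with a small helper such as the following sketch. The underscore separator is inferred from the single example "Acme_KB_XLING_1"; the function names are hypothetical and this is not the official validation tool.

```python
TASKS = ("KB", "SF")
LANGUAGE_CONDITIONS = ("ENG", "CMN", "SPA", "XLING")


def make_run_id(team_id, task, language, rank):
    """Concatenate team ID, task, language condition and rank into a run ID."""
    if task not in TASKS:
        raise ValueError(f"task must be one of {TASKS}")
    if language not in LANGUAGE_CONDITIONS:
        raise ValueError(f"language condition must be one of {LANGUAGE_CONDITIONS}")
    if rank not in range(1, 6):
        raise ValueError("rank must be between 1 and 5")
    return f"{team_id}_{task}_{language}_{rank}"


def parse_run_id(run_id):
    """Split a run ID from the right, so a team ID may itself contain '_'."""
    team_id, task, language, rank = run_id.rsplit("_", 3)
    # Rebuilding the ID re-validates every component.
    make_run_id(team_id, task, language, int(rank))
    return team_id, task, language, int(rank)


print(make_run_id("Acme", "KB", "XLING", 1))
```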

    The top-ranked submission must be made as a ‘closed’ system; in particular, it must not access the Web during the evaluation period. All submissions must obey the following external resource restrictions:

    • Structured knowledge bases (e.g., Wikipedia infoboxes, DBPedia, Freebase) may not be used to directly fill slots or directly validate candidate slot fillers.

    • Structured knowledge base entries for target entities may not be edited, either during or after the evaluation.

    In addition, because Cold Start focuses on the condition where the knowledge base is initially empty, we ask that each participating site submit at least one run that consults external entity knowledge bases only after entities and relations have been extracted from the document collection. Details about submission procedures will be communicated to the track mailing list. Tools to validate formats will be available on the TAC Web site (http://www.nist.gov/tac/2016/KBP/ColdStart/tools.html).

    Appendix


    Relation  Inverse(s)
    per:children  per:parents
    per:other_family  per:other_family
    per:parents  per:children
    per:siblings  per:siblings
    per:spouse  per:spouse
    per:employee_or_member_of  {org,gpe}:employees_or_members*
    per:schools_attended  org:students*
    per:city_of_birth  gpe:births_in_city*
    per:stateorprovince_of_birth  gpe:births_in_stateorprovince*
    per:country_of_birth  gpe:births_in_country*
    per:cities_of_residence  gpe:residents_of_city*
    per:statesorprovinces_of_residence  gpe:residents_of_stateorprovince
    per:countries_of_residence  gpe:residents_of_country*
    per:city_of_death  gpe:deaths_in_city*
    per:stateorprovince_of_death  gpe:deaths_in_stateorprovince*
    per:country_of_death  gpe:deaths_in_country*
    org:shareholders  {per,org,gpe}:holds_shares_in*
    org:founded_by  {per,org,gpe}:organizations_founded*
    org:top_members_employees  per:top_member_employee_of*
    {org,gpe}:member_of  org:members
    org:members  {org,gpe}:member_of
    org:parents  {org,gpe}:subsidiaries
    org:subsidiaries  org:parents
    org:city_of_headquarters  gpe:headquarters_in_city*
    org:stateorprovince_of_headquarters  gpe:headquarters_in_stateorprovince*
    org:country_of_headquarters  gpe:headquarters_in_country*

    Table 2. Entity-valued SF slots. Slots with asterisks represent inverse relations that will need to be added by participants from previous years' Slot Filling task (2014 and earlier). The type qualifier of each relation (per, org or gpe) is the type of its subject, while the type qualifier for its inverse is the type of its object. A set of types means that any of those types is acceptable for that slot. All submitted slot names must use only a single type specification.
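As an illustration of how Table 2 might be applied, the following Python sketch derives the inverse triple for an asserted relation and resolves a type set such as {org,gpe} to the single type required in submitted slot names. Only a subset of Table 2 is encoded; the function name, the entity names, and the dictionary layout are hypothetical.

```python
# Subset of Table 2: relation -> inverse template (asterisks omitted).
INVERSES = {
    "per:children": "per:parents",
    "per:parents": "per:children",
    "per:siblings": "per:siblings",
    "per:spouse": "per:spouse",
    "per:employee_or_member_of": "{org,gpe}:employees_or_members",
    "org:founded_by": "{per,org,gpe}:organizations_founded",
    "org:city_of_headquarters": "gpe:headquarters_in_city",
}


def inverse_triple(subject, predicate, obj, obj_type):
    """Return the inverse of (subject, predicate, obj).

    obj_type is the entity type of the original object, which becomes the
    subject of the inverse and therefore supplies its type qualifier.
    """
    qualifier, name = INVERSES[predicate].split(":", 1)
    allowed = qualifier.strip("{}").split(",")
    chosen = obj_type.lower()
    if chosen not in allowed:
        raise ValueError(f"{obj_type} is not a valid object type for {predicate}")
    # Emit a single type qualifier, never the {...} set notation.
    return (obj, f"{chosen}:{name}", subject)


print(inverse_triple(":HomerSimpson", "per:employee_or_member_of", ":PowerPlant", "ORG"))
```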

    per:alternate_names  org:alternate_names
    per:date_of_birth  org:political_religious_affiliation
    per:age  org:number_of_employees_members
    per:origin  org:date_founded
    per:date_of_death  org:date_dissolved
    per:cause_of_death  org:website
    per:title
    per:religion
    per:charges

    Table 3. String-valued SF slots.


    Sentiment Predicates

    Subject  Predicate  Object  Inverse Predicate
    PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:dislikes  PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:is_disliked_by
    PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:is_disliked_by  PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:dislikes
    PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:is_liked_by  PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:likes
    PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:likes  PER,ORG,GPE,LOC,FAC  {per,org,gpe,loc,fac}:is_liked_by


    Event Predicates (event as subject)

    Subject  Predicate  Object  Inverse Predicate
    CONFLICT.ATTACK  conflict.attack:attacker  PER,ORG,GPE,STRING  {per,gpe,org}:conflict.attack_attacker
    CONFLICT.ATTACK  conflict.attack:instrument  STRING  none
    CONFLICT.ATTACK  conflict.attack:target  PER,ORG,GPE,FAC,STRING  {per,gpe,org,fac}:conflict.attack_target
    CONFLICT.ATTACK  conflict.attack:time  STRING  none
    CONFLICT.ATTACK  conflict.attack:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:conflict.attack_place
    CONFLICT.DEMONSTRATE  conflict.demonstrate:entity  PER,ORG,STRING  {per,org}:conflict.demonstrate_entity
    CONFLICT.DEMONSTRATE  conflict.demonstrate:time  STRING  none
    CONFLICT.DEMONSTRATE  conflict.demonstrate:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:conflict.demonstrate_place
    CONTACT.BROADCAST  contact.broadcast:audience  PER,ORG,GPE,STRING  {per,org,gpe}:contact.broadcast_audience
    CONTACT.BROADCAST  contact.broadcast:entity  PER,ORG,GPE,STRING  {per,org,gpe}:contact.broadcast_entity
    CONTACT.BROADCAST  contact.broadcast:time  STRING  none
    CONTACT.BROADCAST  contact.broadcast:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:contact.broadcast_place
    CONTACT.CONTACT  contact.contact:entity  PER,ORG,GPE,STRING  {per,org,gpe}:contact.contact_entity
    CONTACT.CONTACT  contact.contact:time  STRING  none
    CONTACT.CONTACT  contact.contact:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:contact.contact_place
    CONTACT.CORRESPONDENCE  contact.correspondence:entity  PER,ORG,GPE,STRING  {per,org,gpe}:contact.correspondence_entity
    CONTACT.CORRESPONDENCE  contact.correspondence:time  STRING  none
    CONTACT.CORRESPONDENCE  contact.correspondence:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:contact.correspondence_place
    CONTACT.MEET  contact.meet:entity  PER,ORG,GPE,STRING  {per,org,gpe}:contact.meet_entity
    CONTACT.MEET  contact.meet:time  STRING  none
    CONTACT.MEET  contact.meet:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:contact.meet_place
    JUSTICE.ARREST-JAIL  justice.arrest-jail:agent  PER,ORG,GPE,STRING  {per,org,gpe}:justice.arrest-jail_agent


    JUSTICE.ARREST-JAIL  justice.arrest-jail:crime  STRING  none
    JUSTICE.ARREST-JAIL  justice.arrest-jail:person  PER,STRING  per:justice.arrest-jail_person
    JUSTICE.ARREST-JAIL  justice.arrest-jail:time  STRING  none
    JUSTICE.ARREST-JAIL  justice.arrest-jail:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:justice.arrest-jail_place
    LIFE.DIE  life.die:agent  PER,ORG,GPE,STRING  {per,org,gpe}:life.die_agent
    LIFE.DIE  life.die:instrument  STRING  none
    LIFE.DIE  life.die:victim  PER,STRING  per:life.die_victim
    LIFE.DIE  life.die:time  STRING  none
    LIFE.DIE  life.die:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:life.die_place
    LIFE.INJURE  life.injure:agent  PER,ORG,GPE,STRING  {per,org,gpe}:life.injure_agent
    LIFE.INJURE  life.injure:instrument  STRING  none
    LIFE.INJURE  life.injure:victim  PER,STRING  per:life.injure_victim
    LIFE.INJURE  life.injure:time  STRING  none
    LIFE.INJURE  life.injure:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:life.injure_place
    MANUFACTURE.ARTIFACT  manufacture.artifact:agent  PER,ORG,GPE,STRING  {per,org,gpe}:manufacture.artifact_agent
    MANUFACTURE.ARTIFACT  manufacture.artifact:artifact  FAC,STRING  fac:manufacture.artifact_artifact
    MANUFACTURE.ARTIFACT  manufacture.artifact:instrument  STRING  none
    MANUFACTURE.ARTIFACT  manufacture.artifact:time  STRING  none
    MANUFACTURE.ARTIFACT  manufacture.artifact:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:manufacture.artifact_place
    MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:agent  PER,ORG,GPE,STRING  {per,org,gpe}:movement.transport-artifact_agent
    MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:artifact  FAC,STRING  fac:movement.transport-artifact_artifact
    MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:destination  GPE,LOC,FAC,STRING  {gpe,loc,fac}:movement.transport-artifact_destination
    MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:instrument  STRING  none
    MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:origin  GPE,LOC,FAC,STRING  {gpe,loc,fac}:movement.transport-artifact_origin
    MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:time  STRING  none
    MOVEMENT.TRANSPORT-PERSON  movement.transport-person:agent  PER,ORG,GPE,STRING  {per,org,gpe}:movement.transport-person_agent
    MOVEMENT.TRANSPORT-PERSON  movement.transport-person:destination  GPE,LOC,FAC,STRING  {gpe,loc,fac}:movement.transport-person_destination
    MOVEMENT.TRANSPORT-PERSON  movement.transport-person:instrument  STRING  none
    MOVEMENT.TRANSPORT-PERSON  movement.transport-person:origin  GPE,LOC,FAC,STRING  {gpe,loc,fac}:movement.transport-person_origin
    MOVEMENT.TRANSPORT-PERSON  movement.transport-person:person  PER,STRING  per:movement.transport-person_person
    MOVEMENT.TRANSPORT-PERSON  movement.transport-person:time  STRING  none
    PERSONNEL.ELECT  personnel.elect:agent  PER,ORG,GPE,STRING  {per,org,gpe}:personnel.elect_agent
    PERSONNEL.ELECT  personnel.elect:person  PER,STRING  per:personnel.elect_person
    PERSONNEL.ELECT  personnel.elect:position  STRING  none
    PERSONNEL.ELECT  personnel.elect:time  STRING  none
    PERSONNEL.ELECT  personnel.elect:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:personnel.elect_place
    PERSONNEL.END-POSITION  personnel.end-position:entity  ORG,GPE,STRING  {org,gpe}:personnel.end-position_entity
    PERSONNEL.END-POSITION  personnel.end-position:person  PER,STRING  per:personnel.end-position_person
    PERSONNEL.END-POSITION  personnel.end-position:position  STRING  none
    PERSONNEL.END-POSITION  personnel.end-position:time  STRING  none
    PERSONNEL.END-POSITION  personnel.end-position:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:personnel.end-position_place
    PERSONNEL.START-POSITION  personnel.start-position:entity  ORG,GPE,STRING  {org,gpe}:personnel.start-position_entity
    PERSONNEL.START-POSITION  personnel.start-position:person  PER,STRING  per:personnel.start-position_person
    PERSONNEL.START-POSITION  personnel.start-position:position  STRING  none
    PERSONNEL.START-POSITION  personnel.start-position:time  STRING  none
    PERSONNEL.START-POSITION  personnel.start-position:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:personnel.start-position_place
    TRANSACTION.TRANSACTION  transaction.transaction:beneficiary  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transaction_beneficiary
    TRANSACTION.TRANSACTION  transaction.transaction:giver  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transaction_giver
    TRANSACTION.TRANSACTION  transaction.transaction:recipient  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transaction_recipient
    TRANSACTION.TRANSACTION  transaction.transaction:time  STRING  none


    TRANSACTION.TRANSACTION  transaction.transaction:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:transaction.transaction_place
    TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:beneficiary  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transfer-money_beneficiary
    TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:giver  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transfer-money_giver
    TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:money  STRING  none
    TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:recipient  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transfer-money_recipient
    TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:time  STRING  none
    TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:transaction.transfer-money_place
    TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:beneficiary  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transfer-ownership_beneficiary
    TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:giver  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transfer-ownership_giver
    TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:recipient  PER,ORG,GPE,STRING  {per,org,gpe}:transaction.transfer-ownership_recipient
    TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:thing  FAC,ORG,STRING  {fac,org}:transaction.transfer-ownership_thing
    TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:time  STRING  none
    TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:place  GPE,LOC,FAC,STRING  {gpe,loc,fac}:transaction.transfer-ownership_place

    Event Predicates (event as object)

    Subject  Predicate  Object  Inverse Predicate
    PER,ORG,GPE  {per,org,gpe}:conflict.attack_attacker  CONFLICT.ATTACK  conflict.attack:attacker
    GPE,LOC,FAC  {gpe,loc,fac}:conflict.attack_place  CONFLICT.ATTACK  conflict.attack:place
    PER,ORG,GPE,FAC  {per,org,gpe,fac}:conflict.attack_target  CONFLICT.ATTACK  conflict.attack:target
    PER,ORG  {per,org}:conflict.demonstrate_entity  CONFLICT.DEMONSTRATE  conflict.demonstrate:entity


    GPE,LOC,FAC  {gpe,loc,fac}:conflict.demonstrate_place  CONFLICT.DEMONSTRATE  conflict.demonstrate:place
    PER,ORG,GPE  {per,org,gpe}:contact.broadcast_audience  CONTACT.BROADCAST  contact.broadcast:audience
    PER,ORG,GPE  {per,org,gpe}:contact.broadcast_entity  CONTACT.BROADCAST  contact.broadcast:entity
    GPE,LOC,FAC  {gpe,loc,fac}:contact.broadcast_place  CONTACT.BROADCAST  contact.broadcast:place
    PER,ORG,GPE  {per,org,gpe}:contact.contact_entity  CONTACT.CONTACT  contact.contact:entity
    GPE,LOC,FAC  {gpe,loc,fac}:contact.contact_place  CONTACT.CONTACT  contact.contact:place
    PER,ORG,GPE  {per,org,gpe}:contact.correspondence_entity  CONTACT.CORRESPONDENCE  contact.correspondence:entity
    GPE,LOC,FAC  {gpe,loc,fac}:contact.correspondence_place  CONTACT.CORRESPONDENCE  contact.correspondence:place
    PER,ORG,GPE  {per,org,gpe}:contact.meet_entity  CONTACT.MEET  contact.meet:entity
    GPE,LOC,FAC  {gpe,loc,fac}:contact.meet_place  CONTACT.MEET  contact.meet:place
    PER,ORG,GPE  {per,org,gpe}:justice.arrest-jail_agent  JUSTICE.ARREST-JAIL  justice.arrest-jail:agent
    PER  per:justice.arrest-jail_person  JUSTICE.ARREST-JAIL  justice.arrest-jail:person
    GPE,LOC,FAC  {gpe,loc,fac}:justice.arrest-jail_place  JUSTICE.ARREST-JAIL  justice.arrest-jail:place
    PER,ORG,GPE  {per,org,gpe}:life.die_agent  LIFE.DIE  life.die:agent
    GPE,LOC,FAC  {gpe,loc,fac}:life.die_place  LIFE.DIE  life.die:place
    PER  per:life.die_victim  LIFE.DIE  life.die:victim
    PER,ORG,GPE  {per,org,gpe}:life.injure_agent  LIFE.INJURE  life.injure:agent
    GPE,LOC,FAC  {gpe,loc,fac}:life.injure_place  LIFE.INJURE  life.injure:place
    PER  per:life.injure_victim  LIFE.INJURE  life.injure:victim
    PER,ORG,GPE  {per,org,gpe}:manufacture.artifact_agent  MANUFACTURE.ARTIFACT  manufacture.artifact:agent
    FAC  fac:manufacture.artifact_artifact  MANUFACTURE.ARTIFACT  manufacture.artifact:artifact
    GPE,LOC,FAC  {gpe,loc,fac}:manufacture.artifact_place  MANUFACTURE.ARTIFACT  manufacture.artifact:place
    PER,ORG,GPE  {per,org,gpe}:movement.transport-artifact_agent  MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:agent
    FAC  fac:movement.transport-artifact_artifact  MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:artifact
    GPE,LOC,FAC  {gpe,loc,fac}:movement.transport-artifact_destination  MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:destination
    GPE,LOC,FAC  {gpe,loc,fac}:movement.transport-artifact_origin  MOVEMENT.TRANSPORT-ARTIFACT  movement.transport-artifact:origin
    PER,ORG,GPE  {per,org,gpe}:movement.transport-person_agent  MOVEMENT.TRANSPORT-PERSON  movement.transport-person:agent
    GPE,LOC,FAC  {gpe,loc,fac}:movement.transport-person_destination  MOVEMENT.TRANSPORT-PERSON  movement.transport-person:destination
    GPE,LOC,FAC  {gpe,loc,fac}:movement.transport-person_origin  MOVEMENT.TRANSPORT-PERSON  movement.transport-person:origin
    PER  per:movement.transport-person_person  MOVEMENT.TRANSPORT-PERSON  movement.transport-person:person
    PER,ORG,GPE  {per,org,gpe}:personnel.elect_agent  PERSONNEL.ELECT  personnel.elect:agent
    PER  per:personnel.elect_person  PERSONNEL.ELECT  personnel.elect:person
    GPE,LOC,FAC  {gpe,loc,fac}:personnel.elect_place  PERSONNEL.ELECT  personnel.elect:place
    ORG,GPE  {org,gpe}:personnel.end-position_entity  PERSONNEL.END-POSITION  personnel.end-position:entity
    PER  per:personnel.end-position_person  PERSONNEL.END-POSITION  personnel.end-position:person
    GPE,LOC,FAC  {gpe,loc,fac}:personnel.end-position_place  PERSONNEL.END-POSITION  personnel.end-position:place
    ORG,GPE  {org,gpe}:personnel.start-position_entity  PERSONNEL.START-POSITION  personnel.start-position:entity
    PER  per:personnel.start-position_person  PERSONNEL.START-POSITION  personnel.start-position:person
    GPE,LOC,FAC  {gpe,loc,fac}:personnel.start-position_place  PERSONNEL.START-POSITION  personnel.start-position:place
    PER,ORG,GPE  {per,org,gpe}:transaction.transaction_beneficiary  TRANSACTION.TRANSACTION  transaction.transaction:beneficiary
    PER,ORG,GPE  {per,org,gpe}:transaction.transaction_giver  TRANSACTION.TRANSACTION  transaction.transaction:giver
    GPE,LOC,FAC  {gpe,loc,fac}:transaction.transaction_place  TRANSACTION.TRANSACTION  transaction.transaction:place
    PER,ORG,GPE  {per,org,gpe}:transaction.transaction_recipient  TRANSACTION.TRANSACTION  transaction.transaction:recipient
    PER,ORG,GPE  {per,org,gpe}:transaction.transfer-money_beneficiary  TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:beneficiary
    PER,ORG,GPE  {per,org,gpe}:transaction.transfer-money_giver  TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:giver
    GPE,LOC,FAC  {gpe,loc,fac}:transaction.transfer-money_place  TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:place
    PER,ORG,GPE  {per,org,gpe}:transaction.transfer-money_recipient  TRANSACTION.TRANSFER-MONEY  transaction.transfer-money:recipient
    PER,ORG,GPE  {per,org,gpe}:transaction.transfer-ownership_beneficiary  TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:beneficiary
    PER,ORG,GPE  {per,org,gpe}:transaction.transfer-ownership_giver  TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:giver
    GPE,LOC,FAC  {gpe,loc,fac}:transaction.transfer-ownership_place  TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:place
    PER,ORG,GPE  {per,org,gpe}:transaction.transfer-ownership_recipient  TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:recipient
    FAC,ORG  {fac,org}:transaction.transfer-ownership_thing  TRANSACTION.TRANSFER-OWNERSHIP  transaction.transfer-ownership:thing
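The event tables follow a regular naming pattern: an event-subject predicate such as conflict.attack:attacker pairs with an inverse whose type qualifier is the filler's entity type and whose name joins the event type and role with an underscore. The Python sketch below illustrates that pattern only. The function name is hypothetical, the set of roles without an inverse is read off the rows marked "none", and the sketch does not check which entity types each role allows.

```python
# Roles whose rows list "none" as the inverse predicate in the event tables.
ROLES_WITHOUT_INVERSE = {"time", "instrument", "crime", "position", "money"}


def event_inverse(event_predicate, filler_type):
    """Return the inverse predicate name for an event argument, or None.

    event_predicate: e.g. "conflict.attack:attacker"
    filler_type: entity type of the argument ("PER", "ORG", ...) or "STRING"
    """
    event_type, role = event_predicate.split(":", 1)
    if role in ROLES_WITHOUT_INVERSE or filler_type.upper() == "STRING":
        # String fillers are not KB entities, so they carry no inverse.
        return None
    return f"{filler_type.lower()}:{event_type}_{role}"


print(event_inverse("conflict.attack:attacker", "PER"))
print(event_inverse("life.die:time", "STRING"))
```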

    Change History

    • Version 1.0

      o Original draft version, based on the 2016 specification