
D5.4 - DASISH Web Annotation (DWAN) framework


Contents

1. Figure 12: Editing annotation. When the user creates an annotation, all registered users except the creator (owner) get read access. The owner has write access, and users with write access can edit the annotation. Only the owner of an annotation can change the rights of other users and delete the annotation. To change the access rights of an annotation, right-click it and select Permissions. Fill in the pop-up form (see Figure 13). Public access defines the minimal access rights for each logged-in user. For instance, if it is set to read, then each logged-in user is able to read the annotation. The rights of a particular user are defined as the maximum of public access and that user's individually set rights. For instance, the user with the e-mail xxx.yyy@mpi.nl in Figure 13 has write access. To delete an annotation, look for it in the list, right-click it and select Delete. [Figure 13 form: Public access: read. Access per user: Email xxx.yyy@mpi.nl, Permission: write.]
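The rule that a user's rights are the maximum of public access and the individually set rights can be illustrated with a minimal sketch. The level names and their ordering ("none" below "read" below "write") and both function names are assumptions made for this illustration; they are not taken from the DWAN code base.

```python
# Assumed ordering of access levels: "none" < "read" < "write".
ACCESS_ORDER = {"none": 0, "read": 1, "write": 2}


def effective_access(public_access, individual_access="none"):
    """Return the stronger of the public and the individually set level."""
    return max(public_access, individual_access, key=ACCESS_ORDER.__getitem__)


def can_edit(public_access, individual_access="none", is_owner=False):
    """The owner always has write access; others need an effective 'write'."""
    return is_owner or effective_access(public_access, individual_access) == "write"


# Public access is "read", one user was individually granted "write".
print(effective_access("read", "write"))
print(can_edit("read"))
```

With public access set to read, a user without an individual entry can read but not edit, while the user who was granted write can do both.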
2. DASISH: Data Service Infrastructure for the Social Sciences and Humanities. EC FP7 Grant Agreement Number 283646. Deliverable Report. Deliverable: D5.4. Deliverable Name: DASISH Web Annotation (DWAN) framework. Task Leader: Olha Shkaravska (MPG TLA). Work Package Leader: Daan Broeder (MPG TLA). Contributing Partners and Editors: Valentina Ascuitti (KCL), Daan Broeder (MPG TLA), Stuart Dunn (KCL), Twan Goosen (MPG TLA), Indrek Jentson (University of Tartu), Przemek Lenkiewicz (MPG TLA), Kees Jan van de Looi (MPG TLA), Olof Olsson (UGOT), Stephanie Roth (UGOT), Olha Shkaravska (MPG TLA), Menzo Windhouwer (MPG TLA). SEVENTH FRAMEWORK PROGRAMME. www.dasish.eu, GA no. 283646.
Table of Contents
2 Executive Summary
3 Introduction to the DWAN framework
3.1 Motivation and goals
3.2 Requirements and user scenarios
4 [heading illegible]
4.1 Potential DWAN client prototypes: state of the art on September 2012
4.2 Developments after September 2012
3. POST api/notebooks/nid: Creates a new annotation in nid. The content of the annotation is given in the request body. In fact, this is a shortcut for two actions: POST api/annotations and PUT notebooks/nid/annotation/aid. Return type: Envelope (NotebookResponseBody).
5.4 DWAN front-end(s)
Wired-Marker based front-end
The original Wired Marker software is freeware developed in Japan as part of the Integrated Database Project, sponsored by the Ministry of Education, Culture, Sports, Science and Technology (development code name: ScrapParty), for supporting the construction of databases. The tool's concept and design are credited to BITS Co., Ltd. and Prof. Okubo. Wired Marker is licensed under a Creative Commons License. This includes a No Derivative Works condition, which means that modified code cannot be distributed. Under a special agreement between BITS Co., Ltd. and the MPI for Psycholinguistics, this condition has been waived. Wired Marker, as well as the Wired-Marker based DWAN client, is a Firefox extension that can be used with Firefox versions greater than 2.0. The DWAN client can be downloaded as an XPI file from the DASISH GitHub repository at https://github.com/DASISH/dwan-client-wiredmarker/releases. A more detailed description of how to install the extension can be found in the Manual (see the Appendix of this deliverable). After the add-on has been installed, a new menu item called DASISH Web Annot
4. ... that person and building. The user is likely to wish to share selected parts of the original resource via email, Twitter and Facebook. A scholar will wish to share only by email. A curator or public engagement professional may wish to share via social media, e.g. using the AskACurator or MuseumsWeek hashtags. To do this, they will have to Save their own annotations locally. It will be necessary to Track versions of annotations. The user will wish to Tag whole images with keywords. This functionality is already supported by www.flickr.com/commons, so the use of the Flickr API would be more appropriate than the construction of a new system. They should have the ability to embed bibliographic references in the annotations. They could then, for example, connect related entries from the V&A catalogue in London (http://collections.vam.ac.uk), treating each collection entry as a bibliographic entity.
3. Web page annotation
Review of tools available: Mediathread, Rehersal Assistant, Vertov, A.nnotate.com, Annozilla (Annotea on Mozilla), Fleck, NoteBook, Project Pad, SharedCopy, Springpad, Trailfire. All but three of these tools are editorial. This reflects the fact that browser-based bookmarking and generic services such as https://delicious.com are adequate to meet most researchers' needs for organizing collections of web pages; the need for editorial, comment-based annotation is fa
5. <!-- ... used in the target -->
<xs:complexType name="CachedRepresentationFragment">
  <xs:sequence>
    <xs:element name="fragmentString" type="xs:string" minOccurs="1" maxOccurs="1"/>
  </xs:sequence>
  <xs:attribute name="href" type="xs:anyURI" use="required"/>
</xs:complexType>
<xs:complexType name="CachedRepresentationFragmentList">
  <xs:complexContent>
    <xs:extension base="dasish:List">
      <xs:sequence>
        <xs:element name="cached" type="dasish:CachedRepresentationFragment" minOccurs="0" maxOccurs="unbounded"/>
      </xs:sequence>
    </xs:extension>
  </xs:complexContent>
</xs:complexType>
<xs:complexType name="Target">
  <xs:sequence>
    <xs:element name="lastModified" type="xs:dateTime" minOccurs="1" maxOccurs="1"/>
    <xs:element name="link" type="xs:anyURI" minOccurs="1" maxOccurs="1"/>
    <xs:element name="version" type="xs:string" minOccurs="1" maxOccurs="1"/>
    <xs:element name="siblingTargets" type="dasish:ReferenceList" minOccurs="1"/>
    <xs:element name="cachedRepresentatinons" type="dasish:CachedRepresentationFragmentList" minOccurs="1"/>
  </xs:sequence>
  <xs:attribute name="href" type="xs:anyURI" use="required"/>
  <xs:attribute ref="xml:id" use="required"/>
</xs:complexType>
<xs:complexType name="TargetInfo">
  <xs:sequence>
    <xs:element name="link" type="x
6. active i yes For Moodle and a net Java Script Open Journal system Annotating genomes http www yandell Need an lab org software m active no account was html Annotating notes Last webpages TrailFire mentioned Firefox IE categrozing annotated in 2007 web pages sharing Social networking and news website Need an REddIT active no server account obviously not available any more ReframeIT only light weight demo on website www dasish eu GA no 283646 6 add ons outdated integration info missing on official website premium paid edition under under development developme not yet Scrible n fie available P ubie bela license version available ads modifications allowed State uncertain According to http en wi kipedia org wiki Web _annotatio n SharedCopy Developm ent has stopped Observe copyright date of official website 2012 Developm ShiftSpace ent has stopped PDF reader and node taker Skim active BDS license OS X platinum Adding notes to PDF and pro lite web pages version WebNotes active aopountIS needed modification under permission www dasish eu GA no 283646 7 http info jkn com firefo x htm Light version with available features web page annotation organize and search notes share notes via JKN new i email twitter and permalink or any other
7. syntax semanti c annotation syntax semanti c annotation syntax semanti c annotation syntax semanti c annotation syntax semanti c annotation syntax semanti c annotation syntax semanti c annotation syntax semanti c annotation image annotation time series annotation time series annotation Commenting critical responses and stating preferences Contextualization Linking Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Contextualization Contextualization Linking cataloguing Cataloguing Linking GA no 283646 text text text image text text text text text text text image text video informal formal formal informal formal informal formal formal formal informal formal informal 33 Mediathread Rehersal Assistant Vertov A nnotate com Annozilla Annotea on Mozilla Fleck NoteBook Project Pad SharedCopy Springpad Trailfire Pliny Editorial configurational Editorial Editorial Editorial Editorial Editorial Editorial Editorial configurational Editorial Configurational Configurational Editorial www dasish eu web media annotation web media annotation web media
8. 5 DASISH Web Annotator (DWAN)
5.1 Framework architecture
5.2 DWAN's Data Model and its connection to Open Annotation Model
5.3 DWAN Back-end
Architecture in a nutshell
Database and Database Access Objects
REST Application Programming Interface
5.4 DWAN front-end(s)
Wired-Marker based front-end
Front-end for [illegible]
Front-end for ANNEX
5.5 [heading illegible]
6 Social Sciences and Humanities: Results and Outlook
6.1 List of annotation tools used by the HSS community
9. UC 12: Textual interpretation/translation. UC 13: Enhance text with links. UC 14: Enhance text with images. UC 15: Enhance image with text. UC 16: Enhance text with video. UC 17: Enhance text with audio. UC 18: Insert definitions. UC 19: Insert references.
Such Use Cases can then be grouped under six headings: bibliography, image, web page, syntax/semantics, wiki and video. One Use Case can fall under several headings.
- Bibliography: UC 1, UC 2, UC 9, UC 11, UC 13
- Image: UC 9, UC 11, UC 3, UC 6, UC 7, UC 8, UC 14, UC 15, UC 18, UC 19
- Web page: UC 1, UC 2, UC 9, UC 11, UC 3, UC 6, UC 4, UC 7, UC 8, UC 12, UC 13, UC 18, UC 19, UC 16
- Syntax/semantic: UC 1, UC 2, UC 9, UC 11, UC 4, UC 5, UC 7, UC 8, UC 12, UC 13, UC 18, UC 19, UC 16, UC 17
- Wiki: UC 1, UC 2, UC 9, UC 11, UC 4, UC 5, UC 7, UC 8, UC 10, UC 12, UC 13, UC 18, UC 19, UC 16, UC 17
- Video: UC 9, UC 11, UC 3, UC 6, UC 7, UC 8, UC 19
The mapping from a topic to its list of features can be illustrated by six corresponding user scenarios.
1. Bibliographic annotation
Review of tools available: LitBlitz Literature Notes Manager, NoodleTools Projects, Oigga, Sente. All but one of these are configurational, i.e. they tend to support the organization and ordering of database records rather than the annotation of those records with further information. Scenario: a user has a bibliog
10. ... the risks associated with the plan. Testing was performed at several points in the life cycle as the product was developed. Testing is a very dependent activity. As a result, test planning is a continuing activity performed throughout the system development life cycle. The scope of the DWAN testing activity includes:
- the server API for DWAN release 1.0 server-side software
- DWAN release 1.0 client-side software for the Firefox browser
- the DWAN User Manual
The scope of this testing activity does not include DWAN release 1.0 server-side software and DWAN development documentation.
Requirements
Testing consists of several phases. Each phase may or may not include testing of any one or more of the following aspects of the DWAN software (listed alphabetically): availability, content, functionality, performance, reliability, scalability, security, usability. The API for the server-side software is tested separately with several Python scripts. The client-side software is tested manually by following some basic test scenarios. Testing is performed on the client side with the operating systems Windows 7, Windows 8, Mac OS X or Linux. For testing of the browser plugin, the latest Mozilla Firefox (version 29 or later) is used. For testing of the server API, the Python programming environment with the unit testing framework and the package Requests 2.3.0 (https://pypi.python.org/pypi/requests) is used. All discovered software anomalies during the testing are
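A server API check in the style described above might look like the following minimal sketch. It is not one of the project's actual test scripts. The base URL is a placeholder, `principal_status` and the fake classes are names invented for this illustration, and the fake session stands in for any object with a `get` method, such as a `requests.Session`. The endpoint path and the 401 status for a principal that is not logged in are taken from the API description elsewhere in this deliverable.

```python
import unittest

# Placeholder server location (an assumption, not the real service URL).
BASE_URL = "https://example.org/ds/webannotator"


def principal_status(session, base_url=BASE_URL):
    """Request the logged-in principal and return the HTTP status code."""
    response = session.get(base_url + "/api/authentication/principal")
    return response.status_code


class FakeResponse:
    """Minimal stand-in for a response object (illustration only)."""

    def __init__(self, status_code):
        self.status_code = status_code


class FakeSession:
    """Stand-in for requests.Session so the test needs no network."""

    def __init__(self, status_code):
        self.status_code = status_code
        self.last_url = None

    def get(self, url):
        self.last_url = url
        return FakeResponse(self.status_code)


class PrincipalEndpointTest(unittest.TestCase):
    def test_anonymous_request_is_rejected(self):
        # A principal that is not logged in should receive 401.
        self.assertEqual(principal_status(FakeSession(401)), 401)

    def test_request_targets_the_principal_endpoint(self):
        session = FakeSession(200)
        principal_status(session)
        self.assertTrue(session.last_url.endswith("/api/authentication/principal"))
```

Saved as a module, the tests can be run with `python -m unittest`; replacing `FakeSession` with an authenticated `requests.Session` would turn the same check into a live test.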
11. ... myserver/ds/webannotator/basic in the User Specified box and close. See Figure 9 for an example.
Footnote 3: Currently the DWAN back-end is connected to the CLARIN trust federation and allows access from all home organizations using CLARIN services, as well as from all eduGAIN-connected home organizations. For example, the current DWAN annotation service is located at https://lux17.mpi.nl/ds/webannotator/basic
Figure 8: Authentication using Federated Identity
Figure 9: Configuring the Service Location. [Dialog: Select the back-end server for the annotation client. Default: https://lux17.mpi.nl/ds/webannotator. User specified: https://lux17.mpi.nl/ds/webannotator/basic.]
Viewing annotations
Annotations created on other client instances or by other users are all listed in the Incoming folder in the left side box. The DASISH website is the default webpage. Navigate to the page you are interested in. Click the reload icon in the browser bar. If t
12. ... request is sent, then in the case of success the server returns the serialized information about the added or updated resource, together with a standard HTTP response code. If an annotation is posted or updated, the server returns an XML document of type envelope, which contains a serialization of the resource together with the list of actions which the client should perform to complete the request in a sound way. For instance, if an annotation is posted and for one of its targets there is no cached representation in the database, the list of actions contains a reminder to post a cached representation for the corresponding target id. In the case of failure of the request, the corresponding error status is returned, with a detailed message when necessary, e.g. 401 Unauthorized access if the principal is not logged in (except for the log-in service). Before describing the requests in more detail, we give the list of notations used in Table 2.
Table 2: Notations
aid: annotation identifier
cid: cached representation identifier
datetime: date and time including time zone, as defined in http://www.w3.org/TR/xmlschema-2/#dateTime
nid: notebook identifier
prefix: the prefix of a namespace
tid: target identifier
text: some text
prid: principal's identifier
URI: URI as defined in http://tools.ietf.org/html/rfc3986
Principal: a user (person) or a group of users
Footnote 16: http://shibboleth.net
13. Figure 11: Selecting a marker. Adding possibilities to change access rights is currently work in progress.
14. Firstly, the user will need to Save their own annotations, in the form of Add comments: scribbled notes, text to text and text to image. These are stored in a shared collaborative space. The annotations will need to contain metadata detailing the page URL and the part of the page being referred to. It will be necessary to specify start and end points, allowing the user to Highlight text and Highlight images. For this scenario it will not be necessary to highlight parts of images. Each annotation will have to be able to point to multiple parts of the same web page or to multiple web pages. In a shared collaborative environment it will be necessary to Track versions of annotations, including responsibility for different versions. This scenario reflects the probability that collaborative annotation is likely to be of scholarly use only within relatively well-defined groups of researchers working on a common task. The tools overview suggests that there is less demand for community-wide annotation applications.
4. Syntactic and Semantic annotation
Review of tools available: CLAWS Tagger, GATE, MMax2, Melita, Pundit, Thinkport Annotator, UAM CorpusTool, Versioning Machine, Word Hoard, WordFreak, brat rapid annotation tool, QDA Miner (Qualitative Data Analysis Software for Qualitative Research). Text annotation, both structured (syntactic) and unstructured (semantic), is a fundamental part of the research process in most discip
15. HSS community
The list of more than 50 tools has been generated from a simple search using the keyword "annot" in the Tools E-Registry for E-Social science, Arts and Humanities (TERESAH) registry, currently under development for WP2. Here we briefly describe ten of them which, from our point of view, look the most promising as potential DWAN front-ends.
ANNIS is an open source, versatile, web browser-based search and visualization architecture for complex multilevel linguistic corpora with diverse types of annotation. ANNIS, which stands for ANNotation of Information Structure, has been designed to provide access to the data of the SFB 632 "Information Structure: The Linguistic Means for Structuring Utterances, Sentences and Texts". Since information structure interacts with linguistic phenomena on many levels, ANNIS2 addresses the SFB's need to concurrently annotate, query and visualize data from such varied areas as syntax, semantics, morphology, prosody, referentiality, lexis and more. For projects working with spoken language, support for audio/video annotations is also required. In the SFB, a number of different projects collect and annotate data according to the common SFB Annotation Standard. This data, which is annotated using both automatic taggers and parsers and a small set of manual annotation tools (EXMARaLDA, ELAN, annotate, Synpathy, MMAX, RSTTool), is mapped onto the encoding standard of the SFB, PAULA (Potsdamer Austauschformat für Lingui
16. ... annotatable source. An instance of the Target and TargetInfo type has a string attribute version, which is to be filled in by a client when an annotation is posted or updated and sent to the server. An Annotation type contains target info elements that keep information about the annotation targets.
api/targets
Table 6: API for resource Target
GET api/targets/tid: Returns the target with a given id.
GET api/targets/tid/versions: Returns the list of the URIs of all the sibling versions of tid, that is, targets related to the same source (the same link). Return type: ReferenceList.
POST api/targets/tid/fragment/fragmentdescriptorstring/cached: A 2-part POST with the request body consisting of a serialised CachedRepresentationInfo instance and a single file representing the cached representation itself (HTML document, image, etc.); multiple files must be archived. Return type: CachedRepresentationInfo.
DELETE api/targets/tid/cached/cid: Removes the connection between tid and cid. The cached representation is removed from the database as well, unless there are more references to this representation. Return type: String messaging how many rows in the junction table have been removed (should be 0 or 1).
api/cached
It is possible to store the cached representation not only of the fragment precisely corresponding to the annotation's target, but also of a larger fragment and even of the entire annotatable document. For instance th
17. annotation web page annotation web page annotation web page annotation web page annotation web page annotation web page annotation web page annotation web page annotation web page annotation PDF annotation Linking cataloguing Contextualization Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Commenting critical responses and stating preferences Collaborative tagging Linking Commenting critical responses and stating preferences GA no 283646 text image video video audio text image text image text image text image text image text image video sound text image text image text image text image informal informal informal informal informal informal informal informal informal informal informal informal 34 Configurational Bibliopedia Fado wiki annotation Contextualization text informal N editorial FromThePage Editorial wiki annotation Transcription text informal Y ANNIS Editorial Contextualization text formal N q 2 Comm
18. ... lists of important general entities. These will include, but not exhaustively: important personages such as Caesar and Sextus Pompey; contemporary events such as the formation of the First Triumvirate and the Civil War; places such as Rome and Brundisium; roles such as aedile and senator; laws. Any word, phrase or passage that the user wishes to associate with these events would need to be defined, and an associative term or terms selected. Assuming the critical edition will involve translation of all or part of the corpus, the user will need to annotate any passages where the translation is for any reason indirect. It will be essential for the user to be able to Track versions of annotations and to be able to delete obsolete versions. They will need to be able to Save their own annotations. The user will need to be able to Modify text: Add information to text within the text, as well as delete information within the text if, in their judgement, there is repetition or trantextual inaccuracy, or if abridgement is needed for any other reason. The deletion and the text deleted should be preserved as an annotation. The user will need to be able to embed links to other texts, bibliography, video and image media.
5. Wiki-based annotation
Review of tools available: Bibliopedia. The requirements for wiki-based annotation are similar to those required for web page annotation. However, there is an additional require
19. Figure 10: Viewing annotations of other users
Annotating documents and editing annotations
To annotate a web document, navigate to the corresponding webpage and select a text fragment with the mouse. After a right-click, select Marker folder in the menu. Next, select the color you would like to use to mark the text fragment (see Figure 11). Following this, a pop-up text box with two fields appears. One can assign a distinctive title to the annotation in the Title field and write a clear, short description in the Annotation field. To save the work, click OK. This creates the annotation, which is now shown on the web page. To update the a
20. ... readers and writers. As one can expect, a reader is a user that can read the annotation, and a writer can also make changes to it. Thus a registered principal can be related to an annotation by means of one of three access modes: reader, writer, none. An annotation can have one or more targets. A target, i.e. an instantiation of the Target class, contains the reference to the web document (a source) and the precise description of the document's fragment which is actually annotated. A target can also be related to one or more cached representations. A cached representation is a stored record that contains representations of the relevant parts of the annotated document, together with the descriptions of their respective annotated fragments.
Footnote 7: http://www.openannotation.org/spec/core/
Footnote 8: Recall that a principal is either a user or a group of users, and for the current version of DWAN user and principal are synonyms. Creating user groups is a matter for future work.
Figure 3: DWAN Data Model. [Diagram labels: blob, mimeType, tool, fragment descriptor (String), URI, time stamp, owner ID, headline, principal URI, display name, e-mail, access mode, XML body, text; managed by DWAN: the web fragment (HTML, URI).]
The semantics of an annotation is given in its body. In the implementation, a body is an arbitrary text or an XML text. In both cases a client must gi
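The relations described in this excerpt (an annotation with one or more targets, a target with zero or more cached representations, and principals related to an annotation by an access mode) can be pictured as plain data classes. This is an illustrative sketch only: the class and field names are assumptions loosely based on the labels of Figure 3 and do not reproduce the DWAN database schema or its XML types.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class CachedRepresentation:
    mime_type: str  # e.g. "text/html" or "image/png"
    tool: str       # tool that produced the stored representation


@dataclass
class Target:
    source_uri: str           # reference to the annotated web document
    fragment_descriptor: str  # description of the annotated fragment
    cached: List[CachedRepresentation] = field(default_factory=list)


@dataclass
class Annotation:
    owner: str
    headline: str
    body: str  # arbitrary text or XML text
    targets: List[Target] = field(default_factory=list)
    # Maps a principal to "reader" or "writer"; absent principals get "none".
    access: Dict[str, str] = field(default_factory=dict)

    def access_mode(self, principal: str) -> str:
        """Return one of the three access modes: reader, writer, none."""
        return self.access.get(principal, "none")


note = Annotation(
    owner="alice",
    headline="Start of construction",
    body="Construction began in 1882.",
    targets=[Target("http://example.org/page", "fragment-1")],
    access={"bob": "reader"},
)
print(note.access_mode("bob"), note.access_mode("carol"))
```

Here `alice`, `bob` and `carol` are made-up principals; a principal without an entry in `access` falls back to the mode none.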
21. registered in the project issue management pages under the GitHub e https github com DASISH dwanclientwiredmarker and e https github com DASISH dwanback end For back end the testers have implemented a python script what tries to perform several API operations https github com DASISH dwan testing tree master scripts www dasish eu GA no 283646 27 6 Social Sciences and Humanities Results and Outlook Annotation is an activity which runs throughout all scholarly work in all disciplines The purpose of this section is to give context to the DASISH Web ANnotation framework DW AN annotation tool to explore how annotation works in the broader context of scholarly communication in the humanities and Social Sciences HSS and to set out a series of scenarios which users in these domains are likely to encounter when faced with tasks requiring annotation and related activities This review will comprise of three main elements 6 1 a list of software annotation tools drawn from the Tools e Registry for E Social science Arts and Humanities TERESAH registry which is the primary output of DASISH Work Package 2 a mapping of these tools functionality and where it can be determined their usages to the typology proposed by Dunn and Hedges 2012 in their report on crowd sourcing in cultural heritage and the humanities and a set of user scenarios based on this analysis List of annotation tools used by the
22. ... runs on your computer and manages annotations and notes that you gather as you are reading.
- Pundit is a semantic annotation and augmentation tool. It enables users to create structured data while annotating web pages. Annotations range from simple comments, to semantic links to web-of-data entities (such as Freebase.com and DBpedia.org), to fine-grained cross-references and citations. Pundit can be configured to include custom controlled vocabularies. In other words, annotations can refer to precise entities and concepts, as well as express precise relations among entities and contents. Pundit is designed to enable groups of users to share their annotations and collaboratively create structured knowledge.
- UVic Image Markup Tool allows users to describe and annotate images and store the resulting data in TEI XML files, all within an interface simple enough to be used by people with little or no experience in editing XML code. It is designed to be Windows-only but can be successfully run on Linux using Wine. It supports a wide variety of image formats and saves markup information in conformant TEI P5 XML files. It has a simple graphical interface that lets you see the image and the fields for entering your markup, notes and annotations, which are visually represented on the image. The tool allows knowledgeable TEI users to add additional TEI markup tags to their annotations. The too
23. schema definition schema definition segmenting video syntax semanti c annotation syntax semanti c annotation Commenting critical responses and stating preferences Commenting critical responses and stating preferences Linking Contextualization Commenting critical responses and stating preferences Contextualization Commenting critical responses and stating preferences Linking Commenting critical responses and stating preferences Commenting critical responses and stating preferences Cataloguing Collaborative tagging GA no 283646 text image images text geospati al text image text image text image video video video text text informal informal formal informal informal informal informal informal informal informal formal formal 32 MMax2 Melita Pundit Thinkport Annotator UAM CorpusTool Versioning Machine Word Hoard WordFreak brat rapid annotation tool QDA Miner Qualitative Data Analysis Software for Qualitative Research Annotation Graph Toolkit AGTK VideoANT Editorial Editorial configurational Configurational Editorial Configurational Editorial Editorial Editorial Editorial configurational Editorial configutrational Configurational Configurational www dasish eu syntax semanti c annotation syntax semanti c annotation
24. [request garbled] ... the targets of aid. Return type: ReferenceList.
DELETE api/annotations/aid: Removes aid from the database, together with all its targets to which no other annotation refers. Return type: String messaging how many rows have been deleted (should be 0 or 1).
PUT api/annotations/aid: Updates the annotation with aid. For instance, it is used when prid wants to correct typos in the annotation body and change annotated fragments. See PUT api/annotations/aid/body for correcting the body only. The serialized representation of the updated annotation is given in the request body. Return type: Envelope (AnnotationResponseBody).
PUT api/annotations/aid/body: Updates the body of the annotation aid. Used e.g. for correcting typos in the text part. Return type: Envelope (AnnotationResponseBody).
GET api/annotations/aid/permissions: Returns the list of permissions for aid. If a user is not included in the list, that user's access is defined by the public attribute. Return type: PermissionList.
PUT api/annotations/aid/permissions: Updates the permission list. The new permission list is given serialized in the request body. Return type: Envelope (PermissionResponseBody).
api/annotations/aid/permissions/prid: Updates the access mode for the annotation aid and principal prid. The new access mode is given in the body of the request. Return type: String messaging how many rows have been updated or added (should be 0 or 1).
Targets
A target represents a specific fragment of a specific version of an
25. the resource on the DASISH server. Resource info types (TargetInfo, AnnotationInfo, NotebookInfo) contain a reference to the corresponding resource plus the most important information about the resource. There are corresponding lists of resource info types: Targetinfos, AnnotationInfos, NotebookInfos.
<xs:schema targetNamespace="http://www.dasish.eu/ns/addit" xmlns:xs="http://www.w3.org/2001/XMLSchema" elementFormDefault="qualified" xmlns:dasish="http://www.dasish.eu/ns/addit">
  <xs:import namespace="http://www.w3.org/XML/1998/namespace" schemaLocation="http://www.w3.org/2005/08/xml.xsd"/>
  <xs:complexType name="List">
    <xs:sequence/>
  </xs:complexType>
  <xs:complexType name="ReferenceList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="href" type="xs:anyURI" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="CachedRepresentationInfo">
    <xs:sequence>
      <xs:element name="mimeType" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="tool" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="type" type="xs:string" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
    <xs:attribute ref="xml:id" use="required"/>
  </xs:complexType>
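To illustrate how a client might read the CachedRepresentationInfo type, here is a minimal Python sketch that parses a hypothetical instance document. Only the element names, the attribute names and the namespace are taken from the schema fragment; the sample values, the example.org URL and the helper name parse_cached_info are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Namespaces as they appear in the schema fragment.
DASISH_NS = "http://www.dasish.eu/ns/addit"
XML_NS = "http://www.w3.org/XML/1998/namespace"

# Hypothetical instance: all values below are invented for illustration.
SAMPLE = f"""<cachedRepresentationInfo xmlns="{DASISH_NS}"
    href="https://example.org/api/cached/c1" xml:id="c1">
  <mimeType>image/png</mimeType>
  <tool>screenshot tool</tool>
  <type>screenshot</type>
</cachedRepresentationInfo>"""


def parse_cached_info(xml_text):
    """Return the fields of a cachedRepresentationInfo element as a dict."""
    root = ET.fromstring(xml_text)

    def child(name):
        # Child elements are namespace-qualified (elementFormDefault="qualified").
        return root.findtext(f"{{{DASISH_NS}}}{name}")

    return {
        "href": root.get("href"),
        "id": root.get(f"{{{XML_NS}}}id"),
        "mimeType": child("mimeType"),
        "tool": child("tool"),
        "type": child("type"),
    }
```

Calling parse_cached_info(SAMPLE) yields a plain dictionary, which keeps the sketch independent of any particular XML binding library.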
26. In the tables below, all the requests are listed and the corresponding server responses are described.
Principal realm
Table 3: API for resource Principal
GET api/authentication/login: Redirects to the login page if the principal is not logged in, or returns a message otherwise. Return type: String message.
GET api/authentication/principal: Returns the logged-in principal.
GET api/principals/prid: Returns a principal with the given identifier.
GET api/principals/prid/current: Returns true if the prid is logged in. Return type: CurrentPrincipalInfo.
GET api/principals/info?email=user@mail.com: Returns the principal with the given e-mail address. Return type: Principal.
GET api/principals/admin: Returns the string with the name and the e-mail of the DWAN admin.
Annotations: api/annotations
Table 4: API for resource Annotation, Part A
GET api/annotations?link=URI&text=text&access=[read|write]&owner=prid&after=datetime1&before=datetime2: Returns the annotations filtered by the request parameters, as a list of infos of the annotations to which the logged-in principal has read (resp. write) access. Their links contain uri and their bodies contain text. Moreover, these annotations are created between datetime1 and datetime2. If the parameter link is omitted, then the request considers all annotated objects to which the principal has read/write access. The default datetime1 is 01 Jan 1970 00:00. The default datetime2 is today. Return type: AnnotationInfoList.
Ad
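The filtered GET request on api/annotations can be sketched as a small URL builder. This is an illustration under assumptions: the parameter names (link, text, access, owner, after, before) are read from the table as far as it is legible, the base URL is a placeholder, and annotations_query is a hypothetical helper, not part of any DWAN client.

```python
from urllib.parse import urlencode


def annotations_query(base_url, link=None, text=None, access=None,
                      owner=None, after=None, before=None):
    """Build a GET URL for api/annotations from the optional filters.

    Filters left as None are omitted, so the server-side defaults
    described in the table would apply.
    """
    params = {"link": link, "text": text, "access": access,
              "owner": owner, "after": after, "before": before}
    given = {key: value for key, value in params.items() if value is not None}
    url = base_url.rstrip("/") + "/api/annotations"
    return url + ("?" + urlencode(given) if given else "")
```

For example, annotations_query("https://example.org/ds/webannotator", text="Gaudi", access="read") asks for readable annotations whose body contains the given text; the date format expected by the server is not specified here and is left to the caller.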
27. N editorial annotation eae Commenting Lente bibliographic critical res ane Literature Editorial grap so text informal N annotation and stating Notes Manager preferences Commenting bibliographic critical responses NoodleTools Configurational grap ISSP text informal Y annotation and stating preferences 7 bibliographic ae Projects Configurational grap Contextualization text informal N annotation E bibliographic ae i Qigga Configurational grap Contextualization text informal N annotation i A bibliographic i Sente Configurational grap Cataloguing text informal N annotation Commenting EN image critical responses Greenshot Editorial abe resp images informal N annotation and stating preferences SE image E y HyperImage Editorial Linkin images informal N yP 8 annotation 8 8 NewRadial E image ba J text A Configurational 8e Linking A informal N INKE annotation image www dasish eu GA no 283646 31 Skitch UVic Image Markup Tool Juxta MapHub NB Skim iAnnotate Advene Anvil Annotator s Workbench CLAWS Tagger GATE Configurational Editorial Configurational editorial Editorial configurational Editorial Editorial Editorial Editorial Editorial Editorial Editorial www dasish eu image annotation image annotation image annotation syntax semanti c annotation map annotation PDF annotation PDF annotation PDF annotation
28. N connected Identity Federation, s/he can use her/his institution credentials by choosing the institution name from the list of Identity Providers. Otherwise, the user can create a local account for the DWAN back end by filling in and submitting the registration form on the page accessed via the menu DASISH Web Annotator > Settings > Server, and specify the desired server. Choose the default option https://myserver/ds/webannotator for Shibboleth authentication, and choose https://myserver/ds/webannotator/basic for the basic authentication service. The user needs to set the back-end server URL via the DWAN client menu if the intended server differs from the default one. This can be done in the Settings dialogue window (DASISH Web Annotator > Settings > Server), where a user-specified back-end address can be entered. When a user creates an annotation, the client sends it to the server together with a cached representation of the annotated page as it was at the moment of annotation. The user can request a cached representation later, for instance if the client cannot display the annotation because the page has been changed and the fragment cannot be resolved. Please consult the manual for more details. The cached representation is sent as a serialized DOM of the HTML document. For images, only links are sent. The next step in future development would be to zip the HTML, images, CSS and JavaScript for the cached representation. This is done in
29. N would be used creatively for this purpose. This use, however, has several drawbacks. For example, annotations on a single tier cannot overlap each other time-wise, and multiple comments referring to the same period become cumbersome. On the other hand, the DWAN back end is an ideal vehicle to store these comments: it is based on comments which refer to some URL or, even more specifically, to some fragment of the target by means of a fragment identifier. To use this principle, ELAN creates a unique resource identifier for the files it processes, a URN such as urn:nl-mpi-tools-elan-eaf:59d08e6a-5cd9-4aed-8aa4-7074c270e635. This is necessary because ELAN works on files stored locally on a user's computer, which therefore have no universally accessible URL. Once an ELAN file is imported into an archive, however, it is assigned a stable URL and can then be viewed using the ANNEX web tool. Front end for ANNEX ANNEX is an open-source online visualizer for time-aligned annotation files, primarily targeted at EAF (ELAN Annotation Format), just as ELAN is. It provides an ELAN-like web interface where users can visualize and browse through the annotations of a time-aligned annotation file in the same fashion as in ELAN. ANNEX works in a standard Flash-enabled web browser. As is the case with the ELAN front end, ANNEX interaction with the DWAN is being developed in the context of the COLTIME project. Given that ANNEX handles the
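The idea of giving a local file a location-independent identifier can be sketched in a few lines of Python. The sketch assumes that the URN has the form urn:nl-mpi-tools-elan-eaf: followed by a UUID, as in the example identifier; how ELAN itself generates the UUID is not stated in the text, so the use of a random (version 4) UUID and the function names are assumptions.

```python
import uuid

# Prefix taken from the example identifier; the rest of the URN is a UUID.
URN_PREFIX = "urn:nl-mpi-tools-elan-eaf:"


def new_eaf_urn():
    """Create a fresh identifier for a locally stored annotation file."""
    return URN_PREFIX + str(uuid.uuid4())


def is_eaf_urn(value):
    """Check that a string has the assumed prefix followed by a valid UUID."""
    if not value.startswith(URN_PREFIX):
        return False
    try:
        uuid.UUID(value[len(URN_PREFIX):])
    except ValueError:
        return False
    return True
```

A client could use such a URN as the target link of an annotation until the file receives a stable URL in an archive.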
30. Occurs="1" maxOccurs="1"/>
      <xs:element name="permissions" type="dasish:PermissionList" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
    <xs:attribute ref="xml:id" use="required"/>
  </xs:complexType>
  <xs:complexType name="AnnotationInfo">
    <xs:sequence>
      <xs:element name="ownerHref" type="xs:anyURI" minOccurs="1" maxOccurs="1"/>
      <xs:element name="headline" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="lastModified" type="xs:dateTime" minOccurs="1" maxOccurs="1"/>
      <xs:element name="targets" type="dasish:ReferenceList" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
  </xs:complexType>
  <xs:complexType name="AnnotationInfoList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="annotationInfo" type="dasish:AnnotationInfo" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="AnnotationBody">
    <xs:choice>
      <xs:element name="textBody">
        <xs:complexType>
          <xs:sequence>
            <xs:element name="mimeType" type="xs:string" minOccurs="1" maxOccurs="1"/>
            <xs:element name="body" type="xs:string" minOccurs="1"
31. Type name="NotebookInfoList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="notebookInfo" type="dasish:NotebookInfo" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <!-- Envelopes -->
  <xs:simpleType name="AnnotationActionName">
    <xs:restriction base="xs:string">
      <xs:enumeration value="CREATE CACHED REPRESENTATION"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:simpleType name="PermissionActionName">
    <xs:restriction base="xs:string">
      <xs:enumeration value="PROVIDE PRINCIPAL INFO"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="Action">
    <xs:sequence>
      <xs:element name="object" type="xs:anyURI" minOccurs="1" maxOccurs="1"/>
      <xs:element name="message" type="xs:string" minOccurs="0" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
  <xs:complexType name="ActionList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="action" type="dasish:Action" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <!-- response envelope: not a resource, used for all responses on POST/PUT requests -->
  <!-- envelope -->
  <xs:complexType nam
32. Wired Marker but not posted to the back end for now. Footnote: Currently DWAN is connected to the CLARIN Service Provider Federation, http://www.clarin.eu/content/service-provider-federation. Footnote 21: At the moment the default server is https://lux17.mpi.nl/ds/webannotator. The user may set https://lux17.mpi.nl/ds/webannotator/basic as a user-specific server in order to follow the basic authentication procedure. It is possible to annotate an image, but not an image fragment. The mouse pointer must be placed on the image, and the remaining steps are the same as for annotating text. The title and the annotation body are assigned automatically, with the annotation body getting the name of the image file. The title and the body can be edited later. To edit an annotation, select it in the list on the left-hand side of the browser window. Click the right mouse button and select Properties in the pop-up menu. Selecting Properties opens a pop-up form for editing the annotation. The title can be edited in the Brief Overview tab and the annotation body can be edited in the Annotation tab. In the original Wired Marker it is not possible to assign and reassign read, write and none access rights for a particular user and a particular annotation. The DWAN framework, however, assumes dynamic access rights. When a DWAN client creates an annotation, all registered principals except the creator (owner) get read access. The owner ha
33. and Annotate. A principal logs in and sees the lists of annotations that were made earlier by him and his colleagues. These annotations are sorted by their dates or by their headers. The principal finds a web page that he wants to annotate, selects a fragment of the text to annotate (say, by marking it with some color) and attaches a text note to this fragment. The text note should not clutter the main document. With a mouse click, the annotation can be saved in the local client's database as well as in the server database. It should be visible on the web page as visualized by the client tool. Second scenario: Editing and deleting. The principal must be able to edit the text note, to change the header of an annotation, and to give different access rights (read, write, none) to another specific principal. Third scenario: Retrieving cached representations. The principal logs in, sees the list of annotations and selects the one he wants to inspect in the context of the corresponding web page. He clicks on the annotation in the list and tries a few times to reload the page, but the annotation does not appear. The client cannot resolve the annotated fragment, possibly because the page has been updated and the fragment has changed its position or has disappeared completely. The principal requests the front end to retrieve the remote cache and gets the cached representation of the page together with the other annotations made on this page earlier. Indeed it can
34. ator is added to the Firefox menu bar. The source code is written in JavaScript and contains XUL files as well. XUL stands for XML User Interface Language, a user interface markup language developed by Mozilla. XUL is implemented as an XML dialect; it allows graphical user interfaces to be written in a manner similar to web pages. One way to develop Firefox add-ons such as Wired Marker is to use the FoxBeans plug-in for the NetBeans 7.0 IDE. The plug-in adds a new project type, Mozilla Firefox add-on, that can be used for extension development. Another common option is to work with a development setup that uses an extension proxy file locally. In the case of the Wired Marker extension code, the jar-structured chrome.manifest file also needed to be rewritten and adjusted to the local chrome paths. We recommend that developers read https://developer.mozilla.org/en-US/Add-ons/Setting_up_extension_development_environment on how to set up an extension development environment. From the user's point of view, the original Wired Marker extension is a highlighting tool that allows marking fragments of a web document with different colors. The tool, as well as the DWAN client based on it, provides a default finite collection of colors (markers) with which a user can mark fragments of web documents. An annotated fragment can be a text fragment or an image inserted in the document. The descriptions of the marked fragments (annotations) are coll
35. be seen that the page has been updated. It is worth noting that this scenario was a part of the DWAN demo during LREC 2014 (http://lrec2014.lrec-conf.org/en/). The wiki page of Right Sector was used. Right Sector is a bloc of right and extreme-right groups in Ukraine. Due to the highly unstable situation in the country, this page is updated very often. The reader can get the annotations on this wiki page and their cached representations if he has the Wired Marker-based DWAN front end installed. It turned out that, by the time the DWAN developers' team started to work on the client, the Wired Marker Firefox extension was the open-source tool that could cover these scenarios, except that there was no connection with the central database and the annotations made via other clients were not retrievable. However, code inspection gave the impression that this feature could be added. In the next section we give a more detailed comparative analysis of the tools which could have been used as DWAN client prototypes. Footnote: The term principal in general denotes either a user or a group of users. At present, user and principal are synonyms for the DWAN tool. 4 Annotation Tools 4.1 Potential DWAN client prototypes: state of the art on September 2012 Before development of DWAN began, more than 40 available annotation tools had been investigated to see if, and to what extent, they could be used as a starting point f
36. cally analyzed corpus in the form of an XML document Each WebLicht service must be able to use a common interchange format that all the other services can also process CLARIN D s Text Corpus Format TCF serves this purpose It is broadly compatible with existing related interchange formats like Negra Paula or TiiBa D Z Moreover format specific converters allow interchange between them WebLicht can be accessed only with a valid DFN AAI Shibboleth based account or a local Tiibingen account e Zotero allows users to bookmark and save content PDFs images audio and video files snapshots of web pages etc by automatically pulling in metadata stored on websites Users can then search tag and annotate any entry in their library Zotero is primarily available as a Firefox plug in but is now also available as a stand alone version with connectors to other browsers Zotero also allows students to automatically create Works Cited pages by drawing on the sources used in a document 6 2 Functionality mapping The DWAN framework is designed for use with different client tools that can share annotations These tools are listed Table 9 and their usage is explained in more detail in potential use cases described below in section 6 3 The functionality of each tool has been mapped to some categorization proposed in the AHRC report on crowd sourcing in cultural heritage and the humanities written by Dunn and Hedges 2012 28 In this report task types were i
37. 6.2 Functionality mapping 6.3 Potential front ends for DWAN in Social Sciences and Humanities Appendix A: DWAN XML schema Appendix B: DWAN Wired Marker manual 2 Executive Summary The availability of digital archives and other research data via the Internet creates new opportunities for collaboration. Equipped with special software, researchers from different institutions, countries and fields can work together via the Internet. Such collaboration can take the form of annotating the on-line data and sharing these annotations using an annotation infrastructure. As stated in the task 5.6 description, researchers need to be able to store the results of collaborative intellectual work either as an annotation of a single fragment or in the form of typed relations between a number of fragments. The aim of this document is to provide a specification of the framework for annotating web documents developed according to the task 5.6 plan. In this context, an annotation is a remark over one or more fragments of one or more on-line documents. From the technical point of view, the proposed framework consists of one back end constituted of the
38. chitecture. Here one must take into account the variety of annotatable objects, because they have different internal structures and client software must technically handle the internal structure of a document when creating an annotation for it. Therefore, a specific client is to be designed for a specific type of document. For instance, annotating web pages and annotating ELAN files need technically different approaches due to the completely different internal structure of the corresponding annotated documents. Addressing the second goal, we have developed the DWAN annotating tool, which by now consists of the server software with the database and the Wired Marker-based DWAN client. Moreover, specific clients have been designed for ELAN files. The DWAN back end and the developed clients are discussed in detail in sections 5.3 and 5.4, respectively. 3.2 Requirements and user scenarios As stated above, in the DWAN framework it is assumed that possibly multiple clients communicate with a single back end consisting of the server software, which implements access to the database with annotations. Annotations and information about annotated on-line resources (targets) are stored in the database together with cached representations of the targets. A cached representation is a copy (e.g. a screenshot) of a target document. Storing cached representations allows retrieving the copy of an annotated document when the actual web document under the target's URI has been updat
39. Figure 7: Firefox menu to start installation of the DWAN client from file (menu entries shown: Check for Updates, View Recent Updates, Install Add-on From File, Update Add-ons Automatically, Reset All Add-ons to Update Automatically). After installation is completed, DWAN (DASISH Web Annotator) is added to the Firefox menu and, once activated, the DWAN menu will appear on the left sidebar. Account management and logging in In order to use DWAN one needs to log in to the back end. DWAN offers two ways of authentication: A) using a federated login, e.g. Shibboleth, and B) with a local DWAN account that you can create yourself by filling in a form on the web page generated by the DWAN server, where you provide your login, e-mail address and password. Below, both authentication procedures are described in more detail. A) If your institution is part of the DWAN-supported trust federation and listed within the Discovery Service list of home organizations (see Figure 8), you log in with your institution credentials. Choose your institution from the list of home organizations, select it and log in. B) If your institution is not listed on the home organization list, you can create a user account with the following steps: a) go to https://myserver/ds/webannotator/basic; b) click on Register as a non-Shibboleth user; c) fill in the user registration form and submit it; d) go to DASISH Web Annotator > Settings > Server > write this link https
40. Figure 13: Changing access rights for a selected annotation. Troubleshooting Advanced users and developers can examine the communication between the back end and the front end directly by installing Firebug or Tamper Data, which are two other Firefox add-ons. This can be useful in situations where DWAN does not seem to work properly. Because of updates of the DWAN client, Firefox and operating systems, it is sometimes necessary to reinstall the client after a new release. Normally, the first step is to uninstall the current version of the DWAN client, following the standard Firefox procedure of the add-on manager. Follow Tools > Add-ons in the browser menu to start the add-on manager. Within the add-on manager, choose to uninstall the selected extension, i.e. the DWAN client. As the second step, the new version of the DWAN client can be installed as described at the beginning of this manual. However, sometimes the newly installed version will not work. In this case one should inform the administrator and the DWAN developers. To be able to continue working in the meantime, create a new Firefox profile. Within this new profile, download and start the new version of the DWAN client as usual. How to make a new profile and start it is explained in detail at https://support.mozilla.org/en-US/kb/profile-manager-create-and-remove-firefox-profiles. Alternatively, on Mac OS one can create a p
41. dentified as the following mechanical configurational editorial synthetic investigative and creative Most annotation tools fall into the categories of the Configurational or Editorial task types a task is an activity that a user undertakes in order to create process or modify a digital asset i e geospatial text numerical or statistical information sound image video ephemera and intangible cultural heritage The Configurational type covers tasks that involve identifying structural patterns or configurations in information rather Er http www ahrc ac uk Funding Opportunities Research funding Connected Communities Scoping studies and reviews Documents Crowd 20Sourcing 20in 20the 20Humanities pdf www dasish eu GA no 283646 30 than processing individual pieces of information Some such tasks will require a predisposition for working with quantitative data The Editorial type involves modifying or improving an existing asset A process is a sequence of tasks through which an output is produced by operating on an asset Moreover a tool is considered informal if it has pre defined entities which can be added as annotations and formal if it does not Table 9 Tools that can be used as DWAN front ends asset formal Name task type task sub type process type type informal collaborative platform PA AA AA Se A en al Configurational bibliographic ere Bookends ns grap Contextualization text informal
42. ds a new annotation by picking POST api annotations up its XML serialization from the request body Envelope AnnotationResponseBody www dasish eu GA no 283646 17 In the GET request for the future we may add a namespace parameter ns It may be used to make queries on XPath for xml annotation bodies For instance the following query api annotations ns rdf http 3A 2F 2Fwww w3 org 2F1999 2F02 2F22 rdf syntax ns 23 amp ns owl http 3 A 2F 2F www w3 0rg 2F2002 2F07 2Fowl 23 amp xpath owl sameAs rdf resourc e example 2 is used to find an annotation with the body lt rdf RDF xmins rdf http Awww w3 org 1999 02 22 rdf syntax ns xmins owl http www w3 org 2002 07 owl gt lt owl sameAs rdf about example 1 rdf resource example 2 gt lt rdf RDF gt api annotations aid The table below describes requests in which the logged in principal has authorized access to aid Authorized access means that the principal has read access for GET methods and write access for PUT body methods Any logged in principal can POST an annotation To change permissions of the annotation the principal must be the owner of the annotation If the principal tries to perform a request for which s he does not have privileges the status 403 Forbidden is returned www dasish eu GA no 283646 18 Table 5 API for resource Annotation Part B Return pe GET api annotations aid oo the annotation that has this GET api annotations aid
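The authorization rules stated above (read access for GET, write access for PUT on the body, any logged-in principal may POST, only the owner may change permissions, otherwise 403 Forbidden) can be summarized in a short Python sketch. This is not the server's implementation: the function name, the argument names and the numeric ordering of access levels are assumptions made for illustration, and methods the text does not cover are rejected rather than guessed.

```python
from http import HTTPStatus

# Assumed ordering of access modes: none < read < write.
ACCESS_LEVELS = {"none": 0, "read": 1, "write": 2}


def check_request(method, access, is_owner=False, changes_permissions=False):
    """Return the status a server following the stated rules might send.

    method: HTTP method of the request on api/annotations[/aid...]
    access: effective access of the logged-in principal to the annotation
    is_owner: whether the principal owns the annotation
    changes_permissions: whether the request modifies the permission list
    """
    level = ACCESS_LEVELS[access]
    if method == "POST":
        allowed = True  # any logged-in principal can POST an annotation
    elif changes_permissions:
        allowed = is_owner  # only the owner may change permissions
    elif method == "GET":
        allowed = level >= ACCESS_LEVELS["read"]
    elif method == "PUT":
        allowed = level >= ACCESS_LEVELS["write"]
    else:
        raise ValueError(f"method not covered by this sketch: {method}")
    return HTTPStatus.OK if allowed else HTTPStatus.FORBIDDEN
```

Keeping the decision in one pure function makes it straightforward to test every combination of method and access mode without a running server.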
43. e="ResponseBody">
    <xs:sequence>
      <xs:choice>
        <xs:element name="annotation" type="dasish:Annotation"/>
        <xs:element name="permissions" type="dasish:PermissionList"/>
        <xs:element name="notebook" type="dasish:Notebook"/>
      </xs:choice>
      <xs:element name="actionList" type="dasish:ActionList" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
  </xs:complexType>
  <!-- ELEMENTS -->
  <xs:element name="action" type="dasish:Action"/>
  <xs:element name="actionList" type="dasish:ActionList"/>
  <xs:element name="annotation" type="dasish:Annotation"/>
  <xs:element name="annotationBody" type="dasish:AnnotationBody"/>
  <xs:element name="annotationInfo" type="dasish:AnnotationInfo"/>
  <xs:element name="annotationInfoList" type="dasish:AnnotationInfoList"/>
  <xs:element name="annotationList" type="dasish:ReferenceList"/>
  <xs:element name="cachedRepresentationInfo" type="dasish:CachedRepresentationInfo"/>
  <xs:element name="cachedRepresentationList" type="dasish:ReferenceList"/>
  <xs:element name="cachedRepresentationFragment" type="dasish:CachedRepresentationFragment"/>
  <xs:element name="cachedRepresentationFragmentList" type="dasish:CachedRepresentationFragmentList"/>
  <xs:element name="list" type="dasish:List"/>
  <xs:element name="notebook" type="dasish:Notebook"/>
  <xs:element
44. e DWAN client sends to the server the entire DOM of the annotated page when an annotation is created The relation between the target and its cached representation should be completed by a fragment descriptor pointing to the position of the annotated fragment in the cached representation For instance for a screenshot it may be an x y position of a left upper corner of the annotated fragment and the size of a rectangle Table 7 API for resource Cached Representation Return Gal ype GET Returns the meta information of cid if y Ego CachedRepresentationInfo api cached cid metadata lit exists GET Returns the file stream which is the cached representation with cid if it exists Stream it is up to the client to interpret it correctly Returns the image file which is the cached representation with cid if it exists www dasish eu GA no 283646 21 Notebooks api notebooks Table 8 API for resource Notebook Return Com type Returns notebook infos for the GET api notebooks notebooks accessible to the logged in NotebookInfoList principal Returns the list of all notebook GET api notebooks owned references owned by the logged in ReferenceList principal Returns the list of prid s who allowed to read the annotations from the ReferenceList notebook GET Returns the list of prid that can add ReferenceList api notebooks nid writers annotations to the notebook GET Returns all metadata about a specified Noteb
45. e DWAN framework, which is a solution for collaborative annotation. An important feature of the DWAN framework is that created content and sources can be stored in a shared database. In turn, the DWAN Wired Marker client allows a user to send and retrieve cached copies of the annotated resources. Individuals as well as groups of researchers from different institutions, countries or backgrounds can all benefit from using the DWAN framework. Research institutes or groups of researchers can develop their own clients for their particular use and purposes, and as such they will have access to the shared DWAN database. Download and installation The DWAN client can be installed or downloaded from the GitHub repository https://github.com/DASISH/dwan-client-wiredmarker/releases. One can install it by navigating to this web page using the Firefox web browser, clicking on the green button dasishwebannotator.xpi and following the standard instructions issued by the browser, such as allowing the site to install software. A second option is to start Firefox and drag and drop the xpi file onto the Firefox window. Yet another option is to 1) download the xpi file to some directory on the user's computer, 2) run the Firefox add-on manager, and 3) follow the Install Add-on From File procedure by clicking the corresponding menu (see Figure 7). Footnote: https://github.com/DASISH/dwan-documentation
46. e png then the corresponding rdf type must be dctypes Image E http dublincore org documents 2010 10 1 1 demi type vocabulary www dasish eu GA no 283646 11 oa SpecificResource oa TextQuote Selector rdf type Nativity Facade edit oa hasSelector oa prefix http dasish eu targets xxx oa hasState oa hasSource http en wikipedia org wiki Sagrada_ q a Version dc forma text html detypes Text dc publishe op cachedSource screenshot fo dc description lt server gt cached yyy stream dc format lt server gt cached zzz stream image png oa cachedSource dctypes Image Figure 4 Example of an OA representation Target A principal is an agent and for agents Open Annotation recommends to use foaf namespace see http xmlns com foaf spec This data model is designed for social networks and in principle suits DASISH schema for a user and permission lists There is one little technical inconvenience foaf agents do not have a property that can be used to define permission types reader writer directly For now permissions are represented via property foaf topic_interest For an example see Figure J www dasish eu GA no 283646 12 foaf Group ai rdf type lt server gt annotations aaa permissions foaf member foaf member lt server gt principals ppp2 foaf name ac ccs Anton
47. e topic there is an important requirement to be able to share selected parts of the original resource via email Twitter and Facebook 6 Video annotation Review of tools available Advene Annotator s Workbench Annotorius Anvil Atlas ti Hyperlmage Mediathread Project Pad Rehersal Assistant VideoANT Video annotation is probably not the most common form of annotation currently carried out by humanities scholars however the literature review shows that several tools that support such activity are in fact used within the HSS communities This reflects the need to both organize video collections with annotations and to link comments notes with them Scenario User has downloaded a few videos from www youtube com and made a collection themed around the current use of digital tools among Social Sciences and Humanities scholars He opens such collection to other users or collaborators Both the original user and their collaborator can annotate the videos and share such annotations in a research environment The collection could include keynote speeches university lectures conference and seminar papers as well as software tutorials The user then wants to share selected parts of the original resource via social media add personal comments and then share such comments via social media as well This scenario is applicable to scholars and universities but also potentially to software engineers and programmers Formal informal both formal and i
48. ected in folders according to their colors. (Footnotes: BITS Co., Ltd., http://www.bits.cc. XPI stands for Cross-Platform Installer file extension.) These folders are accessible via the standard folder menu of the user interface. If the default collection of markers does not suffice, the user can create his own marker by picking a fresh color from a rich palette provided by the DWAN client on the user's demand. For instance, one can create a purple marker (purple is not a default DWAN marker) and use this color to annotate fragments about the family of Picasso on various web pages. All these annotations are then contained in a purple folder. An annotated fragment is preserved not only in the local client database that is connected to the extension, but is also sent by the DWAN client as an XML file to the back-end database, where it is stored. The DASISH developers have implemented synchronization of the local and the back-end database. One aspect of this synchronization is that one of the provided default markers, the light yellow one, plays a special role. At present it corresponds to the annotations created by other client instances. The client retrieves these annotations from the server's database and places them in the folder called incoming. In a sense this is inconvenient, because the user cannot see the colors of annotations made by others. It is due to the fact that the original Wired Marker was n
49. ed so that locating the annotation in it becomes difficult or even impossible. This may happen when the corresponding fragment has been significantly changed or has disappeared. The client and server must understand each other and therefore follow some uniform rules. In a nutshell, there are two such rules. The first: exchange data by sending HTTP(S) requests from a given finite collection of requests that the server understands. The second: the content of requests and responses must obey the DWAN XML schema, which is part of the server-side software. The DWAN XML schema mirrors a data model (see section 5.2) that has been designed to represent the main data classes (annotation, target, principal, cached representation and notebook) and the relations between these classes. As a proof of concept for the architecture design and its technical approach, we needed to develop not only back-end software but also one or more client tools that work with it. Moreover, such clients must be usable by wide communities of researchers. Before developing a client, we first needed to determine which user scenarios it should cover and, second, to investigate whether a suitable tool already exists that can be used as a DWAN client prototype. If it does not fully fit into the infrastructure, the tool must be adjustable to fit the architecture and to cover the user scenarios. The simplest and most obvious user scenario can be called Login
50. [Table fragment, partially recoverable: rows for Annotator (Editorial; Commenting and stating preferences; informal), Annotorious (Editorial; Commenting and stating preferences; informal) and Atlas.ti (Synthetic; Contextualization; informal); the asset types listed include text, image and video.] 6.3 Potential front ends for DWAN in Social Sciences and Humanities. Following the categorization in the section above (i.e. task type, process type, asset type), nineteen specific cases of the uses HSS researchers make of annotation, not necessarily covering just current DWAN functionality, can be identified and grouped under six topics: bibliography, image, web page, syntax/semantics, wiki, video. To describe the sets of functionalities needed to satisfy expectations for an annotating tool for each of these topics, we first introduce the following notation, where UC abbreviates Use Case. UC 1: Highlight text. UC 2: Add comments in the form of scribbled notes (text to text). UC 3: Add comments in the form of scribbled notes (text to image). UC 4: Modify text: add information to text within the text. UC 5: Modify text: delete information within the text. UC 6: Tag an image with keywords. UC 7: Save own annotations. UC 8: Share own annotations via email, Twitter and Facebook. UC 9: Share selected parts of the original resource via email, Twitter and Facebook. UC 10: Collaborative annotations (different users). UC 11: Track versions of annotations.
51. ext but it is rather a wider notion, an item. Under this item one can collect texts, images or their fragments representing Karl Marx on web pages. ReframeIt has appeared as a Firefox add-on for commenting on web pages and sharing the comments via Facebook, Twitter, Blogger, FriendFeed, WordPress, RSS, HTML, e-mails. 5 DASISH Web Annotator (DWAN). 5.1 Framework architecture. The DWAN design assumes multiple clients working together with a single back end consisting of a database and a Representational State Transfer (REST) web service that is implemented in Java. It allows annotating any web-accessible content, linking data, creating relations or providing feedback. Its novelty is that the created content and the targeted (annotated) documents are stored in a database that can be shared with other tools in the framework (see Figure 2). At the moment the storage for annotations and related resources is provided by the DASISH partner TLA/MPI, which currently runs the back end. DWAN is also especially meant to cater for domain-specific tools, such as those within linguistics, that through their use of linguistic data formats can annotate specific linguistic items such as lexical items, annotation tags, etc. Tools for other domains can be integrated without problems: the data model underlying the DWAN framework is discipline-agnostic. [Footnote: The Language Archive, Max Planck Institute for Psycholinguistics, http://tla.mpi.nl] more generic more d
52. he page has been already annotated, a full list of annotations will appear on the bottom left side of the browser window. The list can be ordered by annotation title or date. Please note that it is not possible to see the author of the annotations. To see annotations from other users, click on the annotation you want to see in the full list. It will appear on the webpage marked in light yellow (see Figure 10). To view your own annotations, navigate to the Marker folder and click on the color used to make the annotation you are interested in. You will see the list of all annotations marked with this color. Select the one you need. If an annotation does not appear after clicking on it, even after refreshing the page, it means that the DWAN client cannot resolve the annotated fragment. The most probable reason is that the webpage has changed since it was annotated. However, in DWAN a user can see the annotations even if the webpage has changed. This is done by requesting cached representations of the corresponding annotated pages. To do this, point the mouse at the annotation in question and right-click. In the pop-up menu select Cached representations and click open remote cache in the sub-menu. You will get the cached representation of the page, which almost always looks like the original page, and you will find the annotation you are interested in.
53. ius Goosius [Figure labels: foaf:topic_interest, foaf:mbox, rdf:type owl:Thing] Figure 5: Example of an OA representation: Principal. An annotation body in DASISH can be any correct XML or a text. A generic way to present such bodies in Open Annotation is to consider a body, which typically has attributes and elements, as an instance of oa:Composite. Any element and any attribute of the body becomes an oa:item of the body. If an element has sub-elements, it is an instance of oa:Composite as well, and so on. An attribute or an element with no sub-elements has one of the dc types and one of the dc formats, and possibly additional relevant properties such as cnt:chars for text values. 5.3 DWAN Back-end. Architecture in a nutshell: The core of the back end is the Postgres database [Footnote: http://www.postgresql.org], where all annotations and related structures are stored together with information about principals. The task of the back-end software is to connect a client with the database. The back-end software accepts a request from the client tool, validates it and translates it into database queries. In the other direction, the back-end software translates the database content and sends it to the client. The back-end software is a multi-layered project. Its outermost layer is generated by the Jersey Framework, which is responsible for connecting database managing software with
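The mapping of an XML annotation body onto oa:Composite and oa:item described above can be sketched as a small recursive walk. This is only an illustration in Python, not the DWAN implementation (the back end is written in Java): the function name, the dictionary layout and the "@" prefix for attributes are hypothetical, and treating an element that has attributes but no sub-elements as a composite is an assumption of this sketch. The dc type and dc format of leaf items are omitted.

```python
import xml.etree.ElementTree as ET


def to_oa(element):
    """Map an XML element to a nested dict mimicking oa:Composite / oa:item.

    Assumption of this sketch: an element is a composite when it has
    sub-elements or attributes; otherwise it is a leaf carrying cnt:chars.
    Text mixed in between the children of a composite is ignored.
    """
    children = list(element)
    if not children and not element.attrib:
        # Leaf element: only its character content is kept.
        return {"name": element.tag, "kind": "leaf",
                "cnt:chars": (element.text or "").strip()}
    # Attributes become leaf items, marked with a hypothetical "@" prefix.
    items = [{"name": "@" + key, "kind": "leaf", "cnt:chars": value}
             for key, value in element.attrib.items()]
    # Sub-elements are mapped recursively and may be composites themselves.
    items.extend(to_oa(child) for child in children)
    return {"name": element.tag, "kind": "oa:Composite", "oa:item": items}


body = ET.fromstring(
    '<note lang="en"><title>Gaudi</title><tags><tag>church</tag></tags></note>'
)
composite = to_oa(body)
```

Here the root element note becomes a composite with three items: the attribute lang, the leaf element title and the nested composite tags.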
54. l can handle multiple images in one file. Amongst the tool's disadvantages is that editing done to Image Markup's XML files in an external editor may not be preserved. • Virtual Lightbox for Museums and Archives (VLMA) is an educational tool for collecting and reusing, in a structured fashion, the online contents of museums and archives with visual components. With VLMA you can browse and search collections, construct personal collections, export these collections to XML or Impress presentation format, annotate them and share your collections with other VLMA users. • WebLicht is a service-oriented architecture (SOA) for creating annotated text corpora. Development started in October 2008 as part of CLARIN-D's predecessor project D-SPIN, and further development and enhancement of WebLicht is an important goal of CLARIN-D, aiming to make WebLicht a fully functional virtual research environment. WebLicht employs chains of RESTful web services. Each web service encapsulates a certain linguistic tool. For example, users can access as a web service the query component of a corpus, a format converter, a tokenizer, a tagger or a parser. Translation between the input format specific to some tool and the WebLicht information interchange format TCF (see below) is performed by a web service wrapper. Each web service adds at least one layer of annotation encompassing the work of the tool encapsulated by that service. The output of a chain of WebLicht services is an automati
55. les, each of which stores a corresponding type of resource. A column in a table represents an attribute in the corresponding resource class. For instance, any resource class has an attribute id, an identifier of type xml:id. This identifier is a part of the URI through which a DWAN client accesses an instance of the resource. The URI has the form <service uri>/<resource>/<id>, e.g. https://dasish.mpi.nl/api/annotations/e3c834f0-34c4-11e3-aa6e-0800200c9a66. Each of the five resource tables has a column external id that stores public identifiers. From the programming point of view, an external identifier is a UUID string generated by the server when a resource (e.g. an annotation) is added to the database. Annotation bodies are stored in the table annotation, in the column body. Furthermore, there are a number of join tables representing the relations between the resources, which are described as relations between the resource classes. These relations create a hierarchy between the resources. Indeed, any of the relations can be abstracted to refers, so that a principal refers to an annotation or a notebook, an annotation refers to a target, and a target refers to a cached representation. As one can see, cached representations have the lowest position in this hierarchy. This hierarchy induces a cascading mechanism of adding and deleting resources in the database. For instance, removal of an annotation from the database triggers the removal
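The URI pattern described above (service URI, resource name, then a server-generated UUID) can be illustrated with a short Python sketch. The helper names are hypothetical, the host is the one from the example in the text with its punctuation restored (an assumption), and the choice of a random version-4 UUID is also an assumption, since the text only says that the server generates a UUID string.

```python
import uuid

# Host taken from the example in the text; its exact punctuation is assumed.
SERVICE_URI = "https://dasish.mpi.nl/api"


def new_external_id():
    """Generate a public identifier; the UUID version is an assumption."""
    return str(uuid.uuid4())


def resource_uri(service_uri, resource, external_id):
    """Build <service uri>/<resource>/<id> after checking that the id is a UUID."""
    uuid.UUID(external_id)  # raises ValueError for a malformed identifier
    return f"{service_uri.rstrip('/')}/{resource}/{external_id}"


def split_resource_uri(uri):
    """Split a resource URI back into (service uri, resource, id)."""
    head, external_id = uri.rsplit("/", 1)
    service_uri, resource = head.rsplit("/", 1)
    return service_uri, resource, external_id


example = resource_uri(SERVICE_URI, "annotations",
                       "e3c834f0-34c4-11e3-aa6e-0800200c9a66")
```

Validating the identifier before building the URI keeps malformed identifiers from reaching the server; whether the DWAN clients do this is not stated in the text.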
56. lines. It is by far the most common form of annotation currently carried out by humanities scholars and supported by the current tools offering. The tools above therefore support a range of configurational and editorial tasks. Scenario: The user, a Latinist and historian, is creating a digital critical edition of Marcus Tullius Cicero's judicial speeches. They have downloaded the fifty-two surviving examples from the Perseus Digital Library (http://www.perseus.tufts.edu/hopper) and stored them locally. Formal/informal: Informal annotations are critical here to add context: historical allusions, biographical notes on persons mentioned and places referred to. Formal annotation methods may also be required, especially in support of automated parsing and natural language processing (NLP), although much of this information will already be available as TEI XML markup in the Perseus documents. Necessary functionalities: A primary function is to be able to Highlight text that is relevant to particular arguments made by Cicero, important passages and references to important exchanges. It will also be necessary to highlight quotations which have significance in other contexts. The user will also wish to highlight important general entities (see below). Once the text is highlighted, the user will wish to Add comments in the form of scribbled notes (text to text). As well as free text, they will wish to construct annotations using their own vocabulary
57. llection from www.flickr.com/commons. It is themed around European cultural heritage in the nineteenth and twentieth centuries, containing primarily images of objects from museums, but also images documenting specific events. These could include major political events, such as those connected to WW1, or scenes from everyday life and objects (see the example from the University of Reading's Museum of English Rural Life). This scenario is applicable to scholars but also potentially to museum and collections curators. Formal/informal: Mostly the functionalities required are informal. The main need is to support the user in providing commentaries on individual images and in selecting particular parts of particular images for specific commentary. However, the user may also wish to construct formal lists or taxonomies of the various aspects depicted. These could include objects (e.g. teapots, statues, vases, weapons, vehicles), time periods and locations. Asset: the assets are images stored either locally on the user's computer or in a private cloud space. Necessary functions: The primary function needed is to Add comments in the form of scribbled notes (text to image). The user will wish to tag either entire images or selected parts. In the example below, they will wish to define a particular part of the image and associate tags and/or full-text comments with it. In the example given this might include steam tractor
58. maxOccurs="1"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
    <xs:element name="xmlBody">
      <xs:complexType>
        <xs:sequence>
          <xs:element name="mimeType" type="xs:string" minOccurs="1" maxOccurs="1"/>
          <xs:any minOccurs="1" maxOccurs="1" processContents="skip"/>
        </xs:sequence>
      </xs:complexType>
    </xs:element>
  </xs:choice>
</xs:complexType>
<xs:complexType name="Notebook">
  <xs:sequence>
    <xs:element name="ownerRef" type="xs:anyURI" minOccurs="1" maxOccurs="1"/>
    <xs:element name="title" type="xs:string" minOccurs="1" maxOccurs="1"/>
    <xs:element name="lastModified" type="xs:dateTime" minOccurs="1" maxOccurs="1"/>
    <xs:element name="annotations" type="dasish:ReferenceList" minOccurs="1" maxOccurs="1"/>
    <xs:element name="permissions" type="dasish:PermissionList" minOccurs="1" maxOccurs="1"/>
  </xs:sequence>
  <xs:attribute name="href" type="xs:anyURI" use="required"/>
  <xs:attribute ref="xml:id" use="required"/>
</xs:complexType>
<xs:complexType name="NotebookInfo">
  <xs:sequence>
    <xs:element name="ownerHref" type="xs:anyURI" minOccurs="1" maxOccurs="1"/>
    <xs:element name="title" type="xs:string" minOccurs="1" maxOccurs="1"/>
  </xs:sequence>
  <xs:attribute name="href" type="xs:anyURI" use="required"/>
</xs:complexType>
<xs:complex
59. ment to capture and annotate changes made to the wiki pages over time. Both available tools have primarily editorial functions. Scenario: The user is conducting a project to capture the reception of public monuments, including the Parthenon in Athens. They will therefore need to annotate not only the main page of the wiki but also the Talk/history of the page, and are likely later on to have edits or additions to make to the Wikipedia page itself. The project is therefore about using annotation to capture discussion about a contentious page and Formal/informal: only informal annotations are relevant here. Assets: The assets involved are text and images. Necessary functionalities: The user will need to Save their own annotations in the form of Add comments in the form of scribbled notes (text to text and text to image). These are stored in a shared collaborative space. The annotations will need to contain metadata detailing the wiki URL and the part of the page being referred to. It will be necessary to specify start and end points, allowing the user to Highlight text and Highlight images. For this scenario it will not be necessary to highlight parts of images. Each annotation will have to be able to point to multiple parts of the same wiki page or to multiple web pages. In a shared collaborative environment it will be necessary to Track versions of annotations, including responsibility for different versions. To gauge discussion on th
60. name="notebookInfo" type="dasish:NotebookInfo"/>
  <xs:element name="notebookInfoList" type="dasish:NotebookInfoList"/>
  <xs:element name="notebookList" type="dasish:ReferenceList"/>
  <xs:element name="permissionList" type="dasish:PermissionList"/>
  <xs:element name="responseBody" type="dasish:ResponseBody"/>
  <xs:element name="target" type="dasish:Target"/>
  <xs:element name="targetInfo" type="dasish:TargetInfo"/>
  <xs:element name="targetInfoList" type="dasish:TargetInfoList"/>
  <xs:element name="targetList" type="dasish:ReferenceList"/>
  <xs:element name="principal" type="dasish:Principal"/>
  <xs:element name="currentPrincipalInfo" type="dasish:CurrentPrincipalInfo"/>
  <xs:element name="currentPrincipalInfoList" type="dasish:CurrentPrincipalInfoList"/>
  <xs:element name="principalList" type="dasish:ReferenceList"/>
  <xs:element name="referenceList" type="dasish:ReferenceList"/>
  <xs:element name="access" type="dasish:Access"/>
  <xs:element name="annotationActionName" type="dasish:AnnotationActionName"/>
  <xs:element name="permission" type="dasish:Permission"/>
  <xs:element name="permissionActionName" type="dasish:PermissionActionName"/>
</xs:schema>
Appendix B: DWAN Wired Marker manual. DISCLAIMER: As DWAN tools are under continuous development at the time of writing this report, the manual may not be c
61. nformal annotations can be relevant here. Assets: the assets are primarily videos but may involve text and images as well. Necessary functionalities: To gauge discussion on the topic, there is an important requirement to be able to share selected parts of the original resource via email, Twitter and Facebook. The user may wish to Add comments. The user will want to save their own comments in a collaborative environment. Collaborators will have the right to view the user's annotations as well as to add their own. In a shared collaborative environment it will be necessary to Track versions of annotations, including responsibility for different versions. The user will need to be able to embed links to relevant texts, bibliography, video and image media. Appendix A: XML schema. There are 5 sorts of resources in DASISH: CachedRepresentation, Target, Principal, Annotation, Notebook. Each of them has a corresponding xsd type in the schema, with one exception: there is no type with the name CachedRepresentation, because a cached representation is a pure resource, like an image or a text file, that does not contain any meta-information about itself. The metadata of a cached representation are defined via an instance of the CachedRepresentationInfo type. Each of the resource types has an obligatory attribute id, which contains the DASISH identifier pointing to the location of
62. nnotation, pick it in the list, right-click it and select Properties in the menu. The editing form will appear, and by selecting the tabs Brief Overview or Annotation one can edit the title and the text body (see Figure 12). Note that only the creator of the annotation or a user with write access can update the annotation. [Screenshot: Firefox showing the Dutch Wikipedia page on the Sagrada Família, with the DASISH Web Annotator sidebar (My Annotations, Incoming, local folder) and context menu.]
63. nt of an arbitrary web document, mark it with a specific color and add a text comment (an annotation body). Aggregating Wired Marker annotations in bundles is implemented by a collection of pre-defined folders. Each folder contains the annotations made with a certain marker color. For instance, the red folder collects all annotations made with the red marker. It was possible to extend the Wired Marker code so that the extension could communicate with the server to retrieve an annotation from the database or send a created annotation to the back end. Another tool, called Pundit, can do more than Wired Marker, but unfortunately it was not yet available at the time when the DASISH task 5.6 team had to make a decision, and it got an Open Source license only after the development of DWAN had already started. 4.2 Developments after September 2012. Pundit allows annotating images and their fragments. Moreover, the tool allows aggregating annotations into notebooks, which can be viewed as a generalized version of the colored-folders aggregation facility of Wired Marker. Pundit has a feature which in some cases may be considered an inconvenience, and which therefore counts in favour of Wired Marker, which does not have it: while creating an annotation, a user must think in terms of a triple (Object, Predicate, Subject), for instance Karl Marx (subject) talks about (predicate) Kapital (object). Karl Marx denotes not only a piece of t
64. of its targets, except for the ones to which other annotations still refer. In turn, removal of the targets triggers removal of all the corresponding cached representations, unless some other targets refer to the cached representation under consideration. Database Access Objects (DAOs) are used to programmatically access the data in the database. The DAO mechanism allows forming and calling SQL database commands like SELECT, UPDATE, INSERT and DELETE from Java methods. Methods for basic manipulations over resources (retrieving, updating, adding and deleting) are defined in the corresponding DAO Java interface. For instance, the AnnotationDao.java interface lists the signatures of all necessary basic operations over the table annotation and the join tables annotation targets and annotations principals permissions. By a basic operation we mean an operation which demands a single SQL statement. The interfaces are implemented using Spring DAO, which utilizes a JDBC connection to access the data store. For instance, the add-annotation method is implemented in the JdbcAnnotationDao.java class as a single Java method. As one expects, this method forms and calls an INSERT command for the table annotation. Due to the presence of join tables, there must be a mechanism that takes care of correctly sequencing basic operations. For instance, consider a complete procedure of deleting an annotation. The annotation's internal database identifier occurs in three join tables anno
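The cascading removal described above (an annotation takes its targets with it unless other annotations still refer to them, and targets take their cached representations with them under the same condition) can be sketched with an in-memory SQLite database. This is a Python stand-in for illustration only: the actual back end uses Java, Spring DAO and Postgres, and all table and column names below are hypothetical. Each statement corresponds to one "basic operation" in the sense of the text, and the function shows one possible sequencing.

```python
import sqlite3

SCHEMA = """
CREATE TABLE annotation (id INTEGER PRIMARY KEY);
CREATE TABLE target (id INTEGER PRIMARY KEY);
CREATE TABLE cached (id INTEGER PRIMARY KEY);
CREATE TABLE annotation_target (annotation_id INTEGER, target_id INTEGER);
CREATE TABLE target_cached (target_id INTEGER, cached_id INTEGER);
"""


def make_db():
    """Create an empty in-memory database with the hypothetical schema."""
    db = sqlite3.connect(":memory:")
    db.executescript(SCHEMA)
    return db


def count(db, sql, key):
    """Run a COUNT(*) query with one parameter and return the number."""
    return db.execute(sql, (key,)).fetchone()[0]


def delete_annotation(db, annotation_id):
    """Delete an annotation, then any targets and cached representations
    that nothing else refers to any more."""
    targets = [row[0] for row in db.execute(
        "SELECT target_id FROM annotation_target WHERE annotation_id = ?",
        (annotation_id,))]
    # Join-table rows go first, then the annotation itself.
    db.execute("DELETE FROM annotation_target WHERE annotation_id = ?",
               (annotation_id,))
    db.execute("DELETE FROM annotation WHERE id = ?", (annotation_id,))
    for target_id in targets:
        if count(db, "SELECT COUNT(*) FROM annotation_target "
                     "WHERE target_id = ?", target_id):
            continue  # another annotation still refers to this target
        cached_ids = [row[0] for row in db.execute(
            "SELECT cached_id FROM target_cached WHERE target_id = ?",
            (target_id,))]
        db.execute("DELETE FROM target_cached WHERE target_id = ?",
                   (target_id,))
        db.execute("DELETE FROM target WHERE id = ?", (target_id,))
        for cached_id in cached_ids:
            if count(db, "SELECT COUNT(*) FROM target_cached "
                         "WHERE cached_id = ?", cached_id) == 0:
                db.execute("DELETE FROM cached WHERE id = ?", (cached_id,))
    db.commit()
```

Deleting the join-table rows before the resource rows mirrors the sequencing concern raised in the text; a production database would more likely enforce it with foreign keys and a transaction.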
65. omain specific [Figure labels: Web-based tool; Desktop tool; ANNEX; CMDI; ELAN; Browser extension; DWAN; DWAN API] Figure 2: The DWAN Framework in more detail. 5.2 DWAN's Data Model and its connection to the Open Annotation Model. The Annotation class is the core of the model (see Figure 3). The relations Annotation-Target, Target-Source and Target-Cached Representation closely follow the Open Annotation (OA) standard. The Open Annotation Core Data Model specifies an interoperable framework for creating associations between related resources (annotations), using a methodology that conforms to the Architecture of the World Wide Web (W3C). In OA, an Annotation is considered to be a set of connected resources, typically including a body and a target, where the body is somehow about the target. The full model supports additional functionality, enabling semantic annotations, embedding content, selecting segments of resources, choosing the appropriate representation of a resource and providing styling hints for consuming clients. An annotation in DWAN, i.e. an object of the class Annotation, is a structure that contains the necessary information about the user's annotation. In particular, it contains the annotation's identifier, the reference to the owner and the time of creation. An owner is either the principal who has created the annotation or a principal to whom the ownership has been assigned. Besides the owner, an annotation has
66. ompletely up to date. The final manual, with a clear description of the framework together with instructions on how to use it, will be published at the end of the project in the DASISH DWAN GitHub location. DWAN Wired Marker client as a part of the DWAN framework. The DWAN client is a Firefox extension that enables a user to create free-text annotations on fragments of webpage content. Moreover, users can share their annotations with other users by granting reader or writer permissions. The DWAN client has been developed on the basis of the existing Wired Marker web annotation Firefox plugin by adjusting it to collaborative annotation needs. The DWAN Wired Marker version is implemented by adding program modules that send requests to, and receive responses from, the common server database where the annotations of all users are stored. The database and the server software that implements access to the database constitute the back end with which the DWAN Wired Marker client communicates. The DWAN Wired Marker instances, and also other DWAN-compatible clients, have access to the database via a uniform service interface available over HTTP. In order to communicate with the back end, clients must satisfy certain requirements: first of all, they should be able to send and receive requests in XML format according to the DWAN schema; in addition, such requests should satisfy DWAN's API patterns. The DWAN back end and the DWAN-compatible clients constitute th
67. ook api/notebooks/nid/metadata notebook nid. GET api/notebooks/nid?maximumAnnotations=limit&startAnnotation=offset&orderby=orderby&orderingMode=1|0: returns the list of all annotations (aids) contained within a Notebook, with related metadata; return type ReferenceList. Parameters: nid; optional maximumAnnotations specifies the maximum number of annotations to retrieve (default 1, all annotations); optional startAnnotation specifies the starting point from which the annotations will be retrieved (default 1, start from the first annotation); optional orderby specifies the RDF property used to order the annotations (default dc:created); optional orderingMode specifies whether the results should be sorted in descending order (desc 1) or ascending order (desc 0) (default 0). GET api/notebooks/nid/readers. PUT notebooks/nid: modifies the metadata of nid; the new notebook's name must be sent in the request's body; return type Envelope. Adds an annotation aid to the list of [...]; return type Envelope. POST: creates a new notebook and returns the nid of the created Notebook in NotebookResponseBody; return type Envelope NotebookResponseBody. [Footnote 17: The feature is implemented; however, testing is not completed and it is not used in the current DWAN front end.] Table header: Resource; Return xml type (response's payload). DELETE api/notebooks/nid: deletes nid. Annotations stay; they just lose their connection to nid. P
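As an illustration of the optional query parameters listed above, the following Python sketch assembles the GET URL for the annotations of a notebook. The function name and the example service URI are hypothetical, the slash-separated path layout is an assumption based on the resource URI pattern used elsewhere in the report, and dc:created is assumed to be the intended spelling of the default ordering property. Parameters that are not given are simply left out so that the server's defaults apply.

```python
from urllib.parse import urlencode


def notebook_annotations_url(service_uri, nid, maximum_annotations=None,
                             start_annotation=None, orderby=None,
                             descending=None):
    """Build the GET URL for the annotations of notebook nid.

    Only parameters that were explicitly set are added to the query string.
    """
    params = {}
    if maximum_annotations is not None:
        params["maximumAnnotations"] = maximum_annotations
    if start_annotation is not None:
        params["startAnnotation"] = start_annotation
    if orderby is not None:
        params["orderby"] = orderby
    if descending is not None:
        # The text describes orderingMode as 1 (descending) or 0 (ascending).
        params["orderingMode"] = 1 if descending else 0
    url = f"{service_uri.rstrip('/')}/notebooks/{nid}"
    return url + ("?" + urlencode(params) if params else "")


url = notebook_annotations_url("https://example.org/api", "nb1",
                               maximum_annotations=10, start_annotation=1,
                               orderby="dc:created", descending=True)
```

Note that urlencode percent-encodes the colon in dc:created; whether the server expects the encoded or the literal form is not stated in the text.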
68. or the DWAN client. Selection was based on four criteria: the tool's functionality (compliance with task 5.6 requirements), whether it is open software, whether it can be adjusted to communicate with the back end, and platform independence. Table 1 presents the results of the investigation. Table 1: Annotation tools available by Autumn 2012. Back Tool State Open end Peco Functionality software browsers access Annotating PDF Word A nnotate active commercial and other document formats on line AnnotationEdit active commercial Annotating video audio Library and plug in adding annotation functionality to any web 5 Open source Annotator active yes page Java Script but one needs to alter its html by running script there Different Written in C Last A Annotea distributions annotating html web release jan Open source yes Amaya 2012 Linux Windows documents MacOS Notactve Firefox Highlighting and bookmarklet or clipping chunks of text Awesome 2 ee add on on web documents Highlighter Web page or by using the broken tool s website Not Support conversation on Bler active IE Plugin top of the web page P Web page Firefox addOn broken Via their web Collaborative via page sending screenshots in Bounceape eke Bice app yS http www boun e g Facebook Twitter ceapp com and Notable Commercial 3 Collaboration on visual Web site wi
69. ot designed as a collaborative annotating tool. The DWAN developers team is working on the possibility to retrieve the original colors of annotations created by other client instances. In fact, a DWAN client does send the color information to the server when an annotation is created, and this information is saved in the database. The server transfers the color information on a GET request from the client, but at the moment the DWAN client cannot interpret it when it retrieves the annotation from the database. Technically speaking, the annotated fragment is represented by an XPointer link that consists of the link to the page and the fragment descriptor defining the location of the fragment in the original document. The information about the color is represented as a CSS property, as part of the fragment. Other users can view a particular user's annotation in their DWAN clients simply by reloading the annotated page. As stated above, an annotation made by a remote instance of the DWAN client is listed in the directory of incoming annotations in the sidebar on the left-hand side of the browser window. The corresponding annotated fragment appears as a light-yellow colored fragment. In order to access the database and thus use DWAN and its functionality (e.g. view and post annotations), one needs to log in. DWAN offers two ways of authentication: firstly via Shibboleth and secondly via Spring basic authentication. If the user's institution is part of the DWA
70. otation read from the database loses its original color while being interpreted by the client. • Thirdly, fragments of images cannot be annotated by Wired Marker, but only the whole image. • In the fourth place, adding notebooks would demand significant refactoring of the original code; to a certain extent, colored directories of the local folder can be seen as notebooks. Summing up the features that have been added to or changed in Wired Marker to adjust it to DASISH requirements: Design: customization of existing visual features (e.g. sidebar, top menu, right-click menus, add-ons manager view); customization of visual features for extended functionality (login/logout button, extended Settings menu for back-end configuration). Functionality: GET, PUT (update), POST and DELETE for annotations; POST and GET for cached representations; authentication (login/logout). Miscellaneous: rewrite of the chrome manifest for development in an extension proxy file environment; extension code updates to ensure support by current Firefox versions (Wired Marker only supports Firefox versions 2.0-10). [Footnotes: http://www.wired-marker.org/en/index.html. The redirection to these pages was under implementation in the front end by the time the deliverable was written. The client developers are working on a fix for this problem at the time of writing of this document.] hyperanchor http www hyper ancho
71. r more acute. Scenario: The user is researching methods used in 3D reconstruction of archaeological sites and objects. They need both to define and to add annotations to a variety of different web pages, especially results of searches using Google Images and Google Scholar. Specifically, they are interested in linking data created in the Unity 3D modeling package with Geographic Information Systems (GIS) data. They therefore need to compile a profile of web resources which refer to this issue. They are leading on this task in a collaborative team, and thus need to share their annotations remotely with colleagues and with research students. These colleagues will need to be able to add annotations as well, and to formulate replies to existing annotations. Formal/informal: this is an informal referencing requirement, as the researcher will only be adding new information in the form of annotations. Assets: the assets are primarily text and images, but may also include video. They are not stored locally. Examples include official advice from Unity (http://unity3d.com/learn/resources/talks/gis-terrain-unity), Q&A threads (http://answers.unity3d.com/questions/17829/how-can-i-import-gis-data-into-a-unity-project.html) and bibliography (http://ieeexplore.ieee.org/xpl/login.jsp?tp=&arnumber=5567608&url=http%3A%2F%2Fieeexplore.ieee.org%2Fxpls%2Fabs_all.jsp%3Farnumber%3D5567608). Necessary functionalities
72. r org en technical format html (mapping to XPointer, used on POST and GET); http://www.w3.org/TR/xptr-framework/; http://www.w3.org/standards/techs/xpointer#w3c_all; setting updated annotation bodies

Front end for ELAN

An ELAN front end for the DWAN framework is being developed in the context of the CLARIN-NL ColTime project (in progress). ELAN is a multimedia annotation tool, and the goal of the project is to allow ELAN users to exchange messages or create comments, each consisting of a reference to a particular fragment of a media file and a message text. Such comments or messages are mapped onto the DWAN annotation concept. Since ELAN already has the notion of linguistic annotations, in this section we use the word "comment" for DWAN annotations to avoid confusion.

With ELAN, users can make annotations associated with specific time spans of a media file. These are organized in so-called tiers. Within a single tier the annotations cannot overlap, but annotations in different tiers can. Users can organize tiers so as to use different tiers for different types of annotations. For instance, one tier could contain annotations pertaining to pitch level while another contains information about hand gestures. However, until now there was no specific support for commenting on the ELAN annotations themselves. For instance, researchers might want to coordinate their work or review each other's work. Sometimes the tier system of ELA
73. raphy they have formed over five years of research on a specific geographic area. In this case the bibliography covers the archaeology of Cyprus in the Byzantine period. Each bibliographic reference is the authority for a particular spelling of a particular place name (e.g. Paphos as opposed to Pafos). The user wishes to use their bibliographic resource to annotate place-name references in the third-party document. This may be viewed as enhanced citation.

Formal/informal: The annotation of the text is a formal annotation requirement, as the third-party text is being annotated with pre-existing information. The annotations of the bibliography are informal, as they provide free-text information on each individual item.

Asset: The asset is purely textual. Previously the researchers kept it in a Word document on their local hard drive, but recently, as one of the outputs of a research project, they have published it online as part of an inventory (marked up in XML) of Byzantine monuments in Cyprus. It is available on a webpage as a list of publications with author, title, periodical title (if appropriate), date of publication and page reference. Annotations take the form of links to the bibliographic records in the researcher's database, and also the annotations they have made on the bibliographic records. The latter might include: is this reference up to date, or is it being cited in agreement or disagreemen
74. rk this task also found a number of other tools from the Humanities domain that looked promising to integrate in DWAN. In separate chapters we present an analysis of the annotation task of DWAN clients in the context of a general overview of Humanities tools and the Humanities research workflow. We also present some user scenarios that could be fulfilled by DWAN, either by the current version or with some future development.

3 Introduction to the DWAN framework

3.1 Motivation and goals

In the last decades, next to the ever-growing amounts of data on the web, we have also witnessed large amounts of data moving to digital archives. These archives have been connected to the Internet, spreading their content through the research community. The availability of such data creates new opportunities for collaboration. To bring this collaborative environment to the next level, a set of tools is needed that allows groups of researchers from different institutions, countries or backgrounds to work together. Such collaboration can take the form of annotating the data and sharing these annotations using an annotation infrastructure. By an annotation we mean a remark on one or more parts of one or more documents. For instance, it can be a text note containing a short English translation of a certain sentence in a target document which is in Catalan. Annotatable documents include, for instance, web pages or web documents, or resources in domain-specific formats s
75. rofile via the Terminal window by using the command
    mkdir -p ~/Library/Application\ Support/Firefox/Profiles/nameofprofile
The instance of Firefox with the given profile can be launched by the command
    /Applications/Firefox.app/Contents/MacOS/firefox -profile ~/Library/Application\ Support/Firefox/Profiles/nameofprofile -no-remote

To create and use a new Firefox profile in Windows, you can use the Firefox Profile Manager, which allows you to create a new profile while retaining your original one:
• If Firefox is open, close it completely by choosing File > Exit.
• Go to the Windows Start Menu and select Run.
• Enter firefox.exe -P and click OK.
• Click the Create Profile button in the "Firefox - Choose User Profile" window that comes up.
• Click Next > in the Create Profile Wizard window that comes up.
• Type a new name in the "Enter new profile name" box and click Finish.
• Clear the "Don't ask at startup" box so that it is unchecked, and click Start Firefox.
Firefox will then start with the new profile.
76. s:anyURI" minOccurs="1" maxOccurs="1"/>
      <xs:element name="Version" type="xs:string" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
  </xs:complexType>
  <xs:complexType name="TargetInfoList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="targetInfo" type="dasish:TargetInfo" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="Principal">
    <xs:sequence>
      <xs:element name="displayName" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="eMail" type="xs:string" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
    <xs:attribute ref="xml:id" use="required"/>
  </xs:complexType>
  <xs:complexType name="CurrentPrincipalInfo">
    <xs:sequence>
      <xs:element name="currentPrincipal" type="xs:boolean" minOccurs="1" maxOccurs="1"/>
    </xs:sequence>
    <xs:attribute name="href" type="xs:anyURI" use="required"/>
  </xs:complexType>
  <xs:complexType name="CurrentPrincipalInfoList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="currentPrincipalInfo" type="dasish:Curren
77. s write access and can change the rights of other users. Additional web pages served by the back end allow the owner to reassign the rights of a particular user for an annotation, or to change the public access mode of a given annotation. On a principal's request, the back end can present one of two web pages allowing the owner of an annotation to change the annotation's access modes, which can be read, write or none. The first page allows changing the access to a specific annotation for a specific principal. The second page is used to change the public access mode in one step, so that, for instance, all registered principals get write access. Updating access modes is implemented through web pages issued by the back end because changing access rights is not implemented in Wired Marker itself, and adding this feature to the Wired Marker-based DWAN front end would be quite time-consuming.

While working on the transformation of Wired Marker into a DWAN client, the DWAN development team established that four of Wired Marker's drawbacks cannot be fixed within a reasonable amount of time:
• first, the original Wired Marker does not provide multiple-target annotating; in other words, a user can put a text note on exactly one fragment of the source page. For instance, it is not possible to annotate two text fragments simultaneously and to link them with the remark that they contradict each other;
• the second drawback has already been mentioned: an ann
78. same type resources as ELAN and also uses the same linguistic annotation model. In ANNEX's context, DWAN annotations are also referred to as comments. However, ANNEX is only a visualization tool for archived materials and currently does not offer any functionality to create annotations. Still, users would like to make and exchange comments with respect to archived media and annotation files, independent of the possibility to actually add linguistic annotations. Using the DWAN back end to store, search and retrieve such comments is easier than in the ELAN case: ANNEX already relies on URLs and part identifiers to fetch its data, and specifically ANNEX's URLs already accept parameters for the time period (time and duration) and for the tier specification (tier). This also eliminates the need for the EAF URN described in the ELAN section of this document.

Footnotes: http://www.clarin.nl; http://www.ru.nl/sign-lang/projects/coltime

5.5 Testing Procedure

The Software Test Plan (STP) is designed to prescribe the scope, approach, resources and schedule of all testing activities. The detailed testing plan, which can be found at https github com DASISH dwan testing, identifies the following:
• the items to be tested
• the features to be tested
• the types of testing to be performed
• the personnel responsible for testing
• the resources and schedule required to complete testing and
79. server software and the database and possibly multiple front ends (clients). Developed within the DASISH project, the DWAN tools are an instance of the DWAN framework. This instance consists of the back-end part and the client, which is a significantly adjusted version of the Wired Marker Firefox extension. We chose Wired Marker after a selection process looking for a suitable tool to serve as a general DWAN client for annotating objects on the Web. The selection process is described separately later.

The core of the back end is a database where annotations and information about the corresponding annotated target documents are stored together with the targets' cached representations. Archiving cached representations in the database is relevant when annotated documents are dynamic pages, like news sites or wiki pages under construction.

A client in the DWAN framework exchanges data with the server by sending REST requests and receiving responses. Client request bodies and server responses take the form of XML. The client is able to accept and send XML structures that obey a predefined XML schema. The schema mirrors a data model that has been designed to represent the main data structures involved in constructing annotations.

The work in task 5.6 succeeded in delivering, next to the back end, one front-end client tool for the DWAN framework and, in collaboration with other projects, integrated two extra client tools. For future wo
80. similar url found from annotation evaluation lists they didn t work at all http www keeppy com Keeppy Server a social network relevant for our purposes No license One Click Annotator information a WYSIWYG Web short technical editor for enriching tomp new information content with RDFa i and easy annotations access for http loomp org index p downloading hp home html is missing Toolbox Will never be WYSIWYG editor http markitup jaysalvat com home MIT GPL licence based nen jQuery MarkITUp new gt 1 4 2 Javascript a onn library jTagEditor only for http www notateit com Windows seems NotateIT new noropen not to be sources i E compliant with other platforms WebKlipper new commercial

As one can see, not many open software tools with suitable functionality were available, and moreover few of them were well documented. In the end, the decision was made to select Wired Marker as the starting point for the DASISH web annotator client. Wired Marker is a Creative Commons licensed Firefox plugin whose code may be changed under agreement with its creators. It is platform independent, since Firefox is one of the most popular browsers and can be installed on Linux, OS X and Windows. The access to the back-end database can be adjusted. Wired Marker's annotation functionality, though limited, is still in line with our goals: select a text fragme
81. sis. The primary innovations Bibliopedia achieves are: (1) the aggregation and cross-referencing of separate silos of scholarly data; (2) the transformation of that information into a format consistent with the semantic web; and (3) crowd-sourcing the verification and elaboration of that data. Mapping and cross-referencing large-scale, high-volume scholarship also means that unexpected connections can be found and brought to light, along with less-known original works that might otherwise remain unread. Moreover, formatting scholarly references for the semantic web will make this data available to a far broader community and enable unexpected innovations. Bibliopedia will generate custom bibliographies and visualizations based on search results, facilitating a wide variety of scholarly inquiry and discovery. Most importantly, Bibliopedia is designed for ease of use, so as to substantially broaden participation and attract the largest possible range of humanities scholars as its user base, in particular scholars who do not normally use digital tools. Bibliopedia provides a RESTful API, SPARQL queries, linked data, Zotero compatibility and many other features. It is built with Drupal 7, available on GitHub, and served from the cloud for scalability, portability and reliability. Bibliopedia is open to interested academics and libraries who would like to see what their metadata looks like on the semantic web.
• LitBlitz Literature Notes Manager is free, web-based beta
82. software that aims to improve how students and researchers manage their notes for literature reviews, assignment research and more. With LitBlitz you can: avoid hours of printing, highlighting, organizing and typing; save the money involved in printing hundreds to thousands of pages; highlight and write notes without shuffling a stack of papers; organize your notes into digital notebooks in real time; and easily transfer notes to your draft review or assignment. LitBlitz was designed from the ground up to solve problems that other annotation and note-taking services have not looked at or have solved poorly. It differs from popular note-taking and archiving software like Evernote in that it allows users to: take text and image snippets from their document or webpage sources rather than forcing them to archive entire documents; write "Own Notes" (personal insights related to snippets) to enable rapid draft writing and context building; and manage these snippets in themed digital notebooks for fast, easy reference. The founder is open to improving it through suggestions from librarians, academics and Ed Tech.
• MapHub is an online application for exploring and annotating digitized, high-resolution historic maps. All user-contributed annotations are shared via the Maphub Open Annotation API.
• Pliny is a note-taking and annotation tool. It may be used with both digital (web pages, images, PDF files) and non-digital (books, printed articles) materials. Pliny is a desktop application that
83. stische Annotation (Potsdam Interchange Format for Linguistic Annotation), a stand-off multilevel XML format which serves as the basis for further processing. ANNIS2 provides the means for visualizing and retrieving this data.
• Bibliopedia is an open-source semantic wiki research platform designed to crawl scholarly resources (including JSTOR, the Library of Congress, the Arts and Humanities Citation Index and similar data sources), extract metadata about works cited, convert that data into a semantic web format, aggregate the different repositories, and then display the results on a wiki-style website for the scholarly community to verify, add to, annotate, elaborate and discuss. We envisage Bibliopedia as an open, research-enabling platform designed to unify the many disparate, closed silos of scholarly information available today that remain difficult and time-consuming to use. Our first goal was to extract and transform bibliographic data into a linked data format consistent with semantic web requirements, and to create large volumes of cross-references among texts, making digitized scholarly texts exponentially more useful to researchers and to machine analy

Footnotes: http://www.ahrc.ac.uk/Funding-Opportunities/Research-funding/Connected-Communities/Scoping-studies-and-reviews/Documents/Crowd%20Sourcing%20in%20the%20Humanities.pdf. The information has been taken from http://dirt.projectbamboo.org/resources.
84. t The annotations in the bibliography should be able to link simultaneously to multiple bibliographic references.

Necessary functions:
• Highlight text: placing markers on particular publications as aides-memoire for a publication they are working on. This would apply to whole records or paragraphs rather than individual words.
• Add comments: they may also wish to add comments in the form of scribbled notes.
• Share: they may wish to share selected parts of the original resource via email, Twitter and Facebook, although email is likely to be by far the most useful of these, as they will wish to share references to their bibliography with individual colleagues.
• Enhance text with links: using records in the bibliography to annotate sections of text in a second document. This would be done by embedding hyperlinks in the second document pointing back to the bibliography records.

In the application, therefore, the third-party text is annotated twice: first with the bibliography, and second with the annotations of the bibliography. Both types are displayable in hover-over boxes on the third-party document.

2. Image annotation

Review of tools available: Greenshot, HyperImage, NewRadial (INKE), Skitch, UVic Image Markup Tool. These tools are both configurational and editorial. This reflects the need both to organize image collections with annotations and to link comments and notes with them.

Scenario: User has downloaded a large (1000 image) co
85. t uses an intermediate layer, a dispatcher class. This is because a REST request cannot be interpreted as a single PostgreSQL command but is a chain of such commands. For instance, when getting an annotation, the user's access rights must first be checked via a separate DAO. If the logged-in user has read rights for the requested annotation, then the GET annotation request is fulfilled. Otherwise the GET annotation method returns status 403 (access forbidden). Thus the intermediate dispatcher object is responsible for turning a REST request into a sound chain of calls to the necessary DAO objects. The DAO layer is the innermost layer of the back-end software.

Figure 6: DWAN backend architecture (diagram components: Testing, Jersey test, Jersey Client, HTTP access, Grizzly, JMock, DASISH API, Mock DAOs, Dependency Injection, JdbcTemplate, HSQLDB).

Footnotes: https://jersey.java.net; http://tomcat.apache.org; https://jaxb.java.net

Database and Database Access Objects

A Postgres relational database provides storage for all the core information resources: annotations, targets, cached representations, principals and notebooks. The database contains five main tab
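The access check that the dispatcher performs for a GET annotation request can be sketched as follows. This is a minimal Python illustration rather than the DWAN Java code: the class names PermissionDao, AnnotationDao and Dispatcher, their methods, and the in-memory dictionaries standing in for PostgreSQL queries are all assumptions made for the example.

```python
# Hypothetical sketch of the dispatcher layer; not the DWAN implementation.
# Dictionaries stand in for the PostgreSQL tables that real DAOs would query.

ACCESS_ORDER = {"none": 0, "read": 1, "write": 2}


class PermissionDao:
    """Looks up the access level a principal has for an annotation."""

    def __init__(self, levels):
        self._levels = levels  # {(annotation_id, principal_id): "read" | "write"}

    def access_level(self, annotation_id, principal_id):
        return self._levels.get((annotation_id, principal_id), "none")


class AnnotationDao:
    """Fetches stored annotations by identifier."""

    def __init__(self, rows):
        self._rows = rows  # {annotation_id: annotation record}

    def get(self, annotation_id):
        return self._rows.get(annotation_id)


class Dispatcher:
    """Turns one GET annotation request into an ordered chain of DAO calls."""

    def __init__(self, permission_dao, annotation_dao):
        self._permissions = permission_dao
        self._annotations = annotation_dao

    def get_annotation(self, annotation_id, principal_id):
        # Step 1: check the access rights through the separate permission DAO.
        level = self._permissions.access_level(annotation_id, principal_id)
        if ACCESS_ORDER[level] < ACCESS_ORDER["read"]:
            return 403, None  # access forbidden
        # Step 2: only now fetch the annotation itself.
        return 200, self._annotations.get(annotation_id)


dispatcher = Dispatcher(
    PermissionDao({("a1", "alice"): "write", ("a1", "bob"): "none"}),
    AnnotationDao({"a1": {"headline": "Translation note"}}),
)
print(dispatcher.get_annotation("a1", "alice"))
print(dispatcher.get_annotation("a1", "bob"))
```

Keeping the permission lookup and the annotation lookup in separate DAO classes means the REST method only ever talks to the dispatcher, which decides the order of the calls.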
86. tPrincipalInfo" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:simpleType name="Access">
    <xs:restriction base="xs:string">
      <xs:enumeration value="read"/>
      <xs:enumeration value="write"/>
      <xs:enumeration value="none"/>
    </xs:restriction>
  </xs:simpleType>
  <xs:complexType name="Permission">
    <xs:attribute name="principalHref" type="xs:anyURI" use="required"/>
    <xs:attribute name="level" type="dasish:Access" use="required"/>
  </xs:complexType>
  <xs:complexType name="PermissionList">
    <xs:complexContent>
      <xs:extension base="dasish:List">
        <xs:sequence>
          <xs:element name="permission" type="dasish:Permission" minOccurs="0" maxOccurs="unbounded"/>
        </xs:sequence>
        <xs:attribute name="public" type="dasish:Access" use="required"/>
      </xs:extension>
    </xs:complexContent>
  </xs:complexType>
  <xs:complexType name="Annotation">
    <xs:sequence>
      <xs:element name="ownerHref" type="xs:anyURI" minOccurs="1" maxOccurs="1"/>
      <xs:element name="headline" type="xs:string" minOccurs="1" maxOccurs="1"/>
      <xs:element name="lastModified" type="xs:dateTime" minOccurs="1" maxOccurs="1"/>
      <xs:element name="body" type="dasish:AnnotationBody" minOccurs="1" maxOccurs="1"/>
      <xs:element name="targets" type="dasish:TargetInfoList" min
87. tarting with the server's location, further specified by the type of requested resource and its identifier when applicable.

Requests of method type GET are used to retrieve information about resources stored in the database. For these GET requests, the URL generally contains the identifier of the requested resource as a path parameter. For instance, it can be the identifier of an annotation or of a cached representation. Passing a principal identifier as a parameter is not required, because the active principal is known from the session via an authentication procedure, e.g. one provided by Shibboleth, an open-source project that provides Single Sign-On capabilities and allows sites to make informed authorization decisions for individual access to protected online resources in a privacy-preserving manner.

A PUT (respectively DELETE) request is used to update (respectively delete) the resource whose identifier is given as a request parameter. Only the owner has DELETE rights. A POST is performed when a client wants to create a new annotation. Most of the information necessary to fulfill a PUT or POST request is not given as a request parameter but serialized in the request body. For instance, to submit an annotation a client needs to fill in the request body with the XML element corresponding to the class Annotation. All the information necessary to create an annotation should be placed in the corresponding nodes of the XML element. If a POST PUT
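The request conventions described above (identifier in the URL path for GET and DELETE, serialized XML in the body for POST) can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the server address, the path segments "annotations" and "principals", and the simplified annotation element, which omits namespaces, lastModified and targets. The requests are only built, never sent.

```python
import urllib.request
import xml.etree.ElementTree as ET


def resource_url(server, resource_type, identifier=None):
    """Server location, then resource type, then the identifier when applicable."""
    parts = [server.rstrip("/"), resource_type]
    if identifier is not None:
        parts.append(identifier)
    return "/".join(parts)


def annotation_body(owner_href, headline, text):
    """Serialize a simplified annotation as XML (namespaces and targets omitted)."""
    root = ET.Element("annotation")
    ET.SubElement(root, "ownerHref").text = owner_href
    ET.SubElement(root, "headline").text = headline
    ET.SubElement(root, "body").text = text
    return ET.tostring(root, encoding="utf-8")


def build_request(method, url, body=None):
    """Build, but do not send, an HTTP request; any XML goes in the body."""
    headers = {"Content-Type": "application/xml"} if body is not None else {}
    return urllib.request.Request(url, data=body, headers=headers, method=method)


SERVER = "https://example.org/dwan/api"  # placeholder, not a real DWAN endpoint

# GET and DELETE identify the resource in the URL path and carry no body.
get_req = build_request("GET", resource_url(SERVER, "annotations", "a1"))
delete_req = build_request("DELETE", resource_url(SERVER, "annotations", "a1"))

# POST creates a new annotation, so the information travels in the XML body.
post_req = build_request(
    "POST",
    resource_url(SERVER, "annotations"),
    annotation_body(SERVER + "/principals/p1", "Note", "A short remark"),
)
```

No principal identifier appears in any of these requests, in line with the text: the active principal would be taken from the authenticated session.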
88. tations targets annotations principals permissions and notebooks annotations. If the annotation record is deleted from the annotations table before the corresponding rows in the join tables are removed, then the join tables hold references to a non-existing annotation via its internal identifier, and the database will signal an integrity error. To prevent such errors we have introduced a Java class, DBDispatcher.java, which calls the methods of the DAO implementations in the correct order. Moreover, it triggers cascading of the operations when necessary. For instance, complete deletion of an annotation amounts to purging the join tables first, then deleting the corresponding record in the annotation table, and then triggering removal of the annotation's unused targets.

Auxiliary resource-info classes generated by JAXB for the corresponding XML types (TargetInfo, AnnotationInfo, NotebookInfo) contain references to the corresponding resource plus the most important information about the resource.

Footnotes: http://docs.spring.io/spring-framework/docs/current/spring-framework-reference/html/dao.html; Java Database Connectivity (JDBC)

REST Application Programming Interface

The server and a client communicate with each other by means of a REST Application Programming Interface (API). A REST API is a collection of requests that the server must recognize and respond to in an appropriate way. Requests are made by means of a URL s
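The deletion order that DBDispatcher enforces can be tried out in a small Python sketch. It is only an illustration under stated assumptions: SQLite's in-memory database is swapped in for PostgreSQL, and the table and column names (annotation, target, annotations_targets) are hypothetical simplifications, not the DWAN schema.

```python
import sqlite3

# Hypothetical, simplified schema: one join table between annotations and targets.
SCHEMA = """
CREATE TABLE annotation (id INTEGER PRIMARY KEY, headline TEXT);
CREATE TABLE target (id INTEGER PRIMARY KEY, link TEXT);
CREATE TABLE annotations_targets (
    annotation_id INTEGER NOT NULL REFERENCES annotation(id),
    target_id INTEGER NOT NULL REFERENCES target(id)
);
"""


def open_database():
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks references only when asked
    conn.executescript(SCHEMA)
    return conn


def delete_annotation(conn, annotation_id):
    """Delete an annotation in the order that keeps all references valid."""
    with conn:  # one transaction: commit on success, roll back on error
        # 1. Purge the join-table rows that point at the annotation.
        conn.execute(
            "DELETE FROM annotations_targets WHERE annotation_id = ?", (annotation_id,)
        )
        # 2. Delete the annotation record itself.
        conn.execute("DELETE FROM annotation WHERE id = ?", (annotation_id,))
        # 3. Remove targets that no remaining annotation refers to.
        conn.execute(
            "DELETE FROM target WHERE id NOT IN (SELECT target_id FROM annotations_targets)"
        )


conn = open_database()
conn.execute("INSERT INTO annotation VALUES (1, 'note')")
conn.execute("INSERT INTO target VALUES (10, 'http://example.org/page')")
conn.execute("INSERT INTO annotations_targets VALUES (1, 10)")
conn.commit()

# Deleting the annotation first leaves a dangling reference in the join table.
try:
    conn.execute("DELETE FROM annotation WHERE id = 1")
    wrong_order_failed = False
except sqlite3.IntegrityError:
    wrong_order_failed = True
conn.rollback()

delete_annotation(conn, 1)
```

Running the three steps inside one transaction means that a failure halfway through leaves the tables as they were, which is one reason to centralize the ordering in a single dispatcher method.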
89. th the base media projects The Commentor active you need an plan free 3 accoun collaborators Annotating web documents in local networks and on the Last internet Different color CritLink executable yes Unix means different sorts of from 2000 comment support green issue red bcomment bule query orange www dasish eu GA no 283646 5 Commercial Annotated PDF word Girocodot active with free Pwerpoint documents are Standard y saved on Crocodoc edition servers Commercial Annotating web pages DIIGO active with free base yes o Satan saved to Diigo library account Diigo account is needed 92 i DrawHere Pe one needs Firefox IE Drawing on web pages an account bookmarklet shareable Commenting web sites Discontinu anyone could write ThirdVoice Browser Plug in anything a lot of ed in 2001 ee criticism from the web page owners Highlighting and putting Wired Marker aci Creative ee Firefox text notes on the commons extension fragments of web documents Tool of 2006 Inactive Fleck their site does not exist any more http delicious com ath Need an es Bookmarklet account Need an With skitch http evernote com active account a Server storage annotating pdf and premium is of the documents images not web pages commercial Firefox Safari Highlighting html http webmarginali Open source Chrome TE
90. the web server (e.g. Tomcat) which hosts the database. The Jersey shell is not written by the DWAN developers but used as a library of program modules. The remaining layers are designed and implemented by the DWAN developers in Java.

Next to the Jersey shell there is a package containing REST methods, which accept client requests in the form of HTTP strings, possibly together with XML bodies for more complex requests. For instance, to post an annotation a client has to send it as an XML file. This XML file is deserialised within the POST annotation REST method into a Java instance of the class Annotation using JAXB technology. The other way around, REST methods also translate database responses into representations that clients can interpret. For instance, when getting an annotation, the response from the database, which is constructed as a Java object, is serialized by the GET annotation REST method into an XML file, which is sent to the client. The client is responsible for converting it into a user-friendly form.

REST methods do not perform calls to the database directly. A REST method uses Data Access Objects (DAOs) that take a REST request together with its parameters and translate it into a Postgres database command. For instance, a GET request is typically translated into a SELECT command of PostgreSQL; POST and PUT requests are translated into INSERT and UPDATE commands respectively. To be precise, a REST method does not call DAO objects directly bu
91. uch as transcriptions originally created by linguistic software, e.g. EAF files created by the ELAN multimedia annotation tool. To bring in the collaborative element, such annotations should be shareable between different groups of users and, if editable, by different tools with domain-specific capabilities.

Footnotes: https://addons.mozilla.org/nl/firefox/addon/wired-marker/; http://en.wikipedia.org/wiki/Representational_state_transfer; https://tla.mpi.nl/tools/tla-tools/elan/elan-description/

Based on these ideas of shareable annotations that can be worked on by different domain-specific tools, we have set ourselves two goals. Addressing the first goal, we have come up with the one-server-many-clients architecture (see section 5.1 for more detail and Figure 1, General DWAN Architecture). Indeed, the server with the database is used to store annotations, which all have the same structure independent of the annotatable documents. This structure mirrors an annotation itself, e.g. a text comment, a reference to the sources with the annotated fragment specifications, and possibly references to cached copies of the annotated documents (see section 5.2). This uniformity opens the possibility of designing one database that stores annotations for different types of documents. On the client side the situation is different in general.

Figure 1 (diagram labels: Browser tool 1, Browser tool 2, Annotation database). General DWAN Ar
92. ve a precise MIME type. For instance, a body can be plain text which describes a specific relation, like contradiction, between two fragments of some web document. In this case the body should contain references to the targets that represent these two fragments and the document. Annotations can be gathered in notebooks.

The DWAN model has been designed with Open Annotation in mind, and therefore the mapping between DWAN model components and Open Annotation concepts arises naturally. The targets of the DWAN model correspond to instances of the Open Annotation class oa:SpecificResource (see Figure 4). Multiple target sources are represented as instances of oa:Composite. Each oa:item of the composite is an instance of either oa:SpecificResource or oa:Composite. A cached representation of an annotated target source is referred to via the target's state (see the figure above), using the properties oa:hasState and oa:cachedSource. The metadata of the cached representation are presented via dc properties and dctypes: mimeType is presented as dc:format, tool is presented as dc:publisher, and type is presented as dc:description. Note that dc:type cannot be used here because its value must (as recommended) come from the DCMI Type Vocabulary; therefore e.g. "screenshot" would not be a good value here. Moreover, a cached representation must have one of the dctypes as its rdf:type, and it must be compliant with the dc:format value. For instance, if dc:format is imag
