TyDI: Terminology Design Interface – User Guide - Migale
Contents
1. [Screenshot residue: term grid window with its filter panel and candidate table (columns Id, Surface form, Head, Expansion, Syntactic category, Producer), the semantic classes of the term "illegal cod", and the property sheet for candidate 932]
2. [Screenshot residue: term grid (908 rows) with candidate 932 "illegal cod" selected, its property sheet (Id, Lemma, Syntactic category, Head, Expansion, analysis), and the OccurenceInContext Window (columns Filename, Sentence, Context) listing the sentences in which the candidate occurs]
3. If your database does not contain any user profile yet, please refer to the TyDI Admin Guide to learn how to connect as an application administrator.

3.2.1 User profiles creation and modification

Clicking on the button located in the User toolbar opens the user edit window. This command can also be found in the context-sensitive menu under the Term database node in the Project window. Note: user profile management is only granted to Application administrators.

[Illustration 6: Users editing window, showing the profile list and the fields Name, Alias name, is Application administrator and Password, with New and Delete buttons]

The User editing window displays the list of the user profiles existing in the current database. The list is located at the top of the window. To edit a profile, first select it in the list and then make the change in the text fields below. You can also reset the password of an existing user, which is useful in case of password loss. Confirm changes by clicking on the Save button or discard them with the Cancel button. Clicking on the New button creates a new user profile. Clicking on the Delete button deletes the selected user profile.

Warning: removing user profiles that have been used to create data (e.g. term validation, semantic class or term link creation) is forbidden.

Note: user profiles with Application administrator privileges hav
4. When a new version is made available, an icon will appear in the status bar; you just need to click on the dedicated hyperlink to open the update wizard, which will guide you through the update process.

6 Parameterization

6.1 Connection configuration

The parameters needed to connect to a database instance are grouped and associated with a connection name in the application preferences, which are saved in a local file. Hence it is easy to switch from one database to another. First launch the application but do not connect to a database (click the Cancel button when the login dialog appears). Then open the named connections editing window: Tools > Option > Term Validation category > Datasources tab.

[Illustration 26: Datasources option panel, showing the list of named connections with delete, new, up and down buttons, and the fields Name, provider class, Url, schema, id and password]

The list located at the top of the window contains the existing named connections. To edit a connection, first select it in the list and then make the change in the fields below. Confirm the changes by clicking on the save
5. [Illustration 13: Property sheet displaying term candidate info, with the fields Id, Surface form (pesticide resistance), Lemma (pesticide resistance), Syntactic category (NN NN), Head (resistance), Expansion (pesticide), analysis, is inferred, Nb occur, Nb doc, Producer, Expert, is Concept and is PseudoTerm]

Note 1: it is possible to tag several terms at once by selecting them in a term grid and setting a new value in the property sheet.

Note 2: it is possible to select property text to copy it to the clipboard. Text from the term grid cannot be copied.

3.4.4.2 Context window

[Screenshot residue: OccurenceInContext Window for candidate 1183107, with columns Filename, Sentence and Context, listing occurrences found in the file CA2240135A1.xml]
6. Number of documents | Number of distinct documents where the candidate is found | None
Head productivity | Number of term candidates the head of which is the current candidate | Open a new grid containing all the term candidates the head of which is equal to the current candidate
Expansion productivity | Number of term candidates the expansion of which is the current candidate | Open a new grid containing all the term candidates the expansion of which is equal to the current candidate
Head | Head form | Open a new grid containing all the term candidates that have the same head as the current candidate (i.e. head family)
Expansion | Expansion form | Open a new grid containing all the term candidates that have the same expansion as the current candidate (i.e. expansion family)
Syntactic category | Part-of-speech tag (POS tag) | Open a new grid containing all the term candidates that have the same POS tag as the current candidate
Lemma | Lemma of the candidate | None
Supersedes | Number of term candidates that have been regrouped under the current candidate (see post-YaTeA processing) | Open a new grid containing all the term candidates superseded by the current candidate
is Inferred | True if the term is not found in the corpus alone in a maximal noun phrase (MNP) but has been retained for the syntactic analysis of a larger term | None
Number of words | Number of words in the form | None
is Concept | True when the current term is selected as ca | None
7. The FastR import button is located in the Project toolbar. It is enabled when a corpus node corresponding to a YaTeA or a tab-separated value import is selected in the Project window.

3. Explore FastR variant proposals and qualify FastR morphosyntactic variation links as semantic relationships.
o The FastR variants window is opened by clicking on the button located in the Project toolbar; it is enabled when a Project node is selected in the Project window. This command can also be found in the context-sensitive menu under the Project node in the Project window.

4.4.2 FastR variant proposals view

The FastR variant proposals view is similar in use to the term search window: it contains a filter panel to refine data retrieval and a term grid which displays pairs of terms found as variants by FastR, one pair per row. Combined with the semantic class view (detailed in 4.2.1), it helps the user to quickly qualify FastR proposals into semantic links by creating new synonymy classes or enriching existing ones, and by creating hyper/hyponymy links.

4.4.2.1 FastR variants filter panel

[Screenshot residue: FastR variants filter panel, with the criteria Form, any term, delta string, not in any semantic class, delta word count, show representative only, Term producer, Link producer and FastR rule]
8. is some specificity depending on the operating system you are running. The main differences appear in the look and feel, which can also differ from one version of Java to another. The table below presents a list of these differences.

Issue | Unix-like OS | Mac OS | MS Windows
command line execution path (relative to installation directory) | tydi/bin/tydi | tydi/bin/tydi | tydi/bin/tydi.exe
contextual menu | click with the right button of the mouse | if you have a single-buttoned mouse: ctrl + click | click with the right button of the mouse
application saving directory (user preferences) | tydi | /Users/LOGNAME/Library/Application Support/tydi | APPDATA tydi

7 Appendix

7.1 Term text import file format

The text file import process recognizes the column headers detailed in the table below.

Column header | Description | Note
ID | External term identifier | Not imported
PREVALIDATION | Prevalidation string | Free text
VALIDATION username | Validation status | One column per user. Imported only if there is a matching username in the term database. The default validation statuses are recognized (D D V and V). A special sixth value, VC, can be used to tag the term as Concept (see 4.1) without setting any validation status.
COMMENTARY username | Validation justification comment | One column per user
OCC | Number of occurrence
9. [Screenshot residue: semantic class tree with nodes such as RNA modification, DNA insert, RNA synthesis, DNA polymerase and protein folding]

4.2.4.2 Cooperative work and concurrent modification

Since several users can work at the same time on the same terminology project structure, it may happen that they wish to change the same data independently. The tree view might then not be synchronized with the actual data stored in the database. So if a user is about to modify data that has been changed by another user after the displayed data was read from the database, the user will be warned via a specific dialog and the modification will not occur.

[Dialog: ConcurrentAccessException. "Cannot perform this modification. Data has been changed by another user. Reload and retry." OK]

Moreover, the tree view will be refreshed to display the new data state, but in some cases you may need to update the entire view with the refresh button.

4.2.5 Adding a term

When shaping a terminology, it sometimes happens that some level of the hierarchy cannot be embodied by any already available term, because the term is not found in the corpus or for some reason has not been detected by the term extractor. It is therefore possible to manually create a new term in a terminology project with the dedicated button available in the Term Grid toolbar. To create a new term, the user needs to enter the term properties in the dialog box, then click the Ok button. Creat
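The rejection of stale modifications described in 4.2.4.2 can be illustrated with a minimal optimistic-concurrency sketch. The guide does not say how TyDI detects that data has changed, so the version counter, the TermStore class and its methods below are hypothetical and only show the general idea; only the exception name and the message come from the dialog shown in the guide.

```python
class ConcurrentAccessException(Exception):
    """Raised when a record changed in the store after the client read it."""


class TermStore:
    """Toy in-memory store (hypothetical); each record carries a version counter."""

    def __init__(self):
        self._rows = {}  # term_id -> (version, value)

    def add(self, term_id, value):
        self._rows[term_id] = (1, value)

    def read(self, term_id):
        # Returns (version, value); the client keeps the version it has seen.
        return self._rows[term_id]

    def update(self, term_id, read_version, new_value):
        current_version, _ = self._rows[term_id]
        if current_version != read_version:
            # Someone else modified the record since it was read: refuse the change.
            raise ConcurrentAccessException(
                "Data has been changed by another user. Reload and retry."
            )
        self._rows[term_id] = (current_version + 1, new_value)


store = TermStore()
store.add(932, "illegal cod")

version_seen_by_a, _ = store.read(932)  # user A reads the record
version_seen_by_b, _ = store.read(932)  # user B reads the same record

store.update(932, version_seen_by_b, "illegal cod fishing")  # B saves first
try:
    store.update(932, version_seen_by_a, "illegally caught cod")  # A is now stale
except ConcurrentAccessException as exc:
    print("rejected:", exc)
```

After such a rejection, the client has to read the record again before retrying, which matches the "Reload and retry" advice of the dialog.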
10. button or discard them with the Cancel button. Clicking the New button creates a new named connection. Clicking the Delete button deletes the currently selected named connection. Your application administrator should have given you the parameters for your specific Datasources. See the TyDI Admin Guide for more details.

Tip: TyDI is shipped with default Datasources. Removing all existing Datasource configurations and restarting TyDI will restore them.

6.2 External link to web browsers

When a term candidate is selected, it is possible to quickly search for its surface form by launching an external web browser. It is possible to add a new search engine via a dedicated option panel. To open the external links editing window: Tools > Option > Term Validation category > External links tab.

[Illustration 27: External links option panel, showing the list of links (Google, Google Scholar, PubMed, Wikipedia) with delete, new, add default links, up and down buttons, and the fields Name, Url and example with a test button]

The list located at the top of the window contains the existing external links to web browsers. To edit an external link, first select it in the list and then make the change in the fields below. Confirm th
11. [Illustration 21: context-enabled actions in the Semantic class window]

The available actions are summarized below.

Synchronize with selection | Toggle button used to freeze the view to the currently selected term(s) of the term grid. By default the view is always synchronized with the current term selection.
Create class | Create a new semantic class containing all selected terms within the semantic class window.
Remove class | Remove all selected classes.
Show class | Show in the view the selected class only. Useful to navigate through the class-to-class links. Note: this action is triggered by a double click on a class-to-class link.
Classes fusion | Merge the two selected classes; the resulting class contains the union of the terms of the source classes. It is also linked to the classes that were linked to the source classes (hyper, hypo and antonyms).
Remove
12. file nodes | list of individual files contained in a corpus | None
User node | only one node for the currently connected user | Change password, Change user right

3.1 Connection

The connect dialog allows the user to choose a database and a user profile at the same time. A correct password for the application is required to connect to the database. At a given time the application is connected to a single database, but a database can contain several terminology projects that can be opened simultaneously.

[Illustration 5: connection dialog, with the fields Connection (Termino_PROD), Username (fred) and Password, and a Cancel button]

Note: the Connect and Disconnect commands are located under the File main menu; hence the user can disconnect from a database and connect to another without closing the application. These commands can also be found in the context-sensitive menu under the Term database node in the Project window.

3.2 User profiles management

Most of the time, a change in the data is recorded along with the identification of the user who issued it. It is thus strongly recommended to create one user profile for each person taking part in the terminology building process. User profiles are stored in the database, so that one user can potentially work on any terminology project of the database, provided the user profile has been explicitly given the rights to view and work on that project by an application administrator.
13. query to the database with the current values of the criteria. Shortcut: Enter key.
o Maskable incremental search bar: type a text in the text field; the first term containing this text will be selected in the grid. You can browse forward and backward among the matching terms using the arrow buttons. The Ctrl+F keyboard shortcut opens the search bar. Clicking on the cross on the right side closes it.
o Multi-validation button: opens a specific dialog to set the validation status and an optional comment for all the currently selected terms in a single action.
o External search button: launches a search of the selected candidate surface form within your favourite web browser. The available search engines can be parameterized in a dedicated window (see 6.2).
o Context button: opens or refreshes the context window displaying the occurrences of the selected candidate within the corpus.
o Term link button: opens or refreshes the Semantic class and Term link window to display the classes containing the selected terms.
o Create class button: creates a new semantic class containing all selected terms.
o Create a new term.
o Terminology export button: allows performing two distinct types of export: 1. Total image export: exports to a text file (tab-separated values) the terms displayed in the grid, in the order they are displayed, with all the columns visible in the grid. 2. Term and POS tag list in TreeTagger form
14. room for other windows. Or pinning it down in order to have it always displayed. Finally, you can undock any of the top-level windows if you prefer to work with independent windows, via the context-sensitive menu in the window title bar or by dragging out the title of the window.

2.2 Application general presentation

The Terminology Design Interface client is a graphical user interface composed of several top-level windows, which can be reorganized at the user's will. By default, top-level windows appear docked within the main window workspace (top-level windows have been coloured in the picture below).

[Screenshot residue: TyDI main window with the Projects Window (Term Database node, project, corpus and user nodes) and a Term grid window whose filter panel shows the criteria Form, Lemma, Syntactic category, Head, Expansion, Prevalidation, Justification, Validation, Word count, Nb occurrences and producer, and the options show inferred terms, show dismissed terms and show canonic only]
15. the extractor but has been regrouped with others under a merged representative (optional post-YaTeA processing)
Nb of occurrences | Number of occurrences of the form within the corpus
Justification | Validation comment
Word count | Number of words in the form
Number of documents | Number of distinct documents where the candidate is found
Producer | Processing or user who created the term
is Canonical | true if the term has been chosen as canonical representative of a semantic class

[Availability markers per import format not recoverable. Legend: feature imported from input file; feature not available (X); feature not imported but computed or set by the user within TyDI]

7.3 References

YaTeA: S. Aubin and T. Hamon. Improving Term Extraction with Terminological Resources. In Advances in Natural Language Processing, 5th International Conference on NLP, FinTAL 2006. http www lipn univ paris 13 fr aubin yatea_en html ; Cpan Org YaTeA 0 5
TreeTagger: http://www.ims.uni-stuttgart.de/projekte/corplex/TreeTagger
FastR: Jacquemin C. A Symbolic and Surgical Acquisition of Terms Through Variation. In Connectionist, Statistical and Symbolic Approaches to Learning for NLP, Wermter S., Riloff E. & Scheler G. (eds.), pp. 425-438, Springer-Verlag, 1996. http://www.limsi.fr/Individu/jacquemi/FASTR
OBO-Edit: Day-Richter J, Harris MA, Haendel M, Gene Ontology OBO-Edit Working Group, Lewis S. OBO Ed http://oboedit.org
16. [Illustration 22: FastR variants filter panel]

The table below contains a short description of the features that can be used as filter criteria.

Feature | Description
Form | Surface form of any term (the origin term or the variant suggested by FastR)
Not in any semantic class | Check this box to retrieve only terms that are not already part of a semantic class
Representative only | Check this box to retrieve only terms that are representative of a semantic class
Term producer | Processing or user who created the term
Delta string | String difference between the origin term and the variant
Delta word count | Number of words contained in the delta string
Link producer | Processing that created the variation term link
FastR rule | Rule used by FastR to discover the variant

4.4.2.2 FastR variants grid

Each column composing this grid is related to one of three distinct objects:
o the variant term (id, form, producer and validation status), on the left of figure 23
o the variation link (rule, nb word, delta string and producer), in the middle of figure 23
o the origin term (id, form, producer and validation status), on the right of figure 23

Hence the currently selected term can be either the variant or the origin term, depending on which cell in the table has the focus, unless you performed a multiple selection by dragging a rectangular zone over the grid.
17. 4.5.2 Validation modes

There are two distinct validation renderings available, which should be set as a project parameter: drop-down list and radio button. With radio buttons the status label is not displayed, but this is a quicker way to validate long series of terms.

[Illustration 15: drop-down list based validation (blind mode); grid with columns Id, Surface form, Validation, Syntactic category and Nb occur]

[Illustration 16: radio button based validation (blind mode); grid with columns Id, Surface form, Validation, Syntactic category and Nb occur]

Moreover there are two distinct val
18. DI scenario document for more details on TyDI practical goals. The application architecture follows the client-server model. The server side is mainly in charge of data storage using a relational database and is described in the TyDI Admin Guide. The client side is a graphical user interface that is detailed in this document. If TyDI is not already installed on your platform, please follow the installation procedure described in chapter 5.

2 General presentation

2.1 Process description

The term validation process is the following: the user browses a list of term candidates and assigns a validation status to each one; the status can range from rejected to fully approved. The candidate list is provided by a third-party application such as a corpus-based term extractor like YaTeA. The terminology structuring process is the following: the user assigns synonymy and hyperonymy relationships to pairs of terms. In both cases, TyDI provides many facilities for selecting and displaying terms that share common properties such as morphology, so that the validation or structuring actions for given terms can be derived from the observation of other similar terms.

When opened, the windows appear docked at their favourite position within the application main window. A simple drag and drop moves them to another site. There are also two special buttons in the title bar of the top-level windows. Alternatively, sliding the window makes
19. TyDI: Terminology Design Interface, User Guide, version 0.3e, 2011-10-20. Copyright INRA MIG 2009, 2010, 2011.

Table of contents
1 [illegible]
2 General presentation
2.1 Process description
2.2 Application general presentation
2.3 [illegible]
3 [illegible]
3.1 Connection
3.2 User profiles management
3.2.1 User profiles creation and modification
3.2.2 [illegible]
3.3 Creating a terminology [illegible]
3.3.1 Importing data into projects
3.3.2 Importing multiple extraction results
3.4 Term candidate selection
3.4.1 [illegible]
3.4.2 [illegible]
3.4.3 Candidate table
3.4.4 Displaying candidate features
3.4.5 [illegible]
3.5 Toolbar summary
4 Advanced usage
4.1 Optional term features: concept, pseudo-term
4.2 Terminology structure [illegible]
4.2.1 Semantic class view description
4.2.2 Adding links between terms and classes
4.2.3 Removing links among terms and classes (and more)
4.2.4 Semantic class tree view description
4.2.5 Adding a term
4.3 Term Grid [illegible]
[further entries illegible]
20. a subset of term candidates, the user assigns term feature values to the criteria shown in the Filter panel. Note: depending on the data that was imported to create the project, some criteria are not available for candidate selection (greyed field). You can perform approximate filtering by using special wildcard characters in the text fields: * will match any string of any length strictly more than zero; ? will match any single character. If several criteria are specified, the term candidates that are retrieved will match all of those criteria (logical AND operator).

Tip: to give more space to the candidate table, it is possible to reduce the panel with the splitter widget. Dragging the splitter bar resizes the panels on both sides of the splitter. The expand and reduce buttons (triangular buttons at the left side of the splitter) quickly expand or reduce the panel.

[Screenshot residue: header of the filter panel with the ignore case option]

There is a reduced panel at the top of the filter panel to indicate whether searches should be case sensitive or not (case is ignored by default). It also contains a button to reset the filter panel. The table below contains a short description of the term features that can be used as filter criteria.

Feature | Description
Form | Surface form of the term candidate as it is found in the corpus
Lemma | Lemmatized form of the candidate
Syntactic category | Part-of-speech tag (POS tag)
Head | Head f
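The wildcard rules of the filter panel can be sketched as a small translation to regular expressions. This is only an illustration under stated assumptions: the guide does not say whether a pattern must match the whole field or only part of it (whole-field matching is assumed here), and the function names and the sample candidates are hypothetical, not part of TyDI.

```python
import re


def wildcard_to_regex(pattern: str, ignore_case: bool = True) -> re.Pattern:
    """Translate a filter pattern into a compiled regular expression.

    '*' matches one or more characters (never the empty string),
    '?' matches exactly one character, anything else is literal.
    """
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".+")
        elif ch == "?":
            parts.append(".")
        else:
            parts.append(re.escape(ch))
    flags = re.IGNORECASE if ignore_case else 0
    return re.compile("".join(parts), flags)


def matches_all(candidate: dict, criteria: dict, ignore_case: bool = True) -> bool:
    """Return True only if every criterion matches its field (logical AND)."""
    return all(
        wildcard_to_regex(pat, ignore_case).fullmatch(candidate.get(field, "")) is not None
        for field, pat in criteria.items()
    )


candidates = [
    {"form": "illegal cod", "head": "cod"},
    {"form": "illegal fishing", "head": "fishing"},
    {"form": "legal quotas", "head": "quotas"},
]
criteria = {"form": "illegal *", "head": "?od"}
selected = [c["form"] for c in candidates if matches_all(c, criteria)]
print(selected)
```

Because * requires at least one character, a pattern such as cod* would not select the bare form "cod" under this reading.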
21. at of the currently displayed terms, which is used as input to FastR processing.
o Local filter button: allows defining a local filter by specifying a regular expression to be applied to one of the visible columns (see 4.3).
o Apply local filter toggle button: allows quickly enabling or disabling the local filter, if defined.
o Rows count field (for example "908 rows"): shows the total number of term candidates currently displayed, excluding those filtered out by the local filter.

3.4.3 Candidate table

The list of candidates corresponding to the criteria set in the filter panel is displayed in the term candidate grid. This table is the central widget used to navigate through term candidates.

[Screenshot residue: term candidate grid with columns Id, Surface form, Nb, Head, Expans, Head, Expansion, Syntactic c, Producer and one validation column per user (Fred); sample rows include 932 "illegal cod" and 4989 "illegal fishing"]
22. ation status is. If the non-validated status is selected, the user can choose to export inferred, unparsed or dismissed terms as well. The produced columns are:
o Term (2 columns): lemmatized form of the term, surface form of the term
o Synonym (2 columns): lemmatized form of the synonym, surface form of the class representative
o Quasi-synonym (2 columns): lemmatized form of the quasi-synonym, surface form of the class representative
o Hyponym (2 columns): surface form of the hyponym, surface form of the hyperonym
o Merged (2 columns): lemma of the merged term, lemma form of the representative term
o Typographic variant (N columns): surface form of variant 1, ..., surface form of variant N
o Acronym (2 columns): surface form of the merged term, surface form of the representative term

Note: if a term has no lemma available, the surface form will be used instead.

4.1.2 OBO flat file export

The OBO flat file export utility exports semantic classes, including synonyms and hyponymy relationships, in the OBO-Edit file format.

[Screenshot residue: "Export Term project as OBO flat file" wizard (step 1 of 1), with the fields Export file, Encoding (UTF-8), Namespace, Semantic Classes, Terms, status filter and User's priority list with Up button]

It is possible to choose whether the produced file includes semantic classes and/or simple terms. Specific synon
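The lemma fallback stated in the note on the text export ("if a term has no lemma available, the surface form will be used instead") can be sketched as follows. The dictionary keys, the helper name and the tab separator are assumptions made for the illustration; the guide does not describe the exported file at this level of detail.

```python
import csv
import io


def term_columns(term: dict) -> list:
    """Build the two 'Term' columns: lemmatized form, then surface form.

    Falls back to the surface form when the lemma is missing or empty.
    """
    surface = term["surface_form"]
    lemma = term.get("lemma") or surface
    return [lemma, surface]


terms = [
    {"surface_form": "illegal cod landings", "lemma": "illegal cod landing"},
    {"surface_form": "UNCLOS art", "lemma": None},  # no lemma available
]

buffer = io.StringIO()
writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
for term in terms:
    writer.writerow(term_columns(term))
print(buffer.getvalue(), end="")
```

The second row repeats the surface form in both columns, which is how a missing lemma would show up in such an export.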
23. [Illustration 12: Term candidate grid]

Tip: TyDI remembers the candidate terms previously selected in a grid and allows navigating backwards and forwards with the two arrow buttons located on the term navigation toolbar.

The table displays most of the term features in distinct columns. It also displays additional columns to render the candidate validation status set by each user recruited as validator for the current project. The validation columns display the validation status and the optional justification comment. The specific rendering of the validation status is a project-specific property (in V0.2: list box and radio button rendering).

3.4.3.1 Table visual settings

The table visual organisation is very flexible and can be adapted to your preferences:
o Clicking and dragging the vertical boundaries of a column header resizes the columns.
o A simple drag and drop of a column header reorders the columns.
o Columns can be hidden and restored with a specific dialog box, opened by a click on the top right corner of the table or via the context-sensitive menu on any column header.
o Rows can be sorted following any of the columns by a simple click on the column header: first click on the header to perform ascending sort on the col
24. class-class link | Remove the selected class link(s).
Show term classes | Show in the view the classes containing the currently selected term.
Change synonym | Change the type of the term in the context of the class. The available types include class representative, synonym and quasi-synonym.
Remove synonym | Remove the selected terms from the class. Note: the class representative cannot be removed from the class.
Show linked term classes | Show in the view the classes containing the currently selected linked term. Note: this action is triggered by a double click on a linked term.
Remove term-term link | Remove the selected term-term link(s).
Expand | Expand the selected node(s) by one level in depth. Note: if a selected node is already open, it will be expanded in depth down to its leaves.
Collapse | Collapse the selected node(s).

4.2.4 Semantic class tree view description

This window displays in a single view the global hyperonym/hyponym hierarchy of a terminology. Drag and drop gestures are used within this view to create or delete hyperonymy relations between classes.

Tip: several Semantic Classes views can be opened at the same time, on the same or on distinct projects.

[Screenshot residue: Semantic Classes Tree Window showing a class hierarchy with nodes such as Subtilis Molecular Biology Concept, Subtilist functional classification, cell envelope and cellular process, and cell wall]
25. e button face contains a purple exclamation mark.

[Illustration: Drop-down list based validation, blind mode]

Note: The tooltip of the comment button contains the text of the comment. Hence the comment can be read simply by pointing at the button with the mouse and waiting a few seconds for the tooltip to appear. This is especially useful for reading other users' comments in cooperative mode.

3.5 Toolbar summary

[Illustration 18: TyDI toolbars]

The actions that can be performed within TyDI depend on the current selection; hence some buttons on the toolbars may be disabled. Here is a quick summary of the available actions:
- Term navigation (backward and forward over the selected terms)
- Project statistics
- Project export (text and OBO format)
- FastR result import in an existing project
- Import extractor result in an existing project (YaTeA format or tab-separated values)
- Import extractor result in a new project (YaTeA format or tab-separated values)
- Import text file in an existing project (e.g. synonym, typo variant, hyponyms)
- Open a new term search window
- Open a new FastR link exploration window
- Open a new Semantic Class Tree
- Change current user password
- Edit user authorisations
- Edit user profiles

4 Advanced usage

4.1 Optional term features (concept, pseudo-term)

Independently of the validati
26. e changes by clicking on the Save button, or discard them with the Cancel button. Clicking on the New button creates a new external link. Clicking on the Delete button deletes the currently selected external link.

In order to perform the search on the surface form of the selected term candidate, the URL must contain a specific placeholder (%s), which will be replaced by the actual surface form. You can test the URL you entered by typing a search string in the example field and clicking the Test button.

Note: new users will not have any external link configured. Some default external search links can be quickly added by clicking on the Add default links button.

6.3 Memory allocation

The maximum memory size allocated by the JVM is a parameter in the application configuration file (see TyDI Admin guide). The default configuration allows allocating up to 512 MB of memory. If this amount is not adequate, it is possible to override the default value by adding a specific argument on the command line. For example, to run with 1 GB of memory, type: tydi -J-Xmx1024m

6.4 Look and Feel

TyDI is a Swing application, hence its GUI supports pluggable Look and Feels (LAF). Since TyDI has been designed with the cross-platform LAF called Metal, you may experience some subtle visual flaws if another LAF is used (see TyDI Admin guide for more details).

6.5 OS specificity

Even if the application is portable, there
27. e extended rights. They can:
- Create new user profiles or modify existing ones.
- Grant or revoke the right of users to work on given terminology projects.
- Create new terminology projects or import term extraction results into existing ones.

3.2.2 User authorizations

The user authorisation window is opened by clicking on the button located in the User toolbar. This command can also be found in the context-sensitive menu under the User node in the Project window.

Note: user authorisation management is only granted to Application administrators.

[Illustration 7: User authorisations window, listing for project AO1H_jan09 each user (Id, Name) with a Validating check box]

The drop-down list box located at the top of the user authorisation window displays the terminology projects in the current database. When a project is selected, the table below is refreshed. For each existing user profile, the table displays an editable check box indicating whether the corresponding user is granted the right to work on the project. A granted user can perform term validation and create semantic classes and term links. To change the rights of a specific user, just click on the corresponding check box.

3.3 Creating a terminology project

Just after creation, the database is empty. Terminology projects are created by importing terms, such as candidate terms output by a term extracto
28. [Illustration 14: Occurrence in context window]

The context window table highlights occurrences of a given term candidate within the corpus text. The table contains one row per sentence containing an occurrence. The columns display the name of the source file, the sentence rank within the file, the number of occurrences in the sentence, and the sentence text with highlighted occurrence (multiple occurrences present within the same sentence a
29. e term: Surface form, Lemma, word count, Syntactic category, Head, Expansion, is Concept.

Note: Term creation should be used sparingly.

4.3 Term Grid Local filter

The term grid local filter is a second level of filtering, applied on top of the filter criteria panel. It is used to temporarily hide some terms in the term grid, and it can be quickly enabled or disabled with a dedicated toggle button. It is a very powerful tool that can be combined with a first-level selection criterion to refine the list of terms displayed in the grid.
- The Local filter button lets you define a local filter by specifying a regular expression that is applied to one of the visible columns.
- The Apply local filter toggle button quickly enables or disables the local filter, if one is defined.

Note: it is called a local filter because it does not query the database each time the filter is modified or applied, which is why it is fast.

4.3.1 Regular expressions

The description of regular expressions is beyond the scope of this document. For more information see http://en.wikipedia.org/wiki/Regular_expression

Briefly, regular expressions work similarly to the wildcard characters used in the text fields of the filter panel (as described in 3.4.1): the regular expression is tested against each row of the grid, and a row is shown only if the expression matches. Regular expressions include constructs other than wildcards that are useful to express mo
30. e three kinds of link:
- links that group terms sharing the same meaning, hence building semantic classes;
- links between semantic classes, corresponding to ontological and semantic relations;
- links between terms, corresponding to semantic relations based on morphosyntactic transformations.

A semantic class is defined as a set of terms. The role of a term in a class can be of three distinct types:
- class representative: there is always one and only one representative per class, and a term can be the representative of a single class only. The name of the class is the surface form of the representative.
- synonym: for terms having the same meaning as the representative with respect to the application need. It is a transitive relation.
- quasi-synonym: for terms having a meaning close to that of the class representative in a certain context only (non-transitive relation).

Semantic classes can also be related to each other:
- Hyponymy: linked classes are related by a general/specific (is-a) relation, i.e. hyperonymy (directed, asymmetrical link).
- Antonymy: linked classes have opposite meanings (undirected, symmetrical link).

There are at least four types of link between terms, including FastR variants:
- Typographic variant relation: link used, for example, to bind misspelled forms of the same term.
- Acronym: link between the acronym and its extended form (directed, asymmetrical l
31. ects to work efficiently on a project, since it is possible to drag a class from the first view and drop it in the second one, which allows creating relations between widely separated classes.

[Screenshot: two Semantic Classes Tree windows side by side on the Microbio2 project, showing the "Subtilis Molecular Biology Concept" hierarchy]
32. [Illustration 24: FastR variants graphical view after manual rearrangement]

A grey box surrounds terms linked to only one other term. Otherwise the boxes are green, and the most linked terms are represented in bigger boxes; the biggest box is usually the best term representative for a synonymy class. Boxes can be moved by a simple dragging gesture. Linked terms can be automatically rearranged around the currently selected box with a right click. As in any other view, terms can be selected in this view; the available actions are a subset of the ones described before.

Note that the terms belonging to the same subgraph are not necessarily synonyms. For example, "conifer somatic embryo" is a kind of "conifer embryo". Likewise, "conifer pre-cotyledonary somatic embryo", "conifer cotyledonary somatic embryo" and "conifer mature somatic embryo" are different specific kinds of "conifer somatic embryo". Of course, not all FastR candidate term proposals are valid; for example, "embryo are conifer" is obviously not a term.

4.5 Modular text import utility

Important note: This functionality is currently not available when using Web Service access.

Sometimes a terminology project is built up from distinct resources. The modular text importer can enrich an existing project in TyDI. Three distinct categories of data can be impo
33. [Illustration 3: Undocked top-level windows]

3 Basic usage

Since all the te
34. idation modes, depending on a project parameter:
- blind mode: the current user validates on their own, without seeing the validations performed by other users;
- cooperative mode: the current user can see the validations performed by other users, if any.

In cooperative mode, the table displays one column per user participating in the project; the column headers contain the corresponding user names.

[Illustration 17: Drop-down list based validation, cooperative mode]

3.4.5.3 Validation justification

If necessary, users can write a free-text comment as a validation justification, or as a way to qualify terms for further processing (e.g. segmentation problem, OCR error, incomplete named entity).

Clicking on the button located on the left side of the validation widget opens the comment edit window. When no comment is set, the button face is empty. When a comment is set, th
35. ink).
- Variant relation: variant relation as proposed by the FastR tool. These links are not editable by users (read only; directed, asymmetrical link).
- Translation: synonymy link between terms in different languages.

Note: in the case of a directed link, the link icon contains a small arrowhead indicating the direction of the link. For example, in the screen capture below, "seed of corn" is a hyponym of "seed".

[Illustration 19: Semantic class window]

- The Term link button opens or refreshes the Semantic class and Term link window.
- The expand/collapse button expands or collapses the selected nodes.

4.2.2 Adding links between terms and classes

We have seen that it is easy to create a semantic class from a selection of terms in the term grids: clicking on the Create class button creates a new semantic class containing all selected terms. New terms are then added to existing semantic classes by drag-and-drop gestures. Dragging can be initiated either from a term grid or from the semantic class view itself. There are three types of drop targets (grey arrows in the image below), corresponding to the three possible kinds of lin
36. it: an ontology editor for biologists. Bioinformatics. 2007 Aug 15;23(16):2198-200. Epub 2007 Jun 1
37. [Illustration 1: TyDI main windows]

The most common top-level windows are:
- One project window (in blue in the screen capture), displaying all projects that are visible to the current user in the current database.
- As many term grids (in yellow) as needed, displaying a selection or all candidate terms of a specific project in tabular form. Validation is performed through this screen.
- One property sheet (in green), a general-purpose window displaying detailed information about the currently selected item (e.g. term, project, corpus, link).
- One context window (in grey), presenting the occurrences of a specific term in its corpus context.
- One term link window (in red), displaying the semantic class and the
38. k, as described above.

[Illustration 20: Drop targets in Semantic class window]

- Class node: dropping a term on a class (1) of the semantic class view creates a synonymy link between the term and the other terms of the class.
- Class relation node: dropping either a class dragged from the semantic class view, or a representative term dragged from a term grid, on the class structure icon (2) adds a link between the two semantic classes. The link is a hyperonymy, hyponymy or antonymy link, as proposed by the scrolling menu.
- Term node: dropping a term dragged from a term grid or from the semantic class view on a term node (3) adds a link between the two terms. The link is a typographic variant, an acronym or a translation link, as proposed by the pop-up menu.

Note that the menu appears once the mouse button has been released. Pressing the Esc key or dragging the mouse pointer out of the menu cancels the action.

4.2.3 Removing links among terms and classes (and more)

Link deletion can be performed by selecting the corresponding node in the Semantic class view and clicking on the relevant button in the toolbar.

[Screenshots: Semantic Classes of term "seed of corn"]
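The class roles and drop targets described in this part of the guide can be summarised in a small data model. The following is a hedged Python sketch, not TyDI's actual code: the names Role, DROP_TARGET_LINKS, SemanticClass, drop_term and remove are hypothetical, and only the rules they encode (one representative per class, the class named after its representative, the representative not removable, the link kinds offered per drop target) come from the guide.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    REPRESENTATIVE = "class representative"
    SYNONYM = "synonym"
    QUASI_SYNONYM = "quasi-synonym"


# Link kinds proposed for each of the three drop targets.
DROP_TARGET_LINKS = {
    "class node": ("synonymy",),
    "class relation node": ("hyperonymy", "hyponymy", "antonymy"),
    "term node": ("typographic variant", "acronym", "translation"),
}


@dataclass
class SemanticClass:
    representative: str                          # its surface form names the class
    members: dict = field(default_factory=dict)  # surface form -> Role

    def __post_init__(self):
        # A class always holds exactly one representative.
        self.members[self.representative] = Role.REPRESENTATIVE

    def drop_term(self, term, role=Role.SYNONYM):
        """Mimic dropping a term on the class node."""
        if role is Role.REPRESENTATIVE or term == self.representative:
            raise ValueError("a class has one and only one representative")
        self.members[term] = role

    def remove(self, term):
        """Mimic the 'Remove synonym' action."""
        if term == self.representative:
            raise ValueError("the class representative cannot be removed")
        del self.members[term]


# Illustrative usage with terms taken from the guide's screenshots.
seed_class = SemanticClass("seed of corn")
seed_class.drop_term("seeds of corn")
seed_class.drop_term("corn seeds", Role.QUASI_SYNONYM)
```

Keeping the roles in a single mapping keyed by surface form makes the "one representative" rule easy to enforce, at the cost of not modelling directed links between terms, which would need a separate structure.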
39. links where the selected term appears.

Three toolbars are located just below the menu bar, where many action buttons are available. From left to right: Term navigation toolbar, Project toolbar and User toolbar.

[Illustration 2: TyDI toolbar]

2.3 Managing windows

Window management inside the application is very flexible; in particular, users can customize the layout of the windows as they like.

[Screenshot: TyDI main window with two term grids (free search on the Environment project) and their filter panels]
40. m grid window is used to display all or a selection of the term candidates associated with a project. It is composed of two main panels and a toolbar:
- the filter criteria panel (upper part) is used to limit the number of term candidates retrieved from the database by setting some criteria;
- the toolbar contains buttons to perform commands on selected candidates;
- the candidate table (lower part) displays the list of candidates matching the criteria set above it.

The validation and the structuring of terms are based on the examination of close terms. Closeness is mainly based on morphology criteria, frequency, linguistic properties, other users' opinions and context. The filter criteria panel is therefore used to filter the term candidates displayed in the candidate table, so that the toolbar commands can be performed on the selected candidates.

3.4.1 Filter panel

A typical terminology project can contain several thousand term candidates, and it is usually not useful to display them all at once, not to mention that it can take some time to retrieve them from the database.

[Illustration 10: Filter panel criteria]

For selecting
41. minology project and the kind of usage (for example, several projects opened at the same time), the amount of memory needed can vary (see 6.3).

5.2 Client installation

5.2.1 OS-specific installer

Depending on the OS you are using, you may download one of the available installers:
- http://bibliome.jouy.inra.fr/TyDI_updateCenter/downloads/tydi_latest-linux.sh
- http://bibliome.jouy.inra.fr/TyDI_updateCenter/downloads/tydi_latest-macosx.tgz
- Windows: http://bibliome.jouy.inra.fr/TyDI_updateCenter/downloads/tydi_latest-windows.exe

Once downloaded, execute the installer and follow its instructions.

5.2.2 Generic zip archive

Alternatively, a generic zip distribution is available. Download the zip distribution at http://bibliome.jouy.inra.fr/TyDI_updateCenter/downloads/tydi_latest.zip and extract it; this creates a subdirectory named tydi. Then launch the application. If you are using the MS Windows operating system, start the application by executing bin\tydi.exe, located under the newly created directory. If you are using a Unix-like or Mac operating system, start the application by executing bin/tydi, located under the newly created directory.

Once the client is installed, you need to set up a database connection (see 6.1).

5.3 Client update

As of v0.3, TyDI can keep itself up to date by downloading and installing newer modules. Checking for a new version is performed at every start-up of the application.
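As an illustration of the launch step, the sketch below builds the command line for the launcher of a zip installation. It is a minimal Python example under stated assumptions: the helper name tydi_command is hypothetical, the launcher locations (bin\tydi.exe on Windows, bin/tydi elsewhere) follow the text above, and the memory argument is assumed to be spelled -J-Xmx<size>m, as in the 1 GB example of section 6.3.

```python
import platform
from pathlib import PurePosixPath, PureWindowsPath


def tydi_command(install_dir, max_heap_mb=None, system=None):
    """Return the argument list that would start TyDI from install_dir.

    system defaults to the current platform name ("Windows", "Linux", ...).
    max_heap_mb, if given, adds the assumed JVM memory override argument.
    """
    system = system or platform.system()
    if system == "Windows":
        launcher = PureWindowsPath(install_dir) / "bin" / "tydi.exe"
    else:
        launcher = PurePosixPath(install_dir) / "bin" / "tydi"
    command = [str(launcher)]
    if max_heap_mb is not None:
        # Assumed spelling of the memory argument (see section 6.3).
        command.append(f"-J-Xmx{max_heap_mb}m")
    return command


linux_command = tydi_command("/opt/tydi", max_heap_mb=1024, system="Linux")
windows_command = tydi_command(r"C:\tydi", system="Windows")
```

The resulting list can be passed to subprocess.run to start the client; the default heap size applies when max_heap_mb is left unset.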
42. n | Matching strings
DNA | any string strictly equal to DNA
.*DNA.* | any string containing DNA
.*(DNA|RNA).* | any string containing DNA or RNA
.*[DR]NA.* | any string containing DNA or RNA
.*\. | any string finishing with a period
.*[,;:.].* | any string containing at least one punctuation mark amongst comma, semi-colon, colon, period
.*[^s]s | any string finishing with one and only one s
[A-Z]+.* | any string beginning with at least one capital letter
.*[0-9]+.* | any string containing at least one decimal character
.*\d+.* | idem (simplified form)

4.4 Term variants

Most of the time, a term occurs in various distinct forms. Depending on the purpose of the terminology design, the user might want to group variants corresponding to validated terms. TyDI makes it easy to exploit the result of a specific variant detection tool called FastR, in order to enrich a terminology project with term variants that might not have been discovered by the term extractor, and to link these variants to one representative term.

4.4.1 Variant discovery using FastR

This is a three-step procedure:
1. FastR must be fed with a certified term list and a corpus. The term list can consist of the terms validated through TyDI and exported in the relevant format (see Terminology export in 3.4.2).
2. Import the FastR result file into TyDI.

Important note: This functionality is currently not available when using Web Service access.
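The local filter examples of section 4.3.3 can be tried outside TyDI. The sketch below is a hedged illustration in Python: TyDI is a Java application, so its regular expression dialect may differ in details from Python's re module, and the patterns are reconstructed from the descriptions in the examples table rather than copied from the application. The function name local_filter and the sample cells are assumptions. Because the guide states that a local filter is anchored at both ends of the cell content, re.fullmatch is used.

```python
import re

# Patterns reconstructed from the descriptions of section 4.3.3 (assumption).
EXAMPLE_PATTERNS = {
    "DNA": "strictly equal to DNA",
    ".*DNA.*": "contains DNA",
    ".*(DNA|RNA).*": "contains DNA or RNA",
    ".*[DR]NA.*": "contains DNA or RNA (character class form)",
    r".*\.": "finishes with a period",
    ".*[,;:.].*": "contains a comma, semi-colon, colon or period",
    ".*[^s]s": "finishes with one and only one s",
    "[A-Z]+.*": "begins with at least one capital letter",
    ".*[0-9]+.*": "contains at least one decimal character",
    r".*\d+.*": "same, simplified form",
}


def local_filter(cells, pattern):
    """Keep the cells whose whole content matches pattern.

    fullmatch mimics the implicit anchoring at the beginning and
    at the end of the line described for the local filter.
    """
    compiled = re.compile(pattern)
    return [cell for cell in cells if compiled.fullmatch(cell)]


# Hypothetical column content.
CELLS = ["DNA", "DNA insert", "RNA polymerase", "seeds", "glass",
         "EU Member State", "16S rRNA"]

only_dna = local_filter(CELLS, "DNA")
containing_dna = local_filter(CELLS, ".*DNA.*")
single_final_s = local_filter(CELLS, ".*[^s]s")
with_digit = local_filter(CELLS, r".*\d+.*")
```

Note that without the leading and trailing ".*" a pattern such as DNA only matches cells that are exactly "DNA", which is the behaviour the first row of the examples table describes.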
43. [Illustration: Semantic classes tree window]

The window is divided in two distinct areas:
- a toolbar at the top, which displays the current terminology project (it can be changed), a refresh button to read the data anew from the database, and a search field (use Ctrl+F as a shortcut, and the Enter key to find the next occurrence);
- a panel which displays the hyperonym/hyponym tree.

The tree can contain 4 different types of node, plus a unique root node, corresponding to the semantic classes contained in the project, as described below:
- Root node: this special node has no name, but it displays the total number of rooted classes.
- Rooted lonely class: class without any hyperonym or hyponym.
- Rooted hyperonym: class without any hyperonym but with associated class hyponym(s).
- Leaf hyperonym class: class associated to hyperonym(s) but without any hyponym.
- Hyponym and Hyperonym class: class associated to both hyperonym(s) and hyponym(s).

The label associated with the nodes is the surface form of the representative of the class. If the font used for this label i
44. ndidate to be the label of a concept in ontology
is Dismissed | True if the term is detected by the extractor but has been filtered out (optional YaTeA post-processing) | None
Producer | Processing or user who created the term | None
Is pseudo Term | Set to true by the user if the term candidate is not a member of the target terminology but should be kept as an alternative form in a semantic class for indexing purposes | None
Unparsed | True for YaTeA unparsed phrase | None
Validation | Validation status and comment | None

Note: if the cooperative validation mode is set for the project, one validation column per user is displayed (see 3.4.5.2).

3.4.4 Displaying candidate features

3.4.4.1 Property sheet

The property sheet can be opened with the command located in the Windows main menu. The property sheet is a general-purpose view that displays, in tabular format, information about the selected elements. It is especially useful for displaying the features of a candidate when some columns of the candidate table are hidden. Properties are separated into two distinct sets: the first one contains the actual properties of the term, as found in the corpus or computed by the term extractor; the second one, the Expert set, includes user-editable properties such as the Concept and Pseudo-term tags (see 4.1).

[Screenshot: Properties window for candidate 8310936]
45. on status, we distinguish three types of terms:
- standard well-formed terms that belong to the target terminology;
- terms that actually denote a concept, which should be marked to appear as the label of the concept in an ontology derived from the terminology. They are tagged Concept;
- malformed terms that do not belong to the final terminology but must be kept as alternative forms for indexing purposes. These should be tagged pseudo-term.

4.2 Terminology structure design

Beyond validating terms, TyDI allows structuring the terminology by creating links between terms and classes of terms. All these operations are performed using the Semantic class window or the Semantic classes tree view window (see 4.2.4).

The Semantic class window displays a class tree view and a toolbar including all the buttons needed to perform actions on the selected nodes. The nodes are indicated by an icon (see Illustration 19). There are three different types of nodes: term, class and structure nodes. Clicking on a node opens it and displays specific information:
- Clicking on a term displays the term links.
- Clicking on a term class displays the members of the class (terms with different roles).
- Clicking on the structure node displays the links of the class.

As with the application main toolbar, the available actions depend on the node selection, and the buttons are enabled or disabled accordingly.

4.2.1 Semantic class view description

There ar
46. or example, we may want to remove already known named entities, like species names in a biological terminology. Nevertheless, it is still possible to view the dismissed terms in TyDI.

Note 3: the project creator is automatically granted the right to work on the new project. Rights must be explicitly granted to the other participating users (see 3.2.2).

3.3.2 Importing multiple extraction results

Important note: This functionality is currently not available when using Web Service access.

It is possible to import term extractor results into an already existing project, and the corresponding corpus can be in a language distinct from the main project language. The import is performed using the same wizard as described before.

3.4 Term candidate selection

The main activity of a TyDI user is to navigate through the list of term candidates of the project in order to assign validation statuses or relationships. Scanning the list of candidates sequentially is tedious and inefficient, so TyDI provides many facilities to select, sort and navigate through candidate lists.

Clicking on the button located in the Project toolbar opens the Term grid window (the button is enabled when a Project node is selected in the Project window). The window can also be opened by a simple double click on the Project node; this command can also be found in the context-sensitive menu under the Project node in the Project window.

The Ter
47. orm
Expansion | Expansion form
Prevalidation | Prevalidation string
is Class member | True if the term belongs to a semantic class
is Representative | True if the term is the representative term of a semantic class
Producer | Processing or user who created the term (free selection)
is Inferred | True if the term is not found in the corpus alone in a maximal noun phrase (MNP) but has been retained for the syntactic analysis of a larger term
is Dismissed | True if the term is detected by the extractor but has been filtered out (optional post-YaTeA processing)
is Superseded | True if the term is detected by the extractor but has been regrouped with others under a merged representative (optional post-YaTeA processing)
is Unparsed Phrase | True if the term extractor has not been able to parse the phrase
Word count | Number of words in the form
Nb of occurrences | Number of occurrences of the form with a given syntactic analysis within the corpus
Class member / only representative | Term members of any semantic class, or only the representative amongst those class members
Justification | Validation comment
Validation | Validation status (free selection)

3.4.2 Term grid toolbar

[Illustration 11: Term grid toolbar]

This toolbar contains the following action buttons:
Apply button | Execute the
48. ot available when using Web Service access.

Project export utilities are used to export a whole project in a specific format. The project export button is located in the Project toolbar. It is enabled when a project node is selected in the Project window.

4.7.1 Text file export

[Screenshot: Export Term project as text files wizard, step 1 of 1 (Text export parameters)]

The text file utility exports several types of data from a terminology project: it produces, in a specified directory, a set of text files containing tab-separated values. By default, duplicate lines are removed. A global option allows prefixing each field with the internal term identifier. The term file settings allow selecting the terms to export depending on the validation status, as set by the current user or by any of the other users. The filter can be overridden to always export terms that are part of a semantic class, whatever their valid
49. [Illustration 9: Project creation wizard, files import panel]

Note: you can indicate here that a corpus is in a language different from the project main language.

The input file format must be specified: YaTeA XML results or tab-separated values file (with TyDI v0.2, version of June 2009 and above; see 7.1 for the file format). Then you need to indicate the paths to the data files in the provided text fields. You can use the open file dialog box to set these paths (button with the ellipsis mark). If you import YaTeA data, you can optionally indicate a corpus file. The advantage of importing the corpus file is that TyDI will then be able to display terms in context.

Note 1: the YaTeA term candidate files can be post-processed by the merging tool fusion termino xml pl to reduce term redundancy by gathering inflected forms or close typographic variants under a representative form of the group. The merged terms are then said to be superseded by the representative form. Nevertheless, it is still possible to view the superseded terms in TyDI.

Note 2: the YaTeA candidate files can be post-processed by the filtering tool filtrage termino xml pl to reduce term profusion by removing superfluous terms using simple regular expression or dictionary-based methods. F
50. The Create project button is located in the Project toolbar; it is enabled when the Term Database node is selected in the Project window. This command can also be found in the context-sensitive menu under the Term Database node in the Project window.

3.3.1 Importing data into projects

Important note: This functionality is currently not available when using Web Service access.

A wizard will guide you during the import process.

Warning: be aware that the time required to import data depends on the available memory. Importing more than 25,000 terms requires the allocation of more memory than the default setting (see 6.3).

3.3.1.1 Project global properties

The first step of the wizard can be used to set the project global properties: name, description and main language.

[Screenshot: "Create a new Term project using Extractor results" wizard, Step 1 "Project global properties": Project Name (IUUF), Description (terminology related to illegal fishing), Main language (English).]

Illustration 8: Project creation wizard, main panel

3.3.1.2 Corpus and processing properties

The second step of the wizard sets the properties of the corpus (if relevant) and of the term input format.

[Screenshot: "Create a new Term project using Extractor results" wizard, Step 2 "Corpus and Processing properties".]
51. ... more complex filters on string patterns, for instance:
• character classes (short form for sets of characters),
• alternatives of patterns (logical OR),
• grouping,
• quantification (number of successive occurrences of a pattern),
• anchors (whether a pattern occurs at the beginning or at the end of the line).

4.3.2 Regular expression short references

Predefined character classes
.       matches any character
\d      matches a digit (0-9)
\s      matches a whitespace character (space, tabulation)
\w      matches a word character (alphanumeric)

User-defined character classes
[xyz]   matches x or y or z
[a-g]   matches any character within the interval a to g
[^xyz]  matches any character except x, y and z

Alternative
xyz|abc matches xyz or abc

Quantifiers
?       once or not at all
*       zero or more times
+       one or more times
{n}     exactly n times
{n,}    at least n times
{n,m}   at least n times but no more than m times

Anchors
^       start of the line
$       end of the line

Note: matching any of the special characters used by the regular expression language requires prefixing it with a backslash (\). For example, \$ matches the dollar sign.

4.3.3 Local filter examples

Note: the content of the cell is considered as a whole line and, by default, the local filter regular expression is anchored at the beginning and at the end of the line.

Regular expressio
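The whole-cell anchoring rule can be tried out with a small sketch. It uses Python's `re.fullmatch` as a stand-in for the local filter; TyDI is a Java application, so its regular expression engine is assumed to differ in details, although the constructs listed in the short reference behave the same way in both. The function `local_filter` and the sample cells are hypothetical.

```python
import re


def local_filter(cells, pattern):
    """Keep the cells whose entire content matches the pattern.

    fullmatch() models a filter that is implicitly anchored at the
    beginning and at the end of the cell content.
    """
    compiled = re.compile(pattern)
    return [cell for cell in cells if compiled.fullmatch(cell)]


# Hypothetical cell contents.
cells = ["illegal cod", "Barents cod", "tuna population", "cod", "price in $"]

print(local_filter(cells, r"cod"))                  # only the cell that is exactly "cod"
print(local_filter(cells, r".*cod"))                # any cell ending with "cod"
print(local_filter(cells, r"(illegal|legal) \w+"))  # alternative, grouping, word class
print(local_filter(cells, r".*\$"))                 # escaped dollar sign
```

Because of the implicit anchoring, a "contains" filter has to be written with `.*` on both sides of the pattern.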
52. ... are highlighted with different colours.

Notes: the visual settings of this table can be set as for the Candidate table (see 3.4.3.1). In order to select a part of the sentence, double-click in the sentence cell, then select a text part and copy it to the clipboard (use the keyboard shortcut).

3.4.5 Term candidate validation

While browsing the term candidates, the user can quickly assign a validation status to a term in the cell of the column named after the user identifier (or "Validation") and located in the same table row.

Note: superseded candidates and dismissed candidates cannot be validated.

3.4.5.1 Validation status

By default, TyDI offers a choice amongst five distinct status values, because in real projects it is not always easy to put each term into one of two distinct classes (valid terms or invalid terms). The table below explains the meaning of these status values.

Status label | Description
(none) | No status assigned to the term candidate
D | Candidate term to be removed: irrelevant for the application purpose
D | Candidate term to be removed, but the user is unsure (should be checked)
 | Not decided after examination
V | Candidate term to be kept, but the user is unsure (should be checked)
V | Candidate term to be kept: relevant for the application purpose

Note: the number of distinct status values and their associated labels is actually a project-specific parameter. It can be customized.

3
53. [Table of contents fragment; legible entries:]
4.3.2 Regular expression short references
4.4.1 Variant discovery using FastR
4.4.2 FastR variant proposals view
4.4.3 FastR variants graphical view
4.7 Project export utilities
6.1 Connection configuration
6.2 External link to web browsers
6.3 Memory allocation
7.1 Term text import
7.2 Term candidate feature list

1 Introduction

The Terminology Design Interface (TyDI) is a graphical tool for:
• the validation of large sets of candidate terms extracted from texts written in natural language,
• the selection of a subset of terms in a terminology relevant for a given application,
• and the structuring of terminologies.

See the Ty
54. ... terminological data is stored in a database, the user needs to connect to a dedicated server before starting to work on terminologies.

Note: A connection dialog appears automatically when the application is launched (be patient: the dialog is displayed once the application is correctly initialised).

[Screenshot: Projects window showing a tree with a Term Database node, project nodes, processing nodes and corpus documents.]

Illustration 4: Project window

The project window is the entry point to work with TyDI. It shows a hierarchical tree of the data organisation. When you are not connected to a database yet, the project window is empty. Selecting a node in the project window enables specific actions.

Node type | Available actions
The Term Database node (root of the tree) | Disconnect; Create new project
Project nodes (one per project the current user has been given rights to work with) | Term free search; Import result of a new term extraction; View FastR variation proposals; View project statistics
Processing nodes (one per processing performed in the project, e.g. YaTeA extraction, tab file import, FastR variant search) | None
Corpus nodes (corpora can be shared by distinct processing) | Import FastR variant search results; Text
55. [Screenshot: FastR variants grid. Each row pairs a variant (Id, Surface form, Producer, Validation) with FastR link information (Metarule, Nb words, Delta string) and its origin term (Id, Surface form, Producer, Validation). Legible examples: "crop plant varieties" (Ins, delta "plant") with origin "crop variety"; "wall of the cells" (Perm, delta "of the") with origin "cell walls"; "yields of crops" (Perm, delta "of") with origin "crop yield".]

Illustration 23: FastR variants grid

Note: Term validation can be performed in this grid.

Note: When a candidate term
56. ...rted in a project. If necessary, the import process will create new terms and new semantic classes, and the corresponding links between these objects. In the case of hyponym/hyperonym import, newly created terms can optionally be tagged as Concept.

[Screenshot: "Import text datafile in an existing Term project" wizard, Text import parameters panel: imported category (Hyponym/Hyperonym with option "add missing terms as Concept", Typographic variant, Translation, Synonym, Quasi-synonym), options "file contains lemmatized forms" and "first column contains lemmatized forms", datafile.]

Illustration 25: Modular text import wizard

4.5.1 Input file in text format

The text format of input files uses the tabulation character as field separator. The row should not contain a header. The column should contain a header as described in Appendix 7.1. The expected columns are:
• Synonyms (2 columns): surface form of the synonym, surface form of the class representative
• Quasi-synonyms (2 columns): surface form of the quasi-synonym, surface form of the class representative
• Typographic variants (2 to N columns): surface form of variant 1, ..., surface form of variant N
• Hyponyms/Hyperonyms (2 columns): surface form of the hyponym, surface form of the hyperonym

4.6 Ontology import

Not implemented yet.

4.7 Project export utilities

Important note: This functionality is currently n
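To make the two-column layout concrete, here is a minimal sketch that reads a synonym file of the kind described in 4.5.1 and groups the synonyms under their class representative. It is not TyDI code: the function names, the strict two-column check and the sample terms are assumptions made for the illustration.

```python
def read_pairs(text):
    """Parse tab-separated lines into (member, target) pairs.

    Blank lines are skipped; any other line must have exactly two
    non-empty columns (an assumption of this sketch).
    """
    pairs = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        fields = [field.strip() for field in line.split("\t")]
        if len(fields) != 2 or not all(fields):
            raise ValueError(f"line {number}: expected 2 non-empty columns")
        pairs.append((fields[0], fields[1]))
    return pairs


def group_by_representative(pairs):
    """Map each class representative to the list of its synonyms."""
    classes = {}
    for synonym, representative in pairs:
        classes.setdefault(representative, []).append(synonym)
    return classes


# Hypothetical synonym file: synonym <TAB> class representative.
synonym_file = "IUU fishing\tpirate fishing\nillegal fishing\tpirate fishing\n"
print(group_by_representative(read_pairs(synonym_file)))
```

The same reader fits the quasi-synonym and hyponym/hyperonym files, which share the two-column shape; typographic variants (2 to N columns) would need a looser column check.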
57. ...s bold, it means that the class is associated with several hyperonyms.

Note: It is possible to jump to another hyperonym thanks to the context-sensitive menu.

[Screenshot: semantic class tree (cell envelope and cellular process; cell cycle; cell wall; germination; membrane bioenergetics; mobility and chemotaxis; protein secretion) with the "other hyperonyms" context menu.]

4.2.4.1 Link modification

In this view, all modifications are performed using drag-and-drop (DnD) gestures. The default DnD action is a Copy operation; it is symbolized by a plus sign that appears in the mouse pointer when dragging has started. It is possible to change the DnD action to the Move operation by pressing Ctrl Shift. When doing so, the dragged hyponym class is actually moved from one hyperonym to another. In summary, two distinct operations can be performed:
• Create a link: the hyponym class must be dragged and then dropped on its new hyperonym class. If the dragged class was not associated with any hyperonym (i.e. it was a child of the root node), the gesture behaves like a DnD Move operation, regardless of the mouse pointer aspect.
• Delete a link: the hyponym class must be dragged and then dropped on the root node. In this case, it always behaves like a DnD Move operation.

Note: It is strongly advised to open, side by side, two Semantic Classes views on the same proj
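The Copy and Move semantics of the two operations can be modelled with a small data structure. The sketch below is a hypothetical model, not TyDI's internal representation: the `Hierarchy` class, the `ROOT` marker and the `drop` method are names invented for the example, and a class with no hyperonym is treated as a child of the root.

```python
ROOT = "<root>"  # hypothetical marker for the root node of the tree


class Hierarchy:
    """Each class name maps to the set of its hyperonym class names."""

    def __init__(self):
        self.hyperonyms = {}

    def drop(self, hyponym, source, target, move=False):
        """Simulate dragging `hyponym` from under `source` onto `target`."""
        links = self.hyperonyms.setdefault(hyponym, set())
        if target == ROOT:
            # Delete a link: dropping on the root always acts as a Move.
            links.discard(source)
            return
        if move or source == ROOT:
            # A Move drops the old link; a drag that starts under the root
            # has no old link to keep, so it behaves like a Move as well.
            links.discard(source)
        links.add(target)


h = Hierarchy()
h.drop("cell wall", ROOT, "cell envelope")                       # first link
h.drop("cell wall", "cell envelope", "cell cycle")               # Copy: two hyperonyms
h.drop("cell wall", "cell envelope", "germination", move=True)   # Move
h.drop("cell wall", "cell cycle", ROOT)                          # Delete a link
print(sorted(h.hyperonyms["cell wall"]))
```

A class ends up with several hyperonyms only through Copy drops, which matches the note that such classes are displayed differently in the tree.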
58. ...s of the surface form

DOC | Number of distinct documents in which the surface form is found
SURFACE FORM | The surface form of the term. Mandatory unless a lemma is specified (the lemma will then be used as a surface form)
LEMMA | Lemma of the term
POS | Part-of-speech tag
HEAD LEMMA | Lemma of the head
HEAD SURFACE FORM | Surface form of the head
MODIFIER SURFACE FORM | Surface form of the expansion
MODIFIER LEMMA | Lemma of the expansion

7.2 Term candidate feature list

The table below lists all the features that can be associated with a term candidate. The surface form is the only mandatory feature. The available features depend on the import type (YaTeA XML file or tab-separated values file).

Feature | Description | YaTeA | Tab
Form | Surface form of the term candidate as it is found in the corpus | yes | yes
Lemma | Lemma form of the candidate | yes | yes
Syntactic category | Part-of-speech tag (POS tag) | yes | yes
Head | Head form | yes | yes
Expansion | Expansion form | yes | yes
Prevalidation | Prevalidation string | no | yes
Analysis | Recursive decomposition of the term candidate in head/expansion elements | yes | no
is Inferred | True if the term is not found in the corpus alone in a maximal noun phrase (MNP) but has been retained for the syntactic analysis of larger term | yes | no
is Dismissed | True if the term is detected by the extractor but has been filtered out (optional post-YaTeA processing) | | no
is Superseded | True if the term is detected by
59. ... column; a second click toggles to descending sort, and a third click restores the natural order (alphabetical sort on the surface form). It is also possible to sort on any number of columns by using Shift-click to add a new column to the sort group. All these settings are stored in the application preferences and reused for any subsequently opened table.

3.4.3.2 Term candidate table details

The term candidate table contains one row per term retrieved from the database that satisfies the filter. Most of the term features are displayed in the table columns. In addition, quick term browsing is provided by double-clicking on a cell of the table: it opens a new term grid containing a new list of terms, the content of which depends on the clicked feature cell, as described in the table below.

Column | Description | Associated action (double-click)
Id | TyDI term candidate identifier | None
Prevalidation | Prevalidation string (from tab-separated value file import only) | None
Surface form | Surface form of the term candidate as found in the corpus | Open a new grid containing all the term candidates which are part of the syntactic analysis of the current candidate
Number of occurrences | Number of occurrences of the form within the corpus | Open/refresh the context window displaying the occurrences of the selected candidate within the corpus. Available only if the corpus has been imported in the project
60. variant suggested by FastR has not already been validated before, it is created but marked with a specific term producer, as shown in Illustration 23.

4.4.2.3 Variant grid toolbar

This toolbar contains a subset of the action buttons available in the Term grid toolbar:
• Apply button: executes the query with the current values of the criteria (shortcut: Enter key).
• External search button: launches a search of the selected term surface form.
• Context button: opens/refreshes the context window displaying the occurrences of the selected term in the corpus.
• Term link button: opens/refreshes the Semantic class and Term link window to display the classes containing the selected term.
• Create class button: creates a new semantic class containing all selected terms.
• Graph display button: displays the selected terms in a graphical view (see 4.4.3).
• Rows count field (e.g. "908 rows"): shows the total number of term candidates currently displayed (excluding those filtered out by the local filter).

4.4.3 FastR variants graphical view

The FastR variants graphical view is a simple graphical view where terms are represented in rectangular boxes and linked together by magenta lines representing FastR variation proposals.

[Screenshot: FastR links view with linked term boxes such as "cotyledonary conifer somatic embryos", "conifer germinants", "conifer somatic embryo germinants" and "conifer somatic embryo".]
61. ... synonym categories are created and included in the output file in order to distinguish exact synonyms, quasi-synonyms, acronyms and typographic variants. TyDI's term IDs are also exported and visible in OBO-Edit as cross-references.

Terms belonging to semantic classes (representatives and synonyms) are always exported, regardless of their validation statuses. On the other hand, simple terms are exported only if they match the statuses selected in the export option panel. The option panel also allows defining priorities among users in order to decide which terms to export: for each term, the system searches for a validation status in the order of user priorities and compares it to the statuses selected in the option panel. Terms in conflict (i.e. for which at least 2 users disagree about the validation status) are displayed in the output window for further analysis.

5 Installation

The client installation is easy, but it requires that the server side is already available and that the user knows some server parameters (host name, access mode, database login and password, amongst others) to properly configure the data connection. For more detail about database installation, see the TyDI Admin guide.

5.1 Requirement

The Terminology Design Interface client is a Java application. It therefore requires a Java Runtime Environment, version 1.6u25 or later. The JVM must allocate at least 512 MB of memory. Depending on the size of the ter
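The priority-based selection of simple terms described in the export paragraph above can be sketched as follows. This is an illustrative reading of that paragraph, not the actual export code: the function names, the user names and the choice to still export a term in conflict (while reporting it) are assumptions.

```python
def resolve_status(validations, user_priority):
    """Return the status set by the highest-priority user who set one."""
    for user in user_priority:
        status = validations.get(user)
        if status is not None:
            return status
    return None


def in_conflict(validations):
    """True when at least two users assigned different statuses."""
    return len({s for s in validations.values() if s is not None}) >= 2


def select_for_export(terms, user_priority, selected_statuses):
    """Split simple terms into those to export and those to report."""
    exported, conflicts = [], []
    for term, validations in terms.items():
        if in_conflict(validations):
            conflicts.append(term)  # reported for further analysis
        if resolve_status(validations, user_priority) in selected_statuses:
            exported.append(term)
    return exported, conflicts


# Hypothetical validations: term -> {user: status label or None}.
terms = {
    "illegal cod": {"alice": "V", "bob": "D"},
    "long term": {"alice": None, "bob": "D"},
    "pirate fishing": {"bob": "V"},
}
exported, conflicts = select_for_export(terms, ["alice", "bob"], {"V"})
print(exported, conflicts)
```

Terms belonging to semantic classes would bypass this selection entirely, since the guide states that they are always exported.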