User Manual - Intelligent Data Analysis Research Lab
Contents
1. TreeLiker Graphical User Interface
USER MANUAL
IDA Intelligent Data Analysis Research Lab, Department of Cybernetics, Faculty of Electrical Engineering, Czech Technical University in Prague (CTU)
June 2012

CONTENTS
1 Introduction
2 Detailed Functional Description
2.1 Input Module
2.2 Template Module
2.3 Pattern Search Module
2.4 Found Patterns Module
2.5 Training Module
3 Installation
3.1 Installation Guide
3.2 Running the Machine Learning Tool
4 Graphic User Interface
4.1 Menu Bar
4.1.1.1 New Project
4.1.1.2 Load Project
4.1.1.3 Save Project
4.1.1.4 Exit
4.2 Input Module
4.2.1 Welcome Window
4.2.2.1 Add New Dataset
4.2.2.2 Edit Dataset
4.2.2.3 Delete Dataset
4.3 Template Module
2. JRE. The version needed is JRE 6. The latest version of the JRE can be downloaded from the Java website: http://www.oracle.com/technetwork/java/javase/downloads

3.2 RUNNING THE MACHINE LEARNING TOOL
To run the Machine Learning Tool, run the file start.bat (or start.sh if you are working on Linux). The program will start automatically.

Figure 1: Welcome Window

4 GRAPHIC USER INTERFACE

4.1 MENU BAR
Provides access to the file options and to the help option of the application.

Figure 2: Menu Bar (File, Help)

It has the following options:

4.1.1.1 NEW PROJECT
You can create a new TreeLiker project. Doing so creates a new directory.
1. On the menu bar, click the File option.
2. Click the New Project option.
3. On the Create a Project Directory window, select the desired project location.

Figure 3: Create a Project Directory Window

4. Type the name of the project.
5. To continue, click Create. Otherwise, click Cancel or close the window.

4.1.1.2 LOAD PROJECT
You can load an exis
3. Note: There are some restrictions on this tab. If they are not fulfilled, an error message appears in the status panel. The following conditions are reported:
1. The template text is empty.
2. The given template contains syntax errors.
3. There is no input variable for a specific output variable.
4. The template contains cycles.

Note: The components of each literal of a template can be distinguished by their color:
Output variables: blue
Input variables: red
Constants: green
Ignored variables: gray
Aggregation variables: orange

4.4 PATTERN SEARCH MODULE
The pattern search module enables the user to construct relational patterns for the datasets selected in the Input Module. The language bias is taken from the Template Module.

Figure 15: Pattern Search Tab settings (screenshot with the RelF algorithm selected)
4. Figure 16: Pattern Search Tab, HiFi settings

Note: There are some restrictions on this tab. If they are not fulfilled, an error message appears in the status panel. The restrictions are the following:
1. Datasets must be selected before the Pattern Search is started.
2. A valid template must be established.
3. The minimum frequency must be a number between 0 and 1.
4. All parameters must be numbers greater than or equal to 0.

4.4.1.1 SEARCH FOR RELATIONAL PATTERNS
You can start the Pattern Search by selecting the algorithm to run and then setting the parameters that the selected algorithm requires.

Note: If you try to run Pattern Search with all parameters unchanged since the last run, a message appears in the status panel indicating that the pattern search has already been done.

If the selected algorithm corre
5. 4. Select a dataset by clicking on the directory.
5. To continue with the selection, click Open. Otherwise, click Cancel or close the window.
6. Select the Format of the dataset.
7. Write the Class Label for the selected dataset.

4.2.2.2 EDIT DATASET
1. On the Input tab, locate the dataset you want to edit.
2. You can edit the following parameters:
   a. Dataset's Directory
      i. Follow steps 2 to 5 of the Add New Dataset section.
   b. Format
      i. Select the new Format for the selected dataset.
   c. Class Label
      i. Write the new Class Label for the selected dataset.

4.2.2.3 DELETE DATASET
1. On the Input tab, locate the dataset you want to delete.
2. Click the Delete button.

4.3 TEMPLATE MODULE
Allows the user to enter a template and validates its correctness. See the document describing the language bias specification using templates for a description of the template-based language bias.

Figure 13: Template Tab
6. 4.5 TRAINING MODULE
Allows the user to train classifiers and evaluate them using cross-validation.

Figure 19: Training tab, Results (screenshot showing the classifier chooser, the Cross-validation checkbox with the Folds field, the Start and Stop buttons, the Classifier Output area with the cross-validated accuracy, and the Result List)
7. 4.4 Pattern Search Module
4.4.1.1 Search for Relational Patterns
4.4.2 Found Patterns Module
4.5 Training Module

1 INTRODUCTION
TreeLiker GUI is a simple application providing access to fast algorithms for working with complex structured data in relational form. The data can, for example, describe large organic molecules such as proteins, or groups of individuals such as social networks or predator-prey networks. The algorithms included in TreeLiker GUI are unique in that, in principle, they are able to search a given set of relational patterns exhaustively, thus guaranteeing that if some good pattern capturing an important feature of the problem exists, it will be found. In experiments with real-life data, the algorithms were shown to be able to construct complete non-redundant sets of patterns for chemical datasets involving several thousand molecules, as well as for datasets from genomics or proteomics. The included relational learning algorithms are tailored towards so-called tree-like features, for which some otherwise very hard (NP-hard) sub-problems become tractable. The problem of finding a complete set of informative features remains hard also for tree-like features; however, w
9. Note: To order the results by Patterns, Chi-Square or Information Gain, click the title of the corresponding column (Patterns, Chi-Square or Information Gain).

Additionally, the user interface provides a button (top right) with options to hide columns or to adjust each column to the size of its content.

Figure 18: Found Structural Patterns Tab with option panel
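The column ordering described in the note above can be pictured with a small sketch. It is only an illustration in Python, not TreeLiker's own code: the FoundPattern record, the sort_patterns helper and the sample scores are all hypothetical.

```python
from typing import List, NamedTuple


class FoundPattern(NamedTuple):
    """Hypothetical row of the Found Patterns table."""
    pattern: str
    chi_square: float
    info_gain: float


def sort_patterns(rows: List[FoundPattern], column: str,
                  descending: bool = True) -> List[FoundPattern]:
    """Order rows by one column, as clicking a column title would."""
    if column not in FoundPattern._fields:
        raise ValueError(f"unknown column: {column}")
    return sorted(rows, key=lambda row: getattr(row, column), reverse=descending)


# Illustrative values only; they are not taken from a real run.
rows = [
    FoundPattern("pattern_b", 8.0, 0.020),
    FoundPattern("pattern_a", 11.8, 0.030),
    FoundPattern("pattern_c", 10.1, 0.022),
]
by_chi = sort_patterns(rows, "chi_square")
by_name = sort_patterns(rows, "pattern", descending=False)
```

Sorting on the score columns in descending order puts the most discriminative patterns first, which is presumably the view most users want. The sort direction the GUI actually applies on the first click is not stated in this manual.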
10. e the found patterns module and the training module. The structure and order of the modules are the following:

2.1 INPUT MODULE
This module allows the user to select the dataset directories or the specific files that should be used as input data. The user can add as many datasets as desired.

2.2 TEMPLATE MODULE
The template module lets the user enter the template specifying the language bias that should be used when the algorithms of the application are executed.

2.3 PATTERN SEARCH MODULE
The pattern search module enables the user to construct relational patterns for the datasets selected in the Input Module. The language bias is taken from the Template Module.

2.4 FOUND PATTERNS MODULE
This module uses the results provided by the pattern search module. It shows the structural patterns that were found.

2.5 TRAINING MODULE
This module allows the user to train a classifier based on the patterns generated in the Pattern Search Module. The available classifiers are Zero Rule, SVM with Radial Basis Kernel, J48 Decision Tree, One Rule, AdaBoost, Simple Logistic Regression, Random Forest, L2-Regularized Logistic Regression and Linear SVM. The results of training are shown in the result list, where the user can choose one of them and display it.

3 INSTALLATION

3.1 INSTALLATION GUIDE
To be able to run the Machine Learning Tool, follow these steps:
1. Download and install the Java SE Runtime Environment
11. e were able to develop algorithms for tree-like features which scale well for problems of real-life scale. Currently, the machine learning algorithms integrated in TreeLiker GUI include implementations of the relational learning algorithms HiFi, RelF and Poly in an intuitive GUI. The three algorithms were described in the following papers:

RelF: Ondřej Kuželka and Filip Železný. Block-Wise Construction of Tree-like Relational Features with Monotone Reducibility and Redundancy. Machine Learning, 83, 2011.
Poly: Ondřej Kuželka, Andrea Szabóová, Matěj Holec and Filip Železný. Gaussian Logic for Predictive Classification. ECML PKDD 2011: European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (this paper described a restricted version of Poly).
HiFi: Ondřej Kuželka and Filip Železný. HiFi: Tractable Propositionalization through Hierarchical Feature Construction. Late Breaking Papers, the 18th International Conference on Inductive Logic Programming, 2008.

TreeLiker GUI uses WEKA.
WEKA: Mark Hall, Eibe Frank, Geoffrey Holmes, Bernhard Pfahringer, Peter Reutemann, Ian H. Witten (2009). The WEKA Data Mining Software: An Update. SIGKDD Explorations, Volume 11, Issue 1.
The panel-based philosophy of TreeLiker GUI is also inspired by WEKA.

2 DETAILED FUNCTIONAL DESCRIPTION
The application is composed of five main modules: the input module, the template module, the pattern search modul
12. es. You can enter the Machine Learning Tool application by loading an existing project or creating a new one.

Figure 8: Welcome Window

You can select the datasets that are going to be used by the other modules.

Figure 11: Input Tab

Note: There are some restrictions on this tab. If they are not fulfilled, an error message appears in the status panel. The restrictions are the following:
1. The data directory or file must be selected for every dataset.
2. The format of all datasets must be selected.
3. The class labels of all datasets must be set if the selected format does not contain information about the class labels of the individual examples.
4. The selected dataset cannot be empty.
5. The format of the selected dataset must be correct.

4.2.2.1 ADD NEW DATASET
1. On the Input tab, click the Add New Dataset button.
2. Click the Browse button to search for the dataset's directory.
3. On the Select a Directory window, select the dataset directory location.

Figure 12: Select a Project Directory window
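The first three Input tab restrictions from the note above can be expressed as a short check. The sketch below is an assumption-laden illustration in Python rather than the tool's implementation: DatasetEntry, check_dataset and the format_has_labels flag are hypothetical names. Restrictions 4 and 5 (non-empty dataset, correct format) are left out because they require reading the data itself.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DatasetEntry:
    """Hypothetical model of one dataset row on the Input tab."""
    path: str = ""                      # data directory or file
    data_format: Optional[str] = None   # None means no format was selected
    class_label: str = ""
    format_has_labels: bool = False     # assumed flag: the format stores labels itself


def check_dataset(entry: DatasetEntry) -> List[str]:
    """Return error messages for one dataset; an empty list means it passes."""
    errors = []
    if not entry.path.strip():
        errors.append("select a data directory or file")
    if entry.data_format is None:
        errors.append("select a format")
    elif not entry.format_has_labels and not entry.class_label.strip():
        # The label is only required when the format carries no labels.
        errors.append("set a class label")
    return errors
```

Collecting every message instead of stopping at the first one lets all problems of a dataset row be reported at once. Whether the status panel shows one message or several is not specified here.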
13. Note: There are some restrictions on this tab. If they are not fulfilled, an error message appears in the status panel. The restrictions are the following:
1. A classifier must always be selected.
2. If Cross-validation is selected, the number of folds must be indicated.
3. The number of folds must be an integer greater than 1.
4. The number of folds cannot exceed the number of instances.

To train a classifier:
1. Click the Training tab.
2. Click the Choose button to select a classifier.
3. Choose whether to use cross-validation:
   a. If you want to use cross-validation:
      i. Check the Cross-validation checkbox to enable the Folds text field.
      ii. Write the number of folds in the Folds text field.
   b. If you do not want to use cross-validation:
      i. Uncheck the Cross-validation checkbox.
4. Click the Start button. The results appear in the Classifier Output text area.

Note: To view the results for a different item on the result list, select its name by clicking on it or by using the up and down keys. To delete an entry, select it on the result list and press the Backspace key.
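The restrictions on the number of folds can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the code behind the Training tab: check_folds and fold_sizes are hypothetical helpers, and the even split in fold_sizes is one common convention rather than a documented behavior of TreeLiker or WEKA.

```python
from typing import List


def check_folds(folds_text: str, n_instances: int) -> int:
    """Parse the Folds field and enforce the restrictions listed above."""
    try:
        folds = int(folds_text)
    except ValueError:
        raise ValueError("the number of folds must be an integer") from None
    if folds <= 1:
        raise ValueError("the number of folds must be greater than 1")
    if folds > n_instances:
        raise ValueError("cannot have more folds than instances")
    return folds


def fold_sizes(n_instances: int, folds: int) -> List[int]:
    """Split n_instances into folds whose sizes differ by at most one."""
    base, extra = divmod(n_instances, folds)
    return [base + 1 if i < extra else base for i in range(folds)]


folds = check_folds("10", 188)   # 188 is an arbitrary example size
sizes = fold_sizes(188, folds)
```

The upper bound exists because every fold needs at least one instance to test on. With more folds than instances, some folds would be empty.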
14. sponds to RelF, you have to set the following parameters:
1. Minimum Frequency: a number from 0 to 1.
2. Count True Groundings: check the checkbox if you want to use it, uncheck it otherwise.

If the selected algorithm corresponds to HiFi, you have to set the following parameters:
1. Minimum Frequency: a number from 0 to 1.
2. Maximum size of features: a positive integer.
3. Count True Groundings: check the checkbox if you want to use it, uncheck it otherwise.
4. Construct Polynomial Features: check the checkbox if you want HiFi to construct multivariate polynomial aggregation features.
5. Maximum degree of the multivariate polynomial aggregation features: select the desired value.

Click the Search button to start the Pattern Search.

Note: Information about what the Pattern Search algorithm is doing appears in the status panel, as do error messages if something goes wrong.

The Found Patterns tab shows the relational patterns found with the pattern search algorithm.

Figure 17: Found Structural Patterns Tab (table with the columns Patterns, Chi-Square and Information Gain)
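The RelF and HiFi parameter lists above translate into a simple range check. The Python sketch below is hypothetical throughout: SearchSettings, check_settings and the default values are illustrative names and numbers, not part of TreeLiker. The lower bound of 0 on the polynomial degree follows the general rule on the Pattern Search tab that all parameters must be greater than or equal to 0.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SearchSettings:
    """Hypothetical container for the Pattern Search tab fields."""
    algorithm: str                 # "RelF" or "HiFi"
    min_frequency: float           # a number from 0 to 1
    max_feature_size: int = 20     # HiFi only; illustrative default
    max_poly_degree: int = 4       # HiFi only; illustrative default


def check_settings(s: SearchSettings) -> List[str]:
    """Return error messages; an empty list means the settings pass."""
    errors = []
    if s.algorithm not in ("RelF", "HiFi"):
        errors.append("unknown algorithm")
    if not 0.0 <= s.min_frequency <= 1.0:
        errors.append("minimum frequency must be between 0 and 1")
    if s.algorithm == "HiFi":
        if s.max_feature_size < 1:
            errors.append("maximum size of features must be a positive integer")
        if s.max_poly_degree < 0:
            errors.append("maximum degree must be greater than or equal to 0")
    return errors
```

The size and degree checks run only for HiFi because the RelF list above does not include those parameters.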
15. ting Machine Learning Tool project.
1. On the menu bar, click the File option.
2. Click the Load Project option.
3. On the Select a Project Directory window, search for the project location.

Figure 4: Select a Project Directory Window

4. Select a project by clicking on the file or directory.
5. To continue with the loading, click Open. Otherwise, click Cancel or close the window.

4.1.1.3 SAVE PROJECT
You can save the changes made to the current Machine Learning Tool project.
1. On the menu bar, click the File option.
2. Click the Save Project option.
3. Accept the message window that appears, or just close it.

Figure 5: Save Project Message

4.1.1.4 EXIT
Exit the Machine Learning Tool application.
1. On the menu bar, click the File option.
2. Click the Exit option.
3. To continue with the exit, accept when the Information Exit message window appears. Otherwise, deny it or close the window.

Figure 6: Information Exit Message ("Are you sure you want to quit?")

It has the following option:

4.2 INPUT MODULE
You can enter the Machine Learning Tool application and indicate the dataset directories or fil