DIXER User Manual
Contents
1. [2] J. G. Bazan, M. S. Szczuka, and J. Wróblewski. A new version of rough set exploration system. In J. J. Alpigini, J. F. Peters, A. Skowron, and N. Zhong, editors, Rough Sets and Current Trends in Computing, Third International Conference, RSCTC 2002, LNAI 2475, pages 397-404. Springer, 2002.
[3] M. Dash and H. Liu. Feature selection for classification. Intelligent Data Analysis, 1(3):131-156, 1997.
[4] T. Lindholm and F. Yellin. The Java Virtual Machine Specification, Second Edition. Addison-Wesley, 1999.
[5] Sun Microsystems, Inc. Java 2 SDK, Standard Edition, Documentation, Version 1.4.0, February 13, 2002.
[6] Sun Microsystems, Inc. Java 2 SDK, Standard Edition, Documentation, Version 1.4.1, September 16, 2002.
2. Branch n Bound Module

6.1 Introduction

The Branch n Bound DIXER pluggable module makes it possible to carry out experiments with very expensive feature selection based on the wrapper method (see e.g. [3]). The strategy for searching the space of possible attribute subsets is similar to branch and bound search. Here, however, the strategy is not complete and is used only to select the first candidate to visit in the lattice of possible feature subsets. The results are saved in a special file that can be viewed manually or processed with text processing tools such as awk or gawk (the GNU Project's implementation of the AWK programming language), with a spreadsheet editor such as Microsoft Excel or OpenOffice, or with any programming language that supports string manipulation and I/O operations.

A decision tree is used as the wrapper classifier. Its settings are 0.98 for the minimal purity of a leaf and 5 for the indivisible leaf size. The tree uses clustering of symbolic values and indiscernibility as the split evaluator.

The Branch n Bound module is implemented in the package rslib.dixer.branchnbound, but it also uses classes from other packages to perform experiments. The server class is rslib.dixer.branchnbound.ServerBranchNBound and the client class is rslib.dixer.branchnbound.ClientBranchNBound. These class identifiers should be specified in configuration fi
3. Guide for details). The class declared here should be provided at execution time, either by including it in dixer.jar or by adding an entry to the class path of the JVM. No default value. Set to rslib.dixer.rsparallel.ServerRSPar.
• task2.clientclass (mandatory, for client only): Name of the class responsible for evaluating jobs of task type 2. This class should be a subclass of rslib.dixer.client.ClientTask (see the DIXER Module Developer Guide for details). In the current release of DIXER the dynamic class loader has been removed, so the class declared here must be provided at execution time, either by including it in dixer.jar or by adding an entry to the class path of the JVM. No default value. Set to rslib.dixer.rsparallel.ClientRSPar.
• tasks.maxfailed (mandatory, for server only): Maximal number of failures allowed for a job before it is dropped permanently. When a job evaluation fails, the job is inserted at the end of the queue of currently scheduled jobs, but only up to the number of times specified here. No default value. Set to 10.
• Module specific settings: Each pluggable module declared above as a task type may require additional settings. These parameters begin with the name of the module instead of the task prefix. Usually they begin with the same name as the one specified in the taskX = <name> parameter. For example, the value maxfailed is usually implemented by each module in the form <name>.maxfailed and can override the standard value specified i
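The maxfailed behaviour described in this excerpt (a failed job goes back to the end of the queue until it has failed too often, and a module-level value may override tasks.maxfailed) can be sketched as follows. This is only an illustration in Python, not DIXER code: the function names, the dictionary representation of the configuration, and the reading that a job is dropped once its failure count reaches the limit are all assumptions.

```python
from collections import deque


def max_failed_for(conf, module):
    """Return <module>.maxfailed if present, else the global tasks.maxfailed.

    `conf` is a plain dict of parameter names to string values (an assumed
    representation of the configuration file).
    """
    return int(conf.get(module + ".maxfailed", conf["tasks.maxfailed"]))


def run_jobs(jobs, evaluate, max_failed):
    """Evaluate jobs in queue order, requeueing failures at the end.

    Assumption: a job is dropped permanently once it has failed
    `max_failed` times in total.
    """
    queue = deque(jobs)
    failures = {}
    done, dropped = [], []
    while queue:
        job = queue.popleft()
        if evaluate(job):
            done.append(job)
            continue
        failures[job] = failures.get(job, 0) + 1
        if failures[job] >= max_failed:
            dropped.append(job)   # too many failures: give up on this job
        else:
            queue.append(job)     # retry later, after the other jobs
    return done, dropped


# Hypothetical demo: "a" always succeeds, "b" always fails,
# "c" succeeds on its second attempt.
attempts = {}


def flaky(job):
    attempts[job] = attempts.get(job, 0) + 1
    return job == "a" or (job == "c" and attempts[job] >= 2)


conf = {"tasks.maxfailed": "10", "branchnbound.maxfailed": "5"}
done, dropped = run_jobs(["a", "b", "c"], flaky, max_failed=3)
```

With these inputs the always-failing job is retried after the others and finally dropped, while the job that fails once is completed on its second attempt.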
4. dialog for this task is presented in figure 7.1. The necessary settings are the name of the script file to execute (as described in the previous section) and the result file (as described in the next section). Additionally, the user can specify whether results that are already present in the result file should be recomputed.

7.5 Result file

The results of script processing are written to the result file. This file is designed both to be read manually and to be processed automatically. Each line in the script file generates one or more output lines in the result file. The echo lines in the script file write the message included in the echo command. The content of the echoed message is not interpreted and can be used for special tagging or for making the result file more readable. The experiment lines in the script file write the experiment description and the experiment results. The structure of such a line is as follows:

<Result flag> <Experiment description> <Experiment results>

• Result flag: If the computations were successful, the success character is written here. If the computations failed, the failure sign is written here.
• Experiment description: Exactly the same experiment description as in the script file.
• Experiment results: The computed results follow the separator character. If the computations were successful, the results include the classifier accuracy, number of rules, true positive rates and other parameters that describe
5. showing a ClientConfiguration window (see fig. 3.1). The DIXER Client Configurator can be executed as a standalone application by running the rslib.dixer.client.ClientConfigurator class. When a client cannot connect to a server (cf. fig. 5.1), it does not necessarily mean that the configuration is wrong. The server may be shut down, or there may be network problems, e.g. a broken cable.

1 An IP address is required, or any other name that can be resolved by gethostbyname.
2 rslib.dixer.client.Client, see chapter 5 for details.

Figure 3.1: DIXER Client Configurator window.

If the dixer.connection file has already been written (e.g. by the Client Configurator), it will be used implicitly, without notification. To change the server address or name, the user must delete the dixer.connection file in the current path. If the DIXER Client is executed without additional command line parameters, it will prompt for a new server address or name.

3.3 Configuration file

The configuration file contains advanced parameters for the DIXER Server and Client. The name of the configuration file should be passed to the main server class, i.e. rslib.dixer.client.Server, as a command line parameter (cf. 4.1). If this parameter is omitted, the standard name dixer.conf is used. The DIXER Server searc
6. was thoroughly tested with various versions of Sun's JRE 1.4 and should also operate on any newer version of the Java Virtual Machine (cf. [4]). The DIXER Software does not work with Java Runtime Environment 1.0, 1.1, 1.2 or 1.3.

2.4 Executing

The provided dixer.jar archive contains all files necessary to execute the DIXER Software. The parts of the DIXER Software can be executed as follows:
1. by selecting a program shortcut in the Start/Programs menu or on the user desktop,
2. by executing a batch script from the program directory,
3. by executing the Java Virtual Machine with special options.

The special options mentioned above should be analogous to the following examples:
1. java -cp lib/dixer.jar rslib.dixer.client.Server for executing the DIXER Server,
2. java -Xmx512M -cp lib/dixer.jar -Djava.library.path=lib rslib.dixer.client.Client for executing the DIXER Client,
3. java -cp lib/dixer.jar rslib.dixer.client.ClientConfigurator for executing the DIXER Client Configurator.

For details and an explanation of the above parameters see the Java Tools Documentation [6].

Chapter 3 Configuration

3.1 Server Configuration

The configuration of the DIXER Server is stored in a configuration file that can be shared with the DIXER Client. This configuration file concerns advanced properties of the server itself and of the pluggable modules, and can be left to experienced users or an administrator. A default version of thi
7. Client in the same working directory on the same or a different computer. It is also not recommended to execute more than one DIXER Client on the same computer, as some RS Lib operations may interfere.

5.1 Command line parameters

It is possible to pass command line parameters to the main client class rslib.dixer.client.Client (see the tools description in [6] for details). The client parameters are not case sensitive, except for file names, whose case sensitivity depends on the operating system used.
• VERSION: shows the version and exits. The version number is read from the configuration file (cf. 3.3).
• C <filename>: specifies the name of the configuration file. The file is first searched for in the class path (cf. ClassLoader.getResourceAsStream in the Java API Documentation [6]). Then it is searched for in the regular file system, in the current path or, if a path is specified in the name, in that path. If this parameter is not specified, the dixer.conf configuration file is used. A standard version of this file is included inside dixer.jar.
• GUI: enables the Graphical User Interface (overrides the setting in the configuration file).
• NOGUI: disables the Graphical User Interface (overrides the setting in the configuration file).
• NO_GUI: a synonym for NOGUI, left for backwards compatibility.
• DEBUG: enables debugging messages (overrides the setting in the configuration file).
• NODEBUG: disables debugging messages (overrides the s
8. DIXER User Manual for version 2.0.1
Rafal Latkowski, rlatkows@mimuw.edu.pl
http://logic.mimuw.edu.pl/~rses/dixer

Contents
1 Introduction
1.2 DIXER Server
1.3 DIXER Client
2 Installation
2.2 Package content
2.3 Java Runtime Environment
2.4 Executing
3 Configuration
3.1 Server Configuration
3.2 Client Configuration
3.3 Configuration file
3.3.1 Connection
3.3.2 Graphical User Interface
3.3.3 Task specification
3.3.4 Command specification
3.3.5 Debugging and verbosity
4 Operating Server
4.1 Command line parameters
4.2 Server window
4.3 Creating new tasks
4.4 Monitoring server status
4.4.1 Server status
4.4.2 Tasks
4.4.3 Cluster hosts
4.5 Dismissing tasks
9. It is not allowed to run the DIXER Server more than once on the same network port. Such a situation is detected and an error message like the one in figure 4.3 is displayed. It is, however, possible to run the DIXER Server more than once on different port numbers. It may also happen that the user does not have the privileges necessary to run the DIXER Server on the specified port number. See the configuration file details (sec. 3.3) to solve this problem.

Figure 4.4: The default task configuration window ("Task configuration not available"), displayed when a module does not provide an additional run-time configuration.

4.4.2 Tasks

One can use the task selector to see the statistics for a particular task. The tasks are described in the task selector by their consecutive number and task type name. After a particular task is selected, its statistics appear in the upper text area (cf. 4.2).

4.4.3 Cluster hosts

The list of connected hosts that form a DIXER Cluster is displayed in the lower text area (cf. 4.2). These statistics show the time of the last contact with a host and the number of computed jobs for each defined task type. They are extremely useful for monitoring problems with network communication, client software and memory assignment.

4.5 Dismissing tasks

Pressing the Dismiss task button will stop the currently selected task. The currently selected task is the task displayed in the task st
10. af size: If a node contains this many objects (cases) or fewer, it will automatically become a leaf and will be used as a decision table for inducing the minimal covering set of rules, as above (RULES COV).
Discretization: Set to true if the decision subtables should be discretized after the decomposition.
Shortening factor: The shortening factor of the decision rules.

7.3.4 Examples

tt train.tab test.tab rules all local_scal 1.0
The Train & Test experiment with training table train.tab and testing table test.tab that uses the all decision rules induction algorithm as a classifier. The data will be discretized locally with optimization and the rules will not be shortened.

tt train.tab test.tab rules all global_scal 0.8
The Train & Test experiment with training table train.tab and testing table test.tab that uses the all decision rules induction algorithm as a classifier. The data will be discretized globally and the rules will be shortened by a factor of 0.8.

tt train.tab test.tab rules cov local_scal 0.7
The Train & Test experiment with training table train.tab and testing table test.tab that uses the minimal covering decision rules induction algorithm as a classifier. The data will be discretized locally with optimization and the rules will be shortened by a factor of 0.7.

tt train2.tab test2.tab rules gen global_scal 0.9
The Train & Test experiment with training table train2.tab and testing table tes
11. alue. Set to rsparallel.
• task1.serverclass (mandatory, for server only): Name of the class responsible for scheduling jobs and processing their results for task type 1. This class should be a subclass of rslib.dixer.client.ServerTask (see the DIXER Module Developer Guide for details). The class declared here should be provided at execution time, either by including it in dixer.jar or by adding an entry to the class path of the JVM. No default value. Set to rslib.dixer.branchnbound.ServerBranchNBound.
• task1.clientclass (mandatory, for client only): Name of the class responsible for evaluating jobs of task type 1. This class should be a subclass of rslib.dixer.client.ClientTask (see the DIXER Module Developer Guide for details). In the current release of DIXER the dynamic class loader has been removed, so the class declared here must be provided at execution time, either by including it in dixer.jar or by adding an entry to the class path of the JVM. No default value. Set to rslib.dixer.branchnbound.ClientBranchNBound.

5 See InetAddress.getHostName for details.
6 Depends on the number of configured tasks.
7 Java Virtual Machine, see [4]. Usually one can start the JVM by executing java.

• task2.serverclass (mandatory, for server only): Name of the class responsible for scheduling jobs and processing their results for task type 2. This class should be a subclass of rslib.dixer.client.ServerTask (see DIXER Module Developer
12. atistics selector and in the task statistics text area (cf. 4.2). The task is not killed immediately: it waits for all clients to finish their jobs related to this task. The task is permanently destroyed when no more clients are processing jobs related to it (see also the server.clienttimeout parameter in the configuration file).

4.6 Configuring tasks

Some modules provide additional configuration options at the run time of a task instance. Pressing the Configure task button opens the task-specific window related to the currently selected task. The currently selected task is the task displayed in the task statistics selector and in the task statistics text area (cf. 4.2). If this option is not implemented in a pluggable module (which is the case for the current versions of the Branch n Bound and RSParallel modules), the standard window appears: Task configuration not available (see figure 4.4).

The assignment of a reasonable amount of memory for the DIXER Client is a very important issue, see also 8.

Chapter 5 Operating Client

The DIXER Client can be executed by running the Java class named rslib.dixer.client.Client. This is usually done in a batch script. Some pluggable modules have problems with memory leaks, so in the standard distribution the batch file executes the client class in an infinite loop. If a client dies because of an out-of-memory error, it will be restarted. It is strictly forbidden to execute more than one DIXER
13. ber The DIXER Server can be executed by running the Java class named rslib.dixer.client.Server. For convenience this is usually done in a batch script.

4.1 Command line parameters

It is possible to pass command line parameters to the main server class rslib.dixer.client.Server (see the tools description in [6] for details). The server parameters are not case sensitive, except for file names, whose case sensitivity depends on the operating system used.
• VERSION: shows the version and exits. The version number is read from the configuration file (cf. 3.3).
• C <filename>: specifies the name of the configuration file. First the server will search for such a file in the class path (cf. ClassLoader.getResourceAsStream in the Java API Documentation [6]). Then it will search the regular file system, in the current path or, if a path is specified in the name, in that path. If this parameter is not specified, the dixer.conf configuration file is used. A standard version of this file is included inside dixer.jar.
• DEBUG: enables debugging messages (overrides the setting in the configuration file).
• NODEBUG: disables debugging messages (overrides the setting in the configuration file).
• WARNINGS: enables warning messages (overrides the setting in the configuration file).
• NOWARNINGS: disables warning messages (overrides the setting in the configuration file).
• VERBOSE: enables warning and debugging messages (overrides the settings in configuration
14. emporarily broken, the answer could be delayed. No default value. Set to 300000 (5 minutes).
• server.buffersize (mandatory, for server only): Size in bytes of the data blocks transferred at once during a filetransfer job. No default value. Set to 1024.
• hostname.<HOSTNAME> (optional, for server only): Creates a dictionary of IP addresses. Each client that contacts the host is recognized by its IP address. If the name lookup fails, the client will be displayed by its IP address. To avoid this problem the user can create their own dictionary of hosts, which overrides any other method of assigning a name. The value of this parameter should be an IP address in textual form X.X.X.X. No default value. Some entries may already be written at the bottom of the default configuration file.

3.3.2 Graphical User Interface

• client.gui (optional, for client only): Whether the client should display a window or not (yes or no, or true or false). Default value: true. Set to true.

3.3.3 Task specification

Parameters described in this section specify the scope of jobs (tasks) that a client can remotely execute.
• tasks (mandatory, for server and client): The number of specified task types. No default value. Set to 2. Usually modified to add a new pluggable module.
• task1 (mandatory, for server and client): Name (label) of task type 1. No default value. Set to branchnbound.
• task2 (mandatory, for server and client): Name (label) of task type 2. No default v
15. eout (mandatory, for server only): Time in ms that the server waits for an answer from a client. After that time the server assumes that the client has failed and closes the communication channel. This time should be longer than the execution time of any job on any computer that will be used during experiments. Sometimes it is reasonable not to set this value too large, especially when some computers have a small amount of memory, because of thrashing (i.e. running out of physical memory). The server attempts to shut down the client before closing the connection by sending a kill command. No default value. Set to 3600000 (1 hour).

3 See java.util.Properties in the Java API Documentation [5] for details.
4 See the respective documentation or manual of the operating system used.

• server.waittimeout (mandatory, for server only): Maximal time in ms that the server will sleep when there are no more jobs to send. This time should be shorter than the client.servertimeout time. No default value. Set to 240000 (4 minutes).
• client.querytimeout (mandatory, for client only): Time in ms that the client sleeps after an unsuccessful attempt to connect to a server. After that time the client will attempt to reconnect to the specified server. No default value. Set to 10000 (10 seconds).
• client.servertimeout (mandatory, for client only): Maximal time in ms that the client will wait for an answer from a server. If the server is overloaded or the physical network connection is t
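The relationships between the timeout parameters described in this excerpt can be checked mechanically. The following Python sketch is an illustration only, not part of DIXER; the dotted parameter names, the dictionary representation of the configuration and the optional longest_job_ms argument are assumptions.

```python
def check_timeouts(conf, longest_job_ms=None):
    """Return a list of warnings about inconsistent timeout settings.

    `conf` maps parameter names to string values in milliseconds.
    `longest_job_ms` is a hypothetical estimate of the longest job, used to
    check the advice that server.clienttimeout should exceed any job time.
    """
    problems = []
    wait = int(conf["server.waittimeout"])
    answer = int(conf["client.servertimeout"])
    # The server must wake up before the client stops waiting for an answer.
    if wait >= answer:
        problems.append(
            "server.waittimeout should be shorter than client.servertimeout")
    if longest_job_ms is not None:
        if int(conf["server.clienttimeout"]) <= longest_job_ms:
            problems.append(
                "server.clienttimeout should be longer than the longest job")
    return problems


# Values listed in the manual: 1 hour, 4 minutes and 5 minutes.
documented = {
    "server.clienttimeout": "3600000",
    "server.waittimeout": "240000",
    "client.servertimeout": "300000",
}
ok = check_timeouts(documented, longest_job_ms=600000)
bad = check_timeouts({**documented, "server.waittimeout": "300000"})
```

The documented values pass both checks, while raising server.waittimeout to the value of client.servertimeout produces one warning.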
16. er.rsparallel, but it also uses classes from other packages to perform experiments. The server class is rslib.dixer.rsparallel.ServerRSPar and the client class is rslib.dixer.rsparallel.ClientRSPar. These class identifiers should be specified in the configuration file dixer.conf as task<N>.serverclass and task<N>.clientclass respectively (see sec. 3.3 for details).

7.2 Module specific parameters

The RSParallel Module uses the following parameters in the configuration file (cf. sec. 3.3):
• rsparallel.forcerecompute (optional, for server only): Whether a job that was previously computed should be recomputed from the beginning or not (yes or no, or true or false). Default value: false. Set to false.
• rsparallel.maxfailed (optional, for server only): Maximal number of failures allowed for a job before it is dropped permanently. When a job evaluation fails, the job is inserted at the end of the queue of currently scheduled jobs, but only up to the number of times specified here. Default value: tasks.maxfailed. Set to 20.
• rsparallel.scriptfilename (optional, for server only): Default name for the script file. Useful in serial or batch experiments. No default value. Set to rsp_rules.txt.
• rsparallel.resultsfilename (optional, for server only): Default name for the results file. Useful in serial or batch experiments. No default value. Set to rsp_rules.log.

7.3 Script language description

The RSParallel Module uses a special script language to schedule experimen
17. ettings in configuration file).
• WARNINGS: enables warning messages (overrides the setting in the configuration file).
• NOWARNINGS: disables warning messages (overrides the setting in the configuration file).
• VERBOSE: enables warning and debugging messages (overrides the settings in the configuration file).
• SILENT: disables warning and debugging messages (overrides the settings in the configuration file).
• QUIET: a synonym for SILENT.
• <serveraddress>: the first parameter that is not recognized as one of the above will be used as the server address. If the server address is not specified, the content of dixer.connection will be used. See section 3.2.1 for details.

1 The Java Garbage Collector works rather poorly with complex data structures.

Figure 5.1: The DIXER Client not connected to the DIXER Server.

5.2 Client window

Figure 5.1 shows a client window. If the client is properly configured, such a window appears within a few seconds after execution of the client class. The client window provides the following information:
• Alive time: the time that the client has been alive. Helpful in diagnosing the frequency of out-of-memory or other errors.
• Connected with server: name o
18. f a server when the client is connected (cf. fig. 5.2), or a string when the client is not connected (cf. fig. 5.1).
• Computed x results: the number of computed results related to all defined tasks.
• Task name: the name of the currently processed task, or an empty string if idle.
• Phase name: the name of the current phase of the processed job, if any. Helpful for diagnosing the status of client computations.
• Memory total x free y max z: the size of the total memory allocated by the JVM (x, in KB), the size of free memory within the already allocated memory (y, in KB, 0 < y < x) and the maximal size of memory that the JVM can allocate (z, in KB, 0 < x < z). Helpful for diagnosing the status of consumed resources.
• Server messages: free text messages from the server that inform about intercommunication and server status.
• Shutdown button: pressing this button will kill the client. If the client is executed in an infinite loop in a batch script, this action restarts the client.

Figure 5.2: The DIXER Client connected to the DIXER Server. The Server currently has no jobs for this computer.

5.3 Monitoring client status

At the begi
19. file).
• SILENT: disables warning and debugging messages (overrides the settings in the configuration file).
• QUIET: a synonym for SILENT.

4.2 Server window

Figure 4.1 shows a server window. If the server is properly configured, such a window appears within a few seconds after execution of the server class. The server window provides the following information:

Figure 4.1: DIXER Client without connection

• Refresh rate: the interval in seconds at which the whole window is periodically refreshed. The user can change this interval by entering a number in the text input field on the right and pressing the Enter key.
• Server uptime: shows how long the server has been operating.
• Memory total x free y max z: the size of the total memory allocated by the JVM (x, in KB), the size of free memory within the already allocated memory (y, in KB, 0 < y < x) and the maximal size of memory that the JVM can allocate (z, in KB, 0 < x < z). Helpful for diagnosing the status of consumed resources.
• Task statistics selection: on the left is a prompt to select a task; on the right the user can select the task whose statistics and messages should be displayed.
• Task specific statistics: the content of this text area is directly implemented i
20. hes for the specified file in the available class paths (cf. ClassLoader.getResourceAsStream in the Java API Documentation [5]). If a resource with the specified name cannot be found, the server searches for the file in the current directory. It is safe and recommended to use the same configuration file for both server and client, especially when they share the same version of the code. It is not recommended to share a configuration file across different versions of the code, e.g. when the set of accompanying modules has changed. The DIXER Client uses the same method for reading the configuration file and for passing the name of this file (cf. 5.1).

In the configuration file each parameter is described in the following way:

<parameter> = <value>

Whitespace characters before and after the equals sign are not significant. Empty lines and lines that begin with the comment character are ignored. A description of each parameter is provided below.

3.3.1 Connection

• version (mandatory, for server and client): Specifies the version number of the software. Usually modified only by a developer who has made changes in the source code. No default value. Set to 2.0.1.
• server.port (mandatory, for server and client): IP port number on which the server listens for clients. No default value. Set to 7333. If there are any problems caused by a network firewall or user privileges, one can try to modify this value, e.g. to a port below 100, to a port above 4096, etc.
• server.clienttim
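The file format described in this excerpt (one <parameter> = <value> pair per line, insignificant whitespace around the equals sign, empty and comment lines ignored) can be illustrated with a small reader. This Python sketch is not the parser used by DIXER, which relies on java.util.Properties; it covers only the simple cases. The comment characters # and ! are assumed from the java.util.Properties convention, and the dotted parameter names in the sample are assumed spellings.

```python
def parse_conf(text, comment_chars="#!"):
    """Parse '<parameter> = <value>' lines into a dict of strings.

    Simplified on purpose: no line continuations, escapes or ':' separators,
    all of which the real java.util.Properties format also supports.
    """
    params = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Skip empty lines and comment lines.
        if not line or line[0] in comment_chars:
            continue
        if "=" not in line:
            continue  # not a parameter line in this simplified reader
        name, value = line.split("=", 1)
        params[name.strip()] = value.strip()
    return params


# Sample using the values listed for the connection parameters.
SAMPLE = """
# connection settings
version = 2.0.1
server.port=7333
   ! another comment style accepted by java.util.Properties
"""
conf = parse_conf(SAMPLE)
```

Values are kept as strings here; a caller would convert numeric parameters such as the port with int() where needed.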
21. inish their jobs related to this task, if any. The task is permanently destroyed when there are no more jobs related to it (see also the server.clienttimeout parameter in the configuration file).
• Configure task button: pressing this button will open the task-specific window related to the currently selected task. If this option is not implemented in a pluggable module, the standard window appears: Task configuration not available.

4.3 Creating new tasks

By pressing Create task it is possible to create a new task, i.e. a new instance of a task type. The newly created task will be executed on the server and will then schedule jobs for connected clients. After the Create task button is pressed, the task selection dialog appears, as in figure 4.2. When the selection is approved by pressing the Create button in this dialog (fig. 4.2), the task is created. For some task types this may be followed by a task-specific configuration dialog, as is the case for e.g. the Branch n Bound and RSParallel modules. After successful creation of a new task, the first available task number is assigned to it and the task appears in the task selector in the main server window (cf. 4.2).

4.4 Monitoring server status

4.4.1 Server status

If the server works fine, it shows how long it has been running (uptime) and the current memory usage at the bottom of the window (cf. 4.2). These fields are updated every refresh-rate seconds.
22. Figure 6.1: Creating the Branch n Bound task in DIXER Server.

6.3 Task creation

The create dialog for this task is presented in figure 6.1. To configure this type of task it is necessary to provide the following settings:
• Results file: the file where the results (evaluated elements of the subset lattice) are stored.
• Train file: the decision table used for training the decision tree.
• Test file: the decision table used for evaluating the induced decision tree (it can be the same as the training file).
• Attributes mask file: the plain list of attributes (each attribute on a separate line) that should be used in the searched lattice of subsets.
• Maximal number of attributes: the maximal number of attributes used in an evaluated subset, i.e. an upper cut of the lattice.
• Maximal number of attempts: the same as the maxfailed parameter.

6.4 Results file

The results file contains the already computed nodes of the subset lattice. A node can be not opened, opened or closed. A not opened node is a node that has not been evaluated. An opened node is a node for which the system has already computed the classifier accuracy for the corresponding subset of attributes. A closed node is a node for which all successors (subsets that have at most one attribute more) are eithe
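To make the lattice terminology in this excerpt concrete, the following Python sketch reads an attributes mask list and enumerates the successors of a node, i.e. the subsets with one more attribute, respecting the maximal number of attributes. It is an illustration only and not the module's implementation; the function names and the sample attribute names are hypothetical, and the successor direction (adding an attribute) is an interpretation of the text.

```python
def read_attributes(text):
    """Read a plain attribute list: one attribute per line, blanks skipped."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def successors(subset, attributes, limit):
    """Return the subsets obtained by adding exactly one unused attribute.

    `limit` plays the role of the maximal number of attributes: nodes that
    already contain `limit` attributes have no successors (the upper cut
    of the lattice).
    """
    current = frozenset(subset)
    if len(current) >= limit:
        return []
    return [current | {a} for a in attributes if a not in current]


# Hypothetical attribute names, one per line as in an attributes mask file.
attrs = read_attributes("sepal_length\nsepal_width\n\npetal_length\n")
next_nodes = successors(["sepal_width"], attrs, limit=2)
at_cut = successors(["sepal_width", "petal_length"], attrs, limit=2)
```

Here next_nodes holds the two 2-attribute subsets that extend the starting node, and at_cut is empty because the starting node already sits at the cut.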
23. le dixer.conf as task<N>.serverclass and task<N>.clientclass respectively (see sec. 3.3 for details).

6.2 Module specific parameters

• branchnbound.resultsfilename (optional, for server only): Default name for the results file. Useful in serial or batch experiments. No default value. Set to bnb_results.txt.
• branchnbound.trainfilename (optional, for server only): Default name for the training set file. Useful in serial or batch experiments. No default value. Set to data/iris.trn.
• branchnbound.testfilename (optional, for server only): Default name for the testing set file. Useful in serial or batch experiments. No default value. Set to data/iris.tst.
• branchnbound.attributefilename (optional, for server only): Default name for the attribute list file. Useful in serial or batch experiments. No default value. Set to data/bnb_irisattr.txt.
• branchnbound.attributelimit (optional, for server only): No default value. Set to 50.
• branchnbound.maxfailed (optional, for server only): Maximal number of failures allowed for a job before it is dropped permanently. When a job evaluation fails, the job is inserted at the end of the queue of currently scheduled jobs, but only up to the number of times specified here. Default value: tasks.maxfailed. Set to 5.
24. Figure 1.1: DIXER Server

Figure 1.2: DIXER Client

Usually the DIXER Server consumes few system resources, both CPU time and system memory. Exact values, however, depend on the current work. The DIXER Server can also require some additional files on disk that are used by pluggable modules as data files needed in experiments.

1.3 DIXER Client

DIXER consists of two applications: Server and Client. The DIXER Client should be run on all computers that are to cooperate in distributed computations. The DIXER Server should be run only once, on the administrator's computer. The DIXER Client executes all experiments scheduled by the DIXER Server. It contains a simple graphical user interface in which one can supervise the program status (see fig. 1.2). On demand this GUI can be disabled, and then the DIXER Client runs completely silently. Usually the DIXER Client consumes a lot of system resources, both CPU time and system memory. Exact
25. les for the DIXER Server.

The batch script files allow the execution of the DIXER software to be customized. The most important customization is the amount of memory available to the DIXER Client (see the files client.bat and runclient.bat, which executes the DIXER Client in an infinite loop). The files rsp_rules.txt and rsp_decomptree.txt in the directory server are the example scripts for the RSParallel pluggable module. The data subdirectory of server contains the data files for these examples and also an example file for the Branch n Bound pluggable module. The default settings of these pluggable modules make use of the provided example files.

2.3 Java Runtime Environment

To properly run any of the DIXER software components, a Java Runtime Environment, version 1.4 or later, is required (cf. [5, 6]). The Java Runtime Environment (JRE) is also included in any Java Software Development Kit (JDK or J2SE SDK). The JRE or JDK can be downloaded directly from http://java.sun.com.

[Figure 2.1: DIXER Setup program]

The DIXER Software
26. n a selected pluggable module. The user should read the information displayed here carefully, because it reflects the current status of computations for the selected task.

• Report on connected clients: the DIXER Clients currently connected to the server are displayed, identified by dictionary name, network name or IP address. Some additional data is also provided, such as the number of processed jobs, the approximate time spent on computations and the last communication time.
• Update now button: pressing this button updates the content of the window immediately.
• Shutdown button: pressing this button kills the server.
• Create task button: pressing this button opens a dialog where the user can choose a task type for the newly created task.

Footnote 1: It should be stressed here once more that there can be more than one task related to the same task type (pluggable module). All of them operate separately and display different statistics and messages.

[Figure 4.2: Selecting a task to create]

[Figure 4.3: The error message displayed when the user executes the DIXER Server more than once: "Error in creating server port 7333 DIXER Server is already running or you have insufficient priviliges"]

• Dismiss task button: pressing this button stops the currently selected task. The task is not killed immediately. It waits for all clients to f
27. n tasks.maxfailed. See the User Manual of the particular module for details.

3.3.4 Command specification

Parameters described in this section specify the binary codes of the commands that server and client send to each other. It is very important that client and server work with identical codes.

• command.echo (mandatory, for server and client): Code for the command requesting an identification string with version number from a client. No default value. Set to 1.
• command.kill (mandatory, for server and client): Code for the command requesting a client to die. No default value. Set to 2.
• command.verifyfile (mandatory, for server and client): Code for the command requesting verification of a particular file. No default value. Set to 3. Verification consists of checking the file name, the file's existence and its length, and comparing the CRC16 of the first 4 KB of the file.
• command.transferfile (mandatory, for server and client): Code for the command requesting that a file be stored, relative to the current directory or, if directly requested, not relative to it. No default value. Set to 4.
• command.servermessage (mandatory, for server and client): Code for the command displaying a message in the client window. No default value. Set to 5.
• command.answer (mandatory, for server and client): Stamp code of the answer to a command. No default value. Set to 65536.
• command.<task 1 name> (mandatory, for server and client): Code for the command related to task 1. If task 1 is specified as branchnbound, then this paramete
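Because the server and the client must use identical command codes, it can help to compare the two configuration files before starting a cluster. The following is only a minimal sketch: it assumes a `key = value` line syntax, dotted parameter names such as `command.echo`, and `#` as a comment prefix, none of which is confirmed by this manual. The function names are hypothetical helpers, not part of DIXER.

```python
def parse_command_codes(text):
    """Collect 'command.*' entries from key=value style configuration text.

    Assumption: one 'key = value' pair per line and '#' starts a comment.
    The real dixer.conf syntax may differ.
    """
    codes = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key.startswith("command."):
            codes[key] = int(value)
    return codes


def find_mismatches(server_text, client_text):
    """Return {key: (server_code, client_code)} for every differing command."""
    server = parse_command_codes(server_text)
    client = parse_command_codes(client_text)
    return {
        key: (server.get(key), client.get(key))
        for key in sorted(set(server) | set(client))
        if server.get(key) != client.get(key)
    }


# Illustrative configuration fragments (values taken from the list above,
# except the deliberately wrong client code for branchnbound).
SERVER_CONF = """\
command.echo = 1
command.kill = 2
command.answer = 65536
command.branchnbound = 100
"""

CLIENT_CONF = """\
command.echo = 1
command.kill = 2
command.answer = 65536
command.branchnbound = 101
"""

print(find_mismatches(SERVER_CONF, CLIENT_CONF))
# prints {'command.branchnbound': (100, 101)}
```

An empty dictionary means that both files agree on every command code that either of them defines.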
28. nd can take from a couple of minutes up to weeks (usually 1-100 hours). Each pluggable module implements a mechanism to restart from the point of interruption without repeating already computed jobs. This is extremely important in the case of, e.g., a power or computer failure.

Job: Created tasks are not elementary and consist of jobs, which can be interpreted as elementary operations on the DIXER Cluster. These jobs are sent to clients and are visualized in the DIXER Client GUI. The computation time of each job varies and can range from a couple of seconds up to about two hours (usually about five minutes). If there is any failure during job evaluation, the results are dropped and the job returns to the queue of scheduled jobs.

Footnote 1: A pluggable module can be specified as a task type multiple times. The resulting task types will be identified by different names (labels, cf. section 3.3), but they will share all module specific configuration parameters.
Footnote 2: This is at least true for all modules that are distributed with the DIXER software.
Footnote 3: It returns only a specified finite number of times, to avoid permanently blocking the jobs queue.

[Figure: DIXER Server version 2.0.1 main window]
29. nning server checks the client version with an echo command. As a result, one of the following messages will appear in the Server messages: "Client version ok" if the client version agrees with the server version, or "Client version obsolete" otherwise.

In general, the client operates under one of two conditions: either there are jobs to compute or there are none. If there are jobs to compute, the current job is described in the client window as described in the previous section (cf. fig. 5.3). If there are no more jobs, the server informs the client by a free text message (cf. fig. 5.2). How frequently the server informs a client can be changed in the configuration file.

If communication with the server is broken, e.g. due to a server shutdown, the client can notice this only while it is communicating with the server. During computations communication is disabled to avoid network overload and time delays. This means that if the communication is broken during computations, the client will notice it only after finishing its computations, whether successfully or with an error.

[Figure 5.3: DIXER Client]

Chapter 6
30. oes not depend on the experiment mode. In both Train & Test and Cross Validation experiments any classifier can be used. The keywords are not case sensitive.

Rule based classifiers

RULES <algorithm> <discretization> <shortening factor>

• Algorithm: There are three different algorithms:
  1. ALL: induces all rules
  2. COV: induces a minimal covering set of rules
  3. GEN: induces a genetically optimized set of rules
• Discretization: There are four different discretization (scaling) methods:
  1. NO_SCAL: no discretization at all; tables should contain only symbolic attributes
  2. GLOBAL_SCAL: global discretization; in the current version of RSES Lib available only on the MS Windows platform
  3. LOCAL_SCAL: optimized local discretization
  4. LOCAL_SCAL_WS: optimized local discretization with additional discretization (clustering) of symbolic attributes
• Shortening factor: A number between 0.0 and 1.0 that adjusts the shortening of induced rules. A value of 1.0 means no shortening and 0.0 means empty rules.

Decomposed rules classifier

Decision tables can be decomposed by a template based decomposition tree before inducing the decision rules. This is advised especially for huge decision tables, for which standard rule induction takes unacceptably long.

DECOMP_RULES <leaf size> <discretization> <shortening factor>

• Leaf size: The maximal le
31. 5 Operating Client
5.1 Command line parameters
5.2 (title not recoverable)
5.3 Monitoring client status
6 Branch n Bound Module
6.1 Introduction
6.2 Module specific parameters
6.3 Task creation
6.4 Results file
7 RSParallel Module
7.1 Introduction
7.2 Module specific parameters
7.3 Script language description
7.3.1 Experiment Type
7.3.2 Data Tables
7.3.3 Experiment Parameters
7.3.4 Examples
7.4 Task creation
7.5 Results file
8 FAQ
Bibliography

Chapter 1 Introduction

1.1 DIXER

DIXER (Distributed eXEcutoR) is a part of the Rough Set Exploration System (RSES) [2]. DIXER is designed to help experimenters who would like to run their experiments on many computers. DIXER makes it possible to employ computers connected by an intranet or the internet and to distribute experiments among them automatically. It creates a virtual cluster of computers for this particular purpose. DIXER takes care of all aspects of scheduling, communication and data transfer, so i
32. r opened or closed. Opened and closed nodes are stored in the results file in the following form:

• o=0 0 1 1 1;0.9 (an opened node)
• c=0 0 1 1 1;0.9 (a closed node)

The letter at the beginning of the line indicates whether a node is opened (o) or closed (c). After the equals sign, the node description is written as the characteristic function of the used attributes: 1 means that the corresponding attribute was used in classifier induction, while 0 means that it was not. At the end of the line, after the semicolon, the accuracy of the classifier is written. The accuracy is measured as the number of answers compatible with the test decision table.

Chapter 7 RSParallel Module

7.1 Introduction

The RSParallel DIXER pluggable module allows experiments to be carried out based on methods implemented in the Rough Set Exploration System (RSES), and particularly in RSES Lib (cf. [2]). It processes a special script language, described in section 7.3, that allows an individual scenario for experiments to be prepared. It is strongly recommended to prepare scripts in some semi-automatic manner. There are a lot of suitable tools for this task, such as awk (or gawk, the GNU Project's implementation of the AWK programming language), Microsoft Excel or the OpenOffice spreadsheet editor, or any programming language that supports text string manipulation and I/O operations. The RSParallel module is implemented in the package rslib.dix
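The Branch n Bound results file described above lends itself to simple text processing. The sketch below reads node lines under an assumed layout `<o|c>=<0/1 flags>;<accuracy>`; the equals sign and the semicolon are named in the text, but the separator between the flags is a guess, so spaces, commas and no separator are all accepted. The function names are hypothetical.

```python
import re


def parse_node_line(line):
    """Parse one results-file line into (state, attribute mask, accuracy).

    Assumed layout: <o|c>=<0/1 flags>;<accuracy>. The flag separator is
    not known for certain, so every 0 or 1 after '=' is taken as one flag.
    """
    head, accuracy = line.strip().split(";", 1)
    letter, flags = head.split("=", 1)
    letter = letter.strip().lower()
    if letter not in ("o", "c"):
        raise ValueError(f"unknown node marker: {letter!r}")
    mask = [int(bit) for bit in re.findall(r"[01]", flags)]
    state = "opened" if letter == "o" else "closed"
    return state, mask, float(accuracy)


def best_closed_node(lines):
    """Return the closed node with the highest accuracy, or None."""
    closed = [node for node in map(parse_node_line, lines) if node[0] == "closed"]
    return max(closed, key=lambda node: node[2], default=None)


# Made-up sample lines in the assumed layout.
sample = [
    "o=0 0 1 1 1;0.9",
    "c=0 0 1 1 1;0.9",
    "c=1,0,1,0,0;0.75",
]
print(best_closed_node(sample))
# prints ('closed', [0, 0, 1, 1, 1], 0.9)
```

The attribute mask can then be matched against the attribute list file to recover the names of the selected attributes.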
33. r should be written as command.branchnbound. No default value. Set to 100.

Footnote 8: The prefix name for the task specific parameters is encoded in the source code, while the name for a task type can be chosen arbitrarily in the configuration file. It is recommended to use the same name for both purposes to keep the configuration file clear.
Footnote 9: Depends on the configured task type name.

• command.<task 2 name> (mandatory, for server and client): Code for the command related to task 2. If task 2 is specified as rsparallel, then this parameter should be written as command.rsparallel. No default value. Set to 101.

3.3.5 Debugging and verbosity

• config.warnings (optional, for server and client): Whether warnings should be displayed on the Java console (yes or no). Default value: no. Set to yes.
• config.debug (optional, for server and client): Whether debug messages should be displayed on the Java console (yes or no). Default value: no. Set to yes.

Chapter 4 Operating Server

The DIXER Server should be running only once, on the administrator's computer. It manages all clients that are reachable in the network in use. It is possible to safely run more than one server if they operate on different computers or on different port numbers (see 3.3). In such a case, a server will manage only those clients that have its host address selected as the server address and operate on the same port num
34. s file is attached to each distribution of the DIXER software, so there is no need, especially for beginners, to alter the configuration file.

3.2 Client Configuration

The configuration of the DIXER Client consists of two stages. The first stage concerns the selection of a server in an intranet or the internet. This configuration is of interest to all users. The second stage concerns advanced properties of the client itself and of pluggable modules, and can be left to experienced users or an administrator. The first stage of configuration can be done through a graphical user interface. The advanced properties are accessible in a textual configuration file. A default version of this file is attached to each distribution of the DIXER software, so there is no need, especially for beginners, to alter the configuration file.

3.2.1 Server address

The DIXER Client requires the address of a DIXER Server that will schedule jobs for remote execution. One cannot execute the DIXER Client without specifying a server to connect to. The user can define the server name or address in two ways:
1. as a command line parameter to the client class,
2. in a file named dixer.connection in the current directory.

If the command line parameter is specified, then it is used first. If the client class is started without a parameter, then an attempt is made to read the settings from the dixer.connection file in the current directory. If this also fails, then the program will ask about the server name or address by
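The lookup order for the server address (command line parameter, then the dixer.connection file, then asking the user) can be illustrated as follows. This is only a sketch of the precedence, not the client's actual Java code; it assumes that the connection file holds nothing but the address as plain text, which this manual does not state, and `resolve_server_address` and `ask` are hypothetical names.

```python
from pathlib import Path


def resolve_server_address(argv, directory=".", ask=None):
    """Pick the server address using the precedence described above.

    argv      -- command line parameters passed to the client
    directory -- directory searched for the dixer.connection file
    ask       -- optional callable used as the last resort (e.g. a dialog)
    """
    if argv:  # 1. a command line parameter wins
        return argv[0]
    connection_file = Path(directory) / "dixer.connection"
    try:  # 2. otherwise try the connection file
        text = connection_file.read_text().strip()
        if text:
            return text
    except OSError:
        pass  # a missing or unreadable file falls through to step 3
    if ask is not None:  # 3. finally, ask the user
        return ask()
    raise RuntimeError("no server address available")


print(resolve_server_address(["localhost"]))
# prints localhost
```

Passing the directory and the `ask` callback as arguments keeps the precedence logic easy to exercise without a real dialog.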
35. t requires no additional network services such as ftp or nfs. It is mainly designed for experiments based on the RSES Lib software, but its open architecture allows new modules to be plugged in without modification of the source code. DIXER consists of two applications: Server and Client. This manual describes how to use the DIXER Server, the DIXER Client and the pluggable modules Branch n Bound and RSParallel. See also the DIXER Module Developers Guide.

1.2 DIXER Server

DIXER consists of two applications: Server and Client. The DIXER Client should be run on all computers that are to cooperate in distributed computations. The DIXER Server should be run only once, on the administrator's computer. The DIXER Server schedules experiments (jobs) for all running DIXER Clients. It contains a graphical user interface that allows the whole DIXER Cluster to be managed (see fig. 1.1).

To clarify some concepts and make this manual more precise, we specify here the distinction between jobs, tasks and task types.

Task type: At DIXER Server runtime there is a fixed number of task types. They can be more or less identified with the installed pluggable modules. Task types are configured in a configuration file named dixer.conf, which is described in detail in section 3.3.

Task: Tasks are created and dismissed dynamically by the user during runtime. The user can create a new task of any previously specified task type. There can be more than one task of a particular task type. The computation time of each task varies a lot a
36. t values, however, depend on the current work scheduled by the DIXER Server. The DIXER Client can also require some disk space in order to store data files needed in experiments.

Chapter 2 Installation

2.1 Setup program

For convenience, the DIXER Software is provided with an MS Windows setup program (see fig. 2.1). This program greatly simplifies the installation procedure. In a few mouse clicks the user can install the full DIXER Software package: this documentation, sample files and shortcut icons that execute the DIXER Server and Client. For special purposes, similar setup programs are also provided separately for the DIXER Server and the DIXER Client. The DIXER Client package is sufficient for workstations that will be used only as working elements of the DIXER Cluster. The DIXER Server package contains this documentation and the files necessary to run the DIXER Server and manage the cluster of working elements.

2.2 Package content

The full package of the DIXER Software contains:
1. This DIXER User Manual
2. dixer.jar: the binary archive with the DIXER Software
3. RS Lib: the computational core of the Rough Sets Exploration System (RSES)
4. Examples for the provided pluggable modules
5. A set of shortcuts and batch scripts for executing different parts of the DIXER Software

The DIXER Server package does not contain the RS Lib computational core or the scripts for the DIXER Client part. The DIXER Client package does not contain this manual, scripts and examp
37. t2.tab that uses the genetic decision rules induction algorithm as a classifier. The data will be discretized globally and rules will be shortened by a factor of 0.9.

• tt train2.tab test2.tab rules all local_scal_ws 0.6
  The Train & Test experiment with training table train2.tab and testing table test2.tab that uses the "all decision rules" induction algorithm as a classifier. The data will be discretized locally, with clustering of all symbolic attributes, and rules will be shortened by a factor of 0.6.
• tt train3.tab test3.tab rules all no_scal 0.5
  The Train & Test experiment with training table train3.tab and testing table test3.tab that uses the "all decision rules" induction algorithm as a classifier. The data will not be discretized and rules will be shortened by a factor of 0.5.
• tt my_tab.rses my_tab.rses decomp_rules 100 true 1.0
  The Train & Test experiment with a decomposition tree. The my_tab.rses decision table will be decomposed into subtables of no more than 100 objects. The resulting subtables will be used to induce minimal covering decision rules with discretization and shortening factor 1.0.

[Figure 7.1: Creating the RSParallel task in DIXER Server]

7.4 Task creation

The create
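Script lines like the examples above are tedious to write by hand when many parameter combinations are needed, which is why preparing scripts semi-automatically is recommended. The following sketch generates Train & Test lines for the rule based classifier. It assumes that the parts of a line are separated by single spaces and that file names and factors are written with dots, as in `train2.tab` and `0.6`; the function name is hypothetical.

```python
from itertools import product

# Keywords of the RULES classifier as listed in the script language description.
ALGORITHMS = ("all", "cov", "gen")
DISCRETIZATIONS = ("no_scal", "global_scal", "local_scal", "local_scal_ws")


def rules_script_lines(train, test, algorithms, discretizations, factors):
    """Build one 'tt ... rules ...' script line per parameter combination."""
    lines = []
    for algorithm, discretization, factor in product(
        algorithms, discretizations, factors
    ):
        if algorithm not in ALGORITHMS:
            raise ValueError(f"unknown algorithm: {algorithm}")
        if discretization not in DISCRETIZATIONS:
            raise ValueError(f"unknown discretization: {discretization}")
        if not 0.0 <= factor <= 1.0:
            raise ValueError(f"shortening factor out of range: {factor}")
        lines.append(
            f"tt {train} {test} rules {algorithm} {discretization} {factor}"
        )
    return lines


for line in rules_script_lines(
    "train2.tab", "test2.tab", ["all", "gen"], ["local_scal_ws"], [0.6, 0.9]
):
    print(line)
# prints:
# tt train2.tab test2.tab rules all local_scal_ws 0.6
# tt train2.tab test2.tab rules all local_scal_ws 0.9
# tt train2.tab test2.tab rules gen local_scal_ws 0.6
# tt train2.tab test2.tab rules gen local_scal_ws 0.9
```

The generated lines can be written to a text file and used as the script file of an RSParallel task.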
38. the classifier performance. If the computations failed, the error message is written here.

Chapter 8 FAQ

• Q: Neither the DIXER Server nor the DIXER Client starts at all.
• A: Make sure that a Java Runtime Environment, version 1.4 or later, is installed on this computer. Make sure that the installation is not corrupted and that the java executable is in the executable path. If necessary, modify the PATH environment variable.
• Q: The DIXER Client shows the following message: "DIXER Cluster failure Incompatibile DIXER Server Client or Java Virtual Machine version".
• A: Make sure that you are using Java version 1.4 or higher. If you are, this DIXER Client comes from another, incompatible distribution of the DIXER Software. Reinstall the DIXER Client in the release version compatible with the DIXER Server.
• Q: The DIXER Client cannot compute even one job.
• A1: Try to increase the maximal memory available to the JVM by executing it with the -Xmx parameter (footnote 1). For example: java -Xmx256M rslib.dixer.client.Client
• A2: The DIXER Client cannot find the dynamic shared library RSLib. Please verify the location of the file RSLib.dll or RSLib.so and adjust the -Djava.library.path JVM parameter.

Footnote 1: See java -X for details. The -X options are non-standard and subject to change without notice.

Bibliography

[1] J. G. Bazan, M. Mikołajczyk, and M. Szczuka. Rough Set Exploration System, version 2.0: User Manual. Warsaw University, 2003.
39. ts across a cluster of computers. This script language is interpreted line by line from the beginning of the file to its end, and each line contains the description of a separate experiment.

A line that describes an experiment consists of three main parts:

<Experiment Type> <Data Tables> <Experiment Parameters>

The meaning of these three parts is explained in the following sections. Besides the experiment description lines, the script language allows special lines that are insignificant from the experimental point of view:

• Comment: A line is a comment line if it begins with the comment character or if it is empty.
• Echo: If a line begins with the string echo, the whole content of the line is copied to the output log file. This makes it possible to add comments to the log file.

7.3.1 Experiment Type

There are two experiment types in this version of the script language:

• TT (Train & Test mode): The experiment will be executed in Train & Test mode. This means that the user should supply one data set (table) for training the classifier and a separate data set (table) for testing it.
• CV (Cross Validation mode): This mode is not supported in version 2.0. The experiment will be executed in Cross Validation mode. Cross Validation uses only one data set for training and testing. This set is partitioned into a number (say N) of subsets of possibl
40. y equal size, and in N steps each of these subsets in turn is taken as the testing set. The remaining N-1 subsets are taken as the training set, so that the same subset is never used for both training and testing in the same step. After all N steps, each part of the original data set has been used for both training and testing.

These keywords are not case sensitive.

7.3.2 Data Tables

The second part of the experiment description specifies the data sets (tables). The specification of the data sets depends on the mode in which the experiment will be carried out.

• If Train & Test was chosen as the experiment mode, then it is necessary to specify a file name for the training table and for the testing table: my_training.tab my_testing.tab. The data tables should be written in the RSES Data Table Format described in [1].
• If Cross Validation was chosen as the experiment mode, then it is necessary to specify the file name of the one table used for both training and testing, and the number of steps (folds). For example, my_data.tab 5 means that 5-fold Cross Validation will be performed on the my_data.tab data file.

The rules of the host operating system determine whether file names are case sensitive. To achieve portability, however, one should assume that file names are case sensitive.

7.3.3 Experiment Parameters

The description of the experiment parameters is the most important part, as it describes the classifier to be used. This description d
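The N-step Cross Validation scheme can be made concrete with a short sketch. It splits object indices into N contiguous folds of nearly equal size and, in each step, uses one fold for testing and the rest for training. How RSES actually partitions the data (for example, whether it shuffles objects first) is not described here, so the contiguous split is an assumption and the function name is hypothetical.

```python
def cross_validation_folds(n_objects, n_folds):
    """Yield (train_indices, test_indices) for each of the N steps.

    Folds are contiguous and differ in size by at most one object.
    """
    if not 1 < n_folds <= n_objects:
        raise ValueError("need 1 < n_folds <= n_objects")
    indices = list(range(n_objects))
    base, extra = divmod(n_objects, n_folds)
    start = 0
    for fold in range(n_folds):
        # The first `extra` folds receive one additional object.
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        yield train, test
        start += size


for train, test in cross_validation_folds(7, 3):
    print(test, train)
# prints:
# [0, 1, 2] [3, 4, 5, 6]
# [3, 4] [0, 1, 2, 5, 6]
# [5, 6] [0, 1, 2, 3, 4]
```

Every object lands in exactly one testing fold, and no object is used for both training and testing within the same step.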