Temporal Discovery Workbench (TDWB)

Contents

1. [...], p. 78
Figure 6.4: Create new project, project properties dialog, p. 79
Figure 6.5: Add a data file dialog, p. 80
Figure 6.6: Add a data file dialog with file [...], p. 81
Figure 6.7: Set variables dialog, p. 82
Figure 6.8: Data files panel, p. 83
Figure 6.9: Continuous data graph, p. 84
Figure 6.10: Discrete data graph, p. 84
Figure 6.11: Analysis panel, p. 85
Figure 6.12: New elementary pattern, p. 85
Figure 6.13: Add new composite pattern dialog, p. 86
Figure 6.14: Add new combinatory pattern dialog, p. 86
Figure 6.15: Pattern discovery, p. 87
Figure 6.16: Pattern
2. [...], p. 57
Figure 5.2: New [...], p. 61
Figure 5.3: Add data segment dialog, p. 62
Figure 5.4: Variables, p. 62
Figure 5.5: Data panel, p. 63
Figure 5.6: Continuous graph, p. 63
Figure 5.7: Discrete graph, p. 64
Figure 5.8: Pattern discovery panel, p. 64
Figure 5.9: And pattern edit dialog, p. 65
Figure 5.10: Combinatory pattern dialog, p. 65
Figure 5.11: Pattern matching report, p. 66
Figure 5.12: Model UML, p. 67
Figure 6.1: Use workflow, p. 69
Figure 6.2: Data segment overlapping, p. 70
Figure 6.3: Complex pattern
3. [...], p. 87
Figure 6.17: MVC request process, p. 88
Figure 6.18: Model UML, p. 89
Figure 6.19: Controller UML, p. 90
Figure 6.20: Chart library abstraction, p. 92
Figure 6.21: Analysis modules, p. 93
Figure 6.22: Pattern types, p. 94
Figure 3.1: Creating/editing a CSV file with OpenOffice.org Calc, p. 102
Figure 3.2: Date/time pattern letters, p. 103
Figure 3.3: Positive special event, p. 104
Figure 3.4: Negative special event, p. 104
Figure 4.1: New project, project properties, p. 105
Figure 4.2: Data segments
4. [...], p. 106
Figure 5.1: Add data file dialog, p. 107
Figure 5.2: Add data file dialog with data preview, p. 108
Figure 6.1: Set variables dialog, p. 109
Figure 6.2: Discrete ranges panel, p. 110
Figure 7.1: Data files panel, p. 111
Figure 7.2: Continuous data graph, p. 112
Figure 7.3: Discrete data graph, p. 112
Figure 7.4: Analysis report, p. 113
Figure 8.1: New elementary pattern, p. 115
Figure 8.2: New composite pattern, p. 116
Figure 8.3: New combinatory pattern, p. 117
Figure 8.4: Complex pattern, p. 118
Figure 8.5: Pattern discovery panel, p. 120
Figure 8.6: Pattern matching, p. 120
Figure 6.1: MVC request process, p. 124
FI
5. [Figure 2.6: JChart2D]
Page 25 TDWB
2.6.2.6 JChartLib
It has a big community and tutorials, but its website has few examples and the documentation is not free.
URL: http://sourceforge.net/projects/jchartlib
[Figure: JChartLib demo application]
6. [Figure 2.13: Deviation demo]
Annotations: These could be good for marking particular values.
[Figure 2.14: Annotations]
Box annotation: This is useful for colouring discrete graphs.
[Figure 2.15: Box annotation]
Marker: It is useful for marking the special events.
[Figure 2.16: Marker demo]
2.6.5.1 Support
There is a support forum and the API documentation is free. On the other hand, the developer documentation is not free. However, I found some tutorials:
http://www.vogella.de/articles/JFreeChart/article.html
The developer documentation of an older version:
http ktipsntricks com data ebooks java jfreechart 0 9 1 US v1 pdf
Some demos:
http www java2s com Code Java Chart Time Series Chart htm
http://www.java2s.com/Code/Java/Chart/JFreeChartTimeSeriesDemo8.htm
3 Outline design
This chapter describes an initial design of a possible solution to this problem.
3.1 General use workflow
Figure 3.1 shows the
7. [Figure 9.10: Discrete ranges panel, with the Add discrete range button]
The field Label is required and duplicated values are not accepted. The value in the field min must be less than or equal to the value in the field max. The discrete ranges of a variable cannot overlap. For example, if a variable has two discrete ranges, dr1 (min 0, max 50) and dr2 (min 40, max 70), then the system gives an alert when the Apply changes button of the dialog is pressed. When generating the discrete values for the data analysis, a continuous value belongs to a range if it is greater than or equal to Min and less than Max. When everything is defined, click on the Apply changes button to accept the changes.
9.1.6 Analysing the data
After loading all the data files and setting the variables, we can visually check the data and its discrete ranges in the Data Files tab, as seen in Figure 9.11.
[Figure 9.11: Data files panel]
In the right panel you can see the graphs and the values of the loaded data files. In th
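The validation rules above (required label, min not greater than max, no overlap, and membership as Min <= value < Max) can be sketched as follows. This is only an illustration under stated assumptions: the class name DiscreteRangeSketch and its methods are hypothetical and are not taken from the TDWB source code.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch of a discrete range: a label plus a half-open interval [min, max). */
final class DiscreteRangeSketch {
    final String label;
    final double min;
    final double max;

    DiscreteRangeSketch(String label, double min, double max) {
        if (label == null || label.trim().isEmpty()) {
            throw new IllegalArgumentException("Label is required");
        }
        if (min > max) {
            throw new IllegalArgumentException("min must be less than or equal to max");
        }
        this.label = label;
        this.min = min;
        this.max = max;
    }

    /** A continuous value belongs to the range if min <= value < max. */
    boolean contains(double value) {
        return value >= min && value < max;
    }

    /** Two ranges overlap if each one starts before the other one ends. */
    boolean overlaps(DiscreteRangeSketch other) {
        return this.min < other.max && other.min < this.max;
    }

    /** Returns one message per duplicated label and per overlapping pair of ranges. */
    static List<String> validate(List<DiscreteRangeSketch> ranges) {
        List<String> problems = new ArrayList<>();
        for (int i = 0; i < ranges.size(); i++) {
            for (int j = i + 1; j < ranges.size(); j++) {
                DiscreteRangeSketch a = ranges.get(i);
                DiscreteRangeSketch b = ranges.get(j);
                if (a.label.equals(b.label)) {
                    problems.add("Duplicated label: " + a.label);
                }
                if (a.overlaps(b)) {
                    problems.add("Ranges overlap: " + a.label + " and " + b.label);
                }
            }
        }
        return problems;
    }
}
```

With the ranges of the example, dr1 (0 to 50) and dr2 (40 to 70), validate reports one overlap, which corresponds to the alert shown by the dialog.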
8. [...], p. 100
9.1.2 [...], p. 100
9.1.3 Creating a new project, p. 104
9.1.4 Adding data files, p. 106
9.1.5 Selecting and configuring the variables to analyse, p. 108
9.1.6 Analysing the data, p. 110
9.1.7 Pattern discovery process, p. 114
9.1.8 [...], p. 121
9.1.9 Save and load patterns, p. 121
9.2 Maintenance manual, p. 122
9.2.1 Installing the system, p. 122
9.2.2 Compile/build the system, p. 122
9.2.3 Execute the program, p. 122
9.2.4 Dependencies, p. 122
9.2.5 Organisation of files, p. 123
9.2.6 Model view controller, p. 123
9.2.7 UML, p. 125
9.2.8 List of source code files, p. 126
9.2.9 Main procedures and methods, p. 133
9.2.10 Configuration files, p. 134
9.2.11 Directions for fut
9. 3.4 Data display
3.4.1 Chart library interface class, p. 37
3.5 Data analysis, p. 38
3.6 Pattern discovery, p. 38
3.6.1 Single time point pattern, p. 40
3.6.2 Single time point composite, p. 40
3.6.3 Temporal pattern, p. 40
3.6.4 Temporal composite pattern, p. 41
3.6.5 Combinatorial patterns, p. 41
3.6.6 Combinatorial M out of N, p. 42
3.7 Methodology and technologies, p. 43
3.7.1 Software development method, p. 43
3.7.2 Programming language, p. 45
3.7.3 Integrated Development Environment (IDE), p. 46
3.7.4 Repositories and backups, p. 46
3.7.5 Program versions, p. 47
4 Prototype 1: TDWB 0.1, p. 47
4.1 Design, p. 48
4.1.1 Use workflow, p. 48
4.1.2 Functionalities list, p. 48
4.1.3 [...], p. 50
4.1.4 UML, p. 55
4.2 Users evaluation, p. 55
10. [Table: comparison of the libraries JCCKit, JChart2D, JChartLib, JCharts, JFreeChart, JOpenChart and Ptplot against the criteria Time series, Discontinued TS, Export as image, Documentation, Tutorials, Community, Ease of use, Stable and Performance]
2.6.4 Chosen library
JFreeChart and JChart2D are the only ones that meet all the requirements. JFreeChart has better documentation, better examples and a more active community. I have tried it and the results are very good, so I have chosen JFreeChart as the graph library used by TDWB. I tried to draw a graph similar to the one generated by my own custom library, and the result, as seen in Figure 2.12, is much better than the one produced by my custom library.
[Figure 2.12: JFreeChart test]
2.6.5 Extra features
In addition, JFreeChart offers these interesting extra features:
Statistical functions like mean, median and correlation. These functionalities are not required for the current specifications but could be useful for further work.
Projected Values Test
11. values
Size of the Graphs
In the last version, the graph size was fixed to the dimensions of its containing panel. That was a problem when large Data Files were loaded into the system, because the graphs were compressed to fit the panel. Now the graph's dimensions are not fixed to the dimensions of its containing panel and grow according to the amount of loaded data and the selected variables. Scroll bars are provided to allow visualization of the entire graph. The JFreeChart graphs are designed to fill their containing Swing component and it is very difficult to customize their dimensions. Now, when large files are loaded, the separation between time stamps is more or less the same as when loading small files. But when large files are loaded, the graph labels deform. I have been looking for a straightforward solution but have not found one. It might be a good idea to use another chart library, such as JChart2D.
6.1.2.4 Data analysis
In this version, data analysis modules are included. There are two modules: the Number of Elementary Patterns module and the Value Changes module.
Number of Elementary Patterns: This module reports the frequency of all the elementary patterns, splitting the result for the positive data segments and the negative data segments.
Value Changes: This module reports all the variables' value changes, splitting the result for the positive data segments and the negative
12. N H H H H N N
If the pattern to be matched is A = L, then this results in 4 matches out of 10 possible matches:
A: N L L L L N N N H H H
We have 4/10 × 100 = 40% matches against this data segment.
If the pattern to match is Composite (T 0: A = L; T 1: B = H), the result is 2 matches in 9 possible matches:
A: N N L L L L N N N
B: N N H H H H N N N N
We have 2/9 × 100 = 22.22% matches against this data segment. There are 9 possible matches because the length of the data is 10 and the length of the pattern is 2, so 10 - 2 + 1 = 9 possible matches.
To determine if the pattern matches against the data segment, the user sets a threshold for the number of matches or for the percentage of matches in a data segment. For the patterns P1, P2 and P3, if the user sets the thresholds as in Table 6.1:
[Table 6.1: Pattern thresholds example]
the pattern matching algorithm will report a positive data segment match if it:
Finds 40% or more matches for the pattern P1, or
Finds 1 or more matches for the pattern P2, or
Finds 0.13% or more, or a number of 23 or more, matches for the pattern P3.
These thresholds can be defined in the Pattern Discovery panel, as displayed in Figure 6.15.
Recursive patterns
Now the composite patterns and the combinatory patterns can be composed of not only element
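The counting rule above (possible matches = data length - pattern length + 1, then a count or percentage threshold) can be sketched as follows. This is a minimal illustration, not the TDWB implementation: the class and method names are hypothetical, the sample sequences in main are invented for the sketch rather than copied from the example above, and the composite pattern is assumed to look one time point ahead (T+1).

```java
/** Hypothetical sketch: counting pattern matches over the discrete values of one data segment. */
final class MatchCountSketch {

    /** Possible matches = data length - pattern length + 1 (never negative). */
    static int possibleMatches(int dataLength, int patternLength) {
        return Math.max(0, dataLength - patternLength + 1);
    }

    /** Elementary pattern: the variable has the given label at a single time point. */
    static int countElementary(char[] values, char label) {
        int matches = 0;
        for (char v : values) {
            if (v == label) {
                matches++;
            }
        }
        return matches;
    }

    /** Composite pattern of length 2 (assumed form): a == labelA at T and b == labelB at T+1. */
    static int countComposite(char[] a, char labelA, char[] b, char labelB) {
        int windows = possibleMatches(Math.min(a.length, b.length), 2);
        int matches = 0;
        for (int t = 0; t < windows; t++) {
            if (a[t] == labelA && b[t + 1] == labelB) {
                matches++;
            }
        }
        return matches;
    }

    static double percentage(int matches, int possible) {
        return possible == 0 ? 0.0 : 100.0 * matches / possible;
    }

    /** The segment matches if the count threshold or the percentage threshold is reached. */
    static boolean segmentMatches(int matches, int possible, int minCount, double minPercent) {
        return matches >= minCount || percentage(matches, possible) >= minPercent;
    }

    public static void main(String[] args) {
        // Invented sample data: 10 discrete values per variable (N = normal, L = low, H = high).
        char[] a = "NNLLLLNNNN".toCharArray();
        char[] b = "NNNNNHHHHN".toCharArray();
        int elementary = countElementary(a, 'L');
        int composite = countComposite(a, 'L', b, 'H');
        System.out.println("A = L: " + elementary + " of " + possibleMatches(a.length, 1));
        System.out.println("Composite: " + composite + " of " + possibleMatches(a.length, 2));
    }
}
```

For the invented data, the elementary pattern gives 4 of 10 possible matches (40%) and the composite pattern gives 2 of 9 (about 22.22%). To use only one of the two thresholds, the other can be set to a value that is never reached, for example Integer.MAX_VALUE for the count.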
13. belongs to the two ranges, then the system has to deal with an ambiguity. The system has to provide an intuitive interface to determine the value ranges of each variable.
3.4 Data display
The system has to display the original values of the variables and the continuous and discrete versions of those values. JFreeChart is going to be used to display the graphs, and a JTable Swing component to display the data values. For the discrete graph, each value is going to be coloured. These colours describe the value ranges and are used as a visual aid. The special events have to be marked in the graph so that it is visible where a data segment finishes.
3.4.1 Chart library interface class
I am going to use the JFreeChart library as the graph library to display the data time series. This library may have some major bug or lack some functionality that I do not know about yet, or a new version may be distributed. The library therefore has to be easy to replace with another library, which is not necessarily a new version of JFreeChart. To resolve this, I am going to write an interface class as a proxy for the chart library functions. Then, to change the library, the proxy class is the only class that has to be modified.
3.5 Data analysis
The system provides algorithms to statistically analyse the data segments. This information should be useful for the analyst when determining the kind of patterns to develop. As stated in section 2.2, I am going to implement
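The proxy idea of section 3.4.1 can be sketched as follows. This is a minimal, self-contained illustration under stated assumptions: all type names are hypothetical, and the stand-in implementation only returns text, where a real implementation would delegate to JFreeChart and return a Swing component.

```java
import java.util.List;

/** Hypothetical interface: the only chart-related type that the rest of the system uses. */
interface ChartLibrarySketch {
    /** Returns a textual description of the graph (a stand-in for a real chart component). */
    String drawTimeSeries(String title, List<Double> values);
}

/** Stand-in implementation; a real one would call the chosen chart library here. */
final class TextChartLibrary implements ChartLibrarySketch {
    @Override
    public String drawTimeSeries(String title, List<Double> values) {
        StringBuilder sb = new StringBuilder(title).append(": ");
        for (double v : values) {
            sb.append(v).append(' ');
        }
        return sb.toString().trim();
    }
}

/** Proxy used by the system: replacing the library only changes the delegate built here. */
final class ChartProxySketch {
    private final ChartLibrarySketch delegate;

    ChartProxySketch() {
        this(new TextChartLibrary());
    }

    ChartProxySketch(ChartLibrarySketch delegate) {
        this.delegate = delegate;
    }

    String drawTimeSeries(String title, List<Double> values) {
        return delegate.drawTimeSeries(title, values);
    }
}
```

Because callers depend only on ChartProxySketch, switching to a different library means writing another implementation of the interface and changing the default constructor of the proxy.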
14. namely the character that delimits the values in the CSV file to be loaded.
2. Click on the button Select a file to load the CSV file.
3. If the format of the file is correct (with a time variable which has time stamps in the correct format, with a special event variable with one or more positive and/or negative special events, and without duplicated variable names), then the data is previewed in the File preview section, as seen in Figure 9.8.
[Figure 9.8: Add data file dialog with data preview]
4. If the system detects more than one possible time variable and/or more than one possible special event variable, you have to choose which variables are the correct ones.
5. When everything is as desired, click on the button Add data file to add the data file into the system. After this, the system will prompt you about adding more data files, setting var
15. 4.2.1 Proposed changes, p. 55
5 Prototype 2: TDWB 0.2, p. 56
5.1 Design, p. 56
5.1.1 Use workflow, p. 56
5.1.2 Functionalities list, p. 57
5.1.3 [...], p. 61
5.1.4 UML, p. 67
5.2 Users evaluation, p. 67
5.2.1 Proposed changes, p. 67
6 Final version: TDWB 1.0, p. 68
6.1 Design, p. 69
6.1.1 Use workflow, p. 69
6.1.2 Functionalities list, p. 69
6.1.3 [...], p. 79
6.1.4 Model view controller, p. 87
6.1.5 UML, p. 89
6.2 Implementation, p. 90
6.2.1 Generate the analysis data, p. 90
6.2.2 Analysing the data, p. 90
6.2.3 Generating composite patterns in a combinatory pattern, p. 91
6.2.4 Pattern matching, p. 91
6.3 Optimizations, p. 91
6.3.1 Pre-processing combinatory [...], p. 91
6.4 Scalability
16. Goals
1.5 Structure of [...]
2 Background
2.1 Specifications
2.2 APRIORI algorithm
2.3 Initial requirements
2.3.1 Functional
2.3.2 Non-functional
2.4 Similar approaches
2.5 Constraints
2.6 Choosing Java graphic time series library
2.6.1 Requirements needed
2.6.2 Reviewed libraries
2.6.3 Comparative
2.6.4 Chosen library
2.6.5 Extra features
3 Outline design
3.1 General use workflow
3.2 [...] management
3.3 Variables
17. Page 137
18. SpringLayouts: This file has been developed by Oracle (footnote 1).
Class used by JFreeChartLibrary.java to create a time series.
Helper class to represent a discrete range in the view layer.
Helper class to represent a pattern in the view layer.
Helper class to represent a variable in the view layer.
VariablesDialog.java: The dialog used to edit the variables.
ViewStyle.java: Constants used by all the view layer classes to format their components.
ViewUtils.java: Functions used by all the view layer classes to format their components and to show confirmation, warning and error dialogs.
Footnote 1: http://docs.oracle.com/javase/tutorial/uiswing/examples/layout/SpringGridProject/src/layout/SpringUtilities.java
9.2.9 Main procedures and methods
9.2.9.1 Generate the analysis data
In the class DataFile there is a method that transforms the raw variable data from the CSV file into analysable data, generating the smoothed continuous values and the discrete values.
9.2.9.2 Analysing the data
The Analyser.java file has a method that analyses the data and returns the result as the analysis report.
9.2.9.3 Generating composite patterns in a combinatory pattern
After a combinatory pattern is created, its composite patterns must be generated. The methods that generate the composite patterns are in the file CombinatoryPattern.java.
9.2.9.4 Pattern matching
The file Pattern.java contains the methods that dete
19. 5.2.1.1 Data management
Allow multiple special events per file.
Step-by-step wizard to load files and select the variables to be used in the analyses and pattern generation.
5.2.1.2 Variables
Edit variables in a dialog, not in a tab panel.
Make the Variables edit system easier to use.
Enhance the discrete ranges labelling system.
Grey colour for new discrete ranges.
5.2.1.3 Pattern discovery
Pattern matching thresholds.
More sophisticated patterns.
6 Final version: TDWB 1.0
After developing several prototypes, this is the final and most complete version of the system. There are some major changes to the system, such as a Model-View-Controller structural pattern and functionality to load multiple Special Events from a Data File. Additionally, there is the abstraction of the graph library, of the analysis modules and of the patterns. A more flexible pattern creation and a more sophisticated pattern evaluation are also included. A stylish GUI, a wizard for data loading, and confirmation, warning and error dialogs are included in this version as well. The analysis module removed from the last prototype is included again in this version and a new analysis module has been created, so two analysis modules are included.
6.1 Design
6.1.1 Use workflow
This version implements all the functionalities, so the use workflow includes the analysis and the pattern discovery processes, as seen in Figure 6.1.
20. [Table rows: Letter, Date or time component, Presentation, Examples]
[...]  AD
y  Year  Year  1996; 96
M  Month in year  Month  July; Jul; 07
w  Week in year  Number  27
W  Week in month  Number  2
D  Day in year  Number  189
d  Day in month  Number  10
F  Day of week in month  Number  2
E  Day in week  Text  Tuesday; Tue
a  Am/pm marker  Text  PM
H  Hour in day (0-23)  Number  0
k  Hour in day (1-24)  Number  24
K  Hour in am/pm (0-11)  Number  0
h  Hour in am/pm (1-12)  Number  12
m  Minute in hour  Number  30
s  Second in minute  Number  55
S  Millisecond  Number  978
z  Time zone  General time zone  Pacific Standard Time; PST; GMT-08:00
Z  Time zone  RFC 822 time zone  -0800
Figure 3.2: Date/time pattern letters
Special event
The user has to define the time of the special events to be predicted. The data before each special event constitutes a data segment. If a file has multiple special events, then the file also has multiple data segments. The special events can be POSITIVE or NEGATIVE. A positive event means that a special event happened; a negative event means that no special event happened. This is useful for the training data segments.
Continuous values
The continuous values are the result of applying a smoothing process to the original values. First, the system generates new time stamps at a regular period. Then the values are reallocated to the new time points. If more than one value falls in the same period, the mean of these values is used as the value of that time period. Figure 3.3 illustrat
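The smoothing step described under Continuous values can be sketched as follows. This is an illustration only: the class and method names are hypothetical, time stamps are simplified to milliseconds, and periods that contain no value are simply omitted, which is an assumption because the text does not say how empty periods are handled.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the smoothing step: average the values that fall in each regular period. */
final class SmoothingSketch {

    /**
     * Maps each sample (time in milliseconds -> value) to the start of its period and
     * returns the mean value per period, ordered by time.
     */
    static TreeMap<Long, Double> smooth(Map<Long, Double> samples, long periodMillis) {
        // period start -> {sum of values, number of values}
        TreeMap<Long, double[]> buckets = new TreeMap<>();
        for (Map.Entry<Long, Double> sample : samples.entrySet()) {
            long periodStart = Math.floorDiv(sample.getKey(), periodMillis) * periodMillis;
            double[] acc = buckets.computeIfAbsent(periodStart, k -> new double[2]);
            acc[0] += sample.getValue();
            acc[1] += 1;
        }
        TreeMap<Long, Double> result = new TreeMap<>();
        for (Map.Entry<Long, double[]> bucket : buckets.entrySet()) {
            result.put(bucket.getKey(), bucket.getValue()[0] / bucket.getValue()[1]);
        }
        return result;
    }
}
```

For example, with a period of 60 ms, the samples 0 -> 1.0, 10 -> 3.0 and 70 -> 5.0 become 0 -> 2.0 and 60 -> 5.0, because the first two samples share a period and are averaged.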
21. Also, it would have been good to fully implement the core algorithm without any GUI and to agree with the customer representative that the core algorithm was complete and correct before producing the first prototype.
7.2 Further work
This is a list of possible new features and improvements for the next version. Some of these improvements have been proposed by the program testers.
7.2.1 Data files
More file formats to load data segments, e.g. Excel.
A panel to edit/extend the date formats accepted by the CSV parser.
Allow the user to edit the data.
Export the processed data as files.
Allow data with mixed time point periods in the same project.
7.2.2 Data Panel
Synchronize the scroll bars for the graphs of the three tabs.
Remove lines in discrete graphs.
7.2.3 Variables
Overlapping discrete ranges: add an option to specify whether or not this can occur.
A button in the variable dialog which selects all the variables.
7.2.4 Data analysis
Analyse the continuous data.
Analysis time frame after and/or around the special event.
Allow the user to edit the Special Events.
A check button to allow the Data Segments to overlap or not.
Don't sort the discrete ranges in the report.
7.2.5 Pattern discovery
Show more helpful information in the pattern creation dialog.
Ability to edit patterns.
Reference existing patterns inside the pattern creation facilities.
Advanced UI pattern ent
22. Also, the pattern discovery process has been implemented. But to simplify the code, the analysis module has been removed.
[Figure 5.1: Use workflow]
5.1.2 Functionalities list
5.1.2.1 Project
The concept of project has been implemented. A project is a collection of data files, selected variables, discrete ranges and patterns.
Save/Load the project into/from files: These are simple but important functionalities, because the users need to save their work for further purposes. Also, the saved project files can be distributed.
5.1.2.2 Data management
Removed the selection of variables when loading data: To simplify the data loading, all the variables in the CSV files are loaded into the system. After data loading, the system provides a process to select which variables are to be used in the analysis and the pattern discovery processes. The system now loads multiple data files faster than before. This also adds flexibility, because if the user wants to analyse new variables he does not have to reload all the CSV files as in the last prototype.
Temporal limits: In the last prototype all the data was anal
23. Data Segments can overlap. Functionality to allow the user to determine whether the Data Segments overlap or not could be developed as part of further work. Currently, if the user wants non-overlapping Data Segments, the user has to split the Data File into multiple Data Files and load them separately.
[Figure 6.2: Data segment overlapping. Analysis time frame: 72h. Special Event 1 on 14/06/2012 and Special Event 2 on 16/06/2012. Data Segment for Special Event 1: 11/06/2012 to 14/06/2012. Data Segment for Special Event 2: 13/06/2012 to 16/06/2012.]
Editing Special Events: To simplify the system, the Special Events can no longer be edited from the program. The type and the time are loaded from the Data File and are permanent in the system. To modify the Special Events, the Data File must be removed from the program, the original CSV file edited and then re-loaded into the program. Functionality to edit the Special Events could be part of further work.
Show an error if the Time variable or the Special Event variable is not detected: Those variables are necessary, so if they are not in the Data File, the system does not load the file.
Check time format errors: A data record is ignored if the format of its time variable is not valid.
Show a warning message for empty variable names: The system shows a warning message when it detects empty variable names in a Data File. If the user wants to load that D
24. Discovery Workbench from the Sleeman et al. [DS 11] paper, which describes: "To date in this series of projects we have accepted an initial rule set from a domain expert, as we believed that machine learning algorithms sometimes failed to incorporate domain-important concepts/rules. However, as the domain and the task (prediction in temporal datasets) gets more complex, we feel it is appropriate to develop systems which are genuinely collaborative, i.e. where both the system and the expert suggest features to explain specific temporal events, the system creates from these composite features, and these are evaluated against datasets. The expert then decides, on the basis of coverage statistics and his knowledge of the domain, which patterns should be retained and developed further. The ground-breaking APRIORI algorithm has recently been developed to handle temporal datasets and patterns; we plan to use this later algorithm as a central component of this collaborative workbench."
This is a general guideline about the software system to be developed. Interviews with Professor Derek Sleeman will outline a more precise requirements list.
2.2 APRIORI algorithm
Analysing the data is a key step in guiding the domain expert to produce more precise patterns. The APRIORI algorithm concept, as described in [LAX 06], can be used as a guide for designing the analysis modules of our Workbench. Basically the APRIORI algorith
25. [Figure residue: MVC 1, MVC 2]
6.1.5 UML
6.1.5.1 Model layer
[Figure 6.18: Model UML, showing the classes DiscreteRange, Variable, Project, DataFile, DataSegment, PatternNode, CombinatoryPattern, NotPattern and Analyser]
6.1.5.2 Controller layer
[Figure 6.19: Controller UML, showing the classes CompositeController, CombinatoryController, ElementaryPatternController, SelectPatternNodeController, DataController and DataFileController]
6.2 Implementation
6.2.1 Generate the analysis data
In the class DataFile there is a method that transforms the raw variable data from the CSV file into analysable data, generating the smoothed continuous values and the discrete values.
6.2.2 Analysing the data
The Analyser.java file has a method that analyses the data and returns the result as the analysis report.
6.2.3 Generating composite patterns in a combinatory pattern
After a combinatory pattern is created, its composite patterns must be generated. The methods that generate the composite patterns are in the file CombinatoryPattern.java.
6.2.4 Pattern matching
The file Pattern.java contains the methods that determine if a pattern matches against a data segment. PatternNode.java flips the result of the match of its subclasses if the attribute isNot is set to true.
6.3 Optimizations
6.3.1 Pre
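The negation behaviour described in section 6.2.4 (the node flips the result of its subclasses when isNot is true) can be sketched as follows. Only the attribute name isNot comes from the text; the class names, the method names and the simplified data segment (an array of discrete labels) are hypothetical.

```java
/** Hypothetical sketch of a pattern node whose match result can be negated. */
abstract class PatternNodeSketch {
    private final boolean isNot;

    PatternNodeSketch(boolean isNot) {
        this.isNot = isNot;
    }

    /** Subclasses decide whether the un-negated pattern matches the segment. */
    protected abstract boolean matchesRaw(char[] segment);

    /** Flips the result of the subclass when isNot is set to true. */
    final boolean matches(char[] segment) {
        return isNot != matchesRaw(segment);
    }
}

/** Example subclass: matches if the label occurs anywhere in the segment. */
final class ContainsLabelSketch extends PatternNodeSketch {
    private final char label;

    ContainsLabelSketch(char label, boolean isNot) {
        super(isNot);
        this.label = label;
    }

    @Override
    protected boolean matchesRaw(char[] segment) {
        for (char v : segment) {
            if (v == label) {
                return true;
            }
        }
        return false;
    }
}
```

Keeping the flip in the base class means that every pattern type gets negation without repeating the logic in each subclass.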
26. [Figure 5.6: Continuous graph]
[Figure 5.7: Discrete graph]
[Figure 5.8: Pattern discovery panel]
[Figure 5.9: And pattern edit dialog]
[Figure 5.10: Combinatory pattern dialog]
[Figure 5.11: Pattern matching report]
5.1.4 UML
5.1.4.1 Model UML
[Figure 5.12: Model UML]
5.2 Users evaluation
The user evaluation was carried out on 16 December 2011.
5.2.1 Proposed changes
This is the list of changes proposed by Derek Sleeman and Wamberto Vasconcelos after evaluating the system on 16 December 2011. A detailed discussion of the changes actually applied can be found in the functionalities section (6.1.2) of the next version of the program.
27. Save project as on the upper menu if you want to save the project in a different file. To open an existing project from a file, click on Project > Open project on the upper menu.
9.1.9 Save and load patterns
To save the patterns into a file, click on Pattern > Save patterns on the upper menu. To load patterns from a file, click on Pattern > Load pattern on the upper menu.
9.2 Maintenance Manual
9.2.1 Installing the system
To explore the source code, NetBeans is needed. For this project I have used version 7.0.1 of NetBeans. Visit the website http://netbeans.org to download it and install it on your system. Once it is installed, open NetBeans and load this project. To do this, click on File > Open project on the upper menu of NetBeans. Then select the folder of this project and click on the Open project button.
9.2.2 Compile/build the system
To compile the system, click on Run > Clean and build the project on the upper menu of NetBeans. This will generate the folder dist inside the root folder of the project. There you can find a jar file, which is the executable.
9.2.3 Execute the program
To execute the program, double click on the jar file in the dist folder. To execute the program from NetBeans, press the F6 key on your keyboard.
9.2.4 Dependencies
This program depends on Java version 6.0, which can be found at http://java.com/es/download I
28. [Figure 6.1: Use workflow]
6.1.2 Functionalities list
This is a list of the new and the modified functionalities.
6.1.2.1 Data management
Title for Data Files: It is not necessary to manually define a title or a name for the Data Files. The Data File name is used as the title. This change simplifies and speeds up the Data File loading process.
Multiple Data Segments per Data File: To make the system more flexible, it is now possible to load multiple Special Events from a Data File. Each Special Event determines the end of a Data Segment, and each Data Segment is as long as determined by the parameter Analysis time frame, which is defined when creating a new project. After the project creation, this parameter can be modified in the project properties dialog, which can be accessed from Project > Project properties in the menu. As can be seen in Figure 6.2, two Special Events are loaded from a Data File. The analysis time frame parameter is 72h. Then two 72h Data Segments are used for the data analysis and the pattern discovery process. Currently the
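The rule that each Special Event closes a Data Segment of length Analysis time frame can be sketched as follows. This is a hedged illustration, not the TDWB code: the class and method names are hypothetical, the java.time API (Java 8 or later) is used for brevity although the program itself targets Java 6, and the events of Figure 6.2 are assumed to occur at midnight because the figure only gives dates.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical sketch: a data segment is the analysis time frame that ends at a special event. */
final class DataSegmentSketch {
    final LocalDateTime start;
    final LocalDateTime end;

    DataSegmentSketch(LocalDateTime start, LocalDateTime end) {
        this.start = start;
        this.end = end;
    }

    /** Builds one segment per special event, from (event - frame) to the event itself. */
    static List<DataSegmentSketch> fromSpecialEvents(List<LocalDateTime> events, Duration frame) {
        List<DataSegmentSketch> segments = new ArrayList<>();
        for (LocalDateTime event : events) {
            segments.add(new DataSegmentSketch(event.minus(frame), event));
        }
        return segments;
    }

    /** True if the two segments share some period of time. */
    boolean overlaps(DataSegmentSketch other) {
        return start.isBefore(other.end) && other.start.isBefore(end);
    }

    public static void main(String[] args) {
        List<LocalDateTime> events = new ArrayList<>();
        events.add(LocalDateTime.of(2012, 6, 14, 0, 0)); // Special Event 1 (time of day assumed)
        events.add(LocalDateTime.of(2012, 6, 16, 0, 0)); // Special Event 2 (time of day assumed)
        for (DataSegmentSketch s : fromSpecialEvents(events, Duration.ofHours(72))) {
            System.out.println(s.start + " to " + s.end);
        }
    }
}
```

With a 72h frame the two segments start on 11/06/2012 and 13/06/2012, so they share one day, which is the overlap shown in Figure 6.2.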
29. analysis modules to report the occurrences and the changes of the discrete values. It should be easy to add or remove analysis modules, so I am going to make these modifications easy to achieve by writing an interface that the analysis modules must implement and a proxy class that mediates between the analysis modules and the system.
3.6 Pattern discovery
After loading the data files, selecting the variables, defining their discrete ranges and reviewing the analysis report, the user can evaluate the patterns developed against the data segments. All the data segments end with a positive or a negative special event. Each pattern is matched against the discrete values of the variables of each data segment. The workbench then reports which positive and which negative data sets are matched by the pattern. A typical report is shown in Figure 3.5.
[Figure 3.5: Pattern match result. Values shown: 25/33 (75.8%), 16/20 (80%), 8/33 (24.2%), 4/20 (20%), under the headings SE, NSE, SE Match (True, False) and NSE Match (False, True).]
The categories used in Figure 3.5 are defined below:
True Positive: if the pattern matches a data segment which contains a positive special event.
True Negative: if the pattern doesn't match a data segment with a negative special event.
False Positive: if the pattern doesn't match a data segment with a positive special event.
False Negative: if the pattern matches a
30. but future analysis undertaken by TDWB may require additional modules to be added. So I have implemented a system which allows one to easily add or remove analysis modules. The analysis modules are classes that implement a common interface, AnalysisModule.java, and Analyser.java is a singleton class that manages the AnalysisModule implementations. To add a new module, one simply creates a new class that implements AnalysisModule.java, providing the methods getModuleName and analyse. Then one modifies the constructor of the Analyser.java class to add an instance of the new analysis module. An example of this architecture is shown in Figure 6.21.
[Figure 0.2: Analysis modules abstraction. UML showing the AnalysisModule interface with getModuleName(): String and analyse(in project: Project): String, implemented by ElementaryPatternsAnalysisModule and ValueChangesAnalysisModule.]
9.2.11.3 Adding new pattern types
For the myocardial damage study, combinatorial patterns are needed. For other studies it may be very useful to add other kinds of patterns that are not currently implemented, so a system to add new patterns easily is needed. The solution is very similar to the solution for the analysis modules: we use an abstract class, PatternNode.java, and a manager class, Pattern.java, that uses the classes that implement PatternNode.java. However, here a different UI and a different controller are needed for the different patterns. Thi
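The analysis module architecture described above can be sketched as follows. This is a minimal illustration, not the TDWB source: the names AnalysisModule, Analyser, getModuleName and analyse come from the text, while the Project stand-in, the example module and the runAll method are assumptions made only so that the sketch compiles on its own.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the project model; the real Project class is not shown in the text.
class Project {
    final String name;

    Project(String name) {
        this.name = name;
    }
}

// Interface named in the text; both method names are taken from the text.
interface AnalysisModule {
    String getModuleName();

    String analyse(Project project);
}

// Hypothetical example module that only reports the project name.
class ProjectNameAnalysisModule implements AnalysisModule {
    public String getModuleName() {
        return "Project name";
    }

    public String analyse(Project project) {
        return project.name;
    }
}

// Singleton manager; new modules are registered in its constructor, as the text describes.
final class Analyser {
    private static final Analyser INSTANCE = new Analyser();
    private final List<AnalysisModule> modules = new ArrayList<AnalysisModule>();

    private Analyser() {
        modules.add(new ProjectNameAnalysisModule());
    }

    static Analyser getInstance() {
        return INSTANCE;
    }

    // Hypothetical helper: runs every registered module and concatenates the reports.
    String runAll(Project project) {
        StringBuilder report = new StringBuilder();
        for (AnalysisModule module : modules) {
            report.append(module.getModuleName())
                  .append(": ")
                  .append(module.analyse(project))
                  .append('\n');
        }
        return report.toString();
    }

    public static void main(String[] args) {
        System.out.print(getInstance().runAll(new Project("demo")));
    }
}
```

With this shape, adding a module touches only two places: the new implementing class and one line in the Analyser constructor.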
31. community doesn't support time series and the last version was in 2004. URL: http://jcckit.sourceforge.net
[Figure 2.5: JCCKit]
2.6.2.5 JChart2D
It is very complete, has lots of functionalities and has an active community. URL: http://jchart2d.sourceforge.net/index.shtml
32. data segment with a negative special event.
A perfect pattern will match only the data segments with a positive special event and none of the data segments with a negative special event (NSE). In the next sections, different patterns defined for the myocardial damage domain are described.
3.6.1 Single time point pattern
The elementary components of the patterns are duplets in the format Variable name, Discrete range. For example, for the variable named X with discrete ranges V1, V2, V3, V4, a possible duplet would be X V2 or X V1. The single time point patterns are checked at each single time point of every data segment. For example, the pattern X V3 for the sequence
X: V1 V1 V2 V3 V3
reports positive matches at the two V3 time points (the fourth and the fifth).
3.6.2 Single time point composite pattern
The system can match patterns for more than one variable at the same time point. For example, for the variable X with discrete ranges V1, V2, V3, V4 and the variable Y with discrete ranges V1, V2, V3, the user can check whether X V2 AND Y V1 happen at the same time point. For example, the pattern X V3 AND Y V2 for the sequences
X: V1 V1 V2 V3 V3
Y: V1 V1 V1 V1 V2
reports a positive match at the fifth time point, where X is V3 and Y is V2.
3.6.3 Temporal pattern
The user can specify a composition of elementary patterns with a specific time offset. The time offset parameter is added to the elementary pattern whi
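The two single time point examples above can be reproduced with a few lines of Java. This is a sketch under stated assumptions: the class and method names are hypothetical, sequences are modelled as lists of discrete range labels, and time points are reported 0-based (so the fourth and fifth time points appear as 3 and 4).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative matcher; class and method names are hypothetical, not TDWB's own.
final class TimePointMatcher {

    // Returns the 0-based time points where the sequence holds the wanted discrete label.
    static List<Integer> matchElementary(List<String> sequence, String label) {
        List<Integer> hits = new ArrayList<>();
        for (int t = 0; t < sequence.size(); t++) {
            if (sequence.get(t).equals(label)) {
                hits.add(t);
            }
        }
        return hits;
    }

    // Returns the 0-based time points where both variables hold their wanted labels at once.
    static List<Integer> matchComposite(List<String> x, String xLabel,
                                        List<String> y, String yLabel) {
        List<Integer> hits = new ArrayList<>();
        int length = Math.min(x.size(), y.size());
        for (int t = 0; t < length; t++) {
            if (x.get(t).equals(xLabel) && y.get(t).equals(yLabel)) {
                hits.add(t);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> x = Arrays.asList("V1", "V1", "V2", "V3", "V3");
        List<String> y = Arrays.asList("V1", "V1", "V1", "V1", "V2");
        System.out.println(matchElementary(x, "V3"));          // [3, 4]
        System.out.println(matchComposite(x, "V3", y, "V2"));  // [4]
    }
}
```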
33. more helpful information in the pattern creation dialog.
Allow patterns to be edited.
A functionality to export rules to standard file formats like CSV.
7 Conclusion
The final version of the program provides functionalities to analyse and match patterns against multiple data sets. This program can be used not only in the domain of myocardial damage but in a wide range of domains. It is easy to add new analysis modules and new patterns for different domains. In conclusion, the primary and secondary goals of this software engineering project have been achieved and the result is more than satisfactory.
7.1 Discussion
The project has been completed on time, but more features and a better pattern editing UI could have been implemented had the project been managed better. The following issues added unnecessary work:
The specifications of the core algorithm (pattern discovery) were implemented gradually over the course of the project. The revised specifications forced an extensive rewrite of almost all the classes and structures of each prototype. If the core algorithm had been fully implemented in the first prototype, I would have had more time to add and improve the functionalities of the UI for the last version. This happened because not all the important specifications were firmly settled during the first weeks of the project. I should have been more proactive and should have asked for details about the core algorithm
34. names of the variables. This is an example of a CSV file using the comma character as the field separator:
Year,Make,Model,Length
1997,Ford,E350,2.34
2000,Mercury,Cougar,2.38
In this example there are four variables (Year, Make, Model and Length) and two records, each one ending with a line break. More information about CSV files can be found on Wikipedia: http://en.wikipedia.org/wiki/Comma-separated_values
9.1.2.2 How to use OpenOffice.org Calc
If you don't have OpenOffice.org Calc installed, first download it from its website, http://www.openoffice.org, and install it on your system. The OpenOffice.org Calc version used for this manual is 3.3.0.
To create a new CSV file:
1. Open OpenOffice.org Calc.
2. In the new file, write the variable names in the first row. Each variable name must be in a different cell.
3. For each data record, write the value of each variable in the corresponding column. At the end you should have something like Figure 9.1.
[Figure 9.1: Creating/editing a CSV file with OpenOffice.org Calc. The spreadsheet shows the header row Year, Make, Model, Length followed by the two example records.]
4. To save the document as a CSV file, click File > Save As in the upper menu. Then select the CSV format. When prompted ch
35. processing combinatory patterns. Originally, all the composite patterns of a combinatory pattern were generated just before matching it. This added more time to the pattern matching process, which is already very time consuming. The solution is to generate the composite patterns when the combinatory pattern is first created. The overall time is then distributed, resulting in shorter waiting times. The trade-off is that more memory is needed, although current computers have enough, and that the project needs more disk space when it is saved.
6.4 Scalability
6.4.1 Changing the graph library
JFreeChart is very complete, but for a specific domain, or because a new version is available, it may be necessary to change it. I have implemented an interface named ChartLibrary.java and used the JFreeChart library through this interface, which makes changing the graph library easier. Figure 6.20 is a UML diagram representing this scheme.
[Figure 6.20: Chart library abstraction. UML showing the interface ChartLibrary, implemented by JFreeChartLibrary (which uses jfreechart 1.0.13) and used by DataPanel.]
6.4.2 Adding new analysis modules
Currently there are only two analysis modules, but future analysis undertaken by TDWB may require additional modules to be added. So I have implemented a system which allows one to easily add or remove analysis modules. The analysis modules are classes that implement a common interface Ana
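The chart library abstraction of section 6.4.1 can be illustrated with the following sketch. Only the name ChartLibrary and the roles of the implementing class and the data panel come from the text; the single renderTimeSeries method, the text-only implementation and the DataPanelSketch class are hypothetical simplifications, since the real interface methods are not listed in this document.

```java
// Interface name from the text; the method is an assumed simplification.
interface ChartLibrary {
    String renderTimeSeries(String title, double[] values);
}

// Hypothetical text-only implementation, standing in for JFreeChartLibrary.
class TextChartLibrary implements ChartLibrary {
    public String renderTimeSeries(String title, double[] values) {
        StringBuilder out = new StringBuilder(title).append(":");
        for (double value : values) {
            out.append(' ').append(value);
        }
        return out.toString();
    }
}

// Plays the role of DataPanel: it depends only on the interface, never on a concrete library.
class DataPanelSketch {
    private final ChartLibrary charts;

    DataPanelSketch(ChartLibrary charts) {
        this.charts = charts;
    }

    String show(String title, double[] values) {
        return charts.renderTimeSeries(title, values);
    }

    public static void main(String[] args) {
        DataPanelSketch panel = new DataPanelSketch(new TextChartLibrary());
        System.out.println(panel.show("HR", new double[] {1.0, 2.5}));
    }
}
```

Swapping the graph library then means writing one new ChartLibrary implementation and changing the object passed to the panel.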
36. program will be general enough so that it can be applied to time series data sets from various domains.
2.3.2.3 Platform compatibility
The program will be compatible with several operating systems and platforms.
2.3.2.4 Portability
The program logic will be abstracted so that different GUIs can be implemented easily.
2.3.2.5 Response times
The response time of the system will depend on the size of the datasets and the number of patterns to be matched. However, for small datasets we expect a rapid response time.
2.3.2.6 Usability
Friendly user interface.
Easy visualization of the data.
Feedback provided for most values provided by the user.
Confirmation dialogs for irreversible actions.
Form data validation.
2.4 Similar approaches
The following are some of the software systems or solutions with similar functionalities to TDWB:
The APRIORI algorithm is very useful to analyse the data.
Semantic Web Enabled Exploration of Temporal Information, SWEETInfo (Stanford), uses ontologies to analyse time-stamped data sets. See http://bmir.stanford.edu/projects/view.php/sweetinfo
ChronoMiner (Rashmi Raj, Martin J. O'Connor, Amar K. Das; Stanford). This system, like SWEETInfo, uses ontologies to extract information from data sets.
Temporal Pattern Discovery System, TEMPADIS (Ramirez & Cook), is used to discover patterns in temporal data.
2.5 Constraints
The main constraint is
37. segments overlapping
If the special events are closer together than the analysis time frame, their corresponding data segments will overlap. That could affect the analysis if the two special events are of different kinds. To avoid this situation, it is better to split the data file into two data files with their corresponding non-overlapping data segments.
The Analysis Time Point Period is the time period of the data records. For example, if the data from a patient in the ICU is recorded once per hour and the time scale is set to hours, this field should be 1.
If you need to change these values, you can do so by opening the dialog shown in Figure 9.5: click Project > Project properties in the upper menu.
9.1.4 Adding data files
After creating a new project, the system asks the user to load data for the analysis and the pattern discovery process. A dialog like Figure 9.7 appears to load a data file.
[Figure 9.7: Add data file dialog. Add a CSV data file: select a file; file preview; data preview; time variable; special event variable; special events.]
You can also access this dialog by clicking the Data Files tab near the top of the system screen and then clicking the Add a data File button, which is in the upper left corner of the panel. Now it is time to load the CSV file that contains the analysis data:
1. Select the Column separator
38. the humidity reaches 85.
Duplicated variable names are not accepted in the data files.
Each special event (positive or negative) determines the end of a data segment.
9.1.3 Creating a new Project
To start a new project in TDWB, click Project > New project on the upper menu. A dialog like Figure 9.5 appears.
[Figure 9.5: New project / project properties dialog. Create New Project: Time Scale; Time Frame to Analyse Before Each Special Event (72); Analysis Time Point Period (1); Author.]
The Time Scale field sets the units for the analysis period. For example, for a medical study about patients in the ICU it could be hours, because the data from the patients are collected hourly.
The Time Frame to Analyse Before Each Special Event determines the length of time before a special event that is examined in the analysis and the pattern discovery processes. The data segments are as long as this parameter. So, for example, if we have a data file with two special events and a time frame to analyse before each special event of 72h, then we have two data segments, each 72h long, in this data file, as described in Figure 9.6.
[Figure 9.6 content: analysis time frame 72h; Special Event 1 on 14/06/2012 with its data segment from 11/06/2012 to 14/06/2012; Special Event 2 on 16/06/2012 with its data segment from 13/06/2012 to 16/06/2012.]
Figure 9.6 Data
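The relation between special events, the analysis time frame and the resulting data segments can be checked with a small calculation. The sketch below is an assumption-laden illustration rather than TDWB code: the DataSegment class is hypothetical, the dates are those of Figure 9.6 with midnight assumed as the time of day, and it uses java.time, which requires a newer Java version than the one the program itself targets.

```java
import java.time.Duration;
import java.time.LocalDateTime;

// Hypothetical model of a data segment: it ends at a special event and
// starts one analysis time frame earlier.
final class DataSegment {
    final LocalDateTime start;
    final LocalDateTime end;

    DataSegment(LocalDateTime specialEvent, Duration analysisTimeFrame) {
        this.end = specialEvent;
        this.start = specialEvent.minus(analysisTimeFrame);
    }

    // Two segments overlap when each one starts before the other ends.
    boolean overlaps(DataSegment other) {
        return start.isBefore(other.end) && other.start.isBefore(end);
    }

    public static void main(String[] args) {
        Duration frame = Duration.ofHours(72);
        DataSegment first = new DataSegment(LocalDateTime.of(2012, 6, 14, 0, 0), frame);
        DataSegment second = new DataSegment(LocalDateTime.of(2012, 6, 16, 0, 0), frame);
        System.out.println(first.start + " to " + first.end);
        System.out.println(second.start + " to " + second.end);
        System.out.println("overlap: " + first.overlaps(second));
    }
}
```

Because the two special events are only 48h apart and the time frame is 72h, the two segments share the period from 13/06/2012 to 14/06/2012.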
39. the time limit of around four months to develop the system. Ideally we should ask the medical experts to evaluate the system, although if I don't have access to these experts then my supervisors and other research staff who work directly with the domain experts will be able to provide feedback on the system.
2.6 Choosing a Java graphic time series library
I am going to need a graph library to display the data sets in a time series graph. This could be a big effort, so I have to decide whether to write my own graph library or to use a third-party library. As seen in Figure 2.1, I have implemented a very simple library to display time series graphs in a Swing component.
[Figure 2.1: Custom graph library]
This library does not have all the functionalities that I need. Developing a library with all the needed functionalities could be a large amount of work, so it is better to find an existing graph library and use it for this project. Using a specialized graphics library could be very useful, not only because the amount of code needed to write the program would be smaller, but also because I could benefit from the library's extra functionalities to easily add new features to the program.
2.6.1 Requirements needed
The library must satisfy various requirements. It has to satisfy not only functional requirements but also non functiona
40. Figure 2.15 Box annotation
Figure 2.16 Marker
Figure 3.1 General use workflow
Figure 3.2 Date time pattern letters
Figure 3.3 Data smoothing
Figure 3.4 Data discretizing
Figure 3.5 Pattern match result
Figure 4.1 Use workflow
Figure 4.2 Load data dialog
Figure 4.3 Load data dialog with data preview
Figure 4.4 Data panel
Figure 4.5 Smoothed continuous values
Figure 4.6 Discrete values
Figure 4.7 Analysis options
Figure 4.8 Analysis
Figure 5.1 Use workflow
41. Figure 7.1 Model UML
Figure 7.2 Controller UML
Figure 11.1 Chart library
Figure 11.2 Analysis modules
Figure 11.3 Pattern types
VI. Index of tables
Table 6.1 Pattern thresholds example
Table 8.1 Pattern thresholds example
1 Introduction
This section provides an overview of the problem to be solved, the goals, the objectives and the document structure.
1.1 Overview
To detect myocardial damage (MD) in intensive care unit (ICU) patients, a test is made every 72 hours. This test checks whether the troponin level in the patient's blood is high, i.e. over a pre-defined threshold. This test is expensive and is performed only infrequently, so it would be convenient to find a cheaper and faster system to detect MD. The experts know that after an MD the patient's physiological parameters change. For example, as we can see in Figure 1.1 (Special event example), we suggest the hypothesis that a high troponin value is
42. [Figure 9.15: New elementary pattern dialog. Fields: Not; Variable; Discrete range label; Defined discrete range labels (N, VL). Buttons: Store pattern, Cancel.]
The Discrete label field is used to input the discrete range label. The Possible label values field shows the labels for the discrete ranges of the selected variable.
9.1.7.2 Composite pattern
A composite pattern is composed of one or more patterns. When it is matched against data at a time point, it reports a match only if all of its component patterns match. If any of its patterns does not report a match, the composite pattern does not report a match. The component patterns can have different time offsets. The time offset parameter is added to the pattern with the format T Time point offset Pattern. For example, the pattern
Composite T 0 CVP N T 1 CVP VL
for the sequence
CVP: N N VL
reports a match, because a pattern CVP VL is found one time slot after a pattern CVP N. The above pattern is defined by adding two elementary patterns to a composite pattern and setting a time offset T of 1 for the second elementary pattern. See Figure 9.16.
[Figure 9.16 dialog content: Add patterns and their time offsets to this composite pattern; Add pattern to this composite pattern; Not; Patterns; Time Offsets (T 0 CVP N, T 1 CVP VL). Buttons: Store pattern, Cancel.]
Figure 9.16 New comp
43. Single Honours Computing Project 2011-2012
Temporal Discovery Workbench (TDWB)
Dissertation and manuals
Daniel Blasco Calzada
d.blascocalzada.11@aberdeen.ac.uk
Department of Computing Science, University of Aberdeen, Aberdeen AB24 3UE, UK
I. Acknowledgements
I would like to thank my project supervisors, Professor Derek Sleeman and Dr Wamberto Vasconcelos, for all their advice and assistance during the development of this project and also for providing such an interesting and rewarding topic for the basis of my Honours project. I would also like to thank my family and friends for their words of wisdom and advice throughout the project's duration.
II. Declaration
I declare that this document and the accompanying code have been composed by myself and describe my own work, unless otherwise acknowledged in the text. It has not been accepted in any previous application for a degree. All verbatim extracts have been distinguished by quotation marks and all sources of information have been specifically acknowledged.
Daniel Blasco Calzada
Abstract
The necessity of providing an agile tool to researchers who want to describe the behaviour of data series before a special event has encouraged me to carry out this project. This is a software engineering project that provides a solution to a real probl
44. T 2 B AND T 3 C OR
T 0 A AND T 1 C AND T 2 B OR
T 0 A AND T 1 C AND T 3 B OR
T 0 A AND T 2 C AND T 3 B OR
T 0 B AND T 1 A AND T 2 C OR
T 0 B AND T 1 A AND T 3 C OR
T 0 B AND T 2 A AND T 3 C OR
T 0 B AND T 1 C AND T 2 A OR
T 0 B AND T 1 C AND T 3 A OR
T 0 B AND T 2 C AND T 3 A OR
T 0 C AND T 1 A AND T 2 B OR
T 0 C AND T 1 A AND T 3 B OR
T 0 C AND T 2 A AND T 3 B OR
T 0 C AND T 1 B AND T 2 A OR
T 0 C AND T 1 B AND T 3 A OR
T 0 C AND T 2 B AND T 3 A
3.7 Methodology and technologies
3.7.1 Software development method
The traditional way of developing software is a linear sequence: Requirements, Design, Development, Testing, Feedback. But in this case there are some important factors which make the linear approach less viable:
I have just a general idea about the problem being addressed. So it is better to do a quick early prototype and then produce an enhanced version which incorporates the changes proposed by the expert users.
I know which kind of users are going to use the program, but I don't initially know the kind of interface that is appropriate. In any event, I am planning to separate the logic from the UI; the latter can then be a simple interface or it could be a web interface.
It is acceptable for me to use libraries or tools at my disposal, but of course their use must be acknowledged. Some of these libraries are going to be used in the initial UI, as I am not sure about the UI de
45. The following are the Java packages, with a short description of the files in each package.
9.2.8.1 Tdwb package
This is the root package.
App.java: File that contains the main function. Instantiates and shows MainView.java.
Cons.java: Has some constants used in the other classes.
9.2.8.2 Tdwb controller package
The controller package contains the classes of the controller layer. These classes receive the user's inputs and change the state of the system accordingly, modifying the model layer and reporting the changes to the view layer.
AnalysisController.java: Handles the analysis tab panel user events in the main view.
CombinatoryPatternEditController.java: Handles the CombinatoryPatternEditDialog.java user events.
CompositePatternEditController.java: Handles the CompositePatternEditDialog.java user events.
DataController.java: Handles the data tab panel user events in the main view.
DataFileController.java: Handles the DataFileDialog.java user events.
ElementaryPatternEditController.java: Handles the ElementaryPatternEditDialog.java user events.
MainController.java: Handles the MainView.java user events.
PatternDiscoveryController.java: Handles the pattern tab panel user events in the main view.
PatternEditController.java: Abstract class implemented by the pattern edit controllers.
PatternTypeEnum.java: Contains the representation of the different kinds of pa
46. …: A dialog with some information about the program.
…: A dialog to add a new discrete range to a variable.
…: The analysis panel displayed in the tab panel of MainView.java.
…: An interface used as a proxy between DataPanel.java and JFreeChartLibrary.java.
…a: A dialog to add a new combinatory pattern.
CompositePatternEditDialog.java: A dialog to add a new composite pattern.
DataFileDialog.java: The dialog to load a CSV file into the program.
DataModeEnum.java: Contains the representation of the data mode to be displayed.
DataPanel.java: The data panel displayed in the tab panel of MainView.java.
ElementaryPatternEditDialog.java: A dialog to add a new elementary pattern.
JFreeChartLibrary.java: The class that interacts directly with the JFreeChart library and implements ChartLibrary.java.
PatternDiscoveryPanel.java: The pattern discovery panel displayed in the tab panel of MainView.java.
PatternMatchingResultDialog.java: Dialog that shows the result of the pattern matching.
ProjectDialog.java: The dialog used to create a new project and to edit the project's properties.
SelectPatternNodeTypeDialog.java: The dialog used to select the kind of pattern to add.
SpringTable.java: A custom Swing component used to adapt JTables to their contents.
SpringUtilities.java: A swing utility used for set
TimeSerie.java
UIDiscreteRange.java
UIPattern.java
UIVariable.java
47. Laura used the program with the help of the manual. First, she explored the test projects. Secondly, she tried to create a new project using the CSV files provided. When Laura didn't understand some functionality, Derek and I helped her. After using all the functionalities of the program, Laura gave us her feedback about the program.
Overall, Laura could use the program intuitively, except after determining the variables, when she didn't know what to do next; she checked the user manual and realized what to do without our help. The pattern creation is the functionality where Laura had the most problems. At that point Derek and I had to help her. For lack of time, I couldn't implement visual aids for the pattern creation process or an edit functionality. That should be the first functionality to be improved in a further version.
6.5.1 Users feedback
After the evaluation, this is the list of changes proposed by Laura.
6.5.1.1 Data management
Allow data files with mixed time point periods for the same project.
6.5.1.2 Variables
A button to select all the variables at the same time in the variables dialog.
6.5.1.3 Data display
Synchronize the scroll bars for the graphs of the three tabs.
Remove the lines between points in the discrete graphs, because with lines they look like continuous variables.
6.5.1.4 Data analysis
Don't sort the discrete ranges in the report.
6.5.1.5 Pattern discovery
Show
48. ary patterns but also of all kinds of patterns. This adds more complexity to the possible patterns and requires a recursive system to build patterns. A pattern with the shape of a tree is an abstract representation of the recursive pattern creation, where the elementary patterns are the leaves and the composite and combinatory patterns are the inner nodes of the tree, as seen in Figure 6.3. This gives the user more possibilities to design more complex patterns.
[Figure 6.3: Complex pattern. A tree with a Combinatory root, two Composite children and six Elementary leaves.]
Select/deselect patterns to match
In the pattern discovery tab, the user can select or deselect the patterns which are matched against the data segments.
Export/import the patterns into/from files
To reuse the patterns in various projects, the user can export the patterns from a project into a system file and subsequently import them into other projects if needed.
6.1.2.6 User manual
The user manual can be found in the menu by clicking Help > User manual.
6.1.3 UI
[Figure 6.4: Create a new Project / Project properties dialog. Fields: Time Scale; Analysis Time Point Period (1); Author.]
[Add a CSV data file dialog]
Figure 6.5 Add a d
49. ata File, the variables with empty names will not be loaded.
Show an error message if there are duplicated variable names
The variable names must be unique. If the system detects duplicate variable names, it displays an error and the Data File is not loaded. The variable names cannot be duplicated because the system would then be unable to differentiate between them.
Step by step wizard for loading files and selecting variables
To add agility and to help the user to use the program, the system provides a step by step wizard. After a Data File is added to the system, the system asks the user what to do next: add more Data Files, set the variables, perform the analysis or close the wizard.
6.1.2.2 Variables
Edit variables in a dialog, not in the tab panel
It is not intuitive for the user to edit the variables in the main tab panel. With the tab panel, after editing the variables and pressing the Apply button, the system did not confirm the changes to the user. The solution to this problem is to edit the variables in a dialog. When the user presses the Apply button, the dialog disappears and the user intuitively realises that the changes are confirmed.
New mechanism to edit variables
When the system is managing lots of variables, it is not convenient for the user to see all of them (the selected and the unselected variables) in the same scroll panel. A better way is to only display in detail the selected
50. ata file dialog
[Figure 6.6: Add a data file dialog with file preview. Add a CSV data file; Column separator; File preview; Selected file: test.csv; preview row: Mon Oct 20 15:00:00 BST 2008, POSITIVE.]
[Figure 6.7: Set variables dialog. Variables list and selected variables, including Acetylcysteine, Addiphos, Adrenaline, Alfentanil, Calcium Gluconate (mmol/ml), Diastolic, Dobutamine and Drotrecogin Alfa; discrete ranges table with the columns Label, Min, Max, Colour, Description.]
[Figure 6.8: Data files panel]
[Figure 6.9: Continuous data graph. Tabs: Original data, Continuous data, Discrete data.]
[Figure 6.10: Discrete data graph. Tabs: Original data, Continuous data, Discrete data.]
Analysis panel screenshot: Analysis Modules; Analysis result; Time frame to analyse before each special event: 72h; Analysis time point period: 1h; Numb
51. ays a report similar to Figure 6.16.
[Figure 9.20: Pattern matching report. Header: Result of the pattern matching; Time frame to analyse before each special event: 72h; Analysis time point period: 1h; Positive special events (PSE); Negative special events (NSE); Unknown special events (USE). Rows: the pattern CVP VL; the pattern Composite T 0 CVP N T 1 CVP VL; the pattern Combinatory 1 x CVP N, 2 x CVP VL, 0 x Gaps; Combined result. Legend: TP True Positive, TN True Negative, FP False Positive, FN False Negative.]
When a pattern is matched against a data segment, four possible results can be reported:
TP, True Positive: if the pattern matches a data segment with a positive special event.
TN, True Negative: if the pattern doesn't match a data segment with a negative special event.
FP, False Positive: if the pattern doesn't match a data segment with a positive special event.
FN, False Negative: if the pattern matches a data segment with a negative special event.
At the end of the report there is an extra row which provides the combined result of all the patterns. A perfect pattern matches only the data segments with a positive special event and none of the data segments with a negative special event.
9.1.8 Save and open projects
To save a project into a file, click on Project > Save project or on Project >
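The four report categories can be expressed as a small classification function. The sketch below follows the definitions exactly as this manual states them; note that this assigns FP and FN differently from the common confusion-matrix convention, where a false positive is a match on a negative case. The enum and class names are hypothetical.

```java
// Labels used in the pattern matching report.
enum MatchOutcome { TP, TN, FP, FN }

// Hypothetical helper that applies the manual's own definitions of the four categories.
final class MatchClassifier {

    static MatchOutcome classify(boolean patternMatches, boolean positiveSpecialEvent) {
        if (positiveSpecialEvent) {
            // Positive segment: a match is TP; per the manual, a miss is reported as FP.
            return patternMatches ? MatchOutcome.TP : MatchOutcome.FP;
        }
        // Negative segment: no match is TN; per the manual, a match is reported as FN.
        return patternMatches ? MatchOutcome.FN : MatchOutcome.TN;
    }

    public static void main(String[] args) {
        System.out.println(classify(true, true));
        System.out.println(classify(false, false));
        System.out.println(classify(false, true));
        System.out.println(classify(true, false));
    }
}
```

Under these definitions, a perfect pattern produces only TP and TN results.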
52. ch is a triplet with the format T Time offset, Variable name, Discrete range. For example, the pattern T 0 X V3 AND T 1 X V3 for the sequence
X: V1 V1 V2 V3 V3
reports a positive match starting at the fourth time point, where X is V3 and is V3 again one time point later.
3.6.4 Temporal composite pattern
The user can combine temporal patterns of a variable with the temporal patterns of other variables. For example, the pattern T 0 X V3 AND T 1 Y V3 for the sequences
X: V1 V1 V3 V3 V3
Y: V1 V1 V3 V2 V3
reports a positive match starting at the fourth time point, where X is V3 and Y is V3 one time point later.
3.6.5 Combinatorial pattern
The system can look for different combinations of single time point and single time point composite patterns. This is denoted with commas, like A, B, C. For example, for the combinatory pattern A, B, C, where A is X V1, B is X V2 and C is X V3 AND Y V2, the system expands to
T 0 A AND T 1 B AND T 2 C OR
T 0 A AND T 1 C AND T 2 B OR
T 0 B AND T 1 A AND T 2 C OR
T 0 B AND T 1 C AND T 2 A OR
T 0 C AND T 1 A AND T 2 B OR
T 0 C AND T 1 B AND T 2 A
3.6.6 Combinatorial M out of N pattern
This is similar to the previous pattern, but here interleaved gaps are added. The gaps are denoted by a G. For example, the pattern A, B, G is expanded to the pattern T 0 A AND T 2 B, i.e. the gap occurs at T 1. The pattern A, B, C, G is expanded to the pattern
T 0 A AND T 1 B AND T 2 C OR
T 0 A AND T 1 B AND T 3 C OR
T 0 A AND
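The expansion of a combinatorial pattern without gaps (section 3.6.5) amounts to generating every ordering of the components and assigning consecutive time offsets. The following sketch shows that step only; gaps are not handled. The class name and the textual output format ("T=0 A AND ...") are illustrative assumptions, not TDWB's internal representation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical expander: one alternative per ordering of the component patterns.
final class CombinatoryExpander {

    static List<String> expand(List<String> components) {
        List<String> alternatives = new ArrayList<>();
        permute(new ArrayList<>(components), new ArrayList<String>(), alternatives);
        return alternatives;
    }

    // Picks each remaining component in turn, so orderings come out in list order.
    private static void permute(List<String> remaining, List<String> chosen, List<String> out) {
        if (remaining.isEmpty()) {
            StringBuilder text = new StringBuilder();
            for (int t = 0; t < chosen.size(); t++) {
                if (t > 0) {
                    text.append(" AND ");
                }
                text.append("T=").append(t).append(' ').append(chosen.get(t));
            }
            out.add(text.toString());
            return;
        }
        for (int i = 0; i < remaining.size(); i++) {
            String pick = remaining.remove(i);
            chosen.add(pick);
            permute(remaining, chosen, out);
            chosen.remove(chosen.size() - 1);
            remaining.add(i, pick);
        }
    }

    public static void main(String[] args) {
        for (String alternative : expand(Arrays.asList("A", "B", "C"))) {
            System.out.println(alternative);
        }
    }
}
```

For three components this yields the six alternatives listed in section 3.6.5; the number of alternatives grows factorially with the number of components, which is why generating them ahead of matching costs memory.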
53. d pattern discovery process.
Define positive special events (PSE) and negative special events (NSE). A positive special event is the event to be predicted; a negative special event is when nothing has to be predicted.
2.3.1.2 Variables
Select the variables to be analysed. The variables are numeric.
Define ranges of values for each variable; this is to discretize the values. The ranges cannot overlap.
2.3.1.3 Data display
Display the data in a graph and in a table.
Display the continuous values and the discrete values.
Select colours for the ranges of each variable. The colours of the ranges are shown in the graph as a visual aid.
Display the special event in the graphs.
2.3.1.4 Data analysis
Implement analysis modules to report statistical information from the data. One of these analysis modules will report the occurrences of the elementary patterns.
2.3.1.5 Pattern discovery
Hardcoded pre-defined pattern types.
The user can define the patterns to be matched against the datasets.
The user can match the patterns against several data series.
The system generates a match report.
2.3.2 Non-functional
These are requirements that will not become functionalities; they are requirements which impose constraints on the design or implementation.
2.3.2.1 Documentation
A user manual will be available from the main menu.
2.3.2.2 Extensibility
The
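The discretization requirement of section 2.3.1.2 (labelled value ranges that must not overlap) can be sketched as below. The class names are hypothetical, the labels VL and N are borrowed from the examples elsewhere in this document, and treating each range as half-open (minimum included, maximum excluded) is an assumption, since the text does not say how boundary values are handled.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical discrete range: a label for the values in [min, max).
final class DiscreteRange {
    final String label;
    final double min;
    final double max;

    DiscreteRange(String label, double min, double max) {
        this.label = label;
        this.min = min;
        this.max = max;
    }

    boolean contains(double value) {
        return value >= min && value < max;
    }

    boolean overlaps(DiscreteRange other) {
        return min < other.max && other.min < max;
    }
}

// Hypothetical discretizer that rejects overlapping ranges.
final class Discretizer {
    private final List<DiscreteRange> ranges = new ArrayList<>();

    void addRange(DiscreteRange range) {
        for (DiscreteRange existing : ranges) {
            if (existing.overlaps(range)) {
                throw new IllegalArgumentException("Range " + range.label
                        + " overlaps range " + existing.label);
            }
        }
        ranges.add(range);
    }

    // Returns the label of the range containing the value, or null if no range covers it.
    String labelFor(double value) {
        for (DiscreteRange range : ranges) {
            if (range.contains(value)) {
                return range.label;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        Discretizer discretizer = new Discretizer();
        discretizer.addRange(new DiscreteRange("VL", 0, 5));
        discretizer.addRange(new DiscreteRange("N", 5, 10));
        System.out.println(discretizer.labelFor(3.0));
        System.out.println(discretizer.labelFor(5.0));
    }
}
```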
54. data segments
6.1.2.5 Pattern discovery
The pattern name AndPattern used in the last prototype had not been understood by the users, so I decided to change it to CompositePattern. I have also renamed PatternNode to ElementaryPattern, because the name PatternNode now refers to the abstract class which all the patterns implement.
A report to compare the patterns and their results: After a pattern matching analysis, a dialog with a text report is shown with the number of Data Segment matches for each pattern. All the patterns are checked against all the positive and negative data segments. The program shows a report similar to Figure 6.16. When a pattern is matched against a data segment there are four possible results:
TP (True Positive) if the pattern matches a data segment with a positive special event.
TN (True Negative) if the pattern doesn't match a data segment with a negative special event.
FP (False Positive) if the pattern matches a data segment with a negative special event.
FN (False Negative) if the pattern doesn't match a data segment with a positive special event.
At the end of the report there is an extra row which provides the combined result of all the patterns. A perfect pattern only matches data segments with a positive special event and none of the data segments with a negative special event.
Discrete labels UI input: In the last prototype, to def
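The four report outcomes can be tallied with a small helper. The sketch below is hypothetical and uses the standard confusion-matrix convention, in which a match counts as a positive prediction; the names `classify` and `summarise` and the `"POSITIVE"`/`"NEGATIVE"` strings are assumptions made for illustration, not identifiers taken from TDWB.

```python
from collections import Counter

def classify(matched, event):
    """Map one (pattern matched?, special event type) pair to TP, TN, FP or FN."""
    if event == "POSITIVE":
        return "TP" if matched else "FN"
    if event == "NEGATIVE":
        return "FP" if matched else "TN"
    # Unknown special events are left out of this sketch.
    raise ValueError(f"unsupported special event type: {event}")

def summarise(results):
    """Count the outcomes for a list of (matched, event) pairs."""
    counts = Counter(classify(matched, event) for matched, event in results)
    return {key: counts.get(key, 0) for key in ("TP", "TN", "FP", "FN")}

# One pattern checked against three hypothetical data segments.
print(summarise([(True, "POSITIVE"), (False, "NEGATIVE"), (True, "NEGATIVE")]))
```

A pattern with no false positives in such a summary matched none of the negative data segments.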
55. detected after three high HR values. To check whether the hypothesis is valid, we could match this pattern against hundreds of similar data sets, where half of them contain raised troponin values and the other half report zero or low troponin values. If the pattern matches all or almost all of the positive data sets and matches none or almost none of the negative data sets, then we could say that there is a correlation between the HR parameter and the troponin values.
[Figure 1.1: Special event example]
Sleeman et al. [DS 11] argue that a software system could support domain experts when attempting to detect such events. With such support it would be easier for experts to detect MD before the 72-hour period. Also, this new detection system would be cheaper than the old troponin-based detection test. The experts have determined some of these patterns. But these patterns happen in multiple parameters that frequently change, and they differ depending on the type of MD. Hence these patterns are not always precise. As a result, we need an intuitive and agile tool to help the domain expert formulate, test and improve such patterns.
1.2 Objectives
The high-level objective is to provide a tool which enables the domain exp
56. e
6.4.1 Changing the graph
6.4.2 Adding new analysis modules
6.4.3 Adding new pattern types
6.5 Users evaluation
6.5.1 Users feedback
7 Conclusion
7.1 Discussion
7.2
7.2.1
7.2.2
7.2.3 Variables
7.2.4 Data analysis
7.2.5 Pattern discovery
8 References
9 Appendices
9.1 User manual
9.1.1 Running the program
57. e: The library must be free for educational purposes.
Ease of use: I will use a library because the amount of work would be considerably less than if I had to write a whole graphics library myself. If the library is extremely difficult to use, then it would not be sensible to use it.
Stable: If I am going to use a third-party library, then I need it to always work correctly.
Performance: The program may have to manage large amounts of data, so I need an efficient library.
Open source: If there is a bug in the library's code, or if I need to tweak its performance, then the library has to be open source so that I can improve it.
2.6.2 Reviewed libraries
2.6.2.1 Chart2D
Has active developers, documentation and some tutorials, but it is not widely used, does not have a big community and has no time series support. URL: http://sourceforge.net/projects/chart2d
[Figure 2.2: Chart2D]
2.6.2.2 ChartDirector
A license is needed, so it is discarded. URL: http://www.advsofteng.com/cdjava.html
[Figure 2.3: ChartDirector]
2.6.2.3 G
The last version was released in December 2009 and it has no time series support. URL: http://geosoft.no/graphics
[Figure 2.4: G]
2.6.2.4 JCCKit
Does not have an active
58. e Original data tab, the original data is displayed. In the Continuous data tab, a smoothed version of the loaded data is displayed, that is, time points generated at the period defined when creating the project (see Figure 9.5). If, for example, the defined time point period is 1 h and the loaded data are recorded every 30 minutes (2 per hour), the system will generate the mean of all the records within each hour period. Also, as you can see in Figure 9.12, in the continuous tab the colours of the discrete ranges are shown so that you can see which values fall into each range.
[Figure 9.12: Continuous data graph]
In the CVP variable there is a value that falls in the very low (VL) discrete range less than 72 h before the positive special event occurs, but not before the negative special event. So we can guess that a positive special event happens after the variable CVP drops into the very low (VL) range. Also, there are two values very near the VL range. To see more precisely whether those values are in the VL range, we can check the Discrete data tab, shown in Figure 9.13.
[Figure 9.13: Discrete data graph]
D
59. [Figure 5.3: Add data segment dialog] [Figure 5.4: Variables panel] [Figure 5.5: Data panel]
60. ed variables. To achieve this, the system uses a divided panel. The left panel shows a compact list of all the variables, where each variable has a check box to select it. The selected variables are shown in detail in the right panel, where the user can edit their discrete ranges. The new variables dialog can be seen in Figure 6.7.
Discrete ranges labelling system: A unique ID is necessary to preserve consistency between the discrete ranges and the patterns. The unique ID is an alphanumeric label that the user must define. In addition, a label is a good mnemonic. For example, for low values the label could be L and for very low values the label could be VL. Figure 6.7 shows the UI to define discrete range labels.
Descriptions in discrete ranges: As an aide-memoire for the user, a description field is included for each discrete range, as can be seen in Figure 6.7. This field is not mandatory.
6.1.2.3 Data display
Default grey colour for new discrete ranges: In the last prototype the default colour for new discrete ranges was green. But green was also one of the colours used for the default discrete ranges, which was confusing. So it has been decided to use grey as the default colour for new discrete ranges.
Labels in graphs: As the identifying system for the discrete ranges now uses alphanumeric labels, the discrete graph axis must now show these range labels instead of the integer
61. efinitely, checking this graph we can say that only one value of CVP is in the VL range. Another useful tool for analysing the data is the set of analysis modules. These modules can be found in the Data Analysis tab near the top of the program window.
[Figure 9.14: Analysis report]
In Figure 9.14, the modules to analyse the data can be selected in the left panel. After selecting the desired modules, click the "Run an analysis with the selected modules" button to run the analysis. After that, the result of the analysis is shown in the right panel. In this case we can see that for all the positive data segments the CVP variable contains one VL value and also one change from N to VL.
9.1.7 Pattern discovery process
After analysing the data it is time to f
62. em. Before producing a software program, we outlined a possible solution for this problem. In order to implement the designed solution, two prototypes have been developed. These prototypes have been evaluated by domain-knowledgeable analysts. As a result of this feedback, I have changed the specifications for the program's final version. The final version of Temporal Discovery Workbench (TDWB) offers the functionality needed to meet the main goal and the secondary goal of this project. Proposed further work and improvements are described at the end of this document.
IV Index of contents
Abstract
Index of contents
Index of figures
Index of
Introduction
1.1 Overview
1.2 Objectives
1.3 Primary goals
1.4 Secondary
63. [Figure 6.11: Analysis panel] [Figure 6.12: New elementary pattern dialog] [Figure 6.13: Add a new composite pattern dialog] [Figure 6.14: Add a new combinatory pattern dialog] [Figure 6.15: Pattern discovery panel]
64. eract with the system. The controller is a layer between the view and the model. The controller handles the user events, modifies the model information according to the user's request and then updates the view layer with the results. In Figure 6.17 (MVC request process) this process can be seen graphically.
[Figure 9.21: MVC request process. 1. User presses a button. 2. Result is shown. Return result.]
The main purpose of the controller layer is to abstract the model from the view, so that it is easier for a programmer to modify the model without modifying the view, or to modify the view without modifying the model. More information about this pattern can be found here: http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller and here: http://www.oracle.com/technetwork/articles/javase/index-142890.html
9.2.7 UML
9.2.7.1 Model
[Figure 9.22: Model UML. Classes shown: DiscreteRange, Variable, Project, DataFile, DataSegment, PatternNode, CombinatoryPattern, NotPattern, Analyser.]
9.2.7.2 Controller
[Figure 9.23: Controller UML. Classes shown: CompositeController, CombinatoryController, ElementaryPatternController, SelectPatternNodeController, PatternDiscoveryController, DataController, MainController, DataFileController.]
9.2.8 List of source code files
65. ers. The other variable is the Special Event variable. Remember that the purpose of the TDWB system is to help domain experts find patterns in the data before a specific event happens. For example, special events could be that it starts to rain or that a patient in the ICU suffers myocardial damage, so you need to tell the system when these special events happen. The special events can be POSITIVE or NEGATIVE. Positive means that a special event happened; negative means that there was no special event. This is useful for the training data segments. A perfect pattern will match only the data segments with a positive special event (PSE) and none of the data segments with a negative special event (NSE). For example, Figure 9.3 shows a data file with a positive special event.
Time            Humidity  Special Event
16/10/08 01:18  80
16/10/08 02:00  80
16/10/08 03:00  85
16/10/08 04:00  85        POSITIVE
Figure 9.3: Positive special event
In this case we can guess that when the humidity changes from 80 to 85, it starts to rain within the next hour. Figure 9.4 shows a negative special event.
Time            Humidity  Special Event
16/10/08 01:18  80
16/10/08 02:00  80
16/10/08 03:00  80
16/10/08 04:00  80        NEGATIVE
Figure 9.4: Negative special event
This means that there is no rain after 3 hours of 80% humidity. So when the domain expert writes a pattern, he/she should write something like: the positive events happen when
66. erts understand the data series patterns before a special event occurs. This will involve providing the domain expert with statistical information about the data series and matching the produced patterns against these data series.
1.3 Primary goals
The main goal of this software engineering project is to design and build a workbench which the myocardial damage experts can use to develop potential explanations for troponin rises.
1.4 Secondary goals
The secondary goal is to generalise the workbench to be useful in a variety of domains, like weather prediction, ecology and other medical domains. This could be a demanding yet interesting goal.
1.5 Structure of Document
Chapter 1: Overview of the problem addressed, the project's goals and objectives.
Chapter 2: Describes the background and the required specifications of the needed software system.
Chapter 3: Initial design of a possible solution to this problem.
Chapter 4: The first prototype of Temporal Discovery Workbench (TDWB).
Chapter 5: The second prototype of TDWB.
Chapter 6: The final version of TDWB.
Chapter 7: Conclusions and further work.
Chapter 8: References.
Chapter 9: Appendices.
2 Background
The software engineering process begins with the software requirements specification, which is the description of the system functionalities.
2.1 Specifications
The specifications of this program are given by the section 4.1.5 Temporal
67. es this process.
[Figure 3.3: Data smoothing. Original values compared with smoothed continuous values; annotations: "Two data records inside the same time period" and "A data record with offset".]
Discrete values: After generating the smoothed continuous values, the system has to generate the discretized version of these values. This is done by determining the value range of each smoothed continuous value. Figure 3.4 illustrates this process.
[Figure 3.4: Data discretizing. Smoothed continuous values compared with discrete values.]
3.3 Variables
The user selects the variables to be analysed. These variables have to be numeric values, like integer or decimal values.
Discrete ranges: It is very difficult to analyse continuous variables, so we made a decision to only analyse data which had been assigned to ranges. Classifying the values into ranges is a method to discretize the data. For example, the user could classify the heart rate values as very low (VL), normal (N) and very high (VH). These discrete ranges for each variable are the elementary patterns. These ranges cannot overlap, because if two ranges overlap and a value
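The two steps above (smoothing to regular time points, then assigning each smoothed value to a range) can be sketched as follows. This is a minimal illustration under stated assumptions, not the TDWB code: averaging the records inside each period follows the description of the Continuous data tab, while the treatment of empty periods (`None`), the min-inclusive/max-exclusive range bounds and the heart rate limits are assumptions made only for this example.

```python
from collections import defaultdict
from datetime import datetime, timedelta

def smooth(records, start, period, n_points):
    """Average the (timestamp, value) records that fall inside each period."""
    buckets = defaultdict(list)
    for timestamp, value in records:
        index = (timestamp - start) // period  # floor division gives an int
        if 0 <= index < n_points:
            buckets[index].append(value)
    # Assumption: a period without records yields None (missing data).
    return [sum(buckets[i]) / len(buckets[i]) if i in buckets else None
            for i in range(n_points)]

def discretize(value, ranges):
    """Return the label of the range containing value, or None if there is none."""
    if value is None:
        return None
    for label, low, high in ranges:
        if low <= value < high:  # assumption: min inclusive, max exclusive
            return label
    return None

# Hypothetical heart rate ranges; the numeric limits are illustrative only.
hr_ranges = [("VL", 0, 50), ("N", 50, 120), ("VH", 120, 250)]
start = datetime(2012, 1, 1, 0, 0)
records = [
    (datetime(2012, 1, 1, 0, 10), 60),   # two records inside the same period
    (datetime(2012, 1, 1, 0, 40), 80),
    (datetime(2012, 1, 1, 1, 5), 130),   # a record offset from the time point
]
smoothed = smooth(records, start, timedelta(hours=1), 3)
print(smoothed)
print([discretize(value, hr_ranges) for value in smoothed])
```

With these inputs, the first hour averages to 70.0 (label N), the second hour holds 130.0 (label VH) and the third hour has no data.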
68. f integrated development environments for Java, I short-listed Eclipse and NetBeans. I have used both IDEs before and both are good enough, but as I have never used the GUI builder of Eclipse, I decided to use NetBeans.
3.7.4 Repositories and Backups
A good way to keep track of the changes and to share the code and the documents with my tutors is a revision control system. I have tried Subversion (SVN) and Git. Although Git has more functionality and is better at merging changed files than SVN, SVN is easier to use and I don't need the extra functionality of Git, so I decided to use SVN. SVN can also be a backup system for the project. But sometimes, through certain mistakes or malfunctions, the repository gets corrupted or lost. So I decided to use an auxiliary tool for the backups. DropBox is a web service to store files in the cloud. The files are uploaded automatically every time they are saved to disk. Also, with DropBox I can access the files from the web, share them and have version control. Additionally, I will use the UoA user space to back up weekly copies of all the PFC material. The Subversion project is structured in two folders: trunk and tags. The trunk folder will hold the working copy of the NetBeans project, and the tags folder will hold the finished prototypes that I produce.
3.7.5 Program versions
The prototypes are numbered with the convention 0.X.Y, where X means the number of the pro
69. f you have installed NetBeans then you will have a newer version of Java (6.0). The other dependencies are JFreeChart 1.0.13 and JCommon 1.0.17, but these are included in the source code.
9.2.5 Organisation of files
Files in the root folder:
build.xml and manifest.mf: NetBeans project configuration files.
Code listing.pdf: the source code of the project.
example.csv: a CSV file example.
Maintenance manual.pdf: this manual.
readme.txt: text file describing the installation, compilation, execution and dependencies of the project.
User manual.pdf: user manual, which is copied to the folder dist after compiling the NetBeans project.
Folders in the root folder:
build: temporary compilation files.
nbproject: NetBeans project configuration files.
dist: the executable file TDWB.jar and the user manual.
dist/lib: libraries used by the executable file.
lib: JFreeChart and JCommon libraries imported in the NetBeans project.
patterns: example pattern files.
projects: example project files.
src: source code.
9.2.6 Model view controller
The final version is structured with the classic architectural pattern model-view-controller, or MVC [EG 94]. The model layer contains all the classes that represent the information and the methods to transform it. The view layer has the classes that draw the GUI on the system screen, showing the state of the information, and provides mechanisms to allow the user to int
70. faster and with precision.
Other functionalities: The user can remove data files from the current project. Null time values are accepted. The file can have a special event variable. If the file has more than one possible time variable, then the system asks the user which one to use as the time variable for the analysis.
5.1.2.3 Variables
All the variables are displayed in a panel, as can be seen in Figure 5.4. There the user can select which variables are to be analysed. The user can also define the ranges for the variables.
5.1.2.4 Data display
Only the data within the analysis time frame is shown. It is not necessary to show data before the analysis time frame or after the special event.
Select which data segment to display: As the system allows the loading of more than one data file, the user must be able to select which data file to display.
Draw a marker for the special event: To mark where a special event is, a vertical line is drawn in the graph. At the top of this marker, information is displayed which indicates whether this is a positive or negative special event.
Discrete range colours: The colours of the discrete ranges are painted on the graph. As seen in Figure 5.6, the colours are painted in the area of every range. This allows the user to check visually whether any value is outside the defined ranges. In the discrete graph the colours are painted at each time point, as seen in Figure 5.7. This i
71. [Figure 4.8: Analysis report]
4.1.4 UML
[Figure 4.9: UML. Classes shown: ChooseColorDialog, JFreeChart, DiscreteRangePanel, Parameter, DiscreteRange.]
4.2 Users evaluation
4.2.1 Proposed changes
This is the list of changes proposed by Derek Sleeman and Wamberto Vasconcelos after evaluating the system on 16 November 2011. A detailed discussion of the changes actually applied can be found in the functionalities section (5.1.2) of the next version of the program.
4.2.1.1 Data management
Change the name of the parameter "Data granularity" to "Time point period".
4.2.1.2 Variables
In the Discrete ranges section: Change the text of the button "Update data" to "Apply changes". Check that the min value of a range is less than or equal to the max value.
5 Prototype 2: TDWB 0.2
For this version, multiple files can be loaded into the system. The project concept and the pattern matching process have also been implemented. The model layer has been completely changed and a radical redesign of the GUI has been undertaken; therefore I have rewritten almost all the source code.
5.1 Design
5.1.1 Use workflow
The user can now load more than one file in the system
72. his button is only for testing purposes and is not going to be present in the final version of the program. The program allows the user to choose between different CSV formats, like semicolon- or colon-separated columns. The user can decide whether the first row of the CSV file provides the names of the variables or whether it is part of the data set. After previewing the data, the user can decide which variables to load. The user can change the variable names, as blank names and repeated names are not allowed. It is necessary to select at least one variable before a data file is loaded. The program recognizes which variables are time variables and allows the user to pick one of them as the main time variable. A variable is a valid time variable if none of its values is blank and all are in a correct time format. The currently accepted time formats are hardcoded. A possible feature would be to allow the user to add more time formats. At least one time variable is required to load the data. The loaded data replaces the previously loaded data in the program's main memory.
Data granularity: As seen in section 3.2, the data is smoothed and then discretized. To smooth the data, the user must determine the period used to generate the new time stamps. This period is the data granularity, which determines the smoothing factor: the smaller the granularity value, the less the data is smoothed. By default the granularity is 1 h (hour). The progra
73. iables for the analysis or closing the dialog.
9.1.5 Selecting and configuring the variables to analyse
If you want to select and configure the variables, a dialog like the one in Figure 9.9 appears.
[Figure 9.9: Set variables dialog]
You can also access this dialog whenever you wish by clicking the upper menu Variables > Set variables. In the left panel you can select the variables to be analysed; in this example the 2 variables are selected. In the right panel you can define the ranges of values that classify the values of each variable into different classes or discrete values. For example, the domain expert thinks or knows that values of the CVP variable between 0 and 5 are too low. The domain expert should then set the discrete ranges as in Figure 9.10.
CVP discrete ranges (Label, Min, Max, Description): N, 5.0, 25.0, Normal; VL, 0.0, 5.0, Very Low
74. ine a discrete range in a pattern, a combo box was used. If the user removes or changes the label of a discrete range used in a pattern, then there is an inconsistency, because the combo box, when it is initialized again, is not going to provide the option for the old value. The implemented solution is to use an input text field instead of a combo box to define the discrete ranges. To show the user the different options, an informative text is displayed near the input text field. If the user enters a non-existent label for a variable, a warning dialog is shown after pressing the add button. Figure 6.12 shows the new UI for entering elementary patterns.
Check patterns with an invalid variable name or invalid discrete labels: If there is a pattern with an invalid variable or an invalid discrete range label, the system displays a warning panel in the pattern discovery tab. A warning dialog is also shown if the user presses the "Run pattern matching" button in the pattern discovery tab.
Pattern discovery thresholds: In the last prototype, the pattern matching algorithm stopped when it found a single match in a Data Segment and reported a match. But a more flexible approach is to let the user define a threshold of matches per Data Segment to report a match. The patterns are tested against all the possible time points of the data segments. For example, for the data set A: N N N L L L L N N N
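The threshold idea can be illustrated with a short sketch: test the pattern at every possible time point, count the hits, and report a match for the data segment only when the count reaches the user-defined threshold. The function names, the window-based predicate and the sample labels below are hypothetical and chosen only for this example.

```python
def count_matches(labels, window_matches, width):
    """Count the time points at which the pattern holds over `width` values."""
    return sum(
        1
        for t in range(len(labels) - width + 1)
        if window_matches(labels[t:t + width])
    )

def segment_matches(labels, window_matches, width, min_matches=1):
    """Report a match only if the pattern holds at least min_matches times."""
    return count_matches(labels, window_matches, width) >= min_matches

def two_lows(window):
    """Illustrative pattern: a low value followed by another low value."""
    return window == ["L", "L"]

# Hypothetical discretized data segment.
segment = ["N", "N", "L", "L", "L", "N"]
print(count_matches(segment, two_lows, 2))
print(segment_matches(segment, two_lows, 2, min_matches=1))
print(segment_matches(segment, two_lows, 2, min_matches=3))
```

In this sample the pattern holds at two time points, so the segment counts as a match for a threshold of 1 or 2 but not for a threshold of 3. A threshold of 1 reproduces the behaviour of the earlier prototype, which reported a match on the first hit.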
75. [Figure 6.16: Pattern matching report]
6.1.4 Model view controller
The final version is structured with the classic architectural pattern model-view-controller, or MVC [EG 94]. The model layer contains all the classes that represent the information and the methods to transform it. The view layer has the classes that draw the GUI on the system screen, showing the state of the information, and provides mechanisms to allow the user to interact with the system. The controller is a layer between the view and the model. The controller handles the user events, modifies the model information according to the user's request and then updates the view layer with the results. In Figure 6.17 (MVC request process) this process can be seen graphically.
[Figure 6.17: MVC request process. 1. User presses a button. 2. Result is shown.]
The main purpose of the controller layer is to abstract the model from the view, so that it is easier for a programmer to modify the model without modifying the view, or to modify the view without modifying the model.
76. l requirements too. The overall requirements depend on the requirements of the program.
2.6.1.1 Functional requirements
Draw time series charts: The data series analysed in the program are time series, so the library must have functions to manage and display time series.
Draw discontinuous time series: Sometimes there are missing values in the data series to analyse. The chart library must deal with missing values.
Export graph as an image: It is not essential, but it is nevertheless desirable that the user can save graphs as bitmap images.
2.6.1.2 Non-functional requirements
Java compatible: As seen in section 3.7.2, the chosen programming language is Java, so the library must be Java compatible.
Swing compatible: As seen in section 3.7.3, the chosen IDE is NetBeans, which has a Swing UI builder, so the library must be compatible with Swing or easily embeddable in a Swing component.
Documentation: I need documentation to know how to use the library.
Tutorials: Some simple tutorials that explain how to set up my program to use the library functions would be very useful. If someone has written tutorials about advanced uses of the library, that would also be helpful.
Active community: Sometimes the libraries have bugs or tricky functionality, or the documentation is not clear. Most of these bugs and functionalities are discussed in forums or blogs.
Fre
77. lds.
PatternNode.java: Abstract class used to represent a pattern. It is extended and used by the different patterns. It is also used by Pattern.java.
Project.java: Represents a project with its parameters, data files and variables.
SpecialEventTypeEnum.java: Contains the representation of the special event types.
ValueChangesAnalysisModule.java: Implements AnalysisModule.java and contains the algorithm to calculate the discrete value changes in a data segment.
Variable.java: Represents a variable with the variable name and its discrete ranges.
9.2.8.4 Tdwb.utils package
The utils package contains classes that are useful to all the other classes.
DataFileReader.java: Used to read the information contained in CSV files.
DateUtils.java: Functions to manage dates.
PatternReaderWriter.java: Used to save and load pattern files.
ProjectReaderWriter.java: Used to save and load project files.
StringUtils.java: Functions related to strings.
TimeScaleEnum.java: Contains the representation of time scales.
9.2.8.5 Tdwb.view package
The view package contains the classes of the view layer. These classes are responsible for drawing the GUI on the system screen and for providing mechanisms to allow the user to interact with the system. The view also shows the state of the model.
AboutBox.java
AddDiscreteRangeDialog.java
AnalysisPanel.java
ChartLibrary.java
CombinatoryPatternEditDialog.jav
78. lysisModule.java, and Analyser.java is a singleton class that manages the AnalysisModule subclasses. To add a new module, one simply creates a new class implementing AnalysisModule.java, implementing the methods getModuleName and analyse. Then one modifies the constructor of the Analyser.java class to add an instance of the new analysis module. An example of this architecture is shown in Figure 6.21.
[Figure 6.21: Analysis modules abstraction. Elements shown: Analyser, AnalysisController, interface AnalysisModule with getModuleName(): String and analyse(in project: Project): String, ElementaryPatternsAnalysisModule, ValueChangesAnalysisModule.]
6.4.3 Adding new pattern types
For the myocardial damage study, combinatorial patterns are needed. For other studies it may be very useful to add other kinds of patterns that are not currently implemented. So a system to add new patterns easily is needed. The solution is very similar to the solution for the analysis modules: we use an abstract class, PatternNode.java, and a manager class, Pattern.java, that uses the classes that implement PatternNode.java. However, here a different UI and a different controller are needed for the different patterns. This adds complexity to the solution. So I needed to implement the same solution for the controllers, with the abstract class PatternEditController.java. A further component needed supp
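The extension mechanism for analysis modules can be mirrored in a compact sketch. TDWB itself is written in Java; the Python below only reproduces the shape of the design (an abstract module type plus a manager that registers instances in its constructor). `LabelCountModule`, the `run` method and the dictionary used as a stand-in for a project are hypothetical and not part of TDWB.

```python
from abc import ABC, abstractmethod

class AnalysisModule(ABC):
    """Counterpart of the AnalysisModule interface described above."""

    @abstractmethod
    def get_module_name(self):
        """Return the name used to select this module."""

    @abstractmethod
    def analyse(self, project):
        """Return a text report for the given project."""

class LabelCountModule(AnalysisModule):
    """Hypothetical module: counts how often each discrete label occurs."""

    def get_module_name(self):
        return "Label counts"

    def analyse(self, project):
        lines = []
        for variable, labels in sorted(project.items()):
            counts = {}
            for label in labels:
                counts[label] = counts.get(label, 0) + 1
            summary = ", ".join(f"{k}={v}" for k, v in sorted(counts.items()))
            lines.append(f"{variable}: {summary}")
        return "\n".join(lines)

class Analyser:
    """Manager class; a new module is registered by adding it in the constructor."""

    def __init__(self):
        self.modules = [LabelCountModule()]

    def run(self, project, selected_names):
        """Run only the modules whose names were selected."""
        return {
            module.get_module_name(): module.analyse(project)
            for module in self.modules
            if module.get_module_name() in selected_names
        }

# Stand-in project: variable name mapped to its discretized values.
print(Analyser().run({"CVP": ["N", "N", "VL"]}, {"Label counts"}))
```

The point of the design is that adding a module touches only two places: the new class itself and the one line in the manager's constructor that registers it.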
79. m analyses the time points and suggests a granularity value, which is selected automatically.

Special events: The user can set the time point that corresponds to a Special Event (SE). There is a button in the interface to set the time of this special event to the last time point. For now the user can pick one and only one special event. There is no option to mark the event as a Not Special Event (NSE).

4.1.2.2 Variables
Discrete ranges: The user can add, delete and modify the ranges for the values of each variable, and can also change the colour of each range. The initial requirements state that ranges do not overlap, so whenever the user applies changes to the discrete ranges the program checks whether the ranges overlap.

4.1.2.3 Data display
The data is displayed in a table at the bottom left and as a graph at the top. The JFreeChart graphs can be zoomed, printed and saved as a bitmap image. The user can choose how to view the data; the options are Original values, Analysis continuous values and Analysis discrete values. The graphs and the data table change when the user selects one of these options. Drawing a time series with the JFreeChart library is easy, but changing the style of the graphs, such as the colours and the shapes of the lines, was not trivial.

4.1.2.4 Data analysis
Analysis modules: In the analysis panel, as shown in Figure 4.8, we can acti
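The overlap check on discrete ranges mentioned above could look roughly like this. It is a sketch under assumptions: the class and field names are hypothetical, and each range is assumed to cover the half-open interval [min, max), which the text does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical range type; assumed to cover the half-open interval [min, max).
class RangeSketch {
    final String label;
    final double min;
    final double max;

    RangeSketch(String label, double min, double max) {
        this.label = label;
        this.min = min;
        this.max = max;
    }
}

class RangeOverlapCheck {
    // Sorts a copy by lower bound, then compares each range with its predecessor.
    static boolean hasOverlap(List<RangeSketch> ranges) {
        List<RangeSketch> sorted = new ArrayList<>(ranges);
        sorted.sort(Comparator.comparingDouble(r -> r.min));
        for (int i = 1; i < sorted.size(); i++) {
            if (sorted.get(i).min < sorted.get(i - 1).max) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<RangeSketch> ranges = Arrays.asList(
                new RangeSketch("L", 0, 60),
                new RangeSketch("N", 60, 100),
                new RangeSketch("H", 90, 200)); // overlaps N between 90 and 100
        System.out.println("Overlap found: " + hasOverlap(ranges));
    }
}
```

Sorting first keeps the check to one pass over adjacent pairs instead of comparing every pair of ranges.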
80. m principle states that more complex patterns should only be created if their elementary components have sufficient support. The elementary patterns for a domain consist of a variable name and one of its ranges. For example, the normal values and the very high values of a patient's heart rate can be two distinct ranges. If the system provides, for every data set, the number of values that fall in each range, the domain expert may be able to determine whether there are more high values in data sets of type A than in data sets of type B. This feedback helps the user decide which elementary patterns could be used to build more complex patterns. The variables change over time, and these changes are also important for understanding the causes of the problem, so the system should also provide a report about these changes.

2.3 Initial requirements
This is the list of requirements initially agreed with the customer representative. These requirements will be updated with the user's feedback after each prototype.

2.3.1 Functional
These are requirements that will become implemented functionalities of the program.

2.3.1.1 Data management
Load patient's data series from a CSV file.
Feature: Read from Excel and other popular formats.
Feature: Load data from multiple files.
Define the data sample period. This is the time between each data sample. It is used to smooth the data and for the analysis an
81. Figure 2.7: JChartLib

2.6.2.7 jCharts
Has basic functionalities, and the last version is from 2003. URL: http://jcharts.sourceforge.net

Figure 2.8: jCharts

2.6.2.8 JFreeChart
It has lots of functionalities, lots of examples, lots of tutorials, good documentation and a good community. URL: http://www.jfree.org/jfreechart

Figure 2.9: JFreeChart

2.6.2.9 JOpenChart
The last version is from 2002. The community and the tutorials are very poor. URL: http://jopenchart.sourceforge.net

Figure 2.10: JOpenChart

2.6.2.10 Ptplot
It is developed for drawing functions and not time series. URL: http://ptolemy.eecs.berkeley.edu/java/ptplot

Figure 2.11: Ptplot

2.6.3 Comparative
All the compared libraries are Java Swing compatible, open source and free for academic/educational purposes. These are mandatory requirements.
JChart2D
82. oose the comma or the semicolon as the field delimiter.

9.1.2.3 Required variables
TDWB requires two specific variables in each data file in order to load it. One of these is the Time variable, which is the time stamp for each data record. The formats accepted by this program for the time stamps are:
dd MM yyyy kk mm ss SSS; dd MM yyyy kk mm; dd MM yyyy kk mm ss; yyyy MM dd kk mm; yyyy MM dd kk mm ss; yyyy MM dd kk mm ss SSS; dd MM yyyy kk mm; dd MM yyyy kk mm ss; dd MM yyyy kk mm ss SSS; yyyy MM dd kk mm; yyyy MM dd kk mm ss; yyyy MM dd kk mm ss SSS; yyyy MM dd G at HH mm ss z; h mm a; yyyyy MMMMM dd GGG hh mm aaa; yyMMddHHmmssZ; yyyy MM dd T HH mm ss SSSZ

The pattern letters are described in Figure 3.2.

Letter  Date or Time Component   Presentation        Examples
G       Era designator           Text                AD
y       Year                     Year                1996; 96
M       Month in year            Month               July; Jul; 07
w       Week in year             Number              27
W       Week in month            Number              2
D       Day in year              Number              189
d       Day in month             Number              10
F       Day of week in month     Number              2
E       Day in week              Text                Tuesday; Tue
a       Am/pm marker             Text                PM
H       Hour in day (0-23)       Number              0
k       Hour in day (1-24)       Number              24
K       Hour in am/pm (0-11)     Number              0
h       Hour in am/pm (1-12)     Number              12
m       Minute in hour           Number              30
s       Second in minute         Number              55
S       Millisecond              Number              978
z       Time zone                General time zone   Pacific Standard Time; PST; GMT-08:00
Z       Time zone                RFC 822 time zone   -0800

Figure 9.2: Date time pattern lett
83. ormalize all the hypotheses. This is done by writing a pattern and matching that pattern against all the data segments. Remember that a data segment is defined by a special event and by the analysis time frame before that special event. The pattern discovery process is carried out in the Pattern Discovery tab, which can be found near the top of the program window. To add a new pattern to match, click the Add a new pattern button in the upper left corner. You can then choose between three kinds of patterns:
Elementary pattern
Composite pattern
Combinatory pattern
Every pattern has a Not option. If this option is selected, the program reports non-matching data as a match and matching data as a non-match. This can be useful for building patterns that match negative special events. Each kind of pattern is described below.

9.1.7.1 Elementary pattern
This is the simplest pattern; it looks for a discrete range of a variable. Elementary patterns are defined by tuples of the form Variable name, Discrete label. For example, for the variable CVP, whose discrete ranges include N and VL, a possible tuple would be CVP N or CVP VL. Elementary patterns are checked at each time point of every data segment. For example, for the sequence CVP: N N VL, the pattern CVP VL reports a match at the time point whose value is VL.

Figure 9.15 is the dialog to add a new elementary pattern
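As a rough illustration of the elementary pattern and its Not option, the check at each time point can be sketched in Java as below. The class and method names are hypothetical, and the sketch assumes that the Not option flips the result at every time point; the manual does not say whether the flip is applied per time point or per data segment.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of an elementary pattern: a variable name, a discrete label and a Not flag.
class ElementaryPatternSketch {
    final String variable;
    final String label;
    final boolean not;

    ElementaryPatternSketch(String variable, String label, boolean not) {
        this.variable = variable;
        this.label = label;
        this.not = not;
    }

    // Checks one time point; when the Not option is set, the raw result is flipped.
    boolean matchesAt(List<String> labels, int timePoint) {
        boolean raw = labels.get(timePoint).equals(label);
        return raw != not;
    }

    // Counts the time points of one variable's discrete series that match.
    int countMatches(List<String> labels) {
        int matches = 0;
        for (int t = 0; t < labels.size(); t++) {
            if (matchesAt(labels, t)) {
                matches++;
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        List<String> cvp = Arrays.asList("N", "N", "VL");
        System.out.println(new ElementaryPatternSketch("CVP", "VL", false).countMatches(cvp));
        System.out.println(new ElementaryPatternSketch("CVP", "VL", true).countMatches(cvp));
    }
}
```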
84. orts a dialog which selects the kind of pattern to be added, and this component needs to know about all the different kinds of patterns. For that purpose I have implemented the SelectPatternNodeTypeController.java and SelectPatternNodeTypeDialog.java classes. A UML diagram giving the architecture of this solution is shown in Figure 6.22.

Figure 6.22: Pattern types abstraction

To add a new pattern, simply write a new class that extends PatternNode.java, a class that extends PatternEditController.java, and its dialog. To connect it all, edit the file PatternTypeEnum.java, add a new value to the enumeration and edit the function getController.

6.5 Users evaluation
Derek Sleeman and Daniel Blasco Calzada interviewed Laura Moss on 16 January 2011. Laura Moss is co-author of [DS 11] and knows about the myocardial damage domain. The interview procedure was the following:
Derek and I gave a copy of the user manual to Laura.
Laura read the user manual.
Derek and I provided CSV data files and project files to Laura.
Laur
85. osite pattern dialog

9.1.7.3 Combinatory pattern
A combinatory pattern is, like the composite pattern, composed of a number of patterns. The difference from the composite pattern is that the component patterns are combined in all possible ways to generate all possible composite patterns. For example, if the user defines a combinatory pattern of two CVP N and one CVP VL, the generated composite patterns would be:
Composite T 9 CVP N T 1 CVP N T 2 CVP VL
Composite T 9 CVP N T 2 CVP N T 2 CVP VL
Composite T 9 CVP N T 1 CVP VL T 1 CVP N
Composite T 9 CVP N T 1 CVP VL T 2 CVP N
Composite T 9 CVP VL T 0 CVP N T 1 CVP N
Composite T CVP VL T 1 CVP N T 2 CVP N
An example of this combinatory pattern can be seen in Figure 9.17.

Figure 9.17: New combinatory pattern dialog

There is also a Gaps parameter, which adds extra time gaps to the generated composite patterns. For example, the pattern Combinatory 2xCVP N 2xGaps generates the composite patterns:
Composite T CVP N T 1 CVP N
Composite T CVP N T 2 CVP N
Composite T CVP N T 3 CVP N

9.1.7.4 C
86. reating complex patterns
The composite and combinatory patterns are composed of other patterns, which can be elementary, composite or combinatory patterns. This allows you to create, for example, a combinatory pattern composed of composite patterns. An example of these complex patterns is shown in Figure 6.3.

Figure 9.18: Complex pattern (a Composite node and a Combinatory node whose children are further Composite and Elementary patterns)

9.1.7.5 Matching thresholds
Once you have created your patterns, it is time to define their matching thresholds. The patterns are tested against all the possible time points of the data segments. For example, for the data set L L L N H H H H N N N N, if the pattern to be matched is A L, this results in 4 matches out of 10 possible matches:
N N L L L L N N N N H HH N N N N
We have 4 / 10 x 100 = 40% matches against this data segment. If the pattern to match is Composite T 0 A L T 1 B H, the result is 2 matches in 9 possible matches:
A N N L L L L N N N N N H H H H N N N
We have 2 / 9 x 100 = 22.22% matches against this data segment. There are 9 possible matches because the length of the data is 10 and the length of the pattern is 2, so 10 - 2 + 1 = 9 possible matches
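The arithmetic in this section (possible matches = data length - pattern length + 1, percentage = matches / possible matches x 100) can be checked with a short Java sketch. The class and method names are hypothetical, and for simplicity the sketch matches a sequence of labels within a single variable, whereas the composite example in the manual spans two variables.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class MatchPercentageSketch {
    // Number of start positions at which a pattern of the given length fits in the data.
    static int possibleMatches(int dataLength, int patternLength) {
        return Math.max(0, dataLength - patternLength + 1);
    }

    // Counts start positions t where data[t + i] equals pattern[i] for every offset i.
    static int countMatches(List<String> data, List<String> pattern) {
        int matches = 0;
        int possible = possibleMatches(data.size(), pattern.size());
        for (int t = 0; t < possible; t++) {
            boolean all = true;
            for (int i = 0; i < pattern.size(); i++) {
                if (!data.get(t + i).equals(pattern.get(i))) {
                    all = false;
                    break;
                }
            }
            if (all) {
                matches++;
            }
        }
        return matches;
    }

    static double percentage(int matches, int possible) {
        return possible == 0 ? 0.0 : 100.0 * matches / possible;
    }

    public static void main(String[] args) {
        // Variable A from the example in the text: ten time points, four of them L.
        List<String> a = Arrays.asList("N", "N", "L", "L", "L", "L", "N", "N", "N", "N");
        int matches = countMatches(a, Collections.singletonList("L"));
        int possible = possibleMatches(a.size(), 1);
        System.out.println(matches + " of " + possible + " = " + percentage(matches, possible) + "%");
    }
}
```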
87. rmine if a pattern matches against a data segment. PatternNode.java flips the match result of its subclasses if the attribute isNot is set to true.

9.2.10 Configuration files

9.2.10.1 Cons.java
Contains some constants and formats for data processing. The CSVseparators array contains the characters accepted as element separators when loading the CSV files. The timeFormats array contains all the accepted formats for converting the date strings that are loaded as part of the CSV files.

9.2.10.2 TimeScaleEnum.java
This file contains all the time scales accepted by the program.

9.2.10.3 SpecialEventTypeEnum.java
This file contains the different kinds of special events.

9.2.10.4 ViewStyle.java
This file contains constants used to configure the style of the GUI.

9.2.11 Directions for future improvements

9.2.11.1 Changing the graph library
JFreeChart is very complete, but it may become necessary to change it, either for a specific domain or because a new version is available. I have implemented an interface named ChartLibrary.java and used the JFreeChart library through this interface, which makes changing the graph library easier. Figure 6.20 is a UML diagram representing this scheme.

Figure 0.1: Chart library abstraction

9.2.11.2 Adding new analysis modules
Currently there are only two analysis modules
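The role of the timeFormats array described under Cons.java can be illustrated with a small Java sketch that tries each accepted format in turn. This is an assumption about how such an array might be used, not the TDWB implementation: the three format strings below are hypothetical examples rather than the list in Cons.java, and the class name is invented. Only standard java.text calls are used.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Optional;

class TimestampParserSketch {
    // Hypothetical formats chosen for illustration; they are not the list in Cons.java.
    static final String[] TIME_FORMATS = {
        "dd/MM/yyyy kk:mm:ss",
        "yyyy-MM-dd kk:mm:ss",
        "yyyy-MM-dd kk:mm"
    };

    // Returns the first successful parse that consumes the whole input, if any.
    static Optional<Date> parse(String text) {
        for (String pattern : TIME_FORMATS) {
            SimpleDateFormat format = new SimpleDateFormat(pattern);
            format.setLenient(false); // reject out-of-range values such as month 13
            ParsePosition position = new ParsePosition(0);
            Date date = format.parse(text, position);
            if (date != null && position.getIndex() == text.length()) {
                return Optional.of(date);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(parse("2008-10-20 14:18").isPresent());
        System.out.println(parse("not a time stamp").isPresent());
    }
}
```

Checking that the whole string was consumed matters because SimpleDateFormat otherwise accepts a valid prefix and silently ignores trailing text.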
88. rrt/course/601/lectures/xp.html, 19 January 2012 at 20:00h

9 Appendices

9.1 User Manual
The User Manual details how to use the TDWB program, including instructions for all of the functionalities it provides.

9.1.1 Running the program
Just double-click the file TDWB.jar located in the dist folder of the NetBeans project. If you want to execute the program from NetBeans, first open NetBeans and then open this project by clicking File > Open project in the upper menu of NetBeans. Then select the folder of this project and click on the Open project button. To execute the program, just press the F6 key on your keyboard.

9.1.2 Data files
Files must have a specific format to be loaded into the program.

9.1.2.1 CSV (Comma Separated Values)
The files accepted by the program are comma separated values files, or CSV files. These are plain text files with a .csv file extension. They can be edited with any standard text editor, but the easiest way to edit them is with Microsoft Excel or with OpenOffice.org Calc. OpenOffice.org Calc is open source and free, and can be downloaded from its website, http://www.openoffice.org. The CSV format requires that each value or field is separated by a special character. The special characters accepted by TDWB are the comma and the semicolon. Each record is on a single line. The first line of the file must contain the
89. ry like a text parser to input patterns with a text field.
Show the list of matched data segments in each pattern and the list of matching patterns in each data segment. Also highlight in the data segment the matched time points of the pattern.
Store/load the reports.
PDF reports.
Export the patterns in standard file formats like CSV.

8 References
[AG 1] http://en.wikipedia.org/wiki/Agile_software_development, 19 January 2012, 20:00h
[AG 2] http://agilemanifesto.org, 19 January 2012, 20:00h
[DS 11] Derek Sleeman, Laura Moss, Malcolm Sim and John Kinsella (2011). Predicting adverse events: detecting myocardial damage in intensive care unit (ICU) patients. In Proceedings of the sixth international conference on Knowledge capture (K-CAP '11).
[EG 94] Erich Gamma, Richard Helm, Ralph Johnson and John Vlissides. Design Patterns: Elements of Reusable Object-Oriented Software. Addison-Wesley Professional, 1st edition, November 10, 1994.
[LAX 06] Laxman, S. and Sastry, P. S. (2006). A survey of temporal data mining. SADHANA, Academy Proceedings in Engineering Sciences, 31(2).
[MVC 1] http://en.wikipedia.org/wiki/Model%E2%80%93view%E2%80%93controller, 19 January 2012, 20:00h
[MVC 2] http://www.oracle.com/technetwork/articles/javase/index-142890.html, 19 January 2012, 20:00h
[XP 1] http://en.wikipedia.org/wiki/Extreme_Programming, 19 January 2012, 20:00h
[XP 2] http://www.cs.usfca.edu/~pa
90. s. To determine whether the pattern matches against the data segment, the user sets a threshold for the number of matches or for the percentage of matches in a data segment. For the patterns P1, P2 and P3, if the user sets the thresholds as in Table 6.1:

Table 9.1: Pattern thresholds example

the pattern matching algorithm will report a positive data segment match if it:
finds 40 or more matches for the pattern P1, or
finds 1 or more matches for the pattern P2, or
finds 0.13% or more, or a number of 23 or more, matches for the pattern P3.
These thresholds can be defined in the Pattern Discovery panel. See Figure 6.15.

Figure 9.19: Pattern discovery panel

9.1.7.6 The pattern matching report
When all the patterns and their thresholds are defined, it is time to check the patterns against the data segments. To do that, click the Run pattern matching with the selected patterns button at the bottom of the panel. All the patterns are then checked against all the positive and negative data segments. The program then displ
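The threshold rule for a single pattern can be sketched in Java as below. The class and field names are hypothetical, and the sketch assumes that a data segment counts as matched when either the percentage threshold or the number-of-matches threshold is reached, which is how the P3 example reads.

```java
// Hypothetical holder for the two thresholds of one pattern.
class ThresholdSketch {
    final double minPercentage; // e.g. 0.13 means 0.13%
    final int minCount;

    ThresholdSketch(double minPercentage, int minCount) {
        this.minPercentage = minPercentage;
        this.minCount = minCount;
    }

    // Assumption: reaching either threshold is enough for the segment to count as matched.
    boolean segmentMatches(int matches, int possibleMatches) {
        double percentage = possibleMatches == 0 ? 0.0 : 100.0 * matches / possibleMatches;
        return percentage >= minPercentage || matches >= minCount;
    }

    public static void main(String[] args) {
        ThresholdSketch p3 = new ThresholdSketch(0.13, 23);
        System.out.println(p3.segmentMatches(1, 100)); // 1% reaches the percentage threshold
        System.out.println(p3.segmentMatches(0, 100)); // neither threshold is reached
    }
}
```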
91. s adds complexity to the solution. So I needed to implement the same solution for the controllers, with the abstract class PatternEditController.java. A further component needed supports a dialog which selects the kind of pattern to be added, and this component needs to know about all the different kinds of patterns. For that purpose I have implemented the SelectPatternNodeTypeController.java and SelectPatternNodeTypeDialog.java classes. A UML diagram giving the architecture of this solution is shown in Figure 6.22.

Figure 0.3: Pattern types abstraction

To add a new pattern, simply write a new class that extends PatternNode.java, a class that extends PatternEditController.java, and its dialog. To connect it all, edit the file PatternTypeEnum.java, add a new value to the enumeration and edit the function getController.
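The registration step described above (a new enumeration value plus a new branch in getController) might look like the following Java sketch. It is illustrative only: the names carry a Sketch suffix because they are hypothetical, the dialogTitle method is invented, and SEQUENCE stands for an imagined new pattern type, not one that TDWB provides.

```java
// Hypothetical counterpart of the abstract class PatternEditController.java.
abstract class PatternEditControllerSketch {
    abstract String dialogTitle();
}

class ElementaryEditControllerSketch extends PatternEditControllerSketch {
    @Override
    String dialogTitle() {
        return "Edit elementary pattern";
    }
}

// Controller for an imagined new pattern type, added to show the extension point.
class SequenceEditControllerSketch extends PatternEditControllerSketch {
    @Override
    String dialogTitle() {
        return "Edit sequence pattern";
    }
}

// Hypothetical counterpart of PatternTypeEnum.java: one value per pattern type.
enum PatternTypeSketch {
    ELEMENTARY,
    SEQUENCE; // new value added for the new pattern type

    // Returns the controller for this type; a new type needs one new case here.
    PatternEditControllerSketch getController() {
        switch (this) {
            case ELEMENTARY:
                return new ElementaryEditControllerSketch();
            case SEQUENCE:
                return new SequenceEditControllerSketch();
            default:
                throw new IllegalStateException("Unknown pattern type: " + this);
        }
    }
}

class PatternTypeDemo {
    public static void main(String[] args) {
        for (PatternTypeSketch type : PatternTypeSketch.values()) {
            System.out.println(type + " -> " + type.getController().dialogTitle());
        }
    }
}
```

Keeping the mapping in one enumeration means the type-selection dialog can list every pattern kind by iterating over the enumeration values.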
92. s useful to visually check the value changes. Adding the markers and the range colours to the JFreeChart graphs was quite easy.

5.1.2.5 Data analysis
Analysis module removed: It is now necessary to implement the pattern discovery process. This means that a huge amount of work has to be done in a short time period. To simplify the code, I decided not to include the data analysis module.

5.1.2.6 Pattern discovery
The patterns described in section 3.6 are single time point, single time point composite, temporal, temporal composite, combinatorial and combinatorial M out of N patterns. This is an abstract description of the different patterns needed for this domain, but the classification can be simplified as:
PatternNode, which is the single time point and the temporal pattern.
AndPattern, which is the single time point composite and the temporal composite pattern.
CombinatoryPattern, which is the combinatorial and the combinatorial M out of N pattern.
The PatternNode is the simplest of the above patterns.
UI panel to introduce patterns: Figure 5.8 shows the panel to add new patterns. Figure 5.9 and Figure 5.10 show the dialogs to edit these patterns, and Figure 5.11 shows an example of a pattern matching report.

5.1.3 UI

Figure 5.2: New Project
93. sign. When I have more detailed information, I will better appreciate the services I need from a third-party library. I should generalize the program to solve a wide range of problems, and I also plan to morph the workbench into a framework. But first I need a working program for the specific problem, and then I will generalize it.

I am exploring a novel and slightly under-specified task. It would be a colossal amount of work to modify or redo all the documentation and processes involved in a traditional development method like the Rational Unified Process (RUP) or the Waterfall model. The philosophy that fits this situation is one like Agile Software Development [AG 1] [AG 2], which could be implemented using a methodology such as Extreme Programming (XP) [XP 1] [XP 2]. However, I am a team of one, so I will follow this method: I will build a fast and simple prototype and show it to the customer representative (tutors) and to the stakeholders (experts). Then I will take notes of the required or suggested changes and modify or rebuild the prototype. This process will end when all the parties agree about the functionalities offered by the final prototype, or when we are close to the end date.

Diagram of the proposed method: Initial Requirements; Design, Coding, Test, Doc, Feedback; Design, Coding, Test, Optimization, Doc

The last part, after the prototyping cycle, is to tidy up the code and add some extra functionali
94. testing will be carried out by the users at the end of each prototyping cycle.

3.7.2 Programming Language
I have a background in widely used OOP languages like C, Java and Python; additionally, I have experience of web technologies. Initially the users are going to be only a specific group of experts, so a PC-based system is acceptable. The program is probably going to use some libraries written in Java, such as graph libraries. There are some IDEs with a graphical GUI designer for Java. If the program grows and it is written in Java, it can be reformulated as a web service. For these reasons Java is the language that I choose, as the logic and the UI components of the systems mentioned earlier can be implemented in it. Java's great strengths are that it allows libraries to be used (these libraries are now very extensive) and that it is platform independent. This approach will allow me to have, for instance, a number of UIs, including one to the WWW if required.

3.7.3 Integrated Development Environment (IDE)
An IDE is a great tool for developing a program. It has a lot of useful functionalities to assist the programmer and to maximize productivity. I need an IDE with the following functionalities and requirements:
Supports Java
Free license
Supports version control
Visual GUI builder
Refactoring
Debugging
Unit Test
So, after checking the list http://en.wikipedia.org/wiki/Comparison_o
95. ties if needed.

3.7.1.1 Design
I am going to use standard flow charts and UML diagrams. At first I will only develop a simple version, because they will certainly be modified at later stages. I am planning to use UML diagrams because they are a standard for Object Oriented Programming (OOP) and I am familiar with them. I am going to use Microsoft Visio to create and maintain the UML diagrams because I have used it in the past and know that it is easy to use.

3.7.1.2 Prototyping
At first the program doesn't need a heavy data management facility, as at least initially only CSV files are to be used for data entry. This will simplify the code considerably. An IDE with a graphical GUI designer will be a great help. I am working alone, so for the prototypes I can mix the code a little by not splitting it according to a typical architectural pattern like Model View Presenter (MVP), and only comment the important parts for my future revision of the code. I am a team of one, so I cannot attend team meetings as XP recommends. But my tutors will play the role of co-workers and will share points of view at our weekly meetings.

3.7.1.3 Testing
Every class needs a Unit Test. The XP method recommends writing the Unit Tests before the code. Essentially, I will write the properties and the headers of the methods, then the tests, and finally the body of the methods. This will give me a clear perspective. The GUI
96. totype and Y means the version of the prototype. The beta versions of the final program are numbered with the convention 1.0bX, where X means the version of the beta. The final and definitive version for this project is 1.0.

4 Prototype 1: TDWB 0.1
After outlining the initial design, I produced the first prototype, in which not all the functionalities were implemented. However, it provides a general idea of the final program. Only one analysis module has been implemented, the One variable one change test, which is the simplest. With the development of this prototype the main infrastructure is set up; this was a considerable amount of work, so the further development phases are expected to be shorter than this one.

4.1 Design

4.1.1 Use workflow
The pattern discovery process is not implemented in this version.

Figure 4.1: Use workflow (Start; Add data segment; Select variables and define their discrete ranges; Run analysis against the data segments)

4.1.2 Functionalities list
This is the list of the implemented functionalities.

4.1.2.1 Data management
Load data from a CSV file: (Error! Reference source not found.) shows how to load data into the program from a file. The button Load Data starts the dialog to load the data. Load Test Data automatically loads the variables HR, MAP, SpO2 and Troponin from an existing CSV file. T
97. tterns. Also provides, for each pattern, its corresponding controller.
ProjectController.java: Handles the user events of the dialogs for creating a new project, editing the project properties, and saving and loading projects.
SelectPatternNodeTypeController.java: Handles the SelectPatternNodeTypeDialog.java user events.
VariablesController.java: Handles the VariablesDialog.java user events.

9.2.8.3 Tdwb.model package
The model package contains the classes of the model layer and the core algorithms to process the data and match the patterns.
Analyser.java: Proxy class between the analysis modules and the system.
AnalysisModule.java: Interface implemented by the analysis modules. It is used by
CombinatoryPattern.java: Extends PatternNode.java. Represents the combinatory pattern.
CompositePattern.java: Extends PatternNode.java. Represents the composite pattern.
DataFile.java: Contains the original data loaded from the CSV file, and the continuous and the discrete data. Also contains its data segments.
DataSegment.java: Represents a data segment and its special event.
DiscreteRange.java: Represents a discrete range.
ElementaryPattern.java: Extends PatternNode.java. Represents the elementary pattern.
ElementaryPatternsAnalysisModule.java: Implements AnalysisModule.java and contains the algorithm to calculate the elementary patterns in a data segment.
Pattern.java: Contains a PatternNode and its thresho
98. typical stages of the program analysis: first load data, then select the variables to be included in this analysis, specify their discrete ranges, analyse the data, and finally search for patterns that match the data.

Figure 3.1: General use workflow

3.2 Data management
Data segments: The user must load the data to be analysed. The system requires that files be formatted to comply with particular conventions. TDWB requires one specific variable in each data file in order to load it: the Time variable. This variable is the time stamp for each data record. The formats accepted by this program for the time stamps are:
dd MM yyyy kk mm ss SSS; dd MM yyyy kk mm; dd MM yyyy kk mm ss; yyyy MM dd kk mm; yyyy MM dd kk mm ss; yyyy MM dd kk mm ss SSS; dd MM yyyy kk mm; dd MM yyyy kk mm ss; dd MM yyyy kk mm ss SSS; yyyy MM dd kk mm; yyyy MM dd kk mm ss; yyyy MM dd kk mm ss SSS; yyyy MM dd G at HH mm ss z; h mm a; yyyyy MMMMM dd GGG hh mm aaa; yyMMddHHmmssZ; yyyy MM dd T HH mm ss SSSZ

The pattern letters are described in Figure 3.2.

Letter  Date or Time Component   Presentation        Examples
G       Era designator           Text
99. ure improvements

Index of figures
Figure 1.1: Special event example
Figure 2.1: Custom graph library
Figure 2.2
Figure 2.3: ChartDirector
Figure 2.4
Figure 2.5: JECKIT
Figure 2.6: JChart2D
Figure 2.7: JChartLib
Figure 2.8: jCharts
Figure 2.9: JFreeChart
Figure 2.10: JOpenChart
Figure 2.11: Ptplot
Figure 2.12: JFreeChart test
Figure 2.13: Deviation demo
Figure 2.14: Annotations
100. vate the analysis module. In this version the One variable one change module is the only analysis module implemented.
Report: After the analysis, the system displays the result on the screen.

4.1.3 UI

Figure 4.3: Load data dialog with data preview
Figure 4.4: Data panel
Figure 4.5: Smoothed continuous values
Figure 4.6: Discrete values
Figure 4.7: Analysis options

Analysis on discrete values with a granularity of 1 00h. Special Events Type: Special event (SE) at Mon Oct 20 14:18:00 BST 2008. One varariable one c
101. ysed, but sometimes not all the data has to be analysed. In this version I have implemented the Time frame before the SE parameter, which determines the first value to be analysed. This is also used in the pattern matching process.

Data granularity and time point period parameters: In the last version I used the name Data granularity to describe the time period used for smoothing the continuous values. That was unclear to the users, so I have changed the name of that parameter to Time point period. In the last prototype the time point period (time granularity) options were only some values from 0 15h to 8h. In this prototype I have added more flexibility, letting the user choose between different time scales, from milliseconds to years, and a wide range of values. This is useful for pattern matching.

Positive and negative special events: The special events are of two types, namely positive or negative. Negative means that a special event did not happen. This is useful for the pattern matching process. A perfect pattern will match only the data segments with a positive special event and will not match any of the data segments with a negative special event.

Selection of the special event time from the time variable: A combo box with all the time stamps allows the user to determine the time of the special event.
