MBF_CLASSIFY USER MANUAL
Contents
1. The rows in the data file correspond to the objects to be classified. The first 15 columns of the data file hold the experimental information, and the remaining columns hold the features extracted through image analysis.

NOTE: If the software reports the error "File not in standard format", the user can take the following actions:
Check the first 15 columns of the file generated from Acapella. These columns should contain the experimental information in the same order as shown in the figure, the eighth column being the treatment sum information.
Check the naming convention used in the file for all the features from column 16 onwards.
Check whether there are too many clusters of empty rows in the file. The software can remove one or two empty rows occurring at isolated points, but this feature has not been tested exhaustively.
Make sure there are no Inf values in the data file. The software can deal with NaNs but not Infs.

Currently MBF Classify allows the user to select between 3 channels (Channel1, Channel2 and Channel3) and 4 feature categories (Morphology, Intensity, Texture and Colocalization). The nomenclature followed in the data file, shown in the following figure, is as follows:
Ch1: Channel 1
Ch2: Channel 2
Ch3: Channel 3
MOR: Morphology
INT: Intensity
TXT: Texture
CLC: Colocalization

NOTE: MATLAB is case sensitive s
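The file checks listed in the NOTE above can be sketched in code. The following is a minimal Python sketch and not part of MBF Classify itself, which runs in MATLAB. The function name `precheck_rows`, the tab-separated layout and the return format are assumptions made for illustration only. It reports empty rows and feature cells (column 16 onwards) that contain Inf, while letting NaN through.

```python
import math

# The manual states that the first 15 columns hold experimental information.
N_INFO_COLUMNS = 15


def precheck_rows(lines, delimiter="\t"):
    """Return (empty_rows, inf_cells) for the data rows of a file.

    lines: iterable of text lines, header excluded (hypothetical layout).
    empty_rows: 1-based indices of rows with no content.
    inf_cells: 1-based (row, column) pairs where a feature value is +/-Inf.
    """
    empty_rows, inf_cells = [], []
    for row_idx, line in enumerate(lines, start=1):
        if not line.strip():
            empty_rows.append(row_idx)
            continue
        cells = line.rstrip("\n").split(delimiter)
        features = cells[N_INFO_COLUMNS:]
        for col_idx, cell in enumerate(features, start=N_INFO_COLUMNS + 1):
            try:
                value = float(cell)
            except ValueError:
                continue  # non-numeric text is left for other checks
            if math.isinf(value):  # NaN passes: the manual says NaNs are handled
                inf_cells.append((row_idx, col_idx))
    return empty_rows, inf_cells
```

Running such a check before loading a file would point to the exact rows and columns to repair, rather than relying on the single "File not in standard format" message.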
2. save the results for later use. It systematically cycles through different samples of control objects, feature reduction algorithms, numbers of features kept from the feature reduction, and classifiers. The best performing classification scenario for the given control data set is identified and applied to cluster the unknown data.

The underlying algorithm of MBF Classify is based on the idea of systematically testing all the possible classification scenarios applied to the control data and then picking the optimal scenario. Although the optimal scenario is selected, each initial classification run uses only a selection of the control data. Therefore, when the data set is large enough, several different initial classifications should be used to identify the most robust feature set.

There are 4 variables in each classification scenario:
1) The size of the control set (values from 25 to a user defined number, in steps of 25)
2) The feature reduction method (PCA, KS, SDA)
3) The number of features to keep after feature reduction
4) The classification method (KNN, SVM, Neural Network)

The algorithm selects one process from each step; e.g. for the feature reduction step it will choose one of PCA, KS or SDA as a process to train and test a classifier. The accuracy of the classifier is calculated from the test results and is stored for later use. This process is repeated for the same scenario for a user defined number
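The four scenario variables listed above define a grid of combinations. The sketch below, in Python rather than the MATLAB used by MBF Classify, only illustrates how such a grid could be enumerated. The function name `enumerate_scenarios` and the assumption that feature counts run from 1 up to the chosen maximum are ours, not taken from the software.

```python
from itertools import product

REDUCTION_METHODS = ("PCA", "KS", "SDA")
CLASSIFIERS = ("KNN", "SVM", "Neural Network")


def enumerate_scenarios(max_controls, max_features):
    """Return every (control_size, reduction, n_features, classifier) tuple."""
    # Control set sizes: 25, 50, ... up to the user defined number.
    control_sizes = range(25, max_controls + 1, 25)
    # Assumption: the number of features kept runs from 1 to max_features.
    feature_counts = range(1, max_features + 1)
    return list(product(control_sizes, REDUCTION_METHODS,
                        feature_counts, CLASSIFIERS))
```

Under these assumptions, a maximum of 100 controls and 15 features gives 4 x 3 x 15 x 3 = 540 scenarios, each of which is then repeated the user defined number of times. This is why the number of features has a strong effect on run time.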
3. [Figure: screenshot of the MATLAB desktop (Current Directory, Command Window and Command History panels) and the interface launched with initiate_knnonecontrol, showing the FEATURE NAMES and SELECT TREATMENTS AND CONTROLS panels, the Control 1, Control 2 and Control 3 lists, and the "Pick a random sample" option.]
4. As mentioned for MBF_Classify, the user can upload a single data file or multiple data files by clicking on the Single or Multiple buttons. However, in this case the user has to upload two different files, or two different sets of files, as control and test separately. The subsequent steps, that is, the selection of features and channels followed by the selection of controls, are the same as described for MBF_Classify earlier.

INTERPRETATION OF RESULTS

The results of both the initial classification run and the final classification run consist of two types of files that appear in the Current Folder panel of MATLAB. The first is a .fig file that contains the PCA plot of the controls used for classification of the data. The second is another .fig file that contains the controls and the classified unknowns plotted together in the PCA plot. The third file is a .dat file that contains the classification results for the test data. The results are saved in two parts, described below. Apart from saving the results, another .dat file is created that saves the information corresponding to the controls used for classification. All these .dat files can be opened in MATLAB by right-clicking on them and selecting the option to open as text, or they can be opened as an Excel file, as a Word file, or with WordPad.

The first .dat file is labeled as results_ and includes the following information:
1 FE
5. [Figure: screenshot of a data file listing open in MATLAB, showing image numbers (4001000, 4002000, 4003000) and a Plate ID column with the value "starve"; no further content is recoverable.]
6. [Figure: screenshot of the MATLAB Current Directory and Command History panels, listing DoseResponse and PcaPlot output files and earlier initiate_mbfclassify commands.]

2) Press Single on the Graphical User Interface to select, from its directory, a single data file on which the analysis is to be performed. This file is the output generated from Acapella with a .txt extension and needs to be in the format
7. [Figure: continuation of a classification results file open in the MATLAB editor as a plain text file, with one row per classified object.]

NOTE: This file can be used by Acapella directly to look at the images of the classified cells.

APPENDIX A

A.1 KNN SINGLE CONTROL ALGORITHM

While working with high content screening data, there can be situations when only a single control is present to create a classifier. For example, the single control can be the set of objects that were not affected by a particular treatmen
8. be used for the analysis by the user. A new pop-up box appears in which the user enters the p value for the analysis; the default is 0.1. In practice we have found 0.1 the best, but values between 0.05 and 0.5 all work to varying degrees. In performing the analysis, the program analyzes the untreated cells and determines the distribution for all of the cells based on all of the features (it uses all the features in the data files). It then uses a random set of cells from the untreated control as a benchmark. In the next step the program measures the distance of all of the objects (cells) in the treated samples from the benchmark.

Once the analysis is complete, the KNN one control algorithm creates as many control sample .mat files as there are treatments present in the Select Control column on the interface. The control sample .mat files contain the information of the objects picked as control, as specified by the user, and the objects (cells) in the treatments that are scored as affected by comparison with the p value selected above (usually 0.1). These objects can therefore be used as the second control to perform analysis in MBF_Classify by simply uploading the control sample_ .mat files on the interface. If there are multiple .mat files, they can be appended to each other in the main part of MBF Classify.

APPENDIX B

B.1 DATA FLOW
9. for MBF_Classify. The MATLAB prompt and the graphical user interface are shown in the following figure.

[Figure: screenshot of the MATLAB 7.8.0 desktop showing the Current Directory panel, with dated output folders and analysis_control_ .dat result files, and the command prompt.]
10. is anything less than 50%.

NOTE: The software has been designed to take care of high overlaps by stopping them from entering the classification process. However, in some cases the data might have an overlap that is just below 50%, but the controls are positioned in such a way that a fair amount of demarcation is not possible. In such a case the software might enter the classification process but report later, at the time of classification, in an error dialogue box, that no feature was found by any of the feature reduction methods to separate the controls. Where there is too much overlap because the treated control is heterogeneous (some cells responded and others did not), it may still be possible to classify the images, but a KNN single control (described below) may be required.

[Figure: screenshot of the MBF_Classify interface showing the Multiple and Single buttons, lists of ActD treatments at several concentrations, the feature category options (All, Morphology, Texture), the Controls Selected panel and the Advanced Selection panel.]
11. [Figure: screenshot of a data file listing with image paths of the form Mercury operaimages CHEO 20100809 siStarved followed by a .flex file number; no further content is recoverable.]
12. [Figure: screenshot of the MBF Classify interface with lists of TAM treatments, the "Pick a specific sample" option and the EMPLOY MBF_CLASSIFY button ("Hit MBF_CLASSIFY to start classification"), alongside a PCA plot window whose legend shows two groups and their centroids.]

NOTE: Controls are selected starting with 25 and stepping up by 25. The default value is set to 100, which is generally a good number for training the classifier. The maximum number of features to be used must be less than the total number of features; in practice, typical values are 15 or less, which allows the computations to be completed in a reasonable length of time. The default value has been set to 15. The number of repetitions is exactly the number of times MBF Classify will cycle through the training and classification process for a given scenario of number of controls, feature reduction method, number of features and classifier. In practice 10 repetitions, which is the default valu
13. [Figure: rows of an example data file open in MATLAB, one object per row, with experimental information followed by numeric feature values.]
14. 00 5.6780 13.9695 4.4771 4.7980

72.4724 73.6177 76.8167 75.2370 74.4471
73.4202 0 73.4202 0 0
57.5829 60.5055 71.6825 74.5261 71.2875
75.0790 78.1596 75.4344 79.6209 78.9889
74.8420 76.3428 0 79.3839 80.3318
59.4787 67.4566 71.9589 74.7235 76.0664
76.7378 76.5798 75.8407 74.7566 75.0768
72.2447 74.5325 72.4531 0 79.0410
0 64.7098 70.6954 76.9209 74.3748

Two sample t test
Population 1: population corresponding to the maximum mean, i.e. row 5, column 5
Population 2: population corresponding to all the other means in the matrix, i.e. a loop for (i, j)
H(i, j) obtained

infoset: knn3 ks, knn3 sda, knn3 pca, svm012 ks, svm012 sda, svm012 pca, network ks, network sda, network pca
featvect: 1 2 3 4 5
allnums:
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 1.6667 2.0000 1.6667 2.3333
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 2.3333 2.0000 2.0000
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 3.0000 2.6667 2.0000
1.0000 2.0000 3.0000 4.0000 5.0000

Step 6: Finding best set up

whichone.rowheaders  whichone.means  whichone.stds  whichone.ctrlnums  whichone.numfeat  whichone.nums
network sda          72.4531         13.7952        100                3                 2.6667
svm012 ks            79.6209         2.1426         100                4                 4.0000
svm012 sda           79.3839         3.0231         100                4                 2.6667
svm012 ks            78.9889         0.5845
15. [Figure: continuation of a classification results file open in the MATLAB editor, one row per classified object, with image file names ending in .tif and the control name K9_GMB3_Treat03.]
16. [Figure: continuation of a data file listing with .flex image paths (Mercury operaimages CHEO 20100809 siStarved) and the treatment name NSC starved Treat3 repeated on each row; no further content is recoverable.]
17. 058
allerrors  1.8508  0.9698  1.6602  0.8322  3.0400
allnums    1.0000  2.0000  3.0000  4.0000  5.0000

allmeans   72.1959 77.9226 76.3428 75.3160 77.6066
allerrors  4.3215  2.0431  3.0493  5.3519  2.3785
allnums    1.0000  1.6667  2.3333  2.3333  2.0000

allmeans   62.5987 63.6651 68.8784 72.3144 70.8136
allerrors  5.2137  2.6290  6.3721  2.4176  3.8069
allnums    1.0000  2.0000  3.0000  4.0000  5.0000

allmeans   75.3160 79.4629 78.0806 77.1327 78.8705
allerrors  1.6346  0.5845  2.1943  2.3785  0.6080
allnums    1.0000  2.0000  3.0000  4.0000  5.0000

allmeans   76.9747 75.2765 81.3586 74.6445 80.7662
allerrors  3.0122  3.8581  3.5174  5.0865  6.6587
allnums    1.0000  1.3333  2.6667  2.6667  3.3333

allmeans   50.7109 63.6256 68.6414 71.1295 74.4076
allerrors  1.9720  1.0123  3.6717  4.6077  2.0556
allnums    1.0000  2.0000  3.0000  4.0000  5.0000

allmeans   78.7915 73.4530 76.9547 72.2113 76.0702
allerrors  1.1669  1.8892  2.3668  9.8366  2.2630
allnums    1.0000  2.0000  3.0000  4.0000  5.0000

allmeans   73.3412 77.7515 80.9347 80.0834 74.2552
allerrors  6.1234  2.0988  3.0622  1.5610  4.2877
allnums    1.0000  1.6667  2.6667  3.6667  2.6667

allmeans   61.6039 62.3478 61.8882 72.2644 74.3673
allerrors  0.0000  5.6780  13.9695 4.4771  4.7980
allnums    1.0000  2.0000  3.0000  4.0000  5.0000

allmeans: means of overall accuracy over 10 repetitions for each number of features
allerrors: standard deviation of overall accuracy over 10 repetitions for each number of features
allnums: means of the number of features used over 10 repetitions for each number of features

Step 2: Example statistics

Remove allmeans with allerrors < 0.01

allmeans: Mean over 10 reps
featvect: 1 2 3 4 5
73.6177 76.8167 75.2370 74.4471 73.4202 74.6840 73.4202 76.1058 76.5008
infoset
18. 100 5 5.0000
svm012 sda   80.3318  2.3667  100  5  3.3333
network sda  79.0410  2.1745  100  5  2.6667

Where whichone.numfeat is extracted from featvect and gives the number of features required to be used, and whichone.nums is extracted from allnums and gives the number of features actually used.

allnums:
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 1.6667 2.0000 1.6667 2.3333
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 2.3333 2.0000 2.0000
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 3.0000 4.0000 5.0000
1.0000 2.0000 3.0000 2.6667 2.0000
1.0000 2.0000 3.0000 4.0000 5.0000

Step 7: Best choice is the one with the least standard deviation

Result:
Bestchoice.std = 0.5845
Bestchoice.mean = 78.9889
Bestchoice.ctrlnum = 100
Bestchoice.numfeat = 5
Bestchoice.num = 5
Bestchoice.frmethod = ks
Bestchoice.classifier = svm012
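Step 7 can be illustrated with the six candidate rows of the Step 6 example table. The Python sketch below is an illustration only and not the MATLAB code of MBF Classify; the `Candidate` record and the `best_choice` function are hypothetical names. It picks the candidate with the least standard deviation.

```python
from collections import namedtuple

# Field names mirror the whichone columns of the Step 6 table.
Candidate = namedtuple("Candidate", "method mean std ctrlnum numfeat num")

# Candidate rows copied from the Step 6 example in this appendix.
CANDIDATES = [
    Candidate("network sda", 72.4531, 13.7952, 100, 3, 2.6667),
    Candidate("svm012 ks", 79.6209, 2.1426, 100, 4, 4.0000),
    Candidate("svm012 sda", 79.3839, 3.0231, 100, 4, 2.6667),
    Candidate("svm012 ks", 78.9889, 0.5845, 100, 5, 5.0000),
    Candidate("svm012 sda", 80.3318, 2.3667, 100, 5, 3.3333),
    Candidate("network sda", 79.0410, 2.1745, 100, 5, 2.6667),
]


def best_choice(candidates):
    """Return the candidate with the least standard deviation (Step 7)."""
    return min(candidates, key=lambda c: c.std)
```

Applied to these rows, the selection returns the svm012 ks candidate with 5 features, mean 78.9889 and standard deviation 0.5845, in agreement with the Bestchoice result above. Note that this is not the candidate with the highest mean (svm012 sda, 80.3318).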
19. [Figure: rows of an example data file open in MATLAB, continued, one object per row, with experimental information followed by numeric feature values.]
20. [Figure: screenshots of classification result files open in the MATLAB editor as plain text files. The second file has the column headers Well No, Plate ID, Image No, Control, Field of View, X Coord, Y Coord and Classes, with one row per classified object.]
21. [Figure: continuation of a classification results file open in the MATLAB editor, one row per classified object, with image file names ending in .tif and the control name K9_GMB3_Treat03.]
22. [Figure: rows of an example data file generated from Acapella, open in MATLAB, one object per row.]

NOTE: The order of the first 15 columns is important. The data file generated from Acapella should have the same first 15 columns as shown in the figure above. The remaining columns are feature names. Generate the feature names according to the naming convention ChannelName_FeatureCategory_ and, for a Colocalization feature, the naming convention ChannelName_FeatureCategory_ChannelName_

Examples:
Ch1_MOR_Nucleus_area stands for Channel 1, Morphology feature, followed by the user defined term nucleus area (these terms come from the feature extraction program). That
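The naming convention above can be checked mechanically. The following Python sketch is an illustration under stated assumptions and not part of MBF Classify: the function name `parse_feature_name` is hypothetical, the channel and category codes are those of the manual's nomenclature list (with INT assumed as the code for Intensity), and matching is case sensitive because the manual notes that MATLAB is case sensitive.

```python
# Codes from the manual's nomenclature list; "INT" for Intensity is assumed.
CATEGORIES = {"MOR": "Morphology", "INT": "Intensity",
              "TXT": "Texture", "CLC": "Colocalization"}
CHANNELS = {"Ch1": "Channel 1", "Ch2": "Channel 2", "Ch3": "Channel 3"}


def parse_feature_name(name):
    """Split a feature name into channel, category, partner channel and term."""
    parts = name.split("_")
    if len(parts) < 3:
        raise ValueError(f"too few fields in feature name: {name!r}")
    channel, category, rest = parts[0], parts[1], parts[2:]
    if channel not in CHANNELS or category not in CATEGORIES:
        raise ValueError(f"unknown channel or category in: {name!r}")
    partner = None
    if category == "CLC":
        # Colocalization names carry a second channel before the user term.
        partner, rest = rest[0], rest[1:]
        if partner not in CHANNELS or not rest:
            raise ValueError(f"bad colocalization name: {name!r}")
    return {
        "channel": CHANNELS[channel],
        "category": CATEGORIES[category],
        "partner": CHANNELS[partner] if partner else None,
        "term": "_".join(rest),  # user defined part, free form
    }
```

For example, Ch1_MOR_Nucleus_area parses to Channel 1, Morphology, with the user defined term Nucleus_area, while a lower-case name such as ch1_mor_area is rejected.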
23. [Figure: screenshot of the MATLAB desktop and the MBF Classify interface, with treatment lists, the "Pick a specific sample" option, the EMPLOY MBF_CLASSIFY button and a Command History containing initiate_mbfclassify and initiate_mbfclassify_diff commands.]

CLASSIFICATION INTERFACE FOR DIFFERENT CONTROLS AND TEST SETS

There can be cases when the user wants to use a particular set of controls from one data set to classify a set of objects coming from a different data set. Hence another interface has been designed that allows the user to upload two different text files, from which the control set and the test set are selected respectively. The protocol for this interface is similar to the protocol used to run MBF_Classify, except for a few changes.

Type initiate_mbfclassify_diff at the MATLAB prompt to launch the Graphical User Interface. The MATLAB prompt and the graphical user interface are shown in the following figure.
24. ATURE REDUCTION METHOD: This specifies which method was used for feature reduction before proceeding to classification by MBF Classify. It can show KS, SDA or PCA as the feature reduction method used.

2) FEATURES USED: The names of the features that were used for classification are specified under this heading.
NOTE: The feature names are displayed only when KS or SDA has been used as the feature reduction method. In the case of PCA no feature names are displayed, because PCA uses a combination of various features for classification and not single discrete features as the SDA and KS feature reduction methods do.

3) TREATMENTS: This lists all the treatments used in the data set being analyzed. The controls used for the analysis were selected from the same list of treatments.

4) SCORES: This gives the number of objects classified as each of the controls selected earlier. The scores form a matrix with either three or four columns, depending on the number of controls used for classification, and with rows corresponding to the treatments present. In the case of two controls, the matrix consists of three columns: the first column gives the total number of objects per treatment, the second column gives the number of objects classified as control 1 per treatment, and the third column gives the number of objects classified as control 2 for each trea
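The layout of the SCORES matrix described above can be sketched as follows. This is a Python illustration and not the MATLAB code of MBF Classify; the function name `scores_matrix`, its arguments and the treatment names in the usage note are hypothetical. Each row holds the total number of objects for a treatment, followed by the number classified as each control.

```python
def scores_matrix(treatments, assigned, n_controls=2):
    """Build one row per treatment: [total, n_as_control_1, ..., n_as_control_k].

    treatments: list of treatment names, in display order.
    assigned: iterable of (treatment, class_label) pairs,
              with class_label in 1..n_controls.
    """
    index = {name: i for i, name in enumerate(treatments)}
    matrix = [[0] * (n_controls + 1) for _ in treatments]
    for treatment, label in assigned:
        row = matrix[index[treatment]]
        row[0] += 1      # first column: total objects for this treatment
        row[label] += 1  # following columns: objects classified as that control
    return matrix
```

With two controls each row has three columns and with three controls four columns, matching the description above. For instance, three objects from a hypothetical treatment "ctrl", two classified as control 1 and one as control 2, produce the row [3, 2, 1].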
25. the last part of the name is not fixed, which permits features to be added or deleted as desired. Ch1_CLC_Ch2_ICQ stands for Channel 1 Colocalization with Channel 2 and, in this case, indicates that the colocalization feature was an ICQ calculation.
While selecting the channels, the user should check which channel number corresponds to which color and include only the channels that are required for classification. The user can select all three channels, only the two channels the experiment requires, or just a single channel.
MBF Classify has been designed to perform classification using either 2 or 3 controls for training the classifiers. Any other number of controls is not allowed and generates an error. The software also cannot proceed to classification if the controls provided for training overlap to a very high degree. MBF Classify gives an approximate picture of the overlap in the controls by plotting them in Principal Component space, as shown in the following figure. The approximate percentage of overlap is displayed at the top of the PCA plot. A pop-up error message, also shown, asks the user to check the controls if a high amount of overlap is found. The amount of overlap that is allowed to proceed for classification
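The naming convention described above can be illustrated with a short parser. This is a minimal sketch in Python rather than part of MBF_Classify itself; the function name `parse_feature_name` is hypothetical, and it assumes that name parts are separated by underscores (as in Ch1_CLC_Ch2_ICQ) and that the category codes are MOR, INT, TXT and CLC.

```python
# Assumed abbreviations, taken from the nomenclature the manual describes.
CHANNELS = {"Ch1": "Channel 1", "Ch2": "Channel 2", "Ch3": "Channel 3"}
CATEGORIES = {
    "MOR": "Morphology",
    "INT": "Intensity",
    "TXT": "Texture",
    "CLC": "Colocalization",
}


def parse_feature_name(name):
    """Split a feature column name into channel, category and free-form detail."""
    parts = name.split("_")
    if len(parts) < 3:
        raise ValueError(f"expected Channel_Category_Detail, got {name!r}")
    channel, category = parts[0], parts[1]
    detail = "_".join(parts[2:])  # the last part of the name is not fixed
    # Matching is case sensitive on purpose, mirroring the manual's note
    # that upper and lower case letters are not interchangeable.
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel prefix {channel!r} in {name!r}")
    if category not in CATEGORIES:
        raise ValueError(f"unknown feature category {category!r} in {name!r}")
    return {
        "channel": CHANNELS[channel],
        "category": CATEGORIES[category],
        "detail": detail,
    }


if __name__ == "__main__":
    print(parse_feature_name("Ch1_CLC_Ch2_ICQ"))
```

A check of this kind, run over the column headers from column 16 onwards, may help locate a misnamed feature before the file is loaded into the software.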
26. MBF_CLASSIFY USER MANUAL
Mac Biophotonics Facility
CONTENTS
Introduction
What will the software do
2.1 Initial Classification runs
2.2 Final Classification run
What it will not do
Protocol
4.1 Pre-classification
4.2 Initial Classification run
4.3 Final Classification run
Classification Interface for Different Controls and Test Sets
Interpretation of Results
Appendix A
Appendix B
1 INTRODUCTION
Machine learning is a technique that can provide good classification results for objects seen in microscopic images. There exist many methods by which machine learning can be accomplished, and every method makes use of a supervised classifier. A supervised classifier takes a training set consisting of examples of each class and assigns a particular class to an unknown input. MBF_Classify is based on the same approach and uses three supervised classification methods, namely KNN (K-Nearest Neighbors), SVM (Support Vector Machines) and Neural Network, to generate classes for every unknown object in the data set. MBF_Classify can also be used to collate a training set from a mixed population of images without the user having to manually assign a class to the images.
2 WHAT WILL THE SOFTWARE DO
2.1 INITIAL CLASSIFICATION RUNS
MBF Classify allows the user to run a series of classifications over the same data set and
27. THROUGH MBF_CLASSIFY
The figures below explain the flow of data during feature reduction and supervised classification in the MBF_Classify script. Random samples of equal size are picked and tested for each combination of feature reduction method and classification method to find the best set-up.
[Figure 1: Data break-up before MBF Classify. Legible labels: Sample file (Acapella), Controls, Unknowns, Overlap checked (PCA & KNN classify), MBF Classify.]
[Figure 2 (a), (b): Data flow within MBF Classify. Legible labels: 10 reps, 15 features, 100 controls.]
B.2 STATISTICAL ANALYSIS STEPS TO FIND THE BEST SET UP
To select the best combination of feature reduction method and classification method, the statistical steps shown below are followed. The values used here are actual values from an analysis performed with the default settings: number of features 15, number of controls 100 and number of repetitions 10.
Step 1 (Example): Numbers from the statistical analysis used to choose the best combination.
[Figure residue: variables ctrivect, featvect and infoset (values as extracted: 100 12 345). The infoset rows pair each classifier and feature reduction combination with 100 controls: knn3 ks, knn3 sda, knn3 pca, svm012 ks, svm012 sda, svm012 pca, network ks, network sda, network pca. Outputs: allmeans, allerrors, allnums. First values as extracted: 76.3033 77.1327 77.8041 77.2907 76.1]
28. e work well without causing the program to require a great length of time.
NOTE: The processing time for classification can range from 30 minutes to 6 hours, depending on the size of the data set, the total number of features in the set and the number of repetitions chosen by the user. The processing time can also increase marginally when the overlap between the controls is very high (approximately 40 to 50%).
Once the classification is complete, MBF_Classify prompts the user to save the results. Users are encouraged to save the names with the experiment number after the underscore, to keep track for later use, especially when doing the optional final classification run. The data from these initial classifications can be viewed and used as is. The output is in the same format as the output of the final classification run (see below for how to view and interpret this data).
In the initial classification runs PCA can be used as the feature reduction method and to view the data. However, the output will not include a list of the specific features used, as the PCA process combines them linearly. The program permits PCA classification for users who wish to stop at this stage and not perform a final classification run. Data generated using PCA cannot be used in the final classification because the features are not specified explicitly. However, if MBF Classify picks PCA as the best feature reducti
29. e initial classification run. This file is the output generated from Acapella with a .txt extension and needs to be in the format specified in section 3. Once the data file is selected, the name of the file is displayed in the top right corner of the graphical user interface.
2 Press Select analysis files to select the analysis files generated from the initial classification runs. This button allows the user to select multiple files at the same time by using the ctrl or shift keys. Once the set of analysis files has been selected, the corresponding names appear under Analysis file names. The interface automatically updates the list of top features that were repeated at least 60% of the time under the title Feature names. The user can then select all features in the list, or only the top few features, to conduct the classification on. Once the feature selection has been made, hit Selected.
NOTE: If no feature names appear, please re-check the files used for analysis. The possible reasons are that there are no features in common or that the initial analysis runs used PCA as the feature reduction method. In either case, try running a few more classification runs to see whether any features appear to be commonly used.
NOTE: Multiple features can be selected using the ctrl or shift keys.
NOTE: The feature names are arranged in descending order and appear with their respective hit rate as a percentage. A hit ra
30. eated after the final classification run.
NOTE: For files named coordinates_ and controls_, the last column is the class category: 1 represents the type of object selected as control 1, 2 represents the type of object selected as control 2, and 3, if present, represents the type of object selected as control 3.
[Figure residue: MATLAB Editor view of an analysis file (plain text file). Legible contents:]
Features Reduction Method: ks
Features Used: Ch2_TXT_Nucleus_moment2, Ch2_TXT_Nucleus_moment3, Ch2_TXT_Nucleus_moment4, Ch2_TXT_Nucleus_moment5, Ch1_TXT_Cell_TAS_02_40, Ch1_TXT_Cell_TAS_01_40, Ch1_TXT_Cell_TAS_07_40
Treatments: NSC_20mMglucose_Treat3, NSC_starved_Treat3, SiLKB1_20mMglucose_Treat3, SiLKB1_starved_Treat3, untransfected_20mMglucose_Treat3, untransfected_starved_Treat3
Scores:
6048 1966 4082
4359 2262 2097
3035 871 2164
4773 2756 2017
11570 3363 8207
5864 3659 2205
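The Scores matrix is easier to compare across treatments once the counts are turned into fractions of the per-treatment total. The following is a minimal Python sketch, not part of MBF_Classify; the function name `score_fractions` is hypothetical, and the rows are the Scores values legible in the example analysis file shown in the manual (total, objects classified as control 1, objects classified as control 2).

```python
def score_fractions(scores):
    """Convert rows of [total, n_control1, n_control2, ...] into fractions.

    Each returned row holds, per control, the share of that treatment's
    objects that were classified as the control.
    """
    fractions = []
    for row in scores:
        total, counts = row[0], row[1:]
        if total <= 0:
            raise ValueError("total number of objects must be positive")
        fractions.append([count / total for count in counts])
    return fractions


# Scores as read from the manual's example analysis file (two controls).
EXAMPLE_SCORES = [
    [6048, 1966, 4082],
    [4359, 2262, 2097],
    [3035, 871, 2164],
    [4773, 2756, 2017],
    [11570, 3363, 8207],
    [5864, 3659, 2205],
]

if __name__ == "__main__":
    for row, shares in zip(EXAMPLE_SCORES, score_fractions(EXAMPLE_SCORES)):
        formatted = ", ".join(f"{100 * share:.1f}%" for share in shares)
        print(f"total {row[0]}: {formatted}")
```

The same function works for three controls, where each row carries four values.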
31. er followed by the second part of employing the classification process. For the convenience of the user, a graphical user interface has been designed that allows the selection of the correct inputs for the classification process. As mentioned earlier, a two-stage classification run is possible with MBF Classify: the initial classification run and the final classification run. Both stages follow similar steps, as described below.
4.1 PRE-CLASSIFICATION
Open MATLAB and set the path to the directory where the program files (.m extension) are saved. The path can be set by using File > Set Path and navigating to the folder with the MATLAB program files. This step is needed only when using MBF Classify on MATLAB for the first time. Once set, the path is saved in the pathdef.m file of MATLAB for all subsequent runs.
MATLAB REQUIREMENTS: Make sure that the MATLAB version being used has the following 3 toolboxes installed before using the software: Neural Network toolbox, Statistics toolbox and Bioinformatics toolbox. To check the version and the toolboxes present, type ver at the MATLAB prompt and press Enter.
4.2 INITIAL CLASSIFICATION RUN
Once the path has been set, the user can start using the software for classification. Follow the steps below to proceed.
1 Type initiate_mbfclassify at the MATLAB prompt to launch the graphical user interface.
32. ighly overlapping controls
[Figure residue: MATLAB screenshot of the MBF Classify interface with an ActD treatment list (0.01 to 0.63 uM_ActD_Treat3), the buttons Hit MBF CLASSIFY to start classification and Proceed to final analysis, and a PCA overlap plot (Figure 2) titled Overlap 52 Approx with legend entries group1, group2, centroid1 and centroid2.]
4 PROTOCOL
The working of MBF Classify can be divided into two parts. The first is setting up the MBF Classify inputs in the correct ord
33. [Figure residue: MATLAB desktop screenshot with Command History entries clear all, clc and truncated launch commands for the classification interfaces.]
Once the interface opens, the user can select a single file to upload by clicking the single button, or upload multiple files that will be appended together by hitting the multiple button. The name of the file appears on the interface once the upload is done.
The next step is to select the single control from the list that appears in the select control column. The list appears automatically once the upload of the file is complete.
NOTE: While multiple sets of data can be analyzed to generate a control set, only a single control can be selected from the list for each analysis.
The user then has the option of either plotting the distributions of the distances of the control and the samples from the benchmarks, to determine visually how much the distributions overlap, or directly starting the analysis by clicking the KNN one control button.
NOTE: Since KNN computes an average distance of the K nearest neighbors to the benchmark object, an option has been included to specify the number of nearest neighbors that should
34. o upper case letters cannot be replaced by lower case letters and vice versa.
[Figure residue: Microsoft Excel view of a data file. Column headers of the first 15 columns as legible: WellIndex, Barcode, Path, Cells, Dye, Row, Column, Treatment Sum, Treatment01, Treatment02, Treatment03, FieldofView, Xcoord, Ycoord, DateOfAnal (truncated). Feature columns with names beginning Ch1_CLC_ follow from column 16.]
35. of replicates. Thus a series of accuracy values is recorded for each possible scenario. From these accuracy values an optimal scenario is chosen, and this optimal scenario is then used to create a classifier in a final classification stage (see below) and to classify the unknown data.
NOTE: KS and SDA are the feature reduction methods most frequently used by MBF Classify. It has also been observed that supervised classification is usually performed using KNN and SVM, and very rarely Neural Networks. When using 3 controls, however, KNN is the most preferred method, followed by Neural Networks; SVM is rarely used for 3 controls.
2.2 FINAL CLASSIFICATION RUN
After analyzing the dataset a number of times, the user can move on to the final classification run. During the final run the software picks only those features that were used in at least 60% of the initial classification runs of the same dataset, and performs a final classification of the dataset. It goes through the same protocol of feature reduction and classification as described for the initial classification runs. The results are also saved in the same manner for later use.
3 WHAT IT WILL NOT DO
The software will not substitute for inaccurate input data. MBF Classify works only if the data is generated in a certain format from Acapella. The figure below shows a snapshot of a typical data file as per the Acapella script used.
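The 60% rule for the final run can be sketched as a hit-rate calculation over the feature lists of the initial runs. This is an illustrative Python sketch, not the MBF_Classify implementation; the function names and the feature names in the usage example are hypothetical, and it assumes each initial run contributes one list of feature names (a run that used PCA would contribute an empty list, since PCA reports no feature names).

```python
from collections import Counter


def feature_hit_rates(runs):
    """Return {feature: percentage of runs in which the feature was used}."""
    counts = Counter()
    for features in runs:
        counts.update(set(features))  # count each feature once per run
    n_runs = len(runs)
    return {name: 100.0 * n / n_runs for name, n in counts.items()}


def select_features(runs, threshold=60.0):
    """Keep features whose hit rate is at least `threshold` percent.

    The result is sorted by descending hit rate, then by name.
    """
    rates = feature_hit_rates(runs)
    kept = [(name, rate) for name, rate in rates.items() if rate >= threshold]
    return sorted(kept, key=lambda item: (-item[1], item[0]))


if __name__ == "__main__":
    # Five hypothetical initial runs with made-up feature names.
    runs = [
        ["feat_a", "feat_b"],
        ["feat_a", "feat_b", "feat_c"],
        ["feat_a", "feat_c"],
        ["feat_a", "feat_b"],
        ["feat_a"],
    ]
    for name, rate in select_features(runs):
        print(f"{name}: {rate:.0f}%")
```

With five runs, a feature has to appear in at least three of them to reach the threshold, which is one reason a larger number of initial runs gives a finer-grained hit rate.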
36. on method, it immediately prompts the user, as shown in the figure below, to choose between carrying on with PCA, in which case no feature list will be available, or switching to the next best feature reduction method other than PCA that was picked by the statistical analysis, which provides the feature list.
NOTE: The instructions for using the graphical user interface mentioned above also appear at the bottom of the interface as the user proceeds.
7 To run another initial classification, close the interface and repeat steps 1 to 5.
[Figure residue: MATLAB desktop screenshot with the Neural Network Training (nntraintool) window and the prompt: PCA picked as feature reduction method. Proceed with PCA or click NEXT to proceed with the next best method.]
37. [Figure residue: MATLAB screenshot showing a toolbox version listing (R2010a) and repeated initiate_mbfclassify commands at the prompt.]
4.3 FINAL CLASSIFICATION RUN
The user should proceed to the final classification run only when a minimum of 5 initial classification runs have been completed. There should therefore be at least 5 "analysis_" .txt files, each with a different file name, before running the final classification. However, 10 initial classification runs are highly recommended. The user can enter the final classification run interface via two routes: first, by clicking the pushbutton named Proceed to final analysis at the bottom right corner of the initial classification run interface, or second, by typing initiate_mbffinalrun at the MATLAB prompt. Both routes open the final classification run interface, as shown in the following figure. The steps to follow are:
1 Press Single or Multiple on the graphical user interface to select a single data file or multiple data files, respectively, on which the analysis is to be performed, from its specific directory, in a manner similar to the one used for th
38. [Figure residue from the worked example. Row labels, in order: knn3 ks, knn3 sda, knn3 pca, svm012 ks, svm012 sda, svm012 pca, network ks, network sda, network pca.]
allmeans (five values per row, as extracted):
72.4724 73.6177 76.8167 75.2370 74.4471
73.4202 74.6840 73.4202 76.1058 76.5008
57.5829 60.5055 71.6825 74.5261 71.2875
75.0790 78.1596 75.4344 79.6209 78.9889
74.8420 76.3428 80.0158 79.3839 80.3318
59.4787 67.4566 71.9589 74.7235 76.0664
76.7378 76.5798 75.8407 74.7566 75.0768
72.2447 74.5325 72.4531 76.3579 79.0410
70.3791 64.7098 70.6954 76.9209 74.3748
Standard error over 10 reps (allerrors, values as extracted): 4.5771 0.7617 0.6597 1.1009 2.7770 2.2050 5.2272 4.8626 3.0539 4.8259 4.0981 2.0658 0.8544 3.0722 3.8013 0.6841 1.6602 2.1426 0.5845 4.3410 3.5983 1.1508 3.0231 2.3667 10.8921 5.5096 0.6737 7.5721 2.4626 1.0751 4.0981 3.0678 6.6676 2.8958 0.1790 2.3865 13.7952 3.8913 2.1745 0 1.2336 6.0411 0.9137 6.0862
Step 3 (Example Statistics): Remove allmeans entries for which the average number of features used is less than 90% of the number of features that is required (featvect).
39. specified in section 3. At times there can be multiple text files generated by Acapella for the same dataset. To append the files together, the user can hit the Multiple button and select as many files as need to be appended. Once the data file(s) is/are selected, the name/number of the file(s) is/are displayed in the top right corner of the graphical user interface.
NOTE: The size of the data set that can be imported into MATLAB depends on the processor memory. MATLAB can crash and show an error if memory space is low. Generally, a 32-bit processor will not handle data files greater than 2 GB in size.
3 The next step is to select the desired features and the channels they correspond to. The general procedure is to first select the channel and then its corresponding features. The channel can be selected by clicking on the toggle button, followed by feature selection from the respective list. Once feature selection is complete, hit Features Selected to allow MATLAB to process the selected information.
NOTE: To select multiple features, press the control key and make the selection from the list.
NOTE: An advanced feature selection tool is also included that allows the user to be even more specific in selecting features. The user can select features within the major classes of type Texture, Morphology, Intensity or Colocalization, as mentioned earlier.
4 Once the processing is complete, a lis
40. [Figure residue: MATLAB Current Directory listing of analysis_ctr_0_ .dat result files with their modification dates; no further content is recoverable.]
41. t and hence can be called a negative control. To proceed to classification using MBF Classify, at least 2 controls are required. Software called KNN single control was therefore designed that compares the single control (usually the unaffected objects) with all the other objects in the population and picks, as the second control, those objects that are most distinct from the unaffected population. This is done by comparing the distances from the unaffected population to a benchmark with the distances of a given query to the benchmark, using the KS test.
The classification procedure described above often fails if more than 50% of the treated control cells were unaffected. In this situation the treated control is not really an appropriate control. To create a more useful control set, we created the KNN single control algorithm. Using this algorithm, the user selects from the treated cells those that are significantly different from the normal cells (we usually use a p-value threshold of 0.1). This group of cells is then used as the positive control in the classifier.
The alternative, and what other software programs do, is to let the user manually select positives based on visual inspection. At the moment we do not favor this approach, but there is a way to do it if desired. To manually select positive controls, one selects them using Acapella and then uses the feature extraction script to extract the features from the selec
42. t of treatments used in the experiment appears under Control 1, Control 2 and Control 3. The user can now specify the number of controls to be used for classification and select the respective controls from the list. To classify using only 2 controls, select two different treatments under Control 1 and Control 2 respectively; to classify with 3 controls, select three different treatments under Control 1, Control 2 and Control 3 respectively. Hit Controls selected once the selection of controls is complete.
The software also allows the user to upload specific objects as the controls and proceed to classification. The control objects to be uploaded should be .mat files: 2 .mat files need to be uploaded for a classification with 2 controls, and 3 files for a three-way classification. The specific objects can be selected and saved into .mat files using the KNN single control algorithm, as discussed in the Appendix.
NOTE: Before selecting the controls, make sure the number of controls box has been set to the correct number: for example, 2 for selecting two controls and 3 for selecting three controls. For a three-control classification, if the user does not change the number of controls to 3 and proceeds towards selecting three treatments per con
43. te of 100% means that the particular feature was repeated in all the initial runs and should certainly be used for the final classification run. Once the feature selection is complete, the steps are the same as steps 4 to 6 for the initial classification run, explained in Section 4.2.
NOTE: While setting the parameters for the final classification run, the user must make sure that the Maximum Features input does not exceed the number of features selected under the title Feature names. The default value for Maximum Features is automatically updated to the number of features selected by the user under the title Feature names. It is recommended to perform the classification run using the default values.
NOTE: The software does not allow the user to select three or fewer features for the final classification run.
[Figure residue: MATLAB screenshot of the final classification run interface, with the labels SELECT DATA FILE and Select analysis files and a listing of analysis_ctr_0_TAM_600_ch2_tex_ .dat files.]
44. ted cells. These are then provided to MBF Classify as a positive control set.
A.2 PROTOCOL
The KNN one control algorithm can be launched through MATLAB in a similar way to the other user interfaces described above.
1 Type initiate_knnonecontrol at the command prompt to launch the interface. The figure below shows the command to launch the interface, along with the interface.
[Figure residue: MATLAB desktop screenshot with a Current Directory listing of analysis_ctr_0_ .dat result files and their modification dates.]
45. [Figure residue: MATLAB desktop screenshot with a file listing (Controls_testing and Coordinates_testing .dat files), a toolbox version listing (R2010a), Neural Network Training progress readouts and Command History fragments; no further content is recoverable.]
46. tment. In the case of three controls, however, there are four columns in the matrix: the first gives the total number of objects per treatment, the second the number of objects classified as control 1, the third the number of objects classified as control 2, and the last the number of objects classified as control 3 per treatment. Each treatment is presented on a separate row.
The second .dat file is labeled coordinates_ and includes the information for each cell that was classified as either of the selected controls.
5 The above-mentioned variables are followed by a set of variables arranged in matrix form. The first column of the matrix corresponds to the WELL NUMBER, the second specifies the PLATE ID, the third represents the IMAGE NUMBER, the fourth corresponds to the CONTROL, the fifth represents the FIELD OF VIEW, the sixth and seventh are the X COORDINATES and Y COORDINATES of the object being classified, and the last (eighth) column specifies the classification result of the cell.
The third .dat file consists of the same information as stored in coordinates_, except that the file is named controls_ and shows the information corresponding to the controls used for classification. The following figure shows the three .dat files created after the initial classification run. Similar files are cr
47. trol, the software would completely ignore the third control selected and perform classification using only the first two controls.
5 To start the classification process, hit MBF_Classify. As mentioned earlier, MBF Classify first checks the controls for the degree of overlap. If the overlap is within permissible limits (below 20%), it asks the user to input the values for Maximum Controls, Maximum Features and Repetitions. The window to input values, along with the figure for overlap in the controls, is shown in the figure below.
[Figure residue: MATLAB screenshot of the MBF Classify interface. Legible labels: SELECT CHANNELS AND FEATURES (Channel 1, Channel 2, Channel 3, Advanced Selection); SELECT TREATMENTS AND CONTROLS (Enter number of controls, Pick a random sample, Control 1, Control 2, Control 3); input dialog with Maximum controls (in multiples of 25), Maximum features and Repetitions; treatment list of TAM concentrations in uM.]
48. [Figure residue, Step 3 continued. The row labels end with network sda and network pca. The panel interleaves allnums (entries 1.0000 to 5.0000, partly illegible) with the filtered allmeans, in which removed entries appear as 0:]
72.4724 73.6177 76.8167 75.2370 74.4471
73.4202 0 73.4202 0 0
57.5829 60.5055 71.6825 74.5261 71.2875
75.0790 78.1596 75.4344 79.6209 78.9889
74.8420 76.3428 0 79.3839 80.3318
59.4787 67.4566 71.9589 74.7235 76.0664
76.7378 76.5798 75.8407 74.7566 75.0768
72.2447 74.5325 72.4531 0 79.0410
0 64.7098 70.6954 76.9209 74.3748
[Panel labels: Max of allmeans; Row 5, Column 5; Two-sample t-test.]
Step 4: Two-sample t-test
t-value = signal / noise = (difference between group means) / (variability of groups)
H0 (null hypothesis): Samples come from populations with statistically equal means.
H1 (alternate hypothesis): Samples come from populations with statistically different means.
Significance level: 0.05
Step 5: i = allmeans rows, j = allmeans columns.
[Figure residue: allmeans and allerrors panels. Values as extracted: 1.8508 0.9698 1.6602 0.8322 3.0400 4.3215 2.0431 3.0493 5.3519 2.3785 5.2137 2.6290 6.3721 2.4176 3.8069 1.6346 0.5845 2.1943 2.3785 0.6080 3.0122 3.8581 3.5174 5.0865 6.6587 1.9720 1.0123 3.6717 4.6077 2.0556 1.1669 1.8892 2.3668 9.8366 2.2630 6.1234 2.0988 3.0622 1.5610 4.2877 0.00]
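The signal-to-noise form of the t-value in Step 4 can be made concrete with a short calculation. The sketch below is in Python and is only one possible reading: the manual does not say which two-sample variant MBF_Classify uses, so the unequal-variance (Welch) form is an assumption here, the function name `welch_t` is hypothetical, and the accuracy values in the usage example are made up.

```python
import math
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t, degrees_of_freedom) for two samples of replicate accuracies.

    t = (difference between group means) / (variability of groups), where the
    variability is the square root of the summed squared standard errors.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two replicates")
    se2_a = variance(sample_a) / n_a  # squared standard error of mean A
    se2_b = variance(sample_b) / n_b  # squared standard error of mean B
    if se2_a + se2_b == 0:
        raise ValueError("both samples have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / math.sqrt(se2_a + se2_b)
    # Welch-Satterthwaite approximation of the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


if __name__ == "__main__":
    # Made-up accuracies for two scenarios, one value per repetition.
    best = [80.1, 79.4, 81.0, 80.6, 79.9]
    other = [76.2, 77.0, 75.8, 76.9, 76.4]
    t, df = welch_t(best, other)
    print(f"t = {t:.3f}, df = {df:.1f}")
```

To decide between H0 and H1 at the 0.05 level, the resulting t and degrees of freedom would still need to be compared against a t-distribution, for example with a statistics table or library.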
49. [Figure residue: view of a coordinates_ file. Legible column headers: Image No, Control, Field of View, X Coord, Y Coord, Classes. The rows repeat the entries starve and NAO and list image paths of the form Mercury operaimages CHEO 20100809 siStarved 004003000 flex.]