LSF Analyzer User's Guide

Contents

1. Average Run Time of Exited Jobs: Average time from dispatch to completion for failed jobs, in seconds.
   Statistic: Description
   Total Run Time of Done & Exited Jobs: Time from dispatch to completion, total for all finished jobs, in seconds.
   Total Run Time of Done Jobs: Time from dispatch to completion, total for all successful jobs, in seconds.
   Total Run Time of Exited Jobs: Time from dispatch to completion, total for all failed jobs, in seconds.
   Average Memory Usage of Done & Exited Jobs: Average maximum resident memory used by processes of all finished jobs, in KB.
   Average Memory Usage of Done Jobs: Average maximum resident memory used by processes of successful jobs, in KB.
   Average Memory Usage of Exited Jobs: Average maximum resident memory used by processes of failed jobs, in KB.
   Total Memory Usage of Done & Exited Jobs: Maximum resident memory used by all processes of a job, total for all finished jobs, in KB.
   Total Memory Usage of Done Jobs: Maximum resident memory used by all processes of a job, total for successful jobs, in KB.
   Total Memory Usage of Exited Jobs: Maximum resident memory used by all processes of a job, total for failed jobs, in KB.
   Average Swap Usage of Done & Exited Jobs: Average maximum virtual memory used by processes of all finished jobs, in KB.
   Average
2. ODBC Driver Installation
   LSF Analyzer requires the version 3.0 or later ODBC driver for Microsoft SQL Server 6.5. The driver is available from Microsoft.
   Step 1. Download the Microsoft ODBC 3.5 SDK (ODBC35IN.exe), which installs the supported ODBC driver for Microsoft SQL Server 6.5, from http://www.microsoft.com/data/odbc
   Step 2. Run ODBC35IN.exe to install the Setup for Microsoft ODBC 3.5 SDK.
   Step 3. Run Setup.exe from the ODBC 3.5 SDK install directory to begin the ODBC driver installation. Select the Minimum installation option.
   DSN Setup (Steps 1 to 9)
   Run ODBC Administrator from the Windows Control Panel.
   Select the System DSN tab.
   Click the Add button. Displays the Create New Data Source dialog.
   Select the MS SQL driver from the list. Displays the Create New Data Source for SQL Server dialog.
   Enter the name of your MS SQL database in the Name field and add the .ana extension (e.g., cluster1.ana).
   Select the LSF database server from the Server list.
   Enter a description for the DSN into the Description field.
   Click Next.
   Select the "With SQL Server authentication using a login ID & Password entered by the user" option. Enables the Login ID and Password fields.
   Enter the name of the LSF database account for Login ID (e.g., lsfadmin) and the password of the LSF database accou
3. Job Turnaround Time
   Host Related, Queue Related, User Related, Project Related: Job Turnaround Time
   Average Turnaround Time of Done & Exited Jobs (description: page 18)
     Job Related, Host Related, Queue Related, User Related, Project Related: Job Turnaround Time
   Average Turnaround Time of Exited Jobs (description: page 18)
     Job Related, Host Related, Queue Related, User Related, Project Related: Job Turnaround Time
   Average Wait Time of Done Jobs (description: page 19)
     Job Related, Queue Related, User Related, Project Related: Job Wait Time
   Average Wait Time of Done & Exited Jobs (description: page 19)
     Job Related, Queue Related, User Related, Project Related: Job Wait Time
   Average Wait Time of Exited Jobs (description: page 19)
     Job Related, Queue Related, User Related, Project Related: Job Wait Time
   Batch Job Slot Utilization (description: page 21)
     Resource Related, Host Related: Batch Job Slot Utilization of Hosts
   Batch Processor Utilization (description: page 21)
     Resource Related: Batch Job Slot Utiliz
4. LSF Analyzer User's Guide
   Second Edition, August 1998
   Platform Computing Corporation
   Copyright © 1994-1998 Platform Computing Corporation. All rights reserved.
   This document is copyrighted. This document may not, in whole or part, be copied, duplicated, reproduced, translated, electronically stored, or reduced to machine readable form without prior written consent from Platform Computing Corporation.
   Although the material contained herein has been carefully reviewed, Platform Computing Corporation does not warrant it to be free of errors or omissions. Platform Computing Corporation reserves the right to make corrections, updates, revisions, or changes to the information contained herein.
   UNLESS PROVIDED OTHERWISE IN WRITING BY PLATFORM COMPUTING CORPORATION, THE PROGRAM DESCRIBED HEREIN IS PROVIDED AS IS WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. IN NO EVENT WILL PLATFORM BE LIABLE TO ANYONE FOR SPECIAL, COLLATERAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING ANY LOST PROFITS OR LOST SAVINGS, ARISING OUT OF THE USE OF OR INABILITY TO USE THIS PROGRAM.
   LSF Base, LSF Batch, LSF JobScheduler, LSF MultiCluster, LSF Analyzer, LSF Make, LSF Parallel, Platform Computing, and the Platform Computing and LSF logos are trademarks of Platform Computing Corporation. Other products
5. All examples in this chapter assume the use of Microsoft SQL Server 6.5 as the LSF database, and use of an ODBC-compliant DBMS.
   In This Chapter
   Database Installation Procedures
   - Database Setup on page 50
   - Host Setup on page 51
   - LSF Setup on page 53
   Database Commands and Parameters
   - LSF Database Utility Command on page 56
   - LSF Data Collection Parameters on page 56
   - lsb.acct Data Conversion on page 57
   Database Setup
   Log on as the Microsoft SQL Server Database Administrator (DBA) and complete the following procedures in order:
   - Create LSF Database Accounts on page 50
   - Create A New Database on page 51
   - Build LSF Database Schema on page 51
   - Grant Permissions to LSF Database Accounts on page 51
   Create LSF Database Accounts
   Step 1. Create a new account that will be used by LSF Batch to write information to the LSF database. This is the LSF database user account. By default, LSF Analyzer also uses this account to read information in the LSF database.
   Step 2. If you are very concerned about security, you may choose to create a second account that does not have permission to write to the database. This is the optional LSF database guest account. LSF Analyzer does not require permission to write to the database, so if you create the guest account, LSF Analyzer will use it instead of the LSF database use
6. For example:
     % setenv PATH $DB/bin:$PATH
     % setenv MANPATH $DB/man:$MANPATH
   where $DB is the LSF database directory.
   Command Reference
   To run any of the commands in this section, you must log on to the LSF database server as the LSF primary administrator. The following commands all support the standard -h option.
   lsdbserver start | stop
     Starts up or shuts down the LSF database.
   lsdbcreate database_name
     Creates a new database with the LSF database schema. Specify a unique database name; you cannot use the name of an existing LSF database.
   lsdbstatus
     Prints the current status of the LSF database server: which configuration file is in use, maximum and actual number of connections, and details of all connections.
   lsdbrecords database_name [-t time]
     Prints reference numbers for job-related and resource-related records. If -t is specified, prints another pair of reference numbers for job and resource records logged before the specified time.
     If you want good performance from LSF Analyzer, you should keep the LSF database small. When the reference numbers exceed 50,000 for job records or 400,000 for resource records, you should archive the data using the lsdbmove command. The rate of growth of your LSF database depends on all of the following:
     - number of jobs issued daily in the cluster
     - number of hosts in the cluster
     - number of resource indices logged for the cluste
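   The archiving rule of thumb for lsdbrecords can be sketched as a small check. This is only an illustration in Python: the two limits come from the text above, but the function name and its arguments are hypothetical, and the reference numbers are assumed to have been read manually from the lsdbrecords output.

```python
# Limits quoted in the guide for lsdbrecords reference numbers.
JOB_RECORD_LIMIT = 50_000
RESOURCE_RECORD_LIMIT = 400_000


def should_archive(job_ref: int, resource_ref: int) -> bool:
    """Return True when either reference number exceeds its limit.

    Hypothetical helper: job_ref and resource_ref are the reference
    numbers reported for job and resource records.
    """
    return job_ref > JOB_RECORD_LIMIT or resource_ref > RESOURCE_RECORD_LIMIT


if __name__ == "__main__":
    # Resource records are over the limit, so archiving is suggested.
    print(should_archive(12_000, 410_000))
```

   A True result would suggest archiving the data, as described for the reference-number limits above.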
7. Job Memory Usage
   Project Related: Job Memory Usage
   Total Memory Usage of Running & Suspended Jobs (description: page 22)
     Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
   Total Run Time of Done Jobs (description: page 20)
     Job Related, Queue Related, User Related, Project Related: Job Run Time
   Total Run Time of Done & Exited Jobs (description: page 20)
     Job Related, Queue Related, User Related, Project Related: Job Run Time
   Total Run Time of Exited Jobs (description: page 20)
     Job Related, Queue Related, User Related, Project Related: Job Run Time
   Total Swap Usage of Done Jobs (description: page 20)
     Job Related, Resource Related, Queue Related, User Related, Project Related: Job Swap Space Usage
   Total Swap Usage of Done & Exited Jobs (description: page 20)
     Job Related, Resource Related, Queue Related, User Related, Project Related: Job Swap Space Usage
   Total Swap Usage of Exited Jobs (description: page 20)
     Job Related: Job Swap Space Us
8. Job Memory Usage Ranges and Job Swap Space Usage Ranges
   Statistic: Description
   Number of Done and Exited Jobs: All finished jobs (includes both successful and failed jobs, but does not include running, pending, or suspended jobs).
   Number of Done Jobs: Successful jobs (jobs that were successfully dispatched and completed without an error status).
   Number of Exited Jobs: Failed jobs (jobs that were not successfully dispatched, or exited with an error status).
   The following statistics can be displayed over these X-axis values: Time (accumulated), Hosts, Host Types, Host Models, and Queues.
   Statistic: Description
   Job Throughput: The number of done and exited jobs divided by the time taken to finish these jobs, in jobs/hour.
   The following statistics can be displayed over these X-axis values: Time (accumulated), Queues, Users, Projects, and Job Names.
   Statistic: Description
   Average Turnaround Time of Done & Exited Jobs: Average time from submission to completion for all finished jobs, in seconds.
   Average Turnaround Time of Done Jobs: Average time from submission to completion for successful jobs, in seconds.
   Average Turnaround Time of Exited Jobs: Average time from submission to completion for failed jobs, in seconds.
   Statistic: Description
   Average Wait Time of Done & Exited Jobs: Average time from submission to dispatch for all
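   The definitions above (turnaround is submission to completion, wait is submission to dispatch, throughput is finished jobs per hour) can be illustrated with a short Python sketch. The Job record and function names are hypothetical and are not part of LSF Analyzer; the elapsed time for throughput is passed in by the caller because the guide does not say how it is measured.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Job:
    """Hypothetical finished-job record; all times are in seconds."""
    submitted: float
    dispatched: float
    completed: float


def average_turnaround(jobs: List[Job]) -> float:
    """Mean time from submission to completion, in seconds."""
    if not jobs:
        return 0.0
    return sum(j.completed - j.submitted for j in jobs) / len(jobs)


def average_wait(jobs: List[Job]) -> float:
    """Mean time from submission to dispatch, in seconds."""
    if not jobs:
        return 0.0
    return sum(j.dispatched - j.submitted for j in jobs) / len(jobs)


def job_throughput(jobs: List[Job], elapsed_seconds: float) -> float:
    """Number of finished jobs divided by the elapsed time, in jobs/hour."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return len(jobs) / (elapsed_seconds / 3600.0)


if __name__ == "__main__":
    jobs = [Job(0, 10, 110), Job(0, 30, 230)]
    # prints: 170.0 20.0 1.0
    print(average_turnaround(jobs), average_wait(jobs), job_throughput(jobs, 7200))
```

   Run time (dispatch to completion) would follow the same pattern, using completed minus dispatched.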
9. On UNIX, enter the following command:
     % xanalyzer
   Note: If the command cannot be executed, the appropriate directories may need to be added to the system's path; check with the system administrator.
   On Windows NT, select Programs > LSF Suite for Workload Management > LSF Analyzer.
   Step 2. Select New Report, click OK. Displays the Select Data Source dialog.
   Step 4. Enter Number of Jobs as the Title for this Report.
   Step 5. Click the Y-axis Add button. Displays the Add to Y-axis dialog.
   Step 6. Click Add, then Done. Returns to the New Report dialog.
   [Figure: report titled Number of Jobs, showing Number of Done & Exited Jobs in the cluster]
   Modifying the Chart Type
   To change the line chart to a bar graph, area graph, or table, select Report > Bar, Area, or Table.
   3 Generating Reports
   About Reports
   In LSF Analyzer, a report displays one or more statistics on the Y-axis, measured against a single set of values on the X-axis.
10. chargeback reports 38
    CSV (comma separated) files 27
    POPOL Sve Sen RR xx esa sex 25
    CSV file, creation 27
    currency for chargeback 38
    D
    data collection parameters 45, 56
    data conversion utility 46, 57
    Data Source, description 15
    data source, selecting 16
    data, exporting 27
    database utility commands 43, 56
    DB_DEFAULT_INTVAL 45, 56
    DB_JOB_RES_USAGE_INTVAL 46, 57
    DB_LOAD_INTVAL 46, 57
    DB_SELECT_LOAD 46, 57
    dialogs
      File Selection 27, 29, 30
      Modify Chargeback Report 39
      Modify Report 26
      New Chargeback Report 38
      New Report 25
      Print 26, 27, 40
      Rates 39
      Select Data Source 16
      X-axis 26
      Y-axis 26
    documentation x
    E
    Edit Y-axis dialog 26
    Elapsed Time chargeback resource 37
    elements of a report 15
    exporting report data 27
    F
    fax numbers, Platform xi
    features of LSF Analyzer 1
    File menu
      Export 27
      New Report 25
      New Report from Template 29
      Print 26
      Save as Template 29
      Select Data Source 16
    File Selection dialog 27, 29, 30
    G
    generating
      chargeback reports 38
      reports 25
    guides
11. 26
    Y
    Y-axis dialog 26
    Y-axis statistics 17
12. Database Parameters
    Edit the lsf.conf file and set the following parameters to define the LSF database environment.
    Note: If you took all the recommended steps to simplify database setup, you do not need to set these parameters. The default values will match your database setup.
    LSF_DB_ACTIVE
      Specifies the name of the active LSF database (see Step 5 on page 52).
      Default: the name of the cluster is used (e.g., cluster1), but the variable remains undefined.
    LSF_DB_ACCT
      Specifies the Login ID for the LSF database account (see Create LSF Database Accounts on page 50).
      Default: the name of the LSF primary administrator account is used (e.g., lsfadmin), but the variable remains undefined.
    LSF_DB_GUEST_ACCT
      Specifies the Login ID for the LSF database guest account (see Create LSF Database Accounts on page 50).
      Default: the name of the LSF primary administrator account is used (e.g., lsfadmin), but the variable remains undefined.
    Update LSF
    The procedure to update LSF differs depending on whether LSF is running or yet to be started. Is LSF running?
    - Yes: Go to Reconfigure LSF.
    - No: Go to Starting LSF.
    Reconfigure LSF
    Reconfigure LSF by executing the following two commands:
      C:\> lsadmin reconfig
      C:\> badmin reconfig
    Starting LSF
    The LSF service and daemons on each LSF server host will start automatically when the machine is restarted. If you cannot restart each host at this time, log on as
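    The default behavior described for the three database parameters can be pictured as a simple lookup with fallbacks. The following Python sketch is illustrative only: resolve_db_settings is a hypothetical helper, and the dictionary stands in for whatever values are actually set in lsf.conf. It is not how LSF itself reads its configuration.

```python
from typing import Dict


def resolve_db_settings(conf: Dict[str, str], cluster_name: str,
                        primary_admin: str) -> Dict[str, str]:
    """Return the effective database settings.

    Any parameter missing from conf falls back to the default stated in
    the guide: the cluster name for LSF_DB_ACTIVE, and the LSF primary
    administrator account for both account parameters.
    """
    return {
        "LSF_DB_ACTIVE": conf.get("LSF_DB_ACTIVE", cluster_name),
        "LSF_DB_ACCT": conf.get("LSF_DB_ACCT", primary_admin),
        "LSF_DB_GUEST_ACCT": conf.get("LSF_DB_GUEST_ACCT", primary_admin),
    }


if __name__ == "__main__":
    # Nothing set in lsf.conf: every value comes from the defaults.
    print(resolve_db_settings({}, "cluster1", "lsfadmin"))
```

    Note that the input dictionary is left unchanged, which mirrors the statement that the variable itself remains undefined when the default is used.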
13. The report in Figure 3 compares the memory utilization and CPU utilization for host1.
    Figure 3. Host Resource Utilization
    This example shows that the memory for host1 is fully utilized, but the available CPU resources are not. One conclusion drawn from this report is that there is not enough memory installed in host1. One possible corrective action would be to install additional memory, then rerun this report to verify that CPU resources are being fully utilized.
    Cluster Profile
    Cluster Availability
    Using LSF Analyzer to produce a cluster profile provides the information needed to demonstrate that service commitment levels were met. The report shown in Figure 4 was produced using the Performance General ClusterAvail_Time template.
    Figure 4. Cluster Availability
    This report shows the number of OK hosts available in the cluster and the total number of hosts in the cluster. The large number of available OK hosts reflects the reliability of the cluster.
    Activity Trend
    Using LSF Analyzer to produce a cluster profile highlighting user submission trends provides the information needed to schedule regular m
14. Tue x H Wel pir eege ate BEER deed x xi I VOLGENS deua Moers 33 invoices creating ice ridere erg 25 slee AE RE eee 26
    J
    job log file data conversion 46, 57
    L
    lsb.acct data 46, 57
    lsb charge rate file 38
    lsdbbuildidx 44
    Isdbeleaf 44
    lsdbcreate 44
    lsdbdrop 45
    lsdbmove 45
    lsdbpasswd 56
    lsdbrecords 44
    lsdbserver 43
    lsdbstatus 44
    LSF Analyzer
      concepts 2
      features 1
    LSF Database 16
    LSF Enterprise Edition x
    LSF Standard Edition x
    LSF Suite documentation x
    LSF Suite products ix
    LSF_DB_ACCT 54
    LSF_DB_ACTIVE 54
    LSF_DB_GUEST_ACCT 54
    LSF_DB_HOST 42
    M
    mailing address, Platform xi
    Memory chargeback resource 37
    Modify Chargeback Report dialog 39
    Modify Report dialog 26
    modifying
      chargeback rates 39
      chargeback reports 39
      reports 26
    N
    New Chargeback Report dialog 38
    New Report dialog 25
    O
    online documentation xi
    P
    phone numbers, Platform xi
    Platform Computing Corporation xi
    Print dialog 26, 27, 40
    printing chargeback
15. edit LSB_CONFDIR/cluster/configdir/lsb.params. The following parameters can be configured:
    DB_DEFAULT_INTVAL
      Specifies the time interval to log job data (except load information and resource usage of running and suspended jobs) to the database, in minutes. To stop logging job data, set DB_DEFAULT_INTVAL 1.
      Default: 5
    DB_JOB_RES_USAGE_INTVAL
      Optional. If defined, specifies the time interval to log job resource usage to the database, in minutes.
      Default: undefined (job resource is not logged)
    DB_LOAD_INTVAL
      Specifies the time interval to log load information (internal and external load indices, and shared resources), in minutes.
      Minimum: 15
      Default: 60 (1 hour)
    DB_SELECT_LOAD
      Optional. Specifies which load information to collect. Possible values: internal load indices, external load indices, and shared resources. This parameter is case sensitive. If more than 4 load values are specified (separated by spaces), only the first 4 will be used. For example:
        DB_SELECT_LOAD ut mem ext_idx1 shared_lic1
      will collect CPU utilization, available memory, the user-specified external load index ext_idx1, and the user-specified shared resource shared_lic1.
      Default: ut
    lsb.acct Data Conversion
    The acct2db utility is used to convert job log files (e.g., lsb.acct) into LSF databases, allowing you to analyze LSF data collected before LSF Analyzer was installed. These databases cannot be used as online active dat
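    Two of the limits stated above (at most 4 load values, and a minimum load interval of 15 minutes) can be checked before editing lsb.params. The Python sketch below is a hypothetical pre-check, not part of LSF: the function names are made up, and the guide does not say what LSF does with an interval below the minimum, so the sketch only reports whether a value is acceptable.

```python
from typing import List

MAX_LOAD_VALUES = 4      # only the first 4 values of DB_SELECT_LOAD are used
MIN_LOAD_INTERVAL = 15   # minimum DB_LOAD_INTVAL, in minutes


def effective_load_values(setting: str) -> List[str]:
    """Split a DB_SELECT_LOAD value on spaces and keep the first 4 names.

    Names are kept exactly as written because the parameter is case sensitive.
    """
    return setting.split()[:MAX_LOAD_VALUES]


def load_interval_is_valid(minutes: int) -> bool:
    """Return True if the interval meets the documented minimum."""
    return minutes >= MIN_LOAD_INTERVAL


if __name__ == "__main__":
    # "extra" is a placeholder fifth value; it is dropped.
    print(effective_load_values("ut mem ext_idx1 shared_lic1 extra"))
    print(load_interval_is_valid(10))
```

    With the example from the text plus one extra value, only ut, mem, ext_idx1, and shared_lic1 are kept.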
16. invoices 40
      chargeback reports 40
    printing reports 26
    Q
    quick start 9
    R
    range of report 25
    Rates button 38
    Rates dialog 39
    rates, chargeback 38
    report
      Analyzer 15
      chargeback 33
    Report menu 26, 30
    report modification 26
    Report Range 16, 34
    report range 25
    Report Range button 26
    report, elements of 15
    reports
      chargeback 35
      creating 25
      printing 26
      saving as templates 29
      statistics 26
    Resource Rates, changing 38
    resource rates, chargeback 38
    resources, chargeback 37
    S
    saving reports as templates 29
    Select Data Source dialog 16
    selecting data source 16
    Statistic 26
    statistics, Y-axis 17
    summarizing chargeback data 35
    support xi
    Swap chargeback resource 37
    T
    technical assistance xi
    telephone numbers, Platform xi
    template 29
    Title 16, 34
    Tools menu, Chargeback 38
    U
    UNIX database utility commands 43
    W
    Windows NT database utility commands 56
    X
    X-axis dialog
17. report shows the costs of resources used by individual users or projects, and also the total costs of resources used, in a table format.
    [Figure: chargeback report for a project, in table format]
    A set of invoices is just a different way of displaying the same information contained in a report. An invoice gives the cost of resources used by individual users or projects. One invoice is generated for each user or project included in the report.
    [Figure: invoice for a project]
    Invoices may be used for billing purposes. The tabular format may be used for comparison purposes, and to summarize and archive periodic reports.
    Chargeback Resources
    LSF Analyzer can generate chargeback reports for any combination of resources. The resources for which chargeback accounting is done are listed below.
    Resource: Description (Unit)
    Elapsed Time: The wall clock time from the start of a job to its end. This is useful if you want to charge for the actual time a job is running in the cluster. This time includes any suspension of a job by the person who submitted the job. (Unit: seconds)
    CPU Time: The CPU time used by the job. This is use
18. the following address:
      LSF Technical Support
      Platform Computing Corporation
      3760 14th Avenue
      Markham, Ontario
      Canada L3R 3T7
      Tel: +1 905 948 8448
      Toll-free: 1-87PLATFORM (1-877-528-3676)
      Fax: +1 905 948 9975
      Electronic mail: support@platform.com
    Please include the full name of your company.
    You may find the answers you need from Platform Computing Corporation's home page on the World Wide Web. Point your browser to www.platform.com.
    If you have any comments about this document, please send them to the attention of LSF Documentation at the address above, or send email to doc@platform.com.
    1 Introduction
    The LSF system provides a powerful distributed computing environment that tightly integrates a suite of products: LSF Batch, LSF JobScheduler, LSF MultiCluster, LSF Analyzer, LSF Make, LSF Parallel, and LSF Base. While each of these products independently delivers great value, collectively they constitute a complete workload management solution. LSF Analyzer is a tool for comprehensive workload and performance analysis.
    Overview of LSF Analyzer
    LSF Analyzer processes historical workload data to produce reports about a cluster. The workload data includes information about batch jobs, system metrics, load indices, and resource usage. LSF Analyzer provides system administrators and managers with the information to make intelligent, informed scheduling and capacity planning de
19. Ad EE wheats We ES vea eh do 25
      Creating Reports 25
      Modifying Reports 26
      Printing Reports 26
      Exporting Reports 27
    4 Using Templates 29
      About Templates 29
      Saving a Template 29
      Using a Template to Create a Report 29
      Using a Template to Modify a Report 30
      Default Templates 31
    5 Chargeback Accounting 33
      About Chargeback Reports 33
      Reports and Invoices 35
      Chargeback Resources 37
      Chargeback Rates 38
      Generating Chargeback Reports 38
      Modifying Chargeback Reports 39
      RR LE HA ER OE oS 40
      Printing Chargeback Report 40
      Printing Chargeback Invoices 40
    6 LSF Database (UNIX) 41
      In This Chapter 41
      LSF Database Installation 42
      Installation 42
      Starting the Database 42
      LSF Database Utility Commands 43
20. Default: LSB_SHAREDIR/cluster/logdir/lsb.acct
    database_name
      Specifies the name of the target LSF database. The target database cannot be the online active database.
    A Categories of Statistics
    This list shows the classes of statistics (second field in the Y-axis dialog) available in each category (first field in the Y-axis dialog).
    Job-related statistics: Number of Jobs, Job Throughput, Job Turnaround Time, Job Wait Time, Job CPU Time, Job Run Time, Job Memory Usage, Job Swap Space Usage
    Resource-related statistics: Usage of Resource Shared among Hosts, Load Index of Hosts, CPU Utilization of Hosts, Memory Utilization of Hosts, Swap Space Utilization of Hosts, Batch Job Slot Utilization of Hosts, Job CPU Time, Job Memory Usage, Job Swap Space Usage
    Host-related statistics: Number of Hosts, Load Index of Hosts, CPU Utilization of Hosts, Memory Utilization of Hosts, Swap Space Utilization of Hosts, Batch Job Slot Utilization of Hosts, Number of Jobs, Job Throughput, Job Turnaround Time
    Queue-related statistics: Number of Jobs, Job Throughput, Job Turnaround Time, Job Wait Time, Job CPU Time, Job Run Time, Job Memory Usage, Job Swap Space Usage
    User-related statistics: Number of Jobs, Job Turnaround Time, Job Wait Time, Job CPU Time, Job Run Time, Job Memory Usage, Job Swap Space Usage
    Project-related statistics: Number of Jobs, Job Turnaround Time, Job Wait Tim
21.   Environment Configuration 43
      Command Reference 43
      LSF Data Collection Parameters 45
      lsb.acct Data Conversion 46
    7 LSF Database (Windows NT) 49
      In This Chapter 49
      Database Setup 50
      Create LSF Database Accounts 50
      Create A New Database 51
      Build LSF Database Schema 51
      Grant Permissions to LSF Database Accounts 51
      Host Setup 51
      ODBC Driver Installation 52
      DSN Setup 52
      LSF Setup 53
      Set Database Login Passwords 53
      Set LSF Database Parameters 54
      Update LSF 54
      LSF Database Utility Commands 56
      lsdbpasswd 56
      LSF Data Collection Parameters 56
      lsb.acct Data Conversion 57
    A Categories of Statistics 59
    B Classes of Statistics 61
    C Statistics 65
    Index 75
    Preface
    Audience
    This guide is designed to help managers, LSF cluster administrators, and performance analysts use LSF Analyzer to perform accounting and chargeback functions, cluster performance analysis, capacity planning, and forecasting. This guide describes the use of LS
22. F Analyzer. In it, you will find all the information you need to set up and use LSF Analyzer at your site. This guide assumes you have knowledge of common LSF system terminology.
    LSF Suite 3.2
    LSF is a suite of workload management products, including the following:
    LSF Batch is a batch job processing system for distributed and heterogeneous environments, which ensures optimal resource sharing.
    LSF JobScheduler is a distributed production job scheduler that integrates heterogeneous servers into a virtual mainframe or virtual supercomputer.
    LSF MultiCluster supports resource sharing among multiple clusters of computers using LSF products, while maintaining resource ownership and cluster autonomy.
    LSF Analyzer is a graphical tool for comprehensive workload data analysis. It processes cluster-wide job logs from LSF Batch and LSF JobScheduler to produce statistical reports on the usage of system resources by users on different hosts through various queues.
    LSF Parallel is a software product that manages parallel job execution in a production networked environment.
    LSF Make is a distributed and parallel Make, based on GNU Make, that simultaneously dispatches tasks to multiple hosts.
    LSF Base is the software upon which all the other LSF products are based. It includes the network servers (LIM and RES), the LSF API, and load sharing tools.
    There are two editions of the LSF Suite: LSF Enterprise Edi
23. PU Time
    Resource Related, Queue Related, User Related, Project Related: Job CPU Time
    Total CPU Time of Done & Exited Jobs (description: page 19)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job CPU Time
    Total CPU Time of Exited Jobs (description: page 19)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job CPU Time
    Total CPU Time of Running & Suspended Jobs (description: page 22)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job CPU Time
    Total Memory Usage of Done Jobs (description: page 20)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
    Total Memory Usage of Done & Exited Jobs (description: page 20)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
    Total Memory Usage of Exited Jobs (description: page 20)
      Job Related, Resource Related, Queue Related: Job Memory Usage
      User Related
24. Swap Usage of Done Jobs: Average maximum virtual memory used by processes of successful jobs, in KB.
    Average Swap Usage of Exited Jobs: Average maximum virtual memory used by processes of failed jobs, in KB.
    Total Swap Usage of Done & Exited Jobs: Maximum virtual memory used by all processes of a job, total for all finished jobs, in KB.
    Total Swap Usage of Done Jobs: Maximum virtual memory used by all processes of a job, total for successful jobs, in KB.
    Total Swap Usage of Exited Jobs: Maximum virtual memory used by all processes of a job, total for failed jobs, in KB.
    The following statistics can only be displayed over Time (accumulated) on the X-axis.
    Statistic: Description
    CPU Utilization: The CPU time used over the last minute divided by the CPU time available in the same period, in percentage.
    Memory Utilization: The amount of memory used divided by the total amount of memory available on the host, in percentage.
    Swap Space Utilization: The amount of swap space used divided by the total amount of swap space available on the host, in percentage.
    Batch Job Slot Utilization: The number of used batch job slots divided by the maximum number of job slots on the host, in percentage.
    Batch Processor Utilization: The number of used batch job slots divided by the number of processors on the host, in percentage.
    15-second Run Queue Length: The 15-second e
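    Each utilization statistic above is a ratio expressed as a percentage. The short Python sketch below shows the arithmetic for the two batch statistics; the function names are hypothetical and the inputs are assumed to be already known for a single host.

```python
def utilization_percent(used: float, capacity: float) -> float:
    """Return used / capacity as a percentage."""
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    return 100.0 * used / capacity


def batch_job_slot_utilization(used_slots: int, max_job_slots: int) -> float:
    """Used batch job slots divided by the maximum number of job slots."""
    return utilization_percent(used_slots, max_job_slots)


def batch_processor_utilization(used_slots: int, processors: int) -> float:
    """Used batch job slots divided by the number of processors."""
    return utilization_percent(used_slots, processors)


if __name__ == "__main__":
    # A host with 4 processors, 8 job slots, and 6 slots in use.
    print(batch_job_slot_utilization(6, 8))   # prints 75.0
    print(batch_processor_utilization(6, 4))  # prints 150.0
```

    Because the two statistics use different denominators, Batch Processor Utilization can exceed 100 percent on a host configured with more job slots than processors, as in this made-up example.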
25. [Figure: elements of a report, with Report Range, Data Source, and Chart Type labeled]
    The report contains the following elements:
    Data Source: database containing the information used to generate the statistics (the data source is specified when starting LSF Analyzer, but is not displayed in the report)
    Title: user-specified title displayed on the report (optional)
    Y-axis: statistics calculated by LSF Analyzer (maximum 7 per report)
    X-axis: values (such as hosts, users, projects, or time periods) for which the statistics are calculated
    Report Range: the period of time over which the statistics are calculated
    Chart Type: the method of displaying the information (line chart, bar chart, area chart, or table format)
    Selecting the Data Source
    Each LSF cluster writes cluster information to a database, and this information is used to generate the LSF Analyzer reports. You will have multiple databases if old databases have been archived, or if you have multiple clusters. Before you can create a report, you must specify which database contains the information you want to analyze. By default, the database uses the same name as your cluster (e.g., cluster1).
    To specify a database, take the following steps:
    Step 1. Choose File > Select Data Source.
    Step 2. In the Select Data Source dialog, select one data source, then click OK.
    Y ax
26. Time
    Queue Related, User Related, Project Related: Job CPU Time
    Average CPU Time of Running & Suspended Jobs (description: page 22)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job CPU Time
    Average Memory Usage of Done Jobs (description: page 20)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
    Average Memory Usage of Done & Exited Jobs (description: page 20)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
    Average Memory Usage of Exited Jobs (description: page 20)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
    Average Memory Usage of Running & Suspended Jobs (description: page 22)
      Job Related, Resource Related, Queue Related, User Related, Project Related: Job Memory Usage
    Average Run Time of Done Jobs (description: page 19)
      Job Related, Queue Related, User Related, Project Related: Job Run Time
    Average Run Time of Done
27. X cluster or a mixed UNIX/NT cluster. If your cluster contains only Windows NT hosts, see Chapter 7, LSF Database Windows NT, on page 49. Note: LSF database administration is the only part of LSF that requires you to log on using the LSF primary administrator user account. The LSF primary administrator account is the first cluster administrator specified in the lsf.cluster.cluster file. If your cluster includes any Windows NT hosts, do not attempt to change the LSF primary administrator after LSF has been installed. In This Chapter: • LSF Database Installation on page 42 • LSF Database Utility Commands on page 43 • LSF Data Collection Parameters on page 45 • lsb.acct Data Conversion on page 46. LSF Database Installation. Installation: Install LSF Analyzer using lsfsetup, the UNIX installation program for LSF Suite. For detailed instructions, see the LSF Installation Guide. When you install LSF Analyzer for the first time, you are prompted to specify the LSF database server. This defines the LSF_DB_HOST parameter in the lsf.conf file. The LSF database files will be installed on the machine you specify. • If you specify the installation host (the machine used to run lsfsetup) as the LSF database server, the LSF database installation program starts automatically. • If you specify another machine as the LSF database server, you have to install
28. a report from scratch. For more information, see Using a Template to Create a Report on page 29. Modifying Reports. To modify the current report, take the following steps: Step 1: Click the Modify button on the toolbar or choose Report > Modify. Step 2: In the Modify Report dialog, make any changes you want, then click OK. For more information, see Using a Template to Modify a Report on page 30. Printing Reports. To print the current report, take the following steps: Step 1: Choose File > Print. Step 2: In the Print dialog, specify printing parameters, then click OK. Exporting Reports. To export data from the current report into a CSV (comma-separated) file, take the following steps: Step 1: Choose File > Export. Step 2: In the File Selection dialog, type or select the CSV file name, then click OK or Save. 4 Using Templates. About Templates. LSF Analyzer includes templates to help you reproduce reports. A template stores all the characteristics of an analysis (report title, Y-axis, X-axis, report range, and chart type) except for the data itself. You can save any report as a template and use it as the basis for generating a similar analysis of different job data. You can use any template to create a new report or modify an existing one. Saving a Template. To save the current report as a template, take the following steps: Step 1: Choose Fil
29. abases. The license for the acct2db utility expires 30 days after your LSF 3.2 license is generated. To convert data from an existing job log file, take the following steps (all Windows NT examples assume the use of Microsoft SQL Server 6.5 as the LSF database and use of an ODBC-compliant DBMS). Log on as the Microsoft SQL Server Database Administrator to complete steps 1 through 4. Step 1: Create a new LSF database (see Create A New Database on page 51). Step 2: Build the schema (see Build LSF Database Schema on page 51). Step 3: Grant the LSF database user account permission to read and write to the new database. If you have created an LSF database guest account, grant it Read permission for the new database. Step 4: Set up a DSN for the new database on every batch host and on the host where LSF Analyzer is installed (see DSN Setup on page 52). Step 5: Log on to the host where the job log file (i.e., lsb.acct) is stored. Step 6: Issue the acct2db command and specify the new database as the target database. Each finished job record in the job log file is converted into a database record. Syntax: acct2db [-h] [-V] [-f lsb.acct_file] database_name. -h: Prints command usage to stderr and exits. -V: Prints the LSF release version to stderr and exits. -f lsb.acct_file: Specifies the job log file on the local host which is to be converted into a database
30. ack Reports. To create a new chargeback report, take the following steps: Step 1: Choose Tools > Chargeback. Step 2: In the New Chargeback Report dialog, specify the following elements: • title (optional) • format (Report or Invoice) • who to charge (Users or Projects) • what resource usage to charge for; resource rates (Rates button, optional) • report range. Step 3: Click OK. Modifying Chargeback Reports. To modify the current chargeback report or invoice, take the following steps: Step 1: In the Chargeback Report dialog, click the Modify button. Step 2: In the Modify Chargeback Report dialog, make any changes you want. Step 3 (optional): If you want to change the rates charged for resources, click the Rates button. Make changes in the Rates dialog, then click OK. Step 4: Click OK in the Modify Chargeback Report dialog. Printing Chargeback Reports. To print the current chargeback report, take the following steps: Step 1: Click Print on the Report dialog. Step 2: In the Print dialog, specify printing parameters, then click OK. Printing Chargeback Invoices. To print chargeback invoices, take the following steps: Step 1: Select a user name from the User list. Step 2: Click Print on the Invoice dialog. Step 3: In the Print dialog, specify printing parameters, then click OK. 6 LSF Database UNIX. This chapter is aimed at the LSF administrator running a UNI
31. age; Resource Related: Job Swap Space Usage; Queue Related: Job Swap Space Usage; User Related: Job Swap Space Usage; Project Related: Job Swap Space Usage. Total Swap Usage of Running & Suspended Jobs (description, page 22): Job Related: Job Swap Space Usage; Resource Related: Job Swap Space Usage; Queue Related: Job Swap Space Usage; User Related: Job Swap Space Usage; Project Related: Job Swap Space Usage.
32. aintenance and system downtime. The report shown in Figure 5 was produced using the Workload General Job_Time template. Figure 5: Number of Running and Suspended Jobs in the Cluster. This report shows the number of running and suspended jobs in the cluster, which represents the times of maximum and minimum system usage. This example shows that patterns of low system usage occur on a regular basis. 2 Getting Started. This chapter functions as a quick start guide to begin using LSF Analyzer. The procedures in this chapter give you experience with the various features of LSF Analyzer by guiding you through the series of steps needed to generate a report. The features and options for performing workload analysis are explained in detail in the following chapters. Chargeback accounting is discussed in Chapter 5, Chargeback Accounting, on page 33. Using LSF Analyzer for the First Time. In this example, you will use the default template to create a report showing CPU usage for all users. Configuring and Starting LSF Analyzer. Step 1: Start LSF Analyzer (displays the Welcome to LSF Analyzer dialog).
33. alyzer can be used to identify the users who are submitting CPU-intensive jobs or are submitting a large number of jobs. With this type of information, the administrator can take corrective action to prevent these users from monopolizing cluster resources. The report in Figure 1 shows the number of jobs submitted by each user and the CPU resources consumed by these jobs. Figure 1: System Resource Usage. This report identifies the heaviest consumers of the LSF system resources. One possible action from this information is to implement fairshare policies to reduce the monopolization of system resources. An extension of the user profile is the chargeback report shown in Figure 2. Figure 2: Chargeback Report. Host Profile: Are the host resources being used efficiently? LSF Analyzer can produce a host profile, providing the justification to upgrade computing resources. Performance exceptions are identified in the cluster, such as hosts that are not doing the expected amount of work due to hardware or configuration problems.
34. & Exited Jobs (description, page 19): Job Related: Job Run Time; Queue Related: Job Run Time; User Related: Job Run Time; Project Related: Job Run Time. Average Run Time of Exited Jobs (description, page 19): Job Related: Job Run Time; Queue Related: Job Run Time; User Related: Job Run Time; Project Related: Job Run Time. Average Swap Usage of Done Jobs (description, page 20): Job Related: Job Swap Space Usage; Resource Related: Job Swap Space Usage; Queue Related: Job Swap Space Usage; User Related: Job Swap Space Usage; Project Related: Job Swap Space Usage. Average Swap Usage of Done & Exited Jobs (description, page 20): Job Related: Job Swap Space Usage; Resource Related: Job Swap Space Usage; Queue Related: Job Swap Space Usage; User Related: Job Swap Space Usage; Project Related: Job Swap Space Usage. Average Swap Usage of Exited Jobs (description, page 20): Job Related: Job Swap Space Usage; Resource Related: Job Swap Space Usage; Queue Related: Job Swap Space Usage; User Related: Job Swap Space Usage; Project Related: Job Swap Space Usage. Average Swap Usage of Running & Suspended Jobs (description, page 22): Job Related: Job Swap Space Usage; Resource Related: Job Swap Space Usage; Queue Related: Job Swap Space Usage; User Related: Job Swap Space Usage; Project Related: Job Swap Space Usage. Average Turnaround Time of Done Jobs (description, page 18): Job Related
35. an LSF cluster administrator (a member of the LSF Global Administrators group) and start the LSF service and daemons manually. Note: You should not use the primary LSF administrator's account (normally lsfadmin) to start or stop the LSF service and daemons. To start the LSF service and daemons, use any one of the following methods: • Use the Windows NT Server Manager to start LSF Service on all LSF server hosts. • Click Services on the Windows NT Control Panel and start LSF Service. You will have to repeat this step on each LSF server host. • If LSF Batch has been installed, go to the LSF Suite for Workload Management > LSF Batch program folder and use the LSF administrative tool, LSF Batch Administration. You can use this tool to perform all your administrative tasks for LSF Base and LSF Batch products. • Start a new command console and type: C:\> lssrvcntrl start -m all lssrvman. Usage information for lssrvcntrl is available by typing lssrvcntrl with no options. LSF Database Utility Command. lsdbpasswd. Syntax: lsdbpasswd [-h] userID. Sets and changes the user's password. The password is encrypted, then written to the 1s dbpasswd file. -h: Prints command usage to stderr and exits. userID: Specifies the user ID. LSF Data Collection Parameters. Data collection can be tuned by modifying the parameters configured in the lsb.params file. For example
36. ation of Hosts; Host Related: Batch Job Slot Utilization of Hosts. CPU Utilization (description, page 21): Resource Related: CPU Utilization of Hosts; Host Related: CPU Utilization of Hosts. Disk I/O Rate (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. Interactive Idle Time (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. Job Throughput (description, page 18): Job Related: Job Throughput; Host Related: Job Throughput; Queue Related: Job Throughput. Memory Utilization (description, page 21): Resource Related: Memory Utilization of Hosts; Host Related: Memory Utilization of Hosts. Number of Busy Hosts (description, page 23): Host Related: Number of Hosts. Number of Closed Hosts (description, page 23): Host Related: Number of Hosts. Number of Done Jobs (description, page 18): Job Related: Number of Jobs; Host Related: Number of Jobs; Queue Related: Number of Jobs; User Related: Number of Jobs; Project Related: Number of Jobs. Number of Done and Exited Jobs (description, page 18): Job Related: Number of Jobs; Host Related: Number of Jobs; Queue Related: Number of Jobs; User Related: Number of Jobs; Project Related: Number of Jobs. Number of Exited Jobs (description, page 18): Job Related: Number of Jobs; Host Related: Number of Jobs; Queue Related: Number of Jobs; User Related: Numbe
37. can be configured. DB_DEFAULT_INTVAL: Specifies the time interval, in minutes, to log job data (except load information and resource usage of running and suspended jobs) to the database. To stop logging job data, set DB_DEFAULT_INTVAL 1. Default: 5. DB_JOB_RES_USAGE_INTVAL: Optional. If defined, specifies the time interval, in minutes, to log resource usage of running and suspended jobs to the database. Default: undefined (job resource usage is not logged). DB_LOAD_INTVAL: Specifies the time interval, in minutes, to log load information (internal and external load indices and shared resources). Minimum: 15. Default: 60 (1 hour). DB_SELECT_LOAD: Optional. Specifies which load information to collect. Possible values are internal load indices, external load indices, and shared resources. This parameter is case sensitive. If more than 4 load values are specified (separated by spaces), only the first 4 will be used. For example, DB_SELECT_LOAD = ut mem ext_idx1 shared_lic1 will collect CPU utilization, available memory, the user-specified external load index ext_idx1, and the user-specified shared resource shared_lic1. Default: ut. lsb.acct Data Conversion. The acct2db utility is used to convert job log files (e.g., lsb.acct) into LSF databases, allowing you to analyze LSF data collected before LSF Analyzer was installed. These databases cannot be used as online acti
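The DB_SELECT_LOAD rule described in this excerpt (values separated by spaces, case sensitive, only the first 4 used) can be illustrated with a short Python sketch. This is not LSF code: the function name `effective_load_values` and the sample setting are hypothetical, and the sketch only mirrors the rule as the text states it.

```python
def effective_load_values(setting, limit=4):
    """Return the load names that would be kept under the 'first 4' rule.

    `setting` is the right-hand side of a DB_SELECT_LOAD line.
    Names are kept exactly as written because the parameter is case sensitive.
    """
    names = setting.split()   # values are separated by spaces
    return names[:limit]      # anything after the fourth value is ignored


# Hypothetical setting with five values; the fifth one is dropped.
kept = effective_load_values("ut mem ext_idx1 shared_lic1 pg")
print(kept)
```

A check of this kind can be useful before editing lsb.params, since the text does not say that extra values produce a warning.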
38. cisions required to fully utilize the power delivered by LSF. LSF Analyzer can also be used to do chargeback accounting, generating chargeback reports and invoices. The primary features of LSF Analyzer: • Profiles highlighting the number of jobs processed by the system, job resource usage, and system metrics (load indices and resource usage) • Usage trends for the LSF system, hosts, users, queues, applications, and projects • Information to manage resources by user and project • Chargeback accounting for users or projects, providing reports and invoices • Data export to comma-separated values (CSV) file format, compatible with industry-standard spreadsheet and data analysis tools • Built-in and user-generated templates to automate analysis. Basic Concepts. LSF Analyzer collects and analyzes historical data stored in the LSF database to produce statistical reports, which are designed to suit your needs. The analysis can be displayed in table, bar, area, and line charts and can be saved as a template, which makes it convenient to repeat the analysis at any time. The basic concepts used by LSF Analyzer: • Data Collection: The LSF data collection engine is fully integrated in the LSF system. During normal operation of the LSF system, historical data is collected for all LSF objects (jobs, users, queues, hosts, projects, load indices, and resources) over a user-determined period of time (hours, days, w
39. Suspended Jobs: dispatched but not finished (running jobs or jobs in a suspended state). Num of Pending Jobs: Jobs submitted but not yet dispatched. Average CPU Time of Running & Suspended Jobs: CPU time used by each job, averaged for all jobs which have been dispatched but not finished, in seconds. Total CPU Time of Running & Suspended Jobs: Total time the CPU has spent running jobs which have been dispatched but not finished, in user mode and in kernel mode, in seconds. Average Memory Usage of Running & Suspended Jobs: Resident memory used by all processes in a job, averaged for all jobs which have been dispatched but not finished, in KB. Total Memory Usage of Running & Suspended Jobs: Total resident memory used by all processes in jobs which have been dispatched but not finished, in KB. Average Swap Usage of Running & Suspended Jobs: Virtual memory used by all processes in a job, averaged for all jobs which have been dispatched but not finished, in KB. Total Swap Usage of Running & Suspended Jobs: Total virtual memory used by all processes in jobs which have been dispatched but not finished, in KB. Statistics and descriptions: Number of Hosts in the Cluster: Number of LSF batch server hosts in the cluster. Number of OK Hosts: Hosts able to accept batch jobs. Number of Busy Hosts: Overloaded hosts which are unable to accept batch jobs because some load indices go beyond the confi
40. e > Save as Template. Step 2: In the File Selection dialog, type or select the template file name and click OK or Save. Using a Template to Create a Report. To create a report from a template, take the following steps: Step 1: Choose File > New Report from Template. Step 2: In the Load Template dialog, type or select the template file name and click OK or Open. Step 3: In the New Report dialog, make any changes you want, then click OK. Using a Template to Modify a Report. Using a template to modify a report is the same as deleting the report and creating a new report from a template. To modify the current report using a template, take the following steps: Step 1: Click the Modify button on the toolbar or choose Report > Modify. Step 2: In the Modify Report dialog, click Load Template. Step 3: In the File Selection dialog, type or select the template file name and click OK or Open. Step 4: In the Modify Report dialog, make any changes you want, then click OK. Default Templates. A wide variety of default templates can be installed with LSF Analyzer and used to create reports quickly. They are found in the 3LSF_MISC xanalyzer directory on Windows NT and under the LSF_MISC Xanal
41. e Job CPU Time, Job Run Time, Job Memory Usage, Job Swap Space Usage. B Classes of Statistics. This list shows the statistics (third field in the Y-axis dialog) in each class (second field in the Y-axis dialog). For each class, the definition and statistics are shown. Batch Job Slot Utilization of Hosts. Statistics: Batch Job Slot Utilization; Batch Processor Utilization. Definitions: The number of used batch job slots divided by the maximum number of job slots on the host, and by the number of processors on the host, as a percentage. CPU Utilization of Hosts. Statistics: CPU Utilization. Definition: The CPU time used over the last minute divided by the CPU time available in the same period, as a percentage. Job CPU Time. Statistics: Average CPU Time of Done & Exited Jobs; Average CPU Time of Done Jobs; Average CPU Time of Exited Jobs; Average CPU Time of Running & Suspended Jobs; Total CPU Time of Done & Exited Jobs; Total CPU Time of Done Jobs; Total CPU Time of Exited Jobs; Total CPU Time of Running & Suspended Jobs. Definition: The time the CPU spent running a job, in user mode and in kernel mode, in seconds. Job Memory Usage. Statistics: Average Memory Usage of Done & Exited Jobs; Average Memory Usage of Done Jobs; Average Memory Usage of Exited Jobs; Average Memory Usage of Running & Suspended Jobs; Total Memory Usage of Done & Exited Jobs; Total Memory Usage of Done Jobs; Total Memory Usage of Exit
42. e cluster. C Statistics. For each statistic, the related category (first field in the Y-axis dialog) and class (second field) are shown. 15-second Run Queue Length (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. 1-minute Run Queue Length (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. 15-minute Run Queue Length (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. Available Memory (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. Available Swap Space (description, page 21): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. Available tmp Space (description, page 22): Resource Related: Load Index of Hosts; Host Related: Load Index of Hosts. Average CPU Time of Done Jobs (description, page 19): Job Related: Job CPU Time; Resource Related: Job CPU Time; Queue Related: Job CPU Time; User Related: Job CPU Time; Project Related: Job CPU Time. Average CPU Time of Done & Exited Jobs (description, page 19): Job Related: Job CPU Time; Resource Related: Job CPU Time; Queue Related: Job CPU Time; User Related: Job CPU Time; Project Related: Job CPU Time. Average CPU Time of Exited Jobs (description, page 19): Job Related: Job CPU Time; Resource Related: Job CPU
43. econd Run Queue Length; 1-minute Run Queue Length; 15-minute Run Queue Length; Paging Rate; Disk I/O Rate; Number of Login Users; Interactive Idle Time; Available Memory; Available Swap Space; Available tmp Space; user-specified external load indices. Definition: The built-in indices provided by LIM and user-defined external dynamic numeric resources. Memory Utilization of Hosts. Statistics: Memory Utilization. Definition: The amount of currently available memory divided by the total amount of memory in the host, as a percentage. Number of Hosts. Statistics: Number of Hosts; Number of OK Hosts; Number of Busy Hosts; Number of Full Hosts; Number of Closed Hosts; Number of Available Hosts. Definition: The total number of LSF Batch Server hosts in the cluster and the number of hosts satisfying certain conditions. Number of Jobs. Statistics: Number of Done & Exited Jobs; Number of Done Jobs; Number of Exited Jobs; Number of Running, Pending & Suspended Jobs; Number of Running & Suspended Jobs; Number of Pending Jobs. Definition: The total number of jobs in the cluster satisfying certain conditions. Swap Space Utilization of Hosts. Statistics: Swap Space Utilization. Definition: The amount of currently available swap space divided by the total amount of swap space in the host, as a percentage. Usage of Resource Shared among Hosts. Statistics: specific shared resource. Definition: The total usage of a resource shared by all or some hosts in th
44. ed Jobs; Total Memory Usage of Running & Suspended Jobs. Definition: The maximum resident memory used by all processes of a job, in KB. Job Run Time. Statistics: Average Run Time of Done & Exited Jobs; Average Run Time of Done Jobs; Average Run Time of Exited Jobs; Total Run Time of Done & Exited Jobs; Total Run Time of Done Jobs; Total Run Time of Exited Jobs. Definition: The elapsed time from job dispatch to job completion, in seconds. Job Swap Space Usage. Statistics: Average Swap Usage of Done & Exited Jobs; Average Swap Usage of Done Jobs; Average Swap Usage of Exited Jobs; Average Swap Usage of Running & Suspended Jobs; Total Swap Usage of Done & Exited Jobs; Total Swap Usage of Done Jobs; Total Swap Usage of Exited Jobs; Total Swap Usage of Running & Suspended Jobs. Definition: The maximum virtual memory used by all processes of a job, in KB. Job Throughput. Statistics: Job Throughput. Definition: The number of done and exited jobs divided by the time period to finish these jobs, in jobs/hour. Job Turnaround Time. Statistics: Average Turnaround Time of Done & Exited Jobs; Average Turnaround Time of Done Jobs; Average Turnaround Time of Exited Jobs. Definition: The elapsed time from job submission to job completion, in seconds. Job Wait Time. Statistics: Average Wait Time of Done & Exited Jobs; Average Wait Time of Done Jobs; Average Wait Time of Exited Jobs. Definition: The elapsed time from job submission to job dispatch, in seconds. Load Index of Hosts 15 s
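The Job Wait Time, Job Run Time, Job Turnaround Time, and Job Throughput definitions in this excerpt relate to each other in a simple way, which the Python sketch below makes explicit. It is only an illustration of the definitions as quoted; the function names and the sample timestamps are hypothetical and are not part of LSF Analyzer.

```python
from datetime import datetime


def job_times(submit, dispatch, finish):
    """Return (wait, run, turnaround) in seconds for one finished job.

    wait       = submission to dispatch
    run        = dispatch to completion
    turnaround = submission to completion (equals wait + run)
    """
    wait = (dispatch - submit).total_seconds()
    run = (finish - dispatch).total_seconds()
    turnaround = (finish - submit).total_seconds()
    return wait, run, turnaround


def job_throughput(finished_jobs, period_hours):
    """Done and exited jobs divided by the time period, in jobs/hour."""
    return finished_jobs / period_hours


# Hypothetical job: submitted 09:00, dispatched 09:05, finished 09:35.
wait, run, turnaround = job_times(
    datetime(1999, 1, 1, 9, 0),
    datetime(1999, 1, 1, 9, 5),
    datetime(1999, 1, 1, 9, 35),
)
print(wait, run, turnaround)
print(job_throughput(finished_jobs=12, period_hours=4))
```

Because turnaround time is the sum of wait time and run time, a report that plots all three for the same jobs can show at a glance whether long turnaround comes from queueing or from execution.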
45. eeks, and months) and stored in the LSF database. • LSF Database: The LSF system works with commercial-class database management systems (DBMS), providing superior performance and data management. The installation, configuration, and maintenance of the LSF databases are discussed in Chapter 6, LSF Database UNIX, on page 41 and Chapter 7, LSF Database Windows NT, on page 49. • LSF Analyzer (xanalyzer): LSF Analyzer provides xanalyzer, a graphical analysis and reporting tool, as an integral part of this application. The xanalyzer application retrieves the stored data and performs statistical analysis to produce reports describing the profiles of the LSF cluster and its objects. Case Studies. The major advantage of LSF Analyzer is that it allows the LSF administrator to answer questions about the performance of the LSF cluster that would typically be very difficult to answer. Finding these answers allows an LSF cluster to be configured and used optimally. Statistics generated by LSF Analyzer show how well a system is working, and trend analysis helps with capacity planning. Examples showing the benefits of LSF Analyzer: • Who are the largest consumers of system resources? • Are these resources being used efficiently? • Are the service commitment levels being met (i.e., what is the cluster's reliability)? • What are the activity trends of a cluster? User Profile: Who are the largest consumers of system resources? LSF An
46. finished jobs, in seconds. Average Wait Time of Done Jobs: Average time from submission to dispatch for successful jobs, in seconds. Average Wait Time of Exited Jobs: Average time from submission to dispatch for failed jobs, in seconds. The following statistics can be displayed over these X-axis values: Time (accumulated), Hosts, Host Types, Host Models, Queues, Users, Projects, and Job Names. Statistics and descriptions: Average CPU Time of Done & Exited Jobs: Average time the CPU spent running each job, in user mode and in kernel mode, in seconds. Average CPU Time of Done Jobs: Average time the CPU spent running each successful job, in user mode and in kernel mode, in seconds. Average CPU Time of Exited Jobs: Average time the CPU spent running each failed job, in user mode and in kernel mode, in seconds. Total CPU Time of Done & Exited Jobs: Total CPU time spent running all jobs, in user mode and in kernel mode, in seconds. Total CPU Time of Done Jobs: Total CPU time spent running successful jobs, in user mode and in kernel mode, in seconds. Total CPU Time of Exited Jobs: Total CPU time spent running failed jobs, in user mode and in kernel mode, in seconds. Average Run Time of Done & Exited Jobs: Average time from dispatch to completion for all finished jobs, in seconds. Average Run Time of Done Jobs: Average time from dispatch to completion for successful jobs, in seconds.
47. ful if you want to charge for the actual use of CPU resources by a job (unit: seconds). Computation-intensive jobs will be charged more than other types of jobs. Memory (unit: KB seconds): The maximum resident memory used by the job multiplied by the elapsed time. This is useful if you want to charge for the maximum resident memory used by the job during its run time. Swap (unit: KB seconds): The maximum virtual memory used for the job multiplied by the elapsed time. This is useful if you want to charge for the total size of the core images used by a job during its run time. Chargeback Rates. By default, LSF Analyzer gets chargeback rates from the 1sb charge rate configuration file located in the LSF_MISC directory of the cluster. You can view or change the rates by clicking the Rates button in the New Chargeback Report dialog or Modify Chargeback Report dialog. This basic formula is used to calculate the cost of a particular resource: cost = number of units used × resource charge rate. When you set the rates, you must specify the currency. The charge rate can be specified for each separate resource. The unit of measurement associated with each resource cannot be changed. Generating Chargeb
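The chargeback arithmetic described in this excerpt (Memory and Swap are billed as a maximum size multiplied by elapsed time, and each cost is units used times charge rate) can be sketched in Python as follows. This is an illustration under stated assumptions, not LSF Analyzer's implementation: the function names, the sample job figures, and the rates are all hypothetical.

```python
def billable_units(cpu_seconds, max_mem_kb, max_swap_kb, elapsed_seconds):
    """Units used per resource, following the definitions quoted above."""
    return {
        "cpu": cpu_seconds,                      # seconds
        "memory": max_mem_kb * elapsed_seconds,  # KB seconds
        "swap": max_swap_kb * elapsed_seconds,   # KB seconds
    }


def charge(units, rates):
    """Cost per resource: number of units used x resource charge rate."""
    return {name: units[name] * rates[name] for name in units}


# Hypothetical job and hypothetical rates (currency per unit).
units = billable_units(cpu_seconds=120, max_mem_kb=2000,
                       max_swap_kb=4000, elapsed_seconds=300)
rates = {"cpu": 0.5, "memory": 0.001, "swap": 0.0005}
costs = charge(units, rates)
print(units)
print(costs, "total:", sum(costs.values()))
```

Keeping the unit calculation separate from the rate multiplication mirrors the text: the units of measurement are fixed, while the rate for each resource can be changed independently.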
48. gured thresholds. Number of Full Hosts: Hosts which are unable to accept batch jobs because the configured maximum number of batch job slots has been reached. Number of Closed Hosts: Hosts which are unable to accept batch jobs for any of the following reasons: they are running an exclusive job, they have been locked by the LSF administrator, or they have been closed by the LSF administrator or by their dispatch windows. Number of Unavailable Hosts: Hosts which are unable to accept batch jobs because they are down or their LIM or sbatchd is unreachable. X-axis Values. The statistics you display on the Y-axis are measured against values on the X-axis. Statistics can be displayed against the following values: • Time • Hosts • Host Types • Host Models • Queues • Users • Projects • Job Names • Job CPU Time Ranges • Job Memory Usage Ranges • Job Swap Space Usage Ranges. Report Range. You may choose to limit the report range so the report is generated faster and so you can easily analyze data collected over a time period that is meaningful to you. You can specify the time and date of the start and end of the report range. Accumulated statistics are calculated using data writte
49. is Statistics. The statistics calculated by LSF Analyzer are chosen in the Y-axis dialog. A report can display up to seven statistics against suitable values of the X-axis. There are two ways of interpreting Time on the X-axis, depending on the Y-axis statistics: Accumulated statistics use data accumulated over a user-specified time interval and report the total or an average value. Sampled statistics use data that is sampled at a single point in time, and the data sampling is repeated at user-specified time intervals. The statistics are sorted by categories and classes in the Y-axis dialog, which are described in detail in Appendix A, Categories of Statistics, on page 59 and Appendix B, Classes of Statistics, on page 61. These tables show all the statistics available on the Y-axis. They are also listed alphabetically in Appendix C, Statistics, on page 65. The following statistics can be displayed over these X-axis values: Time (accumulated), Hosts, Host Types, Host Models, Queues, Users, Projects, Job Names, Job CPU Time Ranges
50. n to the database between the times and dates specified. By default, the statistics are calculated using all of the data in the database. Sampled statistics are displayed at regular intervals (specified in the Y-axis dialog) between the times and dates specified. By default, these statistics are displayed over the entire range of the data. Creating Reports. When creating a report, you can make changes to all the elements in the New Report dialog until the report displays the information you want. The report is generated when you click OK, and the time it takes depends on the amount of data involved. To create a new report, take the following steps: Step 1: Choose File > New Report. Step 2: Specify the title of the report. Step 3: Click Add to specify the first statistic you want to display on the Y-axis. In the Y-axis dialog, select the category and class on the left side, then specify the statistic and scope on the right side. Click Edit or Delete if you want to modify any of the Y-axis statistics you have selected. You may include up to seven statistics in one report. Step 4: Specify the values of the X-axis. Step 5: Specify the Report Range. By default, the entire range of the database is used. Step 6: Specify the Chart Type (line chart, bar chart, area chart, or table format). Step 7: Click OK. It is easier to create a report from an existing template than it is to create
51. nt for Password. Click Next.
Step 10 Select the "Change the default database to" option and enter the name of the LSF database (e.g. cluster1). Click Next.
Step 11 Follow the prompts and accept all default values to complete the configuration. Click Finish when displayed. Displays the Confirmation dialog.
Step 12 Click Test.
Step 13 After the test completes, click OK.
LSF Setup
Log on as an LSF cluster administrator (not the LSF primary administrator). The following process describes the activities for the LSF cluster administrator to carry out for setting up LSF for use with Microsoft SQL Server 6.5 and LSF Analyzer:
• Set Database Login Passwords on page 53
• Set LSF Database Parameters on page 54
• Update LSF on page 54
Set Database Login Passwords
LSF needs to know the database password for the LSF database user account and for the LSF database guest account, if it was created (see Create LSF Database Accounts on page 50). Use the lsdbpasswd command to set the passwords in LSF (see lsdbpasswd on page 56). This command asks for the password for the specified user ID, encrypts the password, and saves it into a file. For example, type:
C:\> lsdbpasswd lsfadmin
If the LSF database user account has the same name as a Windows NT user account, use this command to input the database password, not the Windows NT password.
Set LSF
52. or services mentioned in this document are identified by the trademarks or service marks of their respective companies or organizations. Printed in Canada.
Revision Information for LSF Analyzer User's Guide
Edition: Description
First: This document describes LSF Analyzer 3.1.
Second: Revised to reflect the changes in LSF Analyzer 3.2.
Contents
[illegible]
Audience
[illegible]
LSF Enterprise Edition
LSF Standard Edition
Related Documents
Online Documentation
Technical Assistance
1 Introduction
Overview of LSF Analyzer
Basic Concepts
[illegible]
User Profile
Host Profile
Cluster Profile
2 Getting Started
Using LSF Analyzer for the First Time
3 Generating Reports
About Reports
Selecting the Data Source
Y-axis Statistics
X-axis Values
Report Range
53. r t time To specify time, use the following syntax: time year month day hour minute. Use 4 digits for the year, if specified. Use 2 digits for month, day, hour, and minute.
lsdbbuildidx database_name
Reindexes the specified LSF database. If LSF Analyzer users have trouble retrieving data from the database, try this command, as there might be a problem with the database index.
lsdbclear database_name
Deletes all the logged data from the data tables in the specified LSF database, leaving an empty LSF database.
CAUTION: There is no way to recover the deleted data.
Note: You should not run this command on a working database such as the online LSF database. If you do, first use the lsdbstatus command and make sure there are no open connections to the database. This will prevent data inconsistency problems.
lsdbdrop database_name
Drops the specified LSF database (deletes the entire LSF database, including all of the logged data).
CAUTION: There is no way to recover a dropped LSF database.
lsdbmove source_database_name destination_database_name
Moves the contents of the source database to the destination database. This command must be followed by the badmin reconfig command.
LSF Data Collection Parameters
Data collection can be tuned by modifying the parameters configured in the lsb.params file. For example, edit LSB_CONFDIR/cluster/configdir/lsb.params. The following parameters
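The digit rules for the time argument can be sketched as a small formatter. This is only an illustration: the function name is hypothetical, and the separator between fields is an assumption, because the separator characters did not survive in the text above. Check the command's man page before relying on it.

```python
def format_lsdb_time(month, day, hour, minute, year=None, sep="/"):
    """Join the time fields in year, month, day, hour, minute order.

    `sep` is an assumption: the real separator characters are not
    recoverable from the source text.
    """
    for name, value, lower, upper in (("month", month, 1, 12),
                                      ("day", day, 1, 31),
                                      ("hour", hour, 0, 23),
                                      ("minute", minute, 0, 59)):
        if not lower <= value <= upper:
            raise ValueError(f"{name} out of range: {value}")
    # Month, day, hour, and minute always use 2 digits.
    fields = [f"{month:02d}", f"{day:02d}", f"{hour:02d}", f"{minute:02d}"]
    if year is not None:
        # The year is optional, but when given it must use 4 digits.
        if not 1000 <= year <= 9999:
            raise ValueError("year must have 4 digits")
        fields.insert(0, f"{year:04d}")
    return sep.join(fields)


with_year = format_lsdb_time(3, 7, 9, 5, year=1998)
without_year = format_lsdb_time(3, 7, 9, 5)
```

Zero-padding each field keeps the argument unambiguous, for example so that a single-digit month is never read as part of the day.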
54. r account. The guest account could also be used by other users who only need to read the contents of the database.
Note: To simplify database setup, give the LSF database user account the same name as your LSF primary administrator user account, and do not create the LSF database guest account. You can specify any password you want, but do not confuse the database password with the password for the Windows NT user account.
Create A New Database
Create a database that will be the new LSF database.
Note: To simplify database setup, give the database the same name as your LSF cluster (e.g. cluster1).
Build LSF Database Schema
Create the tables in the new database according to the LSF database schema. The LSF database schema is specified in a Microsoft SQL Server 6.5 ISQL executable file named createschema.sql, located in the cluster bin directory (e.g. cluster1\bin).
Grant Permissions to LSF Database Accounts
Grant the LSF database user account Select, Delete, and Update permissions for all tables in the new database. If you created an LSF database guest account, grant it Read permission for the new database.
Host Setup
Before you can use LSF Analyzer, you must install the ODBC driver and set up a Data Source Name (DSN) on each batch host in the cluster and on the host used to run the LSF Analyzer (xanalyzer) graphical user interface. Log onto each host and, if necessary, install the ODBC driver before you set up the DSN
55. r of Jobs; Project Related / Number of Jobs
Number of Full Hosts (description on page 23): Host Related / Number of Hosts
Number of Hosts in the Cluster (description on page 23): Host Related / Number of Hosts
Number of Login Users (description on page 21): Resource Related / Load Index of Hosts; Host Related / Load Index of Hosts
Number of OK Hosts (description on page 23): Host Related / Number of Hosts
Num of Pending Jobs (description on page 22): Job Related / Number of Jobs; Host Related / Number of Jobs; Queue Related / Number of Jobs; User Related / Number of Jobs; Project Related / Number of Jobs
Num of Running, Pending & Suspended Jobs (description on page 22): Job Related / Number of Jobs; Host Related / Number of Jobs; Queue Related / Number of Jobs; User Related / Number of Jobs; Project Related / Number of Jobs
Num of Running & Suspended Jobs (description on page 22): Job Related / Number of Jobs; Host Related / Number of Jobs; Queue Related / Number of Jobs; User Related / Number of Jobs; Project Related / Number of Jobs
Number of Unavailable Hosts (description on page 23): Host Related / Number of Hosts
Paging Rate (description on page 21): Resource Related / Load Index of Hosts; Host Related / Load Index of Hosts
Swap Space Utilization (description on page 21): Resource Related / Swap Space Utilization of Hosts; Host Related / Swap Space Utilization of Hosts
Total CPU Time of Done Jobs (description on page 19): Job Related / Job C
56. the LSF database using a separate installation file that is included in the regular LSF distribution. Complete the regular LSF installation using lsfsetup, then log onto the LSF database server and run lsfdbsetup. When you install the LSF database, you are prompted to specify the LSF database directory. By default, this is /usr/local/lsf_db. All the LSF database files are installed under this directory. The database is created automatically and has the same name as the cluster.
Starting the Database
Once the database is installed, you must start it. To start the LSF database, log onto the LSF database server as the LSF primary administrator and type:
lsdbserver start
The lsdbserver command is located in the bin subdirectory of the LSF database directory.
LSF Database Utility Commands
The utilities described in this section help you monitor and limit the size of the LSF database to get better performance from LSF Analyzer. They are used to:
• start up and shut down the LSF database
• create and drop LSF databases
• move or delete existing data in the LSF database
• show the status of the LSF database
• show the size of the LSF database
• rebuild the indices of the LSF database
Environment Configuration
The lsf.conf file and the online documentation for database utility commands should be accessible to the working environment of the LSF primary administrator on the LSF database server. LSF_ENVDIR must be set
57. tion Platform's LSF Enterprise Edition provides a reliable, scalable means for organizations to schedule, analyze, and monitor their distributed workloads across heterogeneous UNIX and Windows NT computing environments. LSF Enterprise Edition includes all the features in LSF Standard Edition (LSF Base and LSF Batch), plus the benefits of LSF Analyzer and LSF MultiCluster.
LSF Standard Edition
The foundation for all LSF products, Platform's Standard Edition consists of two products: LSF Base and LSF Batch. LSF Standard Edition offers users robust load sharing and sophisticated batch scheduling across distributed UNIX and Windows NT computing environments.
Related Documents
The following guides are available from Platform Computing Corporation:
• LSF Installation Guide
• LSF Batch Administrator's Guide
• LSF Batch Administrator's Quick Reference
• LSF Batch User's Guide
• LSF Batch User's Quick Reference
• LSF JobScheduler Administrator's Guide
• LSF JobScheduler User's Guide
• LSF Analyzer User's Guide
• LSF Parallel User's Guide
• LSF Programmer's Guide
Online Documentation
• Man pages (accessed with the man command) for all commands
• Online help available through the Help menu for the xlsbatch, xbmod, xbsub, xbalarms, xbcal, xlsjs, xlsadmin, and xanalyzer applications
Technical Assistance
If you need any technical assistance with LSF, please contact your reseller or Platform Computing's Technical Support Department at
58. ve databases. The license for the acct2db utility expires 30 days after your LSF 3.2 license is generated. To convert data from an existing job log file, take the following steps:
Step 1 Use the LSF primary administrator user account to log onto the host where the job log file (e.g. lsb.acct) is stored.
Step 2 Create a new LSF database using the lsdbcreate command, described in the Command Reference on page 44.
Step 3 Issue the acct2db command and specify the new database as the target database. Each finished job record in the job log file is converted into a database record.
Syntax
acct2db [-h] [-V] [-f lsb.acct_file] [-H database_host] database_name
-h Prints command usage to stderr and exits.
-V Prints the LSF release version to stderr and exits.
-f lsb.acct_file Specifies the job log file on the local host which is to be converted into a database. Default: LSB_SHAREDIR/cluster/logdir/lsb.acct
-H database_host Specifies the remote host where the target database is located. Default: the local host.
database_name Specifies the name of the target LSF database. The target database cannot be the online (active) database.
7 LSF Database (Windows NT)
This chapter is aimed at the LSF administrator running a Windows NT only cluster. If your cluster contains any UNIX hosts, see Chapter 6, LSF Database (UNIX), on page 41
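Step 3 can be scripted by assembling the acct2db command line from the options described above. The sketch below only builds the argument list and does not run anything. It assumes the f and H options are written with a leading hyphen, as in other LSF commands; the helper name, the database name cluster1_archive, the host name dbhost, and the log file path are hypothetical.

```python
import shlex


def acct2db_argv(database_name, acct_file=None, database_host=None):
    """Build an argument list for acct2db without executing it.

    Options left as None are omitted, so the command falls back to its
    documented defaults (default job log file, local database host).
    """
    if not database_name:
        raise ValueError("a target database name is required")
    argv = ["acct2db"]
    if acct_file is not None:
        argv += ["-f", acct_file]      # job log file on the local host
    if database_host is not None:
        argv += ["-H", database_host]  # remote host holding the target database
    argv.append(database_name)         # must not be the online (active) database
    return argv


# Hypothetical values, for illustration only.
argv = acct2db_argv("cluster1_archive",
                    acct_file="/tmp/old/lsb.acct",
                    database_host="dbhost")
command_line = shlex.join(argv)
```

Building the command as a list rather than a single string avoids quoting problems if a path contains spaces, and the list can be passed to `subprocess.run` once the target database has been created.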
59. xponentially averaged CPU run queue length.
1-minute Run Queue Length: The 1-minute exponentially averaged CPU run queue length.
15-minute Run Queue Length: The 15-minute exponentially averaged CPU run queue length.
Paging Rate: The memory paging rate exponentially averaged over the last minute, in pages/second.
Disk I/O Rate: The disk I/O rate exponentially averaged over the last minute, in KB/second.
Number of Login Users: The number of current login users.
Interactive Idle Time: For all logged-in sessions, the amount of time during which the keyboard is not used, in minutes.
Available Memory: Amount of memory available, in MB.
Available Swap Space: Amount of swap space available, in MB.
Available tmp Space: Amount of space available in the temporary directory, in MB.
User-specified external load indices: Any user-defined external dynamic numeric resource.
Usage of Resource Shared among Hosts: Total value of all instances for a dynamic shared resource.
The following statistics can only be displayed over Time (sampled) on the X axis:
Statistic: Description
Num of Running, Pending & Suspended Jobs: Jobs in the system, either running or waiting to run
Num of Running & Suspended Jobs: Jobs which have been
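The phrase "exponentially averaged" can be made concrete with a generic exponentially weighted average. The guide does not give the formula LSF uses, so the decay rule below is an assumption chosen for illustration; the function name, the 5-second sample period, and the sample values are hypothetical.

```python
import math


def exponential_average(samples, sample_period, window):
    """Exponentially weighted average of periodic samples.

    Each new sample is blended with the running average; older samples
    fade with a time constant of `window` seconds. This is a generic
    formula, not necessarily the one LSF applies.
    """
    if not samples:
        raise ValueError("at least one sample is required")
    decay = math.exp(-sample_period / window)
    average = samples[0]
    for sample in samples[1:]:
        average = decay * average + (1.0 - decay) * sample
    return average


# Run queue length jumps from 0 to 4 and stays there for one minute,
# sampled every 5 seconds (hypothetical numbers).
step = [0.0] + [4.0] * 12
one_minute = exponential_average(step, sample_period=5, window=60)
fifteen_minute = exponential_average(step, sample_period=5, window=900)
```

With the same input, the 1-minute average has moved most of the way toward the new load while the 15-minute average has barely changed, which is why the shorter index reacts to bursts and the longer one shows the trend.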
60. yzer directory on UNIX. The templates are organized as shown:
Accounting: Project, Queue, User
Performance: General, Host, Queue
Workload: General, Job_Profile, Job_Type
5 Chargeback Accounting
LSF Analyzer can be used to perform chargeback accounting by generating tabular reports and invoices. Chargeback accounting uses the same statistics as LSF Analyzer reports, but a cost is associated with the use of each resource. You can determine the costs associated with a set of users or projects over a specified time range.
About Chargeback Reports
[Figure: New Chargeback Report dialog]
Chargeback reports contain all of the following elements:
Title: user-specified title displayed on the report (optional)
Format: method of displaying the information (Report or Invoice format)
Who to Charge: who or what will be charged for resource usage (Users or Projects)
Rates: costs associated with the use of each resource
Resources to Charge For: one or more resources to account for in the report
Report Range: the period of time over which the charges are calculated
Reports and Invoices
A chargeback
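The arithmetic behind a chargeback report is a rate multiplied by usage, summed per user or project. The sketch below is a minimal illustration, not LSF Analyzer's implementation; the function name, resource names, rates, and usage figures are all hypothetical.

```python
from decimal import Decimal


def chargeback(usage, rates):
    """Compute charges per user or project.

    usage: {who: {resource: amount used within the report range}}
    rates: {resource: cost per unit}; resources without a rate are not charged.
    """
    invoices = {}
    for who, resources in usage.items():
        lines = {resource: amount * rates[resource]
                 for resource, amount in resources.items()
                 if resource in rates}
        invoices[who] = {"lines": lines, "total": sum(lines.values())}
    return invoices


# Hypothetical rates (currency units per second) and usage totals.
rates = {"cpu_seconds": Decimal("0.02"), "run_seconds": Decimal("0.005")}
usage = {
    "alice": {"cpu_seconds": 1000, "run_seconds": 4000},
    "bob": {"cpu_seconds": 250, "memory_kb": 8192},  # memory has no rate here
}

invoices = chargeback(usage, rates)
```

`Decimal` is used for the rates so that monetary totals are not affected by binary floating-point rounding. Choosing Users or Projects under Who to Charge corresponds to changing the keys of `usage`.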
