"user manual"
Contents
1. [Figure: trace timing diagrams of JPetStore requests entering at struts.action.ActionServlet.doGet(HttpServletRequest, HttpServletResponse) and struts.action.ActionServlet.doPost(HttpServletRequest, HttpServletResponse); x-axes: time (ms). Traced operations include presentation.AccountBean.isAuthenticated, presentation.AccountBean.signon, service.AccountService.getAccount(String, String), service.CatalogService.getCategory(String), service.CatalogService.getProductListByCategory(String), persistence.sqlmapdao.AccountSqlMapDao.getAccount(String, String), persistence.sqlmapdao.CategorySqlMapDao.getCategory(String), and persistence.sqlmapdao.ProductSqlMapDao.getProduct(String).]
2. Figure 5.16: Box-and-whisker plot for response times of operation service.CatalogService.getItem (x-axis: experiment time in minutes). The visualization is trimmed to the upper box of the 29th experiment minute. […] operation persistence.sqlmapdao.AccountSqlMapDao.getAccount. In this case, the 3-parameter log-normal distribution almost fits the entire response time sample, including the length of the tail. 5.3 Summary and Discussion of Results. The description of the processed experiment results shows that the workload intensity impacts response time statistics. The maximum is very sensitive and reaches the highest stretch factors. The mean is more sensitive to varying workload than the median. Upper quartiles are more sensitive to the workload intensity than lower quartiles, yielding increasing interquartile ranges. The response time minimum is largely unaffected by the workload intensity. The data is right-shifted, long-tailed, and right-skewed, i.e., mode < median < mean. No correlation between the workload intensity and the outlier ratio could be observed. Figure […] shows a box-and-whisker plot for an experiment of 30 minutes leng
3. Table 2.1: Definition of mean and variance for discrete and continuous probability distributions.
Mean (discrete): $\mu = E(X) = \sum_i x_i f(x_i)$
Mean (continuous): $\mu = E(X) = \int_{-\infty}^{\infty} x f(x)\,dx$
Variance (discrete): $\sigma^2 = V(X) = \sum_i (x_i - \mu)^2 f(x_i)$
Variance (continuous): $\sigma^2 = V(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x)\,dx$
[Figure 2.10: Graphs of probability density functions for parameterized normal and log-normal distributions; panels: (a) normal distributions, (b) log-normal distributions; axes: x, density.]
Parametric Distribution Families. A number of named probability distributions exist that can be parameterized by one or more values, e.g., to influence their shape or scale. In the following two sections, the normal and log-normal distributions are described, since they are used within this thesis. Other distribution families include the uniform, the exponential, the gamma, and the Weibull distributions. Normal Distribution. The most widely used model for the probability distribution of a random variable is the normal distribution. It approximates the distribution of a continuous random variable $X$ which can be modeled by the sum of many small independent variables. It is parameterized by mean $\mu$ and variance $\sigma^2$. Its distribution function is denoted by $N(\mu, \sigma^2)$ (see Equation 2.7). The symmetric bell shaped probability density
4. […] 0, else (4.3). Given an active traces history $A$, the function $pwi_{A,w}: \mathbb{N} \to \mathbb{R}^{+}$ maps a time to the platform workload intensity at that time using a window size $w$: $pwi_{A,w}(t) = \frac{1}{w} \sum_{i=0}^{w-1} activeTraces_A(t - i)$ (4.4). 4.4.2 Implementation. We implemented the methodology defined in the previous section using the GNU R environment (R Development Core Team 2005). (1) The trace history can be obtained from the Tpmon-monitored data by querying the entries for the application entry points struts.action.ActionServlet.doGet and struts.action.ActionServlet.doPost. (2) The trace history is converted into a two-column matrix representing the event history. It is sorted by the first element of the tuples, which represents the time of the event. The second column contains the summands, which are either +1 or -1. (3) A two-column matrix representing the active traces history is computed by sequentially traversing the event history matrix and accumulating the entries in the second column. Due to performance issues concerning matrix manipulations in R, we implemented this functionality in C and created a dynamic library which can be used from within R. (4) The functionality provided by R to define step functions from two-dimensional matrices is used to define the step function $activeTraces$. (5) We implemented the function $pwi_{A,w}(t)$ which comp
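The implementation steps above can be sketched in a few lines. The excerpt describes an R and C implementation; the Python version below is only an illustration, not the thesis code. It assumes that a trace is given as a hypothetical `(start, end)` pair and counts as active in the half-open interval from start to end, and that the platform workload intensity is the average number of active traces over the last `w` time units. All function names are made up for this sketch.

```python
from bisect import bisect_right


def active_traces_history(traces):
    """Build the active traces history from (start, end) pairs.

    Each trace contributes a +1 event at its start and a -1 event at its
    end; the sorted events are accumulated into (time, active_count) rows.
    """
    events = sorted([(s, +1) for s, _ in traces] + [(e, -1) for _, e in traces])
    history, active = [], 0
    for time, summand in events:
        active += summand
        if history and history[-1][0] == time:
            history[-1] = (time, active)  # collapse events at the same time
        else:
            history.append((time, active))
    return history


def active_traces(history, t):
    """Step function: number of active traces at time t (0 before the first event)."""
    times = [time for time, _ in history]
    idx = bisect_right(times, t) - 1
    return history[idx][1] if idx >= 0 else 0


def pwi(history, t, w):
    """Average number of active traces over the window t, t-1, ..., t-w+1."""
    return sum(active_traces(history, t - i) for i in range(w)) / w


if __name__ == "__main__":
    h = active_traces_history([(0, 5), (2, 4), (3, 8)])
    print(h)
    print(pwi(h, 4, 3))
```

Keeping the history as a sorted list lets the step function be evaluated with a binary search, which mirrors the role of the step function defined from a two-column matrix in the description above.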
5. [Table, rotated in the original and not recoverable from the extraction: response time statistics and request coverage of the iterative monitoring point determination.]
6. persistence.[…] and service.OrderService.getNextId, the maximum response times temporarily decrease considerably after having reached a local maximum. If present, peaks occur for PWIs around 30. Figure 5.7(b) shows the monotonically increasing curve for service.OrderService.insertOrder. The curve for service.CatalogService.getCategory, with a temporarily decreasing maximum, is shown in Figure B.7. 5.2.3.3 Mean. For a PWI up to 0.84 (45 users), the mean response time stretch factor of all operations increases linearly to values between 1.11 and 1.26. As a representative of all operations, Figure B.8(a) shows the curve for service.CatalogService.getItem in this PWI range. For a PWI up to 2.98 (85 users), the stretch factor increases with a slightly higher slope to values between 1.43 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder) and 2.62 (service.OrderService.getNextId). From a PWI of 4.54 (95 users), it increases more considerably to operation-related maximum values: 2.06 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder), values between 10.09 (persistence.sqlmapdao.ItemSqlMapDao.getItem) and 33.59 (service.CatalogService.getCategory), and the highest stretch factors 103.03 (presentation.OrderBean.newOrder), 129.89 (service.OrderService.insertOrder), and 362.94 (service.OrderService.getNextId). As de
7. An anomaly detector has to decide for each element of a non-empty set of executions whether or not it is an anomaly. It knows a set of observations of sufficient size. This set is called the history and is assumed to contain no anomalies. The anomaly detector decides by comparing the set of executions with the history. The error rate relates anomaly detection errors, including false positives (type I errors) and false negatives (type II errors), to the total number of decisions and is used to quantify the quality of an anomaly detector. 6.2 Plain Anomaly Detector. The very basic Plain Anomaly Detector (PAD) classifies an execution $e$ as anomalous if its response time $rt$ exceeds an operation-specific threshold $\tau$. The threshold $\tau$ for an operation is determined by multiplying the mean of the executions for this operation contained in a given history with a tolerance factor. [Figure 6.1: Anomaly detection scenario with constant workload intensity; panels: (a) response times and start times (x-axis: experiment time in seconds; legend: anomaly, normal observation, threshold), (b) error rate in relation to the threshold.] The historical variance could be used to determine a feasibl
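The Plain Anomaly Detector described above reduces to a threshold test, sketched below in Python. This is an illustration under stated assumptions, not the thesis implementation: executions are modelled as hypothetical `(operation, start_time, response_time)` tuples, the history is a plain list of response times for one operation, and the function names are invented for this sketch.

```python
from statistics import mean


def pad_threshold(history_response_times, tolerance_factor):
    """Threshold = mean of the historical response times times a tolerance factor."""
    return mean(history_response_times) * tolerance_factor


def classify(executions, threshold):
    """Return True for each execution whose response time exceeds the threshold."""
    return [rt > threshold for (_operation, _start, rt) in executions]


def error_rate(predicted, actual):
    """Share of wrong decisions (false positives plus false negatives)."""
    errors = sum(p != a for p, a in zip(predicted, actual))
    return errors / len(predicted)


if __name__ == "__main__":
    threshold = pad_threshold([10, 12, 8, 10], tolerance_factor=1.5)
    decisions = classify([("op", 0, 9), ("op", 1, 16), ("op", 2, 15)], threshold)
    print(threshold, decisions)
```

Because the threshold is fixed once from the history, a general increase in response times under higher workload would push normal executions over it, which is the weakness the chapter attributes to this detector.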
8. 7.1 Summary. The foundations required for our work, including performance metrics and scalability, workload characterization and generation for Web-based systems, probability and statistics, as well as anomaly detection, have been presented at the beginning of this thesis. We designed and implemented a workload driver, performed a case study to analyze the relation between workload intensity and response times, and implemented a prototype of a workload-intensity-sensitive anomaly detector. Workload Driver. The development of our approach for modeling and generating realistic workload for enterprise applications has been presented in Chapter 3. Design. Application-specific information required to generate valid user sessions is defined in an application model, which is a hierarchical state machine. In contrast to the similar Extended Finite State Machine presented by Shams et al. (2006), it contains a second layer providing a separation of session-related constraints from technical details. Probabilistic user behavior models are used to define classes of users with Markov chains, similar to the Customer Behavior Model Graphs (CBMGs) by Menascé et al. (1999). Application model and user behavior model are combined into a single probabilistic session model, which the workload driver uses to generate and execute valid user sessions. Moreover, the workload driver supports the definition of a user behavior mix. The number of emulated users within a
9. Figure 5.13: Platform workload intensity, 3rd quartile response times, and outlier ratios for the operations persistence.sqlmapdao.ItemSqlMapDao.getItem and service.OrderService.insertOrder. […] outliers in all experiments between 3.75 % (presentation.OrderBean.newOrder) and 6.11 % (persistence.sqlmapdao.ItemSqlMapDao.getItemListByProduct). We observed that extreme outliers correlate with the considerable local maxima observed for some maximum, mean, and 3rd quartile response times. However, no general correlation with an increasing workload intensity could be determined. Figures 5.13(a) and 5.13(b) show the ratio of extreme outliers correlating with the local maximum of the 3rd quartile for the operation persistence.sqlmapdao.ItemSqlMapDao.getItem. Figures […] and 5.13(c) show that the ratio of extreme outliers does not generally correlate with an increasing maximum. 5.2.4 Distribution Characteristics. By analyzing the kernel-estimated densities of each operation in each experiment run, we identified four types of density shapes under increasing workload intensity. Generally, all distributions are right-skewed with long and heavy tails. The length of the tails and the skewness of the distribution increase as the workload intensity increases. The density of the operation presentation.OrderBean.newOrder is bimodal, i.e., two clusters of considerable size of occurring response times can be identifie
10. [Table, rotated in the original: per-operation response time statistics for presentation, service, and persistence operations; the values are not recoverable from the extraction.]
11. The 3-parameter log-normal distribution fits most unimodal response time samples aside from their tails. These are, in the majority of cases, shorter than those of the respective theoretical samples. Chapter 6: Workload Intensity-sensitive Anomaly Detection. The experiment results presented in Chapter 5 showed that workload intensity has a considerable impact on response time statistics, e.g., the mean. This chapter outlines a novel approach for timing behavior anomaly detection which explicitly considers varying workload. The details are presented elsewhere (Rohr et al. 2007b). The response time mean is used as the underlying statistic, but others may be selected based on the results presented in the previous chapter. Section […] gives a short introduction to basic terms and the underlying approach for anomaly detection in software timing behavior. In Section […] we introduce a basic anomaly detector and demonstrate that it performs badly when workload intensity variations occur. Our approach is presented in Section 6.3. 6.1 Anomaly Detection in Software Timing Behavior. The system model defined in Section […] is used, particularly including the definition of traces and active sessions. Formally, an execution of an operation $o$ is a tuple $(o, st, rt)$, with $st$ denoting the start time and $rt$ denoting the response time as defined in Section […]. Here, an anomaly is considered a response time exceeding a given mean value by o percent in a period 6
12. Universität Oldenburg. Abteilung Software Engineering, Fakultät II: Informatik, Wirtschafts- und Rechtswissenschaften, Department für Informatik. Diploma Thesis: Workload-sensitive Timing Behavior Anomaly Detection in Large Software Systems. André van Hoorn. September 14, 2007. First examiner: Prof. Dr. Wilhelm Hasselbring. Second examiner: MIT Matthias Rohr. Advisor: MIT Matthias Rohr. Abstract. Anomaly detection is used for failure detection and diagnosis in large software systems to reduce repair time, thus increasing availability. A common approach is building a model of a system's normal behavior in terms of monitored parameters and comparing this model with a dynamically generated model of the respective current behavior. Deviations are considered anomalies, indicating failures. Most anomaly detection approaches do not explicitly consider varying workload. Assuming that varying workload leads to varying response times of services provided by internal software components, our hypothesis is as follows: a novel workload-sensitive anomaly detection is realizable, using statistics of workload-dependent service response times as a model of the normal behavior. The goals of this work are divided into three parts. First, an application-generic technique will be developed to define and generate realistic workload based on an analytical workload model notation to be specified. Applying this technique to a sample application in a cas
13. Contents (continued)
A Workload Driver
A.1 Available JMeter Test Elements
A.2 Installing Markov4JMeter
B Case Study
B.1 Installation Instructions for Instrumented JPetStore
B.1.1 Install and Configure Apache Tomcat
B.1.2 Build and Install JPetStore
B.1.3 Monitor JPetStore with Tpmon
B.2 Trace Timing Diagrams
B.3 Iterative Monitoring Point Determination
Acknowledgement
Bibliography
List of Figures
2.3 Efficiency in the ISO 9126 standard
2.6 Hierarchical workload model
2.7 Example Customer Behavior Model Graph
2.9 Workload generation approach
2.11 Description of a box-and-whisker plot
2.14 JMeter GUI
2.18 Aspect weaver weaves aspects and functional parts into single binary
3.1 Use cases of the workload driver
3.2 Class diagram of the workload configuration data model
3.5 Architecture overview of workload driver in UML class diagram no
14. OrderBean.newOrder increases to 8.76 ms. Figure 5.8(c) shows the curve for the operation persistence.sqlmapdao.OrderSqlMapDao.insertOrder. The impact of a further increasing workload intensity on the mode follows a similar pattern for the remaining operations: it increases with a higher slope for PWIs around 40 (165 to 175 users) before rising considerably to maximum values. The stretch factors rise to values between 13.32 (presentation.CatalogBean.viewItem) and […] [Figure: workload intensity vs. 1st quartile of response times for presentation.OrderBean.newOrder, presentation.AccountBean.signon, and service.OrderService.getNextId; axes: platform workload intensity, response time (ms), stretch factor; panels (a) to (c), each for PWI 0.03 to 60.90.] Figure
15. [Figure panels for operation persistence.sqlmapdao.AccountSqlMapDao.getAccount: (a) line plot of users vs. quartiles (1st quartile, median, 3rd quartile; x-axis: experiment time in minutes), (b) scatter plot of response times over users with local regression and median (N = 486), (c) box-and-whisker plot of response times (x-axis: experiment time in minutes), (d) density plot of response times (x-axis: response time in milliseconds; N = 486, bandwidth 0.04385; mean 3.163, median 3.106, approx. mode 3.057, skewness 1.666, kurtosis 3.614), (e) QQ plot of sample data and 3-parameter log-normal distribution.] Figure 5.1: Overview of all pl
16. […] and V. Akula (2003). Towards Workload Characterization of Auction Sites. In Proceedings of the 6th IEEE Workshop on Workload Characterization (WWC-6), Austin, TX, USA, October 27, 2003, pages 12-20. IEEE Press.
D. A. Menascé and V. A. F. Almeida (2000). Scaling for E-Business: Technologies, Models, Performance, and Capacity Planning. New Jersey: Prentice Hall.
D. A. Menascé, V. A. F. Almeida, R. Fonseca, and M. A. Mendes (1999). A Methodology for Workload Characterization of E-Commerce Sites. In Proceedings of the 1st ACM Conference on Electronic Commerce (EC '99), Denver, CO, USA, November 3-5, 1999, pages 119-128. ACM Press.
Mercury Interactive Corporation (2007). Mercury LoadRunner Homepage. www.mercury.com/us/products/performance-center/loadrunner/. Last visited August 31, 2007.
A. Mielke (2006). Elements for response time statistics in ERP transaction systems. Performance Evaluation, 63(7):635-653.
D. C. Montgomery and G. C. Runger (2006). Applied Statistics and Probability for Engineers. New York: John Wiley & Sons, Inc., fourth edition.
D. Mosberger and T. Jin (1998). httperf: A Tool for Measuring Web Server Performance. Technical Report HPL-98-61, Hewlett-Packard Laboratories. http://www.hpl.hp.com/techreports/98/HPL-98-61.html.
J. D. Musa, A. Iannino, and K. Okumoto (1987). Software Reliability: Measurement, Prediction, Application. New York: McGraw-Hill, first edition.
Object W
17. [Figure panels, density plots of response times (x-axis: response time in milliseconds): (f) density in Exp. 2 (N = 2756, bandwidth 0.01253), (g) density in Exp. 12 (N = 11863, bandwidth 0.1509), (h) unimodal density in Exp. 7, (i) unimodal density in Exp. 11 for service.AccountService.getAccount (N = 1284, bandwidth 0.1243; mean 4.135, median 3.785, approx. mode 3.571, skewness 2.542, kurtosis 8.362).] Figure 5.14: Examples for all identified density shapes. The operations persistence.sqlmapdao.ItemSqlMapDao.getItem, persistence.sqlmapdao.ItemSqlMapDao.getItemListByProduct, persistence.sqlmapdao.OrderSqlMapDao.insertOrder, service.CatalogService.getCategory, service.CatalogService.getItem, and service.OrderService.insertOrder have multimodal distributions for a low workload intensity. The distributions turn unimodal as the workload increases. Figures 5.14(e) and […] show the bimodal data of operation service.CatalogService.getItem in Experiment 2. Figure […] shows the related unimodal distribution in Experim
18. request or the response, e.g., a set of content encodings accepted by the client or the content type of the message body. $BODY_{http}$ […] sent by the server. $STATUS_{http}$ is a string indicating the status of the result, e.g., OK or Not Found. We give the ordered tuple as in Figure 2.1 when referencing complete HTTP messages, using left and right braces to denote lists. When referencing specific fields of a message, we use an indexed notation, e.g., resp.body or resp.status to get the body and the status of an HTTP response. Execution Model. Like Menascé and Almeida (2000), we consider a component a modular unit of functionality accessed through defined interfaces. A component provides a set of operations which can be called by other components. Operation calls may be synchronous or asynchronous. When synchronously calling an operation, the caller blocks until the operation has executed. When calling an operation asynchronously, the caller proceeds immediately. We consider a trace a record of synchronous operation calls of an application processing a single user request. Traces are assigned a unique identifier. Active traces are those traces currently processed by the system. Possible starts of a trace are denoted application entry points. The sequence diagram in Figure 2.2 illustrates a sample trace. [Figure 2.2, sample trace.] Figure 2.2: Seq
19. userThinkTimesDistr (s): Client-side Think Times Distribution. Think time distributions should be configurable for each user class to enable think times that vary based on a parameterizable probability distribution family, such as the normal distribution. This requirement extends config.aub.userThinkTimesConst (m). 3.1.5.2 Behavior Mix and Workload Intensity Configuration. config.bmwi.userCountSingle (m): Single User Count. The workload driver must provide an option to configure the number of concurrent users to simulate. config.bmwi.userCountVarying (s): User Count Variation. The workload driver should provide means to define a varying number of concurrent users to be simulated, e.g., specified by mathematical formulae depending on the elapsed experiment time. This requirement extends config.wmic.userCountSingle (m). config.bmwi.duration (m): Duration of Workload Execution. The workload driver must provide an option to configure the duration of the workload execution, e.g., by specifying a time value or by specifying the number of iterations per user. config.bmwi.userClassSingle (m): User Class Assignment. The workload driver must allow assigning a single user class to be used for an entire workload execution. config.bmwi.userClassMix (s): User Class Mix. The workload driver should allow using multiple user classes during a single workload execution. The number of users being associated to every si
20. [Figure panels (a) to (c), each for PWI 0.03 to 60.90.] Figure 5.12: Platform workload intensity vs. response time variance of operations presentation.CatalogBean.viewItem and persistence.sqlmapdao.OrderSqlMapDao.insertOrder as well as skewness of operation service.CatalogService.getItem. Figure 5.12(b) shows the curves for persistence.sqlmapdao.OrderSqlMapDao.insertOrder. Figure 5.8(c) shows the curves for the mean, median, and mode stretch factors for this operation. 5.2.3.9 Skewness. With two exceptions, all skewness values are positive: the operation presentation.CartBean.addItemToCart has the skewness -6.06 for PWI 0.22, increasing to a positive value for PWI 0.84, and service.CatalogService.getItem has a skewness of -0.11 in Experiment 1. Hence, the distributions are generally right-skewed or become right-skewed, respectively. All skewness curves have a shape similar to the one shown for the operation service.CatalogService.getItem in Figure B.12(c). The curves have two peaks, for PWI values 2.22 and 38.81. Apart from the above-mentioned exceptions, the computed skewness values are in a range between 0.11 and 8.55. 5.2.3.10 Outlier Ratio. For all operations, the ratio of normal outliers is between 0 % and 8.5 %, with averages between 2.3 % (presentation.OrderBean.newOrder) and 4.9 % (service.OrderService.getNextId). The ratio of normal outliers shows no co
21. […] $1/6$. Let $E = \{1, 2\}$ be the event that a number smaller than 3 occurs, and let $X$ be the discrete random variable denoting the occurring number within the experiment. The probability for the event $E$ is $P(E) = 1/3 = P(X < 3)$. Probability Distribution. The probability distribution of a random variable $X$ defines the probabilities associated with all possible values of $X$. Its definition depends on whether $X$ is discrete or continuous and is given below. The cumulative distribution function $F: \mathbb{R} \to [0, 1]$ for a discrete or continuous random variable $X$ gives the probability that the outcome of the random variable is less than or equal to $x \in \mathbb{R}$: $F(x) = P(X \le x)$ (2.4). For a discrete random variable $X$ with possible values $x_1, x_2, \ldots, x_n$, the probability distribution is the probability mass function $f: \mathbb{R} \to [0, 1]$ satisfying $f(x_i) = P(X = x_i)$ (2.5). A probability density function $f$ satisfying the properties in Equation 2.6 defines the probability distribution for a continuous random variable $X$: (1) $f(x) \ge 0$, (2) $\int_{-\infty}^{\infty} f(x)\,dx = 1$, (3) $P(a \le X \le b) = \int_a^b f(x)\,dx$ (2.6). The statistics mean, variance, and standard deviation are commonly used to summarize a probability distribution of a random variable $X$. The mean $\mu$ or $E(X)$ is a measure of the middle of the probability distribution. The variance $\sigma^2$ or $V(X)$ and the standard deviation $\sigma = \sqrt{\sigma^2}$ are measures of the variability in the distribution. The definitions are given in Table
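As a small worked example of the definitions above, the following Python sketch computes the cumulative distribution function, mean, and variance for a discrete random variable. It assumes the experiment is the roll of a fair six-sided die, which matches the probabilities 1/6 and P(X < 3) = 1/3 in the text; the variable and function names are chosen for this sketch only.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (assumed experiment).
pmf = {x: Fraction(1, 6) for x in range(1, 7)}


def cdf(x):
    """F(x) = P(X <= x), the sum of the probability mass up to x."""
    return sum(p for value, p in pmf.items() if value <= x)


# Mean and variance of a discrete distribution:
# mu = sum of x_i * f(x_i), sigma^2 = sum of (x_i - mu)^2 * f(x_i).
mu = sum(x * p for x, p in pmf.items())
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())

if __name__ == "__main__":
    print(cdf(2), mu, variance)
```

Exact fractions are used instead of floating point numbers so that the results can be compared directly with the hand calculation.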
22. […] 4.1.3.2, we describe the two models. The transition graphs are visualized in Figure […]. For both profiles, we specified a think time distribution $N(300, 200^2)$, which is a parameterized normal distribution with $\mu = 300$ and $\sigma = 200$, both values given in milliseconds. 4.1.3.1 Browser. This model represents sessions of users mainly browsing among the categories, products, and items of the JPetStore, i.e., the application states View Category, View Product, and View Item. The user starts a session by entering the state Home. With a low probability, a user returns to the state Home or enters the state View Cart. The transition graph of this behavior model is shown in Figure 4.2(a). 4.1.3.2 Buyer. This model represents sessions of users who tend to select an item by first selecting the category and the product. With a high probability, this item is added to the shopping cart. In contrast to the browser behavior described above, this model contains all application states. A user signs on only directly after having added an item to the cart, i.e., a transition from the state Add to Cart to Sign On is taken. Afterwards, the user either completes the purchase by sequentially entering the states View Cart, Purchase, and Sign Off, or quits the session. The transition graph of this behavior model is shown in Figure 4.2(b). 4.1.4 Probabilistic JMeter Test Plan. Based on our application model, we developed a probabilistic Test Plan for JMeter extended
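A user behavior model of this kind can be sketched as a Markov chain that is sampled state by state. The Python example below is only an illustration: the transition probabilities are hypothetical placeholders (the excerpt does not give the values from Figure 4.2), the `Exit` state and the `generate_session` function are invented for the sketch, and negative think time samples are truncated to zero, which is an assumption about how a normal distribution would be used for waiting times.

```python
import random

# Hypothetical browser-like model; probabilities are placeholders, not thesis values.
BROWSER = {
    "Home": [("View Category", 0.9), ("Exit", 0.1)],
    "View Category": [("View Product", 0.7), ("Home", 0.1), ("Exit", 0.2)],
    "View Product": [("View Item", 0.6), ("View Category", 0.3), ("Exit", 0.1)],
    "View Item": [("View Category", 0.5), ("View Cart", 0.1), ("Exit", 0.4)],
    "View Cart": [("View Category", 0.5), ("Exit", 0.5)],
}


def generate_session(model, rng, mu=300.0, sigma=200.0, max_steps=50):
    """Sample one session as a list of (state, think_time_ms) pairs."""
    state, session = "Home", []
    while state != "Exit" and len(session) < max_steps:
        # Think time drawn from N(mu, sigma^2), truncated at 0 ms (assumption).
        think_time = max(0.0, rng.gauss(mu, sigma))
        session.append((state, think_time))
        targets, weights = zip(*model[state])
        state = rng.choices(targets, weights=weights)[0]
    return session


if __name__ == "__main__":
    for state, think_time in generate_session(BROWSER, random.Random(1)):
        print(f"{state}: {think_time:.0f} ms")
```

Passing an explicit `random.Random` instance keeps the sampling reproducible, which is convenient when comparing generated sessions across runs.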
23. [Table, rotated in the original and not recoverable from the extraction; the legible column labels refer to operation, request type, instrumentation, annotation, and iteration.]
Acknowledgement
I'd like to thank
- my parents, who unconditionally supported me throughout my entire life,
- Matthias Rohr, for having provided this interesting topic and for being a great advisor,
- all members of the Software Engineering Group of Prof. Dr. W. Hasselbring, for the enjoyable working atmosphere and for notifying me when it's time for lunch in case I missed it,
- and last but not least Merle, for being an awesome girlfriend.
Declaration
This thesis is my own work and contains no material that has been accepted for the award of any other degree or diploma in any university. To the best of my knowledge and belief, this thesis contains no material previously published by any other person except where due acknowledgment has been made. Oldenburg, September 14, 2007. André van Hoorn
Bibliography
M. K. Agarwal, K. Appleby, M. Gupta, G. Kar, A. Neogi, and A. Sailer (2004). Problem Determination Using Dependency Graphs and Run-Time Behavior Models. In A. Sahai and F. Wu, editors, 15th IFIP/IEEE International Workshop on Distributed Systems: Operations and Management (DSOM 2004), Davis, CA, USA, November 15-17, 2004, volum
24. […]27.198 22 Jun 2007 13:58:00 +0200 GET /jpetstore/shop/viewItem.shtml?itemId=EST-11 200 3954 6AB05EB3025094E198118EFEC3825F77 394540. Figure 4.4: Access log entry of HTTP requests for JPetStore's request type viewProduct. [Figure panels: (a) CPU utilization, (b) free physical memory; each plotted for server and client over time (min).] Figure 4.5: Graphs of monitored resource utilization data from server and client node. The CPU utilization is in the range 0 to 200 %, since both server and client nodes are equipped with a single hyperthreaded CPU with two virtual cores (see Section 4.2.1).
- HTTP request method, URI, query string, and status code,
- Size of response data in bytes,
- Session id, and
- Server-side response time in milliseconds.
4.2.2.3 Resource Monitoring. We implemented a simple tool which periodically collects resource utilization data of the hosts running the client and the server. The data includes memory and CPU utilization. It is used to uncover unwanted client-side performance bottlenecks and to provide arguments for the later discussion concerning the performance of the server. Figure 4.5 shows visualized resource utilization data of client and server monitored during an experiment. 4.2.3 Adjustment of Defaul
25. [Table 4.3, final rows; response times in ms, one column per request type, column headers not recoverable]
[…]     35.00  38.00  53.00  36.00  86.00  19.00  24.00  64.00  18.00
min     0.00  0.00   6.00  4.00  6.00   4.00   6.00  1.00   1.00  1.00   1.00  0.00
median  1.00  1.00   7.00  5.00  6.00   5.00   7.00  2.00   2.00  2.00   7.50  1.00
mean    0.96  1.14   7.18  5.06  6.58   5.38   7.58  2.08   1.96  1.74   6.09  1.16
s       0.49  0.57   0.83  0.79  0.86   0.95   0.91  0.85   1.52  0.88   3.99  0.65
max     2.00  2.00  11.00  9.00  9.00  10.00  11.00  5.00  12.00  5.00  15.00  4.00
Table 4.3: Aggregated server-side end-to-end response time statistics derived from the Tomcat access logs for all 13 request types from the experiments described in Section 4.3.1. The response times were measured with a granularity of 1 ms. Those of the request types newOrderData and newOrderConfirm cannot be distinguished since they have the same HTTP URI. The rows labeled observations/request denote the number of observations for an issued request of the respective type.
2. Instrumentation of Entry Points: Tpmon operates in annotation mode, with only the application entry points being annotated.
3. No Instrumentation: Tpmon is disabled completely, i.e., the aspect weaver is not registered at all.
Response time statistics for each configuration and request type are listed in Table 4.3. Additionally, the number of monitored operation calls is included. The statistics show that a considerable overhead is introduced by Tp
26. 5(b), 5.5(c). With 195 users, the throughput has a value of approximately 57,300 requests per minute (see Figure 5.5(c)). In the following Sections 5.2.3.1 to 5.2.3.10 we describe the observed relation between workload intensity and the statistics listed in Section

We identified the following two behavioral characteristics regarding the impact of increasing workload intensity on the minimum response times of all operations:

1. The operations presentation.OrderBean.newOrder and presentation.CartBean.addItemToCart have the lowest minimum, around 9 µs and 15 µs respectively, considering all experiment runs. Regarding the scale of microseconds, the minimum of these two operations decreases slightly with an increasing workload intensity. For presentation.OrderBean.newOrder it decreases from 11 µs in Experiment 1 to 8 µs in Experiment 25, and from 19 µs to 7 µs for presentation.CartBean.addItemToCart. Figure 5.6(a) shows the minimum curve for the operation presentation.CartBean.addItemToCart. In this case, the stretch factor 0.4 indicates the decreased minimum compared to the value in Experiment 1 and relates to an absolute deviation of 12 µs.

2. The minimum values of the remaining operations enclose averages between 1.5 ms (service.CatalogService.getCategory) and 9.9 ms (service.OrderService.insertOrder). Up to a PWI of 7.63 (105 users) the stretch factors remain
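The stretch factor arithmetic quoted for presentation.CartBean.addItemToCart can be checked with a few lines of Python. This is a minimal sketch that assumes the stretch factor is the plain ratio of an observed statistic to its reference value; the formal definition lives in a section not included in this excerpt, and the function name `stretch_factor` is hypothetical.

```python
def stretch_factor(value, reference):
    """Return value / reference (assumed definition of the stretch factor).

    Results below 1.0 mean the statistic shrank relative to the reference.
    """
    if reference == 0:
        raise ValueError("reference value must be non-zero")
    return value / reference


# Minimum response time of presentation.CartBean.addItemToCart in
# microseconds, as quoted in the text.
reference_min_us = 19  # Experiment 1
observed_min_us = 7    # Experiment 25

factor = stretch_factor(observed_min_us, reference_min_us)
deviation_us = abs(observed_min_us - reference_min_us)
print(f"stretch factor: {factor:.1f}, absolute deviation: {deviation_us} us")
```

Rounded to one decimal place, the ratio matches the stretch factor of 0.4 and the absolute deviation of 12 µs reported in the text.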
27. Controller limits the number of active sessions based on the elapsed experiment minute: in the i-th minute, active sessions are allowed in parallel. Figure 3.15 shows how to use this BeanShell script within the Session Arrival Controller configuration.

[Figure 3.15 screenshot: Session Arrival Controller dialog; under Active Sessions Properties, the field Maximum Number contains a __BeanShell call that sources examples/jpetstore/numSessionsNumMinute.bsh; a Logging option is shown below.]

Figure 3.15: BeanShell scripts can be used with the function __BeanShell. The BeanShell function source includes a script file.

The Session Arrival Controller doesn't create any new threads. It simply blocks threads in a queue in case the number of active sessions would exceed the given maximum number. Hence, the maximum number of active sessions is limited to the number of threads configured in the Thread Group. If an error occurs while evaluating the active sessions formula, the entire test is aborted immediately. Details concerning this error are written to the file jmeter.log.

Chapter 4 Experiment Design

In the case study, the JPetStore sample application (see Section 2.3) is exposed to varying workload in order to obtain related operation response times for the later analysis. The workload is executed by the workload driver JMeter extended by Markov4JMeter, which has been described in Chapter 3. The response times are monitored using the monitoring infrastructure Tpmon (see Section 2.8). This chapter o
28. Java annotation TpmonMonitoringProbe. Additionally, the related class must be specified in the configuration file.

As soon as the execution of an instrumented operation starts, Tpmon stores the current timestamp. The same holds when the method returns. Start time tin and stop time tout as well as the data listed below form the monitoring entry for this invocation. Figure 2.17(b) illustrates how the weaving takes place for an annotated method.

• experimentid: A unique identifier for the current experiment.
• operation: The full operation name consisting of class and method name, e.g., com.ibatis.jpetstore.presentation.CatalogBean.viewCategory.
• sessionid, traceid: The application-level session and trace identifier as described in Sections and

2.9 Related Work

Work directly related to ours covers the characterization and generation of workload for Web-based systems, the analysis of response time in enterprise applications, as well as timing behavior anomaly detection. The results of workload characterization for specific enterprise applications in real use are presented in a number of papers. For example, (2001), (2000), and (2003) analyzed online bookstores, shopping systems, and auction suggest a hierarchical workload model for Web-based systems consisting of a session layer, a functional layer, and an HTTP request layer (see Section 2.3). Many freely available and commercial Web workload genera
29. Markov4JMeter provided function, which randomly selects an item from a comma-separated list of values, are used to vary the identifiers of categories, products, and items to be requested (see Section 4.1.4.1).

4.1.4.4 Response Assertions

Assertions are inserted to detect application errors which are not reflected in HTTP error codes. We check for specific text strings in the server response of some requests in order to make sure that the requests have been processed correctly by the JPetStore. For example, after having signed on, the returned Web page must contain the string "Welcome" as well as a hyperlink labeled "Sign Out". "Thank you, your order has been submitted" must appear after having confirmed the order.

4.2 Configuration

The node configurations, the monitoring infrastructure used, and the adjustment of the software settings are described in Sections 4.2.1 to 4.2.3. The definition of the experiment runs to be executed follows in Section 4.2.4. Detailed installation instructions are given in Appendix B.1.

4.2.1 Node Configuration

For our experiment we used the following three nodes, which are connected through a 100 Mbit switched local area network:

• The application server executes the JPetStore 5 Web application in an Apache Tomcat Servlet Container, version 5.5.23. It runs GNU/Linux 2.6.17.13 and is equipped with an Intel Pentium 4 3.00 GHz hyperthreaded CPU, two virtual
30. a configurable number of users is emulated.

Case Study: In a case study, workload-dependent distributions of service response times shall be obtained from an existing Web-based application. The functionality to measure the response times is considered to be given. First, our developed workload generation approach needs to be applied to this specific application. Moreover, its appropriateness has to be evaluated. A large number of experiments must be executed, exposing the application to varying workload. The obtained response time data has to be processed by a statistical processing and graphics tool such as the GNU R Environment (R Development Core Team, 2005). The results have to be analyzed to derive characteristics of the response time distributions with respect to the varying workload. For example, we are interested in the similarity of the response times when the workload varies. Mathematical probability density functions containing the workload parameter as a third dimension shall be determined, approximating these distributions.

The Java BluePrints group (Sun Microsystems, Inc., 2007) provides the Java Pet Store J2EE reference application, which is intended to be used in our case study. It is the sample application presented in Singh et al. and used in literature, e.g., by (2005), (2005). The Java Pet Store has already been extended by the above-mentioned monitoring functionality. First, the application's stability in terms of long periods of high workload needs
31. [Figure 5.15 panels for operation service.CatalogService.getItem, response time in milliseconds, N = 11863, bandwidth = 0.1509, mean 4.796, median 4.192, approx. mode 3.697, skewness 2.457, kurtosis 6.998: density plot with 3-parameter log-normal distribution model (estimated parameters); (b) kernel-estimated density; (d) QQ plot for normal distribution; (f) QQ plot for 3-parameter log-normal distribution.]

Figure 5.15: Visualization of the goodness-of-fit analysis for operation service.CatalogService.getItem in Experiment 12. The box-and-whisker plot (a) and the rug included in the density plots (b), (c), (e) summarize the monitored sample data.

[Plot title of the following figure: Box and Whisker Plot of Response Times, service.CatalogService.getItem.]
32. and whisker plot of an operation from Experiment 2 with one active session. Finally, the response times of the first ten minutes of Experiment 2 were removed.

[Figure 5.3 panels for presentation.CatalogBean.viewCategory, plotted over experiment time (minutes): (a) scatter plot of response times, (b) box-and-whisker plot of response times.]

Figure 5.3: Scatter plot and box-and-whisker plot showing ramp-up time (Experiment 2).

5.1.4 Considered Statistics

For each experiment run and operation, the following statistics were computed (see Section):

• Minimum, maximum
• Mean, variance, and standard deviation
• Mode, 1st quartile, median, 3rd quartile
• Skewness, and
• Outlier ratio

For the statistics minimum, maximum, mean, mode, and the quartiles, we considered the stretch factor (see Section 2.2) relative to the respective value in Experiment 2, in addition to the values that actually occurred. The mode was approximated by determining the response time value related to the maximum value of the kernel-estimated density.

5.1.5 Parametric Density Estimation

Based on
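The mode approximation described in Section 5.1.4 (take the response time at which the kernel-estimated density is largest) can be sketched in plain Python. The snippet below is only an illustration under stated assumptions: it uses a Gaussian kernel with a fixed bandwidth evaluated on an evenly spaced grid, the names `gaussian_kde` and `approximate_mode` are hypothetical, and the sample values are made up. The original analysis was done in GNU R, whose density estimator may choose kernel and bandwidth differently.

```python
import math


def gaussian_kde(sample, bandwidth):
    """Return a function evaluating a Gaussian kernel density estimate."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(x):
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in sample
        )

    return density


def approximate_mode(sample, bandwidth, grid_points=512):
    """Approximate the mode as the grid point with maximal estimated density."""
    lo, hi = min(sample), max(sample)
    if lo == hi:
        return lo
    density = gaussian_kde(sample, bandwidth)
    step = (hi - lo) / (grid_points - 1)
    grid = [lo + i * step for i in range(grid_points)]
    return max(grid, key=density)


# Made-up, right-skewed response times in milliseconds.
sample_ms = [3.6, 3.7, 3.7, 3.8, 4.2, 5.0, 9.0]
mode_ms = approximate_mode(sample_ms, bandwidth=0.2)
print(f"approximate mode: {mode_ms:.2f} ms")
```

The grid resolution bounds the precision of the result, so a finer grid (or a numerical optimizer) would be needed when the mode has to be located more exactly than the grid step.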
33. behavior and how to generate synthetic workload follow.

[Figure 2.6 labels: Business Level, Application Level, Functional Layer, Protocol Level, HTTP Request Layer, Resource Level.]

Figure 2.6: A hierarchical workload model (Menascé et al., 2000).

Hierarchical Workload Model

Following (2000), workload for Web-based systems can be considered to concern the three layers session layer, functional layer, and HTTP request layer (see Figure 2.6).

Functional Layer: On the functional layer, an EIS provides a number of services (see Section), e.g., a user can browse through product categories, add items to a shopping cart, and order its content. A service call usually requires parameters to be passed, e.g., a product identifier when adding an item to the cart.

HTTP Request Layer: A service call on the functional layer may involve more than one lower-level HTTP communication (see Section 2.1) on the HTTP request layer. For example, for creating a user account it might be necessary to send and confirm a number of HTML forms, each requiring an HTTP request. As defined in Section, the time interval elapsed between the completion of a server response related to the last request and the invocation of the next one is denoted as the think time.

Session Layer: A series of consecutive and related requests to the system issued by the same user is called a session (Menascé et al., 1999). A session starts with the first request a
34. cores, and 1 GiB physical memory.

• The client acts as the workload-generating node, executing JMeter version 2.2 extended by Markov4JMeter. Its equipment and configuration are identical to those of the application server node.

• The database server runs the MySQL relational database management system (DBMS) and provides the databases used by JPetStore and by Tpmon. It runs GNU/Linux 2.6.15 and is equipped with 4 Intel Xeon 3.00 GHz CPUs and 2 GiB physical memory.

4.2.2 Monitoring Infrastructure

Three types of monitoring are used to obtain performance and resource utilization data during the experiment runs. They are outlined in the following sections.

4.2.2.1 Tpmon

Tpmon (see Section) is the main monitoring tool, used in order to derive the application-level statistics regarding performance and control flow of instrumented JPetStore operations as well as those regarding traces and user sessions. We developed a script to import the data written to the filesystem into the database. Hence, the Tpmon monitored data can be considered present in the database regardless of where it was originally written to.

4.2.2.2 Tomcat Access Logging

In addition to Tpmon, we configured the file logging functionality of the Apache Tomcat server to monitor the parameters listed below relating to incoming HTTP requests. Figure 4.4 shows a sample entry.

• Remote IP address
• Date and time

134.106.
35. determined from log data containing start and stop times of traces.

1. A trace history $H \subseteq \mathbb{N}^2$ contains tuples of trace start and stop times. $H$ can contain duplicates.

2. An event history $E \subseteq \mathbb{N} \times \{-1, 1\}$ is derived from the trace history. For each $(t_{in}, t_{out}) \in H$, $E$ contains the tuples $(t_{in}, 1)$ and $(t_{out}, -1)$. $E$ can contain duplicates.

$$E = \{(t_{in}, 1), (t_{out}, -1) \in \mathbb{N} \times \{-1, 1\} \mid (t_{in}, t_{out}) \in H\} \quad (4.1)$$

[Figure 4.7 panels, plotted over experiment time (minutes): (a) active traces; (b) platform workload intensity with window size 61 ms and step size 30 ms, annotated with max 1.541, mean 0.3213, median 0.2787, min 0.]

Figure 4.7: Graphs visualizing active traces history and platform workload intensity. (a) Graph visualizing the active traces history. (b) Graph visualizing the platform workload intensity with given window and step size.

3. The event history is used to define an active traces history $A \subseteq \mathbb{N}^2$. Each element $(t, k) \in A$ states that $k$ traces were active at time $t$.

$$A = \Big\{(t, k) \in \mathbb{N}^2 \;\Big|\; \exists a \in \{-1, 1\}: (t, a) \in E \,\wedge\, k = \sum_{(t', b) \in E,\; t' \le t} b \Big\} \quad (4.2)$$

4. In order to state the number of active traces for times between events contained in an active traces history $A$, we define the step function $activeTraces: \mathbb{N} \to \mathbb{N}$ with $activeTraces(t) = k$, where $(\hat{t}, k) \in A$ and $\hat{t} = \max\{t' \in \mathbb{N} \mid t' \le t \wedge \exists k': (t', k') \in A\}$
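The four steps above (trace history, event history, active traces history, step function) can be sketched as follows. This is an illustrative reading of the definitions, not the thesis implementation: it assumes that events with a timestamp less than or equal to t count towards the number of active traces at t, that the step function is 0 before the first event, and all function names are hypothetical.

```python
import bisect


def event_history(trace_history):
    """Derive (time, +1/-1) events from (t_in, t_out) tuples; duplicates kept."""
    events = []
    for t_in, t_out in trace_history:
        events.append((t_in, 1))
        events.append((t_out, -1))
    return sorted(events)


def active_traces_history(events):
    """Map each event time t to the sum of all deltas with time <= t."""
    history = {}
    active = 0
    for t, delta in events:  # events are sorted by time
        active += delta
        history[t] = active  # the last write per t includes every event at t
    return sorted(history.items())


def active_traces(history, t):
    """Step function: value of the latest history entry with time <= t."""
    times = [ti for ti, _ in history]
    i = bisect.bisect_right(times, t)
    return history[i - 1][1] if i else 0


# Made-up trace start/stop times in milliseconds.
H = [(0, 10), (5, 15), (5, 7)]
A = active_traces_history(event_history(H))
print(A)
print(active_traces(A, 6), active_traces(A, 12))
```

Keeping the history as a sorted list of (time, count) pairs lets the step function use a binary search, which matters when it is evaluated for every window position of the platform workload intensity.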
36. example state machines from the protocol layer. Details on both layers of the application model are described in the following two Sections 4.1.2.1 and 4.1.2.2.

4.1.2.1 Session Layer

The application transitions were defined based on the hyperlinks present on the Web pages of the JPetStore. For example, by entering the application state Home, the server would return the JPetStore index page, which is shown in Figure 2.16(a). This page provides hyperlinks to the product categories, to the shopping cart, and to the index page itself, and allows the user to sign in or sign off. This results in transitions to the respective application states.

The variables signedOn and itemInCart are used to store additional state information while a user session is simulated. A user can only sign on or sign off if the value of the variable signedOn is false or true, respectively. The variable itemInCart is set when an item is added to the shopping cart. A transition to the state Purchase can only be selected when a user has signed on and has at least one item in the shopping cart.

Figure shows the session layer of our application model, which contains nine application states, each of which relates to one of the JPetStore services. If transitions without guards and actions exist in both directions between two states (e.g., this is the case for Home and View Cart), we combined them into a bidirectional transition with arrows at both ends. A junction connector is used
37.     exit 1; fi
    if ! callTpmonCtrlServlet "action=enable"; then echo "ERROR"; exit 1; fi
    if ! startSysmonitoring; then echo "WARNING"; exit 1; fi

    ## 4. Main Execution Phase
    if ! jmeter-devel.sh -n -t ${MAINTESTPLAN} -Jerrorlog=${ERRORLOG} \
         -Jassertionerrors=${ASSERTIONSERRORLOG} -JmaxUserId=$((USERNUM - 1)); then
        echo "ERROR"; exit 1
    fi
    if grep ERROR ${JMETERLOG}; then echo "ERROR"; exit 1; fi
    if test -s ${ASSERTIONSERRORLOG}; then echo "ERROR"; fi
    if test -s ${ERRORLOG}; then echo "ERROR"; exit 1; fi

    ## 5. Clean-up Phase
    if ! stopSysmonitoring; then echo "WARNING"; fi
    if ! (callTpmonCtrlServlet "action=disable" \
          && callTpmonCtrlServlet "action=incExperimentId"); then
        echo "ERROR"; exit 1
    fi
    if ! mvAccesslog; then echo "WARNING"; fi

Figure 4.8: Sketch of experiment execution script.

Table        r/w   Description
account      r/w   User account data, e.g., login name, real name, and address
orders       r/w   Order data, e.g., user data, shipping and billing address
bannerdata   r     Name and location of product category banners
orderstatus  r/w   Status of each order
category     r     Product category names and descriptions
product      r     Product data, e.g., name and description
inventory    r/w   Quantity of each item in stock
profile      r/w   User profile data, e.g., pref
38. function is illustrated in Figure 2 using three different pairs of parameter values. $N(0, 1)$ is the standard normal distribution.

$$N(x; \mu, \sigma^2) = P(X \le x) \quad (2.7)$$

Log-normal Distribution

A positive random variable $X$ is said to be log-normally distributed with two parameters $\mu$ and $\sigma$ if $Y = \ln X$ is normally distributed with mean $\mu$ and variance $\sigma^2$. A variable might be modeled as log-normal if it can be thought of as the multiplicative product of many small independent factors. The 2-parameter log-normal distribution is denoted by $\Lambda(\mu, \sigma^2)$ (see Equation 2.8). Two log-normal density functions are illustrated in Figure 2.10(b). In contrast to a normal distribution, a log-normal distribution is asymmetric. It is right-skewed and long-tailed.

$$\Lambda(x; \mu, \sigma^2) = P(X \le x) \quad (2.8)$$

[Figure 2.11 labels: 1st quartile, median, 3rd quartile; the lower whisker extends to the smallest data point within 1.5 interquartile ranges (IQR) from the first quartile, the upper whisker to the largest data point within 1.5 IQR from the third quartile; points farther out are labeled normal outliers and extreme outliers.]

Figure 2.11: Description of a box-and-whisker plot (Montgomery and Runger, 2006).

If a random variable $X$ can only take values exceeding a fixed value $\tau$, the 3-parameter log-normal distribution can be used. $X$ is said to be log-normally distributed with the three par
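As a small numerical companion to the log-normal definitions, the following sketch evaluates the density and the cumulative distribution function of a 3-parameter log-normal distribution. Because the definition is cut off in the text, the code assumes the standard form in which ln(X - τ) is normally distributed with mean µ and standard deviation σ; the function names and the parameter values in the example are illustrative only. Note that the functions take the standard deviation σ, not the variance σ².

```python
import math


def lognormal3_pdf(x, mu, sigma, tau=0.0):
    """Density of X, assuming ln(X - tau) ~ Normal(mu, sigma^2)."""
    if x <= tau:
        return 0.0
    y = x - tau
    z = (math.log(y) - mu) / sigma
    return math.exp(-0.5 * z * z) / (y * sigma * math.sqrt(2.0 * math.pi))


def lognormal3_cdf(x, mu, sigma, tau=0.0):
    """P(X <= x) under the same assumption; tau = 0 gives the 2-parameter case."""
    if x <= tau:
        return 0.0
    z = (math.log(x - tau) - mu) / sigma
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


# Illustrative parameters: responses can never be faster than tau = 3.0 ms.
mu, sigma, tau = 0.3, 0.5, 3.0
median = tau + math.exp(mu)  # the median of the shifted distribution
print(lognormal3_cdf(median, mu, sigma, tau))
print(lognormal3_pdf(4.0, mu, sigma, tau))
```

The shift parameter τ is what allows the model to respect a lower bound on response times while keeping the right-skewed, long-tailed shape.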
39. [Rotated table of per-operation response time statistics for persistence.sqlmapdao and domain operations; the cell values are not recoverable from the extracted text.]
40. largely constant below 1.0, with the lowest value of 0.92 for the operation service.CatalogService.getItemListByProduct. The minimum increases moderately with a higher workload intensity. The operations presentation.AccountBean.signon, presentation.CartBean.addItemToCart, service.OrderService.getNextId, and service.OrderService.insertOrder reach stretch factors of 2.10, 2.71, 3.09, and 6.26 in the last experiment run (PWI 60.9). The remaining operations average a value of 1.2. Figures 5.6(b) and 5.6(c) show two minimum curves for the operation service.OrderService.getNextId, scaled to the range of 5 to 125 users (PWI 0.03 to 18.85) and 5 to 195 users (PWI 0.03 to 60.90).

5.2.3.2 Maximum

For a PWI up to 0.84 (45 users), the maximum stretch factors of all operations increase linearly to values between 1.53 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder) and 2.72 (service.CatalogService.getCategory). Up to a PWI of 2.22 (75 users), the maximum stretch factors increase to values between 2.19 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder) and 18.46 (service.OrderService.getNextId). Figure 5.7(a) shows the maximum curve for the operation service.CatalogService.getCategory with PWIs up to 2.22. This operation is the only one for which the slope doesn't increase considerably for PWIs between 1.08 and 2.22. For PWIs between 2.98 (85 users) and 60.9 (195 users), the maximum in
41. model of a system's normal behavior in terms of a set of monitored parameters and comparing this model with a dynamically generated model of the respective current behavior in order to uncover deviations (2005). The data being monitored can be obtained from different levels, e.g., network, hardware, or application level. Typically, the model of the normal behavior is created based on data monitored in a learning phase. Current system behavior is monitored and compared with the learned model in the monitoring phase, i.e., while the system is in productive use. If an adaptive approach is used, the normal behavior is updated with new data in the monitoring phase.

Thus, anomaly detection contributes to failure detection in that anomalies are assumed to be indicative of failures. As soon as failures are detected, techniques can be used to localize the root cause. In the following section we present a selection of existing approaches covering anomaly detection in software systems. An approach by Agarwal et al. (2004) is presented in Section

Examples

Chen et al. (2002) present an approach for detecting anomalies in component-based software systems and isolating their root cause. By having instrumented the middleware underlying the application to be monitored, the set of components used to satisfy a user request is captured. Internal and external failures such as failing assertions or failing user requests are detected. In a data clusteri
42. of a probabilistic session model. A Markov State is added for each application state on the session layer of our application model as described in Section. The transitions among them are configured accordingly. The configuration dialog of the Markov State relating to the application state View Cart is shown in Figure 4.3(b). Underneath each Markov State, HTTP Request Sampler Test Elements are placed and configured according to the protocol layer of the application model. Identifiers for categories, products, and items are randomly chosen using a dedicated Markov4JMeter function before the respective request is issued.

We exported two behavior file templates using the configuration dialog of the Markov Session Controller and filled in the probabilities as defined in the behavior models in Section 4.1.3. These files were added to the behavior mix in the configuration dialog. The session arrival formula is configured to be read from a file. A Random Timer Test Element realizes the think time N(300, 200) as defined in Section 4.1.3.

4.1.4.3 Variation of Request Parameters

The counter and the constants mentioned in Section 4.1.4.1 were added to the Test Plan to vary some HTTP parameter values passed with requests.

• The counter value, which is automatically incremented when the simulation of a new user starts, is used within the login credentials. Of course, the respective user accounts need to be created before a test starts.
• The constants and a
43. or having a low correlated partner. The correlation for the highly correlated system variables and its input is recalculated during operation in order to detect deviations as indicators of failing behavior. The low correlated variables are monitored with respect to a statistic capturing the activity of variables. Again, a threshold is used as an anomaly indicator.

[Figure 2.14 screenshot: the Test Plan tree contains a Thread Group with HTTP Cookie Manager, HTTP Request Defaults, the elements Index, Sign On, View Category (with the request viewCategory.shtml), Sign Off, and View Results Tree; the HTTP Request dialog shows the method GET, the path viewCategory.shtml, and the request parameter categoryId = REPTILES.]

Figure 2.14: JMeter GUI. The hierarchical Test Plan is shown in the left-hand side of the window. The right-hand side contains an HTTP Request Sampler configuration dialog.

2.6 Apache JMeter

Apache JMeter (Apache Software Foundation, 2007b) is a Java-implemented workload generator mainly used for testing Web applications in terms of functionality and perfor
44. runs.

2. Warm-up Execution Phase: The script disables Tpmon and invokes JMeter in non-GUI mode with a 2-minute warm-up Test Plan executing 30 concurrent sessions.

3. Main Preparation Phase: The script reinitializes those JPetStore tables which are marked as r/w in Table. A configured number of JPetStore users is created, which is later used by the main Test Plan (see Section 4.1.4). The resource utilization monitoring on client and server node is started. Tpmon is enabled and its experiment identifier is incremented.

4. Main Execution Phase: The script invokes JMeter in non-GUI mode with the probabilistic Test Plan which has been described in Section 4.1.4.

    #!/bin/bash
    ## ... variable initializations
    ## and function declarations

    ## 1. Initial Preparation Phase
    if [ ! -z "$(ls *.log)" ]; then rm *.log; fi
    if ! (callTpmonCtrlServlet "action=disable" \
          && callTpmonCtrlServlet "action=setDebug&debug=off" \
          && callTpmonCtrlServlet "action=incExperimentId"); then
        echo "ERROR"; exit 1
    fi

    ## 2. Warm-up Execution Phase
    if ! jmeter-devel.sh -n -t ${WARMUPTESTPLAN} -Jerrorlog=${ERRORLOG}; then
        echo "ERROR"; exit 1
    fi
    if test -s ${ERRORLOG}; then echo "ERROR"; exit 1; fi

    ## 3. Main Preparation Phase
    if ! resetDB; then echo "ERROR"; exit 1; fi
    if ! createUsers; then echo "ERROR"
45. stores an application and user behavior configuration for the AUT (the required configuration parameters are defined in Section 3.1.5.1).

uc workloadmix define m: Configure Behavior Mix and Workload Intensity
Actors: Tester
Pre-condition: The application and user behavior configuration to be used is accessible.
Post-condition: A behavior mix and workload intensity configuration which can be executed by the workload driver has been defined.
Description: The user defines and stores the behavior mix and workload intensity configuration (the required configuration parameters are defined in Section 3.1.5.2).

uc workload execute m: Execute Configured Workload
Actors: Tester, AUT
Pre-condition: The workload to be executed has been configured and is accessible to the workload driver (see use cases uc aum define m and uc workloadmix define m).
Post-condition: The workload configuration has been executed by the workload driver.
Description: The user invokes the workload driver to generate the configured workload (see Section 3.1.5) for the AUT.

3.1.5 Workload Configuration

This section contains the requirements for the workload driver configuration. As mentioned above, this configuration is divided into an application-specific application and user behavior configurat
46. systems using a Markov chain. A CBMG is a pair (P, Z) of n x n matrices containing n nodes (states) representing the available requests a user can issue through the system interface. The matrix P = [p_ij] represents the transition probabilities between the states, and Z = [z_ij] represents the average server-side think time. A single CBMG represents the behavior of a class of users, e.g., heavy buyers, in terms of the requests issued within a session. Menascé et al. present an algorithm for obtaining a desired number of CBMGs from Web log files by means of filtering and clustering techniques. A state transition graph for an example CBMG, including its transition probabilities, is shown in Figure

[Figure 2.8 labels: transitions Sign-in, Browse, Add, Delete, Checkout, Purchase, and Exit, annotated with guards (P) and actions (A) on the variables signed_on and items_in_cart, e.g., Purchase guarded by items_in_cart > 0 and signed_on = True.]

Figure 2.8: Extended Finite State Machine for an online shopping store following Shams

Extended Finite State Machines

Shams et al. (2006) state that CBMGs are inappropriate for modeling valid user sessions. CBMGs are supposed to allow invalid sequences of requests inter-request dependencies and do not provide mean
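To make the (P, Z) pair of a CBMG concrete, the following sketch walks through one simulated session: the next request is drawn from the row of P belonging to the current state, and the matching entry of Z is recorded as the think time for that transition. The three states, all matrix values, and the function name `simulate_session` are made up for illustration and are not taken from Menascé et al. or from the case study.

```python
import random


def simulate_session(states, P, Z, start, exit_state, rng, max_steps=1000):
    """Return a list of (from_state, to_state, think_time) transitions."""
    current = states.index(start)
    exit_index = states.index(exit_state)
    session = []
    for _ in range(max_steps):
        if current == exit_index:
            break
        # Draw the next state according to the transition probabilities.
        nxt = rng.choices(range(len(states)), weights=P[current])[0]
        session.append((states[current], states[nxt], Z[current][nxt]))
        current = nxt
    return session


# Hypothetical three-state model; each row of P sums to 1.
states = ["Home", "Browse", "Exit"]
P = [
    [0.0, 0.7, 0.3],
    [0.2, 0.5, 0.3],
    [0.0, 0.0, 1.0],
]
Z = [  # average think times in milliseconds for each transition
    [0, 300, 0],
    [400, 250, 0],
    [0, 0, 0],
]

session = simulate_session(states, P, Z, "Home", "Exit", random.Random(42))
for step in session:
    print(step)
```

Because every transition depends only on the current state, a walk like this cannot enforce conditions such as "purchase only after sign-in", which is the limitation that the extended finite state machines discussed next address.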
47. ties dbConnectionAddress and dbTableName must be set appropriately. Otherwise, the property filenamePrefix (the directory and filename prefix of the file to be used by Tpmon) can be specified. The property setinitialExperimentIdBasedOnLastId, stating whether the experiment identifier is to be incremented automatically on each startup, is only evaluated in database mode.

B.1.3.3 Create aop.xml

The file src/META-INF/aop.xml in the source tree of Tpmon is used as the AspectJ configuration file. It should be created by copying the example configuration given in aop.xml.example. In order to monitor the JPetStore and the entry points to the application, the following entries must be added:

    <include within="com.ibatis.jpetstore..*"/>
    <include within="org.apache.struts.action.ActionServlet"/>

To enable full instrumentation, the aspect TpmonMonitorFullInstServlet must be activated according to the element given in the following listing. The aspect TpmonMonitorAnnotationServlet is used for instrumentation by annotations.

    <aspect name="tpmon.aspects.TpmonMonitorFullInstServlet"/>

B.1.3.4 Build and Install Tpmon

Tpmon is built by calling ant from within the top-level directory of the sources. The Java binary tpmonLTW.jar must be copied or, preferably, linked to the directory common/lib of the Tomcat installation together with the MySQL driver. The following line
48. to be evaluated. Alternative applications which may be used in case our Pet Store stability evaluation fails are the TPC-W online bookstore (Transaction Processing Performance Council) and RUBiS (ObjectWeb Consortium, 2005), which models an online auction site. Both applications are available in J2EE implementation variants and are also commonly used in literature, e.g., by (2002), (2002), (2004), and Pugh and Spacco (2003).

Workload-sensitive Anomaly Detection Prototype: Based on the relations between workload intensity and response times derived in the analysis part of the case study, a prototype of a workload-sensitive anomaly detection shall be implemented. A degree of anomaly for service call response times shall be computed, and the prototype shall provide simple visualization functionality.

1.3 Document Structure

This document is structured as follows:

• Chapter 2 contains the foundations of our work. Starting with a description of the considered system model, an introduction to performance metrics and scalability, workload characterization and generation for Web-based systems, probability and statistics, as well as anomaly detection follows. Moreover, we describe the software we used and present related work.

• Chapter 3 deals with the probabilistic workload driver. Based on a requirements definition, the design of the workload driver is described. The implementation, called Markov4JMeter, is outli
49. which Tpmon stored the data in the filesystem. The analysis methodology is described in Section 5.1. Section 5.2 contains the detailed description of results. Summary and discussion follow in Section

5.1 Methodology

We used the GNU R Environment (R Development Core Team, 2005) for the analysis described in this section. R provides means to directly access the monitoring data by executing SQL queries.

5.1.1 Transformations of Raw Data

As described in Section, each finished operation execution yields an entry in the database consisting of the operation name, the start and stop time, as well as the trace and session identifiers. Start and stop times are given in milliseconds elapsed since the start of the respective experiment run.

Response Time and Throughput: The response time of an execution is computed by subtracting the start time from the stop time. By considering the entries belonging to the application entry points, the achieved throughput (see Section 2.2) of the application in terms of handled requests per minute is computed.

Platform Workload Intensity and Active Sessions: The platform workload intensity (PWI) is computed using the implementation described in Section 4.4.2. A similar implementation is used to compute the number of active sessions over time for each experiment run. For this purpose, the start time of the first and the stop time of the last call to an application entry point are considered.
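The two raw-data transformations for response time and throughput can be sketched as below. The snippet is an illustration rather than the R code used in the thesis: monitoring entries are reduced to hypothetical (operation, start, stop) tuples with made-up millisecond timestamps, and a request is assumed to count towards the experiment minute in which it finished.

```python
from collections import Counter

MS_PER_MINUTE = 60_000


def response_times(entries):
    """Response time per entry: stop time minus start time (milliseconds)."""
    return [(op, stop - start) for op, start, stop in entries]


def throughput_per_minute(entries, entry_points):
    """Count finished entry-point calls per experiment minute (0-based)."""
    counts = Counter()
    for op, _start, stop in entries:
        if op in entry_points:
            counts[stop // MS_PER_MINUTE] += 1
    return dict(counts)


# Made-up monitoring entries; times are ms since the start of the run.
entries = [
    ("ActionServlet.doGet", 1_000, 1_012),
    ("CatalogService.getCategory", 1_002, 1_005),
    ("ActionServlet.doGet", 30_000, 30_020),
    ("ActionServlet.doPost", 61_000, 61_050),
]
entry_points = {"ActionServlet.doGet", "ActionServlet.doPost"}

print(response_times(entries))
print(throughput_per_minute(entries, entry_points))
```

Restricting the throughput count to entry points avoids counting the nested service and persistence calls that belong to the same request.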
50. An HTML Web interface provides access to the application (see Figure 2.16(a)). The catalog is hierarchically structured into categories, e.g. Dogs and Cats. Categories contain products such as a Bulldog and a Dalmation. Products contain the actual items, e.g. Male Adult Bulldog and Spotted Adult Female Dalmation, which can be added to the virtual shopping cart. The content of the cart can later be ordered after having signed on to the application and having provided the required personal data, such as the shipping address and the credit card number.

Architecture

The architecture is made up of three layers, i.e. the presentation layer, the service layer, and the persistence layer. Clients communicate with the application through the HTML Web interface using the HTTP request/response model (see Section 2.1). A database holds the application data. The architecture is illustrated in Figure 2.16(b).

The presentation layer is responsible for providing the user interface, which gives a view of the internal data model and its provided services. The layer is realized using the Apache Struts framework (Apache Software Foundation, 2007a), which includes the so-called ActionServlet constituting the application entry point (see Section 2.1). The service layer maintains the internal data model and actually performs the requested services. Data is accessed and modified through the persistence layer. F
51. 200 (see Section 4.1.3). Depending on the respective number of active sessions simulated in an experiment run, we set two additional parameters: the length of the ramp-up period and the experiment duration. The values of these three parameters are described below. The varying values for all experiment configurations are listed in Table 4.2.

No.                  1   2   3   4   5   6   7   8   9  10  11  12  13
Active Sessions      1   5  10  15  20  25  30  35  40  45  55  65  75
Ramp-up (seconds)    0   0  30  60  60  60  60  60  60  60  90  90  90
Duration (minutes)  20  20  15  15  15  12  11  10   9   8   7   7   7

No.                 14  15  16  17  18  19  20  21  22  23  24  25
Active Sessions     85  95 105 115 125 135 145 155 165 175 185 195
Ramp-up (seconds)  120 120 120 120 180 180 180 180 180 180 180 200
Duration (minutes)   7   7   8   8   9   9   9   9   9   9   9   9

Table 4.2: Overview of varying parameters for all experiments

• Active Sessions: Starting with one simulated active session in the first experiment run, the number is incremented by 5 in each subsequent experiment until 45 active sessions are simulated. Afterwards, this number is incremented by 10 up to 195 active sessions in the last run.

• Ramp-Up Period: During the ramp-up period, the number of threads is linearly increased to the maximum number in order to warm u
52. 3rd International Workshop on Software Quality Assurance (SOQUA '06), Portland, OR, USA, November 6, 2006, pages 54–61. ACM Press.

B. W. Silverman (1986). Kernel Density Estimation Technique for Statistics and Data Analysis. In Monographs on Statistics and Applied Probability, volume 26. London: Chapman and Hall.

I. Singh, B. Stearns, and M. Johnson (2002). Designing Enterprise Applications with the J2EE Platform. Amsterdam: Addison-Wesley, 2nd edition.

C. U. Smith and L. G. Williams (2001). Performance Solutions: A Practical Guide to Creating Responsive, Scalable Software. Amsterdam: Addison-Wesley, 1st edition.

Sun Microsystems, Inc. (2006). Java Pet Store Reference Application Homepage. https://blueprints.dev.java.net/petstore (last visited August 31, 2007).

Sun Microsystems, Inc. (2004). JVM Tool Interface. http://java.sun.com/j2se/1.5.0/docs/guide/jvmti (last visited August 31, 2007).

Sun Microsystems, Inc. (2007). Java BluePrints Homepage. http://java.sun.com (last visited August 31, 2007).

Transaction Processing Performance Council (2004). TPC-W Homepage. http://tpc.org/tpcw (last visited August 31, 2007).

A. van Hoorn (2007). Markov4JMeter Homepage. http://markov4jmeter (last visited August 31, 2007).
53. 5.9: Platform workload intensity vs. 1st quartiles of response times and stretch factors for the operations presentation.OrderBean.newOrder, presentation.AccountBean.signon, and service.OrderService.getNextId.

30.90 service.CatalogService.getCategory. Figures 5.8(b) and 5.10(b) show the curves for service.CatalogService.getItem and presentation.CatalogBean.viewItem, which increase considerably starting with the PWIs 44.88 and 38.81.

5.2.3.5 1st Quartile

Up to a PWI of 25.5 (135 users), the 1st quartile stretch factors increase rather moderately and linearly to values between 1.25 (presentation.OrderBean.newOrder) and 3.68 (service.OrderService.insertOrder). Up to the highest considered PWI (60.9), the operation presentation.OrderBean.newOrder reaches a maximum stretch factor of 1.25 (see Figure 5.9(a)). The values of persistence layer operations remain between 1.41 and 1.76. The impact of a further increasing workload intensity on the mode follows a similar pattern for the remaining operations: it increases with a higher slope for PWIs around 35 (165–175 users) before rising considerably to maximum values. The operations service.OrderService.insertOrder and service.OrderService.getNextId reach the highest stretch factors, with values of 34.47 and 66.53. As a representative of both, Figure 5.9(c) shows the curve for service.OrderService.getNextId. All other operations reach stret
54. Additional Test Elements include Timers, Listeners, Assertions, and Configuration Elements. Timers add a think time, e.g. constant or based on a normal distribution, between Sample executions of the same thread. Assertions are used to make sure that the server returns the expected results, e.g. that a response contains a certain text pattern. By using Listeners, results of the Sample executions can be logged, e.g. HTTP response codes or failed assertions. The set of Configuration Elements includes an HTTP Cookie Manager to enable client-side cookies, as well as Variables. Table A.1 lists all Test Element types included in JMeter version 2.2. The user's manual (Apache Software Foundation, 2006) contains a description of functionality and available parameters of all Test Elements.

Architecture

JMeter includes components required for the GUI and non-GUI mode, for holding the configuration of the Test Plan, and those required for the test execution. This categorization is illustrated by the pillars shown in Figure. GUI components provide the functionality to graphically define a Test Plan and to start and stop the test execution. As mentioned above, a test can also be started in non-GUI mode. The Engine is responsible for controlling the test run. It initializes the Thread Group and the included Threads, each of which is assigned a private instance of the Test Plan to be executed.
55. Figure 5.2: Scatter and box-and-whisker plots showing the gaps and higher variance of the data when Tpmon stores the response times to the database (operation presentation.CatalogBean.viewCategory). Panels: (c) boxplot including all outliers (database); (d) boxplot trimmed to 18 ms (database); (e) boxplot including all outliers (filesystem).

tion 2.4) to the right-hand side, i.e. those data points more than 1.5 IQR farther from the 3rd quartile were considered. By this definition, we calculated an average ratio of outliers for the instrumented operations between 1.1 % and 15.6 %. Due to the rather high outlier ratio, not all outliers are removed from the sample data, but at most 3 %. Given an outlier (normal and extreme) ratio o in an experiment for the response times of a single operation, the ratio $\min\{0.03, o\}$ of the largest response times is removed from the original sample data.

5.1.3.3 Time Interval Extraction

In each case, a trailing period of non-representative response times had to be removed. These were due to the configured ramp-up time and the warm-up periods which occurred within the application, although a two-minute warm-up had been executed before each experiment run (see Section 4.5). We extracted time intervals considered representative and of sufficient duration. Figure shows the scatter plot and the box
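The outlier removal described above (upper outliers lie more than 1.5 IQR above the 3rd quartile, and at most 3 % of the largest values are dropped) can be sketched as follows. This is a minimal sketch under stated assumptions, not the R code used in the analysis: the function names are hypothetical, and the quartiles are computed with Python's `statistics.quantiles` (inclusive method), which may differ slightly from the quartile definition used in the thesis.

```python
import math
import statistics

def upper_outlier_ratio(sample):
    """Share of observations more than 1.5 IQR above the 3rd quartile."""
    q1, _median, q3 = statistics.quantiles(sample, n=4, method="inclusive")
    fence = q3 + 1.5 * (q3 - q1)
    return sum(1 for x in sample if x > fence) / len(sample)

def trim_largest(sample, cap=0.03):
    """Remove the share min(cap, o) of largest values, o = outlier ratio."""
    ratio = min(cap, upper_outlier_ratio(sample))
    # Small epsilon guards against floating-point results such as 2.9999999
    n_remove = math.floor(ratio * len(sample) + 1e-9)
    ordered = sorted(sample)
    return ordered[: len(ordered) - n_remove]

# Hypothetical response times (ms): one obvious outlier among 100 values
sample = list(range(1, 100)) + [1000]
trimmed = trim_largest(sample)
print(len(sample), len(trimmed), max(trimmed))
```

Capping the removed share keeps most of the long tail in the sample, which matters here because the tail itself is part of what the analysis describes.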
56. Design
4.1 Markov4JMeter Profile for JPetStore
4.1.2 Application Model
4.1.3 Behavior Models
4.2 Configuration
4.2.1 Node Configuration
4.2.2 Monitoring Infrastructure
4.2.3 Adjustment of Default Software Settings
4.2.4 Definition of Experiment Runs
4.3 Instrumentation
4.3.1 Assessment of Monitoring Overhead
4.3.2 Identification of Monitoring Points
4.4.1 Formal Definition
4.4.2 Implementation
4.5 Execution Methodology
5 Analysis
5.1 Methodology
5.1.1 Transformations of Raw Data
5.1.2 Visualization
5.1.3 Examining and Extracting Sample Data
5.1.4 Considered Statistics
5.1.5 Parametric Density Estimation
5.2 Data Description
5.2.1 Platform Workload Intensity
5.2.4 Distribution Characteristics
5.2.5 Distribution
5.3 Summary and Discussion of Results
57. Figure 5.10: Platform workload intensity vs. mean, median, and mode response times as well as quartile stretch factors for the operation presentation.CatalogBean.viewItem. Panels: (a) PWI 0.03–2.22; (c) PWI 0.03–60.90.

persistence.sqlmapdao.OrderSqlMapDao.insertOrder shows a different behavior (see Figure 5.8(c)). The persistence layer operations increase to values between 1.97 and 2.35 for a PWI up to 18.84. Except for persistence.sqlmapdao.OrderSqlMapDao.insertOrder, the persistence layer operations decrease slightly starting with PWIs around 20, but remain in this range up to the maximum PWI (60.90). The curve for persistence.sqlmapdao.OrderSqlMapDao.insertOrder is shown in Figure 5.8. The stretch factors of service.OrderService.getNextId and service.OrderService.insertOrder reach the values 7.07 and 7.31 up to a PWI of 18.84. The values for the remaining operations are between 2.59 (service.CatalogService.getCategory) and 4.75 (presentation.CatalogBean.viewCategory).

For those operations having an intermediate local maximum for mean and maximum, the median response times increase further with a smaller slope up to a PWI around 34.71. The remaining operations show a slightly higher slope in this range. The operations service.OrderService.insertOrder and service.OrderService.getNextId reach values of 21.28 an
58. Session Controller and Markov State both extend the JMeter class jmeter.control.GenericController. A Markov Session Controller forms the root of a probabilistic session model within a Test Plan. Markov State Test Elements are added directly underneath, each representing an application state of the session layer as defined in Section. Figure 3.8(a) shows the Markov4JMeter Test Plan for the sample application model illustrated in Figure.

3.3.1.1 Markov Session Controller

Like the JMeter Test Elements, the Markov Session Controller is divided into a Test Element class and a GUI class providing the configuration dialog. The configuration dialog allows the definition of the behavior mix and the configuration of the session arrival controller. It is shown in Figure 3.8(b). The behavior mix is defined by selecting the respective behavior files and specifying the desired relative frequencies.

Figure 3.8: Screenshots of the Markov4JMeter Test Plan with Markov Session Controller and Markov State elements, and of the configuration dialog. The behavior mix shown in the dialog assigns the relative frequency 0.3 to Behavior0 (behv0.csv) and 0.7 to Behavior1 (behv1.csv).
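How a behavior mix with relative frequencies could drive the choice of a behavior model for each new session can be sketched as follows. This is a hedged illustration and not Markov4JMeter's actual implementation: the function name `pick_behavior` is hypothetical, and only the file names and frequencies (0.3 and 0.7) are taken from the configuration dialog shown in Figure 3.8.

```python
import random

def pick_behavior(behavior_mix, rng=random):
    """Pick one behavior file, weighted by its relative frequency."""
    names = list(behavior_mix)
    weights = [behavior_mix[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]

# Behavior mix as shown in the dialog: file name -> relative frequency
behavior_mix = {"behv0.csv": 0.3, "behv1.csv": 0.7}

rng = random.Random(42)  # seeded for a reproducible demonstration
picks = [pick_behavior(behavior_mix, rng) for _ in range(10_000)]
share_behv1 = picks.count("behv1.csv") / len(picks)
print(round(share_behv1, 2))
```

Over many sessions the observed shares approach the configured relative frequencies, which is the property a tester relies on when specifying the mix.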
59. ToCart (see Section 5.3) indicate the need for probabilistic testing, or at least for the exact emulation of user behavior, when testing enterprise applications. Most likely, the minor cluster of response times due to special use cases would not have been uncovered otherwise.

Anomaly Detection: Like all anomaly detectors, WISAD has its limitations: there are scenarios which lead to a higher number of detection errors. For example, for scenarios with bimodal distributions, all executions of a cluster with higher response times than the other might be classified as anomalies due to the nature of the mean. In order to use WISAD with these types of distributions, it could be combined with additional techniques such as a control flow analysis for determining the context of an operation call (2007a). Moreover, WISAD only detects anomalies which exceed a given threshold, but not those whose response times are considerably below normal.

7.3 Future Work

Based on our results, the following topics could be worked on in the future:

• The presented approach for generating probabilistic workload could be further evaluated, e.g. by applying it to an enterprise system in production use. The probabilities needed for the transition matrices of the user behavior models could be derived from Web log files. This could be based on the approach by Menascé et al. (1999) for CBMGs.

• Suitable monitoring points with
60. al ACM Symposium on Applied Computing (SAC '06), Dijon, France, April 23–27, 2006, pages 705–709. ACM Press.

M. Y. Chen, E. Kiciman, E. Fratkin, A. Fox, and E. A. Brewer (2002). Pinpoint: Problem Determination in Large, Dynamic Internet Services. In Proceedings of the 2002 International Conference on Dependable Systems and Networks (DSN '02), Washington, DC, USA, June 23–26, 2002, pages 595–604. IEEE Press.

I. Cohen, S. Zhang, M. Goldszmidt, J. Symons, T. Kelly, and A. Fox (2005). Capturing, Indexing, Clustering, and Retrieving System History. In A. Herbert and K. P. Birman, editors, Proceedings of the 20th ACM Symposium on Operating Systems Principles (SOSP 2005), Brighton, UK, October 23–26, 2005, pages 105–118. ACM Press.

E. L. Crow and K. Shimizu, editors (1988). Lognormal Distributions: Theory and Applications, volume 88 of Statistics Textbook and Monographs Series. New York: Marcel Dekker.

Eclipse Foundation (2007). AspectJ Homepage. http://www.eclipse.org/aspectj (last visited August 31, 2007).

R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee (1999). Request for Comment (RFC) 2616: Hypertext Transfer Protocol HTTP.

T. Focke (2006). Performance Monitoring von Middleware-basierten Applikationen. Diplomarbeit, University Oldenburg.

J. Fulmer (2006). Siege Homepage. http://www.joedog.org/JoeDog/Siege (last visited August 31, 2007).

Gamma, Helm, Johnson, and Vlissides (2000). Des
61. ameters $\tau$, $\mu$, and $\sigma$ if $Y = \ln(X - \tau)$ is $N(\mu, \sigma^2)$. The distribution is denoted by $\Lambda(\tau, \mu, \sigma^2)$. The parameter $\tau$ is called the threshold parameter and denotes the lower limit of the data (Aitchison and Brown, 1957; Crow and Shimizu, 1988). Figure 2.10(b) contains the density graph of a 3-parameter log-normal distribution.

Descriptive Statistics

Assume that in an experiment a sample of $n$ observations, denoted by $x_1, \ldots, x_n$, has been made. The relative frequency function $f: \mathbb{R} \to [0, 1]$ gives the relative number of times a value occurs in the sample. $F: \mathbb{R} \to [0, 1]$ is the cumulative relative frequency function, analogous to the cumulative distribution function described above. The sample mean $\bar{x}$ is the arithmetic mean of the observed values (see Equation 2.9). Analogous to the variance of a probability distribution, the sample variance $s^2$ and the sample standard deviation $s$ describe the variability in the data. Minimum and maximum denote the smallest and greatest observations in the sample.

$$\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i, \qquad s^2 = \frac{1}{n-1} \sum_{i=1}^{n} (x_i - \bar{x})^2 \qquad (2.9)$$

A $p$-quantile $\tilde{x}_p$, for $p \in\, ]0, 1[$, is the smallest observation $x$ satisfying $F(x) \geq p$ (see Equation 2.10). The quantiles for $p = 0.25$, $p = 0.5$, and $p = 0.75$ are denoted as the 1st, 2nd, and 3rd quartiles. The 2nd quartile is also called the median. The interquartile range (IQR) is the range between the 1st and the 3rd quartile. Figure 2.11 shows the description of a box-and-whisker plot, which is commonly used to displ
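The quantile definition above can be turned into a few lines of code. The sketch below is illustrative only: it assumes that the comparison in the definition reads $F(x) \geq p$, the function names are hypothetical, and the sample values are made up for the example.

```python
def cumulative_relative_frequency(sample, x):
    """F(x): share of observations that are less than or equal to x."""
    return sum(1 for value in sample if value <= x) / len(sample)

def p_quantile(sample, p):
    """Smallest observation x with F(x) >= p, for 0 < p < 1."""
    if not 0 < p < 1:
        raise ValueError("p must lie strictly between 0 and 1")
    for x in sorted(sample):
        if cumulative_relative_frequency(sample, x) >= p:
            return x

def interquartile_range(sample):
    """Range between the 1st and the 3rd quartile."""
    return p_quantile(sample, 0.75) - p_quantile(sample, 0.25)

# Hypothetical response times (ms)
sample = [7, 1, 3, 9, 5, 3, 8, 4]
print(p_quantile(sample, 0.25), p_quantile(sample, 0.5), p_quantile(sample, 0.75))
print(interquartile_range(sample))
```

Because the quantile is defined as an observation from the sample, no interpolation between neighboring values takes place, unlike in many library implementations.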
62. an be accommodated.

Chapter 3 Probabilistic Workload Driver

This chapter deals with the development of the workload driver used in the case study. We introduce an approach for a model-based definition and generation of probabilistic workload for enterprise applications. The definition of the user behavior is separated from application- and protocol-specific details. The models are based on hierarchical finite state machines and Markov chains. We implemented our approach by extending the existing workload generator JMeter (see Section 2.6). This extension, named Markov4JMeter, has been released under an open source license (2007).

Section 3.1 contains the requirements definition of the workload driver. The conceptual design of the workload driver, including the definition of the workload configuration and execution semantics, is given in Section 3.2. Section 3.3 outlines the resulting implementation and integration into JMeter. How to model and generate probabilistic workload with Markov4JMeter is illustrated in Section 3.4.

3.1 Requirements Definition

Sections 3.1.1 and 3.1.2 introduce how requirements are labeled and classified in this requirements definition and state the assumptions made. Which applications and use cases must be supported is defined in Sections 3.1.3 and 3.1.4. Requirements for the workload configuration and the provided user interface follow in Sections 3.1.5 and 3.1.6.

3.1.1 Requirements Labeling and Cl
63. ase Study

B.1 Installation Instructions for Instrumented JPetStore

Section B.1.1 contains the steps to install and configure the servlet container Apache Tomcat. In Section B.1.2 we give instructions to build and install the iBATIS JPetStore. In Section B.1.3 we describe how to monitor the JPetStore with Tpmon.

B.1.1 Install and Configure Apache Tomcat

The Apache Tomcat server is installed by extracting its binary archive. After executing the script bin/startup.sh, the server is started and presents a welcome page through the URL. For remote access, firewall settings may need to be modified. Web applications are installed by copying them into the webapps directory. This also works while the server is running. Our changes to some default settings of the server configuration are described in the following paragraphs.

Heap Size: By adding the following line to the file bin/catalina.sh, we increased the maximum heap space usable by the Java virtual machine to 512 MiB:

    JAVA_OPTS="-Xms64m -Xmx512m $JAVA_OPTS"

Thread Pool Size: By modifying the following XML element in the configuration file conf/server.xml, we increased the maximum number of available request processing threads (attribute maxThreads) to 300 and the maximum number of simultaneously accepted requests (attribute acceptCount) to 400:

    <Connector port="8080" maxHttpHeaderSize="8192" maxThreads="300" minSpareThreads="25" maxSpareThreads="75" enableLook
64. assification

Each requirement is labeled with a unique identifier such as scope applications m. An identifier consists of a descriptive component, in this case scope applications, and a classification component, which is either m (mandatory) or s (secondary). The classification is as follows:

Mandatory: A mandatory requirement, emphasized by "must", is a requirement which must be fulfilled in every case.

Secondary: A secondary requirement, emphasized by "should", is a requirement which should be fulfilled. The non-fulfillment of a secondary requirement must be carefully weighed and justified.

Figure 3.1: The use case diagram (UCD Define and Execute Workload) illustrates the use cases to be supported by the workload driver.

3.1.2 Assumptions

The application for which workload is to be generated is denoted as the application under test (AUT). The term tester relates to the person using the workload driver.

The workload configuration is divided into an application and user behavior configuration, and a behavior mix and workload intensity configuration.

• The application and user behavior configuration contains the application- and protocol-specific details required to generate valid workload. Moreover, it includes models of the probabilistic user behavior for the AUT.

• The behavior mix and workload intensity configuration contains
65. ata, called a rug, as well as other statistics are included.

Quantile-Quantile Plots

A quantile-quantile plot (QQ plot) is a graphical technique to test whether two data samples come from equally distributed populations. Each point in the plot represents a pair of quantiles (see Section) of both samples. The closer the points follow the 45-degree reference line, the more likely it is that both samples come from equally distributed populations. In the QQ plot in Figure, the underlying distribution of the sample response times follows the given 3-parameter log-normal distribution quite well.

5.1.3 Examining and Extracting Sample Data

Before performing the analysis of the response times, we examined and extracted the sample data as outlined in the following Sections 5.1.3.1–5.1.3.3.

5.1.3.1 Sample Data Examination

We investigated the results for discrepancies between the configuration and the derived data. For example, we checked the ramp-up time and the number of active sessions throughout the experiment run using a plot visualizing the number of active sessions over time. Moreover, we investigated the scatter plots and box-and-whisker plots for abnormal variability in the data or suspicious events.

We recognized gaps in the scatter plots related to those experiment runs in which Tpmon stored the data within the database. The gaps become obvious by comparing Figures 5.2(a) and 5.2(b), which show the scatter plots of the operation service.CatalogService
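The idea behind the quantile-quantile plot described above, pairing the quantiles of two samples and checking how closely the pairs follow the 45-degree reference line, can be sketched numerically. This is a minimal sketch, not the R plotting code used in the analysis: the function names are hypothetical, the number of quantiles is an arbitrary choice, and the quantiles come from Python's `statistics.quantiles`.

```python
import statistics

def qq_pairs(sample_a, sample_b, n=10):
    """Pair up the n-1 cut points (quantiles) of both samples."""
    quantiles_a = statistics.quantiles(sample_a, n=n, method="inclusive")
    quantiles_b = statistics.quantiles(sample_b, n=n, method="inclusive")
    return list(zip(quantiles_a, quantiles_b))

def max_deviation_from_reference(pairs):
    """Largest vertical distance of a QQ point from the line y = x."""
    return max(abs(a - b) for a, b in pairs)

# Hypothetical samples: identical values, and the same values shifted by 5
base = list(range(1, 101))
same = list(reversed(base))
shifted = [x + 5 for x in base]

print(max_deviation_from_reference(qq_pairs(base, same)))
print(max_deviation_from_reference(qq_pairs(base, shifted)))
```

A deviation near zero corresponds to points lying on the reference line; a constant offset, as in the shifted sample, shows up as points parallel to it.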
66. aving agent is registered inside the Java virtual machine tool interface (JVMTI) (Sun Microsystems) and basically performs binary weaving at the time a class to be woven is loaded by the class loader. Pointcuts inside the application can be defined within external configuration files, without modifying the application's source code, or by enriching the application code with annotations, which were introduced with Java 5.

Instrumentation Modes

Tpmon is based on the AOP extension AspectJ. A configuration file contains a specification of the classes to be considered by the weaving agent. For example, by using the directive <include within="com.ibatis.jpetstore..*"/>, any of the JPetStore classes (see Section 2.7) whose name matches the given pattern would be considered. In order to consider the application entry operations of the presentation layer as well, one would also include the pattern org.apache.struts..ActionServlet. The preferred weaving method is load-time weaving, one of the weaving alternatives mentioned above.

Tpmon offers the two instrumentation modes described below, i.e. two ways of specifying the operations to be monitored:

1. Full Instrumentation Mode: Using this mode, the weaving agent weaves the monitoring functionality into all operations in those classes specified in the configuration file. No source code modifications are necessary.

2. Annotation Mode: Using this mode, all methods to be monitored need to be labeled by the
67. ay these statistics.

$$\tilde{x}_p = \min\{x \mid F(x) \geq p\} \qquad (2.10)$$

Figure 2.12: Kernel density estimations of a data sample using a normal kernel and different window sizes. Panels: (a) window size 2; (b) window size 20.

The outside points in a box-and-whisker plot mark the outliers of a data sample. Values between 1.5 and 3 IQRs farther from the 1st or the 3rd quartile are called normal outliers. All values more than 3 IQRs farther are called extreme outliers.

An observation that occurs with the highest frequency is called the mode. Data with more than one mode is said to be multimodal. A sample with one mode is called unimodal. A sample with two modes is called bimodal.

Generally, for symmetric distributions, mean, median, and mode coincide. If mean, median, and mode do not coincide, the data is said to be skewed (asymmetric), with a longer tail to one side (Montgomery and Runger, 2006). It is right-skewed if mode < median < mean, and left-skewed if mode > median > mean.

Density Estimation

Often one obtains sample data from an experiment and needs to estimate the underlying density function $f$. Density estimation denotes the construction of an estimate of the continuous density function from the observed data. It can either be param
68. by Markov4JMeter (see Chapter 3). In this section we will describe the contained elements and outline their configuration. The tree of the Test Plan and the configuration of the state View Cart are shown in Figure.

Figure 4.3: Probabilistic Test Plan and configuration of state View Cart. Panels: (a) Probabilistic Test Plan for JPetStore, with the subtree of the state Sign On expanded; (b) configuration of state View Cart.

4.1.4.1 Basic Test Elements

The Test Plan contains the followi
69. Figure B.1: Sample trace timing diagrams for JPetStore requests (cont.). Panels: (g) Request addItemToCart; (h) Request viewCart; (i) Request checkout; (j) Request newOrderForm; (k) Request newOrderData; (l) Request newOrderConfirm; (m) Request signoff. Horizontal axis: Time (ms).

B.3 Iterative Monitoring Point Determination
70. ch factors between 11.1 and 23.75. Figure 5.9(b) shows a representative curve: the stretch factor increases moderately up to a PWI around 35 (1.41–4.47) before it rises considerably to the maximum value.

5.2.3.6 Median

The operation presentation.OrderBean.newOrder reaches its highest median (7.25 ms) for a PWI of 0.27 (15 users). For all other operations and a PWI up to 18.84 (125 users), the median increases rather moderately compared to the mean. Figures 5.10(a) and 5.10(b) illustrate this. Solely

Figure 5.10 panel titles, all for presentation.CatalogBean.viewItem: Workload Intensity vs. Response Time Mean, Median, and Mode; Workload Intensity vs. Mean, Median, and Mode Stretch Factors; Workload Intensity vs. Quartile Stretch Factors. Axes: platform workload intensity vs. response time (ms) or stretch factor. Panel (b): PWI 0.03–60.90.
71. creases considerably for all operations up to peak values. The value for persistence.sqlmapdao.OrderSqlMapDao.insertOrder increases rather moderately to 3.99. The operations service.OrderService.insertOrder and service.OrderService.getNextId show the highest stretch factors, with 922.17 and 3082.15. The stretch factors of the remaining operations rise to values between 128.07 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder) and 322.52 (presentation.OrderBean.newOrder).

Figure 5.7: Platform workload intensity vs. maximum response times for the operations service.CatalogService.getCategory and service.OrderService.insertOrder. Panels: (a) service.CatalogService.getCategory, PWI 0.03–2.22; (b) service.OrderService.insertOrder, PWI 0.03–60.90; (c) service.CatalogService.getCategory, PWI 0.03–60.90. Axes: platform workload intensity vs. response time (ms) and stretch factor.

Except for the operations service.OrderService.insertOrder
72. ct issues with the configuration of the persistence layer (see Section): table names are inconsistently spelled in terms of capitalization in the original SQL initialization scripts and the iBATIS object-relational mapping files. For example, the database table holding the status of each order is spelled orderstatus within the respective SQL statements in the SQL scripts, whereas it is spelled ORDERSTATUS in the mapping files.

Since session identifiers are not reused by different threads, we decreased the time until a session times out within the JPetStore to 3 minutes (default: 30 minutes) in order to save the resources needed to maintain session information.

The JPetStore source code is instrumented by annotating the operations described in Section 4.3 to allow application-level monitoring using Tpmon.

4.2.4 Definition of Experiment Runs

25 experiment runs are executed under increasing workload intensity conditions in order to obtain response time data of the instrumented JPetStore operations. These 25 experiments are executed once with Tpmon configured to write its monitored data to the filesystem and once writing it to the database.

As mentioned in Section 2.3, the workload intensity is considered to be implied by the number of active sessions and the client-side think times. It will solely be controlled by varying the number of active sessions simulated by the workload driver. The client-side think time distribution is set to N(300
73. ction.ActionServlet.doGet and struts.action.ActionServlet.doPost, provided by the Struts framework (see Section 2.7), are instrumented.

Request Type

observations/request  5 12 83 40 89 59 119 75 88 78 107 159 17
min      4.00   7.00  48.00  25.00  50.00  33.00  62.00  35.00  41.00  38.00  51.00  10.00
median   7.00   9.50  53.50  30.00  53.50  36.50  68.00  42.50  50.00  42.50  84.00  14.00
x̄        9.52  13.48  57.08  31.28  57.24  40.76  72.20  44.70  49.74  44.14  81.04  15.72
s        7.54  12.74  15.52   5.39  14.11  11.14  16.69  11.57   5.83   5.03  33.03   5.74
max     40.00  97.00 157.00  47.00 148.00 103.00 173.00 114.00  65.00  54.00 285.00  38.00

observations/request  1 1 1 1 1 1 1 1 1 1 1 1 1
min      2.00   2.00   8.00   5.00   8.00   5.00   8.00   2.00   2.00   2.00   3.00   1.00
median   8.00   8.00  15.50  13.00  16.00  13.50   5.90   9.00  10.00   8.00  14.00   9.00
x̄       10.14   8.82  17.52  13.90  16.34  14.58  16.56  10.34   9.66   9.26  15.51   8.96
s       10.32   6.09  11.99   5.26   5.90   7.87   5.61  11.60   3.83   5.16   9.65   4.56
max     57.00  35.00  92.00
74. d 35.66 for a PWI up to 34.71. The values for the remaining operations are between 4.83 and 8.53. For PWI values starting with 35, the median stretch factor of all operations except those of the persistence layer increases considerably. The operations service.OrderService.insertOrder and service.OrderService.getNextId reach the values 81.52 and 182.59. The remaining operations reach values between 14.58 (presentation.CatalogBean.viewItem) and 32.13 (service.CatalogService.getCategory).
[Figure 5.11, three panels over the platform workload intensity: (a) 3rd quartile of response times of presentation.CartBean.addItemToCart, PWI 0.03-2.22; (b) 3rd quartile of response times of presentation.CartBean.addItemToCart, PWI 0.03-18.85; (c) quartile stretch factors (1st quartile, median, 3rd quartile) of service.OrderService.getNextId, PWI 0.03-60.90.]
Figure 5.11: Platform workload intensity vs. 3rd quartiles of response times and quartile stretch factors for the operations presentation CartBean addItemT
75. d in all experiment runs. Figures 5.14(a) and 5.14(b) show the scatter plot and the kernel-estimated density of the response times monitored in Experiment 4. One cluster is located near 0 ms and shows no significant variance. The second cluster is right-skewed (see Section 2.4) with a mode around 8 ms. The operation presentation.CartBean.addItemToCart also shows a bimodal density throughout all experiments. In contrast to presentation.OrderBean.newOrder, however, its cluster close to 0 ms contains response times which occur only sporadically and thus do not have a considerable density. Figures 5.14(c) and 5.14(d) show the scatter plot and the kernel-estimated density of the response times monitored in Experiment 11.
[Figure 5.14, scatter plots of response times over the experiment time in minutes, each with a local regression curve: (a) presentation.OrderBean.newOrder, N = 1681, two response time clusters; (c) presentation.CartBean.addItemToCart, N = 1025, minor cluster of sporadic respons
76. d related requests to the system issued by the same user is called a session. As soon as the first request has been issued, the user begins a new session, which is then denoted as an active session until the last request has been issued. The time interval elapsed between the completion of a server response and the invocation of the next request is denoted as the think time.
3.2.2 Workload Configuration Data Model
Based on the requirements related to the workload configuration (see Section), including the application and user behavior configuration and the behavior mix and intensity configuration, we defined a workload configuration data model. It consists of an application model, a set of user behavior models related to the application model, as well as a definition of the user behavior mix and the workload intensity. The configuration elements are described in the following sections. Figure shows the class diagram for this data model.
[Figure 3.3: Sample application model illustrating the separation into session layer and protocol layer.]
3.2.2.1 Application Model
The application model contains all information required by the workload driver to generate valid sessions for an application. It is a two-layered hierarchical finite state machine consisting of a session layer on top and a protocol layer underneath. Figure 3.3 shows a sample application model.
Session Layer
The session layer co
77. d the definition of conditional probabilities considering all possible outcomes of the evaluated guard expression. We decided not to include conditional probabilities, since this would come with the additional overhead of keeping application and user behavior models consistent and with many more probabilities to be defined. The design of the workload driver resulted in an extension named Markov4JMeter. Before that, the following implementation options were considered:
1. The development of a new stand-alone workload driver from scratch.
2. A transformation of the workload model into a JMeter Test Plan, analogous to the transformation of GOTO programs into WHILE programs. An outer While Controller contains a list of If Controllers, each representing an application state. A global variable stores the current state of the session model. A pre-processor executed before each iteration would be used to select the next state based on externally defined user behavior models and the current state.
Option 1, the implementation of a new stand-alone workload driver, would have required implementing a lot of protocol-related functionality which has already been implemented in a number of workload drivers. The open-source workload driver JMeter provided this benefit. Option 2 would have required either the generation of Test Plans in the JMX file format or the manual definition of such models in JMeter. First we are conf
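The control structure behind option 2 can be pictured with a minimal Python sketch. It is an illustration under stated assumptions, not the thesis implementation: the state names, the probabilities, and the helper names `select_next_state` and `run_session` are hypothetical. The outer loop stands in for the While Controller, each branch for an If Controller, and the variable `state` for the global variable holding the current state.

```python
import random

# Hypothetical user behavior model: state -> list of (next_state, probability).
# "exit" ends the session. Names and numbers are illustrative only.
BEHAVIOR = {
    "index": [("view_category", 0.7), ("exit", 0.3)],
    "view_category": [("index", 0.4), ("view_category", 0.2), ("exit", 0.4)],
}

def select_next_state(current, rng):
    """Pre-processor role: draw the next state from the behavior model."""
    targets, weights = zip(*BEHAVIOR[current])
    return rng.choices(targets, weights=weights, k=1)[0]

def run_session(rng, entry="index", max_steps=1000):
    """Run one session and return the list of visited application states."""
    visited = []
    state = entry  # plays the role of the global "current state" variable
    steps = 0
    while state != "exit" and steps < max_steps:  # outer While Controller
        if state == "index":                      # If Controller for "index"
            visited.append("index")
        elif state == "view_category":            # If Controller for "view_category"
            visited.append("view_category")
        state = select_next_state(state, rng)     # choose the state for the next iteration
        steps += 1
    return visited
```

Calling `run_session(random.Random(1))` returns one randomly generated state sequence that always starts with the entry state.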
78. dao.OrderSqlMapDao.insertOrder curve for service.CatalogService.getItem with its local maximum at PWI 25. Figure 5.8(c) shows the curve for persistence.sqlmapdao.OrderSqlMapDao.insertOrder, which is the only one of this shape.
5.2.3.4 Mode
Up to a PWI of 0.84 (45 users), the mode stretch factor of the operations increases moderately to values between 1.00 and 1.08. Up to a PWI of 2.22 (75 users), it increases with a slightly higher slope to 1.41 for the operation presentation.OrderBean.newOrder and to values between 1.03 and 1.72 for the remaining operations. Figure 5.8(a) shows the curve for service.CatalogService.getItem in the PWI range 0.22-2.22. Up to a PWI of 24.50 (135 users), the stretch factor of the operations service.OrderService.getNextId and service.OrderService.insertOrder increases to 4.82 and 4.28. The response time mode of presentation.OrderBean.newOrder varies in a range between -0.01 ms (negative due to the approximation) and 0.03 ms for PWIs between 0.22 and 4.54 (95 users) before increasing to 0.30 ms. This yields a stretch factor exceeding 70. The stretch factors of the remaining operations increase to maximum values between 1.56 (persistence.sqlmapdao.AccountSqlMapDao.getAccount) and 3.08 (presentation.CatalogBean.viewCategory). In no experiment are the stretch factors of the persistence layer operations' response time modes higher than 1.56 and 1.89. The mode of presentation
79. e and Secure Computing, 1(1):11-33.
G. Ballocca, R. Politi, G. Ruffo, and V. Russo (2002). Benchmarking a Site with Realistic Workload. In Proceedings of the 5th Annual IEEE International Workshop on Workload Characterization (WWC-5), Austin, TX, USA, November 25, 2002, pages 14-22. IEEE Press.
P. Barford and M. Crovella (1998). Generating Representative Web Workloads for Network and Server Performance Evaluation. In Proceedings of the 1998 ACM SIGMETRICS Joint International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS '98/PERFORMANCE '98), Madison, WI, USA, June 22-26, 1998, pages 151-160. ACM Press.
E. Cecchet, J. Marguerite, and W. Zwaenepoel (2002). Performance and Scalability of EJB Applications. In Proceedings of the 17th ACM Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA 2002), Seattle, WA, USA, November 4-8, 2002, pages 246-261. ACM Press.
H. Chen, G. Jiang, C. Ungureanu, and K. Yoshihira (2005). Failure Detection and Localization in Component Based Systems by Online Tracking. In Proceedings of the 11th ACM SIGKDD International Conference on Knowledge Discovery in Data Mining (KDD '05), Chicago, IL, USA, August 21-24, 2005, pages 750-755. ACM Press.
H. Chen, G. Jiang, C. Ungureanu, and K. Yoshihira (2006). Combining Supervised and Unsupervised Monitoring for Fault Detection in Distributed Computing Systems. In Proceedings of the 21st Annu
80. e 3278 of Lecture Notes in Computer Science, pages 171-182. Berlin: Springer.
J. Aitchison and J. A. C. Brown (1957). The Lognormal Distribution with Special Reference to its Uses in Econometrics. Cambridge: Cambridge University Press.
C. Amza, A. Chanda, E. Cecchet, A. Cox, S. Elnikety, R. Gil, J. Marguerite, K. Rajamani, and W. Zwaenepoel (2002). Specification and Implementation of Dynamic Web Site Benchmarks. In Proceedings of the 5th Annual IEEE International Workshop on Workload Characterization (WWC-5), Austin, TX, USA, November 25, 2002, pages 3-13. IEEE Press.
Apache Software Foundation (2006). JMeter User's Manual. http://jakarta.apache.org/jmeter/usermanual. Last visited August 31, 2007.
Apache Software Foundation (2007a). iBATIS Data Mapper Framework Homepage. http://ibatis.apache.org. Last visited August 31, 2007.
Apache Software Foundation (2007b). Apache JMeter Homepage. http://jakarta.apache.org/jmeter. Last visited August 31, 2007.
M. F. Arlitt, D. Krishnamurthy, and J. Rolia (2001). Characterizing the Scalability of a Large Web-based Shopping System. ACM Transactions on Internet Technology, 1(1):44-69.
AspectJ Team (2005). AspectJ Development Environment Guide. http://www.eclipse.org/aspectj/doc/released/devguide. Last visited August 31, 2007.
A. Avizienis, J. C. Laprie, B. Randell, and C. E. Landwehr (2004). Basic Concepts and Taxonomy of Dependable and Secure Computing. IEEE Transactions on Dependabl
81. e Once Per Iteration needs to be activated. The guards and actions of the transitions should be configured as listed in Table 3.5.

Source State   Destination State   Disabled   Guard       Action
any            Sign On                        !signedOn   signedOn=true
any            Sign Off                       signedOn    signedOn=false

Table 3.5: Guards and actions used to restrict transitions to the states Sign On and Sign Off. The variable signedOn is used to remember whether a user has logged in or not.

Creating User Behavior Models and Defining the Behavior Mix
A behavior file template can be exported by clicking the button Generate Template within the Markov Session Controller. This file can then be edited in a spreadsheet application or a text editor. The sum of probabilities in each row must be 1.0. This step needs to be performed for each behavior model to be used. The behavior mix, i.e., the assignment of behavior models to their relative frequency of occurrence during execution, is defined in the configuration dialog of the Markov Session Controller. Entries can be added and removed using the buttons Add and Delete. Again, the sum of relative frequencies must be 1.0. Absolute and relative filenames can be used. Relative filenames are always relative to the directory JMeter has been started from. Figure 3.13 shows an example behavior mix.
[Figure 3.13, Behavior Mix table with the columns Name, Relative frequency, Filename; first row: Browser, 0.9, examples jpetstore br
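The two sum-to-one constraints stated above can be checked before a test run. The following is a minimal Python sketch, not part of Markov4JMeter; the function names, the tolerance, and the example file names are assumptions made for illustration.

```python
import math

def rows_sum_to_one(matrix, tol=1e-9):
    """Check that every row of a transition matrix sums to 1.0 (within tol)."""
    return all(math.isclose(sum(row), 1.0, abs_tol=tol) for row in matrix)

def mix_sums_to_one(behavior_mix, tol=1e-9):
    """Check that the relative frequencies of a behavior mix sum to 1.0.

    behavior_mix is a list of (name, relative_frequency, filename) tuples.
    """
    total = sum(freq for _name, freq, _filename in behavior_mix)
    return math.isclose(total, 1.0, abs_tol=tol)

# Hypothetical data, for illustration only.
example_matrix = [
    [0.0, 0.7, 0.3],
    [0.5, 0.1, 0.4],
]
example_mix = [
    ("Browser", 0.9, "browser.csv"),  # file names are made up
    ("Buyer", 0.1, "buyer.csv"),
]
```

A tolerance is used instead of an exact comparison because probabilities such as 0.7 and 0.1 are not exactly representable as binary floating point numbers.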
82. e executed, we defined the application and user behavior models according to our developed methodology and created a probabilistic Test Plan. A subset of JPetStore's operations was instrumented, since we found that full instrumentation has a considerable impact on the response times. The platform workload intensity (PWI) metric has been developed and implemented to quantify the workload on the server node. It is based on the number of active traces within the application.
Analysis and Results
The response times obtained from the executed experiment runs were statistically analyzed in terms of the relation between workload intensity and response time statistics. We set up an analysis environment using R to transform and visualize the raw monitored data. This environment is application-generic and can be used for any data derived from Tpmon. We showed that workload intensity does have a considerable impact on the response times. For example, the mean is much more sensitive to varying workload than the median or the mode. Response times were generally right-skewed. Four classes of distributions have been identified within the response time samples. The 3-parameter log-normal distribution is well suited to fit the class of unimodal operations. Solely the tails of the measured distributions are generally too short.
Workload intensity sensitive Anomaly Detection
Based on the findings in the case study, we developed the Workl
83. e study the relation between workload intensity and response times will be statistically analyzed Based on this a workload sensitive anomaly detection prototype will be implemented 111 Contents Introduction 11 Motivation 2 1 System Modell 5 2 nae bbe twp ra Ea Dee EES E 2 2 Performance Metrics and Scalability 2 2 rs 8 2246 2 3 Workload Characterization and Generation oY eh eae Bee Wa wey Wa ae wa ds aia he ae Gh a He in E seda ee sue Sid ee ek es ARA ANA AR EE AAA A Be e edad Ey a ee ee ee ee ee 2 9 Related Work y 3 4 s 24 4 kd od e o eS Probabilistic Workload Driver 3 1 Requirements Definition 3 1 1 Requirements Labeling and Classification e 3 1 2 Assumptions 3 1 3 Supported Applications 924 64 42404 48 4 64 43 4 3 1 4 Use Cases 3 1 5 Workload Configuration we Je ea de Bee le ey ea 3 1 6 User Interface 3 2 Design 3 2 1 System Mode 3 2 2 Workload Configuration Data Modell 3 2 3 Architecture and Iterative Execution Modell 3 3 Markov4JMeter 000 000 0 0 000 2 eee 3 3 1 Test Elements 3 3 2 Behavior Files 3 3 3 Behavior Mix Controller 0 02002 022002 0000084 3 3 4 Session Arrival Controller 3 4 Using Markov4JMeter w N RE O Xx aa 13 18 20 22 23 26 29 29 29 30 30 31 32 33 33 34 34 37 39 40 41 42 42 42 Contents 4 Experiment
84. e time interval elapsed between two subsequent requests issued by the same user is called the think time. This can be divided into client-side and server-side think time, depending on the perspective from which the measurement takes place (Menascé et al. 1999). The time interval between a user finishing a request and the system starting its execution is denoted as the reaction time. Both metrics are included in Figure 2.4(a) (1991).
Resource Utilization
The utilization of a system resource is the fraction of time this resource is busy (1991). System bottlenecks can be uncovered by monitoring resources with respect to this metric. For example, the response time of a system may degrade considerably due to a fully utilized CPU or because the free memory has been exceeded.
[Figure 2.5: Impact of increasing workload on throughput and end-to-end response time (based on 1991). Two panels, throughput and response time over workload, marking the knee capacity, the usable capacity, and the nominal capacity.]
Workload and Scalability
The term workload refers to the amount of work that is currently requested from or processed by a system. It is defined by workload intensity and service demand characteristics (Kozolick 2007). Workload intensity includes the number of current requests, mainly based on the current number of users and the t
85. e times
[Figure 5.14 (continued), panels over the response time in milliseconds unless noted otherwise:
(b) Density plot of the data in Figure (a), presentation.OrderBean.newOrder: N = 1681, bandwidth 0.9664, mean 4.517, median 7.245, approx. mode 0.008206, skewness 0.3665, kurtosis 1.167.
(d) Density plot of the data in Figure (c), presentation.CartBean.addItemToCart: N = 1025, bandwidth 0.3326, mean 6.447, median 5.938, approx. mode 5.322, skewness 0.5146, kurtosis 3.08.
(e) Scatter plot of response times over the experiment time in minutes, service.CatalogService.getItem, data in Exp. 2: N = 2756.
Density plot, service.AccountService.getAccount: N = 1294, bandwidth 0.05728, mean 3.751, median 3.575, approx. mode 3.502, skewness 2.374, kurtosis 6.374.]
86. e value for. Given a history X and a set of executions Y, let $e = (o, st, rt) \in Y$ be an execution of operation o, and let $\overline{rt}$ denote the sample mean of the response times of the historical executions of o in X. The function $PAD: Y \to \mathbb{B}$ gives the value 1 for the execution e if and only if e is considered anomalous:

$$PAD(e) = \begin{cases} 1, & rt > \theta \cdot \overline{rt} \\ 0, & \text{else} \end{cases} \qquad (6.1)$$

In the following two sections, PAD is applied to two different scenarios in terms of the workload intensity variation.
Example 1: PAD with Constant Workload Intensity
Figure 6.1 shows a synthetic workload scenario for a single operation and a constant workload intensity. The response times were generated based on the parameterized normal distribution N(100, 3). The inter-start times, i.e., the distances between execution start times, are exponentially distributed with constant parameters. 2 clusters of 8 anomalies, whose response times are increased by 8, were randomly inserted. The data is shown in Figure 6.1(a). Executions that correspond to triangles are classified as anomalies and the circles as normal executions. The curve of the error rate for PAD applied to this scenario is shown in Figure 6.1(b). It shows the relation between the chosen threshold $\tau$ and the resulting error rate. The error rate is 0 for $\tau \in [106.4, 144.9]$. Assuming that $\overline{rt} = 100$, this implies that $\theta$ should be set to a value in [1.06, 1.44].
Example 2: PAD with Varying Workload Intensity
Figure show a scenario for a varyi
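The decision rule of PAD translates directly into code. The following minimal Python sketch assumes that the history is given as a plain list of response times of one operation and that the threshold factor is passed as `theta`; the function name, parameter names, and example values are chosen for illustration and do not come from the source.

```python
from statistics import mean

def pad(response_time, history, theta):
    """Return 1 if the execution is considered anomalous, else 0.

    response_time: response time of the execution under test
    history: response times of historical executions of the same operation
    theta: threshold factor applied to the historical sample mean
    """
    return 1 if response_time > theta * mean(history) else 0

# Hypothetical history with a sample mean of 100.
history = [100, 97, 103, 100]
```

With this history and `theta = 1.2`, the effective threshold is 120: `pad(108, history, 1.2)` classifies the execution as normal, while `pad(150, history, 1.2)` classifies it as anomalous.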
87. eb Consortium (2005). RUBiS Home Page. http://rubis.objectweb.org. Last visited August 31, 2007.
OpenSTA (2005). OpenSTA (Open System Testing Architecture) Homepage. www.opensta.org. Last visited August 31, 2007.
D. Pugh and J. Spacco (2004). RUBiS Revisited: Why J2EE Benchmarking is Hard. In Companion to the 19th Annual ACM SIGPLAN Conference on Object-oriented Programming, Systems, Languages, and Applications (OOPSLA 2004), Vancouver, BC, CA, October 24-28, 2004, pages 204-205. ACM Press.
R Development Core Team (2005). R: A Language and Environment for Statistical Computing. Vienna: R Foundation for Statistical Computing. http www R project
M. Rohr, S. Giesecke, and W. Hasselbring (2007a). Timing Behavior Anomaly Detection in Enterprise Information Systems. In J. Cardoso, J. Cordeiro, and J. Filipe, editors, Proceedings of the 9th International Conference on Enterprise Information Systems (ICEIS 2007), Funchal, Madeira, Portugal, June 12-16, 2007, volume DISI (Databases and Information Systems Integration), pages 494-497. Portugal: INSTICC Press.
M. Rohr, A. van Hoorn, S. Giesecke, and W. Hasselbring (2007b). Workload Intensity Sensitive Timing Behavior Anomaly Detection. In preparation.
R. Sedgewick (1986). A New Upper Bound for Shellsort. Journal of Algorithms, 7(2):159-173.
M. Shams, D. Krishnamurthy, and B. Far (2006). A Model-based Approach for Testing the Performance of Web Applications. In Proceedings of the
88. ed on the configured application and user behavior models by executing the following steps:
1. Request a user behavior model from the behavior mix controller.
2. Request permission to execute a session from the session arrival controller.
3. Execute a probabilistic session model, which is a composition of the application model and the user behavior model assigned for this iteration. This step is described in the following Section 3.2.3.4.
3.2.3.4 Session Model Composition and Execution
Sessions executed by the user simulation threads are based on probabilistic session models, which are a composition of the application model (see Section 3.2.2.1) and a user behavior model (see Section 3.2.2.2). As mentioned above, the application model contains the protocol information and constraints required to generate valid sessions for an application. A user behavior model contains information about the probabilistic navigational behavior as well as a model of the think times. The two models are directly related by the associated states and transitions contained in the user behavior model (the set Z of states) and on the session layer of the application model. The composition of the two models into a single probabilistic model is straightforward: the application transitions are enriched with the probabilities contained in the user behavior model. Starting with the entry state z0 defined in the user behavior model, a probabilis
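The composition step can be sketched in a few lines of Python. This is an illustrative sketch under assumptions, not the design of the workload driver: the dictionaries `APP` and `BEHAVIOR`, the state names, and the function names `compose` and `next_state` are hypothetical. The application model contributes the allowed transitions, and the user behavior model contributes their probabilities.

```python
import random

def compose(app_transitions, behavior_probs):
    """Enrich the allowed application transitions with behavior probabilities.

    app_transitions: state -> list of allowed target states
    behavior_probs:  state -> {target state: probability}
    Transitions without a probability in the behavior model get 0.0.
    """
    session_model = {}
    for source, targets in app_transitions.items():
        probs = behavior_probs.get(source, {})
        session_model[source] = {t: probs.get(t, 0.0) for t in targets}
    return session_model

def next_state(session_model, current, rng):
    """Draw the successor of the current state from the composed model."""
    targets = list(session_model[current])
    weights = [session_model[current][t] for t in targets]
    return rng.choices(targets, weights=weights, k=1)[0]

# Hypothetical models, for illustration only.
APP = {"S0": ["S1", "exit"], "S1": ["S0", "S1", "exit"]}
BEHAVIOR = {
    "S0": {"S1": 0.8, "exit": 0.2},
    "S1": {"S0": 0.3, "S1": 0.2, "exit": 0.5},
}
```

Keeping the two inputs separate mirrors the separation described above: the same application model can be composed with several user behavior models.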
89. [Figure B.1: Sample trace timing diagrams for JPetStore requests. Panels shown here over the time in ms: (e) Request viewProduct, (f) Request viewItem.]
[Further figure of Appendix B, Case Study: three panels, each titled "10 Operation", listing operations such as jpetstore.domain.Cart.getSubTotal, jpetstore.domain.CartItem.setQuantity(int), jpetstore.domain.Account.getBannerName, and jpetstore.domain.Cart.getAllCartItems.]
90. either directly nor indirectly. Each Markov State must have a unique name. HTTP Request Samplers should be added to the Markov States according to Table.
[Figure 3.12: Markov4JMeter Test Plan. Tree: Test Plan, Thread Group, HTTP Cookie Manager, HTTP Request Defaults, Markov Model with the states Index, Sign On, View Category, and Sign Off, Gaussian Random Timer, View Results Tree.]
Defining Transition Guards and Actions
When a Markov State is selected within the Test Plan, its configuration dialog appears. It includes the table to define guards and actions for transitions to all states of the same Markov Session Controller. The table is automatically updated each time Markov States are added, removed, or renamed. Transitions can be assigned guards in order to allow a transition to be taken only if the specified expression evaluates to true. By selecting the respective check box, transitions can be deactivated completely, which is equivalent to entering a guard evaluating to false. An action is a list of statements, such as function calls or variable assignments, separated by semicolons, which is evaluated when a transition is taken. In our example, a variable signedOn is used to remember whether a user has logged in or not. A User Parameters Pre-Processor must be added to the Markov Session Controller with a new variable named signedOn with the value false to initialize the variable. The check box Updat
91. elated Markov Session Controller is modified. For example, the Markov State S0 in Figure 3.8(c) contains the HTTP Samplers f.shtml and g.shtml, which are executed in this order according to the application model in Figure. The configuration of the transitions is shown in Figure 3.8(c).
3.3.2 Behavior Files
The transition matrix of a user behavior model, as defined in Section 3.2.2.2, is stored in external files using a comma-separated value (CSV) file format. It contains the names of all Markov States underneath a Markov Session Controller. The entry state of the model is marked with an asterisk (at most one). The column labeled with represents the transition probability towards the exit state. Figure shows the behavior file of the user behavior model in Figure 3.4(a). Valid behavior file templates can be generated through the Markov Session Controller configuration dialog.

(header row with the columns S0, S1, S2, and the exit state)
S0*, 0.00, 0.70, 0.10, 0.20
S1, 0.00, 0.50, 0.10, 0.40
S2, 0.10, 0.50, 0.00, 0.40

Figure 3.9: User behavior model stored in CSV file format.
3.3.3 Behavior Mix Controller
As mentioned above, the Behavior Mix Controller assigns user behavior models to the Markov Session Controller based on the configured behavior mix. The models are read from the behavior files and converted into an internal representation which is passed to the Markov Session Control
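A behavior file of this shape can be read with a few lines of Python. The sketch below is an assumption-laden illustration rather than the Markov4JMeter parser: the function name `parse_behavior_file` is made up, and because the label of the exit-state column is not recoverable from the text above, the sample uses the placeholder label `EXIT`.

```python
import csv
import io

def parse_behavior_file(text):
    """Parse a CSV transition matrix into (entry_state, matrix).

    The first row holds the column labels; the first cell of each following
    row holds the state name, with '*' marking the entry state.
    matrix maps state -> {column label: probability}.
    """
    rows = [r for r in csv.reader(io.StringIO(text), skipinitialspace=True) if r]
    header = [cell.strip() for cell in rows[0][1:]]
    entry = None
    matrix = {}
    for row in rows[1:]:
        name = row[0].strip()
        if name.endswith("*"):
            if entry is not None:
                raise ValueError("more than one entry state marked")
            name = name[:-1]
            entry = name
        matrix[name] = dict(zip(header, (float(v) for v in row[1:])))
    return entry, matrix

# Sample with the probabilities of Figure 3.9; "EXIT" is a placeholder label.
SAMPLE = """, S0, S1, S2, EXIT
S0*, 0.00, 0.70, 0.10, 0.20
S1, 0.00, 0.50, 0.10, 0.40
S2, 0.10, 0.50, 0.00, 0.40
"""
```

The parser rejects a second asterisk, matching the rule that at most one entry state may be marked.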
92. enchmarking Web or application servers They include functionality to generate workload for the respective application 12 2 4 Probability and Statistics Workload Model Application Model i Sequence Generator Trace of Sessionlets J SWAT Y Trace Generator q Synthetic Workload J Request Generator System under Study Figure 2 9 Workload generation approach by Shams et al 2006 Synthetic workload is usually based on generated user sessions The included requests are then executed by a number of threads each representing a virtual user 2002 We consider the combination of the number of active sessions and the think times the means to imply the workload intensity The service demand characteristics are related to the services called within a session simulated users whose behavior was derived from CBMGs 2002 propose an integration between existing benchmarking tools and the CBMG workload characterization model According to the algorithm by Menasc et al Ballocca et al derived CBMGs from Web logs To emulate user behavior matching a class described by a certain CBMG they first chose the respective CBMG selected a session and then generated a script executing the session reconstructed from the original session from the Web log The scripts were executed by the Web stressing tool OpenSTA Based on an EFSM capturing inte
93. ent 12. The remaining operations have unimodal distributions in all experiments. As a representative example, Figure 5.14(h) shows the density of operation service.AccountService.getAccount in Experiment 7. The distribution for Experiment 11 is shown in Figure 5.14(i).
5.2.5 Distribution Fitting
In addition to the non-parametric kernel estimation, we estimated the parameter values of the normal, the 2-parameter log-normal, and the 3-parameter log-normal distributions for all operations in each experiment (see Section 5.1.5). Remarks on the goodness of fit are given in the following Sections 5.2.5.1 and 5.2.5.2.
5.2.5.1 Kernel Density Estimation
As noted in the previous Section 5.2.4, we observed that the distributions are right-skewed. It has been remarked that a normal kernel is not well suited to model the steep side (in our case the left-hand side) of skewed distributions. This is obvious in Figure, where the curve extends considerably beyond the range of response times which actually occurred and even estimates a density for values less than 0 ms.
5.2.5.2 Distribution Families
In no case is the normal distribution a good estimator of the distribution: it estimates neither the steep left side nor the long tail. Figures 5.15(c) and 5.15(d) show the density curve and the QQ plot of the parameterized normal distribution. The 2-parameter log-normal distribution does not yield good fits either, since it doesn't take the shift in t
94. ent menu to the Thread Group and select the cookies to be deleted after each iteration.
3. Add the HTTP Request Defaults from the Config Element menu and insert the data shown in Figure 3.10.
4. Add a View Results Tree for debugging purposes and select Save Response Data.
[Figure 3.10: A prepared JMeter Test Plan. Tree: Test Plan, Thread Group, HTTP Cookie Manager, HTTP Request Defaults, View Results Tree, WorkBench. The HTTP Request Defaults dialog shows the server name www.jwebhosting.net and the protocol http.]
Adding a Markov Session Controller
After installing Markov4JMeter, the new Logic Controllers Markov State and Markov Session Controller appear in the respective menu (see Figure 3.11). A Markov Session Controller needs to be added to the Thread Group. This is the root element of any probabilistic session model consisting of a number of Markov States and transitions between them. It also contains the configuration of the behavior mix and the Session Arrival Controller. A Gaussian Random Timer added to the Markov Session Controller emulates client-side think times between subsequent requests. It is highly recommended to use at most one Markov Session Controller within a Test Plan. The Markov Session Controller should be placed at the root level of the Thread Group. Especial
95. erred language and favorite category

Table      Mode   Description
item       r      Item data, e.g., price, supplier, and description
sequence   r/w    Next order number to be used
lineitem   r/w    Item quantity and price for each order position
supplier   r      Supplier data
signon     r/w    Login credentials, i.e., pairs of username and password

Table 4.5: Table description of the JPetStore database schema. Tables being written during normal operation are marked with mode r/w. Those which are solely read are marked with mode r.

5. Clean-up Phase: The script disables Tpmon, increments the experiment identifier, and stops the resource utilization monitoring. The Tomcat access log file is copied from the server.
Each time after a Test Plan has been executed, the script checks the log files for errors indicated by HTTP status codes, failed assertions, or entries in the JMeter output. The configuration of each experiment and the resulting log and monitoring data are archived.
Chapter 5 Analysis
This chapter deals with the analysis of the monitored data of 50 experiments with varying workload. The experiment design has been described in the previous Chapter 4. While examining the response time data before the actual analysis, we uncovered a problem in Tpmon when writing large amounts of data to the database. Due to this problem, we only considered those experiments in
96. es petstore mysql create user sql Creates a new MySQL user and grants the appropri ate rights for the newly created database The credentials are those to be entered in the database properties file as mentioned in Section B 1 2 3 jpetstore mysql dataload sql Inserts application data e g a default user profile categories products and items into the database B 1 2 4 1 Build and Deploy In order to build the sources ant must be called from within the directory build After a successful run the file jpetstore war exists inside the directory build wars This file needs to be copied or linked to the webapps directory of the Tomcat server together with the MySQL driver The JPetStore is now accessible through the URL http localhost 8080 jpetstore B 1 3 Monitor JPetStore with Tpmon The required steps to monitor the JPetStore with Tpmon are given in the following sections B 1 3 1 Initialize Tpmon Database The file table for monitoring sql contains the SQL query to create a database table to be used by Tpmon The query must executed in order to obtain the required database schema B 1 3 2 Configure Tpmon The file dbconnector properties ezample must be copied to src META INF gt dbconnector properties It contains the Tpmon configuration parameters Depending on the value of the property storelnDatabase Tpmon writes the monitored data into a database or the local filesystem When a database is to be used the proper
97. etRequest HttpServietResponse 14 1 21 68 T T T T T T o 20 40 60 80 100 Time ms a Trace timing diagram of the first iteration T T T T T o 5 10 15 20 Time ms b Trace timing diagram of the fourth itera tion Figure 4 6 Sample trace timing diagrams for the request type newOrderConfirm Figure a shows the diagram derived from the data of the first iteration containing the 10 operations with the highest response times Figure b shows the diagram for the fourth iteration consisting of the annotated operations Operation Request Type 1 27 3 4 5 6 7 8 9 10 N 13 index signonForm signon viewCategory viewProduct viewltem addltemToCart viewCart checkout newOrderForm newOrderData newOrderConfirm signoff struts action ActionServlet doGet HttpServletRequest HttpServletResponse struts action ActionServlet doPost HttpServletRequest HttpServletResponse persistence sqlmapdao AccountSqlMapDao getAccount String String persistence sqlmapdao ItemSqlMapDao getltem String persistence sqlmapdao ltemSqlMapDao getltemListByProduct String persistence sqlmapdao OrderSqlMapDao insertOrder Order presentation AccountBean signon presentation CartBean addltemToCart presentation CatalogBean viewCategory presentation CatalogBean viewltem presentation CatalogBean viewProduct presentation OrderBean newO
98. etric or non-parametric (1986).

When using the parametric strategy, one assumes that the sample is distributed according to a parametric family of distributions, e.g. the above-mentioned normal or log-normal distributions. In this case, the parameters are estimated from the sample data. With non-parametric density estimation, fewer assumptions are made concerning the distribution. A popular non-parametric estimation is the kernel estimator as in Equation 2.11. Based on the n observations $x_i$ from the sample data, the density at a point $x$ is estimated by summing up the weighted distances between $x$ and all observations $x_i$ within a given window width $h$ around each observation. The distances are weighted by a kernel function $K$ which satisfies the condition in Equation 2.12. If $K$ is a density function, such as the normal distribution, $\hat{f}$ is a density function as well.

The window width $h$ is also called the smoothing parameter or the bandwidth. When $h$ is small, spikes at the observations are visible, whereas with $h$ being large all detail is obscured. The corresponding figure illustrates density estimation by means of a normal kernel using a small and a large window size.

$$\hat{f}(x) = \frac{1}{nh} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \qquad (2.11)$$

$$\int_{-\infty}^{\infty} K(t)\,dt = 1 \qquad (2.12)$$

Other non-parametric estimation methods exist which adapt the smoothing parameter to the local density of the data sample, e.g. the nearest neighbor method or the variable kernel method. Details on these methods as well as a more detailed discus
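The kernel estimator described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the bandwidths, and the function names are hypothetical, and the kernel is the standard normal density.

```python
import math


def normal_kernel(t):
    """Standard normal density; it integrates to 1, as required of a kernel."""
    return math.exp(-0.5 * t * t) / math.sqrt(2.0 * math.pi)


def kernel_density(x, observations, h):
    """Kernel estimate of the density at x with window width (bandwidth) h."""
    n = len(observations)
    return sum(normal_kernel((x - xi) / h) for xi in observations) / (n * h)


# Hypothetical response times in milliseconds (not taken from the experiments).
sample = [3.7, 4.0, 4.2, 4.8, 5.1, 9.5]

# A small bandwidth produces spikes at the observations,
# a large bandwidth smooths away most of the detail.
for h in (0.2, 2.0):
    print(h, [round(kernel_density(x, sample, h), 3) for x in (4.0, 7.0, 9.5)])
```

Evaluating the estimate on a fine grid for both bandwidths reproduces the effect described in the text: the small window shows one spike per observation, the large one a single smooth hump.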
99. [Figure: Number of users vs. throughput (requests/minute) for experiments with (a) 0 to 75 users, (b) 0 to 125 users, and (c) 0 to 195 users.]

5.2.3 Descriptive Statistics

5.2.3.1 Minimum

[Figure 5.5: Users vs. Response Time Minimum (response time in ms, stretch factor) for the operations presentation.CartBean.addItemToCart and service.OrderService.getNextId, for experiments with (a) 0 to 195 users, (b) 0 to 125 users, and (c) 0 to 195 users.]

minimum response times for the operations and service.OrderService per minute with 165 to 185 users (see Figures 5
100. for the operation persistence.sqlmapdao.OrderSqlMapDao.insertOrder, all curves for the variance and standard deviation have the same shape: they increase considerably to a peak located around a PWI of 30 and decrease considerably afterwards. As a representative, the curves for the operation presentation.CatalogBean.viewItem are shown in Figure 5.12(a). The related curves showing the mean, median, and mode stretch factors for this operation are shown in Figure 5.10(b).

[Figure: three panels, each with the platform workload intensity (0 to 60) on the x-axis. Workload Intensity vs. Response Time Variance and Standard Deviation for presentation.CatalogBean.viewItem; Workload Intensity vs. Response Time Variance and Standard Deviation for persistence.sqlmapdao.OrderSqlMapDao.insertOrder; Workload Intensity vs. Response Time Skewness for service.CatalogService.getItem.]
101. getCategory in Experiment 23, executed both in database and filesystem mode. Moreover, in these experiments the sample data has a significantly higher variance than when storing the data to the local filesystem. Figures 5.2(c) and 5.2(d) show box-and-whisker plots in two different scalings, emphasizing the higher variance compared to the summarized data in Figure 5.2(e).

operation presentation.CatalogBean.viewCategory in an experiment with 20 concurrent users

5.1.3.2 Outlier Removal

In first examinations of the results, we determined that the response time distributions have a rather right-skewed shape. Thus, only normal and extreme outliers (see Sec

[Figure: Scatter plots of response times with local regression for service.CatalogService.getCategory(String); x-axis: Experiment time (minutes). (a) Sample data stored in database (N = 1103). (b) Sample data stored in filesystem (N = 3145). Further panels: Box-and-whisker plots of response times for presentation.CatalogBean.viewCategory.]
102. he application.

HTTP Request/Response Model

The Hypertext Transfer Protocol (HTTP) is a request/response protocol. An HTTP communication is initiated by an HTTP request issued by a client and results in an HTTP response from the server. Details can be obtained from RFC 2616 (Fielding et al., 1999).

[Figure 2.1: Example HTTP communication. A client requests the resource news.shtml from the server foo.com. The server responds with a message body containing an HTML file.]

We model the relevant elements of an HTTP communication by means of the function $\mathit{COMM}_{http}$: an HTTP request $req \in \mathit{REQ}_{http}$ results in an HTTP response $resp \in \mathit{RESP}_{http}$ (see Equations 2.1 to 2.3). An example is illustrated in Figure 2.1.

$$\mathit{COMM}_{http} : \mathit{REQ}_{http} \rightarrow \mathit{RESP}_{http} \qquad (2.1)$$

$$\mathit{REQ}_{http} = \mathit{METHOD}_{http} \times \mathit{URI}_{http} \times \mathit{HEADER}_{http} \times \mathit{BODY}_{http} \qquad (2.2)$$

$$\mathit{RESP}_{http} = \mathit{STATUS}_{http} \times \mathit{HEADER}_{http} \times \mathit{BODY}_{http} \qquad (2.3)$$

$\mathit{METHOD}_{http}$ is a string describing the HTTP method, e.g. GET simply requests the content of a resource referred to by the Uniform Resource Identifier (URI) $\mathit{URI}_{http}$. $\mathit{HEADER}_{http}$ is a list of name-value pairs containing meta information about the
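As a rough illustration of this model, the tuples of Equations 2.2 and 2.3 can be written as record types and the function COMM_http as an ordinary function. The sketch below is an assumption-laden toy, not part of the original work: the class and function names are invented, and the server logic merely mimics the example of a client requesting news.shtml from foo.com.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Header = List[Tuple[str, str]]  # list of name-value pairs


@dataclass(frozen=True)
class HttpRequest:
    """Element of REQ_http = METHOD x URI x HEADER x BODY."""
    method: str
    uri: str
    header: Header = field(default_factory=list)
    body: str = ""


@dataclass(frozen=True)
class HttpResponse:
    """Element of RESP_http = STATUS x HEADER x BODY."""
    status: str
    header: Header = field(default_factory=list)
    body: str = ""


def comm_http(req: HttpRequest) -> HttpResponse:
    """Toy stand-in for COMM_http: maps each request to one response."""
    if req.method == "GET" and req.uri.startswith("foo.com/news.shtml"):
        return HttpResponse("OK", [("Content-Type", "text/html")], "<html>...</html>")
    return HttpResponse("Not Found")


request = HttpRequest("GET", "foo.com/news.shtml?id=12",
                      [("Accept-Charset", "ISO-8859-1,utf-8")])
response = comm_http(request)
print(response.status, response.header)
```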
103. he response time data into account. Thus, the distribution contains data starting with 0 ms, although the sample data is generally shifted to the right.

In large part, the 3-parameter log-normal distribution fits the left sides of unimodal data samples up to quantiles greater than the upper whiskers. In most cases, the tails of the response time samples are shorter than those found in samples of the estimated distribution. Representatively, Figures 5.15(e) and 5.15(f) show the fitting of the 3-parameter log-normal distribution for the operation service.CatalogService.getItem in Experiment 12. The QQ plot emphasizes that the distribution fits the data sample approximately up to 10 ms. Then, the sample of the estimated distribution shows a longer tail. Another figure shows a QQ plot of the

[Figure 5.15: Response times of service.CatalogService.getItem. (a) Box-and-whisker plot; x-axis: Experiment time (minutes), y-axis: Response time (ms). (c) Density plot of response times and normal distribution with estimated parameters; x-axis: Response time (milliseconds); N = 11863, bandwidth = 0.1509; mean 4.796, median 4.192, approximate mode 3.697, skewness 2.457, kurtosis 6.998. (e) Density plot of response times]
104. hink times between the requests.

• Service demand characteristics include resource usage on server side required to service the requests.

Increasing workload generally implies a higher resource utilization, which leads to decreasing performance. The stretch factor is a measure of performance degradation. It is defined by the ratio of the response time at a particular workload and the response time at the minimum load. The term scalability is used to relate to the ability of a system to continue to meet its response time or throughput objectives as the demand for the software functions (workload) increases (2001).

Figure 2.5 illustrates the characteristic impact of workload on throughput and response time. The knee capacity denotes the point until which the throughput increases as the workload does while having only a small impact on the response time. Until the usable capacity is reached, the response time rises considerably while there is only little gain for the throughput. With workload continuing to increase, the throughput may even decrease. The nominal capacity denotes the maximum achievable throughput under ideal workload conditions (1991).

2.3 Workload Characterization and Generation

This section gives an overview of how synthetic workload for Web-based systems can be generated based on models of user behavior. First, a hierarchical workload model and basic terms for Web-based systems are presented. Examples on how to model user
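The definition of the stretch factor can be made concrete with a small calculation. The numbers below are invented for illustration only and are not measurements from the experiments; the function and variable names are likewise assumptions.

```python
def stretch_factor(response_time, response_time_min_load):
    """Ratio of the response time at some workload to that at minimum load."""
    if response_time_min_load <= 0:
        raise ValueError("response time at minimum load must be positive")
    return response_time / response_time_min_load


# Hypothetical mean response times (ms), keyed by number of concurrent users.
mean_rt_by_users = {1: 4.0, 50: 5.0, 100: 8.0, 150: 14.0}

# The smallest user count serves as the minimum load reference.
baseline = mean_rt_by_users[min(mean_rt_by_users)]
factors = {users: stretch_factor(rt, baseline)
           for users, rt in mean_rt_by_users.items()}
print(factors)
```

A stretch factor of 1 means no degradation relative to the minimum load; in this made-up series the response time under 150 users is 3.5 times that under a single user.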
105. [Figure 3.8: Probabilistic Test Plan and Markov4JMeter configuration dialogs. (a) Test Plan. (b) Markov Session Controller configuration dialog. (c) Markov State configuration dialog.]

The formula defining the number of allowed active sessions during the test execution must evaluate to a positive integer.

The Test Element class contains the implementation of the session model composition and execution. In each iteration, i.e. each time a new session is to be simulated, the Markov Session Controller requests a behavior from the Behavior Mix Controller and requests the Session Arrival Controller to start the execution of this session. An iteration ends when the exit state of the behavior model (see Section 3.3.2) is reached.

3.3.1.2 Markov State

Like the implementation of the Markov Session Controller, the Markov State is divided into a Test Element class and a GUI class. Any subtree of JMeter Test Elements can be added to a Markov State, representing the related deterministic state machine on the protocol layer of the application model.

The configuration dialog of the Test Element allows the definition of the state transitions with guards and actions using JMeter's variables and functions. The list of transitions is refreshed as the list of Markov States underneath the r
106. iagrams for JPetStore requests (cont.)

List of Tables

2.1 Mean and variance for discrete and continuous probability distributions
3.4 Configuration of HTTP Request Samplers
3.5 Configuration of guards and actions
4.1 Identified service and request types of JPetStore
4.2 Overview of varying parameters for all experiments
4.3 Response time statistics with different monitoring configurations
4.4 Identified monitoring points and coverage of request types
4.5 Table description of JPetStore database schema
A.1 Available JMeter Test Elements
B.1 Response time statistics and request coverage of JPetStore operations

Chapter 1 Introduction

1.1 Motivation

Today's enterprise applications are large-scale multi-tiered software systems involving Web servers, application servers, and database servers. Examples are banking and online shopping systems or auction sites. Especially if these systems are accessible through the Internet and thus exposed to unpredictable workload, they must be highly scalable. The availability of such systems is a critical requirement since operational outages are cost-intensive.

Anomaly detection is used for failure detection and diagnosis, not only in large software systems. A common approach for detecting anomalies is building a model of a
107. ident that selecting JMeter as the basis was the right decision, since we experienced it as a well-maintained and active open source project. Furthermore, it was the right decision to integrate our approach by implementing Markov4JMeter, allowing the convenient definition of probabilistic session models within an ordinary JMeter test plan. After spending a lot of time and effort on implementing Markov4JMeter, we created a stable product which forms the basis of our case study and is helpful to a number of other users.

Case Study

Originally, the Java Pet Store was considered the sample application to be used. It turned out that it could not cope with the workload intensities required in the case study. Moreover, the alternatives TPC-W (Transaction Processing Performance Council) and RUBiS were outdated. Finally, we selected the JPetStore, which in the meantime is being used in other university theses as well.

We spent a long time setting up and configuring the final experiment environment, including JPetStore, Tpmon, and Markov4JMeter, as well as developing the software supporting the execution and the analysis of the experiments. This evolved into a reliable environment for performance evaluations. Additionally, we experienced that when performing timing behavior evaluations like ours, a large number of parameters need to be configured appropriately.

The characteristics of the operation presentation.CartBean.addItem
108. ied version of the one defined previously. Given an active traces history $A$ and a step function $\mathit{activeTraces}_A : \mathbb{N} \rightarrow \mathbb{N}$ as defined in Section 4.4.1, the function $\mathit{pwi}_A$ is defined for an execution $e = (o, st, rt)$ in Equation 6.2, which expresses $\mathit{pwi}_A(e)$ in terms of $\mathit{activeTraces}_A(t)$, the start time $st$, and the response time $rt$ of the execution.

In order to estimate the expected response time for some workload intensity, we define the function $\mathit{wnf}_o : \mathbb{R} \rightarrow \mathbb{R},\ w \mapsto \mathit{wnf}_o(w)$. For a given workload intensity $w$, $\mathit{wnf}_o(w)$ is a workload intensity normalization factor for the response time threshold that applies to a workload intensity of 1. The function $\mathit{wnf}_o$ is realized by a polynomial function of any order whose parameters are learned from tuples $(rt, \mathit{pwi}_A(e))$, which are computed for every execution $e = (o, st, rt)$.

[Figure 6.3: WISAD scenario. (a) Anomaly detection with threshold 110; x-axis: Experiment time (seconds); 16 observations (8.0 percent) classified as anomaly, 184 classified as normal. (b) Threshold vs. anomaly detection quality (total error rate, type 1 error rate, type 2 error rate).] With a dynamic threshold, the response times of the anomalies can be correctly dis
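The role of the normalization factor can be sketched as follows. This is a hypothetical illustration, not the prototype's implementation: the polynomial coefficients, the base threshold, and all names are assumptions, and the coefficients are simply chosen so that the factor equals 1 at a workload intensity of 1.

```python
def wnf(w, coefficients):
    """Evaluate a polynomial normalization factor with Horner's scheme.

    coefficients[i] is the (assumed, already learned) factor of w**i.
    """
    result = 0.0
    for c in reversed(coefficients):
        result = result * w + c
    return result


def is_anomaly(response_time, pwi, base_threshold, coefficients):
    """Compare a response time against a workload-dependent threshold.

    base_threshold is the threshold that applies to a workload intensity of 1.
    """
    return response_time > base_threshold * wnf(pwi, coefficients)


# Hypothetical first-order polynomial: wnf(w) = 0.9 + 0.1 * w, so wnf(1) = 1.
COEFFS = [0.9, 0.1]

# The same response time is anomalous under low load but normal under high load.
print(is_anomaly(150.0, 1.0, 110.0, COEFFS))
print(is_anomaly(150.0, 10.0, 110.0, COEFFS))
```

With a static threshold, both observations would be classified identically; scaling the threshold by the factor lets the classification follow the workload intensity.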
109. ign Patterns: Elements of Reusable Object-Oriented Software. Amsterdam: Addison-Wesley.

ISO/IEC (2001). ISO/IEC 9126: Software engineering - Product quality.

R. Jain (1991). The Art of Computer Systems Performance Analysis. New York: John Wiley & Sons.

E. Kiciman (2005). Using Statistical Monitoring to Detect Failures in Internet Services. PhD thesis, Stanford University.

E. Kiciman and A. Fox (2005). Detecting application-level failures in component-based internet services. IEEE Transactions on Neural Networks, 16(5):1027-1041.

G. Kiczales, J. Lamping, A. Mendhekar, C. Maeda, C. Lopes, J.-M. Loingtier, and J. Irwin (1997). Aspect-Oriented Programming. In M. Aksit and S. Matsuoka, editors, Proceedings of the European Conference on Object-Oriented Programming (ECOOP '97), Jyväskylä, Finland, 9-13 June 1997, volume 1241 of Lecture Notes in Computer Science, pages 220-242. Berlin: Springer.

H. Koziolek (2007). Introduction to Performance Metrics. In I. Eusgeld, F. Freiling, and R. Reussner, editors, To appear in Proceedings of the DMETRICS Workshop, Nov. 9-10, 2005, Lecture Notes in Computer Science. Berlin: Springer.

D. Menascé, V. A. F. Almeida, R. Riedi, F. Ribeiro, R. Fonseca, and W. Meira Jr. (2000). In Search of Invariants for E-Business Workloads. In Proceedings of the 2nd ACM Conference on Electronic Commerce (EC '00), Denver, CO, USA, November 3-5, 1999, pages 56-65. ACM Press.

D. A. Menasc
110. in the JPetStore were determined based on an iterative process (see Section 4.3). Approaches for an application-generic automatic identification of suitable monitoring points in enterprise applications would be valuable.

A node with a single physical CPU core was used as the application server. By executing equally configured experiments with a server node equipped with multiple physical CPU cores and performing a similar analysis, the results could be compared with those we derived.

The platform workload intensity (PWI) has been computed based on historical data using an implementation in R and C. A real-time computation could be implemented. This would require additional effort in minimizing the influence on server performance. We did not optimize the PWI parameters window and step size. An evaluation of these could be future work.

Appendix A Workload Driver

A.1 Available JMeter Test Elements

[Table A.1: Available JMeter Test Elements]
111. ion and a behavior mix and intensity configuration for specific experiment runs.

config.loadstore (m): Load and Store Configuration
The workload driver must provide functionality to load and store configurations such that they can be reused. This includes the definition of an appropriate file format.

3.1.5.1 Application and User Behavior Configuration

config.aub.ifaceDetails (m): Application Interface Definition
A description format must be defined to specify the details on the HTTP communication between client and application under test for all provided interactions.

config.aub.dynValueAssignment (s): Dynamic Value Assignment
Means should be provided to dynamically assign values for HTTP request parameters. This includes selecting values which depend on the last response as well as functionality to select values from prepared data, e.g. login credentials for user authentication.

config.aub.naviBehaviorModel (m): User Behavior Model
A description format must be defined to model user sessions in terms of the issued requests within a session. This must include probabilistic elements to model weighted alternatives. Varying the assigned probabilities allows for modeling the behavior of different behavior classes.

config.aub.userThinkTimesConst (m): Client-side Think Times (constant)
A constant value to be used as the client-side think time between two subsequent requests of the same user must be configurable.

config.aub
112. [Table A.1 (continued): Available JMeter Test Elements (configuration elements, timers, assertions, listeners, logic controllers, samplers)]
113. ler. A single Behavior Mix Controller is available, which is implemented using the singleton pattern (Gamma et al., 2000).

3.3.4 Session Arrival Controller

According to Section 3.2.3.2, the Session Arrival Controller provides the methods enterSession and exitSession, which are called by the Markov Session Controller before starting to execute a new session. Depending on the current number of active sessions and the configured active sessions formula (see Section 3.2.2.4), a thread might get blocked until the session entrance is granted. The Session Arrival Controller is also implemented using the singleton pattern.

3.4 Using Markov4JMeter

This section contains a step-by-step description of how a simple probabilistic Test Plan for the JPetStore (see Section 2.6) is created. This Test Plan can then be executed just like any ordinary JMeter Test Plan. The JMX file of the Test Plan and the associated files can be found in the directory examples/jpetstore of the Markov4JMeter release. Appendix A.2 describes how to install Markov4JMeter.

Preparing the Test Plan

By performing the following steps, the basic Test Plan is created:

1. Add a Thread Group to the empty Test Plan and select to stop the test when a sampler error occurs. Set the number of threads to 5 and the loop count to 5 without using the Scheduler.

2. Add an HTTP Cookie Manager from the Config Elem
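The blocking behavior of the Session Arrival Controller can be illustrated with a condition variable. The following Python sketch is not the Markov4JMeter implementation (which is written in Java); the class layout, the `allowed_sessions` callback standing in for the active sessions formula, and the timeout parameter are assumptions made for illustration.

```python
import threading


class SessionArrivalController:
    """Limits the number of concurrently active sessions (illustrative sketch)."""

    def __init__(self, allowed_sessions):
        # allowed_sessions: hypothetical callable mapping the elapsed
        # experiment time (minutes) to the allowed number of active sessions.
        self._allowed = allowed_sessions
        self._active = 0
        self._cond = threading.Condition()

    def enter_session(self, elapsed_min, timeout=None):
        """Block until entrance is granted; return False if the wait timed out."""
        with self._cond:
            granted = self._cond.wait_for(
                lambda: self._active < self._allowed(elapsed_min), timeout)
            if granted:
                self._active += 1
            return granted

    def exit_session(self):
        """Release one session slot and wake up waiting threads."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()


# Hypothetical formula: two sessions at the start, one more every ten minutes.
controller = SessionArrivalController(lambda minutes: 2 + int(minutes // 10))
print(controller.enter_session(0), controller.enter_session(0))
```

A thread simulating a user would call `enter_session` before executing a session and `exit_session` afterwards; threads that exceed the currently allowed number wait until a slot is released.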
114. ly. Markov Session Controllers must not be nested.

Adding Markov States

After adding four Markov States named Index, Sign On, View Category, and Sign Off to the Markov Session Controller, the Test Plan has the corresponding tree structure. Markov States must be placed directly underneath a Markov Session Controller

[Figure 3.11: After installing Markov4JMeter, the Logic Controller menu shows two new entries: Markov State and Markov Session Controller.]

Markov State     Name                 Path                          Method   Parameters (Name = Value)
Index            index.shtml          JPSROOT/index.shtml           GET
Sign On          signonForm.shtml     JPSROOT/signonForm.shtml      GET
                 signon.shtml         JPSROOT/signon.shtml          POST     username = j2ee, password = j2ee, submit = Login
View Category    viewCategory.shtml   JPSROOT/viewCategory.shtml    GET      categoryId = REPTILES
Sign Off         signoff.shtml        JPSROOT/signoff.shtml         GET

Table 3.4: Data to fill in to the HTTP Request configuration dialogs. JPSROOT needs to be replaced with servlets/jpetstore5/shop. Also, the check boxes Redirect Automatically and Follow Redirects must be selected.

and must especially not be nested n
115. mance. This section contains a description of those aspects relevant for our work.

Overview

Traces are defined by a Test Plan, which is a hierarchical and ordered tree of Test Elements. The number of users to be emulated as well as other global parameters, such as the duration of the test (i.e. the trace generation phase), are configured within a Thread Group element forming the root of a Test Plan. The core Test Elements are Logic Controllers and Samplers. Logic Controllers group Test Elements and define the control flow of a Test Plan. Samplers are located at the leaves of the tree and send the actual requests. Examples for Logic Controllers are If and While Controllers, which have the intuitive meaning known from programming languages. HTTP Request and FTP Request are examples for Samplers.

JMeter provides a graphical user interface (GUI) for creating a Test Plan and executing the test. Existing Test Plans can also be executed in non-GUI mode in order to save system resources.

[Figure 2.15: JMeter Architecture. The Apache JMeter Engine initializes and controls the test, executes instances of the Test Element classes (e.g. samplers, assertions), and reads its configuration from the stored Test Plan; the GUI classes create and modify the Test Elements.]
116. mor already when solely the application entry points are instrumented. The minimum values indicate a constant overhead of about 1-2 ms. Each activated monitoring point adds an additional overhead, e.g. the median response time of the request type addItemToCart with 119 monitored operation calls increases from 5.9 ms with annotated application entry points to 68 ms using full instrumentation.

4.3.2 Identification of Monitoring Points

By executing four experiments, we identified 17 JPetStore operations and the two application entry points to annotate by iteratively reducing the number of annotations.

This relates to the version of Tpmon used in our experiments. In the meantime, Tpmon has been optimized due to this reason and those mentioned in Chapter

[Trace timing diagrams; x-axis: Time (ms). Labeled operations include jpetstore.presentation.AccountBean.isAuthenticated, jpetstore.persistence.sqlmapdao.OrderSqlMapDao.insertOrder(jpetstore.domain.Order), jpetstore.service.OrderService.insertOrder(jpetstore.domain.Order), jpetstore.presentation.OrderBean.newOrder, and struts.action.ActionServlet.doGet(HttpServletRequest, HttpServletResponse).]
117. must be added to the file bin/catalina.sh, with the path to aspectjweaver.jar being set appropriately, in order to register the AspectJ weaving agent on server startup:

JAVA_OPTS="-javaagent:<path to aspectjweaver.jar>"

B.1.3.5 Build and Install Tpmon Control Servlet

Tpmon can be controlled by a servlet included in the directory tpmon-control-servlet of the Tpmon source tree. It is installed by copying the file tpmon-control-servlet.war to the webapps directory of the Tomcat installation and can be accessed through the URL http://localhost:8080/tpmon-control-servlet/.

B.1.3.6 Annotate JPetStore

If only specific methods of the JPetStore shall be monitored, the source code needs to be instrumented. In each class containing methods to be monitored, the import line for tpmon.aspects must be added. The annotation @TpmonMonitoringProbe must directly precede each method to be monitored. The Tpmon binary tpmonLTW.jar must be copied or preferably linked to the folder devlib of the JPetStore sources before rebuilding them.

B.2 Trace Timing Diagrams

[Trace timing diagrams; labeled operations include jpetstore.persistence.DaoConfig.getDaoManager, jpetstore.presentation.AccountBean.isAuthenticated, and jpetstore.domain.Account.setListOption(boolean).]
118. n experiment can be varied based on mathematical formulae to be specified, in order to execute long-term experiments with varying workload.

Implementation

The approach has been implemented by extending the open source workload generator JMeter. We added new Logic Controllers which allow the definition of a probabilistic session model within a usual JMeter Test Plan. This includes the application model, the definition of a user behavior mix, and the variation of the user count based on mathematical formulae.

This extension, named Markov4JMeter, has been released under an open source license (2007). It can be used with any protocol supported by JMeter, e.g. HTTP, LDAP, and Web services. We know of companies which have been successfully using Markov4JMeter for more than two months.

Case Study

The sample application JPetStore has been exposed to varying workload in a large number of experiments in order to obtain response time data. Based on this, we statistically analyzed the relation between varying workload intensity and response time statistics.

Methodology

We carefully designed and executed a large number of experiments. JMeter extended by Markov4JMeter was used to generate the workload. The response times were monitored using Tpmon. We set up an automatic execution environment in order to obtain reproducible and meaningful results. This can be reused for similar experiments.

In order to define the workload to b
119. n functionality. Examples are logging, error handling, and performance measurement. To a certain extent, it may be possible to encapsulate those concerns by procedure abstraction, but often this still leads to code which is hard to maintain.

Aspect-Oriented Programming (AOP) is a concept which strives to separate cross-cutting concerns from application functionality as far as possible (1997). The cross-cutting concerns are called aspects and are expressed in a form which is separate from the application code. Positions in the code to which aspects are to be woven are called point cuts. A so-called aspect weaver automatically combines the application and the aspects into binaries. The procedure of enriching an application with aspects using AOP follows Kiczales et al. (1997).

AspectJ (Eclipse Foundation, 2007) is an AOP extension to the Java programming language. The AspectJ weaver allows for weaving aspects into an application at compile time, post-compile time, and load time (2005). Independent of the time the weaving takes place, the AspectJ weaver produces equal Java binaries.

Using compile-time weaving, the AspectJ compiler weaves the aspects to the defined point cuts inside the application sources. When using post-compile-time weaving, the aspects are woven into the already existing application binaries. Thus, post-compile-time weaving is also denoted as binary weaving. In order to use load-time weaving, an AspectJ we
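The idea of keeping a cross-cutting concern such as performance measurement out of the application code can be illustrated, by analogy only, with a Python decorator. This is not AspectJ and not how Tpmon works; the decorator name, the measurement list, and the sample function are hypothetical.

```python
import functools
import time

# Collected (operation name, response time in ms) tuples; a hypothetical
# stand-in for a monitoring database or log file.
MEASUREMENTS = []


def monitoring_probe(func):
    """Wrap func so that its response time is recorded on every call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000.0
            MEASUREMENTS.append((func.__qualname__, elapsed_ms))
    return wrapper


@monitoring_probe
def get_category(category_id):
    """Application functionality, free of any measurement code."""
    return {"id": category_id}


print(get_category("REPTILES"), len(MEASUREMENTS))
```

The application function contains no timing logic; the measurement concern is attached at a single, declarative position, which loosely corresponds to annotating a method that an aspect weaver then instruments.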
120. nd times out after a certain period of inactivity (Arlitt et al., 2001). Each session has a unique identifier and is associated with state information about the user, e.g. the items in the shopping cart. The identifier is passed to the server on each interaction, e.g. by using client-side cookies or by passing the identifier as a parameter value.

[Figure 2.7: Customer Behavior Model Graph (Menascé et al., 1999)]

Modeling User Behavior

Synthetic user sessions may be based upon captured real traces or on analytic workload models (1998). Primarily in research papers, workload generators are presented which are based upon mathematical workload models such as Markov chains or queueing networks. In this section, we will present two approaches by Menascé et al. and (2006).

Customer Behavior Model Graphs

A first-order Markov chain is a probabilistic finite state machine with a dedicated entry and a dedicated exit state. Each transition is weighted with a probability. The sum of probabilities of outgoing transitions of a state must be 1. Given the current state, the next state is randomly selected solely based on the probabilities associated with the outgoing transitions, which are stored in a transition matrix.

Menascé et al. (1999) defined a Customer Behavior Model Graph (CBMG) to formally model the user behavior in Web-based
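How a session is generated from such a first-order Markov chain can be sketched as follows. The states and probabilities are invented for illustration and are not those of any CBMG from the literature; the function names are assumptions as well.

```python
import random

# Hypothetical transition matrix: for each state, the outgoing transitions
# and their probabilities. "Entry" and "Exit" are the dedicated states.
TRANSITIONS = {
    "Entry": {"Browse": 0.6, "Search": 0.4},
    "Browse": {"Browse": 0.3, "Search": 0.3, "Exit": 0.4},
    "Search": {"Browse": 0.5, "Search": 0.3, "Exit": 0.2},
}


def check_matrix(transitions):
    """Verify that the outgoing probabilities of every state sum to 1."""
    return all(abs(sum(row.values()) - 1.0) < 1e-9
               for row in transitions.values())


def simulate_session(transitions, rng, entry="Entry", exit_state="Exit"):
    """Walk the chain from the entry state until the exit state is reached."""
    state = entry
    visited = [state]
    while state != exit_state:
        targets = list(transitions[state])
        weights = [transitions[state][t] for t in targets]
        # The next state depends only on the current state.
        state = rng.choices(targets, weights=weights, k=1)[0]
        visited.append(state)
    return visited


rng = random.Random(42)  # fixed seed for reproducible sessions
session = simulate_session(TRANSITIONS, rng)
print(session)
```

Varying the probabilities in the matrix yields sessions of different behavior classes, e.g. users who mostly browse versus users who mostly search.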
121. ned and an example of use is presented.

• Chapter 4 contains the description of the experiment design. A probabilistic workload driver profile for the sample application JPetStore according to the design in Chapter 3 is created. Appropriate monitoring points have been determined and a workload intensity metric is defined. Moreover, the configuration of the machines, the software, and the experiment runs is described.

• Chapter 5 deals with the statistical analysis of the data obtained in the experiments. After outlining the analysis methodology, we give a detailed description of the results and conclude with a summary and discussion of these.

• Based on the results of Chapter 5, a workload-intensity-sensitive anomaly detection prototype has been developed. This is part of Chapter 6.

• Chapter 7 contains the conclusion of our work, including a summary, a discussion, and the presentation of future work.

• The Appendix contains additional resources which are referenced by the related chapters of this document.

Chapter 2 Foundations

This chapter contains the foundations of our work. In Section 2.1, we describe the considered system model. An introduction into performance metrics and scalability is given in Section 2.2. Section 2.3 deals with workload characterization and generation for Web-based systems. An introduction into the theory of probability and statistics, as far as this is relevant for our work, is presented in Sec
122. ng analysis, sets of components which are highly correlated with failures are discovered in order to determine the root cause. (2005) present an approach for detecting anomalies of internal system behavior in terms of component interaction. Based on the framework mentioned in the previous paragraph (2002), they capture component interactions and path shapes. Two components or component classes interact with each other if one uses the other to service a request. A path shape is defined as an ordered set of logical components used to service a client request. The approach is divided into three phases: observation, learning, and detection. While observing and learning, the path shapes and component interactions are derived from monitored data. A reference model of the application's normal behavior in terms of the above-mentioned characteristics is built. Sets of path shapes are modeled by a probabilistic context-free grammar (PCFG). In the detection phase, anomalies in the current behavior are searched for with respect to the reference model, using anomaly scores to determine whether an observed shape is anomalous. Based on the correlations between input and internal system behavior variables present an approach for anomaly detection. Both sets of variables are transformed into a number of correlating pairs. Based on a threshold, the system variables are divided into those having a highly correlated input partner and those being uncorrelated
123. ng core JMeter Test Elements: • Thread Group: The thread group is configured to stop the entire test as soon as a sampler error occurs. The provided scheduler functionality is used to specify the test duration. • HTTP Cookie Manager: Cookies are selected to be cleared after each iteration. This is necessary since a new user shall be simulated in each iteration and all former state information must be reset. • HTTP Request Defaults: This configuration element globally sets the server name and the port number, since they have the same value for all HTTP requests. • Constants: All identifiers for categories, products, and items used in the JPetStore are stored in three dedicated constant variables as comma-separated lists. • Variables: The two variables signedOn and itemInCart are defined according to the variables of the same name used in the session layer of the application model (see Chapter 4, Experiment Design, Section 4.1.2). These variables are private to each thread and are initialized with false when an iteration starts. • Counter: A counter named userId is shared by all threads and incremented in each iteration. After reaching a specified maximum value it is reset to 0. • Listeners: Two listeners, namely Error Writer and Assertion Results, log erroneous HTTP status codes and failing assertions in order to determine the reason for aborted tests. 4.1.4.2 Markov4JMeter Session Model. A Markov Session Controller forms the root
124. ng workload intensity. This is an extension of the scenario in Figure 6.1, letting the mean response time increase over time while letting the inter-start time decrease similarly. [Figure 6.2: Anomaly detection scenario with increasing workload intensity. (a) Increasing workload intensity effects increasing response times and smaller inter-start times; x-axis: experiment time (seconds); legend: anomalies and normal observations. (b) The minimum error rate of PAD is 8% for any threshold > 176; x-axis: threshold.] Applying PAD to this scenario yields the error curve in Figure 6.2(b). The minimum error rate is 8% for threshold values greater than 176. But with these values none of the anomalies is detected. 6.3 Workload Intensity sensitive Anomaly Detector. Our approach, the Workload Intensity sensitive Anomaly Detection (WISAD), explicitly considers varying workload intensity in order to decrease the error rate in scenarios like the one presented in Example 2. WISAD extends PAD (see Equation 6.1) by including two additional functions: 1. The function pwi : ℕ × ℕ → ℝ gives the average number of active traces during an execution e and is a slightly modif
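The effect described above, a minimum error rate that is only reached by detecting nothing, can be reproduced with a small Python sketch. It assumes that the plain detector flags an execution as anomalous when its response time exceeds a fixed threshold; the data are synthetic (184 normal observations with rising response times and 16 early anomalies) and are not the samples behind Figure 6.2.

```python
def error_rate(response_times, is_anomaly, threshold):
    """Fraction of executions misclassified by a fixed-threshold detector."""
    errors = 0
    for rt, truth in zip(response_times, is_anomaly):
        flagged = rt > threshold  # assumed detection rule, see lead-in
        if flagged != truth:
            errors += 1
    return errors / len(response_times)


# Synthetic scenario: normal response times rise with the workload
# intensity, while the anomalies (100 ms) occur early, when normal
# executions still take only about 50 to 60 ms.
normal = [50 + 0.6 * i for i in range(184)]
anomalous = [100.0] * 16
response_times = normal + anomalous
labels = [False] * len(normal) + [True] * len(anomalous)

if __name__ == "__main__":
    # A threshold above every observation misses all 16 anomalies.
    print(error_rate(response_times, labels, threshold=176))
    # A threshold low enough to catch them flags many late normal executions.
    print(error_rate(response_times, labels, threshold=90))
```

In this synthetic setting no fixed threshold separates the early anomalies from the late normal executions, which is the motivation given for making the threshold depend on the workload intensity.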
125. ngle class should then be relative to the total number of emulated users. This requirement extends config bmwi userClassSingle m. 3.1.6 User Interface. ui workload execute m: Workload Execution User Interface. The workload driver must provide a command line option to allow batch execution of previously configured workload (see use case uc workload execute). 3.2 Design. This section contains the high-level design of the workload driver, which is based on the requirements definition presented in Section. In Section we refine the system model of the applications to be supported and define basic terms. The workload configuration data model and the design of the architecture and the execution model follow in Sections and. Chapter 3 Probabilistic Workload Driver. [Figure 3.2: Data Model of Workload Configuration. Class diagram of the workload configuration data model, including the basic elements application model, a set of user behavior models related to the application model, as well as a definition of the user behavior mix and the workload intensity.] 3.2.1 System Model. We consider a system model which is based on the system model presented in Section 2.1 and on the hierarchical workload model presented in Section. An EIS provides a number of services. An invocation of a service involves the invocation of possibly more than one request on protocol level. A series of consecutive an
126. nsists of a non-deterministic finite state machine. The states are denoted as application states. The transitions are called application transitions. Each application state is related to a service provided by the application, e.g., adding an item to a shopping cart. An application transition between two states represents a valid sequence of service calls within a session. No application state is marked as the state machine's entry or exit state. Transitions can be labeled with guards and actions. A guard is a boolean expression stating that a transition can only be taken if the expression evaluates to true. An action is a list of statements, such as variable assignments or function calls, which are executed when a transition is taken. Variables and functions used within guards and actions are assumed to be globally defined outside the application model. The session layer in Figure contains the states S0, S1 and S2, using the variables a, b and c in the guards and actions. For example, a transition from state S2 to S1 is only possible if the guard comparing b with 0 evaluates to true. When this transition is taken, b is assigned the value 1. Protocol Layer. Each application state has an associated deterministic finite state machine on the protocol layer. The states are denoted as protocol states. The transitions are called protocol transitions. (a) User behavior model Bao (b) User beha
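Guards and actions on application transitions can be sketched in Python as follows. The class and function names are hypothetical, and the guard `b == 0` is an assumption, since the comparison operator of the example is not legible in this extract; only the action (assigning 1 to b) is taken from the text.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Variables = Dict[str, int]


@dataclass
class Transition:
    source: str
    target: str
    guard: Callable[[Variables], bool]    # boolean expression over variables
    action: Callable[[Variables], None]   # statements run when taken


def enabled(transitions: List[Transition], state: str,
            variables: Variables) -> List[Transition]:
    """Return the outgoing transitions of `state` whose guard is true."""
    return [t for t in transitions
            if t.source == state and t.guard(variables)]


def take(transition: Transition, variables: Variables) -> str:
    """Execute the action of an enabled transition and return the new state."""
    if not transition.guard(variables):
        raise ValueError("guard evaluates to false")
    transition.action(variables)
    return transition.target


# Transition S2 -> S1: guard assumed to be b == 0, action sets b to 1.
s2_to_s1 = Transition("S2", "S1",
                      guard=lambda v: v["b"] == 0,
                      action=lambda v: v.update(b=1))

if __name__ == "__main__":
    variables = {"a": 0, "b": 0, "c": 0}
    print(take(s2_to_s1, variables), variables)
```

Keeping the variables in a dictionary outside the transition objects mirrors the statement that variables are defined globally, outside the application model.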
127. ntroduction 1.2 Goals. Our work is divided into three parts, covering workload generation, component response time analysis in a case study, and the development of an anomaly detection prototype based on the findings from the analysis part. These three parts are outlined in the following paragraphs. Workload Driver. To systematically obtain workload-dependent performance data from Web-based applications, workload is usually generated synthetically by executing a number of threads emulating users accessing the application under test. A common approach is to replicate a number of pre-recorded traces to multiple executing threads. Tools exist which allow for capturing and replaying those traces. A major drawback of this approach is that only a limited number of traces is executed, instead of dynamically generating valid traces which cover the application in a more realistic manner. In our case we need such realistic workload, since we want to obtain response time data of all components, and thus the application functionality must be covered much more thoroughly. We aim at developing a technique which leads to more realistic workload than is the case with the classical capture-and-replay approach. The intended procedure is as follows. First, the possible usage sequences are modeled for an application. This model may be enriched with probabilities to weigh possible usage alternatives. Based on this model, valid traces are generated and
128. oCart and service.OrderService.getNextId. 5.2.3.7 3rd Quartile. Except for a smaller slope for PWIs up to 0.19 (10 threads), the 3rd quartile stretch factors increase linearly up to a PWI of 5.0, reaching values between 1.7 (service.OrderService.getNextId) and 3.0 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder). A PWI greater than 5.0 heavily impacts the 3rd quartile. The operations service.OrderService.getNextId, service.OrderService.insertOrder, and presentation.OrderBean.newOrder reach maximum values of 449.60, 157.99, and 70.40. The values for the persistence layer operations remain between 2.30 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder) and 7.28 (persistence.sqlmapdao.ItemSqlMapDao.getItemListByProduct). As observed for the maximum and the mean before, the 3rd quartile temporarily decreases for some operations after having reached a local maximum. Figures 5.11(a) and 5.11(b) show the curves of the operation presentation.CartBean.addItemToCart for the PWI ranges 0.03 to 2.22 and 0.03 to 18.85. Figure 5.11(c) emphasizes the impact of the workload intensity on the quartile stretch factors of the operation service.OrderService.getNextId, including a peak in the values observable in all three quartiles. Figure 5.10(c) shows these curves for the operation presentation.CatalogBean.viewItem, with only the 3rd quartile containing a peak. 5.2.3.8 Variance and Standard Deviation. Unless
129. oad Intensity sensitive Anomaly Detector (WISAD). WISAD considers an execution anomalous if its response time exceeds a threshold which is dynamically determined based on historical data, the workload intensity, and a tolerance factor. A prototype of WISAD has been implemented and applied to synthetically generated samples of operation executions with varying workload intensities. Its gain has been illustrated by comparing its performance with that of a basic anomaly detector which does not consider varying workload. 7.2 Discussion. The following sections include a discussion of design alternatives and experiences made throughout our work. Generally, we gained insight into a variety of related topics and applications, including applied statistics and density estimation, performance evaluation and anomaly detection for enterprise applications, and aspect-oriented programming. Workload Driver. When modeling workload using the presented approach, application transitions can be labeled with guards corresponding to pre-conditions. However, these conditions are not regarded in the user behavior models. During workload execution, only those application transitions are considered for which the guard evaluates to true (see Section 3.2.3.4). As a consequence, the probabilities from the transition matrix of the user behavior model do not in every case relate to those actually used when the workload is executed. Doing so would have require
130. [Figure 2.4: Timing metrics on system and on application level. (a) Both end-to-end response time definitions and related timing metrics (1991), with the intervals reaction time, think time, response time 1 and response time 2. (b) Operation response and execution times.] On application level, response times can be related to operation calls. In this case, the response time is the time interval between the start and the end of an operation execution. A second metric is the operation's execution time, which is its response time minus the response times of other operations called in between. Figure 2.4(b) illustrates both operation call metrics in a sequence diagram. Throughput. Throughput is the rate at which a system or a system resource handles tasks. The maximum throughput a system can achieve is referred to as its capacity (1991). The respective unit depends on the system and the measurement objective. For example, the throughput of a Web server may be expressed by the number of requests served per second. The throughput of a network device can be expressed by the transmitted data volume per time unit. Think Time and Reaction Time. Th
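The two operation-level metrics described above can be made concrete with a short Python sketch. The dictionary layout (start, stop, and a list of directly called operations) is a hypothetical representation chosen for illustration; only the definitions themselves come from the text.

```python
def response_time(op):
    """Time between the start and the end of an operation execution."""
    return op["stop"] - op["start"]


def execution_time(op):
    """Response time minus the response times of directly called operations.

    Operations called further down the call tree are already contained in
    the response times of the direct callees, so they are not subtracted
    a second time.
    """
    return response_time(op) - sum(response_time(c) for c in op["calls"])


# Operation a runs from 0 to 10 ms and calls b (2 to 5 ms) and d (6 to 8 ms);
# b itself calls c (3 to 4 ms).
c = {"start": 3, "stop": 4, "calls": []}
b = {"start": 2, "stop": 5, "calls": [c]}
d = {"start": 6, "stop": 8, "calls": []}
a = {"start": 0, "stop": 10, "calls": [b, d]}

if __name__ == "__main__":
    for name, op in (("a", a), ("b", b), ("c", c), ("d", d)):
        print(name, response_time(op), execution_time(op))
```

Summed over all operations of a trace, the execution times add up to the response time of the trace's entry operation, which is a useful consistency check on monitored data.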
131. [Figure 2.17: Sample system instrumented with Tpmon (a) and how an annotated operation is woven (b). (a) In an execution environment, three components a, b and c each provide services which are monitored by means of Tpmon using the AOP concept. Tpmon stores the monitored data into the database (SQL DBMS). (b) Component a calls operation b() of component b. This operation contains a point cut defined by the annotation TpmonMonitoringProbe. As defined in the description of the respective aspect probeMethod, Tpmon saves the current time before and after b() is executed. The listing in panel (b) shows the annotated operation public void b(), the pointcut probeMethod on the execution of operations annotated with TpmonMonitoringProbe, and an around advice which records start = getTime(), calls proceed() to actually execute b(), records stop = getTime(), and calls insertMonitoringData(getOperationName(), start, stop).] [Figure 2.18: An aspect weaver weaves the aspects and the functional part of an application into a single binary (following Kiczales et al., 1997).] Aspect-Oriented Programming. Often application code is tangled with cross-cutting concerns which are not directly responsible for applicatio
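The timing logic of the around advice described for Figure 2.17(b) can be mimicked in Python with a decorator. This is only a rough analogue for illustration: Tpmon itself uses aspect weaving on Java code, and the names `monitoring_probe` and `MONITORING_LOG` below are hypothetical stand-ins for the annotation and the monitoring data store.

```python
import functools
import time

# Hypothetical in-memory stand-in for the database or file system
# into which monitored data would be written.
MONITORING_LOG = []


def monitoring_probe(func):
    """Record the start and stop time of every call to `func`."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter_ns()
        try:
            # Corresponds to proceed(): actually execute the operation.
            return func(*args, **kwargs)
        finally:
            stop = time.perf_counter_ns()
            MONITORING_LOG.append((func.__qualname__, start, stop))
    return wrapper


@monitoring_probe
def b():
    """Sample operation whose executions are monitored."""
    return "done"


if __name__ == "__main__":
    b()
    name, start, stop = MONITORING_LOG[-1]
    print(name, stop - start, "ns")
```

As with the annotation-based instrumentation, the body of `b` stays free of monitoring code; the cross-cutting concern lives entirely in the probe.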
132. or each component of the data model, data access objects (DAOs) exist within the persistence layer, acting as an interface to the database. The DAOs and the actual database access are realized using the Apache iBATIS persistence framework, which provides a common interface to SQL-based relational database management systems. Table gives an overview of the tables contained in the database schema of the application. 2.8 Tpmon. Tpmon is a monitoring tool which can be integrated into Java applications in order to monitor the response times (see Section) of operations as well as other application-level information. The core implementation is based on but has been considerably modified in the meantime. The monitoring functionality is woven into the code of an application using the concept of Aspect-Oriented Programming. Depending on the configuration, Tpmon stores the data into a database or the file system. An example showing a system instrumented with Tpmon, storing the monitored data into a database, is illustrated in Figure 2.17(a). Tpmon provides a Web interface for enabling and disabling the monitoring as well as for setting monitoring parameters. In the following two sections we describe the concept of Aspect-Oriented Programming and give details on how the instrumentation of applications takes place. [Figure 2.17 panel titles: Tpmon Integration; Call of Annotated Operation.]
133. ot types used. 5.1.2 Visualization. This section contains a brief description of the plot types used to visualize and graphically analyze the resulting data and statistics of all experiment runs. Scatter and Line Plots. Scatter and line plots are used to visualize series of bivariate data pairs in a two-dimensional coordinate system. Each pair is represented by a point. In line plots, adjacent points of the same series are connected by a line. In scatter plots we added a local regression line. Figure 5.1(a) shows a sample line plot visualizing the relation between the number of simulated users and the response time quartiles of an operation. The scatter plot in Figure 5.1(b) visualizes the response times occurring between the fourth and the eighth minute of an experiment. Box and Whisker Plots. A box and whisker plot has been described in Section. Each of our box and whisker plots includes a box for each experiment minute in order to uncover changes in the response time distribution over time. Additionally, it contains a line representing the sample mean. Figure 5.1(c) shows a sample box and whisker plot for the fifth, sixth, seventh, and eighth minute of an experiment. Density Plots. In a density plot, a probability density function is visualized. Figure 5.1(d) shows a density plot of a kernel-estimated density using a normal kernel (see Section). A one-dimensional representation of the sample d
134. our modified JPetStore version. We changed all table names in the mapping files located in the folder src/com/ibatis/jpetstore/persistence/sqlmapdao/sql to lower-case letters. B.1.2.2 Struts Session Timeout. The duration after which a user session expires is configured within the file web/WEB-INF/web.xml. We decreased the default value 30 (given in minutes) to 3 in the session-timeout element as listed below:
    <session-config>
      <session-timeout>3</session-timeout>
    </session-config>
B.1.2.3 Database Properties. The database connection parameters must be defined in the file src/properties/database.properties. The listing below shows example settings for using a MySQL database named jpetstore running on host jupiter. The MySQL user with the given credentials must have the appropriate rights to access the data within the database.
    # Database Connectivity Properties
    driver=com.mysql.jdbc.Driver
    url=jdbc:mysql://jupiter/jpetstore
    username=jpetstore
    password=ibatis9977
B.1.2.4 Initialize Database. The JPetStore sources include scripts to initialize the database and populate the tables with application data. The following scripts for a MySQL database reside in the directory src/ddl/mysql. They must be executed in the given order. jpetstore-mysql-schema.sql: Creates the database schema with all tabl
135. [Figure 3.13: Example Behavior Mix.] If an error occurs while loading the behavior models, the entire test is aborted immediately. Details concerning this error are written to the file jmeter.log. Using the Session Arrival Controller. The Session Arrival Controller controls the number of active sessions, given an expression evaluating to an integer value. It is configured within the configuration dialog of the Markov Session Controller. JMeter allows BeanShell scripts to be used within text expressions. In particular, this allows for varying the number of active sessions depending on the elapsed experiment time. Markov4JMeter adds the global variable TEST.START.MS to the JMeter context, which is accessible in BeanShell scripts. The variable is set when the Test Plan is started. [Figure 3.14: Example BeanShell script which returns the elapsed experiment minute as an integer using the Markov4JMeter variable TEST.START.MS. The script imports org.apache.jmeter.util.JMeterUtils, reads TEST.START.MS into startMs through JMeterUtils.getPropDefault and Long.parseLong, takes curMs from System.currentTimeMillis(), converts the difference between curMs and startMs into minutes (factor 1000 · 60) as expMin, and returns an int obtained with Math.ceil from an expression over allowedNum and expMin.] For example, when using the BeanShell script listed in Figure 3.14, the Session Arrival
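The computation performed by the script in Figure 3.14 can be sketched in Python for illustration (the original is a BeanShell script; this is not a drop-in replacement). The function `elapsed_experiment_minute` follows the caption of the figure, while `allowed_sessions` and its parameters are hypothetical and merely show how an active-sessions limit could depend on the elapsed time.

```python
import math
import time


def elapsed_experiment_minute(start_ms, now_ms):
    """Return the elapsed experiment minute as an integer (rounded up)."""
    if now_ms < start_ms:
        raise ValueError("current time lies before the test start")
    # Floating point division, so that partial minutes are rounded up.
    return math.ceil((now_ms - start_ms) / (1000 * 60))


def allowed_sessions(minute, per_minute=5, cap=100):
    """Hypothetical rule: allow `per_minute` more sessions each minute."""
    return min(cap, per_minute * minute)


if __name__ == "__main__":
    start_ms = int(time.time() * 1000) - 150_000  # pretend start 2.5 min ago
    minute = elapsed_experiment_minute(start_ms, int(time.time() * 1000))
    print(minute, allowed_sessions(minute))
```

Dividing in floating point before rounding up matters here: with integer division the fractional part of the elapsed minutes would be lost before the ceiling is applied.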
136. Appendix A Workload Driver. A.2 Installing Markov4JMeter. Two types of Markov4JMeter archives are available (2007). Source Archives: The source archive contains the Markov4JMeter sources. These archive names follow the pattern markov4jmeter-<version>_src.{zip|tgz}. This archive is required when Markov4JMeter shall be modified and compiled. It includes further instructions to do so. Runtime Archives: The runtime archive contains a runnable version of Markov4JMeter which can be used with JMeter. These archive names follow the pattern markov4jmeter-<version>.{zip|tgz}. In order to install a runnable version of Markov4JMeter, the following steps need to be performed: 1. Download and extract the Markov4JMeter runtime archive in the desired file type (tgz or zip format). 2. Copy the file Markov4JMeter.jar, which resides inside the dist/ directory, to the directory lib/ext/ of the JMeter installation. 3. After restarting JMeter, two new entries exist within the Logic Controllers menu: Markov State and Markov Session Controller (see Figure A.1). [Figure A.1: After installing Markov4JMeter, the Logic Controller menu shows two new entries: Markov State and Markov Session Controller.] Appendix B C
137. p both the workload driver and the application. We increased the duration of the ramp-up time for a higher number of configured active sessions. • Experiment Duration: In order to obtain response time statistics based on a sufficient confidence interval, we selected a rather long duration for runs with a small number of active sessions. We aligned the duration with an increasing number of active sessions and an increasing ramp-up time, since the ramp-up time is included in the duration of the run. 4.3 Instrumentation. In order to decide which of the instrumentation modes provided by Tpmon to select throughout our experiments, we investigated the impact of Tpmon on the end-to-end response times of all 13 considered request types of the JPetStore. As described in Section, full instrumentation introduces a considerable overhead to the response times, which can be reduced by annotating specific operations. The methodology of identifying operations to be annotated is outlined in Section 4.3.2. 4.3.1 Assessment of Monitoring Overhead. We compared the end-to-end response times of all 13 request types using the three monitoring configurations listed below. With each configuration, 50 iterations of a Test Plan covering each request type were executed. Tpmon was configured to write the data to the database. 1. Full Instrumentation: Tpmon operates in full instrumentation mode, i.e., all JPetStore operations and the application entry points struts a
138. perations are displayed only implicitly by their times of operation entry and exit. Figure shows sample trace timing diagrams for the request type newOrderConfirm from the first iteration (full instrumentation) and the fourth iteration (final set of annotated operations). A sample timing diagram for each request type is given in Section B.2. The monitoring points finally determined after the fourth iteration are listed in Table. The table also shows within which traces these operations are called. Table contains this information for the 40 JPetStore operations in addition to the response time statistics for the first, second, and fourth iteration. 4.4 Workload Intensity Metric. Workload is defined by workload intensity and service demand characteristics (see Section 2.2). This section deals with the metric we considered to quantify the workload on the server node at a given time. The metric explicitly relates to the workload intensity and considers the service demand characteristics only implicitly. We denote this metric as the platform workload intensity (PWI). Given a point in time t and a window size w, the PWI expresses the average number of active traces (see Section) during the time interval from t − w to t. Section contains the formal definition of how the metric is derived from tuples of monitored start and stop times of traces. An implementation is outlined in Section 4.4.2. 4.4.1 Formal Definition. The following enumeration defines how the PWI is
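The verbal description of the PWI above admits a straightforward reading as a time average, which the following Python sketch implements. It is an assumption about the intent, not the formal definition of Section 4.4.1, which is not part of this extract: each trace contributes the length of its overlap with the window, and the sum is divided by the window size.

```python
def pwi(traces, t, w):
    """Average number of active traces in the window from t - w to t.

    `traces` is an iterable of (start, stop) tuples in the same time unit
    as `t` and `w`. A trace that is active during the whole window
    contributes 1, a trace active during half of it contributes 0.5.
    """
    if w <= 0:
        raise ValueError("window size must be positive")
    window_start = t - w
    active_time = 0.0
    for start, stop in traces:
        overlap = min(stop, t) - max(start, window_start)
        if overlap > 0:
            active_time += overlap
    return active_time / w


if __name__ == "__main__":
    # Two traces: one spans the whole window, one only its second half.
    print(pwi([(0, 10), (5, 15)], t=10, w=10))
```

Because the value is a time average, it can fall below 1 when the server is idle for part of the window, which is consistent with small PWI values being reported for low user counts.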
139. quest types GET and POST as described in Section. Each request type listed in Table 4.1 relates to an HTTP request type provided through the Web interface. We decided to focus our further investigation of the application on a subset of 9 services and 13 request types which we consider to be part of a typical user session. They are labeled by an appended dagger symbol (†) in Table 4.1. [Table 4.1: Identified service and request types of JPetStore, given as service: request types. Home†: index†. Browse Help: help. Sign On†: signonForm†, signon†. Edit Account: editAccountForm, editAccount, listOrders, viewOrder, switchOrderPage. Sign Off†: signoff†. Browse Category†: viewCategory†, switchProductListPage. Browse Product†: viewProduct†, switchItemListPage. View Item†: viewItem†. Add to Cart†: addItemToCart†. View Cart†: viewCart†, switchCartPage, switchMyListPage. Update Cart: updateCartQuantities. Remove Item: removeItemFromCart. Purchase†: checkout†, newOrderForm†, newOrderData†, newOrderConfirm†. Search: searchProducts, switchSearchListPage. Register: newAccountForm, newAccount.] 4.1.2 Application Model. We created an application model for the JPetStore, modeling valid user sessions to be generated by Markov4JMeter. As mentioned above, we considered the subset of services and request types mentioned in Section 4.1.1. Figure shows the session layer and two
140. r (see Section 2.6) as an extension called Markov4JMeter. Markov4JMeter includes the Test Elements Markov Session Controller and Markov State, which allow the definition of a probabilistic session model within a JMeter Test Plan. The Test Elements are described in Section 3.3.1. The additional components Behavior Mix Controller and Session Arrival Controller are described in Sections 3.3.3 and 3.3.4. Behavior models are stored in external files. The file format is described in Section 3.3.2. Figure 3.7 illustrates how the Markov4JMeter components are integrated into JMeter. The conceptual workload driver components workload driver engine and user simulation threads defined in Section 3.2.3 are realized by the JMeter components Engine and Threads. This includes the definition of the experiment duration and the size of the thread pool to be configured in a Thread Group Test Element, which is part of any Test Plan. [Figure 3.7: Integration of Markov4JMeter into the JMeter architecture. Markov4JMeter components are colored blue. The Test Elements are divided into a GUI class and a Test Element class.] 3.3.1 Test Elements. The Logic Controllers Markov
141. r request and data dependencies, a sequence generator produces a set of sessionlets. A single sessionlet is a valid sequence of request types representing a session. The stress testing tool SWAT generates and executes the synthetic workload based on the set of sessionlets as well as additional workload information such as think time and session length distributions. The approach is illustrated in Figure (Shams et al., 2006). 2.4 Probability and Statistics. This section gives an introduction to the theory of probability and statistics as far as this is relevant for this thesis. Most of the definitions follow Montgomery and Runger (2006). Basic Terms. A random experiment is an experiment which can result in different outcomes even though it is repeated under the same conditions. The set of all possible outcomes of such an experiment is called the sample space S. It is discrete if its cardinality is finite or countably infinite. It is continuous if it contains a finite or infinite interval of real numbers. The probability is a function P : 𝒫(S) → [0, 1] used to quantify the likelihood that an event E ⊆ S occurs as an outcome of a random experiment. Higher numbers indicate that an outcome is more likely. A discrete or continuous random variable X is a variable that associates a number with the outcome of a random experiment. For example, when throwing a single die once, the sample space of this experiment is S
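The basic terms above can be illustrated for the die example with a few lines of Python. The sketch assumes a fair six-sided die, so that every outcome of the sample space is equally likely; this assumption is not stated in the extract.

```python
from fractions import Fraction

# Sample space of throwing a single die once.
SAMPLE_SPACE = frozenset(range(1, 7))


def probability(event, sample_space=SAMPLE_SPACE):
    """P(E) for an event E, a subset of S, assuming equally likely outcomes."""
    event = frozenset(event)
    if not event <= sample_space:
        raise ValueError("an event must be a subset of the sample space")
    return Fraction(len(event), len(sample_space))


if __name__ == "__main__":
    even = {2, 4, 6}
    print(probability(even))          # event "an even number is thrown"
    print(probability(SAMPLE_SPACE))  # the certain event
```

Using `Fraction` keeps the probabilities exact, so identities such as P(E) + P(S \ E) = 1 can be checked without rounding error.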
142. rder; service.AccountService.getAccount(String, String); service.CatalogService.getCategory(String); service.CatalogService.getItem(String); service.CatalogService.getItemListByProduct(String); service.CatalogService.getProductListByCategory(String); service.OrderService.getNextId(String); service.OrderService.insertOrder(Order). [Table 4.4: Identified monitoring points and coverage of request types; the symbol • marks activated monitoring points.] starting with full instrumentation. With each configuration, 50 iterations of a Test Plan covering each request type were executed. Tpmon was configured to write the data to the database. The operations for the respective next iteration were determined mainly based on their mean response times with respect to an iteration-specific threshold. For example, for the second iteration we only considered those operations which had a mean response time greater than or equal to 0.5 ms. Moreover, from the monitored data we generated timing diagrams for each trace in order to visualize the relation between calling and called operations as well as to see the operation calls in their respective context. The x-axis defines the elapsed time during trace execution. The y-axis numbers the operations, represented by the dotted horizontal lines. In contrast to a sequence diagram, relations between called and calling o
143. rently simulated users. We measured the throughput of the application in number of executed requests per minute, derived from the executions of the application entry points. How the throughput is influenced by the number of concurrent users is described in Section. The relation between the workload intensity and the considered response time statistics (see Section) is described in Section 5.2.3. We analyzed the response time distributions and performed a density estimation using non-parametric and parametric density estimators. The results are part of Sections 5.2.4 and 5.2.9. 5.2.1 Platform Workload Intensity. With 5 users, the PWI has the value 0.03. It increases linearly with the number of concurrent users until it reaches 0.84 with 45 users (see Figure 5.4(a)). The PWI increases moderately between 55 and 85 users, from 1.08 to 2.98 (see Figure 5.4(b)). For more than 85 users, the PWI increases with a much steeper gradient. The PWI reaches 30.03 with 145 users and 60.90 with 195 users (see Figure 5.4(c)). 5.2.2 Throughput. The throughput increases linearly until a number of 75 concurrent users is reached (see Figure 5.5(a)). Between 85 and 155 users it increases with a higher slope and a considerable oscillation. The throughput rises significantly up to a maximum of 70 000 requests [Figure 5.6: axes requests/minute and response time (ms).] Figure 5.6 Number o
144. response time data were used as the parameters for the probability density function of the log-normal distribution. Deviations from the log-normal distribution occurred and were often caused by performance problems. Thus suggests to use response time distributions to locate performance problems in ERP systems. combined dependency graphs and monitored response time samples from system components for problem determination, i.e., detecting misbehavior in a system and locating the root cause. assume existing end-user service level agreements (SLAs) with specified end-to-end response time thresholds for each transaction type. Dependency graphs model the relation between components in terms of synchronous or asynchronous invocations among each other. At run time, a so-called dynamic threshold is computed for each component based on average component response times. As soon as an SLA violation is detected, all components are classified into good and bad behavior, depending on whether or not they are affected by the problem, i.e., they exceed their dynamic threshold or are in a relation with components exceeding their dynamic threshold. The components in a bad state are ranked in order of their current response time samples compared with the dynamic threshold in order to determine the root cause of the problem. claim that by using their dynamic threshold approach, changes in operating conditions such as workload changes c
145. [End of the listing in Figure 3.6, partly legible: rndVal is compared with cumProbList[i] (rndVal < cumProbList[i]) and the loop is left on success, otherwise the next entry of trueList is considered; the action is executed and the application state entered via evaluate(t.action) and T = enter(z); finally arrivalCtrl.exitSession() is called to notify that the session is quit.] Figure 3.6: Sketch of the core algorithm executed by each user simulation thread to execute a session. B denotes the user behavior model, a tuple whose legible components are Z, P and z0, which has already been assigned based on the behavior mix. The method enter executes the application state z passed as parameter and returns the set of outgoing transitions. The method evaluate evaluates the expression passed as parameter and returns the evaluation result. A dedicated operator is used to concatenate lists. 3.2.3.2 Session Arrival Controller. The session arrival controller controls the number of active user sessions during workload generation. It provides a session entrance and exit protocol for the user simulation threads. Before entering a session, the method enterSession must be called, which may block the thread in a queue until the session entrance is granted. After leaving a session, a user simulation thread invokes the method exitSession in order to notify the session arrival controller that the number of active sessions can be decremented. The session arrival controller is configured according to the active sessions formula within the workload configuration. 3.2.3.3 User Simulation Threads. Each user simulation thread iteratively emulates users based
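The session entrance and exit protocol described above can be sketched in Python with a condition variable. This is a minimal illustration under assumptions, not the Markov4JMeter implementation: the class layout, the `timeout` parameter, and the use of a callable for the active sessions formula are hypothetical choices.

```python
import threading


class SessionArrivalController:
    """Limit the number of concurrently active sessions."""

    def __init__(self, max_active):
        # `max_active` is a callable returning the currently allowed number
        # of active sessions (standing in for the active sessions formula).
        self._max_active = max_active
        self._active = 0
        self._cond = threading.Condition()

    def enter_session(self, timeout=None):
        """Block until a session may start; return False on timeout."""
        with self._cond:
            granted = self._cond.wait_for(
                lambda: self._active < self._max_active(), timeout)
            if granted:
                self._active += 1
            return granted

    def exit_session(self):
        """Notify the controller that a session has ended."""
        with self._cond:
            self._active -= 1
            self._cond.notify_all()

    @property
    def active(self):
        with self._cond:
            return self._active


if __name__ == "__main__":
    ctrl = SessionArrivalController(lambda: 2)
    print(ctrl.enter_session(), ctrl.enter_session())
    print(ctrl.enter_session(timeout=0.1))  # third session is not granted
    ctrl.exit_session()
    print(ctrl.enter_session(timeout=0.1))  # a slot is free again
```

In this sketch waiting threads are only woken when a session exits; a fuller version would also need to wake them when the allowed number itself grows over time.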
146. rrelation with the workload intensity. At least one experiment exists for each operation showing no extreme outliers. Maximum ratios are between 0.96 (persistence.sqlmapdao.OrderSqlMapDao.insertOrder) and 20.44 (service.CatalogService.getCategory). The operations persistence.sqlmapdao.OrderSqlMapDao.insertOrder and service.OrderService.insertOrder have the lowest average values of extreme outliers, with values 0.1 and 1.94. The remaining operations have an average ratio of extreme
[Figure: Workload intensity vs. outliers (normal and extreme) for the operations persistence.sqlmapdao.ItemSqlMapDao.getItem and service.OrderService.insertOrder; x-axis: platform workload intensity; panels (a), (b), (c): PWI 0.03 to 60.90]
147. [Figure 2.16: JPetStore index page]
A Test Plan is internally represented by a tree of Test Element classes, each representing the respective element in the Test Plan. A Test Plan can be saved in a file with an XML-based JMX format. In addition to the configuration parameters, the Test Element classes contain the implementation of the Test Element's behavior. Any Test Element class has an associated GUI class providing a configuration dialog for the Test Element. It is responsible for creating and modifying the related Test Element classes. Figure shows the dialog for configuring an HTTP Request Sampler.
2.7 JPetStore Sample Application
JPetStore is a sample Java Web application that represents an online shopping store offering pets. In the following two sections, those details which are relevant for our work are described.
Overview
The application has originally been developed to demonstrate the capabilities of the Apache iBATIS persistence framework (Apache Software Foundation 2007a). It is based on the J2EE sample application Java Pet Store (Sun Microsystems Inc. 2006), which has been used in a variety of scientific studies, e.g., Chen et al. 2005, Kiciman and Fo 2005 2
148. s for dynamically setting parameter values to be passed with a request (data dependencies). Shams et al. use Extended Finite State Machines (EFSMs) to model valid application usage. Like a CBMG, an application's EFSM consists of different states modeling the request types as well as allowed transitions between them. Transitions are labeled with predicates and actions. A transition can only be taken if its associated predicate evaluates to true. When a transition is taken, the respective action is performed. An EFSM contains a set of variables which can be used in predicate and action expressions. Values of request parameters are set in actions. A select operation is provided for assigning values dynamically on trace execution for those values which are not known before the response of the former request has been received. For example, by means of Browse.Item_ID = select(), the item to browse is chosen dynamically from the former response. An example EFSM following Shams et al. is shown in Figure 8.
Workload Generation
Many freely available and commercial Web workload generators exist which provide functionality to record and replay traces, e.g., Mercury LoadRunner (Mercury Interactive Corporation 2007), OpenSTA (OpenSTA 2005), Siege (Fulmer), and Apache JMeter (Apache Software Foundation 2007b; see Section 2.6). Moreover, sample applications, e.g., TPC-W (Transaction Processing Performance Council 2004) and RUBiS (Pugh and Spacco), exist for b
149. s solely differ in their transition probabilities.
3.2.2.3 User Behavior Mix
A user behavior mix $BMIX_A$ for an application model A is a set $\{(B_{A,0}, p_0), \ldots, (B_{A,n-1}, p_{n-1})\}$ assigning relative frequencies $p_i$ to user behavior models $B_{A,i}$. The property in Equation 3.1 must hold:
$\sum_{i=0}^{n-1} p_i = 1$ (3.1)
A tuple $(B_{A,i}, p_i)$ states that user sessions based on the user behavior model $B_{A,i}$ occur with a relative frequency $p_i \in [0, 1]$ during workload execution.
[Figure 3.5: Architecture overview of workload driver in UML class diagram notation]
3.2.2.4 Workload Intensity
The workload intensity configuration includes the definition of the duration and a mathematical function $\mathbb{R}_{\geq 0} \to \mathbb{N}$ defining the number of active sessions, i.e., the number of concurrent users to simulate, relative to the elapsed experiment time.
3.2.3 Architecture and Iterative Execution Model
The architecture of the workload driver includes a workload driver engine, a behavior mix controller, a session arrival controller, and a pool of user simulation threads, the size of which is constant throughout the entire execution. The workload driver engine initializes and controls the other components based on a workload configuration as defined in the previous Section 3.2.2. Each user simulation thread represents one user at a time and issue
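To make the behavior mix concrete, the following sketch checks the property of Equation 3.1 and draws a behavior model according to its relative frequency. The function names and the dictionary representation of the mix are assumptions made for this example; the text does not prescribe how the behavior mix controller performs the assignment internally.

```python
import math
import random
from typing import Dict, Optional


def validate_behavior_mix(mix: Dict[str, float]) -> None:
    """Check that all relative frequencies lie in [0, 1] and sum to 1."""
    if any(p < 0.0 or p > 1.0 for p in mix.values()):
        raise ValueError("relative frequencies must lie in [0, 1]")
    if not math.isclose(sum(mix.values()), 1.0, abs_tol=1e-9):
        raise ValueError("relative frequencies must sum to 1")


def pick_behavior_model(mix: Dict[str, float],
                        rng: Optional[random.Random] = None) -> str:
    """Draw one behavior model name with probability equal to its frequency."""
    validate_behavior_mix(mix)
    rng = rng or random.Random()
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]
```

For instance, `pick_behavior_model({"browser": 0.7, "buyer": 0.3})` would return `"browser"` for roughly 70 percent of the sessions over a long run; the two model names are illustrative.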
150. s the requests based on a session model which is composed of the application model and a user behavior model assigned by the behavior mix controller. The session arrival controller controls the respective number of active sessions. A more detailed description of the components is given in the following Sections 3.2.3.1 to 3.2.3.3. Figure 3.5 shows a UML class diagram illustrating how they are related. The composition and execution of the probabilistic session model is described in a later section.
3.2.3.1 Behavior Mix Controller
The behavior mix controller maintains the assignment of user behavior models to user sessions executed by the user simulation threads. The assignment is performed based on the relative frequencies configured in the behavior mix, which is part of the workload configuration.
Method executeSession
begin
   arrivalCtrl.enterSession() /* request session entrance */
   z ← B_A.z_0 /* set current state to entry state */
   T ← enter(z) /* enter application state */
   while z ≠ B_A.E do
      /* fetch transitions whose guards evaluate to true */
      trueList ← []; cumProbSum ← 0.0; cumProbList ← []
      for each t ∈ T do
         if evaluate(t.guard) then
            trueList ← trueList ∘ t
            cumProbSum ← cumProbSum + B_A.P[z, t.z]
            cumProbList ← cumProbList ∘ cumProbSum
         fi
      done
      /* select transition based on transition probabilities */
      rndVal ← random(cumProbSum); z ← trueList[0]
      for i ← 0 .. cumProbList.length − 2 do
         if
151. scribed for the maximum, the values of some operations decrease intermediately after having reached a local maximum. Usually this occurs for PWIs around 24.50 (135 users) and 30.03 (145 users). Figure shows the
[Figure 5.8: Platform workload intensity vs. mean, median, and mode response time stretch factors for the operations service.CatalogService.getItem and persistence.sqlmapdao.OrderSqlMapDao.insertOrder; curves: mean, median, approx. mode; x-axis: platform workload intensity; panels (a) PWI 0.03 to 2.22, (b) PWI 0.03 to 60.90, (c) PWI 0.03 to 60.90]
152. sion layer is entered. In the case of the JPetStore, each request is one of the 13 HTTP request types provided by the application. For each request type we determined its required HTTP request method, the URI, and the parameters to be passed (see Section 2.1) on an invocation. Figure 4.1(b) shows the state graphs which relate to the application states Sign On and Purchase. In order to sign on, a user first invokes an HTTP request of type signonForm using the method GET. The server returns a form asking for a username and a password. In a subsequent invocation, the user passes the filled-in data of the completed form by invoking the HTTP request type signon. The variables userId and password are used as placeholders for the username and password. The state graph of the application state Purchase shows the sequence of HTTP requests to be executed. We omitted the HTTP protocol details for this state.
4.1.3 Behavior Models
In order to yield probabilistic user behavior in the user sessions generated by Markov4JMeter, we developed one behavior model representing users solely browsing through the JPetStore and a second one in which users tend to buy items from the shop. The models are created according to our definition of a behavior model in Section 3.2.2.2.
[Figure 4.2: Transition graphs of browsers and buyers; panels (a) Browser, (b) Buyer]
In the following Sections 4.1.3.1 and 4
153. sion of the kernel method can be found in Silverman 1986.
2.5 Anomaly Detection
An important quality of service attribute of EIS is availability. As defined in Equation 2.13 (Musa et al. 1987), availability is calculated using the two variables mean time to failure (MTTF) and mean time to repair (MTTR). Increasing the MTTF or decreasing the MTTR yields an increased availability.
$\text{Availability} = \frac{MTTF}{MTTF + MTTR}$ (2.13)
Anomaly detection is an approach to increasing availability by reducing repair times. Errors or failures, as defined in the following paragraph, shall be detected early or even proactively.
Fault, Error, Failure
According to the fundamental chain of dependability and security threats presented by Avižienis et al. 2004, we distinguish between fault, error, and failure. A fault, e.g., a software bug, implies an error as soon as it has been activated. An error is that part of a corrupt system state which itself may cause a failure, i.e., an incorrect system behavior observable from outside the system. Moreover, a failure may cause a fault in a higher-level system. This is illustrated in Figure 2.13 (Avižienis et al. 2004).
[Figure 2.13: Chain of dependability threats (Avižienis et al. 2004): fault, activation, error, propagation, failure, causation, fault]
Approach
A common approach for detecting anomalies is building a
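As a worked illustration of the availability formula in Equation 2.13, the following sketch computes availability from MTTF and MTTR and shows the effect of a shorter repair time. The function name and the numbers are made up for the example and do not come from the case study.

```python
def availability(mttf: float, mttr: float) -> float:
    """Availability = MTTF / (MTTF + MTTR); both values in the same time unit."""
    if mttf < 0 or mttr < 0 or mttf + mttr == 0:
        raise ValueError("MTTF and MTTR must be non-negative and not both zero")
    return mttf / (mttf + mttr)


# Illustrative values only: 99 hours to failure, 1 hour to repair.
baseline = availability(99.0, 1.0)
# Halving the repair time, e.g., through earlier detection, raises availability.
faster_repair = availability(99.0, 0.5)
```

With these illustrative numbers, halving the repair time moves availability from 99 percent to roughly 99.5 percent, which is the lever anomaly detection aims at.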
154. stributions. We analyzed the reason for these shapes and obtained the following results:
• The operation presentation.OrderBean.newOrder relates to the order process. Its data samples show two major clusters, one close to 0 ms and one shifted to the right (see Section 5.2.4). Each cluster relates to one request type, i.e., newOrderData and newOrderConfirm (see Section 4.1.1). The low response times belong to requests of type newOrderData. The data of the order form is only submitted in this case without being further processed. By requesting newOrderConfirm, the order is stored to persistent storage. These response times belong to the second cluster. Figures B.1(k) and B.1(l) show the timing diagrams for both request types.
• The operation presentation.CartBean.addItemToCart is included in traces of the request type Add to Cart, issued when an item is to be added to the shopping cart. The response time samples of this operation show one minor cluster with sporadically occurring low response times close to 0 ms and a main cluster which is shifted further to the right. The response times of the minor cluster occur when items which are already included in the cart are added again. This involves a different control flow, since mainly the number of items needs to be incremented instead of initially adding an item of this type. Figure B.1(g) shows a trace timing diagram for the request type Add to Cart relating to the main cluster
155. system's normal behavior in terms of monitored parameters and comparing this model with the respective current behavior to uncover deviations. The data being monitored can be obtained from different levels, e.g., network, hardware, or application level. Failures may be detected proactively, or at least quickly after they occur. Repair times can be reduced by taking appropriate steps immediately, thus increasing availability. Timing behavior anomaly detection approaches exist which are based on monitored component response times on application level. However, most of them do not explicitly consider that a varying workload intensity leads to varying response times. For example, when using preset threshold values, this often leads to spurious events when the workload intensity increases. This work targets a timing behavior anomaly detection which explicitly considers this additional parameter. For this purpose, we statistically analyze the relation between workload intensity and response times in a case study. A workload driver is developed which generates varying workload based on probabilistic models of user behavior. The results of the analysis are used for a workload-intensity-sensitive anomaly detection prototype, but they are interesting for timing behavior evaluation of multi-user systems in general. The goals of our work are presented in the following section. An overview of the document structure is given in Section 1.3. Chapter 1 I
156. t Software Settings
Since the tests need to be run with a large number of simulated users, the configuration of the Tomcat server, JMeter, and the JPetStore are adjusted. These adjustments are outlined in the following paragraphs. The details on how to configure these settings are given in Appendix B.
Apache Tomcat
The following settings were modified:
• Heap Size: The heap space available for the executing Java virtual machine is set to 512 MiB (default: 64 MiB), given around 900 MiB of available physical memory after the node has restarted and the operating system prompts the login screen.
• Thread Pool Size: The maximum number of available request processing threads is set to 300 (default: 150) and the maximum number of simultaneously accepted requests to 400 (default: 100).
• Access Logging: The default pattern of entries in the log file is extended by including a session identifier and the server-side response time.
Apache JMeter
JMeter has been extended by Markov4JMeter. The heap space size is set to 768 MiB (default: 256 MiB), 256 MiB (default: 128 MiB) of which are reserved for the eden space, i.e., the space for newly created objects. JMeter needs a large eden space since many new objects are created during a test.
JPetStore
In addition to the usual configuration to set up the MySQL database and the connection properties within the JPetStore sources, it was necessary to corre
157. tation
List of Figures
3.6 Sketch of algorithm executed by a user simulation thread executing a session
3.9 User behavior model stored in CSV file format
3.13 Example Behavior Mix
3.14 Example BeanShell script defining number of active sessions
4.1 Session layer and two protocol states of JPetStore's application model
4.3 Probabilistic Test Plan and configuration of state View Cart
4.6 Sample trace timing diagrams for a request
4.8 Sketch of experiment execution
5.1 Overview of all plot types used
5.4 Number of users vs. platform workload intensity
5.5 Number of users vs. throughput
5.15 Goodness of fit visualizations for an operation
5.16 Box and whisker plot for varying workload intensity
6.1 Anomaly detection scenario with constant workload intensity (PAD)
6.2 Anomaly detection scenario with increasing workload intensity (PAD)
6.3 Anomaly detection scenario with increasing workload intensity (WISAD)
B.1 Sample trace timing d
158. th. A varying number of 60 to 160 users was emulated and the highest workload intensity occurred in the 12th minute. The minimum remains constant. The mean and all quartiles increase with increasing workload intensity. The mean is more sensitive than the quartiles. Upper quartiles are more sensitive than the lower quartiles are. The interquartile range increases. Most of the monotonically increasing curves relating to a statistic increase in three steps. The slope is rather moderate before increasing slightly, starting with a certain PWI. The curve rises considerably for an even higher PWI. These steps can be considered performance knees according to the performance curve by Jain (Figure 2.5). A large number of operations showed intermediate local maximums for more sensitive statistics, e.g., maximum and mean. These maximums occurred for PWIs around 25 to 35, which we assume is related to the initial size of the thread pool provided by the Tomcat server. The pool is extended when 25 requests are handled in parallel. The ratio of extreme outliers did correlate with these maximums. Four classes of distribution were identified: unimodal distributions, two types of bimodal distributions, and multimodal distributions becoming unimodal distributions as the workload intensity increases. The operations presentation.OrderBean.newOrder and presentation.CartBean.addItemToCart constitute both types of bimodal di
159. the information related to a specific workload execution, e.g., the used application and user behavior configuration, the relative distribution of behavior models to use, as well as the number of simulated users and the duration of the execution.
3.1.3 Supported Applications
The workload driver must support applications that satisfy the following properties:
• The application has an HTML Web interface based on the HTTP request/response model (see Section 2.1), solely using the HTTP methods GET and POST.
• If the application uses sessions, the session identifier is managed by cookies contained in the HTTP header, by URL rewriting, or by hidden HTML input fields.
• The application does not rely on code embedded in its HTML response that must be interpreted on client side, such as JavaScript or AJAX.
3.1.4 Use Cases
In this section we define the use cases the workload driver must support. Each use case is defined by means of a template containing the fields actors, pre- and post-condition, as well as a description. Figure 3.1 shows the UML use case diagram.
Configure Application and User Behavior
Actors: Tester
Pre-condition:
Post-condition: A new application and user behavior configuration for the AUT has been created and is accessible to the workload driver.
Description: The user defines and
160. the monitored response time sample X for each operation in each experiment run, we estimated the parameters for the normal as well as the 2- and 3-parameter log-normal distribution. For the 2-parameter log-normal distribution $\Lambda(\mu, \sigma^2)$ and the normal distribution $N(\mu, \sigma^2)$, we used the mean and the variance of $\log X$ and $X$ as their estimators. The parameters for the 3-parameter log-normal distribution $\Lambda(\tau, \mu, \sigma^2)$ were estimated by using the method of maximum likelihood with starting points for $\tau$, $m$, and $s$ whose values are defined in Equation 5.1. The variables $m_0$ and $s_0$ denote the mean and the variance of $\log(X - \tau_0)$, $x_0$ being the smallest response time in the sample and $\nu_0$ being the $N(0, 1)$ quantile of order 2. This method is based on Cohen's least sample value method (Aitchison and Brown 1957).
t ro e m log z t s gt log x t (5.1)
[Figure 5.4: Number of users vs. platform workload intensity; x-axis: users, y-axis: platform workload intensity; panels (a) 0 to 75 users, (b) 0 to 125 users, (c) 0 to 195 users]
5.2 Data Description
Section 5.2.1 gives a description of how the platform workload intensity metric relates to the number of concur
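The estimators named above for the normal and the 2-parameter log-normal distribution can be sketched in a few lines. This is only an illustration under assumptions: the function names are hypothetical, and the sample variance (divisor n − 1) is used because the text does not say which variance estimator was applied. The 3-parameter maximum likelihood fit is not covered here.

```python
import math
import statistics
from typing import Sequence, Tuple


def estimate_normal(sample: Sequence[float]) -> Tuple[float, float]:
    """Estimate (mu, sigma^2) of a normal distribution from X."""
    return statistics.mean(sample), statistics.variance(sample)


def estimate_lognormal2(sample: Sequence[float]) -> Tuple[float, float]:
    """Estimate (mu, sigma^2) of a 2-parameter log-normal from log X."""
    if any(x <= 0 for x in sample):
        raise ValueError("log-normal estimation requires positive values")
    logs = [math.log(x) for x in sample]
    return statistics.mean(logs), statistics.variance(logs)
```

Both functions need at least two observations, since `statistics.variance` raises an error for shorter samples.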
161. tic session model is executed as follows. Given a current state, the next state is determined by first evaluating the guards of the outgoing transitions related to the current state. One of these transitions is randomly selected based on their assigned probabilities. The action of the selected transition is executed and the requests towards the application are issued by traversing the deterministic state machine of the state within the protocol layer of the application model. A session ends when the determined transition leads to the destination state Exit, which is the exit state of the user behavior model. The semantics of the composition of an application model and a user behavior model are illustrated in Figure 3.6 using a pseudocode notation for the code executed by a user simulation thread in each iteration. We assume that the user behavior model $B_A = (Z, P, z_0, E, f_{tt})$ has been assigned already. The tuple elements are accessed by means of $B_A.Z$, $B_A.P$, $B_A.z_0$, and $B_A.E$. First, the user simulation thread invokes a request to enter the session towards the session arrival controller. The thread may get blocked at this point, i.e., it might be put into a queue. This is the case when the number of active sessions would exceed the allowed number of active sessions if this entrance were granted.
3.3 Markov4JMeter
A workload driver following the design presented in the previous section has been implemented and integrated into the existing workload generator JMete
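The selection step described above (evaluate the guards, then draw one of the enabled transitions according to the assigned probabilities) can be sketched as follows. The `Transition` class, its field names, and the injectable `uniform` function are assumptions made for this illustration; they are not part of the workload driver's interface.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence


@dataclass
class Transition:
    """Hypothetical transition: target state, probability, optional guard."""
    target: str
    probability: float
    guard: Optional[Callable[[], bool]] = None  # None means always enabled


def select_transition(transitions: Sequence[Transition],
                      uniform: Callable[[], float] = random.random
                      ) -> Optional[Transition]:
    """Pick one enabled transition, weighted by its probability."""
    enabled: List[Transition] = []
    cumulative: List[float] = []
    total = 0.0
    for t in transitions:
        if t.guard is None or t.guard():
            enabled.append(t)
            total += t.probability
            cumulative.append(total)
    if not enabled:
        return None
    # Scale the draw to the probability mass of the enabled transitions.
    rnd_val = uniform() * total
    chosen = enabled[-1]  # fallback in case of floating-point rounding
    for t, bound in zip(enabled, cumulative):
        if rnd_val < bound:
            chosen = t
            break
    return chosen
```

Scaling the random value by the summed probability of the enabled transitions means that disabled transitions simply drop out and the remaining ones keep their relative weights. Passing `uniform` as a parameter is a design choice of the sketch that makes the selection reproducible in tests.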
162. tinguished from the normal response times. The dotted line represents the dynamic threshold for $\delta = 1.1$.
Given a history X and a set of executions Y, let $e = (o, st, rt) \in Y$ be an execution of operation o and let $\overline{rt}_{o,1}$ denote the mean response time of historical executions of operation o at pwi value 1. For an execution $e = (o, st, rt)$, WISAD is defined as follows:
$WISAD_\delta(e) = \begin{cases} 1, & rt > \overline{rt}_{o,1} \cdot wnf(pwi(e)) \cdot \delta \\ 0, & \text{else} \end{cases}$ (6.3)
According to PAD (see Equation 6.1), an execution e of an operation o is considered anomalous iff its response time exceeds a threshold. But using WISAD, this threshold depends on the workload intensity, since the scale factor normalizes the impact of varying workload intensity to the response times of operation o at a pwi of 1.
Example 3: WISAD with Varying Workload Intensity
We applied WISAD to the scenario used in Example 2 (see Figure 6.2). It is assumed that for executions $e = (o, st, rt)$ the values of pwi follow the equation 1. From the historical data we approximated $wnf(x)$ and $\overline{rt} = 100$. Figure 6.3 illustrates the results with the WISAD function as given in Equation 6.4. For threshold values between 1.06 and 1.18, i.e., 106% to 118% of the mean response time for pwi 1, WISAD has an error rate of 0.
x rt gt 1 24 6 0 else (6.4)
Chapter 7 Conclusion
In Section 7.1 we give a summary of our work. A discussion follows in Section 7.2. Future work is outlined in Section
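A small sketch may help to see how a workload-intensity-dependent threshold behaves. It assumes, as the surrounding text suggests, that WISAD compares the response time against the historical mean at pwi 1, scaled by a workload-dependent factor and a threshold factor. The function signature and the linear scale factor in the usage note are hypothetical; the scale factor actually approximated in the example is not recoverable from the text.

```python
from typing import Callable


def wisad(rt: float, mean_rt_at_pwi_1: float, pwi: float,
          wnf: Callable[[float], float], threshold_factor: float) -> int:
    """Return 1 if the execution is classified as anomalous, else 0.

    wnf is an assumed scale factor function that maps a platform workload
    intensity to the expected growth of the response time relative to pwi 1.
    """
    dynamic_threshold = mean_rt_at_pwi_1 * wnf(pwi) * threshold_factor
    return 1 if rt > dynamic_threshold else 0


def linear_wnf(pwi: float) -> float:
    """Purely illustrative scale factor, not taken from the case study."""
    return 1.0 + 0.5 * (pwi - 1.0)
```

With a mean of 100 at pwi 1 and a threshold factor of 1.1, a response time of 200 is flagged at pwi 1 (threshold about 110) but accepted at pwi 3 under the illustrative linear factor (threshold about 220). This is the effect the text attributes to the workload-dependent threshold.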
163. tion 2.4. Section 2.5 outlines the concept of anomaly detection and presents existing approaches. The workload driver JMeter, on which our probabilistic workload driver presented in Chapter 3 is based, is described in Section 2.6. Sections 2.7 and 2.8 present the JPetStore sample application and the monitoring infrastructure Tpmon, which are both used in the case study. Section 2.9 gives an overview of related work.
2.1 System Model
This section gives an overview of the system model used throughout this document. It contains a description of an enterprise information system, the HTTP request/response model used by those systems, as well as the considered execution model and related terms.
Enterprise Information Systems
Enterprise information systems (EIS) are large software systems, e.g., banking and online shopping systems or auction sites. Usually they are component-based and multi-tiered, i.e., they consist of a Web server, an application server, and a database server (2000). The Web server executes an HTTP server software listening for incoming HTTP requests (see section below), establishing the required connection between itself and the client, sending the requested response, and returning to its listening functionality. An application server runs the enterprise software that processes all services provided through the Web server. The database server executes a database management system (DBMS) holding the persistent data accessed by t
164. to combine a set of transitions from multiple source states directing to the same destination state. These transitions have at most one label consisting of a guard and actions, which is considered the label of all transitions in this set.
[Figure 4.1: Session layer and two protocol states of JPetStore's application model. (a) Session layer of application model, with guards and actions over the variables signedOn and itemInCart. (b) Protocol states for two application states. Sign On: signonForm (req.method = GET, req.uri = /jpetstore/shop/signon.shtm, req.header = <>, req.body = <>), followed by signon (req.method = POST, req.uri = /jpetstore/shop/signon.shtml, req.header = <>, req.body = <username = userId, password = password, submit = Login>). Purchase: checkout, newOrderForm, newOrderData, newOrderConfirm]
4.1.2.2 Protocol Layer
As defined in Section 3.2.2.1, each application state on the session layer has an associated deterministic state machine on the protocol layer defining the order and protocol details of the requests to be executed when an application state on the ses
165. tors exist, e.g., Mercury LoadRunner (Mercury Interactive Corporation 2007), OpenSTA (OpenSTA 2005), Siege (Gra 2006), and Apache JMeter (Apache Software Foundation 2007b; see Section 2.6). In many performance-related experiments, simple techniques were used to generate synthetic workload. For example, the standard request generator httperf (1998) was used to generate workload for the Java Pet Store. In this case, the order of issued requests within a session was dynamically determined using probabilities. A Customer Behavior Model Graph (CBMG) was defined to formally model the user behavior in Web-based systems using Markov chains. Shams et al. state that CBMGs are inappropriate for modeling valid user sessions and present Extended Finite State Machines (EFSMs), which include additional elements such as predicates and actions. An integration of CBMGs and a Web stressing tool has been proposed to generate realistic workload. In an earlier section we provided a detailed description of these approaches. A further study statistically analyzed the end-to-end response times of transactions in three Enterprise Resource Planning (ERP) systems in real use. The response times were measured with built-in performance measurement tools. The main contributions to the response times were caused by process time and database request time. The study found that the distributions of the data samples could be estimated by the log-normal distribution, including body and tail. Sample mean and variance from the
166. uence diagram showing a sample trace. Component a calls the operation b twice. While executing b the first time, operation c is called synchronously. Operation a is the application entry point.
2.2 Performance Metrics and Scalability
Performance denotes the time behavior and resource efficiency of a computer system (2007). The ISO 9126 standard contains a definition of the term efficiency with an analogous meaning. It consists of time behavior and resource utilization metrics (see Figure 2.3). The following sections introduce the metrics used throughout this document.
[Figure 2.3: Efficiency in the ISO 9126 standard (Koziolek 2007). Efficiency comprises Resource Utilization (CPU Utilization, Memory Utilization, I/O Utilization) and Time Behavior (Response Time, Throughput, Reaction Time)]
Time Behavior
Response Time and Execution Time
On system level, the end-to-end response time denotes the time interval that elapses between a user issuing a request and the system answering with a corresponding response (1991). Depending on the measurement objective, the time interval ends with the system starting to serve its response or with the response completion. The latter is considered the realistic request and response, since the duration of the response transmission is taken into account (see Figure 2.4(a)).
167. ups="false" redirectPort="8443" acceptCount="400" connectionTimeout="20000" disableUploadTimeout="true" />
Access Logging
Access logging can be enabled by activating the following XML element in the configuration file conf/server.xml. We modified the attribute pattern in order to obtain a slightly modified log format with the session identifier and the server-side response time in milliseconds being included in each log entry.
<Valve className="org.apache.catalina.valves.AccessLogValve" directory="logs" prefix="localhost_access_log." suffix=".txt" pattern="%h %U &quot;%q&quot; %s %b %S %D" resolveHosts="false" />
B.1.2 Build and Install JPetStore
In the following sections we describe how to build and install the JPetStore application on Unix- and Linux-based systems. A running Apache Tomcat and MySQL DBMS server installation is necessary, which need not be located on the same machine as the JPetStore application.
B.1.2.1 Source Code Modifications
The JPetStore source code needs to be modified in order to work with a MySQL installation on a server with case-sensitive filesystem semantics. In the original source code, table names are used inconsistently in terms of capitalization in the SQL scripts and the object-relational mapping files. The file JPetStore-5.0_mysql_unix.tar.gz included on the DVD attached to this thesis contains
168. utes the platform workload intensity for time intervals using a defined step size. For example, for a time interval [0, 20] and a step size of 5, $pwi_\omega(t)$ is computed for t = 0, 4, 9, 14, 19. Let n be the number of elements contained in the trace history. Each of the above steps, except sorting the event history, is performed in linear time, i.e., in O(n). For sorting the event history, the R-provided variant of Shellsort with complexity $O(n^{4/3})$ by Sedgewick (1986) is used. Figure 4.7 shows the graphs of the trace history (Figure 4.7(a)) and the platform workload intensity (Figure 4.7(b)) for an experiment interval of 1 minute using $\omega$ = 61 ms and a step size value of 30 ms. The green-colored horizontal line represents the mean of all platform workload intensities.
4.5 Execution Methodology
In order to achieve repeatable results, we restart the server node before each experiment run. The Test Plans are executed with JMeter in non-GUI mode in order to save resources. An experiment run is automatically executed by a shell script which is started on the client node. A considerable parameterization provides repeatability and reproducibility and eases configuration changes. The simplified version of the script listed in Figure 4.8 illustrates its basic structure. The script divides a run into the five phases described below.
1. Initial Preparation Phase: The script determines the current monitoring configuration on the server node and removes log files of former experiment
169. utlines the experiment design. A description of the probabilistic Test Plan is given in Section 4.1, including the underlying application and user behavior models as defined in Sections 3.2.2.1 and 3.2.2.2. The configuration of the machines, the software environment, and the experiment runs is specified in Section 4.2. Section 4.3 describes which operations are monitored and how these monitoring points have been determined. We define a workload intensity metric called platform workload intensity (PWI), which is part of Section 4.4. Section 4.5 deals with the methodology of how the experiments are executed.
4.1 Markov4JMeter Profile for JPetStore
Based on identified service and request types of the JPetStore, we created an application model and two user behavior models. These models have been integrated into a probabilistic Test Plan which can be executed by JMeter extended by Markov4JMeter. A description of the service and request types, the application and behavior models, as well as the probabilistic Test Plan is given in the following Sections 4.1.1 to 4.1.4.
4.1.1 Identification of Services and Request Types
According to our refined system model, we identified 29 request types provided by JPetStore on HTTP protocol level and classified them into 15 services. Table 4.1 contains the services and their corresponding request types. JPetStore provides its services through a Web interface using the HTTP request/response protocol with the HTTP re
170. vior model $B_{A,1}$
[Figure 3.4: Transition diagrams of user behavior models $B_{A,0}$ and $B_{A,1}$]
A state machine is executed when the related application state is entered and models the sequence of protocol-level requests to be invoked (see Section 2.1). Analogous to the session layer, transitions between states may be labeled with guards and actions using the same global variables and functions. For example, the state machine related to the application state S0 in Figure 3.3 contains the three protocol states a.shtml, b.shtml, and c.shtml that correspond to HTTP request URIs. After the request for a.shtml has been issued, the next state depends on the outcome of the evaluation of the expression a > 0 in the guard.
3.2.2.2 User Behavior Model
A user behavior model consists of a Markov chain (see Section 2.3) and a mathematical formula modeling the client-side think time. It is associated with an application model, and each state of the Markov chain relates to an application state. Formally, we define a user behavior model $B_A$ for an application A as a tuple $(Z \cup \{Exit\}, P, z_0, f_{tt})$. Z denotes the set of states contained in the Markov chain with entry state $z_0$. The state Exit is the dedicated exit state, which has no related application state. P denotes the transition matrix. $f_{tt}$ contains the think time formula. Figure 3.4 shows the transition diagrams of two user behavior models $B_{A,0}$ and $B_{A,1}$ for the application model shown in Figure 3.3. Both user behavior model