Which Step Do I Take First? Troubleshooting with Bayesian Models
over the vocabulary is drawn for the problem-specific content of each solution set; $\lambda_i$ is given a symmetric Dirichlet prior with concentration $\omega$. For each individual solution in a set, we draw a complexity level $z_{ij}$ from $\theta_i$, i.e., the complexity-level proportions for that problem. A position for the solution is then drawn from the position distribution for that level, i.e., $\gamma_{z_{ij}}$. The words in the solution are generated by first drawing a switch value for each word, indicating whether the word came from the problem's technical vocabulary or from the complexity vocabulary. Accordingly, the word is drawn from $\lambda_i$ or $\phi_{z_{ij}}$.

During inference we are interested in the posterior of the model given the FAQ training data. Based on the conditional independencies of the model, the posterior is proportional to

$$P(\psi \mid \delta_0, \delta_1) \prod_{m=1}^{L} P(\phi_m \mid \alpha)\, P(\gamma_m \mid \rho) \prod_{i=1}^{N} \Big[ P(\theta_i \mid \beta)\, P(\lambda_i \mid \omega) \prod_{j=1}^{N_{P_i}} P(z_{ij} \mid \theta_i)\, P(r_{ij} \mid \gamma_{z_{ij}}) \prod_{k=1}^{x_{ij}} P(s_{ijk} \mid \psi)\, P(w_{ijk} \mid s_{ijk}, z_{ij}, \phi, \lambda_i) \Big]$$

where $L$ is the number of complexity levels, $N$ the number of problems in the training corpus, $N_{P_i}$ the size of the solution set for problem $P_i$, and $x_{ij}$ the number of words in solution $x_{ij}$. The use of conjugate priors for the multinomial and binomial distributions allows us to integrate out the $\phi$, $\gamma$, $\lambda$, $\psi$, and $\theta$ distributions. The simplified posterior is then proportional to a product of terms over the resulting count statistics, where $f_{im}$ is the number of times level $m$ is assigned to a solution in problem $i$, $Q_{rm}$ is the number of
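The collapsing step above relies on a standard conjugacy identity: a multinomial parameter with a symmetric Dirichlet prior can be integrated out analytically, leaving a closed-form marginal over its count vector. The minimal Python sketch below illustrates that marginal; the function name and toy counts are ours, for illustration only, and are not taken from the paper.

import math

def log_dirichlet_multinomial(counts, alpha):
    """Log marginal likelihood of a count vector under a multinomial
    whose parameters are integrated out against a symmetric
    Dirichlet(alpha) prior:
      log Gamma(K*a) - log Gamma(N + K*a)
        + sum_k [log Gamma(n_k + a) - log Gamma(a)]
    where K is the number of categories and N = sum_k n_k."""
    K = len(counts)
    N = sum(counts)
    result = math.lgamma(K * alpha) - math.lgamma(N + K * alpha)
    for n_k in counts:
        result += math.lgamma(n_k + alpha) - math.lgamma(alpha)
    return result

# Hypothetical word counts for one complexity level, with a sparse prior.
print(log_dirichlet_multinomial([4, 0, 1, 2], alpha=0.1))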
where 1 indicates equivalent rankings, -1 completely reverse rankings, and 0 independent rankings. Table 2 shows the pairwise inter-annotator agreement, as well as the agreement between each annotator and the original FAQ order. The table shows fair agreement between the annotators, confirming that this is a reasonable task for humans to do. As can be seen, there are some individual differences, with the inter-annotator agreement varying from 0.421 for A-B to 0.625 for A-D.

The last column in Table 2 reports the agreement between our annotator rankings and the original ordering of solutions in the FAQ data. Although there is fair agreement with the FAQ, providing support for its use as a gold standard, the overall τ values are lower compared to inter-annotator agreement. This implies that the ordering may not be strictly increasing in complexity in our dataset, and that our models should allow for some flexibility during learning. Several reasons contributed to disagreements between annotators and with the FAQ ordering, such as the users' expertise, personal preferences, or the nature of the solutions. For instance, annotators disagreed when multiple solutions were of similar complexity. For the first example in Table 1, all annotators agreed perfectly and also matched the FAQ order; for the second example, the annotators disagreed with each other and with the FAQ.

4 Generative Models

In the following we intr
and operating system manufacturers, and other well-managed websites. As a result, the text in the FAQ solutions is of high quality. However, the same is not true for community-generated solution texts on discussion forums. In the latter case, we conjecture that the style of writing is likely to play a bigger role in how users perceive complexity. We thus expect that the benefit of adding Length and SynSem features will become stronger when we apply our models to texts from online forums.

Model               Rank 1   Rank n   Both
Random               15.3     14.6     3.4
  +Length            18.1     16.3     6.8
  +SynSem            19.8     19.8     4.3
  +SynSem+Length     20.6     25.0      -
Position             31.0     37.0    13.7
  +Length            36.2     36.2    18.9
  +SynSem            34.4     35.3    16.3
  +SynSem+Length     36.2     35.3    17.2
Permutation          28.4     37.0    12.0
  +Length            30.1     37.0    15.5
  +SynSem            32.7     37.0    13.7
  +SynSem+Length     32.7     37.9    13.7

Table 7: Model accuracy (%) at predicting the easiest solution correctly (Rank 1), the most difficult one (Rank n), or both. Boldface indicates the best performing model for each rank.

We also computed how accurately the models identify the least and most complex solutions. For solution sets of size 5 and above (so that the task is non-trivial), we computed the number of times the rank-one solution given by the models was also the easiest according to the FAQ gold standard. Likewise, we also computed how often the models correctly predict the most complex solution. With Ran
around this issue, remove the read-only property. For more information about file properties, see "View the properties for a file."
5.1   The system is trying to start from a media device that is not bootable. Remove the media device from the drive.

High complexity
10.0  If you are getting stopped at the CD-KEY or Serial Number verification, verify you are entering your correct number. If you lost your number or key, or it does not work, you will need to contact the developer of the program. Computer Hope will not provide any users with an alternate identification number.
9.9   The network controller is defective. Contact an authorized service provider.
10.0  Network controller interrupt is shared with an expansion board. Under the Computer Setup Advanced menu, change the resource settings for the board.

Table 4: Example solutions with low, medium, and high expected complexity values (position-based model with 10 complexity levels). The solutions come from various problem-solution sets in the training corpus. Expected complexity values are shown in the first column.

can compute the expected complexity for any solution text $x$. This value is given by $\sum_m m \cdot p(m \mid x)$. We estimate the second term $p(m \mid x)$ using (a) the complexity-level language models $\phi_m$ and (b) a prior over levels given by the overall frequency of different levels on the training data. Table 4 presents examples of solution texts from our train
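As a concrete illustration of the expected-complexity computation $\sum_m m \cdot p(m \mid x)$, here is a minimal Python sketch. The data structures and the uniform fallback probability are our own assumptions, standing in for the inferred level language models and whatever smoothing they actually use.

import math

def expected_complexity(solution_words, level_lms, level_prior):
    """E[m | x] = sum_m m * p(m | x), with p(m | x) proportional to
    p(x | phi_m) * p(m). `level_lms` maps each level m to a unigram
    distribution phi_m; `level_prior` gives p(m), e.g., level
    frequencies on the training data (hypothetical inputs)."""
    log_posts = []
    for m, phi_m in level_lms.items():
        ll = sum(math.log(phi_m.get(w, 1e-10)) for w in solution_words)
        log_posts.append((m, ll + math.log(level_prior[m])))
    mx = max(lp for _, lp in log_posts)       # normalize in log space
    weights = {m: math.exp(lp - mx) for m, lp in log_posts}
    Z = sum(weights.values())
    return sum(m * w / Z for m, w in weights.items())

# Toy example: two levels with invented vocabularies and a flat prior.
lms = {1: {"plug": 0.6, "cable": 0.4}, 2: {"driver": 0.5, "registry": 0.5}}
print(expected_complexity(["plug", "cable"], lms, {1: 0.5, 2: 0.5}))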
in web-archived data can be challenging, given the lack of appropriate search and browsing facilities.

Table 1 shows examples typical of the problems and proposed solutions found in troubleshooting-oriented online forums. The first problem concerns a shaky monitor and has three solutions with increasing degrees of complexity. Solution 1 is probably easiest to implement in terms of user time, effort, and expertise; solution 3 is most complex, i.e., the user should understand what signal timing is and then try to establish whether it is within the specification of the monitor, whereas solution 2 is somewhere in between.

The screen is shaking.
1. Move all objects that emit a magnetic field, such as a motor or transformer, away from the monitor.
2. Check if the specified voltage is applied.
3. Check if the signal timing of the computer system is within the specification of the monitor.

"Illegal Operation has Occurred" error message is displayed.
1. Software being used is not Microsoft certified for your version of Windows. Verify that the software is certified by Microsoft for your version of Windows (see program packaging for this information).
2. Configuration files are corrupt. If possible, save all data, close all programs, and restart the computer.

Table 1: Example problems and solutions taken from online troubleshooting-oriented forums.

In most cases the solutions are not organized in any
ing task, using 10-fold cross-validation and SVM ranking models with different features. The results are broken down by solution set size; the number of sets per size is shown within parentheses. Boldface indicates the best performing model for each set size.

SynSem features, which measure the complexity of writing in the text, are somewhat better, but still inferior compared to Position and Permutation. Using a paired Wilcoxon signed-rank test, we compared the τ values obtained by the different models. Position and Permutation performed significantly better (p < 0.05) compared to the Random, Length, and SynSem baselines. However, τ differences between Position and Permutation are not statistically significant. With regard to feature combinations, we observe that both models yield better performance when combined with Length or SynSem. The Position model improves with the addition of both Length and SynSem, whereas the Permutation model combines best with SynSem features. The Position+SynSem+Length model is significantly better than Permutation (p < 0.05), but not than Permutation+SynSem+Length or Position alone, again under the Wilcoxon test.

These results suggest that the solution ordering task is challenging, with several factors influencing how a solution is perceived: the words used and their meaning, the writing style of the solution, and the amount of detail present in it. Our data comes from the FAQs produced by computer
levels in the

Corpus level: For each complexity level $d_m$, $1 \le m \le L$:
  Draw a complexity vocabulary distribution $\phi_m \sim \mathrm{Dirichlet}(\alpha)$
  Draw a distribution over positions $\gamma_m \sim \mathrm{Dirichlet}(\rho)$
Draw a distribution $\psi$ for the proportion of complexity versus problem-specific vocabulary: $\psi \sim \mathrm{Beta}(\delta_0, \delta_1)$

Solution set level: For each solution set $Q_i$ in the corpus, $1 \le i \le N$:
  Draw a distribution over the complexity levels $\theta_i \sim \mathrm{Dirichlet}(\beta)$
  Draw a problem-specific vocabulary distribution $\lambda_i \sim \mathrm{Dirichlet}(\omega)$

Individual solution level: For each solution $x_{ij}$ in $Q_i$, $1 \le j \le N_{P_i}$:
  Draw a complexity level assignment $z_{ij} \sim \mathrm{Multinomial}(\theta_i)$
  Draw a position depending on the level assigned: $r_{ij} \sim \mathrm{Multinomial}(\gamma_{z_{ij}})$

Word level: For each word $w_{ijk}$ in solution $x_{ij}$:
  Draw a switch value to indicate if the word is problem- or complexity-specific: $s_{ijk} \sim \mathrm{Binomial}(\psi)$
  If $s_{ijk} = 0$, draw $w_{ijk} \sim \mathrm{Multinomial}(\phi_{z_{ij}})$
  If $s_{ijk} = 1$, draw $w_{ijk} \sim \mathrm{Multinomial}(\lambda_i)$

Figure 2: Generative process for the Position Model.

generative process itself. The inference process of this model allows us to directly uncover a canonical ordering of the levels which explains the training data.

4.1 Expected Position Model

This model infers the vocabulary associated with a complexity level, and a distribution over the numerical positions in a solution set where such a complexity level is likely to occur. After inference, the model uses the position dis
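To make the generative story concrete, the following numpy sketch draws a synthetic solution set by ancestral sampling from the process in Figure 2. All sizes and hyperparameter values are invented for illustration; only the structure mirrors the model.

import numpy as np

rng = np.random.default_rng(0)

L, V, G = 10, 2400, 16          # levels, vocabulary size, max position (assumed)
alpha, rho, beta, omega = 0.1, 0.1, 1.5, 0.1
delta0, delta1 = 2.0, 2.0

# Corpus level.
phi = rng.dirichlet([alpha] * V, size=L)    # phi_m: level vocabularies
gamma = rng.dirichlet([rho] * G, size=L)    # gamma_m: position distributions
psi = rng.beta(delta0, delta1)              # switch proportion

def sample_solution_set(n_solutions, n_words=20):
    """Ancestral sample of one problem's solution set (word ids only)."""
    theta = rng.dirichlet([beta] * L)       # theta_i: level proportions
    lam = rng.dirichlet([omega] * V)        # lambda_i: problem vocabulary
    solutions = []
    for _ in range(n_solutions):
        z = rng.choice(L, p=theta)          # complexity level z_ij
        r = rng.choice(G, p=gamma[z])       # position r_ij
        words = []
        for _ in range(n_words):
            s = int(rng.random() < psi)     # 1 = problem-specific word
            dist = lam if s == 1 else phi[z]
            words.append(rng.choice(V, p=dist))
        solutions.append((z, r, words))
    return solutions

print(sample_solution_set(3)[0][:2])        # (level, position) of one solution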
pages 371–379.
J. Eisenstein, J. Clarke, D. Goldwasser, and D. Roth. 2009. Reading to learn: Constructing features from semantic abstracts. In Proceedings of EMNLP, pages 958–967.
M. Elsner, J. Austerweil, and E. Charniak. 2007. A unified local and global model for discourse coherence. In Proceedings of NAACL-HLT, pages 436–443.
M. A. Fligner and J. S. Verducci. 1986. Distance based ranking models. Journal of the Royal Statistical Society, Series B, pages 359–369.
D. Heckerman, J. S. Breese, and K. Rommelse. 1995. Decision-theoretic troubleshooting. Communications of the ACM, 38(3):49–57.
T. Joachims. 2006. Training linear SVMs in linear time. In Proceedings of KDD, pages 217–226.
S. Kim, L. Cavedon, and T. Baldwin. 2010. Classifying dialogue acts in one-on-one live chats. In Proceedings of EMNLP, pages 862–871.
M. Lapata. 2003. Probabilistic text structuring: Experiments with sentence ordering. In Proceedings of ACL, pages 545–552.
M. Lapata. 2006. Automatic evaluation of information ordering. Computational Linguistics, 32(4):471–484.
M. Lui and T. Baldwin. 2009. Classifying user forum participants: Separating the gurus from the hacks, and other tales of the internet. In Proceedings of the 2010 Australasian Language Technology Workshop, pages 49–57.
N. Madnani, R. Passonneau, N. Ayan, J. Conroy, B. Dorr, J. Klavans, D. O'Leary, and J. Schlesinger. 2007. Measuring variability in sentence ordering for news summarization.
particular fashion, neither in terms of content nor complexity.

In this paper we present models to automatically predict the complexity of troubleshooting solutions, which we argue could improve user experience and potentially help solve the problem faster, e.g., by prioritizing easier solutions. Automatically structuring solutions according to complexity could also facilitate search through large archives of solutions, or serve as a summarization tool. From a linguistic perspective, learning how complexity is verbalized can be viewed as an instance of grounded language acquisition: solutions direct users to carry out certain actions, e.g., on their computers or devices, and complexity is an attribute of these actions. Information access systems incorporating a notion of complexity would make it possible to take user intentions into account, and how these translate into natural language. Current summarization and information retrieval methods are agnostic of such types of text semantics. Moreover, the models presented here could be used for analyzing collaborative problem solving and its social networks. Characterizing the content of discussion forums by their complexity can provide additional cues for identifying user authority, and whether there is a need for expert intervention.

We begin by validating that the task is indeed meaningful, and that humans perceive varying degrees of complexity when reading troubleshooting solutions. We
times a solution with position $r$ is given complexity $m$ over the full corpus. Positions are integer values between 1 and $G$. $T^0$ and $T^1$ count the number of switch assignments of value 0 (complexity-related word) and 1 (technical word), respectively, in the corpus. $T^0_{mv}$ is a refined count of the number of times word type $v$ is assigned switch value 0 in a solution of complexity $m$; $T^1_{iv}$ counts the number of times switch value 1 is given to word type $v$ in a solution set for problem $P_i$.

We sample from this posterior using a collapsed Gibbs sampling algorithm. The sampling sequence starts with a random initialization of the hidden variables. During each iteration, the sampler chooses a complexity level for each solution based on the current assignments to all other variables. Then the switch values for the words in each solution are sampled one by one. The hyperparameters are tuned using grid search on development data. The language model concentrations $\alpha$, $\rho$, and $\omega$ are given values less than 1 to obtain sparse distributions. The prior $\beta$ on $\theta$, the topic proportions, is chosen to be greater than 1, to encourage different complexity levels to be used within the same problem rather than assigning all solutions to the same one or two levels. Similarly, $\delta_0$ and $\delta_1$ are > 1. We run 5,000 sampling iterations and use the last sample as a draw from the posterior. Using these assignments, we also compute an estimate for the parameters $\phi_m$, $\lambda_i$, and $\gamma_m$. For ex
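A sketch of the per-solution level update implied by these counts is shown below. It is our own simplification, not the authors' code: switch values are held fixed, and the word term ignores within-solution count increments, so it approximates the exact collapsed conditional.

import numpy as np

def resample_level(words, r, level_word_counts, level_pos_counts,
                   level_counts_in_problem, alpha, rho, beta, rng):
    """One collapsed-Gibbs update for a solution's level z_ij, with the
    solution's own counts assumed already subtracted. Combines three
    predictive terms: level proportions in the problem (beta prior),
    the level's position distribution (rho prior), and the level's
    word distribution (alpha prior)."""
    L, V = level_word_counts.shape
    G = level_pos_counts.shape[1]
    log_p = np.zeros(L)
    for m in range(L):
        log_p[m] += np.log(level_counts_in_problem[m] + beta)
        log_p[m] += np.log((level_pos_counts[m, r] + rho) /
                           (level_pos_counts[m].sum() + G * rho))
        # Approximate word term (complexity-switched words only).
        for w in words:
            log_p[m] += np.log((level_word_counts[m, w] + alpha) /
                               (level_word_counts[m].sum() + V * alpha))
    p = np.exp(log_p - log_p.max())
    return rng.choice(L, p=p / p.sum())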
In Proceedings of the Eleventh European Workshop on Natural Language Generation.
C. L. Mallows. 1957. Non-null ranking models I. Biometrika, 44(1/2):114–130.
C. D. Manning, M. Surdeanu, J. Bauer, J. Finkel, S. J. Bethard, and D. McClosky. 2014. The Stanford CoreNLP natural language processing toolkit. In Proceedings of ACL: System Demonstrations, pages 55–60.
D. S. McNamara, A. C. Graesser, P. M. McCarthy, and Z. Cai. 2014. Automated evaluation of text and discourse with Coh-Metrix. Cambridge University Press.
G. A. Miller. 1995. WordNet: a lexical database for English. Communications of the ACM, 38(11):39–41.
S. Schwarm and M. Ostendorf. 2005. Reading level assessment using support vector machines and statistical language models. In Proceedings of ACL, pages 523–530.
A. Vogel and D. Jurafsky. 2010. Learning to follow navigational directions. In Proceedings of ACL, pages 806–814.
L. Wang, M. Lui, S. Kim, J. Nivre, and T. Baldwin. 2011. Predicting thread discourse structure over technical web forums. In Proceedings of EMNLP, pages 13–25.
The GMM decomposes $d(\pi, \sigma)$ in a way that captures item-specific distance. This is done by computing an inversion vector representation of $d(\pi, \sigma)$. A permutation $\pi$ of $n$ items can be equivalently represented by a vector of inversion counts $v$ of length $n-1$, where each component $v_j$ equals the number of items $j' > j$ that occur before item $j$ in $\pi$. The dimension of $v$ is $n-1$, since there can be no items greater than the highest-value element. A unique inversion vector can be computed for any permutation (and vice versa), and the sum of the inversion vector elements is equal to $d(\pi, \sigma)$. Each $v_j$ is also given a separate dispersion penalty $\rho_j$. Then the GMM is defined as

$$\mathrm{GMM}(v; \rho) = \frac{e^{-\sum_j \rho_j v_j}}{Z(\rho)}$$

where $Z(\rho)$ is a normalization factor.

Corpus level: For each complexity level $d_m$, $1 \le m \le L$:
  Draw a complexity vocabulary distribution $\phi_m \sim \mathrm{Dirichlet}(\alpha)$
Draw the $L-1$ dispersion parameters $\rho_j \sim \mathrm{GMM}_0(\rho \mid \rho_0, \nu_0)$
Draw a distribution $\psi$ for the proportion of complexity versus problem-specific vocabulary: $\psi \sim \mathrm{Beta}(\delta_0, \delta_1)$

Solution set level: For each solution set $Q_i$ in the corpus, $1 \le i \le N$:
  Draw a distribution over the complexity levels $\theta_i \sim \mathrm{Dirichlet}(\beta)$
  Draw a problem-specific vocabulary distribution $\lambda_i \sim \mathrm{Dirichlet}(\omega)$
  Draw a bag of $N_{P_i}$ levels for the solution set: $b_i \sim \mathrm{Multinomial}(\theta_i)$
  Draw an inversion vector $v_i \sim \mathrm{GMM}(\rho)$
  Compute the permutation $\pi_i$ of levels using $v_i$
  Compute level assignments $z_i$ using $\pi_i$ and $b_i$
  Assign level $z_{ij}$ to $x_{ij}$ f
Which Step Do I Take First? Troubleshooting with Bayesian Models

Annie Louis and Mirella Lapata
School of Informatics, University of Edinburgh
10 Crichton Street, Edinburgh EH8 9AB
{alouis, mlap}@inf.ed.ac.uk

Abstract

Online discussion forums and community question-answering websites provide one of the primary avenues for online users to share information. In this paper, we propose text mining techniques which aid users to navigate troubleshooting-oriented data, such as questions asked on forums and their suggested solutions. We introduce Bayesian generative models of the troubleshooting data and apply them to two interrelated tasks: (a) predicting the complexity of the solutions (e.g., plugging a keyboard into the computer is easier compared to installing a special driver), and (b) presenting them in a ranked order from least to most complex. Experimental results show that our models are on par with human performance on these tasks, while outperforming baselines based on solution length or readability.

1 Introduction

Online forums and discussion boards have created novel ways for discovering, sharing, and distributing information. Users typically post their questions or problems and obtain possible solutions from other users. Through this simple mechanism of community-based question answering, it is possible to find answers to personal, open-ended, or highly specialized questions. However, navigating the information available
also show experimentally that users agree in their intuitions about the relative complexity of different solutions to the same problem. We define complexity as an aggregate notion of the time, expertise, and money required to implement a solution. We next model the complexity prediction task following a Bayesian approach. Specifically, we learn to assign complexity levels to solutions based on their linguistic makeup. We leverage weak supervision in the form of lists of solutions to different problems, approximately ordered from low to high complexity (see Table 1). We assume that the data is generated from a fixed number of discrete complexity levels. Each level has a probability distribution over the vocabulary, and there is a canonical ordering between levels indicating their relative complexity. During inference, we recover the vocabularies of the complexity levels and the ordering of levels that explains the solutions and their attested sequences in the training data.

We explore two Bayesian models, differing in how they learn an ordering among complexity levels. The first model is local: it assigns an expected position in any list of solutions to each complexity level, and orders the levels based on this expected position value. The second model is global: it defines probabilities over permutations of complexity levels and directly uncovers a consensus ordering from the training data. We evaluate our models on a so
ample, the probability of a word $v$ in $\phi_m$ is computed as

$$\phi_{mv} = \frac{T^0_{mv} + \alpha}{\sum_{v'} T^0_{mv'} + V\alpha}$$

After inference, we obtain probability distributions for each complexity level $d_m$ over the vocabulary ($\phi_m$) and positions ($\gamma_m$). We compute the expected position $E_p(d_m)$ of $d_m$ as

$$E_p(d_m) = \sum_{pos=1}^{G} pos \cdot \gamma_m(pos)$$

where $pos$ indicates position values. Then we rank the levels in increasing order of $E_p$.

4.2 Permutations-based Model

In our second model, we incorporate the ordering of complexity levels in the generative process itself. This is achieved by using the Generalized Mallows Model (GMM; Fligner and Verducci, 1986) within our hierarchical generative process. The GMM is a probabilistic model over permutations of items and is frequently used to learn a consensus ordering given a set of different rankings. It assumes there is an underlying canonical order of items, and concentrates probability mass on those permutations that differ from the canonical order by a small amount, while assigning lesser probability to very divergent permutations. Probabilistic inference in this model uncovers the canonical ordering.

The standard Mallows model has two parameters: a canonical ordering $\sigma$ and a dispersion penalty $\rho > 0$. The probability of an observed ordering $\pi$ is defined as (Mallows, 1957):

$$\mathrm{Mallows}(\pi; \rho, \sigma) = \frac{e^{-\rho \, d(\pi, \sigma)}}{Z(\rho)}$$

where $d(\pi, \sigma)$ is a distance measure, such as Kendall's τ, between the canonical ordering $\sigma$ and an observed ordering $\pi$. The G
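The expected-position ranking described above can be stated in a few lines. In this sketch, gamma is a stand-in for the inferred position distributions, and the numbers are invented for illustration.

import numpy as np

def rank_levels_by_expected_position(gamma):
    """gamma: (L, G) array where gamma[m, pos] is the probability that
    a solution with complexity level m appears at position pos+1.
    Returns levels sorted by E_p(d_m), i.e., from lowest to highest
    inferred complexity, along with the expected positions."""
    positions = np.arange(1, gamma.shape[1] + 1)
    expected_pos = gamma @ positions   # E_p(d_m) = sum_pos pos * gamma_m(pos)
    return np.argsort(expected_pos), expected_pos

# Toy example with 3 levels and 4 positions.
gamma = np.array([[0.7, 0.2, 0.1, 0.0],
                  [0.1, 0.2, 0.4, 0.3],
                  [0.2, 0.5, 0.2, 0.1]])
order, ep = rank_levels_by_expected_position(gamma)
print(order, ep)   # level 0 ranks first (lowest expected position)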
ange make least more operate than point less cause slowly access may problem
Level 2:  use can media try center signal network make file sure when case with change this setting type remove
Level 19: system file this restore do can then use hard issue NUM will disk start step above run cleanup drive xp
Level 20: registry restore may virus use bio setting scanreg first ensure can page about find install additional we utility

Table 3: Most likely words assigned to varying complexity levels by the permutations-based model.

randomly initializes the hidden variables. For a chosen solution set $S_i$, the sampler draws the $N_{P_i}$ levels $b_i$ one at a time, conditioned on the assignments to all other hidden variables of the model. Then the inversion vector $v_i$ is created by sampling each $v_{ij}$ in turn. At this point, the complexity level assignments $z_i$ can be computed deterministically, given $b_i$ and $v_i$. Then the words in each solution set are sampled one at a time. For the dispersion parameters $\rho$, the normalization constant of the conjugate prior is not known; we sample from the unnormalized $\mathrm{GMM}_0$ distribution using slice sampling.

Other hyperparameters of the model are tuned using development data. The language model Dirichlet concentrations $\alpha$ and $\omega$ are chosen to encourage sparsity, and $\beta > 1$ as in the Position model. We run the Gibbs sampler for 5,000 iterations; t
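Because the conjugate prior for the dispersion parameters is known only up to its normalization constant, a generic univariate slice sampler suffices for this step. The sketch below, with a made-up target density, shows the kind of routine involved; it is not the authors' implementation.

import math, random

def slice_sample(log_density, x0, width=1.0, n_steps=100):
    """Univariate slice sampler (stepping-out and shrinkage).
    `log_density` is an unnormalized log density; x0 the current value."""
    x = x0
    for _ in range(n_steps):
        log_y = log_density(x) + math.log(random.random())  # slice height
        left = x - width * random.random()                   # step out
        right = left + width
        while log_density(left) > log_y:
            left -= width
        while log_density(right) > log_y:
            right += width
        while True:                                          # shrink
            x_new = random.uniform(left, right)
            if log_density(x_new) > log_y:
                x = x_new
                break
            if x_new < x:
                left = x_new
            else:
                right = x_new
    return x

# Example: rho > 0 under an unnormalized Gamma(2, 1) log density.
target = lambda r: (math.log(r) - r) if r > 0 else -math.inf
print(slice_sample(target, x0=1.0))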
aset, the solution set size varies between 2 and 16, and the average is 4.61. Figure 1 illustrates the histogram of solution set sizes found in our corpus. We only considered problems which have no fewer than two solutions. All words in the corpus were lemmatized, and html links and numbers were replaced with placeholders. The resulting vocabulary was approximately 2,400 word types (62,152 tokens).

Note that our dataset provides only weak supervision for learning. The relative complexity of solutions for the same problem is observed; however, the relative complexity of solutions across different problems is unknown. For example, a hardware issue may generally receive highly complex solutions, whereas a microphone issue mostly simple ones.

3.2 Task Validation

In this section we detail an annotation experiment where we asked human judges to rank the randomly permuted contents of a solution set according to perceived complexity. We performed this study for two reasons: firstly, to ascertain that participants are able to distinguish degrees of complexity and agree on the complexity level of a solution; secondly, to examine whether the ordering produced by participants corresponds to the gold-standard FAQ order of the solutions. If true, this would support our hy

       B      C      D      FAQ
A    0.421  0.525  0.625  0.465
B           0.436  0.434  0.252
C                  0.584  0.303
D                         0.402

Table 2: Correlation matrix for annotators A-D and orig
d, contains no special facility for grouping solutions with the same complexity. In sum, Position can more flexibly assign complexity levels to individual solutions.

6 Conclusion

This work presents a first proposal to organize and navigate crowd-generated troubleshooting data according to the complexity of the troubleshooting action. We showed that users perceive and agree on the complexity of alternative suggestions, and presented Bayesian generative models of the troubleshooting data which can sort solutions by complexity with a performance close to human agreement on the task. Our results suggest that search and summarization tools for troubleshooting forum archives can be greatly improved by automatically predicting and using the complexity of the posted solutions. It should also be possible to build broad-coverage automated troubleshooting systems by bootstrapping from conversations in discussion forums. In the future, we plan to deploy our models in several tasks, such as user authority prediction, expert intervention, and thread analysis. Furthermore, we aim to specialize our models to include category-specific complexity levels, and also to explore options for personalizing rankings for individual users, based on their knowledge of a topic and the history of their troubleshooting actions.

Acknowledgements

We would like to thank the editor and the anonymous reviewers for their valuable feedback on an earlier draft of t
dom rankings, the easiest and most difficult solutions are predicted correctly 15% of the time; getting both correct for a single problem happens only 3% of the time. The Position+Length model overall performs best, identifying the easiest and most difficult solution 36% of the time; both types of solution are identified correctly 18% of the time. Interestingly, the generative models are better at predicting the most difficult solution (35-37%) compared to the easiest one (28-36%). One reason for this could be that there are multiple easy solutions to try out, but the most difficult one is probably more unique, and so easier to identify.

Overall, we observe that the two generative models perform comparably, with Position having a slight lead over Permutation. A key difference between the models is that during training, Permutation observes the full ordering of solutions, while Position observes solutions coming from a few normalized position bins. Also note that in the Permutation model, multiple solutions with the same complexity level are grouped together in a solution set. This property of the model is advantageous for ordering, as solutions with similar complexity should be placed adjacent to each other. At the same time, if levels 1 and 2 are flipped in the permutation sampled from the GMM, then any solution with complexity level 1 will be ordered after the solution with complexity 2. The Position model, on the other han
ecific ordering sequences. Specifically, we can compute the expected complexity of the solution set for problem $P_i$ using the inferred distribution over levels $\theta_i$: $\sum_m m \cdot \theta_{im}$. Table 5 shows the complexity of different problems as predicted by the position model with 10 levels. As can be seen, easy problems are associated with accessory components (e.g., mouse or keyboard), whereas complex problems are related to core hardware and operating system errors.

5 Evaluation Experiments

In the previous section we showed how our models can assign an expected complexity value to a solution text or an entire problem. We now present evaluations based on the models' ability to order solutions according to relative complexity.

5.1 Solution Ordering Task

We evaluated our models by presenting them with a randomly permuted set of solutions to a problem, and examining the accuracy with which they reorder them from least to most complex. At first instance, it would seem relatively straightforward to search for the sequence of solutions which has the highest likelihood under the models. Unfortunately, there are two problems with this approach. Firstly, the likelihood under our models is intractable to compute, so we would need to adopt a simpler and less precise approximation, such as the Hidden Markov Model discussed below. Secondly, when the solution set size is large, we cannot enumerate all permutations and need to adopt an approximate search pr
he dispersion parameters are resampled every 10 iterations. The last sample is used as a draw from the posterior.

4.3 Model Output

In this section we present examples of the complexity assignments created by our models. Table 3 shows the output of the permutations-based model with 20 levels. Each row contains the highest-probability words in a single level, from the distribution $\phi_m$. For the sake of brevity, we only show the two least and two most complex levels. In general, we observe more specialized technical terms in higher levels (e.g., restore, scanreg, registry), which one would expect to correlate with complex solutions. Also note that higher levels contain uncertainty-denoting words (e.g., can, find, may), which again are indicative of increasing complexity.

Using these complexity vocabularies, our models

Low complexity
1.0  Cable(s) of new external device are loose or power cables are unplugged. Ensure that all cables are properly and securely connected, and that pins in the cable or connector are not bent down.
1.0  Make sure your computer has at least 50MB of free hard drive space. If your computer has less than 50MB free, it may cause the computer to operate more slowly.
1.0  If the iPhone is in a protective case, remove it from the case. If there is a protective film on the display, remove the film.

Medium complexity
5.7  Choose a lower video setting for your imported video.
5.5  The file property is set to read-only. To work
his paper. We are also thankful to members of the Probabilistic Models of Language reading group at the University of Edinburgh for their many suggestions and helpful discussions. The first author was supported by a Newton International Fellowship (NF120479) from the Royal Society and the British Academy.

References

R. Barzilay and M. Lapata. 2008. Modeling local coherence: An entity-based approach. Computational Linguistics, 34(1):1–34.
R. Barzilay and L. Lee. 2004. Catching the drift: Probabilistic content models with applications to generation and summarization. In Proceedings of NAACL-HLT, pages 113–120.
D. Bollegala, N. Okazaki, and M. Ishizuka. 2006. A bottom-up approach to sentence ordering for multi-document summarization. In Proceedings of COLING-ACL, pages 385–392.
S. R. K. Branavan, H. Chen, L. S. Zettlemoyer, and R. Barzilay. 2009. Reinforcement learning for mapping instructions to actions. In Proceedings of ACL-IJCNLP, pages 82–90.
S. R. K. Branavan, D. Silver, and R. Barzilay. 2011. Learning to win by reading manuals in a Monte-Carlo framework. In Proceedings of ACL-HLT, pages 268–277.
H. Chen, S. R. K. Branavan, R. Barzilay, and D. R. Karger. 2009a. Content modeling using latent permutations. Journal of Artificial Intelligence Research, 36(1):129–163.
H. Chen, S. R. K. Branavan, R. Barzilay, and D. R. Karger. 2009b. Global models of document structure using latent permutations. In Proceedings of NAACL-HLT,
ic features (SynSem), and their combinations (denoted via +). We also report the performance of a baseline which computes a random permutation for each solution set; Random results are averaged over five runs. We show results for all solution sets (All) and broken down into different set sizes.

As can be seen, the expected position model obtains an overall τ of 0.30, and the permutation model of 0.26. These values lie in the range of human annotator agreement with the FAQ order (see Section 3.2). In addition, we find that the models perform consistently across solution set sizes, with even higher correlations on longer sequences, where our methods are likely to be more useful. Position and Permutation outperform Random rankings and a model based solely on Length features. The

                     Solution set sizes
Model              2-3 (133)  4-5 (81)  >5 (86)  All (300)
Random               0.002      0.027    0.006     0.004
Length               0.052      0.066    0.028     0.002
SynSem               0.167      0.109    0.149     0.146
Length+SynSem        0.122      0.109    0.158     0.129
Position             0.288      0.242    0.381     0.302
  +Length            0.283      0.251    0.363     0.297
  +SynSem            0.343      0.263    0.367     0.328
  +SynSem+Length     0.348      0.263    0.368     0.331
Permutation          0.253      0.206    0.341     0.265
  +Length            0.213      0.229    0.356     0.258
  +SynSem            0.268      0.263    0.353     0.291
  +SynSem+Length     0.253      0.254    0.354     0.282

Table 6: Kendall's τ values on the solution reorder
inal FAQ order, using Kendall's τ; values are averages over 100 problem-solution sets.

pothesis that the solutions in our FAQ corpus are frequently presented according to complexity, and that this ordering is reasonable supervision for our models.

Method. We randomly sampled 100 solution sets (their sizes vary between 2 and 16) from the FAQ corpus described in the previous section, and randomly permuted the contents of each set. Four annotators (one an author of this paper, and three graduate and undergraduate students in Computer Science) were asked to order the solutions in each set from easy to most complex. An easier solution was defined as one which takes less time or effort to carry out by a user. The annotators saw a list of solutions for the same problem on a web interface and assigned a rank to each solution to create an order. No ties were allowed, and a complete ordering was required, to keep the annotation simple. The annotators were fluent English speakers and had some knowledge of computer hardware and software. We refrained from including novice users in our study, as they are likely to have very different personal preferences, resulting in more divergent rankings.

Results. We measured inter-annotator agreement using Kendall's τ, a metric of rank correlation which has been reliably used in information ordering evaluations (Lapata, 2006; Bollegala et al., 2006; Madnani et al., 2007). τ ranges between -1 and 1,
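For reference, Kendall's τ for two tie-free rankings can be computed directly from concordant and discordant item pairs. This small sketch is our own illustration, not the evaluation script used in the paper.

from itertools import combinations

def kendall_tau(rank_a, rank_b):
    """Kendall's tau between two rankings of the same items; rank_a[i]
    and rank_b[i] are the ranks given to item i. Returns 1 for identical
    rankings, -1 for reversed ones, and values near 0 for independent
    rankings (no ties, as in the annotation study)."""
    n = len(rank_a)
    concordant = discordant = 0
    for i, j in combinations(range(n), 2):
        sign = (rank_a[i] - rank_a[j]) * (rank_b[i] - rank_b[j])
        if sign > 0:
            concordant += 1
        else:
            discordant += 1
    return (concordant - discordant) / (n * (n - 1) / 2)

print(kendall_tau([1, 2, 3, 4], [1, 2, 4, 3]))  # 0.666...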
ing data and their expected complexity under the position model. We find that the model is able to distinguish intuitively complex solutions from simpler ones. Aside from measuring expected complexity in absolute terms, our models can also order solutions in terms of relative complexity (see the evaluation in Section 5) and assign a complexity value to a problem as a whole.

Low Complexity Problems
- Computer appears locked up and will not turn off when the power button is pressed.
- A USB device (headphone or microphone) is not recognized by the computer.
- Computer will not respond to USB keyboard or mouse.

High Complexity Problems
- Game software and driver issues.
- Incorrect, missing, or stale visible networks.
- I get an error message that says that there is not enough disk space to publish the movie. What can I do?
- Power LED flashes red four times (once every second), followed by a two-second pause, and the computer beeps four times.

Table 5: Least and most complex problems based on the expected complexity of their solution set. Problems are shown with complexity 1-3 (top) and 8-9 (bottom), using the position-based model.

As mentioned earlier, our models only observe the relative ordering of solutions to individual problems; the relative complexity of two solutions from different problems is not known. Nevertheless, the models are able to rate solutions on a global scale, while accommodating problem-sp
irichlet priors. Next, we generate an ordering for the complexity levels. We draw $N_{P_i}$ complexity levels from $\theta_i$, one for each solution in the set. Let $b_i$ denote this bag of levels, e.g., $b_i = \{1, 1, 2, 3, 4, 4, 4\}$, assuming 4 complexity levels and 7 solutions for a particular problem. We also draw an inversion vector $v_i$ from the GMM distribution, which advantageously allows for small differences from the canonical order. The $z_i$ assignments are deterministically computed by ordering the elements of $b_i$ according to the permutation defined by $v_i$.

Given the conditional independencies of our model, the posterior is proportional to

$$\prod_{m=1}^{L} P(\phi_m \mid \alpha) \times \prod_{j=1}^{L-1} P(\rho_j \mid \rho_0, \nu_0) \times P(\psi \mid \delta_0, \delta_1) \times \prod_{i=1}^{N} \Big[ P(\theta_i \mid \beta)\, P(\lambda_i \mid \omega)\, P(v_i \mid \rho)\, P(b_i \mid \theta_i) \prod_{j,k} P(s_{ijk} \mid \psi)\, P(w_{ijk} \mid s_{ijk}, z_{ij}, \phi, \lambda_i) \Big]$$

where $L$ is the number of complexity levels, $N$ the total number of problems in the training corpus, $N_{P_i}$ the size of the solution set for problem $P_i$, and $x_{ij}$ the number of words in solution $x_{ij}$. A simplified posterior can be obtained by integrating out the $\phi$, $\psi$, $\lambda$, and $\theta$ distributions; it is proportional to a product of count-based terms, multiplied by the prior terms $\prod_{j=1}^{L-1} \mathrm{GMM}_0(\rho_j \mid \rho_0, \nu_0)$, where the R and T counts are defined similarly as in the Expected Position model.

We use collapsed Gibbs sampling to compute samples from this posterior. The sampling sequence

Level 1 (low complexity): free space hard drive sure r
lution ordering task, where the goal is to rank solutions from least to most complex. We show that a supervised ranking approach using features based on the predictions of our generative models is on par with human performance on this task, while outperforming competitive baselines based on the length and readability of the solution text.

2 Related Work

There is a long tradition of research on decision-theoretic troubleshooting, where the aim is to find a cost-efficient repair strategy for a malfunctioning device (Heckerman et al., 1995). Typically, a diagnostic procedure (i.e., a planner) is developed that determines the next best troubleshooting step by estimating the expected cost of repair for various plans. Costs are specified by domain experts and are usually defined in terms of time and/or money incurred by carrying out a particular repair action. Our notion of complexity is conceptually similar to the cost of an action; however, we learn to predict complexity levels rather than calibrate them manually. Also note that our troubleshooting task is not device-specific: our models learn from troubleshooting-oriented data without any restrictions on the problems being solved.

Previous work on web-based user support has mostly focused on thread analysis. The idea is to model the content structure of forum threads by analyzing the requests for information and suggested solutions in the thread data (Kim et al., 2010). Examples of
ocedure. We opted for a discriminative ranking approach instead, which uses the generative models to compute a rich set of features. This choice allows us to simultaneously obtain features tapping into different aspects learned by the models, and to use well-defined objective functions. Below we briefly describe the features based on our generative models. We also present additional features used to create baselines for system comparison.

Likelihood. We created a Hidden Markov Model based on the sample from the posterior of our models (for a similar HMM approximation of a Bayesian model, see Elsner et al., 2007). For our model, the HMM has $L$ states, and each state $s_m$ corresponds to a complexity level $d_m$. We used the complexity language models $\phi_m$ estimated from the posterior as the emission probability distributions for the corresponding states. The transition probabilities of the HMM were computed based on the complexity level assignments for the training solution sequences in our posterior sample. The probability of transitioning to state $s_i$ from state $s_j$, $p(s_i \mid s_j)$, is the conditional probability $c(d_i, d_j) / c(d_j)$, where $c(d_i, d_j)$ is the number of times the complexity level $d_i$ is assigned to a solution immediately following a solution which was given complexity $d_j$, and $c(d_j)$ is the number of times complexity level $d_j$ is assigned overall in the training corpus. We perform Laplace smoothing to avoid zero-probability transitions between states, with the condi
oduce two Bayesian topic models for the complexity prediction and ranking task. In these models, complexity is captured through a discrete set $D$ of $L$ levels, and a total ordering between the levels reflects their relative complexity. In other words, $D = \{d_1, d_2, \ldots, d_L\}$, where $d_1$ is the easiest level and $D(d_m) < D(d_{m+1})$. Each complexity level is parametrized by a unigram language model which captures words likely to occur in solutions with that level.

Our two models are broadly similar. Their generative process assigns a complexity level from $D$ to each solution, such that it explains the words in the solution and also the ordering of solutions within each solution set. Words are generated for each solution by mixing problem-specific words with solution-specific, and hence complexity-related, ones. Also, each problem has its own distribution over complexity levels, which allows some problems to have more complex solutions on average, some a mix of high- and low-complexity solutions, and others predominantly easier solutions.

The main difference between the two models is in the way they capture the ordering between levels. Our first model infers a distribution for each level over the positions at which a solution with that complexity can occur, and uses this distribution to order the levels; levels which on average occur at greater positions have higher complexity. The second model defines probabilities over orderings of
or $1 \le j \le N_{P_i}$.

Word level: For each word $w_{ijk}$ in solution $x_{ij}$:
  Draw a switch value to indicate if the word is problem- or complexity-specific: $s_{ijk} \sim \mathrm{Binomial}(\psi)$
  If $s_{ijk} = 0$, draw $w_{ijk} \sim \mathrm{Multinomial}(\phi_{z_{ij}})$
  If $s_{ijk} = 1$, draw $w_{ijk} \sim \mathrm{Multinomial}(\lambda_i)$

Figure 3: Generative process for the Permutation Model.

The GMM can be further factorized into item-specific components: $\mathrm{GMM}_j(v_{ij}; \rho_j) \propto e^{-\rho_j v_{ij}}$. Since the GMM is a member of the exponential family, a conjugate prior can be defined for each dispersion parameter $\rho_j$, which allows for efficient inference. We refer the interested reader to Chen et al. (2009a) for details on the prior distribution and normalization factor for the GMM distribution.

Figure 3 formalizes the generative story of our model, which uses the GMM as a component. We assume the canonical order is the strictly increasing $1, 2, \ldots, L$ order. For each complexity level $d_m$, we draw a distribution $\phi_m$ over the vocabulary. We also draw $L-1$ dispersion parameters from the conjugate prior $\mathrm{GMM}_0$ density. Hyperparameters for this prior are set in a similar fashion to Chen et al. (2009a). As in the Position model, we draw a binomial distribution $\psi$ with a Beta prior, balancing complexity-specific versus problem-specific vocabulary. At the solution set level, we draw a multinomial distribution $\lambda_i$ over the vocabulary, and a multinomial distribution $\theta_i$ for the proportion of the $L$ levels for this problem. Both these distributions have D
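The inversion-vector representation used by the GMM is easy to compute. The sketch below, with helper names of our own choosing, converts a permutation to its inversion counts and evaluates the unnormalized GMM log probability.

def inversion_vector(pi):
    """Inversion counts of a permutation `pi` of items 1..n:
    v[j-1] = number of items greater than j appearing before item j
    in pi. The vector has length n-1, and sum(v) equals the Kendall
    distance d(pi, identity)."""
    n = len(pi)
    v = []
    for item in range(1, n):                     # items 1 .. n-1
        pos = pi.index(item)
        v.append(sum(1 for x in pi[:pos] if x > item))
    return v

def gmm_log_prob(v, rho):
    """Unnormalized log GMM probability: -sum_j rho_j * v_j."""
    return -sum(r * x for r, x in zip(rho, v))

pi = [2, 1, 4, 3, 5]
v = inversion_vector(pi)            # [1, 0, 1, 0]
print(v, sum(v))                    # Kendall distance from 1..5 order = 2
print(gmm_log_prob(v, rho=[1.0, 1.0, 1.0, 1.0]))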
roximately ordered from least to most complex, and learn to predict an ordering for new sets of solutions. This setup is related to previous studies on information ordering, where the aim is to learn statistical patterns of document structure which can then be used to order new sentences or paragraphs in a coherent manner. Some approaches approximate the structure of a document via topic and entity sequences, using local dependencies such as conditional probabilities (Lapata, 2003; Barzilay and Lapata, 2008) or Hidden Markov Models (Barzilay and Lee, 2004). More recently, global approaches which directly model the permutations of topics in the document have been proposed (Chen et al., 2009b). Following this line of work, one of our models uses the Generalized Mallows Model (Fligner and Verducci, 1986) in its generative process, which allows it to model permutations of complexity levels in the training data.

3 Problem Formulation

Our aim in this work is to learn models which can automatically reorder solutions to a problem from low to high complexity. Let $G = \{c_1, c_2, \ldots, c_n\}$ be a collection of solutions to a specific problem. We wish to output a list $G' = (c'_1, c'_2, \ldots, c'_n)$ such that $D(c'_j) \le D(c'_{j+1})$, where $D(x)$ refers to the complexity of solution $x$.

3.1 Corpus Collection

As training data, we are given problem-solution sets similar to the examples in Table 1, where the solutions are approximately ordered from lo
s. The development data was used to tune the parameters and hyperparameters of the models, including the number of complexity levels. We experimented with the range 5-20 and found that the best number of levels was 10 for the position model and 20 for the permutation-based model, respectively.

For the expected position model, positions were normalized before training. Let solution $x_{ir}$ denote the $r$-th solution in the solution set for problem $P_i$, where $1 \le r \le N_{P_i}$. We normalize $r$ to a value between 0 and 1 using a min-max method, $r' = (r - 1)/(N_{P_i} - 1)$. Then the $[0, 1]$ range is divided into $k$ bins, and the identity of the bin containing $r'$ is taken as the normalized position. We tuned $k$ experimentally during development and found that $k = 3$ performed best.

For our ordering experiments, we used Joachims' (2006) SVMRank package for training and testing. During training, the classifier learns to minimize the number of swapped pairs of solutions over the training data. We used a linear kernel, and the regularization parameter was tuned using grid search on the development data of each fold. We evaluate how well the model's output agrees with the gold-standard ordering using Kendall's τ.

5.3 Results

Table 6 summarizes our results (average Kendall's τ across folds). We present the results of the discriminative ranker when using a single feature class based on likelihood and expected complexity (Position, Permutation), length, and syntactico-semant
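The position binning just described fits in a few lines. The exact min-max formula is our reconstruction (the source only names the method), and the bin-assignment rule is one reasonable choice among several.

def normalized_position(r, n_solutions, k=3):
    """Min-max normalize a 1-based rank r into [0, 1] and return the
    index of the containing bin. With k=3 (the value found best in
    development), ranks map to bins 0, 1, 2. The formula below is our
    reconstruction of the min-max step; a single-solution set
    degenerates to bin 0."""
    if n_solutions <= 1:
        return 0
    r_norm = (r - 1) / (n_solutions - 1)
    return min(int(r_norm * k), k - 1)

# A 5-solution set under k=3: ranks 1..5 -> bins [0, 0, 1, 2, 2]
print([normalized_position(r, 5) for r in range(1, 6)])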
such analysis include identifying which earlier post(s) a given post responds to, and in what manner, e.g., is it a question, an answer, or a confirmation. Other related work (Lui and Baldwin, 2009) identifies user characteristics in such data, i.e., whether users express themselves clearly, whether they are technically knowledgeable, and so on. Although our work does not address threaded discourse, we analyze the content of troubleshooting data and show that it is possible to predict the complexity levels of suggested solutions from surface lexical cues.

Our work bears some relation to language grounding, the problem of extracting representations of the meaning of natural language tied to the physical world. Mapping instructions to executable actions is an instance of language grounding, with applications to automated troubleshooting (Branavan et al., 2009; Eisenstein et al., 2009), navigation (Vogel and Jurafsky, 2010), and game playing (Branavan et al., 2011). In our work, there is no direct attempt to model the environment or the troubleshooting steps. Rather, we study the language of instructions and how it correlates with the complexity of the implied actions. Our results show that it is possible to predict complexity while being agnostic about the semantics of the domain or the effect of the instructions in the corresponding environment.

Our generative models are trained on existing archives of problems with corresponding solutions, app
tional probability $p(d_i \mid d_j)$ computed as

$$p(d_i \mid d_j) = \frac{c(d_i, d_j) + 1}{c(d_j) + L}$$

This HMM formulation allows us to use efficient dynamic programming to compute the likelihood of a sequence of solutions.

Given a solution set, we compute an ordering as follows. We enumerate all orderings for sets with size less than 6 and select the sequence with the highest likelihood. For larger sizes, we use a simulated annealing search procedure which swaps two adjacent solutions in each step. The temperature was set to 50 initially and gradually reduced to 1; these values were set using minimal tuning on the development data. After estimating the most likely sequence for a solution set, we used the predicted rank of each solution as a feature in our discriminative model.

Expected Complexity. As mentioned earlier, we computed the expected complexity of a solution $x$ as $\sum_m m \cdot p(m \mid x)$, where the second term was estimated using the complexity-level-specific language models $\phi_m$ and a uniform prior over levels on the test set. As additional features, we used the solution's perplexity under each $\phi_m$ and under each of the technical topics $\lambda_i$, and also the most likely level for the text, $\arg\max_m p(m \mid x)$. Finally, we included features for each word in the training data; the feature value is the word's expected level multiplied by the probability of the word in the solution text.

Length. We also investigated whether solution length is a predictor of comple
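The annealing search described above can be sketched as follows. The cooling schedule and acceptance rule are illustrative choices on our part, not the authors' exact settings; `log_likelihood` stands in for the HMM sequence score.

import math, random

def anneal_order(solutions, log_likelihood, t0=50.0, t_min=1.0, cooling=0.95):
    """Simulated-annealing search over orderings: each step proposes
    swapping two adjacent solutions and accepts worse orderings with
    probability exp(delta / T). Assumes at least two solutions."""
    order = list(solutions)
    score = log_likelihood(order)
    t = t0
    while t > t_min:
        i = random.randrange(len(order) - 1)
        cand = order[:]
        cand[i], cand[i + 1] = cand[i + 1], cand[i]
        cand_score = log_likelihood(cand)
        delta = cand_score - score
        if delta > 0 or random.random() < math.exp(delta / t):
            order, score = cand, cand_score
        t *= cooling
    return order

# Usage with a toy score that prefers ascending ids (illustration only).
print(anneal_order([3, 1, 2], lambda o: -sum(abs(x - i - 1) for i, x in enumerate(o))))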
tribution to compute the expected position of each complexity level. The levels are ordered from low to high expected position, and this is taken as the order of increasing complexity.

The generative process for our model is described in Figure 2. A first phase generates the latent variables which are drawn once for the entire corpus; then variables are drawn for a solution set, next for each solution in the set, and finally for the words in the solutions. The number of complexity levels $L$ is a parameter in the model, while the vocabulary size $V$ is fixed. For each complexity level $d_m$, we draw one multinomial distribution $\phi_m$ over the vocabulary $V$, and another multinomial $\gamma_m$ over the possible positions. These two distributions are drawn from symmetric Dirichlet priors with hyperparameters $\alpha$ and $\rho$. Solutions will not only contain words relating to their complexity, but also to the problem or malfunctioning component at hand. We assume these words play a minor role in determining complexity, and thus draw a binomial distribution $\psi$ that balances the amount of problem-specific versus complexity-specific vocabulary. This distribution has a Beta prior with hyperparameters $\delta_0$ and $\delta_1$. For each solution set, we draw a distribution over the complexity levels, $\theta_i$, from another Dirichlet prior with concentration $\beta$. This distribution allows each problem to have a different preference and mix of complexity levels for its solutions. Another multinomial
w to high complexity. A solution set $S_i$ is specific to problem $P_i$ and contains an ordered list of $N_{P_i}$ solutions, $S_i = (x_1, x_2, \ldots, x_{N_{P_i}})$, such that $D(x_l) \le D(x_{l+1})$. We refer to the number of solutions related to a problem, $N_{P_i}$, as its solution set size.

For our experiments, we collected 300 problems and their solutions from multiple websites, including the computing support pages of Microsoft, Apple, and HP, as well as amateur computer-help websites. The problems were mostly frequently asked questions (FAQs) referring to malfunctioning personal computers and smart phones. The solutions were provided by computer experts or experienced users in the absence of any interaction with other users or their devices, and thus constitute a generic list of steps to try out. We assume that in such a situation the solution providers are likely to suggest simpler solutions before more complex ones, leading to the solution lists being approximately ordered from low to high complexity. In the next section we verify this assumption experimentally.

The corpus can be downloaded from http://www.homepages.inf.ed.ac.uk/alouis/solutionComplexity.html

Figure 1: Histogram of solution set sizes (sizes 2-16).

In this dat
xity, e.g., simple solutions may vary in length and amount of detail from complex ones. We devised three features, based on the number of sentences within a solution, the number of words, and average sentence length.

Syntax & Semantics. Another related class of features estimates solution complexity based on sentence structure and meaning. We obtained eight syntactic features, based on the number of nouns, verbs, adjectives and adverbs, prepositions, pronouns, wh-adverbs, modals, and punctuation. Other features compute the average and maximum depth of constituent parse trees. The part-of-speech tags and parse trees were obtained using the Stanford CoreNLP toolkit (Manning et al., 2014). In addition, we computed 10 semantic features using WordNet (Miller, 1995). They are the average number of senses for each category (noun, verb, adjective, adverb) and the maximum number of senses for the same three classes. We also include the average and maximum lengths of the path to the root of the hypernym tree for nouns and verbs. This class of features roughly approximates the indicators typically used in predicting text readability (Schwarm and Ostendorf, 2005; McNamara et al., 2014).

5.2 Experimental Setup

We performed 10-fold cross-validation. We trained the ranking model on 240 problem-solution sets; 30 sets were reserved for development and 30 for testing in each fold. The most frequent 20 words in each training set were filtered as stopword