Home
Final Report PDF file - VTechWorks
Contents
1. TestFiles TestTree WithChildren rgxt private long TEST_KEY 5814361054088246626L private long FAIL KEY 100L Before public void setUp throws Exception parser new TreeParser Tests loading the tree into binary t s IOException Test public void TestVerifyFileExists throws IOException byte result parser loadFile Assert assertNull result result parser loadFile TEST PATH Assert assertNotNull result Figure 16 Demonstration of code coverage plugin indicating that the code shown is hit in test Programmatically ensuring that code is working is an important part of software development being able to easily run tests change code re run and assert that execution proceeds as expected is crucial to delivering a top notch product In order to assure the best for this project we used a third party tool called EclEmma When this tool is used in Eclipse our development environment it runs all unit tests and displays in code where code is either e Hit e Hit some branches of for loops switch cases etc e Not hit at all This tool assisted in developing tests for the backend system the text file parsers and the transformation into citations The front end of the system was tested manually To test the back end of the system four files were created A tree containing one single node A tree containing one single node and a child A tree containing one matching regular
2. Y EAHA AADMA CHC Author Year Title Publication Location EAD 201 A LL Author Year Title Publication Location NEE W 2 V0 22 EMI Author Year Title Publication ENAKA Author Year Title Publication Location T AACE DWP MIAO Author Year Title Publication d 4 evaluation L 7 Author Year Title Publication Location MAEDA Author Year Title Publication Location N A VE VIEW Author Year Title Publication YONA ANL Author Year Title Publication Figure 9 The Refined Tree This tree parses over 1 000 citations of which over 90 are correct This refined tree parses out the authors year and titles of over 1 000 citations and the rest are filtered to nodes that will be appropriately filled at a later date Work remains to parse the rest of the strings and it is anticipated that the tree will grow in size appropriately Further refinement going forward will still involve increasing the citation coverage of the regex tree but will also involve debugging and fine tuning the GUI A GUI change we are planning to do is instead of having the binary tree file 27 input in a drop down we are going to add another field like the citation file and output location We are also planning to make the GUI more informative to the user by adding some descriptions and tips A B E D E F a 1 Author Year Title Publication Location Date 2 Aatique M Mizusawa G Wo
3. date volume issue and page 5 6 Refinement 2 Refinements this week were based solely on the merits of the tree Expansions were made to ensure that as many Authors Years and Titles of citations were being parsed out as possible Below are examples of both the expanded tree structure and the resulting output showing the improved correctness of our parsing engine 28 T EIAHA 20 Author Year Title Publication MAGE DVG 20 AVL Author Year Title Publication gt LML R LH Author Year Title Publication gt EIAHA RADAN Author Year Title Publication di Author Year Title a a AS ATE SANE Gh a EEA LO E 0 Author Year Title Publication Pages gt NAGE LIAL Author Year Title Publication 2 26 NC Author Year Title Publication Ce Author Year Figure 13 Refinement of the RegexTree in the Tree Tool 4 A B c O Test of a Cooperative Intersection Collision Avoidance System for TTT pe 1998 Invited Commentary on the Special Issue on Automated Highway Systems T 1996 A comparison of landing site safety training techniques for loggers Pr 2005 Motor vehicle warnings In M S Wogalter Ed The Handbook of Warnings pg La 2006 Motor vehicle warnings In 1999 Evaluation of unassigned MUTCD sign colors for trailblazing during critical incider Tr 2002 Evaluation of fluorescent MUTCD sign colors for incident management trailblazin Tr 2002 The relationship between truck driver sleeper ber
4. 20 22 2e 2a 29 00 28 2e 2a 2c oA Se L t ECH hear 2e 2a 29 20 20 28 2e 2a 29 20 20 28 2e 2a 29 00 Gel ed pod KC GAS Figure 7c The hexadecimal representation of our string dump When the string table is parsed each program runs through the chunk grabs all the entries gets the string from the offset and adds it to a key value list The object then uses the key value list to find the matching regular expression from string keys contained within its binary ii Node Object Chunk The node object chunk has a header of 28 bytes composed of 4 bytes the identifier always NOBJ 4 bytes the size of the chunk disregarding the identifier and size 8 bytes zeroed for future proofing 4 bytes the number of nodes to read 4 bytes the size of a node entry 4 bytes zeroed for future proofing 4e 4f 42 4a 0000 0350 0000 0000 op 00 op op NOBJ P 0000 op oe 0000 op 3c oo oo op on ES lt ff Figure 7d The hexadecimal representation of the header of our node object chunk 24 Next we have the node entry section e 8 bytes the string key containing the regular expression data in the string table section 4 bytes zeroed for future proofing 4 bytes the number of children for this node 4 bytes the offset to the children of this node 4 bytes the number of Order elements for this node 4 bytes the offset to Order data for this node 32 bytes zeroed for future proofing String tables are
5. always parsed first due to the necessity of finding string keys for regular expression data The nodes are stored in breadth first order this is due to the fact that node children need to be stored next to each other so that any parser can run recursively through the file to import the tree After all nodes there is a raw dump if Order data d2 ER 55 48 2a 5e e6 d0 00 00 0000 00 00 00 08 OoUH eb 00 00 op 2c 0000 0001 0000 opp 00 00 00 00 eee eeee op 00 op 00 0000 0000 0000 0000 0000 00 00 _ eeeeeee oo op 0000 op op 0000 oooo oof I Figure 7e The hexadecimal representation of a node entry The order data is stored as follows e 4 bytes the enumeration ordinal 00 00 00 01 0000 0002 0000 0000 0000 000l s s sssssesses 00 00 00 02 0000 0003 0000 0000 0000 000l eeeeeeeee oo oo oo o2 B f Figure 7f The hexadecimal representation of ordinal dump data For example ordinal 1 represents Author ordinal 2 represents Year and so on The binary data structure was designed so that the size of an entry could be easily modified without changing too much code internally yet could feasible store as many children and order elements as the byte limit allowed 4 bytes being an integer There can be even more string keys stored as the type of key is a long integer meaning that for most projects the node binary format should suffice and perform well 5 4 Citation Parser Tool 25 Cit
6. expression A text file with four text citations o One that passed the tree o Three that failed the tree for various reasons These files in conjunction with unit tests that simulated the interface successfully tested important elements of execution such as failed key value lookup failed file import asserting the correct number of nodes and more 31 6 2 Manual Interface Testing Our user interface across both of our tools were simple enough to manually test and we deemed it unnecessary to take the extra time to create GUI hooks for testing rather we would manually verify that the interfaces worked as expected and make changes based on what we thought were appropriate For the Tree Tool testing helped us to determine several areas of incorrectness e An empty area at the bottom of the screen that could not be resized e Sometimes clicked nodes opened for edit would display text that was not associated with that node We were able to successfully replicate and eliminate these issues through manual testing For the Text Transformation tool we determined that the only major issue we were running into was the fact that the Import Binary dropdown was more appropriately placed as another input box similar to the boxes indicating the file locations for text input and CSV output The Text Transformation tool s GUI backend interfacing with the actual transformation was sufficiently simulated in the Text Transformation s
7. found 2 6 Alternatives An alternate solution to this problem would be to find an open source program with some similar parsing logic already built in and then adjusting it to fit our needs Unfortunately finding something like this has proven difficult and the ones we have seen so far are in languages not compatible with Java For example FreeCite an open source citation parser from http freecite library brown edu does something similar to what we need but it is in Ruby which isn t compatible with our strategy 13 3 Frontend Design 3 1 User Interface The planned user interface would be developed in JavaFX a GUI framework provided with Java SDK 8 JavaFX allows for quick and simplified UI development by providing a multitude of graphics object libraries as well as allowing the integration of CSS files The GUI will be responsible for letting the user select a Citation Text File to parse and a location to output a CSV file with the parsed citation data It is also important that the UI alert the user to critical program errors as well as file reading writing errors In addition the UI will alert the user if the parser had to create a log file containing any citations that could not be parsed If a Log file is produced it will be provided in the same folder location as the CSV output file 3 2 Java Code Design The front end will branch off from the main program thread on program start up to run on a separate applicat
8. value henceforth referred to as CSV file Each citation is separated from other citations by a newline character Examples of citation inputs and expected outputs can be seen in Figure 1a and Figure 1b These citations are not guaranteed to fall into one category and even citations of the same format can have mild variations that could cause parsing to fail Therefore it will be in our best interests to develop one of the following tools A A robust extensible regular expression suite that will allow us to read through the file and get exact matches to specific regular expression types B A tool that utilizes pre existing natural language tools to help in our analysis of the text Both of these tools will present their own advantages and disadvantages in terms of development effort usability etc as an example the regular expression example is quicker to develop but the processing time could potentially become very large and without an extensible design adding regular expressions could be a tedious task We also intend on delivering a user interface that will allow users to select a text input file and output directory for a created file This interface will also be responsible for displaying any errors that our backend parser runs into Additional options may be added as development furthers and requirements are tweaked or altered Finally we will be delivering a robust test suite that will cover a variety of cases that may affe
9. work on parsing smaller chunks of citations at a time The more we add into the parsing logic the fewer unparsed citations there will be until we manage to cover 100 of the citations While working through it we will have a way to display the ones not yet parsed so that we know what we have to focus on getting next 2 5 Error Catching An important design choice to make is how to handle the parser running into errors One initial thought is to have a separate file which will print out all of the citations it couldn t successfully parse This way we will be able to see what our code needs to handle next Also It allows the user to see if the remaining few would be something that they just need to do by hand We would know if a specific citation could not be parsed correctly if it does not match any regular expression used in the parser A potential problem could be that a citation could match a regular expression but not be correct That is why we plan on being very specific with our regular expressions so we do not run into that problem Every single parse will return a key value pair of a boolean and a string The boolean will indicate if the parse was successful e a leaf level matching regular expression in the tree was found and the string will represent either the rearranged and corrected CSV format string a critical unhelpful error message in event of emergency or a representation of the closest regular expression match that was
10. Source Control We are using GIT for our code s source control It allows us to keep track of what changes we make and who is making them It also provides the functionality to revert to a previous commit if something were to go wrong and we need to return to an earlier point We have a master branch and then we have other branches for specific purposes Such as parsing GUI etc which we can work on and make changes to Once a branch gets to a solid state and after a code review then will we merge with the master branch See Code Review section for details on our code review process Here you can see an example of a source file in the master branch on Github 18 P branch master TextTransformation src Initial parsing for authors Very Limited Cases re zenky41 authored 13 days ago B Converter java Initial parsing for authors Very Limited Cases Figure 7 Github image example 4 5 Agile Process Tracking To track our progress and tasks to do we are using an online scrum board at Trello com Our columns consist of backlog to do in progress test code review and completed The backlog will consist of cards that are stretch goals that we might not have time to get to To do are tasks that are waiting to be worked on In progress are our tasks that we are currently implementing The test column is for cards that need to be tested Code review is for cards that have been tested and need to be reviewed before merging wit
11. Text Transformation Project Report CS 4624 Virginia Tech Blacksburg VA 2 27 2015 Professor Edward A Fox Client Nathan Hall Developers Stephen Fenton Dustin Thompson Zach Henke Kevin Cox Table of Contents Introduction Page s 1 Executive SUMMAry E 4 Users ep 5 6 1 Regular Expression Tree Usage 5 2 Text Transformation GUI Usage 5 3 Alternate Parsing Tool Usage 6 Developers Manual ccc cccccssescseesesesssscssetesessscsseeesestsssseseessseeneseseeees 7 31 TA Ee et 7 10 1 1 Project Deliverables EE 7 1 2 Development Methodologie 8 1 3 Development TOO S nrnna iia EE 9 1 4 Preliminary Timeline ue 9 1 5 ee EE 10 1 6 VTechWorks Deltverables 10 2 Ba ckend Desig e EE 10 12 2 1 Introduction E 10 2 2 TOOS USOA EE 10 2 3 Re EE 10 24 Parsing eee he 11 2 5 EVVORCAICINAG EE 12 26 PUMCUIAUV e 12 3 Frontend Design E 12 15 3 1 WISCPATICTIACE sere esha teat dors Reel utes a adap a A E deenees 12 3 2 Java COMICS II E 13 3 3 UI Flow ele TE 13 A Implementation Han ENEE 16 18 4 1 Programming ane eietietefsien eegeeet deen eeben erbei 16 4 2 Tools ENDO VCO EE 16 4 3 Parsing Ee 16 SEET 17 4 5 Agile Progress Tracking EE 17 Ee EE 18 BF WEE eecht ie tege ean io meats asd means 18 le 19 E ee e e EE 20 29 5 1 Regular Expression Tree 20 5 2 ree ToO ee 20 5 3 Binary Data Structure here eege ee 22 5 4 Citation Parser Tool pias getgeterete reide SOs 24 5 5 Refinement d 26 5 9 Re fin m nt eent G
12. Trailblazing Tr 1998 Advanced traveler information systems and commercial vehicle operations comp T 1997 Human factors analysis and design support for the National Automated Highway Fe 2007 Investigation of driver infrastructure and driver vehicle interfaces for an intersec Jo 1998 Impact of sleeper berth usage on driver fatigue Task 1 Analysis of trucker sleep Pr 1998 Long haul drivers perspective on sleeper berth usage and fatigue in the trucking S 1999 Operational review of specialty vehicles Project report Preliminary needs anal F 2005 Multimodal Warnings Curve Warning Design Pe 2006 Evaluating Way finding Without An In Vehicle Experimenter Methodology 2006 Pe 2012 Outbound Texting Comparison of a Speech Based Approach and a Handheld Tou Tr 2011 Driver Acceptance and Use of a Speed Limit and Curve Advisor 2011 01 0550 S4 2011 Issues Related to the Use and Design of a Backing Rear Cross Traffic Alert System Si 2008 Occupant to Occupant Interaction and Impact Injury Risk in Side Impact Crashes St 29 Figure 14 Sample of CSV output when running the Parser with the Refined RegexTree Figure 15 Example of some of the citations from our failed file given the VTTI_pubs txt as input Figure 11 shows us a sample of the failed output file which shows us which citations don t get matched to one of the nodes in our regex trees We will use this file to create more specific regular expressions to cover more citations Viewing th
13. Transformation tool we will be using an agile methodology Agile is defined as such Agile development is an alternative to traditional project management where emphasis is placed on empowering people to collaborate and make team decisions in addition to continuous planning continuous testing and continuous integration What this entails to the client is that we will not blindly come back with a product that will not match their needs we will return iteratively with both minor and major improvements so that we can work out any issues together and have the product that they desire completed at the end of the project timeline We intend to meet up with our client after an initial month of work barring any immediate need for a meeting followed up by bi weekly meetings where we showcase improvements and request feedback or ideas to implement new features While agile development methods traditionally employ a Scrum master and have daily standups due to the nature of the project and the distance of the team members we will not hold traditional standups and instead communicate during in class sessions through e mail and using messenger applications if the need arises 1 3 Development Tools For this project we have decided to develop our tool in Java Java is a portable language that will allow our utility to be run across a variety of devices with Windows Mac and Linux being the primary targets It is a well known and globally ac
14. as very clear and helpful when communicating what he wanted from us He was very understanding with the issues we had and adjusted what he expected from the project as we went along Through him we also got a meeting with some people from the VTTI which was a valuable experience to see how a real world client meeting would go 2 We would like to thank our professor Dr Fox He was able to give us some tips with the project as well as giving us feedback along the way He also answered questions we had about deadlines and such References 1 Java a httos www java com en 2 Virginia Tech Transportation Institute VIII 33 a httos www vtti vt edu b Copyright 2015 Virginia Polytechnic Institute and State University JavaFX a http www oracle com technetwork java javase overview javafx overview 2158620 html Eclipse a https eclipse org b Copyright 2015 The Eclipse Foundation Github a https github com b Copyright 2015 GitHub Inc EclEmma a http Awww eclemma org b Copyright 2006 2015 Mountainminds GmbH amp Co KG and Contributors FreeCite a http freecite library brown edu b Brought by Brown University htto Awww brown edu c Implemented by Public Display http Avwww pubdisplay com d Funded by the Mellon Foundation https mellon org Agile a http agilemethodology org
15. ation File Browse START Figure 7 The Citation Parser Tool This is the tool used by the client to give an input text file with citations in it and to select a destination for Output files to be saved in The primary tool developed for the client is the Citation Parser Tool this tool was created using JavaFX libraries and for prototyping purposes only runs from the developers IDE The Parser Tool is responsible for and currently capable of the following parsed Aatique M Mizusawa G Woerner B Ahmadian M Ahmadian M Ahmadian M Ahmadian M Ahmadian M Ahmadian M Ahmadian M Ahmadian M Ahmadian M Ahn Y K Ahmadian M Cantwell R Ahmadian M DeGuilio A P Ahmadian M Eisenhower B A Ahmadian M Glass J E Ahmadian M Huang Ahmadian M Huang W Ahmadian M Huang W Ahmadian M Huang W Ahmadian M Jeric K M Reading an input text file that contains a single citation string on each line Importing Regex Tree Binary Files to use as the basis for parsing citations Parsing citation strings into an output CSV file Creating an Error file that contains citations that were not able to be 1997 Performance of hyperbolic position location techniques for code division multiple access 1997 A hybrid semiactive control for secondary suspension applications 1997 Semiactive control of multiple degree of freedom systems 1999 Design and Developm
16. cepted language used often by each of our team members Java also allows for an easy User Interface GUI design using the JavaFX and Swing libraries We will develop our project in the Eclipse development environment IDE which all team members have used for 2 years Eclipse allows for easy unit testing helpful debugging and easy integration with any other Java tools if we need to use them Code will be stored both locally on team member machines and in a public GitHub repository where we can all collaborate on code 1 4 Preliminary Timeline Weeks 1 4 Finish the wiki write up and get it signed by our client and Dr Fox Completed Setup code repositories and tools required for development Completed Code first iterations of text parsing utilities Completed Set up an initial user interface Set up testing classes and initial tests for the utility Completed Deliver an initial working demonstration to the client by March 20th Weeks 5 6 Add in more parsing utility code to improve citation coverage Add error output code to parser Add error output to user interface Increase code coverage of tests Weeks 7 8 Further parser improvement Further test coverage to go with improvement Focus on improving performance if necessary Weeks 9 Finalize project for client 10 Final push for improved performance Assure that the project will run on different platforms as expected Figure 2 Shown above is the timetable to be fol
17. ct performance and or correctness of the utility All code will be covered by these tests the user interface will be manually tested to ensure correctness These tests will run the gamut of anything from a null input to an attempted memory overflow in order to ensure the best possible product for our client Citation Examples Aatique M Mizusawa G and Woerner B 1997 Performance of hyperbolic position location techniques for code division multiple access Proceedings of Wireless 97 Calgary Canada July 9 11 1997 Ahmadian M and Ahn Y K 2000 Performance Analysis of Magneto Rheological Mounts Journal of Intelligent Material Systems and Structures Vol 10 No 3 March pp 248 256 Figure 1a Shown above are examples of citation inputs CSV Output Examples Aatique M Mizusawa G Woerner B 1997 Performance of hyperbolic position location techniques for code division multiple access Proceedings of Wireless 097 Calgary Canada July 9 11 1997 Ahmadian M Ahn Y K 2000 Performance Analysis of Magneto Rheological Mounts Journal of Intelligent Material Systems and Structures 10 3 248 256 Figure 1b Shown above are the CSV outputs expected for the citation examples in Figure 1a Double quotes are used to bypass reserved characters The CSV columns are ordered as follows Author Date Title Publication Location Date Volume Issue Pages 1 2 Development Methodology For the Text
18. e citations in this file we can determine what our tree is missing As we add more regex nodes into our tree this output file will shrink until eventually we reach the goal of hitting all or most of the citations For any that we absolutely cannot hit with the tree we can manually add into the ending csv file 6 Testing In order to assure that we have created the system we want that the parts operate sufficiently indepently of each other and that the interface satisfies both the needs of the project and the usability requirements of the client we have implemented both software level testing and manual testing They ensure correctness and expose some issues present in the code It should be noted that no code is perfect and different execution environments can bring about unexpected issues 6 1 Software based Unit Testing EES mped Structu ASME International ntrolling Gun 4 g Mat Jeric K M and Inman D J 199 ard S C In Ahn K anc d 7 Ene nd Environ Ahn K d Rakha 0 The c rtation Engin rtation Enc GPR Calibration F g izi and S Lahouar GPR t the ni 8 c ound trating Radar GPR izi and S Lahouar h International S nan T E Hu T L Brandon arming d Santa Monica CA Human F o M Hankey J M Ding ntract No FH dE 4 Mc 30 public class TreeParserTests private TreeParser parser private String TEST PATH TestFiles TestTree rqxt private String TEST_PATH_WITH_CHILDREN
19. e top text field should contain only Java compliant regex strings and the bottom text field should contain citation orderings separated by commas Acceptable citation orderings are Moving forward the team plans to implement some updates to the Tree Tool however as the Tree Tool is a secondary developer tool it does not take Author Year Title Publication Location Date Volume Issue Pages top priority Possible updates include Improved GUI using CSS Text Indicating text field purpose Integration with the main Citation Parser Tool a Tree Tool n REM Author gt ON Author Year a ig Author Year wii Author Year LV Author Year LA AL Author Year AA Author Year GA Author Year 22 Figure 6a The Tree Tool used to create Regex Parsing Tree Files This image shows an example of a Regex Tree being imported and edited in the tool v gt Author vA Author Year T ENAH Author Year Title MAAC 207 Author Year Title Publication LML Author Year Title LL Author Year Ge Author Year amp AT Author Year LVL Author Year LA AU v Figure 6b This image shows the process for editing the regex and order fields in the tree 5 3 Binary Data Structure The binary data structure was developed in order to allow easy transfer of information from the Tree Tool to the Text Transformation utility as well as easy modification within the Tree Tool Within t
20. eated using JavaFX which allows simple attractive interfaces to be created cleanly and easily 4 2 Tools Employed We are utilizing the latest version of Eclipse as our IDE It comes with several handy features such as a visual debugger and a GIT plugin Since we are all going to be working on the same program it will be useful to have the ability to grab the latest branches and merge changes all from within Eclipse Here you can see from inside Eclipse that we can view local and remote branches that are currently being tracked S Git Repositories 32 E 4 E Local H Parsing f739551 Create 4 Remote Tracking origin master 99ee8e0 origin deeg 73955 K origin zach a0 Figure 6 Eclipse built in source control 4 3 Parsing Strategy Our parsing strategy for the citation file will revolve around several combinations of regular expressions to break up the citations and identify the proper parts To make adding more regular expressions easy we will develop a separate tool which builds up a tree of regular expressions The top of nodes of the tree are very general regexes and as you go to the leaf nodes in the tree they become more specific Our program will use the tree to find which regex matches each citation and it will traverse the tree getting more and more specific until it has an exact match The purpose of using a tree is to save time because now it won t have to go through every possible regex 4 4
21. eet 27 6 ee TEEN 29 6 1 Software based Unit Testing EE 29 6 2 Manual Interface Testmg AA 30 Lessons US e e space cle A tare Reale i aac cone 31 1 Problems ENCOUNTCLO EEN 31 ee 31 Acknowledoements ec cccccscssessessseeseeseessessesseessessesseesecsecsucstesseeseeseesseeneasees 32 Sne econ OER a eh Ree 32 Introduction 1 Executive Summary The purpose of this project is to assist the VTTI in converting a large citation file into a CSV file for ease of access It required us to develop an application which can parse through a text file of citations and determine how to properly put the data into CSV format We designed the program in Java and developed a user interface using JavaFX which is included in the latest edition of Java We came up with two main tools the developer tool and the parsing program itself The developer tool is used to build a tree made up of regular expressions which would be used in parsing the citations The top nodes of the tree would be very general regexes and the leaf nodes of the tree would become much more specific This program can export the regex tree as a binary file which will be used by the main parsing program The main parsing program takes three inputs a binary regex tree file a citation text file and an output location Once run it parses the citations based off of the tree it was given It outputs the parsed citations into a CSV file with the citations separated by field For any citati
22. ent of Magneto Rheological Dampers for Bicycle Suspensions 1999 Filtering Effects of Mid Cord Offset Measurements on Track Geometry Data 2000 Application of Magneto Rheological Dampers for Controlling Fire Out of Battery for Heavy Weapons United Defense 2001 An Evaluation of the Practical Design Aspects of Magneto Rheological Dampers for Controlling Gun Recoil Dynamics United Defense 2001 On the design of magneto rheological recoil dampers for fire out of battery control Proceedings of the 10th Gun Dynamics Symposium 2001 The noise and vibration benefits of soft mounted locomotive cabs Proceedings of 2001 IEEE ASME Joint Rail Conference 2000 Performance Analysis of Magneto Rheological Mounts 1999 Effect of Pedestal Liner Clearance and Resilience on Rail Vehicle Operation 2001 Recent advances in the use of piezoceramics for vibration suppression Shock and Vibration Digest 1997 Application of Magneto Rheological Suspensions to Intelligent Transportation Systems 2001 A Comprehensive Experimental Analysis of Kinematics and Dynamics of Volvo Optimized Air Suspension 2 Volvo Trucks North America 2001 An Analytical Analysis of Kinematics and Dynamics of Volvo Optimized Air Suspension 2 Volvo Trucks North America 2001 An Analytical Investigation of Power Hop in Heavy axle Vehicles Final Report 2002 A Qualitative Analysis of the Dynamics of Self Steering Locomotive Trucks 1999 The Application of Piezoceramics for Reducing Noise and Vib
23. erner B 1997 Performance of hyperbolic position location techniques for code division multip Proceedings of Wireless 97 Calgary Canada July 9 11 1997 3 Ahmadian M 1997 A hybrid semiactive control for secondary suspension applications Proceedings of the Sixth ASME Symposium on Advanced Automotive Technologies 4 Ahmadian M 1997 Semiactive control of multiple degree of freedom systems Proceedings of 1997 ASME Design Engineering Technical Conference Sacramento Ci 5 Ahmadian M 1999 Design and Development of Magneto Rheological Dampers for Bicycle Suspensic Proceedings of 1999 ASME International Congress and Exposition Nashville Tennes 6 Ahmadian M 1999 Filtering Effects of Mid Cord Offset Measurements on Track Geometry Data Proceedings of 1999 IEEE ASME Joint Rail Conference Dallas Texas 7 Ahmadian M 2000 Application of Magneto Rheological Dampers for Controlling Fire Out of Battery United Defense L P 8 Ahmadian M 2001 An Evaluation of the Practical Design Aspects of Magneto Rheological Dampers f United Defense L P 9 Ahmadian M 2001 On the design of magneto rheological recoil dampers for fire out of battery conti Proceedings of the 10th Gun Dynamics Symposium Dallas TX 10 Ahmadian M 2001 The noise and vibration benefits of soft mounted locomotive cabs Proceedings of 2001 IEEE ASME Joint Rail Conference Toronto Ontario 11 Ahmadian M Ahn Y K 2000 Performance Analysis of Magneto Rheological Mounts Journal of Intell
24. h the master branch See the Code Review section for more details on our code review process Once the task has been merged to the master branch then it is finally completed and will be moved to the completed section D Text Transformation Pivate Backlog fj To Do S Parse Date 2 Parse Volume Parse Issue Parse Pages Handle special citations by parsing or by hand Insure that all citations are properly entered in the csv file File s location error checking Add a card 4 6 Code Review Figure 8 Shown above is an image of our scrum board early on in our development Test Completed Create frame around parsing Figure out Git eclipse o Add a card Citation Class o Parse Authors gt zH Print out the citations after the parsing is done Maybe store the emors within the citations to print out later too Parse Date 1 ad EI Break up each column parse into their own methods ZH Add a card In order to assure that our code works properly is clean readable and easily maintainable we are enforcing code reviews of up to two people before 19 commits are pushed and merged into the master code branch Code reviews will allow for other team member input on code and get constructive feedback and criticism back to the code author so that we can produce the best possible product At least 2 group members must review major changes before merging For smaller changes 1 person
25. he binary there are two chunks of information e The String Table e The Node Object The chunks are composed as follows i String Table The string table has a header of 32 bytes composed of 4 bytes the identifier always STblI 4 bytes the size of the chunk disregarding the identifier and size 8 bytes zeroed for future proofing 4 bytes the number of entries in the string table 4 bytes zeroed for future proofing 23 e 4bytes the size of the string table e 4bytes the size leading up to the string table including these 4 bytes 53 54 62 6c 0000 01 b1 0000 0000 op 00 00 op STbl 00 00 00 goe op op 0000 0000 00 ag 00 00 ou ff Figure 7a The hexadecimal representation of the header of our string table chunk Next we have the string table entry section e 8 bytes the string key e A bytes the offset to the string for the key starting at the beginning of the string table e 4 bytes zeroed for future proofing d2 ER 55 48 2a Se e6 dd 0000 0000 00 00 00 00 OgUH 2D 4e f1 e9 34 9a Oc 65 d8 00 00 00 03 00 00 00 00 Nn 48 e9 15 dc ad 6f 38 20 78 a 00 00 00 Oe 00 00 00 00 U8 x occ ere Figure 7b The hexadecimal representation of three string table entries Finally once all of the entries are done we have our string table a dump of strings each separated by a zeroed byte 2e 2a 00 28 2e 2a 29 28 Sc 28 2e 2a 29 00 28 2e Ser ESA WY EA E e 2a 29 2c 2e 2a 28
26. igent Material Systems and Structures Vol 10 No 3 March pp 248 12 Ahmadian M Cantwell R 1999 Effect of Pedestal Liner Clearance and Resilience on Rail Vehicle Operation Proceedings of ASME Rail Transportation Division Chicago Illinois 13 Ahmadian M DeGuilio A P 2001 Recent advances in the use of piezoceramics for vibration suppression Shock and Vibration Digest Vol 33 No 1 15 22 14 Ahmadian M Eisenhower B A 1997 Application of Magneto Rheological Suspensions to Intelligent Transportation S Final Report Center for Transportation Research October 1997 15 Ahmadian M Glass J E 2001 A Comprehensive Experimental Analysis of Kinematics and Dynamics of Volvo O Volvo Trucks North America 16 Ahmadian M Huang W 2001 An Analytical Analysis of Kinematics and Dynamics of Volvo Optimized Air Suspe Volvo Trucks North America 17 Ahmadian M Huang W 2001 An Analytical Investigation of Power Hop in Heavy axle Vehicles Final Report Goodyear Tire and Rubber Company Figure 10 Example of some of the citations from our output file given the VTTI_pubs txt as input You can see that up to the location field the citations are parsed correctly As of now the output is creating 1011 parsed citations out of around 1456 citations given Future iterations of our code will refine the parsing of author year title and publication in order to increase the total citations parsed From there we will move on to location
27. ion thread The application thread will ensure that the GUI does not pause or hang while the main thread deals with parsing computations The front end will communicate with the backend models through standard action Listeners and Observers 3 3 Ul Flow Model Below is our the idea for a flow model of the user interface Figure 5a shows the main screen where you select which file you want to parse and where you want the output to go Figure 5b shows an example of the progress bar once the parsing has begun and allows it to be cancelled Figure 5c shows an example of the progress bar being completed for once the parsing program has finished In these images it still just has the placeholders for the files but once being used it would have actual filenames destinations Figure 5d shows an example of what would happen if the parsing program encountered a problem It gives a pop up message telling the user where the output log will be Citation Parser O X Citation File Please Select A File fa Output CSV C DefaultLocation Ea START Figure 5a Shown above is the start screen for the Citation Parser This screen allows for selection of _inpu toutput file location and starts the parsing program Citation Parser mE Citation File Please Select A File fa Output CSV C DefaultLocation EE aa CANCEL Figure 5b Shown above is the in progress screen for the Citation Parser This screen displays the cu
28. led output file the regex tree may need to be refined to parse the file better The output csv file will be labeled as parsed_citations timestamp csv The failed output file will be labeled as failed_citations timestamp csv The timestamp will be in the format yyyy MM dd hh mm ss 3 Alternate Parsing Tool Usage This program simply takes 2 inputs The first is the citation file you want to parse It needs to be a text file The second is the output destination Once you have entered both files directories press run and you will see the number of citations it has parsed out of the number there are total After it has finished running you will get an output csv file with the citations separated by field Developer s Manual 1 Requirements Outlined here are the methodologies we will use to develop our text transformation tool our overall goals for the project and end output and ways we intend to at least explore extending the project to facilitate other institutes within the University We will define individual roles and goals for these jobs set a preliminary timeline for deliverables and detail any other mechanisms for ensuring a successful product 1 1 Project Deliverables We are working with the Virginia Tech Transportation Institute or VTTI to help organize their citation data The primary goal of this project is to take ina text file consisting of a variable number x of citations and transform them into a comma separated
29. lowed for the project s duration 10 This timeline as one should expect is subject to change and new goals may be added e g we have discussed the possibility of a website with the client 1 5 Project Extensions While we have laid out a solid plan for our utility we certainly can extend upon it as we have discussed with our client We anticipate having a solid functional and extensible utility ready for our client at the end of the timeline but in the event that we have more time or resources than we anticipate we have several ways to extend the project e Working with other institutes While this project focuses solely on the output of files submitted by VTTI we have been told that working with VBI would be a natural next step to improving our project e Creating a website Having the program posted on a website to use would increase ease of use for any users unfamiliar with the technical requirements having Java installed of running our utility These ideas and more can be adapted into our project and this space may expand as we come up with other ways to improve our project and leave it long lasting after we have departed the University 1 6 VTechWorks Deliverables The following is a list of files delivered to VTechWorks for consumption CitationParser jar An executable Java JAR file that transforms text citations TreeTool jar An executable Java JAR file for regex tree developers CitationParserProject zi
30. object in CSV format to the file specified by the user 2 4 Parsing The parsing will be the most difficult part of this project The reason for this is that all of the citations don t follow an exact standard so there are several variations between them To manage this we will have to come up with a complex set of regular expressions which can identify structure keywords phrases and certain special characters to determine which fields are which Some of the citations even leave out critical information such as the authors and dates Ahmadian M Ahn Y K and Morishita 5 1999 performance Analysis of Maaneto Rheological Mounts Journal of Intelligent mat rial Systems and Structures 10 3 248 256 Ahmadian M DeGuilio A P Inman D J and Claus R O Application of Smart Damping Materials and Fiber opti ibration Reduction 13th Annual Fiber Optics Biomedical A O O a Optics Research Review Blacksburg VA April 2000 Figure 4 Shown above are example citations highlighting variations in formatting Here you can see that these two citations while similar have variations The first contains a date in parentheses near the beginning as well as page numbers while the second only has a publication date at the end of the citation without any page numbers These variations make parsing very tricky because of how many there are Since there are so many citations to deal with approximately 1500 we will
31. oftware based unit testing Lessons Learned 1 Problems Encountered The main problem we ran into was that there were so many different variations in the citations that our parsing program really wasn t able to achieve 100 coverage These variations stem from the fact that this list of citations we were given isn t complete so some citations are incorrect and missing different parts It will require someone to manually go through to add in the citations we were unable to hit Another problem we faced was that sometimes the parsing program s output isn t perfect It has around 80 success rate Sometimes though it has an error and ends up sticking parts of the citations in the wrong fields This is because of situations where it is very difficult near impossible to tell a citation s title and publication apart from each other Fixing these errors in the output will require some manual labor to fix 2 Future Work This project has a lot of potential for being used in the future We designed it with extensibility in mind The developer tree tool allows someone to create a regex tree 32 tailored specifically to what they want to parse With some development work this parsing program could easily be made to parse other institutions citation lists as well Since it is so customizable it can search for whatever the client needs it to Acknowledgements 1 We would like to thank our client Nathan Hall He met with us often and w
32. ons that the program is unable to process it dumps them into a failed output text file so We also created an additional program as an alternative solution to ours It uses Brown University s FreeCite parsing program and then outputs parsed citations to a CSV file User Manual 1 Regular Expression Tree Usage The tree tool is a developer tool used to build a regex tree which the main program uses to parse citations Upon starting this program you will see an empty tree as well as having the option to import or export a tree When dealing with an empty tree you will see a bar labelled New Regex Node You may click it to fill in what the new node will be The top bar is the regular expression and the bottom bar is for the field and capturing group As an example you could put in a zA Z s 0 9 as your regex and Author Year as the capturing groups This regex would be capable of parsing something such as Bob 2005 and it would designate Bob as the author and 2005 as the year Once you have entered everything you want for your regex node press enter and it will be entered into the tree Now you can export this tree file to be used by a citation parsing program To do this press file and export and you will be able to name the file which will be saved You can also add child nodes to a node Right click a node and press add child node It will create a new regex node which you can enter in another regex into When crea
33. p A zip file containing all source code for the project FinalReport pdt docx A PDF file and a DOCX file comprised of this document FinalPresentation pdf pptx A PDF file and a PPTX file of our final presentation VII clean Det A text file comprised of citations to parse regex_tree rgxt A sample tree to parse VTTI_clean txt FreeCiteTool zip A side tool for non specific parsing using Brown s FreeCite API 2 Backend Design 2 1 Introduction The problem we need to solve is parsing through a large list of citations several of them having variations making them different from each other To 11 tackle this we are going to program in Java and use an object oriented approach to help separate the different fields in each citation 2 2 Tools Used Our IDE of choice is Eclipse Eclipse has several nice features for programming in Java as well as integration with GIT so we will easily be able to keep our program in source control This is vital since we are working ina medium sized group of 4 people 2 3 OO Design We will parse the document of citations and store them as citation objects each one containing different fields for all of the members of a citation For example the citation class would hold several different fields such as author title etc public class Figure 3 Shown above is an example Java Class for storing citations after parsing With this framework in place we would then write the Citations
34. r to explain it test it develop an interface for it and ultimately extend it The backend will include overall design the parsing structure and performance concerns 20 GUI Developer The GUI will be a small but important part of the utility As Dustin is the most experienced with the JavaFX and Java Swing utilities he will be in charge of designing and deploying the interface The interface will be responsible for file input and output selection passing this information to the backend and displaying any errors that arise when using the tool Test Developer Unit testing will be a critical and large part of ensuring that the utility acts as expected and does not collapse under unexpected circumstances While it is large as it will be developed in concert with the backend to ensure correctness as we proceed only one person will be necessary to develop the tests Fenton is experienced with testing through work experience and therefore will be in charge Point of Contact It will be a simple job but a necessary one for the project Fenton has been designated for contacting the client the VIII and the professor in order to obtain meeting times updates and send information to any interested parties Scrum Master Zach person is in charge on making sure everybody is using the scrum board correctly and that cards are moving across the board Also he is in charge of making sure the Github branches are up to date deleted etc Pre
35. rations in Vehicle Structures Figure 8a The output file from the citation parser This is a CSV file that separates sections such as Authors and Date by commas so that the client may easily see portions of the citation File Edit Format View Help Algorithm Evaluation Methodology with Evaluation of Three Alert Algorithms 100 Car Follow On Subtask 5 Final Repor and Proposed Solutions Journal of Transportation Engineering Vol 132 7 pp 555 565 Arafeh M and Rakha H 2005 Genetic Algorithm Approach for Locating Automatic Vehicle Identification Readers 8th 1 Figure 8b The Error Output File This file contains all citations or text that failed the parsing process Moving forward the tool will be extended so that it supports the following 26 Running as an executable Jar Using a complete developer made Regex Parsing File by default Enhanced UI using CSS color and formatting Warning Dialog for when a citation is not able to be parsed Error Dialog and Error handling for program errors Possible addition of a progress bar if needed for slower computers 5 5 Refinement 1 Our prototyping phase left us much further ahead than we anticipated therefore refinement has been relegated to simply improving the tree structure and tying the user interface into the backend code making improvement and fixing bugs where need be Our first iteration of the tree is visualized as follows Y UMW Author Year Title Publication
36. rrent progress of the program and allows cancellation of the parsing 15 f Citation Parser Citation File Please Select A File Output CSV C DefaultLocation Figure 5c Shown above is the completion screen for the Citation Parser This screen allows the user to acknowledge parse completion and to return to the start screen Citation Parser There was a problem parsing some ofthe citations these citations have been provided in a log file at C OutputLocation ParsingErrorLog txt E Figure 5d Shown above is the warning pop up for the Citation Parser The warning alerts the user to an log file created when some citations can not be parsed Citation Parser LI X Ci Error NullPointerException Code Is Broken The program will now exit OL q Figure 5e Shown above is the error pop up for the Citation Parser The Error Alert will open whenever there is an error reading writing files or if the program critically fails 4 Implementation Plan 4 1 Programming Language The client requested that the final program be runnable on Windows Mac and Linux so we need a language capable of running on each of these operating systems to meet the given criteria we selected Java as our programming language for the project Java is also an object oriented language which is ideal for the project since we are breaking up the citations into objects for our parsing logic The GUI is going to be cr
37. sentation Manager Kevin is in charge of making sure that the presentation is finished and of informing the group about who is presenting what 5 Prototyping 5 1 Regular Expression Tree In order to save processing time and make it easier to branch out regular expressions so that our text transformation project can quickly identify the best solution to a problem we use a tree structure that the transformation project traverses Each node of the tree contains the following information A data field which contains the compiled regular expression we want to use A children field which contains all instances of children of this node An order list which contains a list of Order enumerations dictating the order of the string for when we piece the regular expression matches back together This tree is compiled from binary data created by secondary tool we have developed to allow for ease of creating a tree structure The secondary tool and binary data structure are detailed below 5 2 Tree Tool 21 The Tree Tool was developed so that the team could easily build and edit a Regular Expression Tree as development moved forward The tool allows developers to create a tree from scratch or to import an existing tree from a binary file when the developers are done they can export the tree on screen into a binary file format To edit any node in the tree double click the node and edit the text in the text boxes that appear Th
38. th sleep quality and safety reli Pr 2002 The relationship between truck driver sleeper berth sleep quality and safety reli Pr 2005 An overview of the 100 car naturalistic study and findings In 2005 An overview of the 100 car naturalistic study and findings In 2003 Lessons learned during two naturalistic truck driving studies Pr 2002 Naturalistic crash pre crash data collection Task 10 report Ni 2001 Naturalistic crash pre crash data collection Task 1 report Ni 2002 Naturalistic crash pre crash data collection Phase final report Ni 2002 The 100 car naturalistic driving study Phase experimental design DOT HS 809 W 2002 The 100 car naturalistic driving study Phase experimental design DOT HS 809 W 2006 Intersection Decision Support Evaluation of a violation warning system to mitige Ct 2007 Investigation of driver infrastructure and driver vehicle interfaces for an interse Jo 1998 Long haul drivers perspective on sleeper berth usage and fatigue in the trucking S 2001 Operational review of specialty vehicles Preliminary needs analysis with empha Bl 2005 An overview of the 100 car naturalistic study and findings In 1996 A comparison of landing site safety training techniques for loggers Pr 2001 In vehicle information systems and law enforcement A preliminary needs asses IT 2001 Improvement of conspicuity of trailblazing signs Phase III Evaluation of fluore Sp 1999 Evaluation of Unassigned Sign Colors for Incident Management
39. ting a tree have the higher level nodes be more general and have the lower level nodes be more and more specific You can also remove leaf nodes To do this right click on a leaf node and press remove leaf node It will be deleted To import a tree press file and import and it will open up your file explorer Select a RGXT file which was created using this tool 2 Text Transformation GUI Usage The Text Transformation program s interface takes 3 main inputs The first is used to select which binary regex tree the parsing program will use The second input is to select which file you want to parse You must ensure that it is a plain text file The third thing you must select is the output destination Once you run the file the output csv file as well as the failed output text file will be placed in this destination After you have selected all 3 files directories you will be able to press the start button to begin running the parsing program You will see a progress bar which shows you how far along the parsing is It will increment forward for every citation it hits or misses Once it has completed parsing the citations it will produce two output files The first is a csv file which contains citations which were successfully hit by the parsing program Each citation will be broken up into different fields The second output file is a text file which contains the citations the parser wasn t able to match If you end up with a large fai
40. would be okay and for incredibly miniscule changes no review would be necessary 4 7 Timeline e Feb 9 23 o Finish the wiki write up and get it signed by our client and Dr Fox Completed o Setup code repositories and tools required for development Completed e Feb 23 March 9 o Put Requirements and Design into Project Report Completed o Determine tools and libraries to use Completed e March 9 March 23 o Set up an initial user interface o Set up testing classes and initial tests for the utility Completed o Deliver an initial working demonstration to the client by March 20th e March 23 April 6 o Add in more parsing utility code to improve citation coverage o Add error output code to parser o Add error output to user interface o Increase code coverage of tests e April 6 April 20 o Finalize project for client Demonstrate project for client m Hand over source code documentation and compiled executable o Final push for improved performance o Assure that the project will run on different platforms as expected 4 8 Roles We have three main coding roles and three external roles in this project The roles are e Backend Developer The backend will undoubtedly be the bulk of the project it is after all the backbone Without it we do not have a project Therefore it is imperative that every team member collaborates and contributes to it We will all need to understand the inner workings of the project in orde
Download Pdf Manuals
Related Search
Related Contents
BROTHER KH 260 FRANCAIS Partie 2 Le simulateur de la CWaPE Manuel d`utilisation AFP - User Manual - Canadian Paediatric Surveillance Program No 44 Mode d`emploi - Leroy Merlin EEM-FDM Ver.3.2 取扱説明書 株式会社 EEM Poèmes - Érudit "取扱説明書" 電波時計 取扱説明書 Local Mux-6 Copyright © All rights reserved.
Failed to retrieve file