Getting Started with matchIT
Contents
1. [Screenshot: data window listing contact names and company names.] In the data window shown, scroll towards the right. matchIT has added several fields after the last field from the original data; these are the phonetic and other key fields used by matchIT to search for
2. [Screenshot: Processing Options dialog with the options Use Zip+4 Address Validation, Enable automatic suppression, Create salutations and Correctly case name and address, and an "Import this file to" path ending in DATABASE\CUSTOMER\WORKFILE.DBF.]

Select options as follows:
• matchIT needs to create match keys if you are going to find duplicates, so the Create match keys option is selected by default, but you can deselect it if you do not want to dedupe this file or compare it with another file.
• matchIT Pro allows you to create salutations and a contact field for the output, so tick the Create salutations option. NB: matchIT builds salutations as a by-product of generating match keys, so you can't choose to generate salutations on their own.
• matchIT Pro also allows you to proper-case the name and address, i.e. convert data intelligently to upper and lower case, so tick the Correctly case name and address option.

You should decide which level of deduplication your data needs using the Select matching level section. Because this data has a COMPANY field, matchIT assumes this is business data, so the options are to dedupe down to one record per Contact, one record per Business or one record per Address. With consumer data the options are to dedupe down to one record per Individual, one record per Family or one record per Household. Leave this setting as Contact for this exercise. Options allows you to change the ma
3. helpIT SYSTEMS: intelligent data cleansing
Getting Started with matchIT, Version 5.12

Copyright: matchIT is copyright helpIT systems inc 1994-2006, all rights reserved. FoxPro is copyright Microsoft Corporation 1988-2006, all rights reserved.
Trademarks: matchIT is a registered trademark of helpIT systems inc. FoxPro is a registered trademark of Microsoft Corporation. All other trademarks are also acknowledged.

Contents
Getting Started
This Guide
Installation
Evaluation System Limitations
matchIT v5 User Interface
The matchIT Toolbar
The File Selector Button
matchIT Wizard Pane
Exercise 1
Single File Deduplication
Importing Data
Single File Wizard Operation
Automatic Import Wizard
Finding Duplicates
Viewing Matches
Flagging Matches
File Output
Output File Layout
Conclusion
Exercise 2
Multiple File Deduplication
Starting the Two File Wizard
Find Suppressions
View Verif
4. File
View Text File
Import Data
ODBC Connection
Find Matches
Generate Output
Find Overlap
matchIT Options

The File Selector Button
There is frequent reference to this button in this guide.

matchIT Wizard Pane
The Wizard Pane gives you immediate access to any of matchIT's file wizards. Included in the pane is a quick link to the matchIT Job Script window. matchIT's file wizards can lead you through the entire data cleansing process. They allow inexperienced users to work through the data cleansing processes with ease while keeping the advanced data cleansing functionality that has made matchIT an industry leader. A description of each icon follows:
• Single File: Import and dedupe a single file.
• Two File: Work through various two-file processes.
• Multiple File: Import, dedupe and merge two or more files simultaneously.
• Automation: Access matchIT's pre-programmed job scripts or create your own.

Exercise 1
Single File Deduplication
This section of the guide takes you through the process of deduplicating a single file. matchIT works on a copy of your data, which is in DBF format. To create a DBF copy, data will need to be imported into matchIT.

Importing Data
The easiest and quickest way of importing data into matchIT is by using the Setup Wizard. The Single File wizard can be initiated
5. be included when you want to export information about matching records. You can always undo the results of the flagging step by unflagging flagged records, either individually via Verify Matches or globally via Database Utilities.

With a bit more experience using matchIT you will see how it is possible to flag all the closest duplicates automatically and come back to deal with the less obvious ones interactively, perhaps after asking the client or department whose data you are processing how to decide which of a given pair to flag.

You can also use the Intelligent Data Merge option, which ensures that any data present in the record being flagged but missing from the record being kept is copied over. This option is subject to rules that you can specify in the Jobs Setup Matching Setup menu.

File Output
At this stage in the exercise we have imported our data, found the duplicate entries and flagged them. Now we want to output a cleansed list. If this file is to be used for a live job, you should now select the Q.A. Wizard (Quality Assurance Wizard) button from the screen shown to examine summary information about the file, drill down to view any suspect records, view data in different orders, and output ranges or samples of the data.

[Screenshot: dialog asking which category of records to output to file.]

If you choose Records in Matched Sets, matchIT will export unique refer
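The Intelligent Data Merge idea described above (copy data that is present in the flagged record but missing from the kept record) can be illustrated with a minimal Python sketch. This is not matchIT's implementation; the function name `merge_missing`, the dictionary record layout and the sample values are assumptions made purely for illustration, and matchIT's own merge rules are configured in its Matching Setup.

```python
def merge_missing(kept: dict, flagged: dict) -> dict:
    """Return a copy of `kept` with blank fields filled from `flagged`.

    Hypothetical helper: a field counts as blank when it is missing, None,
    or only whitespace. Non-blank values in the kept record always win.
    """
    merged = dict(kept)
    for field, value in flagged.items():
        current = str(merged.get(field) or "").strip()
        incoming = str(value or "").strip()
        if not current and incoming:
            merged[field] = value
    return merged


# Illustrative (made-up) pair of duplicate records.
kept_record = {"CONTACT": "Mr B Norton", "TELEPHONE": "", "COMPANY": "Pacific Enterprises"}
flagged_record = {"CONTACT": "B Norton", "TELEPHONE": "555 0100", "COMPANY": ""}

print(merge_missing(kept_record, flagged_record))
```

Only the empty TELEPHONE field is filled here; the kept record's CONTACT and COMPANY values are left untouched because they are already populated.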
6. output

Quality Assurance Wizard
In the dialogs presented automatically after the Importation of Records or the Deletion of Matches, you will see an option to use matchIT's Quality Assurance Wizard. The Quality Assurance Wizard is also available from the Output menu. Selecting this option lets you display the Data Summary, View Records by Category and View Data within the DBF.

Select the Data Summary to view the same report as was generated in Exercise 1, but this time for the WORKFILE.DBF in Exercise 2. Scroll down or go to the next page to look at the information further into the report. This section of the report summarizes information about:
• Data Extraction
• Potential Data Errors on records which should perhaps be excluded from the output
• Main Input Options

[Screenshot: Data Summary report, reproduced below as far as it can be read.]

Data Extraction
  Company Names: N/A
  Job Titles: N/A
  Premise Numbers: 58 (77.33%)
  Cities: N/A
  States: N/A
  Countries: N/A
Possible Data Errors
  Excluded Records: 1.33%
  Potential Address Truncation: 0.00%
  Blank Address 1: 0.00%
  Blank Address Keys: 0.00%
  Blank Individual Name Keys: 6.67%
Main Input Options
  Input format: Comma
  Nationality: US
  Postcode verification: PAF
  Correct postcode format
  Full name contact

NB: If you have changed these settings since the job was run, the curre
7. [Screenshot: Matching Results dialog listing the match key expressions with the time taken for each, a note that it shows the number of potential duplicates, and Verify Matches and Flag Matches buttons.]

After EXAMPLE1.TXT has been imported and deduped, matchIT will begin to work with EXAMPLE2.TXT, the suppression file. matchIT will prompt you to begin the import of EXAMPLE2.TXT. Since this is a suppression file, matchIT will not flag any duplicates within the file. Select Continue to begin importing EXAMPLE2.TXT. EXAMPLE2.TXT will go through a similar process to EXAMPLE1.TXT, excluding internal deduplication. Save this imported file as STOPFILE.DBF and click Continue.

Now we are going to purge names from WORKFILE.DBF that already exist in STOPFILE.DBF. This ensures that we remove existing customers from the WORKFILE.DBF file that we just deduped.

Find Suppressions
To purge the STOPFILE records that also exist in the WORKFILE, select Find Suppressions.

[Screenshot: matchIT two file wizard reporting that WORKFILE.DBF and the suppression file STOPFILE.DBF were successfully imported, with a Find Suppressions button.]

Matches Found by Score Range
Score      Number of Matches in Range
80-89      1
90-99      5
100-109    4
110-119    2
120-130    1

This report s
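The purge step described above, removing from the work file any record that also appears in the suppression (stop) file, can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it compares an exact normalized name plus the first five Zip characters, whereas matchIT scores fuzzy matches. The function names, the `name` and `zip` field names and the sample records are hypothetical.

```python
def suppression_key(rec: dict) -> str:
    """Build a crude comparison key: letters and digits of the name, plus 5-digit Zip."""
    name = "".join(ch for ch in rec["name"].upper() if ch.isalnum())
    return name + "|" + rec["zip"][:5]


def remove_suppressions(work: list, stop: list) -> tuple:
    """Split `work` into (kept, suppressed) according to keys found in `stop`."""
    stop_keys = {suppression_key(rec) for rec in stop}
    kept, suppressed = [], []
    for rec in work:
        if suppression_key(rec) in stop_keys:
            suppressed.append(rec)
        else:
            kept.append(rec)
    return kept, suppressed


# Made-up records standing in for WORKFILE and STOPFILE contents.
work_file = [
    {"name": "John Smith", "zip": "12345-6789"},
    {"name": "Jane Norton", "zip": "54321"},
]
stop_file = [{"name": "JOHN  SMITH", "zip": "12345"}]

kept, suppressed = remove_suppressions(work_file, stop_file)
print(len(kept), "kept;", len(suppressed), "suppressed")
```

A set lookup keeps the comparison linear in the size of the two files, which is the same reason key-based matching is attractive for larger lists.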
8. [Screenshot: Setup Wizard data window showing sample name, company and address columns, with a Field Information panel below (field name FULLNAME, field width 40).]

If the Setup Wizard has not understood what is in a column, right-click on the field name (i.e. the column heading) and then select Rename Field. The appropriate field name can then be chosen from the drop-down list. For all the name and address fields in a data set, you should use the names that are shown in the field name drop-down list. matchIT refers to these field names as part of its processing, so all the address lines should simply be labeled ADDRESS LINE; matchIT will number them and compare acr
9. Information
On-line help is available from the F1 key. The Getting Started Guide, Frequently Asked Questions and full User Manual are available from the Help menu in matchIT.

If you have any problems or questions, or want to know if the results that you are receiving can be improved, you can refer to the Frequently Asked Questions section of our website. Alternatively, please don't hesitate to call us. With our experience we should be able to help you get the best out of matchIT very quickly, and we'll be only too pleased to help.

If you would like information about the core standardization and matching components of matchIT for plugging into your own systems, please contact us for information about findIT.

For help and more information please email or visit: Sales@helpIT.com, www.helpIT.com

Other contact details are as follows:

US Office
helpIT systems inc
560 South Winchester Blvd, 5th Floor
San Jose, CA 95128
Tel: 866 628 2448
Fax: 408 236 7491
Support: Support US helpIT com

UK Office
helpIT systems ltd
Stocks House, 9 North Street
Leatherhead, Surrey KT22 7AX
Tel: +44 1372 360070
Fax: +44 1372 360081
Support: Support@helpIT.com
10. [Screenshot: data window listing contact names and company names.] matches. At the end you can see the generated Contact and Salutation fields. matchIT has derived the Contact field from the supplied name and worked out the correct Salutation.

By viewing the data in order of a field, you are more likely to see unusual and perhaps suspect values of that field near the top of the data, which is a very useful Quality Assurance technique. If you scroll down the list and compare what's in these fields against the input name fields, you will see how matchIT deals with complex and uncommon name structures, even when an input file has the contact name within one field.

Press Escape to close the view and then choose the Output tab from the main Quality Assurance Wizard dialog. The next dialog allows you to output one-in-N samples and selections of records. Select Close to close the Quality Assurance Wizard.

Address & Zip code Validation
Any version of matchIT (US install) can have the US Address and Zip code validation module, addressIT, integrated with it, using the addressing engine supplied by Datatech. The standard evaluation configuration does not include addressIT, but you can request an evaluation; we will then send you a CD containing the United States Postal Service Address File (USPS).
11. [Screenshot: Verify Matches window showing a pair of records side by side (ADDRESS4, ZIP, TELEPHONE, PREMISE, CONTACT and SALUTATION fields), with False Match, Mark Place, Resume Place and Done buttons and a matching score of 80.]

This screen allows you to display each pair of potential duplicates in turn. Below each pair the matching score is shown. The matches are shown with the least likely dupes (lowest match scores) first, as you may only want to review the lower-scoring pairs. You can decide interactively which of the pair to retain, cut and paste between records, or simply verify whether the duplicates are candidates for flagging. Clearly you don't have to wade through all the pairs of duplicates in this way, but Verify Matches is a good way of establishing the correct threshold above which it is safe to globally flag the duplicate records.

Some of the fields in the pairs of records are color coded; this is to highlight where matchIT thinks the main differences lie between the two records. Those that are marked in red show fields which are clearly different in content, and those highlighted in yellow show where one field's contents are contained within the corresponding field in the other record.

The first pair displayed has a score of 80. This is because pairs of records that have scored less than 80 are not thought by matchIT to be duplicates. NB: The match score is not a percentage but a grade
12. The central feedback window will keep you informed of progress, and once your file has been processed the Data Summary will be displayed as shown below. This report shows the quality and nature of the records in the file before duplicates are removed; more information about the use of this report is given on page 23. You can click on any part of the report to zoom in or out, or use the scroll bars to see the different sections of the report.

[Screenshot: Data Summary report, Database Information section for WORKFILE.DBF.]

On the print preview toolbar, click the print button to print the report on the default printer and close it, or the close button to close the page preview without printing.

Immediately following the Data Summary, the Internal Matching Summary is displayed. This shows how many duplicates matchIT found within the file and what score range those matches fall into. By default, matchIT uses the keys below to flag matches for comparison:
• phonetic lastname key and Zip
• phonetic lastname key together with first initial, and phonetic street key
• street number and phonetic address key (city and street)

Note: You can use these defaults with confidence when finding duplicates in nearly all normal US data. In cases where you are dealing with complicated or badly structured data, matchIT has the flexibility to find all duplicates no ma
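To make the idea of the three default match keys listed above more concrete, here is a small Python sketch that builds comparable keys and groups records sharing any key into candidate pairs. It is an illustration only: matchIT's phonetic algorithm is not documented in this guide, so classic Soundex is swapped in as a stand-in, and the function names, field names and sample records are all assumptions.

```python
from collections import defaultdict
from itertools import combinations


def soundex(word: str) -> str:
    """Classic four-character Soundex code (a stand-in for matchIT's phonetic keys)."""
    codes = {}
    for letters, digit in (("BFPV", "1"), ("CGJKQSXZ", "2"), ("DT", "3"),
                           ("L", "4"), ("MN", "5"), ("R", "6")):
        for ch in letters:
            codes[ch] = digit
    letters = [c for c in word.upper() if c.isalpha()]
    if not letters:
        return ""
    result = letters[0]
    prev = codes.get(letters[0], "")
    for ch in letters[1:]:
        if ch in "HW":
            continue  # H and W do not break a run of equal codes
        code = codes.get(ch, "")
        if code and code != prev:
            result += code
        prev = code  # a vowel resets prev, so a repeated code after it is kept
    return (result + "000")[:4]


def match_keys(rec: dict) -> list:
    """Three keys modelled loosely on the defaults listed in the guide."""
    last = soundex(rec["lastname"])
    street = soundex(rec["street"])
    return [
        "K1|" + last + "|" + rec["zip"][:5],
        "K2|" + last + rec["firstname"][:1].upper() + "|" + street,
        "K3|" + rec["premise"] + "|" + soundex(rec["city"]) + street,
    ]


def candidate_pairs(records: list) -> set:
    """Return index pairs of records that share at least one key."""
    buckets = defaultdict(list)
    for idx, rec in enumerate(records):
        for key in match_keys(rec):
            buckets[key].append(idx)
    pairs = set()
    for members in buckets.values():
        pairs.update(combinations(sorted(set(members)), 2))
    return pairs
```

Records that land in the same bucket are only candidates; a separate scoring step (in matchIT, the match score) would decide whether a pair is really a duplicate. Using several keys means a typo in one field does not hide the pair from every key.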
13. [Screenshot: Addressing tab of the Produce Output window, with options to select records by CASS status (including CASS addressed failures only), Generate CASS 3553 report, and Select Address Output Configuration (City, State, Zip separate). A note in the dialog reads: This feature allows you to modify the current Output File Layout to output a formatted address as selected above. All fields other than address and Zip will be output as specified by the Output File Layout. Note: You cannot use this and choose Shuffle Up Address Lines at the same time.]

Generate Output
When you reach the output stage for a file that has been run through addressIT, the Addressing tab of the Produce Output window will be available for use. These Addressing options are available in addition to the Main Options for outputting a file. They allow you to output subsets of your file depending on the CASS status of a record, while providing the ability to format the address lines and generate a 3553 CASS report for the resultant file.

Selecting any of the options in the Addressing tab can and will modify the source file layout. If you would like to maintain a layout that has been specified, you will need to set the Select Records to Output option to Output all records and the Select Address Output Configuration option to Do not modify the Output File Layout.

Further
14. atchIT defaults to saving the file in its original format, we can change the output file format here. Drop down the Output Format list if you want to select a different format from COMMA; this is a comma-delimited file without a header record, the same as the input file. If you want a header record to be inserted in the output file to label each field, select CSV near the bottom of the drop-down list. Other common options are Tab-delimited, SDF (fixed-width text), DBF and Excel (which is limited to 65,000 records). Microsoft Word can link to a comma-delimited file as a data source for mail merge, so COMMA or CSV is a good choice for our exercise.

Next, click on the file selector button to the right of the Destination File Name to choose a different destination directory and name if you wish, and select Generate Output. matchIT now generates the output file and, when finished, displays a message box that shows how many records have been output.

Conclusion
Now you have finished the whole process. Of course this is a simple example, but matchIT is extremely flexible and sophisticated if you need it to be. As you become more experienced you can fine-tune matchIT to find all the duplicates in any data, no matter how badly structured or keyed. You can also find the common entries in multiple files (introduced in Exercise 2), automate frequent or complex jobs using the Job Script function of matchIT Pro, and output address labels or mail merge pa
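The difference between the COMMA and CSV output formats described above (the same comma-delimited data, without or with a header record) can be shown with Python's standard `csv` module. This sketch is not how matchIT writes its files; the function name `write_delimited`, the field list and the sample row are hypothetical and serve only to show what a header record adds.

```python
import csv
import io


def write_delimited(rows: list, fields: list, header: bool) -> str:
    """Return comma-delimited text for `rows`, optionally with a header record."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    if header:
        writer.writerow(fields)  # the "CSV"-style header record labelling each field
    for row in rows:
        writer.writerow([row.get(name, "") for name in fields])
    return buffer.getvalue()


# Made-up output record using field names that appear in the guide.
fields = ["CONTACT", "COMPANY", "ZIP"]
rows = [{"CONTACT": "Mr B Norton", "COMPANY": "Pacific Enterprises", "ZIP": "12345"}]

print(write_delimited(rows, fields, header=False))  # like the COMMA format
print(write_delimited(rows, fields, header=True))   # like the CSV format
```

A header record is convenient when the file is used as a mail merge data source, because the merge fields can then be picked by name rather than by position.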
15. cord highlighted in pink, which shows the longest example of each field in the file sample. If the input file is a fixed-width text file, the Split Field or Combine Fields buttons can be used to correct the Setup Wizard if it has not been able to work out where one field ends and the next begins. When splitting fields, click in the data window at the point where the field should end, then right-click on the field header and select the Split Field button. Then use the Rename Field button to correct the field header.

The Add Field button can be used to create extra fields which matchIT can populate later on. For example, if the input name is of the form Mr J Smith or John Smith, all in one field, you can add fields for Title (Prefix), Firstnames and Lastname, which matchIT will populate automatically on import with the appropriate components of the name. matchIT will rename these fields as Prefix, Forenames and Surname.

Once the field names given to your data have been reviewed, select Continue. The next dialog box displayed allows you to tell the Setup Wizard what to do with the data after it has been imported.

Finding Duplicates
The Processing Options window allows you to tell the Setup Wizard what needs to be done to the data after importation.

[Screenshot: matchIT Setup Wizard Processing Options dialog, with Select matching level (Contact, Business, Address) and the Create match keys option.]
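For readers unfamiliar with fixed-width text files, the Split Field step described earlier in this section amounts to telling the importer where each column ends. A minimal Python sketch of that idea follows; the function name `split_fixed_width`, the column widths and the sample line are assumptions for illustration and are not taken from EXAMPLE2.TXT.

```python
def split_fixed_width(line: str, widths: list) -> list:
    """Cut `line` into fields of the given widths and trim the padding."""
    fields = []
    position = 0
    for width in widths:
        fields.append(line[position:position + width].strip())
        position += width
    return fields


# Hypothetical layout: 20 characters of name, 25 of address, 5 of Zip.
sample = "Mr J Smith".ljust(20) + "12 Main Street".ljust(25) + "12345"
print(split_fixed_width(sample, [20, 25, 5]))
```

If a boundary is set one character too early or too late, every record is split in the wrong place, which is why the wizard asks you to check the split against the sample data before importing.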
16. eft. This will open a similar window to the single file process used in Exercise 1, but now you will select a second file and the process you would like to perform. Unlike EXAMPLE1.TXT, EXAMPLE2.TXT is in fixed-width format, but the Setup Wizard will be able to recognize this.

For this exercise we will Find the overlap with your Suppression File. Select that option from the Processing Type drop-down. Next, open EXAMPLE1.TXT as your Work File. Then set your Suppression File to EXAMPLE2.TXT. EXAMPLE2 has no internal matches. Select Continue once you have chosen a process and the necessary files.

[Screenshot: matchIT Import Manager, Two File tab, with Processing Type set to "Find the overlap with your Suppression File", Work File EXAMPLE1.TXT and Suppression File EXAMPLE2.TXT from the matchIT IMPORT directory, and Swap Files, Cancel and Continue buttons.]

The next window displayed is the field labeling window for EXAMPLE1.TXT; you may recognize this window, as it is the same window used in Exercise 1. Make sure all fields are labeled correctly and then select the Continue button.

[Screenshot: matchIT Suppression Wizard Work File Options dialog, with Select matching level (Contact, Business, Address) and the options Use Zip+4 Address Validation, Enable automatic suppression, Create salutations and Correctly case name and address.]
17. ence numbers for matching pairs to a file. This will enable you to remove duplicate records from your source database and reassign orphan records using a program external to matchIT. Choosing Flagged Records outputs only data for the records marked as duplicates. For this exercise, choose to output the Deduped File. This displays the Produce Output dialog, which has two tabs: Main Options and Campaign History (only enabled in matchIT Campaign). A third tab is displayed if you have run your file through addressIT.

[Screenshot: Produce Output dialog, Main Options tab, with Output File Layout (layout name: Use all Fields), Output File Details (destination file name, output format COMMA), Data Options (record at which to start output, sequential numbering, Shuffle Up Address Lines), Sampling Options (selection, one-in-N sampling), Filters and Ordering (index order, filter) and a Generate Output button.]

Output File Layout
matchIT allows you to save output layouts for future reuse. To create a new layout, select New in the Output File Layout section. The resultant field picker (see below) shows you all fields that are available in the currently selected file (WORKFILE.DBF) in the left-hand pane, and all the fields that have been selected for output in the right-hand pane.
18. file before output, print them and/or write them out to a separate file. For this exercise, with WORKFILE.DBF open, tick Default Salutation and select Browse on screen. Select Continue, then Continue again on the next dialog, to view any records for which matchIT could not derive a proper salutation, e.g. records with no contact name, records with no prefix and a unisex first name, and records with an inconsistent first name and title (e.g. Mr Stella Black).

The Default Salutation shown is Dear Customer, but you can change this from the Jobs Setup menu (Options, Input tab) or from More options at the end of the Setup Wizard.

[Screenshot: matchIT Options dialog, Input tab, showing Input format, Next reference number and Default salutation set to Dear Customer.]

After you view the selected records, matchIT returns you to the View Records by Category screen to select more categories if you wish. To leave this screen select Close.

Next, select the View Data button in the upper right of the main Quality Assurance Wizard dialog. This displays the Browse Customer Database dialog. You can view matchIT databases in order of any field: click on Order Records, then click on Continue. On the next dialog, scroll to the bottom of the field list and double-click on Salutation as the field on which to order the new view.
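The fallback behaviour described above, where a default salutation such as Dear Customer is used when no proper salutation can be derived, can be sketched as follows. This is a simplified illustration, not matchIT's logic: the tiny lookup tables, the choice of "Ms" for an inferred female prefix and the function name `build_salutation` are all assumptions.

```python
DEFAULT_SALUTATION = "Dear Customer"

# Tiny illustrative lookups; a real system would need far larger tables.
PREFIX_SEX = {"MR": "M", "MRS": "F", "MS": "F", "MISS": "F"}
FIRST_NAME_SEX = {"JOHN": "M", "ROBERT": "M", "STELLA": "F"}  # unisex names omitted


def build_salutation(prefix: str, firstname: str, surname: str,
                     default: str = DEFAULT_SALUTATION) -> str:
    """Return 'Dear <prefix> <surname>' or the default when that is unsafe."""
    prefix = prefix.strip().rstrip(".")
    surname = surname.strip()
    sex_from_name = FIRST_NAME_SEX.get(firstname.strip().upper())  # None if unknown or unisex
    if not surname:
        return default  # no contact name to address
    if not prefix:
        if sex_from_name is None:
            return default  # no prefix and the first name does not settle it
        prefix = "Mr" if sex_from_name == "M" else "Ms"  # assumed convention
    sex_from_prefix = PREFIX_SEX.get(prefix.upper())
    if sex_from_prefix and sex_from_name and sex_from_prefix != sex_from_name:
        return default  # inconsistent, e.g. Mr Stella Black
    return f"Dear {prefix} {surname}"


print(build_salutation("Mr", "John", "Smith"))
print(build_salutation("Mr", "Stella", "Black"))
```

Sorting the output by the salutation field, as the guide suggests, would bring all the records that fell back to the default together for review.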
19. from the single file button in the Wizard Pane. The wizard builds a database inside matchIT into which data is copied. To begin with, you will be asked which file you would like to import from. The Setup Wizard initially defaults to looking in the matchIT Import directory, but a different drive or directory can be selected. The Setup Wizard helps to import any of the common text file types, as well as native Access, Excel and ODBC data sources. If a file is not displayed, you can select All Files from the drop-down. Alternatively, you can select DBF Tables, Access Databases or Excel Worksheets from the Files of Type drop-down list; the appropriate files are then displayed. To use ODBC, an ODBC connection must be set up first.

To let you follow through a worked example, we have included a file called EXAMPLE1.TXT in this directory. This is a file of test data, using fictional data made up for the purpose of demonstrating matchIT. This section of the guide assumes that you will be using that file. You can of course use your own data instead; this guide will be most relevant to data that contains similar information to the example data, but you should be able to interpret these instructions as appropriate.

First, select the Single File button on the left-hand side of the matchIT user interface. Then use the file selector button to open the file selector window. Either highlight EXAMPLE1.TXT and click
20. g Results dialog box. The following dialog is displayed:

[Screenshot: View Matches Options dialog with Report format (Business), Destination (Preview), Report grouping (Pairs), Low score, High score (130), All keys, Run number, Create matches file, Score sample size and Sort by score settings, and Cancel, More and Continue buttons.]

This dialog is used to indicate which matches you want to print or preview, and in what order. Change the Report grouping to Sets from the default Pairs, but leave the Report Format as Business with Destination set to Preview. Click on Continue. A report is displayed listing all matching records, showing the unique references, name, address etc. You can zoom out, scroll, print or close this report in the same way as the earlier reports.

Note: You can customize any of matchIT's reports via the Edit Output Layouts menu, e.g. to show additional data items or to change the formatting of the report. Please see Going Further with matchIT for more information.

Flagging Matches
Once back at the Matching Results dialog, select Flag Matches from the options at the bottom of the window. You are prompted to specify a score equal to or above which it is safe to flag duplicates, so leaving the minimum score at 80 will flag all duplicates found. Once your flagging score is set, select the Flag button. A dialog will then be displayed showing the results of the flagging step. The records flagged will be excluded from any deduped file output but will
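The flagging step described above (flag the duplicate in every pair whose score is equal to or above a minimum score, then leave flagged records out of the deduped output) can be sketched in Python. The data structures, the names `MatchPair`, `flag_duplicates` and `deduped_output`, and the sample scores are hypothetical; only the default minimum of 80 comes from the guide.

```python
from dataclasses import dataclass


@dataclass
class MatchPair:
    keep_ref: int   # unique reference of the record to retain
    dupe_ref: int   # unique reference of the record that would be flagged
    score: int      # match score (a grade, not a percentage)


def flag_duplicates(pairs: list, min_score: int = 80) -> set:
    """Return references flagged because their pair scored at least `min_score`."""
    return {pair.dupe_ref for pair in pairs if pair.score >= min_score}


def deduped_output(records: list, flagged: set) -> list:
    """Keep only records whose reference was not flagged."""
    return [rec for rec in records if rec["ref"] not in flagged]


# Made-up example: three records, two scored pairs.
records = [{"ref": 1}, {"ref": 2}, {"ref": 3}]
pairs = [MatchPair(keep_ref=1, dupe_ref=2, score=95),
         MatchPair(keep_ref=1, dupe_ref=3, score=82)]

print(deduped_output(records, flag_duplicates(pairs)))                # flags refs 2 and 3
print(deduped_output(records, flag_duplicates(pairs, min_score=90)))  # flags ref 2 only
```

Raising the minimum score flags fewer records automatically and leaves the lower-scoring pairs for interactive review, which mirrors the workflow the guide recommends.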
21. ges direct to the printer.

Exercise 2
Multiple File Deduplication
What you have done so far is to dedupe one file. matchIT can also be used to find the common (overlapping) records across two databases, using the Merge/Purge functionality that is available in all versions except matchIT Lite. The ability to find records common to two lists is a powerful function. It allows you to:
• Purge existing customers from bought-in mailing lists
• Merge databases from regional offices with a head office file without creating duplication
• Transfer Data from matching records in one file to records in the other file
• Write Overlapping Records, to output records that exist in both files to a third file

All these options are available from the Merge/Purge menu. You can perform the Merge/Purge step on two databases of different structures, e.g. one database may have the name all in one field and the other split into title, first names and last name, as in this example.

If you have more than two files that you would like to Merge/Purge, you can use the Multiple File Wizard (available in matchIT Pro and matchIT Campaign) from the matchIT Wizard Pane. Alternatively, matchIT has the ability to work through files simultaneously in a process similar to that of Exercise 1. The Two File Wizard process is explained in the following sections.

Starting the Two File Wizard
Select the Two File Wizard button on the l
22. hows the matching keys that found overlap, the score breakdown and the total number of overlapping records.

View/Verify Overlap
After you close the Overlap Summary, the Matching Results window will be displayed. However, the Matching Results dialog now has different options from when you were deduping the single file.

[Screenshot: Matching Results dialog with Number of Duplicates and Time Taken columns, reading in extraction order: 21, Phonetic Last Name Key and First 5 characters of Zip, 00:00:01; 22, Phonetic Last Name Key & Initial and Phonetic Key of Street, 00:00:01; 23, Phonetic Address Key (Town/City & Street) and Building or apartment number, 00:00:01. A note says that this shows the number of potential duplicates. Buttons: Cancel, View Overlap, Verify Overlap, Remove Suppressions.]

View Overlap and Verify Overlap work in a similar way to View Matches and Verify Matches. Select Verify Overlap to see the matching pairs from the two databases.

[Screenshot: Verify Overlap window showing an overlapping pair with score 90. The record in the main database has the name split into PREFIX (Mr), FORENAMES (John) and SURNAME fields, while the record in the second database holds the name in one field; both show the company Furnas Electric Company, Norwich, New York 13815, premise 1000, with CONTACT and SALUTATION fields and False Match and Change Fields buttons.]

Note how the databases'
23. different structure is reflected in the layout of the screen. If there are more fields to see, you can use the central scroll bar to view the other fields. Choose Done to return to the Matching Results dialog once you have looked through the overlapping records.

Remove Suppressions
Having found the common entries between these two files, you can now select Remove Suppressions. You will then be prompted to flag records that scored above a user-determined threshold (80 is matchIT's default; 40 for Address level matching). The Results window will open after the records have been flagged. From here you can use the Quality Assurance Wizard, output a Cleaned File or output Flagged Records. Select Cleaned File to generate a clean file, one with no internal duplicates or overlapping records.

Note: If at any stage you depart from the prompts for the usual options that the automatic dialogs display, you can select those options as required from the menus, e.g. Output to File from the Output menu.

Generating a Clean List
You can now generate a clean output file, one with no internal duplicates or overlapping records. After selecting the Cleaned File button, you will see matchIT's Produce Output window. From this window you can determine the file layout, format and destination as you previously did in Exercise 1. After the file has been generated, you will see a window displaying the number of records
24. [Screenshot: matchIT Setup Wizard Processing Options dialog, with Select matching level (Contact, Business, Address), the options Create match keys, Use Zip+4 Address Validation, Enable automatic suppression, Create salutations and Correctly case name and address, and an "Import this file to" path ending in DATABASE\CUSTOMER\EXAMPLE1.DBF.]

As with all of matchIT's operations, the data must first be imported into a DBF file. The Single or Two File Wizards on the left pane allow you to do this. The same rules for naming fields as described in Single File Wizard Operation on page 6 should be used. Once the data file has been imported via the wizard, you need to tick the Use Zip+4 Address Validation option in the Processing Options window as shown above. The Address Enhancement Wizard will then be displayed after selecting Continue. The Address Enhancement Wizard will guide you step by step through the key areas when preparing a file for Zip+4 Enhancement. For detailed information on addressIT and the Address Enhancement Wizard, please see the addressIT section of your User Manual.

Verifying the addressIT Results
When you have completed the verification and removal of duplicate records from within matchIT, you will find yourself in the Results for duplicate flagging step window. From here you can select the Quality Assurance Wizard. Once in the QA Wizard, you will now find the addressIT section available. Thi
25. n small data files. If you do not have an Activation floppy disk, you will need to contact your supplier for an activation code before you can use matchIT to process anything other than the supplied example data. You can do this from the matchIT Evaluation screen, as described in the next section. If you are installing under Windows 2000 or XP, you must have administrator rights to install matchIT and/or the matchIT activation code. If some of the dialog boxes are not fully visible after you have installed matchIT, make sure your screen resolution is set to at least 1024 x 768.

Evaluation System Limitations

The standard evaluation system is configured as matchIT Campaign and is limited to 30 days and files of up to 5,000 records. Until activated, the matchIT Evaluation screen is displayed whenever matchIT is started. Simply follow the instructions on screen to activate matchIT at this stage, then Cancel and restart matchIT to use the activated version. To use matchIT without activating it, just click on the Run button.

Note: Prior to activation, you will only be able to run matchIT on the example data supplied.

matchIT v5 User Interface

While all of matchIT's processes can be accessed from the menu bar, the most common selections have been integrated into matchIT's User Interface. Following is a description of each icon.

The matchIT Toolbar

Open an Existing matchIT
26. ner [Screenshot: output layout window for F_LAYOUT.DBF, with an Available Fields list (ADDRESSEE, COMPANY, ADDRESS1 to ADDRESS4, ZIP, NAME and further name and key fields), an Output File Structure pane, and the buttons Add >>, Add Expression, Insert Line >>, << Remove All, Save Layout and Save Layout As]

You will notice that matchIT now uses more international labels for the name fields, i.e. FULLNAME becomes ADDRESSEE, LASTNAME becomes SURNAME, FIRSTNAMES becomes FORENAMES and TITLE becomes PREFIX. We could output the ADDRESSEE field from the original data, but when we asked matchIT to generate a SALUTATION in the Setup Wizard we also generated a CONTACT field. Since this is a standardized name field that has been designed as the first line of the addressed item, this field should be the first that you select. It is one of the last fields in the database, so scroll down the available fields list until you see it and double-click the field name to move it over to the right-hand pane. We will also select the SALUTATION field, near the bottom of the list, because that is appropriate for the start of a personally addressed letter. Next, select the COMPANY field, all the address lines and the ZIP field from the top of the available list. Choose the Save Layout button and name the layout WORKFILE.OPL. Click Done to return to the Main Options dialog. Although m
27. nt settings are displayed above. To drill down and view various categories of records reported on the summary, close the report preview and select View/Edit by Category from the main Quality Assurance Wizard dialog.

[Screenshot: Quality Assurance review dialog. Its instructions read: select one or more categories of data below; you can then view the records in those categories, output them to another DBF file, create a report and/or delete the selected records from the Main File. The categories are grouped under Zip/Postcode, Data extraction, Prefix/Title, Salutations and Potential data problems, and include Extracted Company, Extracted Zip, Generated Prefix/Title, Changed Prefix/Title, Excluded records, First name contradicts Prefix, Potential address truncation, Blank address line 1, Blank Zip/Postcode, Blank individual name keys, Joint Salutation and Unusual Salutation. Output Options: Browse on screen, Output to file, Print to report, Create PDF.]

The window displayed allows you to view all records meeting the selected criteria; most of them are categorized on the Data Summary. You can select several categories to view at once. You can also choose to delete the selected records from this
28. ny options that matchIT utilizes, such as the default salutation, whether to automatically exclude records that have suspect data in them (e.g. if someone has entered "New address needed" in the address lines), and a host of lower-level options. The file selector button will let you overwrite the default file name and destination. Please enter a name of WORKFILE; this new file will be saved in DBF format. For this example file, after the options have been set, select Continue. When you become more familiar with matchIT, you can specify different matching criteria from the default settings (see Going Further with matchIT).

Now matchIT will start processing. This involves: importing the original data; generating the key fields for finding duplicates; enhancing the data as requested (e.g. salutations, casing, relocating Zip/Postcode and country data to fixed fields); locating the duplicates within the file; and generating first-level reporting. This stage should be very quick on the first example file. Processing time depends on file size and on hardware and software configuration. matchIT's performance is benchmarked at a rate of several million records per hour.

Note: There are many quick and simple things that you can do to speed things up. Some of them apply just to specific areas, such as finding matches. If you want to know more about tuning specific areas of matchIT, please contact Support US helpI
29. oss all the address lines when checking for duplication. If you are using a data file with a unique reference number for each record, you can right-click on the column heading for this field and label it Unique_Ref. This will enable you to export reference numbers for matching pairs to a file, which makes it possible to remove duplicate records from the source database. The data in the field must be genuinely unique within the input file, or an error message will be displayed after import.

If there are data items in the source file that are not relevant to deduplication or mailing, you can enter your own field name by clicking on the Rename Field button. First check that the top dropdown list, Change field labeled, shows the field name that needs to change. You can then type the appropriate field name in the Enter New Field Name box.

For Comma or Tab Delimited input files, the Setup Wizard uses a default width of 40 characters, which it increases if it thinks necessary. However, this value can be changed if the actual maximum width of that field is greater or significantly less than 40. In EXAMPLE1.TXT, you can change the width of TELEPHONE to 20 or 25 characters, or leave it at 40 characters if saving disk space is not an issue. The Setup Wizard uses a sample from the input file, which is the first 1,000 records by default. To see the longest string of data in a field, scroll to the bottom of the record list to view a re
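The two checks described in this excerpt, field widths taken from a sample and uniqueness of the reference field, can be sketched in Python. This is a minimal illustration under stated assumptions, not the Setup Wizard's code: the helper names and the list-of-dicts input are hypothetical, and only the default width of 40, the 1,000-record sample and the Unique_Ref label come from the guide.

```python
from itertools import islice

DEFAULT_WIDTH = 40   # default width for delimited input, per the guide
SAMPLE_SIZE = 1000   # the guide's default sample: the first 1,000 records


def suggest_field_widths(rows, default_width=DEFAULT_WIDTH, sample_size=SAMPLE_SIZE):
    """Return a width per field: the default, widened when a sampled value is longer.

    rows is an iterable of dicts mapping field name to string value.
    Only the first sample_size rows are inspected, so a longer value
    further down the file would not be noticed.
    """
    widths = {}
    for row in islice(rows, sample_size):
        for field, value in row.items():
            widths[field] = max(widths.get(field, default_width), len(value))
    return widths


def duplicate_refs(rows, ref_field="Unique_Ref"):
    """Return reference values that occur more than once, in first-seen order."""
    seen, dupes = set(), []
    for row in rows:
        ref = row[ref_field]
        if ref in seen and ref not in dupes:
            dupes.append(ref)
        seen.add(ref)
    return dupes
```

An empty list from `duplicate_refs` corresponds to the case where the reference field is genuinely unique; any returned value is a reference that would need fixing before import.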
30. s [Screenshot: Processing Options, import destination WORKFILE.DBF]

Depending on what you want to do with this file, you could select to generate salutations and/or case the data. You should select the matching level required, i.e. whether you wish to dedupe to one record per Contact, Business or Address. For this exercise we are going to purge the records from EXAMPLE1.TXT that also exist in EXAMPLE2.TXT, so EXAMPLE2.TXT is a suppression or stop file. Tick the Create salutations option and the Correctly case name & address option, and set matchIT to a matching level of Contact. To make it clear which file is which, name this matchIT database WORKFILE.DBF by selecting the file selector button. Then select Continue.

When matchIT has finished importing the data, it stops at a Data Summary Report for EXAMPLE1.TXT. Once you close this report, you will be presented with an Internal Matching Summary report. These reports should look familiar, as they are the same reports you received in Exercise 1. The Matching Results for your work file will be the next window displayed (see below). This is also the same window that was displayed in Exercise 1. From here you can select to View Matches, Verify Matches or Flag Matches. For this exercise we are going to elect to Flag Matches for records scoring equal to or greater than 80.

[Screenshot: matchIT Suppression Wizard, Work File Matching Results] NAM
31. s section will provide the tools necessary to QA check some or all of addressIT's results.

[Screenshot: Quality Assurance Wizard, File Information for EXAMPLE1.DBF (100 records, 2 flagged), with Import, Output and addressIT sections offering Data Summary, Address Processing Summary, View/Edit by Category, Address Processing Detail, View Non Alphanumeric Report and View Field Widths Report]

Examine the Data Summary to make sure that the volumes of data and the nature of the data shown on the summary conform with your expectations. The Address Processing Summary will allow you to preview, print, file or PDF the Address Enhancement Summary that matchIT has generated. When you use the Address Processing Detail option to view records, matchIT lets you view either address successes or failures, based on criteria selected by you. The Address Processing Detail option can be one of the most powerful quality assurance tools for addressIT results.

Utilizing addressIT for an Output File

[Screenshot: Produce Output window, Options for File Output: Output File Layout (New, Edit, Use all Fields), layout name MAILING_OUTPUT.OPL, destination file name january mailer.txt, output format COMMA, with the sections Main Options, Addressing and Select Records to Output]
32. the Open button or double-click the file. Then select Continue.

Single File Wizard Operation

First, the Setup Wizard will determine the type of data in each field. Here there are two choices. One is the default option: the wizard will automatically attempt to determine the data type (see Automatic Import Wizard). Choose the other option to manually specify what the various data items represent. This is usually appropriate if a data file is in a Fixed Width format, is unusual in its layout or content, or if the Setup Wizard has already failed to determine the data type. See the Manual Import Wizard section in the Going Further with matchIT guide.

Automatic Import Wizard

The Field Layout window is displayed immediately following the selection of an import file. You should always check that the Setup Wizard has correctly identified the contents of each name or address field, using the scroll bars to view more fields and records. The Continue button is not enabled initially, as the field names must be reviewed first.

[Screenshot: matchIT Setup Wizard, File Information: file to import EXAMPLE1.TXT, file format COMMA, field count 8, with sample records shown under columns including COMPANY and ADDRESS1, and a callout reading "The contents of this field will be ignored by matchIT"]
33. to help separate out true matches from false. If there is a gray area for the duplicates that have been found in a given file, with the default Matching Weights those duplicates will normally be in a score band of 80 to 85. Jack Whitson and Mr J Watson, both at the same address and Zip, would be shown as a match scoring 80 when they could be different people. However, this gives you the chance to go for marketing overkill and flag one of these records rather than risk sending someone two communications, one of which is wrongly addressed.

If any pair shown is not a true match, select the False Match button to unflag the match. Use these buttons to flag (i.e. exclude from output) either the left or right matching record. The flagged record is then grayed out and the icon changes from a cross to a tick, which allows you to change your mind. Use the forward and backward arrow buttons to scroll through the pairs of duplicates. There is also a Next Score button to help you decide on the threshold score from which to flag duplicates globally.

If you right-click on either record, you will see additional options for copying or combining data from the two records. Select Done when you have looked at the duplicates from each of the score bands; you can mark the place (i.e. the last pair viewed) so that you can return to a given pair later if you wish. Now select View Matches from the Matchin
34. tter how difficult the data may be to process, but for this you may need matchIT training plus time to experiment. The matching score is derived when a match key has identified a pair of records that it thinks might be duplicates. It looks at all of the relevant fields in the two records and accumulates a score based on how close the match is between the records. You can modify and set your own matching weights, but when using the defaults, the higher the matching score, the more similar the two records are. You can print or close this report in the same way as the Data Summary.

Viewing Matches

The next screen displayed is the Matching Results dialog.

[Screenshot: Matching Results dialog listing match keys with the number of duplicates found and the time taken for each, for example "Phonetic Last Name Key, First 5 characters of Zip", "Phonetic Last Name Key & Initial, Phonetic Key of Street" and "Phonetic Address Key (Town/City & Street), Building or apartment number". A callout notes that this shows the number of potential duplicates. Buttons: View Matches, Verify Matches, Flag Matches.]

To look at the duplicates matchIT has found, choose Verify Matches. The first pair of duplicates is displayed, shown side by side.

[Screenshot: Matching Pair, Score 80, with the First Record and Second Record side by side, fields including ADDRESSEE, ADDRESS2 (Troy) and ADDRESS3 (Michigan), and a next score button]
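The idea of accumulating a score from field-by-field closeness can be illustrated with a small Python sketch. It is not matchIT's scoring algorithm: the weights, the field names and the use of `difflib.SequenceMatcher` as the closeness measure are all assumptions chosen for illustration, and the scores it produces will not reproduce matchIT's.

```python
from difflib import SequenceMatcher

# Hypothetical weights; matchIT's real matching weights are configurable
# and are not listed in this guide.
WEIGHTS = {"name": 40, "address": 40, "zip": 20}


def match_score(first, second, weights=WEIGHTS):
    """Accumulate a similarity score over the weighted fields of two records.

    Each field contributes its weight multiplied by a 0..1 closeness ratio.
    Fields that are blank in either record contribute nothing.
    """
    total = 0.0
    for field, weight in weights.items():
        left = first.get(field, "").strip().upper()
        right = second.get(field, "").strip().upper()
        if not left or not right:
            continue
        total += weight * SequenceMatcher(None, left, right).ratio()
    return round(total)


a = {"name": "Jack Whitson", "address": "12 Main St", "zip": "48084"}
b = {"name": "Mr J Watson", "address": "12 Main St", "zip": "48084"}
print(match_score(a, a))  # identical records reach the maximum, 100
print(match_score(a, b))  # same address and Zip, similar name: lower than 100
```

With weights like these, two records that share an address and Zip but differ in name land below the maximum, which is the kind of borderline pair the Verify Matches screen asks you to review. The sample names echo the guide's own example; the addresses are invented.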
35. y Overlap; Remove Suppressions; Generating a Clean List; Quality Assurance Wizard; Address & Zip code Validation; Verifying the addressIT Results; Utilizing addressIT for an Output File; Further Information

Getting Started

This Guide

matchIT consists of several optional modules which are totally integrated into one package. Depending on the modules that you have purchased or are evaluating, options are enabled or disabled (i.e. grayed out). The menu structure also depends on which modules are activated. This Getting Started Guide focuses mainly on the use of matchIT for dedupe, merge/purge, salutations, casing and address enhancement. If any of these functions are not relevant to you, please ignore any instructions relating to the corresponding options. The screenshots in the guide are taken using the US sample data; if you are using the Rest of World regional installation, you will see the sample data for that installation. The literals on the screen sometimes use different terminology, e.g. Surname for Lastname.

Installation

To install matchIT, if installation does not start automatically, please run the SETUP program on the CD and follow the instructions given.

Note: matchIT needs at least 256MB of RAM in order to run reliably, and 100MB of free hard disk space for eve