Project Report
Contents
1. % create the window
p = 1;
n = 100;
% find how large half the size of the window is
d = ceil((n - p)/2);
% find the size of pit
L2 = floor(L/d);
% perform the autocorrelation of the windows and create the pitch file
for i = 1:1:L2
    if n > L
        n = L;
    end
    b = xcorr(xq(p:n));
    % find the max of the autocorrelated window and store the value in pit
    v = max(b);
    pit(i,1) = v(1);
    % update window position
    p = p + 50;
    n = n + 50;
end

recwav.m
% recwav.m
% Thomas Jonell
% records a wavfile
function recwav(name)
extType = '.wav';
Fs = 11025;
N = 16;
CH = 1;
filename = [name extType];
y = wavrecord(5*Fs, Fs, 'double');
wavwrite(y, Fs, N, filename);
return

Appendix B: Sample User Interfaces

Example 1: Successful Login Attempt
Welcome to the Hudat Security System
Please begin by entering your user name: bbash
When you are ready to begin, press any key and recite the passphrase slowly into the microphone.
PROCESSING...
Access granted. Welcome!

Example 2: Three Failed Login Attempts
Welcome
2. [ap1,m1,i1] = formant('a4.wav');
[ap2,m2,i2] = formant('b4.wav');
pass4 = newtest(i1,i2);

% Determining score
% Adjustment for poor performance -> Variable threshold
% pass1
if pass1 < 0 | pass1 > threshold
    pass1 = -1;
elseif pass1 <= 25
    pass1 = 2;
elseif pass1 > 25 & pass1 <= 250
    pass1 = 1;
else
    pass1 = 0;
end
% pass2
if pass2 < 0 | pass2 > threshold
    pass2 = -1;
elseif pass2 <= 25
    pass2 = 2;
elseif pass2 > 25 & pass2 <= 250
    pass2 = 1;
else
    pass2 = 0;
end
% pass3
if pass3 < 0 | pass3 > threshold
    pass3 = -1;
elseif pass3 <= 25
    pass3 = 2;
elseif pass3 > 25 & pass3 <= 250
    pass3 = 1;
else
    pass3 = 0;
end
% pass4
if pass4 < 0 | pass4 > threshold
    pass4 = -1;
elseif pass4 <= 25
    pass4 = 2;
elseif pass4 > 25 & pass4 <= 250
    pass4 = 1;
else
    pass4 = 0;
end

% Determine access/no access
pass = pass1 + pass2 + pass3 + pass4;
if pass < 0 & count < 2
    disp('This is not a valid match.')
    disp(' ')
    count = count + 1;
elseif pass < 0 & count >= 2
    disp('This is not a valid match. Attempt limit reached. Press any key to reset.')
    disp(' ')
    pause
    % reset
    clc
    clear all
    break
else
    disp('Access granted. Welcome!')
    % unlock door
    lock
    % reset
    clc
    clear all
    break
end
end
end
end
end

newsplit.m
% n
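The grading rule in the logon.m listing above can be condensed into a few lines. The following is a minimal sketch, not the authors' code: the helper name grade and the four sample scores are made up for illustration, and only the threshold of 500 and the grade bands come from the report.

```matlab
% Sketch of the weighted-grade rule; "grade" is a hypothetical helper.
threshold = 500;                       % value set in logon.m

% Map newtest results to grades: 2, 1, 0, or -1 for a failure.
grade = @(d) -1*(d < 0 | d > threshold) ...
             + 2*(d >= 0 & d <= 25) ...
             + 1*(d > 25 & d <= 250);  % 251..threshold contributes 0

scores  = [12 180 300 40];             % made-up results for the four words
grades  = grade(scores);               % one grade per word
total   = sum(grades);                 % grades are totaled
granted = total >= 0;                  % zero or greater is a pass
```

Writing the bands as one vectorized expression keeps the four comparisons identical, which avoids the copy-and-paste slips that four separate if blocks invite.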
3. % wav file cleanup
if exist('user.wav'), delete('user.wav'); end
if exist('a1.wav'), delete('a1.wav'); end
if exist('a2.wav'), delete('a2.wav'); end
if exist('a3.wav'), delete('a3.wav'); end
if exist('a4.wav'), delete('a4.wav'); end
if exist('b1.wav'), delete('b1.wav'); end
if exist('b2.wav'), delete('b2.wav'); end
if exist('b3.wav'), delete('b3.wav'); end
if exist('b4.wav'), delete('b4.wav'); end
disp('When you are ready to begin, press any key and recite')
disp('the passphrase slowly into the microphone.')
disp(' ')
pause
% record new file
recwav('user')
disp('PROCESSING...')
disp(' ')
% create file prefix strings
a = 'a';
b = 'b';
user = 'user.wav';
% split the wav files
newsplit(user, a)
% Make sure it recorded OK
if exist('a1.wav') == 0 | exist('a2.wav') == 0 | exist('a3.wav') == 0 | exist('a4.wav') == 0
    disp('The system could not recognize the passphrase. Please press any key to retry.')
    disp(' ')
    pause
else
    newsplit(file, b)
    % perform formant comparison analysis & test of indexes
    [ap1,m1,i1] = formant('a1.wav');
    [ap2,m2,i2] = formant('b1.wav');
    pass1 = newtest(i1,i2);
    [ap1,m1,i1] = formant('a2.wav');
    [ap2,m2,i2] = formant('b2.wav');
    pass2 = newtest(i1,i2);
    [ap1,m1,i1] = formant('a3.wav');
    [ap2,m2,i2] = formant('b3.wav');
    pass3 = newtest(i1,i2);
4. xq(i)
end
% store past slope value
pslope = slope;
end

formant.m
% formant.m
% Thomas H Jonell
% creates a vector containing the formants in a sound file
function [pxq, max, ind] = formant(file)
% read in wav file
[xq, fs, nb] = wavread(file);
% perform the yule walker spectral power density calculations
pxq = pyulear(xq, 12);
pxq = 20*log10(pxq);
% find the peaks of the formants
[yg, max, ind] = deriv(pxq);

newtest.m
% newtest.m
% Thomas H Jonell
% 2/24/04
% test the given formant indexes and return their difference factor
function passfail = newtest(ind1, ind2)
% get the initial sizes of the indices
L1 = length(ind1);
L2 = length(ind2);
% find the smallest if they are different sizes
if L1 < L2
    L = L1;
else
    L = L2;
end
c = 0;
% remove all data below index 20 for ind1
for i = 1:L
    if ind1(i) < 20
        c = i;
    end
end
% create new index1
i1 = ind1(c+1:L1);
% remove all data below index 20 for ind2
for i = 1:L
    if ind2(i) < 20
        c = i;
    end
end
% create new index2
i2 = ind2(c+1:L2);
% get the lengths of the new indices
L1 = length(i1);
L2 = length(i2);
% if the indices are not the same size immediately fail the test
if L1 ~= L2
    passfail = -1;
    return
end
% since the indices are the same size, test them
z = 0;
for i = 1:L1    % which length doesn't matter because they
5. Speaker verification is also used in fraud prevention, in telephone network security (toll fraud) applications such as the University of Maryland, College Park (toll-free long-distance lines for faculty and staff) and GTE TSI (integration of speaker verification into wireless security packages offered to carriers). In transaction security, there is the Home Shopping Network (automated product ordering over the telephone) and Glenview State Bank (transfer of money between accounts of a bank customer). It is also used in time and attendance monitoring, for example at SOC Credit Union and the Salvation Army. One very important monitoring application is corrections monitoring: speaker verification is used at the New York City Dept. of Probation (tracking of juvenile and adult probationers) and the Dane County Jail in Madison, Wisc. (monitoring of home-incarcerated offenders) [2].

Constraints

Looking at the project as a whole, certain issues needed to be raised to determine its plausibility. Below are the constraints that were considered, along with the project's impact upon them.

Economic: Non-standard parts in the hardware could be a maintenance issue for the consumer, so standard parts will be used whenever possible. Also, software development packages could greatly increase the cost of development through licensing, yet reduce the amount of time necessary for devel
6. + 1;
            num = mat2str(counter);
            % write the file
            wavwrite(z, Fs, B, [pref num postf]);
        end
        % reset flags
        ss = 0;
        fs = 0;
    end
end

newtest.m
% newtest.m
% Thomas H Jonell
% test the given formant indexes and return their difference factor
function passfail = newtest(ind1, ind2)
% get the initial sizes of the indices
L1 = length(ind1);
L2 = length(ind2);
% find the smallest if they are different sizes
if L1 < L2
    L = L1;
else
    L = L2;
end
c = 0;
% remove all data below index 20 for ind1
for i = 1:L
    if ind1(i) < 20
        c = i;
    end
end
% create new index1
i1 = ind1(c+1:L1);
% remove all data below index 20 for ind2
for i = 1:L
    if ind2(i) < 20
        c = i;
    end
end
% create new index2
i2 = ind2(c+1:L2);
% get the lengths of the new indices
L1 = length(i1);
L2 = length(i2);
% if the indices are not the same size immediately fail the test
if L1 ~= L2
    passfail = -1;
    return
end
% since the indices are the same size, test them
z = 0;
for i = 1:L1    % which length doesn't matter because they are the same size
    x = i1(i) - i2(i);
    z = z + abs(x);    % accumulated term illegible in the source; abs(x) assumed
end
passfail = z;

pitch.m
% pitch.m
% Thomas H Jonell
% returns a vector of autocorrelated maxes of a given vector
% used to determine spoken portions of a sound file
function pit = pitch(xq)
% find the length of the sample
L = length
7. As for text-dependent verification with speech recognition, an account number may be spoken. The system would verify the user as well as determine what was said. Text-prompted verification can involve an entered account number, with the system then prompting the user to repeat pre-determined phrases. Finally, there is text-independent verification, which is the hardest form to use. It is also, however, the most unobtrusive. For example, it could involve a call to a bank. A system asks the user what they wish to do. The user then states an instruction (e.g., "I want to transfer $10,000 to my offshore account in the Cayman Islands"). As the system is processing the instruction (speech recognition), it is also using speaker verification to confirm the identity of the user [2].

Examples of speaker verification are everywhere. In the realm of security, speaker verification is utilized in data network applications such as BMC Software (password reset over the telephone using a virtual help desk), the Illinois Dept. of Revenue (off-site access to secure data networks), and INTRUST Bank (internal wire transfers). As for physical site access, it is used by the U.S. Immigration and Naturalization Service (entry to the U.S. and Canada during off hours at the port of entry at Scobey, Mont.), Girl Tech (door access control system and locked box for children), and the City of Baltimore (evening and weekend access to the five main city buildings).
8. How well the Biometric is able to tell individuals apart. This is partially determined by the amount of information gathered as well as the number of possible different data results.
Reliability (1-3): How dependable the Biometric is for recognition purposes.
Error Rate: This is calculated as the crossing point, when graphed, of false positives and false negatives created using this Biometric.
Errors: Typical causes of errors for this Biometric.
False Pos.: How easy it is to create a false positive reading with this Biometric (someone is able to impersonate someone else).
False Neg.: How easy it is to create a false negative reading with this Biometric (someone is able to avoid identification as oneself).
Security Level (1-3): The highest level of security that this Biometric is capable of.
Long-term Stability (1-3): How well this Biometric continues to work without data updates over long periods of time.
User Acceptance (1-3): How willing the public is to use this Biometric.
Intrusiveness: How much the Biometric is considered to invade one's privacy or require interaction by the user.
Ease of Use (1-3): How easy this Biometric is for both the user and the personnel involved.
Low Cost: Whether or not there is a low-cost option for this Biometric to be used.
Hardware: Type and cost of hardware required to use this Biometric.
Standards: Whether or not standards exist for this Biometric.

Voice Biometrics

Voice biometric
9. [Figure 3: Project Gantt Chart (continued). Task names listed: Proposal, Proposal Rough Draft, Proposal Final, Proposal Presentation, Peer Evaluations; Winter Quarter: Implementation, Programming, Coding & Testing, Assembly, Testing, Peer Evaluations, Update; Spring Quarter: Implementation, Testing, Final Assembly, Final Documentation, Poster Development, Written Report, Oral Presentation, Web Page. Start and finish dates run from Mon 9/15/03 to Mon 5/10/04.]

Figure 3: Project Gantt Chart

Conclusion

A speaker verification system in a security application was developed. It utilized voice biometrics to distinguish between authorized users. The system was comprised of MATLAB code, a circui
10. The Programming portion deals with the actual code writing and the ongoing testing. This testing led to the new direction of implementing formants. Assembly of the lock circuit also occurred. The Update portion deals with the revision of the original Project Proposal.

Spring Quarter

Spring Quarter involved the continuation of programming, as well as testing of the code and the final construction of the lock circuit. Final Documentation involves poster development, revising the written report, developing the oral presentation, and creating the web page.

[Figure 3: Project Gantt Chart. Columns: Task Name, Duration, Start, Finish. Year 1 spans 161 days, from Mon 9/15/03 to Mon 5/10/04. Fall Quarter tasks listed: Conceptual, Team Charter, Problem Identification, Project Update, Feasibility, Block Diagram, Research, Gantt Chart, Constraint Analysis.]
11. % are the same size
    x = i1(i) - i2(i);
    z = z + abs(x);    % accumulated term illegible in the source; abs(x) assumed
end
passfail = z;

lock.m
% lock.m
% Thomas Jonell & Brian Bash
% 4/21/04
% sends data to serial port to open lock
a = int32(0);
s = serial('COM1', 'BaudRate', 9600, 'Parity', 'none');
fopen(s);
% 5 sec of open time
for i = 1:1:2500
    fwrite(s, a);
end
fclose(s);
instrfind;
delete(s);
clear s
clear a
clear ans
clear i

logon.m
% HUDAT SECURITY SYSTEM
% HUDAT Members: Brian Bash, Tom Jonell, Dustin Williams
% HUDAT Advisor: Dr. Les Thede
% A Senior Project 2003-04
% clear display
clc
clear all
format compact
close all
while 1 == 1    % loop forever
while 1 == 1
count = 0;
threshold = 500;
% initialize display
disp(' ')
disp('*****************************')
disp(' ')
disp('Welcome to the')
disp('Hudat Security System')
disp(' ')
disp('*****************************')
disp(' ')
% user logon
name = input('Please begin by entering your user name: ', 's');
% get the respective wav file
file = [name '.wav'];
disp(' ')
% See if the person is allowed in the room
if exist(file) < 1
    disp('The user name entered is not valid. Press any key to reset the system.')
    pause
    clc
    clear all
    break
end
% initialize values
pass = 0;
rec = 1;
while rec == 1    % Recording/testing loop
12. to the Hudat Security System
Please begin by entering your user name: swagner
When you are ready to begin, press any key and recite the passphrase slowly into the microphone.
PROCESSING...
This is not a valid match.
When you are ready to begin, press any key and recite the passphrase slowly into the microphone.
PROCESSING...
This is not a valid match.
When you are ready to begin, press any key and recite the passphrase slowly into the microphone.
PROCESSING...
This is not a valid match. Attempt limit reached. Press any key to reset.

Example 3: Invalid User Name
Welcome to the Hudat Security System
Please begin by entering your user name: lthed
The user name entered is not valid. Press any key to reset the system.

Appendix C: MATLAB Plots

[Plots: Original Soundwave; Autocorrelated Soundwave; Spectral Power Density; Formants]
13. used is a 9-pin D-Type RS232C.
Electric Lock: This can be used in AC or DC applications. For our purposes, it is unlocked by 8-16 VAC and draws 1.2 A.

Testing

Upon completion of the lock circuit assembly and the m-files necessary for operation, the HUDAT Security System was tested. The tests were performed by recording master files for four individuals (3 males and 1 female). These files were named in the user name format utilized in logon.m (i.e., bbash, dwilliams, etc.). Upon executing logon, the user was prompted to enter their user name. This entry was used to reference the original master file. A new user file was created for the session, and it was compared to the master file. The numerical results generated by the formant function were used to create the threshold (500) utilized in logon. Many tests were executed like this to determine the pass/fail rates.

The system's shortcomings fall into two categories:
1. False negative: This occurs when the user should be authorized, but the system denies access.
2. False positive: This occurs when the user should not be authorized, but the system grants access.

As was predicted using Table 1, false negatives occurred more often than false positives. In fact, false positives were fairly rare. Ultimately, this is very good, considering it does not compromise the integrity of the system. In the case of our false negatives, they can be attributed to factors s
14. zeroed data and determined whether portions were long enough to be considered voiced, and whether they were separated enough to be considered separate or the same spoken portion. The function worked as intended but was the slowest part of the system, sometimes taking 30 seconds to split a file. The extra time came from the decision process, in which the function decided whether a portion of the file was long enough to be considered voiced and had enough separation from the next voiced section. A change was made in the collection of the voice samples by having the speaker clearly separate their words. With this change, it was clear that a new splitting function was needed, because there was now no reason for the extra time taken in performing the split. The newsplit function replaced the split function, performing the split almost 10 times faster.

peak.m (Created 1/13/04)
The peak function replaced the failed peakfirst function. This function approached peak finding differently. Instead of searching forward from the first peak to find the next, and so forth, once the first peak was found, it and a window around it were set to zero, removing it from being detected as a peak again. The next peak was then searched for, and it, like the first, was zeroed. This was repeated until all the peaks above a certain threshold were found. When the switch from the FFT method to the formant method was made, peak was modified into t
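The zeroing approach described for peak.m can be sketched as follows. This is an illustration under stated assumptions rather than the original peak.m: the data vector, the threshold, and the window half-width halfWin are made-up values, and the threshold is assumed to be positive so that zeroed samples can never be selected again.

```matlab
% Sketch of peak finding by repeated max-and-zero (illustrative values).
y       = [0 1 5 1 0 2 9 2 0 1 3 1 0];   % made-up data
thresh  = 2.5;                           % assumed positive threshold
halfWin = 1;                             % samples zeroed on each side of a peak

work  = y;                               % working copy that gets zeroed
peaks = [];                              % indexes of accepted peaks
while true
    [v, k] = max(work);                  % largest remaining value
    if v <= thresh
        break                            % nothing left above the threshold
    end
    peaks(end+1) = k;                    % record the peak index
    lo = max(1, k - halfWin);
    hi = min(numel(work), k + halfWin);
    work(lo:hi) = 0;                     % remove the peak and its window
end
peaks = sort(peaks);                     % report peaks in index order
```

Because each pass takes the global maximum, peaks are found in order of height rather than position, which avoids the forward-search blind spots attributed to peakfirst.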
15. 0-2000 Hz to eliminate unwanted noise. After filtering, they were sent through a MATLAB Hamming window function followed by an appropriately set up FFT function. After listening to the results of the filter, it was decided that the bandpass produced an unwanted muted sound. Instead, an FIR lowpass filter was used, and the results were acceptable. The file was later modified to filter and perform the FFT on two given sets of voice data. The function was used until the formant method replaced it.

peakfirst.m (Created 1/5/04)
The peakfirst function was the first in a series of functions used to find the peaks in a set of data. It worked by first finding the peak entry in the data set and then progressively moving forward through the data, finding new peaks up to a specified index. This idea was flawed because if the first peak occurred halfway through or near the end of the specified index, peaks preceding the first max would not be found. Another flaw was that if there were two relatively large peaks, any peaks that occurred between them would not be found. Because of these flaws, the peakfirst function was replaced by the peak function.

split.m (Created 1/7/04)
The first splitting function, split, analyzed the voice data to determine where there was a voiced section and an unvoiced section by zeroing any portions of the data that did not go above a certain threshold. The program then made a second pass through the new partially
16. HUDAT Security Systems
Project Report
Speaker Verification System in a Security Application
Brian Bash, Thomas Jonell, Dustin Williams
Faculty Advisor: Dr. Les Thede
Date: 04/30/04

Executive Summary
Problem Identification
Research
Biometric Decision
Table 1: Decision Matrix for Biometric Types
Decision Matrix Terms
Voice Biometrics
Constraints
System Design
Figure 1: Block Diagram of Speaker Verification System
The System Components
Programming Language
MATLAB Files
File Splitting and Formants
File Splitting
Formants
Lock Circuit
Figure 2: Lock Circuit Schematic
Testing
Cost Analysis
Table 2: Estimated Development Costs
Table 3: Estimated Product Costs
Gantt Chart
Fall Quarter
Winter Quarter
Spring Quarter
Figure 3: Project Gantt Chart
Conclusion
References
Appendix A: M-file Code
deriv.m
formant.m
lock.m
logon.m
newsplit.m
newtest.m
p
17. as entering a passcode into a keypad.

Table 1: Decision Matrix for Biometric Types
[Table contents were not recoverable from the extracted text. Legible column headings include Facial Recognition, Hand Geometry, and Speaker Verification. Legible row headings include Reliability, Errors, Long-term Stability, User Acceptance, Intrusiveness, Ease of Use, and Standards.]

This table can be found at http://ct.ncsc.dni.us/biomet%20web/BMCompare.html [1].

Decision Matrix Terms

Verify: Whether or not the Biometric is capable of verification. Verification is the process where an input is compared to specific data previously recorded from the user to see if the person is who they claim to be.
ID: Whether or not the Biometric is capable of identification. Identification is the process where an input is compared to a large data set previously recorded from many people to see which person the user is.
Accuracy (1-4):
18. d bio-data. Issues of privacy and personal liberties may arise. The general scope of this project does not raise any political or ethical concerns.

System Design

The HUDAT Security System incorporates a computer-driven microphone, a directory of reference audio samples, an audio processor, an interface, and a locking mechanism. An individual repeats the pass phrase, "Please let me in," into a microphone. From this, an audio sample is generated. The audio processor then compares the sample to the directory of samples and grants access by unlocking the door lock only if the new sample matches a previous authorized sample. A tolerance is incorporated to allow for slight changes in tonal quality, background noise, and other factors to be determined. A block diagram of the system can be seen in Figure 1.

[Figure 1: Block Diagram of Speaker Verification System. Blocks shown: Computer ID, Interface, Power Supply, Electric Door Lock.]

The System Components

User: A person wishing access into the secure area.
Microphone: A device capable of capturing an audio sample and relaying it to a computer.
System Administrator: A supervisor of the recognition system, responsible for general maintenance of the system as well as maintaining the database. They are also responsible for manual identification if something goes wrong with the system (i.e., false negative due to an illness, etc.).
Computer ID: A device capable of processing
19. es that the user name is allowed access to the secure area.
When ready, a key is pressed to initiate the 6-second recording time.
Upon completing recording, the system splits the new wave file using newsplit.
The system verifies that the recording was successful and the split was performed properly.
Upon split verification, the wave file associated with the user name is also split.
The formant function is used to test the four split portions of each wave file.
A number based upon the similarity of each pair of files compared is returned. A failure is indicated by -1, while two files with similar power spectral densities return a positive integer. 0 indicates an exact match.
10. A weighted grade is assigned to the return of each formant match:
    0-25: assigned a 2
    26-250: assigned a 1
    251-500: assigned a 0
    501+ or -1: assigned a -1
11. The grades from each of the four comparisons are totaled. A value of zero or greater is required for the system to generate a pass response.
12. Upon passing, the lock function is called. Upon failing, the system reattempts the recording process for up to 3 trials. After 3 trials, the system resets.

File Splitting and Formants

File Splitting

In order to properly analyze a spoken file, the voiced sections need to be identified. Due to the dynamic range of the voice when speaking, determining when someone starts/stops speaking and whe
20. ewsplit.m
% Thomas H Jonell
% new splitter program, much much faster than the old split.m
function newsplit(wavfile, extension)
% read in wavfile
[xq, Fs, B] = wavread(wavfile);
% create postfix and prefix strings
postf = '.wav';
pref = extension;
% create the pitch data
x = pitch(xq);
% find lengths of both files
Lx = length(x);
Lxq = length(xq);
% find the ratio of the 2 data files for translation
r = ceil(Lxq/Lx);
% calculate the mean for use as the threshold
m = mean(x);
% create dummy index values
p = 1;
n = r;
% create flag values for spoken start and finish
ss = 0;
fs = 0;
% create a counter for wav file labeling
counter = 0;
% test pitch file to find spoken sections
for i = 1:Lx
    % start of a spoken portion
    if x(i) > m & ss == 0
        % set start index and start flag
        start = i - 13;
        ss = 1;
    elseif x(i) < m & ss == 1 & fs == 0
        % end of a spoken portion
        % set finish index and finish flag
        finish = i + 13;
        fs = 1;
    end
    % there was a spoken portion, write the file
    if ss == 1 & fs == 1
        % translate indexes
        p = start*r;
        n = finish*r;
        % make sure it's not larger than the original file index
        if n > Lxq
            n = Lxq;
        end
        % test to be sure the spoken segment is large enough
        q = n - p;
        if q > 2000
            % copy values to temporary variable
            z = xq(p:n);
            % update counter and create a string of it
            counter = counter
21. he peakf function, with little success. Because of the way the formant data was arranged, the zeroing method did not work and found false peaks. The peak functions were replaced by an entirely different method in the deriv function.

test.m (Created 1/13/04)
The test function originally worked by taking the two sets of indexes given by the peak function and comparing them. They were compared at first by how well the two index sets, when paired as x-y coordinates, lined up to a slope of 1. If they failed to line up to a slope of 1, the test failed. It was then realized that the two FFTs of the voiced samples were not the same size, so the indexes would technically be wrong. In fact, the indexes themselves should not have been used at all, because an FFT index only stands for a frequency that depends on the FFT length. So a new function called frec was created to translate the FFT indexes to their proper frequencies. These new frequency sets were then sent to the test function to be compared once more as x-y coordinates against a slope of 1. When the decision to use formants was made, this testing method was scrapped in favor of a new one in the newtest function, which compares the difference between the sets.

fullprog.m (Created 1/17/04)
The program fullprog incorporates all of the FFT method functions. The fullprog program always worked as intended, but its individual functions never produced the desired results, so it was scrapped, being replaced by fina
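The index-to-frequency translation discussed for test.m follows from the standard DFT bin spacing of Fs/N. The sketch below is an assumption-labeled illustration, not the original frec function: the helper name idx2freq, the 500 Hz test tone, and the two FFT lengths are chosen for the example, while Fs = 11025 is the rate used by recwav.m.

```matlab
% Sketch: the same tone lands on different FFT indexes for different lengths.
Fs = 11025;                              % sampling rate from recwav.m
f0 = 500;                                % made-up test tone, in Hz
N1 = 2205;                               % first FFT length (5 Hz per bin)
N2 = 11025;                              % second FFT length (1 Hz per bin)

x1 = sin(2*pi*f0*(0:N1-1)/Fs);
x2 = sin(2*pi*f0*(0:N2-1)/Fs);
X1 = abs(fft(x1));
X2 = abs(fft(x2));

% Peak index in the first half of each spectrum (1-based MATLAB index).
[~, k1] = max(X1(1:floor(N1/2)));
[~, k2] = max(X2(1:floor(N2/2)));

% Hypothetical helper: 1-based index k of an N-point FFT to frequency in Hz.
idx2freq = @(k, N) (k - 1) * Fs / N;
f1 = idx2freq(k1, N1);
f2 = idx2freq(k2, N2);
```

The two peak indexes differ even though both signals contain the same tone, which is why comparing raw indexes from FFTs of different sizes is misleading while comparing f1 and f2 is not.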
22. ion. When discussing the field, recognition and verification are often confused. Recognition deals with the understanding of speech content. A system may draw from a large database of known words and will determine what was said. Verification does not care what is said. It makes an identification to verify that the person is who he or she is claiming to be.

Biometrics is quickly becoming a way to ensure that secure items do not become compromised. This is done by comparing newly acquired data against data already in the verification system. Data takes the form of either physiological or behavioral characteristics. Physiological examples include facial recognition, fingerprints, and retinal scans. Behavioral characteristics include, but are not limited to, signature verification and speaker verification. In this project, speaker verification will be implemented.

Biometric Decision

Considering the multiple routes that the field of biometrics could take in a security application, Table 1 explains the benefits and drawbacks of each type. After reviewing the results, it can be seen that speaker verification is a viable option, considering its common and inexpensive hardware, its low cost, and its ease of use. In addition, it is felt that this would be the easiest system to implement. Its drawbacks in the areas of error and reliability can be negated somewhat by utilizing additional, more traditional verification procedures such
23. itch.m
recwav.m
Appendix B: Sample User Interfaces
Example 1: Successful Login Attempt
Example 2: Three Failed Login Attempts
Example 3: Invalid User Name
Appendix C: MATLAB Plots

Executive Summary

A speaker verification system in a security application is developed. It involves the use of voice biometrics to distinguish between authorized users. The system deactivates an electric lock upon a successful match. The issue of cost is addressed.

Problem Identification

Today's society is in the middle of the Information Age. Over the past twenty years, technology has tried to keep up with overwhelming demand. In the process, recent technology has spread into every facet of our lives. Since September 11, 2001, society has looked considerably more to technology for security, something which was lacking before and is now fundamental. A customer has a desire to control access to a secure room by means other than a traditional keycard or passcode. The concern is that these security measures can be compromised or stolen. After studying different approaches, the answer was determined to be biometrics.

Research

Biometrics are automated methods of identifying speech (recognition). They can also be used as a means of verifying the identification of the person (verificat
24. l

formant.m (Created 2/11/04)
The formant function takes in a voiced file and applies the Yule-Walker spectral power density MATLAB function pyulear to it. The data is then aligned properly and sent to the deriv function to find the peaks (formants), whose indexes and values are returned along with the power density curve. The formant function has worked as intended since it was created.

newsplit.m (Created 2/16/04)
The newsplit function worked by utilizing the autocorrelation function xcorr in MATLAB. The newsplit function used xcorr via the pitch function. The pitch function applied the xcorr function in small overlapping windows on the original voice data, finding the maximum of each autocorrelation and placing it in a new data set. This new data set was passed to the newsplit function. The new data set represented an essentially amplified version of the original voice data, making it very easy to tell a voiced section from an unvoiced one. The new data set index was mapped to the original data set, and the file was split apart by use of a magnitude threshold. Initial problems in mapping the data caused the newsplit function to cut off the beginning and ending of a voiced section. This was solved by increasing the size of the voiced window after identification of the voiced section. Although this eliminated the cutoff problem, a new problem emerged involving the creation of extraneous voiced sections. This problem
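The windowed-autocorrelation idea behind pitch and newsplit can be sketched without the Signal Processing Toolbox. For a real-valued window, the maximum of its autocorrelation is the zero-lag value, which equals the sum of squares, so the sketch computes that directly instead of calling xcorr. The test signal is synthetic, and the variable names are illustrative; the window length of 100 and step of 50 follow the pitch.m listing, and the sketch maps window positions back to samples directly rather than through the ratio r used in newsplit.m.

```matlab
% Sketch: locate a voiced burst using per-window autocorrelation maxima.
Fs = 11025;
t  = (0:3999)'/Fs;
% Synthetic signal: silence, a 200 Hz burst (samples 2001..6000), silence.
x  = [zeros(2000,1); sin(2*pi*200*t); zeros(2000,1)];

win  = 100;                              % window length, as in pitch.m
hop  = 50;                               % window step, as in pitch.m
nWin = floor((length(x) - win)/hop) + 1;
pit  = zeros(nWin, 1);                   % reference vector
for i = 1:nWin
    p = (i - 1)*hop + 1;
    w = x(p:p + win - 1);
    pit(i) = sum(w.^2);                  % zero-lag autocorrelation = max of xcorr(w)
end

m      = mean(pit);                      % mean of the reference vector as threshold
voiced = pit > m;
first  = find(voiced, 1, 'first');
last   = find(voiced, 1, 'last');

% Map window positions back to sample indexes in the original signal.
startIdx = (first - 1)*hop + 1;
stopIdx  = (last - 1)*hop + win;
```

With real recordings the silent stretches are noisy rather than zero, so the recovered boundaries are approximate, which is consistent with the padding the report describes adding around each voiced section.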
25. n there is white noise can be difficult. The voiced sections of the voice sample were enhanced by use of autocorrelation. This is done by autocorrelating many small windows along the voice sample and then storing the max value of each window in a reference vector. The mean of the reference vector is then used as a threshold value to determine when a voiced section begins and ends. The indices of the starting and stopping points in the reference vector are taken and then mapped to the original voice sample in order to split it. Appendix C shows a graphical comparison of an original voice sample and its reference vector.

Formants

After a file is split into individual words, each word needs to be analyzed to find out what makes it unique. The uniqueness of each word is measured by formants. A formant is a characteristic resonant region (peak) in the power spectral density (PSD) of a sound. The PSD is essentially the concentration of power at specific frequencies. The PSD is unique to each voice due to the resonant qualities of the vocal cords and the mouth and nasal cavities. The PSD is obtained by utilizing the Yule-Walker AR method to calculate a smooth PSD curve. The formants of the PSD curve are found at the peaks using a numerical derivative. Appendix C shows an example PSD curve with the formants marked.

This portion of the system involves the circuit between the program's voltage signal via RS232 Se
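Finding formants as peaks of a smooth curve with a numerical derivative, as described in the Formants paragraph above, can be sketched as below. The curve here is a made-up sum of two bumps standing in for a PSD; it is not computed with the Yule-Walker method, and the variable names are illustrative rather than taken from deriv.m.

```matlab
% Sketch: peaks are where the numerical slope changes from positive to
% non-positive. The curve is a synthetic stand-in for a smooth PSD.
f   = (0:128)';                                          % bin numbers
psd = exp(-((f - 30)/6).^2) + 0.7*exp(-((f - 80)/8).^2); % made-up curve

slope = diff(psd);                       % numerical derivative
% slope(k) > 0 and slope(k+1) <= 0 means element k+1 is a local maximum.
ind  = find(slope(1:end-1) > 0 & slope(2:end) <= 0) + 1;
vals = psd(ind);                         % curve values at the peaks

% As in newtest.m, indexes below 20 can be discarded before comparison.
ind20 = ind(ind >= 20);
```

A sign-change test of this kind works well on a smooth AR-model curve; on a raw FFT magnitude it would flag many small ripples, which matches the report's account of why the earlier FFT-based peak finders struggled.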
26. oping a prototype. The number of these packages used in the project will be reduced as much as reasonably possible. This will be done through research and decision matrices. Another issue to consider is the existence of similar systems. An attempt will be made to develop an original solution at a competitive price.

Environmental: It was determined that no environmental impact can be foreseen. Any issues that arise will be dealt with accordingly.

Manufacturability: Again, the issue of standard hardware comes up. The use of standard hardware will allow for ease of production and construction.

Sustainability: It is felt that proper documentation, sufficiently commented code, and a user's manual will need to be provided to allow for any necessary changes to be made in the future.

Health & Safety: Safety would be compromised if the system did not work properly, so all feasible scenarios will be tested to debug the system and ensure operation. A closed beta test is being considered to accomplish this. Also, a manual override (i.e., a key) will be implemented to circumvent any system failures that would jeopardize human life.

Social: Since this system will not be used by general society, no great social impact can be foreseen. Any issues that arise will be dealt with accordingly.

Political/Ethical: The collecting of data to identify individuals may pose a concern in the long run. Future applications of the technology could lead to abuse of collecte
27. rial and the electric door lock. Figure 2 shows the schematic of this circuit.

[Figure 2: Lock Circuit Schematic. Labeled components: 2N2222A npn BJT, electric door strike, illuminated rocker switch, step-down transformer (120VAC to 12.6VAC), 120V AC supply, RS232 serial line (9-pin D-type).]

The interface consists of the following components:
1. Power Cord: 120V AC, 60 Hz.
2. Illuminated Rocker Switch: This acts as a switch for the entire circuit by breaking the positive voltage line from the power cord. Upon activation, it closes this line and illuminates a red light.
3. Panel Mount Fuse Holder: This holds a fuse rated for 120V, 15A.
4. PC Mount Power Transformer: This is fed by the 120V line from the fuse. With it placed in series connection, the transformer supplies 12.6VAC CT at 2.4A.
5. In-Line Fuse Holder: This holds a fuse rated for 120V, 2A.
6. 1N4001 Diode: This takes a branch of the positive voltage and uses it to power the relay.
7. 10uF Capacitor: This is tied to ground to clean up the BJT's collector voltage.
8. 2N2222 BJT: An npn transistor used as a switch, which is triggered by the positive serial line. Its emitter current powers the relay.
9. Low Signal Relay: This acts as a SPST switch for the transformer's output voltage.
10. Serial Port (RS232): This provides the signal to trigger the BJT. The positive line is pin 3 (SEND) and the negative line is pin 5 (GND). The connecting cable
28. s is the use of a person's voice as an identifying characteristic of the person. One area where people often become confused is the distinction between speech recognition and speaker verification. Put simply, speech recognition is a speech processing technology that recognizes what a person is saying [2]. This is often applied in computer software intended for use by the handicapped. You can probably recall programs that allow you to speak to your computer, which then acts as a stenographer by taking down what you say in a word processor.

Speaker verification, however, is the use of a person's voice to identify him or her, usually in a secure setting. Unlike speech recognition, it does not know what the speaker is saying. Instead, it utilizes a voiceprint database, extracts the reference voiceprint after an identity claim is made, gets a sample of speech from the user, converts this to a voiceprint, compares the two voiceprints by means of digital signal processing (DSP), utilizes some threshold of error to account for slight variations (possibly due to background noise, speaker illness, etc.), and accepts the claim or not [2].

There are four main types of speaker verification in use today: text dependent, text dependent with speech recognition, text prompted, and text independent. Text dependent involves something like an account number being initially typed into the system. Then a prompt occurs and a password is uttered
29. t Figure 3 was created using the program Microsoft Project and is a visual layout of the work done throughout the project. With the work being performed on an academic calendar, there are three quarters shown: Fall, Winter, and Spring.

Fall Quarter

Fall Quarter is divided into three main sections: Conceptual, Feasibility, and Proposal. The Conceptual portion deals with the initial stages of the project. This includes the Team Charter, Problem Identification, and Project Update. It involves any activity that was meant to present the groundwork of the project. The Feasibility portion deals with discussion of the project. This includes the Block Diagram, Research, Gantt Chart, and Constraint Analysis. It involves any activity that was meant to explain the project and delve more into the subject matter surrounding it. The Proposal portion simply deals with the final phase of the Fall Quarter. This includes the Rough Draft, Final Draft, and Presentation of the project's proposal. One area that falls outside of these main sections is the Peer Evaluations. These were conducted to help the team develop and facilitated discussion among the members. The Peer Evaluations will also be repeated numerous times throughout the project.

Winter Quarter

Winter Quarter dealt with a great deal of programming. In fact, programming occurred for the entire duration of the quarter. Research also continued and led to many different directions in the programming
30. t to activate a lock and the electric door lock itself. Numerous tests were conducted to determine its effectiveness. The findings of the tests revealed that the results were within the expected performance parameters. Also, a cost analysis was performed to compare it to a comparable security system already in use.

References

[1] "Biometrics Comparison Chart." Court Technology Laboratory. Retrieved 07 Nov 2003. <http ct ncsc dni us biomet 20web BMCompare>
[2] Markowitz, Judith A. "Voice Biometrics." Communications of the ACM, Vol. 43, No. 9, September 2000, pp. 66-73.
[3] Ellis, E. Darren. "Design of a Speaker Recognition Code using MATLAB." 16 199 Assignment 1. 27 January 2004. http www andrew cmu edu asurie 199 1 htm

Appendix A: M-file Code

deriv.m

    % deriv.m
    % Thomas H. Jonell
    % create the derivative of a function
    % returns the value of the max and its index along with the
    % derivative of the function
    % This function generally replaces the previously used peak finder functions
    function [yq, max, ind] = deriv(xq)
    % set globals
    slope = 0;
    pslope = 0;
    c = 0;   % counter for max positioning
    for i = 1:1:length(xq)-1
        % calculate slope and store it in output variable
        slope = (xq(i+1) - xq(i)) / ((i+1) - i);
        yq(i) = slope;
        % if there is a maximum detected record its position and value
        if pslope > 0 & slope < 0
            c = c + 1;
            ind(c) = i;
            max(c)
31. the captured samples and interfacing with the database for comparison. Upon a successful match, it will send a signal to the interface.

Interface Power Supply: A device that uses the computer signal to trigger a relay. This relay closes the circuit required to power the electric door lock.

Electric Door Lock: An electric door lock whose locking mechanism is normally locked and is unlocked upon receiving an electric signal of a specified voltage.

Programming Language

For the project, a programming language was needed to construct the code that would operate the system. Among the available languages, it was decided that MATLAB would be used to initially construct and debug the system. This was decided because MATLAB is a very diverse program written in C code. It provides an intuitive interface language and a number of math and graphics functions. It also has many special functions that would be useful in development (e.g., fft for performing a Discrete Fourier Transform of a given signal), plus a wide array of other Digital Signal Processing (DSP) applications. It is the team's belief that the code written in MATLAB, along with establishing a few functions, could then be put into C code to develop an executable.

MATLAB Files

myfft.m (Created 11/04/03)
The file was originally created to show the Fast Fourier Transform (FFT) of voice data that was sent through an FIR bandpass filter based around the range of the human voice 50
32. uch as the room's acoustics and outside disturbances in most instances. Overall, the system performed fairly well.

Cost Analysis

The current lock system in the Biggs Engineering Building of Ohio Northern University costs approximately $1000 per electronic door lock. This was used as a basis for the cost of the HUDAT Security System. Table 2 shows the development costs of the software. All of these would impact the HUDAT Company. If the only contract were to replace 100 locks of the current ONU system, the total cost for development and production (100 units) was determined to be $45,000 ($35,000 development + $10,000 production). Table 3 shows the manufacturer's suggested retail prices (MSRP) based upon no profit and 33% profit. As seen from the profit MSRP, the cost is still competitive and favorable when compared to the current $1000 lock system. One issue that was not considered is the number of hours required to have someone reprogram each current lock at least once a year. This would also add to the savings provided by the HUDAT Security System.

Table 2: Estimated Development Costs

    Item                                             Cost
    Computer System                                  $800
    MATLAB (reusable)                                $1,900
    MATLAB Signal Processing Toolbox (reusable)      $800
    MATLAB Compiler (reusable)                       $2,700
    Lock & Components                                $100
    Development ($20/hr, 3 people)                   $28,800
    TOTAL                                            $35,100

Table 3: Estimated Product Costs

    Price                                            Per unit
    Break-Even MSRP                                  $450
    Profit MSRP (33% profit)                         $600

Gantt Chart

The Gantt Char
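The arithmetic behind the Cost Analysis figures can be checked with a short sketch, written in Python rather than the report's MATLAB. The function name is hypothetical, and treating the 33% profit as a markup on the break-even price is an assumption; the report only states the rounded results.

```python
# Line items from Table 2 (Estimated Development Costs), in dollars.
development_items = {
    "Computer System": 800,
    "MATLAB": 1900,
    "MATLAB Signal Processing Toolbox": 800,
    "MATLAB Compiler": 2700,
    "Lock & Components": 100,
    "Development labor": 28800,
}
development_total = sum(development_items.values())  # matches the table's TOTAL


def msrp_per_unit(development, production, units, profit_rate=0.0):
    """Per-unit price that recovers all costs, plus an optional markup.

    profit_rate is assumed to be a markup on the break-even price.
    """
    break_even = (development + production) / units
    return break_even * (1 + profit_rate)


# The report rounds development to $35,000 and quotes $10,000 production
# for a 100-unit contract.
break_even_price = msrp_per_unit(35000, 10000, 100)
profit_price = msrp_per_unit(35000, 10000, 100, profit_rate=0.33)
```

Under these assumptions the break-even price agrees with the $450 in Table 3, and the marked-up price comes out just under the quoted $600, which suggests the report rounded up.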
33. was solved by increasing the length threshold of a voiced section. With the above solutions, the newsplit function works as intended.

deriv.m (Created 2/17/04)
The deriv function was created out of the need for a peak finder that would not find false peaks. This function works by finding the roots of a numerical derivative. A peak always occurs in a function where its derivative transitions from positive to negative. The function incorporates this idea and finds peaks without problems.

newtest.m (Created 2/24/04)
The function newtest compares two voice samples by squaring the differences between their formant peak indices and then summing those squares. The smaller the sum, the more similar the two voices are. There have been no problems with this function.

final.m (Created 2/25/04)
This program incorporates all the formant method functions into a single program. The program currently works as intended.

lock.m (Created 3/15/04)
This program unlocks the door strike. It communicates with the serial port of the computer on which it is running. The serial port is sent a signal for approximately 5 seconds, during which time the door lock buzzes due to AC power being supplied.

logon.m (Created 3/15/04)
This program replaces final. It provides the user with a text-based interface to the entire system. Its structure works in the following manner:
1. Welcomes the user to the HUDAT Security System
2. Prompts the user to enter their user name
3. Verifi
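The ideas behind deriv and newtest in this excerpt can be sketched together. The following is a minimal illustration in Python rather than the report's MATLAB, not the team's code: the function names are hypothetical, and pairing peaks position by position (ignoring any extra peaks in the longer list) is an assumption the report does not confirm.

```python
def find_peaks(values):
    """Return indices where the forward difference flips from + to -."""
    peaks = []
    prev_slope = 0.0
    for i in range(len(values) - 1):
        slope = values[i + 1] - values[i]  # numerical derivative, unit spacing
        if prev_slope > 0 and slope < 0:
            peaks.append(i)  # values[i] is a local maximum
        prev_slope = slope
    return peaks


def formant_distance(peaks_a, peaks_b):
    """Sum of squared differences between paired peak indices.

    Assumption: peaks are paired in order and unmatched extras are ignored.
    """
    return sum((a - b) ** 2 for a, b in zip(peaks_a, peaks_b))


# Two toy curves whose peaks sit at slightly different positions.
curve_a = [0, 1, 3, 2, 1, 4, 6, 5]
curve_b = [0, 1, 2, 3, 1, 2, 4, 6, 9, 7]
score = formant_distance(find_peaks(curve_a), find_peaks(curve_b))
```

A smaller score means the peak positions line up more closely. One limitation of this sign-change test is that a flat-topped peak, where the slope passes through exactly zero, is not detected.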