
Minimum Performance Specification for the Enhanced

1. Table 2.3.2.3.1-2 SO 70 Listening Experiment 1 Test Conditions
Table 2.3.2.3.2-1 SO 70 Listening Experiment 2 Test Parameters
Table 2.3.2.3.2-2 SO 70 Listening Experiment 2 Test Conditions
Table 2.3.2.3.3-1 SO 70 Listening Experiment 3 Test Parameters
Table 2.3.2.3.3-2 SO 70 Listening Experiment 3 Test Conditions
Table 2.3.2.3.4-1 SO 70 Listening Experiment 4 Test Parameters
Table 2.3.2.3.4-2 SO 70 Listening Experiment 4 Test Conditions
Table 2.3.2.3.5-1 SO 70 Listening Experiment 5 Test Parameters
Table 2.3.2.3.5-2 SO 70 Listening Experiment 5 Test Conditions
Table 2.3.2.3.6-1 SO 70 Listening Experiment 6 Test Parameters
Table 2.3.2.3.6-2 SO 70 Listening Experiment 6 Test Conditions
2. 3.2.3 Master Codec for SO 68
3.2.4 Fixed-Point Bit-Exact Codec for SO 68
3.3 Specific Standard Test Conditions for SO 70
3.3.1 Audio Path and Calibration for SO 70
3.3.2 Software Test Tools for SO 70
C.S0018-D v1.0
3.3.3 Master Codec for SO 70
3.3.4 Fixed-Point Bit-Exact Codec for SO 70
3.4 Specific Standard Test Conditions for SO 73
3.4.1 Audio Path and Calibration for SO 73
3.4.2 Software Test Tools for SO 73
3.4.3 Master Codec for SO 73
3.4.4 Fixed-Point Bit-Exact Codec for SO 73
4 CONTENTS OF SOFTWARE DISTRIBUTION
5 DUNNETT'S TEST
5.1 Stage 1: Analysis of Variance
5.2 Stage 2: Dunnett's Multiple Means Test, Test CC's vs. the Reference CC
6 PROCESSING BLOCKS FOR SO 68, SO 70, AND SO 73
6.1 Nominal Level and Noise Processing
6.2 FER Processing
6.3 Low-level and Signaling Processing
3. [Figure residue: low-level and signaling processing, with labels "dim 1% pls bin", "dim 1% bin", "operating point", "pkt-level dim file", and "source-level dim file".]
[Figure: High-level Processing. Input speech file (src.s12) -> Master/Test encoder at a given operating point -> packet file -> Master/Test decoder -> output speech file -> scaldemo (10 dB) -> level-adjusted output speech file.]
4. 2.4 Performance Testing for SO 73
2.4.1 Objective Performance Testing for SO 73
2.4.2 Subjective Performance Testing for SO 73
2.4.3 Speech Material for SO 73 Testing
2.4.4 Processing of Speech Material for SO 73 Testing
2.4.5 Randomization
2.4.6 Presentation
2.4.7 Listeners
2.4.8 Listening Test Procedures
2.4.9 Analysis of Results
2.4.10 Expected Results for Reference Conditions
3 Codec Standard Test Conditions
3.1 Specific Standard Test Conditions for SO 3
3.1.1 Audio Path and Calibration for SO 3
3.1.2 Standard Software Test Tools for SO 3
3.1.3 Master Codec for SO 3
3.1.4 Fixed-Point Bit-Exact Codec for SO 3
3.2 Specific Standard Test Conditions for SO 68
3.2.1 Audio Path and Calibration for SO 68
3.2.2 Standard Software Test Tools for SO 68
5. so70/testvec/source/suiteB
so70/testvec/source/suiteC

Files in the so70/testvec directory are provided for the purpose of qualifying a test codec as bit-exact and conform to the file naming convention described in Section 2.2.4. The so70/testvec directory is divided into two subdirectories: so70/testvec/source and so70/testvec/fixed. The so70/testvec/source directory contains input source files as well as packet files injected with frame erasures. The so70/testvec/fixed directory contains files processed with the EVRC-WB fixed-point reference software. The files in these directories are the reference files for bit-exact compliance. A test codec is bit-exact if it can reproduce all of the reference files in the so70/testvec/fixed directory exactly. The outputs of the encoder and decoder of the test codec are to be obtained for the conditions given below in Table 3.3.4.5-2 through Table 3.3.4.5-9. The processing steps for these conditions are illustrated in Section 6.

Table 3.3.4.5-2 SO 70 Encoder Suite A Bit-exact Test Conditions
Input source file | Operating point | Condition | Reference packet file
src.s22 | EVRC-WB operating point 0, 16 kHz sampling | Nominal, 22 dB | evrc_wb_op0.p22
src.s12 | EVRC-WB operating point 0, 16 kHz sampling | High, 12 dB | evrc_wb_op0.p12
src.s32 | EVRC-WB operating point 0, 16 kHz sampling | Low, 32 dB, 1% d&b | evrc_wb_op0_dim_1%.p32
6. Table 3 3 4 5 9 SO 70 Suite D Decoder Bit exact Test Conditions Input Packet File Operating Point Condition Reference output speech files for bit exact compliance evrc wb opoO fer 396 p22 EVRC WB operating point O 8 kHz sampling Nominal 22 dB 3 FER evrc wb opoO fer 396 022 8k evrc wb op0 p12 evrc wb opO0 p32 EVRC WB operating point O 8 kHz samplin EVRC WB operating point O 8 kHz sampling High 12 dB Low 32 dB evrc wb op0 012 8k evrc wb op0 032 8k evrc wb opO dim 196 pls 196 p22 EVRC WB operating point O 8 kHz sampling Nominal 22 dB 196 d amp b 196 pls evrc wb opO dim 196 pls 196 022 8k evrc wb opO pc EVRC WB operating point O 8 kHz sampling 3 32 Nominal 22 dB 15 dB car noise evrc wb opoO oc 8k C S0018 D v1 0 Input Packet File evrc wb opO ps Operating Point EVRC WB operating point O 8 kHz sampling Condition Nominal 22 dB 15 dB street noise Reference output speech files for bit exact compliance evrc wb opoO os 8k evrc wb opoO fer 296 pb EVRC WB operating point O 8 kHz sampling Nominal 22 dB 20 dB babble noise 296 FER evrc wb opoO fer 296 0b 8k evrc wb opoO fer 3 EVRC WB operating point O 8 kHz sampling Generic audio signal fer 396 evrc wb opO fer 3 8 evrc wb op4 fer 396 p22 evrc wb op4 p12 EVRC WB operating point 4
7. 2.4.1.1.2 Average Data Rate Requirement for SO 73
The total average data rate Ravg for each operating point shall not exceed the target average data rate by more than the tolerance level in Table 2.4.1.1.1-1; otherwise the test codec fails the compliance test.

2.4.1.2 Unity Gain Requirement
The specific EVRC-NW test codec shall output speech with unity gain when compared with the input speech. The unity gain measurement (output active speech level / input active speech level) will be performed over the entire input speech database for the clean, nominal-level source conditions for each mode. The measurement should be made using the STL-2000 tool [6] [6a] actlev, and must not show more than 0.5 dB deviation between input and output active speech levels. This procedure is fully described in [9].

2.4.1.3 End-to-end Algorithmic Delay Recommendation
The algorithmic delay for the specific EVRC-NW test codec should be calculated analytically by the codec manufacturer. In considering the algorithmic delay, it can be assumed that all transmission channels have infinite bandwidth and that all processing elements have infinite throughput. Algorithmic delay is defined as the sum of all sequential filter delays and buffering delays in the encode/decode path. The maximum end-to-end algorithmic delay should be no greater than that of the master codec. For the master codecs defined
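The unity gain requirement above can be illustrated with a minimal Python sketch. It assumes the active speech levels (in dB) have already been measured for each file with a tool such as actlev, and that the 0.5 dB limit is applied per file; the text does not say whether the limit is applied per file or to an aggregate. The function name `unity_gain_ok` is hypothetical.

```python
def unity_gain_ok(input_levels_db, output_levels_db, max_dev_db=0.5):
    """Check the unity gain requirement for a set of files.

    input_levels_db / output_levels_db: active speech levels in dB,
    one entry per file, measured externally (assumption: per-file check).
    Returns True if no file deviates by more than max_dev_db.
    """
    if len(input_levels_db) != len(output_levels_db):
        raise ValueError("need one output level per input level")
    return all(
        abs(out_db - in_db) <= max_dev_db
        for in_db, out_db in zip(input_levels_db, output_levels_db)
    )
```

For example, `unity_gain_ok([-22.0], [-21.8])` accepts a 0.2 dB deviation, while a 0.6 dB deviation is rejected.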
8. The randomization of the test samples has been constrained in the following ways for the two experiments:
1. A test sample for each codec-combination, talker, and level, channel condition, or background noise level (Experiment I or II) or MNRU value and talker shall be presented exactly once.
2. Randomization has been done in "blocks", such that one sample of each codec/level, codec/channel condition, or codec/background noise level (again depending on Experiment I or II) or MNRU value will be presented once, with a randomly selected talker, in each block. This ensures that listeners rate each codec/condition being tested equally often in the initial, middle, and final parts of the session and will mitigate the effects of practice and fatigue. A block contains 31 file samples. A session will consist of eight blocks of 31 file samples plus one practice block of 31 at the beginning of each session for each experiment. There are a total of eight sessions per experiment. A particular randomization session shall not be presented to more than eight listeners.
3. Talkers shall be chosen so that the same talker is never presented on two consecutive trials within the same block.
The randomization lists for each of the eight file sets of each experiment are given in so3/subjctv/exp1/data/play.lst and so3/subjctv/exp2/data/play.lst, respectively.

2.1.6 Presentation
Presentation of speech material for the SO 3 code
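The three randomization constraints above can be sketched in Python as follows. This is only an illustration, not the procedure used to produce the supplied play lists. It assumes the number of talkers equals the number of blocks (eight), so that each condition/talker sample appears exactly once per session; the practice block is omitted, and all function names are hypothetical. The ordering step restarts on a dead end, so it does not sample valid orderings uniformly.

```python
import random

def assign_talkers(conditions, talkers, rng):
    """Give each condition a random permutation of talkers, one per block."""
    plan = {}
    for cond in conditions:
        perm = list(talkers)
        rng.shuffle(perm)
        plan[cond] = perm  # plan[cond][b] is the talker used in block b
    return plan

def order_block(trials, rng, max_attempts=1000):
    """Order (condition, talker) trials so no talker occurs twice in a row."""
    for _ in range(max_attempts):
        remaining = list(trials)
        ordered = []
        while remaining:
            prev = ordered[-1][1] if ordered else None
            candidates = [t for t in remaining if t[1] != prev]
            if not candidates:
                break  # dead end: only the previous talker is left, restart
            pick = rng.choice(candidates)
            ordered.append(pick)
            remaining.remove(pick)
        if not remaining:
            return ordered
    raise ValueError("no valid ordering found")

def make_session(conditions, talkers, seed=0):
    """Build one session: len(talkers) blocks, each with every condition once."""
    rng = random.Random(seed)
    plan = assign_talkers(conditions, talkers, rng)
    blocks = []
    for b in range(len(talkers)):
        trials = [(cond, plan[cond][b]) for cond in conditions]
        blocks.append(order_block(trials, rng))
    return blocks
```

With 31 conditions and eight talkers, `make_session` returns eight blocks of 31 trials, covering all 248 condition/talker samples once.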
9. The source speech material shall be processed by the various combinations of encoders and decoders listed in the descriptions of the experiments given in Section 2.3.2. The master codec software described in Section 3.3.3 shall be used in the processing involving the master codec. Generally, the master codec encoder and decoder outputs have been provided in the respective directories, so70/subjctv/exp*/m_pkt and so70/subjctv/exp*/m_m. Execution of the master codec software is needed only for the test encoder/master decoder combination for each experiment/condition. All codec processing shall be done digitally. Noise suppression and post-filter options shall be enabled for both the master and the test codecs. The digital format of the speech files is described in Section 3.3.4.4.

The naming convention of the processed speech is as follows. For the packet files in the so70/subjctv/exp{1,3,5}/m_pkt directory, the *.p12 files are the master packet files for the *.s12 source file. Likewise, the *.p22 and *.p32 files are the respective packet files for the *.s22 and *.s32 source files. The *.pf3 files are the impaired packet files, which will be described in Section 2.3.4.3. Similarly, the directory so70/subjctv/exp{2,4,6}/m_pkt contains the master packet files for the respective experiments. Here, the pc10 pb20
10. gt a w o N w N to A gd ge l ge 6 2 26 55808 54272 63232 16336 55040 50176 C S0018 D v1 0 1 Table 2 2 4 5 2 Cutting Points for the astrip Software Tool for the Experiment Il P 835 Test ar eng Sample samples 2151169 23796 2174965 32524 2207489 23719 27993 Experiment II P 835 ar eng ar eng sampies sentence sanple samples m Ct o a Q 0 m2s 036988 30021 65 58 es m2s f2s f2s m3s m3s N N N w H N o m2s0 m2s02 2259201 23729 32847 98817 27194 126011 26310 N e N N N w o N 12502 f 179276 30645 3512 2335464 25137 s soi 209921 21939 misi3 2465025 23592 10 m3so2 231860 25677 29768 fisi3 2518785 23256 24414 m2s13 2557937 23386 m2si4 2591323 23574 28367 16 f1s04 480352 29857 22514 28721 19 2 04_ 534474 29239 misi 2699268 27901 19 sos 563713 25194 f3s13 2727169 19206 26362 miss 2876673 23122 se misi 2899755 26286 Fisis 2926081 20020 fisis 2946101 27596 Pas m2s15 2973697 25310 0 m2s16 2999007 30498 55 fzsis 3055744 28033 misis 3083777 27501 35731 f3s15 3147009 20918 95 3516 3167927 25418 3 4 Table 2 2 4 5
11. 8 kHz sampling EVRC WB operating point 4 8 kHz sampling Nominal 22 dB FER 396 High 12 dB evrc wb op4 fer 396 022 8k evrc wb op4 012 8k evrc wb op4 p32 EVRC WB operating point 4 8 kHz sampling Low 32 dB evrc wb op4 032 8k evrc wb op7 p22 EVRC WB operating point 7 8 kHz sampling Nominal 22 dB evrc wb op7 022 8k evrc wb op4 dim 196 pls 196 p22 EVRC WB operating point 4 8 kHz sampling Nominal 22 dB 196 d amp b 196 pls evrc wb op4 dim 196 pls 196 022 8k evrc wb op4 pc EVRC WB operating point 4 8 kHz sampling Nominal 22 dB 15 dB car noise evrc wb op4 oc 8k evrc wb op7 pc EVRC WB operating point 7 8 kHz sampling Nominal 22 dB 15 dB car noise evrc wb op7 oc 8k evrc wb op4 ps EVRC WB operating point 4 8 kHz sampling Nominal 22 dB 15 dB street noise evrc wb op4 os 8k evrc wb op4 fer 296 pb EVRC WB operating point 4 8 kHz sampling Nominal 22 dB 15 dB babble noise 3 33 evrc wb op4 fer 296 pb 8k 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 C S0018 D v1 0 3 4 Specific Standard Test Conditions for SO 73 3 4 1 Audio Path and Calibration for SO 73 3 4 1 1 Audio Path The audio path for wideband test conditions Experiments 1 and 2 must meet the following requirements for electro acoustic performance measured between the
12. Table 2.3.2.3.7-1 SO 70 Listening Experiment 7 Test Parameters
Table 2.3.2.3.7-2 SO 70 Listening Experiment 7 Test Conditions
Table 2.3.2.3.8-1 SO 70 Listening Experiment 8 Test Parameters
Table 2.3.2.3.8-2 SO 70 Listening Experiment 8 Test Conditions
Table 2.3.2.3.9-1 Numerical Parameters for the SO 70 Listening Experiments
Table 2.3.4.5-1 Cutting Points for the astrip Software Tool for the SO 70 Experiments 1, 3, and 5
Table 2.3.4.5-2 Cutting Points for the astrip Software for the SO 70 Experiments 2, 4, and 6
Table 2.3.4.5-3 Composition of the Sentence-Triad Samples for the Experiments 2, 4, and 6 (P.835 Test)
Table 2.3.5-1 Example Randomization for the Experiments 1, 3, and 5 (ACR)
Table 2.4.1.1.1-1 Target ADR vs. Capacity Operating Point
Table 2.4.2-1 Test Suites for SO 73 compliance
Table 2.4.2-2 Experiments for SO 73 compliance
Table 2.4.2.3.1-1 SO 73 Listening Experiment 1
13. Nominal 22 dB 10 dB car noise Reference output speech files for bit exact compliance evrc wb opoO oc1i evrc wb fer 396 pc2 EVRC WB Nominal 22 dB evrc wb opoO fer 396 0c2 operating point 0 20 dB car noise 16 kHz sampling fer 396 evrc wb opO ps EVRC WB Nominal 22 dB evrc wb opO0 os operating point O 16 kHz sampling 15 dB street noise evrc wb opoO po evrc wb opoO fer EVRC WB operating point 0 16 kHz sampling EVRC WB operating point O 16 kHz sampling Nominal 22 dB 20 dB babble noise Generic audio signal fer 396 evrc wb opO0 ob evrc wb opoO fer 3 evrc wb op4 fer 3 22 EVRC WB Nominal 22 dB evrc wb op4 fer 396 022 8k operating point 4 FER 3 8 kHz sampling evrc wb op4 p12 EVRC WB High 12 dB evrc wb op4 012 8k operating point 4 8 kHz sampling evrc wb op4 p32 EVRC WB Low 32 dB evrc wb op4 032 8k operating point 4 8 kHz sampling evrc wb op7 p22 EVRC WB Nominal 22 dB evrc wb op7 022 8k operating point 7 8 kHz sampling evrc wb op4 dim 196 pls 196 p22 EVRC WB Nominal 22 dB evrc wb op4 dim 196 pls 196 operating point 4 8 kHz sampling 196 d amp b 196 pls 022 8k evrc wb op4 pc EVRC WB operating point 4 8 kHz sampling Nominal 22 dB 15 dB car noise evrc wb op4 oc 8k evrc wb op7 pc EVRC WB operating point 7 8 kHz sampling Nominal 22 dB 15 dB car noise evrc wb op7 oc 8k ev
14. T p 191501 57758 5 m2soz 288424 56412 2so2 404062 59226 9 masoi 463288 51884 1o m3soz 515172 59593 16 21 04 859135 59385 ras 2 04 982890 58954 1s f2s03 1041824 54821 3s10 3202618 58012 4 fisi2 3527467 51931 66 2 12 3734538 55821 82512 3852015 60100 6s msti 3912115 55432 090859 154265 216555 272219 328495 383955 448692 505885 567054 622106 676546 732516 790729 843742 900324 962012 5021304 89 mzsis 5080387 65752 190 mzsis 5146135 63251 54 m3sis 5385036 60125 96 3515 5508244 61792 B Ci U1 ON OT GT OT Of GT UT OV U1 OV UT UT OT WPOP RA OA OR ay Oo Oyo yw COP RFR A gt oO SPH WIM OM WWI o o to vo oJ o Table 2 3 4 5 3 Composition of the Sentence Triad Samples for the Experiments 2 4 and 6 P 835 Test t3 t4 2 3 55 Randomization Pts 85 807 509 S12 For each of the first six subjective experiments each presentation sample consists of a speech sample processed under a condition of the test For the ACR Experiments 1 3 and 5 the sample consists of a pair of concatenated senten
15. 12dB R ambient background segment 32dB R ambient background segment 22dB R 20 dB SNR babble noise segment 22dB R 15 dB SNR car noise segment 22dB R 15 dB SNR street noise segment 22dB The above files are to be processed with EVRC NW encoder at various capacity operating points defined by the active speech average channel rate shown in Table 2 4 1 1 1 1 Table 2 4 1 1 1 1 Target ADR vs Capacity Operating Point Capacity Operating Point Target Average Channel Data active speech average channel data rate Rata kbps EVRC NW RATE REDUC 000 5 6 1 5Yo EVRC NW RATE REDUC 001 5 92 1 5 EVRC NW RATE REDUC 010 4 82 1 5 EVRC NW RATE REDUC O11 4 57 1 5 EVRC NW RATE REDUC 100 4 259 1 5Yo EVRC NW RATE REDUC 101 4 079 1 596 EVRC NW RATE REDUC 110 3 954 1 5Yo EVRC NW RATE REDUC 111 3 29 1 5 The above table provides the maximum allowable average channel rate including full half quarter and eighth rate for the different operating points These maximum allowable average channel rates were obtained by processing the 7 wide band benchmark files for the 16 kHz case and 6 narrow band benchmark files for the 8kHz case through the master floating point software See Section 3 4 2 1 for details in using the provided software tool that can be used to aid in making this calculation 2 65 20 21 22 23 24 25 26 27 28 29 30 31
16. [Figure 2.2.10.2-1: P.835 Score Profiles for Reference Conditions. Panels show P.835 scores (SIG, BAK, OVRL) against MNRU level and against car-noise SNR, 10 dB to 40 dB.]

2.3 Performance Testing for SO 70
2.3.1 Objective Performance Testing for SO 70
The objective testing portion of this specification consists of an average data rate test and compliance to the End-to-End Algorithmic Delay and Unity Gain requirements.

2.3.1.1 Average Data Rate Test
An implementation may support SO 70 for 16 kHz sample rates (for example, a mobile station that supports wideband electro-acoustics). The average data rate for the test codec shall be measured using seven source speech files that are contained in the so70/subjctv/exp1/source and so70/subjctv/exp2/source directories. Each file exhibits a different condition: power levels (12 dB, 22 dB, and 32 dB) and background noise conditions (20 dB SNR babble noise, 10 dB SNR car noise, 20 dB SNR car noise, and 15 dB SNR street noise). The input source files used in the average data rate test have an approximate voice activity factor of 0.6 and are the same input files used in the subjective portion of the experiment. An imple
17. D SER i j k Oli j k max 0 12 c i j k SER i j 1 SER i jk 2 1 10 4 Similarly in Equation 2 1 10 3 the maximum allowable difference i j k 1 is given by j k 1 max 0 12 c i j k DW SER i j k SER i j k D OG j k 1 max 0 12 cli j k 1 SER i j k SER i j k D 2 1 10 5 where i j k and are as defined above and the multipliers are given in Table 2 1 10 1 The standard errors SER i j k for each condition are defined as j kn MOSG j E SER 261632 o Du OG k n MOSG j Y SER j k T RENNES 2 1 10 6 Specifically stating the requirement Equations 2 1 10 2 and 2 1 10 3 shall be true for all cases otherwise the test codec fails the compliance test 2 14 C S0018 D v1 0 Table 2 1 10 1 Multipliers for Equations 2 1 10 4 and 2 1 10 5 Experiment Condition Description cli ik M T T T ee E k 4 Cleans 278 3 04 367 408 294 338 n RENNES NA _ 265 270 om Gean 20 317 365 2 Canos 200 200 200 3 StetNose 200 200 2 00 3 49 465 2 1 11 Expected Results for Reference Conditions The MNRU conditions have been included to provide a frame of reference for the MOS test Also they provide anchor conditions for comparing results between test laboratories In listening evaluations where test conditions
18. The so68/testvec/fixed directory contains files processed with the EVRC-B fixed-point reference software. The files in these directories are the reference files for bit-exact compliance. A test codec is bit-exact if it can reproduce all of the reference files in the so68/testvec/fixed directory exactly. The outputs of the encoder and decoder of the test codec are to be obtained for the conditions given below in Table 3.2.4.5-1 and Table 3.2.4.5-2. The processing steps for these conditions are illustrated in Section 6.

Table 3.2.4.5-1 SO 68 Encoder Bit-exact Test Conditions
Input file | Operating point | Condition | Reference packet files for bit-exact compliance
src.s22 | EVRC-B 9.3 kbps | Nominal, 22 dB | 9_3.p22
src.s22 | EVRC-B 5.8 kbps | Nominal, 22 dB | 5_8.p22
src.s22 | EVRC-B 4.8 kbps | Nominal, 22 dB | 4_8.p22
src.s32 | EVRC-B 9.3 kbps | Low, 32 dB, 1% d&b | 9_3.p32
src.s32 | EVRC-B 5.8 kbps | Low, 32 dB, 1% d&b | 5_8.p32
src.s12 | EVRC-B 9.3 kbps | High, 12 dB | 9_3.p12
src.s12 | EVRC-B 5.8 kbps | High, 12 dB | 5_8.p12
src.c15 | EVRC-B 9.3 kbps | Nominal, 22 dB, 15 dB car noise | 9_3.pc
src.c15 | EVRC-B 5.8 kbps | Nominal, 22 dB, 15 dB car noise | 5_8.pc
src.b20 | EVRC-B 9.3 kbps | Nominal, 22 dB, 20 dB babble | 9_3.pb
src.b20 | EVRC-B 5.8 kbps | Nominal, 22 dB, 20 dB babble | 5_8.pb
src.s15 | EVRC-B 9.3 kbps | Nominal, 22 dB, 15 dB street | 9_3.ps
src.s15 | EVRC-B 5.8 kbps | Nominal, 22 dB, 15 dB street | 5_8
19. The main parameter in the decision tree is 16 kHz support in the implementation. Depending on the implementation profile of the device under test, one of two possible Test Suites is to be used to demonstrate SO 73 compliance. These two test suites, named Test Suites A and B, and the individual input test vectors comprising the Test Suites are highlighted in Table 3.4.4.5-1.

Table 3.4.4.5-1 Test Suites of input test vectors for SO 73 compliance
Test suite | Directory containing input test vectors
A | so73/testvec/source/suiteA
B | so73/testvec/source/suiteB

Files in the so73/testvec directory are provided for the purpose of qualifying a test codec as bit-exact and conform to the file naming convention described in Section 2.2.4. The so73/testvec directory is divided into two subdirectories: so73/testvec/source and so73/testvec/fixed. The so73/testvec/source directory contains input source files as well as packet files injected with frame erasures. The so73/testvec/fixed directory contains files processed with the EVRC-NW fixed-point reference software. The files in these directories are the reference files for bit-exact compliance. A test codec is bit-exact if it can reproduce all of the reference files in the so73/testvec/fixed directory exactly. The outputs of the encoder and decoder of the test codec are to be obtained for the conditions given below in Table 3.4.4.5-2 through Table 3.4.4.5-5. The processing steps for these conditions are illustrated
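The bit-exact criterion described above (every reference file reproduced exactly) amounts to a byte-for-byte comparison. A minimal Python sketch follows. It assumes the test codec's outputs have been written to a separate directory using the same file names as the reference files; that layout and the function name `bit_exact_report` are assumptions made for illustration, not part of the specification.

```python
import filecmp
from pathlib import Path

def bit_exact_report(reference_dir, candidate_dir):
    """Compare each reference file with the same-named candidate file.

    Returns a dict mapping file name -> True if the candidate exists and
    matches the reference byte for byte, else False.
    """
    report = {}
    for ref in sorted(Path(reference_dir).iterdir()):
        if not ref.is_file():
            continue
        cand = Path(candidate_dir) / ref.name
        # shallow=False forces a comparison of file contents, not just metadata
        report[ref.name] = cand.is_file() and filecmp.cmp(ref, cand, shallow=False)
    return report
```

Under this sketch a test codec would count as bit-exact only if `all(bit_exact_report(ref_dir, out_dir).values())` is true.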
20. exact compliance Nominal 22 dB evrc nw opoO fer 396 2 point 0 16 kHz sampling 396 FER 022 evrc nw opoO fer 196 pls EVRC NW Nominal 22 dB evrc nw opO _1 22 operating point 0 196 FER 196 pls fer 196 pls 196 022 16 kHz sampling evrc nw opO0 p12 EVRC NW operating High 12 dB evrc nw op0 o12 point 0 16 kHz sampling evrc nw opO dim 196 p EVRC NW operating Low 32 dB evrc nw opO dim 196 32 point 0 16 kHz sampling 196 d amp B 032 evrc nw opO pc1 EVRC NW operating Nominal 22 dB evrc nw opO oc1 point 0 16 kHz sampling 10 dB car noise evrc nw opO fer 396 p EVRC NW Nominal 22 dB evrc nw opoO fer 396 c2 operating point 0 20 dB car noise oc2 16 kHz sampling fer 396 evrc nw opO ps EVRC NW operating Nominal 22 dB evrc nw opO os point 0 16 kHz sampling 15 dB street noise evrc nw opO pb EVRC NW operating Nominal 22 aB evrc nw opO ob point 0 16 kHz sampling 20 dB babble noise evrc nw opoO fer 3 EVRC NW operating Generic audio evrc nw opoO fer 3 point 0 16 kHz sampling signal fer 396 om evrc nw opi fer 396 p2 EVRC NW operating Nominal 22 dB evrc nw opi fer 396 2 point 1 8 kHz sampling FER 396 022 8k evrc nw op1 p12 EVRC NW operating High 12 dB evrc nw op1 012 8k point 1 8 kHz sampling evrc nw op1 p32 EVRC NW operating Low 32 dB evrc nw op1 032 8k point 1 8 kHz sampling evrc nw op7 p22 evrc nw opi dim 195 pl s 1 Yo p22
21. Exp. 4: Narrowband, P.835
File | Condition | Encoder operating point | Encoder-Decoder
d15 | Car, 15 dB SNR | Mode 7, interoperable with Mode 0 of SO 68 support | T-T
d16 | Car, 15 dB SNR | Mode 7, interoperable with Mode 0 of SO 68 support | T-M
d17 | Car, 15 dB SNR | Mode 7, interoperable with Mode 0 of SO 68 support | M-M
d18 | Car, 15 dB SNR | Mode 7, interoperable with Mode 0 of SO 68 support | M-T
d19 | Street, 15 dB SNR | Mode 0, LB portion of Wideband mode, decoder test only | M-M
d20 | Street, 15 dB SNR | Mode 0, LB portion of Wideband mode, decoder test only | M-T
d21 | Street, 15 dB SNR | Mode 4, interoperable with Mode 0 of SO 68 support | M-M
d22 | Street, 15 dB SNR | Mode 4, interoperable with Mode 0 of SO 68 support | M-T
d23 | Street, 15 dB SNR | Mode 4, interoperable with Mode 0 of SO 68 support | T-T
d24 | Street, 15 dB SNR | Mode 4, interoperable with Mode 0 of SO 68 support | T-M
d25 | Babble, 20 dB SNR, 2% FER | Mode 0, LB portion of Wideband mode, decoder test only | M-M
d26 | Babble, 20 dB SNR, 2% FER | Mode 0, LB portion of Wideband mode, decoder test only | M-T
d27 | Babble, 20 dB SNR, 2% FER | Mode 4, interoperable with Mode 0 of SO 68 support | M-M
d28 | Babble, 20 dB SNR, 2% FER | Mode 4, interoperable with Mode 0 of SO 68 support | M-T
d29 | Babble, 20 dB SNR, 2% FER | Mode 4, interoperable with Mode 0 of SO 68 support | T-T
d30 | Babble, 20 dB SNR, 2% FER | Mode 4, interoperable with Mode 0 of SO 68 support | T-M

2.3.2.3.5 Subjective Experiment 5
22. with the points labeled: 5 Excellent, 4 Good, 3 Fair, 2 Poor, 1 Bad. Data from 32 listeners shall be used for Experiment I, four listeners for each listening panel, where each listening panel uses a different randomization.

Before starting the test, the listeners should be given instructions for performing the subjective test. An example set of instructions for the ACR is presented in Figure 2.1.8-1. The instructions may be modified to allow for variations in laboratory data-gathering apparatus.

This is an experiment to determine the perceived quality of speech over the telephone. You will be listening to a number of recorded speech samples, spoken by several different talkers, and you will be rating how good you think they sound.

Use the single headphone on the ear you normally use for the telephone. On each trial, a two-sentence sample will be played. After you have listened to the sample, determine the category from the list below which best describes the overall quality of the sample. Press the numeric key on your keyboard corresponding to your rating for how good or bad that particular passage sounded.

The quality of the speech should be rated according to the scale below:
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad

During the session you will hear samples varying in different aspects of quality. Please take into account your total impression of each sample, rather than concentrating on any partic
23. 1 d amp b M T 15 Low level 1 d amp b T T a16 Low level 196 d amp b T M a17 High level M M a18 High level M T a19 High level T T a20 High level T M a21 196 FER 196 PLS M M a22 196 FER 196 PLS M T a23 3 FER M M a24 3 FER M T 2 41 C S0018 D v1 0 2 2 3 2 3 2 Subjective Experiment 2 for SO 70 The Test Parameters for Listening Experiment 2 are presented in Table 2 3 2 3 2 1 4 Table 2 3 2 3 2 1 SO 70 Listening Experiment 2 Test Parameters P NSA P 835 Wideband Number of talkers Test conditions o Car Noise 10 dB SNR o Car Noise 20 dB SNR 2 FER o Street Noise 15 dB SNR o Babble noise 20 dB S N Encoder Decoder Combinations 4 M M M T T T T M Number of talkers Test conditions o Car Noise 10 dB SNR o Car Noise 20 dB SNR 2 FER o Street Noise 15 dB SNR o Babble noise 20 dB S N Encoder Decoder Combinations 4 M M T T T M 6 7 The Test Conditions for Listening Experiment 2 are presented in Table 2 3 2 3 2 2 2 42 Table 2 3 2 3 2 2 SO 70 Listening Experiment 2 Test Conditions Exp 2 Wideband P 835 Reference Conditions File MNRU SNR b01 MNRU 40dB SNR 40dB Reference b02 MNRU 40dB SNR 20dB Reference bo3 MNRU 40dB SNR 0dB Reference bo4 MNRU 0aB SNR 40dB Reference b05 MNRU 20dB SNR 40dB Reference bo6 MNRU 10dB SNR 10dB Reference b07 MNRU 20dB SNR 20dB Refer
24. 2.2.2 Subjective Performance Testing for SO 68
2.2.3 Speech Material for SO 68 Testing
2.2.4 Processing of Speech Material for SO 68 Testing
2.2.5 Randomization
2.2.6 Presentation
2.2.7 Listeners
2.2.8 Listening Test Procedures
2.2.9 Analysis of Results
2.2.10 Expected Results for Reference Conditions
2.3 Performance Testing for SO 70
2.3.1 Objective Performance Testing for SO 70
2.3.2 Subjective Performance Testing for SO 70
2.3.3 Speech Material for SO 70 Testing
2.3.4 Processing of Speech Material for SO 70 Testing
2.3.5 Randomization
2.3.6 Presentation
2.3.7 Listeners
2.3.8 Listening Test Procedures
2.3.9 Analysis of Results
2.3.10 Expected Results for Reference Conditions
25. 2.1.2.2 Method of Measurement
The subjective test involves a listening-only assessment of the quality of the codec being tested, using the master codec as a reference. Subjects from the general population of telephone users will rate the various conditions of the test. Material supplied with this standard for use with this test includes source speech, impaired packet files from the master codec encoder, and source speech processed by various Modulated Noise Reference Unit (MNRU) conditions and other references. The basic Absolute Category Rating test procedure involves rating all conditions using a five-point scale describing the opinion of the test condition. This procedure is fully described in [10].

2.1.2.3 Test Conditions and Test Design for SO 3 Listening Experiments
The two listening experiments for SO 3 are similar in design and are performed as MOS listening tests. Each experiment will test the same number of codecs, and the number of test conditions for each experiment is five. There will be one condition typifying CDMA channels (3% FER), a clear channel condition, and a clear channel tandem condition. All tandem conditions shall be asynchronous, where asynchronous implies the introduction of a partial frame offset between encoding operations. A nominal input level of 22 dB shall be used for these conditions. Additional test conditions include background noise and audio input level variation. For reference, µ-law, 4 MNRU conditions: 5, 15, 20, a
26. subjective requirement for the test codec is based upon the ability of the test codec to demonstrate performance equivalent to or better than that of the specific EVRC floating-point bit-exact codec within a fixed allowable statistical error.

The purpose of the testing is not only to ensure adequate performance between one manufacturer's encoder and decoder but also that this level of performance is maintained with operation between any pairing of manufacturers' encoders and decoders. This interoperability issue is a serious one. Any variation in implementing the exact standard must be avoided if it cannot be ensured that minimum performance levels are met when interoperating with all other manufacturers' equipment meeting the standard. This standard provides a means for measuring performance levels while trying to ensure proper interoperation with other manufacturers' equipment.

The issue of interoperation can only be definitively answered by testing all combinations of encoder/decoder pairings. With the number of equipment manufacturers expected to supply equipment, this becomes a prohibitive task; therefore, the objective and subjective tests rely upon the use of a master codec. The master codec is defined as the floating-point implementation of specific EVRC written in the C programming language. The master codec software
27. Capacity Operating Point (active speech average channel data rate) | Target Average Channel Data Rate (kbps)
EVRC-B 8.5k bits/sec | 6.42 +1.5%
EVRC-B 7.5k bits/sec | 5.52 +1.5%
EVRC-B 7.0k bits/sec | 5.24 +1.5%
EVRC-B 6.6k bits/sec | 4.82 +1.5%
EVRC-B 6.2k bits/sec | 4.62 +1.5%
EVRC-B 5.8k bits/sec | 4.45 +1.5%
EVRC-B Half-Rate Max 4.8k bits/sec | 3.75 +1.5%

The above table provides the maximum allowable average channel rate (including full, half, quarter, and eighth rate) for the different capacity operating points. These maximum allowable average channel rates were obtained by processing the 6 benchmark files through the master floating-point software. See Section 3.2.2.1 for details in using the provided software tool that can be used to aid in making this calculation.

2.2.1.1.2 Average Data Rate Requirement for SO 68
The total average data rate Ravg for each capacity operating point shall not exceed the target average data rate by more than the tolerance level in Table 2.2.1.1.1-1; otherwise the test codec fails the compliance test.

2.2.1.2 Unity Gain Requirement
The specific EVRC-B test codec shall output speech with unity gain when compared with the input speech. The unity gain measurement (output active speech level / input active speech level) will be performed over the entire input speech database for the clean, nominal
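The average data rate requirement in 2.2.1.1.2 reduces to a simple comparison against the target plus tolerance. The Python sketch below assumes Ravg has already been measured in kbps by some external tool, and uses the 1.5% tolerance shown in the table; the function name `adr_within_tolerance` is hypothetical.

```python
def adr_within_tolerance(measured_kbps, target_kbps, tolerance_pct=1.5):
    """Return True if the measured average data rate does not exceed
    the target by more than tolerance_pct percent."""
    limit_kbps = target_kbps * (1.0 + tolerance_pct / 100.0)
    return measured_kbps <= limit_kbps
```

For the EVRC-B 7.0k bits/sec row, the target of 5.24 kbps gives a limit of 5.24 × 1.015 = 5.3186 kbps, so a measured 5.30 kbps passes and 5.35 kbps fails.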
28. 2.2.9 Analysis of Results
    The response data from the practice blocks shall be discarded. Data sets with missing responses from listeners shall not be used; i.e., a complete set of data is required for 32 listeners, four for each of eight listening panels. Responses from the different listening panels for the corresponding test conditions shall be treated as equivalent in the analysis.
    2.2.9.1 Basic Results for the SO 68 Listening Tests
    The votes for each of the test conditions for SO 68 Experiments I and II shall be averaged to produce an associated mean score (M), as shown in Equation 2.2.9.1-1, and a Standard Deviation (SD), as shown in Equation 2.2.9.1-2, where L is the number of listeners, T is the number of talkers involved in the experiment, and X_{l,t} is the vote of listener l for talker t.
    $$M = \frac{1}{L \times T} \sum_{l=1}^{L} \sum_{t=1}^{T} X_{l,t} \qquad (2.2.9.1\text{-}1)$$
    $$SD = \sqrt{\frac{\sum_{l=1}^{L} \sum_{t=1}^{T} \left(X_{l,t} - M\right)^{2}}{L \times T - 1}} \qquad (2.2.9.1\text{-}2)$$
    2.2.9.2 Minimum Subjective Requirement for SO 68 Listening Tests
    The Terms of Reference for the MPS tests state that the mean score for each of the Test Encoder/Decoder Combinations (E/DC) should be not worse than the mean score for the Reference E/DC. For most of the test conditions involved in the subjective experiments there are three Test E/DCs (M-T, T-M, and T-T), which means there are three statistical tests against the Reference E/DC (M-M). The three statistical tests are not independent, however. Since they all involve the same ratings for the Reference E/DC, t-tests are not appropriate. The appropriate statist
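The mean score and standard deviation described above can be sketched in a few lines of Python. This is a minimal illustration, not the standard's tooling: the function name `mean_and_sd` is hypothetical, the votes are assumed to be supplied as one flat list of L x T ratings for a single test condition, and the SD is assumed to use an L x T - 1 denominator.

```python
import math


def mean_and_sd(votes):
    """Mean score M and standard deviation SD for one test condition.

    votes: flat list of the L x T ratings (assumed layout).
    """
    n = len(votes)  # n = L x T
    m = sum(votes) / n
    # Sample standard deviation with an (L x T - 1) denominator (assumption).
    sd = math.sqrt(sum((v - m) ** 2 for v in votes) / (n - 1))
    return m, sd


m, sd = mean_and_sd([1, 2, 3, 4, 5])
print(m, round(sd, 4))
```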
29. 5 DUNNETT'S TEST
    Most of the MPS statistical tests for SO 68, SO 70, and SO 73 compliance involve multiple Test Encoder/Decoder Combinations (E/DC) and a single Reference E/DC. The appropriate analysis for the statistical tests involved in the EVRC-B MPS and EVRC-WB MPS test is Dunnett's Test [20]. Dunnett's Test is a special case of the more general Post Hoc Multiple Means Test, where multiple treatment means are statistically compared to a common control mean. In the case of the MPS tests, the treatments are the three Test E/DCs (M-T, T-M, T-T) and the control is the Reference E/DC (M-M). Dunnett's Test is conducted in two stages. The first stage involves an Analysis of Variance (ANOVA) for the effects of E/DC x Subjects, where the E/DC factor includes the four E/DCs (three Test E/DCs plus the Reference E/DC) and the Subjects factor includes the 32 subjects involved in the subjective test. If the F-ratio for the E/DC effect is significant (i.e., p < .05), then there is significant variation among the scores for the E/DCs and Dunnett's Test proceeds to the second stage of the process. An F-ratio that is not significant indicates that there is no significant variation among the Test and Reference E/DCs; the means for all four E/DCs are statistically equivalent, therefore all Test E/DCs are not worse than the Reference E/DC and all pass the MPS. In the second stage of Dunnett's Test
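The first-stage F-ratio can be illustrated with a small Python sketch. It is only an approximation of the procedure described above: the function name `edc_f_ratio` is hypothetical, it assumes one score per subject per E/DC (for example, each subject's mean rating), and it uses the E/DC x Subjects interaction as the error term. It returns the F-ratio and its degrees of freedom; looking up the p-value is left to a statistics package.

```python
def edc_f_ratio(scores):
    """F-ratio for the E/DC effect in an E/DC x Subjects layout.

    scores[i][j] is the score of subject i for E/DC j (assumed layout).
    Returns (F, df_edc, df_error).
    """
    n = len(scores)       # number of subjects
    k = len(scores[0])    # number of E/DCs
    grand = sum(sum(row) for row in scores) / (n * k)
    edc_means = [sum(row[j] for row in scores) / n for j in range(k)]
    subj_means = [sum(row) / k for row in scores]

    ss_edc = n * sum((m - grand) ** 2 for m in edc_means)
    ss_subj = k * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_edc - ss_subj  # residual (interaction) term

    df_edc = k - 1
    df_error = (k - 1) * (n - 1)
    # Note: a zero residual would make the ratio undefined.
    f = (ss_edc / df_edc) / (ss_error / df_error)
    return f, df_edc, df_error


f, df1, df2 = edc_f_ratio([[1, 2], [2, 4], [3, 6]])
print(f, df1, df2)
```

With four E/DCs and 32 subjects, this layout gives (4 - 1) x (32 - 1) = 93 error degrees of freedom.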
30. EVRC-A Tandem | Nominal, 22 dB | T-T T-T
    b25 | IS-96-C Tandem | Nominal, 22 dB | R R
    b26 | Reference | MNRU 5 dB
    b27 | Reference | MNRU 15 dB
    b28 | Reference | MNRU 20 dB
    b29 | Reference | MNRU 25 dB
    b30 | Reference | G.728
    b31 | Reference | μ-Law Source
    2.1.2.3.3 Numerical Parameters for SO 3 Listening Experiments
    Table 2.1.2.3.3-1 describes the resultant numerology that is used for each of the two SO 3 listening experiments. The first column is a variable name given to each of the parameters, the second column is the description of the parameter, the third column shows the required calculation for determining the value of the parameter if it is dependent upon other parameter values, and the last column shows the numerical value for each of the parameters. For each listening experiment, four codecs plus the IS-96-C codec are evaluated. The number of reference conditions in each of the two listening experiments is six and the number of test conditions is five.
    [Table 2.1.2.3.3-1 Numerical Parameters for SO 3 Listening Experiments. Columns: Parameter, Calculation, Experiment I, Experiment II. Rows include Codecs, Reference Conditions, Stimuli per Talker, File Sessions, Listeners (Voters), and Listeners (Voters) per File Session.]
    2.1.3 Source Speech Material for SO 3 Testing
    All source material is derived from the Harvard Se
31. Input Packet File | Operating Point | Condition | Reference output speech files for bit-exact compliance
    (not shown) | EVRC-NW operating point 7, 8 kHz sampling | Nominal, 22 dB | evrc nw op7 o22 8k
    (not shown) | EVRC-NW operating point 1, 8 kHz sampling | Nominal, 22 dB, 1% d&b, 1% pls | evrc nw op1 dim 1% pls 1% o22 8k
    evrc nw op1 pc | EVRC-NW operating point 1, 8 kHz sampling | Nominal, 22 dB, 15 dB car noise | evrc nw op1 oc 8k
    evrc nw op7 pc | EVRC-NW operating point 7, 8 kHz sampling | Nominal, 22 dB, 15 dB car noise | evrc nw op7 oc 8k
    evrc nw op1 ps | EVRC-NW operating point 1, 8 kHz sampling | Nominal, 22 dB, 15 dB street noise | evrc nw op1 os 8k
    evrc nw op1 fer 2% pb | EVRC-NW operating point 1, 8 kHz sampling | Nominal, 22 dB, 15 dB babble noise, 2% FER | evrc nw op1 fer 2% ob 8k
    evrc nw op6 fer 3% p22 | EVRC-NW operating point 6, 8 kHz sampling | Nominal, 22 dB, FER 3% | evrc nw op6 fer 3% o22 8k
    evrc nw op6 p12 | EVRC-NW operating point 6, 8 kHz sampling | High, 12 dB | evrc nw op6 o12 8k
    evrc nw op6 p32 | EVRC-NW operating point 6, 8 kHz sampling | Low, 32 dB | evrc nw op6 o32 8k
    evrc nw op6 dim 1% pls 1% p22 | EVRC-NW operating point 6, 8 kHz sampling | Nominal, 22 dB, 1% d&b, 1% pls | evrc nw op6 dim 1% pls 1% o22 8k
    evrc nw op6 pc | EVRC-NW operating point 6, 8 kHz sampling | Nomin
32. If no target RMS value is specified, the program calculates and prints the initial statistics mentioned above and copies the input file to the output file unmodified. The program is invoked as follows:
    sv56 Desired RMS Level File In File Out Sample Rate Resolution
    Note: The desired level specified for sv56 differs by 3 dB from the value required for this specification. For example, in order to adjust speech files 22 dB in accordance with this specification, the calling sequence is:
    sv56 25 File In File Out
    3.1.2.3 μ-Law Companding (mu l c)
    This program applies μ-Law companding to the sample values in a linearly quantized speech file according to [7]. The source code, mu l c, is available from [6] and [6a]. The input to the program is the speech file to be companded. The output is the companded speech file. Both files are linearly quantized speech files in accordance with Section 3.1.3.3 of this document. The program is invoked as follows:
    mu l input filename output filename
    3.1.3 Master Codec for SO 3
    This section describes the C simulation of the speech codec specified by [1]. The master codec C simulation used for verifying the performance of a non-bit-exact EVRC implementation shall be the floating-point master C simulation included in the associated Software Distribution [1a].
    3.1.3.1 Compiling the
33. M-T
    15 | Car 20 dB SNR, 1% pls, Mode 0, LB portion of Wideband mode (decoder test only) | M-M
    16 | Car 20 dB SNR, 1% pls, Mode 0, LB portion of Wideband mode (decoder test only) | M-T
    2.3.2.3.7 Subjective Experiment 7 for SO 70
    The Test Parameters for Listening Experiment 7 are presented in Table 2.3.2.3.7-1.
    Table 2.3.2.3.7-1 SO 70 Listening Experiment 7 Test Parameters
    Condition | Description
    Type of test | ACR (P.800), Wideband
    Number of genres |
    Background noise | none (ambient)
    Test conditions | 0% FER and 3% FER
    Encoder/Decoder Combinations (2) | M-M, M-T
    The Test Conditions for Listening Experiment 7 are presented in Table 2.3.2.3.7-2.
    Table 2.3.2.3.7-2 SO 70 Listening Experiment 7 Test Conditions
    Exp. 7, Wideband Music
    File | Reference Condition
    g01 | MNRU 15 dB | Reference
    g02 | MNRU 25 dB | Reference
    g03 | MNRU 35 dB | Reference
    g04 | Source | Reference
    File | Test Condition | Enc-Dec
    g05 | 0% FER |
    g06 | 0% FER |
    g07 | 3% FER |
    g08 | 3% FER |
    2.3.2.3.8 Subjective Experiment 8 for SO 70
    The Test Parameters for Listening Experiment 8 are presented in Table 2.3.2.3.8-1.
    Table 2.3.2.3.8-1 SO 70 Listening Experiment 8 Test Parameters
    Condition | Description
    Type of test | P.800, Narrowband
    Number of genres |
    Background noise | none (ambient)
    Audio Input Level | 22 dB
    Filter characteristics | MIRS
    Reference conditions (4) | Specified ref
34. Master Codec Simulation
    The source code for the floating-point C simulation has been written in ANSI C and compiled using the GNU GCC C compiler and make utility. Refer to Section 3.1.2 for information regarding obtaining GCC, make, and relevant documentation. A GCC-compatible makefile has been included in [1a]. Typing "make" in the appropriate directory will compile and link the code and create the executable file called EvrcFlt (evrcflt.exe on Win32 systems). The included makefile may require some user modification for a particular hardware platform and/or operating system.
    3.1.3.2 Running the Master Codec Simulation
    The EVRC executable files use command line arguments to receive all information regarding input and output files and various parameters used during execution. Executing EvrcFlt with no command line arguments will display a brief description of the required and optional command line arguments. The options are described below:
    -i infn (required) Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see the -d option below).
    -o outf (required) Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
    -d Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
    -e Instructs the simulation to perform only the encoding fun
35. Once EvrcFlt is compiled, verification files should be processed as follows:
    EvrcFlt -i mstr_ref.pcm -o verify.pkt -e
    EvrcFlt -i verify.pkt -o verify.dec -d
    If the output files mstr_ref.pkt and mstr_ref.dec exactly match verify.pkt and verify.dec, respectively, then verification of the master codec's operation is complete. Because of differences in the way that floating-point arithmetic is done in different computing environments, it will not always be true that the floating-point master C simulation will produce identical output in response to the same input when compiled and run on different compiler/hardware platforms, even though the simulation is operating correctly. In the event that the exact match described in the preceding paragraph is not obtained, it is recommended that the user verify that the version of GCC used is version 2.7.2 or later.
    3.1.4 Fixed-Point Bit-Exact Codec for SO 3
    This section describes the C simulation of the speech codec specified by [1]. The speech codec C simulation is based on finite-precision, fixed-point arithmetic operations and is required to be used as a reference codec to verify the performance of a bit-exact EVRC implementation of the fixed-point C simulation of a test codec. The bit-exact EVRC codec, along with the appropriate test vectors to verify
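The exact-match check described above is a byte-for-byte file comparison. A minimal Python sketch follows; the helper name `outputs_match` is hypothetical, and the file paths are whatever the user's verification run produced.

```python
import filecmp


def outputs_match(reference_path, produced_path):
    """Return True only if the two files are byte-for-byte identical.

    shallow=False forces a comparison of file contents rather than
    relying on size and modification time alone.
    """
    return filecmp.cmp(reference_path, produced_path, shallow=False)
```

For example, the call would be made once for the packet files and once for the decoded speech files, and verification is complete only if both calls return True.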
36. Reference Conditions
    Figure 3-1 Basic Test
    Figure 3-2 Subjective Testing Equipment Configuration
    Figure 3.3.2.4-1 SO 70 ITU-T P.311/P.341 Transmit Mask and Filter responses
    Figure 3.3.4.5-1 SO 70 Fixed-point bit-exact test suite decision flowchart
    Figure 3.4.2.4-1 SO 73 ITU-T P.311/P.341 Transmit Mask and Filter responses
    Figure 3.4.4.5-1 SO 73 Fixed-point bit-exact test suite decision flowchart
    LIST OF TABLES
    Table 2.1.2.3.1-1 SO 3 Listening Experiment I Conditions
    Table 2.1.2.3.1-2 SO 3 Listening Experiment I Design
    Table 2.1.2.3.2-1 SO 3 Listening Experiment II Conditions
    Table 2.1.2.3.2-2 SO 3 Listening Experiment II Design
    Table 2.1.2.3.3-1 Numerical Parameters for SO 3 Listening Experiments
    Table 2.1.10-1 Multipliers for Equations 2.1.10-4 and 2.1.10-5
    Table 2.2.1.1.1-1 Target
37. [Excerpt of a randomization list of sample identifiers, for example a21m4s3, a05m3s5, a01m4s3, a22f3s3, a09f1s7, a20f2s2, a02m3s1, a21m3s4, a03m3s4; the remaining entries are not legible.]
    The randomization lists for each of the eight listening panels for each experiment are provided in so73 subjctv exp data play lst.
    2.4.6 Presentation
    Presentation of speech materials for the SO 73 codec listening tests shall be made with one side of high-fidelity supra-aural headphones, with the other ear uncovered. The speech material delivery system shall meet the requirements of Section 3.4.1.1. The listeners should be seated in a quiet room with an ambient noise level of 30 dBA or below.
    2.4.7 Listeners
    The listener sample is intended to represent the population of telephone users with normal hearing acuity. The listeners should be naive with respect to telephony technology issues; that is, they should not be experts in telephone design, digital voice encoding algorithms, and so on. They should not be trained listeners; that is, they should not have been trained in these or previous listening studies using feedback trials. Age
38. all. This standard is meant to define both verifications of bit-exact implementations and the recommended minimum performance requirements of EVRC-compatible variable-rate codecs, no matter where or how they are implemented in the cellular service. Although the basic purpose of cellular telecommunications has been voice communication, evolving usages (for example, data) may allow the omission of some of the features specified herein, provided that system compatibility is not compromised.
    Footnote 1: Numbers in brackets [N] refer to the reference document numbers. For example, [1] refers to reference 1 in the reference list.
    This standard concentrates specifically on the EVRC, whether implemented at the mobile station or the base station or elsewhere in the cellular system. This standard covers the operation of this component only to the extent that compatibility with the specific EVRC-compatible variable-rate codec is ensured.
    1.1 Scope
    This document specifies the procedures to test implementations of EVRC-A, EVRC-B, EVRC-WB, or EVRC-NW compatible variable-rate speech codecs, either by meeting the bit-exact implementation or by meeting recommended minimum performance requirements. The EVRC-A is the Service Option 3 (SO 3) speech codec, the EVRC-B is the Service Option 68 (SO 68) speech codec, the EVRC-WB is
39. containing the 16 data bits for the frame in byte-swapped form, followed by ten 16-bit words containing all zero bits.
    3.4.4.5 Verifying Bit-Exact Performance of the Fixed-Point Test Codec
    This section outlines the methodology of verifying whether a fixed-point test codec is bit-exact to the fixed-point reference software. The purpose of this testing is to evaluate the bit-exactness of the test codec under a variety of conditions which may occur. To accomplish this, suites of test vectors have been designed to test for bit-exactness of the test codec under a variety of conditions, depending on a number of parameters. These conditions include channel impairments, audio background noise, and different input levels. Figure 3.4.4.5-1 illustrates a decision tree to arrive at the suite of test vectors that are needed to demonstrate Minimum Performance Spec compliance through bit-exactness of a test implementation of SO 73 for different profiles of equipment that support SO 73.
    [Figure 3.4.4.5-1 SO 73 Fixed-point bit-exact test suite decision flowchart. Is 16 kHz sampling rate supported? Yes: Run Test Suite A. No: Run Test Suite B.]
    An implementation may support SO 73 only for 8 kHz sample rate input/output, for example, a base station transcoder or a Media Gateway. An implementation may support SO 73 for both 16 kHz and 8 kHz sample rates, for example, a mobile station that supports wideband electro-acoustics.
40. dB
    b09 | EVRC-B 9.3 kbps | Car Noise, 15 dB | M-M
    b10 | EVRC-B 9.3 kbps | Car Noise, 15 dB | M-T
    b11 | EVRC-B 9.3 kbps | Car Noise, 15 dB | T-T
    b12 | EVRC-B 9.3 kbps | Car Noise, 15 dB | T-M
    b13 | EVRC-B 5.8 kbps | Car Noise, 15 dB | M-M
    b14 | EVRC-B 5.8 kbps | Car Noise, 15 dB | M-T
    b15 | EVRC-B 5.8 kbps | Car Noise, 15 dB | T-T
    b16 | EVRC-B 5.8 kbps | Car Noise, 15 dB | T-M
    b17 | EVRC-B 4.8 kbps | Car Noise, 15 dB | M-M
    b18 | EVRC-B 4.8 kbps | Car Noise, 15 dB | M-T
    b19 | EVRC-B 4.8 kbps | Car Noise, 15 dB | T-T
    b20 | EVRC-B 4.8 kbps | Car Noise, 15 dB | T-M
    b21 | EVRC-B 9.3 kbps | Street Noise, 15 dB | M-M
    b22 | EVRC-B 9.3 kbps | Street Noise, 15 dB | M-T
    b23 | EVRC-B 9.3 kbps | Street Noise, 15 dB | T-T
    b24 | EVRC-B 9.3 kbps | Street Noise, 15 dB | T-M
    b25 | EVRC-B 5.8 kbps | Street Noise, 15 dB | M-M
    b26 | EVRC-B 5.8 kbps | Street Noise, 15 dB | M-T
    b27 | EVRC-B 5.8 kbps | Street Noise, 15 dB | T-T
    b28 | EVRC-B 5.8 kbps | Street Noise, 15 dB | T-M
    b29 | EVRC-B 9.3 kbps | Office Noise, 20 dB | M-M
    b30 | EVRC-B 9.3 kbps | Office Noise, 20 dB | M-T
    b31 | EVRC-B 9.3 kbps | Office Noise, 20 dB | T-T
    b32 | EVRC-B 9.3 kbps | Office Noise, 20 dB | T-M
    b33 | EVRC-B 5.8 kbps | Office Noise, 20 dB | M-M
    b34 | EVRC-B 5.8 kbps | Office Noise, 20 dB | M-T
    b35 | EVRC-B 5.8 kbps | Office Noise, 20 dB | T-T
    b36 | EVRC-B 5.8 kbps | Office Noise, 20 dB | T-M
    2.2.2.3.3 Numerical Parameters for the SO 68 Listening Experiments
    Table 2.2.2.3.3-1 describes the resultant numerology that is used for the two SO 68 listening experi
41. distribution and gender should be nominally balanced across listening panels. Each listener shall provide data only once for a particular evaluation. A listener may participate in different evaluations, but test sessions performed with the same listener should be at least two months apart so as to reduce the cumulative effects of experience.
    2.4.8 Listening Test Procedures
    2.4.8.1 ACR Listening Test Procedures, Experiments 1 and 3
    The listeners shall listen to each sample and rate the quality of the test sample using a five-point scale, with the points labeled:
    5 Excellent
    4 Good
    3 Fair
    2 Poor
    1 Bad
    Data from 32 listeners shall be used for Experiments 1 and 3, four listeners for each listening panel, where each listening panel uses a different randomization. Before starting the test, the listeners should be given instructions for performing the subjective test. An example set of instructions for the ACR is presented in Figure 2.4.8.1-1. The instructions may be modified to allow for variations in laboratory data-gathering apparatus.
    This is an experiment to determine the perceived quality of speech over the telephone. You will be listening to a number of recorded speech samples, spoken by several different talkers, and you will be rating how good you think they sound. Use the single headphone on the ear you normally use for the telephone. On each trial, a two-sentence sample will be played. After you have l
42. example, a base station transcoder or a Media Gateway. An implementation may support SO 73 for both 16 kHz and 8 kHz sample rates, for example, a mobile station that supports wideband electro-acoustics. Therefore, the main parameter in the decision tree is 16 kHz support in the implementation. Depending on the implementation profile of the device under test, one of 2 possible Test Suites is to be used to demonstrate SO 73 compliance. These 2 test suites, named Test Suites A and B, and the individual tests comprising the Test Suites are highlighted in Table 2.4.2-1.
    Table 2.4.2-1 Test Suites for SO 73 compliance
    Test Suite | Set of Experiments
    A | Experiments 1, 2, 3, 4, 5, and 6
    B | Experiments 3, 4, and 6
    Each of the individual experiments is further defined in detail by Table 2.4.2-2.
    Table 2.4.2-2 Experiments for SO 73 compliance
    Experiment | Individual tests | Notes
    1 | WB clean/level/FER/signaling - ACR | Mobile supporting 16 kHz Fs
    2 | WB noise/FER - P.835 | Mobile supporting 16 kHz Fs
    3 | NB clean/level/FER/signaling, including SO 68 interoperable mode tests - ACR | BS supporting 8 kHz and MS supporting 8/16 kHz
    4 | NB noise/FER, including SO 68 interoperable mode tests - P.835 | BS supporting 8 kHz and MS supporting 8/16 kHz
    5 | WB music decoder test - ACR | Mobile supporting 16 kHz Fs
    6 | NB music decoder test - ACR | BS supporting 8 kHz Fs
    2.4.2.1 Definition
    The codec s
43. for SO 70
    The Test Parameters for Listening Experiment 5 are presented in Table 2.3.2.3.5-1.
    Table 2.3.2.3.5-1 SO 70 Listening Experiment 5 Test Parameters
    Type of test | ACR (P.800), Narrowband
    Number of talkers | 4 males, 4 females
    Test conditions | Nominal level, Mode 0; Low level, Mode 0; High level, Mode 0; Nominal level, Mode 0, 1% d&b; Nominal level, Mode 0, 10 d&b; 2% FER, Mode 0, 1% d&b; 6% FER, Mode 0, 10 d&b; Nominal, Mode 0, 1% pls
    Encoder/Decoder Combinations (4) | M-M, M-T, T-T, T-M
    The Test Conditions for Listening Experiment 5 are presented in Table 2.3.2.3.5-2.
    Table 2.3.2.3.5-2 SO 70 Listening Experiment 5 Test Conditions
    Exp. 5, Narrowband ACR
    Reference Conditions: File | MNRU
    e01 | 5 dB MNRU | Reference
    e02 | 10 dB MNRU | Reference
    e03 | 15 dB MNRU | Reference
    e04 | 20 dB MNRU | Reference
    e05 | 25 dB MNRU | Reference
    e06 | 30 dB MNRU | Reference
    e07 | 35 dB MNRU | Reference
    e08 | Direct Source | Reference
    Test Conditions: File | Condition | Enc-Dec
    e09 | Nominal, Mode 0, LB portion of Wideband mode (decoder test only) | M-M
    e10 | Nominal, Mode 0, LB portion of Wideband mode (decoder test only) | M-T
    e11 | Low, Mode 0, LB portion of Wideband mode (decoder test only) | M-M
    e12 | Low, Mode 0, LB portion of Wideband mode (decoder test only) | M-T
    e13 | High, Mode 0, LB portion of Wideband mode (decoder test on
44. frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame in byte-swapped form, followed by ten 16-bit words containing all zero bits.
    3.3.4.5 Verifying Bit-Exact Performance of the Fixed-Point Test Codec
    This section outlines the methodology of verifying whether a fixed-point test codec is bit-exact to the fixed-point reference software. The purpose of this testing is to evaluate the bit-exactness of the test codec under a variety of conditions which may occur. To accomplish this, suites of test vectors have been designed to test for bit-exactness of the test codec under a variety of conditions, depending on a number of parameters. These conditions include channel impairments, audio background noise, and different input levels. Figure 3.3.4.5-1 illustrates a decision tree to arrive at the suite of test vectors that are needed to demonstrate Minimum Performance Spec compliance through bit-exactness of a test implementation of SO 70 for different profiles of equipment that support SO 70.
    [Figure 3.3.4.5-1 SO 70 Fixed-point bit-exact test suite decision flowchart. Decisions: Is 16 kHz sampling rate supported? Is the implementation SO 68 compliant? Outcomes: Run Test Suite A, B, C, or D.]
    An implementation may support SO 70 only for 8 kHz sample rate
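The packet layout quoted at the start of this excerpt (one rate word, one data word, ten zero words) can be sketched in Python. This is illustrative only: the helper names `swap16` and `build_record` are hypothetical, words are modelled as plain 16-bit integers, and how those words are laid out as bytes on disk is deliberately left out because the excerpt does not settle it.

```python
def swap16(word):
    """Swap the two bytes of a 16-bit word, e.g. 0x0001 -> 0x0100."""
    return ((word & 0xFF) << 8) | ((word >> 8) & 0xFF)


def build_record(rate_word, data_word):
    """Build one 12-word record: rate word, data word, ten zero words.

    Both leading words are stored in byte-swapped form, as the text describes.
    """
    return [swap16(rate_word), swap16(data_word)] + [0] * 10


record = build_record(0x0001, 0x1234)
print([hex(w) for w in record[:2]], len(record))
```

The record length of 12 words follows directly from the 1 + 1 + 10 words listed in the text.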
45. in the so3/tools/bin directory. This makefile may need to be modified to conform to the user's hardware platform.
    The GNU C compiler (GCC) and software development tools, including documentation, are available without charge from the Free Software Foundation. They can be contacted at:
    Free Software Foundation, 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA
    Voice: +1 617 542 5942
    Fax: +1 617 542 2652
    gnu@gnu.org
    or on the World Wide Web at http://www.fsf.org
    Those non-3GPP2-supplied tools (mu l exe and sv56.exe), available in C code form from [6] and [6a] and compiled using GCC, are identified and are to be used supplementary to those available on the Software Distribution. The program descriptions that follow all use the convention of enclosing optional command line arguments in angle brackets.
    3.1.2.1 Average Data Rate Determination Utility (avg_rate.c)
    This utility program is used to determine the average data rate at which a test codec encodes a set of benchmark speech files. The source code, avg_rate.c, is a 3GPP2-supplied tool and is located in the so3/tools/avg_rate directory of the associated Software Distribution. The input to the program is a list of packet file names, where each packet file referred to in the list conforms to the format described in Sec
46. in this document are provided within an associated Software Distribution. The directory structure of the Software Distribution is represented in Table 4-1, Table 4-2, Table 4-3, and Table 4-4. Table 4-1 contains a brief description of the Software Distribution for the EVRC-A MPS. Table 4-2 contains a brief description of the Software Distribution for the EVRC-B MPS. Table 4-3 contains a brief description of the Software Distribution for the EVRC-WB MPS, and Table 4-4 contains a brief description of the Software Distribution for the EVRC-NW MPS. The prime sub-directories of these distributions are so3, so68, so70, and so73, respectively. These tables contain brief descriptions of the contents of these directories, as well as cross-references to the sections of this document in which they are described in detail.
    Table 4-1 Description of EVRC-A Software Distribution
    Contents | Description | References
    so3/simul/fixed | Source code for the bit-exact fixed-point code | 3.1.4
    so3/subjctv | Speech and other material necessary to perform Subjective Experiments I and II | 2.1.3, 2.1.4, 2.1.5
    so3/objctv | Speech material necessary to perform the Average Data Rate |
    so3/cal | Output level calibration file for listening tests | 3.1.1.2
    so3/tools | Source code for the software tools | 3.1.2
    so3/testvec | Test vectors for verifying bit-exact EVRC implementations | 3.1.4.6
    Table 4-2 Description of EVRC-B Software Distribution
    Contents
    so68 EV
47. input/output, for example, a base station transcoder or a Media Gateway. An implementation may support SO 70 for both 16 kHz and 8 kHz sample rates, for example, a mobile station that supports wideband electro-acoustics. Further, the implementation supporting SO 70 might already have demonstrated compliance with the SO 68 Minimum Performance Spec. This means that such equipment has also demonstrated the Minimum Performance requirements for RATE_REDUC operating points 4 and 7 of SO 70, which exactly correspond to the RATE_REDUC operating points 0 and 7 of SO 68. Therefore, the main parameters in the decision tree are (a) 16 kHz support in the implementation and (b) SO 68 compliance of the test implementation. Depending on the implementation profile of the device under test, one of 4 possible Test Suites is to be used to demonstrate SO 70 compliance. These 4 test suites, named Test Suites A, B, C, and D, and the individual input test vectors comprising the Test Suites are highlighted in Table 3.3.4.5-1.
    Table 3.3.4.5-1 Test Suites of input test vectors for SO 70 compliance
    Test Suites | Directory containing input test vectors | Notes
    Notes column, in table order:
    Mobile application already supporting SO 68 compliance
    Mobile application NOT already supporting SO 68 compliance
    Infra/MGW application already supporting SO 68 compliance
    Infra/MGW application NOT already supporting SO 68 compliance
    Directory column: so70/testvec/source/suiteA
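The two-parameter decision described above can be sketched as a small function. This is a sketch under stated assumptions, not the standard's flowchart: the function name `select_so70_suite` is hypothetical, and the mapping of suites to profiles assumes that the table notes are listed in the order A, B, C, D and that "Mobile" corresponds to 16 kHz support while "Infra/MGW" corresponds to 8 kHz only.

```python
def select_so70_suite(supports_16khz, so68_compliant):
    """Pick a bit-exact test suite letter for an SO 70 implementation.

    Assumed mapping (not confirmed by the excerpt):
      A: 16 kHz supported, SO 68 compliance already demonstrated
      B: 16 kHz supported, SO 68 compliance not demonstrated
      C: 8 kHz only, SO 68 compliance already demonstrated
      D: 8 kHz only, SO 68 compliance not demonstrated
    """
    if supports_16khz:
        return "A" if so68_compliant else "B"
    return "C" if so68_compliant else "D"


print(select_so70_suite(supports_16khz=False, so68_compliant=True))
```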
48. kbps | High, 12 dB | T-M
    a37 | EVRC-B 9.3 kbps | Nominal, 22 dB, 3% FER | M-M
    a38 | EVRC-B 9.3 kbps | Nominal, 22 dB, 3% FER | M-T
    a39 | EVRC-B 5.8 kbps | Nominal, 22 dB, 3% FER | M-M
    a40 | EVRC-B 5.8 kbps | Nominal, 22 dB, 3% FER | M-T
    2.2.2.3.2 Subjective Experiment II for SO 68
    The Test Parameters for Listening Experiment II are presented in Table 2.2.2.3.2-1.
    Table 2.2.2.3.2-1 SO 68 Listening Experiment II Test Parameters
    Condition | Description
    Type of test | P-NSA (P.835)
    Test conditions | (a) Car Noise, 15 dB S/N: 9.3, 5.8, 4.8 kbps; (b) Street Noise, 15 dB S/N: 9.3, 5.8 kbps; (c) Office Babble, 20 dB S/N: 9.3, 5.8 kbps
    Encoder/Decoder Combinations (4) | M-M, M-T, T-T, T-M
    The Test Conditions for Listening Experiment II are presented in Table 2.2.2.3.2-2.
    Table 2.2.2.3.2-2 SO 68 Listening Experiment II Test Conditions
    Label | Operating Point | Impairment Condition | Encoder/Decoder Combinations
    b01 | Reference | Car Noise 40 dB SNR | MNRU 40 dB
    b02 | Reference | Car Noise 20 dB SNR | MNRU 40 dB
    b03 | Reference | Car Noise 0 dB SNR | MNRU 40 dB
    b04 | Reference | Car Noise 40 dB SNR | MNRU 0 dB
    b05 | Reference | Car Noise 40 dB SNR | MNRU 20 dB
    b06 | Reference | Car Noise 10 dB SNR | MNRU 10 dB
    b07 | Reference | Car Noise 20 dB SNR | MNRU 20 dB
    b08 | Reference | Car Noise 30 dB SNR | MNRU 30
49. level source conditions for each mode. The measurement should be made using the STL 2000 tool [6], [6a] actlev, and must not show more than 0.5 dB deviation between input and output active speech levels. This procedure is fully described in [9].
    2.2.1.3 End-to-End Algorithmic Delay Recommendation
    The algorithmic delay for the specific EVRC-B test codec should be calculated analytically by the codec manufacturer. In considering the algorithmic delay, it can be assumed that all transmission channels have infinite bandwidth and that all processing elements have infinite throughput. Algorithmic delay is defined as the sum of all sequential filter delays and buffering delays in the encode/decode path. The maximum end-to-end algorithmic delay should be no greater than that of the master codec. For the master codecs defined in [1], the algorithmic delay is given as:
    Delay Element | SO 68
    Signal Preprocessing Delay | 3 milliseconds
    LPC Analysis Look-ahead | 10 milliseconds
    LPC Analysis Window | 20 milliseconds
    Total | 33 milliseconds
    Therefore, the total algorithmic delay imposed by an SO 68 test codec should not exceed 33 milliseconds.
    2.2.2 Subjective Performance Testing for SO 68
    This section outlines the methodology of the subjective performance test. The purpose of this testing is to evaluate the quality of the test codec under a variety
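The two objective checks above, the 0.5 dB unity-gain limit and the 33 ms delay budget, reduce to simple arithmetic. The Python sketch below is illustrative only: the names `unity_gain_ok` and `delay_ok` are hypothetical, and the active speech levels are assumed to have been measured beforehand (for example with the actlev tool mentioned in the text).

```python
# Delay elements for the SO 68 master codec, in milliseconds, as listed above.
DELAY_ELEMENTS_MS = {
    "signal_preprocessing": 3,
    "lpc_analysis_lookahead": 10,
    "lpc_analysis_window": 20,
}
MASTER_DELAY_MS = sum(DELAY_ELEMENTS_MS.values())


def delay_ok(test_codec_delay_ms):
    """The test codec's delay should not exceed the master codec's delay."""
    return test_codec_delay_ms <= MASTER_DELAY_MS


def unity_gain_ok(input_level_db, output_level_db, max_deviation_db=0.5):
    """Input and output active speech levels may differ by at most 0.5 dB."""
    return abs(output_level_db - input_level_db) <= max_deviation_db


print(MASTER_DELAY_MS, delay_ok(33), unity_gain_ok(-22.0, -21.6))
```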
50. noise conditions: ambient background noise, 20 dB SNR babble noise condition, 15 dB SNR car noise condition, and 12 dB SNR street noise. The background noise has been introduced by mixing the clean speech recording with the noise recording at the appropriate levels. The benchmark recording employed in the average data rate test is a single-sided recording, similar to a telephone conversation. It exhibits an approximate voice activity factor of 0.35. The processed files are not used in the subjective portion of the experiment. The length of each of the benchmark files is approximately 480 seconds.
    2.1.1.1.1 Average Data Rate Computation
    The average data rate for the test codec shall be computed for each of the benchmark files as follows:
    $$R = \frac{9600\,N_{1} + 4800\,N_{2} + 1200\,N_{8}}{N}$$
    where N_1 = number of frames encoded at Rate 1, N_2 = number of frames encoded at Rate 1/2, N_8 = number of frames encoded at Rate 1/8, and N = N_1 + N_2 + N_8.
    The total average data rate for the test codec is then given
    Footnote 2: This section does not apply whenever a codec has demonstrated bit-exactness. See 3.1.4, 3.2.4, 3.3.4, or 3.4.4.
    Ravg = .0833 x { R(babble noise segment, 12 dB) + R(car noise segment, 12 dB) + R(street noise segment, 12 dB) + R(ambient background segment, 12 dB) + R(babble noise segment, 22 d
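The per-file rate and the total average can be sketched in Python as follows. This is a minimal illustration: the function names are hypothetical, N is assumed to be the sum of the three frame counts, and the total is assumed to be an equal-weight average over the segments (the factor .0833 in the text is approximately 1/12).

```python
def file_rate_bps(n_full, n_half, n_eighth):
    """Average data rate, in bits per second, for one benchmark file.

    n_full, n_half, n_eighth: frames encoded at Rate 1, 1/2, and 1/8.
    Assumption: the total frame count N is the sum of the three counts.
    """
    n_total = n_full + n_half + n_eighth
    return (9600 * n_full + 4800 * n_half + 1200 * n_eighth) / n_total


def total_average_rate_bps(segment_rates):
    """Equal-weight average of the per-segment rates (assumed weighting)."""
    return sum(segment_rates) / len(segment_rates)


# Example: 35% of the frames at Rate 1 and the rest at Rate 1/8.
print(file_rate_bps(35, 0, 65))
```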
51. obtained for the MNRU conditions in any SO 70 validation test should be compared to those shown in the graph below. Inconsistencies beyond a small shift in the means in either direction, or a slight stretching or compression of the scale near the extremes, may imply a problem in the execution of the evaluation test. In particular, MOS should be monotonic with MNRU, within the limits of statistical resolution, and the contour of the relation should show a similar slope.
    [Figure 2.3.10.1-1 Typical Plot of MOS versus MNRU. Axes: MOS versus MNRU in dBQ, 10 to 50.]
    2.3.10.2 Reference Conditions for Experiments 2, 4, and 6
    Reference conditions for P.835 tests are constructed as a combination of SNR and MNRU processing to provide degradation in overall speech quality in two dimensions: signal distortion and background noise intrusiveness. Table 2.3.2.3.2-2 shows the eight reference conditions (b01-b08) involved in the P.835 Experiments 2, 4, and 6. In general, results are expected for these reference conditions such that the obtained score profiles are similar to those shown in Figure 2.3.10.2-1.
    [Figure 2.3.10.2-1. Panels: SNR = 40 dB, Car Noise (scores versus MNRU); MNRU = 40 dB (scores versus SNR, Car Noise). Series: SIG, OVRL. Axis: P.835 Score.]
52. of Reference for the MPS tests state that the mean score for each of the Test Encoder/Decoder Combinations (E/DC) should be "not worse than" the mean score for the Reference E/DC. For most of the test conditions involved in the subjective experiments there are three Test E/DCs (M-T, T-M, and T-T), which means there are three statistical tests against the Reference E/DC (M-M). The three statistical tests are not independent, however. Since they all involve the same ratings for the Reference E/DC, t-tests are not appropriate. The appropriate statistical test for multiple Test conditions against a common Reference condition is Dunnett's Test. A complete description of Dunnett's Test is contained in Appendix B. The critical value for the Dunnett's test is 2.09 (one-sided test, p < .05, 4 E/DCs, df = 93).

For those test conditions where a single Test E/DC (T-T) is compared against the Reference E/DC, the appropriate statistical test is Student's t-test. The critical value for the Student's t-test is 1.70 (one-sided test, p < .05, df = 31).

In both the Dunnett's Test and the t-test, the MPS test is evaluated by dividing the difference between the mean score for the Test E/DC and the mean score for the Reference E/DC by the Standard Error of the Mean Difference (SE_MD), as shown in Equation 2.4.9.2-1. If the resultant Test value is less than

Footnote 6: The appropriate t-test is a matched groups t-test and the SE_MD is based on the differences betw
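The single-comparison case above (one Test E/DC against the Reference E/DC) can be sketched as a matched-groups t statistic. This is an illustration only: the function names are hypothetical, the sign convention (Reference minus Test, so that larger values mean a worse Test E/DC) is an assumption, and so is the reading that a value below the critical value passes, since the text is cut off at that point. SE_MD is computed here from the per-listener differences, following the wording of footnote 6.

```python
import math
import statistics

T_CRITICAL = 1.70  # one-sided, p < .05, df = 31 (value quoted in the text)


def matched_t_value(reference_scores, test_scores):
    """Matched-groups t value for per-listener Reference and Test scores."""
    if len(reference_scores) != len(test_scores):
        raise ValueError("scores must be paired listener by listener")
    if len(reference_scores) < 2:
        raise ValueError("at least two listeners are required")
    # Assumed sign convention: positive differences mean the Test E/DC scored lower.
    diffs = [r - t for r, t in zip(reference_scores, test_scores)]
    mean_diff = statistics.fmean(diffs)
    se_md = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if se_md == 0.0:
        return 0.0 if mean_diff == 0.0 else math.copysign(math.inf, mean_diff)
    return mean_diff / se_md


def passes_mps(t_value, critical=T_CRITICAL):
    """Assumed pass rule: the Test value must stay below the critical value."""
    return t_value < critical
```

With 32 listeners the degrees of freedom are 31, which matches the df quoted for the t-test.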
53. on Win32 systems, which will be placed in the bin directory. The included makefiles may require some user modification for a particular hardware platform and/or operating system.

There are two options for compiling the fixed-point EVRC simulation. One option uses the 31-bit long multiply DSP math library and the other uses the 32-bit library. A parallel set of bit-exact test vectors is provided so that a CODEC may qualify as bit-exact using either library. By default, the DSP math library compiles the 32-bit long multiply routines. In order to compile with the 31-bit long multiply routines, the following lines in so3/simul/fixed/dspmath/makefile must be commented/uncommented.

Change from (32-bit library):

    #Uncomment the following line to use alternate double precision multiplies
    #CCAUXFLAGS=-DUSE_ALT_DP31
    #comment the following line out
    CCAUXFLAGS=

to (31-bit library):

    #Uncomment the following line to use alternate double precision multiplies
    CCAUXFLAGS=-DUSE_ALT_DP31
    #comment the following line out
    #CCAUXFLAGS=

3.1.4.3 Running the Fixed-Point Codec Simulation

The EVRC executable files use command-line arguments to receive all information regarding input and output files and various parameters used during execution. Executing EvrcFix with no command-line arguments will display a brief description
54. op4 p32 4 8 kHz sampling src s22 8k EVRC WB operating point Nominal 22 dB evrc wb op4 dim 196 p22 4 8 kHz sampling 196 d amp b src s22 8k EVRC WB operating point Nominal 22 dB evrc wb op7 p22 7 8 kHz sampling src c15 8k EVRC WB operating point Nominal 22 dB evrc wb op4 pc 4 8 kHz sampling 15 dB car noise src s15 8k EVRC WB operating point Nominal 22 dB evrc wb op4 ps 4 8 kHz sampling 15 dB street noise src b20 8k EVRC WB operating point Nominal 22 dB evrc wb op4 pb 4 8 kHz sampling 20 dB babble noise src c15 8k EVRC WB operating point Nominal 22 dB evrc wb op7 pc 7 8 kHz sampling 15 dB car noise Table 3 3 4 5 5 SO 70 Suite B Decoder Bit exact Test Conditions Reference output speech 16 kHz sampling 3 29 Input Packet File Operating Point Condition files for bit exact compliance evrc wb opoO fer 3 22 EVRC WB Nominal 22 dB evrc wb opoO fer 396 022 operating point 0 396 FER 16 kHz sampling evrc wb opoO fer 196 pls 196 p22 EVRC WB Nominal 22 dB evrc wb opo operating point 0 3 FER fer_1 pls_1 022 16 kHz sampling evrc_wb_op0 p12 EVRC WB High 12 dB evrc wb op0 012 operating point 0 16 kHz sampling evrc wb opO dim 196 p32 EVRC WB Low 32 dB evrc wb opO0 dim 196 032 operating point O 196 d amp B C S0018 D v1 0 Input Packet File evrc wb opO pc1 Operating Point EVRC WB operating point O 16 kHz sampling Condition
55. output of the D/A converter and the output of the headphone:

1. Frequency response shall be flat to within 2 dB between 50 Hz and 7000 Hz, and below 50 Hz the response shall roll off at a minimum of 12 dB per octave. Equalization may be used in the audio path to achieve this. A suitable reconstruction filter shall be used for playback.
2. Total harmonic distortion shall be less than 1% for signals between 50 Hz and 8000 Hz.
3. Noise over the audio path shall be less than 35 dBA measured at the ear reference plane of the headphone.
4. Signal shall be delivered to the headphone on the listener's preferred telephone listening ear, and the other ear shall be uncovered. No signal shall be delivered to the other headphone.

The audio path for narrowband test conditions (Experiments 3 and 4) must meet the following requirements for electro-acoustic performance measured between the output of the D/A converter and the output of the headphone:

1. Frequency response shall be flat to within 2 dB between 200 Hz and 3400 Hz, and below 200 Hz the response shall roll off at a minimum of 12 dB per octave. Equalization may be used in the audio path to achieve this. A suitable reconstruction filter shall be used for playback.
2. Total harmonic distortion shall be less than 1% for signals between 100 Hz and 4000 Hz.
3. Noise over the audio path shall be less than 35 dBA measured at the ear reference plane of the headphone.
4. Signal shall be delivered to
56. point 1 Nominal 22 dB evrc nw opi pc 8 kHz sampling 15 dB car noise src s15 8k EVRC NW operating point 1 Nominal 22 dB evrc nw opi ps 8 kHz sampling 15 dB street noise src b20 8k EVRC NW operating point 1 Nominal 22 dB evrc nw opi pb 8 kHz sampling 20 dB babble noise src c15 8k EVRC NW operating point 6 Nominal 22 dB evrc nw op6 pc 8 kHz sampling 15 dB car noise src s15 8k EVRC NW operating point 6 Nominal 22 dB evrc nw op6 ps 8 kHz sampling 15 dB street noise src b20 8k EVRC NW operating point 6 Nominal 22 dB evrc nw op6 pb 8 kHz samplin 20 dB babble noise src c15 8k EVRC NW operating point 7 Nominal 22 dB evrc nw op7 pc 8 kHz sampling 15 dB car noise Table 3 4 4 5 5 SO 73 Suite B Decoder Bit exact Test Conditions 3 44 Reference output Input Packet File Operating Point Condition speech files for bit exact compliance evrc nw opO fer 396 p EVRC NW Nominal 22 dB evrc nw opoO fer 396 22 operating point O 396 FER 022 8k 8 kHz sampling evrc nw opO0 p12 EVRC NW High 12 dB evrc nw op0 012 8k operating point 0 8 kHz sampling evrc_nw_op0 p32 EVRC NW Low 32 dB evrc_nw_op0 032 8k operating point 0 8 kHz sampling evrc_nw_op0 dim_1 EVRC NW operating Nominal 22 dB evrc_nw_op0 dim_1 pls 196 p22 point 0 8 kHz sampling 196 d amp b 196 pls pls 196 022 8k evrc nw opO pc EVRC NW Nominal 22 dB evrc nw opO oc 8k operating point O 15 dB car nois
57. span approximately the same range of quality, the MOS results for similar conditions should be approximately the same. Data from previous studies allows a generalization to be made concerning the expected MOS results for the MNRU reference conditions (see Figure 2.1.11-1). MOS scores obtained for the MNRU conditions in any SO 3 validation test should be compared to those shown in the graph below. Inconsistencies beyond a small shift in the means in either direction, or a slight stretching or compression of the scale near the extremes, may imply a problem in the execution of the evaluation test. In particular, MOS should be monotonic with MNRU, within the limits of statistical resolution, and the contour of the relation should show a similar slope.

[Figure 2.1.11-1: MOS versus MNRU; x-axis dBQ (10 to 50), y-axis MOS]

2.2 Performance Testing for SO 68

2.2.1 Objective Performance Testing for SO 68

The objective testing portion of this specification consists of an average data rate test and compliance to End-to-End Algorithmic Delay and Unity-gain requirements.

2.2.1.1 Average Data Rate Test

The average data rate for the test codec shall be measured using six source speech files that are contained in the so68/subjctv/exp/source directory. Each file exhibits a different condition: power levels (-12 dB, -22 dB, and -32 dB) and backgro
58. talker never being presented on consecutive trials.

Table 2.3.5-1 shows an example randomization for a single listening panel. Each entry in the table is the file name for a sample, with the following file-naming convention: xxyy zzz, where xx is the talker, yy is the sample, and zzz is the test condition.

Table 2.3.5-1 Example Randomization for the Experiments 1, 3, and 5 ACR Test

804454 8102251 2142252 8197356 atr masi aiomisi ad6mis7 a07m351 adimis6 a12mis8 2021457 a21m4s3 8 aoom262 a05m3s5 a17m si a13mis2 2062354 a05 3s7 18 1 6 a18 2s8 is ai7mzss 11 1 2 a01m4s3 2241456 a20mas7 a13m282 2091354 a04m3s5 a 4857 al6fls2 al 4457 1455 a20mis4 2112351 a a22f3s3 352 al9f2s7 a09f1s7 al5f3s8 a20f2s2 a02m3s1 a21m3s4 21 152 a03m3s4 al7m3s4

The randomization lists for each of the eight listening panels for each experiment are provided in so70 subjctv exp data play lst.

2.3.6 Presentation

Presentation of speech materials for the SO 70 codec listening tests shall be made with one side of high-fidelity supra-aural headphones, with the other ear uncovered. The speech material delivery system shall meet the requirements of Section 3.3.1.1. The listeners should be seated in a quiet room, with an ambient no
59. test, M-M only
d10  Car 15 dB SNR  Mode 0 LB portion of Wideband mode decoder test, M-T only
d11  Car 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)
d12  Car 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  M-T
d13  Car 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  T-T
d14  Car 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  T-M
d15  Car 15 dB SNR  Mode 7 (interoperable with Mode 0 of SO 68 support)  T-T
d16  Car 15 dB SNR  Mode 7 (interoperable with Mode 0 of SO 68 support)  T-M
d17  Car 15 dB SNR  Mode 7 (interoperable with Mode 0 of SO 68 support)
d18  Car 15 dB SNR  Mode 7 (interoperable with Mode 0 of SO 68 support)  M-T
d19  Street 15 dB SNR  Mode 0 LB portion of Wideband mode decoder test, M-M only
d20  Street 15 dB SNR  Mode 0 LB portion of Wideband mode decoder test, M-T only
d21  Street 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  M-M
d22  Street 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  M-T
d23  Street 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  T-T
d24  Street 15 dB SNR  Mode 1 (interoperable with Mode 0 of SO 68 support)  T-M

2.4.2.3.5 Subjective Experiment 5 for SO 73

The Test Parameters for Listening Experiment 5 are presented in Table 2.4.2.3.5-1.

Table 2.4.2.3.5-1 SO 73 Listening Experiment 5 Test Parameters

Type of test: ACR (P.8
60. test samples has been accomplished with the following constraints for each of the two experiments:

1. A trial, i.e., a test sample, for the combination of each test condition and each talker shall be presented exactly once to each listening panel (i.e., trials per panel = conditions x talkers).
2. Randomization is in "blocks", such that one sample of each test condition is presented once, with a randomly selected talker, in each block. This ensures that listeners rate each test condition equally often in the initial, middle, and final parts of the block and controls for the effects of time and order of presentation. A block contains the same number of samples as there are test conditions involved in the test. A test "session" consists of the same number of blocks as there are talkers involved in the test. Each session is presented to a listening panel of four listeners.
3. Randomizations are constructed such that talker gender is alternated on successive trials, resulting in the same talker never being presented on consecutive trials.

Table 2.4.5-1 shows an example randomization for a single listening panel. Each entry in the table is the file name for a sample, with the following file-naming convention: xxyy zzz, where xx is the talker, yy is the sample, and zzz is the test condition.

Table 2.4.5-1 Example Randomization for the Experiments 1 and 3 ACR Test

804454 8102251 2142252 8197356 atr masi aiomisi ad6mis7 a07m351 adimis6
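The three constraints above can be met by many constructions. The sketch below shows one of them and is not the procedure used to produce the randomization lists supplied with the standard: talkers are assigned to conditions cyclically from block to block, so that every condition/talker pair occurs exactly once, and the trials inside each block are shuffled within gender and then interleaved. It assumes equal numbers of male and female talkers, and all names are illustrative.

```python
import random


def build_randomization(conditions, male_talkers, female_talkers, seed=None):
    """Return a list of blocks; each block is a list of (talker, condition) trials."""
    if not conditions or not male_talkers or len(male_talkers) != len(female_talkers):
        raise ValueError("need conditions and equal, non-zero numbers of male and female talkers")
    rng = random.Random(seed)
    males, females = list(male_talkers), list(female_talkers)
    rng.shuffle(males)
    rng.shuffle(females)
    # Even positions hold male talkers, odd positions hold female talkers.
    talkers = [t for pair in zip(males, females) for t in pair]
    conds = list(conditions)
    rng.shuffle(conds)

    blocks = []
    for b in range(len(talkers)):  # one block per talker
        male_trials, female_trials = [], []
        for c, cond in enumerate(conds):
            idx = (c + b) % len(talkers)  # cyclic shift: each pair occurs once overall
            target = male_trials if idx % 2 == 0 else female_trials
            target.append((talkers[idx], cond))
        rng.shuffle(male_trials)
        rng.shuffle(female_trials)
        # Start with the larger gender group so alternation holds across block boundaries.
        if len(female_trials) > len(male_trials):
            first, second = female_trials, male_trials
        else:
            first, second = male_trials, female_trials
        block = []
        for i, trial in enumerate(first):
            block.append(trial)
            if i < len(second):
                block.append(second[i])
        blocks.append(block)
    return blocks
```

Because gender alternates on every successive trial, the same talker can never appear on two consecutive trials, which is the consequence stated in constraint 3.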
61. the beginning. Thank you for participating in this research.

Figure 2.1.8-1 Instructions for Listeners

2.1.9 Analysis of Results

The response data from the practice blocks shall be discarded. Data sets with missing responses from listeners shall not be used. Responses from the different sets of encoder/decoder processed files shall be treated as equivalent in the analysis. The votes for each of the 31 conditions and references for each of SO 3 Experiments I and II shall be averaged in accordance with [10] to produce an associated mean opinion score (MOS). Additionally, the standard error (SER) for each condition shall be calculated as described in the next section.

2.1.10 Minimum Subjective Requirement

For each of the test combinations (T-M, M-T, T-T), the MOS results are compared to those of the respective master codec (M-M). The exception to this is the 3% FER case, in which M-T is compared to M-M and T-T is compared to T-M (footnote 3). If the MOS for the test combination/condition is within an allowable difference (as defined below) of the MOS for the master combination/condition, then the subjective test is passed for that combination/condition. If any of the test combinations/conditions exceeds the maximum allowable difference, the test codec fails the compliance test. These requirements can be clarified by first defining the MOS for a given combination/condition as: 0 e MOS jk gt 2 Xj kn je 1
62. the ear reference plane of the headphone.
4. Signal shall be delivered to the headphone on the listener's preferred telephone listening ear, and the other ear shall be uncovered. No signal shall be delivered to the other headphone.

3.3.1.2 Calibration

The audio circuit shall deliver an average sound level of the stimuli to the listener at -18 dBPa (76 dB SPL) at the ear reference plane. This level was chosen because it is equivalent to the level delivered by a nominal ROLR handset driven by the average signal level on the PSTN network. This level may be calibrated using a suitable artificial ear with circum-aural headphone adapter and microphone. A test file with a reference signal is included with the source speech database for the purpose of calibration. The file cal_1004.16k is located in the directory so70/cal of the companion software. The calibration file contains a -22 dB 1004 Hz reference signal. The audio circuit shall be calibrated so that the test signal has a level of -15 dBPa at the ear reference plane, while maintaining compliance with Section 3.3.1.1.

3.3.2 Software Test Tools for SO 70

This section describes a set of software tools useful for performing the MPS tests. The code has been developed and compiled using the GNU g++ compiler and software maintenance utilities. The tools have been verified under various repre
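The two level units used in the calibration text are related by a fixed offset, since 0 dB SPL is referenced to 20 micropascals and 0 dBPa to 1 pascal. The short sketch below (function name illustrative) shows the conversion and why the stimulus level in dBPa corresponds to 76 dB SPL.

```python
import math

REFERENCE_PRESSURE_PA = 20e-6  # 0 dB SPL reference pressure in air


def dbpa_to_dbspl(level_dbpa):
    """Convert a level in dB re 1 Pa to dB SPL (dB re 20 micropascals)."""
    return level_dbpa + 20.0 * math.log10(1.0 / REFERENCE_PRESSURE_PA)
```

The offset is about 93.98 dB, so a level of -18 dBPa is 75.98 dB SPL, which rounds to the 76 dB SPL quoted in the text.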
63. the headphone on the listener's preferred telephone listening ear, and the other ear shall be uncovered. No signal shall be delivered to the other headphone.

3.4.1.2 Calibration

The audio circuit shall deliver an average sound level of the stimuli to the listener at -18 dBPa (76 dB SPL) at the ear reference plane. This level was chosen because it is equivalent to the level delivered by a nominal ROLR handset driven by the average signal level on the PSTN network. This level may be calibrated using a suitable artificial ear with circum-aural headphone adapter and microphone. A test file with a reference signal is included with the source speech database for the purpose of calibration. The file cal_1004.16k is located in the directory so73/cal of the companion software. The calibration file contains a -22 dB 1004 Hz reference signal. The audio circuit shall be calibrated so that the test signal has a level of -15 dBPa at the ear reference plane, while maintaining compliance with Section 3.4.1.1.

3.4.2 Software Test Tools for SO 73

This section describes a set of software tools useful for performing the MPS tests. The code has been developed and compiled using the GNU g++ compiler and software maintenance utilities. The tools have been verified under various representative operating systems on a number of different hardware platforms. The
64. which is described in Section 3.1.3 (SO 3), Section 3.2.3 (SO 68), Section 3.3.3 (SO 70), or Section 3.4.3 (SO 73) is used as part of the interoperability testing.

1.2 Definitions

Base Station: A station in the Domestic Public Cellular Radio Telecommunications Service, other than a mobile station, used for radio communications with mobile stations.

Bit-Exact: A test procedure for codecs by which a set of prescribed vectors are input to the test codecs and output vectors from the codecs correspond exactly, bit for bit, with output vectors prescribed by this standard.

CELP: Code Excited Linear Predictive Coding. This technique uses codebooks to vector quantize the excitation (residual) signal of a Linear Predictive Codec (LPC).

Circum-aural Headphones: Headphones that surround and cover the entire ear.

Codec: The combination of an encoder and decoder in series (encoder/decoder).

Compand: The process of compressing and expanding a signal. In this text, the process is described in terms of µ-Law PCM [7].

dB: Normally taken to be defined as X dB = 20 log10(x). In the context of digitized speech, the unit dB is used to represent the average power level of a speech signal with respect to full scale. For the purposes of this document, full scale is defined as the maximum sinusoidal input level which does not result in clipping, where 0 dB corresponds to the output level measured according to [9] for a full-scale 1 kHz sinusoidal input. Thi
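To make the dB definition above concrete, the sketch below computes the level of a block of samples relative to a full-scale sinusoid. It is a simplification: it uses a plain RMS measure rather than the measurement method of [9], and it assumes 16-bit samples with a peak value of 32767. Both assumptions, and the function name, are illustrative.

```python
import math


def level_db_re_full_scale(samples, full_scale_peak=32767.0):
    """RMS level in dB, where 0 dB is the RMS of a full-scale sinusoid."""
    samples = list(samples)
    if not samples:
        raise ValueError("no samples supplied")
    mean_square = sum(s * s for s in samples) / len(samples)
    if mean_square == 0.0:
        return -math.inf  # digital silence
    reference_rms = full_scale_peak / math.sqrt(2.0)  # RMS of a full-scale sine
    return 20.0 * math.log10(math.sqrt(mean_square) / reference_rms)
```

Under these assumptions a full-scale 1 kHz tone measures 0 dB, and the same tone at half amplitude measures about -6.02 dB.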
65. with the MIRS receive filter mask An STL tool astrip is also used to split the concatenated files into the individual samples appropriate for the experiment Table 2 2 4 5 1 shows the cutting points to be used with the astrip tool for producing the two sentence samples for the Experiment ACR test Table 2 2 4 5 2 shows the cutting points to be used with the astrip tool for producing the single sentence sub samples for the Experiment Il P 835 test Table 2 2 4 5 3 shows the sub samples that make up the samples i e sentence triads for the P 835 test 2 25 C S0018 D v1 0 Table 2 2 4 5 1 Cutting Points for the astrip Software Tool for the Experiment ACR Test Experiment I ACR Sentence ar eng Sentence ar eng pair Sample sanples pair Sample samples 51712 40 faps 54016 N E 50432 a 50688 a 56320 2 2p2 563713 44 51712 13 51456 56576 F3p 15824 52480 50944 1 7 1s mp3 9108495017651 2 7 2 7 53760 H w ojo Ko a SJ HS BA OV CO p ge 1 p N N 1 4 1222145 2876673 49408 1269249 47104 f1p8 2926081 47616 5 54272 50432 1525249 1581569 6 6 m 1316353 m a w gt www NINININI NIe ojlo o AY WwW hd 1170177 51968 56 f4p7 2823937 52736 P f4p
66. 00 Wideband
Number of genres:
Background noise: none (ambient)
Audio Input Level: -22 dB
Filter characteristics: P.341 (refer Section 3.3.2.4)
Reference conditions: (4) Specified reference conditions
Test conditions: 0% FER and 3% FER
Encoder/Decoder Combinations: (2) M-M, M-T

Exp. 4 Narrowband P.835
d25  Babble 20 dB SNR, 2% FER  Mode 0 LB portion of Wideband mode decoder test, M-M only
d26  Babble 20 dB SNR, 2% FER  Mode 0 LB portion of Wideband mode decoder test, M-T only
d27  Babble 20 dB SNR, 2% FER  Mode 1 (interoperable with Mode 0 of SO 68 support)
d28  Babble 20 dB SNR, 2% FER  Mode 1 (interoperable with Mode 0 of SO 68 support)  M-T
d29  Babble 20 dB SNR, 2% FER  Mode 1 (interoperable with Mode 0 of SO 68 support)  T-T
d30  Babble 20 dB SNR, 2% FER  Mode 1 (interoperable with Mode 0 of SO 68 support)  T-M
d31  Car 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  M-M
d32  Car 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  M-T
d33  Car 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  T-T
d34  Car 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  T-M
d35  Street 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  M-M
d36  Street 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  M-T
d37  Street 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  T-T
d38  Street 15 dB SNR  Mode 6 (interoperable with Mode 6 of SO 68 support)  T-M
d39  Babble 20dB SN
67. 1 ANOVA. The equation for computing SE_MD is shown in Equation 5.2-2, where MS_R is the Residual Mean Square from the ANOVA (Table 5.1-1).

x ES   (5.2-1)

SE_MD = sqrt(2 x MS_R / subjects)   (5.2-2)

For each Test CC, the computed value of Dc is compared to critical values of the Dunnett's statistic, where the parameters are:
o criterion probability: p < .05
o total number of CCs: 4
o degrees of freedom for the MS_R: df = 93
o Dunnett's critical value: 2.09

6 PROCESSING BLOCKS FOR SO 68, SO 70, AND SO 73

6.1 Nominal Level and Noise Processing

[Block diagram: input speech file (src s22, c15, b20, s15) -> Master/Test encoder at a given operating point -> output packet file -> Master/Test decoder -> output speech file]

6.2 FER Processing

[Block diagram: input speech file (src s22, c15, b20, s15) -> Master/Test encoder at a given operating point -> output packet file -> fersig with erasure mask fer_3%.bin -> packet file with erasures -> Master/Test decoder -> output speech file]

6.3 Low-level and Signaling Processing

[Block diagram: only the labels are legible: input speech file (src s32), scaldemo, Master/Test encoder at operating point with signaling and dimming, packet file, Master/Test decoder, output speech file]
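The Stage 2 comparison can be sketched as follows, using Equation 5.2-2 for SE_MD and the parameter values listed above. The function names are hypothetical. The sign convention (Reference mean minus Test mean) and the pass rule (a value below the critical value passes) are assumptions, since Equation 5.2-1 is not legible in this excerpt.

```python
import math

DUNNETT_CRITICAL = 2.09  # p < .05, 4 CCs, df = 93 (values quoted in the text)


def se_mean_difference(ms_residual, n_subjects):
    """SE_MD = sqrt(2 * MS_R / subjects), as in Equation 5.2-2."""
    if ms_residual < 0 or n_subjects <= 0:
        raise ValueError("MS_R must be non-negative and subjects must be positive")
    return math.sqrt(2.0 * ms_residual / n_subjects)


def dunnett_values(reference_mean, test_means, ms_residual, n_subjects):
    """Map each Test CC name to (reference mean - test mean) / SE_MD (assumed sign)."""
    se_md = se_mean_difference(ms_residual, n_subjects)
    if se_md == 0.0:
        raise ValueError("SE_MD is zero; the statistic is undefined")
    return {name: (reference_mean - mean) / se_md for name, mean in test_means.items()}


def failing_ccs(values, critical=DUNNETT_CRITICAL):
    """Names of the Test CCs whose value reaches or exceeds the critical value."""
    return sorted(name for name, d in values.items() if d >= critical)
```

As a consistency check on the quoted parameters, 4 CCs rated by 32 listeners give (4 - 1) x (32 - 1) = 93 residual degrees of freedom in a conditions-by-subjects ANOVA; that reading of the design is an assumption here.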
68. 16  Car 20 dB SNR, 2% FER  T-M
b17  Street 15 dB SNR  M-M
b18  Street 15 dB SNR  M-T
b19  Street 15 dB SNR  T-T
b20  Street 15 dB SNR  T-M
b21  Babble 20 dB SNR  M-M
b22  Babble 20 dB SNR  M-T
b23  Babble 20 dB SNR  T-T
b24  Babble 20 dB SNR  T-M

Condition: Description
Type of test: ACR (P.800), Narrowband
Number of talkers: 4 males, 4 females
Background noise: none (ambient)
Audio Input Level: -22 dB, -32 dB, -12 dB
Filter characteristics: MIRS
Reference conditions: (8) Specified reference conditions
Test conditions: Nominal level, Modes 0, 4, 7; Low level, Modes 0, 4; High level, Mode 0, 4; 1% d&b, 1% pls, Modes 0, 4; 3% FER, Modes 0, 4
Encoder/Decoder Combinations: (4) M-M, M-T, T-T, T-M

The Test Conditions for Listening Experiment 3 are presented in Table 2.4.2.3.3-2.

Table 2.4.2.3.3-2 SO 73 Listening Experiment 3 Test Conditions

Label  Operating Point  Condition  E/DC
c01  Reference  MNRU 5 dB
c02  Reference  MNRU 10 dB
c03  Reference  MNRU 15 dB
c04  Reference  MNRU 20 dB
c05  Reference  MNRU 25 dB
c06  Reference  MNRU 30 dB
c07  Reference  MNRU 35 dB
c08  Reference  Direct
c09  Mode 1  Nominal, -22 dB  M-M
c10  Mode 1  Nominal, -22 dB  M-T
c11  Mode 1  Nominal, -22 dB  T-T
c12  Mode 1  Nominal, -22 dB  T-M
c13  Mode 6  Nominal, -22 dB  M-M
c14  Mode 6  Nominal, -22 dB  M-T
c15  Mode 6  Nominal, -22 dB  T-T
c16
69. 2.2 Method of Measurement

The subjective tests involve a listening-only assessment of the quality of the codec being tested, using the master codec as a reference. Subjects from the general population of telephone users will rate the various conditions of the test. Material supplied with this standard for use with this test includes source speech, impaired packet files from the master codec encoder, and source speech processed by various Modulated Noise Reference Unit (MNRU) conditions and other references. The basic Absolute Category Rating test procedure involves rating all conditions using a five-point scale describing the opinion of the test condition. This procedure is fully described in [10]. The P.835 test method involves rating all conditions on scales of Signal, Background, and Overall quality and is fully described in [13].

2.3.2.3 Test Conditions and Test Design for SO 70

Listening experiments 1, 3, and 5 for SO 70 are performed as ACR listening tests. Experiments 2, 4, and 6 for SO 70 are performed as P.835 listening tests.

2.3.2.3.1 Subjective Experiment 1 for SO 70

The Test Parameters for Listening Experiment 1 are presented in Table 2.3.2.3.1-1.

Condition: Description
MOS (P.800), Wideband
Number of talkers: 4 males, 4 females
Reference conditions: (8) Specified reference conditions
Test conditions: Low Audio Input Level -32 dB, 1% d&b; Nominal Audio Input Level -22 dB; High Audio Input Level -12 dB; 3% FER and 1% F
70. Detailed descriptions of all processing operations are given in Section 6.

2.4.4.1 Encoding by the Test Codec

All of the source files will be encoded by the test codec to produce encoded packet files. For ease of reference, it is recommended that directories so73/subjctv/exp/t_pkt be created to deposit the test encoder output packets, and that the naming conventions be made consistent with the master codec.

2.4.4.2 Decoding by the Master/Test Codecs

The encoded packet files generated from the various encoders/conditions shall be processed through the master and test decoders.

2.4.4.3 Introduction of Impairments

For the frame error conditions, the impaired master codec encoded packet files are provided in the so73/subjctv/exp/m_pkt directory. Unlike other conditions, this condition uses only the test decoder and not the test encoder.

For the Dim-and-Burst processing, and also the Packet Level Signaling conditions, the processing requires inputs from a signaling file to control maximum encoding rate. An external software utility (Evrc_nw_iwf in Section 3.4.2.3) is also needed to reduce the data rate of certain packets from full rate to half rate. Details of these operations are given in Section 6. The signaling file and other utilities are provided in the so73/tools directory.

2.4.4.4 Ensuring Proper Encoded Frame Packet Files

All encoded frame packet files shall
71. 3 Composition of the Sentence Triad Samples for the Experiment II P.835 Test

Sentence triad | Sentence 1 | Sentence 2 | Sentence 3

2.2.5 Randomization

For each of the two subjective experiments, each presentation sample consists of a speech sample processed under a condition of the test. For the ACR Experiment I, the sample consists of a pair of concatenated sentences of approximately 8 sec duration. For the P.835 Experiment II, the sample consists of three sub-samples, where each sub-sample is a single sentence of approximately 4 sec duration. The samples shall be presented to the listeners in a randomized presentation order. The listeners for each file set shall be presented with practice trials for subjective Experiments I and II. The randomization of the test samples has been accomplished with the following constraints for the two experiments:

1. A trial, i.e., a test sample, for the combination of each test condition and each talker shall be presented exactly once to each listening panel (i.e., trials per panel = conditions x talkers).
2. Randomization is in "blocks", such that one sample of each test condition is presented once, with a randomly selected talker, in each block. This ensures that listeners rate each test condition equally often in the initial, middle, and final parts of the block and controls for the effects of time and order of presentation. A block contains the same n
72. 32 dB  M-M
a12  EVRC-A  Low, -32 dB  M-T
a13  EVRC-A  Low, -32 dB  T-M
a14  EVRC-A  Low, -32 dB  T-T
a15  IS-96-C  Low, -32 dB  R-R
a16  EVRC-A  3% FER (For. & Rev.)  M-M
a17  EVRC-A  3% FER (For. & Rev.)  M-T
a18  EVRC-A  3% FER (For. & Rev.)  T-M
a19  EVRC-A  3% FER (For. & Rev.)  T-T
a20  IS-96-C  3% FER (For. & Rev.)  R-R
a21  EVRC-A HR Max  Nominal, -22 dB  M-M
a22  EVRC-A HR Max  Nominal, -22 dB  M-T
a23  EVRC-A HR Max  Nominal, -22 dB  T-M
a24  EVRC-A HR Max  Nominal, -22 dB  T-T

Label  Operating Point  Condition  Enc-Dec Connection
a25  IS-96-C HR Max  Nominal, -22 dB  R-R
a26  Reference  MNRU 5 dB
a27  Reference  MNRU 15 dB
a28  Reference  MNRU 20 dB
a29  Reference  MNRU 25 dB
a30  Reference  G.728
a31  Reference  µ-Law Source

2.1.2.3.2 Subjective Experiment II for SO 3

The Test Conditions for Listening Experiment II are presented in Table 2.1.2.3.2-1.

Table 2.1.2.3.2-1 SO 3 Listening Experiment II Conditions

Test conditions: (1) Clean; (2) Car Noise (IRS) at 15 dB S/N; (3) Street Noise (flat) at 12 dB S/N; (4) Office Babble (flat) at 20 dB S/N; (5) Tandem
Number of codecs: (5) M-M, T-M, T-T, IS-96-C
Encoding stages: single and tandem

The Test Design for Listening Experiment II is presented in Table 2.1.2.3.2-2.

Table 2.1.2.3.2-2 SO 3 Listening Experiment II Design

Label  Operating Point  Condition  Enc-Dec Connection
b01  EVRC-A  Clean, Nomina
73. 3GPP2 C.S0018-D
3RD GENERATION PARTNERSHIP PROJECT 2 (3GPP2)
Version 1.0
Date: January 25, 2010

Minimum Performance Specification for the Enhanced Variable Rate Codec, Speech Service Options 3, 68, 70, and 73 for Wideband Spread Spectrum Digital Systems

© 2010 3GPP2. 3GPP2 and its Organizational Partners claim copyright in this document and individual Organizational Partners may copyright and issue documents or standards publications in individual Organizational Partner's name based on this document. Requests for reproduction of this document should be directed to the 3GPP2 Secretariat at secretariat@3gpp2.org. Requests to reproduce individual Organizational Partner's documents should be directed to that Organizational Partner. See www.3gpp2.org for more information.

REVISION HISTORY

C.S0018-0 v1.0: Minimum Performance Specification for the Enhanced Variable Rate Codec, Speech Service Option 3 for Spread Spectrum Digital Systems
C.S0018-B v1.0: Minimum Performance Specification for the Enhanced Variable Rate Codec, Speech Service Options 3 and 68 for Spread Spectrum Digital Systems
C.S0018-C v1.0: Minimum Performance Specification for the Enhanced Variable Rate Codec, Speech Service Options 3, 68, and 70 for Spread Spectrum Digital Systems
C.S0018-D v1.0: Minimum Performance Specification for the Enhanced Variable Rate Codec, Speech Service Options 3, 68, 70, and 73 for Spr
74. 3GPP2-supplied tools are all located in the so73/tools directory in the associated Software Distribution, and can be built using the GNU g++ compiler. Other software tools such as scaldemo, actlev, filter, and astrip are available in [6].

3.4.2.1 Channel Model Utilities: fersig29.exe

This utility program provides:
d) the ability to introduce Frame Erasure channel impairment;
e) the ability to verify use of half-rate or lesser frame rate during dim-and-burst and packet level signaling;
f) the ability to measure the Average Data Rate from an encoded packet file. A log output of fersig29 provides detail on the ADR performance of the preceding encoder.

In these applications, the utility is invoked as in the following examples for 3% FER and 1% signaling:

    fersig29 -c EVRC-NW -e fer_3%.bin infile outfile
    fersig29 -c EVRC-NW -s dim_1% -e fer_3%.bin infile outfile

3.4.2.2 Channel Error and Signaling Masks

These binary Frame Error Rate and Signaling masks (source level and packet level; 1 byte of either 0 or 1 per frame) are used with the fersig29 channel-impairment and inter-working simulation functions for the various conditions:

    fer_3%.bin
    dim_1%.bin
    dim_1%_pls.bin

3.4.2.3 EVRC-NW Interworking Function (IWF)

The software Evrc_nw_iwf.cc can be compiled to yield a simulation utility, Evrc_nw_iwf, with usage defined as:

    Evrc_nw_iwf -s signaling_mask_file -i encoded_packet_file -o dimmed_packet_file

where Evrc_nw_iwf co
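Since the masks are described as one byte of either 0 or 1 per frame, a mask can be sanity-checked with a few lines of Python. The sketch below is not part of the distributed tools: the function names are hypothetical, and treating a byte value of 1 as "frame flagged" (erased or signaled) is an assumption that the excerpt does not confirm.

```python
def mask_rate(mask_bytes):
    """Fraction of frames flagged in a mask holding one byte (0 or 1) per frame."""
    if not mask_bytes:
        raise ValueError("mask is empty")
    if any(b not in (0, 1) for b in mask_bytes):
        raise ValueError("mask bytes must be 0 or 1")
    # Assumption: a byte value of 1 marks a flagged frame.
    return sum(mask_bytes) / len(mask_bytes)


def load_mask(path):
    """Read a binary mask file into a bytes object."""
    with open(path, "rb") as handle:
        return handle.read()
```

For a 3% frame erasure mask, `mask_rate(load_mask(path))` would be expected to come out close to 0.03 under the stated assumption.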
75. 4 e 8 Votes per Condition Lose 192 256 192 oo oo

2.4.3 Speech Material for SO 73 Testing

The source speech files used for SO 73 compliance testing consist of Harvard sentence pairs, which are preprocessed to include proper level adjustment and noise mixing for use in the subjective experiments. The talkers used in these files consist of adult males and adult females, and are native speakers of North American English.

For the following discussion, it may be useful to refer to Table 4-3 for the composition of the Software Distribution database.

The source speech material for subjective Experiments is contained in directory so73/subjctv/exp/source. Each file has been appropriately pre-filtered, level adjusted, and noise-processed. These files are named src. The speech database also includes samples processed through the various reference conditions in directory so73/subjctv/exp/ref. The reference conditions are named ref for the respective conditions given in the tables in Section 2.4.2.3.

2.4.4 Processing of Speech Material for SO 73 Testing

The source speech material shall be processed by the various combinations of encoders and decoders listed in the descriptions of the experiments given in Section 2.4.2. The master codec software described in Section 3.4.3 shall be used in the processing involving the master codec. Generally, the master codec encoder and decoder outputs have been provided in the respective directori
76. 4 Processing of Speech Material for SO 68 Testing. The source speech material shall be processed by the various combinations of encoders and decoders listed in the descriptions of the two experiments given in Section 2.2.2. The master codec software described in Section 3.2.3 shall be used in the processing involving the master codec. Generally, the master codec encoder and decoder outputs have been provided in the respective directories so68/subjctv/exp*/m_pkt and so68/subjctv/exp*/m_m. Execution of the master codec software is needed only for the test encoder/master decoder combination for each experiment and condition. All codec processing shall be done digitally. Noise suppression and post-filter options shall be enabled for both the master and the test codecs. The digital format of the speech files is described in Section 3.2.4.4. The naming convention of the processed speech is as follows. For the packet files in the so68/subjctv/exp1/m_pkt directory (Experiment I), the *.p12 files are the master packet files for the *.s12 source file. Likewise, the *.p22 and *.p32 files are the respective packet files for the *.s22 and *.s32 source files. For the packet files, the file name component 9 3 indicates an output from the master encoder at the 9.3 kbps active speech channel rate. Likewise, the components 5 8 and 4 8 indicate an output from the master encoder at the respective active speech channel rates (5.8 and 4.8 kbps). The *.pf3 files are the impaired packet files which will
77. 4dBov STL tool astrip is also used to split the concatenated files into the individual samples appropriate for the experiment Table 2 3 4 5 1 shows the cutting points to be used with the astrip tool for producing the two sentence samples for the Experiments 1 3 and 5 ACR test Table 2 3 4 5 2 shows the cutting points to be used with the astrip tool for producing the single sentence sub samples for the Experiments 2 4 and 6 P 835 test Table 2 3 4 5 3 shows the sub samples that make up the samples i e sentence triads for the P 835 test 2 54 1 C S0018 D v1 0 Table 2 3 4 5 1 Cutting Points for the astrip Software Tool for the SO 70 Experiments 1 3 and 5 ACR Test Experiment I sentence Start Length Xj 7113706 2 fpi 113707 118586 4 221 344193 117286 amp 72672 123570 8 802991 110876 9 mp2 913867 102934 2 55 ACR Sentences Start Length 40 f4ps 4427918 111339 44 fape 4866256 122664 46 f3pe 5105359 127468 48 5342393 108807 49 mip7 5451200 118850 2p8 6729022 123975 64 7194312 102903 C S0018 D v1 0 Table 2 3 4 5 2 Cutting Points for the astrip Software Tool for the SO 70 Experiments 2 4 and 6 P 835 Test Experiment II P 835 3679398 55140 Start Length Start Length Start Length Sentence Sentence Sentence sample samples sample samples sample samples
78. 5 l k 1 4 1 2 MOS jk 5 2 1 10 1 E k 1 4 where is the experiment number jis the condition number k is the codec combination number 1 M M with 396 forward link FER 2 M T with 396 forward link FER 3 T M with 396 reverse link FER 4 T T with 396 reverse link FER and vis the associated listener vote Then the per combination condition requirement can be defined as 3 Refer to Section 2 1 4 3 In this case M M and M T are respectively the outputs of the master and test decoders in response to packets generated by the master encoder that have been corrupted using a 396 forward link FER model Similarly T M and T T are the outputs of the master and test decoders in response to packets generated by the master encoder that have been corrupted using a 3 reverse link error model 2 13 C S0018 D v1 0 iefl 2 J 5 k 2 4 i J 1 4 1 2 J 5 k 2 4 2 1 10 2 ti jj ILA4 MOS i j 1 MOS i j k lt i j k MOS i j 1 MOS i j k lt i j k except for the 396 FER condition 1 j 4 where the following requirement is defined 5 1 4 1 4 1 lt 4 k 1 k 1 3 MOS L4 k MOS L4 k Ds L4k 1 k e 13 2 1 10 3 In Equation 2 1 10 2 the maximum allowable difference i j k is given by j K max 0 12 c i j k SER Gj
79. 54265 216555 272219 328495 383955 448692 505885 567054 622106 676546 732516 790729 843742 900324 962012 5021304 89 mzsis 5080387 65752 190 mzsis 5146135 63251 54 m3sis 5385036 60125 96 3516 5508244 61792 B Ci U1 ON OT GT OT Of GT UT OV U1 OV UT UT OT WPOP RA OA OR ay Oo Oyo yw COP RFR A gt oO SPH WIM OM WWI o o to vo oJ o Table 2 4 4 5 3 Composition of the Sentence Triad Samples for the Experiments 2 and 4 P 835 Test entence t3 t 4 2 4 5 Randomization Pts 85 807 509 S12 For each of the first four subjective experiments each presentation sample consists of a speech sample processed under a condition of the test For the ACR Experiments 1 and 3 the sample consists of a pair of concatenated sentences of approximately 8 sec duration For the P 835 Experiments 2 and 4 the sample consists of three sub samples where each sub sample is a single sentence of approximately 4 sec duration The samples shall be presented to the listeners in a 2 81 C S0018 D v1 0 randomized presentation order The listeners for each file set shall be presented with practice trials for subjective Experiments 1 and 3 and Experiments 2 and 4 The randomization of the
80. 73 The Test Parameters for Listening Experiment 1 are presented in Table 2 4 2 3 1 1 Table 2 4 2 3 1 1 SO 73 Listening Experiment 1 Test Parameters MOS P 800 Wideband Number of talkers 4 males 4 females none ambient Test conditions Low Audio Input Level 32 dB 196 d amp b Nominal Audio Input Level 22 dB High Audio Input Level 12 dB 3 FER and 1 FER 2 pls at Nominal Audio Input Level 22 Encoder Decoder Combinations 4 MM M T T T T M The Test Conditions for Listening Experiment 1 are presented in Table 2 3 2 3 1 2 2 68 Table 2 4 2 3 1 2 SO 73 Listening Experiment 1 Test Conditions Exp 1 Wideband ACR Reference Conditions File MNRU a01 MNRU Reference a02 14dB MNRU Reference a03 21dB MNRU Reference a04 28dB MNRU Reference a05 35dB MNRU Reference a06 42dB MNRU Reference a07 49dB MNRU Reference a08 Direct Source Reference Test Conditions File Condition Enc Dec a09 Nominal level M M a10 Nominal level M T a11 Nominal level T T a12 Nominal level T M a13 Low level 196 d amp b M M 14 Low level 1 d amp b M T 15 Low level 1 d amp b T T a16 Low level 196 d amp b T M 17 High level M M a18 High level M T a19 High level T T a20 High level T M a21 196 FER 196 PLS M M a22 196 FER 196 PLS M T a23 3 FER M M a24 3 FER M T 2 69 C S0018 D v1 0 C S0018 D v1 0 2 4 2 3 2 Subjectiv
81. 835 test Table 2 4 4 5 3 shows the sub samples that make up the samples i e sentence triads for the P 835 test 2 79 C S0018 D v1 0 Table 2 4 4 5 1 Cutting Points for the astrip Software Tool for the SO 73 Experiments 1 and 3 ACR Test Experiment I sentence Start Length Xj 7113706 2 fpi 113707 118586 4 221 344193 117286 amp 72672 123570 8 802991 110876 9 mp2 913867 102934 2 80 ACR Sentences Start Length 40 f4ps 4427918 111339 44 fape 4866256 122664 46 f3pe 5105359 127468 48 5342393 108807 49 mip7 5451200 118850 2p8 6729022 123975 64 7194312 102903 C S0018 D v1 0 Table 2 4 4 5 2 Cutting Points for the astrip Software Tool for the SO 73 Experiments 2 and 4 P 835 Test Experiment II P 835 3679398 55140 Start Length Start Length Start Length Sentence Sentence Sentence sample samples sample samples sample samples T p 191501 57758 5 m2soz 288424 56412 2so2 404062 59226 9 masoi 463288 51884 1o m3soz 515172 59593 16 21 04 859135 59385 ras 2 04 982890 58954 1s f2s03 1041824 54821 3s10 3202618 58012 4 fisi2 3527467 51931 66 2 12 3734538 55821 82512 3852015 60100 6s msti 3912115 55432 090859 1
82. ADR vs Capacity Operating Point
Table 2.2.2.3.1-1 SO 68 Listening Experiment I Test Parameters
Table 2.2.2.3.1-2 SO 68 Listening Experiment I Test Conditions
Table 2.2.2.3.2-1 SO 68 Listening Experiment II Test Parameters
Table 2.2.2.3.2-2 SO 68 Listening Experiment II Test Conditions
Table 2.2.2.3.3-1 Numerical Parameters for the SO 68 Listening Experiments
Table 2.2.4.5-1 Cutting Points for the astrip Software Tool for the Experiment I ACR Test
Table 2.2.4.5-2 Cutting Points for the astrip Software Tool for the Experiment II P.835 Test
Table 2.2.4.5-3 Composition of the Sentence Triad Samples for the Experiment II P.835 Test
Table 2.2.5-1 Example Randomization for the Experiment I ACR
Table 2.3.1.1.1-1 Target ADR vs Capacity Operating Point
Table 2.3.2-1 Test Suites for SO 70 compliance
Table 2.3.2-2 Experiments for SO 70 compliance
Table 2.3.2.3.1-1 SO 70 Listening Experiment 1 Test Parameters
83. R car noise segment 22 dB; R street noise segment 22 dB; R ambient background segment 22 dB; R babble noise segment 32 dB; R car noise segment 32 dB; R street noise segment 32 dB; R ambient background segment 32 dB. See Section 3.1.2.1 for details on using the provided software tool that can aid in making this calculation. 2.1.1.1.2 Average Data Rate Requirement. The total average data rate Ravg shall not exceed 4400 bps; otherwise the test codec fails the compliance test. 2.1.1.2 Unity Gain Requirement. The specific EVRC test codec shall output speech with unity gain when compared with the input speech. The unity gain measurement (output active speech level versus input active speech level) will be performed over the entire input speech database for the clean, nominal-level source conditions for each mode. The measurement should be made using the STL-2000 tool [6][6a] actlev and must not show more than 0.5 dB deviation between input and output active speech levels. This procedure is fully described in [9]. 2.1.1.3 End-to-end Algorithmic Delay Recommendation. The algorithmic delay for the specific EVRC test codec should be calculated analytically by the codec manufacturer. In considering the algorithmic delay, it can be assumed that all transmission channels have infinite bandwidth and that all processing elements have infinite throughput. Algorithmic delay is defined as the sum of all sequential filte
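The two objective limits stated above (Ravg not exceeding 4400 bps, and no more than 0.5 dB deviation between input and output active speech levels) reduce to simple comparisons once the measurements are in hand. The sketch below assumes the active speech levels have already been measured externally, for example with actlev; the function names and the sample figures are hypothetical and are not taken from the specification.

```python
def average_data_rate_ok(r_avg_bps, limit_bps=4400.0):
    """True when the total average data rate does not exceed the stated limit."""
    return r_avg_bps <= limit_bps


def unity_gain_ok(input_level_db, output_level_db, max_deviation_db=0.5):
    """True when output and input active speech levels differ by at most 0.5 dB."""
    return abs(output_level_db - input_level_db) <= max_deviation_db


if __name__ == "__main__":
    # Hypothetical measurements, used only to exercise the checks.
    print("ADR requirement met:", average_data_rate_ok(4312.0))
    print("Unity gain requirement met:", unity_gain_ok(-22.0, -22.3))
```

Keeping the limits as default arguments makes it easy to reuse the same checks where another service option states a different limit.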
84. Conditions; Talkers (genres): 8 8 … Total Stimuli per Experiment: 1536 1152 2560 1440 1536 … 96. Listening Panels: 8 … 4. Stimuli per Listening Panel: 192 144 320 180 192 … 24 24. Listeners (Voters) per Listening Panel … Votes per Condition: 256 192 256 192 256 192 … 96. 2.3.3 Speech Material for SO 70 Testing. The source speech files used for SO 70 compliance testing consist of Harvard sentence pairs, which are preprocessed to include proper level adjustment and noise mixing for use in the subjective experiments. The talkers in these files are adult males and adult females who are native speakers of North American English. For the following discussion, it may be useful to refer to Table 4-3 for the composition of the Software Distribution database. The source speech material for the subjective experiments is contained in directory so70/subjctv/exp*/source. Each file has been appropriately pre-filtered, level adjusted, and noise processed. These files are named src.*. The speech database also includes samples processed through the various reference conditions in directory so70/subjctv/exp*/ref. The reference conditions are named ref.* for the respective conditions given in the tables in Section 2.3.2.3. 2.3.4 Processing of Speech Material for SO 70 Testing
85. ER 2 pls at Nominal Audio Input Level 22 Encoder Decoder Combinations 4 M M M T T T T M Table 2 3 2 3 1 1 SO 70 Listening Experiment 1 Test Parameters Condition Description Type of test MOS P 800 Wideband Number of talkers 4 males 4 females Background noise none ambient Audio Input Level 22 dB 32 dB 12 dB Filter characteristics P 341 refer Section 3 3 2 4 Reference conditions 8 Specified reference conditions Test conditions o Low Audio Input Level 32 dB 1 d amp b o Nominal Audio Input Level 22 dB o High Audio Input Level 12 dB 2 40 1 2 3 C S0018 D v1 0 Condition Description o 3 FER and 1 FER 2 pls at Nominal Audio Input Level 22 Encoder Decoder Combinations 4 M M M T T T T M The Test Conditions for Listening Experiment 1 are presented in Table 2 3 2 3 1 2 Table 2 3 2 3 1 2 SO 70 Listening Experiment 1 Test Conditions Exp 1 Wideband ACR Reference Conditions File MNRU a01 7dB MNRU Reference a02 14dB MNRU Reference a03 21dB MNRU Reference a04 28dB MNRU Reference a05 35dB MNRU Reference a06 42dB MNRU Reference a07 49dB MNRU Reference a08 Direct Source Reference Test Conditions File Condition Enc Dec a09 Nominal level M M a10 Nominal level M T ail Nominal level T T a12 Nominal level T M a13 Low level 1 d amp b M M 14 Low level
86. EVRC NW operating point Low 32 dB evrc nw op6 p32 6 8 kHz sampling src s22 8k EVRC NW operating point Nominal 22 dB evrc nw op6 dim 196 p22 6 8 kHz sampling 196 d amp b src s22 8k EVRC NW operating point Nominal 22 dB evrc nw op7 p22 7 8 kHz sampling src c15 8k EVRC NW operating point Nominal 22 dB evrc nw opi pc 1 8 kHz sampling 15 dB car noise src s15 8k EVRC NW operating point Nominal 22 dB evrc nw opi ps 1 8 kHz sampling 15 dB street noise src b20 8k EVRC NW operating point Nominal 22 dB evrc nw opi pb 1 8 kHz sampling 20 dB babble noise src c15 8k EVRC NW operating point Nominal 22 dB evrc nw op6 pc 6 8 kHz sampling 15 dB car noise src s15 8k EVRC NW operating point Nominal 22 dB evrc nw op6 ps 6 8 kHz sampling 15 dB street noise 8 41 C S0018 D v1 0 Input File Operating Point Condition n E a src b20 8k EVRC NW operating point Nominal 22 dB evrc nw op6 pb 6 8 kHz sampling 20 dB babble noise src c15 8k EVRC NW operating point Nominal 22 dB evrc nw op7 pc 7 8 kHz sampling 15 dB car noise Note 9 3 kbps mode is generated using anchor operating point 0 and 5 8 kbps mode is generated using anchor operating point 2 Table 3 4 4 5 3 SO 73 Suite A Decoder Bit exact Test Conditions Input Packet File evrc nw opoO fer 3 2 Operating Point EVRC NW operating Reference output Condition speech files for bit
87. Good, 5 Excellent. Data from 64 listeners shall be used for each of the two experiments. The experiment may be run with up to eight listeners in parallel, that is, hearing the same random order of test conditions at the same time. Before starting the test, the listeners should be given the instructions in Figure 2.1.8-1. The instructions may be modified to allow for variations in laboratory data-gathering apparatus.
This is an experiment to determine the perceived quality of speech over the telephone. You will be listening to a number of recorded speech samples, spoken by several different talkers, and you will be rating how good you think they sound. The sound will appear on one side of the headphones. Use the live side on the ear you normally use for the telephone. On each trial, a sample will be played. After you have listened to each passage, the five buttons on your response box will light up. Press the button corresponding to your rating for how good or bad that particular passage sounded. During the session you will hear samples varying in different aspects of quality. Please take into account your total impression of each sample, rather than concentrating on any particular aspect. The quality of the speech should be rated according to the scale below: Bad … Excellent. Rate each passage by choosing the word from the scale which best describes the quality of speech you heard. There will be 279 trials, including 31 practice trials at
88. Mode 6 Nominal 22 dB T M c17 Mode 7 Nominal 22 dB M M c18 Mode 7 Nominal 22 dB M T c19 Mode 7 Nominal 22 dB T T c20 Mode 7 Nominal 22 dB T M c21 Mode 1 Low 32 dB 1 d amp b 1 pls M M c22 Mode 1 Low 32 dB 1 d amp b 1 pls M T c23 Mode 1 Low 32 dB 1 d amp b 1 pls T T c24 Mode 1 Low 32 dB 1 d amp b 1 pls T M c25 Mode 6 Low 32 dB 1 d amp b 1 pls M M c26 Mode 6 Low 32 dB 1 d amp b 1 pls M T c27 Mode 6 Low 32 dB 1 d amp b 1 pls T T c28 Mode 6 Low 32 dB 1 d amp b 1 pls T M c29 Mode 1 High 12 dB M M c30 Mode 1 High 12 dB M T c31 Mode 1 High 12 dB T T c32 Mode 1 High 12 dB T M c33 Mode 6 High 12 dB M M c34 Mode 6 High 12 dB M T c35 Mode 6 High 12 dB T T c36 Mode 6 High 12 dB T M 2 72 2 3 5 6 C S0018 D v1 0 Label Reato Condition mee C37 Mode 1 Nominal 22 dB 396 FER M M c38 Mode 1 Nominal 22 dB 396 FER M T c39 Mode 6 Nominal 22 dB 396 FER M M c40 Mode 6 Nominal 22 dB 396 FER M T c41 Mode 0 Nominal LB portion of Widemode mode decoder M M test only c42 Mode 0 Nominal LB portion of Widemode mode decoder M T test only c43 Mode 0 Low 196 D amp B 196 PLS LB portion of Widemode M M mode decoder test only c44 Mode 0 Low 196 D amp B 196 PLS LB portion of Widemode M T mode decoder test only c45 Mode 0 High LB portion of Widemode mode decoder test M M only c46 Mode 0 High LB po
89. NRU c01 5dB MNRU Reference c02 10dB MNRU Reference c03 15dB MNRU Reference c04 20dB MNRU Reference c05 25dB MNRU Reference C06 30dB MNRU Reference c07 35dB MNRU Reference c08 Direct Source Reference Test Conditions File Condition Enc Dec c09 Nominal Mode 0 LB portion of Wideband mode decoder test only c10 Nominal Mode 0 LB portion of Wideband mode decoder test only M T c11 Nominal Mode 4 interoperable with Mode 0 of SO 68 support M M c12 Nominal Mode 4 interoperable with Mode 0 of SO 68 support M T c13 Nominal Mode 4 interoperable with Mode 0 of SO 68 support T T 2 44 C S0018 D v1 0 Exp 3 Narrowband ACR c14 Nominal Mode 4 interoperable with Mode 0 of SO 68 support T M c15 Nominal Mode 7 interoperable with Mode 0 of SO 68 support IT T c16 Nominal Mode 7 interoperable with Mode 0 of SO 68 support T M c17 Nominal Mode 7 interoperable with Mode 0 of SO 68 support M M c18 Nominal Mode 7 interoperable with Mode 0 of SO 68 support M T c19 Low Mode 0 LB portion of Wideband mode decoder test only M M c20 Low Mode 0 LB portion of Wideband mode decoder test only M T c21 Low Mode 4 interoperable with Mode 0 of SO 68 support M M c22 Low Mode 4 interoperable with Mode 0 of SO 68 support M T c23 Low Mode 4 interoperable with Mode 0 of SO 68 support IT T c24 Low Mode 4 interoperable with Mode 0 of SO 68 supp
90. OT already supporting SO 68 compliance. Each of the individual experiments is further defined in detail by Table 2.3.2-2.
Table 2.3.2-2 Experiments for SO 70 compliance (Experiment: Individual tests. Notes)
Experiment 1: WB clean, level, FER, signaling (ACR). Notes: Mobile supporting 16 kHz Fs.
Experiment 2: WB noise, FER (P.835). Notes: Mobile supporting 16 kHz Fs.
Experiment 3: NB clean, level, FER, signaling, including SO 68 interoperable mode tests (ACR). Notes: BS supporting 8 kHz and MS supporting 8/16 kHz; SO 68 compliance not PROVEN.
Experiment 4: NB noise, FER, including SO 68 interoperable mode tests (P.835). Notes: BS supporting 8 kHz and MS supporting 8/16 kHz; SO 68 compliance not PROVEN.
Experiment 5: NB clean, level, FER, signaling, NOT including SO 68 interoperable mode tests (ACR). Notes: BS supporting 8 kHz; SO 68 compliance already PROVEN.
Experiment 6: NB noise, FER, signaling, NOT including SO 68 interoperable mode tests (P.835). Notes: BS supporting 8 kHz; SO 68 compliance already PROVEN.
Experiment 7: WB music decoder test (ACR). Notes: Mobile supporting 16 kHz Fs.
Experiment 8: NB music decoder test (ACR). Notes: BS supporting 8 kHz Fs.
2.3.2.1 Definition. The codec subjective test is intended to validate the implementation of the speech codec being tested, using the master codec defined in 3.3.3 as a reference. Experiments 1, 3, and 5 are based on the Absolute Category Rating (ACR) method, which yields the Mean Opinion Score (MOS) as described in [10]. Experiments 2, 4, and 6 are based on ITU-T Recommendation P.835, described in [13]. 2.3
91. R 2 FER Mode 6 interoperable with Mode 6 of SO 68 support d40 Babble 20dB SNR 2 FER Mode 6 interoperable with Mode 6 of SO 68 support M T d41 Babble 20dB SNR 2 FER Mode 6 interoperable with Mode 6 of SO 68 support TT d42 Babble 20dB SNR 2 FER Mode 6 interoperable with Mode 6 of SO 68 support T M C S0018 D v1 0 1 2 The Test Conditions for Listening Experiment 5 are presented in Table 2 4 2 3 5 2 3 Table 2 4 2 3 5 2 SO 73 Listening Experiment 5 Test Conditions Exp 5 Wideband Music File Reference Condition e01 MNRU 15dB Reference e02 MNRU 25dB Reference e03 MNRU 35dB Reference e04 Source Reference File Test Condition Enc Dec e05 0 FER M M e06 0 FER M T 07 3 FER 08 3 FER M T 5 2 42 3 6 Subjective Experiment 6 for SO 73 e The Test Parameters for Listening Experiment 6 are presented in Table 2 4 2 3 6 1 7 Table 2 4 2 3 6 1 SO 73 Listening Experiment 6 Test Parameters Type of test ncn P 800 Narrowband Number of genres Sou noise none ambient 22 dB 0 FER and 3 FER 2 MM M T 8 The Test Conditions for Listening Experiment 6 are presented in Table 2 4 2 3 6 2 2 76 C S0018 D v1 0 Table 2 4 2 3 6 2 SO 73 Listening Experiment 6 Test Conditions Exp 6 Narrowband Music File Reference Condition 101 MNRU 10dB Reference 02 MNRU 20dB Reference MNRU 30dB Reference 04 Sourc
92. RCB FX source code for the bit exact 3 2 4 fixed point code so68 subjctv Speech and other material 2 2 1 1 2 2 3 2 2 4 2 2 5 necessary to perform Subjective Experiments and Il so68 cal Output level calibration file for 3 2 1 2 listening tests so68 tools Source code for the software 2 2 tools so68 testvec Test vectors for verifying bit 3 2 4 5 exact EVRC implementations Table 4 3 Description of EVRC WB Software Distribution Contents so70_73 EVRCWB_NW_FX source code for the bit exact 3 3 4 fixed point code so70 subjctv Speech and other material 2 3 1 1 2 3 3 2 3 4 2 3 5 necessary to perform subjective experiments so70 cal Output level calibration file for 3 3 1 2 listening tests so70 tools Source code for the software 3 2 tools so70 testvec Test vectors for verifying bit 3 3 4 5 exact EVRC implementations 4 2 C S0018 D v1 0 Table 4 4 Description of EVRC NW Software Distribution Contents so70 73 EVRCWB NW FX source code for the bit exact 3 3 4 fixed point code so73 subjctv Speech and other material 2 3 1 1 2 4 3 2 3 4 2 3 5 necessary to perform subjective experiments so73 cal Output level calibration file for 3 3 1 2 listening tests so73 tools Source code for the software 3 2 tools so73 testvec Test vectors for verifying bit 3 3 4 5 exact EVRC implementations 4 3 C S0018 D v1 0 This page intentionally left blank 4 4 20 21 22 23 24 25 C S0018
93. S SOMEWHAT INTRUSIVE; VERY CONSPICUOUS, VERY INTRUSIVE. For the third and final sentence in each trial, you will be asked to attend to the entire sample (both the speech signal and the background) and rate your opinion of the sample for purposes of everyday speech communication. Select the category which best describes the sample you just heard for purposes of everyday speech communication; the OVERALL SPEECH SAMPLE was: 5 EXCELLENT, 4 GOOD, 3 FAIR, 2 POOR, 1 BAD.
2.4.9 Analysis of Results. The response data from the practice blocks shall be discarded. Data sets with missing responses from listeners shall not be used, i.e., a complete set of data is required for 24 listeners, four for each of six listening panels. Responses from the different listening panels for the corresponding test conditions shall be treated as equivalent in the analysis.
2.4.9.1 Basic Results for the SO 73 Listening Tests. The votes for each of the test conditions for SO 73 Experiments I and II shall be averaged to produce an associated mean score (M), as shown in Equation 2.4.9.1-1, and a Standard Deviation (SD), as shown in Equation 2.4.9.1-2, where L is the number of listeners and T is the number of talkers involved in the experiment. Writing $X_{l,t}$ for the vote of listener $l$ on talker $t$:
$$M = \frac{1}{L \times T} \sum_{l=1}^{L} \sum_{t=1}^{T} X_{l,t} \qquad (2.4.9.1\text{-}1)$$
$$SD = \sqrt{\frac{\sum_{l=1}^{L} \sum_{t=1}^{T} \left(X_{l,t} - M\right)^{2}}{L \times T - 1}} \qquad (2.4.9.1\text{-}2)$$
2.4.9.2 Minimum Subjective Requirement for SO 73 Listening Tests. The Terms
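The basic results described in 2.4.9.1 are a mean score and a standard deviation taken over all L x T votes for a condition. The following is a minimal sketch under two assumptions: the votes are held as one list per listener with one entry per talker, and the standard deviation uses the L x T - 1 divisor. The function name and the sample votes are hypothetical.

```python
import math


def mean_and_sd(votes):
    """Return (M, SD) over every vote in a listener-by-talker table.

    `votes` is a list with one inner list per listener; each inner list
    holds that listener's vote for each talker under a single condition.
    """
    flat = [v for listener_votes in votes for v in listener_votes]
    n = len(flat)  # n equals L x T when the table is complete
    if n < 2:
        raise ValueError("at least two votes are needed for a standard deviation")
    mean = sum(flat) / n
    # Sample standard deviation: squared deviations divided by (L x T - 1).
    sd = math.sqrt(sum((v - mean) ** 2 for v in flat) / (n - 1))
    return mean, sd


if __name__ == "__main__":
    demo_votes = [[4, 3, 5, 4], [3, 3, 4, 4]]  # 2 listeners x 4 talkers, hypothetical
    m, sd = mean_and_sd(demo_votes)
    print(f"M = {m:.3f}, SD = {sd:.3f}")
```

Because data sets with missing responses are excluded before analysis, the sketch does not try to handle incomplete tables.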
94. SO 70. The total average data rate Ravg for each operating point shall not exceed the target average data rate by more than the tolerance level in Table 2.3.1.1.1-1; otherwise the test codec fails the compliance test. 2.3.1.2 Unity Gain Requirement. The specific EVRC-WB test codec shall output speech with unity gain when compared with the input speech. The unity gain measurement (output active speech level versus input active speech level) will be performed over the entire input speech database for the clean, nominal-level source conditions for each mode. The measurement should be made using the STL-2000 tool [6][6a] actlev and must not show more than 0.5 dB deviation between input and output active speech levels. This procedure is fully described in [9]. 2.3.1.3 End-to-end Algorithmic Delay Recommendation. The algorithmic delay for the specific EVRC-WB test codec should be calculated analytically by the codec manufacturer. In considering the algorithmic delay, it can be assumed that all transmission channels have infinite bandwidth and that all processing elements have infinite throughput. Algorithmic delay is defined as the sum of all sequential filter delays and buffering delays in the encode/decode path. The maximum end-to-end algorithmic delay should be no greater than that of the master codec. For the master codecs defined in [1], the algorithmic delay
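The per-operating-point rate requirement above can be sketched as a comparison of each measured Ravg against its target plus tolerance. The values of Table 2.3.1.1.1-1 are not reproduced in this excerpt, so the targets and tolerances below are placeholders, and the sketch assumes the tolerance is expressed in the same unit as the rate (bps). The function name and the operating-point labels are hypothetical.

```python
def failing_operating_points(measured_bps, targets_bps, tolerances_bps):
    """Return the operating points whose measured rate exceeds target + tolerance."""
    failures = []
    for op, measured in measured_bps.items():
        limit = targets_bps[op] + tolerances_bps[op]
        if measured > limit:
            failures.append(op)
    return failures


if __name__ == "__main__":
    # Placeholder numbers only; the real targets and tolerances come from the table.
    targets = {"op0": 7000.0, "op4": 5000.0}
    tolerances = {"op0": 100.0, "op4": 100.0}
    measured = {"op0": 7050.0, "op4": 5150.0}
    failed = failing_operating_points(measured, targets, tolerances)
    print("operating points over the limit:", failed or "none")
```

A codec passes this part of the objective test only when the returned list is empty for every operating point under test.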
95. Test Parameters
Table 2.4.2.3.1-2 SO 73 Listening Experiment 1 Test Conditions
Table 2.4.2.3.2-1 SO 73 Listening Experiment 2 Test Parameters
Table 2.4.2.3.2-2 SO 73 Listening Experiment 2 Test Conditions
Table 2.4.2.3.3-1 SO 73 Listening Experiment 3 Test Parameters
Table 2.4.2.3.3-2 SO 73 Listening Experiment 3 Test Conditions
Table 2.4.2.3.4-1 SO 73 Listening Experiment 4 Test Parameters
Table 2.4.2.3.4-2 SO 73 Listening Experiment 4 Test Conditions
Table 2.4.2.3.5-1 SO 73 Listening Experiment 5 Test Parameters
Table 2.4.2.3.5-2 SO 73 Listening Experiment 5 Test Conditions
Table 2.4.2.3.6-1 SO 73 Listening Experiment 6 Test Parameters
Table 2.4.2.3.6-2 SO 73 Listening Experiment 6 Test Conditions
Table 2.4.2.3.7-1 Numerical Parameters for the SO 73 Listening Experimen
96. U G++ compiler and make utility. Two GCC-compatible makefiles have been included in the build directory. Typing make in the build directory will compile and link the code and create the executable file called Evrc_nw_fx (Evrc_nw_fx.exe on Win32 systems), which will be placed in the build directory. The included makefiles may require some user modification for a particular hardware platform and/or operating system. 3.4.4.3 Running the Fixed-Point Codec Simulation. The EVRC-NW executable files use command-line arguments to receive all information regarding input and output files and various parameters used during execution. Executing Evrc_nw_fx with no command-line arguments will display a brief description of the required and optional command-line arguments. The options are described below.
-i infn (required): Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see the -d option below).
-o outf (required): Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d: Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
-e: Instructs the simulation to perform only the encoding functio
97. acteristics for telephony Requirements V9 0 0 March 2009 C S0018 D v1 0 1 CONTENTS 2 F A LOUE NO MN Ss resi pe zen ann T A A A A A ATE 1 1 3 e koy D t EMI NI 1 2 4 2 IMPR LEM 1 3 5 1 3 Test Model for the Speech Codec sse nennen nnne nnn ens 1 5 2 Codec Minimum Standards iiie e bete n irte ere deae 2 1 7 2 1 Performance Testing for SO 2 1 8 2 1 1 Objective Performance Testing for SO 3 2 1 9 2 1 2 Subjective Performance Testing for SO 3 eee 2 2 10 2 1 8 Source Speech Material for SO 3 Testing sssssssseeeee 2 7 11 2 1 4 Processing of Speech Material for SO Testing sse 2 8 12 2 1 5 toic e Eee ie e ve Silage entere 2 10 18 2 1 6 Presentaton ire cem Re bem ante gene ed eet edo tete 2 11 14 2 14 LEISteners iin ette deett eot tete e tetas 2 11 15 2 1 8 Listening Test Procedures eisiea iai 2 11 16 21 9 Analysis of Results det Eae ea ta a d eed 2 12 17 2 1 10 Minimum Subjective Requirement sese 2 13 18 2 1 11 Expected Results for Reference 2 15 19 2 2 Performance Testing for SO 68 4 eene entente enn 2 17 20 2 2 4 Objective Performance Testing for SO 68 2
98. al 22 dB 15 dB car noise evrc nw op6 oc 8k evrc nw op6 ps EVRC NW operating point 6 8 kHz sampling Nominal 22 dB 15 dB street noise evrc nw op6 os 8k evrc nw op6 fer 296 pb EVRC NW operating point 6 8 kHz sampling Nominal 22 dB 15 dB babble noise 296 FER evrc nw op6 fer 296 ob 8k Table 3 4 4 5 4 SO 73 Encoder Suite B Bit exact Test Conditions Input File Operating Point Condition an oF src s22 8k EVRC NW operating point 1 Nominal 22 dB evrc_nw_op1 p22 8 kHz sampling src s12 8k EVRC NW operating point 1 High 12 dB evrc_nw_op1 p12 8 kHz sampling src s32 8k EVRC NW operating point 1 Low 32 dB evrc_nw_op1 p32 8 kHz sampling src s22 8k EVRC NW operating point 1 Nominal 22 dB evrc_nw_op1 dim_1 p22 8 kHz sampling 1 d amp b src s22 8k EVRC NW operating point 6 Nominal 22 dB evrc_nw_op6 p22 8 kHz sampling src s12 8k EVRC NW operating point 6 High 12 dB evrc_nw_op6 p12 8 kHz sampling 3 43 C S0018 D v1 0 Input File Operating Point Condition kaa kan or src s32 8k EVRC NW operating point 6 Low 32 dB evrc nw op6 p32 8 kHz sampling src s22 8k EVRC NW operating point 6 Nominal 22 dB evrc nw op6 dim 196 p22 8 kHz sampling 196 d amp b src s22 8k EVRC NW operating point 7 Nominal 22 dB evrc nw op7 p22 8 kHz sampling src c15 8k EVRC NW operating
99. all quality and is fully described in 13 2 2 2 3 Test Conditions and Test Design for SO 68 The first listening experiment for SO 68 is performed as an ACR listening test The second experiment for SO 68 is performed as a P 835 listening test 2 2 2 3 1 Subjective Experiment for SO 68 The Test Parameters for Listening Experiment are presented in Table 2 2 2 3 1 1 Table 2 2 2 3 1 1 SO 68 Listening Experiment I Test Parameters Condition Description Type of test MOS P 800 Number of talkers 4 males 4 females Background noise none ambient Audio Input Level 22 dB 32 aB 12 dB MIRS 8 Direct 3 9 15 21 27 33 39 2 19 C S0018 D v1 0 Test conditions a Low Audio Input Level 32 dB 9 3 5 8 kbps 196 d amp b 196 pls b Nominal Audio Input Level 22 dB 9 3 5 8 4 8 kbps c High Audio Input Level 12 dB 9 3 5 8 kbps d Nominal Audio Input Level 22 dB 9 3 5 8 kbps 396 FER M M M T Only Encoder Decoder Combinations 4 M M M T T T T M Conditions a c 2 M M Condition d The Test Conditions for Listening Experiment are presented in Table 2 2 2 3 1 2 Table 2 2 2 3 1 2 SO 68 Listening Experiment Test Conditions Label Operating Point Condition Encoder Decoder Combinations a01 Reference MNRU 3dB a02 Reference MNRU 9dB a03 Reference MNRU 15dB a04 Reference MNRU 21dB a05 Re
100. all Quality (OVRL). In general, OVRL scores are highly correlated with MOS, but the OVRL score provides greater sensitivity and precision in test conditions involving background noise. While the OVRL score is of most interest here, the SIG and BAK scores also provide valuable diagnostic information. For each trial in a P.835 test, listeners are presented with three sub-samples, where each sub-sample is a single sentence (approx. 4 sec duration) processed through the same test condition. In one of the first two sub-samples, listeners rate the Signal Quality on a five-point rating scale with the points labeled: Very natural, no distortion; Fairly natural, little distortion; Somewhat natural, some distortion; Fairly unnatural, fairly distorted; Very unnatural, very distorted. For the other of the first two sub-samples, listeners rate the Background Quality on a five-point rating scale with the points labeled: Not noticeable; Fairly noticeable; Noticeable but not intrusive; Fairly conspicuous, somewhat intrusive; Very conspicuous, very intrusive. For the third sub-sample, listeners rate the Overall quality on a five-point rating scale with the points labeled: 5 Excellent, 4 Good, 3 Fair, 2 Poor, 1 Bad. Data from 32 listeners shall be used for Experiments 2, 4, and 6, four listeners for each listening panel, where each listening
101. amples which can be accounted for by the noise suppression overlap delay plus the LPC look-ahead. This 13 ms delay will ensure the proper tandem processing. It may be beneficial for the test codec to incur the same delay as the master codec to avoid potential quality differences due to framing skew. This kind of delay ensures asynchronous tandem processing. 2.1.4.5 Rate 1/2 Maximum Processing. The appropriate speech files will be processed through the codecs for the Rate 1/2 Maximum processing test conditions. The test speech codec shall be constrained to operate such that Rate 1 coding is not used. 2.1.4.6 Ensuring Proper Encoded Frame Packet Files. All encoded frame packet files shall be examined to ensure that the files contain data only in those file locations where data should exist for a given data rate. The examination of the encoded frame packet files should indicate the occurrence of any improper data in the files, but the examination must not alter the encoded frame packet files in any way. 2.1.5 Randomization. For each of the two subjective experiments, each presentation sample consists of one sentence pair processed under a condition of the test. The samples shall be presented to the listeners in a random order. The listeners for each file set shall be presented with practice trials for subjective Experiments
102. and ps files are the master packet files for the c15, b20 and s15 source files, respectively. For the master encode/master decode directories (so70 subjctv exp m m), the naming convention of the speech files is such that the first two characters of the file name indicate the codec combination and the suffix indicates the condition numbers in Table 2.3.2.3.1-2 and Table 2.3.2.3.2-2. Naming conventions for the remaining two experiments follow accordingly. Detailed descriptions of all processing operations are given in Section 6.
2.3.4.1 Encoding by the Test Codec
All of the source files will be encoded by the test codec to produce encoded packet files. For ease of reference, it is recommended that directories so70 subjctv exp t_pkt be created to deposit the test encoder output packets, and that the naming conventions be made consistent with the master codec.
2.3.4.2 Decoding by the Master/Test Codecs
The encoded packet files generated from the various encoders/conditions shall be processed through the master and test decoders.
2.3.4.3 Introduction of Impairments
For the frame error conditions, the impaired master codec encoded packet files are provided in the so70 subjctv exp m_pkt directory. Unlike other conditions, this condition uses only the test decoder and not the test encoder. For the Dim-and-Burst processing, and also the Packet Level Signaling conditions, the processing requires inputs from a signaling file to cont
103. and Filter responses
3.3.3 Master Codec for SO 70
This section describes the C simulation of the speech codec specified by [1]. The master codec C simulation used for verifying the performance of a non-bit-exact EVRC-WB implementation shall be the floating-point master C simulation included in the associated Software Distribution [1a].
3.3.3.1 Compiling the Master Codec Simulation
The source code for the floating-point simulation can be compiled using the GNU G++ compiler and make utility. A G++-compatible makefile has been included in the appropriate sub-directory in [1a]. Typing make in this directory will compile and link the code and create the executable file called Evrc wb (Evrc wb exe on Win32 systems), which will be placed in the same directory. The included makefile may require some user modification for a particular hardware platform and/or operating system.
3.3.3.2 Running the Master Codec Simulation
The EVRC-WB floating-point executable (Evrc wb) files use command line arguments to receive all information regarding input and output files and various parameters used during execution. Executing Evrc wb with no command line arguments will display a brief description of the required and optional command line arguments. The options are described below:
-i infn (required) Specifies the name of the input speech file
104. ated with Sections 3.1.2 through 3.1.4 (SO 3), 3.2.2 through 3.2.4 (SO 68), 3.3.2 through 3.3.4 (SO 70), or 3.4.2 through 3.4.4 (SO 73) can be found in the Software Distribution associated with this document. The objective and subjective testing requires that speech data files can be input to the speech encoder and that the output data stream can be saved to a set of files. It is also necessary to input data stream files into the speech decoder and have the output speech data saved to a set of files. This process suggests the use of a computer-based data acquisition system to interface to the codec under test. Since the hardware realizations of the speech codec may be quite varied, it is not desirable to precisely define a set of hardware interfaces between such a data acquisition system and the codec. Instead, only a functional description of these interfaces will be defined. A host computer system is necessary to handle the data files that must be input to the speech encoder and decoder, and to save the resulting output data to files. These data files will contain either sampled speech data or speech codec parameters; hence, all the interfaces are digital. The generic Standard Equipment is shown in Figure 3-1.
Figure 3-1 Basic Test Equipment (a host computer exchanging digital data with the speech encoder or decoder)
The host computer has access to the data files needed for testing. For encoder testing t
105. ay be modified to allow for variations in laboratory data-gathering apparatus.
This is an experiment to determine the perceived quality of speech over the telephone. You will be listening to a number of recorded speech samples, spoken by several different talkers, and you will be rating how good you think they sound. Use the single headphone on the ear you normally use for the telephone. On each trial, a two-sentence sample will be played. After you have listened to the sample, determine the category from the list below which best describes the overall quality of the sample. Press the numeric key on your keyboard corresponding to your rating for how good or bad that particular passage sounded. The quality of the speech should be rated according to the scale below:
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad
During the session you will hear samples varying in different aspects of quality. Please take into account your total impression of each sample, rather than concentrating on any particular aspect.
Figure 2.3.8.1-1 Instructions for Listeners
2.3.8.2 P.835 Listening Test Procedures - Experiments 2, 4 and 6
Experiments 2, 4 and 6 use the P.835 test methodology described in ITU-T Rec. P.835 [13]. The P.835 methodology is specifically designed to evaluate the quality of speech in background noise. It yields a measure of Signal Quality (SIG), a measure of Background Quality (BAK), and a measure of Over
106. be described in Section 2.2.4.3. Similarly, the directory so68 subjctv exp2 m pkt contains the master packet files for Experiment II. Here the pc, pb and ps files are the master packet files for the c15, b20 and s15 source files, respectively. For the master encode/master decode directories (so68 subjctv exp m m), the naming convention of the speech files is such that the first two characters of the file name indicate the codec combination and the suffix indicates the condition numbers in Table 2.2.2.3.1-2 and Table 2.2.2.3.2-2. Detailed descriptions of all processing operations are given in Section 6.
2.2.4.1 Encoding by the Test Codec
All of the source files will be encoded by the test codec to produce encoded packet files. For ease of reference, it is recommended that directories so68 subjctv exp1 t pkt and so68 subjctv exp2 t pkt be created to deposit the test encoder output packets, and that the naming conventions be made consistent with the master codec.
2.2.4.2 Decoding by the Master/Test Codecs
The encoded packet files generated from the various encoders/conditions shall be processed through the master and test decoders.
2.2.4.3 Introduction of Impairments
For the 3% frame error condition (Experiment condition d), the impaired master codec encoded packet files are provided in the so68 subjctv exp1
107. be examined to ensure that the files only contain data in those file locations where data should exist for a given data rate. The examination of the encoded frame packet files should indicate the occurrence of any improper data in the files, but the examination must not alter the encoded frame packet files in any way.
2.4.4.5 Post-processing of test-condition output files
In order to build the play sets to be presented to the listening panels, the output files for the various test conditions must be processed to provide the appropriate listening conditions. In addition, the concatenated output files must be partitioned into the samples representing the combination of test condition and talker. The listening conditions for Narrowband experiments are provided by filtering the output files using the STL software tool (filter) with the MIRS-receive filter mask. The listening conditions for Wideband experiments are provided by mixing (STL tool oper) the output files with psophometrically filtered noise (STL tool filter, PSO filter mask) at -74 dBov. STL tool astrip is also used to split the concatenated files into the individual samples appropriate for the experiment. Table 2.4.4.5-1 shows the cutting points to be used with the astrip tool for producing the two-sentence samples for the Experiments 1 and 3 ACR test. Table 2.4.4.5-2 shows the cutting points to be used with the astrip tool for producing the single-sentence sub-samples for the Experiments 2 and 4 P
108. c Results for the SO 70 Listening tests
The votes for each of the test conditions for SO 70 Experiments 1 and 3 and Experiments 2 and 4 shall be averaged to produce an associated mean score (M), as shown in Equation 2.3.9.1-1, and a Standard Deviation (SD), as shown in Equation 2.3.9.1-2, where L is the number of listeners and T is the number of talkers involved in the experiment.
M = \frac{1}{L \times T} \sum_{l=1}^{L} \sum_{t=1}^{T} x_{l,t}    (2.3.9.1-1)
SD = \sqrt{\frac{\sum_{l=1}^{L} \sum_{t=1}^{T} (x_{l,t} - M)^2}{L \times T - 1}}    (2.3.9.1-2)
2.3.9.2 Minimum Subjective Requirement for SO 70 Listening Tests
The Terms of Reference for the MPS tests state that the mean score for each of the Test Encoder/Decoder Combinations (E/DC) should be "not worse than" the mean score for the Reference E/DC. For most of the test conditions involved in the subjective experiments there are three Test E/DCs (M-T, T-M, and T-T), which means there are three statistical tests against the Reference E/DC (M-M). The three statistical tests are not independent, however. Since they all involve the same ratings for the Reference E/DC, t-tests are not appropriate. The appropriate statistical test for multiple Test conditions against a common Reference condition is Dunnett's Test. A complete description of Dunnett's Test is contained in Appendix B. The critical value for the Dunnett's test is 2.09 (one-sided test, p < .05, 4 E/DCs, df = 93). For those test conditions where a single Test E/DC (T-T) is compared against the Reference E/DC, the appropria
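The mean score and standard deviation defined above can be sketched in a few lines of Python. This is only an illustration: the function name `mean_and_sd` and the nested-list layout `votes[l][t]` (one vote per listener l and talker t for a single test condition) are assumptions made for the example, not part of the specification or its Software Distribution.

```python
import math

def mean_and_sd(votes):
    """Mean score M and standard deviation SD over all L x T votes.

    votes[l][t] is assumed to hold the vote of listener l for talker t
    under one test condition (hypothetical layout for this sketch).
    """
    flat = [x for row in votes for x in row]
    n = len(flat)                      # n = L x T
    m = sum(flat) / n                  # mean score M
    # Sample standard deviation with the (L x T - 1) denominator.
    sd = math.sqrt(sum((x - m) ** 2 for x in flat) / (n - 1))
    return m, sd

# Two listeners, two talkers.
m, sd = mean_and_sd([[4, 3], [5, 4]])
print(m, sd)
```

With the four votes above, the deviations from the mean are 0, 1, 1 and 0, so the sum of squares is 2 and the divisor is 3.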
109. c listening tests shall be made with one side of high-fidelity circum-aural headphones. The speech material delivery system shall meet the requirements of Section 3.1.1.1. The delivery system shall be calibrated to deliver an average listening level of -16 dBPa (78 dB SPL). The equivalent acoustic noise level of the delivery system should not exceed 35 dBA, as measured on a standard A-weighted meter. The listeners should be seated in a quiet room, with an ambient noise of 40 dBA or below.
2.1.7 Listeners
The listener sample is intended to represent the population of telephone users with normal hearing acuity. The listeners should be naive with respect to telephony technology issues; that is, they should not be experts in telephone design, digital voice encoding algorithms, and so on. They should not be trained listeners; that is, they should not have been trained in these or previous listening studies using feedback trials. The listeners should be adults of mixed sex and age. Each listener shall provide data only once for a particular evaluation. A listener may participate in different evaluations, but test sessions performed with the same listener should be at least one month apart so as to reduce the effect of cumulative experience.
2.1.8 Listening Test Procedures
The listeners shall listen to each sample and rate the quality of the test sample using a five-point scale, with the points labeled:
1 Bad
2 Poor
3 Fair
4
110. ces of approximately 8 sec duration. For the P.835 Experiments 2, 4 and 6, the sample consists of three sub-samples, where each sub-sample is a single sentence of approximately 4 sec duration. The samples shall be presented to the listeners in a randomized presentation order. The listeners for each file set shall be presented with practice trials for subjective Experiments 1, 3 and 5 and Experiments 2, 4 and 6. The randomization of the test samples has been accomplished with the following constraints for each of the six experiments:
1. A trial, i.e., a test sample, for the combination of each test condition and each talker shall be presented exactly once to each listening panel (i.e., trials/panel = conditions x talkers).
2. Randomization is in "blocks", such that one sample of each test condition is presented once, with a randomly selected talker, in each block. This ensures that listeners rate each test condition equally often in the initial, middle and final parts of the block and controls for the effects of time and order of presentation. A block contains the same number of samples as there are test conditions involved in the test. A test session consists of the same number of blocks as there are talkers involved in the test. Each session is presented to a listening panel of four listeners.
3. Randomizations are constructed such that talker gender is alternated on successive trials, resulting in the same
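Constraints 1 and 2 above (every condition/talker pair exactly once per panel, one sample of each condition per block, as many blocks as talkers) can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `build_session` and its arguments are hypothetical, and the sketch deliberately does not implement constraint 3 (gender alternation on successive trials), whose full wording is not available in this excerpt.

```python
import random

def build_session(conditions, talkers, rng):
    """Return one test session as a list of blocks of (condition, talker) trials.

    Each block holds every condition once; across the session every
    (condition, talker) pair occurs exactly once. Gender alternation
    (constraint 3) is NOT handled in this sketch.
    """
    # For each condition, draw an independent random order of the talkers.
    talker_order = {c: rng.sample(talkers, len(talkers)) for c in conditions}
    blocks = []
    for b in range(len(talkers)):          # number of blocks = number of talkers
        block = [(c, talker_order[c][b]) for c in conditions]
        rng.shuffle(block)                 # random presentation order in the block
        blocks.append(block)
    return blocks

session = build_session(["c1", "c2", "c3"], ["m1", "f1", "m2", "f2"], random.Random(7))
for block in session:
    print(block)
```

Passing an explicit `random.Random` instance keeps each panel's randomization reproducible, which is convenient when the play sets have to be regenerated.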
111. ch codecs under test are not required to provide performance across ranges of temperature, humidity or other typical physical environmental variables.
3.1 Specific Standard Test Conditions for SO 3
3.1.1 Audio Path and Calibration for SO 3
3.1.1.1 Audio Path
The audio path must meet the following requirements for electro-acoustic performance, measured between the output of the D/A converter and the output of the headphone:
1. Frequency response shall be flat to within 2 dB between 200 Hz and 3400 Hz, and below 200 Hz the response shall roll off at a minimum of 12 dB per octave. Equalization may be used in the audio path to achieve this. A suitable reconstruction filter shall be used for playback.
2. Total harmonic distortion shall be less than 1% for signals between 100 Hz and 4000 Hz.
3. Noise over the audio path shall be less than 35 dBA, measured at the ear reference plane of the headphone.
4. Signal shall be delivered to the headphone on the listener's preferred telephone ear. No signal shall be delivered to the other headphone.
3.1.1.2 Calibration
The audio circuit shall deliver an average sound level of the stimuli to the listener at -16 dBPa (78 dB SPL) at the ear reference plane. This level was chosen because it is equivalent to the level delivered by a nominal ROLR handset driven by the average signal level on the PSTN network. This leve
112. ction. The output file will contain packets of compressed data. If neither the -d or the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-h max Sets the maximum allowable data rate to max, where max is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.1.3.3-1.
-l min Sets the minimum allowable data rate to min, where min is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.1.3.3-1.
If neither the -h nor -l option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8. In addition, if max ≠ min, the data rate varies between max and min using the same rate decision algorithm, where the data rate is set to max if the selected data rate is > max, and the data rate is set to min if the selected data rate is < min. See the select rate routine in the file ratedec.c for more information.
-p flag If flag is set to 0, the post-filter is disabled. If the flag is set to 1, the post-filter is enabled. If the -p option is not invoked, the post-filter is enabled during decoding.
-n flag If flag is set to 0, noise suppression is disabled. If the flag is set to 1, noise suppression is enabled. If the -n option is not invoked, noise suppression is enabled during encoding.
3.1.3.3 File Formats
Files of speech contain 2's complement 16-bit samples w
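The maximum/minimum rate constraint described for the max and min options amounts to clamping the selected rate code into the allowed range. The sketch below is illustrative only: `constrain_rate` is a hypothetical name, not a routine from the Software Distribution, and it assumes that the packet-file rate codes (4 for Rate 1, 3 for Rate 1/2, 1 for Rate 1/8) can be compared numerically.

```python
VALID_RATE_CODES = (1, 3, 4)   # assumed codes: 1 = Rate 1/8, 3 = Rate 1/2, 4 = Rate 1

def constrain_rate(selected, min_code=1, max_code=4):
    """Clamp the rate chosen by the rate decision into [min_code, max_code].

    The defaults mirror the unconstrained case (Rate 1/8 up to Rate 1).
    """
    for code in (selected, min_code, max_code):
        if code not in VALID_RATE_CODES:
            raise ValueError("unknown rate code: %r" % (code,))
    if min_code > max_code:
        raise ValueError("min_code must not exceed max_code")
    # Above the maximum -> use max; below the minimum -> use min.
    return max(min_code, min(max_code, selected))

# Rate 1/2 Maximum processing: a Rate 1 decision is forced down to Rate 1/2.
print(constrain_rate(4, max_code=3))
```

Setting both limits to the same code pins the codec to a single rate, which is how a "full rate only" run would look under these assumptions.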
113. d in Section 6 3 40 C S0018 D v1 0 Table 3 4 4 5 2 SO 73 Encoder Suite A Bit exact Test Conditions Reference packet files for bit Input File Operating Point Condition exact compliance src s22 EVRC NW operating point Nominal 22 dB evrc nw opO0 p22 0 16 kHz sampling src s12 EVRC NW operating point High 12 dB evrc_nw_op0 p12 0 16 KHz sampling src s32 EVRC NW operating point Low 32 aB evrc nw opO dim 196 p32 0 16 kHz sampling 196 d amp b src c10 EVRC NW operating point Nominal 22 dB evrc nw opO pc1 0 16 kHz sampling 10 dB car noise Src c20 EVRC NW operating point Nominal 22 dB evrc nw opO pc2 0 16 kHz sampling 20 dB car noise src s15 EVRC NW operating point Nominal 22 dB evrc_nw_op0 ps 0 16 kHz sampling 15 dB street noise src b20 EVRC NW operating point Nominal 22 aB evrc nw opO pb 0 16 kHz sampling 20 dB babble noise src s22 8k EVRC NW operating point Nominal 22 dB evrc nw op1 p22 1 8 kHz sampling src s12 8k EVRC NW operating point High 12 dB evrc_nw_op1 p12 1 8 kHz sampling src s32 8k EVRC NW operating point Low 32 dB evrc_nw_op1 p32 1 8 KHz sampling src s22 8k EVRC NW operating point Nominal 22 dB evrc nw op1 dim 196 p22 1 8 kHz sampling 196 d amp b src s22 8k EVRC NW operating point Nominal 22 dB evrc nw op6 p22 6 8 kHz sampling src s12 8k EVRC NW operating point High 12 dB evrc nw op6 p12 6 8 kHz sampling src s32 8k
114. d with frame erasures. The so3 testvec fixed31 (so3 testvec fixed32) directory contains files processed with the 31-bit (32-bit) DSP library. The files in these directories are the reference files for bit-exact compliance. A test codec is bit-exact if it can reproduce all of the reference files in either the so3 testvec fixed32 directory or the so3 testvec fixed31 directory.
3.1.4.6.1 Description of Bit-Exact Source Files
The following source files are designed to exercise the majority of the bitstream slots:
vec 01 pcm: 15 dB babble, 7 females, 7 males
vec 02 pcm: 10 dB car, 7 females, 7 males
vec 03 pcm: flat clean, 7 females, 7 males
vec 04 pcm: 15 dB street, 7 females, 7 males
vec 05 pcm: high level, 4 females, 4 males
vec 06 pcm: low level, 4 females, 4 males
vec 07 pcm: irs clean, 4 females, 4 males
vec 08 pcm: flat clean, 4 females, 4 males
vec 09 pcm: 10 dB car, 4 females, 4 males
vec 10 pcm: 15 dB babble, 4 females, 4 males
vec 11 pcm: 12 dB street, 4 females, 4 males
vec 12 pcm: mixed noise, one-sided conversation
vec 13 pcm: mixed noise, one-sided conversation
The following source files are designed to exercise the RCELP algorithm. NOTE: These files must be processed in full-rate-only mode (only rate 4 allowed).
shiftr pcm: Frequency sweep
shiftl pcm: Frequency sweep
The following source files are recordings of one-sided conversations at different input levels and are designed to test the rate determination algorithm:
rda test pcm
rda mod pcm
rda hig
115. dB evrc wb opoO os 8k operating point 0 8 kHz sampling 15 dB street noise evrc wb opoO fer 296 pb evrc wb opoO fer EVRC WB operating point 0 8 kHz samplin EVRC WB operating point 0 8 kHz samplin 3 31 Nominal 22 dB 20 dB babble noise 2 FER Generic audio signal fer 396 evrc wb opO fer 296 0b 8k evrc wb opO fer 396 0m 8k C S0018 D v1 0 Table 3 3 4 5 8 SO 70 Encoder Suite D Bit exact Test Conditions Reference packet files for Input File Operating Point Condition etap liance src s22 8k EVRC WB operating point 4 Nominal 22 dB evrc wb op4 p22 8 kHz sampling src s12 8k EVRC WB operating point 4 High 12 dB evrc wb op4 p12 8 kHz sampling src s32 8k EVRC WB operating point 4 Low 32 dB evrc wb op4 p32 8 kHz sampling src s22 8k EVRC WB operating point 4 Nominal 22 dB evrc wb op4 dim 196 p22 8 kHz samplin 196 d amp b src s22 8k EVRC WB operating point 7 Nominal 22 dB evrc wb op7 p22 8 kHz sampling src c15 8k EVRC WB operating point 4 Nominal 22 dB evrc wb op4 pc 8 kHz sampling 15 dB car noise src s15 8k EVRC WB operating point 4 Nominal 22 dB evrc wb op4 ps 8 kHz sampling 15 dB street noise src b20 8k EVRC WB operating point 4 Nominal 22 dB evrc wb op4 pb 8 kHz sampling 20 dB babble noise src c15 8k EVRC WB operating point 7 Nominal 22 dB evrc wb op7 pc 8 kHz sampling 15 dB car noise
116. der/test decoder, 3% forward link FER
• 4: master encoder/test decoder, 3% reverse link FER
2.1.4.4 Tandem Conditions
The clear channel tandem condition shall be performed by:
• encoding the appropriate source file
• decoding the encoder's output file
• normalizing signal power to -22 dB
• companding the modified decoded speech file to μ-law PCM format
• encoding the μ-law PCM companded version of the decoded speech file
• decoding the resultant encoder's output file to generate the processed speech file
• normalizing signal power to -22 dB
• companding the modified decoded speech file to μ-law PCM
This process is performed for each combination of master encode/test decode, test encode/master decode, and test encode/test decode. The master/test combinations for tandem processing represent master encode/test decode followed by master encode/test decode, and vice versa for the test/master combination. The master encode/master decode files are provided. The following four conditions are tested:
• M M MM
•
•
• T T T T
To expedite processing, it may be possible to use the output files for Experiment II, condition 1 (tm1, mt1 and tt1) as the input for the three test combinations. It is also worth noting that the front-end algorithmic delay through the master codec is 13 ms, or 104 s
117. e 8 kHz sampling C S0018 D v1 0 Input Packet File evrc nw opO ps Operating Point EVRC NW operating point 0 8 kHz sampling Condition Nominal 22 dB 15 dB street noise Reference output speech files for bit exact compliance evrc nw opO os 8k evrc nw opO fer 296 p b EVRC NW operating point 0 8 kHz sampling Nominal 22 dB 20 dB babble noise 296 FER evrc nw opoO fer 296 ob 8k evrc nw opO fer 396 p m EVRC NW operating point O 8 kHz sampling Generic audio signal fer 396 evrc nw opO fer 3 om 8k evrc nw opi fer 396 p 22 EVRC NW operating point 1 8 kHz sampling Nominal 22 dB FER 396 evrc nw opi fer 396 022 8k evrc nw op1 p12 EVRC NW operating point 1 8 kHz sampling High 12 dB evrc nw op1 012 8k evrc nw op1 p32 EVRC NW operating point 1 8 kHz sampling Low 32 dB evrc nw 0op1 032 8k evrc nw op6 fer 396 p 22 EVRC NW operating point 6 8 kHz sampling Nominal 22 dB FER 396 evrc nw op6 fer 396 022 8k evrc nw op6 p12 EVRC NW operating point 6 8 kHz sampling High 12 dB evrc nw op6 012 8k evrc nw op6 p32 EVRC NW operating point 6 8 kHz sampling Low 32 dB evrc nw op6 032 8k evrc nw op7 p22 EVRC NW operating point 7 8 kHz sampling Nominal 22 dB evrc nw op7 o22 8k evrc nw opi dim 196 pls 196 p22 EVRC NW operating point 1 8 kHz sampling Nom
118. e Experiment 2 for SO 73
The Test Parameters for Listening Experiment 2 are presented in Table 2.4.2.3.2-1.
Table 2.4.2.3.2-1 SO 73 Listening Experiment 2 Test Parameters
Number of talkers: 3 males, 3 females
Test conditions:
• Car Noise, 10 dB SNR
• Car Noise, 20 dB SNR, 2% FER
• Street Noise, 15 dB SNR
• Babble noise, 20 dB S/N
Encoder/Decoder Combinations (4): M-M, M-T, T-T, T-M
The Test Conditions for Listening Experiment 2 are presented in Table 2.4.2.3.2-2.
Table 2.4.2.3.2-2 SO 73 Listening Experiment 2 Test Conditions
Exp. 2 Wideband P.835
Reference Conditions
File | MNRU | SNR
b01 | MNRU = 40 dB | SNR = 40 dB | Reference
b02 | MNRU = 40 dB | SNR = 20 dB | Reference
b03 | MNRU = 40 dB | SNR = 0 dB | Reference
b04 | MNRU = 0 dB | SNR = 40 dB | Reference
b05 | MNRU = 20 dB | SNR = 40 dB | Reference
b06 | MNRU = 10 dB | SNR = 10 dB | Reference
b07 | MNRU = 20 dB | SNR = 20 dB | Reference
b08 | MNRU = 40 dB | SNR = 30 dB | Reference
Test Conditions
File | Condition | Enc-Dec
b09 | Car 10 dB SNR | M-M
b10 | Car 10 dB SNR | M-T
b11 | Car 10 dB SNR | T-T
b12 | Car 10 dB SNR | T-M
2.4.2.3.3 Subjective Experiment 3 for SO 73
The Test Parameters for Listening Experiment 3 are presented in Table 2.4.2.3.3-1.
Table 2.4.2.3.3-1 SO 73 Listening Experiment 3 Test Parameters
Exp. 2 Wideband P.835
b13 | Car 20 dB SNR, 2% FER | M-M
b14 | Car 20 dB SNR, 2% FER | M-T
b15 | Car 20 dB SNR, 2% FER | T-T
b
119. e Reference File Test Condition Enc Dec f05 0 FER M M f06 096 FER M T f07 396 FER M M f08 3 FER M T 2 4 2 3 7 Numerical Parameters for the SO 73 Listening Experiments Table 2 4 2 3 7 1 describes the resultant numerology that is used for the six SO 73 listening experiments The first column is the description of the parameter and columns 2 to 7 show the numerical value for each of the parameters for the six listening experiments For each listening experiment the different Encode Decode Test conditions include various interconnections between the Master and Test Encoders and the Master and Test Decoders There are eight reference conditions in each of the experiments 1 through 4 and four reference conditions in experiments 5 and 6 Table 2 4 2 3 7 1 Numerical Parameters for the SO 73 Listening Experiments Parameter Exp t Exp 2 Exp 3 Exp4 Exp 5 Exp 6 Reterence Conditions e e e s 4 a4 Total Gonditons ee 24 d e o oe e e e e vm SimuiperTaker gene 8 8 s amp e e Total Simul Experiment ___ 1596 1152 3072 2016 96 ft a fe ofa Stimuli per Listening Panel 192 144 252 24 2 77 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 C S0018 D v1 0 Paameter Expa Bxp2 Exp 3 Bp4 Exp 5 Exo Listeners Voters per Listening Panel 4 4 4
120. e Unit (MNRU), February 1996.
ITU-T Recommendation P.830, Methods for Objective and Subjective Assessment of Quality, Annex D: Modified IRS Send and Receive Characteristics (MIRS), February 1996.
[13] ITU-T Recommendation P.835, Subjective test methodology for evaluating speech communication systems that include noise suppression algorithm, November 2003.
INFORMATIVE REFERENCES
[14] 3GPP2 C.S0011-C, Recommended Minimum Performance Standards for cdma2000 Spread Spectrum Mobile Stations, March 2006.
[15] 3GPP2 C.S0010-C, Recommended Minimum Performance Standards for cdma2000 Spread Spectrum Base Stations, March 2006.
[16] TIA/EIA-95-B (R2004), Mobile Station-Base Station Compatibility Standard for Wideband Spread Spectrum Cellular Systems, October 2004.
[17] IEEE 269-2002 (R2002), Standard Method for Measuring Transmission Performance of Telephone Sets, Handsets and Headsets, April 2003.
[18] IEEE STD 661-1979 (R2008), IEEE Standard Method for Determining Objective Loudness Ratings of Telephone Connections, September 2008.
[19] ISO/IEC 9899:1999 (C2001, C2004), Programming Languages - C.
[20] Dunnett, C. W., "A multiple comparison procedure for comparing several treatments with a control", Journal of the American Statistical Association, vol. 50, 1955, pp. 1096-1121.
[21] 3GPP TS 26.131, 3rd Generation Partnership Project; Technical Specification Group Services and System Aspects; Terminal acoustic char
121. e for each data rate is shown in Table 3.1.3.3-1.
Table 3.3.3.3-1 Packet File Structure From Master Codec/Channel Error Model
Value in Packet File | Rate | Data Bits per Frame
4 0x0004 171
3 0x0003 1 2
0 000 Blank fo
14 0x0000 Lo
Unused bits are set to 0. For example, in a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame (in byte-swapped form), followed by ten 16-bit words containing all zero bits.
3.3.4 Fixed-Point Bit-Exact Codec for SO 70
This section describes the C simulation of the speech codec specified by [1]. The speech codec C simulation is based on finite-precision, fixed-point arithmetic operations and is recommended to be used as a reference codec to verify the performance of a bit-exact EVRC-WB implementation of the fixed-point C simulation of a test codec. The bit-exact EVRC-WB codec, along with the appropriate test vectors to verify the bit-exactness performance, are included in the associated Software Distribution.
3.3.4.1 Fixed-Point Codec Program Files
This section describes the C program files which are provided in the associated software distribution for this document.
3.3.4.2 Compiling the Fixed-Point Codec Simulation
The source code for t
122. e must contain packets of compressed data.
-e Instructs the simulation to perform only the encoding function. The output file will contain packets of compressed data. If neither the -d or the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-M max Sets the maximum allowable data rate to max, where max is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.1.3.3-1.
-m min Sets the minimum allowable data rate to min, where min is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.1.3.3-1.
If neither the -M nor -m option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8. In addition, if max ≠ min, the data rate varies between max and min using the same rate decision algorithm, where the data rate is set to max if the selected data rate is > max, and the data rate is set to min if the selected data rate is < min.
3.3.4.4 File Formats
Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words, with the low byte ordered first followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.1.3.3-1. Unused bits are set to 0. For example, in a Rate 1/8
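The packet file layout described above (twelve 16-bit words per frame, low byte first, first word carrying the rate code) can be read with a short Python sketch. The helper names `read_packets` and `eighth_rate_padding_ok` are hypothetical and are not tools from the Software Distribution; the padding check only covers the Rate 1/8 case (code 1: one data word followed by ten all-zero words), because that is the only per-rate layout spelled out in this excerpt.

```python
import struct

WORDS_PER_PACKET = 12
PACKET_BYTES = 2 * WORDS_PER_PACKET

def read_packets(data):
    """Split raw packet-file bytes into (rate_code, payload_words) tuples."""
    if len(data) % PACKET_BYTES:
        raise ValueError("file length is not a multiple of 24 bytes")
    packets = []
    for offset in range(0, len(data), PACKET_BYTES):
        # "<12H": twelve unsigned 16-bit words, least significant byte first.
        words = struct.unpack_from("<12H", data, offset)
        packets.append((words[0], words[1:]))
    return packets

def eighth_rate_padding_ok(rate_code, payload):
    """For a Rate 1/8 frame (code 1), the ten words after the data word must be zero."""
    return rate_code != 1 or all(w == 0 for w in payload[1:])

# One synthetic Rate 1/8 frame: rate word, one data word, ten zero words.
frame = struct.pack("<12H", 1, 0xBEEF, *([0] * 10))
for rate, payload in read_packets(frame):
    print(rate, hex(payload[0]), eighth_rate_padding_ok(rate, payload))
```

The check only reads the data, in keeping with the requirement that examining encoded frame packet files must not alter them.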
123. e stimuli to the listener at -15 dBPa (79 dB SPL) at the ear reference plane. This level was chosen because it is equivalent to the level delivered by a nominal ROLR handset driven by the average signal level on the PSTN network. This level may be calibrated using a suitable artificial ear with circum-aural headphone adapter and microphone. A test file with a reference signal is included with the source speech database for the purpose of calibration. The file cos1004 290 is located in the directory so68 cal of the companion software. The calibration file contains a -22 dB 1004 Hz reference signal. The audio circuit shall be calibrated so that the test signal has a level of -15 dBPa at the ear reference plane, while maintaining compliance with Section 3.2.1.1.
3.2.2 Standard Software Test Tools for SO 68
This section describes a set of software tools useful for performing the MPS tests. The code has been developed and compiled using the GNU g++ compiler and software maintenance utilities. The tools have been verified under various representative operating systems on a number of different hardware platforms. The 3GPP2-supplied tools are all located in the so68 tools directory in the associated Software Distribution and can be built using the GNU g++ compiler. Other software tools, such as scaldemo, actlev, filter, and astrip, are available in [6].
3.2.2.1 Channel Model Utilities f
124. each of the Test E/DC means is compared statistically to the Reference E/DC mean, and the mean difference is evaluated for significance. The three statistical tests use a common estimate of the Standard Error of the Mean Difference (SE_MD) derived from the Error Mean Square from the ANOVA.
5.1 Stage 1 - Analysis of Variance
Table 5.1-1 shows the generalized Variance Source Table for the stage 1 ANOVAs involved in the Dunnett's Tests. The Error Sum of Squares (SoS) in the ANOVA is the residual SoS after removal of the systematic effects due to the E/DC and the Subjects factors. (The scores for each subject are average values over talkers.)
Table 5.1-1 Variance Source Table for the ANOVA
Source | Degrees of Freedom (df) | Sum of Squares (SoS) | Mean Square (MS) | F Ratio
E/DC | df_{E/DC} = c - 1 | SoS_{E/DC} = s \sum_{i} (\bar{X}_{i} - \bar{X})^2 | MS_{E/DC} = SoS_{E/DC} / df_{E/DC} | F = MS_{E/DC} / MS_{E}
Subjects | df_{S} = s - 1 | SoS_{S} = c \sum_{j} (\bar{X}_{j} - \bar{X})^2 | |
Error | df_{E} = df_{T} - df_{E/DC} - df_{S} | SoS_{E} = SoS_{T} - SoS_{E/DC} - SoS_{S} | MS_{E} = SoS_{E} / df_{E} |
Total | df_{T} = (c \times s) - 1 | SoS_{T} = \sum_{i} \sum_{j} (x_{ij} - \bar{X})^2 | |
5.2 Stage 2 - Dunnett's Multiple Means Test - Test CCs vs. the Reference CC
In Stage 2 of the Dunnett's Test, the Mean score for each of the Test E/DCs is compared statistically to the Mean for the reference codec, as shown in Equation 5.2-1. The value for the Standard Error of the Mean Difference (SE_MD) is computed using the estimate of Mean Square Error (MS_E) derived from the Stage
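The two stages can be sketched together in Python. This is an illustration under loudly stated assumptions, not the procedure's reference implementation: the function name `dunnett_stages` is hypothetical; the error term is the residual after removing the E/DC and Subjects effects, as described above; and, because the defining equation is cut off in this excerpt, the sketch assumes the common textbook forms SE_MD = sqrt(2 * MS_E / s) and statistic = (M_ref - M_test) / SE_MD, with a Test E/DC passing when its statistic is below the critical value.

```python
import math

def dunnett_stages(scores, ref=0, critical=2.09):
    """scores[i][j]: score of subject j for E/DC i (already averaged over talkers)."""
    c, s = len(scores), len(scores[0])
    grand = sum(sum(row) for row in scores) / (c * s)
    cond_means = [sum(row) / s for row in scores]
    subj_means = [sum(scores[i][j] for i in range(c)) / c for j in range(s)]
    # Stage 1: ANOVA sums of squares; the error SoS is the residual.
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_edc = s * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = c * sum((m - grand) ** 2 for m in subj_means)
    df_err = (c * s - 1) - (c - 1) - (s - 1)
    ms_err = (ss_total - ss_edc - ss_subj) / df_err
    # Stage 2 (assumed form): compare each Test E/DC mean with the Reference mean.
    se_md = math.sqrt(2.0 * ms_err / s)
    stats = {i: (cond_means[ref] - cond_means[i]) / se_md
             for i in range(c) if i != ref}
    passed = {i: d < critical for i, d in stats.items()}
    return {"df_err": df_err, "ms_err": ms_err, "se_md": se_md,
            "stats": stats, "passed": passed}

# Tiny example: row 0 is the Reference (M-M), row 1 a Test E/DC, two subjects.
result = dunnett_stages([[4.0, 2.0], [3.0, 3.0]])
print(result["df_err"], result["ms_err"], result["se_md"], result["stats"])
```

With 4 E/DCs and 32 subjects the error degrees of freedom come out as 127 - 3 - 31 = 93, which matches the df quoted for the 2.09 critical value.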
125. each trial you will be asked to attend only to the speech signal and rate how natural, or conversely, how degraded, the speech signal sounds to you. You will use the rating scale shown in the figure below to register your ratings of the speech signal. Your task will be to choose the numbered phrase from the list below that best describes your opinion of the SPEECH SIGNAL ALONE and then enter the corresponding number on your keyboard.
Attending ONLY to the SPEECH SIGNAL, select the category which best describes the sample you just heard. The SPEECH SIGNAL in this sample was:
5 VERY NATURAL, NO DEGRADATION
4 FAIRLY NATURAL, LITTLE DEGRADATION
3 SOMEWHAT NATURAL, SOMEWHAT DEGRADED
2 FAIRLY UNNATURAL, FAIRLY DEGRADED
1 VERY UNNATURAL, VERY DEGRADED
For the second sentence in each trial you will be asked to attend only to the background and rate how noticeable, intrusive, and/or conspicuous the background sounds to you. You will use the rating scale shown in the figure below to register your ratings of the background. Your task will be to choose the numbered phrase from the list below that best describes your opinion of the BACKGROUND ALONE and then enter the corresponding number on your keyboard.
Attending ONLY to the BACKGROUND, select the category which best describes the sample you just heard. The BACKGROUND in this sample was:
5 NOT NOTICEABLE
4 SOMEWHAT NOTICEABLE
3 NOTICEABLE BUT NOT INTRUSIVE
2 FAIRLY CONSPICUOU
126. ead Spectrum Digital Systems. Date: December 1999; August 2007; December 2007; January 25, 2010. C.S0018-D v1.0
FOREWORD
(This foreword is not part of the Standard.)
This document specifies the procedures to test implementations of EVRC-A, EVRC-B, EVRC-WB, or EVRC-NW compatible variable-rate speech codecs, either by meeting the bit-exact implementation or by meeting recommended minimum performance requirements. The EVRC-A is the Service Option 3 (SO 3) speech codec, the EVRC-B is the Service Option 68 (SO 68) speech codec, the EVRC-WB is the Service Option 70 (SO 70) speech codec, and the EVRC-NW is the Service Option 73 (SO 73) speech codec.
REFERENCES
The following standards contain provisions which, through reference in this text, constitute provisions of this Standard. At the time of publication, the editions indicated were valid. All standards are subject to revision, and parties to agreements based on this Standard are encouraged to investigate the possibility of applying the most recent editions of the standards indicated below. ANSI, 3GPP2, TIA, and ITU-T maintain registers of currently valid national and international standards published by them.
NORMATIVE REFERENCES
3GPP2 C.S0014-D v2.0, Enhanced Variable Rate Codec, Speech Service Options 3, 68, 70, and 73 for Wideba
127. een individual listeners' average ratings, where the average is over talkers. Therefore, the SE_MD is based on 32 difference scores, one for each listener (df = 31).
the criterion value for the appropriate test (2.09 for Dunnett's Test, 1.70 for the t-test), then the E/DC passes the MPS test.
$$\text{Test} = \frac{M_{Ref} - M_{Test}}{SE_{MD}} \qquad (2.4.9.2\text{-}1)$$
2.4.10 Expected Results for Reference Conditions
2.4.10.1 Reference Conditions for Experiments 1, 3, and 5
The MNRU conditions have been included to provide a frame of reference for Experiments 1, 3, and 5. In listening evaluations where test conditions span approximately the same range of quality, the MOS results for similar conditions should be approximately the same. Data from previous studies allows a generalization to be made concerning the expected MOS results for the MNRU reference conditions (see Figure 2.4.10.1-1). MOS scores obtained for the MNRU conditions in any SO 73 validation test should be compared to those shown in the graph below. Inconsistencies beyond a small shift in the means in either direction, or a slight stretching or compression of the scale near the extremes, may imply a problem in the execution of the evaluation test. In particular, MOS should be monotonic with MNRU, within the limits of statistical resolution, and the contour of the relation should show a similar slope.
[Plot: MOS versus dBQ (10 to 50 dBQ)]
Figure 2.4.10.1-1 Typica
128. eet noise evrc wb opO0 pb EVRC WB operating point 0 16 kHz sampling Nominal 22 dB 20 dB babble noise evrc_wb_op0 ob evrc_wb_op0 fer_3 pm EVRC WB operating point 0 16 kHz sampling Generic audio signal fer_3 evrc_wb_op0 fer_3 om Table 3 3 4 5 4 SO 70 Encoder Suite B Bit exact Test Conditions Input File Operating Point Condition mp ce me nnd TOU DIS exact compliance src s22 EVRC WB operating point Nominal 22 dB evrc wb op0 p22 0 16 kHz sampling src s12 EVRC WB operating point High 12 dB evrc_wb_op0 p12 0 16 kHz samplin src s32 EVRC WB operating point Low 32 aB evrc wb opoO p32 0 16 kHz sampling 196 d amp b src c10 EVRC WB operating point Nominal 22 dB evrc wb opO0 pc1 0 16 kHz sampling 10 dB car noise 3 28 C S0018 D v1 0 Input File Operating Point Condition mes PrN exact compliance src c20 EVRC WB operating point Nominal 22 aB evrc wb opO pc2 0 16 kHz sampling 20 dB car noise src s15 EVRC WB operating point Nominal 22 dB evrc_wb_op0 ps 0 16 kHz sampling 15 dB street noise src b20 EVRC WB operating point Nominal 22 aB evrc wb opO pb 0 16 kHz sampling 20 dB babble noise src s22 8k EVRC WB operating point Nominal 22 dB evrc wb op4 p22 4 8 kHz sampling src s12 8k EVRC WB operating point High 12 dB evrc wb op4 p12 4 8 kHz sampling src s32 8k EVRC WB operating point Low 32 dB evrc wb
129. ence
b08 | MNRU 40dB, SNR 30dB | Reference
Test Conditions
File | Condition | Enc-Dec
b09 | Car 10dB SNR | M-M
b10 | Car 10dB SNR | M-T
b11 | Car 10dB SNR | T-T
b12 | Car 10dB SNR | T-M
b13 | Car 20dB SNR, 2% FER | M-M
b14 | Car 20dB SNR, 2% FER | M-T
b15 | Car 20dB SNR, 2% FER | T-T
b16 | Car 20dB SNR, 2% FER | T-M
b17 | Street 15dB SNR | M-M
b18 | Street 15dB SNR | M-T
b19 | Street 15dB SNR | T-T
b20 | Street 15dB SNR | T-M
b21 | Babble 20dB SNR | M-M
b22 | Babble 20dB SNR | M-T
b23 | Babble 20dB SNR | T-T
b24 | Babble 20dB SNR | T-M
2.3.2.3.3 Subjective Experiment 3 for SO 70
The Test Parameters for Listening Experiment 3 are presented in Table 2.3.2.3.3-1.
Table 2.3.2.3.3-1 SO 70 Listening Experiment 3 Test Parameters
Type of test | ACR (P.800), Narrowband
Number of talkers | 4 males, 4 females
Background noise | none (ambient)
Audio Input Level | 22 dB, 32 dB, 12 dB
Filter characteristics | MIRS
Reference conditions | (8) Specified reference conditions
Test conditions | Nominal level: Modes 0, 4, 7; Low level: Modes 0, 4; High level: Modes 0, 4; 1% d&b, 1% pls: Modes 0, 4; 3% FER: Modes 0, 4
Encoder/Decoder Combinations | (4) M-M, M-T, T-T, T-M
The Test Conditions for Listening Experiment 3 are presented in Table 2.3.2.3.3-2.
Table 2.3.2.3.3-2 SO 70 Listening Experiment 3 Test Conditions
Exp. 3: Narrowband ACR
Reference Conditions
File | M
130. er of talkers P NSA P 835 Narrowband 3 males 3 females Background noise Specified test conditions Audio Input Level 22 dB Filter characteristics MIRS Reference conditions 8 Specified reference conditions Test conditions o Car Noise 15 dB SNR Modes 0 4 7 o Street Noise 15 dB SNR Modes 0 4 o Babble noise 20 dB SNR 2 FER Modes 0 4 Encoder Decoder Combinations 4 M M T T T M The Test Conditions for Listening Experiment 4 are presented in Table 2 3 2 3 4 2 Table 2 3 2 3 4 2 SO 70 Listening Experiment 4 Test Conditions Exp 4 Narrowband P 835 Reference Conditions File MNRU 901 MNRU 40dB SNR 40dB Reference 902 MNRU 40dB SNR 20dB Reference MNRU 40dB SNR 0dB Reference d04 MNRU 0aB SNR 40dB Reference 005 MNRU 20dB SNR 40dB Reference 906 MNRU 10dB SNR 10dB Reference d07 MNRU 20dB SNR 20dB Reference d08 MNRU 40dB SNR 30dB Reference Test Conditions File Condition Enc Dec 909 15dB SNR Mode 0 LB portion of Wideband mode decoder test only M M d10 Car 15dB SNR Mode 0 LB portion of Wideband mode decoder test only M T 911 15dB SNR Mode 4 interoperable with Mode 0 of SO 68 support M M 912 15dB SNR Mode 4 interoperable with Mode 0 of SO 68 support M T d13 Car 15dB SNR Mode 4 interoperable with Mode 0 of SO 68 support IT T 914 Car 15dB SNR Mode 4 interoperable with Mode 0 of SO 68
131. erence conditions; 0% FER and 3% FER. Encoder/Decoder Combinations: (2) M-M, M-T.
The Test Conditions for Listening Experiment 8 are presented in Table 2.3.2.3.8-2.
Table 2.3.2.3.8-2 SO 70 Listening Experiment 8 Test Conditions
Exp. 8: Narrowband Music
File | Reference Condition
h01 | MNRU 10dB | Reference
h02 | MNRU 20dB | Reference
h03 | MNRU 30dB | Reference
h04 | Source | Reference
File | Test Condition | Enc-Dec
h05 | 0% FER | M-M
h06 | 0% FER | M-T
h07 | 3% FER | M-M
h08 | 3% FER | M-T
2.3.2.3.9 Numerical Parameters for the SO 70 Listening Experiments
Table 2.3.2.3.9-1 describes the resultant numerology that is used for the eight SO 70 listening experiments. The first column is the description of the parameter, and columns 2 to 9 show the numerical value for each of the parameters for the eight listening experiments. For each listening experiment, the different Encode/Decode Test conditions include various interconnections between the Master and Test Encoders and the Master and Test Decoders. There are eight reference conditions in each of Experiments 1 through 6 and four reference conditions in Experiments 7 and 8.
Table 2.3.2.3.9-1 Numerical Parameters for the SO 70 Listening Experiments
[Table columns: Exp 1 through Exp 8. Rows: Type of test; Encode/Decode Test conditions; Reference Conditions; Total]
132. ersig27.exe: This utility program provides (a) the ability to introduce Frame Erasure channel impairment, (b) the ability to verify use of half-rate or lesser frame rate during dim-and-burst and packet-level signaling, and (c) the ability to measure the Average Data Rate (ADR) from an encoded packet file. A log output of fersig27 provides detail on the ADR performance of the preceding encoder. In these applications, the utility is invoked as in the following examples for 3% FER and 1% signaling:
fersig27 -c EVRC-B -e fer_3%.bin infile outfile
fersig27 -c EVRC-B -s dim_1%.bin -e fer_3%.bin infile outfile
3.2.2.3 Channel Error and Signaling Masks
These binary Frame Error Rate and Signaling masks (source level and packet level; 1 byte of either 0 or 1 per frame) are used with the fersig27 channel-impairment and inter-working simulation functions for the various conditions: fer_3%.bin, dim_1%.bin, dim_1%_pls.bin.
(Footnote 8: The GNU C++ compiler (G++) and software development tools, including documentation, are available without charge from the Free Software Foundation. They can be contacted at: Free Software Foundation, 59 Temple Place, Suite 330, Boston, MA 02111-1307, USA; Voice: +1-617-542-5942; Fax: +1-617-542-2652; gnu@gnu.org; or on the World Wide Web at http://www.fsf.org.)
3.2.2.3 EVRC-B Interworking Function (IWF)
The soft
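Because each mask holds one byte of 0 or 1 per frame, the impairment rate a mask encodes can be checked directly. The sketch below is only an illustration under stated assumptions: it treats a byte value of 1 as marking an affected frame (the excerpt does not say which value is which), and `mask_rate` and `read_mask` are hypothetical helper names, not part of fersig27.

```python
def mask_rate(mask_bytes):
    """Fraction of frames flagged in a per-frame mask.

    Assumption: one byte per frame, value 1 marks an affected frame
    and value 0 an unaffected one.
    """
    if not mask_bytes:
        raise ValueError("empty mask")
    if any(b not in (0, 1) for b in mask_bytes):
        raise ValueError("mask bytes must be 0 or 1")
    return sum(mask_bytes) / len(mask_bytes)


def read_mask(path):
    """Read a whole mask file into memory as bytes."""
    with open(path, "rb") as f:
        return f.read()
```

A mask intended for the 3% FER condition would be expected to give a value close to 0.03 from `mask_rate(read_mask(path))`, where `path` is whatever mask file is under inspection.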
133. es so73/subjctv/exp*/m_pkt and so73/subjctv/exp*/m_m. Execution of the master codec software is needed only for the test encoder/master decoder combination for each experiment/condition. All codec processing shall be done digitally. Noise suppression and post-filter options shall be enabled for both the master and the test codecs. The digital format of the speech files is described in Section 3.4.4.4. The naming convention of the processed speech is as follows. For the packet files in the so73/subjctv/exp{1,3}/m_pkt directory, the p12 files are the master packet files for the s12 source file. Likewise, the p22 and p32 files are the respective packet files for the s22 and s32 source files. The pf3 files are the impaired packet files, which will be described in Section 2.4.4.3. Similarly, the directory so73/subjctv/exp{2,4}/m_pkt contains the master packet files for the respective experiments. Here, the pc10, pb20, and ps files are the master packet files for the c15, b20, and s15 source files, respectively. For the master encode/master decode directories (so73/subjctv/exp*/m_m), the naming convention of the speech files is such that the first two characters of the file name indicate the codec combination and the suffix indicates the condition numbers in Table 2.4.2.3.1-2 and Table 2.4.2.3.2-2. Naming conventions for the remaining two experiments follow accordingly.
134. et File Structure from Master Codec/Channel Error Model
Value in Packet File | Rate | Data Bits per Frame
4 (0x0004) | 1 | 171
3 (0x0003) | 1/2 |
Unused bits are set to 0. For example, in a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame in byte-swapped form, followed by ten 16-bit words containing all zero bits.
3.4.4 Fixed-Point Bit-Exact Codec for SO 73
This section describes the C simulation of the speech codec specified by [1]. The speech codec C simulation is based on finite-precision, fixed-point arithmetic operations and is recommended for use as a reference codec to verify the performance of a bit-exact EVRC-NW implementation of the fixed-point C simulation of a test codec. The bit-exact EVRC-NW codec, along with the appropriate test vectors to verify the bit-exactness performance, is included in the associated Software Distribution.
3.4.4.1 Fixed-Point Codec Program Files
This section describes the C program files which are provided in the associated software distribution for this document.
3.4.4.2 Compiling the Fixed-Point Codec Simulation
The source code for the fixed-point codec simulation has been written in C and can be compiled using any general-purpose compiler, such as the GN
135. executable. The files so3/simul/fixed/test/source/*.pcm contain the original, unprocessed speech files. The files in so3/simul/fixed/test/fixed32 contain the encoded packet files and the decoded speech files generated by the 32-bit long multiply DSP library. Likewise, files in so3/simul/fixed/test/fixed31 were processed with the 31-bit DSP library. The processed files have the following naming convention: the encoded packet files have the extension .pkt and are generated by running EvrcFix -i *.pcm -o *.pkt -e; the decoded speech files (.dec) are generated by running EvrcFix -i *.pkt -o *.dec -d. If the output files *.pkt and *.dec exactly match verify.pkt and verify.dec, respectively, then verification of the operation of the fixed-point codec is complete.
3.1.4.6 Verifying Bit-Exact Performance of the Fixed-Point Test Codec
Files in the so3/testvec directory are provided for the purpose of qualifying a test codec as bit-exact. The files in the so3/testvec directories are 16-bit PCM binary files in PC format (LSB, MSB) and obey the following file extension naming convention: source speech, *.pcm; encoder output, *.pkt; decoder output, *.dec. The so3/testvec directory is divided into the subdirectories so3/testvec/source, so3/testvec/fixed31, and so3/testvec/fixed32.
The so3/testvec/source directory contains input source files and includes original speech files as well as packet files injecte
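An exact match between an output file and its reference file means a byte-for-byte comparison. The following is a minimal sketch of such a check; `first_mismatch` is a hypothetical helper, not a tool from the software distribution, and the file names passed to it are whatever output and reference files are being compared.

```python
def first_mismatch(path_a, path_b, chunk_size=65536):
    """Return the byte offset of the first difference between two
    files, or None when they are byte-for-byte identical.

    A length difference counts as a mismatch at the offset where the
    shorter file ends.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        return offset + i
                # One chunk is a strict prefix of the other.
                return offset + min(len(a), len(b))
            if not a:
                return None  # both files ended with no difference
            offset += len(a)
```

Reporting the offset rather than a plain yes/no makes it easier to see which frame a divergence starts in, since speech files use 2 bytes per sample.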
136. execution. Executing EvrcB with no command line arguments will display a brief description of the required and optional command line arguments. The options are described below:
-i infn (required): Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see the -d option below).
-o outf (required): Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d: Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
-e: Instructs the simulation to perform only the encoding function. The output file will contain packets of compressed data. If neither the -d nor the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-M max: Sets the maximum allowable data rate to max, where max is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.2.3.3-1.
-m min: Sets the minimum allowable data rate to min, where min is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.2.3.3-1. If neither the -M nor the -m option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8.
-W target_active_speech_channel_adr: Specifies the target active speech cha
137. f the specific EVRC service option, including the noise suppression, rate determination, and post-filter components. Should the candidate EVRC differ in any of these components, the test codec shall be tested using the objective and subjective tests prescribed by this standard. That is, EVRC compliance of a test codec can be achieved by either:
• Complying with Sections 2.1.1 and 2.1.2 (SO 3), or Sections 2.2.1 and 2.2.2 (SO 68), or Sections 2.3.1 and 2.3.2 (SO 70), or Sections 2.4.1 and 2.4.2 (SO 73), and demonstrating bit-exactness according to the procedure described in Section 3.1.4 (SO 3), or Section 3.2.4 (SO 68), or Section 3.3.4 (SO 70), or Section 3.4.4 (SO 73), respectively.
• Following the objective and subjective testing procedures set forth in Sections 2.1.1 and 2.1.2 (SO 3), or Sections 2.2.1 and 2.2.2 (SO 68), or Sections 2.3.1 and 2.3.2 (SO 70), or Sections 2.4.1 and 2.4.2 (SO 73) of this standard.
With the exception of Sections 3.1.4, 3.2.4, 3.3.4, and 3.4.4, the remaining text applies only to implementations that do not satisfy the requirement for bit-exactness. Testing the codec is based on two classes of procedures: objective tests and subjective tests. In the event that the test codec fails any of the objective or subjective tests, the test codec fails the compliance test. Objective tests are based upon actual measurements from the speech codec function. Subjective tests are based on listening tests to judge overall speech quality. The minimum
138. f1p2 a27 1 5 01 f1p6 a30 f4p3 ao3 m m mip5 al3 m4p1 a29 EIEE 1 m3pi a22 a0 mpB a32 m4p6 a37 f al f2p7 a09 f3p6 a06 f2p7 a24 p2 a02 m3p7 a21 t2p2 a25 fip4 a04 Bus 3 12 2p3 a39 5 05 4 1 3 p5 a38 7 02 6 6 1 6 0 p8 a21 f3p8 a0 p3 a40 p2 al3 p3 25 m3p4 a36 f4p6 a01 m3 mi f2 m4 m2p8 a01 m4p8 a38 m2p2 a39 f2p3 all m2pl f4p4 a35 f2pl a05 3 4 01 f2p3 all a28 A 3p8 f4p5 a31 p3 a13 3p6 f i m3p2 a30 m1lp5 a08 mip5 ao3 f m3p5 al 3p8 a31 p2 a33 4p7 a40 p4 a25 p8 a32 m2p8 a02 p3 a24 p8 a10 pl al3 m4p6 a36 3 Mh A o blolalalelals oe MA CS B t m2p4 a05 f3p4 a35 3p6 a24 f2p6 al7 4p6 a07 f1p4 a31 mipi a26 mip8 a35 m2p7 a31 4 1 B f4p6 a28 f2p6 a23 f2pi a32 f2pi a27 m2p6 a06 m3pl a20 mlp6 a32 m2p8 a12 m2p2 ai15 p6 a20 flpl al7 f1p3 a22 3p4 a26 f4p4 a39 m3pl a07 m1p8 a36 m2p7 a25 m4p6 a30 f3p4 a29 f2p8 a02 f2p3 al15 f1p3 a24 f2p1 a13 f4p7 a10 m3p4 a09 m3p5 a39 m3p7 a28 ml1p8 a24 1 8 27 Panel 1 ko 4 8 3 is i7 25 7 917 35 49 2 28 20 21 22 23 24 25 26 27 28 29 30 31 32 C S0018 D v1 0 The randomization lists for each of the eight listening panels for each expe
139. Label | Operating Point | Condition | Encoder/Decoder Combinations
ference | MNRU 27dB
a06 | Reference | MNRU 33dB
a07 | Reference | MNRU 39dB
a08 | Reference | Direct
a09 | EVRC-B 9.3 kbps | Nominal, 22 dB | M-M
a10 | EVRC-B 9.3 kbps | Nominal, 22 dB | M-T
a11 | EVRC-B 9.3 kbps | Nominal, 22 dB | T-T
a12 | EVRC-B 9.3 kbps | Nominal, 22 dB | T-M
a13 | EVRC-B 5.8 kbps | Nominal, 22 dB | M-M
a14 | EVRC-B 5.8 kbps | Nominal, 22 dB | M-T
a15 | EVRC-B 5.8 kbps | Nominal, 22 dB | T-T
a16 | EVRC-B 5.8 kbps | Nominal, 22 dB | T-M
a17 | EVRC-B 4.8 kbps | Nominal, 22 dB | M-M
a18 | EVRC-B 4.8 kbps | Nominal, 22 dB | M-T
a19 | EVRC-B 4.8 kbps | Nominal, 22 dB | T-T
a20 | EVRC-B 4.8 kbps | Nominal, 22 dB | T-M
a21 | EVRC-B 9.3 kbps | Low, 32 dB, 1% d&b, 1% pls | M-M
a22 | EVRC-B 9.3 kbps | Low, 32 dB, 1% d&b, 1% pls | M-T
a23 | EVRC-B 9.3 kbps | Low, 32 dB, 1% d&b, 1% pls | T-T
a24 | EVRC-B 9.3 kbps | Low, 32 dB, 1% d&b, 1% pls | T-M
a25 | EVRC-B 5.8 kbps | Low, 32 dB, 1% d&b, 1% pls | M-M
a26 | EVRC-B 5.8 kbps | Low, 32 dB, 1% d&b, 1% pls | M-T
a27 | EVRC-B 5.8 kbps | Low, 32 dB, 1% d&b, 1% pls | T-T
a28 | EVRC-B 5.8 kbps | Low, 32 dB, 1% d&b, 1% pls | T-M
a29 | EVRC-B 9.3 kbps | High, 12 dB | M-M
a30 | EVRC-B 9.3 kbps | High, 12 dB | M-T
a31 | EVRC-B 9.3 kbps | High, 12 dB | T-T
a32 | EVRC-B 9.3 kbps | High, 12 dB | T-M
a33 | EVRC-B 5.8 kbps | High, 12 dB | M-M
a34 | EVRC-B 5.8 kbps | High, 12 dB | M-T
a35 | EVRC-B 5.8 kbps | High, 12 dB | T-T
a36 | EVRC-B 5.8
140. flowchart. An implementation may support SO 70 only for 8 kHz sample rate input/output (for example, a base station transcoder or a Media Gateway). An implementation may support SO 70 for both 16 kHz and 8 kHz sample rates (for example, a mobile station that supports wideband electro-acoustics). Further, the implementation supporting SO 70 might already have demonstrated compliance to the SO 68 Minimum Performance Spec. This means that such equipment has also demonstrated the Minimum Performance requirements for RATE_REDUC operating points 4 and 7 of SO 70, which exactly correspond to the RATE_REDUC operating points 0 and 7 of SO 68. Therefore, the main parameters in the decision tree are (a) 16 kHz support in the implementation and (b) SO 68 compliance of the test implementation. Depending on the implementation profile of the device under test, one of 4 possible Test Suites is to be used to demonstrate SO 70 compliance. These 4 test suites (Test Suites A, B, C, and D) and the individual tests comprising them are listed in Table 2.3.2-1.
Table 2.3.2-1 Test Suites for SO 70 compliance
Test Suite | Set of Experiments | Notes
A | Experiments 1, 2, and 7 | Mobile/MGW already supporting SO 68 compliance
B | Experiments 1, 2, 3, 4, 7, and 8 | Mobile/MGW NOT already supporting SO 68 compliance
C | Experiments 5, 6, and 8 | Infra/MGW already supporting SO 68 compliance
D | Experiments 3, 4, and 8 | Infra/MGW N
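The two decision parameters map onto the four suites as a small lookup. The sketch below is an illustration only: it assumes that the "Mobile/MGW" rows correspond to implementations with 16 kHz support and the "Infra/MGW" rows to 8 kHz-only implementations, and that the truncated Suite D note reads "NOT already supporting SO 68 compliance" by analogy with Suite B. The names `SUITE_EXPERIMENTS` and `select_suite` are hypothetical.

```python
# Experiments per suite, as listed in the test-suite table.
SUITE_EXPERIMENTS = {
    "A": (1, 2, 7),
    "B": (1, 2, 3, 4, 7, 8),
    "C": (5, 6, 8),
    "D": (3, 4, 8),
}


def select_suite(supports_16khz, so68_compliant):
    """Pick the SO 70 test suite from the two decision parameters.

    Assumption: 16 kHz support selects the Mobile/MGW rows (A, B);
    8 kHz-only selects the Infra/MGW rows (C, D).
    """
    if supports_16khz:
        return "A" if so68_compliant else "B"
    return "C" if so68_compliant else "D"
```

For example, an 8 kHz-only transcoder without prior SO 68 compliance would map to `select_suite(False, False)`, and the experiments to run would be `SUITE_EXPERIMENTS` for that suite.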
141. g
6.4 High-level Processing
LIST OF FIGURES
Figure 1.3-1 Test
Figure 2.1.8-1 Instructions for Listeners
Figure 2.1.11-1 MOS versus MNRU
Figure 2.2.8.1-1 Instructions for Listeners
Figure 2.2.10.1-1 MOS versus MNRU
Figure 2.2.10.2-1 P.835 Score Profiles for Reference
Figure 2.3.2-1 SO 70 Subjective test suite decision
Figure 2.3.8.1-1 Instructions for
Figure 2.3.10.1-1 Typical Plot of MOS versus
Figure 2.3.10.2-1 Typical P.835 Score Profiles for Reference Conditions
Figure 2.4.2-1 SO 73 Subjective test suite decision
Figure 2.4.8.1-1 Instructions for Listeners
Figure 2.4.10.1-1 Typical Plot of MOS versus MNRU
Figure 2.4.10.2-1 Typical P.835 Score Profiles for
142. g src c10 EVRC WB operating point O Nominal 22 dB evrc wb opO pc1 16 kHz sampling 10 dB car noise src c20 EVRC WB operating point O Nominal 22 dB evrc wb opO pc2 16 kHz sampling 20 dB car noise 5 515 EVRC WB operating point 0 Nominal 22 dB evrc wb opO ps 16 kHz sampling 15 dB street noise src b20 EVRC WB operating point 0 Nominal 22 aB evrc wb opO pb 16 kHz sampling 20 dB babble noise 3 27 C S0018 D v1 0 Table 3 3 4 5 3 SO 70 Suite A Decoder Bit exact Test Conditions Reference output operating point 0 16 kHz sampling 10 dB car noise Input Packet File Operating Point Condition speech files for bit exact compliance evrc wb opoO fer 3 22 EVRC WB Nominal 22 dB evrc wb opoO fer 396 022 operating point 0 3 FER 16 kHz sampling evrc_wb_op0 fer_1 pls_1 p22 EVRC WB Nominal 22 dB evrc wb opo operating point 0 396 FER fer 196 pls 196 022 16 kHz sampling evrc wb p12 EVRC WB High 12 dB evrc wb op0 012 operating point O 16 kHz sampling evrc wb opO dim 196 p32 EVRC WB Low 32 dB evrc wb opO dim 196 032 operating point 0 1 d amp B 16 kHz sampling evrc_wb_op0 pc1 EVRC WB Nominal 22 dB evrc_wb_op0 oc1 evrc_wb_op0 _fer_3 pc2 EVRC WB Nominal 22 dB evrc wb opO fer 396 0c2 operating point 0 20 dB car noise 16 kHz sampling fer 396 evrc wb opO ps EVRC WB Nominal 22 dB evrc wb opO0 os operating point O 16 kHz sampling 15 dB str
143. h.pcm
rda_low.pcm
The following source files are encoded packets which have been corrupted with frame erasure at different rates. They are designed to exercise the decoder's frame error handling:
vec_07_1.pkt | Encoded packet w/ 1% FER
vec_07_2.pkt | Encoded packet w/ 2% FER
vec_07_3.pkt | Encoded packet w/ 3% FER
vec_08_1.pkt | Encoded packet w/ 1% FER
vec_08_2.pkt | Encoded packet w/ 2% FER
vec_08_3.pkt | Encoded packet w/ 3% FER
vec_10_1.pkt | Encoded packet w/ 1% FER
vec_10_2.pkt | Encoded packet w/ 2% FER
vec_10_3.pkt | Encoded packet w/ 3% FER
3.1.4.6.2 Instructions for Processing Bit-Exact Test Vectors
The following table is a list of source files to be processed in DEFAULT MODE (rates 1, 3, 4 allowed) and the names of the corresponding reference files. The files are to be processed as follows:
Encode: EvrcFix -e -i file.pcm -o file.pkt
Decode: EvrcFix -d -i file.pkt -o file.dec
Table 3.1.4.6.2-1 Source and Bit-exact Default Mode Test Vector Files
PCM Source File | Encoded Packet File | Decoded Speech File
rda_high.pcm | rda_high.pkt | rda_high.dec
rda_low.pcm | rda_low.pkt | rda_low.dec
rda_mod.pcm | rda_mod.pkt | rda_mod.dec
rda_test.pcm | rda_test.pkt | rda_test.dec
vec_01.pcm | vec_01.pkt | vec_01.dec
vec_02.pcm | vec_02.pkt | vec_02.dec
vec_03.pcm | vec_03.pkt | vec_03.dec
vec_04.pcm | vec_04.pkt | vec_04.dec
vec_05.pcm | vec_05.pkt | vec_05.dec
vec_06.pcm | vec_06.pkt | vec_06.dec
PCM Source File | Encoded Packet Fi
144. he -d option below).
-o outf (required): Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d: Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
-e: Instructs the simulation to perform only the encoding function. The output file will contain packets of compressed data. If neither the -d nor the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-M max: Sets the maximum allowable data rate to max, where max is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.4.3.3-1.
-m min: Sets the minimum allowable data rate to min, where min is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.4.3.3-1. If neither the -M nor the -m option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8.
3.4.3.3 File Formats for SO 73
Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words per packet, with the low byte ordered first, followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data, packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.4.3.3-1.
Table 3.4.3.3-1 Pack
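The packet layout described for SO 73 (twelve 16-bit words per packet, low byte first, rate code in the first word) can be read with Python's `struct` module. This is a minimal sketch for illustration; `iter_packets` is a hypothetical helper and does not unpack the speech data bits themselves, whose layout is defined in [1].

```python
import struct

WORDS_PER_PACKET = 12                    # 1 rate word + 11 data words
PACKET_BYTES = WORDS_PER_PACKET * 2      # 16-bit words


def iter_packets(data):
    """Yield (rate_code, data_words) for each packet in a byte string.

    Words are decoded as little-endian unsigned 16-bit values
    (low byte first, then high byte).
    """
    if len(data) % PACKET_BYTES != 0:
        raise ValueError("packet data is not a whole number of packets")
    for offset in range(0, len(data), PACKET_BYTES):
        words = struct.unpack_from("<12H", data, offset)
        yield words[0], words[1:]
```

Reading a whole packet file with `open(path, "rb").read()` and passing the result to `iter_packets` gives one rate code per frame, which is enough to tally how often each rate was used.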
145. he 2400 bps rate, and Rate 1/8 frames use the 1200 bps rate. The allowable speech encoding frame rates for SO 73: Rate 1 frames use the 8550 bps rate, Rate 1/2 frames use the 4000 bps rate, Rate 1/4 frames use the 2000 bps rate, and Rate 1/8 frames use the 800 bps rate.
ROLR: Receive Objective Loudness Rating, a measure of receive audio sensitivity. ROLR is a frequency-weighted ratio of the line voltage input signal to a reference encoder to the acoustic output of the receiver. [17] defines the measurement of sensitivity and [18] defines the calculation of objective loudness rating.
Supra-aural Headphones: Headphones that cover but do not surround the entire ear.
Tmax: The maximum undistorted sinusoidal level that can be transmitted through the interfaces between the EVRC and the PCM-based network. This is taken to be a reference level of 3.17
1.3 Test Model for the Speech Codec
For the purposes of this standard, a speech encoder is a process that transforms a stream of binary data samples of speech into an intermediate low bit-rate parameterized representation. As mentioned elsewhere in this document, the reference method for the performance of this process is given in [1]. This process may be implemented in real time, as a software program, or otherwise at the discretion of the manufacturer. Likewise, a speech decoder is a process that transforms the intermediate low bit-rate parameterized representation of speech (given in [1]) bac
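The SO 73 bit rates translate into bits per frame once a frame duration is fixed. The sketch below assumes the 20 ms frame used by the EVRC family; that duration is not stated in this excerpt, and the names `FRAME_MS`, `SO73_RATES_BPS`, and `bits_per_frame` are illustrative only.

```python
FRAME_MS = 20  # assumed EVRC frame duration in milliseconds

# SO 73 frame rates in bits per second, as listed in the definitions.
SO73_RATES_BPS = {"1": 8550, "1/2": 4000, "1/4": 2000, "1/8": 800}


def bits_per_frame(bps, frame_ms=FRAME_MS):
    """Number of data bits carried by one frame at the given bit rate."""
    return bps * frame_ms // 1000
```

Under this assumption a Rate 1 frame carries 171 bits and a Rate 1/8 frame carries 16 bits, which agrees with the 171 data bits and the 16 data bits quoted for those rates in the packet file description.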
146. he C simulation of the speech codec specified by [1]. The master codec C simulation used for verifying the performance of a non-bit-exact EVRC-NW implementation shall be the floating-point master C simulation included in the associated Software Distribution [1a].
3.4.3.1 Compiling the Master Codec Simulation
The source code for the floating-point simulation can be compiled using the GNU G++ compiler and make utility. A G++-compatible makefile has been included in the appropriate sub-directory in [1a]. Typing make in this directory will compile and link the code and create the executable file called Evrc_nw (Evrc_nw.exe on Win32 systems), which will be placed in the same directory. The included makefile may require some user modification for a particular hardware platform and/or operating system.
3.4.3.2 Running the Master Codec Simulation
The EVRC-NW floating-point executable (Evrc_nw) files use command line arguments to receive all information regarding input and output files and various parameters used during execution. Executing Evrc_nw with no command line arguments will display a brief description of the required and optional command line arguments. The options are described below:
-i infn (required): Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see t
147. he extension for the half-rate max packets. Similarly, the directory so3/subjctv/exp2/m_pkt contains the master packet files for Experiment II. Here, the p22 files are the master packet files for the s22 source files, and the pc, pb, and ps files are the master packet files for the car, bab, and str source files, respectively. For the master encode/master decode directories (so3/subjctv/exp*/m_m), the naming convention of the speech files is such that the first two characters of the suffix indicate the codec combination and the third indicates the condition number (1 through 5). It is required that this convention be used for the other codec combinations (mt, tm, and tt) so that the supplied randomization lists (see Section 2.1.5) are valid. Two exceptions to this naming convention are the master encoder/master decoder 3% reverse link FER files, which shall be assigned the extension tm4, and the 3% forward link FER files, which shall be assigned the extension mm4.
2.1.4.4 Encoding by the Test Codec
All of the source files will be encoded by the test codec to produce encoded packet files. For ease of reference, it is recommended that directories so3/subjctv/exp1/t and so3/subjctv/exp2/t be created to deposit the test encoder output packets and that the naming conventions be made consistent with t
148. he fixed-point codec simulation has been written in C and can be compiled using any general-purpose compiler, such as the GNU G++ compiler and make utility. Two GCC-compatible makefiles have been included in the build directory. Typing make in the build directory will compile and link the code and create the executable file called Evrc_wb_fx (Evrc_wb_fx.exe on Win32 systems), which will be placed in the build directory. The included makefiles may require some user modification for a particular hardware platform and/or operating system.
3.3.4.3 Running the Fixed-Point Codec Simulation
The EVRC-WB executable files use command line arguments to receive all information regarding input and output files and various parameters used during execution. Executing Evrc_wb_fx with no command line arguments will display a brief description of the required and optional command line arguments. The options are described below:
-i infn (required): Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see the -d option below).
-o outf (required): Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d: Instructs the simulation to perform only the decoding function. The input fil
149. he host computer has the source speech data files, which it outputs to the speech encoder. The host computer simultaneously saves the speech parameter output data from the encoder. Similarly, for decoder testing, the host computer outputs speech parameters from a disk file and saves the decoder output speech data to a file. The choice of the host computer and the nature of the interfaces between the host computer and the speech codec are not subject to standardization. It is expected that the host computer would be some type of personal computer or workstation with suitable interfaces and adequate disk storage. The interfaces may be serial or parallel and will be determined by the interfaces available on the particular hardware realization of the speech codec.
Figure 3-2 shows a generic block diagram of the audio path for the subjective test using four listeners per session. The audio path is shown as a solid line; the data paths for experimental control are shown as broken lines. This figure is for explanatory purposes and does not prescribe a specific implementation.
[Figure 3-2 Subjective Testing Equipment Configuration. Blocks: Digital Speech Files; D/A Converter; Reconstruction Filter; Software Control Program; Response Terminals; Bandpass Filter; Attenuator or Amplifier; Electronic Switch; Headphones]
For the purposes of this standard, spee
150. he master codec.
2.1.4.2 Decoding by the Master/Test Codecs
The encoded packet files generated from the various encoders/conditions shall be processed through the master and test decoders. For all conditions, the signal power shall be normalized to 22 dB. The signal shall then be µ-law companded into PCM files. See Sections 3.1.2.2 and 3.1.2.3 for details on using the provided software tools for this post-processing.
2.1.4.3 Introduction of Impairments
For the 3% frame error condition (Experiment condition 4), the impaired master codec encoded packet files are provided in the so3/subjctv/exp1/m_pkt directory. Unlike other conditions, this condition uses only the test decoder and not the test encoder. The performance of the test decoder is compared to that of the master decoder using master-encoder-generated packets from two different frame error models: 3% forward FER and 3% reverse FER. The 3% forward FER packets (pf3) are used by the test decoder to generate the master encoder/test decoder combination (mt4), and the 3% reverse FER packets (pr3) are used by the test decoder to generate the master encoder/test decoder combination (tt4). The respective master decoder outputs are mm4 and tm4. To clarify the naming convention, the following four conditions are tested:
  • mm4: master encoder/master decoder, 3% forward link FER
  • tm4: master encoder/master decoder, 3% reverse link FER
  • mt4: master enco
151. ical test for multiple Test conditions against a common Reference condition is Dunnett's Test. A complete description of Dunnett's Test is contained in Appendix B. The critical value for the Dunnett's test is 2.09 (one-sided test, p < .05, 4 E/DCs, df = 93). For those test conditions where a single Test E/DC (T-T) is compared against the Reference E/DC (M-M), the appropriate statistical test is Student's t-test [footnote 4]. The critical value for the Student's t-test is 1.70 (one-sided test, p < .05, df = 31). In both the Dunnett's Test and the t-test, the MPS test is evaluated by dividing the difference between the mean score for the Test E/DC and the mean score for the Reference E/DC by the Standard Error of the Mean Difference (SE_MD), as shown in Equation 2.2.9.2-1. If the resultant Test value is less than the criterion value for the appropriate test (2.09 for Dunnett's Test, 1.70 for the t-test), then the E/DC passes the MPS test.
[Footnote 4] The appropriate t-test is a matched-groups t-test, and the SE_MD is based on the differences between individual listeners' average ratings, where the average is over talkers. Therefore, the SE_MD is based on 32 difference scores, one for each listener (df = 31).
$$\mathrm{Test} = \frac{M_{\mathrm{Ref}} - M_{\mathrm{Test}}}{SE_{MD}} \qquad (2.2.9.2\text{-}1)$$
2.2.10 Expected Results for Reference Conditions
2.2.10.1 Experiment Reference Conditions
The MNRU conditions have been included to provide a frame of reference for the Experiment MOS
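The matched-groups t-test described in this excerpt can be sketched in a few lines. This is a minimal illustration, not the standard's reference procedure: it assumes the statistic is taken as Reference mean minus Test mean (so that a Test codec scoring well below the Reference produces a large positive value), and the function names `mps_t_statistic` and `passes_mps` are hypothetical. The Dunnett's case is not covered, since its error term comes from the analysis of variance described in Appendix B.

```python
import math
import statistics

# Critical value quoted in the text: one-sided test, p < .05, df = 31.
T_CRITICAL = 1.70


def mps_t_statistic(ref_scores, test_scores):
    """Matched-groups t statistic from per-listener average ratings.

    ref_scores[i] and test_scores[i] are listener i's ratings, averaged
    over talkers, for the Reference and Test E/DC respectively.
    Assumption: the difference is taken as Reference minus Test.
    """
    if len(ref_scores) != len(test_scores) or len(ref_scores) < 2:
        raise ValueError("need paired scores from at least two listeners")
    diffs = [r - t for r, t in zip(ref_scores, test_scores)]
    mean_diff = statistics.fmean(diffs)
    # Standard error of the mean difference, from the difference scores.
    se_md = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if se_md == 0:
        # All listeners gave identical differences; avoid dividing by zero.
        return 0.0 if mean_diff == 0 else math.copysign(math.inf, mean_diff)
    return mean_diff / se_md


def passes_mps(ref_scores, test_scores, critical=T_CRITICAL):
    """True when the statistic is below the criterion value."""
    return mps_t_statistic(ref_scores, test_scores) < critical
```

With the 32 listeners required by the text, each list would hold 32 values and the default critical value applies; for any other panel size the `critical` argument would need the matching tabulated value.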
152. in [1], the algorithmic delay is given as:
  Delay Source | Delay (ms)
  Signal Preprocessing Delay | 0.0
  Filterbank Analysis | 0.8
  LPC Analysis Look-ahead | 10.0
  LPC Analysis Window | 20.0
  Highband excitation generation delay | 1.5
  Highband synthesis overlap-and-add delay | 2.0
  Filterbank Synthesis Delay | 1.1
  Total | 35.4
Therefore, the total algorithmic delay imposed by a SO 73 test codec should not exceed 35.4 milliseconds.
2.4.2 Subjective Performance Testing for SO 73
This section outlines the methodology of the subjective performance test. The purpose of this testing is to evaluate the quality of the test codec under a variety of conditions which may occur in the CDMA system. To accomplish this, suites of listening experiments have been designed to test speech codec quality under a variety of conditions depending on a number of parameters. These conditions include channel impairments, audio background noise, and different input levels. Figure 2.4.2-1 illustrates a decision tree to arrive at the suite of tests that are needed to demonstrate Minimum Performance Spec compliance of a Test implementation of SO 73 for different profiles of equipment that support SO 73.
[Figure 2.4.2-1: SO 73 Subjective test suite decision flowchart. Decision: Is 16 kHz Sampling Rate Supported? "No" leads to Run Test Suite B; the other branch leads to Run Test Suite A.]
An implementation may support SO 73 only for 8 kHz sample rate input/output, for
153. inal 22 dB, 1% d&b, 1% pls | evrc nw op1 dim 1% pls 1% o22 8k
  evrc nw op1 pc | EVRC-NW operating point 1, 8 kHz sampling, Nominal 22 dB, 15 dB car noise | evrc nw op1 oc 8k
  evrc nw op6 dim 1% pls 1% p22 | EVRC-NW operating point 6, 8 kHz sampling, Nominal 22 dB, 1% d&b, 1% pls | evrc nw op6 dim 1% pls 1% o22 8k
  evrc nw op6 pc | EVRC-NW operating point 6, 8 kHz sampling, Nominal 22 dB, 15 dB car noise | evrc nw op6 oc 8k
  evrc nw op7 pc | EVRC-NW operating point 7, 8 kHz sampling, Nominal 22 dB, 15 dB car noise | evrc nw op7 oc 8k
  evrc nw op1 ps | EVRC-NW operating point 1, 8 kHz sampling, Nominal 22 dB, 15 dB street noise | evrc nw op1 os 8k
  evrc nw op1 fer 2% pb | EVRC-NW operating point 1, 8 kHz sampling, Nominal 22 dB, 15 dB babble noise | evrc nw op1 fer 2% ob 8k
  evrc nw op6 ps | EVRC-NW operating point 6, 8 kHz sampling, Nominal 22 dB, 15 dB street noise | evrc nw op6 os 8k
  evrc nw op6 fer 2% pb | EVRC-NW operating point 6, 8 kHz sampling, Nominal 22 dB, 15 dB babble noise | evrc nw op6 fer 2% ob 8k
4 CONTENTS OF SOFTWARE DISTRIBUTION
The source code for the master codec, fixed-point bit-exact codec, and software tools, as well as the material needed to perform the objective and subjective tests described
154. ing requirements for electro-acoustic performance, measured between the output of the D/A converter and the output of the headphone:
  1. Frequency response shall be flat to within 2 dB between 50 Hz and 7000 Hz, and below 50 Hz the response shall roll off at a minimum of 12 dB per octave. Equalization may be used in the audio path to achieve this. A suitable reconstruction filter shall be used for playback.
  2. Total harmonic distortion shall be less than 1% for signals between 50 Hz and 8000 Hz.
  3. Noise over the audio path shall be less than 35 dBA, measured at the ear reference plane of the headphone.
  4. Signal shall be delivered to the headphone on the listener's preferred telephone listening ear, and the other ear shall be uncovered. No signal shall be delivered to the other headphone.
The audio path for narrowband test conditions (Experiments 3, 4, 5 and 6) must meet the following requirements for electro-acoustic performance, measured between the output of the D/A converter and the output of the headphone:
  1. Frequency response shall be flat to within 2 dB between 200 Hz and 3400 Hz, and below 200 Hz the response shall roll off at a minimum of 12 dB per octave. Equalization may be used in the audio path to achieve this. A suitable reconstruction filter shall be used for playback.
  2. Total harmonic distortion shall be less than 1% for signals between 100 Hz and 4000 Hz.
  3. Noise over the audio path shall be less than 35 dBA, measured at
155. is given as:
  Delay Source | Delay (ms)
  Signal Preprocessing Delay | 0.0
  Filterbank Analysis | 0.8
  LPC Analysis Look-ahead | 10.0
  LPC Analysis Window | 20.0
  Highband excitation generation delay | 1.5
  Highband synthesis overlap-and-add delay | 2.0
  Filterbank Synthesis Delay | 1.1
  Total | 35.4
Therefore, the total algorithmic delay imposed by a SO 70 test codec should not exceed 35.4 milliseconds.
2.3.2 Subjective Performance Testing for SO 70
This section outlines the methodology of the subjective performance test. The purpose of this testing is to evaluate the quality of the test codec under a variety of conditions which may occur in the CDMA system. To accomplish this, suites of listening experiments have been designed to test speech codec quality under a variety of conditions depending on a number of parameters. These conditions include channel impairments, audio background noise, and different input levels. Figure 2.3.2-1 illustrates a decision tree to arrive at the suite of tests that are needed to demonstrate Minimum Performance Spec compliance of a Test implementation of SO 70 for different profiles of equipment that support SO 70.
[Figure 2.3.2-1: SO 70 Subjective test suite decision flowchart. Decisions: Is 16 kHz Sampling Rate Supported? (Yes/No); SO 68 Compliant? (Yes/No). Outcomes: Run Test Suite A, Run Test Suite B, Run Test Suite C, Run Test Suite D.]
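The delay budget in the table above is a plain sum, and a short check makes the arithmetic explicit. The sketch below is illustrative only: the dictionary keys paraphrase the table rows, the helper names are hypothetical, and the values are held in tenths of a millisecond so that the total is computed with integers rather than accumulated floating-point values.

```python
# Delay contributions from the table, in tenths of a millisecond.
DELAY_BUDGET_TENTHS_MS = {
    "signal preprocessing": 0,
    "filterbank analysis": 8,
    "LPC analysis look-ahead": 100,
    "LPC analysis window": 200,
    "highband excitation generation": 15,
    "highband synthesis overlap-and-add": 20,
    "filterbank synthesis": 11,
}


def total_delay_ms(budget=DELAY_BUDGET_TENTHS_MS):
    """Sum the contributions and convert tenths of a ms to ms."""
    return sum(budget.values()) / 10


def within_delay_limit(measured_ms, limit_ms=None):
    """True when a measured algorithmic delay does not exceed the limit."""
    if limit_ms is None:
        limit_ms = total_delay_ms()
    return measured_ms <= limit_ms
```

The total comes to 35.4 ms, matching the figure stated in the text.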
156. is not necessary to reproduce the source packet files, only the decoded speech files. The files are to be processed as follows:
  Decode: EvrcFix -d -i file.pkt -o file.dec
Table 3.1.4.6.2-4 Decoder Output Test Vector Files
  Decoded Speech File
  vec_07_1.dec
  vec_07_2.dec
  vec_07_3.dec
  vec_08_1.dec
  vec_08_2.dec
  vec_08_3.dec
  vec_10_1.dec
  vec_10_2.dec
  vec_10_3.dec
3.2 Specific Standard Test Conditions for SO 68
3.2.1 Audio Path and Calibration for SO 68
3.2.1.1 Audio Path
The audio path must meet the following requirements for electro-acoustic performance, measured between the output of the D/A converter and the output of the headphone:
  1. Frequency response shall be flat to within 2 dB between 200 Hz and 3400 Hz, and below 200 Hz the response shall roll off at a minimum of 12 dB per octave. Equalization may be used in the audio path to achieve this. A suitable reconstruction filter shall be used for playback.
  2. Total harmonic distortion shall be less than 1% for signals between 100 Hz and 4000 Hz.
  3. Noise over the audio path shall be less than 35 dBA, measured at the ear reference plane of the headphone.
  4. Signal shall be delivered to the headphone on the listener's preferred telephone listening ear, and the other ear shall be uncovered. No signal shall be delivered to the other headphone.
3.2.1.2 Calibration
The audio circuit shall deliver an average sound level of th
157. ise level of 30 dBA or below.
2.3.7 Listeners
The listener sample is intended to represent the population of telephone users with normal hearing acuity. The listeners should be naive with respect to telephony technology issues; that is, they should not be experts in telephone design, digital voice encoding algorithms, and so on. They should not be trained listeners; that is, they should not have been trained in these or previous listening studies using feedback trials. Age distribution and gender should be nominally balanced across listening panels. Each listener shall provide data only once for a particular evaluation. A listener may participate in different evaluations, but test sessions performed with the same listener should be at least two months apart so as to reduce the cumulative effects of experience.
2.3.8 Listening Test Procedures
2.3.8.1 ACR Listening Test Procedures - Experiments 1 and 5
The listeners shall listen to each sample and rate the quality of the test sample using a five-point scale, with the points labeled:
  5 Excellent
  4 Good
  3 Fair
  2 Poor
  1 Bad
Data from 32 listeners shall be used for Experiments 1, 3 and 5, four listeners for each listening panel, where each listening panel uses a different randomization. Before starting the test, the listeners should be given instructions for performing the subjective test. An example set of instructions for the ACR test is presented in Figure 2.3.8.1-1. The instructions m
158. istened to the sample, determine the category from the list below which best describes the overall quality of the sample. Press the numeric key on your keyboard corresponding to your rating for how good or bad that particular passage sounded. The quality of the speech should be rated according to the scale below:
  5 Excellent
  4 Good
  3 Fair
  2 Poor
  1 Bad
During the session you will hear samples varying in different aspects of quality. Please take into account your total impression of each sample, rather than concentrating on any particular aspect.
Figure 2.4.8.1-1 Instructions for Listeners
2.4.8.2 P.835 Listening Test Procedures - Experiments 2 and 4
Experiments 2 and 4 use the P.835 test methodology described in [13]. The P.835 methodology is specifically designed to evaluate the quality of speech in background noise. It yields a measure of Signal Quality (SIG), a measure of Background Quality (BAK), and a measure of Overall Quality (OVRL). In general, OVRL scores are highly correlated with MOS, but the OVRL score provides greater sensitivity and precision in test conditions involving background noise. While the OVRL score is of most interest here, the SIG and BAK scores also provide valuable diagnostic information. For each trial in a P.835 test, listeners are presented with three sub-samples, where each sub-sample is a single sentence (approx. 4 sec duration) processed through the same test condition. In one of the fi
159. ite A Bit-exact Test
Table 3.4.4.5-3 SO 73 Suite A Decoder Bit-exact Test Conditions
Table 3.4.4.5-4 SO 73 Encoder Suite B Bit-exact Test
Table 3.4.4.5-5 SO 73 Suite B Decoder Bit-exact Test Conditions
Table 4-1 Description of EVRC-A Software Distribution
Table 4-2 Description of EVRC-B Software Distribution
Table 4-3 Description of EVRC-WB Software Distribution Contents
Table 4-4 Description of EVRC-NW Software Distribution Contents
Table 5.1-1 Variance Source Table for the
1 INTRODUCTION
This standard details definitions, methods of measurement, verification of bit-exactness, and minimum performance characteristics of the EVRC-A, EVRC-B, EVRC-WB, and EVRC-NW enhanced variable rate speech codecs for digital cellular spread spectrum mobile stations and base stations, specified in [1]. This standard share
160. ith the least significant byte first. The packet file contains twelve 16-bit words with the low byte ordered first, followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.1.3.3-1.
Table 3.1.3.3-1 Packet File Structure From Master Codec/Channel Error Model
  Columns: Value in Packet File | Rate | Data Bits per Frame
  Legible entries: 4 (0x0004), 171 data bits; 15 (0x000f), Full Rate Probable; 14, 0 data bits
  Unused bits are set to 0.
For example, for a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame (in byte-swapped form), followed by ten 16-bit words containing all zero bits.
3.1.3.4 Verifying Proper Operation of the Master Codec
Files are provided for the purpose of verifying the fixed-point codec executable. Three files, mstr_ref.pcm, mstr_ref.pkt, and mstr_ref.dec, are included in the directory master/test to provide a means for verifying proper operation of the master codec software. The file mstr_ref.pcm is an unprocessed speech file. The file mstr_ref.pkt is a packet file that was obtained by running
  EvrcFlt -i mstr_ref.pcm -o mstr_ref.pkt -e
The file mstr_ref.dec is a decoded speech file that was obtained by running
  EvrcFlt -i mstr_ref.pkt -o mstr_ref.dec -d
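The packet layout described in this excerpt (twelve 16-bit words per frame, low byte first, rate word followed by eleven data words) can be read with the standard `struct` module. The sketch below is an illustration under those stated assumptions only; the function name `parse_packet_frames` is hypothetical, and the rate codes are returned as raw integers because the excerpt shows only part of the rate table.

```python
import struct

WORDS_PER_FRAME = 12                  # one rate word + eleven data words
FRAME_BYTES = WORDS_PER_FRAME * 2     # 16-bit words, so 24 bytes per frame


def parse_packet_frames(raw):
    """Split raw packet-file bytes into (rate_code, data_words) tuples.

    Every word is read as an unsigned 16-bit little-endian value,
    matching the "low byte first" ordering described in the text.
    """
    if len(raw) % FRAME_BYTES != 0:
        raise ValueError("packet data is not a whole number of 24-byte frames")
    frames = []
    for offset in range(0, len(raw), FRAME_BYTES):
        words = struct.unpack_from("<12H", raw, offset)
        frames.append((words[0], list(words[1:])))
    return frames


# The Rate 1/8 example from the text: the rate word 0x0001 is stored as
# bytes 01 00, followed by one data word and ten all-zero words.
# The data word value 0x1234 is an arbitrary placeholder.
example = bytes([0x01, 0x00]) + bytes([0x34, 0x12]) + bytes(20)
frames = parse_packet_frames(example)
```

Reading a real file would amount to `parse_packet_frames(open(path, "rb").read())`, after which the rate codes can be tallied or checked against the rate table in [1].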
161. k into a stream of binary data samples suitable for input to a digital-to-analog converter followed by an electro-acoustic transducer. The test model compares the output streams of the test encoder and/or decoder to those of a master encoder or decoder when driven by the same input stream. Figure 1.3-1 shows how the various combinations of outputs are generated. Various test conditions will dictate the specific source material and the functions of the gain blocks, the frame error model block, and the external rate control.
The input stream for an encoder is a sequence of 16-bit linear binary 2's complement samples of speech source material. The speech can be clean (no background noise) or can have background noise added, depending on the condition being tested. The source is passed through the gain block, which can amplify or attenuate the signal depending on the condition being tested. This signal is then processed by both the master and test encoders, with the ability to control the maximum packet rate externally. The output of the test encoder for a given rate must conform to the packet file formats specified in [1]. The master encoded speech packets can be presented to a frame error model, which simulates packet loss over a CDMA air interface. The potentially corrupted encoded speech packets from the master and test encoders are then used as inputs to each of the master and test decoders for
162. l 22 dB | M-M
  Label | Operating Point | Condition | Enc-Dec Connection
  b02 | EVRC-A | Clean, Nominal 22 dB | M-T
  b03 | EVRC-A | Clean, Nominal 22 dB | T-M
      |        | Clean, Nominal 22 dB | T-T
  b05 | IS-96-C | Clean, Nominal 22 dB | R-R
  b06 | EVRC-A | Car Noise, IRS at 15 dB S/N | M-M
  b07 | EVRC-A | Car Noise, IRS at 15 dB S/N | M-T
  b08 | EVRC-A | Car Noise, IRS at 15 dB S/N | T-M
  b09 | EVRC-A | Car Noise, IRS at 15 dB S/N | T-T
  b10 | IS-96-C | Car Noise, IRS at 15 dB S/N | R-R
  b11 | EVRC-A | Street Noise, Flat at 12 dB S/N | M-M
  b12 | EVRC-A | Street Noise, Flat at 12 dB S/N | M-T | Car Noise, IRS at 12 dB S/N
  b13 | EVRC-A | Street Noise, Flat at 12 dB S/N | T-M | Car Noise, IRS at 12 dB S/N
  b14 | EVRC-A | Street Noise, Flat at 12 dB S/N | T-T | Car Noise, IRS at 12 dB S/N
  b15 | IS-96-C | Street Noise, Flat at 12 dB S/N | R-R | Car Noise, IRS at 12 dB S/N
  b16 | EVRC-A | Office Noise, Flat at 20 dB S/N |     | Car Noise, IRS at 15 dB S/N
  b17 | EVRC-A | Office Noise, Flat at 20 dB S/N | M-T | Car Noise, IRS at 15 dB S/N
  b18 | EVRC-A | Office Noise, Flat at 20 dB S/N | T-M | Car Noise, IRS at 15 dB S/N
  b19 | EVRC-A | Office Noise, Flat at 20 dB S/N | T-T | Car Noise, IRS at 15 dB S/N
  b20 | IS-96-C | Office Noise, Flat at 20 dB S/N | R-R | Car Noise, IRS at 15 dB S/N
  b21 | EVRC-A | Tandem, Nominal 22 dB | M-M/M-M
  b22 | EVRC-A | Tandem, Nominal 22 dB | M-M/T-T
  b23 | EVRC-A | Tandem, Nominal 22 dB | T-T/M-M
  b24
163. l Plot of MOS versus MNRU
2.4.10.2 Reference Conditions for Experiments 2 and 4
Reference conditions for P.835 tests are constructed as a combination of SNR and MNRU processing to provide degradation in overall speech quality in two dimensions: signal distortion and background noise intrusiveness. Table 2.4.2.3.2-2 shows the eight reference conditions (b01 - b08) involved in the P.835 Experiments 2 and 4. In general, results are expected for these reference conditions such that the obtained score profiles are similar to those shown in Figure 2.3.10.1-1.
[Figure 2.4.10.2-1: Typical P.835 Score Profiles for Reference Conditions. Three panels of P.835 scores (SIG and OVRL curves legible): SNR varied with MNRU = 40 dB; MNRU varied with SNR = 40 dB (car noise); MNRU and SNR varied together (10/10 dB, 20/20 dB, 30/30 dB, 40/40 dB).]
3 CODEC STANDARD TEST CONDITIONS
This section describes the conditions, equipment, and the software tools necessary for the performance of the tests of Section 2. The software tools and the speech database associ
164. l may be calibrated using a suitable artificial ear with circum-aural headphone adapter and microphone. A test file with a reference signal is included with the source speech database for the purpose of calibration. The file cos1004 290 is located in the directory so3/cal of the companion software. The calibration file contains a 22 dB 1004 Hz reference signal. The audio circuit shall be calibrated so that the test signal has a level of 16 dBPa at the ear reference plane, while maintaining compliance with Section 3.1.1.1.
3.1.2 Standard Software Test Tools for SO 3
This section describes a set of software tools useful for performing the tests specified in Section 2.1. Where possible, code is written in C [19] and has been developed and compiled using the GNU GCC C language compiler and software maintenance utilities. The tools have been verified under various representative operating systems on a number of different hardware platforms. The 3GPP2-supplied tools are all located in the so3/tools directory in the associated Software Distribution, and can be built using the GNU make utility (using static libraries and no special optimizations) by copying the contents of the so3/tools directory to a new directory on a writeable disk and typing make all in that directory. A GCC-compatible makefile has been provided for this purpose in the so3/tools directory. The makefile creates the executables avg_rate.exe, mu.exe, and sv56.exe
165. l the entire 16-bit linear range. As specified within Section 3 of [1], the master codec assumes a 16-bit integer input/output normalization.
2 CODEC MINIMUM STANDARDS
This section describes the validation procedures that shall be used to verify the quality and interoperability of an EVRC implementation. The procedures are both comprehensive and backward compatible, in that they are provided for the SO 3, SO 68, SO 70, and SO 73 implementations of EVRC. The validation procedures comprise a set of objective and subjective tests, as well as a maximum algorithmic delay recommendation. These are described in the following sections.
2.1 Performance Testing for SO 3
2.1.1 Objective Performance Testing for SO 3
The objective testing portion of this specification consists of an average data rate test and compliance with End-to-End Algorithmic Delay and Unity-gain requirements.
2.1.1.1 Average Data Rate Test
The average data rate for the test codec shall be measured using twelve benchmark files that are contained in the associated Software Distribution in the so3/objctv subdirectory. Each file exhibits a different combination of input level (12 dB, 22 dB, and 32 dB) and background
166. le | Decoded Speech File
  vec_07.pcm | vec_07.pkt | vec_07.dec
  vec_08.pcm | vec_08.pkt | vec_08.dec
  vec_09.pcm | vec_09.pkt | vec_09.dec
  vec_10.pcm | vec_10.pkt | vec_10.dec
  vec_11.pcm | vec_11.pkt | vec_11.dec
  vec_12.pcm | vec_12.pkt | vec_12.dec
  vec_13.pcm | vec_13.pkt | vec_13.dec
The following table is a list of source files to be processed in Rate 1/2 Maximum (rates 1, 3 allowed) and the names of the corresponding reference files. The files are to be processed as follows:
  Encode: EvrcFix -e -h -i file.pcm -o file_h.pkt
  Decode: EvrcFix -d -i file_h.pkt -o file_h.dec
Table 3.1.4.6.2-2 Source and Bit-exact Rate 1/2 Max Test Vector Files
  PCM Source File | Encoded Packet File | Decoded Speech File
  vec_05.pcm | vec_05_h.pkt | vec_05_h.dec
  vec_06.pcm | vec_06_h.pkt | vec_06_h.dec
  vec_08.pcm | vec_08_h.pkt | vec_08_h.dec
The following table is a list of source files to be processed in FULL RATE ONLY MODE (only rate 4 allowed) and the names of the corresponding reference files. The files are to be processed as follows:
  Encode: EvrcFix -e -l 4 -i file.pcm -o file.pkt
  Decode: EvrcFix -d -i file.pkt -o file.dec
Table 3.1.4.6.2-3 Source and Bit-exact Full Rate Only Test Vector Files
  PCM Source File | Encoded Packet File | Decoded Speech File
  shiftl.pcm | shiftl.pkt | shiftl.dec
  shiftr.pcm | shiftr.pkt | shiftr.dec
The following table is a list of source packet files to be decoded and the names of the corresponding reference files. Note that it
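A bit-exactness check of this kind reduces to running the encode and decode commands and comparing each output byte for byte with its reference file. The sketch below illustrates that flow for the Rate 1/2 Maximum case. It rests on assumptions: the option hyphens and the `_h` suffix follow the command lines as read from this excerpt, where the extraction has stripped punctuation, and `evrcfix_half_rate_commands` and `first_mismatch` are hypothetical helper names. The commands are only built, not executed.

```python
from pathlib import Path


def evrcfix_half_rate_commands(stem, exe="EvrcFix"):
    """Argument lists for the Rate 1/2 Maximum encode and decode runs.

    Assumption: the options are spelled -e, -h, -d, -i and -o.
    """
    encode = [exe, "-e", "-h", "-i", f"{stem}.pcm", "-o", f"{stem}_h.pkt"]
    decode = [exe, "-d", "-i", f"{stem}_h.pkt", "-o", f"{stem}_h.dec"]
    return encode, decode


def first_mismatch(candidate, reference):
    """Byte offset of the first difference, or None if the files are identical."""
    a = Path(candidate).read_bytes()
    b = Path(reference).read_bytes()
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    if len(a) != len(b):
        # One file is a strict prefix of the other.
        return min(len(a), len(b))
    return None
```

Each argument list could be passed to `subprocess.run(..., check=True)`, after which `first_mismatch(output, reference)` returning `None` indicates a bit-exact result. Reporting the offset rather than a bare pass/fail makes it easier to locate the first diverging frame.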
167. lt males and four adult females, and are native speakers of North American English. For the following discussion, it may be useful to refer to Table 4-2 for the composition of the Software Distribution database.
2.2.3.1 Source Speech Material for SO 68 Experiment I
The source speech material for subjective Experiment I is contained in directory so68/subjctv/exp1/source. Each file is MIRS filtered and level adjusted to 22, 12, or 32 dB. These files are named src.s22, src.s12, and src.s32, respectively. The speech database also includes samples processed through the various reference conditions in directory so68/subjctv/exp1/ref. The reference conditions are named a01 through a08 for the respective conditions given in Table 2.2.2.3.1-2.
2.2.3.2 Source Speech Material for SO 68 Experiment II
The source speech material for subjective Experiment II is contained in directory so68/subjctv/exp2/source. This directory contains the source material for the car, street, and babble noise conditions, which are named src.c15, src.s15, and src.b20, respectively. The speech database also includes samples processed through the various reference conditions in directory so68/subjctv/exp2/ref. The reference conditions are named ref.b01 through ref.b08 for the respective conditions given in Table 2.2.2.3.2-2. 2.2
168. ly | M-M
  e14 | High, Mode 0 | LB portion of Wideband mode decoder test only | M-T
  e15 | Nominal, Mode 0, 1 D&BS | LB portion of Wideband mode decoder test only | M-M
  e16 | Nominal, Mode 0, 1 D&BS | LB portion of Wideband mode decoder test only | M-T
  e17 | Nominal, Mode 0, 10 D&BS | interoperable with Mode 0 of SO 68 support | M-M
  e18 | Nominal, Mode 0, 10 D&BS | interoperable with Mode 0 of SO 68 support | M-T
  e19 | FER 2, Mode 0, 1 D&BS | LB portion of Wideband mode decoder test only | M-M
  e20 | FER 2, Mode 0, 1 D&BS | LB portion of Wideband mode decoder test only | M-T
  e21 | FER 6, Mode 0, 10 D&BS | interoperable with Mode 0 of SO 68 support | M-M
  e22 | FER 6, Mode 0, 10 D&BS | interoperable with Mode 0 of SO 68 support | M-T
  e23 | Nominal, Mode 0, 1% PLS | LB portion of Wideband mode decoder test only | M-M
  e24 | Nominal, Mode 0, 1 PLS | LB portion of Wideband mode decoder test only | M-T
2.3.2.3.6 Subjective Experiment 6 for SO 70
The Test Parameters for Listening Experiment 6 are presented in Table 2.3.2.3.6-1. The Test Conditions for Listening Experiment 6 are presented in Table 2.3.2.3.6-2.
Table 2.3.2.3.6-1 SO 70 Listening Experiment 6 Test Parameters
  Type of test | P-NSA (P.835), Narrowband
  Number of talkers | 3 males, 3 females
  Background noise | Specified test conditions
  Audio Input Level | 22 dB
  Filter characteristics | MIRS
  Reference conditions | 8 Specified reference conditions
  Test condition
169. ly noticeable; Noticeable but not intrusive; Fairly conspicuous, somewhat intrusive; Very conspicuous, very intrusive.
For the third sub-sample, listeners rate the Overall quality on a five-point rating scale with the points labeled:
  5 Excellent
  4 Good
  3 Fair
  2 Poor
  1 Bad
Data from 32 listeners shall be used for Experiment II, four listeners for each listening panel, where each listening panel uses a different randomization. Before starting the test, the listeners should be given instructions for performing the subjective test. An example set of instructions for the P.835 test is presented below. The instructions may be modified to allow for variations in laboratory data-gathering apparatus.
Instructions for P.835 Speech Rating Experiment
In this speech rating experiment, each trial will involve three sentences, and you will give a rating for each sentence. For the first sentence in each trial, you will be asked to attend only to the speech signal and rate how natural, or conversely how degraded, the speech signal sounds to you. You will use the rating scale shown in the figure below to register your ratings of the speech signal. Your task will be to choose the numbered phrase from the list below that best describes your opinion of the SPEECH SIGNAL ALONE and then enter the corresponding number on your keyboard.
Attending ONLY to the SPEECH SIGNAL, select the category which best describes the sample yo
170. m directory. Unlike other conditions, this condition uses only the test decoder and not the test encoder. For the Dim-and-Burst processing, and also the Packet Level Signaling conditions in Experiment, the processing requires inputs from a signaling file to control the maximum encoding rate. An external software utility (EvrcB_iwf in Section 3.2.2.3) is also needed to reduce the data rate of certain packets from full rate to half rate. Details of these operations are given in Section 6. The signaling file and other utilities are provided in the so68/tools directory.
2.2.4.4 Ensuring Proper Encoded Frame Packet Files
All encoded frame packet files shall be examined to ensure that the files only contain data in those file locations where data should exist for a given data rate. The examination of the encoded frame packet files should indicate the occurrence of any improper data in the files, but the examination must not alter the encoded frame packet files in any way.
2.2.4.5 Post-processing of test condition output files
In order to build the play sets to be presented to the listening panels, the output files for the various test conditions must be processed to provide the appropriate listening conditions. In addition, the concatenated output files must be partitioned into the samples representing the combination of test condition and talker. The listening conditions are provided by filtering the output files using the STL software tool filter
171. ment @ 22dB) + R(20 dB SNR car noise segment @ 22dB) + R(15 dB SNR street noise segment @ 22dB)}
For the 8 kHz input: the total average channel data rate for the test codec is then given by:
$$R_{avg} = \tfrac{1}{6}\,\{\, R(\text{ambient background segment @ 12dB}) + R(\text{ambient background segment @ 32dB}) + R(\text{ambient background segment @ 22dB}) + R(\text{20 dB SNR babble noise segment @ 22dB}) + R(\text{15 dB SNR car noise segment @ 22dB}) + R(\text{15 dB SNR street noise segment @ 22dB}) \,\}$$
The above files are to be processed with the EVRC-WB encoder at various capacity operating points (defined by the active speech average channel rate) shown in Table 2.3.1.1.1-1.
Table 2.3.1.1.1-1 Target ADR vs Capacity Operating Point
  Capacity Operating Point (active speech average channel data rate) | Target Average Channel Data Rate, kbps
  EVRC-WB RATE_REDUC = 000 | 5.64 + 1.5%
  EVRC-WB RATE_REDUC = 100 | 5.92 + 1.5%
  EVRC-WB RATE_REDUC = 111 | 3.29 + 1.5%
The above table provides the maximum allowable average channel rate (including full, half, and eighth rate) for the different operating points. These maximum allowable average channel rates were obtained by processing the 7 wide-band benchmark files for the 16 kHz case and the 6 narrow-band benchmark files for the 8 kHz case through the master floating-point software. See Section 3.3.2.1 for details on using the provided software tool that can be used to aid in making this calculation.
2.3.1.1.2 Average Data Rate Requirement for
172. mentation may support SO 70 only for 8 kHz sample rate input/output (for example, a Base station transcoder, a Media Gateway, or a mobile station). The average data rate for the test codec for this case shall be measured using six narrow-band source speech files that are contained in the so70/subjctv/exp{3,4}/source directories. Each file exhibits a different condition: power levels (12 dB, 22 dB, and 32 dB) and background noise conditions (20 dB SNR babble noise, 15 dB SNR car noise, and 15 dB SNR street noise). The input source files used in the average data rate test have an approximate voice activity factor of 0.6, and are the same input files used in the subjective portion of the experiment.
2.3.1.1.1 Average Data Rate Computation for SO 70
The average channel data rate for the test codec shall be computed for each of the benchmark files as follows:
$$R = \frac{9600\,N_1 + 4800\,N_2 + 1200\,N_8}{N}$$
where
  N_1 = number of frames encoded at Rate 1,
  N_2 = number of frames encoded at Rate 1/2,
  N_8 = number of frames encoded at Rate 1/8, and
  N = N_1 + N_2 + N_8.
For the 16 kHz input: the total average channel data rate for the test codec is then given by:
R_avg = 1/7 {R(ambient background segment @ 12dB) + R(ambient background segment @ 32dB) + R(ambient background segment @ 22dB) + R(20 dB SNR babble noise segment @ 22dB) + R(10 dB SNR car noise seg
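The per-file rate formula and the averaging over benchmark files can be sketched as follows. This is a minimal illustration under the definitions in the text (9600, 4800 and 1200 bps for Rate 1, Rate 1/2 and Rate 1/8 frames); the function names are hypothetical and the frame counts in any call are the caller's own data, not values from the standard.

```python
def average_channel_rate_bps(n_full, n_half, n_eighth):
    """Average channel data rate for one benchmark file, in bps.

    n_full, n_half and n_eighth are the numbers of frames encoded at
    Rate 1, Rate 1/2 and Rate 1/8 respectively.
    """
    total_frames = n_full + n_half + n_eighth
    if total_frames == 0:
        raise ValueError("no frames to average")
    return (9600 * n_full + 4800 * n_half + 1200 * n_eighth) / total_frames


def overall_average_rate_bps(per_file_counts):
    """Unweighted mean of the per-file rates (the 1/6 or 1/7 factor)."""
    rates = [average_channel_rate_bps(*counts) for counts in per_file_counts]
    if not rates:
        raise ValueError("no benchmark files given")
    return sum(rates) / len(rates)
```

Note that the overall figure is the mean of the per-file rates, so each benchmark file carries equal weight regardless of how many frames it contains.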
173. ments. The first column is a variable name given to each of the parameters; the second column is the description of the parameter; the third column shows the required calculation for determining the value of the parameter if it is dependent upon other parameter values; and the last two columns show the numerical value for each of the parameters for the two listening experiments. For each listening experiment, four codecs are evaluated with a differing number of conditions: three for the EVRC-B 9.3 and 6.6 kbps codecs and one for the EVRC-B 5.8 and 4.8 kbps codecs. There are eight reference conditions in both experiments.
Table 2.2.2.3.3-1 Numerical Parameters for the SO 68 Listening Experiments
[Table body is garbled in the source. Legible fragments: codecs for test conditions; codec combinations; total conditions (C1, C2, C3, C5): 40, 36; stimuli per talker; listening panels; listeners (voters) per listening panel.]
2.2.3 Speech Material for SO 68 Testing
The source speech files used for SO 68 compliance testing consist of 128 Harvard sentences, which are preprocessed to include proper level adjustment and noise mixing for use in the two subjective experiments. The talkers used in these files consist of four adu
174. ments will display a brief description of the required and optional command line arguments. The options are described below:

-i infn (required) Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see the -d option below).
-o outf (required) Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
-e Instructs the simulation to perform only the encoding function. The output file will contain packets of compressed data. If neither the -d nor the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-M max Sets the maximum allowable data rate to max, where max is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.2.3.3-1.
-m min Sets the minimum allowable data rate to min, where min is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.2.3.3-1. If neither the -M nor the -m option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8. In addition, if max ≠ min, the data rate varies between max and min using the same rate decision algorithm, where the data rate is set to max if the selected data rate is > max, and the data rate is set to min if the
175. mes use the 9600 bps rate, Rate 1/2 frames use the 4800 bps rate, Rate 1/4 frames use the 2400 bps rate, and Rate 1/8 frames use the 1200 bps rate. The allowable speech encoding frame rates for SO 3: Rate 1 frames use the 8550 bps rate, Rate 1/2 frames use the 4000 bps rate, Rate 1/4 frames are not used in Service Option 3, and Rate 1/8 frames use the 800 bps rate.

Rates for SO 68: The allowable traffic frame rates for SO 68: Rate 1 frames use the 9600 bps rate, Rate 1/2 frames use the 4800 bps rate, Rate 1/4 frames use the 2400 bps rate, and Rate 1/8 frames use the 1200 bps rate. The allowable speech encoding frame rates for SO 68: Rate 1 frames use the 8550 bps rate, Rate 1/2 frames use the 4000 bps rate, Rate 1/4 frames use the 2000 bps rate, and Rate 1/8 frames use the 800 bps rate.

Rates for SO 70: The allowable traffic frame rates for SO 70: Rate 1 frames use the 9600 bps rate, Rate 1/2 frames use the 4800 bps rate, and Rate 1/8 frames use the 1200 bps rate. The allowable speech encoding frame rates for SO 70: Rate 1 frames use the 8550 bps rate, Rate 1/2 frames use the 4000 bps rate, and Rate 1/8 frames use the 800 bps rate.

Rates for SO 73: The allowable traffic frame rates for SO 73: Rate 1 frames use the 9600 bps rate, Rate 1/2 frames use the 4800 bps rate, Rate 1/4 frames use t
176. ming four combinations of decoded outputs. The four output combinations are master encode/master decode, test encode/master decode, master encode/test decode, and test encode/test decode, or more simply M/M, T/M, M/T, and T/T, respectively. The decoded speech material is then appropriately gain-adjusted (inversely to input gain) and formatted (μ-law PCM for SO 3 and 16-bit linear PCM for SO 68, SO 70, and SO 73) to form the final outputs. The representation of output speech is the same as that for input speech material.

Figure 1.3-1 Test Model (blocks: Master Encoder, Master Decoder, Test Encoder, Test Decoder, External Rate Control, Intermediate Packet Format)

Various implementations of the encoder and decoder, especially those in hardware, may not be designed to deliver or accept a continuous data stream as previously described. It is the responsibility of the manufacturer to implement a test platform that is capable of delivering and accepting these formats in order to complete the performance tests described in the following sections. This may involve a custom hardware interface, or a fair implementation of the algorithm in software, or some other mechanism. A fair implementation in software shall yield bit-exact output with reference to any hardware implementation that it is claimed to represent. The input speech material has been precision-limited by an 8-bit quantization algorithm in which the inverse quantized linear samples fil
177. n. The output file will contain packets of compressed data. If neither the -d nor the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-M max Sets the maximum allowable data rate to max, where max is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.4.3.3-1.
-m min Sets the minimum allowable data rate to min, where min is an element of {4, 3, 2, 1}, using the codes specified in the first column of Table 3.4.3.3-1. If neither the -M nor the -m option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8. In addition, if max ≠ min, the data rate varies between max and min using the same rate decision algorithm, where the data rate is set to max if the selected data rate is > max, and the data rate is set to min if the selected data rate is < min.

3.4.4.4 File Formats

Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words with the low byte ordered first, followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.4.3.3-1. Unused bits are set to 0. For example, in a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word
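The effect of the maximum and minimum rate limits described above amounts to clamping the rate chosen by the rate decision algorithm. A minimal Python sketch follows; the function name is hypothetical, and it assumes the packet-file rate codes are ordered so that 4 is Rate 1 and 1 is Rate 1/8, which matches the Rate 1/8 example word 0x0001 in the text.

```python
def limit_rate(selected, max_rate=4, min_rate=1):
    """Clamp a selected rate code to the [min_rate, max_rate] range.

    Rate codes are assumed to follow the packet-file convention:
    4 = Rate 1 ... 1 = Rate 1/8. The defaults reproduce the case where
    neither limit is given (rate varies between Rate 1 and Rate 1/8).
    """
    if min_rate > max_rate:
        raise ValueError("min_rate must not exceed max_rate")
    if selected > max_rate:
        return max_rate      # selected rate is > max: use max
    if selected < min_rate:
        return min_rate      # selected rate is < min: use min
    return selected


# Half-rate-maximum operation: a Rate 1 decision is forced down to Rate 1/2.
print(limit_rate(4, max_rate=3))
```

The clamp is applied after the normal rate decision, so the decision algorithm itself is unchanged.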
178. nd 25 dBQ values and 9 728 8 will be included in each experiment The IS 96 C codec specified in 2 is included for all conditions as an additional codec 2 1 2 3 1 Subjective Experiment for SO 3 The Test Conditions for Listening Experiment are presented in Table 2 1 2 3 1 1 Table 2 1 2 3 1 1 SO 3 Listening Experiment Conditions Type of test MOS P 800 Number of talkers 4 males 4 females Background noise none ambient Audio Input level 22 dB except for high low input cond Fiter characteristics Reference conditions u law source 5 15 20 25 dBQ G 728 2 3 2 C S0018 D v1 0 Test conditions 1 Clean 2 High Audio Input Level 12 dB 4 3 FER forward and reverse 5 Rate 1 2 Maximum 1 2 3 Low Audio Input Level 32 dB 4 S Number of codecs 5 MM T M T T IS 96 C The Test Design for Listening Experiment are presented in Table 2 1 2 3 1 2 Table 2 1 2 3 1 2 SO 3 Listening Experiment Design Label Operating Point Condition Enc Dec Connection a01 EVRC A Clean Nominal 22 dB M M a02 EVRC A Clean Nominal 22 dB M T a03 EVRC A Clean Nominal 22 dB T M a04 EVRC A Clean Nominal 22 dB T T a05 IS 96 C Clean Nominal 22 dB R R a06 EVRC A High 12 dB M M a07 EVRC A High 12 dB M T a08 EVRC A High 12 dB T M a09 EVRC A High 12 dB T T a10 IS 96 C High 12 dB R R a11 EVRC A Low
179. nd Spread Spectrum Digital Systems, January 2010.
3GPP2 C.R0014-C v1.0, Software Distribution for Enhanced Variable Rate Codec, Speech Service Options 3, 68, and 70 for Wideband Spread Spectrum Digital Systems, September 2007.
3GPP2 C.S0009-0 v1.0, Speech Service Option Standard for Wideband Spread Spectrum Systems, December 1999.
3GPP2 C.S0018-0 v1.0, Minimum Performance Specification for the Enhanced Variable Rate Codec, Speech Service Option 3 for Spread Spectrum Digital Systems, December 1999.
ANSI S1.4-1983 (R2006), Sound Level Meters, Specification for, March 2006.
ANSI S1.4A-1985 (R2006), Sound Level Meters, Specifications for (Supplement to ANSI S1.4-1983), March 2006.
ITU-T Recommendation G.191, Software Tools for Speech and Audio Coding Standardization, September 2005.
User's Group on Software Tools, ITU-T Software Tool Library 2005 User's Manual (distributed with the software for STL2000), September 2005.
ITU-T Recommendation G.711, Pulse code modulation (PCM) of voice frequencies, November 1988.
ITU-T Recommendation G.728, Coding of speech at 16 kbit/s using low-delay code excited linear prediction, September 1992.
ITU-T Recommendation P.56, Objective Measurement of Active Speech Level, March 1993.
ITU-T Recommendation P.800, Methods for Subjective Determination of Transmission Quality, Annex B: Listening Tests, Absolute Category Rating (ACR), August 1996.
ITU-T Recommendation P.810, Modulated Noise Referenc
180. ng mask file i encoded packet file o dimmed packet file, where Evrc_wb_iwf converts full-rate frames in the input encoded packet file to half-rate frames at packet level (that is, using a simple scaling down of the packet instead of a complicated transcoding method).

3.3.2.4 P.341 Tx Filter

The software utility p341_tx.c can be compiled to yield a Tx filtering utility, p341_tx, with usage defined as:

p341_tx input_file_name output_file_name

where p341_tx is the 3GPP2 Tx filter compliant to ITU-T P.341. Figure 3.3.2.4-1 shows the frequency response of the p341_tx filter. Also shown in this figure is the response of the ITU-T P.341 STL 2000 filter implementation, as well as the transmit masks for the ITU-T P.341/P.311 and the wideband transmit response from Table 9 in the 3GPP electro-acoustics specification [21]. From this figure it can be seen that the STL 2000 filter response (in red) does not meet the frequency response of the 3GPP electro-acoustics specification, while the p341_tx filter response (in green) meets both the P.341/P.311 masks as well as the 3GPP electro-acoustics specification mask.

[Plot: ITU-T/3GPP Transmit Masks vs. Filter Responses; Magnitude (dB) against Frequency (Hz); curves: STL 2000 P.341, 3GPP2 P.341 Tx, 3GPP Handset Send, ITU-T P.311/341 Tx]

Figure 3.3.2.4-1 SO 70 ITU-T P.311/P.341 Transmit Mask
181. nnel average data rate in kbps that the EVRC-B encoder should target. For example, -W 7.5 for 7.5 kbps.

3.2.3.8 File Formats for SO 68

Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words with the low byte ordered first, followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.2.3.3-1.

Table 3.2.3.3-1 Packet File Structure From Master Codec/Channel Error Model
Value in Packet File    Data Bits per Frame
4 (0x0004)              171
...

Unused bits are set to 0. For example, in a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame (in byte-swapped form), followed by ten 16-bit words containing all zero bits.

3.2.4 Fixed-Point Bit-Exact Codec for SO 68

This section describes the C simulation of the speech codec specified by [1]. The speech codec C simulation is based on finite-precision, fixed-point arithmetic operations and is recommended to be used as a reference codec to verify the performance of a bit-exact EVRC-B implementation of the fixed-point C simulation of a test codec. The bit-exact EVRC-B codec, along with the appropriate test vectors to verif
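The packet layout described above (twelve 16-bit words, low byte first, rate word followed by 11 data words) can be read with Python's standard `struct` module. The sketch below is illustrative only: the function names are hypothetical, the data words are treated as opaque values, and the bit packing inside them is left to the tables in [1].

```python
import struct

WORDS_PER_PACKET = 12                 # 1 rate word + 11 data words
PACKET_BYTES = WORDS_PER_PACKET * 2   # 16-bit words


def parse_packet(raw):
    """Split one 24-byte packet into (rate_code, [11 data words]).

    "<12H" reads twelve unsigned 16-bit words, low byte first,
    matching the byte order described in the text.
    """
    if len(raw) != PACKET_BYTES:
        raise ValueError("a packet is exactly 24 bytes")
    words = struct.unpack("<12H", raw)
    return words[0], list(words[1:])


def build_eighth_rate_packet(data_word):
    """Build the Rate 1/8 example: rate word 1, one data word, ten zero words."""
    return struct.pack("<12H", 1, data_word, *([0] * 10))


packet = build_eighth_rate_packet(0xABCD)   # 0xABCD is an arbitrary payload
rate_code, data_words = parse_packet(packet)
print(packet[:2].hex(), rate_code, hex(data_words[0]))
```

On disk the first two bytes of this packet are 01 00, which is the "0x0100 (byte-swapped 0x0001)" word from the example in the text.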
182. ntence Pair Database and matched in overall level. There are a total of 64 original source files from 8 different talkers. While individual sentences are repeated, every sample uses a distinct sentence pairing. Talkers were chosen to have distinct voice qualities and are native speakers of North American English. For the following discussion, it may be useful to refer to Table 4-1 for the configuration of the associated Software Distribution.

2.1.3.1 Source Speech Material for Experiment I

The source speech material for subjective Experiment I is contained in directory so3 subjctv exp1 source. Each sentence is IRS filtered, gain adjusted, and μ-law companded in accordance with [7]. The talkers in subjective Experiment I consist of four adult males and four adult females. The source material for Experiment I consists of 8 sentence pairs from 8 different speakers, for a total of 64 speech files, for both of the nominal input conditions (conditions 1 and 5). These files are named s22. This directory also contains the source material for each of the high and low level input conditions, which are named s12 and s32, respectively, for a total of 3 × 64 = 192 files. The speech database also includes samples processed through the various reference conditions in direc
183. nverts full-rate frames in the input encoded packet file to half-rate frames at packet level (that is, using a simple scaling down of the packet instead of a complicated transcoding method).

3.4.2.4 P.341 Tx Filter

The software utility p341_tx.c can be compiled to yield a Tx filtering utility, p341_tx, with usage defined as:

p341_tx input_file_name output_file_name

where p341_tx is the 3GPP2 Tx filter compliant to ITU-T P.341. Figure 3.4.2.4-1 shows the frequency response of the p341_tx filter. Also shown in this figure is the response of the ITU-T P.341 STL 2000 filter implementation, as well as the transmit masks for the ITU-T P.341/P.311 and the wideband transmit response from Table 9 in the 3GPP electro-acoustics specification [21]. From this figure it can be seen that the STL 2000 filter response (in red) does not meet the frequency response of the 3GPP electro-acoustics specification, while the p341_tx filter response (in green) meets both the P.341/P.311 masks as well as the 3GPP electro-acoustics specification mask.

[Plot: ITU-T/3GPP Transmit Masks vs. Filter Responses; Magnitude (dB) against Frequency (Hz); curves: STL 2000 P.341, 3GPP2 P.341 Tx, 3GPP Handset Send, ITU-T P.311/341 Tx]

Figure 3.4.2.4-1 SO 73 ITU-T P.311/P.341 Transmit Mask and Filter Responses

3.4.3 Master Codec for SO 73

This section describes t
184. of conditions which may occur in the CDMA system. To accomplish this, two listening experiments have been designed to test speech codec quality under a variety of conditions. These conditions include channel impairments, audio background noise, and different input levels.

2.2.2.1 Definition

The codec subjective test is intended to validate the implementation of the speech codec being tested, using the master codec defined in 3.2.3 as a reference. Experiment I is based on the Absolute Category Rating (ACR) method, which yields the Mean Opinion Score (MOS) as described in [10]. Experiment II is based on ITU-T Recommendation P.835, described in [13].

2.2.2.2 Method of Measurement

The subjective test involves a listening-only assessment of the quality of the codec being tested, using the master codec as a reference. Subjects from the general population of telephone users will rate the various conditions of the test. Material supplied with this standard for use with this test includes source speech, impaired packet files from the master codec encoder, and source speech processed by various Modulated Noise Reference Unit (MNRU) conditions and other references. The basic Absolute Category Rating test procedure involves rating all conditions using a five-point scale describing the opinion of the test condition. This procedure is fully described in [10]. The P.835 test method involves rating all conditions on scales of Signal, Background, and Over
185. of the required and optional command line arguments. The options are described below:

-i infn (required) Specifies the name of the input speech file, or the name of the input packet file if only decoding is being performed (see the -d option below).
-o outf (required) Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
-e Instructs the simulation to perform only the encoding function. The output file will contain packets of compressed data. If neither the -d nor the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-f max Sets the maximum number of frames to be processed.
-h max Sets the maximum allowable data rate to max, where max is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.1.3.3-1.
-l min Sets the minimum allowable data rate to min, where min is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.1.3.3-1. If neither the -h nor the -l option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8. In addition, if max ≠ min, the data rate varies between max and min using the same rate decision algorithm, where the data rate is set to max if the selected data rate is > max, and the data rate is se
186. ontained in the so73 subjctv exp 3 4 source directories. Each file exhibits a different condition: power levels of -12 dB, -22 dB, and -32 dB, and background noise conditions of 20 dB SNR babble noise, 15 dB SNR car noise, and 15 dB SNR street noise. The input source files used in the average data rate test have an approximate voice activity factor of 0.6 and are the same input files used in the subjective portion of the experiment.

2.4.1.1.1 Average Data Rate Computation for SO 73

The average channel data rate for the test codec shall be computed for each of the benchmark files as follows:

$R = (9600 N_1 + 4800 N_2 + 2400 N_4 + 1200 N_8) / N$

where $N_1$ = number of frames encoded at Rate 1, $N_2$ = number of frames encoded at Rate 1/2, $N_4$ = number of frames encoded at Rate 1/4, $N_8$ = number of frames encoded at Rate 1/8, and $N = N_1 + N_2 + N_4 + N_8$.

For the 16 kHz input, the total average channel data rate for the test codec is then given by:

Ravg = 1/7 × { R(ambient background segment @ -12 dB) + R(ambient background segment @ -32 dB) + R(ambient background segment @ -22 dB) + R(20 dB SNR babble noise segment @ -22 dB) + R(10 dB SNR car noise segment @ -22 dB) + R(20 dB SNR car noise segment @ -22 dB) + R(15 dB SNR street noise segment @ -22 dB) }

For the 8 kHz input, the total average channel data rate for the test codec is then given by:

Ravg = 1/6 × { R(ambient background segment
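The two-step SO 73 computation above (a per-file rate, then an equal-weight mean over the benchmark files) can be sketched as follows. The function names and the frame counts in the example are hypothetical; the sketch accepts any number of files, whereas the text calls for seven files at 16 kHz and six at 8 kHz.

```python
def so73_file_rate_bps(n1, n2, n4, n8):
    """Per-file rate: R = (9600*N1 + 4800*N2 + 2400*N4 + 1200*N8) / N."""
    total = n1 + n2 + n4 + n8
    if total == 0:
        raise ValueError("file contains no frames")
    return (9600 * n1 + 4800 * n2 + 2400 * n4 + 1200 * n8) / total


def total_average_rate_bps(per_file_counts):
    """Equal-weight mean of the per-file rates (the 1/7 or 1/6 factor)."""
    rates = [so73_file_rate_bps(*counts) for counts in per_file_counts]
    if not rates:
        raise ValueError("no benchmark files given")
    return sum(rates) / len(rates)


# Two made-up files as (N1, N2, N4, N8) tuples, just to show the call shape.
print(total_average_rate_bps([(1, 1, 1, 1), (1, 0, 0, 0)]))
```

Averaging the per-file rates, rather than pooling all frames, gives every test condition the same weight regardless of file length.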
187. or the name of the input packet file if only decoding is being performed (see the -d option below).
-o outf (required) Specifies the name of the output speech file, or the name of the output packet file if only encoding is being performed (see the -e option below).
-d Instructs the simulation to perform only the decoding function. The input file must contain packets of compressed data.
-e Instructs the simulation to perform only the encoding function. The output file will contain packets of compressed data. If neither the -d nor the -e option is invoked, the coder performs both the encoding and decoding functions by default.
-M max Sets the maximum allowable data rate to max, where max is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.3.3.3-1.
-m min Sets the minimum allowable data rate to min, where min is an element of {4, 3, 1}, using the codes specified in the first column of Table 3.3.3.3-1. If neither the -M nor the -m option is invoked, the coder allows the data rate to vary between Rate 1 and Rate 1/8.

3.3.3.8 File Formats for SO 70

Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words with the low byte ordered first, followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file valu
188. ort, T/M
c25: High, Mode 0 (LB portion of Wideband mode, decoder test only)
c26: High, Mode 0 (LB portion of Wideband mode, decoder test only), M/T
c27: High, Mode 4 (interoperable with Mode 0 of SO 68 support), M/M
c28: High, Mode 4 (interoperable with Mode 0 of SO 68 support), M/T
c29: High, Mode 4 (interoperable with Mode 0 of SO 68 support), T/T
c30: High, Mode 4 (interoperable with Mode 0 of SO 68 support), T/M
c31: Mode 0, 1% D&B, 1% PLS (LB portion of Wideband mode, decoder test only), M/M
c32: Mode 0, 1% D&B, 1% PLS (LB portion of Wideband mode, decoder test only), M/T
c33: Mode 4, 1% D&B, 1% PLS (interoperable with Mode 0 of SO 68 support), M/M
c34: Mode 4, 1% D&B, 1% PLS (interoperable with Mode 0 of SO 68 support), M/T
c35: Mode 4, 1% D&B, 1% PLS (interoperable with Mode 0 of SO 68 support), T/T
c36: Mode 4, 1% D&B, 1% PLS (interoperable with Mode 0 of SO 68 support), T/M
c37: Mode 0, 3% FER (LB portion of Wideband mode, decoder test only), M/M
c38: Mode 0, 3% FER (LB portion of Wideband mode, decoder test only), M/T
c39: Mode 4, 3% FER (interoperable with Mode 0 of SO 68 support), M/M
c40: Mode 4, 3% FER (interoperable with Mode 0 of SO 68 support), M/T

2.3.2.3.4 Subjective Experiment 4 for SO 70

The Test Parameters for Listening Experiment 4 are presented in Table 2.3.2.3.4-1.

Table 2.3.2.3.4-1 SO 70 Listening Experiment 4 Test Parameters
Condition / Description: Type of test, Numb
189. panel uses a different randomization. Before starting the test, the listeners should be given instructions for performing the subjective test. An example set of instructions for the P.835 test is presented below. The instructions may be modified to allow for variations in laboratory data-gathering apparatus.

Instructions for P.835 Speech Rating Experiment

In this speech rating experiment, each trial will involve three sentences, and you will give a rating for each sentence. For the first sentence in each trial, you will be asked to attend only to the speech signal and rate how natural, or conversely how degraded, the speech signal sounds to you. You will use the rating scale shown in the figure below to register your ratings of the speech signal. Your task will be to choose the numbered phrase from the list below that best describes your opinion of the SPEECH SIGNAL ALONE and then enter the corresponding number on your keyboard.

Attending ONLY to the SPEECH SIGNAL, select the category which best describes the sample you just heard. The SPEECH SIGNAL in this sample was:
VERY NATURAL, NO DEGRADATION
FAIRLY NATURAL, LITTLE DEGRADATION
SOMEWHAT NATURAL, SOMEWHAT DEGRADED
FAIRLY UNNATURAL, FAIRLY DEGRADED
VERY UNNATURAL, VERY DEGRADED

For the second sentence in each trial, you will be asked to attend only to the background and rate how noticeable, intrusive, and/or conspicuous the background sounds to you. You
190. ps Note 9 3 kbps mode is generated using anchor operating point 0 and 5 8 kbps mode is generated using anchor operating point 2 C S0018 D v1 0 Table 3 2 4 5 2 SO 68 Decoder Bit exact Test Conditions Input Reference output Packet Operating Point Condition speech files for bit File exact compliance 9 3 p22 EVRC B 9 3 kbps Nominal 22 dB 9 3 022 5 8 p22 EVRC B 5 8 kbps Nominal 22 dB 5 8 022 4 8 p22 EVRC B 4 8 kbps Nominal 22 dB 4 8 022 9 3 p32 EVRC B 9 3 kbps Low 32 GB 1 d amp b 1 pls 9 3 032 5 8 p32 EVRC B 5 8 kbps Low 32 GB 1 d amp b 1 pls 5 8 032 9 3 p12 EVRC B 9 3 kbps High 12 dB 9 3 012 5 8 p12 EVRC B 5 8 kbps High 12 dB 5 8 012 9 3 pc EVRC B 9 3 kbps Nominal 22 dB 15 dB carnoise 9 3 0c EVRC B 5 8 kops Nominal 22 dB 15 dB carnoise 5 8 0C EVRC B 9 3 kbps Nominal 22 dB 20 dB babble 9 3 ob 5 8 EVRC B 5 8 kops Nominal 22 dB 20 dB babble 5 8 ob EVRC B 9 3 kbps Nominal 22 dB 15 dB street 9 3 0s EVRC B 5 8 kbps Nominal 22 dB 15 dB street 5 8 05 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 C S0018 D v1 0 3 8 Specific Standard Test Conditions for SO 70 3 3 1 Audio Path and Calibration for SO 70 3 3 1 1 Audio Path The audio path for wideband test conditions Experiments 1 and 2 must meet the follow
191. r delays and buffering delays in the encode/decode path. The maximum end-to-end algorithmic delay should be no greater than that of the master codec. For the master codecs defined in [1], the algorithmic delay is given as:

Delay Element                   SO 3
Signal Preprocessing Delay      3 milliseconds
LPC Analysis Look-ahead         10 milliseconds
LPC Analysis Window             20 milliseconds
Total                           33 milliseconds

Therefore, the total algorithmic delay imposed by a SO 3 test codec should not exceed 33 milliseconds.

2.1.2 Subjective Performance Testing for SO 3

This section outlines the subjective testing methodology of the subjective performance test. The purpose of this testing is to evaluate the quality of the test codec under a variety of conditions which may occur in the CDMA system. To accomplish this, two listening experiments have been designed to test speech codec quality under a variety of conditions. These conditions include channel impairments, codec tandem, audio background noise, and different input levels. In addition, half-rate maximum operation of the codec will be examined.

2.1.2.1 Definition

The codec subjective test is intended to validate the implementation of the speech codec being tested, using the master codec defined in Section 3.1.3 as a reference. The subjective tests for SO 3 are based on the Absolute Category Rating, Mean Opinion Score (MOS) test as described in [10].

2.1
192. rc wb op4 ps EVRC WB operating point 4 8 kHz sampling Nominal 22 dB 15 dB street noise evrc wb op4 os 8k evrc wb op4 fer 296 pb EVRC WB operating point 4 8 kHz sampling 3 30 Nominal 22 dB 15 dB babble noise evrc wb op4 fer 296 0b 8k C S0018 D v1 0 2 Table 3 3 4 5 6 SO 70 Encoder Suite C Bit exact Test Conditions Operating Input Fil Point Condition Reference packet files for bit exact compliance No need encoder tests 4 Table 3 3 4 5 7 SO 70 Suite C Decoder Bit exact Test Conditions Reference output speech Input Packet File Operating Point Condition files for bit exact compliance evrc wb opO dim 19o fer 296 p22 EVRC WB Nominal 22 dB evrc wb opO dim 196 fer 296 operating point 0 196 d amp b 022 8k 8 kHz sampling 296 FER evrc wb opO pls 196 p22 EVRC WB Nominal 22 dB evrc wb opO pls 196 022 8k operating point 0 196 pls 8 kHz sampling evrc wb opO0 p12 EVRC WB High 12 dB evrc wb op0 012 8k operating point 0 8 kHz sampling evrc_wb_op0 p32 EVRC WB Low 32 dB evrc wb op0 032 8k operating point 0 8 kHz sampling evrc wb opO dim 296 pc EVRC WB Nominal 22 dB evrc wb opO dim 296 0c 8k operating point 0 15 dB car noise 8 kHz sampling 2 d amp b evrc_wb_op0 pls_1 pc EVRC WB Nominal 22 dB evrc wb opO pls 196 0c 8k operating point 0 15 dB car noise 8 kHz sampling 196 pls evrc wb opO ps EVRC WB Nominal 22
193. riment are provided in so68 subjctv exp1 data play lst and so68 subjctv exp2 data play lst, respectively.

2.2.6 Presentation

Presentation of speech materials for the SO 68 codec listening tests shall be made with one side of high-fidelity supra-aural headphones, with the other ear uncovered. The speech material delivery system shall meet the requirements of Section 3.2.1.1. The listeners should be seated in a quiet room with an ambient noise level of 30 dBA or below.

2.2.7 Listeners

The listener sample is intended to represent the population of telephone users with normal hearing acuity. The listeners should be naive with respect to telephony technology issues; that is, they should not be experts in telephone design, digital voice encoding algorithms, and so on. They should not be trained listeners; that is, they should not have been trained in these or previous listening studies using feedback trials. Age distribution and gender should be nominally balanced across listening panels. Each listener shall provide data only once for a particular evaluation. A listener may participate in different evaluations, but test sessions performed with the same listener should be at least two months apart so as to reduce the cumulative effects of experience.

2.2.8 Listening Test Procedures

2.2.8.1 ACR Listening Test Procedures (Experiment I)

The listeners shall listen to each sample and rate the quality of the test sample using a five-point scale
194. rol maximum encoding rate. An external software utility (Evrc_wb_iwf in Section 3.3.2.3) is also needed to reduce the data rate of certain packets from full rate to half rate. Details of these operations are given in Section 6. The signaling file and other utilities are provided in the so70 tools directory.

2.3.4.4 Ensuring Proper Encoded Frame Packet Files

All encoded frame packet files shall be examined to ensure that the files only contain data in those file locations where data should exist for a given data rate. The examination of the encoded frame packet files should indicate the occurrence of any improper data in the files, but the examination must not alter the encoded frame packet files in any way.

2.3.4.5 Post-processing of test condition output files

In order to build the play sets to be presented to the listening panels, the output files for the various test conditions must be processed to provide the appropriate listening conditions. In addition, the concatenated output files must be partitioned into the samples representing the combination of test condition and talker. The listening conditions for Narrowband experiments are provided by filtering the output files using the STL software tool (filter) with the MIRS receive filter mask. The listening conditions for Wideband experiments are provided by mixing (STL tool oper) the output files with psophometrically filtered noise (STL tool filter, PSO filter mask) at 7
195. rst two sub-samples, listeners rate the Signal Quality on a five-point rating scale with the points labeled:
Very natural, no distortion
Fairly natural, little distortion
Somewhat natural, some distortion
Fairly unnatural, fairly distorted
Very unnatural, very distorted

For the other of the first two sub-samples, listeners rate the Background Quality on a five-point rating scale with the points labeled:
Not noticeable
Fairly noticeable
Noticeable but not intrusive
Fairly conspicuous, somewhat intrusive
Very conspicuous, very intrusive

For the third sub-sample, listeners rate the Overall quality on a five-point rating scale with the points labeled:
5 Excellent
4 Good
3 Fair
2 Poor
1 Bad

Data from 32 listeners shall be used for Experiments 2 and 4, four listeners for each listening panel, where each listening panel uses a different randomization. Before starting the test, the listeners should be given instructions for performing the subjective test. An example set of instructions for the P.835 test is presented below. The instructions may be modified to allow for variations in laboratory data-gathering apparatus.

Instructions for P.835 Speech Rating Experiment

In this speech rating experiment, each trial will involve three sentences, and you will give a rating for each sentence. For the first sentence in
196. rtion of Wideband mode, decoder test only), M/T
c47: Mode 0, Nominal, 3% FER (LB portion of Wideband mode, decoder test only), M/M
c48: Mode 0, Nominal, 3% FER (LB portion of Wideband mode, decoder test only), M/T

2.4.2.3.4 Subjective Experiment 4 for SO 73

The Test Parameters for Listening Experiment 4 are presented in Table 2.4.2.3.4-1.

Table 2.4.2.3.4-1 SO 73 Listening Experiment 4 Test Parameters
Type of test: P.835 Narrowband
Number of talkers: 3 males, 3 females
Reference conditions: 8 specified reference conditions
Test conditions: Car Noise, 15 dB SNR, Modes 0, 4, 7; Street Noise, 15 dB SNR, Modes 0, 4; Babble noise, 20 dB SNR, 2% FER, Modes 0, 4
Encoder/Decoder Combinations: 4 (M/M, M/T, T/T, T/M)

The Test Conditions for Listening Experiment 4 are presented in Table 2.4.2.3.4-2.

Table 2.4.2.3.4-2 SO 73 Listening Experiment 4 Test Conditions (Exp. 4, Narrowband P.835)
Reference Conditions (File: MNRU, SNR)
d01: MNRU 40 dB, SNR 40 dB, Reference
d02: MNRU 40 dB, SNR 20 dB, Reference
d03: MNRU 40 dB, SNR 0 dB, Reference
d04: MNRU 0 dB, SNR 40 dB, Reference
d05: MNRU 20 dB, SNR 40 dB, Reference
d06: MNRU 10 dB, SNR 10 dB, Reference
d07: MNRU 20 dB, SNR 20 dB, Reference
d08: MNRU 40 dB, SNR 30 dB, Reference
Test Conditions (File: Condition, Enc/Dec)
d09: Car 15 dB SNR, Mode 0 (LB portion of Wideband mode decoder
197. s

Figure 2.3.10.2-1 Typical P.835 Score Profiles for Reference Conditions (panels: MNRU, SNR Car Noise; levels 10, 20, 30, 40 dB)

2.4 Performance Testing for SO 73

2.4.1 Objective Performance Testing for SO 73

The objective testing portion of this specification consists of an average data rate test and compliance to End-to-End Algorithmic Delay and Unity-gain requirements.

2.4.1.1 Average Data Rate Test

An implementation may support SO 73 for 16 kHz sample rates, for example, a mobile station that supports wideband electro-acoustics. The average data rate for the test codec shall be measured using seven source speech files that are contained in the so73 subjctv exp 1 2 source directories. Each file exhibits a different condition: power levels of -12 dB, -22 dB, and -32 dB, and background noise conditions of 20 dB SNR babble noise, 10 dB SNR car noise, 20 dB SNR car noise, and 15 dB SNR street noise. The input source files used in the average data rate test have an approximate voice activity factor of 0.6 and are the same input files used in the subjective portion of the experiment.

An implementation may support SO 73 only for 8 kHz sample rate input/output, for example, a base station transcoder, a media gateway, or a mobile station. The average data rate for the test codec for this case shall be measured using six narrowband source speech files that are c
198. s

Table 2.3.2.3.6-2 SO 70 Listening Experiment 6 Test Conditions

Test conditions:
  Car Noise, 15 dB SNR, Mode 0
  Street Noise, 15 dB SNR, Mode 0
  Babble, 20 dB SNR, 2% FER, Mode 0
  Car Noise, 15 dB SNR, Mode 0, 2% d&b
  Car Noise, 15 dB SNR, Mode 0, 1% pls
Encoder/Decoder Combinations: (4) M-M, M-T, T-T, T-M

Exp. 6, Narrowband, P.835

Reference Conditions
File   MNRU                            Condition
f01    MNRU = 40 dB, SNR = 40 dB       Reference
f02    MNRU = 40 dB, SNR = 20 dB       Reference
f03    MNRU = 40 dB, SNR = 0 dB        Reference
f04    MNRU = 0 dB, SNR = 40 dB        Reference
f05    MNRU = 20 dB, SNR = 40 dB       Reference
f06    MNRU = 10 dB, SNR = 10 dB       Reference
f07    MNRU = 20 dB, SNR = 20 dB       Reference
f08    MNRU = 40 dB, SNR = 30 dB       Reference

Test Conditions
File   Condition                                                                              Enc-Dec
f09    Car, 15 dB SNR, Mode 0, LB portion of Wideband mode (decoder test only)                M-M
f10    Car, 15 dB SNR, Mode 0, LB portion of Wideband mode (decoder test only)                M-T
f11    Street, 15 dB SNR, Mode 0, LB portion of Wideband mode (decoder test only)             M-M
f12    Street, 15 dB SNR, Mode 0, LB portion of Wideband mode (decoder test only)             M-T
f13    Babble, 20 dB SNR, 2% FER, Mode 0, LB portion of Wideband mode (decoder test only)     M-M
f14    Babble, 20 dB SNR, 2% FER, Mode 0, LB portion of Wideband mode (decoder test only)     M-T
f15    Car, 20 dB SNR, 2% d&b, Mode 0, LB portion of Wideband mode (decoder test only)        M-M
f16    Car, 20 dB SNR, 2% d&b, Mode 0, LB portion of Wideband mode decoder test only
199. s corresponds to a digitally referenced input level of -3 dBov and a [7]-defined tone level of +3.17 dBm0. Nominal input speech level is defined to be approximately 22 dB below this reference tone level and is equivalent to -25 dBov or -19 dBm0. For 16-bit signed integers, a sine wave with a peak amplitude of 32768 corresponds to 0 dB according to this definition. Because a sine wave with amplitude A has an RMS value of A/√2, the level in dB of a voice-active segment of speech x(n), ..., x(n+N-1), quantized with 16-bit two's-complement linear data spanning [-32768, 32767], is given by

$$10 \log_{10}\left( \frac{2}{N} \sum_{i=n}^{n+N-1} \frac{x(i)^2}{32768^2} \right)$$

dBA: A-weighted sound pressure level expressed in decibels, obtained by the use of a metering characteristic and the weighting A specified in [4] and [5].

dBm0: Power relative to the 0 transmission level point. [7] specifies a theoretical load capacity with a full-scale sine wave to be +3.17 dBm0 for μ-law PCM coding and +3.14 dBm0 for A-law PCM coding.

dBPa: Sound level with respect to one Pascal, 20 log10 (Pressure / 1 Pa).

dB SPL: Sound Pressure Level in decibels with respect to 0.002 dynes/cm2, 20 log10 (Pressure / 0.002 dynes/cm2). dBPa is preferred.

Decoder: A device for the translation of a signal from a digital representation into an analog format. For the purpo
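The level definition above can be illustrated with a minimal Python sketch. It assumes the reading in which a sine wave of peak amplitude 32768 maps to 0 dB, so the mean square of the normalized samples is doubled before taking the logarithm. The function name `speech_level_db` and the test signal are illustrative only and are not part of the Software Distribution.

```python
import math

FULL_SCALE = 32768.0  # peak amplitude that the text maps to 0 dB for a sine wave


def speech_level_db(samples):
    """Level in dB of a block of samples, relative to a full-scale sine wave."""
    if not samples:
        raise ValueError("need at least one sample")
    mean_square = sum((s / FULL_SCALE) ** 2 for s in samples) / len(samples)
    if mean_square == 0.0:
        return float("-inf")  # digital silence has no finite level
    # A sine of peak A has mean square A^2 / 2, hence the factor of 2.
    return 10.0 * math.log10(2.0 * mean_square)


# Illustrative input: one second of a full-scale 1 kHz sine sampled at 8 kHz.
sine = [FULL_SCALE * math.sin(2.0 * math.pi * 1000.0 * k / 8000.0) for k in range(8000)]
half_sine = [s / 2.0 for s in sine]
```

Under this reading the full-scale sine evaluates to approximately 0 dB and halving the amplitude lowers the level by about 6 dB, which is a quick sanity check when calibrating input files.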
200. s the purpose of [14] and [15]. This is to ensure that a mobile station can obtain service in any cellular system that meets the compatibility requirements of [16].

This standard consists of this document and an associated software distribution. The Software Distribution contains:
- Audio source material
- Clear channel packets produced from the master codec
- Impaired channel packets produced from the master codec and degraded by a channel model simulation
- Output audio files produced from the master encoded packets decoded by the master decoder
- Calibration source material
- C/C++ language source files for the compilation of the bit-exact fixed-point codec
- C/C++ language source files for a number of software data analysis tools
- Modulated Noise Reference Unit (MNRU) reference files
- Input and output vectors for bit-exact testing

An overview of the contents and formats of the software distribution is given in Section 4 of this document.

The EVRC-A, EVRC-B, EVRC-WB, and EVRC-NW enhanced variable rate speech codecs (collectively referred to as EVRC) are intended to be used at mobile stations and at compatible base stations in the cellular service. This statement is not intended to preclude implementations in which codecs are placed at a Mobile Switching Center or elsewhere within the cellular system. Indeed, some mobile-to-mobile calls, however routed, may not require the implementation of a codec on the fixed side of the cellular system at
201. selected data rate is min.

W target active speech channel adr: Specifies the target active speech channel average data rate in bps that the EVRC-B encoder should target. For example, W 7500 for 7.5 kbps.

3.2.4.4 File Formats

Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words with the low byte ordered first followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.2.3.3-1. Unused bits are set to 0. For example, in a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame (in byte-swapped form), followed by ten 16-bit words containing all zero bits.

3.2.4.5 Verifying Bit-Exact Performance of the Fixed-Point Test Codec

Files in the so68 testvec directory are provided for the purpose of qualifying a test codec as bit-exact and conform to the file naming convention described in Section 2.2.4. The so68 testvec directory is divided into 2 subdirectories: so68 testvec source and so68 testvec fixed. The so68 testvec source directory contains input source files as well as packet files injected with frame erasures
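The packet file layout described under File Formats (twelve 16-bit words per frame, low byte first, with the first word carrying the data rate) can be sketched in Python as follows. This is an illustration under stated assumptions, not one of the tools in the Software Distribution: the function name `iter_packets` is hypothetical, and the only rate value relied on is the Rate 1/8 example quoted in the text (word value 0x0001).

```python
import struct

WORDS_PER_PACKET = 12                  # one rate word plus 11 data words
PACKET_BYTES = 2 * WORDS_PER_PACKET    # each word is 16 bits


def iter_packets(data):
    """Yield (rate_word, data_words) for each 12-word packet in a byte string."""
    if len(data) % PACKET_BYTES != 0:
        raise ValueError("packet data is not a whole number of 12-word packets")
    for offset in range(0, len(data), PACKET_BYTES):
        # "<12H": twelve unsigned 16-bit words, low byte first.
        words = struct.unpack_from("<12H", data, offset)
        yield words[0], words[1:]


# Illustrative Rate 1/8 frame: bytes 01 00 on disk decode to the word 0x0001,
# followed by one data word and ten all-zero words.
example = b"\x01\x00" + b"\x34\x12" + b"\x00" * 20
packets = list(iter_packets(example))
```

Reading with an explicit little-endian format string keeps the result independent of the host byte order, which matters when the same packet files are checked on different platforms.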
202. sentative operating systems on a number of different hardware platforms. The 3GPP2 supplied tools are all located in the so70 tools directory in the associated Software Distribution and can be built using the GNU g++ compiler. Other software tools such as scaldemo, actlev, filter, and astrip are available in [6].

3.3.2.1 Channel Model Utilities: fersig28

This utility program provides:
(a) the ability to introduce Frame Erasure channel impairment;
(b) the ability to verify use of half-rate or lesser frame rate during dim-and-burst and packet-level signaling;
(c) the ability to measure the Average Data Rate from an encoded packet file.

A log output of fersig28 provides detail on the ADR performance of the preceding encoder. In these applications the utility is invoked as in the following examples for 3% FER and 1% signaling:

fersig28 c EVRC WB e fer 3 bin infile outfile
fersig28 c EVRC WB s dim 1 e fer 3 bin infile outfile

3.3.2.2 Channel Error and Signaling Masks

These binary Frame Error Rate and Signaling masks (source level and packet level; 1 byte of either 0 or 1 per frame) are used with the fersig28 channel impairment and inter-working simulation functions for the various conditions:

fer 3 bin
dim 1 bin
dim 15 pls bin

3.3.2.3 EVRC-WB Interworking Function (IWF)

The software Evrc wb iwf cc can be compiled to yield a simulation utility, Evrc wb iwf, with usage defined as: Evrc wb iwf s signali
203. ses of this standard, a device compatible with a specific EVRC implementation.

Encoder: A device for the coding of a signal into a digital representation. For the purpose of this standard, a device compatible with a specific implementation.

FER: Frame Error Rate; equals the number of full-rate frames received in error divided by the total number of transmitted frames.

IRS: Intermediate Reference System [12].

MGW: Media Gateway.

MIRS: Modified Intermediate Reference System [12].

MNRU: Modulated Noise Reference Unit. A procedure to add speech-correlated noise to a speech signal in order to produce distortions that are subjectively similar to those produced by logarithmically companded PCM systems. The amount of noise is expressed as a signal-to-noise ratio value in dB and is usually referred to as dBQ [11].

Mobile Station: A station in the Domestic Public Cellular Radio Telecommunications Service. It is assumed that mobile stations include portable transceivers (for example, hand-held personal transceivers) and transceivers installed in vehicles.

MOS: Mean Opinion Score. The result of a subjective test based on an absolute category rating (ACR), where listeners associate a quality adjective with the speech samples to which they are listening. These subjective ratings are transferred to a numerical scale, and the arithmetic mean is the resulting MOS number [10].

Rates for SO 3: The allowable traffic frame rates for SO 3: Rate 1 fra
204. seseese 3 19 Table 3 3 3 3 1 Packet File Structure From Master Codec Channel Error Model 3 23 Table 3 3 4 5 1 Test Suites of input test vectors for SO 70 3 27 Table 3 3 4 5 2 SO 70 Encoder Suite A Bit exact Test 3 27 Table 3 3 4 5 3 SO 70 Suite A Decoder Bit exact Test Conditions 3 28 Table 3 3 4 5 4 SO 70 Encoder Suite B Bit exact Test 3 28 Table 3 3 4 5 5 SO 70 Suite B Decoder Bit exact Test Conditions 3 29 Table 3 3 4 5 6 SO 70 Encoder Suite C Bit exact Test Conditions 3 31 Table 3 3 4 5 7 SO 70 Suite C Decoder Bit exact Test Conditions 3 31 Table 3 3 4 5 8 SO 70 Encoder Suite D Bit exact Test Conditions 3 32 Table 3 3 4 5 9 SO 70 Suite D Decoder Bit exact Test Conditions 3 32 Table 3 4 3 3 1 Packet File Structure from Master Codec Channel Error Model 3 37 Table 3 4 4 5 1 Test Suites of input test vectors for SO 73 3 40 Table 3 4 4 5 2 SO 73 Encoder Su
205. t to min if the selected data rate is min. See the select rate routine in the file ratedec.c for more information.

p flag: If flag is set to 0, the post-filter is disabled. If the flag is set to 1, the post-filter is enabled. If the p option is not invoked, the post-filter is enabled during decoding.

n flag: If flag is set to 0, noise suppression is disabled. If the flag is set to 1, noise suppression is enabled. If the n option is not invoked, noise suppression is enabled during encoding.

3.1.4.4 File Formats

Files of speech contain 2's complement 16-bit samples with the least significant byte first. The packet file contains twelve 16-bit words with the low byte ordered first followed by the high byte. The first word in the packet contains the data rate, while the remaining 11 words contain the encoded speech data packed in accordance with the tables specified in [1]. The packet file value for each data rate is shown in Table 3.1.3.3-1. Unused bits are set to 0. For example, in a Rate 1/8 frame, the packet file will contain the word 0x0100 (byte-swapped 0x0001), followed by one 16-bit word containing the 16 data bits for the frame (in byte-swapped form), followed by ten 16-bit words containing all zero bits.

3.1.4.5 Verifying Proper Operation of the Fixed-Point Codec

Files are provided for the purpose of verifying the fixed point codec
206. te statistical test is Student's t-test. The critical value for the Student's t-test is 1.70 (one-sided test, p < .05, df = 31). In both the Dunnett's Test and the t-test, the MPS test is evaluated by dividing the difference between the mean score for the Test E/DC and the mean score for the Reference E/DC by the Standard Error of the Mean Difference (SE_MD), as shown in Equation 2.3.9.2-1. If the resultant Test value is less than the criterion value for the appropriate test (2.09 for Dunnett's Test, 1.70 for the t-test), then the E/DC passes the MPS test.

$$\mathrm{Test} = \frac{M_{\mathrm{Ref}} - M_{\mathrm{Test}}}{SE_{MD}} \qquad (2.3.9.2\text{-}1)$$

Footnote 5: The appropriate t-test is a matched-groups t-test, and the SE_MD is based on the differences between individual listeners' average ratings, where the average is over talkers. Therefore, the SE_MD is based on 32 difference scores, one for each listener (df = 31).

2.3.10 Expected Results for Reference Conditions

2.3.10.1 Reference Conditions for Experiments 1, 3, and 5

The MNRU conditions have been included to provide a frame of reference for Experiments 1, 3, and 5. In listening evaluations where test conditions span approximately the same range of quality, the MOS results for similar conditions should be approximately the same. Data from previous studies allows a generalization to be made concerning the expected MOS results for the MNRU reference conditions (see Figure 2.3.10.1-1). MOS scores
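The matched-groups t-test described for the MPS criterion can be sketched as follows. This is a minimal illustration, assuming one average rating per listener for each E/DC and assuming the difference is taken as Reference minus Test, which is the direction consistent with "passes if the Test value is less than the criterion". The function names are hypothetical and the score lists are made-up inputs, not results from any experiment.

```python
import math
import statistics

T_CRITICAL = 1.70  # one-sided, p < .05, df = 31, as quoted in the text


def mps_test_value(ref_scores, test_scores):
    """Mean per-listener difference (Reference - Test) divided by SE_MD."""
    if len(ref_scores) != len(test_scores) or len(ref_scores) < 2:
        raise ValueError("need matched score lists with at least two listeners")
    diffs = [r - t for r, t in zip(ref_scores, test_scores)]
    mean_diff = statistics.mean(diffs)
    # Standard error of the mean difference, from the per-listener differences.
    se_md = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if se_md == 0.0:
        return 0.0 if mean_diff == 0.0 else math.copysign(math.inf, mean_diff)
    return mean_diff / se_md


def passes_mps(ref_scores, test_scores, critical=T_CRITICAL):
    """True when the Test value falls below the criterion value."""
    return mps_test_value(ref_scores, test_scores) < critical


# Made-up per-listener averages for 32 listeners (illustration only).
reference = [4.0] * 32
close_codec = [3.9, 4.1] * 16   # scatters around the reference mean
worse_codec = [3.4, 3.6] * 16   # consistently about 0.5 below the reference
```

With 32 listeners the default criterion matches the df = 31 value quoted in the text; a different panel size would need a different critical value.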
207. test. In listening evaluations where test conditions span approximately the same range of quality, the MOS results for similar conditions should be approximately the same. Data from previous studies allows a generalization to be made concerning the expected MOS results for the MNRU reference conditions (see Figure 2.2.10.1-1). MOS scores obtained for the MNRU conditions in any SO 68 validation test should be compared to those shown in the graph below. Inconsistencies beyond a small shift in the means in either direction, or a slight stretching or compression of the scale near the extremes, may imply a problem in the execution of the evaluation test. In particular, MOS should be monotonic with MNRU within the limits of statistical resolution, and the contour of the relation should show a similar slope.

[Figure 2.2.10.1-1: MOS versus MNRU (MOS plotted against dBQ, 10 to 50)]

2.2.10.2 Experiment II Reference Conditions

Reference conditions for P.835 tests are constructed as a combination of SNR and MNRU processing to provide degradation in overall speech quality in two dimensions: signal distortion and background noise intrusiveness. Table 2.2.2.3.2-2 shows the eight reference conditions (b01 - b08) involved in the P.835 Experiment II. In general, results are expected for these reference conditions such that the obtained score profiles are similar to those shown in Figure 2.2.10.2-1.
208. the Service Option 70 (SO 70) speech codec, and the EVRC-NW is the Service Option 73 (SO 73) speech codec, all described in [1]. The procedures specified in this document for the SO 3 speech codec are fully consistent with those contained in [3].

The SO 3 speech codec is used to digitally encode the speech signal for transmission at a variable data rate of 8550 bps, 4000 bps, or 800 bps.

The SO 68 speech codec is used to digitally encode the speech signal for transmission at a variable data rate of 8550 bps, 4000 bps, 2000 bps, or 800 bps.

The SO 70 speech codec is used to digitally encode the speech signal for transmission at a variable data rate of 8550 bps, 4000 bps, or 800 bps.

The SO 73 speech codec is used to digitally encode the speech signal for transmission at a variable data rate of 8550 bps, 4000 bps, 2000 bps, or 800 bps.

Like some other speech coding standards, this standard provides a bit-exact method of verifying the test codec for minimum performance. In this optional procedure, a given set of test vectors is input to the test codec, and the output vectors from the test codec must be bit-exact with the output vectors given in the software distribution associated with this standard. If they are bit-exact, the test codec passes the minimum performance requirement and no further testing is required. The bit-exact mode of testing, however, is only applicable to codecs whose design conforms in all respects to the algorithmic description o
209. the bit exactness performance are included in the associated Software Distribution There are two options for compiling the fixed point EVRC simulation One option uses the 31 bit long multiply DSP math library and the other uses the 32 bit library A parallel set of bit exact test vectors is provided so that a CODEC may qualify as bit exact using either library 3 1 4 1 Fixed Point Codec Program Files This section describes the C program files which are provided in the directory so3 simul fixed in the companion software All of the files needed to compile run and verify the fixed point codec are located in the directory so3 simul fixed 3 1 4 2 Compiling the Fixed Point Codec Simulation The source code for the fixed point codec simulation has been written in ANSI C and can be compiled using any general purpose compiler such as the GNU GCC C compiler and make utility Refer to Section 3 3 for information regarding obtaining GCC make and relevant documentation Two GCC compatible makefiles have been included in the so3 simul fixed code and so3 simul fixed dspmath directory All of the files contained on the associated Software Distribution under the directory fixed should be copied onto a writable disk making sure to preserve the directory structure Typing make in the dspmath directory first followed by typing make in the directory code will compile and link the code and create the executable file called EvrcFix evrcfix exe
210. tion 3.1.3.3. The output of the program is, for each file referred to by the input file list, the file name, the number of packets contained in the file, and the average data rate calculated as described in Section 2.1.1.1. The average data rate utility is intended to be used on the packet files created by the test codec in response to the average rate benchmark files referred to in Section 2.1.1.1 and located in the so3 objctv directory of the associated Software Distribution. The program is invoked as follows:

avg rate filename 1 filename 2 filename 3 ... filename n

3.1.2.2 Scaling speech files: sv56.c

This program is used to scale each sample in a linearly quantized speech file by a factor that renders the file's root mean square (RMS) level equal to a user-specified value. The program is intended to be used on the test codec's speech output files to ensure that their RMS level is consistent with the requirements of Section 2.1.2.3 of this document. The source code, sv56.c, is available from [6] and [6a].

The inputs to the program are the optional desired RMS value in dB, the input speech file name, and the optional output speech file name. The outputs are the initial (prior to scaling) maximum sample, RMS, and average (DC) values in the speech file; the final (after scaling) maximum, RMS, and DC values in the output file; the number of samples that were clipped; the scale factor applied; and an output speech file appropriately scaled
211. tory so3 subjctv exp1 ref The reference conditions are named q05 through q25 for the respective MNRU conditions and 728 for the G 728 reference The samples processed by the IS 96 C codec for each of the five conditions are named qc1 through qc5 respectively and qc4 is replaced with qf3 and qr3 corresponding to the IS 96 C codec 3 forward and reverse FER respectively also reside here 2 1 3 2 Source Speech Material for Experiment II The source speech material for subjective Experiment is contained in directory so3 subjctv exp2 source Each sentence is flat filtered and law companded in accordance with 7 The talkers in subjective Experiment Il consist of four adult males and four adult females The clean source material for Experiment Il conditions 1 and 5 consists of 8 sentence pairs from 8 different speakers for a total of 64 speech files These files are named s22 This directory also contains the source material for the car street and babble noise conditions which are named car str and bab respectively for a total of 4 x 64 256 files The speech database also includes samples processed through the various reference conditions in directory so3 subjctv exp2 ref The reference conditions are named q05 through q25 for the respective MNRU conditions and 728 for the G 728 reference The samples processed by the IS 96 C codec for each of the five conditions named qc1 through qc5 respectivel
212. ts 2 77 Table 2 4 4 5 1 Cutting Points for the astrip Software Tool for the SO 73 Experiments 1 and 3 ACR Testo o aa ce datae afe date adt ates 2 80 Table 2 4 4 5 2 Cutting Points for the astrip Software Tool for the SO 73 Experiments 2 and 4 P 835 NM PEE 2 81 Table 2 4 4 5 3 Composition of the Sentence Triad Samples for the Experiments 2 and 4 P 835 Test MIN ETUR 2 81 Table 2 4 5 1 Example Randomization for the Experiments 1 and 3 ACR 2 82 Xi 20 21 22 23 24 25 26 27 28 29 30 31 C S0018 D v1 0 Table 3 1 3 3 1 Packet File Structure From Master Codec Channel Error Model 3 6 Table 3 1 4 6 2 1 Source and Bit exact Default Mode Test Vector 3 11 Table 3 1 4 6 2 2 Source and Bit exact Rate 1 2 Max Test Vector 3 12 Table 3 1 4 6 2 3 Source and Bit exact Full Rate Only Test Vector 3 12 Table 3 1 4 6 2 4 Decoder Output Test Vector 3 13 Table 3 2 3 3 1 Packet File Structure From Master Codec Channel Error Model 3 16 Table 3 2 4 5 1 SO 68 Encoder Bit exact Test Conditions sse 3 18 Table 3 2 4 5 2 SO 68 Decoder Bit exact Test Conditions ssssss
213. u just heard the SPEECH SIGNAL in this sample was VERY NATURAL NO DEGRADATION FAIRLY NATURAL LITTLE DEGRADATION SOMEWHAT NATURAL SOMEWHAT DEGRADED FAIRLY UNNATURAL FAIRLY DEGRADED VERY UNNATURAL VERY DEGRADED For the second sentence in each trial you will be asked to attend only to the background and rate how noticeable intrusive and or conspicuous the background sounds to you You will use the rating scale shown in the figure below to register your ratings of the background Your task will be to choose the numbered phrase from the list below that best describes your opinion of the BACKGROUND ALONE and then enter the corresponding number on your keyboard Attending ONLY to the BACKGROUND select the category which best describes the sample you just heard the BACKGROUND in this sample was 5 NOT NOTICEABLE SOMEWHAT NOTICEABLE NOTICEABLE BUT NOT INTRUSIVE FAIRLY CONSPICUOUS SOMEWHAT INTRUSIVE VERY CONSPICUOUS VERY INTRUSIVE For the third and final sentence in each trial you will be asked to attend to the entire sample both the speech signal and the background and rate your opinion of the sample for purposes of everyday speech communication Select the category which best describes the sample you just heard for purposes of everyday speech communication the OVERALL SPEECH SAMPLE was 5 EXCELLENT GOOD FAIR POOR BAD 2 32 20 21 22 23 24 25 26 27 28 29 30 C S0018
214. ubjective test is intended to validate the implementation of the speech codec being tested using the master codec defined in 3.4.3 as a reference. Experiments 1 and 3 are based on the Absolute Category Rating (ACR) method, which yields the Mean Opinion Score (MOS) as described in [10]. Experiments 2 and 4 are based on ITU-T Recommendation P.835, described in [13].

2.4.2.2 Method of Measurement

The subjective tests involve a listening-only assessment of the quality of the codec being tested, using the master codec as a reference. Subjects from the general population of telephone users will rate the various conditions of the test. Material supplied with this standard for use with this test includes source speech, impaired packet files from the master codec encoder, and source speech processed by various Modulated Noise Reference Unit (MNRU) conditions and other references. The basic Absolute Category Rating test procedure involves rating all conditions using a five-point scale describing the opinion of the test condition. This procedure is fully described in [10]. The P.835 test method involves rating all conditions on scales of Signal, Background, and Overall quality and is fully described in [13].

2.4.2.3 Test Conditions and Test Design for SO 73

Listening experiments 1 and 3 for SO 73 are performed as ACR listening tests. Experiments 2 and 4 for SO 73 are performed as P.835 listening tests.

2.4.2.3.1 Subjective Experiment 1 for SO
215. ular aspect

Figure 2.2.8.1-1 Instructions for Listeners

2.2.8.2 P.835 Listening Test Procedures: Experiment II

Experiment II uses the P.835 test methodology described in ITU-T P.835 [13]. The P.835 methodology is specifically designed to evaluate the quality of speech in background noise. It yields a measure of Signal Quality (SIG), a measure of Background Quality (BAK), and a measure of Overall Quality (OVRL). In general, OVRL scores are highly correlated with MOS, but the OVRL score provides greater sensitivity and precision in test conditions involving background noise. While the OVRL score is of most interest here, the SIG and BAK scores also provide valuable diagnostic information.

For each trial in a P.835 test, listeners are presented with three sub-samples, where each sub-sample is a single sentence (approx. 4 sec duration) processed through the same test condition. In one of the first two sub-samples, listeners rate the Signal Quality on a five-point rating scale with the points labeled:

Very natural, no distortion
Fairly natural, little distortion
Somewhat natural, some distortion
Fairly unnatural, fairly distorted
Very unnatural, very distorted

For the other of the first two sub-samples, listeners rate the Background Quality on a five-point rating scale with the points labeled:

Not noticeable
Fair
216. umber of samples as there are test conditions involved in the test A test session consists of the same number of blocks as there are talkers involved in the test Each session is presented to a listening panel of four listeners 3 Randomizations are constructed such that talker gender is alternated on successive trials resulting in the same talker never being presented on consecutive trials Table 2 2 5 1 shows an example randomization for a single listening panel Each entry in the table is the file name for a sample with the following file naming convention xxyy zzz where xx is the talker yy is the sample and zzz is the test condition Table 2 2 5 1 Example Randomization for the Experiment ACR Test Blk 1 f2p8 a06 m3p8 a03 f2p7 a22 p6 a09 p2 a07 m3p8 a19 p6 al6 p8 a34 p8 a39 1 28 p3 ao5 pl al2 p3 a37 m4p6 a20 f3p2 a23 m3p4 a27 f2p6 a30 m2p2 a26 p7 a29 p8 al pl a2 m4p1 a04 2p2 a3 p3 al p3 al1 p4 a0 3p7 al m3p3 a3 Panel 1 f3 e f2p3 a36 3p4 a34 f1p2 a20 f1p4 a39 4p8 a05 m3p2 a17 m3p8 a04 m4pl a31 m3p7 a18 m2p2 a36 2p4 a04 2 7 10 2p6 a07 f1p6 a08 2 14 3p7 a09 mja m v Foliis t E 2 4 4 2 m2p8 a40 m2p3 a38 m2pl a33 m3p3 a05 f4p6 a22 f1p4 a33 f3p8 a08 flp6 a40 f2p2 a29 f2p4 a08 f2p7 a19 1 f4 1 2 f4 4 m m3p7 a33 m3p8 a23 m2p3 a19 m2p5 a21 m3p2 a26 1 19 mip3 a30 3p7 a21 f4p2 a04 4 2 09 f1p2 a36 4p4 a02 p8 a18 m2p1 a09
217. und noise conditions: 20 dB SNR babble noise condition, 15 dB SNR car noise condition, and 15 dB SNR street noise. The input source files used in the average data rate test have an approximate voice activity factor of 0.78 and are the same input files used in the subjective portion of the experiment.

2.2.1.1.1 Average Data Rate Computation for SO 68

The average channel data rate for the test codec shall be computed for each of the benchmark files as follows:

$$R = \frac{9600\,N_1 + 4800\,N_2 + 2400\,N_4 + 1200\,N_8}{N}$$

where
N_1 = number of frames encoded at Rate 1,
N_2 = number of frames encoded at Rate 1/2,
N_4 = number of frames encoded at Rate 1/4,
N_8 = number of frames encoded at Rate 1/8, and
N = N_1 + N_2 + N_4 + N_8.

The total average channel data rate for the test codec is then given by:

R_avg = 1/6 * { R(ambient background segment @ -12 dB)
              + R(ambient background segment @ -32 dB)
              + R(ambient background segment @ -22 dB)
              + R(20 dB SNR babble noise segment @ -22 dB)
              + R(15 dB SNR car noise segment @ -22 dB)
              + R(15 dB SNR street noise segment @ -22 dB) }

The above files are to be processed with the EVRC-B encoder at various capacity operating points (defined by the active speech average channel rate) shown in Table 2.2.1.1.1-1.

Table 2.2.1.1.1-1 Target ADR vs Capacity Operating Point
Columns: Capacity Operating Point (active speech average channel data rate); Target Average Channel Data Rate, kbps
EVRC B 9 3k bits sec 6 93 1 5 EVRC
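The average data rate computation for SO 68 can be sketched in a few lines of Python. This is an illustration of the arithmetic only, not the avg rate utility from the Software Distribution: the function names are hypothetical, the frame counts are made-up inputs, and N is assumed to be the sum of the four per-rate frame counts.

```python
# Channel rates in bps for Rate 1, 1/2, 1/4, and 1/8 frames, as used in the formula.
RATE_1_BPS = 9600
RATE_HALF_BPS = 4800
RATE_QUARTER_BPS = 2400
RATE_EIGHTH_BPS = 1200


def file_average_rate(n1, n2, n4, n8):
    """Average channel data rate (bps) for one benchmark file."""
    total_frames = n1 + n2 + n4 + n8
    if total_frames == 0:
        raise ValueError("file contains no frames")
    weighted = (RATE_1_BPS * n1 + RATE_HALF_BPS * n2
                + RATE_QUARTER_BPS * n4 + RATE_EIGHTH_BPS * n8)
    return weighted / total_frames


def total_average_rate(per_file_rates):
    """Unweighted mean of the per-file rates (six files in the SO 68 test)."""
    rates = list(per_file_rates)
    if not rates:
        raise ValueError("need at least one per-file rate")
    return sum(rates) / len(rates)


# Made-up frame counts: half of the frames at Rate 1, half at Rate 1/8.
example_rate = file_average_rate(50, 0, 0, 50)
```

Each file contributes equally to the total regardless of its length, which is why the per-file rates are averaged rather than pooling all frame counts.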
218. ware EvrcB_iwf.cc can be compiled to yield a simulation utility, EvrcB iwf, with usage defined as:

EvrcB iwf s signaling mask file i encoded packet file o dimmed packet file

where EvrcB iwf converts full-rate frames in the input encoded packet file to half-rate frames at packet level, that is, using a simple scaling down of the packet instead of a complicated transcoding method.

3.2.3 Master Codec for SO 68

This section describes the C simulation of the speech codec specified by [1]. The master codec C simulation used for verifying the performance of a non-bit-exact EVRC-B implementation shall be the floating-point master C simulation included in the associated Software Distribution [1a].

3.2.3.1 Compiling the Master Codec Simulation

The source code for the floating-point simulation can be compiled using the GNU G++ compiler and make utility. A G++ compatible makefile has been included in the appropriate sub-directory in [1a]. Typing make in this directory will compile and link the code and create the executable file called EvrcB (EvrcB.exe on Win32 systems), which will be placed in the same directory. The included makefile may require some user modification for a particular hardware platform and/or operating system.

3.2.3.2 Running the Master Codec Simulation

The EVRC-B floating-point executable (EvrcB) files use command line arguments to receive all information regarding input and output files and various parameters used during
219. will use the rating scale shown in the figure below to register your ratings of the background Your task will be to choose the numbered phrase from the list below that best describes your opinion of the BACKGROUND ALONE and then enter the corresponding number on your keyboard Attending ONLY to the BACKGROUND select the category which best describes the sample you just heard the BACKGROUND in this sample was 5 NOT NOTICEABLE SOMEWHAT NOTICEABLE NOTICEABLE BUT NOT INTRUSIVE FAIRLY CONSPICUOUS SOMEWHAT INTRUSIVE VERY CONSPICUOUS VERY INTRUSIVE For the third and final sentence in each trial you will be asked to attend to the entire sample both the speech signal and the background and rate your opinion of the sample for purposes of everyday speech communication Select the category which best describes the sample you just heard for purposes of everyday speech communication the OVERALL SPEECH SAMPLE was 5 EXCELLENT GOOD FAIR POOR BAD 2 61 20 21 22 23 24 25 26 27 28 C S0018 D v1 0 2 3 9 Analysis of Results The response data from the practice blocks shall be discarded Data sets with missing responses from listeners shall not be used i e a complete set of data is required for 32 listeners four for each of eight listening panels Responses from the different listening panels for the corresponding test conditions shall be treated as equivalent in the analysis 2 3 9 1 Basi
220. y also reside here 2 1 4 Processing of Speech Material for SO Testing The source speech material shall be processed by the various combinations of encoders and decoders listed in the descriptions of the two experiments given in Section 2 1 2 The master codec software described in Section 3 1 3 shall be used in the processing involving the master codec Generally the master codec encoder and decoder outputs have been provided in the respective so3 subjctv exp m pkt and so3 subjctv exp m m directories Execution of the master codec software is generally needed only for the test encoder master decoder combination for each experiment condition The exception to this is the tandem condition in Experiment 11 where double codec processing is required see Section 2 1 4 4 All codec processing shall be done digitally Noise suppression and post filter options shall be enabled for both the master and the test codecs The digital format of the speech files is described in Section 3 1 4 4 The naming convention of the processed speech is as follows For the packet files in the so3 subjctv exp1 m pkt directory Experiment l the p12 files are the master packet files for the 512 source files Likewise the p22 and p32 files are the respective packet files for the s22 and 532 source files The pf3 and pr3 are the impaired packet files which will be described in Section 2 1 4 3 Condition five Rate 1 2 maximum it uses phr as t
221. y the bit-exactness performance are included in the associated Software Distribution.

3.2.4.1 Fixed-Point Codec Program Files

This section describes the C program files which are provided in the associated software distribution for this document. All of the files needed to compile, run, and verify the fixed-point codec are located in the directory so68 EVRCB FX.

3.2.4.2 Compiling the Fixed-Point Codec Simulation

The source code for the fixed-point codec simulation has been written in C and can be compiled using any general-purpose compiler, such as the GNU G++ compiler and make utility. Refer to Section 3.3 for information regarding obtaining GCC, make, and relevant documentation. Two GCC compatible makefiles have been included in the so68 EVRCB FX build directory. Typing make in the build directory will compile and link the code and create the executable file called EvrcB fx (EvrcB fx exe on Win32 systems), which will be placed in the build directory. The included makefiles may require some user modification for a particular hardware platform and/or operating system.

3.2.4.3 Running the Fixed-Point Codec Simulation

The EVRC-B executable files use command line arguments to receive all information regarding input and output files and various parameters used during execution. Executing EvrcB fx with no command line argu
