
Business Vocabulary Extracted from the BNC


Contents

1. likelihood ratio (LLR) identified intermediate-level technical words, and mutual information (MI) identified advanced-level technical words. These measures were effective in separating technical vocabulary from general-purpose vocabulary and provided the template for identifying business vocabulary in our current study.
2 Purpose of the Study
The goals of this study are (1) to extract a spoken and written business lexicon from the British National Corpus (BNC) using nine statistical measures, (2) to create intermediate-level business lists, and (3) to combine our previously published e-learning vocabulary teaching program for Japanese students with the business lists produced and refine the program to maximize its effectiveness.
3 Methodology
3.1 The data
3.1.1 BNC spoken & written business sub-corpora
The British National Corpus is one of the largest electronically accessible corpora, consisting of over 100 million written and spoken words in British English (Burnard, 2000; Kennedy, 2003; Leech et al., 2001). It is organized into various subject-related components or sub-corpora, such as the sciences, literature, world affairs, arts, and commerce and finance. Among them are a 1.3-million-word spoken business sub-corpus and a 7.1-million-word written business sub-corpus, and these have been selected as the base corpora for this study. The proced
2. B 0 40 2007
understood by 6th-grade students; the LLR, Chi2 and Yates words are known by 8th- to 10th-grade students; and 80% of the MI and McNemar words are known by 12th- or 13th-grade students (college freshmen). In terms of practical pedagogical application, we inferred from these data, together with several of our previous similar studies (Chujo and Utiyama, 2005; Chujo and Utiyama, 2006; Chujo, Utiyama and Oghigian, 2006) [23, 24, 25], that (1) the business words extracted by Freq, Dice, Cosine and CSM might be most useful for beginner-level EFL learners, (2) the LLR, Chi2 and Yates lists might be most useful for intermediate-level EFL learners, and (3) the MI and McNemar vocabulary might be most appropriate for advanced-level EFL learners.
[Fig. 4 plot. Panels: Spoken Business, Written Business. Vertical axis: percentage of words known (0 to 100) by US grade. Horizontal axis: Freq, Dice, Cosine, CSM (Beginning); LLR, Chi2, Yates (Intermediate); MI, McNemar (Advanced).]
Fig. 4 US grade level comparisons for the top 500 words
5 Developing an E-Learning Vocabulary Buil
3. [Fig. 2 diagram, continued: 9 written business word lists.]
Fig. 2 Procedure for extracting BNC spoken & written business word lists
4 Understanding the Meaning of the Extracted Specialized Lists
4.1 Extracted business words comparison
In order to get a clear picture of potential pedagogical applications, we examined the top 500 outstanding words for each statistical measure. Outstanding words are those that appear near the top of each list and are ranked as outstandingly prominent by each statistical tool's criteria. Due to space limitations, Table 2 shows only the top 15 outstanding words, with the extractions from three of the nine tools from the two business corpora (spoken and written). This snapshot provides a clear and simple illustration of the types of words extracted by each measure.
Table 2 Top 15 extracted business words comparison
Column headers as extracted: Written Business, Spoken Business; MI, LLR, MI
motion, region, congress, union, colleague, client, company, train, conference, procedure, business, premium, file, trade, retrieve, measurable, acquisition, conductivity, induction, civilize, incompetent, feasibility, upturn, activist, scandal, marquis, right-hand, amalgamation, retention, they, on, will, this, market, company, bank, the, business, p
4. 4.2 Business English dictionary entry word overlap
First, to evaluate how effectively these tools extracted technical business words, they were compared to an existing technical business vocabulary control list. For this we used the 4,565 entry words in the Longman Business English Dictionary. In Fig. 3, the inner line in the radar chart shows the overlap in percentage between the business dictionary entries and the top 500 spoken business words, and the outer line shows the overlap with the top 500 written business words. We can see the nine tools effectively produced relevant technical vocabulary. In particular, we see that there is a 67% overlap between the LLR, Chi2 and Yates spoken word lists and the dictionary entries. For the written business words, 79% of the LLR, Chi2 and Yates words and 80% of the MI words overlap with the dictionary entries.
[Fig. 3 radar chart. Axes: Freq, Dice, Cosine, CSM, LLR, Chi2, Yates. Series: Spoken Business, Written Business.]
Fig. 3 The overlap of the top 500 extractions with the Longman Business English Dictionary
4.3 Grade level based on word familiarity
Next, we investigated at what US grade level the top 500 words would be understood by native-speaking children. For comparative data, we used Dale and O'Rourke's (1981) data published in The Living Word Vocabulary. Looking at the horizontal line in Fig. 4, we can see that 80% of the words from Freq, Dice, Cosine and CSM are generally
5. Creating E-Learning Material with Statistically Extracted Spoken & Written Business Vocabulary from the British National Corpus
Kiyomi CHUJO, Kathryn OGHIGIAN, Chikako NISHIGAKI, Masao UTIYAMA and Takahiro NAKAMURA
For many people, specialist vocabulary is a key element of ESP (English for Specific Purposes); however, the use of conventional selection criteria (frequency and range) for identifying ESP vocabulary has been found to be only partly successful. Because the focus of these measures is ranking general-purpose vocabulary in order of priority, separating technical vocabulary from general-purpose vocabulary is still labor-intensive, time-consuming, and heavily dependent on the selector's expertise in English education and specialist knowledge of the domain, which English teachers generally do not have. To address this need, Chujo and Utiyama (2006) have established an easy-to-use automated tool using nine statistical measures to identify level-specific, domain-specific words. In this study, the log-likelihood ratio was applied to the 1.32-million-word spoken business component and the 7.12-million-word written business component of the British National Corpus. We examined the top 500 most outstanding words of each list and confirmed that the log-likelihood ratio iden
6. Phrase 1: support the motion; a different region; a former colleague; a conference call; a data file; a composite material; a conference delegate; a spreadsheet program; an environmental activist; positive feedback; a formally agreed price; a voluntary organization; a manual worker; a membership fee; safety in the workplace; assertive behavior; regional development; adopt a resolution; a telephone directory; a sales rep; use a formula; an appraisal system; a company secretary; the small print; a job description; make a referral; a collision damage waiver; a personal diary; a task force; a self-employed person; under review; on the agenda; a training session; aggressive behavior; a consignment of goods; a setup program; a private investigator; a car salesman; a major sponsor of the Olympics; a ring binder
7. Phrase 2: propose a motion; in the region of 2,000; male colleagues; hold a conference; a file name; a composite tax rate; delegate authority to; a spreadsheet file; a human rights activist; get feedback; It's formally seconded; a voluntary worker; an instruction manual; a full membership; his former workplace; an assertive person; a regional office; pass a resolution; a data file directory; a safety rep; a standard formula; an appraisal report; a private secretary; in print; a trade description; a referral service; a waiver of the premium; a desk diary; a task group; a self-employed builder; a review committee; an agenda item; in session; an aggressive marketing campaign; a consignment note; a setup screen; an accident investigator; an insurance salesman; an official sponsor form; a loose-leaf binder
tax, income, asset, investor, credit, finance, dividend, transaction, seller, clause, bond, debt
8. program
Armed with our vocabulary lists, we then considered how we could present them in a framework that would best support vocabulary learning. We wanted to integrate theories of learning, information processing, second language acquisition and TESL to ensure long-term retention and create an enjoyable learning atmosphere. Taking advantage of the possibilities inherent in computer-assisted language learning, we have been incorporating various kinds of specialized vocabulary, such as TOEIC and TOEFL, with a variety of exercises into an original vocabulary teaching method previously published (Chujo, 2002; Chujo et al., 2002, 2003, 2004). Previous implementation of this prototype software with other types of vocabulary showed that students enjoyed learning with this program and that it allowed for a high retention rate for the acquired target words. Because this program has been shown to be an effective tool for learning vocabulary and improving communicative proficiency as measured by the TOEIC (Chujo et al., 2003), we have used it as our template for this intermediate business vocabulary learning material. This e-learning program has 12 units of ten words each, so students learn 120 words in total. As shown in Table 3, they learn 10 target words in each unit using the Five Learning Steps, which are organized to teach target words first in isolation (Steps 1, 2 and 4) and then in phrases (Steps 3 and 5).
9. that rates whether the word is familiar to students in U.S. grade levels 4 through 16. This list was used to determine the grade level at which the central meaning of a word can be readily understood.
(4) The junior and senior high school (JSH) textbook vocabulary list, containing 3,245 different base words, was compiled from the top-selling series of JSH textbooks in Japan (the New Horizon 1, 2, 3 series and the Unicorn I, II and Reading series). Japanese high school students generally use these or similar books to study English before entering a university. This vocabulary was used to subtract all the words taught in junior and senior high school and to confirm that the business core words are unknown to university students.
3.2 Statistical measures
Next, using the earlier Chujo and Utiyama (2006) study as our template, we applied nine statistical tools to these two master lists (spoken business and written business) to extract and organize the vocabulary into more meaningful lists. The nine tools used [14] included simple frequency (Freq), the Dice coefficient (Dice), Cosine (Cosine), the complementary similarity measure (CSM), the log-likelihood ratio (LLR), the chi-square test (Chi2) [18], the chi-square test with Yates's correction (Yates), mutual information (MI) [19], and McNemar's test (McNemar) [20]. These statistical measures are widely used in computational linguistics. Using each measure, the stati
10. Leech, G., Rayson, P. and Wilson, A. (2001). Word Frequencies in Written and Spoken English. Harlow: Pearson Education Limited.
[7] CLAWS7 (1996). http www comp lancs ac uk computing users eiamjw claws claws7 html
[8] Chujo, K. (2004). Measuring vocabulary levels of English textbooks and tests using a BNC lemmatised high frequency word list. In Nakamura, J., Inoue, N. and Tabata, T. (Eds.), English Corpora under Japanese Eyes. Amsterdam: Rodopi, pp. 231-249.
[9] Pearson Education Limited (2000). Longman Business English Dictionary. Harlow: Pearson Education Limited.
[10] Dale, E. and O'Rourke, J. (1981). The Living Word Vocabulary. Chicago: World Book-Childcraft International, Inc.
[11] Asano, H., Shimomura, Y., Makino, T., Ikeda, M., Ikeya, A., Ishizuya, S., et al. (1999). New Horizon English Course 1, 2 and 3. Tokyo: Tokyo Shoseki.
[12] Suenaga, K., Yamada, Y., Fukai, K., Nakamura, S., Ishizuka, K., Ichinose, K., et al. (2001). Unicorn English Course I, II and Reading. Tokyo: Bun-eido.
[13] Manning, C. D. and Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge: The MIT Press.
[14] Manning, C. D. and Schütze, H. (1999). Foundations of Statistical Natural Language Processing. Cambridge: The MIT Press.
[15] Wakaki, M. and Hagita, N. (1996). Recognition of degraded machine-printed characters using a co
11. Phrase 2: a sales tax; a net income; fixed assets; overseas investors; a tax credit; international finance; an interim dividend; a financial transaction; direct sellers; a penalty clause; a bond market; the national debt; equity capital; an auditor's report; the high inflation; a sales revenue; a single currency; a commercial property; a merger plan; a safety requirement; a discount rate; a mortgage payment; a prospective purchaser; a price competition; a takeover bid; a marginal tax rate; a portfolio manager; the international monetary system; a board chairman; a stock valuation; a wholesale dealer; a small retailer; an overseas branch; lenders and borrowers; a breach of trust; a sterling balance; an accountancy service; the provision of appropriate training; an equilibrium price; a liquidity ratio
12. Table 3 Five learning steps
Introduction: Step 1, Overview of 10 new words
Learning: Step 2, Sounds & Meanings of 10 words; Step 3, Sounds & Meanings of 20 phrases
Follow-up: Step 4, Spelling of 10 words; Step 5, Dictation of 20 phrases
In Fig. 7, example pages from this program are shown. In the Introduction, pictures depicting suitable business scenes corresponding to the ten target words in the unit are provided. In Step 1, a list of the target words is presented. In Step 2, students learn the correct pronunciation and the Japanese equivalent of each target word. In Step 3, two phrases for each target word are learned with their pronunciation. In Step 4, learners double-check their understanding of the target words by recalling the spelling of the words. Finally, in Step 5, learners listen to and transcribe each phrase in their notebooks with a pencil.
[Fig. 7 screenshots. Panels: Introduction, Step 3, Step 4, Step 5.]
Fig. 7 Example pages from the Business Vocabulary 1
This Business Vocabulary 1 e-learning material was created using Homepage Builder. Pictures were purchased from Shutterstock.com Support (http www shutterstock com), and sounds were obtained from http www research att com ttsweb tts demo php. The phrases
13. ding Program
5.1 Intermediate-level word selection
Since our goal in this study is intermediate-level business vocabulary, we chose the LLR lists as the target lists for the following reasons: (1) LLR was effective in extracting business words, as shown by the overlap with business dictionary entry words; (2) it extracted business words appropriate for the intermediate level in terms of word familiarity; (3) it is a well-established statistical technique and behaves well whatever the corpus size, according to Oakes (1998, p. 174); and (4) it is one of the most widely used statistical tools in corpus linguistics (see Chujo and Utiyama, 2006, pp. 256-257) [27].
To further refine the intermediate business vocabulary, from each of the top 500 spoken and written business words we next subtracted all the known words taught in Japanese junior and senior high school (JSH words; see Fig. 5). That gave us 71 core business words that appeared prominently in both spoken and written business contexts and that would be new to high school graduates. A further 190 and 194 words appeared prominently in a spoken and a written business context, respectively.
[Fig. 5 diagram: spoken business top 500 and written business top 500; subtract JSH words; spoken business 190 words, spoken & written business 71 words, written business 194 words.]
Total 455 diffe
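The subtraction step summarized in Fig. 5 amounts to a few set operations. The following is a minimal Python sketch under stated assumptions: the function name and the toy word lists are hypothetical, and it only illustrates the logic of removing JSH words and splitting the remainder into core, spoken-only and written-only lists. It is not the authors' actual procedure or data.

```python
def build_intermediate_lists(spoken_top, written_top, jsh_words):
    """Split two top-N lists into core, spoken-only and written-only sets.

    All three arguments are iterables of lemmas (hypothetical inputs).
    """
    known = set(jsh_words)
    spoken_new = set(spoken_top) - known    # spoken words not taught in JSH
    written_new = set(written_top) - known  # written words not taught in JSH
    core = spoken_new & written_new         # prominent in both contexts
    return {
        "core": core,
        "spoken_only": spoken_new - core,
        "written_only": written_new - core,
    }


# Toy illustration with made-up lists (not the study's data).
lists = build_intermediate_lists(
    spoken_top=["company", "audit", "agenda", "motion"],
    written_top=["company", "audit", "dividend", "merger"],
    jsh_words=["company"],
)
total_words = sum(len(words) for words in lists.values())
```

With the counts reported above, the three lists hold 71 + 190 + 194 = 455 different words.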
14. ds by using statistical measures but also to evaluate how effectively this was done. In addition, we wanted to know whether these words appear in general English, at what U.S. native-speaker grade level they are understood, and which of them are learned by Japanese students in the course of their junior and senior high school years. For these reasons, the following four control vocabulary lists were created using the same procedures described above in 3.1.1, and each vocabulary is described in detail below.
(1) The British National Corpus High Frequency Word List (BNC HFWL), a list of 13,994 lemmatized words representing 86 million BNC words that occur 100 times or more (the compiling procedure is detailed in Chujo, 2004), was used for comparison to statistically determine if and how these business-related words appear differently in a general corpus.
(2) The Longman Business English Dictionary (Pearson Education Limited, 2000) includes over 20,000 words and phrases based on the analysis of millions of authentic business texts. In this study, we used only words, totaling 4,565 entries, as an existing technical vocabulary control list to evaluate how effectively the nine statistical tools extracted business words.
(3) The Living Word Vocabulary (Dale and O'Rourke, 1981) [10] includes more than 44,000 items, and each has a percentage score
15. Top 40 Written Business Vocabulary
equity, auditor, inflation, revenue, currency, property, merger, requirement, discount, mortgage, purchaser, competition, takeover, marginal, portfolio, monetary, chairman, valuation, dealer, retailer, overseas, lender, breach, sterling, accountancy, provision, equilibrium, liquidity
Phrase 1: a local tax; an income tax; net assets; individual investors; a credit card; corporate finance; a final dividend; a commercial transaction; a goods seller; an exclusion clause; a government bond; a bad debt; equity finance; the company's auditors; the inflation rate; a tax revenue; a foreign currency; an intellectual property; a merger agreement; a legal requirement; a cash discount; a mortgage rate; a potential purchaser; an international competition; a hostile takeover; a marginal cost; a portfolio investment; a monetary policy; a committee chairman; a property valuation; a car dealer; a food retailer; an overseas market; a money lender; a breach of contract; a pound sterling; an accountancy firm; a provision of services; the market equilibrium; the international liquidity
16. for each target word were retrieved from the corresponding BNC sub-corpora using the Shogakukan Corpus Network (http scn02 corpora jp sakura03), and two phrases were chosen based on frequency of occurrence and brevity. In other words, we chose real phrases directly from the BNC, selecting those that appeared most often and, among those, the most concise. Both words and phrases are provided with Japanese translations, using A Dictionary of English Usage for Business and Finance (Hashimoto, 1991) and in consultation with a businessman specializing in financing. Because there may be some variation in what users will select, we have listed the 120 business words and their 240 business phrases from Business Vocabulary 1 in the Appendix.
6 Further Research
In this study, we hoped to demonstrate how easily technical business vocabulary can be extracted from corpora and to provide other educators with the tools to do so themselves. The selected intermediate business words created in our study are available electronically at http www5d biglobe ne jp chujo. The Business Vocabulary 1 e-learning material is available at http weekend kir ne jp. Business Vocabulary 2 and Business Vocabulary 3 are currently in development. Further research will focus on implementing this e-learning material as a case study in university-level English classes in Japan to evaluate its effectivenes
17. hallenge was how to take such a large corpus and find specific kinds of words, such as a business lexicon and/or a targeted level of vocabulary.
Table 1 Examples of English language corpora and usages
Examples of English Language Corpora:
- Bank of English (written & spoken English for COBUILD)
- British National Corpus (100 million spoken & written British words)
- American National Corpus (100 million spoken & written American words)
- International Corpus of English (international varieties of English)
Examples of Observational Language Usage:
- frequently used words
- common collocations
- specialized word lists
We know that applying conventional frequency and range criteria to a corpus tends to extract general-purpose vocabulary (Sutarsyah et al., 1994: 48) and is therefore of limited use in identifying technical ESP (English for Specific Purposes) words. Separating technical vocabulary from general-purpose vocabulary is still "labor-intensive, time-consuming, and heavily dependent on the selector's expertise in English education and a specialist knowledge of the domain" (Chujo and Utiyama, 2006: 256). In 2006, Chujo and Utiyama established an easy-to-use tool employing nine statistical measures to identify level-specific, domain-specific words. They found that specific statistical measures identified specific types of vocabulary. For example, the log
18. ient company; an international organization; an ordinary shareholder; a first-time buyer; a monthly payment; a cash payment; the public sector; the product liability; a trust fund; an audit report; health insurance; an insurance policy; a national union; corporate body; a relationship between employer and employee; an executive committee; a trainee accountant; supplier companies; an employment agency; an objective fact; a competitive advantage; a membership fee; a premium price; lifetime earnings; a target market; an acquisition of a company; a delivery service; a procedures manual; recession proof; government expenditure; a soft option; a pension plan
Top 40 Spoken Business Vocabulary
motion, region, colleague, conference, file, composite, delegate, spreadsheet, activist, feedback, formally, voluntary, manual, membership, workplace, assertive, regional, resolution, directory, rep, formula, appraisal, secretary, print, description, referral, waiver, diary, task, self-employed, review, agenda, session, aggressive, consignment, setup, investigator, salesman, sponsor, binder
19. mplementary similarity measure and error-correction learning. IEICE Trans. Inf. & Syst., E79-D(5).
[16] Dunning, T. E. (1993). Accurate methods for the statistics of surprise and coincidence. Computational Linguistics, 19(1), 61-74.
[17] Hisamitsu, T. and Niwa, Y. (2001). Topic-word selection based on combinatorial probability. NLPRS-2001, 289-296.
[18] Hisamitsu, T. and Niwa, Y. (2001). Topic-word selection based on combinatorial probability. NLPRS-2001, 289-296.
[19] Church, K. W. and Hanks, P. (1989). Word association norms, mutual information, and lexicography. Proceedings of ACL-89, 76-83.
[20] Rayner, J. C. W. and Best, D. J. (2001). A Contingency Table Approach to Nonparametric Testing. New York: Chapman & Hall/CRC.
[21] Scott, M. (1999). WordSmith Tools [Computer software]. Oxford: Oxford University Press.
[22] Chujo, K. and Utiyama, M. (2006). Selecting level-specific specialized vocabulary using statistical measures. SYSTEM, 34(2), 255-269.
[23] Chujo, K. and Utiyama, M. (2005). Selecting level-specific BNC applied science vocabulary using statistical measures. Selected Papers from the Fourteenth International Symposium on English Teaching. Taipei: English Teachers' Association/ROC, 195-202.
[24] Chujo, K. and Utiyama, M. (2006). Selecting level-specific specialized vocabulary using statistical measures. SYSTEM, 34(2), 255-269.
[25] Chujo, K., Utiyama, M. and Oghigian, K. (2006). Selecting level-specific Kyoto tourism vocabulary using statis
20. rent words
Fig. 5 Developing three intermediate-level business word lists
The goal in using statistical measures is to narrow down the number of candidates for the targeted technical word list; the result is not meant to be a definitive list. These statistical tools can help users select technical vocabulary automatically and eliminate the need for expertise in that field. Starting from extracted lists such as these, users can easily delete irrelevant words manually. From these 455 words, we created lists of core words. As shown in Fig. 6, we chose 40 words appearing in both the spoken and written contexts as business core words according to each word's score ranking. We also chose 40 core spoken words and 40 core written words based on each word's LLR score ranking. In this study, these 120 words will form the basis of the Business Vocabulary 1 e-learning material, and the remaining 335 words will be used for developing Business Vocabulary 2 and Business Vocabulary 3.
[Fig. 6 diagram: Core Vocabulary. Business Vocabulary 1 (total 120 words): 40 written core words, 40 spoken & written core words. Business Vocabulary 2: 120 words. Business Vocabulary 3: 120 words.]
Fig. 6 Core vocabulary for Business Vocabulary 1
5.2 Building an e-learning
21. rice, rate, cost, firm, tax, investment, account, share, profit, contract, lading, buyout, long-run, arbitrage, subcontractor, stockmarket, offeror, drafter, no-arbitrage, shareholding, headhunter, payout, issuer, liquidity, salesperson
Frequency: 2440, 22, 152838, 42795, 203
Word Length: 6.5, 9.3, 2.8, 5.6, 8.9
We can see that the lists are very different from each other even though they were extracted from the same spoken and written business master lists. For example, in the written LLR list we can recognize fairly simple business words such as market, company and bank, while in the written MI business list we can see more complex words such as lading, buyout, arbitrage and subcontractor. The bottom two rows of each column show the average frequency score and average word length of these 15 words. As Table 2 shows, the average frequency score decreases from left to right, that is, from Freq to MI. On the other hand, the average word length increases from Freq to MI, ranging from 2.4 or 2.8 to 9.3 or 8.9 letters. Although we are aware that word difficulty may be influenced by many more factors than frequency and word length, this might support the possibility that specific statistical tools can be used to target specific grade-level vocabulary. This will be explored in the following sections.
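The two summary rows of Table 2 (average frequency and average word length) are simple means over each 15-word column. As a hedged illustration, the Python sketch below computes both for a list of (word, frequency) pairs. The function name and the toy frequencies are hypothetical, and counting only alphabetic characters as "letters" is an assumption about how hyphenated items such as long-run would be measured.

```python
from statistics import mean


def summarize(words_with_freq):
    """Return (average frequency, average word length in letters).

    words_with_freq: list of (word, frequency) pairs (hypothetical input).
    Only alphabetic characters are counted as letters (an assumption).
    """
    avg_freq = mean(freq for _, freq in words_with_freq)
    avg_len = mean(sum(ch.isalpha() for ch in word) for word, _ in words_with_freq)
    return avg_freq, avg_len


# Toy illustration with made-up frequencies (not the values in Table 2).
sample_column = [("tax", 100), ("long-run", 50)]
avg_freq, avg_len = summarize(sample_column)
```

Applied to each column in turn, a helper like this would make it easy to check the trend described above, with frequency falling and word length rising from Freq to MI.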
22. rnal of the College of Industrial Technology Nihon University, 36, 29-43.
[32] Chujo, K., Yamazaki, A. and Ushida, T. (2003). Bijuaru beishikku niyoru TOEIC yoo goiryoku yoosei sofutowuea no shisaku (2) [The development of English CD-ROM material to teach vocabulary for the TOEIC test (utilizing Visual Basic): Part 2]. Journal of the College of Industrial Technology Nihon University, 36, 43-53.
[33] Shutterstock.com Support. http www shutterstock com
[34] Sounds. http www research att com ttsweb tts demo php
[35] Shogakukan Corpus Network. http scn02 corpora jp sakura03
[36] Hashimoto, M. (1991). Keizai Eigo Eiwa Katsuyou Jiten [A Dictionary of English Usage for Business and Finance]. Tokyo: Nihon Keizai Shimbunsha.
[37] Chujo, K. and Oghigian, K. (2006). Creating E-Learning Material with Statistically Extracted Spoken & Written Business Vocabulary from the BNC. Paper presented at the 2006 Asia TEFL International Conference, Fukuoka, Japan.
Appendix
account, investment, profit, contract, financial, customer, management, employee, client, organization, shareholder, buyer, payment, cash, sector, liability, fund, audit, insurance, policy, union, corporate, employer, executive, accountant, supplier, employment, objective, competitive, fee, premium, ea
23. rnings, target, acquisition, delivery, procedure, recession, expenditure, option, pension
Business Vocabulary 1
Top 40 Spoken & Written Business Vocabulary
Phrase 1: a bank account; a capital investment; profit before tax; an employment contract; financial management; customer service; top management; a government employee; a client group; a social organization; a minority shareholder; buyers and sellers; an interest payment; a cash flow; the private sector; the tax liability; a pension fund; an internal audit; a life insurance company; an economic policy; a trade union; corporate finance; a manufacturer employer; a chief executive officer; a chartered accountant; a software supplier; an employment opportunity; the main objective; a competitive market; a licence fee; a premium rate; export earnings; a sales target; merger and acquisition; a delivery date; a new procedure; an economic recession; capital expenditure; a share option; a pension fund
Phrase 2: a profit and loss account; a business investment; a net profit; a futures contract; a financial problem; customer satisfaction; system management; an employee share ownership plan; a cl
24. s and to obtain feedback from students to help refine the material in order to promote better retention, usability and motivation.
Acknowledgement
This study is funded by a Grant-in-aid for Scientific Research (No. 17520401) from the Japan Society for the Promotion of Science and the Ministry of Education, Science, Sports and Culture. It was also supported in part by the College of Industrial Technology, Nihon University. Part of this study is based on a presentation given at the Fourth Asia TEFL International Conference, August 18, 2006, in Fukuoka, Japan.
References
[1] Beglar, D. and Hunt, A. (2005). Six principles for teaching foreign language vocabulary: A commentary on Laufer, Meara, and Nation's "ten best ideas". The Language Teacher, 29(7), 7-10.
[2] Sutarsyah, C., Kennedy, G. and Nation, P. (1994). How useful is EAP vocabulary for ESP? A corpus based study. RELC Journal, 25, 34-50.
[3] Chujo, K. and Utiyama, M. (2006). Selecting level-specific specialized vocabulary using statistical measures. SYSTEM, 34(2), 255-269.
[4] Burnard, L. (2000). Reference guide for the British National Corpus (World Edition). http www natcorp ox ac uk World HTML thebib html
[5] Kennedy, G. (2003). Amplifier collocations in the British National Corpus: Implications for English language teaching. TESOL Quarterly, 37(3), 467-487.
[6]
25. stical score for the extent of each word's "outstanding-ness" (Scott, 1999) in frequency of occurrence is computed. The formula for each measure is available on the web and in Chujo and Utiyama (2006).
3.3 Extracting spoken and written business words
As shown in Fig. 2, when each statistical tool is applied to the spoken business or written business corpus, these tools automatically identify outstanding words in frequency of occurrence by making comparisons between the business words and general English. In other words, we apply the statistics to both the ESP Master Word Lists and to the entire EGP BNC Word List (BNC HFWL) and compare the results. These statistics indicate whether a word is overused or underused in a specified list compared with a list of general English. In this way, we can statistically determine how words in the targeted lists (in this case, the business words) would appear differently from words in a general corpus.
[Fig. 2 diagram: the Spoken Business Master List and the Written Business Master List (2,973 words) are each compared with the EGP list using the 9 statistical tools (Freq, LLR and MI among them), producing 9 spoken business word lists (Dice, Cosine, LLR, Chi2 and MI among them).]
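To make the comparison between a business list and general English concrete, the following minimal Python sketch computes one of the nine measures, the log-likelihood ratio, from a 2x2 contingency table in the style of Dunning (1993). It is an illustration under stated assumptions, not the study's implementation: the exact formulas used are those given in Chujo and Utiyama (2006), and the function names and toy counts here are hypothetical.

```python
import math


def _term(observed, expected):
    # One cell's contribution; a cell with zero observations contributes 0.
    return observed * math.log(observed / expected) if observed > 0 else 0.0


def log_likelihood_ratio(freq_target, size_target, freq_reference, size_reference):
    """G2 statistic for one word: target (ESP) corpus vs. reference (EGP) corpus."""
    a, b = freq_target, freq_reference                 # occurrences of the word
    c, d = size_target - a, size_reference - b         # all other tokens
    n = size_target + size_reference
    word_total, other_total = a + b, c + d
    expected = (
        size_target * word_total / n,
        size_reference * word_total / n,
        size_target * other_total / n,
        size_reference * other_total / n,
    )
    return 2.0 * sum(_term(o, e) for o, e in zip((a, b, c, d), expected))


def rank_by_llr(target_counts, target_size, reference_counts, reference_size):
    """Sort words from most to least outstanding by their LLR score."""
    scores = {
        word: log_likelihood_ratio(
            freq, target_size, reference_counts.get(word, 0), reference_size
        )
        for word, freq in target_counts.items()
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Toy counts (made up): a small business corpus vs. a large general corpus.
ranked = rank_by_llr(
    {"dividend": 40, "the": 600, "audit": 25}, 10_000,
    {"dividend": 20, "the": 60_000, "audit": 30}, 1_000_000,
)
```

In this toy example the word "the" has the same relative frequency in both corpora, so its score is approximately zero and it falls to the bottom of the ranking. The score itself does not say whether a word is overused or underused; that direction can be read by comparing the two relative frequencies.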
26. tical measures. New Aspects of English Language Teaching and Learning. Taipei: Crane Publishing Company Ltd., 126-138.
[26] Oakes, M. (1998). Statistics for Corpus Linguistics. Edinburgh: Edinburgh University Press.
[27] Chujo, K. and Utiyama, M. (2006). Selecting level-specific specialized vocabulary using statistical measures. SYSTEM, 34(2), 255-269.
[28] Chujo, K. (2002). Development of CD-ROM material for teaching TOEIC test vocabulary to beginning-level students. JACET Summer Seminar Proceedings, 2, 40-44.
[29] Chujo, K., Shiina, K. and Nishigaki, C. (2002). Development of a CALL system to teach vocabulary for the TOEIC test. Paper presented at the AILA (Association Internationale de Linguistique Appliquée) Conference, Singapore.
[30] Chujo, K., Yamazaki, A. and Ushida, T. (2003). Bijuaru beishikku niyoru TOEIC yoo goiryoku yoosei sofutowuea no shisaku (2) [The development of English CD-ROM material to teach vocabulary for the TOEIC test (utilizing Visual Basic): Part 2]. Journal of the College of Industrial Technology Nihon University, 36, 43-53.
[31] Chujo, K., Ushida, T., Yamazaki, A., Genung, M., Uchibori, A. and Nishigaki, C. (2004). Bijuaru beishikku niyoru TOEIC yoo goiryoku yoosei sofutowuea no shisaku (3) [The development of English CD-ROM material to teach vocabulary for the TOEIC test (utilizing Visual Basic): Part 3]. Jou
27. tified intermediate-level spoken and written business words by grade level, and verified that the measure was effective in separating business vocabulary from general-purpose vocabulary and in identifying spoken and written business vocabulary. This study outlines a systematic way to create spoken and written business vocabulary lists for a targeted proficiency level by using an established statistical measure, and describes the development of an effective e-learning program for Japanese college students based on our previously published material for teaching TOEIC vocabulary.

Keywords: Business English, ESP, Specialist Vocabulary, Statistical Measures, E-Learning Material

1. Background

As educators, we know vocabulary is the heart of a language (Beglar and Hunt, 2005) [7], and with advances in corpus linguistics, we now have the tools to create our own real-world vocabulary lists. Corpus linguistics is essentially the study of language through existing texts, or corpora. In other words, we as researchers can take a corpus of several million words already existing in real-world English and analyze it in diverse ways: for example, what are the most frequently used words in English, what are the most frequently occurring collocations, or, in the case of our study, what business words appear in a corpus (see Table 1). Our c
28. ure for preparing these business-related sub-corpora for statistical applications is shown in Fig. 1. We first lemmatized both sub-corpora to extract all base forms using the CLAWS7 tag set and created two alphabetical word lists. Lemmatizing means we counted only one form for each word (for example, employ) and noted the number of inflections such as employs, employed, and employing. Then, for pedagogical application, all proper nouns and numerals were identified by their part-of-speech tags and deleted manually, and all unusual or infrequent words were eliminated by deleting words appearing fewer than 10 times in the spoken list and fewer than 100 times in the written list. Finally, this left us with a 2,780-word spoken business master list and a 2,973-word written business master list.

British National Corpus (BNC): 100 million spoken and written words
  -> Spoken Business Sub-corpus (1.3 million words) -> lemmatized -> frequency >= 10 -> Spoken Business Master List (2,780 words)
  -> Written Commerce and Finance Sub-corpus (7.1 million words) -> lemmatized -> frequency >= 100 -> Written Business Master List (2,973 words)

Fig. 1 Procedure for preparing BNC business sub-corpora for statistical applications

3.1.2 Control lists

We wanted not only to extract spoken and written business wor
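The preparation procedure in Fig. 1 (count lemmas, drop proper nouns and numerals, apply a frequency cutoff, list alphabetically) can be sketched as follows. This is a minimal illustration under several assumptions, not the authors' tooling: the input is assumed to be already lemmatized and tagged, the function name is hypothetical, and the tag filter assumes that CLAWS7 proper-noun tags begin with "NP" and numeral tags begin with "M". In the study these items were identified by tag but deleted manually, whereas the sketch removes them automatically.

```python
from collections import Counter

# Assumption: CLAWS7 proper-noun tags start with "NP" (e.g. NP1) and
# numeral tags start with "M" (e.g. MC, MD).
EXCLUDED_TAG_PREFIXES = ("NP", "M")


def build_master_list(tagged_lemmas, min_freq):
    """Return an alphabetical list of lemmas occurring at least min_freq times.

    tagged_lemmas: iterable of (lemma, pos_tag) pairs, one per token,
    already lemmatized (e.g. "employs" and "employed" both appear as "employ").
    """
    counts = Counter(
        lemma.lower()
        for lemma, tag in tagged_lemmas
        if not tag.startswith(EXCLUDED_TAG_PREFIXES)
    )
    return sorted(word for word, count in counts.items() if count >= min_freq)


# Toy input (invented): three tokens of "employ", a proper noun,
# a numeral, and one rare adjective.
tokens = (
    [("employ", "VV0")] * 3
    + [("London", "NP1")] * 5
    + [("two", "MC")] * 4
    + [("rare", "JJ")]
)
print(build_master_list(tokens, min_freq=2))
```

With the thresholds reported in the text, the same function would be called with min_freq=10 for the spoken sub-corpus and min_freq=100 for the written sub-corpus.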
