Chapter 1 Introduction

1.1 Research Background
The 20th century witnessed rapid advances in human communication, most notably the appearance of the computer and the establishment of the Internet. As a tool of the human mind, the computer has to some degree acted as a bridge among the various cultures of science and technology, as well as history and art (Kettemann, 1995). Because computers are increasingly available for tasks such as searching, recording and storing, they make it possible to process and analyze natural language automatically, which in turn supports a high degree of measurement accuracy in linguistic research. The power of the Internet to provide abundant authentic linguistic data, together with the computer's capacity to make language analysis more scientific, has cleared the way for the development of language corpora. A corpus is a collection of written or spoken texts, stored on a computer and compiled to represent a certain language or a specific linguistic area. With the advent of corpora, more and more researchers have turned to them to analyze language features. Corpus-based language studies have attracted keen interest and a wide welcome, helped by easy access to large-scale corpora and many powerful corpus software tools. In this way, language corpora have become an integral part of a branch of modern linguistics, corpus linguistics. Corpus linguistics has been applied in many linguistic fields, including grammatical studies, contrastive analysis, lexicography, and so on. Gradually, this revolutionary approach has accelerated the shift from purely qualitative methods to mixed methods in which quantitative and qualitative methods are both used efficiently. The application of language corpora, which revolutionized the field of dictionary-making, has gradually fed into their use in language teaching materials.
It was Johns (1991) who first proposed applying corpora to language teaching, with the aim of turning the language learner into a language researcher. Fundamentally, corpora provide evidence against which our intuitions about language features can be checked, and more often than not they show that intuition can be faulty when it comes to issues of pragmatics or semantics. The contribution of corpus linguistics to identifying the language we teach is hard to doubt. In the view of McCarthy (2001), corpus linguistics represents cutting-edge change ranging from scientific techniques and methodologies to more encyclopedic shifts which will “impinge upon our long-held notions of education, roles of teachers, the cultural context of the delivery of educational services and the mediation of theory and technique”. Numerous studies have revealed that the language included in textbooks is frequently still based on intuitions about how we use language rather than on authentic language use.

1.2 Research Significance
Proposed by Renouf and Sinclair in 1991, the term "collocational framework" refers to a framework composed of two discontinuous words. As an extension of the collocational framework, the phrase frame is a term referring to a sequence of several discontinuous words, mainly within a span of three to five words. The occurrence of such discontinuous sequences of words might provide some supplementary information to the study of continuous sequences of words that may be igno

