This project will be divided into three phases. In the first phase, we will concentrate on specific points of the grammar of spoken English through an analysis of various large corpora that contain an oral subcorpus (ICE, BNC, and the COBUILD LSWEC), and will then turn to corpora devoted to oral language, such as CANCODE, SWITCHBOARD and MICASE.
In the second phase of the project a complementary objective will be introduced: to describe how spoken English varies depending on the context of use. Linguists have long accepted the relationship between language and context (Gregory, 1967; Gregory & Carroll, 1978) and the need to understand, in particular, the impact of context on language. With the development of corpus studies, the issue of variation became firmly established in linguistic description, including grammar (Biber, 1988; Biber et al. 1999). From the standpoint of discourse analysis, the concept of genre is widely accepted among specialists to account for linguistic variation (Halliday & Hasan, 1979; Swales, 1990 & 1994; Bhatia, 1993). As in general linguistics, oral genres have received much less attention than written ones, with notable exceptions (Ventola, 1987). The project aims to fill an important part of this gap by studying the defining traits of oral genres, particularly in the field of academic communication but also in other contexts of communication (daily conversation between two or more speakers, oral presentations, parliamentary debates, etc.). The interest in spoken English, particularly in academic contexts, has two main motivations: a) the growing importance of oral communication in academic life (international mobility programs) and research (transnational projects), and b) the specialization of some of our team members in the study of scientific and academic English, with research projects and postgraduate courses focusing on this area. Specialists note a clear shortage of work on oral genres in the field of education and science (Swales, 1990). The collection of studies in Ventola et al. (2002) on various genres used in scientific conferences, and the research based on corpora of spoken academic English such as MICASE, compiled by Swales, are outstanding exceptions, but they are clearly insufficient.
To carry out our investigation at this stage, we will use data from MICASE, a freely accessible corpus; from SWITCHBOARD, a corpus of spoken American English that contains telephone conversations, compiled by Godfrey, Holliman & McDaniel (1992); and from an ad hoc corpus of oral presentations at conferences, which we will develop for this stage of the project and the next. For the analysis of non-academic oral genres, we will use speech samples from more general corpora such as ICE and the BNC, as mentioned above. In this phase of the project, we will also consider the relevance of gestures in oral communication, using the ad hoc corpus of conference presentations to address the kinesic and proxemic components of oral communication in academic settings.
The third phase of the project focuses on contrast and learning. As a starting point, and based on our own experience as teachers and on the many allusions to the subject by specialists and the media (Brown & Yule, 1983; Brumfit, 1984; Bygate, 1987; McCarthy & Carter, 1994; Carter & McCarthy, 1997), we believe that there is a relative failure in the teaching and learning of spoken English. The data analysis carried out in the first and second stages of the investigation will now serve as a point of comparison to identify areas in which the oral language of our students and other non-native users differs from that of native users of English in academic contexts. Attention to the difficulties of users of English as a foreign language in this context is justified by the lack of knowledge in this field and the growing importance of oral communication in the academic world. Non-verbal language (Knapp, 1982; Descamps, 1990; Poyatos, 1994; Cestero, 1999; Davis, 2005) is also a matter of study. To this end, we will compile an ad hoc corpus of presentations given in English by Spanish researchers at conferences. These presentations will be analyzed and contrasted with the native patterns identified in the second phase of the project. Corpora with samples from EFL learners, such as ICLE (International Corpus of Learner English) and SULEC (Santiago University Learner of English Corpus), will be used for the analysis of patterns and learning difficulties concerning some areas of oral competence in educational settings. The latter corpus was developed by our research team and is expected to be extended so as to meet the specific purposes of this project.

  1. To justify the existence and pertinence of a specific grammar of spoken English.
  2. To describe and characterise spoken English both from the grammatical and the discourse perspectives.
  3. To study in depth the specific characteristics of spoken English in different genres or types of text.
  4. To describe the patterns and identify the difficulties that Spanish students and users of English as a foreign language show in the use of English in oral contexts.
  5. To identify areas of spoken English where large differences are perceived between native and non-native use, with the aim of analysing the didactic implications derived from those differences.
  • To study whether it is necessary to posit the existence of a grammar of spoken English independent from that of writing.
  • To draw comparisons between oral and written language through the use of representative oral and written language corpora and databases.
  • To analyse those grammatical and discourse phenomena of oral English which are exclusive to that medium of expression.
  • To look into the grammatical and discourse phenomena of spoken English which show a significantly higher or lower frequency with respect to their counterparts in writing.
  • To systematise those aspects of spoken English which are due to a lack of planning in discourse, such as repetitions, self-repair, fillers, vague expressions, etc.
  • To look into those forms of spoken English which are used to express emotions, attitudes and points of view, such as intensifiers, idioms, interjections, exclamatives, expletives, vocatives, etc.
  • To describe the patterns and identify the difficulties that Spanish students and users of English as a foreign language (EFL) show in the management of interpersonal relationships, which result from, for instance, differences between languages and cultures in the expression of politeness, turn-taking in conversations, etc.
  • To describe the patterns and identify the difficulties that Spanish students and users of English as a foreign language (EFL) show in the use of cohesive devices and in ensuring discourse coherence. This means, for instance, describing how students and users of English as a foreign language handle, and what problems they have with, textual connectors and other kinds of textual metadiscourse, discourse markers typical of conversation, and logical connections between propositions, ideas and stretches of discourse, in order to identify correctly the different participants in discourse, to reinforce lexical connections, etc.
  • To describe the patterns and identify the difficulties that Spanish students and users of English as a foreign language (EFL) show in the use of paralinguistic elements with communicative aims, such as gestures, postures, use of space and audiovisual aids.
  • To identify the non-verbal components frequent in academic presentations.
  • To describe the means and strategies used by learners to compensate for their limitations, on the understanding that part of the communicative competence of a foreign language speaker consists in acquiring those strategies (readjustment of information depending on the existing possibilities, reformulation through alternative expressions, circumlocutions, self-repair, creativity, etc.) which will help them optimise the use of the resources they have to ensure communicative effectiveness.
  • To describe the cultural differences present in oral communication, both in contexts of personal interaction and in communication in academic contexts.


The phases of the investigation described in the preceding paragraphs share essentially the same methodological procedures and require similar means; however, the more restricted view adopted in the second and third phases requires the compilation of an ad hoc corpus. In general, an exhaustive qualitative analysis of the texts at different levels will be carried out first, and quantitative techniques will be incorporated at a second stage (once interesting phenomena and potential explanatory factors that enable us to formulate accurate hypotheses have been identified).
The goals of the first phase of this study concentrate initially on native language, which justifies the use of English corpora that are already publicly available, such as the British National Corpus (BNC), the International Corpus of English (ICE) or the Longman Spoken and Written English (LSWE) Corpus, among others. These corpora are widely known and constitute representative sources which allow our analysis to be compared with that of previous studies. Once we have drawn a comparison between oral and written language, other corpora containing only samples of oral language, for example CANCODE, SWITCHBOARD and MICASE, may be used.
In the second phase of the study we will put special emphasis on the contextual factors that determine linguistic forms; this means that we will need to complement the aforementioned materials with a supplementary corpus that contains sufficient information on the communicative situation. This corpus will be developed using the corpora already mentioned as models and drawing on the experience gained by the team when compiling SULEC.
Video recordings will be used to capture production as precisely as possible, and the most significant details will also be transcribed in the text. It is essential that the authenticity of spoken language is not compromised by overly artificial recording conditions. Therefore, especially when it comes to interactions in a foreign language, it is necessary to ensure that collection procedures include tasks and interview protocols that leave some room for spontaneity. Sections of the corpus devoted to everyday usage and to more informal and general communicative situations will require careful design and specific conditions for data collection. In the case of samples of more formal and specialized speech, this problem is largely solved by the generally public nature of these communicative events, in which the presence of a large audience may lessen the effect of outside observers.
This is the reason why we have decided to record samples of spoken language at specialized conferences and, more specifically, in paper sessions. Far from being a homogeneous genre, conference presentations comprise a variety of genres, including, for example, the introduction of the speaker by a moderator, the various sections into which the presentation is divided, and the usual brief questions or comments from the audience. This internal diversity reflects a spectrum of uses and communicative functions sufficiently varied for our purposes, and it is also readily comparable across a wide range of academic disciplines. It is, moreover, as noted earlier, one of the most prominent and representative contexts in current academic life.
With regard to the size of the corpus, we cannot fix it definitively a priori, but an initial target is needed to keep data collection from becoming an unmanageable task.
A possible distribution of the initial corpus is presented below. The first block is for General English and the second is devoted to English for Specific Purposes:
General English:
* Speaker’s level of competence:
- Formal style: Woman 5 hrs + Man 5 hrs
- Spontaneous style: Woman 5 hrs + Man 5 hrs
* Speaker’s level of competence:
- Formal style: Woman 5 hrs + Man 5 hrs
- Spontaneous style: Woman 5 hrs + Man 5 hrs
Total hrs: 40
English for Specific Purposes:
* Area of research: Experimental Sciences:
- Expert: Woman 5 hrs + Man 5 hrs
- Novice: Woman 5 hrs + Man 5 hrs
* Area of research: Humanities and Social Sciences:
- Expert: Woman 5 hrs + Man 5 hrs
- Novice: Woman 5 hrs + Man 5 hrs
Total hrs: 40
This is obviously an initial estimate. The sample size will be increased as our assumptions become more accurate and new factors are incorporated into the quantitative analysis. In particular, in the case of academic English it may be necessary to take into account the type of conference (for example, the degree of thematic specialization) and to produce comparable samples of native and non-native language.
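The arithmetic behind this initial distribution can be checked with a short sketch. It is only an illustration under stated assumptions: the two competence levels are not named in the distribution, so "Level A" and "Level B" are placeholders, and the names `DESIGN`, `HOURS_PER_CELL` and `block_hours` are hypothetical and not part of any project tooling.

```python
from itertools import product

HOURS_PER_CELL = 5  # hours planned for each speaker group in the distribution
GENDERS = ("Woman", "Man")

# Block layout taken from the proposed distribution. The competence levels
# are not named in the proposal: "Level A" / "Level B" are placeholders.
DESIGN = {
    "General English": {
        "groups": ("Level A", "Level B"),
        "conditions": ("Formal", "Spontaneous"),
    },
    "English for Specific Purposes": {
        "groups": ("Experimental Sciences", "Humanities and Social Sciences"),
        "conditions": ("Expert", "Novice"),
    },
}


def block_hours(block):
    """Sum the planned hours over every group x condition x gender cell."""
    cells = product(block["groups"], block["conditions"], GENDERS)
    return sum(HOURS_PER_CELL for _ in cells)


totals = {name: block_hours(block) for name, block in DESIGN.items()}
for name, hours in totals.items():
    print(f"{name}: {hours} hrs")
```

Each block has 2 x 2 x 2 = 8 cells of 5 hours, which gives the 40 hours stated for each block. Adding a further two-valued factor, such as the type of conference, would double the number of cells and therefore the recording time if the hours per cell were kept constant.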