Promoting academic writing in English for the World Wide Web: Part I [Archives:2005/862/Education]

July 25 2005

Prof R K Jayaraman
Department of English
Faculty of Education
Sana'a University, Sana'a
[email protected]

In my article “Artificial Intelligence: An Introduction” in Yemen Times dated 30 May 2005, I said that 'knowledge' and 'reasoning' are two very important components of 'human intelligence', and that in the field of Artificial Intelligence (AI), we seek primarily to simulate these two aspects of human intelligence (HI) electronically. 'Knowledge Representation' and 'Inference Engine' are concepts often used in the literature of AI to refer to these concerns. In the article that follows, which is Part I of a paper in two parts, I discuss the prospects and problems of using 'Knowledge Representation' and an 'Inference Engine' in promoting 'Academic Writing in English for the World Wide Web (WWW)'. Part II of the paper will appear in the next issue of Yemen Times.

This paper is not about EAP per se; it is about 'knowledge representation', considered in relation to the concerns of academic writing in English for the Web. It is argued that the excellence of academic writing in English for the Web is to be evaluated in terms of the facilities available for 'intelligently' searching it, which is what knowledge representation should guarantee. Excellence, in this context, also subsumes 'quality', and the paper therefore addresses, in equal measure, the issue: 'How does one ensure the reliability of the Web as a respectable source of academic information?'

Having set out its aim in this way, the paper steers clear of the definition of EAP, and of its relation to ESP, in terms of historical development, nature, and stated goals, if any, as these are not relevant to its immediate purpose, though they may be important in their own right. They are issues on which experts have taken different positions at different points in time, and one has reason to fear that the last word hasn't yet been said. What is of enduring interest in an otherwise messy state of flux, however, is the undeniable importance of 'content' to EAP deliberations of any persuasion. As 'knowledge representation' is basically a matter of the organization of content, content in EAP is of central importance to the thesis of the present paper, to which it will return presently.

In order to place the main thesis of the paper in perspective, even at the risk of labouring the obvious, it becomes necessary to describe the Internet scenario briefly. The Internet is a network of computers all over the world connected to each other through telecommunication links. Some of these computers are known as servers, in that they serve information; a larger number of them are known as clients, and the clients seek information. The information sought and served is available in what are known as web pages and web sites, millions of which have already been created by individuals and institutions, such as universities, all over the world, in many different languages. All these web pages and web sites, put together, constitute what has come to be known as the W(orld) W(ide) W(eb), or the Web, for short.

The Web houses a colossal body of information, on almost any subject under the sun, that gets bandied about across the globe via the Internet. This body of information keeps growing at a pace faster than the pace at which the measures to control it are put in place, making the Internet a virtual free-for-all. The entire enterprise, if we forget its military origins for a moment, has now become a thriving commercial venture, and its clientele includes a large number of people for whom academic matters are of little importance. For most of them the Internet, and through it the Web, is a source of distraction. Not much of the information available on the Web, however, is academically oriented, because it isn't intended to be. Even the little that is academically oriented is not necessarily academically respectable. A sizable chunk of it is certainly no more than junk, and unless careful, a searcher of the Web may be sucked into a deluge of irrelevant material.

Each web page or web site is uniquely indexed and has a unique address called a URL. Anyone may visit a web page or a site using its URL. But when users do not have a URL, they use one of the search engines, such as Yahoo, AltaVista, Google, and Ask Jeeves, to mention a few. The use of a search engine requires the user to type in a 'search term'. There are ways in which a search term is specified. Experienced searchers use Boolean operators, such as “and”, “or” and “not”, to sharpen the focus of their search. A number of search engines provide advanced search facilities, seeking more information from the searcher so as to narrow down the locations, dates of publication, and other attributes of the results. A search takes place in what is called a search space, and the Web in its totality constitutes the search space. The moment a category is selected, a search term is specified, and the “go” command is given, the search begins. This is the initial state of the search. The matches, or hits, that the search engine yields for the search term constitute its goal state. The time / distance (i.e. the web path) taken to yield a match is called the 'cost'. The cost of a search, and the quality and relevance of the matches yielded, are factors which differentiate an academically motivated search from a search driven by a desire for distraction.
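The mechanics of such a Boolean search may be sketched in miniature as follows. The tiny 'search space' of three documents, with their titles and texts, is invented purely for illustration; a real search engine indexes millions of pages with far more sophisticated machinery.

```python
def tokens(text):
    """Lowercase a text and split it into a set of word tokens."""
    return set(text.lower().split())

def boolean_search(documents, must=(), any_of=(), must_not=()):
    """Return the titles of documents satisfying the Boolean query:
    all the `must` words AND (at least one `any_of` word, if given)
    AND NOT any of the `must_not` words."""
    hits = []
    for title, text in documents.items():
        words = tokens(text)
        if (all(w in words for w in must)
                and (not any_of or any(w in words for w in any_of))
                and not any(w in words for w in must_not)):
            hits.append(title)
    return hits

# An invented three-document search space.
search_space = {
    "Climate report": "sustainable development and environmental protection",
    "Travel diary": "notes on travel and conference organization",
    "Court notes": "a sustainable objection raised during the case",
}

# "sustainable AND development" narrows the goal state to one hit.
print(boolean_search(search_space, must=["sustainable", "development"]))

# "sustainable NOT development" excludes the climate report instead.
print(boolean_search(search_space, must=["sustainable"], must_not=["development"]))
```

The Boolean operators shrink the set of hits step by step: each added "and" term filters the goal state further, while "not" removes otherwise matching documents.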

Though some of the advanced search techniques mentioned above make the search term more and more specific, they do not by any means make the search 'intelligent' so long as there is a dependence on 'keyword-based' search, even if it is semantically oriented. A keyword-based search recognizes meaning negatively, as 'difference' (after the fashion of Saussure, if we care to remember), and 'differential meaning' cannot account for the semantic load that a word acquires in context, which is a matter of 'meaning as substance'; and it is 'meaning as substance' which is important in academic studies, reasoning, and ontological pursuits. For example, the word “sustainable”, occurring in the context of “sustainable development for global environmental protection”, is vastly different in its semantic load from the same word occurring in the context of, say, a “sustainable” objection in a court case. A keyword-based search fails to recognize important distinctions such as this and, therefore, often yields irrelevant hits.
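The point may be made concrete with a minimal sketch. The two sentences below are invented, one using "sustainable" in its ecological sense and the other in its legal sense; a bare keyword match treats them as equally good hits.

```python
# Two invented sentences in which "sustainable" carries very
# different semantic loads: ecological in the first, legal in
# the second.
documents = [
    "sustainable development for global environmental protection",
    "the judge ruled the objection sustainable in the court case",
]

keyword = "sustainable"
hits = [d for d in documents if keyword in d.lower().split()]

# Both documents are returned: the keyword alone cannot tell the
# ecological sense apart from the legal one, so one of the two
# hits is bound to be irrelevant to any given searcher.
print(len(hits))
```

This is precisely the failure described above: 'difference' at the level of word forms is captured, but 'meaning as substance' is not.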

It becomes necessary, at this point, to discuss two conventional modes of information interchange that academics normally resort to in satisfying their academic needs. This will help us understand what it means to satisfy such needs in the context of the Web environment. Simply put, these two modes are speech and writing.

Information interchange through spoken discourse has the following features. To begin with, it may take the form of a lecture, a seminar discussion, or an academic debate or argument, to mention only a few, and it generally involves two or more interlocutors and / or potential participants, operating face-to-face, and in real time. These may be teachers, scholars, students, researchers, or anyone who has a genuine urge to interact academically. The spoken mode may be used to transmit information or knowledge, to justify an academic stand taken or to question one, to seek a clarification or to give one, to deduce a conclusion from a set of premises using any of the recognized reasoning processes, or to demolish an argument. It is not uncommon to find situations where the interlocutors begin to switch roles, the seeker becoming the giver and vice versa. But most importantly, such interactions are often aided by the schemata that the interlocutors operate, with the result that the interlocutors assume, on the part of each other, different degrees of background knowledge relating to the topic of discussion. Where there seems to be doubt about the background knowledge, clarifications are sought and given, so that everyone concerned soon begins to use the language naturally and economically, giving and gaining more and more information about the topic of discussion. All these are genuine concerns of Language for Academic Purposes.

The question is: Can this be simulated in the context of the Web? Can a person use the Internet and enter into an online, real-time, interactive, spoken discourse with the Web, moving naturally and spontaneously from one related academic topic to another, learning and contributing, convincing and getting convinced, and so on? In other words, can the Internet successfully replace, say, a human teacher?

Though the answer is “yes”, one is compelled to express serious reservations about this happening in the immediate future. The C-STAR (Consortium for Speech Translation Advanced Research) project, launched jointly by the Advanced Telecommunications Research Institute of Japan, Carnegie Mellon University (USA), Siemens of Germany, and a few other organizations about ten years ago, has successfully demonstrated the possibility of spontaneous speech-to-speech translation within certain limited domains. Using speech recognition, speech synthesis, and language-to-language translation procedures and techniques, C-STAR has shown, twice during the past decade, that it is possible for speakers of two different languages to interact, online and in real time, through speech. The topics of discussion, however, related to two very highly restricted LSP domains, namely 'conference organization' and 'travel arrangements'. One area where more research is felt to be necessary before greater heights can be reached is 'the change of human behaviour during conversations', resulting probably from the continually modified schema of an interlocutor. This problem is sure to be felt even more keenly when one moves from a relatively restricted LSP domain to a more broad-based EAP, involving the use of language not merely for the exchange of information but also for reasoning, arguing, vigorously establishing an academic position, and so on.

But this is not an impossibility, considering the achievements of C-STAR. What is needed is an inter-institutional collaborative project involving interdisciplinary research, to be carried out in different stages. We will take up the question of the possible shape that such a project may take in Part II of the paper.

In the meantime, readers who would like to know more about any of the points discussed in the paper may profitably consult:

a) Charniak, E. and McDermott, D. (1985). Introduction to Artificial Intelligence. Reading, MA: Addison-Wesley, and / or

b) Rich, E. and Knight, K. (1991). Artificial Intelligence. New Delhi: Tata McGraw-Hill,

for more information.