Differentiating the Regional Communication Journals:

A Computer Assisted Concept Analysis

Timothy Stephen, Associate Professor

Dept. of Language, Literature & Communication

Rensselaer Polytechnic Institute

Troy, New York 12180

phone: 518 887 2443

email: stephen@rpi.edu

This is a working draft. Please contact stephen@rpi.edu for permission to quote or cite.

Differentiating the Regional Journals

Abstract

The journals of the four U.S. regional communication associations, all maintaining equivalent editorial policies, have published jointly more than 2,900 articles since 1970. Computer assisted automated content analysis was employed to study the conceptual structure of the discipline as represented by this literature. Using data from the ComIndex database, the words in article titles were linguistically normalized and filtered to isolate significant concept terms. Cluster analysis was then applied to the transformed data. This procedure identified 12 clusters of concepts, representing areas of significant scholarly interest across the four journals. ANOVA procedures revealed differences between the four journals on 5 of the 12 clusters. Results are considered in light of differences between the journals and the implications of the findings for the role of omnifocus journals in an era of increasing fragmentation in scholarly publishing.

The four U.S. regional communication associations have contributed substantially to the scholarly literature of the field. The journals of the Eastern Communication Association and the Central States Communication Association (``Communication Quarterly" and ``Communication Studies") have been in print for more than 45 volumes and the journals of the Western States Communication Association and the Southern States Communication Association (``Western Journal of Communication" and ``Southern Communication Journal") have been in print for more than 60 volumes. Until the early 1970s, which marked the beginning of a frenzied rate of increase in the launch of new journals devoted to communication, the journals of the four regional associations together represented a significant portion of all major titles devoted solely to the publication of communication scholarship. Only ``Journalism and Mass Communication Quarterly", ``Quarterly Journal of Speech", ``Communication Education", ``Public Opinion Quarterly", ``Communication Monographs", and ``Journal of Communication" have roots as deep as those of the publications of the regional societies. Even focusing exclusively on the discipline's publication output since 1970, the four primary journals of the regional associations have contributed jointly a disproportionately high 10% (i.e., 3,000 of 31,000 articles) of all articles indexed in the current volume of the ComIndex database (Stephen, Harrison, and Silvestre, 1998), which covers 70 communication-related publications since 1970.

It is fair to inquire, however, about the status of these journals. What differentiates them from other communication journals and what differentiates them from each other? In a majority of cases journals are inaugurated to facilitate publication of content in particular areas of scholarship (e.g., ``Health Communication" and ``Communication, Policy and Law"), to advance particular modes of inquiry and theoretical perspectives (e.g., ``Philosophy and Rhetoric", ``Discourse Processes" and ``Critical Studies in Mass Communication"), or to provide outlets for reports from other categories of scholarship such as theory development (e.g., ``Communication Theory"), applied research (e.g., ``Journal of Applied Communication Research"), or brief or preliminary studies (e.g., ``Communication Reports" and ``Communication Research Reports"). The four regional journals are nearly unique in their construction as omnifocus journals distinctive in their affiliation with regional organizations but not in the substantive direction of their contents. The four journals sustain equivalent editorial policies, encouraging submissions from any methodological and theoretical perspective on human communication studies that meets appropriate standards of quality. Hence, one would expect there to be no meaningful differentiation between the four journals on the basis of topical content. Does it mean anything therefore for an article to be published in one regional journal rather than another? Is there any reason for a scholar to select a particular regional journal for a manuscript submission?

One way in which scholars have attempted to differentiate journals has been through bibliometric citation analysis. Unfortunately, bibliometric studies have relied almost exclusively on databases compiled by the Institute for Scientific Information which, while presumably adequate for analysis of publication practices in some fields, cover only a small, ad hoc collection of titles in communication (Funkhouser, 1996). Of the four regional journals, only ``Southern Communication Journal" has been included consistently, so previous bibliometric studies of communication serials (e.g., Reeves, Byron, and Borgman, 1983 and Rice, Borgman, and Reeves, 1988) provide inadequate information about citation patterns for the four regional titles.

Funkhouser (1996) attempted to overcome this limitation by collecting his own data on citation patterns for the year 1990 for a set of communication titles not covered by ISI, including the regional journals. He concluded that the regionals received more citations in articles published in other journals that year than other titles traditionally covered by ISI. In his sample of 27 communication titles, ``Communication Quarterly" (CQ) ranked 9th, ``Western Journal of Communication" (WJC) ranked 10th, ``Communication Studies" (CS) ranked 13th, and ``Southern Communication Journal" (SCJ) ranked 15th with respect to overall number of citations received. A disciplinary impact ranking, calculated by adjusting for the number of self-citations and for the dispersion of citations across the entire range of journals in Funkhouser's sample, placed CQ in 6th position, WJC in 9th, CS in 12th, and SCJ in 14th. Hence there is no question as to the importance of the regionals to the discipline and, with the exception that for 1990 CQ may have ranked significantly higher than SCJ, no profound basis for discrimination among them with respect to impact.

Citation analysis has played an important role in research on scholarly communication. Methodological procedures for the analysis of co-citation data have a long history in the literature and predominate among formal studies of scientific communication. The technique has been particularly important in its use for generating impact ratings for authors, journals, and articles. However, citation studies cannot illuminate patterns of conceptual interest in the field. For example, while citation studies tell us that in 1990 CQ placed 6th in impact in a sample of 27 titles, such studies are not able to identify what topics or ideas have characterized CQ's publication history and whether those topics and ideas are in any way distinct from topics and ideas covered in WJC, CS, or SCJ. Up to now, little systematic investigation has been undertaken to explore the patterns of conceptual focus that have characterized the communication literature because we have not had an adequate data source and because there has been no methodological precedent.

Small (1986) and Rees-Potter (1989) have approached such questions using variations of citation analysis but the recent introduction of techniques of direct content analysis suitable for automatic application in large databases provides a better alternative. The data sets necessary for meaningful content analysis are extremely large and accumulating continuously as new literature is generated. For this reason dynamic procedures for automated textual analysis are required. Recent developments in computer assisted textual analysis now permit beginning steps to be taken. Perhaps most meaningful among these has been the dissemination of algorithms permitting automated recognition of words derivative of common root forms (see Frakes, 1992). This and the publication of new databases addressing a much wider portion of the communication field's literature than the ISI databases make it practical to begin to undertake such research.

In 1992 the Communication Institute for Online Scholarship began developing its comprehensive bibliographic indexes to communication serials and launched the first volume of the ComIndex database (Stephen, Harrison, and Silvestre, 1992). The first electronic index of the discipline's periodical literature, ComIndex is now in its 7th edition (1998) and provides listings of all regular scholarly articles published in important communication serials including the four regional journals. Book reviews, obituaries, reports of conferences, brief introductory remarks by editors, while included in early volumes of ComIndex, are now excluded.

With indexes to the four regional journals available from 1970 in electronic form in ComIndex, it was possible to undertake a content analysis of the journals' publishing history to map the concepts and ideas that have captured attention in the journals and to explore the question of differentiation.

Method

The data for this study were a combined sample of 2,997 titles of regular research articles published in CS, WJC, SCJ, and CQ appearing in volume 7 of the ComIndex database. For CQ the data included volume 18 (1970) through volume 44 (1996); for CS volume 21 (1970) through volume 46 (1995); for WJC volume 34 (1970) through volume 61 (1997); and for SCJ volume 36 (1970) through volume 62 number 1 (1996). Although it might seem unusual to think of titles of research articles as data for content analysis, there is significant precedent for doing so (for example, Tijssen and Van Raan, 1984). A tradition of scientometric studies has attempted to augment network studies of citation patterns by focusing on substantive assessments of the contents of scientific publications. Whereas citation studies are frequently undertaken to aid in personnel evaluation or in network studies of scholarly communication, the goal of content analytic studies has been to isolate linked concepts and to produce conceptual maps of a discipline's works. The normal practice is to use a quantitative grouping procedure (usually hierarchical cluster analysis) on document titles after they have been linguistically normalized and filtered to remove common terms.

With relatively few exceptions, titles of articles in the communication literature are constructed tightly and feature the major concepts an article addresses. According to the Publication Manual of the American Psychological Association (1994, p. 7), ``A title should summarize the main idea of the paper ... It should be a concise statement of the main topic and should identify the actual variables or theoretical issues under investigation and the relationship between them." Fortunately, adherence to this convention has characterized the regional journals, making it possible to perform meaningful content analyses that help in identifying the substantive issues they have addressed. There are certainly cases of titles that have not conformed to this requirement (e.g., Medhurst's (1982) ``The sword of division"); however, they are clearly exceptions.

In simple computer-assisted content analysis, a series of routine procedures is applied to normalize the textual data. In the present analysis these initial steps included conversion of all words to upper case text, conversion of terms with British spelling to their American equivalents, and removal of all punctuation and most hyphenated prefixes (e.g., ``micro-", ``quasi-", ``pseudo-"). Next, removal of common terms was accomplished by parsing data into words and eliminating all words that have been designated as common or otherwise undesirable. Typically a dictionary of common terms (alternatively referred to as a ``stop list" or ``exclusion list") contains words such as: ``about", ``and", ``a", ``as", ``the", ``of", ``but", ``or", ``else", ``there", etc. When applied to a title these procedures would reduce ``The role of nonverbal behaviors in modifying expectancies during initial encounters" to ``NONVERBAL BEHAVIORS MODIFYING EXPECTANCIES INITIAL ENCOUNTERS".
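The normalization and stop-list steps just described might be sketched as follows. This is a minimal illustration in Python and not the software used in the study: the stop list, the spelling table, and the prefix list are tiny hypothetical stand-ins for the much larger dictionaries the text describes, and replacing punctuation with a space (rather than deleting it) is an assumption. ROLE is placed in the stop list only so that the sketch reproduces the example title above.

```python
import re

# Hypothetical, heavily abbreviated dictionaries (assumptions for illustration).
STOP_WORDS = {"A", "ABOUT", "AND", "AS", "BUT", "DURING", "ELSE", "IN",
              "OF", "OR", "ROLE", "THE", "THERE"}
BRITISH_TO_AMERICAN = {"BEHAVIOUR": "BEHAVIOR", "BEHAVIOURS": "BEHAVIORS",
                       "COLOUR": "COLOR"}
PREFIXES = ("MICRO", "QUASI", "PSEUDO")


def normalize_title(title):
    """Return the list of upper-case words left after normalization."""
    text = title.upper()
    # Drop the listed hyphenated prefixes, e.g. "QUASI-EXPERIMENTAL".
    text = re.sub(r"\b(?:%s)-" % "|".join(PREFIXES), "", text)
    # Replace any remaining punctuation with a space (assumed behavior).
    text = re.sub(r"[^A-Z0-9\s]", " ", text)
    # Convert British spellings, then drop stop words.
    words = [BRITISH_TO_AMERICAN.get(w, w) for w in text.split()]
    return [w for w in words if w not in STOP_WORDS]


example = normalize_title(
    "The role of nonverbal behaviors in modifying expectancies "
    "during initial encounters"
)
print(" ".join(example))
```

Keeping the dictionaries as plain data makes it easy to rerun the whole corpus each time the stop list is revised.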

As the present study is part of a larger project working with the full 70 title ComIndex database, extensive preliminary analysis of word frequencies and use-in-context across the entire ComIndex corpus permitted the construction of an expanded and more finely articulated dictionary of common terms. Hence the stop list applied in the present study contains in addition to the approximately 200 items occurring generically in stop lists, many hundreds of additional terms that are appropriate to remove in the context of analysis of the scholarly literature of the communication field. These include items such as forenames (e.g., ``Wilbur" in ``Wilbur Schramm"), terms with indiscriminate meaning (e.g., ``resume", referring, without a diacritical mark, either to a professional credential or to the action of restarting), and terms with both high frequency and indeterminate meaning (such as: ``communication", ``analyze", ``comparison", ``behavior", ``research", ``study", ``investigation", etc.). Applying the expanded stop list dictionary reduces the example title to ``NONVERBAL EXPECTANCIES INITIAL ENCOUNTERS".

The next stage in the analysis applied a synonym dictionary to equate terms with equivalent meaning and to reduce common phrases to single tokens. Over 800 synonym transformations were identified from the 70-title ComIndex corpus and used to transform the data. Examples of such transformations include conversion of ``HIV" to ``AIDS"; ``know", ``knowing", and ``knowledgeable" to ``knowledge"; and ``movies", ``movie", ``cinematography", and ``film" to ``cinema". Several hundred common phrases were also identified and used to normalize the article titles. For example, if ``acquired immune deficiency syndrome" appeared in a title, the phrase was transformed to ``AIDS"; similarly, any occurrence of ``motion picture" was transformed to ``cinema". In addition, phrases such as ``spiral of silence", ``black box", ``content analysis", ``virtual reality", ``self disclosure" and ``college student" were converted to single tokens: ``SPIRALOFSILENCE", ``BLACKBOX", ``CONTENTANALYSIS", ``VIRTUALREALITY", ``SELFDISCLOSURE", and ``COLLEGESTUDENT".
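A synonym and phrase dictionary of this kind can be illustrated with a short Python sketch. The entries below are taken from the examples in the text, but the function name, the longest-match-first rule, and the assumption that phrases are matched before stop words such as ``of" are removed are all choices made here for illustration, not details confirmed by the study.

```python
# Phrase table: tuples of consecutive words mapped to one token.
PHRASES = {
    ("ACQUIRED", "IMMUNE", "DEFICIENCY", "SYNDROME"): "AIDS",
    ("MOTION", "PICTURE"): "CINEMA",
    ("SPIRAL", "OF", "SILENCE"): "SPIRALOFSILENCE",
    ("CONTENT", "ANALYSIS"): "CONTENTANALYSIS",
    ("SELF", "DISCLOSURE"): "SELFDISCLOSURE",
}
# Synonym table: single words mapped to a canonical term.
SYNONYMS = {
    "HIV": "AIDS",
    "KNOW": "KNOWLEDGE", "KNOWING": "KNOWLEDGE", "KNOWLEDGEABLE": "KNOWLEDGE",
    "MOVIE": "CINEMA", "MOVIES": "CINEMA",
    "CINEMATOGRAPHY": "CINEMA", "FILM": "CINEMA",
}


def apply_dictionaries(words):
    """Replace known phrases (longest match first), then single synonyms."""
    longest = max(len(phrase) for phrase in PHRASES)
    out, i = [], 0
    while i < len(words):
        for n in range(min(longest, len(words) - i), 1, -1):
            token = PHRASES.get(tuple(words[i:i + n]))
            if token is not None:
                out.append(token)
                i += n
                break
        else:
            # No phrase starts here; fall back to the synonym table.
            out.append(SYNONYMS.get(words[i], words[i]))
            i += 1
    return out


print(apply_dictionaries("THE SPIRAL OF SILENCE IN MOTION PICTURE AUDIENCES".split()))
```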

Term refinement was an iterative process that required passing the textual data repeatedly through a set of custom computer programs that applied each of the filters and then generated a transformed term list ordered by term frequency. Each term in this resulting list (generally from 10,000 to 14,000 terms in length) was then examined to see if it represented a unique and discriminating concept with as little ambiguity in its associations as possible. This was done with the assistance of the ComIndex database product, which makes it possible to conveniently review the conceptual context of any term, displaying every context of expression for every instance in which a term has been used in titles in the communication literature.

The next step in the data preparation stage was the computerized application of Porter's suffix stripping procedure (Porter, 1980) to all the terms remaining in the corpus. Porter's procedure is widely used in information retrieval applications to convert terms with common linguistic roots to equivalent tokens (Frakes, 1992). The Porter procedure, for example, reduces ``organize", ``organizational", and ``organization" to the common token ``organ". A Porterization of the example SCJ article title ``NONVERBAL EXPECTANCIES INITIAL ENCOUNTERS" yields ``NONVERB EXPECT INITI ENCOUNT". Porterization is extremely useful in computerized textual analysis but the output cannot be applied without careful and extensive manual correction and some pre-coded tokenization designed to handle circumstances not handled properly by the Porter procedure. The Porter procedure, for example, is not capable of recognizing that ``BURKE" is the root of ``BURKEAN", that ``FACIAL" should be reduced to ``FACE", or that it is undesirable to reduce ``PROTESTANT" in analysis of the communication literature to the semantically distinct root token ``PROTEST". These and several hundred additional special cases were identified and handled in a computer automated preprocessing stage using software constructed by the author. The resulting reduced list of tokens, one list per article title, was the input to subsequent analyses. 1
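The idea of consulting a table of special cases before handing a term to a generic stemmer can be sketched in Python as below. The `toy_stemmer` function is a deliberately crude placeholder and is not Porter's algorithm; it exists only so the example runs on its own, and a real Porter implementation could be passed in its place. The override table holds the three cases named in the text; the function names are hypothetical.

```python
# Special cases named in the text; checked before any stemming is attempted.
OVERRIDES = {
    "BURKEAN": "BURKE",
    "FACIAL": "FACE",
    "PROTESTANT": "PROTESTANT",  # keep distinct from PROTEST
}


def toy_stemmer(word):
    """Crude suffix stripper used as a stand-in. NOT Porter's algorithm."""
    for suffix in ("ANT", "IAL", "EAN", "S"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[:-len(suffix)]
    return word


def stem_with_overrides(tokens, stemmer, overrides):
    """Use the override table when a token is listed, else the stemmer."""
    return [overrides[t] if t in overrides else stemmer(t) for t in tokens]


# Without the table the placeholder collapses PROTESTANT into PROTEST.
print(toy_stemmer("PROTESTANT"))
print(stem_with_overrides(
    ["PROTESTANT", "BURKEAN", "FACIAL", "ENCOUNTERS"], toy_stemmer, OVERRIDES))
```

Passing the stemmer as an argument keeps the correction table independent of whichever stemming routine is actually used.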

Results

The first step in the analysis involved examining the frequencies with which tokens appeared in the data set. The data consisted of 3,109 unique tokens, 1,339 of which occurred more than one time (for example, the token ``IMAGIN" - representing ``imagination", ``imagined", ``imaginative", and ``imagining" - occurred 6 times) and 1,770 of which occurred only once. The average title contained 4.37 tokens (sd = 1.92). The 100 most frequently occurring tokens accounted for 37% of all observations. The next 100 most frequently occurring tokens accounted for only an additional 13% of observations and the third 100 most frequently occurring tokens accounted for an additional 8% of observations. Additional tokens accounted for progressively smaller proportions. Clearly a small set of high frequency tokens accounted for the lion's share of all tokens observed. The 25 tokens with the highest observed frequency are listed in Table 1.
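Descriptive figures of this kind (unique tokens, tokens occurring once, mean tokens per title, and the share of observations covered by each successive block of ranked tokens) can be computed along the following lines. This is a sketch in Python under the assumption that each title is already a list of tokens; the function name and the toy data are hypothetical.

```python
from collections import Counter


def frequency_profile(titles, block=100):
    """Summarize token frequencies for a list of tokenized titles."""
    counts = Counter(tok for title in titles for tok in title)
    total = sum(counts.values())
    # Frequencies in descending order of rank.
    ranked = [n for _, n in counts.most_common()]
    shares = [sum(ranked[i:i + block]) / total
              for i in range(0, len(ranked), block)]
    return {
        "unique": len(counts),
        "once": sum(1 for n in counts.values() if n == 1),
        "mean_per_title": total / len(titles),
        "share_by_block": shares,
    }


# Toy data; with block=1 each "block" is a single ranked token.
toy_titles = [["A", "B"], ["A", "C"], ["A", "B"]]
profile = frequency_profile(toy_titles, block=1)
print(profile)
```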

Next, relationships between the concepts appearing in the dataset were mapped using WordStat (Provalis Research, 1998) to assist in the study of patterns of co-occurrence within the tokenized titles. WordStat provides a computer assisted environment for analyzing patterns of co-occurrence through hierarchical cluster analysis. Cluster analysis is performed on a co-occurrence matrix computed using the Jaccard similarity/distance measure popular in document analysis. The Jaccard measure is a ratio computed by dividing the number of co-occurrences of two tokens by the number of co-occurrences plus the number of singular occurrences (one token but not the other); simultaneous nonoccurrences (neither token present) do not enter the computation.
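For reference, the Jaccard coefficient in its standard form is a / (a + b + c), where a is the number of titles containing both tokens and b and c are the numbers containing only one of them; titles containing neither token are ignored. A minimal Python sketch, assuming each title is represented as a set of tokens (the function name and toy data are hypothetical):

```python
def jaccard(titles, x, y):
    """Standard Jaccard similarity between tokens x and y over a corpus."""
    both = sum(1 for t in titles if x in t and y in t)
    either = sum(1 for t in titles if x in t or y in t)
    # Titles containing neither token contribute to neither count.
    return both / either if either else 0.0


toy_titles = [
    {"CHILD", "TELEVISI"},
    {"CHILD", "TELEVISI", "VIEW"},
    {"CHILD"},
    {"MEDIA"},
]
print(jaccard(toy_titles, "CHILD", "TELEVISI"))
```

The corresponding distance is simply one minus this value.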

As in other qualitative data analysis procedures, productive textual analysis is guided not as much by application of hard canonical decision rules as it is by (a) the apparent coherence and thematic consistency of a particular interpretation and (b) the application of ``loose" statistical guidelines, such as selecting cluster solutions that optimize homogeneously sized clusters or that are formed at a jump point in the agglomeration index (Lebart, Salem, and Berry, 1998). In this manner a range of cluster solutions was applied until a result was obtained that met two goals: (a) to maximize thematic consistency (terms should group in meaningful and self consistent categories) and (b) to minimize the number of article titles left unclassified (i.e., comprised solely of tokens that were not represented within the classification system).
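One of the ``loose" guidelines mentioned here, cutting the hierarchy at a jump in the agglomeration index, can be illustrated with a small pure-Python sketch. Average linkage is assumed purely for illustration, since the linkage rule applied in the study is not stated, and the similarity values below are invented toy numbers.

```python
from itertools import combinations


def agglomerate(items, similarity):
    """Average-linkage clustering; returns (cluster_a, cluster_b, level)."""
    def avg_sim(a, b):
        return sum(similarity(x, y) for x in a for y in b) / (len(a) * len(b))

    clusters = [frozenset([item]) for item in items]
    history = []
    while len(clusters) > 1:
        # Merge the pair of clusters with the highest average similarity.
        a, b = max(combinations(clusters, 2), key=lambda pair: avg_sim(*pair))
        history.append((a, b, avg_sim(a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a | b]
    return history


def merges_before_largest_jump(history):
    """Number of merges to keep before the biggest drop in merge level."""
    levels = [level for _, _, level in history]
    drops = [levels[i] - levels[i + 1] for i in range(len(levels) - 1)]
    return drops.index(max(drops)) + 1


# Invented similarities for four tokens (assumption, not study data).
SIM = {frozenset(("CHILD", "TELEVISI")): 0.6, frozenset(("THEATRE", "ART")): 0.5}
history = agglomerate(
    ["THEATRE", "ART", "CHILD", "TELEVISI"],
    lambda x, y: SIM.get(frozenset((x, y)), 0.05),
)
print(merges_before_largest_jump(history))
```

Such a rule only proposes a cut; as the text notes, the final choice also rests on the thematic coherence of the resulting groups.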

For this analysis optimal results were obtained by limiting the number of tokens included in the clustering to only those that occurred at least 20 times in the data set. As noted by Lebart, Salem, and Berry (1998), limiting the analysis in this way may aid interpretation without undue risk of misrepresenting the structure of the larger textual dataset. This set of 123 tokens accounted for 5,356 of the 12,984 occurrences of tokens, or 41% of all token occurrences. Eighty-four percent of all article titles contained at least one of these high frequency tokens. After this initial identification, some of the high frequency tokens were converted to ``maxiterms" by blending in highly interrelated, low frequency tokens that otherwise would have been left out of the cluster process. Hence the maxiterm RELIGION was formed by adding to the moderately high frequency token RELIGION a range of tokens of low frequency that were clearly related. These included the terms theological, evangelical, preach, sermon, reverend, Mormon, Protestant, Christian, and pulpit. In the following presentation of results tokens and maxiterms are listed in upper case. In most cases the token or maxiterm represents a broader group of terms and where the sense of the token may not be apparent one or more examples are given in parentheses. 2
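The two steps described in this paragraph, selecting tokens at or above a frequency threshold and then folding related low frequency tokens into a maxiterm, might be sketched in Python as follows. The function name, the toy corpus, and the unstemmed spellings in the maxiterm table are hypothetical; the threshold is lowered to 2 so the small example has something to keep.

```python
from collections import Counter


def select_tokens(titles, maxiterms, min_count=20):
    """Keep high frequency tokens, mapping listed rare tokens to maxiterms."""
    counts = Counter(tok for title in titles for tok in title)
    keep = {tok for tok, n in counts.items() if n >= min_count}
    reduced = []
    for title in titles:
        # Rare tokens listed in the table are folded into their maxiterm.
        mapped = (maxiterms.get(tok, tok) for tok in title)
        reduced.append([tok for tok in mapped if tok in keep])
    # Share of titles with at least one retained token.
    classified = sum(1 for title in reduced if title) / len(titles)
    return keep, reduced, classified


toy_titles = [["RELIGION", "RHETOR"], ["RELIGION", "WAR"],
              ["SERMON", "RHETOR"], ["PULPIT"], ["ART"]]
toy_maxiterms = {"SERMON": "RELIGION", "PULPIT": "RELIGION"}
keep, reduced, classified = select_tokens(toy_titles, toy_maxiterms, min_count=2)
print(sorted(keep), classified)
```

The last value corresponds to the proportion of titles left classified, the quantity the analysis sought to keep high.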

Hierarchical cluster analysis was performed to probe for structure in the matrix of co-occurrences. The WordStat cluster procedure partitioned the data into three high order clusters, two of which were comprised of meaningful subclusters, and an isolated group containing no other subclusters, which simply joined the tokens THEATRE and ART. These two tokens did not occur with particularly high frequency (21 occurrences for THEATRE and 22 for ART) but did tend to co-occur consistently and uniquely and so were isolated in the hierarchical cluster analysis at its highest level of structural abstraction. This suggests that while there has been interest in these areas within the literature of the four journals, studies addressing these topics have rarely drawn conceptual links to other areas of research included in the journals and so stand apart in the literature.

The two remaining high level clusters were structurally complex, segmenting the tokens into large groups of concepts representing on the one hand media studies, rhetoric, and public communication and on the other communication in the contexts of interpersonal, group, education, and organizations. A third, very small group of tokens was also located at the same level of the hierarchical solution. This group contained the tokens CONSTRUCT, SYSTEM, SYMBOL, and ETHIC. The association of the first three of these tokens derives from a number of studies addressing symbolic construction, construct systems, and symbol systems. The attachment of ETHIC to this cluster appeared to be somewhat arbitrary as indicated by a very lengthy stem connecting this term to the others in the set. The remaining two high level clusters accounted for the majority of the 123 tokens and maxiterms examined in this study.

Media, Rhetoric, and Public Communication

This large division consisted of 45 tokens and maxiterms (36% of the entire set of 123) divided into four moderately sized subclusters. It is possible and indeed often instructive to consider the cluster solution at finer levels of resolution, and simple visual inspection of the dendrogram segments frequently yields further insight into the structure of concept relationships; however, it was analytically valuable to retain a broader focus in this study and so the treatment that follows rarely bores into the hierarchical structure deeper than level 3 (that is, a cluster within a cluster within a cluster). At the most general level the theme underlying the entire high order cluster is communication (or rhetoric) in the institutional and public sphere. This divides in an interesting way into two large multifaceted clusters of tokens, one with three subdivisions treating American history, politics, social conflict and Burkean rhetoric, and one emphasizing mass media.

The first two subdivisions connect concepts in political communication with those in rhetoric (see Figure 1). The first of these subdivisions, cluster 1.1.1.1 is a large group of tokens with several references to American politics. At the top the token NATION (national) combines with ADDRESS and this pair of tokens combines quickly with REAGAN, MYTH, and CRISIS. Added to this are the tokens PRESID (president, presidential, presidency), POLIT (political, politics, politicized, politically), CAMPAIGN, DEBAT (debate), DISCOURS (discourse), and PUBLIC. Finally, and quite late to merge, the pair BLACK (a maxiterm consisting of black, negro, AfricanAmerican, blacks, and AfricanAmericans) and IMAG (image) is added.

Cluster 1.1.1.2 treats rhetoric, oratory and American history. The duet AMERICA and CULTUR (culture, cultural, intracultural) combines with LAW and ORATORY. To this are joined the duets CRITIC (criticism, critic, criticizing) and RHETOR (rhetoric, rhetorical), HISTOR (historical, history, historians) and MOVEM (movements), and SOUTH (south, southern) and WAR.

Cluster 1.1.2 represents Burkean rhetorical analysis in connection with terms representing studies of feminism and religion and with two methodological concepts (case studies, and narrative) that suggest qualitative approaches to scholarship in this area (see Figure 2). BURKE, FORM, and DRAMA combine with CASESTUDI (case study), CONFRONT (confrontation), FEMIN (feminism), and the maxiterm RELIGION. Also included is the quartet FUNCTION, METAPHOR, NARR (narrative), and VALU (values).

Cluster 1.2 is a group of 10 tokens referencing media and television (see Figure 3). A somewhat similar cluster was prominent in a content analysis of ``Human Communication Research" (xxxxx, in press). This cluster joins the tokens AUDIENC (audience) and the maxiterm NEWSREPORTING (press, newspaper, news) to the duet consisting of CHILD (children) and TELEVISI (television). This group in turn is connected to the duet consisting of CONCEPT and VIEW (viewing). The final set of tokens in this cluster consists of CRITICAL (critical), KNOWLEDG (knowledge), MEDIA, and TRADITION.

Education, Performance, and Interpersonal/Organizational Communication

The second general partition is the largest, containing 72 tokens and maxiterms (59% of the high frequency terms) divided initially into three coherent high level divisions, two of which were structurally complex. The three highest order divisions appeared to represent the educational context, the interpersonal/group/organizational context, and the context of persuasion.

Cluster 2.1.1 consists of 14 tokens strongly and coherently connected to the theme of education. These tokens connect studies of communication apprehension and speaking anxiety as represented in the tokens ANXIETI (anxiety), PUBLICSPEAKING, and APPREHEN (apprehension) to the token CLASSROOM. The tokens COLLEG (college, university), COMMUN (community), TRAIN (training), STUDENT, and TEACHER comprise another coherent subsection of this cluster. Also joining are COURSE, METHOD, TEACH (teaching, instruction) and EDUC (education) and SPEECH.

Cluster 2.1.2 connects other concepts in communication-as-performance to the educational context of cluster 2.1.1. The duet INTERPRET (interpretation, interpretive) and ORAL is attached to PERFORM (performance, performers) and joined to the duet SPEAK (speaking) and WRIT (writing).

Analysis shifts now to the large group of tokens comprising cluster 2.2.1. This set of 45 tokens ties together concepts from the interpersonal, group, and organizational contexts in three complexly structured but loosely themed divisions. The patterns of connection among the terms in the three divisions of cluster 2.2.1 suggest considerable overlap in concepts of interest in the interpersonal, group, and organizational areas. Lines of division are less clearly themed than was the case in the preceding clusters.

The first division, cluster 2.2.1.1, is suggestive of processes of group formation and development. Although the within-cluster structure is complex, the connections of the quartet GROUP, DECISION, DISCUS (discussion), and LEADERSHIP to the trio VERBAL, ARGUM (argument), and PREDICTOR are evocative of themes of group development. As well, connections between the duets COGNITION and PROCESS, ORGANIZE (organization, organizing) and MODEL, and INFLU (influence) and STRUCTUR (structure) are sensible within this interpretation as is the connection of the trio INITI (initial), INTERAC (interaction), and NONVERB (nonverbal). Overall, the interrelationship of these concepts connects group formation and development to organizational structure and processes of initial interaction and influence. As can be seen in Figure 5, the token COMPLIANCEGAIN (compliance gaining, compliance), is only loosely connected to the rest of cluster 2.2.1.1. Studies of compliance gaining are not irrelevant to the context but represent a sufficiently strong presence in the literature to almost comprise a category of their own (as was the case examined above with studies of symbolic construction). The lengthy stem connects the token to the remainder of cluster 2.2.1.1 reluctantly.

Cluster 2.2.1.2 consists of 5 tokens organized in two subdivisions that were slow to join. The first consists of the duet COMPETENCE and MEASUR (measurement) and the second consists of the trio RELATIONAL, CONTROL, and CONVERS (conversation, conversational). Though loosely themed, the tokens in this cluster are predominantly from the interpersonal area.

Cluster 2.2.1.3 is the largest examined in the study and has three apparent divisions. First is a grouping of tokens clearly representing sex differences. The duet ATTITUD (attitudes) and LANGUAG (language) attaches to the trio SEX, DIFFER (differences), and GENDER, which connects in turn to the trio MEN (men, male), WOMEN (women, female) and RIGHT (rights). It is interesting that the related token representing feminism connected earlier to the public/rhetorical/political division of the clustering and not here.

The second division of cluster 2.2.1.3 is suggestive of dyadic interaction and the many aspects in which it has been addressed in the literature. The duet INTERPERSON (interpersonal) and CONFLICT connects to the duet PERCEP (perception, perceived, perceptual) and RELATIONSHIPS, to the trio MANAG (manage, management), STYL (style), and POWER, and to the trio ROLE, SATISFAC (satisfaction), and SUBORDIN (subordinate). The token SELF is loosely attached to the preceding structure, appearing to do almost as well as a standalone category. Analysis of the trio DECEIV (deception, deceiver, lies, liar), SELFDISCLOSUR (self disclosure) and MOTIV (motivation, motive), comprising the third division of cluster 2.2.1.3, reveals that this is more an ``overflow" category than a coherent subsystem of concepts.

The final section of this partition - cluster 2.3 - consists of 8 tokens that appear to represent the persuasion area, particularly as it appears in classic studies of persuasion, credibility, and message effects. The tokens representing this area are CREDIBL (credibility), SPEAKER, SOURC (source) and the duet MESSAG (messages) and PERSUASION. Attached to this group are the tokens EFFECTIVEX (effectiveness), EXPERIMENT (experimental, experiment), and LISTEN (listen, listening, and listener).

Differentiating the Journals

The foregoing analysis identified 12 substantive clusters. The question of differentiation among the four journals was considered next. Each title of each journal was given a score on each of the twelve clusters by counting the number of tokens that were present in the title for each cluster. The set of cluster scores was then submitted to a one-way multivariate analysis of variance with journal of origin as the independent variable to determine if there were differences between the journals in the extent to which they tended to represent the 12 clusters in their publication histories. Wilks' criterion indicated an overall significant difference between the journals on the cluster scores (F(36, 8811) = 5.10, p < .001). On this basis univariate differences were explored on the 12 individual cluster scores.
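The scoring rule described here, counting for each title the tokens that belong to each cluster, and the per-journal means that the later comparisons report can be sketched in Python as below. The cluster memberships shown are abbreviated and the records are invented toy data; function names are hypothetical.

```python
from collections import defaultdict


def cluster_scores(tokens, clusters):
    """Score one title: number of its tokens falling in each cluster."""
    return {name: sum(1 for tok in tokens if tok in members)
            for name, members in clusters.items()}


def mean_scores_by_journal(records, clusters):
    """Average cluster scores over titles, separately for each journal."""
    totals = defaultdict(lambda: defaultdict(int))
    n_titles = defaultdict(int)
    for journal, tokens in records:
        n_titles[journal] += 1
        for name, score in cluster_scores(tokens, clusters).items():
            totals[journal][name] += score
    return {journal: {name: totals[journal][name] / n_titles[journal]
                      for name in clusters}
            for journal in n_titles}


# Abbreviated cluster memberships and invented titles (assumptions).
toy_clusters = {"2.1.1": {"ANXIETI", "APPREHEN", "CLASSROOM"},
                "2.3": {"CREDIBL", "PERSUASION"}}
toy_records = [("CQ", ["APPREHEN", "CLASSROOM"]),
               ("CQ", ["PERSUASION"]),
               ("SCJ", ["WAR"])]
means = mean_scores_by_journal(toy_records, toy_clusters)
print(means)
```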

There were significant differences (i.e., p < .05 or better) between the journals on five of the twelve clusters as indicated in significant univariate Fs. Post hoc comparisons were conducted to test for differences between the journals on these five clusters. The Scheffé test was used and all results reported were significant at the .05 level or better. Cluster 1.1.1.2, which represented themes in rhetorical criticism, oratory, social movement studies, the Southern experience and the Civil War, received significantly higher scores by SCJ (.43) over CQ (.25) or WJC (.30) and significantly higher scores by CS over CQ (.25). Cluster 1.1.2, representing Burkean rhetoric, feminist studies, narrative, case studies, drama, and religion, received significantly higher scores by SCJ (.15) than by WJC (.08). Cluster 2.1.1, representing themes in communication apprehension and communication education, received significantly higher scores by CQ (.33) over CS (.21), SCJ (.15), and WJC (.15). Cluster 2.2.1.2, a loosely themed cluster related to interpersonal communication, consisting of the tokens COMPETENCE, MEASUR, CONTROL, RELATIONAL, and CONVERS, was rated significantly higher by both CQ (.08) and WJC (.08) than by SCJ (.05) and CS (.03). On cluster 2.2.1.3, a cluster representing predominantly interpersonal concepts, there were significantly higher mean scores on CQ (.42) and WJC (.33) over CS (.18) and SCJ (.22).
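The univariate F ratio underlying each of these follow-up tests is the between-journal mean square divided by the within-journal mean square. A minimal Python sketch of that computation is given below; the scores are invented toy values, the function name is hypothetical, and the Scheffé post hoc comparisons are not reproduced here.

```python
def one_way_f(groups):
    """One-way ANOVA F ratio and its degrees of freedom."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    # Between-groups sum of squares: group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Within-groups sum of squares: scores around their own group mean.
    ss_within = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    f_ratio = (ss_between / (k - 1)) / (ss_within / (n - k))
    return f_ratio, (k - 1, n - k)


# Invented cluster scores for three hypothetical journals.
toy_groups = [[0, 1, 0, 1], [1, 2, 1, 2], [2, 3, 2, 3]]
f_ratio, df = one_way_f(toy_groups)
print(f_ratio, df)
```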

Discussion

There were two purposes to the foregoing analyses: first, to survey and begin to map the conceptual interests of the regional journals and, second, to determine if there have been meaningful differences between the journals in their publication practices.

Surveying Conceptual Interests

The clustering technique provides an abstract portrait of the literature that is internally self-consistent, with tokens grouping into meaningful clusters. Connections between terms were sensibly thematic, not arbitrary, and, with allowance for an inevitable degree of ``looseness" in the associations of elements at the deepest levels of substructure, this was generally so across all levels of the analysis. Not only were the larger formations of tokens meaningfully grouped, but exploration of substructure within larger clusters often revealed sensible subgrouping as well. That this was so suggests that the language scholars have used in titles to express the focus of their studies has juxtaposed concepts in ways that are generally nonarbitrary and predictable. For example, the technique of this study identified not only that the concept ``anxiety" was a frequent focus of scholarly concern but also that it often appears in connection with ``public speaking" and that it is a concept most appropriately located in the context of scholarship in the area of communication education, which is itself more accurately placed in the company of studies of interpersonal and group phenomena than of rhetoric, media, and public communication. Externally, the structure revealed in this study seems generally consistent with the divisional structures of the scholarly societies that publish the four journals.

All four societies have recently dropped the term ``speech" from their organizational names and from the names of their journals in favor of the more general term ``communication". This suggests a broadening of interests and a realignment of the field. Perhaps in time this will be demonstrated in the literature; for now, however, the structure discovered in this analysis is strongly suggestive of traditional speech communication studies as this area has been manifest in American universities since the 1970s. The structure reflects the dual approaches of rhetorical and social scientific scholars; the field's concerns with public discourse, interpersonal and organizational matters, education, and oral performance; and its intense interest in specific phenomena such as gender differences, anxiety, Kenneth Burke's rhetorical theories, social movements, and presidential campaigns. Consistent with traditional configurations of the speech communication curriculum, however, it reflects only minor attention to mass media, to journalism, to public relations, to linguistics, to language acquisition, or to the written word, and none at all to nonhuman communication (e.g., animal communication or machine communication).

The procedures employed in this study relied on custom software systems that extensively filtered, parsed, and otherwise normalized the set of terms used in the cluster process. Unfortunately, there is no extant statistical system that can accomplish such an analysis on unfiltered language. Work by Lambert (1996), for example, demonstrates that attempts to cluster words from transcripts of natural dialogue tend to produce clusters based on parts of speech that merely appear frequently, rather than those parts of an exchange in which critical meaning is conveyed. In the present study, simply passing the unfiltered titles through the analysis would have produced clusters based on words of extremely high frequency and either indiscriminate meaning or peripheral interest. In order to operate at the level of meaningful conceptual relationships, terms of peripheral interest must be stripped away, phrases reduced to tokens, synonyms resolved, and rules applied for recognizing and separating terms of similar appearance but important differences in meaning (e.g., ``criticism" and ``critical").
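The kind of rule-based reduction described here can be sketched briefly in Python. This is a toy illustration under stated assumptions, not the custom software used in the study: the function name `normalize_title`, the stop-word list, and the CRITICAL token are hypothetical, and the rule tables are a tiny stand-in for the study's much larger rule set. The CRITIC forms and joined phrases follow Table 1, and the expected result for the sample title follows Footnote 1.

```python
import re

# Phrases reduced to single terms before tokenizing (joined forms as in Table 1).
PHRASES = {
    "mass media": "massmedia",
    "case study": "casestudy",
    "united states": "unitedstates",
}

# Hypothetical list of terms of peripheral interest to strip away.
STOPWORDS = {"a", "an", "and", "do", "in", "not", "of", "the", "use"}

# Variant forms resolved to a single token. "critical" is deliberately kept
# apart from the CRITIC token; the name CRITICAL is an assumption.
TOKEN_MAP = {
    "massmedia": "MASSMEDIA",
    "interpersonal": "INTERPERSON",
    "channel": "CHANNEL",
    "channels": "CHANNEL",
    "criticism": "CRITIC",
    "critic": "CRITIC",
    "critics": "CRITIC",
    "critical": "CRITICAL",
}

def normalize_title(title):
    """Reduce a raw title to a space-separated string of concept tokens."""
    text = title.lower()
    # 1. Reduce known phrases to single terms.
    for phrase, joined in PHRASES.items():
        text = re.sub(r"\b" + re.escape(phrase) + r"\b", joined, text)
    tokens = []
    for word in re.findall(r"[a-z]+", text):
        # 2. Strip terms of peripheral interest.
        if word in STOPWORDS:
            continue
        # 3. Resolve variants to tokens; words without a rule are kept
        #    in upper case (an assumption of this sketch).
        tokens.append(TOKEN_MAP.get(word, word.upper()))
    return " ".join(tokens)

print(normalize_title("Mass Media Do Not Use Interpersonal Channels"))
# MASSMEDIA INTERPERSON CHANNEL
```

Explicit lookup tables keep each decision inspectable, which matters when the rules are refined from one study to the next. A general-purpose stemmer would be shorter to write but could merge terms such as ``criticism" and ``critical" that need to remain distinct.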

In the end, the set of raw titles in this study could be reduced and clustered on an ordinary home computer in automated processes that took less than 15 minutes to execute. However, this was possible only after more than a year's study of the dataset and the derivation and expression in software of an extensive set of parsing rules. Indeed, the large system of rules is continually evolving. This study is the second in a broader project investigating the entire 70-title ComIndex dataset that aims, ultimately, to build a software system that will permit automatic classification of the communication field's literature. Each new study in this project leads to further refinement of the rule set. Eventually, the incorporation of new titles in the literature will necessitate change as well. For example, though the field has recently shown growing interest in health communication and computer mediated communication, these areas have not yet generated articles in sufficient numbers in the four regional journals to obtain a place in the cluster structure. This may easily change over time.

Differentiating the Journals

Inasmuch as the four journals express equivalent editorial policies, it seemed reasonable to anticipate that there would be no significant differences between them on the basis of their publication history. However, the opposite proved to be the case. This study suggests that CQ is the preeminent outlet for studies in the communication apprehension/communication pedagogy area. SJC and CS have more titles with rhetorical concepts and CQ and WJC have more titles with interpersonal concepts, suggesting that scholars or editors have favored these areas along regional lines. It is not possible with the present dataset to examine the question of the origin of these differences. They may be due to editorial filtering, but they might also result from a tendency for scholars to perceive traditions of interest (and thereby to establish and perpetuate them) regardless of a journal's actual editorial policies. Whatever the cause, the fact remains that, editorial policy to the contrary, the journals are empirically distinctive in their publication practices, especially SJC and CQ.

Given that the last 25 years have seen the launch of more than 40 new journals in communication, one might reasonably inquire about the role of the regional communication journals in this time of increasing specialization. With new titles drawing off manuscripts in areas such as health, political communication, women's studies, communication and law, critical theory, computer mediated communication, conversation analysis and ethnomethodology, communication theory, broadcast and electronic media, business communication, applied communication, etc., what is the status of articles published in journals that do not specialize? A partial answer to this can be found by comparing the results of this study with those of a similar study conducted on a twenty-five-year span of the journal ``Human Communication Research" (HCR) (xxxxx, in press). Although some areas of research were similarly depicted in the two studies, there was no representation within HCR of many areas that appeared here, such as communication education, rhetoric, written communication, oral performance, or political communication.

HCR has sustained an open editorial policy but has limited its articles to those with a much narrower range of foci and methodologies. This resulted in an interesting contrast: whereas it was concluded in the HCR analysis that HCR's publication history was broadly representative of a particular theoretical/methodological stance, the present data appear to be representative of the eclectic intellectual labor of an entire field of study (at least as manifest in speech communication-related curriculum). Unlike HCR, the regional journals do not so much represent a self-consistent perspective or theory of communication as they do the organization of the speech communication discipline as a whole. The regional journals do a better job of representing the multifaceted character of the discipline, symbolizing connection and unity, even if along lines that can only be understood to be sensible in historical perspective rather than in terms of shared theoretical focus. In an era of increasing specialization and fragmentation in scholarly publishing and academic life, the regionals outline the institutional structure that knits the speech communication component of the field together.

References

American Psychological Association (1994). Publication Manual of the American Psychological Association (4th ed.). Washington, DC: Author.

Frakes, William B. (1992). Stemming algorithms. In Frakes, William B., and Baeza-Yates, Ricardo (Eds.). Information retrieval: Data structures and algorithms. Upper Saddle River, New Jersey: Prentice-Hall PTR, 131-160.

Funkhouser, Edward T. (1996). The evaluative use of citation analysis for communication journals. Human Communication Research, 22, 563-574.

Lambert, Bruce (1996). The theme machine: Theoretical foundation and summary of methods. Paper presented at the annual meeting of the International Communication Association. Chicago.

Lebart, Ludovic, Salem, Andre, and Berry, Lisette (1998). Exploring textual data. Dordrecht, Netherlands: Kluwer Academic Publishers.

Medhurst, Martin J. (1982). The sword of division. Western Journal of Speech Communication, 46, 383-390.

Porter, M. F. (1980). An algorithm for suffix stripping. Program, 14, 130-137.

Provalis Research (1998). WordStat. [Computer software]. Montreal, Quebec: Authors.

Rees-Potter, L. (1989). Dynamic thesaural systems: A bibliometric study of terminological and conceptual change in sociology and economics with application to the design of dynamic thesaural systems. Information Processing and Management, 25, 677-691.

Reeves, Byron, and Borgman, Christine L. (1983). A bibliometric evaluation of core journals in communication research. Human Communication Research, 10, 119-136.

Rice, Ronald, Borgman, Christine, and Reeves, Byron (1988). Citation networks of communication journals, 1977-1985: Cliques and positions, citations made and citations received. Human Communication Research, 15, 256-283.

Small, Henry (1986). The synthesis of specialty narratives from co-citation clusters. Journal of the American Society for Information Science, 37, 97-110.

Stephen, T., Harrison, T., and Silvestre, P. (1992). ComIndex: An electronic index to communication serials. [Computer software]. Rotterdam Junction, NY: Communication Institute for Online Scholarship.

Tijssen, R. J. W., and Van Raan, A. F. J. (1984). Mapping co-word structures: A comparison of multidimensional scaling and Leximappe. Scientometrics, 15, 283-295.

Footnotes

1. The objection may be raised that linguistic normalization of article titles does not take account of negative forms, treating as equivalent cases such as ``Mass Media Do Not Use Interpersonal Channels" and ``Mass Media Use Interpersonal Channels". In both cases, the procedures for linguistic normalization described in this article would reduce these titles to ``MASSMEDIA INTERPERSON CHANNEL". However, in this study placing these two titles on an equal basis is appropriate, since the primary focus of both papers is the concepts ``mass media", ``interpersonal", and ``channel".

2. Complete definitions for all tokens and maxiterms are available from the author.

Table 1

Twenty-five Highest Frequency Tokens

Frequency  Token          Observed Text

563        RHETOR         (rhetoric, rhetorical, rhetorica, rhetorico, rhetorically, rhetorics, rhetor, rhetorician, rhetoricians)
181        SPEECH         (speech, speeches)
117        PERCEP         (perception, perceived, perceptual, perceptions, perceiving, perceive)
95         CRITIC         (criticism, critic, critics, criticizing)
95         GROUP          (group, groups, intragroup)
91         ORGANIZE       (organization, organizational, organizations, organizing)
91         INTERPERSON    (interpersonal)
86         POLIT          (political, politics, politicized, politically)
75         MESSAG         (message, messages)
74         AMERICA        (american, america, unitedstates)
73         ARGUM          (arguments, argumentation, argument, argumentative, argumentativeness, argumentatives)
68         TELEVI         (television, televisionprogramming, televised, tv, televisionprogram, televisionprograms, televisionseries)
68         PERSUASION     (persuasibility, persuasion, persuasive, persuasiveness, persuader, persuaders, persuading)
66         WOMEN          (female, women, woman, females)
64         PRESID         (president, presidential, presidency, presidents)
63         INTERAC        (interaction, interactions, interactionally, interactional)
60         RELATIONSHIPS  (relationships)
59         MODEL          (model, models, modeling)
57         CULTUR         (culture, cultures, cultural, intracultural)
55         APPREHEN       (apprehension)
53         CAMPAIGN       (campaign, campaigns, campaigning)
50         INTERPRET      (interpreter, interpretation, interpreters, interpretive, interpreting, interpretations, interpretative)
49         CASESTUDI      (casestudy)
49         TEACH          (teaching, instructor, teach, instruction, instructional, microteaching, instructors)
48         PERFORM        (performance, performances, performed, performers, performing)

