Extension Education Evaluation:
An Evolutionary Perspective
with Implications for Theory and Practice

Cathy H. Hamilton, Graduate Research Assistant
Satish Verma, Professor of Extension Education
Michael Burnett, Director
Louisiana State University
School of Vocational Education
142 Old Forestry Bldg.
(504) 388-5748
(504) 388-5755

chamilt@unix1.sncc.lsu.edu
vocbur@lsumvs.sncc.lsu.edu
sverma@agctr.lsu.edu

Paper Presentation, Extension Education Evaluation
Topical Interest Group
American Evaluation Association
Annual Meeting, Atlanta, Georgia
November 6-9, 1996

The mission of the Cooperative Extension System (CES) is: "to help people improve their lives through an education process which uses scientific knowledge focused on issues and needs." During the past two decades, Cooperative Extension has confronted an increased demand for evaluation of exactly how well its resources and activities are achieving desired goals. Thus, since the 1970s more emphasis has been placed on developing appropriate models of evaluation for program management.

Evaluation takes place within the context of what Greene (1994) calls the "dizzying pluralism" of social inquiry. Nowhere is this "dizzying pluralism" more evident than in the diversity of programs and clientele within the CES. From its historical roots in production agriculture, Cooperative Extension now finds itself attempting to be responsive to amazingly diverse segments of society.

  • In agriculture, for example, services are extended to large commercial farmers, to limited resource farmers, and to backyard gardeners. In home economics, traditional homemakers, women employed outside of the home, and single parents are served. Youth programs serve teen leaders, farm boys and girls, and urban children; while community development lends assistance to local officials, low-income community groups, and tourist agencies. With these varied programs and clientele has come a diversification of subject matter (Warner and Christensen, 1984).

    Cooperative Extension's challenge has been to develop appropriate models of evaluation that translate meaningfully throughout these myriad contexts. This diversity, not only of program needs but also of program evaluation, puts CES in a unique position to influence what Michael Scriven (1993, 1991) proclaims to be the emergence of a new discipline of evaluation. This infant discipline has evolved from an encompassing set of practices to systematic methodological models to general theories and finally, to what Scriven contends is a crucial development, a metatheory of evaluation. He writes that the theses of evaluation as a discipline "were not deduced from a preexisting general theory--as statistics from probability theory--but rather inferred from analysis of practice" (p. 3). Cooperative Extension has 82 years of practice. Scriven's characterization will serve as a conceptual frame to document and analyze design and process elements of selected Extension evaluations, primarily in the subfield of program evaluation, that fit the descriptions of this progression.

    Scriven claims that there is only one discipline of evaluation or what he prefers to call a "transdiscipline" (p. 37). This transdiscipline consists of

    1) a wide range of practical applications of evaluation in various fields,
    2) a nascent core discipline devoted to developing a distinctive and valid logic,
    3) general methods and theories of evaluation, and
    4) a metatheory that guides ethical as well as scientific obligations of the field.

    Transdisciplines are "tool" disciplines in that they exist only to serve and are driven from the bottom up (1993, p. 17). Hence the burden of constructing the core discipline has fallen on workers in the applied fields (p. 20).

    Scriven's four elements in the emergence of a new discipline are interdependent. They do not necessarily follow a linear progression, but are cyclical and mutually interacting in nature (1991, p. 25). However, given that this paper attempts to document the extent to which Extension evaluation has mirrored Scriven's evolutionary framework, trends in Extension evaluation will be discussed as they relate to each essential component of Scriven's "transdiscipline."

    The first of Scriven's four required components is:

    1) The emergence of consciousness or recognition that something new is afoot or possible and a definition of what that something is; 'a mapping its shape and location,' getting the definition 'right.' This definition is a crucial step or steps because it lays the groundwork for the other three.

    Included in Scriven's (1991) description of a transdiscipline of evaluation is that "evaluation by definition must claim a crucial part of every discipline...and that the study and improvement of all these uses of evaluation is said to be part of the proper subject matter of evaluation" (p. 39). The words of Jennifer Greene (1994) capture that concept:

  • "There exist many scenarios of social program evaluation. They vary in the nature of the social issues involved, in the perspectives taken on the issues, in the geographic scope of services to be reviewed, in the kinds of information sought, and in the stated purposes for which the information will be used. Underlying these differences, however, are some fundamental commonalities that demarcate evaluation contexts, and that thereby distinguish program evaluation as a unique form of social inquiry" (p. 531).

    Given the diverse nature of the programs involving Extension clientele, the challenge in program evaluation has been to define the nature and the purpose of studies of Extension program results in terms that could cut across differing contexts. In 1983, as a response to the call for increased accountability in Extension programming, a resource guide was developed for state administrative program leaders and others with program evaluation responsibilities to advise them on how to design studies of program results that were accurate and reliable (Rivera, Bennett & Walker, 1983). This collection of impact studies was intended to provide appropriate research designs to evaluate program impact.

    Rivera, Bennett and Walker offered various definitions of program impact. They quote Sanders' (1982) broad definition, which included intended and unintended results, positive, negative or neutral impacts, and those of immediate or future consequence. The guidelines for the Extension Accountability and Evaluation (A&E) System (1983) defined "Impact Studies" as technically valid in-depth studies to assess: (a) the economic or social consequences of Extension efforts, or (b) other aspects of Extension inputs, operations or programs. Ultimately the authors defined program impact as "the economic, social, environmental and individual consequences (results) of program-induced learning and practices" (p. 7). They state that ideally, an impact study "should somehow assess a program's final consequences: (a) preferably through providing evidence bearing directly on the program's end results, or (b) by discussing how a program's measured educational and/or practice results might be expected to produce its end results" (p. 8). Although the research designs can be categorized as objectives-based or summative evaluation, implied in the models is formative evaluation of the program development process as it relates to the desired outcomes.

    As recently as 1990, a Journal of Extension article entitled "Organizational Philosophy for Program Evaluation" defined the dual role of program evaluation to be 1) program management and improvement and 2) accountability and impact documentation (Decker & Yerka, 1990). The authors quote a state program leader stating, "Evaluation should be 75% useful to the programmer and 25% useful for administrative reporting needs." They add, "Making evaluation useful for the programmer means using evaluation to support program decision making. Evaluation should be considered in this context!" Stated in these terms, the models for CES program evaluation become vulnerable to Warner and Christensen's (1984) critique that all too frequently the issues important to the clientele and public are left out of the process (p. 2). Although this "top-down" approach might seem incongruent with the needs-based or grass-roots initiated programming espoused by CES, the severity of that critique would depend on the level of input from the target clientele throughout each stage.

    The writers of the 1983 guide compare traditional models of Extension evaluation. Relying primarily on the Context, Input, Process and Product model (CIPP; Stufflebeam, 1971) and Bennett's Hierarchy of Evidence (1976), the authors state that rather than limiting evaluation to one phase of a cyclical process--the phase following program planning and program implementation (summative)--evaluation should be used in each phase of programming. Although different models divide the overall programming process into different phases, all of the models deal with the questions of deciding what kind of program to have, then how to conduct it, and finally, deciding on improvements based on whether or not the program objectives have been met. The resource guide focused on studies which help to evaluate program products, outcomes or results. It defined levels of program results as:

     

    Although "value of information" as perceived by participants of Extension programs is included in the resource, the target audience for the use of the evaluations consists of policymakers and program managers.

    In 1977, the Congressional Farm Bill mandated that a system of accountability and evaluation with greater scientific rigor be used in an attempt to document the connection between Extension's programs and outcomes. In the 1980s, George Mayeske brought the concept of Evaluability Assessments (EAs) to the Cooperative Extension System (Verma & Mayeske, 1990; see Figure 1). Evaluability assessment is a formal study of a program which provides information as to whether or not there is clarity about the goals and objectives, and whether the program is plausible and measurable. This kind of information can be used to support decisions about "improving" a program (Smith, 1989; Wholey, 1979).

    The major purpose served by the CES EAs was the planning of plausible, evaluable programs appropriate for the diverse Extension programming needs (Verma & Mayeske, 1990). Evaluability Assessments clarified the logic (theory) of a program, specified functional components (activities and resources) and provided indicators to determine when planned activities were implemented and outcomes achieved. Stakeholder input is important in evaluability assessments to determine stakeholders' awareness of and interest in a program and their perceptions as to what a program is meant to accomplish. Once program logic models are developed, these models serve as guideposts for the program staff in program management and policy decisions. EAs served as pre-impact evaluation studies.

    Cronbach et al. (1980) argue that "The distinction between studies that ask how good a service is and those that ask how the service can be improved has been around for decades" (p. 23). The framing of this formative/summative debate as an either/or proposition is neatly rendered obsolete by Bennett and Rockwell's (1995) expanded hierarchy: Targeting Impacts of Programs (TIP). This new model, designed for developing and evaluating Extension programs, is based on the 1976 design; however, the authors stress the assessment stages of program development much more than in the earlier version, weighting program development equally with impact evaluation. Specifically mentioned in the introduction to the expanded model is the flexibility of the design, which enables its use in myriad contexts and by multidisciplinary programming teams (p. 1). Thus the model fulfills one of Scriven's primary criteria for the transdiscipline of evaluation, that of applicability across disciplines.

    Using the hierarchy (see Figure 2), Bennett and Rockwell's program development model uses a common framework to target program impacts as well as evaluate program performance in achieving targeted impacts. For assessment in program development, stakeholders and program planners/managers begin by assessing the current social, economic and environmental conditions, with the ultimate aim of the program being the improvement of those current conditions. Working through the levels in descending order, the program planners assess changes needed in the practices of the participants and in their knowledge, opinions, skills and aspirations (KOSA), continuing through the reactions of the participants involved in the activities to the resources needed to support the desired changes. Bennett and Rockwell reiterate the need for evaluation (assessment) at each stage of program planning and delivery.

    For evaluating program performance, one ascends the "programing staircase" (p. 10), gathering evidence of the degree of achievement of targets previously defined at each stage of the program design. Bennett and Rockwell divide the program performance evaluation into two evaluations: the process of program implementation and the effectiveness of program implementation in achieving targeted program impacts. Process evaluations focus on achievements of targets at the lower four levels of the hierarchy (resources used, activities held, participants involved and reactions of the participants). They contend that ascertaining the extent to which targets at levels 1-4 are realized suggests the program's potential to produce targeted program impacts. "Generally the more nearly targets at the lower four levels are achieved, the more positive the evaluation of program process" (p. 10).

    The impact evaluations indicate degree of achievement of pre-determined targets (from the program development assessments) and the extent of influence of Extension program implementation on such achievement. These program evaluation processes are then used to improve program management and accountability.

    In summary, the TIP model integrates program planning and performance evaluation. TIP uses the hierarchy levels to conceptualize both program development assessments and program performance evaluations. Needs and opportunity assessments as well as program design assessments are used to plan and develop programs. Process and impact evaluations are used to evaluate program performance (p. 5).

    Patton's (1988) discussion of evaluation of impacts or outcomes broadens the "educational, practice and end results" defined by the 1983 resource guide. Taking into account the diversity of Extension programming, he divides outcomes into four major impact themes that frequently compete for dominance within Extension's theoretical assumptions about the purpose of evaluation. Roger Rennekamp (1995) summarizes the four themes:

    This fourth impact theme resonates with the newer Interdependency model discussed in the fourth section of this paper concerning theoretical perspectives of extension evaluation.

    Definitions of the purpose of extension evaluation have moved from activities and outputs (how many participated, how many adopted the innovation) to a broader definition of outcomes the program hoped to achieve.

    Scriven's second component of a discipline mandates that evaluators develop appropriate methodology to be able to discuss evaluative conclusions.

    2) There is the identification and development of an appropriate methodology--a set of procedures and tools to generate enlightening or useful results in the new field. These tools and procedures range from really specific (scoring keys in marking essays) to the level of general methods and models of analysis (a program evaluation checklist) (Scriven, 1991, p. 25).

    Rivera, Bennett and Walker's (1983) resource guide for Extension program evaluation was concerned only with evaluation designs that provided accurate and reliable evidence that behavioral and status changes of program clientele could be attributed to their participation in an Extension program (p. 29). The authors of the 1983 resource guide cite previous publications and guidelines in Extension's evaluation history. They point to the need to move beyond studying and reporting program results to evaluating program results in terms of success and failure (p. 24). The authors acknowledge that they focus on only one step of planning, conducting and utilizing evaluations, that of the design. However, they state that without reliable evidence, drawing evaluative conclusions is difficult. Rivera, Bennett and Walker argue that it is the quality of the research designs that provides evidence on the nature and extent of program results and provides data to answer evaluative questions such as "Did the results justify the amount of resources invested?" "How badly were these results needed?" "Are other programs which receive similar amounts of resources accomplishing more?" (p. 24).

    Their document was the culmination of a nationwide search for analytical studies containing findings on Extension program impacts. Four hundred and fifty abstracts of studies that contained findings on programs during the years 1961-1978 were submitted. Studies included in the resource guide were judged on the following criteria:

    (a) contained data on program results rather than just program processes;
    (b) focused on programs clearly identified as Extension programs;
    (c) data supplied by program participants or clientele rather than just by Extension staff;
    (d) consistency of study hypotheses and measurements with objectives and structure of the Extension programs studied;
    (e) validity of measurements and analysis of the data;
    (f) basis of any inferences regarding results of the program studied;
    (g) basis for any generalization of sample findings to a larger population;
    (h) primary data collection from clients with adequate size of sample and response rate (Rivera, Bennett and Walker, p. 114).

    All of the designs proffered by the resource guide fall within Campbell and Stanley's (1966) guide to experimental and quasi-experimental research designs. The studies consist of within-group and between-group designs of Survey ("one shot" ex-post facto), Time-Series, Field Experiment, and Comparison Groups.

    The influence of the 1983 resource guide is an example of Scriven's much lamented value-free, objective constraints that appear in Extension evaluation concepts. Rivera, Bennett and Walker state that the chosen designs were in no way meant to limit design options for the future, nor did they "intend to imply that qualitative approaches are to be ignored in evaluation of Extension programs ... Nonquantitative approaches are simply not within the scope of this publication" (p. 104). However, given that the resource was designed for program leaders and specialists all across the country, and in special response to the documented dearth of evaluation specialists in most areas, the influence of this document would have been profound. The assumptions underlying the validation of program evaluation are discussed further within Scriven's fourth component of evaluation as a discipline.

    Numerous critics of evaluation studies cite the damaging limitations imposed on evaluation by the value-free paradigm and the regrettable consequence of limited usefulness of evaluation findings (Greene & McClintock, 1991; Scriven, 1967, 1973, 1983, 1990; Weiss, 1970, 1972, 1977, 1987; Stake, 1967). Greene (1994) convincingly argues that new genres of evaluation methodologies arose in response to the failure of experimental science to provide timely and useful information for program decision making (p. 532).

    Sara Steele (1994) argued:

  • Our traditional standard evaluation procedures assume that people are all alike (uniform in their entry behavior), when in fact, we only have made them seem alike by developing the arithmetic mean for the group. That mean conceals the amount of diversity. But it is very convenient to compare the group as a whole (mean) against some other mean,...and thus identify the amount of results. It would be more meaningful to look at amount and kinds of results for each individual and see how they cluster. In qualitative data, this means looking at individual case histories and developing themes or categories.

    Patton's (1988) response represented a new genre of evaluation methodologies oriented to decision making and practical application. This new pragmatism, as it was called, promoted pragmatic selection of methods, both quantitative and qualitative, to match the practical problem at hand. The approach is utilization-focused and attempts to answer questions concerning program improvement and program goal achievement as well as evaluating a program's benefit to the beneficiaries.

    Evaluability assessments and TIP designs opened the door for methods that tap into a broader spectrum of models of evidence gathering than typically used for traditional quantitative designs.

    John Elliott and Linda Olson (1992), in their Evaluation of the Leadership and Local Government Education Project, designed their study to summarize the context, activities, implementation processes and outcomes of the three-year leadership program. The overall purpose of the three-year program was to enhance the capacity of local leaders, officials, and institutions to provide necessary services to residents in the nine-county region in southwest Michigan. It was a multifaceted project that provided programming in three areas: leadership development, education and training of local government officials, and technical assistance for increased coordination of public service delivery.

    Because the project had a variety of programs to address the three areas of emphasis, a combination of evaluation methods was used to evaluate the various projects individually and globally within the objectives. These methods included personal interviews, telephone interviews, review of program materials and feedback instruments from various components, as well as direct observation of a few of the program activities.

    In their findings, Elliott and Olson provide a qualitative description of each of the major projects and their outcomes. From that documentation, they concluded that the program had met its stated goals (p. 67).

    Increasingly, collaborative or participatory designs are being used for data gathering as well as data analysis leading to evaluative conclusions. The Ohio State University Extension Family and Consumer Sciences Division presented the findings of a participatory research and evaluation family nutrition project that involved all stakeholders (Extension, program participants and various community agencies) throughout the program development, implementation and evaluation process. During the development of the grant, each stakeholder had input into the proposal. The participants collaboratively developed common goals as well as goals for each of the stakeholders. For example, improved nutrition intake was one common goal. Indicators were developed to measure the quantitative outcomes of the program, and focus groups involving Extension personnel, agency personnel, and program participants were used to discover program benefits and ways to improve the program.

    These examples demonstrate changes in acceptable methodology for Extension evaluation documents. However, the vast majority of evaluation reports still fall within the traditional quantitative paradigm.

    3) There is the development of findings. These consist of databases and general principles and theories. The element of generality will be present (as opposed to specifics) at a minimum in the form of new conceptual schemes and their associated terminology. Usually there are explicit taxonomies, and often also laws, generalizations, and theories (Scriven, 1991, p. 25).

    Scriven discusses the difference between the science of evaluation and the practice of evaluation (1991, p. 4). The science of evaluation is the explicit study of the principles and practice; like science, he claims, it involves the production of knowledge. However, this knowledge is about the relative merit, worth or value of things. "This intellectual process of evaluation is one that technology and science share with all the disciplines ... and with rational thought in general. It is the process whose duty is the systematic and objective determination of merit, worth or value. Without such a process, there is no way to distinguish the worthwhile from the worthless ... In the usual taxonomy of cognitive processes it [evaluation] is listed as the most sophisticated of all" (p. 4).

    Rivera, Bennett and Walker (1983) distinguish between evaluation findings and evaluative conclusions (p. 24). They deal with the question of who will use the evaluations of program results by describing a pyramid with policy makers at the top, policy administrators below them, program leaders next on the descending path and program staff making up the base (see Figure 3). The authors argue that users high in the pyramid have more need for information on a program's end results and less need for information on practice or educational results. Conversely, the lower the position of users in the pyramid, the greater their need for information on educational and practice results and the less immediate their need for information on end results (p. 20). They discuss the importance of selecting the most appropriate evaluation design that will provide intended study users with evidence of the program results in which they are most interested (p. 21). From their description, the target audience or participants of the program are excluded from the evaluation findings and any implications from the evaluative conclusions.

    The Cooperative Extension System contracted Kappa Systems, Inc. (1979) to establish standards regarding technical accuracy. Kappa Systems, Inc.'s (1979) three-volume review offered ten guidelines for the development of useful findings (in Rivera, Bennett and Walker, p. 23):

    1. Clearly state study purposes.
    2. Specify study limitations and/or degree of generalizability.
    3. Describe the Extension program being assessed.
    4. Relate study questions and measures to program objectives.
    5. Discuss the reliability and validity of the measures selected.
    6. Establish a link between client outcomes and Extension program delivery.
    7. Provide adequate labeling of tables, charts, and graphs.
    8. Separate presentation of findings from conclusions.
    9. Provide adequate support for conclusions and a comparison if program success or failure is concluded.
    10. Balance completeness of report with succinctness of presentation.

    All of the examples of program evaluation in Rivera, Bennett and Walker's resource guide comply with the standards regarding technical accuracy. Two major sets of standards for evaluations appeared in the 1980s:

    Rivera, Bennett and Walker point out that one strength of the Joint Committee's Standards is that they do not insist that technical accuracy equates to usefulness of the evaluation. They quote,

    Steele (1994) argues that "Program evaluation is locked into paradigms which solidified approximately forty years ago." She questions whether or not Extension program evaluation has progressed much beyond equating evaluation with reporting or limiting the definition of program evaluation to scientifically measuring the attainment of objectives. Summarizing her argument, she juxtaposes traditional belief systems (positivism, post-positivism, information transfer) that foster conformity with alternative systems (constructivism, critical theory) that allow for disconfirming evidence in the evaluative conclusions.

    Traditional belief systems of program evaluation operate on the following assumptions:

    1) people are sufficiently uniform that it is safe, even desirable, to generalize from a random sample.

    2) agencies can and should set objectives for people based on the agency's knowledge rather than what the customers want. It is the agency's role to manage people in attaining these desirable objectives.

    3) people are at the same uniform level of void in terms of the information or action and will benefit equally (in ways defined by the agency). Thus, one set of objectives should apply to all program participants, and uniform results should be expected from a majority of program participants.

    She offers a composite alternative view that:

    1) people construct what they learn and adapt it to their individual circumstances, thus there is diversity in what people take from a program;

    2) people usually differ extensively in what they know and do when they enter a program;

    3) evaluation should not limit itself to the agency's view but should critically examine the program and its assumptions as well as its outcomes (critical theory paradigm);

    4) regardless of the program's objectives, people gain diverse things from the same program; and,

    5) outcomes and value should be defined by the clientele (customer) not by the agency.

    The set of principles recommended by the American Evaluation Association's Task Force on Guiding Principles, and adopted by the Association's membership in 1994, is an overt attempt to move evaluation towards Scriven's third criterion of a discipline. In these guidelines, the Association and its membership identify five principles:

    The fourth, and for Scriven the most important, component distinguishing evaluation as a discipline from simply a set of practices, methodologies and principles is the development of a theoretical perspective that allows for an evaluation of the evaluation. He states the fourth component as:

    4) There is a meta-theory of discipline that provides a loose and almost invisible framework for the practice that is both descriptive and prescriptive. It is through the meta-theory that a discipline justifies its procedures, its ontology, or its boundaries.

    The debate at the meta-theory level reflects the tension surrounding the birth of a new discipline or "transdiscipline" according to Scriven. If, as he argues, "the intellectual conceptions drive practice and are driven by it," some consensus in this arena is imperative to understand how to evaluate the evaluation.

    For Scriven, the key obstacle to establishing evaluation as a discipline lies in the inability to validate evaluative conclusions. He laments what he describes as "a pervasive and disgraceful prejudice" against the development of evaluation as a discipline due to the constricting value-free doctrine of the social sciences (1991, p. 24). These traditional evaluation theories adopted a post-positivist systems approach and were driven by issues of efficiency, accountability and theoretical causal knowledge. Using traditional quantitative experimental methodology and designs, and systems analysis, they attempted to answer questions of the logical feasibility that program outcomes can be attributed to the program intervention. The clients of the evaluations tended to be high-level policy and decision makers (Greene, 1994).

    House (1995), in his critique of the principles adopted by the American Evaluation Association, charges the value-free doctrine with creating ethical fallacies in the principles.

  • Most of these ethical fallacies have at their root the value-free doctrine, which amounts to the evaluator's not taking responsibility for ethical and moral judgments but rather substituting the values of clients, the powerful, or participants as the basis for the evaluation, or by ignoring the issue altogether through seeking refuge in methodology or the contract. These positions are not ethically defensible (p. 31).

    Evaluation in Cooperative Extension reflects the ideological debate throughout evaluation discourse. Although we see the objectives-based model still prevalent in the Decker and Yerka (1990) article, we are also witnessing a call for new paradigmatic responses to the nature and purpose of evaluation.

    One writer familiar with Cooperative Extension evaluation projects states, "Even in new paradigmatic attempts of evaluation, what is occurring in extension program evaluation is still a subset of a more dominant tradition based in the natural sciences, quantitative methods and top-down edicts on what farmers should do for their own survival and the national good" (Nolan, personal communication, July 6, 1996).

    However, examples exist of attempts to forge new territory without the constrictions of the value-free paradigm. Cristina Bosio de Ortecho's article in the Summer 1991 issue of the Journal of Extension described the importance of collaborative learning in a participatory evaluation project with a housing cooperative. Frustrated with the inadequacy of traditional evaluation designs, Bosio de Ortecho enumerated several crucial features of the evaluation design: participants evaluate; evaluation specialists facilitate their work by proposing methods and techniques; evaluation approaches and topics come from the expectations and concerns of participants; and proceedings are simple, with clear stages and steps. One challenge of the methodology was to train the community groups in different types of procedures so they could work not only with opinions, perceptions, and ideas, but also verify and measure items of importance. Using the data, the groups were able to back up a housing proposal used to obtain funding.

    Bosio de Ortecho pointed not only to the success of the final report but also to the useful outcomes of the evaluation process itself: sharing feelings, expectations, and ideas not typically exchanged; reconsidering group values; collectively acquiring knowledge; and becoming aware of the relationship between particular problems and long-range problems of social and political context.

     She proposes that

  • social projects turned into collective learning processes are little by little being recognized as a way to mobilize human resources. We're facing methodological questions we didn't think of a couple of years ago. The frontier to be pushed is enabling community groups to handle useful evaluation process. The time has come to face a challenge of a different nature--to turn these group learning processes into larger community learning processes to match the magnitude of the changes needed and expected with the ability to produce them (p. 16).

    Her evaluative claims mirror those of House (1990), who states that

  • social justice is among the most important values we should hope to secure in evaluation studies...As a social practice, evaluation entails an inescapable ethic of public responsibility...[serving] the interests of the larger society and of various groups within society, particularly those most affected by the program under review (pp. 23-24).

    Evaluation operates within structural political-eco-social constraints. Greene (1994) documents how program evaluation "is integrally intertwined with political decision making about societal priorities, resource allocation, and power" (p. 531).

    These kinds of evaluation studies and reported results have interesting implications in light of Bennett's interdependence model of interagency partnerships. Bennett attempts to combine Cooperative Extension's adult education component with its research-based technology transfer roles. His interdependence model calls for a more focused emphasis on the role of education in Extension rather than the role of transferring information and advice. He argues that people need to be able to define problems, question assumptions, test alternatives, and organize to solve problems (p. 5). Extension clientele, he points out, often have greater need for comprehension and ...decision-making capacities than for information and advice about adopting specific technologies, practices, and systems.

    These thoughts are echoed in the words of Steele:

  • Some people are saying that Extension's role should go considerably beyond providing new information and should help people deal with the 'information glut' that is occurring in some areas. In addition to helping people augment what they know, Extension can also play major roles helping people integrate information from various sources, evaluate information for usefulness in their particular situation, and know when and how to use new information...Extension can play a major role in capacity building.

    Scriven's (1991) closing arguments clarify the importance of recognizing evaluation as an emerging discipline. He eloquently summarizes the pragmatic, ethical, social and business, intellectual, and personal roles that evaluation, done well, can play. He writes:

  • Doing evaluation and doing it well matters in pragmatic terms because bad products and services cost lives and health, destroy the quality of life, and waste the resources of those who cannot afford waste. In ethical terms, evaluation is a key tool in the service of justice, in program as well as in personnel evaluation. In social and business terms, evaluation directs effort where it is most needed, and endorses the 'new and better way' when it is better than the traditional way. In intellectual terms, it refines the tools of thought and exposes a pervasive and disgraceful prejudice...In personal terms, it provides the only basis for justifiable self-esteem (p. 43).

    For 82 years, Cooperative Extension's programs have touched people's lives in this holistic manner. The potential impact of Extension programs on the everyday lives of people in every county in the nation, together with their far-reaching international influence, puts the Cooperative Extension System in a unique and powerful position to influence the direction of evaluation in the 21st century.

     

    References

     

    Argyris, C. & Schon, D. (1989). Participatory action research and action science. American Behavioral Scientist, 32, 5, (May/June).

    Bennett, C. F. (1976). Analyzing impacts of Extension programs (No. ESC-575) (rev. ed.). Washington, D. C.: Extension Service, U.S.D.A.

    Bennett, C. F. & Rockwell, K. (1995). Targeting Impacts of Programs (TIP): Targeting impacts to develop and evaluate Extension programs. Lincoln, NE: Nebraska Cooperative Extension Service.

    Campbell, D. T. & Stanley, J. C. (1963). Experimental and quasi-experimental designs for research. Chicago: Rand McNally.

    Cronbach, L. J.; Ambron, S. R.; Dornbusch, S. M.; Hess, R. D.; Hornik, R. C.; Phillips, D. C.; Walker, D. F.; & Weiner, S. S. (1980). Toward reform of program evaluation: Aims, methods and institutional arrangements. San Francisco, CA: Jossey-Bass.

    Elliott, J. & Olson, L. Evaluation of the leadership and local government education project. Summary of Research in Extension. 1992-1993. Vol. 6. Department of Agricultural Education and Experimental Statistics. Mississippi State University.

    Extension Accountability/Evaluation System (1983). State Extension plan of work and report guidelines. October, 1983-Sept. 30, 1987. Washington, D. C.: Extension Service, U.S.D.A.

    Greene, J. C. (1994). Qualitative program evaluation: Practice and promise. In Norman K. Denzin & Yvonna S. Lincoln (Eds.). Handbook of qualitative research. Thousand Oaks, CA: Sage Publications.

    House, E. R. (1990). Methodology and justice. In K. A. Sirotnik (Ed.), Evaluation and social justice (pp. 23-36). San Francisco, CA: Jossey-Bass.

    House, E. R. (1995, Summer). Principled evaluation: A critique of the AEA guiding principles. In W. R. Shadish, D. L. Newman, M. A. Scheirer, and C. Wye (Eds.). Guiding principles for evaluators. New Directions for Program Evaluation, no. 66.

    Kappa Systems, Inc. (1979). Guidelines for improving Extension impact studies. Vol. III. Arlington, VA: Author.

    Patton, M. Q. (1990). Qualitative evaluation and research methods (2nd ed.). Newbury Park, CA: Sage Publications.

    Patton, M. Q. (1988). Extension's future: Beyond technology transfer. Knowledge, 1, (4, June), 22-24.

    Rennekamp, R. A. (1995). A focus on impacts: A guide for program planners. Cooperative Extension Service: University of Kentucky; Kentucky State University.

    Rivera, W. M. , Bennett, C. F. , & Walker, S. M. (1983). Designing studies of Extension program results. Vol. 1-Text. Washington, D. C.: Extension Service, U. S. D. A.

    Sanders, J. (1982). Criteria for technically valid impact studies. In Rivera, W. M., Bennett, C. F., & Walker, S. M. (1983). Designing studies of Extension program results. Vol. 1-Text. Washington, D. C.: Extension Service, U. S. D. A.

    Scriven, M. (1993, Summer). Hard-won lessons in program evaluation. New Directions for Program Evaluation, no. 58.

    Scriven, M. (1991). Evaluation thesaurus (4th ed.). Newbury Park, CA: Sage Publications.

    Smith, M. F. (1989). Evaluability Assessment: A practical approach. Boston: Kluwer Academic Associates.

    Spiegel, M. R. (1994). Integrating performance measures among multiple partners. In Proceedings of the 1994 American Evaluation Association Annual Meeting, November 2-5, 1994, Boston, MA.

    Steele, S. (1994). Uniformity as a blind to diversity and a block to social justice. Proceedings of the Extension Education Evaluation Topical Interest Group, American Evaluation Association Annual Conference, Boston, MA: November 2-5, 1994.

    Stufflebeam, D. L. (1971). A conceptualization of evaluation. Washington, D. C.: American Educational Research Association (Training Tape, Series C).

    Wholey, J. S. (1979). Evaluation: Promise and performance. Washington, D. C.: The Urban Institute.

    Verma, S. & Mayeske, G. (1990). Using evaluability assessment to improve programs in the Cooperative Extension System. Paper presentation at the First Annual meeting of the Association of Louisiana Evaluators, New Orleans, September 14, 1990.

