Despite the substantial increase in the use of foresight as an instrument of science, technology and innovation policy since the beginning of the 1990s, there has been relatively little systematic work aimed at evaluating foresight or understanding its effects in aggregate. A first point to note is that the timescale of foresight makes it rare for evaluations to be concerned with accuracy of prediction embodied in the scenarios, Delphi surveys, etc. The only exception to this at national level has been in Japan, where the maintenance of a relatively stable activity since the 1970s has allowed accuracy to be assessed, though it has also been noted that foresight may have changed the rate and direction of activity in the first place, creating a problem of measurement.
Other efforts have included self-evaluations, for example by a senior member of the French Key Technologies Programme, and Panel reviews of foresight programmes in Germany, Hungary and Sweden, as well as some smaller countries. The Panel reviews have come closest to true evaluations, as they have been independent and based on surveys of key stakeholders (postal and face-to-face hearings or interviews) and have had similar terms of reference to this exercise. In the UK a number of partial attempts have been made either to evaluate or to design evaluation approaches for the first two phases. Most were focused upon measuring outputs and activity rather than effects and appropriateness. The most influential exercise was the internal review of 2001, which led to the establishment of the current activity. It may be concluded that there is no ready-made approach that can be adopted, but that existing evaluations can provide some useful pointers on issues to raise. Probably the most important lesson to have emerged from past foresight evaluations is the need to align foresight with the implementation environment. This is not to say that foresight should not be disruptive, but rather that its impact is strongly dependent on how well stakeholders have been engaged and on the processes established for delivering results into the policy arena. Assessing those linkages will form a significant part of this proposed approach for evaluating the Colombian TF Programme sponsored by COLCIENCIAS and SENA.
Foresight, like any policy instrument, consumes both time and resources, and for this reason it is logical to expect that the TF Programme (TFP) should be subject to evaluation of a rigour comparable to that applied to other similar instruments. Evaluation frameworks are generally designed to address three major concerns:
Accountability – concerns about accountability normally include issues such as whether the activity was efficiently conducted and proper use made of public funds;
Justification – concerns about justification generally focus on the expected effects of the programme in order to justify its continuation and extension; and
Learning – concerns about learning usually centre on how foresight can be done better in particular circumstances.
In a standard evaluation approach, it is important to define the scope and purpose of what is being evaluated at an early stage (Gibbons and Georghiou, 1986). However, whichever approach is taken, Georghiou and Keenan (2007) suggest that the following three evaluation criteria are likely to be included in most foresight evaluation processes:
Efficiency of implementation, focused upon managerial and logistical issues. These are not necessarily trivial or only of bureaucratic concern. Process evaluation covers topics such as organisation and management, and would for example ask: Were the ‘right’ people involved in an exercise? Did expert panels (if used) receive adequate support? Was the exercise adequately linked to decision-making centres? It may also address the question of the appropriateness and efficiency of methods used, for example: Should a Delphi have been used? Were scenario workshops properly facilitated? A well-conducted process evaluation can cast light upon the dynamics of foresight.
Impact and Effectiveness, focused on what has been produced by foresight in terms of outputs and outcomes. Probably the most important observation here is that outputs measure only activity and not its significance. Hence, while it may be useful to know the numbers participating in meetings or surveys, reports disseminated, meetings held, website hits and so on, none of these measure the effects of these contacts or their contribution to outcomes. Numbers may even be misleading; the number of “new networks” formed disguises variation in their novelty, size, significance and durability. Outcome evaluation is normally made far more difficult by the problem of attribution, discussed more extensively below.
Appropriateness, focused on questions of rationale. For national foresight activities this includes the issue of state intervention but also raises questions of what the alternatives would have been (including the counterfactual).
When a Foresight Programme reaches a level of maturity at which some of the activities it supports are beginning to show their own dynamics (e.g. network consolidation), it is important to embark on a systematic evaluation process in order to assess and improve ongoing and future activities.
In the most recent evaluation process carried out by the University of Manchester, we considered the following 20 criteria to evaluate a national foresight programme:
- Criterion 01: Appropriateness and level of achievement of objectives
- Criterion 02: Performance of the management and funding mechanisms
- Criterion 03: Justification of the programme in terms of value for money
- Criterion 04: Effectiveness and efficiency of the organisational structure
- Criterion 05: Effectiveness and efficiency of the approaches and methods
- Criterion 06: Effectiveness and efficiency of implementation and aftercare
- Criterion 07: Level of capacities and foresight culture achieved
- Criterion 08: Level of national, sub-national and international presence
- Criterion 09: Level of commitment of participants
- Criterion 10: Level of novelty and impact of projects
- Criterion 11: Impact on public and private policies and strategies
- Criterion 12: Impact on agendas of STI programmes and institutions
- Criterion 13: Impact on the consolidation of research groups
- Criterion 14: Impact on the consolidation of S&T capacities
- Criterion 15: Impact on other (inter)national projects
- Criterion 16: New products and services (publications, courses, etc.)
- Criterion 17: New policy recommendations and research agendas
- Criterion 18: New processes and skills (management, research)
- Criterion 19: New paradigms or scientific/technological developments
- Criterion 20: New players (e.g. sponsors, collaborators, networks)