
FOREWORD

In 1977, «Legal Decisions and Information Systems», written by Jon Bing and Trygve Harvold, was published by Norwegian University Press.

During subsequent discussions with North-Holland Publishing Company, there emerged an interest in a larger-sized, comprehensive and detailed Handbook, in which on the one hand the theoretical base would be broadened and updated, and on the other hand the descriptive coverage of retrieval systems around the world would be enlarged and brought up to date as well.

The result of these endeavours, for which Norwegian University Press kindly granted permission, now lies before you.

As testified to under «Acknowledgements», the generous help from a sizeable group of persons from many countries has been indispensable to us.

Further, recent work of the Norwegian Research Center for Computers and Law has been incorporated into the general parts (I and II), derived for instance from Jon Bing's thesis on legal communication processes, and Tove Fjeldvig and Trygve Harvold's work on the principles of text retrieval. Robert Svoboda has contributed with his practical experience from a large number of European operational systems in various countries.

The international survey has been greatly expanded and covers currently 25 countries. A historical introduction and a section on international organisations have been added. Also the survey of research in text retrieval has been enlarged.

It must be appreciated, however, that the book does not claim to be complete or correct in all details.

We hope that readers finding omissions, errors and misunderstandings will not hesitate to communicate these to the authors at the address of NRCCL (given at the end of the Foreword). We would also appreciate being made aware of relevant literature, including newsletters published by legal information services for the benefit of their users. North-Holland has indicated that there may be a possibility of later editions of this Handbook, and we would certainly appreciate the opportunity not only to report on further developments, but also to amend and further update the descriptions of the different systems.

This book is the result of team-work. For practical reasons, Jon Bing has written the text giving it a uniform style, while using contributions from the other authors - especially in Part II.

This implies that the book is written by someone not having English as his mother-tongue. This will be obvious from the idiosyncrasies of the text. We hope the users will excuse the lack of elegance and the «nordicisms» of grammar, choice of words, etc.

The whole production up to a camera-ready copy has been made in Oslo. This implies also that the manuscript has been proof-read in Oslo - under the same limitations. We are painfully aware of the fact that there still are too many errors in the text, reflecting that the proof-reading proved to be a process we had grossly underestimated in terms of time and resources. However, without the expert help of Gunnar Bach, we would have been completely lost.

The production of this book has been organised as a project within the NRCCL research programme NORIS. The project has been partially funded by outside sources. We would like to thank the Royal Norwegian Council for Scientific and Industrial Research (NTNF), which has funded the NRCCL studies of text retrieval over the past years, the Bergen Bank Foundation, which made a study trip possible in 1983, and Emil Mostue, who has made it possible for NRCCL to subscribe to several foreign legal information retrieval services. We also would like to thank our friends and colleagues for support and encouragement, which once more has proved the value of working within the framework of collective enthusiasm at the NRCCL.

Lastly, we would like to thank Dr. K. Michielsen of North-Holland for his support and patience through the several years that this book has been in the making.

Skillebekk, March 1984.

Address:
Norwegian Research Center for Computers and Law,
University of Oslo,
Niels Juels Gt. 16,
N-0272 Oslo 2,
Norway.


ACKNOWLEDGEMENTS

In writing this book, we have had the benefit of advice, comments and information from many sources and in many forms. We would like in general to thank all our friends around the world and in Norway who have helped us to collect information. They are not the ones who should be blamed for any misrepresentations or omissions in the text; rather, they have been instrumental in preventing these faults from being even graver.

Though we cannot mention individually all those who have contributed in one way or another, we would like to thank the following persons for commenting directly on draft versions of text sections:

Directeur Louis Barbel, Centre de documentation et d'informatique, Paris; Professor Roger Brown, Faculty of Law, University of Tasmania; York Brusse, Bundesministerium der Justiz, Bonn; Professor Yaacov Choueka, Bar-Ilan University, Israel; Director Costantino Ciampi, Istituto per la documentazione giuridica, Florence; Lecturer Marc Fallon, Universite de Louvain; Margaret Anne Foster, Canadian Law Books Ltd.; Sven Yves Poullet, Centre de Recherches «Informatique et Droit», Facultés universitaires Notre Dame de la Paix, Namur; Professor Bryan Niblett, Department of Computer Science, University College of Swansea; Anja Oskamp, Computer/Law Institute, Free University of Amsterdam; Professor Gerald Salton, Department of Computer Science, Cornell University; Stephen Saxby, Faculty of Law, University of Southampton; Professor Peter Seipel, Swedish Law and Informatics Research Institute, University of Stockholm; Professor Guy Vandenberghe, Interfacultair Centrum voor Management, Rijksuniversiteit Gent.


PART I: LEGAL DECISION AND COMMUNICATION PROCESSES



1 CONTEXT OF RETRIEVAL SYSTEMS

In this book we will deal with legal information services. By this we mean services designed for bringing the legal information of statutes, regulations, cases etc to the lawyer. These services take many forms. The conventional services are often provided by a publishing house, using the vessel of a book or a journal to bridge the gap to the user. The more innovative services have introduced computer technology, and this book is mainly addressed to the possibilities and problems of such services.

It may not be quite as obvious that the context of these services also needs to be described. We think, however, that it is essential to be aware of, and have a clear understanding of, the environment of the legal information services.

This environment may be regarded as created by two interlocking processes, the communication process and the decision process.

By the communication process we indicate the process by which information is made available to the user. This process generally involves the provider of the service, an editorial staff and a number of activities which make it possible for a user to subscribe to a service and employ it for retrieving legal information which may help to solve legal problems.

The decision process is the process by which a user actually retrieves the information and applies it in solving a legal problem or in formulating a legal opinion.

One may say that the communication process is the description of how the user acquires information, while the decision process is the description of what the user does with this information. These two processes have, however, one important element in common, namely the information system proper: that mechanism which makes it possible to wring information out of the volumes of case reporters or disk drives of a data base.

Information services are designed for one thing: to enable the user to make his legal decisions in a better way. The aim of the legal information service is identical with the aim of the legal decision process itself - to make better legal decisions. The main point is, of course, that it is not obvious what a "good" legal decision really is.

In principle this is an important problem. As we are concerned with legal information services, we would like to assess such services and decide which one is the better. In order to do so, we have to have a scale by which to measure them. As this is related to "good" decisions, we are at a loss if we are not able to decide which one of two decisions is the better. We will then also have problems - at least ultimately - in deciding which one of two information services is the better.

In other works this method has been pursued, cfr especially Bing/Harvold 1977:225-259 and Bing 1982:188-279.

In this book we shall not attack the goals on this general and, perhaps, slightly abstract level. We will take the goals one level down, and maintain that a "good" legal decision is a decision which follows the rules to which decisions have to comply. Such rules - the norms of legal decision making - may also be quite vague and often implicit. But in our model of a legal decision process, we shall try to disclose such rules of importance to legal information services. By tracing the relations between these rules and legal information services, we also set some standards for such services.

In the chapter dealing with legal communication processes, we shall also introduce goals on an even less general level: what we shall term functional efficiency. This will describe the functional performance of a given information system and - if the relations between these functions and the decision process have been established - make this more specific goal meaningful as well.

In describing the context of legal information services, we shall commence with a description of the most basic one - the legal decision process. We shall then move on to describe how this may be attached to the legal communication process. Finally, we shall discuss in this part of the book the relations between the use and the cost of retrieval systems.

This part of the book will be quite general. The models have been developed for sufficient flexibility to handle both computerized and traditional legal information services. We shall, however, consciously try to develop them in order to bring out the characteristics of computerized services, which are the subject of this book.

We would like to point out that though we have tried to make these models general, they are naturally formed on the basis of Norwegian legal theory and tradition. This is an important reservation. Also, a strong reservation is implied in the term "model". We do not attempt to describe physical or psychological processes as they may occur in reality, but rather model them into a sequence of elements with specified interrelations for analytical purposes.



2 DECISION PROCESSES

2.1 Introduction

The purpose of presenting a model of legal decision processes is to specify and relate the different elements of such a process. It is intended to reflect an uncontroversial picture of how a legal decision process is organized, by using a presumably untraditional approach.

The elements specified in the model will frequently seem quite trivial to the lawyer, and may actually be just intuitive leaps of the mind. Consequently it may seem rather unrealistic to try to portray them in a model. Also, the model grossly understates the interaction between the different elements when a problem is being worked out in the mind of a lawyer. The content of any element may not be determined until a decision finally is reached - only then is a balance established which makes it possible to analyse the decision. For representational purposes, we shall describe the process as if it had one beginning and one end.

Though the approach has certain limitations, we think it is also useful when discussing elements of the decision process - as for instance the retrieval process. The model elucidates the relationship between this and other elements, making it possible to discuss terms which otherwise may remain vague or imprecise.

In presenting the model, we shall bear in mind the purpose of this book. Consequently, we shall explore the anatomy of the decision process with our attention centered on the retrieval process. Other elements will be sketched, but not in detail. For a more comprehensive description, cfr Bing 1975.




2.2 Initiation

A legal problem is something that emerges out of the interaction between humans, or between humans and the environment. It is part of a society, and exists by itself before it is brought to the notice of a lawyer. We are, of course, only concerned with the legal problems brought to the notice of a lawyer, either by a client who experiences the problem, or by the activity of the lawyer himself (who may be a legal researcher).

The problem concerns one or more persons whom we shall call "the client". In order to initiate the decision process, the client must contact a lawyer. This is in itself an observation of some importance, as the initiative of the client presupposes that he is aware of the legal nature of his problem.

For the purpose of this book, we use the term "lawyer" to denote all persons working with legal problems, regardless of their formal education or degrees. Civil servants will, in this sense, generally be "lawyers" as they deal with cases to be decided by legal norms. For the sake of simplicity, we also restrict our description of the problems to those which are part of a typical client-lawyer relationship, though the discussion might easily be extended to include also those "problems" defined by the lawyer himself (with himself as the "client"), as would be the case of the academic lawyer homing in on the problems implied by a distinction made in a statute.

The lawyer himself is not part of the problem, but rather a sort of spectator giving advice to his client. His contact with the problem is indirect - the private lawyer through his client, the civil servant through a citizen. A judge also has a "client" - who is not, of course, one of the contesting parties before the court, but rather the case presented jointly by these parties.

The problem, as experienced by the client, is not primarily classified as "legal". The first step of the lawyer is to determine whether the problem - as presented to him through his client - is legal or partially legal. This is not as trivial as it may appear. If the client complains of bad health, inferior housing, a nagging spouse and an unsound economy, it is not obvious that the best remedies are an invalidity pension, housing grants and a divorce. The legal problem may be part of a more complex problem situation - or, indeed, a symptom of other problems. The client may perhaps be better aided by medical care, retraining and marital advice rather than extensive legal assistance.

As a lawyer, one is nevertheless restricted to isolating the legal problems of the totality. This presupposes that the lawyer can identify a legal problem, which will be second nature to a lawyer and in most cases quite trivial. The lawyer is an expert with background knowledge of the legal system - and consequently he is able to grasp the legal problems that are part of the totality. As a characteristic, one may say that a problem is "legal" if legal arguments may contribute towards its solution. This is a pragmatic characterization, but it will suffice for our purpose.

Looking a bit closer at the nature of a "legal problem", one may specify at least three typical examples:
  • (1) There exist valid norms prescribing that problems of a certain type may be solved by legal reasoning. Typical examples are decisions by the courts or by the public administration - statutes governing these activities will imply that decisions are the results of legal decision processes.
  • (2) The problem is in the form of a dispute which may be brought before the court (or a similar agency for deciding disputes, as mentioned under (1)). The possibility of a trial will throw a shadow across the problem and make a legal decision the normal solution even when the case is settled out of court.
  • (3) The parties (or the environment) accept that a legal decision process is a valid way of settling the problem. This agreement will depend on several factors, for instance the social prestige or authority of the lawyer in question or of lawyers in general, the effectiveness of a legal decision process (which is a rather effective way of arriving at a solution) etc.



2.3 The facts of the case

2.3.1 The probable facts of the case

When a lawyer has decided which are the legal issues of his client's problem, he sets out to describe the facts of the case. Having no direct knowledge of the problem, he has to rely on evidence - primarily discussions with his client, but also with other persons concerned with the problem, through examination of documents etc.

Some facts will be evident (like the identity of his client). But here a few meta-norms prescribe how the lawyer is to arrive at the probable facts of the case.

The term "meta-norm" is, for the purpose of this book, used for all norms governing the legal decision and communication processes. This is in contrast to norms of a substantive nature, which prescribe the possible solutions to the initial problem. Meta-norms will consequently define or determine the outline of legal decision processes. We are aware of the fact that meta-norms themselves may be of different categories, but feel that this terminology will be adequate for our purposes.

The meta-norms relevant for determining the probable facts of the case are relative to the position of the lawyer. Under Norwegian law, a judge has to ascertain that the factual aspects of a case are sufficiently documented before making a decision (cfr rl sect 190, strpl sect 331(5)). A similar obligation is prescribed for the civil servant (cfr fvl sect 13), and if the case is not adequately documented, the decision may be deemed void.

Perhaps more important, there are also meta-norms governing what is permitted as evidence. In Norwegian law, these norms are few and of little practical importance, and restricted to a few topics like the question of whether evidence secured by illegal means may be allowed in court. In general the judge or the lawyer may take into consideration what is thought appropriate. In other legal systems, especially in the Anglo-American systems, the law of evidence plays a more prominent role in determining the probable facts of the case.

The general rule is, of course, that the lawyer takes into account what he finds probable based on general human knowledge. When in doubt, he will choose the set of facts most likely to be true. It may also be worth noting that the assessment is not linked directly with formal probability theory, even when this is applicable. Thus the lawyer's view of what is probable may differ from what follows from a strict mathematical argument.

Attitudes of a different nature may also be introduced at this stage of the process - for instance the meta-norm of client loyalty. The lawyer will quite naturally be inclined to accept the version of the facts presented to him by his client, even when another version may appear more likely. Having a client implies an obligation to look after the interests of the client, which may very well lead to arguments for a set of unlikely circumstances being true. Also, the loyalty to the client may influence the assessment of what is probable.

The sort of influence exemplified by client loyalty may be regarded as a feedback-loop from later stages of the decision process, indicating the dynamic nature of the process somewhat lost in our model.

2.3.2 The proven facts of the case

The law of evidence may already have introduced some discrepancy between what is "probable" in the commonly understood sense of the word, and what is "probable" based on the admissible evidence. But most legal systems add another layer of refinement to this, and introduce special qualifications in respect to probability. A set of facts is not regarded as proven unless it is, to a qualified degree, more probable than any other set of facts. The meta-norms govern the burden of proof, and they are, of course, relative to the legal system. A general example is the prosecutor's burden of proof in a criminal case - you are "innocent until proven guilty".

This group of meta-norms is, however, not too well defined. In Norwegian legal theory it has, for instance, been proposed that norms governing the burden of proof may be generated by the special facts of the case itself, for instance based on an assessment of the consequences of alternative decisions, cfr Bratholm/Hov 1973:300-301. Once again we encounter a set of meta-norms which open a feedback channel from a later stage in the decision process.

Through the meta-norms governing the burden of proof, the lawyer arrives at a set of facts which is proven. These may differ from the probable facts of the case to the extent that qualified probability is required for a certain circumstance to be proven.
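The step from probable to proven facts can be sketched in a few lines of Python. This is only an illustration under strong assumptions: as noted above, the lawyer's assessment is not linked directly to formal probability theory, so the numerical probabilities, the required level of 0.95 and the function names are all hypothetical and chosen for the example only.

```python
from typing import Dict


def probable_facts(versions: Dict[str, float]) -> str:
    """Return the version of the facts judged most likely to be true."""
    return max(versions, key=lambda name: versions[name])


def proven_facts(versions: Dict[str, float], burden_on: str,
                 fallback: str, required: float) -> str:
    """Apply a simplified burden-of-proof rule.

    The version carrying the burden of proof counts as proven only if its
    probability reaches the required (qualified) level; otherwise the
    fallback version is taken as proven.
    """
    if versions[burden_on] >= required:
        return burden_on
    return fallback


# Hypothetical assessment of two competing versions in a criminal case.
assessment = {"accused did it": 0.70, "accused did not do it": 0.30}

probable = probable_facts(assessment)
proven = proven_facts(assessment, burden_on="accused did it",
                      fallback="accused did not do it", required=0.95)

print(probable)  # accused did it
print(proven)    # accused did not do it
```

With these invented figures the probable and the proven facts differ, because the version carrying the burden of proof is more likely than not but falls short of the qualified level.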

The case - as described by the proven facts - is the platform from which the lawyer launches his search for relevant norms. (To us it is important to stress that the model takes the facts of the case rather than the legal norms as its point of departure, cfr Kilian 1974:42.) The lawyer has used his legal background knowledge in order to extract the proven facts from the totality of the problem. And he may already have used feedback information from later elements in our model of the decision process. But generally speaking we may say that up to this point substantive law has not entered into the process.

In order to arrive at a decision, the lawyer has, of course, to select the relevant norms from the legal system. The retrieval process is part of - but not identical with - this selection.




2.4 Legal sources

For the purpose of this book, we understand by the term "norm" a certain content found in certain statements (cfr Sundby 1974:17). A norm is of a semantic nature, but is based on certain statements of a syntactic nature.

In order to identify norms, it is therefore necessary to define what statements qualify as a foundation when arguing that a certain norm exists. This definition is given by meta-norms, and the statements qualified according to these meta-norms are called "legal sources".

The distinction between legal sources and legal norms is fundamental. With regard to computerized systems, the distinction is also of a very practical nature. Legal sources are typically of a written nature (statutes, regulatory law, court decisions, etc). The text of these sources may be processed by a computer, and the assistance of a computerized system may be used to sort out documents defined by certain characteristics like the occurrence of certain terms or citations or by combinations of these. But when the computer has retrieved the defined units of text, the lawyer is left with only a set of legal sources. The legal norms that may be founded on this set of legal sources must still be formulated in the mind of the lawyer.
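What such sorting out of documents may amount to can be shown by a minimal Python sketch. The sample sources, the citations and the function `retrieve` are invented for the illustration and do not describe any operational system. Note that the result is only a list of sources; no norm is formulated by the program.

```python
import re
from dataclasses import dataclass, field
from typing import Iterable, List, Set


@dataclass
class Source:
    """A legal source reduced to what a computer can process."""
    ident: str
    text: str
    citations: Set[str] = field(default_factory=set)


def words(text: str) -> Set[str]:
    """Split a text into its lower-cased words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def retrieve(sources: Iterable[Source], all_terms: Iterable[str] = (),
             any_terms: Iterable[str] = (),
             cites: Iterable[str] = ()) -> List[str]:
    """Select sources by occurrence of terms, citations or combinations."""
    all_terms, any_terms, cites = list(all_terms), list(any_terms), list(cites)
    hits = []
    for source in sources:
        vocabulary = words(source.text)
        if not all(term in vocabulary for term in all_terms):
            continue  # a required term is missing
        if any_terms and not any(term in vocabulary for term in any_terms):
            continue  # none of the alternative terms occurs
        if not all(cite in source.citations for cite in cites):
            continue  # a required citation is missing
        hits.append(source.ident)
    return hits


# Invented sample sources; the cited provisions are placeholders.
SOURCES = [
    Source("case-1", "The seller delivered defective goods and the buyer "
           "claimed damages.", {"sales act sect 42"}),
    Source("case-2", "The tenant claimed damages for a defective heating "
           "system.", {"tenancy act sect 10"}),
    Source("statute-1", "The buyer may claim a reduction of the price."),
]

print(retrieve(SOURCES, all_terms=["defective", "damages"]))
# ['case-1', 'case-2']
print(retrieve(SOURCES, all_terms=["damages"], cites=["sales act sect 42"]))
# ['case-1']
```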

It would fall outside the scope of this book to discuss the problems related to legal norms and their nature. On the other hand, legal norms are fundamental to all legal thought, and a clear understanding of their nature is necessary even in a book with our limited scope. We have taken recent legal theory as our frame of reference, cfr especially the detailed discussion by Sundby 1974.
In our terminology, norms are of a semantic nature. This does not imply that we disregard the efforts made to represent legal norms in a more unambiguous way than possible in natural language. Both deontic logic and certain projects with computerized systems (deontic systems) have indicated ways of representing norms in an unambiguous way. Cfr for instance Bing 1977, Niblett 1980, Ciampi 1982 and Maretti 1982. This does not, however, alter the fact that the norms are still of a semantic nature, while the representations - though less ambiguous - are of a syntactic nature.

By our definition, a legal source is a text, a statement or an opinion (voiced, for instance, by the local chamber of commerce on the sales practices within a certain trade). The content of the source may be used by the lawyer as arguments in constructing legal norms. The lawyer must, ought to or may take the arguments of a certain source into consideration when arguing for the existence or specific content of a legal norm. A group of legal sources having the same origin will be termed a type of legal sources.

Only arguments which are derived from legal sources may be used when reasoning on the existence or content of a specific legal norm. Consequently, the qualification of a statement as a legal source becomes a matter of some importance. The qualification is based on legal meta-norms relative to the legal system in question. They are rarely made explicit. On the other hand, they are not - at least within the Norwegian legal system - controversial. Disagreement on a legal point very rarely centers on whether the statement basic to an argument is a legal source or not. In Norwegian legal theory there have been, however, a few instances where the status of certain types of legal sources has been discussed - cfr for instance Fleischer (1965:152) on the status of decisions by the first instance courts.

A similar discussion has taken place in California. In 1963 the state parliament introduced selective publishing of court decisions under the supervision of the Californian Supreme Court, cfr Government Code of California sect 68895. According to this section, the Supreme Court decided that "important" decisions of the appeal courts should be published. In 1972 the "burden of proof" was reversed in such a way that instead of not publishing cases when they were found unimportant, cases should only be published when found important. In Rule no 975(b) in California Rules of Court it is now stated that
"No opinion of a Court of Appeal or of an appellate department of the superior Court shall be published in the Official Reports unless such an opinion (1) establishes a new rule of law or alters or modifies an existing rule, (2) involves a legal issue of continuing public interest, or (3) criticizes existing law." Each of these criteria is explained in footnotes.
The criteria will not seem unduly restrictive judged against the background of the Norwegian legal system. But obviously this regulation has to be assessed in the perspective of the Anglo-American case law system, and the extensive publishing which led to amendments in two steps, in 1963 and 1972.
The justification for the amendments was simply to reduce the volume of case law which every year was added to the legal system, and which made it more difficult for lawyers to cope with the information situation, cfr Kanner 1973:388. But the result was not satisfactory. It created a serious availability crisis, what Kanner (1973:390) calls "two lawyer levels":
"... the uninitiated ordinary practitioner who keeps up with the advance sheet and knows only what he reads there, and the specialist-insider who collects unpublished opinions in his field as well, and who therefore possesses a special insight into the thinking of intermediate appellate courts."
In our context, the major point is that no improvement in the information situation of the lawyer became apparent. As a published decision had the same status as an unpublished one, it resulted only in the loss of an information system previously available, and an availability crisis as mentioned above. It was also maintained that the Supreme Court exploited its control of the publishing to "get rid of what it apparently deems to be erroneous or otherwise improvident decisions".
In this situation, several remedies might be sought. Interestingly enough, California opted for a regulation of what should be qualified as legal sources. In an amendment taking effect on January 1st, 1974, Rule of Court no 977 was given the following form:
"An opinion of a Court of Appeal or of an appellate department of a superior court that is not published in the Official Reports shall not be cited by a court or by a party in any other action or proceeding except ..." (the exceptions are fairly limited).
In this way, California gives us a rare example of an attempt to regulate the qualification of legal sources by a written and explicit rule. It also gives us an insight into the interrelationship between information systems (in this case, the Official Reports) and the legal argument itself.

Through these meta-norms, a volume of statements is defined. This volume represents the total number of legal sources of the legal system. The lawyer looking for norms relevant to his case, must direct his attention towards these sources.

Obviously, the lawyer does not have access to the total number of legal sources, but only to the part of these sources which is available to him. Sources are made available through some sort of communication process, and at this point in our model, we find that it impinges upon the model we shall discuss later - the model of legal communication processes. At this stage we shall therefore simply presume that some sources have been made available. We note that those circumstances which limit the access to the total volume of legal sources are termed availability factors, and also that it is no real disaster for the lawyer to be screened in this way: the total number of sources would be a deluge of trivial decisions and outdated regulations which in general he is grateful not to have to wade through in his search for relevant legal sources. Nevertheless, he will be concerned if the availability factors in a systematic way make a smaller set of sources available to him than to his colleagues working in the same legal field. Such a discrimination will be termed "availability discrimination", and will be considered in more detail later.

At this stage we collapse the communication process to a screen between the total volume of legal sources and those available to the lawyer. Noting this, we go on to consider what he will be doing with the available legal sources. It is towards these he will turn in his quest for the appropriate legal norms, and to identify the legal sources on which these may be based, he has to conduct a retrieval process.




2.5 The retrieval process

At this point, the lawyer has described his case through a set of proven facts. His task is to find the legal sources from which he may derive arguments to construct relevant legal norms. Only rarely is the background knowledge of the lawyer sufficient for him to decide the case outright. And in our model, we will presume that it is necessary to supplement the background knowledge with a more specific knowledge of applicable legal norms.

On one hand the lawyer has the problem described as a set of facts or circumstances. On the other hand the lawyer has a volume of texts. His task is to bridge the gap between the problem and the legal sources in order to arrive at the applicable legal norms. This is the retrieval process.

In order to give a general description of how this gap is bridged, one may briefly take a look at the legal norms for which the lawyer is searching.

A legal norm is usually described as consisting of two elements, the antecedent setting out the conditions for its application, and the consequent, setting out the consequences of its application. The two segments are combined in a way which is usually described by verbs like "must", "can", "may" etc.

This simple model of a legal norm is sufficient to point out that the antecedent sets out the conditions for the application of a certain norm, and these conditions are descriptions of situations in terms of circumstances and facts. We note that there is a correspondence between the antecedent and the facts of the case at hand: If the norm in question may be applied to the case at hand, the facts of this case must fit the description of the antecedent.
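The simple model of a norm may be written out as a small data structure. The following Python sketch is an illustration only: the class `Norm`, the function names and the two sample norms are invented and do not restate any actual provision. Treating the fit between facts and antecedent as an exact match of fixed phrases is a deliberate simplification; in practice the fit is a matter of interpretation.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List, Set


@dataclass(frozen=True)
class Norm:
    antecedent: FrozenSet[str]  # conditions for applying the norm
    modality: str               # "must", "can", "may" etc
    consequent: str             # consequence of applying the norm


def applies(norm: Norm, facts: Set[str]) -> bool:
    """A norm applies if every condition of its antecedent is among the facts."""
    return norm.antecedent.issubset(facts)


def applicable_norms(norms: Iterable[Norm], facts: Set[str]) -> List[Norm]:
    """Select the norms whose antecedents fit the facts of the case."""
    return [norm for norm in norms if applies(norm, facts)]


# Invented sample norms.
NORMS = [
    Norm(frozenset({"goods are defective", "buyer has given notice"}),
         "may", "claim a price reduction"),
    Norm(frozenset({"tenant has not paid rent"}),
         "may", "terminate the lease"),
]

# Proven facts of a hypothetical case.
facts = {"goods are defective", "buyer has given notice",
         "goods were delivered late"}

for norm in applicable_norms(NORMS, facts):
    print(norm.modality, norm.consequent)  # may claim a price reduction
```

The facts of the case may contain more than the antecedent requires; what matters in this sketch is only that every condition of the antecedent is met.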

We have already established that legal norms may be constructed out of arguments derived only from legal sources. The antecedent, as part of the legal norm, must be constructed in the same way. Consequently, the text of the legal sources must contain descriptions of the situations defined by the antecedents.

In this way, it is possible to point out that the bridge between the case at hand and the legal sources is in principle represented by the facts of the case. By using these facts, the lawyer should be able to retrieve the relevant legal sources.

This presumes that the lawyer has access to an information system. Obviously, the lawyer cannot let the search for relevant legal sources be dependent on chance only, for instance by opening books at random. Neither can the lawyer permit himself to read all the available legal sources sequentially. He has to use tools which allow him to take short-cuts, retrieving legal sources with a high probability of being relevant. The information system is just such a short-cut.

Retrieval systems may be designed in a number of different ways. At this stage we shall not dwell on the different alternatives, but rather point out that they have at least one feature in common: they allow the user to formulate search requests.

The search request has to be consistent with certain rules embedded in the information system. Any information system imposes restrictions on the search request.

A traditional back-of-the-book index restricts the user to those terms listed in the index: the search request has to be one of these terms, and will then give a reference to the pages characterized by that term.

A computerized text-retrieval system will allow the user to include any word from the natural language texts of the documents, including codes and terms in abstracts or indexes added to the authentic text by an editor.

Though the rules for formulating search requests - the search language - may vary widely from one system to another, the basic fact remains: within the restrictions imposed by the system the user may formulate his request.
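The contrast between the two kinds of restriction may be sketched as follows. The miniature collection, the index terms, and the function names are invented for the illustration; the sketch assumes a simple word-level inverted file and is not a description of any particular system.

```python
import re
from collections import defaultdict

# Invented miniature collection: page number -> text.
pages = {
    17: "The owner of a domestic animal is liable for damage.",
    18: "A bull escaped and injured a hiker on the path.",
}

# Manual index: only the editor's terms are permitted search requests.
back_of_book_index = {"liability": [17], "animals": [17, 18]}


def search_index(term):
    # A term outside the index is not a valid request at all.
    if term not in back_of_book_index:
        raise KeyError(f"'{term}' is not an index term")
    return back_of_book_index[term]


def build_inverted_file(docs):
    # Full-text retrieval: every word of the text becomes searchable.
    inverted = defaultdict(set)
    for page, text in docs.items():
        for word in re.findall(r"[a-z]+", text.lower()):
            inverted[word].add(page)
    return inverted


inverted_file = build_inverted_file(pages)


def search_text(word):
    return sorted(inverted_file.get(word.lower(), set()))


print(search_index("animals"))    # [17, 18]
print(search_text("bull"))        # [18]
print(search_text("liability"))   # []
```

Both systems restrict the user: the index accepts only the editor's terms, while the text search accepts any word but finds only words that actually occur, so the index term "liability" retrieves nothing from a text that says "liable".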

We stress this point just to clarify that all retrieval systems imply the formulation of search requests, regardless of whether the system is computerized or manual. The available legal sources are of a syntactic nature. In order to employ a retrieval system, a bridge has to be constructed reaching from the semantic to the syntactic level. The search request - which represents the problem or part of it in a way permitted by the information system - is this bridge.

In a computerized system the search request plays an essential part. Such systems usually offer more possibilities in formulating requests than conventional manual systems. The attention is shifted towards the requests. Retrieval strategies - the ways of formulating adequate requests - become more important, reflecting the possibilities of choice.

As pointed out, we do not find differences in principle between the use of a computerized system and, for instance, a conventional library system. Both require that questions are transformed into search requests. The flexibility of the search language will vary from one system to another, but this is only a difference in degree. The principal difference would be between systems where a syntactic search is necessary, and systems which would "understand" the question in much the same way as a human, for instance a colleague, would understand the lawyer when describing his problem. So far, systems allowing this have not been developed. Even those experimental systems employing techniques from the field of artificial intelligence and computational linguistics pivot on conventional search strategies. In our view, the critical comments of Slayton (1974:22) on the difference between retrieval in a "normal library situation" and retrieval by computerized systems are based on the misconception that the "normal library situation" includes information systems permitting what Slayton somewhat vaguely calls "random conceptual searching".

As an illustration of the restrictions imposed on the user in formulating his request by a conventional retrieval system, we may take the precedent files in a public agency. Research has demonstrated that such files are generally organized corresponding to sections in the statute governing the activities of the agency. In such a situation the lawyer cannot use the facts of the case at hand when formulating his search request - he must formulate the request as a section of a statute. If he is mistaken in his choice of sections, he will not be able to find an identical case in the precedent file (cfr Bing 1982b).

This may also serve as an occasion to look somewhat more critically at the model. According to the model, the lawyer considers the facts of his case and uses these for constructing a search request. The search request functions as some sort of definition, qualifying which circumstances should be described in a legal source for it to be "retrieved".

The lawyer may object that this is a rather unrealistic description of the retrieval situation. Often the retrieval process is intuitive - the memory of the lawyer prompts him with the possible relevant references to the literature and case law. Often the lawyer will also easily identify which sections of a statute or which general systematic term cases or other sources will be indexed by to be considered relevant for the case at hand. These are examples of the background information which makes alternative search strategies possible and adequate. Though the facts of the case represent the only bridge from the problem to the law, the lawyer in practice will interpose his knowledge and experience, and be able to do quite a lot of the work inside his own head, which in our model is presumed to be carried out explicitly by a retrieval system.

The result of processing the initial search request is a preliminary retrieval of a set of legal sources. These sources are written documents in the form of statutes, regulations, court decisions, etc. By interpreting these sources, the lawyer may construct the legal norms. The interpretation may be a trivial process indeed, and include only the reading of a text. But in our terminology, any understanding of a legal source (through reading or listening) presumes an interpretation, a "decoding" in the psychological sense. A further discussion on interpretation will be pursued below at sect 2.6.

During the preliminary interpretation of the legal source, the lawyer considers the description of facts or circumstances found in the sources, comparing them to the facts of his case. Where a correspondence is found, the legal source in question is put aside as relevant.

The examination of the legal sources will take place regardless of the information system employed by the lawyer. In an interactive computerized system, this phase will correspond to the browsing at the terminal. The text of the retrieved sources may be displayed, and the lawyer may rapidly browse through them in search of possibly relevant documents. Special features of the system - for instance highlighting or focusing - facilitate browsing. In this phase, the computer functions as a "reading glass". The increased efficiency of browsing in computerized compared to manual systems should be highly appreciated.

In going through a set of legal sources, the lawyer will gain insight into the legal problems related to his case. The browsing stage is also a learning stage. User research has indicated that the lawyer may spend quite some time at the terminal, and the explanation given is the time being spent mapping the general legal background on which the case at hand is to be judged.

One may observe that since the facts of the case define which legal sources are relevant, the legal norms (and, indirectly, the legal sources) define which facts are relevant. The lawyer, when deciding that one fact among his set of proven facts is relevant, bases this decision on the hypothesis that there exists at least one norm including this fact. But norms may be formulated only on the basis of legal sources, and to find these the lawyer - in principle - needs to construct search requests on the basis of legally relevant facts. At this stage the lawyer does not know whether the selected facts are really relevant. He relies on his background knowledge. This is necessary both in order to select facts for constructing the search requests, and for formulating the requests, ie selecting the terms that are most likely to represent the facts in the available sources (or in an index to such sources). A lack of background knowledge will only accidentally be remedied by an effective information retrieval system. Therefore a computerized system will also serve the knowledgeable lawyer better than the ignorant one. And laymen, lacking legal insight, will be more or less helpless when confronted with such a system. An information system designed to give legal information to non-lawyers should be designed on other principles, including, for instance, a sub-system capable of problem analysis, a process which in our model is presumed to be manual.

One may note that the analysis prior to the formulation of a search request indicates one of the major problems of computerized text retrieval as experienced by the user - the problem of specificity. When formulating the request, the lawyer uses the proven facts of the case. But in the legal sources, the facts may be represented by words different from those one would be inclined to select when describing the case at hand. If the case concerns a bull inflicting damage on a hiker, a legal source of a general nature (like, for instance, a statute) would most probably not use the word "bull", but rather a more general phrase like "domestic animal". In a legal source of a casuistic nature (for instance case law), the fact may be represented by the word "bull", but equally well by synonyms like "ox" or "steer", or by terms denoting other domestic animals like "horse" or "goat" - or even the names of specific animals, like "Ferdinand" for the guilty bull. This illustrates the importance of background knowledge not only of the relevant field of law, but also of the legal language. Later we shall discuss strategies of system design which may help to overcome this problem.
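One conceivable way of easing the problem of specificity is to let the system expand a search term with synonyms and broader phrases. The sketch below is a hedged illustration only: the thesaurus entries, the three sources and the function names are invented, and it should not be read as the strategy discussed later in this book.

```python
import re

# Hypothetical thesaurus entries; a real system would need an editor-built one.
SYNONYMS = {"bull": {"ox", "steer"}}
BROADER = {"bull": {"domestic animal"}}

# Invented sources for illustration.
sources = {
    "statute": "The owner of a domestic animal is liable for damage it causes.",
    "case A": "The steer broke through the fence and injured a hiker.",
    "case B": "The tenant failed to pay rent for the cottage.",
}


def expand(term):
    # The request becomes the original term OR its synonyms OR broader phrases.
    return {term} | SYNONYMS.get(term, set()) | BROADER.get(term, set())


def search(term, expanded=True):
    alternatives = expand(term) if expanded else {term}
    hits = []
    for name, text in sources.items():
        lowered = text.lower()
        # Whole-word (or whole-phrase) match for any of the alternatives.
        if any(re.search(r"\b" + re.escape(t) + r"\b", lowered)
               for t in alternatives):
            hits.append(name)
    return hits


print(search("bull", expanded=False))  # []
print(search("bull"))                  # ['statute', 'case A']
```

A case referring to the animal only by name, like "Ferdinand", would still be missed; no thesaurus replaces the background knowledge of the user.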

The feedback given through browsing in the preliminary retrieved sources may alter the lawyer's understanding of this problem. He may want to rephrase his search request in order to reflect this change. (It may even become necessary to go back to earlier stages of the decision process - this is one of the dynamic aspects understated by our model.) The new search request may retrieve additional sources, which once again may deepen the lawyer's understanding and lead to a rephrasing of the search request.

This iterative nature of the retrieval process ought to be reflected by the design of computerized systems, offering the possibility of modifying earlier search requests and using a former request as part of a new request.
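What it means to use a former request as part of a new one may be sketched as a search session that numbers its requests and keeps their result sets. The class `SearchSession`, its methods and the small inverted file are assumptions made for this illustration, not features of a specific system.

```python
class SearchSession:
    """Keeps every request's result set so later requests can reuse it."""

    def __init__(self, inverted_file):
        self.inverted_file = inverted_file  # word -> set of document ids
        self.history = []                   # result sets, in request order

    def _store(self, result):
        self.history.append(set(result))
        return len(self.history)            # request number, starting at 1

    def search(self, word):
        return self._store(self.inverted_file.get(word, set()))

    def combine(self, first, operator, second):
        a, b = self.history[first - 1], self.history[second - 1]
        if operator == "AND":
            return self._store(a & b)
        if operator == "OR":
            return self._store(a | b)
        if operator == "NOT":
            return self._store(a - b)
        raise ValueError(f"unknown operator: {operator}")

    def result(self, number):
        return sorted(self.history[number - 1])


# Invented inverted file for illustration.
session = SearchSession({
    "bull": {1, 4},
    "ox": {2},
    "hiker": {1, 2, 3},
})
r1 = session.search("bull")            # request 1
r2 = session.search("ox")              # request 2
r3 = session.combine(r1, "OR", r2)     # request 3: widen the first request
r4 = session.search("hiker")           # request 4
r5 = session.combine(r3, "AND", r4)    # request 5: narrow with a new term
print(session.result(r5))              # [1, 2]
```

Because earlier result sets are retained, the lawyer can rephrase a request after browsing without starting the search from scratch.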

Also, another source for further search requests may be identified. In retrieving a preliminary set of legal sources, the lawyer may be confronted with the usual problems of interpretation. A certain term is used, and the lawyer is not quite sure of its meaning. A new search request is formulated in order to find other sources discussing the meaning of this term. Or a citation to a case is found in a case of possible relevance - and the lawyer uses this citation as a base for a new search request, retrieving the cited case.

In at least these two ways the retrieval process is iterative. The retrieved sources may deepen and amend the understanding of the legal problem, and their interpretation may spark off new problems of their own.

Whether this iterative nature of the retrieval process is revealed by a practical example, will depend upon a number of pragmatic factors. For instance, a simple legal problem may have such an obvious solution that the lawyer simply retrieves the decisive legal source (for instance a section of the statute) and does not bother to map the borderlines of the norms applied. This is often the case in public administration, where a great number of trivial legal problems are decided as a matter of routine. Also, the situation in which the lawyer finds himself is of great importance: again, public administration offers an illustration of a situation where the user is often faced with a time pressure which makes him settle for a sufficient rather than a satisfactory justification for his decision.

In principle, the merry-go-round of the iterative retrieval process is halted by one of two causes. The ideal cause would be that the lawyer is satisfied that he has nothing more of substance to gain from retrieving further legal sources. The retrieved set contains arguments for constructing a set of norms applicable to the problem at hand. The less ideal (but, perhaps, more common) cause is simply that the lawyer cannot afford to use more resources on research: the trial lawyer has run out of time, the cost of further research exceeds what the client is willing to pay, etc. In both cases, the lawyer then moves on to further stages of the decision process.

In these observations are buried some important problems. Are there standards for legal research? Is the lawyer free to determine what is sufficient, or are there obligations upon the lawyer to conduct a search which is "appropriate" according to some enforceable standards?

2.6 Interpretation: Relations between sources and norms

2.6.1 Relevancy of sources - weight of arguments

The goal of the retrieval process is to identify "relevant" sources. A source is "relevant" when it is possible to derive from that source at least one argument which can be used in constructing a legal norm applicable to the case at hand. This gives a rather general idea of what "relevant" may mean - and perhaps generally does mean - when used to qualify a legal source.

We need, however, to be quite specific at this point. In our terminology, we distinguish clearly between the sources on one hand, and the information contained in these sources on the other hand. The need for this distinction is emphasized by computer technology: while computer systems may easily keep track of words and phrases in a document, they cannot today "understand" the text as this term is commonly used. In a way the distinction between sources and norms therefore becomes a distinction between what we currently can and cannot do.

Lawyers are not confronted with this limitation. Therefore the distinction has less importance when applied to their work. A lawyer may find it straightforward to describe his reasoning by maintaining that "section so and so of the statute applies to the case", while it would be more exact in our terminology to say "arguments may be derived from section so and so, of which a norm, applicable to this case, may be constructed". There is no real difference between these two statements, but our subject makes us sensitive to the distinctions simplified by our terminology.

Thus a clear distinction is maintained between sources and norms. To combine these two different elements, we have constructed an auxiliary semantic element called "argument". The justification for introducing this link between the sources and the norms, is that we think it clarifies some aspects of the interpretation of legal norms.

Firstly, it becomes more easily understood that one source may yield more than one argument. A court decision may, for instance, contain a number of arguments useful for constructing the norms applicable to the case at hand.

Secondly, it makes it easier to point out that one source may yield arguments for more than one norm.

Thirdly, it makes it easier to introduce the concept of "weight", which is closely related to our concept of relevance.

In this way we have created a model for which the legal sources are input. The lawyer interprets these sources, and derives arguments from them. These arguments are fitted together into one or more legal norms applicable to the case at hand.

Initially, we stated that a legal source was "relevant" if an argument could be derived from the source which was used to construct an applicable legal norm. This definition, however, is hardly satisfactory in the context of this book, simply because the performance of information retrieval systems is generally measured by their ability to retrieve "relevant" sources.

Rather often the lawyer will find a source over which he will ponder, at last deciding that it addresses another point than that of interest in the case at hand, and consequently no argument is derived from this source. Such a decision is by no means trivial. But as no argument is derived, the source will be deemed "non-relevant" by the definition suggested above. The retrieval of this source is then held against the system as a performance failure.

Obviously this would not be appropriate. If the interpretation of the source in respect to the case at hand were a matter of legal argument, the lawyer would probably be grateful to have had the opportunity to make this assessment himself. He would have felt uncomfortable if these elusive decisions were made by the retrieval system. It would seem easy to argue that the concept of "relevance", when applied to retrieval systems in measuring their performance, should be slightly more generous than what was indicated initially.

For this purpose - and stressing that the validity of our argument is limited to the context of retrieval systems - we shall suggest a different and slightly more elaborate definition of relevance:

A legal source is relevant if:

  • (1) The argument of the user would have been different if the user did not have any knowledge of the source, ie at least one argument must be derived from the source; or
  • (2) legal meta-norms require that the user considers whether the source belongs to category (1); or
  • (3) the user himself deems it appropriate to consider whether the source belongs to category (1).

The focal point of the definition - category (1) - is the same as the concept of relevance introduced initially. One will note that the actual contribution of the source to the argument may be slight - it is sufficient that an argument, perhaps of negligible weight (cfr below), is derived. But from this point of departure, two additional categories have been included. Firstly, those instances in which legal meta-norms demand that the possible relevancy of certain sources should be considered. An example may be that the authority clause of a statute should be examined if the case rests upon a regulation based on this statute. Secondly, those instances in which the user himself finds it appropriate to consider a legal source. Admittedly, this would make the relevance concept somewhat subjective. On the other hand, it would be less than satisfactory if one maintains that a retrieval system is malfunctioning when finding sources which the user inspects with interest before discarding them.
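The three categories may be read as a simple disjunction, which the following sketch makes explicit. The class `Assessment`, its field names and the four sample judgments are invented for the illustration. The `precision` function (the share of retrieved sources that are relevant) is a common performance measure that we assume here; it is not taken from the definition itself.

```python
from dataclasses import dataclass


@dataclass
class Assessment:
    """One user's judgment of one retrieved source (hypothetical fields)."""
    argument_derived: bool = False        # category (1)
    required_by_meta_norm: bool = False   # category (2)
    user_deems_appropriate: bool = False  # category (3)


def is_relevant(a: Assessment) -> bool:
    # Any one of the three categories suffices.
    return (a.argument_derived
            or a.required_by_meta_norm
            or a.user_deems_appropriate)


def precision(assessments) -> float:
    # Share of retrieved sources that are relevant; 0.0 for an empty set.
    if not assessments:
        return 0.0
    return sum(is_relevant(a) for a in assessments) / len(assessments)


retrieved = [
    Assessment(argument_derived=True),
    Assessment(user_deems_appropriate=True),  # inspected, then discarded
    Assessment(required_by_meta_norm=True),   # e.g. an authority clause
    Assessment(),                             # of no interest at all
]

narrow = sum(a.argument_derived for a in retrieved) / len(retrieved)
print(narrow)                # 0.25
print(precision(retrieved))  # 0.75
```

Under the narrow initial definition, only one of the four retrieved sources counts in favour of the system; under the more generous definition, three do.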

In a review of Bing/Harvold 1977, Tapper (1977:11) queries the category (1) of this definition:

"It seems to be premised that some source material is available, since otherwise no provisional decision could be made. If so, it appears from the definition that further material to the same effect must be irrelevant, because, being to the same effect, it would not alter the decision."

The criticism addressed a prior formulation of category (1), where the amendment of a result was made a criterion, rather than the contribution to a legal argument. The rephrasing of the definition should meet this objection.

One should also note that the definition is used for only the relevance of legal sources. Obviously the lawyer will bring knowledge from other sources into the overall solution of a problem, not least his general background of legal knowledge.

Thus, we have given a definition of relevance. The definition is mainly concerned with content relevance (categories (1) and (2)), but has an element of subjective relevance (category (3)).

It should be noted as well that the relevance concept is binary - a source is either relevant or not. This is in contrast to many general uses of the concept of relevance - one is prone to say that a source is more or less relevant, measuring relevance in degrees.

Once more the explanation of our definition is the use which we are making of the concept. We are characterizing legal sources, which are texts and that only. In order to measure relevance by degrees, a rather thorough interpretation has to take place. In order to keep the concept as simple and generous as possible, and thus to facilitate its operative use in respect to computerized systems, we keep the relevance concept binary.

By finding a source relevant, one has said just something rather trivial of that source: from the source are derived arguments which are either used to construct the applicable legal norms, or which were considered appropriate in such use.

For a discussion of legal reasoning, this may not be sufficient. In such cases the need will arise to characterize further the relation between a source and the applicable legal norms.

We suggest that this, in our model, is accommodated by the arguments. The arguments are, as stated above, semantic entities linking the sources and the norms. Arguments are associated with a weight. This weight is determined by a number of factors, for instance the rank of the legal source, the age and the similarity of a prior court decision, the reputation of an author etc. A number of factors will determine the weight of an argument, and the relative weight will indicate the influence which this argument will have in constructing the legal norm.

It may be maintained that to say "from a relevant legal source was derived an argument with negligible weight" is a roundabout way of saying "the source had negligible relevance". We think, however, that this is justified by the use made of the concepts in this book.

Experiments have demonstrated that lawyers often disagree on the "relevance" of a certain source of law; an infamous example being the Joint American Bar Foundation and IBM project mentioned below in part II. This lack of consistency is not restricted to lawyers (cfr Saracevic 1968:116-129).

Our definition will not remove such uncertainty: the definition includes an element of subjectivity. Nevertheless, the distinction between relevance and weight may reduce disagreement. Even when lawyers disagree on the more difficult question of what relative weight the derived arguments may have, a consensus may be found on whether the source itself should be considered relevant or not.

2.6.2 Words and uncertainty

In this context, we have no intention of giving a summary of the doctrine of legal interpretation. The process of interpretation is governed by legal meta-norms, and is the subject of an extensive literature as well as of any lawyer's training. No general discussion will therefore be offered beyond the one implied by the model introduced above.

We shall, however, offer some observations on the interpretation of words and phrases occurring in natural language texts - mainly because text retrieval systems, which are our main concern in this book, operate by retrieving texts through identifying the words contained in them. Some comments on the interpretation of words as such may therefore be useful in respect to later discussions.

Words occurring in natural language texts are necessarily vague, and often ambiguous. This is also true when the natural language text is as carefully drafted as the text of a statute.

One reason is the way in which the interpretation of terms is molded by their context. Our language does not consist, as a rule, of well-defined elements. The words are rather like nodes in a semantic-associative network. The interpretation of the words is influenced by the total context as well as by the background knowledge of the interpreter. A well-known example illustrating this aspect of language is the different meanings implied by the word "man" when combined with another word (cfr Rommetveit 1972:64):

"man" - "animal"
"man" - "woman"
"man" - "boy"
"man" - "son"

Another reason is that the words are vague. A classical observation is contained in rhetorical questions like "When does a copse become a forest?" and "When does a shack become a house?". Another example is the term "book", which - for instance - may occur in a VAT statute, stating that "books" are excepted from VAT (as actually is the case in Norway). It then becomes necessary to distinguish between "books" and a publication not being a "book". In many cases this is a trivial distinction, offering no problems to the lawyer. But in certain cases there is doubt - when does a collection of loose leaves in a ring binder become a "book", when does a voluminous magazine become a "book" etc.

To resolve such vagueness, the lawyer has initially to look for the general use of the term "book" in society. The vagueness is to be resolved outside the legal system, as part of the language rather than of the law.

In this way, it is easy to demonstrate that the problem of interpretation of legal sources may not be separated from the more general problem of natural language itself. This also characterizes text retrieval systems used for legal purposes: some of the problems encountered are not caused by the legal nature of the documents in the data base, but by the properties of natural language. But this is also an insufficient explanation of the problems of interpretation, as will be obvious to any lawyer. It is not sufficient to look for the normal use of a word for resolving its legal interpretation.

One technique commonly used in statutory language is to introduce definitions of a word. Such legal definitions obviously will take precedence over the common use of the word, and perhaps create a divergence of the interpretation of the word as part of a legal and non-legal context. For the word "book" it would, for instance, be reasonable to adopt the UNESCO definition of a "book" as a non-periodical publication of at least 49 pages. This may exclude some publications commonly described as books (for instance certain children's books), but would, if introduced in our fictitious VAT statute, remove the vagueness and take precedence over interpretations implied by the common use of the word.

Explicit definitions play an important role in statutory language. As the defined terms do not by themselves signal that they are used in a defined meaning, it becomes important to the lawyer to identify possible definitions relevant to the statutory clauses under interpretation. Actually algorithms for the identification of definitions have been devised, and they seem to work well with respect to English statutory language.
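To give an impression of what such identification may involve, the sketch below searches for one common drafting pattern: a quoted term followed by a defining verb. It is not the algorithm referred to above. The pattern, the function name and the sample clause are assumptions made for the illustration, and real statutory language uses many more forms of definition.

```python
import re

# Hypothetical drafting pattern: a quoted term followed by a defining verb.
DEFINITION = re.compile(
    r'"(?P<term>[^"]+)"\s+(?:means|includes|shall mean)\s+'
    r'(?P<definition>[^.;]+)',
    re.IGNORECASE,
)


def find_definitions(text):
    # Returns a mapping from each defined term to its defining phrase.
    return {m.group("term").lower(): m.group("definition").strip()
            for m in DEFINITION.finditer(text)}


# Invented clause, modelled on the fictitious VAT statute discussed above.
clause = ('In this Act "book" means a non-periodical printed publication; '
          '"periodical" includes any magazine or newspaper.')

for term, definition in find_definitions(clause).items():
    print(term, "->", definition)
```

Once the defined terms of a statute are known, each later occurrence of "book" in that statute can be flagged for the lawyer as a term used in a defined meaning.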

The explicit definitions are, however, only one of several ways in which the terms used in describing legal norms in the sources are made less vague. A common situation would be that a case is brought before the court, in which the judge has to take a stand on the interpretation of a certain vague term - for instance "book". In making his decision, the judge creates a new legal source which is then injected into the legal system. The law changes in a minute detail - no longer may the interpretation of the word "book" be based solely on the common use, as the case has added a specific interpretation to the term. As time goes by, these additional legal sources may well give the term "book" a specific interpretation which removes the legal meaning from the every-day use of the term.

Such additional material, which may determine the interpretation of a word in a statute, is not signaled by that word itself. Again, the lawyer must be aware of this possibility, and make the necessary research to establish if such a legal meaning of the terms has developed. It is also obvious that text retrieval systems are extremely well suited to cope with this type of legal research, giving fast access to all sources in the data bases using the word or phrase in question.

A vagueness of a different order may, however, also be associated with the words of legal language. And certain words in the legal language are indicators of discretionary decisions which have to be made.

Good examples may be difficult to give in general, and dissociated from a specific, national system. But we may be served by the word "culpable", found, for instance, in a statutory clause on torts. This word is then a reference to a set of legal sub-norms, which will decide if a person's actions are to be qualified as "culpable". This set of sub-norms is not specified explicitly in the statute, and - indeed - not specified by any (primary) legal source. There is, however, such a set of sub-norms, and from prior cases, text books etc, the lawyer will gain information which will allow him to construct the set and make his discretionary decision.

Obviously this is something different both from the task of resolving the natural vagueness of a term, and finding the sources giving legal definitions or describing prior interpretations of a term. The vagueness here is not associated with language, but with law.

Again, the word "culpable" by itself does not signal that it is, in fact, a discretionary flag. This the lawyer has to clarify for himself. Again, he may be served by a text retrieval system. But the help given by the system this time is of a more indirect nature - the lawyer may find sources discussing the discretionary decision implied by the term, but the decision itself will have to be made over again in each case, employing the sub-norms that the lawyer considers appropriate.

The discussion of discretionary norms is obviously very sketchy. More detailed discussions may be found in for instance Bing 1980.

2.6.3 Harmonization

Through the interpretation, arguments are selected from the legal source in order to arrive at the legal norm relevant to the case. During this process, the lawyer may discover that two or more legal sources contain arguments for diverging or even conflicting legal norms. There are, for instance, two cases that seem to disagree on the interpretation of a statutory clause. The legal sources (the statutory clause combined with one of the two cases) may serve as the basis of two diverging norms. In such instances, the lawyer may look to the weights of the arguments associated with each source, and resolve the divergence when constructing the applicable legal norms.

But from time to time it is not possible or desirable to integrate the resolving of the divergence in the process of interpretation. The lawyer is forced to conclude that two norms, equally applicable but in conflict, are the result of the interpretation. In such a case, harmonization of the legal norms themselves is necessary.

A curious example may be found in the quite complex Norwegian legislation governing the sale and consumption of alcoholic beverages. Sects 14 and 21 of the Spirits Act of April 5th, 1927, state that alcoholic beverages are not to be sold to persons under the age of 18. An older statute of May 31st, 1900 No 5 sect 23, states, however, that an exception may be made when the beverage is served as a refreshment to a meal or when travelling. It is difficult to avoid a conflict between legal norms based on these sections. The conflict is resolved by using the principle of lex posterior derogat priori, ie the norm derived from the most recent of the statutes is given priority, cfr letter of the Ministry of Justice July 26th, 1974 (1975/74 E TS/AV).

There exist meta-norms governing this sort of harmonization. Some of them are commonly known as maxims, as for instance the lex posterior-principle mentioned above, and the other classical principles lex specialis and lex superior. But even these are quite vague, and the meta-norms governing harmonization have not as a whole been very well analyzed - at least not in Norwegian theory.

Most of the harmonization is based on a ranking of types of legal sources. A legal norm derived from a source of higher rank is given priority in a case of conflict. Hierarchies of types of legal sources usually place the Constitution on the top, proceeding through statutes enacted by the parliament down to case law and regulatory law. The details of the ranking will certainly be relative to the legal system, and even in regard to one legal system, the ranking may be relative to different user groups (for example, judges may differ from civil servants with respect to ministerial regulations).
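
The maxims and the ranking of source types may be read as a small priority procedure. The following Python sketch is only an illustration under strong assumptions: the order in which the maxims are tried (lex superior, then lex specialis, then lex posterior), the numeric ranks, and the names `Norm`, `RANK` and `harmonize` are hypothetical. They are not taken from any doctrine, which leaves these meta-norms quite vague.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical ranking of source types; a higher number means a higher rank.
RANK = {"constitution": 3, "statute": 2, "regulation": 1}


@dataclass(frozen=True)
class Norm:
    label: str
    source_type: str       # key into RANK
    enacted: date
    special: bool = False  # True if treated as the more specific rule


def harmonize(a: Norm, b: Norm) -> Norm:
    """Return the norm given priority. The order of the tests is an assumption."""
    # lex superior: the higher-ranked type of source prevails
    if RANK[a.source_type] != RANK[b.source_type]:
        return a if RANK[a.source_type] > RANK[b.source_type] else b
    # lex specialis: the more specific rule prevails
    if a.special != b.special:
        return a if a.special else b
    # lex posterior: the more recent rule prevails
    return a if a.enacted >= b.enacted else b


# The two statutes of the alcohol example, both of the same rank and
# (by assumption) neither marked as the more specific rule.
spirits_act = Norm("Spirits Act 1927, sect 14 and 21", "statute", date(1927, 4, 5))
older_statute = Norm("Statute of 1900 No 5, sect 23", "statute", date(1900, 5, 31))

print(harmonize(spirits_act, older_statute).label)
```

With both norms of equal rank and specificity, only the lex posterior test discriminates, which corresponds to the resolution cited in the example. A real conflict would rarely be settled this mechanically; the sketch shows the structure of the maxims, not their actual weight.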

Several times we have used the phrase "conflict of norms" without defining this concept. What is actually a conflict of norms, depends to a great extent on the nature of norms. For our purposes, it is sufficient to use the phrase "conflict of norms" as a characteristic; a discussion may be found in Eckhoff 1971:270-305 and Sundby 1974:278-281.

In the process of harmonization (and also in assigning weights to arguments), the relative rank of different types of legal sources is essential. Actually the relative rank is quite a controversial matter, though the relations between the Constitution, statutes and regulatory law are well established. Trying to relate the decisions of a first instance court and legal literature would, however, be something quite different. Our prior discussion of arguments as links

[Page 35 ]



between sources and norms will also show that it may be rather difficult to determine from which legal source a norm is actually derived, when the norm is constructed from arguments derived from a variety of sources. In many ways, one may demonstrate that the problems of harmonization are - perhaps justifiably - simplified by traditional theory, based on examples rather than on a general theory.

One should make quite clear that the "rank" of a type of legal source is a normative question. Even if a type of source has a high rank, it may be of little practical use. The Constitution offers an obvious example: though one would agree that the Constitution is given the supreme rank, one rarely finds the solution to the legal problems of a general practitioner in this source.

There may, however, be some interrelationship between the normative rank and the practical utility or importance of a certain source. To be of practical importance, the source obviously has to be available. A good information system is therefore a prerequisite for making a source important. And on the lower rungs of the hierarchy, one may expect the rank to be influenced by the practical importance of a source. An improvement of the information system serving a type of sources will have an impact on its utility. One should also be aware of the impact which it may have on the normative aspect, for instance on the relative ranking of sources.

[Page 36 ]



2.7 The normative interval

We have discussed some aspects of the process of interpreting legal sources, the harmonization of these sources as part of the interpretation, and the harmonization of possible divergences in the norms arrived at through the interpretation. The aspects discussed are mainly those of interest in relation to legal information systems. But even within these constraints, the outlines of the process should, we think, have emerged.

The process leaves the lawyer with a set of legal norms which are not in internal conflict. These norms are applicable to the case at hand - and possibly the solution of the legal problem may emerge as the simple combination of the norms and the facts of the case. This probably would be an adequate way of considering simple legal problems.

In general, however, we feel that this would be too simple a description. The final legal norms are defined by the sources found relevant, interpreted and harmonized according to the legal meta-norms. As we have stressed a number of times, these meta-norms are vague. Also, the legal norms themselves may leave room for uncertainty - norms requiring "discretion" may often permit different solutions. It is generally accepted that the result may not be well-defined, but rather be regarded as a normative interval - or, as Stone puts it (1968:192 and 320): there still exists choice within the "leeways left by the guides of law".

Some main causes for such "leeway" may be listed. (1) Reasonable disagreement on what sources may be qualified as legal sources. (2) Reasonable disagreement on the interpretation of legal sources, causing reasonable disagreement on which norms are applicable. (3) Reasonable disagreement on the priority between diverging norms. (4) Reasonable disagreement in discretionary decisions.

[Page 37 ]


Instances of such "reasonable disagreement" may explain why two or more reputable lawyers may arrive at different decisions even though agreeing on the proven facts of the case. For more on causes for uncertainty, cfr Bing 1982c.

The lawyer, confronted with a normative interval, cannot arrive at a decision without selecting one of the possible norms within the interval. This selection is, obviously, not a random process, but is of an extra-legal character. An important aspect of the legal decision process is, in our opinion, that it incorporates the use of extra-legal elements. This aspect is usually trivial, as the leeways leave us with quite a narrow interval and a small room for choice. But in controversial questions where there is little support for the arguments in legal sources and, consequently, the normative interval is sufficiently broad to contain distinct alternatives, this aspect may attract the attention of the public.

It would be outside the scope of this book to dwell on the nature of the selection process. Our important point is that even when all relevant legal sources have been consulted, and their interpretation has been made, it may still be necessary to exercise further judgement before arriving at a decision. As we have pointed out, this selection process is extra-legal - which implies that the lawyer may take into consideration elements of a non-legal nature, like elements of a political, moral or ethical character.

Though these elements are extra-legal, it should be clearly understood that their use is part of the legal decision process, and is governed by legal meta-norms like the rest of this process. Even within the leeways of law, there are limits to what extra-legal aspects may be taken into consideration. Reviewing a decision of a public agency, a court may, for instance, find that the decision is void exactly for this reason. But once again, we must admit that these meta-norms are vague and not well analysed. (For a different view than the one sketched here, cfr Kilian 1974:228.)

One extra-legal element often decisive at this stage is client loyalty. Obviously, the candidate to be selected will be the norm within the interval most

[Page 38 ]



beneficial to the client. This obviously will be permitted also by the meta-norms; lawyers are after all expected to argue the case of their clients.

In our context, it may also be of interest to note that for the selection of a norm within the interval, the legal information retrieval system can give little aid to the lawyer. This is part of the decision process which presumes an understanding of the individual case, an understanding one can hardly expect to retrieve from a data base of legal sources.

[Page 39 ]



2.8 The result - and feedback from the result

The lawyer has now arrived at the relevant norms, which, combined with the facts of the case at hand, give the result.

Up till now our model has not allowed for the effect of the result playing any part in the reasoning of the lawyer (excluding the indirect way in which this may have determined the burden of proof, choice within the normative interval, etc). The effect of the result - or the sensibility of the decision - is, however, a legal source in its own right, but of a very different nature from those included in the data base. In our model, this legal source is represented as a feedback loop. The lawyer evaluates the result according to his extra-legal value norms. His evaluation is then taken into consideration as a legal source, and as such will influence the normative interval. This feedback may cause a revision of the result. An iterative process is initiated, which comes to a stop only when the feedback can no longer influence the normative interval.
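
The feedback loop may be pictured as an iteration that runs until a fixed point is reached. The Python sketch below is a toy illustration only: the representation of the normative interval as a set of labels, and the functions `decide`, `select` and `feedback`, are hypothetical names introduced here. The model itself does not prescribe how the lawyer selects or evaluates.

```python
from typing import Callable, FrozenSet

# The candidate norms currently within the normative interval.
Interval = FrozenSet[str]


def decide(interval: Interval,
           select: Callable[[Interval], str],
           feedback: Callable[[str, Interval], Interval],
           max_rounds: int = 100) -> str:
    """Iterate until evaluating the result no longer changes the interval."""
    for _ in range(max_rounds):
        result = select(interval)              # choice within the interval
        revised = feedback(result, interval)   # evaluation used as a legal source
        if revised == interval:                # feedback has no further influence
            return result
        interval = revised
    raise RuntimeError("feedback did not settle")


# Hypothetical extra-legal preference ordering, most preferred first.
PREFERENCE = ["exception", "rule as written"]


def select(interval: Interval) -> str:
    return next(norm for norm in PREFERENCE if norm in interval)


def feedback(result: str, interval: Interval) -> Interval:
    # In this toy case, an unsatisfactory result widens the interval.
    if result == "rule as written":
        return interval | {"exception"}
    return interval


print(decide(frozenset({"rule as written"}), select, feedback))
```

In the first round only the rule as written is available; its evaluation widens the interval, and in the second round the evaluation changes nothing, so the iteration stops.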

As an example of the reasoning in which this feedback plays a part, we may cite a Norwegian superior court case (Rt 1965:607). At the time of the decision, paternity was decided with different burdens of proof according to whether the child was born in or out of wedlock. The case concerned a girl who, on discovering that she was pregnant, told a boy she had been with, that he was the father. The boy married the girl, but both kept living with their parents. The child was born in wedlock, as the marital status of the parents at the moment of birth is decisive. After the birth, it became apparent that there existed a number of possible fathers, and the probability of the husband actually being the father was not higher than the probability of one of the other candidates. According to the burden of proof in wedlock, he nevertheless was the father. Had the case been decided in respect to a child born out of wedlock, he would not have been held the

[Page 40 ]


father. One judge of the minority felt that this result was so "unjust" in the concrete context of the case, that he maintained that this by itself could justify an exception, in spite of the clearly formulated statutory rule. Without taking a stand on the justice of the issue, this would seem a quite explicit example of feedback from the result widening the normative interval to include new alternative norms - according to the argument of the minority.

It may be noted that "the evaluation of the result" is a legal source of a qualitatively different nature compared to the other types of legal sources. Several times we have stressed that the majority of other legal sources is of a syntactic nature, having the form of texts. The "evaluation of the result" is a semantic argument based on the judgement of the lawyer.

Another characteristic of the "evaluation of the result" may be noted: the evaluation can only be carried out in respect of a given case with a given result. It does not exist independently of the case, but is generated by the case itself. Consequently, it cannot be "retrieved" from any data base established prior to the case.

We believe the feedback loop represented by the "evaluation of the result" to be of great interest, also with respect to legal information systems. It is one of the best illustrations of the iterative nature of legal decision processes - a characteristic which most automated decision processes have not yet been able to represent, or have ignored.

It may be appropriate to stress once more that our model is just a model of the elements in a legal decision process, describing the relation between these elements - but not the psychological process itself. Certainly a lawyer may have selected his result at a far earlier stage of the process than represented in our model. He will select what Soelberg (1967:23 and 26) has named a "choice candidate", and in practice the lawyer's activities may be wholly concerned with justifying that a decision based on his choice candidate does not violate the meta-norms governing the legal decision process. Cfr also Eckhoff

[Page 41 ]


1971:29 who discusses what comes first in the mind of a judge: the result or the reasoning to justify the result.

The legal decision process is a formal process; it is governed by meta-norms to a greater extent than decision processes within other areas. The meta-norms are admittedly vague and leave room for disagreement even between lawyers, but nevertheless they are valid. If a lawyer violates these meta-norms - or, rather, if such a violation is found to have taken place - his decision may be declared void. The meta-norms mostly demand that a case be decided on what has happened, and according to legal norms that were in existence at that time. It is a retrospective process in which the lawyer concentrates most of the time on a situation from the past.

The evaluation of the result is an escape from this retrospective perspective. Legal meta-norms allow the lawyer at this stage of the process to look at the present and even into the future, asking, "What will be the effect of my decision?"

The evaluation of the result represents a safety valve in the legal decision process. Through this the lawyer may make his decision oriented more towards the consequences.

An illustration of this was noted in one of the surveys of the legal information systems of the Social Security Administration (Bing/Harvold 1973:228). The Social Security Administration is mostly staffed with civil servants without a formal legal education. They have more often been trained in medical or welfare environments, and are used to thinking in terms of the future of the clients. A medical decision, for instance, is oriented towards its consequences: if the patient gets better, the prescribed cure was "correct" - even when selected by intuition and in disagreement with the opinion of authorities. Not so in a legal decision process: Even if the decision makes the client happy, it is invalid if in conflict with a statute.

The link between the "medical" and the "legal" decision processes is the "evaluation of the result". We found that this legal source was given higher rank in the Social Security Administration relative to

[Page 42 ]



arguments derived from sources like regulatory law. And as the rank of the "evaluation of the result" was upgraded, the gap between the "medical" and "legal" reasoning was reduced.

These reservations may be of some importance to legal information systems. Better legal information systems may - as we have mentioned earlier - result in a displacement of the established relative ranks of the legal sources. If a better legal retrieval system were established in the Social Security Administration, this might result in increasing the rank of conventional types of legal sources, like administrative decisions. It would correspond to a reduced relative weight given to arguments derived from the type "evaluation of the result". Since this seems at present to serve as some sort of bridge between two different types of decision processes, ie the "medical" and the "legal", the reduced weight might break that bridge. The possible consequences of introducing a "better" legal information system with its resulting dynamics, should not be underestimated.

[Page 43 ]



2.9 Standards for legal information retrieval

The legal decision process has been described above as a formal process, indicating that to a large extent it is governed by meta-norms. We have also discussed the retrieval process itself, and shown that this is an iterative process brought to a halt either by the lawyer being confident that all possible relevant sources have been retrieved, or by exhausting the resources (in terms of time or money) allocated for this activity.

But since the legal decision process is formal, we might expect to find standards enforced within the legal system which would require a minimum quality of the legal research. If this minimum of research were not observed, we would expect the legal system to direct some sort of sanctions against the responsible lawyer.

The traditional doctrine of error juris may be examined from this point of view. Error juris - ignorance of the law - is a rather traditional aspect of the legal doctrine. Error juris may, obviously, have a number of causes - the most common probably being unsound reasoning and careless interpretation of legal sources. We will not discuss these causes, which are related to the dysfunction of lawyers rather than to the malfunction of the information systems.

The malfunctions of the retrieval system may - for the purpose of this discussion - be divided into two broad categories.

Above we have tried to make the point that the information situation of the lawyer is determined by the availability factors screening the total volume of legal sources. These factors are of widely different nature, and one would expect them to be different for each individual lawyer. In this case, the set of sources available to one lawyer will not be identical to the set of sources available at the same costs to another. This creates some uncertainty in the legal decision processes. Given the same set of proven

[Page 44 ]



facts and the same cost frame for legal research, two different lawyers may very well come up with two different sets of relevant legal sources due to their different information situation. This cause for retrieving a different set of legal sources may be termed the cause of availability failure.

There is, however, also another cause related to the information system. In order to retrieve the sources, we have seen that the lawyer has to formulate a search request accepted by the information system. Though the request may be quite to the point, there is always the possibility of failure in the search mechanism of the information system. An obvious indexing term is, for instance, omitted, or a word in a text is misspelled, making the text retrieval system unable to identify the desired source. The differences in the retrieval tools may therefore also cause a different set of relevant sources to be identified. This cause for two lawyers retrieving different sets of sources may be termed retrieval failure.
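
The distinction between the two causes can be stated in terms of sets. The following Python sketch is a minimal illustration; the function name `classify_misses` and the source labels are hypothetical, and it presumes that the set of relevant sources is known, which in practice is exactly what the lawyer does not know.

```python
from typing import Dict, Set


def classify_misses(relevant: Set[str],
                    available: Set[str],
                    retrieved: Set[str]) -> Dict[str, Set[str]]:
    """Split the relevant sources a lawyer did not find by cause of failure."""
    missed = relevant - retrieved
    return {
        # Not in any system the lawyer can reach within his cost frame.
        "availability_failure": missed - available,
        # Within reach, but the search mechanism did not identify it.
        "retrieval_failure": missed & available,
    }


# Hypothetical labels for three relevant sources.
relevant = {"source A", "source B", "source C"}
available = {"source A", "source B"}   # source C is outside the lawyer's systems
retrieved = {"source A"}               # source B was, for instance, poorly indexed

print(classify_misses(relevant, available, retrieved))
```

Two lawyers with different `available` sets, or with the same set but different retrieval tools, would obtain different results from the same facts and the same cost frame.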

It is, of course, rather common that two lawyers disagree on a legal issue. And obviously, if a problem is taken to court, at least one of them will not gain the consent of the judge. If a client has followed the advice of the lawyer, but later learns that the opinion on which the advice was based was incorrect - ie a different decision is reached by a court or established in another authoritative way - this constitutes proof of error juris. The client may then bring a suit against the lawyer, suing him for the loss which the incorrect advice has caused.

The interesting thing about this situation, is that the court will have to discuss the advice offered by the lawyer, and - if the error juris is related to the information system - consider if the lawyer is liable for the properties of the information systems he employed, or the way he employed these systems.

These situations do not occur too frequently in practice, and the case law is less than abundant. Below, however, two situations are outlined. Because the discussion is limited to certain jurisdictions, the description is somewhat sketchy.

The first situation is that of a civil servant making

[Page 45 ]



a decision pursuant to some law or regulation. This may then cause a loss to a private citizen. Subsequently, it is demonstrated that the civil servant has made an error juris. If the citizen then sues the authority, it must be considered if the authority is liable.

This situation would be interesting in our context if the cause of the incorrect opinion is related to the information system. Norwegian case law does not, however, contain such examples. The doctrine maintains that a public authority is more easily made responsible for losses caused by incorrect procedures than those caused by incorrect interpretation or application of the law (Eckhoff 1978:603-604).

The more interesting aspect in our context is, however, the rather strong criticism voiced in legal literature, where it is stressed that the risks of causing losses through the exercise of public authority should rest with the state rather than with those citizens subjected to incorrect application of the law. This point of view is then used to argue for strict liability for such losses, which certainly would also embrace malfunctions etc of the information systems used by public authorities (Frihagen 1977:150).

In our terms, this would imply that a public authority whose error juris results in a citizen suffering a loss would be liable to this citizen if the cause of the error juris was related to the information systems employed. This would cover both availability failure and retrieval failure. This may also be viewed as an encouragement for public authorities to make the necessary resources available to maintain their information systems with satisfactory performance.

The second situation is that of a lawyer in private practice offering advice to a client, and the client suffering a loss when acting on this advice. One may consider whether the lawyer is liable.

In general this question is answered affirmatively by Scandinavian doctrine - but it is a rather theoretical general rule. No strict liability is considered, and the modifications argued by the literature are many - the culpability is related to the nature of

[Page 46 ]



the advice, the fee, the way the lawyer has phrased his advice, etc.

This relativity is illustrated by a Danish case (UfR 1945:205). In brief, the case concerned a seller of a property containing an attic apartment. After the conclusion of the sale, the construction of this apartment was shown to be in conflict with current regulations. The lawyer advised the seller to reduce the price as he should be able to claim compensation from the person who had sold the property to him. Subsequently, the lawyer changed his mind. And the client sued the lawyer for the loss.

In this case the court held that it was not unreasonable for the lawyer to answer the question of the client on the basis of his background knowledge, without further legal research. In fact, the court maintained that the lawyer was under no obligation to use an information system - and this is considered to be a rather general conclusion by Danish theory (Kruse 1976:51). In our terms, the cause of the error juris was availability failure: no resources were assigned to legal research.

A further example may be a Swedish case (NJA 1957:89). In this case, a lawyer representing a limited company arranged for the loan with another of his clients, staking the property of the company as security. The clause of the security bond limited this to the property within a certain municipality. Later, the company moved out of that municipality. In the bankruptcy proceedings the security was found invalid due to the cited clause. The client sued the lawyer, who maintained that his incorrect advice was caused by a reasonable interpretation of an ambiguous section in a statute of 1883.

The court admitted that this clause was indeed ambiguous, but pointed out that this ambiguity had been solved by the superior court in a decision of 1904. The lawyer argued further that this decision was not cited in the foot-note to the statutory section in the privately published compilation of statutes in force. The court granted this, but referred to a text book of 1927, which contained the appropriate citation, and which was considered to be rather basic to this area of law.

[Page 47 ]



In our context, it is interesting to notice that the argument actually tends to discuss the problems of retrieval failure. The compilation of statutes in force did not have the necessary citation - and consequently malfunctioned as an index to major cases. The court argued contrary to this, by referring to another book also functioning as an index, and which the court held he ought to have consulted. The court did not address the problem of what the outcome would have been if this second "index" had not been available, but left open the possibility that in such a case, the lawyer would not have been liable. Traditional Norwegian theory also takes this stand (Platou 1915:81).

Actually, this is rather interesting for the providers of computerized services, offering a more efficient retrieval system. The reasoning above may well imply that the very existence of a more efficient retrieval system also creates some sort of obligation or inducement to use such a system.

In addition there is another aspect that has not been illustrated by these examples. In the Swedish case, the lawyer was found liable. The cause would seem partly to be the omission in the privately published compilation of statutes in force. It would have been rather interesting if the lawyer in his turn had sued the publishers, a case which would have illuminated the liability of the providers of a legal information service.

Indeed, this problem is addressed by Mehl (1979) and the draft recommendation which is the basis of the Council of Europe Recommendation No R (83) 3 on "The protection of users of computerised legal information services" (the chapter on liability is not included in the final recommendation). This activity takes computerized legal information services as its point of departure. As the discussion in this section has demonstrated, the legal question is closely related to the traditional doctrine of error juris. It is discussed in the literature with a reference to traditional information services, and probably case law more to the point may be found than those Scandinavian examples cited here.

This problem of the liability of legal information services is only one of several issues which

[Page 48 ]


make up "the law of legal information services". These are discussed more comprehensively by Moon/Oskamp 1982.

[Page 49 ]


3 COMMUNICATION PROCESSES

3.1 Introduction

A "communication process" is a system of activities which supports the transport of information from one person (the sender) to another person (the receiver). This process is performed by some mechanism known as an "information system" (cfr Goffman 1970:726). Parts of this process are not only activities and persons, but also those objects carrying the information (for instance scraps of paper or magnetic media) and those tools used for sorting and selecting relevant information (like a text retrieval system). Initially, a communication process may be pictured as a link between the sender and the receiver, this link being the information system itself.

In a legal communication process, the information will be of a legal nature, the receiver will be a lawyer, while the sender will be a combination of the producer of a legal source (the parliament, a certain court) and an editor maintaining a certain information service.

Obviously, the formal description of communication processes may easily be elaborated, and definitions introduced to eliminate the ambiguities in the description above. What, for instance, is indicated by "legal" information - are the newspaper stories of a sensational murder trial "legal information" in our sense of the word? In Bing 1982a such questions have been addressed in some detail. It is, however, our opinion that for the purpose of this book, a quite informal and general description is sufficient. When returning to the issues which will be discussed in more detail, we will try to be as specific as necessary. But the communication process is in this chapter mainly used as a common perspective on certain related issues, and these issues, rather than the overall process, will be brought into focus.

[Page 50 ]


The perspective of the communication process is useful, we believe, to stress that legal information systems are actually media for communication between two parties. In respect to computerized systems, there is, perhaps, a tendency to concentrate too much on the power of the tool for retrieving documents, and too little on its properties as a distribution network. In many manual systems, this is solved less satisfactorily. Firstly, in a manual system one will have to plan ahead by subscribing to or purchasing an information service before the need to use this service actually arises. Secondly, a local maintenance of the manual data base is necessary, and also often time-consuming - as anybody struggling to keep a loose-leaf service up to date will confirm. Computerized systems have centralized maintenance of the data bases, and as long as the user has access to the service, he may also have access to any of the included specialized data bases when the need arises.

We believe, therefore, that the perspective of the communication process may bring into focus some characteristics of computerized systems not easily identified when concentrating solely on the information system proper.

In addition, we believe that the editorial work is important. Many properties of information systems are not determined by the programs of the computer, but rather by the decisions of the editor. This concerns such an all-important aspect as the coverage of the data base, and also that of document design: How is the source presented to the user of a legal information system? In the perspective of the communication process, the editor is clearly visible as the representative of the sender, and his influence on the quality and efficiency of the information system therefore may be easier to evaluate.

In this discussion of the communication process, we shall concentrate on the editorial process, and on some aspects of the use of information systems. The information system as such will be discussed only as a part of the communication process, since we shall have ample opportunity later in this book to concentrate on exactly that element (Part II).

[Page 51 ]


3.2 The editorial process: Data base selection

3.2.1 Introduction

A legal source is defined by meta-norms of the particular jurisdiction in question. Most legal sources have a written form, though some types, like customary law, exist only as opinions, attitudes or behaviour in society. In our discussion, we have restricted ourselves to those sources which have a written form.

These are obviously important types, like statutes, regulatory law, court and administrative decisions, legal literature etc. These are also the types which are subject to dissemination by legal information systems.

Before initiating the communication of the law, the legal source itself must be created. This is by no means a trivial process, of which major legislative efforts or court cases bear ample proof. But in our context, we presume that a legal source is brought into existence according to the rules of the jurisdiction in question.

The result is a text. It may be interesting to note that at some stage this text must be written down. A judge may, for instance, draft his decision, or dictate the decision to a tape recorder, and his secretary will type it - perhaps for further correction.

The simple registration of the text is, of course, a trivial element in the creation of legal sources. In our context, however, it may be worth-while to notice this activity.

Firstly, the author of the source gives it the form which in this book will be called authentic or original form. We need a certain term for describing the original form of a statute, a court decision etc, because the source is rarely communicated to the user in exactly that form. The editor will, even when making the full text available, supplement this text

[Page 52 ]



with certain editorial additions, like abstracts, indexing terms, titles, footnotes etc, and may also make other editorial amendments like replacing the names of the parties in a court decision with letters, in order to protect the privacy of the involved persons.

When speaking of computerized systems, one frequently uses the term "full text systems". Unfortunately this term is an ambiguous one. It often indicates that the documents are in an authentic form. But even systems which are restricted to, for instance, documenting abstracts of court decisions, are termed "full text systems" in order to indicate the method used to index the abstracts - ie making all words retrievable. In these cases, the term says only that the computerized system is a type of text retrieval system (in contrast to, for instance, a type of data base management system), and says nothing of the relation between the documents and the sources. Due to this ambiguity in terminology, we do not use the phrase "full text" in a technical sense. To describe the relation between documents and sources, we shall use terms like "authentic form", "abstracts" etc, and to describe the type of computerized system we shall use "text retrieval systems" for systems allowing retrieval in principle by any word or combination of words from the actual documents.
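
What "making all words retrievable" amounts to can be shown with a small inverted index. The Python sketch below is a minimal illustration and not a description of any operational system; the names `build_index` and `search_all`, the tokenization rule and the two sample documents are assumptions made here. It indexes one document in authentic form and one abstract in the same way, which is why the indexing method says nothing about the relation between document and source.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map every word occurring in any document to the documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in re.findall(r"\w+", text.lower()):
            index[word].add(doc_id)
    return index


def search_all(index, *words):
    """Return the documents containing every one of the given words (Boolean AND)."""
    postings = [index.get(word.lower(), set()) for word in words]
    return set.intersection(*postings) if postings else set()


# Hypothetical documents: the same decision in authentic form and as an abstract.
documents = {
    "case-1/authentic": "The seller is liable for the defect in the attic apartment",
    "case-1/abstract": "Seller liable for defect",
}
index = build_index(documents)

print(sorted(search_all(index, "seller", "defect")))  # both documents
print(sorted(search_all(index, "attic")))             # only the authentic form
```

Both documents are retrievable by any of their own words, but a word that occurs only in the authentic form cannot retrieve the abstract.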

Secondly, if we want to make the communication process more effective, we may - looking forward - introduce appropriate word processing equipment at this stage, and then exploit the same computerized text for later stages in the communication process, like printing the decision or updating a computerized data base.

When authorized, the legal source is introduced into the legal system as a new statute, a new court decision, a new monograph etc. When introduced in this way, the source becomes available for lawyers, and arguments from the source may be employed in making future legal decisions. The constant creation of new sources is the main reason for the evolution of the legal system.

The constant creation of new sources feeds an

[Page 53 ]



accumulating collection of sources representing the total volume of sources of the legal system in question. This is only a theoretical collection; no legal system maintains a physical total collection of sources. It would even be very difficult to identify all sources. They would include a great number of trivial sources, for instance decisions by public agencies in those legal systems qualifying such decisions as legal sources. They would also include dated, but still "valid" sources. And the actual validity of a source may be difficult to establish. Even if a new statute replaces an old, the old statute may retain some interest as part of the legislative history of the new.

Without dwelling further on the practical and theoretical problems of identifying clearly the total volume of sources of legal systems, we observe that it is possible to envision a collection of them. Not all of these sources are communicated onwards to the user - and in general, there is little reason to regret this. A selection of the sources is made for the purpose of including such sources in an information system, thereby making the sources available to the user.

An area of great interest is the processes governing the selection of sources for documentation in an information system. It is perhaps a trivial, but nevertheless a basic observation, that one cannot retrieve from an information system a document that is not part of the system. One of the basic properties of any legal information system is determined at this stage: What will be included in the data base?

3.2.2 Selection

An editor has a responsibility in respect to a certain information system. The system will rarely be of a completely general nature, setting out to document any type of legal source. Typically, the information system will have some degree of specialization - and this specialization will be called the documentation area.

The documentation area qualifies which types of legal

[Page 54 ]



sources are to be considered for inclusion in the information system. Within the documentation area, a further selection very often takes place, limiting the documented sources to those considered as being of "importance" or of "special interest", and discarding those regarded as trivial.

In general, the selection for publication will therefore be a two-stage process. Firstly, the editor determines whether the source lies within the documentation area. Secondly, he determines whether it qualifies according to additional, supplementary criteria.
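The two-stage process may be sketched as follows. This is a minimal illustration under stated assumptions: the field `editor_rating` is a hypothetical stand-in for the editor's expert judgement, and the type and discipline labels are invented for the example.

```python
from dataclasses import dataclass


@dataclass
class Source:
    identifier: str
    source_type: str     # e.g. "appeal court decision"
    discipline: str      # e.g. "tax law"
    editor_rating: int   # hypothetical stand-in for the editor's judgement


def in_documentation_area(source, types, disciplines):
    """Stage one: does the source lie within the documentation area?"""
    return source.source_type in types and source.discipline in disciplines


def select(sources, types, disciplines, min_rating):
    """Stage two is applied only to sources that passed stage one."""
    stage_one = [s for s in sources if in_documentation_area(s, types, disciplines)]
    return [s for s in stage_one if s.editor_rating >= min_rating]


sources = [
    Source("A", "appeal court decision", "tax law", 4),
    Source("B", "appeal court decision", "tax law", 1),
    Source("C", "appeal court decision", "labour law", 5),
    Source("D", "agency decision", "tax law", 5),
]
selected = select(sources, {"appeal court decision"}, {"tax law"}, min_rating=3)
print([s.identifier for s in selected])
```

Only source A is selected: C and D fall outside the documentation area, however highly rated, and B is discarded at the second stage.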

We believe that most traditional systems have a rather high degree of specialization; a case reporter documenting decisions from a certain court, a compilation of statutes obviously being limited to a certain type of legal sources, a tax law journal being limited to a certain branch of the law. It is interesting to note that computerized systems very often have more inclusive documentation areas, documenting any court decision, all papers of legal journals etc. The reason is that a number of pragmatic restraints have been removed in these systems - like the simple problem of sheer volume in a paper-based system. A computerized service is, however, generally divided into "data bases" or "libraries". These very often correspond closely to the documentation areas covered by conventional services.

For selection, two main categories of criteria can be identified; they may be called "systematic" and "evaluative" criteria.

There are at least three sub-groups of systematic criteria:

(1) Criteria relating to the type of legal source to be documented, for instance only supreme court decisions, regulations etc. Within this group, it is also convenient to include criteria more oriented towards the producer than the type of legal sources, for instance a case reporter for a certain appeal court, whose decisions are strictly of the same type as those of other appeal courts.
(2) Criteria based on legal disciplines, for

[Page 55 ]


instance a tax law journal, or the Computer Law Case Reporter.
(3) Criteria based on procedural characteristics of the legal source. Obviously, such criteria will vary considerably from one jurisdiction to another. A common example is that a court, when a case is thought unusually difficult or important, may sit in plenum, or be supplemented by additional judges. This will then also be a characteristic of the source which may be utilized when selecting cases for publication. In this category we may also include the use of temporal or spatial delimiters, for instance when a new computerized service decides to document cases only back to a certain date.

One will note that all these types of criteria make it possible to develop rather strict rules for selection. The type (or producer) of a legal source is usually self-evident. The procedural characteristics are made explicit by the source itself. To identify a certain source within a certain area of law may, of course, be more difficult. A case settling a labour dispute between performing artists may be labour law, but also intellectual property law. Though borderlines like this will always exist, the degree of discretion exercised in qualifying the source in respect to a legal discipline is quite limited, and creates no great uncertainty.

These systematic criteria are generally used in determining the documentation area of a legal information system. It is, of course, important for the user to have a clear understanding of what area of law is covered by a system - and the lack of vagueness in these systematic criteria makes them well suited for the purpose of determining the documentation area.

Evaluative criteria are generally described in terms like "important", "central" etc. There are a number of different phrases used to describe the evaluation to be made by the editor, but they would all seem to boil down to the criterion being the expert opinion of the editor. In respect to court decisions, all lawyers would understand what type of evaluation will take place if "important" decisions are to be

[Page 56 ]



selected for publication, though they might not necessarily agree with the actual choices of the editor.

As we have mentioned above, the selection of the sources for documentation is an essential part of the communication process, and will decide the performance level of the information system. In spite of this, little is known of the selection itself, whether or how often it is controversial, etc.

A special study has been conducted in Norway of three manual systems with almost identical documentation areas (the decisions of the Social Security Court). Two of these systems were in-house precedent systems, in the Social Security Court itself and in the National Insurance Institution; the third was a published case reporter. The study disclosed a surprisingly high degree of independence between the systems (cfr fig 3/1), indicating that the evaluative selection criteria may actually have been quite different even when formulated in a similar way by the editors, and even within a small and specialized area of law.
Fig 3/1 - Relations between the data bases of three different information systems with identical documentation areas - actual and per cent figures, cfr Bing 1982b:213.

SSC = Social Security Court (precedent file)
NII = National Insurance Institution (precedent file)
PCR = Published case reports

In calculating percentages, the total number of different decisions in the three systems is used as a basis.
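The calculation behind a figure of this kind can be sketched as follows. The decision identifiers below are invented for illustration and are not the figures of the study; only the method, using the total number of different decisions as the basis for the percentages, follows the description above.

```python
from collections import Counter


def overlap_table(systems):
    """Count the decisions held by each combination of systems.

    systems maps a system name to the set of decision ids it documents.
    Percentages use the number of different decisions in all systems as basis.
    """
    all_decisions = set().union(*systems.values())
    combinations = Counter(
        frozenset(name for name, held in systems.items() if decision in held)
        for decision in all_decisions
    )
    total = len(all_decisions)
    return {combo: (n, 100.0 * n / total) for combo, n in combinations.items()}


# Hypothetical contents of the three data bases.
systems = {
    "SSC": {1, 2, 3, 4},
    "NII": {3, 4, 5, 6},
    "PCR": {4, 6, 7, 8, 9, 10},
}
table = overlap_table(systems)
for combo, (n, pct) in sorted(table.items(), key=lambda item: sorted(item[0])):
    print("+".join(sorted(combo)), n, f"{pct:.0f}%")
```

A small share of decisions held by all three systems, as in this invented example, is what a high degree of independence between the systems would look like.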

[Page 57 ]


The studies made seem to indicate that selection based on evaluative criteria is - perhaps not surprisingly - quite pliable, being molded by outside interests. In a Norwegian study, it was observed that the publication of a government report on a certain legal issue seemed to prompt a higher selection of cases related to this issue for publication. Other examples of probable outside influence on the selection are given as well (Bing 1982b:236).

The fact that such procedures may be influenced in this way, is by itself no reason for alarm or distrust. But it may be a reminder that this crucial part of the communication process should be given a critical look from time to time.

The selection criteria may be perceived as governing the selection process. But they do not by themselves determine which sources will be included in the information system. One must also look to the selection procedures. In most cases, the selection is made by one or a few editors working together - but there are, of course, other possibilities. The creation of legal sources is to a certain extent decentralized, and especially in respect to the decisions of the lower courts, it may be appropriate to have decentralized selection procedures, for instance using local judges or lawyers.

Many aspects of the selection may be explained only by the selection procedures. As a curious example, a local Norwegian court supplied no material to the relevant case reporter; the explanation turned out to be that the court did not have an adequate photocopier, and therefore had trouble making the necessary copies of its decisions (Bråthen 1978:10). Though curious, the example illustrates how vulnerable an information system is to practical circumstances of this kind.

Finally, it should be mentioned that there are a number of other factors determining the final selection. In conventional paper-based systems, consideration must be given to volume, mail rates etc, not to speak of the economic consequences of, for instance, adding another page to a publication with a great number of subscribers. Computerized systems change many of these considerations - and one of the major advantages of computerized systems is that the sheer

[Page 58 ]



volume of the text is by itself a problem of little importance, liberating the information service from the chains of paper. On the other hand, such systems do, of course, introduce new practical considerations.

3.2.3 The data base

(1) Representativity

We have already introduced the notion of a collection of sources. From these sources, a number is selected for the information system. These are then edited according to the needs of the information system. In the next chapter we shall take a closer look at this editing, which we have labelled "document design".

At this stage it is sufficient to observe that through the selection, a data base is created. This data base is composed of documents, and each document represents one legal source. There is a complete correspondence between the collection of sources and the data base both on the aggregated and the individual level.

In order to discuss legal information retrieval systems, we need some concepts for describing the relation between the collection of sources and the data base. This is generally the distinction between unpublished and published sources. Publication plays, of course, a vital role in any legal system - for instance as a condition for enforcing a statutory provision (publicatio legis). It is, however, not this legal aspect of publication that deserves the prime interest, but the more practical side.

In passing, it may be worth noting that this distinction is of greater interest in respect to legal than to many other types of information systems. For instance, the status of a court decision as a legal source does not generally rest on its publication, but on its existence (with a few exceptions, as mentioned in respect to the Californian situation, above at sect 2.4). In a situation with competing legal information systems, and with the possibility of actually searching the extensive files of a court or

[Page 59 ]



a public agency, it is important to know to what extent one may find all the interesting sources through one information system.

In such a case it would be useful to be able to measure to what extent, or how adequately, a certain data base reflects the corresponding collection of sources. This is a need which one will not have with respect to, for instance, chemical literature: One would be interested to learn how great a fraction of published papers is available through one information service, but hardly how great a fraction of all papers on chemistry actually gets published. Rejects would rarely be of any interest to the learned chemist, while "rejects" for case reporters may interest a lawyer, and may prove of great practical use if cited before a court.

In order to compare the two collections, we shall have to determine a scale. This is attained by referring to the concept of a documentation area. This is generally determined by rather strict rules. In applying these rules to the total volume of legal sources it will be possible to qualify a sub-set corresponding to the data base of the information system.

The basic standard is the publication ratio, ie the relation between the published and total number of sources. If all legal sources within the documentation area are published, this ratio equals 1 - and the closer to 1, the higher the fraction of published sources.
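As a minimal sketch, the publication ratio can be computed as below. The counts are hypothetical and the function name `publication_ratio` is ours.

```python
def publication_ratio(published, total):
    """Fraction of the sources within the documentation area that are published."""
    if total <= 0:
        raise ValueError("the documentation area must contain at least one source")
    if not 0 <= published <= total:
        raise ValueError("published must lie between 0 and total")
    return published / total


# Hypothetical counts within one documentation area.
print(publication_ratio(1000, 1000))  # every source published
print(publication_ratio(30, 1000))    # a small fraction published
```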

For a number of important sources, this ratio in practice is 1 - for instance within the documentation area of statutory law. On the other hand, one frequently finds quite low fractions. For appeal court decisions in Norway, this ratio is 0.03 (Bråthen 1978:8), which is not uncommon in a European jurisdiction. When introducing computerized systems in Great Britain, Butterworth Telepublishing discovered that less than half of the reported cases were actually published - which is a surprisingly low publication ratio for the home of the doctrine of stare decisis.

The publication ratio has, however, one serious disadvantage. Obviously the legal interest is not equal in all the sources within the documentation

[Page 60 ]



area. Case law and similar sources often contain trivial or redundant material. Probably the editor has taken this into account. If every second decision is trivial or redundant, a publication ratio of 0.5 would be sufficient to give an excellent service within the chosen documentation area.

Though this boost of the representativity from editorial assistance is obvious, it is difficult to measure. The reason is simply that we would have to base such a measure on the information of the collection of sources, compared to the information in the data base - and as "information" (in the semantic sense) cannot be measured, neither can fractions of information.

It is, however, possible in theory to envision such a measure. The information in those sources documented by the information system is measured against the information in all sources. The resulting fraction will be called representativity.

Actually, the development of the concept of representativity may be made somewhat more formal than the sketchy argument above. In using a theory suggested by Schreider (1965) one may imagine a "semantic thesaurus", and relate the concept of representativity to changes in this device. For our purposes, however, the informal introduction above may be sufficient.

Though representativity cannot be measured, its relation to the publication ratio can be determined. Imagine that we select at random half the sources within the documentation area for publication. We would expect to get on average half the information contained in the total number of sources. The selection, however, is not the result of a random process, but rather the product of an editor's expert choice. An editor would start by peeling away the trivial and redundant material, thus bringing representativity a bit above the publication ratio. Though the representativity would never become 1 until everything were published - even the most trivial scrap of a source - it would increase faster - as illustrated by fig 3/2.

[Page 61 ]



Fig 3/2 - Relation between publication ratio and representativity.

This relation illustrates that representativity reaches higher values than the corresponding publication ratio. It does, however, also illustrate that representativity is linked to the publication ratio, and at very low publication ratios (like the one cited for Norwegian appeal court cases, 0.03) the representativity cannot be very high.
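The shape of this relation can be reproduced in a toy model. The model rests on an assumption that the text itself rejects as a practical matter, namely that each source can be given a numerical information weight; the weights below are invented, and the editor is assumed to pick the weightiest sources first.

```python
def representativity_curve(weights):
    """Return (publication ratio, representativity) after each selected source.

    Assumes, purely for illustration, that every source has a positive
    information weight and that the editor selects the weightiest first.
    """
    ordered = sorted(weights, reverse=True)
    total = sum(ordered)
    points = []
    cumulative = 0.0
    for k, weight in enumerate(ordered, start=1):
        cumulative += weight
        points.append((k / len(ordered), cumulative / total))
    return points


# Hypothetical information weights of six sources, two of them trivial.
weights = [1, 8, 2, 5, 1, 3]
curve = representativity_curve(weights)
for ratio, representativity in curve:
    print(f"publication ratio {ratio:.2f} -> representativity {representativity:.2f}")
```

In this model the representativity lies above the publication ratio throughout, and equals 1 only when every source, however trivial, has been published. Under random selection the expected representativity would simply equal the publication ratio.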

We shall not, of course, venture to suggest what would be an appropriate publication ratio (and, indirectly, neither an appropriate representativity). It would perhaps be fair to say that the publication ratio should be as high as possible, and only be reduced after assessing the consequences. Obviously, the type of legal sources within the documentation area, the fraction of trivial sources, the cost of documentation, etc are examples of factors which will determine what is thought to be an appropriate publication ratio. Assessments can be made only within a certain documentation area, and in respect to a certain information system - and should be oriented towards identifying the justification of a publication ratio less than 1.

The relation to the publication ratio gives to the theoretical concept of representativity an empirical basis which makes it more attractive for the analysis of information systems.

The concept of coverage may seem closely related to the concept of representativity. Coverage

[Page 62 ]


is a measure of how many of the "relevant" documents are, on average, contained in the data base of an information system. The measurement of coverage is clearly related to the situation of the user, and we shall return to this concept below. At this point it may be sufficient to mention that representativity (and publication ratio) tells something of the relation between the data base and the corresponding collection of legal sources, while the concept of coverage tells something of the relation between the data base of one system compared to the combined data bases of all available systems.

(2) Objectivity

Another measure for the quality of the data base is objectivity. Often the legal sources are generated when two parties are in conflict. A typical example (taken from Simitis 1974) would be labour disputes, where the two parties are the organizations of the employer and the employee. Such disputes would in some cases be settled by a court (often a specialized labour court), and the decisions of this court would then become legal sources for solving future conflicts.

By objectivity one indicates the ideal aim that the data base should give a fair reflection of the interests of the parties involved. If, for instance, only decisions in favour of the employees were selected for documentation, the data base would become biased compared to the corresponding collection of sources.

One may note that objectivity has no absolute relation to representativity. If representativity is high, there may be a presumption of high objectivity. But obviously the selection may be biased in such a way that a rather high publication ratio is achieved with a low objectivity, implying that the majority of those sources excluded from the information system concern the interest of one of the parties.

It is perhaps misleading to describe objectivity as a "measure", as obviously it cannot be measured directly and it will remain a subject of argument, ie an objective for the information system.

[Page 63 ]



Referring to our previous discussion of selection, one of the points made concerned the selection according to evaluative criteria. We maintained that research indicates that these criteria are pliable, and respond to outside interests. This led us to suggest that one should be critical towards selections governed mainly by evaluative criteria. One of the critical aspects ought to be precautions against a loss of objectivity in documentation areas where parties exercise a strong interest - like that of labour law.

Such precautions may include changing the criteria from an evaluative to a more systematic nature. This possibility is not always open, as an adequate systematic criterion may not be available.

Another possibility would be the development of pluralistic selection procedures. By involving several parties in the selection, each of them representing different interests, one may create some guarantee for adequate objectivity.

An example of such a reform has been found in respect to the case reporter of the Norwegian Social Security Court. Initially, this was edited by the National Insurance Institution. It was felt, however, that as the National Insurance Institution pleaded its point of view before the Social Security Court, it was itself a party to the conflict, and that its selection might be biased by this fact. Consequently, at the end of 1975 the selection procedure was changed, introducing an independent editor who would accept suggestions for publication from both the Social Security Court and the National Insurance Institution, and supplement these with his own additions. When the decisions selected for publication after this amendment were examined, it was found that 42 per cent were initially suggested by the National Insurance Institution, and 41 per cent by the Social Security Court, while only 5 per cent were suggested initially by both institutions. (For 11 per cent the institution making the initial suggestion was not known, and the editor played a very minor part in the initial suggestions.) Cfr Bing 1982b:145-160. This may be considered a successful reform. Obviously, there

[Page 64 ]


are differences of opinion between the two institutions, and as these are reflected in the selection, a higher objectivity is achieved.

The traditional way of securing a sufficiently high degree of objectivity would be to have competing information systems. The objectivity of each system may be moderate, but their combination would achieve a high objectivity.

Also, in this case, the key may be pluralism - but not restricted to pluralistic selection procedures. The difference between several competing systems, and one system with pluralistic selection procedures should, perhaps, not be exaggerated with respect to the objectivity. In small jurisdictions, however, there may not be a sufficient market for more than one information system within one documentation area. This may make the establishment of pluralistic selection procedures an attractive alternative.

A change from several competing information systems to one coordinated system may create a concern for the continued objectivity. And with respect to the introduction of the German computerized information system JURIS, launched by the Ministry of Justice, such a concern was voiced. The project director at that time maintained, for instance (Fabry 1973:6):

"A comprehensive public system would, owing to access through a certain monopolistic position, necessarily gain importance as a means of influencing opinion to a high degree. It would, therefore, be intolerable if such a system were to reduce the plurality of stored legal points of view or produce a distorted picture of their quantitative representation."

This strong emphasis on objectivity has a background in the German discussion, where especially the producers of "conventional" information services - the German legal publishers - had been very outspoken on how the freedom of the press as a fundamental principle in a democratic society produced limits for the growth of the JURIS system. In a report, the Verlegervereinigung Rechtsinformatik (1975:19) maintains:

[Page 65 ]



"For one thing, the displacement of the specialized professional journals would remove precisely the information medium that is most essential for the plurality of opinion in the legal field."

As some sort of conclusion, we may indicate that this abstract reasoning has led us to believe that two characteristics of an information service may create a concern for its objectivity: low publication ratio and selection according to evaluative criteria. If the publication ratio is low, this means that a great number of sources are excluded from the information system, and this enhances the importance of the selection. And if this selection is governed by evaluative criteria, the selection will probably be influenced by outside interests that are not always apparent to the editor himself. In such cases, one should consider measures to guarantee the objectivity of the system. We therefore suggest pluralistic selection procedures.

(3) Updating

The concepts of documentation area, publication ratio, representativity and objectivity all characterize qualities of the data base. They do not, however, describe adequately the dynamic features of the data base, though these features are essential. The legal system itself is in a state of continuous development, and it is often important to know some of the dynamic properties of the information system in order to carry out an evaluation.

One concept, closely related to that of publication ratio, is the growth of the data bases, measured for instance in the annual increase of documents. If the publication ratio is stable, the growth will vary according to the additional number of new legal sources within the documentation area. This relationship between growth and publication ratio may be quite common, but in addition there is another tendency. For a conventional publication, a more or less fixed number of pages is often available for the annual growth within the documentation area. Consequently, the annual growth of the data base is quite stable, but the annual publication rate will vary: In years

[Page 66 ]



of a relatively high production of legal sources, the publication rate will be relatively low; and the opposite will be true in years with a relatively low production of legal sources. An actual example is discussed in Bing 1982b:179.
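The arithmetic of this second tendency is simple and may be sketched as follows. The figures are hypothetical: a conventional publication is assumed to have room for a fixed number of documents per year, while the production of sources varies.

```python
def annual_publication_ratio(fixed_growth, new_sources):
    """Publication ratio for one year when the annual growth is fixed."""
    if new_sources <= 0:
        raise ValueError("new_sources must be positive")
    return min(1.0, fixed_growth / new_sources)


# Hypothetical production of legal sources in three successive years.
production = {"year 1": 400, "year 2": 800, "year 3": 500}
fixed_growth = 200  # documents the publication has room for each year

ratios = {year: annual_publication_ratio(fixed_growth, n) for year, n in production.items()}
for year, ratio in ratios.items():
    print(year, production[year], ratio)
```

The year with the highest production shows the lowest ratio, although the data base grows by the same number of documents every year.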

Another concept is the updating frequency. This denotes the average interval between the updatings of the data base. Updating will mainly imply the addition of new documents, but may also include the deletion of outdated documents. For information services documenting case law, only the first type of updating is actually necessary. For updating of statutory or regulatory data bases, the deletion of outdated sections may also be found appropriate. The old sections may, however, still be important, at least for a period, when deciding cases originating at a time when these sections were still in force. Therefore, probably only the status of such dated material will be changed, in order to facilitate access to the law in force at any given time.

A time-segmented data base, as indicated above, would reflect the nature of the legal system itself, in which the validity of a statutory clause or similar legal instrument is always linked to dates. However, only a few information systems are able to cope with this. There have been text retrieval systems especially designed with features to cope with time-segmented data bases, one of these being the German TR/1 (Kraemer 1975). Another is the Norwegian SIFT, currently implemented in a prototype version in Oslo and Strasbourg, and which supports "historical browse". The only operational system which has a design favouring a time-segmented data base is, to our knowledge, the French MISTRAL marketed by Honeywell Bull and used, for instance, in the Common Market CELEX system.
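The principle of a time-segmented data base may be sketched as follows. This is a minimal illustration with invented section texts and dates; it makes no claim about how TR/1, SIFT or MISTRAL are actually designed. Outdated versions are not deleted, only given an end date, so that the law in force at any given time remains accessible.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class SectionVersion:
    section: str
    text: str
    in_force_from: date
    in_force_until: Optional[date] = None  # None: still in force


def in_force(versions, section, on):
    """Return the version of a section in force on the given date, if any."""
    for version in versions:
        if version.section != section:
            continue
        started = version.in_force_from <= on
        not_ended = version.in_force_until is None or on < version.in_force_until
        if started and not_ended:
            return version
    return None


# Hypothetical section amended once; the old version keeps its place in the base.
versions = [
    SectionVersion("sect 12", "Notice period: one month.", date(1970, 1, 1), date(1980, 7, 1)),
    SectionVersion("sect 12", "Notice period: three months.", date(1980, 7, 1)),
]
print(in_force(versions, "sect 12", date(1979, 5, 1)).text)
print(in_force(versions, "sect 12", date(1983, 5, 1)).text)
```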

The frequency of updating is, of course, important to the user. The frequency will vary from systems with weekly, monthly or annual updating (this is common in respect to manual systems), to systems with daily or even more frequent updating (as in many computerized systems).

To the user, more important than the updating frequency is the updating response. The updating

[Page 67 ]



response is defined as the average time from the introduction of a new legal source into the legal system, until this source is present as a document in the data base. Obviously, it is hardly of any value to have daily updating if the material going in is already some months old.

An aim for the legal information service would be to keep the updating response at a minimum. In Recommendation No R (83) 3 from the Council of Europe, one of the clauses demands of computerized services that "the frequency of updating must exactly match the working environment", ie the legal system itself.

This will possibly have an impact on the very design of legal information services, which should not include features which prolong unduly the updating response. Such a feature may be an intellectual indexing of documents. If this is found desirable, one should seek solutions which do not have a negative impact on the updating response. One might, for instance, introduce a two-stage updating, in which the raw text of the document is introduced in the first stage, and the elaborations of intellectual indexing in the next.

When measuring the updating response, one is primarily concerned with the documenting of new legal sources. An information service may very well document older sources to some extent - for instance court cases brought to the notice of the editor some time, even years, after the case was decided. For computerized services, it is common practice to introduce a data base comprising a certain number of years, and subsequently supplement the base with older material at the same time as the system is updated by new sources. Measuring the average time between creation and updating too rigidly will make the concept of updating response less useful.
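A possible way of measuring the updating response, while leaving older supplementary material out of the calculation, is sketched below. The cut-off date `service_start` is our assumption of how "new" sources might be delimited, and all dates are invented.

```python
from datetime import date
from statistics import mean


def updating_response(records, service_start):
    """Average number of days from the creation of a source to its presence
    as a document in the data base.

    records holds (created, added) date pairs. Sources created before
    service_start are treated as older supplementary material and ignored.
    """
    lags = [
        (added - created).days
        for created, added in records
        if created >= service_start
    ]
    return mean(lags) if lags else None


# Hypothetical records: two new sources and one old case added years later.
records = [
    (date(1983, 3, 1), date(1983, 3, 11)),
    (date(1983, 4, 1), date(1983, 4, 21)),
    (date(1975, 6, 1), date(1983, 5, 1)),
]
print(updating_response(records, service_start=date(1983, 1, 1)))
```

Had the old case been included, the average would have been dominated by a single delay of several years, which says little about how quickly the service documents new sources.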

[Page 68 ]


3.3 The editorial process: Document design

3.3.1 Documents

Up to now, we have been using the term "document" without defining it. It is a convenient term, and is generally used when talking of computerized systems to denote the unit containing a text etc. In a manual system, a "document" is a reference to a physical object, one or several pages of paper. In respect to computerized systems, we lose this physical reference - and it may be convenient to specify in some more detail our use of the term "document" - though for most purposes, the intuitive and rather vague meaning is sufficient.

Our concern is with computerized legal information systems. These systems will represent legal sources in some way or other. The representation of a source in a system is what we shall call a document. The representation may include the authentic form of the source, or it may be only an abstract or some indexing terms.

Frequently the representation of one source in a system does not come together in one output format. This is typical for manual systems. In a case reporter, a case may be represented by its authentic text and headed by an abstract. But it will also be mentioned in the table of content and in the back-of-the-book index. All these elements, representing the same case, constitute one document in our terminology.

This logical document concept may seem to be a slightly artificial construction. Our choice is, however, governed by a concern with document design, where the efforts of the editor are aimed at presenting a legal source in the information system in a way which is appropriate for the functions of this system. Important basic properties of any information system are determined by the selected document design.

Above we have stated the trivial, but essential observation that a document cannot be retrieved if

[Page 69 ]



not part of the data base - which leads us to attach considerable importance to the representativity and publication ratio. It may be equally trivial, but also equally essential, to observe that the performance of an information system cannot transcend the limitations set by the document design.

In order to determine what is one document, the point of departure is one legal source. From this source are traced all elements representing this source in the information system. This gives a fair working definition of "document".

There are some problems related to this definition, however, but these do not seem severe, and they do not - in our opinion - involve any practical difficulties.

One of these problems may nevertheless be mentioned, as it clarifies the nature of legal sources. One will frequently find that one source cites another source. A court mentions in its decision a prior decision, this prior decision cites a paper, which in turn cites a number of decisions and foreign legislation, etc. Embedded in the last case are representations of other legal sources.

This may be called the "composite character of legal sources": they stick together like Chinese boxes, and you cannot have the biggest box without the smaller ones within.

If one strictly follows the definition suggested above, one will find that when tracing the representation of the cited case, the citation is itself a representation - and should this then be included within our logical concept of a document? This would obviously be inappropriate. The definition could be refined to exclude this small problem, for instance by pointing out that the document includes only such elements of representation as are parts of the editorial scheme, excluding representation caused by activities outside the system. But for practical purposes it is hardly necessary to refine the definition of "document".

Lancaster/Fayen 1973 mark the distinction between an original text and the representation of that text in the system by the use of the terms

[Page 70 ]


"document" and "document surrogate", which correspond exactly to our distinction between "source" and "document". Our deviation in terminology is justified partly by the fact that our use of "document" corresponds fairly well with the use of this term in respect to computerized systems. But we also think that the term "document surrogate" is misleading in respect to typical legal information systems. In such systems, the source is represented in authentic form. It is not a "surrogate" in the sense that there is something more "authentic" around.
The terminology of Lancaster/Fayen may be influenced by their work on bibliographical systems. In such systems, the document typically contains only elements like title, author, a few indexing terms etc, perhaps supplemented by a brief abstract. This may be sufficient for the user to decide that he wants the book or paper represented, but obviously not sufficient for exploiting the information of that paper or book: he has to get hold of the original. In this situation, the term "surrogate" would seem adequate.
Actually, much of the terminology and theory on information systems is marked by its background in bibliographic systems, which have functional properties different from those of typical legal text retrieval systems. Therefore, parallels should be drawn with care.

Within a document, an important distinction is made between those elements which have an origin outside the system, typically the authentic form of the source, and those which have been added to the document by the editorial staff. These additions will typically take the form of indexing terms characterizing the source, or an abstract. Such additions will be the result of intellectual indexing. In computerized systems there may be additions which are made more or less by automatic means - an example might be the enrichment of a document, through the use of an automatic thesaurus, with synonyms of the intellectually selected indexing terms.
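
The logical document concept, and the distinction between elements originating outside the system and elements added by the editor, may be pictured as a small data structure. What follows is only a sketch under our own assumptions: the class names, the field names, the three origin labels and the example case identifier are all hypothetical and do not describe any particular system.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    kind: str      # e.g. "authentic_text", "abstract", "indexing_term"
    content: str
    origin: str    # assumed labels: "source", "editorial" or "automatic"


@dataclass
class Document:
    source_id: str                                  # the legal source represented
    elements: list = field(default_factory=list)    # every element tracing back to it

    def add(self, kind, content, origin):
        self.elements.append(Element(kind, content, origin))

    def by_origin(self, origin):
        """Return the elements with the given origin."""
        return [e for e in self.elements if e.origin == origin]


# Hypothetical case, used for illustration only.
doc = Document("Case 1975-12")
doc.add("authentic_text", "The court finds that the seller is liable.", "source")
doc.add("abstract", "Liability of seller for hidden defects.", "editorial")
doc.add("indexing_term", "sale of goods", "editorial")
doc.add("indexing_term", "purchase", "automatic")   # synonym supplied by a thesaurus

print([e.kind for e in doc.by_origin("editorial")])
```

All elements hang on the one source identifier, which mirrors the working definition above: the document is whatever can be traced back to one legal source within the system.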

Document design invokes the picture of an editor taking a legal source, tailoring it for the need of the information system and adding to it abstracts,

[Page 71 ]



indexing terms etc. The document design always takes the legal source as its point of departure, and this is defined by the meta-norms of the legal system in question. In some jurisdictions there are examples of a form other than the original qualifying as a legal source.

One example may be the ITALGIURE system. Some of the documents in this system represent the decisions of the Corte suprema di cassazione. They are represented in the special form of abstracts - massimes - which are prepared by a special office of the court itself (cfr below in part III for a description of the system). In practice, a lawyer argues on the basis of the massimes rather than on the basis of the original decisions themselves. This may indicate that in the Italian legal system, not only the original decisions, but also the carefully prepared massimes are qualified as legal sources. On the other hand, some are obviously critical of this view and explain the situation by the fact that, as the massimes are so much more easily available, they have replaced the decisions without this being justified in legal meta-norms. For a critical discussion, see Ciampi 1974:711-713 and 721-725.

Another interesting example is given by Berger 1981:159-160. The old German Reichsgericht was careful in drafting its abstracts of the cases in the form of questions. In its internal files, however, the court had given the abstracts a more authoritative form. When the federal court was established in 1950, priority was given to its functions in developing the law. In order to support this function, abstracts were now published like maxims or rulings. In addition, the internal files of the old Reichsgericht were published. This is, perhaps, one of the causes of the "Leitsatzkult" at present characterizing German law, where - in Berger's opinion - "ces sommaires sont souvent traites comme des dispositions legales" [these summaries are often treated as legal provisions] and where "les methodes d'interpretation de leur lois sont appliquees" [the methods of statutory interpretation are applied to them]. Legal meta-norms would seem to have developed to allow the maxims to become some sort of legal source in their own right.

These complications are mentioned only in order to illustrate the fact that the authentic form of the legal source may not always be one form. In the

[Page 72 ]



German situation, one might say that there are two types of legal sources, the court decisions and the maxims as abstracts of these decisions. Both have their authentic form - and in addition, the maxims are a representation of the decision. This is perhaps just one more example of the composite character of legal sources, and is not really very different from a legal text book (which is a legal source with an authentic form) citing and interpreting a certain court decision.

However, if pragmatic circumstances make the user replace a legal source with a text unqualified as such a source, the situation could become problematic. In the critics' view of the Italian situation, what may happen is that the massimes are readily available, and though not a legal source in their own right, they replace the less available court decisions for all practical purposes.

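There may be several aspects of concern in this situation. One of them is what Simitis (1974:32) has called "das Manipulationsdilemma":

"... jede Dokumentation, die sich aus ausgewaehlten und bearbeiteten Texten zusammensetzt, (ist) suspekt. Deshalb sind gegenwaertig nur Dokumentationen akzeptabel, die dank einer Minimisierung der Eingriffe in das vorhandene Material auch eine Minimisierung der Manipulationschancen ermoeglichen. Die Entwicklung von Dokumentationssystemen, die Texte rezipieren, ohne sie zu veraendern, ist die technologische Antwort auf die politische Forderung nach unmanipulierter Information."

[In translation: "... any documentation composed of selected and edited texts (is) suspect. At present, therefore, only documentation is acceptable which, by minimizing the interference with the existing material, also minimizes the opportunities for manipulation. The development of documentation systems which take in texts without changing them is the technological answer to the political demand for unmanipulated information."]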

We shall recognize this concern as the concern for objectivity, this time not the objectivity in selecting the data base, but the objectivity in designing the document. Actually, most concerns relevant to the selection of the data base are, on a lower aggregate level, reflected in the design of documents. It would, for instance, be possible to develop a concept of representativity and publication ratio comparing a legal source and a document representing that source. These issues do not, however, seem as important with respect to document design as with respect to data base selection, and will not be pursued further.

[Page 73 ]



"Document design" invokes, as mentioned, the impression of a pre-existing legal source. There may, however, be some exceptions to this rule. In some public agencies (the example given being the Norwegian Consumer Ombudsman), a decision is made through a correspondence with the parties. These decisions are legal sources, and they are consulted as precedents, at least within the agency in question. There exists, however, no letter summing up the circumstances of the case and concluding with a decision similar to that of a court. In order to represent the sources as documents in the computerized system, a process of not only document design, but rather "document construction" was necessary, cfr Tysland 1979:140-142.

Below, we shall discuss in detail the document design. We shall start by describing three typical forms of documents and their properties. We shall then introduce the major functions of an information system, and compare in turn each type of document to these major functions - in this way indicating their influence on the functional performance of information retrieval systems.

It will be appreciated that the literature relevant to what we have called "document design" is extensive, especially on methods for indexing and abstracting. We shall not, of course, try to summarize this literature adequately, but shall be more concerned with our own perspective, indicated by the process of designing documents for legal information retrieval systems.

3.3.2 Three typical forms of documents

(1) Indexes

Indexes are well-known forms of documents, also from conventional legal information systems - all books or compilations will contain some form of an index.

The literature on indexing is extensive, and includes discussions of different types of indexing systems

[Page 74 ]



and their properties. This wealth of literature will not be summarized here. A few types of indexes will be briefly mentioned with some comments on automatic index production and examinations of indexes in legal information systems.

One often distinguishes between subject indexes and key-term indexes, cfr Seipel 1976:73. In a subject index, the indexing term will refer to those documents which treat the subject characterized by the term, even when the document itself does not contain that exact term. In a key-term index, one will have references only to documents actually containing the same term as the index. Obviously, subject indexes are more ambitious than key-term indexes. Behind the subject index is an understanding of the text of the document, and often a claim that the indexing term is an appropriate characterization of the content of the document. Surprisingly, one will find that indexes are often of the simpler key-term nature rather than the more ambitious subject nature.

Another distinction is between free and defined use of indexing terms. When indexing terms may be selected freely, the indexer selects those terms found most appropriate at the time to characterize the document. As this assessment will be relative to time and persons, inconsistencies will easily occur in the indexing. To reduce such inconsistencies, one may introduce definitions of the indexing vocabulary and the use of indexing terms. These definitions may be introduced in different ways; a common method is the definition of a vocabulary to be used when indexing. Often the vocabulary has definitions of the different terms, and prescriptions for their use such as distinctions between related terms, relations to other terms etc. The indexer will not use terms that are not included in the vocabulary, or use included terms in a different way, without - through prescribed procedures - having revised the indexing vocabulary.

An indexing vocabulary with definitions of the terms is traditionally known as a thesaurus. Such a thesaurus is a tool for indexing, for document design - and is brought into play at a stage prior to the inclusion of the document in the information system. In respect to computerized legal information services, thesauri have come to mean mainly certain aids for retrieving documents. Though these two types of
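
The difference between the two types of index may be illustrated by a small sketch. The two document texts, the document identifiers and the subject terms assigned by the indexer are invented for the purpose, and the matching of single words is a simplifying assumption of ours.

```python
from collections import defaultdict

# Two invented documents; neither contains the word "liability".
documents = {
    "D1": "The seller is liable for hidden defects in the goods sold.",
    "D2": "Damages were awarded because the car delivered was faulty.",
}

# Terms assigned by a (hypothetical) indexer who has read and understood the texts.
assigned_subjects = {
    "D1": ["liability", "defects"],
    "D2": ["liability", "defects"],
}


def key_term_index(docs, terms):
    """Refer only to documents that literally contain the indexing term."""
    index = defaultdict(list)
    for doc_id, text in docs.items():
        words = {w.strip(".,;:").lower() for w in text.split()}
        for term in terms:
            if term in words:
                index[term].append(doc_id)
    return dict(index)


def subject_index(assignments):
    """Refer to every document the indexer judged to treat the subject."""
    index = defaultdict(list)
    for doc_id, terms in assignments.items():
        for term in terms:
            index[term].append(doc_id)
    return dict(index)


key_terms = key_term_index(documents, ["liability", "defects"])
subjects = subject_index(assigned_subjects)
print(key_terms)   # only D1 contains the literal word "defects"
print(subjects)    # both documents are found under both subject terms
```

The key-term index finds only what is literally in the text, while the subject index depends entirely on the indexer's understanding of the documents.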

[Page 75 ]



thesauri may be similar, and though the indexing thesaurus may be used also for assisting retrieval, they should not be confused. Consequently, in this book we shall reserve the term "thesaurus" for those used in assisting retrieval where not otherwise indicated.

A defined indexing vocabulary is, in practice, necessary for constructing a hierarchical index. A hierarchical index has a tree structure, ie in climbing the tree, one moves from specific to more general terms.
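
A hierarchical index of this kind may be sketched as a table in which each term points to its more general term. The four terms below and their relations are merely our illustration, not taken from any actual indexing vocabulary, and the function names are hypothetical.

```python
# Each term of the defined vocabulary points to its broader term;
# the root of the tree has no broader term.
broader = {
    "law of obligations": None,
    "contract": "law of obligations",
    "sale of goods": "contract",
    "hidden defects": "sale of goods",
}


def path_to_root(term, broader):
    """Climb the tree from a specific term to ever more general terms."""
    path = [term]
    while broader[term] is not None:
        term = broader[term]
        path.append(term)
    return path


def outside_vocabulary(terms, broader):
    """Return the terms an indexer may not use without revising the vocabulary."""
    return [t for t in terms if t not in broader]


print(path_to_root("hidden defects", broader))
print(outside_vocabulary(["contract", "warranty"], broader))
```

The second function indicates why a defined vocabulary is needed in practice: a freely chosen term such as "warranty" has no place in the tree until the vocabulary itself has been revised.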

In order to construct such a hierarchical index, it is necessary to have an extensive knowledge of the content of the documents to be indexed, which means the subjects under discussion. Such hierarchical structures may often be challenged, and may become outdated - reflecting an understanding different from the prevailing one.

Typical examples of hierarchical indexes are those systematic tables often used in legal publications as a supplement to a subject index. These are usually rather traditional in their design, and probably do not reflect the current understanding of the legal system in question. But often tradition itself - the experience in mastering a certain structure - is of great importance for the users of the index. So though one may disagree with the relations implied by the structure, one may still find sufficient reason for not changing the structure.

Different from a hierarchical index is a flat index. In such an index, no attempts have been made to specify internal relations between the indexing terms. These are perhaps more common than the hierarchical indexes.

A last type of index with interesting properties is the citation index. As indexing terms, a citation index has identifiers of certain legal sources, like a case citation. Reference is made to other sources related to the indexing term. The reference is a claim that there exists some sort of relation between the two sources of sufficient strength to justify the reference. This relation may be authentic, meaning that there is an explicit citation in the text of at least one of the sources to the other source, or it may be editorial, meaning that the

[Page 76 ]



claim is based on the indexer's understanding of the two sources.
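
A citation index with both kinds of relation may be sketched as follows. The case names are invented, and the representation of a relation as a triple of citing source, cited source and kind is an assumption of ours.

```python
from collections import defaultdict

# (citing source, cited source, kind of relation) - all invented examples.
relations = [
    ("Case C", "Case A", "authentic"),   # explicit citation in the text of Case C
    ("Case D", "Case A", "authentic"),
    ("Case E", "Case A", "editorial"),   # relation claimed by the indexer
    ("Case D", "Case B", "authentic"),
]


def build_citation_index(relations):
    """Use the identifier of the cited source as the indexing term."""
    index = defaultdict(list)
    for citing, cited, kind in relations:
        index[cited].append((citing, kind))
    return dict(index)


def related_sources(index, cited, kind=None):
    """List the sources referring to `cited`, optionally of one kind only."""
    return [c for c, k in index.get(cited, []) if kind is None or k == kind]


citation_index = build_citation_index(relations)
print(related_sources(citation_index, "Case A"))
print(related_sources(citation_index, "Case A", kind="editorial"))
```

Keeping the kind of relation with each reference lets the user separate what the sources themselves say from what the indexer has claimed.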

In Anglo-American law, case citators are widely used in order to identify precedents. One of the well-known American case citators, Shepard's, has actually been implemented as part of both the WESTLAW and LEXIS services.

A possible measure of the depth of indexing is the relative indexing volume (cfr Seipel 1976:7). This is defined as the ratio between the number of indexing terms (n) and the number of indexed documents (m), making the relative indexing volume n/m.
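
As a worked illustration, the measure may be computed as below. We assume here that n counts every assignment of a term to a document, so that n/m becomes the average number of indexing terms per document; the definition above may also be read as counting only the different terms, and the sketch should be adjusted accordingly if that reading is intended. The function name and the figures are invented.

```python
def relative_indexing_volume(terms_per_document):
    """Return n/m, with n the term assignments and m the indexed documents."""
    m = len(terms_per_document)
    if m == 0:
        raise ValueError("no indexed documents")
    # Assumption: each assignment of a term to a document counts once.
    n = sum(len(terms) for terms in terms_per_document.values())
    return n / m


# Invented example: six assignments spread over three documents.
example = {
    "D1": ["contract", "sale of goods", "hidden defects"],
    "D2": ["contract"],
    "D3": ["lease", "termination"],
}
volume = relative_indexing_volume(example)
print(volume)   # 6 / 3 = 2.0
```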

An important practical aspect is how indexes are produced. The traditional method may be called intellectual indexing. An indexer assigns indexing terms to each document according to the rules governing the indexing.

The disadvantages of this method are at least twofold. Firstly, it will take time. This period may cause a delay from the production of the legal source to its inclusion into the data base, prolonging the updating response. Secondly, even quite simple indexing will demand quite a high standard of expertise from the indexer. This is a difficult problem - not least within small jurisdictions, where it may be difficult to offer experienced lawyers conditions attractive enough to make them take on the often routine indexing work. Obviously, the indexing will add extra cost to the provision of the information service, and a commercial service will be able to offer sophisticated indexing only if the market is big enough to pay for this additional cost.

This is a generally acknowledged problem, cfr Campbell 1975 on the situations in the Northern-Irish and Scottish jurisdictions. Cfr also Kornerup 1969, who maintains that the French extensive use of indexes may be related to the organization of their legal education making available a number of well qualified lawyers at the universities.

An alternative to intellectual indexing would be automatic indexing.

[Page 77 ]



In order to make possible such an automatic indexing one must be able to define strict rules for the production of an index on the basis of the text of the documents (or defined parts of the text). Obviously, flat key-term indexes would be easiest to produce in this way, though in using thesauri and other techniques, somewhat more sophisticated indexes might be produced.

Automatic indexing would not imply the two disadvantages mentioned in relation to intellectual indexing, but would presume that the text to be indexed is in machine-readable form and that appropriate programs are available. The economic relation would depend, on the one hand, upon the expenses incurred by qualified indexers, and, on the other hand, upon the expenses incurred by converting the text to machine-readable form, and for the use (perhaps even the development) of an adequate program.

The greatest controversy with respect to automatic indexing concerns, however, the qualitative aspect. Obviously, intellectual indexing implies the possibility of achieving a higher quality in the indexes. This possibility is, however, not always achieved.

In an examination of 1 575 references in ten different Swedish legal indexes, Seipel determined which terms would have been assigned to the documents by a simple automatic indexing scheme and which probably or definitely would not. The automatic indexing was presumed to be made with only the title as input. On average, he found that 36 per cent of the indexing terms would not have been assigned by automatic methods. The ratio varied, however, from 6 to 86 per cent, cfr Seipel:106-109. There exist, however, methods for improving the results.

However, if an information service employs computers, the production of conventional indexes becomes only one of several possibilities. In a text retrieval system, an index is also produced - ie the search file. Taking the documents, the system will construct an alphabetic index containing all words occurring in all documents. Each word is associated with a reference indicating the position of the word in the text file. This reference is generally an address giving document identification, sentence-number within the
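
The kind of comparison involved may be sketched as follows. This is not Seipel's actual procedure, which is not described in detail here; it merely assumes that a term counts as assignable by automatic means if it occurs as a word in the title. The titles, the index references and the function names are invented.

```python
def title_words(title):
    """Split a title into lower-case words without punctuation."""
    return {w.strip(".,;:()").lower() for w in title.split()}


def share_not_assignable(references, titles):
    """Share of (term, document) references whose term is absent from the title."""
    missed = sum(
        1 for term, doc_id in references
        if term.lower() not in title_words(titles[doc_id])
    )
    return missed / len(references)


# Invented titles and references from an intellectually produced index.
titles = {
    "D1": "Act on the sale of goods",
    "D2": "Regulations concerning consumer credit",
}
references = [
    ("sale", "D1"),
    ("goods", "D1"),
    ("contract", "D1"),   # not a word of the title: needs an indexer
    ("credit", "D2"),
]
share = share_not_assignable(references, titles)
print(share)   # 1 of 4 references, ie 0.25
```

A low share suggests that the index is close to a key-term index of the titles; a high share suggests that the indexer has added terms through an understanding of the content.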

[Page 78 ]



document and word-number within the sentence. In practice, not all words are treated like this - a small number of predefined stop-words are excluded: words like prepositions ("in", "for", "on"), pronouns ("she", "it"), conjunctions ("and", "but", "when"), and verbs like "have", "be", and "ought". By excluding such words, the total number of text words will be reduced by some 40-50 per cent, while the number of different words will be only fractionally reduced. In a Norwegian statutory text of 118 069 text words, the 100 most frequent words made up 55 per cent of the text (cfr Harvold 1976:12).

Usually, one distinguishes between the index and the text or documents characterized by the indexing terms. This distinction becomes less obvious when the index is produced as in a text retrieval system, since the content of the index is identical with the content of the text itself, the "only" difference being that the index is sorted differently (alphabetically rather than in sentences) and excludes certain stop-words. Therefore, one usually describes the retrieval process as "searching the text", while it is actually a searching of indexes to the text.
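
The construction of such a search file may be sketched in a few lines. The sketch rests on several assumptions of ours: sentences are taken to end at a full stop, question mark or exclamation mark, words are counted within the sentence before stop-words are removed, the stop-word list contains only the examples given above with the article "the" added, and the two small documents are invented.

```python
import re
from collections import defaultdict

# Examples from the text, plus "the" (our addition for the illustration).
STOP_WORDS = {"in", "for", "on", "she", "it", "and", "but", "when",
              "have", "be", "ought", "the"}


def build_search_file(documents, stop_words=STOP_WORDS):
    """Map each word to addresses (document, sentence-number, word-number)."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
        for s_no, sentence in enumerate(sentences, start=1):
            words = re.findall(r"[A-Za-z]+", sentence)
            for w_no, word in enumerate(words, start=1):
                word = word.lower()
                if word not in stop_words:
                    index[word].append((doc_id, s_no, w_no))
    return dict(sorted(index.items()))   # alphabetic order of the words


documents = {
    "D1": "The buyer may cancel the sale. It is void.",
    "D2": "The sale was made in Oslo.",
}
search_file = build_search_file(documents)
print(search_file["sale"])   # [('D1', 1, 6), ('D2', 1, 2)]
```

Because the word-number is kept in the address, the relative position of two search terms within a sentence can be determined from the search file alone, without consulting the text file.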

(2) Abstracts

The abstract is also well-known to lawyers. Case reporters usually carry an abstract of each case as a headnote.

By "abstract" we mean an abbreviated representation of the source in natural language. Typically, this is a description of the content of the source. This may be rather brief and take on the form of a title or headline, or it may be quite extensive - though this would be rare - giving a lengthy report on the content of the source.

In our terminology, the main point of difference between indexes and abstracts is the natural language of the abstract - the source is characterized by full sentences, not only indexing terms. The borderline between indexes and abstracts is blurred, cfr the use of "telegraphic

[Page 79 ]


abstracts" where a defined vocabulary is used, and where the terms of the abstract may be sorted into alphabetical indexes - cfr Borko/Bernier 1975:18-20. A traditional definition of an abstract would be
"... an abbreviated, accurate representation of a document without added interpretation or criticism and without distinction as to who wrote the abstract". Weil cited after Borko/Bernier 1975:4.
This definition is not used in our book for several reasons. Obviously, we use the term "document" in a different way, indicating the abstract, rather than the source it represents. But in addition we doubt whether an abstract can be produced "without added interpretation" or "without distinction as to who wrote the abstract".

Abstracts may be classified according to the type of author. One important category is represented by those abstracts written by an author who wrote the source as well, for instance an abstract of a case written by the judge himself. The advantages of such abstracts would be the author's first hand knowledge of the text he was abstracting. The disadvantages would be the author's lack of experience in following the rules for abstracting. It is maintained that the author of the original text may easily distort it in an abstract, Borko/Bernier 1975:13-14.

In case reporters, it is not uncommon that the judges themselves, or someone employed by the court, produce the abstract. This is for instance done by the administrative courts of Sweden. At the Italian Corte suprema di cassazione, high priority is given to the production of abstracts. These are produced by a special office of the court, Ufficio del massimario e del ruolo. In the period 1942-49, an experiment was made of letting the judges themselves produce the abstracts. This, however, was not considered successful - the abstracts became too specific with respect to facts and too abstract with respect to the law, cfr Ciampi 1974:722.

The general opinion is that the best abstracts are produced by an expert within the field, who has the necessary training to obey the rules for document

[Page 80 ]



design as well. Legal publishers of case reports will furnish examples of such expertise in abstracting, and many have a reputation for producing headnotes of a high quality.

Another distinction exists between informative and descriptive abstracts. In informative abstracts the aim is to present all factual information of the original - ie facts, conclusions etc. An informative abstract may therefore replace the original. A descriptive abstract would describe which problems are discussed in the original, but does not aim at containing sufficient information to be used independently of the original source.

This distinction may be less clear with respect to legal sources than with respect to literature within certain other disciplines, where the main points of an original may be given as clear conclusions. But obviously the distinction indicates a scale. Often abstracts in legal systems are associated with a representation of the authentic text, which is typical for the headnotes of a case reporter. In this context, the abstracts become descriptive - the authentic text is at hand for use in the legal arguments. In other contexts - as in a text-book or a commentary - the abstract is given independently of the authentic text. In these instances, there would seem to be an inclination towards using arguments derived only from the abstract. Here an informative abstract may seem appropriate. This, however, would have to be assessed with respect to whether it seems desirable - or, indeed, permitted - to encourage a replacement of the source by the abstract.

In addition to the abstract proper, there are a few types of documents which will not be distinguished from abstracts unless called for by the context. One is the extract. An extract is identical with the authentic text of a source, but is only a selected part of it. Another is what may be called a report of the source. The report seeks to give a complete representation of the authentic text, but is not necessarily identical to this. The reason for selecting the form of a report, may be the need for document construction - a situation where no appropriate

[Page 81 ]



authentic form of the document is at hand.

The abstracts may also be free abstracts or normalized abstracts. In principle, the abstracts may be constructed only from defined indexing terms. This would be much rarer with respect to abstracting than to indexing. More commonly, there will be certain rules for the design of the abstract - for instance the naming of an author or court, date, standards for abbreviations, citations etc. There may also be other guidelines, like the abstract being constructed out of general rather than specific terms.

Measures for the quality of abstracts are not in common use. Consequently, one will have to rely mostly upon user experience.

This is also the first item of the list presented by Borko/Bernier 1975:180. In addition are mentioned the degree to which the abstract conforms to the rules of the information system or other norms for the construction of abstracts, the absence of errors, consistency, readability etc.

An alternative, presented by Mathis (and cited from Borko/Bernier 1975:182), is what is called the "data coefficient". This measure rests on the identification in the abstract and in the authentic text of "data elements", defined as "one concept", represented by "name-relation-name patterns", again defined as "language strings composed of words representing names and relations". Borko/Bernier 1975:183 hold this method to be the most precise offered at present to measure the quality of abstracts. The assessment of the method would seem to rely completely on how feasible one would regard the possibility of identifying "data elements".

The conventional method for producing abstracts is to have a qualified abstracter read the authentic text and then formulate briefly his understanding of the text. The disadvantages of this method are the same as with respect to intellectual indexing - it is time-consuming and requires abstracters of advanced qualifications. Perhaps the time consumed in writing abstracts is even greater than in indexing, as the abstract must reflect more completely the authentic text. A strategy to reduce the time consumption would be to let the author of the authentic text

[Page 82 ]



produce the abstract as well.

Attempts have been made for automatic abstract production. This is a rather more ambitious task than to produce automatic indexes, as the abstract has to be formed in natural language. And it is rather complicated to have a computer cope with natural language. So far only experimental schemes have been devised.

Also, automatic abstracting would presume that the authentic text was machine readable. In such a case the need for an abstract would be reduced, as other methods may be employed to fulfil the functions of an abstract - for instance focusing, highlighting or KWIC-formats.
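
As an illustration of how a machine-readable text may take over some of the functions of an abstract, a KWIC (key-word-in-context) display may be sketched as below. The function name, the window of three words on either side and the example text are assumptions made for the illustration.

```python
def kwic(text, keyword, width=3):
    """Show each occurrence of `keyword` with `width` words of context."""
    words = text.split()
    lines = []
    for i, word in enumerate(words):
        core = word.strip(".,;:")
        if core.lower() == keyword.lower():
            left = " ".join(words[max(0, i - width):i])
            right = " ".join(words[i + 1:i + 1 + width])
            lines.append(f"{left} [{core}] {right}".strip())
    return lines


# Invented text for the illustration.
text = ("The seller is liable for hidden defects. "
        "The buyer must give notice of the defects within reasonable time.")
for line in kwic(text, "defects"):
    print(line)
```

The user sees at a glance the context in which the search term occurs, which may be sufficient to decide whether the document is worth reading in full.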

(3) Authentic text

The least complicated form of representing a source, would be to reproduce the authentic text of the source in the document. This is the usual way of representing most types of sources - statutes, regulations, court decisions etc.

As mentioned above, what is the authentic form of a source will be determined by the meta-norms of the legal system in question. It is conceivable that the authentic form is actually an abstract of another and more extensive text - cfr the examples given above from Italy and Germany.

The authentic text will typically be part of the document in a legal information system, but will rarely constitute the complete document. In addition, the document will usually contain indexing terms and an abstract.

Also, the editor may have made some amendments in the authentic text. He may have deleted the names of the parties for privacy reasons, and he may have abbreviated the text in different ways, for instance by excluding some of the formulas often introducing a decision. Actually, there is a blurred borderline between an extract and the authentic text.

There are some disadvantages of representing a source

[Page 83 ]



by its authentic text. Firstly, the authentic text will be relatively lengthy. In situations where costs are relative to length, this will be an argument for a briefer representation.

This has been an important argument for computerized systems. For older material, the text had to be re-keyed in order to convert it into machine-readable form. Here cost would be proportional to the number of key-strokes, and it would be tempting to re-key only the abstracts or headnotes of cases etc, which would be relatively cheaper.

Secondly, in authentic text one would have to accept the idiosyncrasies of the author. This would hardly be a severe problem, since the author would be expected to respect the norms of the written language. One would, however, expect many inconsistencies in the terms used. And with respect to multi-lingual documentation areas, like international law, it might create problems. Corresponding problems might be found in jurisdictions with more than one official language - like Belgium, Canada, Finland, Ireland or Switzerland. In such situations, information systems will have to develop special features to cope with the multilingualism.

3.3.3 Functional performance

(1) Introduction

Functional performance is a measure for how well an information system fulfils certain functions defined as desirable or necessary. These functions are rather simple and basic. The three major functions are

  • (1) the retrieval function, ie the ability to identify possible relevant documents within the data base;
  • (2) the relevance function, ie the ability to easily determine the relevance of a retrieved document with respect to an actual problem; and
  • (3) the source function, ie the access to a document representing the source in such a way

[Page 84 ]


that arguments for legal reasoning can be derived from the document.

In addition there may be a fourth function, namely the current awareness function, bringing to the notice of the user important new developments or other changes of the law.

These are all functions of an information system. In this section, we shall discuss only the functional performance relating to document design: what influence will document design have on functional performance? Obviously, there may also be other features of the information system that will have an impact on functional performance. The search language will concern the performance of the retrieval function, the focusing or highlighting will concern the relevance function etc. But the document design will actually limit the performance of these functions, and is therefore discussed separately in relation to each of them.

One should also be aware of the restriction of the discussion to document design in contrast to information systems. There are systems mainly designed to support only one of the functions - like an official gazette bringing all statutory amendments: this will be excellent in relation to the current awareness function, but inadequate with respect to the other functions. A citation index may be useful as a retrieval tool, but will not satisfy the other functions at all. Consequently it should be stressed that though the functional performance of one legal information system assessed in isolation may seem one-sided, the individual information services often have to be assessed in relation to each other as complementary services. This is also true for computerized legal information services, which for the foreseeable future will exist as part of the total information environment of the user rather than as that environment itself.

[Page 85 ]



(2) The retrieval function

An information service should permit the user, when specifying a question, to retrieve documents of possible relevance to this question.

In practice, only different types of indexes can fulfil the retrieval function of a system. In the index certain properties of the documents are characterized and sorted according to certain criteria - for instance alphabetic, systematic etc. The user may then formulate a search request which in the same way characterizes properties of his problem. In matching the request with the index he may find a reference to a document which is thus identified as being of probable relevance.

In theory other methods for retrieval may perhaps be conceivable. The simplest method would be to read sequentially through all documents in the data base in the random sequence they occur with respect to the problem. This method is not very efficient, but from time to time it may be the only choice, because there may be no index to the data base, or the existing indexes have not been designed to cope with the user's type of problem.

Indexes presume the existence of indexing terms in a document - perhaps with the small exception of table of content and similar indexes, which may be constructed out of very brief summaries (as mentioned above, the exact borderline between indexes and abstract is somewhat blurred).

It is quite unnecessary to discuss to what extent abstracts or authentic texts satisfy the functional requirements for retrieval. It is important to state the basic observation that in theory and practice the problem-oriented use of an information system presumes an index.

Otherwise one would be limited to sequential reading of shorter or longer documents. This is not impossible, and represents a method to which most lawyers have had to resort from time to time. But undoubtedly documents containing only abstracts or authentic text satisfy the retrieval function less satisfactorily than indexes.

[Page 86 ]



In order to avoid misunderstanding, one should bear in mind that the form of index used in text retrieval systems is produced on the basis of the authentic text or abstracts. But also in these cases, the retrieval is supported by indexes.

The question then becomes what type of indexes are most efficient. This question is too general to be appropriately answered, and must be split into a number of sub-questions. Of these, only a few will be discussed here, questions which we think have a special interest for computerized legal information services.

An obvious point of departure would be the question of whether empirical studies have determined which type of indexing language is the most appropriate. Above, several possible distinctions have been indicated - free and defined vocabulary, hierarchical and flat indexes etc. Several indexing systems are in use, and it would have been tempting to compare these or examine their efficiency.

There are a number of tests which examine exactly this problem. Some of the more important are Cleverdon (for instance 1967), Salton (1971), and Saracevic (for instance 1979b). These three general and extensive tests aimed primarily at disclosing which indexing language was the "better". By "better" is meant a measure of retrieval performance. The measure used in this book is recall and precision - but other measures are possible and have been used (for instance by Saracevic).

Recall and precision will be further discussed in part II. If the relevant documents are R, the retrieved documents T and the retrieved, relevant documents D, then recall is D/R and precision D/T. Recall and precision both approach 1 with increasing performance. They are often illustrated as points or curves in graphs.
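
As a minimal sketch of these measures (the document identifiers and the function name are illustrative assumptions, not taken from any of the tests cited), recall is the share of the relevant documents that were retrieved, and precision the share of the retrieved documents that are relevant:

```python
def recall_precision(relevant, retrieved):
    """Return (recall, precision) for two collections of document ids.

    relevant  -- R, the documents relevant to the problem
    retrieved -- T, the documents returned by the system
    """
    relevant, retrieved = set(relevant), set(retrieved)
    hits = relevant & retrieved          # D: retrieved and relevant
    recall = len(hits) / len(relevant) if relevant else 0.0
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    return recall, precision


# Hypothetical example: 4 relevant documents, 5 retrieved, 3 in common.
R = {"doc1", "doc2", "doc3", "doc4"}
T = {"doc2", "doc3", "doc4", "doc7", "doc9"}
recall, precision = recall_precision(R, T)
print(recall, precision)  # 0.75 0.6
```

Returning 0.0 for an empty set is a convention chosen for this sketch only; the measures are strictly undefined in that case.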

It is precarious trying to summarize the results of these major and basic projects. Nevertheless we would like to focus on a result which we consider important, but which perhaps has not been fully appreciated. This result may - somewhat boldly - be formulated as the disappointment in the performance of

[Page 87 ]



sophisticated and time-consuming indexing systems.

In one of his conclusions, Cleverdon (1967:620) states that an indexing language characterized as "relatively simple" would seem to give the best cost-benefit performance, and indicates that in certain circumstances this may also yield a higher performance. Cleverdon is careful in his conclusions, but apparently the results justify a certain scepticism towards sophisticated indexing languages. The results of Cleverdon are supported by Salton (cfr for instance Salton/Lesk 1968:639), who states on the relation between intellectual and automatic indexing (Salton/Lesk 1968:638) that:

"... one is tempted to say that the efforts of trained indexers may well have been superfluous for the collection at hand, since equally effective results could be obtained by simple word matching techniques. Such results appear even more probable in the case of larger or less homogeneous collections, where the manual indexing tends to be less effective because of the variabilities among indexers, and the difficulties ensuring a uniform application of a given set of indexing rules to all documents."

This may have been expressed most pointedly by Saracevic, who, in one of his general conclusions, stresses that the "human factor" seems to be the most important one with respect to retrieval performance. In the choice between indexing languages and systems, he states (1970b:680):

"The length of indexes (i.e. variations in number of index terms per index as produced from titles, abstracts, or full texts) seems to affect the performance considerably more that do the indexing languages; given the same length (often termed 'depth') various indexing languages tend to perform at an equivalent level."

When underscoring this scepticism in relation to sophisticated indexing methods, for instance employing input thesauri for term control, it is done mainly to justify the opinion that in legal information systems large resources should not be used on indexing. Indexing is essential, since indexes are in practice the only way of supporting the retrieval

[Page 88 ]



function. But perhaps simpler and cheaper indexing schemes may be sufficient.

The selection of indexing structure and languages as well as methods for the production of indexes, cannot be assessed independently of those retrieval strategies supported by the information system in question. In printed indexes, few retrieval strategies are open for the user - in practice a request has to take the form of one indexing term, to which the references are identified. In systematic indexes (or other types of hierarchical indexes) one is permitted to move up and down the levels of generalization. Some have certain other features, and some may be used in conjunction, for instance in retrieving a statutory clause through a subject index, and then follow the references in the annotations to the clause.

Citation indexes and their performance are not mentioned above. The use of citation indexes for retrieval of scientific literature is found very handy by the users - cfr for instance the assessment of Science Citation Index cited by Martyn 1965:360. One must suppose that similar results are achieved through citation indexes for legal sources. Empirical research seems to indicate - perhaps not surprisingly - that the indexing by way of citations is remarkably consistent (for statutory citations, cfr Bing 1982b:221-228). Users maintain that citation indexes are important (cfr Technical Study I 1977:55-56).

Computerized systems give the user a wider range of retrieval strategies. In this context it is sufficient to observe that they will permit forms of indexes which for practical purposes are excluded in respect to printed publications. In particular this includes indexes produced automatically on the basis of the text of a document, containing nearly all the different word forms of a data base with information (addresses) of the position of the words in the data base of documents.
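
An automatically produced index of this kind can be sketched as follows. This is an illustration under stated assumptions, not a description of any operational system: the two-document data base, the function names and the simple word-splitting rule are all hypothetical.

```python
import re
from collections import defaultdict


def build_index(documents):
    """Map every word form to its addresses: (document id, word position)."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        for position, word in enumerate(re.findall(r"\w+", text.lower())):
            index[word].append((doc_id, position))
    return index


def search(index, term):
    """Return the sorted ids of documents containing the search term."""
    return sorted({doc_id for doc_id, _ in index.get(term.lower(), [])})


# Hypothetical two-document data base.
documents = {
    "case-1": "The tenant may terminate the lease",
    "case-2": "The lease was terminated by the landlord",
}
index = build_index(documents)
print(index["lease"])            # [('case-1', 5), ('case-2', 1)]
print(search(index, "tenant"))   # ['case-1']
```

Note that "terminate" and "terminated" are stored as separate word forms, so a request for one does not find the other unless the system adds truncation or a similar facility.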

An issue of concern would be whether such indexes permit a higher retrieval performance than would indexes produced by intellectual indexing. One may envision data bases representing the same legal sources. In one data base, the indexing terms have been assigned by a qualified indexer, in the other the

[Page 89 ]



indexing terms have been generated automatically on the basis of the text of the documents (presumably of the authentic text or of abstracts of the sources).

This issue is perhaps not yet settled. In the development of computerized legal information systems in Europe, there has been a tendency in Latin countries to favour intellectual indexing, while in Great Britain and Scandinavia (especially Sweden) documents containing only authentic texts have been used. This has been considered a conflict between indexing and full text systems, cfr Bing/Harvold 1977:80.

At present the conflict is not a matter of great concern, not because the issue has been settled, but perhaps because all systems work more or less with documents containing partly the authentic text, completely or in extract, and partly intellectually assigned indexing terms and abstracts.

We do not want to solve the conflict in this connection. It is sufficient to mention that no empirical test has shown intellectual indexing to be clearly more effective than automatic indexing, and that automatic indexing of authentic text in a computerized system usually will be cheaper, and also satisfy the source function. We also refer to Salton/Lesk 1968, as cited above, where one of the results was the failure to determine that automatic indexing was inferior to intellectual indexing.

At the Norwegian Research Center for Computers and Law, Oslo University, a number of "controlled experiments" in text retrieval have been carried out. None of these justify a general assessment of the respective merits of automatic or intellectual indexing. In two of the experiments, however, recall failure due to "implicity" is determined - this failure occurring if some ideas in the authentic text are not made explicit, but are rather implied by the context. It has been maintained that this is a type of performance failure more common in automatic than in intellectual indexing, as the intellectual indexing tries to disclose such implicity by assigning explicit indexing terms on the basis of the indexer's understanding of the content.

In both experiments implicity was found to be the cause of 12 per cent of recall failure. This means

[Page 90 ]



that in 12 per cent of the cases where an idea in a document was not retrieved, this was caused by content being implied rather than explicitly stated by the words of the document.

Note that the figures refer to the recall failure. In 12 per cent of the cases where an idea of a document is not found, the cause was implicity. - The two experiments mentioned were conducted on data bases of decisions by Swedish administrative courts (Bing/Harvold 1974:102) and by the Norwegian Social Security Court (Bing/Harvold/Kjonstad/Stabell 1976:104).

The experimental results cited are very encouraging. Recall failure due to implicity in natural language documents would not seem to be more severe than the failure in intellectual indexing caused by failure to include a relevant indexing term or incorrect assignment of terms.

In Lancaster's (1969:646) examination of the medical information system MEDLARS the indexing itself caused 37.4 per cent of the recall failure, and of this the cause "insufficiently exhaustive" made up 20.3 per cent. The results cannot, of course, be compared directly, but do demonstrate that there are problems corresponding to "implicity" playing a considerable part as a cause of recall failure also for intellectual indexing.

The second problem would be what type of document was most appropriate for automatic indexing - which type of document results in the best indexes for retrieval purposes.

In this respect, one may suggest three possibilities: documents consisting of titles (which may count as very brief abstracts), abstracts and documents based on the authentic text.

As mentioned above, Saracevic (1970b:680) maintained that retrieval performance was more dependent upon the length of the index than upon the indexing language. As titles are briefer than abstracts, and abstracts briefer than authentic texts, one might suppose this last type of documents gave the best results. This is supported by the tests of Saracevic

[Page 91 ]



(1970b:674 and 1968:87-95). Cleverdon's tests indicate that indexes based on titles do not perform as well as those based on abstracts (1967:619 fig 14). Salton/Lesk (1968:630) conclude:

"... document abstracts are more effective for content analysis purposes than document titles alone; further improvement appear possible when abstracts are replaced by larger text portions; however, the increase in effectiveness is not large enough to reach the unequivocal conclusion that full text processing is always superior to abstract processing."

In one of the controlled experiments in text retrieval at the Norwegian Research Center for Computers and Law, performance in retrieving on abstract-based indexes is compared with retrieval on "full text"-based indexes, cfr Fjeldvig 1976:95-111.

The documents in the experiment were responsa from the Tax Administration, where the abstract was produced by the editor of a specialized journal (Utvalget). In fig 3/3 the result is given as three curves, each curve referring to a certain type of document: abstract, extract of the authentic text without abstract, and extract of the authentic text with abstract (Fjeldvig 1976:104).

Fig 3/3 - Average recall-precision curves for three different types of documents.

[Page 92 ]


(3) Relevance function

The "relevance function" denotes the function of the documents in a retrieval system of giving efficient means for determining the relevance of a source represented by the documents.

If a document contains the authentic text of the source, this is taken as the best possible form for determining relevance. It is indeed difficult to see how alternative document designs could offer the user a better foundation for relevance assessment than the one offered by the authentic text.

Cfr Saracevic 1970b:678, who characterizes the relevance assessment on other bases than the authentic text as preliminary with respect to his text. - One might mention that the relevance concept used in this book presumes the existence of a legal problem. It is therefore a condition that the user, before making his final relevance assessment, has access to a document satisfying the "source function", ie a document qualified as satisfying the legal meta-norms for being utilized in the legal argument, cfr below at (3).

The disadvantage of basing the assessment on the authentic text is, in practice, that it is relatively long. The length itself may require the user to spend considerable resources assessing the relevance of the documents retrieved by the system. Cfr Persson (1974:79) who has found a clear correspondence between the length of the text and the time required for relevance assessment. There are certain ways of reducing these disadvantages through the system design (rather than the document design), but the usual way is to design an alternative and briefer representation of the source, and include this in the document.

One way to improve the relevance function for long documents, is the inclusion in the computerized system of a focusing or KWIC function (KWIC being an acronym for "key word in context"). The user retrieves a set of documents, and wants to examine these to assess their possible relevance. The system presents the documents on the screen displaying that part of the document which contains one or more search terms. This location of

[Page 93 ]


the document is probably important for the relevance assessment. The function is regarded favourably by users of such systems, cfr Bing/Harvold 1977:127-128.

Another strategy is known as "highlighting", ie the search terms are highlighted on the screen by higher intensity print or reverse video as they appear on the screen. This also facilitates the relevance assessment of long documents.
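
The focusing and highlighting facilities can be sketched together in a few lines. This is an illustration only: the function name, the sample text and the use of upper case as a stand-in for higher intensity or reverse video are assumptions made for the example.

```python
import re


def kwic(text, term, width=30):
    """Return one context window per occurrence of the search term,
    with the term itself marked by upper case (a stand-in for the
    higher intensity or reverse video of a screen)."""
    windows = []
    for match in re.finditer(re.escape(term), text, flags=re.IGNORECASE):
        start = max(match.start() - width, 0)
        end = min(match.end() + width, len(text))
        window = (text[start:match.start()]
                  + match.group().upper()
                  + text[match.end():end])
        windows.append(window)
    return windows


text = ("The court held that the lease could be terminated. "
        "Other questions were not considered.")
for line in kwic(text, "lease", width=15):
    print(line)
```

The `width` parameter fixes how many characters of context are shown on each side of the term; a real system might instead display whole sentences or paragraphs.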

One will note that these are features of the system, not of the documents. In this section we shall mainly concentrate on how document design may facilitate relevance assessment. One should, however, be aware that document design is not the only feature determining the relevance function of the overall system.

If a document in authentic text is to have an alternative and briefer representation, the choice is between indexing terms and abstracts.

Indexing terms give little possibility of relevance assessment beyond the implicit assessment of the retrieval itself. The reason for the choice of certain search terms, and the retrieval of a set of documents, is closely related to the probability of finding relevant documents. Search requests may be regarded as a hypothesis of the properties of a relevant document.

Users may also, as part of the retrieval process, suggest that documents retrieved by term "a" have a higher probability of being relevant than those retrieved by term "b". The user may sort documents into different sets using such properties (ranking), for instance by specifying that those documents found both by term "a" and term "b" have a higher probability of being relevant than those containing only one of the terms.

It is the ability of the system to formulate appropriate hypotheses of this kind which is measured by recall and precision. A retrieval system efficient in recall and precision, will also be efficient in this limited form of relevance assessment.
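
A minimal sketch of such ranking, assuming a hypothetical data base of four short documents and counting simply how many of the search terms each document contains:

```python
def rank_by_matches(documents, terms):
    """Sort document ids so that documents containing more of the
    search terms come first; documents matching no term are dropped."""
    scored = []
    for doc_id, text in documents.items():
        words = set(text.lower().split())
        score = sum(1 for term in terms if term.lower() in words)
        if score:
            scored.append((score, doc_id))
    # Highest score first; ties are broken by document id.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


documents = {
    "d1": "lease terminated by tenant",
    "d2": "tenant claims damages",
    "d3": "lease renewed",
    "d4": "sale of goods",
}
print(rank_by_matches(documents, ["lease", "tenant"]))  # ['d1', 'd2', 'd3']
```

The document found by both terms is placed first; the two documents found by only one term follow, and the document containing neither is not retrieved at all.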

Indexing terms should be regarded as part of the retrieval function rather than, in this context, part of

[Page 94 ]



the relevance function.

It is perhaps of interest to dwell somewhat on the distinction between the relevance and retrieval functions. To facilitate retrieval, the document element in question has in practice to be part of an index. In text retrieval systems, all elements of the document will be part of the search file, and this distinction is not as obvious as in a manual system. Imagine, however, a manual system with a number of indexing cards, each card representing one case and all cards indexed with statutory citations. The retrievable index is the small flags bearing the different citations, and sorted into the file behind the flag are those cases characterized by the citation. When the citation is used as a search term, all cards behind the flag are retrieved. When determining relevance, the user will examine titles, additional indexing terms etc, which may be elements of the card. Here it is obvious what is retrieval and what is relevance assessment. But over time dynamic factors may change this. The number of cards for one citation may grow out of proportion, and this is then split into sub-citations on the level, for instance, of paragraphs rather than sections. The information of paragraphs ceases to be utilized in the relevance function and is utilized already in the retrieval - the paragraph citations now being part of a search file.

The other alternative for representation of legal sources in a brief format is the abstract. It is commonly accepted that the abstract is appropriate for making relevance assessment more efficient - this is also the justification for using head-notes in conventional publication. This intuitive attitude is also supported by the test done by Saracevic (1970b:679):

"... in comparison with full texts, approximately two thirds of relevant answers from titles and three fourths from abstracts (were recognized by users). At the same time they recognized ninetysix per cent of nonrelevant answers from titles and ninety-eight per cent from abstracts."

This indicates that a relevance assessment on the

[Page 95 ]



basis of abstracts (or titles, which in this context may be regarded as a very brief abstract) is reasonably efficient.

With respect to the results of Saracevic, one should be aware that his relevance concept is not identical to that used in this book, cfr Saracevic 1970b:571. This should not, however, reduce the general value of the conclusions. Below in fig 3/4 we cite the values given by Saracevic for relevance. In the table, "R" is "relevant", "P" is "partially relevant" and "N" is for "nonrelevant". As the relevance concept in this book is binary, probably also "partially relevant" should be considered "relevant" in our terminology. The table illustrates how assessments based on abstracts are distributed relative to assessment based on authentic text.
Fig 3/4 - Relevance assessment based on abstracts, cfr Saracevic 1970b:679 table IV.

                               Full Text Judgements
                                 R      P      N
                     Total     207    156    723
Abstract       R       175     160      3     12
Judgements     P       169      23    125     21
               N       742      24     28    690
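
The figures of fig 3/4 can be summarized by a short calculation. The sketch below is our own illustration, not part of Saracevic's analysis: it computes the share of judgements that agree between abstract and full text, first with the three values kept apart and then with "P" counted as relevant, in line with the binary relevance concept of this book.

```python
# Rows: abstract judgements; columns: full text judgements (R, P, N).
table = {
    "R": {"R": 160, "P": 3, "N": 12},
    "P": {"R": 23, "P": 125, "N": 21},
    "N": {"R": 24, "P": 28, "N": 690},
}

total = sum(sum(row.values()) for row in table.values())

# Three-valued agreement: same judgement on abstract and full text.
same = sum(table[k][k] for k in "RPN")

# Binary agreement: "P" is counted as relevant, as suggested in the text.
relevant = ("R", "P")
both_relevant = sum(table[a][f] for a in relevant for f in relevant)
both_nonrelevant = table["N"]["N"]

print(total)                                                 # 1086
print(round(same / total, 3))                                # 0.898
print(round((both_relevant + both_nonrelevant) / total, 3))  # 0.922
```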

Saracevic explains the deviations between the relevance assessment based on abstracts and authentic texts by the less exhaustive content of the abstracts. He indicates, however, that also the uncertainty of the user or development over time of the understanding of the user may contribute to the explanation. We would consider it improbable that a better correlation between results would be obtained by comparing the relevance assessment of one user at two different points in time, each time based on the authentic text. For an example of user changes in relevance assessment, cfr Bing/Harvold/Kjønstad/Stabell 1976:57-61 and Kjønstad 1976.

[Page 96 ]



(4) The source function

Under sect 3.3.1 we have discussed briefly the qualification of what is the authentic form of a legal source. This depends entirely on the legal meta-norms of a legal system, which qualify what form a document will have for a lawyer to be permitted to derive arguments from it and incorporate these arguments into a legal reasoning.

It has been stressed that any representation qualified by the meta-norms of a system in this way, is an authentic form of the legal source. Examples have been offered with reference to the Italian and German legal systems, of which it may be maintained that abstracts of court decisions are treated as a legal source, and that these abstracts have an authentic form though being a briefer representation of the original court decision as well.

Earlier we have also dwelt on the composite character of legal sources, and mentioned that when citing one source - for instance a court case - we may indirectly draw upon other sources in turn cited by that case. Our arguments are derived from the case, but their weight may to a large extent be determined by the cited sources.

After this reminder of some of the concepts and the terminology underlying the phrase "authentic text", we move on to the third major function of a legal information system: the communication to the user of a text which satisfies the meta-norms of the legal system, and which may be utilized in his legal argument. This is what will be called the source function of the system.

In our terminology it is somewhat of a pleonasm to maintain that the source function is satisfied only by the authentic form of the source, as both the source function and the authentic form are qualified with a reference to the same legal meta-norms. However, a more practical observation may be added to this: in general the authentic form will be the text as produced by the original author - who may be a legislator, a judge, a civil servant or a

[Page 97 ]



legal author. No abstract of this original text will be generally qualified as a source (though exceptions may exist and have been indicated), and this will be even more true for indexing terms. Of the three types of documents mentioned above, only the one which in our terminology is an "authentic text" may serve the source function - and this will indeed correspond to what is generally called a "full text".

We shall not discuss the source function further, but still stress that this is absolutely essential to a communication process. If the source function is not satisfied, the lawyer is not helped at all. It may be interesting to know that there exists a court decision of probable relevance and high importance, but nevertheless it must be considered rather frustrating if that decision is not available in a form allowing the lawyer to exploit it in a legal argument. Any legal information system which has not taken into account how to solve the source function, is at risk.

There will, of course, be legal information systems specialized to serve only one or two of the major functions - an example would be an index or digest which serves only the retrieval and relevance function, characterizing each of the listed sources by indexing terms and a brief abstract. Such systems have not solved the source function, and the criticism above may be regarded as relevant to such systems. This is, however, not necessarily so. Some systems can only be understood when viewed as supplementary or accessory to other systems. Though each of the systems has a low functional performance, the performance increases considerably when they are regarded as one composite system.

Some of the success of computerized systems is, however, probably due to the combination of a highly efficient retrieval function and a nearly unique source function. If the authentic form of the sources is part of the documents, the source function is literally satisfied by the touch of a button. Browsing through possibly relevant documents is easier on a screen than in a book, and the fact that trivial things like turning pages or looking up different volumes have been done away with, should not be underestimated.

[Page 98 ]



The largest commercial systems are the American systems of WESTLAW and LEXIS. These mainly contain published court decisions. One may reflect on the relative importance of the retrieval and source function in such systems. To a large extent WESTLAW obviously has a more efficient retrieval function compared to the conventional case reporters of West Publishing Co, while these conventional reporters may offer the user a better source function than that of the computerized system. Though the high availability of identified documents has been emphasized above, one should bear in mind that the simple matrix printers of most computer terminals do not offer texts in a form which can compete with the clear fonts of a printed page. If the computerized system is used as a retrieval tool for the conventional library, the source function need not be stressed in that system. On the other hand, the more independent the computerized system becomes, and the more unpublished material is documented, the more vital an appropriate solution will be for the source function, including document presentation.

Bringing the source function into focus may disclose the communication aspect of a computerized system as well. A tendency to concentrate on the retrieval function of such systems may be justified, but the simple fact that in such a system an identified document is instantly available in the desired form, may in practice be equally important.

User research has disclosed the importance of trivial availability factors, and it is worth noting that the simple fact that the source was missing from the library was a main cause of the unsatisfactory result of legal research in a major German survey (Jungjohann/Seidel/Sorgel/Uhlig 1974:44). This may be combined with the probably typical result from an Italian survey, demonstrating that half the lawyers had no particular system in their library, and more than three quarters had no index to their own library (Rawlence 1975:374-375).

These facts may indicate that it would indeed be a great practical boon to the lawyer to have an information system which easily made available the authentic form of the source, and where the documents were

[Page 99 ]



never missing due to a simultaneous use by a colleague.

Essential functions should not be neglected. In conclusion, one may state that any legal information system is only as good as the source function associated with that system.

(5) The current awareness function

The current awareness function is not one of the major concerns of document design. This function would be served by features of a document designed to make the reader aware of a special novelty associated with the document. For instance a case reporter might, by a special character, identify those cases which, in the opinion of the editor, changed the state of law in some area.

Special care is, however, rarely taken to make the documents support the current awareness function. The explanation for this is two-fold and rather simple.

Firstly, most legal information systems are published according to changes or developments in the legal system, thereby announcing that they may have an interest in keeping a lawyer up to date. No special measures in the document design of, for instance, a case reporter are necessary to convey the message to the lawyer - though the headnotes, considered to be wholly justified by the relevance function, may be convenient for the user when browsing through recently published reports.

Secondly, in most legal systems, there are dedicated information systems for satisfying the current awareness function. These, again, do not really make any special allowances in their document design, but by their very nature, they will document only what is deemed necessary in view of their special function.

A typical system would be the official legal gazette common to many jurisdictions, bringing the text of the latest legislation etc. Some jurisdictions have specialized audio-cassette journals which bring the busy lawyer up to date through the cassette player of

[Page 100 ]



the car on the way to his office. And the advertisements announcing new legislation in newspapers or other mass media are also, of course, typical current awareness systems, though addressed to the layman rather than to the lawyer.

Interestingly, computerized current awareness systems have been launched, mainly in the United Kingdom, where both the Infolex and Lawtel services under Prestel are designed as stop-gaps to give the lawyer information on what has developed after the last edition of a conventional information service, or, indeed, the last update of his "conventional" computerized service.

There are, however, true examples of documents designed for current awareness. Many case reporters have a newsletter included, in which the editor sums up the highlights of the most recent cases. As this newsletter represents some of the same cases represented otherwise in the reports, the corresponding parts of the newsletter are part of the logical document representing these cases in the information system - a part of the document especially designed to support the current awareness function. At least one computerized service, WESTLAW, includes a weekly newsletter for its subscribers.

Current awareness functions are, in conventional systems, almost a by-product of the fact that the subscriber receives regular updates, which he browses through in order to familiarize himself with current developments. It is perhaps somewhat surprising to see that this is a function lacking in most computerized systems: for instance, the user cannot by a command to the system have access to all documents added to the data base by the last update. The computerized system is not designed in such a way that current awareness becomes a by-product of the updating, and one would have considered it rather self-evident to introduce a special feature which would be the equivalent of the current awareness function in conventional systems. Though few good examples of this are known, there are systems which allow ranking of documents by date, a facility which may be used for the current awareness function. The solution for computerized services would be a function of the system rather than a feature of document design.
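
A system function of this kind could be as simple as the following sketch. It assumes, purely for illustration, that the system records for each document the date on which it was added to the data base; the document identifiers and dates are hypothetical.

```python
from datetime import date


def added_since(documents, last_update):
    """Return ids of documents added on or after the given date,
    newest first."""
    recent = [(added, doc_id) for doc_id, added in documents.items()
              if added >= last_update]
    recent.sort(reverse=True)
    return [doc_id for _, doc_id in recent]


# Hypothetical data base: document id -> date added.
documents = {
    "case-101": date(1983, 1, 10),
    "case-102": date(1983, 3, 2),
    "case-103": date(1983, 3, 15),
}
print(added_since(documents, date(1983, 3, 1)))  # ['case-103', 'case-102']
```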

[Page 101 ]



As mentioned above, the current awareness function is not regarded as a major function on the level of document design, but rather as an objective of specialized information systems. In a discussion on functional performance, we shall not in general address the current awareness function of document design. We shall, however, return to some of the same points in our discussion on passive use of legal information systems.

3.3.4 Conclusion

In this section, we have discussed some of the basic properties of legal retrieval systems in the perspective of document design, relating the functions of a retrieval system to the design and properties of the documents of that system. Obviously the discussion has been rather brief compared to the complexity of the subject. Nevertheless, we do feel that the perspective offered in this section is important, and that it supplements the following discussions concentrating on the functions of the retrieval mechanism and other features of computerized systems. As emphasized initially, the performance of a computerized system is contained within the limits determined by the document design.

We have related three typical forms of documents to the functions of an information system. As a conclusion, we may maintain that each of these types of documents has its strongest advantages in respect to one of the functions, as illustrated by fig 3/5.

Fig 3/5 - Document types and system functions

                  Retrieval   Relevance   Source
                  function    function    function
Indexes               x
Abstracts                         x
Authentic text                                x

The conclusion indicated by fig 3/5 is hardly controversial. It should be easily agreed that efficient retrieval must be based on indexes, that abstracts solve the relevance function in a convenient way, and

[Page 102 ]



that authentic texts are necessary for satisfying the source function. The discussion is related primarily to what type of indexes are the most efficient and how they should be produced, what type of abstracts are most appropriate and how the authentic text should be made available.

As part of the Common Market user survey of legal information systems, an assessment related to that in fig 3/5 was made, and this may be a useful supplement to the conclusion. We have not made a distinction between documents demanding "low" or "high intellectual effort", but this distinction does not need special comment. One should also note that "summaries" (in our terminology "abstracts") and "fulltext" (in our terminology "authentic text") are specified as satisfying the retrieval function - this implies that indexes are produced automatically on the basis of these types of documents, within a text retrieval system.

Fig 3/6 - Document design and system functions, cfr Technical Study I 1977:127.

Document               Retrieval   Relevance    Final        Problem
characteristics                    assessment   DP-output    solving
                       H     L     H     L      E     D
References                                      x
Keywords and others    x           x
Summaries                    x     x
Fulltext                     x           x            x      x

H = high intellectual effort
L = low intellectual effort
E = originals easily accessible
D = originals difficult to access

This table gives the same general assessment as above and in fig 3/4. It is produced as guidelines for an information system. One will see that retrieval based on an index constructed by intellectual indexing is presumed to tax the user more than retrieval based on an index produced from abstracts or authentic text. It may be confusing that relevance assessment is specified as more burdensome on the basis of abstracts than authentic

[Page 103 ]



text, but this is explained by relevance assessment being simpler or more final (though perhaps demanding more resources in research time) on the basis of the authentic text. The column "Final DP-Output" specifies some of the relations to other systems. If the authentic text is easily available through other systems, the computerized system may be limited to giving a reference as to where the authentic text may be found.

[Page 104 ]



3.4 The information system

3.4.1 Elements of the information system

The information system is the bridge between editor and user in the communication process. In general terms, any information system may be described as consisting of three elements, in addition to the sender (editor) and receiver (user, lawyer). These three elements may be described most conveniently in relation to the user.

The user will have a legal problem to solve, and will initiate the communication process by formulating a search request, cfr above at sect 2.5. This search request is formulated within the restraints of the search language of the information system in question. The request may take the form of specifying indexing terms for look-up in the back-of-the-book index of a case reporter, it may be an oral request made by telephone to the local library where books of interest may be found, or it may be a request formulated with Boolean operators for retrieval by a computerized system.

If the search request is accepted, the information system engages its retrieval mechanism, yielding references to the user. In the back-of-the-book index, these may be references to the cases listed under the indexing terms; in the library, it may be the return call from the librarian explaining which books have been identified; and in the computerized system, it may be a response identifying which documents satisfy the Boolean argument. Retrieval mechanisms may be designed and may function very differently, as exemplified above by an index look-up, a librarian's use of library information systems, and the computer programs of a text retrieval system. The retrieval mechanism also defines the valid form of the search request - for example the search language. There is a close correspondence between the search request and the retrieval mechanism.
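
The third of these examples, a request formulated with Boolean operators, may be sketched as follows. This is a minimal illustration in Python and not the mechanism of any particular system: the function names `build_index`, `search_and` and `search_or` and the three sample documents are invented for the purpose. The sketch builds an index from words to document numbers and answers a request by intersecting or uniting the sets of documents.

```python
def build_index(documents):
    """Map each lower-cased word to the set of document ids containing it."""
    index = {}
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index.setdefault(word, set()).add(doc_id)
    return index


def search_and(index, *terms):
    """Documents containing every one of the terms (Boolean AND)."""
    sets = [index.get(t.lower(), set()) for t in terms]
    return set.intersection(*sets) if sets else set()


def search_or(index, *terms):
    """Documents containing at least one of the terms (Boolean OR)."""
    return set().union(*(index.get(t.lower(), set()) for t in terms))


# Invented sample documents, for illustration only.
documents = {
    1: "liability for damage caused by defective products",
    2: "copyright in computer programs",
    3: "liability of the producer of computer programs",
}
index = build_index(documents)

print(sorted(search_and(index, "liability", "programs")))  # [3]
print(sorted(search_or(index, "copyright", "damage")))     # [1, 2]
```

The response is a set of document identifications, that is, references; the user must still obtain and read the documents themselves.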

[Page 105 ]



As noted, the result is a number of documents - or document elements - being made available to the user. These documents may, through the reference in an index, be an abstract or the authentic text of the source. It may, however, be only a reference, and the user has to employ an auxiliary system to secure the authentic text of the case. Actually, the use of a citation index may have been the occasion for the telephone call by a lawyer to the law library, in order to get a photocopy of a case not available in his private library. The communicated document is a selection, the criteria of the selection being implied by the search request.

After this short and general sketch of an information system, we shall look into a few particulars. Information systems will not, however, be discussed in detail at this stage. The reason is simply that in part II we shall discuss, in a more detailed and technical way, a special type of information system, the text retrieval system, which is of special interest within the context of this book.

3.4.2 Information - and on what

(1) The concept of information

It is not actually necessary in this book to introduce the technical definition of "information", but it may be useful to make the point in order to avoid some confusion which may be caused by all the different meanings in which this term is currently used.

In most legal contexts, the word "information" is used loosely as a term which may be a synonym for - for instance - "data" or "news" or "reports" etc. In information science, the word is used with a more specific meaning, but here also definitions vary according to the pragmatics of the use. However, two main definitions may be discerned.

The first is based on the communication theory formulated by Shannon in 1948. According to this theory, information is defined in terms of the probability of the occurrence of a certain event or the reception of a certain message. The lower the probability of the

[Page 106 ]



occurrence of the event or the reception of the message, the higher the information contained by the event or that message. Thus the information contained by rare characters like "+" or "x" will be higher than the information contained by commonplace characters like "," or "e". The same would apply to words like "dolus" or "copyright" compared to words like "the" or "for".

By way of illustration, one may refer to the well-known game in which one person thinks of a word, and the others are to guess that word letter by letter. If the word is "lawyer", it would probably take a number of guesses to determine the first letter "l"; to determine the next letter "a" would be quite a bit easier. And if one had proceeded as far as "lawy", the completion of the word by the last two letters would be trivial.
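
The relation between probability and information in this theory is usually written I(x) = -log2 p(x), the result being measured in bits. The following Python sketch is offered only as an illustration of that formula: the function names are our own, and the probabilities are estimated from a very small invented sample, so the figures it yields say nothing about English text in general.

```python
import math
from collections import Counter


def self_information(probability):
    """Information in bits of an event occurring with the given probability."""
    return -math.log2(probability)


def character_information(text):
    """Estimate bits per character from the relative frequencies in text."""
    counts = Counter(text)
    total = sum(counts.values())
    return {ch: self_information(n / total) for ch, n in counts.items()}


# A tiny invented sample; real estimates would need a large body of text.
sample = "the law of the land and the letter of the law"
bits = character_information(sample)

# The rarer character "w" carries more information than the common "e".
print(round(bits["e"], 2), round(bits["w"], 2))
```

An event with probability one half carries exactly one bit, and each halving of the probability adds one further bit.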

The concept of information in communication theory is mentioned only to stress that this concept is quite formal, and only loosely connected with the intuitive meaning implied by everyday use. Communication theory is concerned with signs and their formal properties. It is not appropriate for our context. Here "information" should be given a semantic rather than a syntactic interpretation.

In information and computer science, "information" is usually taken as "knowledge or new knowledge". It is usual to distinguish between "data" and "information". In order for data to be transformed into information, it has to be received and understood by someone. As long as these two conditions are not fulfilled, the data remains a collection of characters and is not transformed into knowledge.

A simple example borrowed from Abramson (1963:2) may clarify the relationship. The phrase "le soleil brille" will yield information only to some readers. Firstly, the reader actually has to read the phrase (communication), and secondly he must be able to understand French (understanding). A more juridical example may be offered. In everyday speech the words "domicile" and "residence" are perhaps synonyms, but to a private international lawyer the words will have different implications. Both words will, however, yield information to the layman as well as to the lawyer.

[Page 107 ]



When the concept of "information" is defined in this way, it becomes relative and subjective. Different individuals will, due to their differing backgrounds and knowledge, understand identical sets of data differently, and consequently have different information communicated to them by the same data.

For our discussion, this concept would seem appropriate. It is easy to integrate into the sketchy model of the legal decision process (presented in chapter 2) the "arguments" derived from legal sources (ie a type of data) as being one type of "information" communicated by these sources.

Further refinements of the definition could be produced, but the main point is to underscore that "information" should be regarded as a process of understanding, ie as some perceptive activity in the mind of a reader.

Actually, there are a number of terms which are rather closely related to the idea of "information". The term "sign" is a very general term, denoting any symbol which may be associated with meaning. A typical sign is a letter or a number, but a sign may also be a sentence, an action, a picture etc (Stamper 1973:18). What is to be qualified as a sign in a given situation will then depend upon the pragmatics of the situation. The word "signal" is nearly a synonym for "sign", perhaps most frequently used for a transient sign like an electrical signal or a light signal. In psychology, the words "sign" and "signal" are often used with the same meaning as the word "stimulus", cf the wide interpretation of "sign" noted above.

The word "data" is used in a way akin to the word "sign". By "data" one does, however, often imply that the data may be transformed into information - data is potential information, while a sign (for instance a single letter out of context) is not necessarily intended to convey information.

Such terminology offers a simple three-level structure: data (which is potential information), information and relevant information. When data takes the form of a text, the transition from

[Page 108 ]


data to information for a reader is trivial and easily understood. This effortless transition from data to information may make the distinction rather superfluous for most legal discussions. It would nevertheless seem that this distinction is productive in the perspective of the legal communication and decision processes.

Without going into detail, it should be stressed that there are some inherent problems in the definition (or rather: characterization) given above. For instance, it should be noted that old information is also information. Langefors (1970:59) defines information as "any kind of knowledge or message that can be used to improve, or make possible a decision", relating the concept to the traditional definition of information as "data of value in decision making" (Yovits/Ernst 1967:280). There would seem to be a problem in defining something as "information" when no decision has to be taken - even when "decision" is given a rather wide and vague meaning, similar to "making up one's mind". Attempts to phrase definitions should, however, be assessed in the light of the use made of these definitions. For the purpose of this book the simple definition, presuming information to be data communicated and understood, will suffice.

(2) The subject of information

The legal information system is supposed to supply the lawyer with information relevant to his problems. This information is of a legal nature - and an explanation of what this may imply is given in sect 2.6 above. The information to be conveyed by a legal information system is the same type of information which may be derived from legal sources.

In order to make a point, one may look at a type of system which is more similar to those usually discussed in the literature of computerized information systems. We may imagine an information system supporting a plumber in his selection of copper pipes. The plumber has a huge store of such pipes. The pipes are stored in such a way that on each shelf

[Page 109 ]



there are only pipes of one length and one diameter. Each shelf is numbered. The plumber has an information system which is a conventional card index. On each card there are data on the diameter and length of the pipes. The cards are sorted on diameter. If the plumber needs a pipe with certain properties, he looks up his index on the diameter he needs, and ascertains whether suitable lengths are available. If such a pipe is found, he may ask his assistant to take a trip to the store and collect a length of pipe from a shelf identified with a number.
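
The card index of the plumber is simple enough to be sketched in a few lines of Python. The sketch is an illustration under stated assumptions: the figures on the cards are invented, the name `find_shelves` is our own, and a "suitable" length is taken to mean any length not shorter than the one required.

```python
from collections import defaultdict

# Hypothetical cards: (diameter in mm, length in cm, shelf number).
cards = [
    (15, 200, 1),
    (15, 300, 2),
    (22, 200, 3),
    (22, 400, 4),
]

# The cards are sorted on diameter, so diameter serves as the look-up key.
by_diameter = defaultdict(list)
for diameter, length, shelf in cards:
    by_diameter[diameter].append((length, shelf))


def find_shelves(diameter, min_length):
    """Return the shelves holding pipes of this diameter that are long enough."""
    return [shelf for length, shelf in by_diameter.get(diameter, [])
            if length >= min_length]


print(find_shelves(22, 300))  # [4]
print(find_shelves(28, 100))  # []
```

The answer is either a shelf number or nothing, and it is the same whoever asks.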

Transferring this example to a legal environment, the plumber is replaced by a lawyer who is preparing his case before the court. He would like to have a precedent to support his arguments on a certain point. He turns to his legal information system, retrieves a reference to a certain case and asks his para-legal secretary to copy the appropriate case report.

The comparison demonstrates the differences in the situation of the plumber and the lawyer.

Firstly, it is easy to see that the plumber, when designing his information system, is in a better position to predict which criteria should be taken into consideration: length and diameter may be obvious choices. The lawyer has a more difficult task - he will hardly be able to predict in respect to what future problems he may want to retrieve a certain case.

Secondly, the relevance assessment is rather different in the two situations. The assessment of the plumber is simple while the lawyer will often have a hard task interpreting a case before deciding whether arguments derived from that case have relevance, and their relative weight.

These two aspects relate to basic differences in the two situations. The information retrieval of the plumber is "fact retrieval", that of the lawyer is "interest retrieval" - and these two types of information situations imply different requirements for the information systems.

Fact retrieval is, as the name implies, a search for facts: a specified type of information (names, numbers, volumes) or a certain document (a

[Page 110 ]


letter specified by reference number, a bill etc). In fact retrieval the relevance assessment is absolute. There is only one correct response, and this is independent of the user making the search. Consequently, requirements for relevance may be stated prior to the search itself. Interest retrieval is searching for references or identification of documents which discuss or explain a certain problem. It is more complex than fact retrieval, and the relevance assessment is relative to the user. Legal information retrieval for solving legal problems is a typical example of interest retrieval.

In this context, one may also point to another interesting difference. In the information system of the plumber, there is an evident and unproblematic distinction between the pipes themselves and the description of the pipes found on the indexing cards. The legal information retrieval system lacks this obvious distinction - it is a gradual distinction between the authentic text of the legal source and document elements describing the legal source.

In sect 3.1 we distinguished between the authentic text of the source as an element of the document, and such additional elements which were designed by the editor. The additional elements are generally designed to satisfy the relevance or retrieval functions, and are typically descriptions of the source. But elements in the authentic text itself may be descriptions: The titles of a statute, for instance, are part of the authentic text and are important in interpreting the statute, but are also descriptions of statutes, chapters or sections.

In this way information in a legal information system will have two different types of information as subject. Partly it will be information on legal norms as found in the legal sources, and partly it will be information on the legal sources as found especially in the additional document elements. It is not necessary, only characteristic, that both types of information are communicated within the same legal information system.

The descriptions of the legal sources represent some sort of a way station on the road from a defined legal problem to the legal norms, which have to be

[Page 111 ]



related to legal sources. The descriptions do not give the lawyer information which can be utilized directly as a legal argument. They serve as a basis for a temporary decision: Whether the authentic text of the source should be consulted. This intermediate decision is analogous to the decision of the plumber sending his assistant for a length of copper tubing. In the actual legal argument the source may not be utilized until the next stage. And similarly the plumber will utilize the pipe in his work (if he does not discard the pipe as "non-relevant" due to bends or discoloration).

Once one is aware of this double nature of a legal information system, it causes no real problems for terminology or reasoning. But without this clarification one may have problems in identifying what is really the subject of the information conveyed by legal information systems. And perhaps one will also be somewhat more cautious in applying the reasoning developed in respect to other types of information systems.

[Page 112 ]



3.5 Using the information system

3.5.1 User-constructed information systems

Definitions of information systems vary according to the context in which they are discussed. In the literature on computerized legal information systems, the definition of a "system" usually pivots on a provider of a service; a centre, a publisher or some organization. Thus, it is usual to describe ITALGIURE, EUROLEX, LEXIS etc as systems.

In this perspective, a legal information system has one provider and a number of users or subscribers. Features of the system, like data base content or updating response and frequency, are quite well defined. The data base of ITALGIURE at a given date consists, for instance, of the documents stored in its various text files.

But as for other system concepts, one may amend the definition for different purposes, adapting it to highlight other aspects in the relationship between the provider and the user of a service. An obvious alternative would be to let the definition pivot on the user rather than the provider. In this perspective, a legal information system has one user and a number of different providers.

This may be a perspective well suited to bring out some features of the information situation of the user, and this situation is, of course, essential for an understanding of how legal information systems work within a jurisdiction.

In the perspective of the user, providers offer information services of which he may take advantage - at a certain price. These services will be of different natures, from newsletters through journals and case reporters to monographs and, of course, on-line retrieval systems.

The user will see these offers as potential building-blocks for his own information environment.

[Page 113 ]



He will choose according to costs and to his perceived information needs, and in this way patch together a self-constructed information system.

This will be a heterogeneous system, composed of services based on different technology, from the conventional paper-based systems to, possibly, the advanced computerized services. The homogeneity of information systems typical in a definition pivoting on the provider is lost and, consequently, many of the concepts used to describe an information system must be restated. For instance, how is the data base of such a system to be defined? Is the university library part of the data base of the user-constructed system of an academic lawyer?

Not only will this system be a patchwork composed by the individual user, but individual users will hardly compose the same patchwork. Even lawyers specialized within the same field will probably design their own crazy quilts of information systems, which differ in most details.

The provider-oriented information system is common to a number of users, and may therefore be discussed as a matter of general interest. The user-designed system, however, will be specific to the individual user. Consequently, a general discussion of any such particular system is not desirable; one should rather try to develop some way of discussing characteristics of these systems.

Within the frame of this tentative sketch of user-constructed legal information systems the possible use of such a theory will be indicated. But this tentative nature of the sketch is also some sort of disclaimer in respect to the details drawn.

Ideas basic to this sketch were first presented by Bing 1979, and have been considerably developed by Bing 1982 and 1983.

[Page 114 ]


3.5.2 Availability factors

(1) Introduction

The use of any information service is associated with costs. This is obvious when the user subscribes to a journal or a computerized service: the user is then billed for the subscription fee. It is perhaps less obvious, but still evident, when the user browses through his own files or looks up references in a compilation of statute law. In this case the cost is associated with the expenditure of time.

"Availability factors" is the - perhaps somewhat inelegant - term chosen to describe any circumstance associated with the use of an information system causing costs for the user, mainly costs in terms of money or time.

The concept is borrowed from Blekeli (1974:30-32), and has been used roughly in this sense in prior studies, for instance Bing/Harvold 1977:22-24 and Bing 1979. Availability factors are related to some of Cleverdon's (1967) "performance criteria", especially that sub-set qualified as "operations-oriented criteria" ("response time", "user effort" and "form of output"). It may be argued that "response time" and "form of output" are only two of the possible specifications of the more general criterion, "user effort". Lancaster (1977:312-321) does not use the concept of "availability factors", but is concerned with "accessibility", especially in respect to libraries and the physical access to material in a library.

Below, two different categories of availability factors are discussed. It may be noted, however, that the costs generated by the factors will also be of two different types.

Some costs are associated with maintaining the user-constructed legal information systems. These costs will be subscription fees, salaries to staff responsible for filing or categorizing material, costs for furnishing the library and renting space for it, costs of terminals, microform readers or other acquired equipment etc.

[Page 115 ]



These maintenance costs are not associated with any single case of the user, but with the pre-problem stage in which the user prepares for problem solving. When a case comes along, part of the maintenance costs will have to be assigned to that case. In the simplest example the amount to be assigned will be the maintenance cost over a period of time divided by the probable number of cases within the same period. More realistically, a user will differentiate according to the type of case, and take account of the current interest rate. And obviously, general costs related to other aspects of his business would have to be similarly distributed - like salaries to secretaries, office rent etc.

Other costs are related to the work on each case. The user may spend hours in the library searching for relevant literature, and telecommunication costs and fees for accessing computerized data bases may escalate.

These are variable costs which will vary from one case to another. The variable costs of a case will have to be added to the calculated fraction of maintenance costs to determine the costs of information retrieval for that case.
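
The simple calculation described above may be set out as a short Python sketch. The function name `cost_per_case` and all figures are hypothetical, chosen only to show the arithmetic; the sketch covers the simplest example, and ignores both differentiation by type of case and the interest rate.

```python
def cost_per_case(maintenance_cost, expected_cases, variable_cost):
    """Share of the maintenance cost assigned to one case, plus its variable cost."""
    if expected_cases <= 0:
        raise ValueError("expected_cases must be positive")
    return maintenance_cost / expected_cases + variable_cost


# Illustrative figures only: 12 000 a year in maintenance, 60 expected cases,
# and 150 in variable costs (search fees, telecommunication) for this case.
print(cost_per_case(12_000, 60, 150))  # 350.0
```

The first term is the same for every case in the period; only the second varies with the work done on the individual case.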

The maintenance costs, as a rule, may be quite easy to determine. The variable costs, generally, will be quite difficult to discern from the general work on the case. Some costs are quite easy to identify: the cost information printed out after a terminal session may simply be included in the client's account. But it is difficult to separate a lawyer's time in "finding the law" and "analysis of the law". Research and analysis are probably iterative and interdependent activities, and are hardly to be generally distinguished.

There are studies which claim to do this, for instance Lang 1972:65 ("finding law" and "analysis to determine relevance") and Gluek 1976:83 ("Informationssuche" and "Informationsverarbeitung"). See also Erikstad 1979:62-69 where a terminal session for legal information retrieval was divided into "relevance assessment" and "giving commands and search arguments". Such distinctions may be of interest for special purposes, but in

[Page 116 ]


principle one should be wary of attempts to quantify processes which, analytically, can hardly be separated.

These general problems do not, however, create any major difficulties in discussing in principle the costs of information retrieval for a case - although they demonstrate that such a discussion will have to be somewhat theoretical, and that there may be severe difficulties associated with attempts to determine exactly which part of general maintenance and variable costs is to be qualified as relevant.

(2) Pragmatic and formal availability factors

Availability factors may be classified in different ways, but there is one distinction which is quite important - that between pragmatic and formal availability factors.

Pragmatic factors are those discussed in the introductory section: the costs associated with purchases and fees, expenditure of time and money to access and use information systems. There are numerous different pragmatic factors.

An interesting, though trivial, factor is distance. The costs associated with using a certain information service are related to the distance from the user to the place where that service may be accessed. This distance is an availability factor, only to be overcome through incurring costs - the user spends time going to the files in the neighbouring room, the next floor or the local library; the user has to wait for a mailed request to reach a documentation center, etc. It may be offered as some sort of universal law of the use of legal sources that the frequency with which a source is accessed is inversely related to the distance between the user's desk and the point of access.

Pragmatic factors have the common characteristic that they can be overcome by incurring costs. By allocating sufficient resources, a user may always have the information made available in spite of severe pragmatic availability factors.

[Page 117 ]



Not so in respect to formal availability factors. These are circumstances which determine the access to information services, but which cannot be overcome by incurring costs.

A typical example is the formal availability factor of the law of confidentiality. In many jurisdictions, the decisions by public authorities are a source of law - a new decision must always take into consideration the result of prior decisions. But these decisions will generally incorporate personal information on the clients subject to the decisions. And such information will very often be protected by confidentiality. The lawyers working within that agency will have access to former decisions, and may argue on the basis of such decisions - which may be cited in an anonymized form. But a lawyer representing a client is denied access to the files containing the prior decisions, and cannot utilize this important source of law in his own legal argument. And this availability factor cannot be overcome by incurring additional costs - it is normative, and may not be removed by user effort.

(3) User research on availability factors: An example

As "availability factors" has not been a concept directly utilized in any user research (perhaps with the exception of Karnov 1978), one cannot expect to find empirical results directly related to this concept, setting out, for instance, which availability factors are of greatest practical importance. But several studies characterize more indirectly the availability factors in the information environment of the user.

The major German study, Jungjohann/Seidel/Soergel/Uhlig 1974:49, lists the most frequent causes for the "verspaetete oder ausbleibende Information". This may be regarded as some sort of ranking of availability factors. Four causes are identified which each explain at least 10 per cent of the cases, the sum of these four causes explaining more than three quarters of the examples in which information is belated or missing.

[Page 118 ]



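Fig 3/7 - Causes for missing information

Zeitmangel (lack of time)  33 %
Zeitraum zwischen Entscheidung und Veroeffentlichung zu lang (interval between decision and publication too long)  21 %
Literaturbestand nicht ausreichend (library holdings insufficient)  13 %
Verzoegerung der Umlaeufe (delays in circulation)  10 %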

This table has several interesting features. For instance, the main cause - "lack of time" - may cover two entirely different situations. The first situation may be one when the user has too little time available to make adequate efforts to research the law - a situation common, for instance, in overworked public agencies. The cause is here related to the general job situation rather than to information systems. But it may also cover the situation in which the response time of the information system is too long for the user - for instance a lawyer working against the deadlines set by the court - to be able to profit from the use of the system. In that case, the cause is clearly related to the information system. It may be argued that "lack of time" is too general a cause to be very useful in an analysis of the information situation of the user - it only implies that the user lacks resources, and does not explain why resources are lacking, or what types of resources are lacking.

Similar general causes may be found in other studies of user research, for instance in Karnov's careful study of Danish lawyers, where one of the conclusions (1978:30-31) is that the major difficulty perceived by the Danish lawyer is the large volume of literature, which is too difficult to access. Again, this is rather too general to help understand in detail what the problems of the users are.

The second cause identified by the German study - the lapse of time between a decision and its publishing - is not an availability factor, but rather a criticism of the information services offered. It may be of interest to note that the Danish study (Karnov 1978:30-31) identified as the second major cause of defects in the user's information situation that the relevant sources were not "officially published" - a cause which seems related to the one mentioned in the German study.

[Page 119 ]



The third is, however, a typical availability factor - namely defects in the library. Such defects have to be repaired by supplementing the library by out-of-house services, which may obviously cause delays. Similarly the last of the four German causes is a typical availability factor - the time for journals etc on circulation to reach the user.

In this context, the analysis of available user studies will not be pursued in order to demonstrate that they may disclose further information on which availability factors are of importance to the lawyer. Further examples will, however, be given in the following sections. For general discussions, see Bing 1982:164-171 and Bing/Frøystad 1982.

3.5.3 The cost curve

(1) Area of interest - area of documentation

Above in sect 3.5.2, availability factors were briefly discussed as circumstances causing costs for the user in acquiring or accessing legal information services. It was also mentioned that these costs may be divided into maintenance and variable costs.

Looking to the maintenance costs, these are most closely related to the user-constructed information system. Obviously, this information system is not designed by accident. The user has some sort of motivation in acquiring or subscribing to a certain service.

A user, in general, will have some idea as to what future problems he may be required to respond to. These are problems which correspond to his specialization or office, and may be described as his area of interest. When assessing possible information services, the user will try to prepare for his future work, and obviously try to find services which are useful in respect to his perceived area of interest.

Any information service offered will have a documentation area, cfr sect 3.2.2 above.

Taken together, the selection criteria of a certain

[Page 120 ]



provider-oriented legal information system define the documentation area of that system. The user may describe his area of interest by corresponding criteria (though this is rarely done very consciously).

When acquiring legal information services, one may picture the initial assessment of the user as an attempt to identify and acquire information services with documentation areas overlapping his area of interest. By adding one service to another, the documentation areas of the services provide an overlay on the area of interest. Only if the user represents a typical user - for instance a dedicated tax lawyer - may he expect to find information systems which match exactly his area of interest to their area of documentation. More generally the user will have to find several systems of which the union of documentation areas provides sufficient coverage for his area of interest.
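The overlay of documentation areas on an area of interest may be pictured with ordinary set operations. The following is only a minimal sketch: the subject labels, the three services and the function name covered_fraction are hypothetical and chosen for illustration, and treating an area as a flat set of subjects is a simplifying assumption.

```python
# Hypothetical subject labels; none of them are taken from the text.
area_of_interest = {"income tax", "VAT", "customs", "tax procedure"}

# Hypothetical services, each with its own documentation area.
services = {
    "statute compilation": {"income tax", "VAT", "contract", "criminal law"},
    "tax journal": {"income tax", "tax procedure"},
    "customs reporter": {"customs"},
}


def covered_fraction(interest, documentation_areas):
    """Fraction of the interest area inside the union of the documentation areas."""
    union = set().union(*documentation_areas)
    return len(interest & union) / len(interest)


# One service alone covers only part of the area of interest ...
single = covered_fraction(area_of_interest, [services["tax journal"]])
# ... while the union of all three services covers it completely.
combined = covered_fraction(area_of_interest, services.values())
print(single, combined)
```

In this invented case no single service matches the area of interest, but the three taken together do, which is the situation described for the less typical user.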

But the matching of areas of documentation to the area of interest is only part of the assessment made by the lawyer. Certainly another part of his consideration would be the rank of the documented legal sources. For a tax specialist, a compilation of statutes in force may have a general documentation area. But on the other hand, statutes have such a high rank in a legal system that any lawyer will require easy access to them. The lack of concord between documentation and interest areas is compensated by the rank (or importance) of the documented type of legal sources.

The "rational" behaviour of a user of legal information services may thus be indicated. He will consider his area of interest. He will survey the information services offered, and try to match the documentation areas of these services to his own area of interest. In doing this matching, he will attempt to select systems whose composite documentation areas provide an acceptable coverage of his area of interest. And he will consider the type of legal sources documented, in order to have satisfactory access to those of a high rank.

[Page 121 ]



(2) The local data base: The concept of coverage

Provider-oriented information systems have well-defined data bases while the user-constructed information systems do not, as the systems themselves are not very well-defined. It is a system problem to determine where the user-constructed system stops and its environment begins.

This is a practical, though in most cases a trivial, problem. For instance in a university, the professors will acquire some books and journals for their personal use - but will rely mainly on the university library. This library may be a national library, distributed throughout the country and linked to other national libraries. Obviously the professor's information system should be considered larger than that represented by the books and journals on his own shelves - but equally obviously not all-embracing. Similar problems will arise in respect to lawyers, government agencies etc.

Here our concept of costs (derived from overcoming availability factors) may help us. Consider the cost curve which is theoretically drawn as the lawyer accesses new information services. The curve starts with an initial cost calculated for the individual case on the basis of maintenance costs. The variable costs will then make costs grow when accessing any service.

But in our reasoning, we have pointed out that the user does not acquire information services by chance: he acquires services thought to be useful. The services most easily available are those services generally considered most useful. Consequently, in most cases the user will first employ these easily available services. Only when these do not give the necessary information to solve the problem will the user move to other and less available systems.

In this way, we may argue that the curve will become progressively steeper. It may also be argued that the curve typically will have an "elbow", indicating the point where the user leaves those services prepared for use by prior acquisition, and turns to other services. This "elbow" may be regarded as a definition of what is to be considered the local "data base" in

[Page 122 ]



the user-constructed system. This data base cannot, in contrast to that of provider-oriented information systems, be clearly defined. But the reference to the cost curve should provide a working definition.
Fig 3/8 - Typical cost curve for use of information services in an average case - elbow indicating "local data base"
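The shape of such a curve can be sketched numerically. What follows is a minimal illustration only: it assumes that services are consulted in order of availability, and that the "elbow" may be located at the largest jump in marginal cost. The function names and all figures are hypothetical and are not taken from any of the cited studies.

```python
def cumulative_costs(fixed_share, marginal_costs):
    """Cost curve: cumulative cost after each additional service consulted.

    fixed_share is the part of the maintenance costs assigned to the case;
    marginal_costs are the variable costs of consulting each service in turn.
    """
    curve = [fixed_share]
    for cost in marginal_costs:
        curve.append(curve[-1] + cost)
    return curve


def elbow_index(marginal_costs):
    """Index of the last service before the largest jump in marginal cost."""
    jumps = [b - a for a, b in zip(marginal_costs, marginal_costs[1:])]
    return jumps.index(max(jumps))


# Hypothetical variable costs: four services on the office shelves,
# followed by two services that must be obtained from outside.
marginal = [1, 1, 2, 2, 9, 12]
curve = cumulative_costs(5, marginal)
elbow = elbow_index(marginal)
print(curve, elbow)
```

Under these invented figures the services up to and including the elbow index would form the local "data base"; the services beyond it lie on the steep part of the curve.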

User research would seem to bear out this point.

The Danish survey has a fascinating triad of responses. Three quarters of the users maintained that the sources for which they had a "great need" were kept easily available, ie in the same building as the lawyer had his office (Karnov 1978:30) - illustrating the point of user-constructed information systems being designed to match the area of interest. The majority also maintained - in ideal correspondence with the advice of legal theory - that the availability factor of distance never made them refrain from collecting information from a library outside their own office (Karnov 1978:44). On the other hand, responding to a different question, the majority also stated that they never or rarely based their research on legal sources made available from the outside (Karnov 1978:43). It is tempting to maintain that these two last responses illustrate our progressively increasing cost curve: even though there may be a need for additional information, the costs associated with acquiring this from outside localities are simply too high.

[Page 123 ]



The same observation is made by Lang 1972 for Canadian lawyers (cited from Bing/Frøystad 1982:65):

"Lawyers carry out the bulk of their research in their own offices with the aid of law books from their office library. Rare is the lawyer who conducts most of his research at home or at a county court library."

We may perhaps venture to conclude that the "elbow" in the cost curve is a definition of sorts of the "data base" of the user-constructed information system. This "elbow" may be difficult to determine empirically. But both the theoretical arguments and the cited user research would seem to point to the content of the local library - the shelves in the office of the user - as a good indication of the nucleus of that data base.

In describing the data base of user-constructed systems, the concept of coverage is very useful. Often coverage is used to describe the quality of the data base of a provider-oriented system. In information systems, the value of high coverage is usually emphasized (see for instance Tapper 1973:78-80).

Coverage is generally defined as "the extent to which the system includes a data base required by the user" (McCarn/Stein 1967). Using the two concepts introduced in sect 3.5.3 (1), one may say that high coverage indicates a large intersection of documentation and interest areas, and exhaustive documentation of sources within the interest area.

Following this definition, one will find that it leads to curious results for legal information services. Take for instance the information service provided by a case reporter for the supreme court, and compare the data base of this with the requirements of a specialized tax lawyer. It is obvious that the tax expert would require legislation, regulatory law and decisions from lower courts as well as text books and other secondary sources. Consequently, the data base covers only a small fraction of the "data base required by the user" - coverage is low. But this is hardly a point of criticism, or even of interest, in respect to the case reporter.

The much acclaimed concept of coverage does not apply

[Page 124 ]



adequately to the specialized legal information services, with areas of documentation clearly defined in respect, for instance, to types of legal sources. The concept was, we believe, originally developed to characterize the quality of a library serving a certain profession. It could also be successfully applied to describe the data base of a global information system, claiming that its documentation area is identical to the area of interest of a certain typical user situation or that of a whole profession. Because many computerized legal information systems (as opposed to most conventional services) claim to be general and to serve any information need of a lawyer, coverage may be a relevant measure for their data bases.

In respect to coverage it should be stressed that this criterion is well suited to characterize the data base of the user-constructed information system. It may indeed be said that the aim of a user is to construct an individual information system with an optimal coverage.

The point has also a reverse aspect: the concept of coverage is of little interest in relation to specialized legal information services - in this respect, the qualification of the area of documentation (and some additional criteria, like the fraction of sources within this area documented by the system), will be better suited to describe and analyse the system.
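The contrast between the two measures can be made concrete with the example of the supreme court case reporter and the tax lawyer. The sketch below is an illustration under stated assumptions: the document identifiers are invented, and the function names coverage and completeness are hypothetical labels for, respectively, the McCarn/Stein notion and the "fraction of sources within this area documented by the system".

```python
def coverage(data_base, required):
    """Share of the sources required by the user that the data base contains."""
    return len(data_base & required) / len(required)


def completeness(data_base, documentation_area):
    """Share of the sources within the service's own documentation area
    that the service actually documents."""
    return len(data_base & documentation_area) / len(documentation_area)


# Invented identifiers: four supreme court decisions exist,
# and the case reporter documents all of them.
supreme_court_decisions = {"sc1", "sc2", "sc3", "sc4"}
case_reporter = {"sc1", "sc2", "sc3", "sc4"}

# Invented requirement of a tax lawyer: two of those decisions plus
# statutes, a regulation, lower court decisions and a text book.
tax_lawyer_needs = {
    "sc2", "sc4", "statute_a", "statute_b",
    "regulation_a", "lower1", "lower2", "textbook",
}

reporter_coverage = coverage(case_reporter, tax_lawyer_needs)
reporter_completeness = completeness(case_reporter, supreme_court_decisions)
print(reporter_coverage, reporter_completeness)
```

With these invented sets the reporter scores low on coverage but is complete within its own documentation area, which is why the second measure describes a specialized service more fairly.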

(3) Factors determining the cost curve

No empirical data are available making it possible to determine exactly the cost curve discussed in theory above. And certainly there are a number of practical and fundamental problems connected with the gaining of such empirical information. There are, however, other possibilities for determining some of the factors influencing the shape of the curve.

Obviously, one important factor is the size of the local data base, or simply the local library. The larger the local data base, the longer the segment of the curve with moderate inclination, below the "elbow".

[Page 125 ]



In general this will clearly be related to the level of maintenance costs. As a rule of thumb, the greater the maintenance costs, the larger the local data base generated. And the fraction of the maintenance costs to be assigned to the individual case will be correspondingly larger.

It is unrealistic to isolate the costs of information services from the other costs related to a case. But if this is done for the sake of argument, one will see that a user with low maintenance costs will be able to offer advice in simple cases at a comparatively lower cost than a user with higher maintenance costs. In difficult cases requiring more legal research, however, the latter user will be able to offer advice at the lower cost. This illustrates some of the deliberations necessary for the user in determining the ambition level of his individual information system.
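The trade-off may be illustrated by a small calculation. This is a sketch under a strong simplifying assumption: each user's cost curve is treated as a straight line (a fixed share of maintenance costs plus a constant variable cost per document), whereas the curves discussed here become progressively steeper. The function names and all figures are hypothetical.

```python
def case_cost(maintenance_share, variable_cost_per_document, documents):
    """Cost of one case under a linear simplification of the cost curve."""
    return maintenance_share + variable_cost_per_document * documents


def break_even_documents(low, high):
    """Number of documents at which both users have the same case cost.

    low and high are (maintenance_share, variable_cost_per_document) pairs
    for the user with low and with high maintenance costs respectively.
    """
    (m_low, v_low), (m_high, v_high) = low, high
    if v_low <= v_high:
        raise ValueError("higher maintenance must buy lower variable costs")
    return (m_high - m_low) / (v_low - v_high)


# Hypothetical users: a small local library versus a large one.
low_maintenance = (2.0, 3.0)
high_maintenance = (10.0, 1.0)

break_even = break_even_documents(low_maintenance, high_maintenance)
simple_case = (case_cost(*low_maintenance, 1), case_cost(*high_maintenance, 1))
difficult_case = (case_cost(*low_maintenance, 10), case_cost(*high_maintenance, 10))
print(break_even, simple_case, difficult_case)
```

Below the break-even point the user with low maintenance costs is cheaper; above it the user with the larger local data base is. The same comparison applies when deciding whether an investment in the individual information system pays off on the average case.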

Two or more users will often cooperate in maintaining a common information system. This is the case among partners of a law firm, the employees in a government office, the professors at a university etc. Obviously, if each case contributes an identical fraction towards maintenance costs (and all other factors being equal), one will realize that an office of two partners will be able to spend twice as much on maintaining their service as a lawyer working on his own.

This demonstrates that as far as costs for information services are considered an important factor in the total costs for legal services, the larger organization is given an advantage.

Lang (1972:77) has a table in which the acceptable level of costs for legal information retrieval services is set out. It demonstrates that the majority of large offices are prepared to pay more than three times as much for such services as small offices. Lang defines small offices as those with 1-3 lawyers, and large offices as those exceeding 10 lawyers (1972:61). When compensated for the underlying facts, the survey does actually demonstrate that the individual lawyer working in small offices is prepared to pay as much as, or more than, the lawyer working in the

[Page 126 ]


larger offices. Even so, the larger offices will have the advantage of a larger local data base.

Though large organizations may maintain a larger local data base at the same cost per case, the size of an organization probably is another factor determining the cost curve. A larger organization implies a greater average distance from an office to the local library or the local files, a greater chance that the book or the facility desired is already in use by a colleague, etc.

This would seem to be supported by some results of the German user survey, see Jungjohann/Seidel/Soergel/Uhlig 1974:44. In this survey, the users identified causes for unsatisfactory results of information retrieval. The two factors ranked highest included "Informationen zu verstreut" (information too scattered) and "Literatur nicht in der Bibliothek erhaeltlich" (literature not available in the library). One should think that in a larger organization, both causes would be relatively more severe: documents would be even more dispersed throughout the organization, and books would be more frequently on loan to some colleague.

Also a third factor will be related to this: the cost of searching a large data base is generally considered to be relatively higher than the cost of searching a smaller data base (Langefors 1970:227). As larger organizations accumulate relatively larger local data bases, the costs for searching these will also be higher.

According to this argument, it may be productive to distinguish between "large" and "small" organizations (though how these should be defined, will be relative to the country in question). Other factors being equal, large organizations may establish relatively larger local data bases for the same cost per case. By doing this, the large organization creates a curve with a comparatively long segment of modest inclination for variable costs. But nevertheless the inclination will be steeper than that of the corresponding segment for small organizations: the size of the organization itself and the size of the local data base will result in a relatively steeper inclination of the curve representing variable costs. The contrast between "small" and "large" organizations is illustrated in fig 3/9.

[Page 127 ]



Fig 3/9 - Typical cost curves of small and large organizations (functional performance excluded)

The factors above are not, however, the only relevant factors forming the cost curve. Also the functional performance of the user-constructed information systems is important. The meaning of this concept is discussed above at sect 3.3.3 and will be discussed further in respect to user-constructed systems below at sect 3.5.4 (2). Functional performance may be characterized as those features of an information system which facilitate retrieval, relevance assessment and access to the source document (retrieval, relevance and source functions). Obviously, by establishing a library index, retrieval of books from the library becomes more efficient. By making summaries of in-house reports on file, it becomes easier to determine their relevance to a problem at hand. By subscribing to a computerized service documenting regulatory law in force, it becomes easier to have a copy of the current form of an identified regulation.

These examples should be sufficient at this point to illustrate what form increases in functional performance may take. Such measures will contribute to an increase in the costs of maintaining the user-constructed information system without a corresponding increase in the coverage of the local data base. But these measures will bring down the variable costs, making the first segment of the curve rise less steeply. Whether it is "rational" to invest in such enhanced features will be an assessment based on whether the user on the average needs to access a volume of documents greater than that indicated by the intersection of the curves representing the situation with and without the enhancement.

[Page 128 ]



Fig 3/10 - Cost curves with and without enhancement of the functional performance of the user-constructed information system

It would probably be possible to find further examples of factors forming the cost curve. The main point of this section is not, however, to provide an exhaustive list of such possible factors - but rather to exemplify how the curves may assist the reasoning in respect to features of the user-constructed information system. Such reasoning may be useful in determining the characteristics of different user situations, and in comparing such situations. This would seem to be true even when the curve itself cannot be determined in detail by empirical studies of the situation of the user.

(4) Availability discrimination

The cost curves may be used for comparing the situations of different users. As stated above, two users will rarely have an identical information situation: their choices in constructing their own individual information systems will vary. Consequently, also the cost curves will be specific to each of the users. The stressing of differences in user situations is therefore of minor interest.

There may, however, be unjustified differences between users, and such differences may be termed availability discrimination.

[Page 129 ]



In order to identify availability discrimination, one has to justify the comparison of users. A comparison would seem to be justified if the area of interest is (approximately) identical.

One may imagine that a group of lawyers is created of all those with an identical area of interest - for instance tax lawyers. For all these lawyers, the cost curves are determined. Then this group is subdivided by different criteria. If the average cost curves for the sub-groups are clearly different, one would maintain that this is a case of availability discrimination.
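The comparison of sub-groups may be sketched as follows. This is an illustration only: it assumes that each lawyer's cost curve has been sampled at the same points (cost after 0, 1, 2, ... documents), which is exactly the empirical information said above to be unavailable. The sub-group names, the figures and the function names are hypothetical, and what counts as "clearly different" is left to judgment.

```python
from statistics import mean


def average_curve(curves):
    """Point-wise mean of several cost curves of equal length."""
    return [mean(points) for points in zip(*curves)]


def max_gap(curve_a, curve_b):
    """Largest vertical distance between two curves sampled at the same points."""
    return max(abs(a - b) for a, b in zip(curve_a, curve_b))


# Hypothetical cost curves for tax lawyers with the same area of interest,
# sub-divided by location.
central = [[5, 6, 7, 10], [5, 7, 8, 12]]
peripheral = [[5, 7, 10, 18], [5, 8, 11, 20]]

central_average = average_curve(central)
peripheral_average = average_curve(peripheral)
gap = max_gap(central_average, peripheral_average)
print(central_average, peripheral_average, gap)
```

A large gap between the averaged curves of two sub-groups with the same area of interest would, on the reasoning above, suggest availability discrimination; a small one would not.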

A trivial way of sub-dividing lawyers would be to have one group of those residing and working within the capital (or larger cities), and one group of those working elsewhere. As mentioned above under section 3.5.2 (2), geographical distance is an availability factor. It would seem probable that the centrally situated lawyer has a better information situation, due to smaller communication costs (both in time and telecommunication rates), better access to central files or libraries (Eckhoff 1971:14) etc. (On the other hand factors like the rent of office space, salaries etc may bring total costs up.) The discrimination of the periphery in respect to a centre is a theme from the general policy debate in many countries, and one should not be surprised to see this reflected also in respect to the information situation of lawyers.

Another sub-division frequently discussed is the one between lawyers within and outside public administration. Within, for instance, the tax administration, lawyers will have access to information systems denied to those outside - perhaps due to the very nature of the information in these files (see sect 3.5.2 (2) for examples of formal availability factors). This will create a different information environment for the two groups, which may often seem less than justified.

A third sub-division of special interest in our context is that between the groups with and without local access to computerized legal information services. A computerized service may often imply relatively high maintenance costs, an increase in the local data base, and

[Page 130 ]



a dramatic increase in functional performance. Referring to our discussion at sect 3.5.3 (3), one may maintain that this sub-division will resemble the sub-division between large and small organizations. Also, the sub-division between large and small organizations may correspond roughly to the sub-division between organizations in and outside cities. In this fact lies, perhaps, the core of the problem posed by the tariff structures of some computerized services: will they strengthen an existing availability discrimination thought to be less than desirable? Will the potential reduction of availability discrimination through new information technology be eliminated due to, for instance, tariff structures confirming existing differences?

Availability discrimination is not always unjustified. The example of geographical distance as a cause for availability discrimination illustrates an issue which cannot be resolved for legal information systems alone, but which has to be addressed as a general policy issue of centralization and decentralization. Formal availability factors - which will always create availability discrimination - may perhaps be fully justified. But the existence of availability discrimination - of which certainly more examples can be given than mentioned in this section - provides a focal point for the critical examination of legal information systems.

3.5.4 Active and passive use of information systems

(1) Introduction

When discussing legal information systems, especially computerized systems, the interest often concentrates on the efficiency of the system in retrieving relevant documents. The assumed situation of the user implies a legal problem, which the user is supposed to solve. Legal problems are solved by locating legal sources, from which applicable norms are derived through interpretation.

This may be seen as the active use of information systems. The active use presumes a legal problem, and the user formulates search requests in order to

[Page 131 ]



retrieve possibly relevant documents. This situation also requires that the information systems will permit the processing of a problem-oriented request. A text retrieval system is extremely well designed for processing problem-oriented requests, accepting flexible and complex queries on the same level of specification as the documents themselves.

But there is also another typical user situation, which is often associated with the conventional information systems - for instance a legal journal. When receiving a new issue, the user will often browse through the journal. This is to be expected, as the system will typically communicate documents within his area of interest. The user has to maintain his background knowledge to give off-the-cuff response to clients, and to be efficient in the active use of information systems.

This current-awareness function may be seen as a passive use of information systems. No legal problem is being solved, the user is only brushing up his knowledge of the law.

User research has acquired some information on passive use of systems. It should be noted that there are indications of conferences and conversation with colleagues playing an important role as passive information systems (Rawlence 1975:349, Technical Study I 1977:99 and Karnov 1978:24). In general, however, journals are ranked high for maintaining background knowledge (Jungjohann/Seidel/Soergel/Uhlig 1974, Technical Study I 1977:99 (for "routine use") and Karnov 1978:24-25). The studies also have a corresponding ranking for active use. Here the journals disappear, and information systems like compilations of statutes in force, case reporters and commentaries to the statutes are emphasized.

One should note that different systems are designed for different use. Journals, especially those approaching newsletters in form and content, will be of interest mainly for passive use. Case reporters will be used both initially for keeping track of developments (emphasized in some jurisdictions like the US by the advance sheet service of case reporters), and for later reference in active use. Computerized services will rarely be used passively.

[Page 132 ]



It may be a point of criticism in respect to computerized services that they are designed with little thought given to passive use. It should, for instance, be trivial to have a command displaying the most recent documents which have been added to the data base as a current awareness function. Some services do include a newsletter, highlighting recent developments, more or less in the form of an electronic journal. Such functions deserve more attention in respect to computerized services, because the passive use is probably extremely important for practising lawyers as they are often required to give instant answers to the requests of users.

As far as we know there is no system offering a "forum function", in which the users may discuss recent developments, pin notes for the attention of colleagues when working on a problem etc. One would expect the users to find such alternative and improved ways of communicating of interest, in view of the vital part played by inter-lawyer communication.

(2) Functional performance

The different functions will not be discussed in this context (cfr above at sect 3.3.3).

Also the user-constructed system will have features determining functional performance. The major characteristic of such systems, however, is that they exploit functions imbedded in the information services provided through subscription or purchase.

For retrieval, the user will exploit back-of-the-book indexes, text retrieval systems, digests, citation indexes etc, which are part of the provided services. The user will not have one uniform retrieval system, but rather a multitude of sub-systems which he has to manipulate in order to access his local data base. Most of the retrieval systems will be designed to access only those documents which are parts of the data base of one provider-oriented system: the index of a book will give access to only that book, the index of a case reporter will give access to only the reports published in that reporter, a text retrieval

[Page 133 ]



system will give access only to documents in the computerized data base, etc. The sole exceptions will be designed as more general retrieval tools - a citation index may give access to more than one reporter, an annotated encyclopedia may cite all major legal sources on an item etc. But also more general retrieval tools like these will be quite specific and will not give access to the total data base of the user-constructed system.

This, then, will be a major defect in the user-constructed systems. The user will have to work with a number of different retrieval systems, with different "search languages" (the index of a case reporter also defines some sort of search language). Only if a global service is offered - a service whose documentation area corresponds closely with the area of interest of the user, and with a high coverage - will the user avoid the reduced performance and increasing variable costs of legal research.

The user may, of course, invest in creating a retrieval function for his individual information system with the consequences for the cost curve indicated in sect 3.5.3 (3). User research does indicate that lawyers will rarely establish such individual retrieval facilities.

In the German user survey, the fourth most important cause of unsatisfactory retrieval was given as "Unvollkommene Suchhilfen" (imperfect search aids) (Jungjohann/Seidel/Soergel/Uhlig 1974:44). The study does not, however, specify whether the unsatisfactory research tools were associated with the retrieval function provided in purchased information systems, or lack of adequate retrieval tools in respect to the user-constructed system. In an Italian survey, Rawlence (1975:374-375) looked into the libraries of "avvocati e procuratori" in Florence, finding that 50 per cent did not have any special system in their libraries, and 77 per cent were lacking any form of library index. This is clearly an indication of a lack of specific retrieval tools for the user-constructed system.

For the relevance assessment, this is also a function imbedded in the systems purchased by the user. In respect to this function, fewer problems will arise by the profusion of different systems. The relevance

[Page 134 ]



assessment is made on the basis of a retrieved document, and even if this function is solved differently in one system compared to another, this does not create similar needs for an umbrella solution.

The same holds true for the source function. In order to access the authentic form of a legal source, the user has to rely on services provided to him. (An exception is sources generated by, for instance, a court. These will be relevant also for future problems at that court, and may be accessed by the users within the court through their own files.) The source function is critical to the lawyer - it is hardly interesting for him to know of the existence of a probably relevant case if the text of that case cannot be consulted. There are some specialized providers like libraries or documentation centres which will make the authentic form available when the source is identified.

This brief discussion of functional performance demonstrates that in respect to user-constructed systems, it may be that the diversification of retrieval systems represents a defect which the user may be able to repair by some sort of investment in the maintenance of his own system. Computerized legal information services generally have data bases including the union of a number of conventional data bases (for instance a great number of volumes of case reporters). This may increase the performance in the user-constructed information system not only due to the more efficient retrieval function, but also because the fragments are brought together under one umbrella.

(3) Delegation

User research has disclosed a somewhat unexpected tendency to delegate legal research. Technical Study I 1977:33 found that intermediaries were employed frequently in respect to computerized systems. This was considered rather discouraging, as computerized services are designed to be used by the lawyer himself, and to give feed-back which the lawyer is expected to utilize, for instance to reformulate his search request.

In our context, delegation implies that the user, in respect to his user-constructed information system, finds resources within his own organization to do legal research, generally increasing the variable costs. As discussed above, the user-constructed system will often contain a number of retrieval systems. It might be especially disturbing if it were found that delegation took place to a large degree in respect to a sub-system which generally has a high coverage and an efficient retrieval function, like many computerized services.

But in this respect user research is not conclusive. It is true that as a rule lawyers do not delegate research. But many of them state that this is simply because they do not have this possibility within their own organization - in Denmark 40 per cent, in Germany an even greater percentage (Karnov 1978:41-42, Uhlig 1976:61). Looking at the German survey, one will find that delegation takes place most frequently among senior staff in public service, and among lawyers in private practice working in "large organizations" (Uhlig 1976:62).

Obviously the results may be interpreted as showing that when the organization offers the possibility for delegation, the lawyer will use this possibility quite frequently in order to increase the resources put into legal research. The salaries of such staff may be considered part of the maintenance costs of the user-constructed system, making the assessment of whether their employment is "rational" similar to that indicated in sect 3.5.3 (3). Alternatively the time spent by intermediaries on legal research may be seen as a variable cost related to the individual case.

Also, it would seem probable that computerized information systems are initially subscribed to by the very same organizations which have resources for delegation. The emergence of intermediaries would then be only a repetition of a general theme rather than something special for computerized systems - though the focus and analysis of these systems would have made such intermediaries more visible.

It is also probable that activities relating to the source function are widely delegated - for instance through the copying of a source text or by other practical steps taken to secure the authentic text in an appropriate form for the "end user". On the other hand, relevance assessment would probably be carried out mainly by the "end user" himself, or at least delegation would here play a different role. It might be interesting to see user surveys concentrate on the different activities within user organizations, for instance to assess the changes implied by the growing group of "para-legals" working with lawyers within many jurisdictions.

3.5.5 Technological change and costs

In closing, one may employ some of the arguments presented above in order to discuss the situation in which a user purchases a computerized legal information service. For the user-constructed information system, the access to a computerized service will have several implications.

The first effect will be an extended local data base. Obviously the degree of extension will vary. For some users, the computerized service will greatly extend the data base into documentation areas quite beyond their primary area of interest. For other users, the data base will be largely identical to the local data base already existing. By concentrating on these two ends of a spectrum, one will see that for the first group, the main motivation will lie in access to a larger local data base, while the second group will subscribe to the service only if the more efficient computerized system brings down the costs of legal research.

In terms of the cost curve, a subscription to a computerized service will generally imply an increase in maintenance costs. This will be due partly to the purchase of local equipment (terminals, printers, modems) and partly to a flat subscription rate for the service (a minimum fee or similar rate). The absolute increase is difficult to assess, and certainly depends on the situation of the user. If the service presumes the purchase of dedicated terminals and printers, the investment may be relatively high compared to the general maintenance costs of the user's total system. But if the user already has word processing equipment (even at his own desk), and this may be used for retrieval, the upgrading may be quite marginal. As a rule, one must presume somewhat increased maintenance costs, even if the computerized service replaces one or more conventional systems to which the user formerly subscribed.

The effect on variable costs is more difficult to assess, and should be considered from two angles.

Firstly, one should consider situations in which the computerized system greatly increases the local data base. If the computerized system is very general, an increase will often be the result. For a "rational" user, this will be attractive if his area of interest is wide. A general practising lawyer may have specialities, but usually he has to tackle problems across the whole range of the legal system. For practical purposes, his local data base composed of conventional services will have to be rather limited compared to his area of interest - his coverage will not be too high. To this lawyer, it would be tempting to have a general computerized service made available and thus greatly increase the local data base. The effect would mainly be an extension of that segment of the cost curve which has a modest inclination. Whether this segment becomes slightly steeper through the purchase of the computerized service would probably not be decisive.

Secondly, one should consider situations where the user is highly specialized. A computerized service offering a local data base completely outside the user's area of interest would then not be attractive. But if this user were offered a system with improved functional performance, this might bring down the variable costs. So even without increasing the local data base within his area of interest, the use of information systems for this data base could actually prove to be cheaper.

This indicates two typical user situations which are both characterized mainly in terms of the area of interest. A general practising lawyer may be offered as an example of a user with a general and often vaguely defined area of interest. He would be tempted by a computerized service offering an increase in the local data base. A government agency may be offered as an example of a specialized user. This agency would be tempted by a system bringing down the variable costs of legal research.

Actually a government agency frequently has an information situation with a relatively steep variable cost curve. This is because the organization is often relatively large, the local data base extensive and the functional performance of the user-constructed system rather poor. The two latter observations are related to the fact that in many agencies, their own prior decisions represent an important legal source, but are available through indexes which have to be kept up to date by manual and cumbersome means. By making the same data base available through a computerized system, the variable costs of researching the files may be brought down. Also, the increased maintenance cost is a question of budget rather than of profitable investment. It may indeed be maintained that many European systems have started out within public agencies mainly for this reason.

It is a matter of discussion whether a computerized service actually will bring down variable costs for retrieval, or the average costs for legal research (the sum of maintenance and variable costs for an average case).

In his detailed analysis of the then proposed German JURIS system, Gluek (1976:84-85, 91 and 94) estimated a reduction in man-hours per user of 1.4 hours in public administration and of 1.7 hours in law firms.

The probable cost reduction from increased functional performance will have to be set against the additional variable costs from use of the information retrieval services (fees to the service providers, which are generally time-related).
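The comparison described above can be illustrated by a minimal sketch. It assumes a simple linear cost model in which the average cost per case is the yearly maintenance cost spread over the number of cases, plus the variable cost per case (research time plus time-related service fees). All names and parameters are hypothetical and introduced only for illustration; no figures are taken from the surveys cited.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CostProfile:
    """Hypothetical cost profile of a user-constructed information system."""
    maintenance_per_year: float          # subscriptions, equipment (fixed costs)
    research_hours_per_case: float       # time spent on legal research per case
    hourly_rate: float                   # cost of one hour of the researcher's time
    online_hours_per_case: float = 0.0   # connect time to a computerized service
    service_fee_per_hour: float = 0.0    # time-related fee charged by the provider


def variable_cost_per_case(p: CostProfile) -> float:
    """Research time plus time-related service fees for a single case."""
    return (p.research_hours_per_case * p.hourly_rate
            + p.online_hours_per_case * p.service_fee_per_hour)


def average_cost_per_case(p: CostProfile, cases_per_year: int) -> float:
    """Maintenance cost spread over the cases, plus the variable cost."""
    if cases_per_year <= 0:
        raise ValueError("cases_per_year must be positive")
    return p.maintenance_per_year / cases_per_year + variable_cost_per_case(p)


def break_even_cases(before: CostProfile, after: CostProfile) -> Optional[float]:
    """Cases per year at which extra maintenance is offset by variable savings.

    Returns None when the new system does not reduce the variable cost,
    in which case no volume of cases recovers the added maintenance cost.
    """
    saving = variable_cost_per_case(before) - variable_cost_per_case(after)
    if saving <= 0:
        return None
    extra_maintenance = after.maintenance_per_year - before.maintenance_per_year
    return max(extra_maintenance, 0.0) / saving
```

Under this model a subscription lowers the average cost only when the number of cases per year exceeds the break-even volume. The sketch holds the research time per case fixed once the service is introduced; it does not capture any change in user behaviour.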

Such a comparison would also be difficult to make, as user behaviour will probably change when a new information system is introduced. As mentioned above (sect 3.5.2 (3)), "lack of time" is a major cause for not using information services. If this is due to the constraints in the situation of the user, a more efficient information system may ease the frustration of not being able to research cases adequately, while the same amount of time per case is still spent on such research. The result will not be a reduction of costs for legal research, but more adequate research within the same cost constraints.

To produce a detailed calculation of probable savings in terms of time and money through the introduction of a computerized service will therefore be difficult, mainly because of a lack of the necessary insight into the new system's influence on user behaviour. Perhaps assessments could be made more easily by the rule-of-thumb principle introduced in the beginning of this section - estimating whether a potential user is characterized mainly by a general or a specialized area of interest, and considering the impact which an additional computerized service will have on his user-constructed information system - both in terms of local data base and functional performance.
