Wednesday, February 17, 2010

12. Indexed Files

An indexed file is really two files – the data file, which is created in sequence but can be accessed randomly, and the index file, which contains the value of each key field and the disk address of the record with that corresponding key field. To access an indexed record randomly, the key field is looked up in the index file to find the disk address of the record; then the record is accessed in the indexed data file directly.

The index on a disk is similar to a book’s index, which has unique subjects (keys) and their corresponding page numbers (addresses). There are two ways to find a topic in a book. You can read the book sequentially, from the beginning, until the topic is found, but this is time-consuming and inefficient. The better method is to look up the topic in the index, find the corresponding page number, and go directly to that page. This is precisely how records can be accessed on a disk file that has an index.

With an indexed file, records can be accessed either sequentially or randomly, depending on the user’s needs. The term random access implies that records are to be processed or accessed in some order other than the one in which they were physically written on the disk.

Creating an Indexed File
Indexed files are created in sequence; that is, the indexed file is created by reading each record from an input file, in sequence by the key field, and writing the output indexed disk records in the same sequence. Note, however, that once the indexed file is created, it can be accessed randomly.

The ORGANIZATION clause
The clause ORGANIZATION IS INDEXED indicates that the file is to be created with an index.

The ACCESS clause
Since indexed files may be accessed either sequentially or randomly, the ACCESS clause is used to denote which method will be used in the specific program. If the ACCESS clause is omitted, the compiler will assume that the file is being processed in SEQUENTIAL mode.

The RECORD KEY clause
The RECORD KEY clause names the key field within the disk record that will be used to form the index. This field must be in the same physical location in each indexed record. Usually, it is the first field. It must have a unique value for each record.




Eg 12.1:
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IND-EMP-FILE
               ASSIGN TO "INDEMP.DAT"
               ORGANIZATION IS INDEXED
               ACCESS IS SEQUENTIAL
               RECORD KEY IS I-EMP-NO.
       DATA DIVISION.
       FILE SECTION.
       FD  IND-EMP-FILE
           LABEL RECORDS STANDARD.
       01  IND-EMP-REC.
           05  I-EMP-NO            PIC 9(4).
           05  I-EMP-NAME          PIC X(25).
           05  I-EMP-DEPT          PIC X(4).
           05  I-EMP-SAL           PIC 9(5)V99.


The INVALID KEY clause
With WRITE
The INVALID KEY clause is used with a WRITE instruction to test for two possible errors: (1) a key field that is not in sequence or (2) a key field that is the same as one already on the indexed file. If either of these conditions exists, we call this an INVALID KEY condition. The computer checks for an INVALID KEY prior to writing the record.
Thus, if you use an INVALID KEY clause with the WRITE statement and a record has an erroneous key, the record is not written and the statement(s) following INVALID KEY would be executed.

Eg 12.2:
           WRITE IND-EMP-REC
               INVALID KEY
                   PERFORM 2000-ERROR-PARA.
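
Putting the SELECT clauses and the WRITE ... INVALID KEY test together, a complete creation program might look like the sketch below. It is only an illustration under some assumptions: the input file EMP-IN-FILE, its name "EMPIN.DAT", and the paragraph names are hypothetical, the input is assumed to hold 40-character records already sorted by employee number, and ORGANIZATION IS LINE SEQUENTIAL is a compiler extension (available in GnuCOBOL and Micro Focus, for example) rather than standard COBOL.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CREATE-IND.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * EMP-IN-FILE and "EMPIN.DAT" are assumed names for the input.
           SELECT EMP-IN-FILE
               ASSIGN TO "EMPIN.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.
           SELECT IND-EMP-FILE
               ASSIGN TO "INDEMP.DAT"
               ORGANIZATION IS INDEXED
               ACCESS IS SEQUENTIAL
               RECORD KEY IS I-EMP-NO.
       DATA DIVISION.
       FILE SECTION.
       FD  EMP-IN-FILE.
       01  EMP-IN-REC              PIC X(40).
       FD  IND-EMP-FILE.
       01  IND-EMP-REC.
           05  I-EMP-NO            PIC 9(4).
           05  I-EMP-NAME          PIC X(25).
           05  I-EMP-DEPT          PIC X(4).
           05  I-EMP-SAL           PIC 9(5)V99.
       WORKING-STORAGE SECTION.
       01  WS-EOF                  PIC X VALUE "N".
       PROCEDURE DIVISION.
       0000-MAIN-PARA.
           OPEN INPUT EMP-IN-FILE
                OUTPUT IND-EMP-FILE.
      * Copy every input record to the indexed file, in key order.
           PERFORM UNTIL WS-EOF = "Y"
               READ EMP-IN-FILE
                   AT END
                       MOVE "Y" TO WS-EOF
                   NOT AT END
                       PERFORM 1000-WRITE-PARA
               END-READ
           END-PERFORM.
           CLOSE EMP-IN-FILE IND-EMP-FILE.
           STOP RUN.
       1000-WRITE-PARA.
           MOVE EMP-IN-REC TO IND-EMP-REC.
      * A rejected record is not written; the error paragraph runs.
           WRITE IND-EMP-REC
               INVALID KEY
                   PERFORM 2000-ERROR-PARA
           END-WRITE.
       2000-ERROR-PARA.
           DISPLAY "KEY OUT OF SEQUENCE OR DUPLICATE: " I-EMP-NO.
```

Because the file is opened for OUTPUT with ACCESS IS SEQUENTIAL, a record whose key is lower than or equal to the previous one is rejected and reported, and the program continues with the next input record.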

With READ
When reading a disk file randomly, we do not test for an AT END condition because we are not reading the file in sequence; instead, we include an INVALID KEY test. If there is no record in IND-EMP-FILE with a RECORD KEY equal to T-EMP-NO, the statement(s) following INVALID KEY will be executed.

Eg 12.3:
           DISPLAY "ENTER EMPLOYEE CODE :".
           ACCEPT T-EMP-NO.
           MOVE T-EMP-NO TO I-EMP-NO.
           READ IND-EMP-FILE
               INVALID KEY
                   PERFORM 600-ERR-RTN.

DELETE verb
The DELETE verb can be used to delete records from indexed files. Note that we use the file-name with the DELETE verb, but the word RECORD can be specified as well. That is, both DELETE IND-EMP-FILE and DELETE IND-EMP-FILE RECORD can be used to delete the record in the IND-EMP-FILE storage area. The file must be opened as I-O.
To delete a record from an indexed file, you should first read the record into storage and then instruct the computer to delete it. With ACCESS IS SEQUENTIAL a successful READ must precede the DELETE; with random or dynamic access the record identified by the RECORD KEY is deleted, and reading it first simply confirms that it exists.

Eg 12.4:
           MOVE "Y" TO WS-FOUND.
           MOVE 1001 TO I-EMP-NO.
           READ IND-EMP-FILE
               INVALID KEY
                   MOVE "N" TO WS-FOUND.
           IF WS-FOUND = "Y"
               DELETE IND-EMP-FILE
                   INVALID KEY
                       DISPLAY "ERROR DELETING RECORD".

Using ALTERNATE RECORD KEYs
Indexed files may be created with, and accessed by, more than one identifying key field. For example, we may want to access employee records using the name as the key field. To enable a file to be accessed randomly using more than one key field, we would need to establish an ALTERNATE RECORD KEY.
To establish multiple key fields for indexing, we use an ALTERNATE RECORD KEY clause in the SELECT statement.

Note:
1. More than one ALTERNATE record key can be used.
2. WITH DUPLICATES means that an ALTERNATE RECORD KEY need not be unique. Thus, fields like I-EMP-DEPT can be used as a key even though numerous records may have the same department number.
3. A record can be accessed by its RECORD KEY or any of its ALTERNATE RECORD KEYs.

Eg 12.5:
           SELECT IND-EMP-FILE
               ASSIGN TO "INDEMP.DAT"
               ORGANIZATION IS INDEXED
               ACCESS IS SEQUENTIAL
               RECORD KEY IS I-EMP-NO
               ALTERNATE RECORD KEY IS I-EMP-DEPT WITH DUPLICATES.

Accessing records randomly by alternate record key
The program that accesses the file by key field has the same SELECT statement except that ACCESS IS RANDOM is specified rather than ACCESS IS SEQUENTIAL. In the PROCEDURE DIVISION, we can access records by either I-EMP-NO, the record key, or I-EMP-DEPT, the alternate key.
The KEY clause is used with the READ statement when an indexed file has ALTERNATE RECORD KEYs that we want to use to randomly access a record. If the KEY clause is omitted when accessing a file randomly, the RECORD KEY is assumed to be the KEY used for finding the record.
Suppose ALTERNATE RECORD KEY WITH DUPLICATES was specified in the ENVIRONMENT DIVISION and there is more than one record with the same ALTERNATE RECORD KEY. The first one that was actually placed on the disk will be the one retrieved by the READ.
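
A random READ by alternate key can be sketched as follows. This is a minimal illustration, assuming that INDEMP.DAT was created with the same RECORD KEY and ALTERNATE RECORD KEY shown in Eg 12.5; the program name, the paragraph name, and the work field T-EMP-DEPT are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READ-BY-DEPT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IND-EMP-FILE
               ASSIGN TO "INDEMP.DAT"
               ORGANIZATION IS INDEXED
               ACCESS IS RANDOM
               RECORD KEY IS I-EMP-NO
               ALTERNATE RECORD KEY IS I-EMP-DEPT
                   WITH DUPLICATES.
       DATA DIVISION.
       FILE SECTION.
       FD  IND-EMP-FILE.
       01  IND-EMP-REC.
           05  I-EMP-NO            PIC 9(4).
           05  I-EMP-NAME          PIC X(25).
           05  I-EMP-DEPT          PIC X(4).
           05  I-EMP-SAL           PIC 9(5)V99.
       WORKING-STORAGE SECTION.
      * T-EMP-DEPT is an assumed work field for the operator's input.
       01  T-EMP-DEPT              PIC X(4).
       PROCEDURE DIVISION.
       0000-MAIN-PARA.
           OPEN INPUT IND-EMP-FILE.
           DISPLAY "ENTER DEPARTMENT CODE :".
           ACCEPT T-EMP-DEPT.
           MOVE T-EMP-DEPT TO I-EMP-DEPT.
      * KEY IS names the alternate key; without it the RECORD KEY
      * (I-EMP-NO) would be used for the lookup.
           READ IND-EMP-FILE
               KEY IS I-EMP-DEPT
               INVALID KEY
                   DISPLAY "NO EMPLOYEE IN DEPT " T-EMP-DEPT
               NOT INVALID KEY
                   DISPLAY I-EMP-NO " " I-EMP-NAME
           END-READ.
           CLOSE IND-EMP-FILE.
           STOP RUN.
```

If several employees share the department code, this READ retrieves only the first of them, as described above.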

The START statement
The START statement enables a program to begin processing an indexed file sequentially but at a record location other than the first or next physical record in the file. The access of the file is to be in sequence (ACCESS IS SEQUENTIAL) if we use the RECORD KEY for finding a record, even though we want to start the access at some point other than the beginning. The ACCESS IS DYNAMIC clause is used if we want to begin processing an indexed file based on the contents of the ALTERNATE RECORD KEY.
When the record to be accessed has a key equal to the value placed in the RECORD KEY, the KEY clause in the START statement is not required. The INVALID KEY clause is executed only if no such record is found.
Note that the START locates the desired record but it does not READ it into storage. The record must always be brought into storage with a READ statement.

Eg 12.6:
           MOVE "Y" TO WS-FOUND.
           MOVE 1001 TO I-EMP-NO.
           START IND-EMP-FILE
               KEY IS > I-EMP-NO
               INVALID KEY
                   DISPLAY "THERE IS NO EMP NO > 1001"
                   MOVE "N" TO WS-FOUND.
           IF WS-FOUND = "Y"
               READ IND-EMP-FILE
                   AT END
                       MOVE "Y" TO WS-EOF.

Suppose we wish to begin processing with an I-EMP-NO greater than 1001, as in Eg 12.6. We must include a KEY clause with the START because we wish to position the file at a location greater than the value of the RECORD KEY. The KEY clause can be omitted only if the record to be located has a RECORD KEY equal to the one stored.

The ACCESS IS DYNAMIC clause
Sometimes we wish to access an indexed file both randomly and sequentially in a single program. For this, we specify ACCESS IS DYNAMIC.
In addition to using ACCESS IS DYNAMIC for combining sequential and random access techniques in a single program, we can use this clause to access records by ALTERNATE RECORD KEY. Also, when records are to be accessed by both RECORD KEY and ALTERNATE RECORD KEY, use ACCESS IS DYNAMIC.
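
The following sketch shows one way ACCESS IS DYNAMIC might be used to list every employee in one department: START positions the file on the alternate key, and READ ... NEXT RECORD then reads in sequence by that key. It assumes the file was created with the keys shown in Eg 12.5; the program name, the paragraph names, the work fields, and the department code "D001" are hypothetical.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LIST-DEPT.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT IND-EMP-FILE
               ASSIGN TO "INDEMP.DAT"
               ORGANIZATION IS INDEXED
               ACCESS IS DYNAMIC
               RECORD KEY IS I-EMP-NO
               ALTERNATE RECORD KEY IS I-EMP-DEPT
                   WITH DUPLICATES.
       DATA DIVISION.
       FILE SECTION.
       FD  IND-EMP-FILE.
       01  IND-EMP-REC.
           05  I-EMP-NO            PIC 9(4).
           05  I-EMP-NAME          PIC X(25).
           05  I-EMP-DEPT          PIC X(4).
           05  I-EMP-SAL           PIC 9(5)V99.
       WORKING-STORAGE SECTION.
      * "D001" is an assumed department code used for illustration.
       01  WS-DEPT                 PIC X(4) VALUE "D001".
       01  WS-EOF                  PIC X VALUE "N".
       PROCEDURE DIVISION.
       0000-MAIN-PARA.
           OPEN INPUT IND-EMP-FILE.
           MOVE WS-DEPT TO I-EMP-DEPT.
      * Position the file at the first record of the department;
      * START does not bring the record into storage.
           START IND-EMP-FILE
               KEY IS = I-EMP-DEPT
               INVALID KEY
                   DISPLAY "NO EMPLOYEE IN DEPT " WS-DEPT
                   MOVE "Y" TO WS-EOF
           END-START.
      * Read forward in alternate-key order from that position.
           PERFORM UNTIL WS-EOF = "Y"
               READ IND-EMP-FILE NEXT RECORD
                   AT END
                       MOVE "Y" TO WS-EOF
                   NOT AT END
                       PERFORM 1000-SHOW-PARA
               END-READ
           END-PERFORM.
           CLOSE IND-EMP-FILE.
           STOP RUN.
       1000-SHOW-PARA.
      * Stop as soon as a record from another department appears.
           IF I-EMP-DEPT = WS-DEPT
               DISPLAY I-EMP-NO " " I-EMP-NAME
           ELSE
               MOVE "Y" TO WS-EOF
           END-IF.
```

With dynamic access, the word NEXT is what makes the READ sequential; a READ without NEXT would be a random read by key.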

Rules for using the START statement
The file must be accessed with (a) ACCESS IS SEQUENTIAL for reading records in sequence by the RECORD KEY or (b) ACCESS IS DYNAMIC for reading records in sequence by an ALTERNATE RECORD KEY.
The file must be opened as either INPUT or I-O.
If the KEY phrase is omitted, the relational operator “IS EQUAL TO” is implied and the primary record key is assumed to be the key of reference.
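In the KEY phrase we can use the relational operators =, >, NOT <, or >= (in words: EQUAL TO, GREATER THAN, NOT LESS THAN, or GREATER THAN OR EQUAL TO).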
