Wednesday, February 17, 2010

8. File Handling

File Organization methods
Files should be organized so that they can be processed efficiently: the file organization must match the characteristics of the file's data and the processing method. Three methods of file organization are available on disk systems: sequential, indexed sequential, and relative file organization.

Sequential file organization
In sequential file organization, the records in the file are positioned in a sequential order, for example in order of part number.

Indexed sequential file organization
Indexed sequential file organization is one in which the records are filed sequentially, but a table (index) is available that identifies the location of groups of records, thereby reducing access time.

Relative file organization
Relative file organization is such that the logical order and physical order of the records do not necessarily correspond with one another. For such a file, a technique, or rule, is required to determine the location of the record in the disk system.
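
As a rough sketch of how these three organizations are declared in a COBOL program, the FILE-CONTROL entries below select one file of each kind. The file names, data names, and keys are hypothetical, and support for indexed and relative files depends on the compiler and runtime in use. The program opens no files; it only shows the SELECT entries.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. ORG-DEMO.
      * Sketch only: file names, data names and keys are hypothetical.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      *    Sequential: records are stored and read one after another.
           SELECT PART-SEQ-FILE ASSIGN TO "PARTSEQ.DAT"
               ORGANIZATION IS SEQUENTIAL.
      *    Indexed: the index on PI-PART-NO locates a record by key.
           SELECT PART-IDX-FILE ASSIGN TO "PARTIDX.DAT"
               ORGANIZATION IS INDEXED
               ACCESS MODE IS RANDOM
               RECORD KEY IS PI-PART-NO.
      *    Relative: WS-REL-KEY holds the record's slot number.
           SELECT PART-REL-FILE ASSIGN TO "PARTREL.DAT"
               ORGANIZATION IS RELATIVE
               ACCESS MODE IS RANDOM
               RELATIVE KEY IS WS-REL-KEY.

       DATA DIVISION.
       FILE SECTION.
       FD  PART-SEQ-FILE.
       01  PART-SEQ-REC.
           05  PS-PART-NO      PIC X(5).
           05  PS-PART-DESC    PIC X(20).
       FD  PART-IDX-FILE.
       01  PART-IDX-REC.
           05  PI-PART-NO      PIC X(5).
           05  PI-PART-DESC    PIC X(20).
       FD  PART-REL-FILE.
       01  PART-REL-REC.
           05  PR-PART-NO      PIC X(5).
           05  PR-PART-DESC    PIC X(20).

       WORKING-STORAGE SECTION.
       01  WS-REL-KEY          PIC 9(5) VALUE 1.

       PROCEDURE DIVISION.
       0000-MAIN.
      *    No file is opened; the point is the SELECT entries above.
           DISPLAY "FILE-CONTROL ENTRIES COMPILED".
           STOP RUN.
```

For the relative file, the rule that determines a record's location would be whatever calculation the program uses to set WS-REL-KEY before a READ or WRITE.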

This chapter shows you how to process sequential files in COBOL.

Any program that (1) reads data from input files or (2) produces output files requires an INPUT-OUTPUT SECTION and a FILE SECTION to describe the input and output areas.

INPUT-OUTPUT SECTION
The INPUT-OUTPUT SECTION of the ENVIRONMENT DIVISION follows the CONFIGURATION SECTION and supplies information concerning the input and output devices used in the program. In the FILE-CONTROL paragraph, a file-name is selected for each file to be used in the program; in addition, each file-name selected is assigned to a device. The SELECT statement is coded in Area B.

Eg 8.1:
       ENVIRONMENT DIVISION.
           :
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMPLOYEE-FILE
               ASSIGN TO "EMP.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

FILE SECTION.
Each file is described in the FILE SECTION with an FD sentence that may consist of a series of clauses. After the clauses are specified, the FD sentence ends with a period. FD is an abbreviation for File Description. Each FD entry will describe a file defined in a SELECT statement in the ENVIRONMENT DIVISION.
The two entries, DATA DIVISION and FILE SECTION, are coded in Area A. FD is also coded in Area A. The file-name, however, is typically coded in Area B.

Eg 8.2:
       DATA DIVISION.
       FILE SECTION.
       FD  EMPLOYEE-FILE
           LABEL RECORDS ARE STANDARD
           RECORD CONTAINS 42 CHARACTERS
           BLOCK CONTAINS 10 RECORDS.

Label Records
Label records are usually created as the first and last records of a disk or tape file to provide identifying information about that file. Labels are created on output files so that, when the same file is later read as input, the labels may be checked to ensure that the file being accessed is the correct one. The COBOL compiler will supply the routines for writing labels on output files and for checking labels on input files if the entry LABEL RECORDS ARE STANDARD is included.

This LABEL RECORDS clause will result in the following:
For output files, the first record on disk or tape file will be created as a standard 80-position header label identifying the file to the system; similarly, the last record on the disk or tape will be created as a trailer label.
For input files, these labels will be computer-checked to ensure that the file being processed is the correct one.

The clause LABEL RECORDS ARE STANDARD is permitted for disk and tape files only. Devices such as printers do not use label records, since identifying information is unnecessary where data is visible to the human eye. The clause LABEL RECORDS ARE OMITTED is used for such files.

RECORD CONTAINS clause
The RECORD CONTAINS clause indicates the size of each record. For printer files the RECORD CONTAINS clause may include one extra position that is used to control the spacing of the form (e.g., single spacing, double spacing). Thus, for 132 character printers, a record size is sometimes set as 133 characters. In such cases, the first or leftmost position in these 133-position print records is the form control position; it is not actually printed.
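
A minimal sketch of such a 133-position print record follows. REPORT-FILE, the file name "REPORT.TXT", and the PRT- data names are hypothetical. Whether the first position is actually interpreted as form control depends on the compiler and the output device; when the file is an ordinary text file, as assumed here, that position is simply written as a leading space.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PRT-DEMO.
      * Sketch: REPORT-FILE, "REPORT.TXT" and the PRT- names are
      * hypothetical. Whether position 1 is treated as form control
      * depends on the compiler and the output device.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT REPORT-FILE ASSIGN TO "REPORT.TXT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  REPORT-FILE
           LABEL RECORDS ARE OMITTED
           RECORD CONTAINS 133 CHARACTERS.
       01  PRINT-REC.
      *    Position 1: form-control position, not printed.
           05  PRT-CONTROL         PIC X(1).
      *    Positions 2-133: the 132 printable positions.
           05  PRT-LINE            PIC X(132).

       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN OUTPUT REPORT-FILE.
           MOVE SPACES TO PRINT-REC.
           MOVE "EMPLOYEE LISTING" TO PRT-LINE.
           WRITE PRINT-REC.
           CLOSE REPORT-FILE.
           STOP RUN.
```

The two elementary items add up to the 133 characters named in the RECORD CONTAINS clause, and the FD uses LABEL RECORDS ARE OMITTED because a print file carries no labels.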


BLOCK CONTAINS clause
The BLOCK CONTAINS clause is included in the File Description entry only for files in which disk or tape records have been blocked. Blocking is a technique that increases the speed of input/output operations and makes more effective use of storage space on disk and tape. A group of logical records is included within one block to maximize the efficient use of a disk or tape area. For example, reading in a block of 10 disk records is more efficient than reading in each disk record separately. Even if blocking is used, the program processes records in the standard way, that is, one logical record at a time.

Record Description entries
A record is a unit of information consisting of related data items within a file. Most often, a file consists of records that all have the same length and format. These are called fixed-length records.
For each file defined, we have one record format.

Eg 8.3:
       01  EMPLOYEE-REC.
           05  EMP-NAME.
               10  EMP-FIRST-NAME  PIC X(10).
               10  EMP-LAST-NAME   PIC X(15).
           05  EMP-DEPT            PIC X(4).
           05  EMP-SALARY          PIC 9(5)V99.
           05  EMP-DOJ             PIC 9(6).

Input/output verbs
There are four input/output verbs: OPEN, READ, WRITE, and CLOSE.

OPEN statement
Before an input or an output file can be used by the program, it must be opened. An OPEN statement designates files as either input or output. It also accesses the specific devices and makes the files available for processing. It performs header label routines if label records are STANDARD: the OPEN statement checks the header label to determine whether the correct file has been accessed.

Eg 8.4:
           OPEN INPUT EMPLOYEE-FILE.
           OPEN OUTPUT REPORT-FILE.

The order in which files are opened is not significant. The only restriction is that a file must be opened before it may be read or written; a file must be accessed before it may be processed. Since the OPEN statement accesses the files, it is generally one of the first instructions coded in the PROCEDURE DIVISION.

READ statement
After an input file has been opened, it may be read. A READ statement transmits data from the input device, assigned in the ENVIRONMENT DIVISION, to the input storage area, defined in the FILE SECTION of the DATA DIVISION.
The primary function of the READ statement is to transmit one data record to the input area reserved for that file. That is, each time a READ statement is executed, one record is read into primary storage.
The READ statement has, however, several other functions. Like the OPEN statement, it performs certain checks. It checks the length of each input record to ensure that it corresponds to the length specified in a RECORD CONTAINS clause in the DATA DIVISION. If a discrepancy exists, an error message prints, and a program interrupt occurs.
The READ statement will also use the BLOCK CONTAINS clause, if specified, to perform a check on the blocking factor.
The AT END clause of the READ statement tests whether any input remains and tells the computer what to do when there is no more data to be read.

Eg 8.5:
           READ EMPLOYEE-FILE
               AT END
                   MOVE "YES" TO END-OF-FILE.

WRITE statement
The WRITE instruction takes data in the output area defined in the DATA DIVISION and transmits it to the device specified in the ENVIRONMENT DIVISION.
Note that although files are read, we write records. The record-name appears on the 01 level and is generally subdivided into fields. The record description specifies the format of the output.

Eg 8.6:
           WRITE EMPLOYEE-REC.

CLOSE statement
A CLOSE statement is coded at the end of the job after all records have been processed to release these files and deactivate the devices. All files that have been opened at the beginning of the program are closed at the end of a program. The CLOSE statement, like the OPEN, will perform additional functions. When creating disk or tape records, for example, the CLOSE will create trailer labels; it will also rewind a tape.

Eg 8.7:
           CLOSE EMPLOYEE-FILE.
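
The OPEN, READ, and CLOSE statements described above can be combined into a short listing program, sketched below. It assumes that "EMP.DAT" already exists and uses the 46-character record layout of the file-creation program in Eg 8.9. The program name FILE-LST, the names WS-EOF-FLAG, END-OF-EMP-FILE, and WS-REC-COUNT, and the paragraph names are assumptions made for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILE-LST.
      * Sketch: lists the records written by a program like FILE-CRT.
      * WS-EOF-FLAG, WS-REC-COUNT and paragraph names are assumptions.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMPLOYEE-FILE ASSIGN TO "EMP.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  EMPLOYEE-FILE.
       01  EMPLOYEE-REC.
           05  EMP-NO              PIC 9(4).
           05  EMP-NAME.
               10  EMP-FIRST-NAME  PIC X(10).
               10  EMP-LAST-NAME   PIC X(15).
           05  EMP-DEPT            PIC X(4).
           05  EMP-SALARY          PIC 9(5)V99.
           05  EMP-DOJ             PIC 9(6).

       WORKING-STORAGE SECTION.
       01  WS-EOF-FLAG             PIC X(3) VALUE "NO".
           88  END-OF-EMP-FILE     VALUE "YES".
       01  WS-REC-COUNT            PIC 9(5) VALUE ZERO.

       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN INPUT EMPLOYEE-FILE.
      *    Priming read: fetch the first record before the loop starts.
           PERFORM 2000-READ-PARA.
           PERFORM 1000-LIST-PARA UNTIL END-OF-EMP-FILE.
           DISPLAY "RECORDS READ : " WS-REC-COUNT.
           CLOSE EMPLOYEE-FILE.
           STOP RUN.

       1000-LIST-PARA.
      *    Process the record already in the input area, then read on.
           ADD 1 TO WS-REC-COUNT.
           DISPLAY EMP-NO " " EMP-FIRST-NAME " " EMP-LAST-NAME
                   " " EMP-DEPT.
           PERFORM 2000-READ-PARA.

       2000-READ-PARA.
      *    AT END sets the flag that stops the loop in 0000-MAIN.
           READ EMPLOYEE-FILE
               AT END
                   MOVE "YES" TO WS-EOF-FLAG
           END-READ.
```

Reading once before the loop and again at the end of each pass means the AT END condition is tested before a record is processed, so an empty file produces a count of zero rather than a spurious record.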


COPY statement
A COPY statement is used to bring into a program a series of prewritten COBOL entries that have been stored in a library. Copying entries from a library, rather than coding them, has the following benefits: (1) it can save a programmer a considerable amount of coding and debugging time; (2) it promotes program standardization, since all programs that copy entries from a library will be using common data-names and/or procedures; (3) it reduces the time it takes to make modifications and reduces duplication of effort, because a change to a data entry can be made just once in the library without the need to alter individual programs; and (4) library entries are extensively annotated so that they are meaningful to all users, and this annotation results in better-documented programs and systems.
Most often, the COPY statement is used to copy FD and 01 entries that define and describe files and records. In addition, standard modules to be used in the PROCEDURE DIVISION of several programs may also be stored in a library and copied as needed.

Contents of EMP.REC
Eg 8.8a:
       01  EMPLOYEE-REC.
           05  EMP-NAME.
               10  EMP-FIRST-NAME  PIC X(10).
               10  EMP-LAST-NAME   PIC X(15).
           05  EMP-DEPT            PIC X(4).
           05  EMP-SALARY          PIC 9(5)V99.
           05  EMP-DOJ             PIC 9(6).

The DATA DIVISION entry using a COPY statement
Eg 8.8b:
       DATA DIVISION.
       FILE SECTION.
       FD  EMPLOYEE-FILE
           LABEL RECORDS ARE STANDARD
           RECORD CONTAINS 42 CHARACTERS
           BLOCK CONTAINS 10 RECORDS.
           COPY "EMP.REC".

A program to create the employee file.

Eg 8.9:
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FILE-CRT.
      * This program creates a sequential EMPLOYEE file.

       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMPLOYEE-FILE ASSIGN TO "EMP.DAT"
               ORGANIZATION IS LINE SEQUENTIAL.

       DATA DIVISION.
       FILE SECTION.
       FD  EMPLOYEE-FILE
           LABEL RECORDS STANDARD.
       01  EMPLOYEE-REC.
           05  EMP-NO              PIC 9(4).
           05  EMP-NAME.
               10  EMP-FIRST-NAME  PIC X(10).
               10  EMP-LAST-NAME   PIC X(15).
           05  EMP-DEPT            PIC X(4).
           05  EMP-SALARY          PIC 9(5)V99.
           05  EMP-DOJ             PIC 9(6).

       WORKING-STORAGE SECTION.
       01  WS-ANS                  PIC X(01) VALUE "Y".
           88  ANS-NO              VALUE "N" "n".

       PROCEDURE DIVISION.
       0000-MAIN.
           OPEN OUTPUT EMPLOYEE-FILE.
           PERFORM 1000-ACPT-PARA UNTIL ANS-NO.
           CLOSE EMPLOYEE-FILE.
           STOP RUN.

       1000-ACPT-PARA.
           DISPLAY "ENTER YOUR EMP CODE : " WITH NO ADVANCING.
           ACCEPT EMP-NO.
           DISPLAY "ENTER YOUR FIRST NAME : " WITH NO ADVANCING.
           ACCEPT EMP-FIRST-NAME.
           DISPLAY "ENTER YOUR LAST NAME : " WITH NO ADVANCING.
           ACCEPT EMP-LAST-NAME.
           DISPLAY "ENTER YOUR DEPARTMENT : " WITH NO ADVANCING.
           ACCEPT EMP-DEPT.
           DISPLAY "ENTER YOUR SALARY : " WITH NO ADVANCING.
           ACCEPT EMP-SALARY.
           DISPLAY "ENTER YOUR DATE OF JOINING : " WITH NO ADVANCING.
           ACCEPT EMP-DOJ.
           WRITE EMPLOYEE-REC.
           DISPLAY "DO YOU WANT TO ADD MORE RECORDS : "
               WITH NO ADVANCING.
           ACCEPT WS-ANS.
