File Storage and Retrieval Essay

Information stored in a mass storage system is conceptually grouped into large units called files. A typical file may consist of a complete text document, a photograph, a program, a music recording, or a collection of data about the employees in a company. Mass storage devices dictate that these files be stored and retrieved in smaller, multiple-byte units. For example, a file stored on a magnetic disk must be manipulated by sectors, each of which is a fixed, predetermined size. A block of data conforming to the specific characteristics of a storage device is called a physical record.
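
As a rough illustration of how a file maps onto physical records, the following Python sketch computes how many sectors a file occupies and cuts its bytes into sector-sized blocks. The 512-byte sector size, the zero-byte padding, and the function names are assumptions made for the example; real devices and file systems differ.

```python
SECTOR_SIZE = 512  # assumed sector size in bytes; actual devices vary


def sectors_needed(file_size: int, sector_size: int = SECTOR_SIZE) -> int:
    """Return how many whole sectors are needed to hold file_size bytes."""
    # Ceiling division: a partly filled final sector still occupies a full sector.
    return (file_size + sector_size - 1) // sector_size


def split_into_physical_records(data: bytes, sector_size: int = SECTOR_SIZE) -> list[bytes]:
    """Cut data into sector-sized blocks, padding the last block with zero bytes."""
    blocks = []
    for start in range(0, len(data), sector_size):
        block = data[start:start + sector_size]
        # Pad the final block so every physical record has the same size.
        blocks.append(block.ljust(sector_size, b"\x00"))
    return blocks
```

Under these assumptions, a 1300-byte file occupies three physical records, the last of which is mostly padding.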

Thus, a large file stored in mass storage will typically consist of many physical records. In contrast to this division into physical records, a file often has natural divisions determined by the information represented. For example, a file containing information regarding a company’s employees would consist of multiple units, each consisting of the information about one employee. Similarly, a file containing a text document would consist of paragraphs or pages. These naturally occurring blocks of data are called logical records. Logical records often consist of smaller units called fields.

For example, a logical record containing information about an employee would probably consist of fields such as name, address, and employee identification number. Sometimes each logical record within a file is uniquely identified by means of a particular field within the record (perhaps an employee’s identification number, a part number, or a catalogue item number). Such an identifying field is called a key field. The value held in a key field is called a key. Logical record sizes rarely match the physical record size dictated by a mass storage device.
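
The relationship between logical records, fields, and keys can be sketched in Python as follows. The EmployeeRecord class, its field names, and the index_by_key helper are hypothetical, chosen only to mirror the employee example above.

```python
from dataclasses import dataclass


@dataclass
class EmployeeRecord:
    """One logical record; each attribute is a field."""
    employee_id: int  # assumed to be the key field
    name: str
    address: str


def index_by_key(records: list[EmployeeRecord]) -> dict[int, EmployeeRecord]:
    """Build a lookup table from key (employee_id) to logical record."""
    index: dict[int, EmployeeRecord] = {}
    for record in records:
        # A key must identify exactly one record, so duplicates are rejected.
        if record.employee_id in index:
            raise ValueError(f"duplicate key: {record.employee_id}")
        index[record.employee_id] = record
    return index
```

With such an index, a record is retrieved by its key alone, for instance index_by_key(records)[1042], without examining the other fields.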

Consequently, one may find several logical records residing within a single physical record, or a single logical record split between two or more physical records. The result is that a certain amount of unscrambling is associated with retrieving data from mass storage systems. A common solution to this problem is to set aside an area of main memory that is large enough to hold several physical records and to use this memory space as a regrouping area. That is, blocks of data compatible with physical records can be transferred between this main memory area and the mass storage system, while the data residing in the main memory area can be referenced in terms of logical records. An area of memory used in this manner is called a buffer. In general, a buffer is a storage area used to hold data on a temporary basis, usually while it is being transferred from one device to another. For example, modern printers contain memory circuitry of their own, a large part of which is used as a buffer for holding portions of a document that have been received by the printer but not yet printed.
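
The regrouping role of a buffer can be sketched in Python. This is a minimal illustration, not a description of any particular operating system: it assumes fixed-size logical records, stands in for the storage device with any object that has a read method (an in-memory io.BytesIO works for experimenting), and uses the hypothetical function name read_logical_records.

```python
import io
from typing import BinaryIO, Iterator


def read_logical_records(device: BinaryIO, physical_size: int,
                         logical_size: int) -> Iterator[bytes]:
    """Yield fixed-size logical records from a device read in physical records."""
    buffer = bytearray()  # regrouping area in main memory
    while True:
        # Transfer one physical record from the device into the buffer.
        block = device.read(physical_size)
        if not block:
            break
        buffer.extend(block)
        # Hand out every complete logical record now held in the buffer;
        # a record split across physical records waits for the next block.
        while len(buffer) >= logical_size:
            yield bytes(buffer[:logical_size])
            del buffer[:logical_size]
    # Bytes left over here form an incomplete record and are ignored.
```

For example, with 64-byte physical records and 40-byte logical records, the second logical record straddles the first two physical records, yet the caller sees only whole 40-byte records.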

