Traditional File Systems versus Databases


In many respects, you might consider databases the next generation of the traditional file system. The traditional file system was developed in the sixties for handling back office applications. It consists of a series of files holding master and transaction data that, when combined, can produce the reports needed to manage the business. In back office applications, each file works with an application program that coordinates the processing of two or more files. Because it is not easy to combine (join) multiple text files, one weakness of traditional file systems has been their propensity to allow redundant data. As application programmers developed new applications, they tended to get around the problem of joining text files by putting more and more data in each file, duplicating a significant amount of data. When information exists in more than one file, there is real potential for the files to become unsynchronized (not to mention the space wasted by storing the same data in more than one location on disk). Data redundancy is seldom a good thing.


One example often used to illustrate this weakness is the problem of record updates. If I have two files that both contain customer address information and I change one of the files but forget to change the other, my files are now different and I have a problem. If the file that didn't get changed is the billing file, my customer's invoice no longer goes to the correct address. If I also use this address for shipping, the customer does not receive the order either. This is a very bad situation for the business.
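

The record update problem can be reproduced in a few lines. The sketch below is only an illustration: the two "files" are small CSV strings held in memory, and the column names, customer, and addresses are all made up for the example.

```python
import csv
import io

FIELDS = ["customer_id", "name", "address"]

# Two hypothetical flat files that both store the customer's address.
billing_file = "customer_id,name,address\n1,Pat Lee,12 Oak St\n"
shipping_file = "customer_id,name,address\n1,Pat Lee,12 Oak St\n"


def read_rows(text):
    """Parse CSV text into a list of dictionaries, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def write_rows(rows):
    """Serialize a list of dictionaries back into CSV text."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


def update_address(text, customer_id, new_address):
    """Return a copy of one file with the customer's address changed."""
    rows = read_rows(text)
    for row in rows:
        if row["customer_id"] == customer_id:
            row["address"] = new_address
    return write_rows(rows)


# The shipping application updates its own file. Nothing forces the
# billing file to change, so the two copies silently drift apart.
shipping_file = update_address(shipping_file, "1", "98 Elm Ave")

billing_address = read_rows(billing_file)[0]["address"]
shipping_address = read_rows(shipping_file)[0]["address"]
files_in_sync = billing_address == shipping_address

print("billing: ", billing_address)
print("shipping:", shipping_address)
print("in sync: ", files_in_sync)
```

Because each file carries its own copy of the address, keeping them consistent depends entirely on every program remembering to update every copy.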


Databases are actually managed by applications (known as database management software or a database management system) that control and coordinate maintenance of organizational data. Instead of the programmer being able to access files directly and in an uncoordinated fashion, a database environment requires that the programmer work through the database management software. The database management software provides coordinated security that does not exist in traditional file systems.


In addition, when tables are created in the database, the database management software requires the person adding the tables to define a series of rules about the data, called constraints. These rules help to ensure that redundancy is minimized and that critical unique data is not duplicated. If indexes or keys have been created to assist in joining different tables, the database software ensures that the accuracy of the data is not compromised. Whereas sequential files contain only data, a database environment includes programs that sit between the user and the data to ensure that security and accuracy are not compromised.
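

As a rough illustration of constraints, the sketch below uses Python's built-in sqlite3 module with an in-memory SQLite database. SQLite is not one of the products named in this text; it is used here only because it needs no installation. The table and column names are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Constraints are declared by whoever creates the tables.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- each customer appears once
        email       TEXT NOT NULL UNIQUE,  -- critical unique data
        address     TEXT NOT NULL
    )
""")
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        -- the key used to join an order to its customer must exist
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id)
    )
""")

conn.execute("INSERT INTO customer VALUES (1, 'pat@example.com', '12 Oak St')")
conn.commit()

# Statements that break a rule are refused by the database software.
bad_statements = [
    ("duplicate email",
     "INSERT INTO customer VALUES (2, 'pat@example.com', '5 Pine Rd')"),
    ("order for unknown customer",
     "INSERT INTO orders VALUES (100, 999)"),
]
rejected = []
for label, sql in bad_statements:
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        rejected.append(label)
conn.rollback()  # discard anything left over from the failed attempts

customer_count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print("rejected:", rejected)
print("customers:", customer_count, "orders:", order_count)
```

Note that the address is stored once, in the customer table, and an order refers to it through the customer key rather than carrying its own copy.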



Figure 5: A database management system




Database Management Software (Systems): A Database Management System (DBMS) is a software application that sits between the users of data and the actual data, acting as a layer that controls the security and accuracy of the data. Popular database management systems include Microsoft Access, Oracle, MS SQL Server, IBM DB2 and MySQL.


An additional component of database processing that is more difficult to implement in traditional file processing is transaction control, referred to here as a two phase commit. (Strictly speaking, "two-phase commit" names a protocol for coordinating a single transaction across several databases; the behavior described here is ordinary transaction control.) A two phase commit allows the user or programmer to send a transaction to the database and optionally back the change out, restoring the original state, if the change would cause a data inaccuracy (e.g. two users or programs trying to access and update the same information at the same time, so that one change negates the other). In most databases, this process uses the COMMIT command to make changes permanent and the ROLLBACK command when the data needs to be restored to its original value.
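

The COMMIT and ROLLBACK commands can be sketched with Python's built-in sqlite3 module, where they are issued through the connection's commit() and rollback() methods. The account table, the rule that a balance may not go negative, and the withdraw helper are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE account (
        account_id INTEGER PRIMARY KEY,
        balance    INTEGER NOT NULL CHECK (balance >= 0)
    )
""")
conn.execute("INSERT INTO account VALUES (1, 100)")
conn.commit()


def withdraw(conn, account_id, amount):
    """Try a withdrawal: COMMIT if it is valid, ROLLBACK if it breaks a rule."""
    try:
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE account_id = ?",
            (amount, account_id),
        )
        conn.commit()      # make the change permanent
        return True
    except sqlite3.IntegrityError:
        conn.rollback()    # restore the data to its original value
        return False


def get_balance(conn, account_id):
    """Return the current balance of one account."""
    row = conn.execute(
        "SELECT balance FROM account WHERE account_id = ?", (account_id,)
    ).fetchone()
    return row[0]


first_ok = withdraw(conn, 1, 30)    # valid: balance becomes 70
second_ok = withdraw(conn, 1, 500)  # would go negative, so it is rolled back
final_balance = get_balance(conn, 1)
print(first_ok, second_ok, final_balance)
```

The second withdrawal leaves no trace in the table: the balance after the rollback is exactly what it was after the first, committed withdrawal.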


Two Phase Commit - The ability to implement a two phase commit allows the program to verify the integrity of a change before the database is officially updated. This control is necessary for a number of reasons, but the most important scenario is when two users (or programs) try to update the same row at the same time. If one is changing the date of birth and the other the address, it is possible for the first program to change the date of birth and for the second program to then write the new address together with the old date of birth, undoing the first change. With the ROLLBACK option, we can undo the transaction and restore the data to its original state.
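

The date of birth and address scenario can be sketched as follows. To detect that a program is holding an out-of-date copy of the row, this example adds a version column and checks it before committing. That check is an assumption introduced for the illustration (a common technique often called optimistic locking), not something this text prescribes, and the table and helper names are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE person (
        person_id  INTEGER PRIMARY KEY,
        birth_date TEXT NOT NULL,
        address    TEXT NOT NULL,
        version    INTEGER NOT NULL   -- hypothetical change counter
    )
""")
conn.execute("INSERT INTO person VALUES (1, '1990-01-01', '12 Oak St', 1)")
conn.commit()


def read_person(conn, person_id):
    """Return a program's private copy of the row."""
    row = conn.execute(
        "SELECT birth_date, address, version FROM person WHERE person_id = ?",
        (person_id,),
    ).fetchone()
    return {"birth_date": row[0], "address": row[1], "version": row[2]}


def save_person(conn, person_id, copy):
    """Write the whole row back only if it is unchanged since `copy` was read."""
    cursor = conn.execute(
        "UPDATE person SET birth_date = ?, address = ?, version = version + 1 "
        "WHERE person_id = ? AND version = ?",
        (copy["birth_date"], copy["address"], person_id, copy["version"]),
    )
    if cursor.rowcount == 1:
        conn.commit()    # the copy was current, so keep the change
        return True
    conn.rollback()      # the copy was stale, so undo the transaction
    return False


# Both programs read the same row before either one writes.
copy_a = read_person(conn, 1)
copy_b = read_person(conn, 1)
copy_a["birth_date"] = "1991-02-03"   # program A changes the date of birth
copy_b["address"] = "98 Elm Ave"      # program B changes the address

saved_a = save_person(conn, 1, copy_a)
saved_b = save_person(conn, 1, copy_b)  # still holds the old date of birth

current = read_person(conn, 1)
print(saved_a, saved_b, current)
```

Program B's write is refused instead of silently restoring the old date of birth. In practice B would read the row again and reapply its address change to the fresh copy.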


Background Information: What databases do you use? - Every day, most of us use many databases. Sometimes our use of databases isn't that obvious. When you go to the ATM and withdraw money from or deposit money into your account, you're using an application program that communicates with a database, and that database is updated when you make your ATM transaction.


Maybe you're on the job, perhaps taking an order from a customer. The application program you are running in Microsoft Windows has text boxes and buttons, along with menus and toolbars, to accept information concerning the order. It is not the Windows application that holds the information you are processing. The application communicates with a database program and acts as the interface to your data stored in the database. As the person processing the information on the order, you have certain security privileges that have been granted to you by the table creator. As you fill out the information on the screen and then submit the order for processing, the appropriate tables in the database are modified. Your updates make this information available to other users in the organization. Does this sound familiar? If you are a student, do you access systems like this for grades or student information?

