
Chapter 1 The Handling of Data Over the Years

Fred Strickland

Original Author: Adrienne Watt

Rewrite: Fred Strickland

Learning Outcomes

Computing Sub Discipline

Document Code, Reference Code, and Page Number

Text

Computer Engineering

CE2016

CE-SWD-10 Database systems

(Page 103)

Explain how use of database systems evolved from programming with simple collections of data files.

Computer Science

CS2013

IM/Database Systems (DS)

(Page 113)

1. Explain the characteristics that distinguish the database approach from the approach of programming with data files. [Familiarity]

CS2023

DM-Core: Core Database Systems Concepts

(Pages 115-116)

ILO CS Core 1. Identify at least four advantages that using a database system provides.

CS2023

DM-Modeling: Data Modeling

(Pages 116-117)

Non-core 5. Spreadsheet models

Non-core 6. Object-oriented models

CS2023

DM-Relational: Relational Databases

(Pages 117-118)

KA Core 4. Physical database design: file and storage structures.

Information Technology

IT2017

ITE-IMA Domain: Information Management

  • Competencies

(Page 56)

A. Express how the growth of the internet and demands for information have changed data handling and transactional and analytical processing, and led to the creation of special purpose databases. (Requirements)

IT2017

Subdomains:

ITE-IMA-01 Perspectives and Impact

(Page 92)

a. Describe how data storage and retrieval has changed over time.

b. Justify the advantages of a database approach compared to traditional file processing.

c. Describe how the growth of the internet and demands for information for users outside the organization (customers and suppliers) impact data handling and processing.

d. Tell a brief history of database models and their evolution.

Introduction

Data¹ could be stored in filing cabinets, in folders, in ledgers, in lists, or in piles on a desk. When computers became commonplace, data could also be stored in spreadsheets, in electronic documents, in file-based systems, or in a database management system.

In the early days of computing, data was stored in file-based systems. As the limitations of file-based systems became apparent, database management systems were developed to overcome them. Today’s users take for granted the many benefits found in a database system.

(Clay) Tablets to Handwritten Records

According to Staxbill.com, the earliest evidence of a primitive accounting method may date to around 5000 BC in Mesopotamia, in Sumer, in Assyria, and in Babylon. These documents recorded that people exchanged goods. According to Oldest.org, the first appearance of written language may have been around 3500 to 3000 BC. The Sumerians of Mesopotamia used cuneiform and the Egyptians used hieroglyphics for recording information about daily life.

According to Investopedia, bartering has existed for centuries. There are no records of the earliest bartering. Around 800 to 770 BC, coins were created in China. Having a medium of exchange made it easier to have a trading system.

Between 500 BC and 450 BC, the Romans created an accounting system that documented public spending and revenue. The Roman army kept detailed records of cash, provisions, and business transactions. Private estates had their own accounting systems. The Romans used these detailed records to make decisions. These are the earliest instances of using data for driving decisions.

China moved to a paper currency around 1260 AD. Europe did not embrace paper currency until the 16th century.

Early Mechanical Data Recording

The Industrial Revolution brought about many changes. One noteworthy change was the beginning of mechanical data recording. Punch cards came out of this period. They provided instructions to textile looms and player pianos. In 1837, Charles Babbage proposed the Analytical Engine, a primitive calculator that would use punch cards for its instructions and responses. Later, Herman Hollerith turned machine processing of punch cards into a reality. The United States Census Bureau used punch cards for the 1890 census, which enabled rapid processing of the data. Government offices, libraries, hospitals, and businesses developed elaborate database systems. One noteworthy example is the Dewey Decimal System.

File-Based Systems

Before computing systems became widespread, data was stored in physical files. When computing systems became more available, the data was stored in electronic files. These files were organized in folders, and the folders were organized in a hierarchy of directories and subdirectories. A person would follow a path from a directory to a subdirectory to the actual file. This hierarchical storage methodology is known as “file storage,” “file-level storage,” or “file-based storage.” It works well for easily organized amounts of structured data. As the number of files increases, however, file retrieval becomes cumbersome and time-consuming. To scale (to expand) means adding more hardware devices or upgrading to higher-capacity devices.

The data could be in two or more places within the organization. Having quick access meant creating a computer program. This was manageable for small organizations with a modest amount of data. The stored data was also known as a “flat file.”

Consider a traditional banking system that uses a file-based system to manage the organization’s data as shown in Figure 1.1. As we can see, there are different departments in the bank. Each department has its own suite of applications for managing and manipulating the data files. For banking systems, the programs may be used to debit or credit an account, find the balance of an account, add a new mortgage loan and generate monthly statements.


Figure 1.1. Example of a file-based system used by banks to manage data.

Characteristics of a File-Based Approach

The Geeks for Geeks website listed seven characteristics:

  • The data of certain companies or organizations were kept as “Files.”
  • The files stored in different departments were independent of each other, which caused severe data redundancy.
  • Those files were developed using programming languages like COBOL, C, and C++.
  • Each file includes information for a particular department or region, such as the library, tuition, and students’ exams.
  • The traditional file system is way less flexible than DBMS and has many disadvantages.
  • The maintenance of those files was also of high cost.
  • Each of the units of “Files” used to be known as “Flat Files.”

Some of the mentioned characteristics are also disadvantages.

Disadvantages of the File-Based Approach

Using the file-based system to keep organizational information has a number of disadvantages. Listed below are seven examples.

Data Isolation

If a user needs to see data stored in two or more parts of the organization, then effort must be expended to locate the needed files, to find the relationships, and then to act on the insights.
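The effort described above can be sketched in a few lines of Python. This is only an illustration under assumed conditions: the two file layouts, the column names, and the sample customers are invented, and the file contents are held in strings so that the sketch runs on its own.

```python
import csv
import io

# Hypothetical contents of two separately maintained department files.
CUSTOMER_FILE = "cust_no,name\n101,Ana Ruiz\n102,Ben Ode\n"
MORTGAGE_FILE = (
    "loan_id,customer,balance\n"
    "M-1,101,250000\n"
    "M-2,102,180000\n"
    "M-3,101,40000\n"
)

def read_rows(text):
    """Parse a comma-separated flat file into a list of dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

# The programmer has to know that "cust_no" in one file matches
# "customer" in the other; nothing in the files records that relationship.
names = {row["cust_no"]: row["name"] for row in read_rows(CUSTOMER_FILE)}

# Combine the two files by hand to find how much each customer owes.
owed = {}
for loan in read_rows(MORTGAGE_FILE):
    name = names[loan["customer"]]
    owed[name] = owed.get(name, 0) + int(loan["balance"])

print(owed)
```

Every new question that spans two files would need another small program like this one.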

Data Redundancy

Users in different parts of the organization may each collect and work with the same data. Each group may have applications created specifically for its own use. Over time, data collected in one work area also appears in another work area of the organization. This duplication is data redundancy.

Data Inconsistency

The content of files may change over time. Data that is duplicated in two or more locations may not be in sync. This can lead to data inconsistency. If the data is revised in one location, but not in the other location(s), then problems will occur.
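A small Python sketch can show how redundancy turns into inconsistency. The two department files, their columns, and the customers are assumptions made up for this illustration.

```python
import csv
import io

# Hypothetical flat files kept by two departments (contents invented).
LOANS_FILE = "customer_id,name,address\n101,Ana Ruiz,12 Elm St\n102,Ben Ode,9 Oak Ave\n"
SAVINGS_FILE = "customer_id,name,address\n101,Ana Ruiz,12 Elm St\n103,Cy Tran,4 Pine Rd\n"

def load(text):
    """Parse one department's flat file into {customer_id: row}."""
    return {row["customer_id"]: row for row in csv.DictReader(io.StringIO(text))}

def find_conflicts(file_a, file_b):
    """Return the IDs of customers stored in both files whose copies disagree."""
    shared = file_a.keys() & file_b.keys()
    return sorted(cid for cid in shared if file_a[cid] != file_b[cid])

loans = load(LOANS_FILE)
savings = load(SAVINGS_FILE)

# Customer 101 is stored in both files: data redundancy.
redundant = sorted(loans.keys() & savings.keys())

# The loans department records a move; the savings file is not updated.
loans["101"]["address"] = "77 Lake Dr"

# The two copies now disagree: data inconsistency.
conflicts = find_conflicts(loans, savings)
print("Stored twice:", redundant)
print("Out of sync:", conflicts)
```

Nothing in either file signals which address is the current one, so a person has to investigate.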

Incompatible Data Format

Data could be stored in one location as plaintext in a word processing program. The other location may have the data stored in a spreadsheet. A computer program may use a different data format from another computer program.
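As a rough illustration, suppose two departments record the same payment with different field names, date formats, and amount formats. The record layouts below are invented for this sketch; the point is that someone must write a converter for each format before the records can even be compared.

```python
from datetime import date, datetime
from decimal import Decimal

# Hypothetical records for one payment as two departments might store it.
loans_record = {"cust": "101", "paid_on": "03/05/2024", "amount": "1,250.50"}
billing_record = {"customer_id": 101, "date": "2024-03-05", "amount_cents": 125050}

def from_loans(rec):
    """Convert the loans layout to (customer id, date, amount in cents)."""
    amount = Decimal(rec["amount"].replace(",", ""))
    return (
        int(rec["cust"]),
        datetime.strptime(rec["paid_on"], "%m/%d/%Y").date(),
        int(amount * 100),
    )

def from_billing(rec):
    """Convert the billing layout to the same common form."""
    return (rec["customer_id"], date.fromisoformat(rec["date"]), rec["amount_cents"])

# Only after both conversions can the two records be matched.
same_payment = from_loans(loans_record) == from_billing(billing_record)
print(same_payment)
```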

Security Problems

Security can be a problem with a file-based approach because:

  • There are no constraints regarding access privileges.
  • Application requirements are added to the system in an ad-hoc manner so it is difficult to enforce constraints.

Concurrent Access

Concurrency is the ability to allow multiple users to access the same record without adversely affecting transaction processing. In a file-based system, the application programs must manage, or prevent, concurrency themselves. Typically, when an application opens a file, that file is locked, which means that no one else can access the file until it is released.
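The reason for locking can be seen in a small simulation. The sketch below is deliberately simplified: a Python dictionary stands in for a one-record account file, and the two "programs" are just interleaved lines of code, so all names here are hypothetical.

```python
# A hypothetical one-record "account file", modeled as a dictionary.
account_file = {"balance": 100}

def read_balance():
    return account_file["balance"]

def write_balance(value):
    account_file["balance"] = value

# Without locking: two programs each read the balance before either writes.
seen_by_teller = read_balance()      # reads 100
seen_by_atm = read_balance()         # also reads 100

write_balance(seen_by_teller + 50)   # teller deposits 50, file holds 150
write_balance(seen_by_atm - 30)      # ATM withdraws 30 and overwrites the deposit

lost_update_balance = read_balance()

# With whole-file locking, the programs run one after the other.
account_file["balance"] = 100
write_balance(read_balance() + 50)   # teller finishes first
write_balance(read_balance() - 30)   # ATM then sees the updated balance
serialized_balance = read_balance()

print(lost_update_balance, serialized_balance)
```

Locking the whole file gives the correct balance, but only by making every other user wait.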

Transaction Issues

If the system crashes while an update is in progress, the data may be left in an inconsistent state.
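A minimal sketch of such a failure follows. It assumes a transfer between two accounts that takes two separate writes, with a made-up `SystemCrash` exception standing in for a real crash between them.

```python
# A dictionary stands in for two data files (hypothetical balances).
balances = {"checking": 500, "savings": 200}

class SystemCrash(Exception):
    """Simulates a crash partway through an update."""

def transfer(amount, crash_midway=False):
    """Move money from checking to savings in two separate steps."""
    balances["checking"] -= amount      # step 1 is written
    if crash_midway:
        raise SystemCrash("power lost before step 2")
    balances["savings"] += amount       # step 2 is never reached on a crash

total_before = sum(balances.values())
try:
    transfer(100, crash_midway=True)
except SystemCrash:
    pass                                # the system restarts
total_after = sum(balances.values())

# The money left checking but never arrived in savings.
print(total_before, total_after)
```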

Database Approach

The difficulties that arise from using the file-based system have prompted the development of a new approach in managing large amounts of organizational information called the database approach.

Computerized database systems started in the 1960s, because computers were becoming more affordable. Two database models were developed:

  • Network model
  • Hierarchical model

Charles Bachman developed the Integrated Data Store (IDS). This was based on a network data model. In the late 1960s, IBM developed the Information Management System (IMS). This was based on the hierarchical model.

In 1970, Edgar Codd developed the relational database model.

In the 1970s, IBM developed the Structured Query Language (SQL), which was adopted as an ANSI standard in 1986.

Databases and database technology play an important role in most areas where computers are used, including business, education and medicine. To understand the fundamentals of database systems, we will start by introducing some basic concepts in this area.

Role of Databases in Business

Everybody uses a database in some way, even if it is just to store information about their friends and family. That data might be written down or stored in a computer by using a word-processing program or it could be saved in a spreadsheet. However, the best way to store data is by using database management software. This is a powerful software tool that allows you to store, manipulate and retrieve data in a variety of ways.
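To make "store, manipulate and retrieve" concrete, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. SQLite is chosen here only because it needs no installation; the table name, columns, and sample rows are assumptions made up for this illustration.

```python
import sqlite3

# A temporary in-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL, city TEXT)"
)

# Store: add three hypothetical customers.
conn.executemany(
    "INSERT INTO customer (id, name, city) VALUES (?, ?, ?)",
    [(101, "Ana Ruiz", "Toronto"), (102, "Ben Ode", "Halifax"), (103, "Cy Tran", "Toronto")],
)

# Manipulate: one customer moves to a different city.
conn.execute("UPDATE customer SET city = ? WHERE id = ?", ("Ottawa", 102))
conn.commit()

# Retrieve: ask a question without writing a custom search program.
toronto = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM customer WHERE city = ? ORDER BY name", ("Toronto",)
    )
]
ben_city = conn.execute("SELECT city FROM customer WHERE id = ?", (102,)).fetchone()[0]

print(toronto, ben_city)
conn.close()
```

The same three operations on a word-processing document or a folder of flat files would each require manual work or a purpose-built program.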

Most companies keep track of customer information by storing it in a database. This data may include customers, employees, products, orders, or anything else that assists the business with its operations.

The Meaning of Data

Data is factual information, such as measurements or statistics about objects and concepts. We use data for discussions or as part of a calculation. Data can describe a person, a place, an event, an action, or any one of a number of things. A single fact is an element of data, or a data element.

Data can be stored in:

  • Filing cabinets
  • Spreadsheets
  • Folders
  • Ledgers
  • Lists
  • Piles of papers on your desk

These store data and so does a database. Databases have terrific powers for managing and for processing the data. Data processing renders data as information for decision making and for other actions.
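A tiny example may help separate data from information. In the sketch below, the list of deposits is raw data (the figures are invented), and the summary computed from it is information that someone could act on.

```python
from statistics import mean

# Hypothetical raw data: one month of deposits for a single account.
deposits = [120.00, 80.00, 100.00, 300.00]

# Processing the data produces information for decision making.
summary = {
    "count": len(deposits),
    "total": sum(deposits),
    "average": mean(deposits),
    "largest": max(deposits),
}

print(summary)
```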

The Internet gave users the ability to access data beyond the local work area.

With this understanding of data, we can start to see how a database has the capacity to store a collection of data and to organize, rapidly search, retrieve, and process it. As a result, we can gain a different understanding of how we can use the data. This book is all about managing data.

 

Key Terms

concurrency: The ability of the database to allow multiple users to access the same record without adversely affecting transaction processing.

data element: A single fact or piece of information.

data inconsistency: A situation where various copies of the same data are conflicting.

data isolation: A situation where related data is scattered across separate files in different parts of the organization, making it difficult to locate and combine.

data redundancy: A situation that occurs when the same data appears in two or more locations.

database approach: An approach to managing large amounts of organizational information that was developed in response to the difficulties of the file-based system.

database management software: A powerful software tool that allows you to store, manipulate and retrieve data in a variety of ways.

file-based system: An application program designed to manipulate data files.

 

Exercises

  1. Discuss each of the following terms:
    1. data
    2. field
    3. record
    4. file
  2. What is data redundancy?
  3. Discuss the disadvantages of file-based systems.
  4. Explain the difference between data and information.
  5. Explain the characteristics that distinguish the database approach from the approach of programming with data files. (CS2013 IM/DS 1)
  6. Justify the advantages of a database approach compared to traditional file processing. (IT2017 ITE-IMA-01 b)
  7. Write how the growth of the internet and the demands for information have changed data handling, data transactional processing, and data analytical processing with the result of creating special purpose databases. (IT2017 ITE-IMA A.)
  8. Describe how the growth of the internet and demands for information for users outside the organization (customers and suppliers) impact data handling and processing. (IT2017 ITE-IMA-01 c)
  9. Tell a brief history of database models and their evolution. (IT2017 ITE-IMA-01 d)
  10. Describe how data storage and retrieval has changed over time. (IT2017 ITE-IMA-01 a)

 

Attribution

This chapter of Database Design (including its images, unless otherwise noted) is a derivative copy of Database System Concepts by Nguyen Kim Anh, licensed under the Creative Commons Attribution 3.0 license.

The following material was written by Adrienne Watt for the second edition:

  • Introduction
  • Key Terms
  • Exercises

The whole chapter was completely revised by Fred Strickland for the third edition.

References

“8 Oldest recorded History in the world,” Oldest.org, n.d. https://www.oldest.org/culture/recorded-history/

Andrew Beattie. “The History of Money: Bartering to Banknotes to Bitcoin,” Investopedia, April 2, 2024. https://www.investopedia.com/articles/07/roots_of_money.asp

“Disadvantage of File Systems,” Geeks for Geeks, May 15, 2023. https://www.geeksforgeeks.org/disadvantages-of-file-systems/

“History of DBMS,” Geeks for Geeks, February 26, 2024. https://www.geeksforgeeks.org/history-of-dbms/

Keita Mitsuhashi. “The Evolution of Databases: A Historical Deep Dive – Part 1: Database Origins: From Ancient Libraries to Digital Records,” medium.com, August 30, 2023. https://medium.com/morph-blog/the-evolution-of-databases-a-historical-deep-dive-part-1-database-origins-from-ancient-28608b291add

Keith D. Foote. “A Brief History of Data Storage,” dataversity.net, November 1, 2017. https://www.dataversity.net/brief-history-data-storage/

Serge Frigon. “The History of Accounting Goes Back Further Than You Think,” staxbill.com, n.d. https://staxbill.com/the-history-of-accounting/

“What is file storage?” IBM, n.d. https://www.ibm.com/topics/file-storage

 

1 Point of Grammar
The word data can be either singular or plural depending on meaning and context. In general usage,
  • Data is treated as singular when used as a mass noun to mean “information.”
  • Data is treated as plural when used to mean “individual facts.”
  • In scientific and academic writing, data is almost always used as a plural noun.
  • In digital technology, data is usually treated as a singular mass noun to mean “digitally stored information.”

Note: The second edition authors are Canadians and have tended to use the word “data” in the plural based on the Latin where “datum” is the singular form of “data.” A United States contributor is following the digital technology custom and is using data as a singular mass noun.

License

Database Design - 3rd Edition Copyright © by Fred Strickland. All Rights Reserved.
