"

Preface for Database Design – 3rd Edition

Fred Strickland

Contents

Preface for Database Design – 3rd Edition1

The Past Editions1

The Present Edition2

Association of Computing Machinery (ACM)3

Computer Engineering3

Computer Science3

Cybersecurity13

Data Science14

Information Systems (Also written as “Computer Information Systems.”)15

Information Technology15

Software Engineering18

United States National Security Agency (NSA)18

Data Administration (DBA)19

Database Management Systems (DMS)19

Databases (DAT)20

Association of Information Technology Professionals (CIPS)20

New Chapters21

Licensing22

Classroom Use22

Acknowledgments22

Progress Notes23

Technical Considerations23

A Note on the Cover23

References24

The Past Editions

The first edition was released sometime before 2014.

Adrienne Watt wrote the following for the second edition (copyright 2014):

The primary purpose of this text is to provide an open source textbook that covers most introductory database courses. The material in the textbook was obtained from a variety of sources. All the sources are found in at the end of each chapter. I expect, with time, the book will grow with more information and more examples.

I welcome any feedback that would improve the book. If you would like to add a section to the book, please let me know.

Adrienne Watt

The Present Edition

For the third edition, we have decided to greatly revise the book to make it more useful for a larger audience. We have researched the curriculum guidance from the three North American agencies:

  • Association of Computing Machinery (ACM)
  • the United States National Security Agency (NSA)
  • Association of Information Technology Professionals (CIPS)

This edition will cover the learning outcomes1 for the following computing subdisciplines:

  • Computer Engineering (CE)
  • Computer Science (CS)
    • 2013 (Knowledge based approach)
    • 2023 (Competency based approach)
  • Cybersecurity (CYB)
  • Data Science (DS)
  • Information Systems (IS) (Also written as “Computer Information Systems.”)
  • Information Technology (IT)
  • Software Engineering (SE)

As a side note, there are old computer science textbooks that stated that these supported “CS1” or “CS2.” Matthew Hertz pointed out that this usage dates from about 1979 when “the ACM Computing Curricula used the terms ‘CS1’ and ‘CS2” to designate the first two course in the introductory sequences of a computer science major.” However, these textbooks did not provide detailed information about the covered learning outcomes. Dr. Chris Bourke’s textbook is an example of a textbook that states it supports CS12. It is rare to find a textbook that mentions which curricular books are being supported. As a personal note, I have not seen any recent computer science textbooks that mention supporting CS2013.

The next few pages list the database related learning outcomes from the current seven sub computing disciplines. Please note that this textbook does not attempt to cover every learning outcome. Instructors may feel free to add additional materials to cover these omitted learning outcomes.

Association of Computing Machinery (ACM)

Computer Engineering

Computer Engineering Curricula 2016 (https://www.acm.org/binaries/content/assets/education/ce2016-final-report.pdf)

Chapters

CE-PPP-9 Contemporary issues (Page 88)

Articulate some of the privacy implications related to massive database systems.

Chapter 3

CE-SEC-9 Authentication (Page 91)

Discuss eavesdropping and server database reading and explain how various authentication methods deal with them.

CE-SPE-5 Risk, dependability, safety, and fault tolerance

Describe fault tolerance and dependability requirements of different applications (such as database, aerospace, telecommunications, industrial control, and transaction processing).

CE-SWD-10 Database systems (Page 103)

Explain how use of database systems evolved from programming with simple collections of data files.

Chapter 1

Describe the major components of a modern database system.

Describe the functionality provided by languages such as SQL.

Chapter 9

Give examples of interactions with database systems that are relevant to computer engineering.

Computer Science

Computer Science Curricula 2013 (https://www.acm.org/binaries/content/assets/education/cs2013_web_final.pdf)

Chapters

Information Management (IM) (Page 113)

IM/Database Systems (Pages 113-114)

1. Explain the characteristics that distinguish the database approach from the approach of programming with data files. [Familiarity]

2. Describe the most common designs for core database system components including the query optimizer, query executor, storage manager, access methods, and transaction processor. [Familiarity]

3. Cite the basic goals, functions, and models of database systems. [Familiarity]

4. Describe the components of a database system and give examples of their use. [Familiarity]

5. Identify major DBMS functions and describe their role in a database system. [Familiarity]

6. Explain the concept of data independence and its importance in a database system. [Familiarity]

7. Use a declarative query language to elicit information from a database. [Usage]

8. Describe facilities that databases provide supporting structures and/or stream (sequence) data, e.g., text. [Familiarity]

[Elective]

9. Describe major approaches to storing and processing large volumes of data. [Familiarity]

1. Chapter 1

2.

3. Chapter 2

4. Chapter 11

5. Chapter 11

6. Chapter 3

7. Chapter 9

8.

9.

IM/Relational Databases (Pages 115-116)

1. Prepare a relational schema from a conceptual model developed using the entity- relationship model. [Usage]

2. Explain and demonstrate the concepts of entity integrity constraint and referential integrity constraint (including definition of the concept of a foreign key). [Usage]

3. Demonstrate use of the relational algebra operations from mathematical set theory (union, intersection, difference, and Cartesian product) and the relational algebra operations developed specifically for relational databases (select (restrict), project, join, and division). [Usage]

4. Write queries in the relational algebra. [Usage]

5. Write queries in the tuple relational calculus. [Usage]

6. Determine the functional dependency between two or more attributes that are a subset of a relation. [Assessment]

7. Connect constraints expressed as primary key and foreign key, with functional dependencies. [Usage]

8. Compute the closure of a set of attributes under given functional dependencies. [Usage]

9. Determine whether a set of attributes form a superkey and/or candidate key for a relation with given functional dependencies. [Assessment]

10. Evaluate a proposed decomposition, to say whether it has lossless-join and dependency-preservation. [Assessment]

11. Describe the properties of BCNF, PJNF, 5NF. [Familiarity]

12. Explain the impact of normalization on the efficiency of database operations especially query optimization. [Familiarity]

13. Describe what is a multi-valued dependency and what type of constraints it specifies. [Familiarity]

1. Chapter 4

Chapter 5

2. Chapter 3

Chapter 6

3. Chapter 10

4. Chapter 10

5. Chapter 10

6. Chapter 7

7. Chapter 7

8. Chapter 7

9. Chapter 7

10. Chapter 5

11. Chapter 7

12. Chapter 7

13. Chapter 7

IM/Query Languages (Page 116)

1. Create a relational database schema in SQL that incorporates key, entity integrity, and referential integrity constraints. [Usage]

2. Use SQL to create tables and retrieve (SELECT) information from a database. [Usage]

3. Evaluate a set of query processing strategies and select the optimal strategy. [Assessment]

4. Create a non-procedural query by filling in templates of relations to construct an example of the desired query result. [Usage]

5. Embed object-oriented queries into a stand-alone language such as C++ or Java (e.g., SELECT Col.Method() FROM Object). [Usage]

6. Write a stored procedure that deals with parameters and has some control flow, to provide a given functionality. [Usage]

1. Chapter 9

2. Chapter 9

3.

4. Chapter 12

5. Chapter 12

Chapter 13

6. Chapter 12

CS2023 Shift

In CS2013, this area was named “Information Management.” In CS2023, this area is now named “Data Management (DM).” This was done in order to “avoid confusion with the similar definitions used in Information Systems and Information Technology curricula.” CS2023 has greater attention to security knowledge area and to society, ethics, and profession knowledge area. CS2023 includes NoSQL approaches and MapReduce as core topics.

This was released in mid-2024 after major work had been completed on this textbook. I have attempted to list where these new learning outcomes appear. My effort may not be accurate. Consider this to be a work in progress.

Computer Science Curricula 20233 (https://dl.acm.org/doi/pdf/10.1145/3664191)

Bloom4

Chapters

DM-Data: The Role of Data and the Data Life Cycle (Page 115)

CS Core5 1. The Data Life Cycle: Creation-Processing-Review/Reporting Retention/Retrieval Destruction

Illustrative Learning Outcomes

ILO CS Core 1. Identify the five stages of the Data Life Cycle.

Evaluate

DM-Core: Core Database Systems Concepts (Pages 115-116)

Explain

CS Core 1. Purpose and advantages of database systems

Chapter 3

CS Core 2. Components of database systems

Chapter 11

CS Core 3. Design of Core DBMS functions (e.g., query mechanisms, transaction management, buffer management, access methods)

Chapter 11

CS Core 4. Database architecture, data independence, and data abstraction

Chapter 11

CS Core 5. Transaction management

Chapter 11

CS Core 6. Normalization

Chapter 7

CS Core 7. Approaches for managing large volumes of data (e.g., NoSQL database systems, use of MapReduce)

CS Core 8. How to support CRUD-only applications

Chapter 15

CS Core 9. Distributed databases/cloud-based systems

CS Core 10. Structured, Semi-structured, and unstructured data

CS Core 11. Use of a declarative query language

KA Core 12. Systems supporting structured and/or stream content

Illustrative Learning Outcomes

ILO CS Core 1. Identify at least four advantages that using a database system provides.

Chapter 1

ILO CS Core 2. Enumerate the components of a (relational) database system.

Chapter 11

ILO CS Core 3. Follow a query as it is processed by the components of a (relational) database system.

Chapter 11

ILO CS Core 4. Defend the value of data independence.

Chapter 2

ILO CS Core 5. Compose a simple select-project-join query in SQL.

Chapter 9

ILO CS Core 6. Enumerate the four properties of a correct transaction manager.

Chapter 11

ILO CS Core 7. Describe the advantages for eliminating duplicate repeated data.

Chapter 7

ILO CS Core 8. Outline how MapReduce uses parallelism to process data efficiently.

ILO CS Core 9. Evaluate the differences between structured and semi/unstructured databases.

DM-Modeling: Data Modeling (Pages 116-117)

Develop and Explain

CS Core 1. Data modeling

Chapter 4

CS Core 2. Relational data model

Chapter 4

Chapter 5

KA Core 3. Conceptual models (e.g., entity-relationship, UML diagrams)

Chapter 4

Chapter 5

KA Core 4. Semi-structured data models (expressed using DTD, XML, or JSON Schema, for example)6

Non-core 5. Spreadsheet models

Chapter 1

Non-core 6. Object-oriented models

Chapter 1

Non-core 6a. GraphQL

Non-core 7. New features in SQL

Chapter 9

Non-core 8. Specialized Data Modeling topics

Non-core 8a. Time series data (aggregation, join)

Non-core 8b. Graph data (link traversal)

Non-core 8c. Techniques for avoiding inefficient raw data access (e.g., “avg daily price”): materialized views and special data structures (e.g., Hyperloglog, bitmap)

Non-core 8d. Geo-Spatial data (e.g., GIS databases)

Illustrative Learning Outcomes

ILO CS Core 1. Describe the components of the relational data model.

Chapter 5

ILO CS Core 2. Model 1:1, 1:n, and n:m relationships using the relational data model.

Chapter 6

ILO KA Core 3. Describe the components of the E-R (or some other non-relational) data model.

Chapter 5

ILO KA Core 4. Model a given environment using a conceptual data model.

Chapter 4

ILO KA Core 5. Model a given environment using the document-based or key-value store-based data model.

DM-Relational: Relational Databases (Pages 117-118)

Explain and Develop

CS Core 1. Entity and referential integrity: Candidate key, superkeys

Chapter 5

Chapter 6

CS Core 2. Relational database design

Chapter 5

KA Core 3. Mapping conceptual schema to a relational schema

Chapter 5

KA Core 4. Physical database design: file and storage structures

Chapter 1

KA Core 5. Introduction to Functional dependency theory.

Chapter 7

KA Core 6. Normalization Theory

Chapter 7

KA Core 6a. Decomposition of a schema; lossless join, and dependency-preservation properties of a decomposition

Chapter 7

KA Core 6b. Normal forms (BCNF)

Chapter 7

KA Core 6c. Denormalization (for efficiency)

Chapter 7

Non-core 7. Functional dependency theory

Chapter 7

Non-core 7a. Closure of a set of attributes

Chapter 7

Non-core 7b. Canonical Cover

Chapter 7

Non-core 8. Normalization theory

Chapter 7

Non-core 8a. Multi-valued dependency (4NF)

Chapter 7

Non-core 8b. Join dependency (PJNF, 5NF)

Chapter 7

Non-core 8c. Representation theory

Illustrative Learning Outcomes

ILO CS Core 1. Describe the defining characteristics behind the relational data model.

Chapter 5

ILO CS Core 2. Comment on the difference between a foreign key and a superkey.

Chapter 5

ILO CS Core 3. Enumerate the different types of integrity constraints.

Chapter 3

Chapter 6

ILO KA Core 4. Compose a relational schema from a conceptual schema which contains 1:1, 1:n, and n:m relationships.

Chapter 6

ILO KA Core 5. Map appropriate file structure to relations and indices.

ILO KA Core 6. Describe how functional dependency theory generalizes the notion of key.

Chapter 7

ILO KA Core 7. Defend a given decomposition as lossless and or dependency preserving.

Chapter 7

ILO KA Core 8. Detect which normal form a given decomposition yields.

Chapter 7

ILO KA Core 9. Comment on reasons for denormalizing a relation.

Chapter 7

DM-Querying: Query Construction (Page 118)

Develop

CS Core 1. SQL Query Formation

Chapter 9

CS Core 1a. Interactive SQL execution

CS Core 1b. Programmatic execution of an SQL query

Chapter 13

KA Core 2. Relational Algebra

Chapter 10

KA Core 3. SQL

Chapter 9

KA Core 3a. Data definition including integrity and other constraint specifications.

Chapter 9

KA Core 3b. Update sublanguage

Non-core 4. Relational Calculus

Chapter 1

Non-core 5. QBE and 4th-generation environments

Non-core 6. Different ways to invoke non-procedural queries in conventional languages.

Chapter 13

Non-core 7. Introduction to other major query languages (e.g., XPATH, SPARQL)

Non-core 8. Stored procedures

Chapter 13

Illustrative Learning Outcomes

ILO CS Core 1. Compose SQL queries that incorporate select, project, join, union, intersection, set difference, and set division.

Chapter 9

ILO CS Core 2. Determine when a nested SQL query is correlated or not.

ILO CS Core 3. Iterate over data retrieved programmatically from a database via an SQL query

ILO KA Core 4. Define, in SQL, a relation schema, including all integrity constraints and delete/update triggers.

Chapter 9

ILO KA Core 5. Compose an SQL query to update a tuple in a relation.

DM-Processing: Query Processing (Pages 118-119)7

Develop

KA Core 1. Page structures

KA Core 2. Index structures

KA Core 2a. B+ trees

KA Core 2b. Hash indices: static and dynamic

KA Core 2c. Index creation in SQL

KA Core 3. File structures

KA Core 3a. Heap files

KA Core 3b. Hash files

KA Core 4. Algorithms for query operators

KA Core 4a. External Sorting

KA Core 4b. Selection

KA Core 4c. Projection8; with and without duplication elimination

KA Core 4d. Natural Joins: Nested loop, Sort-merge, Hash join

KA Core 4e. Analysis of algorithm efficiency

KA Core 5. Query transformations

KA Core 6. Query optimization

KA Core 6a. Access paths

KA Core 6b. Query plan construction

KA Core 6c. Selectivity estimation

KA Core 6d. Index-only plans

KA Core 7. Parallel Query Processing (e.g., parallel scan, parallel join, parallel aggregation)

KA Core 8. Database tuning/performance

KA Core 8a. Index selection

KA Core 8b. Impact of indices on query performance

KA Core 8c. Denormalization9

Illustrative Learning Outcomes

ILO KA Core 1. Describe the purpose and organization of both B+ tree and hash index structures.

ILO KA Core 2. Compose an SQL command to create an index10 (any kind).

ILO KA Core 3. Specify the steps for the various query operator algorithms: external sorting, projection with duplicate elimination, sort-merge join, hash-join, block nested-loop join.

ILO KA Core 4. Derive the run-time11 (in I/O requests) for each of the above algorithms.

ILO KA Core 5. Transform a query in relational algebra12 to its equivalent appropriate for a left-deep, pipelined execution.

ILO KA Core 6. Compute selectivity estimates for a given selection and/or join operation.

ILO KA Core 7. Describe how to modify an index structure to facilitate an index-only operation for a given relation.

ILO KA Core 8. For a given scenario decide on which indices to support for the efficient execution of a set of queries.

ILO KA Core 9. Describe how DBMSs leverage parallelism to speed up query processing by dividing the work across multiple processors or nodes

DM-Internals: DBMS Internals (Pages 119-120)

Explain

KA Core 1. DB Buffer Management.

Chapter 11

KA Core 2. Transaction Management

Chapter 11

KA Core 2a. Isolation Levels

Chapter 11

KA Core 2b. ACID

Chapter 11

KA Core 2c. Serializability

Chapter 11

KA Core 2d. Distributed Transactions

Chapter 11

KA Core 3. Concurrency Control

Chapter 11

KA Core 3a. 2-Phase Locking

Chapter 11

KA Core 3b. Deadlocks handling strategies

Chapter 11

KA Core 3c. Quorum-based consistency models

KA Core 4. Recovery Manager

Chapter 11

KA Core 4a. Relation with Buffer Manager

Chapter 11

Non-core 5. Concurrency Control

Chapter 11

Non-core 5a. Optimistic concurrency control

Chapter 11

Non-core 5b. Timestamp concurrency control

Chapter 11

Non-core 6. Recovery Manager

Chapter 11

Non-core 6a. Write-Ahead logging

Chapter 11

Non-core 6b. ARIES recovery system (Analysis, REDO, UNDO)

Illustrative Learning Outcomes

KA Core 1. Describe how a DBMS manages its Buffer Pool.

Chapter 11

KA Core 2. Describe the four properties for a correct transaction manager.

Chapter 11

KA Core 3. Outline the principle of serializability.

DM-NoSQL: NoSQL Systems (Pages 120-121)

Explain

KA Core 1. Why NoSQL? (e.g., Impedance mismatch between Application [CRUD] and RDBMS)

Chapter 4

Chapter 8

Chapter 15

KA Core 2. Key-Value and Document data model

Chapter 8

Non-core 3. Storage systems (e.g., Key-Value systems, Data Lakes)

Chapter 8

Non-core 4. Distribution Models (Update and Read, Quorum consistency, CAP theorem)

Non-core 5. Graph Databases

Chapter 8

Non-core 6. Consistency Models (Update and Read, Quorum consistency, CAP theorem)

Non-core 7. Processing model (e.g., Map-Reduce, multi-stage map-reduce, incremental map-reduce)

Non-core 8. Case Studies: Cloud storage systems (e.g., S3); Graph databases; “When not to use NoSQL”

Illustrative Learning Outcomes

ILO KA Core 1. Develop a use case for the use of NoSQL over RDBMS

Chapter 4

Chapter 8

ILO KA Core 2. Describe the defining characteristics behind Key-Value and Document-based data models

Chapter 8

DM-Security: Data Security and Privacy (Page 121)13

Explain

CS Core 1. Differences between data security and data privacy.

CS Core 2. Protecting data and database systems from attacks, including injection attacks such as SQL injection

Chapter 13

CS Core 3. Personally identifying information (PII) and its protection.

CS Core 4. Ethical considerations in ensuring the security and privacy of data.14

KA Core 5. Need for, and different approaches to securing data at rest, in transit, and during processing.15

KA Core 6. Database auditing and its role in digital forensics.

KA Core 7. Data inferencing and preventing attacks.

KA Core 8. Laws and regulations governing data security and data privacy

Non-core 9. Typical risk factors and prevention measures for ensuring data integrity

Non-core 10. Ransomware and prevention of data loss and destruction.

Illustrative Learning Outcomes

KA Core 1. Describe the differences in the goals for data security and data privacy.

KA Core 2. Identify and mitigate risks associated with different approaches to protecting data.

KA Core 3. Describe legal and ethical considerations of end-to-end data security and privacy.

KA Core 4. Develop a database auditing system given risk considerations.

KA Core 5. Apply several data exploration approaches to understanding unfamiliar datasets.

DM-Analytics: Data Analytics (Page 121-122)

Explain

KA Core 1. Exploratory data techniques (motivation, representation, descriptive statistics, visualizations)

KA Core 2. Data science lifecycle: business understanding, data understanding, data preparation, modeling, evaluation, deployment, and user acceptance

KA Core 3. Data mining and machine learning algorithms: e.g., classification, clustering, association, regression

KA Core 4. Data acquisition and governance.

KA Core 5. Data security and privacy considerations

KA Core 6. Data fairness and bias

KA Core 7. Data visualization techniques and their use in data analytics

KA Core 8. Entity Resolution

Illustrative Learning Outcomes

KA Core 1. Describe several data exploration approaches, including visualization, to understanding unfamiliar datasets.

KA Core 2. Apply several data exploration approaches to understanding unfamiliar datasets.

KA Core 3. Describe basic machine learning/data mining algorithms and when they are appropriate for use.

KA Core 4. Apply several machine learning/data mining algorithms.

KA Core 5. Describe legal and ethical considerations in acquiring, using, and modifying datasets.

KA Core 6. Describe issues of fairness and bias in data collection and usage.

DM-Distributed: Distributed Databases/Cloud Computing (Page 122)

Non-core 1. Distributed DBMS

Non-core 1a. Distributed data storage

Non-core 1b. Distributed query processing

Non-core 1c. Distributed transaction model

Non-core 1d. Homogeneous and heterogeneous solutions

Non-core 1e. Client-server distributed databases

Non-core 2. Parallel DBMS

Non-core 2a. Parallel DBMS architectures: shared memory, shared disk, shared nothing

Non-core 2b. Speedup and scale-up, e.g., use of the MapReduce processing model

Non-core 2c. Data replication and weak consistency models

DM-Unstructured: Semi-structured and Unstructured Databases (Pages 122-123)

Non-core 1. Vectorized unstructured data (text, video, audio, etc.) and vector storage

Non-core1a. TF-IDF Vectorized with ngram

Non-core 1b. Word2Vec

Non-core 1c. Array database or array data type handling

Non-core 2. Semi-structured databases (e.g., JSON)

Non-core 2a. Storage

Non-core 2ai. Encoding and compression of nested data types

Non-core 2b. Indexing

Non-core 2bi. Btree, skip index, Bloom filter

Non-core 2bii. Inverted index and bitmap compression

Non-core 2biii. Space filling curve indexing for semi-structured geo-data

Non-core 2c. Query processing for OLTP and OLAP use cases

Non-core 2ci. Insert, Select, update, and delete tradeoffs

DM-SEP: Society, Ethics, and the Profession (Page 123)

Explain

KA Core 1. Enumerate three social and three legal issues related to large data collections.

KA Core 2. Describe the value of data privacy.

KA Core 3. Identify the competing stakeholders with respect to data ownership.

KA Core 4. Enumerate three negative unintended consequences from a given (well known) data-centric application (e.g., Facebook, LastPass, Ashley Madison)

KA Core 5. Describe the meaning of data provenance and lineage.

KA Core 6. Identify how a database might contribute to data security as well as how it may introduce insecurities.

Cybersecurity

Cybersecurity Curricula 2017 (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/csec2017.pdf)

Cybersecurity encompasses many areas such as database management. (Page 17)

There are no specific learning outcomes. The best list comes from the National Security Agency.

4.1 Knowledge Area: Data Security

This focuses on the protection of data at rest, during processing, and in transit.

CSEC2017 does have topics that address database security.

4.7 Knowledge Area: Organizational Security (Page 59)

This focuses on protecting organizations from cybersecurity threats and managing risk.

CSEC2017 does have topics that address this knowledge area:

  • Risk: One topic is about identifying the information assets in an organization.
  • Systems Administration
  • Database system administration
  • System hardening commonly used systems such as database systems.

Data Science

Computing Competencies for Undergraduate Data Science Curricula (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/dstf_ccdsc2021.pdf)16

Chapters

Data Mining (DM) (Page 74)

This involves the processing, analysis, and presentation of data in order to gain valuable information.

DM-Information Retrieval – T2 (Pages 80-81)

This is about identifying and retrieving information from a large data set. Some of the data may be in a database.

Students need to have knowledge of

  • The techniques for creating and searching relational databases systems.
  • Various relational, non-relational, and other database formats.

Chapter 9

Chapter 8

Skills

Create and use a relational database structure using SQL.

Chapter 9

Data Integrity (DPSIA/DI) (page 89)

This is about the completeness, accuracy, and consistency of data. One part of the data life cycle is storing and retrieving.

DPSIA/DI Logical Integrity – T1 (Page 89)

Students need to have knowledge of

  • Types of integrity constraints in database systems
  • Entity integrity, referential integrity, domain integrity, user-defined integrity

Chapter 3

Chapter 6

Programming, Data Structures, and Algorithms (PDA) (Page 111)

This is about the tools for data collection and analysis.

One competency is the need to write appropriate database queries.

PDA-Programming (Page 112-113)

Skills

Manipulate data from selected sources (e.g., databases, spreadsheets, text documents, XML) utilizing appropriate techniques (e.g., database queries, API calls, regular expressions).

Chapter 9

Chapter 12

Chapter 13

Software Development and Maintenance (SDM) (Page 119)

There is a statement that this area draws from CS2012 knowledge area on software engineering.

SDM-Software Design and Development (page 119)

Students need to have T1 level of knowledge of

  • Integration with Information Management/Database systems

Chapter 12

Chapter 13

Skills (Page 120)

Describe how to integrate or interact with Information Management/Database Systems.

Chapter 12

Chapter 13

Information Systems (Also written as “Computer Information Systems.”)

IS 2020: A Competency Model for Undergraduate Programs in Information Systems (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/is2020.pdf)

Chapters

A3.2 Data / Information Competency Realm (Page 101)

A3.2.1 Competency Area – Data / Information Management (Page 101)

  • Query the relational model (Page 102)

Chapter 9

  • Design relational databases (Page 103)

Chapter 4

Chapter 5

  • Programming database systems using functions and triggers (Page 103)

Chapter 12

  • Secure a database (Page 103)

Chapter 12

  • Compare tradeoffs of different concurrency modes (Page 103)

Chapter 8

  • Develop non-relational models (Page 104)

Information Technology

Information Technology Curricula 2017 (IT2017) (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/it2017.pdf)

Chapters

ITE-IMA Domain: Information Management (Page 56) for all listed competencies.)

Competencies

A. Express how the growth of the internet and demands for information have changed data handling and transactional and analytical processing, and led to the creation of special purpose databases. (Requirements)

Chapter 1

B. Design and implement a physical model based on appropriate organization rules for a given scenario including the impact of normalization and indexes. (Requirements and development)

Chapter 9

C. Create working SQL statements for simple and intermediate queries to create and modify data and database objects to store, manipulate and analyze enterprise data. (Testing and performance)

Chapter 9

D. Analyze ways data fragmentation17, replication, and allocation affect database performance in an enterprise environment. (Integration and evaluation)

E. Perform major database administration tasks such as create and manage database users, roles and privileges, backup, and restore database objects to ensure organizational efficiency, continuity, and information security. (Testing and performance)

Roles were mentioned in Chapter 11, but the usage part was not done.

Chapter 12

Subdomains (Page 92)

L1: A minimal degree of engagement.

L2: A medium degree of engagement.

L3: A large degree of engagement.

ITE-IMA-01 Perspectives and impact [L1] (Page 92 for this section.)

a. Describe how data storage and retrieval has changed over time.

Chapter 1

b. Justify the advantages of a database approach compared to traditional file processing.

Chapter 1

c. Describe how the growth of the internet and demands for information for users outside the organization (customers and suppliers) impact data handling and processing.

Chapter 1

d. Tell a brief history of database models and their evolution.

Chapter 1

ITE-IMA-02 Data-information concepts [L2] (Page 92 for this section.)

a. Describe the role of data, information, and databases in organizations.

Chapter 2

b. Compare and use key terms such as: information, data, database, database management system, metadata, and data mining.

Chapter 2

c. Illustrate data quality, accuracy, and timeliness, and explain how their absence will impact organizations.

Chapter 6

d. Describe mechanisms for data collection and their implications (automated data collection, input forms, sources).

e. Describe basic issues of data retention, including the need for retention, physical storage, backup, and security.

ITE-IMA-03 Data modeling [L3] (Page 92 for this section.)

a. Design Entity Relationship diagrams based on appropriate organizational rules for a given scenario.

Chapter 5

Chapter 6

b. Describe the relationship between a logical model and a physical model.

Chapter 4

c. Evaluate importance of database constraints.

Chapter 3

d. Design a physical model for the best performance including impact of normalization and indexes18.

e. Compare and contrast the differences and similarities between the relational and the dimensional data modeling (OLTP vs. OLAP)

ITE-IMA-04 Database query languages [L3] (Page 92 for this section.)

a. Create, modify, and query database objects using the Structured Query Language (SQL).

Chapter 9

b. Perform filtering and sorting data using various clauses including where, order by, between, like, group by, and having.

Chapter 9

c. Use joins to select data across multiple tables.

Chapter 9

d. Use embedded SQL queries.

Chapter 12

e. Perform calculations in a query using calculated fields and aggregate functions.

Chapter 9

f. Create updatable and non-updatable views

Chapter 12

ITE-IMA-05 Data organization architecture [L3] (Page 92 for this section.)

a. Demonstrate select, project, union, intersection, set difference, and natural join relational operations using simple example relations provided.

Chapter 9

b. Contrast and compare relational databases concepts and non-relational databases including object-oriented, XML, NewSQL and NoSQL databases.

Chapter 8

c. Express the relationship between functional dependencies and keys, and give examples.

Chapter 7

d. Evaluate data integrity and provide examples of entity and referential integrity.

Chapter 3

Chapter 6

e. Analyze how data fragmentation, replication and allocation affect database performance.

ITE-IMA-06 Special-purpose databases [L1] (Page 92 for this section.)

a. Describe major concepts of object oriented, XML, NewSQL, and NoSQL databases.

Chapter 8

b. Demonstrate an understanding of online analytical processing and data warehouse systems.

c. Describe methods of data mining and what insights may be gained by these methods.

ITE-IMA-07 Managing the database environment [L2] (Page 92 for this section.)

a. Contrast and compare data administration and database administration.

b. Describe tasks commonly performed by database administrators.

c. Create and manage database users, roles, and privileges19.

d. Consider the concept of database security and backup and recovery.

Chapter 3

e. Evaluate the importance of metadata in database environment.

Chapter 3

Software Engineering

Software Engineering 2014 (https://www.acm.org/binaries/content/assets/education/se2014.pdf )

From page 27, the cognitive skill level for each topic is specified as follows:

Knowledge (k): Remembering previously learned material. Test observation and recall of information; that is, “bring to mind the appropriate information” (such as dates, events, places, knowledge of major ideas, and mastery of subject matter).

Comprehension (c): Understanding information and the meaning of material presented. For example, being able to translate knowledge to a new context, interpret facts, compare, contrast, order, group, infer causes, predict consequences, and so forth.

Application (a): Using learned material in new and concrete situations. For example, using information, methods, concepts, and theories to solve problems requiring the skills or knowledge presented.

The exact details are not provided.

Chapters

Computing Essentials (Page 28)

This includes the computer science foundations that support software product design and construction.

CMP.cf.10 Database fundamentals Comprehension coverage. And is essential part of the core. (Page 29)

Chapter 2

Software Design (Page 32)

DES.dd.2 Database design At the application level. And is an essential part of the core.

United States National Security Agency (NSA)

In 2024 CAE Cyber Defense (CAE-CD) Knowledge Units document (https://dl.dod.cyber.mil/wp-content/uploads/cae/pdf/unclass-cae-cd_ku.pdf), there are three Knowledge Units that specifically address databases:

  • Data Administration (DBA) (Page 53)
  • Database Management systems (DMS) (Page 56)
  • Databases (DAT) (Page 58)

The National Security Agency’s Knowledge Units have an arrangement that follows this scheme:

  • Introduction
  • Learning Outcomes
  • Topics
  • Vocabulary
  • Related Knowledge Units
  • Most Related DCWF and/or NICE Framework Work Role(s)

We will only list the learning outcomes. Many of the topics will be covered in the chapters.

Data Administration (DBA)

Data Administration (DBA)

The intent of the Data Administration Knowledge Unit is to provide students with methods to protect the confidentiality, integrity, and availability of data throughout the data life cycle.

Chapters

Outcomes

To complete this KU, students should be able to:

1. Draw and describe a data and information lifecycle, identifying specific and general security issues at all stages.

2. Define and evaluate data and information quality, accessibility, and utility.

3. Examine how the origination, change, distribution, storage, and deletion of information is managed and secured.

4. Compare and contrast data and information ownership, stewardship, management, possession, and governance.

5. Outline the role of data and information classification in security.

1

2

3

4

5

Database Management Systems (DMS)

Database Management Systems (DMS)

The intent of the Database Management Systems Knowledge Unit is to provide students with the skills to utilize database management system to solve specific problems

Chapters

Outcomes

To complete this KU, students should be able to:

1. Compare and contrast database types including relational, hierarchical, distributed, and other models.

2. Describe the role of a database, a DBMS, and a database server within a complex system supporting multiple applications.

3. Apply SQL to create and administer databases and to manipulate the data they contain.

4. Describe DBMS access controls, privilege levels, and security principles and apply them to a simple database.

5. Outline common structures for storing data in a database management system.

6. Design and deploy a simple database for a specified application.

1. Chapter 2

2. Chapter 2

3. Chapter 9

4. Chapter 12

5

6. This is the running project.

Databases (DAT)

Databases (DAT)

The intent of the Databases Knowledge Unit is to teach students how database systems are used, managed, and issues associated with protecting the associated data assets

Chapters

Outcomes

To complete this KU, students should be able to:

1. Describe the role of a database, a database management system (DBMS), and a database server within a complex system supporting one or more applications.

2. Outline different models for databases and cases where they may be used.

3. Identify and describe common security concerns in databases and database management systems

1. Chapter 2

2. Chapter 2

3.

Association of Information Technology Professionals (CIPS)

Canada’s Association of Information Technology Professionals (CIPS, https://cips.ca) is an accreditation agency just like the Computing Accreditation Commission of the ABET. At this writing, the CIPS does not have any specific guidance concerning information management nor about databases.

New Chapters

In addition to embracing the best practice of using learning outcomes from two major credentialing agencies, we have added more information to the current chapters, combined some chapters, and added new chapters. The new chapters are:

  • Chapter 10: “Relational Algebra, Relational Calculus, and Relational Theory”
  • Chapter 11: “Under the Hood of the SQL Server”
  • Chapter 12: “Protecting the DBMS”
  • Chapter 13: “Programs to Work with a Database”
  • Chapter 14: “An Example Website with Ca Database Connection”
  • Chapter 15: “More on an Example Website with a Database Connection”

A running design project has been added. After a topic has been introduced, then a submodule uses questions to guide students to create their own final project.

Licensing20

This text is released under the Creative Commons license Attribution-NonCommercial-NoDerivs. This text is like a conventional book, in other words, except that it is free. You may copy the work and distribute it to others for any noncommercial use (and for some commercial uses; see below), but all reuse requires attribution.

The Creative Commons license does not precisely spell out what constitutes “noncommercial” use. The author considers any sale of printed copies of this book, even by a non-profit organization and even if the price just covers expenses, to be commercial use. Personal printing, and free distribution of printed selections, do qualify as noncommercial.

Classroom Use

This book is meant as a serious and more-or-less thorough text for an introductory college or graduate course in databases. The contents have been carefully researched, with consistent notation and style, and complete with diagrams and exercises.

Instructors who adopt this book in a course, as either a primary or a secondary text, are strongly encouraged to let us know, as this helps support continued work on the book. Below is a list of the institutions that have adopted our book. Commercial publishers get this information from sales records, but that won’t work here; if you want to see your institution listed, contact us!

  • Douglas College
    • CSIS 230 Database I
  • Kwantlen Polytechnic University
    • INFO 2312 Database Management Systems
  • University of Maine at Presque Isle
    • COS 350 Databases and Database Management Systems (Fall 2021 and Fall 2023)

Acknowledgments

We would like to thank the many Douglas College students that have provided invaluable feedback on the text and on the exercises. We would like to thank the following people from outside of Douglas College who have contributed technical or editorial comments. If we have missed anyone or their institution, please let us know.

For the third edition, I wish to thank Nelson Eng for his feedback and for his encouragement. This has been a labor of love. It is not perfect. I expect there are some typos and some paragraphs that need to be revised. I hope I can work on the fourth edition.

Fred L. Strickland, Ph.D.

Patent Examiner

United States Patent and Trademark Office.

Progress Notes

The third edition was started in early 2024. The third edition was declared complete as of xxx 202x.

Technical Considerations

The book was prepared on my computer using Microsoft Word. There was a brief period when Microsoft Word stopped working and I had to resort to using LibreOffice Writer. Then I uploaded each part to the BC Campus’ PressBooks website. I had to do some format corrections.

A Note on the Cover

The photograph is of a spiral staircase.

References

Matthew Hertz. “What Do ‘CS1’ and ‘CS2” Mean? Investigating Differences In the Early Courses.” SIGCSE’10: Proceedings of the 41st ACM technical symposium on Computer science education, Pages 199-203, March 10-13, 2010. The actual paper is available at https://cse.buffalo.edu/~mhertz/meaning-sigcse-2010.pdf

 

1We collected all learning outcomes from a section. If we have no plans to cover a specific learning outcome, then there are no references to a chapter.
2 You can find the coversheet of Dr. Chris Bourke’s computer science textbook at https://github.com/cbourke/CS1/blob/main/versions-PDF/ComputerScienceOne_v1.3.5.pdf The citation is Chris Bourke. Computer Science I Version 1.3.5, Department of Computer Science & Engineering, April 7, 2018. https://bitbucket.org/chrisbourke/computersciencei/src/master/ComputerScienceOne.pdf and https://github.com/cbourke/CS1
3 This has a cover date of 2023. The publication date is January 2024. The ACM website has the publication date of May 30, 2024.
4 The Bloom taxonomy was applied to the topics, not to the learning outcomes. As explained in the next footnote, a knowledge unit has topics and learning outcomes. There may be more topics than learning outcomes. The listed Bloom taxonomy skill level may not align completely. These came from page 352 and following.
5 The ACM has Computer Science (CS) Core topics that every computer science graduate must know. The Knowledge Area (KA) Core are elective topics that are not required of every computer science graduate. “… a curriculum is expected to cover a considerable percentage of elective topics.” Each knowledge unit has some learning outcomes. These are “meant to be descriptive, not prescriptive.” That is, not every listed topic has a linked learning outcome. An institution may add more learning outcomes. For this textbook, we will only cover the presented learning outcomes.
6 JSON tends to be used on websites. I did not address JSON, XML, and the related in the web development part of tis textbook.
7 Many of these topics are covered in a data structures course.
8 Projection is covered in relational algebra part of a course.
9 This topic is covered in a database design part of a course.
10 I mentioned indexing, but I did not go very far. I stated that Microsoft SQL Server will automatically generate indexes on a table.
11 Run-time calculations are normally taught in an operating systems course.
12 I talk about how to translate between a query and a relational algebra expression. I do not discuss left-deep, pipelined execution.
13 These topics could be addressed in a cybersecurity program.
14 Ethics could be added to this course. I did not address this topic, because the textbook is long enough at 15 chapters.
15 This could be addressed in a cybersecurity program.
16I have been using the abbreviation for each computing sub discipline based on the title from the PDF file name. That has worked out to being two-letter codes. The ACM did not do that for the data science sub discipline. They used “ccdsc.” I decided to go with “DS” as the abbreviation.
17 This is a topic that is covered in an operating systems course.
18 A physical model captures what the customer wants in a database. The work of making it work best comes later on.
19 In Chapter 11 and in Chapter 12, I address database access. I did not tell how to do this.
20The words on this page and following were borrowed from another author’s preface. Peter L. Dordal, An Introduction to Computer Networks, Desktop Edition 2.0.11, http://intronetworks.cs.luc.edu/current2/html/preface.html

License

Database Design - 3rd Edition Copyright © by Fred Strickland. All Rights Reserved.

Share This Book