Preface for Database Design – 3rd Edition
Fred Strickland
Contents
Preface for Database Design – 3rd Edition1
Association of Computing Machinery (ACM)3
Information Systems (Also written as “Computer Information Systems.”)15
United States National Security Agency (NSA)18
Database Management Systems (DMS)19
Association of Information Technology Professionals (CIPS)20
The Past Editions
The first edition was released sometime before 2014.
Adrienne Watt wrote the following for the second edition (copyright 2014):
The primary purpose of this text is to provide an open source textbook that covers most introductory database courses. The material in the textbook was obtained from a variety of sources. All the sources are found in at the end of each chapter. I expect, with time, the book will grow with more information and more examples.
I welcome any feedback that would improve the book. If you would like to add a section to the book, please let me know.
Adrienne Watt
The Present Edition
For the third edition, we have decided to greatly revise the book to make it more useful for a larger audience. We have researched the curriculum guidance from the three North American agencies:
- Association of Computing Machinery (ACM)
- the United States National Security Agency (NSA)
- Association of Information Technology Professionals (CIPS)
This edition will cover the learning outcomes1 for the following computing subdisciplines:
- Computer Engineering (CE)
- Computer Science (CS)
- 2013 (Knowledge based approach)
- 2023 (Competency based approach)
- Cybersecurity (CYB)
- Data Science (DS)
- Information Systems (IS) (Also written as “Computer Information Systems.”)
- Information Technology (IT)
- Software Engineering (SE)
As a side note, there are old computer science textbooks that stated that these supported “CS1” or “CS2.” Matthew Hertz pointed out that this usage dates from about 1979 when “the ACM Computing Curricula used the terms ‘CS1’ and ‘CS2” to designate the first two course in the introductory sequences of a computer science major.” However, these textbooks did not provide detailed information about the covered learning outcomes. Dr. Chris Bourke’s textbook is an example of a textbook that states it supports CS12. It is rare to find a textbook that mentions which curricular books are being supported. As a personal note, I have not seen any recent computer science textbooks that mention supporting CS2013.
The next few pages list the database related learning outcomes from the current seven sub computing disciplines. Please note that this textbook does not attempt to cover every learning outcome. Instructors may feel free to add additional materials to cover these omitted learning outcomes.
Association of Computing Machinery (ACM)
Computer Engineering
Computer Engineering Curricula 2016 (https://www.acm.org/binaries/content/assets/education/ce2016-final-report.pdf)
|
Chapters |
||
|
CE-PPP-9 Contemporary issues (Page 88) |
|
|
|
|
Articulate some of the privacy implications related to massive database systems. |
Chapter 3 |
|
CE-SEC-9 Authentication (Page 91) |
|
|
|
|
Discuss eavesdropping and server database reading and explain how various authentication methods deal with them. |
|
|
CE-SPE-5 Risk, dependability, safety, and fault tolerance |
|
|
|
|
Describe fault tolerance and dependability requirements of different applications (such as database, aerospace, telecommunications, industrial control, and transaction processing). |
|
|
CE-SWD-10 Database systems (Page 103) |
|
|
|
|
Explain how use of database systems evolved from programming with simple collections of data files. |
Chapter 1 |
|
|
Describe the major components of a modern database system. |
|
|
|
Describe the functionality provided by languages such as SQL. |
Chapter 9 |
|
|
Give examples of interactions with database systems that are relevant to computer engineering. |
|
Computer Science
Computer Science Curricula 2013 (https://www.acm.org/binaries/content/assets/education/cs2013_web_final.pdf)
|
Chapters |
||
|
Information Management (IM) (Page 113) |
|
|
|
|
IM/Database Systems (Pages 113-114) 1. Explain the characteristics that distinguish the database approach from the approach of programming with data files. [Familiarity] 2. Describe the most common designs for core database system components including the query optimizer, query executor, storage manager, access methods, and transaction processor. [Familiarity] 3. Cite the basic goals, functions, and models of database systems. [Familiarity] 4. Describe the components of a database system and give examples of their use. [Familiarity] 5. Identify major DBMS functions and describe their role in a database system. [Familiarity] 6. Explain the concept of data independence and its importance in a database system. [Familiarity] 7. Use a declarative query language to elicit information from a database. [Usage] 8. Describe facilities that databases provide supporting structures and/or stream (sequence) data, e.g., text. [Familiarity] [Elective] 9. Describe major approaches to storing and processing large volumes of data. [Familiarity] |
1. Chapter 1 |
2. |
|||
3. Chapter 2 |
|||
4. Chapter 11 |
|||
5. Chapter 11 |
|||
6. Chapter 3 |
|||
7. Chapter 9 |
|||
8. |
|||
9. |
|||
|
|
IM/Relational Databases (Pages 115-116) 1. Prepare a relational schema from a conceptual model developed using the entity- relationship model. [Usage] 2. Explain and demonstrate the concepts of entity integrity constraint and referential integrity constraint (including definition of the concept of a foreign key). [Usage] 3. Demonstrate use of the relational algebra operations from mathematical set theory (union, intersection, difference, and Cartesian product) and the relational algebra operations developed specifically for relational databases (select (restrict), project, join, and division). [Usage] 4. Write queries in the relational algebra. [Usage] 5. Write queries in the tuple relational calculus. [Usage] 6. Determine the functional dependency between two or more attributes that are a subset of a relation. [Assessment] 7. Connect constraints expressed as primary key and foreign key, with functional dependencies. [Usage] 8. Compute the closure of a set of attributes under given functional dependencies. [Usage] 9. Determine whether a set of attributes form a superkey and/or candidate key for a relation with given functional dependencies. [Assessment] 10. Evaluate a proposed decomposition, to say whether it has lossless-join and dependency-preservation. [Assessment] 11. Describe the properties of BCNF, PJNF, 5NF. [Familiarity] 12. Explain the impact of normalization on the efficiency of database operations especially query optimization. [Familiarity] 13. Describe what is a multi-valued dependency and what type of constraints it specifies. [Familiarity] |
1. Chapter 4 Chapter 5 |
2. Chapter 3 Chapter 6 |
|||
3. Chapter 10 |
|||
4. Chapter 10 |
|||
5. Chapter 10 |
|||
6. Chapter 7 |
|||
7. Chapter 7 |
|||
8. Chapter 7 |
|||
9. Chapter 7 |
|||
10. Chapter 5 |
|||
11. Chapter 7 |
|||
12. Chapter 7 |
|||
13. Chapter 7 |
|||
|
|
IM/Query Languages (Page 116) 1. Create a relational database schema in SQL that incorporates key, entity integrity, and referential integrity constraints. [Usage] 2. Use SQL to create tables and retrieve (SELECT) information from a database. [Usage] 3. Evaluate a set of query processing strategies and select the optimal strategy. [Assessment] 4. Create a non-procedural query by filling in templates of relations to construct an example of the desired query result. [Usage] 5. Embed object-oriented queries into a stand-alone language such as C++ or Java (e.g., SELECT Col.Method() FROM Object). [Usage] 6. Write a stored procedure that deals with parameters and has some control flow, to provide a given functionality. [Usage] |
1. Chapter 9 |
2. Chapter 9 |
|||
3. |
|||
4. Chapter 12 |
|||
5. Chapter 12 Chapter 13 |
|||
6. Chapter 12 |
|||
CS2023 Shift
In CS2013, this area was named “Information Management.” In CS2023, this area is now named “Data Management (DM).” This was done in order to “avoid confusion with the similar definitions used in Information Systems and Information Technology curricula.” CS2023 has greater attention to security knowledge area and to society, ethics, and profession knowledge area. CS2023 includes NoSQL approaches and MapReduce as core topics.
This was released in mid-2024 after major work had been completed on this textbook. I have attempted to list where these new learning outcomes appear. My effort may not be accurate. Consider this to be a work in progress.
Computer Science Curricula 20233 (https://dl.acm.org/doi/pdf/10.1145/3664191)
|
Bloom4 |
Chapters |
|||||
|
DM-Data: The Role of Data and the Data Life Cycle (Page 115) |
|
|
||||
|
|
CS Core5 1. The Data Life Cycle: Creation-Processing-Review/Reporting Retention/Retrieval Destruction |
|
|
|||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO CS Core 1. Identify the five stages of the Data Life Cycle. |
Evaluate |
|
|||
|
|
|
|
|
|||
|
DM-Core: Core Database Systems Concepts (Pages 115-116) |
Explain |
|
||||
|
|
CS Core 1. Purpose and advantages of database systems |
|
Chapter 3 |
|||
|
|
CS Core 2. Components of database systems |
|
Chapter 11 |
|||
|
|
CS Core 3. Design of Core DBMS functions (e.g., query mechanisms, transaction management, buffer management, access methods) |
|
Chapter 11 |
|||
|
|
CS Core 4. Database architecture, data independence, and data abstraction |
|
Chapter 11 |
|||
|
|
CS Core 5. Transaction management |
|
Chapter 11 |
|||
|
|
CS Core 6. Normalization |
|
Chapter 7 |
|||
|
|
CS Core 7. Approaches for managing large volumes of data (e.g., NoSQL database systems, use of MapReduce) |
|
|
|||
|
|
CS Core 8. How to support CRUD-only applications |
|
Chapter 15 |
|||
|
|
CS Core 9. Distributed databases/cloud-based systems |
|
|
|||
|
|
CS Core 10. Structured, Semi-structured, and unstructured data |
|
|
|||
|
|
CS Core 11. Use of a declarative query language |
|
|
|||
|
|
KA Core 12. Systems supporting structured and/or stream content |
|
|
|||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO CS Core 1. Identify at least four advantages that using a database system provides. |
|
Chapter 1 |
|||
|
|
ILO CS Core 2. Enumerate the components of a (relational) database system. |
|
Chapter 11 |
|||
|
|
ILO CS Core 3. Follow a query as it is processed by the components of a (relational) database system. |
|
Chapter 11 |
|||
|
|
ILO CS Core 4. Defend the value of data independence. |
|
Chapter 2 |
|||
|
|
ILO CS Core 5. Compose a simple select-project-join query in SQL. |
|
Chapter 9 |
|||
|
|
ILO CS Core 6. Enumerate the four properties of a correct transaction manager. |
|
Chapter 11 |
|||
|
|
ILO CS Core 7. Describe the advantages for eliminating duplicate repeated data. |
|
Chapter 7 |
|||
|
|
ILO CS Core 8. Outline how MapReduce uses parallelism to process data efficiently. |
|
|
|||
|
|
ILO CS Core 9. Evaluate the differences between structured and semi/unstructured databases. |
|
|
|||
|
DM-Modeling: Data Modeling (Pages 116-117) |
Develop and Explain |
|
||||
|
|
CS Core 1. Data modeling |
|
Chapter 4 |
|||
|
|
CS Core 2. Relational data model |
|
Chapter 4 Chapter 5 |
|||
|
|
KA Core 3. Conceptual models (e.g., entity-relationship, UML diagrams) |
|
Chapter 4 Chapter 5 |
|||
|
|
KA Core 4. Semi-structured data models (expressed using DTD, XML, or JSON Schema, for example)6 |
|
|
|||
|
|
Non-core 5. Spreadsheet models |
|
Chapter 1 |
|||
|
|
Non-core 6. Object-oriented models |
|
Chapter 1 |
|||
|
|
|
Non-core 6a. GraphQL |
|
|
||
|
|
Non-core 7. New features in SQL |
|
Chapter 9 |
|||
|
|
Non-core 8. Specialized Data Modeling topics |
|
|
|||
|
|
|
Non-core 8a. Time series data (aggregation, join) |
|
|
||
|
|
|
Non-core 8b. Graph data (link traversal) |
|
|
||
|
|
|
Non-core 8c. Techniques for avoiding inefficient raw data access (e.g., “avg daily price”): materialized views and special data structures (e.g., Hyperloglog, bitmap) |
|
|
||
|
|
|
Non-core 8d. Geo-Spatial data (e.g., GIS databases) |
|
|
||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO CS Core 1. Describe the components of the relational data model. |
|
Chapter 5 |
|||
|
|
ILO CS Core 2. Model 1:1, 1:n, and n:m relationships using the relational data model. |
|
Chapter 6 |
|||
|
|
ILO KA Core 3. Describe the components of the E-R (or some other non-relational) data model. |
|
Chapter 5 |
|||
|
|
ILO KA Core 4. Model a given environment using a conceptual data model. |
|
Chapter 4
|
|||
|
|
ILO KA Core 5. Model a given environment using the document-based or key-value store-based data model. |
|
|
|||
|
DM-Relational: Relational Databases (Pages 117-118) |
Explain and Develop |
|
||||
|
|
CS Core 1. Entity and referential integrity: Candidate key, superkeys |
|
Chapter 5 Chapter 6 |
|||
|
|
CS Core 2. Relational database design |
|
Chapter 5 |
|||
|
|
KA Core 3. Mapping conceptual schema to a relational schema |
|
Chapter 5 |
|||
|
|
KA Core 4. Physical database design: file and storage structures |
|
Chapter 1 |
|||
|
|
KA Core 5. Introduction to Functional dependency theory. |
|
Chapter 7 |
|||
|
|
KA Core 6. Normalization Theory |
|
Chapter 7 |
|||
|
|
|
KA Core 6a. Decomposition of a schema; lossless join, and dependency-preservation properties of a decomposition |
|
Chapter 7 |
||
|
|
|
KA Core 6b. Normal forms (BCNF) |
|
Chapter 7 |
||
|
|
|
KA Core 6c. Denormalization (for efficiency) |
|
Chapter 7 |
||
|
|
Non-core 7. Functional dependency theory |
|
Chapter 7 |
|||
|
|
|
Non-core 7a. Closure of a set of attributes |
|
Chapter 7 |
||
|
|
|
Non-core 7b. Canonical Cover |
|
Chapter 7 |
||
|
|
Non-core 8. Normalization theory |
|
Chapter 7 |
|||
|
|
|
Non-core 8a. Multi-valued dependency (4NF) |
|
Chapter 7 |
||
|
|
|
Non-core 8b. Join dependency (PJNF, 5NF) |
|
Chapter 7 |
||
|
|
|
Non-core 8c. Representation theory |
|
|
||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO CS Core 1. Describe the defining characteristics behind the relational data model. |
|
Chapter 5 |
|||
|
|
ILO CS Core 2. Comment on the difference between a foreign key and a superkey. |
|
Chapter 5 |
|||
|
|
ILO CS Core 3. Enumerate the different types of integrity constraints. |
|
Chapter 3 Chapter 6 |
|||
|
|
ILO KA Core 4. Compose a relational schema from a conceptual schema which contains 1:1, 1:n, and n:m relationships. |
|
Chapter 6 |
|||
|
|
ILO KA Core 5. Map appropriate file structure to relations and indices. |
|
|
|||
|
|
ILO KA Core 6. Describe how functional dependency theory generalizes the notion of key. |
|
Chapter 7 |
|||
|
|
ILO KA Core 7. Defend a given decomposition as lossless and or dependency preserving. |
|
Chapter 7 |
|||
|
|
ILO KA Core 8. Detect which normal form a given decomposition yields. |
|
Chapter 7 |
|||
|
|
ILO KA Core 9. Comment on reasons for denormalizing a relation. |
|
Chapter 7 |
|||
|
DM-Querying: Query Construction (Page 118) |
Develop |
|
||||
|
|
CS Core 1. SQL Query Formation |
|
Chapter 9 |
|||
|
|
|
CS Core 1a. Interactive SQL execution |
|
|
||
|
|
|
CS Core 1b. Programmatic execution of an SQL query |
|
Chapter 13 |
||
|
|
KA Core 2. Relational Algebra |
|
Chapter 10 |
|||
|
|
KA Core 3. SQL |
|
Chapter 9 |
|||
|
|
|
KA Core 3a. Data definition including integrity and other constraint specifications. |
|
Chapter 9 |
||
|
|
|
KA Core 3b. Update sublanguage |
|
|
||
|
|
Non-core 4. Relational Calculus |
|
Chapter 1 |
|||
|
|
Non-core 5. QBE and 4th-generation environments |
|
|
|||
|
|
Non-core 6. Different ways to invoke non-procedural queries in conventional languages. |
|
Chapter 13 |
|||
|
|
Non-core 7. Introduction to other major query languages (e.g., XPATH, SPARQL) |
|
|
|||
|
|
Non-core 8. Stored procedures |
|
Chapter 13 |
|||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO CS Core 1. Compose SQL queries that incorporate select, project, join, union, intersection, set difference, and set division. |
|
Chapter 9 |
|||
|
|
ILO CS Core 2. Determine when a nested SQL query is correlated or not. |
|
|
|||
|
|
ILO CS Core 3. Iterate over data retrieved programmatically from a database via an SQL query |
|
|
|||
|
|
ILO KA Core 4. Define, in SQL, a relation schema, including all integrity constraints and delete/update triggers. |
|
Chapter 9 |
|||
|
|
ILO KA Core 5. Compose an SQL query to update a tuple in a relation. |
|
|
|||
|
DM-Processing: Query Processing (Pages 118-119)7 |
Develop |
|
||||
|
|
KA Core 1. Page structures |
|
|
|||
|
|
KA Core 2. Index structures |
|
|
|||
|
|
|
KA Core 2a. B+ trees |
|
|
||
|
|
|
KA Core 2b. Hash indices: static and dynamic |
|
|
||
|
|
|
KA Core 2c. Index creation in SQL |
|
|
||
|
|
KA Core 3. File structures |
|
|
|||
|
|
|
KA Core 3a. Heap files |
|
|
||
|
|
|
KA Core 3b. Hash files |
|
|
||
|
|
KA Core 4. Algorithms for query operators |
|
|
|||
|
|
|
KA Core 4a. External Sorting |
|
|
||
|
|
|
KA Core 4b. Selection |
|
|
||
|
|
|
KA Core 4c. Projection8; with and without duplication elimination |
|
|
||
|
|
|
KA Core 4d. Natural Joins: Nested loop, Sort-merge, Hash join |
|
|
||
|
|
|
KA Core 4e. Analysis of algorithm efficiency |
|
|
||
|
|
KA Core 5. Query transformations |
|
|
|||
|
|
KA Core 6. Query optimization |
|
|
|||
|
|
|
KA Core 6a. Access paths |
|
|
||
|
|
|
KA Core 6b. Query plan construction |
|
|
||
|
|
|
KA Core 6c. Selectivity estimation |
|
|
||
|
|
|
KA Core 6d. Index-only plans |
|
|
||
|
|
KA Core 7. Parallel Query Processing (e.g., parallel scan, parallel join, parallel aggregation) |
|
|
|||
|
|
KA Core 8. Database tuning/performance |
|
|
|||
|
|
|
KA Core 8a. Index selection |
|
|
||
|
|
|
KA Core 8b. Impact of indices on query performance |
|
|
||
|
|
|
KA Core 8c. Denormalization9 |
|
|
||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO KA Core 1. Describe the purpose and organization of both B+ tree and hash index structures. |
|
|
|||
|
|
ILO KA Core 2. Compose an SQL command to create an index10 (any kind). |
|
|
|||
|
|
ILO KA Core 3. Specify the steps for the various query operator algorithms: external sorting, projection with duplicate elimination, sort-merge join, hash-join, block nested-loop join. |
|
|
|||
|
|
ILO KA Core 4. Derive the run-time11 (in I/O requests) for each of the above algorithms. |
|
|
|||
|
|
ILO KA Core 5. Transform a query in relational algebra12 to its equivalent appropriate for a left-deep, pipelined execution. |
|
|
|||
|
|
ILO KA Core 6. Compute selectivity estimates for a given selection and/or join operation. |
|
|
|||
|
|
ILO KA Core 7. Describe how to modify an index structure to facilitate an index-only operation for a given relation. |
|
|
|||
|
|
ILO KA Core 8. For a given scenario decide on which indices to support for the efficient execution of a set of queries. |
|
|
|||
|
|
ILO KA Core 9. Describe how DBMSs leverage parallelism to speed up query processing by dividing the work across multiple processors or nodes |
|
|
|||
|
DM-Internals: DBMS Internals (Pages 119-120) |
Explain |
|
||||
|
|
KA Core 1. DB Buffer Management. |
|
Chapter 11 |
|||
|
|
KA Core 2. Transaction Management |
|
Chapter 11 |
|||
|
|
|
KA Core 2a. Isolation Levels |
|
Chapter 11 |
||
|
|
|
KA Core 2b. ACID |
|
Chapter 11 |
||
|
|
|
KA Core 2c. Serializability |
|
Chapter 11 |
||
|
|
|
KA Core 2d. Distributed Transactions |
|
Chapter 11 |
||
|
|
KA Core 3. Concurrency Control |
|
Chapter 11 |
|||
|
|
|
KA Core 3a. 2-Phase Locking |
|
Chapter 11 |
||
|
|
|
KA Core 3b. Deadlocks handling strategies |
|
Chapter 11 |
||
|
|
|
KA Core 3c. Quorum-based consistency models |
|
|
||
|
|
KA Core 4. Recovery Manager |
|
Chapter 11 |
|||
|
|
|
KA Core 4a. Relation with Buffer Manager |
|
Chapter 11 |
||
|
|
Non-core 5. Concurrency Control |
|
Chapter 11 |
|||
|
|
|
Non-core 5a. Optimistic concurrency control |
|
Chapter 11 |
||
|
|
|
Non-core 5b. Timestamp concurrency control |
|
Chapter 11 |
||
|
|
Non-core 6. Recovery Manager |
|
Chapter 11 |
|||
|
|
|
Non-core 6a. Write-Ahead logging |
|
Chapter 11 |
||
|
|
|
Non-core 6b. ARIES recovery system (Analysis, REDO, UNDO) |
|
|
||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
KA Core 1. Describe how a DBMS manages its Buffer Pool. |
|
Chapter 11 |
|||
|
|
KA Core 2. Describe the four properties for a correct transaction manager. |
|
Chapter 11 |
|||
|
|
KA Core 3. Outline the principle of serializability. |
|
|
|||
|
DM-NoSQL: NoSQL Systems (Pages 120-121) |
Explain |
|
||||
|
|
KA Core 1. Why NoSQL? (e.g., Impedance mismatch between Application [CRUD] and RDBMS) |
|
Chapter 4 Chapter 8 Chapter 15 |
|||
|
|
KA Core 2. Key-Value and Document data model |
|
Chapter 8 |
|||
|
|
Non-core 3. Storage systems (e.g., Key-Value systems, Data Lakes) |
|
Chapter 8 |
|||
|
|
Non-core 4. Distribution Models (Update and Read, Quorum consistency, CAP theorem) |
|
|
|||
|
|
Non-core 5. Graph Databases |
|
Chapter 8 |
|||
|
|
Non-core 6. Consistency Models (Update and Read, Quorum consistency, CAP theorem) |
|
|
|||
|
|
Non-core 7. Processing model (e.g., Map-Reduce, multi-stage map-reduce, incremental map-reduce) |
|
|
|||
|
|
Non-core 8. Case Studies: Cloud storage systems (e.g., S3); Graph databases; “When not to use NoSQL” |
|
|
|||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
ILO KA Core 1. Develop a use case for the use of NoSQL over RDBMS |
|
Chapter 4 Chapter 8 |
|||
|
|
ILO KA Core 2. Describe the defining characteristics behind Key-Value and Document-based data models |
|
Chapter 8 |
|||
|
DM-Security: Data Security and Privacy (Page 121)13 |
Explain |
|
||||
|
|
CS Core 1. Differences between data security and data privacy. |
|
|
|||
|
|
CS Core 2. Protecting data and database systems from attacks, including injection attacks such as SQL injection |
|
Chapter 13 |
|||
|
|
CS Core 3. Personally identifying information (PII) and its protection. |
|
|
|||
|
|
CS Core 4. Ethical considerations in ensuring the security and privacy of data.14 |
|
|
|||
|
|
KA Core 5. Need for, and different approaches to securing data at rest, in transit, and during processing.15 |
|
|
|||
|
|
KA Core 6. Database auditing and its role in digital forensics. |
|
|
|||
|
|
KA Core 7. Data inferencing and preventing attacks. |
|
|
|||
|
|
KA Core 8. Laws and regulations governing data security and data privacy |
|
|
|||
|
|
Non-core 9. Typical risk factors and prevention measures for ensuring data integrity |
|
|
|||
|
|
Non-core 10. Ransomware and prevention of data loss and destruction. |
|
|
|||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
KA Core 1. Describe the differences in the goals for data security and data privacy. |
|
|
|||
|
|
KA Core 2. Identify and mitigate risks associated with different approaches to protecting data. |
|
|
|||
|
|
KA Core 3. Describe legal and ethical considerations of end-to-end data security and privacy. |
|
|
|||
|
|
KA Core 4. Develop a database auditing system given risk considerations. |
|
|
|||
|
|
KA Core 5. Apply several data exploration approaches to understanding unfamiliar datasets. |
|
|
|||
|
DM-Analytics: Data Analytics (Page 121-122) |
Explain |
|
||||
|
|
KA Core 1. Exploratory data techniques (motivation, representation, descriptive statistics, visualizations) |
|
|
|||
|
|
KA Core 2. Data science lifecycle: business understanding, data understanding, data preparation, modeling, evaluation, deployment, and user acceptance |
|
|
|||
|
|
KA Core 3. Data mining and machine learning algorithms: e.g., classification, clustering, association, regression |
|
|
|||
|
|
KA Core 4. Data acquisition and governance. |
|
|
|||
|
|
KA Core 5. Data security and privacy considerations |
|
|
|||
|
|
KA Core 6. Data fairness and bias |
|
|
|||
|
|
KA Core 7. Data visualization techniques and their use in data analytics |
|
|
|||
|
|
KA Core 8. Entity Resolution |
|
|
|||
|
|
Illustrative Learning Outcomes |
|
|
|||
|
|
KA Core 1. Describe several data exploration approaches, including visualization, to understanding unfamiliar datasets. |
|
|
|||
|
|
KA Core 2. Apply several data exploration approaches to understanding unfamiliar datasets. |
|
|
|||
|
|
KA Core 3. Describe basic machine learning/data mining algorithms and when they are appropriate for use. |
|
|
|||
|
|
KA Core 4. Apply several machine learning/data mining algorithms. |
|
|
|||
|
|
KA Core 5. Describe legal and ethical considerations in acquiring, using, and modifying datasets. |
|
|
|||
|
|
KA Core 6. Describe issues of fairness and bias in data collection and usage. |
|
|
|||
|
DM-Distributed: Distributed Databases/Cloud Computing (Page 122) |
|
|
||||
|
|
Non-core 1. Distributed DBMS |
|
|
|||
|
|
|
Non-core 1a. Distributed data storage |
|
|
||
|
|
|
Non-core 1b. Distributed query processing |
|
|
||
|
|
|
Non-core 1c. Distributed transaction model |
|
|
||
|
|
|
Non-core 1d. Homogeneous and heterogeneous solutions |
|
|
||
|
|
|
Non-core 1e. Client-server distributed databases |
|
|
||
|
|
Non-core 2. Parallel DBMS |
|
|
|||
|
|
|
Non-core 2a. Parallel DBMS architectures: shared memory, shared disk, shared nothing |
|
|
||
|
|
|
Non-core 2b. Speedup and scale-up, e.g., use of the MapReduce processing model |
|
|
||
|
|
|
Non-core 2c. Data replication and weak consistency models |
|
|
||
|
DM-Unstructured: Semi-structured and Unstructured Databases (Pages 122-123) |
|
|
||||
|
|
Non-core 1. Vectorized unstructured data (text, video, audio, etc.) and vector storage |
|
|
|||
|
|
|
Non-core1a. TF-IDF Vectorized with ngram |
|
|
||
|
|
|
Non-core 1b. Word2Vec |
|
|
||
|
|
|
Non-core 1c. Array database or array data type handling |
|
|
||
|
|
Non-core 2. Semi-structured databases (e.g., JSON) |
|
|
|||
|
|
|
Non-core 2a. Storage |
|
|
||
|
|
|
|
Non-core 2ai. Encoding and compression of nested data types |
|
|
|
|
|
|
Non-core 2b. Indexing |
|
|
||
|
|
|
|
Non-core 2bi. Btree, skip index, Bloom filter |
|
|
|
|
|
|
|
Non-core 2bii. Inverted index and bitmap compression |
|
|
|
|
|
|
|
Non-core 2biii. Space filling curve indexing for semi-structured geo-data |
|
|
|
|
|
|
Non-core 2c. Query processing for OLTP and OLAP use cases |
|
|
||
|
|
|
|
Non-core 2ci. Insert, Select, update, and delete tradeoffs |
|
|
|
|
DM-SEP: Society, Ethics, and the Profession (Page 123) |
Explain |
|
||||
|
|
KA Core 1. Enumerate three social and three legal issues related to large data collections. |
|
|
|||
|
|
KA Core 2. Describe the value of data privacy. |
|
|
|||
|
|
KA Core 3. Identify the competing stakeholders with respect to data ownership. |
|
|
|||
|
|
KA Core 4. Enumerate three negative unintended consequences from a given (well known) data-centric application (e.g., Facebook, LastPass, Ashley Madison) |
|
|
|||
|
|
KA Core 5. Describe the meaning of data provenance and lineage. |
|
|
|||
|
|
KA Core 6. Identify how a database might contribute to data security as well as how it may introduce insecurities. |
|
|
|||
Cybersecurity
Cybersecurity Curricula 2017 (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/csec2017.pdf) Cybersecurity encompasses many areas such as database management. (Page 17) There are no specific learning outcomes. The best list comes from the National Security Agency. |
|
|
4.1 Knowledge Area: Data Security This focuses on the protection of data at rest, during processing, and in transit. CSEC2017 does have topics that address database security. |
|
4.7 Knowledge Area: Organizational Security (Page 59) This focuses on protecting organizations from cybersecurity threats and managing risk. CSEC2017 does have topics that address this knowledge area:
|
Data Science
Computing Competencies for Undergraduate Data Science Curricula (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/dstf_ccdsc2021.pdf)16
|
Chapters |
|||
Data Mining (DM) (Page 74) This involves the processing, analysis, and presentation of data in order to gain valuable information. |
|
|||
|
DM-Information Retrieval – T2 (Pages 80-81) This is about identifying and retrieving information from a large data set. Some of the data may be in a database. Students need to have knowledge of
|
Chapter 9 |
||
Chapter 8 |
||||
|
|
Skills |
|
|
|
|
|
Create and use a relational database structure using SQL. |
Chapter 9 |
Data Integrity (DPSIA/DI) (page 89) This is about the completeness, accuracy, and consistency of data. One part of the data life cycle is storing and retrieving. |
|
|||
|
DPSIA/DI Logical Integrity – T1 (Page 89) Students need to have knowledge of
|
Chapter 3 Chapter 6 |
||
Programming, Data Structures, and Algorithms (PDA) (Page 111) This is about the tools for data collection and analysis. One competency is the need to write appropriate database queries. |
|
|||
|
PDA-Programming (Page 112-113) |
|
||
|
|
Skills |
|
|
|
|
|
Manipulate data from selected sources (e.g., databases, spreadsheets, text documents, XML) utilizing appropriate techniques (e.g., database queries, API calls, regular expressions). |
Chapter 9 Chapter 12 Chapter 13 |
Software Development and Maintenance (SDM) (Page 119) There is a statement that this area draws from CS2012 knowledge area on software engineering. |
|
|||
|
SDM-Software Design and Development (page 119) Students need to have T1 level of knowledge of
|
Chapter 12 Chapter 13 |
||
|
|
Skills (Page 120) |
|
|
|
|
|
Describe how to integrate or interact with Information Management/Database Systems. |
Chapter 12 Chapter 13 |
Information Systems (Also written as “Computer Information Systems.”)
IS 2020: A Competency Model for Undergraduate Programs in Information Systems (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/is2020.pdf)
|
Chapters |
||
A3.2 Data / Information Competency Realm (Page 101) |
|
||
|
A3.2.1 Competency Area – Data / Information Management (Page 101) |
|
|
|
|
|
Chapter 9 |
|
|
|
Chapter 4 Chapter 5 |
|
|
|
Chapter 12 |
|
|
|
Chapter 12 |
|
|
|
Chapter 8 |
|
|
|
|
Information Technology
Information Technology Curricula 2017 (IT2017) (https://www.acm.org/binaries/content/assets/education/curricula-recommendations/it2017.pdf)
|
Chapters |
|||||
|
ITE-IMA Domain: Information Management (Page 56) for all listed competencies.) |
|
||||
|
|
Competencies |
|
|||
|
|
|
A. Express how the growth of the internet and demands for information have changed data handling and transactional and analytical processing, and led to the creation of special purpose databases. (Requirements) |
Chapter 1 |
||
|
|
|
B. Design and implement a physical model based on appropriate organization rules for a given scenario including the impact of normalization and indexes. (Requirements and development) |
Chapter 9 |
||
|
|
|
C. Create working SQL statements for simple and intermediate queries to create and modify data and database objects to store, manipulate and analyze enterprise data. (Testing and performance) |
Chapter 9 |
||
|
|
|
D. Analyze ways data fragmentation17, replication, and allocation affect database performance in an enterprise environment. (Integration and evaluation) |
|
||
|
|
|
E. Perform major database administration tasks such as create and manage database users, roles and privileges, backup, and restore database objects to ensure organizational efficiency, continuity, and information security. (Testing and performance) |
Roles were mentioned in Chapter 11, but the usage part was not done. Chapter 12 |
||
|
|
|
|
Subdomains (Page 92) L1: A minimal degree of engagement. L2: A medium degree of engagement. L3: A large degree of engagement. |
|
|
|
|
|
|
|
ITE-IMA-01 Perspectives and impact [L1] (Page 92 for this section.) |
|
|
|
|
|
|
a. Describe how data storage and retrieval has changed over time. |
Chapter 1 |
|
|
|
|
|
b. Justify the advantages of a database approach compared to traditional file processing. |
Chapter 1 |
|
|
|
|
|
c. Describe how the growth of the internet and demands for information for users outside the organization (customers and suppliers) impact data handling and processing. |
Chapter 1 |
|
|
|
|
|
d. Tell a brief history of database models and their evolution. |
Chapter 1 |
|
|
|
|
|
ITE-IMA-02 Data-information concepts [L2] (Page 92 for this section.) |
|
|
|
|
|
|
a. Describe the role of data, information, and databases in organizations. |
Chapter 2 |
|
|
|
|
|
b. Compare and use key terms such as: information, data, database, database management system, metadata, and data mining. |
Chapter 2 |
|
|
|
|
|
c. Illustrate data quality, accuracy, and timeliness, and explain how their absence will impact organizations. |
Chapter 6 |
|
|
|
|
|
d. Describe mechanisms for data collection and their implications (automated data collection, input forms, sources). |
|
|
|
|
|
|
e. Describe basic issues of data retention, including the need for retention, physical storage, backup, and security. |
|
|
|
|
|
|
ITE-IMA-03 Data modeling [L3] (Page 92 for this section.) |
|
|
|
|
|
|
a. Design Entity Relationship diagrams based on appropriate organizational rules for a given scenario. |
Chapter 5 Chapter 6 |
|
|
|
|
|
b. Describe the relationship between a logical model and a physical model. |
Chapter 4 |
|
|
|
|
|
c. Evaluate importance of database constraints. |
Chapter 3 |
|
|
|
|
|
d. Design a physical model for the best performance including impact of normalization and indexes18. |
|
|
|
|
|
|
e. Compare and contrast the differences and similarities between the relational and the dimensional data modeling (OLTP vs. OLAP) |
|
|
|
|
|
|
ITE-IMA-04 Database query languages [L3] (Page 92 for this section.) |
|
|
|
|
|
|
a. Create, modify, and query database objects using the Structured Query Language (SQL). |
Chapter 9 |
|
|
|
|
|
b. Perform filtering and sorting data using various clauses including where, order by, between, like, group by, and having. |
Chapter 9 |
|
|
|
|
|
c. Use joins to select data across multiple tables. |
Chapter 9 |
|
|
|
|
|
d. Use embedded SQL queries. |
Chapter 12 |
|
|
|
|
|
e. Perform calculations in a query using calculated fields and aggregate functions. |
Chapter 9 |
|
|
|
|
|
f. Create updatable and non-updatable views |
Chapter 12 |
|
|
|
|
|
ITE-IMA-05 Data organization architecture [L3] (Page 92 for this section.) |
|
|
|
|
|
|
a. Demonstrate select, project, union, intersection, set difference, and natural join relational operations using simple example relations provided. |
Chapter 9 |
|
|
|
|
|
b. Contrast and compare relational databases concepts and non-relational databases including object-oriented, XML, NewSQL and NoSQL databases. |
Chapter 8 |
|
|
|
|
|
c. Express the relationship between functional dependencies and keys, and give examples. |
Chapter 7 |
|
|
|
|
|
d. Evaluate data integrity and provide examples of entity and referential integrity. |
Chapter 3 Chapter 6 |
|
|
|
|
|
e. Analyze how data fragmentation, replication and allocation affect database performance. |
|
|
|
|
|
|
ITE-IMA-06 Special-purpose databases [L1] (Page 92 for this section.) |
|
|
|
|
|
|
a. Describe major concepts of object oriented, XML, NewSQL, and NoSQL databases. |
Chapter 8 |
|
|
|
|
|
b. Demonstrate an understanding of online analytical processing and data warehouse systems. |
|
|
|
|
|
|
c. Describe methods of data mining and what insights may be gained by these methods. |
|
|
|
|
|
|
ITE-IMA-07 Managing the database environment [L2] (Page 92 for this section.) |
|
|
|
|
|
|
a. Contrast and compare data administration and database administration. |
|
|
|
|
|
|
b. Describe tasks commonly performed by database administrators. |
|
|
|
|
|
|
c. Create and manage database users, roles, and privileges19. |
|
|
|
|
|
|
d. Consider the concept of database security and backup and recovery. |
Chapter 3
|
|
|
|
|
|
e. Evaluate the importance of metadata in database environment. |
Chapter 3 |
Software Engineering
Software Engineering 2014 (https://www.acm.org/binaries/content/assets/education/se2014.pdf ) From page 27, the cognitive skill level for each topic is specified as follows: • Knowledge (k): Remembering previously learned material. Test observation and recall of information; that is, “bring to mind the appropriate information” (such as dates, events, places, knowledge of major ideas, and mastery of subject matter). • Comprehension (c): Understanding information and the meaning of material presented. For example, being able to translate knowledge to a new context, interpret facts, compare, contrast, order, group, infer causes, predict consequences, and so forth. • Application (a): Using learned material in new and concrete situations. For example, using information, methods, concepts, and theories to solve problems requiring the skills or knowledge presented. The exact details are not provided. |
Chapters |
||
|
Computing Essentials (Page 28) This includes the computer science foundations that support software product design and construction. |
|
|
|
|
CMP.cf.10 Database fundamentals Comprehension coverage. And is essential part of the core. (Page 29) |
Chapter 2 |
|
Software Design (Page 32) |
|
|
|
|
DES.dd.2 Database design At the application level. And is an essential part of the core. |
|
United States National Security Agency (NSA)
In 2024 CAE Cyber Defense (CAE-CD) Knowledge Units document (https://dl.dod.cyber.mil/wp-content/uploads/cae/pdf/unclass-cae-cd_ku.pdf), there are three Knowledge Units that specifically address databases:
- Data Administration (DBA) (Page 53)
- Database Management systems (DMS) (Page 56)
- Databases (DAT) (Page 58)
The National Security Agency’s Knowledge Units have an arrangement that follows this scheme:
- Introduction
- Learning Outcomes
- Topics
- Vocabulary
- Related Knowledge Units
- Most Related DCWF and/or NICE Framework Work Role(s)
We will only list the learning outcomes. Many of the topics will be covered in the chapters.
Data Administration (DBA)
Data Administration (DBA) |
|
The intent of the Data Administration Knowledge Unit is to provide students with methods to protect the confidentiality, integrity, and availability of data throughout the data life cycle. |
Chapters |
Outcomes To complete this KU, students should be able to: 1. Draw and describe a data and information lifecycle, identifying specific and general security issues at all stages. 2. Define and evaluate data and information quality, accessibility, and utility. 3. Examine how the origination, change, distribution, storage, and deletion of information is managed and secured. 4. Compare and contrast data and information ownership, stewardship, management, possession, and governance. 5. Outline the role of data and information classification in security. |
1 |
2 |
|
3 |
|
4 |
|
5 |
|
Database Management Systems (DMS)
Database Management Systems (DMS) |
|
The intent of the Database Management Systems Knowledge Unit is to provide students with the skills to utilize database management system to solve specific problems |
Chapters |
Outcomes To complete this KU, students should be able to: 1. Compare and contrast database types including relational, hierarchical, distributed, and other models. 2. Describe the role of a database, a DBMS, and a database server within a complex system supporting multiple applications. 3. Apply SQL to create and administer databases and to manipulate the data they contain. 4. Describe DBMS access controls, privilege levels, and security principles and apply them to a simple database. 5. Outline common structures for storing data in a database management system. 6. Design and deploy a simple database for a specified application. |
1. Chapter 2 |
2. Chapter 2 |
|
3. Chapter 9 |
|
4. Chapter 12 |
|
5 |
|
6. This is the running project. |
|
Databases (DAT)
Databases (DAT) |
|
The intent of the Databases Knowledge Unit is to teach students how database systems are used, managed, and issues associated with protecting the associated data assets |
Chapters |
Outcomes To complete this KU, students should be able to: 1. Describe the role of a database, a database management system (DBMS), and a database server within a complex system supporting one or more applications. 2. Outline different models for databases and cases where they may be used. 3. Identify and describe common security concerns in databases and database management systems |
1. Chapter 2 |
2. Chapter 2 |
|
3. |
|
Association of Information Technology Professionals (CIPS)
Canada’s Association of Information Technology Professionals (CIPS, https://cips.ca) is an accreditation agency just like the Computing Accreditation Commission of the ABET. At this writing, the CIPS does not have any specific guidance concerning information management nor about databases.
New Chapters
In addition to embracing the best practice of using learning outcomes from two major credentialing agencies, we have added more information to the current chapters, combined some chapters, and added new chapters. The new chapters are:
- Chapter 10: “Relational Algebra, Relational Calculus, and Relational Theory”
- Chapter 11: “Under the Hood of the SQL Server”
- Chapter 12: “Protecting the DBMS”
- Chapter 13: “Programs to Work with a Database”
- Chapter 14: “An Example Website with Ca Database Connection”
- Chapter 15: “More on an Example Website with a Database Connection”
A running design project has been added. After a topic has been introduced, then a submodule uses questions to guide students to create their own final project.
Licensing20
This text is released under the Creative Commons license Attribution-NonCommercial-NoDerivs. This text is like a conventional book, in other words, except that it is free. You may copy the work and distribute it to others for any noncommercial use (and for some commercial uses; see below), but all reuse requires attribution.
The Creative Commons license does not precisely spell out what constitutes “noncommercial” use. The author considers any sale of printed copies of this book, even by a non-profit organization and even if the price just covers expenses, to be commercial use. Personal printing, and free distribution of printed selections, do qualify as noncommercial.
Classroom Use
This book is meant as a serious and more-or-less thorough text for an introductory college or graduate course in databases. The contents have been carefully researched, with consistent notation and style, and complete with diagrams and exercises.
Instructors who adopt this book in a course, as either a primary or a secondary text, are strongly encouraged to let us know, as this helps support continued work on the book. Below is a list of the institutions that have adopted our book. Commercial publishers get this information from sales records, but that won’t work here; if you want to see your institution listed, contact us!
- Douglas College
- CSIS 230 Database I
- Kwantlen Polytechnic University
- INFO 2312 Database Management Systems
- University of Maine at Presque Isle
- COS 350 Databases and Database Management Systems (Fall 2021 and Fall 2023)
Acknowledgments
We would like to thank the many Douglas College students that have provided invaluable feedback on the text and on the exercises. We would like to thank the following people from outside of Douglas College who have contributed technical or editorial comments. If we have missed anyone or their institution, please let us know.
For the third edition, I wish to thank Nelson Eng for his feedback and for his encouragement. This has been a labor of love. It is not perfect. I expect there are some typos and some paragraphs that need to be revised. I hope I can work on the fourth edition.
Fred L. Strickland, Ph.D.
Patent Examiner
United States Patent and Trademark Office.
Progress Notes
The third edition was started in early 2024. The third edition was declared complete as of xxx 202x.
Technical Considerations
The book was prepared on my computer using Microsoft Word. There was a brief period when Microsoft Word stopped working and I had to resort to using LibreOffice Writer. Then I uploaded each part to the BC Campus’ PressBooks website. I had to do some format corrections.
A Note on the Cover
The photograph is of a spiral staircase.
References
Matthew Hertz. “What Do ‘CS1’ and ‘CS2” Mean? Investigating Differences In the Early Courses.” SIGCSE’10: Proceedings of the 41st ACM technical symposium on Computer science education, Pages 199-203, March 10-13, 2010. The actual paper is available at https://cse.buffalo.edu/~mhertz/meaning-sigcse-2010.pdf