Creating a database plan
The first step in creating a database is creating a plan that serves both as a guide to be used when implementing the database and as a functional specification for the database after it has been implemented. The complexity and detail of a database design is dictated by the complexity and size of the database application as well as the user population.
The nature and complexity of a database application, as well as the process of planning it, can vary greatly. A database can be relatively simple and designed for use by a single person, or it can be large and complex and designed, for example, to handle upwards of half a billion rows similar to those designed for the Web server logs used by the Site Tracker application.
Before creating a database, you must have a good understanding of the function the database is expected to perform. If the database is to replace a paper-based or manually performed information system, the existing system will give you most of the information you need. It is important to gather all available information to find out how users interact with the data and what they need from the database. It is also important to identify what they want the new system to do, as well as to identify the problems, limitations, and bottlenecks of any existing system.
One of the strengths of a relational database is the ability to relate or associate information about various items in the database. Isolated types of information can be stored separately, but the database engine can combine data when necessary. Identifying the relationships between objects in the design process requires looking at the tables, determining how they are logically related, and adding relational columns that establish a link from one table to another.
Database creation/modification steps
On any new system the DBA should be brought into the project around the time that the requirements are finalized. The database design steps should follow the pattern below
- The project leader will schedule meetings with the DBA to discuss requirements, database layout, database schema and any other necessary information.
- Once the DB schema has been verbally agreed upon, the DBA will create the physical DB and give the developer access to it.
- The developer will then take the plan and develop the database per the previous meetings, requirements, and discussions.
- Meet with the DBA to review, and finalize, the design to ensure that it meets all requirements.
- Once the DBA has given approval that the schema is accurate and complete, the database is locked and no new schema changes will be made by anyone other than the DBA.
Any changes to a database or permissions need to be requested by sending a database change request to the DBA. You can find this form under Shared Documents on the ISD&S SharePoint site
. The DBA will be the primary person who can move new database changes to the production servers. Please give at least two days notice to allow for code review before the code is migrated. In the event that the DBA is unavailable the group leaders will have access to production as well.
The logical design of the database, including the tables and the relationships between them, is the core of an optimized relational database. A good logical database design can lay the foundation for optimal database and application performance. A poor logical database design can impair the performance of the entire system.
Normalizing a logical database design involves using formal methods to separate the data into multiple, related tables. A greater number of narrow tables (with fewer columns) is characteristic of a normalized database. A few wide tables (with more columns) are characteristics of an non-normalized database.
In relational-database design theory, normalization rules identify certain attributes that must be present or absent in a well-designed database. A complete discussion of normalization rules goes well beyond the scope of this topic. However, there are a few rules that can help you achieve a sound database design:
- A table should have a numeric or uniqueidentifier (GUID) primary key
- A table should store only data for a single type of entity (all fields should relate directly to the key)
For example: Avoid storing information about a student and his/her test scores in the same table
- A table should avoid nullable columns
- A table should use default values where appropriate
- A table should not have repeating values or columns
For example: TEST_SCORE_1, TEST_SCORE_2 and so on
As normalization increases, so do the number and complexity of joins required to retrieve data. Too many complex relational joins between too many tables can hinder performance. Reasonable normalization often includes few regularly executed queries that use joins involving more than four tables.
Enforcing data integrity ensures the quality of the data in the database. One of the more common forms of data integrity is referential integrity.
Referential integrity preserves the defined relationships between tables when records are entered or deleted. In Microsoft SQL Server, referential integrity is based on relationships between foreign keys and primary keys or between foreign keys and unique keys. Referential integrity ensures that key values are consistent across tables. Such consistency requires that there be no references to nonexistent values and that if a key value changes, all references to it change consistently throughout the database.
One of the functions of a database is to protect the data by preventing certain users from seeing or changing highly sensitive data and preventing all users from making costly mistakes. For this reason, each application will use a separate user with specific permissions to only the data needed to provide a successful implementation.
When you design a database, you must ensure that the database performs all the important functions correctly and quickly. Some performance issues can be resolved after the database is in production, but other performance issues may be the result of a poor database design and can be addressed only by changing the structure and design of the database.
When you design and implement a database, you should identify the large tables in the database and the more complex processes that the database will perform, and give special consideration to performance when designing these tables. Also consider the effect on performance of increasing the number of users who can access the database.
In conjunction with correct database design, correct use of indexes is important for achieving good performance.
Indexes in databases are similar to indexes in books. In a book, an index allows you to find information quickly without reading the entire book. In a database, an index allows the database program to find data in a table without scanning the entire table. An index in a book is a list of words with the page numbers that contain each word. An index in a database is a list of values in a table with the storage locations of rows in the table that contain each value. Indexes can be created on either a single column or a combination of columns in a table. An index contains an entry with one or more columns (the search key) from each row in a table.