Every small decision at the start of the database design process has a big impact on the result. In past projects, I have seen clients implement an unoptimized database design suggested by a product vendor's engineers and pay the price of that bad decision years later: reworking the initial design, manually fixing accumulated data, and enduring the stress of reevaluating the database against new requirements. Problems like these arrive quickly if you don't account for the future state of your clients' data. To help you avoid common mistakes that can affect your applications, here are 10 best practices to consider when designing a database.
1. Involve Product Owners and Stakeholders Early
Get agreement on expectations early, and consider the database's usability, and the effort required to train users, from the stakeholders' perspective.
2. Choose A Database Type
Depending on how structured (or flexible) your data is, you may want to choose between SQL and NoSQL database types. SQL databases are the standard for structured data with high integrity requirements; the most common are Oracle, SQL Server, MySQL, and SQLite. NoSQL databases offer more speed, scalability, and flexibility in requirements, which makes them a better fit for applications like machine learning, web analytics, social networks, and the Internet of Things (IoT). Some examples of NoSQL databases are RavenDB, Couchbase, and MongoDB.
3. Data Normalization is Good
The more complex the structure and needs of the business, the more critical it becomes to normalize data to achieve minimum redundancy.
Multi-valued attributes are attributes of an entity that can hold more than one value for the same primary key. For example, a gamer may own several gaming platforms or consoles, so "console" or "game platform" should be modeled as its own entity to allow the data to be normalized. Reduce or avoid multi-valued attributes and duplicated attributes as much as possible.
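As a minimal sketch of this normalization step, here is the gamer/console example using Python's built-in sqlite3 module (SQLite being one of the SQL databases mentioned above); the table and column names are illustrative, not prescribed by the article:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Unnormalized, a multi-valued attribute would look like
--   gamer(gamer_id, name, consoles) with consoles = 'PS5, Switch'
-- Normalized, each console becomes its own row in a separate entity,
-- linked back to the gamer by a foreign key.
CREATE TABLE gamer (
    gamer_id INTEGER PRIMARY KEY,
    name     TEXT NOT NULL
);
CREATE TABLE console (
    console_id INTEGER PRIMARY KEY,
    gamer_id   INTEGER NOT NULL REFERENCES gamer(gamer_id),
    model      TEXT NOT NULL
);
""")
conn.execute("INSERT INTO gamer (gamer_id, name) VALUES (1, 'Alice')")
conn.executemany(
    "INSERT INTO console (gamer_id, model) VALUES (?, ?)",
    [(1, "PS5"), (1, "Switch")],
)

# One gamer, two consoles: no gamer attributes are duplicated,
# and adding a third console is just one more row.
rows = conn.execute(
    "SELECT g.name, c.model FROM gamer g JOIN console c USING (gamer_id)"
).fetchall()
print(rows)
```

The same schema shape applies to any multi-valued attribute: move the repeating values into their own table keyed back to the parent entity.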
4. Focus on Transparency
Think of the future database users during the design process. Avoid the use of abbreviations, plurals, or random words to name the objects in the database. It is better to stick to consistent naming conventions. Change is a constant and the design will likely need to be changed in the future, so don’t make it difficult for those who may need to extend the database design.
5. Define Constraints to Maintain Data Integrity
Constraints help you maintain data integrity and avoid inconsistencies and bugs in the software. Applications usually have a validation layer in place that stops some bad data from being persisted, but we should enforce protection against bad data at the database level as well. Some recommendations to ensure data integrity are:
Always validate input data.
Implement access controls.
Keep an audit trail.
Always back up data.
Adopt security best practices.
Educate your workforce.
Use change control (file versioning) tools.
Plan for business continuity.
Depending on the database implementation, you can apply constraints to different database objects. For Oracle databases, the following chart can help you identify the types of constraints and the objects they apply to:
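As a sketch of database-level enforcement, the snippet below uses Python's sqlite3 module to show NOT NULL, UNIQUE, CHECK, and FOREIGN KEY constraints rejecting bad rows even when no application-level validation exists (the account/payment schema is illustrative):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs per connection
conn.executescript("""
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    email      TEXT NOT NULL UNIQUE,
    balance    REAL NOT NULL CHECK (balance >= 0)
);
CREATE TABLE payment (
    payment_id INTEGER PRIMARY KEY,
    account_id INTEGER NOT NULL REFERENCES account(account_id),
    amount     REAL NOT NULL CHECK (amount > 0)
);
""")
conn.execute("INSERT INTO account VALUES (1, 'a@example.com', 100.0)")

# Each of these rows is rejected by the database itself, regardless of
# whatever validation the application layer did (or forgot to do).
bad_rows = [
    "INSERT INTO account VALUES (2, 'a@example.com', 50.0)",  # UNIQUE violated
    "INSERT INTO account VALUES (3, 'b@example.com', -1.0)",  # CHECK violated
    "INSERT INTO payment VALUES (1, 99, 10.0)",               # FK: no account 99
]
errors = 0
for stmt in bad_rows:
    try:
        conn.execute(stmt)
    except sqlite3.IntegrityError:
        errors += 1
print(errors)  # all three bad inserts are blocked at the database level
```

Other databases expose the same constraint types with slightly different syntax; the principle of pushing integrity rules down to the schema is the same.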
6. Document as much as needed
Although it might seem annoying, documentation is an essential piece of the database design. It is important to document the design, entity-relationship schemas, triggers, and other database objects for future users and engineers.
7. Plan for Disaster
Disaster recovery, from both natural and man-made disasters, is an important consideration before delving too deeply into the design. Put strategies in place ahead of time, such as regular backups stored on separate infrastructure and a tested restore procedure, to ensure that the database structure and data remain recoverable.
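One basic building block of such a strategy is a scripted, automated backup. As a minimal sketch using sqlite3's online backup API (`Connection.backup`), with an in-memory database standing in for the production one and for the backup destination:

```python
import sqlite3

# A small source database standing in for the production database.
src = sqlite3.connect(":memory:")
src.execute("CREATE TABLE orders (order_id INTEGER PRIMARY KEY, total REAL)")
src.execute("INSERT INTO orders VALUES (1, 19.99)")
src.commit()

# Online backup: copies the database while it remains available.
# In a real deployment the destination would be a file on separate
# storage, e.g. sqlite3.connect("/backups/app-2024-01-01.db").
dst = sqlite3.connect(":memory:")
src.backup(dst)

# Verify the backup actually restores: an untested backup is not a backup.
restored = dst.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(restored)
```

Server-class databases ship equivalent tooling (e.g. dump/restore utilities and point-in-time recovery); whichever you use, scheduling the backup and periodically testing the restore are the parts that matter.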
8. Keep Privacy Primary
Privacy should be a priority for business users and customers alike. Store passwords salted and hashed rather than in plain text, and assign an administrator with privacy training and qualifications. Vulnerabilities have a severe impact on data integrity, which in turn affects everything else in the business. Keep the database as closed as possible.
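As a sketch of protecting stored passwords, here is salted hashing with Python's standard-library `hashlib.pbkdf2_hmac` (the iteration count shown is a tunable cost parameter, not a value prescribed by the article):

```python
import hashlib
import hmac
import os

def hash_password(password: str, salt: bytes) -> bytes:
    # PBKDF2-HMAC-SHA256 with a per-user random salt; the iteration
    # count deliberately makes brute-forcing leaked hashes expensive.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 600_000)

def verify(password: str, salt: bytes, stored: bytes) -> bool:
    # Constant-time comparison avoids leaking timing information.
    return hmac.compare_digest(hash_password(password, salt), stored)

salt = os.urandom(16)  # store the salt next to the hash; never reuse it
stored = hash_password("s3cret!", salt)
print(verify("s3cret!", salt, stored))  # correct password verifies
print(verify("wrong", salt, stored))    # wrong password does not
```

The point is that the database never holds the password itself, so a leaked table does not directly reveal user credentials.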
9. Optimize for Speed
Optimize by creating indexes for the queries you run regularly. Use a database analyzer to determine whether an index or a clustered index is necessary. Here are some database optimization tools that might prove useful depending on the requirements:
EverSQL - Free, supports MySQL, PostgreSQL, MongoDB databases, Windows, Linux, macOS operating systems, and cloud platforms.
Paessler PRTG Network Monitor - Although it does not perform optimizations itself, it provides details about the analyzed queries and how to optimize them.
dbForge Studio - May help to identify bottlenecks and analyze queries to optimize.
Plan Explorer - Free, but only available on Windows.
Holistic.dev - Only supports PostgreSQL databases.
Database Performance Analyzer - Uses cloud and machine learning techniques to identify optimization points in the database.
ApexSQL - Affordable subscription for business. Allows analysis of SQL Server execution plan.
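Before reaching for a dedicated tool, most databases can show you their query plan directly. As a sketch using sqlite3's `EXPLAIN QUERY PLAN` (the orders table is illustrative), you can watch a regularly-run query switch from a full table scan to an index seek:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer TEXT, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer, total) VALUES (?, ?)",
    [(f"customer-{i % 100}", float(i)) for i in range(1000)],
)

query = "SELECT total FROM orders WHERE customer = ?"

# Without an index on the filtered column, the planner scans every row.
plan_before = conn.execute(
    "EXPLAIN QUERY PLAN " + query, ("customer-7",)
).fetchall()[0][-1]

# After indexing the column used in the WHERE clause, it seeks instead.
conn.execute("CREATE INDEX idx_orders_customer ON orders (customer)")
plan_after = conn.execute(
    "EXPLAIN QUERY PLAN " + query, ("customer-7",)
).fetchall()[0][-1]

print(plan_before)  # a SCAN of the orders table
print(plan_after)   # a SEARCH ... USING INDEX idx_orders_customer
```

Other engines offer the same facility under names like `EXPLAIN` (MySQL, PostgreSQL) or execution plans (SQL Server); the analyzers listed above largely automate reading these plans for you.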
10. Keep the Database on Its Own Server
Host the database on its own server, separate from the web server, to lower CPU contention. A dedicated server also allows additional security and access controls, keeping the data out of the reach of unauthorized users.
These are some of the best practices to consider. They are not a strict ruleset: every business has its own constraints and needs, and some recommendations may not fit. Keep flexibility in mind, and refine, redirect, or break these recommendations to best accommodate your needs while staying in line with your business goals.