about software programming techniques


JJ van Zon 2023

[ Draft ]

💽 Database Conventions

back

Database conventions help us keep things consistent and well-optimized in a way that cooperates well with other software components. It also provides a stable way to deploy database structure changes.

Contents

Developing a Database

Developing a database generally involves the following steps:

Keep in mind to limit the use of ‘exotic’ data types.

For instance: if a number would fit in a tinyint, use an ‘int’ anyway, because this saves the system a lot of casting, and a 32-bit number is better for memory / disk alignment, which is better for performance. Chances are, a system will reserve 32 bit for a 8-bit number anyway to accomplish this memory alignment. But this is just an example.

Here are the recommended data types:

Only if you need a bigger range:

Naming Conventions

Object Type Example Name
Database name ShopDB
Tables MyTable
Columns MyColumn
ID column ThingID
Indexes IX_MyTable_MyColumn
Primary keys PK_MyTable
Foreign keys FK_MyTable_OtherTable3
Or when there are multiple relations between the same tables:

FK_MyTable_<OtherTable_ColumnName_MinusID>
FK_MyTable_ThingA
FK_MyTable_ThingB
Unique keys When not many columns:
IX_MyTable_MyColumn_Unique

When many columns and only one constraint in the table:
IX_MyTable_Unique
Stored procedures SP_DoSomething / spDoSomething
Functions FN_DoSomething / fnDoSomething
Triggers TR_MyTable_OnInsert / trgMyTable_Insert / ...

Rules

Do not use the following object types, because these things are managed in .NET:

For new databases, prefer int’s as primary keys over guids, because guids create performance penalties throughout the software stack.

Only use additional guid columns as an alternative key for entities that need to be unique across multiple systems or databases. Do not forget to put an index on the guid column. Prefer surrogate keys rather than complicated composite keys. Prefer auto-incremented ID’s, except for enum-like tables.

< TODO: Mention: Security? Guids can be safe for security. For instance, for smaller underlying entities you could not guess the ID and sneekily change someone elses data, when only the user-ownership of higher objects are checked. >

For development databases use the “DEV_” prefix, e.g. DEV_ShopDB.

For test use the prefix TEST_ and for acceptance use the prefix ACC_. For production use no prefix at all.

On development databases add the user dev with password dev. For test add the user test with password test. For acceptance you might use specific user names depending on security demands, otherwise add user name acc with password acc. In production databases use the administrator user’s password with the administrator password for databases or create a separate user name for production with a strong password.

Upgrade Scripts

Database upgrade scripts are managed as follows.

Excel Sheet

Each database structure gets an Excel in which all the upgrade SQL scripts are registered.

The Excel sheet and SQL scripts are put in a Visual Studio project to manage them easily.

Always edit the Excel in the dev branch, because Excels cannot be merged.

The name of an SQL file has a specific format:

2014-08-28 040 ShopDB Supplier.Name not null.sql

So it has the format:

{Date} {Number} {DatabaseStructureName} {DatabaseObject}{SubDatabaseObject} {Change}.sql
Element Description Examples
Date Use the format yyyy-mm-dd 2014-08-28
Number Use 3 digits and count in 10’s so you might insert one in betweeen 040
DatabaseStructureName   ShopDB
DatabaseObject A table name or index name or other database object name Supplier
IX_Supplier_Name
FK_Supplier_Branch
SubDatabaseObject Optional. Usually a column name .Name
Change Optional. Usually left out. You can sometimes mention a specific change, but be brief. not null

In the Excel, add a column for each database instance for that database structure. There can be different databases with the same structure for different staging areas (dev, test, acc, prod) or a database for different customers or databases running on different servers. Put ‘TRUE’ (or ‘WAAR’ in Dutch) where the upgrade script has been executed. For instance:

Also include a column saying whether you have scripted it at all (for if you are in a hurry and have no time to script it). A release date column is also handy, to get some sense of when things went live.

Exceptional Cases

For upgrades that should only be executed on a specific database, put ‘N/A’ (or ‘N.V.T.’ in Dutch) in the appropriate spread sheet cell.

You can also add something to the SQL file name to indicate this:

2015-01-23 010 OrderDB SHOPDB ONLY Order.DeliveryDateTimeUtc.sql

Some things should be done manually and not with SQL. Those actions should also be mentioned in the Excel:

2015-01-23 020 OrderDB OrderID Identity Yes DO MANUALLY

If a script requires that you be extra careful, you can mention this as follows:

2015-01-23 010 OrderDB Order.DeliveryDateTimeUtc.sql CHECK MANUALLY
2015-01-23 010 OrderDB Order.DeliveryDateTimeUtc.sql EXECUTE SEPARATELY

But be sparse with that, because the person running the script might not actually know what it is he is supposed to check and will feel uneasy executing this script since it is obviously so dangerous, while he has no idea why.

Summary

This section covered:

Scripts

The individual upgrade SQL scripts should not contain GO statements. GO is not an SQL keyword, it is a Management Studio command telling it to execute the script up until that point. What must be separated by GO statements in Management Studio must be split up into multiple SQL files in the database upgrade scripts.

Also get rid of any automatically generated SET ANSI_NULLS ON and SET QUOTED_IDENTIFIER ON statements. Those are the default behavior anyway, and it just add unnecessary fluff to your scripts. Also: SET ANSI_NULLS OFF will generate an error in future versions of SQL Server anyway.

The upgrade scripts should be incremental: DO make assumptions about the previous state of the database structure and script a specific change. Do not write scripts like ‘if not exists’ then add, or ‘drop and create table’ scripts, because you may be throwing away data, or execute things on the wrong database. It is better to make a specific change and not be tolerant to differences.

DO NOT script changes from Identity Yes to Identity No or the other way around. Changes in the Identity property of a column require recreating the whole database table. If you script it now, executing it onto a database does not only add the Identity Yes property, it will also restore the whole table structure to the state it had at the time you scripted the Identity.

Summary

This section covered:

Deployment

To deploy multiple database structure changes you can use the Excel.

Always edit the Excel in the dev branch, because Excels cannot be merged.

You can easily see which scripts are still to be executed onto the database.

After you have executed them, put TRUE in the appropriate spread sheet cells.

You could execute the scripts one by one, but there is a handier, safer way to do it.

With some creative copying and pasting the SQL file names, you can create a composite upgrade script like this:

begin try
    print 'Begin transaction.';
    begin transaction;
    declare @verbose bit = 0;
    declare @folder varchar(255) = 'C:\JJ\Install\SqlScripts';

    exec spExecuteSqlFile @folder, '2015-01-23 010 OrderDB Order.DeliveryDateTimeUtc.sql', @verbose;
    exec spExecuteSqlFile @folder, '2015-01-23 010 OrderDB Order.DeliveryDateTimeUtc not null.sql', @verbose;
    
    --print 'Rolling back transaction.';
      --rollback transaction;

    print 'Committing transaction.';
    commit transaction;
end try
begin catch
    print Error\_Message();
    print 'Rolling back transaction.';
    rollback transaction;
end catch

This safely executes all changes in a single transaction and shows error information if something goes wrong.

This does require you to add the following stored procedure to the database:

create procedure spExecuteSqlFile(@folderPath varchar(255), @fileName varchar(255), @verbose bit = 0) as
begin
    set nocount on;

    print 'Executing ''' + @fileName + '''';
    declare @filePath varchar(255) = @folderPath + '\' + @fileName;
    declare @readFileSql varchar(1024) = 'select BulkColumn from openrowset(bulk ''' + @filePath + ''', single\_blob) x;'
    declare @temp table (contents varchar(max));
    insert into @temp exec (@readFileSql);
    declare @sql varchar(max);
    set @sql = (select top 1 contents FROM @temp);

    -- Remove BOM from UTF-8.
    if (LEFT(@sql, 3) = '') set @sql = RIGHT(@sql, LEN(@sql) - 3);

    exec (@sql);
    if (@verbose = 1) print @sql;
end

Summary

This section covered:

SqlScripts Table

Consider maintaining a list of executed database upgrade SQL scripts in a table, because it happens too often, that someone has put a database somewhere, without administrating the Excel file, no matter how many times you say it.

If you do this, then the stored procedure above would have to be changed so it inserts a record in this database table.

C#-Based Migrations

Some data migrations are easier to program using C# than SQL scripts.

Sometimes the contrast between how easy it is to do in C# or SQL is so large, that the benefits of programming it in C# outweigh the downsides. It could be a factor 20 difference in development time in some cases.

A benefit of SQL scripts is that it always operates on the right intermediate version of the entity model, while C# code always operates on the latest version of the entity model. This means that earlier C#-based migrations might not compile anymore for a newer version of the entity model, and can only work with an older version of the model.

This problem with C#-based migrations can be mitigated in several ways. Here are a few ideas:

Always Rerunnable Tool

Replace the one-off C# migration by a tool that does something more general, that can operate on any version of the model.

For instance, in a certain project, resaving most data to the database using newer business logic would set a lot of things right in the data and this procedure was rerunnable at any time, regardless of the version of the model. ‘Run the resaver’ would be the description in the list of data migrations to execute.

Get Specific Version, Build, Get Specific Version, Build

You can let the C#-based migration operate on a specific version of the model by getting the older version of the software from source control, then building it. Each time you have to do a C#-based migration, you can make a separate executable, that operates on a specific version of the code. As soon as a migration does not compile anymore, you can simply outcomment or remove it.

Snapshots of Entity Model

Storing a snapshot of an entity model in a separate project specifically intended for that migration might be a solution. (Not tried out in practice. Might turn out to be very impractical.)

Any Other Ideas

Are welcome.

back