Schema Extensibility in Commercial Enterprise Software
This document describes how schema extensibility is typically achieved in commercial enterprise software applications; in particular, in customer relationship management (CRM) applications. A future update will discuss newer extensibility mechanisms such as native XML column types in databases.
Flexibility Requirements
In CRM applications, implementing organizations need to extend the customer data model with business-specific information. The data model extensions consist of adding single-valued and/or multiple-valued properties to the customer data record.
- The data model extensions must be efficient. The customer data tables typically have tens of thousands to hundreds of thousands of records in a business-to-business setting, and millions to tens of millions of records in a business-to-consumer setting.
- The data model extensions must facilitate efficient query and join processing in the database.
- The data model extensions must not prevent the vendor from easily upgrading the implementation in the field.
- The data model extensions must allow the vendor to ship a single binary code image that can be run in all customer implementations.
General Approach
Vendors typically use a metadata-driven approach:
- Data model extensions are limited to certain tables and are described with schema extension tables.
- The customer of the application can extend the schema by describing the schema extension in a UI, which saves records in the schema extension tables.
- Once the customer has finished designing the schema extension, the application extends the schema by adding columns to the extensible tables (for single-valued properties) or by creating new sub-tables (for multi-valued properties).
Example: Single-Valued Schema Extension
The implementing organization wishes to add a ???????favorite color??????? field to their customer records. The extension table has the following schema:
| Extension Table Schema: | |
| COLUMN NAME | COLUMN TYPE |
| Table_name | Varchar |
| Prop_name | Varchar |
| Prop_type | Varchar |
| Prop_length | Integer |
| Prop_precision | Integer |
| Prop_singlevalued | Boolean |
Through a UI, the user creates the following record:
| COLUMN NAME | COLUMN VALUE |
| Table_name | Customer |
| Prop_name | Favorite_Color |
| Prop_type | Varchar |
| Prop_length | 25 |
| Prop_precision | Null |
| Prop_singlevalued | True |
When the user is satisfied (ie when the user hits a button), the application executes the schema change by issuing the following SQL statement:
ALTER TABLE Customer ADD COLUMN Favorite_Color varchar(25);
Notes:
The Column_precision column will only be used for NUMERIC fields.
Extension columns must typically be NULL-able or the user must provide a default value for the new column.
Example: Multi-Valued Schema Extension
The implementing organization wishes to associate a list of purchased products with a customer record. The schema extension record looks like this:
| COLUMN NAME | COLUMN VALUE |
| Table_name | Customer |
| Prop_name | Purchases |
| Prop_type | Integer |
| Prop_length | 11 (or NULL) |
| Prop_precision | Null |
| Prop_singlevalued | False |
The application executes the following schema change in response:
CREATE TABLE Customer_Purchases(
Customer_id INT NOT NULL,
Purchases_id INT NOT NULL,
Purchases_Value INT (11),
PRIMARY KEY (customer_id, purchases_id),
FOREIGN KEY (customer_id)
REFERENCES Customer (customer_id)
);
Thus, the Customer_Purchases table associates a set of Purchases values with the Customer record.
Scalability Issues
Record Length Limits and Views
Databases typically limit the size of a single record. Implementations deal with this by putting all the extension fields in a separate table joined back to the main table. If necessary, a view can be created to maintain the illusion of a single table:
-- this table is not extensible: CREATE TABLE Customer_Base ( Customer_id INT NOT NULL PRIMARY KEY, [OTHER FIELDS GO HERE] ); -- this table is extensible CREATE TABLE Customer_Ext ( Customer_id INT NOT NULL PRIMARY KEY, ); -- add view CREATE VIEW Customer AS SELECT CB.*, CX.* FROM Customer_Base CB JOIN Customer_Ext CX ON CB.customer_id = CX.customer_id
The view is mainly there to hide the details of the customer schema from report writers.
Indexes
It is generally not a good idea to allow users to easily create indexes on the single-valued schema extension columns. Indexes impose overhead and should only be added to extension fields after it is determined that the index will result in a speedup that justifies the index maintenance overhead.
August 27th, 2006