FarragoAlterTableAddColumn

From Eigenpedia

Jump to: navigation, search

Contents

Overview

This page describes the design and implementation of Farrago support for ALTER TABLE ADD COLUMN, per SQL:2003 Part 2 Section 11.11.

Some other forms of ALTER TABLE (adding columns in the middle of a table, dropping columns, changing column types and default values, adding/removing constraints) remain unsupported and are not discussed here. However, Farrago does support altering identity columns (as described in AutoincrementColumns) as well as the non-standard ALTER TABLE REBUILD.

Syntax

ALTER TABLE table-name ADD [ COLUMN ] column-definition

column-definition ::= 
unqualified-column-name
data-type
[ DEFAULT default-value-literal | identity-column-specification ]
[ NOT NULL ]
  • according to SQL:2003, it should be possible to specify an added column as UNIQUE or PRIMARY KEY; however, including such constraint definitions is not currently supported (only NOT NULL is allowed)
  • identity-column-specification is as in AutoincrementColumns

Semantics

Values of the new column for existing rows are filled in with null, default value, or generated sequence value as specified. For sequences, there is no guarantee as to which rows gets assigned which sequence value, since row order is not a logical property of a table.

(In the future, when we support columns defined as generated from expressions, the values will be filled in by computing the expression against existing rows.)

Per SQL:2003, if table T has three columns before ALTER, an existing view which was defined as select * from T will still have only three columns after the ALTER, even though T now has four; the new column is not visible through V until V gets recreated.

Exceptions

Any of the following problems will cause the statement to fail, in which case the table definition will be automatically restored to its pre-ALTER state.

  • any syntax or validation problem with the column's data type, default value, or identity column specification (same rules as for CREATE TABLE)
  • adding a column with a name already in use by an existing column in the table
  • adding an identity column to a table which already has one
  • attempting to alter something (e.g. a view or foreign table) other than a local base table
  • attempting to alter a temporary table (SQL:2003 says this should be allowed for global temporary tables, so that remains a TODO)
  • attempting to alter a table on a local data server which does not support it (e.g. the mock local data server)
    • currently, FTRS (row-store) and LCS (column-store) are the two local data wrappers known to support ALTER TABLE ADD COLUMN
  • adding an identity column with NO CYCLE and a MINVALUE or MAXVALUE which prevents the sequence from being successfully generated for all existing rows in the table
  • adding a NOT NULL column with no default value to a non-empty table

Concurrency

ALTER TABLE ADD COLUMN is executed in a fashion similar to ANALYZE TABLE and ALTER TABLE REBUILD: it starts out as a normal DDL statement, with an exclusive lock on the catalog for making modifications, but then commits the catalog transaction and drops the catalog lock, turning into a DML statement for the actual execution. This avoids keeping the entire catalog locked for the (possibly long-running) execution, and also allows queries to proceed against the table being altered.

For FTRS, concurrent read/write execution has never been safely supported even for DML, so the effect of attempting to query or update a table being altered is currently undefined. Most likely queries and DML statements will crash due to mismatches between the old and new table structure.

For LucidDB, concurrent queries/DML/DDL are safely handled as follows:

  • queries which started before the ALTER TABLE statement will continue to run against the old snapshot of the table
  • queries which start after the ALTER TABLE statement (but before it completes) will fail if they attempt to reference the new column explicitly
    • this also applies to CREATE VIEW statements
  • queries which start after the ALTER TABLE statement (but before it completes) will succeed if they do not attempt to reference the new column, and will see the old snapshot of the table; if they use select *, they will see only the old columns
  • queries which start after the ALTER TABLE statement completes will see the new snapshot of the table (unless they have a label set, in which case they will see the appropriate old snapshot, and will only be able to reference columns whose creation completed before the label was created)
  • DML statements which target a table being altered will fail immediately, since ALTER TABLE takes a write lock
    • the same is true for concurrent attempts to execute TRUNCATE, ANALYZE, ALTER TABLE REBUILD, or another ALTER TABLE ADD COLUMN on the same table
  • attempts to DROP the table being altered will fail due to the "object in use" lock
  • CREATE INDEX on an existing column of the table will run concurrently with ALTER TABLE and should succeed
  • CREATE INDEX on the new column being added will fail (same as a query on the new column)
  • hot backups which started before the ALTER TABLE statement will save the pre-ALTER snapshot
  • hot backups which started after the ALTER TABLE statement but before it completes will also save the pre-ALTER snapshot
    • note that this is consistent with the behavior of a hot backup which starts while a DML statement is running
  • hot backups which started after the ALTER TABLE statement completes will save the post-ALTER snapshot

See LucidDbConcurrencyControl for background.

Implementation

Parsing and Validation

The parser for ALTER TABLE ADD COLUMN instantiates a new DdlAlterTableStructureStmt. The constructor for this class saves a copy of the old table definition; later this allows the optimizer to access both the old and new definitions. The old copy is not a persistent repository object; instead, it is a transient object cloned via JMI reflective calls. (Only the essentials of the table's column definitions are cloned, omitting dangling bits such as indexes and sequences.)

Once this is done, the new column definition is parsed (using the same production rule as for CREATE TABLE) and appended to the persistent definition of the table in the repository.

Previously, CREATE TABLE was catching the error of multiple identity columns on the same table inside of the parser itself; this validation rule has been moved to DdlRelationalHandler.validateLocalTable so that it can cover both CREATE and ALTER. validateLocalTable also covers other generic aspects of validating the new column definition, such as making sure that the type definition is acceptable.

Additional statement-specific rules (e.g. can't alter foreign table) are implemented in DdlAlterTableStructureStmt.preValidate.

A new SPI method supportsAlterTableAddColumn on FarragoMedLocalDataServer allows a local data wrapper implementation to declare whether it supports ALTER TABLE ADD COLUMN. (This is determined by the data wrapper rather than the session personality, which is why the generic PersonalityFeatureFramework isn't used in this case.)

Nothing special is required for guaranteeing the view semantics mentioned above, since views are already expanded to have only explicit column references at the time they are created.

Generic Execution

The general approach taken for reshaping the stored table is to execute a reentrant DML statement as in FarragoTableRebuild. By default, the statement to be used has the form

insert into T(oldColA, oldColB, ...) select oldColA, oldColB, ... from T

Notes:

  • As with FarragoTableRebuild, we want the reference to T in the INTO clause to reference a new empty table structure, whereas the reference to T in the FROM clause should reference the old table data
  • We do not mention the new column, so the reentrant SQL validator will fill out the insert statement with an expression to produce its default value (or sequence generator, or eventually an arbitrary expression)

Since all of this machinery was already available in DdlRebuildTableStmt, it has been factored out into a new abstract base DdlReloadTableStmt from which both DdlRebuildTableStmt and DdlAlterTableStructureStmt now derive. The reentrant SQL to be executed is supplied by subclasses via protected abstract method getReloadDml.

Because of the way we are abusing reentrant SQL for the self-insert, it's necessary for some other parts of the system (described below) to detect the fact that the SQL is being executed on behalf of ALTER TABLE and adjust their behavior accordingly. Since REBUILD was already injecting a custom implementation of FarragoSessionIndexMap into the reentrant session for root-swapping purposes, we add on a new method getOldTableStructure to this interface; when this returns non-null, it indicates that ALTER TABLE ADD COLUMN is executing, and provides the copy of the old table definition.

Note that as with ALTER TABLE REBUILD, it's necessary for ALTER TABLE ADD COLUMN to ignore the GENERATED ALWAYS clause on identity columns, since it needs to preserve the existing sequence values rather than attempting to regenerate them; this is handled by more special-casing in ReposDefaultValueFactory.

For some additional improvements in ALTER TABLE REBUILD (handling of rowcounts in the case of failure) made as part of this refactoring, see eigenjira:LDB-191.

FTRS Specifics

For row-store, the reentrant SQL above is already exactly what we need for physically reshaping stored tuples.

The only data server component which needs to be tweaked is FtrsIndexScanRel; when it sees from the session index map that ALTER TABLE is executing, it constructs its FtrsIndexGuide based on the old tuple structure so that the Fennel plan will be set up to correctly read the existing table format.

LucidDB Optimization

For column store, we don't really want to reshape all of the existing rows; we just want to create a new clustered index for the new column, fill it with generated values, and we're done.

The easiest way to accomplish this with minimal changes is as follows:

  • use the same reentrant SQL statement as above, but...
  • disable the index root-swapping (since we're not going to touch the existing indexes), and...
  • add an optimizer rule which projects away the existing columns from the source of the TableModificationRel, and...
  • adjust the physical implementation of LcsTableAppendRel to pretend that there is only one clustered index (the new one) in the whole table, and no unclustered indexes
    • --Jvs 01:32, 10 December 2008 (EST): it might be cleaner to introduce a new ALTER operation to TableModificationRel, since overloading INSERT could cause confusion

A new method isAlterTableAddColumnIncremental on FarragoSessionPersonality controls the disablement of index root-swapping. This has to be on the personality (rather than the local data wrapper) since the relevant optimizer rules can only be supplied by the personality. (--Jvs 01:32, 10 December 2008 (EST): There is room for improvement here in the plugin interactions between personalities and local data wrappers, especially if we ever want to be able to provide for both row-store and column-store tables mixed into the same database.)

The new optimizer rule is LcsTableAlterRule. It needs to fire early on in the planner program, but it depends on CoerceInputsRule firing first, which can disturb other plans, so this new pair of rules is only inserted into the program when ALTER TABLE is happening. (We currently use a private planner per statement, so this is OK.)

The LcsTableAppendRel physical implementation adjustments are in LcsAppendStreamDef and LcsIndexGuide.

One complication is the presence of deleted rows. We need to generate the data for the new column with corresponding entries matching deleted rows. This is accomplished by adjusting the existing LcsAddDeletionScanRule to detect ALTER TABLE; in that case, it ignores the real deletion index and instead feeds an empty set of deleted RID's (implemented via FennelValuesRel with no tuples) into the row scan. In the case where a new sequence is being added, it will have "holes" corresponding to the deleted rows; this is a slight nonconformance from SQL:2003, but is easily avoided by preceding the ALTER TABLE ADD COLUMN statement with an ALTER TABLE REBUILD (assuming no DML statements slip in between).

The plan comes out like this (in this example, a default value of NULL is being generated):

FennelToIteratorConverter: rowcount = 1.0, cumulative cost = 6.0
  LcsTableAppendRel(table=[[LOCALDB, X, T]]): rowcount = 1.0, cumulative cost = 5.0
    IteratorToFennelConverter: rowcount = 1.0, cumulative cost = 4.0
      IterCalcRel(expr#0=[{inputs}], expr#1=[null], expr#2=[CAST($t1):INTEGER], $f0=[$t2]): rowcount = 1.0, cumulative cost = 3.0
        FennelToIteratorConverter: rowcount = 1.0, cumulative cost = 2.0
          LcsRowScanRel(table=[[LOCALDB, X, T]], projection=[[LCS_RID]], clustered indexes=[[SYS$CLUSTERED_INDEX$T$I]]): rowcount = 1.0, cumulative cost = 1.0
            FennelValuesRel(tuples=[[]]): rowcount = 1.0, cumulative cost = 1.0

A final optimization is to elide the SplitterStreamDef in cases where only a single cluster is being appended; this is always true for ALTER TABLE ADD COLUMN, but is applicable to the column replacement optimization for MERGE, as well as INSERT into a single-column table.

Index Validity

A nuance to the implementation above is that when projecting away the existing columns, the optimizer still needs to keep one of them around as a dummy in order to loop over the existing rows. (This shows up as the LCS_RID projection in the plan example above.) The optimizer already had logic for this, since the same situation can occur in the case of SELECT COUNT(*) FROM T. The optimizer picks the index with the smallest number of stored disk pages (according to the last ANALYZE) to minimize I/O. Unfortunately, our new clustered index is going to show up as the best candidate since it only has one page (the root) so far!

To prevent the optimizer from choosing this index as the scan source, we need a way to mark it, so it's time for an addition to FEM:

Image:FemLocalIndexValid.png

This new isInvalid flag is set by DdlRelationalHandler as part of defining the new empty index, and cleared by DdlReloadTableStmt.completeAfterExecuteUnlocked once the statement completes successfully.

LcsIndexOptimizer checks this flag and skips invalid candidates when choosing which clustered index to read existing rows from.

It turns out that this flag also enables us to catch the concurrency case where a query tries to select from the new column while ALTER TABLE is still executing. In LcsIndexGuide, we check for attempts to use invalid indexes for scans and cause them to fail. However, as described later on, we also catch this earlier during query validation, so the optimizer-level detection is really just a sanity check.

As a slight tangent, it turns out that there was already a similar case for CREATE INDEX with unclustered indexes for both LCS and FTRS: we do not want the optimizer to try to pick an index which is still being built. Previously, we were using the inherited CwmModelElement attribute visibility for a similar purpose. That was a hack (and confusing since visibility is used for other purposes by DdlValidator), so this has been cleaned up to use the new FemLocalIndex.isInvalid attribute instead.

Migration note: since the default value for this new attribute is false, migrated catalogs will have all indexes valid. This is the reason for defining the sense of the attribute as isInvalid rather than isValid.

Recovery

What if something goes wrong during execution? This could be an online failure (e.g. not enough cache buffer pages, or sequence MAXVALUE hit), or a system crash.

The new table definition has already been committed at the beginning before the catalog lock is released for long-duration execution. So if anything goes wrong, it's necessary to undo the catalog change.

For online failures, this is handled by having DdlMultipleTxnStmt.completeAfterExecuteUnlocked get called in all cases (regardless of success or failure), and passing in the success/failure status as a new boolean parameter. (Previously, this method was only getting called on success.) In the failure case, method DdlAlterTableStructureStmt.recover takes care of undoing the catalog change. Due to the column visibility enforcement described in the next section, this should not cause trouble for any existing cached query plans; similarly, it is not possible that new views could have been created with references to the new column.

System crashes require some new infrastructure. An addition to FEM (in the MED package) allows us to track in-progress recoverable actions:

Image:FemRecovery.png

As explained by the comment box (but not shown in the diagram), an instance of new class FemRecoveryReference is tied to a CwmModelElement (such as a table being altered) via an instance of CwmDependency and marked with a RecoveryType indicating the in-progress action. (Currently, the only RecoveryType is ALTER_TABLE_ADD_COLUMN, but new ones will be added in the future for other operations needing to use this infrastructure.)

The FemRecoveryReference instance is created during the initial catalog transaction which adds the new column definition, and serves as a form of journal entry. If the ALTER TABLE statement completes online (whether success or failure), then this FemRecoveryReference instance is deleted by completeAfterExecuteUnlocked, since no further action is needed.

However, if the system crashes, then catalog recovery needs to occur the next time the database is started up. FarragoDatabase already had logic in place for dealing with backup/restore crashes, so this has been extended to scan for FemRecoveryReference instances and take the appropriate action (then delete those instances to indicate that recovery has completed). Recovery takes place in a single repository transaction in order to make it atomic.

Note that FemRecoveryReference instances are global (they are not part of any container); the corresponding class extent serves as the pending journal. In theory, we could get by without them and scan the whole catalog for invalid objects, but that could be very slow for large catalogs.

Migration note: this is a new class, so existing catalogs will have no pending recovery instances after migration.

Validation of Concurrent Queries

To implement the column visibility rules specified in the concurrency section above, FarragoPreparingStmt.getTable has the following logic for computing the columns of the SqlValidatorTable instance returned when looking up a local table:

  • If the table is undergoing ALTER TABLE ADD COLUMN (detected via the existence of a dependent FemRecoveryReference), then the last column is omitted.
  • If the session has a label set, then any columns with creation timestamps later than the label timestamp are omitted.

On the second point, there is a related subtlety in the implementation of ADD COLUMN. Namely, the creation timestamp of the new column needs to be touched at the end of execution. Otherwise, if a label is created during execution, it would look like the column was created earlier than the label, which is incorrect.

JDBC

JDBC metadata call DatabaseMetaData.supportsAlterTableWithAddColumn now returns true instead of false.

Tests

  • Tests for positive and negative non-concurrent scenarios for both FTRS and LCS are in farrago/unitsql/ddl/alterTableStruct.sql.
  • Tests for LucidDB concurrency scenarios are in luciddb/test/sql/concurrency/alteradd-dml.mtsql.
    • --Jvs 22:41, 9 December 2008 (EST): Many more concurrency scenarios are needed here for better coverage, e.g. BACKUP vs ALTER.
  • Tests for column visibility through labels are in luciddb/test/sql/txn/labelAlterTable.sql.

Test Instrumentation and Fault Injection

Some code paths are instrumented in order to enable the tests above.

For concurrency testing, we want to make it look like an ALTER TABLE is taking a long time, but setting up a big table is a pain. So instead, we introduce a "trap" integer property net.sf.farrago.ddl.DdlReloadTableStmt.sleep. When set, it indicates the number of milliseconds to sleep after dropping the catalog lock and before starting the reentrant execution. As a safety measure, this property is automatically cleared after the trap is hit.

For recovery testing, we want to be able to inject a fault which will leave the catalog in a state equivalent to a system crash without actually killing the JVM. So we introduce a boolean trap property net.sf.farrago.ddl.DdlReloadTableStmt.crash for this purpose; it triggers within completeAfterExecuteUnlocked, causing the FemRecoveryReference instance to be left present in the catalog. Note that these properties must be set via FarragoTestUDR.setFarragoProperty (not FarragoTestUDR.setSystemProperty).

FarragoTestUDR.setFarragoProperty,simulateCatalogRecovery can then be invoked from the test SQL script to call the FarragoDatabase catalog recovery method which would normally only be called when the system is started.

Loose Ends

  • According to SampleDatasetSubstitution#Functional specification, the columns of a sample dataset are supposed to cover all of the columns in the base table. Altering the table could violate this. Since the rule hasn't been enforced yet for CREATE TABLE (eigenjira:FRG-152), for now we can leave it unenforced for ALTER TABLE as well.
  • Garbage index roots can be left behind by failure (this was already an issue for ALTER TABLE REBUILD). Either clean these up, or provide a segment garbage collection utility as part of a general purpose physical database dump/check/repair facility.
Personal tools