FarragoExtensionMetamodel

From Eigenpedia

Jump to: navigation, search

Contents

FEM

This document describes how Farrago metadata is modeled in UML. The UML model defined by Farrago is referred to as FEM, for Farrago Extension Metamodel, because

The metamodel defined by CWM sprawls over a large number of data management areas. In some areas, it is very generic and not up-to-date with the latest SQL standards. FEM specializes a subset of CWM (hiding irrelevant submodels) with specifics from SQL and Farrago. The CWM standard recommends that specialization be implemented via the tag mechanism (to further the goal of interoperability), but tags are inappropriate for the deep specialization required by Farrago, so we use inheritance instead. Where interoperability is desired, we plan to provide a "slicing" mechanism which can take a set of FEM objects, strip out all FEM-specifics, and produce a valid CWM XMI export. Likewise, a CWM import mechanism may be provided which fills in defaults for FEM-specifics.

In this document, we walk through the submodel areas covered by FEM, illustrating them via static structure UML class and object diagrams. Attributes are not always shown unless they are relevant to the discussion.


Containers

In SQL, first-class objects are defined in a three-level containment hierarchy: catalogs contain schemas, and schemas contain objects such as tables, views, and procedures. This containment is modeled in FEM using the Namespace/Package constructs inherited from CWM:

CatalogAndSchema.png

The ElementOwnership association is used to model both catalog/schema and schema/object containment. For example, suppose the following DDL is executed:

create schema blogs;
set schema 'blogs';
create table bloggers(
    id int not null primary key,
    name varchar(128) not null,
    handle varchar(128) not null unique,
    email_addr varchar(128) not null);

Then (ignoring columns for the moment) the following object instances would be created (or referenced in the case of the LOCALDB catalog, which already exists) and linked:

CatalogAndSchemaInstance.png

Note that any ModelElement can be contained in a schema, meaning extension models can define their own first-class objects beyond the standard ones by inheriting from ModelElement.

Tables

CWM refers to any table-like object as a NamedColumnSet. The association between a NamedColumnSet and the Column objects which define its structure is modeled on the more abstract notion of a class and its attributes:

ColumnSet.png

In particular, FEM defines tables and views as descendants of NamedColumnSet, and differentiates between columns of tables and views:

LocalTableAndView.png

Here is a corresponding object diagram for the previous example:

LocalTableInstance.png

Note that the generic ElementOwnership association is not used to model the containment of columns by a table; instead, the more specific ClassifierFeature association is used. However, since Classifier subclasses Namespace, the ElementOwnership association may be used for a table's containment of non-column objects such as constraints.

Columns

The information associated with a single column is modeled via attributes and an association to a datatype:

Column.png

Details of allowed datatypes are covered in a later section. The initialValue attribute is used to store any DEFAULT value (the Expression class is covered in the next section on views).

Identity columns have a sequence component (not shown) as described in AutoincrementColumns.

Views

SQL views have a body (an underlying query), result columns, and dependencies on objects referenced by their body (e.g. tables, other views, and functions). We have already seen how view columns are modeled. Here is a class diagram for the view body and dependencies:

View.png

A single instance of class Dependency is created per view to hold all dependencies, and only direct dependencies are recorded (rather than the transitive closure). Consider this addition to our running DDL example:


create table topics(
    id int not null primary key,
    blogger_id int not null references bloggers(id),
    name varchar(128) not null,
    creation_time timestamp not null);

create view blogger_topics as
select bloggers.name as blogger_name,topics.name as topic_name
from bloggers,topics
where bloggers.id=topics.blogger_id

create view breakfast_topics as
select *
from blogger_topics
where upper(topic_name) like '%BREAKFAST%';

And the corresponding object diagram (ignoring columns):

ViewInstance.png

Besides views, other kinds of dependencies are recorded by Farrago, but the rest are not covered by this document.

Constraints

Farrago currently supports primary keys and unique keys as constraints. (Foreign keys, check constraints other than NOT NULL, and assertions are not yet supported.) In CWM, these constraints are modeled as objects contained by a Table and with direct associations to the constrained columns (as subclasses of StructuralFeature):

CwmConstraints.png

Note the many-to-many association. This is difficult to query via SQL, and does not provide the constraint column ordinals needed for metadata views. Hence, in FEM, we add a redundant one-to-many association and intersection class, and another level of constraint hierarchy:

FemConstraints.png

Here's an object diagram for the constraints on the BLOGGERS table:

ConstraintInstance.png

Indexes

The object relationships for indexes are quite similar to constraints (but without the redundancy, and with an explicit association to the indexed table):

Index.png

Unlike constraints, indexes are contained directly by schemas, but the containing schema must be the same as the schema of the indexed table. Given the following DDL:

create index blogger_id_name_idx
on bloggers(id,name);

the following objects would be created and linked:

IndexInstance.png

Types

The allowable datatypes of elements such as columns and routine parameters are modeled as a hierarchy rooted at class SQLDataType:

Types.png

Instances of SQLSimpleType represent builtin types such as VARCHAR and INTEGER and are populated as singletons when the repository is initialized. Farrago also supports:

  • collection type constructors such as MULTISET; SQLMultisetType and SQLArrayType are parameterized by a component type, and so are instantiated anonymously as part of the definition of a typed element (for example, if columns A and B both declare their type to be INTEGER MULTISET, two anonymous instances of this type will be constructed)
  • anonymous structured types; SQLRowType is instantiated with the ROW type constructor as part of the definition of a typed element
  • named structured object types; SQLObjectType is instantiated with the CREATE TYPE statement and can subsequently be referenced by many typed elements

The attributes of structured types (whether anonymous or named) are instances of SQLTypeAttribute. For collection types, a single type attribute with name COMPONENT is created to represent the component type. Named user-defined types can also have associated instances of UserDefinedOrdering.

Finally, the CREATE TYPE statement also supports creation of distinct types, which are strongly-typed user-defined aliases for simple types:

DistinctType.png

For this example DDL:

create type url as (
    protocol varchar(10),
    host varchar(128),
    path varchar(1024)
) final;

create type ip_addr as bigint final;

create table access_log(
    id int not null primary key,
    resource url not null,
    client_addr ip_addr not null,
    post_data tinyint multiset
);

the following objects are created (or referenced in the case of builtins) and linked:

TypeInstance.png

Routines

User-defined routines include functions, procedures, and methods. They are first-class objects like tables and views, and are contained directly by schemas. Relationships between routines and their typed parameters are diagrammed here:

Routine.png

Note that Routine subclasses ColumnSet. This inheritance is not currently used, but will be in the future once procedures which produce result sets are supported. TBD: methods, UDX's, schema paths, and example object diagrams; also update class diagram for DYNAMIC_FUNCTION attribute described in the UDR docs.

MED

TBD: explanation and example object diagrams for SQL/MED. Also need to publish diagrams for SampleDatasetSubstitution.

Med.png

Security

TBD: explanation and example object diagrams for security model.

Security.png

Labels

Labels are defined in the MED package. The class has a self-referencing association, LabelHasAlias, which is used to model label aliases. commitSequenceNumber is optional because it's only set for base labels.

Image:Label.gif

Configuration

TBD: class diagram, explanation and example object diagrams for configuration parameters.

Fennel

See the Fennel JNI integration docs.

Attachments

  • Original content was migrated from static HTML; corresponding images are still checked into Farrago under //open/dev/farrago/doc/fem.
  • When adding new images, upload them to eigenpedia and link their image metadata pages from here.
Personal tools