FarragoExtensionMetamodel
From Eigenpedia
Contents |
FEM
This document describes how Farrago metadata is modeled in UML. The UML model defined by Farrago is referred to as FEM, for Farrago Extension Metamodel, because
- FEM extends the Common Warehouse Metamodel (CWM)
- FEM is designed to be further extended by systems which extend Farrago
The metamodel defined by CWM sprawls over a large number of data management areas. In some areas, it is very generic and not up-to-date with the latest SQL standards. FEM specializes a subset of CWM (hiding irrelevant submodels) with specifics from SQL and Farrago. The CWM standard recommends that specialization be implemented via the tag mechanism (to further the goal of interoperability), but tags are inappropriate for the deep specialization required by Farrago, so we use inheritance instead. Where interoperability is desired, we plan to provide a "slicing" mechanism which can take a set of FEM objects, strip out all FEM-specifics, and produce a valid CWM XMI export. Likewise, a CWM import mechanism may be provided which fills in defaults for FEM-specifics.
In this document, we walk through the submodel areas covered by FEM, illustrating them via static structure UML class and object diagrams. Attributes are not always shown unless they are relevant to the discussion.
Containers
In SQL, first-class objects are defined in a three-level containment
hierarchy: catalogs contain schemas, and schemas contain objects such
as tables, views, and procedures. This containment is modeled in FEM
using the Namespace/Package constructs inherited from CWM:
The ElementOwnership association is used to model both
catalog/schema and schema/object containment. For example, suppose
the following DDL is executed:
create schema blogs;
set schema 'blogs';
create table bloggers(
id int not null primary key,
name varchar(128) not null,
handle varchar(128) not null unique,
email_addr varchar(128) not null);
Then (ignoring columns for the moment) the following object instances
would be created (or referenced in the case of the
LOCALDB catalog, which already exists) and linked:
Note that any ModelElement can be contained in a schema,
meaning extension models can define their own first-class objects
beyond the standard ones by inheriting from ModelElement.
Tables
CWM refers to any table-like object as a NamedColumnSet.
The association between a NamedColumnSet and the
Column objects which define its structure is modeled on
the more abstract notion of a class and its attributes:
In particular, FEM defines tables and views as descendants of
NamedColumnSet, and differentiates between columns of
tables and views:
Here is a corresponding object diagram for the previous example:
Note that the generic ElementOwnership association is
not used to model the containment of columns by a table;
instead, the more specific ClassifierFeature association
is used. However, since Classifier subclasses
Namespace, the ElementOwnership association
may be used for a table's containment of non-column objects such as
constraints.
Columns
The information associated with a single column is modeled via attributes and an association to a datatype:
Details of allowed datatypes are covered in a later
section. The initialValue attribute is used to store
any DEFAULT value (the Expression class is covered in the
next section on views).
Identity columns have a sequence component (not shown) as described in AutoincrementColumns.
Views
SQL views have a body (an underlying query), result columns, and dependencies on objects referenced by their body (e.g. tables, other views, and functions). We have already seen how view columns are modeled. Here is a class diagram for the view body and dependencies:
A single instance of class Dependency is created per view
to hold all dependencies, and only direct dependencies are recorded
(rather than the transitive closure). Consider this addition to
our running DDL example:
create table topics(
id int not null primary key,
blogger_id int not null references bloggers(id),
name varchar(128) not null,
creation_time timestamp not null);
create view blogger_topics as
select bloggers.name as blogger_name,topics.name as topic_name
from bloggers,topics
where bloggers.id=topics.blogger_id
create view breakfast_topics as
select *
from blogger_topics
where upper(topic_name) like '%BREAKFAST%';
And the corresponding object diagram (ignoring columns):
Besides views, other kinds of dependencies are recorded by Farrago, but the rest are not covered by this document.
Constraints
Farrago currently supports primary keys and unique keys as
constraints. (Foreign keys, check constraints other than NOT NULL,
and assertions are not yet supported.) In CWM, these constraints are
modeled as objects contained by a Table and with direct
associations to the constrained columns (as subclasses of
StructuralFeature):
Note the many-to-many association. This is difficult to query via SQL, and does not provide the constraint column ordinals needed for metadata views. Hence, in FEM, we add a redundant one-to-many association and intersection class, and another level of constraint hierarchy:
Here's an object diagram for the constraints on the BLOGGERS
table:
Indexes
The object relationships for indexes are quite similar to constraints (but without the redundancy, and with an explicit association to the indexed table):
Unlike constraints, indexes are contained directly by schemas, but the containing schema must be the same as the schema of the indexed table. Given the following DDL:
create index blogger_id_name_idx on bloggers(id,name);
the following objects would be created and linked:
Types
The allowable datatypes of elements such as columns and routine
parameters are modeled as a hierarchy rooted at class
SQLDataType:
Instances of SQLSimpleType represent builtin types such
as VARCHAR and INTEGER and are populated as
singletons when the repository is initialized. Farrago also supports:
- collection type constructors such as
MULTISET;SQLMultisetTypeandSQLArrayTypeare parameterized by a component type, and so are instantiated anonymously as part of the definition of a typed element (for example, if columns A and B both declare their type to beINTEGER MULTISET, two anonymous instances of this type will be constructed) - anonymous structured types;
SQLRowTypeis instantiated with the ROW type constructor as part of the definition of a typed element - named structured object types;
SQLObjectTypeis instantiated with theCREATE TYPEstatement and can subsequently be referenced by many typed elements
The attributes of structured types (whether anonymous or named)
are instances of SQLTypeAttribute. For collection types,
a single type attribute with name COMPONENT is created to
represent the component type. Named user-defined types can also have
associated instances of UserDefinedOrdering.
Finally, the CREATE TYPE statement also supports creation
of distinct types, which are strongly-typed user-defined
aliases for simple types:
For this example DDL:
create type url as (
protocol varchar(10),
host varchar(128),
path varchar(1024)
) final;
create type ip_addr as bigint final;
create table access_log(
id int not null primary key,
resource url not null,
client_addr ip_addr not null,
post_data tinyint multiset
);
the following objects are created (or referenced in the case of builtins) and linked:
Routines
User-defined routines include functions, procedures, and methods. They are first-class objects like tables and views, and are contained directly by schemas. Relationships between routines and their typed parameters are diagrammed here:
Note that Routine subclasses ColumnSet.
This inheritance is not currently used, but will be in the future once
procedures which produce result sets are supported. TBD: methods,
UDX's, schema paths, and example object diagrams; also update class
diagram for DYNAMIC_FUNCTION attribute described in the UDR docs.
MED
TBD: explanation and example object diagrams for SQL/MED. Also need to publish diagrams for SampleDatasetSubstitution.
Security
TBD: explanation and example object diagrams for security model.
Labels
Labels are defined in the MED package. The class has a self-referencing association, LabelHasAlias, which is used to model label aliases. commitSequenceNumber is optional because it's only set for base labels.
Configuration
TBD: class diagram, explanation and example object diagrams for configuration parameters.
Fennel
See the Fennel JNI integration docs.
Attachments
- Original content was migrated from static HTML; corresponding images are still checked into Farrago under //open/dev/farrago/doc/fem.
- When adding new images, upload them to eigenpedia and link their image metadata pages from here.


