InternalDynamicParamScoping
From Eigenpedia
This page explains the problem from FRG-83 and Farrago's solution. Here's the SQL in question:
create table t(a int, b int, c int) server sys_column_store_data_server; create index ita on t(a); create index itb on t(b); create view v as select * from t where a = 1 and b = 2; select * from v v1, v v2;
The query is a self-join, and the view being joined to itself is implemented by a bitmap index intersection, which requires system-generated dynamic parameters in the Fennel plan for communication between ExecStreams.
The problem is that the optimizer recognizes the self-join, meaning there's really only one input to the join, but it's referenced as both the left- and right-hand inputs to the join via aliasing. Since we don't currently materialize the result once and re-use it, this aliasing goes away at StreamDef-generation time: we generate two identical StreamDef subgraphs, so the exact same bitmap index intersection gets executed twice (or more depending on the join implementation). Here's the physical stream graph:
The stream subgraphs numbered {0,2,4,5,6} and {1,3,7,10,12} are the duplicates. The LbmIntersectExecStream 5 uses a dynamic parameter P1 to control LbmIndexScanExecStream 6, and another parameter P2 to control LbmIndexScanExecStream 2. Dynamic parameter references are indicated in red.
Unfortunately, the subgraphs are TOO identical: the subgraph rooted at LbmIntersectExecStream 7 also tries to use dynamic parameters P1 and P2 to communicate with streams 1 and 3. This causes Fennel to assert (correctly) when the ExecStreamGraph is executed, because a given dynamic parameter should only be created once.
The solution in cases where we don't materialize and reuse common subgraphs is to introduce dynamic parameter scoping during StreamDef generation. During rel manipulation, we make up logical dynamic parameter ID's via new method FennelRelImplementor.allocateRelParamId. During StreamDef generation, we convert these into physical parameter ID's as follows:
- In FarragoRelImplementor, we maintain a stack of scopes corresponding to the current path through the rel tree. Whenever we visit a child rel, we push a new scope onto the stack, and whenever we're done with a child, we pop.
- StreamDefs get physical parameter ID's via a new interface method FennelRelImplementor.translateParamId which takes a logical parameter ID as input and returns a physical one as output.
- The implementation for translateParamId walks up the stack looking for a scope which maps the logical parameter ID. If it finds one, it returns the corresponding physical ID. Otherwise, it allocates a new physical ID and maps it in the scope at the top of the stack.
- We'll introduce strong typing via Java holder classes to avoid getting logical and physical parameter ID's mixed up. As part of this, we'll make logical ID's 64-bit, since a lot of these may get generated and discarded during plan selection, and the generator uses a simple autoincrement counter. The strong type for logical ID's is FennelRelParamId; the strong type for physical ID's is FennelDynamicParamId.
This will handle the case in FRG-83, and more complicated cases such as a join implemented using a correlation variable which is set by the join and read by the right-hand subgraph; the scoping should work regardless of whether the aliasing occurs above the join or between the join and the usage.
However, this will not handle cases where a dynamic parameter is used between siblings rather than upstream/downstream. I don't know of any use-cases for that, but it should be easy to deal with this by putting in a dummy reference to the parameter from the rel which defines the enclosing scope; that parent rel just needs to call translateParamId and throw away the result.
Update 14-Jun-2007
There are some problems with the scope stack scheme above in the case of a plan with a Fennel/Java/Fennel sandwich, where the Fennel top and bottom of the sandwich need to reference the same dynamic parameter across the Java boundary. Namely, due to the way the hybrid codegen works, the Fennel code for the bottom half of the sandwich actually gets generated before the top half, so the physical dynamic param translation for the top isn't available yet.
Proposed solution:
- Replace the scope stack in FarragoRelImplementor with a Map<List<RelPathEntry>,RelScope>. The key of the map is a path through the RelNode DAG (starting at the root). What is the RelPathEntry? A RelNode isn't good enough, since a child may be directly aliased as multiple inputs to the same parent, e.g. J(T,T) where J is a self-join of table T. In other words, we need the edges in the path, not just the vertices. As is, we can't get this from visitFennelChild, because it doesn't accept a child-ordinal parameter. We could clean this up to match visitJavaChild, which already takes both the parent and the ordinal. This requires changing a bit of code, with backwards compatibility for any unseen dependencies if possible.
- As we traverse the RelNode graph for both Java and Fennel translation, maintain the "current path" in FarragoRelImplementor. Path maintenance can be done in visitFennelChild, plus overriding the visitJavaChild implementation inherited from JavaRelImplementor (it's declared as final, so override createFrame and visitChildInternal instead). There should be only one unified path (not one for Java and one for Fennel). In other words, both visitFennelChild and visitJavaChild push and pop from the same path.
- Change the existing FarragoRelImplementor.translateParamId logic to use the map instead of the stack. The logic for translation should remain the same: start with the full current path, and keep checking shorter and shorter prefixes (using them as lookup keys in the map) to see if a RelScope exists which binds the given FennelRelParamId; if none is found, instantiate a new entry in the map and add the binding to the new RelScope (or reuse an existing map entry if already created).
- FennelRelNodes which care about Java-sandwich children with dynamic parameters (e.g. the NLJ case which motivated all of this) must perform a dummy translateParamId call during Java codegen, even though they are not generating any Java code. The reason is that this will happen before the Fennel child codegen, putting the translation into the map so that the child can find it. This is done by overriding implementFennelChild.


