We have a table in an SQL Server 2012 database that stores tree-like structures. Simplified for the purpose of my question, it has the following format:
Id int identity,
ParentId int,
GroupId int
Each record of the table represents an object identified by Id. An object may or may not have a parent in the same table, such that object.ParentId = parentObject.Id. A root object has ParentId = NULL. There are multiple root objects, so the table in fact stores multiple trees. What’s important is that the tree depth is not fixed, i.e. theoretically there can be any number of ancestor generations for an object. GroupId is a property of a root object; in theory none of the children of a root object has to have GroupId <> NULL; it can be assumed that any child has the same GroupId value as its root object.
A sample table having two roots (one grandparent and one parent), one non-root parent/child and 4 child roots:
Id ParentId GroupId
----------------------------------------------------------
1 NULL 200 root grandparent
2 1 NULL non-root parent/child
3 2 NULL child
4 2 NULL child
5 NULL 300 root parent
6 5 NULL child
7 5 NULL child
The table is not normalised, i.e. there’s no separate {root_object : group} table. However I don’t think normalising the table would have helped to solve the problem.
Now the problem. We need to set up merge replication from the table above (Master table) to the table of the same format in another DB. We need to replicate only those rows of the Master table that have a certain fixed GroupId value, e.g. 200 in the example above. If we ensure that GroupId in all descendant objects of a root object has the same value in the table as the root object itself that would be trivial. The table would look like this:
Id ParentId GroupId
----------------------------------------------------------
1 NULL 200 root grandparent
2 1 200 non-root parent/child
3 2 200 child
4 2 200 child
5 NULL 300 root parent
6 5 300 child
7 5 300 child
And the filter would look like this:
WHERE GroupId = 200
However out of performance considerations, we would like to avoid if possible filling GroupId for the descendant objects, because as it must be clear from the above, GroupId for a descendant object is quite easily deducible via a stored procedure or UDF (just need to go up the tree until ParentId <> NULL). The problem is, I don’t know how to achieve this in a merge replication filter: it would only allow WHERE conditions and joins. I’ve have not had much luck with joins for merge replication in general, but here we have more complex algorithm, because the number of tree levels can be different for every object. And merge replication would not allow using UDF…