Skip to content

Conversation

grgbkr
Copy link
Contributor

@grgbkr grgbkr commented Nov 6, 2024

No description provided.

Copy link

vercel bot commented Nov 6, 2024

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name Status Preview Comments Updated (UTC)
replicache-docs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 11, 2024 5:21pm
zbugs ✅ Ready (Inspect) Visit Preview 💬 Add feedback Nov 11, 2024 5:21pm

const relationship = node.relationships[this.#relationshipName];
assert(relationship);
let size = 0;
// an alternative to make relationship callable is to use
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@aboodman realized this while implementing this. It actually can be done without changing the Change api.

relationship: string,
format: Format,
) {
if (format.hidden) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note from video review with Aaron:

It seems fine to have the subquery filter subqueries in the output view, just not in the TypeScript type. It may even help with debugging/understanding. Suggests not doing this hidden format for now.

// is <= bound.
assert(change.type !== 'child', 'child changes are not supported');

const {compareRows} = this.getSchema();
Copy link
Contributor Author

@grgbkr grgbkr Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This enhancement to take should land first as a standalone change.

type: v.literal('NOT EXISTS'),
});

export const correlatedSubqueryConditionConditionSchema = v.union(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this be
correlatedSubqueryConditionConditionSchema
or
correlatedSubqueryConditionOperatorSchema

for EXISTS and NOT EXISTS there are no other operands other than the subquery, but for other subquery conditions there may be additional operands.

Copy link
Contributor

@tantaman tantaman Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be an operator. It is more symmetrical with SimpleCondition that way.

type Condition = SimpleCondition | Cojunction | Discjunction | CorrelatedSubqueryCondition;

type CorrelatedSubqueryCondition = {
  type: 'correlatedSubquery';
  related: CorrelatedSubquery;
  op: 'EXISTS' | 'NOT EXISTS'
};

assert(related, 'Invalid relationship');
const related1 = related;

// TODO what if they have multiple exists on same relationship?
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How should we name these for subquery filters, and how should we fix the naming collision issue for all subqueries?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each whereExists gets an incrementing value appended to its name? The user never needs to interact with the results of a whereExists relationships so we can generate any name we like for them.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related calls should be able to take an alias parameter. If related calls collide, that's on the developer to deal with.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need some way to avoid collision between the user's aliases and the ones we create for subquery filters.

Reserved character or prefix?

push(change: Change) {
assert(this.#output, 'Output not set');

switch (change.type) {
Copy link
Contributor Author

@grgbkr grgbkr Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

An alternative design is for push to always fetch the relationship with code like:

    const relationship = must(
      first(
        this.#input.fetch({
          start: {row: node.row, basis: 'at'},
        }),
      ),
    ).relationships[this.#relationshipName]

and to then check the relationships size.

This alternative approach is simpler, stateless, and it also generalizes to support other subquery condition types, it could allow implementing a general SubqueryFilter operator like:

export class SubqueryFilter implements Operator {
  constructor(
    input: Input,
    storage: Storage,
    relationshipName: string,
    prediate: (relationship: Stream<Node>) => boolean
  ) 

The downside is it is less efficient.

throw new Error(`Invalid relationship ${relationship as string}`);
}

whereExists<TRelationship extends keyof TSchema['relationships']>(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need a better API for this at the Query level, something that supports subquery conditions nested in not/and/or conditions.

Perhaps the ideas in the "or API improvements" bug can apply here https://bugs.rocicorp.dev/issue/3104

let q = z.issue.where({cmp, or, and, not}=> or(
    filters.map(f => cmp('category', f))
));

could be extend to have:

let q = z.issue.where({cmp, or, and, not, exists}=> or(
    filters.map(f => cmp('category', f))
));

Copy link
Contributor

@tantaman tantaman Nov 11, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, the latter is also how Kysely handles exists: https://kysely.dev/docs/recipes/expressions#expression-builder

Good to know the path is well trodden there. I'm in favor of that form.

I know @aboodman was hoping to remove the nomenclature of exists and notExists entirely. That we'd instead do something like:

z.issue.related('comments').where('comments.length', '>', 0)

Or an alternative formulation:

z.issue.where(
  related('comments').length, '>', 0
);

That's too much for me to entertain at the moment without making it my main focus. At first glance, an API of that form greatly expands the surface area (in terms of features) that we would need to cover. It pulls in aggregating over the subquery then comparing rather than simple existence checks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should do exists and non-exist for now and unblock read permissions.

constructor(
input: Input,
format: Format = {singular: false, relationships: {}},
format: Format = {singular: false, hidden: false, relationships: {}},
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yay! hidden should have been a format param. I think it predated format though.

| GenericDisjunction<TSchema>
| {
type: 'correlatedSubQuery';
// TODO...
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't seem to get the typing right here...

Comment on lines +177 to +185
for (const csqc of correlatedSubQueryConditions) {
let csq = csqc.related;
if (
csqc.condition.type === 'EXISTS' ||
csqc.condition.type === 'NOT EXISTS'
) {
csq = {...csq, subquery: {...csq.subquery, limit: EXISTS_LIMIT}};
}
end = applyCorrelatedSubQuery(csq, delegate, staticQueryParameters, end);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How do we avoid conflicts between selecting a subquery and filtering on one with the same name?

issue.related('creator').where(exists('creator'))

Can we reduce to doing a single join?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could do this optimization here in the build step.

for (const c of condition.conditions) {
gather(c);
}
return;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: seems like a random return.

}

cleanup(req: FetchRequest) {
return this.#filter(this.#input.cleanup(req));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we not need to cleanup our own storage?

if (size === 1) {
this.#output.push({
type: this.#not ? 'remove' : 'add',
node: must(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we need to re-fetch the row in order to populate relationships? Is that the issue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.
In this case a child change needs to be translated to an add of the parent (because now a child exists). Child changes only have a row, but an add needs a Node (i.e. a row and a tree of relationships).

let size = 0;
// however this is slightly more expensive (though only slightly since
// most of the fetch is lazy)
for (const _relatedNode of relationship) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we want to store the entire size?? Or just up to some limit?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The subquery is limited in builder.ts, see

csq = {...csq, subquery: {...csq.subquery, limit: EXISTS_LIMIT}};

It is done there to also limit the number of rows we sync to the client.

// a bug.
if (!matchesConstraint(startAt)) {
assert(false, 'Start row must match constraint');
console.log(req);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

rm

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Still need to track down why my changes to the kitchen sync test caused this assert to start failing (even if the query has no exists/non-exists).

assert(related, 'Invalid relationship');
const related1 = related;

// TODO what if they have multiple exists on same relationship?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each whereExists gets an incrementing value appended to its name? The user never needs to interact with the results of a whereExists relationships so we can generate any name we like for them.

assert(related, 'Invalid relationship');
const related1 = related;

// TODO what if they have multiple exists on same relationship?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

related calls should be able to take an alias parameter. If related calls collide, that's on the developer to deal with.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants