Reading Data with ZQL

ZQL is Zero’s query language.

Inspired by SQL, ZQL is expressed in TypeScript with heavy use of the builder pattern. If you have used Drizzle or Kysely, ZQL will feel familiar.

ZQL queries are composed of one or more clauses that are chained together into a query.

Unlike queries in classic databases, the result of a ZQL query is a view that updates automatically and efficiently as the underlying data changes. You can call a query’s materialize() method to get a view, but more typically you run queries via framework-specific bindings. For example, see useQuery for React or SolidJS.

🧑‍🏫Data returned from ZQL should be considered immutable

Select

ZQL queries start by selecting a table. There is no way to select a subset of columns; ZQL queries always return the entire row (modulo column permissions).

const z = new Zero(...);

// Returns a query that selects all rows and columns from the issue table.
z.query.issue;

This is a design tradeoff that allows Zero to better reuse the row locally for future queries. This also makes it easier to share types between different parts of the code.

Ordering

You can sort query results by adding an orderBy clause:

z.query.issue.orderBy('created', 'desc');

Multiple orderBy clauses can be present, in which case the data is sorted by those clauses in order:

// Order by priority descending. For any rows with same priority,
// then order by created desc.
z.query.issue.orderBy('priority', 'desc').orderBy('created', 'desc');

All queries in ZQL have a default final order of their primary key. Assuming the issue table has a primary key on the id column, then:

// Actually means: z.query.issue.orderBy('id', 'asc');
z.query.issue;

// Actually means: z.query.issue.orderBy('priority', 'desc').orderBy('id', 'asc');
z.query.issue.orderBy('priority', 'desc');
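
To make the implicit tie-breaker concrete, here is a small in-memory sketch of the ordering rule described above. It does not use Zero at all: Issue, OrderSpec, and compareWithPrimaryKey are hypothetical names for illustration, and the primary key is assumed to be a numeric id column.

```typescript
// Hypothetical row type for illustration; not part of Zero's API.
type Issue = {id: number; priority: number; created: number};

type OrderSpec = [keyof Issue, 'asc' | 'desc'];

// Builds a comparator that applies each orderBy clause in turn and
// finally falls back to the primary key ('id', ascending), mirroring
// the default final order described above.
function compareWithPrimaryKey(orderBy: OrderSpec[]) {
  const specs: OrderSpec[] = [...orderBy, ['id', 'asc']];
  return (a: Issue, b: Issue): number => {
    for (const [column, direction] of specs) {
      if (a[column] !== b[column]) {
        const cmp = a[column] < b[column] ? -1 : 1;
        return direction === 'asc' ? cmp : -cmp;
      }
    }
    return 0;
  };
}

const issues: Issue[] = [
  {id: 3, priority: 1, created: 30},
  {id: 1, priority: 2, created: 10},
  {id: 2, priority: 2, created: 20},
];

// Models orderBy('priority', 'desc'): issues 1 and 2 tie on priority,
// so the implicit id ordering decides between them.
const sortedIds = [...issues]
  .sort(compareWithPrimaryKey([['priority', 'desc']]))
  .map(issue => issue.id);
```

Because the primary key is unique, the resulting order is always total, so two runs of the same query never disagree about how ties are broken.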

Limit

You can limit the number of rows to return with limit():

z.query.issue.orderBy('created', 'desc').limit(100);

Paging

You can start the results at or after a particular row with start():

let start: IssueRow | undefined;
while (true) {
  let q = z.query.issue.orderBy('created', 'desc').limit(100);
  if (start) {
    q = q.start(start);
  }
  const batch = await q.run();
  console.log('got batch', batch);

  if (batch.length < 100) {
    break;
  }
  start = batch[batch.length - 1];
}

By default start() is exclusive - it returns rows starting after the supplied reference row. This is what you usually want for paging. If you want inclusive results, you can do:

z.query.issue.start(row, {inclusive: true});
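
The difference between exclusive and inclusive start can be sketched over a plain array. This is only a model of the semantics described above, not Zero's implementation; Row, byCreatedDesc, and pageAfter are hypothetical names, and the sort order is assumed to be created descending with id as the tie-breaker.

```typescript
// Hypothetical row type for illustration; not part of Zero's API.
type Row = {id: number; created: number};

// Sort order assumed for this sketch: created descending, then id ascending.
const byCreatedDesc = (a: Row, b: Row): number =>
  b.created - a.created || a.id - b.id;

// Returns up to `limit` rows that sort after `start` (exclusive, the
// default) or at-or-after `start` (inclusive).
function pageAfter(
  rows: Row[],
  limit: number,
  start?: Row,
  inclusive = false,
): Row[] {
  const sorted = [...rows].sort(byCreatedDesc);
  const remaining = start
    ? sorted.filter(row => {
        const position = byCreatedDesc(row, start);
        return inclusive ? position >= 0 : position > 0;
      })
    : sorted;
  return remaining.slice(0, limit);
}

const rows: Row[] = [
  {id: 1, created: 50},
  {id: 2, created: 40},
  {id: 3, created: 30},
  {id: 4, created: 20},
  {id: 5, created: 10},
];

const firstPage = pageAfter(rows, 2);
const lastSeen = firstPage[firstPage.length - 1];
const secondPage = pageAfter(rows, 2, lastSeen); // exclusive
const inclusivePage = pageAfter(rows, 2, lastSeen, true); // inclusive
```

With an exclusive start the second page begins at the row after the last one seen, so no row appears on two pages; with an inclusive start the reference row is repeated.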

Uniqueness

If you want exactly zero or one results, use the one() clause. This causes ZQL to return Row|undefined rather than Row[].

const result = await z.query.issue.where('id', 42).one().run();
if (!result) {
  console.error('not found');
}

one() overrides any limit() clause that is also present.

Relationships

You can query related rows using relationships that are defined in your Zero schema.

// Get all issues and their related comments
z.query.issue.related('comments');

Relationships are returned as hierarchical data. In the above example, each row will have a comments field, which is itself an array of the corresponding comment rows.

You can fetch multiple relationships in a single query:

z.query.issue.related('comments').related('reactions').related('assignees');

Refining Relationships

By default, all matching relationship rows are returned, but this can be refined. The related method accepts an optional second argument: a function that receives a query for the related table and returns a refined query.

z.query.issue.related(
  'comments',
  // It is common to use the 'q' shorthand variable for this parameter,
  // but it is a _comment_ query in particular here, exactly as if you
  // had done z.query.comment.
  q => q.orderBy('modified', 'desc').limit(100).start(lastSeenComment),
);

This relationship query can have all the same clauses that top-level queries can have.

Nested Relationships

You can nest relationships arbitrarily:

// Get all issues, first 100 comments for each (ordered by modified,desc),
// and for each comment all of its reactions.
z.query.issue.related('comments', q =>
  q.orderBy('modified', 'desc').limit(100).related('reactions'),
);

Where

You can filter a query with where():

z.query.issue.where('priority', '=', 'high');

The first parameter is always a column name from the table being queried. IntelliSense will offer the available options (sourced from your Zero schema).

Comparison Operators

Where supports the following comparison operators:

Operator                         | Allowed Operand Types         | Description
=, !=                            | boolean, number, string       | JS strict equal (===) semantics
<, <=, >, >=                     | number                        | JS number compare semantics
LIKE, NOT LIKE, ILIKE, NOT ILIKE | string                        | SQL-compatible LIKE / ILIKE
IN, NOT IN                       | boolean, number, string       | RHS must be an array. Returns true if the RHS contains the LHS by JS strict equals.
IS, IS NOT                       | boolean, number, string, null | Same as = but also works for null

TypeScript will restrict you from using operators with types that don’t make sense – you can’t use > with boolean for example.

Equals is the Default Comparison Operator

Because comparing by = is so common, you can leave it out and where defaults to =.

z.query.issue.where('priority', 'high');

Comparing to null

As in SQL, ZQL’s null is not equal to itself (null ≠ null).

This is required to make join semantics work: if you’re joining employee.orgID on org.id you do not want an employee in no organization to match an org that hasn’t yet been assigned an ID.

When you purposely want to compare to null, ZQL supports IS and IS NOT operators that work just like in SQL:

// Find employees not in any org.
z.query.employee.where('orgID', 'IS', null);

TypeScript will prevent you from comparing to null with other operators.
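
The difference between = and IS can be modeled with two small predicates. This is a sketch of the semantics described above, not Zero's code; Value, Employee, equals, and is are hypothetical names introduced for illustration.

```typescript
// Hypothetical types for illustration; not part of Zero's API.
type Value = string | number | boolean | null;
type Employee = {name: string; orgID: string | null};

// '=' in this sketch follows the SQL-style rule: a comparison that
// involves null never matches, not even null = null.
function equals(lhs: Value, rhs: Value): boolean {
  if (lhs === null || rhs === null) {
    return false;
  }
  return lhs === rhs;
}

// 'IS' behaves like '=' but also matches null against null.
function is(lhs: Value, rhs: Value): boolean {
  return lhs === rhs;
}

const employees: Employee[] = [
  {name: 'ann', orgID: 'org-1'},
  {name: 'bob', orgID: null},
];

// Comparing with '=' finds nobody, because null never equals null.
const viaEquals = employees.filter(e => equals(e.orgID, null));

// Comparing with 'IS' finds the employee who is in no org.
const viaIs = employees.filter(e => is(e.orgID, null));
```

In ZQL itself the first form is rejected at compile time, since TypeScript prevents comparing to null with operators other than IS and IS NOT.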

Compound Filters

The argument to where can also be a callback that returns a complex expression:

// Get all issues that have priority 'critical' or else have both
// priority 'medium' and not more than 100 votes.
z.query.issue.where(({cmp, and, or, not}) =>
  or(
    cmp('priority', 'critical'),
    and(cmp('priority', 'medium'), not(cmp('numVotes', '>', 100))),
  ),
);

cmp is short for compare and works the same as where at the top-level except that it can’t be chained and it only accepts comparison operators (no relationship filters – see below).

Note that chaining where() is also a one-level and:

// Find issues with priority 3 or higher, owned by aa
z.query.issue.where('priority', '>=', 3).where('owner', 'aa');
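
The boolean structure of these filters can be sketched with plain predicate combinators. The and, or, and not functions below are local stand-ins written for this example (they are not the helpers Zero passes to the where callback), and Issue is a hypothetical row type.

```typescript
// Hypothetical row type for illustration; not part of Zero's API.
type Issue = {id: number; priority: string; numVotes: number};
type Predicate = (row: Issue) => boolean;

// Local stand-ins for the and/or/not helpers; assumptions for this sketch.
const and = (...ps: Predicate[]): Predicate => row => ps.every(p => p(row));
const or = (...ps: Predicate[]): Predicate => row => ps.some(p => p(row));
const not = (p: Predicate): Predicate => row => !p(row);

const isCritical: Predicate = row => row.priority === 'critical';
const isMedium: Predicate = row => row.priority === 'medium';
const manyVotes: Predicate = row => row.numVotes > 100;

// Same shape as the compound filter example: critical, or else
// medium with not more than 100 votes.
const filter = or(isCritical, and(isMedium, not(manyVotes)));

const issues: Issue[] = [
  {id: 1, priority: 'critical', numVotes: 500},
  {id: 2, priority: 'medium', numVotes: 50},
  {id: 3, priority: 'medium', numVotes: 150},
  {id: 4, priority: 'low', numVotes: 0},
];

const matchingIds = issues.filter(filter).map(issue => issue.id);

// Chaining two filters keeps the same rows as a single and(...) of both.
const chainedIds = issues
  .filter(isMedium)
  .filter(not(manyVotes))
  .map(issue => issue.id);
const andIds = issues
  .filter(and(isMedium, not(manyVotes)))
  .map(issue => issue.id);
```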

Relationship Filters

Your filter can also test properties of relationships. Currently the only supported test is existence:

// Find all orgs that have at least one employee
z.query.organization.whereExists('employees');

The argument to whereExists is a relationship, so just like other relationships it can be refined with a query:

// Find all orgs that have at least one cool employee
z.query.organization.whereExists('employees', q =>
  q.where('location', 'Hawaii'),
);

As with querying relationships, relationship filters can be arbitrarily nested:

// Get all issues that have comments that have reactions
z.query.issue.whereExists('comments', q => q.whereExists('reactions'));

An exists helper is also provided. It can be combined with and, or, cmp, and not to build compound filters that check relationship existence:

// Find issues that have at least one comment or are high priority
z.query.issue.where(({cmp, or, exists}) =>
  or(
    cmp('priority', 'high'),
    exists('comments'),
  ),
);

Data Lifetime and Reuse

Zero reuses data synced from prior queries to answer new queries when possible. This is what enables instant UI transitions.

But what controls the lifetime of this client-side data? How can you know whether any particular query will return instant results? How can you know whether those results will be up to date or stale?

The answer is that the data on the client is simply the union of rows returned from queries which are currently syncing. Once a row is no longer returned by any syncing query, it is removed from the client. Thus, there is never any stale data in Zero.

So when you are thinking about whether a query is going to return results instantly, you should think about what other queries are syncing, not about what data is local. Data exists locally if and only if there is a query syncing that returns that data.
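
The "union of syncing queries" rule can be modeled in a few lines. This is a conceptual sketch only: SyncingQueries and rowsOnClient are hypothetical names, and each query is reduced to the set of row IDs it currently returns.

```typescript
// Hypothetical model: each syncing query maps to the set of row IDs
// it currently returns. Not part of Zero's API.
type SyncingQueries = Map<string, Set<string>>;

// The client holds exactly the union of rows returned by syncing queries.
function rowsOnClient(queries: SyncingQueries): Set<string> {
  const rows = new Set<string>();
  queries.forEach(result => result.forEach(id => rows.add(id)));
  return rows;
}

const syncing: SyncingQueries = new Map([
  ['openIssues', new Set(['issue-1', 'issue-2'])],
  ['myIssues', new Set(['issue-2', 'issue-3'])],
]);

const before = rowsOnClient(syncing);

// Once 'openIssues' stops syncing, issue-1 is no longer returned by any
// query and disappears; issue-2 stays because 'myIssues' still returns it.
syncing.delete('openIssues');
const after = rowsOnClient(syncing);
```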

Query Lifecycle

Diagram of query lifecycle

Queries can be either active or backgrounded. An active query is one that is currently being used by the application. Backgrounded queries are not currently in use, but continue syncing in case they are needed again soon.

Active queries are created in one of three ways:

  1. The app calls q.materialize() to get a View.
  2. The app uses a platform binding like React's useQuery(q).
  3. The app calls preload() to sync larger queries without a view.

Active queries sync until they are deactivated. The way this happens depends on how the query was created:

  1. For materialize() queries, the UI calls destroy() on the view.
  2. For useQuery(), the UI unmounts the component (which calls destroy() under the covers).
  3. For preload(), the UI calls cleanup() on the return value of preload().

Background Queries

By default a deactivated query stops syncing immediately.

But it's often useful to keep queries syncing beyond deactivation in case the UI needs the same or a similar query in the near future. This is accomplished with the ttl parameter:

const [user] = useQuery(z.query.user.where('id', userId), {ttl: '1d'});

The ttl parameter specifies how long the app developer wishes the query to run in the background. The following formats are allowed (where %d is a positive integer):

Format  | Meaning
none    | No backgrounding. Query will immediately stop when deactivated. This is the default.
%ds     | Number of seconds.
%dm     | Number of minutes.
%dh     | Number of hours.
%dd     | Number of days.
%dy     | Number of years.
forever | Query will never be stopped.
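
As a rough illustration of how these formats map to durations, here is a sketch of a parser. ttlToMilliseconds is a hypothetical helper, not a function exported by Zero, and treating a year as 365 days is an assumption made for this example.

```typescript
// Milliseconds per unit. Treating 'y' as 365 days is an assumption
// for this sketch; the text above does not define the length of a year.
const UNIT_MS = {
  s: 1000,
  m: 60 * 1000,
  h: 60 * 60 * 1000,
  d: 24 * 60 * 60 * 1000,
  y: 365 * 24 * 60 * 60 * 1000,
};

// Hypothetical helper: converts a ttl string in one of the formats
// above into a duration in milliseconds.
function ttlToMilliseconds(ttl: string): number {
  if (ttl === 'none') {
    return 0; // no backgrounding
  }
  if (ttl === 'forever') {
    return Infinity; // never stopped
  }
  // %d must be a positive integer, followed by a single unit letter.
  const match = /^([1-9]\d*)([smhdy])$/.exec(ttl);
  if (!match) {
    throw new Error(`Unrecognized ttl: ${ttl}`);
  }
  const amount = Number(match[1]);
  const unit = match[2] as keyof typeof UNIT_MS;
  return amount * UNIT_MS[unit];
}

const oneDay = ttlToMilliseconds('1d');
const thirtySeconds = ttlToMilliseconds('30s');
```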

If the UI re-requests a background query, it becomes an active query again. Since the query was syncing in the background, the very first synchronous result that the UI receives after reactivation will be up to date with the server (i.e., its result type will be complete).

Just like other types of queries, the data from background queries is available for use by new queries. A common pattern is to preload a subset of the most commonly needed data with {ttl: 'forever'} and then do more specific queries from the UI with, e.g., {ttl: '1d'}. Most often the preloaded data will be able to answer user queries, but if not, the new query will be answered by the server and backgrounded for a day in case the user revisits it.

Client Capacity Management

Zero has a default soft limit of 20,000 rows on the client-side, or about 20MB of data assuming 1KB rows.

This limit can be increased with the --target-client-row-count flag, but we do not recommend setting it higher than 100,000.

Here is how this limit is managed:

  1. Active queries are never destroyed, even if the limit is exceeded. Developers are expected to keep active queries well under the limit.
  2. The ttl value counts from the moment a query deactivates. Backgrounded queries are destroyed immediately when the ttl is reached, even if the limit hasn't been reached.
  3. If the client exceeds its limit, Zero will destroy backgrounded queries, least-recently-used first, until the store is under the limit again.
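
The three rules above can be sketched as a single eviction pass. This is a simplified model rather than Zero's implementation: QueryState and enforceLimit are hypothetical names, and rows are counted per query without de-duplicating rows shared between queries.

```typescript
// Hypothetical bookkeeping for one query; not part of Zero's API.
type QueryState = {
  name: string;
  active: boolean;
  rowCount: number;
  lastUsed: number; // timestamp (ms) of the most recent use
  expiresAt: number; // deactivation time + ttl (ms); ignored while active
};

function enforceLimit(
  queries: QueryState[],
  now: number,
  rowLimit: number,
): QueryState[] {
  // Rule 2: backgrounded queries whose ttl has elapsed are destroyed,
  // whether or not the limit has been reached.
  let kept = queries.filter(q => q.active || q.expiresAt > now);

  const totalRows = () => kept.reduce((sum, q) => sum + q.rowCount, 0);

  // Rule 3: evict backgrounded queries, least recently used first,
  // until the total is under the limit. Rule 1: active queries are
  // never candidates for eviction.
  const evictable = kept
    .filter(q => !q.active)
    .sort((a, b) => a.lastUsed - b.lastUsed);
  for (const victim of evictable) {
    if (totalRows() <= rowLimit) {
      break;
    }
    kept = kept.filter(q => q !== victim);
  }
  return kept;
}

const remaining = enforceLimit(
  [
    {name: 'inbox', active: true, rowCount: 15000, lastUsed: 1000, expiresAt: 0},
    {name: 'old', active: false, rowCount: 4000, lastUsed: 100, expiresAt: 10000},
    {name: 'recent', active: false, rowCount: 4000, lastUsed: 500, expiresAt: 10000},
    {name: 'expired', active: false, rowCount: 10, lastUsed: 900, expiresAt: 50},
  ],
  1000, // now
  20000, // row limit
).map(q => q.name);
```

In this example the expired query is dropped first, then the least recently used backgrounded query is evicted to bring the total from 23,000 rows down to 19,000.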

Thinking in Queries

Although incremental view maintenance (IVM) is a very efficient way to keep queries up to date relative to re-running them, it isn't free. You still need to think about how many queries you are creating, how long they are kept alive, and how expensive they are.

This is why Zero defaults to not backgrounding queries and doesn't try to aggressively fill its client datastore to capacity. You should put some thought into what queries you want to run in the background, and for how long.

Zero currently provides a few basic tools to understand the cost of your queries:

  • The client logs a warning for slow query materializations. Look for Slow query materialization in your logs. The default threshold is 5s (including network) but this is configurable with the slowMaterializeThreshold parameter.
  • The client logs the materialization time of all queries at the debug level. Look for Materialized query in your logs.
  • The server logs a warning for slow query materializations. Look for Slow query materialization in your logs. The default threshold is 5s but this is configurable with the log-slow-materialize-threshold configuration parameter.

We will be adding more tools over time.

Completeness

Zero returns whatever data it has on the client immediately for a query, then falls back to the server for any missing data. Sometimes it's useful to know the difference between these two types of results. To do so, use the result from useQuery:

const [issues, issuesResult] = useQuery(z.query.issue);
if (issuesResult.type === 'complete') {
  console.log('All data is present');
} else {
  console.log('Some data is missing');
}

The possible values of result.type are currently complete and unknown.

The complete value is currently only returned when Zero has received the server result. But in the future, Zero will be able to return this result type when it knows that all possible data for this query is already available locally. Additionally, we plan to add a prefix result for when the data is known to be a prefix of the complete result. See Consistency for more information.

Preloading

Almost all Zero apps will want to preload some data in order to maximize the “local-first” feel of instantaneous UI transitions.

In Zero, preloading is done via queries – the same queries you use in the UI and for auth.

However, because preload queries are usually much larger than a screenful of UI, Zero provides a special preload() helper to avoid the overhead of materializing the result into JS objects:

// Preload the first 1k issues + their creator, assignee, labels, and
// the view state for the active user.
//
// There's no need to render this data, so we don't use `useQuery()`:
// this avoids the overhead of pulling all this data into JS objects.
z.query.issue
  .related('creator')
  .related('assignee')
  .related('labels')
  .related('viewState', q => q.where('userID', z.userID).one())
  .orderBy('created', 'desc')
  .limit(1000)
  .preload();

Running Queries Once

Usually, subscribing to a query is what you want in a reactive UI, but every so often running a query once is all that’s needed.

const results = await z.query.issue.where('foo', 'bar').run();

Consistency

Zero always syncs a consistent partial replica of the backend database to the client. This avoids many common consistency issues that come up in classic web applications. But there are still some consistency issues to be aware of when using Zero.

For example, imagine that you have a bug database with 10k issues. You preload the first 1k issues sorted by created.

The user then does a query of issues assigned to themselves, sorted by created. Among the 1k issues that were preloaded imagine 100 are found that match the query. Since the data we preloaded is in the same order as this query, we are guaranteed that any local results found will be a prefix of the server results.

The resulting UX is nice: the user will see initial results to the query instantly. If more results are found server-side, those results are guaranteed to sort below the local results. There's no shuffling of results when the server response comes in.

Now imagine that the user switches the sort to ‘sort by modified’. This new query will run locally, and will again find some local matches. But it is now unlikely that the local results found are a prefix of the server results. When the server result comes in, the user will probably see the results shuffle around.

To avoid this annoying effect, what you should do in this example is also preload the first 1k issues sorted by modified desc. In general for any query shape you intend to do, you should preload the first n results for that query shape with no filters, in each sort you intend to use.
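
The prefix argument can be checked on a toy dataset. The sketch below models the scenario with plain arrays; Issue, isPrefix, and the sample rows are all made up for illustration, and "preloading" is reduced to taking the first three rows by created.

```typescript
// Hypothetical row type and data for illustration; not part of Zero's API.
type Issue = {id: number; assignee: string; created: number; modified: number};

const backend: Issue[] = [
  {id: 1, assignee: 'me', created: 60, modified: 10},
  {id: 2, assignee: 'you', created: 50, modified: 20},
  {id: 3, assignee: 'me', created: 40, modified: 30},
  {id: 4, assignee: 'me', created: 30, modified: 90},
  {id: 5, assignee: 'you', created: 20, modified: 80},
  {id: 6, assignee: 'me', created: 10, modified: 70},
];

const byDesc = (column: 'created' | 'modified') => (a: Issue, b: Issue) =>
  b[column] - a[column];

// "Issues assigned to me", sorted by the given column, over some row set.
const mine = (rows: Issue[], column: 'created' | 'modified') =>
  rows.filter(row => row.assignee === 'me').sort(byDesc(column));

// True if `local` matches the first rows of `server`, in order.
function isPrefix(local: Issue[], server: Issue[]): boolean {
  return (
    local.length <= server.length &&
    local.every((row, i) => row.id === server[i]?.id)
  );
}

// Preload the first 3 issues sorted by created (stands in for "first 1k").
const preloaded = [...backend].sort(byDesc('created')).slice(0, 3);

// Same sort as the preload: local results are a prefix of server results.
const createdIsPrefix = isPrefix(
  mine(preloaded, 'created'),
  mine(backend, 'created'),
);

// Different sort: the local results are no longer a prefix, so the UI
// would reshuffle when the server result arrives.
const modifiedIsPrefix = isPrefix(
  mine(preloaded, 'modified'),
  mine(backend, 'modified'),
);
```

Here createdIsPrefix is true and modifiedIsPrefix is false, which is why a second preload sorted by modified is needed to avoid the reshuffle.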

🤔Note

In the future, we will be implementing a consistency model that fixes these issues automatically. We will prevent Zero from returning local data when that data is not known to be a prefix of the server result. Once the consistency model is implemented, preloading can be thought of as purely a performance thing, and not required to avoid unsightly flickering.