
Conversation

@zwang28
Contributor

@zwang28 zwang28 commented Dec 10, 2022

I hereby agree to the terms of the Singularity Data, Inc. Contributor License Agreement.

What's changed and what's your intention?

This PR supports querying historical epoch data, which is powered by meta backups under the hood.

Typical use case:

  1. Create meta backups via risectl meta backup-meta from time to time.
  2. To query historical data, first find a valid epoch. A valid epoch is one that falls within the safe_epoch and max_committed_epoch of at least one meta snapshot:
dev=> select * from rw_catalog.rw_meta_snapshot;
 meta_snapshot_id | hummock_version_id |    safe_epoch    |      safe_epoch_ts      | max_committed_epoch | max_committed_epoch_ts  
------------------+--------------------+------------------+-------------------------+---------------------+-------------------------
                1 |                  5 |                0 |                         |    3551844974919680 | 2022-12-19 06:40:53.255
                2 |                  9 | 3551845695750144 | 2022-12-19 06:41:04.254 |    3551847139508224 | 2022-12-19 06:41:26.284
(2 rows)
  3. Suppose we choose the epoch 3551845695750200, which is covered by meta backup 2 above.
  4. SET QUERY_EPOCH=3551845695750200. Then all SELECTs in this session return data as of this epoch.
  5. SET QUERY_EPOCH=0 to disable historical queries.

Implementation:

  • frontend: support querying available meta backup info, in order to get a best-fit epoch to use.
  • frontend: support specifying the query_epoch session variable (a sketch of the intended behavior follows this list).
  • storage: support querying data using an arbitrary historical hummock version taken from a meta backup.
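
As a rough illustration of the second point (hypothetical names; not the PR's actual code), a non-zero QUERY_EPOCH pins the epoch a batch query reads at, and 0 falls back to the latest committed epoch:

// Illustrative sketch only: how a session-level QUERY_EPOCH could pick the
// epoch used by batch queries. Function and parameter names are made up.
fn resolve_query_epoch(session_query_epoch: u64, max_committed_epoch: u64) -> u64 {
    if session_query_epoch != 0 {
        // A non-zero QUERY_EPOCH pins every SELECT in the session to that epoch,
        // which may be served from a meta backup if it is older than safe_epoch.
        session_query_epoch
    } else {
        // QUERY_EPOCH=0 disables historical queries and reads the latest data.
        max_committed_epoch
    }
}

fn main() {
    assert_eq!(resolve_query_epoch(3551845695750200, 3551847139508224), 3551845695750200);
    assert_eq!(resolve_query_epoch(0, 3551847139508224), 3551847139508224);
}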

Querying schema-changed or dropped tables is not supported in this PR:

  • Querying historical data from such tables is doable, because a meta backup contains the hummock version along with the table catalogs. However, the schema change feature is not available yet, so shall we also postpone supporting such queries?

An example with QUERY_EPOCH can be found in test_query_backup.sh.

Checklist

  • I have written necessary rustdoc comments
  • I have added necessary unit tests and integration tests
  • All checks passed in ./risedev check (or alias, ./risedev c)

Documentation

If your pull request contains user-facing changes, please specify the types of the changes, and create a release note. Otherwise, please feel free to remove this section.

Types of user-facing changes

Please keep the types that apply to your changes, and remove those that do not apply.

  • SQL commands, functions, and operators

Release note

  • Add support for querying historical data from available meta snapshots. It is enabled by setting the session config QUERY_EPOCH to a valid epoch that is covered by any meta snapshot.
  • Add a system schema rw_catalog. rw_catalog contains information specific to RisingWave, compared with the existing system schemas pg_catalog and information_schema.
  • Add a system table rw_meta_snapshot under rw_catalog. It contains info about all available meta snapshots.

Refer to a related PR or issue link (optional)

#6482

@github-actions github-actions bot added the type/feature Type: New feature. label Dec 10, 2022
@zwang28 zwang28 force-pushed the wangzheng/query_backup branch from d9df1f0 to b856275 on December 11, 2022 05:22
@codecov

codecov bot commented Dec 11, 2022

Codecov Report

Merging #6840 (078d90a) into main (b2896cd) will decrease coverage by 0.03%.
The diff coverage is 43.01%.

@@            Coverage Diff             @@
##             main    #6840      +/-   ##
==========================================
- Coverage   72.79%   72.75%   -0.04%     
==========================================
  Files        1035     1035              
  Lines      166044   166347     +303     
==========================================
+ Hits       120864   121034     +170     
- Misses      45180    45313     +133     
Flag Coverage Δ
rust 72.75% <43.01%> (-0.04%) ⬇️

Flags with carried forward coverage won't be shown.

Impacted Files Coverage Δ
...batch/src/executor/join/distributed_lookup_join.rs 0.00% <0.00%> (ø)
src/batch/src/executor/join/local_lookup_join.rs 55.43% <0.00%> (-0.40%) ⬇️
src/batch/src/rpc/service/task_service.rs 0.00% <0.00%> (ø)
src/common/common_service/src/observer_manager.rs 43.56% <0.00%> (-0.44%) ⬇️
src/common/src/catalog/mod.rs 71.42% <ø> (ø)
src/compute/src/server.rs 0.00% <0.00%> (ø)
src/ctl/src/cmd_impl/hummock/list_kv.rs 0.00% <0.00%> (ø)
src/ctl/src/common/hummock_service.rs 0.00% <0.00%> (ø)
...ntend/src/catalog/system_catalog/pg_catalog/mod.rs 0.00% <0.00%> (ø)
src/frontend/src/handler/query.rs 18.23% <0.00%> (-0.42%) ⬇️
... and 54 more


@zwang28 zwang28 marked this pull request as ready for review December 11, 2022 05:46
@zwang28 zwang28 force-pushed the wangzheng/query_backup branch from b856275 to 813742a on December 12, 2022 07:58
@zwang28 zwang28 force-pushed the wangzheng/query_backup branch from 813742a to d8d5c00 on December 12, 2022 13:03
@zwang28
Contributor Author

zwang28 commented Dec 12, 2022

The historical query works, but performance is ridiculously bad, even for a simple select * that triggers #vnode concurrent tasks. Investigating.

@zwang28 zwang28 marked this pull request as draft December 12, 2022 13:09
@zwang28
Contributor Author

zwang28 commented Dec 13, 2022

The historical query works, but performance is ridiculously bad, even for a simple select * that triggers #vnode concurrent tasks. Investigating.

It's because validate_epoch can take as long as 0.3 seconds in debug mode when it returns an error (HummockError::expired_epoch(safe_epoch, epoch)), and the historical query logic was implemented to run only after validate_epoch failed. Fixing (see the sketch of the fixed control flow below).
Fixed
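
To show the shape of the fix, here is a simplified sketch with stand-in types (not the actual Hummock code): the fall-back to a backup version is decided by comparing the requested epoch with the pinned version's safe_epoch up front, rather than by waiting for validate_epoch to return an error.

// Simplified stand-ins for the real Hummock types; illustration only.
struct Version {
    safe_epoch: u64,
}

struct BackupReader;

impl BackupReader {
    // Stand-in for looking the epoch up in the loaded backup manifests.
    fn try_get_hummock_version(&self, _epoch: u64) -> Option<Version> {
        None
    }
}

// Decide which version serves the read *before* epoch validation, so the
// backup path no longer depends on validate_epoch's slow error path.
fn choose_read_version(epoch: u64, pinned: Version, backup_reader: &BackupReader) -> Version {
    if epoch < pinned.safe_epoch {
        if let Some(backup_version) = backup_reader.try_get_hummock_version(epoch) {
            return backup_version;
        }
    }
    pinned
}

fn main() {
    let pinned = Version { safe_epoch: 100 };
    // With no backup loaded in this sketch, the pinned version is still used.
    assert_eq!(choose_read_version(50, pinned, &BackupReader).safe_epoch, 100);
}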

@BugenZhao
Member

It's because validate_epoch can take as long as 0.3 seconds in debug mode when it returns an error (HummockError::expired_epoch(safe_epoch, epoch)).

Caused by #6157?

@zwang28 zwang28 marked this pull request as ready for review December 13, 2022 04:53
@zwang28

This comment was marked as outdated.

@zwang28 zwang28 added A-storage Area: Storage. A-frontend Area: Protocol, parsing, binder. labels Dec 13, 2022
@zwang28 zwang28 requested a review from chenzl25 December 13, 2022 05:26
Collaborator
@hzxa21 hzxa21 left a comment

Rest LGTM.

"MAX_SPLIT_RANGE_GAP",
"SEARCH_PATH",
"TRANSACTION ISOLATION LEVEL",
"QUERY_EPOCH",
Collaborator

Users don't understand epochs, so asking them to manually specify the exact query epoch for a time-travel query is not realistic. In terms of syntax, I did a quick check on PG's docs and sadly it seems to have deprecated its time-travel query SQL command. I suggest we do either of the following options:

  1. Support querying the backup epochs as well as looking up backup epochs by timestamp. With that, users can find out the exact query epoch they should use for a time-travel query.
  2. Support AS OF syntax in SELECT: SELECT ... FROM ... AS OF <timestamp>. This is easier to use but may confuse users, because the syntax seems to indicate that we can support time-travel queries without backups, which is not the case.

For now, I think 1) is sufficient since we don't expect users to run time-travel queries very often. cc @fuyufjh any suggestions?

Collaborator

However, I think it is okay to keep the QUERY_EPOCH session config for admin usage and debugging.

Contributor Author
@zwang28 zwang28 Dec 15, 2022

I think both options are fine.

  • With option 1, the user can get an exact epoch and then still use it via QUERY_EPOCH, right? Or shall we use a less ambiguous session config, e.g. QUERY_BACKUP_EPOCH?
  • To implement option 2, the frontend needs to be aware of (and sync) the backup manifest, so it can choose a best-fit query epoch for the given arbitrary <timestamp>.

Collaborator

I think both options are fine.

  • With option 1, the user can get an exact epoch and then still use it via QUERY_EPOCH, right? Or shall we use a less ambiguous session config, e.g. QUERY_BACKUP_EPOCH?
  • To implement option 2, the frontend needs to be aware of (and sync) the backup manifest, so it can choose a best-fit query epoch for the given arbitrary <timestamp>.

Sorry for the late reply. For option 1, I mean we provide a way to get the backed-up epochs via a SQL statement (preferred) or risectl. Then we can set QUERY_EPOCH accordingly.

Contributor Author
@zwang28 zwang28 Dec 19, 2022

Followed option 1.
Added a new system table: rw_catalog.rw_meta_snapshot. Querying it returns all available backups along with their human-readable timestamp ranges (safe_epoch_ts and max_committed_epoch_ts). The safe_epoch_ts of the first row is NULL, because safe_epoch=0 is not a valid epoch.

dev=> select * from rw_catalog.rw_meta_snapshot;
 meta_snapshot_id | hummock_version_id |    safe_epoch    |      safe_epoch_ts      | max_committed_epoch | max_committed_epoch_ts  
------------------+--------------------+------------------+-------------------------+---------------------+-------------------------
                1 |                  5 |                0 |                         |    3551844974919680 | 2022-12-19 06:40:53.255
                2 |                  9 | 3551845695750144 | 2022-12-19 06:41:04.254 |    3551847139508224 | 2022-12-19 06:41:26.284
(2 rows)

As the backup manifest is small enough, I made the frontend query it directly from the meta node.
Alternatively, we could make the frontend read the backup manifest from backup storage instead of the meta node, but that doesn't bring significant benefit now, so it's not adopted.

Contributor Author

Currently, backup creation and deletion are still done via risectl. We could also support create backup and delete backup <id> SQL commands later.

..Default::default()
}),
meta_backup_manifest_id: Some(MetaBackupManifestId {
id: meta_backup_manifest_id,
Collaborator

Shouldn't we send all backup manifest ids here instead of just the latest one?

Collaborator

Got it after reading backup_reader.rs. The notification is used to trigger a backup reader refresh, which will fetch all the manifests.
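
A rough sketch of that pattern (the type and field names below are illustrative, not the actual backup_reader.rs code): the notification carries only the latest manifest id, and the reader reacts by reloading the full manifest list from backup storage.

// Illustration only; not the real BackupReader. The notification carries just
// the latest manifest id, which acts as a change signal.
struct BackupReaderSketch {
    loaded_manifest_id: u64,
    manifests: Vec<u64>,
}

impl BackupReaderSketch {
    fn on_notification(&mut self, latest_manifest_id: u64) {
        if latest_manifest_id > self.loaded_manifest_id {
            // A refresh loads *all* manifests, so every snapshot
            // (not only the latest one) becomes queryable.
            self.refresh_all_manifests(latest_manifest_id);
        }
    }

    fn refresh_all_manifests(&mut self, latest_manifest_id: u64) {
        // In the real code this would list and read manifests from backup storage;
        // here we just pretend ids are contiguous for the sake of the sketch.
        self.manifests = (1..=latest_manifest_id).collect();
        self.loaded_manifest_id = latest_manifest_id;
    }
}

fn main() {
    let mut reader = BackupReaderSketch { loaded_manifest_id: 0, manifests: vec![] };
    reader.on_notification(2);
    assert_eq!(reader.manifests, vec![1, 2]);
}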

let mut committed_version = (**pinned_version).clone();
if epoch < pinned_version.safe_epoch() {
    // try to query hummock version from backup
    if let Some(backup_version) = self.backup_reader.try_get_hummock_version(epoch).await {
Collaborator

I am thinking about whether we should make reading from backup more explicit instead of just doing it internally in the state store implementation. We can clearly identify misuse or potential bugs by doing so. My thoughts:

  1. Add another type for the backup read epoch in HummockReadEpoch.
  2. Add another interface or read option for backup reads, and use this interface only when the HummockReadEpoch is for backup.

@Li0k @wenym1 what do you think?
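
A minimal sketch of what option 1 might look like (the Backup variant and helper methods below are assumptions for illustration, not the crate's actual API): make backup reads a distinct case of the read epoch, so the state store branches on it explicitly instead of inferring it from the epoch value.

// Sketch only: a read-epoch enum with an extra, explicit variant for reads
// served from a meta backup. Variant and method names are illustrative.
pub enum HummockReadEpoch {
    // Read the snapshot of a committed epoch.
    Committed(u64),
    // Read the latest data without waiting for a committed epoch.
    Current(u64),
    // Read a historical epoch that must be served from a meta backup.
    Backup(u64),
}

impl HummockReadEpoch {
    pub fn get_epoch(&self) -> u64 {
        match self {
            Self::Committed(e) | Self::Current(e) | Self::Backup(e) => *e,
        }
    }

    // The state store can branch on this and use the backup read path only
    // when the frontend explicitly asked for it.
    pub fn is_backup(&self) -> bool {
        matches!(self, Self::Backup(_))
    }
}

fn main() {
    let epoch = HummockReadEpoch::Backup(3551845695750200);
    assert!(epoch.is_backup());
    assert_eq!(epoch.get_epoch(), 3551845695750200);
}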

Contributor Author
@zwang28 zwang28 Dec 15, 2022

Agree with the benefit of clearly identifying misuse or potential bugs. However, that also means the frontend needs to be aware of "reading from backup", which is what the current implementation aims to avoid. But if we are going to support reading from backup for dropped tables, then the frontend has to be aware of it anyway.

Will refactor.

Contributor Author

Refactored.

# Conflicts:
#	src/meta/src/backup_restore/backup_manager.rs
@zwang28 zwang28 force-pushed the wangzheng/query_backup branch from 38d8f2f to 3df5c77 on December 16, 2022 11:24
@zwang28
Contributor Author

zwang28 commented Dec 19, 2022

I've added a new system schema "rw_catalog" in cea5686.
Would you PTAL @yezizp2012 @HuaHuaY?

@zwang28 zwang28 requested a review from hzxa21 December 19, 2022 06:52
@yezizp2012
Member

yezizp2012 commented Dec 19, 2022

I've added a new system schema "rw_catalog" in cea5686. Would you PTAL @yezizp2012 @HuaHuaY?

I think it's necessary to add some details about it in the Documentation section and tag this PR with user-facing-changes.

@zwang28 zwang28 added the user-facing-changes Contains changes that are visible to users label Dec 19, 2022
# Conflicts:
#	src/batch/src/executor/join/distributed_lookup_join.rs
Collaborator
@hzxa21 hzxa21 left a comment

LGTM! Thanks for the PR!

@zwang28 zwang28 added this to the release-0.1.15 milestone Dec 19, 2022
@mergify mergify bot merged commit 829df13 into main Dec 19, 2022
@mergify mergify bot deleted the wangzheng/query_backup branch December 19, 2022 10:16

Labels

A-frontend Area: Protocol, parsing, binder. A-storage Area: Storage. type/feature Type: New feature. user-facing-changes Contains changes that are visible to users
