
Conversation

@fuyufjh
Contributor

@fuyufjh fuyufjh commented Jan 12, 2023

@Gun9niR
Contributor

Gun9niR commented Jan 12, 2023

What if some system configs are required on cluster start (e.g. the hummock URL)?

@fuyufjh
Contributor Author

fuyufjh commented Jan 12, 2023

What if some system configs are required on cluster start (e.g. the hummock URL)?

Delay initialization until the parameters are fetched from Meta, or set it afterwards 😇 I have no better ideas than these.

@Gun9niR
Contributor

Gun9niR commented Jan 12, 2023

Delay initialization until the parameters are fetched from Meta

It works for worker types other than Meta, but what about Meta itself? How should these configs be initialized? Maybe via command-line options?

@fuyufjh
Contributor Author

fuyufjh commented Jan 12, 2023

Delay initialization until the parameters are fetched from Meta

It works for worker types other than Meta, but what about Meta itself? How should these configs be initialized? Maybe via command-line options?

How about setting it before any compute node joins the cluster?

@CAJan93
Contributor

CAJan93 commented Jan 12, 2023

Regarding the decision where to put an argument: If e.g. the argument relates to a RW instance, how do we decide if we put it into the SQL interface or the web console? Or will we always offer both options?

Same for process configs: When do we use a CLI arg and when the .toml file?

@xxchan
Contributor

xxchan commented Jan 12, 2023

how do we decide if we put it into the SQL interface or the web console?

When do we use a CLI arg and when the .toml file?

IIUC it's layered config:

  • web console is a subset of SQL variables.
  • CLI args are a subset of the config file.

When some configs are frequently used and modified, we can provide them in the upper layer interface for ease of use.

```
system --> cloud(Web Console)
```

### System Configs
Contributor

@xxchan xxchan Jan 12, 2023

I think we need to clarify: Are system configs distinct from config files? From my understanding of the motivation it's yes. A related question is: If so, the default values for them (i.e., the values at startup) cannot be changed, right? 🤔

FWIW, since you referred to PG: in PG they are the same set of configs:

These variables can be set using the SET statement, by editing the postgresql.conf configuration file, through the PGOPTIONS environmental variable (when using libpq or a libpq-based application), or through command-line flags when starting the postgres server.

Contributor Author

@fuyufjh fuyufjh Jan 12, 2023

Things are easier for single-node systems like PG/MySQL. Here we need to either 1) move all configs to the meta service, or 2) divide them into 2 sets: local and global.

I chose 2) because local configs seem impossible to avoid. For example, --memory must be a local parameter.

Contributor

@xxchan xxchan left a comment

The ideas generally LGTM, but they need some clarifications, as commented.

@fuyufjh
Contributor Author

fuyufjh commented Jan 12, 2023

Regarding the decision where to put an argument: If e.g. the argument relates to a RW instance, how do we decide if we put it into the SQL interface or the web console? Or will we always offer both options?

Same for process configs: When do we use a CLI arg and when the .toml file?

SQL interface and web console control exactly the same set of parameters. Web console is just an optional GUI.

CLI args are a subset of the .toml file, containing the frequently-used ones, e.g. --parallelism

@hzxa21

hzxa21 commented Jan 13, 2023

related draft PR: risingwavelabs/risingwave#7273

@Gun9niR
Contributor

Gun9niR commented Jan 13, 2023

related draft PR: risingwavelabs/risingwave#7273

This PR was drafted before this RFC, so it only includes a very small part of it. Essentially, for demo purposes, it moves one config, state_store, from process config to system parameters and persists it to etcd. When other nodes join the cluster, they will pull the state store URL from the meta node. For backward compatibility, the meta node will verify the state_store option from a compute node against its own, and reject the node join if they differ.
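For illustration, a minimal sketch of the verification described above, with hypothetical types (not the actual PR code):

```rust
/// Hypothetical sketch: meta persists `state_store` as a system parameter and
/// rejects a joining worker whose locally configured value (if any) disagrees.
struct SystemParams {
    state_store: String, // e.g. a hummock+s3 URL
}

enum JoinError {
    MismatchedStateStore { persisted: String, requested: String },
}

fn verify_node_join(
    persisted: &SystemParams,
    requested_state_store: Option<&str>, // passed for backward compatibility
) -> Result<(), JoinError> {
    match requested_state_store {
        // The node still passes `--state-store`: it must match the persisted value.
        Some(requested) if requested != persisted.state_store => {
            Err(JoinError::MismatchedStateStore {
                persisted: persisted.state_store.clone(),
                requested: requested.to_string(),
            })
        }
        // Otherwise accept; the node pulls the persisted value from meta on registration.
        _ => Ok(()),
    }
}
```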


### Mutability

- Process configs must be **immutable**, because there is no user-interface to change it.
Contributor

I'm wondering about the case where a user has been running some workload for a while, and realizes that a lower/higher parallelism would better utilize their resources.

In this scenario, parallelism is a process-level config, and is immutable.
What should the user do if they want to adjust it? Do they have to start a new cluster?
Or is this not a realistic scenario?

Contributor Author

@fuyufjh fuyufjh Jan 13, 2023

What should the user do if they want to adjust it? Do they have to start a new cluster?

In that case, they can restart the CN process (or start a new one; they are actually equivalent) with the new parallelism config. No need to restart the whole cluster.


Parallelism is actually a very special case compared to other parameters. Let me tell the whole story in my mind:

Overall, we have

  • Process-level config to control the parallelism of one particular CN node
  • System parameter parallelism as the default parallelism for new streaming jobs. The default value is 0 or 'auto', which means new streaming jobs are supposed to use all available parallel units across all compute nodes.
    • Sure, it can be session-level as well.

When the system boots up:

  1. Each compute node registers itself with the Meta node on startup, including its parallelism, which is set by the parallelism config or defaults to the number of physical CPU cores.
  2. The Meta service aggregates these numbers to get the total parallelism, or, as we call it, the total number of parallel units.
    • Note: A "parallel unit" is basically a worker corresponding to one CPU core, but it's logical and is not strictly bound to a CPU core.
  3. If the system parameter parallelism is set to 0 or 'auto', we use that "total parallelism" from step 2 as the effective value to plan/schedule/run streaming job actors. Otherwise, the user-set value is used (see the sketch below).

If the cluster changes (e.g. new nodes join or existing nodes leave), only the "total parallelism" in step 2 is affected. As a result, existing streaming jobs are not affected and keep their original parallelism. This is expected, because we rely on the K8s operator or a human operator to redistribute the existing streaming actors instead of doing it automatically.
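A minimal sketch of the resolution in step 3, using hypothetical names (not the actual RisingWave code):

```rust
/// Hypothetical sketch: each compute node reports its own parallelism, meta sums
/// them into the total number of parallel units, and the system parameter
/// `parallelism` either selects that total (0/'auto') or overrides it.
fn node_parallelism(configured: Option<usize>) -> usize {
    // Per-node value: the `parallelism` config, or the number of CPU cores by default.
    configured.unwrap_or_else(|| {
        std::thread::available_parallelism().map(|n| n.get()).unwrap_or(1)
    })
}

fn effective_parallelism(system_parallelism: &str, per_node: &[usize]) -> usize {
    let total: usize = per_node.iter().sum(); // total parallel units across all CNs
    match system_parallelism {
        "0" | "auto" => total,                         // use all available parallel units
        user_set => user_set.parse().unwrap_or(total), // user-set value wins
    }
}
```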

@jon-chuang jon-chuang Jan 27, 2023

Basically, is it correct to say that stream_parallelism is only a meaningful config for dev/on-prem, to set a stream-job-level default, and that those using cloud, and even the cloud team, should never have to care about it, thanks to the operator’s ability to auto-redistribute?

Btw, I’m quite unclear, but do we currently have an API for scaling individual fragments? Is this API an SQL API, or is it an HTTP/gRPC endpoint to meta?

@jon-chuang jon-chuang Jan 27, 2023

Also, is there a way for the user to hint the starting parallelism? For instance, even if the cloud may auto-distribute, it is unclear whether it is better to start with a small or a high parallelism for a given stream job. The user may know better.

So something like create materialized view (stream_job_parallelism_hint=3) … when the user thinks the stream job requires small throughput.

It seems similar to the session variable stream_parallelism, but to me it is unclear what this variable does if we are going to auto-scale (does it override the autoscaling, etc.?), whereas a hint is at least clearer in that the parallelism may be modified by the system or by an external operator.

Contributor Author

@fuyufjh fuyufjh Jan 27, 2023

Btw, I’m quite unclear, but do we currently have an API for scaling individual fragments

Yes, see the design doc at https://singularity-data.quip.com/RubDA4a51oS0/Low-level-Scaling-Interface-on-Meta-Service

So something like create materialized view (stream_job_parallelism_hint=3) … when the user thinks the stream job requires small throughput.

That's exactly what the session variable streaming_parallelism does. See risingwavelabs/risingwave#7370.

but to me it is unclear what this variable does if we are going to auto-scale, whereas a hint is at least clearer in that the parallelism may be modified by the system or by an external operator.

I understand this seems to be somewhat inconsistent. In the current design, streaming_parallelism only applies for the subsequent CREATE MVIEW command in the same session. The syntax you proposed LGTM as well. After the MV is created, the only way to manipulate its parallelism is via the Scaling API (the doc above).

The reason is that we consider scaling an advanced operational feature that should be part of the cloud platform instead of the database kernel, especially the policy, i.e. how and when to do scaling. Hence, we divide the feature "scaling" into 2 parts: 1) the cloud platform makes decisions about how & when, and then 2) the kernel simply executes these directives. That's the design philosophy behind that API. However, the only exception is creating an MView, during which meta must decide the parallelism without the cloud platform being involved. In that case, I think the best we can do is to use stream_parallelism for each fragment.


Can we say that local or process configs are immutable during runtime to avoid confusion? Because users can always kill and restart a process with different config values.

because there is no user-interface to change it.

That won't be true if the cloud portal supports updating with a reboot.

@Gun9niR
Contributor

Gun9niR commented Jan 13, 2023

For process config, should we have a separate config file (template) for each type of node, or add more sections in risingwave.toml, such as:

```toml
# Below sections may apply to multiple worker types, but the user does not need to know the details.
[server]
...
[meta]
...
[storage]
...

# The sections with the name of the worker type should apply ONLY to that worker type.
[compute_node]
# Put the entries from `ComputeNodeOpts` here
[compactor]
# Put the entries from `CompactorOpts` here
```

I was just reviewing the command-line arguments for each type of node, and realized they differ a lot. If we put all those arguments into one config file without adding more sections, it will be very cluttered, and the subset of configs that takes effect will vary by worker type.

@fuyufjh
Contributor Author

fuyufjh commented Jan 13, 2023

related draft PR: risingwavelabs/risingwave#7273

This PR was drafted before this RFC, so it only includes a very small part of it. Essentially, for demo purposes, it moves one config, state_store, from process config to system parameters and persists it to etcd. When other nodes join the cluster, they will pull the state store URL from the meta node. For backward compatibility, the meta node will verify the state_store option from a compute node against its own, and reject the node join if they differ.

Such a brilliant solution!

@fuyufjh
Contributor Author

fuyufjh commented Jan 13, 2023

For process config, should we have a separate config file (template) for each type of node, or add more sections in risingwave.toml

LGTM for adding more sections. Multiple files may be troublesome for deployment.


### System Parameters

What we need is a centralized source-of-truth of cluster configurations. Obviously, the best choice for RisingWave is the Meta Service and the underlying etcd.
Member

@BugenZhao BugenZhao Jan 13, 2023

I'm afraid it's not a good idea to persist the configuration to the meta store... Here're several reasons:

  • Meta store is not designed to be directly manipulated by the user. If one has corrupted some configuration with SET and finds that the cluster does not boot anymore, he can do nothing about this.
  • Maintaining persisted configuration and consistency with the clap definition can be really hard, especially for version compatibility (some famous HTAP database has had a lot of production accidents due to this).
  • It seems not convincing to persist configuration like data_directory... what if the user really moves the data? Note that the cluster will immediately enter the recovery mode on startup, without giving a chance to the users to set a new data_directory.

Contributor Author

Hmmm. These problems do exist, but persisting parameters (instead of losing all changes on reboot) is also a must-have feature for a distributed system.

Meta store is not designed to be directly manipulated by the user. If one has corrupted some configuration with SET and finds that the cluster does not boot anymore, he can do nothing about this.

We need to equip the operation tool with the ability to fix it. For example, support a new risectl command, set <parameter> to <value>, to manipulate etcd directly.

Maintaining persisted configuration and consistency with the clap definition can be really hard, especially for version compatibility (some famous HTAP database has had a lot of production accidents due to this).

clap definitions must refer to the CLI arguments, right? They are unrelated to system parameters in most cases.

It seems not convincing to persist configuration like data_directory... what if the user really moves the data? Note that the cluster will immediately enter the recovery mode on startup, without giving a chance to the users to set a new data_directory.

By "persist" I mean to be recoverable after rebooting. It can be mutable, as described in section "mutability".

Member

@BugenZhao BugenZhao Jan 13, 2023

clap definitions must refer to the CLI arguments, right?

Or serde? 🥺 If we take a config file as the initialization of these parameters, then we need to have a separate definition of its schema in the code (or we can make it a dynamic HashMap).

It can be mutable, as described in section "mutability".

Yes, but it cannot be mutated without a normal running cluster or risectl mentioned above. As the old data_directory is invalid, we cannot survive the recovery phase.

but persisting parameters (instead of losing all changes on reboot) is also a must-have feature for a distributed system.

I'm thinking if we treat all SETs to be temporary and just for the users to give it a try, users can always write down the desired config in a new toml file and reboot the cluster to apply it. 🤔 I cannot remember well but I guess this is also the Flink way?

Contributor Author

@fuyufjh fuyufjh Jan 13, 2023

Or serde? 🥺 If we take a config file as the initialization of these parameters, then we need to have a separate definition of its schema in the code (or we can make it a dynamic HashMap).

In my mind, the system parameters are all simple strings or integers so se/deserialization won't be a big issue.

Yes, but it cannot be mutated without a normal running cluster or risectl mentioned above. As the old data_directory is invalid, we cannot survive the recovery phase.

Perhaps it also needs that operational tool: "We need to equip the operation tool with the ability to fix it" (mentioned in previous comment)

Anyway, data_directory is very special in many aspects: it's required to be set; it can hardly be changed while RisingWave is running; it needs to be mutated offline. Things may get a little dirty for it.

I'm thinking if we treat all SETs to be temporary and just for the users to give it a try, users can always write down the desired config in a new toml file and reboot the cluster to apply it. 🤔 I cannot remember well but I guess this is also the Flink way?

Hmmm. Interesting. It's indeed a popular way in big data systems.

That's basically a centralized but non-persistent config store. It makes sense, but since we have a built-in metadata database, why not just persist it? I think in most cases this looks better (except data_dir, I know 😇)

Member

@BugenZhao BugenZhao Jan 13, 2023

This acts kind of like a mobile app, and we need to provide several mechanisms to make it work, like entering an "emergency mode" and providing fixing utilities if it frequently fails, or guiding users to contact customer service for help. 😄

I do agree we should clarify the difference between "process" configs and "cluster" configs, but I'm still doubtful about the possible issues and complexities of persisting them. I feel like we are taking on (possibly unnecessary) responsibility that could otherwise belong to the user or DevOps. 😂

Contributor

IMHO ConfigMaps may actually be a good solution here. I am not sure how we would make this work in the risedev environment. Btw: Under the hood CMs, like all K8s objects, are also stored in ETCD.

Meta store is not designed to be directly manipulated by the user. If one has corrupted some configuration with SET and finds that the cluster does not boot anymore, he can do nothing about this.

Do you think it would help to version the configurations? Maybe this would make it simpler to fall back to a previous version if an update to the settings broke something. ETCD supports a --prefix operation, so you could retrieve all values belonging to a setting independent of the timestamp, like this:

```shell
etcdctl put some_setting_23_01_27_13_22 val1 # original value
etcdctl put some_setting_23_01_28_09_00 val2 # updated value
etcdctl get some_setting --prefix # retrieve all values
some_setting_23_01_27_13_22 val1
some_setting_23_01_28_09_00 val2
```

Contributor

@xxchan xxchan Jan 27, 2023

Do you think it would help to version the configurations

IMHO It's worth a separate RFC 🫣

Contributor

Sounds good. But I believe first we would have to decide on how to store the configs, before deciding if/how we want to version them.

Member

+1. If we really want to persist them to the meta store, we should think carefully about how to store them. For example, we could introduce something like the "field number" in protobuf, so that we can handle backward compatibility more easily.
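For illustration, a minimal sketch of what field-number-style tagging could look like with prost-annotated Rust (hypothetical fields; assumes the `prost` crate, not the actual schema):

```rust
// Hypothetical sketch: give each persisted system parameter a stable field
// number (as protobuf does), so renaming a parameter or adding a new one does
// not break decoding of values persisted by an older version.
#[derive(Clone, PartialEq, ::prost::Message)]
pub struct PersistedSystemParams {
    #[prost(string, tag = "1")]
    pub data_directory: ::prost::alloc::string::String,
    #[prost(uint32, tag = "2")]
    pub heartbeat_interval_ms: u32,
    // New parameters get new tags; old tags are never reused.
    #[prost(string, optional, tag = "3")]
    pub state_store: ::core::option::Option<::prost::alloc::string::String>,
}
```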


Well, I totally agree with centralized system-wide parameters, no matter where they are stored, because it's painful for the operator to dispatch those global vars like data_directory now 👍. IMO, they can be stored in

  • etcd, as part of the metadata, because we may want to change them afterward
  • a local and consistent file mounted on the meta nodes, so that the meta node can be the single source of truth, as all of the other components contact meta before they get started

Similar to other system metadata e.g. catalog,

- Changes to global parameters should be propagated to all nodes in the cluster, ideally in a deterministic order
- When a node reboots or recovers, the latest snapshot of parameters should be passed from Meta Service.
Member

Could you please elaborate more on the initialization of the "snapshot" (in the meta store)? For example,

  • Do we still have a configuration file for the cluster? If so, do we only persist the configuration on the first run, or do we also detect changes in the configuration file on restarts?
  • Do we persist the default values of the entries that do not appear in .toml? We should be cautious about some default values that are "computed" instead of constants, as persisting them may be confusing.

Contributor Author

@fuyufjh fuyufjh Jan 13, 2023

Just to avoid misunderstanding: the "snapshot" here refers to the current system parameters in Meta Service at the beginning of subscription, just like the "snapshot" of the catalog or Hummock versions.

Answers to questions:

Do we still have a configuration file for the cluster? If so, do we only persist the configuration on the first run, or do we also detect changes in the configuration file on restarts?

Basically not. It may be useful for a few cases like #34 (comment), but generally, I don't think we need such a design.

Do we persist the default values of the entries that do not appear in .toml? We should be cautious about some default values that are "computed" instead of constants, as persisting them may be confusing.

Computed values (like parallelism described in #34 (comment)) should be stored as a placeholder, e.g. 'auto' or 0. Does this solve your concern?

In this way, either persisting default values or not can work well, but I still prefer not to persist default values, because that allows us to tune the defaults in newer versions of RisingWave for better performance or whatever.

Member

@BugenZhao BugenZhao Jan 13, 2023

Basically not. It may be useful for a few cases like #34 (comment), but generally, I don't think we need such a design.

Does it mean that we need to manually initialize this on the first run, before issuing any streaming jobs? It sounds not that friendly for development and playground purposes.

(Though if it's only for the state store, we can make an exception and pass it in via a meta CLI parameter as the default value.)

but I still prefer not to persist default values

Sounds like a reasonable choice! But we should still be cautious when upgrading if some users rely on the previous default values.

Contributor Author

@fuyufjh fuyufjh Jan 13, 2023

Does it mean that we need to manually initialize this on the first run, before issuing any streaming jobs? It sounds not that friendly for development and playground purposes.

IIUC, we use minio for development and playground purposes now, and that's why we can have a default value for data_directory. We can keep this behavior unchanged.

But for production deployments or the cloud platform, a unique S3 bucket must be manually specified for sure.

Member

Not only data_directory, but also state_store_url, which tells which state store backend we should use. 🤔


I don't think the data_directory will be a big problem as long as it's not mutable and the deployment tool enforces the value. What really matters is how we keep the caches on various nodes updated and how we make sure things do not break when a parameter change happens.

- Global configs might be **immutable or mutable**; the mutable ones can be further divided into those that take effect immediately and those that need a reboot.
- **Immutable** e.g. `data_directory` should be rejected upon SET
- **Mutable & need reboot** e.g. `heartbeat_interval_ms`
- **Mutable & immediately** e.g. `batch_chunk_size`
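A minimal sketch of how this classification could be modeled, with hypothetical names (not the actual RFC code):

```rust
// Hypothetical sketch of the mutability classes above.
enum Mutability {
    Immutable,         // e.g. data_directory: reject SET
    MutableNeedReboot, // e.g. heartbeat_interval_ms
    MutableImmediate,  // e.g. batch_chunk_size
}

fn mutability_of(param: &str) -> Mutability {
    match param {
        "data_directory" => Mutability::Immutable,
        "heartbeat_interval_ms" => Mutability::MutableNeedReboot,
        "batch_chunk_size" => Mutability::MutableImmediate,
        // Conservative default, for this sketch only.
        _ => Mutability::MutableNeedReboot,
    }
}
```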
Member

I guess there're more things to consider in defining "immediately". As Flink says,

Because options are read at different point in time when performing operations, it is recommended to set configuration options early after instantiating a table environment.

If we only read the configuration on scheduling (of batch queries...), then it sounds not that "immediate" 😄 but more like a session variable that only affects future queries. If we read it continuously, we may need to consider whether it'll lead to unexpected behavior if the config can be modified at any time.

Contributor Author

Exactly. The boundary seems obscure for these cases. I just collectively call them "immediately" to distinguish them from the need-reboot ones.

@Gun9niR
Contributor

Gun9niR commented Jan 13, 2023

For process config, should we have a separate config file (template) for each type of node, or add more sections in risingwave.toml

LGTM for adding more sections. Multiple files may be troublesome for deployment.

Is it a good idea to put configs that are bound to differ among nodes, like the host address, in the same config file 🤔?

@fuyufjh
Contributor Author

fuyufjh commented Jan 13, 2023

For process config, should we have a separate config file (template) for each type of node, or add more sections in risingwave.toml

LGTM for adding more sections. Multiple files may be troublesome for deployment.

Is it a good idea to put configs that are bound to differ among nodes, like the host address, in the same config file 🤔?

Then use CLI args instead, if it makes you feel better 🤣

@Gun9niR
Contributor

Gun9niR commented Jan 13, 2023

For process config, should we have a separate config file (template) for each type of node, or add more sections in risingwave.toml

LGTM for adding more sections. Multiple files may be troublesome for deployment.

Is it a good idea to put configs that are bound to differ among nodes, like the host address, in the same config file 🤔?

Then use CLI args instead, if it makes you feel better 🤣

I agree CLI args are more suitable. So these configs shouldn't be written in the template config file (although users can surely add them in their own files), right? Because there's no appropriate default value.


Similar to other system metadata e.g. catalog,

- Changes to global parameters should be propagated to all nodes in the cluster, ideally in a deterministic order


Considering the nature of distributed systems, there will be cases where multiple values of the same parameter exist on various nodes, no matter how we optimize the notification mechanism. Will we introduce things like what is described in the Online Schema Change paper to ensure consistency and atomicity?

Contributor

@CAJan93 CAJan93 Jan 30, 2023

Do you mean Online, Asynchronous Schema Change in F1 by Rae et al.? As I understand this summary they basically use 2PC to change schemas, right?

Maybe we could rely on ETCD (with or without abstracting via a K8s CM), since it offers strong consistency? A 2PC using ETCD could look like this, maybe:

  1. Client issues settings change against system
  2. ETCD changes configs
  3. Pods are alerted about config change (if we use CM, K8s would do that for us)
  4. Meta (or some other components) asks each pod if it is ready to apply the changes. If all reply yes, then we apply.

Just a suggestion :)
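For illustration, a rough sketch of that flow with hypothetical traits (not tied to a real etcd or K8s client API):

```rust
// Hypothetical sketch of the suggested flow: persist the proposed change,
// notify the workers, and commit only if every worker acknowledges it.
trait ConfigStore {
    fn put(&self, key: &str, value: &str);
}

trait Worker {
    fn can_apply(&self, key: &str, value: &str) -> bool;
    fn apply(&self, key: &str, value: &str);
}

fn propose_change(store: &dyn ConfigStore, workers: &[Box<dyn Worker>], key: &str, value: &str) -> bool {
    store.put(key, value); // step 2: persist the proposal (e.g. in etcd / a ConfigMap)
    // steps 3-4: every pod is alerted and asked whether it can apply the change
    if workers.iter().all(|w| w.can_apply(key, value)) {
        workers.iter().for_each(|w| w.apply(key, value)); // commit
        true
    } else {
        false // abort; rolling back the proposal is not shown here
    }
}
```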


Yep, it's the exact paper that I referred to.

Maybe we could rely on ETCD (with or without abstracting via a K8s CM), since it offers strong consistency?

🤔 Sounds like a feasible solution.


Let's rethink the current design of CLI options and config files. ([#5676](https://github.com/risingwavelabs/risingwave/issues/5676)) Given the existence of system parameters, some existing parameters may need to be moved to system parameters, like `data_directory` as mentioned before.

We propose to consider CLI options as a way to override config files (i.e. in terms of priority, CLI options -> config file -> defaults); as a result, the items in CLI options are a subset of the config file. The reasons for this include
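For illustration, a minimal sketch of this precedence with hypothetical names, assuming the clap, serde, and toml crates:

```rust
use clap::Parser;
use serde::Deserialize;

// Hypothetical sketch of the precedence: CLI options -> config file -> defaults.
#[derive(Parser)]
struct Cli {
    /// Frequently-used knob exposed on the CLI; overrides the config file when set.
    #[arg(long)]
    parallelism: Option<usize>,
    #[arg(long, default_value = "risingwave.toml")]
    config_path: String,
}

#[derive(Deserialize, Default)]
struct FileConfig {
    parallelism: Option<usize>,
}

fn main() {
    let cli = Cli::parse();
    let file: FileConfig = std::fs::read_to_string(&cli.config_path)
        .ok()
        .and_then(|s| toml::from_str(&s).ok())
        .unwrap_or_default();
    // CLI wins over the config file, which wins over the built-in default (arbitrary here).
    let parallelism = cli.parallelism.or(file.parallelism).unwrap_or(4);
    println!("effective parallelism: {parallelism}");
}
```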


Shall we also consider supporting env vars as one of the config sources? In containerized deployments, it is way easier to set an env var than update a mounted file, not to mention that RisingWave is a distributed system that makes config dispatching a mess: it's hard to maintain a lot of local config files.

Contributor

[...] not to mention that RisingWave is a distributed system that makes config dispatching a mess: it's hard to maintain a lot of local config files.

IMHO K8s ConfigMaps may help here, since dispatch and updates happen automatically. Maybe it would be a good alternative to combine CMs with env vars? A local env var could override the CM, and you could thus customise the global configs for every pod.


## Discussions

1. Do we need database parameters, global (instance-wide) parameters, or both?


🤔 Are there real use cases, or is this just a proposal? Adding levels will make the configuration complicated. I'd say that instance-wide and object (table, source, mv)-specific parameters should be enough.

Contributor

In PG there's only one set of variables, which can be set at different levels (session, role, database, instance).

object (table, source, mv)-specific parameters

But this seems similar to database parameters. IMHO we didn't consider such things in the RFC? 🤔


But this seems similar to database parameters. IMHO we didn't consider such things in the RFC? 🤔

You're right. Let's be PG-similar.

Contributor Author

Not decided yet. But anyway we can add support for database parameters after the completion of global parameters.

@fuyufjh
Contributor Author

fuyufjh commented Feb 3, 2023

Shall we store the parameters in etcd in a simple readable manner? Such as plain key-values (in human-readable strings e.g. "114514", "true"), so that we (the operation person or cloud platform) can easily initialize or modify data directly, even without Meta Service.

In this way, the global parameters can be modified more easily, which is one of the benefits of using a config file, while also getting consistency and replication among multiple nodes.
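For illustration, a minimal sketch of rendering parameters as plain, human-readable key-values (hypothetical keys, values, and key prefix):

```rust
// Hypothetical sketch: flatten system parameters into human-readable string
// key-values so an operator or the cloud platform can read/edit them directly.
fn to_plain_kv(params: &[(&str, &str)]) -> Vec<(String, String)> {
    params
        .iter()
        .map(|(k, v)| (format!("system_params/{k}"), v.to_string()))
        .collect()
}

fn main() {
    let kv = to_plain_kv(&[
        ("heartbeat_interval_ms", "1000"),
        ("batch_chunk_size", "1024"),
        ("data_directory", "my-data-dir"),
    ]);
    for (k, v) in kv {
        println!("{k} = {v}"); // e.g. system_params/heartbeat_interval_ms = 1000
    }
}
```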

@hzxa21

hzxa21 commented Feb 3, 2023

Shall we store the parameters in etcd in a simple readable manner? Such as plain key-values (in human-readable strings e.g. "114514", "true"), so that we (the operation person or cloud platform) can easily initialize or modify data directly, even without Meta Service.

In this way, the global parameters can be modified more easily, which is one of the benefits of using a config file, while also getting consistency and replication among multiple nodes.

+1

Comment on lines 72 to 84
### Session Parameters

PostgreSQL allows users to set parameters for the current session as well.

```sql
SET parameter TO value / 'value' / DEFAULT;
```

References:

- [PostgreSQL: Documentation: 15: 20.1. Setting Parameters](https://www.postgresql.org/docs/current/config-setting.html)

Previously, we had already introduced [session parameters in RisingWave](https://github.com/risingwavelabs/risingwave/blob/53f7e0db772ac7e51773791bb8301624ed763ae8/src/common/src/session_config/mod.rs#L265). Some of them should be system parameters, like `batch_enable_sort_agg` or `query_mode`. Postgres **enforces** that session parameters must also be system parameters; we can follow this rule as well.
Contributor

Session parameters seem to conflict with the definition of system parameters, since the values can vary from session to session (on different fe nodes). So I think session parameters should be out of scope for this RFC.

Contributor

@Gun9niR Gun9niR left a comment

🚀

@fuyufjh fuyufjh merged commit 2036df2 into main Mar 16, 2023
@st1page
Contributor

st1page commented Dec 12, 2023

Feedback:
In a recent POC, the user tried to upgrade a RisingWave cluster from a too-old version, and a system parameter became incompatible. So we had to alter the system parameter in etcd manually.

This really helps

Shall we store the parameters in etcd in a simple readable manner? Such as plain key-values (in human-readable strings e.g. "114514", "true"), so that we (the operation person or cloud platform) can easily initialize or modify data directly, even without Meta Service.

And I guess we need this for Ops.

Meta store is not designed to be directly manipulated by the user. If one has corrupted some configuration with SET and finds that the cluster does not boot anymore, he can do nothing about this.

We need to equip the operation tool with the ability to fix it. For example, support a new risectl command, set <parameter> to <value>, to manipulate etcd directly.
