-
Notifications
You must be signed in to change notification settings - Fork 193
Hierarchical forecasting notebook #664
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Check out this pull request on See visual diffs & provide feedback on Jupyter Notebooks. Powered by ReviewNB |
795b9cf
to
163ac6e
Compare
/gcbrun |
dfbcafe
to
d486fb4
Compare
go/gcbrun |
running test > 4 days |
/gcbrun |
09c92fc
to
785399c
Compare
View / edit / reply to this conversation on ReviewNB ivanmkc commented on 2022-07-07T00:09:11Z Are all of these optimization_objectives applicable for time-series forecasting? reznitskii commented on 2022-07-15T15:28:44Z Forecasting optimization objectives are given here: https://cloud.google.com/vertex-ai/docs/tabular-data/forecasting/train-model#optimization-objectives
Looks like all three of the objectives you have here are valid.
Why does the text say AutoMLTabularTrainingJob but the code says AutoMLForecastingTrainingJob? I think the text needs to be cleaned up, variable optimization_prediction_type is not set in the code, variable column_transformations is called column_specs in the code. ivanmkc commented on 2022-07-18T21:16:14Z Ah yes, it's old text. Will update. |
View / edit / reply to this conversation on ReviewNB ivanmkc commented on 2022-07-07T00:09:12Z I'm unsure how to demonstrate window_column, window_stride_length and window_max_count for this use case. reznitskii commented on 2022-07-15T15:35:14Z If I understand correctly, the rolling window feature is distinct and separate from hierarchical forecasting. We could add a demonstration of it to the primary forecasting notebook: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/sdk_automl_tabular_forecasting_batch.ipynb ivanmkc commented on 2022-07-18T21:27:10Z Are they mutually exclusive? reznitskii commented on 2022-07-19T14:37:18Z No, not as far as I know. Above, you were concerned that you didn't know how to demonstrate "window_column, window_stride_length and window_max_count for this use case". I'm saying you don't need to worry about it in the context of this notebook. ivanmkc commented on 2022-07-22T18:26:51Z Got it, thanks! |
View / edit / reply to this conversation on ReviewNB ivanmkc commented on 2022-07-07T00:09:13Z Line #28. # hierarchy_temporal_total_weight=_TEST_HIERARCHY_TEMPORAL_TOTAL_WEIGHT, I can't cover anything in one notebook but any suggestions on what to do with the hierarchical temporal parameters? reznitskii commented on 2022-07-15T15:39:13Z Is it possible to demonstrate the full set of hierarchical forecasting features within a single notebook? If not and we choose to demonstrate just reznitskii commented on 2022-07-15T16:00:47Z Is it possible to include a picture that demonstrates the hierarchy we use here? Is "product_type" a subset of "product_category", which is a subset of "product"? ivanmkc commented on 2022-07-18T21:27:58Z I'm not sure how to use
Maybe Andrew Leach can help here. AndrewLeach commented on 2022-07-19T00:05:04Z What hierarchical configuration to use depends on what the goal is. Are there product category level metrics we are inspecting (group weight with product category as group column)? Are there season total metrics we are inspecting (temporal weight with no group columns)? Are we checking the overall bias (group weight with no group columns)? |
View / edit / reply to this conversation on ReviewNB ivanmkc commented on 2022-07-07T00:09:14Z I plot some error metrics to show prediction performance.
1) I need suggestions on how to show that hierarchical forecasting is working better than non-hierarchical forecasting.
2) Should I show a hierarchical and non-hierachical model (training both) and compare the error against each other? This may be out-of-scope for the notebook but would be a nice addition to docs. reznitskii commented on 2022-07-15T15:44:31Z 1 - Have you talked to Andrew Leach?
2 - Yes, it would be good to compare the results of the hierarchical and the non-hierarchical model, but can we just load the result of the non-hierarchical model for this dataset from somewhere? I would prefer to avoid creating / training the the non-hierarchical model here. ivanmkc commented on 2022-07-18T21:31:42Z
reznitskii commented on 2022-09-06T21:15:26Z
I think comparing with the ground truth is not so meaningful here, the point is to justify doing model training with the additional hierarchical component. ivanmkc commented on 2022-11-29T22:03:07Z Good point. I will remove the ground truth comparison. ivanmkc commented on 2022-11-29T22:40:30Z I think the comparison is out of scope as we move towards notebooks that focus on showcasing how to use the API. |
View / edit / reply to this conversation on ReviewNB ivanmkc commented on 2022-07-07T00:11:07Z TODO: Describe the data in more detail. |
View / edit / reply to this conversation on ReviewNB ivanmkc commented on 2022-07-07T00:11:08Z Line #2. DATASET_URI = 'gs://ivanmkc-test2/sales_forecasting_full.csv' Will point to proper dataset later. reznitskii commented on 2022-07-15T15:21:52Z Do we want to complete the TODO before making the notebook public / official? ivanmkc commented on 2022-07-18T21:14:52Z Yes, we will have to. |
Do we want to complete the TODO before making the notebook public / official? View entire conversation on ReviewNB |
Forecasting optimization objectives are given here: https://cloud.google.com/vertex-ai/docs/tabular-data/forecasting/train-model#optimization-objectives
Looks like all three of the objectives you have here are valid.
Why does the text say AutoMLTabularTrainingJob but the code says AutoMLForecastingTrainingJob? I think the text needs to be cleaned up, variable optimization_prediction_type is not set in the code, variable column_transformations is called column_specs in the code. View entire conversation on ReviewNB |
If I understand correctly, the rolling window feature is distinct and separate from hierarchical forecasting. We could add a demonstration of it to the primary forecasting notebook: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/sdk_automl_tabular_forecasting_batch.ipynb View entire conversation on ReviewNB |
Is it possible to demonstrate the full set of hierarchical forecasting features within a single notebook? If not and we choose to demonstrate just View entire conversation on ReviewNB |
1 - Have you talked to Andrew Leach?
2 - Yes, it would be good to compare the results of the hierarchical and the non-hierarchical model, but can we just load the result of the non-hierarchical model for this dataset from somewhere? I would prefer to avoid creating / training the the non-hierarchical model here. View entire conversation on ReviewNB |
Is it possible to include a picture that demonstrates the hierarchy we use here? Is "product_type" a subset of "product_category", which is a subset of "product"? View entire conversation on ReviewNB |
View / edit / reply to this conversation on ReviewNB reznitskii commented on 2022-07-15T16:04:25Z Looks like we're missing information about the dataset. ivanmkc commented on 2022-08-08T20:27:03Z Added reznitskii commented on 2022-09-06T20:05:42Z I would move the information from Tutorial->Prepare Data to this introduction: This dataset is synthesized to mimic sales pattern for a fictional outdoor gear store. There is a hierarchy between product_category, product_type and product as shown below: product_category: snow product_type: skis product: Additionally, this company has 3 store locations, each with their respective level of foot traffic. store: suburbs store: flagship store: downtown Additional season effects are present in the data.
I would also include a link to the tutorial dataset here gs://cloud-samples-data/vertex-ai/structured_data/forecasting/synthetic_sales_data.csv"
I would put "you predict a fictional store's sales based on historical sales data." into the Objective section. |
View / edit / reply to this conversation on ReviewNB reznitskii commented on 2022-07-15T16:04:26Z Can we expand on the problem we're trying to solve here? Is it something about the dataset or something about the prediction that motivates hierarchical forecasting? ivanmkc commented on 2022-08-08T20:27:24Z Adding... One second. ivanmkc commented on 2022-08-08T20:30:33Z Done. |
View / edit / reply to this conversation on ReviewNB reznitskii commented on 2022-07-15T16:04:27Z What "further treatment"? ivanmkc commented on 2022-11-29T22:40:58Z Added more details. |
View / edit / reply to this conversation on ReviewNB reznitskii commented on 2022-07-15T16:04:28Z Is this an optional step? Is there something we're looking for when we examine the columns? ivanmkc commented on 2022-07-18T21:15:38Z Just for the user to understand what the dataset looks like. |
View / edit / reply to this conversation on ReviewNB soheilazangeneh commented on 2022-11-29T22:36:44Z Do we use |
View / edit / reply to this conversation on ReviewNB soheilazangeneh commented on 2022-11-29T22:36:45Z Can we move this part a cell before the actual code? |
View / edit / reply to this conversation on ReviewNB soheilazangeneh commented on 2022-11-29T22:36:46Z A different set of params are used when |
View / edit / reply to this conversation on ReviewNB soheilazangeneh commented on 2022-11-29T22:36:47Z Maybe adding a check |
View / edit / reply to this conversation on ReviewNB soheilazangeneh commented on 2022-11-29T22:36:48Z I usually use f-sting for better readability. Just a suggestion :) |
View / edit / reply to this conversation on ReviewNB soheilazangeneh commented on 2022-11-29T22:36:49Z
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM! Only left few minor comments!
I think the comparison is out of scope as we move towards notebooks that focus on showcasing how to use the API. View entire conversation on ReviewNB |
Added more details. View entire conversation on ReviewNB |
300baaf
to
04baabf
Compare
Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot. |
This is a hierarchical forecasting notebook demonstrating the use of the new hierarchical forecasting parameters for AutoML Forecasting.
If you are opening a PR for
Official Notebooks
under the notebooks/official folder, follow this mandatory checklist:Official Notebooks
section, pointing to the author or the author's team.