Skip to content

Conversation

ivanmkc
Copy link
Contributor

@ivanmkc ivanmkc commented Jun 21, 2022

This is a hierarchical forecasting notebook demonstrating the use of the new hierarchical forecasting parameters for AutoML Forecasting.

If you are opening a PR for Official Notebooks under the notebooks/official folder, follow this mandatory checklist:

  • Use the notebook template as a starting point.
  • Follow the style and grammar rules outlined in the above notebook template.
  • Verify the notebook runs successfully in Colab since the automated tests cannot guarantee this even when it passes.
  • Passes all the required automated checks. You can locally test for formatting and linting with these instructions.
  • You have consulted with a tech writer to see if tech writer review is necessary. If so, the notebook has been reviewed by a tech writer, and they have approved it.
  • This notebook has been added to the CODEOWNERS file under the Official Notebooks section, pointing to the author or the author's team.
  • The Jupyter notebook cleans up any artifacts it has created (datasets, ML models, endpoints, etc) so as not to eat up unnecessary resources.

@ivanmkc ivanmkc requested a review from a team as a code owner June 21, 2022 21:48
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@ivanmkc ivanmkc force-pushed the imkc--hierarchical-forecasting branch from 795b9cf to 163ac6e Compare June 22, 2022 23:33
@ivanmkc
Copy link
Contributor Author

ivanmkc commented Jun 22, 2022

/gcbrun

@ivanmkc ivanmkc force-pushed the imkc--hierarchical-forecasting branch 6 times, most recently from dfbcafe to d486fb4 Compare June 23, 2022 01:50
@ivanmkc
Copy link
Contributor Author

ivanmkc commented Jun 23, 2022

go/gcbrun

@andrewferlitsch
Copy link
Contributor

running test > 4 days
Apparently Cloud build just fixed that bug in the latest release
Might have to gcbrun again to see the fix

@andrewferlitsch
Copy link
Contributor

/gcbrun

@ivanmkc ivanmkc force-pushed the imkc--hierarchical-forecasting branch from 09c92fc to 785399c Compare July 6, 2022 18:19
@review-notebook-app
Copy link

review-notebook-app bot commented Jul 7, 2022

View / edit / reply to this conversation on ReviewNB

ivanmkc commented on 2022-07-07T00:09:11Z
----------------------------------------------------------------

Are all of these optimization_objectives applicable for time-series forecasting?


reznitskii commented on 2022-07-15T15:28:44Z
----------------------------------------------------------------

Forecasting optimization objectives are given here:

https://cloud.google.com/vertex-ai/docs/tabular-data/forecasting/train-model#optimization-objectives

Looks like all three of the objectives you have here are valid.

Why does the text say AutoMLTabularTrainingJob but the code says AutoMLForecastingTrainingJob? I think the text needs to be cleaned up, variable optimization_prediction_type is not set in the code, variable column_transformations is called column_specs in the code.

ivanmkc commented on 2022-07-18T21:16:14Z
----------------------------------------------------------------

Ah yes, it's old text. Will update.

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 7, 2022

View / edit / reply to this conversation on ReviewNB

ivanmkc commented on 2022-07-07T00:09:12Z
----------------------------------------------------------------

I'm unsure how to demonstrate window_column, window_stride_length and window_max_count for this use case.


reznitskii commented on 2022-07-15T15:35:14Z
----------------------------------------------------------------

If I understand correctly, the rolling window feature is distinct and separate from hierarchical forecasting. We could add a demonstration of it to the primary forecasting notebook: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/sdk_automl_tabular_forecasting_batch.ipynb

ivanmkc commented on 2022-07-18T21:27:10Z
----------------------------------------------------------------

Are they mutually exclusive?

reznitskii commented on 2022-07-19T14:37:18Z
----------------------------------------------------------------

No, not as far as I know. Above, you were concerned that you didn't know how to demonstrate "window_column, window_stride_length and window_max_count for this use case". I'm saying you don't need to worry about it in the context of this notebook.

ivanmkc commented on 2022-07-22T18:26:51Z
----------------------------------------------------------------

Got it, thanks!

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 7, 2022

View / edit / reply to this conversation on ReviewNB

ivanmkc commented on 2022-07-07T00:09:13Z
----------------------------------------------------------------

Line #28.        # hierarchy_temporal_total_weight=_TEST_HIERARCHY_TEMPORAL_TOTAL_WEIGHT,

I can't cover anything in one notebook but any suggestions on what to do with the hierarchical temporal parameters?


reznitskii commented on 2022-07-15T15:39:13Z
----------------------------------------------------------------

Is it possible to demonstrate the full set of hierarchical forecasting features within a single notebook? If not and we choose to demonstrate just group_temporal_total_weight,I think we should be clear about this from the beginning, in the tutorial introduction.

reznitskii commented on 2022-07-15T16:00:47Z
----------------------------------------------------------------

Is it possible to include a picture that demonstrates the hierarchy we use here?

Is "product_type" a subset of "product_category", which is a subset of "product"?

ivanmkc commented on 2022-07-18T21:27:58Z
----------------------------------------------------------------

I'm not sure how to use group_temporal_total_weight.

Maybe Andrew Leach can help here.

AndrewLeach commented on 2022-07-19T00:05:04Z
----------------------------------------------------------------

What hierarchical configuration to use depends on what the goal is. Are there product category level metrics we are inspecting (group weight with product category as group column)? Are there season total metrics we are inspecting (temporal weight with no group columns)? Are we checking the overall bias (group weight with no group columns)?

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 7, 2022

View / edit / reply to this conversation on ReviewNB

ivanmkc commented on 2022-07-07T00:09:14Z
----------------------------------------------------------------

I plot some error metrics to show prediction performance.

1) I need suggestions on how to show that hierarchical forecasting is working better than non-hierarchical forecasting.

2) Should I show a hierarchical and non-hierachical model (training both) and compare the error against each other? This may be out-of-scope for the notebook but would be a nice addition to docs.


reznitskii commented on 2022-07-15T15:44:31Z
----------------------------------------------------------------

1 - Have you talked to Andrew Leach?

2 - Yes, it would be good to compare the results of the hierarchical and the non-hierarchical model, but can we just load the result of the non-hierarchical model for this dataset from somewhere? I would prefer to avoid creating / training the the non-hierarchical model here.

ivanmkc commented on 2022-07-18T21:31:42Z
----------------------------------------------------------------

  1. Yes, waiting on a review from him.
  2. We could but it'd be easier to maintain the code by including it here as we'd avoid having to save the state in another place and keep it in sync.

reznitskii commented on 2022-09-06T21:15:26Z
----------------------------------------------------------------

  1. Any update here?
  2. I'm not sure what you mean. Can we assume that the dataset is going to be fixed and the non-hierarchical model training algorithm will be fixed?

I think comparing with the ground truth is not so meaningful here, the point is to justify doing model training with the additional hierarchical component.

ivanmkc commented on 2022-11-29T22:03:07Z
----------------------------------------------------------------

Good point. I will remove the ground truth comparison.

ivanmkc commented on 2022-11-29T22:40:30Z
----------------------------------------------------------------

I think the comparison is out of scope as we move towards notebooks that focus on showcasing how to use the API.

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 7, 2022

View / edit / reply to this conversation on ReviewNB

ivanmkc commented on 2022-07-07T00:11:07Z
----------------------------------------------------------------

TODO: Describe the data in more detail.


@review-notebook-app
Copy link

review-notebook-app bot commented Jul 7, 2022

View / edit / reply to this conversation on ReviewNB

ivanmkc commented on 2022-07-07T00:11:08Z
----------------------------------------------------------------

Line #2.    DATASET_URI = 'gs://ivanmkc-test2/sales_forecasting_full.csv'

Will point to proper dataset later.


reznitskii commented on 2022-07-15T15:21:52Z
----------------------------------------------------------------

Do we want to complete the TODO before making the notebook public / official?

ivanmkc commented on 2022-07-18T21:14:52Z
----------------------------------------------------------------

Yes, we will have to.

Copy link
Contributor

Do we want to complete the TODO before making the notebook public / official?


View entire conversation on ReviewNB

Copy link
Contributor

Forecasting optimization objectives are given here:

https://cloud.google.com/vertex-ai/docs/tabular-data/forecasting/train-model#optimization-objectives

Looks like all three of the objectives you have here are valid.

Why does the text say AutoMLTabularTrainingJob but the code says AutoMLForecastingTrainingJob? I think the text needs to be cleaned up, variable optimization_prediction_type is not set in the code, variable column_transformations is called column_specs in the code.


View entire conversation on ReviewNB

Copy link
Contributor

If I understand correctly, the rolling window feature is distinct and separate from hierarchical forecasting. We could add a demonstration of it to the primary forecasting notebook: https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/automl/sdk_automl_tabular_forecasting_batch.ipynb


View entire conversation on ReviewNB

Copy link
Contributor

Is it possible to demonstrate the full set of hierarchical forecasting features within a single notebook? If not and we choose to demonstrate just group_temporal_total_weight,I think we should be clear about this from the beginning, in the tutorial introduction.


View entire conversation on ReviewNB

Copy link
Contributor

1 - Have you talked to Andrew Leach?

2 - Yes, it would be good to compare the results of the hierarchical and the non-hierarchical model, but can we just load the result of the non-hierarchical model for this dataset from somewhere? I would prefer to avoid creating / training the the non-hierarchical model here.


View entire conversation on ReviewNB

Copy link
Contributor

Is it possible to include a picture that demonstrates the hierarchy we use here?

Is "product_type" a subset of "product_category", which is a subset of "product"?


View entire conversation on ReviewNB

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 15, 2022

View / edit / reply to this conversation on ReviewNB

reznitskii commented on 2022-07-15T16:04:25Z
----------------------------------------------------------------

Looks like we're missing information about the dataset.


ivanmkc commented on 2022-08-08T20:27:03Z
----------------------------------------------------------------

Added

reznitskii commented on 2022-09-06T20:05:42Z
----------------------------------------------------------------

I would move the information from Tutorial->Prepare Data to this introduction:

This dataset is synthesized to mimic sales pattern for a fictional outdoor gear store. There is a hierarchy between product_category, product_type and product as shown below:
product_category: snow
product_type: skis
product:
Additionally, this company has 3 store locations, each with their respective level of foot traffic.
store: suburbs
store: flagship
store: downtown
Additional season effects are present in the data.

I would also include a link to the tutorial dataset here

gs://cloud-samples-data/vertex-ai/structured_data/forecasting/synthetic_sales_data.csv"

I would put "you predict a fictional store's sales based on historical sales data." into the Objective section.

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 15, 2022

View / edit / reply to this conversation on ReviewNB

reznitskii commented on 2022-07-15T16:04:26Z
----------------------------------------------------------------

Can we expand on the problem we're trying to solve here? Is it something about the dataset or something about the prediction that motivates hierarchical forecasting?


ivanmkc commented on 2022-08-08T20:27:24Z
----------------------------------------------------------------

Adding... One second.

ivanmkc commented on 2022-08-08T20:30:33Z
----------------------------------------------------------------

Done.

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 15, 2022

View / edit / reply to this conversation on ReviewNB

reznitskii commented on 2022-07-15T16:04:27Z
----------------------------------------------------------------

What "further treatment"?


ivanmkc commented on 2022-11-29T22:40:58Z
----------------------------------------------------------------

Added more details.

@review-notebook-app
Copy link

review-notebook-app bot commented Jul 15, 2022

View / edit / reply to this conversation on ReviewNB

reznitskii commented on 2022-07-15T16:04:28Z
----------------------------------------------------------------

Is this an optional step? Is there something we're looking for when we examine the columns?


ivanmkc commented on 2022-07-18T21:15:38Z
----------------------------------------------------------------

Just for the user to understand what the dataset looks like.

@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

soheilazangeneh commented on 2022-11-29T22:36:44Z
----------------------------------------------------------------

Do we use Tutorial keyword in our notebooks?


@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

soheilazangeneh commented on 2022-11-29T22:36:45Z
----------------------------------------------------------------

Can we move this part a cell before the actual code?


@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

soheilazangeneh commented on 2022-11-29T22:36:46Z
----------------------------------------------------------------

A different set of params are used when run method is called. Does this text need to be updated accordingly?


@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

soheilazangeneh commented on 2022-11-29T22:36:47Z
----------------------------------------------------------------

Maybe adding a check if len(list(model_evaluations)) > 0 then do the rest?


@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

soheilazangeneh commented on 2022-11-29T22:36:48Z
----------------------------------------------------------------

I usually use f-sting for better readability. Just a suggestion :)


@review-notebook-app
Copy link

View / edit / reply to this conversation on ReviewNB

soheilazangeneh commented on 2022-11-29T22:36:49Z
----------------------------------------------------------------

accelerator_type and accelerator_count are not passed. Do we still need to mention them here?


Copy link
Contributor

@soheilazangeneh soheilazangeneh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Only left few minor comments!

Copy link
Contributor Author

ivanmkc commented Nov 29, 2022

I think the comparison is out of scope as we move towards notebooks that focus on showcasing how to use the API.


View entire conversation on ReviewNB

Copy link
Contributor Author

ivanmkc commented Nov 29, 2022

Added more details.


View entire conversation on ReviewNB

@ivanmkc ivanmkc force-pushed the imkc--hierarchical-forecasting branch from 300baaf to 04baabf Compare November 29, 2022 22:43
@ivanmkc ivanmkc added the automerge Summon MOG for automerging label Nov 29, 2022
@gcf-merge-on-green
Copy link
Contributor

Merge-on-green attempted to merge your PR for 6 hours, but it was not mergeable because either one of your required status checks failed, one of your required reviews was not approved, or there is a do not merge label. Learn more about your required status checks here: https://help.github.com/en/github/administering-a-repository/enabling-required-status-checks. You can remove and reapply the label to re-run the bot.

@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Summon MOG for automerging label Nov 30, 2022
@ivanmkc ivanmkc added the automerge Summon MOG for automerging label Nov 30, 2022
@gcf-merge-on-green gcf-merge-on-green bot merged commit 3185ca7 into main Nov 30, 2022
@gcf-merge-on-green gcf-merge-on-green bot removed the automerge Summon MOG for automerging label Nov 30, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants