soheilazangeneh
Contributor


This is the first draft of the AutoML tabular regression notebook, with a focus on model evaluation.


  1. If you are opening a PR for Community Notebooks under the notebooks/community folder:
  • This notebook has been added to the CODEOWNERS file under the Community Notebooks section, pointing to the author or the author's team.
  • Passes all the required formatting and linting checks. You can locally test with these instructions.

  2. If you are opening a PR for Community Content under the community-content folder:
  • Make sure your main Content Directory Name is descriptive, informative, and includes some of the key products and attributes of your content, so that it is differentiable from other content.
  • The main content directory has been added to the CODEOWNERS file under the Community Content section, pointing to the author or the author's team.
  • Passes all the required formatting and linting checks. You can locally test with these instructions.

@soheilazangeneh soheilazangeneh requested a review from a team as a code owner August 29, 2022 15:21

@@ -0,0 +1,1273 @@
{
Contributor

Need H1 title in this cell. Should be above the links.

@@ -0,0 +1,1273 @@
{
Contributor

regression model evaluation componenet -> regression model evaluation pipeline component

@@ -0,0 +1,1273 @@
{
Contributor

Replace : with . in first sentence

regression evluation component -> model evaluation pipeline component

You say 'pre-trained', but you train the model in the notebook

Add BigQuery to services

@@ -0,0 +1,1273 @@
{
Contributor

Combine this with the above cell

@@ -0,0 +1,1273 @@
{
Contributor

TODO?

@@ -0,0 +1,1273 @@
{
Contributor

Add a text cell explaining that this is used for eval data

@@ -0,0 +1,1273 @@
{
Contributor

A bit of explanation on methods/params would help

Contributor Author

done.

@@ -0,0 +1,1273 @@
{
Contributor

Redundant. You already have the model from the previous step.

Contributor Author

Was for testing. Removed.

@@ -0,0 +1,1273 @@
{
Contributor

Explain that you are getting the AutoML eval metrics from training

@@ -0,0 +1,1273 @@
{
Contributor

TODO?

Contributor Author

Updated.

@@ -0,0 +1,1473 @@
{
Contributor Author

@soheilazangeneh soheilazangeneh Sep 5, 2022

This gives me the following error:

AttributeError: 'NoneType' object has no attribute 'artifacts'

Wondering if you could debug?

Apparently `task.outputs.get('feature_attributions')` returns `None`.

Contributor

@sudarshan-SpringML sudarshan-SpringML Sep 6, 2022

Cross-checked again; it works fine for me.

Screenshot link attached:

https://drive.google.com/file/d/1Ufrd6JsB8GbCk1ffz7WSqdnWUVfldgHY/view?usp=sharing
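
For anyone reproducing the `AttributeError` above, a minimal sketch of guarding the lookup follows. It assumes only what the thread states: that `task.outputs.get('feature_attributions')` can return `None`. The `get_output_artifacts` helper and the stand-in `outputs` mapping are hypothetical and are not part of the notebook or of any Vertex AI API.

```python
from types import SimpleNamespace
from typing import Any, Mapping, Optional


def get_output_artifacts(outputs: Mapping[str, Any], key: str) -> Optional[list]:
    """Return the artifacts stored under `key`, or None if that output is absent."""
    entry = outputs.get(key)
    if entry is None:
        # Accessing entry.artifacts here would raise the AttributeError from the thread.
        return None
    return list(entry.artifacts)


# Stand-in data (hypothetical): only one of the two outputs is present.
outputs = {"evaluation_metrics": SimpleNamespace(artifacts=["metrics-artifact"])}

found = get_output_artifacts(outputs, "evaluation_metrics")
missing = get_output_artifacts(outputs, "feature_attributions")
```

A guard like this only turns the crash into an explicit `None`; it does not explain why the output is missing in one environment and present in another.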

@@ -0,0 +1,1427 @@
{
Contributor Author

@soheilazangeneh soheilazangeneh Sep 5, 2022

This gives me the following error:

AttributeError: 'NoneType' object has no attribute 'artifacts'

Can you please run again and see if you get this error too? Is this code tested?

Contributor

I tested it again. It worked fine for me.

Contributor

@krishr2d2 krishr2d2 Sep 6, 2022

Here is a snapshot of it:

[screenshot: evaluation_print_step]

Contributor Author

Hmm, I ran it again and it didn't work. Maybe we'll need to debug that. I am running it on Workbench. Are these notebooks tested on a local env only?

Contributor

@krishr2d2 krishr2d2 Sep 6, 2022

Tested them on both Workbench and Colab. The above snapshot is from a Colab run.
Yes, it was tested in a local env on Workbench. Also, before testing, the env was installed with the requirements from the .cloud-build/requirements.txt file.
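
Since the same notebook behaves differently across environments, comparing installed package versions is one cheap first check. The sketch below uses only the standard library. The package list is an assumption (only `google-cloud-pipeline-components` and `kfp` are named in this review), and `installed_versions` is a hypothetical helper.

```python
from importlib import metadata
from typing import Dict, Iterable, Optional


def installed_versions(packages: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map each distribution name to its installed version, or None if absent."""
    versions: Dict[str, Optional[str]] = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Assumed package list; adjust to whatever the requirements file pins.
PACKAGES = ["google-cloud-aiplatform", "google-cloud-pipeline-components", "kfp"]

report = installed_versions(PACKAGES)
for name, version in report.items():
    print(f"{name}: {version or 'not installed'}")
```

Running this in both environments and diffing the printed lines would show whether the two runs used the same library versions.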

@soheilazangeneh soheilazangeneh changed the title from "Add automl regression model eval first draft" to "Add automl regression and classification with model evaluation" Sep 6, 2022
@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

We also use "Vertex AI Model Registry"

And an additional step

"Import the Classification Metrics to the AutoML model resource"

Contributor

added

@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

For consistency, please capitalize all "dataflow" as "Dataflow"

Contributor

capitalized.

@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Line #1.    @kfp.dsl.pipeline(name="vertex-evaluation-automl-tabular-feature-attribution-pipeline")

Prefer to switch the name to "vertex-evaluation-automl-tabular-classification-feature-attribution", removing "pipeline"

Contributor

updated.

@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Line #13.        batch_predict_starting_replica_count: int = 5,

I think we can remove this for simplicity

Contributor

removed.

@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Line #14.        batch_predict_max_replica_count: int = 10,

Same here

Contributor

removed.

@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Line #80.            problem_type=prediction_type,

Problem type is not required since classification_metrics is provided.

Contributor

removed.

@@ -0,0 +1,1433 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

  • batch_predict_instances_format: Format of the input instances for batch prediction. Can be "jsonl" or "bigquery".

Contributor

done.

Contributor

For AutoML tabular, CSV is supported as well.
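
As a rough illustration of the formats discussed above, the sketch below validates a `batch_predict_instances_format` value before it is passed to a pipeline. The accepted set ("jsonl", "bigquery", "csv") is taken only from this thread and may not be exhaustive for the service; `check_instances_format` is a hypothetical helper, not part of any SDK.

```python
# Formats named in this review thread for AutoML tabular batch prediction.
# Assumption: this list is not guaranteed to match everything the service accepts.
SUPPORTED_INSTANCE_FORMATS = ("jsonl", "bigquery", "csv")


def check_instances_format(value: str) -> str:
    """Normalize the format string and reject values not named in the thread."""
    normalized = value.strip().lower()
    if normalized not in SUPPORTED_INSTANCE_FORMATS:
        raise ValueError(
            f"Unsupported instances format {value!r}; "
            f"expected one of {SUPPORTED_INSTANCE_FORMATS}"
        )
    return normalized


batch_predict_instances_format = check_instances_format("csv")
```

Failing early in the notebook is cheaper than discovering a bad format string after the pipeline has been submitted.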

@@ -0,0 +1,1488 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Line #13.        batch_predict_starting_replica_count: int = 5,

Ditto comments from other notebook

Contributor

removed

@@ -0,0 +1,1488 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Line #89.            problem_type=prediction_type,

Can remove from here, and remove from pipeline inputs

Contributor

removed

@@ -0,0 +1,1488 @@
{
Contributor

@KevinBNaughton KevinBNaughton Sep 6, 2022

Ditto comments from other notebook

Contributor

updated

"source": [
"## Overview\n",
"\n",
"This notebook demonstrates how to use Vertex AI classification model evaluation component to evaluate an AutoML classification model. Model evaluation helps you determine your model performance based on the evaluation metrics and improve the model if necessary. "

how to use the Vertex AI

Contributor

updated.

"into the filter box, and select\n",
" **Vertex AI Administrator**. Type \"Storage Object Admin\" into the filter box, and select **Storage Object Admin**.\n",
"\n",
"5. Click *Create*. A JSON file that contains your key downloads to your\n",

Create should be bolded

Contributor

done

"id": "XoEqT2Y4DJmf"
},
"source": [
"### Import libraries"

Brief description for this step?

Contributor

added

"\n",
"- `display_name`: The human readable name for the Vertex AI TrainingJob resource.\n",
"- `optimization_prediction_type`: The type of prediction the Model is to produce. Ex: regression, classification.\n",
"- `column_specs`(Optional): Transformations to apply to the input columns(including data-type corrections).\n",

space needed after "columns"

Contributor

added

"- `dataset`: The TabularDataset within the same Project from which data needs to be used to train the Model.\n",
"- `target_column`: The name of the column values of which the Model is to predict.\n",
"- `model_display_name`: The display name of the Vertex AI Model that is produced as an output. \n",
"- `budget_milli_node_hours`(Optional): The train budget of creating this Model, expressed in milli node hours i.e. 1,000 value in this field means 1 node hour. The training cost of the model does not exceed this budget.\n",

training

Contributor

updated
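
The unit described for `budget_milli_node_hours` above (a value of 1,000 means 1 node hour) can be made concrete with a small conversion sketch. Both helper names are hypothetical and are not part of the Vertex AI SDK.

```python
MILLI_PER_NODE_HOUR = 1000  # per the parameter description: 1,000 milli node hours = 1 node hour


def node_hours_to_milli(node_hours: float) -> int:
    """Convert node hours to the integer milli-node-hour unit."""
    return int(round(node_hours * MILLI_PER_NODE_HOUR))


def milli_to_node_hours(milli_node_hours: int) -> float:
    """Convert milli node hours back to node hours."""
    return milli_node_hours / MILLI_PER_NODE_HOUR


one_hour_budget = node_hours_to_milli(1)
```

Here `one_hour_budget` is 1000, which matches the "1,000 value in this field means 1 node hour" wording in the parameter description.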

"source": [
"## Create Pipeline for evaluations\n",
"\n",
"Now, you run a Vertex AI BatchPrediction job and generate evaluations and feature-attributions on its results. \n",

No hyphen between feature attributions

Contributor

done

"\n",
"Now, you run a Vertex AI BatchPrediction job and generate evaluations and feature-attributions on its results. \n",
"\n",
"To do so, you create a Vertex AI pipeline using the components available from the [`google-cloud-pipeline-components`](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-1.0.17/index.html) python package.\n"

capitalize Python

Contributor

done.

"source": [
"In the results from last step, click on the generated link to see your run in the Cloud Console.\n",
"\n",
"In the UI, many of the pipeline DAG nodes expand or collapse when you click on them. Here is a partially-expanded view of the DAG (click image to see larger version).\n",

can you spell out this acronym? i.e. "directed acyclic graph (DAG)"

Contributor

done.

@andrewferlitsch andrewferlitsch merged commit 9ab5f42 into GoogleCloudPlatform:main Sep 8, 2022