Skip to content

Conversation

@pdifranc
Copy link
Contributor

@pdifranc pdifranc commented Oct 13, 2022

Signed-off-by: Paolo Di Francesco [email protected]

Related Issues/PRs

Close #7019

What changes are proposed in this pull request?

Adds the capability to sign any outgoing MLFlow requests using AWS Sig V4

How is this patch tested?

  • I have written tests (not required for typo or doc fix) and confirmed the proposed feature/bug-fix/change works.

I have tested this change on a sample I wrote that deploys these changes from my forked repo and demonstrate it worked.
The repo is here and the whole setup can be tested on an AWS account (the sample includes also changes to the MLFlow UI to include Cognito, but that is out of scope of this feature request) - sample code here please let me know if this is enough.

Does this PR change the documentation?

  • No. You can skip the rest of this section.
  • Yes. Make sure the changed pages / sections render correctly by following the steps below.
  1. Click the Details link on the Preview docs check.
  2. Find the changed pages / sections and make sure they render correctly.

Release Notes

You can now set the MLFLOW_TRACKING_AWS_SIGV4 environment variable to sign any HTTP requests to MLFlow from the python SDK using AWS Signature V4. More details about the AWS Signature V4 can be found here.
This change will enable MLFlow users to shield the MLFlow tracking server behind an Amazon API Gateway, and authenticate and authorize the requests via IAM_AUTH.
In order to use this feature you need to either:

  • provide AWS credentials (i.e., AWS access key and AWS secret key), or
  • (preferred) use IAM Roles if for example running from an EC2 instance, or from the Amazon SageMaker managed infrastructure.

The IAM policy attached to the IAM Role or IAM User must have the permission to call the execute-api for the specific API.
You can then implement fine grained authorization using different policies on different users/roles.

Is this a user-facing change?

  • No. You can skip the rest of this section.
  • Yes. Give a description of this change to be included in the release notes for MLflow users.

(Details in 1-2 sentences. You can just refer to another PR with a description if this PR is part of a larger change.)

A new environmental variable is available, i.e. MLFLOW_TRACKING_AWS_SIGV4.
By setting this environmental variable, all your requests will be signed used AWS Signature V4 (see here for more details).
AWS credentials are needed, which can be taken from a boto3 session, or via passing the AWS environment variables for setting the credentials. boto3 session takes precedence.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

  • area/artifacts: Artifact stores and artifact logging
  • area/build: Build and test infrastructure for MLflow
  • area/docs: MLflow documentation pages
  • area/examples: Example code
  • area/model-registry: Model Registry service, APIs, and the fluent client calls for Model Registry
  • area/models: MLmodel format, model serialization/deserialization, flavors
  • area/pipelines: Pipelines, Pipeline APIs, Pipeline configs, Pipeline Templates
  • area/projects: MLproject format, project running backends
  • area/scoring: MLflow Model server, model deployment tools, Spark UDFs
  • area/server-infra: MLflow Tracking server backend
  • area/tracking: Tracking Service, tracking client APIs, autologging

Interface

  • area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
  • area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
  • area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
  • area/windows: Windows support

Language

  • language/r: R APIs and clients
  • language/java: Java APIs and clients
  • language/new: Proposals for new client languages

Integrations

  • integrations/azure: Azure and Azure ML integrations
  • integrations/sagemaker: SageMaker integrations
  • integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

  • rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
  • rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
  • rn/feature - A new user-facing feature worth mentioning in the release notes
  • rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
  • rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: Paolo Di Francesco <[email protected]>
@github-actions github-actions bot added area/tracking Tracking service, tracking client APIs, autologging rn/feature Mention under Features in Changelogs. labels Oct 13, 2022
@mlflow-automation
Copy link
Contributor

mlflow-automation commented Oct 13, 2022

Documentation preview will be available here.

Notes
  • Ignore this comment if this PR does not change the documentation.
  • It takes a few minutes for the preview to be available.
  • The preview is updated on every commit to this PR.
  • Job URL: https://circleci.com/gh/mlflow/mlflow/33280
  • Updated at: 2022-10-17 06:45:11.185310

@pdifranc
Copy link
Contributor Author

I have written unit tests too (I havent written a unit test in python in ages :D, I hope they it is enough)
Since I am using a library, I was wondering where exactly I should have placed the dependency.
I have placed it in extras but maybe there is a better place.

Any further feedback would be appreciated

Copy link
Collaborator

@dbczumar dbczumar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pdifranc and others added 2 commits October 14, 2022 18:32
Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>
@pdifranc
Copy link
Contributor Author

Thanks @harupy and @dbczumar for the feedbacks!
Is there anything else I need to do?

@pdifranc pdifranc force-pushed the add_sigv4_to_requests branch from 0bca925 to 791f1d2 Compare October 14, 2022 21:46
@pdifranc pdifranc force-pushed the add_sigv4_to_requests branch 2 times, most recently from a3942d2 to 4249d1e Compare October 15, 2022 12:00
Paolo Di Francesco and others added 2 commits October 17, 2022 08:36
Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>
@pdifranc pdifranc force-pushed the add_sigv4_to_requests branch from 9f13c0f to 3a01384 Compare October 17, 2022 06:37
@pdifranc
Copy link
Contributor Author

I have fixed the problem with the test_rest_utils.py unit tests (missing expected AWS configuration, i.e., keys and region), but now some unrelated R tests are failing :(

@pdifranc
Copy link
Contributor Author

@harupy is it possible the failed test has nothing to do with my code changes? could you maybe re-trigger the CI tests?

@harupy
Copy link
Member

harupy commented Oct 18, 2022

We can ignore that. It's unrelated.

@dbczumar
Copy link
Collaborator

Thanks @pdifranc ! Merging! :)

@dbczumar dbczumar merged commit e0a2d48 into mlflow:master Oct 18, 2022
sunishsheth2009 pushed a commit that referenced this pull request Oct 19, 2022
* Add AWSSigv4 to request

Signed-off-by: Paolo Di Francesco <[email protected]>

* Update mlflow/environment_variables.py

Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>

* Update tests/utils/test_rest_utils.py

Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>

* Use MLFLOW_TRACKING_AWS_SIGV4 from mlflow.environment_variables

Signed-off-by: Paolo Di Francesco <[email protected]>

* Update test/utils/test_rest_utils.py: mock AWS config

Signed-off-by: Paolo Di Francesco <[email protected]>

* Update mlflow/environment_variables.py

Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>

Signed-off-by: Paolo Di Francesco <[email protected]>
Signed-off-by: pdifranc <[email protected]>
Co-authored-by: Paolo Di Francesco <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
pdifranc added a commit to pdifranc/mlflow that referenced this pull request Oct 19, 2022
* Add AWSSigv4 to request

Signed-off-by: Paolo Di Francesco <[email protected]>

* Update mlflow/environment_variables.py

Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>

* Update tests/utils/test_rest_utils.py

Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>

* Use MLFLOW_TRACKING_AWS_SIGV4 from mlflow.environment_variables

Signed-off-by: Paolo Di Francesco <[email protected]>

* Update test/utils/test_rest_utils.py: mock AWS config

Signed-off-by: Paolo Di Francesco <[email protected]>

* Update mlflow/environment_variables.py

Co-authored-by: Harutaka Kawamura <[email protected]>
Signed-off-by: pdifranc <[email protected]>

Signed-off-by: Paolo Di Francesco <[email protected]>
Signed-off-by: pdifranc <[email protected]>
Co-authored-by: Paolo Di Francesco <[email protected]>
Co-authored-by: Harutaka Kawamura <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area/tracking Tracking service, tracking client APIs, autologging rn/feature Mention under Features in Changelogs.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FR] Allow signing all outgoing request using AWS Sig V4

4 participants