Skip to content

Conversation

rokatyy
Copy link
Contributor

@rokatyy rokatyy commented Jul 23, 2025

Description

This PR wraps kubernetes.Interface into Nuclio's custom retryable client to be able to retry on certain errors and avoid many problems that temporarily k8s issues may cause.

Testing

Long CI passed ✅ +basic functionality tested

References

Copy link
Collaborator

@weilerN weilerN left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great thorough work! Just left a few minor questions inline

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A suggestion regarding the location and structuring of the code:
In Nuclio we don't have a clients package that includes all, but we do have client packages in pkg - for example dockerclient, cmdrunner, processwaiter etc.
common is more of a place for helper functions, which are split into domains.
I suggest moving the files in this PR to a separate k8sclient or kubeclient that will have the following structure:

pkg
├── kubeclient
│   └── client.go // the actual implementation of the client struct, with its methods
│   └── types.go  // where the interface is defined, along with constants if any
│   └── retry.go  // the retry helpers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomerShor yeah, I agree here. That makes me think whether new package platform/kube/clients would be a good place for keeping both new kube retry client + nuclio custom client for CRDs. Wdyt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We actually already have a kube/client package. you can add the retry client there

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomerShor I moved it to be platform/kube/clients/nuclio as this is nuclio CRDs client

Comment on lines 36 to 37
// ClientWithRetry provides resilient methods to interact with various Kubernetes resources
type ClientWithRetry interface {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface is just Client. The retry is the specific implementation

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also - isn't this needed for all Nuclio CRDs as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TomerShor I replaced all the usages of kubernetes.Interface. For CRDs we use

type Interface interface {
Discovery() discovery.DiscoveryInterface
NuclioV1beta1() nucliov1beta1.NuclioV1beta1Interface
}
, which will be handled in a separate PR

DeleteService(ctx context.Context, namespace string, name string, deleteOptions metav1.DeleteOptions) error

// PatchService applies a patch to a Service.
PatchService(ctx context.Context, namespace string, name string, pt types.PatchType, data []byte) (*corev1.Service, error)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
PatchService(ctx context.Context, namespace string, name string, pt types.PatchType, data []byte) (*corev1.Service, error)
PatchService(ctx context.Context, namespace string, name string, patchType types.PatchType, data []byte) (*corev1.Service, error)

// StreamPodLogs returns a stream of logs for a Pod.
StreamPodLogs(ctx context.Context, namespace, podName string, options *corev1.PodLogOptions) (io.ReadCloser, error)

//DeletePod deletes a Pod by name.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
//DeletePod deletes a Pod by name.
// DeletePod deletes a Pod by name.

Comment on lines 202 to 226
type clientWithRetry struct {
kubernetes.Interface
retries int
delay time.Duration
}

func NewClientWithRetryFromConfig(config *rest.Config) (Client, error) {
client, err := kubernetes.NewForConfig(config)
if err != nil {
return nil, errors.Wrap(err, "Failed to create Kubernetes client")
}
return &clientWithRetry{
Interface: client,
retries: clients.MaxRetries,
delay: clients.Delay,
}, nil
}

func NewClientWithRetryFromClient(client kubernetes.Interface) Client {
return &clientWithRetry{
Interface: client,
retries: clients.MaxRetries,
delay: clients.Delay,
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move these and all of the methods below to client.go

@rokatyy rokatyy requested a review from TomerShor July 24, 2025 08:11
Copy link
Contributor

@TomerShor TomerShor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

@rokatyy rokatyy merged commit bec9e2f into nuclio:development Jul 24, 2025
14 checks passed
rokatyy added a commit to rokatyy/nuclio that referenced this pull request Jul 25, 2025
### Description
This PR wraps `kubernetes.Interface` into Nuclio's custom retryable
client to be able to retry on certain errors and avoid many problems
that temporarily k8s issues may cause.

### Testing
Long CI passed ✅ + while review is in progress, I'm going to test basic
functionality

### References
- Ticket link: https://iguazio.atlassian.net/browse/NUC-522

(cherry picked from commit bec9e2f)
rokatyy added a commit that referenced this pull request Jul 25, 2025
@rokatyy rokatyy deleted the nuc-522 branch September 5, 2025 10:46
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants