Skip to content

Conversation

damemi
Copy link
Member

@damemi damemi commented Nov 10, 2020

What type of PR is this?
/kind cleanup

What this PR does / why we need it:
This PR passes the initialized ComponentConfig for kube-scheduler through the framework initialization functions, allowing the default plugin registry (and PluginArgs) to be presented to the user after plugin instantiation is filled out by API defaulting.

Which issue(s) this PR fixes:
Fixes #93718

Special notes for your reviewer:

Does this PR introduce a user-facing change?:

kube-scheduler now logs processed component config at startup

Additional documentation e.g., KEPs (Kubernetes Enhancement Proposals), usage docs, etc.:


The output from a default scheduler with these changes looks like this:

I1110 20:13:58.509167       1 factory.go:193] Creating scheduler from algorithm provider 'DefaultProvider'
I1110 20:13:58.518715       1 configfile.go:66] Starting scheduler with component config:
apiVersion: kubescheduler.config.k8s.io/v1beta1
clientConnection:
  acceptContentTypes: ""
  burst: 100
  contentType: application/vnd.kubernetes.protobuf
  kubeconfig: /etc/kubernetes/scheduler.conf
  qps: 50
enableContentionProfiling: true
enableProfiling: true
healthzBindAddress: ""
kind: KubeSchedulerConfiguration
leaderElection:
  leaderElect: true
  leaseDuration: 15s
  renewDeadline: 10s
  resourceLock: leases
  resourceName: kube-scheduler
  resourceNamespace: kube-system
  retryPeriod: 2s
metricsBindAddress: ""
parallelism: 16
percentageOfNodesToScore: 0
podInitialBackoffSeconds: 1
podMaxBackoffSeconds: 10
profiles:
- pluginConfig:
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeAffinityArgs
    name: NodeAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeResourcesLeastAllocatedArgs
      resources:
      - name: cpu
        weight: 1
      - name: memory
        weight: 1
    name: NodeResourcesLeastAllocated
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: NodeResourcesFitArgs
    name: NodeResourcesFit
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      bindTimeoutSeconds: 600
      kind: VolumeBindingArgs
    name: VolumeBinding
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      hardPodAffinityWeight: 1
      kind: InterPodAffinityArgs
    name: InterPodAffinity
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      kind: DefaultPreemptionArgs
      minCandidateNodesAbsolute: 100
      minCandidateNodesPercentage: 10
    name: DefaultPreemption
  - args:
      apiVersion: kubescheduler.config.k8s.io/v1beta1
      defaultingType: System
      kind: PodTopologySpreadArgs
    name: PodTopologySpread
  plugins:
    bind:
      enabled:
      - name: DefaultBinder
        weight: 0
    filter:
      enabled:
      - name: NodeUnschedulable
        weight: 0
      - name: NodeName
        weight: 0
      - name: TaintToleration
        weight: 0
      - name: NodeAffinity
        weight: 0
      - name: NodePorts
        weight: 0
      - name: NodeResourcesFit
        weight: 0
      - name: VolumeRestrictions
        weight: 0
      - name: EBSLimits
        weight: 0
      - name: GCEPDLimits
        weight: 0
      - name: NodeVolumeLimits
        weight: 0
      - name: AzureDiskLimits
        weight: 0
      - name: VolumeBinding
        weight: 0
      - name: VolumeZone
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
    permit: {}
    postBind: {}
    postFilter:
      enabled:
      - name: DefaultPreemption
        weight: 0
    preBind:
      enabled:
      - name: VolumeBinding
        weight: 0
    preFilter:
      enabled:
      - name: NodeResourcesFit
        weight: 0
      - name: NodePorts
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: InterPodAffinity
        weight: 0
      - name: VolumeBinding
        weight: 0
    preScore:
      enabled:
      - name: InterPodAffinity
        weight: 0
      - name: PodTopologySpread
        weight: 0
      - name: TaintToleration
        weight: 0
    queueSort:
      enabled:
      - name: PrioritySort
        weight: 0
    reserve:
      enabled:
      - name: VolumeBinding
        weight: 0
    score:
      enabled:
      - name: NodeResourcesBalancedAllocation
        weight: 1
      - name: ImageLocality
        weight: 1
      - name: InterPodAffinity
        weight: 1
      - name: NodeResourcesLeastAllocated
        weight: 1
      - name: NodeAffinity
        weight: 1
      - name: NodePreferAvoidPods
        weight: 10000
      - name: PodTopologySpread
        weight: 2
      - name: TaintToleration
        weight: 1
  schedulerName: default-scheduler

I1110 20:13:58.518819       1 server.go:141] Starting Kubernetes Scheduler version v1.20.0-beta.1.378+56da2806a4cc31-dirty

/sig scheduling

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 10, 2020
@damemi
Copy link
Member Author

damemi commented Nov 10, 2020

/cc @ahg-g @alculquicondor @ingvagabund
Don't know if this will make it for code freeze, but I'm open to any suggestions on how to improve it (or things I overlooked working on it). ptal when you get a chance, thanks!

// WriteConfigFile writes the config into the given file name as YAML.
func WriteConfigFile(fileName string, cfg *kubeschedulerconfig.KubeSchedulerConfiguration) error {
// LogAndWriteConfigFile writes the config into the given file name as YAML.
func LogAndWriteConfigFile(fileName string, cfg *kubeschedulerconfig.KubeSchedulerConfiguration) error {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure who would use a filename (probably just us k8s-devs), but why not "LogOrWrite"

return err
}
configString := string(bytes)
klog.V(2).Infof("Starting scheduler with component config:\n%+v\n", configString)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
klog.V(2).Infof("Starting scheduler with component config:\n%+v\n", configString)
klog.V(2).Infof("Using component config:\n%+v\n", configString)

const mediaType = runtime.ContentTypeYAML
info, ok := runtime.SerializerInfoForMediaType(kubeschedulerscheme.Codecs.SupportedMediaTypes(), mediaType)
if !ok {
return fmt.Errorf("unable to locate encoder -- %q is not a supported media type", mediaType)
}

encoder := kubeschedulerscheme.Codecs.EncoderForVersion(info.Serializer, kubeschedulerconfigv1beta1.SchemeGroupVersion)

configFile, err := os.Create(fileName)
bytes, err := runtime.Encode(encoder, cfg)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Only encode if klog.V(2)

}

// create a scheduler from a set of registered plugins.
func (c *Configurator) create() (*Scheduler, error) {
c.componentConfig.Profiles = make([]schedulerapi.KubeSchedulerProfile, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

c.componentConfig could be nil

@@ -323,10 +319,14 @@ func Setup(ctx context.Context, opts *options.Options, outOfTreeRegistryOptions
scheduler.WithPodInitialBackoffSeconds(cc.ComponentConfig.PodInitialBackoffSeconds),
scheduler.WithExtenders(cc.ComponentConfig.Extenders...),
scheduler.WithParallelism(cc.ComponentConfig.Parallelism),
scheduler.WithComponentConfig(&cc.ComponentConfig),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All options are about configuring the scheduler, but this one is actually about capturing the output.

So, I would rather call it WithComponentConfigOutput. But also we only care about profiles, so could be limited to cc.ComponentConfig.Profiles.

Or you could do WithCaptureProfiles(func ([]config.KubeSchedulerProfile)) so that the caller is free to do with it whatever they want.

We actually have something like the above that we use for tests https://github.com/kubernetes/kubernetes/blob/master/pkg/scheduler/scheduler.go#L168
It would be cool to unify that. Tests check the expected output, whereas prod scheduler uses it for logs.

@@ -199,6 +200,12 @@ func WithExtenders(extenders []framework.Extender) Option {
}
}

func WithComponentConfig(cc *config.KubeSchedulerConfiguration) Option {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similarly to the ideas above, this could be:

WithProfileCapturer(func (*config.KubeSchedulerProfile))

and the code in scheduler.go is in charge of the aggregation.

@ahg-g
Copy link
Member

ahg-g commented Nov 10, 2020

/milestone v1.20

@k8s-ci-robot k8s-ci-robot added this to the v1.20 milestone Nov 10, 2020
@spiffxp
Copy link
Contributor

spiffxp commented Nov 10, 2020

possible github issue, please link if you run into this on other PR's kubernetes/test-infra#19910

Some plugin configs are not produced until the framework is instantiated. Add a callback to capture them inside the framework constructor.

Change-Id: Id3f709b6461ccd0eafec7d21412cda093d9c4645
@@ -247,6 +254,8 @@ func NewFramework(r Registry, plugins *config.Plugins, args []config.PluginConfi
return f, nil
}

profile := config.KubeSchedulerProfile{SchedulerName: f.ProfileName(), Plugins: plugins, PluginConfig: make([]config.PluginConfig, 0)}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the goal is to print the final config, we missed a little thing, at line 296, the weight changed to 1 if it is 0. Not sure it is worth it or not. Actually, I also point it out at d6ef6d6#r521218103, at least it is fine for the test.

@damemi damemi force-pushed the print-scheduler-config branch from 56da280 to a41b8dd Compare November 12, 2020 15:34
@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 12, 2020
@damemi damemi force-pushed the print-scheduler-config branch 3 times, most recently from f922512 to a5064bb Compare November 12, 2020 15:38
Copy link
Member Author

@damemi damemi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@alculquicondor I rebased this onto the changes from your PR (#96438) and just re-used your WithBuildFrameworkCapturer option to append the profiles. You can see just my changes in a5064bb. ptal!

@@ -323,10 +319,19 @@ func Setup(ctx context.Context, opts *options.Options, outOfTreeRegistryOptions
scheduler.WithPodInitialBackoffSeconds(cc.ComponentConfig.PodInitialBackoffSeconds),
scheduler.WithExtenders(cc.ComponentConfig.Extenders...),
scheduler.WithParallelism(cc.ComponentConfig.Parallelism),
scheduler.WithBuildFrameworkCapturer(func(profile kubeschedulerconfig.KubeSchedulerProfile) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment to clarify why is this done.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a "what", not a "why" 😉

I would say something like: "Profiles are processed during Framework instantiation to set default plugins and configurations. Capturing them for logging".

@@ -129,10 +129,6 @@ func runCommand(cmd *cobra.Command, opts *options.Options, registryOptions ...Op
}

if len(opts.WriteConfigTo) > 0 {
if err := options.WriteConfigFile(opts.WriteConfigTo, &cc.ComponentConfig); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Add a comment to say where this parameter was handled.... but maybe we can do os.Exit(0) in LogOrWriteConfig?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added an os.Exit(0) to LogOrWriteConfig

@@ -323,10 +319,19 @@ func Setup(ctx context.Context, opts *options.Options, outOfTreeRegistryOptions
scheduler.WithPodInitialBackoffSeconds(cc.ComponentConfig.PodInitialBackoffSeconds),
scheduler.WithExtenders(cc.ComponentConfig.Extenders...),
scheduler.WithParallelism(cc.ComponentConfig.Parallelism),
scheduler.WithBuildFrameworkCapturer(func(profile kubeschedulerconfig.KubeSchedulerProfile) {
if cc.ComponentConfig.Profiles == nil {
cc.ComponentConfig.Profiles = make([]kubeschedulerconfig.KubeSchedulerProfile, 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

shouldn't we always make a new slice?

@damemi damemi force-pushed the print-scheduler-config branch from a5064bb to edfa12d Compare November 12, 2020 17:49
@damemi damemi force-pushed the print-scheduler-config branch from edfa12d to 005cbe6 Compare November 12, 2020 17:53
@@ -48,24 +49,40 @@ func loadConfig(data []byte) (*kubeschedulerconfig.KubeSchedulerConfiguration, e
return nil, fmt.Errorf("couldn't decode as KubeSchedulerConfiguration, got %s: ", gvk)
}

// WriteConfigFile writes the config into the given file name as YAML.
func WriteConfigFile(fileName string, cfg *kubeschedulerconfig.KubeSchedulerConfiguration) error {
// LogOrWriteConfig writes the config into the given file name as YAML.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Update the comment to match the new logic

@@ -323,10 +319,19 @@ func Setup(ctx context.Context, opts *options.Options, outOfTreeRegistryOptions
scheduler.WithPodInitialBackoffSeconds(cc.ComponentConfig.PodInitialBackoffSeconds),
scheduler.WithExtenders(cc.ComponentConfig.Extenders...),
scheduler.WithParallelism(cc.ComponentConfig.Parallelism),
scheduler.WithBuildFrameworkCapturer(func(profile kubeschedulerconfig.KubeSchedulerProfile) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a "what", not a "why" 😉

I would say something like: "Profiles are processed during Framework instantiation to set default plugins and configurations. Capturing them for logging".

)
if err != nil {
return nil, nil, err
}
if err := options.LogOrWriteConfig(opts.WriteConfigTo, &cc.ComponentConfig, completedProfiles); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why not just override the completed profiles at this point?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but since it's only necessary if we're logging/writing the config I figured I'd keep that logic in the function

@damemi damemi force-pushed the print-scheduler-config branch from 005cbe6 to aec327c Compare November 12, 2020 18:28
Copy link
Member

@alculquicondor alculquicondor left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/approve
/lgtm
/priority important-soon

I don't really mind if this PR merges and the parent doesn't :)

@k8s-ci-robot k8s-ci-robot added priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. and removed needs-priority Indicates a PR lacks a `priority/foo` label and requires one. labels Nov 12, 2020
@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2020
@alculquicondor
Copy link
Member

/triage accepted

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Nov 12, 2020
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alculquicondor, damemi

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2020
@damemi
Copy link
Member Author

damemi commented Nov 12, 2020

/retest
Bazel-test failed with this:

FAIL: //pkg/scheduler:go_default_test (run 2 of 2) (see /bazel-scratch/.cache/bazel/_bazel_root/7989b31489f31aee54f32688da2f0120/execroot/io_k8s_kubernetes/bazel-out/k8-fastbuild/testlogs/pkg/scheduler/go_default_test/run_2_of_2/test.log)
INFO: From Testing //pkg/scheduler:go_default_test (run 2 of 2):
==================== Test output for //pkg/scheduler:go_default_test (run 2 of 2):
--- FAIL: TestCreateFromConfig (0.09s)
    --- FAIL: TestCreateFromConfig/policy_with_unspecified_predicates_or_priorities_uses_default (0.04s)
        factory_test.go:374: unexpected plugins config diff (-want, +got):   []config.PluginConfig{
            + 	{Name: "DefaultPreemption", Args: s"&TypeMeta{Kind:,APIVersion:,}"},
              	{Name: "InterPodAffinity", Args: &config.InterPodAffinityArgs{HardPodAffinityWeight: 1}},
              	{Name: "NodeAffinity", Args: &config.NodeAffinityArgs{}},
              	... // 4 identical elements
              }
    --- FAIL: TestCreateFromConfig/policy_with_arguments (0.03s)
        factory_test.go:374: unexpected plugins config diff (-want, +got):   []config.PluginConfig{
            + 	{Name: "DefaultPreemption", Args: s"&TypeMeta{Kind:,APIVersion:,}"},
              	{Name: "InterPodAffinity", Args: &config.InterPodAffinityArgs{HardPodAffinityWeight: 1}},
              	{Name: "NodeAffinity", Args: &config.NodeAffinityArgs{}},
              	... // 3 identical elements
              }
    --- FAIL: TestCreateFromConfig/policy_with_HardPodAffinitySymmetricWeight_argument (0.01s)
        factory_test.go:374: unexpected plugins config diff (-want, +got):   []config.PluginConfig{
            + 	{Name: "DefaultPreemption", Args: s"&TypeMeta{Kind:,APIVersion:,}"},
              	{Name: "InterPodAffinity", Args: &config.InterPodAffinityArgs{HardPodAffinityWeight: 10}},
              	{Name: "NodeResourcesFit", Args: &config.NodeResourcesFitArgs{}},
              } 

but adding the defaultpreemption args causes it to fail locally for me:

$ go test -run TestCreateFromConfig ./pkg/scheduler/...
--- FAIL: TestCreateFromConfig (0.00s)
    --- FAIL: TestCreateFromConfig/policy_with_unspecified_predicates_or_priorities_uses_default (0.00s)
        factory_test.go:387: unexpected plugins config diff (-want, +got):   []config.PluginConfig{
            - 	{Name: "DefaultPreemption", Args: s"&TypeMeta{Kind:,APIVersion:,}"},
              	{Name: "InterPodAffinity", Args: &config.InterPodAffinityArgs{HardPodAffinityWeight: 1}},
              	{Name: "NodeAffinity", Args: &config.NodeAffinityArgs{}},
              	... // 4 identical elements
              }
    --- FAIL: TestCreateFromConfig/policy_with_arguments (0.00s)
        factory_test.go:387: unexpected plugins config diff (-want, +got):   []config.PluginConfig{
            - 	{Name: "DefaultPreemption", Args: s"&TypeMeta{Kind:,APIVersion:,}"},
              	{Name: "InterPodAffinity", Args: &config.InterPodAffinityArgs{HardPodAffinityWeight: 1}},
              	{Name: "NodeAffinity", Args: &config.NodeAffinityArgs{}},
              	... // 3 identical elements
              }
    --- FAIL: TestCreateFromConfig/policy_with_HardPodAffinitySymmetricWeight_argument (0.00s)
        factory_test.go:387: unexpected plugins config diff (-want, +got):   []config.PluginConfig{
            - 	{Name: "DefaultPreemption", Args: s"&TypeMeta{Kind:,APIVersion:,}"},
              	{Name: "InterPodAffinity", Args: &config.InterPodAffinityArgs{HardPodAffinityWeight: 10}},
              	{Name: "NodeResourcesFit", Args: &config.NodeResourcesFitArgs{}},
              }
FAIL
FAIL	k8s.io/kubernetes/pkg/scheduler	0.410s

@damemi damemi force-pushed the print-scheduler-config branch from aec327c to 37b3d47 Compare November 12, 2020 20:18
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2020
@alculquicondor
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2020
@damemi damemi force-pushed the print-scheduler-config branch from 37b3d47 to 14fa76d Compare November 12, 2020 21:03
@k8s-ci-robot k8s-ci-robot removed the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2020
@alculquicondor
Copy link
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2020
@k8s-ci-robot k8s-ci-robot merged commit ae95984 into kubernetes:master Nov 13, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/scheduling Categorizes an issue or PR as relevant to SIG Scheduling. size/L Denotes a PR that changes 100-499 lines, ignoring generated files. triage/accepted Indicates an issue or PR is ready to be actively worked on.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Scheduler should log processed component config
6 participants