
Conversation

@mrunalp (Member) commented Oct 24, 2017

This reverts commit b198c57.

Signed-off-by: Mrunal Patel <[email protected]>
@mrunalp (Member, Author) commented Oct 24, 2017

/test ami

@k8s-ci-robot added the size/M (PR changes 30-99 lines, ignoring generated files) and cncf-cla: yes labels Oct 24, 2017
@mrunalp (Member, Author) commented Oct 24, 2017

@rhatdan @runcom @cevich PTAL
I believe this should fix the CI. Great catch by @rhatdan.

@runcom (Member) commented Oct 24, 2017

LGTM if it makes the tests happy again, great catch Dan!

@TomSweeneyRedHat (Contributor) commented
LGTM, fingers crossed for the tests.

@cevich (Contributor) commented Oct 24, 2017

@rhatdan @mrunalp Great find, but I wouldn't consider this a long-term fix. As discussed on IRC, this was originally added to prevent an OOM while compiling cri-o. While that may have been subsequently resolved by jamming more memory onto the VMs, I think it's a mistake to assume no test ever needs swap on the test subject. Also, if a test subject already has swap set up, this change will not resolve the issue (the playbook does nothing in that case).

A better long-term solution would be to fix the few (one) specific tests that need to control their memory placement, i.e. lock the pages into main memory, preventing them from being swapped. IIRC, there's a command or system call that's used to do this...can't remember what it's called at the moment...@mrunalp and I will keep looking for it...

@cevich (Contributor) commented Oct 24, 2017

@rhatdan @mrunalp found it...this is probably what the test should do:

  1. Making sure it's not setting its cgroup memory limit higher than available memory (exclusive of any swap).
  2. In the test, before hogging up memory, call Go's equivalent of mlockall() (see the sketch after this list).
  3. If the OOM test still doesn't work...there's probably an important bug that needs fixing (somewhere)
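
For concreteness, a minimal sketch of item 2, assuming the test is written in Go: the mlockall(2) wrapper is `unix.Mlockall` in golang.org/x/sys/unix (that package and call are real; the surrounding scaffolding is illustrative, not code from this repo):

```go
// Minimal sketch: lock this process's pages into RAM via mlockall(2)
// so the kernel cannot swap them out during the OOM test.
package main

import (
	"fmt"
	"log"

	"golang.org/x/sys/unix"
)

func main() {
	// MCL_CURRENT locks pages already mapped; MCL_FUTURE locks pages
	// mapped from now on (i.e. the memory the test is about to hog).
	if err := unix.Mlockall(unix.MCL_CURRENT | unix.MCL_FUTURE); err != nil {
		log.Fatalf("mlockall: %v", err)
	}
	defer unix.Munlockall() // release the locks when the test exits

	fmt.Println("pages locked; allocations now count against RAM only")
}
```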

This approach avoids putting the cart before the horse (for the benefit of a single test). It also better exercises this situation under real-world conditions (the vast majority of customer setups I've seen use swap). Perhaps this test is even playing in the realm of the OOM killer firing off against something important, since the test cannot predict what other consumers are doing. An unlikely situation to be sure, but one that creates an even harder-to-debug problem.

@cevich (Contributor) commented Oct 24, 2017

Ohh...another thing. At the end, the test could check its majflt (see proc(5)) from /proc/pid/stat. Compare the count before and after testing to verify it has not increased after calling mlockall().
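
A minimal sketch of that before/after check, again assuming Go; majflt is field 12 of /proc/self/stat per proc(5), and the `majorFaults` helper name is hypothetical:

```go
// Hypothetical helper: read this process's major page-fault count
// (majflt, field 12 of /proc/self/stat, see proc(5)).
package main

import (
	"fmt"
	"log"
	"os"
	"strconv"
	"strings"
)

func majorFaults() (uint64, error) {
	data, err := os.ReadFile("/proc/self/stat")
	if err != nil {
		return 0, err
	}
	stat := string(data)
	// Field 2 (comm) is parenthesized and may contain spaces, so split
	// after the closing paren; majflt is then the 10th remaining field.
	rest := stat[strings.LastIndexByte(stat, ')')+2:]
	return strconv.ParseUint(strings.Fields(rest)[9], 10, 64)
}

func main() {
	before, err := majorFaults()
	if err != nil {
		log.Fatal(err)
	}
	// ... call mlockall() and run the memory-hogging test here ...
	after, _ := majorFaults()
	fmt.Printf("majflt before=%d after=%d\n", before, after)
}
```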

@mrunalp (Member, Author) commented Oct 24, 2017

I checked that the RHEL AMI job has passed the OOM test :)

@mrunalp (Member, Author) commented Oct 24, 2017

@cevich

> Making sure it's not setting its cgroup memory limit higher than available memory (exclusive of any swap).

Our limit is just under 5M. This test is about the container not wreaking havoc on the entire system :)

> In the test, before hogging up memory, call Go's equivalent of mlockall().

Real-world Go programs don't typically call this.

@runcom (Member) commented Oct 24, 2017

🎉

@cevich (Contributor) commented Oct 24, 2017

> Our limit is just under 5M. This test is about the container not wreaking havoc on the entire system :)

Oh good, that's very reasonable.

Hmmm, you sure there's not a systemd/kernel bug here? I would think that a limit should be honored whether or not swap is available. Otherwise, what's the point in having a limit at all?

If there is a bug, ignoring it in the "framework" is the most difficult way to silence it, just ask @edsantiago. For things like this, it's way easier to either fix the underlying problem, or skip the test.

> Real-world Go programs don't typically call this.

Tests != real world, ever, by definition (though sometimes they simulate it).
I'm just suggesting using better simulation tools 😀

...wait...doesn't that mean if reality is a simulation, then, like, maybe, that, then we could also, like, be in, like, some universal unittest...whoa...deep...

@cevich (Contributor) commented Oct 24, 2017

Re: cgroup bug: From the kernel docs, though I don't fully parse all of what it says:

> In extreme cases, with many concurrent allocations and a complete breakdown of reclaim progress within the group, the high boundary can be exceeded. But even then it's mostly better to satisfy the allocation from the slack available in other groups or the rest of the system than killing the group. Otherwise, memory.max is there to limit this type of spillover and ultimately contain buggy or even malicious applications.
>
> Setting the original memory.limit_in_bytes below the current usage was subject to a race condition, where concurrent charges could cause the limit setting to fail. memory.max on the other hand will first set the limit to prevent new charges, and then reclaim and OOM kill until the new limit is met - or the task writing to memory.max is killed.
>
> The combined memory+swap accounting and limiting is replaced by real control over swap space.
>
> The main argument for a combined memory+swap facility in the original cgroup design was that global or parental pressure would always be able to swap all anonymous memory of a child group, regardless of the child's own (possibly untrusted) configuration. However, untrusted groups can sabotage swapping by other means - such as referencing its anonymous memory in a tight loop - and an admin can not assume full swappability when overcommitting untrusted jobs.
>
> For trusted jobs, on the other hand, a combined counter is not an intuitive userspace interface, and it flies in the face of the idea that cgroup controllers should account and limit specific physical resources. Swap space is a resource like all others in the system, and that's why unified hierarchy allows distributing it separately.

So, to me that reads like swap limiting needs to be handled specially by the application when setting up these limits for the cgroup(s) of a child. This is also the likely reason why docker has separate --memory-swap and --memory-swappiness options to docker run (man docker-run).
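
For illustration only (not something this PR touches): under the cgroup v2 "unified hierarchy" the quote describes, the memory and swap caps are separate files, so a sketch of pinning a group to the ~5M limit with no swap spillover might look like this; the /sys/fs/cgroup/demo path is hypothetical and assumed to already exist:

```go
// Illustrative sketch (not what this PR does): under cgroup v2, memory
// and swap limits live in separate files, so a group can be capped at
// ~5M of RAM while being denied any swap spillover.
package main

import (
	"log"
	"os"
	"path/filepath"
)

func main() {
	cg := "/sys/fs/cgroup/demo" // hypothetical, pre-created cgroup

	// Cap physical memory at 5 MiB.
	if err := os.WriteFile(filepath.Join(cg, "memory.max"), []byte("5242880"), 0o644); err != nil {
		log.Fatal(err)
	}
	// Forbid swap entirely, so memory.max is enforced against RAM alone.
	if err := os.WriteFile(filepath.Join(cg, "memory.swap.max"), []byte("0"), 0o644); err != nil {
		log.Fatal(err)
	}
}
```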

Without other details (I dunno the original situation), this sounds like a bug that's being uncovered by the OOM-test failure, and something that absolutely should NOT be overlooked/ignored. Though I also reserve the right to be dead wrong, it would be nice if someone tried to convince me 😀

@cevich (Contributor) commented Oct 24, 2017

@mrunalp thx for the chat. So to summarize, this isn't a cri-o bug because...

  1. k8s is responsible for passing down all the cgroup memory-limiting options into cri-o.
  2. The test that's failing is an 'integration test', which means k8s isn't in charge.
  3. There should be a test for this swap limit vs memory limit enforcement in the e2e tests, but there isn't.

This PR is just a quick fix to get CRI-O out the door; later we can consider bringing it back in and/or fixing the test so swap doesn't get in its way.

@runcom runcom merged commit e95f75e into cri-o:master Oct 24, 2017
@cevich (Contributor) commented Oct 24, 2017

@runcom @mrunalp Ugh, okay, just realized: this isn't actually going to fix the problem, because you've reverted one implementation of swap-enabling for an older (but worse) one. Meaning: swap is still turned on. It's probably just accidentally not working because of Origin-CI image caching...I've got an idea for a better fix...

@cevich (Contributor) commented Oct 24, 2017

...opened #1063
