-
Notifications
You must be signed in to change notification settings - Fork 7.9k
Handle already existing user session in the store #42451
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
Hello @mposolda and @pruivo - I had a look how Lines 60 to 72 in 4dc4de7
So I reworked JpaChangesPerformer to first read from the database before then continuing to either create the entity updating it. This raises the cost of creating a new session by one additional SELECT as part of an already existing transaction. While this is a round-trip to the database which is costly, the impact on the database should be smaller than the already existing INSERT. This could be in a later step be optimized in a more opportunistic way to try a plain insert without a select on the first attempt, and a select on any subsequent attempt. If that transaction will fail, it will then retry as part of the existing retry mechanism we have in place in either Please let me know your thoughts. Can you think of any additional checks that I should add if a session is found in addition to the session being new and is matching the current user? |
3fdcd8e to
be787c2
Compare
6c6e3b8 to
8cd5c2b
Compare
77a9546 to
36ab7b1
Compare
| do { | ||
| id = keyGenerator.generateKeyString(session, sessionTx.getCache()); | ||
| // prevent potential duplicate ID generation | ||
| } while (sessionTx.get(id) != null); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
uh!? What's the probability of a conflict? 1 in 2^122? It seems unnecessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I'll leave that for a separate PR
| } else if (merged.getOperation() == SessionUpdateTask.CacheOperation.ADD_IF_ABSENT) { | ||
| // This might happen if the user logs in via multiple tabs at the same time from an external broker, and the same authentication session creates | ||
| // multiple user sessions concurrently. | ||
| if (!Objects.equals(userSessionModel.getUserId(), entity.getUser())) { | ||
| throw new ModelIllegalStateException("User ID of the session does not match, the user ID should not change"); | ||
| } | ||
| if (Math.abs(userSessionModel.getStarted() - entity.getStarted()) > 10) { | ||
| throw new ModelIllegalStateException("Session has already aged, concurrent requests should not happen"); | ||
| } | ||
| // If the above matches, it is safe to continues to apply all updates to the entity as before in EmbeddedCachesTransformer |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This class is enormous and difficult to read, but I think this is dead code. You're invoking createUserSession before doing loadUserSession, so it should return the entity you just created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually the createUserSession was left-over from a manual test; I should have never added it.
This is now cleaned up, and there is a new test testConcurrentSessionCreation which tests the functionality even without concurrency.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand these checks, mainly because we don't care for those with volatile sessions; we just merge the changed fields.
My other concern is that it will fail all 20 retries we have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd say this is defensive programming, as bad things would happen if we connect a new user to an existing session that is bound to a different user. A user ID is assumed to be constant.
As an example: In #38671 where a user deleted their account, we forgot to start a new authentication session, and that lead to the next log in in that browser tab to use the same user session ID. The database showed a duplicate key error. If we would just continue with the old session, we would continue with a wrong user in that session.
Yes, I agree those 20 retries will fail, and I'd rather see the error and follow up on the cause, than having two user sessions mixed up (which is IMHO the very worst case that could happen for an IAM).
Checking the user ID is IMHO the minimal check. I'd rather add more fields to the check, but probably not as a backport and only if Marek agrees. We could have a follow-up issue for the other stores as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I find it very confusing to have the same session ID for different users. It looks like a problem in the authentication flow or in the cookies.
Other suggestions:
- I would like a comment about the
startedtimestamp and why the value 10. - Do not retry when
ModelIllegalStateExceptionis thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pruivo - Thank you for the feedback. The latest commit addresses your review comments. There are now code comments explaining the why, and the retry is skipped on a ModelIllegalStateException
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found that the I needed to revert the change and keep the retries. In run https://github.com/keycloak/keycloak/actions/runs/17609154462/job/50027163556?pr=42451 the test UserSessionConcurrencyTest.testConcurrentNotesChange showed that there can be an optimistic lock exception that is then later converted to an ModelIllegalStateException here:
Line 148 in b743b3d
| throw new ModelIllegalStateException("Database operation failed", t); |
I'd say this retry is rare and won't have a performance implications as those new exceptions would only occur in error cases.
| } else { | ||
| PersistentUserSessionAdapter userSessionModel = (PersistentUserSessionAdapter) userSessionPersister.loadUserSession(realm, entry.getKey().toString(), entity.isOffline()); | ||
| if (userSessionModel != null) { | ||
| if (merged.getOperation() == SessionUpdateTask.CacheOperation.ADD) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we use CacheOperation.ADD anywhere in the code. Can we eliminate it to simplify things a bit?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is "just" a safeguard, and I would like to keep it. The operation is used with other caches, and unfortunately this enum isn't specific to this cache.
As I don't know how to simplify it, I would keep it as is.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A switch may be simpler to read
switch (merged.getOperation()) {
case REMOVE:
remove();
break;
case ADD, ADD_IF_ABSENT:
entity = load();
if (entity == null) {
create();
} else {
merge();
}
break;
case REPLACE:
entity = load();
if (entity != null) {
merge();
}
break;
}There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I addressed this in a refactoring. Please have a look.
| userSessionQuery.setMaxResults(1); | ||
|
|
||
| Stream<OfflineUserSessionModel> persistentUserSessions = closing(userSessionQuery.getResultStream().map(this::toAdapter)); | ||
| Stream<OfflineUserSessionModel> persistentUserSessions = closing(userSessionQuery.getResultStream().map(this::toAdapter).filter(Objects::nonNull)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use userSessionQuery.getSingleResultOrNull()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rewritten to be imperative and not using streams.
| sessionManager = new UserSessionManager(currentSession); | ||
| KeycloakModelUtils.runJobInTransaction(session.getKeycloakSessionFactory(), session.getContext(), currentSession -> { | ||
| RealmModel realm = currentSession.realms().getRealmByName("test"); | ||
| UserSessionManager sessionManager = new UserSessionManager(currentSession); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused
| currentSession = sessionCR1; | ||
| sessionManager = new UserSessionManager(currentSession); | ||
| String realmId = KeycloakModelUtils.runJobInTransactionWithResult(session.getKeycloakSessionFactory(), currentSession -> { | ||
| UserSessionManager sessionManager = new UserSessionManager(currentSession); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused
| currentSession.users().addUser(realm, "user2").setEmail("user2@localhost"); | ||
| } | ||
| sessionManager = new UserSessionManager(currentSession); | ||
| UserSessionManager sessionManager = new UserSessionManager(currentSession); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unsued
1bf2369 to
3327ebf
Compare
|
@pruivo - I've reworked the code, please have another look. The unused snippets in the test class have been cleaned up together with other warnings. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unreported flaky test detected, please review
Unreported flaky test detectedIf the flaky tests below are affected by the changes, please review and update the changes accordingly. Otherwise, a maintainer should report the flaky tests prior to merging the PR. org.keycloak.testsuite.broker.KcOidcBrokerTest#testPostBrokerLoginFlowWithOTPKeycloak CI - Java Distribution IT (windows-latest - temurin - 17) |
|
@ahus1 @pruivo Thanks a lot for the updates! I will be fine with whatever optimizations at the store level you guys agree with. I will try to do some tests later today or tomorrow to confirm if the fix actually works for me to fix the issue . I will probably use the steps from this comment, which I've used before to simulate the issue. |
d90f91f to
2f1f64c
Compare
Closes keycloak#40374 Signed-off-by: Alexander Schwartz <[email protected]>
Signed-off-by: Alexander Schwartz <[email protected]>
2f1f64c to
4e7d87f
Compare
|
PR rebased to resolve conflicts. Please let me know if you are ok to merge this one @mposolda |
mposolda
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ahus1 Thanks! I've did some testing and not able to simulate anymore the scenario from #40374 (comment) . So from the functional perspective, it looks good to me and I am approving.
I did not review the details of the storage changes and did not review performance. Might be good to have review/approve from @pruivo as well, but leaving it to you.
|
@pruivo - can you please re-approve? |
Closes #40374
Follow-ups: