Description
I did this
Upgraded our application from curl 8.15.0 to 8.16.0 with no other changes.
While I don't have all of the details at this time, here are two issues that we immediately noticed which seem to have been introduced with #18263.
- Intermittent getaddrinfo memory leak on shutdown:
When we run our application with LeakSanitizer enabled, perform a single HTTP request with libcurl, and exit quickly (potentially before the request completes), LeakSanitizer reports a memory leak about 10% of the time. The getaddrinfo result is apparently never passed to freeaddrinfo:
==275324==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 64 byte(s) in 1 object(s) allocated from:
#0 0x5a3b33c486db in malloc asan_malloc_linux.cpp:67:3
#1 0x73e4d375b639 in generate_addrinfo nss/getaddrinfo.c:1081:16
#2 0x73e4d375b639 in gaih_inet nss/getaddrinfo.c:1199:12
#3 0x73e4d375b639 in getaddrinfo nss/getaddrinfo.c:2391:12
#4 0x5a3b33be862c in getaddrinfo sanitizer_common_interceptors.inc:2893:13
#5 0x5a3b3604a1d2 in Curl_getaddrinfo_ex lib/curl_addrinfo.c:122:11
#6 0x5a3b3610bb9f in getaddrinfo_thread lib/asyn-thrdd.c:252:10
#7 0x5a3b3604bd23 in curl_thread_create_thunk ib/curl_threads.c:57:3
#8 0x5a3b33c4638d in asan_thread_start(void*) asan_interceptors.cpp:239:28
#9 0x73e4d369caa3 in start_thread nptl/pthread_create.c:447:8
Indirect leak of 128 byte(s) in 2 object(s) allocated from:
#0 0x5a3b33c486db in malloc lib/asan/asan_malloc_linux.cpp:67:3
#1 0x73e4d375b639 in generate_addrinfo nss/getaddrinfo.c:1081:16
#2 0x73e4d375b639 in gaih_inet nss/getaddrinfo.c:1199:12
#3 0x73e4d375b639 in getaddrinfo nss/getaddrinfo.c:2391:12
#4 0x5a3b33be862c in getaddrinfo sanitizer_common_interceptors.inc:2893:13
#5 0x5a3b3604a1d2 in Curl_getaddrinfo_ex lib/curl_addrinfo.c:122:11
#6 0x5a3b3610bb9f in getaddrinfo_thread lib/asyn-thrdd.c:252:10
#7 0x5a3b3604bd23 in curl_thread_create_thunk lib/curl_threads.c:57:3
#8 0x5a3b33c4638d in asan_thread_start(void*) lib/asan/asan_interceptors.cpp:239:28
#9 0x73e4d369caa3 in start_thread nptl/pthread_create.c:447:8
SUMMARY: AddressSanitizer: 192 byte(s) leaked in 3 allocation(s).
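For illustration only, here is a minimal sketch of how a getaddrinfo result allocated on a worker thread can be released even when that thread is cancelled before it hands the result off. This is not libcurl's code: `resolve_ctx`, `free_result`, `resolver_thread` and `cancel_after_resolve` are hypothetical names, and the sketch assumes the leak comes from a cancelled thread, which the report above does not confirm. It resolves a numeric address so no network is needed.

```c
#include <netdb.h>
#include <pthread.h>
#include <semaphore.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical stand-in for a resolver context; not libcurl's struct. */
struct resolve_ctx {
  struct addrinfo *res; /* owned by the worker until it is handed off */
  int freed;            /* set to 1 once the cleanup handler freed res */
  sem_t resolved;       /* posted after getaddrinfo() has returned */
};

/* Cleanup handler: runs when the thread is cancelled (or exits). */
static void free_result(void *arg)
{
  struct resolve_ctx *ctx = arg;
  if(ctx->res) {
    freeaddrinfo(ctx->res);
    ctx->res = NULL;
    ctx->freed = 1;
  }
}

static void *resolver_thread(void *arg)
{
  struct resolve_ctx *ctx = arg;
  struct addrinfo hints;
  struct addrinfo *res = NULL;

  memset(&hints, 0, sizeof(hints));
  hints.ai_flags = AI_NUMERICHOST; /* numeric only, no DNS lookup */
  hints.ai_socktype = SOCK_STREAM;

  pthread_cleanup_push(free_result, ctx);
  if(getaddrinfo("127.0.0.1", NULL, &hints, &res) == 0)
    ctx->res = res;
  sem_post(&ctx->resolved);
  for(;;)
    sleep(1); /* cancellation point: the thread is cancelled here */
  pthread_cleanup_pop(0);
  return NULL;
}

/* Returns 1 if the result was freed by the cleanup handler, 0 if not,
   -1 if the thread could not be started. */
static int cancel_after_resolve(void)
{
  struct resolve_ctx ctx;
  pthread_t th;

  ctx.res = NULL;
  ctx.freed = 0;
  sem_init(&ctx.resolved, 0, 0);
  if(pthread_create(&th, NULL, resolver_thread, &ctx) != 0)
    return -1;
  sem_wait(&ctx.resolved); /* wait until the result exists */
  pthread_cancel(th);
  pthread_join(th, NULL);
  sem_destroy(&ctx.resolved);
  return ctx.freed;
}

int main(void)
{
  printf("result freed by cleanup handler: %d\n", cancel_after_resolve());
  return 0;
}
```

Without the pthread_cleanup_push/pthread_cleanup_pop pair, the same cancellation would leave the addrinfo list allocated with nobody left to free it, which is the kind of leak LeakSanitizer flags at exit.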
- Intermittent hang in async_thrdd_shutdown
Similarly, running our application many times in a loop results in a hang about 5% of the time. The hang occurs in async_thrdd_shutdown with the following callstack:
#0 futex_wait (private=0, expected=2, futex_word=0x74a0c3ae7a30) at ../sysdeps/nptl/futex-internal.h:146
#1 __GI___lll_lock_wait (futex=futex@entry=0x74a0c3ae7a30, private=0) at ./nptl/lowlevellock.c:49
#2 in lll_mutex_lock_optimized (mutex=0x74a0c3ae7a30) at ./nptl/pthread_mutex_lock.c:48
#3 ___pthread_mutex_lock (mutex=mutex@entry=0x74a0c3ae7a30) at ./nptl/pthread_mutex_lock.c:93
#4 async_thrdd_shutdown (data=data@entry=0x75d0c3af2100) at lib/asyn-thrdd.c:533
#5 Curl_async_thrdd_shutdown (data=0x75d0c3af2100) at lib/asyn-thrdd.c:585
#6 Curl_async_shutdown (data=0x75d0c3af2100) at lib/asyn-base.c:200
#7 multi_done (data=0x75d0c3af2100, status=<optimized out>, premature=false) at lib/multi.c:654
#8 multi_runsingle (multi=<optimized out>, nowp=<optimized out>, data=0x75d0c3af2100) at lib/multi.c:2596
#9 curl_multi_perform (m=<optimized out>, running_handles=<optimized out>) at lib/multi.c:2771
The hang is on the line that tries to acquire the mutex (lib/asyn-thrdd.c:533, frame #4 above).
The contents of the addr_ctx->mutx variable look like this:
$2 = {__data = {__lock = 2, __count = 0, __owner = 263166, __nusers = 1, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = "\002\000\000\000\000\000\000\000\376\003\004\000\001", '\000' <repeats 26 times>, __align = 2}
In particular, __lock = 2, which I believe means the mutex is currently locked, and __owner is TID 263166, which at the time of the hang does not refer to any running thread. I assume that TID belonged to the async resolver thread, which is no longer running.
What I think might have happened is that pthread_cancel was called on the resolver thread while the thread had the mutex locked and before it had a chance to unlock it, which of course is catastrophic. This might also explain the intermittent getaddrinfo memory leak.
I expected the following
The async resolver does not leak or hang.
curl/libcurl version
curl 8.16.0
operating system
Ubuntu Linux 24.04