Skip to content

Conversation

secat
Copy link
Contributor

@secat secat commented Sep 29, 2021

Signed-off-by: Serge Catudal [email protected]

Previous support for exemplars on the remote write receiver endpoint was triggering the following error:

"unknown series ref. when trying to add exemplar: 0"

This PR aims to fix the problem.

Tested locally using a modified version of the opentelemetry-collector-contrib prometheusremotewriteexporter to send metrics with exemplars.

Here is a screenshot of exemplars received by the remote write receiver endpoint:
image

@secat secat force-pushed the fix-remotewrite-receiver-endpoint-exemplars branch from dafc2d9 to 708cd5e Compare September 29, 2021 21:53
@secat
Copy link
Contributor Author

secat commented Sep 30, 2021

@csmarchbanks and @cstyan this PR is ready for review. Thank you in advance.

@secat
Copy link
Contributor Author

secat commented Oct 19, 2021

@csmarchbanks, @cstyan and @codesome friendly reminder: this PR is ready for review. Thank you in advance! 😃

Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I think your previous code in write_handler was actually correct, the problem is that AppendExemplar will not try to find an existing series with the passed in labels if the ref is not found. A couple options that I can see:

  1. Change (*headAppender).AppendExemplar to use getByHash in addition to getByID.
  2. Expose/use GetRef, but that would likely involve some interface changes or casting, I would prefer the other option.

cc: @codesome, does option 1 seem reasonable to you?

if exemplarErr != nil {
// Since exemplar storage is still experimental, we don't fail the request on ingestion errors.
level.Debug(h.logger).Log("msg", "Error while adding exemplar in AddExemplar", "exemplar", fmt.Sprintf("%+v", e), "err", exemplarErr)
if index < len(ts.Exemplars) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will work, as there is no guarantee that the order of samples and exemplars in a single request are the same. Or if they correspond to each other at all.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I modified the function (*headAppender).AppendExemplar with the following code:

// Get Series
s := a.head.series.getByID(ref)
if s == nil {
	s = a.head.series.getByHash(lset.Hash(), lset)
}
if s == nil {
	return 0, fmt.Errorf("unknown series ref. when trying to add exemplar: %d", ref)
}

However, I've got panic on app.Commit():

ts=2021-10-21T14:54:34.632Z caller=web.go:96 level=error component=web msg="panic while serving request" client=172.17.0.29:52792 url=/api/v1/write err="runtime error: invalid memory address or nil pointer dereference" stack="goroutine 501 [running]:\ngithub.com/prometheus/prometheus/web.withStackTracer.func1.1()\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/web/web.go:95 +0x99\npanic({0x269aa60, 0x4b81600})\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/runtime/panic.go:1038 +0x215\ngithub.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5.1()\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/opentracing-contrib/go-stdlib/nethttp/server.go:150 +0x139\npanic({0x269aa60, 0x4b81600})\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/runtime/panic.go:1038 +0x215\ngithub.com/prometheus/prometheus/tsdb.(*headAppender).Commit(0xc00224bae0)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/tsdb/head_append.go:417 +0x8b6\ngithub.com/prometheus/prometheus/tsdb.dbAppender.Commit({{0x345f908, 0xc00224bae0}, 0xc000954780})\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/tsdb/db.go:871 +0x35\ngithub.com/prometheus/prometheus/storage.(*fanoutAppender).Commit(0xc00224ed80)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/storage/fanout.go:176 +0x3f\ngithub.com/prometheus/prometheus/storage/remote.(*writeHandler).write.func1()\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/storage/remote/write_handler.go:94 +0x31\ngithub.com/prometheus/prometheus/storage/remote.(*writeHandler).write(0xc000735160, {0x345ec58, 0xc002237410}, 0xc00223dc20)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/storage/remote/write_handler.go:125 +0x24e\ngithub.com/prometheus/prometheus/storage/remote.(*writeHandler).ServeHTTP(0xc000735160, {0x7fd3fd324ad8, 0xc00223dbd0}, 0xc002256900)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/storage/remote/write_handler.go:52 +0xb3\ngithub.com/prometheus/prometheus/web/api/v1.(*API).remoteWrite(0xc002237470, {0x7fd3fd324ad8, 0xc00223dbd0}, 0x340d140)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/web/api/v1/api.go:1383 +0x37\ngithub.com/prometheus/prometheus/web.(*Handler).testReady.func1({0x7fd3fd324ad8, 0xc00223dbd0}, 0xc00223dbd0)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/web/web.go:522 +0x39\nnet/http.HandlerFunc.ServeHTTP(0x7fd3fd324ad8, {0x7fd3fd324ad8, 0xc00223dbd0}, 0xc00049a000)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2046 +0x2f\ngithub.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerResponseSize.func1({0x7fd3fd324ad8, 0xc00223db80}, 0xc002256900)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:198 +0xa3\nnet/http.HandlerFunc.ServeHTTP(0x340d1a0, {0x7fd3fd324ad8, 0xc00223db80}, 0xc002237440)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2046 +0x2f\ngithub.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerDuration.func2({0x7fd3fd324ad8, 0xc00223db80}, 0xc002256900)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:76 +0xa2\nnet/http.HandlerFunc.ServeHTTP(0x3448120, {0x7fd3fd324ad8, 0xc00223db80}, 0x0)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2046 +0x2f\ngithub.com/prometheus/client_golang/prometheus/promhttp.InstrumentHandlerCounter.func1({0x3448120, 0xc00223db30}, 0xc002256900)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/client_golang/prometheus/promhttp/instrument_server.go:101 +0x92\ngithub.com/prometheus/prometheus/web.setPathWithPrefix.func1.1({0x3448120, 0xc00223db30}, 0xc002256800)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/web/web.go:1146 +0x290\ngithub.com/prometheus/common/route.(*Router).handle.func1({0x3448120, 0xc00223db30}, 0xc002256700, {0x0, 0x0, 0xc0009940c0})\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/common/route/route.go:83 +0x2ae\ngithub.com/julienschmidt/httprouter.(*Router).ServeHTTP(0xc0007961e0, {0x3448120, 0xc00223db30}, 0xc002256700)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/julienschmidt/httprouter/router.go:387 +0x84b\ngithub.com/prometheus/common/route.(*Router).ServeHTTP(0x10, {0x3448120, 0xc00223db30}, 0x1)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/common/route/route.go:126 +0x26\nnet/http.StripPrefix.func1({0x3448120, 0xc00223db30}, 0xc002256600)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2089 +0x330\nnet/http.HandlerFunc.ServeHTTP(0x7fd3fd073a70, {0x3448120, 0xc00223db30}, 0x7fd42414dd28)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2046 +0x2f\nnet/http.(*ServeMux).ServeHTTP(0x40d187, {0x3448120, 0xc00223db30}, 0xc002256600)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2424 +0x149\ngithub.com/opentracing-contrib/go-stdlib/nethttp.MiddlewareFunc.func5({0x3446c50, 0xc00183e700}, 0xc001ee0d00)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/opentracing-contrib/go-stdlib/nethttp/server.go:154 +0x62d\nnet/http.HandlerFunc.ServeHTTP(0x0, {0x3446c50, 0xc00183e700}, 0xc0017b8fc0)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2046 +0x2f\ngithub.com/prometheus/prometheus/web.withStackTracer.func1({0x3446c50, 0xc00183e700}, 0xc001fd0600)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/web/web.go:100 +0x97\nnet/http.HandlerFunc.ServeHTTP(0x7ffea9fd2f0f, {0x3446c50, 0xc00183e700}, 0x90)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2046 +0x2f\ngithub.com/prometheus/exporter-toolkit/web.(*webHandler).ServeHTTP(0xc000995980, {0x3446c50, 0xc00183e700}, 0x40)\n\t/home/scatudal/go/src/github.com/prometheus/prometheus/vendor/github.com/prometheus/exporter-toolkit/web/handler.go:101 +0x852\nnet/http.serverHandler.ServeHTTP({0x3439928}, {0x3446c50, 0xc00183e700}, 0xc001ee0d00)\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:2878 +0x43b\nnet/http.(*conn).serve(0xc0018b7e00, {0x345ec58, 0xc000e9b740})\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:1929 +0xb08\ncreated by net/http.(*Server).Serve\n\t/home/linuxbrew/.linuxbrew/opt/go/libexec/src/net/http/server.go:3033 +0x4e8\n"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you also update the ref when the exemplar is actually added? E.g. update

	a.exemplars = append(a.exemplars, exemplarWithSeriesRef{ref, e})

to

	a.exemplars = append(a.exemplars, exemplarWithSeriesRef{s.ref, e})

or maybe just set ref = s.ref after the getByHash call to make sure no one accidentally uses the wrong ref in the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so with this code, I don't have anymore panics and it seems to work:

// Get Series
s := a.head.series.getByID(ref)
if s == nil {
	s = a.head.series.getByHash(lset.Hash(), lset)
	if s != nil {
		ref = s.ref
	}
}
if s == nil {
	return 0, fmt.Errorf("unknown series ref. when trying to add exemplar: %d", ref)
}

Example:
image

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@codesome does the code change described above for the function (*headAppender).AppendExemplar would be acceptable?

Copy link
Contributor Author

@secat secat Oct 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated the code of this PR with the proposed fix

@secat secat force-pushed the fix-remotewrite-receiver-endpoint-exemplars branch from b8df8b3 to 43fd8a1 Compare October 21, 2021 17:46
Copy link
Member

@csmarchbanks csmarchbanks left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 from me.

@roidelapluie roidelapluie merged commit 8c3eca8 into prometheus:main Oct 21, 2021
@roidelapluie
Copy link
Member

Thanks! If you feel like a test would be nice, feel free to open a followup PR. I am merging this to cut the release candidate.

Rudy167 pushed a commit to Rudy167/prometheus that referenced this pull request Oct 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants