Skip to content

Conversation

VihasMakwana
Copy link
Contributor

@VihasMakwana VihasMakwana commented Jun 30, 2025

Description

Add an option to recreate the database from scratch. This is useful if the underlying database is corrupted and there's no way to recover from it.

Ideally, we would want to recover from a panic. However, I came across the panic statements in bbolt's codebase and they are raised inside of a goroutine. Unfortunately, we can't recover from that panic in our collector process. So, our best route here is to provide an option to the user to recreate a new database.

Link to tracking issue

Fixes #36840, #35899

Testing

Documentation

@VihasMakwana VihasMakwana marked this pull request as ready for review July 2, 2025 13:21
@VihasMakwana VihasMakwana requested a review from a team as a code owner July 2, 2025 13:21
@VihasMakwana VihasMakwana requested a review from ArthurSens July 2, 2025 13:22
Copy link
Contributor

@swiatekm swiatekm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The change looks good to me technically, but I'm not sure it's the right thing to do. My impression was that we wanted to recover from a panic when opening the DB, and then create the backup. Is that not feasible.

@VihasMakwana
Copy link
Contributor Author

VihasMakwana commented Jul 14, 2025

The change looks good to me technically, but I'm not sure it's the right thing to do. My impression was that we wanted to recover from a panic when opening the DB, and then create the backup. Is that not feasible.

Those were my intentions as well. However, I came across the panic statements in bbolt's codebase and they are raised inside of a goroutine. Unfortunately, we can't recover from that panic in our collector process.

Essentially, it is like following piece of code:

func main() {
	defer func() {
		err := recover() // WON'T HELP BECAUE OF PANIC INSIDE OF A GOROUTINE
		fmt.Println(err)
	}()
	go func() {
		panic("OOPS!")
	}()
}

@VihasMakwana VihasMakwana requested a review from swiatekm July 15, 2025 07:05
@VihasMakwana
Copy link
Contributor Author

@open-telemetry/collector-contrib-approvers could someone take a look here? I have an approval from the codeowner.

@andrzej-stencel andrzej-stencel merged commit c105bc2 into open-telemetry:main Jul 16, 2025
179 checks passed
@github-actions github-actions bot added this to the next release milestone Jul 16, 2025
Dylan-M pushed a commit to Dylan-M/opentelemetry-collector-contrib that referenced this pull request Aug 5, 2025
…rlying file is corrupted. (open-telemetry#40986)

<!--Ex. Fixing a bug - Describe the bug and how this fixes the issue.
Ex. Adding a feature - Explain what this achieves.-->
#### Description

Add an option to recreate the database from scratch. This is useful if
the underlying database is corrupted and there's no way to recover from
it.

Ideally, we would want to `recover` from a panic. However, I came across
the `panic` statements in bbolt's codebase and they are raised inside of
a goroutine. Unfortunately, we [can't
recover](https://go.dev/wiki/PanicAndRecover#:~:text=A%20panic%20cannot%20be%20recovered%20by%20a%20different%20goroutine)
from that panic in our collector process. So, our best route here is to
provide an option to the user to recreate a new database.


<!-- Issue number (e.g. open-telemetry#1234) or full URL to issue, if applicable. -->
#### Link to tracking issue
Fixes open-telemetry#36840, open-telemetry#35899

<!--Describe what testing was performed and which tests were added.-->
#### Testing

<!--Describe the documentation added.-->
#### Documentation

<!--Please delete paragraphs that you did not use before submitting.-->

---------

Co-authored-by: Mikołaj Świątek <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

recover from filestorage panic on corrupted DB

5 participants