@JAicewizard

This currently also includes the changes of marcboeker#483, but I will rebase once that is merged.

This adds multiple ways to append, covering all the variations of table sources.
Using just the row-based source is actually slower than the current solution, which is perhaps to be expected: it operates very similarly but with extra steps. The real performance gains come from the parallel implementations.
The benchmarks don't show much improvement (~25% for both parallel row and parallel chunk), but that is because both are bottlenecked on appending the chunks in DuckDB itself. If the bottleneck were instead data ingestion/computation on the user side (rather than a simple counter), the results would be much more favorable to the parallel variations. Currently they barely show up as parallel on my laptop.
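
To illustrate why the parallel variants are limited by the append step, here is a minimal sketch of the general pattern: several goroutines fill chunks concurrently, but a single consumer appends them one at a time. It uses only the standard library. `chunk`, `produceChunks`, `appendAll`, `ingest`, and the `chunkSize` value are hypothetical names made up for this example, not the API added in this PR, and the "append" is simulated by summing values.

```go
package main

import (
	"fmt"
	"sync"
)

// chunk is a hypothetical stand-in for a batch of rows; it is not a go-duckdb type.
type chunk struct {
	rows []int64
}

// chunkSize is an assumed batch size chosen for illustration only.
const chunkSize = 2048

// produceChunks fills chunks for the half-open row range [start, end)
// and sends each one on out. This is the part that can run in parallel.
func produceChunks(start, end int64, out chan<- chunk) {
	for lo := start; lo < end; lo += chunkSize {
		hi := lo + chunkSize
		if hi > end {
			hi = end
		}
		c := chunk{rows: make([]int64, 0, hi-lo)}
		for v := lo; v < hi; v++ {
			c.rows = append(c.rows, v) // user-side work; a simple counter here
		}
		out <- c
	}
}

// appendAll simulates the serial append step: one goroutine drains the
// channel, so producers end up waiting whenever appending is the slow part.
func appendAll(in <-chan chunk) (rowCount, sum int64) {
	for c := range in {
		for _, v := range c.rows {
			sum += v
		}
		rowCount += int64(len(c.rows))
	}
	return rowCount, sum
}

// ingest splits total rows across the given number of producers and
// appends everything through a single consumer.
func ingest(total int64, workers int) (rowCount, sum int64) {
	out := make(chan chunk, workers)
	var wg sync.WaitGroup
	per := (total + int64(workers) - 1) / int64(workers) // rows per producer, rounded up
	for w := 0; w < workers; w++ {
		start := int64(w) * per
		end := start + per
		if end > total {
			end = total
		}
		if start >= end {
			continue // nothing left for this producer
		}
		wg.Add(1)
		go func(s, e int64) {
			defer wg.Done()
			produceChunks(s, e, out)
		}(start, end)
	}
	// Close the channel once every producer has finished.
	go func() {
		wg.Wait()
		close(out)
	}()
	return appendAll(out)
}

func main() {
	n, sum := ingest(10000, 4)
	fmt.Println(n, sum)
}
```

With a trivial producer like the counter above, nearly all the time goes to the single consumer, so adding producers helps little. If filling a chunk were expensive, the same structure would let that work overlap across cores.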

Another improvement would be setting entire vectors of data at a time instead of individual values; this could be done in a future PR.
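
As a rough illustration of the difference between setting values one at a time and setting a whole vector, here is a small sketch. The `column` type and its `setValue` and `setVector` methods are hypothetical and exist only for this example; they are not part of go-duckdb or of this PR. The point is only that a bulk copy replaces one call per value with one call per column.

```go
package main

import "fmt"

// column is a hypothetical fixed-capacity column buffer; not a go-duckdb type.
type column struct {
	data []int64
	n    int // number of values currently set
}

func newColumn(capacity int) *column {
	return &column{data: make([]int64, capacity)}
}

// setValue writes a single value at the given row: one call per value.
func (c *column) setValue(row int, v int64) {
	c.data[row] = v
	if row >= c.n {
		c.n = row + 1
	}
}

// setVector copies a whole slice in one call and returns how many values
// fit into the column's capacity.
func (c *column) setVector(vals []int64) int {
	n := copy(c.data, vals)
	c.n = n
	return n
}

func main() {
	vals := []int64{1, 2, 3, 4}

	// Value at a time.
	a := newColumn(len(vals))
	for i, v := range vals {
		a.setValue(i, v)
	}

	// Vector at a time.
	b := newColumn(len(vals))
	b.setVector(vals)

	fmt.Println(a.data, b.data)
}
```

Both paths leave the column in the same state; the vector path just does it with a single `copy`, which avoids per-value call overhead.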

@JAicewizard
Author

Let me know if this is still welcome.
