Releases · duckdb/duckdb

@Tishj

This preview release of DuckDB is named "Undulata" after the aptly named Yellow-billed duck native to Africa.

Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.
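The migration suggested above comes down to two SQL statements, one run with the old DuckDB version and one with the new. The sketch below only builds those statement strings. The helper name `migration_statements` and the directory name `duckdb_migration` are hypothetical, and running each statement against the right version is left to whichever client you use.

```python
from typing import Tuple


def migration_statements(export_dir: str) -> Tuple[str, str]:
    """Return (export_sql, import_sql) for moving data between versions.

    Hypothetical helper: it only formats the two statements named in the
    release note and does not connect to DuckDB.
    """
    # Double any single quotes so the path is a valid SQL string literal.
    quoted = export_dir.replace("'", "''")
    export_sql = f"EXPORT DATABASE '{quoted}';"  # run with the OLD version
    import_sql = f"IMPORT DATABASE '{quoted}';"  # run with the NEW version
    return export_sql, import_sql


export_sql, import_sql = migration_statements("duckdb_migration")
print(export_sql)
print(import_sql)
```

Run the first statement in a session of the old version, then run the second against a fresh database file opened with the new version.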

What's Changed

[Dev] Merge master into feature by @Tishj in #7535
Issue #7563: make_timestamptz by @hawkfish in #7597
Add support for nested laterals by @arhamchopra in #7528
Issue #7563: epoch_us(temporal) by @hawkfish in #7629
Fix lingering clang-tidy issues by @Mytherin in #7670
Add list_intersect, list_has_any, and list_has_all by @maiadegraaf in #7518
Issue #7563: epoch_xs(temporal) by @hawkfish in #7648
Pivot - dynamically switch between using filtered aggregates or the new pivot operator by @Mytherin in #7688
Add wildcard to JSON Path by @lnkuiper in #7624
[Dev] Add optional build flag to disable assertions in debug mode by @Tishj in #7618
[DEV]: ICU C Casts by @hawkfish in #7715
List_resize by @maiadegraaf in #7401
Issue #7187: AsOf Join Performance by @hawkfish in #7607
Some minor CI changes by @samansmink in #7763
Binder coverage by @hawkfish in #7791
Vacuum Completely Deleted Row Groups by @Mytherin in #7794
Issue #7187: AsOf Coverage by @hawkfish in #7774
Implement FIELD_IDS for parquet writes by @lnkuiper in #7696
Optimize Regexp_matches to LIKE statements when possible by @Tmonster in #7264
Jemalloc configuration, more buffer allocator, and remove redundant string copying in parquet dictionary by @lnkuiper in #7697
Truncate Database File on Checkpoint by @Mytherin in #7824
LEFT JOIN ON TRUE support by @taniabogatsch in #7821
Issue #7809: Segment Tree Performance by @hawkfish in #7831
C Data Interface: duckdb_arrow_scan and duckdb_arrow_array_scan by @angadn in #7570
Update Julia to 0.8.1 by @Mytherin in #7932
Add conn.interrupt() to DuckDB python API by @henrinikku in #7895
renaming part of extension build refactor PR by @samansmink in #7926
fix swapped x/y regression parameters by @MartinNowak in #7855
[Docs] Aggregate function README.md by @hawkfish in #7881
PhysicalPiecewiseMergeJoin improvement by @xuke-hat in #7832
Initial set of commits to add support for zOS (an IBM mainframe operating system) by @v1gnesh in #7805
test(nodejs): add test_all_types.test.ts by @Mause in #7740
Issue #7879: Missing JDBC TIMESTAMP_TZ by @hawkfish in #7922
Attempt to fix CI on Windows 32 and Python on Windows by @carlopi in #7961
Fix 7947 by @lnkuiper in #7963
test: patch test_7652 to skip on pyarrow<11 by @gforsyth in #7966
NodeJS: Add columns() method to get type info from prepared statement by @Maxxen in #7948
Fix: Don't free arrow children explicitly by @Maxxen in #7917
CSV Rejects table by @Maxxen in #7681
Issue #7809: Segment Tree Performance by @hawkfish in #7891
Add tpch benchmark run exclusively on parquet files by @Tmonster in #7519
Bidirectional check storage + minor CI fixes by @carlopi in #7955
[Swift] fix #7985 by @tcldr in #7993
Move @samansmink's extension_header_rename.patch by @carlopi in #8001
[Python] Properly use NumPy array stride when scanning object arrays. by @Tishj in #7964
CI - No longer run on PR synchronize, instead run on ready_for_review by @Mytherin in #8007
Parallel pipeline execution should call NextBatch on first batch by @bleskes in #7978
Micro-optimization for generating collation keys by @Krechals in #7983
Multiple assignment for UPDATE SET by @nickgerrets in #7968
CI job to move synchronized PRs to draft by @carlopi in #8010
[ADBC] ConnectionGetTableSchema and StatementSetSubstraitPlan Functions by @pdet in #7914
Issue #7852: Window Vectorisation by @hawkfish in #7996
Moving JDBC Linux x64 builds to CentOS 7 by @hannes in #7991
CI Draft - token is called GH_TOKEN by @Mytherin in #8016
Add support for materialized CTEs by @kryonix in #7126
Reduce memory usage of Parquet writer by @lnkuiper in #7995
CI auto draft: pass token via environment + avoid wrapping action by @carlopi in #8024
CI autodraft: use implicit variable [test] by @carlopi in #8027
remove duplicate pivots declare by @douenergy in #7992
Fix typo in fts indexing exception by @alexanderchiu in #8034
Fix issue 7988 by @samansmink in #8023
Delete DraftMe.yml by @Mytherin in #8048
Fix 3eb9ab3: Remove unneeded move by @carlopi in #8038
[CI] Skip many more CI jobs for pull requests, and add make coverage-check to run coverage locally by @Mytherin in #8046
Extension build configuration refactor by @samansmink in #7735
Compressed Materialization by @lnkuiper in #7644
[Relation] Add support for creating an empty ValueRelation by @Tishj in #7967
Join Order Optimizer has duplicate enumerations and lost some neighbors by @lokax in #7358
Fix CI wasm by @carlopi in #8057
[CI] More CI reduction and clean-up by @Mytherin in #8052
Restore auto-draft functionality by @carlopi in #8058
Move WebAssembly.yml to NightlyTests.yml by @carlopi in #8060
Unskip, attach HTTPFS test, and create HTTPState when the opener is not available by @pdet in #8012
CI fixes: Don't persist ccache for nightlies by @carlopi in #8075
Fix regression & fix draft mechanism by @carlopi in #8071
CI compliance feature branch by @carlopi in #8070
Fix python flaky test (potentially GET requests gets back 403) by @carlopi in #8074
[Arrow] Fix segfault in appending list data by @Tishj in #8042
Issue #7852: Window Vectorisation by @hawkfish in #8050
CONTRIBUTING.md by @carlopi in #8077
Add ORDER BY clause to query in test_bool.test by @Flogex in #8082
ART test and benchmark refactor by @taniabogatsch in #8055
Update plan cost runner script to remove 20% threshold for cardinality estimates by @Tmonster in #7989
Fix #8067 by @lnkuiper in #8090
ART prefix refactor by @taniabogatsch in #7930
Bump Substrait by @pdet in #8110
Merge feature into master by @Mytherin in #8136
Increase memory limit in test to prevent non-deterministic CI failures by @lnkuiper in #8138
UNNEST binder fix by @taniabogatsch in #8111
Out-of-Core Hash Aggregate by @lnkuiper in #7931
Add Unittests for ODBC by @maiadegraaf in #7875
Hive types by @lverdoes in #7674
...

@Mytherin

This is a bug fix release for various issues discovered after we released 0.8.0. There are no new features, just bug fixes. Database files created by DuckDB v0.8.0 can be read by DuckDB v0.8.1 (i.e. v0.8.1 is backwards compatible with v0.8.0). Note that database files created by v0.8.1 cannot be read by DuckDB v0.8.0 (i.e. v0.8.0 is not forwards compatible with v0.8.1).

Changes

[Julia] Update DuckDB_jll to v0.8.0 by @Mytherin in #7568
CSV reader - allow parallel option to be set in COPY statement as well by @Mytherin in #7579
shell: Remove .dbinfo command. by @omo in #7569
Catalog::LookupEntry(): Remove unused code. by @omo in #7557
Add the default scheme to the CREATE TYPE's type search path. by @omo in #7555
Use std::all_of instead of raw loop in Disjoint. by @ttsugriy in #7549
feat: introduce a common grammar/types file for libpgquery parser and update Python scripts to take source/target directory paths as argument by @stephaniewang526 in #7574
Fix #7582 - correctly set "last_offset" in InitializeScanWithOffset and turn assertion into run-time check by @Mytherin in #7586
Partially fix #7551 - throw internal exception in case of type mismatch in ExpressionExecutor by @Mytherin in #7587
Fix #7602 - allow reserved keywords in named parameters by @Mytherin in #7604
Fix #7599 - output a clear error message when a subquery is used in a table function that does not support it by @Mytherin in #7603
Rework Code Coverage CI - Remove CodeCov and instead track uncovered lines explicitly + turn lack of coverage into a CI failure by @Mytherin in #7611
Use unordered_set insert range overload. by @ttsugriy in #7615
Reserve expression_costs storage. by @ttsugriy in #7608
[ADBC] Testing Unhappy Paths, Fixing Memory Leaks from Error Setting, Removing Macros by @pdet in #7589
Windows - path is only absolute if path starts with a single back-slash by @Mytherin in #7623
Fix #7564 - if the auto-complete extension is not enabled, inline it into the shell by @Mytherin in #7621
Remove 2 extra bytes from magic string pattern. by @ttsugriy in #7626
Avoid unnecessary table lookup. by @ttsugriy in #7630
Reserve enough storage for unbound_expressions. by @ttsugriy in #7627
Increment code coverage by @Mytherin in #7636
Remove all C-style casts and add clang-tidy rule to forbid them by @Mytherin in #7656
Fix sql auto complete extension CI issue by @Mytherin in #7650
Add missing entries to ParquetDecodeUtils::BITPACK_MASKS by @Tishj in #7658
Fix: allow distinct and order by in list aggregates by @taniabogatsch in #7638
Rework the AggregateExecutor interface to no longer have unnecessary pointers and arrays by @Mytherin in #7671
Fix #7660 - avoid exporting the same catalog multiple times in EXPORT by @Mytherin in #7676
Move BindUpdateConstraints into a virtual function that is implemented by the DuckTableEntry by @Mytherin in #7679
Fix #7567 - when setting the schema to a different schema within another catalog, keep the correct catalog by @Mytherin in #7678
Fix exception fmt by @carlopi in #7683
Fix amalgamation build by avoiding overloading multiplication by @carlopi in #7661
Fix #7659 - use correct catalog when replaying a CREATE TABLE in the WAL by @Mytherin in #7675
Implement #7662 - add the "lock_configuration" setting which allows configurations to be locked down by @Mytherin in #7682
Fix #7663 - add in_search_path function, correctly show temporary views in SHOW TABLES, and show views in SHOW ALL TABLES by @Mytherin in #7680
expose the StripUnicodeSpaces parser utility method by @stephaniewang526 in #7705
Add FuzzyDuck fuzzer - and move fuzzer CI to separate repo by @Mytherin in #7712
Add missing std::move for old GCCs by @Mytherin in #7714
[Dev] Fix failing assertion in python debug by @Tishj in #7722
Fix crash in ArrowTableFunction::GetArrowLogicalType on Linux by @Tishj in #7718
Allow core duckdb to handle unrecognized JDBC configuration by @elefeint in #7713
[ADBC] Transactions and explicitly not-supporting Partition Reading/Execution by @pdet in #7639
Verify that Parallel CSV Reader skips lines mid-threads by @pdet in #7637
Fix issue with setup.py builds without dependencies by @samansmink in #7695
[Python] Fix tests for Pandas 2.0.2 by @Tishj in #7726
Code Coverage CI check - allow one uncovered line by @Mytherin in #7724
Generate default_types from json files by @Tishj in #7646
Fix fuzzer issues found by new fuzzer CI runs by @Mytherin in #7736
[Python] Fix conversion of deeply nested dictionaries by @Tishj in #7739
Fix TupleDataCollection List serialization by @lnkuiper in #7741
Fuzzer #156: Copy Before Swizzle by @hawkfish in #7747
Minor fixes to failing CI runs by @carlopi in #7768
Fix more fuzzer issues found by new fuzzer CI by @Mytherin in #7759
Add option to disable serialization by @stephaniewang526 in #7745
fix(httpfs): correct listobjectv2_url for strict s3/http servers by @Mause in #7761
Fuzzer #209: Multiple Scalar Blocks by @hawkfish in #7764
Fuzzer #206: Fix Cast Overflow by @hawkfish in #7770
More minor CI fixes by @Mytherin in #7779
Add Exception on dependency verification for Enum Types and Temp Tables by @pdet in #7641
Add fuzz_all_functions fuzzer, and add support for varargs to test_vector_types by @Mytherin in #7754
JSON fixes by @lnkuiper in #7762
[Julia] Fix issue related to table function callbacks and IO by @Tishj in #7783
[Dev] Use sql in the python_regression_test.py. by @Tishj in #7787
Allow core duckdb to handle unrecognized C API configuration by @elefeint in #7804
Fuzzer #214: ROWS BETWEEN Overflow by @hawkfish in #7767
Add tests to cover issue 5132 and enable force reload by @taniabogatsch in #7800
Fuzzer #215: Timestamp Arithmetic Overflow by @hawkfish in #7769
Remove grammar support for CREATE/DROP DATABASE by @stephaniewang526 in #7806
Serialize: fix some uncovered cases, part 1 by @carlopi in #7810
CodeCov tweaks by @carlopi in #7815
fix(jdbc): arrow error handling by @Mause in #7814
Fix duck fuzzer #218 and #220 by @carlopi in #7818
Add msan and ubsan to cifuzz (+ fix zstd + msan) by @carlopi in #7813
Art bug fixes by @taniabogatsch in #7801
Check GlobalSortState for external scan in PhysicalWindow by @lnkuiper in #7827
remove un-used PGNodeTag by @stephaniewang526 in #7833
refactor(fsspec): remove seekable flag by @Mause in #6585
Unnest_rewriter fixes by @taniabogatsch in #7836
[Julia] Fix comments on #7783 by @Tishj in #7843
Disable attaching on-disk DuckDB databases if external access is disabled by @Mytherin in #7850
Fix #7711 - disallow detaching the currently USEd database by @Mytherin in #7851
[Python] only execute in DuckDBPyRelation::Close if it was never executed before by @Tishj in #7844
Add rel_from_table_function to R relational API by @hannes in https://github.com/duckdb/d...

@Tmonster

This preview release of DuckDB is named "Fulvigula" after the Mottled duck (Anas fulvigula) which lives in the Gulf of Mexico, where it is apparently highly prized amongst (heartless) hunters.

There are two SQL-level breaking changes in this release:

#7174 The default sort order switched from NULLS FIRST to NULLS LAST because this is more intuitive, especially in conjunction with LIMIT.
#7082 The division operator / now always produces a floating-point result, even with integer operands. The new operator // retains the old semantics. This change is consistent with Python.
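Both changes can be illustrated without DuckDB. The sketch below is a plain-Python emulation, not DuckDB code: `order_by` is a hypothetical helper that uses `None` in place of NULL, and the division lines show Python's own operators, which the note cites as the model. Only positive operands are used, since this example makes no claim about rounding for negative values.

```python
from typing import List, Optional


def order_by(values: List[Optional[int]], nulls_first: bool) -> List[Optional[int]]:
    """Sort ascending, placing None (standing in for NULL) first or last."""
    non_null = sorted(v for v in values if v is not None)
    nulls = [v for v in values if v is None]
    return nulls + non_null if nulls_first else non_null + nulls


scores = [30, None, 10, None, 20]

# Emulates: SELECT score FROM t ORDER BY score LIMIT 2
old_top2 = order_by(scores, nulls_first=True)[:2]   # old default: NULLS FIRST
new_top2 = order_by(scores, nulls_first=False)[:2]  # new default: NULLS LAST
print(old_top2)  # [None, None]
print(new_top2)  # [10, 20]

# Python's operators: / yields a float, // keeps integer division.
print(7 / 2)   # 3.5
print(7 // 2)  # 3
```

With NULLS FIRST, the two-row limit is filled entirely by NULLs. With NULLS LAST, it returns the two smallest actual values, which is what most queries of this shape are after.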

Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.

What's Changed

Issue 5984 #4 LogicalColumnIndex out of range Error by @Tmonster in #6303
Implementing Integration with PyTorch by @pdet in #6295
Implement #4941: Python client: for streaming fetches construct a streaming result (fetch_one, record_batch_reader, etc) by @Mytherin in #6346
Implement sharable Buffer Pool across DatabaseInstances by @jkub in #6299
Add table functions range and generate_series for TIMESTAMPTZ by @papparapa in #6285
Add Initial DuckDB Swift API by @tcldr in #6351
Integration with TensorFlow Tensors by @pdet in #6348
Windows - remove delayload code and enable statically linking extensions by default by @Mytherin in #6399
Add support for Pivot/Unpivot statements by @Mytherin in #6387
[C-API] Add support for StreamQueryResult by @Tishj in #6318
[Swift] add remaining non-composite types by @tcldr in #6422
[Swift] Add Prepared Statements by @tcldr in #6459
[Python] Exclude jemalloc files while pip install on Android OS by @papparapa in #6450
CI: Swap cron for repository_dispatch by @carlopi in #6498
CI improvements + add version badge to README by @carlopi in #6493
Storage: store lists as uint64 offsets instead of as list_entry_t by @Mytherin in #6499
two changes facilitating sending table/column stats over the wire (M… by @peterboncz in #6440
Rework Value class internals to have a similar structure to LogicalType and others by @Mytherin in #6503
Remove unswizzle flag from SortedData::Unswizzle by @lnkuiper in #6501
[Swift] Add Appender by @tcldr in #6482
JDBC: Remove DuckDBDatabase by @MariusVolkhart in #6426
Add nan and inf arithmetic by @Tmonster in #6415
Update tools/rpkg README.md by @Tishj in #6530
Merge feature into master by @Mytherin in #6534
Restrict threads for reliability. by @hawkfish in #6540
Replace replace with format strings by @domoritz in #6542
Add missing escape for " by @domoritz in #6543
Blob <-> Bitstring casting by @LindsayWray in #6488
Mapfunctions: map_entries, map_values, map_keys by @LindsayWray in #6522
Issue #5920: Ordered Aggregate Buffering by @hawkfish in #6539
Handle SQL-tagged strings correctly with dplyr::tbl, fixes #6506 by @rsund in #6536
CI: Update Swift.yml by @carlopi in #6553
Update SwiftRelease.yml by @carlopi in #6554
Java: Implement JDBC 4.1 by @MariusVolkhart in #6376
Bitstring aggregations by @LindsayWray in #6417
Make our default threads setting Cgroup-aware on Linux by @Tishj in #6550
[Swift] Add composite type support by @tcldr in #6557
Statistics Rework: Switch to single BaseStatistics class, use separate static classes for methods on the stats instead by @Mytherin in #6560
Introduce Syntax for SEMI and ANTI joins by @Tmonster in #6480
Update storage_info with version 0.7.1 by @carlopi in #6572
[Python] Add the ability to supply a DuckDBPyRelation instance to register by @Tishj in #6483
[Python] map now defaults to original type when analyzed type at bind is NULL by @Tishj in #6571
[Dev] Fix broken test_filesystem.py test by @Tishj in #6582
CI: Node.js, add common NPM-setup step by @carlopi in #6590
build: add builds for nodejs linux arm64 by @Mause in #6586
CI: move to setup-node@v3 by @carlopi in #6596
Issue #6604: TIMESTAMP <=> TIMESTAMPTZ by @hawkfish in #6605
[Python] Add support for EXPLAIN ANALYZE to explain method by @Tishj in #6561
Add ICU list functions generate_series and range by @papparapa in #6445
feat(nodejs): add errorType attribute to DuckDbError by @Mause in #6434
Fix TPC-DS date insertion by @ywelsch in #6591
Fix #4016: Test amalgamation with --split param by @carlopi in #6587
feat(python): throw HTTPExceptions instead of IOException for http errors by @Mause in #6533
Add httpfs config to support packaging it as an extension by @ankrgyl in #6608
Issue #6595: N-Ary Positional Joins by @hawkfish in #6598
[Swift] inline documentation plus API tweaks by @tcldr in #6614
Fix #6602: add inet extension to build/distribute script by @Mytherin in #6610
CI remove amalgama x8 + swift release by @carlopi in #6615
Fix too many open file handles during JSON schema detection by @lnkuiper in #6613
Issue #6580: Parquet Int96 Timestamps by @hawkfish in #6601
Exception_static_build defalt: Partial revert of dabbead by @carlopi in #6620
Make DISTINCT ON respect the ORDER BY clause similar to Postgres + several ordered aggregate improvements by @Mytherin in #6616
fix url encode issue for R2 by @samansmink in #6609
[Swift] Database.Configuration type + documentation enhancements by @tcldr in #6617
R: Avoid passing SEXP by reference by @krlmlr in #6475
Test and fix preservation of class attribute in external pointers by @krlmlr in #6526
Add support for lambda functions to COLUMNS, and allow COLUMNS to be used in the ORDER BY/WHERE clauses by @Mytherin in #6621
[R] Remove duplicate occurrence of dependency by @Tishj in #6625
Automatically Fully Download Files through HTTPFS if no length header is provided by @pdet in #6448
Remove some function calls that can throw potential false positives in CI by @Tmonster in #6623
[Python] Add __getattr__ and __getitem__ implementations for DuckDBPyRelation by @Tishj in #6624
[Optimizer] Regex Optimization Rule fix by @Tishj in #6634
[Bug Fix] Enum Serialization by @pdet in #6040
Update interval for arrow by @handstuyennn in #6515
SQLLogicTest - instead of moving prepared statements over avoid restarting database when there are prepared statements by @Mytherin in #6638
Bind replace table function by @samansmink in #6639
Fix #6630: correctly set bind_data->types in the Parquet scan when using union_by_name by @Mytherin in #6642
[Python] read_csv can now read from a file-like object. by @Tishj in #6568
Fix #6640: correctly throw an error on altering schemas by @Mytherin in #6643
Support multiple aggregates in top-level pivot by @m...

@Mytherin

This is a bug fix release for various issues discovered after we released 0.7.0. There are no new features, just bug fixes. Notably, there is no incompatibility with database files created with v0.7.0.

Changes

When building extensions we need to add _storage_init to the whitelist on MacOS by @Mytherin in #6243
Some more read_json_auto bugfixes by @lnkuiper in #6244
Fix for Thrift.h: std::iterator is deprecated by @hannes in #6250
Add missing shell mode descriptions by @papparapa in #6256
Fix #6255: Shell should be installed in INSTALL_BIN_DIR by @Mytherin in #6266
Bump Julia to v0.7.0 by @Mytherin in #6280
Skip headers in read_csv functions as well by @pdet in #6267
Correctly compute Windows terminal width, and add a .maxwidth option to the shell for duckbox mode by @Mytherin in #6274
Fix lateral join bug by @taniabogatsch in #6268
fix: add storage_version_info entry for v0.7.0 by @Mause in #6279
Fix to #5461 by @annnei in #6265
CI fixes by @Mytherin in #6289
[Fuzzer] Fixes fuzzer issue 11 by @Tishj in #6191
Partially Fix #6253: Improve handling of timezones in the regular VARCHAR -> TIMESTAMP cast by @Mytherin in #6283
Error message on no content-length header by @samansmink in #6293
fixes #6238 by @rpbouman in #6239
fixes #6236 by @rpbouman in #6252
Missing extension exceptions by @lverdoes in #6294
feat: allow extensions to implement CREATE/DROP DATABASE by @rjatwal in #6115
fix(python): python object types in stubs by @Mause in #5732
Fix UPSERT binding issue related to the source table_index by @Tishj in #6275
fix: DESCRIBE does not show primary key by @gkaretka in #6068
Fix #6276: avoid transforming the root arg of a case expression multiple times by @Mytherin in #6300
More read_json(_auto) bugfixes by @lnkuiper in #6281
JDBC: Expand Blob, add UUID support by @MariusVolkhart in #6302
CMake: Move from GREATER_EQUAL to GREATER, fixing #5528 by @carlopi in #6310
Implement #6003 - add names option to CSV reader by @Mytherin in #6308
CI: Test for cron based workflows by @carlopi in #6311
CI Fix + match tests on less specific error messages by @Mytherin in #6320
Fix #6314: select correct block index in IEJoin - and fix issues with left/right IE join resuming in case of multiple matches by @Mytherin in #6323
CI: all workflows moved to nightly by @carlopi in #6334
Fixes #6315: keep names/types around so description can be used after result is closed by @Mytherin in #6326
Fix #5800: add missing Copy() calls, and add ALTERNATE_VERIFY method to verify Copy of INSERT/UPDATE/DELETE/COPY statements by @Mytherin in #6327
Apply lower casing to extension aliases by @Mytherin in #6331
Fix #6304: correctly handle NULL partitions and constant vectors, plus handle default parameters in COPY by @Mytherin in #6336
[Python] DuckDBPyRelation: Change explain method and add sql method by @Tishj in #6287
Fix Polars CI and properly implement check_ methods in the dataframes by @pdet in #6347
Fixing a clang16 problem that slipped through by @hannes in #6345
Fix #6341: LEFT/RIGHT/OUTER join on condition that is always true is only equal to a cross product if the other side is not empty by @Mytherin in #6342
CI: Skip any CI on branches named 'feature' or 'master' by @carlopi in #6350
Add correct bail-out to CSV auto-detection on oddly/inconsistently formatted CSV files by @Mytherin in #6330
CI: Invert path-ignore for tools folders by @carlopi in #6353
NULLs sort last in relational by @krlmlr in #5994
Properly deal with Star (*) expressions in COPY ... (FORMAT JSON) by @lnkuiper in #6319
fixes #6227 by @rpbouman in #6230
fix typos in dictionary_store_worst_case.benchmark by @hnjylwb in #6371
Julia: Support change timezone config by @xcaptain in #6358
Paths-ignore on push by @carlopi in #6363
JDBC - Add separate treatment for timestamptz values by @Jens-H in #6364
bugfix: switch to fsspec's strip protocol impl by @Mause in #6361
Disable tidy on ODBC for now by @Mytherin in #6379
Implements function "sqlite3_column_table_name" for the sqlite3 wrapper by @TinyTinni in #6385
[Python] No jemalloc for successful build on android by @papparapa in #6383
throw BinderException on empty list in percentile by @samansmink in #6378
Add optimizer flag to R and Python Substrait api by @LindsayWray in #6097
fixes #6269 by @rpbouman in #6291
Java: Use automatic resource management for AutoCloseable types by @MariusVolkhart in #6377
Fix progress bar in (parallel) CSV reader by @Mytherin in #6397
Fix #6393: for DESCRIBE order by column_index instead of column_name by @Mytherin in #6398
ART (bug) fixes by @taniabogatsch in #6396
[NodeJS] Support multi-statement prepare by @Tishj in #6278
Java: Use StringBuilder where appropriate by @MariusVolkhart in #6373
Bitpacking bug by @samansmink in #6402
bugfix(fsspec): missing fs methods by @Mause in #6395
Auto-load HTTPFS extension when http(s)/s3 files are queried and it is not loaded + upgrade SQLite scanner version/other extension fixes by @Mytherin in #6401
Add helpful error message if a setting from an extension is attempted to be set when the extension is not loaded by @Mytherin in #6406
namespace typos in blocking concurrent queue by @csruiliu in #6408
Java: Implement DatabaseMetaData#isReadOnly() by @MariusVolkhart in #6375
CI fixes by @carlopi in #6414
Parquet: for DELTA_BYTE_ARRAY encoding verify that lengths of subsequent arrays do not exceed length of BYTE_ARRAY by @Mytherin in #6412
Fix #6235: correctly return catalog for views in information_schema by @Mytherin in #6413
Enable CMAKE_EXPORT_COMPILE_COMMANDS ON default by @JackDrogon in #6394
Fix #5878: only delete the temp directory if we created it, otherwise delete only our temp files by @Mytherin in #6425
Fix under-specified test by @Mytherin in #6419
fix: logic fix to allow storage extension to implement DROP DATABASE by @stephaniewang526 in #6430
Map bug combo of const & non-const lists by @LindsayWray in #6354
Issue #6272: Window Scaled Repartitioning by @hawkfish in #6366
respect column order for partitioned write by @samansmink in #6436
Properly initialize string vector when reading large JSON arrays of strings by @lnkuiper in #6437
Fix #6420 - correctly delete temporary files that are not explicitly read back but just dropped by @Mytherin in #6424
Julia: support Pkg.test() by @chris-b1 in #6431
fixes sqlite3_column_bytes nullptr access on some call ordering by @TinyTinni in #6409
Write struct fields as optionally quoted in EXPORT DATABASE by @Tishj in #6416
Enables sqlite3 wrapper tests for win32 builds by @TinyTinni in #6427
Adding separate extension_directory configuration setting by @hannes in https://github.com/duckdb/duckdb/...

@Mytherin

This preview release of DuckDB is named "Labradorius" after the Labrador duck (Camptorhynchus labradorius) which was native to North America and went extinct in 1878 despite its reportedly bad taste.

Again, @Mytherin has written a blog post explaining the exciting list of new features in this release.

Binary builds are listed at the bottom of this post. Please note that it can take a couple of hours until binary builds for all platforms and environments are available.

Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.

What's Changed

Use structs to avoid confusing C pointer wrappers by @krlmlr in #4961
Enum type added to the types metadata table by @LindsayWray in #5290
R: code format by @krlmlr in #5185
Add starts_with function and operator by @papparapa in #5334
Feature: Allow binary-formatted strings to be cast to integers by @Maxxen in #5337
For range joins use NL join when the LHS or RHS side is tiny by @Mytherin in #5399
Add support for LATERAL joins by @Mytherin in #5393
[Julia] Add support for consuming a UNION vector into a DataFrame by @Tishj in #5360
Issue #5314: At Time Zone by @hawkfish in #5341
Decimal values now round when the value given has more decimals than the scale of the target by @Tishj in #5362
Shell: add individual SQL queries to the history, instead of individual lines by @Mytherin in #5414
Shell: add support for history search by @Mytherin in #5415
Parallelise scanning result of ORDER_BY by @lnkuiper in #5403
Add translate function by @zhouliqi in #5212
Enable cmake to recognize AppleClang by @changhiskhan in #5432
Support enum_code() function by @lokax in #5408
Fix binder error and produce more informative error message. by @Tmonster in #5302
Parquet Reader: Re-use (de)compression and dictionary buffers and allocate powers of two by @Mytherin in #5445
Support RLE, DELTA_BYTE_ARRAY and DELTA_LENGTH_BYTE_ARRAY Parquet encodings by @Mytherin in #5457
print profiling output for deserialized logical query plans by @ila in #5448
Issue #5277: Sorted Aggregate Sorting by @hawkfish in #5456
Add internal flag to duckdb_functions, and correctly set internal flag for internal functions by @Mytherin in #5462
Add experimental R String passthrough support by @hannes in #5479
Issue #5258: Quantile Negative Fractions by @hawkfish in #5463
Arrow stream ingestion for JDBC client by @hannes in #5449
PER_THREAD_OUTPUT flag for COPY by @hannes in #5412
Feature: skip broken tests for now by @Mytherin in #5532
Add Union All support to R extention by @Tmonster in #5484
[Python] Add from_parquet features by @papparapa in #5492
Add ExtractStatements to C API by @LindsayWray in #5524
Improve http retry by @samansmink in #5549
Issue #5277: Sorted Aggregate Window by @hawkfish in #5571
Issue #5422: QUANTILE_DESC Decimals by @hawkfish in #5572
Issue #5559: 2022g Time Zones by @hawkfish in #5570
[Dev] Clean up of the python pkg folder structure by @Tishj in #5436
httpfs: check environment vars for AWS Credentials by @satotake in #5419
Misc union-type improvements by @Maxxen in #5617
Fix so Left inner join doesn't re-optimize nodes by @Tmonster in #5620
[Substrait] C API + from_substrait_json + bump on substrait version. by @pdet in #5613
Allow strings in ColumnDataCollection to be written to disk by @lnkuiper in #5543
[PythonDEV] Let clean.sh be run from anywhere, not just tools/pythonpkg by @Tishj in #5625
Reorganize Join order optimizer code by @Tmonster in #5621
[Catalog] Grab missing write_locks in a couple places by @Tishj in #5601
Parquet info to Substrait by @pdet in #5627
HTTP parquet optimizations by @samansmink in #5405
Adding delta compression to Bitpacking compression by @samansmink in #5491
[Python] Changed use of DuckDBPyConnection to shared_ptr by @Tishj in #5635
Merge feature branch into master by @Mytherin in #5645
[Python] Display progress bar by default in an interactive environment by @Tishj in #5596
Add support for RESET statement on configuration options by @Tishj in #5603
httpfs: Encode url path on request by @satotake in #5587
Fix broken CI because of RESET statement by @Tishj in #5671
Don't automatically set the bug label on issues by @Mytherin in #5680
Add support for CREATE VIEW IF NOT EXISTS by @Mytherin in #5682
Issue #5622: Validate Timezone Characters by @hawkfish in #5658
Issue 5630 fix. by @Tmonster in #5644
Adding COLUMN_TYPES option for read_csv_auto by @pdet in #5552
[Python] Get rid of DuckDBPyResult (merged functionality into DuckDBPyRelation) by @Tishj in #5597
feat: port nodejs tests to typescript by @Mause in #5632
Improve nodejs README by @Tishj in #5688
[Python] Add (partial) support for numpy.datetime64 objects by @Tishj in #5659
retry on all httplib errors by @samansmink in #5684
Return false if file doesn't exist by @Y-- in #5701
Adding context option to not run replacement scans and exporting namespace of json substrait function - R by @pdet in #5689
Issue #5609: Scope CTE Windows by @hawkfish in #5690
Attempt to fix random NodeJS CI failure by @Tishj in #5710
[Python] duckdb.execute() == duckdb.default_connection.execute() by @Tishj in #5650
NodeJS: switch to using package_build, and add support to BUILD_NODE to Makefile by @Mytherin in #5691
JDBC SNAPSHOT Jars by @hannes in #5687
Fix NodeJS 19 CI for Windows by @Tishj in #5719
Fix issue 5664 by @lokax in #5667
Issue #5712: CURRENT_TIMESTAMP and CURRENT_TIME by @hawkfish in #5713
[CSVReader] Catch a user error in supplying 'columns' option by @Tishj in #5721
Improve suggestions when LOAD of an extension fails by @Mytherin in #5722
doc(nodejs): amend arrow stream type docs by @Mause in #5731
Fix for TSV throwing during sniffing by @pdet in #5555
Statically link extensions on Linux with Clang by @jkub in #5653
[Python] Add support for named parameters by @Tishj in #5611
fix: nodejs source releases should be standalone by @Mause in #5734
build: don't install python from chocolatey by @Mause in #5740
fix: use non-string-splitting variable interpolation in binding.gyp.in by @Mause in #5745
Equalizing DBConfig constructors by @nicku33 in #5747
We should not treat replacement open paths as disk paths by @nicku33 in #5748
Allow table in-out functions to be used in correlated subqueries and as LATERAL queries by @Mytherin in https://github.com/duckdb/duckdb...

@Mytherin

This is a bug fix release for various issues discovered after we released 0.6.0. There are no new features, just bug fixes.

What's Changed

Correctly accept BUILD_JEMALLOC_EXTENSION on Linux by @Mytherin in #5343
[julia] fix docstring of load! and relax type restriction by @jfb-h in #5354
Bump DuckDB_jll compat to v0.6 by @jeremiahpslewis in #5356
Issue #5342: DATE_PART Struct Indexing by @hawkfish in #5382
Add reference to cleanup function for duckdb_result_get_chunk by @ak-coram in #5389
Fix #5390: in filter pull-up optimizer avoid adding columns to one side of a set operation by @Mytherin in #5400
Fix #5371: correctly use instance cache in JDBC and ODBC connector by @Mytherin in #5398
Add support for reading JSON type columns from Parquet files by @Mytherin in #5401
[Dev] Fix compilation issues related to MSVC and Windows.h by @Tishj in #5386
fix: upgrade npm's internal node-gyp by @Mause in #5402
[Appender] Appender can now properly append to DECIMAL columns by @Tishj in #5364
Fix bug causing loss of order preservation in insert by @lnkuiper in #5427
Allocator: throw std::bad_alloc if a malloc allocation fails by @Mytherin in #5439
Fix the use of COLUMNS(...) in ORDER BY clause by @lokax in #5444
Adding lazy relation -> data.frame conversion for R client by @hannes in #5181
Fix #5450, don't crash on integer dates in R by @hannes in #5451
Issue #5366: QUANTILE_DISC Intervals by @hawkfish in #5442
Remove the f off by @hatvik in #5475
Fix many fuzzer issues by @Mytherin in #5482
Allow column references in constant table functions by @Mytherin in #5483
Node register arrow ipc buffer fix by @samansmink in #5433
Add initializer for queue_insertions by @hannes in #5504
Disabling per-value materialization of r altrep strings in results by @hannes in #5454
Correctly set delim_offset in flatten dependent join and disable linux arrow test by @Mytherin in #5509
update arrow extension by @samansmink in #5506
[Python] Correct stub for DuckDBPyConnection::df by @Tishj in #5385
Add deserialization to custom operators by @rjatwal in #5496
[Python] No longer truncate ByteArray values by nullbytes by @Tishj in #5517
Add in the pg_database, pg_proc, and pg_settings views to pg_catalog by @jwills in #5526
Fix various BufferManager issues by @lnkuiper in #5476
Add feature request link by @Mause in #5324
[Python] Fix relation.query() not accepting non-select statements by @Tishj in #5531
fix issue #5488 by @samansmink in #5519
[Python] Adding back Query interrupt support (through Ctrl+C) by @Tishj in #5487
Adding dummy user/username/password settings by @hannes in #5530
Add memory leak tests, and fix memory leaks related to repeated table creation/destruction by @Mytherin in #5537
DuckBox renderer fixes by @Mytherin in #5539
Fix #5533: correctly use timestamp logical type unit in Parquet stats reader by @Mytherin in #5540
Disable the extended code coverage tests for now by @Mytherin in #5542
NLJoin is not always terrible by @pdet in #5538
naming mismatch for linux arm extension upload by @samansmink in #5556
Deprecate 'sprintf' usage using MacOSX SDK 13 by @darrenfu in #5545
Fix #5546: allow foldable scalar expressions in standard table functions by @Mytherin in #5550
Upgrade sqlite scanner hash by @Mytherin in #5551
[Python] Fixed bug where creating a cursor from a closed connection caused a segfault by @Tishj in #5565
Fsst pull bugfix from upstream by @samansmink in #5567
Parquet: Not setting num_children for primitive types as per spec by @hannes in #5579
[Python] Fix accidental dependency on pandas by @Tishj in #5581
Throw error when sorting or using indexes on big endian architecture by @Mytherin in #5588
fix: separate artifacts for 32bit and 64bit builds by @Mause in #5592
Bug fix for 5523 by @taniabogatsch in #5554
Disabling truncating of temporary buffer manager files on Windows by @hannes in #5600
Removed FSST unused global that triggered compiler warning by @hannes in #5602
Copy JDBC Properties to not lose readonly setting by @hannes in #5594

Full Changelog: v0.6.0...v0.6.1

@Mytherin

This preview release of DuckDB is named "Oxyura" after the White-headed duck (Oxyura leucocephala) which is an endangered species native to Eurasia.

This time, @Mytherin has written a blog post explaining the quite long and exciting list of new features in this release.

Binary builds are listed at the bottom of this post. Please note that it can take a couple of hours until binary builds for all platforms and environments are available.

Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.
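The migration described in the note above can be sketched as follows. This is a minimal illustration, not an official tool: the helper names `sql_string` and `migration_statements` and the directory name `my_export` are hypothetical. Only the `EXPORT DATABASE` and `IMPORT DATABASE` commands come from the note itself. Because the two statements have to run under different DuckDB versions, the sketch only builds the SQL text and leaves execution to whichever client you use.

```python
from typing import Tuple


def sql_string(value: str) -> str:
    """Quote a value as a SQL string literal, doubling embedded single quotes."""
    return "'" + value.replace("'", "''") + "'"


def migration_statements(export_dir: str) -> Tuple[str, str]:
    """Return (statement for the OLD version, statement for the NEW version).

    export_dir is a hypothetical directory that the old version writes to
    and the new version reads from.
    """
    directory = sql_string(export_dir)
    return (
        f"EXPORT DATABASE {directory};",  # run with the old DuckDB version
        f"IMPORT DATABASE {directory};",  # run with the new DuckDB version
    )


old_sql, new_sql = migration_statements("my_export")
print("Run with the old version:", old_sql)
print("Run with the new version:", new_sql)
```

Run the first statement against the existing database file with the old version, then run the second against a fresh database file with the new version.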

Featured Changes

Optimistically write data to disk when batch loading data into the system by @Mytherin in #4996
Parallel non-order preserving CREATE TABLE AS and INSERT INTO by @Mytherin in #5033
Parallel order preserving CREATE TABLE AS and INSERT INTO by @Mytherin in #5082
FSST compression by @samansmink in #4366
CHIMP128 Compression by @Tishj in #4878
Patas Compression (float/double) (variation on Chimp) by @Tishj in #5044
Parallel CSV Reader by @pdet in #5194
Parallelize CREATE INDEX of ART by @taniabogatsch in #4655
Improve memory management of ART indexes by @Mytherin in #5292
DISTINCT aggregates with GROUP BY are now executed in parallel by @Tishj in #5146
Nested "UNION"-type by @Maxxen in #4966
Allow for queries to start with FROM, instead of with SELECT by @Mytherin in #5076
Support for the COLUMNS expression, which allows expanding computations on multiple columns by @Mytherin in #5120
Python-style list-comprehension syntax by @Mytherin in #4926
Improvements to Out-of-Core Hash Join by @lnkuiper in #4970
jemalloc "extension" for Linux by @lnkuiper in #4971
Improve rendering of result sets for the shell by @Mytherin in #5140
Add auto-complete support to the shell by @Mytherin in #4921
Nicer looking progress bar by @Mytherin in #5187

All Changes

Fix #4747: Handle pandas num categories between 128 and 256 by @pankajp in #4757
Julia 0.5.1 by @Mytherin in #4758
Fix #3595: avoid using system hash for floating point values by @Mytherin in #4761
Fix #4704. Correct the column name for pragma_storage_info with generated column by @zippond in #4750
Allow to load extensions through compiler variable definitions by @pdet in #4767
Fix some typo in code comments by @buaazhwb in #4769
Enhance duckdb_constraints() by @krlmlr in #4346
Issue #4764: Window Ignore Nulls by @hawkfish in #4773
[Python (Relational)] Query now returns a DuckDBPyRelation by @Tishj in #4471
R types expansion by @hannes in #4778
Add json_contains by @lnkuiper in #4686
Fix #4152: create base table reference in returning clause so generated columns are correctly resolved by @Mytherin in #4783
Fix Exists and ANY correlated subqueries by @lokax in #4752
Fix for ORDER BY on large dictionary vectors: correctly pass offset into get_index of selection vector by @Mytherin in #4787
Missing json_contains in extension list by @Mytherin in #4788
Extensible Casts & Cast Function Rework by @Mytherin in #4785
Bump sqlite scanner by @hannes in #4789
Improve sorting for strings and push projections into sort operator by @lnkuiper in #4697
Parquet: Refactor decompression, including more complete datapage v2 support by @wisp3rwind in #4628
Parallelize CREATE INDEX of ART by @taniabogatsch in #4655
Unify LocalStorage and DataTable Storage by @Mytherin in #4798
feat: support passing all db config to jdbc driver by @Mause in #4794
Fix #4806: correctly use offset index in pragma_table_info on view by @Mytherin in #4807
Map VARCHAR, JSON, ENUM to Julia String by @nickrobinson251 in #4810
fix: support SHOW query types in jdbc client by @Mause in #4799
Replacement Open Hooks by @hannes in #4721
Build multiple out of tree extensions in one pass by @Mytherin in #4828
fix(jdbc): release results before releasing statements by @Mause in #4831
Fix for #4827 by @PedroTadim in #4829
Multiblock2 by @jkub in #4555
Disconnect after test by @krlmlr in #4835
Check prefix length, not string_t::INLINE_LENGTH when comparing strings while sorting by @lnkuiper in #4816
Adding a CI workflow to re-build individual out-of-tree extensions by @hannes in #4833
fix: json getColumnType error by @Mause in #4847
Attempt two at rebuilding old extensions by @hannes in #4848
Updating postgres scanner by @hannes in #4832
Extension Rebuild Attempt 3 by @hannes in #4849
Adding overwrite flag to R duckdb_register by @hannes in #4850
Move LocalStorage row groups directly to DataTable instead of re-appending by @Mytherin in #4851
fix for macos CI by @samansmink in #4854
Fully qualified s3url by @LindsayWray in #4786
FSST compression by @samansmink in #4366
Julia: add support for handling errors in replacement scans by @Mytherin in #4865
Extension build: turn IGNORE_WARNINGS into generic OPTIONS field, and add --main-only field by @Mytherin in #4866
Issue #4867: Approximate Quantile Hugeint by @hawkfish in #4868
Install OpenSSH on ubuntu 16 by @Mytherin in #4877
Join order regression test: add 20% threshold to cardinalities before we care about regressions by @Mytherin in #4880
Move LocalStorage row groups directly to DataTable if there are enough rows being appended by @Mytherin in #4876
Allow referencing of aliases in SELECT clause and TPC-DS extension clean-up by @Mytherin in #4879
Add github to known hosts by @Mytherin in #4884
Adding a serialized version of all TPCH queries and test we can read them by @bleskes in #4605
Add support for custom bind functions to RegisterCastFunction, and propagate client context to the bind function by @Mytherin in #4885
CSV reader: quoted NULL values should be kept as non-NULL by @Mytherin in #4888
fix: add numpy to setup_requires to fix build from source by @Mause in #4893
fix openFlags overwriting in shell fixing #4894 by @kouta-kun in #4895
Remove filter columns from table scans if they are unused in the remainder of the plan by @lnkuiper in #4817
feat: add duckdb_library_version method and fix extension load state by @Mause in #4881
uuid.cpp: GenerateRandomUUID: fix indexing by @nodakai in #4892
Update serialized plans by @Mytherin in #4900
Add CPython 3.11 to build matrix by @edgarrmondragon in #4906
Support UNION_BY_NAME option in read_csv_auto by @douenergy in #4837
support for virtualizing storage layer by @jkub in #4858
Reduce data set size of IE join test by @Mytherin in #4905
Making sure parquet column readers return the expected amount of rows by @ha...

@lokax

This is a bug fix release for various issues discovered after we released 0.5.0. There are no new features, just bug fixes. The following PRs were included in this release:

[Fuzzer] Issue #4152 - Lag window function issue by @lokax in #4603
Fix zonemap check for VARCHAR by @lokax in #4613
Remove the DLLEXPORT from deleted API methods by @emmenlau in #4611
Fix update statement on generated column by @lokax in #4616
[Fuzzer] Issue #4152 - Limit 0% on ANY subquery by @lokax in #4544
[Fuzzer] Issue #4610 - Vacuum table with generated column by @lokax in #4622
[Fuzzer] Decimal scale+width overflows too quickly by @Tishj in #4627
[Fuzzer] issue #4566 by @Tishj in #4592
Issue #4635: DATE_DIFF Week Boundaries by @hawkfish in #4648
Fix issue #4630 by @lnkuiper in #4642
[Python] Fix unwanted conversion from NaN -> NULL in param list by @Tishj in #4624
Fix home directory setter by @attilahorvath in #4617
fix(jdbc): correct mapping for TIMESTAMP_WITH_TIME_ZONE by @Mause in #4654
Fix bug changing input order on array_sort column by @taniabogatsch in #4643
Fix issue #4625 by @lnkuiper in #4653
[Extensions] Suggesting which extension to Load/Install by @pdet in #4634
Fixes issue #4123 by @Tishj in #4523
Updating jdbc deploy script by @hannes in #4663
Consistent struct definitions by @hannes in #4667
Fix #4666 by @taofengliu in #4670
Fix for #3417 by @PedroTadim in #4664
feat: improve python replacement scan error by @Mause in #4672
[C-API] Data chunk invalid left-shift by @Tishj in #4660
fix: correct mislabelling of amd64 libs in jars by @Mause in #4691
Fix #4647 by @taofengliu in #4698
Throw error if attempting to delete from table without physical columns by @Tishj in #4693
Fix #4475: allow ignore_errors in read_csv and read_csv_auto by @Mytherin in #4713
Fix #4442: correctly handle TIMESTAMP logicalType in Parquet files by @Mytherin in #4714
Fix #4699: when no file is found globbing, fallback to using the literal string name as a path by @Mytherin in #4716
Fuzzer fixes batch 1 by @Mytherin in #4707
Fix #4677. Correctly set_not_null when table contains generated column by @zippond in #4706
Fix #4703 by @taofengliu in #4715
Fixing Extension naming CI Checker by @pdet in #4717
[Python(pandas)] Scan multiple chunks worth of values from a 'object' dtype DataFrame by @Tishj in #4692
Fix #4694: Keep shared pointer to pipelines around in additionally scheduled events by @Mytherin in #4724
Fuzzer Batch Fixes 2 by @Mytherin in #4722
Fix #4702. Correctly use index when generated column is involved by @zippond in #4727
Fix for #4583 by @PedroTadim in #4728
Fuzzer fix batch 3 by @Mytherin in #4726
Fix #4562: generate table index for dummy scan generated from VALUES clause by @Mytherin in #4731
[Arrow] Guarantee threads don't call get_next after stream is done. by @pdet in #4712
Correctly catch and report exceptions thrown during a pipeline's scheduling by @Mytherin in #4733
Fix for issue #4708 by @PedroTadim in #4711
Fix #4568: correctly handle casts in deliminator by @Mytherin in #4734
No longer disable vptr sanitizer on M1 macs by @Mytherin in #4735
Use version tag as dir for extensions for releases by @samansmink in #4729
Correctly call ::Skip function of child of structs by @Mytherin in #4736
[Map] Map extract now properly uses the selection vectors of the map and key vectors by @Tishj in #4725
Fix #4356 by @taofengliu in #4740
Fuzzer Batch 4 by @Mytherin in #4737
feat: bump Julia package version by @Mause in #4742
Julia API: Add load! to add a DataFrame as a table by @jfb-h in #4743
aarch64 extensions by @samansmink in #4745
Faster hive part filters by @samansmink in #4746
[Python] DECIMAL with value 0.00... issue fix by @Tishj in #4690
enable out-of-tree extensions for aarch64 by @samansmink in #4751

Full Changelog: v0.5.0...v0.5.1

This preview release of DuckDB is named "Pulchellus" after the Green pygmy goose (Nettapus pulchellus) which is native to Australia where VLDB 2022 is starting today. Despite being called a "goose" it is actually a duck.

Binary builds are listed at the bottom of this post. Feedback is very welcome.

Note: Again, this release introduces a backwards-incompatible change to the on-disk storage format. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.

Below is a list of changes in this release:

Major Changes & Features

#4189: Implement Out-of-Core Hash Join and Re-Work Query Verification
#4022: Art Index Storage
#4274: Join Order Optimizer improvements
#4420: Logical Plan Serialization
#4137, #4347, #4293, #4190, #4178, #4177, #3954 & #4159: Scalability and performance improvements for Window operator
#4004: Add support for extensions to the parser, and add an example of this to the loadable extension demo
#4089: Signed Extensions
#4097 & #4211: Filename column + Hive partitioning support for Parquet Reader
#4501, #4511: Aarch64 Linux builds of CLI, shared library, JDBC & ODBC

Minor Changes & Bug Fixes

#4594: [Map] Fix map_extract from multiple rows
#4585: Fix for r test instability, #4549
#4560: Support all basic integer types in node API
#4558: [CPP-API] Comment no longer causes crash
#4552: [Fuzzer] Issue #4152 - Remove ToString roundtrip in query verification
#4543: Fixing silent assertions
#4542: Check if database is still alive when trying to connect for nodejs
#4541: fix for issue 4533
#4539: Parallelization non-dependent on Arrow rows
#4524: Explicitly deleting default connection on js side
#4522: Correct architecture name for Linux aarch64
#4521: Adding correct substrait release tag to out-of-tree extension deployment
#4520: Added test cases for several fixed JDBC issues
#4516: Fix #4455, don't set default schema in transform
#4513: Issue 4502
#4510: [Casting] Varchar -> Decimal cast fix
#4507: [CSV] Fixed bug related to invalidated iterators
#4505: extension trigger event
#4504: fix: short-circuit hash and version discovery
#4496: [Fuzzer] Issue #4152 - Force no cross-product issue
#4495: Build ODBC driver binary for OSX
#4494: [Fuzzer] Issue #4152 - Analyze nonexistent column
#4493: Declare all variables for nodejs.
#4491: Issue #4419: Range Join Swizzling
#4488: Making the parquet extension loadable
#4484: fix: ignore status message from output of mypy stubs check
#4483: [Development bug] unittest result_helper.cpp triggers assertion
#4480: Remove REST server
#4479: Remove assertion
#4477: Removing Substrait From DuckDB Repo
#4474: WIP #4152
#4472: [Python] Removed mutable default parameters
#4470: Fix hidden merge conflict with fetchmany
#4465: [Python] fetchmany implemented
#4458: Issue #4454: VARCHAR/DATE Reversibility
#4448: Issue #3954: Pinned Heap Blocks
#4440: Added support for HUGEINT input type to BIT_COUNT scalar function
#4434: Python: Add PyRelation.fetchnumpy()
#4429: Allow indicating a format version that should be used to write/read from (De)serializer and use it for plans
#4427: Python: Improve docstrings for DuckDBPyRelation and DuckDBPyResult
#4418: Fix typo
#4416: Fix several update issues
#4413: Correctly schedule mix of union/child pipelines (again)
#4409: Increase timeout for coverage checks
#4405: Hybrid ART Leaf Part I
#4404: Add support for TS_MS, TS_NS, and TS_S
#4400: Issue #4388: DATE_TRUNC Low Precision
#4398: fix: correct object return types for arrow functions
#4395: Fix name of environment variable
#4390: Support UNION BY NAME set operation
#4383: Missing LISTs are NULL
#4382: Include PID in test directory name
#4380: R: Avoid translate_duckdb() in tests
#4377: R: Full BLOB support
#4372: Fix #4370: correctly handle non-flat vectors in list_sort
#4371: [Python] Changed all RuntimeErrors thrown in the Python client
#4368: Fixes issue #4365 - Not null constraint is no longer duplicated
#4364: Allow extra parameters in list_aggr to be passed in, as long as they are constant and only used during the bind
#4363: Fix for array_position with NaNs: use Equals::Operation instead of regular equality
#4362: Allow table functions to set cardinality stats through the C API - and utilize this in Julia DataFrame scans
#4359: Mark slow tests
#4355: Fix typo in exception text
#4354: R: Use preinstalled symbol
#4353: Shell: Add missing newline in help output
#4352: Tweak contributing guide [ci skip]
#4345: [Substrait] Pushing-down projections and filters to read relation
#4340: Correctly schedule pipeline dependencies when scheduling mix of UNION and FULL OUTER JOINs
#4336: feat: add basic json support to jdbc client
#4334: Bring ibis/substrait tests to a sane state
#4332: Fix Julia parallelism interleaving with the garbage collector, and expose Pending Query Result in C interface
#4328: Allow specifying a custom home directory using the SET home_directory option
#4327: [Aggregate] DISTINCT aggregates without GROUP BY are now executed in parallel
#4324: Fix #4309: fix for multiple foreign key constraints on the same table-table pair
#4323: Optimizer profiling
#4322: Print NOT operator correctly
#4319: feat: add missing node versions to CI
#4317: refactor: remove dead code in python client
#4316: R: Add rlang as suggested dependency
#4315: Column Data Collection, Arrow Result conversion rework, Cross Product performance fixes & more
#4312: R: Install tidy CLI tool
#4310: R: Add test for test_all_types()
#4304: Improve numeric hash function to a better but slightly slower hash function
#4301: Add unit of measurement in timer function
#4300: Support root type on expressions #4278
#4298: Feature/nodejs client docs
#4297: fix: remove nodejs test focus
#4296: Avoid infinite loop in range(NULL)
#4294: #4276 Serializing data types on table schema in substrait
#4289: [Python/Pandas] fix +/- inf wrongly converting to NaN (NULL)
#4288: Fix fuzzer issue w.r.t. NULL values in generate_series
#4286: [Python - Relation] CreateView on a filtered relation does not cause infinite loop anymore
#4285: chore: remove cython constraint now that bug is fixed
#4284: Pandas timezone
#4283: Return errors from RecordBatchReader
#4280: R: Remove nycflights13 dependency
#4279: R: Don't export duckdb_explain()
#4277: feat: update setup.py links
#4272: Allow 0 as a seed parameter
#4266: R: Only quote non-syntactic and reserved words
#4265: Specialize LIST aggregate function implementation
#4263: R: Avoid attaching package during tests
#4259: Add ANY_VALUE agg function
#4256: Schedule child pipeline correctly
#4255: Disable ibis substrait tests for now
#4250: C API: Report appender error in case conversion fails
#4240: DELIM_JOIN now propagate statistics correctly
#4237: fix: pin cython to work around bug
#4236: Integer types now correctly increase width of DECIMAL type.
#4235: Parquet writer: Write dictionary_page_offset, and distinct_count for dictionary encoded strings/enum
#4234: Implement json_merge_patch and jsonlines output mode
#4233: feat: fix pandas types in docstrings/python types
#4230: Handle nulls in structs and lists
#4225: Add Jaro Winkler
#4215: Use right template for smallint
#4213: feat: update instructions for installing master builds in bug report template
#4212: Improve error message
#4210: PARQUET: Move StringColumnWriter dictionary to use string_t to avoid allocations
#4209: Remove unused PhysicalTypes
#4207: Disable GC during Julia execution to avoid internal GC deadlock in DataFrame scan
#4206: Fix #4202: in the comparison simplification optimizer, we can only shift the cast to the constant if both casts are invertible
#4199: feat: Use pip to install and uninstall python client
#4198: [capi] impl clear bindings for prepared stmt
#4197: feat: port bug_report.md to bug_report.yml
#4196: Fix RTTI issue across extension boundaries on OSX
#4192: Correctly call SetFilePointerEx on Windows so the truncate works as expected
#4191: Fix Expanded CI test case by adding swap space to test
#4188: ALTER SEQUENCE IF EXISTS fix
#4187: [Storage] FOR compression
#4185: ISSUE #3248 Support for ALTER TABLE altering columns NOT NULL
#4183: Julia multi-threading fix: avoid using a time-out to cancel threads in case there are no tasks
#4179: node: add async-iterator-based streaming
#4175: [CI] Python Build with Sanitizer
#4172: Update stubs test
#4168: Issue #4161: Create WindowExecutor
#4167: node: report memory usage to the node GC
#4166: Fix #4165: correctly fill in false_sel when performing comparison with constant null value
#4160: node: don't crash on syntax errors
#4154: Making date_trunc statistics handling consistent with date_part
#4153: Support for int64 round trips in R driver using the bit64 package
#4151: Fix orrify merge conflict
#4143: Correctly handle query parameters in JDBC
#4140: CI Fixes
#4139: Remove redundant code
#4138: Support struct.* to retrieve all struct fields in SELECT list
#4134: Fuzzer Fixes
#4133: Remove DUCKDB_API for deletes. (For Windows/ZIG)
#4132: [Python] project now correctly inherits owning references to PyObjects
#4131: Missing error messages
#4125: Fix Orrify rename merge confl...

This preview release of DuckDB is named "Ferruginea" after the Andean Duck.

Binary builds are listed below. Feedback is very welcome.

Note: This release should be backwards-compatible wrt the on-disk storage format, but the next release may very well be incompatible again. So please don't rely on this just yet. We suggest you use the EXPORT DATABASE command with the old version followed by IMPORT DATABASE with the new version to migrate your data. See the documentation for details.

Also note: DuckDB is switching to semantic versioning. Version numbers look like this: MAJOR.MINOR.PATCH with changes to

MAJOR version when you make incompatible API changes,
MINOR version when you add functionality in a backwards compatible manner, and
PATCH version when you make backwards compatible bug fixes.

However, note that because MAJOR is currently 0, "Major version zero (0.y.z) is for initial development. Anything MAY change at any time. The public API SHOULD NOT be considered stable."
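The versioning rules above can be made concrete with a small sketch. It is an illustration only, under the assumption that versions are plain MAJOR.MINOR.PATCH strings with an optional leading "v" as in the tag names; the names `Version`, `parse_version`, `change_kind` and `may_break` are hypothetical and not part of DuckDB.

```python
from typing import NamedTuple


class Version(NamedTuple):
    major: int
    minor: int
    patch: int


def parse_version(text: str) -> Version:
    """Parse 'MAJOR.MINOR.PATCH', tolerating a leading 'v' as in tag names."""
    text = text.strip()
    if text.startswith("v"):
        text = text[1:]
    parts = text.split(".")
    if len(parts) != 3 or not all(p.isascii() and p.isdigit() for p in parts):
        raise ValueError(f"not a MAJOR.MINOR.PATCH version: {text!r}")
    return Version(*(int(p) for p in parts))


def change_kind(old: str, new: str) -> str:
    """Name the most significant component that changed between two versions."""
    a, b = parse_version(old), parse_version(new)
    if b <= a:
        raise ValueError("new version must be greater than old version")
    if b.major != a.major:
        return "major"
    if b.minor != a.minor:
        return "minor"
    return "patch"


def may_break(old: str, new: str) -> bool:
    """True if the upgrade may be incompatible under the rules quoted above.

    While MAJOR is 0, anything may change at any time, so every upgrade counts.
    """
    a, b = parse_version(old), parse_version(new)
    return b.major == 0 or b.major != a.major


print(change_kind("0.5.1", "0.6.0"))
print(may_break("0.6.0", "0.6.1"))
```

Under these rules, every release listed on this page falls in the 0.y.z range, so `may_break` returns True for any pair of them.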

Below is a list of changes in this release:

Major Changes & Features

#3767: Table function rework, parallel Julia DF scans & Python regression tests
#3749 & #3747: Query cancellation with CTRL-C for R and Python clients
#3700: Support Parallel Order-Preserving Result Set Materialization
#3696: Support WINDOW FILTER
#3620: HTTP read optimization
#3668: Adding alias type
#3435: Add support for reading newline-delimited JSON
#3783: Extension loading by statically linking DuckDB

Minor Changes & Bug Fixes

#3905: Fix SQLancer CI
#3904: Fix #3896: correctly compute GroupRowsAvailable in struct reader in case a child-entry is not just a list, but a struct with only list entries
#3902: Fuzzer: fix sanitization of address sanitizer error
#3901: R: Extract DetectLogicalType() function
#3899: R: Check query return type instead of query type in dbFetch()
#3898: Issue #3880: Rebind DATE_TRUNC dates
#3894: Purge concurrent queue when enqueueing entries to prevent entries from piling up
#3892: Fix for issue #3878
#3889: Fix TreeRenderer crash on invalid UTF8
#3888: Julia Table Functions: add stack trace to errors reported
#3887: Correctly reset interrupted flag so verification does not overwrite original error
#3886: Remove the check_tread from python connection
#3879: Avoid title is too long error in fuzzer issue submission
#3877: Fix use-after-free in create view with prepared statement parameter
#3872: Glob with search paths
#3871: [Python] Making new connections to cursors and adding lock on queries over same connection
#3869: Several OSSFuzz fixes
#3865: Fix #3860: add support for creating foreign keys on temporary tables, and for now disable support for cross-schema foreign keys
#3863: Out-of-tree Extensions for Windows
#3862: Rework of Struct <> Dictionary Vectors, and add test_vector_types function
#3852: Added support for generated columns to TableCatalogEntry->ToSQL()
#3850: Enable EXTENSION_STATIC_BUILD for Mac too
#3849: [Python] Unbundle Substrait
#3848: Parquet: fix for fixed length byte arrays in dictionary column reader
#3847: Expand oss-fuzz tests to run queries and check for internal errors
#3846: Pass through read only flag for node connector
#3845: Add queries over Arrow to Python regression tests, and time entirety of TPC-H
#3843: [JDBC] Pass through scale and precision for decimal types from DuckDBColumnTypeMetaData
#3842: Allow to use custom memory allocator through DuckDB API on Windows
#3837: Fix overflow in generate_series and overflow in abs operator
#3832: Issue #3816: Parquet Time Zones
#3831: s3fs decode keys correctly
#3828: Update testthat snapshots
#3818: Add SQLancer to CI Fuzzing Framework
#3815: Out-of-tree Extension Builds
#3812: Fix several issues found by Valgrind
#3810: DuckDB.jl Julia Package History
#3809: Add shell: bash everywhere
#3802: fix ci breaking from extension PR
#3799: Optimisation rule for regexp_matches with literal pattern
#3798: Substrait: Adding more compatibility with Substrait and Ibis
#3792: Issue #3790: Temporal IsFinite/IsInf
#3791: Issue #3721: Rightshift Negative Hugeint
#3786: Fix binding of fully qualified view reference
#3785: Python: Allowing cursor to set check threads flag
#3784: Improve speed of ALTER TABLE ADD COLUMN
#3778: More node types
#3777: Python: Updating Stubs and Bringing Stubs tests back
#3776: Simplify clangd target
#3775: Expose dbgen speed_seed functions on header file and add missing ones
#3771: Increment R package version
#3765: Issue #3759: Node Time Zone
#3764: Issue #3763: List Min/Max Problems
#3761: Fix .import not creating missing table in CLI
#3760: Requiring keys provided to map to be unique
#3757: Fix #3756: fix issue when running blockwise NL join on dictionary vectors of structs
#3752: Fixed error handling for node exec()
#3751: Decreasing the overallocation for list aggregates
#3750: Fix a bug in HyperLogLog
#3746: Check if replacement scans don't leak memory
#3745: Arrow/Pandas Case Insensitive Columns
#3744: Treating ENUM Case in pyresult describe
#3739: DuckDBPyRelation: support offset argument for limit()
#3738: Fix #3730: avoid modifying the payload in-place in aggregate hash table, because it might be used multiple times in case of grouping sets
#3736: JDBC better error handling
#3733: Progress bar clean-up: fix thread sanitizer issue, and move progress bar code to individual operators
#3720: Issue #3515: Add statistical rounding
#3707: Fix #3702: avoid assertion that we are not storing internal entries in the file
#3706: Implement sqlite3_file_control and sqlite3_sleep
#3705: Add support for ENUM converted types in the Parquet reader
#3699: Zero-copy scans for non-list uncompressed segments
#3695: Only rename pandas columns that have duplicates
#3692: Compatibility with dev dbplyr
#3691: Fix #3690: correctly assign catalog set to default objects to avoid crash when used as dependency
#3681: R: Fail CI/CD on NOTEs, check examples on UBSAN, log valgrind output
#3677: Fuzzer fix: avoid reporting non-internal errors
#3676: More ccache removal from OSX Extension Release
#3675: More extensive SQLLogicTest testing, and temporarily disable OR pushdown
#3667: Handling dataframes with repeated names in columns outside the bind. Now when registering df for scan.
#3665: Delete correct revision in pypi cleanup script
#3664: try/except in pypi cleanup
#3663: Return PY registered objects from temporary views
#3662: Remove CCache from the OSX Extensions Release build
#3661: Automatic PyPI cleanup in CI
#3653: Fixing enum comparison at where clause to TRY_CAST
#3652: Issue #3475: optimize CSG & CMP enumeration of join order optimizer
#3650: Issue #3610 mem leak
#3648: Julia DataFrame Scan Performance Improvements & TPC-H Tests
#3646: ODBC: adjustments because of ADO
#3643: Fix for #3639, don't use string copy and value api to fill factor vector
#3635: Avoid running approx quantile with vsize=2
#3634: Fix some issues with the fuzzer auto-closing issue behavior
#3633: Add default type generator, move built-in types to default type class and improve error reporting for types
#3632: Check for div by zero in distinct stats
#3630: Fix issue 3611
#3629: S3 Minio fix
#3628: Issue #3625: Adding canonical guards around Arrow CData Interface
#3624: Add interval to DBAPI description
#3615: Fix #1785: correctly copy constraints in ADD COLUMN of alter table
#3614: Correctly propagate what a statement returns from the binder
#3613: SQLSmith fuzzer fixes
#3612: SQLite UDF fixes for writefile and friends
#3609: Fix operator precedence of ** in the parser
#3608: Turn the expression depth limit into a configurable parameter
#3607: Implements enter and exit functions on pyconnection to allow the use of context managers
#3606: Use Python 3 for configuring R
#3604: Equal or null optimization
#3603: Fixing ascii bug in histogram strings
#3602: Support for Arrow Timezone
#3598: Add auto-commit off to JDBC Connection
#3594: Issue #3588: Half constant BETWEEN
#3592: Issue #3444: Approximate quantile lists
#3589: Issue #1187: Virtual Generated Columns
#3576: More compliant with substrait and upgrading version up to 0.1.2
#3575: Issue #3534: Remove TIMESTAMPTZ casts
#3574: Issue #3430: Temporal Infinity Values
#3571: Fixing JNI, matching function signature exactly
#3569: Implicit struct_pack
#3564: Fix for #3562
#3551: Issue #2309: Update benchmark info in README.
#3550: ICU Extension Rework: clangd for extensions
#3547: Issue #3273: support multi-statements for JDBC driver
#3546: Issue #2910: Support pandas boolean datatype
#3533: Exit with the correct exit code in the regression test runner
#3531: Correctly increment list offset on histogram aggregation
#3528: Julia Client - re-enable parallelism by executing tasks on dedicated Julia threads
#3524: Rework table-in-out function API, and move Unnest table function to table-in-out function
#3523: Improve HyperLogLog
#3519: Support in-place updates for unsigned integers
#3516: Issue #3497: Round DECIMAL casts
#3514: Issue #3453: Window Partition Collections
#3512: Issue #3418: Match Multiple Spaces
#3511: Fix #3505: Correctly handle Foreign Key syntax for when primary-key columns are not specified
#3507: Fix merge conflicts
#3504: ODBC: issue #3398
#3503: ODBC: issue #3478
#3502: Random-value ge...

Releases: duckdb/duckdb

0.9.0 Preview Release "Undulata"

What's Changed

Contributors

0.8.1 Bugfix Release

Changes

Contributors

0.8.0 Preview Release "Fulvigula"

What's Changed

Contributors

0.7.1 Bugfix Release

Changes

Contributors

0.7.0 Preview Release "Labradorius"

What's Changed

Contributors

0.6.1 Bugfix Release

What's Changed

Contributors

0.6.0 Preview Release "Oxyura"

Featured Changes

All Changes

Contributors

0.5.1 Bugfix Release

Contributors

0.5.0 Preview Release "Pulchellus"

Major Changes & Features

Minor Changes & Bug Fixes

0.4.0 Preview Release "Ferruginea"

Major Changes & Features

Minor Changes & Bug Fixes
