
Releases: google/langextract

v1.0.9

31 Aug 19:50

What's New

Features

  • Prompt alignment validation for few-shot examples (#215)
    • Validates that example extractions exist in their source text
    • Three modes: OFF, WARNING (default), ERROR
    • New parameters: prompt_validation_level and prompt_validation_strict
  • Vertex AI authentication support for Gemini provider (#60)
  • llama-cpp-python community provider added (#202)
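As a rough sketch, the new validation controls might be wired up as below. The parameter names and the three mode strings come from the notes above, but the exact call signature (and whether modes are strings or an enum) is an assumption, not verified against the v1.0.9 API:

```python
# Sketch of prompt alignment validation (v1.0.9). Mode names and
# parameter names are from the release notes; representing modes as
# plain strings is an assumption (the real API may use an enum).
VALIDATION_MODES = ("OFF", "WARNING", "ERROR")  # WARNING is the default

# Keyword arguments you would pass to lx.extract(...) to reject
# few-shot examples whose extractions cannot be found in their
# source text:
validation_kwargs = {
    "prompt_validation_level": "ERROR",
    "prompt_validation_strict": True,
}

# e.g. result = lx.extract(text_or_documents=..., examples=..., **validation_kwargs)
```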

Improvements

  • Changed the default to debug=False in extract() for cleaner output
  • Fixed router typings for provider plugins (#190)
  • Allow T-prefixed TypeVars in pylint (#194)

Full Changelog: v1.0.8...v1.0.9

v1.0.8

15 Aug 07:19

What's Changed

Features

  • Ollama timeout improvements (#154)
    • Increased default timeout from 30s to 120s
    • Made timeout configurable via ModelConfig
    • Fixed kwargs not being passed through
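A minimal sketch of a timeout override, assuming the ModelConfig/provider_kwargs pattern described in the v1.0.6 notes further down this page; the "timeout" key name is inferred from this release's description rather than verified:

```python
# Sketch of an Ollama timeout override (#154). The "timeout" key and
# the ModelConfig pattern are assumptions based on the release notes.
provider_kwargs = {
    "model_url": "http://localhost:11434",
    "timeout": 300,  # seconds; the default rose from 30 to 120 in this release
}

# e.g.:
# config = lx.factory.ModelConfig(model_id="gemma2:2b", provider_kwargs=provider_kwargs)
# result = lx.extract(..., config=config)
```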

Documentation

  • Improved visualization examples for Jupyter/Colab (#153)
  • Added a Romeo & Juliet Colab notebook

Full Changelog: v1.0.7...v1.0.8

v1.0.7

14 Aug 11:37

What's New

  • Debug logging support when debug=True in lx.extract() (#142)
  • GPT-5 model registration fixes (#143)
  • Improved documentation for provider plugins and schema support
  • Automated plugin generator script for external providers
  • Base URL support for OpenAI-compatible endpoints (#138)

See the full changelog for details.

v1.0.6 - Custom Model Provider Plugins & Schema System Refactor

13 Aug 10:22
bdcd416

Major Features

Custom Model Provider Plugin Support

  • New provider registry infrastructure for extending LangExtract with custom LLM providers
  • Plugin discovery via entry points allows third-party packages to register providers
  • Example implementation available at examples/custom_provider_plugin

Schema System Refactor

  • Refactored schema system to support provider-specific schema implementations
  • Providers can now define their own schema constraints and validation
  • Better separation of concerns between core schema logic and provider implementations

Enhancements

  • Ollama Provider: Added support for Hugging Face style model IDs (e.g., meta-llama/Llama-3.2-1B-Instruct)
  • Extract API: Added model and config parameters to extract() for more flexible model configuration
  • Examples: Updated Ollama quickstart to demonstrate ModelConfig pattern with JSON mode
  • Testing: Improved test infrastructure for provider registry and plugin system
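The new model/config parameters might be used as sketched below; the provider_kwargs keys and the factory location of ModelConfig are assumptions based on these notes, not a verified API:

```python
# Sketch of the config= pattern with JSON mode (v1.0.6). The
# "format": "json" key mirrors Ollama's JSON mode; key names are
# assumptions drawn from the release notes.
ollama_config = {
    "model_id": "meta-llama/Llama-3.2-1B-Instruct",  # HF-style ID, per the Ollama enhancement
    "provider_kwargs": {"format": "json"},
}

# e.g.:
# config = lx.factory.ModelConfig(**ollama_config)
# result = lx.extract(..., config=config)
```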

Bug Fixes

  • Fixed lazy loading for provider pattern registration
  • Fixed unicode escaping in example generation
  • Fixed test failures related to provider registry initialization

Installation

pip install langextract==1.0.6

Full Changelog: v1.0.5...v1.0.6

LangExtract v1.0.5

08 Aug 01:32

What's Changed

Bug Fixes

  • Fix chunking bug when newlines fall at chunk boundaries (#88) - Resolves issue where content was incorrectly chunked when newline characters appeared at chunk boundaries
  • Fix IPython import warnings and improve notebook detection (#86) - Eliminates import warnings in Jupyter notebooks and improves compatibility

New Features

  • Add base_url parameter to OpenAILanguageModel (#51) - Enables using custom OpenAI-compatible endpoints for alternative LLM providers
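For illustration, a custom endpoint might be supplied as below. Only base_url itself comes from this change; the other constructor keywords, the helper function, and the local URL are placeholders and assumptions:

```python
# Sketch of pointing OpenAILanguageModel at an OpenAI-compatible
# server (#51), e.g. vLLM or LM Studio. Only base_url is taken from
# the release notes; the remaining keywords are assumptions.
def openai_compatible_kwargs(endpoint, api_key="sk-local"):
    """Build constructor kwargs for a custom OpenAI-compatible endpoint."""
    return {"model_id": "gpt-4o-mini", "base_url": endpoint, "api_key": api_key}

kwargs = openai_compatible_kwargs("http://localhost:8000/v1")
# e.g. model = lx.inference.OpenAILanguageModel(**kwargs)
```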

Full Changelog: v1.0.4...v1.0.5

v1.0.4 - Ollama integration and improvements

05 Aug 12:29

What's Changed

  • Added Ollama language model integration – Full support for local LLMs via Ollama
  • Docker deployment support – Production-ready docker-compose setup with health checks
  • Comprehensive examples – Quickstart script and detailed documentation in examples/ollama/
  • Fixed OllamaLanguageModel parameter – Changed from model to model_id for consistency (#57)
  • Enhanced CI/CD – Added Ollama integration tests that run on every PR
  • Improved documentation – Consistent API examples across all language models

Technical Details

  • Supports all Ollama models (gemma2:2b, llama3.2, mistral, etc.)
  • Secure setup with localhost-only binding by default
  • Integration tests use lightweight models for faster CI runs
  • Docker setup includes automatic model pulling and health checks

Usage Example

import langextract as lx

result = lx.extract(
    text_or_documents=input_text,
    prompt_description=prompt,
    examples=examples,
    language_model_type=lx.inference.OllamaLanguageModel,
    model_id="gemma2:2b",                # any locally pulled Ollama model
    model_url="http://localhost:11434",  # default Ollama endpoint
    fence_output=False,                  # Ollama returns raw JSON; no code fences
    use_schema_constraints=False         # schema constraints not used with Ollama
)

Quick setup: Install Ollama from ollama.com, run ollama pull gemma2:2b, then ollama serve.

For detailed installation, Docker setup, and more examples, see examples/ollama/.

Full Changelog: v1.0.3...v1.0.4

v1.0.3 - OpenAI language model support

03 Aug 17:26

What's Changed

  • Added OpenAI language model integration – Support for GPT-4o, GPT-4o-mini, and other OpenAI models
  • Enhanced documentation – Added OpenAI usage examples and API key setup instructions to README
  • Comprehensive test coverage – Added unit tests for OpenAI backend

Technical Details

  • Uses modern OpenAI v1.x client API with parallel processing support
  • Note: Schema constraints for OpenAI are not yet implemented (use use_schema_constraints=False)
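Given that note, an OpenAI call at this version would disable schema constraints, roughly as sketched here; use_schema_constraints=False is required per the note above, while fence_output=True is an assumption mirroring the Ollama example elsewhere on this page:

```python
# Sketch of v1.0.3 OpenAI settings without schema constraints.
# use_schema_constraints=False is required per the release note;
# fence_output=True is an assumption.
openai_settings = {
    "model_id": "gpt-4o-mini",
    "fence_output": True,
    "use_schema_constraints": False,
}

# e.g. result = lx.extract(..., language_model_type=lx.inference.OpenAILanguageModel,
#                          **openai_settings)
```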

Full Changelog: v1.0.2...v1.0.3

v1.0.2: Removes pylibmagic dependency

03 Aug 13:47
88520cc

v1.0.2 – Slimmer install, Windows fix, OpenAI v1.x support

What’s Changed

  • Removed langfun and pylibmagic dependencies – lighter install; no libmagic needed on Windows
  • Fixed Windows installation failure (#25)
  • Restored compatibility with the modern OpenAI SDK v1.x (#16)
  • Updated README and Dockerfile to match the new, slimmer dependency set

Note

LangFunLanguageModel has been removed.
If you still need LangFun support, please open a new issue so we can discuss re-adding it in a cross-platform way.

Full Changelog: v1.0.1...v1.0.2

v1.0.1: Fix libmagic dependency issue

02 Aug 06:40
9c47b34

What's Changed

  • Fixed libmagic ImportError by adding pylibmagic dependency (#6)
  • Added [full] install option for easier setup
  • Added Docker support with pre-installed libmagic
  • Updated troubleshooting documentation

Bug Fixes

  • Resolves the "failed to find libmagic" error when importing langextract (#6)

Installation

# Standard install (now includes pylibmagic)
pip install langextract

# Full install (explicitly installs all dependencies)
pip install langextract[full]

# Docker (libmagic pre-installed)
docker run --rm -e LANGEXTRACT_API_KEY="your-key" langextract python script.py

Full Changelog: v1.0.0...v1.0.1

LangExtract v1.0.0 - Structured Information Extraction

22 Jul 21:59

A Python library for extracting structured information from unstructured text using LLMs with precise source grounding and interactive visualization.

Key Features

  • Extract structured data from any text using few-shot examples
  • Support for Gemini and Ollama models
  • Interactive HTML visualizations with source highlighting
  • Optimized for long documents with parallel processing and multiple extraction passes
  • Precise source grounding - every extraction maps to its location in the original text
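As a sketch of how few-shot examples and source grounding fit together, the example data is shown here as plain dicts so its shape stands alone; the lx.* class names in the comments (ExampleData, Extraction) are assumptions about the v1.0.0 API:

```python
# Sketch of the few-shot data that drives extraction. Every
# extraction's text appears verbatim in its source text, which is
# what lets results be grounded back to exact locations.
few_shot_example = {
    "text": "Alice flew to Paris.",
    "extractions": [
        {"extraction_class": "city", "extraction_text": "Paris"},
    ],
}

# e.g.:
# examples = [lx.data.ExampleData(
#     text=few_shot_example["text"],
#     extractions=[lx.data.Extraction(**e) for e in few_shot_example["extractions"]],
# )]
# result = lx.extract(text_or_documents="Bob drove to Berlin.",
#                     prompt_description="Extract city names.",
#                     examples=examples)
```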

Installation

pip install langextract

See the documentation for full usage examples.