lucidata: Democratized data access

Lucidata is an LLM-based query tool designed to democratize data access. It translates natural-language questions into SQL queries over structured datasets (support for generic web API queries is on the roadmap) and returns clear, traceable answers and exports.

Features

Natural Language Interface: Ask questions in plain English
Query Translation: Automatic conversion to SQL queries
Query Transparency: Track and export generated queries, explanations, and model confidence

Roadmap

Support for generic web API queries
Result Visualization

Getting Started

Prerequisites

Docker installed
An OpenAI API key

Usage

Clone the repository:

gh repo clone jdhoffa/lucidata
cd lucidata

Copy the example environment file and edit it to include your OpenAI key:

cp .env.example .env
# Edit the .env file to include your OpenAI key ...

Build and start the application with docker compose:

docker compose build # it can take a while to compile, be patient :-)
docker compose up

Send your query to the query_router endpoint, and check out the results!

curl -X POST "http://localhost:8002/translate-and-execute" \
  -H "Content-Type: application/json" \
  -d '{
    "natural_query": "Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"
  }'
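
For scripted use, the same request can be sent from Python. The following is a minimal sketch using only the standard library. It assumes the service started in the previous step is listening on localhost:8002 and that the response body is a JSON object; the helper names `build_request` and `ask` are illustrative and not part of Lucidata.

```python
import json
import urllib.request

# Endpoint taken from the curl example above.
ENDPOINT = "http://localhost:8002/translate-and-execute"


def build_request(natural_query: str, url: str = ENDPOINT) -> urllib.request.Request:
    """Build the same POST request the curl example sends (hypothetical helper)."""
    body = json.dumps({"natural_query": natural_query}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(natural_query: str, timeout: float = 120.0) -> dict:
    """Send the question and return the decoded JSON response."""
    with urllib.request.urlopen(build_request(natural_query), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


# Example call (requires the stack to be running via `docker compose up`):
# print(ask("Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"))
```

Using `json.dumps` for the body avoids hand-escaping quotes in the question, which is easy to get wrong in a shell one-liner.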

(Optional) Pipe the output to the jq CLI to pretty-print the JSON:

curl -X POST "http://localhost:8002/translate-and-execute" \
  -H "Content-Type: application/json" \
  -d '{
    "natural_query": "Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"
  }' | jq

# you can also select a specific field of the response, e.g. the results
curl -X POST "http://localhost:8002/translate-and-execute" \
  -H "Content-Type: application/json" \
  -d '{
    "natural_query": "Show me the cars with the best power-to-weight ratio, sorted from highest to lowest"
  }' | jq '.results'
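
The `jq '.results'` filter has a direct Python equivalent. This small sketch assumes only what the filter implies, namely that the response is a JSON object with a `results` key. The sample payload, including its `other_field` entry and row contents, is invented for illustration.

```python
import json


def extract_results(response_text: str):
    """Return the value under "results", like `jq '.results'`.

    jq yields null for a missing key; this sketch returns None in that case.
    """
    payload = json.loads(response_text)
    return payload.get("results")


# Hypothetical response body: only the "results" key is implied by the jq
# example above; everything else here is made up.
sample = '{"other_field": "ignored", "results": [{"model": "car_a", "ratio": 50.0}]}'
print(extract_results(sample))
```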

System Architecture

Below is a diagram showing the flow of information and expected user journey:

graph TD
    A[User's Natural Language Input] --> B[Frontend Chat UI]
    B --> |Request| C[LLM Query Engine]
    C --> |Structured Query| D[Query Runner Service]
    D --> |Data Request| E[Data Store]
    E --> |Raw Data| D
    D --> |Processed Data| F[Response Formatter]
    F --> |Formatted Results| B
    B --> |Display Results| A

    style A fill:#f9f,stroke:#333,stroke-width:2px,color:#000
    style B fill:#bbf,stroke:#333,stroke-width:2px,color:#000
    style C fill:#bfb,stroke:#333,stroke-width:2px,color:#000
    style D fill:#fbf,stroke:#333,stroke-width:2px,color:#000
    style E fill:#fbb,stroke:#333,stroke-width:2px,color:#000
    style F fill:#bff,stroke:#333,stroke-width:2px,color:#000

    subgraph "Frontend"
        A
        B[Frontend Chat UI<br>React or Teams plugin]
    end

    subgraph "Backend Services"
        C[LLM Query Engine<br>- Prompt templates<br>- Guardrails<br>- Schema-aware]
        D[Query Runner Service<br>- SQL engine <br>- API connector]
        F[Response Formatter<br>- HTML table<br>- CSV export<br>- Original query<br>- JS widgets/plots]
    end

    subgraph "Data Sources"
        E[Data Store<br>- Emissions Data<br>- Production Data<br>- Climate Scenarios <br>- etc.]
    end
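
The flow in the diagram can be sketched in a few lines of Python. This illustrates the stages only and is not Lucidata's code: the function names are hypothetical, the LLM Query Engine is replaced by a fixed lookup table, the Data Store is an in-memory SQLite table with made-up rows, and the Response Formatter emits CSV together with the query that produced it.

```python
import csv
import io
import sqlite3


def llm_query_engine(question: str) -> str:
    """Stand-in for the LLM step: a fixed lookup instead of a model call."""
    canned = {
        "how many cars are there?": "SELECT COUNT(*) AS n_cars FROM cars",
    }
    return canned[question.strip().lower()]


def query_runner(conn: sqlite3.Connection, sql: str):
    """Run the structured query and return (column names, rows)."""
    cursor = conn.execute(sql)
    columns = [col[0] for col in cursor.description]
    return columns, cursor.fetchall()


def response_formatter(sql: str, columns, rows) -> dict:
    """Bundle a CSV export with the query that produced it."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return {"original_query": sql, "csv": buffer.getvalue()}


def answer(conn: sqlite3.Connection, question: str) -> dict:
    """User question -> structured query -> raw data -> formatted result."""
    sql = llm_query_engine(question)
    columns, rows = query_runner(conn, sql)
    return response_formatter(sql, columns, rows)


# Data Store stand-in: two made-up rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (model TEXT, hp REAL)")
conn.executemany("INSERT INTO cars VALUES (?, ?)", [("car_a", 100.0), ("car_b", 150.0)])

result = answer(conn, "How many cars are there?")
print(result["original_query"])
print(result["csv"])
```

Returning the generated SQL alongside the data mirrors the "Original query" item in the Response Formatter box, which is what makes an answer traceable.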

Example Queries

# Query #1 tests mathematical operations (division of hp/wt)
"Show me the cars with the best power-to-weight ratio, sorted from highest to lowest."

# Query #2 tests sorting and multi-column selection
"Compare fuel efficiency (MPG) and horsepower for all cars, sorted by MPG."

# Query #3 tests aggregation functions with grouping
"What's the average horsepower and MPG for automatic vs manual transmission cars?"

# Query #4 tests more complex aggregation and grouping
"Show me the relationship between number of cylinders and fuel efficiency with average MPG by cylinder count"

# Query #5 tests limiting results and specific column selection
"Find the top 5 cars with the highest horsepower and their quarter-mile time (qsec)"

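To make the comments on the example queries concrete, here is a sketch of SQL that Query #1 and Query #3 could plausibly translate to, run against an in-memory SQLite table. The table name `cars`, the columns `model` and `am`, and all row values are assumptions for illustration (`hp`, `wt`, and `mpg` follow the names used in the queries themselves). The SQL that Lucidata actually generates may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE cars (model TEXT, mpg REAL, hp REAL, wt REAL, am INTEGER)")
# Made-up rows; am = 1 is taken to mean manual and 0 automatic (an assumption).
conn.executemany(
    "INSERT INTO cars VALUES (?, ?, ?, ?, ?)",
    [
        ("car_a", 20.0, 100.0, 2.0, 1),
        ("car_b", 30.0, 90.0, 3.0, 1),
        ("car_c", 10.0, 300.0, 4.0, 0),
    ],
)

# Query #1: arithmetic in the select list (hp / wt), sorted from highest to lowest.
power_to_weight = conn.execute(
    "SELECT model, hp / wt AS power_to_weight FROM cars ORDER BY power_to_weight DESC"
).fetchall()

# Query #3: aggregation with grouping by transmission type.
by_transmission = conn.execute(
    "SELECT am, AVG(hp) AS avg_hp, AVG(mpg) AS avg_mpg FROM cars GROUP BY am ORDER BY am"
).fetchall()

print(power_to_weight)
print(by_transmission)
```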