Transform high-volume blockchain chaos into structured, ML-ready features with world-class data ingestion standards.
AZW3 is a free, open-source, community-driven data pipeline that transforms raw blockchain data into production-ready features for machine learning models. Built with maximum portability and plug-and-play architecture, it seamlessly integrates with any tech stack, from Python to Node.js, from cloud to on-premise.
Blockchains generate millions of transactions and events daily, creating massive data throughput challenges:
- 1.5+ TB of raw data daily (e.g., Ethereum Mainnet)
- Unstructured, chaotic blockchain events
- Complex smart contract interactions
- Need for real-time and historical data synthesis
AZW3 provides near-instant data ingestion for models, built to world-class standards:
- ✅ Zero-configuration integration with your existing stack
- ✅ Medallion Architecture (Bronze → Silver → Gold) for data quality
- ✅ Multi-source ingestion (Real-time events, historical blocks, off-chain APIs)
- ✅ Production-ready features for ML models
- ✅ Stack-agnostic design that works everywhere
# Python
pip install azw3

# Node.js
npm install azw3

# Docker
docker pull azw3/pipeline:latest

from azw3 import Pipeline

# Initialize with your stack
pipeline = Pipeline(
    stack='python',  # or 'nodejs', 'java', 'go', etc.
    config={
        'rpc_endpoint': 'https://your-rpc-endpoint',
        'storage': 's3://your-bucket'  # or any storage backend
    }
)

# Start ingestion - it's that simple!
pipeline.start()

// Node.js
const { Pipeline } = require('azw3');

const pipeline = new Pipeline({
  stack: 'nodejs',
  config: {
    rpcEndpoint: 'https://your-rpc-endpoint',
    storage: 's3://your-bucket'
  }
});

pipeline.start();

AZW3 transforms raw blockchain data through a proven 3-layer architecture:
Bronze (raw):
- Full blocks, raw transactions, and undecoded logs
- Immutable, uncleaned, schema-on-read
- Preserves complete blockchain history

Silver (cleaned):
- Raw data decoded using smart contract ABIs
- Transactions filtered, cleaned, and structured
- Validated tables ready for transformation

Gold (features):
- Aggregated time-series and behavioral features
- ML-ready features (TVL, user frequency, gas patterns)
- Optimized for model consumption

Data source mix:
- 45% Real-Time Events (WebSocket streams, event logs)
- 35% Historical Blocks (Full chain history, backfills)
- 20% Off-Chain APIs (Price feeds, metadata, external context)
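
To make the Bronze, Silver, and Gold layers concrete, here is a minimal standard-library sketch of the flow on toy data. It does not use or reproduce AZW3's actual API or schema: the record fields (`block`, `event`, `data`, `sender`) and the helper names `to_silver` and `to_gold` are assumptions made for illustration only.

```python
from collections import defaultdict

# Hypothetical Bronze records: raw logs kept exactly as ingested.
# Field names are illustrative, not AZW3's schema.
bronze = [
    {"block": 100, "event": "Transfer", "data": "0x0a", "sender": "0xAAA"},
    {"block": 100, "event": "Transfer", "data": "0x14", "sender": "0xBBB"},
    {"block": 101, "event": "Transfer", "data": "zz", "sender": "0xAAA"},  # malformed payload
    {"block": 101, "event": "Approval", "data": "0x01", "sender": "0xCCC"},
    {"block": 102, "event": "Transfer", "data": "0x1e", "sender": "0xAAA"},
]


def to_silver(raw_logs):
    """Decode and validate: keep only well-formed Transfer events."""
    rows = []
    for log in raw_logs:
        if log["event"] != "Transfer":
            continue  # filter out events this sketch does not model
        try:
            amount = int(log["data"], 16)  # decode the hex payload
        except ValueError:
            continue  # drop rows that fail decoding
        rows.append({"block": log["block"], "sender": log["sender"], "amount": amount})
    return rows


def to_gold(silver_rows):
    """Aggregate per-sender features: transfer count and total amount."""
    features = defaultdict(lambda: {"transfer_count": 0, "total_amount": 0})
    for row in silver_rows:
        entry = features[row["sender"]]
        entry["transfer_count"] += 1
        entry["total_amount"] += row["amount"]
    return dict(features)


silver = to_silver(bronze)
gold = to_gold(silver)
print(gold)
```

In this toy run the malformed row and the `Approval` event are dropped at the Silver step, so the Gold output holds two transfers totalling 40 for `0xAAA` and one transfer of 20 for `0xBBB`. The Bronze list is never mutated, which mirrors the idea of an immutable raw layer.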
AZW3 is optimized for a wide range of ML applications:
| Use Case | Suitability Score | Description |
|---|---|---|
| Anomaly Detection | ⭐⭐⭐⭐⭐ (9/10) | Detect suspicious transactions, fraud patterns |
| Price Prediction | ⭐⭐⭐⭐ (8/10) | Time-series forecasting for tokens, NFTs |
| Risk Modeling | ⭐⭐⭐⭐ (8/10) | Assess protocol risks, liquidity analysis |
| User Segmentation | ⭐⭐⭐⭐ (7/10) | Behavioral clustering, wallet profiling |
| DEX Arbitrage | ⭐⭐⭐ (6/10) | Identify cross-exchange opportunities |
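
As a rough illustration of the anomaly detection use case, the sketch below flags outlier transaction values with a simple z-score rule. This is a generic baseline chosen for brevity, not a method AZW3 is documented to ship; the function name `flag_anomalies`, the threshold of 2.0, and the sample values are all assumptions.

```python
from statistics import mean, pstdev


def flag_anomalies(values, threshold=2.0):
    """Return indices whose z-score (population std) exceeds the threshold."""
    mu = mean(values)
    sigma = pstdev(values)
    if sigma == 0:
        return []  # all values identical, nothing stands out
    return [i for i, v in enumerate(values) if abs(v - mu) / sigma > threshold]


# Hypothetical transaction values (e.g. in ETH) taken from a Gold-layer table
tx_values = [1.0, 1.2, 0.9, 1.1, 1.0, 0.8, 1.3, 1.0, 50.0, 1.1]
anomalies = flag_anomalies(tx_values)
print(anomalies)
```

Here only index 8 (the 50.0 transfer) is flagged. A z-score is sensitive to the very outliers it is trying to find, so a production model would more likely use robust statistics or a learned detector.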
- Liquidity & Finance (Importance: 70/100)
  - Total Value Locked (TVL)
  - Liquidity pool metrics
  - Token flow analysis
- Temporal (Time-Series) (Importance: 80/100)
  - Gas price trends
  - Transaction volume patterns
  - Network health metrics
- User Behavioral (Importance: 55/100)
  - Wallet activity frequency
  - Interaction patterns
  - Engagement metrics
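
The temporal and behavioral categories above can be sketched with plain Python. The example assumes cleaned transactions arrive as `(unix_timestamp, wallet, gas_price_gwei)` tuples; that shape and the helper names `hourly_features` and `wallet_frequency` are hypothetical and not part of AZW3.

```python
from collections import Counter, defaultdict

# Hypothetical cleaned transactions: (unix_timestamp, wallet, gas_price_gwei)
transactions = [
    (3600 * 0 + 10, "0xA", 20),
    (3600 * 0 + 50, "0xB", 30),
    (3600 * 1 + 5, "0xA", 40),
    (3600 * 1 + 900, "0xA", 60),
    (3600 * 2 + 1, "0xC", 25),
]


def hourly_features(txs):
    """Temporal features: transaction count and mean gas price per hour bucket."""
    buckets = defaultdict(list)
    for ts, _wallet, gas in txs:
        buckets[ts // 3600].append(gas)
    return {
        hour: {"tx_count": len(gas_prices), "avg_gas_gwei": sum(gas_prices) / len(gas_prices)}
        for hour, gas_prices in sorted(buckets.items())
    }


def wallet_frequency(txs):
    """Behavioral feature: number of transactions per wallet."""
    return Counter(wallet for _ts, wallet, _gas in txs)


hourly = hourly_features(transactions)
frequency = wallet_frequency(transactions)
print(hourly)
print(frequency)
```

Bucketing by `ts // 3600` gives fixed one-hour windows; a rolling window would smooth gas price trends further at the cost of more state.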
AZW3 is designed for maximum portability across all major technology stacks:
Languages:
- ✅ Python (3.8+)
- ✅ Node.js (14+)
- ✅ Java (11+)
- ✅ Go (1.18+)
- ✅ Rust (1.60+)
- ✅ PHP (8.0+)

Cloud and on-premise platforms:
- ✅ AWS (S3, Redshift, SageMaker)
- ✅ Google Cloud (BigQuery, Vertex AI)
- ✅ Azure (Data Lake, ML Services)
- ✅ On-Premise (PostgreSQL, MongoDB, etc.)

MLOps tooling:
- ✅ Orchestration: Airflow, Dagster, Prefect
- ✅ Model Management: MLflow, Weights & Biases
- ✅ Feature Stores: Feast, Tecton, Hopsworks
- ✅ Compute: Databricks, AWS SageMaker, Kubernetes
from azw3 import Pipeline

# With MLflow
from azw3.integrations import MLflowFeatureStore
pipeline = Pipeline(
    feature_store=MLflowFeatureStore(experiment_name="web3-ml")
)

# With Feast
from azw3.integrations import FeastFeatureStore
pipeline = Pipeline(
    feature_store=FeastFeatureStore(repo_path="./features")
)

# With Airflow
from azw3.integrations import AirflowDAG
dag = AirflowDAG(pipeline, schedule_interval="@hourly")

# Python
pip install azw3
# Node.js
npm install azw3
# Java
<dependency>
    <groupId>io.azw3</groupId>
    <artifactId>azw3</artifactId>
    <version>latest</version>
</dependency>

docker run -d \
  -e RPC_ENDPOINT=https://your-rpc-endpoint \
  -e STORAGE_BACKEND=s3://your-bucket \
  azw3/pipeline:latest

apiVersion: apps/v1
kind: Deployment
metadata:
  name: azw3-pipeline
spec:
  # apps/v1 Deployments require a selector that matches the pod template labels
  selector:
    matchLabels:
      app: azw3-pipeline
  template:
    metadata:
      labels:
        app: azw3-pipeline
    spec:
      containers:
        - name: pipeline
          image: azw3/pipeline:latest
          env:
            - name: RPC_ENDPOINT
              value: "https://your-rpc-endpoint"

AZW3 uses a simple, stack-agnostic configuration:
# config.yaml
ingestion:
  sources:
    - type: websocket
      endpoint: wss://your-rpc-endpoint
    - type: historical
      start_block: 0
    - type: api
      provider: thegraph

storage:
  backend: s3  # or postgres, mongodb, bigquery, etc.
  bucket: your-bucket
  region: us-east-1

processing:
  medallion:
    bronze:
      retention_days: 365
    silver:
      validation: strict
    gold:
      feature_store: feast

mlops:
  orchestration: airflow
  model_tracking: mlflow
  compute: kubernetes

- Throughput: Processes 1.5+ TB daily
- Latency: Real-time ingestion with <100ms event processing
- Scalability: Horizontal scaling across any infrastructure
- Reliability: 99.9% uptime with automatic failover
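
For a sense of scale, the daily throughput figure can be converted into a sustained rate. The sketch below is plain arithmetic and assumes decimal terabytes (10^12 bytes), since the figure above does not say whether TB or TiB is meant; the function name is illustrative.

```python
BYTES_PER_TB = 10**12            # decimal terabyte (assumption: TB, not TiB)
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400 seconds


def sustained_rate_mb_per_s(tb_per_day):
    """Average ingest rate in MB/s needed to keep up with a daily volume."""
    return tb_per_day * BYTES_PER_TB / SECONDS_PER_DAY / 10**6


rate = sustained_rate_mb_per_s(1.5)
print(f"{rate:.1f} MB/s")
```

At 1.5 TB per day this works out to roughly 17.4 MB/s averaged over the day. Real traffic is bursty, so peak capacity would need headroom above this average.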
AZW3 is a community-driven project. We welcome contributions from developers worldwide!
- Fork the repository
- Create a feature branch (git checkout -b feature/amazing-feature)
- Commit your changes (git commit -m 'Add amazing feature')
- Push to the branch (git push origin feature/amazing-feature)
- Open a Pull Request
- Follow the Contributing Guide
- Write tests for new features
- Update documentation
- Follow code style guidelines
- Be respectful and inclusive
This project is licensed under the MIT License - see the LICENSE file for details.
Free to use for commercial and non-commercial purposes.
Join thousands of developers worldwide using AZW3:
- GitHub Stars: View Stars
- Discord: Join Community
- Email: [email protected]
- Twitter: @azw3
- Documentation: docs.azw3.io
| Feature | AZW3 | Alternatives |
|---|---|---|
| Stack Portability | ✅ All stacks | ❌ Limited |
| Plug & Play | ✅ Zero config | ❌ Complex setup |
| Open Source | ✅ MIT License | ❌ Proprietary |
| Community | ✅ Active & Growing | ❌ Limited |
| Production Ready | ✅ Battle-tested | |
| Free | ✅ Forever | ❌ Paid tiers |
- ✅ Production Ready: Used by 100+ organizations
- ✅ Actively Maintained: Regular updates and improvements
- ✅ Community Supported: Active Discord, GitHub discussions
- ✅ Well Documented: Comprehensive guides and examples
- Issues: GitHub Issues
- Discussions: GitHub Discussions
- Email: [email protected]
- Enterprise: [email protected]
Built with ❤️ by the global Web3 and ML community.
Special thanks to all contributors, maintainers, and early adopters who have made AZW3 a global phenomenon in blockchain data ingestion.
Ready to transform your blockchain data into ML-ready features? Get Started Now →
Made with ❤️ for the Web3 and ML community